validation/aggregating/*/*.sql, schemas/vegbien.sql, lib/runscripts/validations.pg.sql.run, inputs/bien2_traits/validations.sql: added _ to beginning of each view name so the validation views would sort at the top in the datasource's tables list. this will also make the validation result sets easily distinguishable from the data tables.
bugfix: lib/runscripts/validations.pg.sql.run: updated table match pattern to include the type prefix that validations queries now contain
bugfix: lib/runscripts/validations.pg.sql.run: --table: need to include explicit schema so that matching tables from other schemas are not included
added lib/runscripts/validations.pg.sql.run
lib/runscripts/file.pg.sql.run, schema.pg.sql.run: support custom options to pg_dump in $@
lib/runscripts/in_datasrc_dir.run: datasrc_make(): use `if remaking ...` instead of accessing $_remake manually, for clarity
bugfix: lib/runscripts/in_datasrc_dir.run: datasrc_make(): use set_make_vars and $_remake as required in lib/sh/make.sh `remaking`
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
lib/runscripts/table.run: added test_() target and use it in remake_VegBIEN_mappings() (it would not be clear that remake_VegBIEN_mappings() runs the tests)
bugfix: lib/runscripts/datasrc_dir.run: import(): don't run `sql/install` if the schema already exists, because this will try to rerun all the schema-creation queries. note that this idempotent functionality was not provided by the `make .../install` target that was previously used (idempotency is new with new-style import).
bugfix: lib/runscripts/datasrc_dir.run: import(): can't run `datasrc_make reinstall` anymore because this now defers to the runscript for new-style import datasources (which was done so that `make .../install` properly reinstalls all the datasources). instead, call the applicable make targets manually (there are just 2 of them).
lib/runscripts/table.run: table_make_install(): simplified the setting of $noclobber since there no longer needs to be a different command for when the log exists
bugfix: lib/runscripts/table.run: need to errexit the make target, so that errors in the SQL install scripts are not suppressed. this requires pre-checking if the table exists (using new pg_table_exists), so that the install target's errexit does not then need to be suppressed for cases when the table already exists.
bugfix: lib/runscripts/file.pg.sql.run: export_(): exclude Source and related tables so that these will be re-created by the staging tables installation instead, ensuring that they are always in sync with the Source/ subdir
bugfix: lib/runscripts/view.run: don't do anything in load_data(), to avoid trying to remake header.csv before the view is created. (for views, this instead happens in postprocess().)
lib/runscripts/table.run: reordered functions in the order they are called by import()
lib/runscripts/extract.run: added export_sample()
lib/runscripts/import_subset.run: $version: use new $extract_view, which is set to the same value that this was
lib/runscripts/extract.run: use the extract-specific view instead of all of analytical_stem
added lib/runscripts/import_subset.run, extract.run
bugfix: lib/runscripts/util.run: `trap on_exit EXIT`: only set this if the script is not a dot script, because if it is a dot script, on_exit() will not be invoked until the calling shell exits, which may be much later than when the script is run. previously, this was handled by canceling the EXIT trap if on_exit() is run manually, but this would not work correctly if a load-time error prevented on_exit() from running and canceling the trap.
bugfix: lib/runscripts/util.run: if is_dot_script, fix $ when no args causes this to incorrectly contain the script name. use is_dot_script rather than the presence of $ args to decide whether to use @BASH_ARGV, because @BASH_ARGV is actually wrong when run as a .-script (it contains the script name).
when no args causes this to incorrectly contain the script name. use is_dot_script rather than the presence of $
lib/runscripts/util.run: run script template: changed sample command name to all() because each runscript requires this in order to be run without args
lib/runscripts/util.run: support scripts that are run as shell-includes (with leading "."), by allowing the calling script to manually invoke on_exit() without it then being invoked twice (the end of a shell-include does not trigger the EXIT trap)
lib/runscripts/util.run: support scripts that are run as shell-includes (with leading "."), by also accepting $@ args that are passed along in the util.run include, in addition to @BASH_ARGV
bugfix: lib/runscripts/util.run: to_top_file(): need to pass "$@" to to_file
lib/runscripts/util.run: to_top_file: added function for this (in addition to alias), so that this can be run from sudo in a wrap_fn command
lib/runscripts/util.run: added $wrap_fn to run any function via sudo, etc.
bugfix: lib/runscripts/table.run: load_data(): pass $is_view through to `make reinstall` so that DROP VIEW will be used instead of DROP TABLE where applicable
bugfix: lib/runscripts/table.run: load_data(): in remaking mode, need to remake header.csv in case the columns have changed
lib/runscripts/datasrc_dir.run: import(): added remake (rm=1) mode that reinstalls the datasource before continuing with the subdirs' import actions
lib/runscripts/in_datasrc_dir.run: added datasrc_make(), which runs make in the datasrc dir
bugfix: lib/runscripts/*: calls to rm: use `rm -f` instead to avoid an error (which aborts the program) if the file does not yet exist
lib/runscripts/datasrc_dir.run: added postprocess target to run postprocess in just the table subdirs, skipping any additional subdirs that don't have this target
lib/runscripts/datasrc_dir.run: @subdirs: moved import_order.txt subdirs into separate @table_subdirs, which provides access to just the table subdirs when the user adds other dirs to @subdirs
bugfix: lib/runscripts/view.run: remake_VegBIEN_mappings(): also need to remake header.csv, not just map.csv as for tables, because view columns may change when the view is regenerated
lib/runscripts/util.run: usage: documented that this usage also applies to all files that include this file
lib/runscripts/util.run: usage: clarified that the cmd to run is a function
added lib/runscripts/pg.conf.run, which installs PostgreSQL config files
added lib/runscripts/install.run, analogous to import.run
added lib/runscripts/data.pg.sql.run (analogous to schema.pg.sql.run for data-only SQL scripts)
added lib/runscripts/file.pg.sql.run and use it in schema.pg.sql.run
added lib/runscripts/schema.pg.sql.run and use it in inputs/.TNRS/schema.sql.run
lib/runscripts/util.run: added to_top_file alias for use with $top_file
lib/runscripts/datasrc_dir.run: include of import.run: use .rel instead of `. "$(dirname "${BASH_SOURCE0}")"/...`
lib/runscripts/datasrc_dir.run: moved commands related to any runscript in the datasrc dir to new in_datasrc_dir.run
lib/runscripts/table.run: postprocess(): added remake action that calls trim_table()
lib/runscripts/table.run: added trim_table(), which calls util.trim(regclass, regclass)
lib/runscripts/table.run: map_table(): added remake action that calls reset_col_names()
lib/runscripts/table.run: added reset_col_names(), which calls util.reset_col_names()
bugfix: lib/runscripts/table.run: map_table(): moved $map_table to global var so it can be used by other functions
bugfix: lib/runscripts/table.run: postprocess(): don't propagate $remake to remake_VegBIEN_mappings(), since this will cause map.csv to be remade, which is not related to the postprocessing.
lib/runscripts/table.run: map_table(): util.set_col_names_with_metadata(): removed unnecessary cast to regclass, which is performed implicitly. this used to be needed when the polymorphic util.rename_cols() was used instead.
lib/runscripts/table.run: postprocess(): propagate the $remake flag to remake_VegBIEN_mappings using self_make, so that a remake=1 on postprocess will cause map.csv to be regenerated as it would for a remake=1 directly on remake_VegBIEN_mappings
bugfix: postprocess(): moved $can_test flag from import() to this function because it is used here
lib/runscripts/table.run: import(): moved postprocessing commands to separate postprocess() function that can be invoked on an already-imported staging table to avoid running the load_data() target. this is especially useful when running the postprocessing on a working copy without the unversioned data files, for datasources whose load_data() target would otherwise try to download the files because they don't already exist.
lib/runscripts/table.run: postprocess(): renamed to custom_postprocess() since this runs only the datasource's custom postprocessing commands, not all the postprocessing commands including map_table, mk_derived
lib/runscripts/util.run: added , function, which treats each of the command-line args as commands the way make does (instead of as args to the same command, the way runscripts do)
lib/sh/util.sh: moved runscript-related commands to lib/runscripts/util.run because these only apply to runscripts
bugfix: lib/runscripts/table.run: remake_VegBIEN_mappings(): when remaking, do not remake header.csv, because it should keep the original CSV columns rather than being reset to whatever the current staging table columns happen to be. to force-regenerate this, instead delete it first and then run remake_VegBIEN_mappings(). remake mode will now just regenerate map.csv from header.csv, in case map.csv's columns are incomplete or out of order.
bugfix: lib/runscripts/table.run: map_table(): do not rename view columns, since their column names come from their (column-renamed) joined tables rather than from a map.csv. header.csv, map.csv for views will generally become out-of-date whenever the joined tables change, so it is better not to generate them at all.
lib/runscripts/table.run: added $is_view
lib/runscripts/table.run: added $postprocess_sql to store postprocess.sql path, and use it in postprocess()
bugfix: lib/runscripts/table.run: import(): allow automated tests (remake_VegBIEN_mappings) to be disabled by setting can_test= if the public schema shouldn't be modified (e.g. if it's the live DB)
lib/runscripts/table.run: map_table(): run map_table repeatedly until no more renames are made: added command to do this
lib/runscripts/table.run: map_table(): documented that collisions may prevent all renames from being made at once. if this is the case, map_table must be run repeatedly until no more renames are made. collisions may result if the staging table gets messed up (e.g. due to missing input columns in map.csv).
bugfix: lib/runscripts/table.run: need to run remake_VegBIEN_mappings after mk_derived rather than before so the derived cols will be included in the automated test result
lib/runscripts/table.run: map_table(): store the map table in the datasource schema, so that it can easily be referred to when using the staging tables. this also allows it to be found more easily when debugging its contents.
lib/runscripts/table.run: map_table(): create the map table as a persistent table in the temp schema, so that its contents can be viewed for debugging
lib/runscripts/table.run: use new util.set_col_names_with_metadata() instead of util.set_col_names() so that metadata values (beginning with : ) are automatically mapped to constant columns rather than needing to add a mk_const_col() call to postprocess.sql for each of them. there are a lot of metadata value entries, especially in the Source/ tables for each datasource, so this will save time in translating the datasources to new-style import. note that this requires disabling the map_filter_insert trigger on the map table to prevent it from filtering out the metadata entries before util.set_col_names_with_metadata() can use them.
bugfix: lib/runscripts/datasrc_dir.run, subdir.run: need to remove leading . from dir name to get installed schema name, using new dir2schema()
lib/runscripts/datasrc_dir.run, subdir.run: use new lib/sh/datasrc.sh, which contains code in common to both datasrc-related dir runscripts
added lib/runscripts/view.run, for use with table subdirs for views, such as inputs/FIA/occurrence_all/
bugfix: lib/runscripts/table.run: remake_VegBIEN_mappings(): run yes using piped_cmd() so the SIGPIPE doesn't cause an errexit
lib/runscripts/datasrc_dir.run: extend import.run and provide an import() implementation that runs all the runscripts for import_order.txt subdirs
lib/runscripts/table.run: import(): also run remake_VegBIEN_mappings() to accept the test output. this function was previously unused, but was left in for future use when lib/import.sh was translated to lib/runscripts/table.run (it was used in its import.sh form in inputs/FIA/occurrence_all/import).
bugfix: lib/runscripts/table.run: remake_VegBIEN_mappings(): need to change to $top_dir before running `rm header.csv map.csv`
lib/runscripts/table.run: remake_VegBIEN_mappings(): only remake header.csv, map.csv if this target is being run directly, to avoid needing to remake them every time. for tables that are views, this instead requires them to be explicitly remade when the view columns change.
bugfix: lib/runscripts/subdir.run: subdir_make(): only remake if $remake has been explicitly propagated to subdir_make() by using self_make
lib/runscripts/table.run: load_data(): first make sure schema is installed
lib/runscripts/table.run: added datasrc_make_install()
table_make_install(): take $install_log as an overridable kw param to support install logs in different locations
lib/runscripts/table.run: load_data(): split noclobber functionality into separate table_make_install() function, which can be used by other install-related targets
*{.sh,run}: use simpler .rel() instead of `. "$(dirname "${BASH_SOURCE0}")"/...` for relative includes
bugfix: load_data(): verbosity_min: use verbosity_min='' so that csv2db's default verbosity (3) is used, instead of setting the verbosity directly to 3, which caused the log++ logging output from bin/make to be echoed at verbosity 3, creating cluttered output
lib/runscripts/table.run: import(): use self_make on load_data so that the remake status determines whether the table is reinstalled
bugfix: lib/runscripts/mysql.table.run: import(): added missing set_make_vars, needed by self_make
bugfix: lib/runscripts/table.run: load_data(): need to use $_remake instead of $remake when using set_make_vars
lib/runscripts/table.run: added set_make_vars to all make targets so $remake would be propagated appropriately
lib/runscripts/table.run: load_data(): also clobber install log if remaking, because the table will be reinstalled
lib/runscripts/table.run: load_data(): automatically select noclobber mode depending on whether the install log already exists. this removes the need for a separate load_data_first_run() function.
bugfix: lib/runscripts/table.run: load_data(): ignore errors if table already exists
lib/runscripts/table.run: load_data(): use noclobber=1 to avoid overwriting the install log when re-running the install target idempotently. load_data_first_run() is now available to preserve the output in the log on the first run.
bugfix: lib/runscripts/mysql.table.run: import(): move previous versions of table.tsv out of the main dir before loading staging tables, to prevent them from being considered a TSV segment file and prepended to table.tsv
added lib/runscripts/mysql.table.run (general to all MySQL datasources) and use it in inputs/GBIF/table.run
lib/runscripts/subdir.run: auto-append -remake to all targets in remake mode
lib/runscripts/subdir.run: subdir_make(): put prepending of "$subdir/" on its own line for clarity
bugfix: *run: overriding targets: use new self_make to properly progagate the $remake flag to the overridden target, so that the target itself is not skipped
lib/runscripts/table.run, table.run: use new db_make.sh