moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
lib/runscripts/table.run: added test_() target and use it in remake_VegBIEN_mappings() (it would not be clear that remake_VegBIEN_mappings() runs the tests)
lib/runscripts/table.run: table_make_install(): simplified the setting of $noclobber since there no longer needs to be a different command for when the log exists
bugfix: lib/runscripts/table.run: need to errexit the make target, so that errors in the SQL install scripts are not suppressed. this requires pre-checking if the table exists (using new pg_table_exists), so that the install target's errexit does not then need to be suppressed for cases when the table already exists.
lib/runscripts/table.run: reordered functions in the order they are called by import()
bugfix: lib/runscripts/table.run: load_data(): pass $is_view through to `make reinstall` so that DROP VIEW will be used instead of DROP TABLE where applicable
bugfix: lib/runscripts/table.run: load_data(): in remaking mode, need to remake header.csv in case the columns have changed
bugfix: lib/runscripts/*: calls to rm: use `rm -f` instead to avoid an error (which aborts the program) if the file does not yet exist
lib/runscripts/table.run: postprocess(): added remake action that calls trim_table()
lib/runscripts/table.run: added trim_table(), which calls util.trim(regclass, regclass)
lib/runscripts/table.run: map_table(): added remake action that calls reset_col_names()
lib/runscripts/table.run: added reset_col_names(), which calls util.reset_col_names()
bugfix: lib/runscripts/table.run: map_table(): moved $map_table to global var so it can be used by other functions
bugfix: lib/runscripts/table.run: postprocess(): don't propagate $remake to remake_VegBIEN_mappings(), since this will cause map.csv to be remade, which is not related to the postprocessing.
lib/runscripts/table.run: map_table(): util.set_col_names_with_metadata(): removed unnecessary cast to regclass, which is performed implicitly. this used to be needed when the polymorphic util.rename_cols() was used instead.
lib/runscripts/table.run: postprocess(): propagate the $remake flag to remake_VegBIEN_mappings using self_make, so that a remake=1 on postprocess will cause map.csv to be regenerated as it would for a remake=1 directly on remake_VegBIEN_mappings
bugfix: postprocess(): moved $can_test flag from import() to this function because it is used here
lib/runscripts/table.run: import(): moved postprocessing commands to separate postprocess() function that can be invoked on an already-imported staging table to avoid running the load_data() target. this is especially useful when running the postprocessing on a working copy without the unversioned data files, for datasources whose load_data() target would otherwise try to download the files because they don't already exist.
lib/runscripts/table.run: postprocess(): renamed to custom_postprocess() since this runs only the datasource's custom postprocessing commands, not all the postprocessing commands including map_table, mk_derived
bugfix: lib/runscripts/table.run: remake_VegBIEN_mappings(): when remaking, do not remake header.csv, because it should keep the original CSV columns rather than being reset to whatever the current staging table columns happen to be. to force-regenerate this, instead delete it first and then run remake_VegBIEN_mappings(). remake mode will now just regenerate map.csv from header.csv, in case map.csv's columns are incomplete or out of order.
bugfix: lib/runscripts/table.run: map_table(): do not rename view columns, since their column names come from their (column-renamed) joined tables rather than from a map.csv. header.csv, map.csv for views will generally become out-of-date whenever the joined tables change, so it is better not to generate them at all.
lib/runscripts/table.run: added $is_view
lib/runscripts/table.run: added $postprocess_sql to store postprocess.sql path, and use it in postprocess()
bugfix: lib/runscripts/table.run: import(): allow automated tests (remake_VegBIEN_mappings) to be disabled by setting can_test= if the public schema shouldn't be modified (e.g. if it's the live DB)
lib/runscripts/table.run: map_table(): run map_table repeatedly until no more renames are made: added command to do this
lib/runscripts/table.run: map_table(): documented that collisions may prevent all renames from being made at once. if this is the case, map_table must be run repeatedly until no more renames are made. collisions may result if the staging table gets messed up (e.g. due to missing input columns in map.csv).
bugfix: lib/runscripts/table.run: need to run remake_VegBIEN_mappings after mk_derived rather than before so the derived cols will be included in the automated test result
lib/runscripts/table.run: map_table(): store the map table in the datasource schema, so that it can easily be referred to when using the staging tables. this also allows it to be found more easily when debugging its contents.
lib/runscripts/table.run: map_table(): create the map table as a persistent table in the temp schema, so that its contents can be viewed for debugging
lib/runscripts/table.run: use new util.set_col_names_with_metadata() instead of util.set_col_names() so that metadata values (beginning with : ) are automatically mapped to constant columns rather than needing to add a mk_const_col() call to postprocess.sql for each of them. there are a lot of metadata value entries, especially in the Source/ tables for each datasource, so this will save time in translating the datasources to new-style import. note that this requires disabling the map_filter_insert trigger on the map table to prevent it from filtering out the metadata entries before util.set_col_names_with_metadata() can use them.
bugfix: lib/runscripts/table.run: remake_VegBIEN_mappings(): run yes using piped_cmd() so the SIGPIPE doesn't cause an errexit
lib/runscripts/table.run: import(): also run remake_VegBIEN_mappings() to accept the test output. this function was previously unused, but was left in for future use when lib/import.sh was translated to lib/runscripts/table.run (it was used in its import.sh form in inputs/FIA/occurrence_all/import).
bugfix: lib/runscripts/table.run: remake_VegBIEN_mappings(): need to change to $top_dir before running `rm header.csv map.csv`
lib/runscripts/table.run: remake_VegBIEN_mappings(): only remake header.csv, map.csv if this target is being run directly, to avoid needing to remake them every time. for tables that are views, this instead requires them to be explicitly remade when the view columns change.
lib/runscripts/table.run: load_data(): first make sure schema is installed
lib/runscripts/table.run: added datasrc_make_install()
table_make_install(): take $install_log as an overridable kw param to support install logs in different locations
lib/runscripts/table.run: load_data(): split noclobber functionality into separate table_make_install() function, which can be used by other install-related targets
*{.sh,run}: use simpler .rel() instead of `. "$(dirname "${BASH_SOURCE0}")"/...` for relative includes
bugfix: load_data(): verbosity_min: use verbosity_min='' so that csv2db's default verbosity (3) is used, instead of setting the verbosity directly to 3, which caused the log++ logging output from bin/make to be echoed at verbosity 3, creating cluttered output
lib/runscripts/table.run: import(): use self_make on load_data so that the remake status determines whether the table is reinstalled
bugfix: lib/runscripts/table.run: load_data(): need to use $_remake instead of $remake when using set_make_vars
lib/runscripts/table.run: added set_make_vars to all make targets so $remake would be propagated appropriately
lib/runscripts/table.run: load_data(): also clobber install log if remaking, because the table will be reinstalled
lib/runscripts/table.run: load_data(): automatically select noclobber mode depending on whether the install log already exists. this removes the need for a separate load_data_first_run() function.
bugfix: lib/runscripts/table.run: load_data(): ignore errors if table already exists
lib/runscripts/table.run: load_data(): use noclobber=1 to avoid overwriting the install log when re-running the install target idempotently. load_data_first_run() is now available to preserve the output in the log on the first run.
bugfix: *run: overriding targets: use new self_make to properly progagate the $remake flag to the overridden target, so that the target itself is not skipped
lib/runscripts/table.run, table.run: use new db_make.sh
lib/runscripts/table_dir.run: renamed table to subdir because this can apply to any datasrc subdir. moved table-specific code to table.run.
added lib/runscripts/table_dir.run and use it in table.run
lib/runscripts/table.run: include make.sh so runscripts based on it can use make-related utils
lib/runscripts/table.run: don't mk_esc_name schema, table because these will be mk_esc_name'd by functions that use them
bugfix: lib/runscripts/table.run: load_data(): use new $verbosity_min instead of running `verbosity_min` so that the command name logging is not output with the new verbosity
bugfix: lib/runscripts/table.run: load_data(): need to ensure the verbosity is at least 3 because the install logs require verbose output. (3 is the default for the installer, but is overridden by the runscripts, which instead set the default to 1.)
lib/runscripts/table.run: $schema, $table: log the `cd` used to calculate the value at log_level 3 instead of 1 (note that the cd() function call for this will be logged at log_level 5)
lib/runscripts/table.run: import(): added step to load the data into the staging table before postprocessing it
lib/runscripts/table.run: table_make(): run canon_rel_path on the datasrc dir so it's displayed as the direct path to it, without ..
lib/runscripts/table.run: table_make(): moved comment about "${@/#/$table/}" to right after the line it describes
lib/runscripts/table.run: input_make(): renamed to table_make() to make it clear that the target names are relative to the table subdir itself, not the datasrc dir. it was previously called input_make because it used inputs/input.Makefile directly, but now will use any Makefile in the datasrc dir.
lib/runscripts/table.run: input_make(): added $silent flag which turns on make's --silent option
lib/runscripts/table.run: input_make(): documented that it requires a Makefile in the datasrc dir containing `include ../input.Makefile
lib/runscripts/table.run: input_make(): added comment explaining that "${@/#/$table/}" replaces the empty str at the beginning of str (/#) with $table/
lib/runscripts/table.run: put functions on one line where possible
lib/runscripts/table.run: input_make(): use the local make() function instead of external make directly, because it sets $cmd_log_fd appropriately to ensure that all the echoed make commands get properly logged to stdlog/stderr (stdlog is fd 30 when it's redirected to a file)
*{.sh,run}: use command instead of deprecated echo_run (don't prepend anything when the command is already aliased with `command`)
*{.sh,run}: replaced extern with command, a shell builtin that does the same thing. this is also part of dash (the Bourne shell on Linux).
*{.sh,run}: removed extra space between function name and (), which is apparently not needed even though `help function` includes it. this greatly improves readability, including when function names are pasted into commit messages!
lib/runscripts/table.run: remake_VegBIEN_mappings (): added public_schema_exists check, ported from lib/import.sh
bugfix: : use new func_override for runscript inheritance instead of invoking the overridden function as a (command-line) target of the parent runscript. this ensures that the overridden function is run in the *same process as the calling function, so that $top_dir keeps the same value and runscript-relative paths continue to work.
moved lib/*.sh to sh/ subdir so it's easier to find the .sh files among all the other lib/ files
lib/util.sh: use extern instead of env to run external commands, because it's simpler and auto-captures the logging output
renamed lib/runscripts/local.run to lib/local.sh since the things it defines are not just for runscripts
*{.sh,run}: changed echo_func to an alias that includes the "$", so that callers don't need to specify the "$" manually. although the original echo_func function is still there, callers need to switch over to the alias at the same time as it's defined because otherwise the "$@" would be doubled, since echo_func refers to the alias instead.
", so that callers don't need to specify the "$
*{.sh,run}: use just env instead of echo_run env now that env is an auto-echoing alias
lib/runscripts/table.run: moved general DB commands to lib/util.sh
moved lib/*.run into runscripts/ subdir so runscripts are grouped together and easier to find rather than being scattered throughout lib/
*run: added include guards
lib/table.run: use new local.run instead of defining $bin_dir, etc. itself
lib/table.run: added mysql (), mysql_ANSI ()
lib/table.run: run mk_esc_name for $schema, $table
lib/table.run: added esc_name (), mk_esc_name ()
lib/util.run: echo_func (): use the caller's FUNCNAME via the $FUNCNAME[] array instead of requiring them to pass it in the function args as `echo_func "$FUNCNAME" "$@"`
lib/table.run: template: import(): Also pass "$@" to superclass method
lib/table.run: template: Use "$FUNCNAME" instead of hardcoding import
lib/table.run: Use new import.run, which defines all()
lib/table.run: Added all (default target)
*run: Use -e option to bash on the #! line instead of separate `set -o errexit` line so that there is no issue with the `set -o errexit` line getting separated from the #! line (errexit is required for the scripts to work properly)
lib/util.run: Run run_cmd at shell exit (using trap) instead of requiring every runscript to have `run_cmd ` at the end of it
lib/table.run: Switched from echo_run to echo_func
lib/util.run: Echo the command at the beginning of each function using new echo_func, instead of having to type echo_run before every call to a function. Note that because echo_func uses BASH_SOURCE, the path to the file containing the function will be included in the debug message, which greatly facilitates locating which file a command is in.
lib/util.run: echo_cmd(): Renamed to echo_run for clarity, because it also runs the command
Added lib/table.run, which includes the commands in import.sh but uses run scripts to allow running commands other than just import. (For example, map_table or postprocess can be run separately. Uninstall-related commands which would not belong in an import script can also be added, because import is only one of many commands a run script can offer.)
lib/import.sh: remake_VegBIEN_mappings(): Also remake VegBIEN.csv and test.xml.ref use `make test`
lib/import.sh: Added remake_VegBIEN_mappings()
lib/import.sh: Added make() and use it instead of the full make command
lib/import.sh: map_table(): Removed unneeded () around psql. This also fixes a bug where an error exit status from psql would not have aborted the script because `set -o errexit` does not apply to commands enclosed in (). For () you need to use ` || exit` instead (or ` || return` inside a function).
lib/import.sh: Use `set -o errexit` so any command that exits with an error aborts the script. Note that a command's exit status can still be ignored using ` || true`. Removed no longer needed ` || return` in functions.
lib/import.sh: functions: abort if a command encounters an error