schemas/public_.sql: viewFullOccurrence_*: renamed to view_full_occurrence_* at Brian M's and Martha's request (e-mails from Martha on 2014-8-12 at 17:37PT, and from Brian M on 2014-8-13 at 16:21PT). note that this change has already been made on vegbiendev.
lib/tnrs.py single_tnrs_request(), bin/tnrs_client: use_tnrs_export: default to False because this mode uses incorrect selected matches (vegpath.org/issues/943), and the JSON mode that fixes this is now available
bin/tnrs_db: tnrs.tnrs_request() call: explicitly set use_tnrs_export=True so that this continues to work if the default value is changed
bin/tnrs_client: added env var to configure use_tnrs_export
bin/make_analytical_db: materialize viewFullOccurrence_individual_view instead of analytical_stem_view because analytical_stem_view is now generatable via a simple join onto viewFullOccurrence_individual_view. this avoids running into potential disk space constraints when materializing and backing up both tables (~50 GB/table * 2 tables * 2 copies (incl. the backup) = 200 GB, which is very close to the available disk space).
bin/make_analytical_db: removed extra () around psql_verbose_vegbien
bin/make_analytical_db: removed no longer used mk_table()
bin/make_analytical_db: use more up-to-date *_view_modify() functions instead of mk_table()
bin/after_import: use new bin/make_backups
bugfix: bin/make_backups: need to `popd` when done
bugfix: bin/make_backups: need to `set +x` when done
bin/make_backups: run with initial "." so background processes will be owned by the invoking shell
added bin/make_backups
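The two make_backups bugfixes above follow from the script being run with an initial ".": it shares the caller's working directory and shell options, so it has to undo its own changes. A minimal sketch of that cleanup pattern, assuming a temp file as a stand-in for the real script (its contents are hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a script that is meant to be dot-included.
snippet="$(mktemp)"
cat >"$snippet" <<'EOF'
set -x               # trace commands while working
pushd / >/dev/null   # work from another directory
: "do the work here"
popd >/dev/null      # restore the caller's directory when done
set +x               # stop tracing in the caller's shell when done
EOF

before="$PWD"
. "$snippet" 2>/dev/null   # the trace output goes to stderr; hide it here
rm -f "$snippet"

[ "$PWD" = "$before" ] && echo "working dir restored"
[[ $- != *x* ]] && echo "xtrace is off again"
```

Without the `popd` and `set +x` lines, the invoking shell would be left in the other directory with tracing still on.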
bin/import_all: hidden_srcs(): removed `by_col=1` because these should be done in the same mode as the main datasources
bin/make_analytical_db: removed threatened_taxonlabel because this is now handled by iucn_red_list
bin/make_analytical_db: added iucn_red_list_view_modify()
bin/make_analytical_db: removed unused code to create views in the analytical_db schema
bin/make_analytical_db: merged mk_table and mk_analytical_table since they now do the same thing
bugfix: bin/with_all: isset(): need to use `&>/dev/null` instead of `>&-`, etc because closing an fd causes declare to return false
bugfix: bin/with_all, import_all: don't disown processes because they should be auto-killed if the shell is (disown was only needed before we used screen)
bin/import_all: delete_logs(): documented that `trap EXIT` doesn't run until shell exit
bin/import_all: delete_logs(): print when this happens, so it can be verified that it's happening properly
bugfix: bin/import_all: need to run delete_logs manually because `trap EXIT` doesn't run until bg cmds done
bin/import_all: delete_logs: moved testing of whether to delete logs to delete_logs() so that delete_logs() can be run regardless of the $delete_logs setting
bugfix: bin/import_all: delete_logs(): also need to match log filenames when n=""
bugfix: bin/with_all: isset(): need to use `>&- 2>&-` because &> does not work with - as the dest
fix: bin/with_all: removed debug statements
bugfix: bin/with_all: testing if @inputs is set: `"${inputs+isset}"` syntax doesn't work for empty arrays, so need to use `declare -p` instead
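The isset() entries above can be summarized in a short sketch. This is an illustration of the technique the messages describe, not the actual bin/with_all code; the variable names are made up:

```shell
#!/usr/bin/env bash
# isset NAME: succeeds if a variable called NAME has been declared.
# Output is discarded with &>/dev/null. Closing the fds instead (>&- 2>&-)
# makes declare fail, because it can no longer write its output.
isset() { declare -p "$1" &>/dev/null; }

inputs=()   # set, but empty

# The ${var+word} form looks at element 0, which an empty array does not have.
if [ -n "${inputs+isset}" ]; then echo "+ form: set"; else echo "+ form: unset"; fi

# declare -p looks at the variable itself, so it also works for empty arrays.
if isset inputs; then echo "declare -p: set"; else echo "declare -p: unset"; fi
if isset no_such_var_here; then echo "declare -p: set"; else echo "declare -p: unset"; fi
```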
bugfix: bin/stop_imports: also need to include `bin/after_import`
bugfix: bin/import_all: now that always using log files to fix output clutter, need to delete created logs if logging is turned off
bugfix: bin/import_all: don't errexit if a background process is Ctrl-C'd
bugfix: bin/import_all: test for being run without an initial ".": don't exit nonzero because this will close the subshell
bugfix: bin/import_all: ensure that this is run in a subshell, which is needed so errexits don't close the terminal window
bin/import_all: documented that this must be run in a subshell (obtained by running `$0`)
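The reason for the subshell can be shown with a small sketch. It assumes `bash -c` as a stand-in for re-running the script via `$0`; the commands inside are placeholders:

```shell
#!/usr/bin/env bash
# The child shell runs in errexit mode and aborts at the first failing command.
# Only the child exits; the shell that launched it keeps running.
status=0
bash -c 'set -e; echo "step 1"; false; echo "step 2 (not reached)"' || status=$?
echo "child exited with status $status; the calling shell is still open"
```

If the same `set -e` code were dot-included directly into an interactive shell, the failing command would close that shell instead.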
bugfix: bin/import_all: need to always use log files for background processes
fix: bin/import_all: Source/import: don't use by_col=1 for this because it's slower for small #s of rows. by_col mode is no longer needed for metadata-only tables because these tables now have a single empty row so that they also work in row-based mode.
fix: bin/import_all: hidden srcs: use with_all for this to avoid needing to list every source, and to display the backgrounded command with the variables substituted
bin/import_all: TNRS, geoscrub: integrated into the list of metadata sources
bin/import_all: TNRS, geoscrub: use import rather than publish because the non-imported tables have now been excluded
fix: bin/import_all: updated for new metadata datasource names (see issue #940)
inputs/.TNRS/schema.sql: taxon_match: insert names via taxon_match_input auto-updatable view instead of directly into taxon_match, to allow the taxon_match columns to be renamed while still supporting inserts using the TNRS column names
inputs/.TNRS/schema.sql: tnrs_match: renamed to taxon_match to use the normalized VegCore name for this, and to avoid repeating the schema name
inputs/.TNRS/schema.sql: tnrs: renamed to tnrs_match to distinguish it from other TNRS-related tables
fix: bin/in_place: usage: removed duplicate copy of [preserve_mtime=1]
bin/in_place: diff: use --brief to avoid scanning the entire file for large files
bin/in_place: added $preserve_mtime flag
lib/sh/db.sh pg_dump(), bin/pg_dump_vegbien: --format: use the long form of the formats to make the code self-documenting
bin/repl: match as whole-word text (like SQL identifier): documented that this is a generalization of lib/sql_gen.py map_expr() to work on entire source files
bin/repl, lib/sql_gen.py Expression transforming: documented that this can also be done in Postgres with expression substitution (wiki.vegpath.org/Postgres_queries#expression-substitution)
bin/make_analytical_db: removed remake_diff_tables() because this is now done for each datasource in inputs/input.Makefile
bin/make_analytical_db: removed no longer needed "${public}_validations" schema qualifier, now that it is in the search_path
fix: bin/vegbien_dest: added public_validations
fix: bin/repl: text mode (whether all patterns are plain text) should default to on, not off, if matching entire cells in a spreadsheet
fix: bin/repl: don't consider uppercase SQL keywords to indicate that a word is in a sentence
bugfix: bin/repl: only use excluded_prefix_re/excluded_suffix_re in text mode (used in renaming columns in SQL scripts), to prevent the special coding for column renames from also affecting regular regexp/word replacements
bugfix: bin/repl: text mode: also don't match if it's part of a '-'-separated identifier
bugfix: bin/repl: text mode: also don't match if it's a word in a sentence
bugfix: bin/repl: text mode: turned off the suffix matching, because there are cases where a mapping adds a suffix which would cause the same replacement to be performed repeatedly
bin/repl: text mode: exclude prefixes that should not cause replacement, to avoid doubling leading *
bin/repl: text mode: also match w/ suffix (e.g. _verbatim)
bin/psql_verbose_vegbien: use \\ instead of \ inside '' because this is sh, not bash
bin/psql_verbose_vegbien: changed prep-statement order to match lib/sh/db.sh psql()
bin/psql_verbose_vegbien: use `\set VERBOSITY terse` to hide stack traces/DETAIL sections of error messages, like in lib/sh/db.sh psql()
bin/make_analytical_db: added `public_validations.remake_diff_tables()`
bugfix: bin/pg_dump_vegbien: fixed arg-count check to allow passing command-line options to pg_dump via args
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
bugfix: bin/boldify: also match [[]]-style links at the beginning and end of a line
bin/boldify: made it idempotent
bugfix: bin/boldify: fixed extended regular expression syntax, which doesn't support a \] inside [] (you instead have to put the ] right after the opening [^ )
added bin/boldify, which makes Redmine links bold
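The bracket-expression rule from the regular expression bugfix above can be sketched as follows. This is a hypothetical illustration, not the real bin/boldify: the function name is made up, and wrapping a link in `*...*` to make it bold is an assumption about the markup.

```shell
#!/usr/bin/env bash
# Inside a bracket expression, ] is literal only when it comes first
# (right after [ or [^), so "any character except ]" is written [^]], not [^\]].
boldify_sketch() { sed -E 's/\[\[[^]]+]]/*&*/g'; }

echo 'see [[Some page]] for details' | boldify_sketch
```

Unlike the real script after the idempotency change, this sketch would add another pair of `*` each time it is re-run on its own output.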
bugfix: bin/map: in_is_db: don't ignore errors when the table does not exist, because these prevent an errexit and allow an import to continue when a staging table is missing. suppressing this error had previously been necessary because metadata-only tables (Source/) used to not have installed staging tables, and the program had to react accordingly.
bugfix: bin/pg_dump_limit: support errexit by ignoring the nonzero exit status that grep returns when it doesn't match anything
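One way to ignore grep's "no match" status under errexit, as a sketch only (the helper name is hypothetical and the real bin/pg_dump_limit may do this differently): grep exits with 1 when nothing matches and with 2 or more on a real error, so only status 1 is treated as success.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pass through grep's output, but treat "no lines
# matched" (status 1) as success. Real errors (status 2+) still fail.
grep_ok_if_no_match() { grep "$@" || [ $? -eq 1 ]; }

(
  set -e
  printf 'alpha\nbeta\n' | grep_ok_if_no_match gamma
  echo "still running after a non-matching grep"
)
```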
bin/make_analytical_db: don't regenerate family_higher_plant_group from the NCBI data because the lookup table is now prepopulated as part of the schema
bin/import_all: don't import NCBI because the lookup table is now prepopulated as part of the schema
bugfix: bin/import_all: run in errexit mode, so that if the user cancels reinstalling of the import schema, the script will then abort instead of continuing and using the wrong schema
bin/map: support param start="", which indicates the default value. this fixes a bug in inputs/input.Makefile $(restart_row), which outputs "" if an explicit starting row is not found.
bugfix: bin/with_all: @inputs default value: use `local`, so that the default value is only set for the current function and doesn't leak back out into the caller. this fixes a bug in subset imports where import_all's Source/import call to with_all would add the .* datasources, but these would then stay in for the import_scrub call, causing extra .* datasources to incorrectly be imported.
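The leak described above can be reproduced with a small sketch. The two functions are hypothetical stand-ins for with_all, and the datasource names are made up:

```shell
#!/usr/bin/env bash
# Without local, the default is assigned to the caller's (global) variable.
leaky() {
  if ! declare -p inputs &>/dev/null; then inputs=(inputs/.a inputs/b); fi
  echo "leaky ran on: ${inputs[*]}"
}

# With local, the default exists only until the function returns.
scoped() {
  if ! declare -p inputs &>/dev/null; then local inputs=(inputs/.a inputs/b); fi
  echo "scoped ran on: ${inputs[*]}"
}

unset inputs; scoped
if declare -p inputs &>/dev/null; then echo "after scoped: default leaked"; else echo "after scoped: inputs still unset"; fi

unset inputs; leaky
if declare -p inputs &>/dev/null; then echo "after leaky: default leaked"; else echo "after leaky: inputs still unset"; fi
```

In the leaky version, a later call in the same shell would see the first call's default list as if the user had supplied it.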
bin/make_analytical_db: removed no longer needed setting of $schema to $public, because this is now done by psql()
bugfix: bin/import_all: restore the working dir when main() is done, in case it started as something other than the root dir
bin/after_import: support turning off the end-of-import backup for imports that are not the full database
bugfix: bin/make_analytical_db: when running into a public schema other than "public", also pass this to `/run export_` (which currently uses $schema instead of $public)
bugfix: bin/import_all: fix $ when .-included without args (which causes bash to put the wrong values in $ instead of leaving it empty)
bin/import_all: `make schemas/$version/install`: reinstall instead to allow re-running the import to the same custom schema (e.g. 2013-10-18.Brian_Enquist.Canadensys)
bin/import_all: `make schemas/$version/install`: ignore errors if schema exists, to support running with -e
bugfix: bin/import_all: removing inputs/.TNRS/tnrs/tnrs.make.lock: use `"rm" -f` instead of plain "rm" to avoid having an error exit status, which will abort the script if run with the -e flag (as runscripts are)
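A minimal sketch of why `-f` matters under the -e flag, assuming a made-up path in place of the real lock file:

```shell
#!/usr/bin/env bash
(
  set -e
  lock="${TMPDIR:-/tmp}/no-such-lock.$$"   # hypothetical path that does not exist
  # With -f, a missing file is not an error, so rm exits with status 0.
  # A plain rm would exit nonzero here and errexit would abort the script.
  "rm" -f "$lock"
  echo "continued past rm -f"
)
```

Quoting the command name, as the entry does, also keeps bash from substituting an alias for it (for example an interactive `rm -i` alias).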
bin/*_all: *_main(): renamed to just main() because it does not matter that other shell-includes' main() methods will clobber this, because it is only executed once
bugfix: bin/import_all: Source tables: use .../import instead of import_temp because import_temp is only needed when importing all tables, to prevent the temp suffix from being removed yet
fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
bugfix: bin/import_all: need to publish datasources that won't be published by `make .../import`, so that the per-datasource import XPaths that refer to TNRS/geoscrub will link up with the TNRS/geoscrub source entry instead of creating a new entry without the metadata (because the entry with the metadata was named TNRS.new/geoscrub.new)
bin/import_all: removed no longer needed import of geoscrub data, because analytical_stem_view is now joined to the geoscrub_output table directly, instead of using the imported canon_place entries
bin/with_all: $all: renamed to $hidden_srcs for clarity, since this now just adds the hidden (.*) datasources, rather than always using all datasources
bugfix: bin/with_all: in $all mode, just prepend the .* datasources to the user-selected (or default) @inputs, so that using $all to add these datasources doesn't inadvertently cause the action to be performed for all datasources
bin/import_all: usage: documented that this can now be run with a custom datasources list (each of the form inputs/src/)
bin/with_all: added support for providing a custom list of inputs to run the command on
bin/import_all: use just import_scrub, not reimport_scrub, because import_scrub now automatically publishes the datasource's import (i.e. removes the temp suffix)
bin/map: usage: documented that verbosity > 3 in commit mode turns on debug_temp mode, which creates real tables instead of temp tables
bugfix: bin/import_all: use reimport_scrub instead of import_scrub so that the temp suffix of the datasource name is removed
bugfix: bin/after_import: run backups/fix_perms right after the backup files are created to make them private
bugfix: bin/make_analytical_db: `/run export_`: don't take input from the terminal, because this causes rm to prompt the user (from a background task) about overwriting the previous export