/trunk/bin - Changes - BIEN 3 - NCEAS Projects

root/trunk/bin @ 14906

svn:ignore: dotlockfile

#	Date	Author	Comment
14905	10/26/2014 04:58 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: don't disable errexit because this prevents the program from being Ctrl-C'd. this functionality is no longer needed now that the README.TXT instructs to run bin/import_all in a subshell.
14904	10/26/2014 04:56 PM	Aaron Marcuse-Kubitza	bin/import_all: removed functionality now provided by util.run
14903	10/26/2014 04:56 PM	Aaron Marcuse-Kubitza	bin/import_all: converted to a runscript so it can use runscript functionality
14732	09/24/2014 07:23 PM	Aaron Marcuse-Kubitza	bugfix: bin/make: don't run verbosity_compat until right before executing the external command, so that it doesn't mess up the logging mechanism. this is run automatically by command(), so there is no need to do anything here. note that logging bugs like these can now be troubleshooted much more easily with pst() to narrow down which functions could be causing the problem.
14644	09/04/2014 07:41 AM	Aaron Marcuse-Kubitza	schemas/public_.sql: viewFullOccurrence_: renamed to view_full_occurrence_ at Brian M's and Martha's request (e-mails from Martha on 2014-8-12 at 17:37PT, and from Brian M on 2014-8-13 at 16:21PT). note that this change has already been made on vegbiendev.
14622	08/28/2014 08:13 PM	Aaron Marcuse-Kubitza	lib/tnrs.py single_tnrs_request(), bin/tnrs_client: use_tnrs_export: default to False because this mode uses incorrect selected matches (vegpath.org/issues/943), and the JSON mode that fixes this is now available
14621	08/28/2014 08:05 PM	Aaron Marcuse-Kubitza	bin/tnrs_db: tnrs.tnrs_request() call: explicitly set use_tnrs_export=True so that this continues to work if the default value is changed
14575	08/25/2014 09:13 PM	Aaron Marcuse-Kubitza	bin/tnrs_client: added env var to configure use_tnrs_export
14443	08/09/2014 10:23 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: materialize viewFullOccurrence_individual_view instead of analytical_stem_view because analytical_stem_view is now generatable via a simple join onto viewFullOccurrence_individual_view. this avoids running into potential disk space constraints when materializing and backing up both tables (~50 GB/table * 2 tables * 2 copies (incl. the backup) = 200 GB, which is very close to the available disk space).
14439	08/09/2014 09:52 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed extra () around psql_verbose_vegbien
14438	08/09/2014 09:51 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed no longer used mk_table()
14437	08/09/2014 09:49 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: use more up-to-date *_view_modify() functions instead of mk_table()
14395	08/01/2014 05:47 AM	Aaron Marcuse-Kubitza	bin/after_import: use new bin/make_backups
14394	08/01/2014 05:47 AM	Aaron Marcuse-Kubitza	bugfix: bin/make_backups: need to `popd` when done
14393	08/01/2014 05:46 AM	Aaron Marcuse-Kubitza	bugfix: bin/make_backups: need to `set +x` when done
14392	08/01/2014 05:44 AM	Aaron Marcuse-Kubitza	bin/make_backups: run with initial "." so background processes will be owned by the invoking shell
14391	08/01/2014 04:40 AM	Aaron Marcuse-Kubitza	added bin/make_backups
14099	07/17/2014 09:05 AM	Aaron Marcuse-Kubitza	bin/import_all: hidden_srcs(): removed `by_col=1` because these should be done in the same mode as the main datasources
14096	07/16/2014 07:13 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed threatened_taxonlabel because this is now handled by iucn_red_list
14095	07/16/2014 07:12 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: added iucn_red_list_view_modify()
14094	07/16/2014 06:48 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed unused code to create views in the analytical_db schema
14093	07/16/2014 06:35 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: merged mk_table and mk_analytical_table since they now do the same thing
14092	07/16/2014 06:34 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed unused code to create views in the analytical_db schema
14089	07/16/2014 03:50 PM	Aaron Marcuse-Kubitza	bugfix: bin/with_all: isset(): need to use `&>/dev/null` instead of `>&-`, etc because closing an fd causes declare to return false
14088	07/16/2014 03:31 PM	Aaron Marcuse-Kubitza	bugfix: bin/with_all, import_all: don't disown processes because they should be auto-killed if the shell is (disown was only needed before we used screen)
14073	07/15/2014 04:49 PM	Aaron Marcuse-Kubitza	bin/import_all: delete_logs(): documented that `trap EXIT` doesn't run until shell exit
14072	07/15/2014 04:48 PM	Aaron Marcuse-Kubitza	bin/import_all: delete_logs(): print when this happens, so it can be verified that it's happening properly
14071	07/15/2014 04:32 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: need to run delete_logs manually because `trap EXIT` doesn't run until bg cmds done
14070	07/15/2014 04:28 PM	Aaron Marcuse-Kubitza	bin/import_all: delete_logs: moved testing of whether to delete logs to delete_logs() so that delete_logs() can be run regardless of the $delete_logs setting
14069	07/15/2014 03:58 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: delete_logs(): also need to match log filenames when n=""
14065	07/15/2014 02:39 PM	Aaron Marcuse-Kubitza	bugfix: bin/with_all: isset(): need to use `>&- 2>&-` because &> does not work with - as the dest
14035	07/14/2014 08:05 PM	Aaron Marcuse-Kubitza	fix: bin/with_all: removed debug statements
14034	07/14/2014 05:19 PM	Aaron Marcuse-Kubitza	bugfix: bin/with_all: testing if @inputs is set: `"${inputs+isset}"` syntax doesn't work for empty arrays, so need to use `declare -p` instead
13988	07/11/2014 10:03 AM	Aaron Marcuse-Kubitza	bugfix: bin/stop_imports: also need to include `bin/after_import`
13985	07/11/2014 09:13 AM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: now that always using log files to fix output clutter, need to delete created logs if logging is turned off
13984	07/11/2014 08:45 AM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: don't errexit if a background process is Ctrl-C'd
13983	07/11/2014 08:41 AM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: was run without initial "." test: don't exit nonzero because this will close the subshell
13982	07/11/2014 08:38 AM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: ensure that this is run in a subshell, which is needed so errexits don't close the terminal window
13981	07/11/2014 08:32 AM	Aaron Marcuse-Kubitza	bin/import_all: documented that this must be run in a subshell (obtained by running `$0`)
13980	07/11/2014 08:25 AM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: need to always use log files for background processes
13979	07/11/2014 08:12 AM	Aaron Marcuse-Kubitza	fix: bin/import_all: Source/import: don't use by_col=1 for this because it's slower for small #s of rows. by_col mode is no longer needed for metadata-only tables because these tables now have a single empty row so that they also work in row-based mode.
13978	07/11/2014 08:06 AM	Aaron Marcuse-Kubitza	fix: bin/import_all: hidden srcs: use with_all for this to avoid needing to list every source, and to display the backgrounded command with the variables substituted
13977	07/11/2014 07:40 AM	Aaron Marcuse-Kubitza	bin/import_all: TNRS, geoscrub: integrated into the list of metadata sources
13976	07/11/2014 07:39 AM	Aaron Marcuse-Kubitza	bin/import_all: TNRS, geoscrub: use import rather than publish because the non-imported tables have now been excluded
13974	07/10/2014 07:25 PM	Aaron Marcuse-Kubitza	fix: bin/import_all: updated for new metadata datasource names (see issue #940)
13866	06/26/2014 04:11 AM	Aaron Marcuse-Kubitza	inputs/.TNRS/schema.sql: taxon_match: insert names via taxon_match_input auto-updatable view instead of directly into taxon_match, to allow the taxon_match columns to be renamed while still supporting inserts using the TNRS column names
13862	06/26/2014 02:38 AM	Aaron Marcuse-Kubitza	inputs/.TNRS/schema.sql: tnrs_match: renamed to taxon_match to use the normalized VegCore name for this, and to avoid repeating the schema name
13850	06/25/2014 03:33 PM	Aaron Marcuse-Kubitza	inputs/.TNRS/schema.sql: tnrs: renamed to tnrs_match to distinguish it from other TNRS-related tables
13793	06/17/2014 04:26 PM	Aaron Marcuse-Kubitza	fix: bin/in_place: usage: removed duplicate copy of [preserve_mtime=1]
13309	04/23/2014 10:01 PM	Aaron Marcuse-Kubitza	bin/in_place: diff: use --brief to avoid scanning the entire file for large files
13308	04/23/2014 09:57 PM	Aaron Marcuse-Kubitza	bin/in_place: added $preserve_mtime flag
13161	04/17/2014 02:51 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh pg_dump(), bin/pg_dump_vegbien: --format: use the long form of the formats to make the code self-documenting
13077	04/09/2014 02:52 AM	Aaron Marcuse-Kubitza	bin/repl: match as whole-word text (like SQL identifier): documented that this is a generalization of lib/sql_gen.py map_expr() to work on entire source files
13076	04/09/2014 02:50 AM	Aaron Marcuse-Kubitza	bin/repl, lib/sql_gen.py Expression transforming: documented that this can also be done in Postgres with expression substitution (wiki.vegpath.org/Postgres_queries#expression-substitution)
12995	03/30/2014 06:31 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed remake_diff_tables() because this is now done for each datasource in inputs/input.Makefile
12990	03/30/2014 06:02 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed no longer needed "${public}_validations" schema qualifier, now that it is in the search_path
12989	03/30/2014 06:00 PM	Aaron Marcuse-Kubitza	fix: bin/vegbien_dest: added public_validations
12914	03/26/2014 09:23 PM	Aaron Marcuse-Kubitza	fix: bin/repl: text mode (whether all patterns are plain text) should default to on, not off, if matching entire cells in a spreadsheet
12897	03/26/2014 02:17 PM	Aaron Marcuse-Kubitza	fix: bin/repl: don't consider uppercase SQL keywords to indicate that a word is in a sentence
12754	03/18/2014 05:22 PM	Aaron Marcuse-Kubitza	bugfix: bin/repl: only use excluded_prefix_re/excluded_suffix_re in text mode (used in renaming columns in SQL scripts), to prevent the special coding for column renames from also affecting regular regexp/word replacements
12750	03/18/2014 04:59 AM	Aaron Marcuse-Kubitza	bugfix: bin/repl: text mode: also don't match if it's part of a '-'-separated identifier
12749	03/18/2014 04:57 AM	Aaron Marcuse-Kubitza	bugfix: bin/repl: text mode: also don't match if it's a word in a sentence
12748	03/18/2014 04:42 AM	Aaron Marcuse-Kubitza	bugfix: bin/repl: text mode: turned off the suffix matching, because there are cases where a mapping adds a suffix which would cause the same replacement to be performed repeatedly
12746	03/18/2014 04:25 AM	Aaron Marcuse-Kubitza	bin/repl: text mode: exclude prefixes that should not cause replacement, to avoid doubling leading *
12742	03/18/2014 04:03 AM	Aaron Marcuse-Kubitza	bin/repl: text mode: also match w/ suffix (eg. _verbatim)
12394	02/23/2014 08:29 PM	Aaron Marcuse-Kubitza	bin/psql_verbose_vegbien: use \\ instead of \ inside '' because this is sh, not bash
12393	02/23/2014 08:26 PM	Aaron Marcuse-Kubitza	bin/psql_verbose_vegbien: changed prep-statement order to match lib/sh/db.sh psql()
12392	02/23/2014 08:26 PM	Aaron Marcuse-Kubitza	bin/psql_verbose_vegbien: use `\set VERBOSITY terse` to hide stack traces/DETAIL sections of error messages, like in lib/sh/db.sh psql()
12391	02/23/2014 08:11 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: added `public_validations.remake_diff_tables()`
12041	02/05/2014 12:55 AM	Aaron Marcuse-Kubitza	bugfix: bin/pg_dump_vegbien: fixed arg-count check to allow passing command-line options to pg_dump via args
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11952	01/15/2014 08:16 AM	Aaron Marcuse-Kubitza	bugfix: bin/boldify: also match [[]]-style links at the beginning and end of a line
11951	01/15/2014 08:11 AM	Aaron Marcuse-Kubitza	bin/boldify: made it idempotent
11950	01/15/2014 08:08 AM	Aaron Marcuse-Kubitza	bugfix: bin/boldify: fixed extended regular expression syntax, which doesn't support a \] inside [] (you instead have to put the ] right after the opening [^ )
11949	01/15/2014 07:59 AM	Aaron Marcuse-Kubitza	added bin/boldify, which makes Redmine links bold
11918	12/17/2013 05:47 AM	Aaron Marcuse-Kubitza	bugfix: bin/map: in_is_db: don't ignore errors when the table does not exist, because these prevent an errexit and allow an import to continue when a staging table is missing. suppressing this error had previously been necessary because metadata-only tables (Source/) used to not have installed staging tables, and the program had to react accordingly.
11870	12/09/2013 03:09 PM	Aaron Marcuse-Kubitza	bugfix: bin/pg_dump_limit: support errexit by ignoring the nonzero exit status that grep returns when it doesn't match anything
11840	12/05/2013 08:38 AM	Aaron Marcuse-Kubitza	bin/make_analytical_db: don't regenerate family_higher_plant_group from the NCBI data because the lookup table is now prepopulated as part of the schema
11839	12/05/2013 08:37 AM	Aaron Marcuse-Kubitza	bin/import_all: don't import NCBI because the lookup table is now prepopulated as part of the schema
11823	12/04/2013 07:26 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: run in errexit mode, so that if the user cancels reinstalling of the import schema, the script will then abort instead of continuing and using the wrong schema
11806	12/03/2013 08:58 AM	Aaron Marcuse-Kubitza	bin/map: support param start="", which indicates the default value. this fixes a bug in inputs/input.Makefile $(restart_row), which outputs "" if an explicit starting row is not found.
11456	10/29/2013 03:33 AM	Aaron Marcuse-Kubitza	bugfix: bin/with_all: @inputs default value: use `local`, so that the default value is only set for the current function and doesn't leak back out into the caller. this fixes a bug in subset imports where import_all's Source/import call to with_all would add the .* datasources, but these would then stay in for the import_scrub call, causing extra .* datasources to incorrectly be imported.
11434	10/24/2013 05:07 PM	Aaron Marcuse-Kubitza	bin/make_analytical_db: removed no longer needed setting of $schema to $public, because this is now done by psql()
11430	10/24/2013 04:03 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: restore the working dir when main() is done, in case it started as something other than the root dir
11429	10/24/2013 03:49 PM	Aaron Marcuse-Kubitza	bin/after_import: support turning off the end-of-import backup for imports that are not the full database
11423	10/24/2013 01:11 PM	Aaron Marcuse-Kubitza	bugfix: bin/make_analytical_db: when running into a public schema other than "public", also pass this to `/run export_` (which currently uses $schema instead of $public)
11422	10/24/2013 01:09 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: fix `$@` when .-included without args (which causes bash to put the wrong values in `$@` instead of leaving it empty)
11421	10/24/2013 01:09 PM	Aaron Marcuse-Kubitza	bin/import_all: `make schemas/$version/install`: reinstall instead to allow re-running the import to the same custom schema (e.g. 2013-10-18.Brian_Enquist.Canadensys)
11420	10/24/2013 01:07 PM	Aaron Marcuse-Kubitza	bin/import_all: `make schemas/$version/install`: ignore errors if schema exists, to support running with -e
11419	10/23/2013 11:10 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: removing inputs/.TNRS/tnrs/tnrs.make.lock: use `"rm" -f` instead of plain "rm" to avoid having an error exit status, which will abort the script if run with the -e flag (as runscripts are)
11416	10/23/2013 10:34 PM	Aaron Marcuse-Kubitza	bin/_all: _main(): renamed to just main() because it does not matter that other shell-includes' main() methods will clobber this, because it is only executed once
11415	10/23/2013 10:29 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: Source tables: use .../import instead of import_temp because import_temp is only needed when importing all tables, to prevent the temp suffix from being removed yet
11396	10/21/2013 07:14 PM	Aaron Marcuse-Kubitza	fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
11393	10/20/2013 05:21 PM	Aaron Marcuse-Kubitza	bugfix: bin/import_all: need to publish datasources that won't be published by `make .../import`, so that the per-datasource import XPaths that refer to TNRS/geoscrub will link up with the TNRS/geoscrub source entry instead of creating a new entry without the metadata (because the entry with the metadata was named TNRS.new/geoscrub.new)
11390	10/20/2013 04:55 PM	Aaron Marcuse-Kubitza	bin/import_all: removed no longer needed import of geoscrub data, because analytical_stem_view is now joined to the geoscrub_output table directly, instead of using the imported canon_place entries
11374	10/19/2013 06:56 PM	Aaron Marcuse-Kubitza	bin/with_all: $all: renamed to $hidden_srcs for clarity, since this now just adds the hidden (.*) datasources, rather than always using all datasources
11373	10/19/2013 06:50 PM	Aaron Marcuse-Kubitza	bugfix: bin/with_all: in $all mode, just prepend the .* datasources to the user-selected (or default) @inputs, so that using $all to add these datasources doesn't inadvertently cause the action to be performed for all datasources
11371	10/19/2013 02:15 PM	Aaron Marcuse-Kubitza	bin/import_all: usage: documented that this can now be run with a custom datasources list (each of the form inputs/src/)
11370	10/19/2013 02:02 PM	Aaron Marcuse-Kubitza	bin/with_all: added support for providing a custom list of inputs to run the command on
11286	10/17/2013 04:44 PM	Aaron Marcuse-Kubitza	bin/import_all: use just import_scrub, not reimport_scrub, because import_scrub now automatically publishes the datasource's import (i.e. removes the temp suffix)
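
Note on r14034, r14065 and r14089 (bin/with_all isset()): these three commits converge on testing whether an array variable is set via `declare -p` with output sent to /dev/null. The following is a minimal sketch of that technique, not the actual bin/with_all code. The function body and the variable names (empty_arr, never_declared, *_result) are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical isset(), modeled on the commit messages only;
# the real bin/with_all implementation may differ.
isset() # usage: isset var_name
{
    # `declare -p` succeeds for any declared variable, including an empty
    # array. Output is redirected to /dev/null instead of closing the fds
    # (`>&- 2>&-`), which r14089 reports makes declare return false.
    declare -p "$1" &>/dev/null
}

empty_arr=()

# ${var+word} on an array name tests element 0, so an empty array
# expands to nothing here, as if it were unset (the problem in r14034).
plus_result="${empty_arr+isset}"

if isset empty_arr; then declare_result=set; else declare_result=unset; fi
if isset never_declared; then missing_result=set; else missing_result=unset; fi

echo "\${empty_arr+isset} expands to: '${plus_result}'"
echo "isset empty_arr: ${declare_result}"
echo "isset never_declared: ${missing_result}"
```

In this sketch the `${empty_arr+isset}` expansion cannot tell an empty array from an unset one, while the `declare -p` test reports the empty array as set and the undeclared name as unset.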
