  • svn:executable: *

# Date Author Comment
14071 07/15/2014 04:32 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: need to run delete_logs manually because `trap EXIT` doesn't run until bg cmds done

14070 07/15/2014 04:28 PM Aaron Marcuse-Kubitza

bin/import_all: delete_logs: moved testing of whether to delete logs to delete_logs() so that delete_logs() can be run regardless of the $delete_logs setting

14069 07/15/2014 03:58 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: delete_logs(): also need to match log filenames when n=""

13985 07/11/2014 09:13 AM Aaron Marcuse-Kubitza

bugfix: bin/import_all: now that always using log files to fix output clutter, need to delete created logs if logging is turned off

13984 07/11/2014 08:45 AM Aaron Marcuse-Kubitza

bugfix: bin/import_all: don't errexit if a background process is Ctrl-C'd

13983 07/11/2014 08:41 AM Aaron Marcuse-Kubitza

bugfix: bin/import_all: was run without initial "." test: don't exit nonzero because this will close the subshell

13982 07/11/2014 08:38 AM Aaron Marcuse-Kubitza

bugfix: bin/import_all: ensure that this is run in a subshell, which is needed so errexits don't close the terminal window
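
A minimal sketch of why the extra shell matters, assuming a child `bash` stands in for the subshell obtained by running `$0`. The commands are placeholders and `run_in_child_shell` is a hypothetical name, not part of bin/import_all:

```shell
# Sketch: errexit inside a child shell ends only that child.
run_in_child_shell() {
    status=0
    # `set -e` makes the child exit at the first failing command (`false`),
    # so the echo inside the child never runs.
    bash -c 'set -e; false; echo "not reached"' || status=$?
    echo "child exited with status $status; calling shell still alive"
}

run_in_child_shell
```

Had the same `set -e; false` run directly in an interactive shell, that shell itself would exit, which is what closes the terminal window.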

13981 07/11/2014 08:32 AM Aaron Marcuse-Kubitza

bin/import_all: documented that this must be run in a subshell (obtained by running `$0`)

13980 07/11/2014 08:25 AM Aaron Marcuse-Kubitza

bugfix: bin/import_all: need to always use log files for background processes

13979 07/11/2014 08:12 AM Aaron Marcuse-Kubitza

fix: bin/import_all: Source/import: don't use by_col=1 for this because it's slower for small #s of rows. by_col mode is no longer needed for metadata-only tables because these tables now have a single empty row so that they also work in row-based mode.

13978 07/11/2014 08:06 AM Aaron Marcuse-Kubitza

fix: bin/import_all: hidden srcs: use with_all for this to avoid needing to list every source, and to display the backgrounded command with the variables substituted

13977 07/11/2014 07:40 AM Aaron Marcuse-Kubitza

bin/import_all: TNRS, geoscrub: integrated into the list of metadata sources

13976 07/11/2014 07:39 AM Aaron Marcuse-Kubitza

bin/import_all: TNRS, geoscrub: use import rather than publish because the non-imported tables have now been excluded

13974 07/10/2014 07:25 PM Aaron Marcuse-Kubitza

fix: bin/import_all: updated for new metadata datasource names (see issue #940)

11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11839 12/05/2013 08:37 AM Aaron Marcuse-Kubitza

bin/import_all: don't import NCBI because the lookup table is now prepopulated as part of the schema

11823 12/04/2013 07:26 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: run in errexit mode, so that if the user cancels reinstalling of the import schema, the script will then abort instead of continuing and using the wrong schema

11430 10/24/2013 04:03 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: restore the working dir when main() is done, in case it started as something other than the root dir

11422 10/24/2013 01:10 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: fix $ when .-included without args (which causes bash to put the wrong values in $ instead of leaving it empty)

11421 10/24/2013 01:09 PM Aaron Marcuse-Kubitza

bin/import_all: `make schemas/$version/install`: reinstall instead to allow re-running the import to the same custom schema (e.g. 2013-10-18.Brian_Enquist.Canadensys)

11420 10/24/2013 01:07 PM Aaron Marcuse-Kubitza

bin/import_all: `make schemas/$version/install`: ignore errors if schema exists, to support running with -e

11419 10/23/2013 11:10 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: removing inputs/.TNRS/tnrs/tnrs.make.lock: use `"rm" -f` instead of plain "rm" to avoid having an error exit status, which will abort the script if run with the -e flag (as runscripts are)
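
The effect of `"rm" -f` under errexit can be sketched as follows. The temporary directory stands in for inputs/.TNRS/tnrs/, and `clear_lock` is a hypothetical helper name used only for illustration:

```shell
# Sketch: remove a possibly-missing lockfile without tripping errexit.
clear_lock() (
    set -e
    lock_dir="$(mktemp -d)"
    lockfile="$lock_dir/tnrs.make.lock"   # never created in this demo

    # Quoting the command name skips alias expansion, and -f turns a
    # missing file into a non-error, so errexit does not abort here.
    "rm" -f "$lockfile"
    echo "lockfile cleared, script still running"

    rmdir "$lock_dir"
)

clear_lock
```

With a plain `rm "$lockfile"` in place of the quoted `-f` form, the missing file would produce a nonzero exit status and the `set -e` body would stop before the echo.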

11416 10/23/2013 10:34 PM Aaron Marcuse-Kubitza

bin/*_all: *_main(): renamed to just main() because it does not matter that other shell-includes' main() methods will clobber this, because it is only executed once

11415 10/23/2013 10:29 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: Source tables: use .../import instead of import_temp because import_temp is only needed when importing all tables, to prevent the temp suffix from being removed yet

11393 10/20/2013 05:21 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: need to publish datasources that won't be published by `make .../import`, so that the per-datasource import XPaths that refer to TNRS/geoscrub will link up with the TNRS/geoscrub source entry instead of creating a new entry without the metadata (because the entry with the metadata was named TNRS.new/geoscrub.new)

11390 10/20/2013 04:55 PM Aaron Marcuse-Kubitza

bin/import_all: removed no longer needed import of geoscrub data, because analytical_stem_view is now joined to the geoscrub_output table directly, instead of using the imported canon_place entries

11374 10/19/2013 06:56 PM Aaron Marcuse-Kubitza

bin/with_all: $all: renamed to $hidden_srcs for clarity, since this now just adds the hidden (.*) datasources, rather than always using all datasources

11371 10/19/2013 02:15 PM Aaron Marcuse-Kubitza

bin/import_all: usage: documented that this can now be run with a custom datasources list (each of the form inputs/src/)

11286 10/17/2013 04:44 PM Aaron Marcuse-Kubitza

bin/import_all: use just import_scrub, not reimport_scrub, because import_scrub now automatically publishes the datasource's import (i.e. removes the temp suffix)

10871 09/05/2013 12:11 AM Aaron Marcuse-Kubitza

bugfix: bin/import_all: use reimport_scrub instead of import_scrub so that the temp suffix of the datasource name is removed

10849 08/31/2013 07:44 PM Aaron Marcuse-Kubitza

bugfix: bin/import_all: `rm inputs/.TNRS/tnrs/tnrs.make.lock`: need to use `"rm"` instead of `rm` so that we don't use any rm alias the user might have in their shell (import_all is run in the calling shell so that the jobs are owned by the calling shell)

10847 08/31/2013 07:27 PM Aaron Marcuse-Kubitza

bin/import_all: added step to remove any leftover TNRS lockfile (previously done manually)

10586 08/03/2013 09:14 PM Aaron Marcuse-Kubitza

bin/import_all: use new bin/after_import

10580 08/03/2013 12:25 AM Aaron Marcuse-Kubitza

bin/import_all: with_all import_scrub: documented that this step uses $by_col, so that users know to include by_col=1 when running this step separately

10579 08/03/2013 12:24 AM Aaron Marcuse-Kubitza

bin/import_all: use column-based import (by_col=1) by default, instead of requiring the user to explicitly specify it. instead turn it off explicitly (by_col=) for row-based import.
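
One way to get "on by default, off with `by_col=`" is the `${var-default}` expansion, which applies the default only when the variable is unset. This is a sketch under that assumption; the log does not show how bin/import_all actually implements it, and `effective_by_col` is a hypothetical name:

```shell
# Sketch: default to column-based import unless the caller sets by_col,
# including setting it to the empty string.
effective_by_col() {
    # ${by_col-1}: use 1 only when by_col is unset; an empty value is kept.
    echo "by_col=${by_col-1}"
}

unset by_col;  effective_by_col    # by_col=1 (default, column-based)
by_col=;       effective_by_col    # by_col=  (explicitly off, row-based)
```

The colon-less form matters here: `${by_col:-1}` would also replace an empty value, so `by_col=` could no longer switch the mode off.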

10576 08/02/2013 11:55 PM Aaron Marcuse-Kubitza

bin/import_all: don't set $dump_opts until running the backup command that uses it, so that the user can run this backup command separately just by copying the line out of the script (without worrying about env vars that need to be set, other than $version which is visible outside the script)

7618 02/20/2013 08:58 AM Aaron Marcuse-Kubitza

Moved wait on tnrs.make lock from import_all to make_analytical_db, so that running make_analytical_db for a one-time import also waits on the lock

7419 02/02/2013 11:28 AM Aaron Marcuse-Kubitza

import_all: after_import(): Added wait on tnrs.make's lockfile to ensure that all background scrubbing processes are complete before creating the analytical DB

7418 02/02/2013 11:18 AM Aaron Marcuse-Kubitza

import_all: Moved `waitpid $jobs` into after_import()

7276 01/18/2013 03:25 AM Aaron Marcuse-Kubitza

import_all: Output the PIDs of the import_scrub and after_import processes, so those processes can be managed without shell job control. This is useful if the connection is lost to the remote shell running the import, which prevents using job control on the import processes.

7267 01/16/2013 02:51 PM Aaron Marcuse-Kubitza

import_all: Use new import_scrub (input.Makefile) instead of import, which avoids needing to start background processes for tnrs-remake and scrub-remake

7245 01/16/2013 07:56 AM Aaron Marcuse-Kubitza

input.Makefile: $(import?): Renamed $public_import option to $full_import because it applies to any import of all datasources, not just a public import on vegbiendev

7228 01/15/2013 10:42 PM Aaron Marcuse-Kubitza

import_all: Run disown_all after background processes have been created, so that they will not be aborted if the shell exits (e.g. due to a broken connection). Note that with_all processes are automatically disowned as they are created, but other processes, such as after_import, were not.
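
A rough bash sketch of detaching a background process so the shell stops tracking it and does not signal it on exit. `start_detached` is a hypothetical helper, not the project's `disown_all`, and `sleep` stands in for an import job:

```shell
# Sketch: start a command in the background and remove it from the
# shell's job table (requires bash; `disown` is a bash builtin).
start_detached() {
    "$@" &
    last_pid=$!
    disown "$last_pid"    # the shell will no longer send SIGHUP to this job
}

start_detached sleep 5
echo "detached PID: $last_pid"
if kill -0 "$last_pid" 2>/dev/null; then
    echo "process is still running"
fi
kill "$last_pid"          # clean up the demo process
```

Recording the PID is what allows the process to be managed later with `kill` or `ps`, since `jobs` and `%1`-style job specs no longer refer to it.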

7163 01/11/2013 02:07 AM Aaron Marcuse-Kubitza

import_all: Removed no longer needed TNRS import, which has been replaced by scrub.make (which adds TNRS taxondeterminations after the import instead of creating taxonlabel links before it)

7132 01/09/2013 09:13 AM Aaron Marcuse-Kubitza

inputs/.TNRS/: Changed tnrs+accepted to a view (defined in schema.sql) so accepted names would automatically be populated as they are parsed by TNRS, rather than needing to run `make inputs/.TNRS/tnrs+accepted/reinstall` to populate them

7127 01/09/2013 02:23 AM Aaron Marcuse-Kubitza

import_all: Reinstall tnrs+accepted, for eventual use by unscrubbed_taxondetermination_view

7125 01/09/2013 02:02 AM Aaron Marcuse-Kubitza

import_all: Directly import just the TNRS tables that should be imported, because some TNRS tables are included in import_order.txt so that they are part of the automated testing, but should not be imported at the same time as tnrs_accepted/tnrs_other

7121 01/08/2013 10:19 PM Aaron Marcuse-Kubitza

import_all: Made temporary vars local, so they wouldn't affect the calling shell

7103 01/07/2013 06:39 PM Aaron Marcuse-Kubitza

import_all: Make $dump_opts, $public_import local vars, so they will be automatically unset if the script is aborted

7095 01/07/2013 05:00 PM Aaron Marcuse-Kubitza

import_all: Make $import_source a local var, so it will be automatically unset if the script is aborted

7089 01/07/2013 04:10 PM Aaron Marcuse-Kubitza

import_all: Added command to add scrubbed taxondeterminations

7087 01/07/2013 04:08 PM Aaron Marcuse-Kubitza

import_all: Start tnrs-remake after starting the inputs, so that for subset imports (e.g. n=2), there will already be names to scrub when tnrs-remake starts up and it won't enter pause mode to wait for new rows (the pause is calibrated for full imports, and is too long for subset imports)

7048 01/04/2013 05:25 PM Aaron Marcuse-Kubitza

import_all: Run import with $public_import set in order to exclude excluded datasources

7038 01/03/2013 02:31 AM Aaron Marcuse-Kubitza

import_all: `make backups/vegbien.$version.backup/test`: Documented that this uses $dump_opts. $dump_opts must be manually set when running this command outside of import_all.

7023 12/21/2012 03:34 PM Aaron Marcuse-Kubitza

import_all: Allow caller to override $dump_opts

7022 12/21/2012 03:33 PM Aaron Marcuse-Kubitza

pg_dump_vegbien: Renamed $opts env var to $dump_opts to avoid conflicting with other commands' vars of the same name

6981 12/20/2012 10:45 AM Aaron Marcuse-Kubitza

make_analytical_db: Automatically call export_analytical_db when finished

6977 12/20/2012 10:09 AM Aaron Marcuse-Kubitza

import_all: after_import(): Added `make backups/vegbien.$version.backup/test`

6960 12/19/2012 01:49 PM Aaron Marcuse-Kubitza

import_all: after_import(): Added `make backups/TNRS.backup-remake`

6958 12/19/2012 01:42 PM Aaron Marcuse-Kubitza

import_all: after_import(): Added export_analytical_db

6946 12/19/2012 12:30 PM Aaron Marcuse-Kubitza

import_all: Run the import directly into a new, already-versioned public schema. This removes the need to manually rename the schema after import, and allows the backup commands to use the stored $version shell variable to refer to the last import.

6897 12/18/2012 09:41 PM Aaron Marcuse-Kubitza

import_all: Run all imports (not just the main datasources' import) with $import_source turned off, so that the Source tables will not be imported a second time when the datasource's main tables are imported. Note that it's not necessary to wait for asynchronous commands after the jobs for the main import are started (so that $import_source is not unset until after they are started), because with_all does not return until all jobs are started and have noted the $import_source setting in effect in the shell environment.
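
The point about not needing to wait can be illustrated with a small sketch: a background job keeps the variable values that were in effect when it was started, so unsetting the variable afterwards does not affect it. The value `off` and the function name `demo_env_snapshot` are placeholders; the log does not show the real setting:

```shell
# Sketch: a background job sees the variables as they were at start time.
demo_env_snapshot() {
    import_source=off
    ( sleep 1; echo "job sees import_source=${import_source-unset}" ) &
    unset import_source      # happens after the job has been forked
    wait                     # let the background job finish
    echo "shell sees import_source=${import_source-unset}"
}

demo_env_snapshot
```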

6896 12/18/2012 09:32 PM Aaron Marcuse-Kubitza

import_all: Source tables import: Fixed bug: the $all option to with_all is needed to also include special datasources starting with "."

6594 12/04/2012 09:52 PM Aaron Marcuse-Kubitza

import_all: Fixed bug: need to wait for all asynchronous commands started before the main import, not just the first

6593 12/04/2012 09:51 PM Aaron Marcuse-Kubitza

import_all: Import all Source tables before the herbaria list, so that any custom metadata will override the info in the herbaria list

6382 11/24/2012 03:33 AM Aaron Marcuse-Kubitza

import_all: Added import of inputs/.herbaria/ before the main import

6211 11/15/2012 07:45 PM Aaron Marcuse-Kubitza

import_all: Change to main directory make targets are run from. Use relative paths to bin/ commands, which is possible now that the current dir is set.

6210 11/15/2012 07:41 PM Aaron Marcuse-Kubitza

import_all: Create a background process that waits until the import is done and then runs make_analytical_db

6208 11/15/2012 06:52 PM Aaron Marcuse-Kubitza

import_all: Documented that `wait %1` waits for asynchronous commands

5959 11/01/2012 10:52 AM Aaron Marcuse-Kubitza

import_all: After starting geoscrub import in the background, wait for make commands to scroll by before starting NCBI import

5957 11/01/2012 10:22 AM Aaron Marcuse-Kubitza

import_all: Removed explicit by_col=1 from datasources that don't require it for proper import. (It will still be set if the user provides it on the command line.)

5944 11/01/2012 09:01 AM Aaron Marcuse-Kubitza

import_all: Added geoscrub import, which can happen concurrently with NCBI/TNRS but must come before the main datasources for the matched places to link up properly

5943 11/01/2012 08:59 AM Aaron Marcuse-Kubitza

import_all: Documented that TNRS import must come after NCBI for cross links to be made

5917 11/01/2012 05:15 AM Aaron Marcuse-Kubitza

Calls to `make inputs/.TNRS/cleanup`: Do `make inputs/.TNRS/tnrs_accepted/reinstall; make inputs/.TNRS/tnrs_other/reinstall` instead to use new split TNRS tables

5836 10/30/2012 03:29 AM Aaron Marcuse-Kubitza

import_all: Pass command-line args (such as make vars) to all commands, not just with_all, so that a custom public schema is properly used by all commands

5503 10/15/2012 08:22 AM Aaron Marcuse-Kubitza

import_all: Also import the NCBI tree of life, before the TNRS names

5318 10/08/2012 09:58 PM Aaron Marcuse-Kubitza

import_all: Added commands to import TNRS names so the user doesn't have to do this manually

5214 10/03/2012 01:11 PM Aaron Marcuse-Kubitza

tnrs_db: Made wait option default to off to facilitate running tnrs_db by itself, rather than as part of an import

5206 10/03/2012 08:57 AM Aaron Marcuse-Kubitza

README.TXT: Data import: import_all: Don't run with & because this prevents the created jobs from being owned by the calling shell. Instead, import the TNRS names as a separate backgrounded step and wait for it to finish before starting import_all. Removed TNRS import steps from import_all since these are now invoked separately.

5172 10/02/2012 10:35 PM Aaron Marcuse-Kubitza

import_all: Use new dedicated cleanup make target to clean up TNRS.tnrs

5111 09/28/2012 11:42 AM Aaron Marcuse-Kubitza

import_all: Clean up any new TNRS.tnrs entries before importing the TNRS data

5081 09/27/2012 11:28 AM Aaron Marcuse-Kubitza

import_all: Start the tnrs daemon using `make inputs/.TNRS/tnrs/tnrs-remake &`

5055 09/27/2012 07:10 AM Aaron Marcuse-Kubitza

import_all: Added import of .TNRS datasource, which happens synchronously before other datasources are imported

5039 09/27/2012 03:37 AM Aaron Marcuse-Kubitza

import_all: Pass any args, such as vars, through to with_all

1953 04/23/2012 07:00 PM Aaron Marcuse-Kubitza

Scripts that are meant to be run in the calling shell: Fixed bug where running the script inside another script would make the script think it was being run as a program, and abort with a usage error

1952 04/23/2012 06:56 PM Aaron Marcuse-Kubitza

Scripts that are meant to be run in the calling shell: Fixed bug where running the script as a program (without initial ".") wouldn't be able to call return in something that was not a function. Converted all code to a <script_name>_main method so that return would work properly again. Converted all variables to local variables.

1948 04/23/2012 05:36 PM Aaron Marcuse-Kubitza

import_all: Use new with_all. Use ${BASH_SOURCE[0]} for $self and $self for $0.

1551 03/22/2012 05:33 PM Aaron Marcuse-Kubitza

import_all: Print Usage message if was run without initial "."
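
A common bash idiom for this kind of check compares `${BASH_SOURCE[0]}` with `$0`. The sketch below assumes that idiom; the log does not show the exact test bin/import_all uses. The temporary file stands in for the script, and `run_as_program`, `run_sourced` and `demo_caller` are hypothetical names:

```shell
# Sketch: tell "sourced" apart from "run as a program" in bash.
demo_script="$(mktemp)"
cat >"$demo_script" <<'EOF'
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
    echo "run as a program: print usage here"
else
    echo "sourced into the calling shell"
fi
EOF

# Executed directly: $0 is the script path, same as BASH_SOURCE[0].
run_as_program() { bash "$demo_script"; }
# Included with ".": $0 belongs to the caller, BASH_SOURCE[0] to the file.
run_sourced()    { bash -c '. "$1"' demo_caller "$demo_script"; }

run_as_program
run_sourced
# Remove "$demo_script" once the demo is no longer needed.
```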

1550 03/22/2012 04:52 PM Aaron Marcuse-Kubitza

Renamed import-all to import_all to match convention of using underscores

1547 03/22/2012 04:33 PM Aaron Marcuse-Kubitza

import-all: Fixed to display the datasource name in the job name instead of 'make ${input}import &'

1546 03/20/2012 11:13 PM Aaron Marcuse-Kubitza

import-all: disown each new import process to ignore SIGHUP

1541 03/20/2012 10:38 PM Aaron Marcuse-Kubitza

Added import-all to import all inputs at once