# Date Author Comment
9725 06/06/2013 12:04 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: added PATH_rm(), which removes components from the PATH
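
For illustration, removing PATH components might look like the following. This is only a sketch under assumptions: the name path_rm_sketch and its argument convention (one directory per argument, exact match) are hypothetical, and the real PATH_rm() in lib/sh/util.sh may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual PATH_rm() from lib/sh/util.sh.
# Removes every PATH entry that exactly matches one of the arguments.
path_rm_sketch()
{
    local -a entries kept=()
    local entry dir keep
    IFS=: read -r -a entries <<<"$PATH"   # split PATH on ":"
    for entry in "${entries[@]}"; do
        keep=1
        for dir in "$@"; do
            if [[ "$entry" == "$dir" ]]; then keep=; break; fi
        done
        if [[ -n "$keep" ]]; then kept+=("$entry"); fi
    done
    local IFS=:
    PATH="${kept[*]}"                     # rejoin the kept entries with ":"
}
```

With this sketch, `path_rm_sketch /usr/local/bin` would drop every `/usr/local/bin` entry and leave the order of the remaining entries unchanged.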

9724 06/06/2013 12:03 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: added $top_dir_abs, $top_dir_orig

9723 06/05/2013 11:44 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/util.sh: command(): use `builtin command` instead of `exec` so that options like -p (reset PATH) work properly. also, the command builtin it overrides is designed to be used with more than just external commands, and command() should not impose additional limitations.

9722 06/05/2013 10:54 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added self_sys alias, which uses only system utilities (`command -p`) instead of the current PATH
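
The effect of `command -p` can be seen with a deliberately unusable PATH. This is a minimal sketch: the variable names are illustrative, and the self_sys alias itself is not reproduced here.

```shell
#!/usr/bin/env bash
# Illustrative only: compares normal lookup with `command -p` when PATH is unusable.
# The PATH change is confined to each $(...) subshell.
plain=$( PATH=/nonexistent; ls / >/dev/null 2>&1 && echo found || echo missing )
sys=$( PATH=/nonexistent; command -p ls / >/dev/null 2>&1 && echo found || echo missing )
echo "plain lookup: $plain; command -p: $sys"
```

`command -p` searches a default PATH that is guaranteed to contain the standard utilities, so it still finds `ls` when the current PATH does not.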

9721 06/05/2013 10:37 PM Aaron Marcuse-Kubitza

lib/sh/make.sh: make(): at verbosity < 3, hide messages about making included Makefiles. this makes the make output much more readable when a Makefile contains an include statement, because there won't be a ton of log messages every time a Makefile is included. this filtering is so useful that it probably makes sense to run make for any of our Makefiles using `lib/runscripts/util.run make ...` instead of plain make. compare e.g. `make inputs/ACAD/Specimen/map.csv` (53 lines of output) and `lib/runscripts/util.run make inputs/ACAD/Specimen/map.csv` (17 lines of output, 1/3 as much).

9720 06/05/2013 10:15 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: log++: before cmd: documented that you need to use "log++" instead of log++ to avoid using the log++ alias, which prepends a log_local call. omitting the quotes is generally not a problem, but when there is another command wrapping the log++, you need the "" to avoid the wrapper applying to log_local's declare call instead.
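
The quoting rule relies on the fact that bash does not alias-expand a quoted command word. A minimal sketch, in which the `step` function and alias are hypothetical stand-ins for log++ and its log_local-prepending alias:

```shell
#!/usr/bin/env bash
shopt -s expand_aliases   # aliases are off by default in non-interactive shells

step() { echo "step"; }                    # stand-in for the underlying function
alias step='echo "alias prologue"; step'   # stand-in for an alias that prepends a call

# eval makes bash parse each command after the alias has been defined.
unquoted=$(eval 'step')     # alias applies: the prologue runs first
quoted=$(eval '"step"')     # quoted word: the alias is skipped, the function runs directly
printf 'unquoted:\n%s\nquoted:\n%s\n' "$unquoted" "$quoted"
```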

9719 06/05/2013 10:09 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: log++: before cmd: documented how to use it as `log+ #` when incrementing multiple log_levels at once (this is a better method than `"log++" "log++" ...`)

9718 06/05/2013 10:04 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: log++: before cmd: documented that you need to use it as `"log++" "log++"` when repeating it multiple times

9717 06/05/2013 10:02 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added filter_fd(), which encapsulates the use of >() process substitution for filtering an fd other than stdout (yes, this is possible without lots of 3>&1 1>&2 2>&3 redirections!). this can be useful e.g. to filter logging output or highlight errors.
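
The underlying technique can be sketched as follows. This is not the actual filter_fd(): the helper name filter_stderr_sketch, its fixed use of fd 2, and the example functions are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run a command with only its stderr passed through a filter.
# Usage: filter_stderr_sketch <filter_cmd> <cmd> [args...]
filter_stderr_sketch()
{
    local filter="$1"; shift
    # >(...) gives stderr its own pipe into the filter; the filter writes its
    # result back to the caller's stderr. stdout is not touched.
    "$@" 2> >("$filter" >&2)
}

drop_debug() { grep -v '^debug:'; }   # example filter: hide debug lines
noisy()                                # example command writing to both streams
{
    echo "result"
    echo "debug: detail" >&2
    echo "error: failed" >&2
}

filter_stderr_sketch drop_debug noisy
```

The filter in `>(...)` runs asynchronously, so its output can appear slightly after the command itself has returned.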

9716 06/05/2013 09:49 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: log+(): documented that with a cmd, assignments are applied just to the cmd, so log_local is not needed

9715 06/05/2013 08:59 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: moved pipe_delay() before fd-related functions so it can be used by them

9714 06/05/2013 08:56 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: removed the no-longer-needed "load new aliases" step before echo_stdin(), since pipe_delay() is now a function

9713 06/05/2013 08:48 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/util.sh: set_fds(): add #<>&- before every #<>&# reopen: need to use loop var $i instead of $1 (which would have been used with a while/shift method of iterating over $@)

9712 06/05/2013 12:57 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/util.sh: set_fds(): added workaround for strange bash bug where reopening an fd sometimes first requires explicitly closing it, by adding an <>&- entry for every redirection

9711 06/05/2013 12:53 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added match_prefix()

9710 06/05/2013 03:28 AM Aaron Marcuse-Kubitza

lib/sh/make.sh: make(): determine --silent status based on the verbosity (<=0) instead of a kw param

9709 06/05/2013 03:11 AM Aaron Marcuse-Kubitza

inputs/GBIF/table.run: table.tsv.md5/make(): only use %-remake if remaking

9708 06/05/2013 03:10 AM Aaron Marcuse-Kubitza

lib/sh/make.sh: make(): removed extra space after --silent

9707 06/04/2013 11:17 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: moved mysql_root() after the mysql->mysql_ANSI alias (and load new aliases) so that it will also use ANSI mode and support "" identifiers

9706 06/04/2013 11:15 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql: always use ANSI mode, to support "" identifiers. note that `` are still supported in this mode, so it also works with SHOW CREATE TABLE output and dumpfiles.

9705 06/04/2013 11:05 PM Aaron Marcuse-Kubitza

bugfix: inputs/GBIF/raw_occurrence_record/run: plant_fraction_for_herbaria_filter/make(): need to make prerequisites first (plant_fraction/make)

9704 06/04/2013 06:35 PM Aaron Marcuse-Kubitza

bugfix: inputs/GBIF/table.run: table.tsv.md5/make(): use %-remake to ensure that the .md5 file is remade, regardless of the .md5 file's mtime relative to table.tsv. you would generally expect table.tsv's new mtime to be newer than the .md5 file's (thus triggering make to run), but if you e.g. ran svn up after making the table.tsv, this might not be the case.

9703 06/04/2013 06:31 PM Aaron Marcuse-Kubitza

/Makefile: moved %/remake, %-remake to lib/common.Makefile because they are generally useful

9702 06/04/2013 06:26 PM Aaron Marcuse-Kubitza

/Makefile: moved %/reinstall to lib/common.Makefile because it is generally useful

9701 06/04/2013 06:07 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql_cmd(): run the command with `time` because in mysql()'s output_data mode, no queries, and therefore no runtimes, are echoed, so the total runtime needs to be echoed separately instead. the total runtime is also useful in general, when many long-running queries are run and you would also like to know the total time (e.g. in make_analytical_db).

9700 06/04/2013 05:54 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: limit(): usage: surrounded query in "" to clarify that it's a string, not a command

9699 06/01/2013 09:32 PM Aaron Marcuse-Kubitza

inputs/GBIF/table.run: table.tsv/make(): use new set_large_table to prevent table.tsv from being deleted on error for full export runs (while still deleting it on error for sample subset runs)

9698 06/01/2013 09:31 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: added set_large_table alias, used to set to_file's $del flag based on $limit

9697 06/01/2013 09:30 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added exit2bool()

9696 06/01/2013 09:18 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: int2bool(): renamed to int2exit() because it actually sets a boolean exit status rather than returning a boolean string
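
The naming distinction can be sketched as below. Both functions are hypothetical reconstructions: the real int2exit() and exit2bool() (r9697) are not shown in this log, and the convention that a boolean string is `1` for true and empty for false is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketches, not the code from lib/sh/util.sh.

# Integer -> exit status: succeeds (0) when the integer is nonzero.
int2exit_sketch() { [[ "${1:-0}" -ne 0 ]]; }

# Exit status -> boolean string: runs a command, prints 1 on success, nothing on failure.
exit2bool_sketch() { if "$@"; then echo 1; fi; }

if int2exit_sketch 5; then echo "5 is true"; fi
is_dir=$(exit2bool_sketch test -d /)
echo "is_dir='$is_dir'"
```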

9695 06/01/2013 09:07 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/util.sh: to_file(): also kw_params the other kw params if_not_exists, del

9694 06/01/2013 07:02 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql_cmd(): use --quick to avoid buffering entire result (which is slow and memory-intensive for large result sets). this option applies to both mysql() and mysqldump().

9693 06/01/2013 06:35 PM Aaron Marcuse-Kubitza

lib/sh/make.sh: self_make(): documented that it should be preceded by set_make_vars

9692 06/01/2013 06:34 PM Aaron Marcuse-Kubitza

bugfix: inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.run: ^.preamble.sql/make(): need to run set_make_vars even though the make vars are not used, because set_make_vars sets $_remake, which is needed by self_make

9691 06/01/2013 06:33 PM Aaron Marcuse-Kubitza

lib/sh/make.sh: set_make_vars: echo_vars $_remake to help debugging

9690 06/01/2013 06:27 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/util.sh: to_file(): need to run invoked cmd using redir so that >$stdout redir is applied properly when cmd is a shell function instead of an external command (external commands were already redirected properly because command() calls redir). this fixes a bug in inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.run ^.preamble.sql/make(), where the generated file would be output to stdout instead of to the file because to_file()'s command was a shell function, and therefore the redirection was not applied. this fix requires redir() to be a separate function from command(), because command() does many things that are only applicable to external commands.

9689 06/01/2013 06:20 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: redir(): override save_e and add `unset redirs` so error handlers are not redirected

9688 06/01/2013 06:15 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added alias_append(), similar to func_override() for aliases

9687 06/01/2013 06:05 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: redir(): don't do redir actions if redir will be run later (i.e. if command to run starts with `redir` or `command`)

9686 06/01/2013 06:04 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: redir(), command(): log their function calls at the usual log_level (2) instead of one higher, so that they are printed in the call tree to help debugging

9685 06/01/2013 05:39 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added redir() and use it in command() to perform and echo the redirections

9684 06/01/2013 09:35 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: command(): removed comment that "the following redirections must happen in exactly this order", because there is now only one redirection

9683 06/01/2013 06:01 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: command(): removed 2>&$err_fd, which is no longer needed because $err_fd is now 2 (in general, there is probably not a need for a special $err_fd var, because 2 is already stderr)

9682 06/01/2013 05:52 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: command(): don't set err_fd to global stderr, because this prevents errors from being captured and parsed by callers. it is also not necessary to use a port separate from stderr, because stderr already contains only errors (since logging messages go to their own port). global stderr would still be useful e.g. for displaying input prompts to the user when reading from global stdin.

9681 06/01/2013 05:17 AM Aaron Marcuse-Kubitza

bugfix: *run: overriding targets: use new self_make to properly propagate the $remake flag to the overridden target, so that the target itself is not skipped

9680 06/01/2013 05:15 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: to_file(): added del= flag to prevent the file from being auto-removed on error (e.g. to preserve a partial result, which normally would be removed)
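
The auto-remove behavior and the del= override might be sketched like this. The function name to_file_sketch, the positional file argument, and the rule that an unset $del means "delete on error" are assumptions; the real to_file() is not shown in this log.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual to_file().
# Usage: [del=] to_file_sketch <file> <cmd> [args...]
to_file_sketch()
{
    local file="$1"; shift
    local status=0
    "$@" >"$file" || status=$?   # works for shell functions and external commands
    # Assumed convention: unset $del means delete; del= (empty) keeps the partial file.
    if [[ "$status" -ne 0 && -n "${del-1}" ]]; then rm -f -- "$file"; fi
    return "$status"
}
```

Under these assumptions, `del= to_file_sketch table.tsv export_cmd` (with export_cmd standing in for any command) would leave a partial table.tsv in place if export_cmd failed, while the same call without `del=` would remove it.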

9679 06/01/2013 05:12 AM Aaron Marcuse-Kubitza

lib/sh/make.sh: added self_make(), which propagates the $remake flag (normally it is not propagated, because prerequisites should not also be remade)

9678 06/01/2013 03:58 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: usage comments: when there is a descriptive comment on the same line as the usage, prepend it with # (as if it were an end-of-line comment) instead of enclosing it in (), to make it visually obvious that it's a comment and not part of the usage commands

9677 06/01/2013 03:54 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: added all_funcs(), which lists all declared functions

9676 06/01/2013 03:31 AM Aaron Marcuse-Kubitza

bugfix: inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added check_target_exists so table.tsv would not be overwritten if it already existed

9675 06/01/2013 03:30 AM Aaron Marcuse-Kubitza

bugfix: inputs/GBIF/table.run: table.tsv/make(): force table.tsv.md5 to be remade (using remake=1) because the table.tsv contents will have changed

9674 06/01/2013 03:29 AM Aaron Marcuse-Kubitza

bugfix: lib/sh/make.sh: set_make_vars: can't use end-of-line comment in alias because it will comment out whatever is after the alias where it's used. can't just put a newline or ; after the end-of-line comment because the alias's lines will be combined onto one line using ; , so end-of-line comments would not be supported.
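
The problem can be reproduced with two small aliases. The alias names below are hypothetical and are not the real set_make_vars definition.

```shell
#!/usr/bin/env bash
shopt -s expand_aliases   # aliases are off by default in non-interactive shells

alias set_x_bad='x=1 # end-of-line comment'   # the comment becomes part of the alias text
alias set_x_good='x=1'

# eval makes bash parse each line after the aliases have been defined.
out_bad=$(eval 'set_x_bad; echo "x=$x"')     # the echo is swallowed by the comment
out_good=$(eval 'set_x_good; echo "x=$x"')
echo "bad: '$out_bad'"
echo "good: '$out_good'"
```

Because the alias text is substituted into the line where the alias is used, the `#` comments out everything that follows the alias on that line.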

9673 06/01/2013 03:23 AM Aaron Marcuse-Kubitza

lib/sh/util.sh: added pf(), to print a function declaration for debugging
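
Both this helper and all_funcs() (r9677) can plausibly be built on bash's `declare` builtin. The following is a sketch under that assumption, with hypothetical function names; the actual implementations are not shown in this log.

```shell
#!/usr/bin/env bash
# Hypothetical sketches built on the `declare` builtin.

# Lists the names of all declared functions, one per line.
all_funcs_sketch()
{
    local keyword flags name
    # `declare -F` prints lines of the form "declare -f name"
    declare -F | while read -r keyword flags name; do echo "$name"; done
}

# Prints the definition of the named function.
pf_sketch() { declare -f "$1"; }

hello_sketch() { echo hi; }   # example function to inspect
pf_sketch hello_sketch
```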

9672 06/01/2013 02:08 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): documented runtime (1 hr)

9671 06/01/2013 01:19 AM Aaron Marcuse-Kubitza

*.url: mailto URLs: use the standard e-mail dotpath syntax e-mail@host?name.date.subject.(attachment) (vegpath.org/wiki/Global_IDs#Resource). populated missing fields (e.g. name, subject) where needed.

9670 06/01/2013 12:12 AM Aaron Marcuse-Kubitza

*.url: mailto URLs: ensured that they are proper URLs with the mailto: protocol

9669 05/30/2013 08:32 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: to_file(): exc handler that rm's file: unset redirs so it isn't used in the rm cmd

9668 05/30/2013 08:22 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): removed no longer needed explicit clear of $remake, which is now done by make.sh instead

9667 05/30/2013 08:18 PM Aaron Marcuse-Kubitza

lib/sh/make.sh: set_make_vars: don't propagate remake to prerequisites, so that remake=1 only applies to the outermost target rather than forcing every prerequisite to be remade, too
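
The intended scoping can be sketched as follows. Everything here is a hypothetical illustration rather than the code in lib/sh/make.sh: only the names $remake and $_remake come from this log, and the target functions and file layout are made up.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of remake=1 applying only to the outermost target.
dir=$(mktemp -d)
built=()   # records which targets were (re)built

make_prereq_sketch()
{
    if [[ -n "${remake-}" || ! -e "$dir/prereq" ]]; then
        built+=(prereq); : >"$dir/prereq"
    fi
}

make_target_sketch()
{
    local _remake="${remake-}"   # remember the flag for this target
    local remake=                # clear it so prerequisites do not see it
    make_prereq_sketch
    if [[ -n "$_remake" || ! -e "$dir/target" ]]; then
        built+=(target); : >"$dir/target"
    fi
}

make_target_sketch                 # first run: builds prereq and target
first_run="${built[*]}"
built=()
remake=1 make_target_sketch        # forced remake: rebuilds only target
second_run="${built[*]}"
rm -rf "$dir"
echo "first: $first_run; second: $second_run"
```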

9666 05/30/2013 07:58 PM Aaron Marcuse-Kubitza

lib/sh/make.sh: moved remaking section before set_make_vars so that it can be used in set_make_vars

9665 05/30/2013 07:53 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: added herbaria_filter/seal()

9664 05/30/2013 07:51 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): changed "from IH" to "contains all of IH" because not all rows are now from IH

9663 05/30/2013 07:49 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): renamed acronym->institution_code to match the column name in raw_occurrence_record rather than in IH

9662 05/30/2013 07:46 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: removed no longer used herbaria_filter.plant_fraction.csv_/make(). use plant_fraction_for_herbaria_filter view instead.

9661 05/30/2013 07:45 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): use the plant_fraction_for_herbaria_filter view directly instead of first exporting it to a CSV

9660 05/30/2013 07:19 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql_import(): in append mode, use LOAD DATA IGNORE to allow inserting duplicate rows

9659 05/30/2013 07:09 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): if remaking, turn off remake mode after doing this target's rm operations, so that prerequisite targets are not also remade

9658 05/30/2013 06:56 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: to_file(): removed no longer needed separate logging of >$stdout, which is now done by command()

9657 05/30/2013 06:50 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: echo_redirs_cmd(): use $@ in a subshell instead of manipulating the @redirs array directly, because operations on $@ (e.g. $#, $1, shift) are much simpler than the corresponding array operations ( ${#redirs[@]}, ${redirs[0]}, redirs=("${redirs[@]:1}") )
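
The difference in verbosity between the two styles can be sketched as follows. The function names and the redirection strings are made-up examples, not the contents of echo_redirs_cmd().

```shell
#!/usr/bin/env bash
# Hypothetical illustration; the redirection strings are made-up examples.
redirs=('>out.log' '2>&1' '<in.txt')

# Array version: pop the first element on each iteration.
list_with_array()
{
    local -a rest=("${redirs[@]}")
    while (( ${#rest[@]} > 0 )); do
        echo "redir: ${rest[0]}"
        rest=("${rest[@]:1}")
    done
}

# Positional-parameter version: copy the array into the positional parameters
# inside a subshell, so shift cannot disturb the caller's own arguments.
list_with_positional()
(
    set -- "${redirs[@]}"
    while (( $# > 0 )); do
        echo "redir: $1"
        shift
    done
)

list_with_positional
```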

9656 05/30/2013 06:42 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: echo_redirs_cmd(): log each file redir with a separate log() statement, so each line is indented

9655 05/30/2013 06:38 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added echo_redirs_cmd and use it in command() to print cmd

9654 05/30/2013 06:32 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: command(): print <>file redirects before command, because they introduce it

9653 05/30/2013 06:31 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: added starts_with()

9652 05/30/2013 05:49 PM Aaron Marcuse-Kubitza

lib/sh/util.sh: to_file(): use @redirs to echo and set >$stdout instead of setting it manually, which is possible now that the command() @redirs bug has been fixed

9651 05/30/2013 05:43 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/util.sh: convention of fds to use for command-specific alternate stdin/stdout/stderr: changed to 40/41/42 because 10/11/12 are used by eval (which is used by set_fds()). use of fd 10/11/12 will cause hard-to-find silent bugs because exec will not print an error when these are used. documented why not to use other series of fds for this purpose:...
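
A minimal sketch of using one of the higher-numbered fds as a command-specific alternate stdout. The temporary file and the message are illustrative only; the assignment of 40/41/42 to stdin/stdout/stderr follows the order given above.

```shell
#!/usr/bin/env bash
# Illustrative only: fd 41 as an alternate stdout, per the 40/41/42 convention.
alt_out=$(mktemp)
exec 41>"$alt_out"                 # open fd 41 for writing
echo "to alternate stdout" >&41    # a command writes to it explicitly
exec 41>&-                         # close it when done
cat "$alt_out"
```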

9650 05/30/2013 02:43 PM Aaron Marcuse-Kubitza

lib/sh/local.sh: psql(): use new psql() from db.sh instead of psql_script_vegbien/psql_verbose_vegbien. this requires setting local_pg_database=vegbien to replace vegbien_dest used by psql_*_vegbien.

9649 05/30/2013 02:38 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: psql(): set $PG* connection env vars from our connection vars ($server, $user, etc.). use use_pg to import $database so it can be different from $database for MySQL

9648 05/30/2013 02:31 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: added use_pg alias

9647 05/30/2013 02:31 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/db.sh: psql(): added missing `--set ON_ERROR_STOP=1 --quiet` opts from psql_script_vegbien

9646 05/30/2013 02:12 PM Aaron Marcuse-Kubitza

lib/sh/db.sh: added psql(), which replaces psql_script_vegbien and psql_verbose_vegbien for general connections. it also supports separate command and stdin files, to allow using `\copy from pstdin`, with pstdin pointing to a separate, EOF-terminated CSV file instead of inlined with the command and terminated with the \. escape (which may be contained within the CSV file itself).

9645 05/30/2013 01:05 PM Aaron Marcuse-Kubitza

bugfix: lib/sh/local.sh: psql(): $file can't both be passed as a --file param and be prefixed with the necessary \set schema, etc. commands, so instead include $file when cat-ing stdin

9644 05/30/2013 08:28 AM Aaron Marcuse-Kubitza

added inputs/GBIF/raw_occurrence_record/postprocess.sql, which removes institutions that we have direct data for

9643 05/30/2013 08:18 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): skip table if already exists (unless remaking), like plant_fraction/make()

9642 05/30/2013 08:16 AM Aaron Marcuse-Kubitza

bugfix: lib/sh/db.sh: mysql_import(): need to use direct connection to DB instead of via ssh, because ssh does not tunnel nonstandard fds

9641 05/30/2013 08:15 AM Aaron Marcuse-Kubitza

lib/sh/db.sh: added ssh2local alias

9640 05/30/2013 07:36 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: herbaria_filter.plant_fraction.csv_/make(): use new plant_fraction_for_herbaria_filter view

9639 05/30/2013 07:13 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: added plant_fraction_for_herbaria_filter/make(). note that for simplicity, plant_fraction_for_herbaria_filter is a view instead of a table.

9638 05/30/2013 06:50 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: *.table/*(): renamed to */*() because a target named after a table refers to the table unless it has an explicit file extension

9637 05/30/2013 06:49 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/*(): renamed to plant_fraction/*() because a target named after a table refers to the table unless it has an explicit file extension

9636 05/30/2013 06:41 AM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql_seal_table(): also revoke GRANT OPTION, which apparently needs to be done in addition (and in a separate command, unlike when granting GRANT OPTION)

9635 05/30/2013 06:40 AM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql_seal_table(): REVOKE: ignore errors if REVOKE was already run

9634 05/30/2013 06:39 AM Aaron Marcuse-Kubitza

lib/sh/db.sh: mysql_seal_table(): REVOKE: removed unneeded explicit database since this is automatically set to the current database

9633 05/30/2013 06:19 AM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: added plant_fraction.table/seal(), which uses new mysql_seal_table()

9632 05/30/2013 06:19 AM Aaron Marcuse-Kubitza

lib/sh/db.sh: added mysql_seal_table(), which prevents further modifications to a table by a user. this uses new mysql_root().

9631 05/30/2013 06:18 AM Aaron Marcuse-Kubitza

lib/sh/db.sh: added mysql_root(). this version uses just use_root (compare to the mysql_root() override in local.sh).

9630 05/30/2013 06:16 AM Aaron Marcuse-Kubitza

lib/sh/local.sh: database connection vars: connect to vegbiendev via ssh and run commands locally, to allow running commands as root (which can only connect to the database locally). this effectively requires an ssh account on vegbiendev, but any ssh account (including an anonymous one, if we set one up) will do. this causes schemas/VegCore/VegCore.my.sql, VegCore.pg.sql to change, because they are now created by mysqldump running on vegbiendev (Linux) instead of on a Mac.

9629 05/29/2013 10:35 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: plant_fraction: added index on plant_fraction for fast extraction of herbaria by fraction threshold

9628 05/29/2013 10:10 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: tables: set ENGINE to MyISAM and DEFAULT CHARSET to utf8 to match the other GBIF tables. (note that MyISAM is not the default, but is needed to avoid row sort order problems and other issues with InnoDB.)

9627 05/29/2013 08:09 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): in remaking mode, drop the table first

9626 05/29/2013 08:04 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): only create and populate the table if it doesn't already exist, to avoid clobbering existing data. the noclobber functionality uses new skip_table(), which is the table analog of require_not_exists().