inputs/GBIF/table.run: table.tsv/make(): use new set_large_table to prevent table.tsv from being deleted on error for full export runs (while still deleting it on error for sample subset runs)
lib/sh/db.sh: added set_large_table alias, used to set to_file's $del flag based on $limit
lib/sh/util.sh: added exit2bool()
lib/sh/util.sh: int2bool(): renamed to int2exit() because it actually sets a boolean exit status rather than returning a boolean string
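The difference between the two helpers named in the entries above can be sketched as follows. This is only an illustration: it assumes that int2exit() maps a nonzero integer to a success exit status and that exit2bool() prints 1 or an empty string depending on a command's exit status. The function bodies are hypothetical and may not match the actual util.sh code.

```shell
# Hypothetical reconstruction; the real lib/sh/util.sh bodies may differ.

# int2exit: turn an integer into an exit status (nonzero int -> success).
int2exit() { [ "$1" -ne 0 ]; }

# exit2bool: run a command and print a boolean string (1 or empty)
# that reflects the command's exit status.
exit2bool() { if "$@"; then echo 1; else echo; fi; }
```

With these definitions, `int2exit "$n" && ...` branches on an integer, while `flag="$(exit2bool some_cmd)"` stores the result of a test in a variable.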
bugfix: lib/sh/util.sh: to_file(): also kw_params the other kw params if_not_exists, del
lib/sh/db.sh: mysql_cmd(): use --quick to avoid buffering entire result (which is slow and memory-intensive for large result sets). this option applies to both mysql() and mysqldump().
lib/sh/make.sh: self_make(): documented that it should be preceded by set_make_vars
bugfix: inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.run: ^.preamble.sql/make(): need to run set_make_vars even though the make vars are not used, because set_make_vars sets $_remake, which is needed by self_make
lib/sh/make.sh: set_make_vars: echo_vars $_remake to help debugging
bugfix: lib/sh/util.sh: to_file(): need to run invoked cmd using redir so that >$stdout redir is applied properly when cmd is a shell function instead of an external command (external commands were already redirected properly because command() calls redir). this fixes a bug in inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.run ^.preamble.sql/make(), where the generated file would be output to stdout instead of to the file because to_file()'s command was a shell function, and therefore the redirection was not applied. this fix requires redir() to be a separate function from command(), because command() does many things that are only applicable to external commands.
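A minimal sketch of why a standalone redir() helps, assuming bash and a `redirs` array that holds redirection strings without whitespace. The body is a simplified stand-in for illustration, not the actual util.sh redir(): because the redirections are applied by the wrapper itself, they take effect whether the wrapped command is an external program or a shell function.

```shell
# Hypothetical sketch (bash); the real redir() does more than this.
redirs=()  # e.g. redirs=(">out.txt")

# usage: redirs=(...) redir cmd args...
redir() {
    # eval sees:  "$@" >out.txt
    # so the redirections wrap whatever "$@" names, function or not.
    eval '"$@"' "${redirs[*]}"
}
```

For example, with `gen() { echo text; }` and `redirs=(">file")`, calling `redir gen` writes to the file rather than to stdout.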
lib/sh/util.sh: redir(): override save_e and add `unset redirs` so error handlers are not redirected
lib/sh/util.sh: added alias_append(), similar to func_override() for aliases
lib/sh/util.sh: redir(): don't do redir actions if redir will be run later (i.e. if command to run starts with `redir` or `command`)
lib/sh/util.sh: redir(), command(): log their function calls at the usual log_level (2) instead of one higher, so that they are printed in the call tree to help debugging
lib/sh/util.sh: added redir() and use it in command() to perform and echo the redirections
lib/sh/util.sh: command(): removed comment that "the following redirections must happen in exactly this order", because there is now only one redirection
lib/sh/util.sh: command(): removed 2>&$err_fd, which is no longer needed because $err_fd is now 2 (in general, there is probably not a need for a special $err_fd var, because 2 is already stderr)
lib/sh/util.sh: command(): don't set err_fd to global stderr, because this prevents errors from being captured and parsed by callers. it is also not necessary to use a port separate from stderr, because stderr already contains only errors (since logging messages go to their own port). global stderr would still be useful e.g. for displaying input prompts to the user when reading from global stdin.
bugfix: *run: overriding targets: use new self_make to properly propagate the $remake flag to the overridden target, so that the target itself is not skipped
lib/sh/util.sh: to_file(): added del= flag to prevent the file from being auto-removed on error (e.g. to preserve a partial result, which normally would be removed)
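The interaction between the del= flag and set_large_table (see the entries above) might look roughly like this. It is a sketch under stated assumptions: the target file is passed in $stdout, an unset $del means "delete on error", an empty $del means "keep the partial file", and set_large_table is written here as a function although the log describes it as an alias. None of these bodies is taken from the actual code.

```shell
# Hypothetical sketch; the real to_file() and set_large_table differ.

# usage: stdout=file [del=] to_file cmd args...
to_file() {
    "$@" >"$stdout" && return 0
    local e=$?                      # exit status of the failed command
    # del defaults to on; an empty del= preserves the partial result
    if [ -n "${del-1}" ]; then rm -f "$stdout"; fi
    return "$e"
}

# Keep partial output for full exports (no $limit), but delete it
# for sample subset runs (where $limit is set).
set_large_table() { if [ -n "${limit-}" ]; then del=1; else del=; fi; }
```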
lib/sh/make.sh: added self_make(), which propagates the $remake flag (normally it is not propagated, because prerequisites should not also be remade)
lib/sh/util.sh: usage comments: when there is a descriptive comment on the same line as the usage, prepend it with # (as if it were an end-of-line comment) instead of enclosing it in (), to make it visually obvious that it's a comment and not part of the usage commands
lib/sh/util.sh: added all_funcs(), which lists all declared functions
bugfix: inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added check_target_exists so table.tsv would not be overwritten if it already existed
bugfix: inputs/GBIF/table.run: table.tsv/make(): force table.tsv.md5 to be remade (using remake=1) because the table.tsv contents will have changed
bugfix: lib/sh/make.sh: set_make_vars: can't use end-of-line comment in alias because it will comment out whatever is after the alias where it's used. can't just put a newline or ; after the end-of-line comment because the alias's lines will be combined onto one line using ; , so end-of-line comments would not be supported.
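The pitfall described in the entry above can be reproduced with a small bash example. The alias names and bodies are invented for illustration and are unrelated to the real set_make_vars.

```shell
# Hypothetical demonstration (bash); alias names are made up.
shopt -s expand_aliases

# An alias body that ends in an end-of-line comment...
alias set_vars_bad='a=1; b=2 # trailing note'
# ...comments out whatever follows the alias on the line where it is used.
out_bad="$(eval 'set_vars_bad; echo "a=$a b=$b"')"

# Without the comment, the rest of the line runs as expected.
alias set_vars_ok='a=1; b=2'
out_ok="$(eval 'set_vars_ok; echo "a=$a b=$b"')"
```

Here `out_bad` ends up empty because the `echo` became part of the comment, while `out_ok` contains the echoed text.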
lib/sh/util.sh: added pf(), to print a function declaration for debugging
inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): documented runtime (1 hr)
*.url: mailto URLs: use the standard e-mail dotpath syntax e-mail@host?name.date.subject.(attachment) (vegpath.org/wiki/Global_IDs#Resource). populated missing fields (e.g. name, subject) where needed.
*.url: mailto URLs: ensured that they are proper URLs with the mailto: protocol
lib/sh/util.sh: to_file(): exc handler that rm's file: unset redirs so it isn't used in the rm cmd
inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): removed no longer needed explicit clear of $remake, which is now done by make.sh instead
lib/sh/make.sh: set_make_vars: don't propagate remake to prerequisites, so that remake=1 only applies to the outermost target rather than forcing every prerequisite to be remade, too
lib/sh/make.sh: moved remaking section before set_make_vars so that it can be used in set_make_vars
inputs/GBIF/raw_occurrence_record/run: added herbaria_filter/seal()
inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): changed "from IH" to "contains all of IH" because not all rows are now from IH
inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): renamed acronym->institution_code to match the column name in raw_occurrence_record rather than in IH
inputs/GBIF/raw_occurrence_record/run: removed no longer used herbaria_filter.plant_fraction.csv_/make(). use plant_fraction_for_herbaria_filter view instead.
inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): use the plant_fraction_for_herbaria_filter view directly instead of first exporting it to a CSV
lib/sh/db.sh: mysql_import(): in append mode, use LOAD DATA IGNORE to allow inserting duplicate rows
inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): if remaking, turn off remake mode after doing this target's rm operations, so that prerequisite targets are not also remade
lib/sh/util.sh: to_file(): removed no longer needed separate logging of >$stdout, which is now done by command()
lib/sh/util.sh: echo_redirs_cmd(): use $@ in a subshell instead of manipulating the @redirs array directly, because operations on $@ (e.g. $#, $1, shift) are much simpler than the corresponding array operations ( ${#redirs[@]}, ${redirs[0]}, redirs=("${redirs[@]:1}") )
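To make the comparison in the echo_redirs_cmd() entry concrete, here is a small bash sketch of the two styles side by side. The array contents and variable names are invented for the example and are not taken from util.sh.

```shell
# Hypothetical example data (bash); not taken from util.sh.
redirs=("<in.txt" ">out.txt" "2>err.log")

# Array form: count, first element, drop the first element.
n_array=${#redirs[@]}
first_array=${redirs[0]}
rest_array=("${redirs[@]:1}")

# Positional-parameter form, confined to a subshell so that the
# caller's own $@ is left untouched.
summary="$(
    set -- "${redirs[@]}"
    echo "$# $1"    # count and first element
    shift           # drop the first element
    echo "$# $1"
)"
```

The subshell costs a fork, but inside it `$#`, `$1`, and `shift` replace the longer array expressions.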
lib/sh/util.sh: echo_redirs_cmd(): log each file redir with a separate log() statement, so each line is indented
lib/sh/util.sh: added echo_redirs_cmd and use it in command() to print cmd
lib/sh/util.sh: command(): print <>file redirects before command, because they introduce it
lib/sh/util.sh: added starts_with()
lib/sh/util.sh: to_file(): use @redirs to echo and set >$stdout instead of setting it manually, which is possible now that the command() @redirs bug has been fixed
bugfix: lib/sh/util.sh: convention of fds to use for command-specific alternate stdin/stdout/stderr: changed to 40/41/42 because 10/11/12 are used by eval (which is used by set_fds()). use of fd 10/11/12 will cause hard-to-find silent bugs because exec will not print an error when these are used. documented why not to use other series of fds for this purpose:...
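As an illustration of the 40/41/42 convention described in the entry above, the following bash sketch opens fd 40 as a command-specific alternate stdin, reads from it, and closes it again. The temporary file and variable names are made up for the example; only the fd number comes from the log.

```shell
# Hypothetical sketch (bash); only the fd number comes from the log.
input_file="$(mktemp)"
printf 'line from alternate stdin\n' >"$input_file"

exec 40<"$input_file"       # open fd 40 as the alternate stdin
IFS= read -r line <&40      # read from fd 40, leaving fd 0 alone
exec 40<&-                  # close the fd when done
rm -f "$input_file"
```

fd 41 and 42 would be opened the same way for alternate stdout and stderr, using `>` instead of `<`.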
lib/sh/local.sh: psql(): use new psql() from db.sh instead of psql_script_vegbien/psql_verbose_vegbien. this requires setting local_pg_database=vegbien to replace vegbien_dest used by psql_*_vegbien.
lib/sh/db.sh: psql(): set $PG* connection env vars from our connection vars ($server, $user, etc.). use use_pg to import $database so it can be different from $database for MySQL
lib/sh/db.sh: added use_pg alias
bugfix: lib/sh/db.sh: psql(): added missing `--set ON_ERROR_STOP=1 --quiet` opts from psql_script_vegbien
lib/sh/db.sh: added psql(), which replaces psql_script_vegbien and psql_verbose_vegbien for general connections. it also supports separate command and stdin files, to allow using `\copy from pstdin`, with pstdin pointing to a separate, EOF-terminated CSV file instead of inlined with the command and terminated with the \. escape (which may be contained within the CSV file itself).
bugfix: lib/sh/local.sh: psql(): $file can't both be passed as a --file param and be prefixed with the necessary \set schema, etc. commands, so instead include $file when cat-ing stdin
added inputs/GBIF/raw_occurrence_record/postprocess.sql, which removes institutions that we have direct data for
inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): skip table if already exists (unless remaking), like plant_fraction/make()
bugfix: lib/sh/db.sh: mysql_import(): need to use direct connection to DB instead of via ssh, because ssh does not tunnel nonstandard fds
lib/sh/db.sh: added ssh2local alias
inputs/GBIF/raw_occurrence_record/run: herbaria_filter.plant_fraction.csv_/make(): use new plant_fraction_for_herbaria_filter view
inputs/GBIF/raw_occurrence_record/run: added plant_fraction_for_herbaria_filter/make(). note that for simplicity, plant_fraction_for_herbaria_filter is a view instead of a table.
inputs/GBIF/raw_occurrence_record/run: *.table/*(): renamed to */*() because a target named after a table refers to the table unless it has an explicit file extension
inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/*(): renamed to plant_fraction/*() because a target named after a table refers to the table unless it has an explicit file extension
lib/sh/db.sh: mysql_seal_table(): also revoke GRANT OPTION, which apparently needs to be done in addition (and in a separate command, unlike when granting GRANT OPTION)
lib/sh/db.sh: mysql_seal_table(): REVOKE: ignore errors if REVOKE was already run
lib/sh/db.sh: mysql_seal_table(): REVOKE: removed unneeded explicit database since this is automatically set to the current database
inputs/GBIF/raw_occurrence_record/run: added plant_fraction.table/seal(), which uses new mysql_seal_table()
lib/sh/db.sh: added mysql_seal_table(), which prevents further modifications to a table by a user. this uses new mysql_root().
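The statements that a table-sealing helper could issue can be sketched as below. This is an assumption-laden illustration: seal_table_sql is a hypothetical name, it only prints SQL rather than running it through mysql_root(), and the log does not say exactly which privileges the real mysql_seal_table() revokes. The one detail taken from the log is that GRANT OPTION has to be revoked in its own statement.

```shell
# Hypothetical helper: prints the statements instead of running them.
# usage: seal_table_sql table user
seal_table_sql() {
    local table="$1" user="$2"
    printf 'REVOKE ALL PRIVILEGES ON %s FROM %s;\n' "$table" "$user"
    # GRANT OPTION is not covered by the statement above,
    # so it is revoked in a separate command
    printf 'REVOKE GRANT OPTION ON %s FROM %s;\n' "$table" "$user"
}
```

For example, `seal_table_sql plant_fraction "'bien'@'%'"` prints two REVOKE statements for that (made-up) account, which could then be piped to a root connection.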
lib/sh/db.sh: added mysql_root(). this version uses just use_root (compare to the mysql_root() override in local.sh).
lib/sh/local.sh: database connection vars: connect to vegbiendev via ssh and run commands locally, to allow running commands as root (which can only connect to the database locally). this effectively requires an ssh account on vegbiendev, but any ssh account (including an anonymous one, if we set one up) will do. this causes schemas/VegCore/VegCore.my.sql, VegCore.pg.sql to change, because they are now created by mysqldump running on vegbiendev (Linux) instead of on a Mac.
inputs/GBIF/raw_occurrence_record/run: plant_fraction: added index on plant_fraction for fast extraction of herbaria by fraction threshold
inputs/GBIF/raw_occurrence_record/run: tables: set ENGINE to MyISAM and DEFAULT CHARSET to utf8 to match the other GBIF tables. (note that MyISAM is not the default, but is needed to avoid row sort order problems and other issues with InnoDB.)
inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): in remaking mode, drop the table first
inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): only create and populate the table if it doesn't already exist, to avoid clobbering existing data. the noclobber functionality uses new skip_table(), which is the table analog of require_not_exists().
lib/runscripts/table.run, table.run: use new db_make.sh
added lib/sh/db_make.sh that includes both db.sh and make.sh, and will eventually contain DB-related make commands
lib/sh/db.sh: added skip_table(), which prints an already_exists_msg for tables
lib/sh/util.sh: already_exists_msg: undid r9621 because the `|| return 0` should actually always be explicitly specified by the caller, to make it clear that the function will be aborted
lib/sh/util.sh: already_exists_msg(): added alias for use as an error handler. note that ..._not_exists() functions should continue to use the "already_exists_msg" function instead to preserve the exit status.
lib/sh/util.sh: added already_exists_msg() and use it instead of manually generating the die() call
schemas/my.cnf: added innodb_file_per_table so each InnoDB table will get its own file. this should also allow databases with InnoDB tables to be manually renamed.
added schemas/my.cnf from /etc/mysql/my.cnf
schemas/VegCore/VegCore.my.sql, VegCore.pg.sql: synced to VegCore MySQL DB. for some reason, the fkeys are now output in the opposite order from what they were in before.
inputs/.TNRS/schema.sql: MatchedTaxon: filter out rows where Max_score was not high enough to use the TNRS result as a match. removed now-duplicated filter for this in AcceptedTaxon.
inputs/.TNRS/schema.sql: ScrubbedTaxon: removed extra ; at end of WHERE clause
web/links/index.htm: updated to Firefox bookmarks. some broken favicons have also been fixed, by reopening bookmark in Firefox. (this will only update a favicon if there is a newer version. to delete a favicon completely, use Firefox's SQLite Manager plugin.)
web/index.php: use XHTML DOCTYPE to match what's used by mod_autoindex. this requires some adjustments in spacing for XHTML's slightly different formatting
bugfix: web/.htaccess: need to do DirectoryIndex redirects before checking for existing file/dir, because a DirectoryIndexed dir is existing but still needs to be redirected to the index.* file
web/.htaccess: mod_autoindex: use the main.css stylesheet to match the look-and-feel of index.php
web/.htaccess: mod_autoindex: Note that some listed files are not web-accessible: use ' instead of " to avoid \-escaping embedded "
web/.htaccess: mod_autoindex: sort by description when provided, to allow setting a custom (non-alphabetical) sort order using AddDescription
web/.htaccess: mod_autoindex: added note that some listed files are not web-accessible. they will produce a "Forbidden" error when clicked.
bugfix: web/index.php: added space between the full directory index and the preceding content
web/index.php: moved the full directory index within the rest of the document body
web/index.php: include full directory index, since the URL patterns list is just a subset of the content available through vegpath.org
web/.htaccess: added mod_autoindex IndexOptions, in particular FoldersFirst
bugfix: web/.htaccess: changed "mod_dir listing"->"mod_autoindex listing" because mod_dir does not actually handle the autogenerated listings
bugfix: web/.htaccess: DirectoryIndex: use disabled instead of on because on is actually treated as a filename, and does not invoke mod_autoindex. the DirectoryIndex directive and the mod_dir module actually apply only to manual index files, not to autogenerated dir listings (which are handled by mod_autoindex).
web/index.php: removed no longer needed custom alias j.mp/vegpath# for when page reached through vegbiendev.nceas.ucsb.edu, because vegpath.org is a much more reliable domain than the previous path.vg, and a separate way to reach VegPath when path.vg is down is no longer needed
web/.htaccess: <dir>/all forces mod_dir listing: use simpler $mod_dir_listing env var instead of query string modification to indicate that an explicit mod_dir listing should be displayed. this causes /all to replace ?index=1 as the way to force a mod_dir listing. note that the %{ENV:...} test needs to use $REDIRECT_mod_dir_listing instead of $mod_dir_listing, because a redirect will occur between the /all rule and the index.* rule, causing all env vars to be prepended with REDIRECT_ .