planning/goals/BIEN3_derived_data_products.docx: updated to most recent version from Brad's e-mail on 2013-4-16
/README.TXT: Full database import: before running screen: added `unset TMOUT` because TMOUT (autologout) causes screen to exit even with background processes active
/README.TXT: Maintenance: added things to put in your .profile on a live machine (e.g. vegbiendev). in particular, you MUST NOT have a TMOUT (autologout) set, because this causes screen to exit even if background processes (e.g. from column-based import) are running
bugfix: lib/runscripts/table.run: load_data(): need to ensure the verbosity is at least 3 because the install logs require verbose output. (3 is the default for the installer, but is overridden by the runscripts, which instead set the default to 1.)
lib/sh/util.sh: logging: added verbosity_min()
lib/sh/util.sh: logging: added verbosity_int() and use it instead of `round_down "$verbosity"`
bugfix: lib/sh/util.sh: log+ alias: don't expand next word because it's not a cmd
lib/runscripts/table.run: $schema, $table: log the `cd` used to calculate the value at log_level 3 instead of 1 (note that the cd() function call for this will be logged at log_level 5)
lib/sh/util.sh: cd(): trace-log the cd() function call itself (at log_level 3) in addition to the cd builtin call
lib/runscripts/table.run: import(): added step to load the data into the staging table before postprocessing it
inputs/GBIF/raw_occurrence_record/run: moved table.tsv.md5/make() and invocation of it to inputs/GBIF/table.run because it's general to all tables (which would all use table.tsv for this datasource). use $target_filename in calling table.tsv.md5/make from table.tsv/make.
bugfix: lib/sql.py: parse_exception(): typed_name_re: need to ensure that full name is matched rather than just first character
web/links/index.htm: updated to Firefox bookmarks. Mountain Lion upgrade: added Python psycopg2, Python OrderedDict, X11.
lib/runscripts/table.run: table_make(): run canon_rel_path on the datasrc dir so it's displayed as the direct path to it, without ..
lib/runscripts/table.run: table_make(): moved comment about "${@/#/$table/}" to right after the line it describes
lib/runscripts/table.run: input_make(): renamed to table_make() to make it clear that the target names are relative to the table subdir itself, not the datasrc dir. it was previously called input_make because it used inputs/input.Makefile directly, but now will use any Makefile in the datasrc dir.
added inputs/GBIF/Makefile, which links to ../input.Makefile, to allow running make directly in the datasrc dir (i.e. without --makefile=.../input.Makefile). this is required by the runscripts.
inputs/GBIF/raw_occurrence_record/run: table.tsv.md5/make(): don't add extra .md5 extension to $target_filename because it already has the extension as part of the target name (now that this command is run in its own make target rather than in table.tsv/make())
lib/sql_gen.py: import OrderedDict from collections instead of ordereddict for Mac 10.8 Mountain Lion upgrade
lib/sh/util.sh: logging: renamed $log_level_indent to $log_indent_step to avoid confusion with the log_level, which is a different kind of indent (using + signs instead of |s)
lib/sh/util.sh: logging: also export PS4, because it follows verbosity and therefore also needs to be propagated to invoked commands
lib/sh/util.sh: can_log(): support decimal verbosities using round_down()
lib/sh/util.sh: always use echo_export instead of export, even when the verbosity at load time would suppress output, because the verbosity may actually increase during the script due to log-- calls, etc., and vars should then still be echoed as expected
lib/sh/util.sh: log+: inlined PS4_prefix_n alias because there is now room for it
lib/sh/util.sh: log+: $verbosity: support decimal verbosities (but not decimal log_levels) by using new float+int
lib/sh/util.sh: removed no longer used log-. use log+ with a negative argument instead.
bugfix: lib/sh/util.sh: log+: PS4 when $1 < 0: need to negate $1 because now it's a negative number
lib/sh/util.sh: log--: use log+ with 1 instead of log so we don't need a separate log- function
lib/sh/util.sh: logging: log+: support negative log_level adjustments. log-: use log+ with the negative of its argument
web/links/index.htm: updated to Firefox bookmarks. sorted MySQL and PostgreSQL sections.
web/links/index.htm: updated to Firefox bookmarks
web/links/index.htm: Mac 10.8 Mountain Lion > PostgreSQL: appended tab to name to disambiguate it from the general PostgreSQL section
bugfix: web/links/index.htm.run: prepend __ to the HTML anchors of bookmarks toolbar links so they don't shadow/override folders of the same name: also renamed the corresponding self-anchor hyperlinks
bugfix: web/links/index.htm.run: prepend __ to the HTML anchors of bookmarks toolbar links so they don't shadow/override folders of the same name: need to replace all occurrences (using /g option to sed) to include both HTML anchors on the line
web/links/index.htm.run: prepend __ to the HTML anchors of bookmarks toolbar links so they don't shadow/override folders of the same name
lib/sh/util.sh: sed: changed it to an alias so it will also be expanded when passed to an external command (like in_place) that can only run an executable, not a shell function (this occurs as long as the external command is defined as an alias which ends in space, to alias-expand the next word). added associated $sed_cmd var for cases when there is no alias wrapper around the external command, and the literal alias body must be used instead.
lib/sh/util.sh: log+/-(): setting verbosity: added space around operators to support negative numbers
lib/sh/util.sh: added float functions (esp. float+int()) for dealing with decimal verbosities used by sql.py and column-based import
inputs/GBIF/raw_occurrence_record/run: inputs/GBIF/raw_occurrence_record/run: added check_target_exists so you know why make skipped the file (for other, non-silent targets, it would also avoid make's verbose output when the file exists)
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): moved making of table.tsv.md5 to separate function
lib/sh/util.sh: $log_fd: indicated that this is only initially stderr (however, the new port will just use stderr if it's not redirected separately)
lib/sh/util.sh: verbosities: 3: added that this includes values of kw params
lib/sh/util.sh: logging: added description of the verbosities available, including what each one does and what it's useful for:
lib/sh/util.sh: command__set_fds(): echo the >& line at a higher log_level, because this information (i.e. which fd is used by the command for logging) is primarily for debugging and should not normally be printed
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): also add md5 sum for table.tsv
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added back filter kw args, which had gotten deleted in a commit without update (although actually, svn should not allow a commit without update, so the working copy may have gotten corrupted)
lib/runscripts/table.run: input_make(): added $silent flag which turns on make's --silent option
lib/sh/make.sh: set_make_vars: usage: added "use $target" to indicate that vars are made available by this alias
lib/sh/make.sh: inline_make(): take the script stdin from caller-provided stdin and get make's stdin from global stdin, so that the caller can just use <<'EOF' rather than having to include a specific fd before the <<
lib/runscripts/table.run: input_make(): documented that it requires a Makefile in the datasrc dir containing `include ../input.Makefile
lib/runscripts/table.run: input_make(): added comment explaining that "${@/#/$table/}" replaces the empty str at the beginning of str (/#) with $table/
lib/runscripts/table.run: put functions on one line where possible
lib/runscripts/table.run: input_make(): use the local make() function instead of external make directly, because it sets $cmd_log_fd appropriately to ensure that all the echoed make commands get properly logged to stdlog/stderr (stdlog is fd 30 when it's redirected to a file)
inputs/GBIF/raw_occurrence_record/run table.tsv/make() and functions used by it: added usage comments for cmd line usage, caller usage, and declaring function usage
web/links/index.htm: removed no longer needed - at the beginning of every folder's description
lib/Firefox_bookmarks.reformat.csv: label page's description: don't do this for folders (i.e. descriptions preceded by an <H3> tag) because their descriptions are always author-added rather than from a web page. this avoids needing to add a - at the beginning of every folder's description.
web/links/index.htm: updated to Firefox bookmarks. upgrading Mac OS X to 10.8 Mountain Lion: put fixes subdirs in order.
web/links/index.htm: updated to Firefox bookmarks. upgrading Mac OS X to 10.8 Mountain Lion: added fixes for Apache ~/Sites dirs, Apache PHP
web/links/index.htm: updated to Firefox bookmarks. added bookmarks for upgrading Mac OS X to 10.8 Mountain Lion. (WARNING: DO NOT upgrade unless you are prepared to fix several programs broken by the upgrade: svn, PostgreSQL. instructions are in the corresponding bookmark subdirs. these programs will be COMPLETELY UNAVAILABLE until they are manually fixed!)
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): cols: also include scientific_name, which is preferable as a TNRS input because it also contains lower ranks
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): cols: also include id, institution_code, collection_code, catalogue_number
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added filter for institution_codes in herbaria.ih (in PostgreSQL)
inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added column subset (from http://vegpath.org/twiki/bin/view/Main/ConfCall20130509#subsetting_strategy > include)
lib/sh/db.sh: mk_select: constructed queries: in ${var:+if_true} syntax, put the newline at the end of the if_true value instead of the beginning, so that each ${var:+if_true} expression starts at the beginning of a line
lib/sh/db.sh: mk_select: constructed queries: support custom columns list using $cols
lib/sh/db.sh: mk_select: constructed queries: support WHERE clause using $filter
lib/sh/util.sh: echo_stdin(): don' increase the log_level, because the input being sent to the command (which is usually a set of interpreted commands itself) is necessary to fully know what action is being performed
bugfix: lib/sh/local.sh: add a manual errexit for $() exprs by embedding them in just a var assignment (without local or declare), whose exit status will then equal the of the $(). a `|| return` also needs to be added because errexit does not work on assignment statements. this commit adds them for func_loc(), echo_func(), canon_rel_path(), set_paths(), save_cache, cached realpath(), local.sh global vars
lib/sh/util.sh: shell-variable-based caching: usage: updated alias names
web/links/index.htm: updated to Firefox bookmarks. regenerating also removed extra <a name=...> tags that were added when running index.htm.run on an already-processed index.htm.
bugfix: lib/sh/util.sh: `set -o`: need a -o before every option to set
lib/Firefox_bookmarks.reformat.csv: add HTML anchors for external links' names: hyperlink the added anchors as clickable paragraph marks (like Redmine), which take you to the HTML anchor. this is analogous to the clickable folder names which take you to their anchors.
lib/Firefox_bookmarks.reformat.csv: add HTML anchors for external links' names, in addition to their URLs
bugfix: lib/Firefox_bookmarks.reformat.csv: add HTML anchors for external links' URLs: use .*? instead of .* to match the contents of the <A> tag before the HREF
lib/Firefox_bookmarks.reformat.csv: add HTML anchors for external links using the URL itself as the anchor. these can be used to link to the comments attached to a bookmark in the bookmarks page, rather than to the bookmark's destination.
lib/sh/util.sh: `set -o`: added pipefail option, to ensure that exit statuses (esp. for errexit) also work with pipelines (a|b)
lib/sh/util.sh: use new log+ with a numeric arg instead of multiple calls to log++
lib/sh/util.sh: use new log+ with a numeric arg instead of multiple calls to log++. use the command form of log-- (`log-- echo_func`) to counter the normal log++ performed by echo_func, for cases when the function name is descriptive and should be output at the same log_level as the commands it runs.
*{.sh,run}: removed extra space between function name and ()
lib/sh/util.sh: echo_func(): take the FUNCNAME as an argument (auto-added in the echo_func alias) instead of getting it from the FUNCNAME array (which would have produced an inaccurate value if another function call (such as log++) intervened between the caller and echo_func())
lib/sh/util.sh: echo_func(): display where the function was declared (using new func_loc()) instead of where echo_func() was called from. this is more intuitive when debugging, becaues the line # is where the function starts. it also helps remove the dependency on the FUNCNAME/BASH_* arrays, which would produce an inaccurate value if another function call (such as log++) intervened between the caller and echo_func().
lib/sh/util.sh: added func_loc(), which gets where a function was declared in the format file:line, and helper alias set_func_loc
lib/sh/util.sh: str2varname(): use bash's internal ${var//glob/repl} syntax, which supports character classes ([a-z], etc.), instead of sed (which is slower because it's an external command and uses regexps)
lib/sh/util.sh: added support for caching realpath() using the new shell-variable-based caching. this can be enabled via $realpath_cache and defaults to off because it's currently slower than without. note that the cache needs to be cleared in cd() because relative paths will become invalid.
lib/sh/util.sh: added str2varname() and use it in include_guard_var()
lib/sh/util.sh: added shell-variable-based caching functions
lib/sh/util.sh: canon_rel_path(): use $PWD instead of the slower $(pwd -P) now that $PWD has symlinks expanded
lib/sh/util.sh: expand symlinks in $PWD: moved it before canon_rel_path(), which will require it
lib/sh/util.sh: expand symlinks in $PWD using `cd -P .` so it matches the output of realpath
lib/sh/util.sh: cd(): documented what -P option does and why it's used
lib/sh/util.sh: cd(): always run cd with -P (expand symlinks) so that $PWD will contain a canonical path which matches the output of `readlink -f`. however, don't display the added -P in the logging output because it distracts from the directory being changed to.
lib/sh/util.sh: logging: added log+/-, which take a variable log_level step, and use them in log++/--
lib/sh/util.sh: repeat(): for simplicity and speed, just append to a local string var (and echo the result at the end) instead of using printf
bugfix: lib/sh/util.sh: repeat(): need to use %s instead of %q (escaped string) in printf