bugfix: lib/runscripts/in_datasrc_dir.run: datasrc_make(): use set_make_vars and $_remake as required in lib/sh/make.sh `remaking`
lib/sh/make.sh: remaking alias: documented that you MUST use set_make_vars at the beginning of any function that uses this, so that $_remake is properly set to $remake and not left at its previous value
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
fix: lib/sh/util.sh: verbosity_min(): usage: clarified that '' is a special value that causes $verbosity to be overwritten to ''
lib/runscripts/table.run: added test_() target and use it in remake_VegBIEN_mappings() (it would not be clear that remake_VegBIEN_mappings() runs the tests)
lib/common.Makefile: added %/live, for use with `make inputs/download`
bugfix: lib/runscripts/datasrc_dir.run: import(): don't run `sql/install` if the schema already exists, because this will try to rerun all the schema-creation queries. note that this idempotent functionality was not provided by the `make .../install` target that was previously used (idempotency is new with new-style import).
bugfix: lib/runscripts/datasrc_dir.run: import(): can't run `datasrc_make reinstall` anymore because this now defers to the runscript for new-style import datasources (which was done so that `make .../install` properly reinstalls all the datasources). instead, call the applicable make targets manually (there are just 2 of them).
lib/sh/local.sh: public_schema_exists(): use a higher log_level for pg_schema_exists, to avoid all the verbose output involved in running the query
bugfix: lib/sh/local.sh: public_schema_exists(): can no longer use psql_script_vegbien for this, because using `SET search_path` (called by psql_script_vegbien) with a schema that does not exist no longer produces an error. instead, use new pg_schema_exists(), which uses a different command that does produce an error if the schema does not exist.
lib/sh/db.sh: added pg_require_schema()
lib/sh/util.sh: stderr2stdout(): documented that this redirects fd 2->1 and log_fd (but not back to 2)
bugfix: lib/sh/util.sh: stderr2stdout() use `command` before tee, which re-filters log_fd so that stderr itself is also filtered. this allows log-filtering out an otherwise-confusing benign error when using e.g. stderr_matches().
lib/sh/util.sh: added not(), for use in prefixing wrapped commands
lib/sh/db.sh: added pg_schema_exists()
lib/sh/util.sh: added stderr_matches()
lib/sh/util.sh: documented that fds 2x/3x should not be used because we use these, as opposed to 1x which is used by the shell internally
lib/sh/util.sh: added stdout_contains()
lib/sh/util.sh: added stderr2stdout()
fix: lib/sh/db.sh: pg_table_exists(): usage: documented that $table is actually required for this function
lib/sh/util.sh: import_vars: don't overwrite vars that are already defined, to allow the caller to specify their own values for the vars to create. this requires callers that rely on the overwriting functionality to reverse the order in which they run use_* commands, so that the higher-precedence use_* is applied first and the other one as the default values for the first.
lib/sh/db.sh: pg_table_exists(): use `SELECT NULL` instead of `SELECT *` to avoid a long column list cluttering up the log output
lib/runscripts/table.run: table_make_install(): simplified the setting of $noclobber since there no longer needs to be a different command for when the log exists
bugfix: lib/runscripts/table.run: need to errexit the make target, so that errors in the SQL install scripts are not suppressed. this requires pre-checking if the table exists (using new pg_table_exists), so that the install target's errexit does not then need to be suppressed for cases when the table already exists.
lib/sh/db.sh: added pg_table_exists()
bugfix: lib/Firefox_bookmarks.reformat.csv: remove empty <DD> tags (which Firefox now adds for all bookmarks) so they don't create a blank space on the page
bugfix: lib/Firefox_bookmarks.reformat.csv: don't prepend "page's description:" to empty <DD> tags, which Firefox now adds for all bookmarks, even if they don't have a description
Makefile, schemas/.Mac.conf: upgraded to PostgreSQL 9.3, which is needed for proper exception parsing in the auto-re-create-views functionality. this also removes the Mac 10.8 Mountain Lion quirks, such as renaming the postgres user to _postgres (which messed everything up, but is now back to normal).
bugfix: lib/runscripts/file.pg.sql.run: export_(): exclude Source and related tables so that these will be re-created by the staging tables installation instead, ensuring that they are always in sync with the Source/ subdir
lib/sh/util.sh: already_exists_msg(): added instructions on how to force-remake when the file already exists (prepend `rm=1` to the command)
bugfix: lib/runscripts/view.run: don't do anything in load_data(), to avoid trying to remake header.csv before the view is created. (for views, this instead happens in postprocess().)
lib/runscripts/table.run: reordered functions in the order they are called by import()
bugfix: lib/sh/make.sh: $remake: need to explicitly propagate this to invoked commands if it was set from $rm
lib/sh/db.sh: mk_select(): usage: documented that this also takes a $limit/$n param
lib/sh/db.sh: limit(): also support using $n as the limit param, since this var name is used by other parts of the import process
lib/sh/db.sh: limit(): usage: documented that this also need a $limit param
lib/runscripts/extract.run: added export_sample()
lib/runscripts/import_subset.run: $version: use new $extract_view, which is set to the same value that this was
lib/runscripts/extract.run: use the extract-specific view instead of all of analytical_stem
added lib/runscripts/import_subset.run, extract.run
lib/sh/local.sh: psql(): also accept $public as the $schema param, since this is used by a lot of import scripts
lib/sh/util.sh: added require_dot_script()
bugfix: lib/sh/util.sh: $top_script: use @BASH_SOURCE instead of $0, because this is also defined for .-scripts
bugfix: lib/runscripts/util.run: `trap on_exit EXIT`: only set this if the script is not a dot script, because if it is a dot script, on_exit() will not be invoked until the calling shell exits, which may be much later than when the script is run. previously, this was handled by canceling the EXIT trap if on_exit() is run manually, but this would not work correctly if a load-time error prevented on_exit() from running and canceling the trap.
bugfix: lib/runscripts/util.run: if is_dot_script, fix $ when no args causes this to incorrectly contain the script name. use is_dot_script rather than the presence of $ args to decide whether to use @BASH_ARGV, because @BASH_ARGV is actually wrong when run as a .-script (it contains the script name).
when no args causes this to incorrectly contain the script name. use is_dot_script rather than the presence of $
bugfix: lib/sh/util.sh: is_dot_script(): need to subtract 1 from ${#BASH_LINENO[@]}, because this is the array length rather than the index of the last element as in Perl
lib/sh/util.sh: added is_dot_script()
lib/runscripts/util.run: run script template: changed sample command name to all() because each runscript requires this in order to be run without args
lib/runscripts/util.run: support scripts that are run as shell-includes (with leading "."), by allowing the calling script to manually invoke on_exit() without it then being invoked twice (the end of a shell-include does not trigger the EXIT trap)
lib/runscripts/util.run: support scripts that are run as shell-includes (with leading "."), by also accepting $@ args that are passed along in the util.run include, in addition to @BASH_ARGV
bugfix: lib/sh/util.sh: alias_append(): need to enclose $(alias) call in "" because its result may contain separator chars (i.e. whitespace) that will be parsed incorrectly. this appears to only be a bug when runscripts are run as shell-includes, with a leading ".".
bugfix: lib/sh/local.sh: psql(): $is_root: use `` around case statement instead of $(), because it contains an embedded unbalanced )
bugfix: lib/sh/local.sh: psql(): don't default the connection vars using use_local if running as the postgres user. in that case, connection must happen via a socket, with server="", and as the user running the command (postgres), with user="".
bugfix: lib/sh/db.sh: avoid outputting to /dev/fd/# when running as sudo on Linux, because this causes a "Permission denied" error (due to the /dev/fd/# file being owned by a different user). this is not a problem with normal redirects (>&#), because they do not use /dev/fd/# files which can have access permissions.
bugfix: lib/runscripts/util.run: to_top_file(): need to pass "$@" to to_file
lib/runscripts/util.run: to_top_file: added function for this (in addition to alias), so that this can be run from sudo in a wrap_fn command
lib/sh/db.sh: pg_as_root(): run sudo with echo_run to help debug
bugfix: lib/sh/db.sh: pg_cmd(): only set PG* connection/login env vars when the corresponding var is non-empty. there are some situations in which these must be unset (in order to use the default value), and other situations when the var must be set to something (i.e. "") to avoid it being defaulted to a value in local.sh > connection vars.
bugfix: lib/sh/local.sh: pg_as_root(): need to use -E (preserve environment) option to sudo, so that $schema, $table get passed through
bugfix: lib/sh/local.sh: psql(): only \set schema, table if $schema, $table are non-empty, because otherwise, you will get a "zero-length delimited identifier" error
lib/sh/local.sh: added require_remote()
lib/sh/db.sh: added pg_as_root()
lib/runscripts/util.run: added $wrap_fn to run any function via sudo, etc.
bugfix: lib/common.Makefile: $(subMake): don't enclose the target in "" because sometimes the target is empty (i.e. `all`), and nothing should be passed to the sub-make
bugfix: *Makefile: recursive invocation of $(MAKE): enclose targets in "" in case they contain *
bugfix: lib/runscripts/table.run: load_data(): pass $is_view through to `make reinstall` so that DROP VIEW will be used instead of DROP TABLE where applicable
lib/sh/util.sh: added instructions for making an export only visible locally
bugfix: lib/runscripts/table.run: load_data(): in remaking mode, need to remake header.csv in case the columns have changed
lib/runscripts/datasrc_dir.run: import(): added remake (rm=1) mode that reinstalls the datasource before continuing with the subdirs' import actions
lib/runscripts/in_datasrc_dir.run: added datasrc_make(), which runs make in the datasrc dir
lib/sql_io.py: put_table(): default param: documented that this will be used for all missing rows, regardless of which error caused them not to be inserted. this means that auto-forwarding (wiki.vegpath.org/Auto-forwarding) can be used with any type of constraint violation, not just NOT NULL constraints (which it is typically used with).
lib/sql_io.py: put_table(): added link to new INSERT ON DUPLICATE SELECT wiki page, which now contains the explanation in the doc comment
bugfix: lib/runscripts/*: calls to rm: use `rm -f` instead to avoid an error (which aborts the program) if the file does not yet exist
bugfix: lib/sh/make.sh: don't allow rm to override remake if an invoked script uses this file. this fixes a bug in `rm=1 inputs/.../.../run` where the remake action would be invoked on the map_table command even though it had been suppressed, because it was run externally (i.e. make.sh was reloaded) and the rm=1 flag was still active
lib/runscripts/datasrc_dir.run: added postprocess target to run postprocess in just the table subdirs, skipping any additional subdirs that don't have this target
lib/runscripts/datasrc_dir.run: @subdirs: moved import_order.txt subdirs into separate @table_subdirs, which provides access to just the table subdirs when the user adds other dirs to @subdirs
bugfix: lib/runscripts/view.run: remake_VegBIEN_mappings(): also need to remake header.csv, not just map.csv as for tables, because view columns may change when the view is regenerated
bugfix: lib/sh/util.sh: set_fds(): don't add surrounding quotes to empty redirect dest
bugfix: lib/sh/util.sh: set_fds(): need to check if redirect is empty before escaping it with `printf %q`, which may add surrounding quotes to an empty string
bugfix: lib/sh/util.sh: set_fds(): need to escape redirect destinations which are files, because they may contain special shell characters
lib/sh/util.sh: added rm_prefix()
lib/sh/db.sh: mysql_cmd(): added caller usage with connection/login opts
lib/sh/db.sh: mysql(), mysql_export(): usage: added database=...
bugfix: schemas/Makefile, lib/common.Makefile: enclose schema names in "" so that they won't be lowercased
bugfix: lib/sql_io.py: put_table(): Getting output table pkeys of existing/inserted rows: need to include the index cond in the join condition here, too (using var join_custom_cond), so that an index scan can be used instead of a much slower full-table sort
bugfix: lib/sql_io.py: put_table(): DuplicateKeyException: need to include any index cond in the join condition, so that an index scan can be used instead of a much slower full-table sort (otherwise the query planner will not know that it can restrict results to rows satisfying the index cond)
lib/sql_gen.py: Join: added custom_cond param that can be used to add to the JOIN condition
lib/sql.py: distinct_table(): support custom filters on the distincting query
lib/sql_gen.py: ColValueCond: support conds that are just plain SQL (without separate left and right sides) using special custom_cond flag value
lib/sql_io.py: ensure_cond(): documented meaning of passed, failed params (at least one row passed/failed the constraint)
lib/runscripts/util.run: usage: documented that this usage also applies to all files that include this file
lib/runscripts/util.run: usage: clarified that the cmd to run is a function
added lib/runscripts/pg.conf.run, which installs PostgreSQL config files
added lib/runscripts/install.run, analogous to import.run
bugfix: lib/db_xml.py: put_table(): turned off db.autoanalyze, since forcing an ANALYZE after every bulk insert is inefficient for small datasources. the default autovacuum settings in schemas/postgresql.conf should be fine; however, the frequency and/or threshold may need to be increased if autovacuum does not ANALYZE frequently enough to replace db.autoanalyze.
lib/sh/db.sh: mk_select(): added support for ORDER BY
lib/sh/db.sh: added pg_export_table_to_dir(), analogous to pg_export_table_to_dir_no_header()
lib/sh/util.sh: is_array(): handle unset vars (=false). this fixes a bug in pg_export_table_no_header, which produced the error "lib/sh/util.sh: line 290: declare: cols: not found".
fix: lib/sh/util.sh: join(): documented that delim must be a single char
added lib/runscripts/data.pg.sql.run (analogous to schema.pg.sql.run for data-only SQL scripts)