moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
lib/sh/db.sh: added pg_require_schema()
lib/sh/db.sh: added pg_schema_exists()
fix: lib/sh/db.sh: pg_table_exists(): usage: documented that $table is actually required for this function
lib/sh/util.sh: import_vars: don't overwrite vars that are already defined, to allow the caller to specify their own values for the vars to create. this requires callers that rely on the overwriting functionality to reverse the order in which they run use_* commands, so that the higher-precedence use_* is applied first and the other one as the default values for the first.
lib/sh/db.sh: pg_table_exists(): use `SELECT NULL` instead of `SELECT *` to avoid a long column list cluttering up the log output
lib/sh/db.sh: added pg_table_exists()
lib/sh/db.sh: mk_select(): usage: documented that this also takes a $limit/$n param
lib/sh/db.sh: limit(): also support using $n as the limit param, since this var name is used by other parts of the import process
lib/sh/db.sh: limit(): usage: documented that this also need a $limit param
bugfix: lib/sh/db.sh: avoid outputting to /dev/fd/# when running as sudo on Linux, because this causes a "Permission denied" error (due to the /dev/fd/# file being owned by a different user). this is not a problem with normal redirects (>&#), because they do not use /dev/fd/# files which can have access permissions.
lib/sh/db.sh: pg_as_root(): run sudo with echo_run to help debug
bugfix: lib/sh/db.sh: pg_cmd(): only set PG* connection/login env vars when the corresponding var is non-empty. there are some situations in which these must be unset (in order to use the default value), and other situations when the var must be set to something (i.e. "") to avoid it being defaulted to a value in local.sh > connection vars.
bugfix: lib/sh/local.sh: pg_as_root(): need to use -E (preserve environment) option to sudo, so that $schema, $table get passed through
lib/sh/db.sh: added pg_as_root()
lib/sh/db.sh: mysql_cmd(): added caller usage with connection/login opts
lib/sh/db.sh: mysql(), mysql_export(): usage: added database=...
lib/sh/db.sh: mk_select(): added support for ORDER BY
lib/sh/db.sh: added pg_export_table_to_dir(), analogous to pg_export_table_to_dir_no_header()
bugfix: lib/sh/db.sh: pg_dump(): don't default $struct flag to on, because both structure and data should be printed by default
lib/sh/db.sh: pg_dump(): added create_schema= flag to remove CREATE SCHEMA statements (useful if the schema already exists)
lib/sh/db.sh: added pg_dump(), using the code in bin/pg_dump_vegbien with clarity improvements
lib/sh/db.sh: added pg_cmd() (analogous to mysql_cmd() for PostgreSQL), and use it in psql(), so that other PostgreSQL operations can use this to set the PG* connection/login vars
lib/sh/db.sh: pg_export(): added usage
lib/sh/db.sh: mysqldump(): added create_db=1 flag to print the CREATE DATABASE statement
lib/sh/db.sh: mysqldump(): don't use --compatible=postgresql when the table structure is being exported, because this removes the table options (which include the COMMENT attribute). --compatible=postgresql remains on in data-only mode because embedded ` in data cannot easily be distinguished from ` around column names, so ANSI_QUOTES is needed to do the translation to " (and data sections do not contain table options). note that all --compatible modes that offer ANSI_QUOTES unfortunately exclude the table options, and there is no way to run a SQL query to set the SQL mode before beginning the dump, so ANSI_QUOTES translation must be handled by my2pg instead.
lib/sh/db.sh: mk_select(): support passing $cols as array instead of SQL string, which is easier to enter in a shell script (less quotes, \ , etc.)
lib/sh/db.sh: added cols2list()
lib/sh/db.sh: psql(): display stack traces and DETAIL sections of error messages at verbosity 2+, to help debugging (previously they were always turned off). in particular, the DETAIL section of a "duplicate key value violates unique constraint" error is useful because it contains the duplicated key.
lib/sh/db.sh: psql(): hide the verbose CONTEXT information that is output with each NOTICE by setting the VERBOSITY psql var to terse (postgresql.1045698.n5.nabble.com/Quiet-quot-CONTEXT-quot-td1906036.html#a1906037)
*{.sh,run}: use new log-() instead of log+() with a negative #
lib/sh/db.sh: set test mode when using limited # rows
*{.sh,run}: use mysql instead of mysql_ANSI because mysql is now an alias to mysql_ANSI (since ANSI mode still supports key MySQL features, like `` quotes)
lib/sh/db.sh: moved mysql_root() after the mysql->mysql_ANSI alias (and load new aliases) so that it will also use ANSI mode and support "" identifiers
lib/sh/db.sh: mysql: always use ANSI mode, to support "" identifiers. note that `` are still supported in this mode, so it also works with SHOW CREATE TABLE output and dumpfiles.
lib/sh/db.sh: mysql_cmd(): run the command with `time` because in mysql()'s output_data mode, no queries, and therefore no runtimes, are echoed, so the total runtime needs to be echoed separately instead. the total runtime is also useful in general, when many long-running queries are run and you would also like to know the total time (e.g. in make_analytical_db).
lib/sh/db.sh: limit(): usage: surrounded query in "" to clarify that it's a string, not a command
lib/sh/db.sh: added set_large_table alias, used to set to_file's $del flag based on $limit
lib/sh/db.sh: mysql_cmd(): use --quick to avoid buffering entire result (which is slow and memory-intensive for large result sets). this option applies to both mysql() and mysqldump().
lib/sh/db.sh: mysql_import(): in append mode, use LOAD DATA IGNORE to allow inserting duplicate rows
bugfix: lib/sh/util.sh: convention of fds to use for command-specific alternate stdin/stdout/stderr: changed to 40/41/42 because 10/11/12 are used by eval (which is used by set_fds()). use of fd 10/11/12 will cause hard-to-find silent bugs because exec will not print an error when these are used. documented why not to use other series of fds for this purpose:...
lib/sh/db.sh: psql(): set $PG* connection env vars from our connection vars ($server, $user, etc.). use use_pg to import $database so it can be different from $database for MySQL
lib/sh/db.sh: added use_pg alias
bugfix: lib/sh/db.sh: psql(): added missing `--set ON_ERROR_STOP=1 --quiet` opts from psql_script_vegbien
lib/sh/db.sh: added psql(), which replaces psql_script_vegbien and psql_verbose_vegbien for general connections. it also supports separate command and stdin files, to allow using `\copy from pstdin`, with pstdin pointing to a separate, EOF-terminated CSV file instead of inlined with the command and terminated with the \. escape (which may be contained within the CSV file itself).
bugfix: lib/sh/db.sh: mysql_import(): need to use direct connection to DB instead of via ssh, because ssh does not tunnel nonstandard fds
lib/sh/db.sh: added ssh2local alias
lib/sh/db.sh: mysql_seal_table(): also revoke GRANT OPTION, which apparently needs to be done in addition (and in a separate command, unlike when granting GRANT OPTION)
lib/sh/db.sh: mysql_seal_table(): REVOKE: ignore errors if REVOKE was already run
lib/sh/db.sh: mysql_seal_table(): REVOKE: removed unneeded explicit database since this is automatically set to the current database
lib/sh/db.sh: added mysql_seal_table(), which prevents further modifications to a table by a user. this uses new mysql_root().
lib/sh/db.sh: added mysql_root(). this version uses just use_root (compare to the mysql_root() override in local.sh).
lib/sh/db.sh: added skip_table(), which prints an already_exists_msg for tables
lib/sh/db.sh: mysql(): when echoing queries, also echo runtimes (turned on with `--verbose --verbose --verbose`)
lib/sh/db.sh: added mysql_rm_privileged_statements()
lib/sh/db.sh: mysql(): added $log_queries flag, which can be turned off to avoid using --verbose. this is useful when running bulk INSERT statements.
lib/sh/db.sh: added use_root alias (similar to use_local/use_remote)
lib/sh/db.sh: mysql(): added $data_only flag which enables --skip-column-names and $output_data
lib/sh/util.sh: $sed_cmd: don't use `command`, which causes sed calls (which are usually internal) to always be logged. instead, use echo_run wherever sed needs to be logged.
lib/sh/db.sh: added mk_drop()
bugfix: lib/sh/db.sh: pg_export_table_to_dir_no_header(): inlined $(pg_header) so setting $cols wouldn't affect pg_export_table_no_header(), which uses it as a kw param
bugfix: lib/sh/db.sh: load new aliases before mk_select(), which uses mk_table_esc
lib/sh/db.sh: added mk_select() and use it in mk_select_var
lib/sh/db.sh: added limit() and use it instead of `${limit:+LIMIT $limit}`
lib/sh/db.sh: added mysql_truncate() and use it instead of `mk_truncate|mysql_ANSI`
lib/sh/db.sh: truncate(): renamed to mk_truncate() because it actually just creates a TRUNCATE statement, rather than also executing it
lib/sh/db.sh: use_local/use_remote: unset $prefix after using it so it isn't unintentionally applied as a kw param for a later function
lib/sh/db.sh: mk_select: renamed to mk_select_var since it actually sets a var in the local context rather than returning a query
bugfix: lib/sh/db.sh: mk_select: ensure newline before LIMIT clause, in case caller provided custom query which did not have trailing newline
lib/sh/db.sh: mysql_import(): automatically ensure the table is empty (i.e. using truncate()), unless append=1 is specified. extra calls to truncate() now that this happens automatically have also been removed.
lib/sh/db.sh: added truncate()
lib/sh/db.sh: added mysql_import()
lib/sh/db.sh: removed no longer used mk_esc_name()
lib/sh/db.sh: mk_esc_name_alias(): don't overwrite an already-defined $*_esc, to allow the user to provide an already-escaped value (such as a schema-qualified table) directly
lib/sh/db.sh: mk_esc_name_alias(): echo_vars the *_esc var when it's set
lib/sh/db.sh: added mk_esc_name_alias() and use it to create mk_schema_esc, mk_table_esc
lib/sh/db.sh: mysql(): run with --local-infile=1
bugfix: lib/sh/db.sh: log_sql(): use can_log() instead because the verbosity now gets decremented as the log_level increases, so the threshold to compare to is 0 instead of 2
inputs/GBIF/raw_occurrence_record/run table.tsv/make() and functions used by it: added usage comments for cmd line usage, caller usage, and declaring function usage
lib/sh/db.sh: mk_select: constructed queries: in ${var:+if_true} syntax, put the newline at the end of the if_true value instead of the beginning, so that each ${var:+if_true} expression starts at the beginning of a line
lib/sh/db.sh: mk_select: constructed queries: support custom columns list using $cols
lib/sh/db.sh: mk_select: constructed queries: support WHERE clause using $filter
*{.sh,run}: put doc comment on next line when possible, so that the function name always comes before it
lib/sh/util.sh: renamed params() to kw_params() to clarify the type of params it handles
: use new `params` in all functions that have keyword params, in order to remove their vars from the environment. note that functions that use $? (such as die()) must save it *before running params, because params will overwrite $?.
lib/sh/util.sh: command(): use just one control var $cmd_log_fd instead of three flags (limit_log_fd, limit_stdout, stderr_is_errors) that indicated various common fd configurations. this is much clearer (you state which fd the common uses as its logging fd), more configurable (the logging fd can be any fd, not just 1 or 2), and more automatic (redirecting fd 2 to err_fd happens automatically if it isn't used for logging).
*{.sh,run}: use new limit_stdout=1 instead of stdout2stderr=1 limit_stderr=1
lib/sh/util.sh: merged limit_stderr_cmd/limit_stdout_cmd into command(), using flag vars to control what limiting actions it needs to perform. in command invocations, this involves setting the appropriate flag vars instead of using a limit_std*_cmd alias.
*{.sh,run}: multi-command aliases: enclose in {} so the alias can be used after ||
lib/sh/db.sh: added mysql_export(). documented that mysql_export_outfile() supports CSV, but requires the FILE privilege.
lib/sh/db.sh: mysql(): use --column-names to ensure the output is formatted appropriately. note that --column-names is the default, but better to specify it to be sure.
lib/sh/db.sh: mysql(): output_data mode: use --batch to ensure the output is formatted appropriately. note that --batch is the default when stdin is from a pipe, but better to specify it to be sure.
*{.sh,run}: use new limit_stdout_cmd instead of `stdout2stderr=1 limit_stderr_cmd`
lib/sh/db.sh: mysql_export_outfile(): ensure newline between format info ($mysql_load_data_format) and rest of SELECT query
lib/sh/db.sh: mysql_export_outfile(): ensure newline between SELECT columns list and INTO OUTFILE
lib/sh/db.sh: added mysql_export_outfile(), which uses SELECT ... INTO OUTFILE
lib/sh/db.sh: mysql(): added $output_data option that turns off --verbose and uses echo_stdin instead so that queries are not echoed to stdout along with data. in --verbose mode, use limit_stderr_cmd with stdout2stderr=1 to redirect echoed queries through the logging mechanism.
lib/sh/db.sh: mysql_cmd(): removed set_database because this is now performed by each caller before they use $database
lib/sh/db.sh: mysql_cmd: get command name by adding $FUNCNAME in an alias instead of using ${FUNCNAME1} in the function, so that callers can also call mysql_cmd via a nested function, etc.
lib/sh/db.sh: mysql_cmd(): moved setting of database to specific callers (mysql(), mysqldump()) because the syntax to set it differs between mysql* commands (e.g. mysql uses `--database=` while mysqldump uses `--databases ... --tables`)