sql.py: mk_select(): joins: Fixed bug where join_not_equal did not do what it was designed for, which is filtering out rows that match the join condition. Before the fix, it effectively did a cross join with only the matching rows excluded, which duplicated rows. Renamed join_not_equal to filter_out to reflect its intended use. Support table-scoped column names in the WHERE conds list.
sql.py: put_table(): Fixed bug where ORDER BY column needed to have table0 name prefixed (if it didn't already have a table name), to avoid ambiguous column references
sql.py: mk_select(): Fixed bug in joins where right_col had the table name prepended before it was copied for use with a different table name in join_using and join_not_equal
Mapped some unmapped fields in DwC inputs
Added mappings/for_review/DwC2-VegBIEN.specimens.fields.csv
db_xml.py: put_table(): Fixed bug where the insert wasn't committed right after inserting the node, but instead only after children with fkeys to parent (which are independent of the node itself) were inserted
sql.py: put_table(): insert_(): Use insert_select() instead of run_query_into() if new option pkeys_table_exists is on
sql.py: mk_select(): Support joins with !=
sql.py: mk_select(): Support only some join columns being join_using
sql.py: put_table(): Renamed in_joins to insert_joins and joins to select_joins for clarity
db_xml.py: put_table(): Support children with fkeys to parent
sql.py: mk_select(): Make tuple optional for None literal values
sql.py: put_table(): Removed "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings
bin/map: by_col: row_ct = 0 because it's unknown for now
mk_select(): Support join conditions with literal values
sql.py: mk_insert_select(): embeddable: Don't cache function_query because function def could change and then change back
sql.py: with_savepoint(): Renamed savepoints to have "level" prefix, since the # indicates the level #
sql.py: get_cur_query(): Also accept input params to combine with input_query, and pass input params when get_cur_query() is called
sql.py: DbConn.run_query(): Pass input query to get_cur_query()
sql.py: get_cur_query() and _add_cursor_info(): Support input_query param that will be used if the raw query is None. Pass input_query in DbConn.execute().
sql.py: DbConn.run_query(): Check that query != None
bin/map: out_is_db: Only rollback() and close() out_db if it was connected
sql.py: DbConn: Added connected()
sql.py: Wrapped calls to get_cur_query() that are used as strings in str(), because get_cur_query() can return None
sql.py: next_version(): Versions start from 1, because first existing name was version 0
put_table(): Use short name for temp_suffix now that version # will be added if needed
sql.py: mk_select(): Parse join columns for literal values and table-scoped names as well
mappings/DwC2-VegBIEN.specimens.csv: establishmentMeans: Call _toGrowthform on growthform
schemas/vegbien.sql: Added _toGrowthform
sql.py: put_table(): Changed temp_prefix to a suffix so main name won't be removed if name is truncated
sql.py: mk_select(): fields: Support columns with tables. Changed syntax for literal values so that it wouldn't conflict with new syntax for columns with tables.
iters.py: flatten(): If not an iterable, just return the value
sql.py: put_table(): Pass in_pkeys and out_pkeys to run_query_into() by ref so they will be updated if the table names are changed
sql.py: put_table(): Pass pkeys to run_query_into() by ref so it will be updated if the table name is changed
sql.py: run_query_into(): If CREATE TABLE AS generates a DuplicateTableException, rename the table with a version # prepended
sql.py: run_query_into(): Made into param a reference so that the function can change it, and renamed it to into_ref
sql.py: put_table(): On DuplicateKeyException, call run_query_into() recoverably, so that DB errors such as DuplicateTableException will be parsed
sql.py: Removed no-longer-needed try_insert()
sql.py: Merged with_parsed_errors() into run_query() so all recoverable queries would automatically benefit from DB error message parsing. DbConn: Moved _add_cursor_info() to DbCursor.execute().
sql.py: with_parsed_errors(): Raise DuplicateTableException for "relation already exists" errors instead of "table name specified more than once" errors
sql.py: run_query_into(): Removed "DROP TABLE IF EXISTS" because sometimes when there are collisions in the temp table names, the code actually uses both "copies" of the temp table. Eventually, this situation will be resolved by adding a counter to the temp table name.
sql.py: Cleaned up the messages of DbException and its subclasses
exc.py: ExceptionWithCause: Added cause_newline option to put the cause on its own line instead of on the message line
sql.py: with_parsed_errors(): Also parse "table name specified more than once" errors as DuplicateTableExceptions
sql.py: put_table(): Handle DuplicateKeyExceptions by running a select query on the unique constraint columns
sql.py: mk_select(): Support tuples of tables, not just lists
sql.py: with_parsed_errors(): Support table names that start with "_"
sql.py: DbConn: Added with_savepoint(). with_savepoint(): Use new DbConn.with_savepoint().
schemas/functions.sql: Added _toBool
mappings/DwC2-VegBIEN.specimens.csv: establishmentMeans: Use _toBool on iscultivated, isnative
schemas/functions.sql: Made trigger functions IMMUTABLE since they do not modify other tables
sql.py: put_table(): Added support for putting just a window subset of the rows in the table. Removed "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings.
sql.py: put_table(): Return the column where the pkeys are made available (the out_pkey) instead of taking it as an argument
sql.py: put_table(): Get input pkeys corresponding to rows in insert and join together out_pkeys and in_pkeys into final pkeys table
sql.py: put_table(): Fully support multiple in_tables, joined together using the main input table's pkey
sql.py: mk_select(): joins: Fixed bug where USING-based joins did not have closing ")"
db_xml.py: put_table(): Fixed bug where in_table was last in in_tables instead of first, causing it to be ignored by the current put_table() implementation, which only considers the first table name
db_xml.py: put_table(): Fixed bug where pkeys_table returned by recursive call to put_table() needed to be prefixed with $ to be treated as an input column name rather than a literal value
sql.py: mk_select(): Support joins with USING, which can be used to merge multiple input cols into the same output col
sql.py: mk_insert_select(): embeddable: Fixed bug where query that uses function was being sorted by its first column (the default mk_select() setting), when it should be left in its original order
sql.py: put_table(): Take a dict mapping out to in cols instead of separate in and out cols lists
sql.py: mk_select(): Joins: Reversed order of left_col and right_col in the joins dict as well, so the joined table's columns are the keys
sql.py: mk_select(): Joins: Reversed order of left_col and right_col so the column of the table being joined is first, to match the form of a WHERE clause
sql.py: mk_select(): Support joins
sql.py: mk_select(): Accept a list of tables to join together (initial implementation just uses the first table)
sql.py: mk_select(): Support ORDER BY clause. By default, order by the pkey, since PostgreSQL does not guarantee any row order without an ORDER BY (and this was causing some staging table tests to fail).
bin/map: In debug mode, print the row # and input row just like in error messages
bin/map: verbose_errors also defaults to on in debug mode
sql.py: add_row_num(): Make the row number column the primary key
csv2db: Use new sql.cleanup_table() to map NULL-equivalents to NULL. Consider the empty string to be NULL.
sql.py: Added cleanup_table()
csvs.py: Added row filters
db_xml.py: put_table(): Fixed bug where relational functions were not being treated as value nodes, and thus their containing child was treated as a child with a backwards pointer instead of a field
xml_func.py: Added is_func*() and is_xml_func*() and use them where their definitions were used
db_xml.py: Added value() and use it where xml_dom.first_elem() was used
mappings/DwC2-VegBIEN.specimens.csv: *Latitude/*Longitude: Moved _toDouble directly after the output col name, so that it's run after any translation functions (which all return strings). *ElevationInMeters: Added _toDouble around all output cols.
xpath.py: get(): Create attrs: Fixed bug where attrs were created with last_only on, which caused attrs to be created multiple times if there were multiple attrs of the same name but different values, because the last_only optimization would only check the last attr of that name
mappings/DwC2-VegBIEN.specimens.csv: *Latitude/*Longitude: Use new _toDouble to convert strings to doubles (needed for by_col)
schemas/functions.sql: Added _toDouble
bin/map: When calling xml_func.process(), pass DB connection if available
xml_func.py: process(): If DB with relational functions available (passed in via db param), call any non-local XML functions as relational funcs
sql.py: put(): pkey param (now pkey_) defaults to table's pkey
bin/map: by_col: In debug mode, print stripped XML tree that guides import
vegbien_dest: Fixed bug where there was a missing line continuation char before schemas var
sql.py: DbConn: Fixed bug where schemas db_config value needed to be split apart into strings. Fixed bug where current_setting() returned a value rather than an identifier, so it had to be used with set_config() instead of SET, and run after SET TRANSACTION ISOLATION LEVEL. Moved Input validation section before Database connections because it's used by Database connections.
Regenerated vegbien.ERD exports
vegbien.ERD.mwb: Changed lines to a configuration that MySQLWorkbench wouldn't keep resetting whenever the ERD was reopened
vegbien_dest: Added "functions" to schemas
sql.py: db_config: Added schemas param. DbConn: Use any schemas db_config value to set search_path.
sql.py: add_row_num(): Name the column "_row_num" so that it doesn't conflict with any "row_num" column that's part of the table schema
main Makefile: VegBIEN DB: functions schema: Renamed schemas/functions/clear to .../reset to reflect that it also resets the schema to what's in the dump file. schemas/functions/reset: Use now-available schemas/functions.sql to create the schema.
Added autogen schemas/functions.sql
schemas/vegbien.sql.make: Use new pg_dump_vegbien
Added pg_dump_vegbien to dump a schema of the vegbien db
main Makefile: VegBIEN DB: Added functions schema targets
Makefile: $(confirm): Support a separate line outside of the highlighted line. Include the "Continue?" in the macro since all prompts include it.
Makefile: VegBIEN DB: Display different warning message depending on whether entire DB or just current public schema is being deleted
db_xml.py: put_table(): Recurse into forward pointers
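
Note on the join_not_equal -> filter_out fix in mk_select(): the difference between joining on != and filtering out matches can be reproduced with a small standalone sketch. This is not the sql.py implementation. It uses Python's sqlite3 module as a stand-in for PostgreSQL, the table names in_rows and seen are hypothetical, and the LEFT JOIN ... IS NULL form is only one common way to write such a filter (an assumption about what filter_out is meant to achieve, not a copy of the query it generates).

```python
import sqlite3

# In-memory SQLite DB as a stand-in for PostgreSQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE in_rows (id INTEGER PRIMARY KEY, name TEXT);  -- hypothetical input table
CREATE TABLE seen (name TEXT);                             -- hypothetical table of matches
INSERT INTO in_rows (name) VALUES ('a'), ('b'), ('c');
INSERT INTO seen (name) VALUES ('a'), ('b');
""")

# Joining on != pairs each input row with every NON-matching row of the other
# table: a cross join with the matching pairs excluded. Rows are duplicated
# and rows that do have a match are not removed.
not_equal = conn.execute("""
SELECT in_rows.name FROM in_rows
JOIN seen ON seen.name != in_rows.name
ORDER BY in_rows.id
""").fetchall()

# Filtering out matches: keep only the input rows with no match at all
# (written here as an anti-join).
filtered = conn.execute("""
SELECT in_rows.name FROM in_rows
LEFT JOIN seen ON seen.name = in_rows.name
WHERE seen.name IS NULL
ORDER BY in_rows.id
""").fetchall()

print(not_equal)  # 'a' and 'b' survive, and 'c' appears twice
print(filtered)   # only 'c' remains
```

With this data, the != join returns four rows, while the anti-join returns the single unmatched row.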