db_xml.py: put_table(): Treat every node name that starts with "_" as a function, not just members of put_table_special_funcs. This ensures that DB function args are always treated as values, not children with fkeys to parent.
bin/map: by_col: Strip only XML functions that are not in the DB
db_xml.py: put_table(): Make special_funcs externally available as module constant put_table_special_funcs
sql.py: tables(): Changed schema param to schema_like and filter the schema using LIKE so that all schemas can be selected
to_do/timeline.doc: Updated to reflect the month we spent on optimization and column-based import
sql.py: put_table(): in_table name: Remove '-pkeys' suffix from the into table name before adding '-input' so that the name is shorter and clearer
sql.py: put_table(): Wrap repr() calls for debug messages in strings.as_tt() to add Redmine formatting
sql.py: put_table(): Output "Adding index" debug message with level=2.5 so it's not part of the Redmine steps
schemas/vegbien.sql, functions.sql: Cast functions: Fixed bug where invalid value exceptions were not being caught, because implicit conversions to the return type apparently only happen outside the block containing the RETURN statement (i.e. at the end of the function). Fixed by adding explicit type conversion to return type, so that type conversion would happen inside try block.
sql.py: put_table(): Re-enabled FunctionValueException handling, by just filtering out the value on all input columns that use the named function (since the error message does not specify which column it was that had the invalid value). This is in some ways better, anyway, because that way the invalid value is filtered out right away in all columns that could contain it, instead of potentially once for each column (if the value appears in more than one input column).
sql.py: add_index(): Fixed bug where expressions could not be converted to a string until their table name had been removed
sql_gen.py: Added Expr
sql.py: add_index(): Fixed bug where expressions needed to be enclosed in () to distinguish them from plain columns
sql.py: add_index(): Support simple expressions as well as columns
sql.py: Renamed index_col() to add_index() so its name isn't similar to index_cols()
sql_gen.py: FunctionCall: Removed repr() because it's a Code object and its to_str() does not take extra arguments
sql.py: run_query(): FunctionValueException: Expanded parsing to include regular function calls, not just relational functions' trigger functions. put_table(): Disabled FunctionValueException handling because this expands FunctionValueException beyond what put_table() could handle.
sql.py: put_table(): MissingCastException: Fixed bug where renaming of cast literal value was not properly propagated to the returned value of the function call, causing the query to assume that a DISTINCT ON column referred to column in one of the joined tables instead of a named column in the SELECT columns list. This logic error would have been very difficult to catch without inspecting the code!
sql_gen.py: Added wrap_in_func()
sql_gen.py: FunctionCall: Filter args through remove_col_rename() to remove any renamings from the function args
sql.py: put_table(): No handler for exception: Print full exception instead of just first line to assist in debugging
schemas/vegbien.sql, functions.sql: Removed _to* relational functions because type casting for those types is now automatic
mappings/DwC2-VegBIEN.specimens.csv: Removed _to* relational functions because type casting for those types is now automatic
schemas/functions.sql: Added cast functions for _to* relational functions
schemas/vegbien.sql: Changed cast functions' input types to text because type must match exactly, not just be implicitly castable
sql.py: run_query(): MissingCastException parsing: Support multiple-word types
sql.py: put_table(): Handle MissingCastExceptions by attempting to call a function with the name of the type on the column
sql_gen.py: Added Functions section with Function and FunctionCall
sql.py: Added MissingCastException and parse it in run_query()
schemas/vegbien.sql: Added cast functions for enum types which map invalid values to NULL
sql.py: put_table(): Fixed bug where some exceptions with no handler would not even allow insertion of no rows into the out_table (due to type mismatch issues), by creating an empty pkeys table as a special case
sql.py: put_table(): Preparing to insert new rows: Fixed bug where main_select needed to be generated after distinct_on was set in the if statement
sql.py: put_table(): log_exc(): Fixed bug where the exception strings rather than the exceptions themselves needed to be put in the set, because exceptions are not comparable with ==
sql.py: put_table(): Moved mk_main_select() call out of try block since it is not related to the exceptions that may be thrown
sql.py: put_table(): log_exc(): Check if exception already caught before to avoid infinite loops
Added debug2redmine and helper file debug2redmine.csv
sql.py, db_xml.py: Removed unnecessary calls to sql_gen.clean_name() now that str() handles this automatically
sql_gen.py: sql_gen classes inherit from new base class BasicObject, whose str() calls clean_name() on the object's repr(). Changed the main debug-repr producing method to be repr() instead of str().
Moved clean_name() from sql.py to sql_gen.py because it's DB-general and so that it can be used by sql_gen.py without circular dependencies
db_xml.py: into_table_name(): Handle hierarchical tables specially by including their rank in the into table. Interpret any table with a value column as a function, regardless of out_table name.
sql.py: put_table(): Log "Default value column does not exist in mapping" error with level 2.1 so that it doesn't appear in Redmine output
db_xml.py: put_table(): Pass next as sql.put_table()'s default param now that it is supported
sql.py: put_table(): Changed default param to be an output column because that is what would be passed in by db_xml.put_table(), and because there is already a mapping that resolves that to a flattened input column
sql.py: put_table(): Added default param for the value or input column to use as the pkey for missing rows
sql.py: put_table(): Use single quotes rather than double quotes around strings where possible
db_xml.py: Added internal next param used by simplifyPath. put_table(): Refactored to use outer parent_ids_loc var and modify that as needed rather than having to pass parent_ids_loc as a param to put_table_().
sql.py: put_table(): When calling strings.as_*table(), pass custom ustr that removes col renames and adds double quotes on plain strings
strings.py: as_*table(): Added ustr param to override the method (by default ustr()) used to convert each value to a string
sql_gen.py: MockDb.esc_value(): Use new strings.repr_no_u()
strings.py: Added repr_no_u()
sql.py: clean_name(): Also remove '`' (which is used by MySQL)
sql.py: esc_name_by_module(): Use new sql_gen.esc_name()
sql_gen.py: Added esc_name() and use it in MockDb.esc_name()
sql.py: next_version(): Use special chars in version part of name string for clarity
sql.py: mk_insert_select(): embeddable: function_name is first line of query for clarity, and to reduce length from including the column names. This also fixes the problem of double quotes around column names in the previous function_name.
sql.py: esc_name_by_module(): Double embedded quotes to escape them instead of removing them
sql.py: put_table(): Use "-" to separate temp table suffixes from into table name
db_xml.py: into_table_name(): Format relational functions' into table names as a function call on the value column, using special chars for readability
sql.py: run_query(): Exception parsing: Use "(.+?)" wherever possible to match names containing special chars
sql.py: clean_name(): For clarity, just remove '"'s, so that "."s are preserved and show the path structure of the input name
db_xml.py: put_table(): sql.put_table(): Name the into table ...literal instead of ...value if the value column is a literal value
bin/map: Logging: log(): Remove extra debug info from DB query messages and format level 1.5 (summary) messages as Redmine list items
sql.py: put_table(): Renamed temp_prefix param to into and allow it to be a sql_gen.Table object. Use into directly as the pkeys table, and make its default value be `out_table.name+'_pkeys'`.
db_xml.py: put_table(): Pass custom temp_prefix to sql.put_table() for relational funcs, so that their value param's input column name is included in the temp table name
sql.py: put_table(): Added optional param temp_prefix for the prefix of generated temp tables
sql.py: put_table(): Made debug messages more self-documenting
sql.py: put_table(): Changed "Setting missing rows' pkeys to NULL" to "Setting pkeys of missing rows to NULL" to avoid having single quote in debug output, which messes up text editor SQL syntax highlighting
sql.py: Parsed exceptions: Use strings.as_tt() to format Python values
strings.py: Split as_table() into as_table() and as_inline_table() depending on whether the table needs to be inlined in an ordered list item or not
strings.py: as_table(): Changed to use
formatting because Redmine tables can't be embedded in ordered lists without restarting the numbering
strings.py: as_table(): Fixed bug where table was not ended properly, by adding a space after the last \n and having rstrip() string only newlines
sql.py: mk_select(): Columns: Separate columns with newlines
sql.py: put_table(): Use new strings.as_table() to format mappings as tables
strings.py: Added as_tt() and as_table()
bin/map: Logging: log(): Strip trailing newlines from msg
strings.py: as_code(): Added multiline param to disable multiline formatted output
sql.py: put_table(): "Ignoring existing rows, comparing on" debug message: Wrap the mapping in strings.as_code() so it will have Redmine syntax-highlighting
sql.py: put_table(): "Putting columns" debug message: Wrap the mapping in strings.as_code() so it will have Redmine syntax-highlighting
sql.py: DbConn.run_query(): Query debug message: Wrap the query in strings.as_code() so it will have Redmine syntax-highlighting
strings.py: Added as_code()
sql.py: DbConn.run_query(): Prepend "DB query" before the query debug message so it can be identified as a DB query
db_xml.py: put_table(): Subset in_table: Document that in_table will be shadowed (hidden) by the created temp table, rather than versioned, now that the table is (almost) always created as a temp table
sql.py: Create temp items as permanent in autocommit mode rather than in debug mode so that temp items are only permanent if actually committing result. This ensures that the generated SQL in test mode matches what would actually get run in regular commit mode, and the SQL is only altered to make the temp items visible if actually debugging (autocommit mode).
sql.py, sql_gen.py: Reformatted generated SQL for presentability by adding newlines
sql.py: DbConn.run_query(): Put a newline before the query in the debug message so that multiline queries have all rows at the left edge rather than the first row prefixed by other text
sql.py: DbConn.run_query(): Don't put generated query debug message all on one line, so that embedded newlines are preserved
sql.py: Fixed bug where queries with versioned identifiers which threw an exception (not related to name collisions) were being output with a too-high log_level, because all exceptions were output with the higher exc_log_level, by making the following changes: DbConn.run_query(): Changed exc_log_level param to log_ignore_excs param so that only certain exceptions would cause the query to be output with a higher log_level. Moved the code that actual emits the query debug message from DbConn.run_query() to module-level run_query() so it would apply the log_ignore_excs filter after the exception had already been parsed into specific types.
Moved "Putting columns" debug message from db_xml.py put_table() to sql.py put_table() to put it in the same place as the other debug messages
sql_gen.py: Added remove_col_rename() and use it where `if isinstance(value, NamedCol): value = value.code` was used
sql_gen.py: CompareCond.to_str(): If left_value has been renamed as a NamedCol, unwrap it
sql_gen.py: Join.to_str(): Fixed bug where USING should be used if all columns are join_same_not_null, rather than join_same, because USING uses plain = for comparison. sql.py: put_table(): input_joins now can use sql_gen.join_same_not_null in order to use USING syntax.
db_xml.py: put_table(): Output debug messages with a level of 1.5 to match sql.put_table()'s level for summary messages
bin/map: Fixed bug where verbosity needed to be 1 outside of test mode so that profiling and errors stats would be printed at end of import. Verbosity defaults to 0.5 rather than 1 in test mode so profiling and errors stats do not clutter up the test output when running automated tests.
bin/map: Only display verbose_errors in test mode, but with any nonzero verbosity. They should not be displayed outside of test mode because verbose errors make the log files huge.
bin/map: Renamed verbose param to verbosity because it's now a number, not a boolean
bin/map: Removed no longer used debug param (verbose=2 is used instead)
bin/map: Fixed bug where verbose_errors' default value depended on debug var, which was not yet set. Removed verbose_errors param and instead turn verbose_errors on whenever verbosity >= 1. Verbosity defaults to 1 in test mode.
bin/map: Logging: Don't set sql.run_raw_query.debug, because it is not used anymore (sql.connect(log_debug=...) is used instead)
bin/map: Logging: Print debug messages (level > 1) prefixed with their level, to distinguish higher- and lower-level debug messages