# Date Author Comment
3573 07/24/2012 05:46 AM Aaron Marcuse-Kubitza

src_map: Fixed bug where non-header rows needed to be materialized with empty fields for each column in the header

3572 07/24/2012 04:27 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: Via maps cleanup: Match maps/$(via).%.csv with pattern instead of $(viaMaps) var so that a non-existing via map will have the recipe run, too. When auto-creating via maps is later added, this will be required.

3571 07/24/2012 04:07 AM Aaron Marcuse-Kubitza

inputs/*/maps/src.*.csv: Regenerated using new src_map output format

3570 07/24/2012 04:06 AM Aaron Marcuse-Kubitza

parallelproc.py: MultiProducerPool: Removed warning if not using parallel processing because this also gets generated when it's explicitly turned off, which is currently the case and clutters up stderr when testing

3569 07/24/2012 03:57 AM Aaron Marcuse-Kubitza

src_map: Also add columns for the output mappings and comments, so that the src map can be directly copied for use as the via map (DwC.specimens.csv, etc.). The output mapping column name must be provided by the caller, which input.Makefile maps/src.%.csv provides using the new mappings roots.

3568 07/24/2012 03:52 AM Aaron Marcuse-Kubitza

Added mappings/roots for use in creating src maps

3567 07/24/2012 03:41 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: maps/src.%.csv: Clean up by passing through `$(bin)/cols '*'` whenever it's changed. This ensures that the CSV dialect is always consistently Python's Excel dialect. (Note that this dialect actually uses \r\n as the line ending. The \n line endings were from src maps generated by a previous version of bin/src_map.)
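The dialect behavior noted above can be checked with a minimal sketch. It is not the actual `$(bin)/cols` script; `normalize_csv` is a hypothetical stand-in, written for Python 3, that re-emits CSV text in Python's Excel dialect so that line endings become \r\n and only fields that need quoting stay quoted.

```python
import csv
import io


def normalize_csv(text):
    """Re-emit CSV text using Python's built-in 'excel' dialect.

    Hypothetical helper for illustration only; the real cleanup step in
    input.Makefile pipes the file through $(bin)/cols instead.
    """
    # newline='' disables newline translation on both the read and write side,
    # so the csv module alone decides which line terminator is written.
    rows = csv.reader(io.StringIO(text, newline=''))
    out = io.StringIO(newline='')
    writer = csv.writer(out, dialect='excel')
    writer.writerows(rows)
    return out.getvalue()


if __name__ == '__main__':
    # repr() makes the \r\n line endings visible
    print(repr(normalize_csv('a,b\n"c",d\n')))
```

Because the Excel dialect uses minimal quoting, quotes that a spreadsheet program placed around plain fields are dropped on output, while a field containing a comma keeps its quotes.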

3566 07/24/2012 03:28 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: maps/$(via).%.full.csv: Removed alternate rule when $(srcMap) doesn't exist, because this effect is actually achieved by the no-prereqs rule for maps/src.%.csv, which causes make to think it exists when matching pattern rules even if its recipe doesn't actually create it

3565 07/24/2012 03:23 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: maps/$(via).%.full.csv: Added alternate rule when $(srcMap) doesn't exist

3564 07/24/2012 03:21 AM Aaron Marcuse-Kubitza

inputs/CTFS/maps/: Removed unneeded src.organisms.csv since input.Makefile has a way to deal with it not existing

3563 07/24/2012 03:18 AM Aaron Marcuse-Kubitza

inputs/CTFS/maps/: Removed unneeded .VegX.plots.csv.last_cleanup

3562 07/24/2012 02:13 AM Aaron Marcuse-Kubitza

inputs/*/maps/src.*.csv: Standardized line endings to \n

3561 07/24/2012 01:56 AM Aaron Marcuse-Kubitza

input.Makefile: Maps building: maps/$(via).%.full.csv: Added the src map as a prerequisite so it would be rebuilt when the src map changes. This is possible now that every datasource has at least an empty src map. (An empty src map is now treated the same way as a non-existing one.)

3560 07/24/2012 01:52 AM Aaron Marcuse-Kubitza

inputs/*/maps/src.*.csv: Removed extraneous quotes around fields, which are added by Excel but not by Python

3559 07/24/2012 01:49 AM Aaron Marcuse-Kubitza

inputs/*/maps/src.*.csv: Removed extraneous quotes around fields, which are added by Excel but not by Python

3558 07/24/2012 01:41 AM Aaron Marcuse-Kubitza

inputs/CTFS: Added empty maps/src.organisms.csv so that every table of every datasource has a src map

3557 07/24/2012 12:18 AM Aaron Marcuse-Kubitza

README.TXT: Datasource setup: Documented how to populate the src/ subdir with input data

3556 07/23/2012 10:52 PM Aaron Marcuse-Kubitza

Added inputs/CVS/

3555 07/23/2012 10:28 PM Aaron Marcuse-Kubitza

sql_gen.py: plpythonu_error_handler: Translate specific Python exception types to PostgreSQL error codes (ValueError -> data_exception) instead of assuming everything is a data_exception. When removing the PL/Python prefix, preserve the Python exception class in a DETAIL message. Support non-PL/Python internal_errors by re-raising them.
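The translation described above could be sketched as follows. This is plain Python for illustration, not the PL/pgSQL that sql_gen.py generates. The message layout `PL/Python: <ExceptionClass>: <message>`, the names `translate_plpython_error` and `EXC_TO_CONDITION`, and the fallback to `internal_error` for unmapped classes are all assumptions; only the ValueError -> data_exception mapping comes from the log entry.

```python
import re

# Assumed layout of a PL/Python error message: "PL/Python: <Class>: <message>"
PLPYTHON_MSG = re.compile(r'^PL/Python: (\w+): (.*)$', re.DOTALL)

# Only the ValueError entry is taken from the revision log.
EXC_TO_CONDITION = {'ValueError': 'data_exception'}


def translate_plpython_error(sqlerrm):
    """Return (condition, message, detail), or None if not a PL/Python error.

    A None result means the caller should re-raise the original error
    unchanged, which corresponds to the non-PL/Python internal_error case.
    """
    match = PLPYTHON_MSG.match(sqlerrm)
    if match is None:
        return None
    exc_class, message = match.groups()
    # Assumption: unmapped exception classes remain internal_errors.
    condition = EXC_TO_CONDITION.get(exc_class, 'internal_error')
    # The Python exception class is preserved in a DETAIL-style string.
    detail = 'Python exception class: ' + exc_class
    return condition, message, detail
```

Keeping the mapping in one table means further exception classes can be assigned their own error codes later without touching the parsing logic.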

3554 07/23/2012 10:25 PM Aaron Marcuse-Kubitza

sql_gen.py: Added reraise_exc

3553 07/23/2012 10:21 PM Aaron Marcuse-Kubitza

schemas/py_functions.sql: _date(): Raise (or pass through) ValueErrors directly instead of wrapping them in FormatExceptions, to simplify the code. This will also enable later translation of ValueErrors to data_exceptions. When year is required and missing, output a parsable 'null value in column year violates not-null constraint' error.

3552 07/23/2012 09:48 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): log_exc(): Handle infinite loops from repeated exceptions by removing all rows, instead of just aborting with a failed assertion

3551 07/23/2012 09:36 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Fixed bug where special case for unrecoverable errors needed to avoid creating an empty output pkeys table because function mode defines the returned pkeys table separately

3550 07/23/2012 09:08 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Factored defining the error handling wrapper function out of the main loop because it only needs to run once. Don't log "Trying to insert new rows" in function mode because it's inaccurate.

3549 07/23/2012 07:14 PM Aaron Marcuse-Kubitza

sql_gen.py: Exceptions: Added suppress_exc and use it in ExcHandler.to_str()

3548 07/23/2012 06:53 PM Aaron Marcuse-Kubitza

README.TXT: Backups: After a new import: Added step to delete previous imports so they won't bloat the full DB backup. (Note that these imports have already been backed up, and only the most recent import needs to be live in the DB.)

3547 07/23/2012 06:48 PM Aaron Marcuse-Kubitza

README.TXT: Backups: Documented what to do after a new import

3546 07/23/2012 06:39 PM Aaron Marcuse-Kubitza

backups/Makefile: Full DB: Added vegbien.backup/all to run both test and rotate

3545 07/23/2012 06:24 PM Aaron Marcuse-Kubitza

README.TXT: Renamed Maintenance section to Backups for clarity

3544 07/23/2012 06:19 PM Aaron Marcuse-Kubitza

backups/Makefile: .sql: When testing, turn it off so make won't skip `.sql: %` in favor of it

3543 07/23/2012 06:07 PM Aaron Marcuse-Kubitza

backups/Makefile: Split %.backup and %.sql into separate targets for clarity

3542 07/23/2012 05:56 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. Note that this import adds data provider feedback for SQL functions as well as additional date processing using _date().

3541 07/20/2012 07:10 AM Aaron Marcuse-Kubitza

schemas/py_functions.sql: _date(): Re-enabled now that exceptions thrown are properly handled. FormatException: Support raising parsable data_exceptions when provided with the value that was invalid. Date parsing mode: Return date as the value in FormatException so it can be filtered out automatically by column-based import.

3540 07/20/2012 07:06 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Creating error handling wrapper function: Fixed bug where needed to cast NULL returned in error handler to appropriate type, because it's contained within a SELECT query which does not do implicit casts from type unknown

3539 07/20/2012 07:03 AM Aaron Marcuse-Kubitza

sql_gen.py: Cast: Support types which are Code objects

3538 07/20/2012 06:05 AM Aaron Marcuse-Kubitza

sql_io.py: func_wrapper_exception_handler(): Use new sql_gen.merge_not_null() to try to ensure that NULL values are not folded (which would cause the concatenated values not to match up with the concatenated column names). Note that this adds a dependency on the db object, which callers must now provide.

3537 07/20/2012 06:03 AM Aaron Marcuse-Kubitza

sql_gen.py: Added merge_not_null()

3536 07/20/2012 06:03 AM Aaron Marcuse-Kubitza

sql_gen.py: Added try_mk_not_null()

3535 07/20/2012 05:54 AM Aaron Marcuse-Kubitza

sql_gen.py: Renamed ArrayJoin to ArrayMerge to avoid confusion with Join (a SQL construct)

3534 07/20/2012 05:46 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Creating error handling wrapper function: Set srcs on row_var so that the column type and nullability info of row_var's columns can be retrieved for use with sql_gen.ensure_not_null()

3533 07/20/2012 05:38 AM Aaron Marcuse-Kubitza

sql_gen.py: RowExcIgnore.to_str(): Compare self.row_var to global const row_var using == to allow caller to provide a copy of row_var with the underlying table set appropriately

3532 07/20/2012 05:35 AM Aaron Marcuse-Kubitza

sql_gen.py: underlying_table(): Support derived tables and row vars by obtaining the underlying table from the srcs

3531 07/20/2012 05:25 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): Setting pkeys of missing rows: Fixed bug where also needed to do this when is_function if an empty pkeys table was created (due to an error that could not be localized to a row)

3530 07/20/2012 05:16 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): After main loop: If is_literals, return immediately to avoid needing to test for is_literals in all the code that follows (which only applies to the normal case)

3529 07/20/2012 04:43 AM Aaron Marcuse-Kubitza

sql_gen.py: RowExcIgnore: If a custom row_var is used, require it to already be defined. This also allows sql_io.ExcToErrorsTable to place the column var definition in the outer DECLARE, eliminating the extra DECLARE block.

3528 07/20/2012 04:30 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Creating error handling wrapper function: Use new sql_gen.row_var

3527 07/20/2012 04:28 AM Aaron Marcuse-Kubitza

sql_gen.py: RowExcIgnore: Created global constant for default row_var for callers to use

3526 07/20/2012 04:24 AM Aaron Marcuse-Kubitza

sql_gen.py: RowExcIgnore.to_str(): Moved SQL comment explaining the use of an EXCEPTION block for each individual row to Python code to avoid cluttering the logged SQL code

3525 07/20/2012 04:19 AM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Creating error handling wrapper function: Handle errors using new func_wrapper_exception_handler(), which saves any data_exceptions in the errors table in addition to handling PL/Python errors

3524 07/20/2012 04:13 AM Aaron Marcuse-Kubitza

sql_io.py: Added func_wrapper_exception_handler()

3523 07/20/2012 04:10 AM Aaron Marcuse-Kubitza

sql_gen.py: Added ArrayJoin

3522 07/20/2012 04:10 AM Aaron Marcuse-Kubitza

sql_gen.py: Added Array and to_Array()

3521 07/20/2012 02:47 AM Aaron Marcuse-Kubitza

sql_gen.py: Added List and inherit from it in Tuple

3520 07/20/2012 02:45 AM Aaron Marcuse-Kubitza

sql_gen.py: Renamed Tuple to Row and List to Tuple to more accurately reflect the datatype generated by each class (a Tuple being merely a grouping of values)

3519 07/20/2012 02:43 AM Aaron Marcuse-Kubitza

sql_gen.py: Moved Composite types to Literal values section as a subsection, since Composite types was really about just the input syntaxes for these types

3518 07/20/2012 02:32 AM Aaron Marcuse-Kubitza

sql_gen.py: Replaced srcs_str() with cross_join_srcs() which more correctly combines the srcs of each column using a Cartesian product. Eventually, the entire tree of srcs will need to be preserved instead of flattened in order to properly attribute errors to a specific column or set of columns.
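As a rough illustration of combining per-column srcs with a Cartesian product, here is a minimal sketch. It is not the sql_gen.py implementation: `cross_join_srcs_sketch` is a hypothetical name, srcs are assumed to be plain strings, and joining each combination with "," is an assumption carried over from the srcs_str() entry.

```python
import itertools


def cross_join_srcs_sketch(cols_srcs):
    """Combine the srcs of several columns using a Cartesian product.

    cols_srcs is a list with one list of source names per column. Columns
    with no srcs are skipped so that no empty elements appear in the output.
    """
    non_empty = [srcs for srcs in cols_srcs if srcs]
    if not non_empty:
        return []
    # One output entry per combination of one src from each column
    return [','.join(combo) for combo in itertools.product(*non_empty)]


if __name__ == '__main__':
    print(cross_join_srcs_sketch([['a', 'b'], [], ['c']]))
```

Note that this flattens the result into strings, which is exactly the limitation the log entry mentions: attributing an error to a specific column would require keeping the tree of srcs instead.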

3517 07/20/2012 02:03 AM Aaron Marcuse-Kubitza

sql_gen.py: srcs_str(): Fixed bug where needed to filter out columns with no srcs so that there aren't empty elements in the ","-separated list

3516 07/20/2012 02:00 AM Aaron Marcuse-Kubitza

sql_gen.py: Added has_srcs()

3515 07/20/2012 01:44 AM Aaron Marcuse-Kubitza

sql_gen.py: Added NestedExcHandler

3514 07/20/2012 01:44 AM Aaron Marcuse-Kubitza

sql_gen.py: Added srcs_str()

3513 07/20/2012 01:43 AM Aaron Marcuse-Kubitza

sql_gen.py: as_Col(): Support non-Code, non-string inputs by making them Literals

3512 07/20/2012 01:42 AM Aaron Marcuse-Kubitza

sql_gen.py: Added is_col() and use it in is_table_col()

3511 07/19/2012 11:54 PM Aaron Marcuse-Kubitza

sql_io.py: ExcToErrorsTable: Require users to explicitly specify an expression for the value that caused the error, instead of assuming that a variable named "value" already exists. This allows a value expression to be computed only if needed for error handling.

3510 07/19/2012 11:22 PM Aaron Marcuse-Kubitza

sql_gen.py: Moved repr() from ExcHandler to BaseExcHandler

3509 07/19/2012 11:21 PM Aaron Marcuse-Kubitza

sql_gen.py: Added BaseExcHandler and inherit from it in ExcHandlers

3508 07/19/2012 10:58 PM Aaron Marcuse-Kubitza

sql_io.py: cast(): Determining if will be saving errors: Don't add extra check if isinstance(col, sql_gen.Col) because the special case for sql_gen.Literal handles supported non-columns

3507 07/19/2012 10:56 PM Aaron Marcuse-Kubitza

sql_io.py: data_exception_handler(): Removed no longer needed db param

3506 07/19/2012 10:47 PM Aaron Marcuse-Kubitza

sql_io.py: Added ExcToErrorsTable, which separates out the errors table inserting code from the exception handling code. data_exception_handler(): Refactored to use new sql_gen.data_exception_handler() and ExcToErrorsTable.

3505 07/19/2012 10:43 PM Aaron Marcuse-Kubitza

sql_gen.py: Added data_exception_handler

3504 07/19/2012 10:08 PM Aaron Marcuse-Kubitza

sql_io.py: data_exception_handler(): Refactored to use new sql_gen.ExcToWarning when not using an errors table

3503 07/19/2012 10:03 PM Aaron Marcuse-Kubitza

sql_gen.py: Added ExcToWarning

3502 07/19/2012 10:02 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxondetermination: taxondetermination_taxonoccurrence_id_fkey(): Fixed bug where string containing a \-escape needed an "E" prefix

3501 07/19/2012 09:42 PM Aaron Marcuse-Kubitza

sql_io.py: data_exception_handler(): Require the caller to provide a statement to return a default value in case of error, rather than assuming the caller can accept a return value of NULL

3500 07/19/2012 09:27 PM Aaron Marcuse-Kubitza

sql_io.py: data_exception_handler(): Refactored to use new sql.define_func()

3499 07/19/2012 09:20 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Calling function on input rows: Convert PL/Python exceptions (internal_errors) to data_exceptions using sql_gen.plpythonu_error_handler and an error handling wrapper function

3498 07/19/2012 09:10 PM Aaron Marcuse-Kubitza

debug2redmine.csv: EXPLAIN comments: Fixed bug where needed to also match whitespace at beginning of line (indent)

3497 07/19/2012 09:07 PM Aaron Marcuse-Kubitza

Use sql_gen.ReturnQuery where RETURN QUERY was previously manually prepended

3496 07/19/2012 09:05 PM Aaron Marcuse-Kubitza

sql_gen.py: Added ReturnQuery

3495 07/19/2012 08:48 PM Aaron Marcuse-Kubitza

sql.py: define_func(): Fixed bug where next_version() needed to have module name removed since it's in the same module

3494 07/19/2012 08:47 PM Aaron Marcuse-Kubitza

sql.py: mk_select(): Added explain param to turn off automatically running EXPLAIN on the created query. This is useful for SELECT statements which use local variables in PL/pgSQL functions.

3493 07/19/2012 08:44 PM Aaron Marcuse-Kubitza

sql_gen.py: with_table(): Only set the table if the passed-in value is a Col or FunctionCall

3492 07/19/2012 08:41 PM Aaron Marcuse-Kubitza

sql_gen.py: Added Tuple

3491 07/19/2012 08:41 PM Aaron Marcuse-Kubitza

sql_gen.py: Added List and use it in Values.to_str()

3490 07/19/2012 08:14 PM Aaron Marcuse-Kubitza

sql.py: Added define_func()

3489 07/19/2012 07:07 PM Aaron Marcuse-Kubitza

Use sql_gen.SetOf where SETOF was previously manually prepended

3488 07/19/2012 07:06 PM Aaron Marcuse-Kubitza

sql_gen.py: Added SetOf

3487 07/19/2012 07:06 PM Aaron Marcuse-Kubitza

sql_gen.py: FunctionDef: Support return_types which are Code objects

3486 07/19/2012 06:55 PM Aaron Marcuse-Kubitza

Use sql_gen.ColType where %TYPE was previously manually appended

3485 07/19/2012 06:54 PM Aaron Marcuse-Kubitza

sql_gen.py: Added ColType

3484 07/19/2012 06:47 PM Aaron Marcuse-Kubitza

Use sql_gen.RowType where %ROWTYPE was previously manually appended

3483 07/19/2012 06:45 PM Aaron Marcuse-Kubitza

sql_gen.py: Added RowType

3482 07/19/2012 06:45 PM Aaron Marcuse-Kubitza

sql_gen.py: RowExcIgnore: Accept row types which are Code objects

3481 07/19/2012 06:42 PM Aaron Marcuse-Kubitza

sql_gen.py: TypedCol: Accept types which are Code objects

3480 07/19/2012 06:34 PM Aaron Marcuse-Kubitza

sql_io.py: data_exception_handler(): Documented that the invalid value must be in a local variable of type text

3479 07/19/2012 06:33 PM Aaron Marcuse-Kubitza

sql_io.py: data_exception_handler(): Documented that the invalid value must be in a local variable of type text

3478 07/19/2012 06:32 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Creating empty pkeys table so its row type can be used: Don't do this if is_literals because special error handling does not apply to that

3477 07/19/2012 06:13 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): is_function: Create empty pkeys table before calling function on all rows so its row type can later be used in an error handling wrapper function

3476 07/19/2012 05:33 PM Aaron Marcuse-Kubitza

input.Makefile: Staging tables: import/install-%: Run csv2db with a nice increment of +5 to avoid interfering with the user's other processes

3475 07/19/2012 05:28 PM Aaron Marcuse-Kubitza

root map: Run bin/map with a nice increment of +5 to avoid interfering with the user's other processes
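For a Python caller, the same niceness increment could be applied by prefixing the command with the standard `nice` utility. This is a sketch only; `with_nice` is a hypothetical helper and is not part of the project, which applies the increment in its makefiles.

```python
import subprocess


def with_nice(cmd, increment=5):
    """Return cmd prefixed so that it runs at a lower scheduling priority."""
    return ['nice', '-n', str(increment)] + list(cmd)


def run_niced(cmd, increment=5):
    """Run cmd under nice and return its exit status."""
    return subprocess.call(with_nice(cmd, increment))
```

For example, `with_nice(['bin/map'])` yields the argument list `['nice', '-n', '5', 'bin/map']`.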

3474 07/19/2012 05:24 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): Handle psycopg2.extensions.TransactionRollbackError by retrying the last query
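The retry behavior can be sketched generically as below. This is not the put_table() code: `run_with_retry` and its `max_tries` bound are assumptions added for illustration (the log entry does not say whether retries are limited), and the exception class is passed in as a parameter so the sketch does not depend on psycopg2 being installed.

```python
def run_with_retry(run_query, retry_on, max_tries=3):
    """Call run_query(), retrying when it raises the retry_on exception.

    run_query is a zero-argument callable that executes the last query.
    After max_tries failed attempts the exception is propagated.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return run_query()
        except retry_on:
            if attempt == max_tries:
                raise  # give up and let the caller see the error
```

With psycopg2 available, a caller would pass `psycopg2.extensions.TransactionRollbackError` as `retry_on`, since that class covers errors such as deadlocks and serialization failures, where re-running the same query may succeed.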