/ - Changes - BIEN 3 - NCEAS Projects

root @ 2206

#	Date	Author	Comment
2206	05/17/2012 10:48 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Do inserts and selects in a loop so that it will keep retrying the operation with additional constraints until it succeeds
2205	05/15/2012 03:56 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): mk_select_(): Fixed bug where order_by needed to be None because otherwise it wouldn't match the distinct_on cols if they were specified
2204	05/15/2012 03:55 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): insert_(): Fixed bug where distinct_on was not passed to mk_select_()
2203	05/15/2012 03:30 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): mk_select_(): Fixed bug where distinct_on needed to be passed as a keyword param
2202	05/15/2012 03:21 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): insert_() and mk_select_() take distinct_on param
2201	05/15/2012 03:10 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Factored out code that inserts into pkeys table into run_query_into_pkeys() helper function
2200	05/15/2012 02:55 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Implemented DISTINCT ON according to the distinct_on param
2199	05/15/2012 02:48 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Added distinct_on param to set the columns to SELECT DISTINCT ON
2198	05/15/2012 02:31 PM	Aaron Marcuse-Kubitza	sql.py: clean_name(): Convert names to lowercase so that PostgreSQL will behave the same whether the name is escaped with "" or not. This will help avoid bugs in code that uses temp tables created by the sql module.
2197	05/15/2012 02:29 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Added order_by=None wherever rows were not supposed to be re-ordered. On DuplicateKeyException: Save existing pkeys in temp table for joining on.
2196	05/15/2012 01:31 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Pass limit and start to sql.put_table()
2195	05/15/2012 01:09 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Added limit and start options
2194	05/15/2012 11:54 AM	Aaron Marcuse-Kubitza	sql.py: When creating a temporary entity (table, function, etc.), instead create it as a permanent entity in debug mode so it can be viewed after the program is run
2193	05/15/2012 11:40 AM	Aaron Marcuse-Kubitza	sql.py: DbConn: Store whether in debug mode (log_debug != log_debug_none) for easy use by methods
2192	05/15/2012 11:31 AM	Aaron Marcuse-Kubitza	bin/map: connect_db(): Turn on autocommit mode in debug mode if commit is on, so that incremental results can be seen in the DB
2191	05/15/2012 11:30 AM	Aaron Marcuse-Kubitza	sql.py: DbConn: Use internal autocommit handling instead of DB connection autocommit attr to avoid autocommits inside a savepoint
2190	05/15/2012 11:15 AM	Aaron Marcuse-Kubitza	sql.py: DbConn: Added autocommit option to turn on autocommit mode. Use set_session() instead of SQL command to set isolation level.
2189	05/14/2012 05:50 PM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): embeddable: Fixed bug where the function may do different things when run, because the function (and other statements whose cached strings depend on the function name) may be run after the function definition would have changed, by versioning the function name and using CREATE FUNCTION instead of CREATE OR REPLACE FUNCTION so that its definition never changes
2188	05/14/2012 05:28 PM	Aaron Marcuse-Kubitza	sql.py: Parse "function already exists" errors as DuplicateFunctionException
2187	05/14/2012 05:13 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): joins: Fixed bug where join_not_equal did not do what it was designed for, which is filtering out matches of the join condition (before the bug fix, it effectively did a cross join with matching rows excluded, causing duplication of rows). Renamed join_not_equal to filter_out to reflect its intended use. Support table-scoped column names in the WHERE conds list.
2186	05/14/2012 04:22 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Fixed bug where ORDER BY column needed to have table0 name prefixed (if it didn't already have a table name), to avoid ambiguous column references
2185	05/14/2012 04:11 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Fixed bug in joins where right_col had the table name prepended before it was copied for use with a different table name in join_using and join_not_equal
2184	05/14/2012 03:42 PM	Aaron Marcuse-Kubitza	Mapped some unmapped fields in DwC inputs
2183	05/14/2012 02:19 PM	Aaron Marcuse-Kubitza	Added mappings/for_review/DwC2-VegBIEN.specimens.fields.csv
2182	05/14/2012 01:21 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Fixed bug where it didn't commit right after inserting the node, but instead waited until children with fkeys to parent (independent of the node itself) were inserted
2181	05/14/2012 01:16 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): insert_(): Use insert_select() instead of run_query_into() if new option pkeys_table_exists is on
2180	05/14/2012 12:51 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Support joins with !=
2179	05/14/2012 12:45 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Support only some join columns being join_using
2178	05/14/2012 12:40 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Renamed in_joins to insert_joins and joins to select_joins for clarity
2177	05/14/2012 12:21 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Support children with fkeys to parent
2176	05/14/2012 12:11 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Make tuple optional for None literal values
2175	05/13/2012 02:05 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Removed "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings
2174	05/13/2012 02:02 PM	Aaron Marcuse-Kubitza	bin/map: by_col: row_ct = 0 because it's unknown for now
2173	05/13/2012 02:00 PM	Aaron Marcuse-Kubitza	mk_select(): Support join conditions with literal values
2172	05/13/2012 01:42 PM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): embeddable: Don't cache function_query because function def could change and then change back
2171	05/13/2012 01:35 PM	Aaron Marcuse-Kubitza	sql.py: with_savepoint(): Renamed savepoints to have "level" prefix, since the # indicates the level #
2170	05/13/2012 01:32 PM	Aaron Marcuse-Kubitza	sql.py: get_cur_query(): Also accept input params to combine with input_query, and pass input params when get_cur_query() is called
2169	05/13/2012 01:26 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.run_query(): Pass input query to get_cur_query()
2168	05/13/2012 01:19 PM	Aaron Marcuse-Kubitza	sql.py: get_cur_query() and _add_cursor_info(): Support input_query param that will be used if the raw query is None. Pass input_query in DbConn.execute().
2167	05/13/2012 01:09 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.run_query(): Check that query != None
2166	05/13/2012 01:05 PM	Aaron Marcuse-Kubitza	bin/map: out_is_db: Only rollback() and close() out_db if it was connected
2165	05/13/2012 01:04 PM	Aaron Marcuse-Kubitza	sql.py: DbConn: Added connected()
2164	05/13/2012 01:01 PM	Aaron Marcuse-Kubitza	sql.py: Wrapped calls to get_cur_query() that are used as strings in str(), because get_cur_query() can return None
2163	05/13/2012 12:57 PM	Aaron Marcuse-Kubitza	sql.py: next_version(): Versions start from 1, because first existing name was version 0
2162	05/13/2012 12:55 PM	Aaron Marcuse-Kubitza	put_table(): Use short name for temp_suffix now that version # will be added if needed
2161	05/13/2012 12:51 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Parse join columns for literal values and table-scoped names as well
2160	05/13/2012 11:54 AM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: establishmentMeans: Call _toGrowthform on growthform
2159	05/13/2012 11:53 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added _toGrowthform
2158	05/13/2012 11:19 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Changed temp_prefix to a suffix so main name won't be removed if name is truncated
2157	05/13/2012 11:14 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): fields: Support columns with tables. Changed syntax for literal values so that it wouldn't conflict with new syntax for columns with tables.
2156	05/13/2012 11:08 AM	Aaron Marcuse-Kubitza	iters.py: flatten(): If not an iterable, just return the value
2155	05/13/2012 10:32 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Pass in_pkeys and out_pkeys to run_query_into() by ref so they will be updated if the table names are changed
2154	05/13/2012 10:28 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Pass pkeys to run_query_into() by ref so it will be updated if the table name is changed
2153	05/13/2012 10:15 AM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): If CREATE TABLE AS generates a DuplicateTableException, rename the table with a version # prepended
2152	05/13/2012 10:08 AM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Made into param a reference so that the function can change it, and renamed it to into_ref
2151	05/13/2012 09:36 AM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Made into param a reference so that the function can change it, and renamed it to into_ref
2150	05/13/2012 09:11 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): If DuplicateKeyException: run_query_into() recoverably, so that DB errors such as DuplicateTableException will be parsed
2149	05/13/2012 09:07 AM	Aaron Marcuse-Kubitza	sql.py: Removed no-longer-needed try_insert()
2148	05/13/2012 09:05 AM	Aaron Marcuse-Kubitza	sql.py: Merged with_parsed_errors() into run_query() so all recoverable queries would automatically benefit from DB error message parsing. DbConn: Moved _add_cursor_info() to DbCursor.execute().
2147	05/13/2012 07:45 AM	Aaron Marcuse-Kubitza	sql.py: with_parsed_errors(): Raise DuplicateTableException for "relation already exists" errors instead of "table name specified more than once" errors
2146	05/13/2012 07:43 AM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Removed "DROP TABLE IF EXISTS" because sometimes when there are collisions in the temp table names, the code actually uses both "copies" of the temp table. Eventually, this situation will be resolved by adding a counter to the temp table name.
2145	05/13/2012 07:26 AM	Aaron Marcuse-Kubitza	sql.py: Cleaned up DbException's and subclasses' messages
2144	05/13/2012 07:26 AM	Aaron Marcuse-Kubitza	exc.py: ExceptionWithCause: Added cause_newline option to put the cause on its own line instead of on the message line
2143	05/13/2012 07:10 AM	Aaron Marcuse-Kubitza	sql.py: with_parsed_errors(): Also parse "table name specified more than once" errors as DuplicateTableExceptions
2142	05/13/2012 06:56 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Handle DuplicateKeyExceptions by running a select query on the unique constraint columns
2141	05/13/2012 06:14 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Support tuples of tables, not just lists
2140	05/13/2012 05:29 AM	Aaron Marcuse-Kubitza	sql.py: with_parsed_errors(): Support table names that start with "_"
2139	05/13/2012 05:20 AM	Aaron Marcuse-Kubitza	sql.py: DbConn: Added with_savepoint(). with_savepoint(): Use new DbConn.with_savepoint().
2138	05/13/2012 04:13 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _toBool
2137	05/13/2012 04:12 AM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: establishmentMeans: Use _toBool on iscultivated, isnative
2136	05/13/2012 04:11 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _toBool
2135	05/13/2012 04:01 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Made trigger functions IMMUTABLE since they do not modify other tables
2134	05/13/2012 03:51 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Added support for putting just a window subset of the rows in the table. Removed "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings.
2133	05/13/2012 03:30 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Return the column where the pkeys are made available (the out_pkey) instead of taking it as an argument
2132	05/13/2012 03:20 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Get input pkeys corresponding to rows in insert and join together out_pkeys and in_pkeys into final pkeys table
2131	05/13/2012 01:04 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Fully support multiple in_tables, joined together using the main input table's pkey
2130	05/13/2012 01:02 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): joins: Fixed bug where USING-based joins did not have closing ")"
2129	05/13/2012 12:28 AM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Fixed bug where in_table was last in in_tables instead of first, causing it to be ignored by the current put_table() implementation, which only considers the first table name
2128	05/13/2012 12:17 AM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Fixed bug where pkeys_table returned by recursive call to put_table() needed to be prefixed with $ to be treated as an input column name rather than a literal value
2127	05/09/2012 05:29 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Support joins with USING, which can be used to merge multiple input cols into the same output col
2126	05/09/2012 04:42 AM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): embeddable: Fixed bug where query that uses function was being sorted by its first column (the default mk_select() setting), when it should be left in its original order
2125	05/09/2012 04:36 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Take a dict mapping out to in cols instead of separate in and out cols lists
2124	05/09/2012 04:08 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Joins: Reversed order of left_col and right_col in the joins dict as well, so the joined table's columns are the keys
2123	05/09/2012 04:05 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Joins: Reversed order of left_col and right_col so the column of the table being joined is first, to match the form of a WHERE clause
2122	05/09/2012 03:56 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Support joins
2121	05/09/2012 03:27 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Accept a list of tables to join together (initial implementation just uses the first table)
2120	05/09/2012 02:26 AM	Aaron Marcuse-Kubitza	sql.py: mk_select(): Support ORDER BY clause. By default, order by the pkey, since PostgreSQL apparently doesn't do this automatically (and this was causing some staging table tests to fail).
2119	05/09/2012 02:04 AM	Aaron Marcuse-Kubitza	bin/map: In debug mode, print the row # and input row just like in error messages
2118	05/09/2012 01:51 AM	Aaron Marcuse-Kubitza	bin/map: verbose_errors also defaults to on in debug mode
2117	05/09/2012 01:39 AM	Aaron Marcuse-Kubitza	sql.py: add_row_num(): Make the row number column the primary key
2116	05/09/2012 12:36 AM	Aaron Marcuse-Kubitza	csv2db: Use new sql.cleanup_table() to map NULL-equivalents to NULL. Consider the empty string to be NULL.
2115	05/09/2012 12:35 AM	Aaron Marcuse-Kubitza	sql.py: Added cleanup_table()
2114	05/09/2012 12:33 AM	Aaron Marcuse-Kubitza	csvs.py: Added row filters
2113	05/07/2012 11:14 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Fixed bug where relational functions were not being treated as value nodes, and thus their containing child was treated as a child with a backwards pointer instead of a field
2112	05/07/2012 11:12 PM	Aaron Marcuse-Kubitza	xml_func.py: Added is_func() and is_xml_func() and use them where their definitions were used
2111	05/07/2012 10:40 PM	Aaron Marcuse-Kubitza	db_xml.py: Added value() and use it where xml_dom.first_elem() was used
2110	05/07/2012 10:12 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Latitude/Longitude: Moved _toDouble directly after the output col name, so that it's run after any translation functions (which all return strings). *ElevationInMeters: Added _toDouble around all output cols.
2109	05/07/2012 09:56 PM	Aaron Marcuse-Kubitza	xpath.py: get(): Create attrs: Fixed bug where attrs were created with last_only on, which caused attrs to get created multiple times if there were multiple attrs of the same name but different values, because the last_only optimization would only check the last attr of that name
2108	05/07/2012 09:19 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Latitude/Longitude: Use new _toDouble to convert strings to doubles (needed for by_col)
2107	05/07/2012 09:16 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _toDouble
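
Several entries above (r2153, r2158, r2163) describe retrying CREATE TABLE AS under a versioned name when the table already exists. The following is a minimal sketch of that idea, not the code from sql.py: it uses Python's built-in sqlite3 module instead of PostgreSQL so that it runs standalone, and the "_v<N>" suffix format, the create_table_as() helper and the max_tries parameter are assumptions made for illustration. Only the name next_version() and the rule that versions start from 1 come from the log.

```python
import re
import sqlite3


def next_version(name):
    """Return name with its version suffix incremented.

    An unversioned name counts as version 0, so its successor is version 1.
    The "_v<N>" suffix format is an assumption, not taken from sql.py.
    """
    match = re.match(r'^(.*)_v(\d+)$', name)
    if match:
        base, version = match.group(1), int(match.group(2)) + 1
    else:
        base, version = name, 1
    return '%s_v%d' % (base, version)


def create_table_as(conn, name, select_sql, max_tries=100):
    """Run CREATE TABLE ... AS, moving to the next versioned name on a collision.

    Hypothetical helper. Returns the table name that was actually used, so the
    caller can keep referring to the right table.
    """
    for _ in range(max_tries):
        try:
            conn.execute('CREATE TABLE "%s" AS %s' % (name, select_sql))
            return name
        except sqlite3.OperationalError as e:
            # SQLite reports a name collision as "table ... already exists";
            # any other error is re-raised unchanged.
            if 'already exists' not in str(e):
                raise
            name = next_version(name)
    raise RuntimeError('no free table name after %d tries' % max_tries)


conn = sqlite3.connect(':memory:')
first = create_table_as(conn, 'pkeys', 'SELECT 1 AS id')
second = create_table_as(conn, 'pkeys', 'SELECT 2 AS id')
third = create_table_as(conn, 'pkeys', 'SELECT 3 AS id')
print(first, second, third)
```

Returning the final name plays the role of the by-reference into_ref parameter mentioned in r2151 through r2155: the caller learns which name the table ended up with.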
