/ - Changes - BIEN 3 - NCEAS Projects

root @ 2433

#	Date	Author	Comment
2433	05/25/2012 03:21 PM	Aaron Marcuse-Kubitza	xml_func.py: strip(): Added preserve param for XML functions not to remove
2432	05/25/2012 02:49 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Handle forward pointers in translation-to-sql_gen step instead of in XML-tree-parsing step, so that special handling for structural XML functions can use the parsed tree before any sql.put_table() processing takes place
2431	05/25/2012 02:44 PM	Aaron Marcuse-Kubitza	xml_dom.py: Added is_node()
2430	05/25/2012 02:22 PM	Aaron Marcuse-Kubitza	sql.py: table_row_count(): Pass start=0 to mk_select() to avoid "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings
2429	05/25/2012 02:12 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Handle unknown exceptions by returning NULL for all rows. Refactored "Missing mapping for NOT NULL column" handling to use new helper function remove_all_rows().
2428	05/25/2012 01:54 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Assert that insert_out_pkeys and insert_in_pkeys have same row count. Assert that pkeys and in_table have same row count.
2427	05/25/2012 12:57 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Use new sql.table_row_count()
2426	05/25/2012 12:56 PM	Aaron Marcuse-Kubitza	sql.py: Added table_row_count()
2425	05/25/2012 12:52 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Use new sql_gen.row_count
2424	05/25/2012 12:47 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added row_count
2423	05/25/2012 12:41 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Count # rows and update in_row_ct_ref once all columns have been processed. Don't pass in_row_ct_ref to recursive calls because it should only be increased once.
2422	05/25/2012 12:28 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Added in_row_ct_ref param to store the # of input rows processed. Renamed row_ct_ref param to row_ins_ct_ref to distinguish it from new in_row_ct_ref param.
2421	05/24/2012 09:26 PM	Aaron Marcuse-Kubitza	sql_gen.py: MockDb.esc_name(): Don't use sql.esc_name_by_module() to avoid circular dependency on sql module
2420	05/24/2012 09:20 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Factored out mk_select() calls in calls to run_query_into_pkeys() into new helper function insert_into_pkeys()
2419	05/24/2012 09:09 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): run_query_into_pkeys() calls use order_by=None in their select statements because there is a pkey, so order (row #) does not matter
2418	05/24/2012 09:05 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Subset in_table if limit != None or start != 0. start param defaults to 0 again to avoid subsetting the table when starting from row 0 (with no limit).
2417	05/24/2012 08:46 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Don't pass limit, start recursively, because the table subsetting will happen only once in the first invocation of the function. Moved limit, start params to end since they are not passed recursively. start param no longer defaults to 0 because this is not needed since sql.put_table() now sets start to 0 where needed.
2416	05/24/2012 08:38 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Removed limit and start params because they were never fully implemented, and because it's simpler to just have the caller subset their input table
2415	05/24/2012 08:27 PM	Aaron Marcuse-Kubitza	lists.py: Added uniqify()
2414	05/24/2012 08:08 PM	Aaron Marcuse-Kubitza	sql.py: Moved mk_flatten_mapping(), flatten() to Basic queries section since they don't involve database structure info
2413	05/24/2012 08:06 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Use single quotes rather than double quotes around strings where possible
2412	05/24/2012 07:59 PM	Aaron Marcuse-Kubitza	schemas/functions.sql, vegbien.sql: Changed CAST-related relational functions to return NULL on data exceptions and convert the exceptions to warnings. This helps column-based import by mapping invalid values to NULL instead of aborting the whole query on the first invalid value.
2411	05/24/2012 07:33 PM	Aaron Marcuse-Kubitza	sql.py: index_col(): Cache the query so it doesn't try to add an index on the same column multiple times
2410	05/24/2012 07:18 PM	Aaron Marcuse-Kubitza	sql.py mk_select(), sql_gen.py Join.to_str(): Fixed bug where conditions needed to be wrapped in () before being AND-ed together to ensure the proper operator precedence
2409	05/24/2012 06:49 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Add index on columns with invalid values to enable fast filtering
2408	05/24/2012 06:47 PM	Aaron Marcuse-Kubitza	sql.py: Added index_col()
2407	05/24/2012 06:18 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Add pkey on returned pkeys table to enable fast joins
2406	05/24/2012 06:17 PM	Aaron Marcuse-Kubitza	sql.py: Added index_pkey()
2405	05/24/2012 05:41 PM	Aaron Marcuse-Kubitza	sql.py: mk_update(): When running sql_gen.to_name_only_col(), check that the col's table is table
2404	05/24/2012 05:38 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Renamed *pkeys to insert*_pkeys to distinguish them from the full set of pkeys on the input table
2403	05/24/2012 05:27 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): FunctionValueException: Change invalid values to NULL using UPDATE instead of filtering them out using WHERE, to avoid adding lots of conditions to the SELECT statement
2402	05/24/2012 05:11 PM	Aaron Marcuse-Kubitza	sql.py: Added mk_update() and update()
2401	05/24/2012 05:10 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added to_name_only_col()
2400	05/24/2012 04:56 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added as_Value()
2399	05/24/2012 04:29 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): conds: Use new sql_gen.ColValueCond instead of sql_gen.as_ValueCond(). Documented that Code and ValueCond are sql_gen objects.
2398	05/24/2012 04:28 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added ColValueCond
2397	05/24/2012 03:59 PM	Aaron Marcuse-Kubitza	sql.py: mk_flatten_mapping(): Filter str(col) through clean_name() to remove quotes, etc.
2396	05/24/2012 03:58 PM	Aaron Marcuse-Kubitza	sql.py: Added clean_name()
2395	05/24/2012 03:43 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Join together input tables into new table for speed and so that the input is not modified if values are edited
2394	05/24/2012 03:37 PM	Aaron Marcuse-Kubitza	sql.py: mk_flatten_mapping(): Take as_items param to return a list of dict items instead of a dict. Sort preserve cols before other cols. flatten(): Turn on as_items so that cols list is sorted in input order, with preserve cols first. This ensures that if a pkey is provided in preserve, it will be the first col in the generated table.
2393	05/24/2012 03:24 PM	Aaron Marcuse-Kubitza	sql.py: mk_flatten_mapping(), flatten(): Take list of cols to select instead of using all cols in all tables to join
2392	05/24/2012 02:58 PM	Aaron Marcuse-Kubitza	sql.py: mk_flatten_mapping(), flatten(): Renamed flat_table param to into to be consistent with run_query_into() and put it first because it is the output param
2391	05/24/2012 02:55 PM	Aaron Marcuse-Kubitza	sql.py: Added flatten()
2390	05/24/2012 02:38 PM	Aaron Marcuse-Kubitza	sql.py: mk_flatten_mapping(): preserve Col objects will have tables changed to flat_table to work with flattened table
2389	05/24/2012 02:29 PM	Aaron Marcuse-Kubitza	sql.py: mk_flatten_mapping(): Added preserve param for list of columns not to rename
2388	05/24/2012 02:18 PM	Aaron Marcuse-Kubitza	sql.py: esc_name_by_module(): Support module value None, and use default module psycopg2 for it
2387	05/23/2012 09:58 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Renamed pkeys_ref to pkeys to reflect that they are now objects rather than array-based references
2386	05/23/2012 09:54 PM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Renamed into_ref param to into to reflect that it's now an object rather than an array-based reference
2385	05/23/2012 09:51 PM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Made into_ref a sql_gen.Table instead of an array containing a table name to improve flexibility and clarity
2384	05/23/2012 09:34 PM	Aaron Marcuse-Kubitza	dicts.py: Added join()
2383	05/23/2012 09:20 PM	Aaron Marcuse-Kubitza	sql.py: Added mk_flatten_mapping()
2382	05/23/2012 08:28 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Renamed the copy of in_tables that gets modified to in_tables_, so that the original list can eventually be reused in joining together the input tables into a temp table
2381	05/23/2012 07:10 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): FunctionValueException: Also match "date/time field value out of range" errors
2380	05/23/2012 07:04 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): conds: Use a set instead of a list for faster checking of the "cond not in conds" assertion
2379	05/23/2012 06:55 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): conds: Support containers of any iterable type
2378	05/23/2012 06:52 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Made conds a list so that there can be multiple conditions on the same column
2377	05/23/2012 06:36 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): conds is list of (key, value) tuples instead of dict (dict still supported for compatibility), so that there can be multiple conditions on the same column
2376	05/23/2012 06:35 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): conds is list of (key, value) tuples instead of dict (dict still supported for compatibility), so that there can be multiple conditions on the same column
2375	05/23/2012 06:28 PM	Aaron Marcuse-Kubitza	util.py: NamedTuple inherits from objects.BasicObject so that it's comparable and hashable. This fixes a bug in dicts.make_hashable() where the NamedTuple created for a dict would appear to be hashable but would always compare as unequal.
2374	05/23/2012 06:15 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.esc_value(): Run strings.to_unicode() on the generated string so that if it contains unescaped non-ASCII characters, these will not cause problems when concatenated with plain strings
2373	05/23/2012 05:58 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): FunctionValueException: Unpack match.groups() into vars to make code clearer
2372	05/23/2012 05:56 PM	Aaron Marcuse-Kubitza	exc.py: str_(): Avoid traceback exception-formatting functions when possible because they escape non-ASCII characters
2371	05/23/2012 05:11 PM	Aaron Marcuse-Kubitza	sql.py: get_cur_query(): If no raw query: Use strings.ustr() instead of repr() to ensure that if the exception is parsed, embedded quotes will not be double-escaped. Prefix the query by [input] to show that it's not the raw query.
2370	05/23/2012 04:59 PM	Aaron Marcuse-Kubitza	sql_gen.py: Non-Code objects: str() passes informative placeholder string to self.to_str() instead of empty string
2369	05/23/2012 04:41 PM	Aaron Marcuse-Kubitza	sql.py: ExceptionWithNameValue: Use repr() instead of strings.ustr() on the value
2368	05/23/2012 04:38 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): Exception parsing: Use non-greedy qualifier "?" in regexps wherever possible to avoid matching closing quotes later in the error message
2367	05/23/2012 04:32 PM	Aaron Marcuse-Kubitza	sql_gen.py: MockDb.esc_value(): Use repr() instead of strings.ustr() so the quotes around the value are included
2366	05/23/2012 04:30 PM	Aaron Marcuse-Kubitza	sql_gen.py: ValueCond and Join class hierarchies inherit from objects.BasicObject like Code does
2365	05/23/2012 04:24 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): ignore(): Fixed bug where value needed to be filtered through repr(). NullValueException: Fixed bug where value passed to ignore() was the string 'NULL' instead of the value None.
2364	05/23/2012 04:14 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: plantname.rank: Filter through _toTaxonrank
2363	05/23/2012 04:03 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): ignore(): Avoid infinite loops by asserting that in_col is not in conds
2362	05/23/2012 03:58 PM	Aaron Marcuse-Kubitza	objects.py: BasicObject: Fixed bug where util needed to be imported. Added eq() and hash().
2361	05/23/2012 03:47 PM	Aaron Marcuse-Kubitza	strings.py: Removed no longer used DebugPrintable (that functionality is now in objects.BasicObject)
2360	05/23/2012 03:46 PM	Aaron Marcuse-Kubitza	sql_gen.py: Code: Inherit from new objects.BasicObject
2359	05/23/2012 03:46 PM	Aaron Marcuse-Kubitza	Added objects.py
2358	05/23/2012 03:37 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Renamed log_ignore() to ignore() and factored common conds-modifying code into it
2357	05/23/2012 03:29 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Moved post-insert code outside while loop because it will now always be run (there are no longer special cases where the postprocessing doesn't happen)
2356	05/23/2012 03:25 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Missing mapping for NOT NULL column: Just create an empty pkeys table, since the missing rows' pkeys will be set to NULL later
2355	05/23/2012 03:17 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Joining together output and input pkeys: Use new sql_gen.join_same_not_null
2354	05/23/2012 03:14 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Setting missing rows' pkeys to NULL: Use new sql_gen.join_same_not_null
2353	05/23/2012 03:14 PM	Aaron Marcuse-Kubitza	sql_gen.py: Join: Added join_same_not_null. to_str(): Refactored to switch order of left and right tables and cols because left_table is on the right in the comparison, and using the sides of the comparison instead of the sides of the join makes the code clearer.
2352	05/23/2012 02:51 PM	Aaron Marcuse-Kubitza	sql_gen.py: Renamed join_using to join_same to reflect that it can also be used without USING
2351	05/23/2012 02:48 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Set missing rows' pkeys to NULL
2350	05/23/2012 02:10 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): NullValueException: no mapping for missing col: Fixed bug where run_query_into_pkeys() was still using insert_joins instead of input_joins
2349	05/23/2012 02:06 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added MockDb. All str() methods: Use self.to_str() with mockDb.
2348	05/23/2012 01:59 PM	Aaron Marcuse-Kubitza	sql_gen.py: Use db.esc_name() instead of sql.esc_name(db, ...) so passed-in db can be a mock object
2347	05/23/2012 01:58 PM	Aaron Marcuse-Kubitza	sql.py: DbConn: Added esc_name()
2346	05/23/2012 01:51 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Debug-print which columns are being put
2345	05/23/2012 01:50 PM	Aaron Marcuse-Kubitza	sql.py: ConstraintException, NullValueException: Improved error messages
2344	05/23/2012 01:31 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): FunctionValueException: Fixed bug where out_table was still assumed to be an escaped string, but is now a Table object
2343	05/23/2012 01:29 PM	Aaron Marcuse-Kubitza	sql.py: mk_select(): joins: Use new table_not_null_col() instead of pkey() to get a non-NULL column to filter out on
2342	05/22/2012 10:00 PM	Aaron Marcuse-Kubitza	exc.py: add_msg(): Fixed bug where msg needed to be converted to a unicode object before appending it to another unicode object
2341	05/22/2012 09:54 PM	Aaron Marcuse-Kubitza	mappings/VegX-VegBIEN.stems.csv: Fixed bug where taxonfit was named taxonFit. (This was only recently discovered because column names are now escaped, which makes them case-sensitive.)
2340	05/22/2012 09:51 PM	Aaron Marcuse-Kubitza	sql.py: Added table_not_null_col()
2339	05/22/2012 09:50 PM	Aaron Marcuse-Kubitza	sql.py: Added table_cols() and use it in pkey()
2338	05/22/2012 09:36 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql, schemas/functions.sql: Relational functions: Added dummy not_null column to provide a column to use in LEFT JOIN filter-out filters
2337	05/22/2012 09:24 PM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): embeddable: Use new sql_gen.NamedTable
2336	05/22/2012 09:23 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added NamedTable. Table: Added to_Table().
2335	05/22/2012 09:06 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added section labels for each type of SQL code object
2334	05/22/2012 08:25 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): DuplicateKeyException: Fixed bug where dict_subset_right_join() was used instead of dict_subset(), adding spurious None values for columns in the constraint which are not in the input tables
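
Revisions 2376-2378 and 2410 describe two related ideas: conditions are passed as a list rather than a dict so that one column can carry several conditions, and each condition is wrapped in () before being AND-ed together so that operator precedence stays correct. The following is a minimal sketch of that idea only, assuming conditions are plain SQL strings. The function name and_conds() is hypothetical and is not the project's mk_select() or Join.to_str().

```python
def and_conds(conds):
    """Join SQL condition strings with AND.

    Hypothetical helper for illustration; not part of sql.py or sql_gen.py.
    conds may be any iterable of strings, so the same column can appear
    in more than one condition.
    """
    # Wrap each condition in parentheses first. Without them, an OR inside
    # one condition would bind more loosely than the surrounding ANDs and
    # change the meaning of the combined expression.
    return ' AND '.join('(' + cond + ')' for cond in conds)


conds = ['year >= 1900', 'year <= 2012', 'rank IS NULL OR rank = 1']
where = and_conds(conds)
print(where)
```

Without the parentheses, the last two conditions would be read as year <= 2012 AND rank IS NULL, OR rank = 1, because AND binds more tightly than OR in SQL.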