# Date Author Comment
2460 05/29/2012 02:53 PM Aaron Marcuse-Kubitza

sql_gen.py: Join.to_str(): Fixed bug where USING should be used if all columns are join_same_not_null, rather than join_same, because USING uses plain = for comparison. sql.py: put_table(): input_joins now can use sql_gen.join_same_not_null in order to use USING syntax.
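
The distinction in this revision can be sketched as follows. This is a minimal illustration, not the actual sql_gen code: the marker constants, the join_to_str() helper and the use of IS NOT DISTINCT FROM as the NULL-safe comparison are all assumptions. The point is that USING compares with plain =, which never matches NULL to NULL, so it is only equivalent when every column is known to be NOT NULL.

```python
# Hypothetical markers standing in for sql_gen.join_same and
# sql_gen.join_same_not_null; the real objects may look different.
JOIN_SAME = 'join_same'                    # values may be NULL
JOIN_SAME_NOT_NULL = 'join_same_not_null'  # values are never NULL

def join_to_str(left, right, mapping):
    """Render a JOIN clause for mapping = {column name: marker}."""
    cols = sorted(mapping)
    if all(mapping[col] == JOIN_SAME_NOT_NULL for col in cols):
        # Plain = is safe for every column, so USING is equivalent.
        return 'JOIN %s USING (%s)' % (right, ', '.join(cols))
    conds = []
    for col in cols:
        if mapping[col] == JOIN_SAME_NOT_NULL:
            op = '='
        else:
            op = 'IS NOT DISTINCT FROM'  # assumed NULL-safe comparison
        conds.append('(%s.%s %s %s.%s)' % (left, col, op, right, col))
    return 'JOIN %s ON %s' % (right, ' AND '.join(conds))

using_clause = join_to_str('a', 'b', {'id': JOIN_SAME_NOT_NULL})
on_clause = join_to_str('a', 'b', {'id': JOIN_SAME})
```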

2459 05/25/2012 07:14 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Output debug messages with a level of 1.5 to match sql.put_table()'s level for summary messages

2458 05/25/2012 07:01 PM Aaron Marcuse-Kubitza

bin/map: Fixed bug where verbosity needed to be 1 outside of test mode so that profiling and errors stats would be printed at end of import. Verbosity defaults to 0.5 rather than 1 in test mode so profiling and errors stats do not clutter up the test output when running automated tests.

2457 05/25/2012 06:55 PM Aaron Marcuse-Kubitza

bin/map: Only display verbose_errors in test mode, but with any nonzero verbosity. They should not be displayed outside of test mode because verbose errors make the log files huge.

2456 05/25/2012 06:52 PM Aaron Marcuse-Kubitza

bin/map: Renamed verbose param to verbosity because it's now a number, not a boolean

2455 05/25/2012 06:51 PM Aaron Marcuse-Kubitza

bin/map: Removed no longer used debug param (verbose=2 is used instead)

2454 05/25/2012 06:48 PM Aaron Marcuse-Kubitza

bin/map: Fixed bug where verbose_errors' default value depended on debug var, which was not yet set. Removed verbose_errors param and instead turn verbose_errors on whenever verbosity >= 1. Verbosity defaults to 1 in test mode.

2453 05/25/2012 06:33 PM Aaron Marcuse-Kubitza

bin/map: Logging: Don't set sql.run_raw_query.debug, because it is not used anymore (sql.connect(log_debug=...) is used instead)

2452 05/25/2012 06:29 PM Aaron Marcuse-Kubitza

bin/map: Logging: Print debug messages (level > 1) prefixed with their level, to distinguish higher- and lower-level debug messages

2451 05/25/2012 06:22 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Only display warning for exceptions with no handler (which are unexpected), not missing mappings for NOT NULL columns (which are normal in datasources without those columns)

2450 05/25/2012 06:15 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Log summarizing debug messages with a level of 1.5 so they will be displayed even when the major SQL queries (which have a level of 2) are not shown
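
A rough sketch of how fractional levels let summaries through while hiding the level-2 queries. make_log() and the prefix format are hypothetical; only the levels 1.5 and 2 and the idea of prefixing debug messages (level > 1) come from the log entries.

```python
def make_log(verbosity, out):
    """Return a log function that appends visible messages to out."""
    def log(msg, level=1):
        if level > verbosity:
            return  # too detailed for the requested verbosity
        if level > 1:
            # The prefix format is an assumption, not the tool's output.
            msg = 'DEBUG%s: %s' % (level, msg)
        out.append(msg)
    return log

shown = []
log = make_log(1.5, shown)
log('import finished')               # level 1: always shown here
log('inserted 10 rows', level=1.5)   # summary message: shown
log('major SQL query', level=2)      # hidden at verbosity 1.5
```

With a verbosity of 1.5, only the first two messages end up in shown.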

2449 05/25/2012 06:08 PM Aaron Marcuse-Kubitza

bin/map: Provide a log_debug() function to sql.connect() if verbosity > 1 rather than >= 2, to support fractional verbosities

2448 05/25/2012 06:04 PM Aaron Marcuse-Kubitza

sql.py: log_debug_none: Fixed bug where it needed to take the level kw arg to work with verbosity-based logging

2447 05/25/2012 05:57 PM Aaron Marcuse-Kubitza

bin/map: Allow fractional verbosity values

2446 05/25/2012 05:56 PM Aaron Marcuse-Kubitza

sql.py: Functions that version created tables, functions, etc. if they already exist: Use (default) exc_log_level=4 to hide the unsuccessful attempts to create items that already exist and show only the successful attempt

2445 05/25/2012 05:43 PM Aaron Marcuse-Kubitza

sql.py: DbConn.run_query(): Added exc_log_level param to specify a different log_level if the query throws an exception. This will be useful for functions that version created tables, functions, etc. if they already exist.
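
One way such a parameter could behave, as a sketch only: the execute and log_debug callables and the fallback to log_level are assumptions, not the actual DbConn.run_query() signature.

```python
def run_query(execute, query, log_debug, log_level=2, exc_log_level=None):
    """Run query, logging at a different level if it raises."""
    if exc_log_level is None:
        exc_log_level = log_level  # assumed fallback
    try:
        result = execute(query)
    except Exception:
        # A failed attempt is logged at the (usually higher) level,
        # so it stays hidden unless verbosity is raised.
        log_debug(query, level=exc_log_level)
        raise
    log_debug(query, level=log_level)
    return result
```

A caller that retries with a new name when an item already exists could pass a high exc_log_level so that only the successful attempt appears in normal debug output.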

2444 05/25/2012 05:34 PM Aaron Marcuse-Kubitza

sql.py: DbConn.run_query(): Removed no longer accurate doc comment, because that functionality is now in module-level run_query()

2443 05/25/2012 05:31 PM Aaron Marcuse-Kubitza

sql.py: Specify log_levels for minor queries so they can be excluded from the debug output

2442 05/25/2012 05:16 PM Aaron Marcuse-Kubitza

sql.py: select(): Pass log_level to run_query()

2441 05/25/2012 05:13 PM Aaron Marcuse-Kubitza

sql.py: DbConn.run_query(): Added log_level param and pass it to self.log_debug(). run_query(): Pass extra kw_args to DbConn.run_query() (via run_raw_query()) so that caller can specify log_level.

2440 05/25/2012 04:54 PM Aaron Marcuse-Kubitza

sql.py: run_query_into(): Fixed bug where "temporary tables cannot specify a schema name"

2439 05/25/2012 04:42 PM Aaron Marcuse-Kubitza

bin/map: Switched to verbosity-level-based system of logging. verbose is now an integer, and debug sets the minimum verbosity to 2.

2438 05/25/2012 04:37 PM Aaron Marcuse-Kubitza

input.Makefile: Configuration: Removed debug var since it's not used in the Makefile

2437 05/25/2012 04:09 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): put_table_(): Fixed bug where row_ins_ct_ref needed to be passed recursively to put_table() as keyword arg, because the in_row_ct_ref is not passed recursively

2436 05/25/2012 04:07 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): _simplifyPath: Parse "next" XPath param to extract col name of next level's pkey

2435 05/25/2012 03:26 PM Aaron Marcuse-Kubitza

bin/map: by_col: xml_func.strip(): Don't remove _simplifyPath because it is now handled by db_xml.put_table()

2434 05/25/2012 03:25 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Added basic special handling for structural XML functions, which for now just skips the function

2433 05/25/2012 03:21 PM Aaron Marcuse-Kubitza

xml_func.py: strip(): Added preserve param for XML functions not to remove

2432 05/25/2012 02:49 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Handle forward pointers in translation-to-sql_gen step instead of in XML-tree-parsing step, so that special handling for structural XML functions can use the parsed tree before any sql.put_table() processing takes place

2431 05/25/2012 02:44 PM Aaron Marcuse-Kubitza

xml_dom.py: Added is_node()

2430 05/25/2012 02:22 PM Aaron Marcuse-Kubitza

sql.py: table_row_count(): Pass start=0 to mk_select() to avoid "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings

2429 05/25/2012 02:12 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Handle unknown exceptions by returning NULL for all rows. Refactored "missing mapping for NOT NULL column" handling to use new helper function remove_all_rows().

2428 05/25/2012 01:54 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Assert that insert_out_pkeys and insert_in_pkeys have same row count. Assert that pkeys and in_table have same row count.

2427 05/25/2012 12:57 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Use new sql.table_row_count()

2426 05/25/2012 12:56 PM Aaron Marcuse-Kubitza

sql.py: Added table_row_count()

2425 05/25/2012 12:52 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Use new sql_gen.row_count

2424 05/25/2012 12:47 PM Aaron Marcuse-Kubitza

sql_gen.py: Added row_count

2423 05/25/2012 12:41 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Count # rows and update in_row_ct_ref once all columns have been processed. Don't pass in_row_ct_ref to recursive calls because it should only be increased once.

2422 05/25/2012 12:28 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Added in_row_ct_ref param to store the # of input rows processed. Renamed row_ct_ref param to row_ins_ct_ref to distinguish it from new in_row_ct_ref param.

2421 05/24/2012 09:26 PM Aaron Marcuse-Kubitza

sql_gen.py: MockDb.esc_name(): Don't use sql.esc_name_by_module() to avoid circular dependency on sql module

2420 05/24/2012 09:20 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Factored out mk_select() calls in calls to run_query_into_pkeys() into new helper function insert_into_pkeys()

2419 05/24/2012 09:09 PM Aaron Marcuse-Kubitza

sql.py: put_table(): run_query_into_pkeys() calls use order_by=None in their select statements because there is a pkey, so order (row #) does not matter

2418 05/24/2012 09:05 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Subset in_table if limit != None or start != 0. start param defaults to 0 again to avoid subsetting the table when starting from row 0 (with no limit).

2417 05/24/2012 08:46 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Don't pass limit, start recursively, because the table subsetting will happen only once in the first invocation of the function. Moved limit, start params to end since they are not passed recursively. start param no longer defaults to 0 because this is not needed since sql.put_table() now sets start to 0 where needed.

2416 05/24/2012 08:38 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Removed limit and start params because they were never fully implemented, and because it's simpler to just have the caller subset their input table

2415 05/24/2012 08:27 PM Aaron Marcuse-Kubitza

lists.py: Added uniqify()

2414 05/24/2012 08:08 PM Aaron Marcuse-Kubitza

sql.py: Moved mk_flatten_mapping(), flatten() to Basic queries section since they don't involve database structure info

2413 05/24/2012 08:06 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Use single quotes rather than double quotes around strings where possible

2412 05/24/2012 07:59 PM Aaron Marcuse-Kubitza

schemas/functions.sql, vegbien.sql: Changed CAST-related relational functions to return NULL on data exceptions and convert the exceptions to warnings. This helps column-based import by mapping invalid values to NULL instead of aborting the whole query on the first invalid value.
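
The functions in this revision are SQL, but the behavior can be illustrated with a Python analogue. cast_or_none() is a hypothetical helper, not part of the project: it maps an invalid value to None and emits a warning instead of aborting the whole batch.

```python
import warnings

def cast_or_none(cast, value):
    """Apply cast to value; on invalid data, warn and return None."""
    try:
        return cast(value)
    except (ValueError, TypeError) as e:
        warnings.warn('invalid value %r: %s' % (value, e))
        return None

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # The bad value in the middle no longer stops the later ones.
    results = [cast_or_none(int, v) for v in ['1', 'x', '3']]
```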

2411 05/24/2012 07:33 PM Aaron Marcuse-Kubitza

sql.py: index_col(): Cache the query so it doesn't try to add an index on the same column multiple times

2410 05/24/2012 07:18 PM Aaron Marcuse-Kubitza

sql.py mk_select(), sql_gen.py Join.to_str(): Fixed bug where conditions needed to be wrapped in () before being AND-ed together to ensure the proper operator precedence
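
A minimal sketch of the precedence problem, with a hypothetical and_conds() helper: AND binds more tightly than OR in SQL, so a condition containing OR changes meaning unless it is parenthesized before joining.

```python
def and_conds(conds):
    """AND together SQL condition strings, parenthesizing each one."""
    return ' AND '.join('(%s)' % cond for cond in conds)

conds = ['a = 1 OR b = 2', 'c = 3']
unsafe = ' AND '.join(conds)  # parsed as: a = 1 OR (b = 2 AND c = 3)
safe = and_conds(conds)
```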

2409 05/24/2012 06:49 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Add index on columns with invalid values to enable fast filtering

2408 05/24/2012 06:47 PM Aaron Marcuse-Kubitza

sql.py: Added index_col()

2407 05/24/2012 06:18 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Add pkey on returned pkeys table to enable fast joins

2406 05/24/2012 06:17 PM Aaron Marcuse-Kubitza

sql.py: Added index_pkey()

2405 05/24/2012 05:41 PM Aaron Marcuse-Kubitza

sql.py: mk_update(): When running sql_gen.to_name_only_col(), check that the col's table is table

2404 05/24/2012 05:38 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Renamed pkeys to insert_pkeys to distinguish them from the full set of pkeys on the input table

2403 05/24/2012 05:27 PM Aaron Marcuse-Kubitza

sql.py: put_table(): FunctionValueException: Change invalid values to NULL using UPDATE instead of filtering them out using WHERE, to avoid adding lots of conditions to the SELECT statement
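
The idea can be sketched with an in-memory SQLite database. The table, column and values are made up for illustration, and the project targets PostgreSQL rather than SQLite. Instead of appending one more WHERE condition to the SELECT for every invalid value, the value is nulled out once in the working table.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE in_table (row_num INTEGER PRIMARY KEY, height TEXT)')
conn.executemany('INSERT INTO in_table (height) VALUES (?)',
    [('1.5',), ('n/a',), ('2.0',)])

# Change the invalid value to NULL in place, so later SELECTs
# need no extra condition to exclude it.
conn.execute('UPDATE in_table SET height = NULL WHERE height = ?',
    ('n/a',))

rows = conn.execute(
    'SELECT height FROM in_table ORDER BY row_num').fetchall()
```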

2402 05/24/2012 05:11 PM Aaron Marcuse-Kubitza

sql.py: Added mk_update() and update()

2401 05/24/2012 05:10 PM Aaron Marcuse-Kubitza

sql_gen.py: Added to_name_only_col()

2400 05/24/2012 04:56 PM Aaron Marcuse-Kubitza

sql_gen.py: Added as_Value()

2399 05/24/2012 04:29 PM Aaron Marcuse-Kubitza

sql.py: mk_select(): conds: Use new sql_gen.ColValueCond instead of sql_gen.as_ValueCond(). Documented that Code and ValueCond are sql_gen objects.

2398 05/24/2012 04:28 PM Aaron Marcuse-Kubitza

sql_gen.py: Added ColValueCond

2397 05/24/2012 03:59 PM Aaron Marcuse-Kubitza

sql.py: mk_flatten_mapping(): Filter str(col) through clean_name() to remove quotes, etc.

2396 05/24/2012 03:58 PM Aaron Marcuse-Kubitza

sql.py: Added clean_name()

2395 05/24/2012 03:43 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Join together input tables into new table for speed and so that the input is not modified if values are edited

2394 05/24/2012 03:37 PM Aaron Marcuse-Kubitza

sql.py: mk_flatten_mapping(): Take as_items param to return a list of dict items instead of a dict. Sort preserve cols before other cols. flatten(): Turn on as_items so that cols list is sorted in input order, with preserve cols first. This ensures that if a pkey is provided in preserve, it will be the first col in the generated table.

2393 05/24/2012 03:24 PM Aaron Marcuse-Kubitza

sql.py: mk_flatten_mapping(), flatten(): Take list of cols to select instead of using all cols in all tables to join

2392 05/24/2012 02:58 PM Aaron Marcuse-Kubitza

sql.py: mk_flatten_mapping(), flatten(): Renamed flat_table param to into to be consistent with run_query_into() and put it first because it is the output param

2391 05/24/2012 02:55 PM Aaron Marcuse-Kubitza

sql.py: Added flatten()

2390 05/24/2012 02:38 PM Aaron Marcuse-Kubitza

sql.py: mk_flatten_mapping(): preserve Col objects will have tables changed to flat_table to work with flattened table

2389 05/24/2012 02:29 PM Aaron Marcuse-Kubitza

sql.py: mk_flatten_mapping(): Added preserve param for list of columns not to rename

2388 05/24/2012 02:18 PM Aaron Marcuse-Kubitza

sql.py: esc_name_by_module(): Support module value None, and use default module psycopg2 for it

2387 05/23/2012 09:58 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Renamed *pkeys_ref to *pkeys to reflect that they are now objects rather than array-based references

2386 05/23/2012 09:54 PM Aaron Marcuse-Kubitza

sql.py: run_query_into(): Renamed into_ref param to into to reflect that it's now an object rather than an array-based reference

2385 05/23/2012 09:51 PM Aaron Marcuse-Kubitza

sql.py: run_query_into(): Made into_ref a sql_gen.Table instead of an array containing a table name to improve flexibility and clarity

2384 05/23/2012 09:34 PM Aaron Marcuse-Kubitza

dicts.py: Added join()

2383 05/23/2012 09:20 PM Aaron Marcuse-Kubitza

sql.py: Added mk_flatten_mapping()

2382 05/23/2012 08:28 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Renamed the copy of in_tables that gets modified to in_tables_, so that the original list can eventually be reused in joining together the input tables into a temp table

2381 05/23/2012 07:10 PM Aaron Marcuse-Kubitza

sql.py: run_query(): FunctionValueException: Also match "date/time field value out of range" errors

2380 05/23/2012 07:04 PM Aaron Marcuse-Kubitza

sql.py: put_table(): conds: Use a set instead of a list for faster checking of the "cond not in conds" assertion

2379 05/23/2012 06:55 PM Aaron Marcuse-Kubitza

sql.py: mk_select(): conds: Support containers of any iterable type

2378 05/23/2012 06:52 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Made conds a list so that there can be multiple conditions on the same column

2377 05/23/2012 06:36 PM Aaron Marcuse-Kubitza

sql.py: mk_select(): conds is list of (key, value) tuples instead of dict (dict still supported for compatibility), so that there can be multiple conditions on the same column
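
Why a list of tuples is needed can be shown in a few lines. as_cond_items() is a hypothetical normalizer, not the actual mk_select() code: a dict can hold only one value per key, so it cannot express two conditions on the same column.

```python
def as_cond_items(conds):
    """Normalize conds to a list of (key, value) tuples."""
    if isinstance(conds, dict):
        return list(conds.items())  # older dict form still accepted
    return list(conds)

# Two conditions on the same column survive as tuples...
as_list = as_cond_items([('height', 1), ('height', 2)])
# ...but collapse to a single entry when written as a dict.
as_dict = as_cond_items({'height': 1, 'height': 2})
```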

2376 05/23/2012 06:35 PM Aaron Marcuse-Kubitza

sql.py: mk_select(): conds is list of (key, value) tuples instead of dict (dict still supported for compatibility), so that there can be multiple conditions on the same column

2375 05/23/2012 06:28 PM Aaron Marcuse-Kubitza

util.py: NamedTuple inherits from objects.BasicObject so that it's comparable and hashable. This fixes a bug in dicts.make_hashable() where the NamedTuple created for a dict would appear to be hashable but would always compare as unequal.
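
The bug class described here can be reproduced with two small stand-in classes. Neither is the project's NamedTuple or BasicObject. An object without value-based __eq__ and __hash__ is hashable by identity, so two instances with the same contents never compare equal and are never deduplicated.

```python
class Plain(object):
    """Hashable by identity only; equal contents compare unequal."""
    def __init__(self, **attrs):
        self.__dict__.update(attrs)

class Comparable(Plain):
    """Compares and hashes by attribute values."""
    def _key(self):
        return tuple(sorted(self.__dict__.items()))
    def __eq__(self, other):
        return type(other) is type(self) and self._key() == other._key()
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        return hash(self._key())

plain_equal = Plain(a=1) == Plain(a=1)
comparable_equal = Comparable(a=1) == Comparable(a=1)
unique_count = len(set([Comparable(a=1), Comparable(a=1)]))
```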

2374 05/23/2012 06:15 PM Aaron Marcuse-Kubitza

sql.py: DbConn.esc_value(): Run strings.to_unicode() on the generated string so that if it contains unescaped non-ASCII characters, these will not cause problems when concatenated with plain strings

2373 05/23/2012 05:58 PM Aaron Marcuse-Kubitza

sql.py: run_query(): FunctionValueException: Unpack match.groups() into vars to make code clearer

2372 05/23/2012 05:56 PM Aaron Marcuse-Kubitza

exc.py: str_(): Avoid traceback exception-formatting functions when possible because they escape non-ASCII characters

2371 05/23/2012 05:11 PM Aaron Marcuse-Kubitza

sql.py: get_cur_query(): If no raw query: Use strings.ustr() instead of repr() to ensure that if the exception is parsed, embedded quotes will not be double-escaped. Prefix the query by [input] to show that it's not the raw query.

2370 05/23/2012 04:59 PM Aaron Marcuse-Kubitza

sql_gen.py: Non-Code objects: str() passes informative placeholder string to self.to_str() instead of empty string

2369 05/23/2012 04:41 PM Aaron Marcuse-Kubitza

sql.py: ExceptionWithNameValue: Use repr() instead of strings.ustr() on the value

2368 05/23/2012 04:38 PM Aaron Marcuse-Kubitza

sql.py: run_query(): Exception parsing: Use non-greedy qualifier "?" in regexps wherever possible to avoid matching closing quotes later in the error message
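
The effect of the non-greedy qualifier can be seen on a made-up message. The text below is illustrative and not an actual database error: the greedy pattern runs on to the last closing quote in the message, while the non-greedy one stops at the first.

```python
import re

# Illustrative message only; not copied from a real database error.
msg = 'value "abc" rejected by function "f"'

greedy = re.search(r'value "(.+)"', msg).group(1)
non_greedy = re.search(r'value "(.+?)"', msg).group(1)
```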

2367 05/23/2012 04:32 PM Aaron Marcuse-Kubitza

sql_gen.py: MockDb.esc_value(): Use repr() instead of strings.ustr() so the quotes around the value are included

2366 05/23/2012 04:30 PM Aaron Marcuse-Kubitza

sql_gen.py: ValueCond and Join class hierarchies inherit from objects.BasicObject like Code does

2365 05/23/2012 04:24 PM Aaron Marcuse-Kubitza

sql.py: put_table(): ignore(): Fixed bug where value needed to be filtered through repr(). NullValueException: Fixed bug where value passed to ignore() was the string 'NULL' instead of the value None.

2364 05/23/2012 04:14 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: plantname.rank: Filter through _toTaxonrank

2363 05/23/2012 04:03 PM Aaron Marcuse-Kubitza

sql.py: put_table(): ignore(): Avoid infinite loops by asserting that in_col is not in conds

2362 05/23/2012 03:58 PM Aaron Marcuse-Kubitza

objects.py: BasicObject: Fixed bug where util needed to be imported. Added eq() and hash().

2361 05/23/2012 03:47 PM Aaron Marcuse-Kubitza

strings.py: Removed no longer used DebugPrintable (that functionality is now in objects.BasicObject)