Activity - BIEN 3 - NCEAS Projects

From 04/26/2012 to 05/25/2012

05/25/2012

07:14 PM Revision 2459: db_xml.py: put_table(): Output debug messages with a level of 1.5 to match sql.put_table()'s level for summary messages: Aaron Marcuse-Kubitza
07:01 PM Revision 2458: bin/map: Fixed bug where verbosity needed to be 1 outside of test mode so that profiling and errors stats would be printed at end of import. Verbosity defaults to 0.5 rather than 1 in test mode so profiling and errors stats do not clutter up the test output when running automated tests.: Aaron Marcuse-Kubitza
06:55 PM Revision 2457: bin/map: Only display verbose_errors in test mode, but with any nonzero verbosity. They should not be displayed outside of test mode because verbose errors make the log files huge.: Aaron Marcuse-Kubitza
06:52 PM Revision 2456: bin/map: Renamed verbose param to verbosity because it's now a number, not a boolean: Aaron Marcuse-Kubitza
06:51 PM Revision 2455: bin/map: Removed no longer used debug param (verbose=2 is used instead): Aaron Marcuse-Kubitza
06:48 PM Revision 2454: bin/map: Fixed bug where verbose_errors' default value depended on debug var, which was not yet set. Removed verbose_errors param and instead turn verbose_errors on whenever verbosity >= 1. Verbosity defaults to 1 in test mode.: Aaron Marcuse-Kubitza
06:33 PM Revision 2453: bin/map: Logging: Don't set sql.run_raw_query.debug, because it is not used anymore (sql.connect(log_debug=...) is used instead): Aaron Marcuse-Kubitza
06:29 PM Revision 2452: bin/map: Logging: Print debug messages (level > 1) prefixed with their level, to distinguish higher- and lower-level debug messages: Aaron Marcuse-Kubitza
06:22 PM Revision 2451: sql.py: put_table(): Only display warning for exceptions with no handler (which are unexpected), not missing mappings for NOT NULL columns (which are normal in datasources without those columns): Aaron Marcuse-Kubitza
06:15 PM Revision 2450: sql.py: put_table(): Log summarizing debug messages with a level of 1.5 so they will be displayed even when the major SQL queries (which have a level of 2) are not shown: Aaron Marcuse-Kubitza
06:08 PM Revision 2449: bin/map: Provide a log_debug() function to sql.connect() if verbosity > 1 rather than >= 2, to support fractional verbosities: Aaron Marcuse-Kubitza
06:04 PM Revision 2448: sql.py: log_debug_none: Fixed bug where it needed to take the kw arg level to work with verbosity-based logging: Aaron Marcuse-Kubitza
05:57 PM Revision 2447: bin/map: Allow fractional verbosity values: Aaron Marcuse-Kubitza
05:56 PM Revision 2446: sql.py: Functions that version created tables, functions, etc. if they already exist: Use (default) exc_log_level=4 to hide the unsuccessful attempts to create items that already exist and show only the successful attempt: Aaron Marcuse-Kubitza
05:43 PM Revision 2445: sql.py: DbConn.run_query(): Added exc_log_level param to specify a different log_level if the query throws an exception. This will be useful for functions that version created tables, functions, etc. if they already exist.: Aaron Marcuse-Kubitza
05:34 PM Revision 2444: sql.py: DbConn.run_query(): Removed no longer accurate doc comment, because that functionality is now in module-level run_query(): Aaron Marcuse-Kubitza
05:31 PM Revision 2443: sql.py: Specify log_levels for minor queries so they can be excluded from the debug output: Aaron Marcuse-Kubitza
05:16 PM Revision 2442: sql.py: select(): Pass log_level to run_query(): Aaron Marcuse-Kubitza
05:13 PM Revision 2441: sql.py: DbConn.run_query(): Added log_level param and pass it to self.log_debug(). run_query(): Pass extra kw_args to DbConn.run_query() (via run_raw_query()) so that caller can specify log_level.: Aaron Marcuse-Kubitza
04:54 PM Revision 2440: sql.py: run_query_into(): Fixed bug where "temporary tables cannot specify a schema name": Aaron Marcuse-Kubitza
04:42 PM Revision 2439: bin/map: Switched to verbosity-level-based system of logging. verbose is now an integer, and debug sets the minimum verbosity to 2.: Aaron Marcuse-Kubitza
04:37 PM Revision 2438: input.Makefile: Configuration: Removed debug var since it's not used in the Makefile: Aaron Marcuse-Kubitza
04:09 PM Revision 2437: db_xml.py: put_table(): put_table_(): Fixed bug where row_ins_ct_ref needed to be passed recursively to put_table() as keyword arg, because the in_row_ct_ref is not passed recursively: Aaron Marcuse-Kubitza
04:07 PM Revision 2436: db_xml.py: put_table(): _simplifyPath: Parse "next" XPath param to extract col name of next level's pkey: Aaron Marcuse-Kubitza
03:26 PM Revision 2435: bin/map: by_col: xml_func.strip(): Don't remove _simplifyPath because it is now handled by db_xml.put_table(): Aaron Marcuse-Kubitza
03:25 PM Revision 2434: db_xml.py: put_table(): Added basic special handling for structural XML functions, which for now just skips the function: Aaron Marcuse-Kubitza
03:21 PM Revision 2433: xml_func.py: strip(): Added preserve param for XML functions not to remove: Aaron Marcuse-Kubitza
02:49 PM Revision 2432: db_xml.py: put_table(): Handle forward pointers in translation-to-sql_gen step instead of in XML-tree-parsing step, so that special handling for structural XML functions can use the parsed tree before any sql.put_table() processing takes place: Aaron Marcuse-Kubitza
02:44 PM Revision 2431: xml_dom.py: Added is_node(): Aaron Marcuse-Kubitza
02:22 PM Revision 2430: sql.py: table_row_count(): Pass start=0 to mk_select() to avoid "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings: Aaron Marcuse-Kubitza
02:12 PM Revision 2429: sql.py: put_table(): Handle unknown exceptions by returning NULL for all rows. Refactored Missing mapping for NOT NULL column handling to use new helper function remove_all_rows().: Aaron Marcuse-Kubitza
01:54 PM Revision 2428: sql.py: put_table(): Assert that insert_out_pkeys and insert_in_pkeys have same row count. Assert that pkeys and in_table have same row count.: Aaron Marcuse-Kubitza
12:57 PM Revision 2427: db_xml.py: put_table(): Use new sql.table_row_count(): Aaron Marcuse-Kubitza
12:56 PM Revision 2426: sql.py: Added table_row_count(): Aaron Marcuse-Kubitza
12:52 PM Revision 2425: db_xml.py: put_table(): Use new sql_gen.row_count: Aaron Marcuse-Kubitza
12:47 PM Revision 2424: sql_gen.py: Added row_count: Aaron Marcuse-Kubitza
12:41 PM Revision 2423: db_xml.py: put_table(): Count # rows and update in_row_ct_ref once all columns have been processed. Don't pass in_row_ct_ref to recursive calls because it should only be increased once.: Aaron Marcuse-Kubitza
12:28 PM Revision 2422: db_xml.py: put_table(): Added in_row_ct_ref param to store the # of input rows processed. Renamed row_ct_ref param to row_ins_ct_ref to distinguish it from new in_row_ct_ref param.: Aaron Marcuse-Kubitza
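
The verbosity-level logging described in Revisions 2439 to 2459 can be illustrated with a minimal sketch. This is not the code in bin/map or sql.py: the names make_log_debug and shown, the prefix format, and the sample messages are all hypothetical. It assumes a message is displayed when its level is at or below the configured verbosity, so a summary message at level 1.5 still appears when level-2 SQL queries are hidden.

```python
def make_log_debug(verbosity, out):
    """Return a logger that keeps only messages at or below `verbosity`.

    Hypothetical helper; `out` is a list that collects the displayed lines.
    """
    def log_debug(msg, level=2):
        if level > verbosity:
            return  # message is more detailed than the user asked for
        # Debug messages (level > 1) are prefixed with their level
        # (the exact prefix format here is an assumption)
        prefix = 'DEBUG%s: ' % level if level > 1 else ''
        out.append(prefix + msg)
    return log_debug

shown = []
log_debug = make_log_debug(1.5, shown)  # fractional verbosity
log_debug('put_table: summary message', level=1.5)  # kept
log_debug('SELECT * FROM t', level=2)               # dropped
```

With a verbosity of 1.5, only the summary message ends up in shown; raising the verbosity to 2 would also keep the query.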

05/24/2012

09:26 PM Revision 2421: sql_gen.py: MockDb.esc_name(): Don't use sql.esc_name_by_module() to avoid circular dependency on sql module: Aaron Marcuse-Kubitza
09:20 PM Revision 2420: sql.py: put_table(): Factored out mk_select() calls in calls to run_query_into_pkeys() into new helper function insert_into_pkeys(): Aaron Marcuse-Kubitza
09:09 PM Revision 2419: sql.py: put_table(): run_query_into_pkeys() calls use order_by=None in their select statements because there is a pkey, so order (row #) does not matter: Aaron Marcuse-Kubitza
09:05 PM Revision 2418: db_xml.py: put_table(): Subset in_table if limit != None or start != 0. start param defaults to 0 again to avoid subsetting the table when starting from row 0 (with no limit).: Aaron Marcuse-Kubitza
08:46 PM Revision 2417: db_xml.py: put_table(): Don't pass limit, start recursively, because the table subsetting will happen only once in the first invocation of the function. Moved limit, start params to end since they are not passed recursively. start param no longer defaults to 0 because this is not needed since sql.put_table() now sets start to 0 where needed.: Aaron Marcuse-Kubitza
08:38 PM Revision 2416: sql.py: put_table(): Removed limit and start params because they were never fully implemented, and because it's simpler to just have the caller subset their input table: Aaron Marcuse-Kubitza
08:27 PM Revision 2415: lists.py: Added uniqify(): Aaron Marcuse-Kubitza
08:08 PM Revision 2414: sql.py: Moved mk_flatten_mapping(), flatten() to Basic queries section since they don't involve database structure info: Aaron Marcuse-Kubitza
08:06 PM Revision 2413: sql.py: put_table(): Use single quotes rather than double quotes around strings where possible: Aaron Marcuse-Kubitza
07:59 PM Revision 2412: schemas/functions.sql, vegbien.sql: Changed CAST-related relational functions to return NULL on data exceptions and convert the exceptions to warnings. This helps column-based import by mapping invalid values to NULL instead of aborting the whole query on the first invalid value.: Aaron Marcuse-Kubitza
07:33 PM Revision 2411: sql.py: index_col(): Cache the query so it doesn't try to add an index on the same column multiple times: Aaron Marcuse-Kubitza
07:18 PM Revision 2410: sql.py mk_select(), sql_gen.py Join.to_str(): Fixed bug where conditions needed to be wrapped in () before being AND-ed together to ensure the proper operator precedence: Aaron Marcuse-Kubitza
06:49 PM Revision 2409: sql.py: put_table(): Add index on columns with invalid values to enable fast filtering: Aaron Marcuse-Kubitza
06:47 PM Revision 2408: sql.py: Added index_col(): Aaron Marcuse-Kubitza
06:18 PM Revision 2407: sql.py: put_table(): Add pkey on returned pkeys table to enable fast joins: Aaron Marcuse-Kubitza
06:17 PM Revision 2406: sql.py: Added index_pkey(): Aaron Marcuse-Kubitza
05:41 PM Revision 2405: sql.py: mk_update(): When running sql_gen.to_name_only_col(), check that the col's table is table: Aaron Marcuse-Kubitza
05:38 PM Revision 2404: sql.py: put_table(): Renamed *_pkeys to insert_*_pkeys to distinguish them from the full set of pkeys on the input table: Aaron Marcuse-Kubitza
05:27 PM Revision 2403: sql.py: put_table(): FunctionValueException: Change invalid values to NULL using UPDATE instead of filtering them out using WHERE, to avoid adding lots of conditions to the SELECT statement: Aaron Marcuse-Kubitza
05:11 PM Revision 2402: sql.py: Added mk_update() and update(): Aaron Marcuse-Kubitza
05:10 PM Revision 2401: sql_gen.py: Added to_name_only_col(): Aaron Marcuse-Kubitza
04:56 PM Revision 2400: sql_gen.py: Added as_Value(): Aaron Marcuse-Kubitza
04:29 PM Revision 2399: sql.py: mk_select(): conds: Use new sql_gen.ColValueCond instead of sql_gen.as_ValueCond(). Documented that Code and ValueCond are sql_gen objects.: Aaron Marcuse-Kubitza
04:28 PM Revision 2398: sql_gen.py: Added ColValueCond: Aaron Marcuse-Kubitza
03:59 PM Revision 2397: sql.py: mk_flatten_mapping(): Filter str(col) through clean_name() to remove quotes, etc.: Aaron Marcuse-Kubitza
03:58 PM Revision 2396: sql.py: Added clean_name(): Aaron Marcuse-Kubitza
03:43 PM Revision 2395: sql.py: put_table(): Join together input tables into new table for speed and so that the input is not modified if values are edited: Aaron Marcuse-Kubitza
03:37 PM Revision 2394: sql.py: mk_flatten_mapping(): Take as_items param to return a list of dict items instead of a dict. Sort preserve cols before other cols. flatten(): Turn on as_items so that cols list is sorted in input order, with preserve cols first. This ensures that if a pkey is provided in preserve, it will be the first col in the generated table.: Aaron Marcuse-Kubitza
03:24 PM Revision 2393: sql.py: mk_flatten_mapping(), flatten(): Take list of cols to select instead of using all cols in all tables to join: Aaron Marcuse-Kubitza
02:58 PM Revision 2392: sql.py: mk_flatten_mapping(), flatten(): Renamed flat_table param to into to be consistent with run_query_into() and put it first because it is the output param: Aaron Marcuse-Kubitza
02:55 PM Revision 2391: sql.py: Added flatten(): Aaron Marcuse-Kubitza
02:38 PM Revision 2390: sql.py: mk_flatten_mapping(): preserve Col objects will have tables changed to flat_table to work with flattened table: Aaron Marcuse-Kubitza
02:29 PM Revision 2389: sql.py: mk_flatten_mapping(): Added preserve param for list of columns not to rename: Aaron Marcuse-Kubitza
02:18 PM Revision 2388: sql.py: esc_name_by_module(): Support module value None, and use default module psycopg2 for it: Aaron Marcuse-Kubitza
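
The operator-precedence fix in Revision 2410 can be sketched as follows. The helper name and_conds is hypothetical and is not the code in sql.py or sql_gen.py; it only shows why each condition is wrapped in parentheses before the conditions are AND-ed together. Since AND binds more tightly than OR in SQL, an unwrapped "a = 1 OR b = 2 AND c = 3" would be read as "a = 1 OR (b = 2 AND c = 3)".

```python
def and_conds(conds):
    """Join SQL condition strings with AND, wrapping each one in ()
    so that an OR inside a condition keeps its intended grouping.
    (Hypothetical helper for illustration.)"""
    return ' AND '.join('(' + cond + ')' for cond in conds)

where = and_conds(['a = 1 OR b = 2', 'c = 3'])
# where is now '(a = 1 OR b = 2) AND (c = 3)'
```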

05/23/2012

09:58 PM Revision 2387: sql.py: put_table(): Renamed *pkeys_ref to *pkeys to reflect that they are now objects rather than array-based references: Aaron Marcuse-Kubitza
09:54 PM Revision 2386: sql.py: run_query_into(): Renamed into_ref param to into to reflect that it's now an object rather than an array-based reference: Aaron Marcuse-Kubitza
09:51 PM Revision 2385: sql.py: run_query_into(): Made into_ref a sql_gen.Table instead of an array containing a table name to improve flexibility and clarity: Aaron Marcuse-Kubitza
09:34 PM Revision 2384: dicts.py: Added join(): Aaron Marcuse-Kubitza
09:20 PM Revision 2383: sql.py: Added mk_flatten_mapping(): Aaron Marcuse-Kubitza
08:28 PM Revision 2382: sql.py: put_table(): Renamed the copy of in_tables that gets modified to in_tables_, so that the original list can eventually be reused in joining together the input tables into a temp table: Aaron Marcuse-Kubitza
07:10 PM Revision 2381: sql.py: run_query(): FunctionValueException: Also match "date/time field value out of range" errors: Aaron Marcuse-Kubitza
07:04 PM Revision 2380: sql.py: put_table(): conds: Use a set instead of a list for faster checking of the "cond not in conds" assertion: Aaron Marcuse-Kubitza
06:55 PM Revision 2379: sql.py: mk_select(): conds: Support containers of any iterable type: Aaron Marcuse-Kubitza
06:52 PM Revision 2378: sql.py: put_table(): Made conds a list so that there can be multiple conditions on the same column: Aaron Marcuse-Kubitza
06:36 PM Revision 2377: sql.py: mk_select(): conds is list of (key, value) tuples instead of dict (dict still supported for compatibility), so that there can be multiple conditions on the same column: Aaron Marcuse-Kubitza
06:35 PM Revision 2376: sql.py: mk_select(): conds is list of (key, value) tuples instead of dict (dict still supported for compatibility), so that there can be multiple conditions on the same column: Aaron Marcuse-Kubitza
06:28 PM Revision 2375: util.py: NamedTuple inherits from objects.BasicObject so that it's comparable and hashable. This fixes a bug in dicts.make_hashable() where the NamedTuple created for a dict would appear to be hashable but would always compare as unequal.: Aaron Marcuse-Kubitza
06:15 PM Revision 2374: sql.py: DbConn.esc_value(): Run strings.to_unicode() on the generated string so that if it contains unescaped non-ASCII characters, these will not cause problems when concatenated with plain strings: Aaron Marcuse-Kubitza
05:58 PM Revision 2373: sql.py: run_query(): FunctionValueException: Unpack match.groups() into vars to make code clearer: Aaron Marcuse-Kubitza
05:56 PM Revision 2372: exc.py: str_(): Avoid traceback exception-formatting functions when possible because they escape non-ASCII characters: Aaron Marcuse-Kubitza
05:11 PM Revision 2371: sql.py: get_cur_query(): If no raw query: Use strings.ustr() instead of repr() to ensure that if the exception is parsed, embedded quotes will not be double-escaped. Prefix the query by [input] to show that it's not the raw query.: Aaron Marcuse-Kubitza
04:59 PM Revision 2370: sql_gen.py: Non-Code objects: __str__() passes informative placeholder string to self.to_str() instead of empty string: Aaron Marcuse-Kubitza
04:41 PM Revision 2369: sql.py: ExceptionWithNameValue: Use repr() instead of strings.ustr() on the value: Aaron Marcuse-Kubitza
04:38 PM Revision 2368: sql.py: run_query(): Exception parsing: Use non-greedy qualifier "?" in regexps wherever possible to avoid matching closing quotes later in the error message: Aaron Marcuse-Kubitza
04:32 PM Revision 2367: sql_gen.py: MockDb.esc_value(): Use repr() instead of strings.ustr() so the quotes around the value are included: Aaron Marcuse-Kubitza
04:30 PM Revision 2366: sql_gen.py: ValueCond and Join class hierarchies inherit from objects.BasicObject like Code does: Aaron Marcuse-Kubitza
04:24 PM Revision 2365: sql.py: put_table(): ignore(): Fixed bug where value needed to be filtered through repr(). NullValueException: Fixed bug where value passed to ignore() was the string 'NULL' instead of the value None.: Aaron Marcuse-Kubitza
04:14 PM Revision 2364: mappings/DwC2-VegBIEN.specimens.csv: plantname.rank: Filter through _toTaxonrank: Aaron Marcuse-Kubitza
04:03 PM Revision 2363: sql.py: put_table(): ignore(): Avoid infinite loops by asserting that in_col is not in conds: Aaron Marcuse-Kubitza
03:58 PM Revision 2362: objects.py: BasicObject: Fixed bug where util needed to be imported. Added __eq__() and __hash__().: Aaron Marcuse-Kubitza
03:47 PM Revision 2361: strings.py: Removed no longer used DebugPrintable (that functionality is now in objects.BasicObject): Aaron Marcuse-Kubitza
03:46 PM Revision 2360: sql_gen.py: Code: Inherit from new objects.BasicObject: Aaron Marcuse-Kubitza
03:46 PM Revision 2359: Added objects.py: Aaron Marcuse-Kubitza
03:37 PM Revision 2358: sql.py: put_table(): Renamed log_ignore() to ignore() and factored common conds-modifying code into it: Aaron Marcuse-Kubitza
03:29 PM Revision 2357: sql.py: put_table(): Moved post-insert code outside while loop because it will now always be run (there are no longer special cases where the postprocessing doesn't happen): Aaron Marcuse-Kubitza
03:25 PM Revision 2356: sql.py: put_table(): Missing mapping for NOT NULL column: Just create an empty pkeys table, since the missing rows' pkeys will be set to NULL later: Aaron Marcuse-Kubitza
03:17 PM Revision 2355: sql.py: put_table(): Joining together output and input pkeys: Use new sql_gen.join_same_not_null: Aaron Marcuse-Kubitza
03:14 PM Revision 2354: sql.py: put_table(): Setting missing rows' pkeys to NULL: Use new sql_gen.join_same_not_null: Aaron Marcuse-Kubitza
03:14 PM Revision 2353: sql_gen.py: Join: Added join_same_not_null. to_str(): Refactored to switch order of left and right tables and cols because left_table is on the right in the comparison, and using the sides of the comparison instead of the sides of the join makes the code clearer.: Aaron Marcuse-Kubitza
02:51 PM Revision 2352: sql_gen.py: Renamed join_using to join_same to reflect that it can also be used without USING: Aaron Marcuse-Kubitza
02:48 PM Revision 2351: sql.py: put_table(): Set missing rows' pkeys to NULL: Aaron Marcuse-Kubitza
02:10 PM Revision 2350: sql.py: put_table(): NullValueException: no mapping for missing col: Fixed bug where run_query_into_pkeys() was still using insert_joins instead of input_joins: Aaron Marcuse-Kubitza
02:06 PM Revision 2349: sql_gen.py: Added MockDb. All __str__() methods: Use self.to_str() with mockDb.: Aaron Marcuse-Kubitza
01:59 PM Revision 2348: sql_gen.py: Use db.esc_name() instead of sql.esc_name(db, ...) so passed-in db can be a mock object: Aaron Marcuse-Kubitza
01:58 PM Revision 2347: sql.py: DbConn: Added esc_name(): Aaron Marcuse-Kubitza
01:51 PM Revision 2346: db_xml.py: put_table(): Debug-print which columns are being put: Aaron Marcuse-Kubitza
01:50 PM Revision 2345: sql.py: ConstraintException, NullValueException: Improved error messages: Aaron Marcuse-Kubitza
01:31 PM Revision 2344: sql.py: put_table(): FunctionValueException: Fixed bug where out_table was still assumed to be an escaped string, but is now a Table object: Aaron Marcuse-Kubitza
01:29 PM Revision 2343: sql.py: mk_select(): joins: Use new table_not_null_col() instead of pkey() to get a non-NULL column to filter out on: Aaron Marcuse-Kubitza
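
Revision 2368 switches the exception-parsing regexps to the non-greedy qualifier "?". The sketch below shows the difference on a made-up message; it is not one of the actual patterns in sql.py, and the message text is an assumption chosen only because it contains two quoted parts.

```python
import re

# Made-up error message with two quoted parts (not real database output)
msg = 'value "abc" is invalid for column "x"'

# Greedy: .+ runs to the LAST closing quote in the message
greedy = re.search(r'"(.+)"', msg).group(1)

# Non-greedy: .+? stops at the FIRST closing quote
non_greedy = re.search(r'"(.+?)"', msg).group(1)
```

Here greedy captures 'abc" is invalid for column "x', while non_greedy captures just 'abc'.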

05/22/2012

10:00 PM Revision 2342: exc.py: add_msg(): Fixed bug where msg needed to be converted to a unicode object before appending it to another unicode object: Aaron Marcuse-Kubitza
09:54 PM Revision 2341: mappings/VegX-VegBIEN.stems.csv: Fixed bug where taxonfit was named taxonFit. (This was only recently discovered because column names are now escaped, causing them not to be case-insensitive.): Aaron Marcuse-Kubitza
09:51 PM Revision 2340: sql.py: Added table_not_null_col(): Aaron Marcuse-Kubitza
09:50 PM Revision 2339: sql.py: Added table_cols() and use it in pkey(): Aaron Marcuse-Kubitza
09:36 PM Revision 2338: schemas/vegbien.sql, schemas/functions.sql: Relational functions: Added dummy not_null column to provide a column to use in LEFT JOIN filter-out filters: Aaron Marcuse-Kubitza
09:24 PM Revision 2337: sql.py: mk_insert_select(): embeddable: Use new sql_gen.NamedTable: Aaron Marcuse-Kubitza
09:23 PM Revision 2336: sql_gen.py: Added NamedTable. Table: Added to_Table().: Aaron Marcuse-Kubitza
09:06 PM Revision 2335: sql_gen.py: Added section labels for each type of SQL code object: Aaron Marcuse-Kubitza
08:25 PM Revision 2334: sql.py: put_table(): DuplicateKeyException: Fixed bug where dict_subset_right_join() was used instead of dict_subset(), adding spurious None values for columns in the constraint which are not in the input tables: Aaron Marcuse-Kubitza
08:23 PM Revision 2333: sql_gen.py: as_Col(): Don't allow None cols: Aaron Marcuse-Kubitza
08:06 PM Revision 2332: schemas/vegbien.ERD.mwb: Synced with schemas/vegbien.sql: Aaron Marcuse-Kubitza
07:39 PM Revision 2331: sql.py: Removed no longer used clean_name(): Aaron Marcuse-Kubitza
07:38 PM Revision 2330: sql.py: mk_insert_select(): embeddable: Removed clean_name() because the function name is now escaped where it's used: Aaron Marcuse-Kubitza
07:36 PM Revision 2329: sql.py: put_table(): Added support for out_table values that are Table objects: Aaron Marcuse-Kubitza
07:31 PM Revision 2328: sql.py: mk_insert_select(): Fixed bug where table for creating the returning column Col object was the already-escaped string, instead of the Table object: Aaron Marcuse-Kubitza
07:24 PM Revision 2327: sql.py: mk_insert_select(): Fixed bug where function name and returning col were not being escaped: Aaron Marcuse-Kubitza
07:08 PM Revision 2326: sql.py: put_table(): log_ignore(): Fixed bug where in_col needed to be passed through str() because it's a column object: Aaron Marcuse-Kubitza
07:03 PM Revision 2325: sql.py: put_table(): Fixed bug where the filter_out join should only be used in the insert, not in the select of existing/inserted rows. insert_select() call: Fixed compatibility bug where old versions of Python did not support mixing keyword args and ** args.: Aaron Marcuse-Kubitza
06:32 PM Revision 2324: sql.py: put_table(): Fixed bug where "add_row_num(db, out_pkeys_ref[0])" was mistakenly put under the "if row_ct_ref != None" if statement: Aaron Marcuse-Kubitza
06:26 PM Revision 2323: sql_gen.py: Renamed NamedCode to NamedCol to better reflect its specific use: Aaron Marcuse-Kubitza
06:23 PM Revision 2322: sql.py: Removed unnecessary calls to check_name(): Aaron Marcuse-Kubitza
06:22 PM Revision 2321: sql.py: mk_insert_select(): Fixed bug where returning col was not being escaped: Aaron Marcuse-Kubitza
06:20 PM Revision 2320: sql.py: add_row_num(): Fixed bug where table name was not being escaped: Aaron Marcuse-Kubitza
06:13 PM Revision 2319: sql.py: run_query_into(): Fixed bug where into table name was not being escaped: Aaron Marcuse-Kubitza
06:07 PM Revision 2318: sql.py: mk_insert_select(): Fixed bug where output column names were not being escaped: Aaron Marcuse-Kubitza
05:57 PM Revision 2317: sql.py: put_table(): Fixed bug where only string columns were being included in the distinct_on, but columns are now always sql_gen.Col instances: Aaron Marcuse-Kubitza
05:53 PM Revision 2316: sql.py: put_table(): Put together varying insert_select() args using dict instead of individual vars: Aaron Marcuse-Kubitza
05:51 PM Revision 2315: sql.py: mk_select(): Fixed bug where order_by needed to default to None if distinct_on was used. Fixed bug where cond values were being treated as %s params in addition to being parsed by sql_gen.as_ValueCond().to_str().: Aaron Marcuse-Kubitza
05:40 PM Revision 2314: sql_gen.py: Col: Added to_Col(): Aaron Marcuse-Kubitza
05:31 PM Revision 2313: db_xml.py: put_table(): Accept sql_gen.Table objects or strings instead of separate table and schema names: Aaron Marcuse-Kubitza
05:10 PM Revision 2312: sql.py: put_table(): Require all in_table_cols to be sql_gen.Col objects: Aaron Marcuse-Kubitza
05:03 PM Revision 2311: sql_gen.py: ValueCond: Unwrap NamedCode objects: Aaron Marcuse-Kubitza
04:55 PM Revision 2310: sql_gen.py: NamedCode: Inherit from Col so that its name can be retrieved using the same attribute as Col's: Aaron Marcuse-Kubitza
04:43 PM Revision 2309: sql.py: put_table(): Debug-log each caught exception: Aaron Marcuse-Kubitza
04:41 PM Revision 2308: exc.py: str_(): Added first_line_only param to return just the first line: Aaron Marcuse-Kubitza
04:26 PM Revision 2307: sql.py: ConstraintException: Changed text of message to specify that a constraint was violated: Aaron Marcuse-Kubitza
04:14 PM Revision 2306: sql.py: Renamed ExceptionWithColumns to ConstraintException and added name field to contain the constraint name, if any: Aaron Marcuse-Kubitza
04:06 PM Revision 2305: sql.py: put_table(): If there are join_cols, don't get output pkeys of inserted rows and instead select all rows (existing and inserted) after the insert: Aaron Marcuse-Kubitza
04:04 PM Revision 2304: sql_gen.py: Join.to_str(): Fixed bug where order of right_table_col and left_table_col was reversed when applying as_ValueCond() and as_Col(): Aaron Marcuse-Kubitza
03:33 PM Revision 2303: sql.py: put_table(): Moved things outside of the try clause which should not produce the exceptions: Aaron Marcuse-Kubitza
03:21 PM Revision 2302: sql_gen.py: Code: Extend new strings.DebugPrintable instead of implementing __str__(), __repr__() itself: Aaron Marcuse-Kubitza
03:20 PM Revision 2301: strings.py: Added DebugPrintable: Aaron Marcuse-Kubitza
03:17 PM Revision 2300: sql_gen.py: Code: __str__(): Added class name. Added __repr__().: Aaron Marcuse-Kubitza
03:16 PM Revision 2299: util.py: Added class_name(): Aaron Marcuse-Kubitza
02:55 PM Revision 2298: sql_gen.py: Join.to_str(): Fixed bug in USING syntax where columns were not escaped: Aaron Marcuse-Kubitza
02:48 PM Revision 2297: sql.py: put_table(): Order selects by in_tables0's pkey to avoid undefined orderings on multiple runs of the same query: Aaron Marcuse-Kubitza
02:42 PM Revision 2296: sql.py: mk_select(): Removed no longer used esc_name_(): Aaron Marcuse-Kubitza
02:41 PM Revision 2295: sql_gen.py: as_Table(): Removed no longer used support for (schema, table) tuples: Aaron Marcuse-Kubitza
02:39 PM Revision 2294: sql_gen.py: Removed no longer used unescape_table() and table2sql_gen(): Aaron Marcuse-Kubitza
02:38 PM Revision 2293: sql.py: mk_select(): Removed no longer used table_is_esc: Aaron Marcuse-Kubitza
02:37 PM Revision 2292: sql.py: mk_insert_select(): Removed no longer used table_is_esc: Aaron Marcuse-Kubitza
02:34 PM Revision 2291: sql.py: pkey(): Removed no longer used table_is_esc: Aaron Marcuse-Kubitza
02:31 PM Revision 2290: sql.py: cleanup_table(): Switched from table_is_esc to sql_gen.as_Table.to_str(): Aaron Marcuse-Kubitza
02:19 PM Revision 2289: csv2db: Switched to using plain table names rather than table_is_esc: Aaron Marcuse-Kubitza
02:13 PM Revision 2288: bin/map: Switched to using sql_gen rather than table_is_esc: Aaron Marcuse-Kubitza
02:05 PM Revision 2287: sql_gen.py: Removed no longer needed col2sql_gen() and value2sql_gen(): Aaron Marcuse-Kubitza
02:04 PM Revision 2286: sql.py: Replaced sql_gen.value2sql_gen() with sql_gen.as_Col(): Aaron Marcuse-Kubitza
02:00 PM Revision 2285: sql.py: Replaced sql_gen.col2sql_gen() with sql_gen.as_Col(): Aaron Marcuse-Kubitza
01:57 PM Revision 2284: sql.py: mk_select(): Inline cond() and don't use sql_gen.as_Col because sql_gen.as_ValueCond.to_str() calls it: Aaron Marcuse-Kubitza
01:54 PM Revision 2283: sql_gen.py: Removed no longer needed cond2sql_gen(): Aaron Marcuse-Kubitza
01:53 PM Revision 2282: sql.py: mk_select(): cond(): Parse conditions using sql_gen-only functions: Aaron Marcuse-Kubitza
01:47 PM Revision 2281: sql_gen.py: Removed no longer needed join2sql_gen(): Aaron Marcuse-Kubitza
01:44 PM Revision 2280: sql.py: put_table(): Switched joins to sql_gen.Join objects. mk_select(): Only accept joins which are sql_gen.Join objects.: Aaron Marcuse-Kubitza
01:38 PM Revision 2279: sql.py: put_table(): Removed no longer used table_is_esc param: Aaron Marcuse-Kubitza
01:36 PM Revision 2278: sql.py: put_table(): Switched joins to sql_gen.Join objects: Aaron Marcuse-Kubitza
01:28 PM Revision 2277: sql.py: mk_select(): joins: Switched to using sql_gen.Join.to_str() to render joins to SQL: Aaron Marcuse-Kubitza
01:24 PM Revision 2276: sql_gen.py: Join.to_str(): Fixed bugs revealed in first test of function: Aaron Marcuse-Kubitza
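
Several of the 05/22 fixes concern escaping table and column names, and Revision 2341 notes that escaped names are no longer case-insensitive. A minimal sketch of standard SQL identifier quoting follows. The function below is a hypothetical stand-in, not the project's sql.esc_name(), which may work differently (for example by delegating to the database module).

```python
def esc_name(name):
    """Quote a SQL identifier: wrap it in double quotes and double any
    embedded double quotes. (Hypothetical stand-in for illustration.)"""
    return '"' + name.replace('"', '""') + '"'

quoted = esc_name('taxonFit')
# quoted is now '"taxonFit"'
```

In PostgreSQL, an unquoted taxonFit is folded to taxonfit, but the quoted "taxonFit" keeps its case, so it no longer matches a column that was created as taxonfit. That is consistent with the mapping bug surfacing only once names were escaped.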

05/21/2012

11:05 PM Revision 2275: db_xml.py: put_table(): Turn off table_is_esc when calling sql.put_table() and don't escape out_table: Aaron Marcuse-Kubitza
11:04 PM Revision 2274: sql.py: mk_insert_select(): Use sql_gen.table2sql_gen().to_str() to escape the table: Aaron Marcuse-Kubitza
10:57 PM Revision 2273: db_xml.py: put_table(): First in_tables table is sql_gen.Table object: Aaron Marcuse-Kubitza
10:49 PM Revision 2272: db_xml.py: put_table(): Converted row (mapping) values to sql_gen objects: Aaron Marcuse-Kubitza
10:45 PM Revision 2271: sql.py: mk_select(): Accept main tables (table0's) that are Table objects. This change requires plain SQL code to be wrapped in a CustomCode object if it should not be unescaped and converted to a Table object.: Aaron Marcuse-Kubitza
10:42 PM Revision 2270: sql_gen.py: as_Table(): Accept tables that are Code objects, not just Table objects: Aaron Marcuse-Kubitza
10:40 PM Revision 2269: sql_gen.py: CustomCode: Fixed bug where it needed to inherit from Code: Aaron Marcuse-Kubitza
10:19 PM Revision 2268: sql.py: put_table(): Return a sql_gen.Col object instead of an old-style tuple: Aaron Marcuse-Kubitza
10:00 PM Revision 2267: sql.py: mk_select(): joins: Switched to using filter_out as an attribute of the Join object instead of a sentinel value for the first column. Filter by the right table's pkey being NULL instead of each joined column being NULL, because some joined columns may contain NULL values which would mess things up, but the pkey presumably is NOT NULL.: Aaron Marcuse-Kubitza
09:56 PM Revision 2266: sql_gen.py: Join.to_str(): Fixed bug where type_ None was being concatenated with the JOIN str: Aaron Marcuse-Kubitza
09:31 PM Revision 2265: sql_gen.py: Join.to_str(): Fixed bug where USING syntax could not be used for filter_out join type, because a separate right column is required for filtering: Aaron Marcuse-Kubitza
09:20 PM Revision 2264: sql_gen.py: Use new table2sql_gen() in col2sql_gen(), join2sql_gen(): Aaron Marcuse-Kubitza
09:18 PM Revision 2263: sql.py: mk_select(): joins: Convert all joins to sql_gen format using join2sql_gen(): Aaron Marcuse-Kubitza
09:17 PM Revision 2262: sql_gen.py: Added table2sql_gen(): Aaron Marcuse-Kubitza
08:44 PM Revision 2261: sql_gen.py: Added join2sql_gen(): Aaron Marcuse-Kubitza
08:33 PM Revision 2260: sql_gen.py: Added as_Col(). as_ValueCond(): Added support for assuming the value is a column rather than a literal value, using the default_table param. Added Join.: Aaron Marcuse-Kubitza
07:10 PM Revision 2259: sql_gen.py: Put parameterized SQL code objects in separate section: Aaron Marcuse-Kubitza
07:08 PM Revision 2258: sql.py: put_table(): DuplicateKeyException: Assert that join_cols has changed to avoid infinite loops: Aaron Marcuse-Kubitza
06:59 PM Revision 2257: sql.py: put_table(): Moved getting pkeys of already existing rows from DuplicateKeyException to try clause, so that it always runs if there are join_cols. DuplicateKeyException: Add new duplicate key cols to join_cols instead of replacing join_cols so that multiple unique constraints being violated causes the union of their columns to be used for join_cols.: Aaron Marcuse-Kubitza
06:23 PM Revision 2256: sql_gen.py: Added CustomCode: Aaron Marcuse-Kubitza
06:05 PM Revision 2255: sql.py: mk_select(): joins: Fixed bug where joins dict was being modified without first being copied, causing the input value to be modified: Aaron Marcuse-Kubitza
05:52 PM Revision 2254: Compare object()-based sentinel values using is. Where sentinel values must be compared using ==, use rand.rand_int() instead.: Aaron Marcuse-Kubitza
05:13 PM Revision 2253: sql.py: put_table(): Added debug messages for every action performed: Aaron Marcuse-Kubitza
04:45 PM Revision 2252: sql.py: put_table(): Moved assignment of in_pkeys_ref outside loop so it wouldn't need to be re-versioned every iteration: Aaron Marcuse-Kubitza
04:42 PM Revision 2251: sql.py: put_table(): Changed temp_suffix to temp_prefix so all temp tables for a given out_table would have the same prefix. (Existing name collisions due to truncated names are not a problem because version prefixes are automatically added.): Aaron Marcuse-Kubitza
04:23 PM Revision 2250: mappings/DwC2-VegBIEN.specimens.csv: Filter dates through _toTimestamp: Aaron Marcuse-Kubitza
04:20 PM Revision 2249: schemas/functions.sql: Added _toTimestamp: Aaron Marcuse-Kubitza
04:15 PM Revision 2248: mappings/DwC2-VegBIEN.specimens.csv: Filter coordsaccuracy through _toDouble: Aaron Marcuse-Kubitza
04:12 PM Revision 2247: sql.py: FunctionValueException parsing: Support values containing non-word and non-ASCII characters: Aaron Marcuse-Kubitza
04:11 PM Revision 2246: exc.py: Support exception messages containing non-ASCII characters: Aaron Marcuse-Kubitza
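
Note on Revision 2254 (sentinel comparison): an object()-based sentinel is unique by identity, so it is safest to test it with `is` rather than `==`. The sketch below illustrates the idea only; `_MISSING` and `get_option()` are hypothetical names and are not taken from the BIEN sources.

```python
# Hypothetical illustration; _MISSING and get_option() are not BIEN code.
_MISSING = object()  # unique sentinel: no other value is this exact object


def get_option(options, key, default=_MISSING):
    """Return options[key], treating None as a legitimate stored value."""
    value = options.get(key, _MISSING)
    if value is _MISSING:  # identity check, unaffected by any custom __eq__
        if default is _MISSING:
            raise KeyError(key)
        return default
    return value
```

Using `is` here means a stored value whose `__eq__` returns True for everything cannot be mistaken for "missing".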

05/18/2012

07:10 PM Revision 2245: sql.py: put_table(): Print debug messages about how exceptions are being handled: Aaron Marcuse-Kubitza
06:45 PM Revision 2244: sql.py: put_table(): After getting pkeys of already existing rows, insert new rows: Aaron Marcuse-Kubitza
06:42 PM Revision 2243: sql.py: put_table(): Handle FunctionValueExceptions by excluding rows with the invalid value in their "value" column: Aaron Marcuse-Kubitza
06:41 PM Revision 2242: sql.py: run_query(): Also parse "invalid input *syntax* at assignment" errors as FunctionValueExceptions: Aaron Marcuse-Kubitza
06:39 PM Revision 2241: sql_gen.py: Col: Convert string table names to Table objects: Aaron Marcuse-Kubitza
06:09 PM Revision 2240: sql.py: run_query(): Parse "invalid input value at assignment" errors' values as well: Aaron Marcuse-Kubitza
05:55 PM Revision 2239: sql.py: run_query(): Parse "invalid input value at assignment" errors as FunctionValueExceptions: Aaron Marcuse-Kubitza
05:27 PM Revision 2238: sql.py: mk_select(): joins: filter_out: Pass NULLs through. Use sql_gen.*2sql_gen() to add the left and right table names to the columns.: Aaron Marcuse-Kubitza
05:26 PM Revision 2237: sql_gen.py: cond2sql_gen(): Take assume_col param and pass it to value2sql_gen(): Aaron Marcuse-Kubitza
04:45 PM Revision 2236: sql.py: put_table(): Use table-qualified pkey col names whenever possible, to avoid ambiguous column references: Aaron Marcuse-Kubitza
04:12 PM Revision 2235: mappings/DwC2-VegBIEN.specimens.csv: placenames: Convert ranks using _toPlacerank to work with multi-inserts: Aaron Marcuse-Kubitza
04:11 PM Revision 2234: sql.py: DbConn._db(): Fixed bug where the isolation level was not set to "SERIALIZABLE" in a portable way: Aaron Marcuse-Kubitza
04:04 PM Revision 2233: sql.py: mk_select(): distinct_on is turned off when distinct_on == [] rather than when it's None: Aaron Marcuse-Kubitza
03:48 PM Revision 2232: schemas/vegbien.sql: Added _toPlacerank: Aaron Marcuse-Kubitza
03:43 PM Revision 2231: schemas/vegbien.sql: Added _toTaxonrank: Aaron Marcuse-Kubitza
03:35 PM Revision 2230: sql.py: put_table(): Handle NullValueExceptions by removing invalid rows: Aaron Marcuse-Kubitza
03:31 PM Revision 2229: sql_gen.py: Added NamedCode: Aaron Marcuse-Kubitza
03:30 PM Revision 2228: sql_gen.py: Added __str__() to base classes for debugging: Aaron Marcuse-Kubitza
02:46 PM Revision 2227: sql.py: mk_select() (and sql_gen.py): Fixed bugs where literal strings were treated as literal values when they should have been treated as column names. Take default_table param to determine default table to use if a column doesn't have an explicit table. put_table(): mk_main_select(): Pass in_tables0 as mk_select()'s default_table.: Aaron Marcuse-Kubitza
12:54 PM Revision 2226: sql.py: mk_select(): cond(): Run additional sql_gen translation functions cond2sql_gen() and col2sql_gen() on the left and right sides of the comparison: Aaron Marcuse-Kubitza
12:50 PM Revision 2225: sql_gen.py: ValueCond: Fixed bug where values which are Code objects were being converted to Literals. Added cond2sql_gen().: Aaron Marcuse-Kubitza

05/17/2012

08:01 PM Revision 2224: sql.py: mk_select(): join(): Use cond() now that it supports sql_gen format: Aaron Marcuse-Kubitza
07:50 PM Revision 2223: sql_gen.py: Added col2sql_gen() and use it in value2sql_gen(): Aaron Marcuse-Kubitza
07:25 PM Revision 2222: sql_gen.py: CompareCond: By default, compare NULL values literally. Support operator values to pass NULLs through.: Aaron Marcuse-Kubitza
07:23 PM Revision 2221: strings.py: remove_prefix(), remove_suffix(): Added removed_ref param: Aaron Marcuse-Kubitza
06:28 PM Revision 2220: sql.py: mk_select(): parse_col(): Use sql_gen.value2sql_gen().to_str(): Aaron Marcuse-Kubitza
06:22 PM Revision 2219: sql_gen.py: Added as_Table(), unescape_table(), value2sql_gen(): Aaron Marcuse-Kubitza
03:37 PM Revision 2218: sql.py: mk_select(): Documented conds param: Aaron Marcuse-Kubitza
03:32 PM Revision 2217: sql.py: mk_select(): cond(): Switched to using sql_gen so that custom conds would be supported: Aaron Marcuse-Kubitza
03:19 PM Revision 2216: sql_gen.py: ValueCond.to_str(): Made value_code a Code object instead of a string, and renamed it to left_value to reflect where it goes. Added as_ValueCond().: Aaron Marcuse-Kubitza
03:11 PM Revision 2215: sql.py: esc_value(): Fixed bug where db needed to be referenced through self: Aaron Marcuse-Kubitza
02:22 PM Revision 2214: sql_gen.py: ValueCond.to_str(): Added value_code param: Aaron Marcuse-Kubitza
02:16 PM Revision 2213: sql_gen.py: Literal, CompareCond: Implemented to_str(). ValueCond: Autoconvert literal values to Literals.: Aaron Marcuse-Kubitza
02:14 PM Revision 2212: sql.py: DbConn: Added esc_value(): Aaron Marcuse-Kubitza
01:52 PM Revision 2211: Moved SQL code generation classes from sql.py to new sql_gen.py. sql_gen.py: Added Code, Literal, ValueCond, and CompareCond. sql.py: Removed Query because we will use a different approach.: Aaron Marcuse-Kubitza
12:43 PM Revision 2210: sql.py: Added Query, Table, Col: Aaron Marcuse-Kubitza
11:28 AM Revision 2209: sql.py: get(): Fixed bug where limit=1 needs to be passed to select() as a keyword arg now that the distinct_on param comes before it: Aaron Marcuse-Kubitza
11:01 AM Revision 2208: sql.py: put_table(): mk_main_select(): Pass outer var conds to mk_select(): Aaron Marcuse-Kubitza
10:57 AM Revision 2207: sql.py: put_table(): mk_select_(): Fixed bug where it was sometimes being called without distinct_on, causing it to return a different # of rows. Renamed mk_select_() to mk_main_select() for clarity.: Aaron Marcuse-Kubitza
10:48 AM Revision 2206: sql.py: put_table(): Do inserts and selects in a loop so that it will keep retrying the operation with additional constraints until it succeeds: Aaron Marcuse-Kubitza
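
Note on Revisions 2206, 2257 and 2258 (retry loop in put_table()): the messages describe retrying the insert with additional constraints, accumulating the columns of each violated unique constraint, and asserting that the column list changed so the loop cannot spin forever. A minimal sketch of that control flow, assuming a stand-in `DuplicateKeyException` class and a hypothetical `try_insert` callable (the real sql.py code is not shown in this log and may differ):

```python
# Sketch only: this DuplicateKeyException is a stand-in class and try_insert
# is a hypothetical callable; neither is the BIEN sql.py implementation.
class DuplicateKeyException(Exception):
    def __init__(self, cols):
        Exception.__init__(self, 'duplicate key on ' + ', '.join(cols))
        self.cols = cols


def insert_with_retries(try_insert):
    """Call try_insert(join_cols) until it succeeds, widening join_cols
    with the columns of each violated unique constraint."""
    join_cols = []
    while True:
        try:
            return try_insert(list(join_cols))
        except DuplicateKeyException as e:
            old_join_cols = list(join_cols)
            # Union rather than replace, so several constraints accumulate
            join_cols += [c for c in e.cols if c not in join_cols]
            # Guard against retrying forever with an unchanged column list
            assert join_cols != old_join_cols, 'join_cols did not change'
```

The assertion turns a silent infinite loop into an immediate failure when the same constraint is reported twice.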

05/15/2012

03:56 PM Revision 2205: sql.py: put_table(): mk_select_(): Fixed bug where order_by needed to be None because otherwise it wouldn't match the distinct_on cols if they were specified: Aaron Marcuse-Kubitza
03:55 PM Revision 2204: sql.py: put_table(): insert_(): Fixed bug where distinct_on was not passed to mk_select_(): Aaron Marcuse-Kubitza
03:30 PM Revision 2203: sql.py: put_table(): mk_select_(): Fixed bug where distinct_on needed to be passed as a keyword param: Aaron Marcuse-Kubitza
03:21 PM Revision 2202: sql.py: put_table(): insert_() and mk_select_() take distinct_on param: Aaron Marcuse-Kubitza
03:10 PM Revision 2201: sql.py: put_table(): Factored out code that inserts into pkeys table into run_query_into_pkeys() helper function: Aaron Marcuse-Kubitza
02:55 PM Revision 2200: sql.py: mk_select(): Implemented DISTINCT ON according to the distinct_on param: Aaron Marcuse-Kubitza
02:48 PM Revision 2199: sql.py: mk_select(): Added distinct_on param to set the columns to SELECT DISTINCT ON: Aaron Marcuse-Kubitza
02:31 PM Revision 2198: sql.py: clean_name(): Convert names to lowercase so that PostgreSQL will behave the same whether the name is escaped with "" or not. This will help avoid bugs in code that uses temp tables created by the sql module.: Aaron Marcuse-Kubitza
02:29 PM Revision 2197: sql.py: put_table(): Added order_by=None wherever rows were not supposed to be re-ordered. On DuplicateKeyException: Save existing pkeys in temp table for joining on.: Aaron Marcuse-Kubitza
01:31 PM Revision 2196: db_xml.py: put_table(): Pass limit and start to sql.put_table(): Aaron Marcuse-Kubitza
01:09 PM Revision 2195: db_xml.py: put_table(): Added limit and start options: Aaron Marcuse-Kubitza
11:54 AM Revision 2194: sql.py: When creating a temporary entity (table, function, etc.), instead create it as a permanent entity in debug mode so it can be viewed after the program is run: Aaron Marcuse-Kubitza
11:40 AM Revision 2193: sql.py: DbConn: Store whether in debug mode (log_debug != log_debug_none) for easy use by methods: Aaron Marcuse-Kubitza
11:31 AM Revision 2192: bin/map: connect_db(): Turn on autocommit mode in debug mode if commit is on, so that incremental results can be seen in the DB: Aaron Marcuse-Kubitza
11:30 AM Revision 2191: sql.py: DbConn: Use internal autocommit handling instead of DB connection autocommit attr to avoid autocommits inside a savepoint: Aaron Marcuse-Kubitza
11:15 AM Revision 2190: sql.py: DbConn: Added autocommit option to turn on autocommit mode. Use set_session() instead of SQL command to set isolation level.: Aaron Marcuse-Kubitza

05/14/2012

05:50 PM Revision 2189: sql.py: mk_insert_select(): embeddable: Fixed bug where the function may do different things when run, because the function (and other statements whose cached strings depend on the function name) may be run after the function definition would have changed, by versioning the function name and using CREATE FUNCTION instead of CREATE OR REPLACE FUNCTION so that its definition never changes: Aaron Marcuse-Kubitza
05:28 PM Revision 2188: sql.py: Parse "function already exists" errors as DuplicateFunctionException: Aaron Marcuse-Kubitza
05:13 PM Revision 2187: sql.py: mk_select(): joins: Fixed bug where join_not_equal did not do what it was designed for, which is filtering out matches of the join condition (before the bug fix, it effectively did a cross join with matching rows excluded, causing duplication of rows). Renamed join_not_equal to filter_out to reflect its intended use. Support table-scoped column names in the WHERE conds list.: Aaron Marcuse-Kubitza
04:22 PM Revision 2186: sql.py: put_table(): Fixed bug where ORDER BY column needed to have table0 name prefixed (if it didn't already have a table name), to avoid ambiguous column references: Aaron Marcuse-Kubitza
04:11 PM Revision 2185: sql.py: mk_select(): Fixed bug in joins where right_col had the table name prepended *before* it was copied for use with a different table name in join_using and join_not_equal: Aaron Marcuse-Kubitza
03:42 PM Revision 2184: Mapped some unmapped fields in DwC inputs: Aaron Marcuse-Kubitza
02:19 PM Revision 2183: Added mappings/for_review/DwC2-VegBIEN.specimens.fields.csv: Aaron Marcuse-Kubitza
01:21 PM Revision 2182: db_xml.py: put_table(): Fixed bug where it didn't commit right after inserting a node, but instead waited until children with fkeys to parent (independent of the node itself) were inserted: Aaron Marcuse-Kubitza
01:16 PM Revision 2181: sql.py: put_table(): insert_(): Use insert_select() instead of run_query_into() if new option pkeys_table_exists is on: Aaron Marcuse-Kubitza
12:51 PM Revision 2180: sql.py: mk_select(): Support joins with !=: Aaron Marcuse-Kubitza
12:45 PM Revision 2179: sql.py: mk_select(): Support only some join columns being join_using: Aaron Marcuse-Kubitza
12:40 PM Revision 2178: sql.py: put_table(): Renamed in_joins to insert_joins and joins to select_joins for clarity: Aaron Marcuse-Kubitza
12:21 PM Revision 2177: db_xml.py: put_table(): Support children with fkeys to parent: Aaron Marcuse-Kubitza
12:11 PM Revision 2176: sql.py: mk_select(): Make tuple optional for None literal values: Aaron Marcuse-Kubitza

05/13/2012

02:05 PM Revision 2175: sql.py: put_table(): Removed "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings: Aaron Marcuse-Kubitza
02:02 PM Revision 2174: bin/map: by_col: row_ct = 0 because it's unknown for now: Aaron Marcuse-Kubitza
02:00 PM Revision 2173: mk_select(): Support join conditions with literal values: Aaron Marcuse-Kubitza
01:42 PM Revision 2172: sql.py: mk_insert_select(): embeddable: Don't cache function_query because function def could change and then change back: Aaron Marcuse-Kubitza
01:35 PM Revision 2171: sql.py: with_savepoint(): Renamed savepoints to have "level" prefix, since the # indicates the level #: Aaron Marcuse-Kubitza
01:32 PM Revision 2170: sql.py: get_cur_query(): Also accept input params to combine with input_query, and pass input params when get_cur_query() is called: Aaron Marcuse-Kubitza
01:26 PM Revision 2169: sql.py: DbConn.run_query(): Pass input query to get_cur_query(): Aaron Marcuse-Kubitza
01:19 PM Revision 2168: sql.py: get_cur_query() and _add_cursor_info(): Support input_query param that will be used if the raw query is None. Pass input_query in DbConn.execute().: Aaron Marcuse-Kubitza
01:09 PM Revision 2167: sql.py: DbConn.run_query(): Check that query != None: Aaron Marcuse-Kubitza
01:05 PM Revision 2166: bin/map: out_is_db: Only rollback() and close() out_db if it was connected: Aaron Marcuse-Kubitza
01:04 PM Revision 2165: sql.py: DbConn: Added connected(): Aaron Marcuse-Kubitza
01:01 PM Revision 2164: sql.py: Wrapped calls to get_cur_query() that are used as strings in str(), because get_cur_query() can return None: Aaron Marcuse-Kubitza
12:57 PM Revision 2163: sql.py: next_version(): Versions start from 1, because first existing name was version 0: Aaron Marcuse-Kubitza
12:55 PM Revision 2162: put_table(): Use short name for temp_suffix now that version # will be added if needed: Aaron Marcuse-Kubitza
12:51 PM Revision 2161: sql.py: mk_select(): Parse join columns for literal values and table-scoped names as well: Aaron Marcuse-Kubitza
11:54 AM Revision 2160: mappings/DwC2-VegBIEN.specimens.csv: establishmentMeans: Call _toGrowthform on growthform: Aaron Marcuse-Kubitza
11:53 AM Revision 2159: schemas/vegbien.sql: Added _toGrowthform: Aaron Marcuse-Kubitza
11:19 AM Revision 2158: sql.py: put_table(): Changed temp_prefix to a suffix so main name won't be removed if name is truncated: Aaron Marcuse-Kubitza
11:14 AM Revision 2157: sql.py: mk_select(): fields: Support columns with tables. Changed syntax for literal values so that it wouldn't conflict with new syntax for columns with tables.: Aaron Marcuse-Kubitza
11:08 AM Revision 2156: iters.py: flatten(): If not an iterable, just return the value: Aaron Marcuse-Kubitza
10:32 AM Revision 2155: sql.py: put_table(): Pass in_pkeys and out_pkeys to run_query_into() by ref so they will be updated if the table names are changed: Aaron Marcuse-Kubitza
10:28 AM Revision 2154: sql.py: put_table(): Pass pkeys to run_query_into() by ref so it will be updated if the table name is changed: Aaron Marcuse-Kubitza
10:15 AM Revision 2153: sql.py: run_query_into(): If CREATE TABLE AS generates a DuplicateTableException, rename the table with a version # prepended: Aaron Marcuse-Kubitza
10:08 AM Revision 2152: sql.py: run_query_into(): Made into param a reference so that the function can change it, and renamed it to into_ref: Aaron Marcuse-Kubitza
09:36 AM Revision 2151: sql.py: run_query_into(): Made into param a reference so that the function can change it, and renamed it to into_ref: Aaron Marcuse-Kubitza
09:11 AM Revision 2150: sql.py: put_table(): If DuplicateKeyException: run_query_into() recoverably, so that DB errors such as DuplicateTableException will be parsed: Aaron Marcuse-Kubitza
09:07 AM Revision 2149: sql.py: Removed no-longer-needed try_insert(): Aaron Marcuse-Kubitza
09:05 AM Revision 2148: sql.py: Merged with_parsed_errors() into run_query() so all recoverable queries would automatically benefit from DB error message parsing. DbConn: Moved _add_cursor_info() to DbCursor.execute().: Aaron Marcuse-Kubitza
07:45 AM Revision 2147: sql.py: with_parsed_errors(): Raise DuplicateTableException for "relation already exists" errors instead of "table name specified more than once" errors: Aaron Marcuse-Kubitza
07:43 AM Revision 2146: sql.py: run_query_into(): Removed "DROP TABLE IF EXISTS" because sometimes when there are collisions in the temp table names, the code actually uses both "copies" of the temp table. Eventually, this situation will be resolved by adding a counter to the temp table name.: Aaron Marcuse-Kubitza
07:26 AM Revision 2145: sql.py: Cleaned up DbException's and subclasses' messages: Aaron Marcuse-Kubitza
07:26 AM Revision 2144: exc.py: ExceptionWithCause: Added cause_newline option to put the cause on its own line instead of on the message line: Aaron Marcuse-Kubitza
07:10 AM Revision 2143: sql.py: with_parsed_errors(): Also parse "table name specified more than once" errors as DuplicateTableExceptions: Aaron Marcuse-Kubitza
06:56 AM Revision 2142: sql.py: put_table(): Handle DuplicateKeyExceptions by running a select query on the unique constraint columns: Aaron Marcuse-Kubitza
06:14 AM Revision 2141: sql.py: mk_select(): Support tuples of tables, not just lists: Aaron Marcuse-Kubitza
05:29 AM Revision 2140: sql.py: with_parsed_errors(): Support table names that start with "_": Aaron Marcuse-Kubitza
05:20 AM Revision 2139: sql.py: DbConn: Added with_savepoint(). with_savepoint(): Use new DbConn.with_savepoint().: Aaron Marcuse-Kubitza
04:13 AM Revision 2138: schemas/functions.sql: Added _toBool: Aaron Marcuse-Kubitza
04:12 AM Revision 2137: mappings/DwC2-VegBIEN.specimens.csv: establishmentMeans: Use _toBool on iscultivated, isnative: Aaron Marcuse-Kubitza
04:11 AM Revision 2136: schemas/functions.sql: Added _toBool: Aaron Marcuse-Kubitza
04:01 AM Revision 2135: schemas/functions.sql: Made trigger functions IMMUTABLE since they do not modify other tables: Aaron Marcuse-Kubitza
03:51 AM Revision 2134: sql.py: put_table(): Added support for putting just a window subset of the rows in the table. Removed "SELECT statement missing a WHERE, LIMIT, or OFFSET clause" warnings.: Aaron Marcuse-Kubitza
03:30 AM Revision 2133: sql.py: put_table(): Return the column where the pkeys are made available (the out_pkey) instead of taking it as an argument: Aaron Marcuse-Kubitza
03:20 AM Revision 2132: sql.py: put_table(): Get input pkeys corresponding to rows in insert and join together out_pkeys and in_pkeys into final pkeys table: Aaron Marcuse-Kubitza
01:04 AM Revision 2131: sql.py: put_table(): Fully support multiple in_tables, joined together using the main input table's pkey: Aaron Marcuse-Kubitza
01:02 AM Revision 2130: sql.py: mk_select(): joins: Fixed bug where USING-based joins did not have closing ")": Aaron Marcuse-Kubitza
12:28 AM Revision 2129: db_xml.py: put_table(): Fixed bug where in_table was last in in_tables instead of first, causing it to be ignored by the current put_table() implementation, which only considers the first table name: Aaron Marcuse-Kubitza
12:17 AM Revision 2128: db_xml.py: put_table(): Fixed bug where pkeys_table returned by recursive call to put_table() needed to be prefixed with $ to be treated as an input column name rather than a literal value: Aaron Marcuse-Kubitza
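
Note on Revisions 2153 and 2163 (versioned table names): the messages describe renaming a table with a version number when CREATE TABLE AS hits an existing name, with versions starting from 1 because the unversioned name counts as version 0. The sketch below shows one way this could work. It uses Python's built-in sqlite3 module instead of PostgreSQL so that it is self-contained, and the function name `create_table_versioned()` and the `v<N>_` prefix format are assumptions, not the BIEN naming scheme.

```python
import sqlite3


def create_table_versioned(conn, name, select_sql):
    """CREATE TABLE AS under `name`; on a name collision retry as
    v1_<name>, v2_<name>, ... and return the name actually used.

    Assumes `name` is a trusted identifier (it is not escaped here).
    """
    version = 0  # the unversioned name counts as version 0
    candidate = name
    while True:
        try:
            conn.execute('CREATE TABLE "%s" AS %s' % (candidate, select_sql))
            return candidate
        except sqlite3.OperationalError as e:
            if 'already exists' not in str(e):
                raise  # some other error: do not mask it
            version += 1  # so the first versioned name is version 1
            candidate = 'v%d_%s' % (version, name)
```

Returning the name actually used plays the role that passing the table name by reference plays in Revisions 2152 to 2155: the caller learns the final name.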

05/09/2012

05:29 AM Revision 2127: sql.py: mk_select(): Support joins with USING, which can be used to merge multiple input cols into the same output col: Aaron Marcuse-Kubitza
04:42 AM Revision 2126: sql.py: mk_insert_select(): embeddable: Fixed bug where query that uses function was being sorted by its first column (the default mk_select() setting), when it should be left in its original order: Aaron Marcuse-Kubitza
04:36 AM Revision 2125: sql.py: put_table(): Take a dict mapping out to in cols instead of separate in and out cols lists: Aaron Marcuse-Kubitza
04:08 AM Revision 2124: sql.py: mk_select(): Joins: Reversed order of left_col and right_col in the joins dict as well, so the joined table's columns are the keys: Aaron Marcuse-Kubitza
04:05 AM Revision 2123: sql.py: mk_select(): Joins: Reversed order of left_col and right_col so the column of the table being joined is first, to match the form of a WHERE clause: Aaron Marcuse-Kubitza
03:56 AM Revision 2122: sql.py: mk_select(): Support joins: Aaron Marcuse-Kubitza
03:27 AM Revision 2121: sql.py: mk_select(): Accept a list of tables to join together (initial implementation just uses the first table): Aaron Marcuse-Kubitza
02:26 AM Revision 2120: sql.py: mk_select(): Support ORDER BY clause. By default, order by the pkey, since PostgreSQL apparently doesn't do this automatically (and this was causing some staging table tests to fail).: Aaron Marcuse-Kubitza
02:04 AM Revision 2119: bin/map: In debug mode, print the row # and input row just like in error messages: Aaron Marcuse-Kubitza
01:51 AM Revision 2118: bin/map: verbose_errors also defaults to on in debug mode: Aaron Marcuse-Kubitza
01:39 AM Revision 2117: sql.py: add_row_num(): Make the row number column the primary key: Aaron Marcuse-Kubitza
12:36 AM Revision 2116: csv2db: Use new sql.cleanup_table() to map NULL-equivalents to NULL. Consider the empty string to be NULL.: Aaron Marcuse-Kubitza
12:35 AM Revision 2115: sql.py: Added cleanup_table(): Aaron Marcuse-Kubitza
12:33 AM Revision 2114: csvs.py: Added row filters: Aaron Marcuse-Kubitza

05/07/2012

11:14 PM Revision 2113: db_xml.py: put_table(): Fixed bug where relational functions were not being treated as value nodes, and thus their containing child was treated as a child with a backwards pointer instead of a field: Aaron Marcuse-Kubitza
11:12 PM Revision 2112: xml_func.py: Added is_func*() and is_xml_func*() and use them where their definitions were used: Aaron Marcuse-Kubitza
10:40 PM Revision 2111: db_xml.py: Added value() and use it where xml_dom.first_elem() was used: Aaron Marcuse-Kubitza
10:12 PM Revision 2110: mappings/DwC2-VegBIEN.specimens.csv: *Latitude/*Longitude: Moved _toDouble directly after the output col name, so that it's run after any translation functions (which all return strings). *ElevationInMeters: Added _toDouble around all output cols.: Aaron Marcuse-Kubitza
09:56 PM Revision 2109: xpath.py: get(): Create attrs: Fixed bug where attrs were created with last_only on, which caused attrs to get created multiple times if there were multiple attrs of the same name but different values, because the last_only optimization would only check the last attr of that name: Aaron Marcuse-Kubitza
09:19 PM Revision 2108: mappings/DwC2-VegBIEN.specimens.csv: *Latitude/*Longitude: Use new _toDouble to convert strings to doubles (needed for by_col): Aaron Marcuse-Kubitza
09:16 PM Revision 2107: schemas/functions.sql: Added _toDouble: Aaron Marcuse-Kubitza
09:16 PM Revision 2106: bin/map: When calling xml_func.process(), pass DB connection if available: Aaron Marcuse-Kubitza
09:15 PM Revision 2105: xml_func.py: process(): If DB with relational functions available (passed in via db param), call any non-local XML functions as relational funcs: Aaron Marcuse-Kubitza
09:09 PM Revision 2104: sql.py: put(): pkey param (now pkey_) defaults to table's pkey: Aaron Marcuse-Kubitza
08:30 PM Revision 2103: bin/map: by_col: In debug mode, print stripped XML tree that guides import: Aaron Marcuse-Kubitza
08:03 PM Revision 2102: vegbien_dest: Fixed bug where there was a missing line continuation char before schemas var: Aaron Marcuse-Kubitza
08:02 PM Revision 2101: sql.py: DbConn: Fixed bug where schemas db_config value needed to be split apart into strings. Fixed bug where current_setting() returned a value rather than an identifier, so it had to be used with set_config() instead of SET, and run after SET TRANSACTION ISOLATION LEVEL. Moved Input validation section before Database connections because it's used by Database connections.: Aaron Marcuse-Kubitza
07:29 PM Revision 2100: Regenerated vegbien.ERD exports: Aaron Marcuse-Kubitza
07:26 PM Revision 2099: vegbien.ERD.mwb: Changed lines to a configuration that MySQLWorkbench wouldn't keep resetting whenever the ERD was reopened: Aaron Marcuse-Kubitza
07:21 PM Revision 2098: vegbien_dest: Added "functions" to schemas: Aaron Marcuse-Kubitza
07:20 PM Revision 2097: sql.py: db_config: Added schemas param. DbConn: Use any schemas db_config value to set search_path.: Aaron Marcuse-Kubitza
06:58 PM Revision 2096: sql.py: add_row_num(): Name the column "_row_num" so that it doesn't conflict with any "row_num" column that's part of the table schema: Aaron Marcuse-Kubitza
06:50 PM Revision 2095: main Makefile: VegBIEN DB: functions schema: Renamed schemas/functions/clear to .../reset to reflect that it also resets the schema to what's in the dump file. schemas/functions/reset: Use now-available schemas/functions.sql to create the schema.: Aaron Marcuse-Kubitza
06:45 PM Revision 2094: Added autogen schemas/functions.sql: Aaron Marcuse-Kubitza
06:41 PM Revision 2093: schemas/vegbien.sql.make: Use new pg_dump_vegbien: Aaron Marcuse-Kubitza
06:41 PM Revision 2092: Added pg_dump_vegbien to dump a schema of the vegbien db: Aaron Marcuse-Kubitza
06:34 PM Revision 2091: main Makefile: VegBIEN DB: Added functions schema targets: Aaron Marcuse-Kubitza
06:09 PM Revision 2090: Makefile: $(confirm): Support a separate line outside of the highlighted line. Include the "Continue?" in the macro since all prompts include it.: Aaron Marcuse-Kubitza
05:55 PM Revision 2089: Makefile: VegBIEN DB: Display different warning message depending on whether entire DB or just current public schema is being deleted: Aaron Marcuse-Kubitza
05:38 PM Revision 2088: db_xml.py: put_table(): Recurse into forward pointers: Aaron Marcuse-Kubitza

05/05/2012

09:55 PM Revision 2087: sql.py: put_table(): Take multiple in_tables. Initial implementation just used the first in_table.: Aaron Marcuse-Kubitza
09:48 PM Revision 2086: sql.py: Added add_row_num(). put_table(): Add row_num to pkeys_table, so it can be joined with in_table's pkeys.: Aaron Marcuse-Kubitza
09:38 PM Revision 2085: sql.py: Added run_query_into() and use it in insert_select(): Aaron Marcuse-Kubitza
08:53 PM Revision 2084: sql.py: pkey(): Support escaped table names: Aaron Marcuse-Kubitza
07:32 PM Revision 2083: sql.py: mk_insert_select(): embeddable: Name the function alias "f" since it will just be wrapped in a nested SELECT, so the exact name doesn't matter (and won't be visible outside the nested SELECT anyway): Aaron Marcuse-Kubitza
07:08 PM Revision 2082: db_xml.py: put_table(): Return the (table, col) where the pkeys are made available, now that this information is available from sql.put_table(): Aaron Marcuse-Kubitza
07:05 PM Revision 2081: sql.py: put_table(): Return just the name of the table where the pkeys are made available, since the column name in that table now equals the pkey name: Aaron Marcuse-Kubitza
06:58 PM Revision 2080: sql.py: mk_insert_select(): embeddable: Make the column returned by the function have the same name as the returning column: Aaron Marcuse-Kubitza
06:39 PM Revision 2079: db_xml.py: put_table(): Use new sql.put_table(): Aaron Marcuse-Kubitza
06:39 PM Revision 2078: sql.py: Added put_table(): Aaron Marcuse-Kubitza
06:37 PM Revision 2077: sql.py: Added clean_name(). Use it where needed to make an escaped name appendable as a string.: Aaron Marcuse-Kubitza
05:53 PM Revision 2076: sql.py: Added with_parsed_errors() and use it in try_insert(): Aaron Marcuse-Kubitza
05:30 PM Revision 2075: sql.py: insert_select(): into != None: Fixed bug where cacheable was not passed through to DROP TABLE's run_query(), even though it was passed through to CREATE TABLE AS's run_query(): Aaron Marcuse-Kubitza
05:27 PM Revision 2074: db_xml.py: put_table(): Place pkeys in temp table: Aaron Marcuse-Kubitza
05:26 PM Revision 2073: sql.py: mk_insert_select(): Document that embeddable will cause the query to be fully cached, not just if it raises an exception. insert_select(): into != None: Pass recover and cacheable through to each run_query(): Aaron Marcuse-Kubitza
05:17 PM Revision 2072: sql.py: insert_select(): Support placing RETURNING values in temp table: Aaron Marcuse-Kubitza
04:40 PM Revision 2071: db_xml.py: put_table(): Support returning pkey from INSERT SELECT: Aaron Marcuse-Kubitza
04:38 PM Revision 2070: sql.py: mk_insert_select(): Support using an INSERT RETURNING statement as a nested SELECT: Aaron Marcuse-Kubitza

05/04/2012

07:15 PM Revision 2069: sql.py: mk_insert_select(): Removed unused params recover and cacheable: Aaron Marcuse-Kubitza
07:10 PM Revision 2068: sql.py: Added mogrify(): Aaron Marcuse-Kubitza
07:00 PM Revision 2067: db_xml.py: put_table(): Corrected @return doc: Aaron Marcuse-Kubitza
06:32 PM Revision 2066: sql.py: Added mk_insert_select() and use it in insert_select(): Aaron Marcuse-Kubitza
06:21 PM Revision 2065: db_xml.py: put_table(): Use new insert_select(): Aaron Marcuse-Kubitza
06:15 PM Revision 2064: sql.py: insert_select(): Changed order of cols and params arguments so select_query and params would be together: Aaron Marcuse-Kubitza
06:12 PM Revision 2063: sql.py: Added insert_select() and use it in insert(): Aaron Marcuse-Kubitza
04:55 PM Revision 2062: Calls to sql.esc_name*(): Removed preserve_case=True because it is now the default: Aaron Marcuse-Kubitza
04:51 PM Revision 2061: sql.py: esc_name_by_module(): Changed preserve_case to ignore_case, which defaults to False: Aaron Marcuse-Kubitza
04:49 PM Revision 2060: Calls to sql.esc_name*(): Removed preserve_case=True because it is now the default: Aaron Marcuse-Kubitza
04:47 PM Revision 2059: sql.py: esc_name_by_module(): preserve_case defaults to True: Aaron Marcuse-Kubitza
04:44 PM Revision 2058: sql.py: mk_select(): Escape all names used (table, column, cond, etc.): Aaron Marcuse-Kubitza
04:33 PM Revision 2057: sql.py: esc_name_by_module(): If not enclosing name in quotes, call check_name() on it: Aaron Marcuse-Kubitza
04:30 PM Revision 2056: sql.py: mk_select(): Support literal values in the list of cols to select: Aaron Marcuse-Kubitza
03:22 PM Revision 2055: sql.py: mk_select(): Don't escape the table name, because it will either be check_name()d or it's already been escaped: Aaron Marcuse-Kubitza
03:11 PM Revision 2054: sql.py: Added mk_select(), and use it in select(): Aaron Marcuse-Kubitza
02:14 PM Revision 2053: bin/map: Always pass qual_name(table) to sql.select(). This is possible now that qual_name() can handle None schemas.: Aaron Marcuse-Kubitza
02:08 PM Revision 2052: db_xml.py: put_table(): Take separate in_table and in_schema names, instead of in_table and table_is_esc, because the in_schema is needed to scope the temp tables appropriately: Aaron Marcuse-Kubitza
02:04 PM Revision 2051: sql.py: qual_name(): If schema is None, don't prepend schema: Aaron Marcuse-Kubitza
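
Note on Revisions 2051 and 2053 (qual_name() with a None schema): the behavior described is that the schema is prepended only when one is given, which lets callers always qualify the table name. A minimal sketch under stated assumptions: the `sketch_` functions below are hypothetical, their signatures are guesses, and the quoting shown is standard double-quote identifier escaping rather than the actual sql.esc_name*() logic.

```python
# Hypothetical helpers; signatures and bodies are assumptions, not sql.py.
def sketch_esc_name(name):
    """Quote an identifier: wrap in double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def sketch_qual_name(schema, table):
    """Return schema-qualified table name, or just the table if schema is None."""
    if schema is None:
        return sketch_esc_name(table)
    return sketch_esc_name(schema) + '.' + sketch_esc_name(table)
```

With this shape, a caller does not need a separate code path for tables that have no schema.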

05/03/2012

06:59 PM Revision 2050: bin/map, sql.py: Turned SQL query caching back on because benchmarks of just the caching on vs. off reveal that it does reduce processing time significantly. However, there is a slowdown that was introduced between the time caching was added and the time the same XML tree was used for each node, which was giving the false indication that the slowdown was due to the caching.: Aaron Marcuse-Kubitza
06:44 PM Revision 2049: bin/map: Turn SQL query caching off by default: Aaron Marcuse-Kubitza
06:39 PM Revision 2048: bin/map: Added cache_sql env var to enable SQL query caching: Aaron Marcuse-Kubitza
06:39 PM Revision 2047: sql.py: Make caching DbConn enablable. Turn caching off by default because recent benchmarks (n=1000) were showing that it slows things down.: Aaron Marcuse-Kubitza
04:53 PM Revision 2046: bin/map: Added new verbose_errors mode, enabled in test mode and off otherwise, which controls whether the output row and tracebacks are included in error messages. Having this off in import mode will reduce the size of error logs so they don't fill up the vegbiendev hard disk as quickly.: Aaron Marcuse-Kubitza
04:51 PM Revision 2045: exc.py: print_ex(): Added detail option to turn off traceback: Aaron Marcuse-Kubitza
04:10 PM Revision 2044: bin/map: Turn parallel processing off by default. This should fix "Cannot allocate memory" errors in large imports.: Aaron Marcuse-Kubitza

05/01/2012

07:58 AM Revision 2043: bin/map: in_is_db: Don't cache the main SELECT query: Aaron Marcuse-Kubitza
07:56 AM Revision 2042: bin/map: by_col: Use the created template, which already has the column names in it, instead of mapping a sample row: Aaron Marcuse-Kubitza
07:50 AM Revision 2041: bin/map: Fixed bug where db_xml could not be imported twice, or it was treated as an undefined variable for some reason: Aaron Marcuse-Kubitza
07:45 AM Revision 2040: bin/map: map_table(): Make each column a db_xml.ColRef instead of a bare index, so that it will appear as the column name when converted to a string. This will provide better debugging info in the template tree and also avoid needing to create a separate sample row in by_col.: Aaron Marcuse-Kubitza
07:33 AM Revision 2039: db_xml.py: Added ColRef: Aaron Marcuse-Kubitza
06:33 AM Revision 2038: bin/map: Fixed bug where row count was off by one if all rows in the input were exhausted, because the row that raises StopIteration was counting as a row: Aaron Marcuse-Kubitza
06:13 AM Revision 2037: main Makefile: VegBIEN DB: mk_db: Use template1 because it has PROCEDURAL LANGUAGE plpgsql already installed and we aren't using an encoding other than UTF8: Aaron Marcuse-Kubitza
06:11 AM Revision 2036: Moved "CREATE PROCEDURAL LANGUAGE plpgsql" to main Makefile so that it would only run when the DB is created, not when the public schema is reinstalled. This is only relevant on PostgreSQL < 9.x, where the plpgsql language is not part of template0.: Aaron Marcuse-Kubitza
05:56 AM Revision 2035: Renamed parallel.py to parallelproc.py to avoid conflict with new system parallel module on vegbiendev: Aaron Marcuse-Kubitza
05:43 AM Revision 2034: Makefile: VegBIEN DB: public schema: Added schemas/rotate: Aaron Marcuse-Kubitza
05:34 AM Revision 2033: bin/map: Fixed bug in input rows processed count where the count would be off by 1, because the for loop would leave i at the index of the last row instead of one-past-the-last: Aaron Marcuse-Kubitza
04:44 AM Revision 2032: bin/map: Use the same XML tree for each row in DB outputs, to eliminate time spent creating the tree from the XPaths for each row: Aaron Marcuse-Kubitza
04:08 AM Revision 2031: bin/map: map_table(): Resolve each prefix into a separate mapping, which is collision-eliminated, instead of resolving values from multiple prefixes when each individual row is mapped: Aaron Marcuse-Kubitza
03:50 AM Revision 2030: bin/map: Moved collision-prevention code to map_rows() so it would only run if there were mappings, and so that it would run after any mappings preprocessing by map_table() that creates more collisions: Aaron Marcuse-Kubitza
03:45 AM Revision 2029: bin/map: Prevent collisions if multiple inputs map to the same output: Aaron Marcuse-Kubitza
02:02 AM Revision 2028: mappings/DwC1-DwC2.specimens.csv: Mapped collectorNumber and recordNumber to recordNumber with _alt so they wouldn't collide when every input column, even an empty one, is created in the XML tree: Aaron Marcuse-Kubitza
12:42 AM Revision 2027: bin/map: If out_is_db, in debug mode, print each row's XML tree and each value that it's putting: Aaron Marcuse-Kubitza
12:36 AM Revision 2026: bin/map: If out_is_db, in debug mode, print the template XML tree used to insert a sample row into the DB: Aaron Marcuse-Kubitza
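The two row-count fixes above (revisions 2033 and 2038) both concern off-by-one errors when counting processed rows. One way to avoid both can be sketched as below. This is an illustrative sketch, not the actual bin/map code; count_rows() and its limit parameter are hypothetical names.

```python
def count_rows(rows, limit=None):
    """Consume up to `limit` rows and return how many were processed.

    Hypothetical helper for illustration; not taken from bin/map.
    """
    rows = iter(rows)
    count = 0
    while limit is None or count < limit:
        try:
            next(rows)
        except StopIteration:
            # The fetch that hits the end of input is not a row,
            # so it must not be counted
            break
        # Increment only after a row was actually obtained, so the
        # count is one-past-the-last index rather than the last index
        count += 1
    return count


print(count_rows(['r1', 'r2', 'r3']))
print(count_rows(['r1', 'r2', 'r3'], limit=2))
```

Incrementing only after a successful fetch keeps the count correct both when the input is exhausted and when a row limit stops the loop early.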

04/30/2012

11:57 PM Revision 2025: bin/map: map_table(): When translating mappings to column indexes, use appends to a new list instead of deletions from an existing list to simplify the algorithm: Aaron Marcuse-Kubitza
11:20 PM Revision 2024: union: Omit mappings that are mapped *to* in the input map, in addition to mappings that were overridden. This prevents multiple outputs being created for both the renamed and original mappings, causing duplicate output nodes when one XML tree is used for all rows.: Aaron Marcuse-Kubitza
11:18 PM Revision 2023: union: Omit mappings that are mapped *to* in the input map, in addition to mappings that were overridden. This prevents multiple outputs being created for both the renamed and original mappings, causing duplicate output nodes when one XML tree is used for all rows.: Aaron Marcuse-Kubitza
11:17 PM Revision 2022: input.Makefile: Maps building: Via maps cleanup: subtract: Include comment column so commented mappings are never removed: Aaron Marcuse-Kubitza
11:07 PM Revision 2021: subtract: Support "ragged rows" that have fewer columns than the specified column numbers: Aaron Marcuse-Kubitza
11:06 PM Revision 2020: util.py: list_subset(): Added default param to specify the value to use for invalid indexes (if any): Aaron Marcuse-Kubitza
09:44 AM Revision 2019: mappings/VegX-VegBIEN.stems.csv: Mappings with multiple inputs for the same output: Use _alt, etc. to map the multiple inputs to different places in the XML tree, so that when using a pregenerated tree, the empty leaves for each input will not collide with each other: Aaron Marcuse-Kubitza
09:20 AM Revision 2018: mappings/VegX-VegBIEN.stems.csv: Changed XPath references (using "$") to XML function references using _ref where needed to make them work even on a pre-made XML tree used by all rows: Aaron Marcuse-Kubitza
09:13 AM Revision 2017: xml_func.py: Added _ref to retrieve a value from another XML node: Aaron Marcuse-Kubitza
06:12 AM Revision 2016: xml_func.py: Made all functions take a 2nd node param, which contains the func node itself: Aaron Marcuse-Kubitza
04:15 AM Revision 2015: bin/map: If outputting to a DB, also create output XML elements for NULL input values. This will help with the transition to using the same XML tree for all rows.: Aaron Marcuse-Kubitza
04:09 AM Revision 2014: xml_func.py: _label: return None on empty input: Aaron Marcuse-Kubitza
03:46 AM Revision 2013: mappings/VegX-VegBIEN.stems.csv: Added _collapse around subtrees that need to be removed if they are created around a NULL value: Aaron Marcuse-Kubitza
03:40 AM Revision 2012: xml_func.py: Added _collapse to collapse a subtree if the "value" element in it is NULL: Aaron Marcuse-Kubitza
01:44 AM Revision 2011: schemas/vegbien.sql: definedvalue: Made definedvalue nullable so that each row of a datasource can have a uniform structure in VegBIEN, and to support reusing the same XML DOM tree for each row: Aaron Marcuse-Kubitza
01:11 AM Revision 2010: xpath.py: Added is_xpath(): Aaron Marcuse-Kubitza
01:10 AM Revision 2009: xml_dom.py: set_value(): If value is None and node is Element, remove value node entirely instead of setting node's value to None: Aaron Marcuse-Kubitza
01:02 AM Revision 2008: xml_dom.py: Added value_node(). Use new value_node() in value() and set_value(). set_value(): If the node already has a value node, reuse it instead of appending a new value node.: Aaron Marcuse-Kubitza
12:35 AM Revision 2007: xpath.py: put_obj(): Return the id_attr_node using get_1() because it should only be one node: Aaron Marcuse-Kubitza
12:30 AM Revision 2006: xml_func.py: _simplifyPath: Also treat the elem as empty if the required node exists but is empty: Aaron Marcuse-Kubitza
12:04 AM Revision 2005: db_xml.py: put_table(): Added part of put() code that should be common to both functions: Aaron Marcuse-Kubitza
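The list_subset() default param from revision 2020, which revision 2021 relies on for "ragged rows", can be sketched roughly like this. The body is an assumption about the behavior the commit messages describe, not the actual util.py implementation, and the sample row is made up.

```python
def list_subset(list_, idxs, default=None):
    """Select the items of list_ at the given indexes, in order.

    Assumed behavior for illustration: an index outside the list
    yields `default` instead of raising IndexError.
    """
    subset = []
    for idx in idxs:
        if 0 <= idx < len(list_):
            subset.append(list_[idx])
        else:
            # Invalid index, e.g. a ragged row with too few columns
            subset.append(default)
    return subset


row = ['a', 'b']  # a ragged row with only two columns
print(list_subset(row, [0, 2], default=''))
```

Substituting a default keeps the output the same width as the requested column list, so short rows can still be compared column by column.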

04/27/2012

06:16 PM Revision 2004: xpath.py: put_obj(): Return a tuple of the inserted node and the id attr node: Aaron Marcuse-Kubitza
06:13 PM Revision 2003: xpath.py: set_id(): When creating the id_path, use obj() (which deepcopy()s the entire path) because it prevents pointers w/o targets: Aaron Marcuse-Kubitza
06:05 PM Revision 2002: xpath.py: set_id(): When creating the id_path, deepcopy() the id_elem because its keys will change in the main copy: Aaron Marcuse-Kubitza
05:47 PM Revision 2001: xpath.py: set_id(): Return the path to the ID attr, which can be used to change the ID: Aaron Marcuse-Kubitza
05:25 PM Revision 2000: xpath.py: put_obj(): Return the inserted node so it can be used to change the inserted value: Aaron Marcuse-Kubitza
05:08 PM Revision 1999: main Makefile: Maps validation: Fixed bug where there would be infinite recursion with the Maps validation section before the Subdir forwarding section (it's unknown why this is necessary): Aaron Marcuse-Kubitza

04/26/2012

07:12 PM Revision 1998: db_xml.py: put_table(): Added commit param to specify whether to commit after each query: Aaron Marcuse-Kubitza
06:55 PM Revision 1997: bin/map: in_is_db: by_col: Use new put_table() (defined but not implemented yet): Aaron Marcuse-Kubitza
06:54 PM Revision 1996: db_xml.py: Added put_table() (without implementation): Aaron Marcuse-Kubitza
06:52 PM Revision 1995: xml_func.py: strip(): Remove _ignore XML funcs completely instead of replacing them with their values: Aaron Marcuse-Kubitza
06:26 PM Revision 1994: bin/map: in_is_db: by_col: Prefix each input column name by "$": Aaron Marcuse-Kubitza
06:11 PM Revision 1993: bin/map: in_is_db: by_col: Strip off XML functions: Aaron Marcuse-Kubitza
06:09 PM Revision 1992: xml_func.py: Added strip(). pop_value(): Support custom name of value param.: Aaron Marcuse-Kubitza
05:44 PM Revision 1991: bin/map: in_is_db: by_col: Create XML tree of sample row, with the input column names as the values. This tree will guide the sequencing and creation of the column-based queries.: Aaron Marcuse-Kubitza
05:43 PM Revision 1990: input.Makefile: use_staged env var: defaults to on if by_col is on: Aaron Marcuse-Kubitza
05:00 PM Revision 1989: bin/map: Only turn on by_col optimization if mapping to same DB, rather than requiring each place that checks by_col to also check whether mapping to same DB: Aaron Marcuse-Kubitza
