/ - Changes - BIEN 3 - NCEAS Projects

root @ 2621

#	Date	Author	Comment
2621	06/05/2012 09:59 AM	Aaron Marcuse-Kubitza	main Makefile: VegBIEN DB: DB and bien user: Factored $(confirmRm<schema>) functions message text out into $(confirmRmSchema) function
2620	06/05/2012 09:52 AM	Aaron Marcuse-Kubitza	schemas/Makefile, py_functions.sql.make: Generate py_functions.sql from vegbien's py_functions schema
2619	06/05/2012 09:32 AM	Aaron Marcuse-Kubitza	main Makefile: postgres-Linux: Install postgresql-plpython
2618	06/05/2012 09:27 AM	Aaron Marcuse-Kubitza	main Makefile: python-Linux, postgres-Linux: Fixed bug where apt-get installs needed to each be run in a separate command, so that if any package was not found, the other packages would still install. (apt-get aborts on the first invalid package name.)
2617	06/05/2012 09:18 AM	Aaron Marcuse-Kubitza	db_dump_localize: Use new pg_version
2616	06/05/2012 09:18 AM	Aaron Marcuse-Kubitza	Added pg_version
2615	06/05/2012 08:05 AM	Aaron Marcuse-Kubitza	sql.py: into_table_name(): If relational function has a value argument, don't include other arguments, to save space
2614	06/05/2012 08:03 AM	Aaron Marcuse-Kubitza	sql.py: add_pkey(): Version the index name just in case add_suffix() doesn't correctly preserve a needed version #
2613	06/05/2012 08:01 AM	Aaron Marcuse-Kubitza	sql_gen.py: add_suffix(): Fixed bug where only strings already at the max length had the version preserved, even though appending the suffix could bring it past the max length and still cause the version to be overwritten. Fixed bug where last # in str, not first, should be considered to precede the version.
2612	06/05/2012 07:46 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): mapping param: Fixed documentation of supported key/value types
2611	06/05/2012 07:09 AM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Removed no longer accurate comment about handling _simplifyPath
2610	06/05/2012 07:01 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _nullIf relational function
2609	06/05/2012 06:39 AM	Aaron Marcuse-Kubitza	sql_gen.py: add_suffix(): Preserve version so that it won't be truncated off the string, leading to collisions
2608	06/04/2012 03:35 PM	Aaron Marcuse-Kubitza	sql_gen.py: identifier_max_len: Fixed bug where PostgreSQL's max length was actually 63, not 64
2607	06/04/2012 03:18 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: _label(): Fixed bug where some Python syntax had not been translated to PostgreSQL
2606	06/04/2012 03:07 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _label relational function
2605	06/04/2012 03:06 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Subsetting in_table: Fixed bug where in_table was not being ordered by the row_num, because order_by was set to None when it should have been omitted so it would default to the pkey
2604	06/04/2012 02:51 PM	Aaron Marcuse-Kubitza	csv2db: Increased frequency of "Processed .. row(s)" messages to match slower, more common INSERT case instead of faster, less used COPY FROM case
2603	06/04/2012 02:40 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: _merge(): Fixed bug where values were ordered by value instead of by sort order (column name)
2602	06/04/2012 02:17 PM	Aaron Marcuse-Kubitza	xml_func.py: process(): Refactored to emphasize special handling for row-based and column-based modes. In row-based mode, always use a DB relational function over a local XML function when possible, to facilitate testing of DB relational functions in row-based mode. (The shadowed local XML version will still be tested in non-DB modes, such as outputting to intermediate XML files.)
2601	06/04/2012 01:01 PM	Aaron Marcuse-Kubitza	bin/map: Move retrieval of out_db's relational functions outside of process_input() so they can also be used by the non-by_col case
2600	06/04/2012 12:52 PM	Aaron Marcuse-Kubitza	bin/map: out_is_db: Don't evaluate relational functions in xml_func.process() because these will be evaluated by db_xml.put()
2599	06/04/2012 12:41 PM	Aaron Marcuse-Kubitza	xml_func.py: Removed no longer used strip()
2598	06/04/2012 12:40 PM	Aaron Marcuse-Kubitza	bin/map: Use xml_func.process(..., strip=True) instead of xml_func.strip()
2597	06/04/2012 12:39 PM	Aaron Marcuse-Kubitza	xml_func.py: process(): Added strip()'s functionality via strip option
2596	06/04/2012 12:10 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _merge relational function
2595	06/04/2012 11:48 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added join_strs() aggregate
2594	06/04/2012 10:21 AM	Aaron Marcuse-Kubitza	sql.py: Renamed index_pkey() to add_pkey() to be consistent with add_index()
2593	06/04/2012 10:07 AM	Aaron Marcuse-Kubitza	sql.py: into_table_name(): In function args, omit column name for function result columns
2592	06/04/2012 09:57 AM	Aaron Marcuse-Kubitza	sql.py: into_table_name(): In function args, keep the input table name for input columns to identify where they came from, except for the main input table name because it makes the string too long
2591	06/04/2012 09:22 AM	Aaron Marcuse-Kubitza	sql_gen.py: esc_name(): Don't return plain name if is_safe_name(), because this makes the SQL inconsistent when some names have "_"s and some don't
2590	06/04/2012 09:17 AM	Aaron Marcuse-Kubitza	sql.py: index_pkey(): Use sql_gen.add_suffix() to ensure index name isn't too long
2589	06/04/2012 09:15 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): insert_out_pkeys, insert_in_pkeys: Use sql_gen.add_suffix() to ensure name isn't too long
2588	06/04/2012 09:07 AM	Aaron Marcuse-Kubitza	sql.py: next_version(): Use new sql_gen.add_suffix(). Removed identifier_max_len because it is now in sql_gen.
2587	06/04/2012 09:07 AM	Aaron Marcuse-Kubitza	sql_gen.py: Added identifier_max_len and add_suffix()
2586	06/04/2012 09:04 AM	Aaron Marcuse-Kubitza	next_version(): Append the version # so it looks more natural. Take into account the max identifier length.
2585	06/04/2012 09:03 AM	Aaron Marcuse-Kubitza	strings.py: Added add_suffix()
2584	06/04/2012 08:51 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Name the in_table just "in" plus the version #, and the insert_in_pkeys/insert_out_pkeys based on in_table, so that they don't take up so much space in the SQL
2583	06/04/2012 08:50 AM	Aaron Marcuse-Kubitza	sql_gen.py: is_safe_name(): Fixed bug where keywords were incorrectly considered safe
2582	06/04/2012 08:40 AM	Aaron Marcuse-Kubitza	strings.py: repr_no_u(): Fixed bug where "u" prefix was removed even in reprs of non-strings
2581	06/04/2012 08:32 AM	Aaron Marcuse-Kubitza	db_xml.py: into_table_name(): Removed no longer necessary handling of simple functions, which is now done by sql.into_table_name(). Ensure that rank params in functions (not tables) are not treated specially as hierarchical.
2580	06/04/2012 08:21 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): If into == None: For function calls, include the arguments in the into table name
2579	06/04/2012 08:17 AM	Aaron Marcuse-Kubitza	sql_gen.py: to_name_only_col(): Support non-Col Code inputs
2578	06/04/2012 07:42 AM	Aaron Marcuse-Kubitza	sql_gen.py CompareCond.to_str(), callers of combine_conds(): Removed unnecessary grouping () to make SQL clearer
2577	06/04/2012 07:31 AM	Aaron Marcuse-Kubitza	sql_gen.py: Added combine_conds() and use it in Join.to_str() and sql.py mk_select()
2576	06/04/2012 07:18 AM	Aaron Marcuse-Kubitza	sql_gen.py Join.to_str(), sql.py mk_select(): Combining conditions: Don't add newlines where not needed, so that output is less vertically spread out
2575	06/04/2012 07:10 AM	Aaron Marcuse-Kubitza	sql_gen.py: is_safe_name(): Fixed bug where names starting with a digit were incorrectly considered safe
2574	06/04/2012 07:06 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Separate temp table names from into table name with "_" instead of "-" so that quoting the table name will usually be unnecessary
2573	06/04/2012 07:03 AM	Aaron Marcuse-Kubitza	sql.py: esc_name_by_module(): Remove unused param ignore_case
2572	06/04/2012 06:59 AM	Aaron Marcuse-Kubitza	sql_gen.py: esc_name(): If is_safe_name(), just return name, to avoid excessive escaping in debug output for Redmine
2571	06/04/2012 06:55 AM	Aaron Marcuse-Kubitza	sql_gen.py: is_safe_name(): Don't consider uppercase letters safe because they would cause inconsistent behavior in PostgreSQL if quoted vs. not quoted (only unquoted identifiers are case-insensitive)
2570	06/04/2012 06:51 AM	Aaron Marcuse-Kubitza	sql.py: Removed no longer needed check_name()
2569	06/04/2012 06:50 AM	Aaron Marcuse-Kubitza	sql.py: esc_name_by_module(): psycopg2: If ignore_case is set but name is unsafe, just escape it instead of raising an exception
2568	06/04/2012 06:49 AM	Aaron Marcuse-Kubitza	sql_gen.py: Added is_safe_name()
2567	06/04/2012 06:39 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): col_ustr(): Removed no longer needed sql_gen.as_Col() because mapping and join_cols now ensure that their contents are sql_gen.Col objects
2566	06/01/2012 08:29 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added _alt relational function
2565	06/01/2012 08:28 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Make mapping and join_cols a sql_gen.ColDict so that literal values will always be turned into sql_gen.Col objects. DuplicateKeyException: Use dict_subset_right_join() instead of dict_subset() so that all columns in a constraint are included in joins on out_table (such as for a relational function with omitted arguments).
2564	06/01/2012 08:25 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added ColDict
2563	06/01/2012 08:19 PM	Aaron Marcuse-Kubitza	sql_gen.py: as_Col(): Added optional name param to specify that non-Col input will be renamed using NamedCol with the given name
2562	06/01/2012 07:06 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): FunctionValueException: Fixed bug where only function calls, not plain columns, were handled, by using sql_gen.unwrap_func_call() to remove any function call only if there was one
2561	06/01/2012 07:04 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added unwrap_func_call()
2560	06/01/2012 06:47 PM	Aaron Marcuse-Kubitza	bin/map: by_col: Stripping XML functions not in the DB: Fixed bug where preserve_funcs.add() was used when `preserve_funcs |=` should have been used to add the entire iterable that sql.tables() returns
2559	06/01/2012 06:45 PM	Aaron Marcuse-Kubitza	sql.py: not_null_col: Changed value to 'not_null_col' so that column doesn't seem like a status indicator of whether some value is not null (in fact it's just a column that is always not null)
2558	06/01/2012 06:05 PM	Aaron Marcuse-Kubitza	xml_func.py: Replaced xpath.get_1() with xpath.get_value() where possible, for simplicity
2557	06/01/2012 05:59 PM	Aaron Marcuse-Kubitza	xml_func.py: strip(): Evaluate structural functions like _ignore and _ref by process() instead of removing them. Store structural functions' names in structural_funcs module var. This ensures that _ref targets are still expanded in column-based import.
2556	06/01/2012 05:56 PM	Aaron Marcuse-Kubitza	xpath.py: get(): Create attrs: Put keys last so that any lookahead assertion's path will be created last as it would have without the assertion. This ensures that any value argument of an XML function will always go last even if a lookahead assertion would otherwise have caused it to be created with the element's keys, which previously were created before the attributes.
2555	06/01/2012 04:55 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): If is_func, default into table name ends in () instead of '-pkeys'
2554	06/01/2012 04:54 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql, functions.sql: Made cast functions STRICT to enable the RETURNS NULL ON NULL INPUT optimization
2553	06/01/2012 04:33 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Pass is_func to sql.put_table()
2552	06/01/2012 04:32 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Added is_func param for whether out_table is the name of a SQL function, not a table
2551	06/01/2012 04:09 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Treat every node name that starts with "_" as a function, not just members of put_table_special_funcs. This ensures that DB function args are always treated as values, not children with fkeys to parent.
2550	06/01/2012 03:40 PM	Aaron Marcuse-Kubitza	bin/map: by_col: Strip only XML functions that are not in the DB
2549	06/01/2012 03:39 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Make special_funcs externally available as module constant put_table_special_funcs
2548	06/01/2012 03:38 PM	Aaron Marcuse-Kubitza	sql.py: tables(): Changed schema param to schema_like and filter the schema using LIKE so that all schemas can be selected
2547	06/01/2012 01:56 PM	Aaron Marcuse-Kubitza	to_do/timeline.doc: Updated to reflect the month we spent on optimization and column-based import
2546	06/01/2012 12:54 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): in_table name: Remove '-pkeys' suffix from the into table name before adding '-input' so that the name is shorter and clearer
2545	06/01/2012 12:43 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Wrap repr() calls for debug messages in strings.as_tt() to add Redmine formatting
2544	06/01/2012 12:39 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Output "Adding index" debug message with level=2.5 so it's not part of the Redmine steps
2543	05/31/2012 03:39 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql, functions.sql: Cast functions: Fixed bug where invalid value exceptions were not being caught, because implicit conversions to the return type apparently only happen outside the block containing the RETURN statement (i.e. at the end of the function). Fixed by adding explicit type conversion to return type, so that type conversion would happen inside try block.
2542	05/31/2012 03:31 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Re-enabled FunctionValueException handling, by just filtering out the value on all input columns that use the named function (since the error message does not specify which column it was that had the invalid value). This is in some ways better, anyway, because that way the invalid value is filtered out right away in all columns that could contain it, instead of potentially once for each column (if the value appears in more than one input column).
2541	05/31/2012 03:18 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Fixed bug where expressions could not be converted to a string until their table name had been removed
2540	05/31/2012 03:17 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added Expr
2539	05/31/2012 03:13 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Fixed bug where expressions needed to be enclosed in () to distinguish them from plain columns
2538	05/31/2012 03:06 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Support simple expressions as well as columns
2537	05/31/2012 02:37 PM	Aaron Marcuse-Kubitza	sql.py: Renamed index_col() to add_index() so its name isn't similar to index_cols()
2536	05/31/2012 02:33 PM	Aaron Marcuse-Kubitza	sql_gen.py: FunctionCall: Removed repr() because it's a Code object and its to_str() does not take extra arguments
2535	05/31/2012 02:12 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): FunctionValueException: Expanded parsing to include regular function calls, not just relational functions' trigger functions. put_table(): Disabled FunctionValueException handling because this expands FunctionValueException beyond what put_table() could handle.
2534	05/31/2012 01:38 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): MissingCastException: Fixed bug where renaming of cast literal value was not properly propagated to the returned value of the function call, causing the query to assume that a DISTINCT ON column referred to column in one of the joined tables instead of a named column in the SELECT columns list. This logic error would have been very difficult to catch without inspecting the code!
2533	05/31/2012 01:33 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added wrap_in_func()
2532	05/31/2012 01:25 PM	Aaron Marcuse-Kubitza	sql_gen.py: FunctionCall: Filter args through remove_col_rename() to remove any renamings from the function args
2531	05/31/2012 01:20 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): No handler for exception: Print full exception instead of just first line to assist in debugging
2530	05/31/2012 01:06 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql, functions.sql: Removed _to* relational functions because type casting for those types is now automatic
2529	05/31/2012 01:02 PM	Aaron Marcuse-Kubitza	mappings/DwC2-VegBIEN.specimens.csv: Removed _to* relational functions because type casting for those types is now automatic
2528	05/31/2012 12:59 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added cast functions for _to* relational functions
2527	05/31/2012 12:58 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Changed cast functions' input types to text because type must match exactly, not just be implicitly castable
2526	05/31/2012 12:47 PM	Aaron Marcuse-Kubitza	sql.py: run_query(): MissingCastException parsing: Support multiple-word types
2525	05/31/2012 12:38 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Handle MissingCastExceptions by attempting to call a function with the name of the type on the column
2524	05/31/2012 12:33 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added Functions section with Function and FunctionCall
2523	05/31/2012 11:56 AM	Aaron Marcuse-Kubitza	sql.py: Added MissingCastException and parse it in run_query()
2522	05/31/2012 11:36 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: Added cast functions for enum types which map invalid values to NULL
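
Revisions 2608, 2609 and 2613 describe how add_suffix() keeps a trailing version number intact when a name has to be shortened to PostgreSQL's 63-character identifier limit. The following is only a minimal sketch of that idea, not the code from sql_gen.py: the signature, the max_len parameter, the error handling, and the choice to place the suffix after the version are all assumptions made for illustration.

```python
# Hypothetical sketch; not the actual sql_gen.py implementation.
IDENTIFIER_MAX_LEN = 63  # PostgreSQL's max identifier length (see r2608)


def add_suffix(name, suffix, max_len=IDENTIFIER_MAX_LEN):
    """Append suffix to name, shortening the base rather than the version."""
    # Everything after the last '#' is treated as the version (see r2613).
    base, sep, version = name.rpartition('#')
    if not sep:  # no '#' found, so the name carries no version
        base, version = name, ''
    # The part that must survive truncation: '#', version, then suffix.
    tail = sep + version + suffix
    if len(tail) > max_len:
        raise ValueError('version and suffix alone exceed the max length')
    # Always check the combined length, not just names already at the
    # limit, so appending the suffix can never push the version off the end.
    return base[:max_len - len(tail)] + tail
```

With these assumptions, a 70-character name ending in "#2" plus the suffix "_pkey" comes back exactly 63 characters long and still ends in "#2_pkey", while a short name such as "tbl" is returned with the suffix simply appended.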
