# Date Author Comment
2624 06/05/2012 10:22 AM Aaron Marcuse-Kubitza

main Makefile: VegBIEN DB: DB and bien user: Added schemas/py_functions/reset. db: Create py_functions schema.

2623 06/05/2012 10:16 AM Aaron Marcuse-Kubitza

schemas/py_functions.sql.make: Fixed bug where owners needed to be included because schema is imported as superuser so that untrusted PL/Python functions can be created

2622 06/05/2012 10:15 AM Aaron Marcuse-Kubitza

pg_dump_vegbien: Support optionally including owners

2621 06/05/2012 09:59 AM Aaron Marcuse-Kubitza

main Makefile: VegBIEN DB: DB and bien user: Factored $(confirmRm<schema>) functions message text out into $(confirmRmSchema) function

2620 06/05/2012 09:52 AM Aaron Marcuse-Kubitza

schemas/Makefile, py_functions.sql.make: Generate py_functions.sql from vegbien's py_functions schema

2619 06/05/2012 09:32 AM Aaron Marcuse-Kubitza

main Makefile: postgres-Linux: Install postgresql-plpython

2618 06/05/2012 09:27 AM Aaron Marcuse-Kubitza

main Makefile: python-Linux, postgres-Linux: Fixed bug where apt-get installs needed to each be run in a separate command, so that if any package was not found, the other packages would still install. (apt-get aborts on the first invalid package name.)

2617 06/05/2012 09:18 AM Aaron Marcuse-Kubitza

db_dump_localize: Use new pg_version

2616 06/05/2012 09:18 AM Aaron Marcuse-Kubitza

Added pg_version

2615 06/05/2012 08:05 AM Aaron Marcuse-Kubitza

sql.py: into_table_name(): If relational function has a value argument, don't include other arguments, to save space

2614 06/05/2012 08:03 AM Aaron Marcuse-Kubitza

sql.py: add_pkey(): Version the index name just in case add_suffix() doesn't correctly preserve a needed version #

2613 06/05/2012 08:01 AM Aaron Marcuse-Kubitza

sql_gen.py: add_suffix(): Fixed bug where only strings already at the max length had the version preserved, even though appending the suffix could bring it past the max length and still cause the version to be overwritten. Fixed bug where last # in str, not first, should be considered to precede the version.
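For illustration, the behavior described in revisions 2609 and 2613 can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the signature and the truncation strategy are assumptions, and length is counted in characters rather than bytes, which is only accurate for ASCII names.

```python
# PostgreSQL limits identifiers to 63 bytes (see revision 2608)
IDENTIFIER_MAX_LEN = 63

def add_suffix(name, suffix, max_len=IDENTIFIER_MAX_LEN):
    """Append suffix to name, shortening the base name (never the version)
    so that the result fits in max_len characters.

    Hypothetical sketch: the version is assumed to be everything after the
    LAST '#' in the name.
    """
    base, sep, version = name.rpartition('#')
    if sep:
        # Keep "#<version>" together with the suffix so neither is cut off
        tail = sep + version + suffix
    else:
        # No '#' found: rpartition() put the whole name in `version`
        base, tail = name, suffix
    # If tail alone exceeds max_len, the result will still be too long;
    # this sketch does not handle that edge case.
    keep = max(max_len - len(tail), 0)
    return base[:keep] + tail

print(add_suffix('in#3', '_pkey'))
print(len(add_suffix('a' * 60 + '#2', '_pkeys')))
```

The length check is applied to the combined result rather than to the input alone, which is the case the first fix in revision 2613 describes: a name shorter than the limit can still overflow once the suffix is appended.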

2612 06/05/2012 07:46 AM Aaron Marcuse-Kubitza

sql.py: put_table(): mapping param: Fixed documentation of supported key/value types

2611 06/05/2012 07:09 AM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Removed no longer accurate comment about handling _simplifyPath

2610 06/05/2012 07:01 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _nullIf relational function

2609 06/05/2012 06:39 AM Aaron Marcuse-Kubitza

sql_gen.py: add_suffix(): Preserve version so that it won't be truncated off the string, leading to collisions

2608 06/04/2012 03:35 PM Aaron Marcuse-Kubitza

sql_gen.py: identifier_max_len: Fixed bug where PostgreSQL's max length was actually 63, not 64

2607 06/04/2012 03:18 PM Aaron Marcuse-Kubitza

schemas/functions.sql: _label(): Fixed bug where some Python syntax had not been translated to PostgreSQL

2606 06/04/2012 03:07 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _label relational function

2605 06/04/2012 03:06 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Subsetting in_table: Fixed bug where in_table was not being ordered by the row_num, because order_by was set to None when it should have been omitted so it would default to the pkey
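The distinction between passing None and omitting the argument can be sketched with a sentinel default. The function name, parameters, and sentinel below are hypothetical stand-ins and are not taken from sql.py.

```python
# Sentinel meaning "caller did not specify an ordering" (hypothetical name)
ORDER_BY_PKEY = object()

def mk_select(table, pkey, order_by=ORDER_BY_PKEY):
    """Build a simple SELECT; a sketch, not the project's mk_select()."""
    query = 'SELECT * FROM ' + table
    if order_by is ORDER_BY_PKEY:
        # Argument omitted: fall back to ordering by the primary key
        order_by = pkey
    if order_by is not None:
        query += ' ORDER BY ' + order_by
    # An explicit None means "no ORDER BY at all"
    return query

print(mk_select('in_table', 'row_num'))
print(mk_select('in_table', 'row_num', order_by=None))
```

With this pattern, explicitly passing None suppresses the ordering entirely, which matches the symptom described in the commit message.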

2604 06/04/2012 02:51 PM Aaron Marcuse-Kubitza

csv2db: Increased frequency of "Processed .. row(s)" messages to match slower, more common INSERT case instead of faster, less used COPY FROM case

2603 06/04/2012 02:40 PM Aaron Marcuse-Kubitza

schemas/functions.sql: _merge(): Fixed bug where values were ordered by value instead of by sort order (column name)

2602 06/04/2012 02:17 PM Aaron Marcuse-Kubitza

xml_func.py: process(): Refactored to emphasize special handling for row-based and column-based modes. In row-based mode, always use a DB relational function over a local XML function when possible, to facilitate testing of DB relational functions in row-based mode. (The shadowed local XML version will still be tested in non-DB modes, such as outputting to intermediate XML files.)

2601 06/04/2012 01:01 PM Aaron Marcuse-Kubitza

bin/map: Move retrieval of out_db's relational functions outside of process_input() so they can also be used by the non-by_col case

2600 06/04/2012 12:52 PM Aaron Marcuse-Kubitza

bin/map: out_is_db: Don't evaluate relational functions in xml_func.process() because these will be evaluated by db_xml.put()

2599 06/04/2012 12:41 PM Aaron Marcuse-Kubitza

xml_func.py: Removed no longer used strip()

2598 06/04/2012 12:40 PM Aaron Marcuse-Kubitza

bin/map: Use xml_func.process(..., strip=True) instead of xml_func.strip()

2597 06/04/2012 12:39 PM Aaron Marcuse-Kubitza

xml_func.py: process(): Added strip()'s functionality via strip option

2596 06/04/2012 12:10 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _merge relational function

2595 06/04/2012 11:48 AM Aaron Marcuse-Kubitza

schemas/functions.sql: Added join_strs() aggregate

2594 06/04/2012 10:21 AM Aaron Marcuse-Kubitza

sql.py: Renamed index_pkey() to add_pkey() to be consistent with add_index()

2593 06/04/2012 10:07 AM Aaron Marcuse-Kubitza

sql.py: into_table_name(): In function args, omit column name for function result columns

2592 06/04/2012 09:57 AM Aaron Marcuse-Kubitza

sql.py: into_table_name(): In function args, keep the input table name for input columns to identify where they came from, except for the main input table name because it makes the string too long

2591 06/04/2012 09:22 AM Aaron Marcuse-Kubitza

sql_gen.py: esc_name(): Don't return plain name if is_safe_name(), because this makes the SQL inconsistent when some names have "_"s and some don't

2590 06/04/2012 09:17 AM Aaron Marcuse-Kubitza

sql.py: index_pkey(): Use sql_gen.add_suffix() to ensure index name isn't too long

2589 06/04/2012 09:15 AM Aaron Marcuse-Kubitza

sql.py: put_table(): insert_out_pkeys, insert_in_pkeys: Use sql_gen.add_suffix() to ensure name isn't too long

2588 06/04/2012 09:07 AM Aaron Marcuse-Kubitza

sql.py: next_version(): Use new sql_gen.add_suffix(). Removed identifier_max_len because it is now in sql_gen.

2587 06/04/2012 09:07 AM Aaron Marcuse-Kubitza

sql_gen.py: Added identifier_max_len and add_suffix()

2586 06/04/2012 09:04 AM Aaron Marcuse-Kubitza

next_version(): Append the version # so it looks more natural. Take into account the max identifier length.

2585 06/04/2012 09:03 AM Aaron Marcuse-Kubitza

strings.py: Added add_suffix()

2584 06/04/2012 08:51 AM Aaron Marcuse-Kubitza

sql.py: put_table(): Name the in_table just "in" plus the version #, and the insert_in_pkeys/insert_out_pkeys based on in_table, so that they don't take up so much space in the SQL

2583 06/04/2012 08:50 AM Aaron Marcuse-Kubitza

sql_gen.py: is_safe_name(): Fixed bug where keywords were incorrectly considered safe
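Taken together with revisions 2575 and 2571, the rules for a "safe" name might look like the sketch below. This is an assumption-laden illustration rather than the code in sql_gen.py: the keyword set is a small hypothetical sample (PostgreSQL's real list is much longer), and esc_name() here simply always quotes, as revision 2591 describes.

```python
import re

# Hypothetical, deliberately incomplete sample of SQL keywords
KEYWORDS = frozenset(['select', 'from', 'where', 'table', 'user', 'order'])

# Lowercase letters, digits and "_" only, and no leading digit.
# Uppercase is excluded because unquoted identifiers are folded to
# lowercase by PostgreSQL, while quoted ones keep their case.
SAFE_NAME_RE = re.compile(r'[a-z_][a-z0-9_]*\Z')

def is_safe_name(name):
    """Whether name could be used unquoted without changing its meaning."""
    return bool(SAFE_NAME_RE.match(name)) and name not in KEYWORDS

def esc_name(name):
    """Always quote, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

for name in ['in_table', 'Plot', '2col', 'select']:
    print(name, is_safe_name(name), esc_name(name))
```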

2582 06/04/2012 08:40 AM Aaron Marcuse-Kubitza

strings.py: repr_no_u(): Fixed bug where "u" prefix was removed even in reprs of non-strings

2581 06/04/2012 08:32 AM Aaron Marcuse-Kubitza

db_xml.py: into_table_name(): Removed no longer necessary handling of simple functions, which is now done by sql.into_table_name(). Ensure that rank params in functions (not tables) are not treated specially as hierarchical.

2580 06/04/2012 08:21 AM Aaron Marcuse-Kubitza

sql.py: put_table(): If into == None: For function calls, include the arguments in the into table name

2579 06/04/2012 08:17 AM Aaron Marcuse-Kubitza

sql_gen.py: to_name_only_col(): Support non-Col Code inputs

2578 06/04/2012 07:42 AM Aaron Marcuse-Kubitza

sql_gen.py CompareCond.to_str(), callers of combine_conds(): Removed unnecessary grouping () to make SQL clearer

2577 06/04/2012 07:31 AM Aaron Marcuse-Kubitza

sql_gen.py: Added combine_conds() and use it in Join.to_str() and sql.py mk_select()

2576 06/04/2012 07:18 AM Aaron Marcuse-Kubitza

sql_gen.py Join.to_str(), sql.py mk_select(): Combining conditions: Don't add newlines where not needed, so that output is less vertically spread out

2575 06/04/2012 07:10 AM Aaron Marcuse-Kubitza

sql_gen.py: is_safe_name(): Fixed bug where names starting with a digit were incorrectly considered safe

2574 06/04/2012 07:06 AM Aaron Marcuse-Kubitza

sql.py: put_table(): Separate temp table names from into table name with "_" instead of "-" so that quoting the table name will usually be unnecessary

2573 06/04/2012 07:03 AM Aaron Marcuse-Kubitza

sql.py: esc_name_by_module(): Remove unused param ignore_case

2572 06/04/2012 06:59 AM Aaron Marcuse-Kubitza

sql_gen.py: esc_name(): If is_safe_name(), just return name, to avoid excessive escaping in debug output for Redmine

2571 06/04/2012 06:55 AM Aaron Marcuse-Kubitza

sql_gen.py: is_safe_name(): Don't consider uppercase letters safe because they would cause inconsistent behavior in PostgreSQL if quoted vs. not quoted (only unquoted identifiers are case-insensitive)

2570 06/04/2012 06:51 AM Aaron Marcuse-Kubitza

sql.py: Removed no longer needed check_name()

2569 06/04/2012 06:50 AM Aaron Marcuse-Kubitza

sql.py: esc_name_by_module(): psycopg2: If ignore_case is set but name is unsafe, just escape it instead of raising an exception

2568 06/04/2012 06:49 AM Aaron Marcuse-Kubitza

sql_gen.py: Added is_safe_name()

2567 06/04/2012 06:39 AM Aaron Marcuse-Kubitza

sql.py: put_table(): col_ustr(): Removed no longer needed sql_gen.as_Col() because mapping and join_cols now ensure that their contents are sql_gen.Col objects

2566 06/01/2012 08:29 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added _alt relational function

2565 06/01/2012 08:28 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Make mapping and join_cols a sql_gen.ColDict so that literal values will always be turned into sql_gen.Col objects. DuplicateKeyException: Use dict_subset_right_join() instead of dict_subset() so that all columns in a constraint are included in joins on out_table (such as for a relational function with omitted arguments).

2564 06/01/2012 08:25 PM Aaron Marcuse-Kubitza

sql_gen.py: Added ColDict

2563 06/01/2012 08:19 PM Aaron Marcuse-Kubitza

sql_gen.py: as_Col(): Added optional name param to specify that non-Col input will be renamed using NamedCol with the given name

2562 06/01/2012 07:06 PM Aaron Marcuse-Kubitza

sql.py: put_table(): FunctionValueException: Fixed bug where only function calls, not plain columns, were handled, by using sql_gen.unwrap_func_call() to remove any function call only if there was one

2561 06/01/2012 07:04 PM Aaron Marcuse-Kubitza

sql_gen.py: Added unwrap_func_call()

2560 06/01/2012 06:47 PM Aaron Marcuse-Kubitza

bin/map: by_col: Stripping XML functions not in the DB: Fixed bug where preserve_funcs.add() was used when `preserve_funcs |=` should have been used to add the entire iterable that sql.tables() returns
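The difference between the two set operations can be shown with plain Python sets. The names and values below are stand-ins; what sql.tables() actually returns is not shown in this log.

```python
preserve_funcs = set(['_ignore', '_ref'])
# Stand-in for the iterable of DB function names returned by sql.tables()
db_funcs = ('_merge', '_label')

# add() inserts its argument as ONE element. With a hashable iterable such
# as a tuple this silently stores the tuple itself; with a list or set it
# raises TypeError because those are unhashable.
buggy = set(preserve_funcs)
buggy.add(db_funcs)

# |= merges the elements of another set (update() accepts any iterable)
fixed = set(preserve_funcs)
fixed |= set(db_funcs)

print(sorted(fixed))
print('_merge' in buggy, '_merge' in fixed)
```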

2559 06/01/2012 06:45 PM Aaron Marcuse-Kubitza

sql.py: not_null_col: Changed value to 'not_null_col' so that column doesn't seem like a status indicator of whether some value is not null (in fact it's just a column that is always not null)

2558 06/01/2012 06:05 PM Aaron Marcuse-Kubitza

xml_func.py: Replaced xpath.get_1() with xpath.get_value() where possible, for simplicity

2557 06/01/2012 05:59 PM Aaron Marcuse-Kubitza

xml_func.py: strip(): Evaluate structural functions like _ignore and _ref by process() instead of removing them. Store structural functions' names in structural_funcs module var. This ensures that _ref targets are still expanded in column-based import.

2556 06/01/2012 05:56 PM Aaron Marcuse-Kubitza

xpath.py: get(): Create attrs: Put keys last so that any lookahead assertion's path will be created last as it would have without the assertion. This ensures that any value argument of an XML function will always go last even if a lookahead assertion would otherwise have caused it to be created with the element's keys, which previously were created before the attributes.

2555 06/01/2012 04:55 PM Aaron Marcuse-Kubitza

sql.py: put_table(): If is_func, default into table name ends in () instead of '-pkeys'

2554 06/01/2012 04:54 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql, functions.sql: Made cast functions STRICT to enable the RETURNS NULL ON NULL INPUT optimization

2553 06/01/2012 04:33 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Pass is_func to sql.put_table()

2552 06/01/2012 04:32 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Added is_func param for whether out_table is the name of a SQL function, not a table

2551 06/01/2012 04:09 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Treat every node name that starts with "_" as a function, not just members of put_table_special_funcs. This ensures that DB function args are always treated as values, not children with fkeys to parent.

2550 06/01/2012 03:40 PM Aaron Marcuse-Kubitza

bin/map: by_col: Strip only XML functions that are not in the DB

2549 06/01/2012 03:39 PM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Make special_funcs externally available as module constant put_table_special_funcs

2548 06/01/2012 03:38 PM Aaron Marcuse-Kubitza

sql.py: tables(): Changed schema param to schema_like and filter the schema using LIKE so that all schemas can be selected

2547 06/01/2012 01:56 PM Aaron Marcuse-Kubitza

to_do/timeline.doc: Updated to reflect the month we spent on optimization and column-based import

2546 06/01/2012 12:54 PM Aaron Marcuse-Kubitza

sql.py: put_table(): in_table name: Remove '-pkeys' suffix from the into table name before adding '-input' so that the name is shorter and clearer

2545 06/01/2012 12:43 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Wrap repr() calls for debug messages in strings.as_tt() to add Redmine formatting

2544 06/01/2012 12:39 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Output "Adding index" debug message with level=2.5 so it's not part of the Redmine steps

2543 05/31/2012 03:39 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql, functions.sql: Cast functions: Fixed bug where invalid value exceptions were not being caught, because implicit conversions to the return type apparently only happen outside the block containing the RETURN statement (i.e. at the end of the function). Fixed by adding explicit type conversion to return type, so that type conversion would happen inside try block.

2542 05/31/2012 03:31 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Re-enabled FunctionValueException handling, by just filtering out the value on all input columns that use the named function (since the error message does not specify which column it was that had the invalid value). This is in some ways better, anyway, because that way the invalid value is filtered out right away in all columns that could contain it, instead of potentially once for each column (if the value appears in more than one input column).

2541 05/31/2012 03:18 PM Aaron Marcuse-Kubitza

sql.py: add_index(): Fixed bug where expressions could not be converted to a string until their table name had been removed

2540 05/31/2012 03:17 PM Aaron Marcuse-Kubitza

sql_gen.py: Added Expr

2539 05/31/2012 03:13 PM Aaron Marcuse-Kubitza

sql.py: add_index(): Fixed bug where expressions needed to be enclosed in () to distinguish them from plain columns

2538 05/31/2012 03:06 PM Aaron Marcuse-Kubitza

sql.py: add_index(): Support simple expressions as well as columns

2537 05/31/2012 02:37 PM Aaron Marcuse-Kubitza

sql.py: Renamed index_col() to add_index() so its name isn't similar to index_cols()

2536 05/31/2012 02:33 PM Aaron Marcuse-Kubitza

sql_gen.py: FunctionCall: Removed repr() because it's a Code object and its to_str() does not take extra arguments

2535 05/31/2012 02:12 PM Aaron Marcuse-Kubitza

sql.py: run_query(): FunctionValueException: Expanded parsing to include regular function calls, not just relational functions' trigger functions. put_table(): Disabled FunctionValueException handling because this expands FunctionValueException beyond what put_table() could handle.

2534 05/31/2012 01:38 PM Aaron Marcuse-Kubitza

sql.py: put_table(): MissingCastException: Fixed bug where renaming of cast literal value was not properly propagated to the returned value of the function call, causing the query to assume that a DISTINCT ON column referred to a column in one of the joined tables instead of a named column in the SELECT columns list. This logic error would have been very difficult to catch without inspecting the code!

2533 05/31/2012 01:33 PM Aaron Marcuse-Kubitza

sql_gen.py: Added wrap_in_func()

2532 05/31/2012 01:25 PM Aaron Marcuse-Kubitza

sql_gen.py: FunctionCall: Filter args through remove_col_rename() to remove any renamings from the function args

2531 05/31/2012 01:20 PM Aaron Marcuse-Kubitza

sql.py: put_table(): No handler for exception: Print full exception instead of just first line to assist in debugging

2530 05/31/2012 01:06 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql, functions.sql: Removed _to* relational functions because type casting for those types is now automatic

2529 05/31/2012 01:02 PM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Removed _to* relational functions because type casting for those types is now automatic

2528 05/31/2012 12:59 PM Aaron Marcuse-Kubitza

schemas/functions.sql: Added cast functions for _to* relational functions

2527 05/31/2012 12:58 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Changed cast functions' input types to text because type must match exactly, not just be implicitly castable

2526 05/31/2012 12:47 PM Aaron Marcuse-Kubitza

sql.py: run_query(): MissingCastException parsing: Support multiple-word types

2525 05/31/2012 12:38 PM Aaron Marcuse-Kubitza

sql.py: put_table(): Handle MissingCastExceptions by attempting to call a function with the name of the type on the column