/ - Changes - BIEN 3 - NCEAS Projects

root @ 3013

#	Date	Author	Comment
3013	06/21/2012 10:24 PM	Aaron Marcuse-Kubitza	import.stats.xls: Changed \/row (ms)" to "ms/row"
3012	06/21/2012 09:52 PM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): INSERT IGNORE: Used typed vars to store each column of the select query because the select query's column names are not known, so it would be unknown which record var column to use for each output column. (Add types so there are no unknown-type values that PostgreSQL will say can't be cast.)
3011	06/21/2012 09:21 PM	Aaron Marcuse-Kubitza	sql.py: insert_select(): In ignore mode, always recover from errors
3010	06/21/2012 09:14 PM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): INSERT IGNORE: Implemented simple cursor loop that just inserts each row, without ignoring duplicate keys
3009	06/21/2012 08:26 PM	Aaron Marcuse-Kubitza	sql.py: mk_insert_select(): Added ignore param to do an INSERT IGNORE equivalent operation. Added basic structure for INSERT IGNORE. Support embeddable queries without RETURNING columns (which is the case for most INSERT IGNORE queries).
3008	06/21/2012 06:50 PM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Always set into.schema to None, so that table is created in the temp schema in debug_temp mode
3007	06/21/2012 06:25 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.DbCursor.execute(): Don't cache ADD COLUMN queries because the rest of the query must be unique in order to use caching
3006	06/21/2012 06:24 PM	Aaron Marcuse-Kubitza	sql.py: add_col(): Don't ignore already-existing columns because sometimes name truncation causes collisions, requiring the caller to version the name. add_index_col(): Version the column name to avoid collisions.
3005	06/21/2012 06:08 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Version index names to avoid collisions
3004	06/21/2012 05:58 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.DbCursor.execute(): Always cache structural changes, not just if they throw an exception
3003	06/21/2012 05:40 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Fixed bug where expr could not be deep-copied until ensure_not_null() had been run on it, because ensure_not_null() modifies the index_cols of the table and this needs to be modified on the original table
3002	06/21/2012 05:39 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Fixed bug where expr could not be deep-copied until ensure_not_null() had been run on it, because ensure_not_null() modifies the index_cols of the table and this needs to be modified on the original table
3001	06/21/2012 05:12 PM	Aaron Marcuse-Kubitza	sql_gen.py: EnsureNotNull: Added to_str() which uses the column's index col if available. This works around a PostgreSQL bug where indexes on expressions are sometimes not used, but indexes on plain columns are used.
3000	06/21/2012 05:09 PM	Aaron Marcuse-Kubitza	sql_gen.py: Store index cols in the table instead of in the column, because columns are often recreated from the table and name but tables are generally not copied
2999	06/21/2012 04:21 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added is_indexed_col()
2998	06/21/2012 04:15 PM	Aaron Marcuse-Kubitza	sql.py: add_index(): Use new sql.ensure_not_null(), which creates a separate column to store the index expr where possible
2997	06/21/2012 04:12 PM	Aaron Marcuse-Kubitza	sql.py: Added add_not_null(), add_index_col(), ensure_not_null()
2996	06/21/2012 04:10 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added is_temp_col()
2995	06/21/2012 03:57 PM	Aaron Marcuse-Kubitza	sql_gen.py: concat(): Also preserve cast suffix
2994	06/21/2012 03:44 PM	Aaron Marcuse-Kubitza	sql_gen.py: Col: Added index_col attr
2993	06/21/2012 03:19 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added suffixed_col()
2992	06/21/2012 02:59 PM	Aaron Marcuse-Kubitza	sql.py: run_query_into(): Set the into table to be a temp table
2991	06/21/2012 02:59 PM	Aaron Marcuse-Kubitza	sql_gen.py: Table: Store whether table is temp table
2990	06/21/2012 01:50 PM	Aaron Marcuse-Kubitza	sql_gen.py: concat(): Also preserve appended column names
2989	06/21/2012 01:48 PM	Aaron Marcuse-Kubitza	sql_gen.py: Col.to_str(): Take for_str param which does both concat() and str()/clean_name() instead of use_concat param which doesn't remove quotes from the strings before concatenating, causing strings to be incorrectly truncated
2988	06/20/2012 09:46 PM	Aaron Marcuse-Kubitza	sql_gen.py: EnsureNotNull: Cast the null sentinel to the type it's the sentinel for, so that COALESCE returns the same type as the provided type, instead of potentially incorrectly assuming the constant's type. This fixes a bug where COALESCE returned an incorrect type.
2987	06/20/2012 09:38 PM	Aaron Marcuse-Kubitza	sql.py: cast(): Use new sql_gen.Cast
2986	06/20/2012 09:38 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added Cast
2985	06/20/2012 09:37 PM	Aaron Marcuse-Kubitza	sql_gen.py: concat(): Consider ")"s part of the existing suffix, so that parentheses match up
2984	06/20/2012 09:05 PM	Aaron Marcuse-Kubitza	Removed extra trailing whitespace
2983	06/20/2012 08:36 PM	Aaron Marcuse-Kubitza	main Makefile: Added explicit schemas/temp/reinstall target because vegbiendev (Ubuntu 12.04) seems to ignore %/reinstall
2982	06/20/2012 08:24 PM	Aaron Marcuse-Kubitza	vegbien_dest: Added temp to schemas list so the intermediate tables will be created there, where they are easy to remove with `make schemas/temp/reinstall`
2981	06/20/2012 08:23 PM	Aaron Marcuse-Kubitza	main Makefile: Added %/reinstall, which calls the corresponding uninstall and install targets
2980	06/20/2012 08:11 PM	Aaron Marcuse-Kubitza	Added temp schema to store intermediate tables permanently when desired for debugging. It is not part of schemas/all because it should not be remade to include current intermediate tables when remaking other schema exports.
2979	06/20/2012 08:04 PM	Aaron Marcuse-Kubitza	schemas: Moved *.sql.make into Makefile to take advantage of % pattern matching
2978	06/20/2012 07:47 PM	Aaron Marcuse-Kubitza	Removed no longer used schemas/vegbien_empty.sql
2977	06/20/2012 07:46 PM	Aaron Marcuse-Kubitza	main Makefile: Removed empty_db, because `make schemas/reinstall` has the same effect and is simpler
2976	06/20/2012 07:40 PM	Aaron Marcuse-Kubitza	README.TXT: Changed documentation to use make schemas/reinstall to empty the DB, since that command is simpler. Added how to archive the last import.
2975	06/20/2012 07:11 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Removed `if not db.debug_temp` check because that is done by sql.empty_temp()
2974	06/20/2012 07:10 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Use new empty_temp()
2973	06/20/2012 07:06 PM	Aaron Marcuse-Kubitza	import.stats.xls: Added comments for estimated numbers. Added "," separators to large numbers.
2972	06/20/2012 06:21 PM	Aaron Marcuse-Kubitza	sql.py: empty_temp(): In debug_temp mode, leave temp tables there for debugging
2971	06/20/2012 06:06 PM	Aaron Marcuse-Kubitza	sql.py: empty_temp(): Don't output at log_level 2 because it's an internal query, not part of the core algorithm
2970	06/20/2012 06:06 PM	Aaron Marcuse-Kubitza	sql.py: truncate(): Added kw_args to pass to run_query()
2969	06/20/2012 05:52 PM	Aaron Marcuse-Kubitza	Added inputs/import.stats.xls, which compares row-based and column-based import. This shows that column-based import is slowed down by table locking when run simultaneously, so we will need a new INSERT IGNORE replacement that doesn't lock tables.
2968	06/20/2012 03:14 PM	Aaron Marcuse-Kubitza	inputs: Ignore OpenOffice.org lock files
2967	06/20/2012 02:19 PM	Aaron Marcuse-Kubitza	sql.py: empty_temp(): Don't print log message if not emptying any tables
2966	06/20/2012 02:16 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Empty unneeded temp tables to free up memory
2965	06/20/2012 02:14 PM	Aaron Marcuse-Kubitza	sql.py: Added empty_temp()
2964	06/20/2012 02:14 PM	Aaron Marcuse-Kubitza	sql.py: Use new lists.mk_seq()
2963	06/20/2012 02:13 PM	Aaron Marcuse-Kubitza	lists.py: Added mk_seq()
2962	06/20/2012 02:11 PM	Aaron Marcuse-Kubitza	lists.py: is_seq(): Also return true for sets
2961	06/19/2012 03:02 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: specimenreplicate: Added indexes using COALESCE to match what sql_gen does
2960	06/19/2012 02:08 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Getting output table pkeys of existing/inserted rows: Do a DISTINCT ON the input pkey (row_num) in case the plain JOIN matched multiple output table rows for one input table row
2959	06/19/2012 01:44 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Empty unneeded temp tables to free up memory and avoid running out of memory (the temp tables seem to be in-memory only)
2958	06/19/2012 01:30 PM	Aaron Marcuse-Kubitza	sql_gen.py: null_sentinels: Added value for type timestamp with time zone. Put each type on its own line for clarity.
2957	06/19/2012 01:03 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: locationdetermination: Changed indexes to use COALESCE to match what sql_gen now does
2956	06/19/2012 12:23 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: location: Added indexes using COALESCE to match what sql_gen does
2955	06/19/2012 12:06 PM	Aaron Marcuse-Kubitza	sql.py: cast_temp_col(): Add an index on the created column
2954	06/19/2012 11:55 AM	Aaron Marcuse-Kubitza	sql_gen.py: null_sentinels: Added value for type double precision
2953	06/19/2012 11:52 AM	Aaron Marcuse-Kubitza	sql_gen.py: ensure_not_null(): Warn of no null sentinel for type, even if caller catches error
2952	06/19/2012 10:11 AM	Aaron Marcuse-Kubitza	schemas/py_functions.sql: Added plain function _namePart() and use it in trigger function _namePart()
2951	06/19/2012 09:42 AM	Aaron Marcuse-Kubitza	schemas/py_functions.sql: Added plain functions _dateRangeStart() and _dateRangeEnd() and use them in trigger functions _dateRangeStart() and _dateRangeEnd()
2950	06/19/2012 09:28 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: _label(): Ensure that label is NOT NULL so it doesn't NULL out the entire string
2949	06/19/2012 09:23 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added plain function _nullIf() and use it in trigger function _nullIf()
2948	06/19/2012 08:56 AM	Aaron Marcuse-Kubitza	sql.py: DbConn.DbCursor._cache_result(): Corrected comment to reflect why different types of queries are cached differently
2947	06/19/2012 08:46 AM	Aaron Marcuse-Kubitza	sql.py: add_col(): Catch DuplicateExceptions so that columns that already existed are ignored
2946	06/19/2012 08:43 AM	Aaron Marcuse-Kubitza	sql.py: run_query(): DuplicateException: Also match "column already exists" errors
2945	06/19/2012 08:20 AM	Aaron Marcuse-Kubitza	sql.py: Merged DuplicateTableException and DuplicateFunctionException into one exception DuplicateException, with a type variable for the type of duplicate item. Added ExceptionWithNameType.
2944	06/19/2012 08:05 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Fixed bug where external function calls needed to be schema-qualified in case functions schema is not in the search_path
2943	06/19/2012 07:59 AM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added plain function _label() and use it in trigger function _label()
2942	06/19/2012 07:50 AM	Aaron Marcuse-Kubitza	sql.py: put_table(): Support plain SQL functions in addition to relational functions
2941	06/18/2012 11:08 PM	Aaron Marcuse-Kubitza	sql_gen.py: Added NamedArg. FunctionCall: Support named arguments (http://www.postgresql.org/docs/9.0/static/sql-syntax-calling-funcs.html).
2940	06/18/2012 10:54 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added plain function _merge() and use it in trigger function _merge()
2939	06/18/2012 10:49 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Added plain function _alt() and use it in trigger function _alt()
2938	06/18/2012 10:37 PM	Aaron Marcuse-Kubitza	schemas/functions.sql: Removed no longer used ensure_not_null()
2937	06/18/2012 10:22 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): MissingCastException: Use cast_temp_col() so that cast will occur before any main insert, which locks the output table and should take as little time as possible
2936	06/18/2012 10:18 PM	Aaron Marcuse-Kubitza	sql.py: Added cast_temp_col()
2935	06/18/2012 10:17 PM	Aaron Marcuse-Kubitza	sql.py: add_col(): Support additional run_query() kw_args. add_row_num(): Use new add_col().
2934	06/18/2012 10:09 PM	Aaron Marcuse-Kubitza	sql.py: Added add_col()
2933	06/18/2012 08:17 PM	Aaron Marcuse-Kubitza	sql_gen.py: Col.__str__(): Truncate any table name using concat() to ensure that the full column name is included in the string
2932	06/18/2012 07:59 PM	Aaron Marcuse-Kubitza	strings.py, sql_gen.py: Renamed add_suffix() to concat() to reflect that this is a fixed-length replacement for +
2931	06/18/2012 07:49 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Moved MissingCastException to the top of the exceptions list because it's more of a core exception than the others, and will be raised before any rows are even inserted
2930	06/18/2012 06:20 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.with_savepoint(): Always release savepoint, because after ROLLBACK TO SAVEPOINT, "The savepoint remains valid and can be rolled back to again" (http://www.postgresql.org/docs/8.3/static/sql-rollback-to.html). Moved `self._savepoint -= 1` to the main try block's new finally block.
2929	06/18/2012 05:59 PM	Aaron Marcuse-Kubitza	sql.py: put_table(): Lock output table right before, and in the same nested transaction as, the insert statement that needs lock, so that it is not released in a prior autocommit and is held for as little time as possible
2928	06/18/2012 05:38 PM	Aaron Marcuse-Kubitza	db_xml.py: put_table(): Removed no longer needed commit param
2927	06/18/2012 05:16 PM	Aaron Marcuse-Kubitza	bin/map: Removed rollback() call before closing the connection because PostgreSQL does this automatically
2926	06/18/2012 05:14 PM	Aaron Marcuse-Kubitza	csv2db: Log inserts with log_level=5 so they are not shown for verbosity 4, which is used to see the savepoints and autocommits
2925	06/18/2012 05:13 PM	Aaron Marcuse-Kubitza	Removed unnecessary db.db.commit() calls because commits are now done automatically by DbConn's autocommit mode
2924	06/18/2012 04:54 PM	Aaron Marcuse-Kubitza	sql.py: DbConn.do_autocommit(): Output the "Autocommitting" debug message with level=4 so that it doesn't clutter up the logging output for normal verbosities
2923	06/18/2012 04:50 PM	Aaron Marcuse-Kubitza	DbConn: autocommit mode defaults to True so that all scripts get the benefit of automatic commits
2922	06/18/2012 04:49 PM	Aaron Marcuse-Kubitza	input.Makefile: Staging tables: import/install-%: Include the table name in the log file name so that successive tables for the same datasource don't overwrite the same log file
2921	06/18/2012 04:39 PM	Aaron Marcuse-Kubitza	sql.py: DbConn: Don't always autocommit in debug_temp mode, because this could cause autocommit mode to be turned on when the user does not expect it
2920	06/18/2012 04:36 PM	Aaron Marcuse-Kubitza	bin/map: connect_db(): Autocommit in commit mode to avoid the need for manual commits. This should also reduce the time that table locks are held, to avoid unnecessary contention when multiple processes are trying to insert into the same output table. (The program always uses nested transactions to support rollbacks, so there is no problem autocommitting whenever a top-level nested transaction or top-level query completes.)
2919	06/18/2012 04:29 PM	Aaron Marcuse-Kubitza	sql_gen.py: Removed TempFunction because that functionality is now provided by DbConn.TempFunction()
2918	06/18/2012 04:28 PM	Aaron Marcuse-Kubitza	sql.py: Use new DbConn.TempFunction()
2917	06/18/2012 04:28 PM	Aaron Marcuse-Kubitza	sql.py: DbConn: Added TempFunction()
2916	06/18/2012 04:25 PM	Aaron Marcuse-Kubitza	sql.py: Use new DbConn.debug_temp config option to control whether temporary objects should instead be permanent
2915	06/18/2012 04:20 PM	Aaron Marcuse-Kubitza	sql.py: DbConn: Added config option debug_temp
2914	06/18/2012 04:12 PM	Aaron Marcuse-Kubitza	sql.py: function_exists(): Fixed bug where trigger functions needed to be excluded, since they cannot be called directly

Project

General

Profile