Name        Size     Revision  Age              Author                 Comment
_archive             1598      almost 13 years  Aaron Marcuse-Kubitza  Moved _archive/tapir2flatClient/trunk/client/ t...
analysis             3076      over 12 years    Aaron Marcuse-Kubitza  Added top-level analysis dir for range modeling
bin                  3149      over 12 years    Aaron Marcuse-Kubitza  csv2db: Fixed bug where CREATE TABLE statement ...
config               272       about 13 years   Aaron Marcuse-Kubitza  Moved bien_password to new config dir
inputs               3163      over 12 years    Aaron Marcuse-Kubitza  schemas/vegbien.sql: location: Dropped unique c...
lib                  3164      over 12 years    Aaron Marcuse-Kubitza  sql.py: add_index(): Don't create index columns...
mappings             2529      over 12 years    Aaron Marcuse-Kubitza  mappings/DwC2-VegBIEN.specimens.csv: Removed _t...
schemas              3163      over 12 years    Aaron Marcuse-Kubitza  schemas/vegbien.sql: location: Dropped unique c...
to_do                2547      over 12 years    Aaron Marcuse-Kubitza  to_do/timeline.doc: Updated to reflect the mont...
Makefile    10.4 KB  3156      over 12 years    Aaron Marcuse-Kubitza  main Makefile: python-Darwin: Added pip install...
README.TXT  2.9 KB   3133      over 12 years    Aaron Marcuse-Kubitza  input.Makefile: Added import/steps.by_col.sql t...
map         1.21 KB  3140      over 12 years    Aaron Marcuse-Kubitza  top-level map: Added support for custom public ...

Latest revisions

# Date Author Comment
3164 06/29/2012 04:33 AM Aaron Marcuse-Kubitza

sql.py: add_index(): Don't create index columns for nullable columns, because they require indexes to be created on all columns in order to use a distinct_table() temp table. Also, now that we are no longer using LEFT JOINs, the COALESCE call would only be evaluated once (in the plain JOIN) in the event that PostgreSQL doesn't use an index on a COALESCE expression.

3163 06/29/2012 03:30 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: location: Dropped unique constraint on lat/long because it covered only some rows, which interfered with column-based import's selection of different insert methods based on the presence or absence of duplicate keys. (With the constraint, locations with coordinates would have duplicates eliminated, but locations without coordinates would not be able to find which row was added for a particular location because there was no lookup key to join on, and would all just use the first inserted row.) The previous behavior didn't make much sense anyway, because it would assert that two locationevents occurred in the same place just because they had the same coordinates, which may not have been precise enough to make this determination. Asserting that two locationevents occurred in the same place is really part of the secondary validation, not the import process.

3162 06/29/2012 01:58 AM Aaron Marcuse-Kubitza

sql.py: DbConn: Fixed bug where Exceptions did not have the query appended if the query was not run in cacheable mode, by moving _add_cursor_info() from DbCursor.execute() to run_query() so it would also get called for non-cacheable queries that use a native cursor rather than a wrapper. Fixed bug where non-cacheable queries were not autocommitted, by moving self.do_autocommit() from DbCursor.execute() to run_query() so it would also get called for non-cacheable queries that use a native cursor rather than a wrapper.
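
The idea behind this fix can be sketched as follows. This is only an illustration, not the project's sql.py: it uses the standard-library sqlite3 module in place of PostgreSQL, and the standalone run_query() function and its message format are assumptions made for the example. The point is that the error annotation and the commit live in the one function every query passes through, so queries run on a native cursor get them too.

```python
import sqlite3

def run_query(conn, query):
    """Run a query on a native cursor, annotating errors and committing.

    Hypothetical stand-in for the run_query() named in the commit message.
    """
    cursor = conn.cursor()  # native cursor, no caching wrapper
    try:
        cursor.execute(query)
    except Exception as e:
        # Append the failing query to the exception message, then re-raise
        e.args = (str(e) + "\nquery: " + query,)
        raise
    conn.commit()  # commit here so every code path is covered
    return cursor

conn = sqlite3.connect(":memory:")
run_query(conn, "CREATE TABLE location (id INTEGER)")
try:
    run_query(conn, "SELECT * FROM missing_table")
except sqlite3.OperationalError as e:
    error_message = str(e)
```

After the failed query, error_message contains both the database's own error text and the query that triggered it.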

3161 06/29/2012 01:54 AM Aaron Marcuse-Kubitza

sql.py: DbConn._db(): Record that a transaction is already open before setting the search_path so that a query is never run with an _savepoint value less than 1 (manual transactions are not supported yet)

3160 06/29/2012 01:52 AM Aaron Marcuse-Kubitza

sql.py: DbConn.with_savepoint(): Increment _savepoint before running queries so they don't get autocommitted
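
Why the ordering matters can be shown with a toy model. The class below is a hypothetical sketch, not the real DbConn: it records queries in a list instead of talking to a database, and it assumes that a query is autocommitted only when the _savepoint counter is at its base level (0 in this sketch). If the counter were incremented after issuing SAVEPOINT, that statement would itself be autocommitted.

```python
class DbConnSketch:
    """Toy connection that logs queries instead of executing them."""

    def __init__(self):
        self._savepoint = 0  # assumed base level: no savepoint active
        self.log = []

    def run_query(self, query):
        self.log.append(query)
        if self._savepoint == 0:  # assumed autocommit condition
            self.log.append("COMMIT")

    def with_savepoint(self, func):
        name = "level_%d" % self._savepoint
        self._savepoint += 1  # increment first, so nothing below autocommits
        try:
            self.run_query("SAVEPOINT " + name)
            try:
                result = func()
            except Exception:
                self.run_query("ROLLBACK TO SAVEPOINT " + name)
                raise
            self.run_query("RELEASE SAVEPOINT " + name)
            return result
        finally:
            self._savepoint -= 1  # restore the previous nesting level

db = DbConnSketch()
db.with_savepoint(lambda: db.run_query("INSERT INTO location DEFAULT VALUES"))
inside_log = list(db.log)
db.run_query("SELECT 1")
```

In this sketch, inside_log holds the SAVEPOINT, INSERT and RELEASE statements with no COMMIT between them, while the later top-level SELECT is followed by one.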

3159 06/29/2012 01:10 AM Aaron Marcuse-Kubitza

sql.py: empty_temp(): Empty temp tables even in debug_temp mode, so that it can be seen which tables have been garbage collected and disk space leaks can be detected. This will not affect the external re-runnability of slow queries in debug_temp mode, as long as the user aborts the debug_temp import while the slow query is still running.

3158 06/29/2012 01:07 AM Aaron Marcuse-Kubitza

sql_gen.py: ColDict: Use OrderedDict so that order of keys in input dict (if ordered) will be preserved. This should ensure that temp table unique indexes have their columns in the same order as the output table, so that a merge join can be used.

3157 06/29/2012 01:01 AM Aaron Marcuse-Kubitza

util.py: dict_subset(): Use OrderedDict so that order of keys in input dict (if ordered) will be preserved
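
A minimal sketch of an order-preserving subset, assuming the function takes a mapping and a collection of wanted keys (the actual signature and body in util.py may differ). Here the result follows the order of the input dict rather than the order of the requested keys, which is one reading of the commit message; the column names are made up for the example.

```python
from collections import OrderedDict

def dict_subset(dict_, keys):
    """Return the entries of dict_ whose keys are in keys.

    Entries come out in the order they appear in dict_.
    """
    wanted = set(keys)
    subset = OrderedDict()
    for key, value in dict_.items():
        if key in wanted:
            subset[key] = value
    return subset

cols = OrderedDict([("latitude", 1), ("longitude", 2), ("elevation", 3)])
result = dict_subset(cols, ["elevation", "latitude"])
```

With a plain dict on the Python versions of the time, the key order of the result would have been arbitrary, which is what the switch to OrderedDict avoids.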

3156 06/29/2012 12:55 AM Aaron Marcuse-Kubitza

main Makefile: python-Darwin: Added pip installation instructions. python-Linux: Added ordereddict.

3155 06/29/2012 12:04 AM Aaron Marcuse-Kubitza

sql.py: DbConn.col_info(): cacheable param defaults to True now that callers explicitly turn off cacheable when needed
