| Name | Size | Revision | Age | Author | Comment |
|---|---|---|---|---|---|
| _archive | | 1598 | almost 13 years | Aaron Marcuse-Kubitza | Moved _archive/tapir2flatClient/trunk/client/ t... |
| analysis | | 3076 | over 12 years | Aaron Marcuse-Kubitza | Added top-level analysis dir for range modeling |
| bin | | 3149 | over 12 years | Aaron Marcuse-Kubitza | csv2db: Fixed bug where CREATE TABLE statement ... |
| config | | 272 | about 13 years | Aaron Marcuse-Kubitza | Moved bien_password to new config dir |
| inputs | | 3170 | over 12 years | Aaron Marcuse-Kubitza | inputs/import.stats.xls: Added run time for SAL... |
| lib | | 3172 | over 12 years | Aaron Marcuse-Kubitza | strings.py: Added first_word() |
| mappings | | 2529 | over 12 years | Aaron Marcuse-Kubitza | mappings/DwC2-VegBIEN.specimens.csv: Removed _t... |
| schemas | | 3163 | over 12 years | Aaron Marcuse-Kubitza | schemas/vegbien.sql: location: Dropped unique c... |
| to_do | | 2547 | over 12 years | Aaron Marcuse-Kubitza | to_do/timeline.doc: Updated to reflect the mont... |
| Makefile | 10.4 KB | 3156 | over 12 years | Aaron Marcuse-Kubitza | main Makefile: python-Darwin: Added pip install... |
| README.TXT | 2.9 KB | 3133 | over 12 years | Aaron Marcuse-Kubitza | input.Makefile: Added import/steps.by_col.sql t... |
| map | 1.21 KB | 3140 | over 12 years | Aaron Marcuse-Kubitza | top-level map: Added support for custom public ... |

Latest revisions

# Date Author Comment
3172 06/29/2012 07:41 AM Aaron Marcuse-Kubitza

strings.py: Added first_word()

3171 06/29/2012 07:35 AM Aaron Marcuse-Kubitza

sql_io.py: cast_temp_col(): Use sql_gen.suffixed_col() to create the new column name

3170 06/29/2012 06:16 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Added run time for SALVIAS organisms, which just finished

3169 06/29/2012 06:14 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Use [1]-style footnotes because copying and pasting to Gmail doesn't preserve the superscripts

3168 06/29/2012 06:11 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated for latest simultaneous column-based import

3167 06/29/2012 04:42 AM Aaron Marcuse-Kubitza

sql_io.py: cast_temp_col(): Don't automatically create an index on the new column, because it doesn't necessarily need an index and the main index used for the join is now added automatically by distinct_table()

3166 06/29/2012 04:39 AM Aaron Marcuse-Kubitza

sql.py: flatten(): Don't automatically create indexes on all columns, because most columns don't need indexes and the main index used for the join is now added automatically by distinct_table()

3165 06/29/2012 04:35 AM Aaron Marcuse-Kubitza

sql.py: Removed no longer needed add_index_col() and ensure_not_null() because we are not using index columns

3164 06/29/2012 04:33 AM Aaron Marcuse-Kubitza

sql.py: add_index(): Don't create index columns for nullable columns, because they require indexes to be created on all columns in order to use a distinct_table() temp table. Also, now that we are no longer using LEFT JOINs, the COALESCE call would only be evaluated once (in the plain JOIN) in the event that PostgreSQL doesn't use an index on a COALESCE expression.

3163 06/29/2012 03:30 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: location: Dropped unique constraint on lat/long because it covered only some rows, which interfered with column-based import's selection of different insert methods based on the presence or absence of duplicate keys. (With the constraint, locations with coordinates would have duplicates eliminated, but locations without coordinates would not be able to find which row was added for a particular location because there was no lookup key to join on, and would all just use the first inserted row.) The previous behavior didn't make much sense anyway, because it would assert that two locationevents occurred in the same place just because they had the same coordinates, which may not have been precise enough to make this determination. Asserting that two locationevents occurred in the same place is really part of the secondary validation, not the import process.
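
The "covered only some rows" behavior described in revision 3163 can be illustrated with a small sketch. This is not the project's code: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table layout, column names, and the helper count_rows_after_inserts() are hypothetical. Both engines treat NULLs as distinct in a unique constraint by default, so rows without coordinates are never deduplicated.

```python
import sqlite3


def count_rows_after_inserts(rows):
    """Insert (latitude, longitude) pairs into a table that has a unique
    constraint on both columns, skipping duplicates, and return the number
    of rows actually stored.

    Hypothetical helper; the schema is an assumption for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE location ("
        " id INTEGER PRIMARY KEY,"
        " latitude REAL,"
        " longitude REAL,"
        " UNIQUE (latitude, longitude))"
    )
    # INSERT OR IGNORE is SQLite syntax: rows that violate the unique
    # constraint are silently skipped instead of raising an error.
    conn.executemany(
        "INSERT OR IGNORE INTO location (latitude, longitude) VALUES (?, ?)",
        rows,
    )
    (count,) = conn.execute("SELECT COUNT(*) FROM location").fetchone()
    conn.close()
    return count


# Two locations with identical coordinates collapse into one row.
print(count_rows_after_inserts([(10.5, -84.0), (10.5, -84.0)]))  # 1

# Two locations without coordinates are both kept: NULLs never compare
# equal, so the constraint does not apply to them.
print(count_rows_after_inserts([(None, None), (None, None)]))  # 2
```

The mix of outcomes (one row in the first case, two in the second) is the partial coverage the commit message refers to: an importer cannot pick a single insert strategy based on whether duplicate keys are possible.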
