Project

General

Profile

Statistics
| Revision:
Name Size Revision Age Author Comment
  _archive 1598 almost 13 years Aaron Marcuse-Kubitza Moved _archive/tapir2flatClient/trunk/client/ t...
  analysis 3076 over 12 years Aaron Marcuse-Kubitza Added top-level analysis dir for range modeling
  bin 3149 over 12 years Aaron Marcuse-Kubitza csv2db: Fixed bug where CREATE TABLE statement ...
  config 272 about 13 years Aaron Marcuse-Kubitza Moved bien_password to new config dir
  inputs 3180 over 12 years Aaron Marcuse-Kubitza mappings/DwC2-VegBIEN.specimens.csv: Mapped ins...
  lib 3175 over 12 years Aaron Marcuse-Kubitza db_xml.py: put_table(): Subsetting in_table: Pr...
  mappings 3180 over 12 years Aaron Marcuse-Kubitza mappings/DwC2-VegBIEN.specimens.csv: Mapped ins...
  schemas 3178 over 12 years Aaron Marcuse-Kubitza schemas/vegbien.sql: specimenreplicate: UNIQUE ...
  to_do 2547 over 12 years Aaron Marcuse-Kubitza to_do/timeline.doc: Updated to reflect the mont...
Makefile 10.4 KB 3156 over 12 years Aaron Marcuse-Kubitza main Makefile: python-Darwin: Added pip install...
README.TXT 2.9 KB 3133 over 12 years Aaron Marcuse-Kubitza input.Makefile: Added import/steps.by_col.sql t...
map 1.21 KB 3140 over 12 years Aaron Marcuse-Kubitza top-level map: Added support for custom public ...

Latest revisions

# Date Author Comment
3180 07/02/2012 08:33 AM Aaron Marcuse-Kubitza

mappings/DwC2-VegBIEN.specimens.csv: Mapped institutionCode. This will enable datasources to use specimenreplicate's institution_id index for duplicate elimination.

3179 07/02/2012 08:31 AM Aaron Marcuse-Kubitza

input.Makefile: Prompt user to accept test, instead of providing command line func for doing so

3178 07/02/2012 07:45 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: specimenreplicate: UNIQUE INDEX on catalognumber_dwc: Added institution_id so that datasources that specify it (such as aggregators) will not need to have catalognumbers be globally unique. Once the institution_id is mapped to, this will fix a bug where rows with the same catalognumber were assumed to be duplicates even though they were from different institutions. This should also avoid the need to do any duplicate elimination joins when importing specimenreplicate, speeding up column-based import.

3177 07/02/2012 07:32 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: specimenreplicate: Renamed museum_id to institution_id to correspond with DwC's institutionCode, so that it would be more obvious where to map institutionCode fields to

3176 07/02/2012 07:16 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated to include run times for rest of datasources for most recent column-based import

3175 06/29/2012 08:09 AM Aaron Marcuse-Kubitza

db_xml.py: put_table(): Subsetting in_table: Prepend schema to subset table name so that in pg_stat_activity, it's clear which datasource a particular query is from

3174 06/29/2012 07:46 AM Aaron Marcuse-Kubitza

sql_io.py: cast_temp_col(): add_col()'s distinguishing comment param: Add the type in case the same input column is being cast to different types, and both types have the same first word (causing their new column names to be the same)

3173 06/29/2012 07:42 AM Aaron Marcuse-Kubitza

sql_io.py: cast_temp_col(): Name the new column with only the first word of the type, to save space in the limited identifier length

3172 06/29/2012 07:41 AM Aaron Marcuse-Kubitza

strings.py: Added first_word()

3171 06/29/2012 07:35 AM Aaron Marcuse-Kubitza

sql_io.py: cast_temp_col(): Use sql_gen.suffixed_col() to create the new column name

View all revisions | View revisions

Also available in: Atom