Name Size Revision Age Author Comment
  _archive 1598 almost 13 years Aaron Marcuse-Kubitza Moved _archive/tapir2flatClient/trunk/client/ t...
  analysis 3076 over 12 years Aaron Marcuse-Kubitza Added top-level analysis dir for range modeling
  backups 4751 over 12 years Aaron Marcuse-Kubitza backups/Makefile: Backups: Full DB: Specify the...
  bin 5887 about 12 years Aaron Marcuse-Kubitza my2pg*: Turn off escape_string_warning because ...
  config 272 about 13 years Aaron Marcuse-Kubitza Moved bien_password to new config dir
  inputs 5904 about 12 years Aaron Marcuse-Kubitza mappings/VegCore-VegBIEN.csv: accepted* taxonla...
  lib 5903 about 12 years Aaron Marcuse-Kubitza sql.py: distinct_table(): Use DISTINCT ON inste...
  mappings 5904 about 12 years Aaron Marcuse-Kubitza mappings/VegCore-VegBIEN.csv: accepted* taxonla...
  schemas 5897 about 12 years Aaron Marcuse-Kubitza schemas/vegbien.sql: Functions containing UPDAT...
  to_do 4524 over 12 years Aaron Marcuse-Kubitza to_do/timeline.doc: Updated to reflect addition...
  validation 4523 over 12 years Aaron Marcuse-Kubitza Added validation/
Makefile 9.87 KB 5679 over 12 years Aaron Marcuse-Kubitza root Makefile: VegBIEN DB: Schemas: schemas/rot...
README.TXT 12.7 KB 5881 about 12 years Aaron Marcuse-Kubitza README.TXT: Datasource setup: Replaced manual `...
map 989 Bytes 5158 over 12 years Aaron Marcuse-Kubitza root map: Removed no longer needed public schem...
new_terms.csv 30.4 KB 4887 over 12 years Aaron Marcuse-Kubitza Regenerated root unmapped_terms.csv, new_terms.csv
unmapped_terms.csv 5.8 KB 4887 over 12 years Aaron Marcuse-Kubitza Regenerated root unmapped_terms.csv, new_terms.csv

Latest revisions

# Date Author Comment
5904 11/01/2012 01:06 AM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: accepted* taxonlabel: Removed ancestor hierarchy because this is populated, in much greater detail, when the accepted name is imported as an input name and the TNRS-parsed components are available

5903 11/01/2012 12:55 AM Aaron Marcuse-Kubitza

sql.py: distinct_table(): Use DISTINCT ON instead of a unique index and insert_select()'s ignore mode to remove duplicate rows. This uses whichever sorting method PostgreSQL deems to be fastest instead of requiring the use of a B-tree index. Since most of the slower operations in TNRS's import are distinct_table() calls, this should speed up the TNRS import, which is a bottleneck for the DB import as a whole because the TNRS import must complete before other datasources can be imported.
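The approach described in r5903 can be sketched as a small query builder. This is only an illustration, not the code in sql.py: the function names `quote_ident` and `distinct_table_sql`, their signatures, and the table and column names in the usage note are all hypothetical. It shows the shape of a `SELECT DISTINCT ON` statement that leaves de-duplication to PostgreSQL instead of relying on a unique index.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def distinct_table_sql(in_table, out_table, distinct_on):
    """Build a statement that copies in_table to out_table, keeping one row
    per distinct combination of the distinct_on columns.

    Hypothetical helper: the real distinct_table() in sql.py may differ.
    No ORDER BY is added, so PostgreSQL is free to pick its own
    de-duplication strategy, and which row survives per group is unspecified.
    """
    cols = ", ".join(quote_ident(c) for c in distinct_on)
    return (
        f"CREATE TABLE {quote_ident(out_table)} AS "
        f"SELECT DISTINCT ON ({cols}) * FROM {quote_ident(in_table)}"
    )
```

For example, `distinct_table_sql("tnrs_in", "tnrs_out", ["name", "author"])` yields `CREATE TABLE "tnrs_out" AS SELECT DISTINCT ON ("name", "author") * FROM "tnrs_in"`.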

5902 11/01/2012 12:36 AM Aaron Marcuse-Kubitza

sql.py: distinct_table(): Changed comment about distinct_on column index to include just the input table, so that the function does not guarantee a unique index on the output table's distinct_on columns

5901 11/01/2012 12:15 AM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Added acceptedCountry, acceptedStateProvince, acceptedDecimalLatitude/Longitude

5900 10/31/2012 11:57 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Renamed latLongValid, latLongInvalid to georeferenceValid, georeferenceInvalid to correspond to DwC term georeferenceVerificationStatus

5899 10/31/2012 11:45 PM Aaron Marcuse-Kubitza

mappings/VegCore.csv: Added latLongValid, latLongInvalid, latLongInCountry, latLongInStateProvince

5898 10/31/2012 11:14 PM Aaron Marcuse-Kubitza

input.Makefile: Staging tables installation: Treat any .sql file whose name contains (not just ends with) "schema" as a schema file and sort it before other .sql files
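The ordering rule from r5898 is implemented in a Makefile, but it can be illustrated in Python. The sketch below is an assumption-laden restatement, not the project's code: the function name `order_sql_files` is hypothetical, the match on "schema" is assumed to be case-sensitive, and ties are assumed to be broken alphabetically.

```python
import os


def order_sql_files(paths):
    """Sort .sql paths so that files whose base name contains "schema"
    come before all other files; each group is sorted by base name.

    Hypothetical helper illustrating the rule described in r5898.
    """
    def sort_key(path):
        name = os.path.basename(path)
        # 0 sorts before 1, so schema files are placed first.
        return (0 if "schema" in name else 1, name)

    return sorted(paths, key=sort_key)
```

Under these assumptions, `order_sql_files(["data.sql", "my_schema_extra.sql", "schema.sql", "a.sql"])` returns `["my_schema_extra.sql", "schema.sql", "a.sql", "data.sql"]`: a file that merely contains "schema" is treated the same as one that ends with it.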

5897 10/31/2012 10:17 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Functions containing UPDATE statements: Use quote_nullable() instead of quote_literal() to properly encode NULL values
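The difference behind r5897 can be modeled in a few lines of Python. This is a simplified model of the two PostgreSQL functions, not the code in schemas/vegbien.sql: `None` stands in for SQL NULL, `concat` is a hypothetical stand-in for the `||` operator, and backslash handling is omitted. In PostgreSQL, `quote_literal(NULL)` returns NULL, and concatenating NULL into a query string makes the whole string NULL, whereas `quote_nullable(NULL)` returns the text `NULL`.

```python
def quote_literal(value):
    """Model of PostgreSQL quote_literal(): NULL in, NULL out.
    Simplified: only single quotes are escaped."""
    if value is None:
        return None
    return "'" + str(value).replace("'", "''") + "'"


def quote_nullable(value):
    """Model of PostgreSQL quote_nullable(): NULL becomes the text NULL."""
    if value is None:
        return "NULL"
    return quote_literal(value)


def concat(*parts):
    """Model of the SQL || operator: any NULL operand makes the result NULL."""
    if any(part is None for part in parts):
        return None
    return "".join(parts)
```

With this model, `concat("UPDATE t SET c = ", quote_literal(None))` is `None`, so there is no query left to run, while `concat("UPDATE t SET c = ", quote_nullable(None))` is `UPDATE t SET c = NULL`. The table and column names `t` and `c` are placeholders.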

5896 10/31/2012 10:10 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: Functions containing UPDATE statements: Use PL/pgSQL's EXECUTE statement to avoid caching query plans. This is necessary because as the table grows over time, the optimal query plan may change.

5895 10/31/2012 10:05 PM Aaron Marcuse-Kubitza

sql_io.py: put_table(): ensure_cond(): When deleting rows that do not satisfy the condition, handle sql.DoesNotExistExceptions caused by columns in the condition that were not replaced with NULL. These occur when out_table is a function, and the columns of the table the condition relates to therefore can't be found using out_table.
