Name                Size     Revision  Age              Author                 Comment
_archive            -        1598      almost 13 years  Aaron Marcuse-Kubitza  Moved _archive/tapir2flatClient/trunk/client/ t...
analysis            -        3076      over 12 years    Aaron Marcuse-Kubitza  Added top-level analysis dir for range modeling
backups             -        4751      over 12 years    Aaron Marcuse-Kubitza  backups/Makefile: Backups: Full DB: Specify the...
bin                 -        4927      over 12 years    Aaron Marcuse-Kubitza  csv2db: COPY FROM mode: Removed no longer neede...
config              -        272       about 13 years   Aaron Marcuse-Kubitza  Moved bien_password to new config dir
inputs              -        4950      over 12 years    Aaron Marcuse-Kubitza  inputs/REMIB/Specimen/map.csv: Remapped accessi...
lib                 -        4939      over 12 years    Aaron Marcuse-Kubitza  sql.py: DbConn.col_info(): Parse array types as...
mappings            -        4949      over 12 years    Aaron Marcuse-Kubitza  mappings/VegCore-VegBIEN.csv: Only use institut...
schemas             -        4948      over 12 years    Aaron Marcuse-Kubitza  schemas/py_functions.sql: _namePart(): Slice th...
to_do               -        4524      over 12 years    Aaron Marcuse-Kubitza  to_do/timeline.doc: Updated to reflect addition...
validation          -        4523      over 12 years    Aaron Marcuse-Kubitza  Added validation/
Makefile            9.99 KB  4752      over 12 years    Aaron Marcuse-Kubitza  root Makefile: PostgreSQL: postgres-Linux: Adde...
README.TXT          11.1 KB  4793      over 12 years    Aaron Marcuse-Kubitza  README.TXT: Data import: Added note that `make ...
map                 1.22 KB  3475      over 12 years    Aaron Marcuse-Kubitza  root map: Run bin/map with a nice increment of ...
new_terms.csv       30.4 KB  4887      over 12 years    Aaron Marcuse-Kubitza  Regenerated root unmapped_terms.csv, new_terms.csv
unmapped_terms.csv  5.8 KB   4887      over 12 years    Aaron Marcuse-Kubitza  Regenerated root unmapped_terms.csv, new_terms.csv

Latest revisions

# Date Author Comment
4950 09/24/2012 02:54 PM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/map.csv: Remapped accession_number to catalogNumber because it is not globally unique, only (usually) unique within the institution providing the data ("acronym"). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution.
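
The kind of check behind the row count in the note above can be sketched as follows. This is an illustrative sketch only: the sample rows, the institution labels, and the helper name `count_rows_with_repeated_key` are made up for the example and are not taken from the REMIB data or the repository.

```python
from collections import Counter

# Hypothetical sample rows of (acronym, accession_number); not real REMIB data.
rows = [
    ("INST_A", "1001"),
    ("INST_A", "1001"),  # same number reused within one institution
    ("INST_A", "1002"),
    ("INST_B", "1001"),  # same number at another institution: not a repeat
]

def count_rows_with_repeated_key(rows):
    """Count rows whose (acronym, accession_number) pair occurs more than once."""
    counts = Counter(rows)
    return sum(n for n in counts.values() if n > 1)

print(count_rows_with_repeated_key(rows))
```

With the sample rows, the two `("INST_A", "1001")` rows are counted, while the `INST_B` row is not, because the comparison is scoped to the institution.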

4949 09/24/2012 02:45 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: Only use institutionCode+collectionCode+catalogNumber as the authorlocationcode (location-scoping ID) if there is actually a catalogNumber. Otherwise, the mapping process would attempt to create one location for each collection in the datasource, when there should be one location for each specimen.
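
The rule described in this revision can be sketched roughly as below. The function name, the `;` separator, and the choice to skip empty components are assumptions made for illustration; the actual mapping is defined in mappings/VegCore-VegBIEN.csv and may combine the fields differently.

```python
def author_location_code(institution_code, collection_code, catalog_number, sep=";"):
    """Build a location-scoping ID, or return None when there is no catalogNumber.

    Hypothetical helper: the separator and the handling of missing
    components are assumptions, not the repository's actual behavior.
    """
    if not catalog_number:
        # Without a catalogNumber the ID would only identify the collection,
        # so every specimen in that collection would share one location.
        return None
    parts = [institution_code, collection_code, catalog_number]
    return sep.join(p for p in parts if p)

print(author_location_code("INST_A", "HERB", "12345"))
print(author_location_code("INST_A", "HERB", None))
```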

4948 09/24/2012 02:36 PM Aaron Marcuse-Kubitza

schemas/py_functions.sql: _namePart(): Slice the first name from the beginning of the string to one word before the end, instead of one after the beginning, in order to avoid overlap with the last name, which starts one before the end, when there is only one word. Note that only one word means the name is assumed to be a last name. This assumption may not always be true, but when a datasource provides the name concatenated, an assumption must be made when not all name components are present.
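
The slicing change described here can be illustrated with a small sketch. It is not the code of `_namePart()` itself: the function names and the whitespace-based word splitting are assumptions, and only the slice boundaries follow the description above.

```python
def split_name_old(full_name):
    """Earlier behavior as described: first name ends one word after the beginning."""
    words = full_name.split()
    first = words[:1]    # overlaps with `last` when there is only one word
    last = words[-1:]
    return (" ".join(first) or None, " ".join(last) or None)

def split_name_new(full_name):
    """Revised behavior as described: first name ends one word before the end."""
    words = full_name.split()
    first = words[:-1]   # empty when there is only one word
    last = words[-1:]    # last name starts one word before the end
    return (" ".join(first) or None, " ".join(last) or None)

for name in ("Smith", "John Smith", "John Q Smith"):
    print(name, split_name_old(name), split_name_new(name))
```

For a single-word input such as `"Smith"`, the old slices return the same word as both first and last name, while the new slices leave the first name empty and treat the word as a last name, which matches the assumption stated in the revision comment.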

4947 09/24/2012 02:30 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: party: Added check constraint to require at least an organizationname or surname. Previously, NULL entries for the collector or identifier incorrectly caused the creation of an empty party entry, hence the lower inserted row counts now that this is no longer created.
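
A minimal sketch of such a constraint follows. It uses Python's built-in `sqlite3` module with an in-memory database purely so the example is self-contained; the project itself targets PostgreSQL, and the table definition, the `party_id` column, and the exact constraint text here are assumptions rather than a copy of schemas/vegbien.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE party (
        party_id INTEGER PRIMARY KEY,
        organizationname TEXT,
        surname TEXT,
        -- Assumed form of the constraint: at least one name must be present.
        CHECK (organizationname IS NOT NULL OR surname IS NOT NULL)
    )
""")

def try_insert(organizationname, surname):
    """Return True if the row was inserted, False if the constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO party (organizationname, surname) VALUES (?, ?)",
                (organizationname, surname),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(None, "Smith"))  # surname only
print(try_insert(None, None))     # empty party entry
```

An all-NULL collector or identifier is rejected at insert time instead of creating an empty party row, which is consistent with the lower inserted row counts mentioned above.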

4946 09/24/2012 02:17 PM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/map.csv: Remapped acronym to institutionCode because this is an aggregator, and the field lists the datasource each record was aggregated from. Note that the inserted row count changes because of different duplicate elimination strategies in specimenreplicate and party (which institutionCode is placed in).

4945 09/24/2012 02:11 PM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/create.sql: Also filter out rows where acronym (collectionCode) is NULL because this is a required field for valid records

4944 09/24/2012 01:28 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonpath: Renamed scientificnameauthor to author so the column name doesn't have "scientificname" in it, which made the term look confusingly like scientificname itself. Added descriptive comment that this is the author of the scientific name.

4943 09/24/2012 01:19 PM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonpath: Renamed canon_id to canon_taxonpath_id to clarify that this is a recursive fkey. The convention is that a recursive fkey includes the table name plus a descriptive prefix.

4942 09/24/2012 01:14 PM Aaron Marcuse-Kubitza

schemas/filter_ERD.csv: Don't filter out fkeys from taxonpath to itself

4941 09/24/2012 11:32 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: taxonpath: Added canon_id for the canonical (scrubbed) taxonpath determined by TNRS
