mappings/VegCore-VegBIEN.csv: mapped subspecies to new taxonverbatim.subspecies for easier access by validations queries
fix: mappings/VegCore-VegBIEN.csv: taxonoccurrence.sourceaccessioncode: need to populate from aggregateOrganismObservationID when only that is available
schemas/vegbien.sql: specimenreplicate.institution_id: renamed to duplicate_institutions_sourcelist_id, as decided in the conference calls (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2)
mappings/VegCore.htm: regenerated from wiki: rename specimenHolderInstitutions to specimen_duplicate_institutions, as decided in the 2014-03-13 conference call (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2). note that most schema changes (such as this one) involve mappings changes, which are handled automatically by `inputs/run postprocess; yes|make inputs/{NVS,SALVIAS,TEAM}/test`.
mappings/VegCore-VegBIEN.csv: event__participant: allow multiple event__participant columns
mappings/VegCore-VegBIEN.csv: project_participant: use [!...] negative lookahead assertion so that multiple project_participant columns will properly map to separate projectcontributor rows
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
mappings/VegCore-VegBIEN.csv: mapped project_participant
mappings/VegCore-VegBIEN.csv: mapped taxon_determination__is_current, taxon_determination__is_original
bugfix: mappings/VegCore-VegBIEN.csv: main taxondetermination: use [!isoriginal=true] instead of [!isoriginal] so that adding a manual isoriginal field does not prevent this selector from matching
bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this signficant change to the mappings did not change the semantics of the import of taxonoccurrences.
bugfix: mappings/VegCore-VegBIEN.csv: stratum's locationevent: link this to the parent locationevent, so that the parent locationevent's information (such as locationeventcontributors) is accessible to the stratum's locationevent
mappings/VegCore-VegBIEN.csv: mapped event__participant
mappings/VegCore-VegBIEN.csv: mapped stratum__name
schemas/vegbien.sql: added source_id to allow different datasources to have their own strata
mappings/VegCore.htm: regenerated from wiki. added Stratum table.
bugfix: mappings/VegCore-VegBIEN.csv: don't map datasetURL to source.url for taxa-only data (this mapping should only occur for Source tables)
mappings/VegCore-VegBIEN.csv: mapped datasetURL
fix: mappings/VegCore-VegBIEN.csv: source__modified_date: remapped to pubdate instead of datelastmodified because this is actually metadata for the source itself, rather than for the VegBIEN record of the source
mappings/VegCore-VegBIEN.csv: mapped source__modified_date (different from vegcore.vegpath.org?modified, which is for the data record)
mappings/VegCore.htm: regenerated from wiki. added source__version (= edition), source__modified_date.
mappings/VegCore-VegBIEN.csv: mapped edition
mappings/VegCore.htm: regenerated from wiki. added EQUIV (also mapped in mappings/VegCore-VegBIEN.csv).
mappings/VegCore-VegBIEN.csv: mapped municipality
mappings/VegCore-VegBIEN.csv: mapped DUPLICATE to nothing so that it would not be treated as an unmapped term
bugfix: mappings/VegCore-VegBIEN.csv: fixed priority of cultivated and oldGrowth so cultivated is used first if it's available
bugfix: mappings/VegCore-VegBIEN.csv: place.geovalid: added missing /1 after _alt
schemas/vegbien.sql: place.geovalid: added latLongDomainValid to the values to _and together
mappings/VegCore-VegBIEN.csv: place.geovalid: use false instead of NULL
mappings/VegCore-VegBIEN.csv: locationRemarks: Remapped to locationnarrative because location.notespublic is a boolean field
mappings/VegCore.htm: Regenerated from wiki. Renamed specimenHolders to specimenHolderInstitutions to make it obvious that this is a list of institutions, such as would be in institutionCode in a DwC export.
mappings/VegCore-VegBIEN.csv: Removed specimenIndexer->institutionCode mappings because the institutionCode should refer only to the specimenHolder
mappings/VegCore.htm: Regenerated from wiki. Remapped organismNotes to be a synonym of occurrenceRemarks, since notes on an organism are more generally notes on an occurrence.
mappings/VegCore-VegBIEN.csv: Mapped occurrenceRemarks
mappings/VegCore-VegBIEN.csv: _avg(): Use numeric param names to work with SQL functions
mappings/VegCore-VegBIEN.csv: Mapped latitude/longitude_DMS to coordinates.latitude_deg using new _dms_to_dd(text)
mappings/VegCore-VegBIEN.csv: latitude/longitude_deg,min,sec: Also mapped to the geoscrub coordinates entry
mappings/VegCore-VegBIEN.csv: latitude/longitude_sec: Fixed name, which had been incorrectly automapped to verbatim*
mappings/VegCore-VegBIEN.csv: Mapped latitude/longitude_deg,min,sec
mappings/VegCore.htm: Regenerated from wiki. Documentation has been added on how to choose term names (https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Naming) and how to form globally unique ID values (https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Forming-IDs). Source and Specimen terms have been renamed to be self-explanatory and unambiguous (the DwC equivalents remain as synonyms). Short definitions of Source terms have been added to explain the differences between them. Source, Specimen, and Collection terms have been shortened according to the new instructions for choosing preferred term names (https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Naming).
mappings/VegCore-VegBIEN.csv: Mapped reproductiveCondition
mappings/VegCore.htm: Regenerated from wiki. matched*Fit_fraction has been renamed to matched*Confidence_fraction.
mappings/VegCore.htm: Regenerated from wiki. Data owner terms and taxon synonyms have been added, and morphospecies has been disambiguated.
mappings/VegCore.htm: Regenerated from wiki. Brad's new DwC ID terms spreadsheet has now been added, and a number of the ID terms clarified, disambiguated, and recategorized. In particular, institutionCode has now been split into the custodialInstitutions and collectingInstitution, to differentiate between which institution has the specimen vs. stamped the specimen. This distinction is important because the catalogNumber, stamped on the specimen, is only unique within the collectingInstitution. Most datasources don't unambiguously specify which institution their institutionCode is referring to, so it has been assumed to be custodialInstitutions unless a data dictionary says otherwise (as is the case for UNCC). In addition, a MatchedTaxonDetermination table has been added with the *_matched fields from TNRS.
mappings/VegCore.csv: Regenerated from wiki. Taxon terms with prefixes for other TaxonDeterminations now indicate the analogous term in an "analogous to" label next to the term
mappings/VegCore-VegBIEN.csv: datasourceRecordID: Fixed bug where also need to add datasourceRecordID next to occurrenceID for an institutionCode remap switch
mappings/VegCore-VegBIEN.csv: Mapped datasourceRecordID
mappings/VegCore-VegBIEN.csv: matched*Fit_fraction: Remapped to taxonconfidence instead of taxonfit
mappings/VegCore-VegBIEN.csv, inputs/*/*/map.csv: Applied term renamings from the new dynamically generated Veg+-VegCore.csv, which reflects the current state of the data dictionary. (Permanently switching to the new Veg+-VegCore.csv will be a separate change.) Updates to VegCore term names that have occurred since the data dictionary was created are now able to take effect, which involves remapping and inferring units on several fields.
mappings/VegCore-VegBIEN.csv: Mapped basalDiameter_in
mappings/VegCore-VegBIEN.csv: Mapped diameterBreastHeightGentry_cm, basalDiameter_cm, precipitation_mm
mappings/VegCore-VegBIEN.csv: locationID->location.sourceaccessioncode: Removed restriction that this mapping can't occur if geovalidation information is present. The locationID is no longer mapped to the place.sourceaccessioncode, so this filter is not necessary.
mappings/VegCore-VegBIEN.csv: institutionCode list->sourcename mapping: _split(): Also match ; as a separator, and match separators with or without a following space
mappings/VegCore-VegBIEN.csv: Also include morphospecies in the accepted taxondetermination's taxonverbatim, so that it can easily be retrieved by the analytical DB views
mappings/VegCore-VegBIEN.csv: Don't create NCBI crosslinks for the matched taxonomic name. These crosslinks are no longer needed now that TNRS provides a separate accepted name on which crosslinks can be made.
mappings/VegCore-VegBIEN.csv: Mapped accepted* taxonomic name, now to separate accepted taxondetermination
schemas/vegbien.sql: taxonlabel: Removed creationdate, which duplicates taxondetermination.determinationdate
mappings/VegCore-VegBIEN.csv: fieldNumber (authorEventCode): Fixed bug where locationevent.authorlocationcode should be authoreventcode
mappings/VegCore-VegBIEN.csv: morphoname: Remapped to the original rather than current taxondetermination because this is the original name applied by the author
mappings/VegCore-VegBIEN.csv: Mapped recordNumber to new specimenreplicate.collectionnumber
mappings/VegCore-VegBIEN.csv: Also map recordNumber (collectionnumber) to the indirect voucher's specimenreplicate
mappings/VegCore-VegBIEN.csv: Mapped individualCode. authortaxoncode: Prefer tag over recordNumber (collectionnumber), because this applies to the plant rather than the specimen.
mappings/VegCore-VegBIEN.csv: Mapped morphoname
schemas/vegbien.sql: plantobservation: Renamed collectionnumber to authorplantcode since this number, which identifies the plant, is actually different from the collectionnumber that identifies the specimen collected from it. This distinction is meaningful for plots data, but generally not for specimens data.
mappings/VegCore-VegBIEN.csv: Removed no longer used mappings for verbatimScientificName in _if conditions
mappings/VegCore-VegBIEN.csv: Removed taxonlabel for original taxondetermination, because the original taxondetermination is not scrubbed by scrub.make (only the most current taxondetermination gets scrubbed, because only a single scrubbed determination is added by scrub.make). This still leaves the original taxondetermination's taxonverbatim, which stores the taxonomic information for historical purposes.
mappings/VegCore-VegBIEN.csv: Removed no longer used accepted and verbatim (parsed) taxonlabels, which have been replaced by a single accepted or matched taxondetermination created by scrub.make
mappings/VegCore-VegBIEN.csv: Removed TNRS input taxonlabels meant to cross-link to taxonlabels added by the TNRS import, because TNRS taxondeterminations are now created instead
mappings/VegCore-VegBIEN.csv: main taxonverbatim.morphospecies "if has verbatim name" condition: Fixed bug where need to remove the taxonIsCanonical flag, because the TNRS.public.unscrubbed_taxondetermination_view table (which uses this flag) should include this field (although not other places where the morphospecies is stored by other TNRS tables)
mappings/VegCore-VegBIEN.csv: primary taxonlabel's parent taxonlabel: Fixed bug where a taxonverbatim was incorrectly being created solely to store the taxonRank, even though it was already stored in the taxonlabel's rank field
mappings/VegCore-VegBIEN.csv: Don't map morphospecies to the parsed taxonlabel's taxonepithet, because this causes an extra, parsed taxonlabel to be created for TNRS.public.unscrubbed_taxondetermination_view. It is not needed by the other TNRS tables.
mappings/VegCore-VegBIEN.csv: "if has verbatim name" _if statements that filter something out for TNRS mappings: Also assume true if taxonIsCanonical is specified, because some TNRS tables (eventually such as public.unscrubbed_taxondetermination_view) do not specify a separate "verbatim" taxondetermination but do provide taxonIsCanonical as a flag to turn various mappings on and off
mappings/VegCore-VegBIEN.csv: Remapped matched*Fit_fraction to taxondetermination.taxonfit when a taxondetermination, not just a taxonlabel, is provided
mappings/VegCore-VegBIEN.csv: Don't create a separate TNRS input taxonlabel if taxonIsCanonical exists
mappings/VegCore-VegBIEN.csv: taxonlabel.taxonomicname: Prepend the family to the rest of the name using new _merge_prefix() instead of _join_words()/_nullIf(), so that any input taxonomic name that includes the family will not have the family duplicated in the combined taxonomic name. Previously, the duplication was removed only when the rest of the input name was equal to the family. This change fixes a bug in the new TNRS import where a pre-concatenated taxonomic name (Accepted_scientific_name) which includes the family is now used instead of Accepted_name, which only includes it when it's equal to the family.
mappings/VegCore-VegBIEN.csv: Also map the morphospecies to the accepted taxonverbatim when an accepted name is provided
mappings/VegCore-VegBIEN.csv: identificationType: Fixed bug in mapping where extra *_id/ needed to be removed
mappings/VegCore-VegBIEN.csv: Mapped identificationType
mappings/VegCore-VegBIEN.csv: Mapped taxonOccurrencePkey
mappings/VegCore-VegBIEN.csv: locationID/locationName + subplot -> location.sourceaccessioncode mapping: Fixed bug where subplot was incorrectly being mapped to this field even when there was no location*. (This field can only be populated if both location* and subplot are specified.) Also only map locationID for this, to avoid inconsistencies where one table supplies locationID+subplot, while another table supplies locationName+subplot, but they both get mapped to the same field, preventing plots from being matched up with their observations when creating the analytical_stem.
mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only using authorTaxonCode if there is no plant ID: Added individualID, stemID to the terms that cause authorTaxonCode not to be mapped to VegBIEN authortaxoncode
mappings/VegCore-VegBIEN.csv: authortaxoncode mappings: Only use authorTaxonCode if there is no plant ID, because an individual plant gets its own taxonoccurrence and thus needs the taxonoccurrence's IDs to be unique to the plant, regardless of what the author designates as the taxonoccurrence code
mappings/VegCore-VegBIEN.csv: Mapped authorTaxonCode
schemas/vegbien.sql: Reattached trait to taxonoccurrence instead of taxonlabel, because the TraitObservation traits data is actually associated with a particular occurrence (plant observation complete with location, date, etc.), rather than just a taxon
mappings/VegCore-VegBIEN.csv: Mapped traits-related DwC terms measurementType, measurementValue, measurementUnit
mappings/VegCore.csv: Terms: Removed namespace prefixes (dcterms:), because VegCore terms are globally unique within VegCore and there should not be multiple versions of the same VegCore term with different namespaces. Provenance is instead indicated in the Sources column, which contains not just a namespace but a full URL to each source term.
mappings/VegCore.csv: Term names: Changed special characters to _ because Redmine doesn't support special characters in HTML anchors (it removes everything except letters, numbers, _, and -)
mappings/VegCore-VegBIEN.csv: Removed no longer used verbatimGrowthForm. Map to growthForm instead and translate growth form values to VegBIEN's growthform enum.
mappings/VegCore-VegBIEN.csv: institutionCode: Removed mapping to sourcename.matched_source_id, which is now autopopulated. Split any list of institutionCodes apart using new _split().
schemas/vegbien.sql: Allow multiple institutionCodes for each specimenreplicate by linking new sourcelist table many-to-many to source via sourcename (which is now a linking table)
schemas/vegbien.sql: Renamed sampletype to observationtype to match the VegCore term
mappings/VegCore.csv: Renamed sampleType to observationType to match the SALVIAS term it's derived from
mappings/VegCore-VegBIEN.csv: Don't forward specimenreplicate IDs to location for plots data (where the specimenreplicate IDs apply only to the specimen)
mappings/VegCore-VegBIEN.csv: Mapped dcterms:rights
mappings/VegCore-VegBIEN.csv: Mapped verbatimCoordinates
mappings/VegCore-VegBIEN.csv: Mapped projectStartDate, projectEndDate
mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
mappings/VegCore-VegBIEN.csv: Mapped sampleType
schemas/vegbien.sql: Renamed taxonconcept.concept_source_id back to concept_reference_id