2013-11-07 conference call

Martha's notes

Upcoming

  • call next week at usual time (Th 9am PT&Tucson/12pm ET)
    • will discuss VegBank validation

Availability

  • Paul is back on his previous project
  • see the *Google spreadsheet* (and please add your availability for future weeks once you know it)

Decisions made

VegBank validation

taxonomic names in extracts

familyVerbatim - family as submitted by the data provider. May be null for some datasets.
scientificNameVerbatim - scientific name as submitted by the data provider
familyMatched - family matched by the TNRS, if any
scientificNameMatched - lowest-level scientific name matched by the TNRS, without the author
scientificNameAuthorshipMatched - author of the lowest-level scientific name matched by the TNRS
familyAccepted - accepted family provided by the TNRS
scientificNameAccepted - lowest-level accepted scientific name provided by the TNRS, without the author
scientificNameAuthorshipAccepted - author of the lowest-level accepted scientific name provided by the TNRS
annotations - annotation terms such as "cf." and "aff.", as extracted by the TNRS
unmatchedTerms - trailing strings, if any, not recognized by the TNRS (as returned by the TNRS in column unmatched_terms)
morphospecies

"morphospecies" is the resolved scientific name (minus the author) concatenated with the unmatched terms, formed according to the algorithm I sent you previously. I think it's helpful to list both unmatchedTerms and morphospecies; doing so clarifies how morphospecies is formed. As morphospecies are not relevant to specimen validation extracts, tell whoever is doing the validation to ignore the morphospecies column.
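
As a rough illustration of the concatenation described above, here is a minimal Python sketch. The actual algorithm was sent separately and is not reproduced in these notes, so the details are assumptions: the function name build_morphospecies is hypothetical, the two parts are assumed to be joined with a single space, and an empty or missing part is assumed to be skipped.

```python
from typing import Optional


def build_morphospecies(taxon_name: Optional[str],
                        unmatched_terms: Optional[str]) -> Optional[str]:
    """Join the resolved name (without author) and the unmatched terms.

    Assumptions (not confirmed by the notes): parts are separated by one
    space, blank or missing parts are skipped, and the result is None
    when both parts are blank or missing.
    """
    parts = [p.strip() for p in (taxon_name, unmatched_terms) if p and p.strip()]
    return " ".join(parts) if parts else None


example = build_morphospecies("Poa annua", "sp.1")
```

Listing unmatchedTerms next to morphospecies in the extract lets a reviewer see which part of the morphospecies string came from the TNRS and which part was left over.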

scientificNameMatched - lowest-level scientific name matched by the TNRS, without the author

We actually call this taxonName instead of scientificName, to indicate that it doesn't have the author. scientificName is defined by DwC as including the author. (Or perhaps we should come up with a different term for "taxonomic name with author" to avoid the ambiguity?)

In that case, substitute "taxon" for "scientificName" for all columns except "scientificNameVerbatim" (the latter can contain the authors, depending on the data source).

The column headings would be:

familyVerbatim
scientificNameVerbatim
familyMatched
taxonMatched
taxonAuthorshipMatched
familyAccepted
taxonAccepted
taxonAuthorshipAccepted
annotations
unmatchedTerms
morphospecies
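
The renaming rule above (substitute "taxon" for "scientificName" everywhere except scientificNameVerbatim) can be sketched in Python as follows. The function name rename_column is hypothetical, and the sketch only illustrates the rule; it is not the code used to produce the extracts.

```python
ORIGINAL_COLUMNS = [
    "familyVerbatim",
    "scientificNameVerbatim",
    "familyMatched",
    "scientificNameMatched",
    "scientificNameAuthorshipMatched",
    "familyAccepted",
    "scientificNameAccepted",
    "scientificNameAuthorshipAccepted",
    "annotations",
    "unmatchedTerms",
    "morphospecies",
]

PREFIX = "scientificName"
KEEP_AS_IS = {"scientificNameVerbatim"}  # may contain authors, so not renamed


def rename_column(name: str) -> str:
    """Replace the 'scientificName' prefix with 'taxon', except for the verbatim column."""
    if name in KEEP_AS_IS or not name.startswith(PREFIX):
        return name
    return "taxon" + name[len(PREFIX):]


RENAMED_COLUMNS = [rename_column(c) for c in ORIGINAL_COLUMNS]
```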

geoscrubbing

  • if we need to re-run the geoscrubbing scripts, test them on just a sample of 1000 rows
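
One way to draw such a test sample, sketched in Python. This is an assumption about how the sample might be taken, not a description of the existing geoscrubbing scripts; the function name sample_rows and the fixed seed are hypothetical, and the seed is there only so that a re-run tests the same rows.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_rows(rows: Sequence[T], n: int = 1000, seed: int = 0) -> List[T]:
    """Return up to n rows chosen at random without replacement.

    A fixed seed makes the sample reproducible between runs. If there
    are fewer than n rows, all rows are returned.
    """
    if len(rows) <= n:
        return list(rows)
    return random.Random(seed).sample(rows, n)


# Stand-in data: integers in place of real extract rows.
all_rows = list(range(25000))
test_rows = sample_rows(all_rows)
```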

To do for Bob and Mike Lee

  1. review current VegBank extract
    • ignore CVS rows (rows 239-541)
  2. check that new VegBank extract has CVS data properly removed

To do for Aaron

VegBank validation

  1. completely remove CVS plots
  2. send new extract

taxonomic names in extracts