spot-checking validation details¶

Queries¶

To get rows for a datasource, use one of the following queries.

If analytical_stem has not yet been created, it's OK to use analytical_stem_view instead of analytical_specimen/plot.
If you change analytical_stem_view to fix a bug, you must also reload analytical_stem because analytical_specimen/plot use this materialized table instead of the view.

Specimens¶

Input and output data side-by-side:

After running the query, split the input and output data into separate spreadsheet tabs

SELECT *
FROM "<datasource>"."Specimen" 
LEFT JOIN /*r#.*/analytical_specimen ON
    analytical_specimen."datasource" = '<datasource>'
AND analytical_specimen."occurrenceID" = "Specimen"."occurrenceID" 
ORDER BY "Specimen".row_num
LIMIT 100;

Plots¶

Input data must be retrieved separately.

Output data:

SELECT *
FROM /*r#.*/analytical_plot
WHERE "datasource" = '<datasource>'
LIMIT 100;

ACAD¶

Source data used is on vegbiendev in /home/bien/inputs/ACAD/Specimen/occurrence.txt

2013-3-6¶

See ACAD.2013-3-6.100_rows.xls and ACAD.2013-3-6.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-3-7)

Excellent. No issues. Mark all these sources [NY, BRIT, ACAD] as done.

query:

SELECT *
FROM "ACAD"."Specimen" 
LEFT JOIN /*r7855.*/analytical_specimen ON
    analytical_specimen."datasource" = 'ACAD'
AND analytical_specimen."occurrenceID" = "Specimen".id
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/ACAD/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-26¶

See ACAD.2013-2-26.100_rows.xls and ACAD.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27, ACAD.2013-2-26.100_rows_bb.xls)

institutionCode:
Should be herbarium acronym (ACAD) [not value Acadia University]. See: http://rs.tdwg.org/dwc/terms/#institutionCode

query:

SELECT *
FROM "ACAD"."Specimen" 
LEFT JOIN /*r7721.*/analytical_specimen ON
    analytical_specimen."datasource" = 'ACAD'
AND analytical_specimen."occurrenceID" = "Specimen".id
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/ACAD/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-20¶

See ACAD.2013-2-20.100_rows.xls and ACAD.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-21, ACAD.2013-2-20.100_rows_bb.xls)

FIXED: ~~collectionCode:~~
See spreadsheet "vegbien_identifiers". For collectionCode, note "In general, use whatever the data provider provides, unless badly formed". They have provided a properly formed, globally unique urn from the "Biodiversity Collections Index". Just use it as provided (urn:lsid:biocol.org:col:13294)
FIXED: ~~scientificName_verbatim:~~
Still incorrect Include family only if no lower ranked name is provided.

query:

SELECT *
FROM "ACAD"."Specimen" 
LEFT JOIN /*r7620.*/analytical_specimen ON
    analytical_specimen."datasource" = 'ACAD'
AND analytical_specimen."occurrenceID" = "Specimen".id
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/ACAD/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-1-25¶

See ACAD.2013-1-25.100_rows.xls and ACAD.2013-1-25.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-1-31, ACAD.2013-1-25.100_rows_bb.xls)

Validation complete. I found not problems with content. Great! This means we are done validating all of Canadensys.

I have suggested some changes to the layout of the specimen extract. Please apply these changes to all future validation extracts for all specimens from all sources, not just ACAD.

column name	comments
datasource
institutionCode
collectionCode
catalogNumber
occurrenceID
country
stateProvince
county
locality
decimalLatitude
decimalLongitude
coordinateUncertaintyInMeters
coordinateSource_bien
georeferenceProtocol_bien
geovalid_bien
isNewWorld_bien
dateCollected
higherPlantGroup_bien
family
genus
speciesBinomial
taxonName
scientificNameAuthorship
scientificNameWithMorphospecies	omit for all specimen validation extracts; this pertains only to plots
threatened_bien	move to after `growthForm`
identifiedBy	omit; this pertains to TNRS accepted name
dateIdentified	omit; this pertains to TNRS accepted name
identificationRemarks	omit; this pertains to TNRS accepted name
growthForm
cultivated_bien
cultivatedBasis_bien
recordedBy	move these two columns to before `dateCollected`
recordNumber	move these two columns to before `dateCollected`

Add the following columns before `family`:
family_verbatim	As provided by author
taxonName_verbatim	As provided by author
identifiedBy_verbatim	As provided by author
dateIdentified_verbatim	As provided by author
identificationRemarks_verbatim	As provided by author
family_matched_tnrs	As matched by TNRS
taxonName_matched_tnrs	As matched by TNRS

Add the following column after `growthForm`
reproductiveCondition

query:

SELECT *
FROM "ACAD"."Specimen" 
LEFT JOIN /*r7317.*/analytical_stem_view analytical_stem ON
    analytical_stem."datasource" = 'ACAD'
AND analytical_stem."occurrenceID" = "Specimen".id
ORDER BY "Specimen".row_num
LIMIT 100;

2013-1-15¶

See ACAD.50_rows.xls and ACAD.50_rows.csv
Input data is on the left, and output data is on the right, starting with the datasource column

query:

SELECT *
FROM "ACAD"."Specimen" 
LEFT JOIN /*r7023.*/analytical_stem ON
    analytical_stem."datasource" = 'ACAD'
AND analytical_stem."occurrenceID" = "Specimen".id
LIMIT 50;

ARIZ¶

Source data is on vegbiendev in /home/bien/inputs/ARIZ/_MySQL/MySQL.{schema,data}.sql

2012-12-11¶

Source data used is on vegbiendev in /home/bien/inputs/ARIZ/ARIZ_DiGIR_21012010.csv.tar.gz

See ARIZ.50_rows.xls and ARIZ.50_rows.csv

Brad Boyle's comments: (ariz_validation_report.xlsx)

		*Fixed?*
coordinateUncertaintyInMeters	Populate with the values in column 'CoordinatePrecision' in ariz_raw (see below).	CoordinatePrecision does not exist in the raw data, so CoordinateUncertaintyInMeters is used instead
elevationInMeters	This should equal (MinimumElevation+MaximumElevation)/2 if (a) both columns provide valid elevation values and MaximumElevation>=MinimumElevation). Currently on the value in MinimumElevation is being used.	PRIORITY FEATURE REQUEST
plotArea_ha	Omit this column from specimen data extracts	FIXED
samplingProtocol	Set to "Dried herbarium specimen" for all herbarium specimen records.	FEATURE REQUEST
dateCollected	For valid date ranges, is it possible to use the middle value of the dates provided, assuming dates are valid? See CatalogueNumber 10021. Use later date if dates split into two even groups. Thus, mid-date for this record would be 1916-08-29.	FEATURE REQUEST
speciesBinomialWithMorphospecies	This is correctly formed, but I recommend we call the column simply 'morphospecies'	FIXED
scientificNameWithMorphospecies	Incorrectly formed, but we don't need it. Omit this column	FIXED
threatened	This is not part of the original data providers data. In any view of data to reviewed by the data provider for validation, etc., I recommend suffixing the names of columns containing data added by BIEN with '_bien'. Thus, this field would become 'threatened_bien'	FIXED
growthForm	Rename to growthForm_bien. See 'threatened', above.	The growthForm field actually contains data provided by the data provider, which is NULL for ARIZ because their DwC export for BIEN2 doesn't contain a growthForm column.
cultivated	Rename to cultivated_bien. See 'threatened', above.	FIXED (the suffix might not always apply because it's sometimes from the data provider and sometimes parsed from the locality description)
cultivatedBasis	Rename to cultivatedBasis_bien. See 'threatened', above.	FIXED
coverPercent	Omit	FIXED
diameterBreastHeight_cm	Omit	FIXED
height_m	Omit	FIXED
tag	Omit	FIXED
organismX_m	Omit	FIXED
organismY_m	Omit	FIXED
authorTaxonCode	Omit	FIXED
individualID	Omit	FIXED
authorStemCode	Omit	FIXED

Missing columns that should be included in validation view (names shown are from original data dump ariz_raw):

		*Fixed?*
Locality	The free-text locality description	FIXED
FieldNotes	This is the detailed specimen description (of the plant, not the locality). Not sure what field this corresponds to in VegCore.	FIXED
HorizontalDatum	=geodeticdatum in dwc	FEATURE REQUEST
OriginalCoordinateSystem	=verbatimcoordinatesystem in dwc	FEATURE REQUEST
GeoRefMethod	=georeferenceprotocol in dwc	FEATURE REQUEST
CoordinatePrecision	This column incorrectly used by ARIZ. It appears they are actually entering undertainty in meters; therefore values should be transferred to coordinateUncertaintyInMeters.	The column is actually called CoordinateUncertaintyInMeters in the original data (`nimoy:/home/bien_shared/raw_data/ariz/ARIZ_DiGIR_21012010.csv.tar.gz`), so there is no need to remap it. However, NYBG does use coordinatePrecision incorrectly, and I remapped it to coordinateUncertaintyInMeters.
LatLongComments	=georeferenceremarks in dwc	FEATURE REQUEST
IdentifiedDate	=identificationmodifier in dwc I think you mean dateIdentified	FIXED
IdentificationModifier	Ubducates undertainty of identifications. In general, I rather than returning verbatim values, I recomomend populating with contents of column 'Annotations' from the TNRS output. This field should come after 'dateIdentified'.	PRIORITY FEATURE REQUEST

query:

SELECT * FROM "public.2012-12-06-23-19-52.r6672".analytical_stem WHERE "institutionCode" = 'ARIZ' LIMIT 50;

BRIT¶

Source data used is on vegbiendev in /home/bien/inputs/BRIT/MySQL.*.sql

2013-3-6¶

See BRIT.2013-3-6.100_rows.xls and BRIT.2013-3-6.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-3-7)

Excellent. No issues. Mark all these sources [NY, BRIT, ACAD] as done.

query:

SELECT *
FROM "BRIT".specimen_flat
LEFT JOIN /*r7855.*/analytical_specimen ON
    analytical_specimen."datasource" = 'BRIT'
AND analytical_specimen."occurrenceID" = specimen_flat.specimen_id::text
WHERE specimen_flat.specimen_id >= 1660
ORDER BY specimen_flat.specimen_id
LIMIT 100;

import command:

make inputs/BRIT/import_scrub by_col=1 n=100 start=952; bin/make_analytical_db

2013-2-26¶

See BRIT.2013-2-26.100_rows.xls and BRIT.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27, BRIT.2013-2-26.100_rows_bb.xls)

FIXED: ~~occurrenceRemarks:~~
Weird. Looks like they imported full label information and mixed together locality, habitat, specimen description and even the intitutional footer, all in one field. This will be too dangerous to keep as is. Let's discuss what to do about it.
if no Locality_Description, now ignore Notes_Plant because it's likely malformed
FIXED: ~~decimalLatitude/decimalLongitude:~~
BRIT looks like it's going to be a big headache to fix. Could you do me a favor and check to see if any of the BRIT records have non-NULL coordinates? I couldn't see any in the example extract.
Yes, about 15% of their rows have a valid lat/long. These rows are not in the first 100, however.

query:

SELECT *
FROM "BRIT".specimen_flat
LEFT JOIN /*r7724.*/analytical_specimen ON
    analytical_specimen."datasource" = 'BRIT'
AND analytical_specimen."occurrenceID" = specimen_flat.specimen_id::text
ORDER BY specimen_flat.specimen_id
LIMIT 100;

import command:

make inputs/BRIT/import_scrub by_col=1 n=100; bin/make_analytical_db

CTFS¶

Source data used is in http://capstone.bradley.edu/~asingh2/files/bci_01April2011.zip

2012-12-17¶

See CTFS.2012-12-17.50_rows.xls and CTFS.2012-12-17.50_rows.csv

query:

SELECT
datasource AS "BIEN:datasource" 
-- Country
, country AS "Country.CountryName" 
-- Site
, "locationName" AS "Site.PlotID; Quadrat.QuadratID" 
, "decimalLatitude" AS "Site.Latitude" 
, "decimalLongitude" AS "Site.Longitude" 
, "elevationInMeters" AS "Site.Elevation" 
, "plotArea_ha" AS "Site.SizeOfSite" 
-- Census
, "dateCollected" AS "Census.StartDate" 
-- Tree
, "organismX_m" AS "Tree.x" 
, "organismY_m" AS "Tree.y" 
-- Stem
, tag AS "Stem.StemTag or Tree.Tag" 
, "authorStemCode" AS "Stem.StemID" 
, "diameterBreastHeight_cm" AS "Stem.DBH" 
, "individualID" AS "Measurement.MeasureID" 
-- Family
, "family" AS "Family.Family" 
, "genus" AS "Genus.Genus" 
, "speciesBinomial" AS "Genus.Genus SpecificEpithet.SpeciesName" 
, "scientificNameAuthorship" AS "SpecificEpithet.SpeciesAuthority" 
-- BIEN
, "higherPlantGroup" AS "BIEN:higherPlantGroup" 
, "isNewWorld" AS "BIEN:isNewWorld" 
, geovalid AS "BIEN:geovalid" 
, cultivated AS "BIEN:cultivated" 
, threatened AS "BIEN:threatened" 
FROM "public.2012-12-14-15-22-55.r6870".analytical_stem
WHERE "datasource" = 'CTFS' LIMIT 50;

2012-12-12¶

See CTFS.50_rows.xls and CTFS.50_rows.csv

*Rick Condit* wanted the data in a form similar to CTFS itself, which has been provided (above)

query:

SELECT * FROM "public.2012-12-11-11-46-55.r6795".analytical_stem WHERE "datasource" = 'CTFS' LIMIT 50;

CVS ¶

see CVS validation

HVAA¶

Source data used is on vegbiendev in /home/bien/inputs/HVAA/Specimen/Herbario_occur_1360871068.csv

2013-2-26¶

See HVAA.2013-2-26.100_rows.xls and HVAA.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27)

The following sources passed validation and are complete:

REMIB
HVAA
SpeciesLink
SALVIAS
TEAM

query:

SELECT *
FROM "HVAA"."Specimen" 
LEFT JOIN /*r7722.*/analytical_specimen ON
    analytical_specimen."datasource" = 'HVAA'
AND analytical_specimen."occurrenceID" = "Specimen"."symbiotaId" 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/HVAA/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-20¶

See HVAA.2013-2-20.100_rows.xls and HVAA.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-21, HVAA.2013-2-20.100_rows_bb.xls)

FIXED: ~~scientificName_verbatim:~~
Include family only when no other lower ranked name is provided
FIXED: ~~ocurrenceRemarks:~~
Missing. please include in all specimen validation extracts
- FIXED: ~~mapping this source only:~~
```
=IF(input.occurrenceRemarks IS NULL, input.dynamicProperties,input.occurrenceRemarks)
```

query:

SELECT *
FROM "HVAA"."Specimen" 
LEFT JOIN /*r7620.*/analytical_specimen ON
    analytical_specimen."datasource" = 'HVAA'
AND analytical_specimen."occurrenceID" = "Specimen"."symbiotaId" 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/HVAA/import_scrub by_col=1 n=100; bin/make_analytical_db

Madidi¶

See the *Madidi validation spreadsheet*: (Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.)

Loading Google Spreadsheet...

Source data used is on vegbiendev in /home/bien/inputs/Madidi/*/*.csv

2013-3-7¶

query:

SELECT *
FROM "Madidi"."IndividualObservation" 
RIGHT JOIN "Madidi"."LocationObservation" USING ("PlotID", "PlotInventoryName")
RIGHT JOIN "Madidi"."Location" USING ("PlotID")
LEFT JOIN /*r7870.*/analytical_plot ON
    analytical_plot."datasource" = 'Madidi'
AND analytical_plot."locationName" = "IndividualObservation"."PlotName" 
AND analytical_plot."subplot" = "IndividualObservation"."Sequence" 
AND analytical_plot."individualCode" = "IndividualObservation"."Tree" 
ORDER BY "IndividualObservation".row_num
LIMIT 100;

import command:

make inputs/Madidi/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-3-6¶

See Madidi.2013-3-6.100_rows.xls and Madidi.2013-3-6.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

query:

SELECT *
FROM "Madidi"."IndividualObservation" 
RIGHT JOIN "Madidi"."LocationObservation" USING ("PlotID", "PlotInventoryName")
RIGHT JOIN "Madidi"."Location" USING ("PlotID")
LEFT JOIN /*r7870.*/analytical_plot ON
    analytical_plot."datasource" = 'Madidi'
AND analytical_plot."institutionCode" = "IndividualObservation"."Institutions" 
AND analytical_plot."collectionCode" IS NULL
AND analytical_plot."catalogNumber" = "IndividualObservation"."SpecimenID" 
ORDER BY "IndividualObservation".row_num
LIMIT 100;

import command:

make inputs/Madidi/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-3-5¶

See Madidi.2013-3-5.100_rows.xls and Madidi.2013-3-5.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-3-6)

Started working on this, but realized you sent me a specimen-data validation extract. This is plot data. Please send me a plot-format validation extract, containing plot codes, subplots, morphospecies names, etc.

query:

SELECT *
FROM "Madidi"."IndividualObservation" 
RIGHT JOIN "Madidi"."LocationObservation" USING ("PlotID", "PlotInventoryName")
RIGHT JOIN "Madidi"."Location" USING ("PlotID")
LEFT JOIN /*r7848.*/analytical_specimen ON
    analytical_specimen."datasource" = 'Madidi'
AND analytical_specimen."institutionCode" = "IndividualObservation"."Institutions" 
AND analytical_specimen."collectionCode" IS NULL
AND analytical_specimen."catalogNumber" = "IndividualObservation"."SpecimenID" 
ORDER BY "IndividualObservation".row_num
LIMIT 100;

import command:

make inputs/Madidi/import_scrub by_col=1 n=100; bin/make_analytical_db

2012-12-5¶

See Madidi.50_rows.xls and Madidi.50_rows.csv

*Peter Jorgensen* wants the data refreshed first.

query:

SELECT * FROM "public.2012-12-04-03-46-35.r6569".analytical_stem WHERE "institutionCode" = 'Madidi' AND "organismX_m" IS NOT NULL LIMIT 50;

MO¶

Source data used is on vegbiendev in /home/bien/inputs/MO/Specimen/digirexport*.txt

2012-11-30¶

See MO.50_rows.xls and MO.50_rows.csv

Peter Jorgensen's comments: MO.50_rows.annotated.xlsx

query:

SELECT * FROM "public.2012-11-28-18-18-51.r6487".analytical_stem WHERE "institutionCode" = 'MO' ORDER BY "recordNumber" LIMIT 50;

NCU¶

Source data used is on vegbiendev in /home/bien/inputs/NCU/_archive/NCU-NCSC_2010-02-12.csv

2012-12-7¶

See NCU-NCSC.50_rows.xls and NCU-NCSC.50_rows.csv

Bob Peet's comments: (e-mail on 2012-12-9)

FIXED: ~~The collection dates often read something like 2065 rather than 1965.~~ The refresh uses 4-digit years, so no longer has this problem. 2-digit years can be supported for other datasources using the _fix_date() filter.
FIXED: ~~Perhaps I misunderstand, but I also fail to see the verbatim name from the original record. Otherwise looks fine.~~ scientificName_verbatim has been added to the analytical DB

query:

SELECT * FROM "public.2012-12-05-12-51-00.r6646".analytical_stem WHERE "institutionCode" = 'NCU-NCSC' LIMIT 50;

NVS¶

Source data used is on vegbiendev in /home/bien/inputs/NVS/_src/AKATARAWA 2008 .xls

2013-1-22¶

See NVS.2013-1-22.50_rows.xls and NVS.2013-1-22.50_rows.csv
Input and output data are in separate tabs

query:

SELECT *
FROM "NVS"."StemObservation" 
LEFT JOIN r7317.analytical_stem analytical_stem ON
    analytical_stem."datasource" = 'NVS'
AND analytical_stem."individualObservationID" = "StemObservation"."Item Obs ID"::text
ORDER BY "StemObservation".row_num 
LIMIT 50;

2012-12-21¶

See NVS.2012-12-21.50_rows.xls and NVS.2012-12-21.50_rows.csv. This includes just the columns with NVS data and omits taxonomic scrubbing for easier verification with the source data.

query: (using a test-only import)

SELECT
datasource
, "projectID" 
, "locationID" 
, "plotArea_ha" 
, "scientificName" 
, "growthForm" 
, "diameterBreastHeight_cm" 
, tag
, "taxonOccurrenceID" 
, "authorTaxonCode" 
, "individualObservationID" 
FROM analytical_stem
WHERE datasource = 'NVS'
AND "individualObservationID" IS NOT NULL
ORDER BY "locationID" DESC
LIMIT 50;

2012-12-6¶

See NVS.50_rows.xls and NVS.50_rows.csv

Nick Spencer's comments: (e-mail on 2012-12-18)

Mostly in the right places, but I did note the following issues.

FIXED: ~~Suggested mapping corrections~~

NVS export element	->	VegCSV element	*Fixed?*
Identifier	->	tag	Y
itemID	->	individualID	Y
Item Obs ID	->	taxonOccurrenceID or occurrenceID (depending on definition)*	Y
species code	->	authorTaxonCode	Y
Verbatim Code	->	authorStemCode	Yes, but actually corresponds to authorTaxonCode

1. Coordinates

The requested plot coordinate output for a spread sheet export of NVS data is a map coordinate (as opposed to lat long). Obviously this doesn’t suit a multi data provider data repository such as BIEN. I will discuss with Shirley about altering our output to include a lat long. However, I wondered if you require a standard WGS84 value or in fact have to deal with various projections during your import process? How do you tell from the import data you have been given, which coordinate system was used?

2. *taxonOccurrenceID or occurrenceID

What is the definition or practical difference between these two elements? Our item Obs ID is a unique ID per stem observation, not a unique ID of the occurrence of a taxon (regardless of how many individuals occurrences there were, i.e. 4 tree of species Pseudowintera axillaris and a cover estimate of the same species by strata)

There is obviously a more rich set of attributes in the NVS export cf with what you mapped including other sources of occurrence data as well as cover estimates by strata. Have you tested these other data types in other data sets?

query:

SELECT * FROM "public.2012-12-05-12-51-00.r6646".analytical_stem WHERE "institutionCode" = 'NVS' LIMIT 50;

NY¶

Source data used is on vegbiendev in /home/bien/inputs/NY/Specimen/NYSpecimenDataAmericas.csv

2013-3-6¶

See NY.2013-3-6.100_rows.xls and NY.2013-3-6.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-3-7)

Excellent. No issues. Mark all these sources [NY, BRIT, ACAD] as done.

query:

SELECT *
FROM "NY"."Ecatalog_all" 
LEFT JOIN /*r7854.*/analytical_specimen ON
    analytical_specimen."datasource" = 'NY'
AND analytical_specimen."institutionCode" = "Ecatalog_all"."InstitutionCode" 
AND analytical_specimen."collectionCode" = "Ecatalog_all"."CollectionCode" 
AND analytical_specimen."catalogNumber" = "Ecatalog_all"."CatalogNumber" 
ORDER BY "Ecatalog_all".pkey
LIMIT 100;

import command:

make inputs/NY/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-26¶

See NY.2013-2-26.100_rows.xls and NY.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27, NY.2013-2-26.100_rows_bb.xls)

FIXED: ~~occurrenceRemarks:~~
Do not preprend "plant fungus: "

query:

SELECT *
FROM "NY"."Ecatalog_all" 
LEFT JOIN /*r7722.*/analytical_specimen ON
    analytical_specimen."datasource" = 'NY'
AND analytical_specimen."institutionCode" = "Ecatalog_all"."InstitutionCode" 
AND analytical_specimen."collectionCode" = "Ecatalog_all"."CollectionCode" 
AND analytical_specimen."catalogNumber" = "Ecatalog_all"."CatalogNumber" 
ORDER BY "Ecatalog_all".pkey
LIMIT 100;

import command:

make inputs/NY/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-20¶

See NY.2013-2-20.100_rows.xls and NY.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-21, NY.2013-2-20.100_rows_bb.xls)

FIXED: ~~scientificName_verbatim:~~
Include family only when no other lower ranked name is provided
FIXED: ~~ocurrenceRemarks:~~
Missing. please include in all specimen validation extracts
- FIXED: ~~mapping this source only:~~
```
=input.PlantFungDescription
```

query:

SELECT *
FROM "NY"."Ecatalog_all" 
LEFT JOIN /*r7623.*/analytical_specimen ON
    analytical_specimen."datasource" = 'NY'
AND analytical_specimen."institutionCode" = "Ecatalog_all"."InstitutionCode" 
AND analytical_specimen."collectionCode" = "Ecatalog_all"."CollectionCode" 
AND analytical_specimen."catalogNumber" = "Ecatalog_all"."CatalogNumber" 
ORDER BY "Ecatalog_all".pkey
LIMIT 100;

import command:

make inputs/NY/import_scrub by_col=1 n=100; bin/make_analytical_db

2012-12-17¶

See NY.2012-12-17.50_rows.xls and NY.2012-12-17.50_rows.csv

query:

SELECT
datasource AS "BIEN:datasource" 
, "institutionCode" AS "InstitutionCode" 
, "collectionCode" AS "CollectionCode" 
, "catalogNumber" AS "CatalogNumber" 
, "scientificName" AS "ScientificName" 
, "family" AS "Family" 
, "genus" AS "Genus" 
, "speciesBinomial" AS "Genus Species" 
, "scientificNameAuthorship" AS "ScientificNameAuthor" 
, "identifiedBy" AS "IdentifiedBy" 
, "recordNumber" AS "FieldNumber" 
, "recordedBy" AS "Collector" 
, "dateCollected" AS "YearCollected-MonthCollected-DayCollected" 
, "country" AS "Country" 
, "stateProvince" AS "StateProvince" 
, "county" AS "County" 
, "decimalLongitude" AS "Longitude" 
, "decimalLatitude" AS "Latitude" 
, "coordinateUncertaintyInMeters" AS "CoordinatePrecision" 
, "elevationInMeters" AS "MinimumElevation" 
-- BIEN
, "higherPlantGroup" AS "BIEN:higherPlantGroup" 
, "isNewWorld" AS "BIEN:isNewWorld" 
, geovalid AS "BIEN:geovalid" 
, cultivated AS "BIEN:cultivated" 
, threatened AS "BIEN:threatened" 
FROM "public.2012-12-14-15-22-55.r6870".analytical_stem
WHERE "datasource" = 'NY'
AND "institutionCode" = 'NY'
AND "collectionCode" = 'Herbarium'
LIMIT 50;

With the analytical DB's schema: NY.2012-12-17.orig.50_rows.xls and NY.2012-12-17.orig.50_rows.csv

query:

SELECT *
FROM "public.2012-12-14-15-22-55.r6870".analytical_stem
WHERE "datasource" = 'NY'
AND "institutionCode" = 'NY'
AND "collectionCode" = 'Herbarium'
LIMIT 50;

2012-11-28¶

See NY.csv on vegbiendev.nceas.ucsb.edu at /home/bien/inputs/NY/verify/NY.csv (access via SSH/SFTP)

Barbara Thiers' comments: (e-mail on 2012-11-28)

I tested a sample set of 10 records, comparing the NY data extracted from BIEN 3.0, the NY dataset that was input into BIEN 3.0, and the original database at NY.

The data are correctly represented in BIEN 3.0, ~~with one exception~~ FIXED (see below). We use our Barcode number as our unique specimen identifier. In the input file, these data are stored in the column labelled "CatalogNumber." We also have an internal record number, which we don't use, but is included in the input file under the column "UniqueNYInternalRecordNumber."

In BIEN 3.0, the "UniqueNYInternalRecordNumber" (henceforth IRN) was used as the unique identifier (stored under "PlotName, and also concatenated with collector's number in RecordNumber.

FIXED: Instead, the barcode number (stored under "CatalogNumber" in the input file) should be used instead of the IRN. The barcode numbers should be unique, but if if is desirable to maintain the concatenated RecordNumber field, then it is the CatalogNumber that should be combined with the FieldNumber value from the input table.
however, some rows are missing a CatalogNumber¹, so CatalogNumber is not a unique key for all rows.

¹ 1060 of 941,107 rows
SELECT * FROM "NY"."Ecatalog_all" WHERE "accessionNumber" IS NULL
Total query runtime: 942 ms.

FIXED: The FieldNumber is now in analytical_stem.recordNumber. It would be very helpful to also maintain the FieldNumber as a separate column in the analytical database, so that the combination of Collector + Collector number could be easily queried (this is the 4th most used type of query for our database).

The attached spreadsheet [ Mapping changes needed for NYBG data in BIEN 3.0.xlsx ] illustrates the mapping changes needed

query:

SELECT * FROM "public.2012-11-24-14-51-43.r6446".analytical_stem WHERE "institutionCode" = 'NY';

REMIB¶

Source data used is on vegbiendev in /home/bien/inputs/REMIB/Specimen.src/node.*.csv

2013-2-26¶

See REMIB.2013-2-26.100_rows.xls and REMIB.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27)

The following sources passed validation and are complete:

REMIB
HVAA
SpeciesLink
SALVIAS
TEAM

query:

SELECT *
FROM "REMIB"."Specimen" 
LEFT JOIN /*r7723.*/analytical_specimen ON
    analytical_specimen."datasource" = 'REMIB'
AND analytical_specimen."institutionCode" = "Specimen".acronym
AND analytical_specimen."catalogNumber" = "Specimen".accession_number
ORDER BY "Specimen"."Specimen.src.row_num" 
LIMIT 100;

import command:

make inputs/REMIB/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-20¶

See REMIB.2013-2-20.100_rows.xls and REMIB.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-21, REMIB.2013-2-20.100_rows_bb.xls)

FIXED: ~~county:~~
For this source only, "ND" = "no disponible = "not available". Just leave blank when county = "ND"
FIXED: ~~locality:~~
For this source only, "ND" = "no disponible = "not available". Just leave this field blank when both locality and habitat = "ND"
FIXED: ~~decimalLatitude:~~
This source uses 99 to indicate a missing value. Set both decimalLatitude and decimalLongitude to NULL if decimalLatitude=99
FIXED: ~~decimalLongitude:~~
This source uses -999 to indicate a missing value. Set both decimalLatitude and decimalLongitude to NULL if decimalLongitude=-999
FIXED: ~~scientificName_verbatim:~~
Include family only when no other lower ranked name is provided

query:

SELECT *
FROM "REMIB"."Specimen" 
LEFT JOIN /*r7620.*/analytical_specimen ON
    analytical_specimen."datasource" = 'REMIB'
AND analytical_specimen."institutionCode" = "Specimen".acronym
AND analytical_specimen."catalogNumber" = "Specimen".accession_number
ORDER BY "Specimen"."Specimen.src.row_num" 
LIMIT 100;

import command:

make inputs/REMIB/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-1-25¶

See REMIB.2013-1-25.100_rows.xls and REMIB.2013-1-25.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

query:

SELECT *
FROM "REMIB"."Specimen" 
LEFT JOIN /*r7317.*/analytical_stem_view analytical_stem ON
    analytical_stem."datasource" = 'REMIB'
AND analytical_stem."institutionCode" = "Specimen".acronym
AND analytical_stem."catalogNumber" = "Specimen".accession_number
ORDER BY "Specimen"."Specimen.src.row_num" 
LIMIT 100;

2013-1-15¶

See REMIB.50_rows.xls and REMIB.50_rows.csv
Input data is on the left, and output data is on the right, starting with the datasource column

query:

SELECT *
FROM "REMIB"."Specimen" 
LEFT JOIN /*r7023.*/analytical_stem ON
    analytical_stem."datasource" = 'REMIB'
AND analytical_stem."institutionCode" = "Specimen".acronym
AND analytical_stem."catalogNumber" = "Specimen".accession_number
LIMIT 50;

SALVIAS ¶

see SALVIAS validation

SpeciesLink¶

Source data used is on vegbiendev in /home/bien/inputs/SpeciesLink/Specimen/specieslink.csv

2013-2-26¶

See SpeciesLink.2013-2-26.100_rows.xls and SpeciesLink.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27)

The following sources passed validation and are complete:

REMIB
HVAA
SpeciesLink
SALVIAS
TEAM

query:

SELECT *
FROM "SpeciesLink"."Specimen" 
LEFT JOIN /*r7723.*/analytical_specimen ON
    analytical_specimen."datasource" = 'SpeciesLink'
AND analytical_specimen."occurrenceID" = "Specimen"."dwc_dwcore_GlobalUniqueIdentifier" 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/SpeciesLink/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-20¶

See SpeciesLink.2013-2-20.100_rows.xls and SpeciesLink.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-22, SpeciesLink.2013-2-20.100_rows_bb.xls)

FIXED: ~~scientificName_verbatim:~~
Include family only when no other lower ranked name is provided
FIXED: ~~identificationRemarks:~~
Incorrectly formed. The values shown are the field description of the specimen, and should be in columnn occurrenceRemarks (see below). As far as I can tell, identificationRemarks were not supplied for this source.
FIXED: ~~ocurrenceRemarks:~~
Missing. please include in all specimen validation extracts
- FIXED: ~~mapping this source only~~: =input.dwc_terms_occurrenceRemarks

query:

SELECT *
FROM "SpeciesLink"."Specimen" 
LEFT JOIN /*r7632.*/analytical_specimen ON
    analytical_specimen."datasource" = 'SpeciesLink'
AND analytical_specimen."occurrenceID" = "Specimen"."dwc_dwcore_GlobalUniqueIdentifier" 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/SpeciesLink/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-1-25¶

See SpeciesLink.2013-1-25.100_rows.xls and SpeciesLink.2013-1-25.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-1-31, SpeciesLink.2013-1-25.100_rows_bb.xls)

Content passes validation. I have the same format comments as for other specimen sources.

I did wonder why there were NULL values in the two analytical db fields geovalid_bien and isNewWorld_bien.
Rows that don't have both coordinates and a placename can't be geovalidated, so geovalid_bien is null. isNewWorld_bien is null whenever there isn't a scrubbed name match.

See attached [SpeciesLink.2013-1-25.100_rows_bb.xls].

column name	comments
datasource
institutionCode
collectionCode
catalogNumber
occurrenceID
country
stateProvince
county
locality
decimalLatitude
decimalLongitude
coordinateUncertaintyInMeters
coordinateSource_bien
georeferenceProtocol_bien
geovalid_bien	Should be 0,1 only. Why are some values NULL?
isNewWorld_bien	Should be 0,1 only. Why are some values NULL?
dateCollected
higherPlantGroup_bien
family
genus
speciesBinomial
taxonName
scientificNameAuthorship
scientificNameWithMorphospecies	omit for all specimen validation extracts; this pertains only to plots
threatened_bien	move to after `growthForm`
identifiedBy	omit; this pertains to TNRS accepted name
dateIdentified	omit; this pertains to TNRS accepted name
identificationRemarks	omit; this pertains to TNRS accepted name
growthForm
cultivated_bien
cultivatedBasis_bien
recordedBy	move these two columns to before `dateCollected`
recordNumber	move these two columns to before `dateCollected`

Add the following columns before `family`:
family_verbatim	As provided by author
taxonName_verbatim	As provided by author
identifiedBy_verbatim	As provided by author
dateIdentified_verbatim	As provided by author
identificationRemarks_verbatim	As provided by author
family_matched_tnrs	As matched by TNRS
taxonName_matched_tnrs	As matched by TNRS

Add the following column after `growthForm`
reproductiveCondition

query:

SELECT *
FROM "SpeciesLink"."Specimen" 
LEFT JOIN /*r7317.*/analytical_stem_view analytical_stem ON
    analytical_stem."datasource" = 'SpeciesLink'
AND analytical_stem."occurrenceID" = "Specimen"."dwc_dwcore_GlobalUniqueIdentifier" 
ORDER BY "Specimen".row_num
LIMIT 100;

TEAM¶

Source data used is on vegbiendev in /home/bien/inputs/TEAM/Specimen.src/node.*.csv

2013-2-20¶

See TEAM.2013-2-20.100_rows.xls and TEAM.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-26)

[Re Plot data validations for SALVIAS and TEAM:]
Both look good. The only problem I could find was the incorrectly formed scientificName_verbatim, but I believe you have fixed that now, correct? [yes] If that issue is fixed then these are both good and I do not need to see them again. Consider them done.

query:

SELECT *
FROM "TEAM"."TEAM_Sites" 
RIGHT JOIN "TEAM"."VT" ON "VT"."1ha Plot Number" = "TEAM_Sites"."Name" 
LEFT JOIN /*r7629.*/analytical_plot ON
    analytical_plot."datasource" = 'TEAM'
AND analytical_plot."individualObservationID" = "VT"."Id"::text
ORDER BY "VT".row_num
LIMIT 100;

import command:

make inputs/TEAM/import_scrub by_col=1 n=100; bin/make_analytical_db

TEX¶

see TEX validation

U¶

Source data used is on vegbiendev in /home/bien/inputs/U/Specimen/UtrechtHerbarium.csv

2013-2-26¶

See U.2013-2-26.100_rows.xls and U.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27, U.2013-2-26.100_rows_bb.xls)

FIXED: ~~institutionCode:~~
Please populate. Should be "U"
taxonName_verbatim:
Still incorrectly formed. Include family only when no other lower ranked name is provided. no, the family must be included in what's sent to TNRS

query:

SELECT *
FROM "U"."Specimen" 
LEFT JOIN /*r7723.*/analytical_specimen ON
    analytical_specimen."datasource" = 'U'
AND analytical_specimen."occurrenceID" = "Specimen"."ID" 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/U/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-20¶

See U.2013-2-20.100_rows.xls and U.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-21, U.2013-2-20.100_rows_bb.xls)

FIXED: ~~locality:~~ Was already fixed
This source uses input column "Ecology" as "habitat". Please append contents of input column "Ecology"
FIXED: ~~scientificName_verbatim:~~
Include family only when no other lower ranked name is provided
FIXED: ~~ocurrenceRemarks:~~
Missing. please include in all specimen validation extracts. ~~Input field "Notes" maps to occurrenceRemarks for this source.~~

query:

SELECT *
FROM "U"."Specimen" 
LEFT JOIN /*r7620.*/analytical_specimen ON
    analytical_specimen."datasource" = 'U'
AND analytical_specimen."occurrenceID" = "Specimen"."ID" 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/U/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-1-25¶

See U.2013-1-25.100_rows.xls and U.2013-1-25.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-1-31, U.2013-1-25.100_rows_bb.xls)

[...] consider the validation of U to be complete. See attached spreadsheet [U.2013-1-25.100_rows_bb.xls], with the usual format comments.

Content for U is fine. Although, as mentioned in my earlier email, I have some concerns about how we are forming DwC fields such as occurrenceID. But if we make any changes, it will apply to the entire database, so let's discuss this issue as a group.

column name	comments
datasource
catalogNumber
occurrenceID
country
stateProvince
county
locality
decimalLatitude
decimalLongitude
coordinateUncertaintyInMeters
coordinateSource_bien
georeferenceProtocol_bien
geovalid_bien
isNewWorld_bien
dateCollected
higherPlantGroup_bien
family
genus
speciesBinomial
taxonName
scientificNameAuthorship
scientificNameWithMorphospecies	omit for all specimen validation extracts; this pertains only to plots
threatened_bien	move to after `growthForm`
identifiedBy	omit; this pertains to TNRS accepted name
dateIdentified	omit; this pertains to TNRS accepted name
identificationRemarks	omit; this pertains to TNRS accepted name
growthForm
cultivated_bien
cultivatedBasis_bien
recordedBy	move these two columns to before `dateCollected`
recordNumber	move these two columns to before `dateCollected`

Add the following columns before `family`:
family_verbatim	As provided by author
taxonName_verbatim	As provided by author
identifiedBy_verbatim	As provided by author
dateIdentified_verbatim	As provided by author
identificationRemarks_verbatim	As provided by author
family_matched_tnrs	As matched by TNRS
taxonName_matched_tnrs	As matched by TNRS

Add the following column after `growthForm`
reproductiveCondition

Add the following columns after datasource (even if null):
institutionCode
collectionCode

query:

SELECT *
FROM "U"."Specimen" 
LEFT JOIN /*r7317.*/analytical_stem_view analytical_stem ON
    analytical_stem."datasource" = 'U'
AND analytical_stem."occurrenceID" = "Specimen"."ID" 
ORDER BY "Specimen".row_num
LIMIT 100;

2013-1-15¶

See U.50_rows.xls and U.50_rows.csv
Input data is on the left, and output data is on the right, starting with the datasource column

query:

SELECT *
FROM "U"."Specimen" 
LEFT JOIN /*r7023.*/analytical_stem ON
    analytical_stem."datasource" = 'U'
AND analytical_stem."occurrenceID" = "Specimen"."ID" 
LIMIT 50;

UNCC¶

Source data used is on vegbiendev in /home/bien/inputs/UNCC/Specimen/UNCC.csv

2013-2-26¶

See UNCC.2013-2-26.100_rows.xls and UNCC.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

query:

SELECT *
FROM "UNCC"."Specimen" 
LEFT JOIN /*r7870.*/analytical_specimen ON
    analytical_specimen."datasource" = 'UNCC'
AND (analytical_specimen."institutionCode" = "Specimen".herbarium
    OR analytical_specimen."institutionCode" IS NULL AND "Specimen".herbarium IS NULL
    OR analytical_specimen."institutionCode" = 'UNCC' AND "Specimen".herbarium = 'UNCCD'
)
AND analytical_specimen."collectionCode" IS NULL
AND analytical_specimen."catalogNumber" = "Specimen".accession
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/UNCC/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-2-26¶

See UNCC.2013-2-26.100_rows.xls and UNCC.2013-2-26.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Bob Peet's and Brad Boyle's comments: (e-mails on 2013-3-5, UNCC-validate_rkp_bb_rkp.xlsx)

Input	Output	Note	Brad's comments	Bob's comments on Brad's comments	Brad's final comments	Fixed?
row_num		Might be good to save this	If this is a database ID, then I suggest Aaron save it under datasetRecordID	ok
	datasource	All records currently = UNCC; probably should be where you got the data = NCU	Bob: Index Herbariorum does not recognize UNC, only UNCC. These are UNCC collections, so I suggest we use UNCC as both the datasource (an internal BIEN field) and institutionCode (a DwC field).	I suppose UNCC is less confusing than NCU, but technically I got the data from NCU which got it from UNCC.	Prefer to use UNCC	FIXED
accession	Catalog Number	OK, I think. Need to chek with Brad regardss our final decisions on this	This looks good to me. If accessionNumber is provided by the data provider, is unique and used for every observation, it should be used at the catalogNumber	ok
herbarium	institutionCode	Wrong. Herbarium is where UNCC got the collection. For institution code you want UNCC for all records	I believe Bob is right.` institutionCode` refers to the institution where the specimen is stored, which in this case is UNCC for all specimens. I believe `herbarium` is populated with a value other than UNCC when it was received as a duplicate from another herbarium. I don't think there is any point in storing the other values as there is not guarantee that another duplicate is stored at that institution. They may simply be fowarding a specimen they don't want; we have no way of knowing. So yes, make all values "UNCC"	omit herbarium	OK	FIXED: We don't currently have separate fields for this in VegBIEN or the analytical DB, so I just mapped it to a placeholder VegCore name.
	occurrenceID	?? What is this?	This should be populated for all sources. Aaron, please use my recommended formula is in the spreadsheet "vegbien_identifiers", based on the DwC specification.
family	family_verbatim; AND ScientificName_verbatim (part 1 of 3)	OK	OK
genus	OMITTED	OK
species	OMITTED	OK
usdarank	OMITTED	OK
infrarank	OMITTED	OK
SciName	ScientificName_verbatim (part 2 of 3)	ScientificName_verbatim = family + SciName + authors; Note sure why family was concatenated with SciName, and then why were the authors concatenated and these listed a second time as authors. The true verbatim scientificName is to my thinking just SciName	I think Aaron's formulation is correct, except for one record ("ZANNICHELLIACEAE (KEUTZ.) CORRELL" should not include the authority). Bob, you weren't part of the entire discussion, but the problem is that DwC defines scientificName as the taxon name PLUS authority. I don't much like the DwC definition either, but we decided to follow DwC. Because we didn't want to keep the authority separate from the name, we created a new field, taxonName_verbatim, which contain the lowest-level taxon name applied to the observation, MINUS the authority. That is why family sometimes appears in this column. If the specimen is only identified to order, the field will contain an ordinal name. Etc. Make sense?	Makes no sense to me to always concatenate family + genus + species to render ScientificName_verbatim as a trinomial. Aaron populated this as family + SciName + authors, whereas it should be SciName + authors except for those cases where there is no SciName and there is a family name.	Bob, I think you may be looking at an earlier incorrect version of the validation extract. Originally, Aaron was dumping to this field the value we send to the TNRS, which includes family pre-pended. I have since asked him to place family (or a higher taxon) in this field only if no determination below the rank of genus is provided. So you should never see family + genus + species. In version I am looking at, the name is always a correctly formed taxon name according to the ICBN. Aaron: your current version is correct	This happens when there is no input taxonName, causing the ranks and author to be concatenated together
authors	scientificNameAuthorship_verbatim; AND ScientificName_verbatim (part 3 of 3)	Note sure why this is saved in two places	I only see it in one place, scientificNameAuthorship_verbatim. We also display scientificNameAuthorship_matched, but that is different: it's what the TNRS found after matching the name to Tropicos.	Perhaps you answered this above, but the field 'authors' in the input is copied into TWO places in the output, first as the final third of ScientificName_verbatim and then as a separate scientificNameAuthorship_verbatum	OK, this must definitely be a misunderstanding due to you using at an older version of the validation. Sorry about that.The current validation only contains only taxonName_verbatim (without authors) and scientificNameAuthorship_verbatum (the authors only). I would have preferred taxonNameAuthorship_verbatim, but as there was already a DwC name for the latter entity, we had to use it. Basically, DwC's ill-thought-out naming convention has screwed us up, but there's nothing we can do but use a new name if DwC has no unambiguous name for the taxon name without author. Aaron: your current version is correct.
collector	recordedBy (part 1 of 4)	OK
collector1	recordedBy (part 2 of 4)	OK
collector2	recordedBy (part 3 of 4)	OK
collector3	recordedBy (part 4 of 4)	OK
collectno	recordNumber	OK
collmonth	dateCollected (part 2 of 3)	OK
collday	dateCollected (part 3 of 3)	OK
collyear	dateCollected (part 1 of 3)	OK
country	country	OK, but should probably convert into a standard character string	Bob: Aaron's presentation is correct for now. We haven't yet implemented geovalidation for BIEN3. When we do, he will return both the verbatim and scrubbed political division names.	ok
state	stateProvince	OK, but should probably convert into a standard character string	Bob: Aaron's presentation is correct for now. We haven't yet implemented geovalidation for BIEN3. When we do, he will return both the verbatim and scrubbed political division names.	ok
County	county	OK
Campus		Omit	Bob, what does this column mean? If it has anything to do with plants being collected on a campus (and therefore cultivated) we should definitely use it.	Yes, this refers to occurrence on the UNCC campus, but I do not know the meaning of the codes. Presumably C = cultivated	OK, this is important information in that case. Here's what I suggest. Aaron, if the column contains a non-null value, please put text "cultivated, collected on campus" in the column `cultivated_verbatim`.	FIXED
leaves		Omit	Agree	ok
flower	reproductiveCOndition	Just a code that will be meaningless; if we were to keep this we would also want fruit. I suggest we omit.	Aaron, I suggest you get a formula from Bob for how to translate their codes for columns `flower` and `fruit` into the plain English terms "flower", "fruit", and "fertile" (the latter in the case of ferns, gymnosperms). This is definitely important information which we should not discard. And yes, I agree that we do not need to keep the information in the columns `leaves` and `root`.	Flower = AFIJMN, Fruit = 2AHIMOQS, definitions unknown	Bob: should we interpret flowers (or buds) present if flower is not null? And should we interpret fruits present if "fruit" is not null? I would like to capture this phenology information if at all possible.	FIXED: Mapped using translation
fruit		Omit	No. See above.	ok		FIXED: Mapped using translation
root		Omit	Agree.
locality	locality (in part)	OK?	If we could figure out a way to separate the specimen description, I would prefer to append everything from all the `comment` columns pertaining to the locality description in this field. But can't think of a way to do that. Bob, suggestions?	Too inconsistent to for a simple answer
habitat	locality (in part)	OK??? I might place habitat as yet another comment and not part of locality	We have in general been appending the habitat description to the verbatim locality description. We do this because of the inconsistent practices of herbaria. Some keep them in separate fields, and some (like UNCC and ARIZ) keep them in separate fields.
comment1		Missing -- need to preserve; not sure variable name	I agree that this information is important, but the four denormalized columns unpredictable mix locality description with habitat description with the specimen description. It is crucially important for both trait data mining and for detecting cultivated specimens, that the specimen description be preserved. But I'm at a loss as to what Aaron should do here. Bob, can you suggest an algorithm whereby Aaron could separate the locality description from the specimen description?	Hopelessly inconsistent. I would just place all of habitat and comments1-4 in a general notes field	OK Aaron, dump everything in these fields to another field, and I will modify my script to check that field for the word cultivated but nothing else. Is there another DwC field we can use other than occurrenceRemarks (we need to keep locality descriptions out of that field).	These are concatenated in occurrenceRemarks
comment2		Missing -- need to preserve; not sure variable name
comment3		Missing -- need to preserve; not sure variable name
comment4		Missing -- need to preserve; not sure variable name
loanto	OMITTED	OK
inorout	OMITTED	OK
sheetno	OMITTED	OK
cultivated	cultivated_bien	Translation seems inconsistent. Where do the ocassional seros come from?	Aaron, please see my validation for source "TEX". I suggested we add the column `cultivatedVerbatim` to vegCore and vegBIEN, and display this column in all specimen validation views. As with TEX, the contents of UNCC's column `cultivated` should be dumped to `cultivatedVerbatim`.This means that for the validation extract you display, cultivatedVerbatim would be blank for all records except Accession Numbers 39691 and 19561. It wold be helpful to do some minimal translation. Bob, do the meanings of the codes "X" and "Y" differ in your field `CULTIVATED`? `cultivated_bien` is the results of post-import validation by BIEN; it is the results of combining parsing of the specimen description as well as using any cultivated flags supplied by the data provider, if any. Aaron, could you remind me why some values are blank (NULL) whereas others are zero? And why isn't Accession number 19561 flagged with a 1?		Aaron, if this column contains a non-null value, please put text "cultivated" in column `cultivated_verbatim`, unless you have already populated `cultivated_verbatim` due to non-null value in column `Campus`.	FIXED: all non-NULL values will be assumed to be cultivated . "Where do the ocassional seros come from?": FIXED: The zeros were being populated as the result of locality parsing, and should have been NULL
filler	OMITTED	OK
	decimalLatitude	NA
	decimalLongitude	NA
	coordinateUncertaintyInMeters	NA
	elevationInMeters	NA
	identifiedBy	NA
	dateIdentified	NA
	identificationRemarks	NA
	identificationRemarks	Header error. This is familyName_matched.	??? I don't understand Bob. Looks corrected to me. All values for this column are blank in the extract I am looking at.	Looks ok now. Previously the header was showing as a duplicate of identificationRemarks, but perhaps it was just my excel misbehaving.		FIXED
	taxonName_matched	OK (should families be excluded here? I noticed a case where the cultivar names was dropped upon matching)	Aaron's formulation is correct. taxonMatched contains the lowest taxon name matched, minus the authority. The TNRS doesn't process cultivar names, so they are dumped to `unmatchedTerms` (not displayed in specimen validation extracts)	ok
	scientificNameAuthorship_matched	OK
	higherPlantGroup_bien
	family	(I noticed a case wherefamilyName_matched was blank and a genus name was captured from the family_verbatim; suggests an error)	Yes, I see that too. Accession number 39691. Genus "ELEAGNUS" was incorrectly placed in field `FAMILY` in the original data. I'm pretty sure there's no way the TNRS would have returned "Eleagnus" as a family. Aaron, is this a parsing error?	!		Genera in the family column are not passed to TNRS
	genus	OK
	taxonName	OK
	scientificNameAuthorship	OK
	growthForm	NA??	Aaron: omit this from specimen validations.			FIXED
	threatened_bien	source?	Aaron: omit this from specimen validations.Sorry, I should have told you that earlier.Bob: not yet implemented, but we will us IUCN.			FIXED
	cultivatedBasis_bien	NA??	Bob: for specimens flagged as cultivated (isCultivatedBien=1) this will contain a standard bit of text describing the reason it was flagged as cultivated (e.g., "Key words found in specimens description", etc.)

query:

SELECT *
FROM "UNCC"."Specimen" 
LEFT JOIN /*r7724.*/analytical_specimen ON
    analytical_specimen."datasource" = 'UNCC'
AND (analytical_specimen."institutionCode" = "Specimen".herbarium
    OR analytical_specimen."institutionCode" IS NULL AND "Specimen".herbarium IS NULL
    OR analytical_specimen."institutionCode" = 'UNCC' AND "Specimen".herbarium = 'UNCCD'
)
AND analytical_specimen."collectionCode" IS NULL
AND analytical_specimen."catalogNumber" = "Specimen".accession
ORDER BY "Specimen".row_num
LIMIT 100;

2013-2-20¶

See UNCC.2013-2-20.100_rows.xls and UNCC.2013-2-20.100_rows.csv
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

query:

SELECT *
FROM "UNCC"."Specimen" 
LEFT JOIN /*r7634.*/analytical_specimen ON
    analytical_specimen."datasource" = 'UNCC'
AND (analytical_specimen."institutionCode" = "Specimen".herbarium OR analytical_specimen."institutionCode" IS NULL AND "Specimen".herbarium IS NULL)
AND analytical_specimen."collectionCode" IS NULL
AND analytical_specimen."catalogNumber" = "Specimen".accession
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/UNCC/import_scrub by_col=1 n=100; bin/make_analytical_db

2013-1-24¶

See UNCC.2013-1-24.50_rows.xls and UNCC.2013-1-24.50_rows.csv
Input and output data are in separate tabs

query: (using one-time import)

SELECT *
FROM "UNCC"."Specimen" 
LEFT JOIN /*r7354.*/analytical_stem_view ON
    analytical_stem_view."datasource" = 'UNCC'
AND (analytical_stem_view."institutionCode" = "Specimen".herbarium OR analytical_stem_view."institutionCode" IS NULL AND "Specimen".herbarium IS NULL)
AND analytical_stem_view."collectionCode" IS NULL
AND analytical_stem_view."catalogNumber" = "Specimen".accession
ORDER BY "Specimen".row_num
LIMIT 50;

2013-1-22¶

See UNCC.50_rows.xls and UNCC.50_rows.csv
Input and output data are in separate tabs

Bob Peet's comments: (UNCC-validation-Peet.xlsx from e-mail on 2013-1-23)

BIEN output for validation	Problem ?	Field on UNCC dataset reported	What is reported	What I think this BIEN field should be	*Response*
datasource	!!	none	Index Herbariorum code for this collection = UNCC	Where we obtained the information. Perhaps in all cases here "Robert Peet, NCU" or some code for me.	This is actually the short name of the organization that the data is from
institutionCode	!!	herbarium	The name or acronym in use by the institution that donated the collection to UNCC	Dc "The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record." For all records here this should be "UNCC"	This is the herbarium field as indicated in the source data. Different rows use different acronyms for this field.
occurrenceID	!!	accession	Dc catalogNumber "An identifier (preferably unique) for the record within the data set or collection."	Dc occurrenceID "An identifier for the Occurrence (as opposed to a particular digital record of the occurrence)…", Examples: "urn:lsid:nhm.ku.edu:Herps:32", "urn:catalog:FMNH:Mammal:145732".	FIXED: I'm assuming you meant to switch the mapping from occurrenceID to catalogNumber rather than the other way around, as this field was previously mapped to occurrenceID
country		country	UNCC provides their own idiosyncratic country codes and this is the best place. Ideally these would be converted into ISO 3166 country codes and placed in Dc "CountryCode"
stateProvince		state	Idiosyncratic UNCC abbreviations for state/province
county		county	County
locationID	!!	accession	Identical to OccurrenceID; Dc catalogNumber "An identifier (preferably unique) for the record within the data set or collection."	Dc "An identifier for the set of location information (data associated with dcterms:Location). May be a global unique identifier or an identifier specific to the data set."	This is primarily for plots data and will be the same as occurrenceID for specimens data
locationName	!!	accession	Identical to OccurrenceID; Dc catalogNumber "An identifier (preferably unique) for the record within the data set or collection."	Omit	Some input fields are mapped to multiple places in VegBIEN to be used as unique keys for the various tables
individualCode	!!	collectno	Collection number of the collector: Dc = "recordNumber"	Omit
dateCollected		collyear + collmonth + collday	Dc eventDate
higherPlantGroup
family		family	Dc family
genus		genus	Dc genus
speciesBinomial		genus + species	Dc genus + Dc specificEpithet
taxonName	?	genus + species	I cannot tell from the records if 'infrarank' and 'infraname' are concatenated here as they should be		This is the scrubbed name provided by TNRS
scientificNameAuthorship		authors	Dc scientificNameAuithorship
scientificNameWithMorphospecies	?	genus + species	In the example dataset 34950 has a morphospecies listed as a infraname and this is lost in the cleaning (perhaps not possible to parse)		This is because Brad's formula for scientificNameWithMorphospecies uses the specificEpithet instead of the morphospecies when available, and this name also contains a specificEpithet (manihot)
threatened	?	?	Source is not clear.		This field is populated by the import process
cultivated	?	cultivated ?	I cannot check this without more data		This field is populated by the import process
cultivatedBasis	?	?	Source not clear		This field is populated by the import process
RecordedBy		collector + collector1 + collector2 + collector3	Dc RecordedBy
recordNumber		collectno	Dc recordNumber; duplicates individualCode, which should be omitted
TaxonOccurrenceID	!!	accession	Duplicates occurrenceID; not sure what this is supposed to be		This is primarily for plots data and will be the same as occurrenceID for specimens data
MISSING FROM BIEN VALIDATION
	!!	locality	Dc verbatimLocality		FIXED: These are now concatenated in the locality field
	!!	habitat	Dc locationRemarks
	!!	comment1	Dc locationRemarks
	!!	comment2	Dc locationRemarks
	!!	comment3	Dc locationRemarks

query:

SELECT *
FROM "UNCC"."Specimen" 
LEFT JOIN r7317.analytical_stem_view ON
    analytical_stem_view."datasource" = 'UNCC'
AND analytical_stem_view."occurrenceID" = "Specimen".accession
ORDER BY "Specimen".row_num
LIMIT 50;

VegBank ¶

see VegBank validation

Herbaria used¶

741 names

query:

SELECT name from "public.2012-11-28-18-18-51.r6487".sourcename order BY name;

Files (110)

MO.50_rows.csv (18.4 KB) MO.50_rows.csv		Aaron Marcuse-Kubitza, 11/30/2012 09:22 AM
MO.50_rows.xls (39 KB) MO.50_rows.xls		Aaron Marcuse-Kubitza, 11/30/2012 11:29 AM
MO.50_rows.annotated.xlsx (23.9 KB) MO.50_rows.annotated.xlsx		Aaron Marcuse-Kubitza, 11/30/2012 12:15 PM
VegBank.9297.50_rows.csv (17.4 KB) VegBank.9297.50_rows.csv		Aaron Marcuse-Kubitza, 12/05/2012 11:16 AM
VegBank.10191.50_rows.csv (29.4 KB) VegBank.10191.50_rows.csv		Aaron Marcuse-Kubitza, 12/05/2012 11:20 AM
VegBank.928.50_rows.csv (17.4 KB) VegBank.928.50_rows.csv		Aaron Marcuse-Kubitza, 12/05/2012 11:22 AM
VegBank.10504.50_rows.csv (19.2 KB) VegBank.10504.50_rows.csv		Aaron Marcuse-Kubitza, 12/05/2012 11:24 AM
Madidi.50_rows.xls (24.5 KB) Madidi.50_rows.xls		Aaron Marcuse-Kubitza, 12/05/2012 11:37 AM
Madidi.50_rows.csv (14.2 KB) Madidi.50_rows.csv		Aaron Marcuse-Kubitza, 12/05/2012 11:37 AM
NVS.50_rows.xls (19.5 KB) NVS.50_rows.xls		Aaron Marcuse-Kubitza, 12/06/2012 07:08 PM
NVS.50_rows.csv (14 KB) NVS.50_rows.csv		Aaron Marcuse-Kubitza, 12/06/2012 07:08 PM
NCU-NCSC.50_rows.xls (23.5 KB) NCU-NCSC.50_rows.xls		Aaron Marcuse-Kubitza, 12/07/2012 07:08 PM
NCU-NCSC.50_rows.csv (14.4 KB) NCU-NCSC.50_rows.csv		Aaron Marcuse-Kubitza, 12/07/2012 07:08 PM
ARIZ.50_rows.xls (31 KB) ARIZ.50_rows.xls		Aaron Marcuse-Kubitza, 12/11/2012 09:09 AM
ARIZ.50_rows.csv (22.8 KB) ARIZ.50_rows.csv		Aaron Marcuse-Kubitza, 12/11/2012 09:09 AM
Mapping changes needed for NYBG data in BIEN 3.0.xlsx (32 KB) Mapping changes needed for NYBG data in BIEN 3.0.xlsx		Aaron Marcuse-Kubitza, 12/12/2012 05:39 PM
CTFS.50_rows.csv (15.9 KB) CTFS.50_rows.csv		Aaron Marcuse-Kubitza, 12/12/2012 09:08 PM
CTFS.50_rows.xls (23.5 KB) CTFS.50_rows.xls		Aaron Marcuse-Kubitza, 12/12/2012 09:08 PM
ariz_validation_report.xlsx (27.9 KB) ariz_validation_report.xlsx		Aaron Marcuse-Kubitza, 12/17/2012 10:17 AM
CTFS.2012-12-17.50_rows.csv (8.72 KB) CTFS.2012-12-17.50_rows.csv		Aaron Marcuse-Kubitza, 12/17/2012 06:35 PM
CTFS.2012-12-17.50_rows.xls (20 KB) CTFS.2012-12-17.50_rows.xls		Aaron Marcuse-Kubitza, 12/17/2012 06:35 PM
NY.2012-12-17.50_rows.csv (11.8 KB) NY.2012-12-17.50_rows.csv		Aaron Marcuse-Kubitza, 12/17/2012 07:37 PM
NY.2012-12-17.50_rows.xls (22.5 KB) NY.2012-12-17.50_rows.xls		Aaron Marcuse-Kubitza, 12/17/2012 07:37 PM
NY.2012-12-17.orig.50_rows.csv (18.3 KB) NY.2012-12-17.orig.50_rows.csv		Aaron Marcuse-Kubitza, 12/17/2012 07:46 PM
NY.2012-12-17.orig.50_rows.xls (26 KB) NY.2012-12-17.orig.50_rows.xls		Aaron Marcuse-Kubitza, 12/17/2012 07:46 PM
NVS.2012-12-21.50_rows.csv (4.8 KB) NVS.2012-12-21.50_rows.csv		Aaron Marcuse-Kubitza, 12/21/2012 02:27 PM
NVS.2012-12-21.50_rows.xls (13 KB) NVS.2012-12-21.50_rows.xls		Aaron Marcuse-Kubitza, 12/21/2012 02:27 PM
REMIB.50_rows.csv (22.7 KB) REMIB.50_rows.csv		Aaron Marcuse-Kubitza, 01/16/2013 12:06 AM
REMIB.50_rows.xls (37 KB) REMIB.50_rows.xls		Aaron Marcuse-Kubitza, 01/16/2013 12:06 AM
ACAD.50_rows.csv (55.2 KB) ACAD.50_rows.csv		Aaron Marcuse-Kubitza, 01/16/2013 12:09 AM
ACAD.50_rows.xls (62.5 KB) ACAD.50_rows.xls		Aaron Marcuse-Kubitza, 01/16/2013 12:09 AM
U.50_rows.csv (32.9 KB) U.50_rows.csv		Aaron Marcuse-Kubitza, 01/16/2013 12:24 AM
U.50_rows.xls (44.5 KB) U.50_rows.xls		Aaron Marcuse-Kubitza, 01/16/2013 12:24 AM
NVS.2013-1-22.50_rows.csv (23.3 KB) NVS.2013-1-22.50_rows.csv		Aaron Marcuse-Kubitza, 01/22/2013 11:03 AM
NVS.2013-1-22.50_rows.xls (34 KB) NVS.2013-1-22.50_rows.xls		Aaron Marcuse-Kubitza, 01/22/2013 11:03 AM
UNCC.50_rows.csv (25.7 KB) UNCC.50_rows.csv		Aaron Marcuse-Kubitza, 01/22/2013 06:16 PM
UNCC.50_rows.xls (41.5 KB) UNCC.50_rows.xls		Aaron Marcuse-Kubitza, 01/22/2013 06:16 PM
UNCC-validation-Peet.xlsx (13.2 KB) UNCC-validation-Peet.xlsx		Aaron Marcuse-Kubitza, 01/23/2013 08:27 PM
UNCC.2013-1-24.50_rows.csv (27.4 KB) UNCC.2013-1-24.50_rows.csv		Aaron Marcuse-Kubitza, 01/24/2013 10:33 AM
UNCC.2013-1-24.50_rows.xls (41 KB) UNCC.2013-1-24.50_rows.xls		Aaron Marcuse-Kubitza, 01/24/2013 10:33 AM
ACAD.2013-1-25.100_rows.csv (115 KB) ACAD.2013-1-25.100_rows.csv		Aaron Marcuse-Kubitza, 01/25/2013 05:26 PM
ACAD.2013-1-25.100_rows.xls (126 KB) ACAD.2013-1-25.100_rows.xls		Aaron Marcuse-Kubitza, 01/25/2013 05:26 PM
REMIB.2013-1-25.100_rows.csv (52.9 KB) REMIB.2013-1-25.100_rows.csv		Aaron Marcuse-Kubitza, 01/25/2013 10:53 PM
U.2013-1-25.100_rows.csv (69.3 KB) U.2013-1-25.100_rows.csv		Aaron Marcuse-Kubitza, 01/25/2013 10:53 PM
U.2013-1-25.100_rows.xls (80 KB) U.2013-1-25.100_rows.xls		Aaron Marcuse-Kubitza, 01/25/2013 10:53 PM
SpeciesLink.2013-1-25.100_rows.csv (171 KB) SpeciesLink.2013-1-25.100_rows.csv		Aaron Marcuse-Kubitza, 01/25/2013 10:59 PM
SpeciesLink.2013-1-25.100_rows.xls (177 KB) SpeciesLink.2013-1-25.100_rows.xls		Aaron Marcuse-Kubitza, 01/25/2013 10:59 PM
REMIB.2013-1-25.100_rows.xls (74 KB) REMIB.2013-1-25.100_rows.xls		Aaron Marcuse-Kubitza, 01/25/2013 11:04 PM
ACAD.2013-1-25.100_rows_bb.xls (156 KB) ACAD.2013-1-25.100_rows_bb.xls		Aaron Marcuse-Kubitza, 01/31/2013 02:35 PM
SpeciesLink.2013-1-25.100_rows_bb.xls (206 KB) SpeciesLink.2013-1-25.100_rows_bb.xls		Aaron Marcuse-Kubitza, 01/31/2013 02:41 PM
U.2013-1-25.100_rows_bb.xls (104 KB) U.2013-1-25.100_rows_bb.xls		Aaron Marcuse-Kubitza, 01/31/2013 03:11 PM
ACAD.2013-2-20.100_rows.csv (119 KB) ACAD.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 09:39 AM
ACAD.2013-2-20.100_rows.xls (130 KB) ACAD.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 09:52 AM
REMIB.2013-2-20.100_rows.csv (50.8 KB) REMIB.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 09:54 AM
REMIB.2013-2-20.100_rows.xls (74.5 KB) REMIB.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 09:54 AM
U.2013-2-20.100_rows.csv (68.1 KB) U.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 09:58 AM
U.2013-2-20.100_rows.xls (84 KB) U.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 09:58 AM
HVAA.2013-2-20.100_rows.csv (83 KB) HVAA.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 10:32 AM
HVAA.2013-2-20.100_rows.xls (90.5 KB) HVAA.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 10:32 AM
NY.2013-2-20.100_rows.csv (99.8 KB) NY.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 11:55 AM
NY.2013-2-20.100_rows.xls (136 KB) NY.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 11:55 AM
TEAM.2013-2-20.100_rows.csv (87.4 KB) TEAM.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 01:25 PM
TEAM.2013-2-20.100_rows.xls (99.5 KB) TEAM.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 01:25 PM
SpeciesLink.2013-2-20.100_rows.csv (178 KB) SpeciesLink.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 02:13 PM
SpeciesLink.2013-2-20.100_rows.xls (182 KB) SpeciesLink.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 02:13 PM
UNCC.2013-2-20.100_rows.csv (58.9 KB) UNCC.2013-2-20.100_rows.csv		Aaron Marcuse-Kubitza, 02/20/2013 05:02 PM
UNCC.2013-2-20.100_rows.xls (88 KB) UNCC.2013-2-20.100_rows.xls		Aaron Marcuse-Kubitza, 02/20/2013 05:02 PM
ACAD.2013-2-20.100_rows_bb.xls (675 KB) ACAD.2013-2-20.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/21/2013 04:06 PM
HVAA.2013-2-20.100_rows_bb.xls (128 KB) HVAA.2013-2-20.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/21/2013 04:06 PM
NY.2013-2-20.100_rows_bb.xls (160 KB) NY.2013-2-20.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/21/2013 04:06 PM
REMIB.2013-2-20.100_rows_bb.xls (132 KB) REMIB.2013-2-20.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/21/2013 04:06 PM
U.2013-2-20.100_rows_bb.xls (108 KB) U.2013-2-20.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/21/2013 04:06 PM
SpeciesLink.2013-2-20.100_rows_bb.xls (727 KB) SpeciesLink.2013-2-20.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/26/2013 03:31 PM
ACAD.2013-2-26.100_rows.csv (115 KB) ACAD.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 06:50 PM
ACAD.2013-2-26.100_rows.xls (128 KB) ACAD.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 06:50 PM
HVAA.2013-2-26.100_rows.csv (83 KB) HVAA.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 07:16 PM
HVAA.2013-2-26.100_rows.xls (89.5 KB) HVAA.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 07:16 PM
NY.2013-2-26.100_rows.csv (103 KB) NY.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 07:20 PM
NY.2013-2-26.100_rows.xls (137 KB) NY.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 07:20 PM
REMIB.2013-2-26.100_rows.csv (51.1 KB) REMIB.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 07:43 PM
REMIB.2013-2-26.100_rows.xls (72.5 KB) REMIB.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 07:43 PM
SpeciesLink.2013-2-26.100_rows.csv (176 KB) SpeciesLink.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 07:48 PM
SpeciesLink.2013-2-26.100_rows.xls (180 KB) SpeciesLink.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 07:48 PM
U.2013-2-26.100_rows.csv (69 KB) U.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 07:52 PM
U.2013-2-26.100_rows.xls (83.5 KB) U.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 07:52 PM
TEX.2013-2-26.100_rows.csv (115 KB) TEX.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 08:01 PM
TEX.2013-2-26.100_rows.xls (141 KB) TEX.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 08:01 PM
BRIT.2013-2-26.100_rows.csv (118 KB) BRIT.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 02/26/2013 08:13 PM
BRIT.2013-2-26.100_rows.xls (130 KB) BRIT.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 02/26/2013 08:13 PM
validationResultsSummary_2013-02-27.xlsx (37.9 KB) validationResultsSummary_2013-02-27.xlsx		Aaron Marcuse-Kubitza, 02/28/2013 09:07 AM
ACAD.2013-2-26.100_rows_bb.xls (153 KB) ACAD.2013-2-26.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/28/2013 09:07 AM
BRIT.2013-2-26.100_rows_bb.xls (154 KB) BRIT.2013-2-26.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/28/2013 09:07 AM
NY.2013-2-26.100_rows_bb.xls (681 KB) NY.2013-2-26.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/28/2013 09:07 AM
TEX.2013-2-26.100_rows_bb.xls (173 KB) TEX.2013-2-26.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/28/2013 09:07 AM
U.2013-2-26.100_rows_bb.xls (114 KB) U.2013-2-26.100_rows_bb.xls		Aaron Marcuse-Kubitza, 02/28/2013 09:07 AM
validationResultsSummary_2013-02-21.xlsx (23.8 KB) validationResultsSummary_2013-02-21.xlsx		Aaron Marcuse-Kubitza, 02/28/2013 09:33 AM
UNCC.2013-2-26.100_rows.csv (60 KB) UNCC.2013-2-26.100_rows.csv		Aaron Marcuse-Kubitza, 03/01/2013 08:30 PM
UNCC.2013-2-26.100_rows.xls (88.5 KB) UNCC.2013-2-26.100_rows.xls		Aaron Marcuse-Kubitza, 03/01/2013 08:30 PM
UNCC-validate_rkp_bb_rkp.xlsx (17.6 KB) UNCC-validate_rkp_bb_rkp.xlsx		Aaron Marcuse-Kubitza, 03/05/2013 12:58 PM
Madidi.2013-3-5.100_rows.csv (278 KB) Madidi.2013-3-5.100_rows.csv		Aaron Marcuse-Kubitza, 03/06/2013 10:01 AM
Madidi.2013-3-5.100_rows.xls (171 KB) Madidi.2013-3-5.100_rows.xls		Aaron Marcuse-Kubitza, 03/06/2013 10:01 AM
NY.2013-3-6.100_rows.csv (102 KB) NY.2013-3-6.100_rows.csv		Aaron Marcuse-Kubitza, 03/06/2013 02:32 PM
NY.2013-3-6.100_rows.xls (130 KB) NY.2013-3-6.100_rows.xls		Aaron Marcuse-Kubitza, 03/06/2013 02:32 PM
BRIT.2013-3-6.100_rows.csv (166 KB) BRIT.2013-3-6.100_rows.csv		Aaron Marcuse-Kubitza, 03/06/2013 02:45 PM
BRIT.2013-3-6.100_rows.xls (162 KB) BRIT.2013-3-6.100_rows.xls		Aaron Marcuse-Kubitza, 03/06/2013 02:45 PM
ACAD.2013-3-6.100_rows.csv (113 KB) ACAD.2013-3-6.100_rows.csv		Aaron Marcuse-Kubitza, 03/06/2013 03:01 PM
ACAD.2013-3-6.100_rows.xls (126 KB) ACAD.2013-3-6.100_rows.xls		Aaron Marcuse-Kubitza, 03/06/2013 03:01 PM
Madidi.2013-3-6.100_rows.csv (289 KB) Madidi.2013-3-6.100_rows.csv		Aaron Marcuse-Kubitza, 03/06/2013 11:08 PM
Madidi.2013-3-6.100_rows.xls (186 KB) Madidi.2013-3-6.100_rows.xls		Aaron Marcuse-Kubitza, 03/06/2013 11:08 PM
Madidi.2013-3-6.100_rows_bb.xls (738 KB) Madidi.2013-3-6.100_rows_bb.xls		Aaron Marcuse-Kubitza, 03/07/2013 04:30 PM

Project

General

Profile

Wiki

spot-checking validation details¶

Queries¶

Specimens¶

Plots¶

ACAD¶

2013-3-6¶

2013-2-26¶

2013-2-20¶

2013-1-25¶

2013-1-15¶

ARIZ¶

2012-12-11¶

BRIT¶

2013-3-6¶

2013-2-26¶

CTFS¶

2012-12-17¶

2012-12-12¶

CVS¶

HVAA¶

2013-2-26¶

2013-2-20¶

Madidi¶

2013-3-7¶

2013-3-6¶

2013-3-5¶

2012-12-5¶

MO¶

2012-11-30¶

NCU¶

2012-12-7¶

NVS¶

2013-1-22¶

2012-12-21¶

2012-12-6¶

NY¶

2013-3-6¶

2013-2-26¶

2013-2-20¶

2012-12-17¶

2012-11-28¶

REMIB¶

2013-2-26¶

2013-2-20¶

2013-1-25¶

2013-1-15¶

SALVIAS¶

SpeciesLink¶

2013-2-26¶

2013-2-20¶

2013-1-25¶

TEAM¶

2013-2-20¶

TEX¶

U¶

2013-2-26¶

2013-2-20¶

2013-1-25¶

2013-1-15¶

UNCC¶

2013-2-26¶

2013-2-26¶

2013-2-20¶

2013-1-24¶

2013-1-22¶

VegBank¶

Herbaria used¶

CVS ¶

SALVIAS ¶

VegBank ¶