- Table of contents
- GBIF validation
GBIF validation¶
source data: *on vegbiendev in MySQL* (log in as bien_read with the BIEN password)
2014-10-x¶
input-to-output query:
SELECT *
FROM "GBIF".raw_occurrence_record_plants
LEFT JOIN view_full_occurrence_individual ON
view_full_occurrence_individual.datasource = 'GBIF'
AND view_full_occurrence_individual.custodial_institution_codes = raw_occurrence_record_plants.specimen_duplicate_institutions
AND view_full_occurrence_individual.collection_code = raw_occurrence_record_plants.collection
AND view_full_occurrence_individual.catalog_number = raw_occurrence_record_plants."accessionNumber"
ORDER BY raw_occurrence_record_plants."*row_num"
LIMIT 100
output-to-input query:
SELECT *
FROM view_full_occurrence_individual
LEFT JOIN "GBIF".raw_occurrence_record_plants ON
view_full_occurrence_individual.custodial_institution_codes = raw_occurrence_record_plants.specimen_duplicate_institutions
AND view_full_occurrence_individual.collection_code = raw_occurrence_record_plants.collection
AND view_full_occurrence_individual.catalog_number = raw_occurrence_record_plants."accessionNumber"
WHERE view_full_occurrence_individual.datasource = 'GBIF'
/* no ORDER BY because the view_full_occurrence_individual rows are not in any
particular order (the hash joins that produce it do not sort their output) */
LIMIT 100
2013-12-10¶
*GBIF.2013-12-10.100_rows.xls*
(input and output data are in separate tabs. refer to the VegCore data dictionary for column definitions.)
- KNOWN NON-ISSUE:
stateProvince
is not populated, because the herbarium in the validation view does not provide it
subset import command:
time (export log= version=GBIF_VegBIEN; make schemas/$version/reinstall; make inputs/GBIF/import by_col=1 n=100; bin/make_analytical_db) # runtime: 1 min ("0m39.965s") @starscream; 2 min ("1m38.950s") @vegbiendev
query:
SELECT *
FROM "GBIF".raw_occurrence_record_plants
LEFT JOIN "GBIF_VegBIEN".analytical_specimen ON
analytical_specimen."datasource" = 'GBIF'
AND analytical_specimen."specimenHolderInstitutions" = raw_occurrence_record_plants."specimenHolderInstitutions"
AND analytical_specimen."collection" = raw_occurrence_record_plants."collection"
AND analytical_specimen."accessionNumber" = raw_occurrence_record_plants."accessionNumber"
ORDER BY raw_occurrence_record_plants."*row_num"
LIMIT 100