2011 working group Fr BIEN Implementation

start out w/ VegBank, VegX, or DwC?
specimen data very uniform, so only need one dataset
park service db has plots data
sample of SALVIAS plots
take data, metadata and put into VegBank clone
get as close to the source as possible
work w/ Brad Boyle on loading SALVIAS data
do a MOBOT file
FIA is simple dataset
SALVIAS has individual level, observation level data
MBG, FIA extracts on nimoy
- FIA organized by state (48 states)
materialized views vs. single tables: raw data
SALVIAS not on nimoy
start w/ MBG, FIA
load the actual data from the source, not data already modified to fit the staging data
how FIA, MOBOT were obtained:
- MOBOT from Brian Enquist, who got it from Jay
for herbaria: need strict DwC
DwC mismatches: load to first adjustment of schema
NYBG: NY Botanical Garden
MBG: automated process (web service) exposes data
- publicly available?
/home/bien_shared/raw_data/ny/: DwC
DIGIR servers provide data from the source
DwC archives
GBIF now uses CSV files to index things w/ metadata
does BIEN 2 deal with specimens? need to add fields
cultivated specimens in DwC? Ariz has them
really need specimen desc field in DwC
isCultivated is boolean (nullable?); has text desc field to explain reason
no schema spec, so handled differently by each institution
isCultivated is interpretation; may change in the future
should isCultivated be required?
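sketch of the isCultivated idea above (nullable boolean plus a free-text reason), assuming each institution encodes the flag differently. The names CultivatedStatus and parse_cultivated and the accepted spellings are hypothetical, not part of DwC or any BIEN schema:

```python
from dataclasses import dataclass
from typing import Optional

# Assumed spellings an institution might use; extend per data source.
_TRUE = {"1", "true", "t", "yes", "y"}
_FALSE = {"0", "false", "f", "no", "n"}


@dataclass
class CultivatedStatus:
    # None means the source did not say, which is different from False.
    is_cultivated: Optional[bool]
    # Free-text explanation of why the record was flagged (or not).
    reason: Optional[str] = None


def parse_cultivated(raw: Optional[str], reason: Optional[str] = None) -> CultivatedStatus:
    """Normalize a source value into a nullable boolean plus reason text."""
    text = (raw or "").strip().lower()
    if text in _TRUE:
        value: Optional[bool] = True
    elif text in _FALSE:
        value = False
    else:
        value = None  # unknown or unrecognized: keep it null, do not guess
    reason_clean = (reason or "").strip() or None
    return CultivatedStatus(value, reason_clean)
```

keeping the value nullable leaves room for the interpretation to change later without rewriting records that never stated it.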
FIA is problematic
- need orig source of plot data
- distribute CD of data
- extract data from Access DB
compute aggregates
load FIA from the source
start w/ NYBG dataset
stress testing the model
primarily a learning exercise
see if we can get SALVIAS data
develop in pipelines and workflows
"press the button"-type of solution
map oddities of each db to VegX vs. directly to VegBank
don't focus entirely on single-push model
SALVIAS is static
complexity depends on amount of schema modifications
real plots in CSVs, but uniform and standardized
simple plot dataset as training tool
step 2: map SALVIAS
spreadsheet is CSV: one for aggregates and one for plot attrs
- 3 representative CSVs
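sketch of reading the two-CSV layout above (one file of aggregates, one of plot attrs) and joining them on a shared plot key. The column names plot_id, species, stem_count, and area_m2 are made up for illustration and are not taken from SALVIAS or VegBank:

```python
import csv
import io


def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def join_plots(aggregates_csv, plot_attrs_csv, key="plot_id"):
    """Attach plot attributes to each aggregate row.

    Returns (joined_rows, orphan_keys); orphan_keys lists aggregate rows
    whose plot is missing from the plot-attrs file.
    """
    attrs = {row[key]: row for row in read_rows(plot_attrs_csv)}
    joined, orphans = [], []
    for agg in read_rows(aggregates_csv):
        plot = attrs.get(agg[key])
        if plot is None:
            orphans.append(agg[key])  # report the mismatch, do not drop it silently
            continue
        joined.append({**plot, **agg})
    return joined, orphans


# Hypothetical sample data in the assumed layout.
AGGREGATES = (
    "plot_id,species,stem_count\n"
    "P1,Quercus alba,12\n"
    "P1,Acer rubrum,5\n"
    "P2,Pinus taeda,30\n"
    "P9,Acer rubrum,1\n"
)
PLOT_ATTRS = "plot_id,area_m2\nP1,1000\nP2,400\n"

joined, orphans = join_plots(AGGREGATES, PLOT_ATTRS)
```

returning the orphan keys separately makes mismatches between the two files visible when stress testing the model on a small training dataset.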
choose example datasets
VegBank deals w/ TurboVeg? no direct communication but could export as CSV -> import
Brad will post or e-mail current version of requirements doc
get actual plot data
