VegCSV vision

Exchange schema

  • two "tiers":
    • level 1: tightly structured VegSQL (Mike Lee: or fully compliant VegCSV, a precursor to VegSQL)
    • level 2: loosely structured VegCSV
  • level of structure determines whether data provider or aggregator provides mapping expertise

VegCSV

  • Mike Lee: a metadata CSV file (similar to a data dictionary) is the first file(s) in the export (optional for level 2); it describes any of these:
    • the individual tables and the data export as a whole (separate file?)
    • column relationships
    • Mike Lee: required fields
    • each row maps a (table, field) pair to a standard VegCore term
    • using one mapping file also allows the data provider to provide just a SQL export of their database which we can then import into our staging area
  • need to specify required columns and required tables (CSV listing?)
  • require standard table names, or allow any synonym?
  • automatic generation of LEFT JOINs, etc.
  • tools to help people get data into VegCSV
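
The metadata file ideas above (term mapping, required fields, column relationships, generated LEFT JOINs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metadata column names (`table`, `field`, `vegcore_term`, `required`, `references`), the sample tables, and the function names are all hypothetical, and the term names are DwC-style placeholders rather than an agreed VegCore list.

```python
import csv
import io

# Hypothetical metadata CSV. The column layout and the terms are illustrative
# assumptions, not part of any agreed VegCSV specification.
METADATA_CSV = """\
table,field,vegcore_term,required,references
plots,plot_id,locationID,yes,
plots,lat,decimalLatitude,no,
obs,obs_id,occurrenceID,yes,
obs,plot_id,locationID,yes,plots.plot_id
obs,species,scientificName,no,
"""

def load_metadata(text):
    """Parse the metadata CSV into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def term_mapping(rows):
    """Map each (table, field) pair to its standard term."""
    return {(r["table"], r["field"]): r["vegcore_term"] for r in rows}

def missing_required(rows, header_by_table):
    """List required (table, field) pairs absent from the provider's CSV headers."""
    return [
        (r["table"], r["field"])
        for r in rows
        if r["required"] == "yes"
        and r["field"] not in header_by_table.get(r["table"], [])
    ]

def left_joins(rows, base_table):
    """Build a SELECT with one LEFT JOIN per relationship declared on base_table."""
    sql = [f"SELECT * FROM {base_table}"]
    for r in rows:
        if r["table"] == base_table and r["references"]:
            parent, parent_field = r["references"].split(".")
            sql.append(
                f"LEFT JOIN {parent} "
                f"ON {base_table}.{r['field']} = {parent}.{parent_field}"
            )
    return "\n".join(sql)

rows = load_metadata(METADATA_CSV)
mapping = term_mapping(rows)

# Headers as they might appear in a provider's data files (obs lacks plot_id).
headers = {"plots": ["plot_id", "lat"], "obs": ["obs_id", "species"]}
missing = missing_required(rows, headers)
query = left_joins(rows, "obs")

print(mapping[("obs", "species")])
print(missing)
print(query)
```

Keeping mapping, required flags, and relationships in one metadata file means the same rows drive validation and join generation, which matches the idea of a single mapping file per export. A real tool would also need to settle the open questions above, such as whether table-name synonyms are allowed.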

VegSQL

  • based on VegBIEN
    • fields renamed to standard VegCore (DwC, VegX, etc.) terms
  • fully specified relational schema
  • specification hosted by TDWG
  • portal to download schema: VegBIEN schema wiki page
  • Mike Lee: there was some uncertainty about whether all of the VegSQL functions could be accomplished with VegCSV plus a fully described metadata file. Mike Lee thinks this is possible, and that one format (just VegCSV) would be simpler than two (VegSQL and VegCSV). VegSQL could then be an internally used transfer system.

Publication

  • describes data model as well as import workflow (possibly in separate papers)
  • final product must be open access
  • preferably open source
    • "Society" journals are closed-source
    • Elsevier/Springer fees are higher
  • universal data model in JVS (Journal of Vegetation Science)
    • enforces modest size on papers
  • if large paper, publish in Ecological Monographs
  • Brad, Bob willing to work on publication of model