2011-12-01 conference call

To do

  • VegX /*s/taxonNameUsageConcept/voucher must contain collector's last name
  • Ignore SALVIAS voucher_string because it is sometimes missing collector's name
  • Ignore SALVIAS TaxonScrubber *_status fields, but consider how they might fit into /*s/taxonDetermination/*s/taxonRelationshipAssertion/assertion/{fit,confidence} fields
  • identify issues in existing mappings, note them in comments column, and e-mail to group for feedback
  • in the eventual GUI tool for creating mappings, show a preview of sample values for a field to assist mapping
    • list of possible enum values, statistical distribution of numerical values, or most common text values
  • spend time looking for a formal mechanism to do the mapping: not a full week, but enough to know that the VegBank and VegX formats are on the right track
  • look into VegBranch's way of capturing mappings and metadata: data is loaded into a "VegBank module database" (w/ VegBank schema) and then exported to XML
  • look into Altova XMLSpy's graphical generation of XPaths
  • determine whether XQuery (a superset of XPath) can express the queries we want: no
  • e-mail Mike Lee to find out which docs/sources he used to create the XML serialization format for VegBank
  • make sure taxonomic elements are correctly represented in VegX
  • develop benchmark tests to check that datasource data was inserted correctly into VegBank
    • these are pairs of summarizing queries which, when run on their respective databases, produce the same results
    • Brad will produce benchmark queries for SALVIAS and NYBG, and Aaron will translate them into VegBank
  • need list of milestones for the next 6-12 months
  • add conference call tasks to Redmine issue tracker
  • research Bourret's XML-ER mapping
  • research XQuery pointer dereferencing with higher-level operators
  • read CLIO articles and look up relevant references
  • look into RDF querying with SPARQL
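
The benchmark tests in the to-do list (pairs of summarizing queries that should produce the same results on the datasource and on VegBank) could be sketched roughly as follows. This is only an illustration: the table and column names (plot_obs, taxon_observation, and so on) and the sample rows are hypothetical stand-ins, not the real SALVIAS, NYBG, or VegBank schemas, and in-memory SQLite is used here only so the sketch is self-contained.

```python
import sqlite3


def run_query(conn, sql):
    """Run a summarizing query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())


def check_benchmark(source_conn, source_sql, target_conn, target_sql):
    """Run one query pair and report whether both sides agree."""
    source_rows = run_query(source_conn, source_sql)
    target_rows = run_query(target_conn, target_sql)
    return source_rows == target_rows, source_rows, target_rows


# Hypothetical sample data, loaded identically into both databases.
rows = [("P1", "Inga alba"), ("P1", "Inga edulis"), ("P2", "Inga alba")]

# Hypothetical stand-in for a datasource database (e.g. SALVIAS).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE plot_obs (plot_code TEXT, species TEXT)")
source.executemany("INSERT INTO plot_obs VALUES (?, ?)", rows)

# Hypothetical stand-in for the VegBank database after import.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE taxon_observation (plot_name TEXT, taxon_name TEXT)"
)
target.executemany("INSERT INTO taxon_observation VALUES (?, ?)", rows)

# One benchmark pair: number of distinct species per plot.
SOURCE_SQL = (
    "SELECT plot_code, COUNT(DISTINCT species) "
    "FROM plot_obs GROUP BY plot_code"
)
TARGET_SQL = (
    "SELECT plot_name, COUNT(DISTINCT taxon_name) "
    "FROM taxon_observation GROUP BY plot_name"
)

passed, source_rows, target_rows = check_benchmark(
    source, SOURCE_SQL, target, TARGET_SQL
)
print("benchmark passed:", passed)
```

Sorting the rows before comparing keeps the check independent of the order in which each database returns them. A mismatch would point at either a faulty import or a query pair that is not a faithful translation.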