2012-01-05 conference call¶
Brad's meeting notes¶
In order of expected completion. The priorities below cover most of this month. Top priorities for the next week or two are 1-5.
Priorities¶
1. Modify the VegBIEN schema to incorporate all of Bob's suggested changes to support correct mapping between individuals and stems, etc.
2. Create direct mapping & import scripts from VegX→VegBIEN
3. Identify critical fields in VegBIEN, and modify constraints (Brad & Aaron; may need input from Bob or Mike Lee)
   - These are fields which block a record from importing if the value cannot be parsed or otherwise violates constraints. All other fields should be set to NULL if the value cannot be imported, and the import error reported and logged. In most cases, it is better to set a particular value to NULL than to skip an entire record (in other words, participation [relational] constraints are more important than business rules or data type constraints for particular columns), at least for plot data, IMHO. It may be necessary to modify some FKs and relations in the VegBIEN schema to accommodate these changes; mostly I suspect we will be "loosening" constraints. In my experience, VegBank has a number of mandatory participations and required fields which should be set to optional for VegBIEN.
4. SALVIAS data (plots)
   - Complete mapping of SALVIAS→VegX
   - Expand VegX→VegBIEN import utility to accommodate all elements in SALVIAS VegX extract
   - Run complete import of entire SALVIAS database
     - Run all validations (with help from Brad)
     - Make changes as necessary to schemas and import scripts to fix any issues found
5. NYBG data (specimens; DwC)
   - Complete mapping of NYBG→VegX
   - Expand VegX→VegBIEN import utility to accommodate all elements in NYBG VegX extract
   - Run complete import of entire NYBG database
     - Run all validations (with help from Brad)
     - Make changes as necessary to schemas and import scripts to fix any issues found
6. CTFS (plots)
   - Work with Shash to expand VegX→VegBIEN scripts to cover any elements present in the CTFS Panama data not previously included
   - Import Panama plot data
     - Validate (with help from Rick)
   - Work with Shash, Steve to develop separate CTFS→VegX mappings for species-level inventories (this is a separate data set which Shash has not yet mapped to VegX; it should be done separately, no reason to delay import of Panama plots)
   - Modify VegX→VegBIEN scripts if necessary
   - Import species inventory data
     - Validate (with help from Rick)
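The critical-field rule from the priorities above could be sketched roughly as follows. This is only an illustration, not the actual import utility: the field names (`plot_id`, `stem_diameter_cm`, `stem_count`), the choice of which field is critical, and the `import_record` helper are all hypothetical, since the real set of critical fields is still to be decided.

```python
from typing import Any, Callable, Optional

# Hypothetical fields and parsers, for illustration only.
# Which VegBIEN fields are actually critical is still to be decided.
CRITICAL_FIELDS = {"plot_id"}
PARSERS: dict[str, Callable[[str], Any]] = {
    "plot_id": str.strip,
    "stem_diameter_cm": float,
    "stem_count": int,
}


def import_record(raw: dict[str, str], errors: list[str]) -> Optional[dict[str, Any]]:
    """Return the cleaned record, or None if a critical field ends up NULL."""
    cleaned: dict[str, Any] = {}
    for field, parse in PARSERS.items():
        value = (raw.get(field) or "").strip()
        try:
            # An empty or missing value is simply NULL (None), not an error.
            cleaned[field] = parse(value) if value else None
        except ValueError as exc:
            # Unparseable value: log the error and fall back to NULL.
            errors.append(f"{field}={value!r}: {exc}")
            cleaned[field] = None
        if cleaned[field] is None and field in CRITICAL_FIELDS:
            # Only a NULL critical field blocks the whole record.
            errors.append(f"record skipped: critical field {field} is NULL")
            return None
    return cleaned


errors: list[str] = []
record = import_record(
    {"plot_id": "P1", "stem_diameter_cm": "abc", "stem_count": "3"}, errors
)
print(record)
print(errors)
```

Here the unparseable diameter is set to NULL and logged, while the record itself is still imported; the record would only be dropped if `plot_id` were missing or empty.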
Other data sources to be added (lower priority, after the above is complete; roughly by the end of January):¶
NCU (specimens)
- Aaron to work directly with Mike Lee to develop mapping for DwC dump from NCU database. Brad & Bob may be able to help as well.
- NCU data should be mapped to DwC, NOT VegX. This is because NCU is herbarium data, which is much simpler than plot data. Most herbaria will be able to provide us with data dumps in this form; if they cannot (as with NCU) we should help them map to DwC, which they can use for other purposes. No herbarium database manager is going to be interested in mapping their data to something as complex as VegX. For this reason, we need a separate, generic DwC→VegX mapping. Thus, the import route for herbarium data should always be:
- Herbarium DB → DwC → VegX → VegBIEN
- As most herbaria will provide us with data already in DwC format, we will rarely have to do step one. The rest should be totally generic.
- Bob will work on obtaining access
- Brad to pester Gaby to respond to Aaron
To do¶
- Finish importing SALVIAS data
  - Import stems data
  - Fix data format issues
    - Map invalid data to NULL
    - Only ignore row if critical field is NULL
    - Decide which fields are critical
- Import full NYBG data
- Import CTFS data
  - Coordinate with Shash: have VegX file
  - CTFS has a lot of stems data
- Import TurboVeg data
- Decouple VegBIEN from VegBank and map directly from VegX to VegBIEN
For next week¶
- Review timeline feedback: on the wiki under December 8 2011 WebEx meeting
- Confirm new meeting time: Friday 1/27 at 1 PM PST (2 PM Mountain, 4 PM Eastern)
Goals¶
- single, robust set of scripts
- every VegX element will map to a VegBIEN element
- VegX elements in use by existing data sets will be mapped first
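
One possible way to track progress toward these goals is a small coverage check over the mapping table. The sketch below is an assumption about how that might look, not part of the existing scripts: the element names and the `unmapped_elements` helper are hypothetical placeholders, and the real VegX and VegBIEN names will differ.

```python
# Hypothetical mapping table and element names, for illustration only.
VEGX_TO_VEGBIEN = {
    "plotName": "location.name",
    "stemDiameter": "stem.diameter",
}


def unmapped_elements(mapping, all_elements, in_use):
    """List VegX elements with no VegBIEN target.

    Elements used by existing data sets come first, since those are
    to be mapped first; within each group the order is alphabetical.
    """
    missing = set(all_elements) - set(mapping)
    return sorted(missing, key=lambda element: (element not in in_use, element))


all_vegx = {"plotName", "stemDiameter", "stemHeight", "observerName"}
used_by_data_sets = {"plotName", "stemHeight"}
print(unmapped_elements(VEGX_TO_VEGBIEN, all_vegx, used_by_data_sets))
```

An empty result would mean every listed VegX element has a VegBIEN target.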