2014-05-29 conference call

Martha's notes

Action Items

Aaron

  • Begin scrubbing taxon names on Friday (expected to run 4 days, which is actually 1 calendar week, or more).
    Update: the dev server is actually slower than the live server, so the run is currently estimated to take 2 weeks.
  • Define the Data Dictionary terms for VegBIEN and associated tables.
    • First define the (approximately 60) terms needed for the BIEN3 viewFullOccurrence table. For terms defined in other namespaces (e.g., DwC, VegBank, Salvias, VegX), link to those definitions.
    • After those are finished, move on to completing the Data Dictionary for the rest of BIEN3 and associated tables.
    • The definitions need to be drafted and linked before June 9th (at least for terms in viewFullOccurrence and ideally for the entire Data Dictionary).
  • For now, working on the disk space leak is lower priority than the items listed above.

Martha

  • Send Nicole’s contact information: Nicole Hopkins (DONE)
    • Please copy the BIEN list when communicating with her to keep everyone in the loop.
  • Send DDL for the BIEN3 viewFullOccurrence table that Brad provided. (DONE)

upcoming

  • there will be 2 conference calls on the VegBIEN data dictionary in June
    • please fill out the Doodle poll so that the data dictionary calls can be scheduled
  • the next conference call is next week at the usual time (Th. 9am PT/9am Tucson/12pm ET)

availability

  • please add your availability for summer 2014 to the *spreadsheet*

decisions

TNRS

  • higher priority than the disk space leak (Martha)
  • OK to add TNRS metadata after the fact, so that this doesn't delay the start of the rescrubbing (Martha)
  • don't implement Brad's match-picking algorithm until after the rescrubbing is started (Martha)
  • will scrub names 100,000 at a time to take advantage of the higher limit on the dev server (because there are no other simultaneous users)

VegBIEN data dictionary

  • analytical tables are the top priority
  • use existing definitions from the source of the term where possible (Mark)
  • originally: OK to just link to the definition of the term at the source instead of importing the definition into the data dictionary (the way we currently do things for VegCore)
    • update: we now want to import the definition as well
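
As a rough illustration of the decisions above (link to the source and also import the definition), a data dictionary entry could be tracked as sketched below. This is only a sketch: the class name, field names, the "VegBIEN" local namespace label, and the helper function are hypothetical and not part of the actual VegBIEN schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Assumption: label used here for terms defined by us rather than borrowed.
LOCAL_NAMESPACE = "VegBIEN"


@dataclass(frozen=True)
class TermEntry:
    """One data dictionary term (hypothetical structure, for illustration)."""
    name: str
    namespace: str                    # e.g. "DwC", "VegBank", "Salvias", "VegX"
    source_url: Optional[str] = None  # link to the definition at the source
    definition: Optional[str] = None  # definition text imported into the dictionary

    def is_complete(self) -> bool:
        """True if the definition is imported and, for borrowed terms, linked."""
        has_definition = bool(self.definition and self.definition.strip())
        if self.namespace == LOCAL_NAMESPACE:
            return has_definition
        return has_definition and bool(self.source_url)


def incomplete_terms(entries: Iterable[TermEntry]) -> List[str]:
    """Return the names of terms that still need a definition or a link."""
    return [entry.name for entry in entries if not entry.is_complete()]
```

Running incomplete_terms over the (approximately 60) viewFullOccurrence terms would give a simple checklist of what is still missing before the June 9th deadline.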

to do for Martha

  • forward the DDL e-mail from Brad

to do for Aaron

  • increase TNRS names per batch to 100,000

info

TNRS

  • it can scrub 4 million names in a couple of days (Nicole @iPlant)
  • it should be possible to scrub up to 100,000 names at a time (Nicole @iPlant)
    • however, this only works when using file upload mode, whether you are using the live or dev TNRS
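
For reference, splitting the name list into batches of 100,000 could look roughly like the sketch below. It only covers the batching step; how each batch is written to a file and uploaded to TNRS is not shown, and the function and constant names are hypothetical.

```python
from typing import Iterable, Iterator, List

# Per-batch limit from the notes above (only applies in file upload mode).
BATCH_SIZE = 100_000


def batches(names: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` names, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for name in names:
        batch.append(name)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Final, possibly shorter, batch.
        yield batch
```

At this batch size, a list of 4 million names would be split into 40 batches.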