2013-11-14 conference call

Martha's to-dos

VegBank
  • Aaron: Top priority is to fix the one remaining problem Mike identified in the VegBank extract.
  • Send Mike and Bob a new (and hopefully final) extract to validate.
Taxon Names
  • Aaron: Second priority is to make the changes to taxon names. First, send Brad a description of the algorithm you’ll use so he can sign off or correct it.
CVS
  • Aaron: Third priority is to make the changes to the CVS scripts analogous to the changes that were made for VegBank.
Madidi and beyond
  • Aaron: Work on validating other data sources only if the higher-priority tasks are complete or you’re stuck waiting on others.
Milestone document
  • Aaron: Please add two items to the (outdated) list of milestones to accomplish: 1) Trait workflow, 2) Diagram overall workflow for BIEN3. Just put these items at the bottom of the spreadsheet. [see timeline.2013.xls]

Upcoming

  • call next week at usual time (Th 9am PT/10am Tucson/12pm ET; note DST time change for Tucson)

Decisions made

datasource validations

  • OK to proceed with Madidi (Martha and Brad)
  • order doesn't have to be strictly linear; the focus on one datasource at a time was just for VegBank to ensure we made progress on it

taxonomic fields

  • accepted_* fields should actually be named scrubbed_* because the names in these fields are apparently not always accepted
  • names with taxonomic_status = invalid should be excluded because they cannot be mapped to a particular taxon
  • taxonomic_status should be accepted instead of synonym when an accepted name is available (this is not always the case when a name is marked as a synonym)
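
A minimal sketch of the first two decisions above (renaming and exclusion), assuming each record is a plain dict keyed by field name. The function name and the example field accepted_family are hypothetical, and the lowercase spelling of the status values is taken from these notes rather than from actual TNRS output.

```python
def apply_taxonomic_decisions(rows):
    """Rename accepted_* keys to scrubbed_* and drop rows marked invalid.

    Assumption: rows is an iterable of dicts, and taxonomic_status holds
    lowercase values such as "accepted", "synonym", or "invalid".
    """
    result = []
    for row in rows:
        # Invalid names cannot be mapped to a particular taxon, so skip them.
        if row.get("taxonomic_status") == "invalid":
            continue
        renamed = {}
        for key, value in row.items():
            # Only keys that start with accepted_ are renamed;
            # a key such as is_accepted is left untouched.
            if key.startswith("accepted_"):
                key = "scrubbed_" + key[len("accepted_"):]
            renamed[key] = value
        result.append(renamed)
    return result
```

The input rows are not modified; a new list of new dicts is returned, so the original records stay available for comparison during validation.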

dependencies on BIEN2

  • the raw data should be loaded directly into BIEN3, not via BIEN2

To do for Martha

  • remind Bob and Mike Lee to review new VegBank extract

To do for Brad

taxonomic fields

  • review updated list of taxonomic fields

dependencies on BIEN2

  • find Cyrille's traits CSV and import scripts and put them on nimoy in /home/bien_shared/traits/
    • raw data: nimoy:/home/bien_shared/traits/raw_data/ .
    • flat files: nimoy:/home/boyle/bien2/load_data/traits/load_traits/data/ .
    • load scripts: nimoy:/home/boyle/bien2/load_data/traits/load_traits/ .

To do for Aaron

taxonomic fields

  1. create updated list of taxonomic fields
    modified from Brad's list:

familyVerbatim
scientificNameVerbatim
familyMatched
taxonMatched
taxonAuthorshipMatched
familyScrubbed
taxonScrubbed
taxonAuthorshipScrubbed
annotations
unmatchedTerms
morphospecies
taxonomic_status
is_accepted

  2. send Brad the list to review
  3. exclude names with taxonomic_status = invalid
  4. map taxonomic_status through to the validation view
    • populate this with what TNRS provides; no override of synonym is needed, because TNRS already uses accepted when an accepted name is available
  5. add is_accepted
    • true if taxonomic_status = accepted; false if taxonomic_status = synonym or invalid; NULL if taxonomic_status = no opinion or NULL
    • also need to map the Asteraceae custom statuses
  6. send Brad the formulas used to form the TNRS-derived fields
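
The is_accepted rule in the list above can be sketched as a small function. This is an illustration only: the function name, the use of Python's None for NULL, and the case-insensitive comparison are assumptions. The Asteraceae custom statuses are not specified in these notes, so the sketch raises an error for any unrecognized status instead of guessing a mapping.

```python
def is_accepted(taxonomic_status):
    """Derive is_accepted from taxonomic_status.

    Returns True for accepted, False for synonym or invalid,
    and None (standing in for NULL) for no opinion or a missing status.
    """
    if taxonomic_status is None:
        return None
    # Assumption: compare case-insensitively and ignore surrounding spaces.
    status = taxonomic_status.strip().lower()
    if status == "accepted":
        return True
    if status in ("synonym", "invalid"):
        return False
    if status == "no opinion":
        return None
    # Custom statuses (e.g. for Asteraceae) still need an agreed mapping.
    raise ValueError("unmapped taxonomic_status: %r" % taxonomic_status)
```

Raising on unknown values is a design choice for the sketch: it makes unmapped statuses visible during validation rather than silently treating them as NULL.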

datasource validations

  1. VegBank (fix new issue)
  2. CVS
  3. Madidi and others when waiting on Bob and Mike Lee

exports

  • filter taxon_trait table to include only geovalid and TNRS-valid records
    note: filtering by geovalid does not make sense here, because the output table contains only a list of values for each trait rather than specific point occurrences

lower priority:

dependencies on BIEN2

  • identify which datasources are loaded from BIEN2 exports instead of directly from raw data (search the Datasource conditions of use page for "BIEN2")

import process

  • create high-level workflow diagram
    • this will help to identify dependencies on BIEN2