2013-11-14 conference call¶
Martha's to-dos¶
VegBank
- Aaron: Top priority is to fix the one remaining problem Mike identified in the VegBank extract. Send Mike and Bob a new (and hopefully final) extract to validate.
Taxon Names
- Aaron: Second priority is to make the changes to taxon names. First, send Brad a description of the algorithm you’ll use so he can sign off or correct it.
CVS
- Aaron: Third priority is to make the changes to the CVS scripts analogous to the changes that were made for VegBank.
Madidi and beyond
- Aaron: Work on validating other data sources only if the higher priority tasks are complete or you’re stuck waiting on others to do things.
Milestone document
- Aaron: Please add two items to the (outdated) list of milestones to accomplish: 1) Trait workflow, 2) Diagram overall workflow for BIEN3. Just put these items at the bottom of the spreadsheet. [see timeline.2013.xls]
Upcoming¶
- call next week at usual time (Th 9am PT/10am Tucson/12pm ET; note DST time change for Tucson)
Availability¶
- see the *Google spreadsheet* (and please add your availability for future weeks once it's known)
Decisions made¶
datasource validations¶
- OK to proceed with Madidi (Martha and Brad)
- order doesn't have to be strictly linear; the focus on one datasource at a time was just for VegBank to ensure we made progress on it
taxonomic fields¶
- accepted_* fields should actually be named scrubbed_* because the names in these fields are apparently not always accepted
- names with taxonomic_status=invalid should be excluded because they cannot be mapped to a particular taxon
- taxonomic_status should be accepted instead of synonym when an accepted name is available (this is not always the case when a name is marked as a synonym)
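The first two decisions can be sketched roughly as follows. This is only an illustration: the row layout (a list of dicts) and the sample field name accepted_family are assumptions made for the example, not the actual BIEN3 schema or scripts.

```python
def rename_accepted_fields(row):
    """Return a copy of row with every accepted_* key renamed to scrubbed_*."""
    prefix = "accepted_"
    return {
        ("scrubbed_" + key[len(prefix):]) if key.startswith(prefix) else key: value
        for key, value in row.items()
    }


def exclude_invalid(rows):
    """Drop rows whose taxonomic_status is 'invalid' (no taxon to map them to)."""
    return [row for row in rows if row.get("taxonomic_status") != "invalid"]


# Hypothetical sample rows; the field names and values are illustrative only.
rows = [
    {"accepted_family": "Poaceae", "taxonomic_status": "accepted"},
    {"accepted_family": None, "taxonomic_status": "invalid"},
]
cleaned = [rename_accepted_fields(row) for row in exclude_invalid(rows)]
```

Filtering before renaming keeps the two rules independent, so either can be dropped or changed without touching the other.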
dependencies on BIEN2¶
- the raw data should be loaded directly into BIEN3, not via BIEN2
To do for Martha¶
- remind Bob and Mike Lee to review the new VegBank extract
To do for Brad¶
taxonomic fields¶
- review updated list of taxonomic fields
dependencies on BIEN2¶
- find Cyrille's traits CSV and import scripts and put them on nimoy in /home/bien_shared/traits/
  - raw data: nimoy:/home/bien_shared/traits/raw_data/
  - flat files: nimoy:/home/boyle/bien2/load_data/traits/load_traits/data/
  - load scripts: nimoy:/home/boyle/bien2/load_data/traits/load_traits/
To do for Aaron¶
taxonomic fields¶
- create updated list of taxonomic fields
  - modified from Brad's list:
    - familyVerbatim
    - scientificNameVerbatim
    - familyMatched
    - taxonMatched
    - taxonAuthorshipMatched
    - familyScrubbed
    - taxonScrubbed
    - taxonAuthorshipScrubbed
    - annotations
    - unmatchedTerms
    - morphospecies
    - taxonomic_status
    - is_accepted
  - send Brad the list to review
- exclude names with taxonomic_status=invalid
- map taxonomic_status through to the validation view
  - populate this with what TNRS provides, except use accepted instead of synonym when an accepted name is available
  - TNRS actually uses accepted already when an accepted name is available
- add is_accepted
  - true if taxonomic_status=accepted; false if taxonomic_status=synonym or invalid; NULL if taxonomic_status=no opinion or NULL
  - also need to map the Asteraceae custom statuses
- send Brad the formulas to form the TNRS derived fields
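The is_accepted rule above could be expressed as a small function like the one below. This is a minimal sketch, not the formula that will be sent to Brad: the status strings are taken as written in these notes (the exact strings TNRS emits may differ, for example in capitalization), Python's None stands in for NULL, and raising an error for unlisted statuses is an assumption made here because the Asteraceae custom statuses have not been mapped yet.

```python
def is_accepted(taxonomic_status):
    """Derive is_accepted from taxonomic_status.

    Returns True, False, or None (None stands in for SQL NULL).
    """
    if taxonomic_status is None or taxonomic_status == "no opinion":
        return None
    if taxonomic_status == "accepted":
        return True
    if taxonomic_status in ("synonym", "invalid"):
        return False
    # Assumption: statuses not covered by the rule (such as the Asteraceae
    # custom statuses, which still need a mapping) are rejected, not guessed.
    raise ValueError(f"unmapped taxonomic_status: {taxonomic_status!r}")
```

Failing loudly on an unknown status makes any unmapped custom status show up during validation instead of silently becoming NULL.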
datasource validations¶
- VegBank (fix new issue)
- CVS
- Madidi and others when waiting on Bob and Mike Lee
exports¶
- filter taxon_trait table to include only geovalid and TNRS-valid records
  - geovalid does not make sense to filter by because the output table just contains a list of values for each trait, rather than specific point occurrences
lower priority:
dependencies on BIEN2¶
- identify which datasources are loaded from BIEN2 exports instead of directly from raw data
  - search the Datasource conditions of use page for "BIEN2"
import process¶
- create high-level workflow diagram
- this will help to identify dependencies on BIEN2