2013-06-13 conference call
To do for Brad
determine higher_plant_group node names in Tropicos/APGIII backbone
see higher_plant_group node names in Tropicos APGIII
- is the list in the BIEN2 analytical DB overview (p. 12 bottom > higherPlantGroup) complete?
"bryophytes", "ferns and allies", "flowering plants", "gymnosperms (conifers)", "gymnosperms (non-conifer)"
- e.g. there is an entry for "seed plants and ferns" under polyphyletic clades (p. 13 bottom > last ¶), but it is not in the list above
- on Monday?
ask Naim to include TNRS version columns in CSV download
Naim has been e-mailed and a feature request submitted at TNRS-185.
In order to know when to refresh our TNRS cache, it would be useful to have the following CSV columns indicating when the TNRS source code or database has changed:
- Start_time (e.g. Thu Jun 13 11:42:01 MST 2013)
- TNRS_app_version (e.g. 3.2)
- TNRS_db_version
- Github_revision_number (e.g. e18a25f79bfc1a3cd6c2a4f0803c84e49f8d19aa)
- the TNRS source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/
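A minimal sketch of how these columns could drive the cache refresh decision. The `TnrsVersion` type and `needs_refresh` function are hypothetical names for illustration, not part of TNRS or our client, and the sketch assumes the requested columns are actually added to the CSV download.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TnrsVersion:
    """Version metadata stored with a cached TNRS result (hypothetical type)."""
    app_version: Optional[str]      # TNRS_app_version, e.g. "3.2"
    db_version: Optional[str]       # TNRS_db_version
    github_revision: Optional[str]  # Github_revision_number


def needs_refresh(cached: TnrsVersion, current: TnrsVersion) -> bool:
    """Return True if the TNRS source code or database differs from the cached row's."""
    return cached != current
```

Start_time is left out of the comparison because it changes on every run and so cannot indicate a code or database change.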
e-mail out which meeting dates he is not available this summer
To do for Aaron
include GCC when running TNRS
- it provides more synonyms than Tropicos for Asteraceae, and the accepted names still match the Tropicos backbone
- positioned before Tropicos in the sources list, so that GCC will be used instead when it provides a result
- we had originally removed this along with USDA because "GCC is for only one family (Asteraceae)" (r5691)
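The effect of the source ordering can be sketched as a first-match-wins lookup. This is only a model of the behavior described above, not TNRS code: the `resolve` function, the per-source lookup tables, and the placeholder names are all assumptions for illustration.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Hypothetical per-source lookup tables: submitted name -> accepted name.
SourceResults = Mapping[str, Mapping[str, str]]


def resolve(name: str, sources: Sequence[str],
            results: SourceResults) -> Optional[Tuple[str, str]]:
    """Return (source, accepted name) from the first source in the list with a result."""
    for source in sources:
        accepted = results.get(source, {}).get(name)
        if accepted is not None:
            return source, accepted
    return None  # no source provided a result


# Placeholder data: GCC knows one synonym, Tropicos knows both.
results = {
    "gcc": {"synonym 1": "accepted 1"},
    "tropicos": {"synonym 1": "accepted 1", "synonym 2": "accepted 2"},
}
order = ["gcc", "tropicos"]  # GCC positioned before Tropicos
```

With this ordering, a name found in GCC is taken from GCC, and any other name falls through to Tropicos.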
fix higher_plant_group_nodes mapping
- contrary to the list in the BIEN2 analytical DB overview (p. 13 bottom > last ¶), "ferns and allies" should not include all the nodes in bryophytes
"ferns and allies": bryophytes (see above) + "Moniliformopses"
plant/non-plant genus/family homonyms
- genus and family homonyms are now both available in a delimited format
- can't use species homonyms to whitelist binomials because the list is not exhaustive
observation filtering
- observation_is_plant
- range_model_include
- counts of names, observations that are/are not plants
switch from NCBI backbone to Tropicos
see higher_plant_group node names in Tropicos APGIII
- this will avoid family+genus mismatch problems due to NCBI using a different family classification (needed when determining higher_plant_group)
- NCBI is also missing a number of genera from Tropicos
- use Tropicos name and classification tables, joined together
  - using the TNRS copy of Tropicos at ssh://arjuna.iplantcollaborative.org:1657: ssh -p 1657 aaronmk@arjuna.iplantcollaborative.org
- TNRS batch-downloads the names from the Tropicos web service once a year (script runs overnight)
analytical_stem_view: add disambiguating prefix for TNRS accepted name terms
analytical_stem_view: add combination of TNRS accepted and matched name
- TNRS "no opinion" names don't have a taxon concept (accepted name), just a matched name
add species_binomial
species_binomial: (from Brad) IF(Accepted_species IS NOT NULL, Accepted_species, IF(Specific_epithet_matched IS NOT NULL, CONCAT(Genus_matched, ' ', Specific_epithet_matched), NULL))
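The same fallback logic, sketched in Python for reference. This assumes MySQL semantics for Brad's expression (in particular, that CONCAT returns NULL when any argument is NULL); the function name and snake_case parameter names are illustrative stand-ins for the TNRS columns.

```python
from typing import Optional


def species_binomial(accepted_species: Optional[str],
                     genus_matched: Optional[str],
                     specific_epithet_matched: Optional[str]) -> Optional[str]:
    """Prefer the accepted species; otherwise build a binomial from the matched name."""
    if accepted_species is not None:
        return accepted_species
    if specific_epithet_matched is not None:
        if genus_matched is None:
            # Assumption: mirrors MySQL CONCAT, which yields NULL if any argument is NULL.
            return None
        return f"{genus_matched} {specific_epithet_matched}"
    return None  # matched only above species level, so there is no binomial
```

For a "no opinion" name with no accepted species, `species_binomial(None, "Poa", "annua")` falls back to the matched name and gives `"Poa annua"`.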
FIA filtering
document TNRS terms in VegCore data dictionary
include TNRS version and settings in TNRS cache
- this helps determine when the TNRS cache needs to be reloaded (and which names to reload)
- retrieve this from the TNRS web app's download settings file (using Download settings button displayed once results returned)
- the following attributes should be included as cache table columns:
  - TNRS URL
  - Job type
  - Contains Id
  - Start time
  - TNRS version
    - github revision (from https://github.com/iPlantCollaborativeOpenSource/TNRS)
    - TNRS DB version (not yet included)
  - Sources selected
  - Match threshold
  - Classification
  - Allow partial matches?
  - Constrain by higher taxonomy
- the following attributes are not needed:
  - E-mail (always set to tnrs@lka5jjs.orv when using the web app download)
  - Id (the session key)
  - Finish time (unused, always null)
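A rough sketch of selecting the cache columns from the downloaded settings. It assumes the settings file has already been parsed into a dict keyed by the attribute labels above; the file format, the `cache_columns` function, and the constant name are assumptions, not part of TNRS.

```python
from typing import Dict, Mapping, Optional

# Attributes to store as cache table columns (labels as listed in these notes).
# The github revision and TNRS DB version are omitted here because it is not
# yet known how (or whether) they appear in the settings file.
CACHE_ATTRIBUTES = (
    "TNRS URL",
    "Job type",
    "Contains Id",
    "Start time",
    "TNRS version",
    "Sources selected",
    "Match threshold",
    "Classification",
    "Allow partial matches?",
    "Constrain by higher taxonomy",
)


def cache_columns(settings: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Keep only the wanted attributes; missing ones become None (NULL in the cache)."""
    return {name: settings.get(name) for name in CACHE_ATTRIBUTES}
```

Using a whitelist rather than dropping E-mail, Id, and Finish time by name means any attribute added to the settings file later is ignored until it is deliberately added to the cache table.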
include our TNRS client's version in TNRS cache
- in addition to the TNRS web service's version, the client version is needed to track changes to the format we encode data in
- the following columns are needed:
- for existing rows, this information can be reconstructed from the Time_submitted
future GBIF exports
- change runscripts to not hardcode date in export filename
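One way to avoid the hardcoded date, sketched in Python. The actual runscripts may be written in another language, and the `export_filename` function, the filename pattern, and the `gbif_export` prefix are assumptions for illustration only.

```python
from datetime import date
from typing import Optional


def export_filename(prefix: str, on: Optional[date] = None, ext: str = "csv") -> str:
    """Build an export filename from the run date instead of a hardcoded one."""
    day = on if on is not None else date.today()  # default to today's date
    return f"{prefix}.{day.isoformat()}.{ext}"


# Passing the date explicitly keeps a specific export reproducible.
name = export_filename("gbif_export", date(2013, 6, 13))
```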
Availability
- Brad won't be available for some meetings this summer (he's full-time iPlant until end of June, then traveling in Canada and doing consulting)
- Brad will send meeting dates he won't be available
- Bob not here (getting ready for trip)