- Table of contents
- 2013-06-06 conference call
- To dos from Martha
- To do for Mark
- To do for Aaron
- GBIF subsetting: fix plant_fraction SQL bug
- GBIF subsetting: fix raw_occurrence_record filter formula
- animal/plant genus and family homonyms
- higherPlantGroup population
- analytical_stem_view: add disambiguating prefix for TNRS accepted name terms
- analytical_stem_view: add combination of TNRS accepted and matched name
- document TNRS terms in VegCore data dictionary
- Availability
2013-06-06 conference call
To dos from Martha
from Martha on the iPlant wiki:
Mark
Regarding the item "postprocess TNRS results to exclude animals with genus homonyms":
Contact Tony at CSIRO about giving Aaron direct access to the IRMNG database's animal/plant genus homonyms so he doesn’t have to resort to parsing web pages.
Martha
On Monday, advise Aaron on how to proceed with the animal/plant genus homonyms.
Aaron
1) Fix the bug in the GBIF filtering script – we expect that to reduce the number of plant records to a believable number
2) Regarding "postprocess TNRS results to exclude animals with genus homonyms": wait until Monday to see whether Tony provides you direct access to the IRMNG database’s animal/plant genus homonyms.
3) fix higherPlantGroup to match on the genus when no family match
create genus->higherPlantGroup lookup table
lookup table must exclude internal plant homonyms (different from animal/plant homonyms)
get these from TNRS's Tropicos DB
4) add COALESCE of TNRS accepted and matched name to analytical_stem_view
5) FIA filtering
Mark and Jim
Help Aaron determine the points at which to provide concrete results.
Aaron and Brad
Work together to define the column name changes for the data dictionary (resulting from the 'coalesced TNRS name' item).
To do for Mark
e-mail Tony Rees at CSIRO to ask for direct access to the IRMNG database (cc Martha, Brad, Jim, Aaron)
- if we haven't heard from Tony by next Monday, we'll use a screenscraping approach on their web interface instead
To do for Aaron
GBIF subsetting: fix plant_fraction SQL bug
plant_fraction SQL bug: FIXED
- COUNT(boolean) counts non-NULL values rather than true values
- boolean is actually an integer datatype in MySQL (BOOLEAN is a synonym for TINYINT(1)), so MySQL would not know that you were referring to a boolean
- you need to add NULLIF(..., false) around the expression in inputs/GBIF/raw_occurrence_record/run > COUNT(family LIKE ...)
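The COUNT behaviour described above can be reproduced in a small sketch. It uses Python's built-in sqlite3 rather than MySQL (an assumption made only so the example is self-contained; both count non-NULL values), and the table `occ`, its rows, and the LIKE pattern are hypothetical stand-ins, not the contents of the actual run script.

```python
import sqlite3

# Hypothetical miniature stand-in for raw_occurrence_record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occ (family TEXT)")
conn.executemany(
    "INSERT INTO occ (family) VALUES (?)",
    [("Poaceae",), ("Fabaceae",), ("Felidae",), (None,)],
)

# The LIKE pattern is illustrative only; the real pattern is elided in the notes.
naive, fixed = conn.execute(
    """
    SELECT
        COUNT(family LIKE '%aceae'),            -- counts every non-NULL result, true or false
        COUNT(NULLIF(family LIKE '%aceae', 0))  -- false (0) becomes NULL, so only true rows count
    FROM occ
    """
).fetchone()

print(naive, fixed)  # 3 2
```

The naive count includes the non-matching family because its comparison result is 0, which is not NULL; only the row with a NULL family is skipped.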
GBIF subsetting: fix raw_occurrence_record filter formula
raw_occurrence_record filter formula: FIXED
- within the herbaria_filter institution_codes, NULL families are OK but non-plant families are not
- WHERE clause needs to include a recheck of each family to ensure that it is a plant or ambiguous
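A minimal sketch of the herbaria branch of such a filter, again using sqlite3 for self-containment. Every table and column here (`occ`, `herbaria`, `plant_or_ambiguous_family`) and all sample rows are hypothetical; the real filter formula and the way families are classified as plant or ambiguous are not shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- All names and rows below are illustrative assumptions.
    CREATE TABLE occ (id INTEGER, institution_code TEXT, family TEXT);
    CREATE TABLE herbaria (institution_code TEXT);
    CREATE TABLE plant_or_ambiguous_family (family TEXT);

    INSERT INTO herbaria VALUES ('NY'), ('MO');
    INSERT INTO plant_or_ambiguous_family VALUES ('Poaceae'), ('Fabaceae');
    INSERT INTO occ VALUES
        (1, 'NY', 'Poaceae'),   -- herbarium, plant family: keep
        (2, 'NY', NULL),        -- herbarium, NULL family: keep
        (3, 'MO', 'Felidae'),   -- herbarium, non-plant family: drop
        (4, 'XX', 'Fabaceae');  -- not a herbarium: outside this branch
    """
)

kept = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM occ
        WHERE institution_code IN (SELECT institution_code FROM herbaria)
          AND (family IS NULL
               OR family IN (SELECT family FROM plant_or_ambiguous_family))
        ORDER BY id
        """
    )
]
print(kept)  # [1, 2]
```

The explicit `family IS NULL` test matters because `NULL IN (...)` evaluates to NULL, which a WHERE clause treats as not true.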
animal/plant genus and family homonyms
Waiting to see if Tony will provide the species and family homonyms in the same delimited format as the genera (he said he would work on it this Monday). If he doesn't change the format, we'll still need a screenscraper for the other homonym ranks.
- note: there are also family homonyms between plants and the kingdoms Fungi, Bacteria, Protista (search for Plantae in the IRMNG page)
- GBIF may contain data from these kingdoms, especially fungi (e.g. mushrooms growing on a tree), so we do need to deal with family homonyms in general, even though there aren't animal/plant family homonyms.
- if we haven't heard from Tony by next Monday, implement screenscraping of their homonyms web interface
- automate download of each letter's page
- use regular expressions to extract homonyms
- ensure that the genera of all species-level homonyms are in the genus-level homonyms list
- when matching the species binomial against homonyms, use just the species-level homonyms rather than the genus-level homonyms, to include more unambiguous taxa
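The last two checks in the list above can be sketched as set operations. The function names and the placeholder taxon names ("Aus bus" etc.) are hypothetical; the extraction step itself is not shown because the layout of the homonyms pages is not described here.

```python
def genus_of(binomial):
    """Return the genus part (first word) of a species binomial."""
    return binomial.split()[0]


def genera_missing_from_genus_list(species_homonyms, genus_homonyms):
    """Genera of species-level homonyms that the genus-level list lacks."""
    return {genus_of(name) for name in species_homonyms} - set(genus_homonyms)


def is_species_homonym(binomial, species_homonyms):
    """Match on the full binomial, not just the genus."""
    return binomial in species_homonyms


# Placeholder names only, not real taxa.
species_homonyms = {"Aus bus", "Cus dus"}
genus_homonyms = {"Aus"}

missing = genera_missing_from_genus_list(species_homonyms, genus_homonyms)
print(missing)  # {'Cus'}

# "Aus eus" shares a homonymous genus but is not itself a species-level
# homonym, so matching on the binomial keeps it as unambiguous.
print(is_species_homonym("Aus bus", species_homonyms))  # True
print(is_species_homonym("Aus eus", species_homonyms))  # False
```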
higherPlantGroup population
- see 2013-05-30 conference call > fix higherPlantGroup
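As a rough illustration of the genus fallback listed in Martha's to-dos (match on the genus when there is no family match, using a lookup table that excludes internal plant homonyms), here is a sketch with in-memory dictionaries. The function names, the data shapes, and all placeholder values are assumptions; the real lookup would be built from TNRS's Tropicos DB.

```python
def build_genus_lookup(genus_rows, internal_homonym_genera):
    """Build genus -> higherPlantGroup, skipping internal plant homonyms."""
    return {
        genus: group
        for genus, group in genus_rows
        if genus not in internal_homonym_genera
    }


def higher_plant_group(family, genus, family_lookup, genus_lookup):
    """Prefer the family match; fall back to the genus when there is none."""
    group = family_lookup.get(family)
    if group is None:
        group = genus_lookup.get(genus)
    return group


# Placeholder values only.
family_lookup = {"FamilyA": "group1"}
genus_rows = [("GenusA", "group1"), ("GenusB", "group2"), ("GenusC", "group1")]
genus_lookup = build_genus_lookup(genus_rows, internal_homonym_genera={"GenusC"})

print(higher_plant_group("FamilyA", "GenusB", family_lookup, genus_lookup))  # group1
print(higher_plant_group(None, "GenusB", family_lookup, genus_lookup))       # group2
print(higher_plant_group(None, "GenusC", family_lookup, genus_lookup))       # None
```

Leaving homonymous genera out of the lookup means they resolve to no group rather than to a possibly wrong one.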
analytical_stem_view: add disambiguating prefix for TNRS accepted name terms
- family -> acceptedFamily, etc.
analytical_stem_view: add combination of TNRS accepted and matched name
- call this scrubbed_* instead of combined_*, because users need to know that this is the final output name from TNRS, and because this is not the combination of the scrubbed and verbatim names
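A small sketch of what the COALESCE of the accepted and matched name does, run through sqlite3 so it is self-contained. The table `tnrs`, its column names, and the sample values are hypothetical; only the accepted-then-matched ordering is taken from the to-do item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, for illustration only.
conn.execute("CREATE TABLE tnrs (accepted_name TEXT, matched_name TEXT)")
conn.executemany(
    "INSERT INTO tnrs VALUES (?, ?)",
    [
        ("Accepted one", "Matched one"),  # accepted name present: use it
        (None, "Matched two"),            # no accepted name: fall back to matched
        (None, None),                     # neither present: stays NULL
    ],
)

scrubbed = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(accepted_name, matched_name) AS scrubbed_name "
        "FROM tnrs ORDER BY rowid"
    )
]
print(scrubbed)  # ['Accepted one', 'Matched two', None]
```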
document TNRS terms in VegCore data dictionary
- mapping from analytical_stem_view (VegCore) name to TNRS name
- first add links to TNRS data dictionary
- then get Brad's input on the definitions
- Bob should review the names for clarity to scientists
Availability
- Mark will be gone next Monday 6/10
- Brad is unreachable all of this week but will be back next week