Task #917
open
Task #928: switch to new TNRS setup
TNRS: Instructions for new version with TPL
Added by Martha Narro over 10 years ago.
Updated over 7 years ago.
Activity type:
Coding/analysis
Description
From Brad:
I’ve given some thought to the TPL matter. The algorithm isn’t hard, but Aaron will have to do the sorting himself.
1. Make sure sources are selected in the following order: GCC, TPL, Tropicos, USDA
2. When downloading names, do NOT sort by source (i.e., don't limit results to just the best match when sorted by source)
3. Download all results (not just best matches)
3a. fix anomaly where there were multiple Selected names for some input names (to avoid breaking constraints)
3b. reimplement the parsed-rank columns for the all-matches strategy, which does not have a single scrubbed name per input name to parse (see taxon_match derived columns).
3c. create table and algorithm to store a selected best match for each input name
4. Apply the usual TNRS sort order (see the README) to the matches for a given name.
Aaron, Brad said there are no additional steps to apply here (in step 4). Just proceed to step 5. (--Martha)
Brad now says that actually the TNRS sort order is incorrect because of the Constrain by Source bug.
5. If the best match (indicated by Selected=TRUE) has source=Tropicos and acceptance=accepted AND another match is available where source is not Tropicos and acceptance=synonym, use the latter name (we don't need this until after the names are scrubbed (Martha))
6. In all other cases, use the best match as flagged
That should filter out most Tropicos nomenclatural synonyms incorrectly labeled accepted. I can unpack #4 for Aaron when the time arrives.
Brad
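Steps 5 and 6 above could be sketched roughly as follows. This is only an illustration, not the code used for the rescrubbing: the `Match` record, its field names, and the `choose_match` function are hypothetical, the source and acceptance values are assumed to compare case-insensitively, and when several non-Tropicos synonyms are available the sketch simply takes the first one in the order given, which the instructions above do not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Match:
    """One TNRS match for an input name (hypothetical field names)."""
    name: str
    source: str       # e.g. "GCC", "TPL", "Tropicos", "USDA"
    acceptance: str   # assumed values such as "accepted" or "synonym"
    selected: bool    # the Selected flag from the download


def _is(value: str, expected: str) -> bool:
    # Assumption: compare case-insensitively and ignore surrounding spaces.
    return value.strip().lower() == expected.lower()


def choose_match(matches: List[Match]) -> Optional[Match]:
    """Pick the match to use for ONE input name, following steps 5 and 6.

    `matches` holds all downloaded matches for that name, assumed to be
    already in the usual TNRS sort order (step 4).
    """
    flagged = [m for m in matches if m.selected]
    if len(flagged) > 1:
        # Guards against the multiple-Selected anomaly mentioned in step 3a.
        raise ValueError("more than one Selected match for this input name")
    if not flagged:
        return None
    best = flagged[0]

    # Step 5: Tropicos "accepted" best match is overridden by a synonym
    # from any other source, if one is available.
    if _is(best.source, "Tropicos") and _is(best.acceptance, "accepted"):
        for m in matches:
            if not _is(m.source, "Tropicos") and _is(m.acceptance, "synonym"):
                return m  # assumption: first such match in the given order

    # Step 6: all other cases, use the best match as flagged.
    return best
```

The function works on the matches for a single input name, so a caller would group the downloaded rows by input name first and store one result per name (as in step 3c).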
Files
Aaron, for now (the next couple of months), this must be done on the development TNRS web app, since TPL is only on the dev app. Martha will send you the URL.
- Description updated (diff)
- % Done changed from 0 to 20
- % Done changed from 20 to 40
The taxon names Brad sent for testing the rescrubbing are now attached.
- % Done changed from 40 to 60
all the taxon names have now been rescrubbed
multiple Selected names bug fixed in r13855
testing Martha's watcher notifications...
testing Martha's watcher notifications again after e-mail address fix...
- % Done changed from 60 to 70
test update now that watcher notifications have been fixed
- % Done changed from 70 to 80
a single "-" creates strikethrough formatting, so you have to use "--" instead
- % Done changed from 80 to 70
- % Done changed from 70 to 80
- % Done changed from 80 to 90
- % Done changed from 90 to 80