Task #917

open

Task #928: switch to new TNRS setup

TNRS: Instructions for new version with TPL

Added by Martha Narro almost 10 years ago. Updated about 7 years ago.

Status: New
Priority: Normal
Start date: 05/13/2014
Due date:
% Done: 80%
Estimated time:
Activity type: Coding/analysis

Description

From Brad:

I’ve given some thought to the TPL matter. The algorithm isn’t hard, but Aaron will have to do the sorting himself.

1. Make sure sources are selected in the following order: GCC, TPL, Tropicos, USDA
2. When downloading names, do NOT sort by source (i.e., don't limit results to just the best match when sorted by source)
3. Download all results (not just best matches)
3a. fix anomaly where there were multiple Selected names for some input names (to avoid breaking constraints)
3b. reimplement the parsed-rank columns for the all-matches strategy, which does not have a single scrubbed name per input name to parse (see taxon_match derived columns).
3c. create table and algorithm to store a selected best match for each input name
4. Apply the usual TNRS sort order (see the README [1]) to the matches for a given name.
Aaron, Brad said there are no additional steps to apply here (in step 4). Just proceed to step 5. (--Martha)
Brad now says that the TNRS sort order is actually incorrect because of the Constrain by Source bug.
5. If the best match (indicated by Selected=TRUE) has source=Tropicos and acceptance=accepted, AND another match is available where source != Tropicos and acceptance=synonym, use the latter name. (We don't need this until after the names are scrubbed. (Martha))
6. All other cases, use the best match as flagged

That should filter out most Tropicos nomenclatural synonyms incorrectly labeled accepted. I can unpack #4 for Aaron when the time arrives.

Brad
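
Steps 5 and 6 above can be sketched in a few lines of Python. This is only an illustration, not the code used in the TNRS import. The `Match` record, its field names, and the function `choose_match` are hypothetical. Comparisons are case-insensitive because the exact spelling of the source and acceptance values in the TNRS download is not stated here. The sketch also assumes the matches for one input name are already in the intended sort order, so that the first qualifying non-Tropicos synonym is the one to use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Match:
    """One TNRS match row for an input name (hypothetical field names)."""
    input_name: str
    matched_name: str
    source: str       # e.g. "GCC", "TPL", "Tropicos", "USDA"
    acceptance: str   # e.g. "accepted" or "synonym"
    selected: bool    # the TNRS Selected flag


def choose_match(matches: List[Match]) -> Optional[Match]:
    """Apply steps 5 and 6 to all matches for a single input name."""
    # The best match is the one TNRS flagged with Selected=TRUE.
    best = next((m for m in matches if m.selected), None)
    if best is None:
        return None

    # Step 5: a Tropicos "accepted" best match is overridden when another
    # source calls the name a synonym.
    if best.source.lower() == "tropicos" and best.acceptance.lower() == "accepted":
        for m in matches:
            if m.source.lower() != "tropicos" and m.acceptance.lower() == "synonym":
                return m

    # Step 6: in all other cases, keep the best match as flagged.
    return best
```

The function returns `None` when no match is flagged as selected. Input names with several selected matches (the anomaly in step 3a) would need to be resolved before calling it, since only the first flagged row is considered.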

[1] Note that edit_distance = (1 - specific_epithet_score) * greatest_length
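
As a small worked example of the footnote's formula, the helper below recovers an edit distance from a score. The function name is hypothetical. Two things are assumed rather than taken from this ticket: that `greatest_length` is the length of the longer of the two strings being compared, and that rounding to the nearest integer is appropriate to absorb floating-point error.

```python
def edit_distance_from_score(specific_epithet_score: float, greatest_length: int) -> int:
    """Recover edit distance as (1 - score) * greatest_length.

    Rounding to the nearest integer is an assumption, added because
    (1 - 0.9) * 10 is not exactly 1.0 in floating point.
    """
    return round((1 - specific_epithet_score) * greatest_length)


# A score of 0.75 on an epithet pair whose longer member has 8 characters
# corresponds to 2 edits; a perfect score corresponds to 0 edits.
two_edits = edit_distance_from_score(0.75, 8)
no_edits = edit_distance_from_score(1.0, 8)
```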


Files

48_test_names.txt (1.22 KB), Martha Narro, 06/04/2014 05:58 PM
48_test_name_tnrs_results.xlsx (42.2 KB), Martha Narro, 06/04/2014 05:58 PM