Task #916

open

Task #928: switch to new TNRS setup

Taxon name validation: VegBank

Added by Martha Narro over 10 years ago. Updated over 10 years ago.

Status:
New
Priority:
High
Start date:
05/12/2014
Due date:
05/16/2014 (over 10 years late)
% Done:

60%

Estimated time:
Activity type:
Coding/analysis

Description

Hi Aaron,

Bob only had time to get part way through the VegBank taxon validation file you sent, but there are some errors to correct. It'll be best for him if you fix these, rescrub the TNRS names as described in #917, and send a new extract before he invests more time, so I'm going to go ahead and create issues for them. Please fix these issues and then send Bob a new file in the format described in issue #915.

Line numbers refer to the csv file you sent him.

Line 617: TNRS gives a synonym for Aronia prunifolia in a different genus, but this is missed here

Likely cause: The BIEN scripts may be using only Tropicos as the taxonomic source, not USDA. Tropicos matches only the genus.

Fix: Use all sources for the next round of scrubbing. Use them in the order: TPL, Tropicos, GCC, USDA.
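As a rough illustration of why the source list matters, the sketch below picks one result from candidate matches returned by several sources. It prefers a species-level match over a genus-only match and breaks ties using the source order given above. The dict keys (`source`, `matched_rank`, `matched_name`) and the function name are hypothetical, invented for this example; this is not how TNRS or the BIEN scripts actually rank results.

```python
# Source order requested in this issue (highest priority first).
SOURCE_PRIORITY = ["TPL", "Tropicos", "GCC", "USDA"]

# Lower number = more specific match. Ranks and key names are assumptions.
RANK_ORDER = {"species": 0, "genus": 1}


def pick_best_match(matches):
    """Return the preferred match, or None if no known source matched.

    Prefers the most specific matched rank, then the earliest source
    in SOURCE_PRIORITY.
    """
    usable = [m for m in matches if m["source"] in SOURCE_PRIORITY]
    if not usable:
        return None

    def sort_key(match):
        return (
            RANK_ORDER.get(match["matched_rank"], len(RANK_ORDER)),
            SOURCE_PRIORITY.index(match["source"]),
        )

    return min(usable, key=sort_key)


# Illustrative candidates for the Line 617 case: Tropicos matches only the
# genus, while USDA matches the full name.
candidates = [
    {"source": "Tropicos", "matched_rank": "genus", "matched_name": "Aronia"},
    {"source": "USDA", "matched_rank": "species", "matched_name": "Aronia prunifolia"},
]
best = pick_best_match(candidates)
print(best["source"], best["matched_name"])
```

With Tropicos as the only source, the genus-only match would be the only candidate, which is consistent with the synonym being missed.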

Line 897: Diacritical marks on authors' names are often messed up

Likely cause: Character set problem in the BIEN scripts. The names were checked in the online version of TNRS, which renders the diacritics correctly.

Fix: Find where character set problems need to be handled in the BIEN scripts. (TNRS code works so look at what's done there.)
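One common form of character set problem is UTF-8 text being decoded as Latin-1 somewhere in a pipeline. The sketch below reproduces that on an author string with a diacritic and shows a cautious way to reverse it. That this particular mis-decoding is what happens in the BIEN scripts is only an assumption made for illustration, and the function names are hypothetical.

```python
def mis_decode(name: str) -> str:
    """Simulate the assumed bug: UTF-8 bytes interpreted as Latin-1."""
    return name.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Reverse a UTF-8-read-as-Latin-1 error.

    Returns the input unchanged when it cannot be re-encoded as Latin-1
    or when the resulting bytes are not valid UTF-8.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


author = "L'Hér."
garbled = mis_decode(author)

print(garbled)          # the "é" shows up as two characters
print(repair(garbled))  # original string restored
print(repair(author))   # correct input passes through unchanged
```

A repair step like this only masks the problem; the more durable fix is to set the encoding explicitly wherever the names are read, written, or cached.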

Line 1049: There are two spellings of Erechtites hieraciifolia and TNRS knows of both of them, so why is one rejected here?

Likely cause: Same issue as Line 617. The two spellings are in USDA and GCC, but not in Tropicos.

Please fix these problems and then send Bob a new file in the format described in issue #915.


My e-mail response from 2014-05-12:

Fix: Use all sources for the next round of scrubbing. Use them in the order: TPL, Tropicos, GCC, USDA.

The next round of TNRS re-scrubbing is currently planned to happen after the aggregating validations are complete, so we will not be able to fix this issue for the validations unless we move up the re-scrubbing. Note that re-scrubbing all the names is expected to take at least a week.

Update: re-scrubbing done.

TPL

This is not currently available as a TNRS source; we will have to wait until Brad adds it if we need it included.

Update: now added.

Line 897: Diacritical marks on authors' names are often messed up

The name on this row, "Asteraceae Boltonia L'Hér.", actually is rendered correctly by our TNRS client [1], so this is just a problem with the cached name, which has since been resolved.

[1] ssh vegbiendev.nceas.ucsb.edu <<<"bin/tnrs_client \"Asteraceae Boltonia L'Hér.\""

Line 1049: There are two spellings of Erechtites hieraciifolia and TNRS knows of both of them, so why is one rejected here?

This is a bug in TNRS, which has been reported.

Update: we are now using a workaround instead.

Actions #1

Updated by Martha Narro over 10 years ago

regarding:

Line 617: TNRS gives a synonym for Aronia prunifolia in a different genus, but this is missed here

Likely cause: The BIEN scripts may be using only Tropicos as the taxonomic source, not USDA. Tropicos matches only the genus.

Fix: Use all sources for the next round of scrubbing. Use them in the order: TPL, Tropicos, GCC, USDA.

Note: You'll need instructions from Brad about how to run the new version of TNRS that includes TPL (a new taxonomic source). Martha will schedule a call as soon as you are ready.

Actions #2

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • Description updated (diff)
Actions #3

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • Parent task set to #928
Actions #4

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • Description updated (diff)
Actions #5

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • Description updated (diff)
Actions #6

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • Description updated (diff)
Actions #7

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • Description updated (diff)
Actions #8

Updated by Aaron Marcuse-Kubitza over 10 years ago

  • % Done changed from 0 to 60

most bugs fixed, just needs TNRS bug workaround
