TEX validation

issues

2 feature requests

2013-12-20

extract

*TEX.2013-12-20.CSV_1.100_rows.xls*, *TEX.2013-12-20.CSV_2.100_rows.xls*
(Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.)

subset import command

time yes|(export log= version=TEX_VegBIEN; make schemas/$version/reinstall; make inputs/TEX/import by_col=1 n=100; bin/make_analytical_db) # runtime: 2 min ("1m57.226s") @vegbiendev

query

SELECT *
FROM      "TEX"."Specimen" 
LEFT JOIN "TEX_VegBIEN".analytical_specimen ON
    analytical_specimen."datasource" = 'TEX'
AND analytical_specimen."collection" = "Specimen"."collection" 
AND analytical_specimen."accessionNumber" = "Specimen"."accessionNumber" 
ORDER BY "Specimen"."*row_num" 
LIMIT 100;

SELECT *
FROM      "TEX"."Specimen2" AS "Specimen" 
LEFT JOIN "TEX_VegBIEN".analytical_specimen ON
    analytical_specimen."datasource" = 'TEX'
AND analytical_specimen."collection" = "Specimen"."collection" 
AND analytical_specimen."accessionNumber" = "Specimen"."accessionNumber" 
ORDER BY "Specimen"."*row_num" 
LIMIT 100;

2013-2-26

Source data used is on vegbiendev in /home/bien/inputs/TEX/Specimen*/TEX-LL_Texas_latlong_Final*.tab

See *TEX.2013-2-26.100_rows.xls* and *TEX.2013-2-26.100_rows.csv*
Input and output data are in separate tabs. Refer to the VegCore data dictionary for column definitions.

Brad Boyle's comments: (e-mail on 2013-2-27, attachment:TEX.2013-2-26.100_rows_bb.xls)

  • FIXED: institutionCode:
    Please populate. Should be "TEX"
  • FIXED: country:
    Please populate for this source only. "USA"
  • locality:
    Do not append "ECOLOGICAL INFORMATION". See the explanation and custom mapping below under occurrenceRemarks.
    "ECOLOGICAL INFORMATION" does sometimes include habitat information, so we will continue to map it to locationnarrative as well as to occurrenceRemarks.
  • PRIORITY FEATURE REQUEST: elevationInMeters:
    These values are provided in the original import. Can you parse them and convert to m? Use the midpoint if a range is given.
  • taxonName_verbatim:
    Include the family only when no lower-ranked name is provided.
    No, the family must be included in what is sent to TNRS.
  • FIXED: occurrenceRemarks:
    Please put import."ECOLOGICAL INFORMATION" in this column. Unfortunately, this source mixes both habitat information and specimen description in this field. However, it is more important to put specimen description in occurrenceRemarks.
    • mapping this source only: =import."ECOLOGICAL INFORMATION"
  • PRIORITY FEATURE REQUEST: cultivatedStatus_verbatim:
    It looks like this source uses the column "ORIGIN" to indicate whether a specimen is from a wild or a cultivated plant. If we already have such a column, please use the existing name. If not, please add this column to the core schema.
    • mapping this source only: =import.ORIGIN
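
The elevationInMeters request above (parse the verbatim value, convert to m, use the midpoint of a range) could be sketched roughly as follows. This is only an illustration, not the importer's actual code: the function name elevation_in_meters, the default_unit parameter, and the accepted input formats (a single number or a "low-high" range, with an optional m/ft unit) are all assumptions, since the notes do not say how the TEX elevations are formatted.

```python
import re

FEET_TO_M = 0.3048  # exact definition of the international foot

# Assumed input shape (hypothetical): one number or a "low-high" range,
# optionally followed by a unit such as "m", "meters", "ft" or "feet".
_ELEV_RE = re.compile(
    r"""^\s*
        (?P<low>\d+(?:\.\d+)?)
        (?:\s*(?:-|–|to)\s*(?P<high>\d+(?:\.\d+)?))?
        \s*(?P<unit>m|meters?|metres?|ft|feet|')?\.?
        \s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def elevation_in_meters(verbatim, default_unit="m"):
    """Return the elevation in meters, or None if the value cannot be parsed.

    A range is reduced to its midpoint before any unit conversion.
    default_unit is applied when the value carries no unit (an assumption).
    """
    if verbatim is None:
        return None
    text = verbatim.replace(",", "")  # drop thousands separators
    match = _ELEV_RE.match(text)
    if match is None:
        return None  # leave unparseable values empty rather than guess
    low = float(match.group("low"))
    high = float(match.group("high")) if match.group("high") else low
    midpoint = (low + high) / 2.0
    unit = (match.group("unit") or default_unit).lower()
    if unit in ("ft", "feet", "'"):
        midpoint *= FEET_TO_M
    return round(midpoint, 1)
```

For example, elevation_in_meters("350-400 m") takes the midpoint of the range, and elevation_in_meters("1,000 ft") converts from feet. Values that do not fit the assumed pattern return None, so they can be reviewed by hand instead of being silently mis-converted.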

query:

SELECT *
FROM "TEX"."Specimen" 
LEFT JOIN /*r7723.*/analytical_specimen ON
    analytical_specimen."datasource" = 'TEX'
AND analytical_specimen."collectionCode" = "Specimen"."HERBARIUM" 
AND analytical_specimen."catalogNumber" = "Specimen"."ACCESSION NO." 
ORDER BY "Specimen".row_num
LIMIT 100;

import command:

make inputs/TEX/import_scrub by_col=1 n=100; bin/make_analytical_db