Revision 4074
Added by Aaron Marcuse-Kubitza over 12 years ago
VegBIEN.organisms.csv | ||
---|---|---|
3 | 3 |
det_type,"/location/authorlocationcode/_alt/4/_merge/3/_if[@name=""if indirect voucher""]/cond/_eq:[right=indirect]/left","Brad: A SALVIAS value referring to the relationship between the voucher specimen and the observation. Affect how the identification of the specimen(latin name) is transferred to the observation. 'direct'=voucher specimen was collected from this same tree; they are one and the same individual. 'indirect'=voucher specimen was collected for a different individual, but the original data provider confirmed that this is the same species. 'default'=basically same as 'indirect'. 'uncollected'=no voucher specimen, data provider asserted that this was the name but was unable to collect. The main different is that with 'direct', 'indirect', and 'default', the scientific name can be updated automatically based on the name attached to the specimen voucher (assuming you have a link to that data, presumably from a herbarium database. Whereas, if det_type='uncollected', the name can never change because there is no specimen." |
4 | 4 |
coll_number,"/location/authorlocationcode/_alt/4/_merge/3/_if[@name=""if indirect voucher""]/else/_alt/1",Brad: Incorrect. Map instead as for voucher_string |
5 | 5 |
voucher_string,"/location/authorlocationcode/_alt/4/_merge/3/_if[@name=""if indirect voucher""]/else/_alt/2","Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below." |
6 |
census_date,/location/locationevent/obsstartdate/_*/date/_dateRangeStart/value,"This is for the subplot, not the organism, as all organisms in a subplot have the same value for it. The following query returns no rows: |
7 |
----- |
8 |
SELECT ""PLOT_ID"", subplot, count(DISTINCT census_date) AS census_date_count |
9 |
FROM ""SALVIAS-CSV"".organisms |
10 |
WHERE subplot IS NOT NULL AND census_date IS NOT NULL |
11 |
GROUP BY ""PLOT_ID"", subplot |
12 |
HAVING count(DISTINCT census_date) > 1 |
6 |
census_date,/location/locationevent/obsstartdate/_*/date/_dateRangeStart/value,"This is for the subplot, not the organism, as all organisms in a subplot have the same value for it. The following query returns no rows: |
7 |
----- |
8 |
SELECT ""PLOT_ID"", subplot, count(DISTINCT census_date) AS census_date_count |
9 |
FROM ""SALVIAS-CSV"".organisms |
10 |
WHERE subplot IS NOT NULL AND census_date IS NOT NULL |
11 |
GROUP BY ""PLOT_ID"", subplot |
12 |
HAVING count(DISTINCT census_date) > 1 |
13 | 13 |
-----" |
14 | 14 |
no_of_individuals,/location/locationevent/taxonoccurrence/aggregateoccurrence/count,"Brad: Incorrect for VegX. This is a count of number of indiiduals for an *aggregate* observation. For VegBank, I'm not sure. Not exactly the same as stemCount. An individual tree could have 3 stems but would still only count as 1. We need to check with Bob on this." |
15 | 15 |
cover_percent,/location/locationevent/taxonoccurrence/aggregateoccurrence/cover, |
... | ... | |
20 | 20 |
det_type,"/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/catalognumber_dwc/_if[@name=""if indirect voucher""]/cond/_eq:[right=indirect]/left","Brad: A SALVIAS value referring to the relationship between the voucher specimen and the observation. Affect how the identification of the specimen(latin name) is transferred to the observation. 'direct'=voucher specimen was collected from this same tree; they are one and the same individual. 'indirect'=voucher specimen was collected for a different individual, but the original data provider confirmed that this is the same species. 'default'=basically same as 'indirect'. 'uncollected'=no voucher specimen, data provider asserted that this was the name but was unable to collect. The main different is that with 'direct', 'indirect', and 'default', the scientific name can be updated automatically based on the name attached to the specimen voucher (assuming you have a link to that data, presumably from a herbarium database. Whereas, if det_type='uncollected', the name can never change because there is no specimen." |
21 | 21 |
coll_number,"/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/catalognumber_dwc/_if[@name=""if indirect voucher""]/else/_alt/1",Brad: Incorrect. Map instead as for voucher_string |
22 | 22 |
voucher_string,"/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/catalognumber_dwc/_if[@name=""if indirect voucher""]/else/_alt/2","Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below." |
23 |
OBSERVATION_ID,/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/sourceaccessioncode,"Brad: Neither is correct; this is just an internal ID for table plotObservations. However, it has the important property of uniquely identifying an ""observation"", which is an individual tree, in the case of an individual observation, or a records of a species with an associated count of individuals or measurement of percent cover, in the case of aggregate observations. Not sure where to store this. Main point is that it is not part of the original data, but an auto_increment added later." |
24 | 23 |
basal_diam,/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/stemobservation/basaldiameter, |
25 | 24 |
stem_canopy_form,"/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/stemobservation/definedvalue[*_id/userdefined[tablename=stemobservation,userdefinedname=canopyForm]]:[@fkey=tablerecord_id]/definedvalue",Brad: Should also be userDefined for VegBank. |
26 | 25 |
stem_canopy_position,"/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/stemobservation/definedvalue[*_id/userdefined[tablename=stemobservation,userdefinedname=canopyPosition]]:[@fkey=tablerecord_id]/definedvalue",Brad: Should also be userDefined for VegBank. |
mappings/DwC2-VegBIEN.specimens.csv, VegCSV-VegBIEN.specimens.csv: Split occurrenceID into occurrenceID and individualID, where individualID refers to the plant in plots data and occurrenceID refers to the specimen in specimens data. This prevents plant sourceaccessioncodes from being mapped to the specimenreplicate, which was messing up stems mappings for the parent plantobservation. It also avoids mapping the specimenreplicate sourceaccessioncode to additional tables where it isn't needed. (Note that occurrenceID is needed for location to ensure that each specimen gets its own location to make locationdeterminations on. Everything else is directly or indirectly scoped by location when its own sourceaccessioncode isn't specified.)
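
The split described in this commit message can be illustrated with a minimal sketch. The function name `split_id`, the `data_kind` parameter, and the dict return value are hypothetical and are not part of the VegBIEN mapping tools. The sketch only shows the rule stated above: an identifier from plots data refers to the plant and becomes an individualID, while an identifier from specimens data refers to the specimen and remains an occurrenceID.

```python
# Hypothetical sketch: none of these names come from the VegBIEN mapping tools.

def split_id(source_id, data_kind):
    """Return the ID term a source identifier maps to after the split.

    Plots data identify a plant, so the ID becomes an individualID.
    Specimens data identify a specimen, so the ID stays an occurrenceID.
    """
    if data_kind == "plots":
        return {"individualID": source_id}
    if data_kind == "specimens":
        return {"occurrenceID": source_id}
    raise ValueError(f"unknown data kind: {data_kind!r}")


# Example identifiers are made up for illustration.
print(split_id("12345", "plots"))
print(split_id("NY-678", "specimens"))
```

Keeping the two terms separate means a plant's identifier never lands on the specimen record, which is the failure mode the commit message describes.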