Revision 4950

inputs/REMIB/Specimen/map.csv: Remapped accession_number to catalogNumber because it is not globally unique, only (usually) unique within the institution providing the data ("acronym"). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution.
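The duplicate check cited in the mapping comments can be tried on a small scale. The following is a minimal sketch, not the project's own tooling: it runs the same GROUP BY / HAVING query against an in-memory SQLite table. The sample rows and institution codes (`INST_A`, `INST_B`) are hypothetical, the `"REMIB"` schema qualifier is dropped because SQLite has no such schema here, and the `ORDER BY` is added only to make the output order deterministic.

```python
import sqlite3

# Hypothetical sample rows (acronym, accession_number); not real REMIB data.
ROWS = [
    ("INST_A", "1001"),
    ("INST_A", "1001"),  # repeated within the same institution
    ("INST_A", "1002"),
    ("INST_B", "1001"),  # same number at another institution: not a duplicate
    ("INST_B", "2000"),
    ("INST_B", "2000"),
    ("INST_B", "2000"),
]

# Same query as in the mapping comment, minus the "REMIB" schema qualifier.
# ORDER BY is an addition for deterministic output.
DUPLICATES_SQL = """
SELECT acronym, accession_number, count(*)
FROM "Specimen"
GROUP BY acronym, accession_number
HAVING count(*) > 1
ORDER BY acronym, accession_number
"""


def find_duplicates(rows):
    """Return (acronym, accession_number, count) for each repeated pair."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE "Specimen" (acronym TEXT, accession_number TEXT)'
        )
        conn.executemany('INSERT INTO "Specimen" VALUES (?, ?)', rows)
        return conn.execute(DUPLICATES_SQL).fetchall()
    finally:
        conn.close()


duplicates = find_duplicates(ROWS)
for acronym, accession_number, n in duplicates:
    print(acronym, accession_number, n)
```

Each result row is one (acronym, accession_number) pair that occurs more than once, so the number of result rows and the number of underlying specimen rows differ; summing the `count(*)` column gives the latter.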

View differences:

inputs/REMIB/Specimen/map.csv
1 1
REMIB,VegCore,Filter,Comments
2 2
acronym,institutionCode,,
3
accession_number,occurrenceID,,
3
accession_number,catalogNumber,,"Not globally unique, only (usually) unique within the institution providing the data (""acronym""). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution. [1]
4

  
5
[1] Using the following query:
6
-----
7
SELECT acronym, accession_number, count(*)
8
FROM ""REMIB"".""Specimen""
9
GROUP BY acronym, accession_number
10
HAVING count(*) > 1
11
-----"
4 12
family,family,,
5 13
genus,genus,,
6 14
specificEpithet,specificEpithet,,
inputs/REMIB/Specimen/VegBIEN.csv
1 1
REMIB,VegBIEN:/_simplifyPath:[next=parent_id]/path,Comments
2
accession_number,"/location/_if[@name=""if subplot""]/else/authorlocationcode/_first/3/_alt/1",
2
accession_number,"/location/_if[@name=""if subplot""]/else/authorlocationcode/_first/3/_alt/2/_if[@name=""if catalogNumber""]/cond/_exists","Not globally unique, only (usually) unique within the institution providing the data (""acronym""). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution. [1]
3

  
4
[1] Using the following query:
5
-----
6
SELECT acronym, accession_number, count(*)
7
FROM ""REMIB"".""Specimen""
8
GROUP BY acronym, accession_number
9
HAVING count(*) > 1
10
-----"
3 11
acronym,"/location/_if[@name=""if subplot""]/else/authorlocationcode/_first/3/_alt/2/_if[@name=""if catalogNumber""]/then/_join/1",
12
accession_number,"/location/_if[@name=""if subplot""]/else/authorlocationcode/_first/3/_alt/2/_if[@name=""if catalogNumber""]/then/_join/3/_if[@name=""if indirect voucher""]/else","Not globally unique, only (usually) unique within the institution providing the data (""acronym""). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution. [1]
13

  
14
[1] Using the following query:
15
-----
16
SELECT acronym, accession_number, count(*)
17
FROM ""REMIB"".""Specimen""
18
GROUP BY acronym, accession_number
19
HAVING count(*) > 1
20
-----"
4 21
lat_deg,"/location/locationcoords/latitude_deg/_nullIf:[null=0,type=float]/value",
5 22
long_deg,"/location/locationcoords/longitude_deg/_nullIf:[null=0,type=float]/value",
6 23
coll_day,"/location/locationevent/taxonoccurrence/aggregateoccurrence/collectiondate/_alt/2/_date/day/_nullIf:[null=0,type=float]/value",
7 24
coll_month,"/location/locationevent/taxonoccurrence/aggregateoccurrence/collectiondate/_alt/2/_date/month/_nullIf:[null=0,type=float]/value",
8 25
coll_year,"/location/locationevent/taxonoccurrence/aggregateoccurrence/collectiondate/_alt/2/_date/year/_nullIf:[null=0,type=float]/value",
26
accession_number,"/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/catalognumber_dwc/_if[@name=""if indirect voucher""]/else","Not globally unique, only (usually) unique within the institution providing the data (""acronym""). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution. [1]
27

  
28
[1] Using the following query:
29
-----
30
SELECT acronym, accession_number, count(*)
31
FROM ""REMIB"".""Specimen""
32
GROUP BY acronym, accession_number
33
HAVING count(*) > 1
34
-----"
9 35
acronym,/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/institution_id/party/organizationname,
10
accession_number,/location/locationevent/taxonoccurrence/aggregateoccurrence/plantobservation/specimenreplicate/sourceaccessioncode,
11
accession_number,/location/locationevent/taxonoccurrence/sourceaccessioncode/_first/3,
12 36
family,/location/locationevent/taxonoccurrence/taxondetermination[!isoriginal]/*_id/taxonpath/family,
13 37
genus,/location/locationevent/taxonoccurrence/taxondetermination[!isoriginal]/*_id/taxonpath/genus,
14 38
specificEpithet,/location/locationevent/taxonoccurrence/taxondetermination[!isoriginal]/*_id/taxonpath/species,
15 39
collector,/location/locationevent/taxonoccurrence/verbatimcollectorname,
40
accession_number,"/location/locationevent/taxonoccurrence/voucher/*_id/specimenreplicate/catalognumber_dwc/_if[@name=""if indirect voucher""]/then","Not globally unique, only (usually) unique within the institution providing the data (""acronym""). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution. [1]
41

  
42
[1] Using the following query:
43
-----
44
SELECT acronym, accession_number, count(*)
45
FROM ""REMIB"".""Specimen""
46
GROUP BY acronym, accession_number
47
HAVING count(*) > 1
48
-----"
16 49
locality,/location/locationnarrative/_merge/1,
17 50
habitat,"/location/locationnarrative/_merge/3/_label[label=""habitat""]/value","Brad: Free-text description of vegetation community where collected, frequently redundant wrt 'Vegetation'. Bob, Nick: keep as user defined or create special element?"
18 51
country,/location/locationplace/*_id/placepath/country,
inputs/REMIB/Specimen/test.xml.ref
4 4
        <next>parent_id</next>
5 5
        <path>
6 6
            <location>
7
                <authorlocationcode>$accession_number</authorlocationcode>
7
                <authorlocationcode>
8
                    <_join>
9
                        <1>$acronym</1>
10
                        <3>$accession_number</3>
11
                    </_join>
12
                </authorlocationcode>
8 13
                <locationcoords>
9 14
                    <latitude_deg>
10 15
                        <_nullIf>
......
51 56
                            </collectiondate>
52 57
                            <plantobservation>
53 58
                                <specimenreplicate>
59
                                    <catalognumber_dwc>$accession_number</catalognumber_dwc>
54 60
                                    <institution_id><party><organizationname>$acronym</organizationname></party></institution_id>
55
                                    <sourceaccessioncode>$accession_number</sourceaccessioncode>
56 61
                                </specimenreplicate>
57 62
                            </plantobservation>
58 63
                        </aggregateoccurrence>
59
                        <sourceaccessioncode>$accession_number</sourceaccessioncode>
60 64
                        <taxondetermination>
61 65
                            <taxonpath_id>
62 66
                                <taxonpath>
inputs/REMIB/Specimen/new_terms.csv
1 1
acronym,institutionCode,,
2
accession_number,occurrenceID,,
2
accession_number,catalogNumber,,"Not globally unique, only (usually) unique within the institution providing the data (""acronym""). Note that there are nevertheless 11,869 rows where an accession_number appears multiple times within the same institution. [1]
3

  
4
[1] Using the following query:
5
-----
6
SELECT acronym, accession_number, count(*)
7
FROM ""REMIB"".""Specimen""
8
GROUP BY acronym, accession_number
9
HAVING count(*) > 1
10
-----"
3 11
long_deg,decimalLongitude,,
4 12
lat_deg,decimalLatitude,,
5 13
coll_day,dayCollected,,
