Revision 725

join: For input mappings with no match in the join map, include them in the output map with an empty mapping
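The behavior change described in the commit message can be sketched as follows. This is a minimal illustration, not the actual `bin/join` script: the function name `join_rows`, the `keep_unmatched` flag, the three-column row layout, and the sample data are all assumptions made for the example.

```python
def join_rows(rows, join_map, keep_unmatched=True):
    """Rewrite column 1 of each mapping row through join_map.

    rows: iterable of [input, intermediate, comments] lists (assumed layout).
    join_map: dict from intermediate name to output path.
    keep_unmatched=True mirrors the new behavior (empty mapping);
    keep_unmatched=False mirrors the old behavior (row skipped).
    """
    joined = []
    for row in rows:
        row = list(row)  # copy so the caller's data is not modified
        if keep_unmatched:
            # New behavior: a missing key yields an empty mapping
            row[1] = join_map.get(row[1], '')
        elif row[1] in join_map:
            row[1] = join_map[row[1]]
        else:
            continue  # old behavior: skip the row entirely
        joined.append(row)
    return joined


# Made-up sample data, not taken from the real map files.
rows = [
    ['Country', 'country', 'Brad: Required'],
    ['JulianDay', 'julianDay', 'Brad: OMIT'],
]
join_map = {'country': '/location/placename'}

new_output = join_rows(rows, join_map)
old_output = join_rows(rows, join_map, keep_unmatched=False)
print(new_output)
print(old_output)
```

With the new behavior the `JulianDay` row survives with an empty middle column, which is consistent with the unmapped rows (and their comments) that appear as additions in the regenerated map files in this revision.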

View differences:

inputs/NYBG/maps/VegBIEN.organisms.csv
1 1
NYBG:nybg_raw,VegBIEN:/taxonoccurrence,Comments
2
BasisOfRecord,,Brad: OMIT? See http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord for definition of this term.
3
CatalogNumber,,"Brad: Not sure if mapping correct. Nick, is there an element for institutional accession codes in VegX?; Aaron: This can't be used as the accession code (primary key) because some rows don't have a value for it"
4
ContinentOcean,,Brad: OMIT
5
DateLastModified,,"Brad: Mapping to VegX is incorrect. I think is this merely an internal timestamp indicated when record last modified, not necessarily when determination (taxon name) last modified. Probably we should omit this field, although might be useful for updating changed records from this source."
6
IndividualCount,,Brad: OMIT; not relevant for DwC plant specimen data.
7
JulianDay,,Brad: OMIT
8
PreparationType,,Brad: OMIT
9
PreviousCatalogNumber,,Brad: OMIT
10
RelatedCatalogItem,,Brad: OMIT
11
RelationshipType,,Brad: OMIT
12
TimeOfDay,,Brad: OMIT
13
TypeStatus,,Brad: OMIT (?). Indicates whether this specimen served as type for taxon name. Probably not relevant for BIEN
14
key,,Brad: OMIT; I'm pretty sure this is a temporary artificial ID generated at time of export.
2 15
BoundingBox,/*_id/locationevent/*_id/location/dsgpoly,
3 16
CoordinatePrecision,/*_id/locationevent/*_id/location/locationaccuracy,
4 17
Country,"/*_id/locationevent/*_id/location/locationplace(/*_id/namedplace[placesystem=""area|country|territory""])/placename",Brad: Required; reject record if this field NULL
inputs/SALVIAS/maps/VegBIEN.stems.csv
1 1
SALVIAS:stems,VegBIEN:/stemobservation,Comments
2
origrecord_id_stems,,
3
stem_id,,
4
tmp_del,,
2 5
PlotObsID,/*_id/plantobservation/authorplantcode,
3 6
NoInd,/*_id/plantobservation/stemcount,
4 7
stem_tag2,/authorstemcode/_alt/1,
inputs/SALVIAS/maps/VegBIEN.plots.csv
1 1
SALVIAS:plotMetadata,VegBIEN:/locationevent,Comments
2
AccessCode,,
3
ElevSource,,
4
Habitat,,
5
MethodCode,,
6
PrecipSource,,
7
PrimOwnerID,,
8
RevisionComments,,
9
SiteName,,
10
TempSource,,
11
bearing,,
12
lat_long_accuracy,,
13
lat_string,,
14
long_string,,
15
new_world,,
16
orig_filename,,
17
plot_administrator,,
18
plot_notes,,
19
pol1_type,,
20
pol2_type,,
21
recensused,,"Brad: This is a 0/1 value, internal to SALVIAS. 1 indicates that a  plot has >1 set of values, from different census events.; Aaron: Different censuses are distinguished in organisms data by different census_no values"
22
tmp_del,,
23
topography_desc,,
24
vegetation_1,,
25
vegetation_2,,
2 26
plot_area_ha,/*_id/location/area,"Brad: Area in hectares. Is there any way to store units?; Aaron: VegX plot area annotation says ""Total area of the plot in square meters."" so units are fixed"
3 27
SiteCode,/*_id/location/authorlocationcode,Brad: plotCode is as-assigned by data provider; guranteed to be unique only within dataset (=project)
4 28
Elev,/*_id/location/elevation/_alt/1,Brad: Mean elevation in meters. This is a constrained decimal value; is there no place for this in VegX other than verbatimElevation? Check with Nick.
inputs/SALVIAS/maps/VegBIEN.organisms.csv
1 1
SALVIAS:plotObservations,VegBIEN:/taxonoccurrence,Comments
2
GenAuth,,
3
IsMorpho,,
4
Line,,
5
OrigAuth,,
6
OrigGenus,,
7
OrigRecordID,,
8
PlotCode,,"Brad: Same as plotCode, above"
9
SourceVoucher,,"Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below."
10
SpAuthStatus,,
11
canopy_form,,Brad: Should also be userDefined for VegBank. 
12
canopy_position,,Brad: Should also be userDefined for VegBank. 
13
coll_inits,,
14
collector_code,,Brad: OMIT
15
common_name,,
16
det_by,,
17
dist,,
18
fam_status,,Brad: OMIT. This will be determined later by using TNRS.
19
gen_status,,Brad: OMIT. This will be determined later by using TNRS.
20
height_class,,
21
height_m_commercial,,
22
ind_id,,Brad: OMIT
23
infra_auth_1,,
24
liana_infestation,,Brad: Should also be userDefined for VegBank. 
25
morphocf,,
26
morphoname,,
27
name_status,,"Brad: OMIT. Except, note that if species_status=3, this indicate that name is a morphospecies and not a standard latin name. Not exactly sure how to use this in BIEN, but could be useful during the name-scrubbing process with TNRS."
28
other_annotations,,
29
perp_dist,,
30
phenology,,
31
species_code,,
32
temp_liandbh,,
33
tmp_del,,
2 34
PlotID,/*_id/locationevent/authoreventcode,"Brad: Not sure why this is repeated? This field and plotCode, as the same as above."
3 35
height_m,/aggregateoccurrence/*_id/plantobservation/overallheight,Brad: Incorrect for VegBank. This is a measurement applied to a single tree. Check with Bob
4 36
tag2,/aggregateoccurrence/*_id/plantobservation/stemobservation/authorstemcode/_alt/1,"Brad: See commend for tag1. Your mapping for tag2 looks correct. Probably both values would go here, only nested, with one superceding the other."
bin/join
@@ -28,8 +28,7 @@
     cols[1] = map_1_out
     writer.writerow(cols)
     for row in reader:
-        try: row[1] = map_1[row[1]]
-        except KeyError: continue # skip row
+        row[1] = map_1.get(row[1], '')
         writer.writerow(row)
 
 main()
mappings/SALVIAS_db-VegBIEN.organisms.csv
1 1
SALVIAS:plotObservations,VegBIEN:/taxonoccurrence,Comments
2
GenAuth,,
3
IsMorpho,,
4
Line,,
5
OrigAuth,,
6
OrigGenus,,
7
OrigRecordID,,
8
PlotCode,,"Brad: Same as plotCode, above"
9
SourceVoucher,,"Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below."
10
SpAuthStatus,,
11
canopy_form,,Brad: Should also be userDefined for VegBank. 
12
canopy_position,,Brad: Should also be userDefined for VegBank. 
13
coll_inits,,
14
collector_code,,Brad: OMIT
15
common_name,,
16
det_by,,
17
dist,,
18
fam_status,,Brad: OMIT. This will be determined later by using TNRS.
19
gen_status,,Brad: OMIT. This will be determined later by using TNRS.
20
height_class,,
21
height_m_commercial,,
22
ind_id,,Brad: OMIT
23
infra_auth_1,,
24
liana_infestation,,Brad: Should also be userDefined for VegBank. 
25
morphocf,,
26
morphoname,,
27
name_status,,"Brad: OMIT. Except, note that if species_status=3, this indicate that name is a morphospecies and not a standard latin name. Not exactly sure how to use this in BIEN, but could be useful during the name-scrubbing process with TNRS."
28
other_annotations,,
29
perp_dist,,
30
phenology,,
31
species_code,,
32
temp_liandbh,,
33
tmp_del,,
2 34
PlotID,/*_id/locationevent/authoreventcode,"Brad: Not sure why this is repeated? This field and plotCode, as the same as above."
3 35
height_m,/aggregateoccurrence/*_id/plantobservation/overallheight,Brad: Incorrect for VegBank. This is a measurement applied to a single tree. Check with Bob
4 36
tag2,/aggregateoccurrence/*_id/plantobservation/stemobservation/authorstemcode/_alt/1,"Brad: See commend for tag1. Your mapping for tag2 looks correct. Probably both values would go here, only nested, with one superceding the other."
mappings/for_review/SALVIAS_db-VegBIEN.organisms.csv
1 1
SALVIAS:plotObservations,VegBIEN:/taxonoccurrence,Comments
2
GenAuth,,
3
IsMorpho,,
4
Line,,
5
OrigAuth,,
6
OrigGenus,,
7
OrigRecordID,,
8
PlotCode,,"Brad: Same as plotCode, above"
9
SourceVoucher,,"Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below."
10
SpAuthStatus,,
11
canopy_form,,Brad: Should also be userDefined for VegBank. 
12
canopy_position,,Brad: Should also be userDefined for VegBank. 
13
coll_inits,,
14
collector_code,,Brad: OMIT
15
common_name,,
16
det_by,,
17
dist,,
18
fam_status,,Brad: OMIT. This will be determined later by using TNRS.
19
gen_status,,Brad: OMIT. This will be determined later by using TNRS.
20
height_class,,
21
height_m_commercial,,
22
ind_id,,Brad: OMIT
23
infra_auth_1,,
24
liana_infestation,,Brad: Should also be userDefined for VegBank. 
25
morphocf,,
26
morphoname,,
27
name_status,,"Brad: OMIT. Except, note that if species_status=3, this indicate that name is a morphospecies and not a standard latin name. Not exactly sure how to use this in BIEN, but could be useful during the name-scrubbing process with TNRS."
28
other_annotations,,
29
perp_dist,,
30
phenology,,
31
species_code,,
32
temp_liandbh,,
33
tmp_del,,
2 34
PlotID,//locationevent/authoreventcode,"Brad: Not sure why this is repeated? This field and plotCode, as the same as above."
3 35
height_m,//plantobservation/overallheight,Brad: Incorrect for VegBank. This is a measurement applied to a single tree. Check with Bob
4 36
tag2,//stemobservation/authorstemcode/_alt/1,"Brad: See commend for tag1. Your mapping for tag2 looks correct. Probably both values would go here, only nested, with one superceding the other."
mappings/for_review/NYBG-VegBIEN.organisms.csv
1 1
NYBG,VegBIEN:/taxonoccurrence,Comments
2
BasisOfRecord,,Brad: OMIT? See http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord for definition of this term.
3
CatalogNumber,,"Brad: Not sure if mapping correct. Nick, is there an element for institutional accession codes in VegX?; Aaron: This can't be used as the accession code (primary key) because some rows don't have a value for it"
4
ContinentOcean,,Brad: OMIT
5
DateLastModified,,"Brad: Mapping to VegX is incorrect. I think is this merely an internal timestamp indicated when record last modified, not necessarily when determination (taxon name) last modified. Probably we should omit this field, although might be useful for updating changed records from this source."
6
IndividualCount,,Brad: OMIT; not relevant for DwC plant specimen data.
7
JulianDay,,Brad: OMIT
8
PreparationType,,Brad: OMIT
9
PreviousCatalogNumber,,Brad: OMIT
10
RelatedCatalogItem,,Brad: OMIT
11
RelationshipType,,Brad: OMIT
12
TimeOfDay,,Brad: OMIT
13
TypeStatus,,Brad: OMIT (?). Indicates whether this specimen served as type for taxon name. Probably not relevant for BIEN
14
key,,Brad: OMIT; I'm pretty sure this is a temporary artificial ID generated at time of export.
2 15
BoundingBox,//location/dsgpoly,
3 16
CoordinatePrecision,//location/locationaccuracy,
4 17
Country,"//*_id/namedplace[placesystem=""area|country|territory""]/placename",Brad: Required; reject record if this field NULL
mappings/for_review/SALVIAS-VegBIEN.plots.csv
1 1
SALVIAS,VegBIEN:/locationevent,Comments
2
PLOT_ID,,"Brad: This is artificial internal database ID; a unique identifier within SALVIAS DB to each plot, within the table plotMetadata."
3
holdridge_life_zone,,
4
life_zone_code,,
5
observation_type,,"Brad: SALVIAS internal metadata indicating whether the record represents an individual or aggregate observation. Rather than storing, use to decide where to store in VegX.; Aaron: VegX aggregateOrganismObservation table is missing many fields available in individualOrganismObservation, so we're mapping to individualOrganismObservation regardless of observation type"
6
recensused,,"Brad: This is a 0/1 value, internal to SALVIAS. 1 indicates that a  plot has >1 set of values, from different census events.; Aaron: Different censuses are distinguished in organisms data by different census_no values"
2 7
plot_area_ha,//location/area,"Brad: Area in hectares. Is there any way to store units?; Aaron: VegX plot area annotation says ""Total area of the plot in square meters."" so units are fixed"
3 8
plot_code,//location/authorlocationcode,Brad: plotCode is as-assigned by data provider; guranteed to be unique only within dataset (=project)
4 9
elev_m,//location/elevation/_alt/1,Brad: Mean elevation in meters. This is a constrained decimal value; is there no place for this in VegX other than verbatimElevation? Check with Nick.
mappings/for_review/SALVIAS-VegBIEN.organisms.csv
1 1
SALVIAS,VegBIEN:/taxonoccurrence,Comments
2
PLOT_ID,,"Brad: Not sure why this is repeated? This field and plotCode, as the same as above."
3
collector_code,,Brad: OMIT
4
comments,,Brad: OMIT
5
fam_status,,Brad: OMIT. This will be determined later by using TNRS.
6
gen_status,,Brad: OMIT. This will be determined later by using TNRS.
7
ind_id,,Brad: OMIT
8
species_status,,"Brad: OMIT. Except, note that if species_status=3, this indicate that name is a morphospecies and not a standard latin name. Not exactly sure how to use this in BIEN, but could be useful during the name-scrubbing process with TNRS."
9
voucher_string,,"Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below."
2 10
subplot,//location/authorlocationcode,
3 11
plot_code,//location/authorlocationcode,"Brad: Same as plotCode, above"
4 12
census_date,//locationevent/obsstartdate/_date/year,
mappings/for_review/SALVIAS_db-VegBIEN.plots.csv
1 1
SALVIAS:plotMetadata,VegBIEN:/locationevent,Comments
2
AccessCode,,
3
ElevSource,,
4
Habitat,,
5
MethodCode,,
6
PrecipSource,,
7
PrimOwnerID,,
8
RevisionComments,,
9
SiteName,,
10
TempSource,,
11
bearing,,
12
lat_long_accuracy,,
13
lat_string,,
14
long_string,,
15
new_world,,
16
orig_filename,,
17
plot_administrator,,
18
plot_notes,,
19
pol1_type,,
20
pol2_type,,
21
recensused,,"Brad: This is a 0/1 value, internal to SALVIAS. 1 indicates that a  plot has >1 set of values, from different census events.; Aaron: Different censuses are distinguished in organisms data by different census_no values"
22
tmp_del,,
23
topography_desc,,
24
vegetation_1,,
25
vegetation_2,,
2 26
plot_area_ha,//location/area,"Brad: Area in hectares. Is there any way to store units?; Aaron: VegX plot area annotation says ""Total area of the plot in square meters."" so units are fixed"
3 27
SiteCode,//location/authorlocationcode,Brad: plotCode is as-assigned by data provider; guranteed to be unique only within dataset (=project)
4 28
Elev,//location/elevation/_alt/1,Brad: Mean elevation in meters. This is a constrained decimal value; is there no place for this in VegX other than verbatimElevation? Check with Nick.
mappings/NYBG-VegBIEN.organisms.csv
1 1
NYBG,VegBIEN:/taxonoccurrence,Comments
2
BasisOfRecord,,Brad: OMIT? See http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord for definition of this term.
3
CatalogNumber,,"Brad: Not sure if mapping correct. Nick, is there an element for institutional accession codes in VegX?; Aaron: This can't be used as the accession code (primary key) because some rows don't have a value for it"
4
ContinentOcean,,Brad: OMIT
5
DateLastModified,,"Brad: Mapping to VegX is incorrect. I think is this merely an internal timestamp indicated when record last modified, not necessarily when determination (taxon name) last modified. Probably we should omit this field, although might be useful for updating changed records from this source."
6
IndividualCount,,Brad: OMIT; not relevant for DwC plant specimen data.
7
JulianDay,,Brad: OMIT
8
PreparationType,,Brad: OMIT
9
PreviousCatalogNumber,,Brad: OMIT
10
RelatedCatalogItem,,Brad: OMIT
11
RelationshipType,,Brad: OMIT
12
TimeOfDay,,Brad: OMIT
13
TypeStatus,,Brad: OMIT (?). Indicates whether this specimen served as type for taxon name. Probably not relevant for BIEN
14
key,,Brad: OMIT; I'm pretty sure this is a temporary artificial ID generated at time of export.
2 15
BoundingBox,/*_id/locationevent/*_id/location/dsgpoly,
3 16
CoordinatePrecision,/*_id/locationevent/*_id/location/locationaccuracy,
4 17
Country,"/*_id/locationevent/*_id/location/locationplace(/*_id/namedplace[placesystem=""area|country|territory""])/placename",Brad: Required; reject record if this field NULL
mappings/SALVIAS-VegBIEN.plots.csv
1 1
SALVIAS,VegBIEN:/locationevent,Comments
2
PLOT_ID,,"Brad: This is artificial internal database ID; a unique identifier within SALVIAS DB to each plot, within the table plotMetadata."
3
holdridge_life_zone,,
4
life_zone_code,,
5
observation_type,,"Brad: SALVIAS internal metadata indicating whether the record represents an individual or aggregate observation. Rather than storing, use to decide where to store in VegX.; Aaron: VegX aggregateOrganismObservation table is missing many fields available in individualOrganismObservation, so we're mapping to individualOrganismObservation regardless of observation type"
6
recensused,,"Brad: This is a 0/1 value, internal to SALVIAS. 1 indicates that a  plot has >1 set of values, from different census events.; Aaron: Different censuses are distinguished in organisms data by different census_no values"
2 7
plot_area_ha,/*_id/location/area,"Brad: Area in hectares. Is there any way to store units?; Aaron: VegX plot area annotation says ""Total area of the plot in square meters."" so units are fixed"
3 8
plot_code,/*_id/location/authorlocationcode,Brad: plotCode is as-assigned by data provider; guranteed to be unique only within dataset (=project)
4 9
elev_m,/*_id/location/elevation/_alt/1,Brad: Mean elevation in meters. This is a constrained decimal value; is there no place for this in VegX other than verbatimElevation? Check with Nick.
mappings/SALVIAS-VegBIEN.organisms.csv
1 1
SALVIAS,VegBIEN:/taxonoccurrence,Comments
2
PLOT_ID,,"Brad: Not sure why this is repeated? This field and plotCode, as the same as above."
3
collector_code,,Brad: OMIT
4
comments,,Brad: OMIT
5
fam_status,,Brad: OMIT. This will be determined later by using TNRS.
6
gen_status,,Brad: OMIT. This will be determined later by using TNRS.
7
ind_id,,Brad: OMIT
8
species_status,,"Brad: OMIT. Except, note that if species_status=3, this indicate that name is a morphospecies and not a standard latin name. Not exactly sure how to use this in BIEN, but could be useful during the name-scrubbing process with TNRS."
9
voucher_string,,"Brad: OMIT. This is the verbatim text, which includes both collectors name and collection number. I would use coll_number, below."
2 10
subplot,/*_id/locationevent/*_id/location/authorlocationcode,
3 11
plot_code,/*_id/locationevent/*_id/location/parent_id/location/authorlocationcode,"Brad: Same as plotCode, above"
4 12
census_date,/*_id/locationevent/obsstartdate/_date/year,
mappings/SALVIAS_db-VegBIEN.plots.csv
1 1
SALVIAS:plotMetadata,VegBIEN:/locationevent,Comments
2
AccessCode,,
3
ElevSource,,
4
Habitat,,
5
MethodCode,,
6
PrecipSource,,
7
PrimOwnerID,,
8
RevisionComments,,
9
SiteName,,
10
TempSource,,
11
bearing,,
12
lat_long_accuracy,,
13
lat_string,,
14
long_string,,
15
new_world,,
16
orig_filename,,
17
plot_administrator,,
18
plot_notes,,
19
pol1_type,,
20
pol2_type,,
21
recensused,,"Brad: This is a 0/1 value, internal to SALVIAS. 1 indicates that a  plot has >1 set of values, from different census events.; Aaron: Different censuses are distinguished in organisms data by different census_no values"
22
tmp_del,,
23
topography_desc,,
24
vegetation_1,,
25
vegetation_2,,
2 26
plot_area_ha,/*_id/location/area,"Brad: Area in hectares. Is there any way to store units?; Aaron: VegX plot area annotation says ""Total area of the plot in square meters."" so units are fixed"
3 27
SiteCode,/*_id/location/authorlocationcode,Brad: plotCode is as-assigned by data provider; guranteed to be unique only within dataset (=project)
4 28
Elev,/*_id/location/elevation/_alt/1,Brad: Mean elevation in meters. This is a constrained decimal value; is there no place for this in VegX other than verbatimElevation? Check with Nick.
