inputs/TEAM/: Obtained new download of TEAM data. (Note that the new download has a slightly different schema.) Archived old data in _archive/. Added tables to import_order.txt. Renamed TeamPlotMetaData/ to TEAM_Sites/ to correspond with the section header in Vegetation-Tree-and-Liana-Metadata-1.5.pdf. Fixed TEAM_Sites mappings: Remapped CollectionDate to eventDate because it relates to the plot, not the organism. Mapped Name to plotName so TEAM_Sites data will match up with VL, VT data.
inputs/TEAM/VL, VT: Split concatenated flat files apart into separate parts each time a header is duplicated, so that the header would be autoremoved by cat_csv. Changed modified BIEN2 flat file headers back to original headers (the duplicated headers) so the headers of all part files would match up. (This is required for cat_csv header autoremoval to work properly.) This results in changes to the input column names in */map.csv.
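The splitting step in the entry above can be sketched roughly as follows. This is an illustration only, not the script actually used: the name `split_on_repeated_header` is hypothetical, and the sketch assumes that the first line is the header and that any later line identical to it marks the start of a new part.

```python
def split_on_repeated_header(lines):
    """Split lines into parts, starting a new part at each repeat of the header.

    Assumes lines[0] is the header. Every part keeps its own copy of the
    header, so a tool that strips repeated headers while concatenating
    (as the log says cat_csv does) sees the same header on each part.
    """
    if not lines:
        return []
    header = lines[0]
    parts = []
    for line in lines:
        if line == header:
            parts.append([])  # an exact header match starts a new part
        parts[-1].append(line)
    return parts


# Made-up sample data: two flat files that were concatenated header and all.
flat = ["id,dbh", "1,10.5", "2,7.2", "id,dbh", "3,4.0"]
parts = split_on_repeated_header(flat)
```

Because a new part is started only on an exact header match, this works only when every part carries the same header. That fits the entry's note that the original headers had to be restored.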
sql_io.py: null_strs: Added 'nulo' (used by REMIB)
mappings/Veg+-VegCore.csv: DBH: Removed diameterBreastHeight_m alternative because datasources that don't append units to DBH almost always have units of cm or inches
inputs/TEAM/*/map.csv: Remapped dbh from diameterBreastHeight_m to diameterBreastHeight_cm, using the units defined in Vegetation-Metadata-1.4.pdf
inputs/import.stats.xls: Updated import times
inputs/TEAM/: Added TeamPlotMetaData
inputs/TEAM/_src/: Added ci-team_extract/Vegetation-Metadata-1.4.pdf and symlink to it in the _src subdir
inputs/: Added aggregated unmapped_terms.csv, new_terms.csv which were not already under version control
inputs/SALVIAS-CSV/Organism/map.csv: Remapped stem_dbh from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units based on the units for intercept_cm, which measures the same dimension
inputs/SALVIAS/stems/map.csv: Remapped stem_dbh from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units based on the units for plotObservations.intercept_cm, which measures the same dimension
inputs/SALVIAS/plotObservations/map.csv: Remapped temp_dbh from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units based on the units for intercept_cm, which measures the same dimension
inputs/Madidi/Organism/map.csv: Remapped Diameter from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units based on the range and precision of values
inputs/FIA/Organism/map.csv: DBH: Changed units comment to include that assumption was also based on location inside the U.S., because some data outside the U.S. also uses fractional DBHs, but these are not likely to be inch measurements
inputs/FIA/Organism/map.csv: Remapped DBH from diameterBreastHeight_m to diameterBreastHeight_in, assuming units based on the range and precision of values
inputs/CTFS/StemObservation/map.csv: DBH: Changed units comment to include that assumption was also based on the precision of values, because fractional DBHs sometimes indicate units of inches
mappings/VegCore.csv: Added diameterBreastHeight_in
schemas/functions.sql: Added _in_to_m()
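For reference, converting inches to meters means multiplying by 0.0254, and converting centimeters to meters means multiplying by 0.01. The real functions live in schemas/functions.sql. The Python below is only a sketch of the arithmetic: the names `in_to_m` and `cm_to_m` are illustrative, and passing NULL (None) through unchanged is an assumption.

```python
M_PER_IN = 0.0254  # exact: the international inch is defined as 2.54 cm
M_PER_CM = 0.01


def in_to_m(value_in):
    """Convert inches to meters; None is passed through (assumed NULL handling)."""
    return None if value_in is None else value_in * M_PER_IN


def cm_to_m(value_cm):
    """Convert centimeters to meters; None is passed through (assumed NULL handling)."""
    return None if value_cm is None else value_cm * M_PER_CM


dbh_from_in = in_to_m(10.0)  # a DBH recorded as 10 in
dbh_from_cm = cm_to_m(25.0)  # a DBH recorded as 25 cm
```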
mappings/Veg+-VegCore.csv: Remapped DBH from no longer existing term diameterBreastHeight to diameterBreastHeight_cm, diameterBreastHeight_m (both terms will be listed in the map spreadsheet after automapping, and the user can then choose one)
inputs/CTFS/StemObservation/map.csv: Remapped DBH from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units are cm based on the range of values
mappings/VegCore.csv: Added diameterBreastHeight_cm
mappings/VegCore.csv: Added stemID, which was only in mappings/VegCore-VegBIEN.csv
input.Makefile: Maps validation: Inline $(unmappedTerms) because it's only used once
input.Makefile: Maps validation: %/new_terms.csv: Include the entire map spreadsheet row, so that each new term is listed together with its mapping. This facilitates adding new mappings to mappings/Veg+-VegCore.csv directly from any new_terms.csv. Note that the use of `sort -u` (in lib/mappings.Makefile) sorts line by line, so the lines of a multiline comment are separated from their row and each shows up as a spurious line.
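The multiline-comment caveat in the entry above can be reproduced in a few lines. This sketch uses made-up sample rows and Python's code-point ordering, which may differ from the locale collation that `sort` uses. It compares a line-based unique sort with a sort over parsed CSV records.

```python
import csv
import io

# Made-up sample: the first row has a quoted comment spanning two lines.
csv_text = (
    'plotName,"maps to the plot\n'
    'not the organism"\n'
    'stemID,single-line comment\n'
)

# Line-based, in the manner of `sort -u`: unique physical lines, sorted.
line_sorted = sorted(set(csv_text.splitlines()))

# CSV-aware: parse into records first, then sort whole records.
record_sorted = sorted(set(map(tuple, csv.reader(io.StringIO(csv_text)))))
```

In `line_sorted` the continuation line `not the organism"` ends up ahead of the row it belongs to, while `record_sorted` keeps the two-line comment inside its record.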
inputs/: Added unmapped_terms.csv, new_terms.csv which were not already under version control
inputs/VegBank/plot_/: Automapped with new parentPlotID term, which now has a join mapping in mappings/VegCore-VegBIEN.csv
Regenerated unmapped_terms.csv, new_terms.csv
mappings/Veg+-VegCore.csv: Added parentPlotID
mappings/VegCore-VegBIEN.csv: Added parentLocationID, parentPlotName, which always map directly to the parent location, regardless of whether any subplot ID is present
mappings/Veg+.unmapped_terms.csv: Removed vague term volumeCanopy, which has no definition in VegX
mappings/Makefile: .VegCore.csv.last_cleanup: Fixed bug where the sorting columns had not been updated to match the new column order
mappings/VegCore.csv: Reordered columns to put Comments first, which matches mappings/Veg+-VegCore.csv
mappings/Veg+-VegCore.csv: Removed redundant stem_id->stemID mapping
mappings/Veg+-VegCore.csv: Standardized the capitalization of names, by camel-casing each name except for acronyms and "ID", which are made all uppercase
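One possible reading of the capitalization rule in the entry above is sketched below. Several things here are assumptions: the function name `standardize_name`, the contents of `ACRONYMS`, the lower-case first word, and splitting on underscores and whitespace. The log states only the general rule.

```python
import re

# Hypothetical acronym list; the set actually applied is not given in the log.
ACRONYMS = {"id", "dbh", "utm"}


def standardize_name(name):
    """Camel-case a name, upper-casing acronyms and "ID" (assumed rule)."""
    words = [w for w in re.split(r"[_\s]+", name.strip()) if w]
    out = []
    for i, word in enumerate(words):
        if word.lower() in ACRONYMS:
            out.append(word.upper())  # acronyms and "ID" are all uppercase
        elif i == 0:
            out.append(word[0].lower() + word[1:])  # assumed lower camel case
        else:
            out.append(word[0].upper() + word[1:])
    return "".join(out)


examples = [standardize_name(n) for n in ("stem_id", "plot name", "dbh", "Tree_Tag")]
```

Names that are already camel-cased without separators (for example `stemId`) are not split by this sketch and would need separate handling.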
mappings/VegCore.csv: Renamed diameterBreastHeight to diameterBreastHeight_m to assert units matching the VegBIEN field
mappings/VegCore.csv: Removed duplicates
input.Makefile: Maps building: Use new mappings/VegCore.csv as the VegCore vocabulary to canonicalize on, in order to also canonicalize VegCore terms which are not yet mapped to VegBIEN. This results in several DwC terms getting their case standardized according to http://rs.tdwg.org/dwc/terms/. Continue to determine unmapped terms using mappings/VegCore-VegBIEN.csv, because a term should not be considered mapped until it has been mapped all the way through to VegBIEN.
mappings/VegCore.csv: Removed trailing spaces from terms
mappings/Veg+.unmapped_terms.csv: Removed duplicates of VegCore terms
mappings/: Split Veg+.terms.csv into VegCore.csv and Veg+.unmapped_terms.csv
mappings/Veg+.terms.csv: Removed terms that are in mappings/Veg+-VegCore.csv
mappings/Veg+-VegCore.csv: Added sources where missing
mappings/Veg+-VegCore.csv: Added Source and Comments columns from mappings/Veg+.terms.csv. Reordered columns to put Comments first.
mappings/Veg+.terms.csv: Removed duplicate entries for stem_id/stemID, collector
inputs/REMIB/Specimen/: Filter out invalid, frameshifted rows so they don't produce errors in the import or anomalies like thousands of taxondeterminations for one taxonoccurrence. This involves moving the CSVs to Specimen.src and using a create.sql to create the filtered table.
mappings/VegCore-VegBIEN.csv: Forward occurrenceID to taxonoccurrence.sourceaccessioncode when there is no other taxonoccurrence.sourceaccessioncode, to ensure that taxonoccurrence is uniquely identified so that there is one taxonoccurrence per organism
mappings/VegCore-VegBIEN.csv: taxonoccurrence.authortaxoncode alternatives: Use _first instead of _alt because when one of these fields is present, it can be used directly even if it's sometimes NULL, without needing to spend a lot of time _alting together fields that won't be used. Datasources where the authortaxoncode is sometimes NULL usually have a separate sourceaccessioncode for the taxonoccurrence. (In the rare case that they don't, they should map a non-NULL field to recordNumber or tag to ensure that taxonoccurrences can be uniquely identified.)
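The distinction drawn in the entry above can be illustrated with a small sketch. It reflects only one reading of that entry and is not the actual implementation of `_alt` or `_first`. It assumes `_alt` combines fields per row and takes the first non-NULL value, while `_first` commits to the first field the datasource provides and uses it even where it is NULL. The function names, field order, and sample rows are made up.

```python
def alt(row, fields):
    """Per-row coalesce: the first non-None value among fields (assumed _alt)."""
    for field in fields:
        if row.get(field) is not None:
            return row[field]
    return None


def first(available_columns, fields):
    """Choose the first field that exists as a column (assumed _first)."""
    for field in fields:
        if field in available_columns:
            return field
    return None


fields = ["recordNumber", "tag"]  # illustrative order, not taken from the mappings
rows = [
    {"recordNumber": None, "tag": "T1"},
    {"recordNumber": "R2", "tag": "T2"},
]

alt_values = [alt(row, fields) for row in rows]
chosen = first(rows[0].keys(), fields)
first_values = [row[chosen] for row in rows]
```

Under these assumptions the `alt` approach inspects every field for every row, whereas the `first` approach resolves the column once and leaves NULLs in place. This is why the entry recommends mapping a non-NULL field when no separate sourceaccessioncode exists.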
mappings/VegCore-VegBIEN.csv: Mapped tag to taxonoccurrence.authortaxoncode when the record is an organism, in case there is no other ID for the taxonoccurrence. This fixes a bug in FIA and TEAM data where all organisms in a plot used the same taxonoccurrence because taxonoccurrence was not properly constrained, causing the loss of individual taxondeterminations on each organism.
input.Makefile: Testing: %/test.by_col.xml: Do abort tester if by-column test fails. There are no longer small rowcount differences between row-based and column-based import on some datasources, so this is now possible.
schemas/vegbien.sql: stemobservation: stemobservation_unique_within_plantobservation unique index: Added tag so that a stemobservation can be scoped by its tag when no other ID is specified
schemas/vegbien.sql: stemobservation: stemobservation_unique_within_plantobservation unique index: Fixed bug where filter condition underconstrained stemobservation when neither sourceaccessioncode nor authorstemcode was specified, by making sure that at least one *_unique index always applies
mappings/VegCore-VegBIEN.csv: Remapped tag to new stemobservation.tag
schemas/vegbien.sql: stemobservation: Added tag, tags
mappings/VegCore-VegBIEN.csv: tag: Removed no longer applicable comment
mappings/VegCore-VegBIEN.csv: Removed no longer used previousTag and the complex mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. tag: Removed iscurrent=true because there is now only one tag field.
inputs/SALVIAS/*/map.csv: Remapped all versions of stem and tree tags to tag, with the second tag superseding the first, to avoid the complex VegCore-VegBIEN mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. inputs/SALVIAS-CSV/Organism/map.csv: stem and tree tags: Made the stem tag supersede the tree tag instead of vice versa, to have as specific a tag as possible.
inputs/SALVIAS/stems/map.csv: Copied Brad's comments on plotObservations.tag1, tag2 to stem_tag1, stem_tag2
mappings/VegCore-VegBIEN.csv: Removed _rangeStart and _rangeEnd filters from fields which should contain decimal values. These filters should be added on a per-datasource basis instead.
inputs/ARIZ/Specimen/map.csv: Documented that MinimumElevationInMeters, MaximumElevationInMeters contain some verbatim values, including ranges and units
mappings/VegCore-VegBIEN.csv: Removed /_units:[default=m,to=m,to=]/value filter from fields. It should be added on a per-datasource basis instead.
mappings/VegCore-VegBIEN.csv: Removed /_replace:["\bca\.?"=]/value filter from fields. It should be added on a per-datasource basis instead.
mappings/VegCore-VegBIEN.csv: verbatimElevation->elevation_m mapping: Translate units automatically (currently only works in row-based mode). Don't remove any "ca." prefix because this is a datasource-specific filter that does not apply to current datasources with verbatimElevation. Also map verbatimElevation to location.verbatimelevation.
inputs/NCU-NCSC/Specimen/map.csv: Elevation: Removed comment that it includes units, because this is now part of the definition of verbatimElevation
mappings/Veg+.terms.csv: Documented that verbatimElevation must include units
inputs/ARIZ/Specimen/map.csv: Remapped VerbatimElevation to UNUSED
inputs/*/*/map.csv: Remapped all unused terms to special value UNUSED. Remapped all private terms to special value PRIVATE. Remapped all deliberately unmapped terms to special value OMIT.
mappings/Veg+-VegCore.csv: Remapped realLatitude, realLongitude to new special value PRIVATE, which is more specific than OMIT
mappings/Veg+.terms.csv: Added special value PRIVATE
mappings/Veg+.terms.csv: Added special values OMIT, UNUSED
inputs/VegBank/plot_/map.csv: Remapped elevation from verbatimElevation to elevationInMeters, since the values are all decimals. The units come from the data dictionary.
inputs/SALVIAS/plotMetadata/map.csv, inputs/SALVIAS-CSV/Plot/map.csv: Remapped elev_m from verbatimElevation to elevationInMeters, since the values are all decimals. Note that the units of SALVIAS Elev were provided by a comment from Brad (and can also be assumed to be the same as SALVIAS-CSV elev_m).
inputs/NCU-NCSC/Specimen/map.csv: Documented that Elevation includes units
inputs/Madidi/Plot/map.csv: Remapped Minimum altitude from minimumElevationInMeters to verbatimElevation_m, since it is a range, not a minimum. Note that the units are assumed based on the range of values present and the region the data is from (Madidi National Park).
mappings/VegCore-VegBIEN.csv: Also mapped verbatimElevation_m to verbatimelevation
mappings/VegCore-VegBIEN.csv: Also mapped verbatimElevation_m to elevationrange_m
mappings/VegCore-VegBIEN.csv: Mapped verbatimElevation_m
mappings/Veg+.terms.csv: Added verbatimElevation_m
mappings/Veg+-VegCore.csv: Mapped realLatitude, realLongitude to OMIT because private data should not be placed in a public database
mappings/Veg+.terms.csv: Added realLatitude, realLongitude
inputs/VegBank/plot_/map.csv: Documented that elevationrange is unused
inputs/Madidi/Plot/map.csv: Fixed comments on Direction and Orientación/exposicion so each comment refers to the other field that is equivalent
inputs/Madidi/Plot/map.csv: Remapped Altitude from verbatimElevation to elevationInMeters, since the values are all decimals. Note that the units are assumed based on the range of values present and the region the data is from (Madidi National Park).
inputs/CTFS/Plot/map.csv: Remapped Elevation from verbatimElevation to elevationInMeters, since it is a float in the original bci.sql database. Note that the units are assumed based on the range of values present and the country the data is from (Panama).
mappings/VegCore-VegBIEN.csv: Mapped elevationInMeters
mappings/Veg+.terms.csv: Added elevationInMeters
schemas/vegbien.sql: location: Added verbatimelevation
README.TXT: Data import: Added note that `make schemas/reinstall` must be done after running make_analytical_db on a previous import
schemas/vegbien.sql: Added indexes for additional analytical_db_view joins, as described at <https://projects.nceas.ucsb.edu/nceas/issues/494>
schemas/vegbien.sql: Added indexes for the analytical_db_view joins, as described at <https://projects.nceas.ucsb.edu/nceas/issues/494>
README.TXT: Data import: Added note that `make schemas/rotate` must be done after running make_analytical_db
schemas/functions.sql: Renamed _pct_to_frac() to _percent_to_fraction() and _frac_to_pct() to _fraction_to_percent(), for clarity and for consistency with _percent (which is spelled out), as used by SALVIAS (http://salvias.net/Documents/salvias_data_dictionary.html) and elsewhere
review: Don't remove XML functions that are unit conversions
schemas/vegbien.sql: Changed _frac units suffix to _fraction for clarity and for consistency with _percent (which is spelled out), as used by SALVIAS (http://salvias.net/Documents/salvias_data_dictionary.html) and elsewhere
inputs/*/*/map.csv: Remapped intercept_cm to new intercept_cm so that units match
mappings/VegCore-VegBIEN.csv: Mapped intercept_cm
schemas/functions.sql: Added _cm_to_m()
mappings/Veg+.terms.csv: Added intercept_cm
mappings/VegCore-VegBIEN.csv: Changed volumeCanopy to the more accurate intercept_m. volumeCanopy was the closest equivalent VegX term, but did not really fit line-intercept information, nor did it include units.