input.Makefile: Maps building: Use new mappings/VegCore.csv as the VegCore vocabulary to canonicalize on, in order to also canonicalize VegCore terms which are not yet mapped to VegBIEN. This results in several DwC terms getting their case standardized according to http://rs.tdwg.org/dwc/terms/. Continue to determine unmapped terms using mappings/VegCore-VegBIEN.csv, because a term should not be considered mapped until it has been mapped all the way through to VegBIEN.
mappings/VegCore.csv: Removed trailing spaces from terms
mappings/Veg+.unmapped_terms.csv: Removed duplicates of VegCore terms
mappings/: Split Veg+.terms.csv into VegCore.csv and Veg+.unmapped_terms.csv
mappings/Veg+.terms.csv: Removed terms that are in mappings/Veg+-VegCore.csv
mappings/Veg+-VegCore.csv: Added sources where missing
mappings/Veg+-VegCore.csv: Added Source and Comments columns from mappings/Veg+.terms.csv. Reordered columns to put Comments first.
mappings/Veg+.terms.csv: Removed duplicate entries for stem_id/stemID, collector
inputs/import.stats.xls: Updated import times
inputs/REMIB/Specimen/: Filter out invalid, frameshifted rows so they don't produce errors in the import or anomalies like thousands of taxondeterminations for one taxonoccurrence. This involves moving the CSVs to Specimen.src and using a create.sql to create the filtered table.
mappings/VegCore-VegBIEN.csv: Forward occurrenceID to taxonoccurrence.sourceaccessioncode when there is no other taxonoccurrence.sourceaccessioncode, to ensure that taxonoccurrence is uniquely identified so that there is one taxonoccurrence per organism
mappings/VegCore-VegBIEN.csv: taxonoccurrence.authortaxoncode alternatives: Use _first instead of _alt because when one of these fields is present, it can be used directly even if it's sometimes NULL, without needing to spend a lot of time _alting together fields that won't be used. Datasources where the authortaxoncode is sometimes NULL usually have a separate sourceaccessioncode for the taxonoccurrence. (In the rare case that they don't, they should map a non-NULL field to recordNumber or tag to ensure that taxonoccurrences can be uniquely identified.)
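The difference between the two strategies in the entry above can be sketched in Python. This is only an illustration of the behavior as that entry describes it, not the actual VegBIEN mapping functions: the helper names `alt` and `first`, the `columns` argument, and the example field names are all hypothetical.

```python
def alt(row, fields):
    """Per-row coalesce: return the first non-None value among `fields`."""
    for field in fields:
        value = row.get(field)
        if value is not None:
            return value
    return None


def first(columns, row, fields):
    """Per-datasource choice: use the first field that the datasource
    provides at all, even if its value is None in this particular row."""
    for field in fields:
        if field in columns:
            return row.get(field)
    return None


# Hypothetical datasource that has both columns, with a NULL in the first one
columns = {"authorTaxonCode", "recordNumber"}
row = {"authorTaxonCode": None, "recordNumber": "R1"}
fields = ["authorTaxonCode", "recordNumber"]

alt_value = alt(row, fields)               # falls through to recordNumber
first_value = first(columns, row, fields)  # stays with authorTaxonCode
```

In this sketch `first` only inspects which columns exist, so it never has to combine values from fields that will not be used, which is the cost the entry above wants to avoid.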
mappings/VegCore-VegBIEN.csv: Mapped tag to taxonoccurrence.authortaxoncode when the record is an organism, in case there is no other ID for the taxonoccurrence. This fixes a bug in FIA and TEAM data where all organisms in a plot used the same taxonoccurrence because taxonoccurrence was not properly constrained, causing the loss of individual taxondeterminations on each organism.
input.Makefile: Testing: %/test.by_col.xml: Abort the tester if the by-column test fails. This is now possible because there are no longer small rowcount differences between row-based and column-based import on some datasources.
schemas/vegbien.sql: stemobservation: stemobservation_unique_within_plantobservation unique index: Added tag so that a stemobservation can be scoped by its tag when no other ID is specified
schemas/vegbien.sql: stemobservation: stemobservation_unique_within_plantobservation unique index: Fixed bug where filter condition underconstrained stemobservation when neither sourceaccessioncode nor authorstemcode was specified, by making sure that at least one *_unique index always applies
mappings/VegCore-VegBIEN.csv: Remapped tag to new stemobservation.tag
schemas/vegbien.sql: stemobservation: Added tag, tags
mappings/VegCore-VegBIEN.csv: tag: Removed no longer applicable comment
mappings/VegCore-VegBIEN.csv: Removed no longer used previousTag and the complex mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. tag: Removed iscurrent=true because there is now only one tag field.
inputs/SALVIAS/*/map.csv: Remapped all versions of stem and tree tags to tag, with the second tag superseding the first, to avoid the complex VegCore-VegBIEN mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. inputs/SALVIAS-CSV/Organism/map.csv: stem and tree tags: Made the stem tag supersede the tree tag instead of vice versa, to have as specific a tag as possible.
inputs/SALVIAS/stems/map.csv: Copied Brad's comments on plotObservations.tag1, tag2 to stem_tag1, stem_tag2
mappings/VegCore-VegBIEN.csv: Removed _rangeStart and _rangeEnd filters from fields which should contain decimal values. These filters should be added on a per-datasource basis instead.
inputs/ARIZ/Specimen/map.csv: Documented that MinimumElevationInMeters, MaximumElevationInMeters contain some verbatim values, including ranges and units
mappings/VegCore-VegBIEN.csv: Removed /_units:[default=m,to=m,to=]/value filter from fields. It should be added on a per-datasource basis instead.
mappings/VegCore-VegBIEN.csv: Removed /_replace:["\bca\.?"=]/value filter from fields. It should be added on a per-datasource basis instead.
mappings/VegCore-VegBIEN.csv: verbatimElevation->elevation_m mapping: Translate units automatically (currently only works in row-based mode). Don't remove any "ca." prefix because this is a datasource-specific filter that does not apply to current datasources with verbatimElevation. Also map verbatimElevation to location.verbatimelevation.
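As a rough illustration of what translating a verbatim elevation with units into meters involves, here is a minimal Python sketch. It is not the VegBIEN implementation: the function name `elevation_m`, the regular expression, and the table of supported units are assumptions made for this example. Like the mapping described above, it does not strip a "ca." prefix, so such values are simply not parsed.

```python
import re

# Assumed conversion factors to meters (illustrative, not exhaustive)
_TO_METERS = {"m": 1.0, "ft": 0.3048, "km": 1000.0}

# A number followed by a required unit, e.g. "1200 m" or "100 ft."
_PATTERN = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([a-zA-Z]+)\.?\s*$")


def elevation_m(verbatim):
    """Return the elevation in meters, or None if the value cannot be
    parsed (no units, an unknown unit, a range, or a "ca." prefix)."""
    match = _PATTERN.match(verbatim)
    if match is None:
        return None
    value, unit = match.groups()
    factor = _TO_METERS.get(unit.lower())
    if factor is None:
        return None
    return float(value) * factor
```

Returning None for unparseable input keeps datasource-specific cleanup, such as removing "ca." or splitting ranges, out of the shared mapping, in line with the per-datasource filters mentioned in the neighboring entries.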
inputs/NCU-NCSC/Specimen/map.csv: Elevation: Removed comment that it includes units, because this is now part of the definition of verbatimElevation
mappings/Veg+.terms.csv: Documented that verbatimElevation must include units
inputs/ARIZ/Specimen/map.csv: Remapped VerbatimElevation to UNUSED
inputs/*/*/map.csv: Remapped all unused terms to special value UNUSED. Remapped all private terms to special value PRIVATE. Remapped all deliberately unmapped terms to special value OMIT.
mappings/Veg+-VegCore.csv: Remapped realLatitude, realLongitude to new special value PRIVATE, which is more specific than OMIT
mappings/Veg+.terms.csv: Added special value PRIVATE
mappings/Veg+.terms.csv: Added special values OMIT, UNUSED
inputs/VegBank/plot_/map.csv: Remapped elevation from verbatimElevation to elevationInMeters, since the values are all decimals. The units come from the data dictionary.
inputs/SALVIAS/plotMetadata/map.csv, inputs/SALVIAS-CSV/Plot/map.csv: Remapped elev_m from verbatimElevation to elevationInMeters, since the values are all decimals. Note that the units of SALVIAS Elev were provided by a comment from Brad (and can also be assumed to be the same as SALVIAS-CSV elev_m).
inputs/NCU-NCSC/Specimen/map.csv: Documented that Elevation includes units
inputs/Madidi/Plot/map.csv: Remapped Minimum altitude from minimumElevationInMeters to verbatimElevation_m, since it is a range, not a minimum. Note that the units are assumed based on the range of values present and the region the data is from (Madidi National Park).
mappings/VegCore-VegBIEN.csv: Also mapped verbatimElevation_m to verbatimelevation
mappings/VegCore-VegBIEN.csv: Also mapped verbatimElevation_m to elevationrange_m
mappings/VegCore-VegBIEN.csv: Mapped verbatimElevation_m
mappings/Veg+.terms.csv: Added verbatimElevation_m
mappings/Veg+-VegCore.csv: Mapped realLatitude, realLongitude to OMIT because private data should not be placed in a public database
mappings/Veg+.terms.csv: Added realLatitude, realLongitude
inputs/VegBank/plot_/map.csv: Documented that elevationrange is unused
inputs/Madidi/Plot/map.csv: Fixed comments on Direction and Orientación/exposicion so each comment refers to the other field that is equivalent
inputs/Madidi/Plot/map.csv: Remapped Altitude from verbatimElevation to elevationInMeters, since the values are all decimals. Note that the units are assumed based on the range of values present and the region the data is from (Madidi National Park).
inputs/CTFS/Plot/map.csv: Remapped Elevation from verbatimElevation to elevationInMeters, since it is a float in the original bci.sql database. Note that the units are assumed based on the range of values present and the country the data is from (Panama).
mappings/VegCore-VegBIEN.csv: Mapped elevationInMeters
mappings/Veg+.terms.csv: Added elevationInMeters
schemas/vegbien.sql: location: Added verbatimelevation
README.TXT: Data import: Added note that `make schemas/reinstall` must be done after running make_analytical_db on a previous import
schemas/vegbien.sql: Added indexes for additional analytical_db_view joins, as described at <https://projects.nceas.ucsb.edu/nceas/issues/494>
schemas/vegbien.sql: Added indexes for the analytical_db_view joins, as described at <https://projects.nceas.ucsb.edu/nceas/issues/494>
README.TXT: Data import: Added note that `make schemas/rotate` must be done after running make_analytical_db
schemas/functions.sql: Renamed _pct_to_frac() to _percent_to_fraction() and _frac_to_pct() to _fraction_to_percent(), for clarity and for consistency with _percent (which is spelled out), as used by SALVIAS (http://salvias.net/Documents/salvias_data_dictionary.html) and elsewhere
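The arithmetic behind the unit-conversion functions named in this log (_percent_to_fraction(), _fraction_to_percent(), _ha_to_m2(), _m2_to_ha(), _cm_to_m()) can be sketched in Python. The real functions live in schemas/functions.sql and are not reproduced here; the Python names below and the pass-through of None (standing in for SQL NULL) are assumptions of this sketch.

```python
M2_PER_HA = 10000.0  # 1 hectare = 10,000 square meters
CM_PER_M = 100.0     # 1 meter = 100 centimeters


def _scale(value, factor):
    """Multiply by `factor`, passing None through (assumed NULL handling)."""
    return None if value is None else float(value) * factor


def percent_to_fraction(value):
    return _scale(value, 1 / 100.0)


def fraction_to_percent(value):
    return _scale(value, 100.0)


def ha_to_m2(value):
    return _scale(value, M2_PER_HA)


def m2_to_ha(value):
    return _scale(value, 1 / M2_PER_HA)


def cm_to_m(value):
    return _scale(value, 1 / CM_PER_M)
```

Each pair is an inverse, which is what lets a mapping store a value in the suffixed unit (for example a cover fraction or an area in m2) while a view converts it back to percent or hectares for output.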
review: Don't remove XML functions that are unit conversions
schemas/vegbien.sql: Changed _frac units suffix to _fraction for clarity and for consistency with _percent (which is spelled out), as used by SALVIAS (http://salvias.net/Documents/salvias_data_dictionary.html) and elsewhere
inputs/*/*/map.csv: Remapped intercept_cm to new intercept_cm so that units match
mappings/VegCore-VegBIEN.csv: Mapped intercept_cm
schemas/functions.sql: Added _cm_to_m()
mappings/Veg+.terms.csv: Added intercept_cm
mappings/VegCore-VegBIEN.csv: Changed volumeCanopy to the more accurate intercept_m. volumeCanopy was the closest equivalent VegX term, but did not really fit line-intercept information, nor did it include units.
mappings/Veg+.terms.csv: Added intercept_m
schemas/vegbien.sql: taxonscope: Added comment that it stores the scope of a morphospecies name
README.TXT: Data import: Commit: Shortened import message to fit on one line in the README, to avoid issues when copying and pasting
schemas/functions.sql: Added _ha_to_m2(text), _pct_to_frac(text)
schemas/vegbien.sql: analytical_db_view: Use _m2_to_ha() on location.area_m2 to get plotAreaHa
schemas/functions.sql: Added _m2_to_ha()
mappings/VegCore-VegBIEN.csv, Veg+.terms.csv: Removed imprecise and no longer used plotArea and area. Use plotArea_<units> instead.
inputs/*/*/map.csv: Remapped applicable plotArea fields to plotArea_m2
mappings/VegCore-VegBIEN.csv: Mapped plotArea_m2
mappings/Veg+.terms.csv: Added plotArea_m2
mappings/VegCore-VegBIEN.csv: Renamed plotAreaHa to plotArea_ha for consistency with VegBIEN units suffixing convention, which includes an "_"
inputs/*/*/map.csv: Remapped applicable plotArea fields to plotAreaHa
mappings/Veg+-VegCore.csv: Removed inaccurate SizeOfSite->plotArea mapping, which does not match units
mappings/VegCore-VegBIEN.csv: Mapped plotAreaHa
schemas/functions.sql: Added _ha_to_m2()
mappings/Veg+.terms.csv: Added plotAreaHa
mappings/Veg+.terms.csv: Standardize area using VegX /plots/plot/area instead of Madidi Inventory+description.Area
schemas/vegbien.sql: analytical_db_view: Use _frac_to_pct() on aggregateoccurrence.cover_frac to get pctCover
schemas/functions.sql: Added _pct_to_frac()
mappings/VegCore-VegBIEN.csv: coverPercent: Convert to fraction using _pct_to_frac()
xml_dom.py: replace_with_text(): Support ints and floats
xml_func.py: simplify(): Run xml_dom.prune_empty() on function nodes that don't have an explicit simplifying function. This allows single-arg functions with no arg to be pruned rather than called with no args (causing errors if the single param does not have a default value).
Regenerated vegbien.ERD exports
schemas/vegbien.sql: Added units suffix to additional VegBIEN fields that have units
schemas/vegbien.sql: Added units suffix to all core VegBIEN fields that have units. It is the responsibility of the mappings to ensure that all units are properly translated.
root Makefile: PostgreSQL: postgres-Linux: Added postgresql-postgis apt-get
backups/Makefile: Backups: Full DB: Specify the date suffix of the backup when it's created rather than adding it afterwards. This allows the user to specify a suffix that matches the corresponding public-schema backup.
inputs/*/*/map.csv: Mapped variants of subspecies directly to new subspecies term
mappings/VegCore-VegBIEN.csv: subspecies, infraspecificEpithet: Added _alts for datasources that specify both
input.Makefile: Mapping: $(map2db): Inline $(map) because this is the only place it's used
input.Makefile: Mapping: $(map): Don't require flat files because they don't need to be used directly anymore (staging tables are used instead)
input.Makefile: Mapping: $(map2db): Always use staging tables, because the flat files don't need to be used directly anymore
mappings/Veg+-VegCore.csv: Remapped subspecies, subSpeciesName to new subspecies term