moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
inputs/*/*/map.csv: added distinguishing #... suffix (e.g. UNUSED#institutionID) to the special terms OMIT, PRIVATE, UNUSED (VegCore.vegpath.org#Special-terms) to avoid creating a collision in the staging table renaming
mappings/VegCore.htm: Regenerated from wiki. Added flower, fruit, commonName.
inputs/input.Makefile: %/.map.csv.last_cleanup: Run fix_line_endings after canon/translate to standardize Python's \r\n line endings back to \n. This prevents issues with mixed line endings because LibreOffice (and probably Excel) treat all cell-internal line endings as \n but row line endings as whatever the file had, while text editors like jEdit translate all line endings to whatever the autodetected line ending is. (This creates spurious line ending diffs when a map spreadsheet containing multiline cells is edited in a text editor.)
inputs/SALVIAS/: Regenerated salvias_*.schema.sql from the MySQL version, to take advantage of my2pg improvements. The placeholder *_index columns which take the place of MySQL's inline index definitions have now been replaced by no-op CHECK constraints, so that there are no longer lots of dummy *_index columns in the map spreadsheets.
mappings/VegCore.htm: Regenerated from wiki. Remapped organismNotes to be a synonym of occurrenceRemarks, since notes on an organism are more generally notes on an occurrence.
mappings/VegCore.htm: Regenerated from wiki. Brad's new DwC ID terms spreadsheet has now been added, and a number of the ID terms clarified, disambiguated, and recategorized. In particular, institutionCode has now been split into the custodialInstitutions and collectingInstitution, to differentiate between which institution has the specimen vs. stamped the specimen. This distinction is important because the catalogNumber, stamped on the specimen, is only unique within the collectingInstitution. Most datasources don't unambiguously specify which institution their institutionCode is referring to, so it has been assumed to be custodialInstitutions unless a data dictionary says otherwise (as is the case for UNCC). In addition, a MatchedTaxonDetermination table has been added with the *_matched fields from TNRS.
mappings/VegCore-VegBIEN.csv, inputs/*/*/map.csv: Applied term renamings from the new dynamically generated Veg+-VegCore.csv, which reflects the current state of the data dictionary. (Permanently switching to the new Veg+-VegCore.csv will be a separate change.) Updates to VegCore term names that have occurred since the data dictionary was created are now able to take effect, which involves remapping and inferring units on several fields.
mappings/VegCore.csv: Regenerated from wiki
inputs/SALVIAS*/Organism/map.csv: Remapped voucher_string/coll_number to recordNumber instead of catalogNumber, because this number is actually applied by the collector rather than by a herbarium
inputs/*/*/map.csv: Remapped recordNumber to new individualCode where applicable
mappings/VegCore.csv: Term names: Changed special characters to _ because Redmine doesn't support special characters in HTML anchors (it removes everything except letters, numbers, _, and -)
mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
inputs/SALVIAS*/plotObservations/map.csv: Remapped Habit to growthForm with _map filter applied
inputs/SALVIAS/: Updated to new salvias_plots export on nimoy, which has a different schema
inputs/SALVIAS/: Mapped unmapped fields with a VegCore/VegBIEN equivalent. plotMetadata_/: Remapped life_zone to communityID because it is now alt-ed together with vegetation*, and thus not just a description with life_zone_code as its globally unique name.
inputs/SALVIAS/plotObservations/map.csv, inputs/SALVIAS-CSV/Organism/map.csv: height_m, stem_height_m: Remapped to height_m using units from <http://salvias.net/Documents/salvias_data_dictionary.html#Plot+data>
inputs/SALVIAS/plotObservations/map.csv, inputs/SALVIAS-CSV/Organism/map.csv: x_position, y_position: Remapped to organismX_m/organismY_m using units from <http://salvias.net/Documents/salvias_data_dictionary.html#Plot+data>
mappings/VegCore.csv: Renamed verbatim* taxonomic terms to original* because in most datasources, they are in fact for the original taxon determination of the organism (which can be a completely different name than the primary determination), rather than merely unscrubbed versions of the primary taxonomic name elements. Note that SALVIAS's orig_* terms do appear to be merely unscrubbed versions, but it's not a problem to add an additional taxon determination for them.
inputs/*/*/map.csv: Prefix a * to every term that's not in Veg+ for easy identification of unmapped terms when editing map.csv. Note that canon will remove the * when it finds a matching Veg+ term.
inputs/SALVIAS/plotObservations/map.csv: Remapped temp_dbh from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units based on the units for intercept_cm, which measures the same dimension
mappings/VegCore.csv: Renamed diameterBreastHeight to diameterBreastHeight_m to assert units matching the VegBIEN field
inputs/SALVIAS/*/map.csv: Remapped all versions of stem and tree tags to tag, with the second tag superceding the first, to avoid the complex VegCore-VegBIEN mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. inputs/SALVIAS-CSV/Organism/map.csv: stem and tree tags: Made the stem tag supercede the tree tag instead of vice versa, to have as specific of a tag as possible.
inputs/*/*/map.csv: Remapped intercept_cm to new intercept_cm so that units match
mappings/VegCore-VegBIEN.csv: Changed volumeCanopy to the more accurate intercept_m. volumeCanopy was the closest equivalent VegX term, but did not really fit line-intercept information, nor did it include units.
inputs/*/*/map.csv: Changed output column header from Veg+ to VegCore because the names will be VegCore names after automapping. This is possible now that we're using new automapping scripts that do not require a particular column header.
inputs/*/*/map.csv: Moved filter suffixes to separate filter column to enable automapping to work on those mappings' terms, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Move-filter-suffixes-to-separate-filter-column>. Note that the only changes to VegBIEN.csvs are the (now automapped) names of terms in "No join mapping" comments.
inputs/*/*/map.csv: Added Filter column to contain any suffix added after the term, so that the automapping mechanism does not have to deal with the filter expressions
inputs/*/*/map.csv: Removed no longer needed [Veg+] suffix in root, because the input column is no longer used by old-style map utilities such as union that needed this
inputs/*/*/map.csv: Changed empty mappings to self mappings, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Change-empty-mappings-to-self-mappings>. Note that in map.full.csv and VegBIEN.csv, lines that have changed are always the result of the input field's case being changed to match the case of the datasource's actual column name.
inputs/*/*/map.csv: Added back automapped mappings to map.csv, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Map_refactoring#Add-back-automapped-mappings-to-mapcsv>
inputs/: Added [Veg+] to via map roots to indicate that the datasource and Veg+ vocabularies are combinable. This is possible now that automapped entries are no longer subtracted when this is in the map root, so there is no concern of losing comments on subtracted automapped rows. Note that this change turns on old-style automapping for these datasources, causing SALVIAS plotMetadata to acquire additional mappings.
input.Makefile: Maps building: %/.map.csv.last_cleanup: Translate map.csv using $(mappings)/$(via)-VegCore.csv
input.Makefile: Maps building: %/.map.csv.last_cleanup: Canonicalize map.csv using $(mappings)/$(via).vocab.csv
inputs/SALVIAS/: Switched to using the DB export's staging tables instead of the exported CSVs
inputs/: Renamed subfolders to VegCSV names, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-to-VegCSV-names>
inputs/SALVIAS*/1.organisms/map.csv: Map directly to locationID, plotName instead of parentLocationID, parentPlotName because these terms now map correctly to the parent location when a subplot column exists
inputs/SALVIAS*/1.organisms/map.csv: Mapped subplot, Line to new subplot VegCore term
inputs/SALVIAS*/1.organisms/map.csv: Remapped cfaff to identificationQualifier, because it was previously mapped to the same taxondetermination as the Orig* terms but does not have a corresponding Orig prefix to indicate that it should apply to the original determination instead of the primary TNRS one
inputs/SALVIAS*/1.organisms/map.csv: Removed computer.* prefix from primary (TNRS) taxondetermination, so it would map to the main taxondetermination in VegBIEN
inputs/SALVIAS*/1.organisms/map.csv: Remapped OrigFamily/OrigGenus/OrigSpecies to new verbatim* taxonomic names. Also remapped cfaff to verbatimIdentificationQualifier, because it was previously mapped to the same taxondetermination as the Orig* terms, but this will later need to be remapped to identificationQualifier (not in this commit because that is a separate change). Note that the switch to the verbatim* taxonomic names removes a concatenated binomial that was part of the previous mappings, which put OrigGenus and OrigSpecies together into one scientificName.
inputs: Move src subdir into main dir, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-src-subdir-into-main-dir>
inputs/SALVIAS*/src/1.organisms/map.csv: Removed no longer applicable comments, which related to mappings that were in effect long ago
inputs/SALVIAS*/src/1.organisms/map.csv: Habit: Mapped to new Veg+ habit term
inputs/SALVIAS*/src/1.organisms/map.csv: Habit: Don't filter out values not part of the provided terms list, because such values should be flagged as invalid in the error maps rather than silently discarded. This also ensures that any valid values which are not part of the provided terms list are kept.
inputs: Moved maps into subfolders, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-maps-into-subfolders>
inputs: Replaced Veg+ prefix with map on via maps, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Replace-Veg-prefix-with-map-on-via-maps>
inputs: Renamed organisms table to 1.organisms so import order would be inherent in the dir name, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-with-import-order>
Merged DwC (including DwC1) and VegCSV mappings into new Veg+ schema. This involves replacing occurrences of DwC and VegCSV with Veg+ (or sometimes VegCore) everywhere, as described in <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV-DwC_merging>.
mappings/DwC2-VegBIEN.specimens.csv, VegCSV-VegBIEN.specimens.csv: Split occurrenceID into occurrenceID and individualID, where individualID refers to the plant in plots data and occurrenceID refers to the specimen in specimens data. This prevents plant sourceaccessioncodes from being mapped to the specimenreplicate, which was messing up stems mappings for the parent plantobservation. It also avoids mapping the specimenreplicate sourceaccessioncode to additional tables where it isn't needed. (Note that occurrenceID is needed for location to ensure that each specimen gets its own location to make locationdeterminations on. Everything else is directly or indirectly scoped by location when its own sourceaccessioncode isn't specified.)
input via maps: Removed _date/date filter from date fields because the main mappings now have _date around all dates, so this filter is redundant
inputs/SALVIAS*/maps/VegCSV.organisms.csv: census_date: Documented that this is for the subplot, not the organism, as all organisms in a subplot have the same value for it
mappings/VegCSV-VegBIEN.specimens.csv: Changed plotID to locationID and parentPlotID to parentLocationID to use DwC-related terms
mappings/VegCSV-VegBIEN.specimens.csv: Changed VegCSV term fieldNumber (from DwC) to recordNumber to be consistent with the TDWG meaning of fieldNumber, which defines it as the author code for the event, not the organism (what VegBIEN calls the authorlocationcode and VegBank calls the authorObsCode)
plots inputs: Remapped all VegX via maps to VegCSV. See steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegX-%3EVegCSV>.
inputs/SALVIAS/maps/VegX.organisms.csv: Map organism notes to different place than stem notes, because these are separate fields
inputs/SALVIAS*/maps/VegX.organisms.csv: Fixed missing join mappings for stemobservation-related fields
mappings/VegX-VegBIEN.stems.csv: Reversed input XPaths so that they start with plot instead of individualOrganismObservation as stem
mappings/VegX-VegBIEN.stems.csv: VegX XPaths: Expanded {} expressions using expand_braces, so that later use of expand_braces on the file would not affect the VegX output mappings of the inputs' via maps (VegX.organisms.csv, etc.)
mappings/DwC2-VegBIEN.specimens.csv, VegX-VegBIEN.stems.csv: Removed all manual mappings to datasource_id now that datasource_id is auto-populated, both on the VegBIEN output side and the DwC/VegX input side. This should greatly simplify many of the mappings!
inputs/SALVIAS: Switched to using CSV exports of the DB, so that staging tables could be created for column-based import
VegX mappings: Changed taxonDetermination of role identifier to instead have explicitly no role, because data providers' VegX files generally do not provide role information and we don't want the default taxonDetermination XPaths to require this
VegX mappings: Map locationevent.sourceaccessioncode to plotUniqueIdentifier since this field is no longer being used by authorlocationcode
VegX mappings: Map the authorlocationcode to plotName instead of plotUniqueIdentifier because it's a better fit
VegX mappings: Renamed taxonNameUsageConceptsID to taxonNameUsageConceptID (no plural) to match VegX 1.5.3
VegX mappings: taxonConcept mappings: Added "tcs:" namespace prefix to appropriate elements. This will make the taxonConcept XPaths compatible with CTFS VegX.
inputs/SALVIAS*/maps: Cleaned up maps for the first time since all via maps became subject to cleanup
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field identificationLabel2 to identificationLabel. Distinguish what are now two identificationLabel fields of the same name by tagging each one with [@id=2] or [@id=1]. inputs/SALVIAS-CSV/maps/VegX.organisms.csv: Merge tag1/stem_tag1 and tag2/stem_tag2 using _alt, since they are never set to different values when both are not NULL (although sometimes just one or just the other is not NULL).
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field tag2 to identificationLabel2 to reflect that it will become a second instance of identificationLabel
VegX-VegBIEN.organisms.csv: Re-mapped individualOrganismObservation user-defined field lineCover to already existing volumeCanopy
VegX-VegBIEN.organisms.csv: Re-mapped individualOrganismObservation user-defined field cover to already existing attribute.coverPercent
VegX-VegBIEN.organisms.csv: Re-mapped individualOrganismObservation user-defined field count to already existing aggregateOrganismObservation.aggregateValue
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation/plotObservation user-defined fields sourceaccessioncode to sourceAccessionCode to be consistent with VegX case sensitivity
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field interceptCm to lineCover to be consistent with VegBIEN
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field individualCode to authorPlantCode to be consistent with VegBIEN
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field htFirstBranchM to heightFirstBranch to be consistent with VegBIEN
VegX-VegBIEN.organisms.csv: Renamed individualOrganismObservation user-defined field coverPercent to cover to be consistent with VegBIEN
inputs/SALVIAS*/maps/VegX.organisms.csv: Habit: Ignore invalid values instead of generating a SyntaxException
xml_func.py: _map: Instead of _closed special entry, make all maps closed by default and open them if special entry "*=*" is present. Support using a _map to filter values by interpreting special entry "*=" as removing all values not explicitly specified, and by interpreting special value "*" as keeping input value the same.
VegX mappings: Changed userdef xPosition, yPosition to /relativePlotPosition/relativeX, /relativePlotPosition/relativeY
inputs/SALVIAS/maps/VegX.organisms.csv: Habit: Fixed syntax error in growthForm map
inputs/SALVIAS/maps/VegX.organisms.csv: Habit: Removed input values from growthForm map that Brad said were invalid
inputs/SALVIAS/maps/VegX.organisms.csv: Mapped custom Habit values not listed in the SALVIAS data dictionary
VegX-VegBIEN mapping: Mapped userdefined fields to new first-class fields
SALVIAS mappings: Fixed plot key mappings to map the correct values to subplot and parent plot
SALVIAS-db VegX mapping: Map subplots correctly the way SALVIAS-CSV does
VegX-VegBIEN mapping: Map sourceaccessioncode and voucher (catalognumber_dwc) to correct place. SALVIAS mappings: Map SourceVoucher as an alternative to coll_number.
VegX-VegBIEN mapping: Mapped stem tags to new stemtag table
SALVIAS mappings: Mapped habit to growthForm (user-defined field) instead of habit
VegX-VegBIEN mapping: Include the datasource name (now provided by map in /_ignore/inLabel) in the appropriate places in both VegX and VegBIEN
inputs/SALVIAS/maps/VegX.organisms.csv: Mapped OrigSpecies and OrigGenus combined to new plantlevel Binomial
inputs/SALVIAS/maps/VegX.organisms.csv: Map cfaff to taxonConcept/fit, which maps to taxondetermination.taxonFit
inputs/SALVIAS/maps/VegX.*.csv: Replaced symlinks with actual files
mappings/Makefile: Run simplify_xpath on VegX-VegBIEN.organisms.csv
mappings: Map ScientificNameAuthor to plantconcept with rank author
SALVIAS organisms mapping: Removed redundant PlotCode mapping because the association to plotevent is done with PlotID
VegX mappings: Use baseDistance/value instead of baseDistance so we can later use complexUserDefined to distinguish between different types of dbh
SALVIAS mappings: use PlotID as authorObsCode to link plot observations and organisms correctly for organisms without a PlotCode
VegX-VegBank organisms mapping: Added collectionDate mapping