moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this signficant change to the mappings did not change the semantics of the import of taxonoccurrences.
inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix
Refreshed SALVIAS
mappings/VegCore-VegBIEN.csv, inputs/*/*/map.csv: Applied term renamings from the new dynamically generated Veg+-VegCore.csv, which reflects the current state of the data dictionary. (Permanently switching to the new Veg+-VegCore.csv will be a separate change.) Updates to VegCore term names that have occurred since the data dictionary was created are now able to take effect, which involves remapping and inferring units on several fields.
inputs/SALVIAS/*/test.xml.ref: Restored SALVIAS* inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
mappings/VegCore-VegBIEN.csv: Removed TNRS input taxonlabels meant to cross-link to taxonlabels added by the TNRS import, because TNRS taxondeterminations are now created instead
db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
inputs/SALVIAS/stems/map.csv: Remapped stem_dbh from diameterBreastHeight_m to diameterBreastHeight_cm, assuming units based on the units for plotObservations.intercept_cm, which measures the same dimension
mappings/VegCore-VegBIEN.csv: Remapped tag to new stemobservation.tag
mappings/VegCore-VegBIEN.csv: Removed no longer used previousTag and the complex mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. tag: Removed iscurrent=true because there is now only one tag field.
inputs/SALVIAS/*/map.csv: Remapped all versions of stem and tree tags to tag, with the second tag superceding the first, to avoid the complex VegCore-VegBIEN mapping logic that attempts to place both tags in VegBIEN in the correct order but does not work for column-based import. inputs/SALVIAS-CSV/Organism/map.csv: stem and tree tags: Made the stem tag supercede the tree tag instead of vice versa, to have as specific of a tag as possible.
schemas/vegbien.sql: Added units suffix to all core VegBIEN fields that have units. It is the responsibility of the mappings to ensure that all units are properly translated.
mappings/VegCore-VegBIEN.csv: Added /_simplifyPath:[next=parent_id]/path to root so the returned subplot location will be its parent location if there is no subplot name or ID (indicating that that particular plot did not have subplots). Note that this also causes the parent_id forwarding effect to occur for all other tables containing parent_id, which will help prevent similar issues with subplot events, etc. This will hopefully fix the SALVIAS.plotObservations bug where some organisms did not have a subplot #, causing the subplot location to become NULL and causing the corresponding locationevent rows not to match the locationevent_unique_within_location index filter condition (which requires a parent_id), which caused multiple output table pkeys to be returned for those rows, violating the locationevent_pkeys temp table's primary key.
inputs/SALVIAS/: Switched to using the DB export's staging tables instead of the exported CSVs
inputs/: Renamed subfolders to VegCSV names, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-to-VegCSV-names>
inputs/SALVIAS/2.stems/map.csv: Mapped stem_id
inputs: Move src subdir into main dir, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-src-subdir-into-main-dir>
inputs: Moved test outputs into subfolders, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-test-outputs-into-subfolders>
inputs: Renamed stems table to 2.stems so import order would be inherent in the dir name, using steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Rename-subfolders-with-import-order>
mappings/DwC2-VegBIEN.specimens.csv, VegCSV-VegBIEN.specimens.csv: Split occurrenceID into occurrenceID and individualID, where individualID refers to the plant in plots data and occurrenceID refers to the specimen in specimens data. This prevents plant sourceaccessioncodes from being mapped to the specimenreplicate, which was messing up stems mappings for the parent plantobservation. It also avoids mapping the specimenreplicate sourceaccessioncode to additional tables where it isn't needed. (Note that occurrenceID is needed for location to ensure that each specimen gets its own location to make locationdeterminations on. Everything else is directly or indirectly scoped by location when its own sourceaccessioncode isn't specified.)
mappings/VegCSV-VegBIEN.specimens.csv: occurrenceID: Mapped to specimenreplicate.sourceaccessioncode for mergability with DwC
mappings/DwC2-VegBIEN.specimens.csv, DwC1-DwC2.specimens.csv: Split eventDate into eventDate and dateCollected, where eventDate refers only to the date of the sampling event, but dateCollected also refers to the date the particular specimen was collected. (This distinction is important in merging with VegCSV, because in plots data, these two fields are distinct.) Remapped datasources with dateCollected-related fields to new dateCollected.
mappings/VegCSV-VegBIEN.specimens.csv: individualCount: Disambiguated alternate meaning as stem count by changing stem count fields to map to new stemCount term, which maps to plantobservation.stemcount
mappings/VegCSV-VegBIEN.specimens.csv: height: Removed mapping to plantobservation.overallheight, since the height is a stem field rather than a plant field. Note that a height in the organisms table will be mapped to the height in a single stemobservation for that plant, with NULL sourceaccessioncode and authorstemcode. Note also that this change is possible because no mapped datasource yet provides a valid overallheight with multiple stems or that differs from its single stem's height. (Although SALVIAS sometimes provides both a stem height and an organism height, that height is always either the same, or the organism height is invalid. See <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/SALVIAS_issues#Some-organisms-have-one-stem-but-different-heights-in-the-organisms-and-stems-tables>.)
plots inputs: Remapped all VegX via maps to VegCSV. See steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegX-%3EVegCSV>.
mappings/VegX-VegBIEN.stems.csv: Reversed XPaths so that they start with location instead of plantobservation
mappings/VegX-VegBIEN.stems.csv: Expanded {} expressions using expand_braces, so that each distinct output for the same input is on its own line, improving readability. This will also help enable search-and-replace reversing of XPaths for the re-rooting to location.
VegBIEN: Reversed aggregateoccurrence<->plantobservation relationship to point from plantobservation->aggregateoccurrence, so plantobservation could be scoped by aggregateoccurrence in the same way as all other core tables are scoped by their parent tables. This reversed direction was an anomaly due to the need to have a trigger auto-set aggregateoccurrence.count to 1 when there was an associated plantobservation. This was most easily accomplished on the aggregateoccurrence table itself, but required the reversed relationship. The trigger has now been reimplemented on plantobservation, which externally updates aggregateoccurrence.count.
mappings/VegX-VegBIEN.stems.csv: plantobservation: sourceaccessioncode, authorplantcode: Removed no longer needed mapping to specimenreplicate.sourceaccessioncode, since specimenreplicate for plots data is now identified by its plantobservation fkey, without needing its own sourceaccessioncode
bin/map: Don't create unneeded /_ignore/inLabel element containing the datasource name because sql_io.put_table() now autopopulates the datasource_id
mappings/DwC2-VegBIEN.specimens.csv, VegX-VegBIEN.stems.csv: Removed all manual mappings to datasource_id now that datasource_id is auto-populated, both on the VegBIEN output side and the DwC/VegX input side. This should greatly simplify many of the mappings!
input.Makefile: Testing: Renamed import.*.out tests to end in .xml because they now contain XML import trees for validation, and this extension turns on XML syntax highlighting in a text editor
bin/map: out_is_db: Output the put template to stdout so it will be validated in the automated testing
mappings/VegX-VegBIEN.stems.csv: Indirect voucher mappings: Removed no longer needed ":[*_id/taxonoccurrence]" because a specimenreplicate is a taxonoccurrence, so it doesn't need to have an empty taxonoccurrence
bin/map: If outputting to a DB, also create output XML elements for NULL input values. This will help with the transition to using the same XML tree for all rows.
mappings: Build VegX-VegBIEN.organisms.csv from VegX-VegBIEN.stems.csv instead of vice versa. This entails switching the roots around so stem points to organism instead of the other way around, which is a complex operation. Re-rooted VegX-VegBIEN.organisms.csv at /plantobservation instead of /taxonoccurrence to avoid traveling up the hierarchy to taxonoccurrence and back down again to plantobservation, etc. as would otherwise have been the case.
input.Makefile: Run separate tests for each map spreadsheet (input table) rather than all tables at once. This will make it possible to test CSV inputs, which have one CSV per table.