NYBG-DwC maps: Map to input fields with XML func appended whenever possible (DwC1->DwC2 translation is done by DwC-VegBIEN.specimens.csv)
vegbien.sql: Renamed methodtaxonclass.description to methodtaxonclass.taxonclass and changed it to a closed list (enum taxonclass). method.description can still be used for freeform taxonclass inclusions/exclusions.
DwC1-DwC2.specimens.csv: Removed no longer needed /_alt/2 XML func from date mappings (you will only ever map either the full date or the year/month/day)
DwC mappings: Moved DwC1's CoordinatePrecision /_noCV/value XML func suffix to DwC2-VegBIEN.specimens.csv
mappings: Removed mappings for XML func suffixes of a path because join now creates them automatically using a heuristic
join: Added heuristic search for a match on a parent path, so that every XML func suffix of a path doesn't need its own mapping
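The parent-path heuristic described in the join entry above could look roughly like the following. This is a minimal sketch, not join's actual code: the function name lookup_with_parent_fallback, the dict-based map, and the output path in the example are all assumptions made for illustration.

```python
def lookup_with_parent_fallback(mapping, path):
    """Return the mapped value for path, or fall back to the nearest mapped parent.

    Hypothetical helper: whatever suffix had to be stripped to reach a mapped
    parent is re-appended to the parent's mapped value. Returns None if neither
    the path nor any of its parents is mapped.
    """
    suffix = ''
    while True:
        if path in mapping:
            return mapping[path] + suffix
        parent, sep, last = path.rpartition('/')
        if not sep:  # no '/' left, so there are no more parents to try
            return None
        suffix = sep + last + suffix  # remember the stripped component
        path = parent


# Only the base path is mapped; the XML func suffix is carried over.
mapping = {'CoordinatePrecision': 'coordinatePrecision'}
print(lookup_with_parent_fallback(mapping, 'CoordinatePrecision/_noCV/value'))
```

With a lookup like this, a map only needs one entry per base path, and any XML func suffix on the input side is transferred to the output side unchanged.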
Regenerated vegbien.ERD exports
vegbien.sql: Added method.pointsperline. Rearranged ERD after removing role fkeys.
filter_ERD.csv: Remove role fkeys
vegbien.sql: aggregateoccurrence: Added linecover
vegbien.sql: methodtaxonclass: Added description comment with list of values (which may become a closed list)
vegbien.sql: Changed lengthunits to m in all comments
vegbien.sql: method: Added subplotspacing and subplotmethod_id
vegbien.sql: method: Removed lengthunits and instead require all length- or area-related measurements throughout VegBIEN to be converted to SI base units, e.g. cm -> m, ha -> m^2. Adjusted ERD to avoid some densely packed lines.
vegbien.sql: methodtaxonclass: Added description field for taxon classes that don't fit well into a plantconcept. Made at least one of plantconcept_id or description required. Added unique constraint.
SALVIAS verifications: Use count(DISTINCT) instead of nested SELECT DISTINCT
VegBIEN verifications: Select only the records for the datasource being verified
SALVIAS verifications: Fixed to exclude subplots from locations/location events and uniqify locations based on coords
inputs/SALVIAS/verify.sql: Updated for schema changes
vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD. (I think this will need to be manually re-marked whenever either of those tables is updated.)
vegbien.sql: Removed methodgrowthform and growthform, since growth forms can be accommodated by plantconcept in much the same way as higher-order taxonomic ranks
vegbien.sql: methodgrowthform, methodtaxonclass: Removed "included" default value so it's always obvious whether the author intended the classes to be inclusions or exclusions
vegbien.sql: aggregateoccurrence: Removed unneeded fields. Added aggregateoccurrence->coverindex fkey.
vegbien.sql: Added constraint to enforce 1:1 aggregateoccurrence:plantobservation relationship
vegbien.sql: Added plantname unique constraint
bin/map: Use new util.ListDict and util.WrapIter to simplify getting rows by column name instead of index, and to enable a row to be printed with its column names in error messages
util.py: Added WrapIter to wrap an iterator and ListDict to view a list as a dict
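As an illustration of the two helpers named in the util.py entry above, here is a minimal sketch written for Python 3. The constructor signatures and method bodies are assumptions based only on the one-line descriptions in this log, not the actual util.py code.

```python
class WrapIter:
    """Lazily apply wrap_func to each item of an underlying iterator (assumed design)."""

    def __init__(self, wrap_func, iterable):
        self.wrap_func = wrap_func
        self.iter_ = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the underlying iterator propagates unchanged
        return self.wrap_func(next(self.iter_))


class ListDict:
    """View a list as a dict keyed by a parallel list of names (assumed design)."""

    def __init__(self, list_, keys):
        self.list_ = list_
        self.keys_ = keys
        self.key_idxs = {key: idx for idx, key in enumerate(keys)}

    def __getitem__(self, key):
        return self.list_[self.key_idxs[key]]

    def __str__(self):
        # Show values alongside their names, e.g. for error messages
        return str(dict(zip(self.keys_, self.list_)))


col_names = ['plot', 'species']
rows = iter([['P1', 'Inga edulis'], ['P2', 'Cedrela odorata']])
named_rows = WrapIter(lambda row: ListDict(row, col_names), rows)
first = next(named_rows)
print(first['species'])
print(first)
```

Wrapping the row iterator this way lets the caller fetch values by column name and print a row together with its column names, which is the use described in the bin/map entry above.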
bin/map: Use new util.list_flip()
util.py: Added list_flip()
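A plausible reading of list_flip() is a function that inverts a list into a value-to-index dict. The sketch below assumes that behavior; the body and the example data are illustrative rather than taken from util.py.

```python
def list_flip(list_):
    """Map each value in list_ to its index (for duplicates, the last index wins)."""
    return dict((value, idx) for idx, value in enumerate(list_))


# Look up a column's position by name instead of hard-coding the index.
header = ['family', 'genus', 'species']
col_idxs = list_flip(header)
row = ['Fabaceae', 'Inga', 'edulis']
print(row[col_idxs['genus']])
```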
env_password: Fixed to set the environment variable in the calling shell. Do this by cc-ing the tty only on messages before the "Enter password" prompt, because the redirect creates a subshell, which causes the env var to be set only within that subshell.
inputs/NYBG-CSV/maps/DwC.specimens.csv: Removed mappings that are already present in mappings/DwC1-DwC2.specimens.csv. This map now contains only the mappings where NYBG-CSV differs from standard DwC1.
inputs/NYBG/maps/DwC.specimens.csv: Removed mappings that are already present in mappings/DwC1-DwC2.specimens.csv. This map now contains only the mappings where NYBG differs from standard DwC1.
Remove accidentally-committed temp file inputs/NYBG/DwC.specimens2.csv
mappings/Makefile: Generate DwC.self.specimens.csv from DwC-VegBIEN.specimens.csv for use in creating full via maps for inputs
input.Makefile: Generate full via maps from input via maps by appending mappings from the via format to itself when available
inputs/NYBG/maps/DwC.specimens.csv: Changed label to "NYBG-DwC" to take advantage of automatic filling in of DwC mappings not specified in the NYBG map
subtract: Support custom column numbers to compare on (instead of just input col). Added ignore option to continue even if input columns don't match.
bin/map: DB inputs: Get all rows in one query (hopefully a significant optimization). Allow maps to contain entries for columns that are not in the DB table.
sql.py: select(): Select all fields if fields == None. Replaced col(cur, idx) with col_names(cur) because an iterator is easier to use than getting by index.
bin/map: Fixed bug in previous implementation of allowing maps for CSV inputs to contain entries for columns that are not in the CSV file
bin/map: Allow maps for CSV inputs to contain entries for columns that are not in the CSV file
Use new sort_map instead of manually specifying the sort order
Added sort_map to sort a map spreadsheet in the standard order
Removed no longer needed join_passthru, because join_union_sort now serves its purpose
Don't generate mappings/for_review/DwC-VegBIEN.specimens.csv because it's a derived map with lots of duplicated mappings for the various DwC versions
mappings/Makefile: Generate DwC-VegBIEN.specimens.csv directly from DwC1-DwC2 and DwC2-VegBIEN mappings by using join_union_sort with header_num=1, rather than via intermediate DwC1-VegBIEN.specimens.csv
union: Added header_num option to select which map's header to use as the output header
Rename join_sort to join_union_sort and have it run union in ignore mode. This will automatically append the joined map when the input map is a derivative of the joined map, such as for NYBG-DwC.
union: Pass through map 0, so that if ignore is set, the input map will still be output. Allow either map's input label to contain the other's input label to enable e.g. appending mappings for an older input version to those for a newer input version.
DwC1-DwC2 mapping: Changed input label to DwC1, which is allowed by the now relaxed label constraints imposed by union
union: Check if two maps can be combined based on whether map 0 column 0 label contains map 1 column 0 label instead of being equal. This allows map 0's input 0 root to contain the datasource name as well as a format that allows it to be combined with a more general map. Added ignore flag to not print an error if column labels don't match.
bin/map: Support optional data format tag in map spreadsheet labels, used by union to check if two maps can be combined
mappings: Added DwC1-DwC2.specimens.csv to core maps so it gets cleaned up
Only generate for_review mappings of core maps and end products
Generate DwC-VegBIEN mapping as union of DwC1 and DwC2 mappings
NYBG DB mapping: Removed IdentifiedDate and CollectedDate mappings because they are generated from the year/month/day
Added mappings/for_review/DwC1-VegBIEN.specimens.csv
Added DwC1-DwC mapping. Generate DwC1-VegBIEN mapping automatically.
vegbien.sql: Renamed _keys unique constraints/unique indexes to _unique to better reflect their purpose
vegbien.sql: Added method.diameterheight to store DBH height
VegBIEN: Moved plantstatus.plantlevel to plantname.rank because the taxonomic rank is a property of the name itself
PostgreSQL-MySQL.csv: Fixed custom types translation to match shorter type names
vegbien.sql: Added plantstatus unique constraint
DwC-VegBIEN mapping: Map datasource name via DwC institutionCode
vegbien.ERD.mwb: Lined up logo and legend with other ERD elements
vegbien.sql: Renamed methodgrowthform.growthformmethod_id to submethod_id. Added methodtaxonclass.submethod_id (similar to methodgrowthform.submethod_id).
vegbien.sql: Added methodgrowthform.growthformmethod_id for specifying a method used by just the growthform
vegbien.ERD.mwb: Rearranged legend to more closely match layout of ERD
vegbien.sql: Reordered plantstatus fields to put the most important fields at the top, which will be visible in the ERD
vegbien.sql: Replaced method.taxonclassincluded,taxonclassexcluded with new many:many methodtaxonclass table. Added methodgrowthform, growthform tables to do the same thing as methodtaxonclass for growth forms.
vegbien.sql: method: Added comment on reference_id
VegBIEN: Moved plotmethod fields to method because they can also apply to strata. Removed no longer used plotmethod table.
input.Makefile: input DB creation: Removed "IF NOT EXISTS" because that check is handled by $(dbExists)
input.Makefile: Don't try to recreate an input DB if it already exists
Added UArizona DB input
Renamed UArizona to UArizona-CSV because there is also a DB input in bien2_staging.ariz_raw on nimoy
Added UArizona input
env_password: Fixed bug where exit command would not cause it to exit, because pipefail shell option was not set. Moved automatic exiting of the calling script into env_password itself.
map: Exit if password not set
env_password: cc stderr if it's a log file
env_password: Print all messages to /dev/tty so the user sees them even if stderr is redirected to a log file. Exit if password not already set, because e.g. scripts run in the background will not be able to prompt for it.
input.Makefile: Don't have make import call verify, because the user often runs import as a test and will not want the output cluttered with verification information. Also, the full imports for which this was intended are often run asynchronously, so the user would not see the output anyway.
input.Makefile: Don't abort on verification errors, which are expected during development
SALVIAS tests: Fixed invalid accepted test outputs due to not running `make empty_db` before running tests when using the no-redo optimization shortcut
SALVIAS mappings: Fixed plot key mappings to map the correct values to subplot and parent plot
vegbien.sql: locationevent: Added unique constraint for subplots based on subplot location
SALVIAS-db VegX mapping: Map subplots correctly the way SALVIAS-CSV does
SALVIAS verification: Updated for schema changes
input.Makefile: Fixed syntax error in verify %.ref target (outdated variable name)
input.Makefile: Halt psql commands on first error
vegbien.sql: Removed location.authorlocationcode because it's now stored in locationevent as an author-specific setting
vegbien.sql: locationevent: Redid unique constraints to avoid applying authorlocationcode-only duplicate elimination to subplots
SALVIAS mappings: Map SiteCode/plot_code to locationevent.authorlocationcode because locationevent is now the place to store author-specific plot information
SALVIAS mappings: Fixed PlotID mapping to go to locationevent.sourceaccessioncode
VegBIEN: Renamed locationevent.authoreventcode to authorlocationcode to reflect that datasources usually use an author-defined code for a plot rather than a plot event