VegBIEN: Renamed specimen to specimenreplicate
Remerged ERD DDL into ERD
Redoing commit that linked aggregateoccurrence forward to individualplant, allowing many taxonoccurrences (e.g. one for each specimen) to point to the same plant (e.g. the one those specimens came from)
Added NYBG input
input.Makefile: Run tests with verbose output
bin/map: Fixed bug where verbose/debug flags were ignored and messages were always printed.
bin/map: Added verbose and debug options. Added initial debug info.
xml_dom.py: Added is_simple() to determine whether every child recursively has no more than one child. Used is_simple() to print condensed XML when simple nodes are converted to a string.
vegbien.sql: Enforce 1:1 relationship between aggregateoccurrence<->individualplant and taxonoccurrence<->specimen
vegbien.sql: Changed individualplant UNIQUE constraint to enforce 1:1 relationship between aggregateoccurrence and individualplant
Undoing previous commit since it would prevent a plant from being tied to a data source, because the aggregateoccurrence pointer goes in the wrong direction
vegbien.sql: Added aggregateoccurrence.individualplant_id to make a 1:1 relationship between aggregateoccurrence and individualplant
input.Makefile: Generate VegBIEN.2-step.xml correctly from VegX.xml, by removing DB config env vars passed to map for that test case. Note that this causes the VegBIEN.2-step.xml test to fail, because the 2-step mapping does not yet match the 1-step mapping.
input.Makefile: Don't need to filter test output since stderr now goes to the screen
input.Makefile: Don't save *.err outputs for each test because this information is printed to the screen
input.Makefile: Send echoed diff command to stdout of the make process (set -x echoes it to stderr)
input.Makefile: Write test stderr to .err file instead of test output, and tee it to stdout of the make process
vegbien.sql: Updated name of UNIQUE constraint for specimen collectionnumber. Regenerated vegbien.ERD exports.
input.Makefile: Don't print "accept test" message when user aborted a test with Ctrl+C
inputs/SALVIAS/test: Accepted test outputs
input.Makefile: Also print message for accepting test output when diff fails
bin/map: Print a message when a database is successfully connected to
sql.py: Don't enclose PostgreSQL names in quotes because this disables case-insensitivity
sql.py: Use esc_name() to escape fields in SELECT statements
sql.py: Added esc_name() to escape identifiers like column names
vegbien.sql: Added comments to specimen.collectioncode_dwc and collectionnumber to differentiate them
vegbien.sql: Renamed authorspecimencode to collectionnumber to match its name in source data
input.Makefile: Use pipefail to cause a test to fail even when the output is filtered by grep. Print message for failing tests with command to run to accept the new test output.
VegX-VegBIEN mapping: Map additional taxondetermination.determinationdate input formats straight through
test/input/SALVIAS_db.sh: Updated DB name
Regenerated vegbien.ERD exports
input.Makefile: Added documentation for why import errors for one input do not abort the import process for all inputs
input.Makefile: Determine DB name from input directory name, rather than DB file name
input.Makefile: Added documentation for accepting a test output
mappings/Makefile: Don't delete DwC-VegBIEN.specimens.csv in clean
VegBIEN: Renamed taxondetermination.*determination to is*
inputs/SALVIAS/test: Ignore test outputs
input.Makefile: Added test that generates VegBIEN.2-step.xml by mapping via a VegX.xml
input.Makefile: Added test that generates VegX.xml
input.Makefile: Added test that generates VegBIEN.xml
input.Makefile: Factored test/import.out out of DB section
input.Makefile: Renamed test/import.ref to import.out.ref. Changed syntax for accepting a test output to work with all types of test outputs.
Makefiles: Recurse into outermost subdir rather than bypassing it and going directly to innermost subdir
input.Makefile: Deal with inputs without a DB file, tests, verifications, etc.
inputs/Makefile: Don't use subdir makefiles because they are no longer needed
input.Makefile: Detect DB engine automatically from SQL file available in src subdir
input.Makefile: Factored as much as possible out of section for each DB engine
input.Makefile: Moved tests into test subdir
Added initial DwC-VegBIEN mappings spreadsheet with DwC terms
inputs/SALVIAS/verify: Updated to use new names for renamed tables
vegbien.sql: Removed no longer needed specimen.collector_id
VegX-VegBIEN mapping: Map collector name to new verbatimcollectorname field
vegbien.sql: Removed specimen.collectornumber_dwc and replaced it with verbatimcollectorname to reflect that the collectornumber_dwc is actually an ID of the specimen, and the collector's name is what we want to store
mappings/Makefile: Run simplify_xpath on VegX-VegBIEN.organisms.csv
simplify_xpath: Be case sensitive to handle VegX correctly
VegX-VegBIEN mapping: Avoid using a dummy taxondetermination with role=collector
VegX-VegBIEN mapping: Map stem count to new stemcount field
VegX-VegBIEN mapping: Take advantage of aggregateoccurrence.count being optional
vegbien.sql: Made aggregateoccurrence.count optional to handle individuals data (for which count should be dynamically determined from # individual plants inside the aggregateoccurrence)
NYBG-VegBIEN mapping: Don't map dummy values to locationcode, etc. (e.g. in specimens data) because these tables are no longer required
vegbien.sql: Made several pointers to parent elements optional to deal with specimens data that might not have a location, etc.
vegbien.sql: Added taxondetermination UNIQUE constraint
VegX-VegBIEN mapping: Took advantage of location.confidentialitystatus being optional
VegX-VegBIEN mapping: Took advantage of userdefined.userdefinedtype being optional
vegbien.sql: Gave userdefined.userdefinedtype a default value
VegX-VegBIEN mappings: Took advantage of plantconcept.reference_id becoming optional
vegbien.sql: Made plantconcept.reference_id optional. Merge plantconcepts with no reference_id when eliminating duplicates.
PostgreSQL-MySQL.csv: Deal with all non-NOT NULL timestamp fields
vegbien.sql: Removed confusing plantconcept.plantname field since we are using plantname.plantname instead
VegBIEN: Renamed aux_role to role
VegX-VegBIEN mappings: Took advantage of several fields becoming optional
vegbien.sql: taxonbinmethod points to stratumtype instead of stratum because stratumtype is a method table, but stratum is a measurements table. stratum does not point directly to stratummethod because it points to it via stratumtype.
vegbien.sql: Made taxondetermination.determinationdate optional because some determinations might not have a date
vegbien.sql: Added specimen.authorspecimencode
Adjusted vegbien.ERD.mwb
VegBIEN: Renamed sourceaccessionnumber to sourceaccessioncode to show that they are the data source's analog of accessioncode. Added sourceaccessioncode to all applicable tables because this is the database pkey, which is distinct from any author*code applied by the collector.
vegbien.sql: Changed taxonbinmethod_keys to UNIQUE INDEX to take advantage of COALESCE for dealing with NULL values
vegbien.sql: Renamed taxonbin to taxonbinmethod to reflect that it does not contain actual organisms (those go in aggregateoccurrence), but rather defines a method of aggregating organisms
vegbien.sql: Removed taxonbin.count because that belongs in aggregateoccurrence and taxonbin is more similar to a sampling method. Added taxonbin UNIQUE constraint.
vegbien.sql: Do location duplicate elimination independently on code or lat/long, allowing duplicate entries with NULLs to exist when a location is incompletely specified
vegbien.sql: Require location to have either an authorlocationcode or a lat/long. Distinguish between regular and subplots in UNIQUE constraint.
vegbien.sql: Renamed location.latitude and longitude to publiclatitude, publiclongitude to reflect that they are not the actual lat/long. Switched to requiring reallatitude/reallongitude.
Added inputs/TurboVeg
vegbien.ERD.mwb: Deal with MySQL assuming that a timestamp field is NOT NULL
PostgreSQL-MySQL.csv: Deal with MySQL assuming that a timestamp field is NOT NULL
vegbien.sql: Made specimen.taxonoccurrence_id required
vegbien.sql: Made several fields optional, adding defaults where needed
PostgreSQL-MySQL.csv: Deal with PostgreSQL-style :: casts
NYBG mappings: Add mapping for CollectorNumber to specimen.collectornumber_dwc
vegbien.sql: Added specimen.collectornumber_dwc
VegBIEN: Renamed sourceid to author*code
mappings: Map ScientificNameAuthor to plantconcept with rank author