join: Merge the column labels as well
maps.py: Eliminate duplicates when merging values in the same column
join: Moved mappings-specific merge functionality into maps.merge_mappings()
join: Use merge_rows() from new maps.py
Added new library maps.py for map spreadsheet manipulation
join: Merge comments of input map and join map
join: Report which input mappings are missing a mapping in the join map
inputs/NYBG/maps/VegX.organisms.csv: Added note that the primary key has NULL values in some rows
input.Makefile: Preserve as many intermediate files as possible (make likes to delete intermediates if they aren't marked as .PRECIOUS)
sort: Sort empty strings last so that inputs with no mapping go at the end of the map spreadsheet
VegBIEN-VegBank.csv: Updated for recent table renames
join: For input mappings with no match in the join map, include them in the output map with an empty mapping
input.Makefile: Generalized to handle mapping via any format, not just VegX
input.Makefile: Don't print message to accept output for failing 2-step tests, because they use another test's accepted output
input.Makefile: Don't abort tester if only 2-step test fails, as it's often finicky
xml_func.py: Raise xml_func.SyntaxException for ValueErrors generated by date.strftime() (e.g. year out of range due to poor Y2K support in some OS implementations of strftime)
xml_func.py: Raise xml_func.SyntaxException for ValueErrors generated by datetime.date() (e.g. month out of range)
vegbien.sql: Added project.reference_id to namespace project names by datasource
input.Makefile: Import all tables at once by default
bin/map: Print "Inserted ... new rows into database" message to stdout rather than stderr so it can be stored in the test case output as a validation check
Accepted initial test output for NYBG/test/import.out.ref
bin/map: Clean up datasource input values
strings.py: Added std_newl() to convert line endings and cleanup() to process strings with extra or nonstandard whitespace
PostgreSQL-MySQL.csv: Deal with custom types
vegbien.sql: Added aggregateoccurrence.occurrencestatus_dwc field
Regenerated vegbien.ERD exports
vegbien.ERD.mwb: Added commclass table to ERD
vegbien.sql: Removed direct pointer from location to namedplace because locationplace already has this relationship and we don't want to have an extra pointer just for duplicate elimination
vegbien.ERD.mwb: Added stratummethod to ERD
vegbien.sql: Removed locationevent.stratummethod_id because the stratummethod is a per-stratum (or technically, per-stratumtype) field
PostgreSQL-MySQL.csv: Remove CHECK constraints
PostgreSQL-MySQL.csv: Remove functions and triggers
vegbien.sql: Ensure that aggregateoccurrence.count == 1 when the aggregateoccurrence has a plantobservation. Use a trigger to do this automatically.
README.TXT: Added command for reimporting data
README.TXT: Added instructions to sync ERD with vegbien.sql schema. Organized commands into categories.
Added BIEN_logo.png
vegbien.ERD.mwb: Added color group legend
vegbien.ERD.mwb: Fixed lines
vegbien.ERD.mwb: Fixed lines and moved plant to its own color category
vegbien.ERD.mwb: Added colors to ERD
vegbien.ERD.mwb: Simplified diagram by removing column types
schemas/Makefile: Don't generate for_ERD DDLs because the ERD is now synced with the full schema
vegbien.ERD.mwb: Synced with whole schema
vegbien.sql: Reordered fields in tables truncated in the ERD so that all removed fields are at the end of the table
schemas/Makefile: Generate MySQL version of vegbien.sql as well as vegbien.for_ERD.sql for eventual use in syncing the ERD with the whole schema
PostgreSQL-MySQL.csv: Added translations for syntaxes used by pg_dump
repl: All regexps are now in multiline and ignore-case mode by default
vegbien.sql: Made planttag a child of plantobservation instead of plant, since tags change over time
vegbien.sql: Removed no longer used plantobservation.aggregateoccurrence_id
VegX-VegBIEN mapping: Link aggregateoccurrence to plantobservation via forward pointer rather than backward child-to-parent pointer
vegbien.sql: Made plantobservation.aggregateoccurrence_id optional because the link will soon go in the other direction
vegbien.sql: Removed taxonbinmethod table since its fields are now in aggregateoccurrence
vegbien.sql: Added taxonbinmethod fields to aggregateoccurrence
vegbien.sql: Added back aggregateoccurrence.stratum_id
vegbien.sql: Added stratum.area
vegbien.sql: Removed denormalized duplicate fields from stratum
vegbien.sql: Added plant and planttag tables
VegBIEN: Renamed stem to stemobservation
vegbien.sql: Removed specimenreplicate:taxonoccurrence 1:1 requirement
VegBIEN: Renamed individualplant to plantobservation
vegbien.sql: Updated table comments for specimenreplicate and specimen
vegbien.sql: Added specimen table to tie specimenreplicates together
VegBIEN: Renamed specimen to specimenreplicate
Remerged ERD DDL into ERD
Redoing commit that linked aggregateoccurrence forward to individualplant, allowing many taxonoccurrences (e.g. one for each specimen) to point to the same plant (e.g. the one those specimens came from)
Added NYBG input
input.Makefile: Run tests with verbose output
bin/map: Fixed bug where verbose/debug flags were ignored and messages were always printed
bin/map: Added verbose and debug options. Added initial debug info.
xml_dom.py: Added is_simple() to determine whether every child recursively has no more than one child. Used is_simple() to print condensed XML when simple nodes are converted to a string.
vegbien.sql: Enforce 1:1 relationship between aggregateoccurrence<->individualplant and taxonoccurrence<->specimen
vegbien.sql: Changed individualplant UNIQUE constraint to enforce 1:1 relationship between aggregateoccurrence and individualplant
Undoing previous commit since it would prevent a plant from being tied to a data source, because the aggregateoccurrence pointer goes in the wrong direction
vegbien.sql: Added aggregateoccurrence.individualplant_id to make a 1:1 relationship between aggregateoccurrence and individualplant
input.Makefile: Generate VegBIEN.2-step.xml correctly from VegX.xml, by removing DB config env vars passed to map for that test case. Note that this causes the VegBIEN.2-step.xml test to fail, because the 2-step mapping does not yet match the 1-step mapping.
input.Makefile: Don't need to filter test output since stderr now goes to the screen
input.Makefile: Don't save *.err outputs for each test because this information is printed to the screen
input.Makefile: Send echoed diff command to stdout of the make process (set -x echoes it to stderr)
input.Makefile: Write test stderr to .err file instead of test output, and tee it to stdout of the make process
vegbien.sql: Updated name of UNIQUE constraint for specimen collectionnumber. Regenerated vegbien.ERD exports.
input.Makefile: Don't print "accept test" message when user aborted a test with Ctrl+C
inputs/SALVIAS/test: Accepted test outputs
input.Makefile: Also print message for accepting test output when diff fails
bin/map: Print a message when a database connection succeeds
sql.py: Don't enclose PostgreSQL names in quotes because this disables case-insensitivity
sql.py: Use esc_name() to escape fields in SELECT statements
sql.py: Added esc_name() to escape identifiers like column names
vegbien.sql: Added comments to specimen.collectioncode_dwc and collectionnumber to differentiate them
vegbien.sql: Renamed authorspecimencode to collectionnumber to match its name in source data
input.Makefile: Use pipefail to cause a test to fail even when the output is filtered by grep. Print message for failing tests with command to run to accept the new test output.
VegX-VegBIEN mapping: Map additional taxondetermination.determinationdate input formats straight through
test/input/SALVIAS_db.sh: Updated DB name