mappings/Makefile: Fixed bug in rules for cleaning up core maps whenever they change, which had the target and prerequisite reversed
input.Makefile: nolog option defaults to on when test is on
input.Makefile: Fixed bug where no log file was being created, even when nolog was off
Replaced all type(...) == str with util.is_str(...) to properly treat Unicode objects as strings
xml_dom.py: minidom.Element.write_opening(): Use new Attr.__str__() method for printing attributes
bin/map: For XML inputs, use xpath.get()'s new multiple result support to iterate over elements matching the root, rather than just iterating over the first matching element's siblings. This fixes the broken 2-step tests for stems, which didn't filter by the root's attributes.
xpath.py: get(): Fixed bug where limit was not passed to recursive get() calls
xml_dom.py: by_tag_name(): Iterate forwards over children unless last_only optimization turned on. Added Attr.__str__() and Attr.__repr__() for debug-printing Attrs.
xpath.py: get(): Turn off last_only optimization when limit > 1
xpath.py: get(): Added full support for returning multiple matches
xpath.py: get(): Added basic structure for returning multiple matches. Added limit parameter to select one or many matches.
input.Makefile: test/VegBIEN.%.2-step.xml: Use the core map specific to the test's table instead of the main core map
bin/map: Print error if map root not found in XML input
mappings/: Removed mappings used by old tester
inputs/SALVIAS/maps/VegX.*.csv: Replaced symlinks with actual files
Removed old tester
Switched to using inputs/test as main test target
Added SALVIAS-CSV tests
Added NYBG-CSV tests
input.Makefile: Run separate tests for each map spreadsheet (input table) rather than all tables at once. This will make it possible to test CSV inputs, which have one CSV per table.
Added NYBG-CSV input
inputs/Makefile: Fixed forwarding of empty targets to subdirs
Regenerated vegbien.ERD exports
vegbien.sql: Added morphospecies table
vegbien.ERD.mwb: Fixed lines
vegbien.sql: Removed the taxonOccurrence:aggregateOccurrence 1:1 constraint
sql.py: truncate(): Use run_raw_query() instead of run_query() because truncate() does not use the recover functionality of run_query(). Also, in the profiling output, this separates the "normal" SQL statements (which use run_query()) from the "core" SQL statements (which use run_raw_query()).
vegbien.sql: Added indexes for each field in party used in duplicate elimination (for use by sql.put()'s DuplicateKeyException handler)
sql.py: run_raw_query(): In debug mode, print query after params have been substituted in
sql.py: Fixed index_cols() to handle UNIQUE indexes with expressions, whose column names are stored in a different format
sql.py: Print warning if SELECT statement missing a WHERE, LIMIT, or OFFSET clause. Changed bin/map DB input get-all-rows statement to pass start=0 to suppress this warning for that statement.
db_xml.py: Added start option to get() that passes through to sql.select()
sql.py: Added start option to select() to set the OFFSET
sql.py: If run_raw_query.debug flag is set, print each query executed (on a single line)
strings.py: Added one_line() function to make a string all on one line
strings.py: Renamed one_line() to remove_extra_newl() to better reflect what it does
bin/map: Don't print Done after an action in debug logging mode because it messes up newlines when more debugging info is printed right after it
input.Makefile: Added nolog option to disable creating a log file, e.g. for debugging runs
xml_dom.py: Remove extra newlines from single-line strings (bin/map doesn't need to do this itself anymore)
strings.py: Added is_multiline() and one_line() for removing extra newlines from single-line strings
bin/map: In debug mode, print input XPath's XML tree all on one line
sql.py: Switched try_insert() to use index_cols() instead of constraint_cols() for "duplicate key value violates unique constraint" errors because they can also be generated by UNIQUE indexes (and there is a UNIQUE index for every UNIQUE constraint)
sql.py: Added index_cols() to get cols used by an index (similar to constraint_cols())
vegbien.sql: Fixed duplicate elimination for party to use a UNIQUE index with COALESCE for nullable fields
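
A minimal sketch of why a UNIQUE index over COALESCE expressions is needed for nullable fields: in a plain UNIQUE index, NULLs compare as distinct, so two rows that differ only by NULLs are not treated as duplicates. The table layout, column names (givenname, surname), index name, and the try_insert() helper below are hypothetical, and the sketch uses Python's bundled SQLite (which also supports expression indexes) rather than the PostgreSQL schema in vegbien.sql.

```python
import sqlite3

# In-memory database standing in for the real schema (assumption).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE party (givenname TEXT, surname TEXT)')

# Map NULL to '' inside the index so that NULLs compare equal for
# duplicate elimination. Note this also treats NULL and '' as the same value.
conn.execute(
    "CREATE UNIQUE INDEX party_unique "
    "ON party (COALESCE(givenname, ''), COALESCE(surname, ''))"
)

def try_insert(givenname, surname):
    """Insert a row; return False if it duplicates an existing row."""
    try:
        conn.execute('INSERT INTO party VALUES (?, ?)', (givenname, surname))
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert(None, 'Smith')      # new row
duplicate = try_insert(None, 'Smith')  # rejected by the expression index
other = try_insert('Ann', 'Smith')     # different givenname, so accepted
```
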
sql.py: Fixed bug in try_insert() where DuplicateKeyException was passed only cols0 instead of cols array
Added get_errors to select just the error messages from `map` output
Added profile_stats to analyze a profiling statistics file
bin/map: Added profile_to option which turns on profiling to the specified file
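
A rough sketch of how a profile-to-file option and a stats analyzer could be built on the standard library's cProfile and pstats modules. The function names and signatures below are illustrative assumptions, not the actual bin/map or profile_stats code.

```python
import cProfile
import pstats

def profile_to(path, func, *args, **kwargs):
    """Run func under the profiler and write the raw statistics to path."""
    profiler = cProfile.Profile()
    try:
        return profiler.runcall(func, *args, **kwargs)
    finally:
        # Write the stats even if func raised, so failed runs can be analyzed.
        profiler.dump_stats(path)

def print_profile_stats(path, limit=10):
    """Print the most expensive calls from a statistics file."""
    stats = pstats.Stats(path)
    stats.strip_dirs().sort_stats('cumulative').print_stats(limit)
```

With these helpers, a run could be wrapped as profile_to('map.prof', main) and inspected afterwards with print_profile_stats('map.prof'); the file name here is only an example.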
bin/map: Added "if __name__ == '__main__': main()" idiom so file can be imported as well as run. This will be useful for profiling.
dates.py: Fixed strftime() to pad years and days with leading zeros as datetime.strftime() does
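
A minimal sketch of a strftime() replacement that pads fields itself instead of relying on the platform's strftime(). This is not the actual dates.py code: it assumes only the %Y, %m, and %d directives are needed and does not handle an escaped %% in the format string.

```python
import datetime

def strftime(format_, date):
    """Format date by substituting zero-padded fields by hand.

    Supports only %Y, %m, and %d (assumption for this sketch).
    """
    return (format_
        .replace('%Y', '%04d' % date.year)   # years padded to 4 digits
        .replace('%m', '%02d' % date.month)
        .replace('%d', '%02d' % date.day))   # days padded to 2 digits

early = strftime('%Y-%m-%d', datetime.date(1850, 2, 9))
leap = strftime('%d/%m/%Y', datetime.date(1904, 2, 29))
```

Because no field is passed to the underlying C library, the result does not depend on whether the platform accepts years before 1900.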
dates.py: Work around strftime() bug that can't deal with 2/29 on a leap year
xml_func.py: Added FormatException for SyntaxExceptions generated by strftime() (which are often Python bugs)
Added schemas/vegbank.revised.sql. Initial version has all "character varying" types replaced with text.
vegbien.sql: Replaced all "character varying" types with text, removing the length limits. Note that in PostgreSQL, text and "character varying" are stored the same way internally, so this does not affect performance or indexes.
xml_dom.py: Added documentation labels to each section
xml_dom.py: Fixed bug in NodeTextEntryIter where an entry containing an element instead of a text node would be returned as the whole entry, instead of the value of the entry
bin/map: Added support for starting import at a specific row. Refactored row-processing code with and without a map to use a common process_rows() function (with the previous process_rows() being renamed to map_rows()).
bin/map: Use new util.cast()
util.py: Added cast() to cast a value while passing None through
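
A cast that passes None through could look roughly like the following. The argument order is an assumption, since the log does not show the signature.

```python
def cast(type_, val):
    """Cast val to type_, returning None unchanged."""
    if val is None:
        return None
    return type_(val)

row_num = cast(int, '42')  # a value given as a string, e.g. on the command line
unset = cast(int, None)    # an omitted option stays None instead of raising
```
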
bin/map: Print row # of rows with errors
sql.py: Fixed error in pkey() where recover was not passed as a named parameter to run_query()
sql.py: Added documentation labels to each section
db_xml.py: Used new sql.py recover functionality
sql.py: Added ability to recover from database errors so you don't get the error "InternalError: current transaction is aborted, commands ignored until end of transaction block"
vegbien.sql: Removed taxonoccurrence.taxoninferencearea because it's duplicated in aggregateoccurrence.inferencearea
bin/map: Highlight the "input row" and "output row:" labels in error messages
xml_func.py: Highlight nodes that were commented out because of errors
exc.py: Print exceptions with the first line highlighted in red
term.py: Added emph() and error()
vegbien.ERD.mwb: Added note, notelink, and revision
vegbien.ERD.mwb: Added embargo to diagram
vegbien.ERD.mwb: Fixed lines. Added "Core subset" and "Other tables" labels.
xml_func.py: Changed _date func to use new dates.strftime(), which can handle years before 1900
Added dates.py to handle date/time manipulation, such as fixing Python's broken strftime() that can't handle years before 1900
Regenerated mappings/for_review/DwC-VegBIEN.specimens.csv
vegbien.ERD.mwb: Added reference and party tables
filter_ERD.csv: Remove fkeys to heavily-linked tables (reference, party)
Added to_do/milestones.doc
Renamed milestones.doc to timeline.doc
Added schemas/filter_ERD.csv and use it when generating vegbien.my.sql
vegbien.ERD.mwb: Added cover* to main diagram
vegbien.ERD.mwb: Started adding additional tables "below the fold" on the 2nd page
vegbien.ERD.mwb: Moved legend to top left to make room for more misc tables. Organized legend by location on diagram.
vegbien.ERD.mwb: Added soilobs table
vegbien.ERD.mwb: Added userdefined tables. Fixed lines.
vegbien.ERD.mwb: Changed location color to match VegBank ERD
vegbien.ERD.mwb: Added trait to diagram
vegbien.ERD.mwb: Added plantstatus to diagram. Added margins around diagram.
Added milestones.doc
DwC mappings: Fixed syntax of _date XML funcs to not wrap dates twice in a _date func
xml_func.py: Fixed bug in SyntaxException constructor where the cause was not passed to ExceptionWithCause
xml_dom.py: Override Node.__repr__ and Element.__repr__ to make sure self.toprettyxml() is used in all cases where a Node is converted to a string