xml_dom.py: Fixed bug in parent() where incorrect variable name was used
VegX-VegBIEN mapping: Use the input data source's label (e.g. SALVIAS) everywhere a reference is needed
bin/map: Store the input data source's label (e.g. SALVIAS) in the output XML tree for use by references in the mappings
xpath.py: get(): Fixed bug where it would try to create a node named "." or ".." if "." or ".." didn't have matching attributes. Now it will just reuse the current or parent node, but create any needed attrs if create is True.
util.py: Added list_eq_is() to compare two lists using is
xpath.py: Don't allow rooted attributes (doesn't make sense), in case someone tries to do elem[/rooted_attr]
bin/map: Moved root.clear() into separate function prep_root() that can be called whenever needed
xpath.py: Added get() support for references (different from pointers) to dynamically set the value of an attribute
util.py: Added list_get()
util.py: Added is_list()
bin/map: Use var doc0_root for quick reference to doc0's root
xpath.py: get(): Go to root when empty element is encountered at the beginning of an XPath. Added allow_rooted parameter to turn off this functionality when XPaths with a leading slash should not be considered rooted.
xpath.py: Don't consider a path starting with "." to be rooted. Do this by not automatically translating an empty path name to ".".
xpath.py: Added is_rooted()
xpath.py: Added elem_is_empty()
xpath.py: Added documentation labels to each section
xpath.py: Added support for getting the parent node when encountering ".."
xml_dom.py: Added parent() to get parent node without recursing past the root node to the document object. Documented that NodeParentIter incorporates this sanity check.
xpath.py: get(): Renamed parent to root to better reflect that it's the starting point for the search. Calling it parent will later be confusing when we want to get the parent node using "..".
xpath.py: Added parser support for attribute values that are references to another part of the XML tree
xml_func.py: Fixed module description comment to reflect that not all XML funcs generate text
xml_func.py: Refactored to add funcs to the module funcs variable as they are defined. Renamed defined functions to the name of the corresponding XML function.
xml_func.py: Added _ignore func to "comment out" an XML subtree
input.Makefile: Fixed error message when no DB file found so that it doesn't incorrectly imply that PostgreSQL inputs are supported
input.Makefile: Don't run tests in verbose mode because the run time stats, etc. are not relevant
bin/map: Only print error/run time stats in verbose mode. input.Makefile: Run import in verbose mode so that error/run time stats are still printed.
Moved value-to-string conversion functions from util.py to new module format.py
exc.py, profiling.py: Use util.int2str() to print # iters with thousands separators
util.py: Added int2str()
bin/map: Document that the exit status is the # of errors in the import, up to the maximum exit status
exc.py: Generalize ExTracker to not just print the # of errors at exit. Instead, provide an exit() method that the ExTracker creator can call at exit to set the exit status to the # of errors. This fixes the Python bug where a benign error message was printed if SystemExit was raised in an atexit function.
bin/map: Set ExPercentTracker's iter_text. Start ExPercentTracker after input processing, because errors in command line options should just end the program and don't need to be tracked.
exc.py: ExPercentTracker: Added ability to set custom iter_text, similar to ItersProfiler
bin/map: Use profiling.ItersProfiler. Refactored input row count calculation to have each function aggregate and return the row count, and then display the row count and statistics that depend on it at the end of the program.
Added profiling.py to time operations and provide the user with statistical information
util.py: Added basic to_si() to add SI prefix to value
util.py: Added format_str() to use locale-specific formatting settings, including thousands separator. Use it in to_percent().
bin/map: Use new ExPercentTracker to print error rate (% of # rows) when program exits
exc.py: Added ExPercentTracker to track errors as % of iterations
util.py: Added to_percent()
exc.py: print_ex(): Declare emph param as a keyword param instead of popping it from **format
inputs/SALVIAS/maps/VegX.organisms.csv: Mapped OrigSpecies and OrigGenus combined to new plantlevel Binomial
xpath.py: Fixed bug where value of XPath (used for copying to other branches) is retrieved after first XPath element is popped rather than before, which can sometimes leave an empty XPath for value() to run on
mappings/DwC-VegBIEN.specimens.csv: Fixed bien.vegetation mapping to point to commconcept->commname. Fixed bien.substrate mapping to point to locationevent.landscapenarrative.
inputs/NYBG/maps/DwC.specimens.csv: Mapped CoordinatePrecision using _noCV
xml_func.py: Added _noCV func to check that non-ratio-scale data does not contain CV values
mappings/DwC-VegBIEN.specimens.csv: Fixed locality fields mapping to go to location.locationnarrative
input.Makefile: For all input types, including DB, import each table in a separate map invocation
xml_func.py: _range: Treat a None "from" or "to" value as an unknown (a la SQL NULL) and return None instead of raising a SyntaxException
xml_dom.py: NodeTextEntryIter: Convert empty entries (including entries containing error comments) to None
xml_dom.py: replace(): Added support for new node that's None (deletes existing node)
xml_func.py: Put SyntaxException's cause on same line as error message so that the whole error is treated as distinct by error_stats
Added errors_filter_before and errors_filter_after to prepare `map` error messages for easy filtering and then restore line breaks
error_stats: Fixed to work on Mac
error_stats: Simplified to use uniq --count option
input.Makefile: Print error message if no input file found (for file input type). This fixes a bug where map would just take input from stdin when no input file redirect or input DB env vars were specified.
map: Map standard DB names to original DB names on nimoy
Regenerated vegbien.ERD exports
vegbien.sql: Added methodtrait and methodtraitname tables
PostgreSQL-MySQL.csv: Handle array types
vegbien.ERD.mwb: Recolored plant tables to all have the same color, distinct from the occurrence color
mappings/VegX-VegBIEN.organisms.csv: Added mappings for SALVIAS fields with no join mapping. This fixes the last of the "no join mapping" errors.
input.Makefile: svn_props: Set svn:ignore on maps subdirs
inputs/SALVIAS-CSV/maps/VegX.plots.csv: Fixed mappings without a join mapping in VegX-VegBIEN.*.csv
VegX mappings: Gentry DBH mapping: Use VegX's attribute and method tables
mappings/VegX-VegBIEN.organisms.csv: Removed no longer used mapping to taxondetermination.determinationdate. This also prevents ever creating a taxondetermination without a plantconcept.
bin/map: Added redo option to control whether the database is emptied before inserting new data. Can be used to turn off emptying the DB in test mode, because this is often slow and is not needed if you are running tests on an empty testing database.
opts.py: env_flag(): Added support for default value if unset
bin/map: Use env_flag()'s new env_names usage support to print flags usage
opts.py: Added env_names usage support to env_flag()
mappings/VegX-VegBIEN.organisms.csv: Removed no longer needed mapping for taxonDetermination/note
inputs/SALVIAS-CSV/maps/VegX.organisms.csv: Map cfaff to taxonConcept/fit, which maps to taxondetermination.taxonFit
inputs/SALVIAS/maps/VegX.organisms.csv: Map cfaff to taxonConcept/fit, which maps to taxondetermination.taxonFit
join: Print a warning if no join mapping found (in addition to adding this warning to the comments column)
Removed no longer needed inputs/NYBG/maps/VegX.organisms.csv because NYBG is now mapped via DwC
mappings/VegX-VegBIEN.organisms.csv: Removed mappings used only by NYBG, because NYBG now maps via DwC
Added ch_root_via to transform a map spreadsheet to use a different root, using a connecting root that links the input and output roots together
Added cols to select columns from a spreadsheet
util.py: Added list_subset()
ch_root: Fixed detection of unset env vars so that usage message is printed when any option is missing
opts.py: Call an error handler if an env var isn't set
util.py: Added noop() and and_(), function wrappers for statements
inputs/NYBG-CSV: Map via DwC
Added subtract to subtract map spreadsheets
ch_root: Ignore empty lines
Added intersect to intersect two map spreadsheets
union: Clarified overwrite order of inputs in description
Removed no longer needed mappings/review
mappings/Makefile: Regenerate for_review maps automatically when a map changes
mappings/review: Generalized to convert all mappings to VegBIEN, not just a specific listed set (which was out of date)
mappings/for_review/DwC-VegBIEN.specimens.csv: Regenerated
inputs/NYBG/maps/DwC.specimens.csv: Fixed CollectedDate mapping to use the _date XML func
DwC mappings: Mapped Substrate and Vegetation
DwC mappings: Mapped BoundingBox, footprintWKT to location.dsgpoly
DwC mappings: Mapped Notes and PlantFungusDescription to bien.specimenDescription, merged together
xml_func.py: Added _merge and _label XML funcs
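
Illustrative sketch for the util.py helpers named in the entries above (list_eq_is(), int2str(), to_percent()). The log gives only the names and one-line purposes, so the signatures, the digits parameter, and the fixed "," separator below are assumptions, not the actual util.py code. In particular, the log says format_str() uses locale-specific settings, which this sketch does not attempt.

```python
# Hypothetical sketches of helpers named in the log; signatures are assumed.

def list_eq_is(list0, list1):
    """Compare two lists element by element using `is` (identity), not `==`."""
    if len(list0) != len(list1):
        return False
    return all(a is b for a, b in zip(list0, list1))


def int2str(value):
    """Format an integer with thousands separators (fixed ",", not locale-aware)."""
    return format(value, ',d')


def to_percent(fraction, digits=1):
    """Format a fraction such as 0.125 as a percentage string.

    `digits` (number of decimal places) is an assumed parameter.
    """
    return '{:.{digits}%}'.format(fraction, digits=digits)


if __name__ == '__main__':
    a, b = [1], [1]  # equal by value, but two distinct objects
    print(list_eq_is([a], [a]))  # same object on both sides
    print(list_eq_is([a], [b]))  # equal values, different objects
    print(int2str(1234567))
    print(to_percent(0.125))
```

The identity comparison matters when the list elements are nodes or other objects that may compare equal by value while being distinct instances.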