VegBIEN mappings: Mapped datasource name to new project.datasource. Fixes project duplicate elimination.
vegbien.sql: Renamed project.reference_id to datasource_id and pointed it to party, to match locationevent, etc.
VegBIEN mappings: Mapped current lat/long to centerlat/long as well so location duplicate elimination will work properly
xpath.py: Added support for common subpath after split path's {}
sql.py: put(): When encountering a DuplicateKeyException, use dict_subset_right_join to fill in explicit NULL values for columns which don't have data. This causes the database to use the UNIQUE constraint's index to look up the record, instead of relying on individual column indexes for the columns that did have data, which may or may not be available.
util.py: Added DefaultDict to wrap collections.defaultdict with a simple value passed in the constructor, defaulting to None. Added dict_subset_right_join() to fill in None for subset keys that don't exist.
vegbien.sql: Added method and plotmethod UNIQUE indexes
vegbien.ERD.mwb: Removed embargo table from ERD because its functionality is provided in location.confidentialitystatus,confidentialityreason
Regenerated vegbien.ERD exports
vegbien.sql: Moved locationevent method fields to plotmethod and method. Added comments to method/plotmethod fields, as provided by Michael Lee.
VegX-VegBIEN mapping: Mapped locationevent.methodnarrative to new plotmethod table
VegX-VegBIEN mapping: Map sourceaccessioncode and voucher (catalognumber_dwc) to correct place. SALVIAS mappings: Map SourceVoucher as an alternative to coll_number.
vegbien.sql: Removed VegBank-internal tables (including user account tables) because they do not belong in the ecological database. Any web interface should store user account information, cached queries, etc. in a separate interface-specific database.
VegX-VegBIEN mapping: Mapped stem tags to new stemtag table
vegbien.sql: Renamed planttag to stemtag and made it a child of stemobservation. Removed trait table from ERD because it's not used for the purpose we want to use traits for.
vegbien.sql: Removed no longer used location.reference_id. Datasource scoping is now done on locationevent instead, so that locations can be shared across datasources that refer to the same plot or point.
VegX-VegBIEN mapping: Map datasource name (/_ignore/inLabel) to new locationevent.datasource instead of location.reference
vegbien.sql: Added locationevent.datasource_id
vegbien.sql: locationevent: Removed VegBank-internal interp_* fields
VegBIEN: Renamed specimenreplicate.reference_id to datasource_id and pointed it to party instead of reference, since party is better optimized for storing names
DwC mappings: Mapped datasource name to specimenreplicate.reference instead of location.reference
DwC mappings: Mapped specimen description via fieldNotes instead of custom bien.specimenDescription field
VegBIEN: Moved specimenreplicate.verbatimcollectorname to taxonoccurrence since it can also apply to aggregateoccurrences. Removed no longer needed taxonoccurrence fields which are now in taxondetermination.
SALVIAS mappings: Mapped habit to growthForm (user-defined field) instead of habit
DwC-VegBIEN mapping: Convert latitude/longitude values of exactly zero to NULL
xml_func.py: Added _nullIf
util.py: Fixed cast() to not cast a subclass to a superclass (which doesn't make sense in a dynamically-typed language). Added none_if().
util.py: Removed locale import since it's no longer used by util
NYBG-DwC mappings: Map Vegetation to habitat (merged with Habitat). DwC-VegBIEN mapping: Removed remaining mappings to plantobservation.
DwC-VegBIEN mapping: Added datasource name to location.reference using /_ignore/inLabel
profiling.py: Support Python before 2.7 by using new dates.total_seconds(). Also use dates.now() to ensure datetimes always have a timezone.
dates.py: Fixed timestamp() to deal with microseconds correctly by adding them after time.mktime()
dates.py: Deal properly with different timezones by using external dateutil package. Added total_seconds() to replace datetime.timedelta.total_seconds() on Python before 2.7.
vegbien.sql: *method tables: Added table comments
VegX-VegBIEN mapping: Reordered 2-step-only mappings that use /_ignore/inLabel so they run at the same time as other mappings that set the field that uses /_ignore/inLabel. This fixes almost all of the failing 2-step tests.
vegbien.sql: method: Added lengthunits field
vegbien.sql: Changed types of numerical plotmethod fields to double precision
vegbien.sql: method, plotmethod: Added comments to fields
vegbien.ERD.mwb: Adjusted lines
vegbien.sql: Added plotmethod. locationevent points to plotmethod instead of directly to method
vegbien.sql: Point to covermethod from method instead of locationevent
vegbien.sql: Removed no longer needed sizeclass table (whose fields are now in method)
vegbien.sql: Replaced stratumtype, stratummethod with method
vegbien.sql: Attach method to aggregateoccurrence instead of taxonoccurrence
vegbien.sql: Removed methodtrait* tables and added the method attributes they held as first-class fields of method. Removed *method tables from the ERD that will be replaced by method.
vegbien.sql: Removed location.dsgpoly because it is now locationdetermination.footprintgeometry_dwc
VegBIEN mappings: Remap to new locationdetermination fields
VegBIEN: Renamed location.reallatitude,reallongitude to centerlatitude,centerlongitude to reflect that it's now a value calculated from the centroid of the current locationdetermination
vegbien.sql: locationdetermination: Reordered fields
vegbien.sql: locationdetermination.coordsaccuracy: Added comment with units
vegbien.sql: locationdetermination: Added determination status columns from taxondetermination
vegbien.sql: locationdetermination: Added coordinates-related fields
VegX-VegBIEN mapping: Include the datasource name (now provided by map in /_ignore/inLabel) in the appropriate places in both VegX and VegBIEN
bin/map: Removed metadata values feature since the syntax used was causing problems with mappings starting with a ":", and metadata can instead be stored as attributes of the primary key's mapping
xml_dom.py: Fixed bug in parent() where it didn't account for NodeParentIter's first element returned being the current node, not its parent. Refactored parent() to use parentNode directly, and NodeParentIter to use parent(), instead of the other way around.
xml_dom.py: Fixed bug in parent() where incorrect variable name was used
VegX-VegBIEN mapping: Use the input data source's label (e.g. SALVIAS) everywhere a reference is needed
bin/map: Store the input data source's label (e.g. SALVIAS) in the output XML tree for use by references in the mappings
xpath.py: get(): Fixed bug where it would try to create a node named . or .. if . or .. didn't have matching attributes. Now it will just reuse the current or parent node, but create any needed attrs if create is True.
util.py: Added list_eq_is() to compare two lists using is
xpath.py: Don't allow rooted attributes (doesn't make sense), in case someone tries to do elem[/rooted_attr]
bin/map: Moved root.clear() into separate function prep_root() that can be called whenever needed
xpath.py: Added get() support for references (different from pointers) to dynamically set the value of an attribute
util.py: Added list_get()
util.py: Added is_list()
bin/map: Use var doc0_root for quick reference to doc0's root
xpath.py: get(): Go to root when empty element is encountered at the beginning of an XPath. Added allow_rooted parameter to turn off this functionality when XPaths with a leading slash should not be considered rooted.
xpath.py: Don't consider a path starting with "." to be rooted. Do this by not automatically translating an empty path name to ".".
xpath.py: Added is_rooted()
xpath.py: Added elem_is_empty()
xpath.py: Added documentation labels to each section
xpath.py: Added support for getting the parent node when encountering ".."
xml_dom.py: Added parent() to get parent node without recursing past the root node to the document object. Documented that NodeParentIter incorporates this sanity check.
xpath.py: get(): Renamed parent to root to better reflect that it's the starting point for the search. Calling it parent will later be confusing when we want to get the parent node using "..".
xpath.py: Added parser support for attribute values that are references to another part of the XML tree
xml_func.py: Fixed module description comment to reflect that not all XML funcs generate text
xml_func.py: Refactored to add funcs to the module funcs variable as they are defined. Renamed defined functions to the name of the corresponding XML function.
xml_func.py: Added _ignore func to "comment out" an XML subtree
input.Makefile: Fixed error message when no DB file found so that it doesn't incorrectly imply that PostgreSQL inputs are supported
input.Makefile: Don't run tests in verbose mode because the run time stats, etc. are not relevant
bin/map: Only print error/run time stats in verbose mode. input.Makefile: Run import in verbose mode so that error/run time stats are still printed.
Moved value-to-string conversion functions from util.py to new module format.py
exc.py, profiling.py: Use util.int2str() to print # iters with thousands separators
util.py: Added int2str()
bin/map: Document that the exit status is the # of errors in the import, up to the maximum exit status
exc.py: Generalize ExTracker to not just print the # of errors at exit. Instead, provide an exit() method that the ExTracker creator can call at exit to set the exit status to the # of errors. This fixes the Python bug where a benign error message was printed if SystemExit was raised in an atexit function.
bin/map: Set ExPercentTracker's iter_text. Start ExPercentTracker after input processing, because errors in command line options should just end the program and don't need to be tracked.
exc.py: ExPercentTracker: Added ability to set custom iter_text, similar to ItersProfiler
bin/map: Use profiling.ItersProfiler. Refactored input row count calculation to have each function aggregate and return the row count, and then display the row count and statistics that depend on it at the end of the program.
Added profiling.py to time operations and provide the user with statistical information
util.py: Added basic to_si() to add SI prefix to value
util.py: Added format_str() to use locale-specific formatting settings, including thousands separator. Use it in to_percent().
bin/map: Use new ExPercentTracker to print error rate (% of # rows) when program exits
exc.py: Added ExPercentTracker to track errors as % of iterations
util.py: Added to_percent()
exc.py: print_ex(): Declare emph param as a keyword param instead of popping it from **format
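
Illustration for the util.py entry on DefaultDict and dict_subset_right_join(): the sketch below shows one way these helpers could behave, based only on the descriptions in this log. The signatures and bodies are assumptions and are not copied from the actual util.py. The idea is that a row with missing columns gets explicit None values for every requested column, which sql.py put() can then pass to the database as explicit NULLs.

```python
import collections


class DefaultDict(collections.defaultdict):
    """defaultdict that takes a plain default value instead of a factory.

    Assumed signature: the log only says the value is passed in the
    constructor and defaults to None.
    """

    def __init__(self, dict_, default=None):
        # defaultdict copies dict_, so the caller's dict is not modified
        collections.defaultdict.__init__(self, lambda: default, dict_)


def dict_subset_right_join(dict_, keys):
    """Return the entries of dict_ for keys, using None for missing keys.

    Hypothetical implementation: every key in keys appears in the result,
    like the right-hand side of a right join.
    """
    wrapped = DefaultDict(dict_)
    return dict((key, wrapped[key]) for key in keys)


# A row that only has data for one of three columns
row = {'name': 'plot1'}
cols = ['name', 'datasource_id', 'centerlatitude']
filled = dict_subset_right_join(row, cols)
```

Here filled contains all three column names, with None for the two that had no data, while row itself is left unchanged.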