inputs/NYBG: Added verify/specimens.ref.sql
Added mappings/verify.specimens.sql
Added inputs/NYBG-CSV/verify/
Makefile: Print done message after verify
VegX-VegBIEN mapping: Use new lookup-only element syntax to ensure that stemtag 1 is not created if it doesn't exist when stemtag 2 tries to set its iscurrent status to false. This should fix the 136 "NullValueException: columns: tag" errors in the SALVIAS organisms import.
xpath.py: get(): Added support for lookup-only elements which are not created if they don't exist
xpath.py: parse(): Added support for lookup-only elements which are not created if they don't exist
VegX-VegBIEN mapping: Map stemtags using [] instead of :[] for attrs that are really keys
Regenerated vegbien.ERD exports
VegX-VegBIEN mapping: Handle user-defined field voucherType (SALVIAS DetType) by mapping specimenreplicates for voucherTypes other than direct via voucher
xml_func.py: Added _if and _eq. Added cast(), which throws SyntaxException if the value can't be cast, and used it in conv_items(). _merge: Check types of input using conv_items(strings.ustr, items).
util.py: Added all_not_none() and bool2str()
strings.py: Added ustr() (like built-in str() but converts to unicode object)
PostgreSQL-MySQL.csv: Fixed bug in removal of casts of default values, which treated NOT NULL as part of the datatype
VegBIEN: soilobs: Added default value for horizon. Adjusted mappings to remove now-unnecessary horizon value.
repl: Removed automatic case-insensitivity because Python apparently only supports turning on case-insensitivity via (?i) but not off via (?-i) (as Java does)
VegBIEN: soilobs: Removed soil* prefix from fields
VegX-VegBIEN mapping: Map to new soilobs fields
SALVIAS inputs: Use new _units:[units="%"] on soil fields that are percents. Replace "<..." values with 0.
xml_func.py: Added _units
vegbien.sql: soilobs: Converted user-defined fields to first-class. Labeled appropriate fields as "fraction".
VegBIEN mappings: Changed tableRecord_ID to tablerecord_id to match PostgreSQL field name
DwC2-VegBIEN mapping: Adjusted user-defined mappings
vegbien.sql: userdefined: Made userdefinedname NOT NULL. userdefined, definedvalue: Added unique constraints.
VegX-VegBIEN mapping: Mapped userdefined fields to new first-class fields
xml_func.py: Added _map and _replace
vegbien.ERD.mwb: Fixed lines. Expanded truncated tables where there was room.
vegbien.sql: locationevent: Added temperature and precipitation
vegbien.sql: aggregateoccurrence: Added growthform
vegbien.ERD.mwb: Reversed the locations of soiltaxon and soilobs to give soilobs room to add new fields
vegbien.sql: Removed embargo table and emb_* fields because we're using a central field, location.confidentialitystatus, for embargo information and coordinate fuzzing
vegbien.sql: stemobservation: Added heightfirstbranch
vegbien.sql: stemobservation: Added diameteraccuracy. Reordered fields.
VegBIEN: stemobservation: Renamed diameter to diameterbreastheight to be more accurate
vegbien.ERD.mwb: Expanded tables where there was room
DwC mappings: Fixed user-defined field mappings according to Brad Boyle's changes
vegbien.sql: Changed specimenreplicate_unique_collectionnumber constraint to include verbatimcollectorname because collection number is assigned by collector
VegBIEN: Moved taxonoccurrence.verbatimcollectorname to specimenreplicate and aggregateoccurrence so that it can be used in specimenreplicate duplicate elimination
mappings/DwC1-DwC2.specimens.csv: Notes mapping: Removed extraneous /_merge/1
input.Makefile: svn_props: Removed no longer needed items from input dir svn:ignore
input.Makefile: verify: Fixed bug for inputs without a .ref where $(wildcard) wouldn't recheck the file after verify/%.out is run, so the verify output wasn't printed
input.Makefile: Moved verify files into separate subdir
bin/map: Changed root label data format convention to datasrc[data_format] so datasource names containing hyphens would not have the part after the - treated as the data format
inputs maps: Changed input root labels to match dir names since verify expects these to be the same
input.Makefile: verify: Fixed bug where datasource name was not set for non-DB inputs
input.Makefile: Removed no longer needed default verify action for dirs with no verify.ref's
input.Makefile: verify: Made verifications table-specific
input.Makefile: import: Merged import and import-all because they do the same thing
input.Makefile: verify: Started rearranging to allow different verifies for each table
Moved verify.sql to mappings since it's mapping-related
input.Makefile: Changed option nolog to log so that options aren't specified in the negative
input.Makefile: svn ignore .trace files
input.Makefile: Profile imports into a .trace file unless env var profile=""
xml_func.py: _alt: On empty input, return None instead of raising SyntaxException because empty input should be OK
xml_func.py: _alt: Fixed bug where not specifying any item would crash the program instead of raising a SyntaxException
Factored verify.sql out into schemas dir
input.Makefile: verify: Print diff in two columns if verbose=1
inputs/SALVIAS/verify.sql: When filtering by datasource name, use an AND clause in the ON condition of the party JOIN instead of a separate WHERE clause, so that the datasource filtering code is all on the same line
inputs/SALVIAS/verify.sql: Use new :datasource variable instead of literal 'SALVIAS'
input.Makefile: Provide the verify.sql script a :datasource variable set to the datasource name (in quotes)
vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD
bin/map: DB, CSV inputs: Use column indexes instead of column names to look up each field (optimization to avoid repeated dict lookups of the same key)
util.py: ListDict: str(): Print each entry on its own line, in the order the keys were provided
NYBG-DwC maps: Filter out MinimumElevation = "."
xml_dom.py: NodeTextEntryIter: Filter out empty entries (instead of producing an entry with an explicit None value, which causes problems with XML funcs that can't handle Nones)
NYBG-DwC maps: Map to input fields with XML func appended whenever possible (DwC1->DwC2 translation is done by DwC-VegBIEN.specimens.csv)
vegbien.sql: Renamed methodtaxonclass.description to methodtaxonclass.taxonclass and changed it to a closed list (enum taxonclass). method.description can still be used for freeform taxonclass inclusions/exclusions.
DwC1-DwC2.specimens.csv: Removed no longer needed /_alt/2 XML func from date mappings (you will only ever map either the full date or the year/month/day)
DwC mappings: Moved DwC1's CoordinatePrecision /_noCV/value XML func suffix to DwC2-VegBIEN.specimens.csv
mappings: Removed mappings for XML func suffixes of a path because they are now automatically created heuristically by join
join: Added heuristic search for a match on a parent path, so that every XML func suffix of a path doesn't need its own mapping
vegbien.sql: Added method.pointsperline. Rearranged ERD after removing role fkeys.
filter_ERD.csv: Remove role fkeys
vegbien.sql: aggregateoccurrence: Added linecover
vegbien.sql: methodtaxonclass: Added description comment with list of values (which may become a closed list)
vegbien.sql: Changed lengthunits to m in all comments
vegbien.sql: method: Added subplotspacing and subplotmethod_id
vegbien.sql: method: Removed lengthunits and instead require all length- or area-related measurements throughout VegBIEN to be converted to SI base units, e.g. cm -> m, ha -> m^2. Adjusted ERD to avoid some densely packed lines.
vegbien.sql: methodtaxonclass: Added description field for taxon classes that don't fit well into a plantconcept. Made at least one of plantconcept_id or description required. Added unique constraint.
SALVIAS verifications: Use count(DISTINCT) instead of nested SELECT DISTINCT
VegBIEN verifications: Select only the records for the datasource being verified
SALVIAS verifications: Fixed to exclude subplots from locations/location events and uniqify locations based on coords
inputs/SALVIAS/verify.sql: Updated for schema changes
vegbien.ERD.mwb: Re-marked aggregateoccurrence:plantobservation relationship as 1:1 in the ERD. (I think this will need to be manually re-marked whenever either of those tables is updated.)
vegbien.sql: Removed methodgrowthform and growthform, since growthforms can be accommodated by plantconcept in a similar way as higher-order taxonomic ranks
vegbien.sql: methodgrowthform, methodtaxonclass: Removed "included" default value so it's always obvious whether the author intended the classes to be inclusions or exclusions
vegbien.sql: aggregateoccurrence: Removed unneeded fields. Added aggregateoccurrence->coverindex fkey.
vegbien.sql: Added constraint to enforce 1:1 aggregateoccurrence:plantobservation relationship
vegbien.sql: Added plantname unique constraint
bin/map: Use new util.ListDict and util.WrapIter to simplify getting rows by column name instead of index, and to enable a row to be printed with its column names in error messages
util.py: Added WrapIter to wrap an iterator and ListDict to view a list as a dict
bin/map: Use new util.list_flip()
util.py: Added list_flip()
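
The last four entries (list_flip(), WrapIter, ListDict, and their use in bin/map) describe looking up row values by column name. The following is a minimal Python 3 sketch of how such helpers might fit together. It is not the actual util.py code: only the names list_flip, ListDict and WrapIter come from the log, while the signatures, the meaning of "flip" (value -> index), and the sample column names are assumptions made for illustration.

```python
def list_flip(items):
    """Map each list value to its index (assumed meaning of "flip")."""
    return {value: idx for idx, value in enumerate(items)}


class ListDict:
    """View a list as a dict, given its key names in list order.

    Hypothetical signature: the real util.ListDict may differ.
    """

    def __init__(self, values, keys, key_idxs=None):
        self.values = values
        self.keys = keys
        # The name -> index map can be built once and shared by every row.
        self.key_idxs = key_idxs if key_idxs is not None else list_flip(keys)

    def __getitem__(self, key):
        return self.values[self.key_idxs[key]]

    def __str__(self):
        # One entry per line, in the order the keys were provided.
        return "\n".join(f"{key}: {self[key]!r}" for key in self.keys)


class WrapIter:
    """Wrap an iterator, applying a function to each item as it is read.

    Hypothetical signature: the real util.WrapIter may differ.
    """

    def __init__(self, wrap_func, iterable):
        self.wrap_func = wrap_func
        self.iterator = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return self.wrap_func(next(self.iterator))


# Usage: expose each raw row (a plain list) by column name.
col_names = ["family", "genus", "species"]  # illustrative column names
col_idxs = list_flip(col_names)
raw_rows = [["Fagaceae", "Quercus", "alba"]]
rows = WrapIter(lambda row: ListDict(row, col_names, col_idxs), raw_rows)
first_row = next(rows)
```

Sharing one precomputed name -> index map across all rows keeps per-row overhead low, and the str() form lets an error message show a row together with its column names.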