xml_func.py: simplify(): Added pruning optimization that removes empty children. Empty children are created when some mappings don't apply to the current datasource.
xml_func.py: simplify(): Only generate children list if node is a function
xml_func.py: simplify(): Refactored to support processing nodes that are not functions. Changed var names for clarity.
mappings/VegCore-VegBIEN.csv: _simplifyPath() calls: Removed no longer needed `require` arg, and removed no longer needed table suffix from `next` arg
db_xml.py: put(): _simplifyPath() built-in function: Removed `require` param, which is not used by this _simplifyPath() implementation because the database constraints handle this
mappings/Veg+.terms.csv: Added subplot
input.Makefile: SVN: add: Also add empty import_order.txt
lib/common.Makefile: SVN: Added $(addFile)
input.Makefile: SVN: add: Don't automatically add a Specimen subdir, because some plots datasources don't have that table
README.TXT: Datasource setup: Adding input data: Added step to add <table> to inputs/<datasrc>/import_order.txt
README.TXT: Datasource setup: Changed "<name>" to "<datasrc>" to distinguish it more clearly from "<table>", which is also a name
README.TXT: Datasource setup: Adding input data: Changed steps to use new %/add command to add table's subdir
input.Makefile: SVN: Added %/add to add a new table subdir. add: Changed default subdir name to Specimen to match suggested table names at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV#Suggested-table-names>. Use new %/add to add it.
inputs/import.stats.xls: Updated with stats from latest import
README.TXT: Datasource setup: Replaced fixed table names with link to VegCSV suggested table names
input.Makefile: $(srcsOnly): Include only files ending in one of the data extensions: csv tsv txt xml. This allows the data provider to include other documentation files, such as SQL export queries, in the table subdirs.
bin/map: Documented that it is duplicate-column safe (supports multiple columns of the same name)
README.TXT: Datasource setup: Obtaining CSVs: Documented that when exporting relational databases to CSVs, you MUST ensure that embedded quotes are escaped by doubling them, not by preceding them with a "\" as is the default in phpMyAdmin
csvs.py: delims: Added ";", which is phpMyAdmin's default CSV delimiter
sql_io.py: null_strs: Added 'NULL', which is used by phpMyAdmin as the default "Replace NULL with" value for CSV exports
sql_io.py: cleanup_table(): Refactored to use for loop with array constant, so that additional NULL-equivalent strings can easily be added
mappings/roots/: Merged roots for different tables into one mappings/root.sh for Veg+, which handles all tables' mappings to VegBIEN
sql_io.py: put_table(): When ignoring all rows for an iteration, return literal NULL value instead of column of NULLs as an optimization for callers using that iteration's pkeys
mappings/VegCore-VegBIEN.csv: Primary taxondetermination: Removed [role=identifier] because the role of the entity making the determination is unknown. Added [!isoriginal] filter to those mappings to ensure that primary taxondetermination XPaths map to a different taxondetermination than the [isoriginal=true] determination when both are present.
inputs/SALVIAS*/1.organisms/map.csv: Remapped cfaff to identificationQualifier, because it was previously mapped to the same taxondetermination as the Orig* terms but does not have a corresponding Orig prefix to indicate that it should apply to the original determination instead of the primary TNRS one
mappings/Veg+.terms.csv: Removed no longer used computer.* taxonomic terms
mappings/VegCore-VegBIEN.csv: Removed no longer used computer.* taxonomic terms
inputs: Regenerated VegBIEN.csv for several datasources, which had apparently not gotten regenerated when make was run after the taxonRank mapping addition
backups/: svn:ignore: Also ignore .*, which includes temp files generated by rsync
xml_func.py: simplify(): Also consider _name() to be an aggregate function
inputs/SALVIAS*/1.organisms/map.csv: Removed computer.* prefix from primary (TNRS) taxondetermination, so it would map to the main taxondetermination in VegBIEN
mappings/VegCore-VegBIEN.csv: Mapped taxonRank analogously to computer.taxonRank
inputs/SALVIAS*/1.organisms/map.csv: Remapped OrigFamily/OrigGenus/OrigSpecies to new verbatim* taxonomic names. Also remapped cfaff to verbatimIdentificationQualifier, because it was previously mapped to the same taxondetermination as the Orig* terms, but this will later need to be remapped to identificationQualifier (not in this commit because that is a separate change). Note that the switch to the verbatim* taxonomic names removes a concatenated binomial that was part of the previous mappings, which put OrigGenus and OrigSpecies together into one scientificName.
mappings/VegCore-VegBIEN.csv: Mapped verbatimScientificName to taxonoccurrence.authortaxoncode as an alternative to scientificName
mappings/VegCore-VegBIEN.csv: Mapped verbatim* taxonomic terms
mappings/Veg+.terms.csv: Added verbatimIdentificationQualifier
mappings/Veg+.terms.csv: Added verbatimScientificName
schemas/vegbien.sql: taxondetermination: taxondetermination_unique unique index: Added isoriginal so an "original" determination in the same row (as found in SALVIAS) will be seen as distinct from the scrubbed determination, even if they are to the same plant name
mappings/VegCore-VegBIEN.csv: taxonomic terms: Removed ":[isoriginal=true]" because there may be multiple determinations for an organism (either in separate rows or, for SALVIAS, in separate columns), and not all will be the original determination
schemas/vegbien.sql: taxondetermination.role: Default to 'unknown' so that the field is optional
schemas/vegbien.sql: role enum: Added 'unknown' value
mappings/Veg+.terms.csv: Added verbatim* taxonomic terms
inputs: Regenerated maps for changes to bin/union, which removes empty mappings. Added /_alt suffix where needed.
inputs: Move src subdir into main dir, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCSV_subfolders#Move-src-subdir-into-main-dir>
input.Makefile: $(tables): Allow datasource to specify custom import order in src/import_order.txt
mappings/Veg+.terms.csv: growthForm: Documented source of standard terms
inputs/SALVIAS*/src/1.organisms/map.csv: Removed no longer applicable comments, which related to mappings that were in effect long ago
inputs/SALVIAS/src/2.stems/map.csv: Added comments from corresponding SALVIAS-CSV organisms columns
inputs/SALVIAS*/src/1.organisms/map.csv: Habit: Mapped to new Veg+ habit term
inputs/SALVIAS*/src/1.organisms/map.csv: Habit: Don't filter out values not part of the provided terms list, because such values should be flagged as invalid in the error maps rather than silently discarded. This also ensures that any valid values which are not part of the provided terms list are kept.
mappings/Veg+-VegCore.csv: habit: Map to new verbatimGrowthForm since this field is not necessarily standardized
mappings/Makefile: Veg+.cs-VegBIEN.csv: Join new Veg+-VegCore.to_self.csv (self-join), instead of Veg+-VegCore.csv, to VegCore-VegBIEN.csv, to support two-level chains of mappings in Veg+-VegCore.csv
mappings/Veg+-VegCore.csv: /_alt pass through mappings: Removed comment because the two-level mapping propagates it to all fields ending in /_alt, even though it doesn't apply to them, causing the main VegBIEN map and several datasources' maps to change unnecessarily. Also, the comment is not completely accurate because /_alt pass throughs are now used primarily to support idempotent self-joins of Veg+-VegCore.csv.
union: Don't eliminate duplicate rows based on matches between map_0's output column and map_1's input column, because union is now being used for self-joins and it is legitimate for a term to appear as both an input and an output
sql_io.py: put_table(): MissingCastException: Use strings.repr_no_u() instead of strings.urepr() in order to remove the u in u'...' for Unicode strings
README.TXT: After a new import: Updated commands for new subdirs layout
Regenerated vegbien.ERD exports
mappings: Added autogen Veg+-VegCore.to_self.csv, which is Veg+-VegCore.csv joined to itself, and use it as an intermediate map to join to VegCore-VegBIEN.csv. This provides support for two-level chains of mappings in Veg+-VegCore.csv.
mappings/Veg+-VegCore.csv: Changed output root to Veg+, to allow mappings/Veg+-VegCore.csv to be joined with itself idempotently, for supporting multi-level chains of mappings
mappings/Veg+-VegCore.csv: Add pass through /_alt mapping for all terms in this map that are merged with _alt, to allow datasource to define custom mappings that don't pass through the default mapping. This also allows mappings/Veg+-VegCore.csv to be joined with itself idempotently, to support multi-level chains of mappings.
mappings/Veg+-VegCore.csv: authorPlantCode: Added _alt suffix to create the correct priority
union: Exclude empty rows from the output, so that empty mappings from map_0 aren't included when map_1 contains a non-empty mapping for the same term. Note that this causes "No non-empty join mapping" warnings to turn into "No join mapping".
ci_map: Run join_union_sort in quiet mode so that it doesn't add lots of "No non-empty join mapping" warnings to the Comments column
mappings/Veg+-VegCore.csv: scientificNameAuthor: Added scientificNameAuthorship mapping with /_alt/1, to ensure that it has priority over scientificNameAuthor and to ensure that it has an _alt suffix when a datasource contains both scientificNameAuthor and scientificNameAuthorship (such as SpeciesLink)
inputs/SpeciesLink/src/specimens/map.csv: Added explicit _alt suffix when multiple terms map to the same place
inputs/ARIZ/src/specimens/map.csv: RelatedCatalogItem mappings: Added _alt suffixes
union: Multi-support: When an input appears in both maps, treat an empty mapping as if it didn't exist so that it doesn't overwrite a non-empty mapping in the other map
mappings/Makefile: Veg+.cs-VegBIEN.csv: Join Veg+-VegCore.csv to VegCore-VegBIEN.csv in quiet mode, to avoid adding "No non-empty join mapping" to the Comments column
join: quiet mode: Turn off all warnings, not just "No input mapping" warnings. This is useful when join-unioning a synonymy to a primary map, which may have "No non-empty join mapping" for some terms but this should not be stored in the resulting map's Comments column.
mappings/Makefile: Rewrapped lines
mappings/Veg+-VegCore.csv: Added verbatimGrowthForm mapping
mappings/Veg+.terms.csv: verbatimGrowthForm: Added comment that additional values come from SALVIAS. As other datasources' custom growth form values are added, they can be added to this comment.
mappings/Veg+.terms.csv: Added verbatimGrowthForm
schemas/vegbien.sql: locationdetermination: Added verbatimlatitude, verbatimlongitude, verbatimcoordinates
schemas/functions.sql: Made aggregating functions polymorphic
xml_func.py: Removed no longer used _collapse()
xml_func.py: Removed no longer needed _if(), which has been translated to a SQL function
schemas/functions.sql: Added _if()
sql.py: function_exists(): Support overloaded functions
sql.py: run_query(): Parse "more than one" errors as DuplicateExceptions
xml_func.py: XML function specification documentation: Updated parameters
xml_func.py: Removed no longer needed _eq(), which has been translated to a SQL function
schemas/functions.sql: Added _eq()
sql.py: run_query(): Parse "could not determine polymorphic type because input has type "unknown"" errors as MissingCastExceptions to type text. This adds support for polymorphic SQL functions whose parameters are anyelement, etc.
sql_io.py: put_table(): sql.MissingCastException: Support unknown (None) columns, by casting all columns
sql.py: MissingCastException: Support unknown (None) columns
xml_dom.py: replace_with_text(): Support bool `new` values
input.Makefile: Determine import order from sorted order of all non-hidden subdirs, instead of from fixed constant. This allows datasources to specify arbitrary tables, rather than being limited to 0.plots, 1.organisms, 2.stems, specimens.
lib/common.Makefile: Added $(wildcard/) (needed because builtin $(wildcard) doesn't do / suffix correctly)
input.Makefile: src/%/map.full.csv: Fixed bug where couldn't have $(srcMap) in prerequisites because this would for some reason cause src/%/map.full.csv to always be remade
input.Makefile: Src maps cleanup: Fixed bug where src.csv was using .map.csv.last_cleanup instead of .src.csv.last_cleanup as its .last_cleanup file
input.Makefile: Maps building: Moved src/%/map.full.csv after src/%/map.csv now that the filenames are fixed, so pattern matching order isn't an issue
input.Makefile: Maps building: $(makeFullCsv): Removed no longer needed test for whether the $(coreSelfMap) exists, because Veg+'s self map always exists
inputs/CTFS/src/1.organisms/: Added "_" prefix to prevent it from being treated as a data table subdir, before the DB export is mapped