xml_dom.py: replace_with_text(): Support ints and floats
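A minimal sketch of what supporting ints and floats could look like, assuming the function works on xml.dom.minidom nodes and converts numbers with str(). The body below is illustrative, not the actual xml_dom.py code, and the element names are hypothetical.

```python
from xml.dom import minidom

def replace_with_text(node, new_value):
    """Replace `node` with a text node holding `new_value` (illustrative sketch)."""
    # Assumption: numeric values are converted to their str() form first.
    if isinstance(new_value, (int, float)):
        new_value = str(new_value)
    text_node = node.ownerDocument.createTextNode(new_value)
    node.parentNode.replaceChild(text_node, node)
    return text_node

# Hypothetical document: a function node that evaluates to the float 3.5
doc = minidom.parseString("<root><value><_func/></value></root>")
func_node = doc.getElementsByTagName("_func")[0]
replace_with_text(func_node, 3.5)
result = doc.documentElement.toxml()
```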
xml_func.py: simplify(): Run xml_dom.prune_empty() on function nodes that don't have an explicit simplifying function. This allows single-arg functions with no arg to be pruned rather than called with no args (causing errors if the single param does not have a default value).
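The pruning described above might be sketched as follows, assuming "empty" means no child elements and no non-whitespace text. The helper bodies and the `_date` function node are hypothetical stand-ins, not the actual xml_dom.py implementation.

```python
from xml.dom import minidom

def is_empty(node):
    """True if the element has no child elements and no non-whitespace text."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            return False
        if child.nodeType == child.TEXT_NODE and child.data.strip():
            return False
    return True

def prune_empty(node):
    """Remove `node` from its parent if it is empty; return whether it was removed."""
    if is_empty(node):
        node.parentNode.removeChild(node)
        return True
    return False

# Two calls to a hypothetical single-arg function: one without an arg, one with.
doc = minidom.parseString("<row><a><_date/></a><b><_date>2012</_date></b></row>")
for func in list(doc.getElementsByTagName("_date")):
    prune_empty(func)  # the arg-less call is dropped instead of being invoked
result = doc.documentElement.toxml()
```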
Regenerated vegbien.ERD exports
schemas/vegbien.sql: Added units suffix to additional VegBIEN fields that have units
schemas/vegbien.sql: Added units suffix to all core VegBIEN fields that have units. It is the responsibility of the mappings to ensure that all units are properly translated.
root Makefile: PostgreSQL: postgres-Linux: Added postgresql-postgis apt-get
backups/Makefile: Backups: Full DB: Specify the date suffix of the backup when it's created rather than adding it afterwards. This allows the user to specify a suffix that matches the corresponding public-schema backup.
inputs/*/*/map.csv: Mapped variants of subspecies directly to new subspecies term
mappings/VegCore-VegBIEN.csv: subspecies, infraspecificEpithet: Added _alts for datasources that specify both
input.Makefile: Mapping: $(map2db): Inline $(map) because this is the only place it's used
input.Makefile: Mapping: $(map): Don't require flat files because they don't need to be used directly anymore (staging tables are used instead)
input.Makefile: Mapping: $(map2db): Always use staging tables, because the flat files don't need to be used directly anymore
mappings/Veg+-VegCore.csv: Remapped subspecies, subSpeciesName to new subspecies term
mappings/VegCore-VegBIEN.csv: Mapped subspecies, variety, forma, cultivar
mappings/Veg+.terms.csv: Added subspecies, variety, forma, cultivar
schemas/vegbien.sql: taxon.authority_id: Added descriptive comment that this is the authority which defines the taxon name (as opposed to the author of the taxon name)
schemas/vegbien.sql: taxon: Added author_id for the author of the taxon name. This is distinct from authority_id, which is the authority used to determine which taxon name to apply.
schemas/vegbien.sql: analytical_db_view: Use new denormalized placepath table instead of place, which significantly reduces the number of joins
schemas/vegbien.sql: location: Removed stateprovince, country because these are now in placepath (as well as in place.rank)
schemas/vegbien.sql: analytical_db_view: LEFT JOIN locationcoords and locationplace so that locations will be included even if they don't have one of these two determinations
schemas/vegbien.sql: analytical_db_view: Fixed bug where method was being joined instead of left-joined, causing only rows with a method to be included
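The difference between the two join types in the analytical_db_view fixes above can be reproduced with a small in-memory SQLite database. The two-table schema here is a deliberately simplified, hypothetical stand-in for the real view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE method (method_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE locationevent (
        locationevent_id INTEGER PRIMARY KEY,
        method_id INTEGER  -- optional: may be NULL
    );
    INSERT INTO method VALUES (1, 'plot');
    INSERT INTO locationevent VALUES (10, 1), (11, NULL);
""")

# Inner join: the row without a method silently disappears.
inner = conn.execute(
    "SELECT locationevent_id FROM locationevent"
    " JOIN method USING (method_id) ORDER BY locationevent_id"
).fetchall()

# Left join: every locationevent row is kept, with NULLs for the missing method.
left = conn.execute(
    "SELECT locationevent_id FROM locationevent"
    " LEFT JOIN method USING (method_id) ORDER BY locationevent_id"
).fetchall()
conn.close()
```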
schemas/vegbien.sql: locationplace: Added identifier_id, so that different identifiers (e.g. the data provider and GNRS) can provide separate locationplaces even if the standardized name happens to be the same as the original name
mappings/VegBank-VegBIEN.csv: Added place->locationplace renaming
mappings/VegBIEN-VegBank.csv: Reversed the order of the columns so it's a more natural forward renaming, and renamed the file to VegBank-VegBIEN.csv to reflect the new column order
mappings/VegBIEN-VegBank.csv: Fixed order of plantconcept->taxon renaming because the VegBIEN column is on the right
schemas/vegbien.sql: Renamed namedplace to place for simplicity and consistency with placepath and locationplace
schemas/vegbien.sql: taxon: Made authority an fkey to reference instead of a text field
schemas/vegbien.sql: Moved the steps for including a taxon name at a rank with no explicit column from taxon's comment to taxonpath's comment, because taxonpath is the table the steps apply to
schemas/vegbien.sql: Added placepath (analogous to taxonpath), and point locationplace to it instead of directly to namedplace
schemas/vegbien.sql: Split locationdetermination into locationcoords and locationplace, so that coordinate determinations can be made separately from place determinations
schemas/vegbien.sql: location: Removed authore, authorn because this information is now in locationdetermination as verbatimlongitude, verbatimlatitude
schemas/vegbien.sql: location: Removed centerlatitude/longitude, publiclatitude/longitude because this information is now in locationdetermination
schemas/vegbien.ERD.mwb: Fixed lines
mappings/VegBIEN-VegBank.csv: Added table rename plantconcept->taxon
schemas/vegbien.sql: taxonpath.scientificnamewithauthor: Added comment that it's equivalent to "Name sec. x"
schemas/vegbien.sql: taxon: Added comment that it's VegBank's plantConcept table
schemas/vegbien.sql: Renamed plantconcept to taxonpath for consistency with DwC's Taxon category and to emphasize that the table stores taxonomic paths
schemas/vegbien.sql: Renamed plantname to taxon for consistency with DwC's Taxon category
schemas/vegbien.sql: plantname: Renamed plantname field to taxonname for consistency with DwC's Taxon category
Updated aggregated unmapped_terms.csv, new_terms.csv. This removes terms that contained a filter (which is now in a separate column) and moves new terms that are unmapped from new_terms.csv to unmapped_terms.csv. Note that the majority of unmapped terms are from VegBank's huge tables, and are not part of the core fields needed for the analytical DB.
schemas/vegbien.sql: taxonrank: Switched to using extended taxonomic ranks list derived from VegX at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegBIEN_taxonomic_schema#Extended>. This renames *division to *phylum and splits up 'cultivar/forma'.
schemas/vegbien.sql: taxonrank: Removed 'authority', which doesn't belong as a taxonomic rank
schemas/vegbien.sql: plantname: Added authority so each taxonomic level can have its own authority (author). Include it in the plantname_unique unique index because plantname is a globally scoped table.
schemas/vegbien.sql: taxonrank: Removed 'binomial', which doesn't belong as a taxonomic rank
schemas/vegbien.sql: Changed analytical_db_view to use new denormalized taxonomic names in plantconcept, which significantly reduces the number of joins. Note that changing the tables used by a view which depends on other tables will cause those tables to be reordered in dependency order to appear before the view, causing things to be moved around in the svn diff.
inputs/Madidi/Organism/map.csv: Remapped Specie+autor to new scientificNameWithAuthorship. Mapped Species and morphotypes to now-available scientificName.
mappings/VegCore-VegBIEN.csv: Moved scientificNameWithAuthorship before scientificName in taxonoccurrence.authortaxoncode's _alts
mappings/VegCore-VegBIEN.csv: Mapped scientificNameWithAuthorship as an _alt of taxonoccurrence.authortaxoncode
mappings/VegCore-VegBIEN.csv: Mapped scientificNameWithAuthorship
mappings/Veg+.terms.csv: Added scientificNameWithAuthorship
mappings/VegCore-VegBIEN.csv: Taxonomic names: Remapped to new denormalized fields in plantconcept
schemas/vegbien.sql: plantname: Added comment documenting how to include a taxon name at a rank with no explicit column, by using the plantname table as an ordered linked list linked together using parent_id. (This method of using a linked list is one way of storing an ordered list of user-defined data. It is similar to using locationevent.previous_id to link successive reobservations of the same location together.) Note that plantname can store both the official tree of life and the data provider's own custom tree of life (or a subset thereof), with the two being distinguished by whether the data provider's or TNRS's taxondeterminations point to them.
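As a rough illustration of the parent_id linked list, the sketch below walks from a leaf row up to the root and returns the names in top-down order. The in-memory dict and its sample rows are hypothetical. The real data lives in the plantname table.

```python
# Hypothetical rows: plantname_id -> (name, parent_id)
plantname = {
    1: ("rosids", None),   # custom level with no explicit column
    2: ("Rosales", 1),
    3: ("Rosaceae", 2),
    4: ("Rosa", 3),
}

def taxon_path(rows, leaf_id):
    """Follow parent_id links from the leaf to the root; return names root-first."""
    path = []
    current = leaf_id
    while current is not None:
        name, parent_id = rows[current]
        path.append(name)
        current = parent_id
    path.reverse()
    return path

path = taxon_path(plantname, 4)
```

Because the order is carried entirely by the links, a data provider can insert an extra level by pointing one row's parent_id at a new row, without any schema change.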
schemas/vegbien.sql: plantname: Added verbatimrank to store ranks of custom taxonomic levels, such as rosids. Note that even if you specify a custom verbatimrank, you must also specify a closest-match rank from the taxonrank closed list. This ensures that every taxonomic name is placed in the correct relative order in the taxonomic hierarchy.
schemas/vegbien.sql: plantconcept: Made plantname_id optional because the datasource's plantconcepts do not need to be placed in the recursive plantname hierarchy
schemas/vegbien.sql: plantconcept: Added datasource_id and appropriate unique indexes to enable scoping by datasource. Moved plantcode right after datasource_id because it will be used for the sourceaccessioncode (if any).
schemas/vegbien.sql: Moved plantconcept.plantdescription to plantname and renamed it to description, so that a taxon of any rank can have a description
schemas/vegbien.sql: plantconcept: Added denormalized taxonomic ranks from <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegBIEN_taxonomic_schema#Primary> and concatenated scientific name fields
Removed no longer used ucase_first
Removed no longer used bin/union
Removed no longer used join_union_sort
Removed no longer used ci_map, because all relevant mapping scripts are now case-insensitive
mappings/Makefile: Inline $(review_) because it's only used once
mappings/Makefile: Removed no longer used $(review)
mappings/Makefile: Don't set $(SHELL) to /bin/bash because this is no longer needed
mappings/Makefile: Removed empty VegCSV section. mappings/Makefile's only functionality is now to clean up (sort) the core maps whenever they change and create human-readable maps from them.
mappings/Makefile: Removed no longer used self maps, because the new automapping mechanism does not use them
input.Makefile: Existing maps discovery: Substituted Veg+ for $(via) because it's now only used once
mappings/VegCore-VegBIEN.csv: Changed input column header from VegCore[Veg+] to VegCore because this is more accurate. This is possible now that we're using new automapping scripts that do not require a particular column header.
inputs/*/*/map.csv: Changed _merge to _join everywhere because _merge's (slower) duplicate elimination functionality is not needed (the combined columns do not both contain the same value, so they can simply be concatenated)
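The distinction between the two functions can be sketched as below. The "; " separator and the exact duplicate-elimination rule are assumptions for illustration, not the documented behavior of _merge or _join.

```python
def merge(values, sep="; "):
    """Concatenate non-empty values, skipping any value already seen."""
    seen = []
    for value in values:
        if value and value not in seen:  # linear scan per value: the slower part
            seen.append(value)
    return sep.join(seen)

def join(values, sep="; "):
    """Concatenate non-empty values as-is, with no duplicate check."""
    return sep.join(value for value in values if value)

# When the combined columns never hold the same value, both give the same result.
same = merge(["Bolivia", "La Paz"]) == join(["Bolivia", "La Paz"])
```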
schemas/functions.sql: _label(): Accept params of any type, in order to support types other than text (which come from staging tables that are imported directly from a SQL export). This fixes a bug in SALVIAS.plotMetadata's column-based import.
schemas/functions.sql: _label(): Support NULL labels by not prepending a label
mappings/Veg+-VegCore.csv: Changed output column header from Veg+ to VegCore because this is more accurate. This is possible now that we're using new automapping scripts that do not require a particular column header. Note that this change now requires the map.csvs to use VegCore as their output column header, because otherwise the Veg+ header will get automapped to VegCore. (The header replacing is a feature to support changing the header when the schema of the column's terms changes.)
mappings/root.sh: Changed output column header from Veg+ to VegCore because this is more accurate following the initial automapping
inputs/*/*/map.csv: Changed output column header from Veg+ to VegCore because the names will be VegCore names after automapping. This is possible now that we're using new automapping scripts that do not require a particular column header.
inputs/import.stats.xls: Copied the Change factor formula to all rows (it displays an empty string for rows that don't have both a row-based and a column-based import)
README.TXT: Data import: Added steps to record the import times in inputs/import.stats.xls
inputs/import.stats.xls: Updated with stats from latest import
Added import_times
mappings/root.sh: Removed no longer needed $in_root_suffix
src_map: Upgraded to match new map format by adding Filter column
input.Makefile: $(viaMaps): Fixed bug: it must not be wrapped in $(wildcard), because that would prevent map.csv from being created when a new datasource or new subdir is added
input.Makefile: $(viaMaps): Removed extra addition of */map.csv, which is already included because all $(tables) have or will get a map.csv
mappings/: Removed no longer used derived file Veg+.vocab.csv
input.Makefile: Removed no longer used $(vocab)
input.Makefile: Maps validation: %/new_terms.csv: Filter out $(coreMap) and $(dict) successively instead of $(vocab), to avoid requiring intermediate mapping files not edited by the user
input.Makefile: Maps validation: $(newTerms): Don't hardcode the caller's first filter_out_ci by prerequisite position; instead allow them to specify the command (including the var name) themselves
input.Makefile: Maps validation: $(newTerms): For simplicity, subset the columns before running filter_out_ci
mappings/: Removed no longer used Veg+-VegBIEN.csv and derived autogen Veg+.self.csv
input.Makefile: Maps building: %/unmapped_terms.csv: Use $(coreMap) instead of $(vocab) because the terms should already be translated to VegCore terms, rather than still being Veg+
input.Makefile: Maps validation: $(newTerms): Fixed bug by removing the header before running filter_out_ci, because filter_out_ci only removes the header if it matches the vocabulary's header. Removing the header afterward can cause the first data row to be removed instead if filter_out_ci already removed the header.
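The ordering problem can be shown with a simplified model of filter_out_ci that drops every line matching the vocabulary case-insensitively, so the header is dropped only when the vocabulary has the same header. The function body and the sample terms are hypothetical.

```python
def filter_out_ci(lines, vocab_lines):
    """Drop lines that appear in the vocabulary, ignoring case."""
    vocab = {line.lower() for line in vocab_lines}
    return [line for line in lines if line.lower() not in vocab]

vocab = ["term", "plotCode"]            # vocabulary, including its header
terms = ["Term", "plotcode", "newTerm"]  # input with a matching header

# Buggy order: filter first (header already gone), then strip "the header".
buggy = filter_out_ci(terms, vocab)[1:]

# Fixed order: strip the header first, then filter.
fixed = filter_out_ci(terms[1:], vocab)
```

In the buggy order the only genuinely new term is discarded, because it has become the first line by the time the header is stripped.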
cols: Support CSVs without a header, such as intermediates that become unmapped_terms.csv, new_terms.csv
inputs/: Regenerated unmapped_terms.csv, new_terms.csv
input.Makefile: %/.map.csv.last_cleanup: Removed no longer used prerequisite $(vocab)
input.Makefile: %/.map.csv.last_cleanup: Canonicalize separately on $(coreMap) and $(dict), instead of requiring them to be combined in $(vocab)