schemas/vegbien.sql: analytical_* pkeys: Added dateCollected because the records are actually unique within the location*event*, not the location
schemas/vegbien.sql: analytical_stem_view: Exclude records with neither a collectiondate nor an obsstartdate, because a date is required to uniquely identify a record
analytical_stem_view: dateCollected: Use locationevent.obsstartdate when aggregateoccurrence.collectiondate is not provided
schemas/vegbien.sql: analytical_stem_view: Include only the current taxondetermination for each taxonoccurrence, to avoid cross-joining taxondeterminations with stems and thus multiplying the number of rows for datasources that have multiple taxondeterminations per taxonoccurrence
schemas/vegbien.sql: taxondetermination: Added AFTER trigger to set the current taxondetermination for the taxonoccurrence
lib/PostgreSQL-MySQL.csv: Statements ending in ";": When matching any character, use .*? (with the (?s) flag) instead of [^;]* in order to allow an embedded ";" to be matched. This fixes a bug where a CREATE VIEW statement was not removed because it contained an embedded ";".
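The difference between the two patterns can be sketched in Python as follows. This is only an illustration: the sample SQL and both regexes are hypothetical, not the actual entries in lib/PostgreSQL-MySQL.csv, and the sketch assumes the statement terminator is a ";" followed by a newline.

```python
import re

# Hypothetical input: the view body contains an embedded ";" in a string literal.
sql = (
    "CREATE VIEW v AS SELECT 'a;b' AS x;\n"
    "CREATE TABLE t (id integer);\n"
)

# Old style: [^;]* cannot cross the embedded ";", so the pattern never reaches
# the real terminator and the statement is left in place.
old_pattern = re.compile(r"CREATE VIEW [^;]*;\n")

# New style: lazy .*? with (?s), so "." also matches newlines and the match
# extends to the first ";" that actually ends a line.
new_pattern = re.compile(r"(?s)CREATE VIEW .*?;\n")

old_result = old_pattern.sub("", sql)
new_result = new_pattern.sub("", sql)
```

The lazy quantifier matters here: a greedy .* under (?s) would run on to the last terminator in the file and remove the following statements as well.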
schemas/vegbien.sql: taxondetermination: Added unique index to ensure that there is only one current determination for each taxonoccurrence
lib/PostgreSQL-MySQL.csv: Remove indexes with WHERE clauses
schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies, recordNumber. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on most of the tables which provide them, and LEFT JOINed tables have their identifying fields combined to create a NOT NULL value.
schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
lib/PostgreSQL-MySQL.csv: Only match a statement-terminating ; when it's at the end of a line
schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on the tables which provide them.
db_xml.py: put(): _setDefault(): Delay the evaluation of each col_default's value until the col_default is actually retrieved. This fixes a bug in the source table mappings where the explicit source entry was being created after the col_default source entry, causing the initial entry, which did not have the additional fields populated, to be used instead.
dicts.py: Added WrapDict, a dict that runs a function on each value retrieved
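A minimal sketch of the idea behind a wrapping dict, assuming a constructor that takes the wrapper function first. The class body, the constructor signature, and the make_default() helper are illustrative assumptions, not the code in dicts.py.

```python
class WrapDict(dict):
    """Hypothetical sketch: a dict that applies `wrap` to each value on lookup."""

    def __init__(self, wrap, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.wrap = wrap

    def __getitem__(self, key):
        # Only d[key] is wrapped in this sketch; dict.get() bypasses __getitem__.
        return self.wrap(super().__getitem__(key))


calls = []  # records when each stored value is actually evaluated


def make_default(name):
    """Hypothetical helper: store a zero-argument function instead of a value."""
    def thunk():
        calls.append(name)
        return name.upper()
    return thunk


# Storing thunks and calling them on retrieval delays evaluation until lookup.
col_defaults = WrapDict(lambda thunk: thunk(), {"source_id": make_default("source")})
```

Nothing is evaluated when the dict is built; the stored function only runs when col_defaults["source_id"] is read, which is the kind of delayed evaluation the _setDefault() change relies on.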
db_xml.py: put(): _setDefault(): Fixed bug where need to copy col_defaults before calling update() on it, to avoid modifying the input value (which may be reused by the caller, expecting it to be unmodified)
db_xml.py: put(): col_defaults param: Fixed bug where need to use None as default value, because col_defaults will be modified by put() and the {} default value is a global instance
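The two col_defaults fixes above address a common Python pitfall, sketched below. The function names and bodies are hypothetical and much simpler than the real put(); they only show why a {} default is shared between calls and why the input should be copied before update().

```python
def put_buggy(row, col_defaults={}):
    # The {} default is created once, at definition time, and shared by every
    # call that omits col_defaults, so entries leak from one call to the next.
    col_defaults.update(row)
    return col_defaults


def put_fixed(row, col_defaults=None):
    if col_defaults is None:
        col_defaults = {}  # fresh dict for each call
    else:
        col_defaults = dict(col_defaults)  # copy, so the caller's dict is untouched
    col_defaults.update(row)
    return col_defaults
```

With put_buggy(), a second call without col_defaults still sees the first call's entries; put_fixed() starts clean each time and leaves a caller-supplied dict unmodified.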
mappings/VegCore-VegBIEN.csv: source table mappings: Set shortname to env var $source when it's not explicitly specified, because shortname is a required field of source
db_xml.py: put(): Pass through the values of nodes which are text nodes
db_xml.py: put(): put_(): Support setDefault() values which are text nodes, by passing text strings through when put() is run on all col_defaults entries
db_xml.py: put(): _setDefault(): Support setting multiple col_defaults at once by using the param names themselves as the column names
dicts.py: DictProxy: Implemented delitem()
bin/map: update_in_label(): Removed hardcoded source_id col_default, which is now set in mappings/VegCore-VegBIEN.csv's output root
mappings/VegCore-VegBIEN.csv: Set the source_id col_default to the datasource name using the new _setDefault() built-in function and _env()
db_xml.py: put(): Added _setDefault() built-in function, which adds an entry to col_defaults
xml_func.py: _env(): Fixed bug where need to retrieve actual string value of name param using xml_dom.NodeTextEntryIter instead of NodeEntryIter
xml_func.py: _env(): Fixed bug where need to use xml_dom.replace_with_text() instead of xml_dom.replace() because replace() requires a DOM node
bin/map: update_in_label(): Set $source env var to the in_label (datasource name), to make it available to _env()
xml_func.py: Simplifying functions: Added _env()
Added inputs/VegBank/Source/, containing referenceType metadata
Added inputs/SpeciesLink/Source/, containing referenceType metadata
Added inputs/SALVIAS*/Source/, containing referenceType metadata
Added inputs/REMIB/Source/, containing referenceType metadata
Added inputs/GBIF/Source/, containing referenceType metadata
Added inputs/TEAM/Source/, containing referenceType metadata
Placed inputs/TEAM/_src/Vegetation-Tree-and-Liana-Metadata-1.5.pdf under version control
inputs/FIA/import_order.txt: Added Source, which needs to come before Organism
Added inputs/Madidi/Source/, containing referenceType metadata
Added inputs/FIA/Source/, containing referenceType metadata
Added inputs/CVS/Source/, containing referenceType metadata
Added inputs/CTFS/Source/, containing referenceType metadata
bin/map: Support map spreadsheets containing only metadata mappings (with no corresponding staging table), by falling back to an empty table when the named table does not exist
mappings/VegCore-VegBIEN.csv: institutionCode: Also map to the sourcename's matched source, which identifies whether the source is a herbarium
schemas/vegbien.sql: source: Made shortname NOT NULL to ensure that all datasources have a globally-unique short name
import_all: Added import of inputs/.herbaria/ before the main import
Added inputs/.herbaria/
input.Makefile: SVN: add: Also run %/add on all data subdirs
input.Makefile: Existing maps discovery: Moved tables discovery to its own section, above SVN so it can be used by SVN
mappings/VegCore.csv: referenceType: Fixed sort order
mappings/VegCore-VegBIEN.csv: Mapped referenceType
mappings/VegCore.csv: Added referenceType
mappings/VegCore-VegBIEN.csv: institutionCode: Remap to source.shortname when specimen information is not provided, as is the case for geoscrub.herbaria on nimoy
inputs/bien_web/observation/map.csv: Mapped observationID->occurrenceID
README.TXT: Datasource setup: Add input data for each table present in the datasource: Added step to run `make inputs/<datasrc>/<table>/install` if the table is in a .sql export
README.TXT: Datasource setup: MySQL inputs: Added step to install the export, which needs to happen before mapping individual tables
README.TXT: Datasource setup: Add input data for each table present in the datasource: Reworded "CSV" to cover multiple files, because there can be multiple CSV part files for one table
README.TXT: Datasource setup: Add input data for each table present in the datasource: Don't add a CSV or create.sql file for tables that are in a .sql export
README.TXT: Schema changes: Sync ERD with vegbien.sql schema: Changed instructions to just select tables with arrows next to them rather than all tables, because each table that's updated will have its lines reset and the number of lines that need to be fixed should be minimized
README.TXT: Datasource setup: Accept the test cases: `make inputs/<datasrc>/test by_col=1`: Clarified that errors could indicate bugs in the VegBIEN unique constraints
README.TXT: Data import: To remake analytical DB: Added explicit public schema setting since the analytical DB is often manually remade after the public schema has been renamed. Removed warnings that certain commands must be run after running make_analytical_db, because the "remake analytical DB" instructions no longer require this.
README.TXT: Datasource setup: MySQL inputs: Added steps to export the database to a PostgreSQL-compatible .sql file, which can be directly used by the install process without the need to export each table as CSV
README.TXT: Datasource setup: Choosing a table name: Documented that for .sql exports, you must use the name of the table in the DB export, not a suggested or custom name
input.Makefile: Staging tables installation: $(dbExports): Also include the files that would be generated by running _MySQL/*.make and creating the corresponding PostgreSQL translations
input.Makefile: Staging tables installation: Moved .sql export downloading and translation to separate Input data retrieval section
Added lib/MySQL.{data,schema}.sql.make templates to use in datasources' _MySQL/ dirs
inputs/import.stats.xls: Updated import times
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Changed to use Brad's formula, which concatenates genus and specific_epithet/morphospecies, and uses family if just the family is present, rather than using the full taxonomic name
mappings/VegCore-VegBIEN.csv: Concatenated taxonlabel: Don't prepend family if the taxonName/scientificName itself is the family, so that the family is not duplicated in the concatenated taxonomic name
schemas/functions.sql: _nullIf(): Removed NOT NULL constraint on null param, to support using a (nullable) column rather than a literal as the null-equivalent value
xml_func.py: Simplifying functions: Added _nullIf(), to remove calls with no null value
xml_dom.py: Added prune_parent()
schemas/functions.sql: Added _or()
schemas/functions.sql: Added _merge_words()
schemas/vegbien.sql: analytical_*: Renamed geosourceValid to geovalid. (It had gotten renamed in the reference -> source rename.)
mappings/VegCore.csv: Renamed georeferenceValid to geovalid
inputs/import.stats.xls: Updated import times. This now includes the Canadensys plants-related datasources HIBG, JBM, QFA, TRT, TRTE, UBC, VASCAN, and WIN.
Added inputs/HIBG/
Added inputs/JBM/
Added inputs/VASCAN/
Added inputs/WIN/
Added inputs/UBC/
Added inputs/TRTE/Specimen/
Added inputs/QFA/
Added inputs/TRT/
schemas/vegbien.sql: Allow bien_read to SELECT from all tables in the public schema
schemas/vegbien.sql: Allow bien_read to SELECT from analytical_aggregate, analytical_stem
lib/PostgreSQL-MySQL.csv: Removed GRANT/REVOKE because SCHEMA GRANTs are not supported in MySQL
pg_dump_vegbien: non-$owners mode: Removed --no-privileges in order to include GRANTs to other users
root Makefile: PostgreSQL: $(postgresReload-Linux): Making schemas/*.conf world-readable: Fixed bug where need to do this as the bien user, which owns the files
root Makefile: PostgreSQL: $(postgresReload-*): Make schemas/*.conf world-readable so it's readable by the postgres user, which the .conf installation is run as
root Makefile: PostgreSQL: $(postgresReload-*): Also install pg_hba.conf
root Makefile: PostgreSQL: Added postgres_reload to reload postgresql.conf and restart the DB
root Makefile: PostgreSQL: postgres-*: Factored postgresql.conf installation out into $(postgresReload-*)
schemas/: Synced pg_hba.conf and pg_hba.Mac.conf's bien entries, which adds phpPgAdmin support (template1 access) on the Mac and bien_read access on Linux
root Makefile: VegBIEN DB: DB and users: Also create bien_read user for read-only access to the DB
schemas/pg_hba.Mac.conf: Allow access to the bien group rather than just the bien user, which will include bien_read
schemas/pg_hba.Mac.conf: Fixed bug where also need to allow password-based logins from the same machine, in order to work with pgAdmin
schemas/vegbien.ERD.poster.pdf: Updated to 33x51in poster size and 0.25in margins
README.TXT: Schema changes: Creating a poster of the ERD: Added section with the State St FedEx Kinkos' rates for posters ($10.25/sq ft laminated)
README.TXT: Schema changes: Creating a poster of the ERD: Changed "Measure the fractional height of the text onscreen" to "Determine the poster size"