mappings/VegCore.csv: Terms: Removed namespace prefixes (dcterms:), because VegCore terms are globally unique within VegCore and there should not be multiple versions of the same VegCore term with different namespaces. Provenance is instead indicated in the Sources column, which contains not just a namespace but a full URL to each source term.
dict2redmine: Hyperlink each term to its anchor in the data dictionary, rather than to its first source, which is not necessarily the definitive definition of the term. This also allows clicking the term to get its permalink in the address bar, rather than having to click the small, light gray paragraph mark next to the term name that Redmine provides.
dict2redmine: redmine_add_links(): Fixed bug where need to avoid matching internal links ( ... ) as citations ( [...] )
mappings/VegCore.csv: Term names: Changed special characters to _ because Redmine doesn't support special characters in HTML anchors (it removes everything except letters, numbers, _, and -)
mappings/Makefile: .Veg+-VegCore.csv.last_cleanup: Also canon the output (VegCore) column to the VegCore.csv vocabulary. ? prefixes are not a problem because there are always at least two alternatives listed for these terms, so canon will not modify the output field.
psql_script_vegbien: Run psql_vegbien with `nice -n +5` to prevent CPU-intensive operations from slowing down the shell/UI
inputs/import.stats.xls: Updated import times
Regenerated inputs/CVS/taxonObservation_/new_terms.csv. Note that it includes mappings to terms which are not in mappings/VegCore-VegBIEN.csv, which are prefixed with *.
input.Makefile: Maps validation: %/new_terms.csv: Undid incorrect change of column to filter terms out of. This actually needs to be the input column, even though unmapped_terms.csv is generated from the output column, because it's possible to have a mapping to a term which is not in mappings/VegCore-VegBIEN.csv, and such a term would show up in unmapped_terms.csv but should not be filtered out of new_terms.csv.
lib/phpPgAdmin.login.php.diff: public_ user's password message: Print as its own message instead of appending it to $msg. Print it before any error message so it always appears at the top of the page.
root Makefile: PostgreSQL: phpPgAdmin: Edit config file to allow passwordless logins. Edit login page to fill in public_ as the default username and add a message to leave the password blank for that user.
root Makefile: $(postgresReload-*): Ignore `mv -n` errors, which generally indicate that the existing *.conf was already renamed to *.conf.old
Makefile mk_db, schemas/pg_hba*.conf: Added passwordless public_ user with access to just the database schema. Note that in PostgreSQL, only users with explicit GRANT permissions on a table can read data in that table, but all DB users with a login can view all table schemas.
README.TXT: Maintenance: system updates that affect PostgreSQL: Added that this applies to both Linux and Mac OS X
README.TXT: Maintenance: system updates that affect PostgreSQL: list of things that could break if PostgreSQL is not restarted: Added that you may not be able to access the database as the postgres superuser
backups/fix_perms: Removed world read permissions from backups dir. Note that this will require superuser permissions to view archived backups on jupiter, because the bien group is not set up with the same members as on vegbiendev. (On jupiter, it contains only stri,regetz,donoghue,naiamh.)
inputs/CVS/taxonObservation_/map.csv: Mapped plantname, plantNameWithAuthority
inputs/CVS/cvs.~.utils.sql: plantconcept_plantnames(): Use CVS's taxonLevel values, which are different from the VegBank plantLevel values that the original version of this function used
inputs/CVS/cvs.~.utils.sql: plantconcept_*(): Use plantConcept.lowestParentConcept_ID,taxonLevel instead of plantStatus.plantParent_ID,plantLevel to find the plantConcept's ancestors, because CVS does not use plantStatus except in very few cases and instead puts the parent link directly in plantConcept
inputs/VegBank/vegbank.~.utils.sql: plantconcept_plantnames(): Made function STABLE instead of VOLATILE because it does not modify any tables
inputs/CVS/cvs.~.utils.sql: plantconcept_plantnames(): Made function STABLE instead of VOLATILE because it does not modify any tables
mappings/VegCore.csv: Removed no longer used verbatimGrowthForm. Use growthForm instead.
mappings/VegCore-VegBIEN.csv: Removed no longer used verbatimGrowthForm. Map to growthForm instead and translate growth form values to VegBIEN's growthform enum.
inputs/Madidi/Organism/map.csv: Habit: Mapped growth form values
inputs/Madidi/Organism/map.csv: Remapped Habit from verbatimGrowthForm to growthForm, which points to the same place
inputs/CVS/taxonObservation_/map.csv: Use denorm_* denormalized taxonomic ranks in place of the normalized ranks when both are provided
input.Makefile: Maps validation: %/new_terms.csv: Fixed bug where need to filter unmapped_terms.csv's terms out of the output column, not the input column, because that's what the unmapped terms are generated from. Usually these columns are the same for unmapped terms, but sometimes an output term is changed from the original column's name but still doesn't match a VegCore term in mappings/VegCore-VegBIEN.csv.
input.Makefile: SVN: add: Added comment with instructions to update all inputs with these settings, using `make inputs/add`
input.Makefile: SVN: add: verify: Also ignore *.xlsx
README.TXT: Data import: Creating enough disk space: Added instructions for removing archived backups to free up space
inputs/CVS/taxonObservation_/map.csv: Fixed bug where taxonLevel, not taxonRank, needs to be mapped to taxonRank, because CVS's taxonRank is actually a number, while taxonLevel contains the corresponding text string
README.TXT: Data import: Before import, added step to make sure there is at least 100GB of disk space
sql_io.py: put_table(): is_function: Fixed bug where need to add the pkeys table's test pkey constraint after the data is added rather than when the empty table is created, to avoid adding a pkey constraint that will later be violated by data which returns multiple output rows for an input row (such as calls to _split())
sql_io.py: put_table(): insert_into_pkeys(): Allow callers to override run_query_into()'s add_pkey_ param in case the initial version of the pkeys table should not yet have the test pkey constraint (e.g. because data is added after the table is created)
README.TXT: Data import: Checking for errors: Search for "Command exited with non-zero status" to find errors, which is faster than checking that each input's log ends in "Encountered 0 error(s)"
README.TXT: Data import: import_all: Corrected text of note about time until control is returned to the shell
README.TXT: Data import: Moved download of logs to right after the import is done, because this is a quick step that doesn't depend on the backup- and export-creation steps
mappings/VegCore-VegBIEN.csv: institutionCode: Removed mapping to sourcename.matched_source_id, which is now autopopulated. Split any list of institutionCodes apart using new _split().
schemas/vegbien.sql: sourcename: Added sourcename_set_matched_source_id() trigger
schemas/functions.sql: Added _split()
schemas/vegbien.sql: sourcelist_unique: Removed COALESCE around name because it's NOT NULL
schemas/vegbien.sql: Allow multiple institutionCodes for each specimenreplicate by linking new sourcelist table many-to-many to source via sourcename (which is now a linking table)
schemas/vegbien.sql: sourcename: Removed system, which has been replaced by source_id as the scoping field
schemas/vegbien.sql: party: Added sourceaccessioncode and uniquify on it instead when provided. vegbien.ERD.mwb: Rearranged party-related tables to allow the tables to be fully expanded.
schemas/vegbien.sql: Renamed sampletype to observationtype to match the VegCore term
Added inputs/SALVIAS/salvias_users.~.clean_up.sql
inputs/SALVIAS/: Added salvias_users tables
my2pg: Translate blob to bytea
my2pg: Also remove UNIQUE and FULLTEXT inline indexes
mappings/VegCore.csv: UNUSED: Comments: Added Redmine formatting
mappings/VegCore.csv: OMIT: Changed "is omitted" to "should be omitted", because the mappings specify suggestions rather than requirements as to how a field should be used
mappings/VegCore.csv: Removed no longer used subInstitutionCode. Use datasource, institutionCode instead.
schemas/vegbien.sql: analytical_*: Renamed subInstitutionCode to institutionCode because this is the institution storing the specimen, as defined by DwC
schemas/vegbien.sql: analytical_*: Renamed institutionCode to datasource because this is actually the top-level datasource providing the record, not the institution storing the specimen
mappings/VegCore.csv: Added datasource
mappings/VegCore.csv: referenceType: Added closed list values
mappings/VegCore.csv: observationMeasure: Re-sourced to SALVIAS:observation_type, since SALVIAS comes before VegBIEN in the source precedence
mappings/VegCore.csv: Renamed sampleType to observationType to match the SALVIAS term it's derived from
inputs/SALVIAS-CSV/Plot/map.csv: Mapped observation_type
mappings/VegCore.csv: Added individualCount_*cm_or_more used by analytical_aggregate
mappings/VegCore.csv: subplotX/Y: Added definition
mappings/VegCore.csv: subplot*: Sources: Put SALVIAS:subplot last, because the specific field is closer in meaning to the term than the category
mappings/VegCore.csv: project*Date: Added definition
mappings/VegCore.csv: parentPlotName: Added definition
mappings/VegCore.csv: parent*: Sources: Put VegBank:PARENT_ID last, because the specific field is closer in meaning to the term than the category
mappings/VegCore.csv: locationName: Added definition
mappings/VegCore.csv: eventDate/startDate/endDate: Added definition
mappings/VegCore.csv: locationName: Sources: Put VegX:plotName first because it is closest in meaning to the term
mappings/VegCore.csv: recordedBy*: Added definition
mappings/VegCore.csv: recordedBy.middleName: Added source to DwC:recordedBy
inputs/CVS/: Joined together stemCount and stemLocation tables to create stemLocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)
inputs/CVS/: Joined together taxonImportance and stemCount tables to create stemCount_, because stemCount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename), and is thus an AggregateOccurrence-related table along with taxonImportance
inputs/CVS/taxonObservation_/map.csv: Fixed bug where need to indicate that data is plots data to prevent the specimenreplicate ID from being forwarded to the location ID
inputs/VegBank/taxonobservation_/map.csv: Fixed bug where need to indicate that data is plots data to prevent the specimenreplicate ID from being forwarded to the location ID
mappings/VegCore-VegBIEN.csv: Don't forward specimenreplicate IDs to location for plots data (where the specimenreplicate IDs apply only to the specimen)
xml_func.py: Simplifying functions: Added _eq()
xml_func.py: Added is_scalar()
xml_func.py: process(): row-based mode: preserving complex funcs: Fixed bug where functions with no params would crash reduce() because it requires at least one value when no initial value is specified
Added scalar.py
Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data
inputs/VegBank/: Joined together stemcount and stemlocation tables to create stemlocation_, in order to include the stem size class's measurements in each tagged stem's stemobservation (in addition to in the stemobservation for the aggregateoccurrence as a whole)
inputs/VegBank/stemlocation/map.csv: Also mapped stemlocation_id to individualID to create one plantobservation for each stemobservation
inputs/VegBank/stemlocation/map.csv: Remapped stemcount_id to aggregateOccurrenceID to match stemcount_id's mapping in stemcount_
inputs/VegBank/: Joined together taxonimportance and stemcount tables to create stemcount_, because stemcount actually stores stem abundance by size, rather than grouping stems by organism (http://vegbankdev.nceas.ucsb.edu/vegbank/views/dba_tabledescription_detail.jsp?view=detail&wparam=stemcount&entity=dba_tabledescription&where=where_tablename)
Added inputs/VegBank/_archive
input.Makefile: Testing: Added `%/test: %/test.xml` to allow testing just a subdir
input.Makefile: General targets: Added `%/: %/map.csv` to allow remaking just a subdirectory
inputs/CVS/: Refreshed data with new export from Bob
inputs/CVS/cvs-archive-2012-12-04.schema.sql: Fixed types using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Tools#MS-Access-database-MDB>
bin/map: Removed column names simplification, which was causing columns with the same alphanumeric characters but different punctuation to be simplified to the same name. Name simplification is now performed by the mapping mechanism itself, and can be overridden in the mappings.
Regenerated inputs/VegBank/new_terms.csv
Added inputs/NCU/_src/NCU_specimens_public_2012-12-10.zip.url
inputs/NCU/: Refreshed data with new export from Bob