import_all: Run all imports (not just the main datasources' import) with $import_source turned off, so that the Source tables will not be imported a second time when the datasource's main tables are imported. Note that it's not necessary to wait for asynchronous commands after the jobs for the main import are started (so that $import_source is not unset until after they are started), because with_all does not return until all jobs are started and have noted the $import_source setting in effect in the shell environment.
import_all: Source tables import: Fixed bug where the $all option to with_all must be used to also include special datasources starting with "."
make_analytical_db: Also create taxon_trait materialized view
inputs/*/*/map.csv: Reverted special OMIT mappings for input columns that have the same name as a VegCore table and have not yet been mapped to a VegCore term
mappings/Makefile: VegCore.csv: Filter out the VegCore tables so they are not matched as terms. This is necessary because some terms have the same name as a table, but the term should be the match rather than the table.
mappings/VegCore.csv: Changed line endings to \r\n to match the output of filter_out_ci
inputs/CTFS/TaxonOccurrence/map.csv: Mapped SpeciesAuthority
backups/Makefile: Synchronization: $(remote): Fixed bug where the path needs a trailing /
backups/Makefile: Synchronization: $(remote): Updated path to backups
README.TXT: Data import: On jupiter: Updated path to backups
README.TXT: Installation: Added command to change to the directory of the checked out files
README.TXT: Installation: Added command to check out files from svn
schemas/vegbien.sql: Added taxon_trait materialized view
mappings/Veg+-VegCore.csv: Sources: Removed redundant bien2_ prefix from bien2_staging subnamespace
schemas/vegbien.sql: trait: trait_unique: Removed value and units because there should only be one value of a trait for each taxonoccurrence
schemas/vegbien.sql: Reattached trait to taxonoccurrence instead of taxonlabel, because the TraitObservation traits data is actually associated with a particular occurrence (plant observation complete with location, date, etc.), rather than just a taxon
Added inputs/bien2_traits/
mappings/VegCore-VegBIEN.csv: Mapped traits-related DwC terms measurementType, measurementValue, measurementUnit
schemas/vegbien.ERD.mwb: Added trait table to ERD
schemas/vegbien.sql: trait: Added trait_unique unique index
schemas/vegbien.sql: trait: Added units field
schemas/vegbien.sql: trait: Renamed type to name because TraitObservation stores trait names rather than types
schemas/vegbien.sql: trait: Linked to taxonlabel instead of stemobservation, because TraitObservation's traits are taxon-level and stem-level traits currently go in named fields instead of a stem traits table
inputs/.TNRS/tnrs_*/map.csv: Remapped Source to OMIT so it won't match to the Source table
inputs/.TNRS/tnrs_other/map.csv: Updated for new VegCore terms, which include Source as a table name. This field will need to be remapped so it doesn't collide with the table name.
inputs/import.stats.xls: Updated import times
README.TXT: Data import: Added step to check that the source table contains entries for all inputs
Regenerated vegbien.ERD exports
make_analytical_db: Also populate owner_count
make_analytical_db: Generate provider_count before analytical_aggregate because it's much faster
schemas/vegbien.sql: Added materialized view owner_count, generated from owner_count_view
make_analytical_db: Also populate provider_count
schemas/vegbien.sql: Added materialized view provider_count, generated from provider_count_view
schemas/vegbien.sql: Added provider_count_view for counts of occurrences per top-level provider
Regenerated mappings/VegCore.htm
schemas/vegbien.sql: provider_view: Sort NULL sourcetype last
schemas/vegbien.sql: Added provider_view, which combines source and sourcename
schemas/vegbien.sql: sourcename: Gave public_ SELECT permissions
README.TXT: Maintenance: VegCore data dictionary: Regenerate everything in mappings/ that changes when VegCore.htm changes (such as VegCore.tables.redmine) instead of just VegCore.csv
inputs/*/Source/map.csv without mappings: Added referenceType, etc. mappings. This also ensures that the source table entry for the datasource will be created before the herbaria list is imported, causing all top-level datasources to sort at the top of the source table.
schemas/vegbien.sql: Granted the public_ user read-only access to the contents of the source table
root Makefile: PostgreSQL: $(editPhppgadmin): Ignore errors if patch has already been applied
lib/phpPgAdmin.config.inc.php.diff: Removed context so that segment matching depends only on the $conf['extra_login_security'] line itself
mappings/Makefile: Added VegCore.tables.redmine, which contains the Redmine-formatted list of VegCore tables to paste into <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore#Tables>
mappings/: Removed no longer used VegCore.redmine. VegCore.csv is now generated from the Redmine page instead of the other way around.
mappings/Makefile: Added VegCore.tables.csv, which contains all the tables in the VegCore data dictionary
README.TXT: Data import: backups/fix_perms: Run using sudo to also change permissions on files owned by the bien user, and to change the owner of files owned by you to the bien user
Regenerated mappings/VegCore.csv, which adds categories
README.TXT: Maintenance: Added instructions to regenerate mappings/VegCore.csv whenever the VegCore data dictionary page is changed
mappings/Makefile: Generate VegCore.csv from the VegCore data dictionary page by extracting all HTML anchors (in Redmine, each section heading, and therefore each VegCore term, gets its own anchor)
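
The anchor-extraction idea in the entry above can be sketched in Python. This is only an illustration, not the Makefile's method, which uses sed. The names AnchorCollector and extract_anchor_names are hypothetical, and the sketch assumes each heading anchor appears in the page as an <a name="..."> element.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects the name attribute of every <a name="..."> element."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # Assumption: heading anchors are <a> tags carrying a name attribute.
        # Ordinary hyperlinks (<a href="...">) have no name and are skipped.
        if tag == "a":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


def extract_anchor_names(html):
    """Return the anchor names found in html, in document order."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.names
```

Each returned name would then become one row of the generated term list.
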
mappings/VegCore.csv: Changed line endings to \n to match what sed generates from the VegCore data dictionary page
mappings/VegCore.csv: Removed informational columns, because this information is now maintained on the VegCore data dictionary page at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore>
mappings/Veg+-VegCore.csv: Removed hypothetical terms which are not in use by any VegBIEN datasource
mappings/Veg+-VegCore.csv: habit: Remapped to growthForm, which replaces verbatimGrowthForm
mappings/VegCore.csv: BIEN2 terms: Added sub-namespaces (bien_web, geoscrub, etc.) to source URLs
dict2redmine: redmine_add_links(): Hyperlink just the source name, not also the () around it
dict2redmine: RedmineDictWriter: Use h2 instead of h3 for the term name so that the term will be normal-sized instead of smaller in the Redmine table of contents
dict2redmine: Renamed redmine_url() to redmine_link() because it generates links, not URLs
dict2redmine: redmine_add_links(): Put citations in () instead of [] to avoid conflicting with the Redmine syntax for internal links ([[...]])
mappings/VegCore.csv: Terms: Removed namespace prefixes (dcterms:), because VegCore terms are globally unique within VegCore and there should not be multiple versions of the same VegCore term with different namespaces. Provenance is instead indicated in the Sources column, which contains not just a namespace but a full URL to each source term.
dict2redmine: Hyperlink each term to its anchor in the data dictionary, rather than to its first source, which is not necessarily the definitive definition of the term. This also allows clicking the term to get its permalink in the address bar, rather than having to click the small, light gray paragraph mark next to the term name that Redmine provides.
dict2redmine: redmine_add_links(): Fixed bug where internal links ([[...]]) were matched as citations ([...])
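
One way to tell single-bracket citations apart from Redmine internal links, which are written with double brackets ([[...]]), is a regular expression with lookarounds. The sketch below is not the actual redmine_add_links() code. The name parenthesize_citations is hypothetical, and the sketch only rewrites the brackets as parentheses without adding hyperlinks.

```python
import re

# A single-bracket citation: "[" not preceded by "[", then bracket-free text,
# then "]" not followed by "]". Double-bracket internal links never match.
_CITATION = re.compile(r"(?<!\[)\[([^\[\]]+)\](?!\])")


def parenthesize_citations(text):
    """Rewrite [citation] as (citation), leaving [[internal links]] untouched."""
    return _CITATION.sub(r"(\1)", text)
```

The lookbehind and lookahead are what keep the inner brackets of a double-bracket link from being treated as a citation.
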
mappings/VegCore.csv: Term names: Changed special characters to _ because Redmine doesn't support special characters in HTML anchors (it removes everything except letters, numbers, _, and -)
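
A minimal sketch of the renaming rule described above, assuming that "letters" means ASCII letters. The function name anchor_safe is hypothetical and is not part of the repository.

```python
import re

# Everything except ASCII letters, digits, "_" and "-" counts as a special
# character here (assumption: Redmine's "letters" are ASCII letters).
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]")


def anchor_safe(term):
    """Replace each special character with "_" so the term survives as an anchor."""
    return _UNSAFE.sub("_", term)
```

Replacing the characters rather than letting Redmine drop them keeps each term name identical to its anchor.
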
mappings/Makefile: .Veg+-VegCore.csv.last_cleanup: Also canon the output (VegCore) column to the VegCore.csv vocabulary. ? prefixes are not a problem because there are always at least two alternatives listed for these terms, so canon will not modify the output field.
psql_script_vegbien: Run psql_vegbien with `nice -n +5` to prevent CPU-intensive operations from slowing down the shell/UI
Regenerated inputs/CVS/taxonObservation_/new_terms.csv. Note that it includes mappings to terms which are not in mappings/VegCore-VegBIEN.csv, which are prefixed with *.
input.Makefile: Maps validation: %/new_terms.csv: Undid incorrect change of column to filter terms out of. This actually needs to be the input column, even though unmapped_terms.csv is generated from the output column, because it's possible to have a mapping to a term which is not in mappings/VegCore-VegBIEN.csv, and such a term would show up in unmapped_terms.csv but should not be filtered out of new_terms.csv.
lib/phpPgAdmin.login.php.diff: public_ user's password message: Print as its own message instead of appending it to $msg. Print it before any error message so it always appears at the top of the page.
root Makefile: PostgreSQL: phpPgAdmin: Edit config file to allow passwordless logins. Edit login page to fill in public_ as the default username and add a message to leave the password blank for that user.
root Makefile: $(postgresReload-*): Ignore `mv -n` errors, which generally indicate that the existing *.conf was already renamed to *.conf.old
Makefile mk_db, schemas/pg_hba*.conf: Added passwordless public_ user with access to just the database schema. Note that in PostgreSQL, only users with explicit GRANT permissions on a table can read data in that table, but all DB users with a login can view all table schemas.
README.TXT: Maintenance: system updates that affect PostgreSQL: Added that this applies to both Linux and Mac OS X
README.TXT: Maintenance: system updates that affect PostgreSQL: list of things that could break if PostgreSQL is not restarted: Added that you may not be able to access the database as the postgres superuser
backups/fix_perms: Removed world read permissions from backups dir. Note that this will require superuser permissions to view archived backups on jupiter, because the bien group is not set up with the same members as on vegbiendev. (On jupiter, it contains only stri,regetz,donoghue,naiamh.)
inputs/CVS/taxonObservation_/map.csv: Mapped plantname, plantNameWithAuthority
inputs/CVS/cvs.~.utils.sql: plantconcept_plantnames(): Use CVS's taxonLevel values, which are different from the VegBank plantLevel values that the original version of this function used
inputs/CVS/cvs.~.utils.sql: plantconcept_*(): Use plantConcept.lowestParentConcept_ID,taxonLevel instead of plantStatus.plantParent_ID,plantLevel to find the plantConcept's ancestors, because CVS does not use plantStatus except in very few cases and instead puts the parent link directly in plantConcept
inputs/VegBank/vegbank.~.utils.sql: plantconcept_plantnames(): Made function STABLE instead of VOLATILE because it does not modify any tables
inputs/CVS/cvs.~.utils.sql: plantconcept_plantnames(): Made function STABLE instead of VOLATILE because it does not modify any tables
mappings/VegCore.csv: Removed no longer used verbatimGrowthForm. Use growthForm instead.
mappings/VegCore-VegBIEN.csv: Removed no longer used verbatimGrowthForm. Map to growthForm instead and translate growth form values to VegBIEN's growthform enum.
inputs/Madidi/Organism/map.csv: Habit: Mapped growth form values
inputs/Madidi/Organism/map.csv: Remapped Habit from verbatimGrowthForm to growthForm, which points to the same place
inputs/CVS/taxonObservation_/map.csv: Use denorm_* denormalized taxonomic ranks in place of the normalized ranks when both are provided
input.Makefile: Maps validation: %/new_terms.csv: Fixed bug where unmapped_terms.csv's terms need to be filtered out of the output column, not the input column, because that's what the unmapped terms are generated from. Usually these columns are the same for unmapped terms, but sometimes an output term is changed from the original column's name but still doesn't match a VegCore term in mappings/VegCore-VegBIEN.csv.
input.Makefile: SVN: add: Added comment with instructions to update all inputs with these settings, using `make inputs/add`
input.Makefile: SVN: add: verify: Also ignore *.xlsx
README.TXT: Data import: Creating enough disk space: Added instructions for removing archived backups to free up space
inputs/CVS/taxonObservation_/map.csv: Fixed bug where taxonLevel, not taxonRank, needs to be mapped to taxonRank, because CVS's taxonRank is actually a number, while taxonLevel contains the corresponding text string
README.TXT: Data import: Before import, added step to make sure there is at least 100GB of disk space
sql_io.py: put_table(): is_function: Fixed bug where the pkeys table's test pkey constraint needs to be added after the data is added rather than when the empty table is created, to avoid adding a pkey constraint that will later be violated by data which returns multiple output rows for an input row (such as calls to _split())
sql_io.py: put_table(): insert_into_pkeys(): Allow callers to override run_query_into()'s add_pkey_ param in case the initial version of the pkeys table should not yet have the test pkey constraint (e.g. because data is added after the table is created)
README.TXT: Data import: Checking for errors: Search for "Command exited with non-zero status" to find errors, which is faster than checking that each input's log ends in "Encountered 0 error(s)"
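
The error check in the entry above can be sketched as a small Python helper. The function names, and the assumption that import logs are files matching *.log under one directory tree, are illustrative only. The marker string is the one quoted in the entry.

```python
from pathlib import Path

ERROR_MARKER = "Command exited with non-zero status"


def log_has_error(text):
    """Return True if the log text contains the failure marker."""
    return ERROR_MARKER in text


def failed_logs(root, pattern="*.log"):
    """Return the log files under root (assumed layout) that contain the marker."""
    failed = []
    for path in sorted(Path(root).rglob(pattern)):
        # errors="replace" keeps the scan going if a log has undecodable bytes.
        if log_has_error(path.read_text(errors="replace")):
            failed.append(path)
    return failed
```

Calling failed_logs("inputs") would then list only the logs that need a closer look, instead of opening every log to confirm that it ends in "Encountered 0 error(s)".
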