Added inputs/.geoscrub/_src/ to store Jim's geoscrub CSV
schemas/functions.sql: _map(): Changed error message for an unmapped value to "Value not in map" rather than "Invalid map value", because an unmapped value is not necessarily explicitly invalid
inputs/VegBank/plot_/map.csv: confidentialitystatus filter: Merged the mapping for 0 with the other public-equivalent values. Note that fuzzed plots are still public, because the private columns have been removed.
inputs/VegBank/plot_/map.csv: Mapped confidentialitystatus to dcterms:accessRights with an appropriate _map filter
mappings/VegCore-VegBIEN.csv: Mapped dcterms:accessRights
schemas/functions.sql: _map(): Raise data_exception if value not in map and no default provided (not the same as a NULL default value)
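The distinction between "no default provided" and "a NULL default" can be sketched in Python with a sentinel. This is an illustration only, not the DB function itself: map_value(), DataException, and the sample mapping are hypothetical names, and how the real _map() receives its default is not stated in this log.

```python
_NO_DEFAULT = object()  # sentinel: tells "no default given" apart from a None (NULL) default


class DataException(ValueError):
    """Hypothetical stand-in for PostgreSQL's data_exception error condition."""


def map_value(mapping, value, default=_NO_DEFAULT):
    """Look up value in mapping, falling back to default only if one was provided."""
    if value in mapping:
        return mapping[value]
    if default is _NO_DEFAULT:
        # An unmapped value is not necessarily invalid, hence the neutral message
        raise DataException('Value not in map: ' + repr(value))
    return default  # may legitimately be None, i.e. a NULL default


access_map = {'0': 'public', '1': 'private'}  # hypothetical sample mapping
```

With this sketch, map_value(access_map, '9', default=None) returns None, while map_value(access_map, '9') raises DataException.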
mappings/VegCore-VegBIEN.csv: verbatimGrowthForm: Removed _map filter, which applied only to SALVIAS and has now been moved to the applicable SALVIAS tables
inputs/SALVIAS*/plotObservations/map.csv: Remapped Habit to growthForm with _map filter applied
sql_io.py: put_table(): Special handling for functions with hstore params: Fixed bug where need to unwrap literal values of mapping, which might be sql_gen.Literal objects
sql_gen.py: Added get_value()
dicts.py: join(): Added support for unhashable types, which are passed through. This adds support for SQL literal values which are dicts (hstores).
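A minimal sketch of the pass-through behavior, assuming join() maps each value of the first dict through the second dict. The actual signature and fallback behavior of dicts.join() are not given in this log, so the names and the choice to also pass through unmatched hashable values are assumptions.

```python
def join(dict0, dict1):
    """Map each value of dict0 through dict1, passing through values that can't be looked up."""
    def lookup(value):
        try:
            # Assumption: hashable values with no match in dict1 are also passed through
            return dict1.get(value, value)
        except TypeError:
            # Unhashable value (e.g. a dict standing in for an hstore literal):
            # it can't be a dict key, so pass it through unchanged
            return value
    return {key: lookup(value) for key, value in dict0.items()}


params = {'value': 'col1', 'map': {'0': 'public'}}  # hypothetical inputs
renames = {'col1': 'renamed_col'}
joined = join(params, renames)
```

Here joined keeps the dict under 'map' unchanged and replaces 'col1' with 'renamed_col'.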
xml_func.py: Removed no longer used _map(), which has been replaced by a corresponding DB function
schemas/functions.sql: Added _map(), which uses the new hstore functionality. This expands _map() functionality to column-based import.
root Makefile: VegBIEN DB: DB and bien user: mk_db: hstore extension: Fixed bug where need to use `CREATE EXTENSION hstore SCHEMA pg_catalog` instead of createlang, because hstore must be explicitly created in pg_catalog. Otherwise it is created in the public schema, so it gets deleted every time the public schema is reinstalled, and the delete cascades to everything (including in other schemas) that uses hstore.
sql_io.py: put_table(): Added special handling for functions with hstore params. Note that although _map() doesn't exist yet as a DB function, this code must be in place before _map() is created to avoid param type mismatch errors.
root Makefile: PostgreSQL: postgres-Linux: Changed plpython to plpython3 in order to install plpython3u
schemas/py_functions.sql: _date(): Removed features that require dateutil, which is not available under plpython3u. This includes removing the now-unused date string parameter.
mappings/VegCore-VegBIEN.csv: Removed _date/date, because _date using a string date argument is no longer supported under plpython3u (dateutil is missing). Note that PostgreSQL's own date parsing is sufficient for most dates, so this use of _date is not strictly necessary and removing it will improve import times.
schemas/py_functions.sql: Replaced xrange() with range() for plpython3u
root Makefile: Python: python-Linux: Also install python3, needed by plpython3u
schemas/py_functions.sql: Updated except clause syntax for Python 3 (plpython3u), which is what PostgreSQL 9.1.6 uses
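The except clause and xrange() updates for plpython3u are ordinary Python 2 to Python 3 syntax changes. A minimal sketch follows; to_int() is a hypothetical helper for illustration, not a function from schemas/py_functions.sql.

```python
def to_int(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return int(text), None
    except ValueError as e:  # Python 2 also accepted `except ValueError, e:`; Python 3 rejects it
        return None, str(e)


# Python 3 has no xrange(); range() is already a lazy sequence
indexes = list(range(3))
```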
schemas/*.sql: Updated for PostgreSQL 9.1.6, which has standard_conforming_strings = on (which affects \-escapes in string literals), escape_string_warning not explicitly set, and uses ALTER TABLE ONLY instead of ALTER TABLE
README.TXT: Removed step to manually run make_analytical_db, now that this is done automatically by import_all. Added separate instructions to remake the analytical DB.
import_all: Change to the main directory, which is where make targets are run from. Use relative paths to bin/ commands, which is possible now that the current dir is set.
import_all: Create a background process that waits until the import is done and then runs make_analytical_db
Added waitpid
import_all: Documented that `wait %1` waits for asynchronous commands
root Makefile: VegBIEN DB: DB and bien user: mk_db: Also install hstore extension. Note that this is only supported by PostgreSQL 9.1+.
input.Makefile: Editing import: Updated queries for current schema
inputs/.geoscrub/geoscrub_cultivated/create.sql: Fixed bug where need to filter out NULL lat/longs because primary keys can't contain NULL values
schemas/py_functions.sql: Changed function languages to plpython3u to match the new installed version. Note that plpythonu is not available on Mac under PostgreSQL 9.1.6.
reinstall_all: Fixed bug where also need to include datasources starting with . such as .TNRS/, by using with_all's new $all option
with_all: Added $all option to also include datasources starting with . such as .TNRS/. This is necessary for reinstall_all, which needs to install all datasources.
root Makefile: PostgreSQL: $(pg_ctl-*): Fixed bug where need to pause for a few seconds after restarting PostgreSQL, to wait for the server to be ready to accept connections
root Makefile: Installation: uninstall: Removed inputs/uninstall because the DB will be uninstalled anyway, so the inputs don't need to be individually removed first
schemas/postgresql.Mac.conf: Added back unix_socket_directory setting, which is apparently still needed in PostgreSQL 9.1.6
root Makefile: PostgreSQL: postgres-Linux: Also install postgresql.conf
root Makefile: PostgreSQL: postgres-Darwin: Also install postgresql.Mac.conf
root Makefile: PostgreSQL: $(macUsePostgresLib): Factored out PostgreSQL dir to $(macPostgresDir)
schemas/postgresql.Mac.conf: Updated to PostgreSQL 9.1.6's postgresql.conf
root Makefile: Datasources: inputs/install: Fixed bug where need to `wait` after `. bin/reinstall_all` to wait for inputs to finish installing before installing the public schema. This is necessary because views in the public schema now have dependencies on some datasources, such as TNRS.
root Makefile: PostgreSQL: $(psqlAsAdmin): Use new $(asAdmin)
root Makefile: VegBIEN DB: Schemas: schemas/public/install: Use $(psqlNoSearchPath) instead of $(psqlAsBien) because the search_path is set by vegbien.sql
root Makefile: Datasources: Added inputs/install override which runs `. bin/reinstall_all` instead, in order to install all datasources simultaneously
root Makefile: Python: python-Darwin: Added instructions to install Python 3.2 (Python 2 comes with Mac OS X, but Python 3.2 is needed for plpython3u)
root Makefile: VegBIEN DB: DB and bien user: mk_db: Updated for PostgreSQL 9.1.6 on the Mac, which only provides plpython3u (Python 3)
root Makefile: VegBIEN DB: DB and bien user: mk_db: Updated for PostgreSQL 9.1.6, which requires the DB name to be specified on the command line instead of in the $PGDATABASE env var set by postgres_vegbien. Fixed bug where need to run createlang as postgres superuser, because plpythonu is an untrusted language (with unrestricted access to the entire DB).
root Makefile: PostgreSQL: postgres-Darwin: Updated for PostgreSQL 9.1.6, which requires some /usr/lib/ symlinks to be changed to newer versions installed in the PostgreSQL lib/ dir
input.Makefile: $(psqlAsBien), csv2db: Turn off the automatic search_path where needed, because when the input is installed, the schemas in it may not exist yet
schemas/vegbien.sql: place: Renamed geosource_valid to geovalid. (It had gotten renamed in the reference -> source rename.)
schemas/vegbien.sql: location: Renamed confidentialitystatus->accesslevel, confidentialityreason->accessconditions to match the corresponding fields in source. Note that accessconditions stores more than confidentialityreason did, because it can contain details about the accesslevel in addition to the reason for it.
schemas/vegbien.sql: source.accesslevel, location.confidentialitystatus: Changed type to accesslevel
schemas/vegbien.sql: Added accesslevel enum
inputs/import.stats.xls: Updated import times
Regenerated vegbien.ERD exports
schemas/vegbien.sql: Renamed reference -> source to make this table more broadly applicable, and because this now stores the datasource metadata
schemas/vegbien.sql: referencename: Scope it by top-level datasource, because institutionCodes (which map to this field) are not globally unique. This involves renaming the previous reference_id field, which was for the matched reference, to matched_reference_id, to allow a scoping reference_id field.
mappings/VegCore-VegBIEN.csv: Made taxonoccurrence.verbatimcollectorname an fkey to party, and renamed it to collector_id
inputs/VegBank/taxonobservation_/map.csv: Mapped new givenname, surname (from collector_id's party) to recordedBy
inputs/VegBank/taxonobservation_/create.sql: Also join to collector_id's party to include collector name
inputs/VegBank/vegbank.~.clean_up.sql: Rename taxoninterpretation.party_id to taxoninterpretation_party_id to make it globally unique when joining taxoninterpretation to other tables
inputs/VegBank/vegbank.~.clean_up.sql: Rename party.d_obscount to party_d_obscount to make it globally unique when joining with other tables
input.Makefile: Existing maps discovery: $(allTables): Fixed bug where need to remove extra whitespace before $(tables) when there are no $(joinedTables)
lib/mappings.Makefile: Checking if $(termsSubdirs) defined: Fixed bug where can't use ifndef because that is also true when the variable is defined but empty, not only when it is undefined. Need to use `ifeq ($(origin var),undefined)` instead.
inputs/TEAM/V*/map.csv: Omit *Method, because it just contains "Derived" for a small fraction of the rows
inputs/SALVIAS/: Updated to new salvias_plots export on nimoy, which has a different schema
inputs/SALVIAS/salvias_plots.~.clean_up.sql: Moved "Ensure globally unique column names" to end to match VegBank order
my2pg: *int types: Added mediumint
Placed inputs/SALVIAS/_archive/ under version control
inputs/SALVIAS/salvias_plots.~.clean_up.sql: Remove private data that should not be publicly visible, indicated by plotMetadata.AccessCode = 1
inputs/SALVIAS/salvias_plots.~.clean_up.sql: Enable cascading deletes by adding the necessary fkeys
Added inputs/SALVIAS/_src/salvias_data_access_controls.txt
inputs/.geoscrub/import_order.txt: Fixed bug where geoscrub_cultivated needs to be installed after geoscrub_cleaned_unique, not before as it would be with the default alphabetical sort order
inputs/.geoscrub/geoscrub_cultivated/: Use _no_import file to exclude geoscrub_cultivated from the import, because it's used directly as a lookup table by analytical_stem rather than being imported. This ensures that there is no import log or input row count for geoscrub_cultivated in the import times, which would skew the import row count because the row count would be included even though no columns are mapped.
input.Makefile: $(tables): Fixed bug where need to use $(importTables) instead of $(tables) in all places that should use only imported tables, rather than just in the import process itself
input.Makefile: Import to VegBIEN: Added support for tables which should be installed but not imported, but which must be installed after tables which are imported rather than before. This currently applies to geoscrub.geoscrub_cultivated, which depends on geoscrub_cleaned_unique (and therefore must be installed after it), but which should not be imported because it's used directly as a lookup table by analytical_stem.
inputs/VegBank/vegbank.~.clean_up.sql: Documented that plots with confidentialitystatus >= 4 are not deleted if their embargos have already expired. This applies to the Shenandoah NP data, which has confidentialitystatus = 5 but is no longer embargoed according to the embargo table
inputs/SALVIAS/: Mapped unmapped fields with a VegCore/VegBIEN equivalent. plotMetadata_/: Remapped life_zone to communityID because it is now alt-ed together with vegetation*, and thus not just a description with life_zone_code as its globally unique name.
schemas/vegbien.sql: referencetype: Added terms from reference.referencetype closed list in VegBank data dictionary. Cited sources in comment.
schemas/vegbien.sql: reference.referencetype: Changed type to referencetype enum
schemas/vegbien.sql: Added referencetype enum, containing VegBank's values in reference.referencetype as well as values for bien_web.datasource.aggregatorOrPrimary and bien_web.dataSourceNormalized.isHerbarium,isAggregator
specimenreplicate: Made institution_id an fkey to referencename instead of party, to later be matched up with reference entries for each aggregator's subprovider
schemas/vegbien.sql: referencename: Added referencename_unique unique index on name
schemas/vegbien.sql: referencename: Made reference_id optional so it can be populated later when referencenames are scrubbed
schemas/vegbien.sql: referencename: Renamed identifier to name because it is specifically any name for the reference, not necessarily an ID
schemas/vegbien.sql: Renamed referencealtident to referencename to allow any verbatim reference name to go here, with reference containing the corresponding accepted reference name
schemas/vegbien.sql: reference: Added accesslevel, accessconditions from bien_web.datasource
schemas/vegbien.sql: address: Added street2 from bien_web.party.address2
schemas/vegbien.sql: address: Renamed fields to bien_web.party names
schemas/vegbien.sql: party: Added department from bien_web.party
inputs/SALVIAS/plotMetadata_/map.csv: Mapped lookup_MethodCode_Description to new observationMeasure
schemas/vegbien.sql: method: Made name optional when description or observationmeasure is specified
schemas/vegbien.sql: method: method_unique: Include observationmeasure since the method name sometimes is not globally unique (e.g. in SALVIAS)
mappings/VegCore-VegBIEN.csv: Mapped observationMeasure
mappings/VegCore.csv: observationMeasure: Added source to DwC samplingProtocol
mappings/VegCore.csv: Added observationMeasure