Regenerated vegbien.ERD exports
schemas/vegbien.sql: analytical_*: Removed NOT NULL constraint on dateCollected
schemas/vegbien.sql: source: Added sampletype field to indicate a plot or specimen datasource
schemas/vegbien.sql: sync_analytical_*_to_view(): Added NOT NULL constraints
make_analytical_db: Added step to create darwin_core materialized view
inputs/*/Source/map.csv for non-herbaria: Mapped sampleType
inputs/.herbaria/herbaria/map.csv: Set sampleType to "specimen"
mappings/VegCore-VegBIEN.csv: Mapped sampleType
mappings/VegCore.csv: Added sampleType
schemas/vegbien.sql: Added sampletype enum
root Makefile: $(postgresReload-*): Confirm the operation before continuing, since it involves changing PostgreSQL config files in nontrivial ways. Added instructions for setting kernel.shmmax to at least 4GB minus 1 byte on Linux, to work with the shared_buffers setting in postgresql.conf.
schemas/postgresql.conf: shared_buffers: Documented that it must be less than ~95% of SHMMAX
schemas/vegbien.sql: analytical_stem_view: identifiedBy: Fixed bug where party.fullname needs to be used instead of the name components, because the name is now mapped to fullname
schemas/vegbien.sql: analytical_stem_view, darwin_core_view: dateCollected: Use the parent plot event's obsstartdate when the subplot event does not have its own obsstartdate
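The parent-date fallback described in the entry above can be sketched with COALESCE over a self-join. This is only an illustration using Python's built-in sqlite3 module, not the actual view definition: the parent_id column and the sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory database standing in for the real schema (hypothetical columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locationevent (
    locationevent_id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES locationevent,  -- assumed link to the plot event
    obsstartdate TEXT
);
INSERT INTO locationevent VALUES (1, NULL, '2012-06-01');  -- plot event
INSERT INTO locationevent VALUES (2, 1, NULL);             -- subplot, no own date
INSERT INTO locationevent VALUES (3, 1, '2012-06-03');     -- subplot, own date
""")

# Prefer the event's own obsstartdate; fall back to the parent's when it is NULL.
rows = conn.execute("""
SELECT e.locationevent_id,
       COALESCE(e.obsstartdate, p.obsstartdate) AS dateCollected
FROM locationevent e
LEFT JOIN locationevent p ON p.locationevent_id = e.parent_id
ORDER BY e.locationevent_id
""").fetchall()

for row in rows:
    print(row)
```

Subplot 2 picks up the plot event's date, while subplot 3 keeps its own.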
schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date or non-current taxondeterminations
schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date
schemas/vegbien.sql: Added darwin_core_view
schemas/vegbien.sql: sync_analytical_*_to_view(): Added CREATE INDEX statements
README.TXT: Data import: Added steps to publish analytical DB on nimoy.bien_web
schemas/vegbien.sql: analytical_stem_view: Changed JOINs to LEFT JOINs to include occurrences without taxondeterminations
export_analytical_db: Use 'NULL' as the NULL value instead of \N, because MySQL has problems with \N
publish_analytical_db: Load to bien3_adb instead of bien_web
README.TXT: Data import: Added step to export analytical DB
root Makefile: $(postgres-Linux): Fixed bug where $(asAdmin) is needed before the commands that rename existing *.conf files
root Makefile: $(postgres-Linux): Also install postgresql-contrib, which contains the hstore extension
Added inputs/NVS/
inputs/CVS/Organism/map.csv: Mapped accordingTo to "Weakley 2006"
inputs/NY/Specimen/map.csv: Omit UniqueNYInternalRecordNumber to avoid confusion since this is an internal-only ID. This makes InstitutionCode+CollectionCode+CatalogNumber the globally unique identifier instead.
README.TXT: Added Datasource refreshing section with instructions for refreshing VegBank
schemas/vegbien.sql: Renamed taxonconcept.concept_source_id back to concept_reference_id
schemas/vegbien.sql: Renamed soilobs to soilsample per working group discussion
input.Makefile: SVN: add: verify: Fixed bug where a $ prefix is needed before the string in order for the newline to be parsed
inputs/NY/verify/: svn:ignore .csv files
input.Makefile: SVN: add: Also svn:ignore .csv files
export_analytical_db: Export NULL as \N to work with MySQL
schemas/vegbien.sql: analytical_*: Added index on NOT NULL columns, starting with institutionCode
schemas/vegbien.sql: analytical_*: Removed primary keys and NOT NULL constraints on columns that sometimes have NULL values
publish_analytical_db: Added CSV dialect information
root Makefile: PostgreSQL: $(postgresReload-*): Rename existing *.conf to *.conf.old
publish_analytical_db: Use LOAD DATA LOCAL INFILE instead of LOAD DATA INFILE to avoid needing FILE permissions on bien_web
Added publish_analytical_db
export_analytical_db: Append the public schema version to the CSV filename
backups/Makefile: $(rsyncBackups): Added *.csv
Added export_analytical_db
backups/: Ignore _* and *.csv
make_analytical_db: mk_analytical_table(): Use explicit schema references everywhere. This fixes a bug where the TRUNCATE/INSERT steps on the public schema's table would reference the analytical_db view instead because they were not schema-scoped.
make_analytical_db: mk_analytical_table(): Factored table references in different schemas out into vars
schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
make_analytical_db: Moved set -x () around just psql_verbose_vegbien so embedded $() expressions wouldn't also be in set -x (verbose) mode
make_analytical_db: Fixed bug where bash needs to be used instead of sh, because vegbien_dest requires it
make_analytical_db: Factored analytical_* table creation code out into mk_analytical_table() function
make_analytical_db: Create analytical_db views pointing to the analytical_* versions in the public schema
vegbien_dest: $schemas: Removed analytical_db because views that will be added to it were shadowing public schema tables with the same names during population of those tables in make_analytical_db
vegbien_dest: Export $public, to make sure it's available to any invoked scripts as an env var
vegbien_dest: $schemas: Added analytical_db
inputs/import.stats.xls: Added separate tab with stats for 2012-6~9. The Excel format apparently only supports 255 columns, so previous imports had been silently truncated. Note that once the 2012-10 imports reach column 255, a new tab will need to be created with the 2012-10+ imports.
bin/map: in_is_db: by_col: Clearing errors table: Skip this if the table has been set to None because it didn't exist (and thus was a metadata-only map spreadsheet)
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where the specific_epithet needs to come from the accepted_taxonverbatim rather than the parsed_taxonverbatim
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Include the family any time the genus is not specified, instead of just when accepted_taxonlabel.rank = 'family'. These should have the same effect since TNRS includes the rank, but using COALESCE is clearer.
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Changed to also include morphospecies when just the family is specified
schemas/vegbien.sql: analytical_stem_view: Fixed bug where location.authorlocationcode needed to be used as the plotName when location.sourceaccessioncode was not provided, to ensure that plotName would be NOT NULL
inputs/FIA/import_order.txt: Fixed bug where FIA_COND_unique needed to be explicitly included in import_order.txt now that we're using import_order.txt to import the Source metadata table before the data tables
inputs/import.stats.xls: Updated import times
root Makefile: PostgreSQL: $(postgresReload-Linux): Try chmoding both as your user and as the bien user
input.Makefile: Testing: $(runTest): Ignore failed diffs when the test is compared to another test's output (e.g. in by_col mode)
bin/map: in_is_db: If table does not exist, set table to None so that db_xml.put_table() doesn't try to access it. This fixes a bug in metadata-only map spreadsheets under column-based import.
db_xml.py: put_table(): Support None in_table by calling put() directly
Removed no-longer-used geoscrub.*.sql. Use geoscrub_output instead.
Removed no-longer-used geoscrub_cleaned_unique. Use geoscrub_output instead.
Removed no-longer-used geoscrub_cultivated. Use analytical_stem_view.cultivated instead.
schemas/vegbien.sql: analytical_stem_view: cultivated: Removed BIEN2's geoscrub_cultivated, which has now been replaced by the primary corresponding scripts (and never had particularly many matches to the locations in any case)
schemas/vegbien.sql: analytical_stem_view: cultivated: Use OR instead of _or() to combine cultivated_family_locations.country IS NOT NULL with the other values, because this field's false value should not be used in place of NULL if all the other values are NULL, as it would be with _or(). (cultivated_family_locations.country IS NOT NULL can indicate presence, but not absence, of cultivated status.)
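The difference between OR and _or() in the entry above can be sketched in Python, with None standing in for NULL. The definition of _or() is not shown in this log, so null_skipping_or() below is an assumption: it models an aggregate-style OR that ignores NULL arguments, which is the behavior the entry implies.

```python
def sql_or(*values):
    """SQL's three-valued OR: True if any operand is True,
    otherwise None (NULL) if any operand is None, otherwise False."""
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


def null_skipping_or(*values):
    """Assumed behavior of _or(): NULLs are ignored, and the result is
    NULL only when every argument is NULL."""
    known = [v for v in values if v is not None]
    if not known:
        return None
    return any(known)


# "country IS NOT NULL" is never NULL itself: it is False when nothing matched.
country_match = False
other_flags = (None, None)  # the other cultivated indicators are unknown

print(null_skipping_or(country_match, *other_flags))  # False: absence is asserted
print(sql_or(country_match, *other_flags))            # None: status stays unknown
```

With plain OR, a non-match in cultivated_family_locations leaves the result NULL instead of reporting the record as not cultivated.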
schemas/functions.sql, vegbien.sql: _and(), _or(): Added comment comparing the function and the corresponding logical operator
schemas/vegbien.sql: public: Added _or(), for use by analytical_stem_view
schemas/vegbien.sql: analytical_stem_view: cultivated: Also set if family/country combination found in cultivated_family_locations
schemas/vegbien.sql: cultivated_family_locations: Added data from nimoy:/home/boyle/bien2/geoscrub/cultivated/cult_by_taxon/flag_by_taxa.inc
schemas/vegbien.sql: Added cultivated_family_locations to store locations where various taxon families are considered cultivated
mappings/VegCore-VegBIEN.csv: Mapped locality description fields to location.iscultivated using _locationnarrative_is_cultivated()
xml_func.py: Simplifying functions: Added passthru entries for _and, _or
schemas/vegbien.sql: Added _locationnarrative_is_cultivated()
lib/PostgreSQL-MySQL.csv: Change text to varchar(255) because text columns can't be used in indexes in MySQL
lib/PostgreSQL-MySQL.csv: Resaved in Excel, which removed unnecessary quotes around fields
schemas/vegbien.sql: analytical_aggregate: Added identifiedBy, which is no longer a scoping field (which would prevent scientificNameWithMorphospecies from being unique) now that there is only one taxondetermination for each taxonoccurrence
schemas/vegbien.sql: analytical_stem_view: dateCollected: For plots data, use the locationevent obsstartdate instead of the collectiondate in order to group taxonoccurrences/stems from the same locationevent together
schemas/vegbien.sql: analytical_* pkeys: Added dateCollected because the records are actually unique within the location*event*, not the location
schemas/vegbien.sql: analytical_stem_view: Exclude records with no collectiondate or obsstartdate, which is required to uniquely identify a record
schemas/vegbien.sql: analytical_stem_view: dateCollected: Use locationevent.obsstartdate when aggregateoccurrence.collectiondate is not provided
schemas/vegbien.sql: analytical_stem_view: Include only the current taxondetermination for each taxonoccurrence, to avoid cross-joining taxondeterminations with stems and thus multiplying the number of rows for datasources that have multiple taxondeterminations per taxonoccurrence
schemas/vegbien.sql: taxondetermination: Added AFTER trigger to set the current taxondetermination for the taxonoccurrence
lib/PostgreSQL-MySQL.csv: Statements ending in ";": When matching any character, use .*? (with the (?s) flag) instead of [^;]* in order to allow embedded ; to be matched. This fixes a bug where a CREATE VIEW statement was not removed because it contained an embedded ; .
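The regex change in the entry above can be reproduced with Python's re module. The two patterns and the sample SQL are hypothetical stand-ins, since the actual expressions in lib/PostgreSQL-MySQL.csv are not shown here; the sketch assumes a statement ends with ";" at the end of a line.

```python
import re

# Sample input: the view body contains an embedded ";" inside a string literal.
text = (
    "CREATE VIEW v AS SELECT ';' AS sep;\n"
    "CREATE TABLE t (id int);\n"
)

# Old style: [^;]* cannot cross the embedded ";", so the statement never matches.
old_pattern = re.compile(r"CREATE VIEW [^;]*;\n")

# New style: lazy .*? with the (?s) flag (so "." also matches newlines)
# runs up to the first ";" that is followed by a newline.
new_pattern = re.compile(r"(?s)CREATE VIEW .*?;\n")

old_result = old_pattern.sub("", text)
new_result = new_pattern.sub("", text)

print(old_result == text)  # True: the CREATE VIEW statement was left in place
print(repr(new_result))    # only the CREATE TABLE statement remains
```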
schemas/vegbien.sql: taxondetermination: Added unique index to ensure that there is only one current determination for each taxonoccurrence
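One common way to enforce "only one current determination per taxonoccurrence" is a partial unique index. The sketch below uses Python's built-in sqlite3 module (which supports partial indexes) rather than PostgreSQL. The index definition in vegbien.sql is not shown in this log, so the column names, the index name, and the WHERE clause are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxondetermination (
    taxondetermination_id INTEGER PRIMARY KEY,
    taxonoccurrence_id INTEGER NOT NULL,
    iscurrent INTEGER NOT NULL DEFAULT 0  -- 1 = current determination
);
-- Uniqueness is enforced only among rows flagged as current.
CREATE UNIQUE INDEX taxondetermination_current
    ON taxondetermination (taxonoccurrence_id) WHERE iscurrent = 1;
""")

insert = ("INSERT INTO taxondetermination (taxonoccurrence_id, iscurrent) "
          "VALUES (?, ?)")
conn.execute(insert, (1, 0))  # an older, non-current determination
conn.execute(insert, (1, 0))  # any number of non-current rows is allowed
conn.execute(insert, (1, 1))  # the single current determination

try:
    conn.execute(insert, (1, 1))  # a second current row for the same occurrence
    second_current_rejected = False
except sqlite3.IntegrityError:
    second_current_rejected = True

print(second_current_rejected)
```

Non-current rows are outside the index, so historical determinations can accumulate freely.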
lib/PostgreSQL-MySQL.csv: Remove indexes with WHERE clauses
schemas/vegbien.sql: analytical_aggregate: Added primary key on institutionCode, plotName, scientificNameWithMorphospecies, recordNumber. Note that this makes these fields NOT NULL, which should not be a problem because there are inner joins instead of LEFT JOINs on most of the tables which provide them, and LEFT JOINed tables have their identifying fields combined to create a NOT NULL value.