Added inputs/NVS/map.csv (global mappings)
inputs/NVS/Project/map.csv: Remapped Project Abbreviation to projectName instead of Name, because Project Abbreviation is what's used throughout the tables to link to the project
inputs/NVS/Plot/map.csv: Mapped Physiography
inputs/NVS/Plot/map.csv: Mapped Area
inputs/NVS/Plot/map.csv: Altitude: Provided rationale for units determination
Added inputs/NVS/Organism/
Added inputs/NVS/Project/
mappings/VegCore-VegBIEN.csv: Mapped projectStartDate, projectEndDate
mappings/VegCore.csv: Added projectStartDate, projectEndDate
mappings/VegCore.csv: Renamed plotName to locationName because this term also applies to the location of a specimen. This replaces CTFS's definition of locationName as locality.
root Makefile: apt-get: Use --yes to allow unattended installations
schemas/vegbien.sql: analytical_*: Renamed plotName to locationName to match the new VegCore term name
mappings/VegCore.csv: Added subInstitutionCode
schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use project.sourceaccessioncode instead of locationevent.project_id for the projectID
schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use the parent locationevent's obsstartdate instead when the subevent does not provide it
schemas/vegbien.sql: analytical_stem_view: locationevent info: Fixed bug where need to use the parent locationevent's project and method instead when the subevent does not provide them, because they are often attached to it instead
schemas/vegbien.sql: analytical_stem_view: geolocation info: Fixed bug where need to use the parent location instead when provided, because lat/long and placenames are attached to it instead of the subplot's location
backups/Makefile: %.md5: Fixed bug where md5sum does not have a -q option like md5
backups/Makefile: %.md5: Fixed bug where need to use md5sum instead of md5 on Linux
schemas/vegbien.sql: analytical_stem_view: Filter out non-current taxondeterminations (occurrences with no taxondetermination are preserved)
schemas/vegbien.sql: Removed no longer needed darwin_core table. Use analytical_stem instead, which is now identical.
schemas/vegbien.sql: sync_analytical_*_to_view(): Creating analytical_* table: Fixed bug where need LIMIT 0 so that it can be used on a full DB, which will have data in the tables used by analytical_stem_view
schemas/vegbien.sql: Merged darwin_core into analytical_stem
schemas/vegbien.sql: darwin_core_view, analytical_stem_view: Updated now that newWorldCountries.isoCode is a text field
README.TXT: Data import: backups: Step to copy backups to jupiter: Added full path to aaronmk/ (/data/dev/aaronmk)
inputs/newWorld/geoscrub.schema.~.changes.sql: Reversed order of adding unique constraints and changing types
inputs/newWorld/geoscrub.schema.~.changes.sql: Changed isoCode type to text. Added unique constraint on isoCode.
backups/Makefile: Added md5s target to generate .md5 files for all backups
inputs/import.stats.xls: Updated import times
backups/Makefile: %.md5: Run with `nice -n +5` to avoid slowing down the UI
backups/: svn:ignore: Added *.md5. Removed no longer applicable *.log.
backups/Makefile: Changed paths to be relative to the Makefile rather than the current directory, so this Makefile can be used in other directories as well (such as jupiter:/aaronmk/VegBIEN.backups/)
backups/Makefile: %.backup: Also create MD5 of backup
backups/Makefile: Added %.md5 target to create checksums of each backup
README.TXT: Data import: backups: Added step to copy backups to jupiter in /aaronmk/VegBIEN.backups/ . The jupiter folder, which has several TB of space available, will replace local backup drives as the location for archived backups.
README.TXT: Data import: Removed additional backup of just the public schema, which is not needed because the public schema is included in the full DB backup. The additional public schema backup increased the total backup size by 60-70%, so this will help conserve limited disk space on vegbiendev as well as on local archives of the backups.
README.TXT: Backups: Full DB: Updated steps to match Data import steps, which add the date to the backup filename when it's created rather than afterwards
README.TXT: Backups: Archived imports: Back up: Added instructions for archiving the last import before backing it up
Regenerated vegbien.ERD exports
schemas/vegbien.sql: analytical_*: Removed NOT NULL constraint on dateCollected
schemas/vegbien.sql: source: Added sampletype field to indicate a plot or specimen datasource
schemas/vegbien.sql: sync_analytical_*_to_view(): Added NOT NULL constraints
make_analytical_db: Added step to create darwin_core materialized view
inputs/*/Source/map.csv for non-herbaria: Mapped sampleType
inputs/.herbaria/herbaria/map.csv: Set sampleType to "specimen"
mappings/VegCore-VegBIEN.csv: Mapped sampleType
mappings/VegCore.csv: Added sampleType
schemas/vegbien.sql: Added sampletype enum
root Makefile: $(postgresReload-*): Confirm the operation before continuing, since it involves changing PostgreSQL config files in nontrivial ways. Added instructions for setting kernel.shmmax to at least 4GB minus 1 byte on Linux, to work with the shared_buffers setting in postgresql.conf.
schemas/postgresql.conf: shared_buffers: Documented that it must be less than ~95% of SHMMAX
schemas/vegbien.sql: analytical_stem_view: identifiedBy: Fixed bug where need to use party.fullname instead of name components because the name is now mapped to fullname
schemas/vegbien.sql: analytical_stem_view, darwin_core_view: dateCollected: Use the parent plot event's obsstartdate when the subplot event does not have its own obsstartdate
schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date or non-current taxondeterminations
schemas/vegbien.sql: analytical_stem_view: Don't filter out rows without a date
schemas/vegbien.sql: Added darwin_core_view
schemas/vegbien.sql: sync_analytical_*_to_view(): Added CREATE INDEX statements
README.TXT: Data import: Added steps to publish analytical DB on nimoy.bien_web
schemas/vegbien.sql: analytical_stem_view: Changed JOINs to LEFT JOINs to include occurrences without taxondeterminations
export_analytical_db: Use 'NULL' as the NULL value instead of \N, because MySQL has problems with \N
publish_analytical_db: Load to bien3_adb instead of bien_web
README.TXT: Data import: Added step to export analytical DB
root Makefile: $(postgres-Linux): Fixed bug where need $(asAdmin) before commands to rename existing *.conf
root Makefile: $(postgres-Linux): Also install postgresql-contrib, which contains the hstore extension
Added inputs/NVS/
inputs/CVS/Organism/map.csv: Mapped accordingTo to "Weakley 2006"
inputs/NY/Specimen/map.csv: Omit UniqueNYInternalRecordNumber to avoid confusion since this is an internal-only ID. This makes InstitutionCode+CollectionCode+CatalogNumber the globally unique identifier instead.
README.TXT: Added Datasource refreshing section with instructions for refreshing VegBank
schemas/vegbien.sql: Renamed taxonconcept.concept_source_id back to concept_reference_id
schemas/vegbien.sql: Renamed soilobs to soilsample per working group discussion
input.Makefile: SVN: add: verify: Fixed bug where need to use $ prefix before string to parse newline
inputs/NY/verify/: svn:ignore .csv files
input.Makefile: SVN: add: Also svn:ignore .csv files
export_analytical_db: Export NULL as \N to work with MySQL
schemas/vegbien.sql: analytical_*: Added index on NOT NULL columns, starting with institutionCode
schemas/vegbien.sql: analytical_*: Removed primary keys and NOT NULL constraints on columns that sometimes have NULL values
publish_analytical_db: Added CSV dialect information
root Makefile: PostgreSQL: $(postgresReload-*): Rename existing *.conf to *.conf.old
publish_analytical_db: Use LOAD DATA LOCAL INFILE instead of LOAD DATA INFILE to avoid needing FILE permissions on bien_web
Added publish_analytical_db
export_analytical_db: Append the public schema version to the CSV filename
backups/Makefile: $(rsyncBackups): Added *.csv
Added export_analytical_db
backups/: Ignore _* and *.csv
make_analytical_db: mk_analytical_table(): Use explicit schema references everywhere. This fixes a bug where the TRUNCATE/INSERT steps on the public schema's table would reference the analytical_db view instead because they were not schema-scoped.
make_analytical_db: mk_analytical_table(): Factored table references in different schemas out into vars
schemas/vegbien.sql: analytical_stem_view: recordNumber: Combine identifying fields in taxonoccurrence, plantobservation, and stemobservation to ensure that this field is unique within the plot and not NULL
make_analytical_db: Moved set -x () around just psql_verbose_vegbien so embedded $() expressions wouldn't also be in set -x (verbose) mode
make_analytical_db: Fixed bug where need to use bash instead of sh because vegbien_dest requires it
make_analytical_db: Factored analytical_* table creation code out into mk_analytical_table() function
make_analytical_db: Create analytical_db views pointing to the analytical_* versions in the public schema
vegbien_dest: $schemas: Removed analytical_db because views that will be added to it were shadowing public schema tables with the same names during population of those tables in make_analytical_db
vegbien_dest: Export $public, to make sure it's available to any invoked scripts as an env var
vegbien_dest: $schemas: Added analytical_db
inputs/import.stats.xls: Added separate tab with stats for 2012-6~9. The Excel format apparently only supports 255 columns, so columns from previous imports had been silently truncated. Note that once the 2012-10 imports reach column 255, a new tab will need to be created with the 2012-10+ imports.
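
The two backups/Makefile %.md5 fixes above (md5sum instead of md5 on Linux, and md5sum having no -q option) can be illustrated with a minimal sketch. The function name md5_of is hypothetical and this is not the actual Makefile rule, only one way the same portability issue might be handled in plain sh:

```shell
#!/bin/sh
# Hypothetical helper (not the project's Makefile rule): print only the MD5
# hex digest of a file, on Linux as well as on BSD/macOS.
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        # GNU md5sum has no -q option. Reading from stdin makes it print
        # "<digest>  -", so keep only the first space-separated field.
        md5sum < "$1" | cut -d ' ' -f 1
    else
        # BSD/macOS md5 supports -q, which prints the digest alone.
        md5 -q "$1"
    fi
}
```

A call such as `md5_of example.backup > example.backup.md5` (example.backup being a placeholder filename) would then write the bare digest on either platform.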