



# Date Author Comment
10581 08/03/2013 03:11 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: added %/import_scrub, similar to import_scrub but just imports one table

10580 08/03/2013 12:25 AM Aaron Marcuse-Kubitza

bin/import_all: with_all import_scrub: documented that this step uses $by_col, so that users know to include by_col=1 when running this step separately

10579 08/03/2013 12:24 AM Aaron Marcuse-Kubitza

bin/import_all: use column-based import (by_col=1) by default, instead of requiring the user to specify it explicitly. to use row-based import, turn it off explicitly (by_col=).
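
note: a minimal sketch of the default described above, not code from bin/import_all. it assumes the switch behaves like the shell expansion ${by_col-1}: an unset variable means column-based, and a set-but-empty variable (by_col=) means row-based. the function name use_by_col is hypothetical.

```python
import os


def use_by_col(env=None):
    """Return True for column-based import, False for row-based import.

    Assumption: unset by_col falls back to "1" (on); an explicitly
    empty by_col turns column-based import off.
    """
    env = os.environ if env is None else env
    return env.get("by_col", "1") != ""


if __name__ == "__main__":
    print(use_by_col({}))               # by_col not set
    print(use_by_col({"by_col": ""}))   # by_col= (explicitly empty)
    print(use_by_col({"by_col": "1"}))  # by_col=1
```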

10578 08/03/2013 12:03 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: To back up DB: after renaming current import to public: say to replace $version with the appropriate revision, because the $version env var should not be set (otherwise the backup will try to use a nonexistent import with the given revision #)

10577 08/03/2013 12:00 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To back up DB: updated instructions to inline setting of $dump_opts, like in bin/import_all

10576 08/02/2013 11:55 PM Aaron Marcuse-Kubitza

bin/import_all: don't set $dump_opts until running the backup command that uses it, so that the user can run this backup command separately just by copying the line out of the script (without worrying about env vars that need to be set, other than $version which is visible outside the script)

10575 08/01/2013 05:02 PM Aaron Marcuse-Kubitza

inputs/U/Specimen/: translated some multi-column filters to postprocessing derived columns, using the steps at

10574 08/01/2013 05:00 PM Aaron Marcuse-Kubitza

inputs/U/Specimen/: translated some multi-column filters to postprocessing derived columns, using the steps at

10573 08/01/2013 04:56 PM Aaron Marcuse-Kubitza

inputs/U/Specimen/map.csv: Gazetteer/Newgazett, Majorarea: documented that these are the closest equivalent for Guyana/Suriname

10572 08/01/2013 04:55 PM Aaron Marcuse-Kubitza

inputs/U/Specimen/map.csv: Majorarea: mapped to stateProvince, which is the closest equivalent for Guyana/Suriname

10571 08/01/2013 04:54 PM Aaron Marcuse-Kubitza

inputs/U/Specimen/map.csv: Gazetteer/Newgazett: remapped to county, which is the closest equivalent for Guyana/Suriname

10570 08/01/2013 04:37 PM Aaron Marcuse-Kubitza

inputs/U/Specimen/map.csv: Gazetteer, Newgazett: combine them with _alt() instead of _join() because only one of Gazetteer, Newgazett is ever populated
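
note: a rough illustration of the difference between the two combinators. alt() and join() below are hypothetical Python stand-ins, not the real _alt() and _join() implementations, and the "; " separator is an assumption. when only one input is ever populated, both return the same value, so the alt form simply states the intent more precisely.

```python
def _is_empty(value):
    """Treat None and the empty string as missing."""
    return value is None or value == ""


def alt(*values):
    """Return the first non-empty value, or None if every value is empty."""
    for value in values:
        if not _is_empty(value):
            return value
    return None


def join(*values, sep="; "):
    """Concatenate all non-empty values, or return None if there are none."""
    non_empty = [value for value in values if not _is_empty(value)]
    return sep.join(non_empty) if non_empty else None


if __name__ == "__main__":
    # Only one of the two columns is populated: same result either way.
    print(alt("", "B"), join("", "B"))
    # Both populated: alt picks one value, join keeps both.
    print(alt("A", "B"), join("A", "B"))
```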

10569 08/01/2013 04:04 PM Aaron Marcuse-Kubitza

inputs/NCU/: switched to new-style import, using the steps at

10568 08/01/2013 04:02 PM Aaron Marcuse-Kubitza

bugfix: placed inputs/NCU/Specimen/postprocess.sql under version control

10567 08/01/2013 03:57 PM Aaron Marcuse-Kubitza

bugfix: inputs/NCU/Specimen/map.csv: CatalogSeriesPrefix: enclosed comment in "

10566 08/01/2013 03:43 PM Aaron Marcuse-Kubitza

inputs/NCU/Specimen/map.csv: OwnerInstitution: remapped to specimenOwner rather than specimenHolderInstitutions. OwnerInstitution, CatalogSeriesPrefix: documented the VegCore SQL dotpath that would be used to refer to the field. this specifies the destination field at a much finer level of detail than the one-size-fits-all denormalized name.

10565 08/01/2013 02:56 PM Aaron Marcuse-Kubitza

mappings/VegCore-VegBIEN.csv: mapped municipality

10564 08/01/2013 02:46 PM Aaron Marcuse-Kubitza

inputs/NCU/Specimen/map.csv: CityLocality: remapped to municipality because this is a placename, not a verbatim locality description

10563 08/01/2013 02:25 PM Aaron Marcuse-Kubitza

inputs/NCU/Specimen/: translated single-column filters to postprocessing derived columns, using the steps at

10562 08/01/2013 02:07 PM Aaron Marcuse-Kubitza

inputs/NY/: switched to new-style import, using the steps at

10561 08/01/2013 02:02 PM Aaron Marcuse-Kubitza

inputs/$dest/$subdir/: translated single-column filters to postprocessing derived columns, using the steps at

10560 08/01/2013 01:28 PM Aaron Marcuse-Kubitza

inputs/NY/Ecatalog_all/: translated multi-column filters to postprocessing derived columns, using the steps at

10559 08/01/2013 01:11 PM Aaron Marcuse-Kubitza

inputs/bien_web/: switched to new-style import, using the steps at

10558 08/01/2013 11:37 AM Aaron Marcuse-Kubitza

inputs/UBC/: switched to new-style import, using the steps at

10557 08/01/2013 11:07 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: don't exit the screen until after getting $version, which is defined within it

10556 08/01/2013 09:49 AM Aaron Marcuse-Kubitza

planning/timeline/timeline.2013.xls: updated for changes made in the conference call: moved Attribution and conditions of use up because it's high priority. deferred Importing to normalized VegCore until after October due to decision to use VegBIEN for the October DB.

10555 08/01/2013 03:51 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: regenerated exports and updated image map

10554 08/01/2013 03:46 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: GNRS & geovalidation steps: refactored to separate geoscrubbing (matching a named place) from prepping for geovalidation (uniquifying the lat/long and parent place). note that only bare lat/longs without official placenames will be geovalidated.

10553 08/01/2013 02:36 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: geovalidation: inherit from georeferencing, since applying a corrected (or confirmed) lat/long is a form of georeferencing

10552 08/01/2013 02:29 AM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/VegCore.ERD.mwb: scrubbed_geoplace: made parent_geoplace nullable to allow a scrubbed root node which has no parent. added scrubbed_name to emphasize that any name applied in this table is scrubbed.

10551 08/01/2013 12:47 AM Aaron Marcuse-Kubitza

inputs/CTFS/StemObservation/unmapped_terms.csv: regenerated

10550 08/01/2013 12:43 AM Aaron Marcuse-Kubitza

inputs/UNCC/Specimen/new_terms.csv: regenerated

10549 08/01/2013 12:22 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: make test by_col=1: documented that if you encounter errors, they are most likely related to the PostgreSQL error parsing in /lib/ parse_exception()

10548 07/31/2013 11:58 PM Aaron Marcuse-Kubitza

bugfix: lib/ parse_exception(): MissingCastException from DoesNotExistException for function: handle overloaded functions where none of the overloads supports the given arg types (so assume text). this may have become a bug from system upgrades?

10547 07/31/2013 11:35 PM Aaron Marcuse-Kubitza

added inputs/newWorld/iso_code_gadm/VegBIEN.csv, etc., generated by running ./run

10546 07/31/2013 11:34 PM Aaron Marcuse-Kubitza

added inputs/newWorld/newWorldCountries/map.csv, etc., generated by running ./run

10545 07/31/2013 10:54 PM Aaron Marcuse-Kubitza

inputs/NY/Ecatalog_all/map.csv: PlantFungDescription: documented that PlantFung confusingly refers to the plant/fungus the specimen came from, rather than to a fungus growing on the plant

10544 07/31/2013 10:29 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: fixed lines, including changing inheritance connectors to 1:1/optional on child table

10543 07/31/2013 09:21 PM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/ reset(): noted that bin/redmine_synonyms can be used as a template

10542 07/31/2013 09:14 PM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/ reset(): noted that the dimensions precede the matched name

10541 07/31/2013 09:13 PM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/ added reset() stub, with instructions on how to autogenerate this using dimensions in ../document.mwb.xml

10540 07/31/2013 08:55 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: fixed lines

10539 07/31/2013 08:28 PM Aaron Marcuse-Kubitza

bugfix: /Makefile: postgres-Linux: phppgadmin.conf: updated `ln -s` to /etc/apache2/conf-available/ for current name of /etc/apache2/conf.d/phppgadmin.conf, which is now just phppgadmin

10538 07/31/2013 07:52 PM Aaron Marcuse-Kubitza

added planning/workflow/derived_columns/range_measurements/low-high_vs_midpoint-uncertainty.VegBank.png diagram, generated from a screenshot of the VegBank Data Dictionary

10537 07/31/2013 05:21 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: party category: made it pink to match VegBank, swapping colors with source. (note that source isn't red like in VegBank because black text doesn't show up well against it.)

10536 07/31/2013 05:06 PM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/VegCore.ERD.pdf: regenerated from schemas/VegCore/VegCore.ERD.mwb, without any tables moused-over. apparently, mousing over a table while saving the PDF also saves the highlighting on the fields, not just when saving the PNG.

10535 07/31/2013 04:56 PM Aaron Marcuse-Kubitza

inputs/HVAA/: switched to new-style import, using the steps at

10534 07/31/2013 04:52 PM Aaron Marcuse-Kubitza

inputs/HVAA/Specimen/map.csv: fieldNumber: remapped to UNUSED

10533 07/31/2013 04:32 PM Aaron Marcuse-Kubitza

inputs/HVAA/Specimen/: translated multi-column filters to postprocessing derived columns, using the steps at

10532 07/31/2013 04:31 PM Aaron Marcuse-Kubitza

inputs/HVAA/Specimen/: translated multi-column filters to postprocessing derived columns, using the steps at

10531 07/31/2013 04:25 PM Aaron Marcuse-Kubitza

inputs/HVAA/Specimen/: translated multi-column filters to postprocessing derived columns, using the steps at

10530 07/31/2013 04:04 PM Aaron Marcuse-Kubitza

inputs/SpeciesLink/: switched to new-style import, using the steps at

10529 07/31/2013 03:59 PM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/map.csv: renamed DUPLICATE#of:... output columns to be <= 63 chars long, in order to be valid PostgreSQL columns without collisions
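
note: a small sketch of why the 63-character limit matters. a default PostgreSQL build truncates identifiers to 63 bytes, so two longer column names that share their first 63 bytes end up as the same column. the helper names are hypothetical, and any names passed in are illustrative rather than taken from map.csv.

```python
PG_MAX_IDENTIFIER_BYTES = 63  # NAMEDATALEN - 1 in a default PostgreSQL build


def pg_truncate(name):
    """Approximate PostgreSQL's identifier truncation to 63 bytes."""
    raw = name.encode("utf-8")[:PG_MAX_IDENTIFIER_BYTES]
    # Drop any multibyte character that was cut in half by the byte limit.
    return raw.decode("utf-8", errors="ignore")


def find_collisions(names):
    """Return pairs of distinct names that truncate to the same identifier."""
    seen = {}
    collisions = []
    for name in names:
        key = pg_truncate(name)
        if key in seen and seen[key] != name:
            collisions.append((seen[key], name))
        else:
            seen.setdefault(key, name)
    return collisions


if __name__ == "__main__":
    # Illustrative names only: identical for the first 73 characters.
    prefix = "DUPLICATE#of:" + "x" * 60
    print(find_collisions([prefix + "_a", prefix + "_b"]))
```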

10528 07/31/2013 03:37 PM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/: translated multi-column filters to postprocessing derived columns, using the steps at

10527 07/31/2013 03:12 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: regenerated exports and updated image map

10526 07/31/2013 03:10 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: rel_place: renamed to subplace to clarify that this is any place contained within another place, not just places with a position relative to their parent place

10525 07/31/2013 03:05 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: don't require this to be a rel_place, because some individuals will be associated with standalone specimens, which don't have a parent place

10524 07/31/2013 02:55 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: fixed lines

10523 07/31/2013 02:49 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: regenerated exports and updated image map

10522 07/31/2013 02:45 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: require rel_place to have a parent place. merge plot_element into subplot because subplot is the only place the parent_plot constraint is imposed (individuals can now be located in any place, not just a plot).

10521 07/31/2013 02:38 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: individual: require it to have a place (which may be a rel_place within another place). do not require the place to be a plot_element, because individuals can be located in places other than uniformly-shaped plots.

10520 07/31/2013 02:23 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: regenerated exports. this now includes updating the image map for moved and new tables.

10519 07/31/2013 02:11 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: added plot_element to require that the associated parent place of a rel_place be a plot. added subplot to require that a subplot be a plot_element (i.e. have a parent_plot) and to show where to put subplot data.

10518 07/31/2013 01:28 AM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/ ERD.pdf: fixed URL since the map is now in a subdir

10517 07/31/2013 01:22 AM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/ added URLs, which now autoredirect to the appropriate DB table on vegbiendev

10516 07/31/2013 01:21 AM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/ cleanup(): trim growing whitespace: remove multiple spaces at a time, in case the user saved and reopened but didn't run this script in between

10515 07/31/2013 01:01 AM Aaron Marcuse-Kubitza

added schemas/VegCore/ERD/.htaccess to forward unknown subdirs to vegbiendev MySQL as tables

10514 07/31/2013 01:00 AM Aaron Marcuse-Kubitza

added web/servers/vegbiendev/db/my/, which forwards to MySQL instead of PostgreSQL

10513 07/31/2013 01:00 AM Aaron Marcuse-Kubitza

web/servers/vegbiendev/db/: moved PostgreSQL engine to separate pg/ subdir to allow for other engines

10512 07/31/2013 12:23 AM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/ERD/ #$AUTHOR: trim growing whitespace which Gimp repeatedly adds on each save

10511 07/31/2013 12:21 AM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/ERD/ VegCore: fixed URL

10510 07/31/2013 12:19 AM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/ERD/ VegCore: fixed URL

10509 07/31/2013 12:10 AM Aaron Marcuse-Kubitza

schemas/VegCore/ERD/, index.htm: re-ran (it needs to be run every time is edited)

10508 07/31/2013 12:02 AM Aaron Marcuse-Kubitza

added schemas/VegCore/ERD/, which cleans up and formats Gimp's image map for publishing; along with derived file index.htm

10507 07/30/2013 11:59 PM Aaron Marcuse-Kubitza

added schemas/VegCore/ERD/ image map for VegCore.ERD.png. note that the tables are sorted, and this sort order supersedes the data dictionary sort order (which is somewhat similar). the table URLs have not been added yet.

10506 07/30/2013 10:07 PM Aaron Marcuse-Kubitza

lib/sh/ $sed_cmd: added usage

10505 07/30/2013 09:46 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: GNRS & geovalidation steps: 2. GNRS: split into substeps 2a. unique lat/long, 2b. names, 2c. geovalidatable place for clarity. don't refer to the scrubbed_geoplace as the GADM shape, because only the parent_geoplace has the shape (the scrubbed_geoplace just has the scrubbed names).

10504 07/30/2013 09:10 PM Aaron Marcuse-Kubitza

added schemas/VegCore/VegCore.ERD.letter_size.pdf. this must be generated in Linux rather than Mac, because the Mac PDF printer messes up the colors in the PDF (missing color profile?).

10503 07/30/2013 08:34 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: added labels for GNRS and geovalidation steps, analogous to the TNRS taxonomic scrubbing steps labels

10502 07/30/2013 07:25 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: regenerated exports. VegCore.ERD.png now gets the sRGB color profile attached in Gimp so that the colors don't look washed out on some LCD screens.

10501 07/30/2013 07:13 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: georeferencing: added hstore extender

10500 07/30/2013 07:08 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: individual: made it a plot element by optionally attaching a plot position (a rel_place whose parent is the containing plot)

10499 07/30/2013 06:54 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: plot: replaced bounding_box_rect with length_m, width_m, since the bounding box was intended to store plot dimensions (along the plot azimuth) rather than an actual bounding box aligned to the compass directions. added azimuth_deg_N, which is used to resolve plot element x/y coordinates to absolute geocoordinates while taking into account the rotation of the plot.
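
note: a minimal sketch of how a plot azimuth could be used to rotate plot-relative x/y coordinates into east/north offsets. it is not taken from the schema, and it rests on stated assumptions: azimuth_deg_N is measured clockwise from north, the plot's y axis runs along the azimuth, the x axis points 90 degrees clockwise from it, and the origin is the plot's reference corner. converting the metre offsets to latitude/longitude is left out.

```python
import math


def plot_xy_to_offsets(x_m, y_m, azimuth_deg_n):
    """Rotate plot-relative (x, y) metres into (east, north) metres.

    Assumes the y axis points along the azimuth (clockwise from north)
    and the x axis points 90 degrees clockwise from the y axis.
    """
    az = math.radians(azimuth_deg_n)
    east_m = x_m * math.cos(az) + y_m * math.sin(az)
    north_m = -x_m * math.sin(az) + y_m * math.cos(az)
    return east_m, north_m


if __name__ == "__main__":
    # Unrotated plot: x is east, y is north.
    print(plot_xy_to_offsets(3.0, 4.0, 0.0))
    # Plot rotated to face east: 10 m along the plot is 10 m east.
    print(plot_xy_to_offsets(0.0, 10.0, 90.0))
```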

10498 07/30/2013 06:42 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: tables with parent hierarchies: made parent optional, since the root(s) of the hierarchy will not have an entry for this, and any unique constraints that include this column should be ignored (which they will be if the value is NULL instead of a self-pointer)
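
note: a small demonstration of the NULL-parent behavior described above, using SQLite as a stand-in engine (an assumption; the schema itself targets other engines, which also treat NULLs as distinct in unique constraints by default). the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE place (
        id INTEGER PRIMARY KEY,
        parent INTEGER REFERENCES place(id),  -- NULL for a root
        name TEXT NOT NULL,
        UNIQUE (parent, name)
    )
    """
)

# Roots have a NULL parent, so the unique constraint does not apply to them.
conn.execute("INSERT INTO place (parent, name) VALUES (NULL, 'root')")
conn.execute("INSERT INTO place (parent, name) VALUES (NULL, 'root')")
root_count = conn.execute(
    "SELECT count(*) FROM place WHERE parent IS NULL"
).fetchone()[0]

# Non-root rows are still constrained: a repeated (parent, name) is rejected.
conn.execute("INSERT INTO place (parent, name) VALUES (1, 'child')")
try:
    conn.execute("INSERT INTO place (parent, name) VALUES (1, 'child')")
    duplicate_child_rejected = False
except sqlite3.IntegrityError:
    duplicate_child_rejected = True

print(root_count, duplicate_child_rejected)
```

with a self-pointer instead of NULL, each root would have a distinct parent value and the constraint would be enforced rather than ignored.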

10497 07/30/2013 06:37 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: subplot: renamed to rel_place and inherit from place directly, in order to store other plot elements that are relative to their containing plot

10496 07/30/2013 06:26 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: taxon_path: converted to an auxiliary table of taxon_name instead of a subclass of it (like geopath for the place table). this causes distinct taxon_paths to be stored only once, instead of repeatedly for each taxon_name.

10495 07/30/2013 06:16 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: place hierarchy: reorganized to store scrubbed geoplaces in a containment hierarchy instead of a denormalized geopath. this allows each source-specific place to be GNRS-scrubbed to a GADM place, and then have its coordinates geovalidated to see if it is within the matched GADM place. this uses the georeferencing table to store the matched GADM place (scrubbed_geoplace) for each input place, instead of geopath_scrub to store the matched GADM geo*path* for each input geo*path*. (this avoids the need to scrub every combination of place ranks, because just the name of each place is scrubbed relative to its parent place.) geopath instead becomes an auxiliary table to store the place table's verbatim ranks, for easy access and storage.

10494 07/30/2013 04:51 PM Aaron Marcuse-Kubitza

inputs/SpeciesLink/Specimen/map.csv: conceptual_darwin_2003_1_0_BoundingBox: remapped to UNUSED

10493 07/30/2013 03:22 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: place: renamed to local_place to distinguish it from geoplace, which is not a subclass of place (it is a separate, global table, while local_place is source-specific). note that renames sometimes need to be done manually on vegbiendev, to avoid triggering a MySQL bug that blocks the new table from being created and requires the entire database to be recreated to clear the error.

10492 07/30/2013 03:02 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: stem, stem_observation: made associated individual/individual_observation optional, because some stems (e.g. in VegBank) are not grouped together into individuals. note that a stem is still considered to BE-AN individual, but it is a type of individual which may be grouped under another, plant-level individual.

10491 07/30/2013 02:47 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: fixed lines

10490 07/30/2013 02:45 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: specimen_observation: added description. taxon_presence: added occurrence_status. stem_observation, aggregate_observation: made room for them to expand with additional first-class fields.

10489 07/30/2013 02:22 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: taxon_presence, taxon_absence: inherit from taxon_determination rather than taxon_observation, so that the taxon_determination's taxon can be used as the identifying taxon (i.e. the authorPlantName,

10488 07/30/2013 01:41 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: taxon_determination: inherit from taxon_observation again because now that redeterminations can only occur on reobservable things, it makes sense to only allow one taxon_determination per observation event. this means that each redetermination on a specimen would get its own taxon_observation (where any additional attributes noted in the reobservation could also be included).

10487 07/30/2013 01:31 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: taxon_occurrence: renamed to reobservable to emphasize that this is only for things on which taxon redeterminations can be made, such as individuals and specimens (including voucher specimens). a redetermination on an aggregate_observation would instead be made on its voucher specimen, which is the only reobservable part of it.

10486 07/30/2013 01:07 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: moved taxon_observation subclasses closer to taxon_observation so that it would be clear they were observation-related rather than occurrence-related (e.g. there is no concept of "repeat-sampling" of an aggregate_observation, because in each sampling it is the collector's opinion that the plants correspond to a particular taxon)

10485 07/30/2013 02:00 AM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/VegCore.ERD.png: switched back to attaching the sRGB color profile directly, because actually, the native->sRGB translation happens in the monitor driver itself (and can be adjusted in System Preferences > Displays > Color), rather than in the specific application. this means that the hex color values color-matched in MySQL Workbench were actually sRGB (translated by the OS to monitor-native for display), and that the sRGB profile merely needed to be explicitly indicated for other monitors that are not close to sRGB (and thus need the translation). the closeness of the 27-inch iMac screen to sRGB can be verified by selecting sRGB in System Preferences > Displays > Color, and noting that the desktop background does not change from when the default "iMac" setting is selected.

10484 07/30/2013 01:38 AM Aaron Marcuse-Kubitza

bugfix: schemas/VegCore/VegCore.ERD.png: convert to sRGB color profile after attaching the native monitor profile instead of attaching it directly. this allows the hex colors that were color-matched in MySQL Workbench (which presumably uses raw monitor RGB) to be translated to the universal sRGB space, where they can then be localized to a different monitor's local color space. note that this does not visibly change the image on the 27-inch iMac screen from what was produced via the previous, incorrect method (attaching the sRGB profile without conversion from native), which would imply that the iMac's screen is very close to the sRGB color space already. if this is the case, it is instead older LCDs that have off-white color spaces that need translation from sRGB.

10483 07/29/2013 11:39 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.png: attached sRGB color profile using Gimp, so that the colors don't look completely washed out and off-hue on older LCDs (i.e. other than the 27-inch iMac screen)

10482 07/27/2013 12:13 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: regenerated exports