added inputs/bien2_traits/validations.sql.run
inputs/import.stats.xls: updated import times
fix: inputs/VegBIEN/Redmine/wiki/.htaccess: redirect to new main page when accessed without trailing /
inputs/bien2_traits/TraitObservation/postprocess.sql: remove rows with no taxon name, which are invalid, and which helps simplify the aggregating validations queries
fix: inputs/VegBIEN/Redmine/svn/.htaccess: updated repository URL to point to trunk/
bugfix: inputs/SALVIAS/verify/plots.out.sql: fixed ' quoting syntax to use '' instead of \' to escape '
inputs/input.Makefile: verify/%.out: use a *.sql file in the verify/ directory itself to generate *.out, so that each datasource can have its own set of output queries. for datasources that should share the same set of queries, they can instead be symlinked to the same file.
fix: inputs/CVS/project/: added _no_import since this should not also be imported separately from taxon_observation.**
added inputs/XAL/Specimen/_no_import, since this is a demo-only datasource and there isn't a staging table for it
inputs/.geoscrub/county_centroids/test.xml.ref, inputs/.NCBI/{names.src,nodes.src}/test.xml.ref: accepted test outputs (generated now that these tables are in import_order.txt)
inputs/FIA/taxon_observation.**/header.csv: updated for new REF_RESEARCH_STATION.country metadata value col
inputs/input.Makefile: add!: verify/: also svn:ignore *.tsv, *.txt
inputs/publishable datasources.xlsx: updated
fix: inputs/SALVIAS/projects/postprocess.sql: remove private data that should not be publicly visible: remove projects that do not have "There are no specific use conditions attached to this dataset"
fix: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Remove private data that should not be publicly visible: also need to remove metadata-only plots
bugfix: inputs/SALVIAS/plotMetadata_/map.csv: things mapped to project_participant: remapped to event__participant because these actually relate to the event, not the project, even though they seem like project-related fields
fix: inputs/SALVIAS/plotMetadata_/map.csv, inputs/Madidi/LocationObservation/map.csv: things mapped to communityID: remapped to communityName, which is what's used in analytical_stem (communityID is for numeric IDs)
inputs/SALVIAS/plotMetadata_/create.sql, map.csv: expanded plot_administrator:party_code_party_ and mapped plot_administrator_name to a 2nd project_participant
mappings/VegCore-VegBIEN.csv: project_participant: use [!...] negative lookahead assertion so that multiple project_participant columns will properly map to separate projectcontributor rows
inputs/SALVIAS/plotMetadata_/map.csv: mapped PrimOwnerID_name->project_participant
inputs/SALVIAS/plotMetadata_/create.sql: added join to PrimOwnerID:party_code_party_
bugfix: inputs/SALVIAS/import_order.txt: added party_code_party_
bugfix: inputs/SALVIAS/party_code_party_/create.sql: need to remove duplicate entries in party_code_party
inputs/SALVIAS/party_code_party_/map.csv: mapped fullname->event_participant_name for use by other tables
mapped inputs/SALVIAS/party_code_party_/
inputs/SALVIAS/_MySQL/salvias_plots.*.sql: refreshed. this adds the party and party_code_party tables Brad provided for mapping the plot contributors.
fix: inputs/SALVIAS/salvias_plots.~.clean_up.sql: Delete rows that do not satisfy foreign key constraints: also need to do this for plotObservations, since the refreshed data contains dangling rows for that as well
inputs/SALVIAS/run_: documented *.sql install runtime (3 min), as separate from the full `datasrc_make reinstall` runtime (3.5 min)
inputs/SALVIAS/run_: refresh(): `datasrc_make reinstall`: updated runtime. documented that runtimes are from starscream.
added inputs/SALVIAS/run_, which includes a refresh() target
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
bugfix: inputs/.TNRS/schema.sql: scrubbed_family: Name_matched_accepted_family was missing from the TNRS results at one point, so we are now using Family_matched as a workaround to populate this. the workaround is for accepted names only, as no opinion names do not have an Accepted_name_family to prepend to the scrubbed name to parse.
inputs/.TNRS/schema.sql: reexported from live DB, which changes the element order
inputs/VegBank/import_order.txt: added projectcontributor_
inputs/VegBank/projectcontributor_/map.csv, postprocess.sql: added project_participant
added inputs/VegBank/projectcontributor_/
inputs/VegBank/vegbank.~.clean_up.sql: projectcontributor.surname: prepend table name to avoid join collisions
inputs/VegBank/vegbank.~.clean_up.sql, inputs/CVS/cvs.~.clean_up.sql: Prevent "column name specified more than once" errors when tables are joined: put tables in alphabetical order for consistency
inputs/datasource_release_status.xlsx: renamed to `publishable datasources.xlsx` to match the spreadsheet title
inputs/VegBank/^taxon_observation.**.sample/create.sql, map.csv: added new project columns
inputs/VegBank/taxon_observation.**/postprocess.sql: added the project table
mapped inputs/VegBank/project/, which includes the projectName for attribution
inputs/CVS/^taxon_observation.**.sample/create.sql, map.csv: added new project columns
inputs/CVS/taxon_observation.**/postprocess.sql: added the project table
inputs/CVS/project/map.csv: mapped stopDate->projectEndDate
mapped inputs/CVS/project/, which includes the projectName for attribution
inputs/VegBIEN/Redmine/svn/.htaccess: updated to use much faster direct repository URL rather than Redmine web interface, now that the repository itself is publicly accessible in addition to the Redmine view of it
fix: inputs/TEX/Specimen*/map.csv, postprocess.sql: habitat: also placed in occurrenceRemarks so that this field gets parsed for growth form information, as requested by Brad (wiki.vegpath.org/TEX_validation#2013-2-26)
fix: inputs/TEX/Specimen*/map.csv: mapped constant values for specimenHolderInstitutions, country. these have to be added with `rm=1 ./inputs/TEX/Specimen.../run postprocess`.
bugfix: inputs/TEX/Specimen2/map.csv: mapped BARCODE to accessionNumber so that we have a unique ID for each row
inputs/datasource_release_status.xlsx: updated
inputs/CVS/^taxon_observation.**.sample/create.sql: added Mike Lee's additional plots used to validate confidentiality-related fields (wiki.vegpath.org/CVS_validation#plots-to-include)
bugfix: inputs/CVS/^taxon_observation.**.sample/create.sql: include taxonName in the subset of columns that's imported for the validation, because it is _alt-ed with scientificName for forming the TNRS input name. this is unique to CVS, which is why it was not part of the validation subset copied from the VegBank subset.
bugfix: inputs/.TNRS/schema.sql: granted bien_read SELECT access to derived views as well as the core tnrs table
updated inputs/datasource_release_status.xlsx
added inputs/datasource_release_status.xlsx, export of Google spreadsheet at https://docs.google.com/spreadsheet/ccc?key=0ArZXrTAXd-TYdDRRb2RxYi11TWZrQVh5bVdKOURCeFE
fix: inputs/CVS/^taxon_observation.**.sample/: added _no_import because this table duplicates part of what's imported from taxon_observation.**
bugfix: inputs/VegBank/plot/: added _no_import because this table is left-joined and should not be imported separately
bugfix: inputs/{.NCBI,CTFS}/*.src/: added _no_import because these tables are left-joined and should not be imported separately
inputs/import.stats.xls: removed table names from datasources where only one table is imported
fix: inputs/import.stats.xls: removed deleted tables from current import
inputs/GBIF/raw_occurrence_record_plants/map.csv: row_num: remapped to plain *row_num, like the other datasources that have this field
inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: rerun time: noted that this is only fast after manual vacuuming of the table (to remove the deleted rows from the index). autovacuum apparently does not run, although it should.
inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: reran test, which added yearCollected/monthCollected/dayCollected
inputs/CVS/plantConcept_/create.sql: documented runtime (3 min)
inputs/CTFS/*.src/: added test.xml.ref
inputs/CTFS/*.src/: added VegBIEN.csv
bugfix: inputs/CTFS/TaxonOccurrence*/map.csv: things mapped to taxonObservationID: remapped to taxonOccurrenceID since taxonObservationID is not mapped to anything in VegBIEN (denormalized VegCore doesn't distinguish between taxon occurrences and taxon observations of them)
bugfix: inputs/ARIZ/~.clean_up.sql: prevent "column already exists" errors when there is an input column of the same name as an output column
inputs/.geoscrub/import_order.txt: added county_centroids so that it would be installed by new-style import
inputs/FIA/TREE/run: documented import() runtime (1.5 h), which includes table cleanup runtime (1 h)
inputs/GBIF/raw_occurrence_record_plants/run: updated import() runtime (same), documented table cleanup runtime (1.5 h)
inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: CREATE INDEX ... specimenHolderInstitutions: documented runtime (45 min)
inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented runtime (3.5 min)
bugfix: inputs/CTFS/import_order.txt: added *.src so that these would be installed under new-style import as well. this means that their columns will now be automapped, requiring the names to be renamed to VegCore names in */create.sql. note that VegCore taxonOccurrenceID has been renamed to taxonObservationID since this was last run.
inputs/.geoscrub/run: documented import() runtime (20 min)
bugfix: inputs/.NCBI/import_order.txt: added nodes.src, names.src so that these would be installed under new-style import as well. this means that their columns will now be automapped, requiring the names to be renamed to VegCore names in nodes/create.sql.
bugfix: inputs/input.Makefile: install: for new-style datasources, use the associated runscript instead (the old-style install target will not do everything that's needed for a new-style datasource)
inputs/FIA/COND/postprocess.sql: filtering formula: documented that this was created by Brad, and provided the URL to it on nimoy
inputs/CVS/cvs.~.clean_up.sql: remove plot.realLatitude/realLongitude, since this is private data that should not be publicly visible
inputs/CVS/^taxon_observation.**.sample/create.sql: uncommented identifiedBy since this is now part of taxonObservation_
fix: inputs/CVS/observation_community/create.sql: communityName: populate from commConcept.commName instead, because commInterpretation.commname is not always populated. this requires left-joining to commConcept.
inputs/CVS/observation_community/map.csv: updated output column names to new input column names, to avoid later output column collisions
inputs/CVS/observation_community/header.csv, map.csv: updated input column names for cvs.~.clean_up.sql renamings
inputs/CVS/cvs.~.clean_up.sql: commClass, commConcept fields: prepend table name to avoid inter-table collisions upon join
added inputs/CVS/observation_community/, as for VegBank
inputs/CVS/cvs.~.clean_up.sql: commClass.dba_src_ID: prepend table name to avoid inter-table collisions upon join
added inputs/CVS/observationContributor_/, which adds the people collecting the plot
inputs/CVS/cvs.~.clean_up.sql: observationContributor.dba_src_ID: prepended table name to avoid collision when left-joining to party
bugfix: inputs/input.Makefile: %/header.csv: errexit the command so that errors won't scroll by, which in this case requires `set -o pipefail`
fix: inputs/CVS/taxonObservation_/create.sql: mapped identifiedBy, which involves joining to party
inputs/CVS/cvs.~.clean_up.sql: don't rename taxonInterpretation.PARTY_ID, so that this can be USING-joined to party in inputs/CVS/taxonObservation_/create.sql
inputs/CVS/^taxon_observation.**.sample/map.csv: synced output columns to input columns (which removes the extra *s)
fix: inputs/CVS/plot_/postprocess.sql: locality: include the site name (authorLocation), because this is part of the unique specification of the place that was sampled, and Bob wants this to be included in VegBIEN