fix: inputs/publishable datasources.xlsx: updated: conditions of use: Canadensys sources: these should actually be marked as no restrictions, in contrast to Brian E's earlier annotations, because they are public domain: the citation is requested, but not legally required
inputs/publishable datasources.xlsx: updated
inputs/publishable datasources.xlsx: updated: conditions of use: indicated which datasources have no restrictions
inputs/CVS/^taxon_observation.**.sample/test.xml.ref: updated
fix: inputs/CVS/plantConcept_/header.csv: regenerated after fixing the postprocess/cleanup ordering bug (r14827), which had caused header.csv to be incorrectly generated after renames in postprocess.sql were applied
bugfix: inputs/input.Makefile: postprocess must be run after cleanup rather than before because it depends on the cleanup having been performed.
this bug was not previously detected because it only occurs when refreshing a datasource to data in the same format: the refresh would attempt to run an existing postprocess.sql out of order, instead of starting with no postprocess.sql as we usually do.
bugfix: inputs/input.Makefile: $(dbExports): also need to put data.sql before clean_up.sql, etc. previously, this ordering had to be achieved by naming clean_up.sql, etc. so that they would sort after data.sql alphabetically, but it is confusing to have to remember to do this. this fixes a bug in the CVS refresh where cvs.~.clean_up.sql was being run before data.sql, causing some private columns to be deleted before the data was imported into the tables, which created a column mismatch error.
inputs/Cyrille_traits/Makefile: set custom $(null_strs) which handles "NA"
inputs/input.Makefile: pass make var $(null_strs) to invoked commands so it can be used by lib/sql_io.py
fix: *Makefile: changed line endings to \n so that `patch` can work with pasted input. use `svn di --extensions --ignore-eol-style` to verify no diff.
added inputs/CVS/_src/cvs-eep-archive-2014-10-07-correctedCVSData.{data,schema}.sql.ini
bugfix: inputs/CVS/_src/{data,schema}.sql.ini: sourcefilename: this needs to be on the VM's own HD to avoid crashing MSAccess to PostgreSQL. destinationdatabase: added this back since it is fine to leave this blank.
added inputs/CVS/_src/cvs-eep-archive-2013-10-22-VegBIEN.{data,schema}.sql.ini
added inputs/CVS/_src/{data,schema}.sql.ini
inputs/publishable datasources.xlsx: updated: use white text on dark backgrounds for better visibility, and to create more visual contrast for the unredistributable indicators
inputs/publishable datasources.xlsx: updated: consolidated Brian E's new columns into a single "conditions?" column. fix: "what is needed to publish it": renamed to "conditions of use/remaining tasks". "conditions of use": changed color scheme to match "publishable?" columns.
added inputs/CVS/verify/Review of CVS data in BIEN3-RKP2014Sept7-Revised.docx from Bob
bugfix: inputs/CVS/plot_/postprocess.sql: locality: site_name should come before directions_to_place because it is at a higher level of granularity
added inputs/CVS/verify/Review of CVS data in BIEN3-RKP2014Sept7.docx from Bob
added inputs/bien2_traits/_no_import since bien2_traits has been replaced by Cyrille_traits
added inputs/Cyrille_traits/
mappings/VegCore.htm: regenerated from wiki. made verbatimLocality a synonym of locality since they are used to store the same data.
fix: inputs/input.Makefile: $(nonHeaderSrcs): updated to exclude new header.txt
inputs/input.Makefile: added %/list_srcs
fix: lib/sh/util.sh: already_exists_msg(): changed calling convention to avoid it seeming like `return 0` is run if already_exists_msg() throws an error, when in fact already_exists_msg() is just a command that should be run before returning/errexiting
fix: inputs/input.Makefile: need to escape $ in commands, including inside comments
bugfix: inputs/input.Makefile: `$(call add*,$(svnFiles))` must be invoked externally to clear the $(wildcard) cache before expanding $(svnFiles)
inputs/VegBank/run*.log: updated. this adds the function call context in addition to the function location.
fix: inputs/.geoscrub/geoscrub_output/geoscrub.csv.run: make(): added warning that this will truncate the geoscrub database tables
added inputs/VegBank/run.call_graph.log
inputs/VegBank/run.log: updated for echo_vars() changes. the PG* vars, which contain important information, will now not need to be filtered out.
added inputs/VegBank/run.log
fix: inputs/input.Makefile: $(svnFilesGlob): *.log should be in both the subdirs and the main dir
inputs/input.Makefile: $(svnFilesGlob): *.log
inputs/Makefile: install: install an empty VegBIEN schema instead of all the datasources, at Mark's request. this enables loading just a single datasource.
added inputs/CVS/verify/Review of CVS data in BIEN3.docx
bugfix: inputs/input.Makefile: %/install: $(exportHeader) must come before postprocess because postprocess renames columns
bugfix: inputs/input.Makefile: $(import_install_): need `set -o pipefail` to enable errexit
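note on the pipefail entry above: a minimal bash sketch of why errexit needs `set -o pipefail` to catch a failure inside a pipeline. the `false | cat` pipeline is only a stand-in for a failing import step piped into another command; it is not the actual $(import_install_) recipe, and the variable names are made up for illustration.

```shell
# without pipefail, a pipeline's exit status is that of its last command (cat),
# so the failure of the first command (false) is hidden
status_default=$(bash -c 'false | cat; echo $?')

# with pipefail, the pipeline reports the failure of any of its commands
status_pipefail=$(bash -c 'set -o pipefail; false | cat; echo $?')

# errexit alone does not stop at the failing pipeline, so "reached" is printed
reached_errexit_only=$(bash -c 'set -o errexit; false | cat; echo reached' || true)

# errexit plus pipefail stops at the failing pipeline, so nothing is printed
reached_with_pipefail=$(bash -c 'set -o errexit -o pipefail; false | cat; echo reached' || true)

echo "default=$status_default pipefail=$status_pipefail"
echo "errexit only: '$reached_errexit_only', errexit+pipefail: '$reached_with_pipefail'"
```

each case runs in its own `bash -c` subshell so that the options set for one case do not leak into the others.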
inputs/.geoscrub/geoscrub_output/run: documented postprocess() rm=1 runtime (6 min)
fix: inputs/.geoscrub/geoscrub_output/postprocess.sql: map_geovalidity(): unscrubbable names should actually be geo*in*valid, not geovalid=NULL, according to Brad
bugfix: inputs/input.Makefile: sql/install: the ";" after each command inside a $(if) block needs to be inside the $(if) block, too, because otherwise there will be a dangling ";" without a statement when the $(if) expands to nothing (bash does not support empty statements containing just ";")
inputs/publishable datasources.xlsx: updated
inputs/.TNRS/schema.sql: taxon_match: added taxon_scrub_best_match_jerry_lu index to facilitate finding names affected by the match-picking bug (#943)
fix: inputs/HVAA/Specimen/postprocess.sql, map.csv: monthCollected/dayCollected: fix indefinite dates (which aren't supported by Postgres), as decided by Bob (https://docs.google.com/spreadsheets/d/1PI8n0CRttN7ttsXs5qfh5OFFzSoAfJj0gSbylgX6vj4/edit#gid=0)
inputs/.TNRS/schema.sql: taxon_match: added taxon_scrub_by_name index
inputs/.TNRS/schema.sql: taxon_match: added taxon_scrub_by_family index
inputs/.TNRS/schema.sql: taxon_match: added taxon_scrub_by_species_binomial index
bugfix: mappings/VegCore-VegBIEN.csv: prefixed taxonomic ranks: use _concat_nullify() so that the prefix is only added if the epithet is non-NULL
bugfix: inputs/FIA/REF_RESEARCH_STATION/map.csv: mapped country, which is not provided in the FIA data
inputs/.TNRS/schema.sql: taxon_match: removed no longer used scrubbed_unique_taxon_name. the scrubbed name ranks are now generated from the other TNRS columns instead.
inputs/.TNRS/schema.sql: removed no longer used view ValidMatchedTaxon. use taxon_scrub instead.
inputs/.TNRS/schema.sql: taxon_scrub: use taxon_best_match directly, to avoid the need for a separate ValidMatchedTaxon view
fix: inputs/.TNRS/schema.sql: taxon_scrub: merged synonymous columns
schemas/vegbien.sql: taxon_scrub: documented steps to merge synonymous columns
inputs/.TNRS/schema.sql: removed no longer used view MatchedTaxon. use taxon_best_match instead.
inputs/.TNRS/schema.sql: ValidMatchedTaxon: use taxon_best_match now that it's equivalent to MatchedTaxon
fix: inputs/.TNRS/schema.sql: MatchedTaxon: merged synonymous columns
inputs/.TNRS/schema.sql: removed no longer used taxon_scrub.scrubbed_unique_taxon_name.* . use taxon_scrub instead.
inputs/.TNRS/schema.sql: taxon_scrub: use taxon_match derived columns instead of the incorrect values in taxon_scrub.scrubbed_unique_taxon_name.* (which does not work with the multi-match strategy)
inputs/.TNRS/schema.sql: MatchedTaxon: use derived columns from taxon_match. this also incorporates the fixes in the new derived columns.
inputs/.TNRS/schema.sql: taxon_scrub: use derived columns from taxon_match. this also incorporates the fixes in the new derived columns.
inputs/.TNRS/schema.sql: taxon_match: to port derived column changes to vegbiendev: derived_cols_export() code: documented runtime (6 h)
bugfix: inputs/.TNRS/schema.sql: removed no longer used derived column __accepted_infraspecific_label, which had a buggy formula that broke derived_cols_populate()
fix: inputs/.TNRS/schema.sql: taxon_match: to remove a column: updated instructions
**: updated to use the local machine's new hostname, frenzy
inputs/.TNRS/schema.sql: added new derived columns to derived views
fix: schemas/util.sql: derived_col_update(): also need steps to drop column, because DROP __ CASCADE doesn't work when there are dependent views
inputs/.TNRS/schema.sql: _accepted_infraspecific{rank,epithet}: use array slice of new _accepted{genus,specific_epithet,infra_{rank,epithet}}, which is simpler than using remove_prefix() in __accepted_infraspecific_label
inputs/.TNRS/schema.sql: "[accepted_]genus__@DwC__@vegpath.org": don't need to use *Accepted_name anymore because _accepted{genus,specific_epithet,infra_{rank,epithet}} is now generated from *Accepted_name
inputs/.TNRS/schema.sql: taxon_match."__accepted_{genus,specific_epithet}": renamed to "__accepted_{genus,specific_epithet,infra_{rank,epithet}}" since this now includes these other ranks as well
bugfix: inputs/.TNRS/schema.sql: taxon_match."__accepted_{genus,specific_epithet}": use "*Accepted_name" instead of "Accepted_species[_binomial]__@TNRS__@vegpath.org" (from "*Accepted_name_species") because Accepted_name_species apparently sometimes does not match the Accepted_name and uses malformed Unicode characters
inputs/.TNRS/schema.sql: taxon_match: `inputs/.TNRS/data.sql.run refresh`: documented runtime (1 min)
bugfix: inputs/.TNRS/schema.sql: taxon_match: use "Accepted_species[_binomial]__@TNRS__@vegpath.org" instead of "*Accepted_name_species". this fixes a bug in __accepted_infraspecific_label where Accepted_name_species with trailing whitespace could not be prefix-removed from names that contained just a species binomial.
fix: inputs/.TNRS/schema.sql: taxon_match: added derived column "Accepted_species[_binomial]__@TNRS__@vegpath.org", which removes trailing whitespace
inputs/.TNRS/schema.sql: added steps to remove a column and to add a non-derived column
inputs/.TNRS/schema.sql: taxon_match: to remove columns or add columns at the end: merged into "to add a new derived column"
inputs/.TNRS/schema.sql: to add columns in the middle: renamed to "to move a column to the middle" for clarity
inputs/.TNRS/schema.sql: to populate a new column: updated to use util.derived_col_update()
fix: inputs/.TNRS/schema.sql: taxon_match: to remove columns or add columns: also need to run util.recreate_view()
inputs/.TNRS/schema.sql: taxon_match: to remove columns or add columns at the end: don't need to run `rm=1 inputs/.TNRS/data.sql.run` because this is now run by `make schemas/remake`
schemas/util.sql: remove_prefix(), remove_suffix(): support case-insensitive matching
bugfix: inputs/.TNRS/schema.sql: taxon_match.__accepted_infraspecific_label: need to use case-insensitive matching of the removed prefix because TNRS lowercases part of the Accepted_name
bugfix: inputs/.TNRS/schema.sql: taxon_match: use wrapper for util.remove_prefix() so CHECK constraints that use it don't get dropped when the util schema is reinstalled
inputs/.TNRS/schema.sql: taxon_match: COMMENT: added steps to port derived column changes to vegbiendev
bugfix: inputs/.TNRS/schema.sql: taxon_match: derived columns: use new "matched~Name[_no_author]___@TNRS__@vegpath.org" instead of "*Name_matched" so that "No suitable matches found." is removed before concatenating with other fields
inputs/.TNRS/schema.sql: taxon_match: added derived column "matched~Name[_no_author]___@TNRS__@vegpath.org", which removes the "No suitable matches found." string
inputs/.TNRS/schema.sql: reordered derived columns in dependency order
bugfix: inputs/.TNRS/schema.sql: "[accepted_]morphospecies[_binomial]__@Brad__.TNRS@vegpath.org": need to use "[accepted_]genus__@DwC__@vegpath.org" rather than "*Accepted_name" for this for rank = genus
inputs/.TNRS/schema.sql: taxon_match: added derived column "[scrubbed_]morphospecies[_binomial]__@Brad__.TNRS@vegpath.org"
bugfix: inputs/.TNRS/schema.sql: "[accepted_]genus__@DwC__@vegpath.org": need to populate this for rank = genus
inputs/.TNRS/schema.sql: taxon_match: added derived column "[scrubbed_]taxonomicStatus__@DwC__@vegpath.org"
bugfix: inputs/.TNRS/schema.sql: derived columns: use "Accepted_family__@TNRS__@vegpath.org" instead of "*Accepted_name_family" because "*Accepted_name_family" is sometimes missing
fix: inputs/.TNRS/schema.sql: taxon_match: added derived column "Accepted_family__@TNRS__@vegpath.org", which is needed because "*Accepted_name_family" isn't always populated
bugfix: inputs/.TNRS/schema.sql: taxon_match: to add columns in the middle: also need to run util.derived_cols_repopulate() since the dependency order has changed
fix: inputs/.TNRS/schema.sql: taxon_match: COMMENT: to add columns in the middle: also need to run util.derived_cols_update()
fix: inputs/.TNRS/schema.sql: taxon_match: COMMENT: updated util.derived_cols_sync() to util.derived_cols_update()