mappings/VegCore.csv: Regenerated from wiki. All ambiguous terms now have multiple alternatives, preventing them from being automapped to a single alternative without prompting the user for confirmation
mappings/Makefile: Veg+-VegCore.csv: translate: Fixed bug where need to run on $@ instead of $<
mappings/VegCore.csv: Regenerated from wiki. All mappings/Veg+-VegCore.csv terms are now added as synonyms or separate terms.
mappings/VegCore.csv: Regenerated from wiki. Most ambiguous terms are now split into alternatives, and most mappings/Veg+-VegCore.csv terms are now added as synonyms.
canon: Raise an error if two input terms map to the same simplified string
translate: Changed dictionary to thesaurus, since the map used actually has synonyms rather than definitions
mappings/Makefile: Veg+-VegCore.csv: Translate the thesaurus's output terms using itself in order to map a synonym of an ambiguous term directly to its alternatives list rather than only to the ambiguous term itself
mappings/Makefile: Veg+-VegCore.csv: Run collapse_multimap on the generated map so that all alternatives are included, rather than just the first alternative, when translate maps an ambiguous term
redmine_synonyms: Fixed bug where need to output a CSV rather than TSV to be usable by other programs that use map spreadsheets
Added collapse_multimap, which collapses multimap entries in a spreadsheet dictionary
mappings/Veg+-VegCore.csv: Separate alternatives of ambiguous terms with , instead of ", " for easier machine-parsability
redmine_synonyms: Added support for ambiguous terms, which unlike the synonyms format nests the term (the alternative) under the synonym (the ambiguous term) rather than the synonym under the term. Note that ambiguous terms must also be prefixed with ? to differentiate them from composites (e.g. recordedBy_givenName), which use the same _-based naming convention.
mappings/VegCore.csv: Regenerated from wiki
schemas/vegbien.sql: analytical_stem_view: Renamed scientificNameWithMorphospecies to taxonNameWithMorphospecies because it does not contain the scientific name author, as required by DwC scientificName <http://rs.tdwg.org/dwc/terms/#scientificName>
mappings/Makefile: VegCore.tables.csv: Exclude ambiguous table names, which should not be part of the tables summary (as neither are table synonyms)
input.Makefile: $(translate?): Merged with $(translate), which is not used independently
input.Makefile: Use new translate_ci instead of translate
mappings/Makefile: Use new translate_ci instead of translate
Added translate_ci
mappings/VegCore-VegBIEN.csv: institutionCode list->sourcename mapping: _split(): Also match ; as a separator, and match separators with or without a following space
mappings/Makefile: Added target to create Veg+-VegCore.csv from VegCore.htm, initially commented out until all the synonyms in the existing Veg+-VegCore.csv are added to the VegCore data dictionary <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_data_dictionary>
Added redmine_synonyms, which translates a Redmine HTML page to a thesaurus
lockfile: Linux: Documented why newgrp and recursive invocation of lockfile are needed
lockfile: Linux: Fixed bug where need to change primary group of the dotlockfile process to the group of the dir to contain the lockfile, because dotlockfile otherwise reports a "permission denied" error (even though the directory is actually writable, dotlockfile thinks it isn't). Running dotlockfile with a different primary group is complicated because newgrp, the command that does this, does not pass arguments to the new process, so they must instead be passed via environment variables and a recursive invocation of lockfile (with the $inner recursion flag set). Additionally, exec cannot be used to propagate the PPID (needed by dotlockfile) because newgrp creates a new process rather than using exec, so it must be manually entered into the lockfile after dotlockfile runs.
lockfile: Linux: Fixed bug where need to lower retry count to avoid overflowing the retries variable
lockfile: Linux: Added workaround for bug in dotlockfile where using -1 to retry indefinitely doesn't work, so need to use large integer instead
lockfile: Linux: Use bin/dotlockfile instead of the system's dotlockfile, because the system's dotlockfile is SETGID mail, which prevents it from creating lockfiles in a directory owned by the bien user and group when being run by the login user
bin/: svn:ignore: Added dotlockfile, which is copied from the system during installation
bin/: svn:ignore: Removed no longer applicable test_output
root Makefile: misc-Linux: Added command to copy dotlockfile to the bin/ dir, so that it can be used without being SETGID mail, which would prevent it from creating lockfiles in a directory owned by the bien user and group when being run by the user
root Makefile: core: Added misc-* to install other dependencies
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Removed no longer needed canon_taxonverbatim.family alternative, since the family will be included in the canon_taxonlabel.taxonomicname by the mappings
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use canon_*taxonlabel*.taxonomicname instead of canon_taxonverbatim.taxonomicname as one of the alternatives because only canon_taxonlabel.taxonomicname is guaranteed to be populated by the mappings, while canon_taxonverbatim.taxonomicname will only be populated if the datasource explicitly specifies that field. This distinction is only meaningful for data without a TNRS match, as TNRS supplies canon_taxonverbatim.taxonomicname.
import_all: after_import(): Added wait on tnrs.make's lockfile to ensure that all background scrubbing processes are complete before creating the analytical DB
import_all: Moved `waitpid $jobs` into after_import()
schemas/vegbien.ERD.mwb: Fixed table sizes
schemas/vegbien.ERD.mwb: Regenerated exports
schemas/vegbien.sql: removed all accessioncode fields, as VegBIEN does not use them
Added inputs/FIA/_src/FIADB_version4.accdb and FIADB_version4.sql (created from it using Access To PostgreSQL and the additional transformations at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Tools#MS-Access-database-MDB>)
Added inputs/FIA/COND_unique/, generated from new FIA data
inputs/FIA/FIA_COND_unique/create.sql: Fixed bug where need to remove `CREATE TABLE :table AS` at beginning because that is added by the make target
inputs/FIA/geoscrub.~.clean_up.sql: Moved creation of FIA_COND_unique to FIA_COND_unique/create.sql
README.TXT: Full database import: Updated time until import_all returns control to the shell to account for the TNRS names now being imported concurrently with the inputs rather than before them
mappings/VegCore-VegBIEN.csv: Also include morphospecies in the accepted taxondetermination's taxonverbatim, so that it can easily be retrieved by the analytical DB views
schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the taxonName or scientificName when the name components are not provided, as is the case when there is no scrubbed taxondetermination (because TNRS returns no match)
mappings/VegCore.csv: Regenerated from wiki. This adds Brad's DwC ID terms and their definitions in <https://projects.nceas.ucsb.edu/nceas/attachments/download/621/vegbien_identifier_examples.xlsx>.
join: Added support for direct mappings to VegBIEN by passing through outputs that start with / (indicating an XPath rather than a term)
schemas/vegbien.sql: analytical_stem_view: Added family_matched, taxonName_matched, scientificNameAuthorship_matched
schemas/vegbien.sql: analytical_stem_view: Added family_verbatim, scientificName_verbatim, scientificNameAuthorship_verbatim from datasource taxondetermination
schemas/vegbien.sql: analytical_stem_view: Fixed bug where need to use identifiedBy and dateIdentified from the datasource taxondetermination rather than the canonical taxondetermination (whichever taxondetermination is most scrubbed)
schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent(): is_datasource_current: Fixed bug where need to filter out determinationtypes for matched/accepted determinations, which are not datasource determinations
schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent(): Fixed bug where need to also set existing datasource_current taxondetermination's is_datasource_current to false
xml_dom.py: replace_with_text(): Added support for all scalar (non-Node) types, which will be stringified using strings.ustr()
schemas/functions.sql: Added _fix_date()
sql_io.py: put_table(): Documented that much of the complexity of the normalizing algorithm is due to PostgreSQL not having a native command for insert/on duplicate select
sql_io.py: put_table(): Corrected "insert/if not exists get" to "insert/on duplicate select"
sql_io.py: put_table(): Removed no longer applicable requirement that it be run at the beginning of a transaction, which was only required when the output table was locked during the function call
sql_io.py: put_table(): Documented that the function's insert/if not exists get algorithm does not support database triggers that populate fields covered by a unique constraint
inputs/FIA/_src/_README.TXT: Documented that FIA does not provide data for some states, e.g. HI
config/: Set svn:ignore to exclude *password files
Removing config/bien_read_password from version control
Removing config/bien_password from version control
inputs/FIA/: Added refreshed data (not yet mapped)
input.Makefile: Existing maps discovery: $(exts): Also match uppercase versions of extensions
lib/common.Makefile: Added $(ucase) and $(ci)
inputs/FIA/_src/Makefile: Table bundling: $(tableCsvs): Fixed bug where need to replace % with $* in $(csvPattern)
inputs/FIA/_src/Makefile: Table bundling: Fixed bug where need to remove trailing slashes from dirs that will match a target pattern
inputs/FIA/_src/Makefile: Added Table bundling targets to regroup CSVs by tables
lib/common.Makefile: Added $(mkdir)
Added inputs/FIA/_src/_README.TXT with Bob's comments
input.Makefile: SVN: $(_svnFilesGlob): Added README.TXT
mappings/VegCore.csv: Regenerated from wiki. Synonym lists have now been translated to sections to create a web page anchor for each synonym, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_refactoring#Index-synonyms-as-web-page-anchors>. This enables searching for VegCore synonyms in the data dictionary as well as terms, and makes it possible to swap a term and a synonym while still keeping both as indexed anchors.
mappings/VegCore.csv: Regenerated from wiki. All uncategorized terms have now been moved to tables.
README.TXT: Maintenance: VegCore data dictionary: Added steps to check that no terms were lost when moving terms
inputs/import.stats.xls: Updated import times
schemas/vegbien.sql: analytical_stem_view: coordinates: Only use county_centroids coordinates when datasource coordinates are not provided, not also when datasource coordinates aren't geovalid. This also fixes a bug where (NULL) county_centroids coordinates were used for non-geovalid coordinates even when there was no county_centroids match, rather than including the non-geovalid coordinates.
schemas/vegbien.sql: taxondetermination: Added is_datasource_current, which is autopopulated to the most recent datasource-provided taxondetermination
schemas/vegbien.sql: taxondetermination: Added taxondetermination_single_accepted_determination unique index to facilitate joining on the accepted determination
schemas/vegbien.sql: taxondetermination: Added taxondetermination_single_matched_determination unique index to facilitate joining on the matched determination
schemas/vegbien.sql: taxondetermination: Removed notespublic, notesmgt, which are not used by VegBIEN
schemas/vegbien.sql: taxon_trait_view: scientificName: Use taxonverbatim.taxonname when taxonlabel/taxonverbatim.taxonomicname are not provided, to accommodate TNRS names. This is part of the workaround for the bug where the taxonlabel's taxonomicname (concatenated taxonomicname) is occasionally not populated.
schemas/vegbien.sql: taxon_trait_view: Added workaround for bug where the taxonlabel's taxonomicname (concatenated taxonomicname) is occasionally not populated due to a taxonlabel constraint violation, by using the taxonverbatim's taxonomicname instead in these cases. This bug, which appeared in the r7317 import, is so far not reproducible (tested on Mac OS X), so its cause is unknown, but may be caused by a bug in functions._merge_prefix(), which is run on the taxonlabel's taxonomicname but not the taxonverbatim's taxonomicname.
schemas/vegbien.sql: analytical_stem_view: Added dateIdentified, identificationRemarks per Brad's request (https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#E-mail-on-2013-1-16)
inputs/FIA/_src/Makefile: Added extraction targets to extract zip archives
inputs/FIA/_src/download: Use new Makefile, which uses make logic to determine if a file needs to be downloaded
Added inputs/FIA/_src/Makefile, with targets to download each zip archive
schemas/vegbien.sql: analytical_stem_view: derived terms: Added _bien suffix per Brad's request (https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#Brad-Boyles-comments)
Added inputs/FIA/_src/FIADB_version4.accdb.url
inputs/FIA/_src/download: Only run wget on files that don't yet exist
inputs/FIA/_src/download: Run wget in same directory as script to ensure files get downloaded there
inputs/FIA/_src/download: Set svn:executable
Added inputs/FIA/_src/download to download archives of CSVs for each state
to_do/timeline.2013.xls: Updated with changes during conference call