Activity
From 01/08/2013 to 02/06/2013
02/06/2013
- 07:34 PM Revision 7482: README.TXT: Maintenance: VegCore data dictionary: When moving terms, check that no terms were lost: Updated steps now that VegCore.csv and Veg+-VegCore.csv are sorted by name, so that a comparison of added/deleted counts is not necessary and a simple `svn di` can be used
- 07:33 PM Revision 7481: mappings/Makefile: Veg+-VegCore.csv: Sort terms by name so that reordering terms in the VegCore data dictionary does not cause Veg+-VegCore.csv to change. This makes it much easier to identify synonyms and ambiguous terms that were accidentally deleted during a data dictionary refactoring. (Note that these are no longer included in VegCore.csv, so this is required in addition to sorting VegCore.csv by name.)
- 07:26 PM Revision 7480: mappings/Makefile: VegCore.csv: Sort terms by name so that reordering terms in the VegCore data dictionary does not cause VegCore.csv to change. This makes it much easier to identify terms that were accidentally deleted during a data dictionary refactoring.
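The effect of the sorting in r7480 and r7481 can be illustrated with a minimal Python sketch. This is not the Makefile recipe itself, which the log does not show. The function name `sort_terms_by_name`, the two-column layout, and the sample terms are assumptions made for illustration only.

```python
import csv
import io


def sort_terms_by_name(csv_text):
    """Return csv_text with the header kept first and data rows sorted by column 0."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    body.sort(key=lambda row: row[0])
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + body)
    return out.getvalue()


# Two exports of the same hypothetical terms, listed in a different order
before = "term,definition\nlocality,place name\nfamily,taxonomic family\n"
after = "term,definition\nfamily,taxonomic family\nlocality,place name\n"

# Once both are sorted, reordering alone produces no difference
assert sort_terms_by_name(before) == sort_terms_by_name(after)
```

With the output sorted this way, a plain `svn di` on the generated file would show only terms that were actually added, removed, or changed.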
02/05/2013
- 06:19 PM Revision 7479: mappings/VegCore.csv: Regenerated from wiki. This adds cf_aff.
- 06:18 PM Revision 7478: mappings/Makefile: VegCore.csv: Filter out namespaces by matching only terms whose header links within the data dictionary
- 06:08 PM Revision 7477: mappings/VegCore.csv: Regenerated from wiki. This causes TNRS's Annotations (cf/aff) to be mapped into VegBIEN.
- 06:05 PM Revision 7476: mappings/VegCore-VegBIEN.csv: matched*Fit_fraction: Remapped to taxonconfidence instead of taxonfit
- 05:56 PM Revision 7475: mappings/Makefile: VegCore.csv: Fixed bug where need to remove duplicates, which are no longer supported by canon. This is done by removing alternatives of ambiguous terms when these occur separately from their definitions
- 05:29 PM Revision 7474: mappings/Makefile: VegCore.csv: Removed synonyms and ambiguous terms, since the canonicalization of them is handled by Veg+-VegCore.csv. This also reduces the time it takes canon to build the in-memory Python dict of replacements, which scales to all inputs and should speed up the build/test command.
- 05:22 PM Revision 7473: mappings/Makefile: VegCore.csv: Removed synonyms, since the canonicalization of them is handled by Veg+-VegCore.csv
- 05:10 PM Revision 7472: mappings/Makefile: VegCore.csv: Match terms by header # instead of matching all anchors, in order to include the leading ? before an ambiguous term
- 04:42 PM Revision 7471: mappings/Makefile: Veg+-VegCore.csv: Generate dynamically from VegCore.htm, which allows the VegCore thesaurus to be automatically kept up to date. More importantly, it allows terms in all map spreadsheets to be updated simultaneously when a term is renamed (e.g. by replacing a term with one of its synonyms).
- 04:40 PM Revision 7470: mappings/VegX-VegCore.csv: Applied term renamings from the new dynamically generated Veg+-VegCore.csv. Updates to VegCore term names that have occurred since the data dictionary was created are now able to take effect, which involves remapping several fields.
- 04:32 PM Revision 7469: mappings/VegCore-VegBIEN.csv, inputs/*/*/map.csv: Applied term renamings from the new dynamically generated Veg+-VegCore.csv, which reflects the current state of the data dictionary. (Permanently switching to the new Veg+-VegCore.csv will be a separate change.) Updates to VegCore term names that have occurred since the data dictionary was created are now able to take effect, which involves remapping and inferring units on several fields.
- 04:27 PM Revision 7468: mappings/VegCore-VegBIEN.csv: Mapped basalDiameter_in
- 04:15 PM Revision 7467: mappings/VegCore-VegBIEN.csv: Mapped diameterBreastHeightGentry_cm, basalDiameter_cm, precipitation_mm
- 04:14 PM Revision 7466: schemas/vegbien.sql: Added _mm_to_m()
- 03:56 PM Revision 7465: mappings/Makefile: Veg+-VegCore.csv: Fixed bugs where also need to filter out ambiguous tables, but shouldn't filter out acronyms (which are regular fields)
- 03:40 PM Revision 7464: mappings/VegCore-VegBIEN.csv: locationID->location.sourceaccessioncode: Removed restriction that this mapping can't occur if geovalidation information is present. The locationID is no longer mapped to the place.sourceaccessioncode, so this filter is not necessary.
- 03:38 PM Revision 7463: mappings/VegCore.csv: Regenerated from wiki
- 03:19 PM Revision 7462: mappings/Makefile: Veg+-VegCore.csv: Fixed bug where need to filter out table names to avoid applying table replacements to fields which have the same name as a table
- 03:03 PM Revision 7461: inputs/Madidi/map.csv: Fixed bug where needed to remove duplicate input names, now that translate doesn't allow them
- 01:59 PM Revision 7460: mappings/Makefile: VegX-VegCore.csv: Sort by the input column instead of the output column to keep the sort order stable across VegCore term renames
- 01:46 PM Revision 7459: mappings/Makefile: Veg+-VegCore.csv: Before running collapse_multimap, canonicalize alternatives of ambiguous terms using unambiguous mappings. This ensures that the alternatives lists contain only canonical VegCore terms rather than synonyms.
- 01:43 PM Revision 7458: mappings/VegCore.csv: Regenerated from wiki. All synonyms are now hyperlinked, allowing them to be matched by redmine_synonyms.
- 01:31 PM Revision 7457: mappings/Veg+-VegCore.csv: Removed Sources, Definition columns because source information is now in the VegCore data dictionary
- 01:25 PM Revision 7456: mappings/VegCore.csv: Regenerated from wiki. Ambiguous terms newly available to redmine_synonyms due to the bugfix now have multiple alternatives.
- 01:25 PM Revision 7455: redmine_synonyms: Ambiguous terms: Fixed bug where need to use header # instead of term name to determine whether a term is an alternative, because some alternatives (e.g. verbatimElevation) don't follow the units-suffix naming convention.
- 12:58 PM Revision 7454: mappings/VegCore.csv: Regenerated from wiki. All ambiguous terms now have multiple alternatives, preventing them from being automapped to a single alternative without prompting the user for confirmation
- 12:50 PM Revision 7453: mappings/Makefile: Veg+-VegCore.csv: translate: Fixed bug where need to run on $@ instead of $<
- 12:49 PM Revision 7452: mappings/VegCore.csv: Regenerated from wiki. All ambiguous terms now have multiple alternatives, preventing them from being automapped to a single alternative without prompting the user for confirmation
- 12:22 PM Revision 7451: mappings/VegCore.csv: Regenerated from wiki. All mappings/Veg+-VegCore.csv terms are now added as synonyms or separate terms.
- 10:26 AM Revision 7450: mappings/VegCore.csv: Regenerated from wiki. Most ambiguous terms are now split into alternatives, and most mappings/Veg+-VegCore.csv terms are now added as synonyms.
- 06:12 AM Revision 7449: canon: Raise an error if two input terms map to the same simplified string
- 04:34 AM Revision 7448: translate: Changed dictionary to thesaurus, since the map used actually has synonyms rather than definitions
- 04:31 AM Revision 7447: mappings/Makefile: Veg+-VegCore.csv: Translate the thesaurus's output terms using itself in order to map a synonym of an ambiguous term directly to its alternatives list rather than only to the ambiguous term itself
- 04:26 AM Revision 7446: mappings/Makefile: Veg+-VegCore.csv: Run collapse_multimap on the generated map so that all alternatives are included, rather than just the first alternative, when translate maps an ambiguous term
- 04:25 AM Revision 7445: redmine_synonyms: Fixed bug where need to output a CSV rather than TSV to be usable by other programs that use map spreadsheets
- 04:23 AM Revision 7444: Added collapse_multimap, which collapses multimap entries in a spreadsheet dictionary
- 03:45 AM Revision 7443: mappings/Veg+-VegCore.csv: Separate alternatives of ambiguous terms with , instead of ", " for easier machine-parsability
- 03:31 AM Revision 7442: redmine_synonyms: Added support for ambiguous terms, whose format, unlike the synonyms format, nests the term (the alternative) under the synonym (the ambiguous term) rather than the synonym under the term. Note that ambiguous terms must also be prefixed with ? to differentiate them from composites (e.g. recordedBy_givenName), which use the same _-based naming convention.
- 03:08 AM Revision 7441: mappings/VegCore.csv: Regenerated from wiki
- 02:49 AM Revision 7440: mappings/VegCore.csv: Regenerated from wiki
- 02:22 AM Revision 7439: schemas/vegbien.sql: analytical_stem_view: Renamed scientificNameWithMorphospecies to taxonNameWithMorphospecies because it does not contain the scientific name author, as required by DwC scientificName <http://rs.tdwg.org/dwc/terms/#scientificName>
- 01:56 AM Revision 7438: mappings/Makefile: VegCore.tables.csv: Exclude ambiguous table names, which should not be part of the tables summary (just as table synonyms are not)
- 01:51 AM Revision 7437: input.Makefile: $(translate?): Merged with $(translate), which is not used independently
- 01:50 AM Revision 7436: input.Makefile: Use new translate_ci instead of translate
- 01:47 AM Revision 7435: mappings/Makefile: Use new translate_ci instead of translate
- 01:39 AM Revision 7434: Added translate_ci
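The collapsing step introduced in r7444 and used in r7446 can be sketched as follows. This is only an illustration under stated assumptions, not the actual collapse_multimap script: the input is assumed to be (input term, output term) pairs, alternatives are joined with "," as in r7443, and the sample terms are hypothetical.

```python
def collapse_multimap(rows, sep=","):
    """Merge rows that share an input term into one row per term.

    rows: iterable of (input_term, output_term) pairs.
    Returns a list of (input_term, joined_outputs) pairs in first-seen order.
    """
    collapsed = {}
    for key, value in rows:
        values = collapsed.setdefault(key, [])
        if value not in values:  # skip exact duplicates
            values.append(value)
    return [(key, sep.join(values)) for key, values in collapsed.items()]


# Hypothetical thesaurus rows: an ambiguous term listed once per alternative
rows = [
    ("?elevation", "elevation_m"),
    ("?elevation", "verbatimElevation"),
    ("collector", "recordedBy"),
]
for key, value in collapse_multimap(rows):
    print(key, "->", value)
```

Keeping every alternative in the collapsed entry means a later translation step sees the whole list rather than only the first alternative, which is the behavior r7446 describes.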
02/04/2013
02/02/2013
- 05:39 PM Revision 7432: mappings/Makefile: Added target to create Veg+-VegCore.csv from VegCore.htm, initially commented out until all the synonyms in the existing Veg+-VegCore.csv are added to the VegCore data dictionary <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_data_dictionary>
- 05:38 PM Revision 7431: Added redmine_synonyms, which translates a Redmine HTML page to a thesaurus
- 04:37 PM Revision 7430: lockfile: Linux: Documented why newgrp and recursive invocation of lockfile are needed
- 04:33 PM Revision 7429: lockfile: Linux: Fixed bug where need to change primary group of the dotlockfile process to the group of the dir to contain the lockfile, because dotlockfile otherwise reports a "permission denied" error (even though the directory is actually writable, dotlockfile thinks it isn't). Running dotlockfile with a different primary group is complicated because newgrp, the command that does this, does not pass arguments to the new process, so they must instead be passed via environment variables and a recursive invocation of lockfile (with the $inner recursion flag set). Additionally, exec cannot be used to propagate the PPID (needed by dotlockfile) because newgrp creates a new process rather than using exec, so the PPID must be manually entered into the lockfile after dotlockfile runs.
- 02:41 PM Revision 7428: lockfile: Linux: Fixed bug where need to lower retry count to avoid overflowing the retries variable
- 02:37 PM Revision 7427: lockfile: Linux: Added workaround for bug in dotlockfile where using -1 to retry indefinitely doesn't work, so need to use large integer instead
- 01:49 PM Revision 7426: lockfile: Linux: Use bin/dotlockfile instead of the system's dotlockfile, because the system's dotlockfile is SETGID mail, which prevents it from creating lockfiles in a directory owned by the bien user and group when being run by the login user
- 01:38 PM Revision 7425: bin/: svn:ignore: Added dotlockfile, which is copied from the system during installation
- 01:30 PM Revision 7424: bin/: svn:ignore: Removed no longer applicable test_output
- 01:26 PM Revision 7423: root Makefile: misc-Linux: Added command to copy dotlockfile to the bin/ dir, so that it can be used without being SETGID mail, which would prevent it from creating lockfiles in a directory owned by the bien user and group when being run by the user
- 01:24 PM Revision 7422: root Makefile: core: Added misc-* to install other dependencies
- 11:56 AM Revision 7421: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Removed no longer needed canon_taxonverbatim.family alternative, since the family will be included in the canon_taxonlabel.taxonomicname by the mappings
- 11:49 AM Revision 7420: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use canon_*taxonlabel*.taxonomicname instead of canon_taxonverbatim.taxonomicname as one of the alternatives because only canon_taxonlabel.taxonomicname is guaranteed to be populated by the mappings, while canon_taxonverbatim.taxonomicname will only be populated if the datasource explicitly specifies that field. This distinction is only meaningful for data without a TNRS match, as TNRS supplies canon_taxonverbatim.taxonomicname.
- 11:28 AM Revision 7419: import_all: after_import(): Added wait on tnrs.make's lockfile to ensure that all background scrubbing processes are complete before creating the analytical DB
- 11:18 AM Revision 7418: import_all: Moved `waitpid $jobs` into after_import()
02/01/2013
- 04:57 PM Revision 7417: schemas/vegbien.ERD.mwb: Fixed table sizes
- 04:51 PM Revision 7416: schemas/vegbien.ERD.mwb: Regenerated exports
- 04:34 PM Revision 7415: schemas/vegbien.sql: Removed all accessioncode fields, as VegBIEN does not use them
- 03:10 PM Revision 7414: Added inputs/FIA/_src/FIADB_version4.accdb and FIADB_version4.sql (created from it using Access To PostgreSQL and the additional transformations at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Tools#MS-Access-database-MDB>)
01/31/2013
- 08:20 PM Revision 7413: Added inputs/FIA/COND_unique/, generated from new FIA data
- 08:05 PM Revision 7412: inputs/FIA/FIA_COND_unique/create.sql: Fixed bug where need to remove `CREATE TABLE :table AS` at beginning because that is added by the make target
- 08:03 PM Revision 7411: inputs/FIA/geoscrub.~.clean_up.sql: Moved creation of FIA_COND_unique to FIA_COND_unique/create.sql
- 07:40 PM Revision 7410: README.TXT: Full database import: Updated time until import_all returns control to the shell to account for the TNRS names now being imported concurrently with the inputs rather than before them
- 07:31 PM Revision 7409: mappings/VegCore-VegBIEN.csv: Also include morphospecies in the accepted taxondetermination's taxonverbatim, so that it can easily be retrieved by the analytical DB views
- 07:15 PM Revision 7408: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Fixed bug where need to use the taxonName or scientificName when the name components are not provided, as is the case when there is no scrubbed taxondetermination (because TNRS returns no match)
- 06:08 PM Revision 7407: mappings/VegCore.csv: Regenerated from wiki. This adds Brad's DwC ID terms and their definitions in <https://projects.nceas.ucsb.edu/nceas/attachments/download/621/vegbien_identifier_examples.xlsx>.
- 05:06 PM Revision 7406: schemas/vegbien.ERD.mwb: Regenerated exports
- 04:04 PM Revision 7405: join: Added support for direct mappings to VegBIEN by passing through outputs that start with / (indicating an XPath rather than a term)
- 04:01 PM Revision 7404: mappings/VegCore.csv: Regenerated from wiki
- 11:38 AM Revision 7403: schemas/vegbien.sql: analytical_stem_view: Added family_matched, taxonName_matched, scientificNameAuthorship_matched
- 11:02 AM Revision 7402: schemas/vegbien.sql: analytical_stem_view: Added family_verbatim, scientificName_verbatim, scientificNameAuthorship_verbatim from datasource taxondetermination
- 10:57 AM Revision 7401: mappings/VegCore.csv: Regenerated from wiki
- 10:30 AM Revision 7400: schemas/vegbien.sql: analytical_stem_view: Fixed bug where need to use identifiedBy and dateIdentified from the *datasource* taxondetermination rather than the canonical taxondetermination (whichever taxondetermination is most scrubbed)
- 10:23 AM Revision 7399: schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent(): is_datasource_current: Fixed bug where need to filter out determinationtypes for matched/accepted determinations, which are not datasource determinations
- 10:19 AM Revision 7398: schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent(): Fixed bug where need to also set existing datasource_current taxondetermination's is_datasource_current to false
- 08:52 AM Revision 7397: xml_dom.py: replace_with_text(): Added support for all scalar (non-Node) types, which will be stringified using strings.ustr()
- 03:52 AM Revision 7396: schemas/functions.sql: Added _fix_date()
- 02:49 AM Revision 7395: sql_io.py: put_table(): Documented that much of the complexity of the normalizing algorithm is due to PostgreSQL not having a native command for insert/on duplicate select
- 02:24 AM Revision 7394: sql_io.py: put_table(): Corrected "insert/if not exists get" to "insert/on duplicate select"
- 01:52 AM Revision 7393: sql_io.py: put_table(): Removed no longer applicable requirement that it be run at the beginning of a transaction, which was only required when the output table was locked during the function call
- 01:48 AM Revision 7392: sql_io.py: put_table(): Documented that the function's insert/if not exists get algorithm does not support database triggers that populate fields covered by a unique constraint
- 01:42 AM Revision 7391: inputs/FIA/_src/_README.TXT: Documented that FIA does not provide data for some states, e.g. HI
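The "insert/on duplicate select" pattern named in r7394 and r7395 can be sketched for a single row as follows. This is a simplified illustration, not put_table() itself, which targets PostgreSQL and whose implementation the log does not show. SQLite is used here only so the example is self-contained; the `term` table and the `insert_or_select` name are assumptions.

```python
import sqlite3


def insert_or_select(conn, name):
    """Insert name and return its id; if it already exists, return the existing id."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute("INSERT INTO term (name) VALUES (?)", (name,))
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Unique constraint hit: fall back to selecting the existing row
        row = conn.execute("SELECT id FROM term WHERE name = ?", (name,)).fetchone()
        return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")

first = insert_or_select(conn, "locality")
second = insert_or_select(conn, "locality")  # duplicate: selects instead of inserting
assert first == second
```

In this sketch the fallback SELECT looks the row up by the values that were inserted. One plausible reading of the limitation documented in r7392 is that a trigger rewriting a uniquely constrained field would make such a lookup miss the existing row.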
01/30/2013
- 10:48 PM Revision 7390: config/: Set svn:ignore to exclude *password files
- 10:41 PM Revision 7389: Removing config/bien_read_password from version control
- 10:30 PM Revision 7388: Removing config/bien_password from version control
01/29/2013
- 03:26 PM Revision 7387: inputs/FIA/: Added refreshed data (not yet mapped)
- 03:15 PM Revision 7386: input.Makefile: Existing maps discovery: $(exts): Also match uppercase versions of extensions
- 03:12 PM Revision 7385: lib/common.Makefile: Added $(ucase) and $(ci)
- 01:56 PM Revision 7384: inputs/FIA/_src/Makefile: Table bundling: $(tableCsvs): Fixed bug where need to replace % with $* in $(csvPattern)
- 01:15 PM Revision 7383: inputs/FIA/_src/Makefile: Table bundling: Fixed bug where need to remove trailing slashes from dirs that will match a target pattern
- 01:09 PM Revision 7382: inputs/FIA/_src/Makefile: Added Table bundling targets to regroup CSVs by tables
- 01:09 PM Revision 7381: lib/common.Makefile: Added $(mkdir)
- 11:02 AM Revision 7380: Added inputs/FIA/_src/_README.TXT with Bob's comments
- 11:02 AM Revision 7379: input.Makefile: SVN: $(_svnFilesGlob): Added README.TXT
- 10:33 AM Revision 7378: mappings/VegCore.csv: Regenerated from wiki. Synonym lists have now been translated to sections to create a web page anchor for each synonym, using the steps at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/VegCore_refactoring#Index-synonyms-as-web-page-anchors>. This enables searching for VegCore synonyms in the data dictionary as well as terms, and makes it possible to swap a term and a synonym while still keeping both as indexed anchors.
- 06:19 AM Revision 7377: mappings/VegCore.csv: Regenerated from wiki. All uncategorized terms have now been moved to tables.
- 06:19 AM Revision 7376: README.TXT: Maintenance: VegCore data dictionary: Added steps to check that no terms were lost when moving terms
01/28/2013
- 05:13 PM Revision 7375: inputs/import.stats.xls: Updated import times
- 05:12 PM Revision 7374: mappings/VegCore.csv: Regenerated from wiki
01/25/2013
- 03:54 PM Revision 7373: schemas/vegbien.sql: analytical_stem_view: coordinates: Only use county_centroids coordinates when datasource coordinates are not provided, not also when datasource coordinates aren't geovalid. This also fixes a bug where (NULL) county_centroids coordinates were used for non-geovalid coordinates even when there was no county_centroids match, rather than including the non-geovalid coordinates.
- 03:34 PM Revision 7372: mappings/VegCore.csv: Regenerated from wiki
- 11:27 AM Revision 7371: schemas/vegbien.sql: taxondetermination: Added is_datasource_current, which is autopopulated to the most recent *datasource-provided* taxondetermination
- 11:07 AM Revision 7370: schemas/vegbien.sql: taxondetermination: Added taxondetermination_single_accepted_determination unique index to facilitate joining on the accepted determination
- 11:05 AM Revision 7369: schemas/vegbien.sql: taxondetermination: Added taxondetermination_single_matched_determination unique index to facilitate joining on the matched determination
- 10:32 AM Revision 7368: schemas/vegbien.sql: taxondetermination: Removed notespublic, notesmgt, which are not used by VegBIEN
- 09:30 AM Revision 7367: schemas/vegbien.sql: taxon_trait_view: scientificName: Use taxonverbatim.taxonname when taxonlabel/taxonverbatim.taxonomicname are not provided, to accommodate TNRS names. This is part of the workaround for the bug where the taxonlabel's taxonomicname (concatenated taxonomicname) is occasionally not populated.
- 09:10 AM Revision 7366: schemas/vegbien.sql: taxon_trait_view: Added workaround for bug where the taxonlabel's taxonomicname (concatenated taxonomicname) is occasionally not populated due to a taxonlabel constraint violation, by using the taxonverbatim's taxonomicname instead in these cases. This bug, which appeared in the r7317 import, is so far not reproducible (tested on Mac OS X), so its cause is unknown. It may be caused by a bug in functions._merge_prefix(), which is run on the taxonlabel's taxonomicname but not the taxonverbatim's taxonomicname.
01/24/2013
- 09:51 PM Revision 7365: schemas/vegbien.sql: analytical_stem_view: Added dateIdentified, identificationRemarks per Brad's request at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#E-mail-on-2013-1-16>
- 09:40 PM Revision 7364: inputs/FIA/_src/Makefile: Added extraction targets to extract zip archives
- 09:07 PM Revision 7363: inputs/FIA/_src/download: Use new Makefile, which uses make logic to determine if a file needs to be downloaded
- 09:05 PM Revision 7362: Added inputs/FIA/_src/Makefile, with targets to download each zip archive
- 08:00 PM Revision 7361: schemas/vegbien.sql: analytical_stem_view: derived terms: Added _bien suffix per Brad's request at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#Brad-Boyles-comments>
- 03:22 PM Revision 7360: Added inputs/FIA/_src/FIADB_version4.accdb.url
- 03:18 PM Revision 7359: inputs/FIA/_src/download: Only run wget on files that don't yet exist
- 03:16 PM Revision 7358: inputs/FIA/_src/download: Run wget in same directory as script to ensure files get downloaded there
- 03:06 PM Revision 7357: inputs/FIA/_src/download: Set svn:executable
- 03:04 PM Revision 7356: Added inputs/FIA/_src/download to download archives of CSVs for each state
- 03:03 PM Revision 7355: to_do/timeline.2013.xls: Updated with changes during conference call
- 09:46 AM Revision 7354: schemas/vegbien.sql: taxon_trait_view: Renamed datasource_taxonverbatim to taxonverbatim because there is now only one taxonverbatim
- 09:31 AM Revision 7353: schemas/vegbien.sql: taxon_trait_view: Moved the taxondetermination.iscurrent filter to the join condition to allow using the taxondetermination_single_current_determination index
- 09:24 AM Revision 7352: schemas/vegbien.sql: taxon_trait_view: Join only on the primary taxonlabel, not the accepted taxonlabel, because the scrubbed name is now available directly via the taxonlabel attached to the scrubbed taxondetermination
- 09:11 AM Revision 7351: schemas/vegbien.sql: analytical_stem_view: Added locality
- 08:18 AM Revision 7350: inputs/UNCC/Specimen/map.csv: accession: Remapped to catalogNumber per Bob's corrections
01/23/2013
- 10:31 PM Revision 7349: schemas/vegbien.ERD.mwb: Regenerated exports
- 10:25 PM Revision 7348: mappings/VegCore.csv: Regenerated from wiki
- 10:01 PM Revision 7347: README.TXT: Schema changes: Added instructions to run the appropriate sync function when changing the analytical views
- 09:56 PM Revision 7346: schemas/vegbien.sql: analytical_stem_view: Added georeferenceProtocol, which is set to 'county centroid' when county centroid coordinates are used
- 08:12 PM Revision 7345: make_analytical_db: Don't run export_analytical_db if the SQL script exits with an error
- 08:04 PM Revision 7344: README.TXT: Full database import: record the import times in inputs/import.stats.xls: Added `export version=<version>` because import_times may be run in a shell different from the one that the import was run in
- 08:03 PM Revision 7343: inputs/import.stats.xls: Updated import times
01/22/2013
- 07:43 PM Revision 7342: schemas/vegbien.sql: taxonverbatim: taxonverbatim_unique: Added morphoname for cases when there is just a morphoname, and to distinguish taxonverbatims with the same taxonlabel but different morphonames
- 07:43 PM Revision 7341: schemas/vegbien.sql: stemobservation: Added stemobservation_non_empty CHECK constraint to prevent creating an empty stemobservation for plantobservation rows without stem *data* but with stem *mappings*
- 07:36 PM Revision 7340: schemas/vegbien.sql: stemobservation: Added stemobservation_non_empty CHECK constraint to prevent creating an empty stemobservation for plantobservation rows without stem *data* but with stem *mappings*
- 07:34 PM Revision 7339: schemas/vegbien.sql: stemobservation: Added stemobservation_non_empty CHECK constraint to prevent creating an empty stemobservation for plantobservation rows without stem *data* but with stem *mappings*
- 07:16 PM Revision 7338: schemas/vegbien.sql: taxonverbatim: taxonverbatim_unique: Added morphoname for cases when there is just a morphoname, and to distinguish taxonverbatims with the same taxonlabel but different morphonames
- 07:11 PM Revision 7337: schemas/vegbien.sql: taxonverbatim: Allow taxonlabel_id to be NULL when morphoname is provided
- 07:09 PM Revision 7336: schemas/vegbien.sql: taxonverbatim: Allow taxonlabel_id to be NULL when morphoname is provided
- 07:04 PM Revision 7335: schemas/vegbien.sql: taxonverbatim: Added source_id to allow creating taxonverbatims without a (scoping) taxonlabel
- 05:34 PM Revision 7334: schemas/vegbien.sql: analytical_stem_view: Removed speciesBinomialWithMorphospecies now that it's duplicated by scientificNameWithMorphospecies
- 05:28 PM Revision 7333: schemas/vegbien.sql: analytical_stem_view: scientificNameWithMorphospecies: Create it using the speciesBinomialWithMorphospecies formula, per Brad's request at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#2013-1-18>
- 05:05 PM Revision 7332: schemas/vegbien.sql: analytical_stem_view: Added coordinateSource to indicate whether coordinates are from county_centroids (georeferencing) or the source data
- 05:00 PM Revision 7331: schemas/vegbien.sql: Added coordinatesource enum
- 04:50 PM Revision 7330: mappings/VegCore.csv: Regenerated from wiki
- 04:34 PM Revision 7329: schemas/vegbien.sql: analytical_stem_view: coordinates: Also use the county_centroids coordinates when the datasource coordinates are not geovalid. (Note that canon_place.geovalid will be NULL, i.e. not true, when the datasource coordinates are NULL.)
- 04:28 PM Revision 7328: schemas/vegbien.sql: scientificName: Set to taxonverbatim.taxonname instead per Brad's changes at <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Spot-checking#2013-1-18>. Renamed to taxonName since this now doesn't include the author, which is part of DwC's scientificName field.
- 03:55 PM Revision 7327: schemas/vegbien.sql: sync_analytical_stem_to_view(): Support running the function when dependent views do not exist. This allows using the sync function when changing column names of the analytical_stem_view, which sometimes requires manually dropping and re-creating the analytical_aggregate_view.
- 02:49 PM Revision 7326: backups/Makefile: %.md5/test: Added comment to run with `make -s` to avoid echoing make commands
- 02:42 PM Revision 7325: README.TXT: Full database import: Added steps to scrub unscrubbed taxondeterminations (if they are not scrubbed automatically)
- 02:06 PM Revision 7324: inputs/.geoscrub/_src/README.TXT: Added e-mails from Jim about how the county_centroids data was generated
- 01:18 PM Revision 7323: schemas/vegbien.sql: analytical_stem_view: coordinates: Use new county_centroids coordinates and uncertainty when the datasource's coordinates are not available
- 01:10 PM Revision 7322: Added inputs/.geoscrub/county_centroids/ from Jim
- 01:09 PM Revision 7321: inputs/.geoscrub/import_order.txt: Added geoscrub_output
- 12:24 PM Revision 7320: inputs/import.stats.xls: Updated import times
- 12:19 PM Revision 7319: README.TXT: Full database import: In PostgreSQL: Added step to check that there are TNRS taxondeterminations
- 12:17 PM Revision 7318: README.TXT: Full database import: In PostgreSQL: Added step to check that unscrubbed_taxondetermination_view returns no rows
01/18/2013
- 02:37 PM Revision 7317: Added inputs/newWorld/newWorldCountries/_no_import
- 02:33 PM Revision 7316: to_do/timeline.2013.xls: Updated with Brad's modifications
- 02:18 PM Revision 7315: Added inputs/FIA/_src/FIA_summary.b-e.00079.pdf from Bob
- 02:07 PM Revision 7314: Added inputs/.herbaria/_archive/
- 01:02 PM Revision 7313: inputs/.herbaria/: Removed no longer needed geoscrub.*.sql, which has been replaced with bien3_adb.*.sql
- 01:00 PM Revision 7312: inputs/.herbaria/: Removed no longer needed herbaria/. Use ih/ instead.
- 12:58 PM Revision 7311: Added inputs/.herbaria/ih/ and corresponding bien3_adb MySQL export
- 12:43 PM Revision 7310: mappings/VegCore-VegBIEN.csv: Don't create NCBI crosslinks for the matched taxonomic name. These crosslinks are no longer needed now that TNRS provides a separate accepted name on which crosslinks can be made.
- 12:32 PM Revision 7309: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Include the accepted name's row next to the matched name's row instead of merging the two together into one TNRS row, to allow including separate taxondeterminations for the matched and accepted names. Added Max_score from TNRS.tnrs.
- 12:25 PM Revision 7308: schemas/vegbien.sql: taxondetermination_set_iscurrent(): Added new determinationtype accepted to sort order
- 12:01 PM Revision 7307: mappings/VegCore-VegBIEN.csv: Mapped accepted* taxonomic name, now to separate accepted taxondetermination
- 11:35 AM Revision 7306: mappings/VegCore.csv: Regenerated from wiki
- 11:20 AM Revision 7305: schemas/vegbien.sql: taxondetermination_set_iscurrent(): Changed TNRS determinationtype from computer to matched, to allow for a separate accepted determinationtype
- 10:57 AM Revision 7304: schemas/vegbien.sql: taxonlabel: Removed creationdate, which duplicates taxondetermination.determinationdate
- 10:08 AM Revision 7303: schemas/vegbien.sql: analytical_stem_view: isNewWorld: Removed no longer needed COALESCE() to false, because newWorldCountries now uses false where applicable instead of NULL. This also ensures that isNewWorld will be NULL if there is no country name to test, which was not the case in the previous workaround.
- 10:02 AM Revision 7302: Added inputs/newWorld/newWorldCountries/ with postprocess.sql that sets isNewWorld to false wherever it's NULL. (The input table only marks New World countries as true, but doesn't mark non-New World countries as false.)
- 09:50 AM Revision 7301: schemas/vegbien.sql: analytical_stem_view: isNewWorld: Fixed bug where need to COALESCE() "newWorldCountries"."isNewWorld" to false, because it is only set to a boolean for countries that are New World
- 09:19 AM Revision 7300: README.TXT: Full database import: freeing disk space: Updated import schema size, which is smaller due to the removed CTFS staging tables, removed duplicate rows, and possibly fewer index holes
- 08:56 AM Revision 7299: README.TXT: Full database import: After running `make schemas/$version/publish`, added `unset version` to make sure future version-dependent commands use the public schema
- 08:50 AM Revision 7298: schemas/vegbien.sql: taxon_trait_view: Fixed bug where measurementUnit needed to be set to trait.units, not name
- 08:42 AM Revision 7297: schemas/vegbien.sql: provider_count_view: Don't set default values for sourcetype/observationtype, because the appropriate values are now set for all top-level inputs and these defaults are not applicable for data owners not in geoscrub.herbaria
- 08:41 AM Revision 7296: inputs/bien2_traits/Source/map.csv: Mapped observationType
- 08:27 AM Revision 7295: schemas/vegbien.sql: taxondetermination: Removed taxondetermination_computer_min_fit CHECK constraint, whose functionality is now duplicated by unscrubbed_taxondetermination_view's Max_score filter condition. The score threshold value should only be maintained in one place, namely unscrubbed_taxondetermination_view.
- 08:23 AM Revision 7294: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to filter out any names that will be rejected by taxondetermination's constraints, because otherwise, these names will stay in unscrubbed_taxondetermination_view and be repeatedly reimported
- 07:38 AM Revision 7293: inputs/.TNRS/schema.sql: tnrs: Added Max_score column for use in filtering out names that will be rejected by taxondetermination's constraints
- 07:22 AM Revision 7292: inputs/.TNRS/schema.sql: Renamed tnrs_populate_accepted_scientific_name() trigger to tnrs_populate_derived_fields() to accommodate additional derived fields
- 07:14 AM Revision 7291: tnrs_db: Support multiple appended columns in the tnrs table
- 07:13 AM Revision 7290: csvs.py: ColInsertFilter: Support adding multiple, consecutive columns
- 06:30 AM Revision 7289: schemas/functions.sql: _max(), _min(): Put $n params all on one line to match other aggregating functions
- 06:28 AM Revision 7288: schemas/functions.sql: _max(), _min(): Use PostgreSQL built-in functions GREATEST(), LEAST() instead of a query with aggregating functions
- 06:02 AM Revision 7287: README.TXT: Added Single datasource import section with commands to import/reimport/scrub just a datasource rather than the full DB
- 05:54 AM Revision 7286: schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent_on_delete() trigger: Fixed bug where need to suppress any foreign key exception, which occurs during a cascading delete because the associated taxonoccurrence has already been deleted, preventing any other taxondeterminations of that taxonoccurrence from being updated
- 05:35 AM Revision 7285: input.Makefile: Taxonomic scrubbing: Added reimport_scrub
- 05:34 AM Revision 7284: input.Makefile: Import to VegBIEN: Added reimport
- 05:28 AM Revision 7283: input.Makefile: Taxonomic scrubbing: Added rescrub
- 05:21 AM Revision 7282: input.Makefile: Taxonomic scrubbing: Added scrub target and use it in import_scrub
- 05:18 AM Revision 7281: input.Makefile: Import to VegBIEN: Moved import, rm to top of section since they are top-level targets and don't depend on the variables defined for %/import
- 05:17 AM Revision 7280: input.Makefile: Moved rm to Import to VegBIEN section
- 05:16 AM Revision 7279: input.Makefile: Moved taxonomic scrubbing targets to separate Taxonomic scrubbing section
- 04:43 AM Revision 7278: inputs/import.stats.xls: Updated import times
- 03:34 AM Revision 7277: schemas/vegbien.sql: provider_count_view: Include only sources with at least one row. Currently (as of r7023), all entries in BIEN2's geoscrub.herbaria are also in VegBIEN, so the filter is not yet necessary, but switching to bien3_adb.ih could create source entries without data rows which should be excluded from the providers list.
- 03:25 AM Revision 7276: import_all: Output the PIDs of the import_scrub and after_import processes, so those processes can be managed without shell job control. This is useful if the connection is lost to the remote shell running the import, which prevents using job control on the import processes.
- 01:23 AM Revision 7275: input.Makefile: Import to VegBIEN: import_scrub: Run `make scrub` in the background, to allow the import to continue with the next table rather than having to wait for the current table to be scrubbed
- 12:53 AM Revision 7274: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Moved waitself call to top of script
- 12:52 AM Revision 7273: inputs/import.stats.xls: Updated import times
- 12:24 AM Revision 7272: inputs/import.stats.xls: Added Postprocessing section for use with the next import
- 12:05 AM Revision 7271: inputs/import.stats.xls: Updated import times. Total does not yet include postprocessing.
01/17/2013
- 11:29 PM Revision 7270: import_times: Add blank line before "Postprocessing logs" to separate it from the input logs
- 11:28 PM Revision 7269: import_times: Separate out the postprocessing logs (e.g. public.unscrubbed_taxondetermination_view), as the import times in these logs are not aggregated together (each input has its own run of the postprocessing script)
01/16/2013
- 02:55 PM Revision 7268: root Makefile: Datasources: import: Use new import_scrub instead of import (input.Makefile)
- 02:51 PM Revision 7267: import_all: Use new import_scrub (input.Makefile) instead of import, which avoids needing to start background processes for tnrs-remake and scrub-remake
- 02:50 PM Revision 7266: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Fixed bug where need to use tnrs.make's lockfile instead because can't be importing while tnrs.make is scrubbing. tnrs.make leaves tnrs in an incomplete state while running because the accepted names are parsed *after* their matched names. Using a separate lockfile would cause some accepted names to be missing.
- 02:27 PM Revision 7265: input.Makefile: Import to VegBIEN: Added import_scrub, which runs `make scrub` after the import
- 02:26 PM Revision 7264: root Makefile: Datasources: Added scrub, which runs tnrs-remake and scrub-remake
- 02:18 PM Revision 7263: inputs/.TNRS/*/*.make: Only allow one instance of the script to be running at any time, by using new waitself
- 02:15 PM Revision 7262: waitpid, lockfile: Changed $interval default to 5s to work with smaller imports, where less waiting is needed
- 02:14 PM Revision 7261: Added waitself
- 02:11 PM Revision 7260: bin/lockfile: Include the PID in the lockfile to avoid the need to manually remove lockfiles. On Mac, this requires using shlock instead of lockfile.
- 01:35 PM Revision 7259: Added bin/lockfile
- 01:34 PM Revision 7258: Added pid2name
- 01:33 PM Revision 7257: Added name2pids
- 01:33 PM Revision 7256: waitpid: Use `ps` instead of /proc to also work on Mac
- 01:07 PM Revision 7255: inputs/.TNRS/tnrs/tnrs.make: Fixed bug where need special handling to support being run as a .make script
- 11:59 AM Revision 7254: inputs/.geoscrub/_src/README.TXT: Added dates for e-mails from Jim
- 11:57 AM Revision 7253: inputs/.geoscrub/_src/README.TXT: Added e-mail from Jim about repository with scripts to generate the geoscrub_output table
- 11:02 AM Revision 7252: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to use tnrs_accepted.Name_submitted IS NOT NULL rather than tnrs_accepted.* IS NOT NULL, because tnrs_accepted.* (which plain tnrs_accepted gets changed to by PostgreSQL) checks *each field* of the tnrs_accepted tuple rather than checking if the tuple itself is NULL
- 10:23 AM Revision 7251: inputs/.TNRS/schema.sql: Added back tnrs+accepted view, which is useful for debugging the import of the TNRS results
- 09:21 AM Revision 7250: inputs/REMIB/Specimen/postprocess.sql: Added back ARIZ, NY because some REMIB specimens for these datasources are not yet in the datasources themselves
- 08:43 AM Revision 7249: Added inputs/REMIB/Specimen/postprocess.sql to remove institutions that we have direct data for
- 08:43 AM Revision 7248: Placed inputs/REMIB/_archive/ under version control
- 08:23 AM Revision 7247: Added inputs/SpeciesLink/Specimen/postprocess.sql to remove institutions that we have direct data for
- 08:21 AM Revision 7246: Placed inputs/SpeciesLink/_archive/ under version control
- 07:56 AM Revision 7245: input.Makefile: $(import?): Renamed $public_import option to $full_import because it applies to any import of all datasources, not just a public import on vegbiendev
- 07:23 AM Revision 7244: schemas/vegbien.sql: analytical_stem_view: Changed `WHERE COALESCE(taxondetermination.iscurrent, true)` to a join condition to enable using the taxondetermination_single_current_determination index, which produces the filtered rows directly. Note that this index will not be used for full-database imports, because the query planner uses hash joins everywhere instead of nested loops.
- 06:47 AM Revision 7243: db_xml.py: put_table(): Fixed bug where for views, shouldn't advance start (OFFSET clause) after each chunk, because views are typically dynamic and will contain a new set of rows after the first set is imported
- 06:41 AM Revision 7242: sql.py: Added view_exists()
- 06:16 AM Revision 7241: inputs/.TNRS/schema.sql: Removed no longer used tnrs_canon. unscrubbed_taxondetermination_view uses its definition directly instead.
- 06:14 AM Revision 7240: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Added comment from tnrs_canon
- 06:12 AM Revision 7239: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Added comment from tnrs_canon
- 06:09 AM Revision 7238: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Do the tnrs_canon joins manually instead of using tnrs_canon, to allow PostgreSQL to use a nested loop join on just the needed tnrs rows instead of a hash self-join of all tnrs rows. The query planner is not yet advanced enough to automatically integrate the select on the view into the top-level joins list, which would make this change automatically.
- 05:52 AM Revision 7237: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: rowsAdded(): Look at last 100 rows instead of last 10, because rows are added to the log file each time the script waits and the Inserted # new rows message must be in the tailed rows
- 05:48 AM Revision 7236: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: rowsAdded(): Fixed bug where need to test if log file exists before using it in tail, because if tail fails and causes rowsAdded to return false, this error exit status will be indistinguishable from false for no rows added and the script will keep going
- 05:40 AM Revision 7235: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Fixed bug where need special handling to support being run as a .make script
- 03:35 AM Revision 7234: input.Makefile: Editing import: Added unscrub to remove TNRS taxondeterminations
- 03:34 AM Revision 7233: psql_script_vegbien: Added no_query_results option to hide results of calls to void functions
- 03:33 AM Revision 7232: schemas/vegbien.sql: Added delete_scrubbed_taxondeterminations()
- 01:43 AM Revision 7231: root Makefile: python-Darwin: Added instructions to install dateutil for Python 3 as well as Python 2, for use in PL/Python functions
- 01:42 AM Revision 7230: root Makefile: python-Darwin: Added note that Python 2 comes preinstalled
- 01:15 AM Revision 7229: Added inputs/GBIF/Specimen/postprocess.sql to remove institutions that we have direct data for
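Revision 7260 above stores the PID in the lockfile so that stale lockfiles do not have to be removed by hand. A minimal Python sketch of that idea follows. It is not the actual bin/lockfile script (which is a shell script using lockfile/shlock); the names `pid_alive` and `acquire` are hypothetical, and the liveness check assumes a POSIX system.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire(path):
    """Try to take the lock at `path`; return True on success."""
    try:
        # O_EXCL makes creation atomic: only one process can create the file
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        with open(path) as f:
            text = f.read().strip()
        if text.isdigit() and pid_alive(int(text)):
            return False  # the recorded owner is still running
        # Stale lock: the recorded owner has exited, so reclaim it.
        # (Simplification: a lockfile whose owner has not yet written its
        # PID is also treated as stale here.)
        os.remove(path)
        return acquire(path)
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True
```

Because the owner's PID is recorded, a lock left behind by a crashed process is reclaimed automatically on the next `acquire()` call instead of blocking later runs.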
01/15/2013
01/14/2013
- 05:21 PM Revision 7227: inputs/.TNRS/schema.sql: Removed no longer used array_to_string(). The IMMUTABLE wrapper is only needed for index conditions and other places that require an IMMUTABLE function.
- 05:14 PM Revision 7226: input.Makefile: Maps validation: %/new_terms.csv: Filter out terms that map to UNUSED, because these are not mappings that are useful as VegCore synonyms
- 05:13 PM Revision 7225: input.Makefile: Maps validation: %/new_terms.csv: Filter out terms that map to UNUSED, because these are not mappings that are useful as VegCore synonyms
- 05:12 PM Revision 7224: README.TXT: Data import: Checking free disk space: Updated import schema size to 110GB
- 04:37 PM Revision 7223: Added inputs/Madidi/_README.TXT
- 04:35 PM Revision 7222: new_terms.csv: Regenerated
- 04:34 PM Revision 7221: inputs/Madidi/new_terms.csv: Regenerated
- 04:19 PM Revision 7220: inputs/Madidi/_archive/2010-1-2/: Set svn:ignore
- 04:18 PM Revision 7219: inputs/Madidi/_README.TXT: Archived to _archive/2010-1-2/
- 03:43 PM Revision 7218: inputs/Madidi/: Refreshed. Note that new export has a completely new schema.
- 03:42 PM Revision 7217: inputs/Madidi/: Refreshed. Note that new export has a completely new schema.
- 01:53 PM Revision 7216: input.Makefile: Maps validation: %/new_terms.csv: Filter out terms that map to UNUSED, because these are not mappings that are useful as VegCore synonyms
- 01:18 PM Revision 7215: mappings/VegCore-VegBIEN.csv: fieldNumber (authorEventCode): Fixed bug where locationevent.authorlocationcode should be authoreventcode
- 12:19 PM Revision 7214: Added inputs/Madidi/map.csv, created from new_terms.csv
- 12:16 PM Revision 7213: inputs/Madidi/_archive/: Set svn:ignore
- 12:15 PM Revision 7212: csvs.py: sniff(): TSVs: Don't turn off quoting, because some TSVs (such as Madidi.IndividualObservation) do quote fields
- 12:13 PM Revision 7211: csvs.py: TsvReader: Use csv.reader.next() when possible to support quoted fields, such as in Madidi.IndividualObservation
- 11:43 AM Revision 7210: input.Makefile: Configuration: $(exts): Added .dat, which the new Madidi files use
- 08:39 AM Revision 7209: mappings/Makefile: VegCore.tables.csv: Removed no longer needed removal of Namespaces table, which is now marked as just a section, not a table
- 08:37 AM Revision 7208: mappings/VegCore.csv: Regenerated from wiki
- 07:39 AM Revision 7207: Added to_do/timeline.2013.xls (from Brad, converted to .xls)
- 07:30 AM Revision 7206: to_do/timeline.doc: Renamed to timeline.2012.doc to allow for a separate 2013 timeline
01/11/2013
- 05:05 PM Revision 7205: README.TXT: Data import: Deleting imports before the last: Added instructions to keep a previous import instead of deleting it
- 04:22 PM Revision 7204: input.Makefile: Staging tables installation: $(logInstall): Always log the installation, regardless of the $log env var, because $log is set by default on development machines but an install log should still be created
- 01:03 PM Revision 7203: schemas/vegbien.ERD.mwb: Regenerated exports
- 10:19 AM Revision 7202: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Fixed bug where need to handle the case where (SELECT source.source_id FROM source WHERE source.shortname = 'TNRS') is NULL because no TNRS names have been imported yet
- 09:44 AM Revision 7201: **/new_terms.csv, **/unmapped_terms.csv: Regenerated using `make missing_mappings`
- 09:19 AM Revision 7200: mappings/VegCore-VegBIEN.csv: morphoname: Remapped to the original rather than current taxondetermination because this is the *original* name applied by the author
- 09:16 AM Revision 7199: inputs/SALVIAS*/Organism/map.csv: Remapped voucher_string/coll_number to recordNumber instead of catalogNumber, because this number is actually applied by the collector rather than by a herbarium
- 09:11 AM Revision 7198: mappings/VegCore-VegBIEN.csv: Mapped recordNumber to new specimenreplicate.collectionnumber
- 09:02 AM Revision 7197: mappings/VegCore-VegBIEN.csv: Also map recordNumber (collectionnumber) to the indirect voucher's specimenreplicate
- 08:48 AM Revision 7196: inputs/*/*/map.csv: Remapped recordNumber to new individualCode where applicable
- 08:44 AM Revision 7195: mappings/VegCore-VegBIEN.csv: Mapped individualCode. authortaxoncode: Prefer tag over recordNumber (collectionnumber), because this applies to the plant rather than the specimen.
- 08:17 AM Revision 7194: mappings/VegCore-VegBIEN.csv: Mapped morphoname
- 08:16 AM Revision 7193: mappings/VegCore.csv: Regenerated from wiki
- 08:14 AM Revision 7192: mappings/VegCore.csv: Regenerated from wiki
- 08:04 AM Revision 7191: schemas/vegbien.sql: taxonverbatim: Added morphoname (which is different from the morphospecies suffix)
- 07:33 AM Revision 7190: schemas/vegbien.sql: plantobservation: Renamed collectionnumber to authorplantcode since this number, which identifies the *plant*, is actually different from the collectionnumber that identifies the *specimen* collected from it. This distinction is meaningful for plots data, but generally not for specimens data.
- 07:28 AM Revision 7189: schemas/vegbien.sql: plantobservation: Renamed collectionnumber to authorplantcode since this number, which identifies the *plant*, is actually different from the collectionnumber that identifies the *specimen* collected from it. This distinction is meaningful for plots data, but generally not for specimens data.
- 07:23 AM Revision 7188: schemas/vegbien.sql: specimenreplicate: Added collectionnumber
- 07:17 AM Revision 7187: schemas/vegbien.sql: taxonlabel: Removed no longer used matched_label_fit_fraction. Use taxondetermination.taxonfit instead.
- 07:02 AM Revision 7186: inputs/*/*/test.xml.ref: Restored inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
- 06:55 AM Revision 7185: schemas/vegbien.ERD.mwb: Expanded analytical_stem to fit the width of all fields
- 06:53 AM Revision 7184: schemas/vegbien.sql: taxondetermination: taxondetermination_computer_min_fit CHECK constraint: Fixed bug where need to use CASE instead of OR when a branch of an OR shouldn't be evaluated, because PostgreSQL doesn't guarantee short-circuit evaluation of OR
- 06:38 AM Revision 7183: README.TXT: Debugging: Added instructions for "binary chop" debugging, which requires syncing the DB schema to the svn working copy
- 06:08 AM Revision 7182: mappings/VegCore-VegBIEN.csv: Removed no longer used mappings for verbatimScientificName in _if conditions
- 06:08 AM Revision 7181: inputs/.NCBI/nodes/test.xml.ref: Restored inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
- 06:06 AM Revision 7180: sql_io.py: put_table(): DuplicateKeyException: Uniquifying input table to avoid internal duplicate keys: Also filter out duplicate rows in the out_table, so that they don't create duplicate key errors and the resulting index holes
- 06:01 AM Revision 7179: sql.py: distinct_table(): Added support for custom joins used in creating the new table. This can then be used by sql_io.put_table() to filter out duplicate rows in the out_table, so that they don't create duplicate key errors and the resulting index holes.
- 05:53 AM Revision 7178: README.TXT: Documentation: Redmine-formatted list of steps for column-based import: Added step to reinstall public schema first, to reset the sequences so that they don't create a diff when the new steps.by_col.log.sql is committed
- 05:48 AM Revision 7177: Added inputs/ACAD/Specimen/logs/steps.by_col.log.sql
- 05:45 AM Revision 7176: sql_gen.py: Join: Added support for mapping values which are lists, for use in USING joins
- 05:40 AM Revision 7175: inputs/SALVIAS/*/test.xml.ref: Restored SALVIAS* inserted row counts, which had gotten auto-accepted from a test run on a non-empty DB
- 05:01 AM Revision 7174: schemas/vegbien.sql: analytical_stem: Added locationName (authorPlotCode), subplot, individualCode (authorPlantCode) for use in validation
- 04:57 AM Revision 7173: schemas/vegbien.sql: sync_analytical_stem_to_view(): Drop and re-create dependent objects to avoid errors that analytical_stem can't be dropped because of dependents
- 04:56 AM Revision 7172: schemas/vegbien.sql: sync_analytical_stem_to_view(): Changed to PL/pgSQL function to allow adding PL/pgSQL commands
- 03:26 AM Revision 7171: schemas/vegbien.ERD.mwb: Moved family_higher_plant_group to leave room for analytical_stem to expand
- 03:08 AM Revision 7170: mappings/VegCore-VegBIEN.csv: Removed no longer used mappings for verbatimScientificName in _if conditions
- 02:59 AM Revision 7169: mappings/VegCore-VegBIEN.csv: Removed taxonlabel for original taxondetermination, because the original taxondetermination is not scrubbed by scrub.make (only the most current taxondetermination gets scrubbed, because only a single scrubbed determination is added by scrub.make). This still leaves the original taxondetermination's taxonverbatim, which stores the taxonomic information for historical purposes.
- 02:44 AM Revision 7168: mappings/VegCore-VegBIEN.csv: Removed no longer used accepted and verbatim (parsed) taxonlabels, which have been replaced by a single accepted or matched taxondetermination created by scrub.make
- 02:34 AM Revision 7167: Removed no longer used inputs/.TNRS/tnrs_accepted, tnrs_other. Use the tnrs_canon view instead.
- 02:22 AM Revision 7166: Removed no longer used inputs/.TNRS/tnrs_accepted, tnrs_other. Use the tnrs_canon view instead.
- 02:18 AM Revision 7165: Added inputs/.TNRS/_archive/
- 02:18 AM Revision 7164: Added inputs/.TNRS/tnrs/cleanup.sql to prevent running the default cleanup operations, which don't work on tables which have views referencing them (as is the case for tnrs, which is referenced by tnrs_canon)
- 02:07 AM Revision 7163: import_all: Removed no longer needed TNRS import, which has been replaced by scrub.make (which adds TNRS taxondeterminations after the import instead of creating taxonlabel links before it)
- 02:03 AM Revision 7162: mappings/VegCore-VegBIEN.csv: Removed TNRS input taxonlabels meant to cross-link to taxonlabels added by the TNRS import, because TNRS taxondeterminations are now created instead
- 01:42 AM Revision 7161: schemas/vegbien.sql: analytical_stem_view: Use just the main taxonlabel created by scrub.make instead of all the additional taxonlabels created by the TNRS import
- 01:11 AM Revision 7160: mappings/VegCore-VegBIEN.csv: main taxonverbatim.morphospecies "if has verbatim name" condition: Fixed bug where need to remove the taxonIsCanonical flag, because the TNRS.public.unscrubbed_taxondetermination_view table (which uses this flag) *should* include this field (although not other places where the morphospecies is stored by other TNRS tables)
- 12:49 AM Revision 7159: schemas/vegbien.sql: taxondetermination: taxondetermination_set_iscurrent() trigger: Also run on delete, to mark another taxondetermination as the current one when a current taxondetermination is deleted
- 12:18 AM Revision 7158: inputs/.TNRS/schema.sql: tnrs_canon: Annotations: Always use value from the matched name, because the accepted name does not have this
- 12:05 AM Revision 7157: mappings/VegCore-VegBIEN.csv: primary taxonlabel's parent taxonlabel: Fixed bug where a taxonverbatim was incorrectly being created solely to store the taxonRank, even though it was already stored in the taxonlabel's rank field
01/10/2013
- 11:52 PM Revision 7156: mappings/VegCore-VegBIEN.csv: Don't map morphospecies to the parsed taxonlabel's taxonepithet, because this causes an extra, parsed taxonlabel to be created for TNRS.public.unscrubbed_taxondetermination_view. It is not needed by the other TNRS tables.
- 11:45 PM Revision 7155: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Omit Infraspecific_rank to help avoid creating a separate, parsed taxonlabel. Don't map to taxonRank because Name_matched_rank is populated more often.
- 11:34 PM Revision 7154: inputs/.TNRS/public.unscrubbed_taxondetermination_view/scrub.make: Reduced $maxPause to 4 hr, because new taxondeterminations are being added throughout the import, so it is unlikely that more than 4 hr would pass between successive imports of taxondeterminations (causing scrub.make to stop prematurely)
- 11:23 PM Revision 7153: inputs/.TNRS/schema.sql: Removed no longer used tnrs+accepted. Use tnrs_canon or a self-join of tnrs instead
- 11:22 PM Revision 7152: schemas/vegbien.sql: tnrs_input_name: Use TNRS.tnrs directly instead of the now-deprecated tnrs+accepted
- 11:12 PM Revision 7151: schemas/vegbien.sql: Use new TNRS.tnrs_canon instead of tnrs+accepted to avoid creating additional taxonlabels for the parsed, matched, and accepted names and instead just use the most-canonicalized name of the names output by TNRS (the accepted name if available, or the matched name otherwise)
- 10:50 PM Revision 7150: mappings/VegCore-VegBIEN.csv: "if has verbatim name" _if statements that filter something out for TNRS mappings: Also assume true if taxonIsCanonical is specified, because some TNRS tables (eventually such as public.unscrubbed_taxondetermination_view) do not specify a separate "verbatim" taxondetermination but do provide taxonIsCanonical as a flag to turn various mappings on and off
- 09:06 PM Revision 7149: mappings/VegCore-VegBIEN.csv: Remapped matched*Fit_fraction to taxondetermination.taxonfit when a taxondetermination, not just a taxonlabel, is provided
- 09:03 PM Revision 7148: bin/map: map_table(): Resolving prefixes: Fixed bug where need to use list instead of tuple for metadata value mappings
- 08:16 PM Revision 7147: schemas/vegbien.sql: taxondetermination: Added CHECK constraint to allow only taxondeterminations with a minimum fit fraction of 80%, analogous to taxonlabel's taxonlabel_1_matched_label_min_fit() trigger
01/09/2013
- 05:34 PM Revision 7146: mappings/VegCore-VegBIEN.csv: Don't create a separate TNRS input taxonlabel if taxonIsCanonical exists
- 05:24 PM Revision 7145: inputs/.TNRS/schema.sql: tnrs_canon: Fixed bug where need to always use Unmatched_terms from tnrs rather than tnrs_accepted
- 05:07 PM Revision 7144: inputs/.TNRS/schema.sql: Added tnrs_canon, which stores the most canonicalized name output by TNRS
- 04:17 PM Revision 7143: schemas/vegbien.sql: analytical_stem_view: accepted_taxonverbatim: Fixed bug where need to join only to the taxonverbatim whose morphospecies is NULL, to avoid joining to multiple taxonverbatims at once. This extra filter is now needed because there can be multiple taxonverbatims for a taxonlabel with different morphospecies.
- 03:59 PM Revision 7142: mappings/VegCore-VegBIEN.csv: taxonlabel.taxonomicname: Prepend the family to the rest of the name using new _merge_prefix() instead of _join_words()/_nullIf(), so that any input taxonomic name that includes the family will not have the family duplicated in the combined taxonomic name. Previously, the duplication was removed only when the rest of the input name was *equal to* the family. This change fixes a bug in the new TNRS import where a pre-concatenated taxonomic name (Accepted_scientific_name) which includes the family is now used instead of Accepted_name, which only includes it when it's equal to the family.
- 03:52 PM Revision 7141: xml_func.py: Simplifying functions: Merging: Added _merge_prefix() passthru
- 03:33 PM Revision 7140: schemas/functions.sql: Added _merge_prefix()
- 02:42 PM Revision 7139: inputs/.TNRS/schema.sql: tnrs_populate_accepted_scientific_name(): Fixed bug where Accepted_name_family shouldn't be prefixed to Accepted_name if Accepted_name is itself the family, to avoid duplicating the family in the Accepted_scientific_name
- 02:18 PM Revision 7138: inputs/.TNRS/schema.sql: tnrs+accepted: Added new Accepted_scientific_name column and mapped it in public.unscrubbed_taxondetermination_view
- 11:06 AM Revision 7137: schemas/vegbien.sql: tnrs_input_name: Fixed bug where need to filter out tnrs+accepted rows with NULL Accepted_scientific_name, because inputs to tnrs_db must be strings
- 10:53 AM Revision 7136: schemas/vegbien.sql: tnrs_input_name: Prepend TNRS accepted names that have not yet been parsed. This allows parsing TNRS accepted names without first needing to import them into taxonlabels, which may not occur until the next import.
- 10:09 AM Revision 7135: inputs/.TNRS/schema.sql: tnrs+accepted: Use new Accepted_scientific_name to join to tnrs_accepted.Name_submitted
- 10:05 AM Revision 7134: inputs/.TNRS/schema.sql: tnrs: Added tnrs_populate_accepted_scientific_name() trigger
- 09:57 AM Revision 7133: inputs/.TNRS/schema.sql: tnrs: Added Accepted_scientific_name field which will contain the joined-together accepted name that gets re-parsed by TNRS
- 09:13 AM Revision 7132: inputs/.TNRS/: Changed tnrs+accepted to a view (defined in schema.sql) so accepted names would automatically be populated as they are parsed by TNRS, rather than needing to run `make inputs/.TNRS/tnrs+accepted/reinstall` to populate them
- 08:16 AM Revision 7131: mappings/VegCore-VegBIEN.csv: Also map the morphospecies to the accepted taxonverbatim when an accepted name is provided
- 08:01 AM Revision 7130: schemas/vegbien.sql: taxonverbatim: taxonverbatim_unique: Added morphospecies so that there can be multiple taxonverbatims for the same taxonlabel, each with different morphospecies suffixes
- 04:17 AM Revision 7129: inputs/.TNRS/public.unscrubbed_taxondetermination_view/map.csv: Mapped Accepted_name.*
- 03:02 AM Revision 7128: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Use new tnrs+accepted instead of tnrs so that the accepted name can be imported at the same time
- 02:23 AM Revision 7127: import_all: Reinstall tnrs+accepted, for eventual use by unscrubbed_taxondetermination_view
- 02:20 AM Revision 7126: Added inputs/.TNRS/tnrs+accepted/, which self-joins the TNRS results to their parsed accepted names
- 02:02 AM Revision 7125: import_all: Directly import just the TNRS tables that should be imported, because some TNRS tables are included in import_order.txt so that they are part of the automated testing, but should not be imported at the same time as tnrs_accepted/tnrs_other
- 12:45 AM Revision 7124: inputs/import.stats.xls: Updated import times
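Revisions 7140 to 7142 above introduce _merge_prefix() so that the family is prepended to a taxonomic name only when the name does not already include it. The following Python sketch illustrates that behavior under assumptions: the real function is defined in schemas/functions.sql and its exact rules are not shown in this log, so the separator, the NULL handling, and the name `merge_prefix` are all hypothetical.

```python
def merge_prefix(prefix, value, sep=" "):
    """Prepend `prefix` to `value` unless `value` already starts with it.

    Assumed behavior (not confirmed by the log): a missing prefix or value
    returns the other argument unchanged.
    """
    if not prefix:
        return value
    if not value:
        return prefix
    # Already prefixed: either the value *is* the prefix, or it starts with
    # the prefix followed by the separator
    if value == prefix or value.startswith(prefix + sep):
        return value
    return prefix + sep + value


print(merge_prefix("Fabaceae", "Inga edulis"))
print(merge_prefix("Fabaceae", "Fabaceae Inga edulis"))
print(merge_prefix("Fabaceae", "Fabaceae"))
```

The second call is the case the earlier equality-only check missed: a pre-concatenated name that contains the family plus further name parts.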
01/08/2013
- 11:24 PM Revision 7123: with_all: $all mode: Fixed bug where need " " before # for it to be interpreted as a comment (unlike in a Makefile, where the " " often needs to be left out to avoid it being treated as part of a variable value)
- 10:55 PM Revision 7122: bin/map: Made $redo flag default to off, because redo mode is slow (all tables have to be truncated) and is only needed when running tests on a public schema with data in it, which would not be the case on a development machine where tests are usually run
- 10:19 PM Revision 7121: import_all: Made temporary vars local, so they wouldn't affect the calling shell
- 09:45 PM Revision 7120: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Sort by taxondetermination.taxonoccurrence_id instead of taxondetermination_id to allow scanning the taxondetermination_single_current_determination index, which includes only current determinations and avoids needing to scan past many non-current determinations. Note that using taxonoccurrence_id does not create sort order ambiguity between taxondeterminations with the same taxonoccurrence_id, because there is only one current determination per taxonoccurrence.
- 09:32 PM Revision 7119: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Inner-join to taxonverbatim and taxonlabel instead of LEFT JOINing, because only taxondeterminations with a taxonlabel can have accepted taxondeterminations (otherwise there would be no name to scrub)
- 09:30 PM Revision 7118: schemas/vegbien.sql: unscrubbed_taxondetermination_view: Inner-join to tnrs instead of LEFT JOINing, because only taxondeterminations whose taxonlabels have already been scrubbed by TNRS should have accepted taxondeterminations added. Removed now-unneeded filter by tnrs.Name_submitted IS NOT NULL, which is replaced by the inner join.
- 08:46 PM Revision 7117: sql_io.py: put_table(): ensure_cond(): Fixed bug where need to wrap strings used in the tracked error message in strings.ustr()
- 08:33 PM Revision 7116: xml_dom.py: replace_with_text(): Fixed bug where need to use scalar.is_nonnull_scalar() instead of is_scalar() to avoid converting None values to the string 'None'
- 08:32 PM Revision 7115: scalar.py: Added is_nonnull_scalar()
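Revisions 7115 and 7116 above add is_nonnull_scalar() so that None values are not turned into the string 'None'. The sketch below shows the kind of bug this avoids. The function bodies are assumptions, not the actual contents of scalar.py or xml_dom.py, and `to_text` is a hypothetical helper added for illustration.

```python
def is_scalar(value):
    """Assumed definition: None counts as a scalar, like strings and numbers."""
    return value is None or isinstance(value, (str, int, float, bool))


def is_nonnull_scalar(value):
    """Stricter check: a scalar that is also not None."""
    return value is not None and is_scalar(value)


def to_text(value):
    """Hypothetical helper: convert a scalar to text, preserving None."""
    if is_nonnull_scalar(value):
        return str(value)
    return None  # None (and non-scalars) are left as a missing value


# With only is_scalar() as the guard, None would be passed to str():
assert is_scalar(None) and str(None) == "None"
# With the stricter guard, None stays None:
assert to_text(None) is None
assert to_text(42) == "42"
```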