
# Date Author Comment
13591 06/02/2014 04:50 AM Aaron Marcuse-Kubitza

lib/tnrs.py: switched to downloading all matches per name, as is needed to implement #917. note that this will break the parts of the schema that use the tnrs table, until Brad's match-picking algorithm can be implemented, but this tradeoff is necessary to be able to begin scrubbing sooner (Martha; wiki.vegpath.org/2014-05-29_conference_call#TNRS)

13590 06/02/2014 04:35 AM Aaron Marcuse-Kubitza

schemas/vegbien.sql: tnrs_input_name: don't scrub accepted names, as using multiple matches per name no longer provides a single accepted name to scrub. instead, the Accepted_* fields can be whitespace-split to generate the same columns that would have been generated by the scrubbing (and without the overhead of the extra TNRS call).
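
A minimal sketch of the whitespace-splitting idea described in this commit. The function name, the returned fields, and the placeholder name "Aus bus" are assumptions for illustration only; the real columns are generated in SQL and are not shown in this log.

```python
def split_accepted_name(accepted_name):
    """Split an accepted name on whitespace (hypothetical helper).

    Returns (genus, specific_epithet, remainder); parts that are absent
    are returned as None.
    """
    parts = accepted_name.split()
    genus = parts[0] if len(parts) > 0 else None
    specific_epithet = parts[1] if len(parts) > 1 else None
    # Anything after the binomial (rank marker, infraspecific epithet, etc.)
    remainder = " ".join(parts[2:]) or None
    return genus, specific_epithet, remainder


binomial = split_accepted_name("Aus bus")
trinomial = split_accepted_name("Aus bus var. cus")
```

Splitting locally like this avoids a second TNRS round trip, which is the overhead the commit message refers to.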

13589 06/02/2014 04:27 AM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: added back index on Name_submitted, which is needed for tnrs_input_name to work properly (now that there is no automatic index created by a unique constraint)

13587 06/02/2014 03:43 AM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: tnrs: removed unique constraint on Name_submitted, Name_matched because there can be more than one match with the same Name_matched (but different accepted names, etc.)
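
The effect of dropping the unique constraint can be illustrated with an in-memory SQLite table. This is only a sketch: the real table lives in PostgreSQL, the column list is reduced to the columns named in this log plus an assumed "Accepted_name" column, and the index name and sample names are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tnrs (
        match_num INTEGER PRIMARY KEY,   -- one row per match, not per name
        "Name_submitted" TEXT NOT NULL,
        "Name_matched" TEXT,
        "Accepted_name" TEXT             -- assumed column, for illustration
    )
""")
# Plain (non-unique) index: still speeds up lookups by submitted name,
# but no longer rejects a second match for the same name.
conn.execute('CREATE INDEX tnrs_name_submitted ON tnrs ("Name_submitted")')

conn.executemany(
    "INSERT INTO tnrs VALUES (?, ?, ?, ?)",
    [
        (1, "Aus bus", "Aus bus", "Aus bus"),
        (2, "Aus bus", "Aus bus", "Aus cus"),  # same Name_matched, different accepted name
    ],
)
matches = conn.execute(
    'SELECT match_num, "Accepted_name" FROM tnrs '
    'WHERE "Name_submitted" = ? ORDER BY match_num',
    ("Aus bus",),
).fetchall()
```

With a unique constraint on (Name_submitted, Name_matched), the second insert would have been rejected.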

13586 06/01/2014 09:00 PM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: tnrs.tnrs__valid_match index: made it non-unique to allow multiple matches per name, as is needed to implement #917

13585 06/01/2014 05:00 AM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: tnrs__match_num__fill(): only fill if not set, to support case where tnrs is being restored from a .sql file (where match_num is already set)
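
The "only fill if not set" rule can be sketched in Python as follows. The actual tnrs__match_num__fill() is a database function whose body is not shown in this log, so the names and the counter below are hypothetical stand-ins.

```python
import itertools

# Stand-in for tnrs__match_num__next(): hands out increasing match numbers.
_counter = itertools.count(1)


def match_num_next():
    return next(_counter)


def match_num_fill(row):
    """Assign match_num only when the row does not already carry one."""
    if row.get("match_num") is None:
        row["match_num"] = match_num_next()
    return row


# A row restored from a .sql dump keeps its existing value...
restored = match_num_fill({"Name_submitted": "Aus bus", "match_num": 7})
# ...while a freshly inserted row gets the next number.
fresh = match_num_fill({"Name_submitted": "Aus bus"})
```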

13584 06/01/2014 04:36 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: documented runtime to add a constraint (3 min)

13583 06/01/2014 04:35 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: unique constraint on Name_submitted: added Name_matched to allow multiple matches per name, as is needed to implement #917

13582 06/01/2014 03:44 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: documented how to populate a new column

13581 06/01/2014 03:41 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: pkey: use match_num instead of Name_number to allow multiple matches per name, as is needed to implement #917

13580 05/31/2014 10:31 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs.match_num: made it NOT NULL now that it's populated

13579 05/31/2014 10:28 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: populate match_num

13578 05/31/2014 10:25 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: populate match_num

13577 05/31/2014 09:50 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: documented how to add and remove columns

13575 05/31/2014 08:58 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: made COMMENTs start on their own line, using the steps at wiki.vegpath.org/Postgres_queries#make-COMMENTs-start-on-their-own-line

13573 05/31/2014 08:10 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: added match_num

13572 05/31/2014 08:06 PM Aaron Marcuse-Kubitza

inputs/.TNRS/data.sql.run: refresh(): documented runtime (1 min)

13570 05/31/2014 06:44 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added tnrs__match_num__next()

13567 05/30/2014 06:34 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added tnrs__batch_begin() trigger to populate the match_num (match sort order)

13540 05/27/2014 10:13 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: taxon_scrub.scrubbed_unique_taxon_name.*: added scrubbed_taxon_name_with_author, needed by Jeff Ott's analysis (wiki.vegpath.org/Data_requests)

13533 05/27/2014 12:28 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: taxon_scrub: added scrubbed_morphospecies_binomial, analogous to accepted_morphospecies_binomial for scrubbed_*

13532 05/27/2014 12:13 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: taxon_scrub: added scrubbed_morphospecies_binomial, analogous to accepted_morphospecies_binomial for scrubbed_*

13531 05/26/2014 11:54 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: taxon_scrub: documented how to modify it

13528 05/26/2014 11:20 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added taxon_scrub_modify()

13527 05/23/2014 06:17 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon_modify(): use simpler util.recreate_view()

13526 05/23/2014 06:15 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon_modify(): documented usage

13518 05/21/2014 07:30 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon_modify(): removed no longer needed DROP VIEW statement

13512 05/21/2014 05:50 PM Aaron Marcuse-Kubitza

fix: schemas/util.sql: force_recreate(): renamed to just recreate(), because "force" normally implies that things will be deleted, which this function does not do

13508 05/21/2014 04:25 PM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: MatchedTaxon.taxonomicStatus: filter using map_taxonomic_status() so that the corrected value is available in the normalized DB, not just analytical_stem

13507 05/21/2014 04:05 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon: to modify: use new MatchedTaxon_modify(), which eliminates the work of putting together the dependent views

13506 05/21/2014 03:53 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added MatchedTaxon_modify()

13503 05/21/2014 04:13 AM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: map_taxonomic_status(): need to use accepted name instead of scrubbed name (which also includes no-opinion names), as described at http://wiki.vegpath.org/2013-11-14_conference_call#taxonomic-fields. this used to be the accepted name, but got switched when the concatenated name was also used to store the matched name for no-opinion names.

13501 05/21/2014 01:27 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon: documented how to modify it (using util.force_recreate())

13498 05/20/2014 05:46 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon, etc.: added accepted_morphospecies_binomial derived field

13444 05/13/2014 04:50 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon.Accepted_name_species: mapped to accepted_species_binomial

13443 05/13/2014 04:09 AM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: COMMENTs: always include newline before and after

13441 05/13/2014 03:46 AM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: taxon_scrub, etc.: undid rename of accepted name columns to scrubbed_* (r13435), because these are actually not the same (scrubbed_* is the combination of accepted and no-opinion names). the accepted name columns will now be named accepted_*, following the standard naming scheme.

13439 05/13/2014 03:13 AM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: taxon_scrub, etc.: scrubbed_*: use columns from MatchedTaxon whenever possible, to avoid as much as possible the need to join to taxon_scrub.scrubbed_unique_taxon_name.*

13437 05/13/2014 02:29 AM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/grants.sql: added GRANT statements from schema.sql because these aren't run by `make inputs/.TNRS/reinstall`

13401 05/03/2014 02:03 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: add: verify/: also svn:ignore *.log

13372 05/01/2014 01:29 PM Aaron Marcuse-Kubitza

fix: lib/runscripts/file.pg.sql.run: removed include of in_datasrc_dir.run, because this location does not apply to all .sql export scripts

12779 03/20/2014 07:58 PM Aaron Marcuse-Kubitza

*{.sh,run}: use new begin_target instead of `echo_func; set_make_vars`

12018 02/02/2014 12:49 AM Aaron Marcuse-Kubitza

inputs/input.Makefile: add!: verify/: also svn:ignore *.tsv, *.txt

11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11965 01/16/2014 01:22 AM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: scrubbed_family: Name_matched_accepted_family was missing from the TNRS results at one point, so we are now using Family_matched as a workaround to populate this. the workaround is for accepted names only, as no-opinion names do not have an Accepted_name_family to prepend to the scrubbed name to parse.

11964 01/16/2014 01:19 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: reexported from live DB, which changes the element order

11912 12/16/2013 01:43 PM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: granted bien_read SELECT access to derived views as well as the core tnrs table

11715 11/21/2013 11:08 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: updated runtime (30 min) and rowcount (+2 million)

11711 11/21/2013 09:04 AM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: tnrs_populate_fields(): is_valid_match: set this to false if Taxonomic_status is Invalid

11709 11/21/2013 08:49 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added map_taxonomic_status()

11708 11/21/2013 08:48 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql, data.sql: updated for PostgreSQL 9.3

11647 11/13/2013 02:48 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs_populate_fields(): regenerate the derived cols: updated runtime (40 min)

11643 11/10/2013 07:02 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: removed no longer used Accepted_scientific_name. use scrubbed_unique_taxon_name instead.

11642 11/10/2013 07:00 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon, etc.: removed no longer used acceptedScientificName (from tnrs.Accepted_scientific_name). use scrubbed_unique_taxon_name instead.

11641 11/10/2013 06:43 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: removed no longer used AcceptedTaxon. use taxon_scrub.scrubbed_unique_taxon_name.* instead.

11637 11/10/2013 05:55 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: removed no longer used ScrubbedTaxon. use taxon_scrub instead.

11634 11/10/2013 04:11 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added taxon_scrub, which combines ValidMatchedTaxon with scrubbed_unique_taxon_name.* instead of AcceptedTaxon

11633 11/10/2013 03:38 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: ValidMatchedTaxon: synced to MatchedTaxon

11632 11/10/2013 03:22 PM Aaron Marcuse-Kubitza

fix: inputs/.TNRS/schema.sql: scrubbed_taxon_name_with_author: renamed to scrubbed_unique_taxon_name because this also contains the family, and is therefore different from just the taxon name with author

11631 11/10/2013 01:50 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon: added scrubbed_taxon_name_with_author

11630 11/10/2013 01:23 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: removed Is_homonym, since this did not take into account the never_homonym status (when the author disambiguates) or the ability of a non-homonym at a lower rank to override a homonym at a higher rank. taking these into account just produces the value of is_valid_match.

11629 11/10/2013 01:19 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: removed Is_plant, since this functionality is now provided by is_valid_match. note that whether a name is a plant is not meaningful for TNRS, because it can match only plant names (thus a "non-plant" is actually a non-match).

11628 11/10/2013 01:06 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: added scrubbed_taxon_name_with_author derived column, which uses the matched name when an accepted name is not available

11627 11/10/2013 09:44 AM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: removed no longer used Max_score. use is_valid_match to determine validity instead.

11626 11/10/2013 12:09 AM Aaron Marcuse-Kubitza

bugfix: lib/runscripts/file.pg.sql.run: export_(): exclude Source and related tables so that these will be re-created by the staging tables installation instead, ensuring that they are always in sync with the Source/ subdir

11625 11/10/2013 12:08 AM Aaron Marcuse-Kubitza

inputs/.TNRS/data.sql: updated for new derived columns

11624 11/10/2013 12:04 AM Aaron Marcuse-Kubitza

bugfix: lib/runscripts/file.pg.sql.run: export_(): exclude Source and related tables so that these will be re-created by the staging tables installation instead, ensuring that they are always in sync with the Source/ subdir

11619 11/09/2013 04:47 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: removed no longer used score_ok(). use tnrs.Is_plant instead. (the threshold is still documented in tnrs_populate_fields().)

11618 11/09/2013 04:45 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs_populate_fields(): is_valid_match: don't consider Max_score because Is_plant will always be false when the Max_score is insufficient (<0.8)

11617 11/09/2013 04:20 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: schema comment: added steps to remake schema.sql and back up the new TNRS schema. documented that these steps should be run on vegbiendev.

11616 11/09/2013 04:16 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: schema comment: added steps to determine what changes need to be made on vegbiendev

11615 11/09/2013 04:01 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs_populate_fields(): regenerate the derived cols: updated runtimes (~same)

11614 11/09/2013 03:54 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: moved instructions to apply schema changes on vegbiendev to the TNRS schema, because this applies to all elements in the TNRS schema, not just the tnrs table

11613 11/09/2013 03:30 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: score_ok(): don't make it STRICT because this prevents it from being inlined

11612 11/09/2013 03:24 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: removed no longer used tnrs_score_ok index. use tnrs__valid_match instead.

11611 11/09/2013 03:09 PM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: tnrs_populate_fields(): is_valid_match: documented that this excludes homonyms because these are not valid matches (i.e. TNRS provides a name, but the name is not meaningful because it is not unambiguous)

11610 11/09/2013 03:07 PM Aaron Marcuse-Kubitza

bugfix: inputs/.TNRS/schema.sql: ValidMatchedTaxon: exclude inter-kingdom homonyms because these are not valid matches (i.e. TNRS provides a name, but the name is not meaningful because it is not unambiguous). this uses taxon_scrub__is_valid_match instead of score_ok() to determine whether the result should be included.

11609 11/09/2013 02:56 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: ValidMatchedTaxon: synced to MatchedTaxon

11608 11/09/2013 02:55 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: MatchedTaxon: added is_valid_match

11607 11/09/2013 02:52 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: added tnrs__valid_match index to facilitate joining to only valid matches

11606 11/09/2013 02:48 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: added is_valid_match derived column, to make it easier to select from only those TNRS results that can safely be used as a scrubbed name

11396 10/21/2013 07:14 PM Aaron Marcuse-Kubitza

fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error

10866 09/04/2013 11:06 PM Aaron Marcuse-Kubitza

inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix

10793 08/29/2013 02:07 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: added covering indexes on foreign keys where needed. this enables rows to be cascadingly deleted without a full table scan.
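
A small sketch of the pattern, using SQLite in place of PostgreSQL. The tnrs.batch to batch.id foreign key is taken from this log; the index name, column types, and sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves fkey enforcement off by default
conn.execute("CREATE TABLE batch (id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE tnrs (
        match_num INTEGER PRIMARY KEY,
        batch TEXT NOT NULL REFERENCES batch (id) ON DELETE CASCADE
    )
""")
# Index on the referencing column, so that the cascade can locate the
# child rows by index lookup rather than by scanning the whole table.
conn.execute("CREATE INDEX tnrs_batch_idx ON tnrs (batch)")

conn.execute("INSERT INTO batch VALUES ('b1')")
conn.executemany("INSERT INTO tnrs VALUES (?, ?)", [(1, "b1"), (2, "b1")])

# Deleting the parent row cascades to its tnrs rows.
conn.execute("DELETE FROM batch WHERE id = 'b1'")
remaining = conn.execute("SELECT count(*) FROM tnrs").fetchone()[0]
```

The cascade works without the index too; the index only changes how the child rows are found, which matters on a table with millions of rows.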

10790 08/27/2013 10:52 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: instructions for when changing this table's schema: updated to use new `inputs/.TNRS/data.sql.run refresh`

10789 08/27/2013 10:50 PM Aaron Marcuse-Kubitza

inputs/.TNRS/data.sql.run: added refresh() target which runs inputs/test_taxonomic_names/test_scrub

10787 08/27/2013 10:32 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: updated steps to run when changing this table's schema, to use new TNRS editing workflow

10786 08/27/2013 10:14 PM Aaron Marcuse-Kubitza

inputs/.TNRS/data.sql: re-ran TNRS using `inputs/test_taxonomic_names/test_scrub; rm=1 inputs/.TNRS/data.sql.run export_`

10783 08/27/2013 09:53 PM Aaron Marcuse-Kubitza

inputs/.TNRS/data.sql: generate from the DB using `rm=1 inputs/.TNRS/data.sql.run export_` instead of being a hand-edited file

10782 08/27/2013 09:50 PM Aaron Marcuse-Kubitza

added inputs/.TNRS/data.sql.run for syncing data.sql directly with the DB without needing to use inputs/test_taxonomic_names/test_scrub just to export the sample data. (however, when modifying the tnrs table, it may still be easier to generate new sample data using test_scrub rather than refactoring the table in place.)

10779 08/27/2013 09:25 PM Aaron Marcuse-Kubitza

added lib/runscripts/schema.pg.sql.run and use it in inputs/.TNRS/schema.sql.run

10778 08/27/2013 09:18 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: generate from the DB using `rm=1 inputs/.TNRS/schema.sql.run export_` instead of being a hand-edited file. this makes it much easier to edit the (now frequently-changing) TNRS schema directly in pgAdmin (which is graphical), rather than having to manually copy SQL changes from pgAdmin to the file.

10777 08/27/2013 09:15 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql.run: export_(): added usage

10776 08/27/2013 09:12 PM Aaron Marcuse-Kubitza

added inputs/.TNRS/schema.sql.run, which syncs schema.sql with the DB

10754 08/27/2013 01:54 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: moved source code comments to in-schema COMMENT ON comments so all the info in schema.sql is in the DB

10753 08/27/2013 01:47 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: views that use * as the column list: added comments to indicate that this is the case, so that the views can be updated in place rather than only by reinstalling the TNRS schema

10747 08/27/2013 12:49 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs: util.set_col_types() runtime: updated for most recent ALTER COLUMN TYPE command (9 min)

10746 08/27/2013 12:25 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: tnrs.Time_submitted: renamed to batch and added fkey to batch.id. this requires including the batch table in inputs/.TNRS/data.sql, so that the fkey is satisfied (batch entries are already added by bin/tnrs_db).

10741 08/26/2013 07:48 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: batch: reset name of id_by_time unique constraint since this field is now in the batch table

10740 08/26/2013 07:46 PM Aaron Marcuse-Kubitza

inputs/.TNRS/schema.sql: download_settings: renamed to batch_download_settings because this table is actually specific to the batch, and it does not make sense to have a download settings file without a batch