/trunk/schemas/vegbien.sql - Changes - BIEN 3 - NCEAS Projects

root/trunk/schemas/vegbien.sql @ 12224

#	Date	Author	Comment
12224	02/14/2014 03:09 PM	Aaron Marcuse-Kubitza	validation/aggregating//.sql, schemas/vegbien.sql, lib/runscripts/validations.pg.sql.run, inputs/bien2_traits/validations.sql: added _ to beginning of each view name so the validation views would sort at the top in the datasource's tables list. this will also make the validation result sets easily distinguishable from the data tables.
12186	02/13/2014 04:06 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public_validations_*: merged into single public_validations schema, with type-specific prefixes for each query, so that datasources can use validations queries from multiple type categories, and so that each datasource doesn't have to indicate which validations output schema it's using
12183	02/13/2014 02:09 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: regenerated from DB, which adds an autogenerated column alias
12172	02/13/2014 10:41 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql, validation/aggregating//.sql: renamed public_validation_* to plural public_validations_* since we refer to them as aggregating validations
12169	02/13/2014 08:34 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public_validation_plots: added Brad's notes as comments
12168	02/13/2014 08:29 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public_validation_plots: populated queries from validation/aggregating/plots/bien3_validations_salvias_vegbien.sql
12165	02/13/2014 08:20 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public_validation_traits: added Brad's notes as comments
12164	02/13/2014 08:17 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public_validation_traits: populated queries from validation/aggregating/traits/bien3_validations_traits_bien3.sql
12156	02/13/2014 05:58 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem.datasource: removed fkey to source.shortname because this prevents individually-reloaded datasources from being published
12155	02/13/2014 05:56 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: publish(), rm(): documented runtime (1 min)
12154	02/13/2014 05:49 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added datasource_unpublish()
12147	02/08/2014 10:38 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: public_validation: split into separate subschemas for plots, specimens, traits
12144	02/08/2014 10:00 PM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: public_validation comment: undid incorrect replacement of "validation" with "public_validation"
12138	02/08/2014 03:55 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added rm(), which deletes this schema
12137	02/08/2014 03:17 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added publish(), which publishes this schema
12136	02/08/2014 02:47 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added public_schema_publish()
12131	02/07/2014 08:30 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: datasource_publish(): removed nested transaction, which is not needed because Postgres runs each top-level command in a transaction (including in autocommit mode)
12130	02/07/2014 06:58 PM	Aaron Marcuse-Kubitza	bugfix: schemas/Makefile,vegbien.sql: renamed validation to public_validation to match the suffixed name applied by `make schemas/public/install`
12042	02/05/2014 01:02 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added validation schema, which contains queries used in the aggregating validations. schemas/Makefile: include schemas that depend on `public` (eg. `validation`) so they are restored along with it.
12024	02/02/2014 07:52 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added top_plot view
12015	01/28/2014 05:02 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: location_set_top_plot(): only do the lookup of the parent's top_plot when there is a parent_id
12014	01/28/2014 04:56 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: location: added autopopulated top_plot
12013	01/26/2014 11:09 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: locationevent: added locationevent_pull_forward_from_parent() trigger which populates project_id from the parent locationevent
12010	01/25/2014 11:17 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: locationevent__pull_forward_from_parent(): renamed to locationevent_pull_forward_from_parent_for_stratum() since it actually has a filter that causes it to apply only to stratum events
12006	01/23/2014 11:21 AM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: commconcept: added missing unique constraint on commdescription, which is also an identifying field in addition to the ID (which is in commname.commname)
11998	01/23/2014 12:15 AM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: locationevent__communities(): inner-join to commdetermination instead of left-joining, so that this does not produce community entries (which occur because there is one commclass per locationevent, but only some commclasses will then have a commdetermination)
11990	01/22/2014 12:24 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: projectcontributor: added missing unique constraint
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11963	01/15/2014 11:23 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: plot.**, analytical_stem_view, analytical_plot: added project_contributors
11962	01/15/2014 11:14 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added project_contributors()
11921	12/19/2013 11:11 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: scientificName_verbatim: don't use taxonverbatim.taxonname+author as the scientificName_verbatim if only the author is provided. (this lead to weird scientificName_verbatims that contain just the author.)
11908	12/12/2013 08:35 AM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: location: use the place_id from the parent location when no place_id is specified. this fixes a bug in analytical_stem_view where the parent location's place_id was used because it was sometimes missing from the sublocation, but the parent place_id itself was sometimes missing instead if sublocations each had their own place information. this way, it is always available directly in the sublocation, populated from the parent location if needed.
11907	12/12/2013 08:27 AM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: location: added place_id which is autopopulated from the current locationplace. join on this in plot.**, to avoid a 1:many join when a location has multiple locationplaces.
11906	12/11/2013 11:10 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: locationevent_unique_within_parent_by_location unique index: need COALESCE around location_id since it's nullable
11889	12/10/2013 07:10 AM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: analytical_specimen: added specimens-related columns that are in analytical_plot
11874	12/09/2013 05:26 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: updated for renamed county_centroids column names
11838	12/05/2013 08:35 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: include the family_higher_plant_group lookup table values so that these don't need to be regenerated from the NCBI nodes whenever the DB is reloaded
11837	12/05/2013 07:58 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonlabel_update_ancestors(): don't do an index scan if the value being scanned for is NULL, to support testing this function without the indexes in place, without extra full-table scans for NULL values affecting things. this can be used to determine if the function is actually using the indexes, by turning them off and seeing if the runtime changes.
11829	12/05/2013 01:31 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxonlabel_update_ancestors(): don't create a performance-intensive nested transaction (EXCEPTION block) for each INSERT, because there should no longer be duplicate ancestors, so it's OK to abort the whole transaction if this assertion fails
11828	12/05/2013 01:03 AM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: taxonlabel_update_ancestors_on_{insert,update}(): only use either the matched taxon's ancestors or the parent's ancestors, to avoid issues related to duplication between these two ancestors lists. this also fixes a bug where the 2nd taxonlabel_update_ancestors() call assumes that the existing ancestors are for the old parent, when in fact they have actually just been set to those for the new matched taxon (which horribly confuses taxonlabel_update_ancestors()).
11827	12/04/2013 10:06 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: _taxonlabel_set_parent_id(): just use a plain UPDATE statement, to avoid the significant parsing and stringification overhead of EXECUTE and quote_nullable(). it is not clear that EXECUTE is actually necessary to avoid caching the query plan, because the cache should be invalidated automatically when the table's ANALYZE statistics are regenerated.
11826	12/04/2013 10:00 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: removed unused function _taxonlabel_set_matched_label_id(), which refers to obsolete fields
11825	12/04/2013 09:58 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: synced to DB (the view renderer apparently changed the text of a view)
11821	12/04/2013 06:04 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: scrubbed_author: removed empty COALESCE around value (left over from when multiple values needed to be combined for many TNRS fields)
11816	12/04/2013 03:12 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: provider_count_view: source totals: use the much faster query developed for Brad (wiki.vegpath.org/VegBIEN_FAQ#from-Brad-on-2013-12-4), which avoids the need to do a GROUP BY on all of analytical_stem. eventually, we will want to apply the same optimization to the first publisher subtotals.
11736	11/25/2013 12:59 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: collected_dates: documented runtime (2.5 min)
11735	11/25/2013 12:57 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: collected_date_min: replaced with collected_dates view that lists all dates we have, so that we can determine which of these may be valid. it turns out that we have data collected from very far back (to the year 1), which are not merely 2-digit years because PostgreSQL will only parse early years when there are 4 digits.
11732	11/24/2013 11:48 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: collected_date_min: exclude invalid dates < 1000-01-01
11731	11/24/2013 11:41 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: collected_date_min: exclude -infinity
11730	11/24/2013 11:13 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added collected_date_min view
11710	11/21/2013 08:53 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: added taxonomic_status. notice that PostgreSQL 9.3 puts each view column on a separate line, making it much easier to review the svn diff!
11706	11/21/2013 07:40 AM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: taxon_trait_view: include only TNRS-valid names
11696	11/18/2013 05:40 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: sync_analytical_stem_to_view(): use new util.force_recreate() instead of manually dropping and re-creating every view that uses this. this avoids the need to add several lines to this function every time we add a new scientific view (of which we expect to have many), because force_recreate()'s error parsing handles this automatically. this makes it possible for a non-expert user to add scientific views without compromising the ability to add columns to analytical_stem_view, because they don't need to understand Postgres's dependency error messages when updating analytical_stem with this function.
11668	11/14/2013 02:36 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: plantobservation_aggregateoccurrence_count_1(): only default aggregateoccurrence.count to 1 for specimens data, because plots data may have any number of individuals in a taxon_presence record that has no explicit individual_count
11667	11/14/2013 02:32 PM	Aaron Marcuse-Kubitza	schemas/*.sql: updated for PostgreSQL 9.3. this reorders some functions, adds empty comment headers for omitted SEQUENCE SET commands, and (best of all) finally splits view columns onto multiple lines, so that changes in the columns are actually legible (and produce their own svn diff!)
11645	11/10/2013 07:09 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: scrubbed_morphospecies_binomial: only append the morphospecies suffix if there is not a scrubbed specific epithet
11644	11/10/2013 07:08 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: scrubbed_morphospecies_binomial: only populate this from the component ranks; do not put a full taxon name in here if it would otherwise be NULL
11640	11/10/2013 06:38 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: tnrs_input_name: MatchedTaxon self-join: must use a NOT NULL column for a proper anti-join. this unfortunately requires the more verbose LEFT JOIN ON syntax (which allows using the pkey as the NOT NULL column) instead of NATURAL LEFT JOIN (which requires using another column, which are all nullable)
11639	11/10/2013 06:34 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: tnrs_input_name: use plain UNION, which automatically removes duplicates, rather than UNION ALL with a manual EXCEPT-removal of rows in the first SELECT
11638	11/10/2013 06:14 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: tnrs_input_name: updated to use taxon_scrub.scrubbed_unique_taxon_name.*, to avoid further dependencies on AcceptedTaxon
11636	11/10/2013 05:54 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxon_trait_view: updated to use new taxon_scrub
11635	11/10/2013 05:51 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: updated to use new taxon_scrub. this avoids the need to manually COALESCE every accepted* and matched* field, and makes the formulas much clearer
11623	11/09/2013 10:22 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: analytical_stem_view: scrubbed_taxon_name_no_author, scrubbed_author: need to COALESCE these to the matched* when no accepted* is available
11622	11/09/2013 10:02 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view, etc.: renamed scrubbed fields with the scrubbed_* prefix, to clearly distinguish these from the equivalent fields for other taxon names
11621	11/09/2013 09:10 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: analytical_stem_view: family, genus: need to COALESCE these to the matched* when no accepted* is available
11598	11/08/2013 04:26 PM	Aaron Marcuse-Kubitza	bugfix: schemas/vegbien.sql: 2013-7-10.Naia.range_limiting_factors: filter by coordinateUncertaintyInMeters filter: assume true for rows with no coordinateUncertaintyInMeters
11597	11/08/2013 03:43 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: 2013-7-10.Naia.range_limiting_factors: filter by coordinateUncertaintyInMeters <= 10 km
11582	11/06/2013 01:30 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: 2013-7-10.Naia.range_limiting_factors: don't sort the results by occurrence_id, because this is not a meaningful ordering and prevents incremental output from the query
11581	11/06/2013 01:09 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: 2013-7-10.Naia.range_limiting_factors: also filter out rows without species
11533	10/31/2013 06:22 AM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: analytical_plot, analytical_specimen: removed derived columns that are not part of the validation
11532	10/31/2013 06:15 AM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: analytical_plot, analytical_specimen: removed internal ID columns that are not part of the validation
11531	10/31/2013 05:46 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_plot: removed derived columns that should not be validated by data providers
11530	10/31/2013 05:42 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_specimen: synced to analytical_stem
11529	10/31/2013 05:36 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_plot: documented that this contains all of the analytical_stem columns, minus specimenHolderInstitutions, collection, accessionNumber, occurrenceID
11528	10/31/2013 05:34 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_plot: synced to analytical_stem
11527	10/31/2013 05:29 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: added individualCount
11526	10/31/2013 04:42 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: plot.**, analytical_stem_view: added slopeAspect, slopeGradient
11505	10/30/2013 07:23 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination_set_iscurrent(): include new iscurrent__verbatim, so that taxondeterminations the datasource marks as current are always considered first. this currently applies to VegBank and CVS.
11504	10/30/2013 07:17 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination.isoriginal: made it nullable like iscurrent__verbatim, because this is populated from the datasource. taxondetermination_set_iscurrent() now supports isoriginal=NULL, so this is not a problem.
11503	10/30/2013 07:08 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination.is_datasource_current: renamed to iscurrent__verbatim and made it nullable, so that this can be used to store the verbatim iscurrent status
11502	10/30/2013 07:04 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination_set_iscurrent(): removed setting of is_datasource_current (which is now the same as iscurrent), so that this can be used to store the verbatim iscurrent status
11501	10/30/2013 06:59 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination_set_iscurrent(): isoriginal: make sure it is always either true or false, so that if the NOT NULL constraint on this is ever removed you don't end up with the incorrect sort order false, true, NULL (it should be false=NULL, true)
11500	10/30/2013 06:42 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: use plain taxondetermination.iscurrent instead of is_datasource_current since these are now the same
11499	10/30/2013 06:38 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination_set_iscurrent(): is_datasource_current: set to the same value as iscurrent, since these now have the same formula
11498	10/30/2013 06:34 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: taxondetermination_set_iscurrent(): removed no longer used accepted, matched determinationtypes (for these determinations, left-join to TNRS.ScrubbedTaxon)
11483	10/30/2013 02:57 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added 2013-7-10.Naia.range_limiting_factors
11482	10/30/2013 02:45 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: sync_analytical_stem_to_view(): row_num: renamed to taxon_occurrence__pkey because previous taxon determinations have been removed, so each row is in fact a taxon_occurrence (~= VegCore.vegpath.org?ERD.taxon_occurrence)
11481	10/30/2013 02:20 PM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: analytical_stem_view: don't ORDER BY datasource, because this requires a slow full-table sort after the hash joins. (when selecting a subset of analytical_stem_view, nested loops are used automatically without needing an ORDER BY to force this.) to get the datasource-sorted order (plus a sort-order guarantee), you can still add a manual `ORDER BY datasource`, which will use a fast index scan on one of the datasource indexes.
11480	10/30/2013 01:58 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem: added row_num, which can serve as the taxon_observation ID (DwC occurrenceID)
11477	10/30/2013 12:11 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem: locationID... index: use eventDate instead of dateCollected since it's now eventDate that identifies the locationevent
11476	10/30/2013 04:41 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: use plot. to obtain plot-related fields, so that the same code does not need to be maintained in both analytical_stem_view and plot.
11475	10/30/2013 04:32 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: moved specimen-specific fields to occurrence section
11474	10/30/2013 03:50 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view, plot.**: added separate location__cultivated__bien
11473	10/30/2013 03:11 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: added separate eventDate, in addition to dateCollected
11472	10/30/2013 02:59 AM	Aaron Marcuse-Kubitza	fix: schemas/vegbien.sql: dateCollected: use aggregateoccurrence.collectiondate before locationevent.obsstartdate rather than after, because this is more accurate. it was previously the other way around to allow dateCollected to be the pkey for the row's locationevent (for plots data).
11471	10/30/2013 02:38 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view, plot.**: locationevent__pkey: moved to right before the locationevent-related fields
11469	10/29/2013 06:52 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: analytical_stem_view: changed column order, etc. to match plot.**
11468	10/29/2013 06:46 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: plot.**: added locationevent__pkey so that this view can be joined to other VegBIEN tables, which require the internal pkey
11466	10/29/2013 09:05 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: sync_analytical_stem_to_view(): speciesBinomialWithMorphospecies index: documented runtime (1 h)
11465	10/29/2013 08:56 AM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: plot.**: updated to use the same column formulas as analytical_stem_view

Project

General

Profile