schemas/vegbien.sql: taxonverbatim: added subspecies, as decided in the conference call (wiki.vegpath.org/2014-04-10_conference_call#VegBIEN-schema-2)
fix: schemas/vegbien.sql: plots* with duplicated rows: removed duplicated rows
schemas/vegbien.sql: specimens*: ran through pipeline
schemas/vegbien.sql: _specimens_16_list_distinct_specimen_descriptions: re-ran through pipeline after removing duplicated rows
schemas/vegbien.sql: rm_output_queries(): also support removing just a particular output query
bugfix: schemas/util.sql: remake_diff_table(): need to rm_freq() type_table, because left/right_table don't have freq yet
schemas/util.sql: auto_rm_freq(): use new rm_freq()
schemas/util.sql: added rm_freq(regclass[])
schemas/vegbien.sql: _specimens_11_list_of_three_standard_political_divisions: ran through pipeline
fix: schemas/vegbien.sql: _specimens_11_list_of_three_standard_political_divisions: use same column names as input query
schemas/util.sql: remake_diff_table(): result table comment: documented how to display NULL values that are extra or missing
schemas/vegbien.sql: _specimens_13_count_of_all_verbatim_and_decimal_lat_long: ran through pipeline
fix: schemas/vegbien.sql: _specimens_12_distinct_collector_name_collect_num_date_w_count: dateCollected: also need to convert to text in GROUP BY/ORDER BY
bugfix: schemas/vegbien.sql, inputs/NY/validations.sql, validation/aggregating/specimens/qualitative_validations_specimens.sql: _specimens_12_distinct_collector_name_collect_num_date_w_count: dateCollected: cast this to text rather than date because some values for this field are not valid dates and will throw an error if cast to date
schemas/vegbien.sql: _specimens_10_count_number_of_records_by_institution: ran through pipeline
bugfix: schemas/vegbien.sql, validation/aggregating/specimens/qualitative_validations_specimens.sql: _specimens_10_count_number_of_records_by_institution: need to dereference specimenreplicate.duplicate_institutions_sourcelist_id to the corresponding sourcelist.name
schemas/vegbien.sql: public_validations._specimens_*: added comments from validation/aggregating/specimens/qualitative_validations_specimens.sql
schemas/vegbien.sql: remake_diff_tables(schema text): removed bien2_traits runtime because this applies only to one datasource. the bien2_traits runtime is now documented in inputs/bien2_traits/run.
schemas/vegbien.sql: public_validations: schema comment: documented how to run the validations. this information is also in the usage comment for public_validations.remake_diff_table(), but is copied here for easy reference.
schemas/vegbien.sql: public_validations: specimens queries: added autogenerated ~type tables
schemas/vegbien.sql: public_validations: added specimens queries to pipeline
fix: schemas/vegbien.sql: public_validations.rematerialize_out_view(text, regclass): run with join_collapse_limit = 1 to fix query planner issues. this option has been tested on the queries that do not yet use the standard join sequence (plots #11,12,13,14,16,17,18), and all of these queries also work fine with join_collapse_limit = 1. (the standard join sequence is used to ensure both correctness of the query and compatibility with join_collapse_limit = 1, but in some cases is not needed for join_collapse_limit.)
bugfix: schemas/vegbien.sql: schemas/vegbien.sql(): need to util.use_schema(schema_anchor) before initializing vars that use own-schema functions
*{.sh,run}: runscript targets: use begin_target instead of echo_func so the target name is properly echoed. note that this requires using with_rm so that $rm is properly progagated to applicable invoked targets. (previously, $rm was progagated to all invoked targets. note that with_rm only works inside a runscript target that starts with begin_target.)
schemas/vegbien.sql: updated _specimens_01_count_of_total_records_specimens_in_source_db
schemas/vegbien.ERD.mwb: regenerated exports
schemas/vegbien.sql: public_validations: added _specimens_01_count_of_total_records_specimens_in_source_db
schemas/vegbien.sql: specimenreplicate.institution_id: renamed to duplicate_institutions_sourcelist_id, as decided in the conference calls (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2)
fix: schemas/VegCore/mk_derived: added `EOF` at end to avoid (benign) "here-document delimited by end-of-file" warnings on Linux
fix: schemas/util.sql: trim(): by default, cascadingly drop dependent columns so that they don't prevent trim() from succeeding. note that this requires the dependent columns to then be manually re-created.
added schemas/VegCore.ERD.pdf symlink for easy access
bugfix: schemas/util.sql: is_constant(util.col_ref): updated to include standard newline at beginning of comment (applies to newly-imported staging tables)
*{.sh,run}: use new begin_target instead of `echo_func; set_make_vars`
fix: schemas/util.sql: explain2notice_msg(): don't include EXPLAIN output for simple, single-value queries, to avoid cluttering up the log output
schemas/util.sql: added fold_explain_msg()
bugfix: schemas/util.sql: set_col_types(): need to COALESCE the executed SQL to '' because util.eval() does not support NULL (and shouldn't, because this indicates a missing COALESCE in constructing the statement)
schemas/util.sql: set_col_types(): use simpler util.eval() instead of manual EXECUTE/util.debug_print_sql()
schemas/util.sql: set_col_types(): use string_agg() instead of array_to_string(ARRAY) for clarity
schemas/util.sql: added mk_not_null()
added schemas/VegCore/Brad_Boyle/bien3_data_provenance_use_cases.docx* from e-mail from Brad
schemas/vegbien.sql: _plots_08_list_of_plots_which_use_percent_cover, _plots_15_pct_cover_of_each_verb_taxon_in_each_plot_in_each_pro: reran with fixes, which removes the incorrectly auto-added copies columns. (they were only able to be auto-added because the tables had no rows.)
bugfix: drop_column(regclass[]): need to run `SELECT NULL::void;` at end of function to avoid folding away functions called in previous query
fix: schemas/util.sql: diff(regclass, regclass): moved try_create() of copies column in parent table to auto_rm_freq() so that it would only happen if both tables actually contain a copies column (otherwise, the try_create() will create an empty copies column if both tables are empty)
schemas/util.sql: try_create(): also handle "child table is missing column" errors
schemas/util.sql: added coalesce(anyarray), which can be used to force evaluation of all values of a COALESCE
schemas/vegbien.sql: implemented _plots_19_count_of_censuses_per_plot_in_each_project
schemas/util.sql: EXCEPTION blocks with multiple exception types: use OR to merge exception types into one WHEN block
schemas/vegbien.sql: public_validations: schema comment: changed "to sync the queries with schemas/vegbien.sql" to "to reset the queries to what's in schemas/vegbien.sql" for clarity
fix: schemas/vegbien.sql: schema comment: to reset the key and value columns for all validations queries: updated running of custom keys() functions to use keys() types instead
schemas/vegbien.sql: schema comment: to sync the queries with schemas/vegbien.sql: use new public_validations.rm_output_queries() instead of rm_all_queries() to leave the input queries in place
schemas/vegbien.sql: schema comment: documented how to reset the key and value columns for all validations queries
schemas/util.sql: mk_keys_func(regtype, util.col_cast[]): indicate in the type comment that the keys() type is autogenerated, so it can be distinguished from custom keys() types when bulk-regenerating keys() types
bugfix: schemas/util.sql: show_relations_like(): also need to include composite types, as these are also relations (and are expected to be included by callers of show_relations_like())
bugfix: schemas/vegbien.sql: rm_output_queries(): also need to include keys_* and values__* types, as these are also associated with the query
schemas/util.sql: added debug_print_func_call(text) and use it where applicable
schemas/util.sql: drop_relations_like(): debug-print the regexps so that you can tell which tables it's trying to match
schemas/vegbien.sql: public_validations: regenerated ~type tables, which adds `copies` columns for queries with a mismatch in the # of occurrences of each row
bugfix: schemas/vegbien.sql: public_validations.validation_views(): need to include views with letters after the query # (eg. _plots_06a_list_of_stems)
schemas/util.sql: removed no longer used to_freq(regclass, drop_if_always_1). use to_freq(regclass) and auto_rm_freq() instead.
bugfix: schemas/util.sql: diff(regclass, regclass): only drop freq column if all tables have all 1s
schemas/util.sql: auto_rm_freq(): accept multiple tables, so the freq column is only dropped if all tables have all 1s
schemas/util.sql: added freq_always_1(regclass[])
schemas/util.sql: added drop_column(regclass[])
schemas/util.sql: added parent(regclass)
schemas/util.sql: try_create(): also handle not_null_violation, which is thrown when trying to add a NOT NULL column to a parent table, which cascades to a child table whose values for the new column will be NULL
bugfix: schemas/util.sql: diff(text, text): also need to cast left_/right_ to base type for the IS DISTINCT FROM filter, because the WHERE clause apparently does not use columns from the SELECT list, even though GROUP BY and ORDER BY do
schemas/util.sql: added to_freq(regclass, drop_if_always_1)
schemas/util.sql: added auto_rm_freq(regclass)
schemas/util.sql: added freq_always_1(regclass)
bugfix: schemas/util.sql: diff(regclass, regclass): need to create a diff when the # of copies of a row differs between the tables. this uses new util.to_freq().
schemas/util.sql: added to_freq(regclass)
schemas/util.sql: added populate_table(regclass, text)
schemas/util.sql: added copy_types_and_data(regclass, text)
schemas/vegbien.sql: public_validations schema comment: added instructions to change the key and value columns for a validations query
schemas/vegbien.sql: implemented _plots_16_intercepts_for_each_verb_taxon_in_each_plot_each_proj
schemas/vegbien.sql: implemented _plots_09_list_of_plots_which_use_line_intercept
schemas/vegbien.sql: public_validations: queries that use EXISTS: join locationevent.plot_id to plot.plot_id directly instead of going via location.plot_location_id
schemas/vegbien.sql: implemented _plots_08_list_of_plots_which_use_percent_cover
schemas/vegbien.sql: implemented _plots_07_list_of_plots_which_use_counts_of_indiv_per_species
schemas/vegbien.sql, inputs/SALVIAS/validations.sql: added _plots_06a_list_of_stems, for use in figuring out the diff in _plots_06_list_of_plots_with_stem_measurements
schemas/vegbien.sql: plot: removed explicit column lists added in the autorename of plot.location_id->plot_id
schemas/vegbien.sql: plot: renamed pkey to plot_id. note that the field is autorenamed in all validation views which use it.
schemas/vegbien.sql: locationevent: added autopopulated plot_id column which points to the outermost plot of the locationevent's location
bugfix: schemas/vegbien.sql: locationevent: added missing fkey on place_visit_id
bugfix: schemas/vegbien.sql: _plots_06_list_of_plots_with_stem_measurements: only include stemobservation records which have actual stem IDs, not merely stem-related measurements (DBH, etc.)
bugfix: schemas/vegbien.sql: _plots_06_list_of_plots_with_stem_measurements: LEFT JOIN to project instead of inner joining, to get Postgres to use the right query plan. this is the last change needed to make query #6 runnable.
bugfix: schemas/vegbien.sql: rematerialize_out_view(): run all queries with `SET enable_seqscan = off` to avoid slow query plans. this fixes _plots_06_list_of_plots_with_stem_measurements and significantly speeds up _plots_10_count_of_individuals_per_plot_in_each_project (and possibly others).
schemas/vegbien.sql: locationevent: documented `CREATE INDEX locationevent_place_visit_id` runtime (3 min)
fix: schemas/vegbien.sql: locationevent: added locationevent_place_visit_id index to facilitate joins to place_visit_id in the validations queries
bugfix: schemas/vegbien.sql: source_by_shortname(): documented that in some cases, it is actually a bad idea to use a nested SELECT, because this will prevent Postgres from using an index scan (causing an equally bad slowdown as not inlining in cases where a nested SELECT is required).
schemas/postgresql.conf: log_min_messages: dropped the verbosity back down to the default, to avoid clogging up the logs
schemas/vegbien.sql: locationevent: documented `VACUUM ANALYZE` runtime (20 min)
schemas/postgresql.conf: log_min_messages: show what autovacuum is doing
fix: schemas/postgresql.conf: disable autovacuum_vacuum_cost_delay to avoid stalling autovacuuming due to a concurrent query, as this can prevent autovacuuming from happening altogether (http://vegpath.org/links/#PostgreSQL:%20Documentation:%209.3:%20Resource%20Consumption:%2018.4.4.%20Cost-based%20Vacuum%20Delay)
schemas/vegbien.sql: location: documented `CREATE INDEX plot_source_id` runtime (5 min)
fix: schemas/vegbien.sql: location: added plot_source_id index to provide the equivalent of the location.source_id index for outer plots. this will help Postgres choose the right query plans in queries involving outer plots.
schemas/vegbien.sql: _plots_18_list_of_subplots_codes_for_each_plot_for_each_project: added ~type table
schemas/vegbien.sql: _plots_15_pct_cover_of_each_verb_taxon_in_each_plot_in_each_pro: added ~type table
bugfix: schemas/Makefile: `%/install: vegbien.sql`: also need to match `public_validations` when used as a schema-qualifier (public_validations._), and after a cast (::) to a schema-qualified type. these occur in schema-qualified casts to the custom return type in the keys() functions.
bugfix: schemas/Makefile: `%/install: vegbien.sql`: sed expr: need to use '' instead of "" because $(*q) may contain "