schemas/util.sql: diff_cols(): documented how to run EXPLAIN on the FULL JOIN query
schemas/util.sql: diff_cols(): embed left_query/right_query directly in the main SQL statement, so that the query planner can take the sort order of the queries into account in planning the FULL JOIN (and where possible, use a fast merge join)
schemas/util.sql: added eval2col_pair()
schemas/util.sql: added functions for maintaining schema bundles, groups of schemas with a common prefix
schemas/util.sql: added schema_bundle_get_schemas()
schemas/util.sql: added schema_replace()
schemas/util.sql: added schema_rm(), schema_rename()
schemas/util.sql: added diff_views()
schemas/util.sql: added diff_cols(), for use with the aggregating validations
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
schemas/util.sql: explain2table(): documented usage:PERFORM util.explain2table($$query$$);
schemas/util.sql: explain2table(): by default, use the util.explain table
schemas/util.sql: added explain table
schemas/util.sql: added explain2notice()
schemas/util.sql: added explain2str()
schemas/util.sql: added explain2table()
schemas/util.sql: added explain()
schemas/util.sql: added force_recreate(), for use by sync_analytical_stem_to_view(). this uses the new `GET STACKED DIAGNOSTICS` in PostgreSQL 9.3 to access the DETAIL section of the dependent_objects_still_exist error.
schemas/*.sql: updated for PostgreSQL 9.3. this reorders some functions, adds empty comment headers for omitted SEQUENCE SET commands, and (best of all) finally splits view columns onto multiple lines, so that changes in the columns are actually legible (and produce their own svn diff!)
schemas/util.sql: added save_drop_views()
schemas/util.sql: added is_empty(anyarray)
schemas/util.sql: added regexp_matches_group()
schemas/util.sql: show_create_view(): also include GRANT statements, which are necessary to fully re-create the view
schemas/util.sql: added show_grants_for(table_ regclass), for use by show_create_view()
schemas/util.sql: added save_drop_view()
schemas/util.sql: added show_create_view()
schemas/util.sql: added mk_subset_by_row_num_func(regclass), which uses a sort_col instead of a row_num column
schemas/util.sql: mk_subset_by_row_num_func(): factored creation of no-sort subset function into mk_subset_by_row_num_no_sort_func() so it can be used by other functions
schemas/util.sql: added ||% operator to append to escaped strings (the % indicates an identifier, as in Perl hashes and one of the x86 assembler syntaxes for registers
bugfix: schemas/util.sql: mk_subset_by_row_num_func(): need explicit ORDER BY on the row_num_col, to prevent PostgreSQL from sorting the rows in reverse (they will be sorted because of the index scan, but the direction is otherwise indeterminate)
schemas/util.sql: mk_subset_by_row_num_func(): support tables whose pkeys don't start with 1 (such as the VegBank tables), by calculating the smallest row_num from the table
schemas/util.sql: offset2row_num(), limit2row_num(): added optional min_row_num, for tables whose serial pkeys start from a value other than 1
schemas/util.sql: added qual_name(regclass)
schemas/util.sql: added esc_name__append()
schemas/util.sql: added col__min()
schemas/util.sql: added limit2row_num() and use it in mk_subset_by_row_num_func() for clarity
schemas/util.sql: added offset2row_num() and use it in mk_subset_by_row_num_func() for clarity
schemas/util.sql: added schema_ident()
schemas/util.sql: added schema(regtype), schema(anyelement)
bugfix: schemas/util.sql: set_col_names_with_metadata(): rename any metadata cols rather than re-adding them with new names
schemas/util.sql: combining functions taking anyelement params which could be text: take text param instead, so that other argument types (e.g. integer) will first be implicitly cast to text instead of trying to concatenate integers directly. this fixes a bug in the VegBank.stemcount_,stemlocation_ _join() of two integer pkeys, which first needed to be cast to text. anyelement was previously used so that other text-like types such as varchar could also be used, but varchar is implicitly castable to text so keeping anyelement should not be necessary.
schemas/util.sql: added ?*>= operator (is populated more often than) from wiki.vegpath.org/Postgres_queries#is-populated-more-often-than
schemas/util.sql: hstore(text[], anyelement): replaced with just hstore(text[], text) so that bare literals (of type unknown) are auto-cast to text instead of causing a "could not determine polymorphic type because input has type "unknown"" error
bugfix: schemas/util.sql: reset_col_names(): remove metadata value columns from the map table entirely, so that they are not considered to be in the table (old-style import does not use metadata value columns, so there would be no column or a different column in the metadata value column's slot)
bugfix: schemas/util.sql: trim(): need to remove all added columns, not just derived columns, because metadata value columns are also added by new-style import and need to be removed to restore a staging table to its pre-new-style state
schemas/util.sql: added added_cols()
bugfix: schemas/util.sql: reset_col_names(): need to avoid renaming metadata value columns, because the "from" entry is a value, not a column name
schemas/util.sql: added ?>= operator with is_more_complete_than() function
schemas/util.sql: SQL-language IMMUTABLE functions marked STRICT: removed STRICT to enable dynamic inlining, which speeds up the function up to 7x. STRICT was not removed where the function was particularly complex and the STRICT optimization would likely be more significant than inlining.
schemas/util.sql: schema comment: added note that IMMUTABLE SQL-language functions should never be declared STRICT, because this prevents them from being inlined. inlining can create a significant speed improvement (7x+), by avoiding function calls and enabling additional constant folding.
schemas/util.sql: map_nulls(): documented that inputs/REMIB/Specimen/postprocess.sql > country also shows that inlining is now happening properly. note that the speed improvement due to inlining is not as much, %wise, when the values util._map() is run on are long strings instead of the short strings used in the initial profiling. this is because a greater % of the time is spent in system functions such as hstore>text, which are not affected by the inlining because they are run either way.
schemas/util.sql: map_nulls(): use new nulls_map(). proper inlining (i.e. same runtime before and after change) has been verified with the following profiling query:SELECT util.map_nulls(array[1, 2, 3]::text[], v) FROM unnest(array_fill(1, array100000)) f (v)
schemas/util.sql: added nulls_map(), for use with _map()
schemas/util.sql: added trim(), which trims a table to include only original columns, as defined by a map table
schemas/util.sql: added derived_cols(), which gets table_'s derived columns (all the columns not in the names table)
schemas/util.sql: added eval2set()
schemas/util.sql: added drop_column()
schemas/util.sql: map_nulls(): documented that due to dynamic inlining, this is just as fast as util._map() which it wraps. dynamic inlining now brings altogether a 40x speed improvement to map_nulls() (4000 ms -> 100 ms), and would likely bring a comparable improvement for other functions that are run repeatedly and call other user-defined functions.
bugfix: schemas/util.sql: map_nulls(): updated to use hstore(text[], anyelement), which has replaced hstore(anyarray, anyelement)
schemas/util.sql: removed hstore(anyarray, anyelement), which did not support dynamic inlining, to avoid confusion over which hstore() function to use. use new hstore(text[], anyelement) instead (with explicit cast on the keys array if needed).
schemas/util.sql: added hstore(text[], anyelement), which dynamically inlines properly, unlike hstore(anyarray, anyelement). this can be selected by explicitly casting the keys array to text[], which now provides a 6x speed improvement (380 ms -> 60 ms) for map_nulls().
schemas/util.sql: fix_array(): turned off STRICT to allow dynamic inlining, which speeds up util.map_nulls() by 3x (1500 ms -> 500 ms)
schemas/util.sql: array_length(anyarray), array_length(anyarray, dimension integer): turned off STRICT to allow dynamic inlining, which speeds up util.map_nulls(). this requires adding a `CASE WHEN $1 IS NULL THEN NULL` statement to array_length(anyarray, dimension integer) to replace the functionality provided by STRICT.
schemas/util.sql: map_nulls(): turned off STRICT to allow dynamic inlining, which causes a 2x speed improvement1. (see r10352 for an explanation of dynamic inlining.) note that turning off STRICT disables NULL-skipping (avoiding running a function when all its params are NULL), so it should only be used when the NULL-skipping optimization is needed less than dynamic inlining....
schemas/util.sql: inlinable IMMUTABLE functions: avoid using config params (e.g. `SET search_path TO util`) because these prevent dynamic inlining (i.e. inlining of a function call with variable instead of constant arguments, by substituting the arguments into the function's body). dynamic inlining can speed up function evaluation significantly, because a (slow) call to a user-defined SQL function is avoided.
bugfix: schemas/util.sql: set_col_names(): need to generate error if destination column already exists (rather than suppressing it with try_create()), because this indicates a collision
schemas/util.sql: map table: added unique constraint on the to column as well, because the destination names also need to be distinct in order to be a valid set of column names
schemas/util.sql: map table: changed pkey to a unique constraint so pgAdmin would sort the entries in table order (matching the order they are in the staging table) instead of alphabetized by the pkey
bugfix: schemas/util.sql: not_empty(anyarray): array_length() now refers to different functions, with different semantics, depending on whether util is in the search_path. this necessitates explicitly selecting util.array_length() and switching to its semantics (ARRAY[] -> 0 instead of NULL)
schemas/util.sql: map_nulls(): support all datatypes, not just text
schemas/util.sql: added hstore(keys anyarray, value anyelement) and => (anyarray, anyelement) operator to support other element types for hstore
schemas/util.sql: added map_nulls(), a common use case of _map()
bugfix: schemas/util.sql: hstore(keys text[], value text): use new fix_array() so that an empty keys array is made 1-dimensional to match up with the array generated by array_fill()
schemas/util.sql: added fix_array(), which ensures that the array will always have proper non-NULL dimensions
schemas/util.sql: added empty_array(), for constructing proper empty 1-dimensional arrays whose dimensions are not NULL ( {}::text[] does not do this)
bugfix: schemas/util.sql: array_length(anyarray): need to call util.array_length() instead of just array_length() (which uses pg_catalog.array_length()) so that empty arrays will be returned as 0 instead of NULL. note that for some reason, adding `SET search_path=util` to the function does not have the same effect.
bugfix: schemas/util.sql: _map(map hstore, value anyelement): need to cast result to unknown to support types that don't have a cast directly from text
schemas/util.sql: added _map(map hstore, value anyelement) to seamlessly map types other than text (by casting back and forth between text and the value type)
bugfix: schemas/util.sql: rename_cols(): run additional `SELECT NULL::void` query after the main for-loop query so that PostgreSQL does not try to fold away the execution of util.try_create() just because multiple rows are not returned by the function. the result set of the first query will still be discarded, but will be fully evaluated. (this has nothing to do with VOLATILE vs. IMMUTABLE; util.try_create() is already declared VOLATILE and would normally not be folded.) rename_cols() is used to rename derived columns, which are not part of the map.csv and cannot be positionally renamed.
schemas/util.sql: added text[] => text operator, analogous to text => text for multiple keys (uses hstore(keys text[], value text))
schemas/util.sql: added hstore(keys text[], value text), which can be used to avoid repeating the same value for each key. there are many /_map filters which use the XPath syntax for doing this, which now need to use an equivalent SQL syntax to avoid duplicating the value many times.
schemas/util.sql: added array_fill(anyelement, integer), which doesn't require lengths for multiple dimensions
schemas/util.sql: added array_length(anyarray, dimension integer) wrapper, which returns 0 instead of NULL for empty arrays
schemas/util.sql: added array_length(anyarray), which does not require a second dimension argument
schemas/util.sql: added reset_col_names(), the counterpart to set_col_names(). note that this alters the map table, so it will need to be repopulated after running this function.
schemas/util.sql: mk_derived_col(): support using this function to overwrite an existing column (i.e. as a general-purpose function to perform in-place update with ALTER COLUMN TYPE USING)
schemas/util.sql: create_if_not_exists(): also suppress "multiple primary keys are not allowed" error
schemas/util.sql: added table_nulls_mapped__get(), which gets whether a table's NULL-equivalent strings have been replaced with NULL
schemas/util.sql: added table_flag__get(), which gets whether a status flag is set by the presence of a table constraint
schemas/util.sql: added table_nulls_mapped__set(), which sets that a table's NULL-equivalent strings have been replaced with NULL
schemas/util.sql: added table_flag__set(), which stores a status flag by the presence of a table constraint
schemas/util.sql: create_if_not_exists(): also ignore duplicate_object exceptions, thrown when trying to add a duplicate constraint
schemas/util.sql: type_qual(), type_qual_name(): added comments to distinguish these similarly-named functions, one of which gets a type qualifier and the other of which gets a qualified name (not the name of a type qualifier, which one might otherwise assume)
schemas/util.sql: typeof(): support expressions that are not relative to a table (which do not have a table_ param). note that this requires removing the STRICT qualifier, so that NULL expressions will now produce an error instead of passing through as NULL.
bugfix: schemas/util.sql: col_names(): need to exclude dropped columns (which remain included in the pg_attribute table until the next tuple rewrite), by filtering on `NOT attisdropped`. lib/sql.py table_col_names() is not affected by this because it is able to access the column names from the DB driver directly, after performing `SELECT * FROM table LIMIT 0`.
schemas/util.sql: set_col_names_with_metadata(): don't delete the metadata entries from the map table, because they are now added before the renames take place, so that the renames can simply be performed on the constant columns themselves. this does, however, require that the metadata entries are always listed last in the map.csv (which is currently the case).
schemas/util.sql: reset_map_table(): drop the table and recreate it instead of just creating it if it doesn't exist, so that any change to the util.map table is propagated to persistent map tables whenever they are reloaded from the map.csv
schemas/util.sql: added drop_table()
schemas/util.sql: set_col_names(): don't perform rename if the name is not changing, to avoid cluttering the debug output with unnecessary queries
bugfix: schemas/util.sql: set_col_names_with_metadata(): need `util.` before mk_const_col(). "to", "from" need to be referenced from row_. substring() needs to start from 2 rather than 1 because PostgreSQL string indexes are 1-based.