lib/import.sh: Export $schema, $table so they are available to programs invoked within an import script, which should not reset these vars if they include import.sh
lib/import.sh: Only set $table, $schema if they don't already exist
lib/import.sh: Added $root_dir and use it in $bin_dir
inputs/FIA/*/import: Use new mk_*_col()
schemas/*functions.sql: Renamed to *util.sql because now that these schemas are used by the new-style import scripts, there can be more than just functions in them
schemas/util.sql: Added mk_const_col()
schemas/util.sql: Added type_qual()
schemas/util.sql: mk_derived_col(): Added "idempotent" comment
schemas/util.sql: Added mk_derived_col()
inputs/FIA/COND/import: oldGrowth: Updated expr column names
schemas/util.sql: Added typeof(text, regtype)
inputs/FIA/*/import: Removed util. before function names because util is in the search_path
schemas/functions.sql: Added existing_cols()
schemas/functions.sql: col_type(): Fixed bug where a NULL col name crashed the undefined_column throw, because MESSAGE can't be NULL and the NULL name was nulling out the entire message
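The NULL-message failure described above follows from SQL string concatenation, where a NULL operand makes the whole result NULL. Below is a minimal sketch of that behavior. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, and the message text and the coalesce() guard are illustrative assumptions, not the actual col_type() fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Concatenating a NULL column name nulls out the entire message
(broken,) = conn.execute("SELECT 'column not found: ' || NULL").fetchone()

# Assumed guard: substitute a placeholder so the message is never NULL
(fixed,) = conn.execute(
    "SELECT 'column not found: ' || coalesce(NULL, '<NULL>')"
).fetchone()

print(broken)  # Python None, i.e. SQL NULL
print(fixed)
conn.close()
```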
schemas/functions.sql: Added col_exists()
inputs/FIA/COND/map.csv: Mapped SLOPE, ASPECT
web/main/.htaccess: remove linewraps (of the form table.path.vg/_-term) used to create a newline for Google spreadsheets
inputs/FIA/*/map.csv: Replaced . between table and column name with newline, so that table viewers like pgAdmin will display both the table and column name at the left edge of the header cell, rather than displaying only the table name because the column name doesn't fit. This fixes the problem of seeing a bunch of columns whose names all start with a table name, and not knowing what each of them is. It also preserves the ability to see at a glance which table a column is in, which helps in navigating wide tables. Removed * before unmapped terms, because whether a term is mapped is generally obvious from the table name itself.
inputs/input.Makefile: %/.map.csv.last_cleanup: Run fix_line_endings after canon/translate to standardize Python's \r\n line endings back to \n. This prevents issues with mixed line endings, which arise because LibreOffice (and probably Excel) treats all cell-internal line endings as \n but row line endings as whatever the file had, while text editors like jEdit translate all line endings to whatever the autodetected line ending is. (Mixed line endings create spurious line ending diffs when a map spreadsheet containing multiline cells is edited in a text editor.)
Added bin/fix_line_endings to standardize \r\n line endings to \n
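The line-ending standardization described in the two entries above can be sketched as follows. This is a minimal sketch in Python, assuming the tool only converts \r\n to \n; the real bin/fix_line_endings is not shown in this log, and the function name and sample data here are hypothetical.

```python
def fix_line_endings(data: bytes) -> bytes:
    # Convert every CRLF pair to LF; bare LFs (e.g. inside multiline
    # spreadsheet cells) are already in the target form and stay as-is
    return data.replace(b"\r\n", b"\n")


# A CSV row with a cell-internal LF and CRLF row endings (mixed endings)
mixed = b'a,"multi\nline"\r\nb,c\r\n'
print(fix_line_endings(mixed))
```

Working on bytes rather than text avoids any newline translation by Python's own file layer.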
inputs/FIA/COND/import: Renamed COND.oldgrowth to VegCore name oldGrowth
inputs/FIA/*/map.csv: Ensured that joined columns are globally unique, so they don't map to an ambiguous VegCore term in the future
inputs/FIA/*/map.csv: Mapped terms to VegCore
schemas/functions.sql: col_type(): Include column name in error message
inputs/FIA/*/import: Updated column names to match map.csv
schemas/functions.sql: col_type(): Raise undefined_column exception if column does not exist, instead of silently returning NULL
inputs/FIA/import: Abort if any invoked script encounters an error
planning/timeline/timeline.2013.xls: Updated for current progress
inputs/FIA/*/map.csv: Removed no longer needed leading . from joined fields (globally-unique terms), because functions.to_global_col_names() is not used anymore
Added inputs/FIA/occurrence_all/, which combines all the core tables in a denormalized view. Note that it is not necessary to materialize this view into a (large) denormalized table, because the unique indexes and left/right joins allow the rows to be denormalized on the fly.
inputs/FIA/*/import: Use map_table to set column names based on the contents of map.csv, instead of using functions.to_global_col_names() and functions.rename_if_exists(). Added map.csv for all tables.
inputs/FIA/: Changed postprocess.sql scripts to import scripts that can be run directly. Added top-level inputs/FIA/import to run all of them together.
inputs/FIA/COND/postprocess.sql: Removed trailing whitespace
Added lib/import.sh, for use by new, simpler import scripts used by FIA. Note that for now, input.Makefile is still used to create map.csv.
inputs/input.Makefile: Moved postprocess.sql from $(exportHeader) to %/install because that is not part of the $(exportHeader) functionality. Added %/header.csv and use it in $(exportHeader).
inputs/input.Makefile: $(catSrcs): Fixed bug where $(srcs) was used instead of $(nonHeaderSrcs), which is needed to exclude header.csv
schemas/functions.sql: map: Added additional columns that are present in the standard map spreadsheet format (filter, notes). These columns are necessary to make COPY FROM work, because it requires the # of columns to be the same in the input data and the output table.
inputs/input.Makefile: Moved $(cleanup) from $(exportHeader) to %/install because this is not part of exportHeader's functionality
inputs/input.Makefile: $(mkSrcMap): Use header.csv instead of the header of the CSVs, so that the column list in the map spreadsheet matches the actual DB table
inputs/input.Makefile: %.sql/run: Change to the directory the file is located in, so that includes (\i) are relative to the file, rather than relative to whatever happens to be the current directory
inputs/input.Makefile: %/install: Always generate a header.csv, even for CSV inputs with their own header. This will include the actual column names in the staging table, which may differ from their names in the CSVs (e.g. the addition of row_num). Note that header.csv is not included in the CSVs list itself, and will not override the header or dialect in them.
schemas/functions.sql: Added set_col_names()
schemas/functions.sql: rename_if_exists(): Also ignore duplicate_column exceptions, which are generated when a column is renamed to itself (as well as when two columns are renamed to the same place)
schemas/functions.sql: Added col_names(regclass), which unlike col_names(regtype) returns names in the order they are in the table
schemas/functions.sql: Added map_values()
schemas/functions.sql: map_get(): Fixed bug where STRICT was used in EXECUTE INTO, which causes a "query returned no rows" error whenever there is no match (as will sometimes be the case)
schemas/functions.sql: rename_cols(): Support any renames type with an -> operator
schemas/functions.sql: Added operator ->(regclass, text)
schemas/functions.sql: Added map_get()
schemas/functions.sql: table2hstore(): Made it STABLE instead of IMMUTABLE because the input table is not constant
schemas/functions.sql: Added table2hstore()
schemas/functions.sql: Added reset_map_table()
schemas/functions.sql: Added truncate()
schemas/functions.sql: mk_map_table(): Use the sql language instead of plpgsql because EXECUTE is not used directly, so plpgsql is not actually needed
schemas/functions.sql: mk_map_table(): Store map table schema in separate `map` table and extend it using LIKE, for easier maintainability of the map schema
schemas/functions.sql: Added mk_map_table()
schemas/functions.sql: ensure_prefix(): Made it IMMUTABLE instead of STABLE
schemas/functions.sql: Added rename_cols()
inputs/FIA/*/postprocess.sql: Avoid using :table, :table_str so that the commands in the script can also be run by pasting them into pgAdmin
README.TXT: Full database import: Manual steps to run TNRS/remake analytical DB: Added `export version=<version>` to ensure that the import is run into the correct schema. Since these instructions are for running commands separately from the rest of the import, it's important to first ensure that the import environment is set up properly.
schemas/vegbien.ERD.mwb: Added taxon_trait to ERD
schemas/vegbien.ERD.mwb: Regenerated exports
schemas/vegbien.sql: Removed unused analytical_aggregate table, because analytical_stem provides much more detailed, higher-quality data, both in terms of the number of rows and the number of columns. analytical_aggregate has also long been out of sync with the analytical DB schema, and it doesn't make sense to spend processing time in make_analytical_db to perform the DISTINCT ON if the table isn't being used. We may revisit analytical_aggregate later once we have ID fields for each entity in the DISTINCT ON and can avoid DISTINCTing on all analytical_aggregate columns.
inputs/FIA/*/postprocess.sql: Added index on *.CN (autogen IDs)
README.TXT: Full database import: Added steps to use `screen` to allow recovering from a closed terminal window
inputs/FIA/TREE/postprocess.sql: TREE.unique index: Renamed to TREE.ID because this is on an autogenerated pkey rather than on domain values (for which a set of unique columns has not yet been found and may not exist)
inputs/FIA/REF_SPECIES/postprocess.sql: Matched SPECIES_SYMBOL to .SYMBOL. Added .SYMBOL_TYPE for use in joining to REF_PLANT_DICTIONARY.
Added inputs/FIA/REF_UNIT/postprocess.sql
Added inputs/FIA/REF_RESEARCH_STATION/postprocess.sql
Added inputs/FIA/COUNTY/postprocess.sql
Added inputs/FIA/REF_PLANT_DICTIONARY/postprocess.sql
inputs/FIA/COND/postprocess.sql: Matched COND.HABTYPCD1, COND.HABTYPCD1_PUB_CD to REF_HABTYP_DESCRIPTION
inputs/input.Makefile: Staging tables installation: $(exportHeader): Fixed bug where postprocess.sql was not run before exporting the header, even though it can change the column names
inputs/input.Makefile: Staging tables installation: $(exportHeader): Export the header before running $(cleanup), because the header is not affected by the data cleanup operations and thus can be generated right away, to allow mapping while the cleanup operations run
inputs/FIA/REF_HABTYP_DESCRIPTION/postprocess.sql: Prepare columns for joining with COND
inputs/input.Makefile: Staging tables installation: $(exportHeader): Fixed bug where the psql_verbose_vegbien used by $(psqlAsBien) echoed commands into the exported header. Use psql_script_vegbien instead.
Added planning/workflow/(de)normalized_import.mappings.png
Added planning/workflow/denormalized_import.png, normalized_import.png
web/main/IH/: Added lowercase alias
Added web/main/IH/
inputs/input.Makefile: Staging tables installation: Added postprocess target, which runs all the postprocess.sql files
inputs/FIA/REF_SPECIES/postprocess.sql: Cast ID column to integer
inputs/FIA/*/postprocess.sql: Cluster tables by their *.unique index for faster joins
inputs/FIA/*/postprocess.sql: Cast ID columns to integer using new functions.set_col_types()
bin/psql_verbose_vegbien: Run with client_min_messages = NOTICE to display notices for debugging. This is supposed to be the default, but apparently isn't.
inputs/input.Makefile: BIEN commands: $(psqlAsBien): Use psql_verbose_vegbien instead of psql_script_vegbien so that timings and notices are displayed, which is useful for profiling and debugging
schemas/functions.sql: Added col_cast and set_col_types()
schemas/functions.sql: Added col_ref, col_type()
schemas/functions.sql: Added cluster_once()
schemas/functions.sql: Added cluster_index()
schemas/functions.sql: create_if_not_exists(): Also handle duplicate_column exceptions
schemas/functions.sql: Added rename_if_exists()
inputs/FIA/COND/postprocess.sql: Renamed oldgrowth to COND.oldgrowth so it wouldn't be renamed by to_global_col_names()
inputs/FIA/COND/postprocess.sql: Added oldgrowth column as part of the postprocessing instead of as part of the view that left joins the core tables together. This avoids needing to regenerate the oldgrowth field whenever the view is queried or materialized.
inputs/FIA/TREE/postprocess.sql: Added index on columns that join to parent tables
inputs/FIA/*/postprocess.sql: Removed table prefix from globally-unique columns that should be joined on
schemas/functions.sql: Marked STRICT functions as such
schemas/functions.sql: col_global_names(): Treat any column name that contains . as already being globally unique, and don't prepend the table name. This allows renaming the table columns after running col_global_names(), without causing the table name to be re-prepended the next time col_global_names() is run.
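The naming rule in the col_global_names() entry above is what makes the function safe to rerun. Below is a minimal sketch of that rule in Python. The real function is SQL in schemas/functions.sql and is not shown in this log, so the helper name and sample column names are hypothetical.

```python
def global_col_name(table: str, col: str) -> str:
    # A name that already contains "." is treated as globally unique,
    # so the table name is not prepended again
    if "." in col:
        return col
    return f"{table}.{col}"


names = ["CN", "COND.oldgrowth", "PLT_CN"]
once = [global_col_name("COND", c) for c in names]
twice = [global_col_name("COND", c) for c in once]
print(once)
print(once == twice)  # a second pass changes nothing
```

Because a second pass is a no-op, columns can be renamed by hand after the first pass without the table name being re-prepended later.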