/trunk/inputs/input.Makefile - Changes - BIEN 3 - NCEAS Projects

root/trunk/inputs/input.Makefile @ 12991

#	Date	Author	Comment
12991	03/30/2014 06:02 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added new-style aggregating validations (`validate` target)
12920	03/27/2014 03:31 AM	Aaron Marcuse-Kubitza	bugfix: lib/common.Makefile: $(add*): need to wrap w/ $(wildcard) to prevent "targets don't exist" error, because svn 1.7 does not suppress this error even with --force
12919	03/27/2014 03:27 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: add!: add* of $(svnFiles): need to ignore errors because svn 1.7 does not suppress the "targets don't exist" error even with --force
12867	03/22/2014 05:06 AM	Aaron Marcuse-Kubitza	fix: inputs/input.Makefile: don't treat *.xml as data files since these are not currently supported
12795	03/21/2014 02:16 AM	Aaron Marcuse-Kubitza	fix: inputs/input.Makefile: removed no longer used special handling of XML inputs, support for which was never added to the Makefile. (bin/map, however, does support importing an XML file into a database.) this fixes a bug in XAL, which used to abort with an error but now just imports an empty table.
12794	03/21/2014 12:34 AM	Aaron Marcuse-Kubitza	fix: inputs/input.Makefile: %/install: don't ignore errors if table does not exist, to ensure a proper errexit. this is now possible because every dir that this target is being run on should be a data dir. (Source/ used to be a metadata-only dir.)
12793	03/21/2014 12:31 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: $(cleanup): need `set -o pipefail`
12751	03/18/2014 05:16 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/postprocess.sql: don't perform replacements using map.csv, because map.csv is not idempotent. this functionality was only there to facilitate switching to new-style import, which is now largely done. (the remaining datasources NVS, SALVIAS, TEAM contain only 1 postprocess.sql: inputs/SALVIAS/projects/postprocess.sql (`st inputs/{NVS,SALVIAS,TEAM}/*/postprocess.sql`).)
12747	03/18/2014 04:33 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/postprocess.sql: always run this, not just if the associated map spreadsheets change, to avoid needing to `touch` them to cause %/postprocess.sql to run
12744	03/18/2014 04:06 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/postprocess.sql: also need to apply renames from mappings/VegCore.thesaurus.csv, as these have been applied to map.csv
12220	02/14/2014 12:20 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(svnFilesGlob): added validations.sql
12039	02/04/2014 10:32 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: verify/%.out: use a .sql file in the verify/ directory itself to generate .out, so that each datasource can have its own set of output queries. for datasources that should share the same set of queries, they can instead be symlinked to the same file.
12018	02/02/2014 12:49 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: add!: verify/: also svn:ignore .tsv, .txt
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11849	12/06/2013 02:44 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: install: for new-style datasources, use the associated runscript instead (the old-style install target will not do everything that's needed for a new-style datasource)
11847	12/06/2013 12:51 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: install: for new-style datasources, use the associated runscript instead (the old-style install target will not do everything that's needed for a new-style datasource)
11810	12/03/2013 03:44 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/header.csv: errexit the command so that errors won't scroll by, which in this case requires `set -o pipefail`
11802	12/03/2013 07:45 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: `%/install: %/create.sql`: errexit the command so that errors won't scroll by, which in this case requires `set -o pipefail`
11794	11/27/2013 11:04 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: scrub: clarified that using & (background process) also ignores TNRS errors (the primary purpose of & , of course, is to run asynchronously)
11777	11/26/2013 02:23 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: $(import): except in a full-database import, errexit so that the import will stop on an error and not let it scroll by
11719	11/21/2013 01:08 PM	Aaron Marcuse-Kubitza	fix: inputs/input.Makefile: $(svnFilesGlob): removed schema and PDF files, since these are owned by the data provider and should not be in the repository that gets open-sourced
11676	11/18/2013 03:52 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: sql/install: exit on error by using `set -o pipefail`
11564	11/05/2013 07:27 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(_svnFilesGlob): also svn-add _no_import in the top-level datasrc dir. (this requires using add! , because the presence of a _no_import file there will normally turn off adding by svnFilesGlob.)
11522	10/31/2013 02:16 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/install: don't run map_table, because this is instead done by the runscript. although it does not hurt to do it twice, invoking load_data by itself should not run map_table at all, so that the original column names can be inspected in the table and map.csv reordered to match.
11519	10/31/2013 01:51 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/install: don't run map_table, because this is instead done by the runscript. although it does not hurt to do it twice, invoking load_data by itself should not run map_table at all, so that the original column names can be inspected in the table and map.csv reordered to match.
11440	10/25/2013 09:58 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added %/import_temp alias for %/import, to mirror the presence of import_temp for import
11285	10/17/2013 04:43 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: import: remove the temp suffix once the import is done, so that the full database import doesn't keep the suffix attached to the datasources that import_all didn't import with reimport. removed unused import_publish target (instead use import_temp to invoke just the import without the temp suffix removal).
11253	10/12/2013 12:48 PM	Aaron Marcuse-Kubitza	bugfix: Makefile: recursive invocation of $(MAKE): enclose targets in "" in case they contain
11251	10/12/2013 12:11 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/uninstall: allow user to set is_view=1 flag to use DROP VIEW instead of DROP TABLE
11236	10/10/2013 12:43 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: `ln -s` to create VegBIEN.csv: enclose the filenames in "" since they may contain * (e.g. taxon_observation.**)
10994	09/15/2013 10:02 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: `%/install: %/create.sql`: don't include %/header.csv as a target, so that it won't get deleted if the install fails (especially on a step that happens after the header is exported)
10874	09/05/2013 01:01 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: reimport: don't remove the existing import first, because it will instead be removed by the publish step. this ensures there is always one complete copy of the datasource in the DB.
10870	09/05/2013 12:02 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: reimport: use import_publish instead of import so that the reimport replaces the previous import
10869	09/04/2013 11:59 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added import_publish, which removes the temp suffix when the import is done
10863	09/04/2013 03:00 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(map2db): import to datasrc.new instead of plain datasrc, so that the current import of the datasrc is not overwritten
10862	09/04/2013 02:59 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added publish (`make inputs/src/publish`)
10860	09/04/2013 02:43 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added %/publish (`make inputs/src/src.version/publish`)
10839	08/30/2013 11:18 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/test: in by_col mode, also need to run %/test.by_col.xml
10798	08/29/2013 05:09 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: rm: use new datasource_rm(), which encapsulates the schema-specific aspects of removing a datasource
10748	08/27/2013 12:55 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: scrub: documented that using & (background process) ignores TNRS errors, so that TNRS bugs do not prevent the remaining tables from being imported even if TNRS can't be run
10582	08/03/2013 03:28 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(import): support restarting the import where it left off by setting continue=1. this is done by grepping the restart row out of the log file's last partition.
10581	08/03/2013 03:11 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added %/import_scrub, similar to import_scrub but just imports one table
10347	07/19/2013 10:51 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/postprocess.sql: need to run bin/repl in text mode (text=1) so that values to match are treated as literal strings rather than regular expressions. this difference is important for column names with spaces or special characters.
10312	07/18/2013 11:38 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added %/postprocess.sql to replace input column names with the corresponding output column names when switching to new-style import (this target must be manually run, but does simplify the process of renaming the postprocess.sql input columns)
10256	07/11/2013 11:56 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: Staging tables installation: $(allInstalls): don't filter out Source table, because it is now an installed table rather than just a mapping
10241	07/10/2013 09:53 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: %/install: run %/map_table at end to rename the staging table columns for new-style datasources
10240	07/10/2013 09:52 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: added %/map_table to run the new-style import staging table renaming
10205	07/10/2013 01:50 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: map.csv and derived files: use $(tables) instead of $(importTables) when making them so that the mappings of those tables are still kept up-to-date even though they are marked _no_import (and not imported into the main DB)
10180	07/06/2013 06:00 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/postprocess: removed no longer used invocation of $*/import (precursor to the runscripts used in FIA)
10174	07/06/2013 03:55 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: for new-style datasources, use a symlink to mappings/VegCore-VegBIEN.csv directly instead of prefiltering VegCore-VegBIEN.csv to include only the columns in map.csv. prefiltering used to be performed as part of mapping the map.csv VegCore output terms to VegBIEN using bin/join, but is no longer needed because the staging table columns are now VegCore terms. instead, the full VegCore-VegBIEN.csv is needed so that derived columns added in stage I or II validations are detected by bin/map (rather than just the original source columns in map.csv).
10167	07/06/2013 01:45 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: SVN: add: don't add subdirs for datasources marked _no_import (e.g. datasources which only have an inputs/ dir to be listed in VegPath)
10162	07/03/2013 08:21 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: SVN: $(svnFilesGlob): added data.csv, used to store versioned data (such as the empty data.csv used by Source/ tables which have their metadata in the map table instead)
10107	06/28/2013 04:47 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added support for separate grants.sql file, which may contain GRANT statements that would normally be filtered out by pg_dump_limit
10106	06/28/2013 04:44 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: sql/install: added $debug option to run the *.sql import verbosely, to display which statements are being run. this should only be used for SQL files that use COPY FROM to import data, to avoid echoing pages of insert statements.
10105	06/28/2013 01:53 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: keep $(sortFile) up-to-date: use sort_file_updated=1 flag to indicate that import_order.txt has already been checked, so that recursive invocations of make don't need to recheck it. also use this flag instead of an explicit $(MAKECMDGOALS) list to prevent the $(sortFile) check from being infinite-recursively reinvoked when input.Makefile is read as part of the $(sortFile) check itself.
10104	06/28/2013 01:38 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: keep import_order.txt up-to-date by running `make $(sortFile)` each time make is run. this ensures that new datasources always have import_order.txt populated when make is first run. eventually, $(tables) can be always set to $(allTables) so that this auto-updating can also be used to ensure that new subdirs added by the user always make it into import_order.txt (so that they will be included in the subdirs that get remade, etc.). import_order.txt is primarily for specifying the order of the subdirs, but some datasources also use it to filter out subdirs, so it can't yet be always updated to include the full list of subdirs. however, the filter-out usage should no longer be necessary after the switch to new-style import.
10103	06/28/2013 12:58 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added $(filter_make), used to filter the output of embedded $(shell make ...) invocations
10102	06/28/2013 11:39 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(sortFile): use $(filter-out)->then instead of $(filter)->else for clarity
10101	06/28/2013 11:21 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added $(sortFile) (import_order.txt) target which adds any missing tables to import_order.txt
10100	06/28/2013 11:03 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: added list_tables to print $(tables) for use in populating import_order.txt
9968	06/20/2013 07:01 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: `%/install %/header.csv: %/create.sql`: in noclobber mode, mark %/header.csv as .PRECIOUS so the existing file won't be deleted if the table already exists (causing an error exit)
9951	06/19/2013 08:54 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(_svnFilesGlob): added *Makefile
9948	06/19/2013 08:45 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(_svnFilesGlob): added *run (runscripts)
9880	06/12/2013 10:45 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(dontImport): also support putting a _no_import file at the top level in the datasource to exclude the entire datasource
9875	06/12/2013 09:41 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: use header from map.csv instead of the new columns, so that source.shortname is set to GBIF instead of VegCore
9874	06/12/2013 09:24 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/VegBIEN.csv: when a runscript is available, instead map the output columns of map.csv to VegBIEN, because the columns have been renamed in the staging table
9843	06/11/2013 05:56 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/install: don't run $(cleanup) if it has already been run by $(import_install_), so that it doesn't run twice
9842	06/11/2013 05:54 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/postprocess: don't run postprocess.sql if it is supposed to be run by a runscript, because postprocess.sql may then depend on additional steps the runscript runs before it
9833	06/11/2013 04:51 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: $(logInstall): don't output to the install log if $noclobber flag is set, to prevent overwriting the log when re-running the install target idempotently
9417	05/16/2013 04:38 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(svnFiles): also exclude *.data.sql, which should never be in svn
8835	05/06/2013 04:25 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: sql/install: manually specify $no_search_path option to psql_script_vegbien, which is added automatically in $(psqlNoSearchPath) but that uses psql_verbose_vegbien
8801	05/02/2013 08:53 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: SVN: add, %/add: /logs: also svn:ignore .gz, used for compressed log files
8384	04/09/2013 09:36 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Removed .PRECIOUS from %/header.csv, %/map.csv so that these scripts are deleted on error. This is useful for the runscripts and for non-data dirs whose header.csv cannot be made.
8383	04/09/2013 09:33 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: SVN: Only run %/add on subdirs with visible (non-hidden) files. Subdirs with only hidden files (e.g. .htaccess) are assumed to be non-data dirs.
8382	04/09/2013 09:23 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: %/install: $(logInstall): Only use log file if log dir exists, to support non-data dirs
8381	04/09/2013 09:13 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: %/install: ignore errors in $(exportHeader) and $(cleanup) if the table does not exist (i.e. a non-data dir)
8380	04/09/2013 09:11 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: %/install: Don't run $(import_install_) for empty dirs because there is no data to import
8254	03/28/2013 07:22 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: SVN: add: Removed Source/map.csv prerequisite because it is not related to adding unversioned files in the dir. It was originally a prerequisite in order to auto-create it when the datasource dir is first created, but the map.csv recipe does not currently create metadata-only map.csvs. In the future, metadata-only map.csvs will be replaced with constant columns added to the applicable tables.
8252	03/28/2013 07:19 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/map.csv: Fixed bug where can only make header.csv if map.csv does not exist, because some subdirs are metadata-only and don't have a corresponding DB table
8249	03/28/2013 06:38 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: sql/install: Use psql_script_vegbien instead of $(psqlNoSearchPath) (which uses psql_verbose_vegbien) because the insert statement for each data row should not be echoed
8244	03/28/2013 06:10 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/map.csv: make $*/header.csv first in case it doesn't exist (e.g. if it has been deleted so that it will be remade)
8241	03/28/2013 05:23 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: postprocess: Use %/postprocess instead of %/postprocess.sql/run so $*/import is also run
8239	03/28/2013 05:19 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/postprocess: Also run the $*/import script, if it exists. Note that this is not the same as the %/import make target.
8238	03/28/2013 05:12 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/postprocess.sql/run: Factored out into separate %/postprocess command, which can eventually also perform other actions
8201	03/27/2013 06:56 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/header.csv: Fixed bug where newlines inside column names were incorrectly formatted by psql's table header formatting, by using COPY TO STDOUT instead
8176	03/25/2013 09:01 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/.map.csv.last_cleanup: Run fix_line_endings after canon/translate to standardize Python's \r\n line endings back to \n. This prevents issues with mixed line endings because LibreOffice (and probably Excel) treat all cell-internal line endings as \n but row line endings as whatever the file had, while text editors like jEdit translate all line endings to whatever the autodetected line ending is. (This creates spurious line ending diffs when a map spreadsheet containing multiline cells is edited in a text editor.)
8160	03/22/2013 11:13 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Moved postprocess.sql from $(exportHeader) to %/install because that is not part of the $(exportHeader) functionality. Added %/header.csv and use it in $(exportHeader).
8159	03/22/2013 11:05 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(catSrcs): Fixed bug where need to use $(nonHeaderSrcs) instead of $(srcs) to exclude header.csv
8157	03/22/2013 07:39 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Moved $(cleanup) from $(exportHeader) to %/install because this is not part of exportHeader's functionality
8156	03/22/2013 07:29 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: $(mkSrcMap): Use header.csv instead of the header of the CSVs, so that the column list in the map spreadsheet matches the actual DB table
8155	03/22/2013 07:18 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %.sql/run: Change to the directory the file is located in, so that includes (\i) are relative to the file, rather than relative to whatever happens to be the current directory
8154	03/22/2013 07:15 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/install: Always generate a header.csv, even for CSV inputs with their own header. This will include the actual column names in the staging table, which may differ from their names in the CSVs (e.g. the addition of row_num). Note that header.csv is not included in the CSVs list itself, and will not override the header or dialect in them.
8121	03/20/2013 05:07 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: $(exportHeader): Fixed bug where need to run postprocess.sql before exporting the header, because it can change the column names
8120	03/20/2013 05:02 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: $(exportHeader): export the header before running $(cleanup), because the header is not affected by the data cleanup operations and thus can be generated right away, to allow mapping while the cleanup operations run
8118	03/20/2013 03:23 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: $(exportHeader): Fixed bug where need to use psql_script_vegbien instead of the psql_verbose_vegbien used by $(psqlAsBien), to avoid echoing commands as part of the exported header
8113	03/20/2013 10:12 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: Added postprocess target, which runs all the postprocess.sql files
8108	03/20/2013 08:47 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: BIEN commands: $(psqlAsBien): Use psql_verbose_vegbien instead of psql_script_vegbien so that timings and notices are displayed, which is useful for profiling and debugging
8091	03/19/2013 11:48 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: %/install: Use new %.sql/run to run postprocess.sql
8090	03/19/2013 11:47 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: Added %.sql/run to run postprocess.sql, etc. separately from the install targets they are a part of
8089	03/19/2013 11:47 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: Staging tables installation: Added %.sql/run to run postprocess.sql, etc. separately from the install targets they are a part of
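
Several entries above (12793, 11810, 11802, 11676) fix recipes by adding `set -o pipefail` so that a failure inside a pipeline is not masked by a later command that succeeds. The following is a minimal sketch of that behavior in plain bash, not the Makefile's actual recipe: `failing_step` is a hypothetical stand-in for a command such as a psql invocation whose output is piped through a filter. Note that `pipefail` is a bash option, so a Makefile relying on it is assumed to run its recipes under bash rather than a POSIX-only `/bin/sh`.

```bash
#!/usr/bin/env bash
# failing_step is a hypothetical stand-in for a command that prints
# some output and then fails (e.g. an import step that hits an error).
failing_step() { echo "partial output"; return 3; }

# Default behavior: the pipeline's exit status is that of the last
# command (cat), so the failure of failing_step is masked.
failing_step | cat >/dev/null
status_default=$?

# With pipefail, the pipeline returns the status of the rightmost
# command that exited non-zero, so the failure becomes visible.
set -o pipefail
failing_step | cat >/dev/null
status_pipefail=$?
set +o pipefail

echo "default=$status_default pipefail=$status_pipefail"
```

In a recipe, the same idea means the pipeline's non-zero status reaches make, which then stops instead of letting the error scroll by.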
