# Date Author Comment
8316 04/04/2013 01:16 AM Aaron Marcuse-Kubitza

inputs/FIA/*/map.csv: Changed newlines between table and field name to - because the newlines mess up the flow of queries and also break pgAdmin's display of EXPLAIN output. The - was chosen because it's a non-whitespace character that linewraps in browsers, phpPgAdmin, and Google spreadsheets (although unfortunately not in pgAdmin). It is better than space because you can set a text editor to treat it as a word character, allowing the entire column name (<table>-<field>) to be selected by double-clicking it.
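
The renaming described above can be sketched in Python as follows. This is only an illustration, not the script actually used for this revision: the function name `normalize_column_name` is hypothetical, and it assumes the old column names have the form `<table>`, newline, `<field>`.

```python
def normalize_column_name(name: str) -> str:
    """Join table and field name with '-' instead of a newline.

    Assumption: the old name is "<table>\n<field>". Names without a
    newline are returned unchanged.
    """
    table, sep, field = name.partition("\n")
    if not sep:
        return name  # no newline, so nothing to change
    return f"{table}-{field}"
```

For example, a column named `PLOT`, newline, `ELEV` would become `PLOT-ELEV`, which stays on one line in query text and EXPLAIN output.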

8315 04/03/2013 09:55 PM Aaron Marcuse-Kubitza

Added planning/workflow/normalized_vs_denormalized/denormalized.generic_standardizations.png (a slide from Brad's bien3_architecture_denormalized.pptx PowerPoint), which shows the staging table preprocessing particularly well

8314 04/03/2013 09:45 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: record the import times in inputs/import.stats.xls: Added instructions for what to do if the rightmost imports start getting truncated due to the 255-column limit in spreadsheets. (This will occur in 8 imports.)

8313 04/03/2013 09:32 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Removed the previous imports from the current tab because they are also in the 2012-6~9 tab, and should not be in two places

8312 04/03/2013 09:28 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. MO and FIA have been refreshed.

8311 04/02/2013 04:17 PM Aaron Marcuse-Kubitza

Removed no longer needed inputs/GBIF/import. Use ./run instead.

8310 04/02/2013 04:17 PM Aaron Marcuse-Kubitza

Removed no longer needed inputs/GBIF/_MySQL/import. Use ./run instead.

8309 04/02/2013 04:16 PM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/run: import: Run make directly instead of via ./import

8308 04/02/2013 04:15 PM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/run: Use new import.run, which defines all()

8307 04/02/2013 04:06 PM Aaron Marcuse-Kubitza

Added planning/workflow/normalized_vs_denormalized/bien3_architecture_(de)normalized.pptx

8306 04/02/2013 03:57 PM Aaron Marcuse-Kubitza

Added planning/workflow/normalized_vs_denormalized/BIEN-modArch-Dec2010 NS-SBD 1.4.ppt.url

8305 04/02/2013 03:50 PM Aaron Marcuse-Kubitza

planning/workflow/: Moved normalized vs. denormalized files to separate normalized_vs_denormalized/ subfolder

8304 04/02/2013 03:21 PM Aaron Marcuse-Kubitza

Regenerated inputs/ACAD/Specimen/logs/steps.by_col.log.sql

8303 04/02/2013 03:15 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: Override MySQL_export() so $filter can be customized

8302 04/02/2013 03:13 PM Aaron Marcuse-Kubitza

inputs/GBIF/table.run: import(): Updated for lib/table.run template changes

8301 04/02/2013 03:09 PM Aaron Marcuse-Kubitza

lib/table.run: template: import(): Also pass "$@" to superclass method

8300 04/02/2013 03:08 PM Aaron Marcuse-Kubitza

lib/table.run: template: Use "$FUNCNAME" instead of hardcoding import

8299 04/02/2013 03:02 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/MySQL_export, used by ./table.run

8298 04/02/2013 02:57 PM Aaron Marcuse-Kubitza

lib/util.run: echo_func: Fixed bug where BASH_LINENO[0] needs to be used for the line #s to match up with the files. For some reason the required array indexes for BASH_SOURCE (1) and BASH_LINENO (0) differ by one.

8297 04/02/2013 02:51 PM Aaron Marcuse-Kubitza

inputs/GBIF/run: Use new import.run, which defines all()

8296 04/02/2013 02:51 PM Aaron Marcuse-Kubitza

lib/table.run: Use new import.run, which defines all()

8295 04/02/2013 02:49 PM Aaron Marcuse-Kubitza

Added lib/import.run

8294 04/02/2013 02:48 PM Aaron Marcuse-Kubitza

lib/util.run: echo_func: Include the line # of the function to make it easier to find where the code being run is

8293 04/02/2013 02:32 PM Aaron Marcuse-Kubitza

lib/table.run: Added all (default target)

8292 04/02/2013 02:26 PM Aaron Marcuse-Kubitza

lib/util.run: run_cmd: If bash exited with an error, don't run the "$@" command. This test is necessary because `trap run_cmd EXIT` will run run_cmd as the result of any exit from the shell, including an error.

8291 04/02/2013 02:21 PM Aaron Marcuse-Kubitza

*run: Use -e option to bash on the #! line instead of separate `set -o errexit` line so that there is no issue with the `set -o errexit` line getting separated from the #! line (errexit is required for the scripts to work properly)

8290 04/02/2013 02:09 PM Aaron Marcuse-Kubitza

lib/util.run: run_cmd: When no command specified, default to running the `all` command, just like make

8289 04/02/2013 02:07 PM Aaron Marcuse-Kubitza

lib/util.run: Run run_cmd at shell exit (using trap) instead of requiring every runscript to have `run_cmd ` at the end of it

8288 04/02/2013 01:49 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/run

8287 04/02/2013 01:48 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/raw_occurrence_record/run

8286 04/02/2013 01:47 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/table.run

8285 04/02/2013 01:45 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/_MySQL/run

8284 04/02/2013 01:42 PM Aaron Marcuse-Kubitza

lib/util.run: fwd: Check that $subdirs is defined. Added $subdirs to usage.

8283 04/02/2013 01:39 PM Aaron Marcuse-Kubitza

lib/util.run: fwd: Added usage

8282 04/02/2013 01:32 PM Aaron Marcuse-Kubitza

lib/table.run: Switched from echo_run to echo_func

8281 04/02/2013 01:16 PM Aaron Marcuse-Kubitza

lib/util.run: run_cmd: Echo the command being run, including the top-level run script. This is in addition to the echoing of the command in the function itself (using echo_func), which provides both the runscript that was run and the file where the invoked command was actually located (which may be different due to includes).

8280 04/02/2013 01:12 PM Aaron Marcuse-Kubitza

lib/util.run: Echo the command at the beginning of each function using new echo_func, instead of having to type echo_run before every call to a function. Note that because echo_func uses BASH_SOURCE, the path to the file containing the function will be included in the debug message, which greatly facilitates locating which file a command is in.

8279 04/02/2013 01:08 PM Aaron Marcuse-Kubitza

lib/util.run: Added echo_func

8278 04/02/2013 12:50 PM Aaron Marcuse-Kubitza

lib/util.run: Added echo_cmd and use it in echo_run

8277 04/02/2013 12:46 PM Aaron Marcuse-Kubitza

lib/util.run: echo_cmd(): Renamed to echo_run for clarity, because it also runs the command

8276 04/02/2013 12:39 PM Aaron Marcuse-Kubitza

lib/util.run: Added inline_make()

8275 04/02/2013 12:39 PM Aaron Marcuse-Kubitza

lib/util.run: Added echo_stdin()

8274 04/02/2013 12:30 PM Aaron Marcuse-Kubitza

bin/my2pg_export: Put --password first because it's an authentication-related option

8273 04/02/2013 10:52 AM Aaron Marcuse-Kubitza

Added lib/table.run, which includes the commands in import.sh but uses run scripts to allow running commands other than just import. (For example, map_table or postprocess can be run separately. Uninstall-related commands which would not belong in an import script can also be added, because import is only one of many commands a run script can offer.)

8272 04/02/2013 10:35 AM Aaron Marcuse-Kubitza

Added lib/util.run with general functions and template for run scripts (a bash-based replacement for make). Unlike make, run scripts support full bash functionality including multiline commands. The run script template also includes syntax for various kinds of relative includes in bash.

8271 04/02/2013 12:03 AM Aaron Marcuse-Kubitza

lib/common.Makefile: Added $(require_var)

8270 04/01/2013 10:42 PM Aaron Marcuse-Kubitza

bin/publish_analytical_db: Fixed bug where `ESCAPED BY '"'` needs to be removed, because it would cause a " followed by an escape sequence char to be interpreted specially (e.g. "n -> \n). MySQL automatically takes care of quote doubling when you specify `FIELDS OPTIONALLY ENCLOSED BY`.
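
To illustrate what quote doubling looks like, here is a minimal sketch using Python's standard `csv` module. It is an analogy only: it does not run MySQL or reproduce bin/publish_analytical_db, and the helper names `write_row` and `read_row` are hypothetical. It shows that a field containing `"n` survives a round trip when quotes are doubled and no separate escape character is configured.

```python
import csv
import io


def write_row(fields):
    """Serialize one row, doubling any embedded quote characters."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quotechar='"',
        doublequote=True,           # embedded " is written as ""
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buf.getvalue()


def read_row(text):
    """Parse one row written by write_row()."""
    return next(csv.reader(io.StringIO(text), quotechar='"', doublequote=True))
```

Because no escape character is involved, the `n` after the quote is read back as a plain `n` rather than as part of an escape sequence.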

8269 04/01/2013 10:13 PM Aaron Marcuse-Kubitza

lib/common.Makefile: Compression: Added `%:: .gz`, `.gz: %`

8268 04/01/2013 08:07 PM Aaron Marcuse-Kubitza

planning/workflow/import_process_comparison.odg: Moved "staging tables" under the method labels to reduce empty space

8267 04/01/2013 07:52 PM Aaron Marcuse-Kubitza

planning/workflow/import_process_comparison.odg: Removed margins so the labels would align with the page margin on the Import process wiki page <https://projects.nceas.ucsb.edu/nceas/projects/bien/wiki/Import_process>

8266 04/01/2013 07:32 PM Aaron Marcuse-Kubitza

Added planning/workflow/import_process_comparison.odg and .png export

8265 04/01/2013 06:12 PM Aaron Marcuse-Kubitza

lib/db_xml.py: put_table(): Fixed bug where command to advance start to fetch next set was unintentionally deleted when removing the is_view check

8264 04/01/2013 06:11 PM Aaron Marcuse-Kubitza

inputs/UNCC/Specimen/new_terms.csv: Updated for updated VegCore vocab

8263 04/01/2013 03:53 PM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.md5: Regenerated after appending agent table to GBIFPortalDB-2013-02-20.data.sql

8262 04/01/2013 03:51 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.gz.md5

8261 03/28/2013 08:16 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/raw_occurrence_record/ from refresh

8260 03/28/2013 08:07 PM Aaron Marcuse-Kubitza

inputs/GBIF/MySQL.schema.sql: Regenerated with inline enum type translated to CHECK constraint

8259 03/28/2013 08:07 PM Aaron Marcuse-Kubitza

bin/my2pg: Translate inline enum type to CHECK constraint
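
A rough sketch of this kind of translation is shown below. It is not the code in bin/my2pg: the function name `enum_to_check`, the regular expression, and the choice of `text` as the replacement type are all assumptions made for illustration. It also assumes one column definition per line and enum values that contain no `)` character.

```python
import re

# Matches the start of a MySQL column definition such as:
#   `status` enum('a','b') NOT NULL,
_ENUM_COL = re.compile(
    r"^(?P<indent>\s*)`?(?P<col>\w+)`?\s+enum\s*\((?P<values>[^)]*)\)",
    re.IGNORECASE,
)


def enum_to_check(line: str) -> str:
    """Rewrite an inline enum column as a text column with a CHECK constraint.

    Lines that do not define an enum column are returned unchanged.
    """
    m = _ENUM_COL.match(line)
    if m is None:
        return line
    col = m.group("col")
    head = (
        f'{m.group("indent")}"{col}" text '
        f'CHECK ("{col}" IN ({m.group("values")}))'
    )
    # Keep whatever followed the enum(...) type (NOT NULL, DEFAULT, comma)
    return head + line[m.end():]
```

With these assumptions, the hypothetical line `` `basis` enum('specimen','observation') NOT NULL, `` becomes `"basis" text CHECK ("basis" IN ('specimen','observation')) NOT NULL,`.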

8258 03/28/2013 07:43 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/**/MySQL.schema.sql

8257 03/28/2013 07:42 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/_MySQL/MySQL.*.sql.make

8256 03/28/2013 07:36 PM Aaron Marcuse-Kubitza

inputs/FIA/: Archived no longer used subdirs from BIEN2 export

8255 03/28/2013 07:29 PM Aaron Marcuse-Kubitza

inputs/FIA/: Archived no longer used subdirs from BIEN2 export

8254 03/28/2013 07:22 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: SVN: add: Removed Source/map.csv prerequisite because it is not related to adding unversioned files in the dir. It was originally a prerequisite in order to auto-create it when the datasource dir is first created, but the map.csv recipe does not currently create metadata-only map.csvs. In the future, metadata-only map.csvs will be replaced with constant columns added to the applicable tables.

8253 03/28/2013 07:19 PM Aaron Marcuse-Kubitza

Added inputs/FIA/_archive

8252 03/28/2013 07:19 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: %/map.csv: Fixed bug where header.csv can only be made if map.csv does not exist, because some subdirs are metadata-only and don't have a corresponding DB table

8251 03/28/2013 07:02 PM Aaron Marcuse-Kubitza

README.TXT: Datasource setup: Install the staging tables: For a MySQL .sql export: Documented which password to use at each of the two password prompts my2pg_export will give you. You could also embed the value of the 2nd prompt in the _MySQL/*.make file using `--password="$(cat path/to/config/bien_password)"`.

8250 03/28/2013 06:56 PM Aaron Marcuse-Kubitza

README.TXT: Datasource setup: Install the staging tables: Removed requirement that `make inputs/<datasrc>/reinstall quiet=1 &` be run on vegbiendev for MySQL .sql exports, because the hostname is now set to vegbiendev instead of localhost

8249 03/28/2013 06:38 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: sql/install: Use psql_script_vegbien instead of $(psqlNoSearchPath) (which uses psql_verbose_vegbien) because the insert statement for each data row should not be echoed

8248 03/28/2013 06:14 PM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Run remake_VegBIEN_mappings at end to keep mappings to next stage of import process up to date

8247 03/28/2013 06:14 PM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/: Accepted new test output

8246 03/28/2013 06:13 PM Aaron Marcuse-Kubitza

lib/import.sh: remake_VegBIEN_mappings(): Also remake VegBIEN.csv and test.xml.ref using `make test`

8245 03/28/2013 06:11 PM Aaron Marcuse-Kubitza

lib/import.sh: Added remake_VegBIEN_mappings()

8244 03/28/2013 06:10 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: %/map.csv: make $*/header.csv first in case it doesn't exist (e.g. if it has been deleted so that it will be remade)

8243 03/28/2013 06:07 PM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/map.csv: Regenerated using new input table mappings

8242 03/28/2013 05:47 PM Aaron Marcuse-Kubitza

lib/import.sh: Added make() and use it instead of the full make command

8241 03/28/2013 05:23 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: postprocess: Use %/postprocess instead of %/postprocess.sql/run so $*/import is also run

8240 03/28/2013 05:21 PM Aaron Marcuse-Kubitza

inputs/FIA/: Ran inputs/FIA/import. This maps to VegCore's commonName.

8239 03/28/2013 05:19 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: %/postprocess: Also run the $*/import script, if it exists. Note that this is not the same as the %/import make target.

8238 03/28/2013 05:12 PM Aaron Marcuse-Kubitza

inputs/input.Makefile: %/postprocess.sql/run: Factored out into separate %/postprocess command, which can eventually also perform other actions

8237 03/28/2013 04:59 PM Aaron Marcuse-Kubitza

inputs/FIA/PLOT/map.csv: ELEV: Remapped to elevation_ft, assuming units based on the actual elevation of the region for a sample plot record

8236 03/28/2013 04:27 PM Aaron Marcuse-Kubitza

inputs/VegBank/taxonobservation_/map.csv: Mapped int_currplantcommon to vernacularName

8235 03/28/2013 04:25 PM Aaron Marcuse-Kubitza

mappings/VegCore.htm: Renamed salvias_plots table plotMetadata to PlotMetadata because of SALVIAS refresh on nimoy

8234 03/28/2013 04:18 PM Aaron Marcuse-Kubitza

mappings/VegCore.htm: Regenerated from wiki. Added flower, fruit, commonName.

8233 03/28/2013 03:37 PM Aaron Marcuse-Kubitza

mappings/Makefile: $(vocab); bin/redmine_synonyms: Support crossed out (deprecated) terms

8232 03/28/2013 03:24 PM Aaron Marcuse-Kubitza

README.TXT: Maintenance: VegCore data dictionary: Added steps to update the data dictionary's Tables section if necessary

8231 03/28/2013 02:14 PM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/Makefile: %.data.sql: Added agent table

8230 03/28/2013 01:18 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.md5

8229 03/28/2013 01:11 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.schema.sql

8228 03/28/2013 11:02 AM Aaron Marcuse-Kubitza

Added web/main/svn*/, now using .htaccess to forward to Redmine/*

8227 03/28/2013 10:55 AM Aaron Marcuse-Kubitza

Removed web/main/svn, svn-web symlinks because they need to be .htaccess-es in order for the relative mod_rewrite commands to work correctly

8226 03/28/2013 10:50 AM Aaron Marcuse-Kubitza

Added web/main/svn, svn-web symlinks to Redmine/* for shorter URLs

8225 03/28/2013 10:49 AM Aaron Marcuse-Kubitza

Added web/main/Redmine/svn-web/

8224 03/28/2013 08:28 AM Aaron Marcuse-Kubitza

inputs/GBIF/: Added scripts for subsetting refresh

8223 03/28/2013 12:24 AM Aaron Marcuse-Kubitza

lib/sql.py: table_order_by(): Documented that it returns None if table is a view, because table_cluster_on() would return None. This is necessary for inputs/FIA/occurrence_all/ sorting to work correctly, because specifying a manual sort order would prevent the query planner from just using fast nested loop joins and instead cause it to perform a slow sort. (This appears to be a bug in the query planner, because when the column list specified matches the joined-on indexes, there should be no need for post-nested loop re-sorting.)

8222 03/28/2013 12:20 AM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/test.xml.ref: Updated inserted row count for new row sort order

8221 03/28/2013 12:19 AM Aaron Marcuse-Kubitza

lib/db_xml.py: put_table(): Fixed bug where start also needs to be advanced to fetch the next set when the table is a view, because the views that are now being used with the import (inputs/FIA/occurrence_all/) are static rather than dynamic and do not return different rows after the previous set of rows has been imported
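
The partitioned fetch described above can be sketched as follows. This is not the actual put_table() code: `iter_partitions` and the `fetch(start, limit)` callback are hypothetical names, and the sketch assumes the source is static, i.e. it returns the same rows for the same offset no matter what has already been imported.

```python
from typing import Callable, Iterator, List, Sequence


def iter_partitions(
    fetch: Callable[[int, int], Sequence],
    partition_size: int,
) -> Iterator[List]:
    """Yield successive partitions of rows from a static source.

    fetch(start, limit) is assumed to return up to `limit` rows
    beginning at offset `start`.
    """
    start = 0
    while True:
        rows = fetch(start, partition_size)
        if not rows:
            break  # no more rows to import
        yield list(rows)
        # Always advance the offset. A static source would otherwise
        # return the same first partition on every iteration.
        start += partition_size
```

If `start` were left unchanged for a static view, the loop would fetch the first partition repeatedly and never reach the remaining rows.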

8220 03/27/2013 11:43 PM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Removed no longer applicable comment that directional joins are needed for PostgreSQL query planner to avoid slow sorts

8219 03/27/2013 11:40 PM Aaron Marcuse-Kubitza

inputs/FIA/TREE/import: Reclustered table by TREE.parent path index, to facilitate path-order joins

8218 03/27/2013 11:39 PM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Changed all RIGHT JOINs to inner joins so that tables would be joined in path order (i.e. general->specific). This optimizes the incremental joins so that the small tables are joined to each other before being joined to the large tables, rather than each row of the large tables being looked up in the small tables. This effect may not be noticeable for small LIMIT values, but would become apparent for large LIMIT values, such as the 1-million-row partitions used by db_xml.put_table() for column-based import. Note that inner joins used to cause the query planner to produce incorrect results containing slow sorts, but now this appears to no longer be an issue, perhaps because the result is not sorted by the TREE.ID index (which is not in the same order as the path indexes *.unique, *.parent).

8217 03/27/2013 10:46 PM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Removed trailing whitespace