# Date Author Comment
8369 04/09/2013 02:33 PM Aaron Marcuse-Kubitza

Added www/logs/

8368 04/09/2013 02:24 PM Aaron Marcuse-Kubitza

web/: Moved auxiliary files into the main/ subdir in preparation for having just the web/ dir. Renamed web/ to www/ so it can be replaced with web/main/.

8367 04/09/2013 02:09 PM Aaron Marcuse-Kubitza

web/main/servers/vegbiendev/: Split it into db and fs forks, with db being the default. The fs fork links directly to /home/bien/svn on vegbiendev, which makes world-readable files directly web-accessible. (Permissions on /home/bien/svn and subdirs have been checked to ensure that private files are not world-readable.)

8366 04/09/2013 01:33 PM Aaron Marcuse-Kubitza

Added web/main/IH/db/

8365 04/09/2013 01:26 PM Aaron Marcuse-Kubitza

web/main/index.php: Updated fragment redirect for new dotpath format (using ? instead of a relative path)

8364 04/09/2013 01:20 PM Aaron Marcuse-Kubitza

web/main/index.php: Updated path templates for new dotpath format (using ? instead of /)

8363 04/09/2013 01:06 PM Aaron Marcuse-Kubitza

web/main/**/.htaccess: Support dotpaths in the query string instead of in the path, so that non-dotpath paths don't need to be suffixed with / to prevent their filenames from being interpreted as dotpaths. Putting dotpaths in the query string still requires only one character between the host and the path, but it's ? instead of / . ? is in many ways more natural, because the dotpath is a non-filesystem string to be parsed rather than something that's already a filesystem path. This change also avoids the need to strip trailing /s in many RewriteRules, because the dotpath mechanism is no longer appending them.

8362 04/09/2013 01:00 PM Aaron Marcuse-Kubitza

Added web/main/dotpath.php, which parses any dotpath in the query string

8361 04/09/2013 12:58 PM Aaron Marcuse-Kubitza

web/main/util.php: Added coalesce()

8360 04/09/2013 10:18 AM Aaron Marcuse-Kubitza

web/main/svn*: Fixed symlinks to use .redmine instead of Redmine

8359 04/09/2013 09:37 AM Aaron Marcuse-Kubitza

web/main/IH/.htaccess: Fixed whitespace

8358 04/09/2013 09:36 AM Aaron Marcuse-Kubitza

web/main/IH/.htaccess: RewriteRule: Fixed bug where need to \-escape % because it's a special character in the replacement string (it references a regexp group matched by the last RewriteCond)

8357 04/09/2013 09:28 AM Aaron Marcuse-Kubitza

Removed web/main/Redmine symlink so that it isn't listed in the directory listing as a visible (non-hidden) file. (There is already a .redmine symlink which matches all case variations via web/main/.htaccess logic.)

8356 04/09/2013 09:22 AM Aaron Marcuse-Kubitza

Added web/main/svn* symlinks to VegBIEN/Redmine/svn*

8355 04/09/2013 09:18 AM Aaron Marcuse-Kubitza

web/main/svn*/: Removed in preparation for replacing with symlinks, which will work now that absolute paths are used where needed

8354 04/09/2013 09:15 AM Aaron Marcuse-Kubitza

web/main/**/.htaccess: internal redirects using relative paths with .. : Use absolute paths instead so that when the directory is reached through a symlink, the redirect will still work. Note that relative paths without .. do not need to be absolute paths because the subtree structure is the same (just the parent dirs are different).

8353 04/09/2013 08:01 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Regenerated exports

8352 04/09/2013 07:26 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb legend images: resized to medium-sized squares because apparently MySQL Workbench can't handle rectangular or very large images (it distorts them or resets them to their original size when the diagram is reloaded)

8351 04/09/2013 06:08 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Legend: Added IS-A, HAS-A, and "inherits from record" entries

8350 04/09/2013 06:08 AM Aaron Marcuse-Kubitza

Added lib/MySQL_Workbench/connector.png, solid_line.png, dotted_line.png

8349 04/09/2013 05:00 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Fixed lines

8348 04/09/2013 04:59 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Renamed placename to named_place to match the table name

8347 04/09/2013 04:55 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Renamed measurement to trait because this is the more commonly used name for the entity

8346 04/09/2013 04:47 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Added VegCore logo

8345 04/09/2013 04:46 AM Aaron Marcuse-Kubitza

Added schemas/VegCore/VegCore.logo.svg

8344 04/09/2013 04:09 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: taxon: Added recursive parent fkey for (optionally) storing taxon hierarchies

8343 04/09/2013 03:42 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Repositioned taxon_determination

8342 04/09/2013 03:39 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Moved taxon next to qualified_taxon instead of above it, because inheritance (IS-A) is shown vertically while HAS-A is shown horizontally

8341 04/09/2013 03:25 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Populated legend

8340 04/05/2013 12:23 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Added exports

8339 04/05/2013 12:23 AM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Fixed lines and settings for the Linux MySQL Workbench

8338 04/04/2013 10:01 PM Aaron Marcuse-Kubitza

schemas/VegCore/VegCore.ERD.mwb: Added table colors

8337 04/04/2013 10:01 PM Aaron Marcuse-Kubitza

Removed backup file schemas/VegCore/VegCore.ERD.mwb.bak

8336 04/04/2013 09:48 PM Aaron Marcuse-Kubitza

Added schemas/VegCore/VegCore.ERD.mwb, VegCore.my.sql with first VegCore ERD and MySQL schema. All tables are in the ERD, but contain only pkey and fkey columns.

8335 04/04/2013 09:52 AM Aaron Marcuse-Kubitza

lib/sql.py: mk_select(): using subset function: Turn off enable_sort (within the transaction) to avoid unwanted slow sorts. This change (along with the subset functions themselves) should significantly reduce the long FIA.occurrence_all table subset time (~8 hours altogether) and with it the total import time, which had more than doubled as a result of the FIA refresh. Note that this issue would have been even more pronounced for larger datasets, such as the GBIF refresh, which would have taken ~2.5 days longer (400 million rows * ~30% are plants * (FIA: ~8 hours/16.7 million rows) * 1 day/24 hours).
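The GBIF estimate in r8335 can be reproduced with a few lines of arithmetic. This is only a sketch using the figures quoted in the comment; the variable names are illustrative and are not taken from lib/sql.py.

```python
# Figures quoted in the r8335 comment; the names are illustrative only.
gbif_rows = 400e6         # total GBIF rows
plant_fraction = 0.30     # ~30% of GBIF rows are plants
fia_subset_hours = 8.0    # ~8 hours of subset time for FIA.occurrence_all
fia_rows = 16.7e6         # rows in FIA.occurrence_all

# Scale the per-row FIA subset cost up to the plant rows in GBIF.
hours_per_row = fia_subset_hours / fia_rows
extra_hours = gbif_rows * plant_fraction * hours_per_row
extra_days = extra_hours / 24

print(f"{extra_hours:.1f} hours = {extra_days:.2f} days")
```

With these inputs the result comes to roughly 2.4 days, in line with the ~2.5 days quoted in the comment.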

8334 04/04/2013 09:30 AM Aaron Marcuse-Kubitza

lib/sql.py: mk_select(): Use subset function when it's available for fast querying at large OFFSET values

8333 04/04/2013 09:29 AM Aaron Marcuse-Kubitza

lib/sql.py: Added has_subset_func()

8332 04/04/2013 08:48 AM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Run mk_subset_by_row_num_func() to make the subset functions available for fast querying at large OFFSET values

8331 04/04/2013 08:43 AM Aaron Marcuse-Kubitza

schemas/util.sql: mk_subset_by_row_num_func(): regular subset function: Fixed bug where need to add 1 to the 0-based offset_ to get the 1-based row_num (which is usually a serial column)

8330 04/04/2013 08:38 AM Aaron Marcuse-Kubitza

schemas/util.sql: mk_subset_by_row_num_func(): regular subset function: Fixed bug where need to subtract 1 from the end row_num because BETWEEN limits are inclusive of the bounds

8329 04/04/2013 08:33 AM Aaron Marcuse-Kubitza

schemas/util.sql: mk_subset_by_row_num_func(): regular subset function: Fixed bug where also need to COALESCE offset_ to 0 when it's added to the limit_
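The three fixes in r8329 to r8331 amount to a mapping from a 0-based OFFSET and a LIMIT to inclusive, 1-based row_num bounds. The sketch below illustrates that mapping in Python; it is not the SQL in schemas/util.sql, and the function name row_num_bounds() and its handling of a missing limit are assumptions.

```python
from typing import Optional, Tuple


def row_num_bounds(limit: Optional[int],
                   offset: Optional[int] = None) -> Tuple[int, Optional[int]]:
    """Map LIMIT/OFFSET to inclusive, 1-based (start, end) row_num bounds."""
    offset = 0 if offset is None else offset  # like COALESCE(offset_, 0)
    start = offset + 1                        # 0-based offset -> 1-based row_num
    if limit is None:
        return start, None                    # assumption: no LIMIT means no upper bound
    end = start + limit - 1                   # BETWEEN includes both bounds
    return start, end


print(row_num_bounds(10, 20))
```

For example, row_num_bounds(10, 20) gives (21, 30), so `row_num BETWEEN 21 AND 30` selects exactly 10 rows.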

8328 04/04/2013 08:20 AM Aaron Marcuse-Kubitza

schemas/util.sql: mk_subset_by_row_num_func(): subset function which turns off enable_sort: Fixed bug where need to pass ($2, $3) instead of ($1, $2) to the regular subset function

8327 04/04/2013 08:14 AM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Added occurrence_all-row_num column for use with mk_subset_by_row_num_func()

8326 04/04/2013 08:12 AM Aaron Marcuse-Kubitza

schemas/util.sql: mk_subset_by_row_num_func(): Also create subset function which turns off enable_sort. This is used for limit values greater than ~100,000 to avoid unwanted slow sorts. The regular subset function is still needed to work with EXPLAIN, so that it produces expanded output instead of just a function scan.

8325 04/04/2013 07:27 AM Aaron Marcuse-Kubitza

schemas/util.sql: Added mk_subset_by_row_num_func()

8324 04/04/2013 07:10 AM Aaron Marcuse-Kubitza

schemas/util.sql: Added type_qual_name()

8323 04/04/2013 06:33 AM Aaron Marcuse-Kubitza

schemas/util.sql: force_update_view(): Fixed bug where also need to drop view for "cannot change name of view column" errors

8322 04/04/2013 05:24 AM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Use new force_update_view(), which only drops the view if its columns have changed and otherwise just uses CREATE OR REPLACE VIEW, rather than always first running DROP VIEW IF EXISTS
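PostgreSQL's CREATE OR REPLACE VIEW succeeds only when the existing columns keep their names, types, and order (new columns may be appended at the end), which is why a drop is sometimes unavoidable. The sketch below expresses that rule as a Python predicate. It is an illustration only: force_update_view() itself lives in schemas/util.sql and, judging by r8323, appears to react to specific errors rather than comparing column lists up front. The name needs_drop() and the (name, type) tuples are assumptions.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, data type); hypothetical representation


def needs_drop(old_cols: List[Column], new_cols: List[Column]) -> bool:
    """Return True if the view must be dropped before it can be redefined."""
    # In-place replacement works only if the old columns form an
    # unchanged prefix of the new column list.
    return new_cols[:len(old_cols)] != old_cols


old = [("id", "integer"), ("name", "text")]
print(needs_drop(old, old + [("row_num", "integer")]))       # appended column
print(needs_drop(old, [("id", "integer"), ("label", "text")]))  # renamed column
```

Appending a column leaves the existing prefix intact, so no drop is needed; renaming, retyping, reordering, or removing a column requires one.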

8321 04/04/2013 05:20 AM Aaron Marcuse-Kubitza

schemas/util.sql: Added force_update_view()

8320 04/04/2013 04:23 AM Aaron Marcuse-Kubitza

bin/make_analytical_db: Commented out export_analytical_db because we are not yet using the analytical DB in MySQL, and it doesn't make sense to generate a large, unused CSV export each time

8319 04/04/2013 04:19 AM Aaron Marcuse-Kubitza

bin/export_analytical_db: Replaced analytical_aggregate with analytical_stem

8318 04/04/2013 03:53 AM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/: Updated header.csv for new column order

8317 04/04/2013 03:40 AM Aaron Marcuse-Kubitza

inputs/FIA/occurrence_all/import: Use directional joins (LEFT/RIGHT JOIN) instead of inner joins to ensure that the PostgreSQL query planner always joins starting with the TREE table. Note that the directional joins are now needed for a different reason than when they were initially added, which had been to avoid slow sorts. The sorts (at least for LIMIT-only queries) went away when small tables such as COUNTY and REF_UNIT were added to the joins.

8316 04/04/2013 01:16 AM Aaron Marcuse-Kubitza

inputs/FIA/*/map.csv: Changed newlines between table and field name to - because the newlines mess up the flow of queries and also break pgAdmin's display of EXPLAIN output. The - was chosen because it's a non-whitespace character that linewraps in browsers, phpPgAdmin, and Google spreadsheets (although unfortunately not in pgAdmin). It is better than space because you can set a text editor to treat it as a word character, allowing the entire column name (<table>-<field>) to be selected by double-clicking it.

8315 04/03/2013 09:55 PM Aaron Marcuse-Kubitza

Added planning/workflow/normalized_vs_denormalized/denormalized.generic_standardizations.png (a slide from Brad's bien3_architecture_denormalized.pptx PowerPoint), which shows the staging table preprocessing particularly well

8314 04/03/2013 09:45 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: record the import times in inputs/import.stats.xls: Added instructions for what to do if the rightmost imports start getting truncated due to the 255-column limit in spreadsheets. (This will occur in 8 imports.)

8313 04/03/2013 09:32 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Removed the previous imports from the current tab because they are also in the 2012-6~9 tab, and should not be in two places

8312 04/03/2013 09:28 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. MO and FIA have been refreshed.

8311 04/02/2013 04:17 PM Aaron Marcuse-Kubitza

Removed no longer needed inputs/GBIF/import. Use ./run instead.

8310 04/02/2013 04:17 PM Aaron Marcuse-Kubitza

Removed no longer needed inputs/GBIF/_MySQL/import. Use ./run instead.

8309 04/02/2013 04:16 PM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/run: import: Run make directly instead of via ./import

8308 04/02/2013 04:15 PM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/run: Use new import.run, which defines all()

8307 04/02/2013 04:06 PM Aaron Marcuse-Kubitza

Added planning/workflow/normalized_vs_denormalized/bien3_architecture_(de)normalized.pptx

8306 04/02/2013 03:57 PM Aaron Marcuse-Kubitza

Added planning/workflow/normalized_vs_denormalized/BIEN-modArch-Dec2010 NS-SBD 1.4.ppt.url

8305 04/02/2013 03:50 PM Aaron Marcuse-Kubitza

planning/workflow/: Moved normalized vs. denormalized files to separate normalized_vs_denormalized/ subfolder

8304 04/02/2013 03:21 PM Aaron Marcuse-Kubitza

Regenerated inputs/ACAD/Specimen/logs/steps.by_col.log.sql

8303 04/02/2013 03:15 PM Aaron Marcuse-Kubitza

inputs/GBIF/raw_occurrence_record/run: Override MySQL_export() so $filter can be customized

8302 04/02/2013 03:13 PM Aaron Marcuse-Kubitza

inputs/GBIF/table.run: import(): Updated for lib/table.run template changes

8301 04/02/2013 03:09 PM Aaron Marcuse-Kubitza

lib/table.run: template: import(): Also pass "$@" to superclass method

8300 04/02/2013 03:08 PM Aaron Marcuse-Kubitza

lib/table.run: template: Use "$FUNCNAME" instead of hardcoding import

8299 04/02/2013 03:02 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/MySQL_export, used by ./table.run

8298 04/02/2013 02:57 PM Aaron Marcuse-Kubitza

lib/util.run: echo_func: Fixed bug where need to use BASH_LINENO[0] for the line #s to match up with the files. For some reason the required array indexes for BASH_SOURCE (1) and BASH_LINENO (0) differ by one.

8297 04/02/2013 02:51 PM Aaron Marcuse-Kubitza

inputs/GBIF/run: Use new import.run, which defines all()

8296 04/02/2013 02:51 PM Aaron Marcuse-Kubitza

lib/table.run: Use new import.run, which defines all()

8295 04/02/2013 02:49 PM Aaron Marcuse-Kubitza

Added lib/import.run

8294 04/02/2013 02:48 PM Aaron Marcuse-Kubitza

lib/util.run: echo_func: Include the line # of the function to make it easier to find where the code being run is

8293 04/02/2013 02:32 PM Aaron Marcuse-Kubitza

lib/table.run: Added all (default target)

8292 04/02/2013 02:26 PM Aaron Marcuse-Kubitza

lib/util.run: run_cmd: If bash exited with an error, don't run the "$@" command. This test is necessary because `trap run_cmd EXIT` will run run_cmd as the result of any exit from the shell, including an error.

8291 04/02/2013 02:21 PM Aaron Marcuse-Kubitza

*run: Use -e option to bash on the #! line instead of separate `set -o errexit` line so that there is no issue with the `set -o errexit` line getting separated from the #! line (errexit is required for the scripts to work properly)

8290 04/02/2013 02:09 PM Aaron Marcuse-Kubitza

lib/util.run: run_cmd: When no command specified, default to running the `all` command, just like make

8289 04/02/2013 02:07 PM Aaron Marcuse-Kubitza

lib/util.run: Run run_cmd at shell exit (using trap) instead of requiring every runscript to have `run_cmd ` at the end of it

8288 04/02/2013 01:49 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/run

8287 04/02/2013 01:48 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/raw_occurrence_record/run

8286 04/02/2013 01:47 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/table.run

8285 04/02/2013 01:45 PM Aaron Marcuse-Kubitza

Added inputs/GBIF/_MySQL/run

8284 04/02/2013 01:42 PM Aaron Marcuse-Kubitza

lib/util.run: fwd: Check that $subdirs is defined. Added $subdirs to usage.

8283 04/02/2013 01:39 PM Aaron Marcuse-Kubitza

lib/util.run: fwd: Added usage

8282 04/02/2013 01:32 PM Aaron Marcuse-Kubitza

lib/table.run: Switched from echo_run to echo_func

8281 04/02/2013 01:16 PM Aaron Marcuse-Kubitza

lib/util.run: run_cmd: Echo the command being run, including the top-level run script. This is in addition to the echoing of the command in the function itself (using echo_func), which provides both the runscript that was run and the file where the invoked command was actually located (which may be different due to includes).

8280 04/02/2013 01:12 PM Aaron Marcuse-Kubitza

lib/util.run: Echo the command at the beginning of each function using new echo_func, instead of having to type echo_run before every call to a function. Note that because echo_func uses BASH_SOURCE, the path to the file containing the function will be included in the debug message, which greatly facilitates locating which file a command is in.

8279 04/02/2013 01:08 PM Aaron Marcuse-Kubitza

lib/util.run: Added echo_func

8278 04/02/2013 12:50 PM Aaron Marcuse-Kubitza

lib/util.run: Added echo_cmd and use it in echo_run

8277 04/02/2013 12:46 PM Aaron Marcuse-Kubitza

lib/util.run: echo_cmd(): Renamed to echo_run for clarity, because it also runs the command

8276 04/02/2013 12:39 PM Aaron Marcuse-Kubitza

lib/util.run: Added inline_make()

8275 04/02/2013 12:39 PM Aaron Marcuse-Kubitza

lib/util.run: Added echo_stdin()

8274 04/02/2013 12:30 PM Aaron Marcuse-Kubitza

bin/my2pg_export: Put --password first because it's an authentication-related option

8273 04/02/2013 10:52 AM Aaron Marcuse-Kubitza

Added lib/table.run, which includes the commands in import.sh but uses run scripts to allow running commands other than just import. (For example, map_table or postprocess can be run separately. Uninstall-related commands which would not belong in an import script can also be added, because import is only one of many commands a run script can offer.)

8272 04/02/2013 10:35 AM Aaron Marcuse-Kubitza

Added lib/util.run with general functions and template for run scripts (a bash-based replacement for make). Unlike make, run scripts support full bash functionality including multiline commands. The run script template also includes syntax for various kinds of relative includes in bash.

8271 04/02/2013 12:03 AM Aaron Marcuse-Kubitza

lib/common.Makefile: Added $(require_var)

8270 04/01/2013 10:42 PM Aaron Marcuse-Kubitza

bin/publish_analytical_db: Fixed bug where need to remove `ESCAPED BY '"'` because this would cause a " followed by an escape sequence char to be interpreted specially (e.g. "n -> \n). MySQL automatically takes care of quote doubling when you specify `FIELDS OPTIONALLY ENCLOSED BY`.
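Quote doubling is the CSV convention in which a literal quote inside an enclosed field is written as two quotes, so no separate escape character is required. The following sketch illustrates the convention with Python's csv module; it does not reproduce the MySQL export in bin/publish_analytical_db, and the sample value is made up.

```python
import csv
import io

# Made-up sample row; the first field contains a quote followed by "n".
row = ['say "n', 'plain']

buf = io.StringIO()
csv.writer(buf).writerow(row)  # the default dialect doubles embedded quotes
encoded = buf.getvalue()

# Reading the line back collapses each doubled quote into a single one.
decoded = next(csv.reader(io.StringIO(encoded)))

print(encoded.strip())
print(decoded == row)
```

The embedded quote survives the round trip without any backslash-style escaping, so a `"n` sequence is never mistaken for an escape sequence.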