| Name | Size | Revision | Age | Author | Comment |
|---|---|---|---|---|---|
| _archive | | 1598 | almost 13 years | Aaron Marcuse-Kubitza | Moved _archive/tapir2flatClient/trunk/client/ t... |
| analysis | | 3076 | over 12 years | Aaron Marcuse-Kubitza | Added top-level analysis dir for range modeling |
| backups | | 9496 | over 11 years | Aaron Marcuse-Kubitza | added backups/*.md5 |
| bin | | 9530 | over 11 years | Aaron Marcuse-Kubitza | bin/tnrs_db: documented how to estimate total r... |
| config | | 7801 | almost 12 years | Aaron Marcuse-Kubitza | root Makefile: VegBIEN DB: mk_db: Added command... |
| exports | | 8798 | over 11 years | Aaron Marcuse-Kubitza | exports/: svn:ignore *.csv |
| inputs | | 9568 | over 11 years | Aaron Marcuse-Kubitza | added lib/sh/resume_import.sh and use it in inp... |
| lib | | 9571 | over 11 years | Aaron Marcuse-Kubitza | bugfix: lib/sh/resume_import.sh: sql_preamble()... |
| mappings | | 9459 | over 11 years | Aaron Marcuse-Kubitza | bugfix: mappings/VegCore-VegBIEN.csv: place.geo... |
| planning | | 9403 | over 11 years | Aaron Marcuse-Kubitza | added planning/workflow/validation/GeoDistKM.sq... |
| schemas | | 9529 | over 11 years | Aaron Marcuse-Kubitza | inputs/.TNRS/schema.sql, data.sql: updated TNRS... |
| web | | 9389 | over 11 years | Aaron Marcuse-Kubitza | web/links/index.htm: updated to Firefox bookmar... |
| .htaccess | 326 Bytes | 8771 | over 11 years | Aaron Marcuse-Kubitza | /.htaccess: use canonical URL without symlinks |
| Makefile | 12.6 KB | 8844 | over 11 years | Aaron Marcuse-Kubitza | bugfix: /Makefile: moved schemas/install from i... |
| README.TXT | 22.8 KB | 9532 | over 11 years | Aaron Marcuse-Kubitza | bugfix: README.TXT: Full database import: scree... |
| fix_perms | 97 Bytes | 7560 | almost 12 years | Aaron Marcuse-Kubitza | Added root fix_perms |
| map | 1001 Bytes | 6949 | about 12 years | Aaron Marcuse-Kubitza | vegbien_dest: Changed default $prefix to "", so... |
| new_terms.csv | 38.1 KB | 7222 | almost 12 years | Aaron Marcuse-Kubitza | new_terms.csv: Regenerated |
| run | 450 Bytes | 9074 | over 11 years | Aaron Marcuse-Kubitza | *{.sh,run}: removed extra space between functio... |
| unmapped_terms.csv | 13.1 KB | 7201 | about 12 years | Aaron Marcuse-Kubitza | **/new_terms.csv, **/unmapped_terms.csv: Regene... |

Latest revisions

# Date Author Comment
9571 05/24/2013 11:02 AM Aaron Marcuse-Kubitza

bugfix: lib/sh/resume_import.sh: sql_preamble(): also stop at first "-- Table structure for table" line (when using a full dumpfile rather than a data-only subset)
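
To illustrate the stop condition described in this commit, here is a minimal sketch of extracting a dumpfile's leading lines up to (but not including) the first "-- Table structure for table" line. The function name `preamble_before_tables` is hypothetical, and the real sql_preamble() evidently has at least one other stop condition that is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the repository's sql_preamble().

# preamble_before_tables: reads a SQL dump on stdin and prints every line
# that precedes the first "-- Table structure for table" comment line.
preamble_before_tables() {
    # -n suppresses automatic printing; quit at the marker line without
    # printing it, otherwise print the current line.
    sed -n -e '/^-- Table structure for table/q' -e 'p'
}
```

It would be used as a filter, for example `preamble_before_tables < dump.sql`, where `dump.sql` is a placeholder file name.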

9570 05/24/2013 10:58 AM Aaron Marcuse-Kubitza

lib/sh/resume_import.sh: resume_import(): run connection preamble (first few lines of dumpfile) before continuing with main file at offset, so that connection settings are reapplied
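
The idea in this commit can be sketched as follows: emit the first few lines of the dumpfile, then the remainder of the file starting at a byte offset. The function name `resume_stream`, its arguments, and the default of 3 preamble lines are assumptions for illustration; the actual resume_import() may determine the preamble and offset differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the repository's resume_import().

# resume_stream FILE OFFSET [PREAMBLE_LINES]
# Prints the first PREAMBLE_LINES lines of FILE (assumed default: 3),
# followed by the contents of FILE starting at 0-based byte OFFSET.
resume_stream() {
    local file=$1 offset=$2 preamble_lines=${3:-3}
    head -n "$preamble_lines" "$file"
    # tail -c +N counts bytes from 1, so add 1 to the 0-based offset.
    tail -c "+$(( offset + 1 ))" "$file"
}
```

The output would typically be piped into the database client, so that the session settings in the preamble take effect before the remaining statements run.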

9569 05/24/2013 06:45 AM Aaron Marcuse-Kubitza

lib/sh/resume_import.sh: is_pkey_imported__int(): use echo_stdout so the user can see the result of the > function in each iteration

9568 05/24/2013 06:42 AM Aaron Marcuse-Kubitza

added lib/sh/resume_import.sh and use it in inputs/GBIF/_MySQL/MySQL.data.sql.run

9567 05/24/2013 06:32 AM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/MySQL.data.sql.run: is_pkey_imported__int(): made pkey name configurable in $pkey_name

9566 05/24/2013 05:32 AM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/MySQL.data.sql.run: import_resume_pos() run time: removed seconds because the precision is likely only to the nearest half-minute

9565 05/24/2013 05:31 AM Aaron Marcuse-Kubitza

inputs/GBIF/_MySQL/MySQL.data.sql.run: documented that import_resume_pos() takes 6 min to run, with 37 iterations

9564 05/24/2013 05:20 AM Aaron Marcuse-Kubitza

added inputs/GBIF/_MySQL/MySQL.data.sql.run, with helper functions for resuming the import to MySQL from where it left off. this is very useful if the import is interrupted for any reason, because otherwise, the entire import would have to be run again from the start, taking 40-50 hours. import_resume_pos() uses new binsearch() to find where in the file the import left off, based on which pkeys have already been imported. (GBIF pkeys are unfortunately not in any order in the input file, nor are they in insertion order in the imported table, because MySQL instead clusters the table by the pkey. this necessitates a much more complex solution to resuming a partial import.)
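
A minimal sketch of the binary search described here, assuming only that items are imported in file order, so that "already imported" is true for every position before the resume point and false from there on. The names `is_imported` and `binsearch_first_missing` are hypothetical, and the stand-in predicate replaces what would really be a database lookup of the pkey found at a given file position.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the repository's binsearch() or import_resume_pos().

# is_imported N: exit status 0 if item N has already been imported.
# Stand-in predicate for illustration: items 0..999 count as imported.
is_imported() { [ "$1" -lt 1000 ]; }

# binsearch_first_missing MIN MAX
# Prints the smallest N in [MIN, MAX) for which is_imported fails,
# or MAX if every item in the range is already imported.
binsearch_first_missing() {
    local min=$1 max=$2 mid
    while [ "$min" -lt "$max" ]; do
        mid=$(( (min + max) / 2 ))
        if is_imported "$mid"; then
            min=$(( mid + 1 ))   # resume point lies after mid
        else
            max=$mid             # resume point is at or before mid
        fi
    done
    echo "$min"
}
```

With the stand-in predicate, `binsearch_first_missing 0 4096` prints 1000. Each probe costs one lookup, which is why the search needs only a few dozen lookups instead of a scan of the whole file.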

9563 05/24/2013 05:14 AM Aaron Marcuse-Kubitza

lib/sh/binsearch.sh: binsearch(): also echo_vars the iter_num, to track how close binsearch is to finding the value (it will always take the same # iters, log2(max - min) )
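
The iteration count mentioned here can be estimated ahead of time. The sketch below counts how many halvings shrink a range of size max - min down to a single element, which equals log2(max - min) rounded up. The function name `iters_needed` is hypothetical and not part of lib/sh/binsearch.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only.

# iters_needed MIN MAX
# Prints the number of halvings needed to reduce a range of size
# MAX - MIN to one element, i.e. ceil(log2(MAX - MIN)).
iters_needed() {
    local n=$(( $2 - $1 )) iters=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))     # ceil(n / 2): size of the larger half
        iters=$(( iters + 1 ))
    done
    echo "$iters"
}
```

For example, `iters_needed 0 1024` prints 10, and so does `iters_needed 0 1000`.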

9562 05/24/2013 05:11 AM Aaron Marcuse-Kubitza

lib/sh/binsearch.sh: binsearch(): also echo_vars the min/max so these can be used as shortcut inputs if binsearch is run again
