inputs/GBIF/_MySQL/.rsync_ignore: added GBIFPortalDB-*.data.sql.gz, because these are intermediate files
*{.sh,run}: runscript targets: use begin_target instead of echo_func so the target name is properly echoed. note that this requires using with_rm so that $rm is properly progagated to applicable invoked targets. (previously, $rm was progagated to all invoked targets. note that with_rm only works inside a runscript target that starts with begin_target.)
lib/sh/make.sh: self_make(): renamed to with_rm() for clarity, since this is used only to progagate $rm, and does not also invoke a command with the same name as the current function, as the name might suggest
*{.sh,run}: use new begin_target instead of `echo_func; set_make_vars`
moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
copyright scrub: inputs/: removed data provider-owned schema and documentation files, which are not BIEN copyright and should not be part of what is submitted for open-sourcing. these files will remain accessible via the web interface (fs.vegpath.org), but will not be in the repository.
added inputs/GBIF/_MySQL/.rsync_ignore with filters from /README.TXT > Maintenance > to synchronize vegbiendev, jupiter, and your local machine. these filters will now be used with bin/sync_upload in addition to the periodic backup commands.
added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.0.preamble.sql
*{.sh,run}: use simpler .rel() instead of `. "$(dirname "${BASH_SOURCE0}")"/...` for relative includes
bugfix: inputs/GBIF/_MySQL/MySQL_schema, MySQL_data: sed: put {} commands on their own line to work on Mac
bugfix: inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.run: ^.preamble.sql/make(): need to run set_make_vars even though the make vars are not used, because set_make_vars sets $_remake, which is needed by self_make
bugfix: *run: overriding targets: use new self_make to properly progagate the $remake flag to the overridden target, so that the target itself is not skipped
inputs/GBIF/_MySQL/run: documented steps to reload GBIF MySQL
inputs/GBIF/_MySQL/run: added load_data(), which loads the dumpfile into MySQL
lib/runscripts/table_dir.run: renamed table to subdir because this can apply to any datasrc subdir. moved table-specific code to table.run.
bugfix: inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.run: override ^.preamble.sql/make() and use ../_src/GBIFPortalDB-2013-02-20.dump as the dumpfile instead of this file, which does not contain the preamble
removed inputs/GBIF/_MySQL/MySQL.data.sql*, since we are using the much faster exported TSVs instead (see raw_occurrence_record/table.tsv). this also avoids confusion between GBIFPortalDB-2013-02-20.data.sql* and MySQL.data.sql* when loading data into MySQL.
bugfix: inputs/GBIF/_MySQL/MySQL.data.sql.run: moved to GBIFPortalDB-2013-02-20.data.sql.run since it's actually the raw input file, not the ANSI export of it, that needs to be imported
added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.schema.z.clean_up.sql, which removes duplicated and unnecessary indexes in raw_occurrence_record
added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.schema.0.preamble.sql
added lib/sh/resume_import.sh and use it in inputs/GBIF/_MySQL/MySQL.data.sql.run
inputs/GBIF/_MySQL/MySQL.data.sql.run: is_pkey_imported__int(): made pkey name configurable in $pkey_name
inputs/GBIF/_MySQL/MySQL.data.sql.run: import_resume_pos() run time: removed seconds because the precision is likely only to the nearest half-minute
inputs/GBIF/_MySQL/MySQL.data.sql.run: documented that import_resume_pos() takes 6 min to run, with 37 iterations
added inputs/GBIF/_MySQL/MySQL.data.sql.run, with helper functions for resuming the import to MySQL from where it left off. this is very useful if the import is interrupted for any reason, because otherwise, the entire import would have to be run again from the start, taking 40-50 hours. import_resume_pos() uses new binsearch() to find where in the file the import left off, based on which pkeys have already been imported. (GBIF pkeys are unfortnately not in any order in the input file, nor are they in insertion order in the imported table, because MySQL instead clusters the table by the pkey. this necessitates a much more complex solution to resuming a partial import.)
*{.sh,run}: use new top_make instead of `make --directory="$top_dir"`
*{.sh,run}: removed extra space between function name and (), which is apparently not needed even though `help function` includes it. this greatly improves readability, including when function names are pasted into commit messages!
lib/util.sh: make (): don't change the directory to $top_dir, because paths used by the caller will instead be relative to the current dir (note that in runscripts, paths are generally not absolute because of canon_rel_path ()). this requires manually specifying $top_dir when running make on a physical (rather than inlined) makefile.
*{.sh,run}: changed echo_func to an alias that includes the "$", so that callers don't need to specify the "$" manually. although the original echo_func function is still there, callers need to switch over to the alias at the same time as it's defined because otherwise the "$@" would be doubled, since echo_func refers to the alias instead.
", so that callers don't need to specify the "$
moved lib/*.run into runscripts/ subdir so runscripts are grouped together and easier to find rather than being scattered throughout lib/
added inputs/GBIF/_MySQL/MySQL.data.sql.gz.md5
*run: added include guards
lib/util.run: echo_func (): use the caller's FUNCNAME via the $FUNCNAME[] array instead of requiring them to pass it in the function args as `echo_func "$FUNCNAME" "$@"`
Removed no longer needed inputs/GBIF/_MySQL/import. Use ./run instead.
inputs/GBIF/_MySQL/run: import: Run make directly instead of via ./import
inputs/GBIF/_MySQL/run: Use new import.run, which defines all()
*run: Use -e option to bash on the #! line instead of separate `set -o errexit` line so that there is no issue with the `set -o errexit` line getting separated from the #! line (errexit is required for the scripts to work properly)
lib/util.run: Run run_cmd at shell exit (using trap) instead of requiring every runscript to have `run_cmd ` at the end of it
Added inputs/GBIF/_MySQL/run
inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.md5: Regenerated after appending agent table to GBIFPortalDB-2013-02-20.data.sql
Added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.gz.md5
Added inputs/GBIF/**/MySQL.schema.sql
Added inputs/GBIF/_MySQL/MySQL.*.sql.make
inputs/GBIF/_MySQL/Makefile: %.data.sql: Added agent table
Added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.sql.md5
Added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.schema.sql
inputs/GBIF/: Added scripts for subsetting refresh