# Date Author Comment
12011 01/25/2014 09:18 PM Aaron Marcuse-Kubitza

/README.TXT: Notes on running programs: added warning that you should always start with a clean shell to avoid spurious bugs

11985 01/21/2014 07:23 PM Aaron Marcuse-Kubitza

/README.TXT: Testing: added pointer to development machine specs

11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. Instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11967 01/18/2014 10:51 PM Aaron Marcuse-Kubitza

/README.TXT: added note that shell scripts should always be read-only, so that editing them while an import is in progress will not crash the import (see http://vegpath.org/links/#**%20modifying%20a%20running%20shell%20script)
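
A minimal sketch of the read-only convention. The directory and script name below are hypothetical stand-ins, not paths from the README; the idea is only that write permission is removed for everyone, so an editor has to be told explicitly to override it:

```shell
# Hypothetical layout: a throwaway directory stands in for the real script dir.
scripts_dir="$(mktemp -d)"
printf '#!/bin/sh\necho importing\n' > "$scripts_dir/import.sh"
chmod u+x "$scripts_dir/import.sh"

# Remove write permission for user, group, and others on every script.
find "$scripts_dir" -type f -name '*.sh' -exec chmod a-w {} +

# Inspect the result: the user permission bits should now read r-x.
ls -l "$scripts_dir/import.sh"
```

The script remains executable; only in-place edits are blocked.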

11940 01/09/2014 12:31 AM Aaron Marcuse-Kubitza

/README.TXT: to synchronize a Mac's settings with my testing machine's: added step to remove the downloaded Spam folder, because spam e-mails often contain viruses that would trigger clamscan

11915 12/16/2013 05:46 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: documented that you should always start with a clean shell, which does not have changes to the env vars. (there have been inexplicable bugs that went away after closing and reopening the terminal window.) note that running `exec bash` is not sufficient to reset the env vars.
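
The point about `exec bash` can be illustrated with a small sketch. The variable name `stale_setting` is hypothetical, and `env -i` is used here only to simulate a freshly opened terminal; it is not a step from the README:

```shell
# A leftover exported variable, standing in for a stale env var from earlier work.
export stale_setting=1
bash_path="$(command -v bash)"

# A new bash started from this shell (as with `exec bash`) inherits the variable.
inherited="$("$bash_path" -c 'echo "${stale_setting-unset}"')"

# A process started with an empty environment does not see it.
clean="$(env -i "$bash_path" -c 'echo "${stale_setting-unset}"')"

echo "inherited=$inherited clean=$clean"
```

Because exported variables are passed to every child process, replacing the shell with another shell keeps them; only a new login session (or an explicitly emptied environment) starts without them.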

11897 12/11/2013 07:53 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: backups: added step to download backup to local machine

11892 12/10/2013 07:36 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: In PostgreSQL: documented that the tables to check are located in the r# schema, not public

11866 12/09/2013 02:27 PM Aaron Marcuse-Kubitza

/README.TXT: Datasource setup: added steps to back up e-mails

11800 12/03/2013 06:27 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: To restart an aborted import for a specific table: run the two commands in errexit mode so that the datasource does not incorrectly have the temp suffix removed if the import command exited with an error
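
The effect of errexit mode can be sketched as follows. The two README commands are not reproduced in this log, so `false` stands in for an import command that exits with an error and the `echo` stands in for the command that removes the temp suffix; both are assumptions for illustration:

```shell
# Stand-ins: `false` = failing import command, echo = suffix-removal command.
two_commands='false; echo "temp suffix removed"'

# Without errexit, the second command runs even though the first failed.
without="$(bash -c "$two_commands")"

# With errexit, the shell exits at the first failure and the second command never runs.
with="$(bash -c "set -o errexit; $two_commands")" || with_status=$?

echo "without errexit: '$without'"
echo "with errexit: '$with' (exit status ${with_status-0})"
```

Running the pair in a separate shell process keeps errexit from affecting the interactive session.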

11795 11/27/2013 11:16 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: To restart an aborted import for a specific table: added command to remove the temp suffix from the source table entry, which is not automatic for importing a specific table (only for importing the entire datasource, at the end of which the datasource is considered completely imported and ready to overwrite any previous import)

11787 11/26/2013 11:10 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: documented that `make schemas/reinstall` requires sudo access

11728 11/21/2013 04:59 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: verifying import: In PostgreSQL: don't include current values of the datasource counts, etc., because these may change and should always be re-checked at wiki.vegpath.org/VegBIEN_contents

11686 11/18/2013 05:05 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: to backup files not in Time Machine: PostgreSQL: need to run with `overwrite=1` so removed files are also deleted

11685 11/18/2013 05:02 AM Aaron Marcuse-Kubitza

/README.TXT: to backup files not in Time Machine: PostgreSQL: only stop PostgreSQL after all files have been copied, to minimize the time that the PostgreSQL server is down (the final copy just copies concurrent changes)

11684 11/18/2013 05:02 AM Aaron Marcuse-Kubitza

/README.TXT: to backup files not in Time Machine: PostgreSQL: only stop PostgreSQL after all files have been copied, to minimize the time that the PostgreSQL server is down (the final copy just copies concurrent changes)

11683 11/18/2013 04:59 AM Aaron Marcuse-Kubitza

/README.TXT: updated to PostgreSQL 9.3

11573 11/05/2013 10:31 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: after import: record the import times in inputs/import.stats.xls: documented that this should be run on the local machine, because it needs the Mac filename ordering

11570 11/05/2013 08:54 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: after import: removed step to install analytical_stem on nimoy because the import mechanism is not set up to do this (we don't generate CSV exports of the full analytical_stem table because they take up a lot of space and are not currently used for anything)

11569 11/05/2013 08:32 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: after import: In PostgreSQL: added step to check that analytical_stem contains the expected # of rows

11568 11/05/2013 08:16 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: after import: In PostgreSQL: added specific instructions for determining which/how many datasources are expected to be included in the provider_count and source tables

11516 10/31/2013 12:50 AM Aaron Marcuse-Kubitza

/README.TXT: for each task, documented which machine it's run on. for tasks run on vegbiendev, added pointer to "Connecting to vegbiendev" steps.

11515 10/31/2013 12:19 AM Aaron Marcuse-Kubitza

/README.TXT: added instructions for connecting to vegbiendev

11263 10/13/2013 12:02 AM Aaron Marcuse-Kubitza

/README.TXT: Single datasource import: added pointer to instructions to remake the analytical DB (also required after single datasource import)

11259 10/12/2013 03:49 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize vegbiendev, jupiter, and your local machine: run all sync_uploads on the svn working copy using --size-only, because the mtimes are based on when the files were last updated by svn and are not meaningful

11258 10/12/2013 03:46 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: On local machine: do steps under Maintenance > "to synchronize vegbiendev, jupiter, and your local machine": removed no longer accurate indicator that these steps are above Full database import, since Full database import is now at the beginning of the file

11090 09/27/2013 03:43 PM Aaron Marcuse-Kubitza

/README.TXT: Datasource setup: added link to Example steps for a datasource (wiki.vegpath.org/Import_process_for_Madidi)

11089 09/27/2013 03:23 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To remake analytical DB: added runtime (13 h)

11019 09/19/2013 06:27 PM Aaron Marcuse-Kubitza

/README.TXT: Datasource setup: additional steps for new-style datasources: added steps not present in http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource because they were performed all at once for all datasources

11018 09/19/2013 06:24 PM Aaron Marcuse-Kubitza

/README.TXT: Datasource setup: added additional steps for new-style datasources, from http://wiki.vegpath.org/Adding_new-style_import_to_a_datasource

10981 09/15/2013 06:25 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: to backup files not in Time Machine: need to use -E option to sudo to preserve env, after installing the latest system update

10895 09/09/2013 05:44 PM Aaron Marcuse-Kubitza

/README.TXT: Single datasource import: removed rescrub step because this is not needed by the current TNRS process

10885 09/05/2013 07:19 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: added Running individual steps separately label for the section that is not part of the main import, but is useful if the import is aborted part of the way through

10884 09/05/2013 05:02 PM Aaron Marcuse-Kubitza

/README.TXT: moved Single datasource import, Datasource setup to top since these are the most important howtos

10868 09/04/2013 11:48 PM Aaron Marcuse-Kubitza

bugfix: bin/after_import: run backups/fix_perms right after the backup files are created to make them private

10864 09/04/2013 05:26 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: Publish the new import: added runtime (1 min)

10850 08/31/2013 07:47 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: time to wait for the import to finish: updated to time in inputs/import.stats.xls

10847 08/31/2013 07:27 PM Aaron Marcuse-Kubitza

bin/import_all: added step to remove any leftover TNRS lockfile (previously done manually)

10820 08/30/2013 07:15 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: on a live machine, you should put the following in your .profile: need to make svn files web-accessible, because these are used by fs.vegpath.org links (such as to the ERD, etc.). note that this does not affect unversioned files, because these get the right permissions on the local machine instead (see Testing > On a development machine, you should put the following in your .profile).

10819 08/30/2013 07:07 AM Aaron Marcuse-Kubitza

/README.TXT: to backup files not in Time Machine: added command to start the PostgreSQL server

10818 08/30/2013 06:58 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: to synchronize a Mac's settings with my testing machine's: don't upload ~/.profile, etc. to jupiter because these files are different on each machine. they can instead be synced manually.

10817 08/30/2013 06:52 AM Aaron Marcuse-Kubitza

/README.TXT: to backup files not in Time Machine: added command to stop the PostgreSQL server

10816 08/30/2013 06:49 AM Aaron Marcuse-Kubitza

/README.TXT: to synchronize vegbiendev, jupiter, and your local machine: noted that ./fix_perms should be run on all machines

10814 08/30/2013 06:31 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: to synchronize vegbiendev, jupiter, and your local machine: added step to run `make backups/TNRS.backup/download live=1`, because bin/sync_upload does not sync this due to filters in backups/.rsync_filter.download

10813 08/30/2013 06:11 AM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize vegbiendev, jupiter, and your local machine: added step to run ./fix_perms so that there are fewer permissions diffs to review

10812 08/30/2013 06:07 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: to synchronize a Mac's settings with my testing machine's: upload: `(cd ~/Dropbox/svn/; svn up)`: use `up` instead so that the needed --force option is applied

10804 08/30/2013 01:21 AM Aaron Marcuse-Kubitza

/README.TXT: Single datasource import: run commands in the background, since these are long-running commands
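
A sketch of running a long command in the background. The actual import commands are not shown in this log, so `sleep 1` plus an `echo` stands in for them, and the log file is a hypothetical temporary file:

```shell
# `sleep 1` is a stand-in for a long-running import command.
bg_log="$(mktemp)"
(sleep 1; echo "import finished") > "$bg_log" 2>&1 &
job_pid=$!

echo "started in background as PID $job_pid; the shell is free for other work"

# Later, block until the job is done and read its output.
wait "$job_pid"
cat "$bg_log"
```

Redirecting output to a file keeps the job's messages from interleaving with whatever is typed at the prompt in the meantime.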

10785 08/27/2013 10:13 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: fixing TNRS errors: noted that inputs/test_taxonomic_names/test_scrub re-runs TNRS

10784 08/27/2013 10:12 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: fixing TNRS errors: updated instructions for new TNRS schema editing workflow

10744 08/27/2013 11:38 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To back up DB (staging tables and last import) separately: added step to upload backups to jupiter

10743 08/27/2013 11:30 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To back up DB (staging tables and last import) separately: added step to remake backups/TNRS.backup

10603 08/06/2013 04:53 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: min disk space: updated import schema size for last import

10600 08/06/2013 01:16 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: tailing inputs/analytical_db/logs/make_analytical_db.log.sql: increased # lines to 150 to include all lines for the last run
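
For illustration, limiting output to the last 150 lines with `tail` looks like this. The file here is a generated stand-in, not the real log:

```shell
# Hypothetical log with 400 numbered lines, standing in for the real log file.
demo_log="$(mktemp)"
seq 1 400 > "$demo_log"

# Show only the last 150 lines; count them and print the first line shown.
tail -n 150 "$demo_log" | wc -l
tail -n 150 "$demo_log" | head -n 1
```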

10588 08/04/2013 12:58 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: To restart an aborted import for a specific table: bin/after_import: need to run it in the background

10587 08/04/2013 12:57 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To restart an aborted import for a specific table: added step to run bin/after_import

10584 08/03/2013 03:34 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: To restart an aborted import for a specific table: added by_col=1

10583 08/03/2013 03:32 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: added steps to restart an aborted import for a specific table

10579 08/03/2013 12:24 AM Aaron Marcuse-Kubitza

bin/import_all: use column-based import (by_col=1) by default, instead of requiring the user to explicitly specify it. instead turn it off explicitly (by_col=) for row-based import.

10578 08/03/2013 12:03 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: To back up DB: after renaming current import to public: say to replace $version with the appropriate revision, because the $version env var should not be set (otherwise the backup will try to use a nonexistent import with the given revision #)

10577 08/03/2013 12:00 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To back up DB: updated instructions to inline setting of $dump_opts, like in bin/import_all

10557 08/01/2013 11:07 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: don't exit the screen until after getting $version, which is defined within it

10549 08/01/2013 12:22 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: make test by_col=1: documented that if you encounter errors, they are most likely related to the PostgreSQL error parsing in /lib/sql.py parse_exception()

10379 07/20/2013 05:25 AM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: added instructions for what to do if http://vegbiendev.nceas.ucsb.edu/phppgadmin/ goes down (sometimes displaying a Not found error)

10286 07/17/2013 01:56 AM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: regenerate mappings/VegCore.csv: commit command: use single quotes ' instead of double quotes " to avoid needing to \-escape every special char (single quotes ' still need to be escaped)
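
The quoting difference can be sketched with plain variable assignments. The message text is made up for the example and is not the commit message from the README:

```shell
# Inside double quotes, $, `, and " each need a backslash to stay literal.
double="cost: \$5 for \`make\` \"all\""

# Inside single quotes nothing is special, so the same text needs no escapes.
single='cost: $5 for `make` "all"'

# A literal single quote cannot appear inside single quotes: close the quotes,
# add a backslash-escaped quote, and reopen them.
with_quote='don'\''t overwrite'

echo "$double"
echo "$single"
echo "$with_quote"
```

The first two lines print identical text; the third prints the phrase with its apostrophe intact.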

10222 07/10/2013 04:10 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to backup files not in Time Machine: removed VirtualBox VMs because they are now in Time Machine, and do not need to be backed up separately

10221 07/10/2013 04:08 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: added steps to upload just the VirtualBox VMs

10220 07/10/2013 04:02 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: added overwrite=1 so that old snapshots, etc. are also deleted

10219 07/10/2013 04:01 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: use better bin/sync_upload instead of put

10218 07/10/2013 03:59 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: removed no longer needed inplace=1, because the VirtualBox VMs now all use a snapshot covering the full disk, so that the full disk is not altered (removing the need to optimize backing up a large file) and just the diff files need to be backed up each time

10093 06/27/2013 01:02 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Maintenance: syncing ~/bien to ~/Dropbox/svn: added overwrite=1 so that perms transfer from the authoritative ~/bien regardless of relative mtimes

10068 06/26/2013 02:58 PM Aaron Marcuse-Kubitza

/README.TXT: to synchronize vegbiendev, jupiter, and your local machine: added step to update mtimes/perms on ~/Dropbox/svn/ so that copying files back to ~/bien does not overwrite the permissions from what is on vegbiendev

10040 06/25/2013 04:27 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: synchronization: fixed whitespace

10037 06/25/2013 03:43 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: removed filters that are now handled by .rsync_ignores

10035 06/25/2013 03:17 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: sync ~/Dropbox/svn/ (the no-unversioned-files working copy) separately from the rest of the files, because .svn/ is now excluded by /.rsync_ignore, so that `svn up` needs to be used to keep the .svn/ dirs in sync. note that .svn/ should generally not be synced between machines, because they may use incompatible versions of the svn working copy format.

10034 06/25/2013 03:02 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize a Mac's settings with my testing machine's: use new bin/sync_upload (with $sync_remote_subdir) so that per-dir .rsync_ignores are processed, and to use the default $sync_remote_url

10032 06/25/2013 02:28 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize vegbiendev, jupiter, and your local machine: use new bin/sync_upload instead of specifying all the filter patterns manually. this replaces several `put` commands with various filters with just a bin/sync_upload each on vegbiendev and your machine (in overwrite=1 mode to force a complete sync).

10029 06/25/2013 01:42 PM Aaron Marcuse-Kubitza

/README.TXT: removed unnecessary `env` before kw params, which are treated as such whenever they appear before a command name
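
A short sketch of why the `env` prefix is redundant. The parameter name `demo_opt` is hypothetical and chosen only for the example:

```shell
# An assignment placed before a command name is exported to that command only.
with_prefix="$(demo_opt=1 sh -c 'echo "demo_opt=${demo_opt-unset}"')"

# The same effect with an explicit `env`, which is therefore unnecessary here.
with_env="$(env demo_opt=1 sh -c 'echo "demo_opt=${demo_opt-unset}"')"

# The assignment does not persist in the calling shell.
echo "$with_prefix / $with_env / afterwards: ${demo_opt-unset}"
```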

10028 06/25/2013 01:22 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: updated `make backups/download` to `make backups/<file>/download`

10027 06/25/2013 01:21 PM Aaron Marcuse-Kubitza

backups/Makefile: upload: use bin/sync_upload

10025 06/25/2013 01:07 PM Aaron Marcuse-Kubitza

bugfix: /README.TXT: `make inputs/upload`, `make inputs/download`: added live=1 so that the sync operation runs rather than previewing what will be synced. removed test=1 because this flag is not used by put.

10010 06/23/2013 03:58 PM Aaron Marcuse-Kubitza

/README.TXT: Backups: TNRS cache: Back up/Restore: added runtimes (3 min/5.5 min)

9996 06/20/2013 06:21 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To run TNRS, etc. after the main import: clarified that you should only run `export version=<version>` if the import is named something other than public (i.e. it has not yet replaced the previous public schema)

9995 06/20/2013 06:14 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: To run TNRS: removed `by_col=1` because by-column mode is not applicable to running TNRS. it is, however, needed when running import_scrub (i.e. `make inputs/<datasrc>/reimport_scrub by_col=1`).

9970 06/20/2013 07:11 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: disk space check: updated minimum (to 300GB) for new import schema size. note that most of the space (166GB) is indexes, and even of the 87GB of data, only 20GB is from GBIF and 15GB from FIA (so most of it is duplication).

9950 06/19/2013 08:51 PM Aaron Marcuse-Kubitza

/README.TXT: `make inputs/{upload,download}`: first run with test=1 to see what the diffs will be

9899 06/14/2013 07:42 AM Aaron Marcuse-Kubitza

bugfix: /README.TXT: Full database import: added step to remove any leftover TNRS lockfile. usually, the PID in it would not exist, but sometimes it now refers to a different, active process which blocks tnrs.make.
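
The failure mode described here can be sketched as follows. The lockfile path and its format (a bare PID) are assumptions for illustration; the real TNRS lockfile may differ:

```shell
# Hypothetical lockfile that records the PID of the process that created it.
lockfile="$(mktemp)"
echo "$$" > "$lockfile"      # reuse this shell's PID to simulate a recycled, live PID

# A liveness check only says that some process has this PID, not that it is
# the process that created the lock.
if kill -0 "$(cat "$lockfile")" 2>/dev/null; then
    echo "PID is alive, so a stale lock would look held"
fi

# Removing the leftover lockfile before starting avoids the false positive.
rm -f "$lockfile"
```

This is only safe when no legitimate holder of the lock is still running.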

9887 06/12/2013 12:18 PM Aaron Marcuse-Kubitza

/README.TXT: Full database import: On local machine: added step to do steps under Maintenance > "to synchronize vegbiendev, jupiter, and your local machine", which is needed in addition to `make inputs/upload` since that doesn't handle overwrites or deletions

9886 06/12/2013 12:10 PM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: to synchronize vegbiendev, jupiter, and your local machine: added warning that you should pay careful attention to all files that will be deleted or overwritten (as the three machines are often out of sync)

9884 06/12/2013 11:17 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: make inputs/{upload,download}: run them first with `test=1` to see what the changes will be

9883 06/12/2013 11:12 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: `svn up`: use --force to avoid errors about existing files

9532 05/23/2013 06:27 PM Aaron Marcuse-Kubitza

bugfix: README.TXT: Full database import: screen: need to unset TMOUT, version after running `screen` rather than before so they take effect within the `screen` shell

9531 05/23/2013 06:25 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: after running `screen`: run `set -o ignoreeof` so that Ctrl+D does not exit `screen`, which keeps the attached jobs alive

9528 05/23/2013 03:28 PM Aaron Marcuse-Kubitza

README.TXT: updating TNRS CSV columns: use the entire "COPY tnrs ..." statement instead of just the body of it so that the explicit columns list is included. this way, the COPY statement will cause an error if the TNRS schema was changed but inputs/.TNRS/data.sql was not yet updated.

9499 05/21/2013 10:27 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: added warning to perform every single step listed, to avoid breaking column-based import

9498 05/21/2013 10:26 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: Publish the new import: added warning to be sure you have done every single verification step before proceeding. otherwise, a previous valid import could incorrectly be overwritten with a broken one.

9497 05/21/2013 09:07 PM Aaron Marcuse-Kubitza

bugfix: README.TXT: Full database import: To run TNRS/remake analytical DB: need to run `export version=<version>` before the command which uses it rather than after

9494 05/21/2013 07:42 PM Aaron Marcuse-Kubitza

README.TXT: Datasource setup: For MySQL inputs: For .sql exports: added steps to grant privileges to the bien user. the privileges list excludes UPDATE, DELETE, ALTER, DROP to prevent bugs in the import scripts from accidentally deleting data.

9492 05/21/2013 07:33 PM Aaron Marcuse-Kubitza

README.TXT: Full database import: added steps to check that TNRS ran successfully, and fix errors (due to column changes in the TNRS CSV) if it didn't

9401 05/16/2013 11:18 AM Aaron Marcuse-Kubitza

/README.TXT: Full database import: before running screen: added `unset TMOUT` because TMOUT (autologout) causes screen to exit even with background processes active

9400 05/16/2013 11:17 AM Aaron Marcuse-Kubitza

/README.TXT: Maintenance: added things to put in your .profile on a live machine (e.g. vegbiendev). in particular, you MUST NOT have a TMOUT (autologout) set, because this causes screen to exit even if background processes (e.g. from column-based import) are running
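
A sketch of the corresponding .profile fragment. This is an assumption about what such an entry could look like, not the text from the README:

```shell
# TMOUT makes an idle bash shell log out automatically; clear it if it is set.
# Note: this fails if an administrator has marked TMOUT readonly.
unset TMOUT

echo "TMOUT is ${TMOUT-not set}"
```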