Refreshing CVS--an MS Access plots datasource
what is needed from the user
- updated extract (so we can go back to the raw data if needed)
steps
underlined: user input needed (other steps can be automated)
from README.TXT > Single datasource refresh
- connect to vegbiendev:
ssh -t vegbiendev.nceas.ucsb.edu exec sudo -u aaronmk -i
- obtain updated extract
- from Mike Lee
- place extract in
inputs/CVS/_src/
- unzip the extract
- IMPORTANT: move previous versions of the extract out of the way:
mv inputs/CVS/*.{data,schema}.sql inputs/CVS/_archive/
otherwise, you will get strange errors when it tries to load new data on top of an old schema!
- see README.TXT > Datasource setup > For MS Access databases
- for the .ini files, use
inputs/CVS/_src/{data,schema}.sql.ini
- save the existing staging tables to revert to in case of errors:
bin/psql_verbose_vegbien <<EOF
ALTER SCHEMA "CVS" RENAME TO "CVS_prev";
EOF
- reload staging tables:
rm=1 inputs/CVS/run
- remove the previous staging tables:
bin/psql_verbose_vegbien <<EOF
DROP SCHEMA "CVS_prev" CASCADE;
EOF
- run column-based import:
make inputs/CVS/reimport_scrub by_col=1 &
tail -n 150 inputs/CVS/*/logs/r[#].log.sql # view progress
- see README.TXT > Single datasource refresh > steps after reimport_scrub
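Two of the steps above are easy to get wrong by hand: archiving the previous extract, and reverting to the saved staging schema if the reload fails. The following is a minimal sketch, not part of the repository's scripts. The helper names archive_old_extract and staging_revert_sql are hypothetical, and the revert SQL assumes the old schema was saved as "CVS_prev" as in the step above and that a failed reload may have left a partial "CVS" schema behind.

```shell
# Hypothetical helpers; NOT part of the repository's scripts.

# Move previous *.data.sql / *.schema.sql files into <dir>/_archive/.
# Unlike a bare `mv` with a glob, this does nothing (rather than failing)
# when no previous extract exists.
archive_old_extract() {
    local dir="$1" f moved=0
    mkdir -p "$dir/_archive" || return 1
    for f in "$dir"/*.data.sql "$dir"/*.schema.sql; do
        [ -e "$f" ] || continue  # an unmatched glob stays literal; skip it
        mv -- "$f" "$dir/_archive/" || return 1
        moved=$((moved + 1))
    done
    echo "archived $moved file(s)"
}

# Print SQL that discards a failed reload and restores the saved schema.
# Assumes the previous schema was renamed to "<schema>_prev" beforehand.
staging_revert_sql() {
    local schema="$1"
    printf 'DROP SCHEMA IF EXISTS "%s" CASCADE;\n' "$schema"
    printf 'ALTER SCHEMA "%s_prev" RENAME TO "%s";\n' "$schema" "$schema"
}

# Example usage (from the repository root):
#   archive_old_extract inputs/CVS
#   staging_revert_sql CVS | bin/psql_verbose_vegbien
```

Piping the generated SQL into bin/psql_verbose_vegbien feeds it on standard input, the same way the heredocs in the steps above do. Review the printed statements before running them, since DROP SCHEMA ... CASCADE is not reversible.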
runtimes
- inputs/CVS/data.sql creation by MSAccess to PostgreSQL: 30 min ("06:28:58".."07:00:04")
- rm=1 inputs/CVS/run: @vegbiendev: 40 min ("38m43.663s"); @frenzy: 15 min ("16m58.735s")
- make inputs/CVS/reimport_scrub by_col=1 &: 1 day ("1.1 days")