/trunk/inputs/GBIF - Changes - BIEN 3 - NCEAS Projects

root/trunk/inputs/GBIF @ 14490

svn:ignore: *

#	Date	Author	Comment
14075	07/15/2014 09:35 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: updated
13965	07/10/2014 12:17 PM	Aaron Marcuse-Kubitza	inputs/GBIF/_MySQL/.rsync_ignore: don't exclude GBIFPortalDB-*.data.sql.gz, even though this is an intermediate file, because it's better to have a backup of it locally. this was excluded in r13316 (2014-4-24) to free up disk space on the local machine.
13401	05/03/2014 02:03 PM	Aaron Marcuse-Kubitza	inputs/input.Makefile: add: verify/: also svn:ignore *.log
13316	04/24/2014 05:29 PM	Aaron Marcuse-Kubitza	inputs/GBIF/_MySQL/.rsync_ignore: added GBIFPortalDB-*.data.sql.gz, because these are intermediate files
12988	03/30/2014 05:41 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/_src/0001000-131106143450413.zip.header.txt, which is useful to see what fields will be available when we switch to the new GBIF export format
12985	03/30/2014 05:11 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/_src/0001000-131106143450413.zip.header.txt.run
12968	03/29/2014 04:06 AM	Aaron Marcuse-Kubitza	*{.sh,run}: runscript targets: use begin_target instead of echo_func so the target name is properly echoed. note that this requires using with_rm so that $rm is properly propagated to applicable invoked targets. (previously, $rm was propagated to all invoked targets. note that with_rm only works inside a runscript target that starts with begin_target.)
12967	03/29/2014 03:58 AM	Aaron Marcuse-Kubitza	lib/sh/make.sh: self_make(): renamed to with_rm() for clarity, since this is used only to propagate $rm, and does not also invoke a command with the same name as the current function, as the name might suggest
12886	03/24/2014 05:35 PM	Aaron Marcuse-Kubitza	schemas/vegbien.sql: specimenreplicate.institution_id: renamed to duplicate_institutions_sourcelist_id, as decided in the conference calls (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2)
12879	03/24/2014 01:49 AM	Aaron Marcuse-Kubitza	mappings/VegCore.htm: regenerated from wiki: rename specimenHolderInstitutions to specimen_duplicate_institutions, as decided in the 2014-03-13 conference call (wiki.vegpath.org/2014-03-13_conference_call#schema-changes-2). note that most schema changes (such as this one) involve mappings changes, which are handled automatically by `inputs/run postprocess; yes|make inputs/{NVS,SALVIAS,TEAM}/test`.
12873	03/23/2014 11:43 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: switched to using lib/runscripts/table.run instead of mysql.table.run because some subdirs (Source/) need the regular table.run to work properly. mysql.table.run should instead be used directly by subdirs that use the MySQL install.
12779	03/20/2014 07:58 PM	Aaron Marcuse-Kubitza	*{.sh,run}: use new begin_target instead of `echo_func; set_make_vars`
12516	02/27/2014 01:27 PM	Aaron Marcuse-Kubitza	bugfix: *.sql: public.source_by_shortname(): need to wrap it in a nested SELECT because Postgres incorrectly does not constant-fold (inline) it, leading to a slowdown when it is therefore run many times. this is done using the steps at wiki.vegpath.org/Postgres_queries#wrap-function-call-in-nested-SELECT .
12018	02/02/2014 12:49 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: add!: verify/: also svn:ignore .tsv, .txt
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11888	12/10/2013 06:35 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/map.csv: row_num: remapped to plain *row_num, like the other datasources that have this field
11887	12/10/2013 06:31 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: rerun time: noted that this is only fast after manual vacuuming of the table (to remove the deleted rows from the index). autovacuum apparently does not run, although it should.
11881	12/09/2013 07:24 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: reran test, which added yearCollected/monthCollected/dayCollected
11869	12/09/2013 02:43 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: updated import() runtime (same), documented table cleanup runtime (1.5 h)
11868	12/09/2013 02:38 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: CREATE INDEX ... specimenHolderInstitutions: documented runtime (45 min)
11867	12/09/2013 02:28 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented runtime (3.5 min)
11788	11/26/2013 11:11 PM	Aaron Marcuse-Kubitza	**/new_terms.csv, unmapped_terms.csv updated (using `make missing_mappings`)
11705	11/21/2013 12:24 AM	Aaron Marcuse-Kubitza	copyright scrub: inputs/: removed data provider-owned schema and documentation files, which are not BIEN copyright and should not be part of what is submitted for open-sourcing. these files will remain accessible via the web interface (fs.vegpath.org), but will not be in the repository.
11658	11/14/2013 02:17 AM	Aaron Marcuse-Kubitza	added inputs/GBIF/_src/0001000-131106143450413.zip.md5, GBIFPortalDB-2013-09-10.dump.gz.md5
11654	11/14/2013 12:49 AM	Aaron Marcuse-Kubitza	inputs/GBIF/_src/GBIFPortalDB-2013-09-10.dump.gz.url: documented download time (5.5 h for an 18 GB file)
11653	11/14/2013 12:40 AM	Aaron Marcuse-Kubitza	inputs/GBIF/_src/0001000-131106143450413.zip.url: documented download time (only 2 h for an 18 GB file)
11650	11/13/2013 07:14 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/_src/0001000-131106143450413.zip.url (DwC-A export), GBIFPortalDB-2013-09-10.dump.gz.url (raw data), portal_26_feb_2013.war.url (raw data portal)
11648	11/13/2013 04:16 PM	Aaron Marcuse-Kubitza	inputs/GBIF/: added LOA files: _src/use_conditions/LetterOfAgreement_template.doc, BIEN LoA agreement annex.docx
11396	10/21/2013 07:14 PM	Aaron Marcuse-Kubitza	fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
11107	09/29/2013 08:58 PM	Aaron Marcuse-Kubitza	bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this significant change to the mappings did not change the semantics of the import of taxonoccurrences.
10866	09/04/2013 11:06 PM	Aaron Marcuse-Kubitza	inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix
10443	07/26/2013 05:58 PM	Aaron Marcuse-Kubitza	inputs/{.,}/.schema.sql: regenerated using the instructions in bin/my2pg. this primarily replaces timestamp with text/timestamp/ (to preserve indefinite dates).
10425	07/25/2013 07:34 PM	Aaron Marcuse-Kubitza	bugfix: inputs/*/*/map.csv for specimen tables: remapped eventDate,day,month,year to *Collected, because a general date always applies to the observation itself rather than to any parent event (specimens don't have a parent event)
10270	07/14/2013 01:26 AM	Aaron Marcuse-Kubitza	bugfix: inputs/*/*/map.csv (e.g. inputs/GBIF/raw_occurrence_record_plants/map.csv): remapped author to scientificNameAuthorship rather than authors, which it had gotten incorrectly automapped to. note that the VegCore term authors has now been renamed to data_authors to avoid ambiguity, but incorrect automappings resulting from it had not yet been fixed.
10269	07/14/2013 12:54 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: updated herbaria.ih column names for staging table column renaming
10268	07/14/2013 12:33 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: need to include lib/runscripts/mysql.table.run instead of table.run (table.run was accidentally substituted when inputs/.NCBI/table.run was copied to all new-style datasources)
10242	07/10/2013 10:07 PM	Aaron Marcuse-Kubitza	inputs/*/Source/VegBIEN.csv: regenerated for new-style import, which uses a symlink to mappings/VegCore-VegBIEN.csv instead of a custom mapping using the original column names
10209	07/10/2013 02:32 AM	Aaron Marcuse-Kubitza	inputs/*/*/map.csv for CSV tables with a row_num column: added missing row_num entry, which is needed by the staging table column renaming to make the order of the map.csv columns match the order in the staging table
10199	07/09/2013 04:44 PM	Aaron Marcuse-Kubitza	bugfix: inputs/*/Source/map.csv: added missing row_num entry, which is needed by the staging table column renaming to make the order of the map.csv columns match the order in the staging table. the staging table column renaming is now used by all Source tables.
10179	07/06/2013 05:39 PM	Aaron Marcuse-Kubitza	inputs/*/: added table.run for use by the table subdirs in new-style import. datasources without table subdirs do not need this.
10174	07/06/2013 03:55 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: for new-style datasources, use a symlink to mappings/VegCore-VegBIEN.csv directly instead of prefiltering VegCore-VegBIEN.csv to include only the columns in map.csv. prefiltering used to be performed as part of mapping the map.csv VegCore output terms to VegBIEN using bin/join, but is no longer needed because the staging table columns are now VegCore terms. instead, the full VegCore-VegBIEN.csv is needed so that derived columns added in stage I or II validations are detected by bin/map (rather than just the original source columns in map.csv).
10166	07/06/2013 11:29 AM	Aaron Marcuse-Kubitza	bugfix: inputs/*/Source/data.csv for new-style datasources: need to include a blank row (plus a blank header) so that the metadata values are imported at least once instead of zero times, now that there is an installed staging table that will be iterated over. the blank row was not previously necessary, because db_xml.put_table() has a special case for metadata-only tables with no installed table, which avoids iterating over the table's rows.
10163	07/03/2013 10:20 PM	Aaron Marcuse-Kubitza	inputs/*/Source/ for new-style datasources: use an actual staging table instead of a metadata-only table, so that metadata values can be stored in the staging table instead of the map.csv (as will be required by new-style import)
10089	06/27/2013 12:20 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/_archive/
10088	06/27/2013 12:18 PM	Aaron Marcuse-Kubitza	removed inputs/GBIF/Specimen/, which has been replaced by the refresh in raw_occurrence_record_plants/
10087	06/27/2013 12:17 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/map.csv, used to regenerate inputs/GBIF/raw_occurrence_record_plants/map.csv when raw_occurrence_record_plants is resubset
10051	06/26/2013 07:55 AM	Aaron Marcuse-Kubitza	inputs/GBIF/run: inherit from lib/runscripts/datasrc_dir.run, which uses import_order.txt to forward calls to the subdirs
10050	06/26/2013 07:54 AM	Aaron Marcuse-Kubitza	added blank runscripts inputs/GBIF/Source/run, Specimen/run because they are in import_order.txt (used by lib/runscripts/datasrc_dir.run)
10036	06/25/2013 03:31 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/_src/.rsync_filter.upload,download to prevent old versions of GBIFPortalDB-*.dump.gz from being downloaded to the local machine, while keeping them on jupiter. this avoids the need to store these files in ~/Documents/BIEN/large_files/ with symlinks from inputs/GBIF/_src/ to exclude them from the sync.
10008	06/23/2013 03:47 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/raw_occurrence_record_plants/.rsync_ignore with filters that have previously needed to be manually added whenever `make inputs/upload` was run
10007	06/23/2013 03:46 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/_MySQL/.rsync_ignore with filters from /README.TXT > Maintenance > to synchronize vegbiendev, jupiter, and your local machine. these filters will now be used with bin/sync_upload in addition to the periodic backup commands.
9927	06/19/2013 10:17 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: genus->taxonlabel.taxonomicname: use new _filter_genus() (see r9882)
9885	06/12/2013 11:26 AM	Aaron Marcuse-Kubitza	added inputs/GBIF/_MySQL/GBIFPortalDB-2013-02-20.data.0.preamble.sql
9882	06/12/2013 10:49 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: genus->taxonlabel.taxonomicname: filter out genera that contain numbers (using new _filter_genus()), which break TNRS and prevent it from matching any other parts of the name. later, these genera can instead be moved to the end of the name, where TNRS will correctly match them as Unmatched_terms.
9877	06/12/2013 10:05 AM	Aaron Marcuse-Kubitza	added inputs/GBIF/raw_occurrence_record_plants/table.tsv.md5
9876	06/12/2013 09:51 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: regenerated. updated for new staging table input columns, which are now the same as the output columns.
9875	06/12/2013 09:41 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: use header from map.csv instead of the new columns, so that source.shortname is set to GBIF instead of VegCore
9874	06/12/2013 09:24 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/VegBIEN.csv: when a runscript is available, instead map the output columns of map.csv to VegBIEN, because the columns have been renamed in the staging table
9873	06/12/2013 08:32 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/VegBIEN.csv: regenerated, which adds row_num input col
9864	06/12/2013 06:35 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/import_order.txt, run: updated raw_occurrence_record/ to raw_occurrence_record_plants/
9858	06/12/2013 04:47 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: import() runtime: specified that this does not include table.tsv.gz/make()
9857	06/12/2013 04:07 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: # duplicates: added revision #
9856	06/12/2013 04:07 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented that there are 4.5 million duplicates (59,998,354 rows before - 55,417,646 rows after = 4,580,708)
9855	06/12/2013 03:49 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: added rerun time (~0 thanks to index, so no problem doing the DELETE each time postprocess.sql is run)
9854	06/12/2013 03:25 AM	Aaron Marcuse-Kubitza	*{.sh,run}: use simpler .rel() instead of `. "$(dirname "${BASH_SOURCE[0]}")"/...` for relative includes
9851	06/12/2013 02:48 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/_MySQL/MySQL_schema, MySQL_data: sed: put {} commands on their own line to work on Mac
9845	06/11/2013 06:40 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: updated column names to match the renamings in map.csv, which are now performed on the staging table itself
9828	06/11/2013 03:29 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: institution_code index: create it idempotently using create_if_not_exists() and an explicit index name, so that a duplicate index doesn't get added each time postprocess.sql is run
9826	06/11/2013 03:22 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: add util to the search_path so that postprocess.sql will also work when run by inputs/input.Makefile, which only puts the datasource (GBIF) in the search_path
9823	06/11/2013 09:04 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added import() runtime (5 h)
9822	06/10/2013 11:58 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv.gz/make() runtime: noted that this excludes the upload time
9821	06/10/2013 11:58 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added table.tsv.gz/upload() runtime (15 min)
9820	06/10/2013 11:48 PM	Aaron Marcuse-Kubitza	added lib/runscripts/mysql.table.run (general to all MySQL datasources) and use it in inputs/GBIF/table.run
9819	06/10/2013 11:13 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): to view runtime when using `screen`: keys used to scroll: added Ctrl-B/Ctrl-F for page-at-a-time scrolling (there are a lot of pages of output for the import() target!)
9818	06/09/2013 09:21 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: table.tsv.gz/make(): don't run table.tsv.gz/upload in test mode, to avoid clobbering the backup of a full table.tsv with a partial, testing table.tsv
9816	06/09/2013 09:08 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: table.tsv.gz/upload(): don't use inplace mode because it leaves a newer mtime when aborted, causing rsync to think that the partial upload is actually newer than the source. note that rsync's --partial-dir mode is just as capable of resuming an aborted upload (it will just use a file in .rsync-tmp instead). inplace mode is primarily designed for fixed-offset files which don't change much between edits, but this is not true for exports (or the gzips of them), which will change the file offsets of most data if even one row or column is added or removed.
9815	06/09/2013 09:01 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: table.tsv.gz/make(): run table.tsv.gz/upload here instead of in table.tsv/make() because it should not run until table.tsv.gz is finished being made, which is not the case in table.tsv/make() because table.tsv.gz/make is run in the background
9814	06/09/2013 08:59 PM	Aaron Marcuse-Kubitza	inputs/GBIF/table.run: table.tsv.gz/upload(): moved before table.tsv.gz/make() so it can be used by it
9813	06/09/2013 08:39 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: table.tsv.gz/upload(): need overwrite=1 because the mtime of an aborted inplace upload is newer
9812	06/09/2013 08:32 PM	Aaron Marcuse-Kubitza	inputs/GBIF/table.run: table.tsv*/upload(): renamed to table.tsv.gz/upload() to upload only table.tsv.gz, not table.tsv, in order to save bandwidth
9807	06/09/2013 07:00 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/table.run: table.tsv*/upload(): need to run put in live mode (live=1)
9803	06/09/2013 06:30 PM	Aaron Marcuse-Kubitza	inputs/GBIF/table.run: table.tsv/make(): run table.tsv*/upload when the file make is done so that the file is backed up to jupiter
9802	06/09/2013 06:29 PM	Aaron Marcuse-Kubitza	inputs/GBIF/table.run: added table.tsv*/upload()
9781	06/09/2013 11:13 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): documented how to view the runtime when using `screen` (press Ctrl-A [ , use up-arrow, and then press Esc to leave copy mode)
9780	06/09/2013 11:12 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: herbaria_filter/make(): use new ih_herbarium table instead of the herbaria_filter.ih.csv_ file directly
9779	06/08/2013 12:23 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added ih_herbarium/make(), which stores the IH herbaria
9778	06/08/2013 11:50 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: table/make(): also filter out rows with a non-plant family (as described at http://vegpath.org/wiki/2013-06-06_conference_call#GBIF-subsetting-fix-raw_occurrence_record-filter-formula), since some institutions have both animal and plant rows, even though they are in IH or in the 80% list. (note that NULL families are OK.)
9777	06/08/2013 04:12 AM	Aaron Marcuse-Kubitza	*{.sh,run}: use mysql instead of mysql_ANSI because mysql is now an alias to mysql_ANSI (since ANSI mode still supports key MySQL features, like `` quotes)
9776	06/08/2013 04:09 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): documented that incremental output is provided right away with --quick (unbuffered), but takes a while to become visible in Macfusion sshfs. this can be tested with `while true; do stat inputs/GBIF/raw_occurrence_record_plants/table.tsv; sleep 2; done` running concurrently with `./inputs/GBIF/raw_occurrence_record_plants/run table.tsv/make` on vegbiendev:/home/bien/svn .
9775	06/08/2013 04:00 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): use new raw_occurrence_record_plants view from table/make()
9774	06/08/2013 03:15 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: table/make(): added make of prerequisites
9773	06/08/2013 03:14 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: table/make(): don't reset $table to plant_fraction_for_herbaria_filter for commands that use $table
9772	06/08/2013 03:10 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added table/make(), which makes the filter view
9771	06/08/2013 02:14 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/: renamed to raw_occurrence_record_plants because it's actually only the plants in raw_occurrence_record, not all of raw_occurrence_record. also, this will allow us to create a separate raw_occurrence_record_plants view whose name matches the folder and does not collide with the raw_occurrence_record table.
9770	06/08/2013 12:44 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): added runtime, which is ~0 since it just needs to do CSV import and index scans
9769	06/08/2013 12:43 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): time the population of herbaria_filter
9768	06/07/2013 11:47 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): updated runtime. added rows affected count to runtime so if the number of rows it's related to (in this case, institution_code) changes, the runtime can be expected to change accordingly.
9766	06/06/2013 04:49 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): plant_fraction column: COUNT counts non-NULL rather than true values (which counter-intuitively includes false, because it's non-NULL), so need to add NULLIF around the boolean expression to turn it into a NULL-or-not expression. see http://vegpath.org/wiki/2013-06-06_conference_call#GBIF-subsetting-fix-plant_fraction-SQL-bug .
9755	06/06/2013 08:09 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv.gz/make(): documented runtime (35 min)
9749	06/06/2013 05:33 AM	Aaron Marcuse-Kubitza	inputs/GBIF/table.run: table.tsv/make(): remake table.tsv.gz/make() after table.tsv is made
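
The COUNT pitfall described in r9766 can be reproduced in isolation. The following is a minimal sketch, not the actual plant_fraction/make() query: the table name `occurrence`, the column `is_plant`, and the sample rows are hypothetical, and it runs against an in-memory SQLite database (where booleans are stored as 1/0) rather than Postgres or MySQL. In Postgres the analogous wrapper would presumably be `NULLIF(<boolean expression>, false)`.

```python
import sqlite3

# Hypothetical miniature of an occurrence table: (institution_code, is_plant).
# is_plant is 1 (true), 0 (false), or NULL (kingdom unknown).
rows = [
    ("A", 1), ("A", 1), ("A", 0), ("A", 0),
    ("B", 1), ("B", None),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occurrence (institution_code TEXT, is_plant INTEGER)")
conn.executemany("INSERT INTO occurrence VALUES (?, ?)", rows)

query = """
SELECT institution_code,
       COUNT(is_plant)            AS naive_count,  -- counts false (0) rows too
       COUNT(NULLIF(is_plant, 0)) AS plant_count,  -- false becomes NULL, so it is skipped
       COUNT(*)                   AS total
FROM occurrence
GROUP BY institution_code
ORDER BY institution_code
"""
results = conn.execute(query).fetchall()

for code, naive_count, plant_count, total in results:
    # The fraction is only meaningful when computed from plant_count.
    print(code, naive_count, plant_count, total, plant_count / total)
```

For institution A, the naive COUNT reports 4 "plants" out of 4 rows, while the NULLIF version reports 2, which is the behavior the bugfix relies on.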
