/trunk/inputs/GBIF/raw_occurrence_record_plants - Changes - BIEN 3 - NCEAS Projects

root/trunk/inputs/GBIF/raw_occurrence_record_plants @ 12004

svn:ignore: *

#	Date	Author	Comment
11970	01/20/2014 11:33 AM	Aaron Marcuse-Kubitza	moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).
11888	12/10/2013 06:35 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/map.csv: row_num: remapped to plain *row_num, like the other datasources that have this field
11887	12/10/2013 06:31 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: rerun time: noted that this is only fast after manual vacuuming of the table (to remove the deleted rows from the index). autovacuum apparently does not run, although it should.
11881	12/09/2013 07:24 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: reran test, which added yearCollected/monthCollected/dayCollected
11869	12/09/2013 02:43 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: updated import() runtime (same), documented table cleanup runtime (1.5 h)
11868	12/09/2013 02:38 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: CREATE INDEX ... specimenHolderInstitutions: documented runtime (45 min)
11867	12/09/2013 02:28 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented runtime (3.5 min)
11396	10/21/2013 07:14 PM	Aaron Marcuse-Kubitza	fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error
11107	09/29/2013 08:58 PM	Aaron Marcuse-Kubitza	bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this significant change to the mappings did not change the semantics of the import of taxonoccurrences.
10866	09/04/2013 11:06 PM	Aaron Marcuse-Kubitza	inputs///test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix
10425	07/25/2013 07:34 PM	Aaron Marcuse-Kubitza	bugfix: inputs///map.csv for specimen tables: remapped eventDate,day,month,year to *Collected, because a general date always applies to the observation itself rather than to any parent event (specimens don't have a parent event)
10270	07/14/2013 01:26 AM	Aaron Marcuse-Kubitza	bugfix: inputs///map.csv (e.g. inputs/GBIF/raw_occurrence_record_plants/map.csv): remapped author to scientificNameAuthorship rather than authors, which it had gotten incorrectly automapped to. note that the VegCore term authors has now been renamed to data_authors to avoid ambiguity, but incorrect automappings resulting from it had not yet been fixed.
10269	07/14/2013 12:54 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: updated herbaria.ih column names for staging table column renaming
10174	07/06/2013 03:55 PM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: for new-style datasources, use a symlink to mappings/VegCore-VegBIEN.csv directly instead of prefiltering VegCore-VegBIEN.csv to include only the columns in map.csv. prefiltering used to be performed as part of mapping the map.csv VegCore output terms to VegBIEN using bin/join, but is no longer needed because the staging table columns are now VegCore terms. instead, the full VegCore-VegBIEN.csv is needed so that derived columns added in stage I or II validations are detected by bin/map (rather than just the original source columns in map.csv).
10008	06/23/2013 03:47 PM	Aaron Marcuse-Kubitza	added inputs/GBIF/raw_occurrence_record_plants/.rsync_ignore with filters that have previously needed to be manually added whenever `make inputs/upload` was run
9927	06/19/2013 10:17 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: genus->taxonlabel.taxonomicname: use new _filter_genus() (see r9882)
9882	06/12/2013 10:49 AM	Aaron Marcuse-Kubitza	mappings/VegCore-VegBIEN.csv: genus->taxonlabel.taxonomicname: filter out genera that contain numbers (using new _filter_genus()), which break TNRS and prevent it from matching any other parts of the name. later, these genera can instead be moved to the end of the name, where TNRS will correctly match them as Unmatched_terms.
9877	06/12/2013 10:05 AM	Aaron Marcuse-Kubitza	added inputs/GBIF/raw_occurrence_record_plants/table.tsv.md5
9876	06/12/2013 09:51 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/test.xml.ref: regenerated. updated for new staging table input columns, which are now the same as the output columns.
9875	06/12/2013 09:41 AM	Aaron Marcuse-Kubitza	bugfix: inputs/input.Makefile: %/VegBIEN.csv: use header from map.csv instead of the new columns, so that source.shortname is set to GBIF instead of VegCore
9874	06/12/2013 09:24 AM	Aaron Marcuse-Kubitza	inputs/input.Makefile: %/VegBIEN.csv: when a runscript is available, instead map the output columns of map.csv to VegBIEN, because the columns have been renamed in the staging table
9873	06/12/2013 08:32 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/VegBIEN.csv: regenerated, which adds row_num input col
9858	06/12/2013 04:47 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: import() runtime: specified that this does not include table.tsv.gz/make()
9857	06/12/2013 04:07 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: # duplicates: added revision #
9856	06/12/2013 04:07 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented that there are 4.5 million duplicates (59,998,354 rows before - 55,417,646 rows after = 4,580,708)
9855	06/12/2013 03:49 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: added rerun time (~0 thanks to index, so no problem doing the DELETE each time postprocess.sql is run)
9845	06/11/2013 06:40 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: updated column names to match the renamings in map.csv, which are now performed on the staging table itself
9828	06/11/2013 03:29 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: institution_code index: create it idempotently using create_if_not_exists() and an explicit index name, so that a duplicate index doesn't get added each time postprocess.sql is run
9826	06/11/2013 03:22 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: add util to the search_path so that postprocess.sql will also work when run by inputs/input.Makefile, which only puts the datasource (GBIF) in the search_path
9823	06/11/2013 09:04 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added import() runtime (5 h)
9822	06/10/2013 11:58 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv.gz/make() runtime: noted that this excludes the upload time
9821	06/10/2013 11:58 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added table.tsv.gz/upload() runtime (15 min)
9819	06/10/2013 11:13 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): to view runtime when using `screen`: keys used to scroll: added Ctrl-B/Ctrl-F for page-at-a-time scrolling (there are a lot of pages of output for the import() target!)
9781	06/09/2013 11:13 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): documented how to view the runtime when using `screen` (press Ctrl-A [ , use up-arrow, and then press Esc to leave copy mode)
9780	06/09/2013 11:12 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: herbaria_filter/make(): use new ih_herbarium table instead of the herbaria_filter.ih.csv_ file directly
9779	06/08/2013 12:23 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added ih_herbarium/make(), which stores the IH herbaria
9778	06/08/2013 11:50 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: table/make(): also filter out rows with a non-plant family (as described at http://vegpath.org/wiki/2013-06-06_conference_call#GBIF-subsetting-fix-raw_occurrence_record-filter-formula), since some institutions have both animal and plant rows, even though they are in IH or in the 80% list. (note that NULL families are OK.)
9777	06/08/2013 04:12 AM	Aaron Marcuse-Kubitza	*{.sh,run}: use mysql instead of mysql_ANSI because mysql is now an alias to mysql_ANSI (since ANSI mode still supports key MySQL features, like `` quotes)
9776	06/08/2013 04:09 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): documented that incremental output is provided right away with --quick (unbuffered), but takes a while to become visible in Macfusion sshfs. this can be tested with `while true; do stat inputs/GBIF/raw_occurrence_record_plants/table.tsv; sleep 2; done` running concurrently with `./inputs/GBIF/raw_occurrence_record_plants/run table.tsv/make` on vegbiendev:/home/bien/svn .
9775	06/08/2013 04:00 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: table.tsv/make(): use new raw_occurrence_record_plants view from table/make()
9774	06/08/2013 03:15 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: table/make(): added make of prerequisites
9773	06/08/2013 03:14 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record_plants/run: table/make(): don't reset $table to plant_fraction_for_herbaria_filter for commands that use $table
9772	06/08/2013 03:10 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record_plants/run: added table/make(), which makes the filter view
9771	06/08/2013 02:14 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/: renamed to raw_occurrence_record_plants because it's actually only the plants in raw_occurrence_record, not all of raw_occurrence_record. also, this will allow us to create a separate raw_occurrence_record_plants view whose name matches the folder and does not collide with the raw_occurrence_record table.
9770	06/08/2013 12:44 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): added runtime, which is ~0 since it just needs to do CSV import and index scans
9769	06/08/2013 12:43 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): time the population of herbaria_filter
9768	06/07/2013 11:47 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): updated runtime. added rows affected count to runtime so if the number of rows it's related to (in this case, institution_code) changes, the runtime can be expected to change accordingly.
9766	06/06/2013 04:49 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): plant_fraction column: COUNT counts non-NULL rather than true values (which counter-intuitively includes false, because it's non-NULL), so need to add NULLIF around the boolean expression to turn it into a NULL-or-not expression. see http://vegpath.org/wiki/2013-06-06_conference_call#GBIF-subsetting-fix-plant_fraction-SQL-bug .
9755	06/06/2013 08:09 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv.gz/make(): documented runtime (35 min)
9705	06/04/2013 11:05 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: plant_fraction_for_herbaria_filter/make(): need to make prerequisites first (plant_fraction/make)
9681	06/01/2013 05:17 AM	Aaron Marcuse-Kubitza	bugfix: *run: overriding targets: use new self_make to properly propagate the $remake flag to the overridden target, so that the target itself is not skipped
9676	06/01/2013 03:31 AM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added check_target_exists so table.tsv would not be overwritten if it already existed
9672	06/01/2013 02:08 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction/make(): documented runtime (1 hr)
9668	05/30/2013 08:22 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): removed no longer needed explicit clear of $remake, which is now done by make.sh instead
9665	05/30/2013 07:53 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: added herbaria_filter/seal()
9664	05/30/2013 07:51 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): changed "from IH" to "contains all of IH" because not all rows are now from IH
9663	05/30/2013 07:49 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): renamed acronym->institution_code to match the column name in raw_occurrence_record rather than in IH
9662	05/30/2013 07:46 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: removed no longer used herbaria_filter.plant_fraction.csv_/make(). use plant_fraction_for_herbaria_filter view instead.
9661	05/30/2013 07:45 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): use the plant_fraction_for_herbaria_filter view directly instead of first exporting it to a CSV
9659	05/30/2013 07:09 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): if remaking, turn off remake mode after doing this target's rm operations, so that prerequisite targets are not also remade
9644	05/30/2013 08:28 AM	Aaron Marcuse-Kubitza	added inputs/GBIF/raw_occurrence_record/postprocess.sql, which removes institutions that we have direct data for
9643	05/30/2013 08:18 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter/make(): skip table if already exists (unless remaking), like plant_fraction/make()
9640	05/30/2013 07:36 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter.plant_fraction.csv_/make(): use new plant_fraction_for_herbaria_filter view
9639	05/30/2013 07:13 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: added plant_fraction_for_herbaria_filter/make(). note that for simplicity, plant_fraction_for_herbaria_filter is a view instead of a table.
9638	05/30/2013 06:50 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: .table/(): renamed to /() because a target named after a table refers to the table unless it has an explicit file extension
9637	05/30/2013 06:49 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/(): renamed to plant_fraction/() because a target named after a table refers to the table unless it has an explicit file extension
9633	05/30/2013 06:19 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: added plant_fraction.table/seal(), which uses new mysql_seal_table()
9629	05/29/2013 10:35 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction: added index on plant_fraction for fast extraction of herbaria by fraction threshold
9628	05/29/2013 10:10 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: tables: set ENGINE to MyISAM and DEFAULT CHARSET to utf8 to match the other GBIF tables. (note that MyISAM is not the default, but is needed to avoid row sort order problems and other issues with InnoDB.)
9627	05/29/2013 08:09 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): in remaking mode, drop the table first
9626	05/29/2013 08:04 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): only create and populate the table if it doesn't already exist, to avoid clobbering existing data. the noclobber functionality uses new skip_table(), which is the table analog of require_not_exists().
9593	05/24/2013 03:13 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter.table/make(): inline the PRIMARY KEY statement with its column
9592	05/24/2013 03:10 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: plant_fraction.table/make(): create the table once with "IF NOT EXISTS" and then populate it with INSERT SELECT, to avoid locking it while it's being repopulated. dropping and recreating the table with CREATE TABLE AS prevented phpMyAdmin from even reading the database's tables list, because it was unable to fetch a rowcount for plant_fraction.
9554	05/24/2013 01:20 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter.ih.csv_/make(): don't use any outer limit value, so that all the IH herbaria are always used. this also ensures that the first GBIF rows will be from an IH herbarium.
9553	05/24/2013 01:17 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter.table/make(): herbaria_filter: don't explicitly set ENGINE or DEFAULT CHARSET, because these should be set to the database values instead so that collations, etc. match
9485	05/21/2013 01:44 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter.table/make(): also include the exported plant_fraction herbaria
9484	05/21/2013 01:43 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: added herbaria_filter.plant_fraction.csv_/make(), which exports the plant_fraction herbaria whose plant_fraction >= 0.8
9483	05/21/2013 01:42 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: added plant_fraction.table/make(), which contains the plant fraction for each herbarium
9471	05/20/2013 08:21 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: herbaria_filter.table/make(): need to use append=1 with mysql_import so the output table doesn't get re-truncated when additional parts are added
9462	05/20/2013 03:40 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: herbaria_filter.table/make(): specify the different parts used to create the table in an array
9461	05/20/2013 03:19 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: renamed herbaria_filter.csv_ to herbaria_filter.ih.csv_ to allow for other tables that get combined into herbaria_filter
9456	05/17/2013 03:43 PM	Aaron Marcuse-Kubitza	lib/sh/db.sh: mysql_import(): automatically ensure the table is empty (i.e. using truncate()), unless append=1 is specified. extra calls to truncate() now that this happens automatically have also been removed.
9434	05/16/2013 09:28 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: dynamically generate herbaria_filter.csv_ from herbaria.ih in new target herbaria_filter.csv_/make()
9433	05/16/2013 09:27 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: store the herbaria filter in a MySQL table loaded from a CSV instead of getting it from a hardcoded list of IN (...) values
9418	05/16/2013 04:40 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: renamed herbaria.sql to herbaria.data.sql so it wouldn't be added to svn by `make inputs/GBIF/raw_occurrence_record/add` or `make inputs/add`
9412	05/16/2013 03:46 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): exclude deleted rows (i.e. where the deleted timestamp is non-NULL)
9411	05/16/2013 03:42 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/header.csv: regenerated using ./run. since the table is reimported as a CSV, it uses bin/csv2db, which prepends an additional row_num column.
9410	05/16/2013 03:09 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): remove explicit cols list to include all cols. the file size of the generated table.tsv will increase by ~3x, but should remain reasonably-sized compared to our available disk space.
9409	05/16/2013 03:04 PM	Aaron Marcuse-Kubitza	bugfix: inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): need \ line continuation after vars so they only apply to the command rather than being set as global vars
9391	05/15/2013 11:27 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: moved table.tsv.md5/make() and invocation of it to inputs/GBIF/table.run because it's general to all tables (which would all use table.tsv for this datasource). use $target_filename in calling table.tsv.md5/make from table.tsv/make.
9386	05/15/2013 07:44 PM	Aaron Marcuse-Kubitza	lib/runscripts/table.run: input_make(): renamed to table_make() to make it clear that the target names are relative to the table subdir itself, not the datasrc dir. it was previously called input_make because it used inputs/input.Makefile directly, but now will use any Makefile in the datasrc dir.
9384	05/15/2013 03:34 PM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv.md5/make(): don't add extra .md5 extension to $target_filename because it already has the extension as part of the target name (now that this command is run in its own make target rather than in table.tsv/make())
9363	05/15/2013 10:53 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: added check_target_exists so you know why make skipped the file (for other, non-silent targets, it would also avoid make's verbose output when the file exists)
9362	05/15/2013 10:38 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): moved making of table.tsv.md5 to separate function
9357	05/15/2013 10:00 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): also add md5 sum for table.tsv
9356	05/15/2013 09:50 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added back filter kw args, which had gotten deleted in a commit without update (although actually, svn should not allow a commit without update, so the working copy may have gotten corrupted)
9348	05/15/2013 08:22 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run table.tsv/make() and functions used by it: added usage comments for cmd line usage, caller usage, and declaring function usage
9339	05/13/2013 07:43 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): cols: also include scientific_name, which is preferable as a TNRS input because it also contains lower ranks
9338	05/13/2013 07:40 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): cols: also include id, institution_code, collection_code, catalogue_number
9337	05/13/2013 07:38 AM	Aaron Marcuse-Kubitza	inputs/GBIF/raw_occurrence_record/run: table.tsv/make(): added filter for institution_codes in herbaria.ih (in PostgreSQL)
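
Note on r9766: the COUNT pitfall described in that entry can be reproduced in a few lines. The sketch below is illustrative only and is not the project's actual query. It assumes a hypothetical `occurrence` table with `institution_code` and `kingdom` columns, and it uses SQLite (through Python's standard library) in place of the MySQL database the run script targets. In SQLite a false comparison evaluates to 0, so the NULLIF is written as `NULLIF(expr, 0)`.

```python
import sqlite3


def plant_fractions(rows):
    """Return (institution_code, buggy_count, plant_count, total) per institution.

    `rows` is a list of (institution_code, kingdom) tuples. The table and
    column names are hypothetical; they only illustrate the pitfall.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE occurrence (institution_code TEXT, kingdom TEXT)")
    conn.executemany("INSERT INTO occurrence VALUES (?, ?)", rows)
    query = """
        SELECT institution_code,
               -- COUNT(expr) counts every non-NULL value, so false (0) is counted too
               COUNT(kingdom = 'Plantae') AS buggy_count,
               -- NULLIF turns false into NULL, so only true values are counted
               COUNT(NULLIF(kingdom = 'Plantae', 0)) AS plant_count,
               COUNT(*) AS total
        FROM occurrence
        GROUP BY institution_code
        ORDER BY institution_code
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


rows = [
    ("A", "Plantae"),
    ("A", "Animalia"),
    ("A", "Plantae"),
    ("B", "Animalia"),
    ("B", None),  # a NULL kingdom is ignored by both COUNT(expr) variants
]
print(plant_fractions(rows))
# prints [('A', 3, 2, 3), ('B', 1, 0, 2)]
```

For institution A the unguarded COUNT reports 3 plant rows out of 3, while the NULLIF version reports 2 out of 3. For a purely animal institution such as B, the unguarded count is still nonzero, which is the kind of error that inflates a plant fraction.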
