  • svn:mime-type: application/octet-stream

# Date Author Comment
11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (e.g. git-svn). IMPORTANT: do NOT do an `svn up`. Instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11902 12/11/2013 09:56 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: removed table names from datasources where only one table is imported

11901 12/11/2013 09:52 PM Aaron Marcuse-Kubitza

fix: inputs/import.stats.xls: removed deleted tables from current import

11900 12/11/2013 09:51 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

11571 11/05/2013 10:19 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

10876 09/05/2013 01:17 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: analytical DB: updated rowcount

10875 09/05/2013 01:14 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

10866 09/04/2013 11:06 PM Aaron Marcuse-Kubitza

inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix

10851 09/04/2013 09:37 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: updated import times

10606 08/06/2013 05:07 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: added backup MD5 test time for last import

10605 08/06/2013 05:03 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: added backup upload time for last import

10602 08/06/2013 04:30 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: added backup times from last import

10601 08/06/2013 01:35 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

10194 07/09/2013 02:30 AM Aaron Marcuse-Kubitza

fix: inputs/import.stats.xls: removed spurious diff comment on total time, which only applied to the previous import

10193 07/09/2013 02:28 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: reformatted times longer than one day as a # of days instead of hours, for clarity. The days format is chosen automatically when the # of hours exceeds one day.
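
The automatic choice between hours and days described above can be sketched as follows. This is only an illustration in Python, not the spreadsheet's actual formula: the function name `format_duration`, the one-decimal rounding, and the unit labels are all assumptions.

```python
def format_duration(hours: float) -> str:
    """Render a duration given in hours.

    Durations longer than one day (24 hours) are shown in days;
    everything else stays in hours. Rounding to one decimal place
    is an assumption made for this sketch.
    """
    if hours > 24:
        return f"{hours / 24:.1f} days"
    return f"{hours:.1f} h"


if __name__ == "__main__":
    for value in (12, 24, 84):
        print(value, "->", format_duration(value))
```

A duration of exactly 24 hours stays in hours here, since the message says the days format applies only when the number of hours exceeds one day.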

9979 06/20/2013 02:17 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

9504 05/23/2013 11:54 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Postprocessing: populated entries for analytical DB for last 4 imports, and for backup, backup test for last import. note that the combined import time for the last import is 3.5 days, compared to 3 days for the column-based import portion.

9503 05/22/2013 11:47 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Postprocessing: added (empty) entries for analytical DB, backup, backup test

9500 05/21/2013 10:49 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. GBIF has been refreshed (with the range modeling column subset), and column-based import now takes 3 days for 88.4 million rows.

8313 04/03/2013 09:32 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Removed the previous imports from the current tab because they are also in the 2012-6~9 tab, and should not be in two places

8312 04/03/2013 09:28 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. MO and FIA have been refreshed.

8076 03/19/2013 12:49 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7863 03/06/2013 07:50 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. The core import time has dropped by more than half (!) to ~12 hours, now that the TNRS scrubbing is added using a simple LEFT JOIN, instead of being pushed through the normalized schema. Not since October has the import been this fast!
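
As a rough illustration of attaching scrubbed names with a plain LEFT JOIN, the sketch below uses Python's built-in sqlite3 module as a stand-in for the project's PostgreSQL database. The table names (`occurrence`, `tnrs`), column names, and sample rows are hypothetical and are not taken from the real schema.

```python
import sqlite3


def join_scrubbed_names():
    """Attach scrubbed names to input rows with one LEFT JOIN.

    All tables, columns, and rows here are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE occurrence (id INTEGER PRIMARY KEY, taxon_name TEXT);
        CREATE TABLE tnrs (input_name TEXT PRIMARY KEY, accepted_name TEXT);
        INSERT INTO occurrence VALUES (1, 'Quercus albus'), (2, 'Unknown sp.');
        INSERT INTO tnrs VALUES ('Quercus albus', 'Quercus alba');
        """
    )
    # LEFT JOIN keeps every occurrence row; rows without a match in the
    # scrubbing table get NULL (None in Python) for accepted_name.
    rows = conn.execute(
        """
        SELECT o.id, o.taxon_name, t.accepted_name
        FROM occurrence AS o
        LEFT JOIN tnrs AS t ON t.input_name = o.taxon_name
        ORDER BY o.id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in join_scrubbed_names():
        print(row)
```

The point of the join is that unmatched input rows are kept rather than dropped, so the scrubbed name can be added as an extra column without routing it through the normalized schema.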

7800 02/28/2013 03:31 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7731 02/27/2013 03:45 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7599 02/20/2013 12:15 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7502 02/07/2013 01:48 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times using the import_times bugfix for times longer than a day

7498 02/07/2013 11:51 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7375 01/28/2013 05:13 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7343 01/23/2013 08:03 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7320 01/22/2013 12:24 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7278 01/18/2013 04:43 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7273 01/18/2013 12:52 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7272 01/18/2013 12:24 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Added Postprocessing section for use with the next import

7271 01/18/2013 12:05 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. Total does not yet include postprocessing.

7124 01/09/2013 12:45 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

7062 01/04/2013 10:10 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Moved CTFS to Deleted section

7028 01/02/2013 06:43 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6983 12/20/2012 11:21 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6911 12/19/2012 05:23 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Reformatted so the first by column import and the comparison by row import will fit on the same page when printed on portrait-mode letter paper

6910 12/19/2012 05:10 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Changed import type labels to By row/By column so they would fit into one field, leaving the extra field free to contain the revision #

6872 12/17/2012 10:04 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6828 12/14/2012 01:46 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6798 12/12/2012 04:13 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6751 12/11/2012 03:39 AM Aaron Marcuse-Kubitza

Renamed inputs/NCU-NCSC/ to NCU because this is the primary herbarium contained in the data

6712 12/10/2012 05:47 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6647 12/06/2012 04:18 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6525 12/03/2012 11:05 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6455 11/25/2012 07:38 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Added separate tab with stats for 2012-6~9. The Excel format apparently only supports 255 columns, so previous imports had been silently truncated. Note that once the 2012-10 imports reach column 255, a new tab will need to be created with the 2012-10+ imports.

6448 11/25/2012 06:15 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6361 11/23/2012 10:38 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6351 11/21/2012 07:48 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. This now includes the Canadensys plants-related datasources HIBG, JBM, QFA, TRT, TRTE, UBC, VASCAN, and WIN.

6350 11/20/2012 09:59 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6264 11/19/2012 10:54 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6181 11/15/2012 01:51 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6160 11/14/2012 02:26 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6120 11/13/2012 12:00 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

6040 11/08/2012 12:17 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. Fixed input row counts and import times to include derived data, such as TNRS and geoscrub, which add to the import time and therefore should be considered in the import's speed. (TNRS was already being included in the import time for some, but not all, imports.)

5986 11/05/2012 02:29 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5964 11/02/2012 02:54 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5825 10/29/2012 10:06 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. The TNRS import has slowed down significantly, possibly due to a bug in the autopopulation of the taxonlabel_relationship table when the input data contains cycles.

5762 10/25/2012 07:20 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5663 10/19/2012 12:18 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5607 10/17/2012 04:01 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5500 10/15/2012 07:57 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5320 10/09/2012 07:25 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5222 10/04/2012 03:58 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times. This now includes the half-hour-long pre-import of the TNRS taxonomic names (which the datasources then match up with), as well as the concatenation of the datasource's taxonomic name components to create or match up with the TNRS input name.

5105 09/28/2012 10:23 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

5004 09/26/2012 06:34 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4975 09/25/2012 03:52 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4952 09/24/2012 03:36 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4919 09/21/2012 02:29 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4875 09/20/2012 06:05 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4836 09/19/2012 05:56 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4778 09/18/2012 02:32 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated import times

4678 09/14/2012 05:53 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Copied the Change factor formula to all rows (it displays an empty string for rows that don't have both a row-based and a column-based import)
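
The blank-when-incomplete behavior of the Change factor column might look like the sketch below. The message does not give the formula itself, so treating the change factor as the row-based time divided by the column-based time is an assumption, as are the function name and the two-decimal formatting.

```python
from typing import Optional


def change_factor(by_row_hours: Optional[float],
                  by_column_hours: Optional[float]) -> str:
    """Return the change factor as text, or "" when it cannot be computed.

    Assumption: the factor is (row-based time) / (column-based time).
    A missing value, or a zero column-based time, yields an empty
    string so the cell displays nothing.
    """
    if by_row_hours is None or by_column_hours is None or by_column_hours == 0:
        return ""
    return f"{by_row_hours / by_column_hours:.2f}"


if __name__ == "__main__":
    print(repr(change_factor(10, 4)))    # both import types present
    print(repr(change_factor(None, 4)))  # no row-based import
```

Returning an empty string rather than an error value is what lets the same formula be copied to every row.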

4676 09/14/2012 05:42 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4619 09/12/2012 07:02 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4533 09/10/2012 06:53 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. Corrected input row count of CTFS.TaxonOccurrence, which had been set to the inserted row count (which is right above it in the log file).

4518 09/07/2012 10:57 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4389 08/31/2012 02:23 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. This now includes CTFS.TaxonOccurrence (presence-only observations), FIA (11 million rows!), and Madidi.Organism. The addition of FIA almost doubles the # of rows to 26 million and increases the import time from 9.5 to 11.5 hours.

4381 08/30/2012 11:23 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. This now includes the core CTFS tables.

4308 08/29/2012 04:39 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4249 08/28/2012 10:35 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4216 08/27/2012 04:18 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4206 08/24/2012 12:20 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4185 08/23/2012 02:12 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4184 08/22/2012 04:56 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4111 08/20/2012 05:22 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

4090 08/17/2012 12:50 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. The import time for SpeciesLink (the slowest datasource) went back down to 9 hours after replacing the slower _merge with _alt.

4067 08/16/2012 12:29 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. The import time for SpeciesLink (the slowest datasource) doubled, to 16 hours, most likely due to replacing _alt with the slower _merge, which preserves more input data.

3995 08/14/2012 07:12 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

3961 08/13/2012 09:15 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

3936 08/10/2012 03:50 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. Note that the import now includes additional date parsing on all date fields, which adds 1/2-1 hour to the import time. Eventually, we will want to translate _date() to PL/pgSQL and only use extra date processing if PostgreSQL's cast to timestamp doesn't work, which should greatly reduce this time.
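
The "try the cheap cast first, fall back only on failure" idea can be sketched as below. This is Python rather than PL/pgSQL and does not reproduce `_date()`: `datetime.fromisoformat` stands in for PostgreSQL's cast to timestamp, and the `FALLBACK_FORMATS` list is a hypothetical placeholder for the extra date processing.

```python
from datetime import datetime
from typing import Optional

# Hypothetical extra formats tried only when the fast path fails.
FALLBACK_FORMATS = ("%m/%d/%Y", "%d.%m.%Y")


def parse_date(text: str) -> Optional[datetime]:
    """Parse a date string, using the slow path only when needed."""
    text = text.strip()
    # Fast path: stand-in for a direct cast to timestamp.
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        pass
    # Slow path: extra processing for values the fast path rejects.
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    for sample in ("2012-08-10", "08/10/2012", "not a date"):
        print(sample, "->", parse_date(sample))
```

Well-formed values never reach the fallback loop, which is where the expected time saving would come from.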

3846 08/08/2012 03:46 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

3791 08/06/2012 05:36 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

3752 08/02/2012 04:46 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import

3751 08/02/2012 04:40 PM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Moved independent-import data to separate tab so that it wouldn't get moved to the side whenever a new column of simultaneous-import data is inserted. It is also no longer updated, because all column-based imports are now done simultaneously.

3744 08/01/2012 10:32 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Moved Simultaneously above Independently because that is how we are now running the imports

3675 07/30/2012 11:31 AM Aaron Marcuse-Kubitza

inputs/import.stats.xls: Updated with stats from latest import. Note that the import now includes CVS.