Name | Size | Revision | Age | Author | Comment
logs | | 8801 | over 11 years | Aaron Marcuse-Kubitza | inputs/input.Makefile: SVN: add, %/add: */logs:...
.map.csv.last_cleanup | 0 Bytes | 4118 | over 12 years | Aaron Marcuse-Kubitza | inputs: Moved maps into subfolders, using the s...
VegBIEN.csv | 42 Bytes | 10349 | over 11 years | Aaron Marcuse-Kubitza | inputs/REMIB/: switched to new-style import, us...
create.sql | 29 Bytes | 10330 | over 11 years | Aaron Marcuse-Kubitza | inputs/REMIB/Specimen/create.sql: moved filteri...
header.csv | 216 Bytes | 5746 | about 12 years | Aaron Marcuse-Kubitza | inputs/REMIB/Specimen/header.csv: Regenerated f...
map.csv | 1.02 KB | 10349 | over 11 years | Aaron Marcuse-Kubitza | inputs/REMIB/: switched to new-style import, us...
new_terms.csv | 323 Bytes | 10339 | over 11 years | Aaron Marcuse-Kubitza | inputs/REMIB/Specimen/: translated single-colum...
postprocess.sql | 3.98 KB | 12516 | almost 11 years | Aaron Marcuse-Kubitza | bugfix: *.sql: public.source_by_shortname(): ne...
run | 63 Bytes | 10349 | over 11 years | Aaron Marcuse-Kubitza | inputs/REMIB/: switched to new-style import, us...
test.xml.ref | 14.8 KB | 11396 | about 11 years | Aaron Marcuse-Kubitza | fix: bin/map: put template: comment out the "Pu...
unmapped_terms.csv | 261 Bytes | 10341 | over 11 years | Aaron Marcuse-Kubitza | bugfix: inputs/REMIB/Specimen/map.csv: state: c...
  • svn:ignore: *

Latest revisions

# Date Author Comment
12516 02/27/2014 01:27 PM Aaron Marcuse-Kubitza

bugfix: *.sql: public.source_by_shortname(): need to wrap it in a nested SELECT because Postgres incorrectly does not constant-fold (inline) it, leading to a slowdown when it is therefore run many times. this is done using the steps at wiki.vegpath.org/Postgres_queries#wrap-function-call-in-nested-SELECT .

11970 01/20/2014 11:33 AM Aaron Marcuse-Kubitza

moved everything into /trunk/ to create the standard svn layout, for use with tools that require this (eg. git-svn). IMPORTANT: do NOT do an `svn up`. instead, re-use your working copy's existing files with `svn switch` (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.switch.html).

11396 10/21/2013 07:14 PM Aaron Marcuse-Kubitza

fix: bin/map: put template: comment out the "Put template:" label so that the output is valid XML, and displays properly in a browser rather than showing a syntax error

11107 09/29/2013 08:58 PM Aaron Marcuse-Kubitza

bugfix: mappings/VegCore-VegBIEN.csv: nest all taxonoccurrences inside a stratum event, so that the parent locationevent is always fully populated before child locationevents point to it. (previously, a stub parent event was created when the child event was imported first, which blocked the fully-populated parent event from being inserted later on.) this uses auto-folding (for VegBank/CVS) and auto-forwarding (for other datasources) to prune empty stratum events for taxonoccurrences that don't have strata. (see wiki.vegpath.org/Auto-folding, wiki.vegpath.org/Auto-forwarding for more info about these normalization techniques.) note that the inserted row counts stay exactly the same for all datasources except VegBank (which was being fixed), indicating that this significant change to the mappings did not change the semantics of the import of taxonoccurrences.

10866 09/04/2013 11:06 PM Aaron Marcuse-Kubitza

inputs/*/*/test.xml.ref: updated source.shortname for new datasource name, which now starts out with .new suffix

10377 07/20/2013 05:09 AM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/postprocess.sql: map_nulls() derived cols: documented total runtime (7.5 min on vegbiendev)

10376 07/20/2013 05:07 AM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/postprocess.sql: map_nulls() derived cols: updated runtimes for map_nulls() inlining, which created a speed improvement of 7x for the numeric columns and 2.5x for the text columns (292563.362->41929.772 ms and 83640.424->35690.797 ms, respectively). note that the map_nulls__coord__*() calls could be optimized further by combining the successive map_nulls() calls into one, with the hstores merged.
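The speedup factors in this message can be rechecked from the raw timings it quotes. The following is a minimal sketch, not part of the repository: the names `timings_ms` and `speedup` are hypothetical, and only the four millisecond values come from the commit message. Recomputed this way, the numeric columns come out very close to 7x, while the text columns come out somewhat below the rounded 2.5x figure.

```python
# Before/after runtimes in milliseconds, copied from the commit message.
# The dictionary layout and the helper name are illustrative assumptions.
timings_ms = {
    "numeric": (292563.362, 41929.772),
    "text": (83640.424, 35690.797),
}


def speedup(before_ms, after_ms):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_ms / after_ms


speedups = {kind: speedup(before, after) for kind, (before, after) in timings_ms.items()}

for kind, factor in speedups.items():
    print(f"{kind} columns: {factor:.2f}x faster")
```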

10361 07/20/2013 01:27 AM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/postprocess.sql: map_nulls__*(): turned off STRICT to allow dynamic inlining, which speeds up the mk_derived_col() statements by 5x (342799.823 ms -> 71533.252 ms (6 min -> 1 min) for latitude_sec)
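The minute figures in parentheses are rounded conversions of the millisecond timings. A small sketch of that conversion follows, assuming only the two latitude_sec timings from the message; the helper name `ms_to_minutes` is hypothetical.

```python
# latitude_sec runtimes in milliseconds, copied from the commit message.
before_ms = 342799.823
after_ms = 71533.252


def ms_to_minutes(ms):
    """Convert milliseconds to minutes (60,000 ms per minute)."""
    return ms / 60_000


before_min = ms_to_minutes(before_ms)
after_min = ms_to_minutes(after_ms)
ratio = before_ms / after_ms

print(f"before: {before_min:.2f} min, after: {after_min:.2f} min, ratio: {ratio:.2f}x")
```

The unrounded values land a little under 6 minutes and a little over 1 minute, with a ratio slightly below 5x.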

10360 07/19/2013 07:23 PM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/postprocess.sql: runtimes: updated for vegbiendev, before dynamic inlining. the times are about twice as fast as on starscream, so vegbiendev is faster at whatever is the limiting speed factor (probably not CPU, based on other benchmarks).

10350 07/19/2013 02:26 PM Aaron Marcuse-Kubitza

inputs/REMIB/Specimen/postprocess.sql: runtimes: documented the machine the times are from
