Task #887
Updated by Aaron Marcuse-Kubitza over 10 years ago
h3. issue * in the last full-database import, this caused disk space errors in 29 of [[VegBIEN_contents#datasources|41 datasources]]: <pre> ssh -t vegbiendev.nceas.ucsb.edu exec sudo su - aaronmk export version=r13016 grep --files-with-matches -F "No space left on device" inputs/{.,}*/*/logs/$version.log.sql # and uniqify by datasource </pre> * in a test run, importing even moderate-sized datasources (eg. NY, SALVIAS) causes all the available disk space to be used up, and crashes the import: <pre> ssh -t vegbiendev.nceas.ucsb.edu exec sudo su - aaronmk export version=test_import grep --files-with-matches -F "No space left on device" inputs/{.,}*/*/logs/$version.log.sql # the list includes all the datasources in the test run: NY, SALVIAS </pre> * because the test run crashed as well, the problem is most likely a bug in Postgres or Linux itself. this unfortunately means that we *can't run the import until we find the Postgres/Linux bug* that is causing the problem. * the disk fills up during queries that normally run quickly but now seem to run indefinitely until the disk space is used up (note the @query_start@ times above each query, compared to when the snapshot was taken): <pre><code class="SQL"> snapshot taken: "Apr 2 11:42:26 2014" "query_start" "2014-04-02 06:20:01.952174-07" /*NVS.StemObservation*/ CREATE TEMP TABLE "stemobservation_pkeys" AS SELECT "in#11"."row_num" , "stemobservation"."stemobservation_id" AS "out.stemobservation_id" FROM "in#11" JOIN "stemobservation" ON COALESCE("stemobservation"."height_m", CAST('NaN' AS double precision)) = COALESCE("in#11"."NVS.StemObservation.Height::double", CAST('NaN' AS double precision)) /* EXPLAIN: Merge Join (cost=1045217.21..95909185.15 rows=6323041431 width=8) Merge Cond: ((COALESCE(stemobservation.height_m, 'NaN'::double precision)) = (COALESCE("in#11"."NVS.StemObservation.Height::double", 'NaN'::double precision))) -> Sort (cost=30039.56..30492.76 rows=181279 width=12) Sort Key: (COALESCE(stemobservation.height_m, 'NaN'::double precision)) -> Seq Scan on stemobservation (cost=0.00..14206.79 rows=181279 width=12) -> Materialize (cost=1015177.64..1050057.81 rows=6976033 width=12) -> Sort (cost=1015177.64..1032617.73 rows=6976033 width=12) Sort Key: (COALESCE("in#11"."NVS.StemObs "query_start" "2014-04-01 14:56:05.573418-07" /*CVS.taxon_observation.***/ CREATE TEMP TABLE "location_pkeys" AS SELECT "in#5"."row_num" , "location"."location_id" AS "out.location_id" FROM "in#5" JOIN "location" ON ("location"."accesslevel" = "in#5"."CVS.taxon_observation.**.accessRights::accesslevel" OR ("location"."accesslevel" IS NULL AND "in#5"."CVS.taxon_observation.**.accessRights::accesslevel" IS NULL)) /* EXPLAIN: Nested Loop (cost=0.00..83049883712.45 rows=27683225000 width=8) Join Filter: ((location.accesslevel = "in#5"."CVS.taxon_observation.**.accessRights::accesslevel") OR ((location.accesslevel IS NULL) AND ("in#5"."CVS.taxon_observation.**.accessRights::accesslevel" IS NULL))) -> Seq Scan on location (cost=0.00..175778.45 rows=5536645 width=8) -> Materialize (cost=0.00..35434.00 rows=1000000 width=8) -> Seq Scan on "in#5" (cost=0.00..30434.00 rows=1000000 width=8) */ </code></pre> * the problem is not necessarily caused by #884, because sort temp files consume disk space only in proportion to the table size, which for a small import would not be nearly enough to fill the disk h3. fix * restore vegbiendev to last working configuration * roll back Postgres to the version it was at in the last successful import import, re-run import, and see if problem goes away ** this may require building Postgres from source, because past _revisions_ of the same numeric version might only be available in version control, not in binary form via apt-get (which numbers packages by numeric version) ** * if this isn't possible, it may be necessary to downgrade to Postgres 9.2 (which will unfortunately be missing some features that we now use)