Task #884
Updated by Aaron Marcuse-Kubitza over 10 years ago
h3. issue
* see the following @pg_stat_activity@ snapshot taken at 2014-4-2 11:42am PT (note especially the @query_start@ times above each query):
<pre><code class="SQL">
snapshot taken: "Apr 2 11:42:26 2014"
"query_start"
"2014-04-02 06:20:01.952174-07"
/*NVS.StemObservation*/ CREATE TEMP TABLE "stemobservation_pkeys" AS
SELECT
"in#11"."row_num"
, "stemobservation"."stemobservation_id" AS "out.stemobservation_id"
FROM "in#11"
JOIN "stemobservation" ON COALESCE("stemobservation"."height_m", CAST('NaN' AS double precision)) = COALESCE("in#11"."NVS.StemObservation.Height::double", CAST('NaN' AS double precision))
/* EXPLAIN:
Merge Join (cost=1045217.21..95909185.15 rows=6323041431 width=8)
Merge Cond: ((COALESCE(stemobservation.height_m, 'NaN'::double precision)) = (COALESCE("in#11"."NVS.StemObservation.Height::double", 'NaN'::double precision)))
-> Sort (cost=30039.56..30492.76 rows=181279 width=12)
Sort Key: (COALESCE(stemobservation.height_m, 'NaN'::double precision))
-> Seq Scan on stemobservation (cost=0.00..14206.79 rows=181279 width=12)
-> Materialize (cost=1015177.64..1050057.81 rows=6976033 width=12)
-> Sort (cost=1015177.64..1032617.73 rows=6976033 width=12)
Sort Key: (COALESCE("in#11"."NVS.StemObs
"query_start"
"2014-04-01 14:56:05.573418-07"
/*CVS.taxon_observation.***/ CREATE TEMP TABLE "location_pkeys" AS
SELECT
"in#5"."row_num"
, "location"."location_id" AS "out.location_id"
FROM "in#5"
JOIN "location" ON ("location"."accesslevel" = "in#5"."CVS.taxon_observation.**.accessRights::accesslevel" OR ("location"."accesslevel" IS NULL AND "in#5"."CVS.taxon_observation.**.accessRights::accesslevel" IS NULL))
/* EXPLAIN:
Nested Loop (cost=0.00..83049883712.45 rows=27683225000 width=8)
Join Filter: ((location.accesslevel = "in#5"."CVS.taxon_observation.**.accessRights::accesslevel") OR ((location.accesslevel IS NULL) AND ("in#5"."CVS.taxon_observation.**.accessRights::accesslevel" IS NULL)))
-> Seq Scan on location (cost=0.00..175778.45 rows=5536645 width=8)
-> Materialize (cost=0.00..35434.00 rows=1000000 width=8)
-> Seq Scan on "in#5" (cost=0.00..30434.00 rows=1000000 width=8)
*/
</code></pre>
* this may be the cause of #883, if intermediate sort results are stored on disk and take up significant space
h3. fix
* roll back Postgres to the version it was at in the last successful import, re-run import, and see if problem goes away
** this may require building Postgres from source, because past _revisions_ of the same numeric version might only be available in version control, not in binary form via apt-get (which numbers packages by numeric version)
* if this isn't possible, it may be necessary to downgrade to Postgres 9.2 (which will unfortunately be missing some features that we now use)