Bug #948
Updated by Aaron Marcuse-Kubitza over 10 years ago
h3. issue
"from Brody":mailto:brody.sandelATbiology.au.dk?Brody_Sandel.2014-8-22-9:43PT:
> SELECT * FROM view_full_occurrence_individual WHERE datasource =
> 'VegBank' AND plot_name = 'CO081202XY04'
> ;
>
> returns two lines with "Lupinus argenteus". Those lines appear to be complete duplicates, except for the last field (taxonobservation_id) which is 1230 for one and 1231 for the next. Both lines show cover_percent of 2.
>
> Looking at VegBank (http://vegbank.org/vegbank/views/observation_comprehensive.jsp?view=comprehensive&entity=observation&wparam=42558&strata2Show=1¶ms=42558&placeholder=), I see a single entry of Lupinus argenteus, with a cover of 2%.
>
> There's a similar situation for Quercus gambelii in this particular plot.
h3. info
It looks like a bug in the view_full_occurrence_individual_view joins,
as VegBank itself has only one entry for this [1].
fn1. using @aggregate_organism_observation_id = 937165@ from those two rows:
<pre><code class="SQL">
SELECT * FROM "VegBank".taxonimportance WHERE taxonimportance_id = 937165; -- 1 row
SELECT * FROM "VegBank".taxoninterpretation WHERE taxonobservation_id = 697934; -- 1 row
</code></pre>
h3. fix
go through all of the view_full_occurrence_individual_view joins, and figure out where the duplication is being introduced:
@FROM source@
@JOIN location USING (source_id)@
@LEFT JOIN locationevent USING (location_id)@
@LEFT JOIN place USING (place_id)@
@LEFT JOIN location parent_location ON parent_location.location_id = location.parent_id@
@LEFT JOIN coordinates USING (coordinates_id)@
@LEFT JOIN geoscrub.geoscrub_output ON ARRAY[geoscrub_output."decimalLatitude"] = ARRAY[coordinates.latitude_deg] AND ARRAY[geoscrub_output."decimalLongitude"] = ARRAY[coordinates.longitude_deg] AND ARRAY[geoscrub_output.country] = ARRAY[place.country] AND ARRAY[geoscrub_output."stateProvince"] = ARRAY[place.stateprovince] AND ARRAY[geoscrub_output.county] = ARRAY[place.county]@
@LEFT JOIN "newWorld".iso_code_gadm ON iso_code_gadm."*GADM country" = COALESCE(geoscrub_output."acceptedCountry", place.country)@
@LEFT JOIN "newWorld"."newWorldCountries" ON "newWorldCountries"."*isoCode" = iso_code_gadm."*2-digit iso code"@
@LEFT JOIN geoscrub.county_centroids ON place.country = 'United States'::text AND county_centroids."stateProvince" = COALESCE(geoscrub_output."acceptedStateProvince", place.stateprovince) AND county_centroids.county = COALESCE(geoscrub_output."acceptedCounty", place.county)@
@LEFT JOIN locationevent parent_event ON parent_event.locationevent_id = locationevent.parent_id@
@LEFT JOIN project ON project.project_id = COALESCE(locationevent.project_id, parent_event.project_id)@
@LEFT JOIN stratum ON stratum.stratum_id = COALESCE(locationevent.stratum_id, parent_event.stratum_id)@
@LEFT JOIN method ON method.method_id = COALESCE(locationevent.method_id, parent_event.method_id);@
@LEFT JOIN taxonoccurrence ON taxonoccurrence.locationevent_id = "plot.**".locationevent_id@
@LEFT JOIN party collector ON collector.party_id = taxonoccurrence.collector_id@
-@LEFT JOIN aggregateoccurrence USING (taxonoccurrence_id)@- _query:_
<pre><code class="SQL">
SELECT * FROM aggregateoccurrence WHERE (source_id, (COALESCE(sourceaccessioncode, '\N'::text)) COLLATE pg_catalog."default") = (38,'937165') AND sourceaccessioncode IS NOT NULL; -- 1 row
</code></pre>
-@LEFT JOIN plantobservation USING (aggregateoccurrence_id)@- _query:_
<pre><code class="SQL">
SELECT * FROM plantobservation WHERE aggregateoccurrence_id = 3823285; -- 1 row
</code></pre>
-@LEFT JOIN specimenreplicate USING (plantobservation_id)@- _query:_
<pre><code class="SQL">
SELECT * FROM specimenreplicate WHERE plantobservation_id = 4462528; -- 1 row
</code></pre>
@LEFT JOIN sourcelist ON sourcelist.sourcelist_id = specimenreplicate.duplicate_institutions_sourcelist_id@
@LEFT JOIN taxondetermination ON taxondetermination.taxonoccurrence_id = taxonoccurrence.taxonoccurrence_id AND taxondetermination.iscurrent@
@LEFT JOIN party identifiedby ON identifiedby.party_id = taxondetermination.party_id@
@LEFT JOIN taxonverbatim USING (taxonverbatim_id)@
@LEFT JOIN taxonlabel USING (taxonlabel_id)@
@LEFT JOIN "TNRS".taxon_scrub ON taxon_scrub."*Name_submitted" = taxonlabel.taxonomicname@
@LEFT JOIN family_higher_plant_group ON family_higher_plant_group.family = taxon_scrub."[scrubbed_]family~(-Accepted_-)__@TNRS__@vegpath.org"@
@LEFT JOIN cultivated_family_locations ON cultivated_family_locations.family = taxon_scrub."[scrubbed_]family~(-Accepted_-)__@TNRS__@vegpath.org" AND cultivated_family_locations.country = "plot.**".country;@