Revision 9856

inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented that there are 4.5 million duplicates (59,998,354 rows before - 55,417,646 rows after = 4,580,708)

View differences:

inputs/GBIF/raw_occurrence_record_plants/postprocess.sql
     , 'UBC'
     , 'WIN'
 )
+/* there are 4.5 million duplicates [1]
+
+[1] 59,998,354 rows before [2] - 55,417,646 rows after [3] = 4,580,708
+
+[2] 59998354
+SELECT "raw_occurrence_record-row_num"
+FROM "GBIF".raw_occurrence_record_plants
+ORDER BY "raw_occurrence_record-row_num" DESC
+LIMIT 1
+Total query runtime: 19436 ms.
+
+[3] 55417646
+SELECT COUNT(*) FROM "GBIF".raw_occurrence_record_plants
+Total query runtime: 23 ms.
+*/
 /* list obtained using the following on r9459:
 SELECT DISTINCT dataprovider
 FROM sourcelist
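
The duplicate count documented in this revision comes from two separate queries whose results are subtracted by hand. As a minimal sketch, the same figure could be computed in one statement. This is not part of the commit. It assumes that "raw_occurrence_record-row_num" was assigned consecutively from 1 before any rows were removed, that it is never NULL, and that the highest-numbered row was not itself removed. The aliases max_row_num, row_count, removed_rows and counts are illustrative names only.

```sql
-- Sketch only: derive the "rows before - rows after" figure in one query.
-- Assumption: the highest surviving row number equals the original row count.
SELECT
    max_row_num,                              -- rows before (per footnote [2])
    row_count,                                -- rows after  (per footnote [3])
    max_row_num - row_count AS removed_rows   -- rows removed as duplicates
FROM (
    SELECT
        MAX("raw_occurrence_record-row_num") AS max_row_num,
        COUNT(*)                             AS row_count
    FROM "GBIF".raw_occurrence_record_plants
) AS counts;
```

With the figures recorded in the comment, the subtraction is 59,998,354 - 55,417,646 = 4,580,708. MAX ignores NULLs, whereas ORDER BY ... DESC LIMIT 1 sorts NULLs first by default in PostgreSQL, so the two approaches agree only if the column contains no NULLs.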
