Revision 9856
Added by Aaron Marcuse-Kubitza over 11 years ago
inputs/GBIF/raw_occurrence_record_plants/postprocess.sql

Old | New | Line |
---|---|---|
20 | 20 | , 'UBC' |
21 | 21 | , 'WIN' |
22 | 22 | ) |
 | 23 | /* there are 4.5 million duplicates [1] |
 | 24 | |
 | 25 | [1] 59,998,354 rows before [2] - 55,417,646 rows after [3] = 4,580,708 |
 | 26 | |
 | 27 | [2] 59998354 |
 | 28 | SELECT "raw_occurrence_record-row_num" |
 | 29 | FROM "GBIF".raw_occurrence_record_plants |
 | 30 | ORDER BY "raw_occurrence_record-row_num" DESC |
 | 31 | LIMIT 1 |
 | 32 | Total query runtime: 19436 ms. |
 | 33 | |
 | 34 | [3] 55417646 |
 | 35 | SELECT COUNT(*) FROM "GBIF".raw_occurrence_record_plants |
 | 36 | Total query runtime: 23 ms. |
 | 37 | */ |
23 | 38 | /* list obtained using the following on r9459: |
24 | 39 | SELECT DISTINCT dataprovider |
25 | 40 | FROM sourcelist |
inputs/GBIF/raw_occurrence_record_plants/postprocess.sql: Remove institutions that we have direct data for: documented that there are 4.5 million duplicates (59,998,354 rows before - 55,417,646 rows after = 4,580,708)