Full-database summary statistics¶
Note that different statistics were calculated on different versions of the database, and some are out of date
Input rows¶
27,930,518
calculation: inputs/import.stats.xls
2013-1-16 import has 29,805,931 total - 1,707,970 geoscrub - 3536 herbaria - 163,907 NCBI
Table row counts¶
Table | Row count |
analytical_stem | ~27,204,860 |
analytical_aggregate | ~15,143,832 |
SET search_path TO "public.2012-12-11-11-46-55.r6795"; EXPLAIN SELECT * FROM analytical_stem; EXPLAIN SELECT * FROM analytical_aggregate;
Joins¶
location to locationevent¶
11,217,698
SET search_path TO "public.2012-11-06-16-50-06.r6038"; SELECT count(*) FROM location LEFT JOIN locationevent USING (location_id)
Filters¶
places with no geoscrub place¶
datasource | no_geoscrub_place | total |
ACAD | 8200 | 8247 |
ARIZ | 831 | 36165 |
CTFS | 156 | 157 |
CVS | 6206 | 6206 |
FIA | 270690 | 270691 |
GBIF | 188289 | 188613 |
Madidi | 285 | 285 |
MO | 4402 | 205541 |
NCU-NCSC | 1668 | 2741 |
NY | 1 | 57839 |
QMOR | 280 | 444 |
REMIB | 13734 | 16275 |
SALVIAS | 6746 | 7065 |
SpeciesLink | 136494 | 187724 |
TEAM | 2337 | 2337 |
U | 2563 | 3177 |
VegBank | 59413 | 66284 |
UNCC | 1831 | 1858 |
SET search_path TO "public.2012-11-13-19-16-20.r6152"; SELECT reference.shortname AS datasource , no_geoscrub_place , count(*) AS total FROM ( SELECT datasource_place.reference_id, count(*) AS no_geoscrub_place FROM place datasource_place JOIN place matched_place ON matched_place.place_id = datasource_place.matched_place_id WHERE matched_place.matched_place_id IS NULL -- no match AND datasource_place.reference_id > 3 GROUP BY datasource_place.reference_id ) no_geoscrub_place JOIN reference USING (reference_id) JOIN place USING (reference_id) GROUP BY reference_id, reference.shortname, no_geoscrub_place ORDER BY reference_id
locationevents without location¶
373
SET search_path TO "public.2012-12-06-23-19-52.r6672"; SELECT count(*) FROM locationevent WHERE location_id IS NULL
taxonoccurrences without locationevent¶
254,588
SET search_path TO "public.2012-12-06-23-19-52.r6672"; SELECT count(*) FROM taxonoccurrence WHERE locationevent_id IS NULL
extra taxondeterminations¶
Extra taxondeterminations have isoriginal = true
3,904,804 of 29,779,347 = 13%
SET search_path TO "public.2012-11-06-16-50-06.r6038"; SELECT count(*) FROM taxondetermination WHERE isoriginal
non-extra taxondeterminations¶
25,874,543 of 29,779,347
Source: 29,779,347 - 3,904,804 = 25,874,543
Analytical DB¶
Note that total row counts are out of date. They were likely fixed in changes between the r6360 (2012-11-21) and r6446 (2012-11-24) imports, due to the decrease in the backup file size indicating fewer analytical_stem rows. r6111 (2012-11-9) also fixed a join to use geoscrub_cultivated rather than geoscrub, which contained many duplicate records.
speciesBinomials¶
207,066
(takes 11 min)
SET search_path TO "public.2012-11-16-20-51-10.r6262"; SELECT count(DISTINCT "speciesBinomial") FROM analytical_aggregate;
non-NULL georeferenceValid¶
3,985,866 of 54,344,800
SET search_path TO "public.2012-11-13-19-16-20.r6152"; SELECT count(*) FROM analytical_stem WHERE "georeferenceValid" IS NOT NULL
higherPlantGroup¶
higherPlantGroup | count |
<NULL> | 1,486,319 |
bryophytes | 376,128 |
gymnosperms (conifers) | 4,262,022 |
ferns and allies | 564,406 |
flowering plants | 47,517,123 |
gymnosperms (non-conifer) | 24,313 |
SET search_path TO "public.2012-11-06-16-50-06.r6038"; SELECT "higherPlantGroup", count(*) FROM analytical_stem GROUP BY "higherPlantGroup"
isNewWorld¶
16,151,354 of 54,344,800
SET search_path TO "public.2012-11-06-16-50-06.r6038"; SELECT count(*) FROM analytical_stem WHERE "isNewWorld"
threatened¶
189,855 of 54,344,800
SET search_path TO "public.2012-11-06-16-50-06.r6038"; SELECT count(*) FROM analytical_stem WHERE threatened