Revision f41365c2
Added by Adam M. Wilson almost 11 years ago
climate/research/LST_missingdata.Rmd | ||
---|---|---|
1 |
LST Climatology Evaluation |
|
2 |
==== |
|
3 |
Adam M. Wilson |
|
4 |
```{r} |
|
5 |
## get repo info |
|
6 |
githash=system("git --git-dir /Users/adamw/work/environmental-layers/.git log --pretty=format:'%h' -n 1",intern=T) |
|
7 |
print(paste("Compiled on",date()," using code version (git hash):",githash)) |
|
8 |
``` |
|
9 |
|
|
10 |
A short script to visualize and explore the updated Land Surface Temperature (LST) climatology algorithm, which 'lowers the standards' in some areas to increase the number of available observations. |
|
11 |
|
|
12 |
```{r,echo=FALSE,results="hide",message=FALSE} |
|
13 |
## some setup |
|
14 |
opts_knit$set(progress = TRUE, verbose = TRUE, cache=TRUE,root.dir="~/Downloads/nasa/") |
|
15 |
library(rasterVis) |
|
16 |
library(rgdal) |
|
17 |
library(xtable) |
|
18 |
library(reshape) |
|
19 |
setwd("~/Downloads/nasa/") |
|
20 |
``` |
|
21 |
|
|
22 |
|
|
23 |
## Download data from ECOcast and convert to raster stacks |
|
24 |
```{r} |
|
25 |
download=F |
|
26 |
if(download) system("wget -e robots=off -L -r -np -nd -p 20140304_LST -nc -A tif http://ecocast.arc.nasa.gov/data/pub/climateLayers/LST_new/") |
|
27 |
|
|
28 |
## organize file names |
|
29 |
f=data.frame(path=list.files("20140304_LST",pattern="tif$",full.names=TRUE),stringsAsFactors=FALSE) |
|
30 |
f$month=as.numeric(do.call(rbind,strsplit(basename(f$path),"_|[.]"))[,7]) |
|
31 |
f$type=do.call(rbind,strsplit(basename(f$path),"_|[.]"))[,1] |
|
32 |
f=f[order(f$month),] |
|
33 |
f$mn=month.name[f$month] |
|
34 |
|
|
35 |
## create raster stacks |
|
36 |
lst_mean=stack(f$path[f$type=="mean"]) |
|
37 |
names(lst_mean)=f$mn[f$type=="mean"] |
|
38 |
NAvalue(lst_mean)=0 |
|
39 |
|
|
40 |
lst_nobs=stack(f$path[f$type=="nobs"]) |
|
41 |
names(lst_nobs)=f$mn[f$type=="nobs"] |
|
42 |
|
|
43 |
lst_qa=stack(f$path[f$type=="qa"]) |
|
44 |
names(lst_qa)=f$mn[f$type=="qa"] |
|
45 |
|
|
46 |
## define a function to summarize data |
|
47 |
fst=function(x,na.rm=TRUE) c("mean"=mean(x,na.rm=na.rm),"min"=min(x,na.rm=na.rm),"max"=max(x,na.rm=na.rm)) |
|
48 |
rfst=function(r) cellStats(r,fst) |
|
49 |
``` |
|
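The `month` and `type` columns above come from splitting each file name on underscores and dots. The following is a minimal base-R sketch of that step, assuming the naming pattern quoted in question 3 at the end of this document (`mean_LST_Day_1km_h08v05_apr_4.tif`). The `nobs_` file name is a hypothetical example that follows the same pattern.

```r
## Hypothetical file names following the pattern mean_LST_Day_1km_h08v05_apr_4.tif
paths <- c("20140304_LST/mean_LST_Day_1km_h08v05_apr_4.tif",
           "20140304_LST/nobs_LST_Day_1km_h08v05_jan_1.tif")

## Split on "_" or "." and bind the pieces: one row per file, one column per field
parts <- do.call(rbind, strsplit(basename(paths), "_|[.]"))

type  <- parts[, 1]              # first field: "mean", "nobs", ...
month <- as.numeric(parts[, 7])  # seventh field: numeric month

data.frame(type = type, month = month, name = month.name[month])
```

Because the month is read by position, this approach only works if every file name has the same number of fields.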
50 |
|
|
51 |
## Mean Monthly LST |
|
52 |
|
|
53 |
Map of LST by month (with white indicating missing data). Note that many inland regions have missing data (white) in some months (mostly winter). |
|
54 |
```{r, message=FALSE, fig.width=11, fig.height=8} |
|
55 |
colramp=colorRampPalette(c("blue","orange","red")) |
|
56 |
dt_mean=rfst(lst_mean) |
|
57 |
levelplot(lst_mean,col.regions=c(colramp(99)),at=seq(0,65,len=99),main="Mean Land Surface Temperature",sub="Tile H08v05 (California and Northern Mexico)") |
|
58 |
``` |
|
59 |
|
|
60 |
|
|
61 |
Table of mean, minimum, and maximum LST for this tile by month. |
|
62 |
```{r,results="asis"} |
|
63 |
print(xtable(dt_mean), type = "html") |
|
64 |
``` |
|
65 |
|
|
66 |
### Boxplot of Monthly Mean LST |
|
67 |
```{r, message=FALSE, fig.width=11, fig.height=8} |
|
68 |
lst_tmean=melt(unlist(as.matrix(lst_mean))) |
|
69 |
colnames(lst_tmean)=c("cell","month","value") |
|
70 |
lst_tmean=lst_tmean[!is.na(lst_tmean$value),] |
|
71 |
lst_tmean$month=factor(lst_tmean$month,levels=month.name,ordered=T) |
|
72 |
bwplot(value~month,data=lst_tmean,ylab="Mean LST (°C)",xlab="Month") |
|
73 |
``` |
|
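The chunk above reshapes the cell-by-month matrix into one row per cell and month before plotting. A minimal base-R sketch of the same reshaping on a tiny hypothetical matrix (the values are made up, and `melt` from `reshape` is replaced here by explicit `rep` calls):

```r
## Hypothetical 3-cell x 2-month matrix; NA marks missing data
m <- matrix(c(10, NA, 12, 20, 21, NA), nrow = 3,
            dimnames = list(NULL, c("January", "February")))

## Long format: one row per (cell, month) pair, in column-major order
long <- data.frame(cell  = rep(seq_len(nrow(m)), times = ncol(m)),
                   month = rep(colnames(m), each = nrow(m)),
                   value = as.vector(m))

## Drop missing cells and order months chronologically rather than alphabetically
long <- long[!is.na(long$value), ]
long$month <- factor(long$month, levels = month.name, ordered = TRUE)
long
```

Setting the factor levels to `month.name` is what keeps the boxplot panels in calendar order.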
74 |
|
|
75 |
|
|
76 |
## Total number of available observations |
|
77 |
|
|
78 |
This section details the spatial and temporal distribution of the number of LST observations that were not masked by quality control (see the Quality Assessment section below). Regions with no data in the map above also have zero observations (nobs=0) here, and the areas surrounding them have low numbers of observations in some months (blue colors). |
|
79 |
```{r, message=FALSE, fig.width=11, fig.height=8} |
|
80 |
dt_nobs=rfst(lst_nobs) |
|
81 |
levelplot(lst_nobs,col.regions=c("grey",colramp(99)),at=c(-0.5,0.5,seq(1,325,len=99)), |
|
82 |
main="Sum Available Observations",sub="Tile H08v05 (California and Northern Mexico) \n Grey indicates zero observations") |
|
83 |
``` |
|
84 |
|
|
85 |
Table of mean, minimum, and maximum number of observations for this tile by month. |
|
86 |
```{r,results="asis", echo=FALSE} |
|
87 |
print(xtable(dt_nobs), type = "html") |
|
88 |
``` |
|
89 |
|
|
90 |
### Boxplot of Number of Observations |
|
91 |
The seasonal cycle of missing data is quite noisy, though there tend to be fewer observations in winter months (DJF). |
|
92 |
```{r, message=FALSE, fig.width=11, fig.height=8} |
|
93 |
lst_tnobs=melt(unlist(as.matrix(lst_nobs))) |
|
94 |
colnames(lst_tnobs)=c("cell","month","value") |
|
95 |
lst_tnobs=lst_tnobs[!is.na(lst_tnobs$value),] |
|
96 |
lst_tnobs$month=factor(lst_tnobs$month,levels=month.name,ordered=T) |
|
97 |
bwplot(value~month,data=lst_tnobs,ylab="Number of available observations",xlab="Month") |
|
98 |
``` |
|
99 |
|
|
100 |
|
|
101 |
## Quality Assessment level used |
|
102 |
|
|
103 |
Map of the Quality Assessment (QA) level used to fill each pixel, ranging from 0 (highest quality) to 3 (lowest). For h08v05 all pixels are filled with either 0 or 1, so red indicates areas filled with the lower-quality data (most of the tile). |
|
104 |
```{r, message=FALSE, fig.width=11, fig.height=8} |
|
105 |
levelplot(lst_qa[[1]],col.regions=c("grey","red"),at=c(-0.5,0.5,1.5),cuts=2, |
|
106 |
main="Quality Assessment Filter",sub="Tile H08v05 (California and Northern Mexico)") |
|
107 |
``` |
|
108 |
|
|
109 |
|
|
110 |
Proportion of cells in each month with QA=1 (including cells in the Pacific Ocean) |
|
111 |
```{r,results="asis", echo=FALSE} |
|
112 |
dt_qa=t(data.frame("ProportionQA_1"=cellStats(lst_qa,"mean"))) |
|
113 |
print(xtable(dt_qa), type = "html") |
|
114 |
``` |
|
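The table above uses the layer mean as the proportion of cells with QA=1. That shortcut holds only because the QA values in this tile are restricted to 0 and 1. A minimal base-R sketch with hypothetical QA values:

```r
## Hypothetical QA values for eight cells; NA marks cells with no data
qa <- c(0, 1, 1, NA, 1, 0, 1, 1)

## With values restricted to 0/1, the mean equals the share of cells with QA == 1
p_mean <- mean(qa, na.rm = TRUE)
p_eq   <- mean(qa == 1, na.rm = TRUE)
c(mean = p_mean, proportion = p_eq)
```

If levels 2 or 3 ever appear in other tiles, the mean would no longer be a proportion, and an explicit comparison such as `qa == 1` would be the safer form.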
115 |
|
|
116 |
|
|
117 |
## Questions |
|
118 |
|
|
119 |
A few open questions/comments (in my mind): |
|
120 |
|
|
121 |
1. Why are there only two QA classes for this tile (0 and 1) rather than four (0-3)? There are still missing data in some months; is the plan to add the remaining classes later, or was there another reason not to consider all classes for this tile? |
|
122 |
2. How exactly are the different QA levels selected? If QA=0 results in <33 obs, go to QA=1, etc.? |
|
123 |
3. Please name monthly output in a way that it sorts chronologically (e.g. mean_LST_Day_1km_h08v05_04.tif instead of mean_LST_Day_1km_h08v05_apr_4.tif). |
|
124 |
4. Please name directories on ECOcast with dates rather than "new". E.g. LST/20140304/* That will make it easier to see which is the new version. |
|
125 |
5. Should we consider also saving the SD of the observations in each pixel (in addition to the mean and n of observations)? |
|
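To make question 2 concrete, here is one possible reading of the fallback rule as a base-R sketch. Everything in it is an assumption for discussion, not the actual algorithm: the function name `select_qa` is hypothetical, the threshold of 33 is taken from the question, and the rule is assumed to count observations cumulatively across QA levels.

```r
## Hypothetical fallback rule: accept the strictest QA level at which the
## cumulative number of observations reaches min_obs (33 comes from question 2).
select_qa <- function(nobs_by_qa, min_obs = 33) {
  cum <- cumsum(nobs_by_qa)                  # observations usable at QA <= level
  ok  <- which(cum >= min_obs)
  if (length(ok) == 0) return(NA_integer_)   # too few even at the loosest level
  as.integer(names(nobs_by_qa)[ok[1]])
}

## Per-pixel observation counts by QA level (made-up numbers)
a <- select_qa(c("0" = 40, "1" = 10, "2" = 0, "3" = 0))  # enough at QA=0
b <- select_qa(c("0" = 12, "1" = 30, "2" = 5, "3" = 0))  # needs QA<=1
d <- select_qa(c("0" = 2,  "1" = 3,  "2" = 1, "3" = 0))  # stays missing
c(a, b, d)
```

Under this reading, a pixel stays missing only when all four levels together provide fewer than 33 observations, which would be one explanation to check against the remaining gaps.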
126 |
|
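The naming request in question 3 can be illustrated with a short base-R sketch. The file names follow the two patterns quoted in that question; generating them with `sprintf` is an assumption about how the output names might be built.

```r
months <- 1:12

## Current pattern: month abbreviation plus unpadded number
unpadded <- sprintf("mean_LST_Day_1km_h08v05_%s_%d.tif",
                    tolower(month.abb[months]), months)

## Suggested pattern: zero-padded month number only
padded <- sprintf("mean_LST_Day_1km_h08v05_%02d.tif", months)

## Alphabetical order is chronological only for the padded names
identical(sort(padded), padded)
identical(sort(unpadded), unpadded)
```

With zero-padded names, `list.files()` would return the months in calendar order, and the month could be parsed without relying on the abbreviation.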
Updating LST missing data summary