Task #359
open
Test GAM method to produce Tmax for one day (OREGON): R script
0%
Description
This is a first attempt at producing predicted values from meteorological station data.
Two methods were tested in an R script: a linear regression and a GAM (generalized additive model) regression. The Oregon station data were used as input, with 30% of stations held out for validation. The results are assessed through diagnostic statistics (e.g. AIC and RMSE) and plots (e.g. residuals and prediction surfaces).
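The 30% hold-out described above can be sketched as follows. This is a minimal illustration in Python rather than the R script itself; the function name `split_stations`, the fixed seed and the integer station IDs are assumptions made for the example.

```python
import random


def split_stations(station_ids, holdout_fraction=0.3, seed=42):
    """Randomly split station IDs into (training, validation) lists.

    holdout_fraction is the share of stations held out for validation.
    The seed is fixed only so the split is reproducible (an assumption).
    """
    rng = random.Random(seed)
    shuffled = list(station_ids)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    # The first n_holdout shuffled stations are held out; the rest are used for fitting.
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Hypothetical example: 100 stations numbered 1..100.
train_ids, valid_ids = split_stations(range(1, 101))
```

The models would then be fitted on `train_ids` only, and the validation diagnostics computed on `valid_ids`.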
Updated by Benoit Parmentier almost 13 years ago
- Status changed from New to In Progress
- Assignee set to Benoit Parmentier
Updated by Benoit Parmentier over 12 years ago
Assessment of the statistical models for interpolation must rely on diagnostic statistics. This step will help in selecting the appropriate model for production of the dataset as well as for the future scientific publication. Eventually, we must choose which diagnostics to use for evaluating the GAM models. For the time being, I have been using the following methods:
RMSE, deviance, AIC, GCV scores, and plots. This may be a good topic to discuss during a phone meeting.
1) RMSE
The Root Mean Square Error is the square root of the mean of the squared residuals, i.e. the differences between the observed and predicted values at the validation stations. Smaller values indicate better predictions.
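As a small sketch of the computation (square root of the mean of the squared residuals), assuming the observed and predicted Tmax values at the validation stations are available as two lists of equal length. The function name and the sample values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    squared_residuals = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_residuals) / len(squared_residuals))


# Hypothetical Tmax values (degrees C) at four validation stations.
obs = [20.0, 22.0, 25.0, 18.0]
pred = [21.0, 20.0, 25.0, 20.0]
print(rmse(obs, pred))  # residuals -1, 2, 0, -2 -> sqrt(9 / 4) = 1.5
```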
2) Deviance and explained deviance
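Deviance in a GAM measures the difference between the best possible fit (the saturated model, with one parameter for each observation) and the current model. Models with smaller deviance fit the data more closely. Explained deviance is the proportion of the null deviance (the deviance of an intercept-only model) that the current model accounts for. R offers a way of comparing nested models based on deviance using the anova command.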
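To make the two quantities concrete, here is a sketch that assumes a Gaussian family with identity link, in which case the deviance reduces to the residual sum of squares. The family is not stated in this task, so that is an assumption, and the function names and sample values are illustrative only.

```python
def gaussian_deviance(observed, fitted):
    """Deviance for a Gaussian model: the residual sum of squares."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


def explained_deviance(observed, fitted):
    """Proportion of the null deviance accounted for by the model."""
    mean_obs = sum(observed) / len(observed)
    # The null model predicts the overall mean everywhere.
    null_dev = gaussian_deviance(observed, [mean_obs] * len(observed))
    return 1.0 - gaussian_deviance(observed, fitted) / null_dev


# Hypothetical Tmax observations and fitted values.
obs_tmax = [20.0, 22.0, 25.0, 18.0, 15.0]
fit_tmax = [19.0, 22.0, 24.0, 18.0, 17.0]
residual_dev = gaussian_deviance(obs_tmax, fit_tmax)  # 1 + 0 + 1 + 0 + 4 = 6
dev_explained = explained_deviance(obs_tmax, fit_tmax)  # 1 - 6/58
```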
3) AIC
The Akaike Information Criterion is based on the log-likelihood. It measures goodness of fit while penalizing the number of estimated parameters, so adding variables lowers the AIC only if the improvement in fit outweighs the penalty. When comparing models fitted to the same data, the smaller the AIC, the better.
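The trade-off between fit and number of parameters can be sketched with the standard definition AIC = 2k - 2 log(L). The Gaussian log-likelihood helper and the numbers below are assumptions for illustration. For a GAM, the parameter count k is typically replaced by the effective degrees of freedom.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximized Gaussian log-likelihood given the residual sum of squares."""
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical comparison on the same 5 observations: the more complex model
# fits slightly better (smaller RSS) but uses twice as many parameters.
aic_simple = aic(gaussian_log_likelihood(rss=6.0, n=5), n_params=3)
aic_complex = aic(gaussian_log_likelihood(rss=5.5, n=5), n_params=6)
preferred = "simple" if aic_simple < aic_complex else "complex"
```

Here the small gain in fit does not offset the penalty for the extra parameters, so the simpler model has the lower AIC.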
4) GCV score
Generalized Cross Validation (GCV) is a criterion used to select the smoothing parameters in the GAM fitting procedure. Smaller GCV scores indicate a better fit.
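For reference, a sketch of the usual GCV score for a Gaussian additive model, n * RSS / (n - edf)^2, where edf is the effective degrees of freedom of the fit. This form is an assumption about the score being reported here; the function name and numbers are illustrative.

```python
def gcv_score(rss, n, edf):
    """Generalized cross-validation score: n * RSS / (n - edf)^2.

    rss: residual sum of squares
    n:   number of observations
    edf: effective degrees of freedom used by the fitted smooths
    """
    if edf >= n:
        raise ValueError("effective degrees of freedom must be smaller than n")
    return n * rss / (n - edf) ** 2


# Hypothetical comparison: a wigglier fit lowers the RSS but uses more
# degrees of freedom, which can raise the GCV score.
gcv_smooth = gcv_score(rss=90.0, n=10, edf=4.0)   # 900 / 36 = 25.0
gcv_wiggly = gcv_score(rss=80.0, n=10, edf=6.0)   # 800 / 16 = 50.0
```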
5) GAM Plots
The 3D plots help to visualize the trends in the independent variables in relation to the dependent variable (Tmax). For instance, one expects Tmax to decrease with increasing latitude in Oregon.
6) Diagnostics from the literature
Find out which diagnostic measures are used in the PRISM, WorldClim and ANUSPLIN papers.