Robinson updates¶
2013-9-25 (date)¶
What I did this past week
- Worked on edits for the DEM manuscript, which received favorable reviews but needs major revision.
- Spent some time (over the last few months) converting a postgres script into Python. The script searches a database for species occurrence records that meet certain quality flags (e.g. country is specified, geocoordinates fall within the specified country, etc.). Got partway through this, but it is on the back burner for now until we decide whether we will ever need it.
- With a lot of help from Javier Otegui, created a table in Litoria that includes information for all vertebrate records in eBird and GBIF that are 'good quality'. This means that the record contains necessary information like collection location, ID, etc., and that the record falls within the known geographic range of the species. For records that meet these 'good quality' criteria, the table contains the number of records (for each species) from before 1970, 1970-1990, and 1990-2013. For those who are interested, it is on Litoria and is called 'nrob_inrangemap_byyear'.
What I'm working on now
- Edits to the DEM manuscript. Most pressingly, accuracy assessments of the DEM against the most ground-truthed (while still being trustworthy) elevation data that I can get... especially in the blend zone.
What obstacles are blocking progress
- None
What's next
- Get the MS resubmitted, and then???
Robinson updates¶
2013-7-27 (date)¶
What I did this past week
- The DEM serving page is now up: http://dem.earthenv.org. If folks want to play around with it, I would love feedback on any issues that arise (esp. across different browsers), and suggestions re: making it more functional (major improvements- like zooming on the graphic, etc.- aside).
- I am working with Rob's post-doc, Javier, to do basic quality checks on species occurrence tables. Javier wrote a postgreSQL script to do this for tables in the MOL database. These checks include making sure that geolocation data are present in each occurrence record and valid (e.g. lat is not < -90), that country information is present, and that the record falls within the stated country's border. I am converting Javier's script to Python, and 'genericizing' it a bit to work: a) on any table (not just MOL-specific tables), and b) across a variety of user inputs (e.g. multiple input types for "country" can be recognized and dealt with). A minimal sketch of these per-record checks is included below.
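A minimal Python sketch of the kind of per-record checks described above. The field names, and the use of shapely for the point-in-country test, are assumptions for illustration only; the real script works against the MOL tables in PostgreSQL.
<pre>
# Minimal sketch of per-record quality checks (hypothetical field names;
# the real script runs against MOL tables in PostgreSQL).
from shapely.geometry import Point

def check_record(rec, country_borders):
    """Return a list of failed checks for one occurrence record.

    rec: dict with 'lat', 'lon', 'country' keys (assumed names).
    country_borders: dict mapping country name -> shapely geometry.
    """
    problems = []
    lat, lon, country = rec.get('lat'), rec.get('lon'), rec.get('country')

    if lat is None or lon is None:
        problems.append('missing coordinates')
    elif not (-90 <= lat <= 90) or not (-180 <= lon <= 180):
        problems.append('coordinates out of range')

    if not country:
        problems.append('missing country')
    elif country in country_borders and lat is not None and lon is not None:
        # Point-in-polygon test against the stated country's border.
        if not Point(lon, lat).within(country_borders[country]):
            problems.append('point falls outside stated country')

    return problems
</pre>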
What I'm working on now
- The script mentioned above, which needs to be streamlined in order to process many records quickly.
What obstacles are blocking progress
- None
What's next
- Make sure the QC-check script processes records quickly, or rethink the approach if not.
- The next step for the occurrence tables work will be to see if an occurrence falls within a species' known distribution range (or if not, how far beyond the range was it found?). This will be the next script that I work on.
Robinson updates¶
2013-7-16 (date)¶
What I did this past week
- Compiled a DEM tile serving page for the earthenv.org website. Tiles can be downloaded via user inputs of coordinates as well as through a clickable map.
What I'm working on now
- General file/script clean-up on Atlas.
- Starting on the species occurrence record collection.
What obstacles are blocking progress
- None
Robinson updates¶
2013-6-04 (date)¶
What I did this past week
- Submitted MS, then resubmitted when I received notification that the editor of the initial journal did not think it was the correct venue. New journal: ISPRS J. of Photogrammetry and Remote Sensing
- Took a break for a bit after having worked a fair bit of overtime to get the submission out
What I'm working on now
- Starting DEM retrieval work. We need a VRT for easy retrieval, and I want to get a few search methods up and running on the website. These include a search-by-coordinates method, and a clickable map
- General file/script clean-up on Atlas.
What obstacles are blocking progress
- None
What's next
- Again- maybe work on creating a few tools that can be used directly through iPlant (e.g. build some gdal functionality into the website so that folks can do a bit of processing on/exploring of DEM tiles without needing a) their own tools, or b) knowledge of how to use things like gdal commands).
Robinson updates¶
2013-5-21 (date)¶
What I did this past week
- Finished the manuscript (it's in final review before we submit)
- Compressed and transferred EarthEnv-DEM90 tiles to iPlant (HUGE thanks to Martha, Nirav, Edwin, Rob and Nick Brand for facilitating that so quickly!)
What I'm working on now
- Last minute touch-ups to MS
- Getting the DEM dataset fully ready for official release (e.g. the correct metadata, VRT for easy retrieval, etc...)
What obstacles are blocking progress
- None
What's next
- Creating VRT and other ways by which a user can obtain individual EarthEnv-DEM90 tiles and/or large groups of tiles. We want to create a search feature that is very similar to that used by CGIAR-CSI for SRTM data, where users can search for tiles via a clickable map or by inputting coordinates.
- We've also discussed the possibility of creating a few tools that can be used directly through iPlant (e.g. build some gdal functionality into the website so that folks can do a bit of processing on/exploring of DEM tiles without needing a) their own tools, or b) knowledge of how to use things like gdal commands). I will explore that a bit too.
- General file/script clean-up on Atlas. I need to go through and get rid of old things that are obsolete, organize the scripts into something of a library (and get them into the GIT repository), etc. No reason that Jim should constantly have to clean up after me :)
Robinson updates¶
2013-4-22 (date)¶
What I did this past week
- Finished the global 90m DEM
- Finished a first draft of the manuscript for the 90m global DEM.
What I'm working on now
- Edits to the manuscript
What obstacles are blocking progress
- None
What's next
- Manuscript submission (target= May 1) and then on to the next project :)
Robinson updates¶
2013-3-11 (date)¶
What I did this past week¶
- Finished multiscale smoothing for 88% of the globe
- The other 12% had issues wherein voids appeared in the middle of large expanses of flat terrain in smoothed tiles. In these areas, the underlying terrain is completely flat, and the terrain variance is therefore 0. It seems that nulls are created by the smoothing script in the middle of large areas where these two conditions are met.
- I then tried to build a command into the smoothing script that would fill these voids with input DEM values (because the area is flat and variance is 0, no smoothing needs to be done there). I tried many things, and nothing worked!
- Finally, I got a void-replacement script working, though I was never able to integrate it into the multiscale smooth script. Instead, I had to run it as a post-smoothing step (and I still don't get why it wouldn't work within the smoothing script!). A one-line map-algebra sketch of this fill step appears after this list.
- The next issue had to do with remote access to the Atlas server (I think). For some reason, upon logging out of Atlas the smoothed DEMs within GRASS are no longer viable. If I start GRASS and try to display these grids, they come out as all null. And if I try to just run the above-mentioned void-fill command on these pre-existing grids, the output is all null. It is strange, and I don't know why it happens... but it means that in order to fill the voids I had to completely re-run the whole smoothing process on the tiles that contained voids (296 tiles).
- As of this morning, all tiles have been processed and the pieces of the smoothed DEM are all there.
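For reference, the post-smoothing void replacement mentioned above amounts to a single map-algebra step. A minimal sketch using GRASS's Python scripting library (map names are placeholders; the original run may have used raw r.mapcalc calls):
<pre>
# Fill nulls in the smoothed DEM with the corresponding input DEM values
# (placeholder map names).
import grass.script as gs

gs.mapcalc("dem_smoothed_filled = if(isnull(dem_smoothed), dem_input, dem_smoothed)",
           overwrite=True)
</pre>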
What I'm working on now¶
- Error checking on the smoothed tiles, especially those that required void-filling afterward
- Finding SRTM tiles that need extra corrections. Today, during one of the visual inspections mentioned above, I found a tile that had a 1 x 1 degree block of 0's in the middle of a flat plain of terrain at ~ 179m. I went back to the source SRTM tile and found that it was faulty. That tile (19_03, for those who are interested) contains nulls in that 1 x 1 degree block. When ocean was converted to 0 for SRTM tiles, this block of pixels was converted to 0's. I have no idea how extensive this issue is, but each tile where this sort of thing happens needs to be fixed (in whichever manner is required) and then reprocessed before inclusion in the final DEM. I will spend the next few days finding any such tiles
- Re-running blending from N59 to N60
- Finishing the first draft of the manuscript (almost done, though re: the issue above I'll need to add some things!)
What obstacles are blocking progress¶
- Nada
What's next¶
- Correcting SRTM tiles mentioned above, and finishing the DEM.
- Moving on to something else?
Robinson updates¶
2013-1-14 (date)¶
What I did this past week¶
- Corrected the multiscalesmooth script to avoid the creation of voids (replaced 0 with 0.000001 in one calculation, where 0 was winding up in a denominator)
- Tried to download and properly configure GRASS GIS on my laptop, so that I could carry out the smoothing procedure locally (thereby allowing me to create and store intermediate files without filling up the very space-limited Atlas server).
- Began scripting out the multiscale smoothing process for the entire globe. The process needs to be done in six different sections in order to a) treat the data from the two sources (ASTER and SRTM) separately (so that blending can be performed using smoothed data), and b) minimize any potential issues regarding edges of the datasets. The six sections are:
1) SRTM data from S55 to N55 (in 5 x 5 degree tiles)
2) SRTM data from N55 to N60 (in 5 x 5 degree tiles, and with the top 100 rows of pixels inverted and added to the top of the grid prior to processing, to minimize error associated with the edge of the grid)
3) SRTM data from S60 to S55 (in 5 x 5 degree tiles, and following the same procedure as above but with the bottom rows inverted and added)
4) Void-filled ASTER data from N60 to N80 (in 5 x 5 degree tiles)
5) ASTER data from N59 to N60 (in 1 x 5 degree tiles, and with the bottom rows inverted and added, this data having no voids to begin with)
6) Void-filled ASTER data from N80 to N83 (in 3 x 5 degree tiles, and with the top rows inverted and added)
- Before any of the multiscale smoothing can be undertaken, however, the ASTER and SRTM input data had/have to be mosaiced into 'full' grids (covering the entire global circumference and the full latitudinal range available for the dataset in question). Having this 'full' grid allows for each smaller tile (e.g. 5 x 5 degrees) to be extracted with a buffer of 100 extra pixels around the edges (except at the true grid edges, where the inverted rows are added as described above). These extra pixels allow for the edges to have inaccurate smoothed values (because the algorithm runs out of pixels on which to base the calculations) without degrading the quality of the final grid (because the edge pixels will be clipped off at the end). This will create a seamless final output. I have created the SRTM 'full grid' from SRTM data that a) have not been blended with ASTER data at the N59 to N60 zone (this was one of the initial steps previously), and b) have had ocean converted from null to 0 (this had been done on some of the data previously, but not all- because of the blending with ASTER that had been performed).
What I'm working on now¶
- Getting the script for the SRTM S55 to N55 section to run (see below)
- Putting the finishing touches on the scripts for the other sections mentioned above. These are mostly done, though I need to get the georeference details figured out so that I can convert to a numpy array, add inverted rows, convert back to raster, and save correctly georeferenced files for the sections that need inverted rows added (a rough sketch of this step is below).
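A rough sketch of that row-mirroring step, assuming placeholder file names, a single-band GeoTIFF, and a standard north-up geotransform:
<pre>
# Sketch: mirror the top 100 rows of a tile and prepend them, adjusting the
# geotransform so the output stays correctly georeferenced (file names are placeholders).
import numpy as np
from osgeo import gdal

N = 100
src = gdal.Open("srtm_tile.tif")
band = src.GetRasterBand(1)
arr = band.ReadAsArray()

# Flip the first N rows vertically and stack them above the original grid.
padded = np.vstack([arr[:N][::-1], arr])

gt = list(src.GetGeoTransform())
gt[3] += N * abs(gt[5])          # move the origin (top-left Y) north by N pixels

drv = gdal.GetDriverByName("GTiff")
dst = drv.Create("srtm_tile_padded.tif", src.RasterXSize, src.RasterYSize + N,
                 1, band.DataType)
dst.SetGeoTransform(gt)
dst.SetProjection(src.GetProjection())
dst.GetRasterBand(1).WriteArray(padded)
dst.FlushCache()
</pre>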
What obstacles are blocking progress¶
- I have been unable to get GRASS working on my Windows 7 machine, due to Python version conflicts (I run (and have to run) 2.6, GRASS comes bundled with 2.7, and I cannot seem to get the program configured to bypass 2.7 and work straight from 2.6!). This means that after I run each section of the smoothing procedure, I will need to transfer the files to my local HD to clear them from Atlas before I can run the next section (no idea how long these transfers will take).
- I am having trouble automating the multiscale smooth process at the 180 E/W meridian, where, in order to include the extra pixels around the edges, the sign of the bounding longitude changes. Example: for a tile running from -180 to -175, the extent that includes 100 extra pixels on each edge runs from 179.91667 to -174.91667. This is tripping up the gdalwarp command, which doesn't seem to read across the 180 E/W meridian correctly. I am trying to figure out how to overcome the issue created by the following '-te' input (which results in the error seen below; one possible split-and-stitch workaround is sketched after the error):
'-te' input:
179.916666667 44.9166666667 -174.916666667 50.0833333333
Result:
Creating output file that is -425799P x 6200L.
ERROR 1: Attempt to create -425799x6200 dataset is illegal,sizes must be larger than zero.
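One possible (untested) direction is to clip the two sides of the meridian separately and stitch the arrays, rather than asking gdalwarp to cross 180 E/W in a single call. A rough sketch with placeholder file names, using the extent above:
<pre>
# Rough sketch of a possible workaround: clip the two sides of the 180 E/W meridian
# separately, then stitch the arrays (file names and the final write-out are placeholders).
import numpy as np
from osgeo import gdal

src = "srtm_full_grid.tif"   # placeholder for the global mosaic
uly, lry = 50.0833333333, 44.9166666667

# West of the meridian (179.9167 to 180), then east of it (-180 to -174.9167).
west = gdal.Translate("/vsimem/west.tif", src, projWin=[179.916666667, uly, 180.0, lry])
east = gdal.Translate("/vsimem/east.tif", src, projWin=[-180.0, uly, -174.916666667, lry])

merged = np.hstack([west.ReadAsArray(), east.ReadAsArray()])
# 'merged' can then be written back out with an origin of 179.916666667
# (equivalently -180.083333333), as in the padding sketch above.
</pre>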
What's next¶
- Getting the smoothed tiles finished, and writing it up.
Robinson updates¶
2012-12-13 (date)¶
What I did this past week¶
- Continued exploring methods for creating a stdev grid by which to smooth the global DEM. Tested grid calculation through: 1) Numpy arrays, 2) GRASS, 3) ArcGIS. Tests have been completed over two 1 x 1 degree test regions, one over a highly terrain-variable area encompassing part of the Grand Canyon (where the global DEM is SRTM data only), and another over a 1 x 1 degree region in NW Canada (where the global DEM is a blend of ASTER and SRTM).
- Used these outputs to smooth the DEM's in both test regions, and compared the results obtained from the different methods
- Results are mixed. The main findings are:
1. Calculating a stdev grid through numpy arrays takes a very long time (> 1 hour for a 1 x 1 degree grid). This method has a positive aspect, however, in that the calculations are very high precision (probably a lot of why it takes so long). This becomes evident during the multiscale smoothing (see below).
2. Calculating a stdev grid using GRASS is very fast (~ 20 seconds for a 1 x 1 degree grid). This seems like the best way forward, at the moment, though more exploration of precision issues should be undertaken before the final decision (again, see below)
3. Calculating a stdev grid in ArcGIS is problematic. The issue is that one of the steps in the calculation calls for a neighborhood median to be calculated, but ArcGIS 10.0 will not calculate a neighborhood median for floating point grids. I have overcome this by multiplying and then dividing by 1000 (following median calculation) but at the expense of precision. I do not think that an Arc script is a good way forward.
4. Despite these differences, especially with the probable low quality of the ArcGIS grid, it seems that actual multiscale smoothing is almost immune to which grid is used for the process (at least for one test region- the Grand Canyon). This is encouraging, as the 'best' method may simply be that which is fastest.
5. Statement #4 is not entirely correct, though (!). In the NW Canada test region, multiscale smoothing using both the numpy- and GRASS-derived grids resulted in voids in the final output. These voids were quite a bit larger and more numerous in the GRASS-derived output, but still appeared in the output derived from using numpy. This gets back to the precision issue mentioned above, as it turns out that voids occur where pixels in the stdev grid = 0. Because the numpy script calculates the stdev grid with higher precision, there are fewer of these 0's and so fewer voids in the output... but they are still there and are unacceptable.
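For context, a generic fast way to compute a local standard deviation grid in Python uses running sums (var = E[x^2] - E[x]^2). This is not the same neighborhood definition as the aml-based grid (which uses circular neighborhoods and medians); the window size and zero floor below are placeholders:
<pre>
# Fast local standard deviation via running sums (a generic sketch, not the aml method).
import numpy as np
from scipy.ndimage import uniform_filter

def local_std(dem, size=5, eps=1e-6):
    dem = dem.astype(np.float64)                  # keep precision high
    mean = uniform_filter(dem, size=size)
    mean_sq = uniform_filter(dem * dem, size=size)
    var = np.maximum(mean_sq - mean * mean, 0.0)  # clip tiny negative round-off
    std = np.sqrt(var)
    return np.maximum(std, eps)                   # floor at eps so later divisions never hit 0
</pre>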
What I'm working on now¶
- Figuring out how to overcome the issue of voids in the smoothed DEM, so that final comparisons can be run and the global multiscale smoothing can finally (!) begin.
- My next step is to examine the multiscale smoothing algorithm carefully to see if there is a place where a bit of simple raster math (e.g. adding 1 to each pixel) can be used to overcome the void issue. Clearly this occurs because of 0's in the denominator, so I hope to be able to easily circumvent this step (without altering the end product, obviously).
What obstacles are blocking progress¶
- Not much
What's next¶
- Finishing this (seemingly never-ending) task so that I can move on and we can actually use this thing!
Robinson updates¶
2012-11-19 (date)¶
What I did these past weeks¶
- I was in Australia for 14 days, with no good internet connection- so progress was stalled for a bit!
- After that, I finished converting the aml code for calculating a smoothing stdev grid to Python
- Ran this Python script over a 3 x 3 region of SRTM data, covering the area of the Grand Canyon (for a highly terrain variable test region)
What I'm working on now¶
- I am about to use the stdev grid as an input for the multiscale smoothing algorithm, to smooth the test region
- The output will be compared to a couple of smoothing runs based on constant values
- In addition, I will run the aml code (hopefully!) to compare results derived from ArcGIS to those from open source software
What obstacles are blocking progress¶
- The first main issue is time consumption- the 3 x 3 stdev grid took >8 hours to be produced. It will not be feasible to do the entire globe at this rate. The first test run, however, was done using the default settings from the aml code, from which means and medians were calculated over circular neighborhoods with radii of 2 and 3. One way to cut the time taken to perform this would be to set these radii to be a bit larger- and I will probably compare a few stdev grids created using various settings (for both time to produce and output quality).
- The other main issue is with using John's aml script in Arc. I will need to run the script on my own laptop, which will be far slower at processing a raster of any considerable size (assuming there is enough memory to do so!). In addition, I am having a bit of trouble setting up an Arc tool through which to run the script, which is necessary to get the old language to run in the newer version of the software. I'll keep working on it though!
What's next¶
- Smoothing tests and a decision on how best to proceed
- Deciding how to break up the global data for processing on multiple CPU's
Robinson updates¶
2012-10-26 (date)¶
What I did these past weeks¶
- Wrote a first draft of the manuscript to accompany the global 90m DEM, through the post-processing (smoothing, metadata, etc.) section. This is targeted toward Methods in Ecology and Evolution.
- Began converting John Gallant's aml code for detecting an appropriate smoothing sd to Python. This code produces a grid of 'noise' sd's that are based on local means and variances in neighborhoods around pixels. This grid can then be used for smoothing an input DEM, or smoothing can be carried out using a constant sd. I will be testing both methods (see below) to see which is more appropriate.
- Heard back from John, who recommended smoothing all sections of the global grid (inc. SRTM, which has undergone some smoothing already- by JPL). The workflow here will be: 1) Smooth SRTM, 2) Smooth filled ASTER GDEM2, 3) Blend smoothed SRTM/GDEM2. This will require redoing the blending work.
What I'm working on now
- Continuing to convert code
- Cleaning up manuscript a bit
What obstacles are blocking progress¶
- Not much here- I am unfamiliar with aml so that is taking a bit of time but isn't terrible
What's next¶
- Testing smoothing via best detected sd vs. constant sd.
1. Select a few test regions (e.g. flat vs. variable terrain areas)
2. Implement sd detection code
3. Smooth test areas based on detected sd grid as well as an assortment of (sensible) constant sd's
4. Compare grids (visually, statistically using variances, etc.)
In John's aml code, there is a comment that the method does not fare well in highly variable terrain (it picks up real variation as noise and results in over-smoothing), and that this might be improved by using a fitted polynomial in the first step of the calculation, as opposed to a mean (as it is now). I may look into altering the script to include a fitted polynomial calculation in this early step.
Robinson updates¶
2012-10-08 (date)¶
What I did this past week (no, month)¶
- After testing the DSF method over a small section of Canada, it was found to outperform interpolation alone (despite the questionable quality of GLSDEM). Thus I ran DSF over the entire ASTER GDEM2 portion of the global DEM. This took ~ 2 weeks to complete on Atlas
- I also figured out a way to convert ocean from null to 0 in the SRTM portion of the global DEM, without messing with land masks (and potential alignment issues, etc.). This was a two-step process in which I 1) ran a gdalwarp to reassign the null value from '-9999' to 0 (which converted all '-9999' pixels to 0), then 2) ran r.mapcalc in GRASS to reclassify 0 back to 0 (instead of null) and assign '-32768' as the null value (a GDAL-only sketch of the same idea appears after this list). This was done based on a major assumption: that SRTM contains no voids over land (it is not supposed to...).
- Mosaiced the complete 90m DEM into a single file, which is available on Atlas for error checking by all
- Began thinking about how to break up and store the global file for end-use distribution (size, compression, etc.). We are currently thinking about 5 x 5 degree tiles and a simple compression format (e.g. zip or gzip)
- Contacted John Gallant re: implementation of his smoothing algorithm, specifically focusing on how to set the error standard deviation, whether tree canopy heights need to be removed prior to smoothing, and whether drainage enforcement needs to occur afterward.
- Documented the locations of all void-filled pixels in text files (with file names corresponding to the tiles whose void-filled pixel locations they contain)
- Began writing a methods paper describing this work
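For reference, a GDAL-only sketch of the null-to-zero conversion described above (file names are placeholders; the actual run used the gdalwarp command line plus r.mapcalc in GRASS):
<pre>
# Sketch of the null-to-zero idea in pure GDAL (placeholder file names).
from osgeo import gdal

# Step 1: rewrite the -9999 nodata pixels (ocean) as 0.
gdal.Warp("srtm_ocean0.tif", "srtm.tif", srcNodata=-9999, dstNodata=0)

# Step 2: declare -32768 as the nodata value so that 0 is treated as real elevation.
gdal.Translate("srtm_final.tif", "srtm_ocean0.tif", noData=-32768)
</pre>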
What I'm working on now¶
- I have written an intro and some methods for the manuscript, and upon doing so have found out that the SRTM portion of our DEM has already undergone some smoothing and drainage enforcement (where voids were filled). We are exploring this further, as it may mean that we do not run the smoothing algorithm over the SRTM portion of the grid (we don't want to smooth a smoothed DEM!).
- continuing to write
- Smoothing (when we hear back from John and get some input)
What obstacles are blocking progress¶
- Just that John hasn't had time to tackle our questions yet.
What's next¶
- Smoothing whichever part of the DEM we decide to smooth (and potentially removing tree canopy height first, with drainage enforcement afterward)
- Creating a binary grid that will accompany the global DEM and that spatially identifies all pixels that have been void-filled (this will be done after we decide the sizes of tiles for distribution)
- Finishing the manuscript
- POSSIBLY some quality control comparisons involving flow calculations
Robinson updates¶
2012-08-30 (date)¶
What I did this past week (no, month)¶
- Felt bad about myself because I was supposed to do this update at least 3 weeks ago (D'oh!)
- Plugged along at nailing down "best" parameters for interpolation during void-filling over the Aster GDEM2 portion of the global dataset. As a reminder- this involved burning voids into test datasets, estimating the values within these voids using various combinations of tension and smoothing parameters for spline interpolation, and comparing the estimated values to real values underlying the "burned in" voids. Tests have included:
- Over the regions of E. Russia, N. Central Canada, and NW Canada I have tested 92 tension/smoothing parameter combinations per region, plus an additional few combinations over E. Russia. All told, ~300 tests have been performed to find the best parameters
- The best tension/smoothing parameters found above (135/0.6) and another comparison combination (105/0.6) were also tested over regions in Africa, China and S. America
- Downloaded and processed a new dataset for use in the delta surface fill method of void-filling
- The GLSDEM dataset is available over the globe at 90m (no other DEM is available at this high a resolution). This provides a baseline dataset for use in the delta surface fill method, though GLSDEM has some notable issues (more on that in the "What I am working on now" section).
- The repository for GLSDEM data is set up so that to get a single 1 degree by 1 degree tile you must first navigate through 3 parent directories. I needed > 5000 tiles to cover N60 and above, so I first had to learn how to use Python's ftplib to automate tile downloads
- After downloading the tiles I discovered that some of them have inconsistencies in their extents/resolutions. The origins of most tiles are X.000/Y.000, with resolutions of 3 arc-seconds. A number of tiles, however, have 1/2 pixel offsets (i.e. the origins fall on X.001/Y.999) and resolutions that are very slightly greater than 3 arc-seconds. SOOOO- I ran each tile through a gdalwarp that adjusted and clipped offending tiles to fall on X.000/Y.000 and have exactly 3 arc-second resolution (a minimal sketch of this re-gridding step appears after this list).
- Finally, these corrected tiles were mosaiced into extents that a) are reasonable to run the delta surface fill method on, and b) match corresponding tiles (with voids to be filled) from the global dataset
- Set up the script to run the DSF method and fill voids in the global dataset
- Created a rough draft of a workflow for the steps involved in creating the 90m DEM. This has most steps but needs to be converted to a higher-level workflow given the complexity of each step
- Explored some options for weather stations that might be of use in validating Benoit and Adam's climate models in the Oregon test region
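A minimal sketch of the GLSDEM re-gridding step mentioned above, using GDAL's Python bindings rather than the gdalwarp command line (the file names, the example tile origin, and the choice of bilinear resampling are assumptions):
<pre>
# Sketch: snap an offset GLSDEM tile back onto whole-degree origins with an exact
# 3 arc-second resolution (placeholder file names; bounds for one 1 x 1 degree tile).
from osgeo import gdal

res = 1.0 / 1200.0          # 3 arc-seconds in degrees
xmin, ymin = -120.0, 60.0   # intended whole-degree origin of this tile (example values)

gdal.Warp("glsdem_tile_fixed.tif", "glsdem_tile_raw.tif",
          outputBounds=(xmin, ymin, xmin + 1.0, ymin + 1.0),
          xRes=res, yRes=res, resampleAlg="bilinear")
</pre>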
What I'm working on now¶
- I am running a test of the DSF method, over N. Central Canada, to see if it really performs better than spline interpolation alone. The reasons for this are as follows:
- Given the high resolution of these data, the delta surface fill method takes a long time to perform over each tile. This is due to a) converting rasters to arrays in order to perform operations on them, and b) the time it takes to estimate values in void areas through spline interpolation. IF the DSF method performs no better than interpolation alone at estimating values in "void" areas, then it is not worth the extra time it takes to run it.
- I would not be concerned about the performance of the DSF method if I were more confident in the quality of the baseline data (GLSDEM). The issue alluded to above has to do with the fact that the GLSDEM dataset is based on a compilation of various publicly available DEM's, and high resolution DEM data were unavailable at high latitudes (where we need good data to fill voids!). The folks that created GLSDEM decided to handle this by resampling 1km resolution data to 90m resolution where 1km was the best available (quite a large chunk of land in the higher latitudes). This raises major red flags, for me, about the quality of the data at high latitudes, and is another reason that a test run to assess the performance of void-filling using DSF versus straight interpolation is warranted.
- When I have time, I will also be checking for the availability of climate data near Beijing (as the next proposed test area for the climate models). There are quite a few Chinese stations on the GHCN, but I have not yet georeferenced them to see how many are in the area of Beijing.
What obstacles are blocking progress¶
- Mostly just computing time- 90m resolution datasets are a bit of a bear to work with, even when they are chopped into smaller pieces
What's next¶
- The finishing steps will include:
- Smoothing the DEM using John Gallant's multiscale smoothing algorithm
- Running simple terrain analyses as quality assessment
- Creating metadata to accompany the global DEM, most importantly so that void-filled cells are documented and traceable
- Deciding how to break up and store the data for eventual distribution
- Writing it all up
Robinson updates¶
2012-07-23 (date)¶
What I did this past week¶
- Found a problem with the spline parameter tests that required re-running the graphical and statistical analyses:
- The GRASS program r.fillnulls (and potentially other GRASS interpolation programs as well?) adds one column and one row of cells to output grids. This was not mentioned in the r.fillnulls documentation, or anywhere else that I have been able to find.
- By not knowing of this row/column addition, I was comparing real versus interpolated cell values of non-corresponding cells (the interpolated value was for a cell that was one row and one column off).
- After finding the problem, I clipped the interpolated grids back to the original study extents, and re-ran comparisons.
- Found that the previously tested tension levels were no longer giving satisfactory results in spline parameter tests
- RMSE and correlation values were still improving as the tension level increased up to the level at which I had previously stopped (80).
- This prompted me to continue increasing the tension level, so that I could obtain additional statistical outputs by which to assess the best tension level. I have reached a tension level of 120 so far, and some of the statistical outputs are continuing to improve (though not all are behaving this way).
- Added another test region to the spline parameter tests
- After finding that the tension level tests previously performed were not as promising as they had appeared to be, I was nervous about choosing spline parameters based on only two test regions (E. Russia and N. Central Canada). Sooo... I added the other region (NW Canada) to the analysis, so that the final results would be based on more than two test datasets.
- Spline interpolation performs significantly worse on the new test region (e.g. RMSE's are an order of magnitude higher).
- All of this gives me little confidence in choosing parameters by which spline interpolation can be successfully run over the global grid (or even the Aster GDEM2 portion of it).
- Found an alternative interpolation method that I would like to try out:
- This uses the r.resamp.bspline package in GRASS, and runs using bilinear resampling (r.fillnulls uses regularized resampling as the default)
- The package has a flag that allows for the interpolation algorithm to be run only where nulls are encountered (as opposed to over the whole grid, as in r.fillnulls). This will speed up the calculations.
- In r.resamp.bspline, I can set not only the smoothing parameter, but also the number of pixels used for interpolation, in both the ns and ew directions. This gives me more control and might potentially lead to finding a better method.
- r.resamp.bspline has an optional cross-validation analysis that helps to determine the optimal smoothing parameter over the grid (rather handy). This analysis reports the mean and rms of residuals from real compared to estimated values for a user-defined number of points, and might allow me to compare best parameters for different regions and decide what might be the best for the entire globe (based on differences among test regions).
- Spent a bit of time searching for additional weather stations to help fill the gaps in the mountain ranges of Oregon.
- Spent some time familiarizing myself with GIT and talking with Jim about using this system. Will upload some script as soon as the aforementioned problem is ironed out (and so the script is in better shape).
What I'm working on now
- Finishing checking out the numbers on the initial spline results, and potentially adding a few new tension levels to see if the statistics stop improving
- Running the new interpolation analysis
- Comparing all results so that a sound method by which to move forward can be chosen (at long last).
- Compiling a workflow of work done to date
What obstacles are blocking progress¶
- None at the moment
What's next¶
- Complete void-filling over the entire globe
- Run multi-scale smoothing over the entire globe
- Calculate terrain/hydro variables from the finished 90m DEM (at appropriate resolutions) for comparison to those calculated from other global DEM's (e.g. GMTED2010).
Robinson updates¶
2012-07-03 (date)¶
What I did this past week¶
- Identified what appears to be the best tension level for spline interpolation in the Delta Surface Fill method of void-filling
- Ran the analyses listed below on multiple tension levels for E. Russia and North-central Canada, to compare a terrain-variable and flat region, until the best 6 tensions were identified. Compared these.
- Graphical analyses: histograms, correlation plots, line graphs of real vs. spline interpolation derived DEM values.
- Statistical analyses: correlation, the Kolmogorov-Smirnov test, RMSE (a minimal sketch of these comparisons appears after this list)
- Used the best tension level in DSF for E. Russia
- Compared estimated (through spline interpolation alone and DSF) and GMTED2010 values across multiple voids
- Put all methods, graphs, results, and additional comments/concerns into a detailed report. Note that this is a draft, so not polished. Questions/comments are most welcome, and let me know if you want it by e-mail instead of a Google doc
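A generic sketch of those real-vs-interpolated comparisons over the test voids (array names are placeholders; in practice the grids would be read with GDAL first):
<pre>
# Generic sketch of the real-vs-interpolated comparisons (array names are placeholders).
import numpy as np
from scipy.stats import ks_2samp

def compare(real, interp, void_mask):
    """Compare interpolated to true values over the burned-in ('fake') void cells."""
    r = real[void_mask].astype(float)
    i = interp[void_mask].astype(float)
    rmse = np.sqrt(np.mean((r - i) ** 2))
    corr = np.corrcoef(r, i)[0, 1]
    ks_stat, ks_p = ks_2samp(r, i)
    return {"rmse": rmse, "correlation": corr, "ks_stat": ks_stat, "ks_p": ks_p}
</pre>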
What I'm working on now¶
- Figuring out how to complete DSF over global DEM. This will require:
- Deciding on the area over which to run the method (e.g. 1 degree by 1 degree tiles? 5 degree by 5 degree tile aggregates?)
- Deciding how to deal with regions that contain ocean. When the 90m DEM was constructed, all areas with missing data (where there was only ocean and so no DEM tile) were seeded with the null value. If I break the global grid into 5 degree by 5 degree tiles (for example) to run void filling, a lot of this null-ocean will wind up in the tiles and create complications.
What obstacles are blocking progress¶
- None at the moment
What's next¶
- Finally upload some codes to GIT repository (D'oh!)
- Complete void-filling over the entire globe
- Run multi-scale smoothing over the entire globe
- Calculate terrain/hydro variables from the finished 90m DEM (at appropriate resolutions) for comparison to those calculated from other global DEM's (e.g. GMTED2010).
Robinson updates¶
2012-06-18 (date)¶
What I did this past week¶
- Began working on using the Delta Surface Fill (DSF) method for void-filling of the global dataset
- Compared GMTED2010 and the global dataset, over E. Russia, and created the Delta Surface
- Figured out how to fill the center of 11 x 11 pixel voids with mean from Delta Surface
- Began work on finding best parameters (tension and smoothing) for void-filling through spline, in a more systematic way than I was before
- Shifted voids 100 pixels in the x direction, thereby creating "fake voids" over existing data (a 100 pixel shift should get the largest voids far enough away from real voids to eliminate the influence of real voids on interpolation); a sketch of this step appears after this list
- Filled real voids w/neighboring pixel values to ensure that they do not influence interpolation
- Have run interpolation at five tension levels (5, 15, 20, 30, 40) and graphed some of the results (pdf or ppt to be uploaded)
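A minimal numpy sketch of the "fake void" construction described above (array and nodata names are placeholders; filling the real voids with neighboring values before interpolation is not shown):
<pre>
# Sketch of the 'fake void' construction: shift the real void mask 100 pixels in x
# and burn it into a copy of the data. Note that np.roll wraps at the array edge,
# so edge columns would be handled or clipped in practice.
import numpy as np

SHIFT = 100

def make_fake_voids(dem, nodata=-9999):
    real_voids = dem == nodata
    fake_voids = np.roll(real_voids, SHIFT, axis=1)   # shift the mask 100 px in x

    test = dem.copy()
    test[fake_voids] = nodata        # burn fake voids over known, real data
    return test, fake_voids          # fake_voids marks cells to compare after interpolation
</pre>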
What I'm working on now¶
- Comparing results to find the best parameter settings (tension and smoothing) for void-filling through spline interpolation
- Running correlations and histograms of real data vs. data found through interpolation at different tension levels
- Looking at graphical outputs of difference b/n real data and data found through interpolation (real data - interp) at different tension levels
- Statistical comparisons of means/medians of real data vs. data found through interpolation at different tension levels
- Doing similar comparisons to those above with various smoothing parameters on the best one or two tension levels
What obstacles are blocking progress¶
- None at the moment, I am just making my way through the steps
What's next¶
- Continue testing for best spline interpolation parameters
- Complete DSF method
- Compare results
- Finally upload some codes to GIT repository (D'oh!)
Robinson updates¶
2012-05-25 (date)¶
What I did this past week¶
- Checked into void-filling and smoothing for the global DEM
- Smoothing seems best accomplished using John Gallant's multiscale smoothing technique, but this does not especially handle void-filling (and voids are a big problem in some of our data)
- Tested a few different methods of filling voids in a section of the global dataset (over Eastern Russia) using spline interpolation at different tension levels (with the smoothing parameter at the automatic level of 0.1); a sketch of driving these runs from Python appears after this list
- Of note, spline is the method that was used for void-filling in the SRTM dataset where voids are "of medium size" and over "dissected" (i.e. non-flat) terrain
- See attached PPT (Void-filling Methods Comparison) for methods and preliminary results
- Checked out a second void-filling method, the Delta Surface Fill (DSF) method
- See http://www.asprs.org/a/publications/pers/2006journal/march/highlight2.pdf for detailed info
- Simple conceptual diagram in attached PPT (Void-filling Methods Comparison)
- This may not be applicable, as it is recommended for large voids (40 to 60 pixels wide/high, which is larger than any in the E. Russia dataset)
- Learned a lot about matplotlib and GRASS plugins in QGIS (r.fillnulls is available as a plugin but is not recognized from command line in either version of GRASS on Atlas (??))
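A rough sketch of driving those r.fillnulls runs from Python via GRASS's scripting library (map names and the candidate tension values are placeholders, and parameter names can differ between GRASS versions):
<pre>
# Sketch of looping r.fillnulls over candidate tension levels from Python
# (map names are placeholders; smooth is held at the 0.1 value mentioned above).
import grass.script as gs

for tension in (5, 15, 20, 30, 40):
    gs.run_command("r.fillnulls",
                   input="dem_with_voids",
                   output="dem_filled_t%d" % tension,
                   tension=tension, smooth=0.1,
                   overwrite=True)
</pre>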
What I'm working on now¶
- Trying out different tension and smoothing parameters (should I monkey with smoothing?) for spline interpolation, to find best void-filling method
- Figuring out how to run DSF
- Looking for possible elevational/geodetic observation data for the E. Russia area for use in comparisons of most accurate method (I've had a hard time finding such data so far)
- Trying to get better information on possible void-filling in the non-SRTM DTED level 1 product used in the compilation of the E. Russia area of the GMTED2010 dataset (don't want to use void-filled data to fill our voids!). This is turning out to be difficult to come by.
What obstacles are blocking progress¶
- I am still a bit fuzzy on finding the best tension and smoothing parameters for spline interpolation. B/c this is run using the GRASS plugin for QGIS, it's a bit of a pain to keep guessing and checking. In the broader context, doing such guess-and-check work for the global dataset (if spline is the best method) will be awful! Suggestions are appreciated!
- Figuring out how to implement DSF method. This may be available in GRASS or QGIS (as a plugin), otherwise it'll have to be scripted and this will take time (though is quite do-able).
- Travel- This is on a personal note, but I will be out of the country and have limited internet access... so progress will be slowed for a bit
What's next¶
- Try out more tension levels, and possibly smoothing factors, to find best interpolation
- Try out other interpolation techniques?
- Run DSF method
- Compare void-filled rasters to each other, and to observation point data (?)
- Run multiscale smoothing on best void-filled dataset
Robinson updates¶
2012-05-16 (date)¶
What I did this past week¶
- Completed comparative analyses of global DEM and GMTED2010 DEM (attached PPT)
- Created elevational difference rasters (Global DEM - GMTED2010), to help visualize patterns of discrepancies between the datasets (attached PPT); a minimal sketch of this masked difference appears at the end of this section
- Null values (many exist in E. Russia) were masked: If a pixel value was null in either dataset, the difference value for that pixel was automatically set to 0.
- Because of fusion around N60, global DEM is expected to have lower elevation values than GMTED2010 from ~ N60 (+/- 10km from border) through N83
- Cleaned up script for easy future implementation (e.g. created functions for most procedures, etc.)
- Comparative results show that:
- The DEM's are approximately the same for the Oregon region (expected)
- There are considerable differences between elevation values for the areas in the fusion zone
- These differences are more pronounced for regions where terrain is highly variable
- There is little difference between DEM's for Oregon
- There is a distinct signal of major differences between the grids at N60 and up (latitudinally). This is especially visible in North Central Canada and Eastern Russia, and is probably a byproduct of the lack of SRTM/ASTER fusion in the GMTED dataset.
- There is a fair amount of noise in the difference values between datasets, leading back to the current discussion regarding the need to apply smoothing techniques to the global DEM input data.
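For reference, a minimal numpy sketch of the masked difference raster described above (array names and the nodata value are placeholders; in practice the rasters would be read with GDAL first):
<pre>
# Sketch of the masked difference raster: Global DEM minus GMTED2010, with cells that
# are null in either dataset set to 0, as described above.
import numpy as np

def difference_raster(global_dem, gmted, nodata=-9999):
    null_mask = (global_dem == nodata) | (gmted == nodata)
    diff = global_dem.astype(np.float64) - gmted.astype(np.float64)
    diff[null_mask] = 0.0            # null in either input -> difference set to 0
    return diff
</pre>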
What I'm working on now¶
- Working out null-filling techniques. This will need to be done for the entire global DEM at some point, as ASTER GDEM2 has significant numbers of pixels with null values.
- Option 1: Get values from set of nearest neighbors, average these and fill null pixels with average
- Option 2: Use GMTED products (250m, 500m, 1km available) to fill null pixels
- These options, or something like them, will be compared and the best method chosen
What obstacles are blocking progress¶
- What is the best way to fill null pixels?? I am starting to play around, but other ideas than those above would be most appreciated!
- I have noticed that the GMTED2010 datasets for our regions of interest contain no elevation values below 0, while ours do contain such values. I need to look at the documentation for GMTED2010 to see how negative values were corrected, and think about doing the same thing/something similar for the global DEM
- It seems that some sort of smoothing technique is necessary, for the global DEM, before final comparisons can be done
What's next¶
- Trying different null-filling techniques and comparing results
- Assessing negative elevation value correction
Robinson updates¶
2012-05-04 (date)¶
What I did this past week¶
- Comparative analyses of global DEM and GMTED2010 DEM. Areas of interest:
- Oregon (should be the same b/n datasets)
- NW and North-central Canada over N60 fusion zone (expect differences b/n datasets)
- Russia over N60 fusion zone (?)
What I'm working on now¶
- Finishing north-central Canada comparison
- Identifying possible Russia comparison zone- will possibly need to choose zone and remove null values for comparisons (see below)
What obstacles are blocking progress¶
- Appearance of null data (striping effect) in the Russia region of the global DEM is significant (carried over from Aster GDEM2). This will cause false differences in the comparative analysis.
What's next¶
- Identify locations of anomalous elevation differences found between cells in current comparisons, and assess why these occur
- Consider smoothing techniques to be applied to global DEM, to remove null data problem from Aster portion of global DEM
- Compare Topographic Wetness Indices, calculated from each dataset, for regions of interest- document error propagation as data are used for more complex analyses