Bulk downloading and analysing MODIS data in R

Today we are gonna work with bulk downloaded data from MODIS. The MODIS satellites Terra and Aqua have been floating around the earth since the year 2000 and provide a reliable and free-to-obtain source of remote sensing data for ecologists and conservationists. Among the many MODIS products, Indices of Vegetation Greenness such as NDVI or EVI have been used in countless ecological studies and are the first choice of most ecologists for relating field data to remote sensing data. Here we gonna demonstrate 3 different ways how to acquire and calculate the mean EVI for a given Region and time-period. For this we are using the MOD13Q1 product. We focus on the area a little bit south of Mount Kilimanjaro and the temporal span around May 2014.

(1)
The first handy R-package we gonna use is MODISTools. It is able to download spatial-temporal data via a simple subset command. The nice advantage of using *MODISTools* is that it only downloads the point of interest as the subsetting and processing is done server-wise. This makes it excellent to download only what you need and reduce the amount of download data. A disadvantage is that it queries a perl script on the daac.ornl.gov server, which is often horrible slow and stretched to full capacity almost every second day.

# Using the MODISTools Package
library(MODISTools)
# MODISTools requires you to make a query data.frame
coord <- c(-3.223774, 37.424605) # Coordinates south of Mount Kilimanjaro
product <- "MOD13Q1"
bands <- c("250m_16_days_EVI","250m_16_days_pixel_reliability") # What to query. You can get the names via GetBands
savedir <- "tmp/" # You can save the downloaded File in a specific folder
pixel <- c(0,0) # Get the central pixel only (0,0) or a quadratic tile around it
period <- data.frame(lat=coord[1],long=coord[2],start.date=2013,end.date=2014,id=1)
# To download the pixels
MODISSubsets(LoadDat = period,Products = product,Bands = bands,Size = pixel,SaveDir = "",StartDate = T)
MODISSummaries(LoadDat = period,FileSep = ",", Product = "MOD13Q1", Bands = "250m_16_days_EVI",ValidRange = c(-2000,10000), NoDataFill = -3000, ScaleFactor = 0.0001,StartDate = TRUE,Yield = T,Interpolate = T, QualityScreen = TRUE, QualityThreshold = 0,QualityBand = "250m_16_days_pixel_reliability")
# Finally read the output
read.table("MODIS_Summary_MOD13Q1_2014-08-10.csv",header = T,sep = ",")

The Mean EVI between the year 2013 up to today on the southern slope of Kilimanjaro was .403 (SD=.036). The Yield (integral of interpolated data between start and end date) is 0.05. MODISTools is a very much suitable for you if you want to get hundreds of individual point locations, but less suitable if you want to extract for instance values for a whole area (eg. a polygon shape) or are just annoyed by the frequent server breakdowns…

(2)
If you want to get whole MODIS tiles for your area of interest you have another option in R available. The MODIS package is particularly suited for this job in R and even has options to incorporate ArcGis as path for additional processing. It is also possible to use the MODIS Reprojection Tool or gdal (our choice) as underlying workhorse.

library(MODIS)
dates <- as.POSIXct( as.Date(c("01/05/2014","31/05/2014"),format = "%d/%m/%Y") )
dates2 <- transDate(dates[1],dates[2]) # Transform input dates from before
# The MODIS package allows you select tiles interactively.  We however select them manually here
h = "21"
v = "09"
runGdal(product=product,begin=dates2$beginDOY,end = dates2$endDOY,tileH = h,tileV = v,)
# Per Default the data will be stored in
# ~homefolder/MODIS_ARC/PROCESSED
# After download you can stack the processed TIFS
vi <- preStack(path="~/MODIS_ARC/PROCESSED/MOD13Q1.005_20140810192530/", pattern="*_EVI.tif$")
s <- stack(vi)
s <- s*0.0001 # Rescale the downloaded Files with the scaling factor

# And extract the mean value for our point from before.
# First Transform our coordinates from lat-long to to the MODIS sinus Projection
sp <- SpatialPoints(coords = cbind(coord[2],coord[1]),proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84"))
sp <- spTransform(sp,CRS(proj4string(s)))
extract(s,sp) # Extract the EVI values for the available two layers from the generated stack
#>  0.2432  | 0.3113

MODIS EVI Tile for East Africa

Whole MODIS EVI Tile (250m cs) for East Africa

 

(3)
If all the packages and tools so far did not work as expected, there is also an alternative to use a combination of R and Python to analyse your MODIS files or download your tiles manually. The awesome pyModis scripts allow you to download directly from a USGS server, which at least for me was almost always faster than the LP-DAAC connection the two other solutions before used. However up so far both the MODISTools and MODIS package have processed the original MODIS tiles (which ship in hdf4 container format) for you. Using this solution you have to access the hdf4 files yourself and extract the layers you are interested in.
Here is an example how to download a whole hdf4 container using the modis_download.py script Just query modis_download.py -h if you are unsure about the command line parameters.

# Custom command. You could build your own using a loop and paste
f <- paste0("modis_download.py -r -p MOD13Q1.005 -t ",tile," -e ","2014-05-01"," -f ","2014-05-30"," tmp/","h21v09")
# Call the python script
system(f)
# Then go into your defined download folder (tmp/ in our case)

# Now we are dealing with the original hdf4 files.
# The MODIS package in R has some processing options to get a hdf4 layer into a raster object using gdal as a backbone

library(MODIS)
lay <- "MOD13Q1.A2014145.h21v09.005.2014162035037.hdf" # One of the raw 16days MODIS tiles we downloaded
mod <- getSds(lay,method="gdal") # Get the layer names and gdal links
# Load the layers from the hdf file. Directly apply scaling factor
ndvi <- raster(mod$SDS4gdal[1]) * 0.0001
evi <- raster(mod$SDS4gdal[2]) * 0.0001
reliability <- raster(mod$SDS4gdal[12])
s <- stack(ndvi,evi,reliability)

#Now extract the coordinates values
extract(s,sp)

There you go. Everything presented here was executed on a Linux Debian machine and I have no clue if it works for you Windows or MAC users. Try it out. Hope everyone got some inspiration how to process MODIS data in R ;)

 

Curlew is now on Twitter

Good news everyone (or bad). I have now officially joined Twitter. Never really understood the hype so far and I always believed that their website is just filled with cute cat gifs and pictures of peoples last dinner. However in the past month I increasingly caught myself reading a lot of interesting twitter discussions of some people. There is a lot of potential for collaborations and good advice similar to the Q&A websites, where I am already pretty active. So here I am. Will try to set it up properly later at home, tweet some news, follow some people. Social media doesn’t count as worktime (yet).

EDIT:

What Irony that this is actually my 5oth blog-post since I started this blog :)

 

It is conference-summer – BES-TEG in YORK 2014

The Sun is shining, birds are singing and most scientists have nothing better to do than jetting around the globe to attend conferences. Yes, it is summer indeed. Here is some advertising for the 2014 Tropical Ecology – Early Carer Meeting of the British Ecological Society. This year the fun is happening in York. As many people seem to be on vacation the deadline for abstracts has been extended to the 14th of July. See the attached Documents ( BES-TEG 2014 Flyer ,BES_TEG Key speakers ) for more information. I will be there as well… .


Dear All,

The British Ecology Society – Tropical Ecology Group (BES-TEG) are organizing a 7th early career meeting scheduled to take place at the University of York, on the 14th and 15th of August 2014. Day one will focus on Ecology and Ecosystem Processes while day two will focus on Practical Applications and Links to Policy; such as conservation, livelihood, policy and development. All early-career researchers, both PhD and Post-Docs, are welcome to present their tropical ecology related research as a poster and/or oral presentation. The deadline for abstract submission has been extended to Monday the 14th of July 2014.

Please find attached the flyer and conference document for detailed information.

It would be appreciated if this flyer circulates within your department. Students are encouraged to come to York for what should be a really interesting few days in August.

An event website has been set up for registration

http://www.eventbrite.co.uk/o/british-ecological-society-tropical-ecology-group-bes-teg-6217007537

Annual Forest Loss – A comparison between global forest loss datasets

The start of this year was marked by the publication of two new global datasets for environmental analysis. My impression is that both of those datasets will be of increasing importance in ecological analysis in the future (even though their value for conservation biology has been actively criticized, see Tropek et al. 2014). Thus there is a need to assess the accuracy of their forest loss detection over time and if they are consistent.

The first dataset is the already famous Global Forest Map published by Hansen et al. (2013) in Science end of last year. The temporal span of their dataset goes back from the year 2000 up to the year 2012 and by using only Landsat data in a temporal time-series analysis they got a pretty decent high-resolution land-cover product. Although the resolution of the Hansen dataset is great (30m global average coming from Landsat) Hansen et al. decided to only publish the year 2000 baseline with the forest cover. They provide us with aggregated loss, gain and loss per year layers though, but nevertheless the user has no option to reproduce a similar product for the year 2012.

The other dataset is the combined published result from a 4 year long monitoring by the japanese satellite ALOS-PALSAR. They decided to release a global forest cover map at a 50m spatial resolution, which in contrast to Hansen can be acquired for the whole time-frame of the ALOS-PALSAR mission. It thus has a temporal coverage of the whole globe from the year 2007 until 2010. The data can be acquired on their homepage after getting an account. The ALOS PALSAR data has a nice temporal span and can be downloaded for multiple years, thus in theory allowing to make temporal comparisons and predictions about future land-use trends. However I am a bit concerned about the accuracy of their classifications as I have found multiple errors already in the area I am working in.

Classification Errors with the ALOS PALSAR dataset. Suddenly there are huge waterbodies in the Savanna near Kilimanjaro

Classification Errors with the ALOS PALSAR dataset. Suddenly there are huge waterbodies (blue) in the Savanna near Kilimanjaro

Because I am interested in using the ALOS PALSAR dataset in my analysis (how often do you get a nice spatial-temporal dataset of forest cover) I made a comparison between the forest loss detected in my area of interest for both datasets. It should be noted that is a comparison between different satellite sensors as well and not only by classification algorithms. So we are not comparing products from the same data source.

So what is the plan for our comparison:

  • We downloaded the whole ALOS PALSAR layers for all years covered of the area around Kilimanjaro in northern Tanzania (N00, E035). We then extracted only the forest cover (Value == 1) and calculate the difference between years to acquire the forest loss for the year 2008,2009 and 2010 respectively.
  • From the Google Engine app we downloaded the “loss per year” dataset and cropped it to our area of interest. Furthermore we are only interested in the aggregated Forest loss in the years 2008, 2009 and 2010 which we have available in the ALOS PALSAR dataset. We furthermore resampled the Hansen dataset up to 50m to match up with the ALOS PALSAR resolution.

The Result:

I haven’t found a fancy way to display this simple comparison, so here comes just the result table. As predicted (if you look at it visually),the ALOS PALSAR algorithm overshots the amount of forest loss a lot.

year 2007-2008 2008-2009 2009-2010
Hansen Forest Loss cells 262 304 529
ALOS PALSAR Forest Loss cells 26995 24970 16297
Equal cells in both 17 30 131

Conclusion:

So which one is right? I personally trust Hansens data a lot more. Especially because I found them to be pretty consistent in my area of study. For me the ALOS PALSAR data is not useable yet until the authors have figured out ways to improve their classification. It can be concluded that users should not forget that those Forest Cover products are ultimately just the result of a big un-supervised algorithm who doesn’t discriminate between right and wrong. Without validation and careful consideration of the observer you might end up having wrong results.

Interesting Paper: Global warming and invertebrate colouring

Just now another very interesting paper has been published in Nature Communications, which was written by former colleagues of mine from the University of Marburg.

Global warming favours light-coloured insects in Europe

As we all know many insect species like butterflies, bees or dragonflies have their main activity pattern during the day due to their ectotherm thermoregulation. Body colour is an important aspect of this thermoregulation as darker ( more blackish) individuals usually heat up faster. Therefore darker insects have an advantage compared to brighter insects in cooler climates as they heat up more rapidly and can forage earlier. This pattern can be mapped on a larger scale using occurrence data and has been known as “thermal melanism hypothesis” in macroecology. The authors go a step further from here as they  not only display a new biogeographic pattern previously unknown to science ( colouring gradient of European dragonflies and butterflies from south to north), but they also demonstrate how this mechanistic link between a macroecological pattern and a functional trait can be used to forecast the effect of climate change on insects.

From Zeuss et al. (2014): Shift in colour value for (a) the raw data; (b) the phylogenetic component (P); and (c) the specific component (S). Red indicates an increase in colour lightness; blue indicates a decrease in colour lightness. The diameter of each dot indicates the extent of the shift (n=1,845). The distribution of the shifts shows for the specific component a clear trend towards higher (that is, lighter-coloured) values (peak of the distribution positive; zero indicated by black line). The phylogenetic component suggests that the shifts in colour lightness have a strong phylogenetic background leading to a complex geographic mosaic in the response of assemblages to climate change. The inserted histograms show the mean change in colour lightness calculated for 1,000 alternative phylogenetic trees and are positive throughout, indicating that uncertainties in the phylogenetic hypotheses are unlikely to affect our conclusion of a general shift towards lighter assemblages. The distributional information used in the analysis is often based on a large time span, that is, the distributional information published in 1988 summarizes data until that year using information even from the beginning of the twentieth century. Rugs at the abscissa indicate observed values.

Possible critics: A definite next step in the analysis would be to include real measurements of optical colour rather than RGB values of scanned pictures. The colour values used in this study were all derived from scientific taxonomic drawings of those insects and thus biased by subjectivity of the respective artist. Nevertheless this bias should be consistent (if the same artist has sketched the images) so it should not influence the colouring gradient.  It is also interesting to note that many insects ( I know this for instance from my work with bugs and hoverflies) can adapt their body colouring to their habitat or differ quite a lot within a population. Differing melanism in body color and wing colouration might be related to the climatic niche they occur in, but the insects themselves might also possess phenotypic plasticity to adapt for instance to different habitats and background (Hochkirch et al. 2008). This pattern certainly needs more investigations in the future.

The article has been published as open access paper, so give it a try ;)

  • Hochkirch, A., Deppermann, J. and Gröning, J. (2008), Phenotypic plasticity in insects: the effects of substrate color on the coloration of two ground-hopper species. Evolution & Development, 10: 350–359.

  • Zeuss, D. et al. (2014) Global warming favours light-coloured insects in Europe. Nat. Commun. 5:3874

Macroecology for QGIS, the new QSDM plugin

This is just a quick posting informing all the QGIS interested readers of this blog that I am about to release a new QGIS plugin. It’s name is QSDM (QGIS Species Distribution Modelling) and similar as with LecoS it is particular suited for the practicing ecologists out there. This time i had no plan and interest of coding a graphical interface and thus the whole plugin can only be executed from within the Processing Toolbox (QGIS version > 2.0 ). In my opinion this will be the future of most advanced QGIS plugins anyway.

So what is the idea? Basically QSDM is a plugin taking statistical models for species distribution modeling to QGIS. For now only the famous Maxent is enabled and working, but the ambitious plan is to enable other modeling techniques such as RandomForests and LogisticRegression as well if the user has the necessary libraries enabled.

You might ask what is the advantage of running Maxent from within QGIS? First, you can immediately see the output so it is nice for visual exploration. Second, the QSDM plugin helps you with the formating of your layers and occurrence files. For instance all input raster layers are automatically unified to a common resolution and exported as ESRI .asc files. You simply need to load in your layers and let the tool do the rest. For those of you who want more control (and I really insist that you want to), I also enabled functions to generate a custom parameter file for Maxent and enabled an option to start the Maxent GUI in a new process.

–> I recognize that the easiness of this tool might tempt more people to execute tools without really understanding what they do and how they work. Please be sure what you do and always (!!!) validate the outputs of the tools you use (this includes QSDM). For understanding Maxent parameters I highly recommend reading the attached literature list and this publication!

Other things i implemented in the initial release of QSDM

  • Create Species Richness grid
    • Creates a new raster containing Species Richness or Endemism of input occurence layer
  • Calculate Niche Overlap Statistics
    • Can calculate Schoener’s D or Warren’s I based on Hellinger distances for all input layers.
  • Range Shift
    • Shows the difference between two input prediction layers. For instance for current and likely future conditions.
  • Data Transformations
    • Makes quick transformations of input raster layers

More is planned, but this depends entirely on my inclination to do so, the time I have available and if it can be useful for my own research as well.

Please remember that the plugin is still experimental. So please don’t be angry if it doesn’t work for you. testing was conducted on QGIS 2.2 stable on my Debian Linux machine and it should hopefully work for Windows as well. But similar as with LecoS i have no opportunity to test the plugin on Mac OS based systems and I also don’t really intend to :-p. Sorry Apple.

Literature:

  • Steven J. Phillips, Robert P. Anderson and Robert E. Schapire, (2006) “Maximum entropy modeling of species geographic distributions” Ecological Modelling, Vol 190/3-4 pp 231-259
  • Steven J. Phillips and Miroslav Dudik, (2008) “Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation.” Ecography, Vol 31, pp 161-175
  • Jane Elith et al. (2011) “A statistical explanation of MaxEnt for ecologists” Diversity and Distributions, 17, 43–57 DOI: 10.1111/j.1472-4642.2010.00725.x

Interesting Paper: Current and future nature-based tourism in the Eastern Arc Mountains

Greetings from Moshi, Tanzania, where I am still busy with the fieldwork. Just want to point the dear readers to a new paper, that is currently in Press in “Ecosystem Services”. Titled “The current and future value of nature-based tourism in the Eastern Arc Mountains of Tanzania” the study analyses and gives estimates for the current and future economic value of nature-based tourism in Tanzania. The analysis is based on a dataset including the spatial location of lodgings and visitor estimates and provides predictive outlooks for two different land-use scenarios (no-change, hopeful-future). They conclude that eco-tourism in the Tanzanian EAM (Eastern Arc Mountains) can provide, among other values for ecosystem service, substantial revenue in the future if the management effectiveness of protected areas can be improved.

Bayliss, J., et al., The current and future value of nature-based tourism in the Eastern Arc Mountains
of Tanzania. Ecosystem Services (2014), http://dx.doi.org/10.1016/j.ecoser.2014.02.006i

Sadly they only included the EAM from Tanzania in their study and thus left out the Taita Hills. During my stay in Taita I observed multiple disturbances such as fuelwood extractions and larger loggings in the last forest patches of Taita. Having multiple endemic bird (Taita Thrush, Taita Apalis, …) and plant species (eg. Saintpaulia teitensis) the forests of the Taita hills surely can be considered a part of the renown EAM Biodiversity hotspot. But due to the high population density at Taita there should be more economic opportunities and initiatives, such as the Taita-Taveta Wildlife Forum has been promoting, to increase the support of both local people and government to protect these last forest patches and ensure future connectivity.

 

Fuelwood collection. Photographed near Ngangao Forest

Fuelwood collection. Photographed near Ngangao Forest

Recent loggings within parts of Vuria Forest.

Recent loggings within parts of Vuria Forest.

Anyway, lets hope that this study can back up some arguments in the science-policy dialog with decision makers in Tanzania and abroad.

Amy Whitehead's Research

the ecological musings of a conservation biologist

Michael McCarthy's Research

School of Botany, The University of Melbourne

The Rostrum

Ecology, Entomology, Statistics and Science Policy

marionpfeifer

Environmental Change - Understand, Predict, Adapt

Dynamic Ecology

Multa novit vulpes

The Molecular Ecologist

A blog about Conservation and Ecology and everything between

metvurst

METeorological Visualisation Utilities using R for Science and Teaching

A Birder´s Blog

"Everybody loves what they know"

BIOFRAG

A new metric to quantify biodiversity response to fragmentation

Trust Me, I'm a Geographer

Using Technology to Explore Our World

Duncan Golicher's weblog

Research, scripts and life in Chiapas

Daniel J. Hocking

Ecology, conservation biology, & statistical modeling for a changing world

Climate Change Ecology

The Science, Economics, and Politics of Climate Change

Jeff Ollerton's Biodiversity Blog

Blogging about biodiversity

Darren's Side Projects

Warning: self-taught coding ahead! Mostly Javascript (Google Maps API, Google Chart Tools, jQuery, etc.), some HTML5, maybe some Python and R here and there. I make no guarantees that the work on this page is complete, efficiently coded, relevant, unplagerized, or follows any resemblance of a set of best-practices. Also, most examples shown here work best in Google Chrome - again, no guarantees.

Bartomeus lab

Ecology, global change and pollinators

Jean-Pierre Rossi - Blog

A blog about forest insects, R, GRASS, LaTeX and the likes

Follow

Get every new post delivered to your Inbox.

Join 77 other followers