The Landsat mission is one of the most successful remote-sensing programs and has been running since the early 1970s. The most recent addition to the flock of Landsat satellites, mission number 8, has been supplying vast numbers of images to researchers, NGOs and governments for over two years now. Providing nearly 400 images daily (!), it has by now amassed an impressive dataset of over half a million individual scenes (N = 515243 as of 29/07/2015).
Landsat 8 scenes can be easily queried via a number of web interfaces, the oldest and most established being the USGS EarthExplorer, which also distributes other NASA remote-sensing products. ESA has started to mirror Landsat 8 data, and so has the great Libra website from developmentseed. With the Landsat 8 before/after tool from remotepixel.ca you can even make on-the-fly comparisons of imagery scenes. You might ask how some of those services are able to show you the number of images and the estimated cloud cover. This information is stored in the scene-list metadata file, which contains the identity, name, acquisition date and many other attributes of all Landsat 8 scenes since the start of the mission. In addition, Landsat 8 scenes come with a cloud-cover estimate (sadly only L8, but as far as I know the USGS is working on a post-creation measure for the previous satellites), which you can readily explore on a global scale. Here is some example code showcasing how to peek into this huge, ever-growing archive.
# Download the metadata file
l = "http://landsat.usgs.gov/metadata_service/bulk_metadata_files/LANDSAT_8.csv.gz"
download.file(l, destfile = basename(l))
# Now decompress (decompressFile comes from the R.utils package)
library(R.utils)
t = decompressFile(basename(l), temporary = TRUE, overwrite = TRUE, remove = FALSE, ext = "gz", FUN = gzfile)
Now you can read in the resulting csv. For speed I would recommend using the “data.table” package!
# Load data.table
if(!require(data.table)) install.packages("data.table"); library(data.table)
# Use fread to read in the csv
system.time( zz <- fread(t, header = TRUE) )
file.remove(t)
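With the table in memory you can quickly check the numbers quoted at the beginning. A small sketch, assuming the metadata file still has an "acquisitionDate" column (as it does at the time of writing):

# Total number of scenes in the archive
nrow(zz)
# Average number of new scenes per acquisition day
zz[, .N, by = acquisitionDate][, mean(N)]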
The metadata file contains quite a number of interesting fields to explore. For instance the "browseURL" column contains the full link to an online .jpg thumbnail, which is very useful for having a quick look at a scene.
require('jpeg')
l = "http://earthexplorer.usgs.gov/browse/landsat_8/2015/164/071/LC81640712015201LGN00.jpg"
download.file(l, basename(l), mode = "wb")
jpg = readJPEG("LC81640712015201LGN00.jpg") # read the file
res = dim(jpg)[1:2] # get the resolution (rows, columns)
plot(1, 1, xlim = c(1, res[2]), ylim = c(1, res[1]), asp = 1, type = 'n',
     xaxs = 'i', yaxs = 'i', xaxt = 'n', yaxt = 'n', xlab = '', ylab = '', bty = 'n')
rasterImage(jpg, 1, 1, res[2], res[1])
The "cloudCoverFull" column contains the average cloud cover for each scene, which is interesting to explore since the long-term average of measured cloud cover per region or country likely differs due to differences in altitude or precipitation. Here is a map showing the average cloud cover per individual scene since mission start:
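If you want to build something like this map yourself, here is a minimal sketch (not the exact code behind the figure), assuming the metadata table read in above has the columns "path", "row" and "cloudCoverFull":

# Long-term mean cloud cover per WRS-2 path/row scene
cc <- zz[, .(meanCloud = mean(cloudCoverFull, na.rm = TRUE)), by = .(path, row)]
# A very rough map: plot every scene as a tile on the path/row grid
library(ggplot2)
ggplot(cc, aes(x = path, y = row, fill = meanCloud)) +
  geom_tile() +
  scale_y_reverse() +
  labs(fill = "Mean cloud cover (%)")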
Clouds are a major source of annoyance for anyone who intends to measure vegetation cover or classify land cover. I might write another post later showcasing some examples of how to filter satellite data for clouds.
For quite some time ecological models have tried to incorporate both continuous and discrete characteristics of species. Newbold et al. (2013) demonstrated that functional traits affect the response of tropical bird species to land-use intensity: tropical forest specialist birds seem to decline globally in probability of presence and abundance in more intensively used forests. This pattern extends to many taxonomic groups, and the worldwide decline of "specialist species" has been noted before by Clavel et al. (2011).
But how do you acquire such data on habitat specialisation? Either you assemble your own exhaustive trait database, or you query information from one of the openly available data sources. One of these is the IUCN Red List, which not only has expert-validated data on a species' current threat status, but also on population size and habitat preferences. Here IUCN follows its own habitat classification scheme ( http://www.iucnredlist.org/technical-documents/classification-schemes/habitats-classification-scheme-ver3 ). The curious ecologist and conservationist should keep in mind, however, that not all species have been assessed by IUCN yet.
There are already a lot of scripts available on the net from which you can take inspiration on how to query the IUCN Red List (Kay Cichini from the biobucket explored this already in 2012). Even better: someone compiled a whole R package called letsR, full of web-scraping functions to access the IUCN Red List. Here is some example code for Perrin's Bushshrike, a tropical bird quite common in central Africa.
# Install package
install.packages("letsR")
library(letsR)
# Perrin's or Four-colored Bushshrike latin name
name <- 'Telophorus viridis'
# Query IUCN status
lets.iucn(name)
#>            Species        Family Status Criteria Population Description_Year
#> Telophorus viridis MALACONOTIDAE     LC              Stable             1817
#>                                                              Country
#> Angola, Congo, The Democratic Republic of the Congo, Gabon, Zambia

# Or you can query habitat information
lets.iucn.ha(name)
#>            Species Forest Savanna Shrubland Grassland Wetlands Rocky areas Caves and Subterranean Habitats
#> Telophorus viridis      1       1         1         0        0           0                               0
#> Desert Marine Neritic Marine Oceanic Marine Deep Ocean Floor Marine Intertidal Marine Coastal/Supratidal
#>      0              0              0                       0                 0                         0
#> Artificial/Terrestrial Artificial/Aquatic Introduced Vegetation Other Unknown
#>                      1                  0                     0     0       0
letsR also has other methods to work with the spatial data that IUCN provides ( http://www.iucnredlist.org/technical-documents/spatial-data ), so definitely take a look. It works by querying the IUCN Red List API for the species id (http://api.iucnredlist.org/go/Telophorus-viridis). Sadly the habitat function only returns whether a species is known to occur in a given habitat, not whether that habitat is of major importance for the species (for instance whether a species is known to be a "forest specialist"). Telophorus viridis, for example, also occurs in savannah and occasionally in artificial habitats like gardens ( http://www.iucnredlist.org/details/classify/22707695/0 ).
So I programmed my own function to assess whether forest habitat is of major importance to a given species. It takes an IUCN species id as input and returns either "Forest-specialist" if forest habitat is of major importance to the species, "Forest-associated" if the species is merely known to occur in forest, or "Other Habitats" if the species does not occur in forests at all. The function works by querying the IUCN Red List and breaking up the HTML structure at the points that indicate a new habitat type.
Find the function on gist.github (strangely, WordPress doesn't embed gists as promised).
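In case the gist doesn't render for you either, here is a stripped-down sketch of the idea (not the actual gist). It assumes the classify page lists every habitat entry together with a "Major Importance: Yes/No" field, so check the page source before relying on it.

# Simplified sketch of the approach; the habitat delimiter and the
# "Major Importance" field are assumptions about the current page layout
isForestSpecialistSketch <- function(id) {
  url <- paste0("http://www.iucnredlist.org/details/classify/", id, "/0")
  html <- paste(readLines(url, warn = FALSE), collapse = "\n")
  # Break the page up into chunks, one per habitat entry (assumed delimiter)
  chunks <- strsplit(html, "Habitat")[[1]]
  forest <- grep("Forest", chunks, value = TRUE)
  if (length(forest) == 0) return("Other Habitats")
  # Is any of the forest habitats flagged as of major importance?
  if (any(grepl("Major Importance:\\s*Yes", forest))) return("Forest-specialist")
  "Forest-associated"
}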
How does it work? You first enter the species' IUCN Red List id, which you can find in the URL after querying a given species name. Alternatively you could download the whole IUCN classification table and match your species name against it 😉 Find it here. Then simply execute the function as in the code below.
name = 'Telophorus viridis'
data <- read.csv('all.csv')
# This returns the species id
data$Red.List.Species.ID[which(data$Scientific.Name == name)]
#> 22707695
# Then simply run my function
isForestSpecialist(22707695)
#> 'Forest-specialist'
There are many interesting things to calculate in landscape ecology and its statistical metrics. However, many (if not the majority) of the published toolsets are not reproducible; their algorithm code is not published or open source. Obviously this makes implementing the underlying algorithms even harder for independent developers (scientists) who don't have the time to reproduce the work (not to mention the danger of making stupid mistakes, we are all human).
I recently found this new article in Methods in Ecology and Evolution by Etherington et al., who didn't present any novel techniques or methods, but instead provide a new Python library capable of generating Neutral Landscape Models (NLMs). NLMs are often used as null-model counterparts to real remote-sensing derived maps (land cover or altitude) to test the effect of landscape structure or heterogeneity on a species or community. Many NLM algorithms are based on clustering techniques, cellular automata or randomly distributed numbers in a given 2D space. There have been critical and considered voices stating that existing NLMs are often misused and that better null models are needed for specific hypotheses, such as a species' perception of landscape structure. Nevertheless, NLMs are still actively used and new papers are published with them.
The new library, called NLMpy, is open source and published under the MIT licence. Thus I can easily use it and integrate it into QGIS and its processing framework. NLMpy only depends on numpy and scipy and therefore doesn't add any other dependency to your Python setup if you are already able to run LecoS in your QGIS installation. The NLM functions are visible in the new LecoS 1.9.6 version, but only if you have NLMpy installed and available in your Python path; otherwise they won't show up! Please don't ask me how to install additional Python libraries on your machine, but rather consult Google or one of the Q&A sites. I installed it following the instructions on this page.
After you have installed it and upgraded your LecoS version within QGIS, you should be able to spot a new processing group and a number of new algorithms. Here are some screenshots that show the new algorithms and two NLMs that I calculated. The first one is based on a midpoint-displacement algorithm and could, for instance, be tested against an altitude raster layer (you would need to reclassify it to real altitude values first). The second one is meant to emulate a random classified land-cover map. Here I first calculated a proportional NLM using a random cluster nearest-neighbour algorithm. Then I used the library's reclassify function ("Classify proportional Raster") to convert the proportional values (range 0-1) into a map with exactly 6 different land-cover classes. Both null models look rather realistic, don't they 😉
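Outside of QGIS the reclassification step can be mimicked directly in R as well. This is just a sketch of the same idea (not the LecoS/NLMpy code): cut a proportional 0-1 raster at its quantiles so that each of the 6 classes covers roughly the same share of the landscape.

library(raster)
# Stand-in for a proportional NLM with values between 0 and 1
r <- raster(matrix(runif(100 * 100), nrow = 100))
# Cut the cell values at their quantiles to get 6 classes of roughly equal area
breaks <- quantile(values(r), probs = seq(0, 1, length.out = 7))
classes <- cut(r, breaks = breaks, include.lowest = TRUE)
plot(classes)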
This is a quick and dirty implementation, so errors may occur. You should use a metre-based projection as extent (such as UTM), as negative coordinates (common in degree-based projections like WGS84 latitude-longitude) sometimes result in strange error messages. You also have to manually change the CRS of the generated result to the one of your project, otherwise you likely won't see the result. Instead of the number of rows and columns as in the original implementation, the functions in LecoS are based on a selected extent and the desired output cellsize.
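To make the last point concrete, this is roughly how the raster dimensions fall out of an extent and a cellsize (a generic sketch with a hypothetical UTM extent, not the actual plugin code):

# Hypothetical 10 x 10 km UTM extent and a 250 m output cellsize
xmin <- 300000; xmax <- 310000
ymin <- 9650000; ymax <- 9660000
cellsize <- 250
nCol <- ceiling((xmax - xmin) / cellsize) # 40 columns
nRow <- ceiling((ymax - ymin) / cellsize) # 40 rows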
For more complex modelling tasks I would suggest using the library directly. To give you a good start, Etherington et al. also appended some example code and data in their article's supporting information. Furthermore, a few days ago they even published a Methods in Ecology and Evolution blog post with some YouTube video demonstrations of how to use the library. So it shouldn't be that hard, even for Python beginners. Or you could just use the processing interface within LecoS.
In any case, if you use the library in your research, I guess the authors would really appreciate it if you could cite them 🙂
- Etherington, T. R., Holland, E. P., O’Sullivan, D. (2014), NLMpy: a python software package for the creation of neutral landscape models within a general numerical framework. Methods in Ecology and Evolution. doi: 10.1111/2041-210X.12308
In addition I have temporarily removed LecoS's ability to calculate the mean patch distance metric due to some unknown errors in the calculation. I'm kinda stuck here, and anyone who can spot the (maybe obvious) bug gets a virtual hug from me!
Happy new year!
The PREDICTS database: a global database of how local terrestrial biodiversity responds to human impacts
New article in which I am also involved. I have told readers of this blog about the PREDICTS initiative before. The open-access article describing the current state of the database has just been released as an early-view article. So if you are curious about one of the biggest databases in the world investigating the impacts of anthropogenic pressures on biodiversity, please have a look. As we speak, the data is being used to define new quantitative indices of global biodiversity decline that are valid for multiple taxa (and not only vertebrates, as in the WWF Living Planet Index).
Today we are going to work with bulk-downloaded data from MODIS. The MODIS satellites Terra and Aqua have been orbiting the earth since the year 2000 and provide a reliable and free-to-obtain source of remote-sensing data for ecologists and conservationists. Among the many MODIS products, indices of vegetation greenness such as NDVI or EVI have been used in countless ecological studies and are the first choice of most ecologists for relating field data to remote-sensing data. Here we will demonstrate three different ways to acquire and calculate the mean EVI for a given region and time period, using the MOD13Q1 product. We focus on an area a little bit south of Mount Kilimanjaro and the temporal span around May 2014.
The first handy R package we will use is MODISTools. It is able to download spatio-temporal data via a simple subset command. The nice advantage of using MODISTools is that it downloads only the points of interest, as the subsetting and processing are done server-side. This makes it excellent for downloading only what you need and reducing the amount of downloaded data. A disadvantage is that it queries a perl script on the daac.ornl.gov server, which is often horribly slow and stretched to full capacity almost every second day.
# Using the MODISTools Package
library(MODISTools)
# MODISTools requires you to make a query data.frame
coord <- c(-3.223774, 37.424605) # Coordinates (lat, long) south of Mount Kilimanjaro
product <- "MOD13Q1"
bands <- c("250m_16_days_EVI","250m_16_days_pixel_reliability") # What to query. You can get the names via GetBands
savedir <- "tmp/" # You can save the downloaded files in a specific folder
pixel <- c(0,0) # Get the central pixel only (0,0) or a quadratic tile around it
period <- data.frame(lat = coord[1], long = coord[2], start.date = 2013, end.date = 2014, id = 1)
# To download the pixels
MODISSubsets(LoadDat = period, Products = product, Bands = bands, Size = pixel,
             SaveDir = "", StartDate = TRUE)
MODISSummaries(LoadDat = period, FileSep = ",", Product = "MOD13Q1", Bands = "250m_16_days_EVI",
               ValidRange = c(-2000,10000), NoDataFill = -3000, ScaleFactor = 0.0001,
               StartDate = TRUE, Yield = TRUE, Interpolate = TRUE, QualityScreen = TRUE,
               QualityThreshold = 0, QualityBand = "250m_16_days_pixel_reliability")
# Finally read the output
read.table("MODIS_Summary_MOD13Q1_2014-08-10.csv", header = TRUE, sep = ",")
The mean EVI between 2013 and today on the southern slope of Kilimanjaro was 0.403 (SD = 0.036). The yield (the integral of the interpolated data between start and end date) is 0.05. MODISTools is very suitable if you want to query hundreds of individual point locations, but less suitable if you want to extract values for a whole area (e.g. a polygon shape) or are simply annoyed by the frequent server breakdowns…
If you want to get whole MODIS tiles for your area of interest, you have another option in R. The MODIS package is particularly suited for this job and even has options to incorporate ArcGIS for additional processing. It is also possible to use the MODIS Reprojection Tool or gdal (our choice) as the underlying workhorse.
library(MODIS)
library(rgdal) # needed for spTransform below
dates <- as.POSIXct( as.Date(c("01/05/2014","31/05/2014"), format = "%d/%m/%Y") )
dates2 <- transDate(dates[1], dates[2]) # Transform the input dates from before
# The MODIS package allows you to select tiles interactively. We however select them manually here
h <- "21"
v <- "09"
runGdal(product = product, begin = dates2$beginDOY, end = dates2$endDOY, tileH = h, tileV = v)
# Per default the data will be stored in ~homefolder/MODIS_ARC/PROCESSED
# After download you can stack the processed TIFs
vi <- preStack(path = "~/MODIS_ARC/PROCESSED/MOD13Q1.005_20140810192530/", pattern = "*_EVI.tif$")
s <- stack(vi)
s <- s * 0.0001 # Rescale the downloaded files with the scaling factor
# And extract the value for our point from before.
# First transform our coordinates from lat-long to the MODIS sinusoidal projection
sp <- SpatialPoints(coords = cbind(coord[2], coord[1]),
                    proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84"))
sp <- spTransform(sp, CRS(proj4string(s)))
extract(s, sp) # Extract the EVI values for the two available layers from the generated stack
#> 0.2432 | 0.3113
If all the packages and tools so far did not work as expected, there is also the alternative of using a combination of R and Python to analyse your MODIS files, or of downloading your tiles manually. The awesome pyModis scripts allow you to download directly from a USGS server, which at least for me was almost always faster than the LP-DAAC connection used by the two other solutions. However, up to this point both the MODISTools and MODIS packages have processed the original MODIS tiles (which ship in the hdf4 container format) for you. Using this solution you have to access the hdf4 files yourself and extract the layers you are interested in.
Here is an example of how to download a whole hdf4 container using the modis_download.py script. Just run modis_download.py -h if you are unsure about the command-line parameters.
# Custom command. You could build your own using a loop and paste
tile <- "h21v09"
f <- paste0("modis_download.py -r -p MOD13Q1.005 -t ", tile,
            " -e ", "2014-05-01", " -f ", "2014-05-30", " tmp/")
# Call the python script
system(f)
# Then go into your defined download folder (tmp/ in our case)
# Now we are dealing with the original hdf4 files.
# The MODIS package in R has some processing options to get a hdf4 layer
# into a raster object using gdal as a backbone
library(MODIS)
library(raster)
lay <- "MOD13Q1.A2014145.h21v09.005.2014162035037.hdf" # One of the raw 16-day MODIS tiles we downloaded
mod <- getSds(lay, method = "gdal") # Get the layer names and gdal links
# Load the layers from the hdf file and directly apply the scaling factor.
# In MOD13Q1 the NDVI is the 1st, the EVI the 2nd and the pixel reliability the 12th subdataset
ndvi <- raster(mod$SDS4gdal[1]) * 0.0001
evi <- raster(mod$SDS4gdal[2]) * 0.0001
reliability <- raster(mod$SDS4gdal[12])
s <- stack(ndvi, evi, reliability)
# Now extract the values at our coordinates
extract(s, sp)
There you go. Everything presented here was executed on a Linux Debian machine and I have no clue whether it works for Windows or Mac users. Try it out. I hope everyone got some inspiration on how to process MODIS data in R 😉
The sun is shining, birds are singing and most scientists have nothing better to do than jetting around the globe to attend conferences. Yes, it is summer indeed. Here is some advertising for the 2014 Tropical Ecology Early Career Meeting of the British Ecological Society. This year the fun is happening in York. As many people seem to be on vacation, the deadline for abstracts has been extended to the 14th of July. See the attached documents ( BES-TEG 2014 Flyer, BES_TEG Key speakers ) for more information. I will be there as well…
The British Ecological Society Tropical Ecology Group (BES-TEG) is organising its 7th early-career meeting, scheduled to take place at the University of York on the 14th and 15th of August 2014. Day one will focus on ecology and ecosystem processes, while day two will focus on practical applications and links to policy, such as conservation, livelihoods, policy and development. All early-career researchers, both PhD students and post-docs, are welcome to present their tropical-ecology related research as a poster and/or oral presentation. The deadline for abstract submission has been extended to Monday the 14th of July 2014.
Please find attached the flyer and conference document for detailed information.
It would be appreciated if you could circulate this flyer within your department. Students are encouraged to come to York for what should be a really interesting few days in August.
An event website has been set up for registration.