Tag Archive | biogeography

Interesting Paper: Global warming and invertebrate colouring

Just now another very interesting paper has been published in Nature Communications, which was written by former colleagues of mine from the University of Marburg.

Global warming favours light-coloured insects in Europe

As we all know many insect species like butterflies, bees or dragonflies have their main activity pattern during the day due to their ectotherm thermoregulation. Body colour is an important aspect of this thermoregulation as darker ( more blackish) individuals usually heat up faster. Therefore darker insects have an advantage compared to brighter insects in cooler climates as they heat up more rapidly and can forage earlier. This pattern can be mapped on a larger scale using occurrence data and has been known as “thermal melanism hypothesis” in macroecology. The authors go a step further from here as they  not only display a new biogeographic pattern previously unknown to science ( colouring gradient of European dragonflies and butterflies from south to north), but they also demonstrate how this mechanistic link between a macroecological pattern and a functional trait can be used to forecast the effect of climate change on insects.

From Zeuss et al. (2014): Shift in colour value for (a) the raw data; (b) the phylogenetic component (P); and (c) the specific component (S). Red indicates an increase in colour lightness; blue indicates a decrease in colour lightness. The diameter of each dot indicates the extent of the shift (n=1,845). The distribution of the shifts shows for the specific component a clear trend towards higher (that is, lighter-coloured) values (peak of the distribution positive; zero indicated by black line). The phylogenetic component suggests that the shifts in colour lightness have a strong phylogenetic background leading to a complex geographic mosaic in the response of assemblages to climate change. The inserted histograms show the mean change in colour lightness calculated for 1,000 alternative phylogenetic trees and are positive throughout, indicating that uncertainties in the phylogenetic hypotheses are unlikely to affect our conclusion of a general shift towards lighter assemblages. The distributional information used in the analysis is often based on a large time span, that is, the distributional information published in 1988 summarizes data until that year using information even from the beginning of the twentieth century. Rugs at the abscissa indicate observed values.

Possible critics: A definite next step in the analysis would be to include real measurements of optical colour rather than RGB values of scanned pictures. The colour values used in this study were all derived from scientific taxonomic drawings of those insects and thus biased by subjectivity of the respective artist. Nevertheless this bias should be consistent (if the same artist has sketched the images) so it should not influence the colouring gradient.  It is also interesting to note that many insects ( I know this for instance from my work with bugs and hoverflies) can adapt their body colouring to their habitat or differ quite a lot within a population. Differing melanism in body color and wing colouration might be related to the climatic niche they occur in, but the insects themselves might also possess phenotypic plasticity to adapt for instance to different habitats and background (Hochkirch et al. 2008). This pattern certainly needs more investigations in the future.

The article has been published as open access paper, so give it a try 😉

  • Hochkirch, A., Deppermann, J. and Gröning, J. (2008), Phenotypic plasticity in insects: the effects of substrate color on the coloration of two ground-hopper species. Evolution & Development, 10: 350–359.

  • Zeuss, D. et al. (2014) Global warming favours light-coloured insects in Europe. Nat. Commun. 5:3874


Macroecology playground (3) – Spatial autocorrelation

Hey, it has been over 2 months, so welcome in 2014 from my side. And i am sorry for not posting more updates recently, but like everyone i was (and still am) under constant working pressure. This year will be quite interesting for me personally as i am about to start my thesis project and will (besides other things) go to Africa for fieldwork. But for now i will try to catch your interest with a new macroecology playground post dealing with the important issue of spatial autocorrelation. See the other Macroecology playground posts here and here for knowing what happened in the past.

Spatial autocorrelation is the issue that data points in geographical space are somewhat dependent on each other or their values correlated because of spatial proximity/distance. This is also known as the first law of geography (Google it). However, most of the statistical tools we have available assume that all our datapoints are independent from each other, which is rarely the case in macroecology. Just imagine the steep slope of mountain regions. Literally all big values will always occur near the peak of the mountains and decrease with distance from the peak. There is thus already a data-inherent gradient present which we somehow have to account for, if we are to investigate the effect of altitude alone (and not the effect of the proximity to nearby cells).

In our hypothetical example we want to explore how well the topographical average (average height per grid cell) can explain amphibian richness in South America and if the residuals (model errors) in our model are spatially autocorrelated. I can’t share the data, but i believe the dear reader will get the idea of what we are trying to do.

# Load libraries

# Load in your dataset. In my case i am loading both the Topo and richness from a raster stack.
amp <- s$Amphibians.richness
topo <- s$Topographical.Average
summary(fit1 <- lm(getValues(amp)~getValues(topo)))
# Extract from the output
Multiple R-squared: 0.1248, Adjusted R-squared: 0.1242
F-statistic: 217.7 on 1 and 1527 DF, p-value: < 2.2e-16


What did we do? As you can see we fitted a simple linear regression model using the values from both the amphibian richness raster layer and the topographical range raster. The relation seems to be highly significant and this simple model can explain up to 12.4% of the variation. Here is the basic plot output for both response and predictor variable.

Plot of response and predictor values

Plot of response and predictor values

As you can see high values of both layers seem to be spatially clustered. So the likelihood of violating the independence of datapoints in a linear regression model is quite likely. Lets investigate the spatial autocorrelation by looking at Moran’s I, which is a measure for spatial autocorrelation (technically its just a determinant of correlation that calculated the pearsons r of surrounding values within a certain window). So lets investigate if the residual values (the error in model fit) are spatially autocorrelated.

library(ncf) # For the Correlogram

# Generate an Residual Raster from the model before
rval <- getValues(amp) # Create new raster
rval[as.numeric(names(fit1$residuals))]<- fit1$residuals # replace all data-cells with res value
resid <- topo
values(resid) <-rval;rm(rval) #replace our values in this new raster
names(resid) <- "Residuals"

# Now calculate Moran's I of the new residual raster layer
x = xFromCell(resid,1:ncell(resid)) # take x coordinates
y = yFromCell(resid,1:ncell(resid)) # take y coordinates
z = getValues(resid) # and the values of course
# Now calculate Moran's I
# Use the extracted coordinates and values, increase the distance in 100er steps and don't forget to use latlon=T (given that you have your rasters in WGS84 projected)
system.time(co <- correlog(x,y,z,increment = 100,resamp = 0, latlon = T,na.rm=T)) # this can take a while.
# It takes even longer if you try to estimate significance of spatial autocorrelation

# Now show the result
plot(0,type="n",col="black",ylab="Moran's I",xlab="lag distance",xlim=c(0,6500),ylim=c(-1,1))
Moran's I of the model residuals

Moran’s I of the model residuals

Ideally Moran’s I should be as close to zero as possible. In the above plot you can see that values in close distance (up to 2000 Distance units) and with greater distance as well, the model residuals are positively autocorrelated (too great than expected by chance alone, thus correlated with proximity). The function correlog allows you to resample the dataset to investigate significance of this patterns, but for now i will just assume that our models residuals are significantly spatially autocorrelated.

There are numerous techniques to deal with or investigate spatial autocorrelation. Here the interested reader is advised to look at Dormann et al. (2007) for inspiration. In our example we will try to fit a simultaneous spatial autoregressive model (SAR) and try to see if we can partially get the spatial autocorrelation out of the residual error.  SARs can model the spatial error generating process and operate with weight
matrices that specify the strength of interaction between neighbouring sites (Dormann et al., 2007). If you know that the spatial autocorrelation occurs in the response variable only, a so called “lagged-response model” would be most appropriate, otherwise use a “mixed” SAR if the error occurs in both response and predictors. However Kissling and Carl (2008) investigated SAR models in detail and came to the conclusion that lagged and mixed SARs might not always give better results than ordinary least square regressions and can generate bias (Kissling & Carl, 2008). Instead they recommend to calculate “spatial error” SAR models when dealing with species distribution data, which assumes that the spatial correlation does neither occur in response or predictors, but in the error term.

So lets build the spatial weights and fit a SAR:


x = xFromCell(amp,1:ncell(amp))
y = yFromCell(amp,1:ncell(amp))
z = getValues(amp)
nei <- dnearneigh(cbind(x,y),d1=0,d2=2000,longlat=T) # Get neighbourlist of interactions with a distance unit 2000.
nlw <- nb2listw(nei,style="W",zero.policy=T)
# You should calculate the interaction weights with the maximal distance in which autocorrelation occurs.
# But here we will just take the first x-intercept where positive correlation turns into the negative.
# Now fit the spatial error SAR
sar_e <- errorsarlm(z~topo,data=val,listw=nlw,na.action=na.omit,zero.policy=T)
# We use the generated z values and weights as input. Nodata values are excluded and zeros are given to boundary errors

# Now compare how much Variation can be explained
summary(fit1)$adj.r.squared # The r_squared of the normal regression
> 0.124
summary(sar_e,Nagelkerke=T)$NK # Nagelkerkes pseudo r_square of the SAR
> 0.504 # -- for SAR. So we could increase the influence of topographical average value on amphibian richness

# Finally do a likelihood ratio test
# Likelihood ratio for spatial linear models
>Likelihood ratio = 869.7864, df = 1, p-value &lt; 2.2e-16
>sample estimates:
>Log likelihood of sar_e; Log likelihood of fit1
> -7090.903 

# Not only are our two models significantly different, but the log likelihood of our SAR is also greater than the ordinary model
# indicating a better fit.

The SAR is one of many methods to deal with spatial autocorrelation. I agree that the choice of of the weights matrix distance is a bit arbitrary (it made sense for me), so you might want to investigate the occurence of spatial correlations a bit more prior to fitting a SAR. So have we dealt with the autocorrelation? Lets just calculate Moran’s I values again for both the old residual and the SAR residual values. Looks better doesn’t it?

Comparison of Moran's I for both a linear model and a error SAR residuals

Comparison of Moran’s I for both a linear model and a error SAR residuals


  • F Dormann, C., M McPherson, J., B Araújo, M., Bivand, R., Bolliger, J., Carl, G., … & Wilson, R. (2007). Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. Ecography, 30(5), 609-628.
  • Kissling, W. D., & Carl, G. (2008). Spatial autocorrelation and the selection of simultaneous autoregressive models. Global Ecology and Biogeography, 17(1), 59-71.

Distribution maps in R

Today i’m gonna play a little bit with map features and show you how to make different basic distribution maps in R. Using the 2.14.1 Version of R i will make a graphical distribution map of the dragonflies species in Bavaria. The data was extracted from the book “Libellen in Bayern” and applied to a presence-absence matrix. The whole of bavaria was converted to a geographical grid (X,Y-Values), whose values came from available topographical maps. Iam open to any suggestions or other packages which could present such data in a fashionable way!

For basic great looking maps you could use the package sp and lattice. Also i suggest using the package RColorBrewer, which provides very nice color ranges. See the comments in the R-Code for explantations.

library(sp);library(lattice)   # Loads all libraries
data <- read.csv2("grid_bayern.csv",
header=T,dec=",",sep=";",na.strings="NA") # Load the data
### The date has the following columns: "X","Y","Diversity"
coordinates(data) <- c("X","Y") # Apply X-Y Values as coordinates to form a SpatialPointsDataFrame.
## coloured points plot with legend in plotting area and scales
spplot(data, "Diversity",
cuts = 3, col.regions=brewer.pal(3, "Set1")[3:1],
legendEntries = c("small","avarage","high"))

## Blubble Plot --> Increasing bubble size for higher values
bubble(data, "Diversity", maxsize = 1.5,pch=19,
main = "Bavaria Dragonfly diversity", key.entries = c(1,5,10,25,50),scales=list(draw=F))

Diversity for bavarian dragonfliesBubbleplot for bavarian dragonlies

As some points seem to be missing you could also build an interpolated graphic. For this first we will need the packages maps, akima and fields. The code below loads in the dataset and defines our X and Y-Axis ranges and interpolates all data to adjacent areas based on contour-lines. Please note that these distribution is just a default kriging, which doesn’t have to be right. You need to look to some variograms and adjust your map to build the correct interpolated values.

library(akima);library(fields) ## Load all libraries
data <- read.csv2("grid_bavaria.csv",header=T,
dec=",",sep=";",na.strings="NA") ## load the data
rx=range(data$X);ry=range(data$Y) ## define the ranges of the plots

int.scp <- interp(data$X,data$Y), data$diversity, duplicate="strip") ## Make an interpolation

# Build the image plot with the interpolated values
col=brewer.pal(10, "Spectral")[10:1],nlevel=10,main="Spatial Diversity")
#contour(int.scp,add=TRUE) # You could also show the contour lines with this command

Interpolated dragonfly diversity for bavaria

Sussex Research Hive

Supporting the research community at the University of Sussex

Small Pond Science

Research, teaching, and mentorship in the sciences

Landscape Ecology 2.0

intersecting landscape ecology, open science, and R


The Research Blog of IIASA

Jörg Steinkamps Blog

Mainly things about R, Linux and vegetation modeling

Amy Whitehead's Research

the ecological musings of a conservation biologist

Michael McCarthy's Research

School of BioSciences, The University of Melbourne

The Rostrum

science, statistics, policy and more


Environmental Change - Understand, Predict, Adapt

Dynamic Ecology

Multa novit vulpes


METeorological Visualisation Utilities using R for Science and Teaching

A Birder´s Blog

"Everybody loves what they know"

Trust Me, I'm a Geographer

Using Technology to Explore Our World

Duncan Golicher's weblog

Research, scripts and life in Chiapas