Interesting Paper: Global warming and invertebrate colouring
Just now another very interesting paper has been published in Nature Communications, which was written by former colleagues of mine from the University of Marburg.
Global warming favours lightcoloured insects in Europe
As we all know many insect species like butterflies, bees or dragonflies have their main activity pattern during the day due to their ectotherm thermoregulation. Body colour is an important aspect of this thermoregulation as darker ( more blackish) individuals usually heat up faster. Therefore darker insects have an advantage compared to brighter insects in cooler climates as they heat up more rapidly and can forage earlier. This pattern can be mapped on a larger scale using occurrence data and has been known as “thermal melanism hypothesis” in macroecology. The authors go a step further from here as they not only display a new biogeographic pattern previously unknown to science ( colouring gradient of European dragonflies and butterflies from south to north), but they also demonstrate how this mechanistic link between a macroecological pattern and a functional trait can be used to forecast the effect of climate change on insects.
Possible critics: A definite next step in the analysis would be to include real measurements of optical colour rather than RGB values of scanned pictures. The colour values used in this study were all derived from scientific taxonomic drawings of those insects and thus biased by subjectivity of the respective artist. Nevertheless this bias should be consistent (if the same artist has sketched the images) so it should not influence the colouring gradient. It is also interesting to note that many insects ( I know this for instance from my work with bugs and hoverflies) can adapt their body colouring to their habitat or differ quite a lot within a population. Differing melanism in body color and wing colouration might be related to the climatic niche they occur in, but the insects themselves might also possess phenotypic plasticity to adapt for instance to different habitats and background (Hochkirch et al. 2008). This pattern certainly needs more investigations in the future.
The article has been published as open access paper, so give it a try 😉

Hochkirch, A., Deppermann, J. and Gröning, J. (2008), Phenotypic plasticity in insects: the effects of substrate color on the coloration of two groundhopper species. Evolution & Development, 10: 350–359.
 Zeuss, D. et al. (2014) Global warming favours lightcoloured insects in Europe. Nat. Commun. 5:3874
Macroecology playground (3) – Spatial autocorrelation
Hey, it has been over 2 months, so welcome in 2014 from my side. And i am sorry for not posting more updates recently, but like everyone i was (and still am) under constant working pressure. This year will be quite interesting for me personally as i am about to start my thesis project and will (besides other things) go to Africa for fieldwork. But for now i will try to catch your interest with a new macroecology playground post dealing with the important issue of spatial autocorrelation. See the other Macroecology playground posts here and here for knowing what happened in the past.
Spatial autocorrelation is the issue that data points in geographical space are somewhat dependent on each other or their values correlated because of spatial proximity/distance. Most of the statistical tools we have available for analysis out there assume that all our datapoints are independent from each other, which is rarely the case in macroecology. Just imagine the steep slope of mountain regions. Literally all big values will always occur near the peak of the mountains and decrease with distance from the peak. There is thus already a data inherent gradient present which we somehow have to account for, if are to investigate the effect of altitude alone (and not the effect of the proximity to nearby cells).
In our hypothetical example we want to explore how well the topographical average (average height per grid cell) can explain amphibian richness in South America and if the residuals (model errors) in our model are spatially autocorrelated. I can’t share the data, but i believe the dear reader will get the idea of what we are trying to do.
# Load libraries library(raster) # Load in your dataset. In my case i am loading both the Topo and richness from a raster stack. amp < s$Amphibians.richness topo < s$Topographical.Average summary(fit1 < lm(getValues(amp)~getValues(topo))) # Extract from the output > Multiple Rsquared: 0.1248, Adjusted Rsquared: 0.1242 > Fstatistic: 217.7 on 1 and 1527 DF, pvalue: < 2.2e16 par(mfrow=c(2,1)) plot(amp,col=rainbow(100,start=0.2)) plot(s$Topographical.Average)
What did we do? As you can see we fitted a simple linear regression model using the values from both the amphibian richness raster layer and the topographical range raster. The relation seems to be highly significant and this simple model can explain up to 12.4% of the variation. Here is the basic plot output for both response and predictor variable.
As you can see high values of both layers seem to be spatially clustered. So the likelihood of violating the independence of datapoints in the linear regression model is very high. Lets investigate the spatial autocorrelation by looking at Moran’s I, which is a measure for spatial autocorrelation (technically its just a determinant of correlation like pearsons r combined with spatial weights of the surroundings). So lets investigate if the residual values (the error in model fit) are spatially autocorrelated.
library(ncf) # For the Correlogram # Generate an Residual Raster from the model before rval < getValues(amp) # Create new raster rval[as.numeric(names(fit1$residuals))]< fit1$residuals # replace all datacells with res value resid < topo values(resid) <rval;rm(rval) #replace our values in this new raster names(resid) < "Residuals" # Now calculate Moran's I of the new residual raster layer x = xFromCell(resid,1:ncell(resid)) # take x coordinates y = yFromCell(resid,1:ncell(resid)) # take y coordinates z = getValues(resid) # and the values of course # Now calculate Moran's I # Use the extracted coordinates and values, increase the distance in 100er steps and don't forget to use latlon=T (given that you have your rasters in WGS84 projected) system.time(co < correlog(x,y,z,increment = 100,resamp = 0, latlon = T,na.rm=T)) # this can take a while. # It takes even longer if you try to estimate significance of spatial autocorrelation # Now show the result plot(0,type="n",col="black",ylab="Moran's I",xlab="lag distance",xlim=c(0,6500),ylim=c(1,1)) abline(h=0,lty="dotted") lines(co$correlation~co$mean.of.class,col="red",lwd=2) points(x=co$x.intercept,y=0,pch=19,col="red")
Ideally Moran’s I should be as close to zero as possible. In the above plot you can see that values in close distance (up to 2000 Distance units) and with greater distance as well, the model residuals are positively autocorrelated (too great than expected by chance alone, thus correlated with proximity). The function correlog allows you to resample the dataset to investigate significance of this patterns, but for now i will just assume that our models residuals are significantly spatially autocorrelated.
There are numerous techniques to deal with or investigate spatial autocorrelation. Here the interested reader is advised to look at Dormann et al. (2007) for inspiration. In our example we will try to fit a simultaneous spatial autoregressive model (SAR) and try to see if we can partially get the spatial autocorrelation out of the residual error. SARs can model the spatial error generating process and operate with weight
matrices that specify the strength of interaction between neighbouring sites (Dormann et al., 2007). If you know that the spatial autocorrelation occurs in the response variable only, a so called “laggedresponse model” would be most appropriate, otherwise use a “mixed” SAR if the error occurs in both response and predictors. However Kissling and Carl (2008) investigated SAR models in detail and came to the conclusion that lagged and mixed SARs might not always give better results than ordinary least square regressions and can generate bias (Kissling & Carl, 2008). Instead they recommend to calculate “spatial error” SAR models when dealing with species distribution data, which assumes that the spatial correlation does neither occur in response or predictors, but in the error term.
So lets build the spatial weights and fit a SAR:
library(spdep) x = xFromCell(amp,1:ncell(amp)) y = yFromCell(amp,1:ncell(amp)) z = getValues(amp) nei < dnearneigh(cbind(x,y),d1=0,d2=2000,longlat=T) # Get neighbourlist of interactions with a distance unit 2000. nlw < nb2listw(nei,style="W",zero.policy=T) # You should calculate the interaction weights with the maximal distance in which autocorrelation occurs. # But here we will just take the first xintercept where positive correlation turns into the negative. # Now fit the spatial error SAR sar_e < errorsarlm(z~topo,data=val,listw=nlw,na.action=na.omit,zero.policy=T) # We use the generated z values and weights as input. Nodata values are excluded and zeros are given to boundary errors # Now compare how much Variation can be explained summary(fit1)$adj.r.squared # The r_squared of the normal regression > 0.124 summary(sar_e,Nagelkerke=T)$NK # Nagelkerkes pseudo r_square of the SAR > 0.504 # < for SAR. So we could increase the influence of topographical average value on amphibian richness # Finally do a likelihood ratio test LR.sarlm(sar_e,fit1) > Likelihood ratio for spatial linear models > >data: >Likelihood ratio = 869.7864, df = 1, pvalue < 2.2e16 >sample estimates: >Log likelihood of sar_e Log likelihood of fit1 > 7090.903 7525.796 # Not only are our two models significantly different, but the log likelihood of our SAR is also greater than the ordinary model # indicating a better fit.
The SAR is one of many methods to deal with spatial autocorrelation. I agree that the choice of of the weights matrix distance is a bit arbitrary (it made sense for me), so you might want to investigate the occurence of spatial correlations a bit more prior to fitting a SAR. So have we dealt with the autocorrelation? Lets just calculate Moran’s I values again for both the old residual and the SAR residual values. Looks better doesn’t it?
References:
 F Dormann, C., M McPherson, J., B Araújo, M., Bivand, R., Bolliger, J., Carl, G., … & Wilson, R. (2007). Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. Ecography, 30(5), 609628.

Kissling, W. D., & Carl, G. (2008). Spatial autocorrelation and the selection of simultaneous autoregressive models. Global Ecology and Biogeography, 17(1), 5971.
Distribution maps in R
Today i’m gonna play a little bit with map features and show you how to make different basic distribution maps in R. Using the 2.14.1 Version of R i will make a graphical distribution map of the dragonflies species in Bavaria. The data was extracted from the book “Libellen in Bayern” and applied to a presenceabsence matrix. The whole of bavaria was converted to a geographical grid (X,YValues), whose values came from available topographical maps. Iam open to any suggestions or other packages which could present such data in a fashionable way!
For basic great looking maps you could use the package sp and lattice. Also i suggest using the package RColorBrewer, which provides very nice color ranges. See the comments in the RCode for explantations.
library(sp);library(lattice) # Loads all libraries data < read.csv2("grid_bayern.csv", header=T,dec=",",sep=";",na.strings="NA") # Load the data ### The date has the following columns: "X","Y","Diversity" coordinates(data) < c("X","Y") # Apply XY Values as coordinates to form a SpatialPointsDataFrame. ## coloured points plot with legend in plotting area and scales spplot(data, "Diversity", scales=list(draw=F),key.space=list(x=0.73,y=0.9,corner=c(0,1)), cuts = 3, col.regions=brewer.pal(3, "Set1")[3:1], legendEntries = c("small","avarage","high")) ## Blubble Plot > Increasing bubble size for higher values bubble(data, "Diversity", maxsize = 1.5,pch=19, main = "Bavaria Dragonfly diversity", key.entries = c(1,5,10,25,50),scales=list(draw=F))
As some points seem to be missing you could also build an interpolated graphic. For this first we will need the packages maps, akima and fields. The code below loads in the dataset and defines our X and YAxis ranges and interpolates all data to adjacent areas based on contourlines. Please note that these distribution is just a default kriging, which doesn’t have to be right. You need to look to some variograms and adjust your map to build the correct interpolated values.
library(RColorBrewer);library(maps); library(akima);library(fields) ## Load all libraries data < read.csv2("grid_bavaria.csv",header=T, dec=",",sep=";",na.strings="NA") ## load the data rx=range(data$X);ry=range(data$Y) ## define the ranges of the plots int.scp < interp(data$X,data$Y), data$diversity, duplicate="strip") ## Make an interpolation # Build the image plot with the interpolated values image.plot(int.scp,xlim=rx,ylim=ry,legend.shrink=0.7, col=brewer.pal(10, "Spectral")[10:1],nlevel=10,main="Spatial Diversity") #contour(int.scp,add=TRUE) # You could also show the contour lines with this command