# Macroecology playground (1) – Bird species richness in a nutshell

Ahh, macroecology. The study of ecological patterns and processes at large scales. Questions like “what factors determine the distribution and diversity of all life on earth?” have troubled scientists since the days of Alexander von Humboldt and Wallace. At the University of Copenhagen a whole research center is dedicated to this field, and macroecological studies are increasingly present in prestigious journals like Nature and Science. Previous studies at the center have found skewed distributions of bird richness with a specific bias towards the mountains (Jetz & Rahbek, 2002; Rahbek et al., 2007). In this blog post I am going to play around a bit with some data from Rahbek et al. (2007). The analysis and the graphs are by no means sufficient (they even violate many model assumptions like homoscedasticity, normality and data independence) and are therefore more of an exploratory nature 😉 The post will show you how to build a raster stack of geographical data and how to use the data in some very basic models.

It was recommended to me to use the freely available SAM software for the analysis. The program is really nice and fast, but it doesn’t suit me because you cannot modify specific model parameters or graphical outputs. And as a self-declared R junkie I refuse to work with “click-compute-result” tools 😉

So here is what the head of the SAM data file (“data.sam”) looks like (I won’t share it, so please generate your own data).
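Since the original file is not shared, a stand-in can be generated along these lines. This is only a sketch: the grid extent, the value ranges and the relationship between richness and the predictors are invented, and only the column names are taken from this post. It writes to a temporary file rather than to “data.sam” so nothing gets overwritten.

```r
# Hypothetical stand-in for the unshared SAM file: a 1-degree grid with made-up values
set.seed(42)
grid <- expand.grid(Longitude = seq(-80.5, -35.5, by = 1),
                    Latitude  = seq(-55.5, 12.5, by = 1))
n <- nrow(grid)

# Invented predictors (ranges chosen only for illustration)
grid$NDVI                    <- runif(n, 0, 1)
grid$Topographical.Range     <- rexp(n, rate = 1 / 500)
grid$Annual.Mean.Temperature <- 25 - 0.3 * abs(grid$Latitude) + rnorm(n)

# Invented response: richness loosely driven by the predictors plus noise
grid$Birds.richness <- round(pmax(0,
  100 + 200 * grid$NDVI + 0.05 * grid$Topographical.Range +
    3 * grid$Annual.Mean.Temperature + rnorm(n, sd = 30)))

# Write a tab-separated table and read it back the same way the post does
sam_file <- file.path(tempdir(), "data.sam")
write.table(grid, sam_file, sep = "\t", dec = ".", row.names = FALSE, quote = FALSE)
sam <- read.delim(file = sam_file, header = TRUE, sep = "\t", dec = ".")
head(sam)
```

A real SAM export will contain more columns, so column positions in such a stand-in will not match the ones used later in this post.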

As you can see, the *.sam* file is technically just a tab-separated table with the coordinates of each grid cell (1° grid cells on a latitude-longitude projection) and all response and predictor values for that cell. To get this data into R we are going to use the raster package to generate a so-called raster stack for our analysis. This is how I did it:

    # Load libraries
    library(raster)

    # Create Data from SAM
    data <- read.delim(file="data.sam", header=TRUE, sep="\t", dec=".") # read in a data.frame
    coordinates(data) <- ~Longitude+Latitude # Convert to a SpatialPointsDataFrame
    cs <- "+proj=longlat +datum=WGS84 +no_defs" # define the correct projection (long-lat)
    gridded(data) <- TRUE # Make a SpatialPixelsDataFrame
    proj4string(data) <- CRS(cs) # set the defined CRS

    # Create Raster layer stack
    s <- stack()
    for(n in names(data)){
      d <- data.frame(coordinates(data), data[,n])
      ras <- rasterFromXYZ(xyz=d, digits=10, crs=CRS(cs))
      s <- addLayer(s, ras)
      rm(d, n, ras)
    }

    # Now you can query and plot the raster layers from the stack
    plot(s$Birds.richness, col=rainbow(100, start=0.1))
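Conceptually, each layer built above is just the long x/y/value table rearranged onto a regular grid. The following base R sketch shows that rearrangement for a handful of invented cells. It mimics the idea behind `rasterFromXYZ`, not its actual implementation, and all values are made up.

```r
# Six invented grid cells in long (x, y, value) form
xyz <- data.frame(
  Longitude      = c(-70.5, -69.5, -68.5, -70.5, -69.5, -68.5),
  Latitude       = c(-10.5, -10.5, -10.5, -11.5, -11.5, -11.5),
  Birds.richness = c(310, 295, 280, 402, 388, NA)
)

# Grid axes: columns run west to east, rows run north to south
lon <- sort(unique(xyz$Longitude))
lat <- sort(unique(xyz$Latitude), decreasing = TRUE)

# Empty grid, then drop every value into its (row, column) position
layer <- matrix(NA_real_, nrow = length(lat), ncol = length(lon),
                dimnames = list(Latitude = lat, Longitude = lon))
layer[cbind(match(xyz$Latitude, lat), match(xyz$Longitude, lon))] <- xyz$Birds.richness
layer
```

A raster stack is then a set of such grids sharing the same extent and resolution, one per variable.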

Want to do some modeling or extract data? Here you go. First we make a subset of some of our predictors from the raster stack and then fit an ordinary least squares multiple regression model to our data to see how much variance can be explained. Note that linear regression is not the proper technique for this kind of analysis (the degrees of freedom are too high due to spatial autocorrelation, and the assumptions mentioned before are violated), but it’s still useful for exploratory purposes.

    # Extract some predictors from the raster stack
    predictors <- subset(s, c(7,8,10))
    names(predictors)
    #> "NDVI" "Topographical.Range" "Annual.Mean.Temperature"

    # Now extract the data from both the bird richness layer and the predictors
    birds <- getValues(s$Birds.richness)
    val <- as.data.frame(getValues(predictors))

    # Do the multiple regression
    fit <- lm(birds~., data=val)
    summary(fit)
    #>                           Estimate Std. Error t value Pr(>|t|)
    #> (Intercept)             215.675282  15.837493   13.62   <2e-16 ***
    #> NDVI                    -34.541242   1.245769  -27.73   <2e-16 ***
    #> Topographical.Range       0.056458   0.002452   23.03   <2e-16 ***
    #> Annual.Mean.Temperature   0.940664   0.054747   17.18   <2e-16 ***
    #> ---
    #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    #>
    #> Residual standard error: 81.86 on 1525 degrees of freedom
    #>   (1461 observations deleted due to missingness)
    #> Multiple R-squared: 0.6931, Adjusted R-squared: 0.6925
    #> F-statistic: 1148 on 3 and 1525 DF, p-value: < 2.2e-16
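The line “observations deleted due to missingness” appears because `lm` silently drops every cell with an `NA` in the response or in any predictor (ocean cells, for instance). Here is a small sketch with made-up numbers showing how to read the number of used and dropped cells and the adjusted R² directly from the fitted object. The variable names and values are invented for illustration.

```r
# Toy data (made up): 8 grid cells, two of them with a missing predictor
set.seed(1)
val <- data.frame(NDVI = c(0.2, 0.8, NA, 0.5, 0.9, 0.1, NA, 0.6),
                  Temp = c(12, 25, 18, 20, 26, 9, 15, 22))
birds <- 50 + 300 * c(0.2, 0.8, 0.4, 0.5, 0.9, 0.1, 0.3, 0.6) +
  2 * val$Temp + rnorm(8, sd = 5)

# Same call pattern as in the post: response vector against all columns of val
fit <- lm(birds ~ ., data = val)

n_used    <- nobs(fit)                    # cells that entered the fit
n_dropped <- length(fit$na.action)        # cells removed because of NA
adj_r2    <- summary(fit)$adj.r.squared   # adjusted R-squared as a plain number
c(used = n_used, dropped = n_dropped, adj_r2 = adj_r2)
```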

Ignore the p-values and just focus on the adjusted R² value. As you can see we are able to explain nearly 70% of the variance with this simple model. So what do our residuals and the predicted values look like? For that we have to create analogous raster layers containing the predicted and the residual values. Then we plot the observed, predicted and residual layers together using the **spplot** function from the package sp (automatically loaded with “raster”).

    # Needed for trellis.par.set() (sp is already attached via raster)
    library(lattice)

    # Prediction raster
    rval <- rep(NA_real_, ncell(s$Birds.richness)) # start from NA so cells outside the fit stay empty
    rval[as.numeric(names(fit$fitted.values))] <- predict(fit) # fill the fitted cells with predicted values
    pred <- predictors$NDVI # make a copy of an existing raster
    values(pred) <- rval; rm(rval) # replace all values in this raster copy
    names(pred) <- "Prediction"

    # Residual raster
    rval <- rep(NA_real_, ncell(s$Birds.richness)) # start from NA again
    rval[as.numeric(names(fit$residuals))] <- fit$residuals # fill the fitted cells with residual values
    resid <- predictors$NDVI
    values(resid) <- rval; rm(rval)
    names(resid) <- "Residuals"

    # Do the plot with spplot
    ss <- stack(s$Birds.richness, pred, resid)
    sp <- as(ss, 'SpatialGridDataFrame')
    trellis.par.set(sp.theme())
    spplot(sp)
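The indexing with `as.numeric(names(...))` works because `lm` names the fitted values and residuals after the rows it kept, so the names can be used to put the values back into their original cell positions. A toy sketch with invented numbers is shown below. It also shows `na.action = na.exclude`, a standard `lm` option that pads the result with `NA` automatically, as a possible alternative rather than what the code above does.

```r
# Toy data (made up): cell 3 has a missing predictor
x <- c(1, 2, NA, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2)

# Default behaviour: only complete rows are fitted, named by their row index
fit <- lm(y ~ x)
idx <- as.numeric(names(fitted(fit)))   # row indices of the cells that were used

# Put the fitted values back into a full-length vector, NA elsewhere
full <- rep(NA_real_, length(y))
full[idx] <- fitted(fit)

# Alternative: na.exclude makes fitted() and residuals() return full-length vectors
fit_ex  <- lm(y ~ x, na.action = na.exclude)
full_ex <- fitted(fit_ex)
res_ex  <- residuals(fit_ex)
```

With the second variant the vectors could be assigned to a raster directly, without the index step.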

Looking at the residual plot you might notice that our simple model fails to explain all the variation at mountain altitudes (the Andes). Still, the predicted values look very similar to the observed richness. Bird species richness is highest in tropical mountain ranges, which is consistent with results from Africa (Jetz & Rahbek, 2002). The reasons for this pattern are not fully understood yet, but if I had to discuss this with a colleague I would probably bring up arguments like longer evolutionary time, higher habitat heterogeneity and a greater number of climatic niches in mountain ranges. At this point you would test for spatial autocorrelation using Moran’s I and then switch to more sophisticated methods that account for it, like Generalized Additive Models (GAMs) or Spatial Autoregressive Models (SARs). See Rahbek et al. (2007) for the actual study.
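To give an idea of what Moran’s I measures, here is a bare-bones base R sketch for a complete grid with rook neighbours (cells sharing an edge) and binary weights. The function name `morans_i` is hypothetical, the input is assumed to be a plain numeric matrix without `NA`s, and no significance test is included. For a real analysis you would rather use a dedicated package such as spdep.

```r
# Moran's I for a numeric matrix with rook neighbours and binary weights.
# Assumes at least a 2 x 2 grid and no missing values.
morans_i <- function(m) {
  x <- m - mean(m)                                   # deviations from the mean
  # Cross-products of each cell with its right and its lower neighbour
  horiz <- sum(x[, -ncol(x), drop = FALSE] * x[, -1, drop = FALSE])
  vert  <- sum(x[-nrow(x), , drop = FALSE] * x[-1, , drop = FALSE])
  n_pairs <- nrow(x) * (ncol(x) - 1) + (nrow(x) - 1) * ncol(x)
  # Each neighbour pair counts twice in both the numerator and the weight sum,
  # so the factor 2 cancels
  (length(x) / n_pairs) * (horiz + vert) / sum(x^2)
}

# A smooth gradient: neighbouring cells are similar, so I is strongly positive
gradient <- outer(1:6, 1:6, "+")
i_gradient <- morans_i(gradient)

# A checkerboard: every neighbour is the opposite value, so I is -1
checker <- outer(1:6, 1:6, function(i, j) (i + j) %% 2)
i_checker <- morans_i(checker)
```

Values close to zero would indicate no spatial structure. Applied to model residuals, a clearly positive value is the signal that the regression above overstates its degrees of freedom.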

**References:**

- Jetz, W., & Rahbek, C. (2002). Geographic range size and determinants of avian species richness. *Science*, *297*(5586), 1548-1551.
- Rahbek, C., Gotelli, N. J., Colwell, R. K., Entsminger, G. L., Rangel, T. F. L., & Graves, G. R. (2007). Predicting continental-scale patterns of bird species richness with spatially explicit models. *Proceedings of the Royal Society B: Biological Sciences*, *274*(1607), 165-174.