# BIOFRAG – Biodiversity responses to Forest Fragmentation

Another interesting project closely related to PREDICTS is the BIOFRAG project, which aims to build a global database of research papers dealing with forest fragmentation and its impacts on biodiversity. One final goal of the BIOFRAG project is the development of a new fragmentation index that uses a watershed delineation algorithm and fragment descriptors to characterize fragment traits. I am very interested in seeing the final outcome of this approach, and maybe I will even find the time to implement their algorithm in LecoS for QGIS as soon as it is released. Their database paper, with Marion Pfeifer as lead author, was just released as an open-access paper. You can read it in full here.

If you are considering contributing data, more information can be found on the BIOFRAG blog. All researchers involved in forest fragmentation research should consider contributing to BIOFRAG and also to PREDICTS (see here) if they haven’t already done so. And as usual: if you have studied in Africa, then please get in touch with me! I will contact you as soon as I return from my fieldwork in Kenya and Tanzania at the end of May.

# Statistical inferences using p-values

And another quick post for today. Here is a nice infographic I just found on the Nature News page. It is a nice demonstration of how p-values can fail us when making inferences about hypotheses. Just another article bashing p-values, you could say. Or: “Just switch to Bayesian stats already, or report real effect sizes instead of p-values”. Although the matter is clear to many ecologists out there, the majority still happily uses p-values, inferring that they have proven their working hypothesis true or false. At my former and also at my current university, p-values are still being taught and used in all courses related to data analysis. Students are asked and expected to always (!) report the p-value and are trained to look specifically for something they call the statistical significance of an effect. And then people wonder why the hell everyone still uses century-old techniques, often without even knowing what exactly they mean. I certainly believe (and I say that while still being educated myself 🙂 ) that, especially in the education of future ecologists and conservationists, statistics courses should become mandatory for all (under)graduates. In times of big data analysis, basic statistical knowledge has to be a must for everyone.

The related Nature News article can be found here. More information and facts about my research in Africa and my fieldwork trip will appear around May.

EDIT: And as a funny addition, check out this awesome R function, which gives you an appropriate significance description for every p-value 😀

# Macroecology playground (2) – About the Mid domain effect null model

The use of null models in ecology has a long history (Connor & Simberloff, 1979) and has been at the epicenter of many scientific disputes. Some of them continue to this day (or here). I will spare the readers of this blog any further discussions or arguments, as I haven’t entirely made up my own mind yet. Statistically speaking, many null models make perfect sense to me if ecological data is just seen as “data”. The biological perspective of many null models, however, is debatable, as many of them make assumptions (a random distribution of species in spatial community ecology, for instance) which hardly seem to hold *in natura*. I agree that ecologists have to think carefully when designing their statistical analyses. I am going to follow the debate about null models more closely in the future, but for now let me introduce you to a simple null model in macroecology.

One of the most widely used null models in macroecology is the so-called mid-domain effect (MDE) null model. As the effect of all possible environmental predictors on species distributions decreases, we would expect the species richness peak to shift toward the center of the geometric constraints (Colwell & Lees, 2000; Colwell et al., 2004). This so-called mid-domain peak builds on the stochastic phenomenon that if you shuffle species ranges inside a geometric constraint, you will on average find the greatest overlap in the very center.

For an easy visualization: just imagine an aluminum box full of pencils, one of those you had back in primary school. The pencils inside are of varying length; some might be nearly as long as the whole box, others are nearly used up. Close the box and shake it. If you now open the box again, you will find most pencils (or parts of a pencil) in the middle of the box.
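The pencil box can be turned into a few lines of Python. This is only a minimal sketch under my own assumptions: the domain is a one-dimensional row of cells, range lengths are drawn uniformly at random, and each range is dropped at a uniformly random position that keeps it inside the box. The function name `simulate_mde_1d` and all parameter values are made up for illustration.

```python
import random


def simulate_mde_1d(n_species=1000, n_cells=50, seed=42):
    """Shuffle random 1D ranges inside a bounded domain and count overlaps.

    Assumption: range lengths are uniform between 1 and n_cells, and each
    range is placed at a uniformly random position that fits the domain.
    """
    rng = random.Random(seed)
    richness = [0] * n_cells
    for _ in range(n_species):
        length = rng.randint(1, n_cells)          # length of this "pencil"
        start = rng.randint(0, n_cells - length)  # it must stay inside the box
        for cell in range(start, start + length):
            richness[cell] += 1
    return richness


if __name__ == "__main__":
    richness = simulate_mde_1d()
    # Crude text histogram: one '#' per 20 overlapping ranges.
    for cell, count in enumerate(richness):
        print(f"{cell:3d} {'#' * (count // 20)}")
```

Running it prints a crude histogram in which the bars should be longest around the middle cells, without any environmental gradient being involved at all.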

One way to generate an MDE null model from given species ranges is to use a so-called spreading dye algorithm, which emulates the growth of cells inside the given geometric constraints from a random starting point (like multiple drops of dye inside a pond). Click the GIF image below to watch a growing MDE (**CAREFUL – BIG GIF PICTURE > 4mb**). As input, the number of occupied grid cells per bird species in South America was used. The range size was kept constant, but the starting point varies.
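To make the idea more concrete, here is a minimal Python sketch of one possible spreading dye variant. It is not the code used for the GIF. I assume a domain given as a set of `(row, col)` grid cells, growth into the four direct neighbours only, and a uniformly random choice among all cells bordering the current range; the names `spreading_dye` and `mde_richness` are hypothetical.

```python
import random

# Assumption: 4-neighbourhood (rook moves); an 8-neighbourhood is also common.
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def spreading_dye(domain, range_size, rng):
    """Grow one contiguous range of up to `range_size` cells inside `domain`."""
    start = rng.choice(sorted(domain))  # random starting cell (the ink drop)
    occupied = {start}
    frontier = set()  # unoccupied domain cells bordering the range

    def extend_frontier(cell):
        row, col = cell
        for d_row, d_col in NEIGHBOURS:
            neighbour = (row + d_row, col + d_col)
            if neighbour in domain and neighbour not in occupied:
                frontier.add(neighbour)

    extend_frontier(start)
    # Growth stops at the domain border because the frontier never leaves it.
    while len(occupied) < range_size and frontier:
        cell = rng.choice(sorted(frontier))
        frontier.remove(cell)
        occupied.add(cell)
        extend_frontier(cell)
    return occupied


def mde_richness(domain, range_sizes, seed=0):
    """Overlay one simulated range per species and count overlaps per cell."""
    rng = random.Random(seed)
    richness = {cell: 0 for cell in domain}
    for size in range_sizes:
        for cell in spreading_dye(domain, size, rng):
            richness[cell] += 1
    return richness


if __name__ == "__main__":
    # Toy domain: a 15 x 15 square instead of a real continent outline.
    domain = {(r, c) for r in range(15) for c in range(15)}
    size_rng = random.Random(1)
    sizes = [size_rng.randint(1, 150) for _ in range(200)]  # made-up range sizes
    richness = mde_richness(domain, sizes)
    for r in range(15):
        print(" ".join(f"{richness[(r, c)]:3d}" for c in range(15)))
```

With a real data set, `domain` would be the land cells of the continent and `range_sizes` the observed number of occupied grid cells per species, so that each range size is kept constant and only the position is randomized.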

As you can observe, the relative bird species richness peaks in the middle of the continent after some time. This pattern becomes more prominent if the algorithm runs for all 2869 bird species occurring in South America. The final image and the corresponding range-size quartiles look like this:

Here you can see that the overall mid-domain peak only appears for the fourth quartile. For the other three the relative distribution is quite random, which might explain why the MDE null model often explains quite a lot of the variance for widespread species (Dunn et al., 2007). The MDE null model has been criticized and defended multiple times, but is still widely used in macroecology. Critics usually bring up possible influences of phylogeny (Davies et al., 2005) or of the geometric constraints (Connolly, 2005; McClain et al., 2007). One issue particular to the spreading dye algorithm is that the simulated species ranges resemble spreading ink drops, which are all very similar in shape. In reality, species ranges often have quite complex and different configurations and shapes. Furthermore, the model stops at the borders of the geometric constraints (the coastline of South America). Any random drop of ink near the coastline will therefore always grow toward the heart of the continent, which makes the shape of the geometric constraint the most important predictor of a possible richness peak. If, for instance, the model were repeated for a more irregular shape (like Middle America), the peaks would develop where the greatest land mass is (so around Texas and Bolivia). The probability of an ink drop developing in Panama or Ecuador is simply too low, because the chance of hitting such a small area is small. This is a property of the algorithm and might result in non-significant null models for the Middle American regions.

**References**

- Colwell, R. K., & Lees, D. C. (2000). The mid-domain effect: Geometric constraints on the geography of species richness. *Trends in Ecology & Evolution*, *15*, 70-76.
- Colwell, R. K., Rahbek, C., & Gotelli, N. J. (2004). The Mid-Domain Effect and Species Richness Patterns: What Have We Learned So Far? *The American Naturalist*, *163*(3), E1-E23.
- Connor, E. F., & Simberloff, D. (1979). The assembly of species communities: chance or competition? *Ecology*, 1132-1140.
- Connolly, S. R. (2005). Process-Based Models of Species Distributions and the Mid-Domain Effect. *The American Naturalist*, *166*(1), 1-11.
- Davies, T. J., Grenyer, R., & Gittleman, J. L. (2005). Phylogeny can make the mid-domain effect an inappropriate null model. *Biology Letters*, *1*(2), 143-146.
- Dunn, R. R., McCain, C. M., & Sanders, N. J. (2007). When does diversity fit null model predictions? Scale and range size mediate the mid-domain effect. *Global Ecology and Biogeography*, *16*(3), 305-312.
- McClain, C. R., White, E. P., & Hurlbert, A. H. (2007). Challenges in the application of geometric constraint models. *Global Ecology and Biogeography*, *16*(3), 257-264.