Tag Archive | snippet

Port your R scripts to QGIS using SEXTANTE

has grown a lot in the past years and is increasingly used as GIS. For many spatial operations and even spatial statistics certain R packages are simply the best choice up to date. For instance Home-range analysis are kinda impossible to perform (at least for me) without looking at the adehabitat packages. You want to perform a spatial operation in a script and do not know how to code python: Just use R. Francisco Rodriguez-Sanchez has posted very nice examples how to perform GIS tasks completely within R.

However R as GIS lacks the simplicity in matters of design and it requires quite an effort to make a good-looking map. Therefore using QGIS for map-design and output is always the favored choice for me. But why use R and QGIS independently? Through the SEXTANTE Toolbox for QGIS it is possible to execute fast and precise r-scripts for your spatial data just in QGIS.

What you need to do:

  • Download the current QGIS dev. and R.
  • Enable SEXTANTE in QGIS and activate the R-scripts in the SEXTANTE options provider window. Furthermore specify a R-scripts folder where you store your scripts.
  • Also make sure that your scripts are logged (it is in the SEXTANTE options as well)
  • Execute one of the Example R-scripts to test if the scripts are working.

If the above steps all turned out as expected you could start formatting your own r-scripts into a so-called .rsx file (R SEXTANTE script).

Here is a little info-graphic how to use R in a SEXTANTE context:


So open your favorite text-editor (or the new r-script dialog in QGIS) and create a new file. Save it into the rscripts folder inside your QGIS profile directory (“~/.qgis2/sextante/rscripts” on Linux based systems. Similar structure under Windows inside of your Documents folder). All .rsx scripts start with defining a group (where to save the script in SEXTANTE) and continue with additional inputs specified by the user. All Input data comes with two hashs in front. Furthermore you need to specify if the script output should show plots (add “#showplots” at the beginning of the script) and/or console output (start a command with “>”).

After you wrote your script, startup QGIS, open the SEXTANTE toolbox and have fun executing it. All things are possible, but it isn’t really easy to debug .rsx scripts in QGIS as the output is limited and sometimes you just wonder why it isn’t working.

To get you started here is the basic script to do the nice-looking levelplot from the rasterVis r-package:

##[Own Scripts] = group
##layer = raster
myPal <- terrain.colors(20)

Script is stored in the “Own Scripts” group. It just requires a raster (best is a DEM) as input.
You could extend the scripts by saving the output to a user defined folder or by creating just a plot for a specific extent (for instance the current QGIS extent). Output looks like this for the country of Skane in south Sweden:

output_LevelplotRight now this is just for show, but of course you could also generate for example contours of the raster DEM and save them in GDAL supported format after script execution.


Useful R functions for ecologists

Every biologist using R has some self-written functions he particularly likes and which are useful in many different analysis. Here i am sharing seven nifty functions, which have proven to be useful for me in the past. Many can be used in different contexts and their functionalities are certainly missing in the R base package.

# The function below uses the MASS package to fit provided values to given often used
# distributions. AIC and BIC information criterion values are calculated and returned in an ordered table
testdist <- function(values){
distributions <- c("normal","lognormal","exponential","logistic","cauchy","gamma","geometric","weibull")
res <- data.frame(cbind(distributions)); res[,c("AIC","BIC")] <- NA
for(i in seq(1:nrow(res))){
fit <- fitdistr(values,densfun=as.character(res$distributions[i]),)
res$AIC[i] <- AIC(fit);res$BIC[i] <- BIC(fit)
res <- res[order(res$BIC,decreasing=F,na.last=T),]
#Example output:
#distributions AIC BIC
#weibull 204.6404 207.5083
#normal 205.7745 208.6425
#gamma 206.4929 209.3609
#lognormal 206.9348 209.8028
#logistic 207.0618 209.9298
#cauchy 218.2213 221.0893
#exponential 332.5055 333.9395
#geometric 332.5107 333.9447
# Those values shouldn't be taken for granted and you should always visually explore potential distributions
# for instance with histograms or qqplots !
# Standarderror of the mean (SEM)
stderr <- function(x) sqrt(var(x,na.rm=TRUE)/length(na.omit(x)))
#Example output:
# > 1.144411
# Do segments on top of Scatter-Plot.
# Requires a transmitted data.frame with x-y values and a vector with line-length
doSegments <- function(x,y,ll,eps=0.05,...){
# Further Arguments are transmitted to segments
segments(x,y-ll,x,y+ll) # Build lines
segments(x-eps,y-ll,x+eps,y-ll) # Do the segments on top
segments(x-eps,y+ll,x+eps,y+ll) # and below
#Example output:
fit <- lm(trees$Volume~trees$Girth)
# Displays the residuals of a linear regression as errorbar
doSegments(trees$Girth,trees$Volume,ll=resid(fit) )


# Returns a list of the elements of x that are not in y
# and the elements of y that are not in x (not the same thing...)
setdiff2 <- function(x,y) {
Xdiff = setdiff(x,y)
Ydiff = setdiff(y,x)
list(X_not_in_Y=Xdiff, Y_not_in_X=Ydiff)
# Example output
a <- c("A","B","C","D")
b <- c("C","D","E","F")
#> $X_not_in_Y
#> [1] "A" "B"
#> $Y_not_in_X
#> [1] "E" "F"
# Remove all NA-Values from a list
remNa <- function(x) {
# Example Output
a <- c("A","B",NA,"D")
#> [1] "A" "B" "D"
## String manipulation
# trim white space/tabs
trim_whitespace <-function(s) gsub("^[[:space:]]+|[[:space:]]+$","",s)

# Example output
a <- "   Teststring   "
#> [1] "Teststring"

# Extract numbers from string or character
numbers_from_string <- function(x) as.numeric(gsub("\\D", "", x))

# Example output
a <- "We found 13 rabbits playing on the field"
#> [1] 13

Return coordinates of addresses in R

Georeferencing huge databases in R just became a little bit easier for me.
I found the nice ‘rjson‘ package, which allows to decompose json code retrieved from google maps for instance. My data table consists of columns with the zipcode, the city and the state and a custom value.
Here is the snippet which returns the point coordinates from a given address in R.

require(rjson) # load rjson package
getCoordinates <- function(address) {
url <- paste("http://maps.google.com/maps/geo?q=",address,"&output=json",sep="")
map_data <- fromJSON(paste(readLines(url),collapse=""))
coord <- map_data$Placemark[[1]]$Point$coordinates[1:2]

# example query
address <- paste(data$zip[1],"+",data$city[1],"+",data$state[1],sep="")

Taxonomy in R

Hi there,
here is a r-snippet which i wrote to get new data out of a species list.
I have a species-list of insect-pollinated european grassland plants, which is a little to detailed for me.
Through filtering i want to get a subset list with only small herbs.
First i consulted the PESI-Portal for a taxamatch only to find out, that only 75% of my species are currently in the database (Acer campestre is not found?)
After that i searched for a smart solution in r and i found taxize, a newly package which queries different databases and has nice functions for taxonomic data management.
First i want to get familynames for each genus-species combination. My species list contains two columns with the genus and the species name.
See below my solution how to get a family name for each species.

# Load in the source package taxize and get familynames for taxa in splist
install_github("taxize_", "ropensci")

splist <- paste(data$Genus,data$Species,sep="_")
mat <- matrix(nrow=length(splist),ncol=2)
mat[,2] <- splist
for(i in seq(1:length(splist))){
 # I had to use a try, because some entries produce strange errors.
res <- try(expr=get_tsn(splist[i],"sciname", by_="name"),silent=T)
if(is.vector(res)) {
fam <- get_familyname(res)
mat[i,1] <- fam
print(paste(fam,splist[i],sep=" - "))

It’s kind of rough, but it works. Currently it runs through my plant-species list with over 1500 entries.
The output looks like this…

[1] "Asteraceae - Artemisia_scoparia"

[1] "Asteraceae - Artemisia_vulgaris"

[1] "Araceae - Arum_maculatum"

[1] "Rosaceae - Aruncus_dioicus"

[1] "Asclepiadaceae - Asclepias_syriaca"

[1] "Asparagaceae - Asparagus_officinalis"

Iam still looking for a solution how to get specific traits for every species, such as “Perennial or Annual” or “Grow height”.
Will report back, if i find something.

Sussex Research Hive

Supporting the research community at the University of Sussex

Small Pond Science

Research, teaching, and mentorship in the sciences

Landscape Ecology 2.0

intersecting landscape ecology, open science, and R


The Research Blog of IIASA

Amy Whitehead's Research

the ecological musings of a conservation biologist

Michael McCarthy's Research

School of BioSciences, The University of Melbourne

The Rostrum

science, statistics, policy and more


Environmental Change - Understand, Predict, Adapt

Dynamic Ecology

Multa novit vulpes


METeorological Visualisation Utilities using R for Science and Teaching

A Birder´s Blog

"Everybody loves what they know"

Trust Me, I'm a Geographer

Using Technology to Explore Our World

Duncan Golicher's weblog

Research, scripts and life in Chiapas

Daniel J. Hocking

Ecology, conservation biology, & statistical modeling for a changing world