--- title: "Multi-taxon indicator (trim)" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Multi-taxon indicator (trim)} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- This documents outlines how a multi-taxon indicator (Farmland Breeding Birds) can be calculated in R using the TRIM method. ## Load packages The following packages are required. All packages are available on [CRAN](https://cran.r-project.org/) apart from `{fbi}` which can be installed from GitHub. ```r library(dplyr) library(fbi) library(finbif) library(ggplot2) library(lubridate) library(rtrim) ``` ## Survey data These five fields are required for the survey data. ```r select <- c("document_id", "location_id", "year", "month", "day") ``` These filters restrict the survey data to the "Point count" and "Line transect" bird monitoring datasets from 1979 onwards at sites labelled "farmland" and where the selected data fields have no missing data. ```r filter <- list( location_tag = "farmland", collection = c( "Point counts of breeding terrestrial birds", "Line transect censuses of breeding birds" ), date_range_ymd = c("1979-01-01", ""), has_value = select ) ``` The survey data can now downloaded from FinBIF. ```r surveys <- finbif_occurrence( filter = filter, select = select, aggregate = "events", aggregate_counts = FALSE, n = "all", quiet = TRUE ) ``` Two processing functions are applied to the survey data to first limit each site to the first survey of the year and then limit the surveys to sites where at least two years have been surveyed. ```r surveys <- pick_first_survey_in_year(surveys) surveys <- require_two_years(surveys) ``` ## Count data Count data only requires two fields to be selected: the survey identifier (`document_id`) and the measure of abundance (in this case the number of breeding pairs: `pair_abundance`). ```r select <- c("document_id", abundance = "pair_abundance") ``` The count data requires the same filters as the survey data (though the filter `has_value` needs to be redefined). ```r filter[["has_value"]] <- select ``` A set of taxa contributing to the multi-taxon indicator is selected. ```r taxa <- c( "Vanellus vanellus", "Numenius arquata", "Alauda arvensis", "Hirundo rustica", "Delichon urbicum", "Anthus pratensis", "Saxicola rubetra", "Turdus pilaris", "Sylvia communis", "Corvus monedula", "Sturnus vulgaris" ) ``` The count data for these taxa can now be downloaded from FinBIF. ```r counts <- lapply( taxa, finbif_occurrence, filter = filter, select = c("scientific_name", "document_id", abundance = "pair_abundance"), n = "all", quiet = TRUE ) ``` Three processing functions are applied to the count data to: infill the count data with zero occurrences using the survey data; sum over the counts for each site-year combination; and remove all sites where the number of breeding pairs was zero on every occasion. ```r counts <- lapply(counts, zero_fill, surveys) counts <- lapply(counts, sum_by_event) counts <- lapply(counts, remove_all_zero_locations) ``` # Fit TRIM Models A TRIM model is used to estimate the change in abundance over time for each taxon. ```r model <- lapply( counts, trim, count_col ="abundance", site_col = "location_id", year_col = "year" ) ``` # Create Index An index of change in relative abundance is created for each taxon by setting the base year to the year 2000. The indices are then combined into a single table ```r indices <- lapply(model, index, base = 2000) indices <- mapply(mutate, indices, sp = taxa, SIMPLIFY = FALSE) indices <- do.call(rbind, indices) ``` The geometric mean of relative abundance and its associated uncertainty can be calculated via a Monte Carlo simulation method (see [here](https://www.cbs.nl/en-gb/society/nature-and-environment/indices-and-trends--trim--/msi-tool) for details). ```r n <- 1000 max_cv <- 3 imputed_min <- .01 truncfac <- 10 base <- 1 nyrs <- 44 indices <- mutate( indices, cv = ifelse(imputed >= .1 & se_imp > 0, se_imp / imputed, NA_real_) ) indices <- group_by(indices, sp) indices <- mutate(indices, cv = mean(cv, na.rm = TRUE)) indices <- filter(indices, cv < max_cv) indices <- mutate(indices, imputed = pmax(imputed, imputed_min)) indices <- mutate(indices, se_imp = if_else(imputed > imputed_min, se_imp, 0)) indices <- mutate(indices, se_imp = se_imp / imputed) indices <- mutate(indices, imputed = log(imputed)) indices <- group_by(indices, time, sp) indices <- summarise( indices, mc = rnorm(n, imputed, se_imp), i = seq(n), .groups = "keep" ) indices <- mutate(indices, mc = pmax(mc, log(imputed_min))) indices <- group_by(indices, sp, i) indices <- arrange(indices, time) indices <- mutate( indices, mcf = pmax(pmin(lead(mc) - mc, log(truncfac)), log(1 / truncfac)) ) indices <- arrange(indices, -time) indices <- mutate( indices, mcb = pmax(pmin(lead(mc) - mc, log(truncfac)), log(1 / truncfac)) ) indices <- group_by(indices, i, time) index <- summarise( indices, mcf = mean(mcf, na.rm = TRUE), mcb = mean(mcb, na.rm = TRUE), .groups = "drop_last" ) index <- arrange(index, time) index <- mutate(index, mcf = cumsum(lag(lead(mcf, base - 1), base, 0))) index <- arrange(index, -time) index <- mutate( index, mcb = cumsum(lag(lead(mcb, nyrs - base), nyrs - base + 1, 0)) ) index <- group_by(index, time) index <- summarise( index, index = exp(mean(mcf)) * exp(mean(mcb)), se = sd(mcf) * exp(mean(mcf)) + sd(mcb) * exp(mean(mcb)) ) ``` ```r ggplot(index) + aes( x = parse_date_time(time, "Y"), y = index, ymin = index - se, ymax = index + se ) + geom_ribbon(alpha = .2) + geom_line() + ylab(NULL) + xlab(NULL) + theme_minimal() ``` ![Geometric mean of relative abundance of Farmland Breeding Birds in Finland (calculated using rTRIM)](figure/plot-multi-taxon-trim-1.png)