---
title: "Configuration"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Configuration}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
url <- paste0(
  "https://raw.githubusercontent.com/luomus/fin-biodiv-indicators/%s",
  "/config.yml"
)
fb_rel <- "https://luomus.github.io/finbif/"
fb_dev <- "https://finbif-docs-dev.netlify.app/"
```
The FinBIF Biodiversity Indicators Service (FBI) is configured using a
[`YAML`](https://yaml.org/spec/1.2.2/ "YAML Spec 1.2.2") file. The config
file can be accessed and edited
[here](`r sprintf(url, "main")`){.pkgdown-release}[here
](`r sprintf(url, "dev")`){.pkgdown-devel} and is stored in the FBI
code repository. The FBI service uses the "live" editable version of the config
file. However, the source version should be kept as up to date as possible and
treated as a backup. When the service application is initialised, the source
version of the config file is used as the starting configuration.
Each multi-taxon indicator has its own object in the configuration file. For
example, the Farmland Breeding Bird indicator, and the single-taxon indicators
that belong to it, are configured using the parameters outlined below,
where the top-level key, `flb`, is the short-code of the indicator.
```yaml
flb:
  name: Farmland birds
  taxa:
    - code: MX.27328
      binomial: Crex crex
      start_year: 2006
    - code: MX.27527
      binomial: Vanellus vanellus
    - code: MX.27613
      binomial: Numenius arquata
    - code: MX.32065
      binomial: Alauda arvensis
    - code: MX.32132
      binomial: Hirundo rustica
    - code: MX.32163
      binomial: Delichon urbicum
    - code: MX.32213
      binomial: Anthus pratensis
    - code: MX.32949
      binomial: Saxicola rubetra
    - code: MX.33117
      binomial: Turdus pilaris
    - code: MX.33936
      binomial: Sylvia communis
    - code: MX.37142
      binomial: Corvus monedula
    - code: MX.36817
      binomial: Sturnus vulgaris
    - code: MX.36589
      binomial: Passer montanus
      start_year: 2006
    - code: MX.35154
      binomial: Emberiza hortulana
  filters:
    date_range_ymd:
      - "1979-01-01"
      - ""
    location_tag: farmland
    collection:
      - HR.157
      - HR.61
  surveys:
    selection:
      - document_id
      - location_id
      - year
      - month
      - day
    has_value:
      - document_id
      - location_id
      - year
      - month
      - day
  counts:
    abundance: pair_abundance
    selection:
      - document_id
      - pair_abundance
    has_value:
      - document_id
      - pair_abundance
  combine: geometric_mean
  use_data_after: "10-01"
  model:
    trim:
      base_year: 2000
      surveys_process:
        - pick_first_survey_in_year
        - require_two_years
      counts_process:
        - zero_fill
        - sum_by_event
        - set_start_year
        - remove_all_zero_locations
```
## Defaults
Default parameters that apply to all indicators can be specified in a special
object with the top-level key, `default` (see below). Defaults can be
overridden by specifying different parameter values in the indicator objects.
``` yaml
default:
  surveys:
    selection:
      - document_id
      - location_id
      - year
      - month
      - day
    has_value:
      - document_id
      - location_id
      - year
      - month
      - day
  counts:
    abundance: abundance
  use_data_after: "01-01"
```
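
To illustrate with the two examples in this document: the `flb` object sets its
own `counts.abundance` and `use_data_after`, so those values replace the
defaults, while any parameter that `flb` leaves unset falls back to `default`.
The fragment below is only a sketch of the resulting effective values for those
two keys. It is not a file the service reads, and how other nested keys are
merged is not covered here.

```yaml
# Effective values for the flb indicator (illustrative only)
flb:
  counts:
    abundance: pair_abundance   # replaces the default "abundance"
  use_data_after: "10-01"       # replaces the default "01-01"
```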
## Sections
The following describes each section of the multi-taxon indicator objects in
more detail.
### Name
A string giving the long-form name of a multi-taxon indicator.
### Taxa
An array defining the taxa that are included in the multi-taxon indicator and
for which single-taxon indicators will be calculated.
Each element of the array must include:
- `code`: a string indicating a FinBIF taxon MX code (e.g., `MX.27328`).
- `binomial`: a string indicating a scientific taxon name (e.g., `Crex crex`).
And optionally include one or more of:
- `extra_codes`: an array of FinBIF taxon MX codes to include along with the
nominal taxon declared with `code`.
- `subtaxa`: a boolean indicating whether to include observations of the child
taxa of the taxa declared with `code` and `extra_codes`.
- `start_year`: an integer indicating the first year of observation data to use
for the taxon. Observation data from before this year will be excluded. Note
that if `start_year` is greater than `base_year` (see section
[Model](#model "Section model")) then `base_year` for the taxon will be set
to `start_year`.
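
A taxon entry that uses all of the optional keys might look like the following
sketch. The code under `extra_codes` is a made-up placeholder, not a real FinBIF
identifier, and the values of `subtaxa` and `start_year` are chosen purely for
illustration.

```yaml
taxa:
  - code: MX.27328        # nominal taxon
    binomial: Crex crex
    extra_codes:
      - MX.0000000        # hypothetical placeholder code
    subtaxa: true         # also include observations of child taxa
    start_year: 2006      # exclude observations from before 2006
```

With a model `base_year` of 2000, as in the Farmland Breeding Bird example,
the `base_year` for this taxon would be set to 2006 because its `start_year` is
later.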
### Extra taxa
An array defining the taxa that are not included in the parent multi-taxon
indicator but for which single-taxon indicators will be calculated.
Each element of the array is configured in the same way as
[`taxa`](#taxa "Section taxa") above.
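
As a sketch, and assuming the key is written `extra_taxa` (inferred from the
section title, in the same way that `use_data_after` follows its title), a taxon
that gets its own single-taxon indicator without contributing to the
multi-taxon indicator could be declared alongside `taxa`. The short-code `xyz`
and the split of species between the two arrays are illustrative only.

```yaml
xyz:                              # hypothetical indicator short-code
  taxa:
    - code: MX.27527
      binomial: Vanellus vanellus
  extra_taxa:                     # key name is an assumption
    - code: MX.32065
      binomial: Alauda arvensis
```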
### From
Multi-taxon indicators can have no constituent single-taxon indicators. In this
case, the `from` section indicates which other multi-taxon indicator (in the
form of a short-code) the input data comes from.
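
For example, an indicator that reuses the input data of the Farmland Breeding
Bird indicator could be sketched as below. The short-code `xyz`, its name, and
the choice of `combine` method are hypothetical; only `from: flb` illustrates
the mechanism described here.

```yaml
xyz:                          # hypothetical short-code
  name: Example indicator     # hypothetical name
  from: flb                   # take input data from the flb indicator
  combine: overall_abundance  # illustrative choice
```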
### Filters
The `filters` section is an object defining a list of filters to apply to the
observation data from FinBIF. See the finbif R package
[documentation
](`r fb_rel`reference/filters.html "finbif docs"){.pkgdown-release}[
documentation
](`r fb_dev`reference/filters.html "finbif docs"){.pkgdown-devel}
for details.
### Surveys
The `surveys` section defines which fields are selected when getting survey data
from FinBIF. Fields are selected with the key `selection` as an array (see the
finbif R package [documentation
](`r fb_rel`reference/variables.html "finbif docs"){.pkgdown-release}[
documentation
](`r fb_dev`reference/variables.html "finbif docs"){.pkgdown-devel}
for available fields). Under a second key, `has_value`, an array indicates which
fields must not have null values when filtering records.
### Counts
The `counts` section defines which fields are selected when accessing count data
from FinBIF. This section uses the `selection` and `has_value` keys in the
same manner as [`surveys`](#surveys "Section surveys") above.
An additional key, `abundance`, is used to indicate which field is used as the
abundance (counts) data.
### Combine
The section `combine` defines how single-taxon indicator data is combined to
form a multi-taxon indicator. Options include:
- `geometric_mean`: combines relative abundances as their geometric mean.
- `cti`: combines abundances as a community temperature index.
- `overall_abundance`: combines abundances as the total abundance for the taxon
group.
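
For reference, the textbook geometric mean of $n$ positive single-taxon index
values $I_{1,t}, \dots, I_{n,t}$ in year $t$ is given below. This is the general
definition only; the notation is introduced here for illustration, and the
exact calculation used by the service (for instance, how missing or zero values
are treated) is not specified in this document.

$$
G_t = \left( \prod_{i=1}^{n} I_{i,t} \right)^{1/n}
    = \exp\left( \frac{1}{n} \sum_{i=1}^{n} \ln I_{i,t} \right)
$$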
### Use data after
The section `use_data_after` defines the calendar date (in the form `"MM-DD"`)
from which data collected during the current year are included in the
indicator.
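
The two values used in the examples in this document can be read as follows.
The comments are an interpretation of the description above, and whether the
given date itself is included is an assumption not confirmed here.

```yaml
default:
  use_data_after: "01-01"   # current-year data are included from 1 January
flb:
  use_data_after: "10-01"   # current-year data are included from 1 October
```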
### Model
The `model` section defines how single- or multi-taxon indicators are calculated
from the survey and count data sourced from FinBIF. The section consists of one
or more model objects where the object keys indicate the model being used.
Models include:
- `trim`: Trends and Indices for Monitoring data model (via
[rtrim](https://github.com/SNStatComp/rtrim "rtrim GitHub repository"))
- `rbms`: Generalised abundance indices for butterfly monitoring count data
(via [rbms](https://retoschmucki.github.io/rbms/ "rbms"))
- `lmer`: Linear Mixed Effects Regression (via
[lme4](https://github.com/lme4/lme4/ "lme4 GitHub repository"))
Each element of the `model` section must include:
- `surveys_process`: an array of survey data processing functions (see
[processing](../reference/index.html#processing-functions "Process functions"))
- `counts_process`: an array of count data processing functions (see
[processing](../reference/index.html#processing-functions "Process functions"))
And optionally include:
- `base_year`: an integer indicating the base year of the indicator.
- `args`: additional arguments passed to the modelling function.
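
A model object that passes extra arguments might be sketched as follows.
`changepoints`, `overdisp` and `serialcor` are arguments of `rtrim::trim()`, but
whether the service forwards them in exactly this form is an assumption. The
processing functions are those from the Farmland Breeding Bird example.

```yaml
model:
  trim:
    base_year: 2000
    args:                   # assumed to be passed on to rtrim::trim()
      changepoints: all
      overdisp: true
      serialcor: true
    surveys_process:
      - pick_first_survey_in_year
      - require_two_years
    counts_process:
      - zero_fill
      - sum_by_event
      - set_start_year
      - remove_all_zero_locations
```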