Introduction

The package forrescalc is developed to analyse tree measure data from Vandekerkhove et al. (2021) that are saved in a Fieldmap database. The package contains functions to aggregate the data to different levels. As the dataset is rather large and these calculations take a lot of time, the package also contains functions to save results in a git repository and retrieve them afterwards for further analysis or visualisation of results. Primary results of the aggregations are saved in the git repository forresdat, which is meant to be data source for aggregated data on forests. Calculations are done by using ID codes, and lookup tables for these codes are saved as separate tables in forresdat.

Installation

To install forrescalc from the INBO universe, start a new R session and run this code (before loading any packages):

# Enable the INBO universe
# (not needed for INBO employees, as this is the default setting)
options(
  repos = c(
    inbo = "https://inbo.r-universe.dev", CRAN = "https://cloud.r-project.org"
  )
)
# Install the package
install.packages("forrescalc")

To install forrescalc from GitHub, start a new R session and run this code (before loading any packages):

install.packages("remotes")
remotes::install_github("inbo/forrescalc", build_vignettes = TRUE)

Some functions require local access to the GitHub repository forresdat.

The package forrescalc can be used for different reasons:

  • keep the data repository forresdat up-to-date
  • load data from the Fieldmap database or forresdat to do visualisations or analyses
  • do similar aggregations on your own data
  • validate data from the Fieldmap database

The next part shortly describes some common tasks and mentions the functions that are useful for these tasks. For detailed information on the use of specific functions (including an example), we refer to the help of the function (accessible in R by typing the function name preceded by a ?, for instance ?load_data_dendrometry). Updating forresdat should be done following a strict routine, for other tasks it is up to the user to cherry-pick the desired functions.

Load data from Fieldmap database

With the calculations for forresdat in mind, we developed some functions to load specific data from the Fieldmap database: load_plotinfo(), load_data_dendrometry(), load_data_shoots(), load_data_deadwood(), load_data_regeneration(), load_data_vegetation() and load_data_herblayer(). After retrieving data from the database, they might calculate some additional variables that are needed for further analysis (e.g. they calculate year starting from the observation date). The output of these functions is ready to be used in the calculation functions.

To be able to calculate tree volumes starting from tree and shoot data, height models are available from git repository forresheights using function load_height_models().

To validate the Fieldmap database, package forrescalc also contains ‘check’-functions for different tables of the database, and an umbrella function check_data_fmdb() that runs all these functions and lists all missing data and wrong input.

Calculations (aggregations)

Umbrella functions

The calculations, which are in fact aggregations of very detailed data on tree level to plot or plot-species level, are grouped in 3 umbrella functions:

Each of these functions bundles different subfunctions for the same dataset(s), but with a result on a different level. calculate_dendrometry() performs for instance the subfunctions calc_dendro_plot(), calc_dendro_plot_species(), calc_deadw_decay_plot(), calc_deadw_decay_plot_species(), calc_diam_plot() and calc_diam_plot_species() (after calculating tree volumes using functions compose_stem_data(), calc_variables_stem_level(), calc_variables_tree_level() and calc_intact_deadwood()). The result of calculate_dendrometry() is a list of 6 dataframes, each being the result of one of the subfunctions. To update forresdat, all subfunctions have to be executed, so here it is easier to use these bundled functions. Users that only need one subfunction, can also use these subfunctions.

Subfunctions

Subfunctions that aggregate data, have a name starting with the abbreviation of the above mentioned high level functions (calc_dendro_, calc_regeneration_ or calc_vegetation_) and ending with their grouping variable(s). For instance a function that aggregates dendrometry data on the level of plot and year is named calc_dendro_plot(). Each of these functions makes this aggregation for different variables, for instance number of species, basal area and volume. These details are described in the help of the specific functions (e.g. ?calc_dendro_plot).

Subfunctions that not typically aggregate data, can have other names:

  • create_unique_tree_id() brings data from individual trees over different years together
  • add_zeros(): as absence is not reported explicitly, this function allows to add records with zero values for missing species, missing height classes,…
  • compare_periods_per_plot() compares parameters between periods (years that parameters are measured)
  • create_statistics() gives summary statistics for the specified variables on the specified level
  • calc_diam_statistics_species() gives the diameter distribution on the level of forest reserve, species and year

Additional functions

And there are also some functions that we thought to be useful (or user request functions) (these are not used to manipulate data for forresdat):

Save and retrieve data

NOTE: to be able to save data to forresdat or another git repository, one should first clone the repository to a local RStudio project. This can be done following the 4 steps described here (see also the 2 figures below the text). The https link to forresdat can be copied from its GitHub page.

With the workflow for the update of forresdat in mind, we developed 3 functions to save the results from the umbrella functions: save_results_forresdat() (to save the tables in forresdat or another git repository), save_results_access() (equivalent function to save the tables in an Access database) and save_results_csv() (equivalent function to save the tables as .csv). All three functions save the dataframes from the list as separate tables. They use the functionality of the git2rdata package.

Tables from git can be loaded again in R with the function read_forresdat_table(), and the whole ‘data package’ forresdat can be loaded using read_forresdat(). The result of the latter is in the frictionless data package format and it can be treated using the frictionless package.

Two other functions allow to copy tables from an Access database to a git repository and visa versa (from_access_to_forresdat() and from_forresdat_to_access()). The first one allows to add lookup tables to forresdat and the second one might be useful to copy tables from forresdat to a personal analysis database in Access.

Functions save_results_forresdat() and from_access_to_forresdat() save tables in your local copy of the git repository (grouped as 1 commit for each function, and with adapting the .json file that allows it to be used as a frictionless data package). Changes can be checked in the local repository (project forresdat, tab Git, button History) and pushed to the remote repository on GitHub. If you are not satisfied with the changes, you can remove the last commit with the function remove_last_commit_forresdat(), but only BEFORE YOU PUSHED THIS COMMIT TO THE REMOTE.

Update forresdat

To guarantee consistency, the data repository forresdat should be updated using a strict routine. This routine is written in a script called main.R that is in the root of the installed package (and in the inst folder of the git repository of the package forrescalc). In this script, paths to the Fieldmap database and the git repository should be adapted to the local situation. Evidently changes in the commits should be checked before pushing them to the remote, as is described above.

When removing a table from forresdat, this should only be done using function remove_table_forresdat() to make sure the .json file (that makes it a frictionless data package) is updated correctly.

References

Vandekerkhove K., Van de Kerckhove P., Leyman A., De Keersmaeker L., Lommelen E., Esprit M. and Goessens S., 2021. Monitoring programme on strict forest reserves in Flanders (Belgium): Methods and operational protocols: With an overview of the intensive monitoring sites. Reports of the Research Institute for Nature and Forest 2021(28). Research Institute for Nature and Forest, Brussels. https://doi.org/10.21436/inbor.38677490