The package forrescalc is developed to analyse tree measurement data stored in a Fieldmap database. It contains functions to aggregate the data to different levels. Because the dataset is rather large and these calculations take a lot of time, the package also contains functions to save results in a git repository and retrieve them afterwards for further analysis or visualisation. Primary results of the aggregations are saved in the git repository forresdat, which is meant to be a data source for aggregated data on forests. Calculations use ID codes, and the lookup tables for these codes are saved as separate tables in forresdat.
Installation of the package:
# install.packages("remotes")
remotes::install_github("inbo/forrescalc", build_vignettes = TRUE)
Some functions require local access to a clone of the GitHub repository forresdat.
The package forrescalc can be used for different reasons:
- to keep forresdat up-to-date
- to use data from forresdat to do visualisations or analyses
The next part briefly describes some common tasks and mentions the functions that are useful for them. For detailed information on the use of a specific function (including an example), we refer to its help page (accessible in R by typing the function name preceded by a ?, for instance ?load_data_dendrometry). Updating forresdat should follow a strict routine; for other tasks it is up to the user to cherry-pick the desired functions.
With the calculations for forresdat in mind, we developed some functions to load specific data from the Fieldmap database: load_data_dendrometry(), load_data_deadwood(), load_data_regeneration() and load_data_vegetation(). After retrieving data from the database, these functions may calculate some additional variables that are needed for further analysis (e.g. they calculate year starting from the observation date). The output of these functions is ready to be used in the calculation functions.
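As an illustration of such a derived variable, the following minimal base R sketch adds a year column to a table of observations. The data frame and its column names (plot_id, date_dendro) are made up for this example, and the rule used by the load functions themselves may differ from a plain calendar year.

```r
# Hypothetical observation records; column names are illustrative only
obs <- data.frame(
  plot_id = c(101, 101, 204),
  date_dendro = as.Date(c("2005-03-14", "2015-11-02", "2010-06-30"))
)

# Derive the calendar year from the observation date (assumed rule)
obs$year <- as.integer(format(obs$date_dendro, "%Y"))

obs
```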
The calculations, which are in fact aggregations of very detailed tree-level data to plot or plot-species level, are grouped in 3 umbrella functions: calculate_dendrometry(), calculate_regeneration() and calculate_vegetation().
Each of these functions bundles different subfunctions for the same dataset(s), but with a result on a different level. For instance, calculate_dendrometry() runs the subfunctions calculate_dendro_plot(), calculate_dendro_plot_species() and create_unique_tree_id(). The result of calculate_dendrometry() is a list of 3 dataframes, each being the result of one of the subfunctions. To update forresdat, all subfunctions have to be executed, so here it is easier to use these bundled functions. Users who need only one of the results can call the corresponding subfunction directly.
Subfunctions that aggregate data have a name starting with the name of one of the above-mentioned high-level functions or its abbreviation (calculate_dendro, calculate_regeneration or calculate_vegetation) and ending with their grouping variable(s). For instance, a function that aggregates dendrometry data on the level of plot and year is named calculate_dendro_plot(). Each of these functions makes this aggregation for different variables, for instance number of species, basal area and volume. These details are described in the help of the specific functions (e.g. ?calculate_dendro_plot).
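To make the idea of aggregating tree-level data to plot level concrete, here is a minimal base R sketch. It does not call forrescalc; the table, its column names and the species codes are hypothetical, and basal area is simply summed per plot without any conversion to a per-hectare value.

```r
# Hypothetical tree-level records (one row per tree)
trees <- data.frame(
  plot_id = c(1, 1, 1, 2, 2),
  year    = c(2010, 2010, 2010, 2010, 2010),
  species = c(7, 7, 16, 7, 87),
  dbh_mm  = c(300, 450, 200, 600, 150)
)

# Basal area per tree in square metres, from diameter in millimetres
trees$basal_area_m2 <- pi * (trees$dbh_mm / 2000)^2

# Aggregate to plot and year: number of species and summed basal area
by_plot <- merge(
  aggregate(species ~ plot_id + year, data = trees,
            FUN = function(x) length(unique(x))),
  aggregate(basal_area_m2 ~ plot_id + year, data = trees, FUN = sum)
)
names(by_plot)[names(by_plot) == "species"] <- "number_of_species"

by_plot
```

The result has one row per combination of plot and year, which is the level that the suffix of a name like calculate_dendro_plot() refers to.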
Subfunctions that do not typically aggregate data can have other names:
- create_unique_tree_id() brings together data from individual trees over different years
There are also some functions that we thought would be useful or that users requested (these are not used to manipulate data for forresdat):
- make_table_wide() changes a dataframe from long to wide format, which you might want to do with (a part of) the results from create_unique_tree_id()
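For readers unfamiliar with the long and wide layouts, the sketch below shows the same kind of transformation with base R's reshape() instead of make_table_wide(). The data and column names (tree_id, period, dbh_mm) are hypothetical; see ?make_table_wide for the arguments of the package function.

```r
# Hypothetical long table: one row per tree and measurement period
long <- data.frame(
  tree_id = c("a", "a", "b", "b"),
  period  = c(1, 2, 1, 2),
  dbh_mm  = c(300, 320, 150, 165)
)

# Wide table: one row per tree, one dbh column per period
wide <- reshape(long, idvar = "tree_id", timevar = "period",
                direction = "wide", sep = "_")

wide
```

Each tree now occupies a single row, with the diameter of each period in its own column (dbh_mm_1, dbh_mm_2).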
NOTE: to be able to save data to or retrieve data from forresdat or another git repository, first clone the repository to a local RStudio project. This can be done following the 4 steps described here (see also the 2 figures below the text). The HTTPS link to forresdat can be copied from its GitHub page.
With the workflow for the update of forresdat in mind, we developed 2 functions to save the results from the umbrella functions: save_results_git() (to save the tables in forresdat or another git repository) and save_results_access() (equivalent function to save the tables in an Access database). Both functions save the dataframes from the list as separate tables. save_results_git() and the other git-related functions use the functionality of the git2rdata package.
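The pattern of saving each dataframe of a list as a separate table can be sketched in base R as follows. This is only an analogy: it writes plain CSV files to a temporary folder rather than using git2rdata or creating a commit, and the list names and contents are hypothetical.

```r
# Hypothetical list of results, named after the tables they should become
results <- list(
  dendro_by_plot = data.frame(plot_id = c(1, 2), number_of_trees = c(12, 8)),
  dendro_by_plot_species = data.frame(
    plot_id = c(1, 1, 2),
    species = c(7, 16, 7),
    number_of_trees = c(9, 3, 8)
  )
)

# Stand-in for a local repository folder (assumption: a temporary directory)
out_dir <- file.path(tempdir(), "forresdat_sketch")
dir.create(out_dir, showWarnings = FALSE)

# Write every dataframe to its own file, using the list name as table name
for (table_name in names(results)) {
  write.csv(
    results[[table_name]],
    file.path(out_dir, paste0(table_name, ".csv")),
    row.names = FALSE
  )
}

list.files(out_dir)
```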
Tables from git can be loaded again in R with the function read_forresdat().
Two other functions allow copying tables from an Access database to a git repository and vice versa (from_access_to_git() and from_git_to_access()). The first one allows adding lookup tables to forresdat, and the second one might be useful to copy tables from forresdat to a personal analysis database in Access.
The functions save_results_git() and from_access_to_git() save tables in your local copy of the git repository (grouped as 1 commit for each function call). Changes can be checked in the local repository (project forresdat, tab Git, button History) and then pushed to the remote repository on GitHub. If you are not satisfied with the changes, you can remove the last commit with the function remove_last_commit_git(), but only BEFORE YOU HAVE PUSHED THIS COMMIT TO THE REMOTE.
Updating forresdat
To guarantee consistency, the data repository forresdat should be updated using a strict routine. This routine is written in a script called Main.R, which is in the root of the installed package (and in the inst folder of the git repository of the package forrescalc). In this script, the paths to the Fieldmap database and the git repository should be adapted to the local situation. As described above, the changes in the commits should be checked before pushing them to the remote.