Returns either the raw data source soilmap or (by default) the processed data source soilmap_simple as a standardized sf multipolygon layer (tidyverse-styled, internationalized) in the Belgian Lambert 72 CRS (EPSG-code 31370). Given the size of these data sources (especially the raw one), this function takes a bit longer than usual to run.

Usage

read_soilmap(
  file = file.path(locate_n2khab_data(),
  file_raw = file.path(locate_n2khab_data(), "10_raw/soilmap"),
  use_processed = TRUE,
  version_processed = "soilmap_simple_v2",
  standardize_coastalplain = FALSE,
  simplify = FALSE,
  explan = FALSE
)


Arguments

file

The absolute or relative file path of the processed data source soilmap_simple. Used only if use_processed = TRUE (the default). The default value follows the data management advice in the vignette on data storage (run vignette("v020_datastorage")). It uses the first n2khab_data folder that is found when sequentially climbing up 0 to 10 levels in the file system hierarchy, starting from the working directory.
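The folder lookup described for the default path can be pictured with a small base R sketch. This is not the code behind locate_n2khab_data(): the helper name find_data_folder() and its arguments are hypothetical and only mimic the documented behaviour (check the starting directory, then up to 10 parent directories).

```r
# Hypothetical helper, NOT the package's implementation: look for a folder
# called 'n2khab_data' in the start directory and in up to 10 parents.
find_data_folder <- function(start = getwd(),
                             name = "n2khab_data",
                             max_up = 10) {
  current <- normalizePath(start, winslash = "/", mustWork = TRUE)
  for (level in 0:max_up) {
    candidate <- file.path(current, name)
    if (dir.exists(candidate)) {
      return(candidate)
    }
    parent <- dirname(current)
    if (identical(parent, current)) break # reached the file system root
    current <- parent
  }
  stop("No '", name, "' folder found within ", max_up, " levels above ", start)
}

# Usage on a throwaway directory tree: the data folder sits two levels
# above the directory from which the search starts.
root <- file.path(tempdir(), "project_demo")
dir.create(file.path(root, "n2khab_data"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "src", "analysis"),
           recursive = TRUE, showWarnings = FALSE)
found <- find_data_folder(start = file.path(root, "src", "analysis"))
basename(found)
```

Returning the first match means that a data folder closer to the working directory takes precedence over one higher up.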


file_raw

Same as file, but defines the file path of the raw data source soilmap. Used only if use_processed = FALSE.


use_processed

Logical. If TRUE (the default), load and return the processed data source soilmap_simple, instead of the raw data source soilmap. The central layer of soilmap_simple can be manually generated by reading the raw soilmap data source with standardize_coastalplain = TRUE, simplify = TRUE and explan = FALSE, but this takes some time.


version_processed

Version ID of the soilmap_simple data source. Only used with use_processed = TRUE (the default). Defaults to the latest available version defined by the package.


standardize_coastalplain

Logical. Only applied with use_processed = FALSE. If TRUE, fill the values of the morphogenetic substrate, texture and drainage variables (bsm_mo_substr, bsm_mo_tex and bsm_mo_drain + their _explan counterparts) where possible, for features with a geomorphological soil type code. This largely applies to features in the 'coastal plain' area.

  • To derive morphogenetic texture and drainage levels from the geomorphological soil types, a conversion table by Bruno De Vos & Carole Ampe is applied (for earlier work on this, see Ampe 2013).

  • Substrate classes are copied over from bsm_ge_substr into bsm_mo_substr (bsm_ge_substr already follows the categories of bsm_mo_substr).

  • A logical variable is added to the output to mark conversions (see section Value).

These steps coincide with the approach that was taken to construct bsm_mo_soilunitype in the raw data source.
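The three steps above could look roughly as follows on a toy data frame. This is a sketch in base R, not the package's code: the soil type codes and the one-row conversion table are invented for illustration and do not reproduce the table by De Vos & Ampe.

```r
# Toy features. All codes below are invented for illustration only.
soil <- data.frame(
  bsm_soiltype    = c("Zbm", "m.A2", "m.B1"),
  bsm_ge_typology = c(FALSE, TRUE, TRUE),
  bsm_mo_substr   = rep(NA_character_, 3),
  bsm_ge_substr   = c(NA, "v", NA),
  bsm_mo_tex      = c("Z", NA, NA),
  bsm_mo_drain    = c("b", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical conversion table: geomorphological soil type to
# morphogenetic texture and drainage. "m.B1" is deliberately absent,
# to show that values are filled only where possible.
conversion <- data.frame(
  bsm_soiltype = "m.A2",
  tex = "U",
  drain = "d",
  stringsAsFactors = FALSE
)

# Step 1: derive texture and drainage from the conversion table
idx <- match(soil$bsm_soiltype, conversion$bsm_soiltype)
convertible <- soil$bsm_ge_typology & !is.na(idx)
soil$bsm_mo_tex[convertible] <- conversion$tex[idx[convertible]]
soil$bsm_mo_drain[convertible] <- conversion$drain[idx[convertible]]

# Step 2: copy substrate classes from bsm_ge_substr into bsm_mo_substr
has_ge_substr <- soil$bsm_ge_typology & !is.na(soil$bsm_ge_substr)
soil$bsm_mo_substr[has_ge_substr] <- soil$bsm_ge_substr[has_ge_substr]

# Step 3: mark the features whose texture and drainage were converted
soil$bsm_converted <- convertible

soil
```

In this toy case only the second feature is marked as converted; the third keeps missing morphogenetic values because its code has no entry in the (invented) table.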


simplify

Logical. Only applied with use_processed = FALSE. If TRUE, only a limited number of variables that are most useful for analytical work are returned.


explan

Logical, defaults to FALSE. Should the _explan variables accompanying bsm_mo_xxx variables be returned in the simplified result? If use_processed = FALSE, this only has an effect if simplify = TRUE (with simplify = FALSE, the _explan variables are always returned). If use_processed = TRUE, it is applied when returning soilmap_simple.
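As stated for use_processed, the central layer of soilmap_simple can be regenerated from the raw data source by combining these arguments. A minimal sketch of that call is given below, wrapped in a function so that nothing is read until it is called; the wrapper name read_raw_as_simple() is hypothetical.

```r
# Hypothetical wrapper, defined for illustration only. Calling it requires
# the n2khab package and the raw 'soilmap' data source in its default
# location, and it takes some time to run.
read_raw_as_simple <- function() {
  n2khab::read_soilmap(
    use_processed = FALSE,           # read the raw data source
    standardize_coastalplain = TRUE, # fill bsm_mo_* values where possible
    simplify = TRUE,                 # keep only the most useful variables
    explan = FALSE                   # drop the _explan counterparts
  )
}
```

In most cases the default read_soilmap() call is preferable, since it reads the already processed data source.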


Value

A Simple feature collection of geometry type MULTIPOLYGON, representing either the processed data source soilmap_simple (default) or the raw data source soilmap.

Besides the standardization for the coastal plain areas, soilmap_simple contains only a subset of the soilmap variables (marked with an asterisk below).

The soilmap attribute variables all start with prefix bsm_ (referring to the 'Belgian soil map'), in order to distinguish from similar attributes derived from other maps or field observations.

Most attributes represent categories and are returned as factors. When a variable is a one-to-one translation of another (e.g. code vs. explanation), the order of factor levels is aligned.
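What aligned factor levels mean in practice can be shown with a small base R sketch. The codes and explanations below are invented for illustration and are not actual soil map categories.

```r
# Invented lookup of codes and their explanations (illustration only)
lookup <- data.frame(
  code   = c("k1", "k2", "k3"),
  explan = c("first class", "second class", "third class"),
  stringsAsFactors = FALSE
)

codes <- c("k3", "k1", "k1", "k2")

# Both factors take their level order from the same lookup table
code_f   <- factor(codes, levels = lookup$code)
explan_f <- factor(lookup$explan[match(codes, lookup$code)],
                   levels = lookup$explan)

# Aligned level order implies identical underlying integer codes
identical(as.integer(code_f), as.integer(explan_f))
```

With aligned levels, sorting or tabulating by the code gives the same order as doing so by the explanation.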

Three types of data frame variables are returned when reading soilmap:

  • variables with mo_ in their name: their categories follow the Belgian Morphogenetic System.

    • With standardize_coastalplain = FALSE, these are only available outside the coastal plain areas except for bsm_mo_soilunitype (which is standardized already in the raw data source).

  • variables with ge_ in their name: their categories follow the Belgian Geomorphological System. (Note however, that bsm_ge_substr does follow the Belgian Morphogenetic System as well.)

    • Values are typically available within the coastal plain areas, but some geomorphological soil types (starting with letter O) have a wider distribution across Flanders.

    • They are not included in soilmap_simple.

    • A special variable is bsm_ge_typology, which is TRUE if bsm_soiltype follows the geomorphological typology, and FALSE otherwise.

  • variables without mo_ or ge_ in their name are:

    • either system-agnostic metadata (first two + last four variables: bsm_poly_id, bsm_map_id, bsm_map_url, bsm_book_url, bsm_detailmap_url, bsm_profloc_url),

    • or mixed (representing mo_ categories outside and ge_ categories within coastal plains): the other ones, like bsm_region, bsm_legend, bsm_soiltype and bsm_soilseries.

    • A special variable is bsm_converted, returned only if standardize_coastalplain = TRUE.

Many variables have a 'counterpart variable' with suffix _explan: they provide a more elaborate textual explanation. They are not listed below.

Short explanation of attributes is given below. More elaborate explanations can be found in the references and in metadata at DOV.

  1. Meaning of the main non-metadata variables:

    • bsm_region (*): name of the region

    • bsm_ge_region: code of the region within the coastal plain area

    • bsm_legend: generalised (simplified) legend key (37 levels)

    • bsm_legend_title and bsm_legend_explan: the legend keys and text of Van Ranst & Sys (2000) (833 and 622 levels, respectively)

    • bsm_soiltype: the soil type of the Belgian soil map (mixed nature: morphogenetic & geomorphological codes). bsm_soiltype_id represents a numeric code for each level.

    • bsm_ge_typology: Logical. Does the soiltype code follow the geomorphological typology?

    • bsm_soiltype_region: bsm_soiltype, followed by a code representing bsm_region

    • bsm_soilseries: either the morphogenetic soil series (outside the coastal plain areas), which is the three core characters of bsm_soiltype, or just bsm_soiltype if the latter has a geomorphological code.

    • bsm_converted (*): Logical. Were morphogenetic texture and drainage variables (bsm_mo_tex and bsm_mo_drain) derived from a conversion table? This is equivalent to the question: does bsm_mo_soilunitype differ from bsm_soiltype? Value TRUE is largely confined to the 'coastal plain' areas. Only returned if standardize_coastalplain = TRUE. (Note: the variable is not included in version soilmap_simple_v1.)

    • bsm_mo_soilunitype (*): as bsm_soiltype, but applying morphogenetic codes within the coastal plain areas in most cases (see the standardize_coastalplain argument for more information about this conversion)

    • bsm_mo_substr (*), bsm_ge_substr: code of the soil substrate

    • bsm_mo_tex (*): code of the soil texture category

    • bsm_mo_drain (*): code of the soil drainage category

    • bsm_mo_prof (*): code of the soil profile category

    • bsm_mo_parentmat (*): code of a variant regarding the parent material

    • bsm_mo_profvar (*): code of a variant regarding the soil profile

    • bsm_mo_phase: code of the soil phase (i.e. additional soil properties). They are explained in the book that accompanies the specific analog map identified by bsm_map_id.

    • bsm_ge_series: the geomorphological soil series

    • bsm_ge_subseries: the geomorphological soil subseries

  2. Meaning of the metadata variables:

    • bsm_poly_id (*): unique polygon ID (numeric)

    • bsm_map_id: code of the analog map covering this area

    • bsm_map_url: hyperlink to the scanned analog map scale 1:20000 (pdf), identified by bsm_map_id

    • bsm_book_url: hyperlink to the scanned book (pdf), accompanying the analog map identified by bsm_map_id

    • bsm_detailmap_url: hyperlink to the scanned maps at scale 1:5000 (zip-file with jpg files) belonging to the map identified by bsm_map_id

    • bsm_profloc_url: hyperlink to the scanned maps with the profile locations (zip-file with jpg files) belonging to the map identified by bsm_map_id

(*) Included in the soilmap_simple data source.
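The stated equivalence between bsm_converted and a difference between bsm_mo_soilunitype and bsm_soiltype can be checked with a few lines of base R. The sketch below uses a toy data frame with invented codes; check_converted() is a hypothetical helper. On real data it would apply to the raw data source read with standardize_coastalplain = TRUE, since bsm_soiltype is not part of soilmap_simple.

```r
# Hypothetical helper: TRUE if bsm_converted flags exactly those features
# whose morphogenetic soil unit type differs from the original soil type.
check_converted <- function(x) {
  differs <- as.character(x$bsm_mo_soilunitype) != as.character(x$bsm_soiltype)
  all(differs == x$bsm_converted, na.rm = TRUE)
}

# Toy features with invented codes (illustration only)
toy <- data.frame(
  bsm_soiltype       = c("Zbm", "m.A2"),
  bsm_mo_soilunitype = c("Zbm", "Udp"),
  bsm_converted      = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

check_converted(toy)
```

The as.character() calls make the comparison work for factors too, which is how most attributes are returned.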


Details

The raw data source is published at DOV (Databank Ondergrond Vlaanderen) and is discussed by Van Ranst & Sys (2000) and Dudal et al. (2005). A 'pure' (single) data format of the raw data source (no metadata files etc.) has also been stored (with versioning) at Zenodo (doi:10.5281/zenodo.3387007), which we refer to as the soilmap data source, in order to support the read_soilmap() function and to sustain long-term workflow reproducibility.

The processed data source soilmap_simple is a GeoPackage, available at Zenodo.

Note that factors are generated with implicit NA values (i.e. there is no factor level to represent the missing values). If you want this category to appear in certain results, you can convert such variables with forcats::fct_explicit_na().
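The difference between implicit and explicit missing values can be illustrated on a toy factor. The sketch below deliberately uses base R (addNA()) rather than the forcats function named above, so that it runs without extra packages; recent forcats versions also offer fct_na_value_to_level() for this purpose. The level names and the "(Missing)" label are chosen for illustration only.

```r
# Toy factor with implicit NA values: NA is not one of the levels
drain <- factor(c("a", NA, "d", NA), levels = c("a", "d"))
table(drain) # the missing values are silently left out

# Make the missing values an explicit level (base R approach)
drain_explicit <- addNA(drain)
levels(drain_explicit)[is.na(levels(drain_explicit))] <- "(Missing)"
table(drain_explicit) # now includes a "(Missing)" category
```

After the conversion, the missing category shows up in tables, plots and grouped summaries like any other level.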

In case the raw data source soilmap is used (use_processed = FALSE), it is possible to manually perform the standardization for coastal plain features and/or the simplification, both of which were applied in the soilmap_simple data source. See Arguments for more information.

See R-code in the n2khab-preprocessing repository for the creation of the soilmap_simple data source from the soilmap data source.


References

  • Ampe C. (2013). Databank aardewerk Vlaanderen 2010. Omzetten (zeer) oude legende bodemkartering naar legende bodemkaart Kuststreek. Vlaamse Landmaatschappij Regio West, Bruges, 45 p.

  • Dudal R., Deckers J., Van Orshoven J. & Van Ranst E. (2005). Soil survey in Belgium and its applications. In: Bullock P., Jones R.J.A., Montanarella L. (editors). Soil Resources of Europe. Office for Official Publications of the European Communities, Luxembourg, p. 63–71. URL:

  • Van Ranst E. & Sys C. (2000). Eenduidige legende van de digitale bodemkaart van Vlaanderen (schaal 1: 20000). Universiteit Gent, Laboratorium voor Bodemkunde, Ghent, 361 p. URL:

See also

Other functions returning environmental data sets: read_shallowgroundwater(), read_watercourse_100mseg(), read_watersurfaces()


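Examples

if (FALSE) {
# This example supposes that your working directory or a directory up to 10
# levels above has the 'n2khab_data' folder AND that the latest version of
# the 'soilmap_simple' data source is present in the default subdirectory.
# In all other cases, this example won't work but at least you can
# consider what to do.

library(n2khab)
library(dplyr)

soilmap_simple <- read_soilmap()

# Assumption: the filter condition and the final step of each pipeline are
# not given in full; a non-missing substrate check and glimpse() are used
# here as stand-ins.
soilmap_simple %>%
  filter(!is.na(bsm_mo_substr)) %>%
  glimpse()

# Features whose texture and drainage were derived from the conversion table
soilmap_simple %>%
  filter(bsm_converted) %>%
  glimpse()
}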