# Coordinate operations and maps in R

This tutorial shows how to do coordinate transformations and conversions in R. It also demonstrates how spatial data can be mapped with ggplot2 and mapview.

## Introduction and setup

This tutorial originated while preparing an R-demonstration during a GIS club at INBO on 16 September 2021. (Further material of the GIS club is available here.)

A straightforward approach for transforming spatial data is available in another tutorial. The current tutorial tackles a few extra aspects.

The tutorial assumes you have some pre-existing knowledge:

• Basic knowledge about coordinate reference systems (CRSs) and geodetic datums; see another tutorial and the references and links therein.
• Knowing how to read geospatial files with `sf`. There is another tutorial demonstrating some aspects.

In this tutorial we will apply conversions (without datum shift) and transformations (with datum shift) to `sf` and `sfc` objects. In the background, these operations use the PROJ library.

Let’s load the needed R packages and prepare a more appropriate theme for mapping with `ggplot2`:

``````library(dplyr)
library(sf)
library(ggplot2)
theme_set(theme_bw())
theme_update(panel.grid = element_line(colour = "grey80"))
library(mapview)
``````

Set up input data if needed:

``````if (!file.exists(file.path(gisclubdata_path, "gemeenten_belgie.shp"))) {
file.path(tempdir(), "data_gisclub.zip"))
unzip(file.path(tempdir(), "data_gisclub.zip"),
exdir = tempdir())
list.files(file.path(tempdir(), "data_gisclub"), full.names = TRUE) %>%
file.copy(gisclubdata_path, recursive = TRUE)
}
``````

The code assumes that you have a `gisclubdata_path` object defined (directory path as a string).

## Joining and mapping polygon layers

For a given `municipalities` and `watercourses` dataset, we would like to append the municipality name to the watercourses by making a spatial join.

### Reading data and trying to join

Let’s read the data file of Belgian municipalities:

``````path_municipalities <- file.path(gisclubdata_path, "gemeenten_belgie.shp")
``````

It looks like this:

``````municipalities
``````
``````Simple feature collection with 581 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XYZ
Bounding box:  xmin: 2.541331 ymin: 49.49695 xmax: 6.408098 ymax: 51.50511
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84
# A tibble: 581 × 4
T_MUN_NL        Shape_Leng Shape_Area                                geometry
<chr>                <dbl>      <dbl>                      <MULTIPOLYGON [°]>
1 ’s Gravenbrakel      0.759    0.0108  Z (((4.095037 50.66599 0, 4.095101 50.…
2 Aalst                0.676    0.0101  Z (((4.054666 50.99444 0, 4.054666 50.…
3 Aalter               0.735    0.0154  Z (((3.41099 51.15987 0, 3.41275 51.15…
4 Aarlen               0.951    0.0148  Z (((5.860749 49.72667 0, 5.860889 49.…
5 Aarschot             0.640    0.00807 Z (((4.927684 51.03717 0, 4.92797 51.0…
6 Aartselaar           0.265    0.00141 Z (((4.401293 51.14814 0, 4.400461 51.…
7 Aat                  0.977    0.0163  Z (((3.756864 50.69743 0, 3.757621 50.…
8 Affligem             0.261    0.00229 Z (((4.114181 50.92559 0, 4.115272 50.…
9 Aiseau-Presles       0.393    0.00284 Z (((4.584122 50.43827 0, 4.584458 50.…
10 Alken                0.336    0.00361 Z (((5.275147 50.9102 0, 5.275645 50.9…
# … with 571 more rows
``````

It appears that the points have three coordinate dimensions (`XYZ`). However the features have only two useful dimensions:

``````st_dimension(municipalities)
``````
``````  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[38] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[149] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[186] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[223] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[260] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[297] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[334] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[371] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[408] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[445] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[482] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[519] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[556] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
``````

Some tools – e.g. spherical geoprocessing – expect XY data. So we will drop the Z dimension with `st_zm()`. Meanwhile we select the only necessary attribute and rename it, using `select()`.

``````municipalities <-
municipalities %>%
select(municipality = T_MUN_NL) %>%
st_zm()
``````

We perform similar operations on the `watercourses` dataset. Here, data are already defined as two dimensions.

``````path_watercourses <- file.path(gisclubdata_path, "watergangen_antw.shp")
``````
``````watercourses
``````
``````Simple feature collection with 7258 features and 14 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 138050.5 ymin: 201644.5 xmax: 162761.3 ymax: 230509
CRS:           NA
# A tibble: 7,258 × 15
OIDN   UIDN VERSIE BEGINDATUM VERSDATUM   VHAG NAAM       OPNDATUM   BGNINV
<dbl>  <dbl>  <int> <date>     <date>     <dbl> <chr>      <date>      <int>
1 205462 492073      3 2017-07-05 2021-05-06  3320 nvt        2017-06-20     13
2 179828 488809      2 2015-01-20 2021-05-06    -9 nvt        2014-12-02      4
3 223031 499842      2 2018-03-28 2021-05-06 54710 Kallebeek… 2018-02-27      4
4 267383 462960      2 2020-02-22 2021-05-06    -9 nvt        2020-01-08      4
5  72361 481073      3 2010-11-08 2021-05-06 65101 nvt        2020-06-22      6
6 223073 499879      3 2018-03-28 2021-05-06    -9 nvt        2018-02-27     13
7  72079 480970      3 2010-11-08 2021-05-06 53460 nvt        2016-06-20      3
8 145811 567202      3 2012-11-20 2021-07-07    -8 ng         2012-10-19     11
9 205453 492064      3 2017-07-05 2021-05-06    -9 nvt        2017-06-20     10
10 269842 463401      2 2020-04-22 2021-05-06    -9 nvt        2020-04-03      4
# … with 7,248 more rows, and 6 more variables: LBLBGNINV <chr>, LENGTE <dbl>,
#   OPPERVL <dbl>, Shape_Leng <dbl>, Shape_Area <dbl>, geometry <POLYGON>
``````

Note that not all columns are printed. Let’s preview all columns with `glimpse()`:

``````glimpse(watercourses)
``````
``````Rows: 7,258
Columns: 15
\$ OIDN       <dbl> 205462, 179828, 223031, 267383, 72361, 223073, 72079, 14581…
\$ UIDN       <dbl> 492073, 488809, 499842, 462960, 481073, 499879, 480970, 567…
\$ VERSIE     <int> 3, 2, 2, 2, 3, 3, 3, 3, 3, 2, 1, 2, 2, 1, 2, 1, 2, 3, 2, 3,…
\$ BEGINDATUM <date> 2017-07-05, 2015-01-20, 2018-03-28, 2020-02-22, 2010-11-08…
\$ VERSDATUM  <date> 2021-05-06, 2021-05-06, 2021-05-06, 2021-05-06, 2021-05-06…
\$ VHAG       <dbl> 3320, -9, 54710, -9, 65101, -9, 53460, -8, -9, -9, -9, -9, …
\$ NAAM       <chr> "nvt", "nvt", "Kallebeekgeul", "nvt", "nvt", "nvt", "nvt", …
\$ OPNDATUM   <date> 2017-06-20, 2014-12-02, 2018-02-27, 2020-01-08, 2020-06-22…
\$ BGNINV     <int> 13, 4, 4, 4, 6, 13, 3, 11, 10, 4, 1, 4, 4, 1, 4, 4, 1, 13, …
\$ LBLBGNINV  <chr> "aanmaak uniek percelenplan\r\n", "bijhouding binnengebiede…
\$ LENGTE     <dbl> 570.19, 185.33, 1491.01, 307.60, 286.06, 270.32, 48.11, 17.…
\$ OPPERVL    <dbl> 2492.23, 494.77, 9350.36, 978.46, 432.97, 405.56, 118.29, 1…
\$ Shape_Leng <dbl> 570.18844, 185.32549, 1491.01243, 307.59691, 286.06075, 270…
\$ Shape_Area <dbl> 2492.22527, 494.77446, 9350.36080, 978.46010, 432.97004, 40…
\$ geometry   <POLYGON> POLYGON ((141976.5 222453.1..., POLYGON ((142200.6 2050…
``````

We only select some:

``````watercourses <-
watercourses %>%
select(polygon_id = OIDN,
vhag_code = VHAG,
watercourse = NAAM)
``````
``````watercourses
``````
``````Simple feature collection with 7258 features and 3 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 138050.5 ymin: 201644.5 xmax: 162761.3 ymax: 230509
CRS:           NA
# A tibble: 7,258 × 4
polygon_id vhag_code watercourse                                     geometry
<dbl>     <dbl> <chr>                                          <POLYGON>
1     205462      3320 nvt           ((141976.5 222453.1, 141973.1 222449.6, 1…
2     179828        -9 nvt           ((142200.6 205085.9, 142197.3 205085, 142…
3     223031     54710 Kallebeekgeul ((146874.6 203777.1, 146879.3 203775.9, 1…
4     267383        -9 nvt           ((148997 209136.1, 148996.7 209136.1, 148…
5      72361     65101 nvt           ((141780.9 213697.8, 141780.7 213695.3, 1…
6     223073        -9 nvt           ((144377.8 205161.2, 144408.7 205152.8, 1…
7      72079     53460 nvt           ((140094.1 210962.7, 140093.8 210962.5, 1…
8     145811        -8 ng            ((149004.6 209659.9, 149002.5 209666.4, 1…
9     205453        -9 nvt           ((139682 225241.4, 139697.5 225222, 13971…
10     269842        -9 nvt           ((155762.1 207273.7, 155766.5 207273, 155…
# … with 7,248 more rows
``````

In the above prints of the `municipalities` and `watercourses` objects, you could already see their coordinate reference system (CRS). With `st_crs()` you can extract the `crs` object; it is printed as:

``````st_crs(municipalities)
``````
``````Coordinate Reference System:
User input: WGS 84
wkt:
GEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["latitude",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["longitude",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]]
``````

The CRS name ‘WGS 84’ can also be returned by `st_crs(municipalities)\$Name`. As seen from the WKT string, this is a geographical CRS.

However for `watercourses` the CRS is missing:

``````st_crs(watercourses)
``````
``````Coordinate Reference System: NA
``````

Can we make the spatial join already?

``````st_join(watercourses, municipalities) %>% try
``````
``````Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared,  :
st_crs(x) == st_crs(y) is not TRUE
``````

Fortunately, this doesn’t work: `st_join()` requires its inputs to belong to the same CRS.

Well, we know that `watercourses` is in the projected CRS `EPSG:31370` (BD72 / Belgian Lambert 72), so we can just update it:

``````st_crs(watercourses) <- "EPSG:31370"
``````

Since the CRSs are still different, this should not be sufficient for the join.

``````st_join(watercourses, municipalities) %>% try
``````
``````Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared,  :
st_crs(x) == st_crs(y) is not TRUE
``````

Indeed!

Let’s transform `municipalities` to `EPSG:31370`:

``````municipalities_31370 <- st_transform(municipalities, "EPSG:31370")
``````

And now…

``````(watercourses_comm <- st_join(watercourses, municipalities_31370))
``````
``````Simple feature collection with 7502 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 138050.5 ymin: 201644.5 xmax: 162761.3 ymax: 230509
Projected CRS: BD72 / Belgian Lambert 72
# A tibble: 7,502 × 5
polygon_id vhag_code watercourse                        geometry municipality
*      <dbl>     <dbl> <chr>                         <POLYGON [m]> <chr>
1     205462      3320 nvt           ((141976.5 222453.1, 141973.… Beveren
2     179828        -9 nvt           ((142200.6 205085.9, 142197.… Kruibeke
3     223031     54710 Kallebeekgeul ((146874.6 203777.1, 146879.… Kruibeke
4     267383        -9 nvt           ((148997 209136.1, 148996.7 … Antwerpen
5      72361     65101 nvt           ((141780.9 213697.8, 141780.… Beveren
6     223073        -9 nvt           ((144377.8 205161.2, 144408.… Kruibeke
7      72079     53460 nvt           ((140094.1 210962.7, 140093.… Beveren
8     145811        -8 ng            ((149004.6 209659.9, 149002.… Antwerpen
9     205453        -9 nvt           ((139682 225241.4, 139697.5 … Beveren
10     269842        -9 nvt           ((155762.1 207273.7, 155766.… Mortsel
# … with 7,492 more rows
``````

Succeeded!

Two things to note here:

• `st_join()` has an argument ‘`join`’ that defines the type of topological relationship that is used to either join or not join a geometry from the first layer with a geometry from the second. Various so-called ‘binary predicates’ can be used to define this relationship (see `?st_join`), but the default one is `st_intersects()`, i.e. an intersection as defined by the ‘DE-9IM’ model of binary geometry relations.
• Since several watercourses intersect more than one municipality, several of them are repeated in `watercourses_comm`, each with another values for `municipality`.

We chose to make the spatial join in the `EPSG:31370` CRS. However, we can equally do it on a spherical model of the Earth, when using the geographical coordinates. We expect the resulting joined attributes to be equivalent. Let’s check:

``````st_join(st_transform(watercourses, "EPSG:4326"), municipalities) %>%
st_drop_geometry %>%
all.equal(watercourses_comm %>% st_drop_geometry)
``````
``````[1] TRUE
``````

### Mapping

While it has only one attribute and 581 rows, the object size of `municipalities_31370` is relatively large because its polygons have numerous vertices:

``````object.size(municipalities_31370) %>% format(units = "MB")
``````
``````[1] "22.1 Mb"
``````

This makes a simple map take long to render in R (not shown here; just try yourself):

``````ggplot(data = municipalities_31370) + geom_sf()
``````

For the sake of this exercise, we make a simplified derived object `comm_simpl_31370` with fewer vertices:

``````comm_simpl_31370 <- st_simplify(municipalities_31370, dTolerance = 10)
``````

Its size is much smaller:

``````object.size(comm_simpl_31370) %>% format(units = "MB")
``````
``````[1] "3 Mb"
``````

So it renders faster.

``````ggplot(data = comm_simpl_31370) + geom_sf()
``````

However, mind that the simplification may be OK for mapping at an appropriate scale, but beware that information has been lost, so shapes have changed and gaps + overlaps will appear between polygons. Hence don’t use this simplified version if you process it further, or for detailed mapping.

Note the latitude-longitude graticule: the map follows the CRS of the plotted object, but the graticule is latitude-longitude by default.

We can plot the graticule of another CRS with `coord_sf()`:

``````ggplot(data = comm_simpl_31370) + geom_sf() + coord_sf(datum = "EPSG:31370")
``````

The above is the most obvious choice here, since it’s the CRS of the object, but we could as well plot the graticule of e.g. `EPSG:3035` (ETRS89-extended / LAEA Europe):

``````ggplot(data = comm_simpl_31370) + geom_sf() + coord_sf(datum = "EPSG:3035")
``````

Next to the `ggplot2` package, which we used above, interactive maps can be made with the `mapview` package.

Here too, time to render differs between:

``````mapview(municipalities_31370, legend = FALSE)
``````

and:

``````mapview(comm_simpl_31370, legend = FALSE)
``````

Note: run the code to see this and the following interactive maps.

In this case, `mapview` is using the `leaflet` package with some nice defaults, but in the above examples it makes a color map by default because there’s only one attribute. We don’t want this and we make some further adjustments:

``````mapview(comm_simpl_31370,
col.regions = "pink",
alpha.regions = 0.3,
legend = FALSE,
map.types = "OpenStreetMap")
``````

Note that a separate tutorial about `leaflet` is available on this website!

Beware that `mapview` and `leaflet` produce maps in the ‘WGS 84 / Pseudo-Mercator’ CRS (`EPSG:3857`). For that to happen, `mapview` transforms spatial objects on-the-fly. Pseudo Mercator or Web Mercator is a much used projection in web mapping because the coordinate projection is computed much faster. However it should not be used in official mapping or as a projected CRS for geoprocessing, since it has more distortions (including shape) than e.g. the (truly) conformal Mercator projection (e.g. in CRS `EPSG:3395` – WGS 84 / World Mercator – which is best used in the equatorial region).

Let’s show two layers with `mapview`:

``````mapview(list(comm_simpl_31370, watercourses),
col.regions = c("pink", "blue"),
alpha.regions = c(0.2, 0.5),
color = c("grey", NA),
legend = FALSE,
map.types = "OpenStreetMap")
``````

## Different transformation pipelines

We consider a layer of points that correspond to locations where characteristics were determined of Flemish surfacewater bodies:

``````path_points <- file.path(gisclubdata_path, "meetplaatsen_oppwater.shp")
``````

Check the CRS:

``````st_crs(points)
``````
``````Coordinate Reference System:
User input: Amersfoort / RD New
wkt:
PROJCRS["Amersfoort / RD New",
BASEGEOGCRS["Amersfoort",
DATUM["Amersfoort",
ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4289]],
CONVERSION["RD New",
METHOD["Oblique Stereographic",
ID["EPSG",9809]],
PARAMETER["Latitude of natural origin",52.1561605555556,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",5.38763888888889,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9999079,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",155000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",463000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Engineering survey, topographic mapping."],
AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
BBOX[50.75,3.2,53.7,7.22]],
ID["EPSG",28992]]
``````

The `points` object has the projected ‘Amersfoort / RD New’ CRS (`EPSG:28992`), which was created for The Netherlands – onshore. See e.g. Wikipedia for more background about this CRS.

Essentially we are dealing with a different projected CRS than `EPSG:31370`. `EPSG:28992` uses the ‘Amersfoort’ geodetic datum, defined using the ‘Bessel 1841’ ellipsoid, while `EPSG:31370` has the ‘Reseau National Belge 1972’ datum, defined using the ‘International 1924’ ellipsoid.1

We will transform `points` (back) to `EPSG:31370` since the data are outside the intended area of usage. Note that this dataset has been made for mere demonstrative purposes: since at least a part of the points fall outside the area of usage (geographic bounding box) of the ‘Amersfoort / RD New’ CRS, the dataset is not appropriate to start from. But let’s suppose it is all one had available to work with!

### Standard approach

By default, the appropriate transformation pipeline (concatenated coordinate operations) can differ between different areas that are represented in a spatial dataset. This is taken care of automatically, based on the bounding boxes of each CRS, but it will also depend on the availability of specific transformation grids.

``````points_31370 <- st_transform(points, "EPSG:31370")
``````

Notice the completely different coordinate ranges between both CRSs:

``````st_geometry(points)
#> Geometry set for 7320 features
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -62470.03 ymin: 300542 xmax: 193543.8 ymax: 427302.7
#> Projected CRS: Amersfoort / RD New
#> First 5 geometries:
#> POINT (-30849.68 354341.7)
#> POINT (-31237.91 354683.3)
#> POINT (-28757.74 356391.3)
#> POINT (709.4341 373861.2)
#> POINT (116680.8 340639.1)
``````
``````st_geometry(points_31370)
#> Geometry set for 7320 features
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 3174.827 ymin: 152992.4 xmax: 260391 ymax: 278347.1
#> Projected CRS: BD72 / Belgian Lambert 72
#> First 5 geometries:
#> POINT (35483.62 205530.5)
#> POINT (35090.55 205866.5)
#> POINT (37545.94 207609.8)
#> POINT (66666.83 225461.1)
#> POINT (183089.2 193870.8)
``````

A specific area shown with `ggplot2` in `EPSG:31370`:

``````ggplot() +
geom_sf(data = watercourses, fill = "lightblue", colour = "lightblue") +
geom_sf(data = points_31370, fill = "red", shape = 21) +
lims(x = c(154e3, 155e3), y = c(218e3, 219e3)) +
coord_sf(datum = "EPSG:31370")
``````

### Enforcing a specific pipeline

We can request available transformation pipelines with `sf_proj_pipelines()`:

``````pipelines <- sf_proj_pipelines("EPSG:28992", "EPSG:31370")
``````

One can further limit this by specifying the area of interest (`aoi` argument), which we didn’t do here.

It is a dataframe with following structure:

``````glimpse(pipelines)
``````
``````Rows: 13
Columns: 9
\$ id           <chr> "pipeline", "pipeline", "pipeline", "pipeline", "pipeline…
\$ description  <chr> "Inverse of RD New + Amersfoort to ETRS89 (9) + Inverse o…
\$ definition   <chr> "+proj=pipeline +step +inv +proj=sterea +lat_0=52.1561605…
\$ has_inverse  <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
\$ accuracy     <dbl> 0.011, 0.201, 0.260, 0.450, 1.001, 1.250, 2.000, 2.000, 2…
\$ axis_order   <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
\$ grid_count   <int> 2, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0
\$ instantiable <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
\$ containment  <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
``````

``````pipelines %>%
as_tibble %>%
select(definition, description, grid_count, accuracy, instantiable)
``````
``````# A tibble: 13 × 5
definition             description           grid_count accuracy instantiable
<chr>                  <chr>                      <int>    <dbl> <lgl>
1 +proj=pipeline +step … Inverse of RD New + …          2    0.011 TRUE
2 +proj=pipeline +step … Inverse of RD New + …          1    0.201 TRUE
3 +proj=pipeline +step … Inverse of RD New + …          1    0.26  TRUE
4 +proj=pipeline +step … Inverse of RD New + …          0    0.45  TRUE
5 +proj=pipeline +step … Inverse of RD New + …          1    1.00  TRUE
6 +proj=pipeline +step … Inverse of RD New + …          0    1.25  TRUE
7 +proj=pipeline +step … Inverse of RD New + …          0    2     TRUE
8 +proj=pipeline +step … Inverse of RD New + …          0    2     TRUE
9 +proj=pipeline +step … Inverse of RD New + …          0    2     TRUE
10 +proj=pipeline +step … Inverse of RD New + …          0    2     TRUE
11 +proj=pipeline +step … Inverse of RD New + …          0    6     TRUE
12 +proj=pipeline +step … Inverse of RD New + …          0    6     TRUE
13 +proj=pipeline +step … Inverse of RD New + …          0   NA     TRUE
``````

We can see that the first one has best accuracy. Several pipelines are based on one or two grids for a better accuracy.

When printing the `pipelines` object, its first row is printed in a more readable manner:

``````pipelines
``````
``````Candidate coordinate operations found:  13
Strict containment:     FALSE
Axis order auth compl:  FALSE
Source:  EPSG:28992
Target:  EPSG:31370
Best instantiable operation has accuracy: 0.011 m
Description: Inverse of RD New + Amersfoort to ETRS89 (9) + Inverse of BD72
to ETRS89 (3) + Belgian Lambert 72
Definition:  +proj=pipeline +step +inv +proj=sterea +lat_0=52.1561605555556
+lon_0=5.38763888888889 +k=0.9999079 +x_0=155000
+y_0=463000 +ellps=bessel +step +proj=hgridshift
+grids=nl_nsgi_rdtrans2018.tif +step +inv
+proj=hgridshift
+grids=be_ign_bd72lb72_etrs89lb08.tif +step
+proj=lcc +lat_0=90 +lon_0=4.36748666666667
+lat_1=51.1666672333333 +lat_2=49.8333339
+x_0=150000.013 +y_0=5400088.438 +ellps=intl
``````

Since the rows of pipelines are sorted by `instantiable` and `accuracy`, it makes sense to designate the printed pipeline as Best instantiable operation.

It follows that the same information appears when manually selecting and then printing, except for the number of ‘candidate operations found’:

``````pipelines[1,]
``````
``````Candidate coordinate operations found:  1
Strict containment:     FALSE
Axis order auth compl:  FALSE
Source:  EPSG:28992
Target:  EPSG:31370
Best instantiable operation has accuracy: 0.011 m
Description: Inverse of RD New + Amersfoort to ETRS89 (9) + Inverse of BD72
to ETRS89 (3) + Belgian Lambert 72
Definition:  +proj=pipeline +step +inv +proj=sterea +lat_0=52.1561605555556
+lon_0=5.38763888888889 +k=0.9999079 +x_0=155000
+y_0=463000 +ellps=bessel +step +proj=hgridshift
+grids=nl_nsgi_rdtrans2018.tif +step +inv
+proj=hgridshift
+grids=be_ign_bd72lb72_etrs89lb08.tif +step
+proj=lcc +lat_0=90 +lon_0=4.36748666666667
+lat_1=51.1666672333333 +lat_2=49.8333339
+x_0=150000.013 +y_0=5400088.438 +ellps=intl
``````

If you’d like to get even more metadata on the available pipelines, including the WKT definitions, you can also run PROJ’s `projinfo` command line program from within R if it’s available in your system `PATH`.

``````system("projinfo -s EPSG:28992 -t EPSG:31370 --spatial-test intersects -o WKT2:2019")
``````

The default operation returned by PROJ’s `projinfo`, based on the union of the source and target CRS’s bounding box, appears to be just one pipeline with so-called ballpark accuracy. It’s obtained by dropping the `--spatial-test intersects` option (or replacing `intersects` by `contains`, which is default)2, and so it matches the pipeline where `accuracy` equals `NA` in our `pipelines` dataframe:

``````pipelines[13,]
``````
``````Candidate coordinate operations found:  1
Strict containment:     FALSE
Axis order auth compl:  FALSE
Source:  EPSG:28992
Target:  EPSG:31370
Best instantiable operation has only ballpark accuracy
Description: Inverse of RD New + Ballpark geographic offset from Amersfoort
to BD72 + Belgian Lambert 72
Definition:  +proj=pipeline +step +inv +proj=sterea +lat_0=52.1561605555556
+lon_0=5.38763888888889 +k=0.9999079 +x_0=155000
+y_0=463000 +ellps=bessel +step +proj=lcc
+lat_0=90 +lon_0=4.36748666666667
+lat_1=51.1666672333333 +lat_2=49.8333339
+x_0=150000.013 +y_0=5400088.438 +ellps=intl
``````

The low accuracy of a ballpark transformation is because datum differences between source and target CRSs are neglected:

It does not attempt any datum shift, hence the “ballpark” qualifier in its name. Its accuracy is unknown, and could lead in some cases to errors of a few hundreds of metres.

(Source: https://proj.org/glossary.html#term-Ballpark-transformation)

With `st_transform()` we can enforce a specific pipeline. Note that it is only available for `sfc` objects, not `sf` objects.

Let’s apply the most accurate pipeline to all points:

``````# We select the pipeline with lowest accuracy (< 0.02), by filtering on accuracy.
# If the grids are installed, this should match the first line of the pipelines object.
chosen_pipeline_definition <- pipelines %>% filter(accuracy < 0.02) %>% pull(definition)
points_31370_strict <-
st_transform(st_geometry(points), "EPSG:31370",
pipeline = chosen_pipeline_definition) %>%
st_sf(st_drop_geometry(points), geometry = .) %>%
as_tibble %>%
st_as_sf
``````

And compare:

``````st_geometry(points_31370)
#> Geometry set for 7320 features
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 3174.827 ymin: 152992.4 xmax: 260391 ymax: 278347.1
#> Projected CRS: BD72 / Belgian Lambert 72
#> First 5 geometries:
#> POINT (35483.62 205530.5)
#> POINT (35090.55 205866.5)
#> POINT (37545.94 207609.8)
#> POINT (66666.83 225461.1)
#> POINT (183089.2 193870.8)
``````
``````st_geometry(points_31370_strict)
#> Geometry set for 7320 features  (with 9 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 18976.03 ymin: 152963.3 xmax: 260391 ymax: 244024
#> Projected CRS: BD72 / Belgian Lambert 72
#> First 5 geometries:
#> POINT (35393.39 205495.3)
#> POINT (35000.39 205831.3)
#> POINT (37455.41 207574.3)
#> POINT (66666.83 225461.1)
#> POINT (183089.2 193870.8)
``````

Notice the 9 empty geometries in case of `points_31370_strict`!

What is happening?

``````p <-
ggplot() +
geom_sf(data = points_31370, colour = "red") +
geom_sf(data = points_31370_strict, colour = "green")
p
``````

It appears that points too far at sea are considered invalid when applying this pipeline, and their geometry was cleared (although the corresponding row was not dropped).

If we make a zoomable plot, there’s more to discover.

``````plotly::ggplotly(p)
``````

Most points on the map are on the exact same location between both approaches. We can see that points west and south of the geographic bounding box for usage of the Dutch CRS have a different position depending on the applied approach.

It can be further investigated with `mapview` (note that this does a further reprojection to `EPSG:3857`):

``````mapview(list(points_31370, points_31370_strict),
col.regions = c("red", "green"))
``````

Here it appears that the western and southern points of `points_31370` were transformed with the ballpark-accuracy pipeline: they are way off the surfacewater bodies!

In this case, one might prefer enforcing the specific pipeline outside the geographic bounding box of `EPSG:28992`, as we did for `points_31370_strict`. However best explore the other pipelines as well, e.g. the most accurate one that doesn’t depend on a grid, and consider using that outside the geographic bounding box.

Bottomline is that you should be suspicious about coordinates outside the CRS’s geographic bounding box! So keep an eye on them when applying a default transformation procedure.

## Further reading and more packages

Much more information can be found in the online books by Pebesma & Bivand (2021) and Lovelace et al. (2019). The `sf` package has a nice website at https://r-spatial.github.io/sf; version 0.6-1 was introduced in Pebesma (2018).

There are several more R packages to make beautiful maps. We specifically mention `tmap` (Tennekes, 2018) and `mapsf` for making production-ready thematic maps. `tmap` is used a lot in Lovelace et al. (2019).

Further, there is the `plotKML` package, which renders spatial objects in Google Earth (Hengl et al., 2015)!

## Bibliography

Hengl T., Roudier P., Beaudette D. & Pebesma E. (2015). plotKML: Scientific Visualization of Spatio-Temporal Data. Journal of Statistical Software 63. http://www.geostat-course.org/system/files/jss1079.pdf.

Lovelace R., Nowosad J. & Muenchow J. (2019). Geocomputation with R. https://geocompr.robinlovelace.net/.

Pebesma E. (2018). Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal 10 (1): 439–446. https://journal.r-project.org/archive/2018/RJ-2018-009/index.html.

Pebesma E. & Bivand R. (2021). Spatial Data Science. https://r-spatial.org/book.

Tennekes M. (2018). tmap: Thematic Maps in R. Journal of Statistical Software 84 (1): 1–39. https://doi.org/10.18637/jss.v084.i06.

1. Note that several European countries, including Belgium, now provide national projected CRSs that use the common ETRS89 datum (European Terrestrial Reference System 1989). An example is the Belgian CRS `EPSG:3812` (ETRS89 / Belgian Lambert 2008). ↩︎

2. In the documentation of `projinfo`, we can read about the `--spatial-test` option: ‘Specify how the area of use of coordinate operations found in the database are compared to the area of use specified explicitly with `--area` or `--bbox`, or derived implicitly from the area of use of the source and target CRS. By default, projinfo will only keep coordinate operations whose area of use is strictly within the area of interest (contains strategy). If using the intersects strategy, the spatial test is relaxed, and any coordinate operation whose area of use at least partly intersects the area of interest is listed.’ ↩︎