```r
library(terra)
library(purrr)
library(sf)
library(dplyr)
library(tmap)

ces1961 <- rast("data-large/ces/1961.tif")
ces1961 <- ces1961/255 # normalize the values to [0,1]
names(ces1961) <- "luminosity"
```
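As a quick check (not part of the original snippet), terra's `minmax()` confirms that the values now lie within [0, 1]:

```r
minmax(ces1961) # should report a minimum >= 0 and a maximum <= 1
```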
Supervised learning
Introduction
- Supervised learning is a type of machine learning where the model is trained on a labelled dataset
- The model learns to map input data to the correct output label
- Supervised learning is used extensively in remote sensing to enable quantitative analysis of satellite imagery
- We will use historic satellite imagery to train a model to classify land cover types in a given area
Task and Data
- The task we will perform is to quantitatively analyze forest cover change over time in the area surrounding the village Ces in Ticino, Switzerland
- We are provided with historic aerial imagery from 1961 from swisstopo [source]
- The resolution of the imagery is 1 m/pixel, and it contains only a single band
```r
plot(ces1961, col = grey.colors(255))
```
Approach
To solve this task:
- We first need to create some labelled data
- Split our labelled data into training and testing data
- Train a model on the training data
- Evaluate the model on the testing data
Data labelling
In preparation, I used QGIS to create labelled points for the following classes:
- Forest
- Buildings
- Agriculture
- Shadows
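The same kind of labelled point layer could also be sketched directly in R with sf; the coordinates below are made-up placeholders in the Swiss LV95 system (EPSG:2056), not the actual training locations:

```r
library(sf)

# hypothetical labelled points; the real ones were digitized in QGIS
labelled_points <- st_as_sf(
  data.frame(
    class = c("Forest", "Buildings", "Agriculture", "Shadows"),
    x = c(2706250, 2706300, 2706350, 2706400), # placeholder coordinates
    y = c(1143650, 1143700, 1143750, 1143800)
  ),
  coords = c("x", "y"),
  crs = 2056
)
```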
```r
library(sf)

labelled_data <- read_sf("data-large/ces/labelled-data.gpkg")

# We will turn the class into a factor,
# which is useful for classification
labelled_data$class <- factor(labelled_data$class)
```
```r
tmap_mode("view")

tm_shape(ces1961) +
  tm_raster(palette = "-Greys", legend.show = FALSE) +
  tm_shape(labelled_data) +
  tm_dots(col = "class") +
  tm_layout(legend.outside = TRUE)
```
Splitting the data
- We need to split our data into training and testing data
- We will randomly select 70% of the data for training and the remaining 30% for testing
Important
Note that this is not the recommended approach; instead, we should use a spatially stratified split (see Brenning 2023; Schratz et al. 2021). To keep things simple for this exercise, however, we will use a random split.
```r
labelled_data <- labelled_data |>
  group_by(class) |>          # for each class...
  mutate(
    i = sample(seq_len(n())), # ...assign a random number from 1 to n...
    train = i <= n()*0.7      # ...and assign the first 70% to "training"
  )
```
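Since `sample()` draws randomly, every run yields a slightly different split. For a reproducible split, set a seed before running the chunk above (the value 42 is an arbitrary choice):

```r
set.seed(42) # fixes the random number generator, making the split reproducible
```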
```r
data_train <- labelled_data |> # filter the training data and store it in
  filter(train) |>             # a new data frame
  select(-c(i, train))         # remove obsolete columns

data_test <- labelled_data |>  # same as above, but the filter is inverted
  filter(!train) |>            # for the testing data
  select(-c(i, train))
```
```r
tmap_mode("plot")

tm_shape(labelled_data) + tm_dots(col = "train", palette = c("blue", "red"))
```
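As a quick sanity check (not part of the original steps), we can count the features per class in each subset to verify that roughly 70% of each class landed in the training set:

```r
labelled_data |>
  st_drop_geometry() |> # drop the geometry to get a plain data frame
  count(class, train)   # number of features per class and subset
```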
Feature Extraction
- We need to extract the values of the raster data at the labelled points
- Since we only have one band, our result is a data.frame with one column
```r
# ID = FALSE means that we do not want to add an ID column
train_features <- terra::extract(ces1961, data_train, ID = FALSE)

head(train_features)
```

```
  luminosity
1  0.3568627
2  0.1960784
3  0.4392157
4  0.6313725
5  0.2823529
6  0.5294118
```

```r
data_train2 <- cbind(data_train, train_features) |>
  st_drop_geometry()
```
Training the model
- We will use the `rpart` package to train a classification tree
- The classification tree is also known as a decision tree
- A decision tree has a flowchart-like structure (see Figure 6.1)
- Classification trees do not always produce the best results, but they are simple and interpretable
```r
library(rpart)

cart_model <- rpart(class ~ ., data = data_train2, method = "class")
```
- As mentioned above, the decision tree is interpretable. We can visualize it using the `rpart.plot` package:
- Each pixel is classified into one of the four classes based on its value (luminosity) according to the decision tree in Figure 6.1
```r
library(rpart.plot)

rpart.plot(cart_model, type = 3)
```

Figure 6.1: The fitted classification tree.
Predicting the probabilities per class for each pixel
- We will use the trained model to predict the probabilities of each class for each pixel in the raster data
- We can use the `predict` function to do this
- The result is a raster with one layer per class, giving the probability of each class for each pixel
```r
ces1961_predict <- predict(ces1961, cart_model)

tm_shape(ces1961_predict) +
  tm_raster(style = "cont", midpoint = .5, palette = "-Spectral", title = "probability") +
  tm_layout()
```
Highest probability class
- We can combine the four layers into a single layer by assigning the class with the highest probability to each pixel
- For this, we can use the `which.max` function:
```r
ces1961_predict2 <- which.max(ces1961_predict)

# the next three lines assign class labels to the raster (instead of numbers)
classes <- levels(data_train2$class)
classes_df <- data.frame(ID = seq_along(classes), class = classes)
levels(ces1961_predict2) <- classes_df
```
```r
# visualize the result
tmap_mode("view")

tm_shape(ces1961_predict2) +
  tm_raster(palette = c("gold", "gray", "palegreen4", "black")) +
  tm_layout(legend.outside = TRUE)
```
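Since the imagery has a resolution of 1 m/pixel, counting the pixels per class quantifies the land cover directly (one pixel = 1 m²). A sketch using terra's `freq()`:

```r
freq(ces1961_predict2) # pixel count per class; at 1 m resolution, count = area in m²
```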
Model Evaluation I
- To evaluate the model, we will use the testing data
```r
test_features <- terra::extract(ces1961_predict2, data_test, ID = FALSE)

confusion_matrix <- cbind(data_test, test_features) |>
  st_drop_geometry() |>
  transmute(predicted = class.1, actual = class) |>
  table()
```
Table 6.1: Confusion matrix of predicted vs. actual classes in the testing data.

| predicted \ actual | Agriculture | Buildings | Forest | Shadows |
|---|---|---|---|---|
| Agriculture | 34 | 4 | 13 | 0 |
| Buildings | 3 | 5 | 0 | 0 |
| Forest | 4 | 0 | 16 | 2 |
| Shadows | 0 | 0 | 10 | 18 |
Model Evaluation II
- A very simple measure to assess the model performance is the overall accuracy
- This is the ratio of the number of correctly classified features (the diagonal of Table 6.1) to the total number of features (the sum of all entries in Table 6.1):

\[\frac{\text{number of features correctly classified}}{\text{total number of features}}\]

- If we insert our values, we get:

\[\frac{34+5+16+18}{41+9+39+20} = \frac{73}{109} \approx 0.67\]
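The same calculation can be done in code from the confusion matrix computed above; `diag()` picks out the correctly classified counts:

```r
sum(diag(confusion_matrix)) / sum(confusion_matrix)
```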
Feature Engineering
- Currently, our raster data only contains one band. In other words, our feature space is only one-dimensional
- It’s actually quite a challenge to determine the land cover of a pixel if we regard just a single pixel. We humans rely on context (surrounding pixels) to determine the land cover. You can test this by zooming in very closely on the raster image and trying to distinguish the land cover types (see Figure 6.2)
- We will have a look at possibilities to include context information in Feature Engineering
```r
xmin <- 2706200
xmax <- xmin + 100
ymin <- 1143600
ymax <- ymin + 100
e <- ext(xmin, xmax, ymin, ymax)

crop(ces1961, e) |>
  plot(col = grey.colors(255))
```

Figure 6.2: A 100 m × 100 m close-up of the 1961 imagery, where individual pixels are hard to classify by eye.
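A simple way to bring in such context, sketched here with terra's `focal()` (the 5×5 window size is an arbitrary choice), is to compute a moving-window mean and stack it onto the original band as a second feature:

```r
# hypothetical context feature: mean luminosity in a 5x5 pixel neighbourhood
ces1961_focal <- focal(ces1961, w = 5, fun = "mean")
names(ces1961_focal) <- "luminosity_focal"

# a two-band raster: the pixel value plus its neighbourhood context
ces1961_2band <- c(ces1961, ces1961_focal)
```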
Tasks
- Download the datasets labelled-points.gpkg and 1961.tif from moodle
- Follow the steps described above to train a model to classify the land cover types in the area surrounding the village Ces in Ticino
- Evaluate the model using the testing data