BSR
Computer Vision

11 October 2024

Image Segmentation with U-Net

Introduction

In this project, I trained my own U-Net model to perform image segmentation on the Carvana Image Masking Challenge and the Dubai Aerial Imagery datasets.

What is image segmentation? It is the process of partitioning an image into multiple segments, typically by assigning a label to every pixel. The goal is to identify the objects in an image and the regions they occupy.

Satellite true image
Satellite mask

Figure 1: Side-by-side visualization of a true image and its mask

Each true image comes with its own mask. These masks mark the objects that have been manually segmented, or labelled, by humans. During training, the model is given both the image and its mask, and it learns to predict the mask from the image alone.
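To make the idea of a mask concrete, here is a minimal sketch that treats a mask as a grid of per-pixel labels. The 4x4 mask and the helper names `foreground_fraction` and `bounding_box` are made up for illustration; they are not taken from either dataset or from the project code.

```python
# Hypothetical 4x4 binary mask: 1 marks object pixels, 0 marks background.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def foreground_fraction(mask):
    """Return the share of pixels labelled as object (value 1)."""
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    return foreground / total


def bounding_box(mask):
    """Return (top, left, bottom, right) of the labelled region, or None if empty."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


print(foreground_fraction(mask))  # share of object pixels
print(bounding_box(mask))         # extent of the labelled object
```

A multi-class mask, such as one for aerial imagery, would work the same way with one integer label per class instead of only 0 and 1.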

U-Net Architecture

U-Net is a CNN architecture that consists of multiple encoders and decoders connected by skip connections. As a result, the architecture looks like a "U" when drawn.
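The "U" shape can be seen by tracing feature-map sizes through the network. The sketch below is only a shape calculator, not a trainable model. It assumes "same"-padded convolutions, 2x2 pooling, 2x upsampling, 64 base channels, and four levels; the function name `unet_shapes` and these defaults are illustrative assumptions, and the original paper uses unpadded convolutions, so its sizes differ.

```python
def unet_shapes(height, width, base_channels=64, depth=4):
    """Trace (stage, channels, height, width) through a U-Net-style network.

    Assumes 'same'-padded convolutions, 2x2 pooling, and 2x upsampling,
    so height and width must be divisible by 2**depth.
    """
    if height % 2**depth or width % 2**depth:
        raise ValueError("height and width must be divisible by 2**depth")

    trace, skips = [], []
    ch, h, w = base_channels, height, width

    # Contracting path: convolve, remember the feature map, then downsample.
    for level in range(depth):
        trace.append((f"enc{level}", ch, h, w))
        skips.append((ch, h, w))
        ch, h, w = ch * 2, h // 2, w // 2

    trace.append(("bottleneck", ch, h, w))

    # Expanding path: upsample, concatenate the matching encoder output
    # (the skip connection), then convolve back down to the encoder's width.
    for level in reversed(range(depth)):
        skip_ch, skip_h, skip_w = skips.pop()
        h, w = h * 2, w * 2
        if (h, w) != (skip_h, skip_w):
            raise RuntimeError("skip and upsampled feature maps do not align")
        ch = skip_ch
        trace.append((f"dec{level}", ch, h, w))

    return trace


for stage in unet_shapes(256, 256):
    print(stage)
```

Reading the printed stages top to bottom, the spatial size shrinks while the channel count grows, then the reverse happens on the way back up, with each decoder stage matching the size of the encoder stage it is connected to.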

U-Net Architecture from Wikipedia (Source: https://en.wikipedia.org/wiki/U-Net)

Results

Dubai Aerial Imagery

With 50 iterations, the model was only able to reduce its loss to 0.523. This suggests the model still needs more iterations and parameter tuning.

Satellite true image
Satellite predicted image

Figure 2: Satellite predicted image vs. true image

Satellite true image
Satellite predicted image

Figure 3: Satellite predicted image vs. true image

Carvana Image Masking Challenge

Unlike the result for Dubai Aerial Imagery, the model was able to reduce its loss to 0.0185 and reach a Dice score of 0.99 with only 50 iterations.
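For reference, the Dice score measures the overlap between a predicted mask and the true mask as 2|A∩B| / (|A| + |B|), where 1.0 means a perfect match. Below is a minimal sketch for binary masks stored as nested lists. The function name `dice_score`, the smoothing term `eps`, and the toy masks are assumptions for illustration, not the project's actual metric code.

```python
def dice_score(pred, target, eps=1e-8):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels that are 1 in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # eps avoids division by zero when both masks are empty.
    return (2 * intersection + eps) / (total + eps)


# Toy 2x2 example: the prediction marks two pixels, only one of which is correct.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(dice_score(pred, target))  # about 0.667
```

With this definition, two empty masks score 1.0, which is a design choice of the smoothing term rather than a property of the metric itself.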

Car 1 True Mask
Car 1 Predicted Mask

Figure 4: Predicted mask vs. true mask

Car 2 True Mask
Car 2 Predicted Mask

Figure 5: Predicted mask vs. true mask

References

  1. Olaf Ronneberger, Philipp Fischer, Thomas Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv:1505.04597.
  2. Carvana Image Masking Challenge. Kaggle.
  3. Dubai Aerial Imagery. Kaggle.