Image Segmentation with Watershed Algorithm

In this blog, we will discuss marker-based image segmentation using the watershed algorithm in detail. Let’s first discuss what a watershed is in the context of an image.

The watershed algorithm treats an image as a topographic surface in which high-intensity values represent peaks and hills, while low-intensity values represent valleys. This surface can be obtained by plotting the (x, y) image coordinates against the intensity, as shown below.

In this kind of surface plot, there are generally 3 types of points:

  1. Points that are local minima
  2. Points at which a drop of water, if placed there, would fall with certainty to a single minimum
  3. Points at which the drop would be equally likely to fall to more than one such minimum

For a particular minimum, the set of points that satisfy the 2nd condition is called the catchment basin or watershed of that minimum, while the points satisfying the 3rd condition are termed divide lines or watershed lines. The main objective of the watershed algorithm is to find these watershed lines. I hope you now understand what a watershed is in the context of images. Now, let’s discuss how the watershed algorithm finds such lines using the analogy shown below.

Suppose a hole is punched in each local minimum. Start filling each local minimum (through these holes) with differently colored water (labels) at a uniform rate. After some time, water from different catchment basins or valleys will start to merge. To prevent this, dams or barriers are constructed at the locations where the water would merge. Eventually a stage is reached when all the peaks are underwater and only the tops of the dams (barriers) are visible. These dam boundaries correspond to the watershed lines. This is the philosophy behind the watershed algorithm.
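The flooding analogy can be sketched in a few lines of plain Python on a 1-D height profile. This is only a simplified illustration, not what OpenCV does internally: the function name `flood_from_minima` and the sample heights are made up for this example, plateaus are not handled, and "uniform rate" is approximated by always flooding the lowest pixel on the water front first.

```python
import heapq

def flood_from_minima(heights):
    """Flood a 1-D profile from every strict local minimum.

    Returns one label per position: 1, 2, ... for basins, -1 for a dam.
    """
    n = len(heights)
    labels = [0] * n          # 0 = still dry
    heap = []                 # water front, ordered by height
    next_label = 1

    # Punch a hole in every strict local minimum and give it its own label.
    for i, h in enumerate(heights):
        left = heights[i - 1] if i > 0 else float("inf")
        right = heights[i + 1] if i < n - 1 else float("inf")
        if h < left and h < right:
            labels[i] = next_label
            next_label += 1
            heapq.heappush(heap, (h, i))

    # Let the water rise: always expand from the lowest flooded position.
    while heap:
        _, i = heapq.heappop(heap)
        for j in (i - 1, i + 1):
            if 0 <= j < n and labels[j] == 0:
                touching = {labels[k] for k in (j - 1, j + 1)
                            if 0 <= k < n and labels[k] > 0}
                if len(touching) > 1:
                    labels[j] = -1            # two basins meet: build a dam
                else:
                    labels[j] = labels[i]     # same basin keeps growing
                    heapq.heappush(heap, (heights[j], j))
    return labels

profile = [3, 1, 2, 5, 2, 0, 4]
basins = flood_from_minima(profile)
print(basins)  # [1, 1, 1, -1, 2, 2, 2]
```

The two valleys get labels 1 and 2, and the peak between them becomes the dam (-1), which plays the role of the watershed line.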

But this approach produces over-segmented results, because noise and other irregularities in the image create many spurious local minima, each of which gets its own basin. To overcome this, instead of filling every local minimum, we fill only some of them. But how do we decide which minima to fill? This is where the concept of markers comes in.

Markers are the sets of pixels from which the flooding starts. Intuitively, think of markers as the pixels that we are sure belong to the objects present in the image. Generally, the number of markers is equal to the number of objects (classes) + 1 (for the background). For a better understanding, see the image below. Here, we have 4 markers (see right image): 1 for the background and 3 for the coins. Each marker is represented with a different pixel value. The pixels with value 0 (black) in the marker image are the unknown regions, and the watershed algorithm assigns each of these pixels to one of the 4 classes (3 coins and 1 background).

These markers can either be explicitly defined by the user or be determined automatically using morphological operators or other methods. Intuitively, with these markers we are telling the watershed algorithm to group points like these together.

So, the concept is simple. Start growing these markers (each labeled with a different color or label). When they are about to meet, construct barriers at those locations. These barriers give us the segmentation result. Now, let’s take an example to understand how to implement the watershed algorithm using OpenCV.
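To make the idea of growing markers concrete, here is a small pure-Python sketch on a 2-D grid with 4-connectivity. It is a toy illustration under simplifying assumptions, not OpenCV's implementation: the function name `grow_markers`, the height grid, and the marker positions are all invented for this example.

```python
import heapq

def grow_markers(heights, markers):
    """Grow labeled markers over a height grid; -1 marks a barrier pixel."""
    rows, cols = len(heights), len(heights[0])
    labels = [row[:] for row in markers]   # 0 = unknown, >0 = marker label

    def neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                yield rr, cc

    # Flooding starts only from the marker pixels, not from every minimum.
    heap = [(heights[r][c], r, c)
            for r in range(rows) for c in range(cols) if markers[r][c] > 0]
    heapq.heapify(heap)

    while heap:
        _, r, c = heapq.heappop(heap)
        for rr, cc in neighbours(r, c):
            if labels[rr][cc] != 0:
                continue
            touching = {labels[a][b] for a, b in neighbours(rr, cc)
                        if labels[a][b] > 0}
            if len(touching) > 1:
                labels[rr][cc] = -1               # regions meet: barrier
            else:
                labels[rr][cc] = labels[r][c]     # region keeps growing
                heapq.heappush(heap, (heights[rr][cc], rr, cc))
    return labels

# Two valleys separated by a ridge (the column of 9s).
heights = [[1, 2, 9, 2, 1],
           [1, 2, 9, 2, 1],
           [1, 2, 9, 2, 1]]
# One marker pixel per valley; everything else is unknown (0).
markers = [[0, 0, 0, 0, 0],
           [1, 0, 0, 0, 2],
           [0, 0, 0, 0, 0]]

segmented = grow_markers(heights, markers)
for row in segmented:
    print(row)
```

In this toy grid, each marker fills its own valley and the ridge column ends up as the barrier between the two regions.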

OpenCV

OpenCV provides a built-in cv2.watershed() function that performs marker-based image segmentation using the watershed algorithm. It takes as input the image (8-bit, 3-channel) along with the markers (32-bit signed integer, single-channel) and outputs the modified marker array. The syntax is given below.

As explained above, label the markers with different positive pixel values (such as 1, 2, 3, and so on) and set the unknown pixels, about which we are not sure, to 0. In the output, each pixel is set either to one of the marker values or to -1 if it lies on a boundary between regions. Now, let’s take an image and implement this algorithm using OpenCV-Python.

Here, we will use the distance transform along with contours to create the markers. For this, we first need to binarize the image. This can be done using Otsu’s method as shown below.

Clearly, there are some holes present in the coins. To remove them, we can use morphological closing as shown below.

Now, we apply the distance transform to the binary image. For each foreground pixel, it calculates the Euclidean distance to the closest background pixel (0’s here). This can be done using the cv2.distanceTransform() function as shown below.

The brightest pixels, which lie deep inside each coin and far from any background pixel, are the sure markers for each coin. To extract this sure foreground, we can threshold the distance transform.

Now, let’s label each of these markers with a different pixel value using contours as shown below. To do so, first create a 32-bit, single-channel marker image of all 0’s. Then find the contours and draw each one filled with the value of its contour index + 1.

This creates the markers for the foreground regions. Next, we need a marker for the background region. Here, I have manually created one using cv2.circle() with a value of len(contours)+1 as shown below.

Now our markers are ready. It’s time for the final step: apply the watershed and visualize the results.

See how we are able to segment the coins. That’s all for this blog. In the next blog, we will discuss how to create these markers using morphological operations. I hope you enjoyed reading.

If you have any doubts or suggestions, please feel free to ask, and I will do my best to help or improve myself. Goodbye until next time.
