Image segmentation is an important concept in imaging. In its simplest form, it sets a threshold on the image and binarizes the image according to that threshold. Binarization is a process in which a grayscale or colored image is converted to an image composed only of black and white pixels. For example, let us take the image in Figure 1.
This image is composed of black ink (for the text) and light grayish paper (for the background). I want to binarize Figure 1 such that the text will be white and the background will be black. To do that, first I need to plot the histogram of pixel values. I used the function imhist() of Scilab with 256 bins. Figure 2 shows the histogram of pixel values for the image in Figure 1.
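As a rough illustration of what a 256-bin histogram of pixel values contains, here is a minimal Python sketch. It is not the Scilab code I used; the function name gray_histogram and the tiny 3x4 image are made up for the example.

```python
def gray_histogram(image, bins=256):
    """Count how many pixels fall on each gray level 0..bins-1."""
    counts = [0] * bins
    for row in image:
        for value in row:
            counts[value] += 1
    return counts

# Hypothetical 3x4 "scan": bright paper (values near 200) with a few dark ink pixels.
image = [
    [210, 205,  30, 200],
    [ 25, 210, 210,  40],
    [205, 200, 205, 210],
]

hist = gray_histogram(image)
dark = sum(hist[:150])     # pixels darker than 150 (the ink)
bright = sum(hist[150:])   # pixels at 150 or brighter (the paper)
```

Even in this toy image the bright background pixels outnumber the dark ink pixels, which is the pattern I look for in the real histogram.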
The histogram shows far more pixels with values greater than 150 than pixels below it. In Figure 1, the background is also dominant, so the pixels with values greater than 150 are the background pixels. Since I want the background to be black and the text to be white, I let a matrix BW = I < 125 (giving a small allowance to the threshold), with I being the image matrix. This creates a matrix BW with values T or F (True or False), where T is displayed as white and F as black. Figure 3 shows the binarized image of Figure 1.
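The comparison BW = I < 125 can be sketched in Python as follows. This is only an illustration under my own assumptions: the helper name binarize_below and the small sample image are hypothetical, and the image is stored as a list of rows instead of a Scilab matrix.

```python
def binarize_below(image, threshold):
    """Return a True/False image: True (white) where the pixel is darker than threshold."""
    return [[value < threshold for value in row] for row in image]

# Hypothetical scan: dark ink pixels on bright paper.
image = [
    [210, 205,  30, 200],
    [ 25, 210, 210,  40],
]

bw = binarize_below(image, 125)
# Dark ink pixels become True (white); bright paper becomes False (black).
```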
I can use the segmentation concept to isolate a certain color in colored images. Before segmenting colored images, I need to discuss colored images a little. Each pixel in a colored image has a particular color, and this color is a combination of the three primary colors: red, green and blue. Colored images therefore have a separate channel for each color (R for red, G for green and B for blue). Before segmenting, the three channels of the image have to be converted to normalized chromaticity coordinates or NCC. To do this, I let:
I = R + G + B (Equation 1)
r = R / I (Equation 2)
g = G / I (Equation 3)
b = B / I (Equation 4)
The terms r (Equation 2) and g (Equation 3) are the coordinates in the NCC. Note that I did not use the b value (Equation 4): since r + g + b = 1, knowing r and g already fixes b = 1 - r - g.
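Here is a small Python sketch of the conversion, assuming the usual definition of NCC in which each channel is divided by the sum R + G + B. The function name to_ncc is my own, and mapping a pure black pixel to (0, 0) is an assumption made only to avoid dividing by zero.

```python
def to_ncc(pixel):
    """Convert one (R, G, B) pixel to normalized chromaticity coordinates (r, g)."""
    R, G, B = pixel
    total = R + G + B
    if total == 0:
        # Assumption: a pure black pixel has no defined chromaticity, so return (0, 0).
        return (0.0, 0.0)
    return (R / total, G / total)

r, g = to_ncc((50, 50, 100))
b = 1 - r - g  # b follows from r and g, so it does not need to be stored
```

Because brightness is divided out, a dark red and a bright red pixel land on nearly the same (r, g) point, which is what makes NCC useful for isolating a color.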
Before calculating anything, I must choose a photo with many colors in it. I chose a photo of a Korean girl group in one of their music videos, where they wear tracksuits of five different colors, as seen in Figure 4.
I then chose a region of interest or ROI, a small patch containing the color that I want to isolate. I picked patches for all five colors at once to perform this activity faster.
I solved for the chromaticity coordinates r and g of each pixel in each color patch. I then took the mean and standard deviation of r and g for each patch. These values are needed to set up the Gaussian distribution that gives the probability of belongingness for each pixel in the original image. Using Equations 5 and 6 below, I can then solve for the probability of belongingness of each pixel in the original image.
I then solved for the belongingness probability of each pixel using the mean and standard deviation values that I calculated from the patches. Since some values of P(r,g) are greater than one, I normalized the probability function so that it only has values from 0 to 1. Each pixel now has a belongingness probability value, and I can use these values to perform the threshold segmentation. I tried different thresholds and Figure 6 shows the results of segmentation.
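The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not my Scilab code: I assume the joint probability is the product of two independent Gaussians, one in r and one in g, and I assume the normalization divides by the largest value so the peak becomes 1. All function names and the sample patch values are hypothetical.

```python
import math

def mean_std(values):
    """Mean and (population) standard deviation of a list of numbers."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    return mu, math.sqrt(var)

def gaussian(x, mu, sigma):
    """Gaussian PDF; its value can exceed 1 when sigma is small."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def belongingness(pixels_rg, patch_rg):
    """Normalized probability that each (r, g) pixel belongs to the patch color.

    Assumes the patch has a nonzero spread in both r and g.
    """
    mu_r, sd_r = mean_std([r for r, _ in patch_rg])
    mu_g, sd_g = mean_std([g for _, g in patch_rg])
    raw = [gaussian(r, mu_r, sd_r) * gaussian(g, mu_g, sd_g) for r, g in pixels_rg]
    peak = max(raw)  # assumption: normalize by the maximum so values lie in 0..1
    return [p / peak for p in raw]

# Hypothetical reddish patch and two test pixels (one reddish, one gray).
patch_rg = [(0.60, 0.20), (0.62, 0.22), (0.58, 0.18), (0.60, 0.20)]
pixels_rg = [(0.60, 0.20), (0.33, 0.33)]

probs = belongingness(pixels_rg, patch_rg)
mask = [p > 0.4 for p in probs]  # threshold segmentation: True is white
```

The reddish pixel sits at the patch mean and survives the threshold, while the gray pixel is many standard deviations away and turns black.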
Figure 6 shows that as the threshold increases, fewer white pixels remain. This is expected because the binarization depends on the probability of belongingness: a pixel turns white only if its probability exceeds the threshold. The higher a pixel's probability, the more likely it is to stay white in the binarized image.
For the rest of the color patches, I only used a single threshold value of 0.4. Figure 7 shows the rest of the segmented images.
The process shown above is called parametric segmentation, since it uses a Gaussian probability distribution function as the basis for thresholding. There is another method called non-parametric segmentation. This type of segmentation uses the concept of histogram backprojection, wherein the r and g values of the image pixels are looked up in a 2D histogram of a certain color patch. The values of the image pixels are then changed depending on the value found in the histogram.
To start with non-parametric segmentation, I will give a little discussion on 2D histograms. Given the values of r and g, I can predict the color of that pixel using the normalized chromaticity space like that in Figure 8.
A 2D histogram is formed by calculating the r and g values of each pixel and counting them in bins. The plot of the NCC space (Figure 8) is divided into bins and used as the basis. Initially, each bin has a value of 0. Each time a pixel’s r and g values correspond to a certain bin, the value of that bin increases by 1. When all the pixels have been added to the histogram, the histogram is normalized and binarized using a threshold. When the value of a bin exceeds the set threshold, it is given a value of 1 (white), and 0 (black) otherwise. The histograms are created using the color patches seen in Figure 5. The histograms generated are shown in Figure 9.
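The histogram construction described above can be sketched in Python like this. The bin count, the choice to normalize by the fullest bin, and the names rg_histogram and binarize_histogram are assumptions for illustration only.

```python
def rg_histogram(patch_rg, bins=32):
    """Count (r, g) pairs into a bins x bins grid covering the range 0..1."""
    hist = [[0] * bins for _ in range(bins)]
    for r, g in patch_rg:
        i = min(int(r * bins), bins - 1)  # clamp r = 1.0 into the last bin
        j = min(int(g * bins), bins - 1)
        hist[i][j] += 1
    return hist

def binarize_histogram(hist, threshold):
    """Normalize by the fullest bin (an assumption), then keep bins above threshold."""
    peak = max(max(row) for row in hist)
    return [[1 if count / peak > threshold else 0 for count in row] for row in hist]

# Hypothetical patch: five pixels of the main color and one stray pixel.
patch_rg = [(0.65, 0.25)] * 5 + [(0.75, 0.15)]

hist = rg_histogram(patch_rg, bins=10)
binary_hist = binarize_histogram(hist, threshold=0.4)
```

With a threshold of 0.4 the well-populated bin survives, while the bin holding the single stray pixel is set to 0.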
If I compare these histograms with the NCC space in Figure 8, I can see that the white spots are located at the proper color locations. After checking this, I can proceed to the next step, which is histogram backprojection. A pixel of the original image becomes white when the histogram value at its r and g coordinates is 1. When the histogram value is zero, the pixel is changed to black. The result is an image segmented based on the color of the patch. The resultant image of the non-parametric segmentation is shown in Figure 10. The threshold used for all of the images is 0.4 (the same threshold I used in parametric segmentation).
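Backprojection itself is just a lookup, as this minimal Python sketch shows. It assumes the image has already been converted to (r, g) pairs and that the binarized histogram is a square grid of 0s and 1s covering the range 0 to 1; the function name and the sample values are hypothetical.

```python
def backproject(image_rg, binary_hist):
    """Replace each (r, g) pixel by the value of the histogram bin it falls into."""
    bins = len(binary_hist)
    result = []
    for row in image_rg:
        out_row = []
        for r, g in row:
            i = min(int(r * bins), bins - 1)
            j = min(int(g * bins), bins - 1)
            out_row.append(binary_hist[i][j])
        result.append(out_row)
    return result

# Hypothetical 10 x 10 binarized histogram with a single white bin.
binary_hist = [[0] * 10 for _ in range(10)]
binary_hist[6][2] = 1

# Hypothetical 2 x 2 image in (r, g) coordinates.
image_rg = [
    [(0.65, 0.25), (0.33, 0.33)],
    [(0.68, 0.21), (0.10, 0.80)],
]

segmented = backproject(image_rg, binary_hist)
```

Only the pixels whose (r, g) values fall into the white bin come out as 1, so the left column is white and the right column is black.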
When comparing the results of the parametric (Figure 7) and non-parametric (Figure 10) segmentation, I noticed that, with the same threshold, parametric segmentation gives a clearer image of the isolated color than non-parametric segmentation. The reason is that much information is lost when the histogram is binarized in the non-parametric segmentation. Also, because the Gaussian PDF of the parametric segmentation is smooth, it includes a wider range of pixels around the patch mean.
The difference in runtime is in favor of parametric segmentation. This is not what I expected, but Scilab's ability to operate on whole matrices at once made the difference. In non-parametric segmentation, a histogram must be set up and each pixel's values must be looked up in it, which causes the delay. In parametric segmentation, the probabilities are calculated for all pixels simultaneously and then simply thresholded.
To go above and beyond in this activity, I will tackle the topic of choosing color patches. The color patches shown in Figure 5 each have a uniform saturation (no dark or light versions of the same hue). One of the reasons why the clothes were not segmented in full above is that some of their pixels did not reach the threshold. If I do not want to change the threshold, I can change the color patch itself. Let us take for example the image of the sky in Figure 11.
I want to segment the image such that the sky turns white (while everything else is black), keeping a constant threshold of 0.1. If I pick a patch like the one shown in Figure 12, I can expect most of the sky to be blacked out. The result of this color segmentation is shown in Figure 13.
I notice that only the upper parts are whited out. If I choose another patch like the one shown in Figure 14, I can expect another bad segmentation. The resultant image is shown in Figure 15.
Now the lower parts are white but the upper parts are black. If I change the color patch so that it covers a wide range of color saturation, I can expect much better results. Figure 16 shows the patch with a wide range of saturation and Figure 17 shows the segmented image.
This happens because a patch with a wide range of color saturation brings the mean closer to the true average of all the blue pixels. The standard deviation also increases greatly, which widens the Gaussian PDF. This means that more pixels are included even though the threshold stays at a strict 0.1. Although the result may not be perfect, the point here is to show that choosing the right patch is also a way to perform better segmentation.
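To put a rough number on this effect, here is a small Python sketch. It assumes a single coordinate and a Gaussian normalized so that its peak is 1, in which case a pixel passes the threshold t when exp(-(x - mean)^2 / (2 sigma^2)) > t. The function name and the two sigma values are hypothetical.

```python
import math

def acceptance_half_width(sigma, threshold):
    """Largest |x - mean| for which exp(-(x - mean)^2 / (2 sigma^2)) reaches the threshold."""
    return sigma * math.sqrt(2 * math.log(1 / threshold))

narrow = acceptance_half_width(sigma=0.01, threshold=0.1)  # uniform patch
wide = acceptance_half_width(sigma=0.05, threshold=0.1)    # patch with mixed saturation
```

At a threshold of 0.1 the accepted range is about 2.15 standard deviations on each side of the mean, so a patch whose standard deviation is five times larger accepts a range of r (or g) values that is five times wider.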
Review:
Come to think of it, this activity was given only six days to complete and I finished the programming part after two days. This means that I went “in the zone” because this activity is so awesome. For this activity, I would congratulate myself for all the effort I put in. Also, I really understood the topic very well. Since it’s already October and Christmas is coming fast, I will give myself a Christmas-themed score: 12! Because…