How to perform image binarization in Python

In the simplest terms, image binarization means converting an image to a black-and-white format.
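As a minimal sketch of the idea (the cut-off of 127 is just an arbitrary choice for illustration), binarization boils down to mapping every pixel to either 0 or 255:

import numpy as np

# toy grayscale "image"; any pixel above the arbitrary cut-off of 127 becomes white
gray = np.array([[12, 200],
                 [130, 90]], dtype=np.uint8)
binary = np.where(gray > 127, 255, 0).astype(np.uint8)
# binary is now [[0, 255], [255, 0]] -- every pixel is either pure black or pure white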

Original image (c), grayscale image (b), and binary image (a)

Most computer vision programs start by converting the image to a binary format (trust me, binarization will be one of the first steps in many of your computer vision applications). The more stripped-down the image is, the easier it becomes for the computer to process it and understand its underlying features (features that may be quite easy for human eyes to pick out).

Different ways to perform Image binarization in OpenCV

Several factors decide how binarization should be carried out. I will not cover every technique, but here are the most common ones:

  1. Canny Edge Detection
  2. Thresholding
  3. Segmentation (you can use it to create more than just the two segments, black and white, but we will focus on binarization)

Note: This article does not go into much depth about how these algorithms work. Instead, the focus is on how to actually get things done in OpenCV and Python. However, I will put in links and references in case you want to see how these algorithms work under the hood.

Canny Edge Detection

As the name suggests, this detector finds edges in an image. The edges detected by the process are white, while everything else is black. The Canny edge detection algorithm does this in five steps:
noise reduction, gradient calculation, non-maximum suppression, double threshold, and edge tracking by hysteresis. To see these steps in more detail, check this page.

Now comes the main part: how you can use it in OpenCV. This is what the Canny method asks of you:

cv2.Canny(image, lowerThreshold, upperThreshold, apertureSize, L2gradient)

You just have to provide your original image (a color image is fine; OpenCV's Canny method accepts it directly) plus the lower and upper threshold values. These three parameters are more than sufficient to get going with the Canny method. I will briefly explain what the last two parameters do, but you will hardly ever change them, and they are optional.

apertureSize: the kernel size used for the Sobel filter. The filter works pretty much the same way as a convolutional filter in a CNN. The default value is 3 (a 3×3 kernel).

L2gradient: a boolean specifying which equation is used to compute the gradient magnitude (the more accurate L2 norm if True, the L1 norm if False). Its default value is False.
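If you ever do want to touch them, a call with the optional parameters spelled out might look roughly like this (75 and 200 are just the common ballpark thresholds mentioned below, not fixed values):

import cv2

image = cv2.imread("3.jpg")
# apertureSize=5 uses a 5x5 Sobel kernel; L2gradient=True uses the more accurate L2 norm
edge_image = cv2.Canny(image, 75, 200, apertureSize=5, L2gradient=True)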

Now let me explain what the lower and upper threshold values mean. This is quite important and will directly impact your output.

The threshold values are used in Step 4 (double threshold) of the Canny algorithm. Mind you, these are not pixel values; they are minimum and maximum gradient values.

For example, if the thresholds are [0.1, 0.15], then edge pixels with gradients above the upper limit (0.15) are kept, and edge pixels below the lower limit (0.1) are discarded. Now, you may ask, "what about the pixels in between the upper and lower thresholds?" They are kept only if they are connected to pixels above the upper threshold. Thus we get a clean edge.

Frankly speaking, it was pretty difficult for me to understand these values. In practice they are trial-and-error values that you tune in your code on a case-by-case basis; there is no standard set of values that works for everything. I have seen some articles claim that the high value should be three times the low value, but that is not universal. Most often I have seen people use 75 and 200. You can give those a try and see if they help, but you are always free to use whatever values give you the best result. And if you are still not sure, you can use the following snippet (shoutout to pyimagesearch for this) to calculate the values automatically:

import numpy as np

sigma = 0.33  # lower sigma --> tighter thresholds; 0.33 is a sensible default
median = np.median(image)
lower = int(max(0, (1.0 - sigma) * median))
upper = int(min(255, (1.0 + sigma) * median))
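If you plan to reuse this, you can wrap it into a small helper (a sketch; auto_canny is just a name I picked, not an OpenCV function):

import cv2
import numpy as np

def auto_canny(image, sigma=0.33):
    # derive the Canny thresholds from the median pixel intensity
    median = np.median(image)
    lower = int(max(0, (1.0 - sigma) * median))
    upper = int(min(255, (1.0 + sigma) * median))
    return cv2.Canny(image, lower, upper)

Calling auto_canny(image) then gives you the edge map without worrying about the threshold values at all.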

So in the end, this is how your code will look in Python. Isn't it quite simple yet effective?

import cv2

image = cv2.imread("3.jpg")
edge_image = cv2.Canny(image, lower, upper)
cv2.imshow("edgeDetection", edge_image)
cv2.waitKey(0)

Original image (right), Canny edges (left)

Thresholding

A very simple technique: all pixels below a threshold value are marked black, and those above the threshold are marked white. Thresholding is actually a subset of segmentation (which will be discussed in the next section).

Thresholding is a simpler form of image segmentation: it is a way to create a binary image by setting a threshold value on the pixel intensity of the original image.

Let's say you want to threshold the image based on green pixels, so you set the threshold around green. All green pixels then become white in the thresholded image, and the rest become black. This would be called your Segment 1 (based on green pixels).

Now, if you decide to create another segment based on red pixels, you can set the threshold to red and apply thresholding again; the output would be called Segment 2. That is why we call thresholding a simpler form of segmentation: by applying thresholding multiple times you end up with something similar to what the segmentation technique gives you, as the sketch below illustrates.
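Here is a rough sketch of that idea under my own assumptions (splitting the BGR channels and using 150 as the cut-off are illustrative choices, not part of any standard recipe):

import cv2

image = cv2.imread("3.jpg")
b, g, r = cv2.split(image)  # BGR channel planes

# Segment 1: pixels with a strong green component become white
_, green_segment = cv2.threshold(g, 150, 255, cv2.THRESH_BINARY)

# Segment 2: pixels with a strong red component become white
_, red_segment = cv2.threshold(r, 150, 255, cv2.THRESH_BINARY)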

To see thresholding vs. segmentation, check here. Just the intro part is enough; don't go into the code, as it is not required.

In practice, you will use segmentation more often than thresholding, so in this article I will put more emphasis on segmentation. But just to show you how simple thresholding is in OpenCV, here is a simple code snippet:

ret, thresh_image = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY)

where 120 is the threshold and 255 is the value assigned to pixels that exceed it. If you want to explore this in depth, check this very nice explanation. Also note that adaptive thresholding is the variant you will use most often, so it is worth checking out with more care.
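Since adaptive thresholding is worth that extra look, here is a minimal sketch of a call (the block size 11 and constant 2 are common starting values I picked, not anything prescribed):

import cv2

gray = cv2.imread("3.jpg", cv2.IMREAD_GRAYSCALE)  # adaptiveThreshold expects a grayscale image
# the threshold is computed per 11x11 neighbourhood instead of one global value
adaptive_image = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                       cv2.THRESH_BINARY, 11, 2)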

Segmentation

If you went through this YouTube video, you might have encountered this image:

(From left) 1. Say you have an image and you want to binarize it based on the red dots. 2. This image is binarized based on the red dots using the thresholding technique. 3. You can repeat thresholding with different thresholds to obtain something similar to this image; the yellow segments were already created in image 2.

Instead of applying multiple thresholds to the original image, we can apply a segmentation technique to obtain the different segments. That is what segmentation is all about. But since the main topic of this article is binarization, we plan to create only one segment. This one segment could have been created with the thresholding technique, but why do that when we have the more powerful segmentation technique?

Enough theory; that's not why we are here, so let's start with the code. When doing segmentation, we first convert our image to an HSV image. Just as we have BGR and RGB encodings for an image, we also have the HSV encoding.

hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

Then we pass our HSV image to the cv2.inRange function to get the segmented image. That's how simple it is:

seg_image = cv2.inRange(hsv, np.array([-10, 50, 100]), np.array([50, 150, 225]))
cv2.imshow("segmentation", seg_image)
cv2.waitKey(0)

Now, what are the two arrays we passed to the function? They are the minimum and maximum segmentation (or thresholding) values. Most importantly, those values are in HSV format, not RGB or BGR. So how do we get the HSV values? Another snippet to the rescue:

# get the BGR values of your target colour from https://imagecolorpicker.com/en
bgrColor = np.uint8([[[92, 113, 165]]])  # note: BGR order, not RGB
hsvColor = cv2.cvtColor(bgrColor, cv2.COLOR_BGR2HSV)
# cast to int so the +/- arithmetic below can't wrap around uint8
h, s, v = int(hsvColor[0][0][0]), int(hsvColor[0][0][1]), int(hsvColor[0][0][2])

upper = np.array([h + 10, s + 10, v + 40])
lower = np.array([h - 10, s - 10, v - 40])

Get the BGR values of the relevant part of the image using this URL (if you already know the BGR value of your segment colour, you don't need it). Put that value in place of [92, 113, 165] (it's BGR, not RGB). The rest of the code then gives you the minimum and maximum HSV values to use in the inRange function. I have used ±10, ±10, ±40 just to obtain some ballpark minimum and maximum values, but you will definitely have to adjust the final HSV values to fit your needs (tip from personal experience: subtract 10 from the minimum array, add 10 to the maximum array, and look at the output; if the segmentation isn't perfect yet, keep adjusting the values by ±10). I will emphasize again: choosing the correct threshold is the most important and trickiest part here, so don't settle for just one value, try more.
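Putting the pieces together, an end-to-end sketch of the segmentation flow could look roughly like this (the file name and the ±10/±40 offsets are carried over from the snippets above, so treat them as placeholders):

import cv2
import numpy as np

image = cv2.imread("3.jpg")
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# target colour picked with the colour picker, in BGR order
bgrColor = np.uint8([[[92, 113, 165]]])
hsvColor = cv2.cvtColor(bgrColor, cv2.COLOR_BGR2HSV)
h, s, v = int(hsvColor[0][0][0]), int(hsvColor[0][0][1]), int(hsvColor[0][0][2])

lower = np.array([h - 10, s - 10, v - 40])
upper = np.array([h + 10, s + 10, v + 40])

seg_image = cv2.inRange(hsv, lower, upper)
cv2.imshow("segmentation", seg_image)
cv2.waitKey(0)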

Once everything is in place you will have a segmented image, and the output will look something like this:

Left: original image, right: segmented image

Which one to use?

If you have made it this far, you are probably wondering which technique is best suited for you. In computer vision and deep learning, there is never a sure winner: you try all the approaches and go ahead with whichever yields the best results. So give them all a try and then decide. But if you think you can skip image binarization in computer vision, things will get difficult for you pretty soon.