
Python-based image enhancement technology

王林
2023-08-30 22:05:10

Let me start this tutorial with some theory. When we talk about image enhancement, we mean producing a new version of an image that is more suitable for its purpose than the original. For example, when you scan a document, the scanned image may be of lower quality than the original document. We therefore need a way to improve the quality of such images so that they are clearer to the viewer, and this is where image enhancement comes into play. Enhancing an image means sharpening some of its features, such as contrast and edges.

It should be noted that image enhancement does not increase the information content of the image. Instead, it increases the dynamic range of the selected features, ultimately improving the perceived quality of the image. We don't know in advance what the output image will look like, but we should be able to tell (subjectively) whether it has improved, for example by observing more detail in the output image.

Image enhancement is often used as a pre-processing step ahead of the other basic steps in digital image processing (such as segmentation and representation). There are many techniques for image enhancement, but I will introduce two in this tutorial: image inversion and power law transformation. We'll look at how to implement them in Python. Let's start!

Image inversion

As you might have guessed from the title of this section (the operation is also called the image inverse), the purpose of image inversion is to convert the dark intensities in the input image into light intensities in the output image, and the light intensities in the input image into dark intensities in the output image. In other words, dark areas become lighter and light areas become darker.

Assume I(i,j) refers to the intensity value of the pixel located at (i,j). To clarify, intensity values in grayscale images fall within the range [0,255], and (i,j) refers to the row and column values, respectively. When we apply the image inverse operator to a grayscale image, the output pixel value O(i,j) is:

O(i,j) = 255 - I(i,j)

Nowadays, most of our images are in color. These images contain three channels, red, green, and blue, and are called RGB images. In this case, in contrast to the formula above, we need to subtract the intensity of each channel from 255. The output image therefore has the following values at pixel (i,j):

O_R(i,j) = 255 - R(i,j)
O_G(i,j) = 255 - G(i,j)
O_B(i,j) = 255 - B(i,j)

After this introduction, let's take a look at how to implement the image inverse operator in Python. For the sake of simplicity, I will run this operator on a grayscale image. I'll give you some ideas about applying the operator to color images, though, and leave the complete program to you as an exercise.
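
Before moving on to real images, the formulas above can be tried out on a couple of tiny hand-made arrays. This is only a minimal sketch: it assumes NumPy is installed, and the arrays `gray` and `rgb` are made-up values for illustration, not data from the tutorial's images.

```python
import numpy as np

# Hypothetical 2x3 grayscale image with 8-bit intensities in [0, 255].
gray = np.array([[0, 50, 100],
                 [150, 200, 255]], dtype=np.uint8)

# O(i,j) = 255 - I(i,j), applied to every pixel at once.
inverted_gray = 255 - gray

# The same expression works on an RGB array of shape (height, width, 3),
# because the subtraction is applied to each channel value independently.
rgb = np.array([[[180, 168, 178], [0, 0, 0]]], dtype=np.uint8)
inverted_rgb = 255 - rgb

print(inverted_gray.tolist())  # [[255, 205, 155], [105, 55, 0]]
print(inverted_rgb.tolist())   # [[[75, 87, 77], [255, 255, 255]]]
```

Black (0) becomes white (255) and vice versa, which is exactly the dark-to-light swap described earlier.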

For color images, the first thing you need to do is extract the intensity value of each channel (i.e. R, G, and B) for a pixel. To do this, you can use the Python Imaging Library (PIL), which is available today as the Pillow package. Go ahead and download the sample baboon image, baboon.png. The size of the image is 500x500. Suppose you want to extract the red, green, and blue intensity values at pixel location (325, 432). This can be done as follows:

from PIL import Image

im = Image.open('baboon.png')
print(im.getpixel((325,432)))

According to the documentation, the getpixel() method does the following:

Returns the pixel value at a given position.

After running the above script, you will find that you get only a single value:

138

But where are the intensity values for the three channels (RGB)? The problem seems to be related to the mode of the image being read. Check the mode by running the following statement:

print(im.mode)

You will get the output P, which means the image was read in palette mode. One thing you can do is convert the image to RGB mode before reading the intensity values of the different channels. To do this, you can use the convert() method as follows:

rgb_im = im.convert('RGB')

In this case, calling getpixel((325,432)) on rgb_im returns the following value: (180, 168, 178). This means that the intensity values for the red, green, and blue channels are 180, 168, and 178, respectively.

Putting everything we've described so far together, a Python script that returns the RGB values of a pixel looks like this:

from PIL import Image

im = Image.open('baboon.png')
rgb_im = im.convert('RGB')
print(rgb_im.getpixel((325,432)))
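
If you don't have baboon.png at hand, the palette-mode behavior can be reproduced with an image built in memory. The sketch below is an illustration under assumptions: the 2x2 image `pal_im` and its palette are made up, and palette entry 1 is simply set to the RGB value seen above.

```python
from PIL import Image

# Build a tiny palette-mode ('P') image in memory; no file is needed.
# Every pixel stores palette index 1.
pal_im = Image.new('P', (2, 2), 1)

# A palette is a flat list of 256 RGB triples; entry 1 is set to (180, 168, 178).
palette = [0] * 768
palette[3:6] = [180, 168, 178]
pal_im.putpalette(palette)

print(pal_im.mode)                             # P
print(pal_im.getpixel((0, 0)))                 # 1, the palette index
print(pal_im.convert('RGB').getpixel((0, 0)))  # (180, 168, 178)
```

In palette mode, getpixel() returns an index into the palette rather than a color, which is why a single number came back for the baboon image.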

There is one point left before moving on to the image inverse operator. The above example shows how to retrieve the RGB value of only one pixel, but when applying the inverse operator, you need to do this for all pixels. To print the intensity values of the different channels for every pixel, you can do the following:

from PIL import Image

im = Image.open('baboon.png')
rgb_im = im.convert('RGB')
width, height = im.size

# Visit every pixel; getpixel() takes an (x, y) coordinate
for w in range(width):
    for h in range(height):
        print(rgb_im.getpixel((w, h)))

At this point, I'll leave it as an exercise for you to work out how to apply the image inverse operator to all the color channels (i.e. RGB) of each pixel.
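
As a hint for the exercise, all pixel values can also be read in one step by converting the image to a NumPy array instead of calling getpixel() in a loop. This is a minimal sketch, assuming NumPy and Pillow are installed; the 3x2 image `small_im` is a made-up stand-in for a real file.

```python
import numpy as np
from PIL import Image

# Hypothetical 3x2 (width x height) RGB image filled with one color.
small_im = Image.new('RGB', (3, 2), (10, 20, 30))
small_im.putpixel((2, 1), (180, 168, 178))  # x=2, y=1

pixels = np.array(small_im)

# NumPy orders the axes as (row, column, channel), i.e. (y, x, channel),
# whereas getpixel() takes (x, y).
print(pixels.shape)           # (2, 3, 3)
print(pixels[1, 2].tolist())  # [180, 168, 178], same pixel as getpixel((2, 1))

# Each channel is a 2-D slice of the array.
red, green, blue = pixels[..., 0], pixels[..., 1], pixels[..., 2]
```

Note the swapped index order: the array is indexed by row first, so the pixel at x=2, y=1 is `pixels[1, 2]`.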

Let’s look at an example of applying the image inverse operator on a grayscale image. Go ahead and download boat.png, which will serve as our test image in this section. It looks like this:

[Image: the boat.png test image]

I will use the numpy library for this task. The Python script to apply the image inverse operator to the image above looks like this:

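import cv2
import numpy as np
from PIL import Image

# Open the image with Pillow and convert it to a NumPy array
img = Image.open('boat.png')
array_img = np.array(img)

# Bitwise NOT; for 8-bit unsigned pixels this equals 255 - value
image_invert = np.invert(array_img)

# Save the inverted image
cv2.imwrite('new_boat.jpg', image_invert)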

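NumPy is a Python package for scientific computing. OpenCV-Python is a library designed to solve computer vision problems. OpenCV-Python depends on NumPy, so installing OpenCV-Python with pip also installs NumPy if it is missing. We first open the image with Pillow and then convert it to a numpy array.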

We then use numpy's invert() function to invert the image, and save the new inverted image. The invert() function converts white to black and vice versa.
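
It may help to see why invert() matches the formula O(i,j) = 255 - I(i,j). The following sketch, which assumes only NumPy and uses made-up sample values, shows that the two agree for unsigned 8-bit data but not for other data types.

```python
import numpy as np

values = np.array([0, 1, 127, 254, 255], dtype=np.uint8)

# For unsigned 8-bit data, flipping every bit is the same as 255 - value.
flipped = np.invert(values)
print(flipped.tolist())                       # [255, 254, 128, 1, 0]
print(np.array_equal(flipped, 255 - values))  # True

# With a signed type the bitwise NOT is -(value + 1), not 255 - value.
signed = np.array([0, 255], dtype=np.int16)
print(np.invert(signed).tolist())             # [-1, -256]
```

So invert() is a safe shortcut only when the array holds 8-bit unsigned pixels; for other types, writing the subtraction explicitly is clearer.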

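Below, the original image is on the left and the newly inverted image is on the right.

[Image: original boat image (left) and inverted image (right)]

Note that some features of the image become clearer after applying the operator. For example, look at the clouds and the lighthouse in the image on the right.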

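Power law transformation

This operator, also known as gamma correction, is another operator we can use to enhance an image. Let's look at the operator's equation. At pixel (i,j), the operator looks like this: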

p(i,j) = kI(i,j)^gamma

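I(i,j) is the intensity value at image location (i,j), and k and gamma are positive constants. I won't go into the mathematical details here, but I'm sure you can find a thorough explanation of the topic in image processing books. It is worth noting, however, that in most cases k=1, so we will mainly be changing the value of gamma. The equation above can therefore be simplified to: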

p(i,j) = I(i,j)^gamma

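I will use the OpenCV and NumPy libraries here. If you need to learn more about NumPy, check out my tutorial Introducing NumPy. Our test image will again be the boat image, boat.tiff (go ahead and download it).

The Python script that performs the power law transformation operator looks like this: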

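import cv2
import numpy as np

# Read the image and scale the intensities to the range [0, 1]
im = cv2.imread('boat.tiff')
im = im/255.0

# Raise every pixel to the power gamma = 0.6 (k = 1)
im_power_law_transformation = cv2.pow(im,0.6)

# Display both images until a key is pressed
cv2.imshow('Original Image',im)
cv2.imshow('Power Law Transformation',im_power_law_transformation)
cv2.waitKey(0)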

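Note that the gamma value we chose is 0.6. The figure below shows the original image and the result of applying the power law transformation operator to it (the original image is on the left, and the result of the operator is on the right).

[Image: original boat image (left) and power law transformation with gamma = 0.6 (right)]

The result above is for gamma = 0.6. Let's see what happens when we increase gamma to 1.5, for example:

[Image: power law transformation with gamma = 1.5]

Note that the image becomes darker as we increase the gamma value, and vice versa.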
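
The darkening and brightening effect can be checked numerically without any image file. The sketch below is an illustration under assumptions: `apply_gamma` is a hypothetical helper written for this example (it is not an OpenCV or NumPy function), and `ramp` is a made-up three-pixel image. It also rescales the result back to 8 bits, which the script above does not do.

```python
import numpy as np

def apply_gamma(image, gamma, k=1.0):
    """Apply p = k * I**gamma to an 8-bit image and return an 8-bit result.

    apply_gamma is a hypothetical helper, not part of OpenCV or NumPy.
    """
    normalized = image.astype(np.float64) / 255.0   # scale to [0, 1]
    transformed = k * np.power(normalized, gamma)
    # Clip in case k pushes values outside [0, 1], then rescale to [0, 255].
    return np.round(np.clip(transformed, 0.0, 1.0) * 255.0).astype(np.uint8)

# Synthetic one-row "image": black, mid-gray, white.
ramp = np.array([[0, 128, 255]], dtype=np.uint8)

brighter = apply_gamma(ramp, 0.6)   # gamma < 1 lifts the mid-tones
darker = apply_gamma(ramp, 1.5)     # gamma > 1 pushes the mid-tones down
print(brighter.tolist(), darker.tolist())
```

Because the intensities are normalized to [0, 1] first, black and white stay where they are and only the values in between move: up for gamma below 1, down for gamma above 1.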

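You may be asking what the power law transformation is good for. In fact, the different devices used for image acquisition, printing, and display respond according to a power law. This is related to the way human vision responds to brightness, which is nonlinear and approximately follows a power law. For instance, gamma correction is considered important when we want an image to be displayed correctly (with the best image contrast) on a computer monitor or television screen.

Conclusion

In this tutorial, you learned how to enhance images using Python. You saw how to highlight features using the image inverse operator, and how the power law transformation is a key operator for displaying images correctly on computer monitors and television screens.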
