Feature Types and Image Segmentation

小飞侠xp

Published 2018-08-29 15:18:00

Collected in the column: 书山有路勤为径

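Feature types

Most features in any image fall into one of three broad categories: edges, corners, and blobs.

Edges: regions where image intensity changes abruptly, also called regions of high intensity gradient
Corners: places where two edges meet, which look like a corner or a sharp point
Blobs: regions grouped by some characteristic, such as especially high or low intensity, or a distinctive texture

Corners are what we most want to detect, because they are the most repeatable features: given two or more images of the same scene, we can easily recognize the same corners in each.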

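A simple example:

Look at this Mondrian painting and the three small patches A, B, and C taken from it. Given only these three patches, can you tell which rectangular region of the image each one comes from?

A is just a plain block of color that matches many rectangular regions. Because it is not unique, it is not a good feature.

B is an edge. Judging by its orientation, B matches the bottom edge of the red rectangle, but we can slide it left or right along that edge and it still matches. We can only estimate roughly where this edge lies in the image; its exact position is hard to pin down.

C is a corner (in fact it contains two corners), and its position is easy to determine: the bottom-right corner. That is because a corner marks the intersection of two edges. Corners are therefore the easiest features to match and are unique, which makes them good features.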

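Corner detection

Changes in image intensity are often called image gradients, and we can use gradient measurements to detect corners.

Every gradient measurement has a magnitude, which measures how strong the change is, and a direction, which indicates the direction of the intensity change.

Both values can be computed with the Sobel operator, which measures the intensity change (the image gradient) in the x and y directions separately.

Here I plotted these two gradients for an image of a mountain. They are called Gx and Gy, where G stands for gradient.

These two images look a little different from the earlier convolution-kernel results because they have not been converted to binary thresholded images; that conversion is not needed here.

To compute the magnitude and direction of the total gradient from these two components, convert them from the image's x-y coordinates to polar coordinates, with ρ as the magnitude and θ as the direction.

Think of Gx and Gy as the two legs of a right triangle: Gx is the length of the base and Gy is the length of the right-hand side. The total gradient magnitude ρ is then the hypotenuse, that is, the square root of the sum of the squares of the two gradients, and the gradient direction θ is the arctangent of Gy divided by Gx.

Note: the original figure here was wrong. The correct formulas are rho = sqrt(Gx^2 + Gy^2) and theta = arctan(Gy / Gx).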
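To make the polar conversion concrete, here is a minimal NumPy-only sketch. The helper name sobel_gradients and the tiny test image are my own assumptions for illustration; in practice cv2.Sobel would compute Gx and Gy. The sketch applies the 3 x 3 Sobel kernels by slicing and leaves border pixels at zero.

```python
import numpy as np

def sobel_gradients(gray):
    """Return (Gx, Gy) for a 2D array using 3x3 Sobel kernels.

    Hypothetical helper: border pixels are left at zero to keep the sketch short.
    """
    g = gray.astype(np.float64)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    # Horizontal change: right column minus left column, centre row weighted by 2
    gx[1:-1, 1:-1] = (
        (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:])
        - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    )
    # Vertical change: bottom row minus top row, centre column weighted by 2
    gy[1:-1, 1:-1] = (
        (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:])
        - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    )
    return gx, gy

# Made-up test image: dark left half, bright right half (a vertical edge)
demo = np.zeros((5, 6))
demo[:, 3:] = 1.0

Gx, Gy = sobel_gradients(demo)
rho = np.sqrt(Gx**2 + Gy**2)   # gradient magnitude
theta = np.arctan2(Gy, Gx)     # gradient direction in radians
```

I used np.arctan2 rather than a plain arctan of Gy / Gx because it avoids dividing by zero when Gx is 0 and keeps the sign of both components.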

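Many corner detectors take a window and slide it up, down, left, and right across different regions of the gradient image. When the window passes over a corner, it sees an abrupt change in the gradient direction and magnitude computed above, and that is how the corner is detected.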
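The sliding-window idea can be sketched with a Harris-style response. This is a simplified NumPy version under my own assumptions (central-difference gradients from np.gradient instead of Sobel, an unweighted odd-sized box window, and the hypothetical names box_sum and harris_response); it is not what OpenCV does internally. The response is positive where the gradient varies in two directions (a corner), negative along an edge, and zero in flat areas.

```python
import numpy as np

def box_sum(a, size):
    """Sum each size x size neighbourhood (zero padded, odd size), same shape as a."""
    pad = size // 2
    padded = np.pad(a, pad, mode="constant")
    out = np.zeros(a.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(gray, window=3, k=0.04):
    """Harris-style corner response R = det(M) - k * trace(M)^2 per pixel."""
    gy, gx = np.gradient(gray.astype(np.float64))  # axis 0 is y, axis 1 is x
    sxx = box_sum(gx * gx, window)
    syy = box_sum(gy * gy, window)
    sxy = box_sum(gx * gy, window)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Made-up test image: a bright square on a dark background
square_img = np.zeros((20, 20))
square_img[6:14, 6:14] = 1.0
R = harris_response(square_img)
```

In this example R[6, 6] (a corner of the square) is positive, R[6, 10] (the middle of the top edge) is negative, and R[10, 10] (inside the square) is zero.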

Harris Corner Detection

Copy the image and convert it to the RGB color space.

# Import resources and display image
import matplotlib.pyplot as plt
import numpy as np
import cv2

%matplotlib inline

# Read in the image
image = cv2.imread('images/waffle.jpg')

# Make a copy of the image
image_copy = np.copy(image)

# Change color to RGB (from BGR)
image_copy = cv2.cvtColor(image_copy, cv2.COLOR_BGR2RGB)

plt.imshow(image_copy)

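Corner detection relies on intensity changes, so first convert the image to grayscale, then convert the values to floating point so that the Harris corner detector can use them.

Next, create the Harris corner detector.

The function takes the grayscale float image and the size of the pixel neighborhood to examine when looking for potential corners; 2 means a 2 x 2 pixel block (the corners in this example are obvious, so a small window like this is enough). Next comes the size of the Sobel operator; 3 is the typical kernel size. The last argument is a constant that helps decide which points count as corners. It is usually set to 0.04; setting it slightly lower yields more detected corners.

The function's output image is named dst. It marks corners as bright pixels and non-corners as darker pixels. In practice the bright corners are hard to see in this image, so I add one more step to process them, called dilation. OpenCV's dilate function is applied to the detected corners. In computer vision, dilation enlarges bright regions, or regions in the foreground such as these corners, so that we can see them more clearly.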

# Convert to grayscale
gray = cv2.cvtColor(image_copy, cv2.COLOR_RGB2GRAY)
gray = np.float32(gray)

# Detect corners 
dst = cv2.cornerHarris(gray, 2, 3, 0.04)

# Dilate corner image to enhance corner points
dst = cv2.dilate(dst,None)

plt.imshow(dst, cmap='gray')
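To see what dilation does to a single bright response, here is a small NumPy sketch. It assumes a 3 x 3 square neighborhood, which as far as I know is what cv2.dilate uses when the kernel argument is None; the function name dilate3x3 and the 5 x 5 array are made up for illustration.

```python
import numpy as np

def dilate3x3(img):
    """Replace each pixel by the maximum of its 3x3 neighbourhood."""
    lowest = img.min()
    # Pad with the smallest value so the border never introduces new bright pixels
    padded = np.pad(img, 1, mode="constant", constant_values=lowest)
    out = np.full(img.shape, lowest, dtype=img.dtype)
    for dy in range(3):
        for dx in range(3):
            out = np.maximum(out, padded[dy:dy + img.shape[0], dx:dx + img.shape[1]])
    return out

# Made-up corner response with a single bright pixel in the centre
response = np.zeros((5, 5), dtype=np.float32)
response[2, 2] = 1.0
grown = dilate3x3(response)
```

The single bright pixel grows into a 3 x 3 bright block, which is why the corners become easier to see after dilation.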

Extract and display strong corners

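To select the brightest corners I need to define a threshold that corners must pass. Here I set a fairly low threshold: at least one tenth of the maximum corner response.

Create a copy of the image to draw the corners on
If a corner's response is greater than the threshold we defined, draw it on the copy
Draw the strong corners on the image copy as small green circles
Most of the corners are detected, though a few are missing. Try lowering the threshold, for example to 1% of the maximum corner response, and plot the result again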

# Define a threshold for extracting strong corners
# This value varies depending on the image and how many corners you want to detect
# Try changing this free parameter, 0.1, to be larger or smaller and see what happens
thresh = 0.1*dst.max()

# Create an image copy to draw corners on
corner_image = np.copy(image_copy)

# Iterate through all the corners and draw them on the image (if they pass the threshold)
for j in range(0, dst.shape[0]):
    for i in range(0, dst.shape[1]):
        if(dst[j,i] > thresh):
            # image, center pt, radius, color, thickness
            cv2.circle( corner_image, (i, j), 1, (0,255,0), 1)

plt.imshow(corner_image)
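The nested loop above works, but it is slow on large images. As a possible alternative, the same threshold test can be done in one step with NumPy. The small response array below is a made-up stand-in for dst, so this is only a sketch of the idea.

```python
import numpy as np

# Made-up stand-in for the Harris output `dst`
response = np.array([[0.0, 0.2, 0.0],
                     [0.0, 1.0, 0.05],
                     [0.5, 0.0, 0.0]])

thresh = 0.1 * response.max()

# Row and column indices of every pixel above the threshold
rows, cols = np.nonzero(response > thresh)

# (x, y) order, which is what cv2.circle expects for its centre argument
strong_corners = list(zip(cols.tolist(), rows.tolist()))

# Lowering the threshold to 1% of the maximum lets weaker corners through
count_at_1_percent = int(np.count_nonzero(response > 0.01 * response.max()))
```

Each (x, y) pair in strong_corners could then be passed to cv2.circle exactly as in the loop version.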

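Morphological operations: dilation and erosion

Image Segmentation

Now that we are familiar with a few simple feature types, the next question is how to use these features to group different parts of an image together.

Grouping or dividing an image into distinct parts is called image segmentation.

The simplest case of image segmentation is background subtraction. In video and other applications, a person often has to be isolated from a static or moving background, so we need a segmentation method to tell these regions apart. Image segmentation is also used in a variety of complex recognition tasks, for example when classifying every pixel in an image of a road.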
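As a rough illustration of background subtraction with a static background, the sketch below marks every pixel whose grayscale value differs from a reference background frame by more than a chosen amount. The function name foreground_mask, the min_diff value of 30, and the 4 x 4 frames are all assumptions made for this example.

```python
import numpy as np

def foreground_mask(frame, background, min_diff=30):
    """Boolean mask of pixels that differ from the background by more than min_diff."""
    # Cast to a signed type first so uint8 subtraction cannot wrap around
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > min_diff

# Made-up grayscale frames: a uniform background and a frame with a bright object
background = np.full((4, 4), 100, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200

mask = foreground_mask(frame, background)
```

Only the four pixels covered by the object end up in the mask; a moving background would need a more adaptive model than this single reference frame.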

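We will cover a couple of ways to segment an image:

Using contours to draw the boundaries of different parts of an image
Clustering image data by some measure of color or texture similarity

Image Contours

Edge detection algorithms are often used to detect object boundaries, but the detected edges usually include more than object boundaries: they also pick up other interesting features and lines. For image segmentation we want only complete, closed boundaries, because those are what actually mark out specific image regions and objects. Image contouring gives us exactly that.

An image contour is a continuous curve formed by the edges that lie along a boundary. Contours can therefore be used for image segmentation, and they provide a lot of information about the shape of an object's boundary.

In OpenCV, contour detection works best when the object is white and the background is black. So before finding contours we first create a binary thresholded image, which uses black and white pixels to separate the different objects in the image, and then use the edges of those objects to form contours. Such a binary image is usually produced either by a simple threshold or by a Canny edge detector.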

# Import resources and display image
import numpy as np
import matplotlib.pyplot as plt
import cv2

%matplotlib inline

# Read in the image
image = cv2.imread('images/thumbs_up_down.jpg')

# Change color to RGB (from BGR)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.imshow(image)

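First convert the image to grayscale, then apply an inverse binary threshold so that the hands appear white, rather than the background appearing white as before. This produces the binary image.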

# Convert to grayscale
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)

# Create a binary thresholded image
retval, binary = cv2.threshold(gray, 225, 255, cv2.THRESH_BINARY_INV)

plt.imshow(binary, cmap='gray')
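For readers who want to see the rule behind the inverse binary threshold, here is an equivalent NumPy sketch: pixels brighter than the threshold become 0, and everything else becomes the maximum value. The function name binary_inv and the 3 x 3 sample are made up; the defaults mirror the 225 and 255 used above.

```python
import numpy as np

def binary_inv(gray, thresh=225, maxval=255):
    """Inverse binary threshold: 0 where gray > thresh, maxval elsewhere."""
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)

# Made-up grayscale sample: a bright background with a few darker "object" pixels
gray_demo = np.array([[250, 240, 100],
                      [255,  90, 226],
                      [225, 230,  10]], dtype=np.uint8)

binary_demo = binary_inv(gray_demo)
```

Note that a pixel exactly equal to the threshold (225 here) counts as not brighter, so it becomes white.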

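Find and draw contours

Use OpenCV's findContours function. Its arguments are our binary image, the contour retrieval mode (tree mode here), and the contour approximation method (simple chain approximation here).

The function outputs a list of contours and the contour hierarchy. The hierarchy is very useful when many contours are nested inside one another, because it defines the relationships between contours; see the documentation for details.

To draw the contours, use OpenCV's drawContours function. Its arguments are the image copy, the list of contours, which contours to draw (-1 means all of them), and the color and thickness of the contour lines.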

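# Find contours from thresholded, binary image
# OpenCV 3.x returns (image, contours, hierarchy); OpenCV 4.x returns (contours, hierarchy),
# so take the last two return values to work with either version
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]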

# Draw all contours on a copy of the original image
contours_image = np.copy(image)
contours_image = cv2.drawContours(contours_image, contours, -1, (0,255,0), 3)

plt.imshow(contours_image)

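Contour features

Every contour has a number of features that you can calculate, including its area, its orientation (the direction most of the contour points in), its perimeter, and many other properties outlined in the OpenCV documentation.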
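To give a feel for two of these features, the sketch below computes area and perimeter directly from the contour points with NumPy. OpenCV provides cv2.contourArea and cv2.arcLength for this; the version here uses the shoelace formula and is only meant to show the idea. The function name and the rectangular contour are made up, and the contour is assumed to use OpenCV's (N, 1, 2) layout of (x, y) points.

```python
import numpy as np

def contour_area_perimeter(contour):
    """Area (shoelace formula) and perimeter of a closed contour.

    `contour` is assumed to have shape (N, 1, 2) holding (x, y) points.
    """
    pts = contour.reshape(-1, 2).astype(np.float64)
    nxt = np.roll(pts, -1, axis=0)  # each point's successor, wrapping around
    area = 0.5 * abs(np.sum(pts[:, 0] * nxt[:, 1] - nxt[:, 0] * pts[:, 1]))
    perimeter = np.sum(np.hypot(*(nxt - pts).T))
    return area, perimeter

# Made-up contour: a 4 x 3 rectangle
rect_contour = np.array([[[0, 0]], [[4, 0]], [[4, 3]], [[0, 3]]], dtype=np.int32)
rect_area, rect_perimeter = contour_area_perimeter(rect_contour)
```

For this rectangle the area comes out as 12 and the perimeter as 14, as expected for a 4 x 3 shape.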

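Orientation:

The orientation of an object is the angle at which the object is pointing. To find the angle of a contour, first fit an ellipse to the contour and then extract the angle from that shape.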

# Fit an ellipse to a contour and extract the angle from that ellipse
(x,y), (MA,ma), angle = cv2.fitEllipse(selected_contour)

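These orientation values are in degrees, measured from the x-axis. A value of zero means a flat line, and a value of 90 means the contour is pointing straight up!

So the orientation angle computed for each contour should tell us something about the general position of the hand. The thumbs-up hand should have a higher orientation (closer to 90 degrees) than the thumbs-down hand.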

TODO: Find the orientation of each contour

## TODO: Complete this function so that 
## it returns the orientations of a list of contours
## The list should be in the same order as the contours
## i.e. the first angle should be the orientation of the first contour
def orientations(contours):
    """
    Orientation 
    :param contours: a list of contours
    :return: angles, the orientations of the contours
    """
    
    # Create an empty list to store the angles in
    # Tip: Use angles.append(value) to add values to this list
    angles = []
    for selected_contour in contours:
        (x,y), (MA,ma), angle = cv2.fitEllipse(selected_contour)
        angles.append(angle)
    return angles


# ---------------------------------------------------------- #
# Print out the orientation values
angles = orientations(contours)
print('Angles of each contour (in degrees): ' + str(angles))

Bounding Rectangle

# Find the bounding rectangle of a selected contour
x,y,w,h = cv2.boundingRect(selected_contour)
# Draw the bounding rectangle as a purple box
box_image = cv2.rectangle(contours_image, (x,y), (x+w,y+h), (200,0,200),2)
# To crop the image, select the correct width and height of the region to keep.
# Crop using the dimensions of the bounding rectangle (x, y, w, h)
cropped_image = image[y: y + h, x: x + w]
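The bounding rectangle is just the minimum and maximum of the contour's coordinates, which the following NumPy sketch makes explicit. The function name bounding_rect, the sample contour, and the 8 x 8 array are assumptions for illustration. The + 1 makes the box include the extreme pixels, which is my understanding of how cv2.boundingRect treats integer points.

```python
import numpy as np

def bounding_rect(contour):
    """Return (x, y, w, h) of the upright box enclosing a contour of shape (N, 1, 2)."""
    pts = contour.reshape(-1, 2)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    # + 1 so that the pixels at x_max and y_max are inside the box
    return int(x_min), int(y_min), int(x_max - x_min + 1), int(y_max - y_min + 1)

# Made-up contour and image
contour_demo = np.array([[[2, 1]], [[5, 1]], [[5, 4]], [[2, 4]]], dtype=np.int32)
image_demo = np.arange(8 * 8).reshape(8, 8)

bx, by, bw, bh = bounding_rect(contour_demo)
# Rows are indexed by y and columns by x, so y comes first when slicing
crop = image_demo[by:by + bh, bx:bx + bw]
```

The slicing order is the part that is easy to get wrong: NumPy images are indexed as [row, column], that is [y, x].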

TODO: Crop the image around a contour

## TODO: Complete this function so that
## it returns a new, cropped version of the original image
def left_hand_crop(image, selected_contour):
    """
    Left hand crop 
    :param image: the original image
    :param selected_contour: the contour that will be used for cropping
    :return: cropped_image, the cropped image around the left hand
    """
    
    ## TODO: Detect the bounding rectangle of the left hand contour
 
    x,y,w,h = cv2.boundingRect(selected_contour)
    ## TODO: Crop the image using the dimensions of the bounding rectangle
    # Make a copy of the image to crop
    cropped_image = np.copy(image)
    cropped_image = cropped_image[y: y + h, x: x + w] 
    return cropped_image


## TODO: Select the left hand contour from the list
## Replace this value
selected_contour = contours[1]


# ---------------------------------------------------------- #
# If you've selected a contour
if(selected_contour is not None):
    # Call the crop function with that contour passed in as a parameter
    cropped_image = left_hand_crop(image, selected_contour)
    plt.imshow(cropped_image)

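K-means clustering

A commonly used image segmentation technique is k-means clustering, which clusters, or groups, data points with similar features.

Let's look at a simple example to make k-means more concrete.

This image is tiny, only 34 x 34 pixels, and shows part of a rainbow. I will use k-means to split it into three clusters by color.

First, every pixel in the image has an RGB value, so we can plot each pixel value as a data point in RGB color space.

If I ask k-means to split this image data into three clusters, it looks at the pixel values and randomly guesses three RGB points, the initial centers, and splits the data into three clusters around them.

k-means takes the actual average (the mean) of all RGB values in each cluster, then updates the three center points to the corresponding means
That is, it moves each previously guessed center to the position of its cluster mean
It repeats the process: form new clusters around the adjusted centers, recompute the cluster means, and update the centers again
The centers generally move less with each iteration. The algorithm repeats these steps until it converges, and we define what convergence means: for example, stop after 10 iterations, or stop when the centers move less than some amount between iterations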
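The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation behind cv2.kmeans: the function names, the six sample pixels, and the explicit init centers are my own assumptions. By default the sketch picks random data points as the initial guesses; the example passes fixed guesses so that the result is reproducible.

```python
import numpy as np

def nearest_center(points, centers):
    """Index of the closest centre for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans_sketch(points, k, init=None, max_iter=10, eps=1e-4, seed=0):
    points = np.asarray(points, dtype=np.float64)
    if init is None:
        # Random initial guess: k distinct data points
        rng = np.random.default_rng(seed)
        centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=np.float64).copy()
    for _ in range(max_iter):
        labels = nearest_center(points, centers)
        # Move each centre to the mean of its cluster (unchanged if the cluster is empty)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < eps:  # converged: the centres barely moved
            break
    return nearest_center(points, centers), centers

# Made-up RGB pixels: two reddish, two greenish, two bluish
pixels = np.array([[250, 10, 10], [240, 0, 20],
                   [10, 250, 0], [0, 240, 10],
                   [5, 5, 250], [15, 5, 240]])
guesses = [[200, 50, 50], [50, 200, 50], [50, 50, 200]]
km_labels, km_centers = kmeans_sketch(pixels, k=3, init=guesses)
```

With these guesses the algorithm converges on the second pass: each center lands on the mean of its two pixels and the labels come out as [0, 0, 1, 1, 2, 2].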

import numpy as np
import matplotlib.pyplot as plt
import cv2

%matplotlib inline

# Read in the image
## TODO: Check out the images directory to see other images you can work with
# And select one!
image = cv2.imread('images/monarch.jpg')

# Change color to RGB (from BGR)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.imshow(image)

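Reshape the image into a two-dimensional array to feed into the k-means algorithm. The array should have dimensions m x 3, where m is the number of pixels and 3 is the number of color channels.
Convert the values to floating point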

# Reshape image into a 2D array of pixels and 3 color values (RGB)
pixel_vals = image.reshape((-1,3))

# Convert to float type
pixel_vals = np.float32(pixel_vals)

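Use the function cv2.kmeans. Its arguments are the m x 3 array of pixel values we just created, the value of k (initially 2 here), any initial labels we want (none are needed here, so pass None), the stopping criteria, the number of attempts, and the method for choosing the initial centers.
The criteria must be defined before calling the function; they tell the algorithm when to stop. Here the criteria combine an ε value with a maximum number of iterations: the maximum is set to 10, and ε is the value mentioned briefly above, so the algorithm stops once the cluster centers move by less than ε between iterations.
To display the segmentation, convert the data back to an 8-bit image and reshape the segmented data to the original shape of the image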

# define stopping criteria
# you can change the number of max iterations for faster convergence!
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1)

## TODO: Select a value for k
# then perform k-means clustering
k = 2
retval, labels, centers = cv2.kmeans(pixel_vals, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# convert data into 8-bit values
centers = np.uint8(centers)
segmented_data = centers[labels.flatten()]

# reshape data into the original image dimensions
segmented_image = segmented_data.reshape((image.shape))
labels_reshape = labels.reshape(image.shape[0], image.shape[1])

plt.imshow(segmented_image)
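The line centers[labels.flatten()] does most of the work in the code above, so here is a tiny NumPy sketch of that lookup in isolation. The 2 x 2 image, the labels, and the centers are made-up values; the labels use the (m, 1) shape that cv2.kmeans returns.

```python
import numpy as np

# Made-up 2 x 2 RGB image: two reddish pixels and two bluish pixels
tiny = np.array([[[255, 0, 0], [0, 0, 255]],
                 [[250, 5, 5], [0, 10, 240]]], dtype=np.uint8)

# Flatten to one row per pixel: shape (4, 3)
flat = tiny.reshape((-1, 3))

# Hypothetical k-means output for k = 2
labels_demo = np.array([[0], [1], [0], [1]])
centers_demo = np.array([[252, 2, 2], [0, 5, 247]], dtype=np.uint8)

# Replace every pixel with the colour of its cluster centre,
# then restore the original height x width x 3 shape
recolored = centers_demo[labels_demo.flatten()].reshape(tiny.shape)
```

Indexing the centers array with the flattened labels looks up one center per pixel, which is why the result can be reshaped straight back into an image.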

I can also visualize the cluster labels by showing them one at a time, like a mask. Here is the label equal to 1:

## TODO: Visualize one segment, try to find which is the leaves, background, etc!
plt.imshow(labels_reshape==1, cmap='gray')

We can even use this information to mask out that part of the image

# mask an image segment by cluster

cluster = 0 # the first cluster

masked_image = np.copy(image)
# black out the pixels that belong to this cluster
masked_image[labels_reshape == cluster] = [0, 0, 0]

plt.imshow(masked_image)

This article is shared from the author's personal site/blog.

Originally published: 2018.08.23
