8 |
9 |
|
If the influence of pixel position is not considered, then when finding the minimum cost, the cost=2 in the upper left corner will be considered the smallest. However, if we consider the influence of pixel spacing, let's look at the direction of the upper left corner. The difference from the center is 6-2. If we do a linear interpolation, the value of the upper left corner per unit distance from the center should be\(6-\left( 6-2 \right)\times 1/\sqrt{2}\ =3.17>3\), so the one directly above is the correct direction of the minimum cost.
Shortest path search
In magnetic lasso, the general usage is to click a point first, and then move the mouse, between the mouse and the initially clicked point A line automatically close to the edge will appear between them. Here we define the pixel point clicked at the beginning as the seed point (seed), and the magnetic lasso actually finds the seed point taking into account the edge-related cost mentioned in the previous part. The shortest path of the current mouse. As shown in the picture below, the red ones are seed points, and when you move the mouse, the connection between the seed point closest to the edge and the mouse coordinates will be displayed in real time. This is why the magnetic lasso is also called Livewire.

There are many ways to achieve the shortest path. Generally speaking, it is dynamic programming. What is introduced here is an implementation based on Dijkstra's algorithm. The basic idea is that, given a seed point Finally, Dijkstra's algorithm is executed to traverse all pixels of the image to obtain the shortest path from each pixel to the seed point. Take the picture below as an example. In a cost matrix, after using the Dijkstra algorithm to traverse each element, each element will point to an adjacent element, so that any pixel can find a path to the seed, such as the upper right corner. The pixels corresponding to 42 and 39 follow the arrow to 0.

The algorithm is as follows:
输入:
s // 种子点
l(q,r) // 计算局部cost
数据结构:
L // 当前待处理的像素
N(q) // 当前像素相邻的像素
e(q) // 标记一个像素是否已经做过相邻像素展开的Bool函数
g(q) // 从s到q的总cost
输出:
p // 记录所有路径的map
算法:
g(s)←0; L←s; // 将种子点作为第一点初始化
while L≠Ø: // 遍历尚未结束
q←min(L); // 取出最小cost的像素并从待处理像素中移除
e(q)←TRUE; // 将当前像素记录为已经做过相邻像素展开
for each r∈N(q) and not e(r):
gtemp←g(q)+l(q,r); // 计算相邻像素的总cost
if r∈L and gtemp<g(r): // 找到了更好的路径
r←L; { from list.} // 舍弃较大cost的路径
else if r∉L:
g(r)←gtemp; // 记录当前找到的最小路径
p(r)←q;
L←r; // 加入待处理以试图寻找更短的路径
The traversal process will give priority to the area with the lowest cost, as shown below:
After the shortest paths to the seed pixels corresponding to all pixels are found, just draw the shortest path to the seed directly when moving the mouse.
Python implementation
The algorithm part directly calls OpenCV’s Canny function and Sobel function (for gradient), and the processing of RGB is also very simple. , directly approximated by the maximum value of the gradient. In addition, because of laziness, both cost map and path map directly use dictionaries (dict), while recording expanded pixels directly uses sets (set). The GUI part uses Python's threading because I don't know how to use QThread. Only the image displays the interactive area and the status bar prompt. Left-click to set the seed point, and right-click to end. The extracted edge is a green line, and the edge being extracted is a blue line.

Code
Algorithm part
from __future__ import division
import cv2
import numpy as np
SQRT_0_5 = 0.70710678118654757
class Livewire():
"""
A simple livewire implementation for verification using
1. Canny edge detector + gradient magnitude + gradient direction
2. Dijkstra algorithm
"""
def __init__(self, image):
self.image = image
self.x_lim = image.shape[0]
self.y_lim = image.shape[1]
# The values in cost matrix ranges from 0~1
self.cost_edges = 1 - cv2.Canny(image, 85, 170)/255.0
self.grad_x, self.grad_y, self.grad_mag = self._get_grad(image)
self.cost_grad_mag = 1 - self.grad_mag/np.max(self.grad_mag)
# Weight for (Canny edges, gradient magnitude, gradient direction)
self.weight = (0.425, 0.425, 0.15)
self.n_pixs = self.x_lim * self.y_lim
self.n_processed = 0
@classmethod
def _get_grad(cls, image):
"""
Return the gradient magnitude of the image using Sobel operator
"""
rgb = True if len(image.shape) > 2 else False
grad_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)
grad_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)
if rgb:
# A very rough approximation for quick verification...
grad_x = np.max(grad_x, axis=2)
grad_y = np.max(grad_y, axis=2)
grad_mag = np.sqrt(grad_x**2+grad_y**2)
grad_x /= grad_mag
grad_y /= grad_mag
return grad_x, grad_y, grad_mag
def _get_neighbors(self, p):
"""
Return 8 neighbors around the pixel p
"""
x, y = p
x0 = 0 if x == 0 else x - 1
x1 = self.x_lim if x == self.x_lim - 1 else x + 2
y0 = 0 if y == 0 else y - 1
y1 = self.y_lim if y == self.y_lim - 1 else y + 2
return [(x, y) for x in xrange(x0, x1) for y in xrange(y0, y1) if (x, y) != p]
def _get_grad_direction_cost(self, p, q):
"""
Calculate the gradient changes refer to the link direction
"""
dp = (self.grad_y[p[0]][p[1]], -self.grad_x[p[0]][p[1]])
dq = (self.grad_y[q[0]][q[1]], -self.grad_x[q[0]][q[1]])
l = np.array([q[0]-p[0], q[1]-p[1]], np.float)
if 0 not in l:
l *= SQRT_0_5
dp_l = np.dot(dp, l)
l_dq = np.dot(l, dq)
if dp_l < 0:
dp_l = -dp_l
l_dq = -l_dq
# 2/3pi * ...
return 0.212206590789 * (np.arccos(dp_l)+np.arccos(l_dq))
def _local_cost(self, p, q):
"""
1. Calculate the Canny edges & gradient magnitude cost taking into account Euclidean distance
2. Combine with gradient direction
Assumption: p & q are neighbors
"""
diagnol = q[0] == p[0] or q[1] == p[1]
# c0, c1 and c2 are costs from Canny operator, gradient magnitude and gradient direction respectively
if diagnol:
c0 = self.cost_edges[p[0]][p[1]]-SQRT_0_5*(self.cost_edges[p[0]][p[1]]-self.cost_edges[q[0]][q[1]])
c1 = self.cost_grad_mag[p[0]][p[1]]-SQRT_0_5*(self.cost_grad_mag[p[0]][p[1]]-self.cost_grad_mag[q[0]][q[1]])
c2 = SQRT_0_5 * self._get_grad_direction_cost(p, q)
else:
c0 = self.cost_edges[q[0]][q[1]]
c1 = self.cost_grad_mag[q[0]][q[1]]
c2 = self._get_grad_direction_cost(p, q)
if np.isnan(c2):
c2 = 0.0
w0, w1, w2 = self.weight
cost_pq = w0*c0 + w1*c1 + w2*c2
return cost_pq * cost_pq
def get_path_matrix(self, seed):
"""
Get the back tracking matrix of the whole image from the cost matrix
"""
neighbors = [] # 8 neighbors of the pixel being processed
processed = set() # Processed point
cost = {seed: 0.0} # Accumulated cost, initialized with seed to itself
paths = {}
self.n_processed = 0
while cost:
# Expand the minimum cost point
p = min(cost, key=cost.get)
neighbors = self._get_neighbors(p)
processed.add(p)
# Record accumulated costs and back tracking point for newly expanded points
for q in [x for x in neighbors if x not in processed]:
temp_cost = cost[p] + self._local_cost(p, q)
if q in cost:
if temp_cost < cost[q]:
cost.pop(q)
else:
cost[q] = temp_cost
processed.add(q)
paths[q] = p
# Pop traversed points
cost.pop(p)
self.n_processed += 1
return paths
livewire.py
livewire.py
GUI part
from __future__ import division
import time
import cv2
from PyQt4 import QtGui, QtCore
from threading import Thread
from livewire import Livewire
class ImageWin(QtGui.QWidget):
def __init__(self):
super(ImageWin, self).__init__()
self.setupUi()
self.active = False
self.seed_enabled = True
self.seed = None
self.path_map = {}
self.path = []
def setupUi(self):
self.hbox = QtGui.QVBoxLayout(self)
# Load and initialize image
self.image_path = ''
while self.image_path == '':
self.image_path = QtGui.QFileDialog.getOpenFileName(self, '', '', '(*.bmp *.jpg *.png)')
self.image = QtGui.QPixmap(self.image_path)
self.cv2_image = cv2.imread(str(self.image_path))
self.lw = Livewire(self.cv2_image)
self.w, self.h = self.image.width(), self.image.height()
self.canvas = QtGui.QLabel(self)
self.canvas.setMouseTracking(True)
self.canvas.setPixmap(self.image)
self.status_bar = QtGui.QStatusBar(self)
self.status_bar.showMessage('Left click to set a seed')
self.hbox.addWidget(self.canvas)
self.hbox.addWidget(self.status_bar)
self.setLayout(self.hbox)
def mousePressEvent(self, event):
if self.seed_enabled:
pos = event.pos()
x, y = pos.x()-self.canvas.x(), pos.y()-self.canvas.y()
if x < 0:
x = 0
if x >= self.w:
x = self.w - 1
if y < 0:
y = 0
if y >= self.h:
y = self.h - 1
# Get the mouse cursor position
p = y, x
seed = self.seed
# Export bitmap
if event.buttons() == QtCore.Qt.MidButton:
filepath = QtGui.QFileDialog.getSaveFileName(self, 'Save image audio to', '', '*.bmp\n*.jpg\n*.png')
image = self.image.copy()
draw = QtGui.QPainter()
draw.begin(image)
draw.setPen(QtCore.Qt.blue)
if self.path_map:
while p != seed:
draw.drawPoint(p[1], p[0])
for q in self.lw._get_neighbors(p):
draw.drawPoint(q[1], q[0])
p = self.path_map[p]
if self.path:
draw.setPen(QtCore.Qt.green)
for p in self.path:
draw.drawPoint(p[1], p[0])
for q in self.lw._get_neighbors(p):
draw.drawPoint(q[1], q[0])
draw.end()
image.save(filepath, quality=100)
else:
self.seed = p
if self.path_map:
while p != seed:
p = self.path_map[p]
self.path.append(p)
# Calculate path map
if event.buttons() == QtCore.Qt.LeftButton:
Thread(target=self._cal_path_matrix).start()
Thread(target=self._update_path_map_progress).start()
# Finish current task and reset
elif event.buttons() == QtCore.Qt.RightButton:
self.path_map = {}
self.status_bar.showMessage('Left click to set a seed')
self.active = False
def mouseMoveEvent(self, event):
if self.active and event.buttons() == QtCore.Qt.NoButton:
pos = event.pos()
x, y = pos.x()-self.canvas.x(), pos.y()-self.canvas.y()
if x < 0 or x >= self.w or y < 0 or y >= self.h:
pass
else:
# Draw livewire
p = y, x
path = []
while p != self.seed:
p = self.path_map[p]
path.append(p)
image = self.image.copy()
draw = QtGui.QPainter()
draw.begin(image)
draw.setPen(QtCore.Qt.blue)
for p in path:
draw.drawPoint(p[1], p[0])
if self.path:
draw.setPen(QtCore.Qt.green)
for p in self.path:
draw.drawPoint(p[1], p[0])
draw.end()
self.canvas.setPixmap(image)
def _cal_path_matrix(self):
self.seed_enabled = False
self.active = False
self.status_bar.showMessage('Calculating path map...')
path_matrix = self.lw.get_path_matrix(self.seed)
self.status_bar.showMessage(r'Left: new seed / Right: finish')
self.seed_enabled = True
self.active = True
self.path_map = path_matrix
def _update_path_map_progress(self):
while not self.seed_enabled:
time.sleep(0.1)
message = 'Calculating path map... {:.1f}%'.format(self.lw.n_processed/self.lw.n_pixs*100.0)
self.status_bar.showMessage(message)
self.status_bar.showMessage(r'Left: new seed / Right: finish')
gui.py
gui.py
Main function
import sys
from PyQt4 import QtGui
from gui import ImageWin
def main():
app = QtGui.QApplication(sys.argv)
window = ImageWin()
window.setMouseTracking(True)
window.setWindowTitle('Livewire Demo')
window.show()
window.setFixedSize(window.size())
sys.exit(app.exec_())
if __name__ == '__main__':
main()
main.py
main.py
It was painfully uploaded to Github (portal), welcome to fork.
Improvements in efficiency
Because the prototype of this code is only for Python evaluation and verification before development in C++, efficiency is not considered at all, and the execution speed is completely unacceptable. , basically 400x400 pictures cannot be tolerated... As for the efficiency improvement based on the Python version, I have not thought about it carefully, but it seems that there are a few obvious places:
1) Take out the current minimum cost Pixel operation
p = min(cost, key=cost.get)
Although this is fun to write, it is obviously not possible. At least a min heap must be used. What. Because I used dict to represent both the pixels to be processed and the cost, and I was too lazy to think about how to combine it with Python's heapq, so I just used the crude and trouble-free min().
2) Calculation of gradient direction
The calculation of trigonometric functions should be avoided as much as possible. In addition, the original article may be to extend the value range to >π, so q-p is also used. In fact, this item has a small weight, so I'm afraid it's okay to just do a dot product of the gradient direction vectors of the two pixels and then normalize the result. Even if you want to use arccos, you can consider writing a look-up table approximation. Of course, the last thing I want to say is that I personally feel that this is really not necessary. Direct adaptive spilne or even three-point mean smoothing and denoising effect will be good.
3) Calculate the position of adjacent pixels
If two pixels are adjacent, the 8 adjacent pixels around them will also coincide. My method is relatively primitive and can calculate the rate directly without modularization.
4) Replace some data structures
For example, path map is essentially a matrix that gives the previous pixel on the shortest path for each pixel. In fact, you can consider using a linear data structure instead, but if you do this, it will generally be in C/C++ code.
5) numpy
In my impression, the order in which numpy is called will also affect efficiency. Continuously calling numpy's built-in methods seems to bring about an overall improvement in efficiency, but then again, If it reaches this point in actual application, it should also fall into the category of C/C++ code.
6) Improvements at the algorithm level
This area has not been studied in depth. The first impression is that in practical applications, it is not necessary to calculate the entire image at the beginning. Some blocks can be divided according to the seed position. , the mouse itself will also leave a trajectory, and you may also consider performing a heuristic search only in the direction of the mouse trajectory. In addition, when calculating the path, you may consider borrowing ideas similar to Image Pyramid. There is no need to search for the path at full resolution right from the start. Since the projects I did later did not use this algorithm, I did not continue to study it. Although I was very curious, there are actually many ready-made codes, such as GIMP, but I didn't have the energy to look at it.
More improvements
Although none of them have been done, I will briefly introduce them and consider the practical improvements.
Path Cooling
Anyone who has used Photoshop and GIMP Magnetic Lasso knows that even if the mouse does not click on the image, it will automatically move during the movement. Generate some points that fix the cutout trajectory. These points are actually new seed points. This method of automatically generating new seed points during use is called Path cooling. The basic idea of this method is as follows: as the mouse moves, if a path remains fixed for a certain period of time, then the point farthest from the seed in this path is set as the new seed. In fact, what is hidden behind it is dynamic. The idea of planning, Bellman optimality. The name is also quite vivid, path cooling.
Dynamic Training(Interactive Dynamic Training)

When using simple shortest path search, the edges often found are not what you want. For example, in the picture above, the green line is the edge extracted in the previous section, and the blue line is the edge currently being extracted. In the picture on the left, the edge of Lena's hat outside the mirror is what we want to extract. However, since the edge of Lena's hat in the mirror has a lower cost, the actual extracted blue line segment is pasted to the right as in the picture on the right. So the idea of Interactive Dynamic Training is to consider the green line segment as the correctly extracted edge, and then use the green line segment as training data to add a correction value to the cost function of the current edge extraction.
The method used in [1] is to count the histogram of the gradient intensity of the points on the edge of the previous segment, and then weight the pixels in the current image according to the frequency of gradient occurrence. For example, if the gradient intensity corresponding to all pixels in the green line segment is between 50 and 100, then 50 to 100 can be divided into 5 bins in units of 10, and the frequency of occurrence in each bin can be counted. It is a histogram, and then linearly weights the currently detected gradient intensity. For example, if there are at most 10 corresponding pixels in the range of 50~60, then 10 is regarded as the maximum value, and the pixels with currently detected gradient strengths between 50 and 60 are multiplied by a coefficient of 1.0; if the training data is 70~ There are 5 between 80, then the cost weighting coefficient is 5/10=0.5, then all pixels with currently detected gradient strengths between 70 and 80 are multiplied by the coefficient 0.5; if there are none above 100 in the training data, so The cost is added to 0/10=0, and the weighting coefficient is 0, so that even if a stronger edge is detected, it will not deviate from the previous edge. This is the basic idea. Of course, the actual implementation is not that simple. In addition to the pixels on the edge, the two pixels on the left and right on the vertical edge must also be considered, thus ensuring the edge pattern. In addition, as the mouse gets farther and farther away from the training edge, the pattern of the detected edge may appear different, so Training may have a counterproductive effect. Therefore, the scope of this Training also needs to take into account the distance between the mouse and the seed point. Finally, There needs to be some smoothing and denoising processing. The details are mentioned in [1]. It is quite cumbersome (it seems that there was no SIFT at that time), so I won’t go into details.
Correction of seed point position (Cursor Snap)
Although this algorithm can automatically find the path closest to the edge between the seed point and the mouse, however, human My hands often shake, so the seed points may not be well set to the edge. Therefore, after the user sets the seed point position, the pixel with the lowest cost can be automatically searched within a small range around its coordinates, such as a 7x7 area, as the real seed point position. This process is called Cursor snap.
For more about a simple implementation of magnetic lasso in Photoshop (based on Python), please pay attention to the PHP Chinese website for related articles!