Efficient Graphbased Image Segmentation
Resources
 Paper: Efficient GraphBased Image Segmentation
 IJCV, Sep. 2004
 Pedro F. Felzenszwalb @ MIT
 Daniel P. Huttenlocher @ Cornell
 Code implementation
 http://cs.brown.edu/~pff/segment/
 written in C++
Why we need segmentation?
A wide range of computational vision problems could in principle make good use of segmented images, were such segmentations reliably and efficiently computable.
 Intermediatelevel vision problems such as stereo and motion estimation require an appropriate region of support for correspondence operations.
 Spatially nonuniform regions of support can be identified using segmentation techniques.
 Higherlevel problems such as recognition and image indexing can also make use of segmentation results in matching, to address problems such as figureground separation and recognition by parts.
Properties
 Capture important groupings or regions, which often reflect global aspects of the image.
 Be highly efficient (running in time nearly linear in the number of image pixels). In order to be of practical use, segmentation methods should run at speeds similar to edge detection or other lowlevel visual processing techniques.
 application: video processing
Basic Approach
Graphbased image segmentation techniques represent the problem in terms of a undirected graph G = (V,E).
 Each node v_i \in V corresponds to a pixel in the image
 The edges e_i \in E connect certain pairs of neighboring pixels
In the case of image segmentation, the elements in V are pixels and the weight of an edge is some measure of the dissimilarity between the two pixels connected by that edge. e.g., the difference in
 intensity
 color
 motion
 location
 some other local attributes
Rewritting in Python be like…
def diff_rgb(img, x1, y1, x2, y2):
r = (img[0][x1, y1]  img[0][x2, y2]) ** 2
g = (img[1][x1, y1]  img[1][x2, y2]) ** 2
b = (img[2][x1, y1]  img[2][x2, y2]) ** 2
return sqrt(r + g + b)
def diff_grey(img, x1, y1, x2, y2):
v = (img[x1, y1]  img[x2, y2]) ** 2
return sqrt(v)
Classical Clustering
Unadaptive criterion

Widely varying intensities should not alone be judged as evidence for multiple regions. It is not adequate to assume that regions have nearly constant or slowly varying intensities.

The intensity difference across the boundary between the ramp and the constant region is actually smaller than many of the intensity differences within the high variability region. Thus, in order to segment such an image, some kind of adaptive or nonlocal criterion must be used.
In Short…
 Classical clustering methods use stable, “global” and unadaptive criterion to fill in the e_i \in E
 The earliest graphbased methods use fixed thresholds and local measures in computing a segmentation. (C.T.Zahn., IEEE Transactions on Computing, 1971.)
 Not good enough!
 We want adapative criterion for different situations to make better decisions.
Predicate
 Define the internal difference of a component C \in V to be the largest weight in the minimum spanning tree of the component, MST(C,E). That is,
 Define the difference between two components C_1, C_2 \in V to be the minimum weight edge connecting the two components. That is,
This measure of difference could in principle be problematic, because it reflects only the smallest edge weight between two components. In practice, the measure works quite well in spite of this apparent limitation.
The region comparison predicate:
If the difference between the components, Dif(C_1,C_2) is large relative to the internal difference within at least one of the components, Int(C_1) and Int(C_2), then there is evidence for a boundary between a pair or components
We define the pairwise comparison predicate as,
where the minimum internal difference, MInt, is defined as,
A threshold function \tau is used to control the degree to which the difference between components must be larger than minimum internal difference (so that D can be true).
In the extreme case, when \vert C\vert = 1,Int(C) = 0. Therefore, we use a threshold function based on the size of the component,
That is, for small components we require stronger evidence for a boundary. In practice k sets a scale of observation, in that a larger k causes a preference for larger components.
New Algorithm
 Definition 1:
 A segmentation S is too fine if there is some pair of regions C_1, C_2 \in S for which there is no evidence for a boundary between them.
 Definition 2:
 A segmentation S is too coarse when there exists a proper refinement of S that is not too fine.
 Property 1:
 For any (finite) graph G = (V, E) there exists some segmentation S that is neither too coarse nor too fine.
Algorithm Segmentation algorithm.
 Input: a graph G=(V,E), n vertices and m edges.
 Output: a segmentation of V into components S=(C_1,\cdots,C_r)
0.Sort E into \pi = (o_1,\cdots,o_m), by nondecreasing edge weight.
1.Start with a segmentation S^0, where each vertex v_i is in its own component.
2.Repeat step 3 for q = 1,\cdots,m.
3.Given S^{q1}, construct S^q as follows.
Let v_i and v_j denote vertices connected by the qth edge, i.e., o_q=(v_i,v_j).
If:
 v_i and v_j are in disjoint components of S^{q1}
 w(o_q) \le MInt(C_i^{q1},C_j^{q1}).
Then, S^q is obtained from S^{q1} by merging C_i^{q1} and C_j^{q1}. Otherwise, S^q=S^{q1}.
4.Return S=S^m
Coding
 Rewritten in python
def segment_graph(graph, num_nodes, const, min_size, threshold_func):
weight = lambda edge: edge[2]
forest = Forest(num_nodes)
sorted_graph = sorted(graph, key=weight)
threshold = [threshold_func(1, const)] * num_nodes
for edge in sorted_graph:
parent_a = forest.find(edge[0])
parent_b = forest.find(edge[1])
a_condition = weight(edge) <= threshold[parent_a]
b_condition = weight(edge) <= threshold[parent_b]
if parent_a != parent_b and a_condition and b_condition:
forest.merge(parent_a, parent_b)
a = forest.find(parent_a)
threshold[a] = weight(edge) + threshold_func(forest.nodes[a].size,
const)
return remove_small_components(forest, sorted_graph, min_size)