Efficient Graph-based Image Segmentation


Why we need segmentation?

A wide range of computational vision problems could in principle make good use of segmented images, were such segmentations reliably and efficiently computable.


Basic Approach

Graph-based image segmentation techniques represent the problem in terms of a undirected graph G = (V,E).

In the case of image segmentation, the elements in V are pixels and the weight of an edge is some measure of the dissimilarity between the two pixels connected by that edge. e.g., the difference in


Rewritting in Python be like…

def diff_rgb(img, x1, y1, x2, y2):
    r = (img[0][x1, y1] - img[0][x2, y2]) ** 2
    g = (img[1][x1, y1] - img[1][x2, y2]) ** 2
    b = (img[2][x1, y1] - img[2][x2, y2]) ** 2
    return sqrt(r + g + b)
def diff_grey(img, x1, y1, x2, y2):
    v = (img[x1, y1] - img[x2, y2]) ** 2
    return sqrt(v)    

Classical Clustering

Un-adaptive criterion

In Short…


Int(C)=\max_{e \in MST(C,E)} w(e)
Dif(C_1,C_2)=\min_{v_i \in C_1,v_j \in C_2,(v_i,v_j)\in E}w((v_i,v_j))

This measure of difference could in principle be problematic, because it reflects only the smallest edge weight between two components. In practice, the measure works quite well in spite of this apparent limitation.

The region comparison predicate:

If the difference between the components, Dif(C_1,C_2) is large relative to the internal difference within at least one of the components, Int(C_1) and Int(C_2), then there is evidence for a boundary between a pair or components

We define the pairwise comparison predicate as,

where the minimum internal difference, MInt, is defined as,


A threshold function \tau is used to control the degree to which the difference between components must be larger than minimum internal difference (so that D can be true).

In the extreme case, when \vert C\vert = 1,Int(C) = 0. Therefore, we use a threshold function based on the size of the component,

\tau (C)=k/\vert C\vert

That is, for small components we require stronger evidence for a boundary. In practice k sets a scale of observation, in that a larger k causes a preference for larger components.

New Algorithm

Algorithm Segmentation algorithm.

0.Sort E into \pi = (o_1,\cdots,o_m), by non-decreasing edge weight.

1.Start with a segmentation S^0, where each vertex v_i is in its own component.

2.Repeat step 3 for q = 1,\cdots,m.

3.Given S^{q-1}, construct S^q as follows.

Let v_i and v_j denote vertices connected by the q-th edge, i.e., o_q=(v_i,v_j).


Then, S^q is obtained from S^{q-1} by merging C_i^{q-1} and C_j^{q-1}. Otherwise, S^q=S^{q-1}.

4.Return S=S^m


def segment_graph(graph, num_nodes, const, min_size, threshold_func):
    weight = lambda edge: edge[2]

    forest = Forest(num_nodes)
    sorted_graph = sorted(graph, key=weight)
    threshold = [threshold_func(1, const)] * num_nodes

    for edge in sorted_graph:
        parent_a = forest.find(edge[0])
        parent_b = forest.find(edge[1])
        a_condition = weight(edge) <= threshold[parent_a]
        b_condition = weight(edge) <= threshold[parent_b]

        if parent_a != parent_b and a_condition and b_condition:
            forest.merge(parent_a, parent_b)
            a = forest.find(parent_a)
            threshold[a] = weight(edge) + threshold_func(forest.nodes[a].size, 

    return remove_small_components(forest, sorted_graph, min_size)

Finally, the outcomes!

MInt(C_1,C_2)=\min(Int(C_1)+\tau(C_1),Int(C_2)+\tau(C_2)),\tau (C)=k/|C|