OpenCV: Image Processing and Analysis Reference

Note:
This chapter describes functions for image processing and analysis. Most of the functions work with 2D arrays of pixels. We refer to these arrays as "images"; however, they do not necessarily have to be IplImage's, they may be CvMat's or CvMatND's as well.

Drawing Functions

Drawing functions work with arbitrary 8-bit images or single-channel images with larger depth: 16s, 32s, 32f, 64f. All the functions include the parameter color, which means an RGB value (that may be constructed with the CV_RGB macro) for color images and brightness for grayscale images.

CV_RGB

Line

The function cvLine draws the line segment between the points pt1 and pt2 in the image. The line is clipped by the image or ROI rectangle. The 8-connected or 4-connected Bresenham algorithm is used for simple line segments. Thick lines are drawn with rounded ends. To specify the line color, the user may use the macro CV_RGB( r, g, b ).

LineAA

The function cvLineAA draws the 8-connected line segment between pt1 and pt2 points in the image. The line is clipped by the image or ROI rectangle. The algorithm includes some sort of Gaussian filtering to get a smooth picture. To specify the line color, the user may use the macro CV_RGB( r, g, b ).

Rectangle

The function cvRectangle draws a rectangle with two opposite corners pt1 and pt2.

Circle

The function cvCircle draws a simple or filled circle with given center and radius. The circle is clipped by ROI rectangle. The Bresenham algorithm is used both for simple and filled circles. To specify the circle color, the user may use the macro CV_RGB ( r, g, b ).

Ellipse

The function cvEllipse draws a simple or thick elliptic arc or fills an ellipse sector. The arc is clipped by ROI rectangle. The generalized Bresenham algorithm for conic section is used for simple elliptic arcs here, and piecewise-linear approximation is used for antialiased arcs and thick arcs. All the angles are given in degrees. The picture below explains the meaning of the parameters.

EllipseAA

The function cvEllipseAA draws an antialiased elliptic arc. The arc is clipped by ROI rectangle. The generalized Bresenham algorithm for conic section is used for simple elliptic arcs here, and piecewise-linear approximation is used for antialiased arcs and thick arcs. All the angles are in degrees.

FillPoly

The function cvFillPoly fills an area bounded by several polygonal contours. The function fills complex areas, for example, areas with holes, contour self-intersection, etc.

FillConvexPoly

The function cvFillConvexPoly fills the interior of a convex polygon. This function is much faster than the function cvFillPoly and fills not only convex polygons but any monotonic polygon, that is, a polygon whose contour intersects every horizontal line (scan line) twice at the most.

PolyLine

PolyLineAA

InitFont

The function cvInitFont initializes the font structure that can be passed further into text drawing functions. Although only one font is supported, it is possible to get different font flavors by varying the scale parameters, slope, and thickness.

PutText

The function cvPutText renders the text in the image with the specified font and color. The printed text is clipped by ROI rectangle. Symbols that do not belong to the specified font are replaced with the rectangle symbol.

GetTextSize

The function cvGetTextSize calculates the bounding rectangle for the given text string when a specified font is used.

Gradients, Edges and Corners

Sobel

Calculates first, second, third or mixed image derivatives using extended Sobel operator

The function cvSobel calculates the image derivative by convolving the image with the appropriate kernel:

Laplace

The function cvLaplace calculates the Laplacian of the source image by summing the second x- and y-derivatives calculated using the Sobel operator:

Specifying apertureSize=1 gives the fastest variant that is equal to convolving the image with the following kernel:

As in the cvSobel function, no scaling is done and the same combinations of input and output formats are supported.
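As an illustration of the apertureSize=1 case, the following plain C sketch convolves an 8-bit image with a 3x3 kernel and stores unscaled 16-bit results. It assumes the kernel is the standard four-neighbour Laplacian (0 1 0; 1 -4 1; 0 1 0). The function name laplace3x3, the helper clampi and the replicated border are choices made for this sketch and are not part of the library.

```c
/* Clamp v to the range [lo, hi]. */
static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Convolve an 8-bit single-channel image with the kernel
 *    0  1  0
 *    1 -4  1
 *    0  1  0
 * and write the unscaled result into a 16-bit signed buffer.
 * Pixels outside the image are replaced by the nearest edge pixel
 * (an assumption of this sketch, not a statement about cvLaplace). */
void laplace3x3(const unsigned char *src, short *dst, int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int xl = clampi(x - 1, 0, width - 1);
            int xr = clampi(x + 1, 0, width - 1);
            int yu = clampi(y - 1, 0, height - 1);
            int yd = clampi(y + 1, 0, height - 1);
            int sum = src[y * width + xl] + src[y * width + xr]
                    + src[yu * width + x] + src[yd * width + x]
                    - 4 * src[y * width + x];
            dst[y * width + x] = (short)sum;  /* range is -1020..1020 */
        }
    }
}
```

The output needs a signed type wider than 8 bits because no scaling is applied.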

Canny

The function cvCanny finds the edges on the input image img and marks them in the output image edges using the Canny algorithm. The smaller of threshold1 and threshold2 is used for edge linking, the larger one is used to find initial segments of strong edges.

PreCornerDetect

The function cvPreCornerDetect finds the corners on the input image img and stores them in the corners image in accordance with Method 1 for corner detection described in the guide.

CornerEigenValsAndVecs

For every pixel the function cvCornerEigenValsAndVecs considers a blockSize × blockSize neighborhood S(p). It calculates the covariation matrix of derivatives over the neighborhood as:

After that it finds eigenvectors and eigenvalues of the resultant matrix and stores them into destination image in form (λ₁, λ₂, x₁, y₁, x₂, y₂), where
λ₁, λ₂ - eigenvalues of M; not sorted
(x₁, y₁) - eigenvector corresponding to λ₁
(x₂, y₂) - eigenvector corresponding to λ₂

CornerMinEigenVal

The function cvCornerMinEigenVal is similar to cvCornerEigenValsAndVecs but it calculates and stores only the minimal eigenvalue of the derivative covariation matrix for every pixel, i.e. min(λ₁, λ₂) in terms of the previous function.

FindCornerSubPix

The function cvFindCornerSubPix iterates to find the sub-pixel accurate location of a corner, or "radial saddle point", as shown in the picture below.

Sub-pixel accurate corner (radial saddle point) locator is based on the observation that any vector from q to p is orthogonal to the image gradient.

The core idea of this algorithm is based on the observation that every vector from the center q to a point p located within a neighborhood of q is orthogonal to the image gradient at p subject to image and measurement noise. Thus:

where the gradients are summed within a neighborhood ("search window") of q. Calling the first gradient term G and the second gradient term b gives:

The algorithm sets the center of the neighborhood window at this new center q and then iterates until the center moves by less than a set threshold.

GoodFeaturesToTrack

The function cvGoodFeaturesToTrack finds corners with big eigenvalues in the image. The function first calculates the minimal eigenvalue for every source image pixel using the cvCornerMinEigenVal function and stores them in eigImage. Then it performs non-maxima suppression (only local maxima in a 3x3 neighborhood remain). The next step is rejecting the corners with a minimal eigenvalue less than qualityLevel•max(eigImage(x,y)). Finally, the function ensures that all the corners found are far enough from one another by considering the corners (the strongest corners are considered first) and checking that the distance between the newly considered feature and the features considered earlier is larger than minDistance. So, the function removes the features that are too close to the stronger features.
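The final distance check can be sketched in plain C as a greedy filter over corners that are already sorted by decreasing strength. The Corner type and the function filter_by_min_distance are hypothetical names introduced for this illustration; the library's internal implementation may differ.

```c
/* Hypothetical corner record used only by this sketch. */
typedef struct {
    int x, y;
    float strength;   /* e.g. the minimal eigenvalue at (x, y) */
} Corner;

/* corners[] must be sorted by decreasing strength.
 * A corner is kept only if its distance to every corner kept so far
 * is larger than min_dist. Survivors are compacted to the front of
 * the array; the function returns how many were kept. */
int filter_by_min_distance(Corner *corners, int count, double min_dist)
{
    int kept = 0;
    double min_dist2 = min_dist * min_dist;

    for (int i = 0; i < count; i++) {
        int far_enough = 1;
        for (int k = 0; k < kept; k++) {
            double dx = corners[i].x - corners[k].x;
            double dy = corners[i].y - corners[k].y;
            if (dx * dx + dy * dy <= min_dist2) {
                far_enough = 0;   /* too close to a stronger corner */
                break;
            }
        }
        if (far_enough)
            corners[kept++] = corners[i];
    }
    return kept;
}
```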

Sampling, Interpolation and Geometrical Transforms

InitLineIterator

The function cvInitLineIterator initializes the line iterator and returns the number of pixels between two end points. Both points must be inside the image. After the iterator has been initialized, all the points on the raster line that connects the two ending points may be retrieved by successive calls of CV_NEXT_LINE_POINT. The points on the line are calculated one by one using the 4-connected or 8-connected Bresenham algorithm.

Example. Using line iterator to calculate pixel values along the color line

SampleLine

The function cvSampleLine implements a particular case of application of line iterators. The function reads all the image points lying on the line between pt1 and pt2, including the ending points, and stores them into the buffer.

GetRectSubPix

where the values of pixels at non-integer coordinates ( x+center.x, y+center.y ) are retrieved using bilinear interpolation. Every channel of multiple-channel images is processed independently. Whereas the rectangle center must be inside the image, the whole rectangle may be partially occluded. In this case, the replication border mode is used to get pixel values beyond the image boundaries.

GetQuadrangleSubPix

The function cvGetQuadrangleSubPix extracts pixels from I at sub-pixel accuracy and stores them to J as follows:

where the values of pixels at non-integer coordinates A•(x,y)^T+b are retrieved using bilinear interpolation. Every channel of multiple-channel images is processed independently.

Example. Using cvGetQuadrangleSubPix for image rotation.

Resize

The function cvResize resizes image I so that it fits exactly to J. If ROI is set, the function takes the ROI into account, as usual.

Morphological Operations

CreateStructuringElementEx

The function cvCreateStructuringElementEx allocates and fills the structure IplConvKernel, which can be used as a structuring element in the morphological operations. The function returns the created structuring element.

ReleaseStructuringElement

The function cvReleaseStructuringElement releases the structure IplConvKernel that is no longer needed. If *ppElement is NULL, the function has no effect.

Erode

The function cvErode erodes the source image using the specified structuring element B that determines the shape of a pixel neighborhood over which the minimum is taken:

The function supports the in-place mode when the source and destination pointers are the same. Erosion can be applied several times (iterations parameter). Erosion on a color image means independent transformation of all the channels.

Dilate

The function cvDilate dilates the source image using the specified structuring element B that determines the shape of a pixel neighborhood over which the maximum is taken:

The function supports the in-place mode when the source and destination pointers are the same. Dilation can be applied several times (iterations parameter). Dilation on a color image means independent transformation of all the channels.
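The minimum and maximum over a neighborhood that define erosion and dilation can be sketched in plain C as follows. The sketch assumes a 3x3 rectangular structuring element and a single-channel 8-bit image. The function name morph3x3 and its take_max switch are hypothetical. Unlike cvErode and cvDilate, this simplified version does not work in-place.

```c
/* Apply a 3x3 rectangular structuring element to an 8-bit
 * single-channel image.
 *   take_max == 0: erosion  (minimum over the neighborhood)
 *   take_max != 0: dilation (maximum over the neighborhood)
 * Neighbors that fall outside the image are skipped, which is a
 * simplification of this sketch. src and dst must not overlap. */
void morph3x3(const unsigned char *src, unsigned char *dst,
              int width, int height, int take_max)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            unsigned char best = src[y * width + x];
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int xx = x + dx, yy = y + dy;
                    if (xx < 0 || xx >= width || yy < 0 || yy >= height)
                        continue;
                    unsigned char v = src[yy * width + xx];
                    if (take_max ? (v > best) : (v < best))
                        best = v;
                }
            }
            dst[y * width + x] = best;
        }
    }
}
```

For a color image the same routine would be run on each channel separately.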

MorphologyEx

The function cvMorphologyEx performs advanced morphological transformations using erosion and dilation as basic operations.

The temporary image temp is required if op=CV_MOP_GRADIENT or if A=C (in-place operation) and op=CV_MOP_TOPHAT or op=CV_MOP_BLACKHAT.

Filters and Color Conversion

Smooth

The function cvSmooth smooths the image using one of several methods. Each of the methods has some features and restrictions, listed below.

Blur with no scaling works with single-channel images only and supports accumulation of 8-bit to 16-bit format (similar to cvSobel and cvLaplace) and 32-bit floating point to 32-bit floating-point format.

Simple blur and Gaussian blur support 1- or 3-channel, 8-bit and 32-bit floating point images. These two methods can process images in-place.

Median and bilateral filters work with 1- or 3-channel 8-bit images and cannot process images in-place.

Integral

The function cvIntegral calculates one or more integral images for the source image as follows:

After the images are calculated, they can be used to calculate sums of pixels over arbitrary rectangles, for example:

This makes it possible to do fast blurring or fast block correlation with a variable window size, etc.
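The rectangle-sum idea can be sketched in plain C as below. The sketch assumes the common layout in which the integral image has (width+1) x (height+1) entries and the entry at (X, Y) holds the sum of all source pixels with x < X and y < Y. The names integral_image and rect_sum are hypothetical and the library's exact layout should be checked against its own definition.

```c
/* Fill sum, which must hold (width+1)*(height+1) entries, so that
 * sum[Y*(width+1) + X] is the sum of src(x, y) over x < X, y < Y. */
void integral_image(const unsigned char *src, int width, int height, long *sum)
{
    int sw = width + 1;                  /* row stride of the integral image */

    for (int x = 0; x <= width; x++)
        sum[x] = 0;                      /* first row is all zeros */

    for (int y = 1; y <= height; y++) {
        long row = 0;                    /* running sum of the current source row */
        sum[y * sw] = 0;                 /* first column is all zeros */
        for (int x = 1; x <= width; x++) {
            row += src[(y - 1) * width + (x - 1)];
            sum[y * sw + x] = sum[(y - 1) * sw + x] + row;
        }
    }
}

/* Sum of source pixels with x1 <= x < x2 and y1 <= y < y2,
 * obtained from four lookups regardless of the rectangle size. */
long rect_sum(const long *sum, int width, int x1, int y1, int x2, int y2)
{
    int sw = width + 1;
    return sum[y2 * sw + x2] - sum[y1 * sw + x2]
         - sum[y2 * sw + x1] + sum[y1 * sw + x1];
}
```

Because every rectangle sum costs four lookups, a box blur with any window size takes the same time per pixel.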

CvtColor

The function cvCvtColor converts the input image from one color space to another. The function ignores the colorModel and channelSeq fields of the IplImage header, so the source image color space should be specified correctly (including the order of the channels in case of RGB space, e.g. BGR means 24-bit format with B₀ G₀ R₀ B₁ G₁ R₁ ... layout, whereas RGB means 24-bit format with R₀ G₀ B₀ R₁ G₁ B₁ ... layout). The function can do the following transformations:

Threshold

The function cvThreshold applies fixed-level thresholding to a single-channel array. The function is typically used to get a bi-level (binary) image out of a grayscale image or for removing noise, i.e. filtering out pixels with too small or too large values. The function supports several types of thresholding, determined by thresholdType:

AdaptiveThreshold

The function cvAdaptiveThreshold transforms a grayscale image to a binary image according to the formulae:

For the method CV_ADAPTIVE_THRESH_MEAN_C it is the mean of the blockSize × blockSize pixel neighborhood, minus param1.

For the method CV_ADAPTIVE_THRESH_GAUSSIAN_C it is a weighted sum (Gaussian) of the blockSize × blockSize pixel neighborhood, minus param1.

LUT

The function cvLUT fills the destination array with values of look-up table entries. Indices of the entries are taken from the source array. That is, the function processes each pixel as follows:

Pyramids and the Applications

PyrDown

The function cvPyrDown performs downsampling step of Gaussian pyramid decomposition. First it convolves source image with the specified filter and then downsamples the image by rejecting even rows and columns.

PyrUp

The function cvPyrUp performs up-sampling step of Gaussian pyramid decomposition. First it upsamples the source image by injecting even zero rows and columns and then convolves result with the specified filter multiplied by 4 for interpolation. So the destination image is four times larger than the source image.

PyrSegmentation

The function cvPyrSegmentation implements image segmentation by pyramids. The pyramid is built up to the level level. A link between any pixel a on level i and its candidate father pixel b on the adjacent level is established if p(c(a),c(b))<threshold1.

After the connected components are defined, they are joined into several clusters. Any two segments A and B belong to the same cluster if p(c(A),c(B))<threshold2.

If the input image has only one channel, then

p(c¹,c²)=|c¹-c²|.

If the input image has three channels (red, green and blue), then

p(c¹,c²)=0.3·(c¹_r-c²_r)+0.59·(c¹_g-c²_g)+0.11·(c¹_b-c²_b).

There may be more than one connected component per cluster.

The images src and dst should be 8-bit single-channel or 3-channel images of equal size.

Connected components

CvConnectedComp

Connected component

    typedef struct CvConnectedComp
    {
        double area; /* area of the segmented component */
        float value; /* gray scale value of the segmented component */
        CvRect rect; /* ROI of the segmented component */
    } CvConnectedComp;

FloodFill

Fills a connected component with given color

void cvFloodFill( CvArr* img, CvPoint seed, double newVal,
                  double lo=0, double up=0, CvConnectedComp* comp=0,
                  int flags=4, CvArr* mask=0 );
#define CV_FLOODFILL_FIXED_RANGE (1 << 16)
#define CV_FLOODFILL_MASK_ONLY   (1 << 17)

img

Input image, either 1-,3-channel 8-bit, or single-channel floating-point image. It is modified by the function unless CV_FLOODFILL_MASK_ONLY flag is set (see below).

seed

Coordinates of the seed point inside the image ROI.

newVal

New value of repainted domain pixels. For 8-bit color images it is a packed color (e.g. using CV_RGB macro).

lo

Maximal lower brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or the seed pixel, for the pixel to be added to the component. In case of 8-bit color images it is a packed value.

up

Maximal upper brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or the seed pixel, for the pixel to be added to the component. In case of 8-bit color images it is a packed value.

comp

Pointer to structure the function fills with the information about the repainted domain.

flags

The operation flags. Lower bits contain connectivity value, 4 (by default) or 8, used within the function. Connectivity determines which neighbors of a pixel are considered. Upper bits can be 0 or combination of the following flags:

CV_FLOODFILL_FIXED_RANGE - if set the difference between the current pixel and seed pixel is considered, otherwise difference between neighbor pixels is considered (the range is floating).
CV_FLOODFILL_MASK_ONLY - if set, the function does not fill the image (newVal is ignored), but fills the mask (that must be non-NULL in this case).

mask

Operation mask, should be a single-channel 8-bit image, 2 pixels wider and 2 pixels taller than img. If not NULL, the function uses and updates the mask, so the user takes responsibility for initializing the mask content. Floodfilling can't go across non-zero pixels in the mask; for example, an edge detector output can be used as a mask to stop filling at edges. It is also possible to use the same mask in multiple calls to the function to make sure the filled areas do not overlap.

The function cvFloodFill fills a connected component starting from the seed pixel, where all pixels within the component have values close to each other (prior to filling). A pixel is considered to belong to the repainted domain if its value I(x,y) meets the following conditions (the particular cases are specified after the commas):

I(x',y')-lo<=I(x,y)<=I(x',y')+up, grayscale image + floating range
I(seed.x,seed.y)-lo<=I(x,y)<=I(seed.x,seed.y)+up, grayscale image + fixed range

I(x',y')_r-lo_r<=I(x,y)_r<=I(x',y')_r+up_r and
I(x',y')_g-lo_g<=I(x,y)_g<=I(x',y')_g+up_g and
I(x',y')_b-lo_b<=I(x,y)_b<=I(x',y')_b+up_b, color image + floating range

I(seed.x,seed.y)_r-lo_r<=I(x,y)_r<=I(seed.x,seed.y)_r+up_r and
I(seed.x,seed.y)_g-lo_g<=I(x,y)_g<=I(seed.x,seed.y)_g+up_g and
I(seed.x,seed.y)_b-lo_b<=I(x,y)_b<=I(seed.x,seed.y)_b+up_b, color image + fixed range

where I(x',y') is the value of one of the pixel's neighbors (to be added to the connected component in case of floating range, a pixel should have at least one neighbor with similar brightness).
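The grayscale conditions can be sketched in plain C with a stack-based, 4-connected fill. This is a simplified illustration and not the library implementation: the function flood_fill_gray and its fixed_range argument are hypothetical, there is no mask or ROI support, and pixel values are compared against a copy of the image taken before filling.

```c
#include <stdlib.h>
#include <string.h>

/* Fill the 4-connected component around (seed_x, seed_y) with new_val.
 * fixed_range != 0 compares candidates with the seed pixel,
 * fixed_range == 0 compares them with the neighbor already in the
 * component (floating range).
 * Returns the number of filled pixels, or -1 if allocation fails. */
int flood_fill_gray(unsigned char *img, int width, int height,
                    int seed_x, int seed_y, unsigned char new_val,
                    int lo, int up, int fixed_range)
{
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };
    int n = width * height, top = 0, filled = 0;
    int seed = seed_y * width + seed_x;
    int *stack = malloc(sizeof(int) * (size_t)n);
    unsigned char *orig = malloc((size_t)n);    /* values prior to filling */
    unsigned char *seen = calloc((size_t)n, 1); /* 1 = already accepted */

    if (!stack || !orig || !seen) {
        free(stack); free(orig); free(seen);
        return -1;
    }
    memcpy(orig, img, (size_t)n);
    seen[seed] = 1;
    stack[top++] = seed;

    while (top > 0) {
        int p = stack[--top];
        int px = p % width, py = p / width;
        img[p] = new_val;
        filled++;
        for (int k = 0; k < 4; k++) {
            int qx = px + dx[k], qy = py + dy[k];
            if (qx < 0 || qx >= width || qy < 0 || qy >= height)
                continue;
            int q = qy * width + qx;
            int ref = fixed_range ? orig[seed] : orig[p];
            /* ref - lo <= I(x,y) <= ref + up */
            if (!seen[q] && orig[q] >= ref - lo && orig[q] <= ref + up) {
                seen[q] = 1;          /* each pixel is pushed at most once */
                stack[top++] = q;
            }
        }
    }
    free(stack); free(orig); free(seen);
    return filled;
}
```

With a floating range the fill can follow a slow gradient far from the seed value, while a fixed range stops as soon as a pixel differs too much from the seed.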

FindContours

Finds contours in binary image

int cvFindContours( CvArr* img, CvMemStorage* storage, CvSeq** firstContour,
                    int headerSize=sizeof(CvContour), CvContourRetrievalMode mode=CV_RETR_LIST,
                    CvChainApproxMethod method=CV_CHAIN_APPROX_SIMPLE );

image: The source 8-bit single channel image. Non-zero pixels are treated as 1's, zero pixels remain 0's, that is, the image is treated as binary. To get such a binary image from grayscale, one may use cvThreshold, cvAdaptiveThreshold or cvCanny. The function modifies the source image content.
storage: Container of the retrieved contours.
firstContour: Output parameter, will contain the pointer to the first outer contour.
headerSize: Size of the sequence header, >=sizeof(CvChain) if method=CV_CHAIN_CODE, and >=sizeof(CvContour) otherwise.
mode: Retrieval mode.

CV_RETR_EXTERNAL - retrieves only the extreme outer contours
CV_RETR_LIST - retrieves all the contours and puts them in the list
CV_RETR_CCOMP - retrieves all the contours and organizes them into a two-level hierarchy: the top level contains external boundaries of the components, the second level contains boundaries of the holes
CV_RETR_TREE - retrieves all the contours and reconstructs the full hierarchy of nested contours

method: Approximation method.

CV_CHAIN_CODE - outputs contours in the Freeman chain code. All other methods output polygons (sequences of vertices).
CV_CHAIN_APPROX_NONE - translates all the points from the chain code into points
CV_CHAIN_APPROX_SIMPLE - compresses horizontal, vertical, and diagonal segments, that is, the function leaves only their ending points
CV_CHAIN_APPROX_TC89_L1, CV_CHAIN_APPROX_TC89_KCOS - applies one of the flavors of the Teh-Chin chain approximation algorithm
CV_LINK_RUNS - uses a completely different (from the previous methods) algorithm: linking of horizontal segments of 1's. Only the CV_RETR_LIST retrieval mode is allowed by the method.

The function cvFindContours retrieves contours from the binary image and returns the number of retrieved contours. The pointer firstContour is filled by the function. It will contain a pointer to the first outermost contour, or NULL if no contours are detected (if the image is completely black). Other contours may be reached from firstContour using the h_next and v_next links. The sample in the cvDrawContours discussion shows how to use contours for connected component detection. Contours can also be used for shape analysis and object recognition - see the squares sample in the CVPR 2001 tutorial course located at the SourceForge site.

StartFindContours

Initializes contour scanning process

CvContourScanner cvStartFindContours( IplImage* img, CvMemStorage* storage,
                                      int headerSize, CvContourRetrievalMode mode,
                                      CvChainApproxMethod method );

image: The source 8-bit single channel binary image.
storage: Container of the retrieved contours.
headerSize: Size of the sequence header, >=sizeof(CvChain) if method=CV_CHAIN_CODE, and >=sizeof(CvContour) otherwise.
mode: Retrieval mode, has the same meaning as in cvFindContours.
method: Approximation method, the same as in cvFindContours except that CV_LINK_RUNS can not be used here.

The function cvStartFindContours initializes and returns pointer to the contour scanner. The scanner is used further in cvFindNextContour to retrieve the rest of contours.

FindNextContour

Finds next contour in the image

CvSeq* cvFindNextContour( CvContourScanner scanner );

scanner: Contour scanner initialized by the function cvStartFindContours .

The function cvFindNextContour locates and retrieves the next contour in the image and returns a pointer to it. The function returns NULL if there are no more contours.

SubstituteContour

Replaces retrieved contour

void cvSubstituteContour( CvContourScanner scanner, CvSeq* newContour );

scanner: Contour scanner initialized by the function cvStartFindContours .
newContour: Substituting contour.

The function cvSubstituteContour replaces the retrieved contour, which was returned from the preceding call of the function cvFindNextContour and stored inside the contour scanner state, with the user-specified contour. The contour is inserted into the resulting structure (list, two-level hierarchy, or tree, depending on the retrieval mode). If the parameter newContour=NULL, the retrieved contour is not included in the resulting structure, nor are any of its children that might be added to this structure later.

EndFindContours

Finishes scanning process

CvSeq* cvEndFindContours( CvContourScanner* scanner );

scanner: Pointer to the contour scanner.

The function cvEndFindContours finishes the scanning process and returns the pointer to the first contour on the highest level.

DrawContours

Draws contour outlines or interiors in the image

void cvDrawContours( CvArr *image, CvSeq* contour,
                     double external_color, double hole_color,
                     int max_level, int thickness=1,
                     int connectivity=8 );

image: Image where the contours are to be drawn. Like in any other drawing function, the contours are clipped with the ROI.
contour: Pointer to the first contour.
externalColor: Color to draw external contours with.
holeColor: Color to draw holes with.
maxLevel: Maximal level for drawn contours. If 0, only contour is drawn. If 1, the contour and all contours after it on the same level are drawn. If 2, all contours after and all contours one level below the contours are drawn, etc. If the value is negative, the function does not draw the contours following after contour but draws child contours of contour up to abs(maxLevel)-1 level.
thickness: Thickness of lines the contours are drawn with. If it is negative (e.g. =CV_FILLED), the contour interiors are drawn.
connectivity: Connectivity of line segments of the contour outlines.

The function cvDrawContours draws contour outlines in the image if thickness>=0 or fills area bounded by the contours if thickness<0.

Example. Connected component detection via contour functions

#include <stdlib.h>  /* rand */
#include "cv.h"
#include "highgui.h"

int main( int argc, char** argv )
{
    IplImage* src;
    // the first command line parameter must be file name of binary (black-n-white) image
    if( argc == 2 && (src=cvLoadImage(argv[1], 0))!= 0)
    {
        IplImage* dst = cvCreateImage( cvGetSize(src), 8, 3 );
        CvMemStorage* storage = cvCreateMemStorage(0);
        CvSeq* contour = 0;

        cvThreshold( src, src, 1, 255, CV_THRESH_BINARY );
        cvNamedWindow( "Source", 1 );
        cvShowImage( "Source", src );

        // headerSize must be >= sizeof(CvContour) for polygonal approximation methods
        cvFindContours( src, storage, &contour, sizeof(CvContour), CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );
        cvZero( dst );

        // walk the top-level (external) contours; each one bounds a connected component
        for( ; contour != 0; contour = contour->h_next )
        {
            // keep each color channel within 0..255
            int color = CV_RGB( rand()&255, rand()&255, rand()&255 );
            /* replace CV_FILLED with 1 to see the outlines */
            cvDrawContours( dst, contour, color, color, -1, CV_FILLED, 8 );
        }

        cvNamedWindow( "Components", 1 );
        cvShowImage( "Components", dst );
        cvWaitKey(0);
    }
    return 0;
}

Replace CV_FILLED with 1 in the sample above to see the contour outlines.

Image and contour moments

Moments

Calculates all moments up to third order of a polygon or rasterized shape

void cvMoments( const CvArr* arr, CvMoments* moments, int isBinary=0 );

arr: Image (1-channel or 3-channel with COI set) or polygon (CvSeq of points or a vector of points).
moments: Pointer to returned moment state structure.
isBinary: (For images only) If the flag is non-zero, all the zero pixel values are treated as zeroes, all the others are treated as ones.

The function cvMoments calculates spatial and central moments up to the third order and writes them to moments. The moments may then be used to calculate the gravity center of the shape, its area, main axes and various shape characteristics including the 7 Hu invariants.

GetSpatialMoment

Retrieves spatial moment from moment state structure

double cvGetSpatialMoment( CvMoments* moments, int j, int i );

moments: The moment state, calculated by cvMoments.
j: x-order of the retrieved moment, j >= 0.
i: y-order of the retrieved moment, i >= 0 and i + j <= 3.

The function cvGetSpatialMoment retrieves the spatial moment, which in case of image moments is defined as:

M_ji=sum_x,y(I(x,y)•x^j•yⁱ)

where I(x,y) is the intensity of the pixel (x, y).
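The definition of M_ji can be evaluated directly, as in the plain C sketch below. The function spatial_moment and the helper ipow are hypothetical names used only for this illustration. The sketch works on a raw 8-bit buffer rather than on a CvMoments structure.

```c
/* base raised to a small non-negative integer power (ipow(b, 0) == 1) */
static double ipow(double base, int exp)
{
    double r = 1.0;
    while (exp-- > 0)
        r *= base;
    return r;
}

/* M_ji = sum over all pixels of I(x,y) * x^j * y^i
 * for an 8-bit single-channel image stored row by row. */
double spatial_moment(const unsigned char *img, int width, int height,
                      int j, int i)
{
    double m = 0.0;
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            m += img[y * width + x] * ipow((double)x, j) * ipow((double)y, i);
    return m;
}
```

For example, M₀₀ is the sum of all intensities, and M₁₀/M₀₀ and M₀₁/M₀₀ give the x and y coordinates of the gravity center.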

GetCentralMoment

Retrieves central moment from moment state structure

double cvGetCentralMoment( CvMoments* moments, int j, int i );

moments: Pointer to the moment state structure.
j: x-order of the retrieved moment, j >= 0.
i: y-order of the retrieved moment, i >= 0 and i + j <= 3.

The function cvGetCentralMoment retrieves the central moment, which in case of image moments is defined as:

μ_ji=sum_x,y(I(x,y)•(x-x_c)^j•(y-y_c)ⁱ),

where x_c=M₁₀/M₀₀, y_c=M₀₁/M₀₀ are the coordinates of the gravity center.

GetNormalizedCentralMoment

Retrieves normalized central moment from moment state structure

double cvGetNormalizedCentralMoment( CvMoments* moments, int x_order, int y_order );

moments: Pointer to the moment state structure.
x_order: x-order of the retrieved moment (j in the formula below), x_order >= 0.
y_order: y-order of the retrieved moment (i in the formula below), y_order >= 0 and x_order + y_order <= 3.

The function cvGetNormalizedCentralMoment retrieves the normalized central moment, which in case of image moments is defined as:

η_ji=μ_ji/M₀₀^((i+j)/2+1)

GetHuMoments

Calculates seven Hu invariants

void cvGetHuMoments( CvMoments* moments, CvHuMoments* HuMoments );

moments: Pointer to the moment state structure.
HuMoments: Pointer to Hu moments structure.

The function cvGetHuMoments calculates seven Hu invariants that are defined as:

 h₁=η₂₀+η₀₂

 h₂=(η₂₀-η₀₂)²+4η₁₁²

 h₃=(η₃₀-3η₁₂)²+ (3η₂₁-η₀₃)²

 h₄=(η₃₀+η₁₂)²+ (η₂₁+η₀₃)²

 h₅=(η₃₀-3η₁₂)(η₃₀+η₁₂)[(η₃₀+η₁₂)²-3(η₂₁+η₀₃)²]+(3η₂₁-η₀₃)(η₂₁+η₀₃)[3(η₃₀+η₁₂)²-(η₂₁+η₀₃)²]

 h₆=(η₂₀-η₀₂)[(η₃₀+η₁₂)²- (η₂₁+η₀₃)²]+4η₁₁(η₃₀+η₁₂)(η₂₁+η₀₃)

 h₇=(3η₂₁-η₀₃)(η₃₀+η₁₂)[(η₃₀+η₁₂)²-3(η₂₁+η₀₃)²]-(η₃₀-3η₁₂)(η₂₁+η₀₃)[3(η₃₀+η₁₂)²-(η₂₁+η₀₃)²]

These values are proved to be invariant to image scale, rotation, and reflection, except the seventh one, whose sign is changed by reflection.
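The seven invariants can be evaluated from the normalized central moments as in the plain C sketch below. The NormMoments structure and the function hu_invariants are hypothetical and do not mirror CvMoments or CvHuMoments. The seventh invariant is coded in its standard form, (3η₂₁-η₀₃)(η₃₀+η₁₂)[(η₃₀+η₁₂)²-3(η₂₁+η₀₃)²]-(η₃₀-3η₁₂)(η₂₁+η₀₃)[3(η₃₀+η₁₂)²-(η₂₁+η₀₃)²], which is the form whose sign flips under reflection.

```c
/* Hypothetical container for the normalized central moments
 * of second and third order. */
typedef struct {
    double n20, n11, n02;
    double n30, n21, n12, n03;
} NormMoments;

/* Compute the seven Hu invariants h[0]..h[6] (h1..h7 in the text). */
void hu_invariants(const NormMoments *m, double h[7])
{
    double a = m->n30 + m->n12;          /* η30 + η12  */
    double b = m->n21 + m->n03;          /* η21 + η03  */
    double c = m->n30 - 3.0 * m->n12;    /* η30 - 3η12 */
    double d = 3.0 * m->n21 - m->n03;    /* 3η21 - η03 */
    double e = m->n20 - m->n02;          /* η20 - η02  */

    h[0] = m->n20 + m->n02;
    h[1] = e * e + 4.0 * m->n11 * m->n11;
    h[2] = c * c + d * d;
    h[3] = a * a + b * b;
    h[4] = c * a * (a * a - 3.0 * b * b) + d * b * (3.0 * a * a - b * b);
    h[5] = e * (a * a - b * b) + 4.0 * m->n11 * a * b;
    h[6] = d * a * (a * a - 3.0 * b * b) - c * b * (3.0 * a * a - b * b);
}
```

Mirroring a shape about the y axis negates every moment with an odd x-order (η₁₁, η₃₀, η₁₂), which leaves the first six values unchanged and negates the seventh.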

Special Image Transforms

HoughLines

Finds lines in binary image using Hough transform

CvSeq* cvHoughLines2( CvArr* image, void* lineStorage, int method,
                      double dRho, double dTheta, int threshold,
                      double param1=0, double param2=0 );

image

Source 8-bit single-channel (binary) image. It may be modified by the function.

lineStorage

The storage for the detected lines. It can be a memory storage (in this case a sequence of lines is created in the storage and returned by the function) or a single-row/single-column matrix (CvMat*) of a particular type (see below) to which the lines' parameters are written. The matrix header is modified by the function so that its cols or rows contains the number of lines detected (that is, the matrix is truncated to fit exactly the detected lines, though no data is deallocated: only the header is modified). In the latter case, if the actual number of lines exceeds the matrix size, the maximum possible number of lines is returned (the lines are not sorted by length, confidence or any other criterion).

method

The Hough transform variant, one of:

CV_HOUGH_STANDARD - classical or standard Hough transform. Every line is represented by two floating-point numbers (ρ, θ), where ρ is a distance between (0,0) point and the line, and θ is the angle between x-axis and the normal to the line. Thus, the matrix must be (the created sequence will be) of CV_32FC2 type.
CV_HOUGH_PROBABILISTIC - probabilistic Hough transform (more efficient in case if picture contains a few long linear segments). It returns line segments rather than the whole lines. Every segment is represented by starting and ending points, and the matrix must be (the created sequence will be) of CV_32SC4 type.
CV_HOUGH_MULTI_SCALE - multi-scale variant of classical Hough transform. The lines are encoded the same way as in CV_HOUGH_STANDARD.

dRho

Distance resolution in pixel-related units.

dTheta

Angle resolution measured in radians.

threshold

Threshold parameter. A line is returned by the function if the corresponding accumulator value is greater than threshold.

param1

The first method-dependent parameter:

For classical Hough transform it is not used (0).
For probabilistic Hough transform it is the minimum line length.
For multi-scale Hough transform it is divisor for distance resolution dRho. (The coarse distance resolution will be dRho and the accurate resolution will be (dRho / param1)).

param2

The second method-dependent parameter:

For classical Hough transform it is not used (0).
For probabilistic Hough transform it is the maximum gap between line segments lying on the same line to treat them as a single line segment (i.e. to join them).
For multi-scale Hough transform it is divisor for angle resolution dTheta. (The coarse angle resolution will be dTheta and the accurate resolution will be (dTheta / param2)).

The function cvHoughLines2 implements a few variants of Hough transform for line detection.

Example. Detecting lines with Hough transform.

/* This is a standalone program. Pass an image name as a first parameter of the program.
   Switch between standard and probabilistic Hough transform by changing "#if 1" to "#if 0" and back */
#include <cv.h>
#include <highgui.h>
#include <math.h>

int main(int argc, char** argv)
{
    IplImage* src;
    if( argc == 2 && (src=cvLoadImage(argv[1], 0))!= 0)
    {
        IplImage* dst = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* color_dst = cvCreateImage( cvGetSize(src), 8, 3 );
        CvMemStorage* storage = cvCreateMemStorage(0);
        CvSeq* lines = 0;
        int i;
        cvCanny( src, dst, 50, 200, 3 );
        cvCvtColor( dst, color_dst, CV_GRAY2BGR );
#if 1
        lines = cvHoughLines2( dst, storage, CV_HOUGH_STANDARD, 1, CV_PI/180, 150, 0, 0 );

        for( i = 0; i < lines->total; i++ )
        {
            float* line = (float*)cvGetSeqElem(lines,i);
            float rho = line[0];
            float theta = line[1];
            CvPoint pt1, pt2;
            double a = cos(theta), b = sin(theta);
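            if( fabs(a) < 0.001 )
            {
                /* cos(theta) ~ 0: the line x*a + y*b = rho is horizontal, y = rho */
                pt1.y = pt2.y = cvRound(rho);
                pt1.x = 0;
                pt2.x = color_dst->width;
            }
            else if( fabs(b) < 0.001 )
            {
                /* sin(theta) ~ 0: the line is vertical, x = rho */
                pt1.x = pt2.x = cvRound(rho);
                pt1.y = 0;
                pt2.y = color_dst->height;
            }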
            else
            {
                pt1.x = 0;
                pt1.y = cvRound(rho/b);
                pt2.x = cvRound(rho/a);
                pt2.y = 0;
            }
            cvLine( color_dst, pt1, pt2, CV_RGB(255,0,0), 3, 8 );
        }
#else
        lines = cvHoughLines2( dst, storage, CV_HOUGH_PROBABILISTIC, 1, CV_PI/180, 80, 30, 10 );
        for( i = 0; i < lines->total; i++ )
        {
            CvPoint* line = (CvPoint*)cvGetSeqElem(lines,i);
            cvLine( color_dst, line[0], line[1], CV_RGB(255,0,0), 3, 8 );
        }
#endif
        cvNamedWindow( "Source", 1 );
        cvShowImage( "Source", src );

        cvNamedWindow( "Hough", 1 );
        cvShowImage( "Hough", color_dst );

        cvWaitKey(0);
    }
}

This is the sample picture the function parameters have been tuned for:

And this is the output of the above program in case of probabilistic Hough transform ("#if 0" case):

DistTransform

Calculates distance to closest zero pixel for all non-zero pixels of source image

void cvDistTransform( const CvArr* src, CvArr* dst, CvDisType disType=CV_DIST_L2,
                      int maskSize=3, float* mask=0 );

src: Source 8-bit single-channel (binary) image.
dst: Output image with calculated distances (32-bit floating-point, single-channel).
disType: Type of distance; can be CV_DIST_L1, CV_DIST_L2, CV_DIST_C or CV_DIST_USER.
maskSize: Size of distance transform mask; can be 3 or 5. In case of CV_DIST_L1 or CV_DIST_C the parameter is forced to 3, because a 5×5 mask gives the same result as a 3×3 mask in this case, yet it is slower.
mask: User-defined mask in case of user-defined distance. It consists of 2 numbers (horizontal/vertical shift cost, diagonal shift cost) in case of a 3×3 mask and 3 numbers (horizontal/vertical shift cost, diagonal shift cost, knight's move cost) in case of a 5×5 mask.

The function cvDistTransform calculates the approximate distance from every binary image pixel to the nearest zero pixel. For zero pixels the function sets the distance to zero; for the others it finds the shortest path consisting of basic shifts: horizontal, vertical, diagonal, or knight's move (the last is available only for the 5×5 mask). The overall distance is calculated as the sum of these basic distances. Because the distance function should be symmetric, all horizontal and vertical shifts must have the same cost (denoted a), all diagonal shifts must have the same cost (denoted b), and all knight's moves must have the same cost (denoted c). For the CV_DIST_C and CV_DIST_L1 types the distance is calculated precisely, whereas for CV_DIST_L2 (Euclidean distance) the distance can be calculated only with some relative error (the 5×5 mask gives more accurate results). OpenCV uses the values suggested in [Borgefors86]:

CV_DIST_C (3×3):
a=1, b=1

CV_DIST_L1 (3×3):
a=1, b=2

CV_DIST_L2 (3×3):
a=0.955, b=1.3693

CV_DIST_L2 (5×5):
a=1, b=1.4, c=2.1969

And below are samples of distance field (black (0) pixel is in the middle of white square) in case of user-defined distance:

User-defined 3×3 mask (a=1, b=1.5)

4.5	4	3.5	3	3.5	4	4.5
4	3	2.5	2	2.5	3	4
3.5	2.5	1.5	1	1.5	2.5	3.5
3	2	1	0	1	2	3
3.5	2.5	1.5	1	1.5	2.5	3.5
4	3	2.5	2	2.5	3	4
4.5	4	3.5	3	3.5	4	4.5

User-defined 5×5 mask (a=1, b=1.5, c=2)

4	3.5	3	3	3	3.5	4
3.5	3	2	2	2	3	3.5
3	2	1.5	1	1.5	2	3
3	2	1	0	1	2	3
3	2	1.5	1	1.5	2	3
3.5	3	2	2	2	3	3.5
4	3.5	3	3	3	3.5	4

Typically, CV_DIST_L2 with a 3×3 mask is used for fast, coarse distance estimation, and CV_DIST_L2 with a 5×5 mask is used for more accurate distance estimation.
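
The shortest-path definition above can be evaluated with the classical two-pass chamfer scheme. The following plain C sketch does this for a 3×3 mask; it illustrates the idea only and is not the library implementation. The names relax and chamfer3x3 are hypothetical, and images are assumed to be stored row by row without padding.

```c
#include <assert.h>
#include <float.h>

/* Return the smaller of d and (neighbour + cost). */
float relax( float d, float neighbour, float cost )
{
    float cand = neighbour + cost;
    return cand < d ? cand : d;
}

/* Two-pass 3x3 chamfer distance transform (illustrative sketch).
   src: w*h bytes, 0 marks a zero pixel; dst: w*h floats.
   a = horizontal/vertical shift cost, b = diagonal shift cost.
   If src has no zero pixel, every dst value stays at FLT_MAX/2. */
void chamfer3x3( const unsigned char* src, float* dst, int w, int h,
                 float a, float b )
{
    int x, y;
    assert( src != 0 && dst != 0 && w > 0 && h > 0 );

    /* Initialise: 0 at zero pixels, a very large value elsewhere. */
    for( y = 0; y < h; y++ )
        for( x = 0; x < w; x++ )
            dst[y*w + x] = src[y*w + x] == 0 ? 0.f : FLT_MAX/2;

    /* Forward pass: propagate from the left and upper neighbours. */
    for( y = 0; y < h; y++ )
        for( x = 0; x < w; x++ )
        {
            float d = dst[y*w + x];
            if( x > 0 )            d = relax( d, dst[y*w + x-1], a );
            if( y > 0 )            d = relax( d, dst[(y-1)*w + x], a );
            if( y > 0 && x > 0 )   d = relax( d, dst[(y-1)*w + x-1], b );
            if( y > 0 && x < w-1 ) d = relax( d, dst[(y-1)*w + x+1], b );
            dst[y*w + x] = d;
        }

    /* Backward pass: propagate from the right and lower neighbours. */
    for( y = h-1; y >= 0; y-- )
        for( x = w-1; x >= 0; x-- )
        {
            float d = dst[y*w + x];
            if( x < w-1 )            d = relax( d, dst[y*w + x+1], a );
            if( y < h-1 )            d = relax( d, dst[(y+1)*w + x], a );
            if( y < h-1 && x < w-1 ) d = relax( d, dst[(y+1)*w + x+1], b );
            if( y < h-1 && x > 0 )   d = relax( d, dst[(y+1)*w + x-1], b );
            dst[y*w + x] = d;
        }
}
```

Running the sketch on a 7×7 image with a single zero pixel in the middle and a=1, b=1.5 gives the values listed in the user-defined 3×3 table above.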

[Borgefors86] Gunilla Borgefors, "Distance Transformations in Digital Images". Computer Vision, Graphics and Image Processing 34, 344-371 (1986).

Histogram Functions

CvHistogram

Multi-dimensional histogram

    typedef struct CvHistogram
    {
        int header_size; /* header's size */
        CvHistType type; /* type of histogram */
        int flags; /* histogram's flags */
        int c_dims; /* histogram's dimension */
        int dims[CV_HIST_MAX_DIM]; /* every dimension size */
        int mdims[CV_HIST_MAX_DIM]; /* coefficients for fast access to element */
        /* &m[a,b,c] = m + a*mdims[0] + b*mdims[1] + c*mdims[2] */
        float* thresh[CV_HIST_MAX_DIM]; /* bin boundaries arrays for every dimension */
        float* array; /* all the histogram data, expanded into the single row */
        struct CvNode* root; /* root of balanced tree storing histogram bins */
        CvSet* set; /* pointer to memory storage (for the balanced tree) */
        int* chdims[CV_HIST_MAX_DIM]; /* cache data for fast calculating */
    } CvHistogram;

CreateHist

Creates histogram

CvHistogram* cvCreateHist( int cDims, int* dims, int type,
                           float** ranges=0, int uniform=1 );

cDims: Number of histogram dimensions.
dims: Array of histogram dimension sizes.
type: Histogram representation format: CV_HIST_ARRAY means that histogram data is represented as a multi-dimensional dense array CvMatND; CV_HIST_TREE means that histogram data is represented as a multi-dimensional sparse array CvSparseMat.
ranges: Array of ranges for histogram bins. Its meaning depends on the uniform parameter value. The ranges are used when the histogram is calculated or back-projected, to determine which histogram bin corresponds to which value/tuple of values from the input image[s].
uniform: Uniformity flag. If it is non-zero, the histogram has evenly spaced bins, and for every 0<=i<cDims, ranges[i] is an array of two numbers: the lower and upper boundaries for the i-th histogram dimension. The whole range [lower,upper] is then split into dims[i] equal parts to determine the i-th input tuple value ranges for every histogram bin. If uniform=0, the i-th element of the ranges array contains dims[i]+1 elements: lower₀, upper₀ == lower₁, upper₁ == lower₂, ..., upper_(dims[i]-1), where lower_j and upper_j are the lower and upper boundaries of the i-th input tuple value for the j-th bin, respectively. In either case, input values that fall outside the specified range for a histogram bin are not counted by cvCalcHist and are filled with 0 by cvCalcBackProject.

The function cvCreateHist creates a histogram of the specified size and returns the pointer to the created histogram. If the array ranges is 0, the histogram bin ranges must be specified later via the function cvSetHistBinRanges. However, cvCalcHist and cvCalcBackProject may process 8-bit images without bin ranges being set; they then assume bins equally spaced over 0..255.
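
To make the two layouts of ranges concrete, here is a small plain C sketch of how a value could be mapped to a bin index in each case. The function names uniform_bin and nonuniform_bin are hypothetical, and treating each bin as a half-open interval [lower, upper) is an assumption of this sketch rather than a documented guarantee.

```c
#include <assert.h>

/* Uniform case: range = {lower, upper}, split into n equal bins.
   Returns the bin index, or -1 when v lies outside [lower, upper)
   (half-open interval: an assumption of this sketch). */
int uniform_bin( float v, const float* range, int n )
{
    int idx;
    assert( range != 0 && n > 0 && range[1] > range[0] );
    if( v < range[0] || v >= range[1] )
        return -1;
    idx = (int)( (v - range[0]) * n / (range[1] - range[0]) );
    return idx < n ? idx : n - 1;  /* guard against rounding at the top edge */
}

/* Non-uniform case: bounds[0..n] holds n+1 increasing boundaries;
   bin j covers [bounds[j], bounds[j+1]). Returns -1 if v is out of range. */
int nonuniform_bin( float v, const float* bounds, int n )
{
    int j;
    assert( bounds != 0 && n > 0 );
    for( j = 0; j < n; j++ )
        if( v >= bounds[j] && v < bounds[j+1] )
            return j;
    return -1;
}
```

For example, with range {0, 180} and 30 bins every bin is 6 units wide, so the value 90 falls into bin 15. A return value of -1 corresponds to the "not counted" case described for the uniform parameter.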

SetHistBinRanges

Sets bounds of histogram bins

void cvSetHistBinRanges( CvHistogram* hist, float** ranges, int uniform=1 );

hist: Histogram.
ranges: Array of bin ranges arrays, see cvCreateHist.
uniform: Uniformity flag, see cvCreateHist.

The function cvSetHistBinRanges is a stand-alone function for setting bin ranges in the histogram. For a more detailed description of the parameters ranges and uniform, see the cvCreateHist function, which can initialize the ranges as well. Ranges for histogram bins must be set before the histogram is calculated or the back projection of the histogram is calculated.

ReleaseHist

Releases histogram

void cvReleaseHist( CvHistogram** hist );

hist: Double pointer to the released histogram.

The function cvReleaseHist releases the histogram (header and the data). The pointer to histogram is cleared by the function. If *hist pointer is already NULL, the function does nothing.

ClearHist

Clears histogram

void cvClearHist( CvHistogram* hist );

hist: Histogram.

The function cvClearHist sets all histogram bins to 0 in case of dense histogram and removes all histogram bins in case of sparse array.

MakeHistHeaderForArray

Makes a histogram out of array

void cvMakeHistHeaderForArray( int cDims, int* dims, CvHistogram* hist,
                               float* data, float** ranges=0, int uniform=1 );

cDims: Number of histogram dimensions.
dims: Array of histogram dimension sizes.
hist: The histogram header initialized by the function.
data: Array that will be used to store histogram bins.
ranges: Histogram bin ranges, see cvCreateHist.
uniform: Uniformity flag, see cvCreateHist.

The function cvMakeHistHeaderForArray initializes a histogram whose header and bins are allocated by the user. cvReleaseHist does not need to be called afterwards. The histogram will be dense; a sparse histogram cannot be initialized this way.

QueryHistValue_1D

Queries value of histogram bin

#define cvQueryHistValue_1D( hist, idx0 ) \
    cvGetReal1D( (hist)->bins, (idx0) )
#define cvQueryHistValue_2D( hist, idx0, idx1 ) \
    cvGetReal2D( (hist)->bins, (idx0), (idx1) )
#define cvQueryHistValue_3D( hist, idx0, idx1, idx2 ) \
    cvGetReal3D( (hist)->bins, (idx0), (idx1), (idx2) )
#define cvQueryHistValue_nD( hist, idx ) \
    cvGetRealND( (hist)->bins, (idx) )

hist: Histogram.
idx0, idx1, idx2: Indices of the bin.
idx: Array of indices.

The macros cvQueryHistValue_*D return the value of the specified bin of a 1D, 2D, 3D or nD histogram. In case of a sparse histogram the macros return 0 if the bin is not present in the histogram, and no new bin is created.

GetHistValue_1D

Returns pointer to histogram bin

#define cvGetHistValue_1D( hist, idx0 ) \
    ((float*)(cvPtr1D( (hist)->bins, (idx0), 0 )))
#define cvGetHistValue_2D( hist, idx0, idx1 ) \
    ((float*)(cvPtr2D( (hist)->bins, (idx0), (idx1), 0 )))
#define cvGetHistValue_3D( hist, idx0, idx1, idx2 ) \
    ((float*)(cvPtr3D( (hist)->bins, (idx0), (idx1), (idx2), 0 )))
#define cvGetHistValue_nD( hist, idx ) \
    ((float*)(cvPtrND( (hist)->bins, (idx), 0 )))

hist: Histogram.
idx0, idx1, idx2: Indices of the bin.
idx: Array of indices.

The macros cvGetHistValue_*D return a pointer to the specified bin of a 1D, 2D, 3D or nD histogram. In case of a sparse histogram the macros create a new bin and fill it with 0 if it does not exist.

GetMinMaxHistValue

Finds minimum and maximum histogram bins

void cvGetMinMaxHistValue( const CvHistogram* hist,
                           float* minVal, float* maxVal,
                           int* minIdx =0, int* maxIdx =0);

hist: Histogram.
minVal: Pointer to the minimum value of the histogram; can be NULL.
maxVal: Pointer to the maximum value of the histogram; can be NULL.
minIdx: Pointer to the array of coordinates for minimum. If not NULL, must have hist->c_dims elements to store the coordinates.
maxIdx: Pointer to the array of coordinates for maximum. If not NULL, must have hist->c_dims elements to store the coordinates.

The function cvGetMinMaxHistValue finds the minimum and maximum histogram bins and their positions. In case of several maximums or minimums the earliest in lexicographical order extrema locations are returned.

NormalizeHist

Normalizes histogram

void cvNormalizeHist( CvHistogram* hist, double factor );

hist: Pointer to the histogram.
factor: Normalization factor.

The function cvNormalizeHist normalizes the histogram bins by scaling them, such that the sum of the bins becomes equal to factor.

ThreshHist

Thresholds histogram

void cvThreshHist( CvHistogram* hist, double thresh );

hist: Pointer to the histogram.
thresh: Threshold level.

The function cvThreshHist clears histogram bins that are below the specified level.
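
The two in-place operations on bins, normalization (cvNormalizeHist) and thresholding (cvThreshHist), can be sketched on a flat array of bin values as follows. The names normalize_bins and thresh_bins are hypothetical, and leaving an all-zero histogram unchanged during normalization is a choice of this sketch, not documented library behavior.

```c
#include <assert.h>

/* Scale the bins so that their sum becomes equal to factor.
   An all-zero histogram is left unchanged (assumption of this sketch). */
void normalize_bins( float* bins, int n, double factor )
{
    double sum = 0;
    int i;
    assert( bins != 0 && n > 0 );
    for( i = 0; i < n; i++ )
        sum += bins[i];
    if( sum == 0 )
        return;
    for( i = 0; i < n; i++ )
        bins[i] = (float)( bins[i] * factor / sum );
}

/* Set to 0 every bin whose value is below thresh. */
void thresh_bins( float* bins, int n, double thresh )
{
    int i;
    assert( bins != 0 && n > 0 );
    for( i = 0; i < n; i++ )
        if( bins[i] < thresh )
            bins[i] = 0.f;
}
```

A typical use is to normalize a histogram to a sum of 1 and then clear the bins that carry only a small fraction of the total mass.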

CompareHist

Compares two dense histograms

double cvCompareHist( const CvHistogram* H1, const CvHistogram* H2,
                      CvCompareMethod method );

H1: The first dense histogram.
H2: The second dense histogram.

method

Comparison method, one of:

CV_COMP_CORREL;
CV_COMP_CHISQR;
CV_COMP_INTERSECT.

The function cvCompareHist compares two histograms using the specified method and returns the comparison result. It proceeds as follows:

Correlation (method=CV_COMP_CORREL):
d(H₁,H₂)=sum_I(H'₁(I)•H'₂(I))/sqrt(sum_I[H'₁(I)²]•sum_I[H'₂(I)²])
where
H'_k(I)=H_k(I)-1/N•sum_J H_k(J) (N=number of histogram bins)

Chi-Square (method=CV_COMP_CHISQR):
d(H₁,H₂)=sum_I[(H₁(I)-H₂(I))²/(H₁(I)+H₂(I))]

Intersection (method=CV_COMP_INTERSECT):
d(H₁,H₂)=sum_I min(H₁(I),H₂(I))

Note that the function can operate on dense histograms only. To compare sparse histograms or more general sparse configurations of weighted points, consider using the cvCalcEMD2 function.

CopyHist

Copies histogram

void cvCopyHist( CvHistogram* src, CvHistogram** dst );

src: Source histogram.
dst: Pointer to destination histogram.

The function cvCopyHist makes a copy of the histogram. If the second histogram pointer *dst is NULL, a new histogram of the same size as src is created. Otherwise, both histograms must have equal types and sizes. The function then copies the source histogram bin values to the destination histogram and sets the same value ranges as in src.

CalcHist

Calculates histogram of image(s)

void cvCalcHist( IplImage** img, CvHistogram* hist,
                 int doNotClear=0, const CvArr* mask=0 );

img: Source images (though, you may pass CvMat** as well).
hist: Pointer to the histogram.
doNotClear: Clear flag, if it is non-zero, the histogram is not cleared before calculation. It may be useful for iterative histogram update.
mask: The operation mask, determines what pixels of the source images are counted.

The function cvCalcHist calculates the histogram of one or more single-channel images. The elements of a tuple that is used to increment a histogram bin are taken at the same location from the corresponding input images.
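
The tuple-based accumulation can be sketched in plain C for the two-plane case. This is an illustration under simplifying assumptions, not the library code: the function name calc_hist_2d is hypothetical, the planes are 8-bit and stored contiguously, and values are binned uniformly over 0..255.

```c
#include <assert.h>
#include <string.h>

/* Accumulate a dense 2D histogram (bins0 x bins1, row-major) from two
   8-bit planes of npix pixels each, binned uniformly over 0..255.
   mask may be NULL; otherwise only pixels with mask[i] != 0 are counted.
   If do_not_clear is non-zero, the existing bin values are kept. */
void calc_hist_2d( const unsigned char* plane0, const unsigned char* plane1,
                   const unsigned char* mask, int npix,
                   float* hist, int bins0, int bins1, int do_not_clear )
{
    int i;
    assert( plane0 != 0 && plane1 != 0 && hist != 0 );
    assert( npix >= 0 && bins0 > 0 && bins1 > 0 );

    if( !do_not_clear )
        memset( hist, 0, sizeof(float) * bins0 * bins1 );

    for( i = 0; i < npix; i++ )
    {
        int b0, b1;
        if( mask && !mask[i] )
            continue;
        /* The tuple (plane0[i], plane1[i]) selects exactly one bin. */
        b0 = plane0[i] * bins0 / 256;
        b1 = plane1[i] * bins1 / 256;
        hist[b0*bins1 + b1] += 1.f;
    }
}
```

Calling the function repeatedly with do_not_clear set to a non-zero value accumulates counts over several images, which mirrors the iterative update mentioned for the doNotClear parameter.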

Sample. Calculating and displaying 2D Hue-Saturation histogram of a color image

#include <cv.h>
#include <highgui.h>

int main( int argc, char** argv )
{
    IplImage* src;
    if( argc == 2 && (src=cvLoadImage(argv[1], 1))!= 0)
    {
        IplImage* h_plane = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* s_plane = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* v_plane = cvCreateImage( cvGetSize(src), 8, 1 );
        IplImage* planes[] = { h_plane, s_plane };
        IplImage* hsv = cvCreateImage( cvGetSize(src), 8, 3 );
        int h_bins = 30, s_bins = 32;
        int hist_size[] = {h_bins, s_bins};
        float h_ranges[] = { 0, 180 }; /* hue varies from 0 (~0°red) to 180 (~360°red again) */
        float s_ranges[] = { 0, 255 }; /* saturation varies from 0 (black-gray-white) to 255 (pure spectrum color) */
        float* ranges[] = { h_ranges, s_ranges };
        int scale = 10;
        IplImage* hist_img = cvCreateImage( cvSize(h_bins*scale,s_bins*scale), 8, 3 );
        CvHistogram* hist;
        float max_value = 0;
        int h, s;

        cvCvtColor( src, hsv, CV_BGR2HSV );
        cvCvtPixToPlane( hsv, h_plane, s_plane, v_plane, 0 );
        hist = cvCreateHist( 2, hist_size, CV_HIST_ARRAY, ranges, 1 );
        cvCalcHist( planes, hist, 0, 0 );
        cvGetMinMaxHistValue( hist, 0, &max_value, 0, 0 );
        cvZero( hist_img );

        for( h = 0; h < h_bins; h++ )
        {
            for( s = 0; s < s_bins; s++ )
            {
                float bin_val = cvQueryHistValue_2D( hist, h, s );
                int intensity = cvRound(bin_val*255/max_value);
                cvRectangle( hist_img, cvPoint( h*scale, s*scale ),
                             cvPoint( (h+1)*scale - 1, (s+1)*scale - 1),
                             CV_RGB(intensity,intensity,intensity), /* draw a grayscale histogram;
                                                                       if you have an idea how to do it
                                                                       more nicely, let us know */
                             CV_FILLED );
            }
        }

        cvNamedWindow( "Source", 1 );
        cvShowImage( "Source", src );

        cvNamedWindow( "H-S Histogram", 1 );
        cvShowImage( "H-S Histogram", hist_img );

        cvWaitKey(0);
    }
}

CalcBackProject

Calculates back projection

void cvCalcBackProject( IplImage** img, CvArr* backProject, const CvHistogram* hist );

img: Source images (though you may pass CvMat** as well).
backProject: Destination back projection image of the same type as the source images.
hist: Histogram.

The function cvCalcBackProject calculates the back projection of the histogram. For each tuple of pixels at the same position in all input single-channel images, the function puts the value of the histogram bin corresponding to the tuple into the destination image. In terms of statistics, the value of each output image pixel is the probability of the observed tuple given the distribution (histogram). For example, to find a red object in the picture, one may do the following:

Calculate a hue histogram for the red object assuming the image contains only this object. The histogram is likely to have a strong maximum, corresponding to red color.
Calculate back projection of a hue plane of input image where the object is searched, using the histogram. Threshold the image.
Find connected components in the resulting picture and choose the right component using some additional criteria, for example, the largest connected component.

This is approximately the algorithm of the CAMSHIFT color object tracker, except for the last step, where the CAMSHIFT algorithm is used to locate the object on the back projection given the previous object position.
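
The per-pixel lookup at the heart of back projection can be sketched in plain C for a single 8-bit plane, such as a hue plane. The function name back_project_1d is hypothetical. The sketch assumes uniform bins over 0..255 and a histogram already scaled to the 0..255 range, and it clamps out-of-range values.

```c
#include <assert.h>

/* For each pixel, write the value of the histogram bin its value falls into.
   plane, dst: npix 8-bit pixels; hist: bins values, uniform over 0..255.
   Bin values are rounded and clamped to 0..255 (assumption of this sketch). */
void back_project_1d( const unsigned char* plane, int npix,
                      const float* hist, int bins, unsigned char* dst )
{
    int i;
    assert( plane != 0 && hist != 0 && dst != 0 );
    assert( npix >= 0 && bins > 0 );

    for( i = 0; i < npix; i++ )
    {
        float v = hist[ plane[i] * bins / 256 ];
        if( v < 0.f )   v = 0.f;
        if( v > 255.f ) v = 255.f;
        dst[i] = (unsigned char)( v + 0.5f );
    }
}
```

Thresholding dst then leaves the pixels whose values are frequent in the model histogram, which is the input to the connected-component step listed above.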

CalcBackProjectPatch

Locates a template within image by histogram comparison

void cvCalcBackProjectPatch( IplImage** img, CvArr* dst,
                             CvSize patchSize, CvHistogram* hist,
                             int method, float normFactor );

img: Source images (though you may pass CvMat** as well).
dst: Destination image.
patchSize: Size of the patch slid through the source image.
hist: Histogram.
method: Comparison method, passed to cvCompareHist (see the description of that function).
normFactor: Normalization factor for histograms; it affects the normalization scale of the destination image. Pass 1 if unsure.

The function cvCalcBackProjectPatch calculates back projection by comparing histograms of the source image patches with the given histogram. Taking measurement results from some image at each location over ROI creates an array img. These results might be one or more of hue, x derivative, y derivative, Laplacian filter, oriented Gabor filter, etc. Each measurement output is collected into its own separate image. The img image array is a collection of these measurement images. A multi-dimensional histogram hist is constructed by sampling from the img image array. The final histogram is normalized. The hist histogram has as many dimensions as the number of elements in img array.

Each new image is measured and then converted into an img image array over a chosen ROI. Histograms are taken from this img image array in an area covered by a "patch" with its anchor at the center, as shown in the picture below. The histogram is normalized using the parameter normFactor so that it may be compared with hist. The calculated histogram is compared to the model histogram hist using the function cvCompareHist (with the comparison method=method). The resulting output is placed at the location corresponding to the patch anchor in the probability image dst. This process is repeated as the patch is slid over the ROI. Iteratively updating the histogram, by subtracting the trailing pixels covered by the patch and adding the newly covered pixels, could save a lot of operations, though it is not implemented yet.

Back Project Calculation by Patches

CalcProbDensity

Divides one histogram by another

void  cvCalcProbDensity( const CvHistogram* hist1, const CvHistogram* hist2,
                         CvHistogram* histDens, double scale=255 );

hist1: First histogram (divisor).
hist2: Second histogram.
histDens: Destination histogram.
scale: Scale factor for the destination histogram values (see the formula below).

The function cvCalcProbDensity calculates the object probability density from the two histograms as:

histDens(I)=0  if hist1(I)==0
            scale  if hist1(I)!=0 && hist2(I)>hist1(I)
            hist2(I)*scale/hist1(I) if hist1(I)!=0 && hist2(I)<=hist1(I)

So the destination histogram bins are within [0,scale].
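
The piecewise formula translates directly into a loop over bins. The sketch below works on flat arrays of bin values; the function name calc_prob_density is hypothetical and the code is an illustration of the formula, not the library implementation.

```c
#include <assert.h>

/* dens[i] = 0                       if hist1[i] == 0
             scale                   if hist1[i] != 0 and hist2[i] >  hist1[i]
             hist2[i]*scale/hist1[i] if hist1[i] != 0 and hist2[i] <= hist1[i] */
void calc_prob_density( const float* hist1, const float* hist2,
                        float* dens, int n, double scale )
{
    int i;
    assert( hist1 != 0 && hist2 != 0 && dens != 0 && n > 0 );
    for( i = 0; i < n; i++ )
    {
        if( hist1[i] == 0 )
            dens[i] = 0.f;
        else if( hist2[i] > hist1[i] )
            dens[i] = (float)scale;
        else
            dens[i] = (float)( hist2[i] * scale / hist1[i] );
    }
}
```

For instance, with scale=255 a bin pair hist1=4, hist2=1 yields 63.75, while any bin where hist2 exceeds hist1 saturates at 255.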

CalcEMD2

Computes "minimal work" distance between two weighted point configurations

float cvCalcEMD2( const CvArr* signature1, const CvArr* signature2, CvDisType distType,
                  float (*distFunc)(const float* f1, const float* f2, void* userParam ),
                  const CvArr* costMatrix, CvArr* flow,
                  float* lowerBound, void* userParam );

signature1: First signature, a size1×(dims+1) floating-point matrix. Each row stores the point weight followed by the point coordinates. The matrix is allowed to have a single column (weights only) if the user-defined cost matrix is used.
signature2: Second signature of the same format as signature1, though the number of rows may be different. The total weights may be different, in this case an extra "dummy" point is added to either signature1 or signature2.
distType: Metrics used; CV_DIST_L1, CV_DIST_L2, and CV_DIST_C stand for one of the standard metrics; CV_DIST_USER means that a user-defined function distFunc or pre-calculated costMatrix is used.
distFunc: The user-defined distance function. It takes coordinates of two points and returns the distance between the points.
costMatrix: The user-defined size1×size2 cost matrix. At least one of costMatrix and distFunc must be NULL. Also, if a cost matrix is used, lower boundary (see below) can not be calculated, because it needs a metric function.
flow: The resultant size1×size2 flow matrix: flow_ij is a flow from the i-th point of signature1 to the j-th point of signature2.
lowerBound: Optional output parameter: lower boundary of the distance between the two signatures, that is, the distance between their mass centers. The lower boundary may not be calculated if the user-defined cost matrix is used, if the total weights of the point configurations are not equal, or if the signatures consist of weights only (i.e. the matrices have a single column).
userParam: Pointer to optional data that is passed into the user-defined distance function.

The function cvCalcEMD2 computes the earth mover's distance and/or a lower boundary of the distance between the two weighted point configurations. One of the applications described in [RubnerSept98] is multi-dimensional histogram comparison for image retrieval. EMD is a transportation problem that is solved using a modification of the simplex algorithm, so the complexity is exponential in the worst case, though it is much faster on average. In the case of a true metric, the lower boundary can be calculated even faster (using a linear-time algorithm), and it can be used to determine roughly whether the two signatures are far enough apart that they cannot relate to the same object.

[RubnerSept98] Y. Rubner, C. Tomasi, L.J. Guibas. The Earth Mover's Distance as a Metric for Image Retrieval. Technical Report STAN-CS-TN-98-86, Department of Computer Science, Stanford University, September 1998.

Utility Functions

MatchTemplate

Compares template against overlapped image regions

void cvMatchTemplate( const CvArr* I, const CvArr* T,
                      CvArr* result, int method );

I: Image where the search is running. It should be single-channel, 8-bit or 32-bit floating-point.
T: Searched template; must be not greater than the source image and have the same data type as the image.
result: Image of comparison results (denoted R in the formulas below); single-channel 32-bit floating-point. If I is W×H and T is w×h, then result must be (W-w+1)×(H-h+1).
method: Specifies the way the template must be compared with image regions (see below).

The function cvMatchTemplate is similar to cvCalcBackProjectPatch. It slides through I, compares w×h patches against T using the specified method, and stores the comparison results in result. Here are the formulas for the different comparison methods one may use (the summation is done over the template and/or the image patch: x'=0..w-1, y'=0..h-1):

method=CV_TM_SQDIFF:
R(x,y)=sum_x',y'[T(x',y')-I(x+x',y+y')]²

method=CV_TM_SQDIFF_NORMED:
R(x,y)=sum_x',y'[T(x',y')-I(x+x',y+y')]²/sqrt[sum_x',y'T(x',y')²•sum_x',y'I(x+x',y+y')²]

method=CV_TM_CCORR:
R(x,y)=sum_x',y'[T(x',y')•I(x+x',y+y')]

method=CV_TM_CCORR_NORMED:
R(x,y)=sum_x',y'[T(x',y')•I(x+x',y+y')]/sqrt[sum_x',y'T(x',y')²•sum_x',y'I(x+x',y+y')²]

method=CV_TM_CCOEFF:
R(x,y)=sum_x',y'[T'(x',y')•I'(x+x',y+y')],

where T'(x',y')=T(x',y') - 1/(w•h)•sum_x",y"T(x",y") (mean template brightness=>0)
      I'(x+x',y+y')=I(x+x',y+y') - 1/(w•h)•sum_x",y"I(x+x",y+y") (mean patch brightness=>0)

method=CV_TM_CCOEFF_NORMED:
R(x,y)=sum_x',y'[T'(x',y')•I'(x+x',y+y')]/sqrt[sum_x',y'T'(x',y')²•sum_x',y'I'(x+x',y+y')²]

After the function finishes comparison, the best matches can be found as global minimums (CV_TM_SQDIFF*) or maximums (CV_TM_CCORR* and CV_TM_CCOEFF*) using cvMinMaxLoc function.
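
As an illustration of the sliding comparison and the subsequent search for the best match, the plain C sketch below implements the CV_TM_SQDIFF formula for 8-bit single-channel data and locates the global minimum. The names match_sqdiff and argmin are hypothetical; the library performs the second step with cvMinMaxLoc.

```c
#include <assert.h>

/* R(x,y) = sum over x',y' of [T(x',y') - I(x+x',y+y')]^2.
   img: W x H, tmpl: w x h, result: (W-w+1) x (H-h+1), all row-major. */
void match_sqdiff( const unsigned char* img, int W, int H,
                   const unsigned char* tmpl, int w, int h, float* result )
{
    int x, y, tx, ty;
    int rw = W - w + 1, rh = H - h + 1;
    assert( img != 0 && tmpl != 0 && result != 0 );
    assert( w > 0 && h > 0 && w <= W && h <= H );

    for( y = 0; y < rh; y++ )
        for( x = 0; x < rw; x++ )
        {
            double sum = 0;
            for( ty = 0; ty < h; ty++ )
                for( tx = 0; tx < w; tx++ )
                {
                    double diff = (double)tmpl[ty*w + tx]
                                - img[(y + ty)*W + x + tx];
                    sum += diff*diff;
                }
            result[y*rw + x] = (float)sum;
        }
}

/* Index of the smallest element (the first one in case of ties). */
int argmin( const float* v, int n )
{
    int i, best = 0;
    assert( v != 0 && n > 0 );
    for( i = 1; i < n; i++ )
        if( v[i] < v[best] )
            best = i;
    return best;
}
```

The flat index returned by argmin converts to the match position as x = index % (W-w+1) and y = index / (W-w+1). For the correlation-based methods the maximum would be searched instead.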
