CN102142089A

CN102142089A - Semantic binary tree-based image annotation method

Info

Publication number: CN102142089A
Application number: CN 201110002770
Authority: CN
Inventors: 刘咏梅; 杨帆; 杜福鹏
Original assignee: Harbin Engineering University
Current assignee: Harbin Engineering University
Priority date: 2011-01-07
Filing date: 2011-01-07
Publication date: 2011-08-03
Anticipated expiration: 2031-01-07
Also published as: CN102142089B

Abstract

The invention provides an image annotation method based on a semantic binary tree. The method comprises the following steps: 1, for an image set of a specific scene, segmenting the annotated images used for learning with an image segmentation algorithm to obtain visual descriptions of the image regions; 2, constructing the visual nearest-neighbor graph of all images used for learning; 3, building the semantic binary tree of the scene from the nearest-neighbor graph of step 2; and 4, for an image to be annotated in the scene, locating its position by descending the semantic binary tree from the root node toward the leaf nodes, and transferring to the image all annotation words on the path from that node up to the root node. The invention builds a semantic binary tree for the annotated training image set of a specific scene, so as to improve the accuracy of automatic semantic annotation, based on visual features, of images that have already been classified by scene.

Description

Image annotation method based on a semantic binary tree

Technical field

The present invention relates to a method for the automatic semantic annotation of images.

Background art

Annotation words are a valuable resource for describing images and reflect their high-level semantic information well. Making full use of the annotation word information of the training images is an important means of improving image annotation accuracy. The background of the invention is an approach that jointly exploits the correlation between image semantics and visual features: the semantic scenes of the training images are extracted, a visual model is built for the training images of each scene, and an image to be annotated is finally assigned to a semantic class according to its visual features.

Summary of the invention

The object of the present invention is to provide an image annotation method based on a semantic binary tree that improves the annotation accuracy for images to be annotated after they have been classified by scene.

The object of the present invention is achieved as follows:

Step 1: for an image set of a specific scene, segment the annotated images used for learning with an image segmentation algorithm to obtain visual descriptions of the image regions;

Step 2: construct the visual nearest-neighbor graph of all images used for learning;

Step 3: build the semantic binary tree of the scene from the nearest-neighbor graph of step 2;

Step 4: for an image to be annotated in the scene, locate its position by descending the semantic binary tree from the root node toward the leaf nodes, and transfer to the image all annotation words on the path from that node up to the root node.

The visual nearest-neighbor graph of all images used for learning is constructed as follows: the visual distance between images is the Earth Mover's Distance (EMD), a similarity measure based on integrated multi-region matching; each vertex of the graph corresponds to one image, and each edge connecting two vertices corresponds to the visual distance between the two images.
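The graph construction described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how many neighbors each image is linked to, so `k` is a hypothetical parameter, and `l1_distance` is a placeholder standing in for the EMD, not the measure the method actually uses.

```python
from itertools import combinations


def build_nn_graph(features, distance, k=2):
    """Return {vertex: {neighbor: distance}}, linking each image to its k nearest images."""
    n = len(features)
    # Pairwise visual distances, stored symmetrically.
    dist = {}
    for i, j in combinations(range(n), 2):
        dist[(i, j)] = dist[(j, i)] = distance(features[i], features[j])

    graph = {i: {} for i in range(n)}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[(i, j)])
        for j in others[:k]:
            # Undirected edge whose weight is the visual distance.
            graph[i][j] = dist[(i, j)]
            graph[j][i] = dist[(i, j)]
    return graph


def l1_distance(a, b):
    # Placeholder distance (assumption); the method described here uses EMD instead.
    return sum(abs(x - y) for x, y in zip(a, b))


# Four hypothetical images, each described by a 2-bin feature vector.
features = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
graph = build_nn_graph(features, l1_distance, k=1)
```

With `k=1`, images 0 and 1 end up linked to each other, as do images 2 and 3, which is the kind of structure a later graph cut can separate.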

The semantic binary tree is built as follows: the root node of the binary tree gathers all annotated images of the scene, and the annotation word that represents the scene serves as the semantic label of the root node. The nearest-neighbor graph of step 2 is bisected with the normalized cut algorithm, dividing the images into two sets that represent the left subtree and the right subtree of the root node. In each of the two sets, the salient annotation word other than the annotation word of the root node is determined, and the membership of every image is then reassigned according to this word. The salient annotation word is found by counting the occurrences of each annotation word in the set; the word with the highest count is taken as the salient annotation word. If more than one annotation word has the highest count, the one with the lower word frequency is taken as the salient annotation word.

The above operation is repeated on the left subtree and the right subtree of the root node until a set contains only one image or contains no salient annotation word. The leaf nodes at the bottom of the tree correspond to images carrying annotation words with lower occurrence frequency.
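The selection of a salient annotation word can be sketched as below. The text does not state over which collection the tie-breaking "word frequency" is measured; this sketch assumes it is the frequency across the whole scene, passed in as the hypothetical `scene_freq` mapping. The final alphabetical tie-break is also an assumption, added only to keep the result deterministic.

```python
from collections import Counter


def salient_word(image_words, excluded, scene_freq):
    """Pick the salient annotation word of one image subset.

    image_words: list of sets, the annotation words of each image in the subset
    excluded:    words already used by ancestor nodes (e.g. the root word)
    scene_freq:  word -> frequency over the whole scene (assumed tie-break basis)
    """
    counts = Counter(w for words in image_words for w in words if w not in excluded)
    if not counts:
        return None  # no salient word: the subset is not split further
    top = max(counts.values())
    tied = [w for w, c in counts.items() if c == top]
    # Among the most frequent words, prefer the one with the lower scene-wide frequency.
    return min(tied, key=lambda w: (scene_freq.get(w, 0), w))


# Hypothetical subset: "sea" and "sand" both occur twice.
subset = [{"beach", "sea", "sand"}, {"beach", "sea"}, {"beach", "sand"}]
scene_freq = {"beach": 20, "sea": 10, "sand": 4}
word = salient_word(subset, {"beach"}, scene_freq)  # "sand" wins the tie
```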

The present invention uses annotation words and visual information to build a semantic binary tree for the annotated images of a specific scene, and proposes a concrete method for building such a semantic tree. The root of the tree corresponds to the most common annotation word of the scene. As the semantic tree grows, the semantics of each node are split by branching, the semantics of the child nodes become progressively finer, and the annotation words they represent become progressively more specific. Once the semantic binary tree has been built, an image to be annotated in this scene obtains its annotation information by descending from the root of the scene's semantic tree to a leaf node.

The present invention builds a semantic binary tree for the annotated training image set of a specific scene, and thereby improves the accuracy of automatic semantic annotation, based on visual features, of images that have already been classified by scene.

The present invention applies a binary tree whose nodes carry keywords to image annotation and is highly practical. It will be of valuable help to many content-based image retrieval (CBIR) applications, for example the image search engine of ***.

Description of drawings

The accompanying drawing is a flowchart of the present invention.

Embodiment

The present invention is described in more detail below by way of example with reference to the accompanying drawing:

Step 1: for an image set of a specific scene, segment the annotated images used for learning with an image segmentation algorithm to obtain visual descriptions of the image regions.

Step 2: construct the visual nearest-neighbor graph of all images used for learning. The visual distance between images is the Earth Mover's Distance (EMD), a similarity measure based on integrated multi-region matching. Each vertex of the graph corresponds to one image, and each edge connecting two vertices corresponds to the visual distance between the two images.
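For intuition about the EMD, the sketch below computes it in the simplest special case only: two one-dimensional histograms with equal total mass over the same ordered bins, with unit distance between adjacent bins. In that case the EMD equals the sum of absolute differences of the cumulative sums. The multi-region matching used in step 2 is a more general transportation problem and is not implemented here; `emd_1d` is a hypothetical helper name.

```python
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two 1-D histograms with equal mass over the same ordered bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # Running surplus of mass that still has to be carried to the next bin.
    carried = accumulate(a - b for a, b in zip(p, q))
    return sum(abs(c) for c in carried)


# Moving one unit of mass across two bins costs 2.
cost = emd_1d([1, 0, 0], [0, 0, 1])
```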

Step 3: build the semantic binary tree of the scene from the nearest-neighbor graph of step 2. The method is as follows.

The root node of the binary tree gathers all annotated images of the scene, and the annotation word that represents the scene serves as the semantic label of the root node. The nearest-neighbor graph of step 2 is bisected with the N-Cut (Normalized Cut) algorithm, dividing the images into two sets that represent the left subtree and the right subtree of the root node. In each of the two sets, the salient annotation word other than the annotation word of the root node is determined, and the membership of every image is then reassigned according to this word. The salient annotation word is found by counting the occurrences of each annotation word in the set; the word with the highest count is taken as the salient annotation word. If more than one annotation word has the highest count, the one with the lower word frequency is taken as the salient annotation word.
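The normalized cut criterion scores a bipartition (A, B) of a weighted graph as cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), where cut sums the weights crossing the partition and assoc sums the weights from one side to all vertices. The sketch below minimizes this value by exhaustive search, which is only feasible for tiny graphs; practical implementations use a spectral approximation instead. It assumes a symmetric similarity matrix `W`, so the visual distances of step 2 would first have to be converted to similarities, a conversion the text does not specify.

```python
from itertools import combinations


def ncut_value(W, A, B):
    """Normalized cut value of the bipartition (A, B) for similarity matrix W."""
    n = len(W)
    cut = sum(W[i][j] for i in A for j in B)
    assoc_a = sum(W[i][j] for i in A for j in range(n))
    assoc_b = sum(W[i][j] for i in B for j in range(n))
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")
    return cut / assoc_a + cut / assoc_b


def best_bipartition(W):
    """Exhaustively search all bipartitions; return (value, A, B)."""
    n = len(W)
    best = (float("inf"), None, None)
    # Vertex 0 is always placed in A so mirrored partitions are not tested twice.
    for size in range(n - 1):
        for rest in combinations(range(1, n), size):
            A = (0,) + rest
            B = tuple(v for v in range(n) if v not in A)
            value = ncut_value(W, A, B)
            if value < best[0]:
                best = (value, A, B)
    return best


# Hypothetical similarities: images 0 and 1 are alike, images 2 and 3 are alike.
W = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
]
value, A, B = best_bipartition(W)  # A == (0, 1), B == (2, 3)
```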

The above operation is repeated on the left subtree and the right subtree of the root node until a set contains only one image or contains no salient annotation word. The leaf nodes at the bottom of the tree correspond to images carrying annotation words with lower occurrence frequency.
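The recursive construction can be sketched as follows, under several assumptions that the text leaves open. The `bipartition` argument is a pluggable splitter; the example passes `halves`, a trivial placeholder that is not the normalized cut. Ties between equally frequent words are broken alphabetically here for brevity rather than by word frequency. The reassignment rule (an image carrying exactly one of the two salient words moves to that side, otherwise it stays where the cut put it) and the stopping rule when both sides share the same salient word are guesses, not details taken from the description.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    word: Optional[str]
    images: list
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def top_word(images, annotations, used):
    """Most frequent word among the images, ignoring words used by ancestors."""
    counts = Counter(w for i in images for w in annotations[i] if w not in used)
    if not counts:
        return None
    return min(counts, key=lambda w: (-counts[w], w))


def build(images, annotations, bipartition, word, used=frozenset()):
    node = Node(word, list(images))
    if len(images) <= 1:
        return node  # a single image: leaf
    left, right = bipartition(images)
    used = used | {word}
    lw = top_word(left, annotations, used)
    rw = top_word(right, annotations, used)
    if lw is None or rw is None or lw == rw:
        return node  # no usable salient words: leaf
    new_left, new_right = [], []
    for i in images:
        has_l, has_r = lw in annotations[i], rw in annotations[i]
        if has_l != has_r:  # carries exactly one salient word: follow it
            (new_left if has_l else new_right).append(i)
        else:  # otherwise keep the side chosen by the cut
            (new_left if i in left else new_right).append(i)
    if not new_left or not new_right:
        return node
    node.left = build(new_left, annotations, bipartition, lw, used)
    node.right = build(new_right, annotations, bipartition, rw, used)
    return node


def halves(images):
    # Placeholder splitter (assumption); stands in for the normalized cut.
    mid = len(images) // 2
    return images[:mid], images[mid:]


annotations = {
    0: {"beach", "sea", "boat"},
    1: {"beach", "sea", "surf"},
    2: {"beach", "sand"},
    3: {"beach", "sand", "shell"},
}
all_images = [0, 1, 2, 3]
root_word = top_word(all_images, annotations, frozenset())
tree = build(all_images, annotations, halves, root_word)
```

In this toy scene the root is labeled "beach", its children "sea" and "sand", and the "sea" node splits once more into "boat" and "surf".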

Step 4: for an image to be annotated in this scene, locate its position by descending the semantic binary tree from the root node toward the leaf nodes, and transfer to the image all annotation words on the path from that node up to the root node.
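Step 4 can be sketched as a root-to-leaf walk that collects the word of every node visited. How the branch is chosen at each node is not specified in the text; this sketch assumes the image follows the child containing its visually nearest training image, and uses a placeholder L1 distance on hypothetical feature vectors rather than the EMD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    word: str
    features: list  # feature vectors of the training images held by this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def nearest(query, features):
    # Distance from the query to its closest training image (placeholder L1 distance).
    return min(sum(abs(q - f) for q, f in zip(query, feat)) for feat in features)


def annotate(root, query):
    """Walk from the root to a leaf and return the words along the path."""
    words, node = [], root
    while node is not None:
        words.append(node.word)
        children = [c for c in (node.left, node.right) if c is not None]
        if not children:
            break
        # Assumed routing rule: follow the visually closer child.
        node = min(children, key=lambda c: nearest(query, c.features))
    return words


sea = Node("sea", [[0.0, 1.0], [0.1, 0.9]])
sand = Node("sand", [[1.0, 0.0]])
root = Node("beach", sea.features + sand.features, sea, sand)
labels = annotate(root, [0.2, 0.8])  # ["beach", "sea"]
```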

Claims

1. An image annotation method based on a semantic binary tree, characterized in that:

2. The image annotation method based on a semantic binary tree according to claim 1, characterized in that the visual nearest-neighbor graph of all images used for learning is constructed as follows: the visual distance between images is the Earth Mover's Distance, a similarity measure based on integrated multi-region matching; each vertex of the graph corresponds to one image, and each edge connecting two vertices corresponds to the visual distance between the two images.

3. The image annotation method based on a semantic binary tree according to claim 1 or 2, characterized in that the semantic binary tree is built as follows: the root node of the binary tree gathers all annotated images of the scene, and the annotation word that represents the scene serves as the semantic label of the root node; the nearest-neighbor graph of step 2 is bisected with the normalized cut algorithm, dividing the images into two sets that represent the left subtree and the right subtree of the root node; in each of the two sets, the salient annotation word other than the annotation word of the root node is determined, and the membership of every image is reassigned according to this word; the salient annotation word is found by counting the occurrences of each annotation word in the set, the word with the highest count being taken as the salient annotation word; if more than one annotation word has the highest count, the one with the lower word frequency is taken as the salient annotation word;