KR20170077429A - Saliency Map Generation Method and System based on Video Analysis - Google Patents

Saliency Map Generation Method and System based on Video Analysis

Info

Publication number
KR20170077429A
Authority
KR
South Korea
Prior art keywords
map
interest
attention
motion
extracting
Prior art date
Application number
KR1020150187306A
Other languages
Korean (ko)
Inventor
송혁
고민수
Original Assignee
Korea Electronics Technology Institute (전자부품연구원)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Korea Electronics Technology Institute (전자부품연구원)
Priority to KR1020150187306A priority Critical patent/KR20170077429A/en
Publication of KR20170077429A publication Critical patent/KR20170077429A/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00: Details of television systems
    • H04N 5/14: Picture signal circuitry for video frequency region
    • H04N 5/144: Movement detection
    • H04N 5/147: Scene change detection
    • H04N 5/2351

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

A saliency map generation method and system based on video analysis are provided. According to an embodiment of the present invention, the saliency map generation method extracts a saliency map based on spatial features, extracts a saliency map based on motion, and combines the extracted maps. A video-analysis-based saliency map can thus be generated, and important regions in a moving image can be extracted automatically, without manual selection by the user.

Description

[0001] The present invention relates to a saliency map generation method and system based on video analysis.

The present invention relates to an image processing technique, and more particularly, to a method and system for estimating an area of interest in a moving image using various image analysis techniques.

Humans tend to focus on specific areas, such as fast-moving objects or bright regions, when viewing images; saliency map extraction techniques were developed to apply this human visual attention mechanism to the field of computer vision.

Conventional saliency map extraction techniques were designed for single-frame images, using spatial feature information such as brightness, color, and color histograms.

However, because existing still-image-based techniques use only the spatial information of an image, the saliency maps they produce for consecutive frames of a moving image lack temporal consistency.

Therefore, a saliency map generation technique suited to moving images rather than still images is needed.

SUMMARY OF THE INVENTION

The present invention has been made to solve the above problems, and it is an object of the present invention to provide a saliency map generation method and system capable of generating a saliency map from a moving image using video analysis techniques such as image segmentation and motion estimation.

It is another object of the present invention to automatically extract important regions in a moving image through a video-analysis-based saliency map, so that the result can be utilized in various fields such as image editing and object extraction.

According to an aspect of the present invention, there is provided a saliency map generation method comprising: a first extraction step of extracting a first saliency map based on spatial features; a second extraction step of extracting a second saliency map based on motion; and a combining step of combining the first saliency map with the second saliency map.

The first extraction step may include: dividing an image into a plurality of regions; calculating the average color value of each divided region; computing the color difference of each region from its surrounding regions using the average color values, to calculate the similarities of the divided regions; and generating the first saliency map based on the calculated similarities.

The second extraction step may include: detecting motion regions in an image; and generating the second saliency map based on the motion magnitudes of the detected regions.

In the combining step, the first saliency map and the second saliency map may be combined using adaptive weights.

The method may further include: searching for an important region in the final saliency map generated in the combining step; and tracking the important region found in the searching step.

According to another aspect of the present invention, there is provided a saliency map generation system comprising: a first extraction unit for extracting a first saliency map based on spatial features; a second extraction unit for extracting a second saliency map based on motion; and a combining unit for combining the first saliency map and the second saliency map.

As described above, according to embodiments of the present invention, important regions in a moving image can be extracted automatically, without manual selection by the user, through a video-analysis-based saliency map.

In addition, according to embodiments of the present invention, meaningful information in a moving image is extracted automatically, so that it can be utilized in various fields and services such as image editing and object extraction.

FIG. 1 is a flowchart illustrating a saliency map generation method based on video analysis according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating region segmentation;
FIG. 3 is a diagram illustrating motion estimation;
FIG. 4 is a diagram illustrating the final saliency map; and
FIG. 5 is a block diagram of a saliency map generation system according to another embodiment of the present invention.

Hereinafter, the present invention will be described in detail with reference to the drawings.

1. Saliency map generation method based on video analysis

FIG. 1 is a flowchart illustrating a saliency map generation method based on video analysis according to an embodiment of the present invention.

The saliency map generation method according to an embodiment of the present invention generates a saliency map by estimating regions of interest in a moving image using image analysis techniques such as motion estimation and image segmentation. In that it analyzes images using the motion information of a moving image, it differs from conventional still-image-based saliency map generation.

Furthermore, the saliency map generation method according to the embodiment of the present invention can automatically extract important regions in a moving image using the extracted saliency map. This differs from existing techniques, in which the user must manually select the important regions in the video.

To generate the saliency map, as shown in FIG. 1, a space-based saliency map is first extracted through region segmentation of the current frame (S110). Then, motion between the current frame and the subsequent frames of a predetermined interval is estimated, and a motion-based saliency map is extracted using this information (S120).

Next, a final saliency map is generated by combining the saliency map generated in step S110 and the saliency map generated in step S120 using adaptive weights (S130).

Hereinafter, each of the steps constituting FIG. 1 will be described in detail.

2. Space-based saliency map extraction using region segmentation (S110)

First, region segmentation is performed on the current frame image using the SLIC superpixel method. Compared with other schemes, the SLIC superpixel method preserves boundaries well and produces relatively homogeneous regions.

The left side of FIG. 2 is the original image, and the right side is the result of applying the region segmentation technique to the original image.

Next, the average color value of each divided region is calculated, and the similarity of each region is computed from the color difference between the region and its surrounding regions. The lower the similarity, the higher the saliency value; the higher the similarity, the lower the saliency value.
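
A minimal sketch of this step, using scikit-image's SLIC implementation, is shown below. The segment count and compactness values are illustrative assumptions, and for brevity each region is compared against all other regions rather than only its spatial neighbors as described above.

```python
# Minimal sketch of S110: SLIC segmentation plus average-color contrast.
# Parameter values are illustrative, not taken from the patent.
import numpy as np
from skimage.segmentation import slic

def spatial_saliency(frame, n_segments=200):
    labels = slic(frame, n_segments=n_segments, compactness=10)
    region_ids = np.unique(labels)
    # Average color value of each divided region.
    means = np.array([frame[labels == r].mean(axis=0) for r in region_ids])
    saliency = np.zeros(frame.shape[:2], dtype=np.float32)
    for i, r in enumerate(region_ids):
        # Low similarity to other regions -> high saliency value.
        # (The patent compares against surrounding regions; all regions
        # are used here for brevity.)
        diff = np.linalg.norm(means[i] - means, axis=1).mean()
        saliency[labels == r] = diff
    return saliency / (saliency.max() + 1e-8)  # normalize to [0, 1]
```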

3. Motion-based saliency map extraction using motion estimation (S120)

First, motion vectors are calculated between the current frame and the next frame of a predetermined interval using a motion estimation technique.

If motion is estimated using only the current frame and the next frame, momentary illumination changes or subtle movements can produce erroneous motion estimates. To reduce such errors, motion estimation is performed over several frames within a predetermined interval, and the estimated motion is corrected accordingly.

The left side of FIG. 3 is the current frame image, and the right side is the motion estimation result for that frame. The colored areas are motion regions; areas marked in red have greater motion than areas marked in blue.

A region with large motion receives a high saliency value, a region with small motion receives a low saliency value, and a region with no motion receives a saliency value of zero.
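
As a rough sketch of this step, dense optical flow can serve as the motion estimator; OpenCV's Farneback flow is used below as an assumed stand-in, since the patent does not name a specific motion estimation technique. Averaging the flow magnitude over a small window of frames approximates the multi-frame error correction described above.

```python
# Minimal sketch of S120: motion-based saliency from dense optical flow.
# Farneback flow is an assumed stand-in for the unspecified estimator.
import cv2
import numpy as np

def motion_saliency(gray_frames):
    """gray_frames: list of consecutive single-channel (grayscale) frames."""
    mags = []
    for prev, nxt in zip(gray_frames[:-1], gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mags.append(np.linalg.norm(flow, axis=2))  # per-pixel magnitude
    # Averaging over the window damps errors from momentary changes.
    mag = np.mean(mags, axis=0)
    # No motion -> zero saliency; larger motion -> higher saliency.
    return mag / (mag.max() + 1e-8)
```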

4. Final saliency map generation by saliency map combination (S130)

Here, the two saliency maps generated above are combined using adaptive weights. Equation (1) shows how the final saliency map is calculated.

S_f = α1·S_s + α2·S_m + α3·S_s·S_m, where α3 = (α1 + α2) / 2   (1)

Here, S_f is the final saliency map, S_s is the space-based saliency map, S_m is the motion-based saliency map, and α1, α2, and α3 are the weights used for the combination.
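
A direct transcription of Equation (1) follows; the weight values α1 = α2 = 0.5 are illustrative assumptions, since the patent states only that the weights are adaptive.

```python
# Minimal sketch of S130: combining the two maps per Equation (1).
def combine_saliency(s_s, s_m, a1=0.5, a2=0.5):
    """s_s, s_m: space- and motion-based saliency maps (arrays in [0, 1])."""
    a3 = (a1 + a2) / 2.0           # alpha_3 = (alpha_1 + alpha_2) / 2
    s_f = a1 * s_s + a2 * s_m + a3 * s_s * s_m
    return s_f / (s_f.max() + 1e-8)
```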

The left side of FIG. 4 shows the original images, and the right side shows the final saliency maps for those images. In addition to regions with a large color difference from their surroundings, regions with large motion also appear as salient regions, and the saliency value is larger when both conditions are satisfied.

5. Saliency map generation system based on video analysis

FIG. 5 is a block diagram of a saliency map generation system according to another embodiment of the present invention. The saliency map generation system generates a saliency map according to the algorithm shown in FIG. 1.

As shown in FIG. 5, the saliency map generation system includes a video input unit 210, a space-based saliency map extraction unit 220, a motion-based saliency map extraction unit 230, a saliency map combining unit 240, and an important region search/tracking unit 250.

The video input unit 210 receives a moving image generated through a camera, or receives moving images from a storage medium, an external device, or an external network.

The space-based saliency map extraction unit 220 extracts a space-based saliency map through region segmentation of the current frame of the moving image received through the video input unit 210.

The motion-based saliency map extraction unit 230 estimates motion between the current frame and the subsequent frames of the moving image received through the video input unit 210, and extracts a motion-based saliency map using this information.

The saliency map combining unit 240 generates a final saliency map by adaptively weighting the saliency maps generated by the space-based extraction unit 220 and the motion-based extraction unit 230.

The important region search/tracking unit 250 searches for important regions in the final saliency map generated by the saliency map combining unit 240 and tracks the regions found.
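
The units of FIG. 5 can be wired together as in the hypothetical driver below, which reuses the sketch functions from the earlier sections; the 5-frame window is an assumption, and the search/tracking unit 250 is indicated only as a comment.

```python
# Hypothetical end-to-end wiring of the FIG. 5 units, reusing the
# sketch functions above; the 5-frame window is an assumption.
import cv2

def run_saliency_system(video_path, window=5):
    cap = cv2.VideoCapture(video_path)            # 210: video input unit
    frames = []
    while len(frames) < window:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    s_s = spatial_saliency(frames[0])             # 220: space-based map
    s_m = motion_saliency(gray)                   # 230: motion-based map
    s_f = combine_saliency(s_s, s_m)              # 240: adaptive combination
    # 250: important regions could be found by thresholding s_f and
    # tracked across subsequent frames (not shown).
    return s_f
```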

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, the invention is not limited to the disclosed embodiments; it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present invention.

210: Video input unit
220: Space-based saliency map extraction unit
230: Motion-based saliency map extraction unit
240: Saliency map combining unit
250: Important region search/tracking unit

Claims (6)

1. A saliency map generation method comprising:
a first extraction step of extracting a first saliency map based on spatial features;
a second extraction step of extracting a second saliency map based on motion; and
a combining step of combining the first saliency map with the second saliency map.

2. The method according to claim 1, wherein the first extraction step comprises:
dividing an image into a plurality of regions;
calculating the average color value of each divided region;
computing the color difference of each region from its surrounding regions using the average color values, to calculate the similarities of the divided regions; and
generating the first saliency map based on the calculated similarities.

3. The method according to claim 1, wherein the second extraction step comprises:
detecting motion regions in an image; and
generating the second saliency map based on the motion magnitudes of the detected motion regions.

4. The method according to claim 1, wherein, in the combining step, the first saliency map and the second saliency map are combined using adaptive weights.

5. The method according to claim 1, further comprising:
searching for an important region in the final saliency map generated in the combining step; and
tracking the important region found in the searching step.

6. A saliency map generation system comprising:
a first extraction unit for extracting a first saliency map based on spatial features;
a second extraction unit for extracting a second saliency map based on motion; and
a combining unit for combining the first saliency map and the second saliency map.
KR1020150187306A 2015-12-28 2015-12-28 Saliency Map Generation Method and System based on Video Analysis KR20170077429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150187306A KR20170077429A (en) 2015-12-28 2015-12-28 Saliency Map Generation Method and System based on Video Analysis

Publications (1)

Publication Number Publication Date
KR20170077429A true KR20170077429A (en) 2017-07-06

Family

ID=59354483

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150187306A KR20170077429A (en) 2015-12-28 2015-12-28 Saliency Map Generation Method and System based on Video Analysis

Country Status (1)

Country Link
KR (1) KR20170077429A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020159036A1 (en) * 2019-01-30 2020-08-06 삼성전자주식회사 Electronic device generating caption information for video sequence and operation method thereof

