CN114860979A - Image retrieval method and system based on region of interest extraction - Google Patents
Image retrieval method and system based on region of interest extraction
- Publication number: CN114860979A (application number CN202210575033.3A)
- Authority
- CN
- China
- Prior art keywords
- region
- original
- visual
- retrieval
- interest
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06F16/583 — Retrieval characterised by using metadata automatically derived from the content
- G06F16/5838 — Retrieval characterised by using metadata automatically derived from the content using colour
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/267 — Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V10/462 — Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/56 — Extraction of image or video features relating to colour
- G06V10/60 — Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
Abstract
The invention discloses an image retrieval method and system based on region-of-interest extraction, belonging to the technical field of image processing. The method comprises the following steps: constructing a visual fixation model, and extracting a region of interest from an original picture/original video based on the visual fixation model; extracting feature values from the region of interest, and storing the feature values in association with the corresponding original pictures/original videos according to a preset relation to obtain a retrieval database; establishing a retrieval interface and inputting a retrieval instruction; and retrieving and storing the original pictures/original videos that meet the requirements from the retrieval database based on the retrieval instruction to obtain a picture/video library. The region-of-interest detection based on the visual attention model incorporates a human visual attention mechanism and thus better conforms to the human visual perception process.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an image retrieval method and system based on region of interest extraction.
Background
In recent years, with the rapid development of network and multimedia technologies, video- and image-based retrieval technologies have attracted increasing attention. The conventional approach retrieves images by text, annotation and the like; its specific flow is shown in fig. 4. Image and video data are labelled and annotated manually, the labels are stored in association with the data, and videos and images are retrieved by searching the label keywords.
However, this approach has several drawbacks: first, as the number of image and video files increases sharply, manual labelling requires an enormous workload; second, each person understands an image or video differently, so the labels are easily inaccurate and retrieval errors result; third, the approach cannot satisfy personalized user requirements, such as retrieval based on low-level visual feature content.
Disclosure of Invention
Aiming at the problems of low efficiency, inaccuracy and the inability to meet personalized retrieval requirements in the image retrieval technologies described in the background, the invention provides an image retrieval technique that extracts a region of interest based on a visual attention model.
The invention adopts the following technical scheme: an image retrieval method based on region of interest extraction at least comprises the following steps:
constructing a visual fixation model, and extracting an interested area in an original picture/original video based on the visual fixation model;
extracting characteristic values in the region of interest, and performing relevance storage on the characteristic values and corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
establishing a retrieval interface and inputting a retrieval instruction; searching out original pictures/original videos meeting the requirements in a retrieval database based on the retrieval instruction and storing the original pictures/videos to obtain a picture/video library; the extraction process of extracting the region of interest is as follows:
firstly, processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model;
step two, acquiring at least one visual attention focus in the saliency map by utilizing a competition mechanism;
and step three, taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
In a further embodiment, the step one specifically includes the following steps:
step 101, filtering an original image/original video by using a multi-scale multi-channel filter, extracting visual features, and obtaining feature maps of the visual features; the feature maps at least comprise: a color feature map, a brightness feature map and a direction feature map;
102, selecting a scale according to requirements, defining the center c and the periphery s of each feature map, and respectively calculating the central periphery difference of the scale of the center c and the scale of the periphery s in the corresponding feature map; the result of the central peripheral difference is an attention map corresponding to the visual features;
103, normalizing the attention maps to respectively obtain a normalized color attention map C̄, a normalized brightness attention map Ī and a normalized direction attention map Ō;
Step 104, obtaining a saliency map SM by adopting the following formula: SM = (C̄ + Ī + Ō) / 3.
in a further embodiment, the second step specifically includes the following steps:
step 201, obtaining a plurality of saliency values in the saliency map, sorting them from strong to weak, and selecting the points corresponding to the top 10 saliency values as candidate points;
step 202, comparing the saliency corresponding to each candidate point with a first threshold t in sequence, wherein the candidate points whose saliency is greater than the first threshold t are the visual attention focuses.
In a further embodiment, the third step specifically includes the following steps:
calculating the Euclidean distance between the visual attention focuses; if the Euclidean distance is smaller than a second threshold value d, merging the corresponding two visual attention focuses by adopting the following formula,
resulting in a merged visual attention focus (X, Y):
X = (v_{s,i}·x_i + v_{s,j}·x_j) / (v_{s,i} + v_{s,j}), Y = (v_{s,i}·y_i + v_{s,j}·y_j) / (v_{s,i} + v_{s,j})
in the formula, (x_i, y_i) and (x_j, y_j) respectively represent the coordinates of the ith and jth visual attention focuses, with i ≠ j; v_{s,i} and v_{s,j} respectively represent the gray values of the ith and jth visual attention focuses in the saliency map.
In a further embodiment, the step 101 is further represented by:
firstly, the image is filtered by a Gaussian weight matrix and down-sampled to obtain an n-layer Gaussian pyramid; color, brightness and direction features are extracted at each scale σ of the pyramid to form the corresponding RG(σ), BY(σ), I(σ) and O(σ) feature pyramids, where σ ∈ [0, n-1];
wherein the color features in the color feature map are: red R = r - (g + b)/2, green G = g - (r + b)/2, blue B = b - (r + g)/2 and yellow Y = (r + g)/2 - |r - g|/2 - b;
the brightness feature in the brightness feature map is expressed as: I = (r + g + b)/3;
the direction features are four directional features formed by Gabor wavelet transformation of the brightness feature in the four directions θ = {0°, 45°, 90°, 135°}, where r, g and b are the red, green and blue components of the original image.
In a further embodiment, the central peripheral difference between the center c scale and the periphery s scale in the corresponding feature map is calculated as follows:
RG(c, s) = |(R(c) - G(c)) ⊖ (G(s) - R(s))|
BY(c, s) = |(B(c) - Y(c)) ⊖ (Y(s) - B(s))|
I(c, s) = |I(c) ⊖ I(s)|
O(c, s, θ) = |O(c, θ) ⊖ O(s, θ)|
wherein ⊖ denotes the across-scale difference; RG(c, s) represents the central peripheral difference of the red-green color feature map, BY(c, s) that of the blue-yellow color feature map, I(c, s) that of the brightness feature map, and O(c, s, θ) that of the direction feature map; R(c) and R(s), G(c) and G(s), B(c) and B(s), and Y(c) and Y(s) represent the center and periphery of the red, green, blue and yellow feature maps respectively; I(c) and I(s) represent the center and periphery of the brightness feature map; and O(c, θ) and O(s, θ) represent the center and periphery of the direction feature map.
In a further embodiment, the saliency is obtained as follows:
v_s(x, y) = Σ_{(u,v)} SM(u, v) · [ (1/(2πσ_c²)) exp(-((x-u)² + (y-v)²)/(2σ_c²)) - (1/(2πσ_s²)) exp(-((x-u)² + (y-v)²)/(2σ_s²)) ]
in the formula, σ_c and σ_s represent the scale factors of the center c and the periphery s respectively, and (x, y) are the coordinates of a pixel point in the saliency map.
In a further embodiment, the specific process of searching out the original image/original video meeting the requirement in the retrieval database is as follows:
comparing the similarity between the feature values in the retrieval database and the input retrieval instruction, sorting the similarities from high to low, screening out a preset number of top-ranked feature values, and matching the corresponding original images/original videos based on those feature values.
The image retrieval system based on region-of-interest extraction for implementing the image retrieval method as described above includes:
a first module configured to construct a visual gaze model, and extract an area of interest in an original picture/original video based on the visual gaze model;
a second module configured to extract feature values from the region of interest and store them in association with the corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
a third module, configured to establish a search interface and input a search instruction; and searching out and storing original pictures/original videos meeting the requirements in a retrieval database based on the retrieval instruction to obtain a picture/video library.
In a further embodiment, the first module further comprises a fourth module connected thereto, the fourth module being configured to: processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model; obtaining at least one visual focus of attention in the saliency map using a competition mechanism; and taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
The invention has the following beneficial effects: the region-of-interest detection based on the visual attention model incorporates a human visual attention mechanism and therefore better conforms to the human visual perception process. The extraction method based on the visual attention model comprises: obtaining a saliency map from the physiological characteristics of the human visual system; obtaining attention focuses through a winner-take-all competition mechanism; and using the attention focuses as seed points for region-growing segmentation to obtain the region of interest by the region-growing method. This solves the problem that the region of interest extracted by traditional methods is divorced from the subjective understanding of the user.
Drawings
Fig. 1 is a flowchart of an image retrieval method based on region of interest extraction according to the present invention.
Fig. 2 is a flow chart of region of interest extraction based on visual fixation model in the present invention.
Fig. 3 is a flowchart of acquiring a saliency map in the present invention.
Fig. 4 is a flow chart of a prior art image retrieval technique based on text annotation.
FIG. 5 is a flow diagram of a prior art content-based image retrieval technique.
Fig. 6 is a flowchart of a prior art region-of-interest based image retrieval technique.
Detailed Description
Content-Based Image Retrieval (CBIR) is a current research focus. CBIR technology retrieves images by extracting low-level visual features such as color and shape; it is strongly objective and overcomes the defects of conventional image retrieval. Its specific flow is shown in fig. 5. However, it is difficult to represent the high-level semantic features of an image with its low-level visual features, the so-called "semantic gap" problem: the information a user obtains from visual data is inconsistent with the user's own understanding of that data. Acquiring the high-level semantics of the image is therefore the key to solving the semantic gap problem.
Detecting the region of interest (ROI) of an image is an effective way to obtain its high-level semantics. In recent years, with the development of interest-detection technology, many detection methods have been proposed; a specific flow is shown in fig. 6. Region-of-interest detection based on human-computer interaction requires the user to participate, so the user's intention can be obtained accurately, but the interaction process is relatively complex. Fully automatic detection, on the other hand, breaks away from the user's subjective understanding of the image and easily produces the opposite result on data with a prominent image background.
Therefore, to solve the above technical problems, this embodiment provides an image retrieval method based on region-of-interest extraction. To address the problem that region-of-interest extraction in conventional methods is divorced from the user's subjective understanding, the region-of-interest detection based on the visual attention model incorporates a human visual attention mechanism and better conforms to the human visual perception process. As shown in fig. 1, the method comprises the following steps:
constructing a visual fixation model, and extracting an interested area in an original picture/original video based on the visual fixation model; in the present embodiment, the present invention is applicable to both the analysis search of images and the analysis search of videos.
Extracting characteristic values in the region of interest, and performing relevance storage on the characteristic values and corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
establishing a retrieval interface and inputting a retrieval instruction; and searching out original pictures/videos meeting the requirements from a retrieval database based on the retrieval instruction, and storing to obtain a picture/video library. Further shown are: and comparing the similarity of the characteristic values in the retrieval database with the input retrieval instruction, sequencing the similarity from high to low, screening out the characteristic values of a preset number which are sequenced in the front, and matching out the corresponding original image/original video based on the characteristic values.
In a further embodiment, the process of extracting the region of interest is shown in fig. 2, and includes:
firstly, processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model;
step two, acquiring at least one visual attention focus in the saliency map by utilizing a competition mechanism;
and step three, taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
The method effectively overcomes the defect that the traditional region-growing segmentation method requires manual selection of seed points, and at the same time addresses the inaccurate segmentation and overly small regions of interest obtained when a visual attention mechanism is used alone. The visual fixation model samples the image non-uniformly using the human visual attention mechanism, computes the central peripheral differences to obtain the feature maps of the image, and fuses them into the saliency map of the image.
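A minimal sketch of the region-growing step that the seed points feed into: starting from one seed, 4-connected neighbours are accepted while their gray value stays within a tolerance of the seed value. The tolerance `tol` and the homogeneity criterion are assumptions for illustration; the patent does not specify the growth criterion.

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10.0):
    """Grow a region from `seed` (row, col): accept 4-neighbours whose
    gray value differs from the seed value by less than `tol`."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx] \
                    and abs(float(image[ny, nx]) - seed_val) < tol:
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask
```

Seeding from the visual attention focuses replaces the manual seed selection of the traditional method.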
Specifically, the step one specifically comprises the following steps:
101, filtering the original image/original video with a multi-scale multi-channel filter, extracting visual features, and obtaining feature maps of the visual features, the feature maps at least comprising a color feature map, a brightness feature map and a direction feature map. Firstly, the image is filtered by a Gaussian weight matrix and down-sampled to obtain an n-layer Gaussian pyramid; color, brightness and direction features are extracted at each scale σ of the pyramid to form the corresponding RG(σ), BY(σ), I(σ) and O(σ) feature pyramids, where σ ∈ [0, n-1];
wherein the color features in the color feature map are: red R = r - (g + b)/2, green G = g - (r + b)/2, blue B = b - (r + g)/2 and yellow Y = (r + g)/2 - |r - g|/2 - b;
the brightness feature in the brightness feature map is expressed as: I = (r + g + b)/3;
the direction features are four directional features formed by Gabor wavelet transformation of the brightness feature in the four directions θ = {0°, 45°, 90°, 135°}, where r, g and b are the red, green and blue components of the original image.
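The color and brightness features of step 101 can be sketched directly. Clipping negative opponency responses to zero is an assumption carried over from the usual visual-attention formulation rather than something the patent states; the Gabor direction channel is omitted for brevity.

```python
import numpy as np

def channel_features(img):
    """Compute brightness and opponent-color features from an HxWx3
    float RGB image in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    I = (r + g + b) / 3.0                         # brightness: I = (r+g+b)/3
    R = np.clip(r - (g + b) / 2.0, 0, None)       # broadly tuned red
    G = np.clip(g - (r + b) / 2.0, 0, None)       # broadly tuned green
    B = np.clip(b - (r + g) / 2.0, 0, None)       # broadly tuned blue
    Y = np.clip((r + g) / 2.0 - np.abs(r - g) / 2.0 - b, 0, None)  # yellow
    return I, R, G, B, Y
```

Each returned map would be computed at every pyramid scale σ to form the feature pyramids.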
102, selecting scales as required, defining the center c and the periphery s of each feature map, and calculating the central peripheral difference between the center c scale and the periphery s scale in the corresponding feature map; the result of the central peripheral difference is an attention map of the corresponding visual feature. The central peripheral difference is calculated as follows:
RG(c, s) = |(R(c) - G(c)) ⊖ (G(s) - R(s))|
BY(c, s) = |(B(c) - Y(c)) ⊖ (Y(s) - B(s))|
I(c, s) = |I(c) ⊖ I(s)|
O(c, s, θ) = |O(c, θ) ⊖ O(s, θ)|
wherein ⊖ denotes the across-scale difference; RG(c, s) represents the central peripheral difference of the red-green color feature map, BY(c, s) that of the blue-yellow color feature map, I(c, s) that of the brightness feature map, and O(c, s, θ) that of the direction feature map; R(c) and R(s), G(c) and G(s), B(c) and B(s), and Y(c) and Y(s) represent the center and periphery of the red, green, blue and yellow feature maps respectively; I(c) and I(s) represent the center and periphery of the brightness feature map; and O(c, θ) and O(s, θ) represent the center and periphery of the direction feature map.
It should be noted that, since the feature maps have different sizes at different scales, the feature map at the larger scale s must be interpolated and enlarged to the size of the feature map at the smaller scale c before the difference is taken.
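The interpolate-then-subtract operation can be sketched as follows. Block averaging stands in for Gaussian-pyramid down-sampling and nearest-neighbour repetition stands in for the interpolation; both are simplifying assumptions for illustration.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 block averaging (pyramid stand-in)."""
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def center_surround(fine, coarse):
    """|c ⊖ s|: enlarge the coarse (periphery) map to the fine (center)
    size, then take the pointwise absolute difference."""
    fy = fine.shape[0] // coarse.shape[0]
    fx = fine.shape[1] // coarse.shape[1]
    up = np.repeat(np.repeat(coarse, fy, axis=0), fx, axis=1)
    return np.abs(fine - up[:fine.shape[0], :fine.shape[1]])
```

A homogeneous region produces a near-zero difference map, so only contrasts between center and periphery survive.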
103, normalizing the attention maps to respectively obtain a normalized color attention map C̄, a normalized brightness attention map Ī and a normalized direction attention map Ō;
Step 104, obtaining a saliency map SM by adopting the following formula: SM = (C̄ + Ī + Ō) / 3.
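A sketch of the step 104 fusion, assuming min-max scaling as a simple stand-in for the normalization operator applied to the three attention maps (the patent leaves the operator unspecified here):

```python
import numpy as np

def normalize_map(m):
    """Rescale a map to [0, 1]; constant maps become all zeros."""
    lo, hi = float(m.min()), float(m.max())
    return (m - lo) / (hi - lo) if hi > lo else np.zeros_like(m)

def fuse_saliency(color_map, brightness_map, direction_map):
    """SM = (C + I + O) / 3 over the normalized attention maps."""
    return (normalize_map(color_map) + normalize_map(brightness_map)
            + normalize_map(direction_map)) / 3.0
```

Normalizing before averaging keeps one strongly responding channel from drowning out the others.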
due to the selective and transitive nature of the attention focus, the selection and shifting of attention focus is achieved through the network contention mechanism of WTA. This ensures that all but the most active one, with the focus of attention directed by the most active part in terms of identifiable orientation points, is suppressed. Those local inhibit points are also temporarily activated while looking for the current focus of attention in the saliency map, and the next-to-saliency-area is considered the most active winner as the WTA network moves to the next focus of attention. The fixation area of the human eye thus shifts from a strong focus of attention to a weaker focus of attention, a process that is known as the shift of the point of attention. For further screening of attention focus, a method of weighting euclidean distances is proposed in the prior art, but the method is only suitable for a single object, for a plurality of object images.
Therefore, in the present embodiment, the saliency of each attention focus is compared with the threshold t, which is specifically expressed as: step 201, obtaining a plurality of saliency values in the saliency map, sorting them from strong to weak (and numbering them in that order), and selecting the points corresponding to the top 10 saliency values as candidate points. In this embodiment, the saliency is obtained as follows:
v_s(x, y) = Σ_{(u,v)} SM(u, v) · [ (1/(2πσ_c²)) exp(-((x-u)² + (y-v)²)/(2σ_c²)) - (1/(2πσ_s²)) exp(-((x-u)² + (y-v)²)/(2σ_s²)) ]
in the formula, σ_c and σ_s represent the scale factors of the center c and the periphery s respectively, and (x, y) are the coordinates of the pixel point in the saliency map.
Step 202, comparing the saliency corresponding to each candidate point with a first threshold t in sequence, wherein the candidate points whose saliency is greater than the first threshold t are the visual attention focuses.
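Steps 201 and 202 together can be sketched as picking the ten strongest points of the saliency map and keeping those above the first threshold t. The function below is an illustrative sketch, not the patent's exact procedure (for instance, it does not suppress neighbouring pixels of a single peak):

```python
import numpy as np

def candidate_foci(sal, t, n_candidates=10):
    """Select the n_candidates strongest saliency points (step 201),
    then keep those whose saliency exceeds the threshold t (step 202)."""
    flat = np.argsort(-sal.ravel())[:n_candidates]  # strongest first
    ys, xs = np.unravel_index(flat, sal.shape)
    return [(int(y), int(x)) for y, x in zip(ys, xs) if sal[y, x] > t]
```

The surviving points are the visual attention focuses used as region-growing seeds.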
In order to further improve the accuracy of using the attention focuses as seed points for region-growing segmentation, similar attention focuses are first merged, which is specifically expressed as: calculating the Euclidean distance between the visual attention focuses; if the Euclidean distance is smaller than a second threshold value d, merging the corresponding two visual attention focuses by adopting the following formula,
resulting in a merged visual attention focus (X, Y):
X = (v_{s,i}·x_i + v_{s,j}·x_j) / (v_{s,i} + v_{s,j}), Y = (v_{s,i}·y_i + v_{s,j}·y_j) / (v_{s,i} + v_{s,j})
in the formula, (x_i, y_i) and (x_j, y_j) respectively represent the coordinates of the ith and jth visual attention focuses, with i ≠ j; v_{s,i} and v_{s,j} respectively represent the gray values of the ith and jth visual attention focuses in the saliency map.
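The merging rule can be sketched as repeated pairwise merging: any two focuses closer than the distance threshold d are replaced by their gray-value-weighted mean. Rounding the merged coordinates back to integer pixels is an added assumption for illustration.

```python
import numpy as np

def merge_foci(foci, sal, d):
    """Merge attention focuses closer than d using the weighted mean
    X = (v_i*x_i + v_j*x_j)/(v_i + v_j), and likewise for Y."""
    foci = [tuple(f) for f in foci]
    merged = True
    while merged:
        merged = False
        for i in range(len(foci)):
            for j in range(i + 1, len(foci)):
                (yi, xi), (yj, xj) = foci[i], foci[j]
                if np.hypot(xi - xj, yi - yj) < d:
                    vi, vj = float(sal[yi, xi]), float(sal[yj, xj])
                    w = vi + vj + 1e-12  # guard against zero saliency
                    foci[i] = (int(round((vi * yi + vj * yj) / w)),
                               int(round((vi * xi + vj * xj) / w)))
                    del foci[j]
                    merged = True
                    break
            if merged:
                break
    return foci
```

Two equally salient focuses therefore merge to their midpoint, while a stronger focus pulls the merged seed towards itself.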
In another embodiment, an image retrieval system based on region of interest extraction for implementing the above method is disclosed, comprising:
a first module configured to construct a visual gaze model, and extract an area of interest in an original picture/original video based on the visual gaze model;
a second module configured to extract feature values from the region of interest and store them in association with the corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
a third module, configured to establish a search interface and input a search instruction; and searching out and storing original pictures/original videos meeting the requirements in a retrieval database based on the retrieval instruction to obtain a picture/video library.
Wherein the first module further comprises a fourth module connected thereto, the fourth module being arranged to: processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model; obtaining at least one visual focus of attention in the saliency map using a competition mechanism; and taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
Claims (10)
1. An image retrieval method based on region of interest extraction is characterized by at least comprising the following steps:
constructing a visual fixation model, and extracting an interested area in an original picture/original video based on the visual fixation model;
extracting characteristic values in the region of interest, and performing relevance storage on the characteristic values and corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
establishing a retrieval interface and inputting a retrieval instruction; searching out original pictures/original videos meeting the requirements in a retrieval database based on the retrieval instruction and storing the original pictures/videos to obtain a picture/video library; the extraction process of extracting the region of interest is as follows:
firstly, processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model;
step two, acquiring at least one visual attention focus in the saliency map by utilizing a competition mechanism;
and step three, taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
2. The image retrieval method based on region of interest extraction according to claim 1, wherein the first step specifically comprises the following steps:
step 101, filtering an original image/original video by using a multi-scale multi-channel filter, extracting visual features, and obtaining feature maps of the visual features; the feature maps at least comprise: a color feature map, a brightness feature map and a direction feature map;
102, selecting a scale according to requirements, defining the center c and the periphery s of each feature map, and respectively calculating the central periphery difference of the scale of the center c and the scale of the periphery s in the corresponding feature map; the result of the central peripheral difference is an attention map corresponding to the visual features;
step 103, normalizing the attention maps to obtain a normalized color attention map C̄, a normalized luminance attention map Ī and a normalized direction attention map Ō, respectively;
step 104, obtaining the saliency map SM by averaging the three normalized attention maps:

SM = (C̄ + Ī + Ō) / 3
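Steps 102 through 104 can be sketched as below. The box-filter local mean here stands in for proper multi-scale Gaussian filtering, and the function names and window sizes are illustrative, not taken from the patent:

```python
import numpy as np

def normalize(att):
    """Normalize an attention map to [0, 1] (step 103)."""
    rng = att.max() - att.min()
    return (att - att.min()) / rng if rng > 0 else np.zeros_like(att)

def center_surround(feature, c_size=3, s_size=9):
    """Center-periphery difference (step 102): local mean at the fine
    'center' window minus local mean at the coarse 'periphery' window."""
    def box_mean(img, k):
        pad = k // 2
        padded = np.pad(img, pad, mode='edge')
        out = np.zeros_like(img, dtype=float)
        h, w = img.shape
        for y in range(h):
            for x in range(w):
                out[y, x] = padded[y:y + k, x:x + k].mean()
        return out
    return np.abs(box_mean(feature, c_size) - box_mean(feature, s_size))

def saliency_map(color_map, lum_map, dir_map):
    """Step 104: average the three normalized attention maps into SM."""
    return (normalize(center_surround(color_map))
            + normalize(center_surround(lum_map))
            + normalize(center_surround(dir_map))) / 3.0
```

Each attention map is normalized before averaging so that no single feature channel dominates the combined saliency map.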
3. The image retrieval method based on region of interest extraction according to claim 1, wherein the second step specifically comprises the following steps:
step 201, obtaining a plurality of saliency values in the saliency map, sorting them from strong to weak, and selecting the points corresponding to the top 10 saliency values as candidate points;
step 202, comparing the saliency of each candidate point with a first threshold t in turn; the candidate points whose saliency is greater than the first threshold t are the visual attention focuses.
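Steps 201 and 202 amount to picking the strongest saliency peaks and thresholding them. A compact sketch, where `top_k` and the threshold `t` are free parameters chosen by the caller:

```python
import numpy as np

def attention_foci(sm, top_k=10, t=0.5):
    """Steps 201-202: take the top_k most salient points as candidates,
    then keep those whose saliency exceeds the first threshold t."""
    flat = sm.ravel()
    order = np.argsort(flat)[::-1][:top_k]          # strongest first
    cands = [np.unravel_index(i, sm.shape) for i in order]
    return [(y, x) for (y, x) in cands if sm[y, x] > t]
```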
4. The image retrieval method based on region of interest extraction according to claim 1, wherein the third step specifically comprises the following steps:
calculating the Euclidean distance between each pair of visual attention focuses, and if the Euclidean distance is smaller than a second threshold d, merging the two corresponding visual attention focuses with the following formula to obtain a merged visual attention focus (X, Y):

X = (v_s,i · x_i + v_s,j · x_j) / (v_s,i + v_s,j),  Y = (v_s,i · y_i + v_s,j · y_j) / (v_s,i + v_s,j)

where (x_i, y_i) and (x_j, y_j) respectively denote the coordinates of the i-th and j-th visual attention focuses, i ≠ j; v_s,i and v_s,j respectively denote the gray values of visual attention focus i and visual attention focus j in the saliency map.
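Assuming the merged focus is the gray-value-weighted centroid of the two points, which is the natural reading of claim 4 (the original formula did not survive extraction), the merge can be sketched as:

```python
import math

def merge_foci(fi, fj, v_i, v_j):
    """Gray-value-weighted merge of two attention foci: (X, Y) is the
    saliency-weighted centroid of the two points."""
    (xi, yi), (xj, yj) = fi, fj
    X = (v_i * xi + v_j * xj) / (v_i + v_j)
    Y = (v_i * yi + v_j * yj) / (v_i + v_j)
    return X, Y

def maybe_merge(fi, fj, v_i, v_j, d=20.0):
    """Merge only when the Euclidean distance is below the threshold d;
    otherwise keep the two foci separate (returns None)."""
    return merge_foci(fi, fj, v_i, v_j) if math.dist(fi, fj) < d else None
```

The weighting means a merged focus is pulled toward whichever of the two points is more salient in the saliency map.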
5. The image retrieval method based on region of interest extraction according to claim 2, wherein step 101 is further expressed as follows:
firstly, filtering the image with a Gaussian weight matrix and down-sampling to obtain an n-level Gaussian pyramid, then extracting the color, brightness and direction features at each scale σ of the pyramid to form the corresponding RG(σ), BY(σ), I(σ) and O(σ) feature pyramids, where σ ∈ [0, n−1];
wherein the color features in the color feature map are: red R = r − (g + b)/2, green G = g − (r + b)/2, blue B = b − (r + g)/2 and yellow Y = (r + g)/2 − |r − g|/2 − b;
the luminance feature in the luminance feature map is expressed as: I = (r + g + b)/3;
the direction features are four directional features formed by Gabor wavelet transformation in the four directions θ = {0°, 45°, 90°, 135°} on the basis of the luminance feature, where r, g and b are the red, green and blue components of the original image.
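A sketch of the pyramid construction and the color/luminance features of claim 5. The 2×2 box average stands in for the Gaussian weight matrix, and the opponent pairing RG = R − G, BY = B − Y is the standard reading assumed here:

```python
import numpy as np

def gaussian_pyramid(img, n=5):
    """Build an n-level pyramid by repeated blur-and-halve (a simple
    2x2 box average stands in for the Gaussian weight matrix here)."""
    levels = [img.astype(float)]
    for _ in range(n - 1):
        prev = levels[-1]
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        half = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(half)
    return levels

def color_opponents(r, g, b):
    """Broadly tuned color channels and the RG/BY opponent features,
    plus the luminance feature I = (r + g + b) / 3."""
    R = r - (g + b) / 2
    G = g - (r + b) / 2
    B = b - (r + g) / 2
    Y = (r + g) / 2 - np.abs(r - g) / 2 - b
    I = (r + g + b) / 3
    return R - G, B - Y, I        # RG(σ), BY(σ), I(σ)
```

Applying `color_opponents` at every level of the pyramid yields the RG(σ), BY(σ) and I(σ) feature pyramids; the O(σ) direction pyramid would additionally require Gabor filtering of I(σ), omitted here for brevity.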
7. The image retrieval method based on region of interest extraction according to claim 3, wherein the saliency is obtained as follows:
DoG(x, y) = (1 / (2πσ_c²)) · exp(−(x² + y²) / (2σ_c²)) − (1 / (2πσ_s²)) · exp(−(x² + y²) / (2σ_s²))

where σ_c and σ_s denote the scale factors of the center c and the periphery s, respectively, and (x, y) are the coordinates of a pixel point in the saliency map.
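Claim 7's center-periphery saliency is conventionally computed with a difference-of-Gaussians over the scale factors σ_c and σ_s. A small illustrative helper, with default scale values chosen arbitrarily:

```python
import math

def dog(x, y, sigma_c=2.0, sigma_s=8.0):
    """Difference-of-Gaussians value at pixel offset (x, y), with center
    scale sigma_c and periphery (surround) scale sigma_s."""
    r2 = x * x + y * y
    center = math.exp(-r2 / (2 * sigma_c ** 2)) / (2 * math.pi * sigma_c ** 2)
    surround = math.exp(-r2 / (2 * sigma_s ** 2)) / (2 * math.pi * sigma_s ** 2)
    return center - surround
```

Near the origin the narrow center Gaussian dominates and the response is positive; far from the origin the broad periphery Gaussian dominates and the response turns negative, which is what makes the kernel respond to locally conspicuous points.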
8. The image retrieval method based on region of interest extraction according to claim 1, wherein the original pictures/original videos that meet the requirements are searched out of the retrieval database as follows:
comparing the feature values in the retrieval database with the input retrieval instruction for similarity, sorting the similarities from high to low, screening out a preset number of top-ranked feature values, and matching the corresponding original pictures/original videos based on those feature values.
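A sketch of claim 8's similarity ranking, assuming cosine similarity between feature vectors (the patent does not fix the similarity measure) and a small in-memory database of (feature vector, image id) pairs:

```python
import numpy as np

def retrieve(query, database, top_n=5):
    """Rank stored feature vectors by cosine similarity to the query,
    keep the top_n, and return the image ids linked to them.
    `database` is a list of (feature_vector, image_id) pairs."""
    def cosine(a, b):
        na, nb = np.linalg.norm(a), np.linalg.norm(b)
        return float(a @ b / (na * nb)) if na and nb else 0.0
    scored = sorted(database,
                    key=lambda item: cosine(query, item[0]),
                    reverse=True)                    # high to low
    return [image_id for _, image_id in scored[:top_n]]
```

Returning image ids rather than the images themselves matches the associative storage of claim 1: the top-ranked feature values are used to look the original pictures/videos back up.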
9. An image retrieval system based on region of interest extraction for implementing the image retrieval method according to any one of claims 1 to 8, comprising:
a first module, configured to construct a visual fixation model and extract a region of interest in an original picture/original video based on the visual fixation model;
a second module, configured to extract feature values in the region of interest and store the feature values in association with the corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
a third module, configured to establish a retrieval interface and receive a retrieval instruction, search the retrieval database for the original pictures/original videos that meet the requirements based on the retrieval instruction, and store them to obtain a picture/video library.
10. The image retrieval system based on region of interest extraction according to claim 9, wherein the first module is further connected to a fourth module, the fourth module configured to: process the data of the original picture/original video and extract a saliency map using the visual fixation model; obtain at least one visual attention focus in the saliency map by means of a competition mechanism; and take each visual attention focus as a seed point for region-growing segmentation, obtaining the region of interest by the region growing method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210575033.3A CN114860979A (en) | 2022-05-24 | 2022-05-24 | Image retrieval method and system based on region of interest extraction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114860979A true CN114860979A (en) | 2022-08-05 |
Family
ID=82638754
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20220805 |