WO2019137185A1 - Picture screening method and apparatus, storage medium and computer device - Google Patents

Picture screening method and apparatus, storage medium and computer device

Info

Publication number
WO2019137185A1
WO2019137185A1 (PCT/CN2018/122841)
Authority
WO
WIPO (PCT)
Prior art keywords
picture
group
pictures
picture set
cluster
Prior art date
Application number
PCT/CN2018/122841
Other languages
English (en)
Chinese (zh)
Inventor
刁梁
陈昕
周华
朱欤
Original Assignee
美的集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 美的集团股份有限公司
Publication of WO2019137185A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Definitions

  • the present application relates to a picture processing technology, and in particular, to a picture screening method and apparatus, a storage medium, and a computer device.
  • the embodiment of the present application provides a picture screening method and device, a storage medium, and a computer device.
  • the grouping, according to the feature vector of each picture in the first picture set, of each picture in the first picture set into groups includes:
  • the feature vector of each picture in the first picture set is clustered, and each picture in the first picture set is grouped into a group according to a clustering result, including:
  • Each picture in the first picture set is grouped into a group, wherein the number of groups is the same as the number of cluster centers.
  • the determining a cluster center corresponding to each group of pictures includes:
  • a cluster center corresponding to each group of pictures is determined.
  • the method further includes:
  • the reference center is calculated based on a cluster center corresponding to each group of pictures.
  • the deleting, based on the distance between the cluster center corresponding to each group of pictures and the reference center, of one or more groups of pictures that meet a preset condition from the first picture set to obtain a second picture set includes:
  • deleting, from the first picture set, one or more groups of pictures whose cluster centers are at a distance greater than or equal to a preset threshold from the reference center, to obtain the second picture set; or
  • sorting the distances between the cluster centers corresponding to the groups of pictures and the reference center from large to small, determining the M groups of pictures with the largest distances, where M is a positive integer, and deleting the M groups of pictures from the first picture set to obtain the second picture set.
  • An extracting unit configured to extract a feature vector of each picture in the first picture set
  • a grouping unit configured to group each picture in the first picture set into a packet based on a feature vector of each picture in the first picture set
  • the distance determining unit is configured to determine a cluster center corresponding to each group of pictures, and determine a distance between the cluster center corresponding to each group of pictures and the reference center;
  • the filtering unit is configured to delete one or more sets of pictures that meet the preset condition from the first picture set based on the distance between the cluster center corresponding to the group of pictures and the reference center, to obtain a second picture set.
  • the grouping unit is configured to cluster feature vectors of each picture in the first picture set, and group each picture in the first picture set into a group according to a clustering result.
  • the grouping unit includes:
  • a setting subunit, configured to set the number of cluster centers;
  • a clustering subunit, configured to cluster the feature vectors of the respective pictures in the first picture set;
  • a dividing subunit, configured to group each picture in the first picture set into a group, wherein the number of groups is the same as the number of cluster centers.
  • the grouping unit is further configured to determine, according to the clustering result, a cluster center corresponding to each group of pictures.
  • the device further includes:
  • the reference center calculation unit is configured to calculate the reference center based on the cluster centers corresponding to the groups of pictures.
  • the screening unit is configured to delete, from the first picture set, one or more groups of pictures whose cluster centers are at a distance greater than or equal to a preset threshold from the reference center, to obtain the second picture set.
  • the screening unit is configured to sort the distances between the cluster centers corresponding to the groups of pictures and the reference center from large to small, determine the M groups of pictures with the largest distances, where M is a positive integer, and delete the M groups of pictures from the first picture set to obtain the second picture set.
  • the storage medium provided by the embodiment of the present application has computer executable instructions stored thereon, which, when executed by a processor, implement the picture screening method described above.
  • the computer device provided by the embodiment of the present application includes a memory, a processor, and computer executable instructions stored in the memory and executable on the processor, and the processor implements the picture screening method described above when executing the computer executable instructions.
  • the picture screening method includes: acquiring a first picture set; extracting feature vectors of each picture in the first picture set; grouping each picture in the first picture set into groups according to the feature vectors of the pictures in the first picture set; determining a cluster center corresponding to each group of pictures and determining the distance between the cluster center corresponding to each group of pictures and a reference center; and, based on the distance between the cluster center corresponding to each group of pictures and the reference center, deleting one or more groups of pictures that satisfy a preset condition from the first picture set to obtain a second picture set.
  • in this way, the crawled first picture set is processed by computer vision technology to obtain the feature vectors of each picture in the first picture set, and the feature vectors are then clustered by a clustering algorithm, thereby grouping the pictures in the first picture set; finally, the garbage pictures in the first picture set are cleaned automatically, realizing automatic cleaning of pictures and providing an accurate picture data source for artificial intelligence applications.
  • FIG. 1 is a schematic diagram of the hardware entities of the parties performing information interaction in an embodiment of the present application;
  • FIG. 2 is a schematic flowchart 1 of a picture screening method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart 2 of a picture screening method according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart 3 of a picture screening method according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart 4 of a picture screening method according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram 1 of a picture screening device according to an embodiment of the present application.
  • FIG. 7 is a second structural diagram of a picture screening apparatus according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • FIG. 1 is a schematic diagram of hardware entities of each party performing information interaction in the embodiment of the present application.
  • FIG. 1 includes a picture screening apparatus and servers 1 to n, where the picture screening apparatus performs information interaction with the servers through a wired network or a wireless network.
  • in one implementation, the picture screening apparatus is disposed in a terminal, where the terminal is, for example, a mobile phone, a desktop computer, a PC, or an all-in-one computer; the terminal provides at least the following two functions: 1) providing a user interface (UI) for the user; 2) crawling pictures from servers 1 to n and performing picture screening.
  • in another implementation, the picture screening apparatus is disposed in a server, and the server crawls pictures from servers 1 to n and performs picture screening; in addition, the server can perform information interaction with a user-facing client, receive the user's instruction to crawl pictures and perform picture screening, and can also send data such as picture screening results to the user's client, while the client is responsible for providing the UI for the user.
  • FIG. 1 is only an example of a system architecture that implements the embodiments of the present application.
  • the embodiment of the present application is not limited to the system structure described in FIG. 1 above, and various embodiments of the present application are proposed based on the system architecture.
  • FIG. 2 is a schematic flowchart 1 of a picture screening method according to an embodiment of the present application. As shown in FIG. 2, the picture screening method includes the following steps:
  • Step 201 Acquire a first picture set.
  • the manner of obtaining the first picture set may be, but is not limited to, the following: acquiring a keyword (or key phrase) input by the user, and crawling pictures that match the keyword from various types of websites (or databases) according to the keyword.
  • for example, the keyword is “air conditioning”, and pictures matching “air conditioning” are crawled from various types of websites; a picture matching “air conditioning” may be an image showing an air-conditioner pattern, or an image containing air-conditioning text.
  • the type of website may be set by the user; for example, the user may set business websites, education websites, entertainment websites, and so on, so that pictures matching the keyword are crawled from websites of the set types.
  • alternatively, the type of website is not limited, and pictures matching the keyword may be crawled from any website to which access is permitted.
  • the first picture set is the collection of pictures that match the keyword and includes a plurality of such pictures; however, the first picture set is likely to contain some garbage pictures, and these garbage pictures need to be removed from the first picture set.
  • for example, the first picture set includes picture 1, picture 2, picture 3, picture 4, and picture 5; picture 1 and picture 5 are garbage pictures and are deleted from the first picture set, thereby realizing the deletion of garbage pictures.
  • Step 202 Extract feature vectors of respective pictures in the first picture set.
  • the feature vector of each picture in the first picture set is extracted by using computer vision technology.
  • computer vision technology is a technology that uses a computer instead of the human eye to recognize and process pictures.
  • the embodiment of the present application uses a deep learning (DL, Deep Learning) technology to extract feature vectors of respective pictures in the first picture set.
  • the deep learning technique can automatically learn the representation of the feature vector from the big data.
  • Convolutional Neural Network (CNN) is an application of deep learning in the field of image.
  • CNN's special structure of local weight sharing gives it unique advantages in image processing, and its layout is closer to that of an actual biological neural network.
  • a picture is represented as a vector of pixels; for example, a 1000×1000 picture can be represented as a vector of 1,000,000 elements.
  • the vector data of the picture is input into the deep learning model, and after a series of processing (such as filtering, convolution, weighting, offsetting, etc.), the feature vector of the picture can be obtained.
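  • as an illustration of this step, the following is a minimal sketch of feature-vector extraction; the description does not name a specific deep learning model, so the use of a pretrained torchvision ResNet-50 with its classification layer removed is an assumption made only for the example.

```python
# Minimal sketch of feature-vector extraction for the pictures in the first
# picture set. The choice of a pretrained ResNet-50 is an assumption; the
# description only requires a deep learning model that outputs a feature vector.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Load a pretrained CNN and drop its final classification layer, so a forward
# pass yields a 2048-dimensional feature vector per picture.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_feature_vector(path: str) -> torch.Tensor:
    """Return the feature vector of one picture in the first picture set."""
    image = Image.open(path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)      # shape (1, 3, 224, 224)
    with torch.no_grad():
        return backbone(batch).squeeze(0)       # shape (2048,)
```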
  • the feature vector of picture 1 is P1
  • the feature vector of picture 2 is P2
  • the feature vector of picture 3 is P3
  • the feature vector of picture 4 is P4
  • the feature vector of picture 5 is P5.
  • Step 203 Group each picture in the first picture set into a packet based on a feature vector of each picture in the first picture set.
  • the feature vector of a picture represents the features of the picture: the closer the distance between the feature vectors of two pictures, the higher the similarity between the two pictures; the farther the distance between the feature vectors of two pictures, the lower the similarity between the two pictures.
  • X = (x1, x2, x3, …, xn)
  • Y = (y1, y2, y3, …, yn)
  • calculating the distance between X and Y can be, but is not limited to, by the following methods:
  • Method 1 Calculate the Euclidean distance between X and Y.
  • Method 2 Calculate the Manhattan distance between X and Y.
  • Method 3 Calculate the Minkowski distances for X and Y.
  • the Minkowski distance is a generalization that includes the Euclidean distance and the Manhattan distance as special cases.
  • Method 4 Calculate the cosine similarity of X and Y.
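  • as a concrete illustration of these four measures, the following minimal NumPy sketch computes each of them for two feature vectors X and Y; the Minkowski order p and the sample values are assumptions made only for the example.

```python
# Minimal sketch of the four distance/similarity measures named above.
import numpy as np

def euclidean(x: np.ndarray, y: np.ndarray) -> float:
    return float(np.sqrt(np.sum((x - y) ** 2)))

def manhattan(x: np.ndarray, y: np.ndarray) -> float:
    return float(np.sum(np.abs(x - y)))

def minkowski(x: np.ndarray, y: np.ndarray, p: float = 3.0) -> float:
    # p is a free parameter: p = 2 recovers the Euclidean distance,
    # p = 1 the Manhattan distance.
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def cosine_similarity(x: np.ndarray, y: np.ndarray) -> float:
    # Unlike the three distances above, a larger value means more similar.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 0.0, 4.0])
print(euclidean(x, y), manhattan(x, y), minkowski(x, y), cosine_similarity(x, y))
```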
  • the embodiment of the present application may perform clustering on feature vectors of each picture in the first picture set based on any one of the foregoing methods, and group each picture in the first picture set into a group based on the clustering result.
  • in the embodiment of the present application, the clustering method may be, for example, the K-means clustering method.
  • clustering is performed by taking several points in the space (for example, N points) as centers and assigning each object to the class of the center closest to it; here, the objects being clustered are the feature vectors.
  • the process of clustering generally includes: setting the number of cluster centers, clustering the feature vectors of the pictures, and dividing the pictures into groups whose number is the same as the number of cluster centers.
  • according to the clustering result, the cluster center corresponding to each group of pictures can be determined.
  • the number of cluster centers is set to 20. After clustering the feature vectors of each image, all the images are divided into 20 groups according to the clustering result, and 20 cluster centers are obtained.
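  • a minimal sketch of this grouping step is shown below, using scikit-learn's KMeans as one possible implementation of the K-means clustering described above; the random feature matrix and the variable names are assumptions made only for the example.

```python
# Minimal sketch: cluster the feature vectors and group the pictures so that
# the number of groups equals the number of cluster centers.
import numpy as np
from sklearn.cluster import KMeans

# One row per picture in the first picture set; in practice these would be the
# feature vectors extracted by the deep learning model (random data stands in).
features = np.random.rand(500, 2048)

n_clusters = 20                                  # number of cluster centers set in advance
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)            # group index of each picture
cluster_centers = kmeans.cluster_centers_        # one cluster center per group

# groups[k] holds the indices of the pictures assigned to group k.
groups = {k: np.where(labels == k)[0] for k in range(n_clusters)}
```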
  • Step 204 Determine a cluster center corresponding to each group of pictures, and determine a distance between the cluster center corresponding to each group of pictures and the reference center.
  • the cluster center of each group of pictures represents the overall feature of the group, and the reference center O can be calculated based on the cluster center corresponding to each group of pictures.
  • for example, the cluster centers corresponding to the 10 groups of pictures are O1, O2, O3, O4, O5, O6, O7, O8, O9, and O10, and the reference center O is the average of these 10 cluster centers.
  • the cluster center of a group can be the average of the feature vectors included in the group. For example, if a group includes the following feature vectors: P1, P2, and P3, the cluster center of the group is (P1+P2+P3)/3.
  • for example, there are 10 cluster centers: O1, O2, O3, O4, O5, O6, O7, O8, O9, and O10; the distances between these 10 cluster centers and the reference center O can be calculated by, but are not limited to, the four distance calculation methods in step 203.
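  • continuing the sketch, the reference center O can be computed as the average of the cluster centers, and the distance of each cluster center from O can then be computed with any of the measures of step 203; the use of the Euclidean distance here is just one possible choice.

```python
# Minimal sketch: reference center O as the average of the cluster centers,
# and the distance of each cluster center from O.
import numpy as np

# One row per group, e.g. KMeans.cluster_centers_ from the previous sketch
# (random data stands in here so the snippet runs on its own).
cluster_centers = np.random.rand(10, 2048)

reference_center = cluster_centers.mean(axis=0)   # the reference center O

# Euclidean distance of each group's cluster center from O; any of the four
# measures of step 203 could be substituted.
distances = np.linalg.norm(cluster_centers - reference_center, axis=1)
for k, d in enumerate(distances, start=1):
    print(f"distance(O{k}, O) = {d:.4f}")
```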
  • Step 205 Delete one or more sets of pictures that meet the preset condition from the first picture set, and obtain a second picture set, based on the distance between the cluster center corresponding to the group of pictures and the reference center.
  • the preset condition is used to screen out, from the first picture set, one or more groups of pictures whose cluster centers are far away from the reference center; such groups of pictures may also be referred to as garbage pictures, because their feature vectors are far away from the feature vectors of the other pictures and their similarity to the other pictures is therefore low.
  • after these garbage pictures are deleted from the first picture set, a second picture set of a more uniform type can be obtained.
  • FIG. 3 is a schematic flowchart 2 of a picture screening method according to an embodiment of the present disclosure. As shown in FIG. 3, the picture screening method includes the following steps:
  • Step 301 Acquire a first picture set.
  • the manner of obtaining the first picture set may be, but is not limited to, the following: acquiring a keyword (or key phrase) input by the user, and crawling pictures that match the keyword from various types of websites (or databases) according to the keyword.
  • for example, the keyword is “air conditioning”, and pictures matching “air conditioning” are crawled from various types of websites; a picture matching “air conditioning” may be an image showing an air-conditioner pattern, or an image containing air-conditioning text.
  • the type of website may be set by the user; for example, the user may set business websites, education websites, entertainment websites, and so on, so that pictures matching the keyword are crawled from websites of the set types.
  • alternatively, the type of website is not limited, and pictures matching the keyword may be crawled from any website to which access is permitted.
  • Step 302 Extract feature vectors of respective pictures in the first picture set.
  • the feature vector of each picture in the first picture set is extracted by using computer vision technology.
  • computer vision technology is a technology that uses a computer instead of the human eye to recognize and process pictures.
  • the embodiment of the present application uses the DL technology to extract feature vectors of respective pictures in the first picture set.
  • the deep learning technique can automatically learn the representation of the feature vector from the big data.
  • CNN's special structure of local weight sharing gives it a unique advantage in image processing, and its layout is closer to that of an actual biological neural network.
  • a picture is represented as a vector of pixels; for example, a 1000×1000 picture can be represented as a vector of 1,000,000 elements.
  • the vector data of the picture is input into the deep learning model, and after a series of processing (such as filtering, convolution, weighting, offsetting, etc.), the feature vector of the picture can be obtained.
  • Step 303 Group each picture in the first picture set into a packet based on a feature vector of each picture in the first picture set.
  • the feature vector of a picture represents the features of the picture: the closer the distance between the feature vectors of two pictures, the higher the similarity between the two pictures; the farther the distance between the feature vectors of two pictures, the lower the similarity between the two pictures.
  • the embodiment of the present application clusters feature vectors of each picture in the first picture set, and groups each picture in the first picture set into a group according to the clustering result.
  • in the embodiment of the present application, the clustering method may be, for example, the K-means clustering method.
  • clustering is performed by taking several points in the space (for example, N points) as centers and assigning each object to the class of the center closest to it; here, the objects being clustered are the feature vectors.
  • the process of clustering generally includes: setting the number of cluster centers, clustering the feature vectors of the pictures, and dividing the pictures into groups whose number is the same as the number of cluster centers.
  • according to the clustering result, the cluster center corresponding to each group of pictures can be determined.
  • Step 304 Determine a cluster center corresponding to each group of pictures, and determine a distance between the cluster center corresponding to each group of pictures and the reference center.
  • the cluster center of each group of pictures represents the overall feature of the group, and the reference center O can be calculated based on the cluster center corresponding to each group of pictures.
  • Step 305 Delete, from the first picture set, one or more groups of pictures whose cluster centers are at a distance greater than or equal to a preset threshold from the reference center, to obtain a second picture set.
  • the farther a cluster center is from the reference center, the larger the probability that the group of pictures corresponding to that cluster center consists of garbage pictures.
  • based on this, a threshold is set: if the distance of a cluster center from the reference center is greater than or equal to the threshold, the group of pictures corresponding to that cluster center is treated as garbage pictures, and the group of pictures is deleted from the first picture set, so that a second picture set of a more uniform type can be obtained.
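  • a minimal sketch of this threshold rule follows; the synthetic group labels, distances, and threshold value are assumptions made only for the example, since the description does not fix a particular threshold.

```python
# Minimal sketch: delete every group of pictures whose cluster center lies at a
# distance >= threshold from the reference center, keeping the rest as the
# second picture set.
import numpy as np

# Synthetic stand-ins for quantities produced in the earlier sketches.
pictures = [f"picture_{i}" for i in range(12)]            # first picture set
labels = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 0, 2])   # group of each picture
distances = np.array([0.4, 2.1, 0.6, 1.8])                # per-group distance to O

threshold = 1.5   # preset threshold; its value is application-dependent

garbage_groups = set(np.where(distances >= threshold)[0])
second_picture_set = [p for p, g in zip(pictures, labels) if g not in garbage_groups]
print(second_picture_set)   # pictures belonging to groups 0 and 2 remain
```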
  • the technical solution of the embodiment of the present application realizes the picture screening process automatically by computer, which greatly reduces the cost of manual cleaning.
  • FIG. 4 is a schematic flowchart 3 of a picture screening method according to an embodiment of the present application. As shown in FIG. 4, the picture screening method includes the following steps:
  • Step 401 Acquire a first picture set.
  • the manner of obtaining the first picture set may be, but is not limited to, the following: acquiring a keyword (or key phrase) input by the user, and crawling pictures that match the keyword from various types of websites (or databases) according to the keyword.
  • for example, the keyword is “air conditioning”, and pictures matching “air conditioning” are crawled from various types of websites; a picture matching “air conditioning” may be an image showing an air-conditioner pattern, or an image containing air-conditioning text.
  • the type of website may be set by the user; for example, the user may set business websites, education websites, entertainment websites, and so on, so that pictures matching the keyword are crawled from websites of the set types.
  • alternatively, the type of website is not limited, and pictures matching the keyword may be crawled from any website to which access is permitted.
  • Step 402 Extract feature vectors of respective pictures in the first picture set.
  • the feature vector of each picture in the first picture set is extracted by using computer vision technology.
  • computer vision technology is a technology that uses a computer instead of the human eye to recognize and process pictures.
  • the embodiment of the present application uses the DL technology to extract feature vectors of respective pictures in the first picture set.
  • the deep learning technique can automatically learn the representation of the feature vector from the big data.
  • CNN's special structure of local weight sharing gives it a unique advantage in image processing, and its layout is closer to that of an actual biological neural network.
  • a picture is represented as a vector of pixels; for example, a 1000×1000 picture can be represented as a vector of 1,000,000 elements.
  • the vector data of the picture is input into the deep learning model, and after a series of processing (such as filtering, convolution, weighting, offsetting, etc.), the feature vector of the picture can be obtained.
  • Step 403 Group each picture in the first picture set into a packet based on a feature vector of each picture in the first picture set.
  • the feature vector of a picture represents the features of the picture: the closer the distance between the feature vectors of two pictures, the higher the similarity between the two pictures; the farther the distance between the feature vectors of two pictures, the lower the similarity between the two pictures.
  • the embodiment of the present application clusters feature vectors of each picture in the first picture set, and groups each picture in the first picture set into a group according to the clustering result.
  • in the embodiment of the present application, the clustering method may be, for example, the K-means clustering method.
  • clustering is performed by taking several points in the space (for example, N points) as centers and assigning each object to the class of the center closest to it; here, the objects being clustered are the feature vectors.
  • the process of clustering generally includes: setting the number of cluster centers, clustering the feature vectors of the pictures, and dividing the pictures into groups whose number is the same as the number of cluster centers.
  • according to the clustering result, the cluster center corresponding to each group of pictures can be determined.
  • Step 404 Determine a cluster center corresponding to each group of pictures, and determine a distance between the cluster center corresponding to each group of pictures and the reference center.
  • the cluster center of each group of pictures represents the overall feature of the group, and the reference center O can be calculated based on the cluster center corresponding to each group of pictures.
  • Step 405 Sort the distances between the cluster centers corresponding to the groups of pictures and the reference center from large to small, and determine the M groups of pictures with the largest distances, where M is a positive integer; delete the M groups of pictures from the first picture set to obtain a second picture set.
  • the farther a cluster center is from the reference center, the larger the probability that the group of pictures corresponding to that cluster center consists of garbage pictures.
  • therefore, the distances between the cluster centers of the groups of pictures and the reference center are sorted from largest to smallest, and the M groups of pictures with the largest distances are deleted from the first picture set, so that a second picture set of a more uniform type can be obtained.
  • for example, the cluster centers corresponding to 5 groups of pictures are O1, O2, O3, O4, and O5, and the distances between these 5 cluster centers and the reference center are S1, S2, S3, S4, and S5, respectively.
  • sorted from large to small, the order is S2, S4, S3, S5, S1; if two groups of pictures need to be deleted, the two groups of pictures corresponding to O2 and O4 are deleted from the first picture set.
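  • a minimal sketch of this top-M variant follows, with illustrative numeric distances whose ordering matches the example above (the values themselves are assumptions).

```python
# Minimal sketch: find the M cluster centers farthest from the reference center.
import numpy as np

# Illustrative distances S1..S5 of cluster centers O1..O5 from the reference
# center; their ordering S2 > S4 > S3 > S5 > S1 matches the example above.
distances = np.array([0.3, 2.0, 1.1, 1.6, 0.8])   # S1, S2, S3, S4, S5
M = 2                                              # number of groups to delete

farthest = np.argsort(distances)[::-1][:M]         # indices of the M largest distances
print([f"O{i + 1}" for i in farthest])             # -> ['O2', 'O4']
```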
  • FIG. 5 is a schematic flowchart diagram of a picture screening method according to an embodiment of the present disclosure. As shown in FIG. 5, the picture screening method includes the following steps:
  • Step 501 Acquire a keyword and crawl a picture matching the keyword to form a first picture set.
  • Step 502 Extract feature vectors of respective pictures in the first picture set.
  • Step 503 Set the number of cluster centers to N.
  • Step 504 Cluster feature vectors of respective pictures, and divide each picture into N groups based on the clustering result.
  • Step 505 Determine a cluster center corresponding to each group of pictures based on the clustering result, and calculate a reference center based on each cluster center.
  • Step 506 Calculate the distance between each cluster center and the reference center.
  • Step 507 Sort the distance between each cluster center and the reference center from large to small.
  • Step 508 Delete, from the first picture set, the M groups of pictures corresponding to the M cluster centers farthest from the reference center, to obtain a second picture set.
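  • steps 501 to 508 can be read as a single pipeline. The sketch below strings the earlier pieces together; the feature vectors are taken as given (the description does not fix the crawler or the feature-extraction model), and the values of N and M are assumptions to be chosen per application.

```python
# Minimal end-to-end sketch of steps 503-508 of the picture screening method.
import numpy as np
from sklearn.cluster import KMeans

def screen_pictures(feature_vectors: np.ndarray, n_clusters: int, m_delete: int):
    """Return indices of the pictures kept in the second picture set.

    feature_vectors: one row per picture of the first picture set (step 502).
    n_clusters:      number N of cluster centers (step 503).
    m_delete:        number M of farthest groups to delete (step 508).
    """
    # Steps 504-505: cluster the feature vectors into N groups and take the
    # average of the cluster centers as the reference center.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = kmeans.fit_predict(feature_vectors)
    centers = kmeans.cluster_centers_
    reference_center = centers.mean(axis=0)

    # Steps 506-507: distance of each cluster center from the reference center,
    # considered from largest to smallest.
    distances = np.linalg.norm(centers - reference_center, axis=1)
    farthest_groups = set(np.argsort(distances)[::-1][:m_delete])

    # Step 508: drop every picture whose group is among the M farthest ones.
    return [i for i, g in enumerate(labels) if g not in farthest_groups]

# Toy run with random vectors standing in for real CNN feature vectors.
vectors = np.random.rand(200, 128)
kept = screen_pictures(vectors, n_clusters=10, m_delete=2)
print(len(kept), "pictures kept in the second picture set")
```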
  • FIG. 6 is a first schematic structural diagram of a picture screening apparatus according to an embodiment of the present application. As shown in FIG. 6, the picture screening apparatus includes:
  • the obtaining unit 601 is configured to acquire a first picture set.
  • the extracting unit 602 is configured to extract a feature vector of each picture in the first picture set
  • the grouping unit 603 is configured to group each picture in the first picture set into a packet based on a feature vector of each picture in the first picture set;
  • the distance determining unit 604 is configured to determine a cluster center corresponding to each group of pictures, and determine a distance between the cluster center corresponding to each group of pictures and the reference center;
  • the filtering unit 605 is configured to delete one or more sets of pictures that meet the preset condition from the first picture set based on the distance between the cluster center corresponding to the group of pictures and the reference center, to obtain a second picture set.
  • FIG. 7 is a second schematic structural diagram of a picture screening apparatus according to an embodiment of the present application. As shown in FIG. 7, the picture screening apparatus includes:
  • the obtaining unit 701 is configured to acquire a first picture set.
  • the extracting unit 702 is configured to extract a feature vector of each picture in the first picture set
  • the grouping unit 703 is configured to group each picture in the first picture set into a packet based on a feature vector of each picture in the first picture set;
  • the distance determining unit 704 is configured to determine a cluster center corresponding to each group of pictures, and determine a distance between the cluster center corresponding to each group of pictures and the reference center;
  • the filtering unit 705 is configured to delete one or more sets of pictures that meet the preset condition from the first picture set based on the distance between the cluster center corresponding to the group of pictures and the reference center, to obtain a second picture set.
  • the grouping unit 703 is configured to cluster the feature vectors of each picture in the first picture set, and group each picture in the first picture set into a group based on the clustering result.
  • the grouping unit 703 includes:
  • a setting subunit, configured to set the number of cluster centers;
  • the clustering subunit 7032 is configured to cluster the feature vectors of the respective pictures in the first picture set;
  • the dividing subunit 7033 is configured to group each picture in the first picture set into a group, wherein the number of groups is the same as the number of cluster centers.
  • the grouping unit 703 is further configured to determine a cluster center corresponding to each group of pictures based on the clustering result.
  • the device further includes:
  • the reference center calculation unit 706 is configured to calculate the reference center based on the cluster centers corresponding to the groups of pictures.
  • the filtering unit 705 is configured to delete, from the first picture set, one or more groups of pictures whose cluster centers are at a distance greater than or equal to a preset threshold from the reference center, to obtain the second picture set.
  • the filtering unit 705 is configured to sort the distances between the cluster centers corresponding to the groups of pictures and the reference center from large to small, determine the M groups of pictures with the largest distances, where M is a positive integer, and delete the M groups of pictures from the first picture set to obtain the second picture set.
  • the apparatus described above in the embodiments of the present application may also be stored in a computer readable storage medium if it is implemented in the form of a software function module and sold or used as a stand-alone product. Based on such an understanding, the technical solutions of the embodiments of the present application may, in essence, be embodied in the form of a software product stored in a storage medium, including a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read only memory (ROM), a magnetic disk, or an optical disk.
  • embodiments of the present application are not limited to any particular combination of hardware and software.
  • the embodiment of the present application further provides a storage medium having computer-executable instructions stored thereon, which, when executed by a processor, implement the picture screening method of the embodiments of the present application described above.
  • FIG. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • the computer device includes a memory 801, a processor 802, and computer executable instructions stored in the memory 801 and executable on the processor 802; the processor 802 implements the picture screening method described above when executing the computer executable instructions.
  • the disclosed method and smart device may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purposes of the solutions of the embodiments.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may be used separately as one unit, or two or more units may be integrated into one unit;
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A picture screening method and apparatus, a storage medium and a computer device are provided. The method comprises: acquiring a first picture set (201); extracting a feature vector of each picture in the first picture set (202); grouping each picture in the first picture set into groups on the basis of the feature vector of each picture in the first picture set (203); determining a cluster center corresponding to each group of pictures and determining the distance between the cluster center corresponding to each group of pictures and a reference center (204); and, on the basis of the distance between the cluster center corresponding to each group of pictures and the reference center, deleting from the first picture set one or more groups of pictures that meet a preset condition, so as to obtain a second picture set (205).
PCT/CN2018/122841 2018-01-09 2018-12-21 Picture screening method and apparatus, storage medium and computer device WO2019137185A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810017485.3 2018-01-09
CN201810017485.3A CN108228844B (zh) 2018-01-09 2018-01-09 一种图片筛选方法及装置、存储介质、计算机设备

Publications (1)

Publication Number Publication Date
WO2019137185A1 true WO2019137185A1 (fr) 2019-07-18

Family

ID=62640221

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/122841 WO2019137185A1 (fr) 2018-12-21 2019-07-18 Picture screening method and apparatus, storage medium and computer device

Country Status (2)

Country Link
CN (1) CN108228844B (fr)
WO (1) WO2019137185A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348107A (zh) * 2020-11-17 2021-02-09 百度(中国)有限公司 图像数据清洗方法及装置、电子设备和介质
CN112783883A (zh) * 2021-01-22 2021-05-11 广东电网有限责任公司东莞供电局 一种多源数据接入下电力数据标准化清洗方法和装置
CN117953252A (zh) * 2024-03-26 2024-04-30 贵州道坦坦科技股份有限公司 高速公路资产数据自动化采集方法及***

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228844B (zh) * 2018-01-09 2020-10-27 美的集团股份有限公司 一种图片筛选方法及装置、存储介质、计算机设备
CN110377774B (zh) * 2019-07-15 2023-08-01 腾讯科技(深圳)有限公司 进行人物聚类的方法、装置、服务器和存储介质
CN110377775A (zh) * 2019-07-26 2019-10-25 Oppo广东移动通信有限公司 一种图片审核方法及装置、存储介质
CN110929764A (zh) * 2019-10-31 2020-03-27 北京三快在线科技有限公司 图片审核方法和装置,电子设备及存储介质
CN111309948A (zh) * 2020-02-14 2020-06-19 北京旷视科技有限公司 图片筛选方法、图片筛选装置以及电子设备
CN113255694B (zh) * 2021-05-21 2022-11-11 北京百度网讯科技有限公司 训练图像特征提取模型和提取图像特征的方法、装置
CN114549883B (zh) * 2022-02-24 2023-09-05 北京百度网讯科技有限公司 图像处理方法、深度学习模型的训练方法、装置和设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101556646A (zh) * 2009-05-20 2009-10-14 电子科技大学 一种基于核聚类的虹膜分类方法
CN102129568A (zh) * 2011-04-29 2011-07-20 南京邮电大学 利用改进的高斯混合模型分类器检测图像垃圾邮件的方法
CN104036259A (zh) * 2014-06-27 2014-09-10 北京奇虎科技有限公司 人脸相似度识别方法和***
CN108228844A (zh) * 2018-01-09 2018-06-29 美的集团股份有限公司 一种图片筛选方法及装置、存储介质、计算机设备

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7567960B2 (en) * 2006-01-31 2009-07-28 Xerox Corporation System and method for clustering, categorizing and selecting documents
CN101211341A (zh) * 2006-12-29 2008-07-02 上海芯盛电子科技有限公司 图像智能模式识别搜索方法
CN101295305B (zh) * 2007-04-25 2012-10-31 富士通株式会社 图像检索装置
CN100593785C (zh) * 2008-05-30 2010-03-10 清华大学 一种基于多特征相关反馈的三维模型检索方法
CN101464946B (zh) * 2009-01-08 2011-05-18 上海交通大学 基于头部识别和跟踪特征的检测方法
CN101576913B (zh) * 2009-06-12 2011-09-21 中国科学技术大学 基于自组织映射神经网络的舌象自动聚类、可视化和检索方法
CN101576932B (zh) * 2009-06-16 2012-07-04 阿里巴巴集团控股有限公司 近重复图片的计算机查找方法和装置
CN101853491B (zh) * 2010-04-30 2012-07-25 西安电子科技大学 基于并行稀疏谱聚类的sar图像分割方法
CN101859326B (zh) * 2010-06-09 2012-04-18 南京大学 一种图像检索方法
CN103294813A (zh) * 2013-06-07 2013-09-11 北京捷成世纪科技股份有限公司 一种敏感图片搜索方法和装置
CN103488689B (zh) * 2013-09-02 2017-09-12 新浪网技术(中国)有限公司 基于聚类的邮件分类方法和***
CN106021362B (zh) * 2016-05-10 2018-04-13 百度在线网络技术(北京)有限公司 查询式的图片特征表示的生成、图片搜索方法和装置
CN107423297A (zh) * 2016-05-23 2017-12-01 中兴通讯股份有限公司 图片的筛选方法及装置
CN106777007A (zh) * 2016-12-07 2017-05-31 北京奇虎科技有限公司 相册分类优化方法、装置及移动终端
CN107341190B (zh) * 2017-06-09 2021-01-22 努比亚技术有限公司 图片筛选方法、终端及计算机可读存储介质

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348107A (zh) * 2020-11-17 2021-02-09 百度(中国)有限公司 图像数据清洗方法及装置、电子设备和介质
CN112783883A (zh) * 2021-01-22 2021-05-11 广东电网有限责任公司东莞供电局 一种多源数据接入下电力数据标准化清洗方法和装置
CN117953252A (zh) * 2024-03-26 2024-04-30 贵州道坦坦科技股份有限公司 高速公路资产数据自动化采集方法及***
CN117953252B (zh) * 2024-03-26 2024-05-31 贵州道坦坦科技股份有限公司 高速公路资产数据自动化采集方法及***

Also Published As

Publication number Publication date
CN108228844A (zh) 2018-06-29
CN108228844B (zh) 2020-10-27

Similar Documents

Publication Publication Date Title
WO2019137185A1 (fr) Picture screening method and apparatus, storage medium and computer device
Li et al. Factorizable net: an efficient subgraph-based framework for scene graph generation
Wang et al. Effective multi-query expansions: Collaborative deep networks for robust landmark retrieval
US10482146B2 (en) Systems and methods for automatic customization of content filtering
CN109783671B (zh) 一种以图搜图的方法、计算机可读介质及服务器
CN107209860A (zh) 使用分块特征来优化多类图像分类
CN113661487A (zh) 使用机器训练词条频率加权因子的产生密集嵌入向量的编码器
CN110751027B (zh) 一种基于深度多示例学习的行人重识别方法
CN108595688A (zh) 基于在线学习的潜在语义跨媒体哈希检索方法
CN111080551B (zh) 基于深度卷积特征和语义近邻的多标签图像补全方法
US9639598B2 (en) Large-scale data clustering with dynamic social context
CN114298122B (zh) 数据分类方法、装置、设备、存储介质及计算机程序产品
CN113434716A (zh) 一种跨模态信息检索方法和装置
US20240193790A1 (en) Data processing method and apparatus, electronic device, storage medium, and program product
CN114238746A (zh) 跨模态检索方法、装置、设备及存储介质
JP6017277B2 (ja) 特徴ベクトルの集合で表されるコンテンツ間の類似度を算出するプログラム、装置及び方法
CN111191065B (zh) 一种同源图像确定方法及装置
CN114037008A (zh) 基于属性连边的多粒度属性网络嵌入的节点分类方法及***
CN103150574B (zh) 基于最邻近标签传播算法的图像型垃圾邮件检测方法
CN112434174A (zh) 多媒体信息的发布账号的识别方法、装置、设备及介质
CN111090743A (zh) 一种基于词嵌入和多值形式概念分析的论文推荐方法及装置
Lakshmi An efficient telugu word image retrieval system using deep cluster
JP2015097036A (ja) 推薦画像提示装置及びプログラム
Pertusa et al. Mirbot: A multimodal interactive image retrieval system
CN113515660B (zh) 基于三维张量对比策略的深度特征对比加权图像检索方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18899545

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18899545

Country of ref document: EP

Kind code of ref document: A1