CN113190698B - Paired picture set generation method and device, electronic equipment and storage medium


Info

Publication number
CN113190698B
CN113190698B (application CN202110468967.2A)
Authority
CN
China
Prior art keywords
picture
granularity
feature
paired
training sample
Prior art date
Legal status (assumed, not a legal conclusion)
Active
Application number
CN202110468967.2A
Other languages
Chinese (zh)
Other versions
CN113190698A (en
Inventor
龚震霆
Current Assignee (listing not legally verified)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (assumed, not a legal conclusion)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110468967.2A priority Critical patent/CN113190698B/en
Publication of CN113190698A publication Critical patent/CN113190698A/en
Application granted granted Critical
Publication of CN113190698B publication Critical patent/CN113190698B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval of still image data
    • G06F16/53 Querying
    • G06F16/55 Clustering; Classification
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval using metadata automatically derived from the content
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method, an apparatus, an electronic device and a storage medium for generating a paired picture set, and relates to the field of artificial intelligence, in particular to computer vision and deep learning. The scheme is as follows: acquire search information input by a user, generate a data set of pictures according to the search information, perform feature extraction on each picture in the data set to obtain multi-granularity category features of each picture, and pair pictures according to those features to generate a paired picture set. By mining a large amount of picture data from the search information users actually use and matching pictures on the extracted multi-granularity features, the method greatly expands the gallery of paired pictures, increases user satisfaction when obtaining paired pictures, stimulates users' interest in browsing, and provides a more accurate experience.

Description

Paired picture set generation method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to the technical field of computer vision and deep learning, and specifically relates to a paired picture set generation method, a paired picture set generation device, electronic equipment and a storage medium.
Background
Young people today have a strong desire for personalized expression and display their individuality on social network platforms, in particular by using distinctive avatars (head portraits) in their platform accounts. Many young couples want to express their relationship through the avatars of both accounts; avatars that indicate such a relationship are called couple (lover) avatars.
In the related art, however, such paired couple avatars are not easy to obtain: they must be manually created, designed or shot, which is inefficient.
Disclosure of Invention
The application provides a paired picture set generation method, apparatus, electronic device and storage medium for improving the efficiency of paired picture generation.
According to an aspect of the present application, there is provided a method for generating a paired picture set, including:
acquiring search information input by a user;
generating a data set of the picture according to the search information;
extracting the characteristics of each picture in the data set to obtain multi-granularity class characteristics of each picture, wherein the multi-granularity class characteristics indicate the granularity of the characteristic extraction;
and carrying out picture pairing according to the multi-granularity category characteristics of each picture so as to generate a paired picture set.
According to another aspect of the present application, there is provided a paired picture set generating apparatus, including:
the acquisition module is used for acquiring search information input by a user;
the generation module is used for generating a data set of the picture according to the search information;
the feature extraction module is used for carrying out feature extraction on each picture in the data set to obtain multi-granularity class features of each picture, wherein the multi-granularity class features indicate the granularity of feature extraction;
and the pairing module is used for carrying out picture pairing according to the multi-granularity category characteristics of each picture so as to generate a paired picture set.
According to another aspect of the present application, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect.
According to another aspect of the present application, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of the first aspect.
According to another aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
It should be understood that the description of this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
fig. 1 is a flow chart of a method for generating a pairing picture set according to an embodiment of the present application;
fig. 2 is a flowchart of another method for generating a paired picture set according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a paired picture set generating device according to an embodiment of the present application;
fig. 4 is a schematic block diagram of an electronic device 800 provided by an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The following describes a method, an apparatus, an electronic device, and a storage medium for generating a pairing picture set according to embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for generating a paired picture set according to an embodiment of the present application.
As shown in fig. 1, the method comprises the steps of:
step 101, acquiring search information input by a user.
The search information is the query a user enters into the search function of a search engine, and it is used to determine which pictures to retrieve; for example, when a user searches for couple paired images, the input search information is "couple avatar".
Step 102, generating a data set of the picture according to the search information.
In a first implementation manner of the embodiment of the present application, keywords are extracted from the search information and search pictures are obtained according to the keywords. A picture set is determined in response to the user's operations on the search pictures within a preset time, and key-value pair data is generated for each user from the user identifier, the preset time and the picture set, where the key represents the user identifier and the preset time and the value represents the picture set. The data set of pictures is generated from at least one such key-value pair. Because the candidate paired pictures are obtained from the user's own search operations, they carry an association relationship, which improves the quality of the obtained picture set.
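The key-value construction described here can be sketched as follows; the log format, the one-hour window and the helper names are illustrative assumptions, not taken from the patent.

```python
from collections import defaultdict

def build_kv_dataset(click_log, window_seconds=3600):
    """Group the pictures a user interacted with inside a preset time window.

    click_log: iterable of (user_id, timestamp, picture_id) tuples.
    Returns a dict mapping (user_id, window_start) -> set of picture ids,
    i.e. key = user identifier plus time window, value = picture set.
    """
    kv = defaultdict(set)
    for user_id, ts, pic_id in click_log:
        window_start = ts - (ts % window_seconds)  # bucket by the preset time
        kv[(user_id, window_start)].add(pic_id)
    return dict(kv)

log = [
    ("u1", 100, "a.jpg"), ("u1", 200, "b.jpg"),  # same window: same picture set
    ("u1", 4000, "c.jpg"),                       # later window: new key
    ("u2", 150, "d.jpg"),
]
dataset = build_kv_dataset(log)
```

Pictures grouped under one key were touched by the same user in the same window, which is what gives the candidate pairs their association relationship.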
In a second implementation manner of the embodiment of the present application, at least one target text is obtained according to the search information, where the target text contains the search information (for example, a text published on a website that includes the search information), or the semantics of the title or body of the target text are the same as those of the search information. For each target text, at least one picture in the text is determined and a picture set is obtained from those pictures. Key-value pair data is then generated from the picture set, where the key represents the semantic information of the target text, obtained from the title of the target text or determined from its body, and the value represents the picture set in the target text. The data set of pictures is generated from at least one such key-value pair. Because related texts are matched from the user's search information and candidate paired pictures with an association relationship are obtained from those texts, the quality of the obtained picture set is improved.
As a third possible implementation manner of the embodiment of the present application, the picture data sets determined in the first and second implementation manners are combined into one data set. This increases the number of pictures in the data set, expands the pool of subsequent paired pictures, and thus improves the efficiency of obtaining paired pictures.
It should be noted that, in the technical solution of the present application, the acquisition, storage and application of the user's personal information all comply with the relevant laws and regulations and do not violate public order and good customs.
Step 103, extracting features from each picture in the data set to obtain multi-granularity category features of each picture.
The multi-granularity category features indicate the granularity of the feature extraction; that is, different categories of features indicate different granularities of feature extraction.
In this embodiment, features of various granularity categories, for example coarser-granularity features and finer-granularity features, are extracted for each picture in the acquired picture data set. Features of different granularities capture different amounts of information from a picture, so extracting them obtains picture information at multiple granularities and increases the amount of information available per picture.
Step 104, performing picture pairing according to the multi-granularity category features of each picture to generate a paired picture set.
In this embodiment, the obtained multi-granularity category features are sorted from coarse granularity to fine granularity, and pictures are matched against each other on the sorted category features to generate the paired pictures. Matching on features of multiple categories improves the reliability of paired picture generation and the satisfaction with the resulting pairs.
In the paired picture set generation method of this embodiment, search information input by a user is acquired, a data set of pictures is generated from the search information, features are extracted from each picture in the data set to obtain its multi-granularity category features, and pictures are paired according to those features to generate a paired picture set. A large amount of picture data is thus obtained from the information users actually search with, and paired pictures are matched on the extracted multi-granularity features. This greatly expands the gallery of paired pictures, increases user satisfaction when obtaining paired pictures, stimulates users' interest in browsing, and provides a more accurate experience.
Based on the above embodiments, another implementation manner is provided in which the multi-granularity category features include a first-granularity category feature, a second-granularity-category global feature and a second-granularity-category local feature of the picture, so that paired pictures are determined according to the category feature, the global feature and the local feature of the picture.
Fig. 2 is a flowchart of another method for generating a paired picture set according to an embodiment of the present application, as shown in fig. 2, the method includes the following steps:
Step 201, obtaining search information input by a user.
Step 202, generating a data set of the picture according to the search information.
Steps 201 and 202 follow the same principle as explained in the embodiment above, and the details are not repeated here.
Step 203, inputting each picture in the dataset into the trained first feature extraction model to generate a first granularity category feature of each picture.
In this embodiment, the first feature extraction model is trained on a first training sample and a second training sample. The first training sample consists of first pictures corresponding to a plurality of cluster categories obtained by performing a first clustering on the pictures in an existing picture library: the pictures in an existing material database are clustered according to a preset cluster category and number, giving first pictures for a plurality of mutually distinct cluster categories. For convenience, the granularity of these cluster categories is called the first granularity; for example, the first-granularity cluster categories may be animal, landscape, cartoon, person and so on, which are not listed one by one here. After labeling, the clustered first training samples are used for the first-stage training of the first feature extraction model, so that the trained model can identify the first-granularity category of a picture. The first feature extraction model contains a feature extraction layer for extracting first-granularity category features, and the output of that layer is taken as the first-granularity category feature.
The second training sample is the picture data set and is used for the second-stage training of the first feature extraction model. In the second stage, the model obtained from the first stage is trained on the scene-specific picture data set, which fine-tunes it so that the fine-tuned model fits the feature extraction requirements of the corresponding scene. Because only a small number of second training samples are needed for this fine-tuning, the training effect of the first feature extraction model is improved at low cost.
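A minimal sketch of the stage-one pseudo-labelling idea: cluster library pictures into coarse categories and reuse the cluster ids as labels for training the feature model. The patent does not specify a clustering algorithm; the tiny k-means below and its deterministic initialisation are illustrative only.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def kmeans_pseudo_labels(features, k, iters=10):
    """Stage one: cluster pictures into k coarse categories
    (e.g. animal / landscape / cartoon) and reuse the cluster ids
    as pseudo-labels for training the feature-extraction model."""
    # deterministic initialisation for the sketch; a real job would
    # use a proper initialiser such as k-means++
    step = max(1, (len(features) - 1) // (k - 1)) if k > 1 else 1
    centers = [list(features[min(i * step, len(features) - 1)]) for i in range(k)]
    labels = [0] * len(features)
    for _ in range(iters):
        # assign every picture to its nearest center (Euclidean distance)
        labels = [min(range(k), key=lambda j: euclidean(f, centers[j]))
                  for f in features]
        for j in range(k):
            members = [f for f, l in zip(features, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# two well-separated groups stand in for two coarse picture categories
feats = [[0.0, 0.0]] * 5 + [[10.0, 10.0]] * 5
labels = kmeans_pseudo_labels(feats, k=2)
```

After this labelling step, a classifier trained on the pseudo-labels yields the feature extraction layer whose output serves as the first-granularity category feature.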
It should be noted that, because its granularity is coarse, the first-granularity category feature can only identify membership of a broad category, for example the animal category or the cartoon category. That is, the extracted first-granularity category feature roughly indicates which broad category a picture belongs to: for a picture A containing a dog, the first-granularity category feature can determine that picture A shows an animal, but not whether it is a dog or a cat.
In the embodiment of the application, each picture in the data set is input into the trained first feature extraction model to generate its first-granularity category feature. As one possible implementation, the pictures in the data set are sorted and their features extracted with a mapper-reducer job on a Hadoop cluster, which reduces the computational load of extracting the first-granularity category feature of every picture. Using the trained first feature extraction model, the category features of the pictures are obtained efficiently and accurately.
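The distributed extraction step can be pictured as a standard map / shuffle-sort / reduce flow. The sketch below simulates it locally in plain Python, with `fake_extract` standing in for the trained first feature extraction model; in practice this would run as a Hadoop mapper-reducer job over the picture data set.

```python
from itertools import groupby
from operator import itemgetter

def fake_extract(pixels):
    # stand-in for the trained first feature-extraction model:
    # here the "feature" is just the mean pixel value
    return sum(pixels) / len(pixels)

def mapper(record):
    # map step: one record per picture, emit (picture_id, feature)
    pic_id, pixels = record
    yield (pic_id, fake_extract(pixels))

def reducer(pic_id, feat_values):
    # one feature per picture id; a real job might deduplicate here
    yield (pic_id, next(iter(feat_values)))

def run_job(records):
    """Minimal local simulation of the Hadoop mapper -> sort -> reducer flow."""
    mapped = [kv for rec in records for kv in mapper(rec)]
    mapped.sort(key=itemgetter(0))          # the cluster's shuffle/sort phase
    out = {}
    for key, group in groupby(mapped, key=itemgetter(0)):
        for k, v in reducer(key, (v for _, v in group)):
            out[k] = v
    return out

feats = run_job([("b.jpg", [2, 4]), ("a.jpg", [1, 3])])
```

The same flow applies unchanged to the second feature extraction model described below; only the extraction function differs.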
Step 204, inputting each picture in the dataset into the trained second feature extraction model to generate a second granularity class global feature and a second granularity class local feature of each picture.
In this embodiment, the second feature extraction model is trained on a third training sample and the second training sample. The third training sample consists of second pictures corresponding to a plurality of cluster categories obtained by performing a second clustering on the pictures in the existing picture library: for example, the pictures in the existing material database are clustered according to a preset cluster category and number, giving the second pictures contained in each of a plurality of mutually distinct cluster categories. For convenience, the granularity of this second clustering is called the second granularity; the second-granularity cluster categories may be, for example, cat, dog, golden retriever, poodle and so on, which are not listed one by one here. The third training sample is generated from the second pictures of these second-granularity categories and, after labeling, is used for the first-stage training of the second feature extraction model, so that the trained model can identify the second-granularity category of a picture. The second feature extraction model contains a feature extraction layer for extracting second-granularity category features; it comprises a first feature extraction layer, whose output is taken as the second-granularity-category global feature, and a second feature extraction layer, whose output is taken as the second-granularity-category local feature.
It should be noted that the second-granularity cluster categories are finer than the first-granularity cluster categories; for example, a second-granularity cluster category is the cat category while the corresponding first-granularity cluster category is the animal category, so that finer-grained feature extraction is realized.
The second training sample is the picture data set and is used for the second-stage training of the second feature extraction model. In the second stage, the model obtained from the first stage is trained on the scene-specific picture data set, which fine-tunes it so that the fine-tuned model fits the feature extraction requirements of the corresponding scene. Because only a small number of second training samples are needed for this fine-tuning, the training effect of the second feature extraction model is improved at low cost.
The second-granularity-category global feature indicates a category finer than the first-granularity category feature, and the second-granularity-category local feature indicates a category finer still. For example, for picture A, the extracted first-granularity category feature indicates that picture A shows an animal, the second-granularity-category global feature indicates that it shows a dog, and the second-granularity-category local feature indicates that it shows a golden retriever. Feature extraction at different granularities thus achieves accurate identification of the category of the object in the picture.
It should be appreciated that the dimensions of the second-granularity-category global feature and the first-granularity category feature may be the same, while the dimensions of the second-granularity-category local feature and the second-granularity-category global feature are usually different: the local features of objects in different pictures contain different amounts of information, so the dimensions of the extracted second-granularity-category local features also differ between pictures.
In the embodiment of the application, each picture in the data set is input into the trained second feature extraction model to generate its second-granularity category features, which comprise the second-granularity-category global feature and the second-granularity-category local feature. As one possible implementation, the second feature extraction model processes the pictures in the data set with a mapper-reducer job on a Hadoop cluster to extract the second-granularity-category global and local features of each picture; this reduces the computational load, and the trained second feature extraction model obtains the category features of the pictures efficiently and accurately.
Step 205, fusing the first granularity category feature of each picture, the second granularity category global feature of each picture and the second granularity category local feature to obtain the multi-granularity category feature of each picture.
In this embodiment, for each picture, the extracted first-granularity category feature, second-granularity-category global feature and second-granularity-category local feature are fused to obtain the picture's multi-granularity category feature, so that picture matching is performed on the multi-granularity category features and accurate paired pictures are obtained. For example, in a couple-picture pairing scene, highly matched couple paired pictures are obtained, which improves the accuracy of the paired pictures.
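The patent does not fix the fusion operator; one plausible reading is simple concatenation of the three feature groups, sketched below. Note that the local part may vary in length from picture to picture, as discussed above.

```python
def fuse_features(coarse, global_fine, local_fine):
    """Concatenate the first-granularity feature, the second-granularity
    global feature and the second-granularity local feature into one
    multi-granularity vector. Concatenation is an assumption here, not
    the patent's stated operator."""
    return list(coarse) + list(global_fine) + list(local_fine)

# coarse and global parts have fixed sizes; the local part may not
fused = fuse_features([0.1, 0.2], [0.3, 0.4], [0.5])
```

Keeping the three segments in a fixed order lets later matching stages slice out the granularity they need.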
Step 206, selecting candidate paired picture sets matched with the first granularity category features and/or the second granularity category global features for the picture sets in each key value pair.
The data set of pictures in this embodiment comprises a plurality of key-value pairs, each containing a corresponding picture set. Because the pictures within the picture set of one key-value pair already share similar characteristics, performing picture pairing within each key-value pair's picture set improves the efficiency of picture pairing.
In a first implementation manner of the embodiment of the present application, for the picture set in each key-value pair, the first feature similarity between any two pictures is calculated from the Euclidean distance between their first-granularity category features. The first feature similarities that lie between a first similarity threshold and a second similarity threshold are then selected, the corresponding candidate paired pictures are determined from them, and the candidate paired picture set is formed from those candidate paired pictures, where each picture appears in the candidate paired picture set only once. For example, in key-value pair K, similarity matching determines that the candidate pair indicated by first similarity feature X1 is pictures A and B, the pair indicated by X2 is pictures A and C, and the pair indicated by X3 is pictures C and B; the candidate paired picture set determined from these candidate pairs is then {A, B, C}.
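The threshold selection can be sketched as follows. Interpreting the two similarity thresholds as a band on the Euclidean distance is an assumption (the patent does not define the similarity function explicitly), and the threshold values are illustrative.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

def candidate_pairs(features, lower, upper):
    """Keep picture pairs whose coarse-feature distance falls between
    two thresholds: close enough to be stylistically matched, yet far
    enough apart not to be near-duplicates (band semantics assumed)."""
    pairs = []
    for (id_a, fa), (id_b, fb) in combinations(features.items(), 2):
        if lower <= dist(fa, fb) <= upper:
            pairs.append((id_a, id_b))
    return pairs

feats = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [5.0, 0.0]}
pairs = candidate_pairs(feats, lower=0.5, upper=2.0)
```

The candidate paired picture set is then the set of distinct picture ids appearing in `pairs`.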
In a second implementation manner of the embodiment of the present application, for the picture set in each key-value pair, the second feature similarity between any two pictures is calculated from the Euclidean distance between their second-granularity-category global features. The second feature similarities that lie between the second similarity threshold and a third similarity threshold are then selected, the corresponding candidate paired pictures are determined from them, and the candidate paired picture set is formed from those candidate paired pictures, where each picture appears in the set only once. For example, in key-value pair K, similarity matching determines that the candidate pair indicated by second similarity feature Y1 is pictures A and B and the pair indicated by Y2 is pictures A and C; the candidate paired picture set determined from these candidate pairs is then {A, B, C}.
In a third implementation manner of this embodiment, for the picture set corresponding to each key value, a second paired picture set is first selected by matching on the first-granularity category features; then, from each second paired picture set, a candidate paired picture set is selected by matching on the second-granularity-category global features. The feature matching method is the same in principle as in the first and second implementation manners and is not repeated here.
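The two-stage funnel of this third implementation manner (coarse first-granularity match, then second-granularity global match on the survivors) can be sketched as follows; the band semantics and values are again illustrative assumptions.

```python
from itertools import combinations
from math import dist

def in_band(a, b, band):
    lo, hi = band
    return lo <= dist(a, b) <= hi

def cascade_pairs(features, coarse_band, fine_band):
    """features: picture id -> (first-granularity vec, second-granularity
    global vec). Stage 1 yields the second paired picture set; stage 2
    narrows it to the candidate paired picture set."""
    stage1 = [(a, b)
              for (a, (ca, _)), (b, (cb, _)) in combinations(features.items(), 2)
              if in_band(ca, cb, coarse_band)]
    return [(a, b) for a, b in stage1
            if in_band(features[a][1], features[b][1], fine_band)]

feats = {
    "A": ([0.0], [0.0]),
    "B": ([1.0], [0.5]),   # close to A at both granularities
    "C": ([1.0], [9.0]),   # coarse match with A, but too far on the global feature
}
pairs = cascade_pairs(feats, coarse_band=(0.5, 2.0), fine_band=(0.2, 1.0))
```

Filtering coarse-to-fine keeps the expensive finer comparisons restricted to pairs that already match broadly.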
In this embodiment, matching of paired pictures is performed through the first granularity category feature and/or the second granularity category global feature, so as to obtain a candidate paired picture set, and accuracy of picture pairing is improved.
Step 207, selecting a first paired picture set matched with the local features of the second granularity class according to the candidate paired picture set corresponding to each key value.
In this embodiment, for the picture set corresponding to each key value, the candidate paired picture set is determined after similarity matching on the first-granularity category feature and the second-granularity-category global feature. Then, for any two pictures in the candidate paired picture set, the first paired picture set is determined from the second-granularity-category local features based on a similarity distance, for example the Euclidean distance, where the first paired picture set contains a plurality of paired pictures.
In this embodiment, paired pictures are matched through category features of multiple granularities to obtain a first paired picture set, thereby improving the accuracy of picture pairing.
Step 208, generating a paired picture set according to the plurality of first paired picture sets.
Further, for the first paired picture sets corresponding to the plurality of key values, the paired pictures in the plurality of first paired picture sets are combined to obtain the paired picture set. The paired picture set is thus greatly expanded for the corresponding scene, and paired pictures no longer need to be generated manually when a user needs to acquire the corresponding paired pictures, which reduces the cost of generating the paired picture set, improves the efficiency of acquiring paired pictures, fully stimulates the browsing interest of the user, and provides the user with a more accurate basic experience effect.
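Step 208 amounts to taking the union of the per-key pair lists while counting each unordered pair once; a minimal sketch (the picture identifiers are illustrative):

```python
def merge_paired_sets(paired_sets):
    """Combine per-key first paired picture sets into one paired picture
    set, dropping duplicate pairs regardless of order (a sketch of
    step 208; the pair representation is an assumption)."""
    merged = set()
    for pairs in paired_sets:
        for a, b in pairs:
            merged.add(tuple(sorted((a, b))))  # (A, B) and (B, A) count once
    return sorted(merged)

set_k1 = [("A", "B"), ("A", "C")]
set_k2 = [("B", "A"), ("C", "D")]  # ("B", "A") duplicates ("A", "B")
print(merge_paired_sets([set_k1, set_k2]))  # → [('A', 'B'), ('A', 'C'), ('C', 'D')]
```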
According to the method for generating the paired picture set, search information input by a user is acquired, a data set of pictures is generated according to the search information, feature extraction is carried out on each picture in the data set to obtain multi-granularity category features of each picture, and the paired picture set is generated according to the multi-granularity category features of each picture.
Based on the above embodiments, after the picture data set is generated (that is, after steps 102 and 202 above), the generated data set may be further processed, for example cleaned and deduplicated, in order to improve the quality of the pictures it contains.
As a possible implementation manner, since a picture of smaller size has lower quality, and low-quality pictures affect the user experience, the pictures in the data set need to be cleaned: each picture in the data set is compared with a preset picture size threshold, and pictures smaller than the size threshold are deleted, so that a data set with improved picture quality is obtained.
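A minimal sketch of the size-based cleaning; the record layout and the 200-pixel thresholds are assumptions for illustration:

```python
def clean_small_pictures(pictures, min_width, min_height):
    """Remove low-quality pictures below a preset size threshold
    (a sketch; picture records and threshold values are assumptions)."""
    return [p for p in pictures
            if p["width"] >= min_width and p["height"] >= min_height]

dataset = [{"id": "A", "width": 800, "height": 600},
           {"id": "B", "width": 64, "height": 48},   # below threshold, deleted
           {"id": "C", "width": 1024, "height": 768}]
kept = clean_small_pictures(dataset, 200, 200)
print([p["id"] for p in kept])  # → ['A', 'C']
```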
As another possible implementation manner, a picture classifier is adopted to screen out sensitive pictures that do not meet preset requirements, wherein a sensitive picture may be determined based on the requirements of laws and regulations, which is not limited in this embodiment.
As another possible implementation manner, after the map operation in the map-reduce framework of the hadoop cluster is performed, all key-value pairs are obtained and the data are sorted by key, so that identical key-value pairs are arranged together; among the identical key-value pairs, the one ranked first is selected, thereby obtaining the picture data set. This screens out and removes identical key-value pairs, reducing the amount of data contained in the data set, reducing the amount of computation in subsequent pairing, and improving pairing efficiency.
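The sort-then-keep-first deduplication can be sketched in a single process with `itertools.groupby`; in the embodiment this corresponds to the shuffle/sort phase of a hadoop map-reduce job, and the key format used here is an assumption:

```python
from itertools import groupby
from operator import itemgetter

def deduplicate_key_values(records):
    """Sort key-value records so identical keys are adjacent, then keep
    only the first record per key (a single-process sketch of the
    map-reduce deduplication described above)."""
    records = sorted(records, key=itemgetter(0))  # shuffle/sort phase
    return [next(group) for _, group in groupby(records, key=itemgetter(0))]

records = [("user1|2021-04", {"A", "B"}),
           ("user2|2021-04", {"C"}),
           ("user1|2021-04", {"A", "B"})]  # identical key, removed
deduped = deduplicate_key_values(records)
print(len(deduped))  # → 2
```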
In order to implement the foregoing embodiments, the present application provides a paired picture set generating device. Fig. 3 is a schematic structural diagram of the paired picture set generating device provided in the embodiments of the present application; as shown in fig. 3, the device includes:
the acquiring module 31 is configured to acquire search information input by a user.
The generating module 32 is configured to generate a dataset of pictures according to the search information.
The feature extraction module 33 is configured to perform feature extraction on each picture in the dataset to obtain a multi-granularity class feature of each picture, where the multi-granularity class feature indicates a granularity of feature extraction.
The pairing module 34 is configured to pair the pictures according to the multi-granularity category feature of each picture, so as to generate a paired picture set.
Further, as a possible implementation manner, the generating module 32 is specifically configured to:
acquiring a search picture according to the search information;
in a preset time, responding to the user operation of searching the pictures, and determining a picture set;
generating key value pair data according to the user identification, the preset time and the picture set for each user; wherein, the keys in the key value pair represent the user identifier and the preset time, and the value represents the picture set;
and generating a data set of the picture according to the at least one key value pair data.
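The key-value construction performed by the generating module can be sketched as follows; the `"|"` key separator and the field layout are assumptions for illustration:

```python
def build_key_value_data(user_id, time_window, picture_set):
    """Build one key-value record: the key encodes the user identifier
    and the preset time, the value is the picture set (the "|"
    separator is an illustrative assumption)."""
    return {"key": f"{user_id}|{time_window}", "value": sorted(picture_set)}

record = build_key_value_data("user42", "2021-04-28T10:00/11:00", {"B", "A"})
print(record)  # → {'key': 'user42|2021-04-28T10:00/11:00', 'value': ['A', 'B']}
```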
As another possible implementation manner, the generating module 32 is specifically configured to:
acquiring at least one target text according to the search information;
generating key value pair data according to a picture set in each target text; wherein, the keys in the key value pair represent semantic information of the target text, and the values represent picture sets in the target text;
and generating a data set of the picture according to the at least one key value pair data.
As a possible implementation manner, the feature extraction module 33 is specifically configured to:
inputting each picture in the dataset into a trained first feature extraction model to generate a first granularity class feature of each picture;
inputting each picture in the dataset into a trained second feature extraction model to generate a second granularity class global feature and a second granularity class local feature of each picture;
and fusing the first granularity category characteristic of each picture, the second granularity category global characteristic of each picture and the second granularity category local characteristic to obtain the multi-granularity category characteristic of each picture.
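One plausible way to fuse the three features into the multi-granularity category feature is plain concatenation; the embodiment does not fix a specific fusion operator, so this choice is an assumption:

```python
import numpy as np

def fuse_multi_granularity(first, second_global, second_local):
    """Fuse the first granularity category feature, the second granularity
    category global feature, and the second granularity category local
    feature into one multi-granularity vector (concatenation is an
    assumed fusion strategy)."""
    return np.concatenate([first, second_global, second_local])

first = np.array([0.1, 0.2])              # coarse-grained category feature
second_global = np.array([0.3, 0.4, 0.5]) # fine-grained global feature
second_local = np.array([0.6])            # fine-grained local feature
fused = fuse_multi_granularity(first, second_global, second_local)
print(fused.shape)  # → (6,)
```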
As one possible implementation, the pairing module 34 includes:
the first pairing unit is used for selecting candidate paired picture sets matched with the first granularity category characteristics and/or the second granularity category global characteristics for the picture sets in each key value pair;
the second pairing unit is used for selecting a first pairing picture set matched with the local features of the second granularity class for each candidate pairing picture set;
and the generation unit is used for generating the paired picture sets according to the plurality of first paired picture sets.
As a possible implementation manner, the first pairing unit is specifically configured to:
selecting a second paired picture set matched with the first granularity class characteristics for the picture set in each key value pair;
and selecting candidate paired picture sets matched with the global features of the second granularity category from each second paired picture set.
As a possible implementation manner, the first feature extraction model is obtained by training a first training sample and a second training sample;
the first training sample is used for training the first feature extraction model in a first stage; the first training sample is a first picture obtained by performing first clustering on pictures in an existing picture library;
the second training sample is used for training the first feature extraction model in a second stage; wherein the second training sample is the picture dataset.
As a possible implementation manner, the second feature extraction model is obtained by training a third training sample and a second training sample;
the third training sample is used for training the second feature extraction model in the first stage; the third training sample is a second picture obtained by performing second clustering on pictures in the existing picture library, wherein the number of categories of the second cluster is larger than that of the first cluster;
The second training sample is used for training the second feature extraction model in a second stage; wherein the second training sample is the picture dataset.
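The two-stage training data layout for both feature extraction models can be sketched as follows; the cluster counts and the plan structure are illustrative assumptions, and the embodiment does not specify an optimiser or loss:

```python
def build_training_plan(library_pictures, picture_dataset,
                        first_cluster_k, second_cluster_k):
    """Lay out the two-stage training data for both feature extraction
    models: stage one uses pictures pseudo-labelled by clustering the
    existing picture library (the second clustering uses more categories,
    i.e. finer granularity), stage two fine-tunes on the picture dataset
    itself. A data-preparation sketch; cluster counts are illustrative."""
    assert second_cluster_k > first_cluster_k  # finer second clustering
    return {
        "first_model": [("stage1", library_pictures, first_cluster_k),
                        ("stage2", picture_dataset, None)],
        "second_model": [("stage1", library_pictures, second_cluster_k),
                         ("stage2", picture_dataset, None)],
    }

plan = build_training_plan(["lib1", "lib2"], ["d1", "d2"], 100, 1000)
print(plan["second_model"][0][2])  # → 1000
```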
It should be noted that the explanation of the method embodiment is also applicable to the apparatus of this embodiment, and the principle is the same, and will not be repeated in this embodiment.
In the paired picture set generating device provided by the embodiment, search information input by a user is acquired, a picture data set is generated according to the search information, feature extraction is carried out on each picture in the data set to obtain multi-granularity category features of each picture, and picture pairing is carried out according to the multi-granularity category features of each picture to generate paired picture sets.
In order to achieve the above embodiments, an embodiment of the present application provides an electronic device, including:
At least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the preceding method embodiment.
In order to implement the above embodiments, the present application provides a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method of the foregoing method embodiments.
In order to implement the above embodiments, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the method embodiments described above.
Fig. 4 is a schematic block diagram of an electronic device 800 provided by an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 4, the apparatus 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a ROM (Read-Only Memory) 802 or loaded from a storage unit 808 into a RAM (Random Access Memory) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An I/O (Input/Output) interface 805 is also connected to the bus 804.
Various components in device 800 are connected to I/O interface 805, including: an input unit 806 such as a keyboard, mouse, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, DSPs (Digital Signal Processors), and any suitable processors, controllers, microcontrollers, and the like. The computing unit 801 performs the respective methods and processes described above, for example the paired picture set generation method. For example, in some embodiments, the paired picture set generation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the paired picture set generation method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the paired picture set generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SOCs (Systems On Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present application may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that, when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this application, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, RAM, ROM, EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (Cathode-Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service expansibility found in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server combined with a blockchain.
It should be noted that artificial intelligence is the discipline of making computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), involving technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning, big data processing technology, knowledge graph technology, and the like.
It should be appreciated that steps may be reordered, added, or deleted in the various forms of flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (14)

1. A method of generating a paired picture set, comprising:
acquiring search information input by a user;
generating a picture data set according to the search information; the picture data set comprises a plurality of picture sets in key value pairs;
inputting each picture in the dataset into a trained first feature extraction model to generate a first granularity class feature of each picture;
inputting each picture in the dataset into a trained second feature extraction model to generate a second granularity class global feature and a second granularity class local feature of each picture;
fusing the first granularity category characteristic of each picture, the second granularity category global characteristic of each picture and the second granularity category local characteristic of each picture to obtain the multi-granularity category characteristic of each picture, wherein the multi-granularity category characteristic indicates the granularity of feature extraction; wherein the granularity of the first granularity category feature is coarser than the granularity of the second granularity category feature;
selecting candidate paired picture sets matched with the first granularity category characteristics and/or the second granularity category global characteristics for the picture sets in each key value pair;
selecting a first paired picture set matched with the local features of the second granularity class for each candidate paired picture set;
and generating a pairing picture set according to the plurality of first pairing picture sets.
2. The method of claim 1, wherein the generating a picture dataset from the search information comprises:
acquiring a search picture according to the search information;
in a preset time, responding to the user operation of searching the pictures, and determining a picture set;
generating key value pair data according to the user identification, the preset time and the picture set for each user; wherein, the keys in the key value pair represent the user identifier and the preset time, and the value represents the picture set;
and generating the picture data set according to at least one key value pair data.
3. The method of claim 1, wherein the generating a picture dataset from the search information comprises:
acquiring at least one target text according to the search information;
generating key value pair data according to a picture set in each target text; wherein, the keys in the key value pair represent semantic information of the target text, and the values represent picture sets in the target text;
and generating the picture data set according to at least one key value pair data.
4. The method according to claim 1, wherein selecting candidate paired picture sets with matching first granularity category features and/or second granularity category global features for the picture sets in each key pair comprises:
selecting a second paired picture set matched with the first granularity class characteristics for the picture set in each key value pair;
and selecting candidate paired picture sets matched with the global features of the second granularity category for each second paired picture set.
5. The method of claim 1, wherein the first feature extraction model is trained using a first training sample and a second training sample;
the first training sample is used for training the first feature extraction model in a first stage; the first training sample is obtained by carrying out first clustering on pictures in an existing picture library;
the second training sample is used for training the first feature extraction model in a second stage; wherein the second training sample is the picture dataset.
6. The method of claim 5, wherein the second feature extraction model is trained using a third training sample and a second training sample;
The third training sample is used for training the second feature extraction model in the first stage; the third training sample is obtained by performing second clustering on pictures in the existing picture library, wherein the number of categories of the second clusters is larger than that of the first clusters;
the second training sample is used for training the second feature extraction model in a second stage; wherein the second training sample is the picture dataset.
7. A paired picture generation apparatus, comprising:
the acquisition module is used for acquiring search information input by a user;
the generation module is used for generating a picture data set according to the search information; the picture data set comprises a plurality of picture sets in key value pairs;
the feature extraction module is used for inputting each picture in the data set into a trained first feature extraction model to generate a first granularity category feature of each picture, inputting each picture in the data set into a trained second feature extraction model to generate a second granularity category global feature and a second granularity category local feature of each picture, and fusing the first granularity category feature of each picture, the second granularity category global feature and the second granularity category local feature of each picture to obtain a multi-granularity category feature of each picture, wherein the multi-granularity category feature indicates the granularity of feature extraction; wherein the granularity of the first granularity category feature is coarser than the granularity of the second granularity category feature;
the pairing module is used for selecting candidate pairing picture sets matched with the first granularity category characteristics and/or the second granularity category global characteristics for the picture sets in each key value pair, and selecting a first pairing picture set matched with the second granularity category local characteristics for each candidate pairing picture set; and generating a pairing picture set according to the plurality of first pairing picture sets.
8. The apparatus of claim 7, wherein the generating module is specifically configured to:
acquiring a search picture according to the search information;
in a preset time, responding to the user operation of searching the pictures, and determining a picture set;
generating key value pair data according to the user identification, the preset time and the picture set for each user; wherein, the keys in the key value pair represent the user identifier and the preset time, and the value represents the picture set;
and generating the picture data set according to at least one key value pair data.
9. The apparatus of claim 7, wherein the generating module is specifically configured to:
acquiring at least one target text according to the search information;
generating key value pair data according to a picture set in each target text; wherein, the keys in the key value pair represent semantic information of the target text, and the values represent picture sets in the target text;
and generating the picture data set according to at least one key value pair data.
10. The apparatus of claim 7, wherein the pairing module is specifically configured to:
selecting a second paired picture set matched with the first granularity class characteristics for the picture set in each key value pair;
and selecting candidate paired picture sets matched with the global features of the second granularity category for each second paired picture set.
11. The apparatus of claim 7, wherein the first feature extraction model is trained using a first training sample and a second training sample;
the first training sample is used for training the first feature extraction model in a first stage; the first training sample is a first picture obtained by performing first clustering on pictures in an existing picture library;
the second training sample is used for training the first feature extraction model in a second stage; wherein the second training sample is the picture dataset.
12. The apparatus of claim 11, wherein the second feature extraction model is trained using a third training sample and a second training sample;
the third training sample is used for training the second feature extraction model in the first stage; the third training sample is a second picture obtained by performing second clustering on pictures in the existing picture library, wherein the number of categories of the second cluster is larger than that of the first cluster;
The second training sample is used for training the second feature extraction model in a second stage; wherein the second training sample is the picture dataset.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-6.
CN202110468967.2A 2021-04-28 2021-04-28 Paired picture set generation method and device, electronic equipment and storage medium Active CN113190698B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110468967.2A CN113190698B (en) 2021-04-28 2021-04-28 Paired picture set generation method and device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113190698A CN113190698A (en) 2021-07-30
CN113190698B true CN113190698B (en) 2023-08-01

Family

ID=76980260

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110468967.2A Active CN113190698B (en) 2021-04-28 2021-04-28 Paired picture set generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113190698B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682095A (en) * 2012-04-27 2012-09-19 百度在线网络技术(北京)有限公司 Method for searching paired pictures and searching system for providing the paired pictures
CN107451194A (en) * 2017-06-30 2017-12-08 百度在线网络技术(北京)有限公司 A kind of image searching method and device
WO2020232796A1 (en) * 2019-05-17 2020-11-26 腾讯音乐娱乐科技(深圳)有限公司 Multimedia data matching method and device, and storage medium
CN112541448A (en) * 2020-12-18 2021-03-23 济南博观智能科技有限公司 Pedestrian re-identification method and device, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on person re-identification based on multi-granularity feature fusion; Zhang Liang; Che Jin; Yang Qi; Chinese Journal of Liquid Crystals and Displays (06); full text *


Similar Documents

Publication Publication Date Title
CN112749344B (en) Information recommendation method, device, electronic equipment, storage medium and program product
JP7394809B2 (en) Methods, devices, electronic devices, media and computer programs for processing video
CN113656582B (en) Training method of neural network model, image retrieval method, device and medium
CN113360700B (en) Training of image-text retrieval model, image-text retrieval method, device, equipment and medium
CN113656587B (en) Text classification method, device, electronic equipment and storage medium
CN112560461A (en) News clue generation method and device, electronic equipment and storage medium
CN115248890B (en) User interest portrait generation method and device, electronic equipment and storage medium
CN113963197A (en) Image recognition method and device, electronic equipment and readable storage medium
CN113033194A (en) Training method, device, equipment and storage medium of semantic representation graph model
CN113190698B (en) Paired picture set generation method and device, electronic equipment and storage medium
CN113032251B (en) Method, device and storage medium for determining service quality of application program
CN113554062B (en) Training method, device and storage medium for multi-classification model
CN112860626B (en) Document ordering method and device and electronic equipment
CN112560481B (en) Statement processing method, device and storage medium
CN114443864A (en) Cross-modal data matching method and device and computer program product
CN114254650A (en) Information processing method, device, equipment and medium
CN115795023B (en) Document recommendation method, device, equipment and storage medium
CN114241243B (en) Training method and device for image classification model, electronic equipment and storage medium
CN114677691B (en) Text recognition method, device, electronic equipment and storage medium
CN114328936B (en) Method and device for establishing classification model
CN113360696A (en) Image pairing method, device, equipment and storage medium
CN116089433A (en) Test question structuring method and device, electronic equipment and storage medium
CN113869430A (en) Training method, image recognition method, device, electronic device and storage medium
CN116050543A (en) Data processing method, device, electronic equipment, medium and chip
CN116310682A (en) Event aggregation method, device and equipment based on multi-mode data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant