WO2023082993A1

WO2023082993A1 - Information recommendation method, apparatus and system

Info

Publication number: WO2023082993A1
Application number: PCT/CN2022/127554
Authority: WO
Inventors: 石静雯; 杨勇; 李征; 王冬月; 丁卓冶
Original assignee: 北京沃东天骏信息技术有限公司; 北京京东世纪贸易有限公司
Priority date: 2021-11-11
Filing date: 2022-10-26
Publication date: 2023-05-19
Also published as: CN114119142A

Abstract

The present disclosure relates to the technical field of computers. Provided are an information recommendation method, apparatus and system. The method comprises: acquiring an image of a product; determining a representation vector of the image of the product; according to the representation vector of the image of each product, determining a clustering identifier corresponding to the image of the product; according to the product and the clustering identifier corresponding to the image of the product, formulating a limiting condition for different products corresponding to the same clustering identifier being displayed at the same time, so as to form a scattering rule; and performing scattering processing on a product recommendation result by using the scattering rule. Therefore, the scattering effect is improved, and the problem of clustering recommendation of information is better solved.

Description

Information recommendation method, apparatus and system

Cross-reference to related applications

This application is based on, and claims priority to, CN application No. 202111331942.4, filed on November 11, 2021, the disclosure of which is hereby incorporated into this application in its entirety.

Technical field

The present disclosure relates to the field of computer technology, and in particular to an information recommendation method, apparatus and system.

Background

A business system recommends information to users; for example, an e-commerce system recommends product information to users. However, a recommendation system may recommend similar products to a user in a cluster, which degrades the user experience.

Some related technologies formulate scattering rules from the category information or text information of products, and scatter the product recommendation results in order to solve the problem of clustered recommendation of information.

Summary

Some embodiments of the present disclosure propose an information recommendation method, comprising:

acquiring an image of a product;

determining a representation vector of the image of the product;

determining, according to the representation vector of the image of each product, a cluster identifier corresponding to the image of the product;

formulating, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on different products corresponding to the same cluster identifier being displayed at the same time, to form a scattering rule;

performing scattering processing on a product recommendation result by using the scattering rule.

In some embodiments, determining, according to the representation vector of the image of each product, the cluster identifier corresponding to the image of the product comprises:

dividing the products into several initial categories according to the category information and product words of the products;

determining a representation vector for each initial category according to the representation vectors of the images of the products under that initial category;

clustering the representation vectors of the initial categories to form several first-level clusters;

clustering the representation vectors of the images of the products under each first-level cluster to form several second-level clusters under each first-level cluster;

determining, according to the first-level cluster and the second-level cluster to which the image of a product belongs, the first-level cluster identifier and the second-level cluster identifier corresponding to the image of the product.

In some embodiments, the cluster identifier corresponding to the image of a product comprises a first-level cluster identifier and a second-level cluster identifier corresponding to the image of the product; formulating, according to the products and the cluster identifiers corresponding to the images of the products, the restriction on different products corresponding to the same cluster identifier being displayed at the same time, to form the scattering rule comprises: restricting that, among a first number of adjacent results in the product recommendation result, at most a second number of different products corresponding to the same first-level cluster identifier are displayed, and at most a third number of different products corresponding to the same second-level cluster identifier are displayed, wherein the first number is greater than the second number, and the second number is greater than the third number.

In some embodiments, determining the representation vector of the image of the product comprises:

inputting the image of the product into a trained image vector extraction network, and outputting the representation vector of the image of the product,

wherein the image vector extraction network is obtained by training a convolutional neural network with images of products and, as labels, the product words of the products; the images of the products used for training are preprocessed, and the preprocessing comprises one or more of the following:

if an image of a product used for training is a transparent image, converting the transparent image into a white-background image;

randomly flipping an image of a product used for training horizontally with a preset probability;

randomly rotating an image of a product used for training, with a preset probability, by an angle within a preset angle;

randomly shifting the color of an image of a product used for training with a preset probability.

In some embodiments, determining, according to the representation vector of the image of each product, the cluster identifier corresponding to the image of the product further comprises:

determining the initial category of a new product according to the category information and product word of the new product;

determining, according to the mapping between initial categories and first-level clusters, the first-level cluster identifier of the image of the new product that corresponds to the initial category of the new product;

according to the distance between the representation vector of the image of the new product and each second-level cluster under the first-level cluster of the new product, taking the second-level cluster identifier with the shortest distance as the second-level cluster identifier of the image of the new product.

In some embodiments, the representation vectors of the images of the products under each initial category are average-pooled to obtain the representation vector of that initial category; clustering methods for forming the first-level clusters include k-means clustering and hierarchical clustering; clustering methods for forming the second-level clusters include DBSCAN clustering and canopy clustering.

obtaining a product recommendation result;

obtaining a preset scattering rule, the scattering rule comprising a restriction, formulated according to the products and the cluster identifiers corresponding to the images of the products, on different products corresponding to the same cluster identifier being displayed at the same time;

In some embodiments, the method for forming the scattering rule comprises:

acquiring an image of a product;

formulating, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on different products corresponding to the same cluster identifier being displayed at the same time, to form the scattering rule.

clustering the representation vectors of the images of the products under each first-level cluster to form several second-level clusters under each first-level cluster;

Some embodiments of the present disclosure propose an information recommendation apparatus, comprising:

an image acquisition module, configured to acquire an image of a product;

a vector determination module, configured to determine a representation vector of the image of the product;

a cluster identifier determination module, configured to determine, according to the representation vector of the image of each product, the cluster identifier corresponding to the image of the product;

a scattering rule formulation module, configured to formulate, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on different products corresponding to the same cluster identifier being displayed at the same time, to form a scattering rule;

a scattering processing module, configured to perform scattering processing on a product recommendation result by using the scattering rule.

a recommendation result obtaining module, configured to obtain a product recommendation result;

a scattering rule obtaining module, configured to obtain a preset scattering rule, the scattering rule comprising a restriction, formulated according to the products and the cluster identifiers corresponding to the images of the products, on different products corresponding to the same cluster identifier being displayed at the same time;

Some embodiments of the present disclosure propose an information recommendation apparatus, comprising: a memory; and a processor coupled to the memory, the processor being configured to execute the information recommendation method of any of the embodiments based on instructions stored in the memory.

Some embodiments of the present disclosure propose an information recommendation system, characterized by comprising:

a first information recommendation unit, configured to form an initial product recommendation result;

a second information recommendation unit, configured to perform scattering processing on the product recommendation result by executing the information recommendation method of any of the embodiments.

Some embodiments of the present disclosure propose a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the information recommendation method of any of the embodiments.

Brief description of the drawings

The drawings needed for describing the embodiments or the related technologies are briefly introduced below. The present disclosure can be understood more clearly from the following detailed description with reference to the drawings.

Obviously, the drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 shows a schematic flowchart of an information recommendation method according to some embodiments of the present disclosure.

Fig. 2 shows a schematic diagram of the process of generating an image representation vector according to some embodiments of the present disclosure.

Fig. 3 shows a schematic diagram of coarse-grained first-level clustering according to some embodiments of the present disclosure.

Fig. 4 shows a schematic diagram of fine-grained second-level clustering according to some embodiments of the present disclosure.

Fig. 5 shows a schematic structural diagram of an information recommendation apparatus according to some embodiments of the present disclosure.

Fig. 6 shows a schematic structural diagram of an information recommendation apparatus according to some embodiments of the present disclosure.

Fig. 7 shows a schematic structural diagram of an information recommendation apparatus according to some embodiments of the present disclosure.

Fig. 8 shows a schematic structural diagram of an information recommendation system according to some embodiments of the present disclosure.

Detailed description

The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in the embodiments of the present disclosure.

Unless otherwise specified, terms such as "first" and "second" in the present disclosure are used to distinguish different objects and do not indicate size, order in time, or similar meanings.

Research has found that the related technologies, which formulate scattering rules from the category information or text information of products and scatter the product recommendation results accordingly, have problems: similar products under different categories or text information cannot be scattered, and dissimilar products under the same category or text information are forcibly scattered. This impairs the scattering effect, hinders the solution of the clustered recommendation problem, and degrades the user experience.

The embodiments of the present disclosure determine the cluster identifier corresponding to the image of each product, formulate on that basis a restriction on different products corresponding to the same cluster identifier being displayed at the same time so as to form a scattering rule, and scatter the product recommendation results with that rule. This improves the scattering effect, better solves the problem of clustered recommendation of information, and improves the user experience.

As shown in Fig. 1, the information recommendation method of this embodiment comprises steps 110-150:

In step 110, an image of a product is acquired.

The image of the product is acquired according to the URL (Uniform Resource Locator) of the image.

In step 120, the representation vector of the image of the product is determined using computer vision technology. The process of generating the image representation vector is shown in Fig. 2.

Determining the representation vector of the image of the product comprises: inputting the image of the product into the trained image vector extraction network, and outputting the representation vector of the image of the product.

The image vector extraction network is obtained by training a convolutional neural network with images of products and, as labels, the product words of the products.

The training data can be obtained as follows. Images of products with a high number of recent clicks are selected as training data, and each image takes text information of its product as the label; for example, the product word of the product can serve as the label. Product words that have at least, for example, 500 product images are retained, and, for example, 500 images are taken for each product word, finally yielding, for example, 7 million images. The training data can be divided into a training set, a validation set and a test set.
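The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(image_id, product_word)` record layout, the function names, the fixed random seed and the 80/10/10 split ratio are hypothetical and are not given in this disclosure; only the "at least 500 images, take 500 per product word" thresholds come from the text.

```python
import random
from collections import defaultdict


def select_training_images(records, min_images=500, per_word=500, seed=0):
    """Pick training images per product word.

    records: iterable of (image_id, product_word) pairs (hypothetical layout).
    Product words with fewer than `min_images` images are dropped; for every
    remaining word, up to `per_word` images are drawn at random.
    """
    by_word = defaultdict(list)
    for image_id, word in records:
        by_word[word].append(image_id)

    rng = random.Random(seed)
    selected = []
    for word in sorted(by_word):
        images = by_word[word]
        if len(images) < min_images:
            continue  # too few images for this product word
        for image_id in rng.sample(images, min(per_word, len(images))):
            selected.append((image_id, word))
    return selected


def split_dataset(samples, train=0.8, val=0.1, seed=0):
    """Shuffle and split into training, validation and test sets.

    The 80/10/10 ratio is an assumption made for this sketch.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```
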

The network is trained by transfer learning. First, a preliminarily trained convolutional neural network is obtained. The preliminary training can use a large number of natural images (such as ImageNet data) to learn the pre-trained weights of the convolutional neural network, so that the preliminarily trained network is able to learn low-level visual features. Then, the preliminarily trained convolutional neural network is transferred to the specific information recommendation task: it is further trained with images of products and, as labels, the product words of the products to obtain the image vector extraction network. This fine-tunes the weights of the convolutional neural network so that the trained network suits the specific information recommendation task. The convolutional neural network uses, for example, ResNet or VGG as the backbone network. Loading pre-trained weights accelerates the learning of the network and shortens the training time.

During training, the image data can be preprocessed before being input into the neural network. The preprocessing comprises one or more of the following:

1) if the input image is a transparent image, converting the 4-channel transparent image into a 3-channel white-background image, where the 3 channels are the RGB channels and the 4-channel image has one additional transparency channel;

2) randomly cropping a part of the image, where the cropped region covers, for example, a fraction of the area in [0.7, 1.0] and has an aspect ratio in, for example, [0.75, 1.33];

3) resizing the image to a fixed size, for example 224×224;

4) randomly flipping the image horizontally with a preset probability, for example 0.5;

5) randomly rotating the image, with a preset probability such as 0.5, by an angle within a preset angle such as 45 degrees;

6) randomly shifting the color of the image with a preset probability, for example 0.5;

7) normalizing the pixel values to the range [-1, 1].
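Three of these preprocessing steps (white-background conversion, random horizontal flip and pixel normalization) can be sketched with plain Python lists. This is a simplified illustration, not the disclosed implementation: the function names and the nested-list image layout are assumptions, and a production pipeline would normally rely on an image library instead.

```python
import random


def rgba_to_white_background(pixel):
    """Composite one RGBA pixel (0-255 per channel) onto a white background."""
    r, g, b, a = pixel
    alpha = a / 255.0
    return tuple(round(c * alpha + 255 * (1.0 - alpha)) for c in (r, g, b))


def random_horizontal_flip(image, p=0.5, rng=random):
    """Flip an image (a list of pixel rows) left to right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


def normalize(value):
    """Map a channel value from [0, 255] to [-1, 1]."""
    return value / 127.5 - 1.0
```

A fully transparent pixel becomes pure white, a fully opaque pixel keeps its color, and the channel values 0 and 255 map to -1 and 1 respectively.
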

The preprocessing in items 1), 4), 5) and 6) makes the trained network focus on the features of the product itself and become insensitive to the background, angle and color of the product image. The preprocessing in items 2), 3) and 7) standardizes the size or pixel values of the image and improves the training effect.

During training, to reduce the amount of computation in subsequent operations, the vector produced by the backbone network is further reduced in dimension. Taking ResNet-101 as the backbone, for example, an image passes through the convolutional layers of ResNet-101 and an average pooling operation to yield a K₁-dimensional feature vector, which is then fed into a three-layer dimensionality reduction network that reduces it successively to K₂, K₃ and K₄ dimensions, where K₁ > K₂ > K₃ > K₄. To reduce information loss during dimensionality reduction, the reduction network uses the PReLU activation function. Finally, one linear layer produces a logit for each product word, and a sigmoid function turns each logit into a probability in the range [0, 1]. The embodiments of the present disclosure use a multi-label classification objective: one image may be predicted as several product words, the product words are independent of one another, each class amounts to a binary classification task, and binary cross-entropy is used as the loss function. Training uses the Adam optimizer for backpropagation, with an initial learning rate of 0.001 and dynamic learning rate adjustment, and uses Horovod for single-machine multi-GPU acceleration.
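To make the multi-label objective concrete, the following plain-Python sketch computes the PReLU activation, the sigmoid, and the mean binary cross-entropy over independent product-word labels. It is an illustration only: the function names, the PReLU slope of 0.25 and the clipping constant `eps` are assumptions, and an actual training run would use a deep learning framework rather than scalar Python code.

```python
import math


def prelu(x, slope=0.25):
    """PReLU: identity for x >= 0, a learnable slope for x < 0 (0.25 assumed here)."""
    return x if x >= 0 else slope * x


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def multilabel_bce(logits, targets, eps=1e-12):
    """Mean binary cross-entropy, one independent binary task per product word."""
    total = 0.0
    for logit, target in zip(logits, targets):
        p = min(max(sigmoid(logit), eps), 1.0 - eps)  # clip to avoid log(0)
        total -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return total / len(logits)
```

With all logits at zero every probability is 0.5, so the loss equals ln 2; confident correct predictions give a lower loss than confident wrong ones.
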
The model with the highest top-1, top-5 and top-10 recall of product words on the test set is selected as the model for the prediction stage.
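The disclosure does not spell out how top-k recall is computed. One common reading, assumed here, counts a true product word as recalled when it appears among the k highest-scoring predictions for its image, and divides the hits by the total number of true labels. The function name and data layout are hypothetical.

```python
def top_k_recall(score_rows, label_sets, k):
    """Fraction of true labels found among the k highest-scoring labels.

    score_rows: one list of per-product-word scores per image.
    label_sets: one set of true product-word indices per image.
    """
    hits = 0
    total = 0
    for scores, labels in zip(score_rows, label_sets):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        hits += len(set(ranked[:k]) & labels)
        total += len(labels)
    return hits / total if total else 0.0
```

Candidate checkpoints could then be compared on this metric for k = 1, 5 and 10.
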

In step 130, the cluster identifier corresponding to the image of each product is determined according to the representation vectors of the images of the products.

If the number of products is small, a clustering algorithm can cluster the representation vectors of the images of the products in a single pass and directly determine the cluster identifier corresponding to the image of each product.

If the number of products is large, two rounds of clustering, coarse-grained and fine-grained, can be performed to determine the first-level cluster identifier and the second-level cluster identifier corresponding to the image of each product, which reduces the amount of computation. The description below focuses on the latter case.

According to the representation vectors of the images of the products, the cluster identifiers corresponding to the images of existing products are determined by clustering (see step 130.1), and the cluster identifier corresponding to the image of a new product is determined by matching (see step 130.2).

Step 130.1, for existing products, determines the cluster identifiers corresponding to the images of the existing products by clustering the representation vectors of those images, and comprises: dividing the existing products into several initial categories according to the category information and product words of the products; determining a representation vector for each initial category according to the representation vectors of the images of the existing products under that initial category; clustering the representation vectors of the initial categories to form several first-level clusters, also called coarse-grained clusters, as shown in Fig. 3; clustering the representation vectors of the images of the existing products under each first-level cluster to form several second-level clusters under each first-level cluster, also called fine-grained clusters, as shown in Fig. 4; and determining, according to the first-level cluster and second-level cluster to which the image of an existing product belongs, the first-level cluster identifier and second-level cluster identifier corresponding to that image. The coarse-grained cluster and fine-grained cluster corresponding to each image are thereby determined.
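Hierarchical clustering is one of the algorithm families this disclosure names for forming the first-level clusters. A minimal agglomerative sketch is shown here under explicit assumptions: Euclidean distance between cluster centroids stands in for the unspecified similarity measure, merging stops once the closest pair is farther apart than `distance_threshold`, and all function names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def agglomerative(vectors, distance_threshold):
    """Merge the two closest clusters until none are within the threshold.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        centers = [centroid([vectors[k] for k in c]) for c in clusters]
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centers[i], centers[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > distance_threshold:
            break  # no pair is close enough to merge
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Each returned cluster would correspond to one first-level cluster; this naive version recomputes all pairwise distances on every pass and is meant only for small inputs such as per-category vectors.
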

For example, products are divided into N C1_PW categories (that is, the initial categories) along the dimension formed by concatenating the first-level category (C1) with the main product word (PW). If needed, any C1_PW containing fewer than 100 products may be filtered out to eliminate data noise caused by erroneous product words; suppose n C1_PW categories remain, where n is less than or equal to N. The products under the remaining n C1_PW categories are sampled uniformly at random, for example 10,000 products per C1_PW, and each product corresponds to a K₄-dimensional representation vector (image emb). A mean pooling operation over the K₄-dimensional representation vectors of all sampled products under each C1_PW yields a K₄-dimensional vector representing that C1_PW (C1_PW emb), giving n K₄-dimensional C1_PW vectors in total. These n vectors are then clustered to obtain m coarse-grained categories (also called coarse-grained VIDs, that is, the first-level clusters).
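The grouping, filtering, sampling and mean pooling in this example can be sketched as follows. The `(c1, product_word, embedding)` tuple layout, the function name and the fixed seed are assumptions made for illustration; the thresholds of 100 products and 10,000 samples are the example values from the text.

```python
import random
from collections import defaultdict


def c1_pw_embeddings(products, min_products=100, sample_size=10000, seed=0):
    """Return {"C1_PW": mean-pooled embedding} for sufficiently large groups.

    products: iterable of (c1, product_word, embedding) tuples, where each
    embedding is a list of floats of the same length (hypothetical layout).
    """
    groups = defaultdict(list)
    for c1, product_word, embedding in products:
        groups[f"{c1}_{product_word}"].append(embedding)

    rng = random.Random(seed)
    pooled = {}
    for key, embeddings in groups.items():
        if len(embeddings) < min_products:
            continue  # likely noise from an erroneous product word
        sample = rng.sample(embeddings, min(sample_size, len(embeddings)))
        pooled[key] = [sum(column) / len(sample) for column in zip(*sample)]
    return pooled
```
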
Available clustering algorithms include k-means clustering and hierarchical clustering. Taking hierarchical clustering as an example, the samples to be clustered are the n K₄-dimensional C1_PW vectors: 1) initialization: treat each sample as its own cluster; 2) compute the similarity between every pair of clusters; 3) find the two most similar clusters and merge them into one; 4) repeat steps 2) and 3) until no two remaining clusters are similar enough to merge under the preset similarity threshold; the remaining clusters are the final clustering result. The coarse-grained VID attribute of each product is obtained from the product's C1_PW and the mapping between C1_PW and coarse-grained VID. Then, fast clustering is performed on the product image vectors under each coarse-grained VID to obtain the fine-grained VIDs under that coarse-grained VID (that is, the second-level clusters). Fast clustering methods suitable for large data sets include DBSCAN and canopy. Taking canopy as an example, the samples to be clustered are all product image vectors under one coarse-grained VID: 1) let the set of samples to be clustered be S, and set initial distance thresholds T₁ and T₂ with T₁ > T₂; 2) randomly pick a sample A from S and, using a cheap distance measure such as Euclidean distance, compute the distance d between A and each other sample in S; 3) put every sample whose d is less than T₁ into one canopy, and remove every sample whose d is less than T₂ from the candidate set S; 4) repeat steps 2) and 3) until S is empty, at which point clustering ends and each canopy is one fine-grained VID.
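The canopy steps can be sketched in plain Python as follows. This is an illustrative reading rather than the disclosed implementation: the function name and seed are hypothetical, Euclidean distance is used as the cheap distance, and the chosen center is always removed from the candidate set so that the loop terminates.

```python
import math
import random


def canopy(vectors, t1, t2, seed=0):
    """Canopy clustering; returns a list of canopies as lists of indices.

    Requires t1 > t2 > 0. A sample within t1 of the center joins the canopy;
    a sample within t2 of the center is also removed from the candidates.
    """
    if not t1 > t2 > 0:
        raise ValueError("thresholds must satisfy t1 > t2 > 0")
    rng = random.Random(seed)
    candidates = list(range(len(vectors)))
    canopies = []
    while candidates:
        center = rng.choice(candidates)
        members, remaining = [], []
        for i in candidates:
            d = math.dist(vectors[center], vectors[i])
            if d < t1:
                members.append(i)
            if d >= t2 and i != center:
                remaining.append(i)
        canopies.append(members)
        candidates = remaining
    return canopies
```

Because a sample whose distance lies between T₂ and T₁ stays in the candidate set, it can end up in more than one canopy; assigning a single fine-grained VID per product would then need a tie-breaking choice, which this sketch leaves open.
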

Step 130.2, for a new product, determines the cluster identifier corresponding to the image of the new product by matching against the representation vectors of the images of the products, and comprises: determining the initial category of the new product according to its category information and product word; determining, according to the mapping between initial categories and first-level clusters, the first-level cluster identifier of the image of the new product that corresponds to the initial category of the new product; and, according to the distance between the representation vector of the image of the new product and each second-level cluster under the first-level cluster of the new product, taking the second-level cluster identifier with the shortest distance as the second-level cluster identifier of the image of the new product.

For example, for a new product, the C1_PW category of the new product is obtained from its first-level category C1 and text information (such as the main product word PW), and the coarse-grained VID of the new product is determined from the mapping between C1_PW and coarse-grained VID. Then, the representation vector of the image of the new product is extracted, the distance between that representation vector and each fine-grained VID under the coarse-grained VID is computed, and the fine-grained VID with the shortest distance is taken as the fine-grained VID of the new product.
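A sketch of this matching step is given here. The disclosure does not say how the distance to a fine-grained VID is measured; this example assumes each fine-grained VID is represented by a centroid vector and uses Euclidean distance. The dictionary layouts and the function name are likewise hypothetical.

```python
import math


def assign_new_product(c1, product_word, embedding, coarse_of, fine_centroids):
    """Return (coarse_vid, fine_vid) for a new product.

    coarse_of: {"C1_PW": coarse_vid} mapping built during clustering.
    fine_centroids: {coarse_vid: {fine_vid: centroid vector}} (assumed layout).
    """
    coarse_vid = coarse_of[f"{c1}_{product_word}"]
    candidates = fine_centroids[coarse_vid]
    fine_vid = min(candidates, key=lambda vid: math.dist(embedding, candidates[vid]))
    return coarse_vid, fine_vid
```

Only the fine-grained clusters under the matched coarse-grained VID are compared, so a new product needs no re-clustering of existing products.
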

In step 140, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on different products corresponding to the same cluster identifier being displayed at the same time is formulated to form a scattering rule.

If the cluster identifiers corresponding to the image of a product include a first-level cluster identifier and a second-level cluster identifier, the scattering rule formed includes: among a first number (e.g., 4) of adjacent results in the product recommendation results, at most a second number (e.g., 2) of different products corresponding to the same first-level cluster identifier are displayed, and at most a third number (e.g., 1) of different products corresponding to the same second-level cluster identifier are displayed, where the first number is greater than the second number and the second number is greater than the third number.

If the image of a product corresponds to a single cluster identifier, the scattering rule formed includes: among a fourth number (e.g., 5) of adjacent results in the product recommendation results, at most a fifth number (e.g., 1) of different products corresponding to the same cluster identifier are displayed, where the fourth number is greater than the fifth number.
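The two forms of scattering rule can be expressed as a check over a list of cluster identifiers. The sketch below is one possible reading, not the disclosed implementation: it interprets "adjacent results" as every sliding window of the given size, and it assumes second-level identifiers are globally unique (for example `"A0"` under first-level cluster `"A"`). The function names and default values are illustrative, with the defaults taken from the examples in the text (4, 2, 1 and 5, 1).

```python
from collections import Counter


def within_cap(ids, window, cap):
    """True if no identifier occurs more than `cap` times in any run of
    `window` adjacent entries (sliding-window reading, an assumption)."""
    for start in range(max(1, len(ids) - window + 1)):
        counts = Counter(ids[start:start + window])
        if counts and max(counts.values()) > cap:
            return False
    return True


def satisfies_two_level(coarse_ids, fine_ids, window=4, max_coarse=2, max_fine=1):
    """Two-level rule: first number > second number > third number."""
    if not window > max_coarse > max_fine:
        raise ValueError("expected window > max_coarse > max_fine")
    return within_cap(coarse_ids, window, max_coarse) and within_cap(fine_ids, window, max_fine)


def satisfies_single_level(cluster_ids, window=5, cap=1):
    """Single-identifier rule: fourth number > fifth number."""
    if not window > cap:
        raise ValueError("expected window > cap")
    return within_cap(cluster_ids, window, cap)
```

Both rules reduce to the same per-window count, so the two-level rule is simply the conjunction of a looser cap on first-level identifiers and a tighter cap on second-level identifiers.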

In step 150, scattering processing is performed on the product recommendation results by using the scattering rule.

The product recommendation results are obtained; the preset scattering rule is obtained, the scattering rule including the restriction, formulated according to the products and the cluster identifiers corresponding to the images of the products, on simultaneously displaying different products that correspond to the same cluster identifier; and scattering processing is performed on the product recommendation results by using the scattering rule.

The initial product recommendation results may be formed according to a search keyword input by the user, or according to other preset keywords. The present disclosure does not limit how the initial product recommendation results are formed.

Taking the first scattering rule in step 140 as an example: among a first number (e.g., 4) of adjacent results in the product recommendation results, if the number of different products corresponding to the same first-level cluster identifier exceeds the second number (e.g., 2), at most the second number (e.g., 2) are displayed, and the products beyond the second number can be evaluated for display in the next first number of recommendation results. Similarly, if the number of different products corresponding to the same second-level cluster identifier exceeds the third number (e.g., 1), at most the third number (e.g., 1) are displayed, and the products beyond the third number can be evaluated for display in the next first number of recommendation results.
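A greedy re-ranking is one way to realize this deferral of excess products. The sketch below is an assumption-laden illustration rather than the disclosed procedure: it uses a sliding window over the most recently emitted results, picks the highest-ranked pending product that keeps both caps satisfied, and falls back to the highest-ranked pending product when nothing fits. `Item` and `scatter` are hypothetical names, and second-level identifiers are assumed to be globally unique.

```python
from collections import Counter, namedtuple

# Hypothetical record: product id plus its first- and second-level cluster ids.
Item = namedtuple("Item", "sku coarse fine")


def scatter(ranked, window=4, max_coarse=2, max_fine=1):
    """Greedily re-rank `ranked` so that, where possible, any `window`
    adjacent results hold at most `max_coarse` products per first-level
    cluster and at most `max_fine` products per second-level cluster."""
    pending = list(ranked)
    output = []
    while pending:
        # The new item will share a window with the last (window - 1) outputs.
        recent = output[-(window - 1):] if window > 1 else []
        coarse = Counter(item.coarse for item in recent)
        fine = Counter(item.fine for item in recent)
        pick = next(
            (k for k, item in enumerate(pending)
             if coarse[item.coarse] < max_coarse and fine[item.fine] < max_fine),
            0,  # nothing fits: fall back to the best-ranked pending item
        )
        output.append(pending.pop(pick))
    return output
```

Products that are skipped stay in `pending` in their original order, so they are reconsidered at every later position, which mirrors the idea of evaluating them in a subsequent group of results.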

In the above embodiments, the cluster identifiers corresponding to the images of the products are determined, a restriction on simultaneously displaying different products that correspond to the same cluster identifier is formulated accordingly to form a scattering rule, and scattering processing is performed on the product recommendation results. This improves the scattering effect and better addresses the problem of similar information being recommended in clusters. In addition, if the scattering processing uses a scattering rule formed from coarse and fine two-level cluster identifiers, the amount of computation can also be reduced.

As shown in FIG. 5, the information recommendation apparatus 500 of this embodiment includes the following modules.

An image acquisition module 510, configured to acquire an image of a product;

A vector determination module 520, configured to determine a representation vector of the image of the product;

A cluster identifier determination module 530, configured to determine, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product;

A scattering rule formulation module 540, configured to formulate, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on simultaneously displaying different products that correspond to the same cluster identifier, forming a scattering rule;

A scattering processing module 550, configured to perform scattering processing on the product recommendation results by using the scattering rule.

In some embodiments, the cluster identifier determination module 530 being configured to determine, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product includes:

The cluster identifiers corresponding to the image of the product include a first-level cluster identifier and a second-level cluster identifier corresponding to the image of the product.

In some embodiments, the scattering rule formulation module 540 is configured to impose the following limit: among a first number of adjacent results in the product recommendation results, at most a second number of different products corresponding to the same first-level cluster identifier are displayed, and at most a third number of different products corresponding to the same second-level cluster identifier are displayed, where the first number is greater than the second number and the second number is greater than the third number.

In some embodiments, the vector determination module 520 is configured to input the image of the product into a trained image vector extraction network and output the representation vector of the image of the product, where the image vector extraction network is obtained by training a convolutional neural network using images of products and, as labels, the product words of those products, and the images of the products used for training are preprocessed, the preprocessing including one or more of the following:

In some embodiments, the cluster identifier determination module 530 is configured to determine the initial category of a new product according to the category information and product word of the new product;

In some embodiments, average pooling is performed on the representation vectors of the images of the products under each initial category to obtain the representation vector corresponding to that initial category.
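Average pooling over an initial category amounts to an element-wise mean of the image vectors in that category. The following is a minimal sketch, assuming all vectors have the same length and that the input arrives as `(initial_category, vector)` pairs; the function name and input layout are illustrative, not taken from the disclosure.

```python
from collections import defaultdict


def pool_by_category(items):
    """Element-wise mean of the image vectors in each initial category.

    items: iterable of (initial_category, vector) pairs (hypothetical layout),
           where every vector has the same length.
    """
    sums = {}
    counts = defaultdict(int)
    for category, vector in items:
        if category not in sums:
            sums[category] = [0.0] * len(vector)
        for k, value in enumerate(vector):
            sums[category][k] += value
        counts[category] += 1
    return {c: [s / counts[c] for s in total] for c, total in sums.items()}
```

The pooled vectors, one per initial category, are then the inputs to the first-level clustering, which is far fewer points than clustering every product image directly.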

In some embodiments, the clustering methods for forming the first-level clusters include k-means clustering and hierarchical clustering.

In some embodiments, the clustering methods for forming the second-level clusters include CBSCAN clustering and canopy clustering.
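As an illustration of one of the named methods, the sketch below implements the commonly described form of canopy clustering with a loose threshold `t1` and a tight threshold `t2`. It is a generic textbook-style sketch, not the configuration used in the disclosure: the thresholds, the Euclidean distance, and the choice of the first remaining point as the next canopy center are all assumptions. Note that canopies may overlap; how an overlapping product would be mapped to a single second-level cluster identifier is not stated in the text and is not addressed here.

```python
import math


def canopy_clusters(points, t1, t2):
    """Return a list of canopies, each a list of indices into `points`.

    t1 is the loose threshold and t2 the tight threshold (t1 > t2 > 0).
    """
    if not t1 > t2 > 0:
        raise ValueError("expected t1 > t2 > 0")
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining.pop(0)  # assumption: take the first remaining point
        members = [center]
        still_remaining = []
        for i in remaining:
            d = math.dist(points[center], points[i])
            if d < t1:
                members.append(i)          # loosely close: joins this canopy
            if d >= t2:
                still_remaining.append(i)  # not tightly close: may seed or join others
        remaining = still_remaining
        canopies.append(members)
    return canopies
```

Canopy clustering does not require the number of clusters in advance, which may suit second-level clustering where the number of visually distinct groups under each first-level cluster is unknown.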

As shown in FIG. 6, the information recommendation apparatus 600 of this embodiment includes the following modules.

A recommendation result acquisition module 610, configured to obtain product recommendation results.

A scattering rule acquisition module 620, configured to obtain a preset scattering rule, the scattering rule including a restriction, formulated according to the products and the cluster identifiers corresponding to the images of the products, on simultaneously displaying different products that correspond to the same cluster identifier. The method of forming the scattering rule is described above and is not repeated here.

A scattering processing module 630, configured to perform scattering processing on the product recommendation results by using the scattering rule.

As shown in FIG. 7, the information recommendation apparatus 700 of this embodiment includes a memory 710 and a processor 720 coupled to the memory 710, the processor 720 being configured to execute, based on instructions stored in the memory 710, the information recommendation method of any of the foregoing embodiments.

The information recommendation apparatus 700 may be, for example, a second information recommendation unit.

The memory 710 may include, for example, a system memory, a fixed non-volatile storage medium, and the like. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.

The processor 720 may be implemented as a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, or discrete hardware components such as discrete gates or transistors.

The apparatus 700 may further include an input/output interface 730, a network interface 740, a storage interface 750, and the like. These interfaces 730, 740, 750, the memory 710, and the processor 720 may be connected, for example, via a bus 760. The input/output interface 730 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 740 provides a connection interface for various networked devices. The storage interface 750 provides a connection interface for external storage devices such as SD cards and USB flash drives. The bus 760 may use any of a variety of bus structures. For example, bus structures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, and a Peripheral Component Interconnect (PCI) bus.

FIG. 8 shows a schematic diagram of an information recommendation system according to some embodiments of the present disclosure.

As shown in FIG. 8, the information recommendation system 800 of this embodiment includes:

A first information recommendation unit 810, configured to form initial product recommendation results;

A second information recommendation unit 820, configured to perform scattering processing on the product recommendation results by executing the information recommendation method of the embodiments.

The first information recommendation unit 810 may form the initial product recommendation results according to a search keyword input by a user, or according to other preset keywords. The present disclosure does not limit how the initial product recommendation results are formed.

Some embodiments of the present disclosure provide a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the information recommendation method of the embodiments.

Those skilled in the art should understand that the embodiments of the present disclosure may be provided as methods, systems, or computer program products. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product implemented on one or more non-transitory computer-readable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer program code.

The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

The above are only preferred embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims

An information recommendation method, comprising:

acquiring an image of a product;

determining a representation vector of the image of the product;

determining, according to the representation vectors of the images of the products, a cluster identifier corresponding to the image of the product;

formulating, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on simultaneously displaying different products that correspond to the same cluster identifier, to form a scattering rule; and

performing scattering processing on product recommendation results by using the scattering rule.
The method according to claim 1, wherein

determining, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product comprises:

dividing the products into several initial categories according to the category information and product words of the products;

determining a representation vector corresponding to each initial category according to the representation vectors of the images of the products under that initial category;

clustering the representation vectors corresponding to the initial categories to form several first-level clusters;

clustering the representation vectors of the images of the products under each first-level cluster to form several second-level clusters under each first-level cluster; and

determining, according to the first-level cluster and the second-level cluster to which the image of the product belongs, the first-level cluster identifier and the second-level cluster identifier corresponding to the image of the product;

or, determining, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product comprises: clustering the representation vectors of the images of the products once by using a clustering algorithm, and determining the cluster identifier corresponding to the image of the product according to the cluster to which the image of the product belongs.
The method according to claim 1, wherein

in the case where the cluster identifiers corresponding to the image of the product include a first-level cluster identifier and a second-level cluster identifier corresponding to the image of the product, formulating, according to the products and the cluster identifiers corresponding to the images of the products, the restriction on simultaneously displaying different products that correspond to the same cluster identifier, to form the scattering rule comprises: limiting, among a first number of adjacent results in the product recommendation results, the different products corresponding to the same first-level cluster identifier to at most a second number displayed, and the different products corresponding to the same second-level cluster identifier to at most a third number displayed, wherein the first number is greater than the second number, and the second number is greater than the third number; or,

in the case where the image of the product corresponds to one cluster identifier, the scattering rule formed comprises: limiting, among a fourth number of adjacent results in the product recommendation results, the different products corresponding to the same cluster identifier to at most a fifth number displayed, wherein the fourth number is greater than the fifth number.
The method according to claim 1, wherein determining the representation vector of the image of the product comprises:

inputting the image of the product into a trained image vector extraction network, and outputting the representation vector of the image of the product,

wherein the image vector extraction network is obtained by training a convolutional neural network using images of products and, as labels, the product words of the products, and the images of the products used for training are preprocessed, the preprocessing comprising one or more of the following:

if an image of a product used for training is a transparent image, converting the transparent image into an image with a white background;

randomly flipping the images of the products used for training horizontally with a preset probability;

randomly rotating the images of the products used for training, with a preset probability, by an angle within a preset angle;

randomly shifting the colors of the images of the products used for training with a preset probability.
The method according to claim 2, wherein determining, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product further comprises:

determining the initial category of a new product according to the category information and product word of the new product;

determining, according to the mapping relationship between initial categories and first-level clusters, the first-level cluster identifier of the image of the new product that corresponds to the initial category of the new product; and

according to the distances between the representation vector of the image of the new product and each second-level cluster under the first-level cluster of the new product, taking the second-level cluster identifier corresponding to the shortest distance as the second-level cluster identifier of the image of the new product.
The method according to claim 2, wherein

determining the representation vector corresponding to each initial category comprises: performing average pooling on the representation vectors of the images of the products under each initial category to obtain the representation vector corresponding to that initial category;

or, the clustering method for forming the first-level clusters comprises at least one of k-means clustering and hierarchical clustering;

or, the clustering method for forming the second-level clusters comprises at least one of CBSCAN clustering and canopy clustering.
An information recommendation method, comprising:

obtaining product recommendation results;

obtaining a preset scattering rule, the scattering rule comprising a restriction, formulated according to products and the cluster identifiers corresponding to the images of the products, on simultaneously displaying different products that correspond to the same cluster identifier; and

performing scattering processing on the product recommendation results by using the scattering rule.
The method according to claim 7, wherein the method of forming the scattering rule comprises:

acquiring an image of a product;

determining a representation vector of the image of the product;

determining, according to the representation vectors of the images of the products, a cluster identifier corresponding to the image of the product; and

formulating, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on simultaneously displaying different products that correspond to the same cluster identifier, to form the scattering rule.
The method according to claim 8, wherein determining, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product comprises:

dividing the products into several initial categories according to the category information and product words of the products;

determining a representation vector corresponding to each initial category according to the representation vectors of the images of the products under that initial category;

clustering the representation vectors corresponding to the initial categories to form several first-level clusters;

clustering the representation vectors of the images of the products under each first-level cluster to form several second-level clusters under each first-level cluster; and

determining, according to the first-level cluster and the second-level cluster to which the image of the product belongs, the first-level cluster identifier and the second-level cluster identifier corresponding to the image of the product;

or, determining, according to the representation vectors of the images of the products, the cluster identifier corresponding to the image of the product comprises: clustering the representation vectors of the images of the products once by using a clustering algorithm, and determining the cluster identifier corresponding to the image of the product according to the cluster to which the image of the product belongs.
The method according to claim 8, wherein

in the case where the cluster identifiers corresponding to the image of the product include a first-level cluster identifier and a second-level cluster identifier corresponding to the image of the product, formulating, according to the products and the cluster identifiers corresponding to the images of the products, the restriction on simultaneously displaying different products that correspond to the same cluster identifier, to form the scattering rule comprises: limiting, among a first number of adjacent results in the product recommendation results, the different products corresponding to the same first-level cluster identifier to at most a second number displayed, and the different products corresponding to the same second-level cluster identifier to at most a third number displayed, wherein the first number is greater than the second number, and the second number is greater than the third number; or,

in the case where the image of the product corresponds to one cluster identifier, the scattering rule formed comprises: limiting, among a fourth number of adjacent results in the product recommendation results, the different products corresponding to the same cluster identifier to at most a fifth number displayed, wherein the fourth number is greater than the fifth number.
An information recommendation apparatus, comprising:

an image acquisition module, configured to acquire an image of a product;

a vector determination module, configured to determine a representation vector of the image of the product;

a cluster identifier determination module, configured to determine, according to the representation vectors of the images of the products, a cluster identifier corresponding to the image of the product;

a scattering rule formulation module, configured to formulate, according to the products and the cluster identifiers corresponding to the images of the products, a restriction on simultaneously displaying different products that correspond to the same cluster identifier, to form a scattering rule; and

a scattering processing module, configured to perform scattering processing on product recommendation results by using the scattering rule.
An information recommendation apparatus, comprising:

a recommendation result acquisition module, configured to obtain product recommendation results;

a scattering rule acquisition module, configured to obtain a preset scattering rule, the scattering rule comprising a restriction, formulated according to products and the cluster identifiers corresponding to the images of the products, on simultaneously displaying different products that correspond to the same cluster identifier; and

a scattering processing module, configured to perform scattering processing on the product recommendation results by using the scattering rule.
An information recommendation apparatus, comprising: a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the information recommendation method according to any one of claims 1-10.
An information recommendation system, comprising:

a first information recommendation unit, configured to form initial product recommendation results; and

a second information recommendation unit, configured to perform scattering processing on the product recommendation results by executing the information recommendation method according to any one of claims 1-10.
A non-transitory computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the information recommendation method according to any one of claims 1-10.
A computer program, comprising:

instructions which, when executed by a processor, cause the processor to execute the information recommendation method according to any one of claims 1-10.