TWI693567B - Method and device for obtaining multi-label user portrait - Google Patents

Method and device for obtaining multi-label user portrait

Info

Publication number
TWI693567B
TWI693567B (application TW107146609A)
Authority
TW
Taiwan
Prior art keywords
user
label
classifier
value
information
Prior art date
Application number
TW107146609A
Other languages
Chinese (zh)
Other versions
TW201935344A (en)
Inventor
張雅淋
李龍飛
Original Assignee
香港商阿里巴巴集團服務有限公司
Priority date
Filing date
Publication date
Application filed by 香港商阿里巴巴集團服務有限公司 filed Critical 香港商阿里巴巴集團服務有限公司
Publication of TW201935344A publication Critical patent/TW201935344A/en
Application granted granted Critical
Publication of TWI693567B publication Critical patent/TWI693567B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present specification disclose a method and device for training a user portrait classifier, and a method and device for acquiring a multi-label user portrait. The training method includes: acquiring respective first feature vectors of a first group of users; acquiring values of the users' respective first labels; training a first classifier with the set of the users' first feature vectors and first-label values as a first training set; combining each user's first feature vector with the value of the first label to obtain the users' respective second feature vectors; acquiring values of the users' respective second labels; and training a second classifier with the set of the users' second feature vectors and second-label values as a second training set.

Description

Method and device for obtaining a multi-label user portrait

The present invention relates to the field of machine learning and, in particular, to a method and device for training a user portrait classifier and a method and device for acquiring a multi-label user portrait.

With the popularization and development of the Internet, ever more data can be collected by Internet operators. For example, an e-commerce website can obtain a user's purchase and browsing records, and a search engine can obtain a user's search and click records. To make better use of such information and provide more efficient, higher-quality services, the technique of user portraits has received widespread attention. A user portrait is a labeled user model abstracted from information such as a user's social attributes, living habits, and consumption behavior. The prior art includes methods that obtain user portraits from deep neural networks and methods that obtain them from statistical data. A more effective solution for acquiring multi-label user portraits is therefore needed.

The embodiments of the present specification aim to provide a more effective solution for acquiring multi-label user portraits, so as to remedy the deficiencies of the prior art.

To this end, one aspect of this specification provides a method for training a user portrait classifier, where the classifier is a chain classifier comprising a first classifier and a second classifier, and the user portrait is a multi-label user portrait. The method includes: acquiring respective first feature vectors of a first group of users, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history; acquiring the values of the users' respective first labels, each value corresponding to the user's first-label information; training the first classifier with the set of the users' first feature vectors and first-label values as a first training set; combining each user's first feature vector with the value of the first label to obtain the users' respective second feature vectors; acquiring the values of the users' respective second labels, each value corresponding to the user's second-label information, the second label being associated with the first label; and training the second classifier with the set of the users' second feature vectors and second-label values as a second training set.

In one embodiment of the above method, the user's information includes the user's label information. In one embodiment, the first label is age and the second label is purchase preference. In one embodiment, the first label is purchase preference and the second label is purchasing power.

Another aspect of this specification provides a method for training a user portrait classifier, where the classifier is a chain classifier comprising a first classifier and a second classifier, the first classifier having been obtained by the above training method, and the user portrait is a multi-label user portrait. The method includes: after the first classifier has been trained, acquiring respective first feature vectors of a second group of users, the second group including at least one user who does not belong to the first group, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history; inputting the second group's first feature vectors into the first classifier to obtain their respective first-label predicted values; combining each user's first feature vector with the first-label predicted value to obtain the second group's respective second feature vectors; acquiring the values of the second group's respective second labels, each value corresponding to the user's second-label information, the second label being associated with the first label; and training the second classifier with the set of the second group's second feature vectors and second-label values as a third training set.

Another aspect of this specification provides a method for acquiring a multi-label user portrait, including: acquiring a user's first feature vector based on user information; inputting the first feature vector into the first classifier trained by the above training method to obtain the user's first-label predicted value as the value of the user's first label; combining the first feature vector with the value of the first label to obtain the user's second feature vector; and inputting the second feature vector into the second classifier trained by the above training method to obtain the user's second-label predicted value as the value of the user's second label.

In one embodiment, the above method further includes: after the user's first feature vector has been acquired, when the user information contains the first-label information, replacing the first-label predicted value with the corresponding preset value of that label information as the value of the user's first label. In one embodiment, the method further includes: after the user's second feature vector has been acquired, when the user information contains the second-label information, replacing the second-label predicted value with the corresponding preset value of that label information as the value of the user's second label.

Another aspect of this specification provides an apparatus for training a user portrait classifier, where the classifier is a chain classifier comprising a first classifier and a second classifier and the user portrait is a multi-label user portrait. The apparatus includes: a first acquiring unit configured to acquire respective first feature vectors of a first group of users, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history; a second acquiring unit configured to acquire the values of the users' respective first labels, each value corresponding to the user's first-label information; a first training unit configured to train the first classifier with the set of first feature vectors and first-label values as a first training set; a third acquiring unit configured to combine each user's first feature vector with the value of the first label to obtain the respective second feature vectors; a fourth acquiring unit configured to acquire the values of the users' respective second labels, each value corresponding to the user's second-label information, the second label being associated with the first label; and a second training unit configured to train the second classifier with the set of second feature vectors and second-label values as a second training set.

Another aspect of this specification provides an apparatus for training a user portrait classifier in which the first classifier has been obtained by the above training method. The apparatus includes: a fifth acquiring unit configured to acquire respective first feature vectors of a second group of users, the second group including at least one user who does not belong to the first group, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history; an input unit configured to input the second group's first feature vectors into the first classifier to obtain their respective first-label predicted values; a combining unit configured to combine each user's first feature vector with the first-label predicted value to obtain the second group's respective second feature vectors; a sixth acquiring unit configured to acquire the values of the second group's respective second labels, each value corresponding to the user's second-label information, the second label being associated with the first label; and a third training unit configured to train the second classifier with the set of the second group's second feature vectors and second-label values as a third training set.

Another aspect of this specification provides an apparatus for acquiring a multi-label user portrait, including: a first acquiring unit configured to acquire a user's first feature vector based on user information; a first input unit configured to input the first feature vector into the first classifier trained by the above training method and obtain the user's first-label predicted value as the value of the user's first label; a second acquiring unit configured to combine the first feature vector with the value of the first label to obtain the user's second feature vector; and a second input unit configured to input the second feature vector into the second classifier trained by the above training method and obtain the user's second-label predicted value as the value of the user's second label.

With the above solution for acquiring a multi-label user portrait according to the embodiments of this specification, the learning of each label of the user portrait is more accurate and reliable, and the acquired multi-label user portrait is more precise.
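The second-group training described above (using the trained first classifier's predictions λ1' to build the second feature vectors of users whose first label is missing) can be sketched as follows. This is a minimal illustration: `predict_one` is an assumed interface of the trained first classifier, not an API named in the specification, and the data shapes are assumptions.

```python
def build_third_training_set(first_classifier, first_vectors, second_labels):
    """For each second-group user, predict the first label λ1' with the
    already-trained first classifier, append it to x1' to form x2', and
    pair x2' with the user's known second-label value."""
    training_set = []
    for x1, label2 in zip(first_vectors, second_labels):
        label1_pred = first_classifier.predict_one(x1)  # λ1' (assumed interface)
        x2 = list(x1) + [label1_pred]                   # x2' = x1' + [λ1']
        training_set.append((x2, label2))
    return training_set
```

The resulting list of (x2', λ2) pairs is exactly the third training set described above, ready to be fed to the second classifier.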

The embodiments of the present specification are described below with reference to the drawings.

FIG. 1 is a schematic diagram of a system 100 according to an embodiment of this specification. As shown in FIG. 1, the system 100 includes a classifier chain 11. In one embodiment, the classifier chain 11 includes multiple classifiers Cj, j = 1 … n, each classifier Cj corresponding to one label of the user; these n classifiers are connected in series to form a chain. A classifier Cj may be based on one of decision trees, naive Bayes, support vector machines, association rule learning, neural networks, or genetic algorithms, and the n classifiers Cj may be based on the same algorithm or on different algorithms.

In one embodiment, as shown in FIG. 1, the classifier chain 11 includes four classifiers C1, C2, C3, and C4: for example, C1 corresponds to the gender label, C2 to the age label, C3 to the purchase preference label, and C4 to the purchasing power label.

When training the classifier chain 11, a first training set t1 is first input to classifier C1; t1 comprises feature vectors x1 corresponding to each user's information together with each user's label value λ1. When C1 is a gender classifier, λ1 corresponds to the user's gender, and training C1 on t1 yields the classifier for the gender label. Next, training set t2 is input to classifier C2. As shown in the figure, t2 comprises feature vectors x2 and label values λ2; when C2 is an age classifier, λ2 corresponds to the user's age group. Each feature vector x2 consists of the corresponding x1 plus the user's label value λ1, i.e., the value corresponding to the user's gender. Training C2 on t2 thus associates the classification of the user's age with the user's gender label information. The later classifiers C3 and C4 are trained in the same way: the feature vector x3 in t3 includes x2 and λ2, and the feature vector x4 in t4 includes x3 and λ3, so that the user's labels are linked with one another, which makes the learning of the sample labels more accurate and reliable. For example, when C3 is a purchase preference classifier, the label λ3 corresponds to the user's purchase preference, and the feature vector x3 input to C3 includes the label value λ2 (the user's age label) in addition to the feature vector x2 used for C2.

After all four classifiers C1-C4 have been trained, the classifier chain 11 constitutes a multi-label classification model that can be used to classify users with unknown labels. As shown in FIG. 1, the initial information of such a user is input to C1 in the form of a feature vector x1', and C1 classifies the user information to obtain the predicted value λ1' of the user's gender label. C1 passes x1' and λ1' to C2, which classifies based on them to obtain the predicted age label λ2'. In the same way, C3 receives the feature vector x2' and λ2' from C2 and classifies based on them to obtain the purchase preference prediction λ3', and C4 receives x3' and λ3' from C3 and classifies based on them to obtain the purchasing power prediction λ4', yielding the user portrait label set {λ1', λ2', λ3', λ4'}.

The method of training the chain classifier and the method of acquiring the multi-label user portrait according to the embodiments of this specification are described below with reference to concrete examples.

FIG. 2 shows a method for training a user portrait classifier according to an embodiment of this specification, where the classifier is a chain classifier comprising a first classifier and a second classifier and the user portrait is a multi-label user portrait. The method includes: in step S21, acquiring respective first feature vectors of a first group of users, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history; in step S22, acquiring the values of the users' respective first labels, each value corresponding to the user's first-label information; in step S23, training the first classifier with the set of first feature vectors and first-label values as a first training set; in step S24, combining each user's first feature vector with the value of the first label to obtain the respective second feature vectors; in step S25, acquiring the values of the users' respective second labels, each value corresponding to the user's second-label information, the second label being associated with the first label; and in step S26, training the second classifier with the set of second feature vectors and second-label values as a second training set.
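The chain training and prediction of FIG. 1 can be sketched as a toy implementation. Here a simple nearest-centroid classifier stands in for the decision-tree / SVM / neural-network base learners named above, purely for illustration; the key point is that each later classifier is trained and queried on the feature vector with the earlier label values appended.

```python
class CentroidClassifier:
    """Toy base learner: predicts the class of the nearest class centroid."""
    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            if label not in sums:
                sums[label], counts[label] = list(x), 1
            else:
                sums[label] = [a + b for a, b in zip(sums[label], x)]
                counts[label] += 1
        self.centroids = {lab: [v / counts[lab] for v in s]
                          for lab, s in sums.items()}
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: dist2(self.centroids[lab]))


class ClassifierChain:
    """One classifier per label; each later classifier sees the earlier
    label values appended to the feature vector (x2 = x1 + [λ1], ...)."""
    def __init__(self, n_labels):
        self.models = [CentroidClassifier() for _ in range(n_labels)]

    def fit(self, X, Y):
        X_aug = [list(x) for x in X]          # x1 for every user
        for j, model in enumerate(self.models):
            model.fit(X_aug, [y[j] for y in Y])
            for x, y in zip(X_aug, Y):
                x.append(y[j])                # append true λj to build x(j+1)
        return self

    def predict(self, x):
        x_aug, labels = list(x), []
        for model in self.models:
            lab = model.predict_one(x_aug)
            labels.append(lab)
            x_aug.append(lab)                 # feed λj' forward along the chain
        return labels                         # the portrait label set
```

A two-label run (a single assumed feature, labels standing in for gender and an age group) shows the second prediction conditioning on the first: `ClassifierChain(2).fit(X, Y).predict(x1)` returns the whole label list.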
First, in step S21, the respective first feature vectors of the first group of users are acquired, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history. The first group includes many users, for example on the order of tens of thousands. The first feature vector is a column vector whose elements correspond to the values of the user's information fields. The user information may include the user's original registration information, such as mobile phone, e-mail, and city. It may also include the user's operation history, such as search and click records, which may in turn include product descriptions (category, price, whether discounted), product advertisements, promotions, and so on. The user information may further include user label information, such as gender and age.

After the user information is acquired, each relevant item is converted into a corresponding numeric value, and these values are assembled into a feature vector. For example, a city name in the user information can be converted into a preset number, e.g., 1 for Beijing and 2 for Shanghai. To learn the classification of users accurately, the user information generally includes the user's operation history over a period of time, such as search and click records over six months, three months, or one month. In one embodiment, the user information is the user's initial information, i.e., the registration information plus the operation history.

Then, in step S22, the values of the first group's respective first labels are acquired, each value corresponding to the user's first-label information. The first label corresponds to the first classifier: if the first classifier classifies the user's gender, the first label is gender; in another embodiment the first classifier classifies the user's age, and the first label is age. For labels such as gender and age, the label information may have been registered by the user or obtained directly from a previous model's classification of the user. For labels such as purchase preference and purchasing power, the label information can be obtained from a previous model's classification of the user.

In one embodiment, the first classifier is classifier C1 of FIG. 1, e.g., a gender classifier, so the first-label value λ1 corresponds to the user's gender information. For example, female may be preset to correspond to 0 and male to 1, so that λ1 = 0 indicates a female gender label and λ1 = 1 a male one.

In step S23, the first classifier is trained with the set of the first group's first feature vectors and first-label values as the first training set. In one embodiment, the first classifier may be any of classifiers C1, C2, C3 of FIG. 1, its training set comprising the users' respective first feature vectors xj and first-label values λj (j = 1, 2, 3). In one embodiment, the first classifier is C1 of FIG. 1, e.g., a classifier of the user's gender: the feature vector x1 is built from the user's original registration information and click records, the value corresponding to the user's gender (true gender, or gender predicted by a previous model) serves as the first-label value λ1, and C1 is trained on the set of many users' feature vectors x1 and label values λ1, so that C1 can classify the gender of users.

In step S24, each user's first feature vector is combined with the value of the first label to obtain the user's second feature vector; that is, the first-label value is added to the first feature vector as one more element. In one embodiment, when C1 is a gender classifier, the gender label value λ1 is added to the feature vector x1 as an element, for use in training classifier C2.

In step S25, the values of the first group's respective second labels are acquired, each value corresponding to the user's second-label information. The second label corresponds to the second classifier; for example, if the second classifier is a purchase preference classifier, the second label is the user's purchase preference. In one embodiment, the second classifier is C2 of FIG. 1, e.g., an age classifier, so the second label is the user's age. For example, the value λ2 may be preset to correspond to age groups: λ2 = 1 for ages 5-10, λ2 = 2 for 10-20, λ2 = 3 for 20-30, and so on. The information corresponding to the second-label value (age information) is obtained in the same way as the first-label information described above and is not repeated here.

In step S26, the second classifier is trained with the set of the first group's second feature vectors and second-label values as the second training set. In one embodiment, the second classifier may be any of classifiers C2, C3, C4 of FIG. 1, its training set comprising the users' respective second feature vectors xj and second-label values λj, where xj includes the previous classifier's label value λj-1 (j = 2, 3, 4).

In one embodiment, the second classifier is C2 of FIG. 1, e.g., a classifier of the user's age. The feature vector x2 is obtained by adding the gender element (λ1) to x1, the value corresponding to the user's age (λ2) serves as the second-label value, and C2 is trained on the set of many users' feature vectors x2 and label values λ2, so that C2 can classify the age of users. The training of C2 (the age classifier) is thereby associated with the gender label.

In one embodiment, the chain classifier further includes classifier C3 of FIG. 1, e.g., a classifier of the user's purchase preference, so that the user label corresponding to C3 is purchase preference. The label value λ3 can be assigned according to the application: for example, based on the purchase characteristics of different groups, purchase preferences may be divided into daily necessities, electronics, luxury goods, school supplies, and so on, each class corresponding to a preset number, e.g., 1 for daily necessities and 2 for electronics, so that λ3 = 1 indicates that the user's purchase preference is daily necessities. When training C3, the feature vector x3 is obtained by adding the age element (λ2) to C2's feature vector x2, the user's purchase preference information is acquired to give λ3, and C3 is trained on the set of many users' feature vectors x3 and label values λ3, so that C3 can classify purchase preference. This training associates C3 with the label of C2 (age), and, since x2 already includes the label of C1 (gender), with the gender label as well. In practice a user's purchase preference is clearly related to gender and age, so the training method of the embodiments of this specification makes full use of the user information and makes the prediction of the multi-label user portrait more accurate.

In one embodiment, the chain classifier further includes classifier C4 of FIG. 1, e.g., a classifier of the user's purchasing power, so that the user label corresponding to C4 is purchasing power. λ4 can be assigned per application: for example, purchasing power may be divided into low, medium, relatively high, and high, each class corresponding to a preset number (1 for low, 2 for medium, and so on), so that λ4 = 2 indicates medium purchasing power. When training C4, the feature vector x4 is obtained by adding the purchase-preference element (λ3) to C3's feature vector x3, the user's purchasing power information is acquired to give λ4, and C4 is trained on the set of many users' feature vectors x4 and label values λ4, so that C4 can classify purchasing power. This training is associated with the user's initial information, gender, age, and purchase preference, which makes full use of the user information and makes the prediction of the multi-label user portrait more accurate.

In one embodiment, when training a chain of multiple classifiers, the learning order of the labels is determined by how difficult each label is to learn: easier labels are learned first and harder labels later. For example, in the chain of C1-C4 above, the gender label has only two classes and is relatively easy to learn, so the gender classifier is placed first as C1. The age label has few classes and is also relatively easy to determine, so it is placed at C2. Purchase preference has more class options, is harder to determine, and is also related to both gender and age, so the purchase preference classifier is placed at C3. Purchasing power is related to gender, age, and purchase preference, so the purchasing power label is placed at C4.

In one embodiment, part of the label information of some users is missing. For example, the gender label information of a second group of users is missing, where the second group includes at least one user who does not belong to the first group. In this case, after the gender classifier C1 has been trained on the first group's feature vectors x1 and gender label values λ1 as above, the second group's feature vectors x1' are input to C1 to obtain their respective predicted gender labels λ1'. Each λ1' is added to the corresponding x1' as an element, giving the second group's respective feature vectors x2'. The age classifier C2 can then be trained with the set of the second group's feature vectors x2' and age label values λ2 as the training set, and the second-group samples containing the predicted gender labels λ1' can likewise be used to train the subsequent purchase preference classifier C3, purchasing power classifier C4, and so on.

In one embodiment, the classifier chain 11 includes multiple classifiers Cj, j = 1 … n, each corresponding to one user label, connected in series to form a chain. As in the embodiments above, the training of each classifier Cj is associated with the label values corresponding to its predecessors C1, C2, …, Cj-1, which makes full use of the user information and makes the prediction of the multi-label user portrait more accurate.
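The construction of the first and second feature vectors in steps S21 and S24 can be sketched as follows. The field names and the city encoding here are assumptions for illustration (the text names city, search records, and click records as possible fields but fixes no schema).

```python
# Illustrative preset mapping from city name to a number, as described in
# the text (1 for Beijing, 2 for Shanghai, ...). Unknown cities map to 0
# here by assumption.
CITY_CODES = {"Beijing": 1, "Shanghai": 2}

def first_feature_vector(user):
    """S21: build x1 from registration info and operation history.
    The dict keys are assumed field names, not from the patent."""
    return [
        CITY_CODES.get(user["city"], 0),  # registration info: city as a preset number
        user["searches_30d"],             # operation history: search count
        user["clicks_30d"],               # operation history: click count
    ]

def second_feature_vector(x1, label1_value):
    """S24: the first-label value λ1 becomes one more element of the vector."""
    return x1 + [label1_value]
```

For a user registered in Shanghai, `first_feature_vector` yields a purely numeric x1, and `second_feature_vector(x1, λ1)` yields the x2 that trains the second classifier.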
FIG. 3 shows a method for acquiring a multi-label user portrait according to an embodiment of this specification, including: in step S31, acquiring the user's first feature vector based on user information; in step S32, inputting the first feature vector into the first classifier trained by the above training method to obtain the user's first-label predicted value as the value of the user's first label; in step S33, combining the first feature vector with the value of the first label to obtain the user's second feature vector; and in step S34, inputting the second feature vector into the second classifier trained by the above training method to obtain the user's second-label predicted value as the value of the user's second label.

For example, the first classifier is the gender classifier C1 above and the second classifier is the age classifier C2. First, based on the user information, i.e., the registration information and operation history, the feature vector x1 corresponding to C1 is acquired. x1 is input to C1 to obtain the predicted gender label λ1', which serves as the gender label value λ1. x1 is combined with λ1, i.e., λ1 is added to x1 as an element, giving the user's feature vector x2. x2 is input to the age classifier C2 to obtain the predicted age label λ2', which serves as the age label value λ2.

In one embodiment, the obtained age label value λ2 can in turn be added to x2 as an element, giving the feature vector x3, which is input to the purchase preference classifier C3 to obtain the predicted purchase preference label λ3' as the user's purchase preference label value λ3. In one embodiment, λ3' can likewise be added to x3 as an element, giving the feature vector x4, which is input to the purchasing power classifier C4 to obtain the predicted purchasing power label λ4' as the user's purchasing power label value λ4.

Thus, by the method of acquiring a multi-label user portrait according to the embodiments of this specification, the user portrait label set {λ1', λ2', λ3', λ4'} can be obtained. In this set, the predicted age label λ2' depends on the user's initial information x1 and the gender label value λ1; the predicted purchase preference label λ3' depends on x1, λ1, and λ2; and the predicted purchasing power label λ4' depends on x1, λ1, λ2, and λ3. The relationships among the user's labels are therefore fully taken into account when predicting them.

In one embodiment, the user's initial information may already contain part of the label information; for example, the registration information may include age and gender. In that case, the corresponding preset value of the label information replaces the predicted label value as the user's label value. For instance, when the registration information includes the user's age, the corresponding preset value of that age group replaces the predicted age in the user portrait label set as the user's age label value.

FIG. 4 shows an apparatus 400 for training a user portrait classifier according to an embodiment of this specification, where the classifier is a chain classifier comprising a first classifier and a second classifier and the user portrait is a multi-label user portrait. The apparatus 400 includes: a first acquiring unit 41 configured to acquire respective first feature vectors of a first group of users, each first feature vector corresponding to a user's information, the information including the user's registration information and operation history; a second acquiring unit 42 configured to acquire the values of the users' respective first labels, each value corresponding to the user's first-label information; a first training unit 43 configured to train the first classifier with the set of first feature vectors and first-label values as a first training set; a third acquiring unit 44 configured to combine each user's first feature vector with the value of the first label to obtain the respective second feature vectors; a fourth acquiring unit 45 configured to acquire the values of the users' respective second labels, each value corresponding to the user's second-label information, the second label being associated with the first label; and a second training unit 46 configured to train the second classifier with the set of second feature vectors and second-label values as a second training set.
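The prediction flow of FIG. 3, including the preset-value override described above, can be sketched as follows. Classifiers are assumed to expose a `predict_one` method as in a toy implementation; the names and the `known_labels` mapping are illustrative, not from the patent.

```python
def predict_portrait(classifiers, x1, known_labels=None):
    """Walk the chain: each classifier's label value is appended to the
    feature vector before the next classifier runs. When the user
    information already contains a label, its preset value is used in
    place of the classifier's prediction."""
    known_labels = known_labels or {}  # {label index: preset value}
    x, portrait = list(x1), []
    for j, clf in enumerate(classifiers):
        value = known_labels.get(j)
        if value is None:
            value = clf.predict_one(x)  # λj'
        portrait.append(value)
        x.append(value)                 # the label value joins the next input
    return portrait                     # {λ1', λ2', ...}
```

Calling `predict_portrait([C1, C2, C3, C4], x1)` would yield the full portrait label set, while passing e.g. `known_labels={0: preset_gender}` substitutes a registered gender for the gender prediction.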
在一個實施例中提供一種訓練使用者肖像分類器的裝置,所述分類器是鏈式分類器,其包括第一分類器和第二分類器,其中所述第一分類器是通過上述訓練方法訓練獲得的第一分類器,所述使用者肖像為多標籤使用者肖像,所述裝置包括:第五獲取單元,配置為,在訓練第一分類器之後,獲取第二組使用者的各自的第一特徵向量,所述第二組使用者包括不屬於所述第一組使用者的至少一個使用者,所述第一特徵向量對應於使用者的資訊,所述資訊包括使用者的註冊資訊、以及使用者的操作歷史資訊;輸入單元,配置為,將所述第二組使用者的各自的第一特徵向量輸入所述第一分類器,以獲取所述第二組使用者的各自的第一標籤預測值,作為其第一標籤的值;組合單元,配置為,將所述第二組使用者中每個使用者的第一特徵向量和第一標籤的值組合,以獲取第二組使用者各自的第二特徵向量;第六獲取單元,配置為,獲取第二組使用者各自的第二標籤的值,所述第二標籤的值對應於使用者的第二標籤資訊,並且使用者的第二標籤與使用者的第一標籤相關聯;以及第三訓練單元,配置為,以所述第二組使用者各自的第二特徵向量和第二標籤的值的集合作為第三訓練集,訓練所述第二分類器。 圖5示出了根據本說明書實施例的獲取多標籤使用者肖像的裝置500,包括:第一獲取單元51,配置為,基於使用者資訊獲取使用者的第一特徵向量;第一輸入單元52,配置為,將所述第一特徵向量輸入通過上述訓練方法訓練獲得的第一分類器,獲得所述使用者的第一標籤預測值,作為所述使用者的第一標籤的值;第二獲取單元53,配置為,將所述第一特徵向量與所述第一標籤的值組合,以獲取所述使用者的第二特徵向量;以及第二輸入單元54,配置為,將所述第二特徵向量輸入通過上述訓練方法訓練獲得的第二分類器,獲得所述使用者的第二標籤預測值,作為所述使用者的第二標籤的值。 在一個實施例中,所述獲取多標籤使用者肖像的裝置還包括第三獲取單元,配置為,在獲取使用者的第一特徵向量之後,以所述第一標籤資訊對應的預設值替換所述第一標籤預測值,作為所述使用者的第一標籤的值。 在一個實施例中,所述獲取多標籤使用者肖像的裝置還包括第四獲取單元,配置為,在獲取所述使用者的第二特徵向量之後,以所述第二標籤資訊對應的預設值替換所述第二標籤預測值,作為所述使用者的第一標籤的值。 通過根據本說明書實施例的用於獲取多標籤使用者肖像的上述方案,使得在鏈式的多個分類器之間傳遞使用者的標籤資訊,考慮了使用者各個標籤之間的關聯性,使得對於使用者肖像的各標籤的學習更為準確可靠,也使得獲取的多標籤使用者肖像更加精確。 本領域普通技術人員應該還可以進一步意識到,結合本文中所揭露的實施例描述的各示例的單元及演算法步驟,能夠以電子硬體、電腦軟體或者二者的結合來實現,為了清楚地說明硬體和軟體的可互換性,在上述說明中已經按照功能一般性地描述了各示例的組成及步驟。這些功能究竟以硬體還是軟體方式來執軌道,取決於技術方案的特定應用和設計約束條件。本領域普通技術人員可以對每個特定的應用來使用不同方法來實現所描述的功能,但是這種實現不應認為超出本發明的範圍。 結合本文中所揭露的實施例描述的方法或演算法的步驟可以用硬體、處理器執軌道的軟體模組,或者二者的結合來實施。軟體模組可以置於隨機存取記憶體(RAM)、內部記憶體、唯讀記憶體(ROM)、電可程式化ROM、電可抹除可程式化ROM、暫存器、硬碟、可移動式磁碟、CD-ROM、或技術領域內所公知的任意其它形式的儲存媒體中。 以上所述的具體實施方式,對本發明的目的、技術方案和有益效果進行了進一步詳細說明,所應理解的是,以上所述僅為本發明的具體實施方式而已,並不用於限定本發明的保護範圍,凡在本發明的精神和原則之內,所做的任何修改、等同替換、改進等,均應包含在本發明的保護範圍之內。The embodiments of the present specification will be described below with reference to the drawings. FIG. 1 shows a schematic diagram of a system 100 according to an embodiment of this specification. As shown in FIG. 1, the system 100 includes a classifier chain 11. In one embodiment, the classifier chain 11 includes multiple classifiers C j , j = 1... 
N, each classifier C j corresponds to a label of the user, and these n classifiers are connected in series to form a chain. The classifier C j may be based on one of decision tree, simple Bayesian, support vector machine, association rule learning, neural network, genetic algorithm. The n classifiers C j may be based on the same algorithm, It can also be based on different algorithms. In one embodiment, as shown in FIG. 1, the classifier chain 11 includes four classifiers C1, C2, C3, and C4. For example, the classifier C1 is a classifier corresponding to gender tags, the classifier C2 is a classifier corresponding to age tags, the classifier C3 is a classifier corresponding to purchase preference tags, and the classifier C4 is a classification corresponding to purchase ability Device. When training the classifier chain 11, first, a first training set t1 is input to the classifier C1. The training set t1 includes a plurality of feature vectors x 1 corresponding to the information of each user and the tag value λ 1 of each user. In the case where C1 is a gender classifier, the tag value λ 1 corresponds to the user's gender. Train C1 at t1 to obtain the classifier C1 corresponding to the gender label. After that, the training set t2 is input to the classifier C2. As shown in the figure, the training set t2 includes a plurality of feature vectors x 2 corresponding to the information of each user and the label value λ 2 of each user. In the case where C2 is an age classifier, the label value λ 2 corresponds to the user's age group. In addition to the above feature vector x 1 , the feature vector x 2 also includes the tag value λ 1 of each user, that is, a value corresponding to different genders. Training C2 with the training set t2 makes it possible to associate the classification of the user's age with the user's gender label information. 
When training the following classifiers C3 and C4, train in the same way as the training C2, that is, the feature vector x 3 in t3 includes x 2 and λ 2 , and the feature vector x 4 in t 4 includes x 3 And λ 3 to associate the user's tags. This makes the learning of sample labels more accurate and reliable. For example, in the case where C3 is a purchase preference classifier, the tag λ 3 corresponds to the user's purchase preference, and the feature vector x 3 of the input C3 includes the tag value λ 2 in addition to the feature vector x 2 in C2, namely User age tag value. After all four classifiers C1-C4 are trained, that is, the classifier chain 11 is trained as a multi-label classification model, which can be used to classify users with unknown labels. As shown in FIG. 1, by inputting the initial information of the user of the unknown label into C1 in the form of a feature vector x 1 ′, the user information is classified by C1 to obtain the predicted value λ 1 ′ of the user’s gender label. C1 inputs the user information x 1 ′ and λ 1 ′ to C2, so that C2 classifies based on the user information x 1 ′ and λ 1 ′ to obtain the predicted value of the user’s age label λ 2 ′. Then, in the same way as in C2, the classifier C3 will receive its feature vectors x 2 'and λ 2 ' from the previous classifier C2, so as to classify based on x 2 'and λ 2 ', and obtain the predicted value of the purchase preference label λ 3 '. The classifier C4 will receive its feature vectors x 3 'and λ 3 ' from the previous classifier C3, so as to classify based on x 3 'and λ 3 ', and obtain the predicted value of the purchasing power label λ 4 ', which can obtain the user's portrait Label set {λ 1 ', λ 2 ', λ 3 ', λ 4 '}. The method of training the chain classifier according to the embodiment of the present specification and the method of acquiring the multi-label user portrait are described below with reference to specific examples of the present specification. FIG. 
2 shows a method for training a user portrait classifier according to an embodiment of the present specification. The classifier is a chain classifier, which includes a first classifier and a second classifier. Portrait of label user. The method includes: in step S21, acquiring respective first feature vectors of the first group of users, the first feature vector corresponding to the user's information, the information includes the user's registration information, and the user's Operation history information; in step S22, the value of each first label of the first group of users is obtained, the value of the first label corresponds to the user's first label information; in step S23, the first The first feature vector of each group of users and the value of the first label are used as the first training set to train the first classifier; in step S24, the first feature vector of each of the first group of users and the first Combination of tag values to obtain respective second feature vectors of the first group of users; in step S25, obtain values of respective second tags of the first group of users, the values of the second tags corresponding to The user's second label information, and the user's second label is associated with the user's first label; and in step S26, the values of the second feature vector and the second label of the first group of users Is used as the second training set to train the second classifier. First, in step S21, the respective first feature vectors of the first group of users are obtained. The first feature vectors correspond to the user's information. The information includes the user's registration information and the user's operation history information. . The first group of users includes multiple users, for example, tens of thousands of users. The first feature vector is a column vector, the elements of which correspond to the values of various information fields of the user. 
The user information may include the user's original registration information, such as mobile phone number, e-mail address, and city. It may also include the user's operation history, such as search and click records, which may cover, for example, product description information (product category, price, and whether the price has been reduced), product advertisements, and promotional activities. User information may also include user label information, such as gender and age. After the user information is acquired, the corresponding information is converted into numerical form, and these values are composed into a feature vector. For example, a city name in the user information can be converted into a preset corresponding number, such as 1 for Beijing and 2 for Shanghai. To classify users accurately, the user information generally includes the user's operation history over a period of time, such as search and click records over the past half year, three months, or one month. In one embodiment, the user information is the user's initial information, which includes the user's registration information and operation history information. Then, in step S22, the value of the respective first label of each user of the first group is acquired; the value of the first label corresponds to the user's first label information, and the first label corresponds to the first classifier. For example, if the first classifier classifies the gender of the user, the first label is the user's gender. In another embodiment, the first classifier classifies the user's age, and the first label is the user's age. For some labels, such as gender and age, the label information may be registered by the user himself, or may be obtained directly from the rating of the user by a previous model. 
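As a minimal sketch of the conversion described above, the following turns a user record into a numeric feature vector. The field names, the city-to-number mapping, and all values are illustrative assumptions, not part of the patent disclosure:

```python
# Hypothetical sketch: composing user info into a numeric feature vector x1.
CITY_CODES = {"Beijing": 1, "Shanghai": 2}          # preset city numbering

def build_feature_vector(user):
    """Compose registration info and operation history into one vector."""
    return [
        CITY_CODES.get(user["city"], 0),            # city as preset number
        user["clicks_30d"],                         # clicks in the last month
        user["searches_90d"],                       # searches in three months
        1 if user["discount_clicks"] > 0 else 0,    # clicked reduced-price items?
    ]

x1 = build_feature_vector(
    {"city": "Shanghai", "clicks_30d": 42, "searches_90d": 17, "discount_clicks": 3}
)
print(x1)  # [2, 42, 17, 1]
```

A real deployment would use far more fields; the point is only that every field is mapped to a number before being concatenated.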
For some labels, such as purchase preference and purchasing power, the label information can be obtained from the ratings of users produced by previous models. In one embodiment, the first classifier is the classifier C1 shown in FIG. 1, and C1 is, for example, a gender classifier. The value λ1 of the first label is therefore a value corresponding to the user's gender information. For example, female may be preset to correspond to the number 0 and male to the number 1, so that λ1=0 means the gender label is female and λ1=1 means the gender label is male. In step S23, the first classifier is trained using the set of the first feature vectors and first label values of the first group of users as the first training set. In one embodiment, the first classifier may be any one of the classifiers C1, C2, and C3 shown in FIG. 1, whose training set includes the respective first feature vectors xj of multiple users and the value λj (j=1, 2, 3) of each user's first label. In one embodiment, the first classifier is the classifier C1 in FIG. 1, which classifies, for example, the gender of the user. The user's feature vector x1 can be created based on the user's original registration information and click records, and the value corresponding to the user's gender (real gender, or gender predicted by a previous model) is used as the value of the first label λ1. The classifier C1 is trained with the set of feature vectors x1 and label values λ1 of multiple users, so that it can be used to classify users by gender. In step S24, each user's first feature vector is combined with the value of the first label to obtain the respective second feature vectors of the first group of users. That is, the value of the first label is added as an element to the first feature vector, thereby obtaining the second feature vector. 
In one embodiment, in the case where the classifier C1 is a gender classifier, the user's gender label value λ1 is added as an element to the feature vector x1 for the training of the classifier C2. In step S25, the value of the respective second label of each user of the first group is acquired; the value of the second label corresponds to the user's second label information, and the second label corresponds to the second classifier. For example, the second classifier may be a purchase preference classifier, in which case the second label is the user's purchase preference. In one embodiment, the second classifier is the classifier C2 in FIG. 1, for example an age classifier, so that the second label is a user age label. The value λ2 of the second label can be preset to correspond to several age groups of the user: for example, λ2=1 may correspond to ages 5-10, λ2=2 to ages 10-20, λ2=3 to ages 20-30, and so on. The acquisition of the information corresponding to the value of the second label (i.e., age information) is similar to the acquisition of the first label information described above and is not repeated here. In step S26, the second classifier is trained using the set of the second feature vectors and second label values of the first group of users as the second training set. In one embodiment, the second classifier may be any one of the classifiers C2, C3, and C4 shown in FIG. 1, whose respective training sets include the respective second feature vectors xj of multiple users and each user's second label value λj, where the second feature vector xj includes the label value λj-1 corresponding to the preceding classifier, j=2, 3, 4. In one embodiment, the second classifier is the classifier C2 in FIG. 1, which classifies, for example, the age of the user. 
By adding the element corresponding to the user's gender (i.e., λ1) to the feature vector x1, the user's feature vector x2 is obtained, and the value corresponding to the user's age (i.e., λ2) is used as the value of the second label. The classifier C2 is trained with the set of feature vectors x2 and label values λ2 of multiple users, so that C2 can be used to classify users by age. Thus, the training of the classifier C2 (i.e., the age classifier) is associated with the gender label. In one embodiment, the above chain classifier further includes a classifier C3 as shown in FIG. 1. The classifier C3 is, for example, a classifier that classifies a user's purchase preference, so the user label corresponding to C3 is the user's purchase preference. The value of the purchase preference label λ3 can be assigned according to the actual application. For example, according to the purchasing characteristics of different groups of people, purchase preferences can be divided into daily necessities, electronic products, luxury goods, school supplies, and so on, and λ3 is assigned by mapping the different preference types to predetermined values: for example, daily necessities correspond to the number 1 and electronic products to the number 2, so that λ3=1 means the user's purchase preference is daily necessities. When training C3, the user's feature vector x3 is obtained by adding the element corresponding to age (λ2) to the feature vector x2 corresponding to the classifier C2, and the user's purchase preference information is acquired to obtain the label value λ3. The classifier C3 is trained with the set of feature vectors x3 and label values λ3 of multiple users, so that C3 can be used to classify the user's purchase preferences. In this training, the training of the classifier C3 is associated with the label (i.e., age) corresponding to the classifier C2. 
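The chained training described above can be sketched in a few lines. A toy 1-nearest-neighbour model stands in for whatever real classifier would be used, and the feature vectors and label values below are invented for illustration only:

```python
# Sketch of chained training: classifier Cj sees labels λ1..λ(j-1) as extra
# features. OneNN is a toy stand-in for any real classifier.
def dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

class OneNN:
    """Toy 1-nearest-neighbour classifier (illustrative only)."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict(self, x):
        return min(zip(self.X, self.y), key=lambda t: dist(t[0], x))[1]

def train_chain(X1, label_columns):
    """Train C1..Cn; before training Cj+1, append the λj column to X."""
    chain, X = [], [list(x) for x in X1]
    for labels in label_columns:                      # one column per label
        chain.append(OneNN().fit(X, labels))
        X = [x + [lam] for x, lam in zip(X, labels)]  # x_{j+1} = x_j + [λj]
    return chain

X1 = [[2, 40], [1, 5], [2, 35]]   # toy first feature vectors
genders = [1, 0, 1]               # λ1: gender label values
ages = [3, 1, 3]                  # λ2: age-bracket label values
chain = train_chain(X1, [genders, ages])
print(chain[1].predict([2, 38, 1]))  # age classifier sees the gender label: 3
```

Extending the loop with further label columns (purchase preference, purchasing power) yields C3 and C4 in exactly the same way.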
In addition, since the feature vector x2 corresponding to the classifier C2 includes the label (i.e., gender) corresponding to the classifier C1, the training of the classifier C3 is also associated with the gender label. In practice, the user's purchase preference is obviously related to gender and age; the training method according to the embodiment of the present specification therefore makes full use of the user information, making the prediction of multi-label user portraits more accurate. In one embodiment, the above chain classifier further includes a classifier C4 as shown in FIG. 1. The classifier C4 is, for example, a classifier that classifies the purchasing power of users, so the user label corresponding to C4 is the user's purchasing power. The value of the purchasing power label λ4 can be assigned according to the actual application. For example, purchasing power can be divided into low, medium, and high categories, and λ4 is assigned by mapping the different categories to predetermined values: for example, low corresponds to the number 1 and medium to the number 2, so that λ4=2 means the user's purchasing power is medium. When training C4, the user's feature vector x4 is obtained by adding the element corresponding to the purchase preference label value (λ3) to the feature vector x3 corresponding to the classifier C3, and the user's purchasing power information is acquired to obtain the label value λ4. The classifier C4 is trained with the set of feature vectors x4 and label values λ4 of multiple users, so that C4 can be used to classify the user's purchasing power. In this training, the training of the classifier C4 is associated with the user's initial information, gender, age, and purchase preference, thereby making full use of the user information and making the prediction of multi-label user portraits more accurate. 
In one embodiment, when training a chain classifier that includes multiple classifiers, the learning order of the labels is determined based on the difficulty of learning them: labels that are easy to learn are learned first, followed by the more difficult labels. For example, in the chain classifier including C1, C2, C3, and C4 above, the gender label has only two categories and is relatively easy to learn, so the gender classifier is placed in the position of C1, which is learned first. The age label has relatively few categories and is relatively easy to determine, so it is placed in the position of C2. Purchase preference has many categories, is less easy to determine, and is also related to the user's gender and age, so the purchase preference classifier is placed in the position of C3. The user's purchasing power is related to the user's gender, age, and purchase preference, so the purchasing power label is placed in the position of C4. In one embodiment, part of the acquired label information of some users is missing. For example, the gender label information of a second group of users is missing, where the second group includes at least one user who does not belong to the first group. In this case, after the gender classifier C1 has been trained using the feature vectors x1 and gender label values λ1 of the first group of users as above, the feature vector x1′ of each user of the second group is input into the classifier C1 to obtain the predicted value λ1′ of the gender label for the second group of users. The predicted value λ1′ is added as an element to the feature vector x1′, thereby obtaining the respective feature vectors x2′ of the second group of users. 
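The augmentation just described, filling a missing gender label with C1's prediction so the user can still serve as a training sample for later classifiers, can be sketched as follows. The stand-in classifier and all values are hypothetical:

```python
# Sketch: for users whose gender label is missing, the trained C1 predicts
# λ1', which is appended to x1' to form x2'. ConstantGender is an
# illustrative stand-in for a trained gender classifier.
class ConstantGender:
    def predict(self, x):
        return 1   # pretend C1 predicts the same gender for every input

def augment_with_predicted_label(c1, X1_prime):
    """x2' = x1' + [λ1'], where λ1' is predicted by the trained C1."""
    return [x + [c1.predict(x)] for x in X1_prime]

X2_prime = augment_with_predicted_label(ConstantGender(), [[2, 40], [1, 5]])
print(X2_prime)  # [[2, 40, 1], [1, 5, 1]]
```

The resulting x2′ vectors, together with the known age labels λ2, can then feed the training of C2.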
Afterwards, the age classifier C2 can be trained using the set of the respective feature vectors x2′ of the second group of users and the age label values λ2 as the training set. Moreover, the samples of the second group of users, which include the predicted gender label value λ1′, can also be used to train the subsequent purchase preference classifier C3, purchasing power classifier C4, and so on. In one embodiment, the classifier chain 11 includes multiple classifiers Cj, j=1...n, each classifier Cj corresponding to one label of the user, and these n classifiers are connected in series to form a chain. Similarly to the above embodiments, the training of each classifier Cj is associated with the label values corresponding to the classifiers C1, C2, ..., Cj-1, thereby making full use of the user information and making the prediction of multi-label user portraits more accurate. FIG. 3 shows a method for acquiring a multi-label user portrait according to an embodiment of the present specification, including: in step S31, acquiring a user's first feature vector based on user information; in step S32, inputting the first feature vector into the first classifier obtained by the training method described above to obtain the predicted value of the user's first label as the value of the user's first label; in step S33, combining the first feature vector with the value of the first label to obtain the user's second feature vector; and in step S34, inputting the second feature vector into the second classifier trained by the above training method to obtain the predicted value of the user's second label as the value of the user's second label. For example, the first classifier is the aforementioned gender classifier C1, and the second classifier is the aforementioned age classifier C2. 
First, a feature vector x1 corresponding to the classifier C1 is obtained based on the user information, that is, the user's registration information and operation history information. The feature vector x1 is input to the classifier C1 to obtain the predicted value λ1′ of the user's gender label as the gender label value λ1. The feature vector x1 is then combined with the predicted gender label value λ1, that is, λ1 is added as an element to x1, thereby obtaining the user's feature vector x2. The feature vector x2 is input to the above age classifier C2 to obtain the predicted value λ2′ of the user's age label as the age label value λ2. In one embodiment, the obtained age label value λ2 can further be added as an element to the feature vector x2 to obtain the user's feature vector x3, and x3 is input into the above purchase preference classifier C3 to obtain the predicted value λ3′ of the user's purchase preference label as the user's purchase preference label value λ3. In one embodiment, the obtained predicted purchase preference label value λ3′ can further be added as an element to the feature vector x3 to obtain the user's feature vector x4, and x4 is input into the above purchasing power classifier C4 to obtain the predicted value λ4′ of the user's purchasing power label as the user's purchasing power label value λ4. Thus, through the method of acquiring a multi-label user portrait according to an embodiment of the present specification, the label set {λ1′, λ2′, λ3′, λ4′} of the user portrait can be obtained. 
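The inference pass just described reduces to one loop: each classifier's predicted label is appended to the feature vector before the next classifier runs. The fixed-output lambdas below are purely illustrative stand-ins for the trained C1-C4:

```python
# Sketch of inference through a trained chain, yielding the portrait
# label set {λ1', λ2', λ3', λ4'}.
def predict_portrait(chain, x1):
    """Run x1 through C1..Cn, appending each predicted label λj'."""
    labels, x = [], list(x1)
    for clf in chain:
        lam = clf(x)          # λj' predicted from the current features
        labels.append(lam)
        x = x + [lam]         # the next classifier also sees λj'
    return labels

# Fixed-output stand-ins for trained C1..C4 (gender, age, preference, power).
chain = [lambda x: 1, lambda x: 3, lambda x: 2, lambda x: 2]
print(predict_portrait(chain, [2, 40]))  # [1, 3, 2, 2]
```

The returned list is exactly the portrait label set; each later prediction has seen every earlier one.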
In the user portrait label set {λ1′, λ2′, λ3′, λ4′}, the obtained age label predicted value λ2′ is associated with the user's initial information x1 and the gender label value λ1; the acquisition of the purchase preference label predicted value λ3′ is associated with the user's initial information x1, the gender label value λ1, and the age label value λ2; and the acquisition of the purchasing power label predicted value λ4′ is associated with the user's initial information x1, the gender label value λ1, the age label value λ2, and the purchase preference label value λ3. Therefore, when predicting a user's labels, the relationships among the user's labels are fully considered. In one embodiment, the user's initial information may include part of the user's label information; for example, the user's registration information may include age and gender. In this case, the corresponding preset value of the label information is used in place of the predicted label value as the user's label value. For example, where the user's registration information includes age, the preset value of the age group corresponding to that age is used in the user portrait label set in place of the predicted age value as the user's age label value. FIG. 4 shows an apparatus 400 for training a user portrait classifier according to an embodiment of the present specification. The classifier is a chain classifier that includes a first classifier and a second classifier, and the user portrait is a multi-label user portrait. 
The apparatus 400 includes: a first acquiring unit 41, configured to acquire the respective first feature vectors of a first group of users, each first feature vector corresponding to the user's information, the information including the user's registration information and operation history information; a second acquiring unit 42, configured to acquire the value of the respective first label of each user of the first group, the value of the first label corresponding to the user's first label information; a first training unit 43, configured to train the first classifier using the set of the first feature vectors and first label values of the first group of users as a first training set; a third acquiring unit 44, configured to combine the first feature vectors of the first group of users with the values of the first label to obtain the respective second feature vectors of the first group of users; a fourth acquiring unit 45, configured to acquire the value of the respective second label of each user of the first group, the value of the second label corresponding to the user's second label information, the user's second label being associated with the user's first label; and a second training unit 46, configured to train the second classifier using the set of the respective second feature vectors and second label values of the first group of users as a second training set. In one embodiment, an apparatus for training a user portrait classifier is provided, where the classifier is a chain classifier including a first classifier and a second classifier, the first classifier is a first classifier obtained through the above training method, and the user portrait is a multi-label user portrait. 
The apparatus includes: a fifth acquiring unit, configured to acquire, after the first classifier is trained, the respective first feature vectors of a second group of users, the second group including at least one user who does not belong to the first group, each first feature vector corresponding to the user's information, the information including the user's registration information and operation history information; an input unit, configured to input the respective first feature vectors of the second group of users into the first classifier to obtain the predicted value of each user's first label as the value of that first label; a combining unit, configured to combine the first feature vector of each user in the second group with the value of the first label to obtain the respective second feature vectors of the second group of users; a sixth acquiring unit, configured to acquire the value of the respective second label of each user of the second group, the value of the second label corresponding to the user's second label information, the user's second label being associated with the user's first label; and a third training unit, configured to train the second classifier using the set of the respective second feature vectors and second label values of the second group of users as a third training set. FIG. 
5 shows an apparatus 500 for acquiring a multi-label user portrait according to an embodiment of the present specification, including: a first acquiring unit 51, configured to acquire a user's first feature vector based on user information; a first input unit 52, configured to input the first feature vector into the first classifier trained by the above training method to obtain the predicted value of the user's first label as the value of the user's first label; a second acquiring unit 53, configured to combine the first feature vector with the value of the first label to obtain the user's second feature vector; and a second input unit 54, configured to input the second feature vector into the second classifier trained by the above training method to obtain the predicted value of the user's second label as the value of the user's second label. In one embodiment, the apparatus for acquiring a multi-label user portrait further includes a third acquiring unit, configured to replace, after the user's first feature vector is acquired, the predicted value of the first label with the preset value corresponding to the first label information as the value of the user's first label. In one embodiment, the apparatus further includes a fourth acquiring unit, configured to replace, after the user's second feature vector is acquired, the predicted value of the second label with the preset value corresponding to the second label information as the value of the user's second label. 
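The preset-value substitution performed by the third and fourth acquiring units amounts to a simple override rule. A minimal sketch, with hypothetical names and values:

```python
# Sketch: when the label information (e.g. registered age) is present in the
# user info, its preset value replaces the classifier's prediction.
def final_label_value(predicted, registered_preset=None):
    """Prefer the preset value derived from registration info, if present."""
    return predicted if registered_preset is None else registered_preset

print(final_label_value(3))     # no registered age: keep the prediction, 3
print(final_label_value(3, 2))  # registered age bracket 2 overrides: 2
```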
Through the above solution for acquiring a multi-label user portrait according to an embodiment of the present specification, the user's label information is passed between multiple chained classifiers, taking into account the relevance among the user's various labels, so that the learning of each label of the user portrait is more accurate and reliable, and the acquired multi-label user portrait is more accurate. Those of ordinary skill in the art should further be aware that the example units and algorithm steps described in conjunction with the embodiments disclosed herein can be implemented with electronic hardware, computer software, or a combination of the two; to clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention. The steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented by hardware, by a software module executed by a processor, or by a combination of both. Software modules may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. The specific embodiments described above further describe the purpose, technical solutions, and beneficial effects of the present invention in detail. 
It should be understood that the above are only specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present invention shall be included in the scope of protection of the present invention.

100 System; 21 Step; 22 Step; 23 Step; 24 Step; 25 Step; 26 Step; 31 Step; 32 Step; 33 Step; 34 Step; 400 Apparatus; 41 First acquiring unit; 42 Second acquiring unit; 43 First training unit; 44 Third acquiring unit; 45 Fourth acquiring unit; 46 Second training unit; 500 Apparatus; 51 First acquiring unit; 52 First input unit; 53 Second acquiring unit; 54 Second input unit

By describing the embodiments of the present specification with reference to the drawings, the embodiments can be made clearer: FIG. 1 shows a schematic diagram of a system 100 according to an embodiment of the present specification; FIG. 2 shows a method for training a chain classifier according to an embodiment of the present specification; FIG. 3 shows a method for acquiring a multi-label user portrait according to an embodiment of the present specification; FIG. 4 shows an apparatus 400 for training a chain classifier according to an embodiment of the present specification; and FIG. 5 shows an apparatus 500 for acquiring a multi-label user portrait according to an embodiment of the present specification.

Claims (16)

一種訓練使用者肖像分類器的方法,所述分類器是鏈式分類器,其包括第一分類器和第二分類器,所述使用者肖像為多標籤使用者肖像,所述方法包括:獲取第一組使用者的各自的第一特徵向量,所述第一特徵向量對應於使用者的資訊,所述資訊包括使用者的註冊資訊、以及使用者的操作歷史資訊,其中,上述獲取第一特徵向量包括:獲取所述使用者資訊,並通過將所述使用者的相應資訊轉換成對應的數值形式,從而將所述數值組成所述第一特徵向量;獲取所述第一組使用者各自的第一標籤的值,所述第一標籤對應於所述第一分類器且所述第一標籤的值對應於使用者的第一標籤資訊;以所述第一組使用者各自的第一特徵向量和第一標籤的值的集合作為第一訓練集,訓練第一分類器;將所述第一組使用者各自的第一特徵向量和第一標籤的值組合,以獲取所述第一組使用者各自的第二特徵向量,其中,上述組合第一特徵向量和第一標籤的值包括:將第一標籤的值作為一個元素加入到第一特徵向量中,從而獲得第二特徵向量;獲取所述第一組使用者各自的第二標籤的值,所述第二標籤對應於所述第二分類器且所述第二標籤的值對應於使用者的第二標籤資訊,並且使用者的第二標籤與使用者 的第一標籤相關聯;以及以所述第一組使用者各自的第二特徵向量和第二標籤的值的集合作為第二訓練集,訓練第二分類器,其中,在訓練所述包括多個分類器的鏈式分類器中,基於標籤學習的難易程度確定所述分類器的順序,以決定標籤的學習順序,所述標籤學習難易程度取決於所述學習標籤之分類數。 A method for training a user portrait classifier. The classifier is a chain classifier, which includes a first classifier and a second classifier. The user portrait is a multi-label user portrait. The method includes: acquiring The respective first feature vectors of the first group of users, the first feature vectors corresponding to the user's information, the information includes the user's registration information, and the user's operation history information, wherein The feature vector includes: acquiring the user information, and converting the corresponding information of the user into a corresponding numerical form, thereby composing the numerical value into the first feature vector; acquiring each of the first group of users The value of the first label of the first label corresponds to the first classifier and the value of the first label corresponds to the first label information of the user; The set of feature vectors and the values of the first label are used as the first training set to train the first classifier; the respective first feature vectors of the first group of users and the values of the first label are combined to obtain the first A second feature vector of each group of users, wherein the combination of the first feature vector and the value of the first label includes: adding the value of the first label as an element to the 
first feature vector, thereby obtaining the second feature vector; Acquiring the value of each second label of the first group of users, the second label corresponds to the second classifier and the value of the second label corresponds to the second label information of the user, and the user 'S second label and user Associated with the first label of the first group; and using the set of the second feature vectors of the first group of users and the value of the second label as the second training set to train the second classifier, wherein In the chain classifier of each classifier, the order of the classifiers is determined based on the difficulty of label learning to determine the learning order of labels, and the difficulty of label learning depends on the number of classifications of the learned labels. 根據請求項1所述的訓練使用者肖像分類器的方法,其中所述使用者的資訊包括使用者的標籤資訊。 The method for training a user portrait classifier according to claim 1, wherein the user information includes user tag information. 根據請求項1所述的訓練使用者肖像分類器的方法,其中所述第一標籤是年齡,所述第二標籤是購買偏好。 The method for training a user portrait classifier according to claim 1, wherein the first label is age and the second label is a purchase preference. 根據請求項1所述的訓練使用者肖像分類器的方法,其中所述第一標籤是購買偏好,所述第二標籤是購買能力。 The method for training a user portrait classifier according to claim 1, wherein the first label is a purchase preference and the second label is a purchase ability. 
一種訓練使用者肖像分類器的方法,所述分類器是鏈式分類器,其包括第一分類器和第二分類器,其中所述第一分類器是通過根據請求項1至4中任一項所述的方法訓練獲得的第一分類器,所述使用者肖像為多標籤使用者肖像,所述方法包括:獲取第二組使用者的各自的第一特徵向量,所述第二 組使用者包括不屬於所述第一組使用者的至少一個使用者,所述第一特徵向量對應於使用者的資訊,所述資訊包括使用者的註冊資訊、以及使用者的操作歷史資訊;將所述第二組使用者的各自的第一特徵向量輸入所述第一分類器,以獲取所述第二組使用者的各自的第一標籤預測值;將所述第二組使用者中每個使用者的第一特徵向量和第一標籤預測值組合,以獲取第二組使用者各自的第二特徵向量;獲取第二組使用者各自的第二標籤的值,所述第二標籤的值對應於使用者的第二標籤資訊,並且使用者的第二標籤與使用者的第一標籤相關聯;以及以所述第二組使用者各自的第二特徵向量和第二標籤的值的集合作為第三訓練集,訓練第二分類器。 A method for training a user portrait classifier, the classifier is a chain classifier, which includes a first classifier and a second classifier, wherein the first classifier is based on any one of the request items 1 to 4 The first classifier obtained by the method described in the item, the user portrait is a multi-label user portrait, the method includes: acquiring respective first feature vectors of the second group of users, the second The group of users includes at least one user who does not belong to the first group of users. The first feature vector corresponds to the user's information. The information includes the user's registration information and the user's operation history information; Input the respective first feature vectors of the second group of users into the first classifier to obtain the respective first label prediction values of the second group of users; The first feature vector of each user is combined with the predicted value of the first label to obtain the second feature vector of the second group of users; to obtain the value of the second label of the second group of users, the second label The value of corresponds to the user’s second label information, and the user’s second label is associated with the user’s first label; and the value of each second feature vector and second label of the second group of users Is used as the third training set to train the second classifier. 
一種獲取多標籤使用者肖像的方法,包括:基於使用者資訊獲取使用者的第一特徵向量;將所述第一特徵向量輸入根據請求項1至4中任一項所述的方法訓練獲得的第一分類器,獲得所述使用者的第一標籤預測值,作為所述使用者的第一標籤的值;將所述第一特徵向量與所述第一標籤的值組合,以獲取所述使用者的第二特徵向量;以及將所述第二特徵向量輸入根據請求項1至5中任一項所述的方法訓練獲得的第二分類器,獲得所述使用者的第二 標籤預測值,作為所述使用者的第二標籤的值。 A method for obtaining a multi-label user portrait, comprising: obtaining a user's first feature vector based on user information; and inputting the first feature vector into the method according to any one of request items 1 to 4 The first classifier obtains the predicted value of the first label of the user as the value of the first label of the user; combines the first feature vector with the value of the first label to obtain the The second feature vector of the user; and inputting the second feature vector into the second classifier trained according to the method described in any one of the request items 1 to 5, to obtain the second feature vector of the user The predicted value of the tag serves as the value of the second tag of the user. 根據請求項6所述的獲取多標籤使用者肖像的方法,還包括,在基於使用者資訊獲取使用者的第一特徵向量之後,在所述使用者資訊中包括所述第一標籤資訊的情況中,以所述第一標籤資訊的對應預設值替換所述第一標籤預測值,作為所述使用者的第一標籤的值。 The method for acquiring a multi-label user portrait according to claim 6, further comprising, after acquiring the first feature vector of the user based on the user information, including the first label information in the user information In which, the predicted value of the first tag is replaced with the corresponding preset value of the first tag information as the value of the first tag of the user. 
8. The method for obtaining a multi-label user portrait according to claim 6, further comprising: after obtaining the user's second feature vector, in the case where the user information includes the second-label information, replacing the second-label prediction value with the corresponding preset value of the second-label information as the value of the user's second label.

9. An apparatus for training a user portrait classifier, the classifier being a chain classifier comprising a first classifier and a second classifier, the user portrait being a multi-label user portrait, the apparatus comprising: a first acquisition unit configured to acquire respective first feature vectors of a first group of users, each first feature vector corresponding to the user's information, the information comprising the user's registration information and the user's operation history information, wherein acquiring the first feature vector comprises acquiring the user information and converting the user's corresponding information into numerical form, the numerical values composing the first feature vector; a second acquisition unit configured to acquire respective first-label values of the first group of users, the first label corresponding to the first classifier and the first-label value corresponding to the user's first-label information; a first training unit configured to train the first classifier with the set of the first group of users' first feature vectors and first-label values as a first training set; a third acquisition unit configured to combine each user's first feature vector with that user's first-label value to obtain respective second feature vectors of the first group of users, wherein combining the first feature vector and the first-label value comprises adding the first-label value as an element to the first feature vector to obtain the second feature vector; a fourth acquisition unit configured to acquire respective second-label values of the first group of users, the second label corresponding to the second classifier, the second-label value corresponding to the user's second-label information, and the user's second label being associated with the user's first label; and a second training unit configured to train the second classifier with the set of the first group of users' second feature vectors and second-label values as a second training set, wherein, in training the chain classifier comprising a plurality of classifiers, the order of the classifiers is determined based on how difficult each label is to learn, so as to determine the learning order of the labels, the difficulty of learning a label depending on the number of classes of that label.

10. The apparatus for training a user portrait classifier according to claim 9, wherein the user's information includes the user's label information.

11. The apparatus for training a user portrait classifier according to claim 9, wherein the first label is age and the second label is purchase preference.

12. The apparatus for training a user portrait classifier according to claim 9, wherein the first label is purchase preference and the second label is purchasing power.

13. An apparatus for training a user portrait classifier, the classifier being a chain classifier comprising a first classifier and a second classifier, wherein the first classifier is a first classifier trained according to the method of any one of claims 1 to 4, and the user portrait is a multi-label user portrait, the apparatus comprising: a fifth acquisition unit configured to acquire respective first feature vectors of a second group of users, the second group of users comprising at least one user who does not belong to the first group of users, each first feature vector corresponding to the user's information, the information comprising the user's registration information and the user's operation history information; an input unit configured to input the respective first feature vectors of the second group of users into the first classifier to obtain respective first-label prediction values for the second group of users; a combination unit configured to combine each user's first feature vector with that user's first-label prediction value to obtain respective second feature vectors for the second group of users; a sixth acquisition unit configured to acquire respective second-label values for the second group of users, the second-label value corresponding to the user's second-label information, the user's second label being associated with the user's first label; and a third training unit configured to train the second classifier with the set of the second group of users' second feature vectors and second-label values as a third training set.

14. An apparatus for obtaining a multi-label user portrait, comprising: a first acquisition unit configured to obtain a user's first feature vector based on user information; a first input unit configured to input the first feature vector into the first classifier trained according to the method of any one of claims 1 to 4 to obtain the user's first-label prediction value as the value of the user's first label; a second acquisition unit configured to combine the first feature vector with the value of the first label to obtain the user's second feature vector; and a second input unit configured to input the second feature vector into the second classifier trained according to the method of any one of claims 1 to 5 to obtain the user's second-label prediction value as the value of the user's second label.

15. The apparatus for obtaining a multi-label user portrait according to claim 14, further comprising a first replacement unit configured to, after the user's first feature vector is obtained based on the user information, in the case where the user information includes the first-label information, replace the first-label prediction value with the corresponding preset value of the first-label information as the value of the user's first label.

16. The apparatus for obtaining a multi-label user portrait according to claim 14, further comprising a second replacement unit configured to, after the user's second feature vector is obtained, in the case where the user information includes the second-label information, replace the second-label prediction value with the corresponding preset value of the second-label information as the value of the user's second label.
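The apparatus claims order the chain's classifiers by how hard each label is to learn, with difficulty measured by the label's number of classes. A minimal sketch of that ordering rule follows; the reading that fewer classes means an easier label placed earlier in the chain, and the label data itself, are assumptions made for illustration.

```python
# Hypothetical label vocabulary: label name -> its possible class values.
label_values = {
    "age_bucket": ["<25", "25-40", ">40"],                         # 3 classes
    "gender": ["m", "f"],                                          # 2 classes
    "purchase_preference": ["books", "sports", "food", "travel"],  # 4 classes
}

# Assumed rule: labels with fewer classes are easier, so they are learned
# first; later classifiers then receive the earlier labels as features.
chain_order = sorted(label_values, key=lambda name: len(label_values[name]))
print(chain_order)  # gender first, purchase_preference last
```

Fixing the chain order this way makes the pipeline deterministic, and letting the easier labels feed the harder ones matches the age-then-purchase-preference pairing given in the dependent claims.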
TW107146609A 2018-02-13 2018-12-22 Method and device for obtaining multi-label user portrait TWI693567B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201810148824.1 2018-02-13
CN201810148824.1A CN108229590B (en) 2018-02-13 2018-02-13 Method and device for acquiring multi-label user portrait

Publications (2)

Publication Number Publication Date
TW201935344A TW201935344A (en) 2019-09-01
TWI693567B true TWI693567B (en) 2020-05-11

Family

ID=62661860

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107146609A TWI693567B (en) 2018-02-13 2018-12-22 Method and device for obtaining multi-label user portrait

Country Status (3)

Country Link
CN (1) CN108229590B (en)
TW (1) TWI693567B (en)
WO (1) WO2019157928A1 (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229590B (en) * 2018-02-13 2020-05-15 阿里巴巴集团控股有限公司 Method and device for acquiring multi-label user portrait
CN109102341B (en) * 2018-08-27 2021-08-31 寿带鸟信息科技(苏州)有限公司 Old man portrait drawing method for old man service
CN109785034A (en) * 2018-11-13 2019-05-21 北京码牛科技有限公司 User's portrait generation method, device, electronic equipment and computer-readable medium
CN109886299B (en) * 2019-01-16 2024-05-24 平安科技(深圳)有限公司 User portrait method and device, readable storage medium and terminal equipment
CN109858532A (en) * 2019-01-16 2019-06-07 平安科技(深圳)有限公司 A kind of user draws a portrait method, apparatus, readable storage medium storing program for executing and terminal device
CN109885745A (en) * 2019-01-16 2019-06-14 平安科技(深圳)有限公司 A kind of user draws a portrait method, apparatus, readable storage medium storing program for executing and terminal device
CN110069706A (en) * 2019-03-25 2019-07-30 华为技术有限公司 Method, end side equipment, cloud side apparatus and the end cloud cooperative system of data processing
CN110852338A (en) * 2019-07-26 2020-02-28 平安科技(深圳)有限公司 User portrait construction method and device
CN110674877B (en) * 2019-09-26 2023-06-27 联想(北京)有限公司 Image processing method and device
CN112749323A (en) * 2019-10-31 2021-05-04 北京沃东天骏信息技术有限公司 Method and device for constructing user portrait
CN113496236B (en) * 2020-03-20 2024-05-24 北京沃东天骏信息技术有限公司 User tag information determining method, device, equipment and storage medium
CN111723257B (en) * 2020-06-24 2023-05-02 山东建筑大学 User portrayal method and system based on water usage rule
CN112035742B (en) * 2020-08-28 2023-10-24 康键信息技术(深圳)有限公司 User portrait generation method, device, equipment and storage medium
CN112308166B (en) * 2020-11-09 2023-08-01 建信金融科技有限责任公司 Method and device for processing tag data
CN112330510A (en) * 2020-11-20 2021-02-05 龙马智芯(珠海横琴)科技有限公司 Volunteer recommendation method and device, server and computer-readable storage medium
CN113568738A (en) * 2021-07-02 2021-10-29 上海淇玥信息技术有限公司 Resource allocation method and device based on multi-label classification, electronic equipment and medium
CN113806638B (en) * 2021-09-29 2023-12-08 中国平安人寿保险股份有限公司 Personalized recommendation method based on user portrait and related equipment
CN114399352B (en) * 2021-12-22 2023-06-16 中国电信股份有限公司 Information recommendation method and device, electronic equipment and storage medium
CN116091112A (en) * 2022-12-29 2023-05-09 江苏玖益贰信息科技有限公司 Consumer portrait generating device and portrait analyzing method

Citations (4)

Publication number Priority date Publication date Assignee Title
TW201011575A (en) * 2008-09-12 2010-03-16 Univ Nat Cheng Kung Recommendation apparatus and method of integrating rough sets and multiple-characteristic exploration
CN104615730A (en) * 2015-02-09 2015-05-13 浪潮集团有限公司 Method and device for classifying multiple labels
CN106650780A (en) * 2016-10-18 2017-05-10 腾讯科技(深圳)有限公司 Data processing method, device, classifier training method and system
US20170223036A1 (en) * 2015-08-31 2017-08-03 Splunk Inc. Model training and deployment in complex event processing of computer network data

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN102270192A (en) * 2011-07-27 2011-12-07 浙江工业大学 Multi-label classification control method based on smart volume management (SVM) active learning
CN102364498B (en) * 2011-10-17 2013-11-20 江苏大学 Multi-label-based image recognition method
CN102945371B (en) * 2012-10-18 2015-06-24 浙江大学 Classifying method based on multi-label flexible support vector machine
CN105446988B (en) * 2014-06-30 2018-10-30 华为技术有限公司 The method and apparatus for predicting classification
CN106447490A (en) * 2016-09-26 2017-02-22 广州速鸿信息科技有限公司 Credit investigation application method based on user figures
CN106709754A (en) * 2016-11-25 2017-05-24 云南电网有限责任公司昆明供电局 Power user grouping method based on text mining
CN107220281B (en) * 2017-04-19 2020-02-21 北京协同创新研究院 Music classification method and device
CN108229590B (en) * 2018-02-13 2020-05-15 阿里巴巴集团控股有限公司 Method and device for acquiring multi-label user portrait


Also Published As

Publication number Publication date
TW201935344A (en) 2019-09-01
WO2019157928A1 (en) 2019-08-22
CN108229590B (en) 2020-05-15
CN108229590A (en) 2018-06-29

Similar Documents

Publication Publication Date Title
TWI693567B (en) Method and device for obtaining multi-label user portrait
US10949907B1 (en) Systems and methods for deep learning model based product matching using multi modal data
CN110008399B (en) Recommendation model training method and device, and recommendation method and device
Mao et al. Multiobjective e-commerce recommendations based on hypergraph ranking
CN109934619A (en) User's portrait tag modeling method, apparatus, electronic equipment and readable storage medium storing program for executing
CN109189904A (en) Individuation search method and system
US8655737B1 (en) Brand name synonymy
JP2019079302A (en) Sales activity support system, sales activity support method and sales activity support program
Charoenrat et al. The performance of Thai manufacturing SMEs: Data envelopment analysis (DEA) approach
JP6731826B2 (en) Extraction device, extraction method, and extraction program
Papadopoulos et al. Multimodal Quasi-AutoRegression: Forecasting the visual popularity of new fashion products
KR101639656B1 (en) Method and server apparatus for advertising
CN110727864A (en) User portrait method based on mobile phone App installation list
Anand et al. Using deep learning to overcome privacy and scalability issues in customer data transfer
Wang et al. Who are the best adopters? User selection model for free trial item promotion
Narke et al. A comprehensive review of approaches and challenges of a recommendation system
Xu et al. Sportswear retailing forecast model based on the combination of multi-layer perceptron and convolutional neural network
Wu et al. [Retracted] Using the Mathematical Model on Precision Marketing with Online Transaction Data Computing
Leng et al. Geometric deep learning based recommender system and an interpretable decision support system
JP2020095608A (en) Device, method, and program for processing information
Qi et al. Recommendations based on social relationships in mobile services
Shanmugam et al. A multi-criteria decision-making approach for selection of brand ambassadors using machine learning algorithm
Wu et al. Applying a Probabilistic Network Method to Solve Business‐Related Few‐Shot Classification Problems
Ma Modeling users for online advertising
Sharma et al. Hybrid Real-Time Implicit Feedback SOM-Based Movie Recommendation Systems