CN111552865A - User interest portrait method and related equipment - Google Patents

Publication number
CN111552865A
Authority
CN
China
Prior art keywords
user
target
interest
probability value
registration information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010243221.7A
Other languages
Chinese (zh)
Inventor
张超亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
OneConnect Financial Technology Co Ltd Shanghai
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai
Priority to CN202010243221.7A (CN111552865A)
Priority to PCT/CN2020/105900 (WO2021196474A1)
Publication of CN111552865A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a user interest portrait method and related equipment. The method comprises: judging, according to identification information of a user, whether registration information of the user exists in a plurality of websites, to obtain a plurality of target websites; generating a registration feature vector of the user according to the judgment result; determining a first probability value of each interest tag of the user from the registration feature vector by a clustering method; crawling a plurality of target named entities of the user from each target website; calculating a second probability value of each interest tag by using a trained neural network; calculating a third probability value of each interest tag based on a statistical method; determining the maximum of the first, second and third probability values of each interest tag as the target probability value of the interest tag; and determining each interest tag whose target probability value is larger than a first preset threshold as an interest tag of the user. The method improves the accuracy of extracting the interest tags of users.

Description

User interest portrait method and related equipment
Technical Field
The invention relates to the technical field of entity recognition, and in particular to a user interest portrait method, a user interest portrait apparatus, a computer device and a computer-readable storage medium.
Background
The interests and hobbies recorded in a user interest portrait are important data in modern financial scenarios and are widely used in fields such as marketing, service and even risk control.
Building a user interest portrait requires extracting the interest tags of the user (such as travel, programming education and the like). Existing user interest portrait methods extract interest tags from the social and usage-habit data of the user on a single platform; because this data comes from one source and is incomplete, the accuracy of the extracted interest tags is low. How to extract the interest tags of users accurately has become a problem to be solved urgently.
Disclosure of Invention
In view of the foregoing, there is a need for a user interest portrait method, apparatus, computer device and computer-readable storage medium that can extract the interest tags of a user according to the registration information of the user at various websites.
A first aspect of the present application provides a user interest representation method, including:
acquiring a plurality of websites, a plurality of interest tags and identification information of a user;
judging whether the plurality of websites have the registration information of the user according to the identification information to obtain a plurality of target websites having the registration information of the user;
generating a registration characteristic vector of the user according to a judgment result of whether the registration information of the user exists in the plurality of websites;
determining a first probability value of each interest tag of the user according to the registered feature vector of the user by adopting a clustering method;
crawling a plurality of target named entities of the user from each target website;
calculating a second probability value of each interest label according to the plurality of target named entities and the target website to which each target named entity belongs by using the trained neural network;
calculating a third probability value of each interest tag based on a statistical method;
determining the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as a target probability value of the interest tag;
and determining the interest tag with the target probability value larger than a first preset threshold value as the interest tag of the user.
In another possible implementation manner, the determining whether the registration information of the user exists in the multiple websites according to the identification information includes:
searching a designated website among the plurality of websites for the identification information;
if the search result of the specified website comprises the identification information, the specified website has the registration information of the user;
and if the search result of the specified website does not comprise the identification information, the specified website does not have the registration information of the user.
In another possible implementation manner, the determining whether the registration information of the user exists in the multiple websites according to the identification information includes:
inquiring registration information of the user from an interface authorized by a specified website in the plurality of websites according to the identification information;
if the specified website returns the registration information of the user, the specified website has the registration information of the user;
and if the specified website does not return the registration information of the user or the return value is null, the specified website does not have the registration information of the user.
In another possible implementation manner, the determining, by using a clustering method, a first probability value of each interest tag of the user according to the registered feature vector of the user includes:
acquiring a plurality of first historical users;
clustering the plurality of first historical users according to the registered feature vectors of the plurality of first historical users to obtain a plurality of user clusters and a central vector of each user cluster;
determining a target user cluster to which the user belongs according to the distance between the registration characteristic vector of the user and the center vector of each user cluster;
determining the mean value of the probability values of each target user in the target user cluster about the designated interest tag as a first probability value of the designated interest tag of the user, or determining the ratio of the number of the target users in the target user cluster with the probability value of the designated interest tag larger than a second preset threshold value to the total number of the target users in the target user cluster as the first probability value of the designated interest tag of the user.
In another possible implementation manner, the calculating, by using the trained neural network, a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs includes:
coding each target named entity and a target website to which the target named entity belongs into a feature vector of the target named entity;
inputting the feature vector of each target named entity into the trained neural network to obtain the probability value of each interest tag corresponding to the target named entity;
and calculating the mean value of the probability values of each interest label corresponding to the named entities to obtain a second probability value of the interest label.
In another possible implementation manner, the calculating the third probability value of each interest tag based on a statistical method includes:
acquiring a plurality of second historical users having registration information at the plurality of target websites, wherein the user interest portrait of each second historical user comprises a plurality of interest tags of the second historical user;
counting a first number of second historical users whose user interest portraits include the interest tag;
counting a second number of the plurality of second historical users;
calculating the ratio of the first quantity to the second quantity, and taking the ratio of the first quantity to the second quantity as the third probability value.
In another possible implementation manner, before the determining whether the registration information of the user exists in the plurality of websites according to the identification information, the user interest portrait method further includes: obtaining the authorization of the user.
A second aspect of the present application provides a user-interest portrayal apparatus, comprising:
the acquisition module is used for acquiring a plurality of websites, a plurality of interest tags and identification information of a user;
the judging module is used for judging whether the plurality of websites have the registration information of the user according to the identification information to obtain a plurality of target websites having the registration information of the user;
the generating module is used for generating the registration characteristic vector of the user according to the judgment result of whether the registration information of the user exists in the plurality of websites;
a first determination module, configured to determine a first probability value of each interest tag of the user according to the registered feature vector of the user by using a clustering method;
a crawling module, configured to crawl a plurality of target named entities of the user from each target website;
the first calculation module is used for calculating a second probability value of each interest tag according to the target named entities and the target website to which each target named entity belongs by using the trained neural network;
the second calculation module is used for calculating a third probability value of each interest tag based on a statistical method;
the second determination module is used for determining the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as the target probability value of the interest tag;
and the third determining module is used for determining the interest tag with the target probability value larger than a first preset threshold value as the interest tag of the user.
A third aspect of the application provides a computer device comprising a processor for implementing the user interest portrayal method when executing a computer program stored in a memory.
A fourth aspect of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the user interest representation method.
According to the method and the system, the interest tag of the user is determined through the website related to the interest of the user and the target named entity of the user in the target website, so that the accuracy rate of identifying the interest tag of the user can be improved; the target probability value of the interest label can be determined through the first probability value of the interest label obtained through the clustering method, the second probability value of the interest label obtained through the neural network and the third probability value of the interest label obtained based on statistics, and the risk of deviation can be reduced. Therefore, the interest tags of the users are extracted according to the registration information of the users in each website, and the accuracy rate of extracting the interest tags of the users is improved.
Drawings
FIG. 1 is a flowchart of a user interest portrait method according to an embodiment of the present invention.
FIG. 2 is a block diagram of a user interest representation apparatus according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Preferably, the user interest portrait method of the invention is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
Example one
FIG. 1 is a flowchart illustrating a method for representing a user interest of an embodiment of the present invention. The user interest portrait method is applied to computer equipment and is used for extracting interest tags of users according to registration information of the users in various websites.
As shown in FIG. 1, the user interest representation method includes:
101, acquiring a plurality of websites, a plurality of interest tags and identification information of a user.
In a specific embodiment, the websites may include NetEase Cloud Music, NetEase Cloud Classroom, Baidu Tieba, CSDN, Weibo, Xiaohongshu, Ctrip, and the like.
The plurality of interest tags may include fitness, education, mother and baby, travel, and the like. Users with different interests tend to use websites associated with those interests, so there may be an association between a user's interests and the websites at which the user has registered (i.e., a website associated with the user's interests has registration information of the user). For example, travel is associated with Ctrip, and education is associated with NetEase Cloud Classroom.
The identification information input by the user or the identification information of the user transmitted by the user identification means may be received.
In a specific embodiment, the identification information includes a mobile phone number, an identification number, an encrypted mobile phone number, or an encrypted identification number.
For example, the mobile phone number or the identification number input by the user through a keyboard may be received, or the identification number of the user transmitted by a character recognition device may be received, where the character recognition device recognizes the identification number on the identification card of the user. The mobile phone number can be processed with a hash algorithm such as MD5 to obtain the encrypted mobile phone number, and the identification number can be processed in the same way to obtain the encrypted identification number.
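The hashing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `hash_identifier`, the whitespace normalization and the sample phone number are assumptions. Note that MD5 is a one-way hash rather than reversible encryption, so the sketch also allows SHA-256 through the same standard-library call.

```python
import hashlib


def hash_identifier(identifier: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of an identifier such as a phone number.

    The normalization (stripping surrounding whitespace) is an assumption
    made for this sketch; the source does not describe any preprocessing.
    """
    normalized = identifier.strip()
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


phone = "13800000000"  # made-up number used only for illustration
md5_digest = hash_identifier(phone, "md5")
sha256_digest = hash_identifier(phone)
```

The same digest must be produced on both sides (the caller and the queried website) for a hashed identifier to be usable as a lookup key, which is why the normalization has to be agreed on in advance.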
In another embodiment, the identification information may further include fingerprint information, iris information, face information, or the like.
102, judging whether the plurality of websites have the registration information of the user according to the identification information, and obtaining a plurality of target websites having the registration information of the user.
In an embodiment, the determining whether the registration information of the user exists in the plurality of websites according to the identification information includes:
inquiring registration information of the user from an interface authorized by a specified website in the plurality of websites according to the identification information;
if the specified website returns the registration information of the user, the specified website has the registration information of the user;
and if the specified website does not return the registration information of the user or the return value is null, the specified website does not have the registration information of the user.
For example, the registration information query interface of CSDN is queried for the registration information of user A, with the telephone number of user A as the query parameter; if CSDN returns the registration information of user A (e.g., registration time, registration status, user name, etc.), CSDN has the registration information of user A.
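The decision rule for the authorized-interface embodiment can be sketched as below. The interfaces here are hypothetical stub functions standing in for real website APIs, which the source does not specify; `QueryFn`, `has_registration` and `find_target_websites` are names introduced only for this illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical interface type: takes identification information and returns
# the registration record, or None / an empty dict when nothing is found.
QueryFn = Callable[[str], Optional[dict]]


def has_registration(query: QueryFn, identification: str) -> bool:
    """A website has the user's registration information only if the
    interface returns a non-empty record."""
    record = query(identification)
    return bool(record)


def find_target_websites(interfaces: Dict[str, QueryFn], identification: str) -> List[str]:
    """Keep the websites whose interface returned registration information."""
    return [site for site, query in interfaces.items()
            if has_registration(query, identification)]


# Stub interfaces with made-up data, used in place of real authorized APIs.
def csdn_stub(identification: str) -> Optional[dict]:
    if identification == "13800000000":
        return {"user_name": "user_a", "registration_status": "active"}
    return None


def weibo_stub(identification: str) -> Optional[dict]:
    return {}  # empty return value: treated as "no registration information"


interfaces = {"CSDN": csdn_stub, "Weibo": weibo_stub}
targets = find_target_websites(interfaces, "13800000000")
```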
In another embodiment, the determining whether the registration information of the user exists in the plurality of websites according to the identification information includes:
registering a new account with a designated website of the plurality of websites with the identification information;
if the specified website prompts that the user is registered, the specified website has the registration information of the user;
and if the specified website prompts to input registration verification information, the specified website does not have the registration information of the user.
For example, registration of a new account may be requested from CSDN with the telephone number of user A; if CSDN prompts for registration verification information (e.g., a verification code sent by CSDN to the telephone number of user A), CSDN does not have the registration information of user A.
In another embodiment, the determining whether the registration information of the user exists in the plurality of websites according to the identification information includes:
searching a designated website among the plurality of websites for the identification information;
if the search result of the specified website comprises the identification information, the specified website has the registration information of the user;
and if the search result of the specified website does not comprise the identification information, the specified website does not have the registration information of the user.
103, generating a registration feature vector of the user according to a judgment result of whether the registration information of the user exists in the plurality of websites.
For example, the generated registration feature vector of user A is (1, 1, 0, 1, 0). Reading from left to right, the 1 in the first dimension indicates that NetEase Cloud Music has the registration information of user A; the 1 in the second dimension indicates that Baidu Tieba has the registration information of user A; the 0 in the third dimension indicates that CSDN does not have the registration information of user A; the 1 in the fourth dimension indicates that Weibo has the registration information of user A; and the 0 in the fifth dimension indicates that Xiaohongshu does not have the registration information of user A.
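A minimal sketch of building this vector from the judgment results follows. The website names are illustrative stand-ins for the five sites in the example, and the fixed ordering of `WEBSITES` is an assumption: every user's vector must use the same ordering for the later clustering step to be meaningful.

```python
from typing import List, Set

# Fixed website order shared by all users (illustrative names).
WEBSITES = ["NetEase Cloud Music", "Baidu Tieba", "CSDN", "Weibo", "Xiaohongshu"]


def registration_feature_vector(websites: List[str], registered: Set[str]) -> List[int]:
    """One dimension per website: 1 if registration information exists, else 0."""
    return [1 if site in registered else 0 for site in websites]


user_a_sites = {"NetEase Cloud Music", "Baidu Tieba", "Weibo"}
user_a_vector = registration_feature_vector(WEBSITES, user_a_sites)
```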
104, determining a first probability value of each interest tag of the user according to the registration feature vector of the user by adopting a clustering method.
In a specific embodiment, the determining, by using a clustering method, a first probability value of each interest tag of the user according to the registered feature vector of the user includes:
(1) a plurality of first historical users are obtained.
(2) Clustering the plurality of first historical users according to the registration feature vectors of the plurality of first historical users to obtain a plurality of user clusters and a center vector of each user cluster.
(3) Determining a target user cluster to which the user belongs according to the distance between the registration feature vector of the user and the center vector of each user cluster. For example, two user clusters (a first user cluster and a second user cluster) are obtained through clustering; the Euclidean distance between the registration feature vector of the user and the center vector of the first user cluster is num1, and the Euclidean distance between the registration feature vector of the user and the center vector of the second user cluster is num2. If num1 is greater than num2, the user is closer to the second user cluster, so the second user cluster is determined as the target user cluster.
(4) Determining the mean value of the probability values of each target user in the target user cluster about the designated interest tag as a first probability value of the designated interest tag of the user, or determining the ratio of the number of the target users in the target user cluster with the probability value of the designated interest tag larger than a second preset threshold value to the total number of the target users in the target user cluster as the first probability value of the designated interest tag of the user. For example, the target user cluster includes 3 users, the interest tags are designated as tourism, the probability values of the tourism interest tags of the 3 users are 0.5, 0.6 and 0.4 respectively, and then the first probability value of the tourism interest tag of the user is 0.5. The second preset threshold is a preset value adjusted according to experimental data.
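Steps (3) and (4) can be sketched as follows, assuming the cluster center vectors have already been produced by some clustering algorithm (the source does not name one; k-means would be a typical choice). The center vectors, the per-cluster probability values for cluster 1, and the threshold 0.45 are made-up illustration data.

```python
import math
from statistics import mean
from typing import List, Sequence


def nearest_cluster(vector: Sequence[float], centers: List[Sequence[float]]) -> int:
    """Index of the cluster whose center has the smallest Euclidean distance."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))


def first_probability_mean(member_probs: Sequence[float]) -> float:
    """Variant 1: mean of the members' probability values for the tag."""
    return mean(member_probs)


def first_probability_ratio(member_probs: Sequence[float], threshold: float) -> float:
    """Variant 2: share of members whose probability exceeds the threshold."""
    return sum(p > threshold for p in member_probs) / len(member_probs)


# Assumed output of a prior clustering step.
centers = [[0.9, 0.8, 0.1, 0.9, 0.2], [0.1, 0.2, 0.9, 0.1, 0.8]]
# Probability values of the "travel" tag for the members of each cluster.
travel_probs_by_cluster = {0: [0.5, 0.6, 0.4], 1: [0.1, 0.2, 0.1]}

user_vector = [1, 1, 0, 1, 0]
target_cluster = nearest_cluster(user_vector, centers)
p1_mean = first_probability_mean(travel_probs_by_cluster[target_cluster])
p1_ratio = first_probability_ratio(travel_probs_by_cluster[target_cluster], 0.45)
```

With the three member values 0.5, 0.6 and 0.4 from the example, the mean variant gives approximately 0.5, while the ratio variant with a threshold of 0.45 gives two out of three.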
105, crawling a plurality of target named entities of the user from each target website.
A plurality of webpage texts of the user are crawled from each target website, where the webpage texts of a target website include the social information and behavior information of the user at that website, and a plurality of target named entities are extracted from the webpage texts of the target website. For example, the target website is NetEase Cloud Music and the webpage texts include songs that user A follows or shares; named entity extraction on these texts yields target named entities such as ballad and campus (users from whom "ballad" and "campus" are extracted are found to generally lean toward "travel"). For another example, the target website is Xiaohongshu and the webpage texts include shopping experiences that user B follows or shares; named entity extraction yields target named entities such as milk powder and baby carriage (users from whom "milk powder" and "baby carriage" are extracted are found to generally lean toward "mother and baby"). For another example, the target website is NetEase Cloud Classroom and the webpage texts include video introductions that user B follows or shares; named entity extraction yields target named entities such as JAVA and SPRING (users from whom JAVA and SPRING are extracted are found to generally lean toward "programming education").
106, calculating a second probability value of each interest tag by using the trained neural network according to the plurality of target named entities and the target website to which each target named entity belongs.
In a specific embodiment, the calculating, by using the trained neural network, the second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs includes:
coding each target named entity and a target website to which the target named entity belongs into a feature vector of the target named entity;
inputting the feature vector of each target named entity into the trained neural network to obtain the probability value of each interest tag corresponding to the target named entity;
and calculating the mean value of the probability values of each interest label corresponding to the named entities to obtain a second probability value of the interest label.
For example, if the two target named entities are JAVA and SPRING, and the probability value of "programming education" (interest tag) is 0.9 for JAVA and 0.7 for SPRING, then the second probability value of "programming education" is (0.9 + 0.7) / 2 = 0.8.
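The averaging step can be sketched as below. The per-entity probability values are taken as given (in the method they come from the trained neural network); the function name and the dictionary layout are assumptions of this sketch, and the "travel" values are made up.

```python
from typing import Dict, List


def second_probabilities(per_entity_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each interest tag's probability over all target named entities."""
    tags = per_entity_probs[0].keys()
    count = len(per_entity_probs)
    return {tag: sum(probs[tag] for probs in per_entity_probs) / count for tag in tags}


# Assumed network outputs for the entities JAVA and SPRING.
per_entity = [
    {"programming education": 0.9, "travel": 0.1},  # JAVA
    {"programming education": 0.7, "travel": 0.3},  # SPRING
]
p2 = second_probabilities(per_entity)
```

For the two entities above, the second probability value of "programming education" comes out at approximately 0.8, matching the example.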
The encoding of each target named entity and the target website to which the target named entity belongs as the feature vector of the target named entity includes:
encoding the target named entity into a first intermediate vector with a preset encoder (such as a one-hot encoder or a word2vec encoder);
encoding the target website to which the target named entity belongs into a second intermediate vector according to the preset encoder;
and connecting the first intermediate vector with the second intermediate vector, or multiplying the first intermediate vector with the second intermediate vector by elements to obtain the feature vector of the target named entity.
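The encoding steps above can be sketched with a one-hot encoder, one of the two encoders the text mentions. The vocabularies are made-up illustration data. Note that the element-wise product variant only works when both intermediate vectors have the same length, which is naturally the case for fixed-size embeddings such as word2vec but not for one-hot vectors over different vocabularies.

```python
from typing import List, Sequence


def one_hot(item: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode an item as a one-hot vector over a fixed vocabulary."""
    return [1.0 if item == entry else 0.0 for entry in vocabulary]


def concatenate(first: List[float], second: List[float]) -> List[float]:
    """Variant 1: connect the two intermediate vectors end to end."""
    return first + second


def elementwise_product(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Variant 2: multiply element by element (requires equal lengths)."""
    if len(first) != len(second):
        raise ValueError("element-wise product needs vectors of equal length")
    return [a * b for a, b in zip(first, second)]


# Illustrative vocabularies; real ones would cover all entities and websites.
ENTITY_VOCAB = ["JAVA", "SPRING", "milk powder"]
SITE_VOCAB = ["NetEase Cloud Classroom", "Xiaohongshu"]

entity_vec = one_hot("JAVA", ENTITY_VOCAB)
site_vec = one_hot("NetEase Cloud Classroom", SITE_VOCAB)
feature_vector = concatenate(entity_vec, site_vec)
```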
Training the neural network may include:
obtaining a training sample and a label of the training sample, wherein the training sample comprises a target named entity and the target website to which the target named entity belongs;
coding the target named entity into a first vector according to a preset coding table, and coding a target website to which the target named entity belongs into a second vector according to the preset coding table;
splicing the first vector and the second vector to obtain a feature vector of the target named entity;
inputting the characteristic vector of the target named entity into a neural network with an initialization parameter value to obtain an output vector;
and optimizing the parameter value of the neural network through a back propagation algorithm according to the output vector and the label of the training sample.
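The training steps can be illustrated with a deliberately small network. The source does not specify the architecture, loss or optimizer, so everything here is an assumption: a single fully connected layer with sigmoid outputs (one per interest tag), binary cross-entropy loss, and plain stochastic gradient descent. The training samples are made-up one-hot feature vectors (three entity dimensions followed by two website dimensions).

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights: List[List[float]], biases: List[float], x: List[float]) -> List[float]:
    """One probability per interest tag for the feature vector x."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def train(samples: List[List[float]], labels: List[List[float]],
          epochs: int = 500, lr: float = 0.5, seed: int = 0) -> Tuple[List[List[float]], List[float]]:
    rng = random.Random(seed)
    n_features, n_tags = len(samples[0]), len(labels[0])
    # Initialization parameter values: small random weights, zero biases.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_tags)]
    biases = [0.0] * n_tags
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            out = forward(weights, biases, x)
            for k in range(n_tags):
                # Backpropagated gradient of binary cross-entropy with a
                # sigmoid output, taken w.r.t. the pre-activation: out - label.
                grad = out[k] - y[k]
                for j in range(n_features):
                    weights[k][j] -= lr * grad * x[j]
                biases[k] -= lr * grad
    return weights, biases


# Features: [JAVA, SPRING, milk powder, NetEase Cloud Classroom, Xiaohongshu]
samples = [[1, 0, 0, 1, 0], [0, 1, 0, 1, 0], [0, 0, 1, 0, 1]]
# Labels: [programming education, mother and baby]
labels = [[1, 0], [1, 0], [0, 1]]

weights, biases = train(samples, labels)
java_probs = forward(weights, biases, samples[0])
milk_probs = forward(weights, biases, samples[2])
```

A practical system would more likely use a deep-learning framework and at least one hidden layer; the sketch only shows how the output vector and the label drive the parameter update.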
107, calculating a third probability value of each interest tag based on a statistical method.
The calculating a third probability value for each interest tag based on a statistical method comprises:
acquiring a plurality of second historical users having registration information at the plurality of target websites, wherein the user interest portrait of each second historical user comprises a plurality of interest tags of the second historical user;
counting a first number of second historical users whose user interest portraits include the interest tag;
counting a second number of the plurality of second historical users;
calculating the ratio of the first quantity to the second quantity, and taking the ratio of the first quantity to the second quantity as the third probability value.
For example, 4 second historical users (user A, user B, user C and user D) who have registration information at NetEase Cloud Music and Ctrip are acquired; the first number of second historical users having "travel" (interest tag) in their user interest portraits is counted as 3; the second number of second historical users is 4; the third probability value of "travel" (interest tag) is therefore 3 / 4 = 0.75.
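The ratio can be sketched as follows. The portraits of users A to D are made-up data chosen so that three of the four users carry the "travel" tag, and returning 0.0 when no historical users are available is an assumption of this sketch.

```python
from typing import Dict, Set


def third_probability(portraits: Dict[str, Set[str]], tag: str) -> float:
    """Share of second historical users whose interest portrait includes the tag."""
    if not portraits:
        return 0.0  # assumption: no historical users means no statistical evidence
    first_number = sum(1 for tags in portraits.values() if tag in tags)
    second_number = len(portraits)
    return first_number / second_number


# Second historical users registered at the target websites (illustrative).
portraits = {
    "user A": {"travel", "fitness"},
    "user B": {"travel"},
    "user C": {"travel", "education"},
    "user D": {"education"},
}
p3_travel = third_probability(portraits, "travel")
```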
And 108, determining the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as the target probability value of the interest tag.
For example, if the first probability value of "travel" (interest tag) is 0.65, the second probability value of "travel" (interest tag) is 0.70, and the third probability value of "travel" (interest tag) is 0.75, then 0.75 is determined as the target probability value of "travel" (interest tag).
109, determining the interest tag with the target probability value larger than a first preset threshold value as the interest tag of the user.
For example, if the target probability value of "travel" (interest tag) is 0.75, the target probability value of "programming education" (interest tag) is 0.85, and the first preset threshold value is 0.80, the "programming education" is determined as the interest tag of the user.
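Steps 108 and 109 can be sketched together as follows. The function name and the dictionary layout are assumptions; the three component values for "programming education" are hypothetical and chosen only so that their maximum is 0.85, as in the example.

```python
def select_interest_tags(first, second, third, threshold):
    """Take the per-tag maximum of three probabilities, then apply the threshold."""
    targets = {tag: max(first[tag], second[tag], third[tag]) for tag in first}
    # Only tags strictly above the first preset threshold are kept.
    selected = [tag for tag, p in targets.items() if p > threshold]
    return targets, selected

first = {"travel": 0.65, "programming education": 0.85}
second = {"travel": 0.70, "programming education": 0.80}
third = {"travel": 0.75, "programming education": 0.60}

targets, selected = select_interest_tags(first, second, third, threshold=0.80)
```

Here `targets["travel"]` is 0.75, which is below the threshold of 0.80, so only "programming education" is selected.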
In the user interest representation method according to the first embodiment, the interest tag of the user is determined through the website associated with the interest of the user and the target named entity of the user in the target website, so that the accuracy of identifying the interest tag of the user can be improved; the target probability value of the interest label can be determined through the first probability value of the interest label obtained through the clustering method, the second probability value of the interest label obtained through the neural network and the third probability value of the interest label obtained based on statistics, and the risk of deviation can be reduced. According to the embodiment, the interest tags of the users are extracted according to the registration information of the users in each website, the accuracy of extracting the interest tags of the users is improved, the extracted interest tags of the users are used for describing the interest portraits of the users, and the accuracy of describing the interest portraits of the users is improved.
In another embodiment, before the determining whether the registration information of the user exists in the plurality of websites according to the identification information, the user interest representation method further includes: and obtaining the authorization of the user.
Before judging whether the registration information of the user exists in the plurality of websites according to the identification information, an authorization option box can be issued to the user, and authorization options selected by the user in the authorization option box are received.
Example two
FIG. 2 is a block diagram of a user interest representation apparatus according to a second embodiment of the present invention. The user interest representation device 20 is applied to a computer device. The user interest representation device 20 is used for extracting the interest tags of the users according to the registration information of the users at each website.
As shown in FIG. 2, the user interest representation apparatus 20 may include an obtaining module 201, a judging module 202, a generating module 203, a first determining module 204, a crawling module 205, a first calculating module 206, a second calculating module 207, a second determining module 208, and a third determining module 209.
The obtaining module 201 is configured to obtain a plurality of websites, a plurality of interest tags, and identification information of a user.
In a specific embodiment, the plurality of websites may include NetEase Cloud Music, NetEase Cloud Classroom, Baidu Tieba, CSDN, Weibo, Xiaohongshu, Ctrip, and the like.
The plurality of interest tags may include fitness, education, mother-and-baby, travel, and the like. Users with different interests tend to use websites associated with those interests, so there may be an association between a user's interests and the websites at which the user has registered (i.e., the websites associated with the user's interests have registration information of the user). For example, travel is associated with Ctrip, and education is associated with NetEase Cloud Classroom.
The identification information input by the user or the identification information of the user transmitted by the user identification means may be received.
In a specific embodiment, the identification information includes a mobile phone number, an identification number, an encrypted mobile phone number, or an encrypted identification number.
For example, the mobile phone number or the identification number input by the user through the keyboard may be received, or the identification number of the user transmitted by a character recognition device may be received, where the character recognition device recognizes the identification number on the identification card of the user. The mobile phone number can be processed with a hash algorithm (for example, the MD5 algorithm) to obtain an encrypted mobile phone number, and the identification number can be processed in the same way to obtain an encrypted identification number.
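The hashing of an identifier mentioned above can be illustrated with the standard `hashlib` module. The function name and the sample phone number are hypothetical. Note that MD5 is a one-way hash rather than reversible encryption, so the digest can be compared against other digests but not decoded.

```python
import hashlib

def hash_identifier(identifier: str) -> str:
    """Return the hexadecimal MD5 digest of an identifier such as a phone number."""
    return hashlib.md5(identifier.encode("utf-8")).hexdigest()

# Hypothetical mobile phone number used only for illustration.
encrypted_phone = hash_identifier("13800000000")
```

The same input always yields the same 32-character digest, which is what allows the digest to serve as identification information in place of the plain number.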
In another embodiment, the identification information may further include fingerprint information, iris information, face information, or the like.
The judging module 202 is configured to judge, according to the identification information, whether the registration information of the user exists in the plurality of websites, so as to obtain a plurality of target websites in which the registration information of the user exists.
In an embodiment, the determining whether the registration information of the user exists in the plurality of websites according to the identification information includes:
inquiring registration information of the user from an interface authorized by a specified website in the plurality of websites according to the identification information;
if the specified website returns the registration information of the user, the specified website has the registration information of the user;
and if the specified website does not return the registration information of the user or the return value is null, the specified website does not have the registration information of the user.
For example, the registration information query interface of CSDN is queried for the registration information of user A (the query parameter is the telephone number of user A); if CSDN returns the registration information of user A (e.g., registration time, registration status, user name, etc.), CSDN has the registration information of user A.
In another embodiment, the determining whether the registration information of the user exists in the plurality of websites according to the identification information includes:
registering a new account with a designated website of the plurality of websites with the identification information;
if the specified website prompts that the user is registered, the specified website has the registration information of the user;
and if the specified website prompts to input registration verification information, the specified website does not have the registration information of the user.
For example, registration of a new account may be requested from CSDN with the telephone number of user A; if CSDN prompts for registration verification information (e.g., a verification code issued by CSDN to the telephone number of user A), CSDN does not have the registration information of user A.
In another embodiment, the determining whether the registration information of the user exists in the plurality of websites according to the identification information includes:
searching a designated website among the plurality of websites for the identification information;
if the search result of the specified website comprises the identification information, the specified website has the registration information of the user;
and if the search result of the specified website does not comprise the identification information, the specified website does not have the registration information of the user.
A generating module 203, configured to generate a registration feature vector of the user according to a determination result of whether the registration information of the user exists in the multiple websites.
For example, the generated registration feature vector of user A is (1, 1, 0, 1, 0), where, from left to right, the 1 in the first dimension indicates that NetEase Cloud Music has the registration information of user A; the 1 in the second dimension indicates that Baidu Tieba has the registration information of user A; the 0 in the third dimension indicates that CSDN does not have the registration information of user A; the 1 in the fourth dimension indicates that Weibo has the registration information of user A; and the 0 in the fifth dimension indicates that Xiaohongshu does not have the registration information of user A.
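Generating the registration feature vector from the per-website judgment results can be sketched as below. The function name is an assumption, and the set of registered websites stands in for the results returned by whichever registration check is used.

```python
def registration_vector(websites, registered_sites):
    """One dimension per website: 1 if registration information exists, else 0."""
    return tuple(1 if site in registered_sites else 0 for site in websites)

# Website order fixes the meaning of each dimension.
websites = ["NetEase Cloud Music", "Baidu Tieba", "CSDN", "Weibo", "Xiaohongshu"]
# Judgment results for user A (websites found to hold registration information).
registered = {"NetEase Cloud Music", "Baidu Tieba", "Weibo"}

vector = registration_vector(websites, registered)
```

The website list must be kept in the same order for every user so that the vectors of different users are comparable.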
A first determining module 204, configured to determine a first probability value of each interest tag of the user according to the registered feature vector of the user by using a clustering method.
In a specific embodiment, the determining, by using a clustering method, a first probability value of each interest tag of the user according to the registered feature vector of the user includes:
(1) a plurality of first historical users are obtained.
(2) Clustering the plurality of first historical users according to the registration feature vectors of the plurality of first historical users to obtain a plurality of user clusters and a center vector of each user cluster.
(3) Determining a target user cluster to which the user belongs according to the distance between the registration feature vector of the user and the center vector of each user cluster. For example, two user clusters (a first user cluster and a second user cluster) are obtained through clustering, the Euclidean distance between the registration feature vector of the user and the center vector of the first user cluster is num1, the Euclidean distance between the registration feature vector of the user and the center vector of the second user cluster is num2, and num1 is greater than num2; the second user cluster, whose center is closer, is then determined as the target user cluster.
(4) Determining the mean value of the probability values of the designated interest tag over all target users in the target user cluster as the first probability value of the designated interest tag of the user; or determining, as the first probability value of the designated interest tag of the user, the ratio of the number of target users in the target user cluster whose probability value of the designated interest tag is greater than a second preset threshold to the total number of target users in the target user cluster. For example, the target user cluster includes 3 users, the designated interest tag is "travel", and the probability values of the "travel" interest tag of the 3 users are 0.5, 0.6 and 0.4 respectively; the first probability value of the "travel" interest tag of the user is then (0.5 + 0.6 + 0.4) / 3 = 0.5. The second preset threshold is a preset value adjusted according to experimental data.
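Steps (3) and (4) can be sketched as follows, assuming the clustering of step (2) has already produced the center vectors (for instance with k-means, which the text does not mandate). The function names, the two center vectors, and the threshold of 0.45 are hypothetical.

```python
import math

def nearest_cluster(user_vector, centers):
    """Index of the center vector with the smallest Euclidean distance."""
    return min(range(len(centers)),
               key=lambda i: math.dist(user_vector, centers[i]))

def first_probability_mean(tag_probs):
    """Variant 1: mean tag probability over the users of the target cluster."""
    return sum(tag_probs) / len(tag_probs)

def first_probability_ratio(tag_probs, second_threshold):
    """Variant 2: share of cluster users whose tag probability exceeds the threshold."""
    return sum(1 for p in tag_probs if p > second_threshold) / len(tag_probs)

# Hypothetical center vectors of the first and second user clusters.
centers = [(0.0, 0.0, 1.0, 0.0, 1.0), (1.0, 1.0, 0.0, 1.0, 0.0)]
target = nearest_cluster((1, 1, 0, 1, 0), centers)

# "travel" probabilities of the three users in the target cluster.
travel_probs = [0.5, 0.6, 0.4]
mean_p = first_probability_mean(travel_probs)
ratio_p = first_probability_ratio(travel_probs, second_threshold=0.45)
```

The two variants generally give different values: the mean uses the magnitude of every user's probability, whereas the ratio only counts how many users clear the second preset threshold.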
A crawling module 205 to crawl a plurality of target named entities of the user from each target website.
A plurality of web page texts of the user may be crawled from each target website, where the web page texts of a target website include social information and behavior information of the user at that target website, and a plurality of target named entities of the target website are extracted from those web page texts. For example, the target website is NetEase Cloud Music, the web page texts include songs followed or shared by user A, named entity extraction is performed on the web page texts, and the obtained target named entities are ballad, campus, and the like (users from whom ballad and campus are extracted are found to generally lean toward "travel"). For another example, the target website is Xiaohongshu, the web page texts include shopping experiences followed or shared by user B, named entity extraction is performed on the web page texts, and the obtained target named entities are milk powder, baby carriage, and the like (users from whom milk powder and baby carriage are extracted are found to generally lean toward "mother-and-baby"). For another example, the target website is NetEase Cloud Classroom, the web page texts include video presentations followed or shared by user B, named entity extraction is performed on the web page texts, and the obtained target named entities are JAVA, SPRING, and the like (users from whom JAVA and SPRING are extracted are found to generally lean toward "programming education").
The first calculating module 206 is configured to calculate, by using the trained neural network, a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs.
In a specific embodiment, the calculating, by using the trained neural network, the second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs includes:
coding each target named entity and a target website to which the target named entity belongs into a feature vector of the target named entity;
inputting the feature vector of each target named entity into the trained neural network to obtain the probability value of each interest tag corresponding to the target named entity;
and calculating the mean value of the probability values of each interest tag over the plurality of target named entities to obtain the second probability value of the interest tag. For example, if the two target named entities are JAVA and SPRING, and the probability value of "programming education" (interest tag) is 0.9 for JAVA and 0.7 for SPRING, then the second probability value of "programming education" (interest tag) is (0.9 + 0.7) / 2 = 0.8.
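The averaging step can be sketched as below. The per-entity probabilities are taken as given, standing in for the outputs of the trained neural network; the function name and the "travel" values are hypothetical.

```python
def second_probabilities(entity_probs):
    """Average the per-entity tag probabilities into one value per interest tag."""
    if not entity_probs:
        raise ValueError("at least one target named entity is required")
    tags = next(iter(entity_probs.values())).keys()
    count = len(entity_probs)
    return {tag: sum(probs[tag] for probs in entity_probs.values()) / count
            for tag in tags}

# Stand-in network outputs for two target named entities.
entity_probs = {
    "JAVA": {"programming education": 0.9, "travel": 0.1},
    "SPRING": {"programming education": 0.7, "travel": 0.3},
}
p2 = second_probabilities(entity_probs)
```

For "programming education" the result is the mean of 0.9 and 0.7, matching the example.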
The encoding of each target named entity and the target website to which the target named entity belongs as the feature vector of the target named entity includes:
coding the target named entity into a first intermediate vector according to a preset coder (such as a one-hot coder and a word2vec coder);
encoding the target website to which the target named entity belongs into a second intermediate vector according to the preset encoder;
and connecting the first intermediate vector with the second intermediate vector, or multiplying the first intermediate vector with the second intermediate vector by elements to obtain the feature vector of the target named entity.
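The two ways of combining the intermediate vectors can be sketched as follows. The function name and the numeric vectors are hypothetical (they stand in for the output of an encoder such as word2vec). Element-wise multiplication additionally requires both intermediate vectors to have the same dimension.

```python
def encode_feature(entity_vec, site_vec, mode="concat"):
    """Combine the first and second intermediate vectors into one feature vector."""
    if mode == "concat":
        # Connection: the website vector is appended to the entity vector.
        return list(entity_vec) + list(site_vec)
    if mode == "multiply":
        if len(entity_vec) != len(site_vec):
            raise ValueError("element-wise product needs equal dimensions")
        return [a * b for a, b in zip(entity_vec, site_vec)]
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical intermediate vectors of a named entity and its website.
entity_vec = [0.2, 0.5, -0.1]
site_vec = [1.0, 0.0, 2.0]

concat_feature = encode_feature(entity_vec, site_vec, mode="concat")
product_feature = encode_feature(entity_vec, site_vec, mode="multiply")
```

Connection keeps both vectors intact at the cost of a longer feature vector, whereas the element-wise product keeps the dimension unchanged.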
Training the neural network may include:
obtaining a training sample and a label of the training sample, wherein the training sample comprises a target named entity and the target website to which the target named entity belongs;
coding the target named entity into a first vector according to a preset coding table, and coding a target website to which the target named entity belongs into a second vector according to the preset coding table;
splicing the first vector and the second vector to obtain a feature vector of the target named entity;
inputting the feature vector of the target named entity into a neural network with an initialization parameter value to obtain an output vector;
and optimizing the parameter value of the neural network through a back propagation algorithm according to the output vector and the label of the training sample.
A second calculating module 207, configured to calculate a third probability value of each interest tag based on a statistical method.
The calculating a third probability value for each interest tag based on a statistical method comprises:
acquiring a plurality of second historical users with registration information at the plurality of target websites, wherein the user interest portrait of each second historical user comprises a plurality of labels of the second historical user;
counting a first number of second historical users whose user interest portrait includes the interest tag;
counting a second number of the plurality of second historical users;
calculating the ratio of the first quantity to the second quantity, and taking the ratio of the first quantity to the second quantity as the third probability value.
For example, 4 second historical users (user A, user B, user C, and user D) who have registration information at both NetEase Cloud Music and Ctrip are acquired; the first number of second historical users having "travel" (interest tag) in the user interest portrait is counted as 3; the second number of second historical users is 4; the third probability value of "travel" (interest tag) is therefore 3/4 = 0.75.
A second determining module 208, configured to determine a maximum value of the first probability value, the second probability value, and the third probability value of each interest tag as a target probability value of the interest tag.
For example, if the first probability value of "travel" (interest tag) is 0.65, the second probability value of "travel" (interest tag) is 0.70, and the third probability value of "travel" (interest tag) is 0.75, then 0.75 is determined as the target probability value of "travel" (interest tag).
A third determining module 209, configured to determine the interest tag with the target probability value greater than the first preset threshold as the interest tag of the user.
For example, if the target probability value of "travel" (interest tag) is 0.75, the target probability value of "programming education" (interest tag) is 0.85, and the first preset threshold value is 0.80, the "programming education" is determined as the interest tag of the user.
The user interest representation device 20 of the second embodiment determines the interest tag of the user through the website associated with the interest of the user and the target named entity of the user in the target website, so that the accuracy of identifying the interest tag of the user can be improved; the target probability value of the interest label can be determined through the first probability value of the interest label obtained through the clustering method, the second probability value of the interest label obtained through the neural network and the third probability value of the interest label obtained based on statistics, and the risk of deviation can be reduced. According to the embodiment, the interest tags of the users are extracted according to the registration information of the users in each website, the accuracy of extracting the interest tags of the users is improved, the extracted interest tags of the users are used for describing the interest portraits of the users, and the accuracy of describing the interest portraits of the users is improved.
In another embodiment, the obtaining module is further configured to obtain the authorization of the user before the determining, according to the identification information, whether the registration information of the user exists in the plurality of websites.
Before judging whether the registration information of the user exists in the plurality of websites according to the identification information, an authorization option box can be issued to the user, and authorization options selected by the user in the authorization option box are received.
Example three
The present embodiment provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps in the above user interest representation method embodiment, such as steps 101 to 109 shown in FIG. 1:
101, acquiring a plurality of websites, a plurality of interest tags and identification information of a user;
102, judging whether the registration information of the user exists in the plurality of websites according to the identification information, and obtaining a plurality of target websites in which the registration information of the user exists;
103, generating a registration feature vector of the user according to a judgment result of whether the registration information of the user exists in the plurality of websites;
104, determining a first probability value of each interest label of the user according to the registered feature vector of the user by adopting a clustering method;
105, crawling a plurality of target named entities of the user from each target website;
106, calculating a second probability value of each interest label by using the trained neural network according to the plurality of target named entities and the target website to which each target named entity belongs;
107, calculating a third probability value of each interest tag based on a statistical method;
108, determining the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as a target probability value of the interest tag;
and 109, determining the interest label with the target probability value larger than a first preset threshold value as the interest label of the user.
Alternatively, the computer program, when executed by the processor, implements the functions of the modules in the above device embodiment, such as the modules 201 to 209 in FIG. 2:
an obtaining module 201, configured to obtain a plurality of websites, a plurality of interest tags, and identification information of a user;
a judging module 202, configured to judge whether the multiple websites have the registration information of the user according to the identification information, so as to obtain multiple target websites where the registration information of the user exists;
a generating module 203, configured to generate a registration feature vector of the user according to a determination result of whether the registration information of the user exists in the multiple websites;
a first determining module 204, configured to determine a first probability value of each interest tag of the user according to the registered feature vector of the user by using a clustering method;
a crawling module 205 to crawl a plurality of target named entities of the user from each target website;
the first calculating module 206 is configured to calculate, by using the trained neural network, a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs;
a second calculating module 207, configured to calculate a third probability value of each interest tag based on a statistical method;
a second determining module 208, configured to determine a maximum value of the first probability value, the second probability value, and the third probability value of each interest tag as a target probability value of the interest tag;
a third determining module 209, configured to determine the interest tag with the target probability value greater than the first preset threshold as the interest tag of the user.
Example four
FIG. 3 is a schematic diagram of a computer device according to a fourth embodiment of the present invention. The computer device 30 includes a memory 301, a processor 302, and a computer program 303, such as a user interest representation program, stored in the memory 301 and executable on the processor 302. The processor 302, when executing the computer program 303, implements the steps in the above user interest representation method embodiment, such as steps 101 to 109 shown in FIG. 1:
101, acquiring a plurality of websites, a plurality of interest tags and identification information of a user;
102, judging whether the registration information of the user exists in the plurality of websites according to the identification information, and obtaining a plurality of target websites in which the registration information of the user exists;
103, generating a registration feature vector of the user according to a judgment result of whether the registration information of the user exists in the plurality of websites;
104, determining a first probability value of each interest label of the user according to the registered feature vector of the user by adopting a clustering method;
105, crawling a plurality of target named entities of the user from each target website;
106, calculating a second probability value of each interest label by using the trained neural network according to the plurality of target named entities and the target website to which each target named entity belongs;
107, calculating a third probability value of each interest tag based on a statistical method;
108, determining the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as a target probability value of the interest tag;
and 109, determining the interest label with the target probability value larger than a first preset threshold value as the interest label of the user.
Alternatively, the computer program, when executed by the processor, implements the functions of the modules in the above device embodiment, such as the modules 201 to 209 in FIG. 2:
an obtaining module 201, configured to obtain a plurality of websites, a plurality of interest tags, and identification information of a user;
a judging module 202, configured to judge whether the multiple websites have the registration information of the user according to the identification information, so as to obtain multiple target websites where the registration information of the user exists;
a generating module 203, configured to generate a registration feature vector of the user according to a determination result of whether the registration information of the user exists in the multiple websites;
a first determining module 204, configured to determine a first probability value of each interest tag of the user according to the registered feature vector of the user by using a clustering method;
a crawling module 205 to crawl a plurality of target named entities of the user from each target website;
the first calculating module 206 is configured to calculate, by using the trained neural network, a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs;
a second calculating module 207, configured to calculate a third probability value of each interest tag based on a statistical method;
a second determining module 208, configured to determine a maximum value of the first probability value, the second probability value, and the third probability value of each interest tag as a target probability value of the interest tag;
a third determining module 209, configured to determine the interest tag with the target probability value greater than the first preset threshold as the interest tag of the user.
Illustratively, the computer program 303 may be partitioned into one or more modules that are stored in the memory 301 and executed by the processor 302 to perform the present method. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 303 in the computer device 30. For example, the computer program 303 may be divided into the obtaining module 201, the judging module 202, the generating module 203, the first determining module 204, the crawling module 205, the first calculating module 206, the second calculating module 207, the second determining module 208, and the third determining module 209 in FIG. 2, where the specific functions of the modules are described in embodiment two.
Those skilled in the art will appreciate that FIG. 3 is merely an example of the computer device 30 and does not constitute a limitation of the computer device 30, which may include more or fewer components than those shown, combine certain components, or use different components. For example, the computer device 30 may also include input and output devices, network access devices, buses, and the like.
The processor 302 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and the like. A general-purpose processor may be a microprocessor, or the processor 302 may be any conventional processor or the like. The processor 302 is the control center of the computer device 30 and connects the various parts of the computer device 30 through various interfaces and lines.
The memory 301 may be used to store the computer program 303, and the processor 302 may implement various functions of the computer device 30 by running or executing the computer program or module stored in the memory 301 and calling data stored in the memory 301. The memory 301 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the computer device 30, and the like. Further, the memory 301 may include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other non-volatile solid state storage device.
The modules integrated by the computer device 30 may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a Read-Only Memory (ROM).
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware form, and can also be realized in a form of hardware and a software functional module.
The integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to perform some steps of the method for representing a user interest according to various embodiments of the present invention.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned. Furthermore, it is to be understood that the word "comprising" does not exclude other modules or steps, and the singular does not exclude the plural. A plurality of modules or means recited in the system claims may also be implemented by one module or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A user interest portrait method, characterized by comprising the following steps:
acquiring a plurality of websites, a plurality of interest tags, and identification information of a user;
judging whether the plurality of websites have the registration information of the user according to the identification information to obtain a plurality of target websites having the registration information of the user;
generating a registration feature vector of the user according to a judgment result of whether the registration information of the user exists in the plurality of websites;
determining a first probability value of each interest tag of the user according to the registration feature vector of the user by adopting a clustering method;
crawling a plurality of target named entities of the user from each target website;
calculating a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs by using a trained neural network;
calculating a third probability value of each interest tag based on a statistical method;
determining the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as a target probability value of the interest tag;
and determining the interest tag with the target probability value larger than a first preset threshold value as the interest tag of the user.
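As an illustrative, non-limiting sketch of the final two steps of claim 1, together with one possible form of the registration feature vector, the following Python fragment may be considered. The 0/1 encoding, the function names `registration_vector` and `fuse_and_select`, and the dictionary representation of the probability values are assumptions made for illustration and are not taken from the claims.

```python
from typing import Dict, List, Set


def registration_vector(websites: List[str], registered: Set[str]) -> List[int]:
    """Assumed encoding: one component per website, 1 if the user's
    registration information exists there, otherwise 0."""
    return [1 if site in registered else 0 for site in websites]


def fuse_and_select(
    first: Dict[str, float],
    second: Dict[str, float],
    third: Dict[str, float],
    first_threshold: float,
) -> Dict[str, float]:
    """Take the maximum of the three probability values of each interest tag
    as its target probability value, then keep only the tags whose target
    probability value is larger than the first preset threshold."""
    target = {tag: max(first[tag], second[tag], third[tag]) for tag in first}
    return {tag: p for tag, p in target.items() if p > first_threshold}
```

In this sketch all three dictionaries are assumed to contain the same interest tags as keys; the returned dictionary maps each selected interest tag to its target probability value.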
2. The user interest portrait method of claim 1, wherein the judging whether the plurality of websites have the registration information of the user according to the identification information comprises:
searching for the identification information on a specified website among the plurality of websites;
if the search result of the specified website comprises the identification information, the specified website has the registration information of the user;
and if the search result of the specified website does not comprise the identification information, the specified website does not have the registration information of the user.
3. The user interest portrait method of claim 1, wherein the judging whether the plurality of websites have the registration information of the user according to the identification information comprises:
querying the registration information of the user, according to the identification information, through an interface authorized by a specified website among the plurality of websites;
if the specified website returns the registration information of the user, the specified website has the registration information of the user;
and if the specified website does not return the registration information of the user or the return value is null, the specified website does not have the registration information of the user.
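The two judging approaches of claims 2 and 3 may be sketched as follows. The callables `search` and `query` are hypothetical stand-ins for a website's search function and for its authorized interface; neither corresponds to a real API, and treating "the search result comprises the identification information" as a substring test is an assumption of this sketch.

```python
from typing import Callable, Iterable, Mapping, Optional


def registered_by_search(
    search: Callable[[str], Iterable[str]], identification: str
) -> bool:
    """Claim 2 style: the website has the user's registration information
    if any search result contains the identification information."""
    return any(identification in result for result in search(identification))


def registered_by_interface(
    query: Callable[[str], Optional[Mapping[str, str]]], identification: str
) -> bool:
    """Claim 3 style: the website has the user's registration information
    only if the interface returns a non-null, non-empty value."""
    info = query(identification)
    return bool(info)
```

Passing the search function and the interface as parameters keeps the judging logic independent of any particular website, so the same two functions can be applied to each of the plurality of websites in turn.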
4. The user interest portrait method of claim 1, wherein the determining a first probability value of each interest tag of the user according to the registration feature vector of the user by adopting a clustering method comprises:
acquiring a plurality of first historical users;
clustering the plurality of first historical users according to the registration feature vectors of the plurality of first historical users to obtain a plurality of user clusters and a center vector of each user cluster;
determining a target user cluster to which the user belongs according to the distance between the registration feature vector of the user and the center vector of each user cluster;
determining the mean value of the probability values of the target users in the target user cluster for a specified interest tag as the first probability value of the specified interest tag of the user, or determining the ratio of the number of target users in the target user cluster whose probability value for the specified interest tag is larger than a second preset threshold value to the total number of target users in the target user cluster as the first probability value of the specified interest tag of the user.
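The cluster assignment and the two alternative ways of obtaining the first probability value in claim 4 may be sketched as below. The sketch assumes that the clustering has already been performed, so that the center vectors and the per-member probability values are available, and it assumes Euclidean distance, which the claim does not specify. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def nearest_cluster(vector: Sequence[float], centers: List[Sequence[float]]) -> int:
    """Return the index of the cluster whose center vector is closest to the
    user's registration feature vector (Euclidean distance assumed)."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))


def first_probability_mean(member_probs: List[float]) -> float:
    """First alternative: mean of the cluster members' probability values
    for the specified interest tag."""
    return sum(member_probs) / len(member_probs)


def first_probability_ratio(member_probs: List[float], second_threshold: float) -> float:
    """Second alternative: share of cluster members whose probability value
    for the specified interest tag exceeds the second preset threshold."""
    above = sum(1 for p in member_probs if p > second_threshold)
    return above / len(member_probs)
```

Both alternatives assume a non-empty target user cluster; `member_probs` holds one probability value per target user in that cluster for the specified interest tag.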
5. The user interest portrait method of claim 1, wherein the calculating a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs by using a trained neural network comprises:
encoding each target named entity and the target website to which the target named entity belongs into a feature vector of the target named entity;
inputting the feature vector of each target named entity into the trained neural network to obtain the probability value of each interest tag corresponding to the target named entity;
and calculating the mean value of the probability values of each interest tag corresponding to the plurality of target named entities to obtain the second probability value of the interest tag.
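The averaging described in claim 5 may be sketched as follows. The encoder and the trained neural network are represented by the hypothetical callables `encode` and `model`, since the claim does not fix a particular encoding or network architecture. Returning 0.0 for every interest tag when no named entity was crawled is an assumption of this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def second_probabilities(
    entities: Sequence[Tuple[str, str]],  # (target named entity, target website)
    tags: Sequence[str],
    encode: Callable[[str, str], List[float]],
    model: Callable[[List[float]], Sequence[float]],
) -> Dict[str, float]:
    """Encode each (entity, website) pair, run the model on each feature
    vector, and average the per-tag outputs over all named entities."""
    if not entities:
        # Assumed fallback: no crawled entities means no evidence for any tag.
        return {tag: 0.0 for tag in tags}
    outputs = [model(encode(name, site)) for name, site in entities]
    return {
        tag: sum(out[i] for out in outputs) / len(outputs)
        for i, tag in enumerate(tags)
    }
```

Here `model` is assumed to return one probability value per interest tag, in the same order as `tags`.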
6. The user interest portrait method of claim 1, wherein the calculating a third probability value of each interest tag based on a statistical method comprises:
acquiring a plurality of second historical users having registration information at the plurality of target websites, wherein the user interest portrait of each second historical user comprises a plurality of interest tags of the second historical user;
counting a first number of second historical users whose user interest portrait comprises the interest tag;
counting a second number of the plurality of second historical users;
calculating the ratio of the first number to the second number, and taking the ratio as the third probability value.
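The ratio in claim 6 may be sketched as below, with each second historical user's interest portrait represented as a set of interest tags. The function name `third_probability` and the choice to return 0.0 when there are no second historical users are assumptions of this sketch.

```python
from typing import List, Set


def third_probability(tag: str, portraits: List[Set[str]]) -> float:
    """Ratio of the number of second historical users whose interest
    portrait contains the tag to the total number of such users."""
    second_number = len(portraits)
    if second_number == 0:
        # Assumed fallback when no second historical users are available.
        return 0.0
    first_number = sum(1 for portrait in portraits if tag in portrait)
    return first_number / second_number
```

For example, if three of four second historical users have the tag "music" in their interest portraits, the third probability value of "music" is 3/4 = 0.75.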
7. The user interest portrait method of any one of claims 1-6, wherein before the judging whether the plurality of websites have the registration information of the user according to the identification information, the user interest portrait method further comprises: obtaining the authorization of the user.
8. A user interest portrait apparatus, comprising:
an acquisition module, configured to acquire a plurality of websites, a plurality of interest tags and identification information of a user;
a judging module, configured to judge whether the plurality of websites have the registration information of the user according to the identification information to obtain a plurality of target websites having the registration information of the user;
a generating module, configured to generate a registration feature vector of the user according to the judgment result of whether the registration information of the user exists in the plurality of websites;
a first determining module, configured to determine a first probability value of each interest tag of the user according to the registration feature vector of the user by adopting a clustering method;
a crawling module, configured to crawl a plurality of target named entities of the user from each target website;
a first calculating module, configured to calculate a second probability value of each interest tag according to the plurality of target named entities and the target website to which each target named entity belongs by using a trained neural network;
a second calculating module, configured to calculate a third probability value of each interest tag based on a statistical method;
a second determining module, configured to determine the maximum value of the first probability value, the second probability value and the third probability value of each interest tag as the target probability value of the interest tag;
and a third determining module, configured to determine the interest tag with the target probability value larger than a first preset threshold value as the interest tag of the user.
9. A computer device, characterized in that the computer device comprises a processor for executing a computer program stored in a memory to implement the user interest portrait method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the user interest portrait method as claimed in any one of claims 1 to 7.
CN202010243221.7A 2020-03-31 2020-03-31 User interest portrait method and related equipment Pending CN111552865A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010243221.7A CN111552865A (en) 2020-03-31 2020-03-31 User interest portrait method and related equipment
PCT/CN2020/105900 WO2021196474A1 (en) 2020-03-31 2020-07-30 User interest profiling method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010243221.7A CN111552865A (en) 2020-03-31 2020-03-31 User interest portrait method and related equipment

Publications (1)

Publication Number Publication Date
CN111552865A true CN111552865A (en) 2020-08-18

Family

ID=72003804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010243221.7A Pending CN111552865A (en) 2020-03-31 2020-03-31 User interest portrait method and related equipment

Country Status (2)

Country Link
CN (1) CN111552865A (en)
WO (1) WO2021196474A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883269A (en) * 2021-02-26 2021-06-01 上海连尚网络科技有限公司 Method and equipment for adjusting label data information

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN114840743B (en) * 2022-03-01 2023-02-07 深圳市小秤砣科技有限公司 Model recommendation method and device, electronic equipment and readable storage medium
CN114489447B (en) * 2022-03-28 2022-07-12 山东大学 Word processing control method and system based on user behavior and readable storage medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
WO2011062883A1 (en) * 2009-11-20 2011-05-26 Ustream, Inc. Broadcast notifications using social networking systems
CN106874435B (en) * 2017-01-25 2020-02-14 北京航空航天大学 User portrait construction method and device
CN108596655A (en) * 2018-04-10 2018-09-28 四川金亿信财务咨询有限公司 A kind of statistics extension system for information of being registered based on advertisement
CN109408735B (en) * 2018-10-11 2021-06-25 杭州飞弛网络科技有限公司 Stranger social user portrait generation method and system
CN109992632A (en) * 2019-01-14 2019-07-09 江苏智途科技股份有限公司 A kind of spatial data intelligence distribution method of servicing based on big data
CN110298029B (en) * 2019-05-22 2022-07-12 平安科技(深圳)有限公司 Friend recommendation method, device, equipment and medium based on user corpus

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN112883269A (en) * 2021-02-26 2021-06-01 上海连尚网络科技有限公司 Method and equipment for adjusting label data information
CN112883269B (en) * 2021-02-26 2024-05-31 上海连尚网络科技有限公司 Method and equipment for adjusting tag data information

Also Published As

Publication number Publication date
WO2021196474A1 (en) 2021-10-07

Similar Documents

Publication Publication Date Title
CN111695033B (en) Enterprise public opinion analysis method, enterprise public opinion analysis device, electronic equipment and medium
US11886555B2 (en) Online identity reputation
CN112417096B (en) Question-answer pair matching method, device, electronic equipment and storage medium
WO2019119505A1 (en) Face recognition method and device, computer device and storage medium
US20180261214A1 (en) Sequence-to-sequence convolutional architecture
CN111552865A (en) User interest portrait method and related equipment
JP6532523B2 (en) Management of user identification registration using handwriting
US20230032728A1 (en) Method and apparatus for recognizing multimedia content
WO2022105496A1 (en) Intelligent follow-up contact method and apparatus, and electronic device and readable storage medium
US20240184865A1 (en) Systems and methods for providing user validation
CN111666415A (en) Topic clustering method and device, electronic equipment and storage medium
CN113094478B (en) Expression reply method, device, equipment and storage medium
US11275994B2 (en) Unstructured key definitions for optimal performance
CN115130711A (en) Data processing method and device, computer and readable storage medium
CN111538816A (en) Question-answering method, device, electronic equipment and medium based on AI identification
KR20220070181A (en) Method and system for detecting duplicated document using document similarity measuring model based on deep learning
CN114037545A (en) Client recommendation method, device, equipment and storage medium
CN113626704A (en) Method, device and equipment for recommending information based on word2vec model
CN112507095A (en) Information identification method based on weak supervised learning and related equipment
CN114219971A (en) Data processing method, data processing equipment and computer readable storage medium
CN115618415A (en) Sensitive data identification method and device, electronic equipment and storage medium
CN113656699A (en) User feature vector determination method, related device and medium
CN113821612A (en) Information searching method and device
CN111209403B (en) Data processing method, device, medium and electronic equipment
CN112036439A (en) Dependency relationship classification method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code Ref country code: HK; Ref legal event code: DE; Ref document number: 40033541; Country of ref document: HK

SE01 Entry into force of request for substantive examination