CN104778283A - User occupation classification method and system based on microblog - Google Patents

User occupation classification method and system based on microblog

Info

Publication number
CN104778283A
Authority
CN
China
Prior art keywords
user
occupation type
microblogging text
occupational
microblogging
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510236383.7A
Other languages
Chinese (zh)
Other versions
CN104778283B (en)
Inventor
李寿山
戴斌
周国栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN201510236383.7A
Publication of CN104778283A
Application granted
Publication of CN104778283B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a microblog-based user occupation classification method and system. The method comprises: acquiring a preset number of first users, the first users being microblog users who have provided occupational information; acquiring the occupational information and microblog text of each first user; classifying the first users according to their occupational information and determining the occupation type of each first user from the classification result; performing word segmentation on the microblog text of each first user to obtain a first text word group; constructing a first feature vector for the microblog text of each first user from the first text word group, and building a maximum entropy classifier from the occupation types and first feature vectors of the first users; and processing second feature vectors with the maximum entropy classifier to obtain the occupation types of the second users to whom the corresponding microblog text belongs. The occupation types of microblog users whose occupational information is unknown can thereby be accurately acquired.

Description

User occupation classification method and system based on microblog
Technical field
The present invention relates to the technical fields of natural language processing and social networks, and more particularly to a microblog-based user occupation classification method and system.
Background art
The openness, virtuality, and sharing nature of the Internet have gradually made it a common platform on which people express opinions, attitudes, feelings, and moods. At the same time, a large number of Internet-based social networking sites have emerged, including microblogs. A growing body of research focuses on microblogs, and one important line of that research is microblog user feature analysis.
Microblog user feature analysis mines user features by applying decision tree analysis, correlation analysis, and association rules to the information and relationship data of microblog users, and then uses those features for user classification, usage mining, influence detection, and similar tasks. The occupation of a microblog user is an important part of this analysis: users are assigned to specific categories according to their occupation, for example students, professionals, computer-related occupations, and sales.
However, the prior art offers no technical solution for classifying microblog users by occupation on the basis of their microblogs.
Summary of the invention
The object of the present invention is to provide a microblog-based user occupation classification method and system that determine the occupation type of a microblog user from the user's microblog text.
To achieve this object, the present invention provides the following technical solution:
A microblog-based user occupation classification method, comprising:
Obtaining a preset number of first users, the first users being microblog users who have provided occupational information;
Obtaining the occupational information and microblog text of each first user;
Classifying the first users according to their occupational information, and determining the occupation type of each first user from the result of the classification;
Performing word segmentation on the microblog text of each first user to obtain a first text word group corresponding to the microblog text of each first user;
Constructing, from the first text word groups, a first feature vector corresponding to the microblog text of each first user, and building a maximum entropy classifier from the occupation types of the first users and the first feature vectors;
Processing a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs.
Preferably, classifying the first users according to their occupational information and determining the occupation type of each first user from the result of the classification comprises:
Determining the occupational nature of each first user from the occupational information of the first user;
Dividing the occupation types of the first users into a first class and a second class according to a preset standard, based on the occupational nature of each first user; the first class is the mental labor occupation type, and the second class is the manual labor occupation type.
Preferably, obtaining the occupational information of a first user comprises:
Obtaining the occupation label of the first user from the personal information provided by the first user;
Determining the occupation corresponding to the occupation label to be the occupational information of the first user.
Preferably, the method further comprises:
Building a first user list, and storing the preset number of first users in the first user list for later query.
Preferably, processing a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs comprises:
Obtaining a second user, the second user being a microblog user who has not provided occupational information;
Obtaining the microblog text of the second user;
Performing word segmentation on the microblog text of the second user to obtain a second text word group;
Building the second feature vector from the second text word group and the first text word groups;
Using the second feature vector as the input of the maximum entropy classifier to obtain a classification result;
Determining the occupation type of the second user from the classification result.
A microblog-based user occupation classification system, comprising a first user acquisition module, an occupation type determination module, a first text word group acquisition module, a classifier construction module, and a classification module, wherein:
The first user acquisition module is configured to obtain a preset number of first users, together with the occupational information and microblog text of each first user, the first users being microblog users who have provided occupational information;
The occupation type determination module is configured to classify the first users according to their occupational information, and to determine the occupation type of each first user from the result of the classification;
The first text word group acquisition module is configured to perform word segmentation on the microblog text of each first user to obtain a first text word group corresponding to the microblog text of each first user;
The classifier construction module is configured to construct, from the first text word groups, a first feature vector corresponding to the microblog text of each first user, and to build a maximum entropy classifier from the occupation types of the first users and the first feature vectors;
The classification module is configured to process a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs.
The microblog-based user occupation classification method and system provided by the present invention comprise: obtaining a preset number of first users, the first users being microblog users who have provided occupational information; obtaining the occupational information and microblog text of each first user; classifying the first users according to their occupational information, and determining the occupation type of each first user from the result of the classification; performing word segmentation on the microblog text of each first user to obtain a first text word group corresponding to the microblog text of each first user; constructing, from the first text word groups, a first feature vector corresponding to the microblog text of each first user, and building a maximum entropy classifier from the occupation types of the first users and the first feature vectors; and processing a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs. Thus, a maximum entropy classifier is built from the occupational information and microblog text of first users whose occupational information is known, and the classifier is then used to classify second users, whose occupational information is unknown, by occupation from their microblog text. In this way, the occupation type of a microblog user with unknown occupational information can be accurately determined.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a microblog-based user occupation classification method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a microblog-based user occupation classification system provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, which shows a flowchart of a microblog-based user occupation classification method provided by an embodiment of the present invention, the method can comprise the following steps:
S11: Obtain a preset number of first users, the first users being microblog users who have provided occupational information.
It should be noted that existing microblog platforms all provide a specific location or web page where users can fill in personal information, which includes occupational information; users may fill it in as they see fit.
S12: Obtain the occupational information and microblog text of each first user.
The microblog text of a first user can be obtained through the API (Application Programming Interface) provided by the microblog platform; the microblog text can be regarded as the text the first user has posted.
S13: Classify the first users according to their occupational information, and determine the occupation type of each first user from the result of the classification.
S14: Perform word segmentation on the microblog text of each first user to obtain a first text word group corresponding to the microblog text of each first user.
Word segmentation in this embodiment can be performed with the word segmentation software ICTCLAS.
S15: Construct, from the first text word groups, a first feature vector corresponding to the microblog text of each first user, and build a maximum entropy classifier from the occupation types of the first users and the first feature vectors.
The first feature vectors are labeled with the occupation types of the first users, and the maximum entropy toolkit provided by Mallet can then be used to build the maximum entropy classifier from the occupation types and the first feature vectors.
S16: Process a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs.
Thus, a maximum entropy classifier is built from the occupational information and microblog text of first users whose occupational information is known, and the classifier is then used to classify second users, whose occupational information is unknown, by occupation from their microblog text. In this way, the occupation type of a microblog user with unknown occupational information can be accurately determined.
The maximum entropy classifier is a machine learning classification method grounded in the maximum entropy principle from information theory. Its basic idea is to model everything that is known while making no assumptions about what is unknown. In other words, it seeks a probability distribution that satisfies all known facts but leaves the unknown factors as random as possible. Compared with the naive Bayes method, its main advantage is that it does not require the features to be conditionally independent of one another. It is therefore well suited to combining many different kinds of features without having to consider how they interact.
Under the maximum entropy model, let p(y|X) denote the probability that sample X belongs to class y. The model requires p(y|X) to satisfy certain constraints while maximizing the entropy given by the following formula:
$$H(p) = -\sum_{X,y} p(y|X)\,\log p(y|X)$$
Here H(p) denotes the conditional entropy H(y|X), which measures the uniformity of the conditional probability p(y|X); the notation H(p) emphasizes its dependence on the probability distribution p. The constraints mentioned above correspond to all known facts and are stated in terms of the features $f_i(X, y)$ of the maximum entropy model.
These features describe the relationship between the vector X and the class y. The final probability output is:
$$p(y|X) = \frac{1}{Z(X)} \exp\left(\sum_i \lambda_i f_i(X, y)\right)$$
where
$$Z(X) = \sum_y \exp\left(\sum_i \lambda_i f_i(X, y)\right)$$
is called the normalization factor, and $\lambda_i$ is the weight of feature $f_i$.
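The probability output above can be evaluated directly once the weights and features are known. The following Python sketch is illustrative only and is not the Mallet implementation the embodiment refers to; the function names, the `joint_features` helper, and the class names "mental" and "manual" are assumptions introduced here.

```python
import math

def maxent_probabilities(weights, feature_fn, x, labels):
    """Return {y: p(y|X)} using the exponential form p(y|X) = exp(sum_i w_i f_i) / Z(X).

    weights    -- list of feature weights lambda_i
    feature_fn -- hypothetical callable returning [f_1(X, y), ..., f_n(X, y)]
    """
    scores = {}
    for y in labels:
        feats = feature_fn(x, y)
        scores[y] = sum(w * f for w, f in zip(weights, feats))
    # Subtract the largest score before exponentiating for numerical stability;
    # the shared factor cancels in the ratio, so the probabilities are unchanged.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())  # normalization factor Z(X), scaled by exp(-top)
    return {y: e / z for y, e in exps.items()}

def joint_features(x, y, labels=("mental", "manual")):
    """Hypothetical features: one copy of the vector x per class, active only for class y."""
    feats = []
    for c in labels:
        for xj in x:
            feats.append(xj if y == c else 0)
    return feats
```

With all weights set to zero the result is the uniform distribution over the classes, which matches the idea that nothing is assumed beyond the known facts.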
Experiments show that the microblog-text-based user occupation classification method provided by this embodiment of the present invention classifies microblog users with an accuracy above 0.56.
It should be noted that, in the above embodiment, constructing from the first text word groups a first feature vector corresponding to the microblog text of each first user can specifically be done as follows:
All words in all of the first text word groups are extracted to form the feature space, and a training feature vector is built for the first text word group of each microblog text according to the vector space model (VSM). The vector space model is implemented as follows:
First, each first text word group is represented as the set of all words it contains (each word is a feature item): Document = D(t_1, t_2, ..., t_n), where t_k (1 ≤ k ≤ n) is a feature item. For example, if a first text word group contains the four feature items s, t, m, and n, it can be expressed as Document = D(s, t, m, n). Next, all feature items across the first text word groups are extracted as the feature space. For example, if the first text word groups contain N feature items in total, the feature space can be expressed as an N-dimensional coordinate system: Vector = V(t_1, t_2, t_3, ..., t_N). Finally, the feature vector of each first text word group is constructed over this feature space: if the first text word group contains a given feature item of the feature space, the coordinate for that feature item is set to 1; otherwise it is set to 0.
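The vector space model steps above can be sketched in Python as follows. This is a minimal illustration that assumes the text has already been segmented into lists of words; the function names are introduced here for the example and do not come from the patent.

```python
def build_feature_space(word_groups):
    """Collect every distinct word across all first text word groups, in first-seen order."""
    space = []
    seen = set()
    for group in word_groups:
        for word in group:
            if word not in seen:
                seen.add(word)
                space.append(word)
    return space

def to_feature_vector(word_group, feature_space):
    """Binary vector: 1 where the word group contains the feature item, otherwise 0."""
    present = set(word_group)
    return [1 if term in present else 0 for term in feature_space]

# Two already-segmented first text word groups (toy data).
groups = [["s", "t", "m", "n"], ["t", "k"]]
space = build_feature_space(groups)
vectors = [to_feature_vector(g, space) for g in groups]
```

Keeping the feature space in a fixed order matters, because every feature vector must use the same coordinate for the same word.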
The process of obtaining the preset number of first users can specifically be as follows:
(1) Build a first user list, and add any microblog user to it as the current first user; this user can be an official account or a well-known microblog user;
(2) Obtain the associated users who have some connection with the current first user, for example microblog users on whose microblog text the current first user has commented, or who have commented on microblog text posted by the current first user; define these associated users as first users and add them to the first user list;
(3) Choose any of the above associated users as the current first user and return to step (2), until the number of first users in the first user list reaches the preset number.
This process includes building a first user list and storing the preset number of first users in it, which makes it convenient for staff to query the list whenever necessary.
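Steps (1) to (3) amount to expanding a user list outward from a seed user. The Python sketch below is one possible reading, not the patent's implementation: it visits associated users in breadth-first order, whereas the text allows any associated user to be chosen next, and `get_associated_users` is a hypothetical callable standing in for whatever the microblog platform provides.

```python
from collections import deque

def collect_first_users(seed, get_associated_users, preset_number):
    """Grow the first user list from a seed until it holds preset_number users."""
    first_users = [seed]              # step (1): the seed is the current first user
    seen = {seed}
    queue = deque([seed])
    while queue and len(first_users) < preset_number:
        current = queue.popleft()
        for user in get_associated_users(current):   # step (2): associated users
            if user in seen:
                continue
            seen.add(user)
            first_users.append(user)
            queue.append(user)        # step (3): candidate for the next current user
            if len(first_users) >= preset_number:
                break
    return first_users

# Toy comment graph used only for illustration.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["e"], "d": [], "e": []}
users = collect_first_users("a", lambda u: graph.get(u, []), 4)
```

The `seen` set prevents the same user from being added twice when two users have commented on each other. If the graph runs out before the preset number is reached, the function returns the users found so far.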
It should be noted that, in the microblog-based user occupation classification method provided by the above embodiment, obtaining the occupational information of a first user can comprise:
Obtaining the occupation label of the first user from the personal information provided by the first user;
Determining the occupation corresponding to the occupation label to be the occupational information of the first user.
In the personal information provided on existing microblogs, users may fill in their information by choosing from a unified set of occupation labels. In that case, the occupation corresponding to the occupation label is determined to be the occupational information. This ensures that the relevant steps of the microblog-based occupation classification method provided by the above embodiment proceed smoothly.
In addition, in the microblog-based user occupation classification method provided by the above embodiment, processing a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs can comprise:
Obtaining a second user, the second user being a microblog user who has not provided occupational information;
Obtaining the microblog text of the second user;
Performing word segmentation on the microblog text of the second user to obtain a second text word group;
Building the second feature vector from the second text word group and the first text word groups;
Using the second feature vector as the input of the maximum entropy classifier to obtain a classification result;
Determining the occupation type of the second user from the classification result.
For the process of building the second feature vector from the second text word group and the first text word groups, refer to the discussion above of how the first feature vectors are constructed. In this way, the occupation type of the second user can be determined from the maximum entropy classifier and the microblog text of the second user.
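The classification of a second user can be sketched end to end in Python. This is a stand-in under stated assumptions, not the method's actual tooling: the embodiment uses the Mallet maximum entropy toolkit, whereas the code below fits an equivalent multinomial logistic model with plain stochastic gradient ascent. The function names, the toy vocabulary, the class names, and the `epochs` and `lr` values are all hypothetical.

```python
import math

def predict_proba(w, x):
    """p(y|X) for every class, using per-class weight vectors w[c]."""
    scores = {c: sum(wj * xj for wj, xj in zip(ws, x)) for c, ws in w.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train_maxent(vectors, labels, classes, epochs=200, lr=0.5):
    """Fit the weights by stochastic gradient ascent on the conditional log-likelihood."""
    n = len(vectors[0])
    w = {c: [0.0] * n for c in classes}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = predict_proba(w, x)
            for c in classes:
                err = (1.0 if c == y else 0.0) - p[c]   # observed minus expected
                for j, xj in enumerate(x):
                    if xj:
                        w[c][j] += lr * err * xj
    return w

def classify_second_user(w, second_word_group, feature_space):
    """Project the second text word group onto the first users' feature space, then classify."""
    present = set(second_word_group)
    x = [1 if t in present else 0 for t in feature_space]
    p = predict_proba(w, x)
    return max(p, key=p.get)

# Toy feature space and labeled first feature vectors (illustrative only).
feature_space = ["code", "server", "brick", "cement"]
first_vectors = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
first_labels = ["mental", "mental", "manual", "manual"]
model = train_maxent(first_vectors, first_labels, ("mental", "manual"))
```

Words in the second user's text that never appeared in any first text word group have no coordinate in the feature space and are simply ignored, which is one plausible way to read building the second feature vector from both word groups.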
In the microblog-based user occupation classification method provided by the above embodiment, classifying the first users according to their occupational information and determining the occupation type of each first user from the result of the classification can comprise:
Determining the occupational nature of each first user from the occupational information of the first user;
Dividing the occupation types of the first users into a first class and a second class according to a preset standard, based on the occupational nature of each first user; the first class is the mental labor occupation type, and the second class is the manual labor occupation type.
Dividing occupation types into a mental labor type and a manual labor type is only an example. In actual implementation, both the categories of occupation type and their number can be set by staff according to actual needs, and are not limited to the categories described in the embodiments of the present invention.
In addition, when the first feature vectors corresponding to the microblog text of each first user are constructed from the first text word groups and the maximum entropy classifier is built from the occupation types of the first users and the first feature vectors, the first feature vectors can be labeled with the occupation types of the first users. A different numeric label can be assigned to each occupation type, and the first feature vectors are then labeled with these numbers. Table 1 shows an example of one way to classify occupation types together with the corresponding numeric labels.
Table 1: Description of occupation types
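The labeling step can be illustrated with a small Python sketch. Every occupation name, class assignment, and numeric label below is a hypothetical placeholder chosen for the example; none of them is taken from Table 1.

```python
# Hypothetical assignment of occupations to the two example occupation types.
OCCUPATION_TYPE_BY_OCCUPATION = {
    "teacher": "mental",
    "programmer": "mental",
    "construction worker": "manual",
    "courier": "manual",
}

# Hypothetical numeric labels, one per occupation type.
LABEL_BY_TYPE = {"mental": 0, "manual": 1}

def occupation_label(occupation):
    """Return the numeric label for an occupation, or None if it is not covered."""
    occupation_type = OCCUPATION_TYPE_BY_OCCUPATION.get(occupation.strip().lower())
    if occupation_type is None:
        return None
    return LABEL_BY_TYPE[occupation_type]

def label_vectors(occupations, vectors):
    """Pair each first feature vector with its label, skipping uncovered occupations."""
    labeled = []
    for occupation, vector in zip(occupations, vectors):
        label = occupation_label(occupation)
        if label is not None:
            labeled.append((label, vector))
    return labeled
```

Keeping the two mappings separate makes it straightforward to change the number of occupation types later, which the embodiment explicitly allows.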
Corresponding to the above method embodiment, the present invention also provides a microblog-based user occupation classification system. As shown in Fig. 2, the system can comprise a first user acquisition module 21, an occupation type determination module 22, a first text word group acquisition module 23, a classifier construction module 24, and a classification module 25, wherein:
The first user acquisition module 21 is configured to obtain a preset number of first users, together with the occupational information and microblog text of each first user, the first users being microblog users who have provided occupational information;
The occupation type determination module 22 is configured to classify the first users according to their occupational information, and to determine the occupation type of each first user from the result of the classification;
The first text word group acquisition module 23 is configured to perform word segmentation on the microblog text of each first user to obtain a first text word group corresponding to the microblog text of each first user;
The classifier construction module 24 is configured to construct, from the first text word groups, a first feature vector corresponding to the microblog text of each first user, and to build a maximum entropy classifier from the occupation types of the first users and the first feature vectors;
The classification module 25 is configured to process a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs.
Thus, the above system builds a maximum entropy classifier from the occupational information and microblog text of first users whose occupational information is known, and then uses the classifier to classify second users, whose occupational information is unknown, by occupation from their microblog text. In this way, the occupation type of a microblog user with unknown occupational information can be accurately determined.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A microblog-based user occupation classification method, characterized by comprising:
Obtaining a preset number of first users, the first users being microblog users who have provided occupational information;
Obtaining the occupational information and microblog text of each first user;
Classifying the first users according to the occupational information of the first users, and determining the occupation type of each first user from the result of the classification;
Performing word segmentation on the microblog text of each first user to obtain a first text word group corresponding to the microblog text of each first user;
Constructing, from the first text word groups, a first feature vector corresponding to the microblog text of each first user, and building a maximum entropy classifier from the occupation types of the first users and the first feature vectors;
Processing a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs.
2. The method according to claim 1, characterized in that classifying the first users according to the occupational information of the first users and determining the occupation type of each first user from the result of the classification comprises:
Determining the occupational nature of each first user from the occupational information of the first user;
Dividing the occupation types of the first users into a first class and a second class according to a preset standard, based on the occupational nature of each first user; the first class is the mental labor occupation type, and the second class is the manual labor occupation type.
3. The method according to claim 2, characterized in that obtaining the occupational information of the first user comprises:
Obtaining the occupation label of the first user from the personal information provided by the first user;
Determining the occupation corresponding to the occupation label to be the occupational information of the first user.
4. The method according to claim 3, characterized in that the method further comprises:
Building a first user list, and storing the preset number of first users in the first user list for later query.
5. The method according to claim 1, characterized in that processing a second feature vector with the maximum entropy classifier to obtain the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs comprises:
Obtaining a second user, the second user being a microblog user who has not provided occupational information;
Obtaining the microblog text of the second user;
Performing word segmentation on the microblog text of the second user to obtain a second text word group;
Building the second feature vector from the second text word group and the first text word groups;
Using the second feature vector as the input of the maximum entropy classifier to obtain a classification result;
Determining the occupation type of the second user from the classification result.
6. A microblog-based user occupation classification system, characterized by comprising a first user acquisition module, an occupation type determination module, a first text word group acquisition module, a classifier construction module and a classification module, wherein:
The first user acquisition module is configured to acquire a preset number of first users, together with the occupational information and microblog text of each first user, the first users being microblog users who have provided occupational information;
The occupation type determination module is configured to classify the first users according to their occupational information and to determine the occupation type of each first user from the classification result;
The first text word group acquisition module is configured to perform word segmentation on the microblog text of each first user, obtaining a first text word group corresponding to the microblog text of each first user;
The classifier construction module is configured to use the first text word groups to form a first feature vector corresponding to the microblog text of each first user, and to construct a maximum entropy classifier from the occupation types of the first users and the first feature vectors;
The classification module is configured to process a second feature vector with the maximum entropy classifier, obtaining the occupation type of the second user to whom the microblog text corresponding to the second feature vector belongs.
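The modules of claim 6 can be wired together roughly as follows. This is a minimal sketch under several assumptions: the maximum entropy classifier is implemented as multinomial logistic regression trained by per-example gradient ascent (an equivalent formulation, but not necessarily the trainer the inventors used), whitespace splitting stands in for Chinese word segmentation, and the occupation types are passed in already determined rather than derived from raw occupational information. Class and method names are hypothetical.

```python
import math
from collections import Counter


class MaxEntClassifier:
    """Multinomial logistic regression, i.e. a maximum entropy model."""

    def __init__(self, labels, n_features):
        self.labels = sorted(set(labels))
        self.weights = {label: [0.0] * n_features for label in self.labels}

    def probabilities(self, vector):
        scores = {label: sum(w * x for w, x in zip(ws, vector))
                  for label, ws in self.weights.items()}
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {label: math.exp(s - top) for label, s in scores.items()}
        total = sum(exps.values())
        return {label: e / total for label, e in exps.items()}

    def fit(self, vectors, labels, epochs=200, learning_rate=0.5):
        # Per-example gradient ascent on the log-likelihood (no regularization).
        for _ in range(epochs):
            for vector, gold in zip(vectors, labels):
                probs = self.probabilities(vector)
                for label in self.labels:
                    error = (1.0 if label == gold else 0.0) - probs[label]
                    ws = self.weights[label]
                    for i, x in enumerate(vector):
                        if x:
                            ws[i] += learning_rate * error * x
        return self

    def predict(self, vector):
        probs = self.probabilities(vector)
        return max(probs, key=probs.get)


class OccupationClassificationSystem:
    def __init__(self, segment=str.split):
        self.segment = segment  # word group acquisition (stand-in segmenter)
        self.vocabulary = {}
        self.classifier = None

    def _vector(self, text):
        vector = [0] * len(self.vocabulary)
        for word, count in Counter(self.segment(text)).items():
            if word in self.vocabulary:
                vector[self.vocabulary[word]] = count
        return vector

    def train(self, first_users):
        # first_users: (occupation_type, microblog_text) pairs for labeled users.
        words = sorted({w for _, text in first_users for w in self.segment(text)})
        self.vocabulary = {word: i for i, word in enumerate(words)}
        labels = [occupation for occupation, _ in first_users]
        vectors = [self._vector(text) for _, text in first_users]
        self.classifier = MaxEntClassifier(labels, len(words)).fit(vectors, labels)

    def classify(self, text):
        # Classification module: second feature vector -> occupation type.
        return self.classifier.predict(self._vector(text))


system = OccupationClassificationSystem()
system.train([
    ("doctor", "patient ward night shift"),
    ("doctor", "surgery patient recovery"),
    ("engineer", "code bug deploy"),
    ("engineer", "server code release"),
])
predicted = system.classify("fixing a bug before deploy")
print(predicted)  # engineer
```

A production system would replace the toy trainer with an established maximum entropy or logistic regression implementation and add regularization; the structure of the pipeline would stay the same.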
CN201510236383.7A 2015-05-11 2015-05-11 User occupation classification method and system based on microblog Active CN104778283B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510236383.7A CN104778283B (en) 2015-05-11 2015-05-11 User occupation classification method and system based on microblog

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510236383.7A CN104778283B (en) 2015-05-11 2015-05-11 User occupation classification method and system based on microblog

Publications (2)

Publication Number Publication Date
CN104778283A (en) 2015-07-15
CN104778283B (en) 2018-05-01

Family

ID=53619747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510236383.7A Active CN104778283B (en) 2015-05-11 2015-05-11 User occupation classification method and system based on microblog

Country Status (1)

Country Link
CN (1) CN104778283B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105243094A (en) * 2015-09-11 2016-01-13 苏州大学张家港工业技术研究院 Microblog text and personal information based user occupation classification method and system
CN105868180A (en) * 2016-04-11 2016-08-17 苏州大学 Method, device and system for Internet user data processing
CN105869073A (en) * 2016-04-11 2016-08-17 苏州大学 Internet user data processing method, educational background type classifying device and educational background type classifying system
CN106095915A (en) * 2016-06-08 2016-11-09 百度在线网络技术(北京)有限公司 Method and device for processing user identity
CN106228453A (en) * 2016-08-08 2016-12-14 联动优势科技有限公司 Method and apparatus for obtaining user occupational information
CN107577660A (en) * 2017-07-21 2018-01-12 阿里巴巴集团控股有限公司 Category information identification method, device and server
CN107908620A (en) * 2017-11-15 2018-04-13 珠海金山网络游戏科技有限公司 Method and apparatus for predicting user occupation based on work documents

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130124191A1 (en) * 2011-11-14 2013-05-16 Microsoft Corporation Microblog summarization
CN103279549A (en) * 2013-06-07 2013-09-04 苏州大学 Method and device for acquiring target data of target objects

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130124191A1 (en) * 2011-11-14 2013-05-16 Microsoft Corporation Microblog summarization
CN103279549A (en) * 2013-06-07 2013-09-04 苏州大学 Method and device for acquiring target data of target objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
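何跃 (He Yue): "Research on Emotion Recognition and Classification of Chinese Microblogs" (中文微博的情绪识别与分类研究), 《情报杂志》 (Journal of Intelligence) *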

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105243094A (en) * 2015-09-11 2016-01-13 苏州大学张家港工业技术研究院 Microblog text and personal information based user occupation classification method and system
CN105868180A (en) * 2016-04-11 2016-08-17 苏州大学 Method, device and system for Internet user data processing
CN105869073A (en) * 2016-04-11 2016-08-17 苏州大学 Internet user data processing method, educational background type classifying device and educational background type classifying system
CN106095915A (en) * 2016-06-08 2016-11-09 百度在线网络技术(北京)有限公司 Method and device for processing user identity
CN106228453A (en) * 2016-08-08 2016-12-14 联动优势科技有限公司 Method and apparatus for obtaining user occupational information
CN107577660A (en) * 2017-07-21 2018-01-12 阿里巴巴集团控股有限公司 Category information identification method, device and server
CN107577660B (en) * 2017-07-21 2020-07-03 阿里巴巴集团控股有限公司 Category information identification method and device and server
CN107908620A (en) * 2017-11-15 2018-04-13 珠海金山网络游戏科技有限公司 Method and apparatus for predicting user occupation based on work documents

Also Published As

Publication number Publication date
CN104778283B (en) 2018-05-01

Similar Documents

Publication Publication Date Title
CN106708966B (en) Junk comment detection method based on similarity calculation
CN104778283A (en) User occupation classification method and system based on microblog
CN104102626B (en) Method for short-text semantic similarity measurement
CN104574192B (en) Method and device for identifying same user in multiple social networks
CN103294778B (en) Method and system for pushing information
CN103246670B (en) Microblog ranking, search and display methods and system
CN104750798B (en) Recommendation method and device for application program
CN103744981A (en) System for automatic classification analysis for website based on website content
CN110750640A (en) Text data classification method and device based on neural network model and storage medium
CN105302810A (en) Information search method and apparatus
CN103761254A (en) Method for matching and recommending service themes in various fields
CN102486791A (en) Method and server for intelligently classifying bookmarks
CN110427480B (en) Intelligent personalized text recommendation method and device and computer readable storage medium
CN104317784A (en) Cross-platform user identification method and cross-platform user identification system
CN106156135A (en) Method and device for querying data
CN105843796A (en) Microblog emotional tendency analysis method and device
Razenshteyn High-dimensional similarity search and sketching: algorithms and hardness
CN114357117A (en) Transaction information query method and device, computer equipment and storage medium
CN110609958A (en) Data pushing method and device, electronic equipment and storage medium
CN104346408A (en) Method and equipment for labeling network user
CN106919588A (en) Application program search system and method
CN110232131A (en) Intention material searching method and device based on intention label
CN103577547A (en) Webpage type identification method and device
CN104598624A (en) User class determination method and device for microblog user
CN104809236A (en) Microblog-based user age classification method and Microblog-based user age classification system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant