CN109961075A - User gender prediction method, apparatus, medium and electronic equipment - Google Patents

User gender prediction method, apparatus, medium and electronic equipment

Info

Publication number
CN109961075A
Authority
CN
China
Prior art keywords
sample
feature
information
user
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711405558.8A
Other languages
Chinese (zh)
Inventor
陈岩
刘耀勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201711405558.8A
Priority to PCT/CN2018/115358 (WO2019120007A1)
Publication of CN109961075A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The user gender prediction method, apparatus, storage medium and electronic device provided herein collect multidimensional features of the behavior habits of users who have provided gender information as samples, and construct a sample set of the behavior habits of those users. When the number of features exceeds a preset threshold, the sample set is classified according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for predicting user gender, the output of which includes "male" or "female". Multidimensional features of the behavior habits of a user who has not provided gender information are collected at a prediction time as a prediction sample, and the gender of that user is predicted according to the prediction sample and the decision tree model.

Description

User gender prediction method, apparatus, medium and electronic equipment
Technical field
This application relates to the field of electronic device terminals, and in particular to a user gender prediction method, apparatus, medium and electronic device.
Background art
User profiling has become a very popular research direction in recent years, for example on smartphones. A method that could accurately infer a user's gender from the user's usage habits would be of great value for deeply optimizing electronic devices in many respects. Current electronic device systems let users register and bind the device to a user account, but not every user is willing to provide gender information, so profiling cannot be completed for most users. It is therefore necessary to provide a user gender prediction method, apparatus, medium and electronic device for users who have not provided gender information.
Summary of the invention
Embodiments of the present application provide a user gender prediction method, apparatus, storage medium and electronic device for predicting the gender of users who have not provided gender information.
An embodiment of the present application provides a user gender prediction method, comprising:
collecting multidimensional features of the behavior habits of users who have provided gender information as samples, and constructing a sample set of the behavior habits of the users who have provided gender information;
when the number of features exceeds a preset threshold, classifying the sample set according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for user gender prediction;
collecting, at a prediction time, multidimensional features of the behavior habits of a user who has not provided gender information as a prediction sample;
predicting the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
An embodiment of the present application also provides a user gender prediction apparatus, the apparatus comprising:
a first acquisition unit, configured to collect multidimensional features of the behavior habits of users who have provided gender information as samples, and to construct a sample set of the behavior habits of the users who have provided gender information;
a classification unit, configured to, when the number of features exceeds a preset threshold, classify the sample set according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for user gender prediction, the output of the decision tree model including "male" or "female";
a second acquisition unit, configured to collect, at a prediction time, multidimensional features of the behavior habits of a user who has not provided gender information as a prediction sample;
a prediction unit, configured to predict the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
The storage medium provided by the embodiments of the present application stores a computer program which, when run on a computer, causes the computer to execute the user gender prediction method provided by any embodiment of the present application.
The electronic device provided by the embodiments of the present application includes a processor and a memory storing a computer program, wherein the processor, by calling the computer program, executes the user gender prediction method provided by any embodiment of the present application.
The user gender prediction method, apparatus, storage medium and electronic device provided herein collect multidimensional features of the behavior habits of users who have provided gender information as samples, and construct a sample set of the behavior habits of those users. When the number of features exceeds a preset threshold, the sample set is classified according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for predicting user gender, the output of which includes "male" or "female". Multidimensional features of the behavior habits of a user who has not provided gender information are collected at a prediction time as a prediction sample, and the gender of that user is predicted according to the prediction sample and the decision tree model.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application scenario of the user gender prediction method provided by the embodiments of the present application.
Fig. 2 is a flow diagram of the user gender prediction method provided by the embodiments of the present application.
Fig. 3 is a schematic diagram of a decision tree provided by the embodiments of the present application.
Fig. 4 is a schematic diagram of another decision tree provided by the embodiments of the present application.
Fig. 5 is a schematic diagram of yet another decision tree provided by the embodiments of the present application.
Fig. 6 is another flow diagram of the user gender prediction method provided by the embodiments of the present application.
Fig. 7 is a structural schematic diagram of the user gender prediction apparatus provided by the embodiments of the present application.
Fig. 8 is another structural schematic diagram of the user gender prediction apparatus provided by the embodiments of the present application.
Fig. 9 is a structural schematic diagram of the electronic device provided by the embodiments of the present application.
Fig. 10 is another structural schematic diagram of the electronic device provided by the embodiments of the present application.
Detailed description of the embodiments
Please refer to the drawings, in which identical reference symbols denote identical components. The principles of the present application are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of the present application and should not be regarded as limiting other specific embodiments not detailed herein.
In the following description, specific embodiments of the present application are described with reference to steps and symbols performed by one or more computers, unless otherwise stated. These steps and operations are therefore referred to several times as being computer-executed; computer execution as used herein includes operations by a computer processing unit on electronic signals that represent data in a structured form. Such an operation transforms the data or maintains it at a location in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, describing the principles of the present application in these terms is not meant as a limitation; those skilled in the art will appreciate that the various steps and operations described below may also be implemented in hardware.
The term "module" as used herein may be regarded as a software object executed on the computing system. The different components, modules, engines and services described herein may be regarded as objects implemented on the computing system. The apparatus and method described herein may be implemented in software, and may of course also be implemented in hardware, both of which fall within the protection scope of the present application.
The terms "first", "second", "third" and the like in the present application are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or modules is not limited to the listed steps or modules; some embodiments further include steps or modules that are not listed, or further include other steps or modules inherent to the process, method, product or device.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The following disclosure provides many different embodiments or examples for implementing different structures of the present application. To simplify the disclosure, the components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the present application. In addition, the present application may repeat reference numerals and/or reference letters in different examples; this repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. The present application also provides examples of various specific processes and materials, but those of ordinary skill in the art will recognize the applicability of other processes and/or the use of other materials.
Please refer to the figures in the drawings, in which identical reference symbols denote identical components. The principles of the present application are illustrated as implemented in a suitable computing environment. The following description is based on the exemplified specific embodiments of the present application and should not be regarded as limiting other specific embodiments not detailed herein.
Describing the principles of the present application in the terms above is not meant as a limitation; those skilled in the art will appreciate that the various steps and operations described below may also be implemented in hardware. The principles of the present application operate with many other general-purpose or special-purpose computing or communication environments or configurations.
An embodiment of the present application provides a user gender prediction method. The method may be executed by the user gender prediction apparatus provided by the embodiments of the present application, or by an electronic device in which that apparatus is integrated, where the apparatus may be implemented in hardware or software. The electronic device may be a smartphone, tablet computer, palmtop computer, laptop, desktop computer or similar device.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of the user gender prediction method provided by the embodiments of the present application. Taking the case where the user gender prediction apparatus is integrated in an electronic device as an example, the electronic device can collect multidimensional features of the behavior habits of users who have provided gender information as samples and construct a sample set of the behavior habits of those users; classify the sample set according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for predicting user gender; collect, at a prediction time, the multidimensional features of the behavior habits of a user who has not provided gender information to obtain a prediction sample; and predict the gender of that user according to the prediction sample and the decision tree model.
Specifically, as shown in Fig. 1, take judging the gender of a user who has not provided gender information as an example. Within a historical time period, multidimensional features of the behavior habits of users who have provided gender information (for example, how long a user reads sports news, how many times a user uses beauty-camera software, and so on) are collected as samples, and a sample set of the behavior habits of those users is constructed. The sample set is classified according to the information gain ratio of each feature (for example, the time spent reading sports news or the number of uses of beauty-camera software) for sample classification, so as to construct a decision tree model for user gender prediction. At a prediction time (for example, t), the multidimensional features of a user who has not provided gender information (for example, that user's time spent reading sports news and number of uses of beauty-camera software at time t) are collected as a prediction sample. The gender of the user who has not provided gender information is then predicted according to the prediction sample and the decision tree model.
Referring to Fig. 2, Fig. 2 is a flow diagram of the user gender prediction method provided by the embodiments of the present application. The specific flow of the method may be as follows:
201. Collect multidimensional features of the behavior habits of users who have provided gender information as samples, and construct a sample set of the behavior habits of the users who have provided gender information.
The multidimensional feature of the behavior habits of a user who has provided gender information has a certain number of dimensions, and the parameter in each dimension represents one kind of feature information about that user's behavior habits; that is, the multidimensional feature information is composed of multiple features.
The sample set of the behavior habits of users who have provided gender information may include multiple samples, each of which includes the multidimensional features of the behavior habits of such a user. The sample set may include multiple samples collected at a preset frequency within a historical time period. The historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that the multidimensional feature data collected in one acquisition constitutes one sample, and the samples from multiple acquisitions constitute the sample set.
After the sample set is formed, each sample in it can be labeled to obtain a sample label. Because this embodiment aims to predict user gender, the sample labels are "male" and "female"; that is, the sample classes include "male" and "female". Labeling may be based on the behavior habits of the user who has provided gender information. For example, if a user has browsed male-oriented goods (such as men's clothing) 50 times in a shopping application, the sample is labeled "male"; if a user has read female-oriented novels for more than 20 hours, the sample is labeled "female". Specifically, the value "1" may represent "male" and the value "0" may represent "female", or vice versa.
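The sample and label layout described above can be sketched as follows. This is only a minimal illustration: the `Sample` record, the `build_sample_set` helper and the feature names (`sports_news_minutes`, `beauty_app_uses`) are hypothetical and do not come from the application; only the 1/0 label convention is taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

# Label convention from the text: 1 stands for "male", 0 for "female".
MALE, FEMALE = 1, 0


@dataclass
class Sample:
    """One acquisition of behavior-habit features plus its gender label."""
    features: Dict[str, float]  # feature name -> numeric value (names are hypothetical)
    label: int                  # MALE or FEMALE


def build_sample_set(records: List[Dict[str, float]], labels: List[int]) -> List[Sample]:
    """Pair each collected feature record with its label to form the sample set."""
    if len(records) != len(labels):
        raise ValueError("every record needs exactly one label")
    return [Sample(features=dict(r), label=l) for r, l in zip(records, labels)]


sample_set = build_sample_set(
    [{"sports_news_minutes": 42.0, "beauty_app_uses": 0.0},
     {"sports_news_minutes": 3.0, "beauty_app_uses": 12.0}],
    [MALE, FEMALE],
)
```

Each element of `sample_set` corresponds to one acquisition; repeated acquisitions at the preset frequency would simply append more records.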
202. When the number of features exceeds a preset threshold, classify the sample set according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for user gender prediction.
In one embodiment, the preset threshold may be 10000; that is, when the number of features exceeds 10000, the sample set is classified according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for user gender prediction.
In one embodiment, to facilitate sample classification, feature information in the multidimensional features that is not directly expressed as a number can be quantized to a specific value. For example, for the feature "whether the user has turned on the front camera", the value 1 may indicate on and the value 0 may indicate off (or vice versa); for the feature "whether the user has applied beautification to a picture", the value 1 may indicate that beautification was applied and the value 0 that it was not (or vice versa).
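The quantization of yes/no features into 1 and 0 might look like the sketch below. The function name `quantize_features` and the feature keys are assumptions made for illustration; the text only fixes the idea that non-numeric features are mapped to numbers.

```python
from typing import Dict, Union

RawValue = Union[bool, int, float]


def quantize_features(raw: Dict[str, RawValue]) -> Dict[str, float]:
    """Map yes/no features to 1/0 and pass numeric features through as floats."""
    encoded = {}
    for name, value in raw.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            encoded[name] = 1.0 if value else 0.0
        else:
            encoded[name] = float(value)
    return encoded


# Hypothetical feature names, used only for this example.
encoded = quantize_features(
    {"front_camera_on": True, "beautified_picture": False, "beauty_app_uses": 12}
)
```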
The embodiments of the present application can classify the sample set based on the information gain ratio of each feature for sample classification, so as to construct a decision tree model for user gender prediction. For example, the decision tree model can be constructed with the C4.5 algorithm.
A decision tree is a tree built for making decisions. In machine learning, a decision tree is a prediction model that represents a mapping between object attributes and object values: each node represents an object, each branch represents a possible attribute value, and each leaf node corresponds to the value of the object represented by the path from the root node to that leaf node. A decision tree has a single output; if multiple outputs are needed, independent decision trees can be built to handle each output.
C4.5 is a decision tree algorithm used for classification problems in machine learning and data mining, and is an important improvement on ID3. Its goal is supervised learning: given a data set in which each tuple is described by a set of attribute values and belongs to exactly one of several mutually exclusive classes, C4.5 learns a mapping from attribute values to classes, and this mapping can then be used to classify new entities whose class is unknown.
ID3 (Iterative Dichotomiser 3) is based on the principle of Occam's razor, that is, doing more with as little as possible. In information theory, the smaller the expected information, the larger the information gain and thus the higher the purity. The core idea of ID3 is to use information gain to measure attribute selection, splitting on the attribute with the largest information gain after the split. The algorithm traverses the space of possible decisions with a top-down greedy search.
In the embodiments of the present application, the information gain ratio can be defined as the ratio of a feature's information gain for sample classification to the feature's split information for sample classification. How the information gain ratio is obtained is described below.
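For reference, the definition above can be written out using the standard C4.5 quantities. The symbols below are assumptions chosen to match the notation g_R(D, A) used later in the description: D is the sample set, C_k is the subset of D with class k, and D_i is the subset of D in which feature A takes its i-th value.

```latex
\begin{aligned}
H(D) &= -\sum_{k=1}^{K} \frac{|C_k|}{|D|}\,\log_2\frac{|C_k|}{|D|},\\
H(D \mid A) &= \sum_{i=1}^{n} \frac{|D_i|}{|D|}\,H(D_i),\\
g(D, A) &= H(D) - H(D \mid A),\\
H_A(D) &= -\sum_{i=1}^{n} \frac{|D_i|}{|D|}\,\log_2\frac{|D_i|}{|D|},\\
g_R(D, A) &= \frac{g(D, A)}{H_A(D)}.
\end{aligned}
```

Here g(D, A) is the information gain, H_A(D) is the split information, and g_R(D, A) is the information gain ratio.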
Information gain is evaluated feature by feature: for a feature t, it compares the amount of information in the system with the feature and without it, and the difference between the two is the amount of information the feature brings to the system, that is, the information gain.
Split information measures the breadth and uniformity with which a feature divides data (such as a sample set); it may be taken as the entropy of the feature.
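A minimal sketch of these three quantities, assuming discrete feature values and the usual base-2 Shannon entropy (the function names are illustrative, not taken from the application):

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (base 2) of the empirical distribution of `values`."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def information_gain(feature: List[Hashable], labels: List[Hashable]) -> float:
    """Entropy of the labels minus their entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature: List[Hashable], labels: List[Hashable]) -> float:
    """Information gain divided by the split information (entropy of the feature)."""
    split_info = entropy(feature)
    if split_info == 0.0:  # the feature takes a single value and cannot split anything
        return 0.0
    return information_gain(feature, labels) / split_info
```

A feature that separates the two classes perfectly, such as `gain_ratio([1, 1, 0, 0], [1, 1, 0, 0])`, scores 1.0, while a feature that is independent of the label scores 0.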
The process of classifying the sample set based on the information gain ratio is described in detail below. For example, the classification process may include the following steps:
generating the root node of the decision tree, and using the sample set as the node information of the root node;
determining the sample set of the root node as the current target sample set to be classified;
obtaining the information gain ratio of each feature in the target sample set for classifying the target sample set;
selecting the current division feature from the features according to the information gain ratios;
dividing the sample set according to the division feature to obtain several sub-sample sets;
removing the division feature from the samples in each sub-sample set to obtain the sub-sample sets after removal;
generating child nodes of the current node, and using each sub-sample set after removal as the node information of a child node;
judging whether a child node meets a preset classification termination condition;
if not, updating the target sample set to the sub-sample set after removal, and returning to the step of obtaining the information gain ratio of each feature in the target sample set for classifying the target sample set;
if so, taking the child node as a leaf node and setting the output of the leaf node according to the classes of the samples in the sub-sample set after removal, the sample classes including "male" and "female".
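The steps above can be sketched as a short recursive routine. This is a simplified illustration under stated assumptions, not the applicant's implementation: features are discrete, the tree is stored as nested dictionaries, `epsilon` stands for the gain ratio threshold, and running out of features is treated as an additional stopping case that the text does not spell out.

```python
import math
from collections import Counter


def entropy(values):
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(samples, labels, feature):
    column = [s[feature] for s in samples]
    split_info = entropy(column)
    if split_info == 0.0:
        return 0.0
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return (entropy(labels) - conditional) / split_info


def build_tree(samples, labels, epsilon=0.0):
    """samples: list of dicts (feature -> value); labels: "male"/"female" per sample."""
    if not samples:
        raise ValueError("cannot build a tree from an empty sample set")
    majority = Counter(labels).most_common(1)[0][0]
    features = list(samples[0].keys())
    # Termination: only one class remains, or (assumption) no feature is left.
    if len(set(labels)) == 1 or not features:
        return {"leaf": majority}
    ratios = {f: gain_ratio(samples, labels, f) for f in features}
    best = max(ratios, key=ratios.get)
    if ratios[best] <= epsilon:  # largest gain ratio does not exceed the threshold
        return {"leaf": majority}
    # Divide on the chosen feature and remove it from every sub-sample set.
    partitions = {}
    for sample, label in zip(samples, labels):
        rest = {k: v for k, v in sample.items() if k != best}
        sub_samples, sub_labels = partitions.setdefault(sample[best], ([], []))
        sub_samples.append(rest)
        sub_labels.append(label)
    children = {value: build_tree(sub, sub_lab, epsilon)
                for value, (sub, sub_lab) in partitions.items()}
    return {"feature": best, "children": children, "default": majority}


def predict(tree, sample):
    """Walk from the root to a leaf; fall back to the node majority on unseen values."""
    while "leaf" not in tree:
        child = tree["children"].get(sample.get(tree["feature"]))
        if child is None:
            return tree["default"]
        tree = child
    return tree["leaf"]


# Hypothetical toy data: two binary features, four labeled samples.
samples = [{"beauty_app_used": 1, "reads_sports_news": 0},
           {"beauty_app_used": 1, "reads_sports_news": 1},
           {"beauty_app_used": 0, "reads_sports_news": 1},
           {"beauty_app_used": 0, "reads_sports_news": 0}]
labels = ["female", "female", "male", "male"]
tree = build_tree(samples, labels)
```

In this toy data `beauty_app_used` separates the classes perfectly, so it becomes the root division feature and both children are leaves.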
The division feature is the feature selected from the features according to each feature's information gain ratio for classifying the sample set, and is used to classify the sample set. There are many ways to select the division feature according to the information gain ratio; for example, to improve the accuracy of sample classification, the feature with the largest information gain ratio can be selected as the division feature.
The sample classes may include the two classes "male" and "female", and the class of each sample can be indicated by its sample label; for example, when the sample label is a number, the value "1" represents "male" and the value "0" represents "female", or vice versa.
When a child node meets the preset classification termination condition, the child node can be taken as a leaf node, classification of the child node's sample set stops, and the output of the leaf node can be set based on the classes of the samples in the sub-sample set after removal. There are many ways to set the output of a leaf node based on the sample classes. For example, the class with the most samples in the sample set after removal can be used as the output of the leaf node.
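The majority rule for a leaf's output can be sketched in a few lines. The helper name `leaf_output` is hypothetical, and resolving ties in favor of the class encountered first is an assumption (it is simply how `Counter.most_common` orders equal counts).

```python
from collections import Counter
from typing import List


def leaf_output(labels: List[str]) -> str:
    """Return the class with the most samples in the node's sub-sample set."""
    if not labels:
        raise ValueError("a leaf needs at least one sample")
    return Counter(labels).most_common(1)[0][0]
```

For example, `leaf_output(["male", "female", "male"])` returns `"male"`.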
The preset classification termination condition can be set according to actual needs. When a child node meets the preset classification termination condition, the current child node is taken as a leaf node and classification of the sample set corresponding to the child node stops; when a child node does not meet the condition, classification of the sub-sample set corresponding to the child node continues. For example, the preset classification termination condition may be that the number of sample classes in the child node's sub-sample set after removal equals a preset quantity; that is, the step "judging whether a child node meets a preset classification termination condition" may include:
judging whether the number of sample classes in the sub-sample set after removal corresponding to the child node equals the preset quantity;
if so, determining that the child node meets the preset classification termination condition;
if not, determining that the child node does not meet the preset classification termination condition.
For example, the preset classification termination condition may be that the number of sample classes in the sub-sample set after removal corresponding to the child node is 1, that is, the sample set of the child node contains samples of only one class. In this case, if the child node meets the preset classification termination condition, the class of the samples in the sub-sample set is used as the output of the leaf node. For instance, if the sub-sample set after removal contains only samples of class "male", "male" can be used as the output of the leaf node.
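The termination check described above amounts to counting the distinct classes left in a child node. A minimal sketch, with `meets_termination_condition` and `preset_class_count` as hypothetical names:

```python
from typing import Hashable, Sequence


def meets_termination_condition(labels: Sequence[Hashable], preset_class_count: int = 1) -> bool:
    """True when the number of distinct classes in the sub-sample set equals the preset quantity."""
    return len(set(labels)) == preset_class_count
```

With the default of 1, a child node whose samples are all "male" terminates, while a node that still mixes "male" and "female" samples is classified further.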
In one embodiment, to improve the accuracy of the decision tree model's determinations, a gain ratio threshold can also be set; the feature corresponding to the largest information gain ratio is selected as the division feature only when that gain ratio is greater than the threshold. That is, the step "selecting the current division feature from the features according to the information gain ratios" may include:
selecting the largest information gain ratio from the information gain ratios as the target information gain ratio;
judging whether the target information gain ratio is greater than the preset threshold;
if so, selecting the feature corresponding to the target information gain ratio as the current division feature.
In one embodiment, when the target information gain ratio is not greater than the preset threshold, the current node can be taken as a leaf node, and the sample class with the most samples is selected as the output of the leaf node, where the sample classes include "male" and "female".
The preset threshold can be set according to actual needs, for example 0.9 or 0.8.
For example, when the information gain ratio of feature 1 for sample classification, 0.9, is the largest information gain ratio and the preset gain ratio threshold is 0.8, the largest information gain ratio is greater than the preset threshold, so feature 1 can be used as the division feature.
As another example, when the preset threshold is 1, the largest information gain ratio is less than the preset threshold, so the current node can be taken as a leaf node. If analysis of the sample set shows that the samples of class "male" are the most numerous, outnumbering the samples of class "female", then "male" can be used as the output of the leaf node.
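The threshold rule and the two numeric examples above can be reproduced with a small helper. The name `choose_division_feature` and the dictionary layout are assumptions; returning `None` stands for "make the current node a leaf".

```python
from typing import Dict, Optional


def choose_division_feature(gain_ratios: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the feature with the largest gain ratio if it exceeds the threshold, else None."""
    if not gain_ratios:
        return None
    best = max(gain_ratios, key=gain_ratios.get)
    return best if gain_ratios[best] > threshold else None


ratios = {"feature_1": 0.9, "feature_2": 0.4}
print(choose_division_feature(ratios, 0.8))  # feature_1
print(choose_division_feature(ratios, 1.0))  # None
```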
There are many ways to divide the samples according to the division feature; for example, the sample set may be divided based on the feature values of the division feature. That is, the step "dividing the sample set according to the division feature" may include:
Obtaining the feature values of the division feature in the target sample set;
Dividing the target sample set according to the feature values.
For example, samples in the sample set that have the same value of the division feature may be placed in the same subsample set. If the feature values of the division feature include 0, 1 and 2, the samples whose value is 0 form one class, the samples whose value is 1 form a second class, and the samples whose value is 2 form a third class.
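As a minimal sketch of this division step, assuming each sample is represented as a Python dictionary from feature name to a discrete feature value (a representation chosen for illustration, not taken from the application):

```python
from collections import defaultdict


def partition_by_feature(samples, feature):
    """Group samples into subsample sets by their value of `feature`."""
    subsets = defaultdict(list)
    for sample in samples:
        subsets[sample[feature]].append(sample)
    return dict(subsets)


# Hypothetical samples whose division feature "A" takes the values 0, 1, 2.
samples = [{"A": 0}, {"A": 1}, {"A": 0}, {"A": 2}]
subsets = partition_by_feature(samples, "A")
print(sorted(subsets))  # the distinct feature values, one subsample set each
```

The number of subsample sets equals the number of distinct values the division feature takes in the sample set.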
For example, consider a sample set D {sample 1, sample 2, ..., sample i, ..., sample n}, where each sample includes several features A.
First, all samples in the sample set are initialized; then a root node d is generated, and the sample set D is taken as the node information of the root node d, as shown in Fig. 3.
The information gain ratio of each feature, such as feature A, for classifying the sample set is calculated: $g_R(D, A)_1, g_R(D, A)_2, \ldots, g_R(D, A)_m$; the maximum information gain ratio $g_R(D, A)_{\max}$ is then chosen.
When the maximum information gain ratio $g_R(D, A)_{\max}$ is less than the preset threshold $\varepsilon$, the current node is taken as a leaf node, and the sample category with the largest number of samples is chosen as the output of the leaf node.
When the maximum information gain ratio $g_R(D, A)_{\max}$ is greater than the preset threshold $\varepsilon$, the feature corresponding to $g_R(D, A)_{\max}$ may be chosen as the division feature $A_g$, and the sample set D {sample 1, sample 2, ..., sample i, ..., sample n} is divided according to $A_g$. Specifically, for each value $a_i$ of $A_g$, D is divided according to $A_g = a_i$ into several non-empty sets $D_i$, which serve as the child nodes of the current node. For example, the sample set is divided into two subsample sets D1 {sample 1, sample 2, ..., sample k} and D2 {sample k+1, ..., sample n}.
The division feature $A_g$ is removed from the subsample sets D1 and D2, leaving the feature set $A - A_g$. With reference to Fig. 3, child nodes d1 and d2 of the root node d are generated, with subsample set D1 as the node information of child node d1 and subsample set D2 as the node information of child node d2.
Then, for each child node, with $A - A_g$ as the features and the $D_i$ of the child node as the data set, the above steps are called recursively to build a subtree, until the preset classification termination condition is met.
Taking child node d1 as an example, it is judged whether the child node meets the preset classification termination condition; if so, the current child node d1 is taken as a leaf node, and the output of the leaf node is set according to the category of the samples in the subsample set corresponding to child node d1.
When a child node does not meet the preset classification termination condition, the subsample set corresponding to the child node continues to be classified in the above manner based on the information gain ratio. Taking child node d2 as an example, the information gain ratio $g_R(D, A)$ of each feature in the sample set D2 relative to the sample classification may be calculated and the maximum information gain ratio $g_R(D, A)_{\max}$ chosen. When $g_R(D, A)_{\max}$ is greater than the preset threshold $\varepsilon$, the feature corresponding to it may be chosen as the division feature $A_g$ (for example feature $A_{i+1}$), and D2 is divided into several subsample sets based on $A_g$, for example into subsample sets D21, D22 and D23. The division feature $A_g$ is then removed from D21, D22 and D23, child nodes d21, d22 and d23 of the current node d2 are generated, and the sample sets D21, D22 and D23 with $A_g$ removed are taken as the node information of child nodes d21, d22 and d23 respectively.
By analogy, the decision tree shown in Fig. 4 can be constructed using the above classification based on the information gain ratio; the output of each leaf node of the decision tree is "male" or "female".
In one embodiment, to improve the speed and efficiency of prediction with the decision tree, the corresponding feature value of the division feature may also be marked on the paths between nodes. For example, during the above classification based on the information gain ratio, the feature value of the division feature corresponding to each child node may be marked on the path between the current node and that child node.
For example, when the feature values of the division feature $A_g$ include 0 and 1, the path between d2 and d may be marked 1, the path between a1 and a may be marked 0, and so on. After each division, the corresponding division feature value, such as 0 or 1, may be marked on the path between the current node and its child node, yielding the decision tree shown in Fig. 5.
The manner of obtaining the information gain ratio is described in detail below:
In the embodiments of the present application, the information gain ratio may be defined as the ratio of the information gain of a feature for sample classification to the division information of that feature for sample classification.
Information gain is evaluated feature by feature: for a feature t, the amount of information in the system with the feature and without the feature is determined, and the difference between the two is the amount of information the feature brings to the system, i.e. the information gain. The information gain represents the degree to which a feature reduces the uncertainty of the class information (male or female).
The division information measures the breadth and uniformity with which a feature divides data (such as a sample set); the division information may be the entropy of the feature.
The step "obtaining the information gain ratio of a feature in the target sample set for classifying the target sample set" may include:
Obtaining the information gain of the feature for classifying the target sample set;
Obtaining the division information of the feature for classifying the target sample set;
Obtaining, according to the information gain and the division information, the information gain ratio of the feature for classifying the target sample set.
In one embodiment, the information gain of a feature for classifying the sample set may be obtained based on the empirical entropy of the sample classification and the conditional entropy of the feature for the sample set classification results. That is, the step "obtaining the information gain of the feature for classifying the target sample set" may include:
Obtaining the empirical entropy of the target sample classification;
Obtaining the conditional entropy of the feature for the target sample set classification results;
Obtaining, according to the conditional entropy and the empirical entropy, the information gain of the feature for classifying the target sample set. Here, a first probability that positive samples occur in the sample set and a second probability that negative samples occur in the sample set may be obtained, a positive sample being a sample whose category is "male" and a negative sample being a sample whose category is "female"; the empirical entropy of the samples is then obtained from the first probability and the second probability.
In one embodiment, the information gain of a feature for classifying the target sample set may be the difference between the empirical entropy and the conditional entropy. For example, for a sample set D {sample 1, sample 2, ..., sample i, ..., sample n}, each sample includes multi-dimensional features, such as feature A. The information gain ratio of feature A for sample classification may be obtained by the following formula:
$$g_R(D, A) = \frac{g(D, A)}{H_A(D)}$$
where $g_R(D, A)$ is the information gain ratio of feature A for classifying sample set D, $g(D, A)$ is the information gain of feature A for sample classification, and $H_A(D)$ is the division information of feature A, i.e. the entropy of feature A.
Here, $g(D, A)$ may be obtained by the following formula:
$$g(D, A) = H(D) - H(D \mid A)$$
where $H(D)$ is the empirical entropy of the classification of sample set D, and $H(D \mid A)$ is the conditional entropy of feature A for the classification of sample set D.
If the number of samples whose category is "male" is j, the number of samples whose category is "female" is n − j. The probability of positive samples occurring in sample set D is then $p_1 = j/n$, and the probability of negative samples occurring in sample set D is $p_2 = (n - j)/n$. The empirical entropy $H(D)$ of the sample classification is calculated from these two probabilities.
In a decision tree classification problem, the information gain is the difference between the information of the decision tree before and after a division made by attribute selection. In this embodiment, the empirical entropy $H(D)$ of the sample classification is:
$$H(D) = -\left(p_1 \log p_1 + p_2 \log p_2\right)$$
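The empirical entropy can be illustrated with a short sketch. The text does not state the base of the logarithm; base 2 is assumed here, and the function name `empirical_entropy` is hypothetical.

```python
import math
from collections import Counter


def empirical_entropy(labels):
    """Entropy of the class distribution in `labels`.

    Assumption: logarithm base 2 (the source does not specify a base).
    """
    n = len(labels)
    counts = Counter(labels)
    # -sum(p * log2(p)) over the classes that actually occur, so no log(0).
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# j = 3 "male" samples out of n = 4, so p1 = 3/4 and p2 = 1/4.
print(empirical_entropy(["male", "male", "male", "female"]))
```

A sample set with equal numbers of "male" and "female" samples gives the largest value under this assumption, and a set containing a single category gives zero.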
In one embodiment, the sample set may be divided into several subsample sets according to feature A; then the information entropy of the classification of each subsample set, and the probability with which each feature value of feature A occurs in the sample set, are obtained. From these entropies and probabilities, the information entropy after division, i.e. the conditional entropy of feature A for the sample set classification results, can be obtained.
For example, for a sample feature A, the conditional entropy of feature A for the classification results of sample set D may be calculated by the following formula:
$$H(D \mid A) = \sum_{i=1}^{n} p_i \, H(D \mid A = A_i)$$
where n is the number of distinct values of feature A, i.e. the number of feature value types; $p_i$ is the probability that a sample whose value of A is the i-th value occurs in sample set D; $A_i$ is the i-th value of A; and $H(D \mid A = A_i)$ is the empirical entropy of the classification of subsample set $D_i$, in which every sample has the i-th value of feature A.
For example, taking the number of distinct values of feature A to be 3, i.e. A1, A2 and A3, feature A divides the sample set D {sample 1, sample 2, ..., sample i, ..., sample n} into three subsample sets: D1 {sample 1, sample 2, ..., sample d} with feature value A1, D2 {sample d+1, ..., sample e} with feature value A2, and D3 {sample e+1, ..., sample n} with feature value A3, where d and e are positive integers less than n.
In this case, the conditional entropy of feature A for the classification results of sample set D is:
$$H(D \mid A) = p_1 H(D \mid A = A_1) + p_2 H(D \mid A = A_2) + p_3 H(D \mid A = A_3)$$
where $p_1 = |D_1|/|D|$, $p_2 = |D_2|/|D|$ and $p_3 = |D_3|/|D|$;
$H(D \mid A = A_1)$ is the information entropy of the classification of subsample set D1, i.e. its empirical entropy, and can be calculated with the above formula for the empirical entropy.
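The weighted sum over subsample sets can be sketched as below. The helper names are hypothetical, base-2 logarithms are an assumption, and feature values and labels are passed as two parallel lists purely for illustration.

```python
import math
from collections import Counter, defaultdict


def empirical_entropy(labels):
    """Class entropy of `labels`, assuming base-2 logarithms."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """H(D | A): entropy of each subsample set, weighted by |Di| / |D|."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)  # Di = labels of samples with A = Ai
    n = len(labels)
    return sum(len(g) / n * empirical_entropy(g) for g in groups.values())


# Feature A takes three values; only the A3 subsample set is mixed.
values = ["A1", "A1", "A2", "A2", "A3", "A3"]
labels = ["male", "male", "female", "female", "male", "female"]
print(conditional_entropy(values, labels))
```

In this example the A1 and A2 subsample sets each contain a single category and contribute nothing, so only the A3 subsample set, weighted by 2/6, contributes to the result.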
After the empirical entropy $H(D)$ of the sample classification and the conditional entropy $H(D \mid A)$ of feature A for the classification results of sample set D have been obtained, the information gain of feature A for classifying sample set D can be calculated, for example by the following formula:
$$g(D, A) = H(D) - H(D \mid A)$$
That is, the information gain of feature A for classifying sample set D is the difference between the empirical entropy $H(D)$ and the conditional entropy $H(D \mid A)$ of feature A for the classification results of sample set D.
The division information of a feature for classifying the sample set is the entropy of the feature, which can be obtained from the distribution probability of the feature's values over the samples in the target sample set. For example, $H_A(D)$ may be obtained by the following formula:
$$H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log \frac{|D_i|}{|D|}$$
where i = 1, 2, ..., n indexes the value categories of feature A, n being the number of kinds of values,
and $D_i$ is the subset of sample set D in which feature A takes its i-th value.
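Putting the three quantities together, the information gain ratio can be sketched as follows. The function names, the base-2 logarithm, and the choice to return 0.0 when the division information is zero (a feature with a single value) are all assumptions for this example; the application does not say how that case is handled.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Entropy of the empirical distribution of `values` (base 2 assumed)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(feature_values, labels):
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain_ratio(feature_values, labels):
    """g_R(D, A) = (H(D) - H(D | A)) / H_A(D)."""
    gain = entropy(labels) - conditional_entropy(feature_values, labels)
    split_info = entropy(feature_values)  # H_A(D): entropy of the feature itself
    if split_info == 0:
        return 0.0  # assumption: a single-valued feature cannot divide the set
    return gain / split_info


labels = ["male", "male", "female", "female"]
print(information_gain_ratio([0, 0, 1, 1], labels))  # feature separates the classes
print(information_gain_ratio([0, 1, 0, 1], labels))  # feature carries no class information
```

Dividing by the division information penalises features that merely have many distinct values, which is the usual motivation for using the gain ratio rather than the raw information gain.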
203, collecting, according to a prediction time, the multi-dimensional features of the behavioral habits of a user who has not provided gender information, as a prediction sample.
The prediction time may be set as required, and may for example be the current time.
For example, the multi-dimensional features of the behavioral habits of the user who has not provided gender information may be collected at the prediction time point as the prediction sample.
In the embodiments of the present application, the multi-dimensional features collected in steps 201 and 203 are the same features, for example: the duration for which the user reads male-oriented novels, the duration for which the user reads female-oriented novels, and so on.
204, predicting, according to the prediction sample and the decision tree model, the gender of the user who has not provided gender information.
Specifically, a corresponding output result is obtained according to the prediction sample and the decision tree model, and the gender of the user who has not provided gender information is determined according to the output result. The output result is "male" or "female".
For example, the corresponding leaf node may be determined according to the features of the prediction sample and the decision tree model, and the output of that leaf node is taken as the prediction output result. For instance, the features of the prediction sample are used to follow the branch conditions of the decision tree (i.e. the feature values of the division features) until the current leaf node is reached, and the output of that leaf node is taken as the prediction result. Since the output of a leaf node is "male" or "female", the gender of the user who has not provided gender information can be determined based on the decision tree.
For example, after the multi-dimensional features of the behavioral habits of the user who has not provided gender information have been collected, the corresponding leaf node found in the decision tree shown in Fig. 5 by following the branch conditions of the decision tree is an1, and the output of leaf node an1 is "male"; the gender of the user who has not provided gender information is therefore determined to be male.
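The lookup described above amounts to walking from the root to a leaf along the paths marked with division feature values. A minimal sketch follows, assuming a nested-dictionary tree in which an internal node holds a `"feature"` and its `"children"` keyed by feature value, and a leaf holds a `"label"`; this representation, the feature names and the discretised 0/1 values are hypothetical.

```python
def predict(tree, sample):
    """Follow branch conditions from the root until a leaf is reached."""
    node = tree
    while "label" not in node:
        value = sample[node["feature"]]   # feature value of the prediction sample
        node = node["children"][value]    # path marked with that value
    return node["label"]


# Hypothetical two-level tree with discretised (0/1) feature values.
tree = {
    "feature": "reads_sports_news",
    "children": {
        1: {"label": "male"},
        0: {
            "feature": "uses_beauty_app",
            "children": {1: {"label": "female"}, 0: {"label": "male"}},
        },
    },
}
print(predict(tree, {"reads_sports_news": 0, "uses_beauty_app": 1}))
```

A prediction sample whose feature value was never seen during construction would raise a `KeyError` in this sketch; how such a case should be handled is not addressed in the text.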
As can be seen from the above, the embodiments of the present application collect the multi-dimensional features of the behavioral habits of users who have provided gender information as samples, and construct a sample set of the behavioral habits of users who have provided gender information; when the number of features exceeds a preset threshold, the sample set is classified according to the information gain ratio of each feature for sample classification, so as to construct a decision tree model for predicting user gender, the output of which is "male" or "female"; the multi-dimensional features of the behavioral habits of a user who has not provided gender information are collected according to a prediction time as a prediction sample; and the gender of the user who has not provided gender information is predicted according to the prediction sample and the decision tree model.
Further, since each sample of the sample set includes multiple items of feature information reflecting the user's behavioral habits, the embodiments of the present application can make user gender prediction more intelligent.
Further, implementing user gender prediction based on a decision tree prediction model can improve the accuracy of user gender prediction.
The user gender prediction method of the present application is further described below on the basis of the method described in the above embodiments. With reference to Fig. 6, the method may include:
301, collecting the multi-dimensional features of the behavioral habits of users who have provided gender information as samples, and constructing a sample set of the behavioral habits of users who have provided gender information.
The multi-dimensional features of the behavioral habits of a user who has provided gender information have dimensions of a certain length, and the parameter in each dimension characterizes one kind of feature information of the behavioral habits of the user; that is, the multi-dimensional feature information consists of multiple features.
The sample set of the behavioral habits of users who have provided gender information may include multiple samples, each sample including the multi-dimensional features of the behavioral habits of a user who has provided gender information. The sample set may include multiple samples collected at a preset frequency within a historical period. The historical period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that the multi-dimensional feature data collected in one acquisition constitute one sample, and the samples from multiple acquisitions constitute the sample set.
A specific sample may be as shown in Table 1 below, which includes feature information in multiple dimensions. It should be noted that the feature information shown in Table 1 is only an example; in practice, a sample may include more or fewer items of feature information than shown in Table 1, and the specific feature information used may also differ from that shown in Table 1, which is not specifically limited here.
Dimension | Feature information
1 | Number of times the user browses male-oriented goods (e.g. men's clothing) in a shopping application
2 | Duration for which the user browses male-oriented goods (e.g. men's clothing) in a shopping application
3 | Number of times the user browses female-oriented goods (e.g. cosmetics, women's clothing) in a shopping application
4 | Duration for which the user browses female-oriented goods (e.g. cosmetics, women's clothing) in a shopping application
5 | Duration for which the user reads male-oriented novels
6 | Duration for which the user reads female-oriented novels
7 | Duration for which the user reads sports news
8 | Duration for which the user reads horoscope news
9 | Number of times the user takes selfies with the front camera
10 | Number of times the user uses beauty (face-retouching) software
11 | Number of times and duration for which the user plays games of different categories
Table 1
302, labeling the samples in the sample set to obtain a sample label for each sample.
After the sample set has been formed, each sample in the sample set may be labeled to obtain its sample label. Since this embodiment is intended to predict user gender, the sample labels are gender "male" and gender "female"; that is, the sample categories include "male" and "female". Specifically, labeling may be based on the behavioral habits of the users who have provided gender information. For example, if the number of times the user browses male-oriented goods (e.g. men's clothing) in a shopping application is 50, the sample is labeled "male"; as another example, if the duration for which the user reads female-oriented novels exceeds 20 hours, the sample is labeled "female". Specifically, the value "1" may represent "male" and the value "0" may represent "female", or vice versa.
303, generating the root node of the decision tree model, and taking the sample set as the node information of the root node.
For example, with reference to Fig. 3, for a sample set D {sample 1, sample 2, ..., sample i, ..., sample n}, the root node d of the decision tree may first be generated, and the sample set D is taken as the node information of the root node d.
304, determining the sample set to be the current target sample set to be classified.
That is, the sample set of the root node is determined to be the current target sample set to be classified.
305, obtaining the information gain ratio of each feature in the target sample set for classifying the sample set, and determining the maximum information gain ratio.
For example, for sample set D, the information gain ratio of each feature for classifying the sample set may be calculated: $g_R(D, A)_1, g_R(D, A)_2, \ldots, g_R(D, A)_m$; the maximum information gain ratio $g_R(D, A)_{\max}$ is then chosen, for example $g_R(D, A)_i$ is the maximum information gain ratio.
The information gain ratio of a feature for classifying the sample set may be obtained as follows:
Obtaining the empirical entropy of the sample classification; obtaining the conditional entropy of the feature for the sample set classification results; and obtaining, according to the conditional entropy and the empirical entropy, the information gain of the feature for classifying the sample set;
Obtaining the division information of the feature for classifying the sample set, i.e. the entropy of the feature for sample classification;
Obtaining the ratio of the information gain to the entropy, which gives the information gain ratio of the feature for sample classification.
For example, for a sample set D {sample 1, sample 2, ..., sample i, ..., sample n}, each sample includes multi-dimensional features, such as feature A. The information gain ratio of feature A for sample classification may be obtained by the following formula:
$$g_R(D, A) = \frac{g(D, A)}{H_A(D)}$$
where $g(D, A)$ is the information gain of feature A for sample classification, and $H_A(D)$ is the division information of feature A, i.e. the entropy of feature A.
Here, $g(D, A)$ may be obtained by the following formula:
$$g(D, A) = H(D) - H(D \mid A)$$
where $H(D)$ is the empirical entropy of the sample classification, and $H(D \mid A)$ is the conditional entropy of feature A for sample classification.
If the number of samples whose category is "male" is j, the number of samples whose category is "female" is n − j. The probability of positive samples occurring in sample set D is then $p_1 = j/n$, and the probability of negative samples occurring in sample set D is $p_2 = (n - j)/n$. The empirical entropy $H(D)$ of the sample classification is calculated from these two probabilities.
In a decision tree classification problem, the information gain is the difference between the information of the decision tree before and after a division made by attribute selection. In this embodiment, the empirical entropy $H(D)$ of the sample classification is:
$$H(D) = -\left(p_1 \log p_1 + p_2 \log p_2\right)$$
In one embodiment, the sample set may be divided into several subsample sets according to feature A; then the information entropy of the classification of each subsample set, and the probability with which each feature value of feature A occurs in the sample set, are obtained. From these entropies and probabilities, the information entropy after division, i.e. the conditional entropy of feature A for the sample set classification results, can be obtained.
For example, for a sample feature A, the conditional entropy of feature A for the classification results of sample set D may be calculated by the following formula:
$$H(D \mid A) = \sum_{i=1}^{n} p_i \, H(D \mid A = A_i)$$
where n is the number of distinct values of feature A, i.e. the number of feature value types; $p_i$ is the probability that a sample whose value of A is the i-th value occurs in sample set D; $A_i$ is the i-th value of A; and $H(D \mid A = A_i)$ is the empirical entropy of the classification of subsample set $D_i$, in which every sample has the i-th value of feature A.
For example, taking the number of distinct values of feature A to be 3, i.e. A1, A2 and A3, feature A divides the sample set D {sample 1, sample 2, ..., sample i, ..., sample n} into three subsample sets: D1 {sample 1, sample 2, ..., sample d} with feature value A1, D2 {sample d+1, ..., sample e} with feature value A2, and D3 {sample e+1, ..., sample n} with feature value A3, where d and e are positive integers less than n.
In this case, the conditional entropy of feature A for the classification results of sample set D is:
$$H(D \mid A) = p_1 H(D \mid A = A_1) + p_2 H(D \mid A = A_2) + p_3 H(D \mid A = A_3)$$
where $p_1 = |D_1|/|D|$, $p_2 = |D_2|/|D|$ and $p_3 = |D_3|/|D|$;
$H(D \mid A = A_1)$ is the information entropy of the classification of subsample set D1, i.e. its empirical entropy, and can be calculated with the above formula for the empirical entropy.
After the empirical entropy $H(D)$ of the sample classification and the conditional entropy $H(D \mid A)$ of feature A for the classification results of sample set D have been obtained, the information gain of feature A for classifying sample set D can be calculated, for example by the following formula:
$$g(D, A) = H(D) - H(D \mid A)$$
That is, the information gain of feature A for classifying sample set D is the difference between the empirical entropy $H(D)$ and the conditional entropy $H(D \mid A)$ of feature A for the classification results of sample set D.
The division information of a feature for classifying the sample set is the entropy of the feature, which can be obtained from the distribution probability of the feature's values over the samples in the target sample set. For example, $H_A(D)$ may be obtained by the following formula:
$$H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log \frac{|D_i|}{|D|}$$
where i = 1, 2, ..., n indexes the value categories of feature A, n being the number of kinds of values,
and $D_i$ is the subset of sample set D in which feature A takes its i-th value.
306, judging whether the maximum information gain ratio is greater than a preset threshold; if so, executing step 307; if not, executing step 313.
For example, it may be judged whether the maximum information gain ratio $g_R(D, A)_{\max}$ is greater than a preset threshold $\varepsilon$, which may be set according to actual needs.
307, choosing the feature corresponding to the maximum information gain ratio as the division feature, and dividing the sample set according to the feature values of the division feature to obtain several subsample sets.
For example, when the feature corresponding to the maximum information gain ratio $g_R(D, A)_{\max}$ is feature $A_g$, feature $A_g$ may be chosen as the division feature.
Specifically, the sample set may be divided into several subsample sets according to the number of distinct values of the division feature, the number of subsample sets being equal to the number of distinct values. For example, samples in the sample set that have the same value of the division feature may be placed in the same subsample set. If the feature values of the division feature include 0, 1 and 2, the samples whose value is 0 form one class, the samples whose value is 1 form a second class, and the samples whose value is 2 form a third class.
308, removing the division feature from the samples in each subsample set to obtain the post-removal subsample sets.
For example, when the division feature has two values, the sample set D may be divided into D1 {sample 1, sample 2, ..., sample k} and D2 {sample k+1, ..., sample n}. The division feature $A_g$ may then be removed from subsample sets D1 and D2, leaving $A - A_g$.
309, generating the child nodes of the current node, and taking each post-removal subsample set as the node information of the corresponding child node.
Each subsample set corresponds to one child node. For example, with reference to Fig. 3, child nodes d1 and d2 of the root node d are generated, with subsample set D1 as the node information of child node d1 and subsample set D2 as the node information of child node d2.
In one embodiment, the division feature value corresponding to each child node may also be set on the path between the child node and the current node, to facilitate subsequent user gender prediction; see Fig. 5.
310, judging whether the subsample set of a child node meets a preset classification termination condition; if so, executing step 312; if not, executing step 311.
The preset classification termination condition may be set according to actual needs. When a child node meets the preset classification termination condition, the current child node is taken as a leaf node and classification of the sample set corresponding to the child node stops; when a child node does not meet the preset classification termination condition, the subsample set corresponding to the child node continues to be classified. For example, the preset classification termination condition may include: the number of sample categories in the post-removal subsample set of the child node equals a preset number.
For example, the preset classification termination condition may include: the number of sample categories in the post-removal subsample set corresponding to the child node is 1, that is, the sample set of the child node contains samples of only one category.
Taking child node d1 as an example, it is judged whether the child node meets the preset classification termination condition; if so, the current child node d1 is taken as a leaf node, and the output of the leaf node is set according to the category of the samples in the subsample set corresponding to child node d1.
311, updating the target sample set to the subsample set of the child node, and returning to step 305.
When a child node does not meet the preset classification termination condition, the subsample set corresponding to the child node continues to be classified in the above manner based on the information gain ratio. Taking child node d2 as an example, the information gain ratio $g_R(D, A)$ of each feature in the sample set D2 relative to the sample classification may be calculated and the maximum information gain ratio $g_R(D, A)_{\max}$ chosen. When $g_R(D, A)_{\max}$ is greater than the preset threshold $\varepsilon$, the feature corresponding to it may be chosen as the division feature $A_g$ (for example feature $A_{i+1}$), and D2 is divided into several subsample sets based on $A_g$, for example into subsample sets D21, D22 and D23. The division feature $A_g$ is then removed from D21, D22 and D23, child nodes d21, d22 and d23 of the current node d2 are generated, and the sample sets D21, D22 and D23 with $A_g$ removed are taken as the node information of child nodes d21, d22 and d23 respectively.
312, taking the child node as a leaf node, and setting the output of the leaf node according to the sample category in the subsample set of the child node.
For example, the preset classification termination condition may include: the number of sample categories in the post-removal subsample set corresponding to the child node is 1, that is, the sample set of the child node contains samples of only one category.
In this case, if the child node meets the preset classification termination condition, the category of the samples in the subsample set is taken as the output of the leaf node. For example, when the post-removal subsample set contains only samples of category "male", "male" may be taken as the output of the leaf node.
313, taking the current node as a leaf node, and choosing the sample category with the largest number of samples as the output of the leaf node.
The sample categories include "male" and "female".
For example, when the subsample set D1 of child node d1 is being classified, if the maximum information gain ratio is less than the preset threshold, the sample category with the largest number of samples in subsample set D1 may be taken as the output of the leaf node. If samples of category "female" are the most numerous, "female" may be taken as the output of leaf node a1.
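Steps 303 to 313 can be condensed into one recursive sketch. The code below is illustrative only: the names `build_tree` and `gain_ratio`, the nested-dictionary tree, samples represented as dictionaries of discrete feature values and base-2 logarithms are all assumptions. It also checks the termination condition on entry to each node rather than after the children are generated, and adds a stop when no features remain, which the text does not mention.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(samples, labels, feature):
    groups = defaultdict(list)
    for sample, label in zip(samples, labels):
        groups[sample[feature]].append(label)
    n = len(labels)
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([s[feature] for s in samples])
    # Assumption: a single-valued feature gets a gain ratio of 0.
    return (entropy(labels) - cond) / split_info if split_info > 0 else 0.0


def build_tree(samples, labels, features, threshold):
    majority = Counter(labels).most_common(1)[0][0]
    # Termination (steps 310/312): only one category left in the node.
    # "No features left" is an extra stop rule assumed for this sketch.
    if len(set(labels)) == 1 or not features:
        return {"label": majority}
    # Step 305: gain ratio of every remaining feature, then the maximum.
    ratios = {f: gain_ratio(samples, labels, f) for f in features}
    best = max(ratios, key=ratios.get)
    # Steps 306/313: not above the threshold -> majority-category leaf.
    if ratios[best] <= threshold:
        return {"label": majority}
    # Steps 307-309: divide by feature value, drop the division feature,
    # and create one child per subsample set, keyed by the path value.
    remaining = [f for f in features if f != best]
    subsets = defaultdict(lambda: ([], []))
    for sample, label in zip(samples, labels):
        subsets[sample[best]][0].append(sample)
        subsets[sample[best]][1].append(label)
    return {
        "feature": best,
        "children": {
            value: build_tree(xs, ys, remaining, threshold)  # step 311
            for value, (xs, ys) in subsets.items()
        },
    }


samples = [
    {"sports": 1, "beauty": 0},
    {"sports": 1, "beauty": 0},
    {"sports": 0, "beauty": 1},
    {"sports": 0, "beauty": 0},
]
labels = ["male", "male", "female", "female"]
print(build_tree(samples, labels, ["sports", "beauty"], threshold=0.1))
```

In this toy data the hypothetical `sports` feature separates the two categories completely, so it is chosen at the root and both children become leaves; raising the threshold to 1 turns the root itself into a leaf.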
314. After the decision tree model has been constructed, obtain the time at which the user's gender needs to be predicted, and collect, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample.
Wherein, the prediction time can be set as required; for example, it can be the current time.
315. Predict the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
For example, the corresponding leaf node can be determined according to the features of the prediction sample and the decision tree model, and the output of that leaf node is used as the prediction result. That is, the features of the prediction sample are matched against the branch conditions of the decision tree (the feature values of the division features) to locate the current leaf node, and the output of that leaf node is taken as the prediction result. Because the output of a leaf node is either "male" or "female", the gender of the user who has not provided gender information can be determined from the decision tree.
For example, after the multidimensional features of the behavioral habits of a user who has not provided gender information are collected, the corresponding leaf node an1 can be found in the decision tree shown in Fig. 5 by following the branch conditions of the decision tree. The output of leaf node an1 is "male", so the gender of the user who has not provided gender information is determined to be male.
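The lookup of a leaf node by following branch conditions can be sketched as below. The nested-dictionary tree layout, the feature names and the tree contents are all hypothetical and are not taken from Fig. 5; a feature value that has no matching branch raises `KeyError` in this sketch, because the text does not describe that case.

```python
def predict(tree, sample):
    """Walk from the root to a leaf. An internal node is a dict with a
    division feature and one child per feature value; a leaf is a label."""
    node = tree
    while isinstance(node, dict):
        value = sample[node["feature"]]   # feature value of the prediction sample
        node = node["children"][value]    # follow the matching branch condition
    return node

# Hypothetical decision tree with two division features.
tree = {
    "feature": "game_apps",
    "children": {
        "many": "male",
        "few": {
            "feature": "shopping_apps",
            "children": {"many": "female", "few": "male"},
        },
    },
}

# Hypothetical prediction sample for a user without gender information.
prediction_sample = {"game_apps": "few", "shopping_apps": "many"}
result = predict(tree, prediction_sample)
```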
As can be seen from the above, the embodiment of the present application collects the multidimensional features of the behavioral habits of users who have provided gender information as samples, and constructs a sample set of the behavioral habits of users who have provided gender information; when the number of features exceeds a preset threshold, performs sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for predicting user gender, the output of which is "male" or "female"; collects, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample; and predicts the gender of that user according to the prediction sample and the decision tree model.
Further, because each sample in the sample set includes multiple pieces of feature information reflecting the user's behavioral habits, the embodiment of the present application can make user gender prediction more intelligent.
Further, implementing user gender prediction with a decision tree prediction model can improve the accuracy of the prediction.
An embodiment further provides a user gender prediction device. Referring to Fig. 7, Fig. 7 is a schematic structural diagram of the user gender prediction device provided by the embodiment of the present application. The user gender prediction device is applied to an electronic device and includes a first acquisition unit 401, a classification unit 402, a second acquisition unit 403 and a prediction unit 404, as follows:
The first acquisition unit 401 is configured to collect the multidimensional features of the behavioral habits of users who have provided gender information as samples, and to construct a sample set of the behavioral habits of users who have provided gender information;
The classification unit 402 is configured to, when the number of features exceeds a preset threshold, perform sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for user gender prediction;
The second acquisition unit 403 is configured to collect, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample;
The prediction unit 404 is configured to predict the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
In one embodiment, referring to Fig. 8, the classification unit 402 may include:
A first node generation subunit 4021, configured to generate the root node of the decision tree and use the sample set as the node information of the root node, and to determine the sample set of the root node as the current target sample set to be classified;
A gain ratio obtaining subunit 4022, configured to obtain the information gain ratio of each feature in the target sample set for the classification of the target sample set;
A feature determination subunit 4023, configured to select the current division feature from the features according to the information gain ratio;
A classification subunit 4024, configured to divide the sample set according to the division feature to obtain several sub-sample sets;
A second node generation subunit 4025, configured to remove the division feature from the samples in each sub-sample set to obtain post-removal sub-sample sets, to generate child nodes of the current node, and to use the post-removal sub-sample sets as the node information of the child nodes;
A judgment subunit 4026, configured to judge whether a child node satisfies the preset classification termination condition; if not, to update the target sample set to the post-removal sub-sample set and trigger the gain ratio obtaining subunit 4022 to perform the step of obtaining the information gain ratio of each feature in the target sample set for the classification of the sample set; if so, to use the child node as a leaf node and set the output of the leaf node according to the class of the samples in the post-removal sub-sample set, the classes of the samples including "male" and "female".
Wherein, the classification subunit 4024 can be configured to obtain the feature values of the division feature in the sample set;
and to divide the sample set according to the feature values, so that samples with the same feature value are placed in the same sub-sample set.
Wherein, the feature determination subunit 4023 can be configured to:
Select the maximum target information gain ratio from the information gain ratios;
Judge whether the target information gain ratio is greater than the preset threshold;
If so, select the feature corresponding to the target information gain ratio as the current division feature.
In one embodiment, the gain ratio obtaining subunit 4022 can be configured to:
Obtain the information gain of the feature for the classification of the target sample set;
Obtain the division information of the feature for the classification of the target sample set;
Obtain the information gain ratio of the feature for the classification of the target sample set according to the information gain and the division information.
For example, the gain ratio obtaining subunit 4022 can be configured to:
Obtain the empirical entropy of the target sample classification;
Obtain the conditional entropy of the feature for the classification result of the target sample set;
Obtain the information gain ratio of the feature for the classification of the target sample set according to the conditional entropy and the empirical entropy.
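The quantities handled by the gain ratio obtaining subunit can be illustrated with the sketch below, which uses the usual definitions: information gain as empirical entropy minus conditional entropy, and gain ratio as information gain divided by division information. The function names are hypothetical, and both the base-2 logarithm and the convention of returning 0 when the division information is zero are assumptions.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2, an assumption) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain ratio of one feature for the sample classification."""
    total = len(labels)
    h_d = entropy(labels)                          # empirical entropy H(D)
    h_d_given_a = 0.0                              # conditional entropy H(D|A)
    for value in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == value]
        h_d_given_a += len(subset) / total * entropy(subset)
    gain = h_d - h_d_given_a                       # information gain g(D, A)
    split_info = entropy(feature_values)           # division information H_A(D)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical feature column and gender labels for four samples.
game_apps = ["many", "few", "many", "few"]
genders = ["male", "female", "male", "male"]
ratio = gain_ratio(game_apps, genders)   # approximately 0.311
```

Here H(D) is about 0.811, H(D|A) is 0.5 and the division information is 1, so the gain ratio equals the information gain of about 0.311.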
In one embodiment, the judgment subunit 4026 can be configured to judge whether the number of classes of the samples in the post-removal sub-sample set corresponding to the child node equals a preset number;
If so, to determine that the child node satisfies the preset classification termination condition.
In one embodiment, the feature determination subunit 4023 can be further configured to, when the target information gain ratio is not greater than the preset threshold, use the current node as a leaf node and select the sample class with the largest number of samples as the output of the leaf node.
Wherein, for the steps performed by each unit of the user gender prediction device, reference may be made to the method steps described in the above method embodiments. The user gender prediction device can be integrated in an electronic device, such as a mobile phone or a tablet computer.
In specific implementation, each of the above units can be implemented as an independent entity, or the units can be combined arbitrarily and implemented as one or several entities. For the specific implementation of each unit, reference may be made to the foregoing embodiments; details are not repeated here.
As can be seen from the above, in the user gender prediction device of this embodiment, the first acquisition unit 401 collects the multidimensional features of the behavioral habits of users who have provided gender information as samples and constructs a sample set of the behavioral habits of users who have provided gender information; the classification unit 402, when the number of features exceeds a preset threshold, performs sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for predicting user gender, the output of which is "male" or "female"; the second acquisition unit 403 collects, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample; and the prediction unit 404 predicts the gender of that user according to the prediction sample and the decision tree model.
Further, because each sample in the sample set includes multiple pieces of feature information reflecting the user's behavioral habits, the embodiment of the present application can make user gender prediction more intelligent.
Further, implementing user gender prediction with a decision tree prediction model can improve the accuracy of the prediction.
The embodiment of the present application further provides an electronic device. Referring to Fig. 9, the electronic device 500 includes a processor 501 and a memory 502, and the processor 501 is electrically connected to the memory 502.
The processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading the computer program stored in the memory 502 and calling the data stored in the memory 502, thereby monitoring the electronic device 500 as a whole.
The memory 502 can be used to store software programs and modules. The processor 501 executes various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 can mainly include a program storage area and a data storage area, where the program storage area can store the operating system and the computer programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area can store data created through use of the electronic device. In addition, the memory 502 can include high-speed random access memory and can also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 502 can also include a memory controller to provide the processor 501 with access to the memory 502.
In the embodiment of the present application, the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and runs the computer programs stored in the memory 502, thereby implementing various functions, as follows:
Collect the multidimensional features of the behavioral habits of users who have provided gender information as samples, and construct a sample set of the behavioral habits of users who have provided gender information;
When the number of features exceeds a preset threshold, perform sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for user gender prediction, the output of the decision tree model being male or female;
Collect, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample;
Predict the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
In some embodiments, when performing sample classification on the sample set according to the information gain ratio of the features for the sample classification so as to construct the decision tree model for user gender prediction, the processor 501 can specifically perform the following steps:
Generate the root node of the decision tree, and use the sample set as the node information of the root node;
Determine the sample set of the root node as the current target sample set to be classified;
Obtain the information gain ratio of each feature in the target sample set for the classification of the target sample set;
Select the current division feature from the features according to the information gain ratio;
Divide the sample set according to the division feature to obtain several sub-sample sets;
Remove the division feature from the samples in each sub-sample set to obtain post-removal sub-sample sets;
Generate child nodes of the current node, and use the post-removal sub-sample sets as the node information of the child nodes;
Judge whether a child node satisfies the preset classification termination condition;
If not, update the target sample set to the post-removal sub-sample set, and return to the step of obtaining the information gain ratio of each feature in the target sample set for the classification of the target sample set;
If so, use the child node as a leaf node, and set the output of the leaf node according to the class of the samples in the post-removal sub-sample set, the classes of the samples including "male" and "female".
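Taken together, the steps above amount to a recursive tree construction, which the following sketch illustrates under several assumptions: samples are `(feature dict, label)` pairs, the termination condition is that a single class remains, a node whose best gain ratio does not exceed the threshold becomes a majority-class leaf, and a node with no features left also becomes a majority-class leaf (a case the text does not address). All names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def gain_ratio(samples, feature):
    """Gain ratio of `feature` on a list of (feature dict, label) samples."""
    groups = defaultdict(list)
    for feats, label in samples:
        groups[feats[feature]].append(label)
    labels = [label for _, label in samples]
    cond = sum(len(g) / len(samples) * entropy(g) for g in groups.values())
    split = entropy([feats[feature] for feats, _ in samples])
    return (entropy(labels) - cond) / split if split > 0 else 0.0

def build_tree(samples, features, epsilon):
    labels = [label for _, label in samples]
    majority = Counter(labels).most_common(1)[0][0]
    # Leaf: only one class remains, or no feature is left (assumption).
    if len(set(labels)) == 1 or not features:
        return majority
    ratios = {f: gain_ratio(samples, f) for f in features}
    best = max(ratios, key=ratios.get)
    # Leaf: the largest gain ratio does not exceed the preset threshold.
    if ratios[best] <= epsilon:
        return majority
    # Divide by the feature value and remove the division feature.
    subsets = defaultdict(list)
    for feats, label in samples:
        rest = {k: v for k, v in feats.items() if k != best}
        subsets[feats[best]].append((rest, label))
    remaining = [f for f in features if f != best]
    return {"feature": best,
            "children": {value: build_tree(subset, remaining, epsilon)
                         for value, subset in subsets.items()}}

# Hypothetical sample set of users who have provided gender information.
sample_set = [
    ({"game_apps": "many", "shopping_apps": "few"}, "male"),
    ({"game_apps": "many", "shopping_apps": "many"}, "male"),
    ({"game_apps": "few", "shopping_apps": "many"}, "female"),
    ({"game_apps": "few", "shopping_apps": "few"}, "female"),
]
model = build_tree(sample_set, ["game_apps", "shopping_apps"], epsilon=0.1)
```

In this toy data `game_apps` separates the two classes completely, so it is chosen at the root and both children immediately become leaves.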
In some embodiments, when dividing the target sample set according to the division feature, the processor 501 can specifically perform the following steps:
Obtain the feature values of the division feature in the target sample set;
Divide the target sample set according to the feature values.
In some embodiments, when selecting the current division feature from the features according to the information gain ratio, the processor 501 can specifically perform the following steps:
Select the maximum target information gain ratio from the information gain ratios;
Judge whether the target information gain ratio is greater than the preset threshold;
If so, select the feature corresponding to the target information gain ratio as the current division feature.
In some embodiments, the processor 501 can also specifically perform the following step:
When the target information gain ratio is not greater than the preset threshold, use the current node as a leaf node, and select the sample class with the largest number of samples as the output of the leaf node.
In some embodiments, when obtaining the information gain ratio of each feature in the target sample set for the classification of the sample set, the processor 501 can specifically perform the following steps:
Obtain the information gain of the feature for the classification of the target sample set;
Obtain the division information of the feature for the classification of the target sample set;
Obtain the information gain ratio of the feature for the classification of the target sample set according to the information gain and the division information.
As can be seen from the above, the electronic device of the embodiment of the present application collects the multidimensional features of the behavioral habits of users who have provided gender information as samples, and constructs a sample set of the behavioral habits of users who have provided gender information; when the number of features exceeds a preset threshold, performs sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for predicting user gender, the output of which is "male" or "female"; collects, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample; and predicts the gender of that user according to the prediction sample and the decision tree model.
Referring also to Fig. 10, in some embodiments the electronic device 500 can further include a display 503, a radio frequency circuit 504, an audio circuit 505 and a power supply 506, each of which is electrically connected to the processor 501.
The display 503 can be used to display information input by the user or information provided to the user, as well as various graphical user interfaces, which can be composed of graphics, text, icons, video and any combination thereof. The display 503 can include a display panel; in some embodiments, the display panel can be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD) or an organic light-emitting diode (Organic Light-Emitting Diode, OLED) display.
The radio frequency circuit 504 can be used to transmit and receive radio frequency signals, so as to establish wireless communication with a network device or other electronic devices and exchange signals with them.
The audio circuit 505 can be used to provide an audio interface between the user and the electronic device through a loudspeaker and a microphone.
The power supply 506 is used to supply power to all components of the electronic device 500. In some embodiments, the power supply 506 can be logically connected to the processor 501 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system.
Although not shown in Fig. 10, the electronic device 500 can also include a camera, a Bluetooth module and the like, which are not described here.
The embodiment of the present application further provides a storage medium storing a computer program. When the computer program runs on a computer, the computer is caused to execute the user gender prediction method of any of the above embodiments, for example: collecting the multidimensional features of the behavioral habits of users who have provided gender information as samples, and constructing a sample set of the behavioral habits of users who have provided gender information; when the number of features exceeds a preset threshold, performing sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for predicting user gender, the output of which is "male" or "female"; collecting, according to the prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample; and predicting the gender of that user according to the prediction sample and the decision tree model.
In the embodiment of the present application, the storage medium can be a magnetic disk, an optical disc, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
It should be noted that, for the user gender prediction method of the embodiment of the present application, a person of ordinary skill in the art can understand that all or part of the process of implementing the method can be completed by a computer program controlling the relevant hardware. The computer program can be stored in a computer-readable storage medium, for example in the memory of an electronic device, and executed by at least one processor in the electronic device; its execution can include the processes of the embodiments of the user gender prediction method. The storage medium can be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
For the user gender prediction device of the embodiment of the present application, the functional modules can be integrated in one processing chip, each module can exist physically on its own, or two or more modules can be integrated in one module. The integrated module can be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk or an optical disc.
The user gender prediction method, device, storage medium and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, for those skilled in the art, there will be changes in the specific implementations and the scope of application according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (16)

1. A user gender prediction method, characterized by comprising:
collecting the multidimensional features of the behavioral habits of users who have provided gender information as samples, and constructing a sample set of the behavioral habits of users who have provided gender information;
when the number of features exceeds a preset threshold, performing sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for user gender prediction;
collecting, according to a prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample;
predicting the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
2. The user gender prediction method according to claim 1, characterized in that, when the number of features exceeds a preset threshold, performing sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for user gender prediction, comprises:
generating the root node of the decision tree, and using the sample set as the node information of the root node;
determining the sample set of the root node as the current target sample set to be classified;
obtaining the information gain ratio of each feature in the target sample set for the classification of the target sample set;
selecting the current division feature from the features according to the information gain ratio;
dividing the sample set according to the division feature to obtain several sub-sample sets;
removing the division feature from the samples in each sub-sample set to obtain post-removal sub-sample sets;
generating child nodes of the current node, and using the post-removal sub-sample sets as the node information of the child nodes;
judging whether a child node satisfies a preset classification termination condition;
if not, updating the target sample set to the post-removal sub-sample set, and returning to the step of obtaining the information gain ratio of each feature in the target sample set for the classification of the target sample set;
if so, using the child node as a leaf node, and setting the output of the leaf node according to the class of the samples in the post-removal sub-sample set, the classes of the samples including male and female.
3. The user gender prediction method according to claim 2, characterized in that dividing the target sample set according to the division feature comprises:
obtaining the feature values of the division feature in the target sample set;
dividing the target sample set according to the feature values.
4. The user gender prediction method according to claim 2, characterized in that selecting the current division feature from the features according to the information gain ratio comprises:
selecting the maximum target information gain ratio from the information gain ratios;
judging whether the target information gain ratio is greater than a preset threshold;
if so, selecting the feature corresponding to the target information gain ratio as the current division feature.
5. The user gender prediction method according to claim 4, characterized in that the user gender prediction method further comprises:
when the target information gain ratio is not greater than the preset threshold, using the current node as a leaf node, and selecting the sample class with the largest number of samples as the output of the leaf node.
6. The user gender prediction method according to claim 2, characterized in that judging whether a child node satisfies the preset classification termination condition comprises:
judging whether the number of classes of the samples in the post-removal sub-sample set corresponding to the child node equals a preset number;
if so, determining that the child node satisfies the preset classification termination condition.
7. The user gender prediction method according to any one of claims 2-6, characterized in that obtaining the information gain ratio of each feature in the target sample set for the classification of the target sample set comprises:
obtaining the information gain of the feature for the classification of the target sample set;
obtaining the division information of the feature for the classification of the target sample set;
obtaining the information gain ratio of the feature for the classification of the target sample set according to the information gain and the division information.
8. The user gender prediction method according to claim 7, characterized in that obtaining the information gain ratio of the feature for the classification of the target sample set comprises:
obtaining the empirical entropy of the target sample classification;
obtaining the conditional entropy of the feature for the classification result of the target sample set;
obtaining the information gain ratio of the feature for the classification of the target sample set according to the conditional entropy and the empirical entropy.
9. The user gender prediction method according to claim 7, characterized in that obtaining the information gain ratio of the feature for the classification of the target sample set according to the information gain and the division information comprises:
calculating the information gain ratio of the feature for the classification of the target sample set by the following formula:
$$g_R(D, A) = \frac{g(D, A)}{H_A(D)}$$
wherein $g_R(D, A)$ is the information gain ratio of feature A for the classification of sample set D, $g(D, A)$ is the information gain of feature A for the sample classification, and $H_A(D)$ is the division information of feature A;
and $g(D, A)$ can be calculated by the following formula:
$$g(D, A) = H(D) - H(D \mid A)$$
wherein $H(D)$ is the empirical entropy of the classification of sample set D, $H(D \mid A)$ is the conditional entropy of feature A for the classification of sample set D, $p_i$ is the probability that a sample in which feature A takes its i-th value occurs in sample set D, and n and i are positive integers greater than zero.
10. A user gender prediction device, characterized by comprising:
a first acquisition unit, configured to collect the multidimensional features of the behavioral habits of users who have provided gender information as samples, and to construct a sample set of the behavioral habits of users who have provided gender information;
a classification unit, configured to, when the number of features exceeds a preset threshold, perform sample classification on the sample set according to the information gain ratio of each feature for the sample classification, so as to construct a decision tree model for user gender prediction, the output of the decision tree model being male or female;
a second acquisition unit, configured to collect, according to a prediction time, the multidimensional features of the behavioral habits of a user who has not provided gender information as a prediction sample;
a prediction unit, configured to predict the gender of the user who has not provided gender information according to the prediction sample and the decision tree model.
11. The user gender prediction device according to claim 10, characterized in that the classification unit comprises:
a first node generation subunit, configured to generate the root node of the decision tree and use the sample set as the node information of the root node, and to determine the sample set of the root node as the current target sample set to be classified;
a gain ratio obtaining subunit, configured to obtain the information gain ratio of each feature in the target sample set for the classification of the target sample set;
a feature determination subunit, configured to select the current division feature from the features according to the information gain ratio;
a classification subunit, configured to divide the sample set according to the division feature to obtain several sub-sample sets;
a second node generation subunit, configured to remove the division feature from the samples in each sub-sample set to obtain post-removal sub-sample sets, to generate child nodes of the current node, and to use the post-removal sub-sample sets as the node information of the child nodes;
a judgment subunit, configured to judge whether a child node satisfies a preset classification termination condition; if not, to update the target sample set to the post-removal sub-sample set and trigger the gain ratio obtaining subunit to perform the step of obtaining the information gain ratio of each feature in the target sample set for the classification of the sample set; if so, to use the child node as a leaf node and set the output of the leaf node according to the class of the samples in the post-removal sub-sample set, the classes of the samples including male and female.
12. user gender prediction device as claimed in claim 11, which is characterized in that the classification subelement is used for:
Obtain the characteristic value that feature is divided in the sample set;
The sample set is divided according to the characteristic value.
13. user gender prediction device as claimed in claim 11, which is characterized in that feature determines subelement, is used for:
Maximum target information ratio of profit increase is chosen from the information gain-ratio;
Judge whether the target information ratio of profit increase is greater than preset threshold;
If so, choosing the corresponding feature of the target information ratio of profit increase as current division feature.
14. The user gender prediction apparatus according to claim 11, wherein the gain ratio obtaining subunit is configured to:
obtain the information gain of the feature for classifying the target sample set;
obtain the division information of the feature for classifying the target sample set;
obtain, according to the information gain and the division information, the information gain ratio of the feature for classifying the target sample set.
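As an illustration of how the three quantities in this claim relate, the following sketch uses the standard C4.5-style definitions: information gain is the entropy of the labels minus the conditional entropy given the feature, division (split) information is the entropy of the feature's value distribution, and the gain ratio is their quotient. The function names, the list-of-dicts data layout, and returning 0.0 when the division information is zero are assumptions, not details taken from the claims.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of category labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(samples, labels, feature):
    """Information gain ratio of `feature` for classifying the sample set.

    samples: list of dicts mapping feature name -> feature value.
    labels: list of category labels (e.g. "male" / "female"), aligned with samples.
    """
    total = len(samples)
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(sample[feature], []).append(label)

    # Information gain: label entropy minus entropy remaining after the division.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional

    # Division information: entropy of the feature's own value distribution.
    split_info = -sum(
        (len(g) / total) * math.log2(len(g) / total) for g in groups.values()
    )

    # Assumption: a feature with a single value carries no usable division.
    return info_gain / split_info if split_info > 0 else 0.0
```

Dividing by the division information penalises features with many distinct values, which plain information gain tends to favour.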
15. A storage medium having a computer program stored thereon, wherein, when the computer program runs on a computer, the computer is caused to execute the user gender prediction method according to any one of claims 1 to 9.
16. An electronic device, comprising a processor and a memory, the memory storing a computer program, wherein the processor, by calling the computer program, executes the user gender prediction method according to any one of claims 1 to 9.
CN201711405558.8A 2017-12-22 2017-12-22 User gender prediction method, apparatus, medium and electronic equipment Pending CN109961075A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201711405558.8A CN109961075A (en) 2017-12-22 2017-12-22 User gender prediction method, apparatus, medium and electronic equipment
PCT/CN2018/115358 WO2019120007A1 (en) 2017-12-22 2018-11-14 Method and apparatus for predicting user gender, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711405558.8A CN109961075A (en) 2017-12-22 2017-12-22 User gender prediction method, apparatus, medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN109961075A (en) 2019-07-02

Family

ID=66993039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711405558.8A Pending CN109961075A (en) 2017-12-22 2017-12-22 User gender prediction method, apparatus, medium and electronic equipment

Country Status (2)

Country Link
CN (1) CN109961075A (en)
WO (1) WO2019120007A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116757450B (en) * 2023-08-17 2024-01-30 浪潮通用软件有限公司 Method, device, equipment and medium for task allocation of sharing center

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473231A (en) * 2012-06-06 2013-12-25 深圳先进技术研究院 Classifier building method and system
US20150379426A1 (en) * 2014-06-30 2015-12-31 Amazon Technologies, Inc. Optimized decision tree based models
CN105654131A (en) * 2015-12-30 2016-06-08 小米科技有限责任公司 Classification model training method and device
CN106294667A (en) * 2016-08-05 2017-01-04 四川九洲电器集团有限责任公司 A kind of decision tree implementation method based on ID3 and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103164470A (en) * 2011-12-15 2013-06-19 盛大计算机(上海)有限公司 Directional application method based on user gender distinguished results and system thereof
CN104933075A (en) * 2014-03-20 2015-09-23 百度在线网络技术(北京)有限公司 User attribute predicting platform and method
CN104598648B (en) * 2015-02-26 2017-12-26 苏州大学 A kind of microblog users interactive mode gender identification method and device
CN107180044A (en) * 2016-03-09 2017-09-19 精硕科技(北京)股份有限公司 Recognize Internet user's sex method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
薛锋等: "《技术站能力查定及其自动化》", 31 August 2017 *
韩忠明等: "《数据分析与R》", 31 August 2014 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639714A (en) * 2020-06-01 2020-09-08 贝壳技术有限公司 Method, device and equipment for determining attributes of users
CN112348583A (en) * 2020-11-04 2021-02-09 贝壳技术有限公司 User preference generation method and generation system
CN112348583B (en) * 2020-11-04 2022-12-06 贝壳技术有限公司 User preference generation method and generation system
CN112446144A (en) * 2020-11-17 2021-03-05 哈工大机器人(合肥)国际创新研究院 Fault diagnosis method and device for large-scale rotating machine set

Also Published As

Publication number Publication date
WO2019120007A1 (en) 2019-06-27

Similar Documents

Publication Publication Date Title
CN109961077A (en) Gender prediction's method, apparatus, storage medium and electronic equipment
CN109948633A (en) User gender prediction method, apparatus, storage medium and electronic equipment
CN107704070A (en) Using method for cleaning, device, storage medium and electronic equipment
CN106530010B (en) The collaborative filtering method and device of time of fusion factor
CN109063163A (en) A kind of method, apparatus, terminal device and medium that music is recommended
CN108108455A (en) Method for pushing, device, storage medium and the electronic equipment of destination
CN107678845A (en) Application program management-control method, device, storage medium and electronic equipment
US11170061B2 (en) System for decomposing events from managed infrastructures that includes a reference tool signalizer
CN109961075A (en) User gender prediction method, apparatus, medium and electronic equipment
CN108337358A (en) Using method for cleaning, device, storage medium and electronic equipment
CN108280458A (en) Group relation kind identification method and device
CN107894827A (en) Using method for cleaning, device, storage medium and electronic equipment
CN108268617A (en) User view determines method and device
CN107678531A (en) Using method for cleaning, device, storage medium and electronic equipment
CN108197225A (en) Sorting technique, device, storage medium and the electronic equipment of image
US10050910B2 (en) Application of neural nets to determine the probability of an event being causal
CN107678800A (en) Background application method for cleaning, device, storage medium and electronic equipment
CN108563680A (en) Resource recommendation method and device
US11010220B2 (en) System and methods for decomposing events from managed infrastructures that includes a feedback signalizer functor
CN113254711B (en) Interactive image display method and device, computer equipment and storage medium
CN107704289A (en) Using method for cleaning, device, storage medium and electronic equipment
US10402428B2 (en) Event clustering system
CN110458296A (en) The labeling method and device of object event, storage medium and electronic device
CN109583949A (en) A kind of user changes planes prediction technique and system
CN110347781A (en) Article falls discharge method, article recommended method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190702
