CN110059112A - Usage mining method and device based on machine learning, electronic equipment, medium - Google Patents


Info

Publication number
CN110059112A
Authority
CN
China
Prior art keywords
mining
target
model
user
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811062830.1A
Other languages
Chinese (zh)
Inventor
林凌军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN201811062830.1A
Publication of CN110059112A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure relates to a usage mining method and device based on machine learning, an electronic device, and a storage medium, in the field of artificial intelligence. The method comprises: obtaining historical data of all users across multiple dimension features; performing feature selection on the multiple dimension features to obtain at least one target dimension feature associated with a test object; training a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and determining a target mining model based on the multiple mining models; and determining target users from all users through the target mining model. Based on big data analysis and computation, the disclosure can mine users from multiple dimensions through models and improve the accuracy of usage mining through intelligent prediction.

Description

Usage mining method and device based on machine learning, electronic equipment, medium
Technical field
The disclosure relates to the field of artificial intelligence, and in particular to a usage mining method based on machine learning, a usage mining device based on machine learning, an electronic device, and a computer-readable storage medium.
Background
In internet business, a platform typically has a very large number of users, some of whom may be high-value users. Accurately locating the high-value users among them greatly benefits the promotion and growth of the platform.
In the prior art, the judgment is usually made from the data of a single dimension. For example, high-value users are located only by a funds dimension or only by an educational-background dimension, so the angle from which high-value users are mined is narrow. In addition, high-value users are determined by a single model only, which cannot guarantee that the model performs best, so the high-value users identified are inaccurate.
It should be noted that the information disclosed in the Background section above is only intended to aid understanding of the background of the disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the disclosure is to provide a usage mining method and device based on machine learning, an electronic device, and a storage medium, which overcome, at least to some extent, the low usage mining accuracy caused by the limitations and defects of the related art.
Other features and advantages of the disclosure will become apparent from the following detailed description, or may be learned in part by practice of the disclosure.
According to one aspect of the disclosure, a usage mining method based on machine learning is provided, comprising: obtaining historical data of all users across multiple dimension features; performing feature selection on the multiple dimension features to obtain at least one target dimension feature associated with a test object; training a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and determining a target mining model based on the multiple mining models; and determining target users from all users through the target mining model.
In an exemplary embodiment of the disclosure, the method further comprises: performing data cleansing on the historical data of all users.
In an exemplary embodiment of the disclosure, performing feature selection on the multiple dimension features to obtain at least one target dimension feature associated with the test object comprises: calculating, by a random forest algorithm, the importance score of each of the multiple dimension features relative to the test object, and determining the at least one target dimension feature according to the importance scores in descending order.
In an exemplary embodiment of the disclosure, training the preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction comprises: training the preset model with the historical data of each target dimension feature and establishing one mining model for each target dimension feature, to obtain multiple mining models; or taking the at least one target dimension feature as combination features and training the preset model with the historical data corresponding to the combination features, to establish multiple mining models.
In an exemplary embodiment of the disclosure, determining the target mining model based on the multiple mining models comprises: testing the multiple mining models, and determining a mining model whose test value is greater than a preset value as the target mining model.
In an exemplary embodiment of the disclosure, determining a mining model whose test value is greater than the preset value as the target mining model comprises: testing the multiple mining models with the data of preset users to obtain test values; and determining the mining model whose test value is greater than the preset value as the target mining model.
In an exemplary embodiment of the disclosure, determining target users from all users through the target mining model comprises: analyzing the historical data of all users with the target mining model to calculate classification probabilities; and determining the target users for the test object from the classification probabilities.
According to one aspect of the disclosure, a usage mining device based on machine learning is provided, comprising: a data acquisition module for obtaining historical data of all users across multiple dimension features; a feature selection module for performing feature selection on the multiple dimension features to obtain at least one target dimension feature associated with a test object; a model building module for training a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and determining a target mining model based on the multiple mining models; and a user determining module for determining target users from all users through the target mining model.
According to one aspect of the disclosure, an electronic device is provided, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to perform, by executing the executable instructions, the usage mining method based on machine learning described in any one of the above.
According to one aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the usage mining method based on machine learning described in any one of the above.
In the usage mining method and device based on machine learning, the electronic device, and the computer-readable storage medium provided in the exemplary embodiments of the disclosure, on the one hand, the target mining model is determined from at least one target dimension feature, which broadens the mining dimensions and the range of application; on the other hand, one target mining model is obtained from multiple mining models, so the target mining model performs better and target users can be mined accurately.
It should be understood that the general description above and the detailed description below are only exemplary and explanatory, and do not limit the disclosure.
Brief description of the drawings
The drawings here are incorporated into and form part of this specification; they show embodiments consistent with the disclosure and, together with the specification, serve to explain the principles of the disclosure. The drawings described below are only some embodiments of the disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 schematically shows a usage mining method based on machine learning in an exemplary embodiment of the disclosure;
Fig. 2 schematically shows a flow chart of mining target users by a machine learning algorithm in an exemplary embodiment of the disclosure;
Fig. 3 schematically shows a block diagram of a usage mining device based on machine learning in an exemplary embodiment of the disclosure;
Fig. 4 schematically shows a block diagram of an electronic device in an exemplary embodiment of the disclosure;
Fig. 5 schematically shows a program product in an exemplary embodiment of the disclosure.
Detailed description
Example embodiments are now described more fully with reference to the drawings. The example embodiments can, however, be implemented in many forms and should not be understood as limited to the examples set forth here; rather, these embodiments are provided so that the disclosure is thorough and complete and fully conveys the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are given to provide a full understanding of the embodiments of the disclosure. Those skilled in the art will appreciate, however, that the technical solutions of the disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, devices, steps, and so on. In other cases, well-known solutions are not shown or described in detail, so that they do not distract from and obscure aspects of the disclosure.
In addition, the drawings are only schematic illustrations of the disclosure and are not necessarily drawn to scale. Identical reference numerals in the drawings denote identical or similar parts, so repeated description of them is omitted. Some of the block diagrams shown in the drawings are functional entities, which do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
This example embodiment first provides a usage mining method based on machine learning, which can be applied to any application platform that needs to determine or predict target users. Referring to Fig. 1, the usage mining method based on machine learning may comprise the following steps:
In step S110, historical data of all users is obtained across multiple dimension features;
In step S120, feature selection is performed on the multiple dimension features to obtain at least one target dimension feature associated with a test object;
In step S130, a preset model is trained with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and a target mining model is determined based on the multiple mining models;
In step S140, target users are determined from all users through the target mining model.
In the usage mining method based on machine learning provided in this example embodiment, on the one hand, the target mining model is determined from at least one target dimension feature, which broadens the mining dimensions and the range of application, so target users can be mined more comprehensively; on the other hand, one target mining model is obtained from multiple mining models, so the target mining model performs better and target users can be mined accurately.
Next, the usage mining method based on machine learning in this exemplary embodiment is explained further with reference to the drawings.
In step S110, historical data of all users is obtained across multiple dimension features.
In this exemplary embodiment, all users may be, for example, all registered users of a preset platform. The preset platform may be any suitable application platform, such as an online shopping platform, an information platform, or an online-to-offline (O2O) e-commerce platform. To ensure that the analysis results are valid, the historical data may include user data within a preset duration, for example one month, three months, or half a year, which can be configured according to actual needs. The multiple dimension features include, but are not limited to, age bracket, educational background, annual income range, city of residence, active modules on the preset platform, activity level on the platform, and wealth attributes. Obtaining users' historical data across multiple dimension features makes the acquired data more comprehensive.
After the historical data of all users is obtained, data cleansing is needed to filter out the valid data, because not all of the data is valid. For example, the cumulative login duration and/or the number of logins of each user within the preset duration may be obtained. The historical data of users whose cumulative login duration exceeds a predetermined duration or whose number of logins exceeds a preset number is then retained, and the historical data of users who have only registered, or whose number of logins and login duration are both low, is discarded. This reduces the amount of computation, avoids interference from invalid data, and improves data quality.
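The cleansing rule above can be sketched as a simple filter. This is a minimal illustration only: the `UserHistory` record, its field names, and the threshold values are assumptions made for the example and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserHistory:
    """Hypothetical per-user summary over the preset duration."""
    user_id: str
    login_count: int      # number of logins within the preset duration
    login_seconds: float  # cumulative login duration within the preset duration


def clean_history(records: List[UserHistory],
                  min_logins: int,
                  min_seconds: float) -> List[UserHistory]:
    """Keep users whose login count OR cumulative duration exceeds its threshold."""
    return [
        r for r in records
        if r.login_count > min_logins or r.login_seconds > min_seconds
    ]


records = [
    UserHistory("u1", login_count=40, login_seconds=7200.0),
    UserHistory("u2", login_count=0, login_seconds=0.0),     # registered only
    UserHistory("u3", login_count=2, login_seconds=5400.0),  # few logins, long sessions
    UserHistory("u4", login_count=1, login_seconds=30.0),    # low on both counts
]
kept = clean_history(records, min_logins=5, min_seconds=3600.0)
print([r.user_id for r in kept])  # ['u1', 'u3']
```

Users who fail both thresholds (registered only, or low on both counts) are the ones dropped, which matches the "or" condition in the text.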
In step S120, feature selection is performed on the multiple dimension features to obtain at least one target dimension feature associated with a test object.
In this exemplary embodiment, because not every dimension feature affects the prediction result, feature selection may be performed on the multiple dimension features from step S110 to filter out at least one target dimension feature associated with the test object. A target dimension feature here is a dimension feature that is strongly associated with, or has a strong influence on, the test object. The test object may be, for example, different modules, different products, or different application scenarios of the preset platform; different modules are used as the example here. The target dimension features that affect different modules may differ. For example, for a financial function module, the target dimension features obtained may be annual income range, wealth attributes, city of residence, and so on; for a health function module, the target dimension features obtained may be age bracket, annual income range, city of residence, wealth attributes, and so on.
Specifically, the step of performing feature selection on the multiple dimension features to obtain at least one target dimension feature associated with the test object may comprise: calculating, by a random forest algorithm, the importance score of each of the multiple dimension features relative to the test object, and determining the at least one target dimension feature according to the importance scores in descending order. A random forest algorithm and decision tree algorithms may be used to calculate the importance score of each dimension feature relative to the test object; all importance scores are then sorted in descending order, and the dimension features ranked in the top N are determined as the target dimension features. The number of target dimension features to select, that is, the value of N, can be set according to actual needs. Note that, to make the prediction result more accurate and comprehensive, the target dimension features may include two or more dimension features.
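The top-N ranking just described can be sketched in a few lines. The function name `top_n_features` is an assumption for illustration; the scores are the ones the description quotes for dimension features A to D and test objects 1 and 2.

```python
from typing import Dict, List


def top_n_features(scores: Dict[str, float], n: int) -> List[str]:
    """Sort features by importance score, descending, and keep the first n."""
    ranked = sorted(scores, key=lambda feature: scores[feature], reverse=True)
    return ranked[:n]


scores_object_1 = {"A": 0.5, "B": 0.6, "C": 0.1, "D": 0.3}
scores_object_2 = {"A": 0.8, "B": 0.5, "C": 0.6, "D": 0.1}

print(top_n_features(scores_object_1, 3))  # ['B', 'A', 'D']
print(top_n_features(scores_object_2, 3))  # ['A', 'C', 'B']
```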
The importance score of a dimension feature X is calculated by the random forest algorithm as follows. First, for each decision tree, the corresponding out-of-bag (OOB) data is selected to calculate the out-of-bag data error, denoted errOOB1. Out-of-bag data arises because each decision tree is trained on a data set obtained by sampling with replacement, which leaves about 1/3 of the data unused and not involved in building that tree. This part of the data can be used to evaluate the performance of the decision tree by calculating the model's prediction error rate, which is called the out-of-bag data error. Second, noise is randomly added to dimension feature X for all samples of the out-of-bag data (the values of the samples at dimension feature X can be changed at random), and the out-of-bag data error is calculated again, denoted errOOB2. Third, assuming there are N trees in the forest, the importance score of dimension feature X can be expressed as ∑(errOOB2 - errOOB1)/N. This value indicates the importance of a dimension feature because, if the out-of-bag accuracy drops sharply after random noise is added (that is, errOOB2 rises), the dimension feature has a large influence on the prediction results for the samples, which in turn means its importance is relatively high.
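The three steps can be sketched as follows, assuming each trained tree is available as a pair of a prediction function and the indices of its out-of-bag rows. That representation, the function names, and the pluggable `perturb` argument are assumptions made for this illustration; the default perturbation shuffles the feature's values among the out-of-bag samples, which is one common way of changing them at random.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = List[float]
# Hypothetical representation of a trained tree: (predict function, OOB row indices).
Tree = Tuple[Callable[[Sample], int], List[int]]


def shuffle_values(values: List[float], rng: random.Random) -> List[float]:
    """Default noise: permute the feature's values among the OOB samples."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def oob_error(predict: Callable[[Sample], int], X: Sequence[Sample],
              y: Sequence[int], rows: Sequence[int]) -> float:
    """Fraction of out-of-bag rows the tree predicts incorrectly."""
    wrong = sum(1 for i in rows if predict(X[i]) != y[i])
    return wrong / len(rows)


def importance_score(trees: Sequence[Tree], X: Sequence[Sample], y: Sequence[int],
                     feature: int, perturb=shuffle_values, seed: int = 0) -> float:
    """Average of (errOOB2 - errOOB1) over all trees for one feature column."""
    rng = random.Random(seed)
    total = 0.0
    for predict, rows in trees:
        err_oob1 = oob_error(predict, X, y, rows)        # step 1
        noisy = perturb([X[i][feature] for i in rows], rng)
        X_noisy = [list(row) for row in X]               # leave the input untouched
        for i, value in zip(rows, noisy):
            X_noisy[i][feature] = value
        err_oob2 = oob_error(predict, X_noisy, y, rows)  # step 2
        total += err_oob2 - err_oob1
    return total / len(trees)                            # step 3


# Toy forest: two hand-written "trees" that only look at feature 0.
def stump(sample: Sample) -> int:
    return int(sample[0] > 0.5)


def flip(values: List[float], rng: random.Random) -> List[float]:
    """Deterministic stand-in for noise, used so the result is reproducible."""
    return [1.0 - v for v in values]


X = [[0.0, 5.0], [1.0, 3.0], [0.0, 9.0], [1.0, 1.0]]
y = [0, 1, 0, 1]
trees = [(stump, [0, 1]), (stump, [2, 3])]

print(importance_score(trees, X, y, feature=0, perturb=flip))  # 1.0
print(importance_score(trees, X, y, feature=1))                # 0.0
```

Feature 1 scores zero because no tree reads it, so perturbing it cannot change any out-of-bag prediction.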
Next, feature selection on the basis of the dimension feature importance scores comprises: calculating the importance score of each dimension feature and sorting in descending order; determining a proportion to eliminate and, according to the importance scores, eliminating that proportion of the dimension features to obtain a new feature set; repeating the above process with the new feature set until m features remain (m is a value set in advance); and, from the feature sets obtained in the above process and their corresponding out-of-bag error rates, selecting the feature set with the lowest out-of-bag error rate. For example, suppose the top 3 dimension features by importance score are selected as target dimension features. If the importance scores of dimension features A, B, C, and D for test object 1 are 0.5, 0.6, 0.1, and 0.3 respectively, the target dimension features are, in order, dimension feature B, dimension feature A, and dimension feature D. Likewise, if the importance scores of dimension features A, B, C, and D for test object 2 are 0.8, 0.5, 0.6, and 0.1 respectively, the target dimension features for test object 2 are, in order, dimension feature A, dimension feature C, and dimension feature B. In addition, any suitable algorithm, such as filter methods, wrapper methods, feature selection based on tree models, or decision tree algorithms, may also be used to determine the target dimension features for each test object. Compared with manual screening, the random forest algorithm screens the important dimension features automatically, which improves efficiency and accuracy.
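The iterative elimination procedure can be sketched as a loop over two callbacks. Everything named here is an assumption for illustration: `score_fn` stands for recomputing importance scores on the current feature set, `oob_error_fn` for retraining and reading off the out-of-bag error rate, and the error table in the usage part is made up. The sketch also counts the initial, full feature set as a candidate, which the description does not state explicitly.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def eliminate_features(
    features: Sequence[str],
    score_fn: Callable[[List[str]], Dict[str, float]],
    oob_error_fn: Callable[[List[str]], float],
    drop_ratio: float = 0.25,
    m: int = 1,
) -> Tuple[List[str], float]:
    """Repeatedly drop the lowest-scoring share of features until m remain,
    then return the candidate feature set with the lowest OOB error rate."""
    current = list(features)
    candidates = [(list(current), oob_error_fn(current))]
    while len(current) > m:
        scores = score_fn(current)
        ranked = sorted(current, key=lambda f: scores[f], reverse=True)
        n_drop = max(1, math.floor(len(ranked) * drop_ratio))  # always drop at least one
        n_keep = max(m, len(ranked) - n_drop)                  # never go below m
        current = ranked[:n_keep]
        candidates.append((list(current), oob_error_fn(current)))
    return min(candidates, key=lambda c: c[1])


# Importance scores quoted for test object 1; the OOB error rates are hypothetical.
SCORES = {"A": 0.5, "B": 0.6, "C": 0.1, "D": 0.3}
OOB_ERRORS = {("A", "B", "C", "D"): 0.30, ("A", "B", "D"): 0.20, ("A", "B"): 0.25}


def score_fn(feats: List[str]) -> Dict[str, float]:
    return {f: SCORES[f] for f in feats}


def oob_error_fn(feats: List[str]) -> float:
    return OOB_ERRORS[tuple(sorted(feats))]


best = eliminate_features(["A", "B", "C", "D"], score_fn, oob_error_fn,
                          drop_ratio=0.25, m=2)
print(best)  # (['B', 'A', 'D'], 0.2)
```

With these made-up error rates, the three-feature set wins even though elimination continued down to two features, which is why every intermediate set is kept as a candidate.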
In step S130, a preset model is trained with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and a target mining model is determined based on the multiple mining models.
In this exemplary embodiment, the preset model may be any machine learning model, for example a classifier or a random forest model; a random forest model is used as the example here. After the target dimension features are determined, the random forest model may be trained with the historical data, for these target dimension features, of the users remaining after data cleansing, to obtain multiple mining models for determining target users. The step of training the preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction may be carried out in two ways. In mode one, the preset model is trained with the historical data of each target dimension feature, and one mining model is established for each target dimension feature, to obtain multiple mining models. For example, when performing user prediction for test object 1, mining model 1 is established for dimension feature 1 (age bracket), mining model 2 for dimension feature 2 (annual income range), and mining model 3 for dimension feature 3 (wealth attributes). In mode two, the at least one target dimension feature is taken as combination features, and the preset model is trained with the historical data corresponding to the combination features, to establish multiple mining models. For example, dimension feature 1, dimension feature 2, and dimension feature 3 are combined to generate multiple combination features, and the random forest model is trained with the historical data corresponding to each combination feature, to obtain a mining model for each combination feature. The weights of dimension feature 1, dimension feature 2, and dimension feature 3 differ between the combination features, but the corresponding historical data is the same. For example, mining model 1 is generated from the historical data with a weight of 0.4 for dimension feature 1, 0.2 for dimension feature 2, and 0.3 for dimension feature 3; mining model 2 is generated from the historical data with a weight of 0.1 for dimension feature 1, 0.3 for dimension feature 2, and 0.6 for dimension feature 3; and mining model 3 is generated from the historical data with a weight of 0.2 for dimension feature 1, 0.3 for dimension feature 2, and 0.5 for dimension feature 3.
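The two ways of building the candidate models can be sketched with a pluggable training function. This is an illustration under several assumptions: `train_fn` stands for whatever fits the preset model, the numeric encoding of the three features is made up, and scaling each column by its weight is only one possible reading, since the description does not say how the weights enter training (a tree-based model is insensitive to positive rescaling of a column, so a real implementation would likely apply the weights differently).

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]
TrainFn = Callable[[List[List[float]], List[int]], object]


def train_per_feature(rows: Sequence[Row], labels: List[int],
                      features: Sequence[str], train_fn: TrainFn) -> Dict[str, object]:
    """Mode one: one mining model per target dimension feature."""
    return {f: train_fn([[row[f]] for row in rows], labels) for f in features}


def train_per_weight_set(rows: Sequence[Row], labels: List[int],
                         weight_sets: Sequence[Dict[str, float]],
                         train_fn: TrainFn) -> List[object]:
    """Mode two: one mining model per weight assignment, all on the same history."""
    models = []
    for weights in weight_sets:
        # Assumption: the weight scales the feature column before training.
        X = [[row[f] * w for f, w in weights.items()] for row in rows]
        models.append(train_fn(X, labels))
    return models


# Weight assignments quoted in the description for mining models 1 to 3.
WEIGHT_SETS = [
    {"age_bracket": 0.4, "annual_income": 0.2, "wealth": 0.3},
    {"age_bracket": 0.1, "annual_income": 0.3, "wealth": 0.6},
    {"age_bracket": 0.2, "annual_income": 0.3, "wealth": 0.5},
]


def record_inputs(X: List[List[float]], y: List[int]) -> Dict[str, list]:
    """Stand-in for real training: it only records what it was given."""
    return {"X": X, "y": y}


rows = [
    {"age_bracket": 3.0, "annual_income": 50.0, "wealth": 2.0},
    {"age_bracket": 5.0, "annual_income": 20.0, "wealth": 1.0},
]
labels = [1, 0]

single = train_per_feature(rows, labels, ["age_bracket", "annual_income", "wealth"],
                           record_inputs)
combined = train_per_weight_set(rows, labels, WEIGHT_SETS, record_inputs)
print(sorted(single), len(combined))
```

Passing the training routine in as a callback keeps the sketch independent of any particular library; a random forest trainer could be substituted for `record_inputs`.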
After the multiple mining models are determined, a final target mining model can be determined based on the mining models generated. The target mining model is the final model used to determine target users. Target users may include, for example, high-value users or potential users for the test object; high-value users are used as the example here.
The step of determining the target mining model based on the multiple mining models comprises: testing the multiple mining models, and determining a mining model whose test value is greater than a preset value as the target mining model. Specifically, each of the established mining models can be tested with the feature data of multiple preset users for the test object, to obtain a test value. The preset users may include multiple known high-value users. On this basis, the data of the known high-value users can be input into each mining model for analysis, to determine whether the model still identifies each of them as a high-value user. A mining model whose test value is greater than the preset value can then be determined as the target mining model. The preset value can be configured according to actual needs; to guarantee the accuracy of the verification, it can be set to a relatively large value, for example 90% or above. The mining model corresponding to the dimension feature with the smallest error can then be determined as the final target mining model, or the mining model corresponding to the group of weights with the smaller error can be determined as the target mining model. For example, for the mining models established separately for each target dimension feature, the test value obtained by inputting the data of the known high-value users into each mining model can be calculated separately, and the mining model with the highest accuracy is determined as the target mining model. If several mining models share the highest accuracy, they can be linearly combined to obtain the target mining model, for example by setting the weights of all these mining models to the same value such that the weights sum to 1. As another example, if, among the mining models established for the combination features, the accuracy of mining model 1 is 80%, the accuracy of mining model 2 is 90%, and the accuracy of mining model 3 is 92%, then mining model 3, which corresponds to the combination feature with a weight of 0.2 for dimension feature 1, 0.3 for dimension feature 2, and 0.5 for dimension feature 3, can be taken as the target mining model.
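The selection rule can be sketched as follows, taking the test value to be the share of known high-value users that a model still recognizes. The function name, the representation of a model as a boolean-returning callable, and the toy models are assumptions for this example; the accuracies 80%, 90%, and 92% mirror the figures in the description.

```python
from typing import Callable, Dict, Sequence


def select_target_model(models: Dict[str, Callable[[object], bool]],
                        known_high_value_users: Sequence[object],
                        preset_value: float = 0.9) -> Dict[str, float]:
    """Return {model name: weight} for the target mining model.

    A single winner gets weight 1.0; models tied for the highest test value
    share equal weights that sum to 1 (an equal-weight linear combination).
    An empty dict means no model exceeded the preset value.
    """
    n_users = len(known_high_value_users)
    test_values = {
        name: sum(1 for user in known_high_value_users if predict(user)) / n_users
        for name, predict in models.items()
    }
    passing = {name: v for name, v in test_values.items() if v > preset_value}
    if not passing:
        return {}
    best = max(passing.values())
    winners = [name for name, v in passing.items() if v == best]
    return {name: 1.0 / len(winners) for name in winners}


def recognises_first(k: int) -> Callable[[int], bool]:
    """Toy model that recognizes users 0..k-1 as high-value."""
    return lambda user: user < k


users = list(range(50))  # 50 known high-value users, identified by index
models = {
    "mining_model_1": recognises_first(40),  # 40/50 = 80%
    "mining_model_2": recognises_first(45),  # 45/50 = 90%
    "mining_model_3": recognises_first(46),  # 46/50 = 92%
}
print(select_target_model(models, users, preset_value=0.9))  # {'mining_model_3': 1.0}
```

Mining model 2 is excluded because its test value equals, rather than exceeds, the preset value.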
In this exemplary embodiment, multiple dimension features are combined with the random forest algorithm, and random tests are carried out with high-value users, to obtain the target mining model used to evaluate users' historical data; this broadens the analysis dimensions and improves the accuracy of the model.
In step S140, target users are determined from all users through the target mining model.
In this exemplary embodiment, target users are the high-value users for the test object. The historical data of all users remaining after data cleansing can be input into the target mining model determined in step S130, to calculate the classification probability of each user's historical data. The classification probability here represents the probability that a user is a target user; the larger the classification probability, the more likely the user is a target user. In this exemplary embodiment, when the classification probability calculated by the target mining model from a user's historical data is greater than a preset threshold, the user is considered a target user. The preset threshold can be any value and can be set according to actual needs. For example, if the feature data of user 1 is input into the target mining model and the classification probability obtained is greater than the preset threshold of 0.5, user 1 is considered a target user for test object 1. The target mining model allows high-value users to be mined more accurately.
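The thresholding in step S140 can be sketched as follows. The function name, the `predict_probability` callback standing in for the target mining model, and the toy data are assumptions for this illustration; the threshold of 0.5 is the example value from the description.

```python
from typing import Callable, Dict, List


def find_target_users(histories: Dict[str, Dict[str, float]],
                      predict_probability: Callable[[Dict[str, float]], float],
                      threshold: float = 0.5) -> List[str]:
    """Return the ids of users whose classification probability exceeds the threshold."""
    return [
        user_id
        for user_id, features in histories.items()
        if predict_probability(features) > threshold
    ]


def toy_probability(features: Dict[str, float]) -> float:
    """Hypothetical stand-in for the target mining model's probability output."""
    return features["activity"]


histories = {
    "user1": {"activity": 0.8},
    "user2": {"activity": 0.5},  # equal to the threshold, so not selected
    "user3": {"activity": 0.2},
}
print(find_target_users(histories, toy_probability, threshold=0.5))  # ['user1']
```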
The detailed process of usage mining by a machine learning algorithm is shown in Fig. 2 and comprises the following steps:
In step S20, multiple mining models are established according to multiple target dimension features. Specifically: in step S21, the preset model is trained with the historical data of each target dimension feature, so that one mining model is established for each target dimension feature; in step S22, the target dimension features are taken as a combined feature, and the preset model is trained on the historical data corresponding to the combined feature, to establish multiple mining models.
Following step S21 and step S22, the target mining model is obtained in step S23: among the multiple mining models, the mining model with the highest accuracy is determined as the target mining model.
Step S24 is performed after step S23: the historical data of a user is input into the target mining model to obtain a classification probability, which is used to judge whether the user is a target user.
Steps S20 to S24 of the present exemplary embodiment can improve the precision and efficiency of usage mining.
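Steps S20 to S24 can be illustrated with the following minimal Python sketch. The text here does not specify the preset model, so a simple nearest-centroid scorer is swapped in as a stand-in; the feature names, the toy data, and all function names are likewise hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]
Model = Callable[[Row], float]  # returns a score in [0, 1]


def train_centroid_model(rows: Sequence[Row], labels: Sequence[int],
                         features: Sequence[str]) -> Model:
    """Stand-in for the 'preset model': score by relative distance to class centroids."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    c_pos = {f: sum(r[f] for r in pos) / len(pos) for f in features}
    c_neg = {f: sum(r[f] for r in neg) / len(neg) for f in features}

    def predict(row: Row) -> float:
        d_pos = math.sqrt(sum((row[f] - c_pos[f]) ** 2 for f in features))
        d_neg = math.sqrt(sum((row[f] - c_neg[f]) ** 2 for f in features))
        total = d_pos + d_neg
        # Closer to the positive centroid gives a higher score.
        return 0.5 if total == 0 else d_neg / total

    return predict


def accuracy(model: Model, rows: Sequence[Row], labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """Fraction of rows whose thresholded score matches the label."""
    hits = sum(int(model(r) > threshold) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def build_and_select(train_rows, train_labels, test_rows, test_labels,
                     features: Sequence[str]) -> Tuple[Tuple[str, ...], Model, Dict]:
    models: Dict[Tuple[str, ...], Model] = {}
    for f in features:  # S21: one mining model per target dimension feature
        models[(f,)] = train_centroid_model(train_rows, train_labels, [f])
    # S22: one mining model on the combined feature
    models[tuple(features)] = train_centroid_model(train_rows, train_labels, list(features))
    scores = {name: accuracy(m, test_rows, test_labels) for name, m in models.items()}
    best = max(scores, key=scores.get)  # S23: highest accuracy wins
    return best, models[best], scores


# Hypothetical toy data: label 1 marks a known high-value user.
train_rows: List[Row] = [
    {"premium": 9.0, "logins": 3.0}, {"premium": 8.0, "logins": 7.0},
    {"premium": 2.0, "logins": 4.0}, {"premium": 1.0, "logins": 6.0},
]
train_labels = [1, 1, 0, 0]
test_rows: List[Row] = [
    {"premium": 7.0, "logins": 2.0}, {"premium": 3.0, "logins": 8.0},
    {"premium": 6.0, "logins": 5.0}, {"premium": 0.5, "logins": 5.0},
]
test_labels = [1, 0, 1, 0]

best_name, target_model, scores = build_and_select(
    train_rows, train_labels, test_rows, test_labels, ["premium", "logins"])

# S24: classify a new user with the selected target mining model.
is_target = target_model({"premium": 7.5, "logins": 4.0}) > 0.5
```

Accuracy is measured on rows held out from training so that the comparison in S23 is not biased toward models that merely memorize the training data; whether the original method uses a held-out set in this way is an assumption of the sketch.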
The disclosure also provides a usage mining device based on machine learning. As shown in Fig. 3, the device 300 may include:
a data acquisition module 301, configured to obtain the historical data of all users through multiple dimensional features;
a feature selection module 302, configured to perform feature selection on the multiple dimensional features to obtain at least one target dimension feature associated with the test object;
a model building module 303, configured to train a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and to determine a target mining model based on the multiple mining models;
a user determining module 304, configured to determine target users from all users by means of the target mining model.
It should be noted that the specific details of each module in the above usage mining device based on machine learning have already been described in detail in the corresponding usage mining method based on machine learning, and are therefore not repeated here.
It should also be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the disclosure, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units.
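As an illustration of how the four modules of device 300 pass data to one another, the following Python sketch wires them together as interchangeable callables. The class name, type aliases, and toy stand-in functions are all hypothetical and are not taken from the disclosure; the toy model in particular is a placeholder, not the trained mining model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

History = Dict[str, Dict[str, float]]          # user id -> {feature name: value}
Model = Callable[[Dict[str, float]], float]    # returns a classification probability


@dataclass
class UsageMiningDevice:
    acquire: Callable[[], History]                           # data acquisition module 301
    select_features: Callable[[History], List[str]]          # feature selection module 302
    build_model: Callable[[History, List[str]], Model]       # model building module 303
    determine_users: Callable[[History, Model], List[str]]   # user determining module 304

    def run(self) -> List[str]:
        history = self.acquire()
        features = self.select_features(history)
        model = self.build_model(history, features)
        return self.determine_users(history, model)


# Toy stand-ins for each module, for illustration only.
def toy_acquire() -> History:
    return {"u1": {"premium": 9.0, "logins": 2.0},
            "u2": {"premium": 1.0, "logins": 8.0}}


def toy_select(history: History) -> List[str]:
    return ["premium"]


def toy_build(history: History, features: List[str]) -> Model:
    f = features[0]
    top = max(row[f] for row in history.values())
    # Placeholder scoring: feature value relative to the largest observed value.
    return lambda row: row[f] / top if top else 0.0


def toy_determine(history: History, model: Model) -> List[str]:
    return [uid for uid, row in history.items() if model(row) > 0.5]


device = UsageMiningDevice(toy_acquire, toy_select, toy_build, toy_determine)
targets = device.run()
```

Because each module is just a callable with a fixed input and output, two of them can be merged into one function, or one can be split into several, without changing `run`, which mirrors the point that the module division is not mandatory.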
In addition, although the steps of the method in this disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all of the illustrated steps must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
In an exemplary embodiment of the disclosure, an electronic device capable of implementing the above method is also provided.
Those of ordinary skill in the art will understand that various aspects of the invention may be implemented as a system, a method, or a program product. Accordingly, various aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
The electronic device 400 according to this embodiment of the invention is described below with reference to Fig. 4. The electronic device 400 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of embodiments of the invention.
As shown in Fig. 4, the electronic device 400 takes the form of a general-purpose computing device. The components of the electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, a bus 430 connecting different system components (including the storage unit 420 and the processing unit 410), and a display unit 440.
The storage unit stores program code that can be executed by the processing unit 410, so that the processing unit 410 performs the steps of the various exemplary embodiments of the invention described in the "Exemplary methods" section of this specification. For example, the processing unit 410 may perform the steps shown in Fig. 1: in step S110, the historical data of all users is obtained through multiple dimensional features; in step S120, feature selection is performed on the multiple dimensional features to obtain at least one target dimension feature associated with the test object; in step S130, a preset model is trained with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and a target mining model is determined based on the multiple mining models; in step S140, target users are determined from all users by means of the target mining model.
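Of the steps listed above, the feature selection in step S120 can be sketched as a ranking over per-feature importance scores, keeping the highest-scoring features. The claims name a random forest algorithm as the source of these scores; computing them is outside this sketch, so the scores, feature names, and function name below are hypothetical.

```python
from typing import Dict, List


def select_target_features(importance: Dict[str, float], k: int) -> List[str]:
    """Rank dimensional features by importance score (descending) and keep the top k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return ranked[:k]


# Hypothetical importance scores of each dimensional feature for a test object.
importance_scores = {
    "age": 0.12,
    "annual_premium": 0.41,
    "app_logins": 0.27,
    "region": 0.05,
}
target_features = select_target_features(importance_scores, k=2)
print(target_features)
```

The cutoff `k` is a free parameter in this sketch; a score threshold would serve equally well if a fixed number of features is not required.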
The storage unit 420 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 4201 and/or a cache memory unit 4202, and may further include a read-only memory unit (ROM) 4203.
The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205. Such program modules 4205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 430 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The display unit 440 may be a display with a display function, used to present the processing results obtained by the processing unit 410 executing the method of the present exemplary embodiment. The display includes, but is not limited to, a liquid crystal display or another type of display.
The electronic device 400 may also communicate with one or more external devices 600 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (e.g., a router, a modem, etc.) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 450. In addition, the electronic device 400 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 460. As shown, the network adapter 460 communicates with the other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to embodiments of the disclosure may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions that cause a computing device (which may be a personal computer, a server, a terminal device, a network device, etc.) to perform the method according to embodiments of the disclosure.
In an exemplary embodiment of the disclosure, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the method described above in this specification. In some possible embodiments, various aspects of the invention may also be implemented in the form of a program product comprising program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the various exemplary embodiments of the invention described in the "Exemplary methods" section of this specification.
Referring to Fig. 5, a program product 500 for implementing the above method according to an embodiment of the invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the invention is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by, or in connection with, an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
The program code for carrying out the operations of the invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are merely schematic illustrations of the processing included in the method according to exemplary embodiments of the invention, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of that processing. It is also easy to understand that the processing may be performed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the claims.

Claims (10)

1. A usage mining method based on machine learning, characterized by comprising:
obtaining the historical data of all users through multiple dimensional features;
performing feature selection on the multiple dimensional features to obtain at least one target dimension feature associated with a test object;
training a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and determining a target mining model based on the multiple mining models;
determining target users from all users by means of the target mining model.
2. The usage mining method based on machine learning according to claim 1, characterized in that the method further comprises:
performing data cleaning on the historical data of all users.
3. The usage mining method based on machine learning according to claim 1, characterized in that performing feature selection on the multiple dimensional features to obtain at least one target dimension feature associated with the test object comprises:
calculating, by a random forest algorithm, importance scores of the multiple dimensional features relative to the test object, and determining the at least one target dimension feature according to the importance scores in descending order.
4. The usage mining method based on machine learning according to claim 1, characterized in that training a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction comprises:
training the preset model with the historical data of each target dimension feature, and establishing one mining model for each target dimension feature, to obtain multiple mining models; or
taking the at least one target dimension feature as a combined feature, and training the preset model on the historical data corresponding to the combined feature, to establish multiple mining models.
5. The usage mining method based on machine learning according to claim 1, characterized in that determining a target mining model based on the multiple mining models comprises:
testing the multiple mining models, and determining a mining model whose test value is greater than a preset value as the target mining model.
6. The usage mining method based on machine learning according to claim 5, characterized in that determining a mining model whose test value is greater than a preset value as the target mining model comprises:
testing the multiple mining models with the data of preset users to obtain test values;
determining the mining model whose test value is greater than the preset value as the target mining model.
7. The usage mining method based on machine learning according to claim 1, characterized in that determining target users from all users by means of the target mining model comprises:
analyzing the historical data of all users with the target mining model to calculate classification probabilities;
determining the target users for the test object according to the classification probabilities.
8. A usage mining device based on machine learning, characterized by comprising:
a data acquisition module, configured to obtain the historical data of all users through multiple dimensional features;
a feature selection module, configured to perform feature selection on the multiple dimensional features to obtain at least one target dimension feature associated with a test object;
a model building module, configured to train a preset model with the historical data of the at least one target dimension feature to establish multiple mining models for user prediction, and to determine a target mining model based on the multiple mining models;
a user determining module, configured to determine target users from all users by means of the target mining model.
9. An electronic device, characterized by comprising:
a processor; and
a memory, configured to store instructions executable by the processor;
wherein the processor is configured to perform, by executing the executable instructions, the usage mining method based on machine learning according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the usage mining method based on machine learning according to any one of claims 1 to 7.
CN201811062830.1A 2018-09-12 2018-09-12 Usage mining method and device based on machine learning, electronic equipment, medium Pending CN110059112A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811062830.1A CN110059112A (en) 2018-09-12 2018-09-12 Usage mining method and device based on machine learning, electronic equipment, medium

Publications (1)

Publication Number Publication Date
CN110059112A true CN110059112A (en) 2019-07-26

Family

ID=67314971

Country Status (1)

Country Link
CN (1) CN110059112A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination