CN110378739A - Data traffic matching method and device - Google Patents

Data traffic matching method and device

Info

Publication number
CN110378739A
Authority
CN
China
Prior art keywords
matching result
user
training
matching
test sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910668490.5A
Other languages
Chinese (zh)
Other versions
CN110378739B (en
Inventor
崔羽飞
张第
刘颖慧
张溶芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN201910668490.5A
Publication of CN110378739A
Application granted
Publication of CN110378739B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G06Q30/0271 Personalized advertisement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/60 Business processes related to postal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a data traffic matching method and device. The method comprises: inputting a test sample into a first training model library to obtain a first matching result, the test sample including user data; sending the first matching result to a core cloud server; receiving a second matching result returned by the core cloud server, where the second matching result is obtained by inputting the test sample into a second training model library, and the second training model library is obtained by training according to the first matching result; and fusing the first matching result and the second matching result to obtain a final matching result, which includes the data traffic type to be ordered by the user and the data traffic value to be ordered by the user. Fusing the two results yields a final matching result that is more accurate and truly reflects the user's individual needs, so the operator can use it to recommend a data traffic package better suited to the user, improving user experience.

Description

Data traffic matching method and device
Technical field
The present invention relates to the field of computers, and in particular to a data traffic matching method and device.
Background
With the rapid development of the mobile Internet, people widely use social software such as WeChat and QQ to communicate. Meanwhile, users access the Internet through mobile terminals to obtain the information they need. Both trends have significantly increased users' demand for data services.
Currently, the traffic services used by mobile terminal users all take the form of traffic packages that users themselves apply for and order from the operator. That is, the operator offers a variety of data traffic packages, and the user selects and subscribes to one of them according to his or her own demand for data services. In real life, however, different users have different data traffic needs, and users faced with many data traffic packages find it hard to choose. If the package the user selects contains too much data traffic, traffic is wasted; if it contains too little, the user runs out of available data traffic before the end of the month and data communication is interrupted. Users cannot accurately summarize their own data traffic usage, and operators cannot recommend more reasonable data traffic packages to them, which leads to a poor user experience.
Summary of the invention
To this end, the present invention provides a data traffic matching method and device to solve the prior-art problem of poor user experience caused by the inability to recommend suitable data traffic services according to the user's individual needs. The problem to be solved by the present invention is how to recommend personalized data traffic services to the user more accurately.
To achieve the above object, a first aspect of the present invention provides a service traffic matching method, comprising: inputting a test sample into a first training model library to obtain a first matching result, the test sample including user data; sending the first matching result to a core cloud server; receiving a second matching result returned by the core cloud server, where the second matching result is obtained by inputting the test sample into a second training model library, and the second training model library is obtained by training according to the first matching result; and fusing the first matching result and the second matching result to obtain a final matching result, which includes the data traffic type to be ordered by the user and the data traffic value to be ordered by the user.
The user data includes at least two of: the data traffic package the user has subscribed to, the data traffic type actually used by the user, the data traffic value actually used by the user, and the user's phone charges.
The step of inputting the test sample into the first training model library to obtain the first matching result comprises: performing the following operations on each classification model in the first training model library: cross-validating the classification model using a training sample to obtain a predicted training sample of the classification model, and testing the classification model using the test sample to obtain a predicted test sample corresponding to the classification model; and determining the first matching result according to the first training model library and the predicted training sample and predicted test sample corresponding to each classification model in the first training model library.
The step of determining the first matching result according to the first training model library and the predicted training sample and predicted test sample corresponding to each classification model in the first training model library comprises: stacking at least two predicted training samples to obtain a new training sample; determining a new test sample according to each predicted test sample; and determining the first matching result according to the new training sample, the first training model library and the new test sample.
The step of determining the first matching result according to the new training sample, the first training model library and the new test sample comprises: using exhaustive search, training any model in the first training model library with the new training sample to obtain an optimal training model; and inputting the new test sample into the optimal training model to obtain the first matching result.
The first training model library includes at least two of the following classification models: a random forest classification model, a decision tree classification model, an extreme gradient boosting data model, and a logistic regression classification model.
To achieve the above object, a second aspect of the present invention provides a data traffic matching method, comprising: receiving a first matching result sent by an edge cloud server and putting the first matching result into a training sample to update the training sample, the test sample including user data; training on the updated training sample to obtain a second training model library; inputting the test sample into the second training model library to obtain a second matching result, the test sample including user data; and sending the second matching result to the edge cloud server, so that the edge cloud server can fuse the second matching result and the first matching result to obtain a final matching result.
The user data includes at least two of: the data traffic package the user has subscribed to, the data traffic type actually used by the user, the data traffic value actually used by the user, and the user's phone charges.
To achieve the above object, a third aspect of the present invention provides a data traffic matching device, comprising: a first obtaining module, configured to input a test sample into a first training model library to obtain a first matching result, the test sample including user data; a first sending module, configured to send the first matching result to a core cloud server; a first receiving module, configured to receive a second matching result returned by the core cloud server, where the second matching result is obtained by inputting the test sample into a second training model library, and the second training model library is obtained by training according to the first matching result; and a fusion module, configured to fuse the first matching result and the second matching result to obtain a final matching result, which includes the data traffic type to be ordered by the user and the data traffic value to be ordered by the user.
To achieve the above object, a fourth aspect of the present invention provides a data traffic matching device, comprising: a second receiving module, configured to receive a first matching result sent by an edge cloud server and put the first matching result into a training sample to update the training sample, the test sample including user data; a training module, configured to train on the user data in the updated training sample to obtain a second training model library; a second obtaining module, configured to input the test sample into the second training model library to obtain a second matching result, the test sample including user data; and a second sending module, configured to send the second matching result to the edge cloud server, so that the edge cloud server can fuse the second matching result and the first matching result to obtain a final matching result.
The present invention has the following advantages. Inputting the test sample into the first training model library gives a preliminary first matching result that can satisfy the user's needs. Because the core cloud server can aggregate more data, the second matching result it returns is more accurate. Fusing the first matching result, which reflects how the users on one edge cloud server use data traffic, with the second matching result, which reflects how all users use data traffic, yields a final matching result that is more accurate and truly reflects the user's individual needs. When recommending a data traffic package, the operator can therefore use the final matching result to recommend a package better suited to the user, providing better service and improving user experience.
The user data includes the data traffic type actually used by the user. Inputting a test sample that contains the user's data traffic type into the first training model library gives a first matching result that reflects the user's data traffic needs across different services. Fusing it with the second matching result then yields a more accurate final matching result.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the invention and form part of the specification. Together with the following specific embodiments they serve to explain the invention, but they do not limit it.
Fig. 1 is a flow diagram of a data traffic matching method provided in the first embodiment of the invention;
Fig. 2 is a flow diagram of a data traffic matching method provided in the second embodiment of the invention;
Fig. 3 is a flow diagram of a data traffic matching method provided in the second embodiment of the invention;
Fig. 4 is a flow diagram of a data traffic matching method provided in the third embodiment of the invention;
Fig. 5 is a block diagram of a data traffic matching device provided in the fourth embodiment of the invention;
Fig. 6 is a block diagram of a data traffic matching device provided in the fifth embodiment of the invention;
Fig. 7 is a block diagram of a data traffic matching device provided in the fifth embodiment of the invention.
In the drawings:
501: first obtaining module    502: first sending module
503: first receiving module    504: fusion module
601: second receiving module   602: training module
603: second obtaining module   604: second sending module
701: edge cloud server        702: core cloud server
Detailed description of the embodiments
Specific embodiments of the invention are described in detail below with reference to the drawings. It should be understood that the specific embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
The first embodiment of the present invention relates to a data traffic matching method, used to recommend personalized data traffic services to the user more accurately according to the final matching result.
The implementation details of the service traffic matching method in this embodiment are described below. The following content is provided only to aid understanding of the implementation details of this solution and is not required to implement it.
Fig. 1 is a flowchart of the data traffic matching method in this embodiment. The method can be used on an edge cloud server.
It should be noted that an edge cloud server is a network node with few intermediate links between it and the end user's access point, so it offers good responsiveness and connection speed to the end user. The edge cloud server may be a node server of a content delivery network (Content Delivery Network, CDN) or a node server in the Internet. The edge cloud server can also store heavily accessed users and user data in a dedicated cache device, which improves data processing speed.
The method may include the following steps.
In step 101, a test sample is input into the first training model library to obtain a first matching result.
The test sample includes user data, and the user data includes at least two of: the data traffic package the user has subscribed to, the data traffic type actually used by the user, the data traffic value actually used by the user, and the user's phone charges.
It should be noted that the data traffic type actually used by the user may include local traffic, inter-provincial roaming traffic, international roaming traffic, and Hong Kong, Macao and Taiwan roaming traffic. The data traffic type actually used differs according to the user's needs.
In one specific implementation, the user data may also include the user's voice usage information, for example the call duration and corresponding phone charges when the user is the calling or called party. It should be noted that the user data can be set according to actual needs and is not limited to the examples above; other information not illustrated here also falls within the protection scope of the present invention and is not described further.
The first training model library includes at least two of the following classification models: a random forest classification model, a decision tree classification model, an extreme gradient boosting data model, and a logistic regression classification model.
It should be noted that the random forest classification model is a classifier that uses multiple trees to train on and predict samples. Logistic regression addresses binary classification problems, in which the predicted value y takes only two values (0 or 1). Binary classification can be extended to multi-class problems, so a logistic regression classification model can be trained on the above user data to obtain classification results quickly.
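The extension from binary to multi-class classification mentioned above can be sketched as a one-vs-rest scheme: one binary logistic model per traffic type, with the highest score winning. This is only one common way to make the extension; the patent does not say which one is used, and the class labels and fitted parameters below are hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping any real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_one_vs_rest(features, params_by_class):
    """Score each class with its own binary logistic model and pick the best.

    params_by_class maps a class label to (weights, bias). Both are assumed
    to be already fitted; the values used in any example are hypothetical.
    """
    scores = {}
    for label, (weights, bias) in params_by_class.items():
        z = sum(w * x for w, x in zip(weights, features)) + bias
        scores[label] = sigmoid(z)
    best = max(scores, key=scores.get)
    return best, scores
```

For example, `predict_one_vs_rest([2.0, 0.5], {"local": ([1.0, 0.0], 0.0), "roaming": ([0.0, 1.0], -1.0)})` selects `"local"`, because its binary model gives the larger probability.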
The decision tree (Decision Tree) classification model is a decision analysis method that, given the known probabilities of various outcomes, builds a decision tree to find the probability that the expected net present value is greater than or equal to zero, in order to assess project risk and judge feasibility. It is an intuitive graphical application of probability analysis. It is called a decision tree because the decision branches, when drawn, resemble the limbs of a tree. Each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class. In machine learning, a decision tree is a predictive model that represents a mapping between object attributes and object values.
The extreme gradient boosting data model is a data model built with the extreme gradient boosting (Extreme Gradient Boosting, XGBoost) algorithm. Boosting is a method for improving the accuracy of weak classification algorithms; its idea is to integrate many weak classifiers into one strong classifier. XGBoost is one kind of boosting algorithm and is a boosted tree model, so it integrates many tree models into one strong classifier.
It should be noted that the user data, which includes at least two of the data traffic package the user has subscribed to, the data traffic type actually used by the user, the data traffic value actually used by the user, and the user's phone charges, is preprocessed so that as many features as possible are extracted from the raw user data to obtain the test sample. The test sample is then input into the first training model library for testing, which gives a preliminary data traffic package type suited to the user, that is, the first matching result.
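As a rough illustration of the preprocessing step, the sketch below flattens one raw user record into a numeric feature vector. The field names, the list of traffic types and the derived "utilisation" feature are assumptions made for this example; the patent does not specify a feature layout.

```python
from dataclasses import dataclass

# Hypothetical labels, based on the traffic types listed in the description.
TRAFFIC_TYPES = [
    "local",
    "inter_provincial_roaming",
    "international_roaming",
    "hk_macau_taiwan_roaming",
]

@dataclass
class UserRecord:
    subscribed_package_mb: float  # size of the package the user has subscribed to
    used_mb_by_type: dict         # traffic actually used, keyed by traffic type
    phone_charges: float          # the user's phone charges

def to_feature_vector(record: UserRecord) -> list:
    """Flatten one raw record into a fixed-length numeric feature vector."""
    usage = [float(record.used_mb_by_type.get(t, 0.0)) for t in TRAFFIC_TYPES]
    total = sum(usage)
    # Share of the subscribed package actually consumed (0 if there is no package).
    if record.subscribed_package_mb:
        utilisation = total / record.subscribed_package_mb
    else:
        utilisation = 0.0
    return [record.subscribed_package_mb, *usage, total, utilisation,
            record.phone_charges]
```

A record with a 1000 MB package, 600 MB of local traffic and 200 MB of international roaming traffic yields an eight-element vector whose utilisation feature is 0.8.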
In step 102, the first matching result is sent to the core cloud server.
It should be noted that the first matching result is obtained by modeling and testing on the user data collected on the edge cloud server, so it can only reflect how the users on that one edge cloud server use data traffic. The first matching result therefore needs to be sent to the core cloud server to update the core cloud server's training sample, so that a more accurate data traffic package type can be obtained for the user.
In step 103, the second matching result returned by the core cloud server is received.
The second matching result is obtained by inputting the test sample into the second training model library, and the second training model library is obtained by training according to the first matching result.
It should be noted that after receiving the first matching result, the core cloud server updates its own training sample and then trains on the updated training sample to obtain the second training model library. The second training model library includes at least two of the following classification models: a random forest classification model, a decision tree classification model, an extreme gradient boosting data model, and a logistic regression classification model. The test sample is input into the second training model library for testing to obtain the second matching result. Because the training sample of the core cloud server aggregates the features of the user data of all edge cloud servers, the second matching result can reflect all users' demand for data traffic.
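The core-side sequence (receive the first matching result, update the training sample, retrain, return a second matching result) might be sketched as follows. The class name, method names and the trivial "majority label" model are all hypothetical stand-ins; a real second training model library would contain the classifiers named above.

```python
class CoreCloudServer:
    """Toy stand-in for the core cloud server; 'training' is a label count."""

    def __init__(self, training_samples):
        # Each training sample is (features, label); labels are traffic types.
        self.training_samples = list(training_samples)
        self.model = None

    def receive_first_result(self, test_features, first_result):
        # Put the edge server's first matching result into the training sample,
        # then retrain on the updated sample.
        self.training_samples.append((test_features, first_result))
        self.retrain()

    def retrain(self):
        # Placeholder for training the second training model library:
        # count how often each label occurs in the updated training sample.
        counts = {}
        for _, label in self.training_samples:
            counts[label] = counts.get(label, 0) + 1
        self.model = counts

    def second_result(self, test_features):
        # Placeholder prediction that ignores the features and returns the
        # most frequent label across all pooled samples.
        return max(self.model, key=self.model.get)
```

The point of the sketch is the data flow: the core server pools results from edge servers before it trains, which is why its answer can reflect more users than any single edge server sees.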
In step 104, the first matching result and the second matching result are fused to obtain the final matching result.
The final matching result includes the data traffic type to be ordered by the user and the data traffic value to be ordered by the user.
It should be noted that the first matching result, which reflects how the users on one edge cloud server use data traffic, is fused with the second matching result, which reflects all users' demand for data traffic. For example, the two results can be combined by voting or by weighted averaging, which gives a more accurate final matching result.
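A minimal sketch of the fusion step, assuming each matching result carries a score per traffic type plus a traffic value. The result layout and the `weight_first` parameter are assumptions for illustration; the patent only names voting and weighted averaging as possible approaches.

```python
def fuse_results(first, second, weight_first=0.5):
    """Fuse two matching results.

    Each result is a dict with 'type_scores' (traffic type -> score) and
    'value' (traffic amount). weight_first is a hypothetical tuning parameter
    giving the weight of the edge server's result.
    """
    w1, w2 = weight_first, 1.0 - weight_first
    types = set(first["type_scores"]) | set(second["type_scores"])
    fused_scores = {
        t: w1 * first["type_scores"].get(t, 0.0)
           + w2 * second["type_scores"].get(t, 0.0)
        for t in types
    }
    # Soft vote: the traffic type with the highest combined score wins.
    # Ties are broken alphabetically so the outcome is deterministic.
    best_type = min(fused_scores, key=lambda t: (-fused_scores[t], t))
    # Weighted average of the two predicted traffic values.
    fused_value = w1 * first["value"] + w2 * second["value"]
    return {"type": best_type, "value": fused_value}
```

With equal weights, an edge result of 800 MB and a core result of 1200 MB fuse to 1000 MB; shifting `weight_first` toward 0 makes the core server's result dominate.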
Currently, each of an operator's existing data traffic packages usually specifies data traffic types and a data traffic value for each type. The data traffic types in the existing packages are matched against the data traffic type to be ordered by the user in the final matching result, and the data traffic value for each type in the existing packages is matched against the data traffic value to be ordered by the user in the final matching result. In this way a data traffic package better suited to the user can be recommended, providing better service and improving user experience.
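Matching the final result against existing packages could look like the sketch below. The package layout and the selection rule (prefer the smallest package that covers the predicted value, otherwise the closest one) are assumptions; the patent states only that type and value are matched.

```python
def recommend_package(final_result, packages):
    """Pick the package of the matching traffic type whose allowance is closest
    to the predicted value, preferring packages that cover the predicted value.

    Returns None when no package offers the predicted traffic type.
    """
    same_type = [p for p in packages if p["type"] == final_result["type"]]
    if not same_type:
        return None
    target = final_result["value"]
    covering = [p for p in same_type if p["value"] >= target]
    pool = covering or same_type  # fall back to all same-type packages
    return min(pool, key=lambda p: abs(p["value"] - target))
```

Preferring a covering package reflects the concern raised in the background section: a package that is too small interrupts service, while one that is far too large wastes traffic.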
In this embodiment, inputting the test sample into the first training model library gives a preliminary first matching result that can satisfy the user's needs. Because the core cloud server can aggregate more data, the second matching result it returns is more accurate. Fusing the first matching result, which reflects how the users on one edge cloud server use data traffic, with the second matching result, which reflects how all users use data traffic, yields a final matching result that is more accurate and truly reflects the user's individual needs. When recommending a data traffic package, the operator can therefore use the final matching result to recommend a package better suited to the user, providing better service and improving user experience.
The second embodiment of the present invention relates to a data traffic matching method. The second embodiment is roughly the same as the first; the main difference is that the predicted training sample and predicted test sample of each classification model are obtained by cross-validation, a new training sample is then obtained by stacking, and the new training sample is input into the first training model library for training to obtain the first matching result.
Fig. 2 is a flowchart of the data traffic matching method in this embodiment. The method can be used on an edge cloud server and may include the following steps.
In step 201, each classification model in the first training model library is cross-validated using the training sample to obtain the predicted training sample of each classification model.
It should be noted that the training sample is divided into five equal parts. Any four of the parts are used as the training subsample and the remaining part is used as the test subsample. Any one classification model in the first training model library is trained using the training subsample to obtain the predicted training sample of that classification model.
For example, the training sample contains 5 subsamples. Four subsamples are taken at random as the training subsample and used to train the decision tree classification model, which gives the predicted training sample corresponding to the decision tree classification model.
In step 202, each classification model in the first training model library is tested using the test sample to obtain the predicted test sample corresponding to each classification model.
It should be noted that the one part of the training sample left over after four parts are taken in step 201 is used as the test subsample. Any one classification model in the first training model library is tested using the test subsample to obtain the predicted test sample of the corresponding classification model.
For example, the training sample contains 5 subsamples. Four subsamples are taken at random as the training subsample and the remaining subsample is used as the test subsample. The decision tree classification model is tested using the test subsample, which gives the predicted test sample corresponding to the decision tree classification model.
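The five-way split used in steps 201 and 202 can be sketched as a generator of index pairs. For simplicity this sketch uses contiguous folds and omits the random selection mentioned in the example; the function name is an assumption.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    In every round, k - 1 parts form the training subsample and the one
    remaining part forms the test subsample.
    """
    # Distribute any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size
```

Across the five rounds every sample appears in a test subsample exactly once, which is what later allows the per-fold predictions to be stacked into a full-length new training sample.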
In step 203, the first matching result is determined according to the first training model library and the predicted training sample and predicted test sample corresponding to each classification model in the first training model library.
In one specific implementation, at least two predicted training samples are stacked (stacking) to obtain a new training sample; a new test sample is determined according to each predicted test sample; and the first matching result is determined according to the new training sample, the first training model library and the new test sample.
Specifically, stacking at least two predicted training samples to obtain a new training sample may proceed as follows: the training sample is divided into five equal parts; in turn, any four parts are used as the training subsample and the remaining part as the test subsample; any one classification model in the first training model library is trained using the training subsample, giving 5 pre-training sets; these 5 pre-training sets are then stacked vertically to obtain the new training sample of the corresponding classification model.
Specifically, determining the new test sample according to each predicted test sample may proceed as follows: the one part of the training sample left over after four parts are taken in step 201 is used as the test subsample; any one classification model in the first training model library is tested using the test subsample, giving 5 different prediction test results; these 5 prediction test results are summed and averaged to obtain the new test sample of the corresponding classification model.
For example, the training sample contains 5 subsamples. Four subsamples are taken at random as the training subsample and the remaining subsample is used as the test subsample. The random forest classification model is trained using the training subsample, giving 5 pre-training sets; these 5 pre-training sets are then stacked vertically to obtain the new training sample of the random forest classification model.
The random forest classification model is tested using the test subsample, giving 5 prediction test results; these 5 prediction test results are then summed and averaged to obtain the new test sample of the random forest classification model.
Following the above operations, at least two classification models in the first training model library are each trained in this way, giving at least two new test samples and at least two new training samples. Finally, the first matching result is determined according to the at least two new training samples obtained above, the first training model library, and the at least two new test samples.
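The stacking procedure for a single classification model can be sketched as below, following the conventional reading of stacking: each fold's held-out predictions are placed back in order (the "vertical" stack) and the five predictions for the test sample are averaged. `MeanModel` is a hypothetical placeholder, not one of the patent's classifiers, and the function name is an assumption.

```python
from statistics import mean

class MeanModel:
    """Hypothetical placeholder model: predicts the mean label seen in training."""

    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, X):
        return [self.value for _ in X]

def stack_one_model(make_model, X_train, y_train, X_test, k=5):
    """Return (new_train_feature, new_test_feature) for one base model."""
    n = len(X_train)
    bounds = [i * n // k for i in range(k + 1)]  # fold boundaries
    oof = [None] * n   # out-of-fold predictions, stacked in original order
    test_preds = []    # one list of test-sample predictions per fold
    for i in range(k):
        lo, hi = bounds[i], bounds[i + 1]
        train_idx = [j for j in range(n) if j < lo or j >= hi]
        model = make_model().fit([X_train[j] for j in train_idx],
                                 [y_train[j] for j in train_idx])
        oof[lo:hi] = model.predict(X_train[lo:hi])  # predict the held-out part
        test_preds.append(model.predict(X_test))
    # Average the k predictions for each test row.
    new_test = [mean(column) for column in zip(*test_preds)]
    return oof, new_test
```

Running this once per base model and placing the resulting columns side by side gives the new training sample and new test sample that the second-level training uses.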
In one specific implementation, the step of determining the first matching result according to the new training sample, the first training model library, and the new test sample includes: using the exhaustive search (Grid Search) method, training any classification model in the first training model library with the new training sample to obtain an optimal training model; and inputting the new test sample into the optimal training model to obtain the first matching result.
It should be noted that exhaustive search means enumerating and checking the candidates one by one in some order and selecting those that meet the requirements as the final result. In the present embodiment, any classification model in the first training model library is trained as follows: cross-validation is performed on each classification model in the first training model library and the cross-validation results are stacked to obtain at least two new training samples; each classification model is then trained with these at least two new training samples; the final training results are compared, and the training model corresponding to the best training result is taken as the optimal training model; the at least two new test samples are then input into the optimal training model to obtain the first matching result.
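The exhaustive search described above can be illustrated with a short sketch. The parameter names `depth` and `n_trees` and the scoring function `toy_score` are hypothetical; in practice the score would come from validating a candidate model on the new training sample.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Hypothetical validation score that peaks at depth=4, n_trees=50.
    return -abs(params["depth"] - 4) - abs(params["n_trees"] - 50) / 100


best, score = grid_search({"depth": [2, 4, 8], "n_trees": [10, 50, 100]}, toy_score)
```

The cost grows with the product of the candidate list lengths (here 3 x 3 = 9 evaluations), which is the usual trade-off of an exhaustive search.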
In step 204, the first matching result is sent to the core cloud server.
In step 205, the second matching result returned by the core cloud server is received.
In step 206, the first matching result and the second matching result are fused to obtain the final matching result.
Here, the final matching result includes the data traffic type that the user intends to order and the data traffic value that the user intends to order.
It should be noted that steps 204~206 in the present embodiment are the same as steps 102~103 in the first embodiment and are not described again here.
In one specific implementation, Fig. 3 is a flowchart of a data traffic matching method in which the initial test sample and training sample are processed, the first matching result and the second matching result are obtained through two rounds of model training, and the two results are then fused to obtain the final matching result.
In step 301, raw data is preprocessed to obtain the initial test sample and training sample, and the first training model library is obtained at the same time.
It should be noted that the raw data is obtained by retrieving user data from different types of databases (for example, a data warehouse based on the Hadoop Distributed File System, HDFS). The user data may include: the data traffic package the user has ordered, the data traffic type the user actually uses (including local traffic, inter-provincial roaming traffic, international roaming traffic, and Hong Kong, Macao, and Taiwan roaming traffic), the data traffic value the user actually uses, and the user's phone charges; it may further include the user's voice usage information (for example, the call duration and the corresponding charges when the user is the calling or called party). The user data may be set according to actual conditions and is not limited to the above examples; other information not described here also falls within the protection scope of the present invention and is not described again.
The above user data is processed quickly by a database-based computing engine; that is, the different types of data in the user data undergo processing such as data conversion, data mining, and data layering so that the user data is standardized. The fields of the standardized user data that relate to the terminal used by the user are then sorted out to obtain the initial sample. The initial sample is then split (for example, at a ratio of 7:3, so that 70% of the data serves as the training sample and 30% as the test sample) to obtain the initial test sample and training sample, and the resulting test sample and training sample are stored in the HDFS system.
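As a minimal sketch of the 7:3 split mentioned above: the function name `split_samples`, the record fields `user_id` and `used_mb`, and the fixed seed are assumptions made for illustration, and storage in HDFS is left out.

```python
import random


def split_samples(records, train_ratio=0.7, seed=42):
    """Shuffle the initial sample and split it into (training_sample, test_sample)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical standardized user records.
initial_sample = [{"user_id": i, "used_mb": 100 * i} for i in range(10)]
training_sample, test_sample = split_samples(initial_sample)
```

Shuffling before cutting avoids a split that follows the storage order of the records, which could otherwise put all users of one kind into the same partition.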
It should be noted that if the given user data has few features, feature engineering can be performed on the existing data to extract as many features as possible from the raw data for use by the algorithms and models. Feature selection is first performed on the user data, feature dimensions are then constructed, and data with features in multiple dimensions is finally generated.
The first training model library is the set of training models obtained by selecting and optimizing at least two classification models from among a random forest classification model, a decision tree classification model, an extreme gradient boosting model, and a logistic regression classification model.
In step 302, each classification model in the first training model library is tested with the test sample to obtain the prediction test sample corresponding to each classification model.
In step 303, cross-validation is performed on each classification model in the first training model library with the training sample to obtain the prediction training sample of each classification model.
It should be noted that steps 302~303 are the same as steps 201~202 in the second embodiment and are not described again here.
In step 304, a new test sample is determined according to each prediction test sample.
In step 305, at least two prediction training samples are stacked to obtain a new training sample.
In step 306, using the exhaustive search method, any model in the first training model library is trained with the new training sample to obtain an optimal training model.
In step 307, the new test sample is input into the optimal training model to obtain the first matching result.
It should be noted that steps 304~307 are the same as step 203 in the second embodiment and are not described again here.
In step 308, the first matching result is sent to the core cloud server.
In step 309, the second matching result returned by the core cloud server is received.
In step 310, the first matching result and the second matching result are fused to obtain the final matching result.
It should be noted that steps 308~310 are the same as steps 102~103 in the first embodiment and are not described again here.
In the present embodiment, the prediction training sample and prediction test sample of each classification model in the first training model library are obtained by cross-validation, a new training sample is then obtained by stacking, and the new training sample and new test sample are input into the first training model library for training to obtain the first matching result. As a result, the first matching result is more accurate and truly reflects the individual needs of the user.
The third embodiment of the present invention relates to a data traffic matching method. Fig. 4 is a flowchart of the data traffic matching method in this embodiment, which can be used on a core cloud server. The method may include the following steps.
In step 401, the first matching result sent by an edge cloud server is received and put into the training sample to update the training sample.
It should be noted that a first matching result reflects only the data traffic package type desired by users under one edge cloud server. Multiple edge cloud servers therefore need to be pooled to obtain multiple first matching results, and these first matching results are put into the training sample so that the core cloud server obtains a more representative sample.
For example, the first edge cloud server collects the user data of the first province, and the first matching result 1 it obtains characterizes the data traffic package type desired by users in that province; the second edge cloud server collects the user data of the second province, and the first matching result 2 it obtains characterizes the data traffic package type desired by users in that province; ...; and the n-th edge cloud server collects the user data of the n-th province, and the first matching result n it obtains characterizes the data traffic package type desired by users in that province, where n is a natural number greater than or equal to 1. Putting first matching result 1, first matching result 2, ..., and first matching result n into the training sample updates the training sample of the core cloud server, so that the core cloud server can take account of the data traffic usage of users nationwide and thereby obtain a more accurate data traffic package type for users.
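The update of the core training sample described above might look like the following sketch. The record layout, the `edge_id` tag, and the province labels are hypothetical; the source does not specify how first matching results are encoded.

```python
def update_training_sample(training_sample, first_results_by_edge):
    """Append every edge server's first matching results to the core training sample."""
    updated = list(training_sample)  # leave the caller's list untouched
    for edge_id, results in sorted(first_results_by_edge.items()):
        for row in results:
            # Tag each row with its source so regional differences stay visible.
            updated.append({**row, "edge_id": edge_id})
    return updated


core_sample = [{"user_id": 1, "package": "local_10GB"}]
first_results = {
    "province_1": [{"user_id": 2, "package": "local_10GB"}],
    "province_2": [{"user_id": 3, "package": "roaming_5GB"},
                   {"user_id": 4, "package": "local_20GB"}],
}
updated_sample = update_training_sample(core_sample, first_results)
```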
Here, the test sample includes user data, and the user data includes at least two of: the data traffic package the user has ordered, the data traffic type the user actually uses, the data traffic value the user actually uses, and the user's phone charges.
It should be noted that the data traffic type the user actually uses may include local traffic, inter-provincial roaming traffic, international roaming traffic, Hong Kong, Macao, and Taiwan roaming traffic, and other traffic types; the data traffic type actually used differs according to the needs of the user.
In one specific implementation, the user data may also include the user's voice usage information, for example, the call duration and the corresponding charges when the user is the calling or called party. It should be noted that the user data may be set according to actual conditions and is not limited to the above examples; other information not described here also falls within the protection scope of the present invention and is not described again.
In step 402, the updated training sample is trained to obtain the second training model library.
Here, the second training model library includes at least two classification models from among a random forest classification model, a decision tree classification model, an extreme gradient boosting model, and a logistic regression classification model.
It should be noted that the updated training sample may include the first matching results sent by the edge cloud servers and may also include user data that the core cloud server has obtained itself. The updated training sample may be set according to actual conditions and is not limited to the above examples; other information not described here also falls within the protection scope of the present invention and is not described again.
In step 403, the test sample is input into the second training model library to obtain the second matching result.
It should be noted that at least two items of information among the data traffic package the user has ordered, the data traffic type the user actually uses, the data traffic value the user actually uses, and the user's phone charges are preprocessed, and as many features as possible are extracted from the raw user data to obtain the test sample. The test sample is then input into the second training model library for testing, which gives a preliminary data traffic package type suitable for the user, i.e., the second matching result.
In step 404, the second matching result is sent to the edge cloud server so that the edge cloud server can fuse the second matching result with the first matching result to obtain the final matching result.
It should be noted that fusing the first matching result, which reflects how users under a particular edge cloud server use data traffic, with the second matching result, which reflects how all users use data traffic (for example, by stacking fusion of the first matching result and the second matching result), yields a final matching result with higher accuracy.
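To make the fusion step concrete, here is a minimal sketch that blends the two results with a weighted average of per-package scores. This is a swapped-in simplification rather than the stacking fusion the text mentions as an example; the score dictionaries, the package names, and the `edge_weight` parameter are all assumptions.

```python
def fuse(first, second, edge_weight=0.5):
    """Blend two per-package score dicts; return (best_package, fused_scores)."""
    packages = set(first) | set(second)
    fused = {
        p: edge_weight * first.get(p, 0.0) + (1 - edge_weight) * second.get(p, 0.0)
        for p in packages
    }
    # Sort first so that ties are broken deterministically by name.
    best = max(sorted(fused), key=fused.get)
    return best, fused


# Hypothetical scores: the edge result favors a local package,
# the core result favors a roaming package.
first_result = {"local_10GB": 0.6, "roaming_5GB": 0.4}
second_result = {"local_10GB": 0.3, "roaming_5GB": 0.7}
best_package, fused_scores = fuse(first_result, second_result)
```

A stacking fusion would instead train a further model on the two results as input features; the weighted average only shows where the regional and the nationwide signal meet.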
In the present embodiment, the updated training sample is trained using the same training method as in the second embodiment to obtain the second matching result. Because the core cloud server can aggregate more data, the second matching result it returns is more accurate. The second matching result is then sent to the edge cloud server so that the edge cloud server can fuse it with the first matching result to obtain a more accurate final matching result. When an operator recommends data traffic packages to a user, it can use the final matching result to recommend packages better suited to the user, which provides better service and improves the user experience.
The division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they fall within the protection scope of this patent. Adding insignificant modifications to, or introducing insignificant designs into, an algorithm or process without changing the core design of the algorithm and process also falls within the protection scope of this patent.
The fourth embodiment of the present invention relates to a data traffic matching device. For the specific implementation of the device, refer to the description of the first embodiment; repeated content is not described again. It is worth noting that the specific implementation of the device in this embodiment may also refer to the description of the second embodiment, but is not limited to these two examples; other embodiments not described here also fall within the protection scope of the device.
As shown in Fig. 5, the device mainly includes: a first obtaining module 501, configured to input the test sample into the first training model library to obtain the first matching result, the test sample including user data; a first sending module 502, configured to send the first matching result to the core cloud server; a first receiving module 503, configured to receive the second matching result returned by the core cloud server, the second matching result being the matching result obtained by inputting the test sample into the second training model library, and the second training model library being a training model library obtained by training according to the first matching result; and a fusion module 504, configured to fuse the first matching result and the second matching result to obtain the final matching result, the final matching result including the data traffic type that the user intends to order and the data traffic value that the user intends to order.
In one example, the user data in the first obtaining module 501 includes at least two of: the data traffic package the user has ordered, the data traffic type the user actually uses, the data traffic value the user actually uses, and the user's phone charges.
The fifth embodiment of the present invention relates to a data traffic matching device. For the specific implementation of the device, refer to the description of the third embodiment; repeated content is not described again.
As shown in Fig. 6, the device mainly includes: a second receiving module 601, configured to receive the first matching result sent by the edge cloud server and put the first matching result into the training sample to update the training sample, the test sample including user data; a training module 602, configured to train on the user data in the updated training sample to obtain the second training model library; a second obtaining module 603, configured to input the test sample into the second training model library to obtain the second matching result, the test sample including user data; and a second sending module 604, configured to send the second matching result to the edge cloud server so that the edge cloud server can fuse the second matching result with the first matching result to obtain the final matching result.
In one specific implementation, as shown in Fig. 7, the edge cloud server 701 includes the following modules: the first obtaining module 501, the first sending module 502, the first receiving module 503, and the fusion module 504; the core cloud server 702 includes the following modules: the second receiving module 601, the training module 602, the second obtaining module 603, and the second sending module 604.
The first obtaining module 501 of the edge cloud server computes the first matching result, which the first sending module 502 sends to the second receiving module 601 of the core cloud server so that the core cloud server can update the training sample. The training module 602 then trains on the user data in the updated training sample to obtain the second training model library. The second obtaining module 603 inputs the test sample into the second training model library to obtain the second matching result, which the second sending module 604 sends to the first receiving module 503 of the edge cloud server. The fusion module 504 in the edge cloud server can then fuse the second matching result with the first matching result to obtain the final matching result.
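The message flow between the two servers can be sketched as follows. Only the order of the module calls follows the text; the class names, the method names, and the "most frequent package" rule used in place of real training are hypothetical placeholders.

```python
class CoreCloudServer:
    def __init__(self):
        self.training_sample = []

    def handle(self, first_result):
        # Second receiving module 601: put the first matching result into the training sample.
        self.training_sample.append(first_result)
        # Training module 602 and second obtaining module 603 (placeholder logic):
        # pick the package reported most often across all edge servers.
        counts = {}
        for package in self.training_sample:
            counts[package] = counts.get(package, 0) + 1
        # Second sending module 604: return the second matching result.
        return max(sorted(counts), key=counts.get)


class EdgeCloudServer:
    def __init__(self, core, local_result):
        self.core = core
        self.local_result = local_result

    def match(self):
        first = self.local_result          # first obtaining module 501
        second = self.core.handle(first)   # sending module 502, receiving module 503
        # Both results are now available to the fusion module 504.
        return {"first": first, "second": second}


core = CoreCloudServer()
EdgeCloudServer(core, "local_10GB").match()
EdgeCloudServer(core, "local_10GB").match()
result = EdgeCloudServer(core, "roaming_5GB").match()
```

In this toy run the third edge server locally prefers a roaming package while the core server, having seen all three reports, returns the nationwide favorite; the two values are what a fusion step would combine.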
Through the two rounds of training on the edge cloud server and the core cloud server, and by fusing the second matching result with the first matching result obtained from those two rounds, the accuracy of the final matching result is improved, so that it reflects the individual needs of the user more accurately and truthfully. Recommending the final matching result to the user improves the user experience.
It is worth mentioning that each module involved in this embodiment is a logical module. In practical applications, a logical unit may be a physical unit, a part of a physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present invention, units less closely related to solving the technical problem addressed by the invention are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
It should be understood that the above embodiments are merely exemplary embodiments used to illustrate the principle of the present invention, and the present invention is not limited thereto. Those of ordinary skill in the art can make various variations and improvements without departing from the spirit and essence of the present invention, and such variations and improvements are also regarded as falling within the protection scope of the present invention.

Claims (10)

1. A data traffic matching method, characterized in that the method comprises:
inputting a test sample into a first training model library to obtain a first matching result, the test sample comprising user data;
sending the first matching result to a core cloud server;
receiving a second matching result returned by the core cloud server, the second matching result being a matching result obtained by inputting the test sample into a second training model library, and the second training model library being a training model library obtained by training according to the first matching result;
fusing the first matching result and the second matching result to obtain a final matching result, the final matching result comprising the data traffic type that the user intends to order and the data traffic value that the user intends to order.
2. The data traffic matching method according to claim 1, characterized in that the user data comprises:
at least two of the data traffic package the user has ordered, the data traffic type the user actually uses, the data traffic value the user actually uses, and the phone charges of the user.
3. The data traffic matching method according to claim 1, characterized in that the step of inputting the test sample into the first training model library to obtain the first matching result comprises:
performing the following operations on each classification model in the first training model library:
performing cross-validation on the classification model with the training sample to obtain the prediction training sample of the classification model;
testing the classification model with the test sample to obtain the prediction test sample corresponding to the classification model;
determining the first matching result according to the first training model library and the prediction training sample and the prediction test sample corresponding to each classification model in the first training model library.
4. The data traffic matching method according to claim 3, characterized in that the step of determining the first matching result according to the first training model library and the prediction training sample and the prediction test sample corresponding to each classification model in the first training model library comprises:
stacking at least two of the prediction training samples to obtain a new training sample;
determining a new test sample according to each of the prediction test samples;
determining the first matching result according to the new training sample, the first training model library, and the new test sample.
5. The data traffic matching method according to claim 4, characterized in that the step of determining the first matching result according to the new training sample, the first training model library, and the new test sample comprises:
using an exhaustive search method, training any model in the first training model library with the new training sample to obtain an optimal training model;
inputting the new test sample into the optimal training model to obtain the first matching result.
6. The data traffic matching method according to any one of claims 1 to 5, characterized in that the first training model library or the second training model library comprises:
at least two classification models from among a random forest classification model, a decision tree classification model, an extreme gradient boosting model, and a logistic regression classification model.
7. A data traffic matching method, characterized in that the method comprises:
receiving a first matching result sent by an edge cloud server, and putting the first matching result into a training sample to update the training sample, a test sample comprising user data;
training the updated training sample to obtain a second training model library;
inputting the test sample into the second training model library to obtain a second matching result, the test sample comprising the user data;
sending the second matching result to the edge cloud server so that the edge cloud server can fuse the second matching result and the first matching result to obtain a final matching result.
8. The data traffic matching method according to claim 7, characterized in that the user data comprises:
at least two of the data traffic package the user has ordered, the data traffic type the user actually uses, the data traffic value the user actually uses, and the phone charges of the user.
9. A data traffic matching device, characterized by comprising:
a first obtaining module, configured to input a test sample into a first training model library to obtain a first matching result, the test sample comprising user data;
a first sending module, configured to send the first matching result to a core cloud server;
a first receiving module, configured to receive a second matching result returned by the core cloud server, the second matching result being a matching result obtained by inputting the test sample into a second training model library, and the second training model library being a training model library obtained by training according to the first matching result;
a fusion module, configured to fuse the first matching result and the second matching result to obtain a final matching result, the final matching result comprising the data traffic type that the user intends to order and the data traffic value that the user intends to order.
10. A data traffic matching device, characterized by comprising:
a second receiving module, configured to receive a first matching result sent by an edge cloud server and put the first matching result into a training sample to update the training sample, a test sample comprising user data;
a training module, configured to train on the user data in the updated training sample to obtain a second training model library;
a second obtaining module, configured to input the test sample into the second training model library to obtain a second matching result, the test sample comprising the user data;
a second sending module, configured to send the second matching result to the edge cloud server so that the edge cloud server can fuse the second matching result and the first matching result to obtain a final matching result.
CN201910668490.5A 2019-07-23 2019-07-23 Data traffic matching method and device Active CN110378739B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910668490.5A CN110378739B (en) 2019-07-23 2019-07-23 Data traffic matching method and device

Publications (2)

Publication Number Publication Date
CN110378739A true CN110378739A (en) 2019-10-25
CN110378739B CN110378739B (en) 2022-03-29

Family

ID=68255320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910668490.5A Active CN110378739B (en) 2019-07-23 2019-07-23 Data traffic matching method and device

Country Status (1)

Country Link
CN (1) CN110378739B (en)

Also Published As

Publication number Publication date
CN110378739B (en) 2022-03-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant