CN108694212A - Method and apparatus for processing sample objects

Info

Publication number
CN108694212A
Authority
CN
China
Prior art keywords
sample object
target
sample
subclass
achievement data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710234565.XA
Other languages
Chinese (zh)
Other versions
CN108694212B (en)
Inventor
隋馨缘
王冬冬
黄利贤
张望
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201710234565.XA
Publication of CN108694212A
Application granted
Publication of CN108694212B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the present invention discloses a method and apparatus for processing sample objects. The method includes: obtaining the achievement data of each sample object in a sample object set; deleting the invalid sample objects from the sample object set according to the achievement data to obtain a first sample object set, where an invalid sample object is a sample object whose achievement data falls outside a preset range; selecting, from the first sample object set, a target sample object set containing a target quantity of sample objects, where the target quantity is greater than a first threshold and the achievement data of the target sample objects in the target sample object set meets a preset standard; and performing periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard. An embodiment of the present invention also provides an apparatus for processing sample objects. The representativeness of the sample objects is kept consistent across periods, which reduces the bias in behavior analysis caused by differences in sample representativeness and helps improve the accuracy of object behavior analysis.

Description

Method and apparatus for processing sample objects
Technical Field
The present invention relates to the computer field, and in particular to a method and apparatus for processing sample objects.
Background
Data analysis means analyzing large amounts of collected data with appropriate statistical methods in order to make the most of the data and bring its value into play. It is the process of studying and summarizing data in detail to extract useful information and form conclusions. Reflecting user behavior through the analysis of user data has in turn become key to how enterprises formulate their marketing strategies.
At present, the user data of a certain number of sample objects can be obtained to analyze the development trends of the mobile internet as a whole and of various mobile phone applications. A specific number of sample objects is usually chosen by statistical sampling to reflect the situation of internet users (netizens) as a whole.
When a specific number of netizens are chosen as sample objects by sampling, the sample objects can be proportioned by demographic attributes so that the demographic mix of the sample matches that of the object population (for example, all netizens). For example, the ratio of male samples to female samples in the sample object set is 1:1. The sample object set whose demographic mix matches the object population is then used as the final set of sample objects for analyzing user behavior.
In the conventional approach, sample objects are not screened with regard to the specific purpose of the data analysis. For example, the conventional method may meet expectations when reflecting the attitude data of a class of users (their opinions and views about some class of things), but when it needs to reflect the behavioral data of users (such as the number of return visits to a web page or the bounce rate), the sample objects are insufficiently representative at the behavioral level. As a result, when the behavior of the whole population is analyzed by observing sample objects, the representativeness of the sample objects is inconsistent between periods and the analysis of the sample behavioral data is biased. For example, across the two observation periods of January 2015 and February 2015, the February sample objects may on the whole use their mobile phones less than the January ones, so the user behavior (number of return visits, bounce rate, and so on) analyzed from the sample objects deviates from the actual situation of all netizens.
Summary of the Invention
Embodiments of the present invention provide a method and apparatus for processing sample objects that keep the representativeness of the sample objects consistent across periods, reduce the bias in behavior analysis caused by differences in sample representativeness, and help improve the accuracy of object behavior analysis.
In a first aspect, an embodiment of the present invention provides a method for processing sample objects, including:
obtaining the achievement data of each sample object in a sample object set;
deleting the invalid sample objects from the sample object set according to the achievement data to obtain a first sample object set, where an invalid sample object is a sample object whose achievement data falls outside a preset range;
selecting, from the first sample object set, a target sample object set containing a target quantity of sample objects, where the target quantity is greater than a first threshold and the achievement data of each target sample object in the target sample object set meets a preset standard;
performing periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard.
In a second aspect, an embodiment of the present invention provides an apparatus for processing sample objects, including:
an acquisition module, configured to obtain the achievement data of each sample object in a sample object set;
a deletion module, configured to delete the invalid sample objects from the sample object set according to the achievement data obtained by the acquisition module, to obtain a first sample object set, where an invalid sample object is a sample object whose achievement data falls outside a preset range;
a selection module, configured to select a target sample object set containing a target quantity of sample objects from the first sample object set obtained after the deletion module deletes the invalid sample objects, where the target quantity is greater than a first threshold and the achievement data of each target sample object in the target sample object set meets a preset standard;
a data analysis module, configured to perform periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard determined by the selection module.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:
During sampling, a sample object set is drawn and the achievement data of the sample objects in the set is then obtained. The achievement data can be used to measure how representative a sample object is. The invalid sample objects are then deleted from the sample object set according to the achievement data to obtain a first sample object set, where an invalid sample object is a sample object whose achievement data falls outside a preset range. Finally, a target sample object set containing a certain number of sample objects is selected from the first sample object set, and the achievement data of each target sample object in the target sample object set must meet a preset standard. In this embodiment, the achievement data of the target sample objects is brought to a preset standard, and periodic behavioral data analysis is then performed on the sample objects in the target sample object set according to that standard. When the behavioral data of the sample objects is later analyzed, the representativeness of the sample objects is therefore consistent across periods, the analysis of user data is not biased by differences in sample representativeness, and the accuracy of the subsequent user behavior analysis improves.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them.
Fig. 1 is a schematic diagram of a scenario for one embodiment of the method for processing sample objects in the embodiments of the present invention;
Fig. 2 is a schematic diagram of the steps of one embodiment of the method for processing sample objects in the embodiments of the present invention;
Fig. 3 is a schematic diagram of the projection scaling of sample objects in the embodiments of the present invention;
Fig. 4 is a schematic diagram of a scenario for another embodiment of the method for processing sample objects in the embodiments of the present invention;
Fig. 5 is a step flowchart of another embodiment of the method for processing sample objects in the embodiments of the present invention;
Fig. 6 is a schematic diagram of the subsets in the target sample object set in the embodiments of the present invention;
Fig. 7 is a trend line chart of test data for the method for processing sample objects in the embodiments of the present invention;
Fig. 8 is a structural diagram of one embodiment of the apparatus for processing sample objects in the embodiments of the present invention;
Fig. 9 is a structural diagram of another embodiment of the apparatus for processing sample objects in the embodiments of the present invention;
Fig. 10 is a structural diagram of another embodiment of the apparatus for processing sample objects in the embodiments of the present invention;
Fig. 11 is a structural diagram of another embodiment of the apparatus for processing sample objects in the embodiments of the present invention;
Fig. 12 is a structural diagram of another embodiment of the apparatus for processing sample objects in the embodiments of the present invention;
Fig. 13 is a structural diagram of another embodiment of the apparatus for processing sample objects in the embodiments of the present invention.
Detailed Description
Embodiments of the present invention provide a method and apparatus for processing sample objects that reduce the bias in behavior analysis caused by differences in sample representativeness and help improve the accuracy of object behavior analysis.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments fall within the scope of protection of the present invention.
The terms "first", "second", "third", "fourth", and the like (if present) in the description, the claims, and the drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than the one illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Before user behavior is analyzed for an object population, the population must be sampled to obtain a sample object set, and the population is then analyzed by studying the sample objects in that set. At present, the development trends of the mobile internet as a whole and of various mobile phone applications are analyzed by analyzing sample objects periodically.
The method for processing sample objects provided in the embodiments of the present invention can be understood with reference to the scenario diagram in Fig. 1. When a specific number of netizens are chosen as sample objects by sampling, the sample objects can be proportioned by demographic attributes so that the demographic mix of the sample matches that of the object population (for example, all netizens). On that basis, the achievement data of each sample object 1101 in the sample object set 110 is obtained. The achievement data is correlated with the behavioral data of the sample objects. In other words, in this embodiment the achievement data is determined by the behavioral data that is needed, guided by the purpose of the analysis. In this embodiment the purpose is to analyze the development trends of various applications, so the achievement data can be an active index. Next, the invalid sample objects 1102 are deleted from the sample object set according to the achievement data to obtain the first sample object set 120, where an invalid sample object is a sample object whose achievement data falls outside a preset range. Then a target sample object set 130 containing a target quantity of sample objects is selected from the first sample object set, where the target quantity is greater than a first threshold and the achievement data of each target sample object 1103 in the target sample object set 130 meets a preset standard. Finally, periodic behavioral data analysis is performed on the sample objects in the target sample object set according to the preset standard. In this embodiment, the requirement that the achievement data meet the preset standard is used as the criterion for screening target objects in every behavior analysis period. When the behavioral data of the sample objects is later analyzed, the representativeness of the sample objects is therefore consistent across periods, the analysis of user data is not biased by differences in sample representativeness, and the accuracy of the subsequent user behavior analysis improves.
For ease of understanding, the terms used in the embodiments of the present invention are explained first.
Sample object set: Can be understood as a sample database. Before behavioral data analysis is carried out, users need to be selected or recruited. The users selected or recruited through some channels are the sample objects, and a certain number of sample objects form the sample object set. The behavioral data of the sample objects in the set can then be obtained and analyzed to produce the required analysis results.
Behavioral data: Data obtained from certain behaviors of sample objects. For example, behavioral data may include: a sample object's region of origin, referring domain, and pages; time spent on a website, bounce rate, returning visitors, new visitors, number of return visits, and days between return visits; registered and unregistered users; search engines used, keywords, related keywords, and on-site keywords; entry form (advertisement or website entry link); the flow followed when visiting a website; click heat-map distribution data and page overlay data; visits in different time periods; and so on.
Behavioral data analysis: The behavioral data of sample objects is obtained and their behavior is observed periodically (the period can be, for example, one month or one week). The behavioral data is counted and analyzed to discover patterns in how users visit websites. For example, the entry forms chosen by sample objects (advertisement or website entry link) show which entry form is more effective, and the flow that sample objects follow when visiting a website shows whether the page structure is reasonably designed. The specific length of the period is not limited in practice.
Sampling: In statistics, a part of the target population is extracted as sample objects. When behavior analysis is later performed on the sample objects, one or more attributes of the sample objects are analyzed, and the behavioral data of the sample objects is used to make an assessment, with a certain reliability, of the behavior of the object population (such as all netizens), so as to understand the population as a whole. In data analysis generally, not all Chinese netizens can be covered, so a portion of netizens has to be reached through some channels, which is sampling. Specifically, sampling can be done by region, drawing a certain number of sample objects in different cities: for example, 100 netizens are selected as sample objects in city A and 100 netizens in city B. Sampling can also be done by age, drawing a certain number of sample objects in different age brackets: for example, 100 netizens aged 10 to 15 are selected as 100 sample objects, 100 netizens aged 16 to 20 are selected as 100 sample objects, and so on.
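The region-based and age-based sampling described above amounts to drawing a fixed number of sample objects from each group. A minimal sketch follows, assuming a population held as a list of records; the function name `stratified_sample`, the record fields, and the population sizes are hypothetical and do not come from the patent.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, per_stratum, seed=0):
    """Draw up to `per_stratum` objects from each stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = defaultdict(list)
    for person in population:
        strata[key(person)].append(person)
    sample = []
    for members in strata.values():
        k = min(per_stratum, len(members))  # a stratum may be smaller than the quota
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 500 netizens in city A and 300 in city B.
population = [{"id": i, "city": "A" if i < 500 else "B"} for i in range(800)]
sample = stratified_sample(population, key=lambda p: p["city"], per_stratum=100)
```

Passing a different `key`, for example a function that maps an age to an age bracket, would give the age-based variant.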
Failed sample object: A sample object that is unqualified or whose information is false. Failed sample objects are deleted so that the remaining sample object set is more representative in its composition. For example, if the information actively reported by sample object C is "male, 14 years old, married", relevant laws and regulations show that the information is false, so sample object C is a failed sample object. Note that the sample object set in this embodiment can be a sample object set from which failed sample objects have already been deleted.
Achievement data: Used to subdivide the obtained sample objects so that they are more representative for the subsequent behavior analysis. Achievement data includes an index and the index value corresponding to that index. The index may include an active index, an attribute index, and so on.
Active index: Indicates the activity level of a sample. It can represent the total duration and/or frequency of going online with a terminal (such as a mobile phone) within a period (such as one month). In the embodiments of the present invention the active index can be measured by the behavioral data of a sample object's use of each application (app). The active index may include at least one of the following: the total duration for which all applications are used, the total number of uses, the total number of days of use (with duplicate days removed), the total number of applications used, the average daily duration of use, the average daily number of uses, and the average daily number of applications used. The behavior of using applications serves as the active index and indicates how actively a sample object uses the internet within the period.
For example, four applications (apps) are installed on the terminal of sample object E: WeChat, Taobao, a browser, and QQ. The active index and corresponding parameter values for all applications within one period are shown in Table 1 below. One period in this embodiment is illustrated as one month. In practice the length of the period can be set according to the actual situation, and the example period does not limit the present invention:
Table 1
Application   Duration of use (hours)   Number of uses
WeChat        60                        280
Taobao        28                        30
QQ            25                        60
Browser       30                        120
Total         143                       490
Table 1 shows how the total duration and the total number of uses of applications within one period are obtained: the total duration is the sum of the durations of use of the individual applications, and the total number of uses is the sum of their numbers of uses.
In another approach, the start time and end time of each use of each application are recorded, the durations of use are added up, and the overlapping time is removed to obtain the total duration. In set terms, the total duration is the union of the usage times of the applications.
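The union-based total duration can be sketched by merging overlapping usage intervals before summing them. This is only an illustration: the function name `total_usage_duration` and the session times (hours on a single day) are hypothetical.

```python
def total_usage_duration(intervals):
    """Return the length of the union of (start, end) usage intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total

# Hypothetical sessions: two apps overlap between 9:30 and 10:00.
sessions = [(9.0, 10.0), (9.5, 11.0), (14.0, 15.0)]
union_hours = total_usage_duration(sessions)          # 3.0
naive_hours = sum(end - start for start, end in sessions)  # 3.5, counts the overlap twice
```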
The total number of days of application use is obtained by adding up the days on which each application is used and then removing the duplicate days. In set terms, the total number of days is the size of the union of the dates on which the applications are used. See Table 2 below.
Table 2
Date 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Wechat
Taobao
QQ
Browser
It should be noted that in upper table 2, illustrated using 15 days as a cycle, a cycle in table 2 It is the exemplary illustration carried out as least unit using day, determines four unions using the date:In 1 to 15 this The total number of days applied using this four in 15 days is 13 days.It should be noted that upper table 2 merely for convenience of description and lift example Son, in practical applications, some application may not all used a whole day, in such a case, it is possible to record each Using initial time and end time, then according to respectively using initial time and end time respectively answered to determine With using the union of duration, so that it is determined that go out this four using total number of days, principle with shown in upper table 2 with heaven-made Identical for the principle of unit, in the embodiment of the present invention, the least unit of the initial time and end time can be according to practical feelings In condition it needs to be determined that accuracy and determine, do not limit herein, for example, initial time and end time can be single with hour Position (such as 10 point on the 2nd), can also be to be divided into unit (such as on the 2nd 20 minutes) at 10 points.
Taking Tables 1 and 2 above as an example, the total number of applications used is 4.
The average daily duration of use is the quotient of the total duration in a period and the total number of days in the period. Taking WeChat in Table 1 as an example, the total duration is 60 hours and the period has 30 days, so the average daily duration of use is 2 hours.
The average daily number of uses is the quotient of the total number of uses in a period and the total number of days in the period. Taking the four applications in Table 1 as an example, the total number of uses is 490 and the period has 30 days, so the average daily number of uses is about 16.
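The totals in Table 1 and the daily averages above can be checked with a few lines. The dictionary layout and variable names are assumptions made for illustration; the figures are those of Table 1 and the 30-day period.

```python
# (duration of use in hours, number of uses) per application, from Table 1.
table_1 = {
    "WeChat": (60, 280),
    "Taobao": (28, 30),
    "QQ": (25, 60),
    "Browser": (30, 120),
}
period_days = 30

total_hours = sum(hours for hours, _ in table_1.values())  # 143
total_uses = sum(uses for _, uses in table_1.values())     # 490

wechat_daily_hours = table_1["WeChat"][0] / period_days    # 2.0
daily_uses = total_uses // period_days                     # 16, fraction dropped
```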
Note that the active index can also be computed by weighting at least two of the following: the total duration of application use, the total number of uses, the total number of days of use (with duplicate days removed), the total number of applications used, the average daily duration of use, the average daily number of uses, and the average daily number of applications used. For example, take the total duration and the total number of uses as the two parameters, with weight coefficient a for the total duration and weight coefficient b for the total number of uses. Then: active index = a × total duration + b × total number of uses.
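The weighted formula can be written directly as a function. The weights a = 0.7 and b = 0.3 below are hypothetical values chosen only to show the arithmetic; the patent does not specify them. The inputs are the totals from Table 1.

```python
def active_index(total_duration, total_uses, a, b):
    """Weighted active index: a * total duration + b * total number of uses."""
    return a * total_duration + b * total_uses

# Table 1 totals with hypothetical weights a=0.7, b=0.3.
value = active_index(total_duration=143, total_uses=490, a=0.7, b=0.3)
```

Because duration and number of uses are on different scales, the choice of weights decides which parameter dominates the index.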
Attribute index: Includes the region a sample object belongs to, its income, and so on. If the incomes of the sample objects selected in different periods differ widely, the income index of the sample objects can be used as achievement data. For example, if the behavioral data of sample objects is to be used to analyze the usage of certain payment applications, the income attribute can be used as an auxiliary index when selecting sample objects, so that the sample objects are consistently representative in every period.
Invalid sample object: A sample object whose achievement data falls outside the preset range. For example, suppose the achievement data is the total duration for which the terminal is used and the preset range is more than 10 seconds. If the achievement data obtained for sample object D shows that the mobile phone was used for only 9 seconds in one month, such a sample object may not represent the real situation of netizens. Sample object D is therefore an invalid sample object and is deleted from the sample object set during processing.
Quantile: Common quantiles include quartiles, percentiles, and the median. All values are arranged from smallest to largest and divided into several equal parts, and the values at the division points are the quantiles. For example, for quartiles all values are arranged from smallest to largest and divided into four equal parts, and the values at the three division points are the 1/4 quantile, the median (2/4 quantile), and the 3/4 quantile. Quantiles can be used to analyze the trend of a data variable.
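The three quartile cut points can be computed with the Python standard library. This sketch uses `statistics.quantiles` with the "inclusive" interpolation method, which is one of several quantile conventions; the patent does not say which convention it assumes, and the data values are hypothetical.

```python
import statistics

# Hypothetical total durations (hours) for nine sample objects.
durations = [5, 1, 9, 3, 7, 2, 8, 4, 6]

# n=4 splits the sorted data into four equal parts and returns 3 cut points.
q1, median, q3 = statistics.quantiles(durations, n=4, method="inclusive")
# q1 = 3.0, median = 5.0, q3 = 7.0
```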
The terms used in the embodiments of the present invention have been explained above. One embodiment of the method for processing sample objects provided in the embodiments of the present invention is described below. The method can be applied to an apparatus for processing sample objects, and the apparatus can take the form of a server. See Fig. 1 and Fig. 2, where Fig. 2 is a schematic diagram of the steps of the method for processing sample objects provided in the embodiments of the present invention.
Step 201: Obtain the achievement data of each sample object in the sample object set.
The achievement data of each sample object can be obtained in either of two ways: the terminal records the achievement data and actively reports it to the server, or the server sends a request to the terminal and the terminal returns the achievement data of the sample object to the server after receiving the request.
The sample object set can be a set from which failed sample objects have already been deleted. The achievement data can be obtained by weighting active index data and attribute index data. In this embodiment the behavioral data is illustrated with the page views of a website, and the achievement data is illustrated with active index data.
Step 202: Delete the invalid sample objects from the sample object set according to the achievement data to obtain the first sample object set, where an invalid sample object is a sample object whose achievement data falls outside the preset range.
In this embodiment, for convenience of explanation, the achievement data is illustrated with the total duration for which the terminal is used.
In one implementation, the preset range is greater than a first preset value and less than a second preset value. In practice some sample objects may hardly use their mobile phones in a month, while for others the total duration for which all applications are used in a month (hereinafter the total duration) is very long. Such sample objects are not representative. Being representative means that the active index data of the selected sample objects can represent the activity level of most objects.
For example, the first preset value can be 0.1 hours and the second preset value 500 hours. An invalid sample object is then a sample object whose total duration is less than 0.1 hours or more than 500 hours.
In another implementation, the preset range is greater than the first preset value, and an invalid sample object is a sample whose total duration is less than or equal to the first preset value, for example a sample object whose total duration is less than 0.1 hours. Alternatively, the preset range is less than the second preset value, and an invalid sample object is a sample object whose active index value is more than 500 hours.
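The range-based deletion in these implementations can be sketched as a filter with optional lower and upper bounds, so that the same function covers the two-sided and both one-sided cases. The function name `remove_invalid`, the sample identifiers, and the handling of values exactly on a bound are assumptions; the 0.1-hour and 500-hour bounds are the example values above.

```python
def remove_invalid(samples, lower=None, upper=None):
    """Keep samples whose total duration lies strictly inside the preset range.

    `samples` maps a sample ID to total duration in hours. A bound of None
    means that side of the range is open (one-sided preset range).
    """
    kept = {}
    for sample_id, hours in samples.items():
        if lower is not None and hours <= lower:
            continue  # too little use: invalid
        if upper is not None and hours >= upper:
            continue  # too much use: invalid
        kept[sample_id] = hours
    return kept

# Hypothetical sample objects and their total durations in hours.
samples = {"A": 0.05, "B": 120.0, "C": 650.0, "D": 42.5}
first_set = remove_invalid(samples, lower=0.1, upper=500.0)  # keeps B and D
lower_only = remove_invalid(samples, lower=0.1)              # keeps B, C and D
```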
In another possible implementation, the achievement data includes the active index and the standard deviation of its index values, and the preset range is within a preset number of standard deviations. A large standard deviation means the data is relatively dispersed, and a small standard deviation means the data is relatively concentrated. If the index value of sample object F falls outside the mean plus or minus three standard deviations, sample object F used the mobile phone for a very short or a very long total duration in the month. Sample object F is therefore not representative and is an invalid sample object.
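The standard-deviation criterion can be sketched as follows. The sketch uses the population standard deviation (`statistics.pstdev`); the patent does not say whether the population or the sample standard deviation is intended, and the function name and data are hypothetical.

```python
import statistics

def remove_outliers(values, k=3.0):
    """Keep values within k standard deviations of the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return [v for v in values if abs(v - mean) <= k * std]

# Hypothetical total durations: twenty typical users and one extreme user.
durations = [100.0] * 20 + [10000.0]
kept = remove_outliers(durations)  # the 10000-hour value is dropped
```

Note that in a very small sample a single extreme value inflates the standard deviation so much that it can never lie more than three standard deviations from the mean, so this criterion only becomes effective with enough sample objects.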
It should be noted that in practical applications, the historical data for the behavioral data that can be analyzed as needed is come specifically The preset range is set, only the preset range is illustrated in the present embodiment, does not cause limitation of the invention Property explanation.
Step 203, from first sample object set selection target quantity target sample object set, destination number is big The achievement data corresponding to each target sample object in first threshold, target sample object set reaches preset standard.
The first threshold is that the lower limiting value for the sample object quantity for including is needed in sample set, it is to be understood that can It is configured with the quantity for the object totality analyzed as needed.For example, it is desired to which the quantity of the object totality of analysis is 100 Ten thousand, which can be 10,000, and the quantity for the object totality if desired analyzed is 100,000, which can be 0.1 Ten thousand.It should be noted that above-mentioned is for example, not causing limitation of the invention explanation for first threshold.
It should be noted that the achievement data corresponding to sample object in the first sample object set have reached it is pre- Bidding is accurate, then first sample object set is identical as target sample object set.
The preset standard can be:The achievement data corresponding to each target sample object in the target sample object set Average value is second threshold.
The preset standard can (such as China Internet network be believed according to history achievement data average value and/or third party's data Breath center (China Internet Network Information Center, abbreviation:CNNIC) to the generaI investigation of Chinese netizen Report etc.) in assessment of netizen's behavior activity level etc. determine.
For example, the second threshold can be 100 hours, it is to be understood that each sample in the target sample set The active index of object corresponds to an index value, and the corresponding index value of all sample objects in the target sample set is averaged Value is 100 hours.
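The text does not say how the target sample object set is chosen. One simple possibility, assumed here purely for illustration, is to trim the most extreme sample objects from whichever side pulls the mean away from the second threshold, stopping once the mean is within a tolerance. The function name `select_target_set` and the tolerance `tol` are hypothetical.

```python
from statistics import mean


def select_target_set(values, second_threshold, first_threshold, tol=0.5):
    """Greedy sketch: trim extremes until the mean is within `tol` of the target.

    Returns the selected values, or None if the mean cannot be brought within
    tolerance while keeping more than `first_threshold` sample objects.
    """
    selected = sorted(values)
    while len(selected) > first_threshold:
        gap = mean(selected) - second_threshold
        if abs(gap) <= tol:
            return selected
        if len(selected) - 1 <= first_threshold:
            break  # trimming further would violate the size requirement
        if gap > 0:
            selected.pop()   # mean too high: drop the largest value
        else:
            selected.pop(0)  # mean too low: drop the smallest value
    return None


hours = [80, 90, 100, 110, 120, 300]
target_set = select_target_set(hours, second_threshold=100, first_threshold=3)
print(target_set)  # [80, 90, 100, 110, 120]
```

If the first sample object set already meets the standard, the function returns it unchanged (apart from sorting), which matches the note that the two sets are then identical.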
Step 204: amplify the number of objects represented by each sample object in the target sample set from 1 to a target quantity.
The number of objects represented by each sample object in the target sample set is amplified by projection: each sample object goes from representing 1 object to representing the target quantity of objects, so that the samples in the target sample set can represent the object population. See Fig. 3, a schematic diagram of sample object projection amplification.
For example, suppose the behavioral data of 1,000,000 netizens is to be analyzed, so the object population is 1,000,000, and the target sample set contains 10,000 sample objects. Each sample object must then represent 100 objects for the 10,000 sample objects to represent the population of 1,000,000, so the target quantity is 100.
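The arithmetic of this example is a single division. In the sketch below, the function name `amplification_factor` is hypothetical.

```python
def amplification_factor(population_size, sample_size):
    """Number of population objects that each sample object stands for."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_size / sample_size


factor = amplification_factor(1_000_000, 10_000)
print(factor)  # 100.0
```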
It should be noted that step 204 is optional; it may be skipped, in which case step 205 is executed directly. The population size and the number of sample objects in the target sample set given in step 204 are examples only and do not limit the present invention.
Step 205: perform periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard.
The behavioral data of each sample object is obtained and then analyzed, and this analysis is carried out periodically. Because the activity index reaches the preset standard in every period, the sample objects selected in each period are consistently representative, which makes the behavioral data analysis of the sample objects accurate.
Another embodiment of the sample object processing method provided in the embodiments of the present invention is described below with reference to Fig. 4 and Fig. 5. Fig. 4 is a schematic diagram of a scenario of the sample object processing method, and Fig. 5 is a flowchart of the steps of another embodiment of the method.
Step 501: obtain the index data of each sample object in the sample object set.
The index data of each sample object may be obtained in either of two ways: the terminal records the index data and actively reports it to the server; or the server sends a request to the terminal, and the terminal returns the index data of the sample object to the server after receiving the request.
The sample object set may be a set from which failed sample objects have already been removed. The index data may be obtained by weighting activity index data and attribute index data. In this embodiment, the behavioral data is illustrated by the page views of a certain website, and the index data is illustrated by activity index data.
Step 502: delete the invalid sample objects from the sample object set according to the index data to obtain a first sample object set, where an invalid sample object is a sample object whose index data falls outside a preset range.
In this embodiment, for ease of description, the activity index data is illustrated by the total duration for which the terminal is used.
In one implementation, the preset range may be greater than a first preset value and less than a second preset value. In practice, some sample objects may hardly use their mobile phone at all during a month, while others may have a very long total duration of application use during the month (hereinafter, total duration). Such sample objects are not representative; here, representative means that the activity index data of the selected sample objects reflects the activity level of most objects.
For example, the first preset value may be 0.1 hours and the second preset value 500 hours; that is, an invalid sample object is one whose total duration is less than 0.1 hours or more than 500 hours.
In another implementation, the preset range may be "greater than the first preset value", in which case an invalid sample object is one whose total duration is less than or equal to the first preset value, for example a sample object whose total duration is less than 0.1 hours. Alternatively, the preset range may be "less than the second preset value", in which case an invalid sample object is one whose activity index value is more than 500 hours.
In another possible implementation, the index data may include the activity index values and their standard deviation, and the preset range is within a preset number of standard deviations of the mean. A large standard deviation indicates that the data are widely dispersed, while a small one indicates that they are concentrated. If the index value of a sample object F falls outside the mean of the first sample object set plus or minus three standard deviations, the total duration for which object F used the mobile phone during the month is either very short or very long; sample object F is not representative and is therefore an invalid sample object.
It should be noted that, in practice, the preset range can be set according to the actual conditions of the behavioral data to be analyzed. The preset ranges in this embodiment are examples only and do not limit the present invention.
Step 503: select, from the first sample object set, a target sample object set containing a target quantity of sample objects, where the target quantity is greater than a first threshold and the index data corresponding to the target sample objects in the target sample object set reaches a preset standard.
The first threshold is the lower limit on the number of sample objects that the sample set must contain, and it can be set according to the size of the object population to be analyzed. For example, if the population to be analyzed numbers 1,000,000, the first threshold may be 10,000; if the population numbers 100,000, the first threshold may be 1,000. These values are examples only and do not limit the present invention.
The preset standard may be that the average of the index data corresponding to the target sample objects in the target sample object set equals a second threshold.
For example, the second threshold may be 100 hours. Each sample object in the target sample set has one activity index value, and the average of the index values of all sample objects in the target sample set is 100 hours.
Step 504: determine multiple subsets of the target sample object set, where the index data of the sample objects in each subset corresponds to one value range.
Further, to make the target sample objects more representative, that is, to make the sample objects resemble the actual object population more closely, the proportions of the parts that make up the target sample object set should be consistent with those of the object population.
The multiple subsets of the target sample object set may be determined as follows:
See Fig. 6, a schematic diagram of the subsets of the target sample object set. The quantiles of the activity index data of the sample objects in the target sample object set are determined (this embodiment uses tertiles as an example), and the target sample object set is divided into multiple subsets according to the quantiles. When the quantiles include a first quantile and a second quantile, three subsets are determined: a first subset, a second subset, and a third subset. The first subset contains the sample objects whose index data is less than the first quantile; the second subset contains the sample objects whose index data is greater than or equal to the first quantile and less than the second quantile; the third subset contains the sample objects whose index data is greater than or equal to the second quantile.
Each subset can represent one activity level: for example, the first subset can represent the set of sample objects at activity level A, the second subset the set at activity level B, and the third subset the set at activity level C.
As another example, let the first quantile be 10 hours and the second quantile 20 hours. The first subset then contains the sample objects whose total duration is less than 10 hours; the second subset contains those whose total duration is greater than or equal to 10 hours and less than 20 hours; the third subset contains those whose total duration is greater than or equal to 20 hours.
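The partition by quantiles can be sketched as follows, using the 10-hour and 20-hour cut points of the example. The helper name `split_by_quantiles` is hypothetical, and the cut points are assumed to be sorted in ascending order; the last lines show one way to estimate tertile cut points from the data with the standard library, which is an assumption about how the quantiles might be computed.

```python
from statistics import quantiles


def split_by_quantiles(values, cuts):
    """Split values into len(cuts) + 1 subsets; each cut is a lower-inclusive bound."""
    subsets = [[] for _ in range(len(cuts) + 1)]
    for v in values:
        idx = sum(v >= c for c in cuts)  # number of cut points at or below v
        subsets[idx].append(v)
    return subsets


hours = [2, 8, 10, 15, 19, 20, 35]
first, second, third = split_by_quantiles(hours, cuts=[10, 20])
print(first, second, third)  # [2, 8] [10, 15, 19] [20, 35]

# Cut points can also be estimated from the data; n=3 yields two tertile cuts.
q1, q2 = quantiles(hours, n=3)
```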
Step 505: determine a difference parameter for the numbers of sample objects in the subsets.
In one implementation, the difference parameter may be expressed as a ratio: the number of sample objects in each subset is determined, and the target ratio of the numbers of sample objects across the subsets is then determined from those numbers.
For example, if the target sample set contains 100,000 sample objects, of which 40,000 are in the first subset, 30,000 in the second subset, and 30,000 in the third subset, then the ratio of the numbers of sample objects in the first, second, and third subsets is 4:3:3. Equivalently, the ratio of the numbers of sample objects at activity levels A, B, and C is 4:3:3.
In another implementation, a second ratio, namely the share of each subset in the total number of sample objects in the target sample set, is determined from the number of sample objects in each subset and the total number of sample objects in the target sample set; the target ratio of the numbers of sample objects across the subsets is then determined from the second ratio.
For example, the sample objects in the first subset account for 40% of the total number of sample objects in the target sample set, those in the second subset for 30%, and those in the third subset for 30%. The ratio of the numbers of sample objects in the first, second, and third subsets is therefore 4:3:3.
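Both ways of expressing the difference parameter can be computed directly from the subset sizes. In the sketch below, the helper names `subset_ratio` and `subset_shares` are hypothetical.

```python
from functools import reduce
from math import gcd


def subset_ratio(counts):
    """Reduce subset sizes to the smallest whole-number ratio."""
    divisor = reduce(gcd, counts)
    return [c // divisor for c in counts]


def subset_shares(counts):
    """Share of each subset in the whole target sample set (the second ratio)."""
    total = sum(counts)
    return [c / total for c in counts]


counts = [40_000, 30_000, 30_000]
print(subset_ratio(counts))   # [4, 3, 3]
print(subset_shares(counts))  # [0.4, 0.3, 0.3]
```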
Step 506: adjust the difference parameter so that it meets a preset target value.
The preset target value may be determined from historical data and/or third-party data, such as data from the China Internet Network Information Center (CNNIC). For example, suppose that in the third-party data objects at activity level A account for 30% of all objects, objects at activity level B for 40%, and objects at activity level C for 30%; the preset target value is then 3:4:3. If the ratio of the numbers of objects at activity levels A, B, and C in the actual target sample set is 4:3:3, the ratio of the numbers of sample objects in the subsets of the target sample set is adjusted so that the sample objects better match the actual objects.
The ratio of the numbers of sample objects in the subsets of the target sample set can be adjusted by means of weight coefficients, as follows:
Suppose the quantiles are P1 and P2, the preset target value corresponding to these quantiles is p1:p2:p3, and the target ratio of the numbers of sample objects in the subsets of the target sample object set is q1:q2:q3. To make the shares of objects at the different activity levels meet the preset target value, the number of objects in the first subset (activity level A) is multiplied by the weight p1/q1, the number of objects in the second subset (activity level B) by the weight p2/q2, and the number of objects in the third subset (activity level C) by the weight p3/q3, so that the ratio of the three subsets of the target sample object set becomes the preset target value p1:p2:p3.
To aid understanding, consider an example. The preset target value is 3:4:3, that is, the expected proportions are 30% for objects at activity level A, 40% for objects at activity level B, and 30% for objects at activity level C. The number of sample objects at activity level A is multiplied by the weight coefficient 3/4, the number at activity level B by the weight coefficient 4/3, and the number at activity level C by the weight coefficient 3/3 = 1, so that the ratio of the numbers of sample objects at the three activity levels is 3:4:3 rather than 4:3:3.
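The weight coefficients p_i/q_i and their effect on the 4:3:3 example can be checked with exact fractions. The helper name `subset_weights` is hypothetical, and the counts reuse the 40,000 / 30,000 / 30,000 example from step 505.

```python
from fractions import Fraction


def subset_weights(target, actual):
    """Weight p_i / q_i for each subset, as exact fractions."""
    return [Fraction(p, q) for p, q in zip(target, actual)]


target = [3, 4, 3]  # preset target value p1:p2:p3
actual = [4, 3, 3]  # observed ratio q1:q2:q3
weights = subset_weights(target, actual)
print(weights)  # [Fraction(3, 4), Fraction(4, 3), Fraction(1, 1)]

# Applying the weights to the observed counts restores the target ratio.
counts = [40_000, 30_000, 30_000]
weighted = [int(c * w) for c, w in zip(counts, weights)]
print(weighted)  # [30000, 40000, 30000]
```

In this example the weighted total is still 100,000, because both ratios are expressed over the same total of 10 parts.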
Further, in this embodiment the sample objects in the target sample object set are divided into multiple subsets according to the quantiles of the index data, each subset representing one activity level, and the target ratio of the numbers of sample objects in the subsets is adjusted according to the preset target value. This allows finer control over the sample objects, so that the sample objects in the target sample object set better reflect the real situation.
Step 507: amplify the number of objects represented by each sample object in the target sample set from 1 to a target quantity. The number of objects represented by each sample object in the target sample set is amplified by projection: each sample object goes from representing 1 object to representing the target quantity of objects, so that the samples in the target sample set can represent the object population. See Fig. 3, a schematic diagram of sample object projection amplification.
For example, suppose the behavioral data of 1,000,000 netizens is to be analyzed, so the object population is 1,000,000, and the target sample set contains 10,000 sample objects. Each sample object must then represent 100 objects for the 10,000 sample objects to represent the population of 1,000,000, so the target quantity is 100.
It should be noted that step 507 is optional; it may be skipped, in which case step 508 is executed directly. The population size and the number of sample objects in the target sample set given in step 507 are examples only and do not limit the present invention.
Step 508: perform periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard.
The behavioral data of each sample object is obtained and then analyzed, and this analysis is carried out periodically. Because the activity index reaches the preset standard in every period, the sample objects selected in each period are consistently representative, which makes the behavioral data analysis of the sample objects accurate.
Optionally, in the embodiments of the present invention, the sample objects may be determined by the method of steps 501 to 508 in each data analysis period, and in subsequent periods the preset standard in step 203 and step 503 may be set according to the actual situation of netizens. For example, the preset standard may be positively correlated with the period: if the preset standard in the first period is an average total duration of 100 hours, it may be set to 100.1 hours in the second period and 100.2 hours in the third period. If third-party data reports show that the total duration for which netizens use mobile phones is gradually increasing, such a setting better matches the real situation of the observed netizen population.
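A period-dependent preset standard of this kind can be sketched as a simple function of the period number. The linear growth of 0.1 hours per period is an assumption that merely reproduces the three example values; the text only requires a positive correlation. The function name `preset_standard` is hypothetical.

```python
def preset_standard(period, base=100.0, step=0.1):
    """Target mean total duration (hours) for a 1-based analysis period."""
    if period < 1:
        raise ValueError("period is 1-based")
    return round(base + step * (period - 1), 1)


print([preset_standard(k) for k in (1, 2, 3)])  # [100.0, 100.1, 100.2]
```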
Referring to Fig. 7, the results of a simulated data analysis platform show the beneficial effect of the method provided in the embodiments of the present invention. When the sample objects in the target sample object set are used to infer the general situation of Chinese netizens, the traditional method lets the representativeness of the sample change, so that the monthly user count trend for Taobao runs opposite to the trend of the reference data (real data obtained through other channels): one rises while the other falls. When the sample objects are instead determined by the processing method provided in the embodiments of the present invention and their behavioral data is analyzed, the inferred monthly user count trend for Taobao among Chinese netizens (the simulation result) is much closer to the trend of the reference data. By making the index data meet the preset standard in every behavioral data analysis period, the processing method keeps the representativeness of the sample objects consistent across periods, so the behavior of the object population reflected by the sample objects is more accurate and realistic, and the analysis results for the behavioral data of the sample objects agree better with the actual situation.
The sample object processing method has been described above. The processing apparatus to which the method is applied is described below; the processing apparatus may take the form of a server.
Referring to Fig. 8, one embodiment of a sample object processing apparatus 800 provided in the embodiments of the present invention includes:
an acquisition module 801, configured to obtain the index data of each sample object in the sample object set;
a deletion module 802, configured to delete the invalid sample objects from the sample object set according to the index data obtained by the acquisition module 801 to obtain a first sample object set, where an invalid sample object is a sample object whose index data falls outside a preset range;
a selection module 803, configured to select a target sample object set containing a target quantity of sample objects from the first sample object set obtained after the deletion module 802 deletes the invalid sample objects, where the target quantity is greater than a first threshold and the index data corresponding to the target sample objects in the target sample object set reaches a preset standard;
a data analysis module 804, configured to perform periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard determined by the selection module 803.
Referring to Fig. 9, on the basis of the embodiment corresponding to Fig. 8, another embodiment of a sample object processing apparatus 900 provided in the embodiments of the present invention further includes:
a first determining module 805, configured to determine multiple subsets of the target sample object set, where the index data of the sample objects in each subset corresponds to one value range;
a second determining module 806, configured to determine a difference parameter for the numbers of sample objects in the subsets determined by the first determining module 805;
an adjustment module 807, configured to adjust the difference parameter determined by the second determining module 806 so that the difference parameter meets a preset target value.
Referring to Fig. 10, on the basis of the embodiment corresponding to Fig. 9, in another embodiment of a sample object processing apparatus 1000 provided in the embodiments of the present invention:
the first determining module 805 further includes a first determining unit 8051 and a second determining unit 8052;
the first determining unit 8051 is configured to determine the quantiles of the index data corresponding to the sample objects in the target sample object set;
the second determining unit 8052 is configured to divide the target sample object set into multiple subsets according to the quantiles determined by the first determining unit 8051.
Optionally, when the quantiles include a first quantile and a second quantile, the second determining unit 8052 is further configured to:
determine a first subset, which contains the sample objects whose index data is less than the first quantile;
determine a second subset, which contains the sample objects whose index data is greater than or equal to the first quantile and less than the second quantile;
determine a third subset, which contains the sample objects whose index data is greater than or equal to the second quantile.
Referring to Fig. 11, on the basis of the embodiment corresponding to Fig. 9, in another embodiment of a sample object processing apparatus 1100 provided in the embodiments of the present invention:
the second determining module 806 further includes a third determining unit 8061 and a fourth determining unit 8062;
the third determining unit 8061 is configured to determine the number of sample objects in each subset;
the fourth determining unit 8062 is configured to determine the target ratio of the numbers of sample objects across the subsets according to the numbers of sample objects determined by the third determining unit 8061;
the adjustment module 807 is further configured to adjust, by means of weight coefficients, the target ratio across the subsets determined by the fourth determining unit 8062, so that the target ratio meets the preset target value.
Referring to Fig. 12, on the basis of the embodiment corresponding to Fig. 8, another embodiment of a sample object processing apparatus 1200 provided in the embodiments of the present invention further includes:
a quantity amplification module 808, configured to amplify the number of objects represented by each sample object in the target sample set from 1 to the target quantity.
Optionally, the preset standard is that the average of the index data corresponding to the target sample objects in the target sample object set equals a second threshold.
Further, the processing apparatus in Figs. 8 to 12 is presented in the form of functional modules. Here, a "module" may refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the above functions. In a simple embodiment, the processing apparatus in Figs. 8 to 12 may take the form shown in Fig. 13.
Fig. 13 is a schematic structural diagram of a server provided in an embodiment of the present invention. The server 1300 may vary considerably depending on configuration or performance, and may include one or more processors 1322, memory 1332, and one or more storage media 1330 (for example, one or more mass storage devices) storing application programs 1342 or data 1344. The memory 1332 and the storage medium 1330 may provide transient or persistent storage. The program stored in the storage medium 1330 may include one or more modules (not marked in the figure), each of which may include a series of instruction operations on the server. Further, the processor 1322 may be configured to communicate with the storage medium 1330 and to execute, on the server 1300, the series of instruction operations in the storage medium 1330.
The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD.
The network interface 1350 is configured to obtain the index data of each sample object in the sample object set.
The processor 1322 is configured to: delete the invalid sample objects from the sample object set according to the index data to obtain a first sample object set, where an invalid sample object is a sample object whose index data falls outside a preset range; select, from the first sample object set, a target sample object set containing a target quantity of sample objects, where the target quantity is greater than a first threshold and the index data corresponding to the target sample objects in the target sample object set reaches a preset standard; and perform periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard.
Optionally, the processor 1322 is further configured to: determine multiple subsets of the target sample object set, where the index data of the sample objects in each subset corresponds to one value range; determine a difference parameter for the numbers of sample objects in the subsets; and adjust the difference parameter so that it meets a preset target value.
Optionally, the processor 1322 is further configured to: determine the quantiles of the index data corresponding to the sample objects in the target sample object set; and divide the target sample object set into multiple subsets according to the quantiles.
Optionally, when the quantiles include a first quantile and a second quantile, the processor 1322 is further configured to: determine a first subset, which contains the sample objects whose index data is less than the first quantile; determine a second subset, which contains the sample objects whose index data is greater than or equal to the first quantile and less than the second quantile; and determine a third subset, which contains the sample objects whose index data is greater than or equal to the second quantile.
Optionally, the processor 1322 is further configured to: determine the number of sample objects in each subset; determine the target ratio of the numbers of sample objects across the subsets according to those numbers; and adjust the target ratio across the subsets by means of weight coefficients so that the target ratio meets the preset target value.
Optionally, the processor 1322 is further configured to amplify the number of objects represented by each sample object in the target sample set from 1 to the target quantity.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (14)

1. A processing method of a sample object, characterized by comprising:
obtaining the achievement data of each sample object in a sample object set;
deleting the invalid sample objects in the sample object set according to the achievement data to obtain a first sample object set, wherein an invalid sample object is a sample object whose achievement data exceeds a preset range;
selecting a target sample object set of a target quantity from the first sample object set, wherein the target quantity is greater than a first threshold, and the achievement data corresponding to each target sample object in the target sample object set reaches a preset standard;
performing periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard.
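The deleting and selecting steps of claim 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function name `select_target_samples`, the dictionary layout (sample ID mapped to a numeric achievement value), and the predicate `meets_standard` standing in for the preset standard are all hypothetical, and the periodic behavioral data analysis step is left out.

```python
from typing import Callable, Dict, List, Tuple


def select_target_samples(
    achievement: Dict[str, float],
    preset_range: Tuple[float, float],
    target_quantity: int,
    first_threshold: int,
    meets_standard: Callable[[float], bool],
) -> List[str]:
    """Hypothetical sketch of the deleting and selecting steps of claim 1."""
    low, high = preset_range
    # Delete invalid samples: achievement data outside the preset range.
    first_set = {k: v for k, v in achievement.items() if low <= v <= high}

    # The claim requires the target quantity to exceed a first threshold.
    if target_quantity <= first_threshold:
        raise ValueError("target quantity must be greater than the first threshold")

    # Keep only samples whose achievement data reaches the preset standard.
    eligible = [k for k, v in first_set.items() if meets_standard(v)]
    if len(eligible) < target_quantity:
        raise ValueError("not enough eligible samples for the target quantity")

    # Assumption: take the first eligible samples in input order.
    return eligible[:target_quantity]
```

Taking the first eligible samples in input order is an arbitrary choice made for determinism; random sampling from the eligible samples would fit the claim wording equally well.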
2. The processing method of a sample object according to claim 1, characterized in that, after the target sample object set of the target quantity is selected from the first sample object set, the method further comprises:
determining multiple subclasses in the target sample object set, wherein the achievement data of the sample objects in each subclass corresponds to one value range;
determining a difference parameter of the sample object quantity between the subclasses;
adjusting the difference parameter so that the difference parameter meets a predetermined target value.
3. The processing method of a sample object according to claim 2, characterized in that the determining multiple subclasses in the target sample object set comprises:
determining the quantiles of the achievement data corresponding to the sample objects in the target sample object set;
dividing the target sample object set into multiple subclasses according to the quantiles.
4. The processing method of a sample object according to claim 3, characterized in that, when the quantiles include a first quantile and a second quantile, the dividing the target sample object set into multiple subclasses according to the quantiles comprises:
determining a first subclass, wherein the first subclass includes the sample objects whose achievement data is less than the first quantile;
determining a second subclass, wherein the second subclass includes the sample objects whose achievement data is greater than or equal to the first quantile and less than the second quantile;
determining a third subclass, wherein the third subclass includes the sample objects whose achievement data is greater than or equal to the third quantile.
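The quantile-based division of claims 3 and 4 can be sketched as below. The sketch is hedged in two ways. First, the function names and the dictionary layout (sample ID mapped to achievement value) are hypothetical. Second, the claim names only a first and a second quantile, so the sketch assumes that the lower bound of the third subclass is the second quantile. Using terciles from `statistics.quantiles` is likewise an assumption; the claims do not say which quantiles are used.

```python
from statistics import quantiles
from typing import Dict, Set, Tuple


def split_by_quantiles(
    achievement: Dict[str, float], q1: float, q2: float
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Divide sample IDs into three subclasses around two cut points."""
    first = {k for k, v in achievement.items() if v < q1}
    second = {k for k, v in achievement.items() if q1 <= v < q2}
    # Assumption: the third subclass starts at the second quantile.
    third = {k for k, v in achievement.items() if v >= q2}
    return first, second, third


def split_into_terciles(
    achievement: Dict[str, float],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Assumed choice of cut points: the two tercile boundaries of the data."""
    q1, q2 = quantiles(achievement.values(), n=3)
    return split_by_quantiles(achievement, q1, q2)
```

Because the three conditions are mutually exclusive and cover every value when `q1 <= q2`, each sample object lands in exactly one subclass.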
5. The processing method of a sample object according to any one of claims 2 to 4, characterized in that the determining a difference parameter of the sample object quantity between the subclasses comprises:
determining the sample object quantity in each subclass;
determining a target proportion of the sample object quantity between the subclasses according to the sample object quantity;
and the adjusting the difference parameter comprises:
adjusting the target proportion between the subclasses by weight coefficients so that the target proportion meets a predetermined target value.
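One way to read the weight-coefficient adjustment of claim 5 is post-stratification style reweighting, sketched below. This interpretation is an assumption, as are the function names: each subclass with count n_i receives a weight w_i = p_i * N / n_i, where p_i is the desired proportion and N is the total count, so that the weighted counts n_i * w_i match the predetermined target proportions.

```python
from typing import List, Sequence


def subclass_proportions(counts: Sequence[float]) -> List[float]:
    """Share of the total that each subclass count represents."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def balancing_weights(
    counts: Sequence[int], target_proportions: Sequence[float]
) -> List[float]:
    """Hypothetical weight coefficients: w_i = p_i * N / n_i."""
    if len(counts) != len(target_proportions):
        raise ValueError("one target proportion is needed per subclass")
    if any(c <= 0 for c in counts):
        raise ValueError("every subclass must contain at least one sample")
    total = sum(counts)
    return [p * total / c for c, p in zip(counts, target_proportions)]
```

For example, with subclass counts `[2, 6, 2]` and target proportions `[0.3, 0.4, 0.3]`, the weighted counts become 3, 4, and 3, which reproduces the target proportions while leaving the total unchanged.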
6. The processing method of a sample object according to any one of claims 1 to 4, characterized in that, after the target sample object set of the target quantity is selected from the first sample object set, the method further comprises:
amplifying the number of objects represented by each sample object in the target sample set from 1 to a target number.
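The amplification in claim 6 can be illustrated with a small sketch. The claim does not say how the target number is chosen; deriving it as population size divided by sample size is an assumption made here, and both function names are hypothetical.

```python
from typing import Dict, Iterable


def target_number_for_population(population_size: int, sample_size: int) -> float:
    """Assumed rule: each sample stands for an equal share of the population."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")
    return population_size / sample_size


def amplify(sample_ids: Iterable[str], target_number: float) -> Dict[str, float]:
    """Each sample represents 1 object before and target_number objects after."""
    if target_number < 1:
        raise ValueError("target number must be at least 1")
    return {sample_id: target_number for sample_id in sample_ids}
```

Under this assumption, four target sample objects drawn to stand for a population of 1000 would each represent 250 objects.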
7. The processing method of a sample object according to any one of claims 1 to 4, characterized in that the preset standard is that the average value of the achievement data corresponding to the target sample objects in the target sample object set is a second threshold.
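A check for the preset standard of claim 7 might look like the sketch below. The function name and the `tolerance` parameter are assumptions; the claim states only that the average equals the second threshold, and a tolerance is added here because exact equality is rarely practical for real-valued data.

```python
from statistics import fmean
from typing import Iterable


def meets_mean_standard(
    values: Iterable[float], second_threshold: float, tolerance: float = 0.0
) -> bool:
    """True when the mean achievement value is within tolerance of the threshold."""
    return abs(fmean(values) - second_threshold) <= tolerance
```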
8. A processing device of a sample object, characterized by comprising:
an acquisition module, configured to obtain the achievement data of each sample object in a sample object set;
a deletion module, configured to delete the invalid sample objects in the sample object set according to the achievement data obtained by the acquisition module to obtain a first sample object set, wherein an invalid sample object is a sample object whose achievement data exceeds a preset range;
a selection module, configured to select a target sample object set of a target quantity from the first sample object set obtained after the deletion module deletes the invalid sample objects, wherein the target quantity is greater than a first threshold, and the achievement data corresponding to each target sample object in the target sample object set reaches a preset standard;
a data analysis module, configured to perform periodic behavioral data analysis on the sample objects in the target sample object set according to the preset standard determined by the selection module.
9. The processing device of a sample object according to claim 8, characterized by further comprising:
a first determining module, configured to determine multiple subclasses in the target sample object set, wherein the achievement data of the sample objects in each subclass corresponds to one value range;
a second determining module, configured to determine a difference parameter of the sample object quantity between the subclasses determined by the first determining module;
an adjustment module, configured to adjust the difference parameter determined by the second determining module so that the difference parameter meets a predetermined target value.
10. The processing device of a sample object according to claim 9, characterized in that
the first determining module further includes a first determination unit and a second determination unit;
the first determination unit is configured to determine the quantiles of the achievement data corresponding to the sample objects in the target sample object set;
the second determination unit is configured to divide the target sample object set into multiple subclasses according to the quantiles determined by the first determination unit.
11. The processing device of a sample object according to claim 10, characterized in that, when the quantiles include a first quantile and a second quantile, the second determination unit is further configured to:
determine a first subclass, wherein the first subclass includes the sample objects whose achievement data is less than the first quantile;
determine a second subclass, wherein the second subclass includes the sample objects whose achievement data is greater than or equal to the first quantile and less than the second quantile;
determine a third subclass, wherein the third subclass includes the sample objects whose achievement data is greater than or equal to the third quantile.
12. The processing device of a sample object according to any one of claims 9 to 11, characterized in that the second determining module further includes a third determination unit and a fourth determination unit;
the third determination unit is configured to determine the sample object quantity in each subclass;
the fourth determination unit is configured to determine a target proportion of the sample object quantity between the subclasses according to the sample object quantity determined by the third determination unit;
the adjustment module is further configured to adjust, by weight coefficients, the target proportion between the subclasses determined by the fourth determination unit so that the target proportion meets a predetermined target value.
13. The processing device of a sample object according to any one of claims 8 to 11, characterized by further comprising:
a quantity amplification module, configured to amplify the number of objects represented by each sample object in the target sample set from 1 to a target number.
14. The processing device of a sample object according to any one of claims 8 to 11, characterized in that the preset standard is that the average value of the achievement data corresponding to the target sample objects in the target sample object set is a second threshold.
CN201710234565.XA 2017-04-11 2017-04-11 Sample object processing method and device Active CN108694212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710234565.XA CN108694212B (en) 2017-04-11 2017-04-11 Sample object processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710234565.XA CN108694212B (en) 2017-04-11 2017-04-11 Sample object processing method and device

Publications (2)

Publication Number Publication Date
CN108694212A true CN108694212A (en) 2018-10-23
CN108694212B CN108694212B (en) 2023-04-14

Family

ID=63843499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710234565.XA Active CN108694212B (en) 2017-04-11 2017-04-11 Sample object processing method and device

Country Status (1)

Country Link
CN (1) CN108694212B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487225A (en) * 2021-07-23 2021-10-08 北京云从科技有限公司 Risk control method, system, device and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11127354A (en) * 1997-10-20 1999-05-11 Casio Comput Co Ltd Image encoding method and storage medium
JP2004334326A (en) * 2003-04-30 2004-11-25 Nri & Ncc Co Ltd System for predicting demand of merchandise and system for adjusting number of merchandise sold
CN102262678A (en) * 2011-08-16 2011-11-30 郑毅 System for sampling mass data and managing sampled data
CN103426121A (en) * 2013-04-28 2013-12-04 中国南方电网有限责任公司 Method for calculating power grid operation evaluation index dimensionless
CN104408143A (en) * 2014-12-01 2015-03-11 北京国双科技有限公司 Webpage data monitoring method and device
CN104778481A (en) * 2014-12-19 2015-07-15 五邑大学 Method and device for creating sample library for large-scale face mode analysis
CN104951962A (en) * 2015-06-05 2015-09-30 百度在线网络技术(北京)有限公司 Method and device for determining trend information corresponding to multiple indexes
CN105139282A (en) * 2015-08-20 2015-12-09 国家电网公司 Power grid index data processing method, device and calculation device
CN105868275A (en) * 2016-03-22 2016-08-17 深圳市艾酷通信软件有限公司 Data statistical method and electronic device
CN106203298A (en) * 2016-06-30 2016-12-07 北京集创北方科技股份有限公司 Biological feather recognition method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张悦玫: "Research on an Enterprise Performance Evaluation System Based on Value Growth" (基于价值增长的企业绩效评价体系研究) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487225A (en) * 2021-07-23 2021-10-08 北京云从科技有限公司 Risk control method, system, device and medium
CN113487225B (en) * 2021-07-23 2024-05-24 北京云从科技有限公司 Risk control method, system, equipment and medium

Also Published As

Publication number Publication date
CN108694212B (en) 2023-04-14

Similar Documents

Publication Publication Date Title
Chan et al. Evaluating online ad campaigns in a pipeline: causal models at scale
CN106251174A (en) Information recommendation method and device
CN109784959B (en) Target user prediction method and device, background server and storage medium
CN104834731A (en) Recommendation method and device for self-media information
US20150170271A1 (en) System and Method to Request and Collect Information to Determine Personalized Credit
KR20200003109A (en) Method and apparatus for setting sample weight, electronic device
CN106681921A (en) Method and device for achieving data parameterization
CN113157752B (en) Scientific and technological resource recommendation method and system based on user portrait and situation
CN109309596A (en) A kind of method for testing pressure, device and server
US20170288989A1 (en) Systems and Techniques for Determining Associations Between Multiple Types of Data in Large Data Sets
Ha et al. An analysis on information diffusion through BlogCast in a blogosphere
CN110175264A (en) Construction method, server and the computer readable storage medium of video user portrait
Pavlič et al. Using pseudo-observations for estimation in relative survival
CN110348745A (en) The ranking method and device of advertising channel
CN104050197A (en) Evaluation method and device for information retrieval system
CN110349013A (en) Risk control method and device
CN115345530A (en) Market address recommendation method, device and equipment and computer readable storage medium
CN113157922A (en) Network entity behavior evaluation and visualization method based on graph
Topalović et al. Evaluating the transferability of monthly water balance models under changing climate conditions
CN107067276A (en) Determine the method and device of object influences power
JP2009289172A (en) Conduct history analysis system and its method
CN109308660B (en) Credit assessment scoring model evaluation method, apparatus, device and storage medium
CN116932549A (en) Intelligent model-based platform data storage method, system, medium and equipment
CN106776757A (en) User completes the indicating means and device of Net silver operation
CN108694212A (en) A kind of processing method and processing device of sample object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant