CN107248086A - Advertisement placement auxiliary analysis method based on user electricity consumption behavior analysis
- Publication number: CN107248086A
- Application number: CN201710270555.1A
- Authority: CN (China)
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0269—Targeted advertisements based on user profile or attribute
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention discloses an advertisement placement auxiliary analysis method based on analysis of users' electricity consumption behavior. The method comprises: data collection; a series of noise-handling operations on the collected data; normalization; K-means clustering; analysis of the electricity usage characteristics of each class from the K-means result, to determine the consumption characteristics of each class of users; and, based on the clustering result, comparative analysis of the different electricity consumption behavior features of each class of electricity customers, applied to the business of assisting advertisement placement. The invention provides enterprise users with estimates of residents' spending power in different regions and predictions of the consumer goods residents are likely to pay attention to, helping enterprises target advertising for their products.
Description
Technical field
The present invention relates to an advertisement placement auxiliary analysis method based on user electricity consumption behavior analysis.
Background art
At present, through the marketing information system it has deployed and the gradual development of its power consumption information acquisition system, State Grid Corporation of China has accumulated massive electricity consumption data across all industries. With the development of smart grid technology and the growing adoption of devices such as smart gateways and smart sockets, collecting and storing user electricity data has become more convenient. These data show the basic characteristics of big data: large volume, varied data types, and high real-time requirements. They are currently used mainly for internal decision support in State Grid's marketing business, while data analysis services for government and enterprise customers remain very limited.
As "smart city" projects are carried out and studied in depth in pilot cities, cooperation across industries, regions, and departments is required to break down barriers to information use and achieve information interconnection. This places a new demand on the application of grid big data: combining customer electricity information with regional historical data to provide customized scenario analysis services for government and enterprise customers. These customized analysis scenarios are delivered to customers as analysis reports, data visualizations, and customized services, creating economic value for State Grid Corporation of China.
Customer electricity information is based on the power consumption information acquisition system and covers all electricity quantity data. According to preliminary estimates, three years of historical data for one provincial (municipal) company amount to about 50 TB, with about 20 TB added each year, which matches the big data characteristics of large scale, varied data types, and time-critical processing. Besides supporting decisions on improving internal business, these electricity data can also provide customized analysis services to government and enterprise users. For government, examples include analysis of regional residential vacancy rates and prediction of economic development trends to inform policy making. For enterprise customers, examples include commercial investment site selection and targeted advertising estimates that assist business investment decisions.
Summary of the invention
The object of the invention is to provide an advertisement placement auxiliary analysis method based on user electricity consumption behavior analysis. The method analyzes residential electricity consumption behavior in different regions, combines it with external data related to regional commercial consumption, builds a model by cluster analysis, segments residential electricity consumption features, and mines the consumption habits of residents in each region. It thereby provides enterprise users with estimates of residents' spending power in different regions and predictions of the consumer goods residents are likely to pay attention to, helping enterprises target advertising for their products.
The technical solution of the invention is:
An advertisement placement auxiliary analysis method based on user electricity consumption behavior analysis, characterized by comprising the following steps:
(1) Data collection: collect users' electricity consumption data according to business requirements;
(2) Perform a series of noise-handling operations on the collected data:
Outlier handling: outliers are identified mainly by two methods. The first is the Pauta criterion (the 3σ rule), which is simple to apply and is the most commonly used criterion for identifying and rejecting outliers. For data whose population x follows a normal distribution:
P(|x − μ| > 3σ) ≤ 0.003
where μ and σ are the mathematical expectation and standard deviation of the normal population, x is the observed value, and P is the probability that an observation lies more than 3 standard deviations from the mean on either side;
Values greater than μ + 3σ or less than μ − 3σ therefore occur with very small probability and can be rejected as outliers;
The second is the standardized-value method. For normally distributed data, the Z-scores obtained by standardization follow the standard normal distribution, so the method can identify outliers: data with a Z-score below −3 or above 3 are outliers. The standard score (Z-score) is:
Z = (x − μ)/σ
where μ and σ are the mathematical expectation and standard deviation of the normal population;
Filling missing values with the KNN algorithm: the k-nearest-neighbor filling method takes the characteristics of electricity data into account. It selects the K data points nearest to the missing value, assigns weights to them according to their distance, and sets the missing value to the corresponding weighted average. Filling missing values with the mean: the mean of the indicator after outlier handling is used to fill the missing values;
(3) Normalization
The electricity consumption behavior features of a user serve mainly to distinguish how the user's consumption trend changes across different times of day. Because electricity customers are wide-ranging and consumption differs greatly between users, the invention normalizes the data row by row to avoid the influence of differing magnitudes. The formula is:
P'(i,t) = (P(i,t) − min_t P(i,t)) / (max_t P(i,t) − min_t P(i,t))
where i denotes the i-th user and t is one of the 96 moments in a day. That is, the minimum power of the current row is subtracted from each power value, and the result is divided by the range of power in the current row, giving the normalized value;
(4) K-means clustering
Before clustering, a dimensionality reduction is applied. The acquisition system collects 96 points per day, but 4 adjacent points differ little, so the power value at each full hour is taken as representative, giving each user's power data after dimensionality reduction;
On the basis of the processing above, this step divides the users into classes as follows:
S1. Determine the optimal number of clusters N by the silhouette coefficient method;
S2. Randomly select N users from the data as centroids;
S3. Measure the distance from each remaining user to every centroid and assign the user to the class of the nearest centroid;
S4. Recalculate the centroid of each class using Euclidean distance;
S5. Repeat S3-S4 until the new centroids equal the previous centroids or differ from them by less than a specified threshold, at which point the algorithm terminates;
(5) From the K-means result of step (4), analyze the electricity usage characteristics of each class to determine the consumption characteristics of each class of users;
(6) Based on the clustering result, compare the different electricity consumption behavior features of each class of electricity customers and apply them to the business of assisting advertisement placement:
S1. Present the cluster analysis results;
S2. Analyze the user behavior features of each class, including consumption characteristics and the distribution of peaks and valleys;
S3. Business application for assisting advertisement placement:
Synthesize the above analysis of the behavior features of the different classes of users and categorize the main focus of advertisement placement.
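Step S1 of the clustering stage above selects the number of clusters N with the silhouette coefficient method. The following is a minimal sketch of that coefficient, assuming Euclidean distance between user profiles; the function names `euclid` and `silhouette` and the arguments `points` and `labels` are hypothetical and do not come from the invention.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette(points, labels):
    """Mean silhouette coefficient of a labelled clustering.

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    point's score is (b - a) / max(a, b). Singleton clusters score 0.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton: contributes 0 by convention
        a = sum(euclid(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(euclid(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)
```

Under these assumptions, one would cluster the data for each candidate N, compute `silhouette` for the resulting labels, and keep the N with the highest score.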
By analyzing residential electricity consumption behavior in different regions, combining it with external data related to regional commercial consumption, and modeling with cluster analysis, the invention segments residential electricity consumption features and mines the consumption habits of residents in each region. It thereby provides enterprise users with estimates of residents' spending power in different regions and predictions of the consumer goods residents are likely to pay attention to, helping enterprises target advertising for their products.
The main innovation of the invention is an application analysis framework built on the results of user electricity consumption behavior analysis, which puts those results to effective use. The invention addresses four aspects of user electricity consumption behavior analysis: validity, applicability, efficiency, and scalability of the element set to high dimensions.
Validity means mining all patterns that meet the actual business requirements without omission: the candidate patterns must cover all electricity customers, and every pattern in the result set must satisfy the definition.
Applicability concerns noise. The sequence data may themselves contain noise, and some points in a sequence may be wrong. If patterns are searched strictly in sequence order, some potential patterns may be missed: these patterns satisfy the definition, but because the sequence data contain noise they cannot be mined directly by strict matching. The invention therefore introduces additional constraints to improve the applicability of the mined patterns.
Efficiency matters because, to ensure the validity of the result, the algorithm must traverse the full search space, which becomes costly for larger data sets. The invention is therefore implemented on Hadoop to improve the running efficiency of the algorithm.
Scalability of the element set to high dimensions means that, drawing on different user electricity consumption behaviors, actual business data of multiple dimensions are gathered. As the actual business covered by the element set changes and grows, users' electricity consumption behavior features can support business reform and improvement, which gives the resulting behavior analysis good scalability.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawings and an embodiment.
Fig. 1 is the overall framework of the method of the invention.
Fig. 2 is the overall display of the clustering result.
Fig. 3 shows the number of users in each class.
Fig. 4 shows the electricity consumption profile of class 1 users.
Fig. 5 shows the electricity consumption profile of class 2 users.
Fig. 6 shows the electricity consumption profile of class 3 users.
Fig. 7 shows the electricity consumption profile of class 4 users.
Fig. 8 shows the electricity consumption profile of class 5 users.
Fig. 9 shows the electricity consumption profile of class 6 users.
Fig. 10 shows the electricity consumption profile of class 7 users.
Fig. 11 shows the electricity consumption profile of class 8 users.
Fig. 12 shows the electricity consumption profile of class 9 users.
Fig. 13 shows the electricity consumption profile of class 10 users.
Embodiment
The electricity consumption information of transformer-district users is collected. In the collected data the dates are usually stored as columns. Following the approach of the technical solution above, the data are first ordered by date from earliest to most recent, and the dates are converted into row labels.
Starting from the acquisition data, the data are cleaned and organized by denoising, deduplication, and removal of empty values, and then normalized. Finally, cluster analysis of the electricity data classifies user electricity consumption behavior, the features of the different classes are identified, and targeted analysis and advertisement placement decisions are made.
1. Data collection: collect users' electricity consumption data according to business requirements, as in the following table:
Table 1: Raw data
where P_jt^i (i = 1, 2, ..., m; j = 1, 2, ..., n; t = 1, 2, ..., 96) is the real-time power of user i at the t-th moment of day j; id_i (i = 1, 2, ..., m) is the unique identifier of user i; and date_j (j = 1, 2, ..., n) is the date of day j.
2. Some of the data collected in step 1 are empty, and some were acquired abnormally because of external interference or faults in the equipment itself, so a series of noise-handling operations is required.
Outlier handling: outliers are identified mainly by two methods. The first is the Pauta criterion (the 3σ rule), which is simple to apply and is the most commonly used criterion for identifying and rejecting outliers. For data whose population x follows a normal distribution:
P(|x − μ| > 3σ) ≤ 0.003
where μ and σ are the mathematical expectation and standard deviation of the normal population, x is the observed value, and P is the probability that an observation lies more than 3 standard deviations from the mean on either side.
Values greater than μ + 3σ or less than μ − 3σ therefore occur with very small probability and can be rejected as outliers.
The second is the standardized-value (Z-score) method. For normally distributed data, the Z-scores obtained by standardization follow the standard normal distribution, so the method can identify outliers: data with a Z-score below −3 or above 3 are treated as outliers. The standard score (Z-score) is:
Z = (x − μ)/σ
where μ and σ are the mathematical expectation and standard deviation of the normal population. In addition, statistical, classification-based, and distance-based methods can also be applied, depending on the problem.
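The two criteria above amount to the same test, since |Z| > 3 is equivalent to |x − μ| > 3σ. A minimal sketch follows, assuming μ and σ are estimated from the readings themselves; the function name `flag_outliers` and the parameter `z_limit` are hypothetical.

```python
import statistics


def flag_outliers(values, z_limit=3.0):
    """Return one boolean per reading: True where |Z| exceeds z_limit.

    mu and sigma are estimated from the sample itself (an assumption;
    the text defines them for the normal population).
    """
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        # All readings identical: nothing can be flagged.
        return [False] * len(values)
    return [abs((v - mu) / sigma) > z_limit for v in values]


readings = [1.0] * 20 + [50.0]   # one obviously abnormal reading
flags = flag_outliers(readings)
cleaned = [v for v, bad in zip(readings, flags) if not bad]
```

Note that a single extreme reading also inflates the estimated σ, so on very short series the threshold of 3 may never be reached.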
Filling missing values with the KNN algorithm: the k-nearest-neighbor filling method takes the characteristics of electricity data into account. It selects the K data points nearest to the missing value, assigns weights to them according to their distance, and sets the missing value to the corresponding weighted average. Filling missing values with the mean: the mean of the indicator after outlier handling is used to fill the missing values. This method is convenient and relies mainly on mathematical statistics.
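Both filling strategies can be sketched as follows. The text does not say how neighbor distance is measured, so this sketch assumes the neighbors are the nearest observed readings in the same series, with distance equal to the gap in time index and weights equal to the inverse of that gap. The names `knn_fill` and `mean_fill` are hypothetical.

```python
def knn_fill(series, k=4):
    """Fill None entries with an inverse-distance weighted mean of the
    k nearest observed readings (distance = gap in time index)."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        nearest = sorted(observed, key=lambda iv: abs(iv[0] - i))[:k]
        # The gap is never zero: index i is missing, index j is observed.
        weights = [1.0 / abs(j - i) for j, _ in nearest]
        weighted_sum = sum(w * val for w, (_, val) in zip(weights, nearest))
        filled[i] = weighted_sum / sum(weights)
    return filled


def mean_fill(series):
    """Fill None entries with the mean of the observed readings."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]
```

The weighted variant keeps the local shape of the load curve, while the mean variant is simpler but flattens it.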
The processed data are shown in Table 2:
Table 2: Intermediate data after processing
where each entry is the daily-average power of user i at the k-th moment.
3. Normalization
The electricity consumption behavior features of a user serve mainly to distinguish how the user's consumption trend changes across different times of day. Because electricity customers are wide-ranging and consumption differs greatly between users, the invention normalizes the data row by row to avoid the influence of differing magnitudes. The formula is:
P'(i,t) = (P(i,t) − min_t P(i,t)) / (max_t P(i,t) − min_t P(i,t))
(where i denotes the i-th user and t is one of the 96 moments in a day. The minimum of the current row is subtracted from each power value, and the result is divided by the range of power in the current row, giving the normalized value.)
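The row-wise normalization described above can be sketched as follows. The function name `normalize_row` is hypothetical, and mapping a completely flat row to zeros is an assumption, since the text does not cover the case where the range is zero.

```python
def normalize_row(row):
    """Min-max normalize one user's power readings to the range [0, 1]."""
    lo, hi = min(row), max(row)
    if hi == lo:
        # Flat profile: the range is zero, so return zeros (assumption).
        return [0.0] * len(row)
    return [(p - lo) / (hi - lo) for p in row]


# Two users with the same daily shape but different magnitudes
# end up with identical normalized profiles.
small_user = normalize_row([2.0, 4.0, 6.0])
large_user = normalize_row([200.0, 400.0, 600.0])
```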
Table 3: Normalized data
where id_i (i = 1, 2, ..., m) is the unique identifier of user i, and P_i^k (i = 1, 2, 3, ..., 96; k = 1, 2, 3, ..., m) is the normalized power data of each user.
4. K-means clustering
Before clustering, a dimensionality reduction is applied. The acquisition system collects 96 points per day, but 4 adjacent points differ little, so the power value at each full hour is taken as representative. The normalized data after this reduction form Table 4:
Table 4: Data after dimensionality reduction
where id_i (i = 1, 2, ..., m) is the unique identifier of user i, and P_i^k (i = 1, 5, 9, ..., 93; k = 1, 2, 3, ..., m) is the power data of each user after dimensionality reduction.
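The reduction from 96 quarter-hour points to 24 hourly points keeps points 1, 5, 9, ..., 93, as the index range of Table 4 indicates. A minimal sketch, with the hypothetical function name `hourly_points`:

```python
def hourly_points(day96):
    """Keep every fourth reading of a 96-point day: points 1, 5, ..., 93
    in the 1-based numbering of the text (indices 0, 4, ..., 92 here)."""
    if len(day96) != 96:
        raise ValueError("expected 96 quarter-hour readings")
    return day96[::4]


reduced = hourly_points(list(range(1, 97)))  # 24 values: 1, 5, 9, ..., 93
```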
On the basis of the processing above, this step divides the users into classes as follows:
S1. Determine the optimal number of clusters N by the silhouette coefficient method;
S2. Randomly select N users from the data as centroids;
S3. Measure the distance from each remaining user to every centroid and assign the user to the class of the nearest centroid;
S4. Recalculate the centroid of each class using Euclidean distance;
S5. Repeat S3-S4 until the new centroids equal the previous centroids or differ from them by less than a specified threshold, at which point the algorithm terminates.
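Steps S2 to S5 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by the invention: the function name `kmeans`, the parameters `tol`, `max_iter` and `seed`, and the handling of empty clusters are hypothetical, and the number of clusters is taken as given (S1 is not shown).

```python
import math
import random


def kmeans(points, n_clusters, tol=1e-6, max_iter=100, seed=0):
    """Cluster profiles with K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # S2: pick n_clusters users at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S3: assign every user to the nearest centroid (Euclidean distance).
        labels = [
            min(range(n_clusters), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # S4: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Empty cluster: keep the old centroid (assumption).
                new_centroids.append(centroids[c])
        # S5: stop once no centroid moves by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids
```

In the setting of the embodiment, `points` would hold one 24-value normalized profile per user, and each returned centroid would be the average daily curve of a class.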
5. From the K-means result of step 4, analyze the electricity usage characteristics of each class to determine the consumption characteristics of each class of users.
6. Combine the result with actual business to improve service quality.
Based on the clustering result, the different electricity consumption behavior features of each class of electricity customers can be compared. The invention applies this comparison to the business of assisting advertisement placement.
S1. Presentation of the cluster analysis results
Following the analysis process above, the clustering result shown in Fig. 2 is obtained.
(In Fig. 2, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.)
In Fig. 3, the abscissa gives the classes obtained by clustering, and the ordinate is the number of users in each class.
As Fig. 3 shows, electricity consumption behavior differs considerably between classes. The behavior features of each class of users are described below, class by class.
S2. Analyze the user behavior features of each class, including consumption characteristics and the distribution of peaks and valleys.
In Fig. 4, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 1: these users show a small peak in the morning, at noon, and in the evening. The evening peak is clearly higher than the morning peak, the noon peak is the lowest, and consumption at other times is moderate. They are presumed to be ordinary working households with an elderly person at home, or households with school-age children where only the children are home at noon.
In Fig. 5, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 2: consumption drops sharply after 9 p.m. It is generally high from 9 a.m. to 9 p.m., with a slight dip at 1 p.m. These users are presumed to be processing plants running machinery.
In Fig. 6, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 3: consumption fluctuates around 0.525 throughout, with a variation of no more than 0.1, so daily consumption is steady. These users are presumed to be processing plants that operate around the clock.
In Fig. 7, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 4: consumption is low from 8 a.m. to 6 p.m. and stays at its peak from 8 p.m. to 6 a.m. the next day. These users are presumed to be processing plants that shift consumption to off-peak hours and run machinery.
In Fig. 8, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 5: consumption peaks in the intervals 8:00-12:00 and 14:00-18:00, drops clearly at noon, and is low in the remaining periods. These users are presumed to be general office premises such as office buildings.
In Fig. 9, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 6: the profile resembles that of class 5 and is also double-peaked, but afternoon consumption differs considerably from morning consumption. This suggests that these users are easily affected by other factors such as temperature, which cause consumption to drop markedly.
In Fig. 10, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 7: consumption is low, mostly below 0.03, and steady overall, which indicates that the occupants are seldom at home. These users are presumed to be vacant (no-load) premises.
In Fig. 11, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 8: the profile is somewhat like that of class 4, with low daytime consumption and higher evening consumption, plus a small peak at 12 noon that is inferred to be cooking. These users are presumed to resemble class 4 users, with some of them cooking at noon.
In Fig. 12, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 9: consumption reaches its peak only after 7 p.m., with small peaks in the morning and at noon. These users are presumed to be young working people who are out in the morning and at noon and use more electricity after returning home in the evening.
In Fig. 13, the abscissa P1, P2, ..., P24 gives the 24 hourly time points of a day, and the ordinate is power.
Class 10: the profile and trend are similar to those of class 1, but morning consumption is higher than evening consumption. These users are presumed to prefer doing housework and similar activities in the morning.
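The class descriptions above are read off each class curve by locating its peaks and valleys. A minimal sketch of how peak hours could be extracted from a 24-point class centroid follows; the function name `peak_hours` and the threshold `min_height` are hypothetical, and the text does not state that peaks were found this way.

```python
def peak_hours(profile24, min_height=0.5):
    """Return the 0-based positions of local maxima of at least min_height.

    Position h corresponds to time point P(h+1) in the figures. The first
    and last points are not tested because they lack a neighbor on one side.
    """
    peaks = []
    for h in range(1, len(profile24) - 1):
        left, mid, right = profile24[h - 1], profile24[h], profile24[h + 1]
        if mid >= min_height and mid > left and mid >= right:
            peaks.append(h)
    return peaks


# Illustrative curve only: small morning and noon bumps, high evening peak.
curve = [0.1] * 24
curve[7], curve[12], curve[20] = 0.6, 0.4, 1.0
main_peaks = peak_hours(curve)                  # morning and evening
all_peaks = peak_hours(curve, min_height=0.3)   # also the noon bump
```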
S3. Business application for assisting advertisement placement.
Synthesizing the above analysis of the behavior features of the different classes of users, the main focus of advertisement placement is categorized as follows:
The categories above are a framework-level division. Dividing them further into advertisement types for different time periods requires more customer profile data. Distinguishing advertisements for holidays from those for non-holidays is a further direction for assisting advertisement placement.
Claims (6)
1. An advertisement placement aided analysis method based on analysis of users' electricity consumption behavior, characterized in that it comprises the following steps:
(1) Data collection: the electricity consumption data of users are collected in accordance with business requirements;
(2) Data cleaning: a series of operations is applied to the noise in the collected data:
Outlier processing: outliers are judged mainly by two methods. The first is the Pauta criterion (the 3σ rule), which is simple and easy to apply and is the most commonly used criterion for identifying and rejecting outliers. For data x drawn from a normally distributed population:
P(|x‐μ|>3σ)≤0.003
where μ and σ denote the mathematical expectation and standard deviation of the normal population, x is the actual observed value, and P is the probability that an observation lies more than 3 standard deviations from the mean on either side;
data greater than μ+3σ or less than μ-3σ therefore occur with very small probability, so such data can be rejected as outliers;
The second is the standardized-value method: after Z-score standardization, normally distributed data follow the standard normal distribution, and outliers can be identified as data whose Z score is below -3 or above 3. The standard score (Z score) formula is:
Z=(x- μ)/σ
where μ and σ denote the mathematical expectation and standard deviation of the normal population, respectively;
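The two outlier rules above can be sketched in a few lines. This is only an illustration, assuming that μ and σ are estimated from the sample itself (the text does not say how they are obtained); the function names `z_scores` and `split_outliers` are hypothetical. Note that |x-μ|>3σ and |Z|>3 select the same observations, so one function covers both rules.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the Z score (x - mu) / sigma of every observation.

    Assumption: mu and sigma are estimated from the sample itself.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All observations are identical, so none deviates from the mean.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def split_outliers(values, limit=3.0):
    """Separate observations with |Z| > limit (the 3-sigma rule) from the rest."""
    scores = z_scores(values)
    kept = [x for x, z in zip(values, scores) if abs(z) <= limit]
    outliers = [x for x, z in zip(values, scores) if abs(z) > limit]
    return kept, outliers


# Twenty ordinary readings and one implausibly large one.
readings = [1.0] * 20 + [50.0]
kept, outliers = split_outliers(readings)
```

With very few observations a single extreme value can never reach |Z| > 3, so the rule is only meaningful on reasonably long series.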
Filling missing values with the KNN algorithm: this missing-data filling method is based on the K nearest neighbors and takes the characteristics of electric power data into account. A chosen number of data points close to the missing value are used to calculate it: the K nearest data points are selected, weights are set for data at different distances, and the missing value is determined from the corresponding weighted average. Filling missing values with the mean: the mean of the indicator after outlier processing is used to fill the missing values;
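A minimal sketch of the K-nearest-neighbor filling described above follows. The text does not define the distance or the weights, so two assumptions are made here: distance is the gap in time index between the missing point and a known point, and each neighbor is weighted by the inverse of that distance. The name `knn_fill` is hypothetical.

```python
def knn_fill(series, k=4):
    """Fill None entries with a weighted mean of the k nearest known points.

    Assumptions (not stated in the source): distance is the difference in
    time index, and the weight of a neighbor is 1 / distance.
    """
    known = [(i, v) for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # The k observed points closest in time to the missing point.
        nearest = sorted(known, key=lambda item: abs(item[0] - i))[:k]
        weights = [1.0 / abs(j - i) for j, _ in nearest]
        weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearest))
        filled[i] = weighted_sum / sum(weights)
    return filled
```

For example, `knn_fill([2.0, None, None, 8.0], k=2)` gives `[2.0, 4.0, 6.0, 8.0]`, because each gap is pulled toward the closer of its two known neighbors.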
(3) Normalization
The electricity consumption behavior characteristics of users serve mainly to distinguish how each user's consumption trend varies over time. Because electricity customers cover a wide range and the consumption of different users differs greatly, the present invention normalizes the data row by row to avoid the influence of the magnitude of the consumption data. The formula is:
$$\overline{\overline{P_t^i}} = \frac{\overline{P_t^i} - \min_t \overline{P_t^i}}{\max_t \overline{P_t^i} - \min_t \overline{P_t^i}} \in [0, 1]$$
where i denotes the i-th user and t is one of the 96 time points in a day. The formula subtracts the minimum power of the current row from each power value and then divides by the range of power variation in the current row, which gives the normalized value;
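The row-wise normalization can be illustrated as below, where one row holds the power values of a single user. The handling of a completely flat row (maximum equal to minimum) is not covered by the text; mapping it to zeros is an assumption made here to avoid division by zero, and `normalize_row` is a hypothetical name.

```python
def normalize_row(powers):
    """Min-max normalize one user's power values into [0, 1]."""
    lowest, highest = min(powers), max(powers)
    if highest == lowest:
        # Assumption: a flat profile has no trend, so map it to all zeros.
        return [0.0] * len(powers)
    return [(p - lowest) / (highest - lowest) for p in powers]


# A small user and a large user with the same shape get the same profile.
small_user = normalize_row([2.0, 4.0, 6.0])
large_user = normalize_row([200.0, 400.0, 600.0])
```

Both calls return `[0.0, 0.5, 1.0]`, which is the point of the step: users are compared by the shape of their consumption curve rather than by its magnitude.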
(4) K-means clustering
Before clustering starts, a dimensionality reduction is also performed. The electricity consumption information acquisition system collects 96 data points per day, but any 4 adjacent points among these 96 differ little, so the power data at the on-the-hour points are taken as representative, giving each user's power data after dimensionality reduction;
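The reduction from 96 quarter-hour points to 24 hourly points amounts to keeping every fourth sample. The sketch below assumes that the first sample of the day falls on the hour (index 0 is 00:00); if the acquisition system numbers its points differently, the starting offset would change. `hourly_points` is a hypothetical name.

```python
def hourly_points(day_profile, step=4):
    """Keep one sample per hour from a quarter-hourly daily profile.

    Assumption: index 0 is an on-the-hour sample, so indices 0, 4, 8, ...
    are the on-the-hour points.
    """
    if len(day_profile) % step != 0:
        raise ValueError("profile length must be a multiple of step")
    return day_profile[::step]


quarter_hourly = [float(i) for i in range(96)]
hourly = hourly_points(quarter_hourly)
```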
On the basis of the above processing, this step divides the users into categories. The specific steps are as follows:
S1. Determine the optimal number of clusters N by the silhouette coefficient method;
S2. Randomly select N users from the above data as centroids;
S3. Measure the distance from each remaining user to each centroid and assign the user to the class of the nearest centroid;
S4. Recalculate the centroid of each class according to the Euclidean distance method;
S5. Iterate S3-S4 until the new centroids are equal to the original centroids or the change is below a specified threshold, at which point the algorithm terminates;
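Steps S1 to S5 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the patented implementation: the new centroid in S4 is taken to be the coordinate-wise mean of the class members (the usual choice with Euclidean distance), the candidate range for N and the fixed random seed are arbitrary, and the names `kmeans`, `silhouette` and `best_cluster_count` are hypothetical.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, n_clusters, tol=1e-9, max_iter=100, seed=0):
    rng = random.Random(seed)
    # S2: pick n_clusters users at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S3: assign every user to the class of the nearest centroid.
        labels = [min(range(n_clusters), key=lambda c: distance(p, centroids[c]))
                  for p in points]
        # S4: recompute each centroid as the mean of its members (assumption).
        new_centroids = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty class in place
        # S5: stop once no centroid moved by more than the threshold.
        shift = max(distance(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton classes score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(distance(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())    # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_cluster_count(points, candidates=range(2, 6)):
    # S1: keep the candidate N whose clustering has the highest silhouette.
    return max(candidates, key=lambda n: silhouette(points, kmeans(points, n)[0]))
```

In practice S1 wraps S2 to S5: the clustering is run once per candidate N and the N with the highest mean silhouette coefficient is retained.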
(5) Based on the K-means clustering result obtained in step (4), analyze the electricity consumption characteristics of the users in each class and determine the power consumption features of each class of users;
(6) Based on the above clustering result, compare and analyze the different electricity consumption behavior characteristics of each class of electricity customers in order to assist business promotion for advertisement placement:
S1. Present the cluster analysis result;
S2. Analyze the user behavior characteristics corresponding to each class, including the electricity consumption characteristics and the distribution of peaks and valleys;
S3. Business promotion to assist advertisement placement:
Summarizing the above analysis of the behavioral characteristics of the different classes of users, classify the main focuses of advertisement placement.
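One way to read steps (5) and (6) is to average the normalized profiles within each class and then locate the peak of each average curve. The sketch below shows that reading only; the text does not prescribe how the per-class characteristics are computed, and `cluster_profiles` and `peak_hour` are hypothetical names.

```python
def cluster_profiles(profiles, labels):
    """Average the profiles of all users that share a class label."""
    groups = {}
    for profile, label in zip(profiles, labels):
        groups.setdefault(label, []).append(profile)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in groups.items()}


def peak_hour(profile):
    """Index of the time point with the highest average power."""
    return max(range(len(profile)), key=lambda h: profile[h])


# Three users with three time points each; users 0 and 1 share a class.
profiles = [[0.0, 1.0, 0.0], [0.0, 3.0, 0.0], [5.0, 0.0, 0.0]]
averages = cluster_profiles(profiles, [0, 0, 1])
peaks = {label: peak_hour(curve) for label, curve in averages.items()}
```

On 24-point hourly profiles, the peak index corresponds directly to the hour at which a class consumes the most, which is the kind of observation the figure discussions rely on.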
2. The advertisement placement aided analysis method based on analysis of users' electricity consumption behavior according to claim 1, characterized in that: starting from the electricity consumption acquisition data, the data are cleaned and organized by denoising, deduplication and removal of null values, and are then normalized; finally, cluster analysis of the electricity consumption data is used to classify users' electricity consumption behavior, the characteristics of the different classes are identified, and targeted analysis and advertisement placement decisions are made.
3. The advertisement placement aided analysis method based on analysis of users' electricity consumption behavior according to claim 1, characterized in that: to specify the types of advertisement for different time periods, more customer profile data must be combined.
4. The advertisement placement aided analysis method based on analysis of users' electricity consumption behavior according to claim 1, characterized in that: advertisement placement is further assisted by dividing advertisements between holidays and non-holidays.
5. The advertisement placement aided analysis method based on analysis of users' electricity consumption behavior according to claim 1, characterized in that: the different classes of users include the family class, normal commercial class, uninterrupted industrial class, off-peak industrial class, office-worker class, no-load class, industry + family, young office-worker class, and family class.
6. The advertisement placement aided analysis method based on analysis of users' electricity consumption behavior according to claim 1, characterized in that: the electricity consumption information of the users in a transformer district is collected; in the collected data the dates are stored in column form; following the implementation approach of the above technical solution, the data are first arranged by date from the earliest to the most recent, and the dates are converted into row labels.
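The reshaping in claim 6 can be pictured with a small sketch. The layout of the collected table is not given in the text, so this assumes a hypothetical wide table that maps each date column (an ISO date string) to that day's list of readings; `dates_to_rows` is likewise a hypothetical name.

```python
from datetime import date


def dates_to_rows(wide_table):
    """Turn date columns into date-labeled rows, earliest date first.

    Assumption: wide_table maps an ISO date string (a column header)
    to the list of readings recorded on that day.
    """
    ordered = sorted(wide_table, key=date.fromisoformat)
    return [(day, wide_table[day]) for day in ordered]


collected = {"2017-02-02": [1.0, 1.5], "2017-01-31": [2.0, 2.5]}
rows = dates_to_rows(collected)
```

Each resulting row is then one daily profile, which is the shape the row-wise normalization and the clustering operate on.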
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710092258 | 2017-02-21 | ||
CN2017100922582 | 2017-02-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107248086A true CN107248086A (en) | 2017-10-13 |
Family
ID=60016980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710270555.1A Pending CN107248086A (en) | 2017-02-21 | 2017-04-24 | Advertisement putting aided analysis method based on user power utilization behavioural analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107248086A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108038178A (en) * | 2017-12-07 | 2018-05-15 | 北京邮电大学 | A kind of user power utilization behavior visual analysis method, device and electronic equipment |
CN108564390A (en) * | 2017-12-29 | 2018-09-21 | 广东金赋科技股份有限公司 | Data trend analysis method, electronic equipment and the computer storage media of a large amount of individuals |
CN108776939A (en) * | 2018-06-07 | 2018-11-09 | 上海电气分布式能源科技有限公司 | The analysis method and system of user power utilization behavior |
CN109949181A (en) * | 2019-03-22 | 2019-06-28 | 华立科技股份有限公司 | The power grid type judgement method and device of algorithm are closed on based on KNN |
CN109961197A (en) * | 2017-12-22 | 2019-07-02 | 中国电力科学研究院有限公司 | A kind of regional complex electricity consumption evaluation method and system based on power consumer label |
CN110298552A (en) * | 2019-05-31 | 2019-10-01 | 国网上海市电力公司 | A kind of power distribution network individual power method for detecting abnormality of combination history electrical feature |
CN110347957A (en) * | 2019-06-12 | 2019-10-18 | 合肥大多数信息科技有限公司 | One kind is from media power supply service method for pushing |
CN110363596A (en) * | 2019-07-24 | 2019-10-22 | 金禧 | A kind of accurate advertisement put-on method and system |
CN110967251A (en) * | 2019-12-02 | 2020-04-07 | 湘潭大学 | Method for identifying damage mode of wind power blade |
CN111353814A (en) * | 2020-02-24 | 2020-06-30 | 上海佳投互联网技术集团有限公司 | Method and system for identifying invalid advertisement users |
CN111353795A (en) * | 2018-12-20 | 2020-06-30 | 北京沃东天骏信息技术有限公司 | Advertisement effect measuring method, device, medium and equipment |
US11107097B2 (en) | 2019-08-29 | 2021-08-31 | Honda Motor Co., Ltd. | System and method for completing trend mapping using similarity scoring |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982489A (en) * | 2012-11-23 | 2013-03-20 | 广东电网公司电力科学研究院 | Power customer online grouping method based on mass measurement data |
CN103093394A (en) * | 2013-01-23 | 2013-05-08 | 广东电网公司信息中心 | Clustering fusion method based on user electrical load data subdivision |
CN103942606A (en) * | 2014-03-13 | 2014-07-23 | 国家电网公司 | Residential electricity consumption customer segmentation method based on fruit fly intelligent optimization algorithm |
US20160274609A1 (en) * | 2015-03-18 | 2016-09-22 | Onzo Limited | Classifying utility consumption of consumers |
-
2017
- 2017-04-24 CN CN201710270555.1A patent/CN107248086A/en active Pending
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108038178A (en) * | 2017-12-07 | 2018-05-15 | 北京邮电大学 | A kind of user power utilization behavior visual analysis method, device and electronic equipment |
CN109961197B (en) * | 2017-12-22 | 2021-08-27 | 中国电力科学研究院有限公司 | Regional comprehensive power utilization evaluation method and system based on power user label |
CN109961197A (en) * | 2017-12-22 | 2019-07-02 | 中国电力科学研究院有限公司 | A kind of regional complex electricity consumption evaluation method and system based on power consumer label |
CN108564390A (en) * | 2017-12-29 | 2018-09-21 | 广东金赋科技股份有限公司 | Data trend analysis method, electronic equipment and the computer storage media of a large amount of individuals |
CN108776939A (en) * | 2018-06-07 | 2018-11-09 | 上海电气分布式能源科技有限公司 | The analysis method and system of user power utilization behavior |
CN111353795A (en) * | 2018-12-20 | 2020-06-30 | 北京沃东天骏信息技术有限公司 | Advertisement effect measuring method, device, medium and equipment |
CN109949181A (en) * | 2019-03-22 | 2019-06-28 | 华立科技股份有限公司 | The power grid type judgement method and device of algorithm are closed on based on KNN |
CN109949181B (en) * | 2019-03-22 | 2021-05-25 | 华立科技股份有限公司 | Power grid type judgment method and device based on KNN proximity algorithm |
CN110298552A (en) * | 2019-05-31 | 2019-10-01 | 国网上海市电力公司 | A kind of power distribution network individual power method for detecting abnormality of combination history electrical feature |
CN110298552B (en) * | 2019-05-31 | 2023-12-01 | 国网上海市电力公司 | Power distribution network individual power abnormality detection method combining historical electricity utilization characteristics |
CN110347957A (en) * | 2019-06-12 | 2019-10-18 | 合肥大多数信息科技有限公司 | One kind is from media power supply service method for pushing |
CN110363596A (en) * | 2019-07-24 | 2019-10-22 | 金禧 | A kind of accurate advertisement put-on method and system |
US11107097B2 (en) | 2019-08-29 | 2021-08-31 | Honda Motor Co., Ltd. | System and method for completing trend mapping using similarity scoring |
CN110967251A (en) * | 2019-12-02 | 2020-04-07 | 湘潭大学 | Method for identifying damage mode of wind power blade |
CN110967251B (en) * | 2019-12-02 | 2023-07-11 | 湘潭大学 | Method for identifying damage mode of wind power blade |
CN111353814A (en) * | 2020-02-24 | 2020-06-30 | 上海佳投互联网技术集团有限公司 | Method and system for identifying invalid advertisement users |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107248086A (en) | Advertisement putting aided analysis method based on user power utilization behavioural analysis | |
Rajabi et al. | A comparative study of clustering techniques for electrical load pattern segmentation | |
Benítez et al. | Dynamic clustering segmentation applied to load profiles of energy consumption from Spanish customers | |
Yildiz et al. | Household electricity load forecasting using historical smart meter data with clustering and classification techniques | |
Rajabi et al. | A pattern recognition methodology for analyzing residential customers load data and targeting demand response applications | |
Alahakoon et al. | Advanced analytics for harnessing the power of smart meter big data | |
Choksi et al. | Feature based clustering technique for investigation of domestic load profiles and probabilistic variation assessment: Smart meter dataset | |
CN109446193A (en) | It opposes electricity-stealing model generating method and device | |
CN105117810A (en) | Residential electricity consumption mid-term load prediction method under multistep electricity price mechanism | |
CN104992239A (en) | Correlation coefficient-based industry electricity consumption law forecasting method | |
Gajowniczek et al. | Electricity peak demand classification with artificial neural networks | |
CN107784518A (en) | A kind of power customer divided method based on multidimensional index | |
CN114004296A (en) | Method and system for reversely extracting monitoring points based on power load characteristics | |
Park et al. | A novel load image profile-based electricity load clustering methodology | |
Lin et al. | A hybrid economic indices based short-term load forecasting system | |
CN116976707B (en) | User electricity consumption data anomaly analysis method and system based on electricity consumption data acquisition | |
Kojury-Naftchali et al. | Identifying susceptible consumers for demand response and energy efficiency policies by time-series analysis and supplementary approaches | |
CN111126499A (en) | Secondary clustering-based power consumption behavior pattern classification method | |
Xu et al. | Spatial-temporal load forecasting using AMI data | |
Zufferey et al. | Unsupervised learning methods for power system data analysis | |
CN112508260B (en) | Medium-and-long-term load prediction method and device of distribution transformer based on comparative learning | |
Abrishami et al. | Using real-world store data for foot traffic forecasting | |
CN112288496A (en) | Load classification calculation method and tracking analysis method for power industry | |
Wang et al. | Application of clustering technique to electricity customer classification for load forecasting | |
CN106651425A (en) | User electricity stealing and electricity leakage behavior monitoring method considering business expanding installation data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20171013 |