CN107248086A - Advertisement putting aided analysis method based on user power utilization behavioural analysis - Google Patents


Info

Publication number
CN107248086A
Authority
CN
China
Prior art keywords
data
user
analysis
mover
power utilization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710270555.1A
Other languages
Chinese (zh)
Inventor
刘飞
王栋
毛艳芳
胡斌
杨佩
冯鹏
蒋亮
朱喆华
江陈桢
季润阳
李珺涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Global Energy Interconnection Research Institute
Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Global Energy Interconnection Research Institute
Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Global Energy Interconnection Research Institute, Nantong Power Supply Co of State Grid Jiangsu Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Publication of CN107248086A publication Critical patent/CN107248086A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an advertisement placement auxiliary analysis method based on analysis of users' electricity-consumption behaviour. The method comprises: collecting data; carrying out a series of operations on the noise in the collected data; normalizing the data; performing K-means clustering; analysing the electricity-consumption curve of each class from the K-means result to determine the consumption characteristics of each class of users; and, based on the clustering result, comparing the consumption behaviour of the different classes of electricity customers in order to support advertisement placement. The invention provides enterprise users with estimates of residents' consumption capacity in different regions and predictions of the consumer goods residents pay attention to, helping enterprises target advertising for their products.

Description

Advertisement putting aided analysis method based on user power utilization behavioural analysis
Technical field
The present invention relates to an advertisement placement auxiliary analysis method based on analysis of users' electricity-consumption behaviour.
Background art
With the gradual development of the marketing information system and the electricity-consumption information acquisition system used by State Grid Corporation of China, massive amounts of electricity-consumption information have been accumulated across all industries. With the development of smart-grid technology and the spread of smart gateways and smart sockets, collecting and storing users' electricity-consumption data has become more convenient. These data have the basic characteristics of big data: large volume, diverse data types and high real-time requirements. At present they are mainly used to support internal marketing decisions within State Grid, and data-analysis services for government and enterprise customers remain very limited.
As "smart city" projects are launched and studied in depth in pilot cities, cooperation across industries, regions and departments is required to break down barriers to the use of information and to interconnect it. This creates a new demand on the application of power-grid big data: combining customers' electricity-consumption information with regional historical data to provide customized scenario-analysis services for government and enterprise customers. These customized analysis scenarios are delivered to customers as analysis reports, data visualizations and customized services, creating economic value for State Grid Corporation of China.
For customer electricity-consumption information from the acquisition system, a preliminary estimate covering all electricity data puts three years of historical data for one provincial (municipal) company at about 50 TB, growing by about 20 TB per year. This matches the characteristics of big data: large scale, diverse data types and high timeliness requirements for processing. Besides supporting decisions on improving internal business, these data can also be used to provide customized analysis services to government and enterprise users. For government, examples are analysis of regional residential vacancy rates and forecasting of economic trends to guide policy making; for enterprise customers, examples are site selection for commercial investment and estimates for targeted advertisement placement to support investment decisions.
Summary of the invention
The object of the invention is to provide an advertisement placement auxiliary analysis method based on analysis of users' electricity-consumption behaviour. The method analyses residential electricity-consumption behaviour in different regions, combines it with external data related to regional commercial consumption, builds a model by cluster analysis, subdivides the characteristics of residential electricity consumption and mines the consumption habits of residents in each region. It thereby provides enterprise users with estimates of residents' consumption capacity in different regions and predictions of the consumer goods they pay attention to, helping enterprises target advertising for their products.
The technical solution of the invention is:
An advertisement placement auxiliary analysis method based on analysis of users' electricity-consumption behaviour, characterized by comprising the following steps:
(1) Data collection: collect users' electricity-consumption data according to business requirements;
(2) For the collected data, carry out a series of operations on the noise in the data:
Outlier handling: there are two main methods for identifying outliers. One is the Pauta criterion (3σ rule), which is simple and easy to apply and is the most commonly used criterion for identifying and rejecting outliers. For data whose population x follows a normal distribution:
P(|x − μ| > 3σ) ≤ 0.003
where μ and σ are the mathematical expectation and standard deviation of the normal population, x is the actual observed value, and P is the probability that an observation lies more than three standard deviations from the mean;
The probability of a value greater than μ + 3σ or less than μ − 3σ is therefore very small, so such values can be rejected as outliers;
The other is the standardized-value method: after Z-score standardization the data follow a normal distribution, so this method can identify outliers. Data with a Z score below −3 or above 3 are outliers. The standard score (Z score) formula is:
Z = (x − μ)/σ
where μ and σ are the mathematical expectation and standard deviation of the normal population;
Filling missing values with the KNN algorithm: the K-nearest-neighbour filling method takes the characteristics of electric power data into account. A number K of neighbouring data points around the missing value is chosen, the K nearest neighbours are selected, weights are assigned to them according to their distance, and the missing value is determined as the corresponding weighted average. Filling missing values with the mean: the mean of the indicator after outlier handling is used to fill the missing value;
(3) Normalization
The purpose of analysing users' electricity-consumption behaviour is mainly to distinguish how each user's consumption trend varies across the moments of the day. Because electricity customers cover a wide range and consumption differs greatly between users, the invention normalizes the data row by row (one row per user) to avoid the influence of the magnitude of the consumption data. The formula is:
\[ \overline{\overline{P_t^i}} = \frac{\overline{P_t^i} - \min_t \overline{P_t^i}}{\max_t \overline{P_t^i} - \min_t \overline{P_t^i}} \in [0, 1] \]
where i denotes the i-th user and t is one of the 96 moments in a day. That is, the minimum power of the current row is subtracted from each power value, and the result is divided by the range of power in the current row, giving the normalized value;
(4) K-means clustering
Before clustering starts, a dimensionality-reduction operation is also carried out. The acquisition system collects 96 points per day, but every 4 adjacent points among these 96 differ very little, so the power reading at each whole hour is taken as representative, giving the reduced-dimension power data of each user;
On the basis of the processing above, this step divides users into classes. The specific steps are:
S1. Determine the optimal number of clusters N using the silhouette-coefficient method;
S2. Randomly select N users from the data above as centroids;
S3. For each remaining user, measure its distance to each centroid and assign it to the class of the nearest centroid;
S4. Recalculate the centroid of each class using Euclidean distance;
S5. Iterate S3 to S4 until the new centroids equal the previous centroids or the change is below a specified threshold, at which point the algorithm ends;
(5) From the result of the K-means clustering in step (4), analyse the electricity-consumption curve of each class and determine the consumption characteristics of each class of users;
(6) Based on the clustering result above, compare and analyse the different consumption behaviour of each class of electricity customers in order to support the advertisement placement business:
S1. Present the cluster-analysis results;
S2. Analyse the user-behaviour characteristics of each class, including consumption characteristics and the distribution of peaks and valleys;
S3. Business promotion for assisted advertisement placement:
Based on the above analysis of the behaviour of the different classes of users, the main focus points of advertisement placement are categorized.
The invention analyses residential electricity-consumption behaviour in different regions, combines it with external data related to regional commercial consumption, builds a model by cluster analysis, subdivides the characteristics of residential electricity consumption and mines the consumption habits of residents in each region. It thereby provides enterprise users with estimates of residents' consumption capacity in different regions and predictions of the consumer goods they pay attention to, helping enterprises target advertising for their products.
The main innovation of the invention is an application analysis framework built on the results of user electricity-consumption behaviour analysis, which puts those results to effective use. The invention addresses four aspects of user electricity-consumption behaviour analysis: validity, applicability, efficiency, and extensibility of the element set to high dimensions.
Validity means mining all patterns that satisfy actual business requirements, with no omissions. The candidate patterns must cover all electricity customers, and every pattern in the result set must satisfy the definition.
Applicability: the sequence data may themselves contain noise, so some points in a sequence may be wrong. If pattern search strictly follows the order of the sequence, some potential patterns may be missed. These patterns may satisfy the definition, but because the sequence data contain noise they cannot be mined directly by strict matching. The invention therefore introduces additional constraints to improve the applicability of the mined patterns.
Efficiency: to guarantee the validity of the results, the algorithm must traverse the full search space, which is costly for larger data sets. The invention is therefore implemented on Hadoop to improve the running efficiency of the algorithm.
Extensibility of the element set to high dimensions: actual business data of multiple dimensions are gathered in combination with different electricity-consumption behaviours. As the business content of the element set changes and grows, the users' consumption characteristics can support business reform and improvement, which gives the resulting behaviour analysis good extensibility.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments.
Fig. 1 is the overall framework of the method of the invention.
Fig. 2 is the overall display of the clustering result.
Fig. 3 is a schematic diagram of the number of users in each class.
Fig. 4 is the electricity-consumption characteristic chart of class 1 users.
Fig. 5 is the electricity-consumption characteristic chart of class 2 users.
Fig. 6 is the electricity-consumption characteristic chart of class 3 users.
Fig. 7 is the electricity-consumption characteristic chart of class 4 users.
Fig. 8 is the electricity-consumption characteristic chart of class 5 users.
Fig. 9 is the electricity-consumption characteristic chart of class 6 users.
Fig. 10 is the electricity-consumption characteristic chart of class 7 users.
Fig. 11 is the electricity-consumption characteristic chart of class 8 users.
Fig. 12 is the electricity-consumption characteristic chart of class 9 users.
Fig. 13 is the electricity-consumption characteristic chart of class 10 users.
Detailed description of the embodiments
The electricity-consumption information of users in the transformer district is collected. In the collected data, dates are usually stored as a column. Following the approach of the technical solution above, the data are first arranged by date from earliest to most recent, and the dates are converted into row labels.
Starting from the acquisition-system data, the data are cleaned and organized by denoising, de-duplication and removal of empty values, and then normalized. Finally, users' electricity-consumption behaviour is classified by cluster analysis of the consumption data, the characteristics of the different classes are identified, and targeted analysis and advertisement placement decisions are made.
1. Data collection: collect users' electricity-consumption data according to business requirements, as in the following table:
Table 1: Raw data table
where P_{jt}^{i} (i = 1, 2, …, m; j = 1, 2, …, n; t = 1, 2, …, 96) denotes the real-time power of user i at the t-th moment of day j; id_i (i = 1, 2, …, m) is the unique identifier of user i; and date_j (j = 1, 2, …, n) is the date of day j.
2. Some of the data collected in step 1 are empty, and some are abnormal because external interference or faults in the equipment itself disrupted acquisition, so a series of operations must be carried out on the noise in the data.
Outlier handling: there are two main methods for identifying outliers. One is the Pauta criterion (3σ rule), which is simple and easy to apply and is the most commonly used criterion for identifying and rejecting outliers. For data whose population x follows a normal distribution:
P(|x − μ| > 3σ) ≤ 0.003
where μ and σ are the mathematical expectation and standard deviation of the normal population, x is the actual observed value, and P is the probability that an observation lies more than three standard deviations from the mean.
The probability of a value greater than μ + 3σ or less than μ − 3σ is therefore very small, so such values can be rejected as outliers. The other is the standardized-value (Z-score) method: after Z-score standardization the data follow a normal distribution, so this method can identify outliers. It is suggested that data with a Z score below −3 or above 3 be treated as outliers. The standard score (Z score) formula is:
Z = (x − μ)/σ
where μ and σ are the mathematical expectation and standard deviation of the normal population. In addition, statistical, classification-based and distance-based methods can also be applied, depending on the problem.
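The outlier rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `flag_outliers` and the sample readings are hypothetical, and the population mean and standard deviation are estimated from the sample itself. Because |Z| > 3 is the same condition as |x − μ| > 3σ, one function covers both the Pauta criterion and the Z-score method.

```python
from statistics import mean, pstdev

def flag_outliers(values, threshold=3.0):
    """Return one boolean per value: True where |Z| = |x - mu| / sigma > threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the sample
    if sigma == 0:
        # All values identical: nothing can deviate from the mean.
        return [False] * len(values)
    return [abs((x - mu) / sigma) > threshold for x in values]

# Hypothetical data: 20 ordinary power readings and one spike.
readings = [1.0] * 20 + [50.0]
flags = flag_outliers(readings)
# Rejected values become None so a later step can fill them as missing values.
cleaned = [None if bad else x for x, bad in zip(readings, flags)]
```

Marking rejected readings as missing, rather than deleting them, keeps every daily curve at the same length for the filling step.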
Filling missing values with the KNN algorithm: the K-nearest-neighbour filling method takes the characteristics of electric power data into account. A number K of neighbouring data points around the missing value is chosen, the K nearest neighbours are selected, weights are assigned to them according to their distance, and the missing value is determined as the corresponding weighted average. Filling missing values with the mean: the mean of the indicator after outlier handling is used to fill the missing value. This method is more convenient and relies mainly on mathematical statistics.
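A minimal sketch of the weighted nearest-neighbour fill follows. The patent does not fix the distance measure or the weights, so two assumptions are made here and should be read as illustrative only: distance is the gap in time index within one user's curve, and each neighbour is weighted by the inverse of that distance. The name `knn_fill` is hypothetical.

```python
def knn_fill(series, k=4):
    """Fill None entries with an inverse-distance weighted mean of the k nearest known points."""
    known = [(i, v) for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values to fill from")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        # The k known points closest in time to position i (assumed distance measure).
        nearest = sorted(known, key=lambda p: abs(p[0] - i))[:k]
        # Assumed weighting: closer points count more (weight = 1 / distance).
        weights = [1.0 / abs(j - i) for j, _ in nearest]
        total = sum(w * val for w, (_, val) in zip(weights, nearest))
        filled[i] = total / sum(weights)
    return filled

# Hypothetical curve with one missing reading.
curve = [2.0, 4.0, None, 8.0, 10.0]
filled = knn_fill(curve, k=2)
```

With k=2 the gap is filled from its two immediate neighbours; larger k draws on more distant readings, which the inverse-distance weights then discount.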
The data after processing are shown in Table 2 below:
Table 2: Intermediate data-processing table
where each entry represents the daily-average power of user i at the k-th moment.
3. Normalization
The purpose of analysing users' electricity-consumption behaviour is mainly to distinguish how each user's consumption trend varies across the moments of the day. Because electricity customers cover a wide range and consumption differs greatly between users, the invention normalizes the data row by row (one row per user) to avoid the influence of the magnitude of the consumption data. The formula is:
\[ \overline{\overline{P_t^i}} = \frac{\overline{P_t^i} - \min_t \overline{P_t^i}}{\max_t \overline{P_t^i} - \min_t \overline{P_t^i}} \in [0, 1] \]
where i denotes the i-th user and t is one of the 96 moments in a day. The minimum of the current row is subtracted from each power value, and the result is divided by the range of power in the current row, giving the normalized value.
Table 3: Normalized data table
where id_i (i = 1, 2, …, m) is the unique identifier of user i, and P_i^k (i = 1, 2, 3, …, 96; k = 1, 2, 3, …, m) is the normalized power data of each user.
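The row-wise normalization of step 3 can be sketched as follows. The function name `min_max_normalize` is hypothetical, and the handling of a completely flat curve (where the range is zero) is an assumption, since the source does not cover that case.

```python
def min_max_normalize(curve):
    """Scale one user's daily curve into [0, 1] using that row's own min and max."""
    lo, hi = min(curve), max(curve)
    if hi == lo:
        # Assumption: a flat curve has no shape to preserve, so map it to zeros.
        return [0.0] * len(curve)
    return [(p - lo) / (hi - lo) for p in curve]

# Hypothetical four-point curve; real rows would hold 96 values.
normalized = min_max_normalize([2.0, 4.0, 6.0, 10.0])
```

Because each row is scaled by its own range, a small household and a large factory with the same daily shape end up with the same normalized curve, which is what the clustering step relies on.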
4. K-means clustering
Before clustering starts, a dimensionality-reduction operation is also carried out. The acquisition system collects 96 points per day, but every 4 adjacent points among these 96 differ very little, so the power reading at each whole hour is taken as representative. The normalized data then form Table 4 below:
Table 4: Reduced-dimension data table
where id_i (i = 1, 2, …, m) is the unique identifier of user i, and
P_i^k (i = 1, 5, 9, …, 93; k = 1, 2, 3, …, m) is the reduced-dimension power data of each user.
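The reduction from 96 to 24 points amounts to keeping every fourth reading, that is points 1, 5, 9, …, 93 in the 1-based numbering used above. A minimal sketch, with the hypothetical name `downsample_hourly`:

```python
def downsample_hourly(curve96):
    """Keep points 1, 5, 9, ..., 93 (1-based) of a 96-point daily curve."""
    if len(curve96) != 96:
        raise ValueError("expected 96 readings per day")
    return curve96[::4]  # 0-based indices 0, 4, 8, ..., 92

# Label each reading with its 1-based position to show which points survive.
positions = list(range(1, 97))
hourly = downsample_hourly(positions)
```

The result has 24 values, matching the P1, P2, …, P24 axis used in the figures.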
On the basis of the processing above, this step divides users into classes. The specific steps are:
S1. Determine the optimal number of clusters N using the silhouette-coefficient method;
S2. Randomly select N users from the data above as centroids;
S3. For each remaining user, measure its distance to each centroid and assign it to the class of the nearest centroid;
S4. Recalculate the centroid of each class using Euclidean distance;
S5. Iterate S3 to S4 until the new centroids equal the previous centroids or the change is below a specified threshold, at which point the algorithm ends.
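Steps S1 to S5 can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the patent's Hadoop implementation: the function names, the fixed random seed, the tolerance and the toy two-dimensional points are all hypothetical, and the centroid of a class is taken to be the mean of its members (the point minimizing the summed squared Euclidean distance).

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, n_clusters, seed=0, tol=1e-9, max_iter=100):
    rng = random.Random(seed)
    # S2: pick n_clusters users at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S3: assign every user to the class of the nearest centroid.
        labels = [min(range(n_clusters), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # S4: recompute each centroid as the mean of its members.
        updated = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep the centroid of an empty class
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        # S5: stop once no centroid moves more than the threshold.
        if shift <= tol:
            break
    return labels, centroids

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton classes score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        by_class = {}
        for j, q in enumerate(points):
            if j != i:
                by_class.setdefault(labels[j], []).append(dist(p, q))
        own = by_class.pop(labels[i], [])
        if not own or not by_class:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                               # cohesion
        b = min(sum(d) / len(d) for d in by_class.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def best_k(points, candidates, seed=0):
    # S1: choose the cluster count with the highest mean silhouette coefficient.
    return max(candidates, key=lambda k: silhouette(points, kmeans(points, k, seed)[0]))

# Hypothetical toy data: two well-separated groups of users.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
n = best_k(points, [2, 3])
labels, centroids = kmeans(points, n)
```

In practice each point would be a user's 24-value normalized curve rather than a two-dimensional toy vector, and the candidate range for N would be wider.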
1. From the result of the K-means clustering in step 4, analyse the electricity-consumption curve of each class and determine the consumption characteristics of each class of users.
2. Combine the results with actual business to improve service quality.
Based on the clustering result above, the different electricity-consumption behaviour of each class of electricity customers can be compared and analysed. The invention applies this to the business of assisting advertisement placement.
S1. Presentation of the cluster-analysis results
Based on the analysis process above, the clustering result shown in Fig. 2 is obtained.
(In Fig. 2, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.)
In Fig. 3, the abscissa shows the classes obtained by clustering the vectors of different transformation factors, and the ordinate is the number of users in each class.
As Fig. 3 shows, electricity-consumption behaviour differs considerably between classes of users. The consumption behaviour of each class is described below.
S2. Analyse the user-behaviour characteristics of each class, including consumption characteristics and the distribution of peaks and valleys;
In Fig. 4, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: these users have a small peak in the morning, at noon and in the evening. The evening peak is clearly higher than the morning peak, the noon peak is the lowest, and consumption at other times is moderate. Such users are presumed to be ordinary working households with an elderly person at home, or households with school-age children where only the children are at home at noon.
In Fig. 5, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: consumption drops sharply after 9 p.m., is generally high from 9 a.m. to 9 p.m., and dips slightly at 13:00. Such users are presumed to be a type of processing plant, with machine operation.
In Fig. 6, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: consumption fluctuates around 0.525 throughout, with fluctuations of no more than 0.1, so daily consumption is steady. Such users are presumed to be processing plants that operate around the clock.
In Fig. 7, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: consumption is low from 8 a.m. to 6 p.m. and stays at its peak from 8 p.m. until 6 a.m. the next day. Such users are presumed to be processing plants that shift their consumption to off-peak hours, with machine operation.
In Fig. 8, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: consumption peaks in the intervals 8:00-12:00 and 14:00-18:00, drops markedly at noon, and is low in the remaining periods. Such users are presumed to be general office premises such as office buildings.
In Fig. 9, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: the consumption pattern is similar to that of class 5 and also has two peaks, but afternoon consumption differs considerably from morning consumption. This indicates that such users are easily affected by other factors such as temperature, which causes consumption to drop markedly.
In Fig. 10, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: consumption is low, mostly below 0.03, and steady overall, indicating that such users are rarely at home. Such users are presumed to be no-load users.
In Fig. 11, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: the consumption pattern is somewhat similar to that of class 4, with low consumption during the day and higher consumption in the evening, plus a small peak at 12 noon that is inferred to be cooking. Such users are presumed to be similar to class 4 users, with some of them using electricity for cooking at noon.
In Fig. 12, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: these users reach their consumption peak only after 19:00, with small peaks in the morning and at noon. Such users are presumed to be young working people who are away in the morning and at noon and use more electricity after returning home in the evening.
In Fig. 13, the abscissa P1, P2, …, P24 are the 24 hourly time points of a day, and the ordinate is the power data.
Analysis of this class shows: the consumption pattern is somewhat similar to that of class 1 and follows a similar trend, but morning consumption is higher than evening consumption. Such users are presumed to be those who prefer to do housework and similar activities in the morning.
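The peak-and-valley reading of each class curve in S2 can be partly automated. The sketch below is a hypothetical helper, not part of the patent: it takes a 24-point class curve (for example a K-means centroid), reports the time points P1 to P24 at which the curve is highest and lowest, and gives the spread between them. The mapping from P-labels to clock hours is left open, because the source does not state it.

```python
def peak_valley_summary(profile24):
    """Summarize one class curve: peak point, valley point and their spread."""
    if len(profile24) != 24:
        raise ValueError("expected 24 hourly values")
    peak = max(range(24), key=lambda h: profile24[h])
    valley = min(range(24), key=lambda h: profile24[h])
    return {
        "peak": f"P{peak + 1}",      # labels follow the P1..P24 axis of the figures
        "valley": f"P{valley + 1}",
        "spread": profile24[peak] - profile24[valley],
    }

# Hypothetical normalized curve: flat at 0.1 with one dip and one peak.
profile = [0.1] * 24
profile[3] = 0.0
profile[19] = 0.9
summary = peak_valley_summary(profile)
```

A small spread points to a steady curve of the kind described for classes 3 and 7, while the peak label indicates when a class is most active.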
S3. Business promotion for assisted advertisement placement.
Based on the above analysis of the behaviour of the different classes of users, the main focus points of advertisement placement are categorized as follows:
The classification above is a framework-level division. To subdivide the advertisement types for different time periods, more customer-profile data must be combined. In addition, dividing advertising between holidays and non-holidays is a further direction that can assist advertisement placement.

Claims (6)

1. a kind of advertisement putting aided analysis method based on user power utilization behavioural analysis, it is characterized in that:Comprise the following steps:
(1) Data Collection, with reference to business demand, collects the power consumption data of user;
(2) data collected, sequence of operations is carried out to the noise of data:
Outlier processing: outliers are judged mainly by two methods. The first is the Pauta criterion (the 3σ rule), which is simple and easy to apply and is the most commonly used criterion for identifying and rejecting outliers. For data x drawn from a normally distributed population:
P(|x - μ| > 3σ) ≤ 0.003
where μ and σ are the mathematical expectation and standard deviation of the normal population, x is the actual observed value, and P is the probability that an observation falls more than 3 standard deviations away from the mean;
Data greater than μ + 3σ or less than μ - 3σ therefore occur with very small probability, so such data can be rejected as outliers;
The second is the standardized value method: after Z-score standardization the data follow the standard normal distribution, so outliers can be identified as data whose Z score is below -3 or above 3. The standard score (Z score) formula is:
Z = (x - μ) / σ
where μ and σ are the mathematical expectation and standard deviation of the normal population;
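The two outlier criteria above are equivalent in practice, since |Z| > 3 is the same test as |x - μ| > 3σ. The following is a minimal sketch only; the function names `z_scores` and `remove_outliers` are hypothetical, and it assumes μ and σ are estimated from the sample itself using the population standard deviation, which the claim does not specify.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return the Z score (x - mu) / sigma for every observation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # All observations are identical, so none can be an outlier.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

def remove_outliers(values, threshold=3.0):
    """Drop observations whose |Z| exceeds the threshold (3-sigma rule)."""
    scores = z_scores(values)
    return [x for x, z in zip(values, scores) if abs(z) <= threshold]

# Twenty ordinary readings and one spike (illustrative numbers only).
readings = [1.0] * 20 + [50.0]
cleaned = remove_outliers(readings)
```

Note that with few observations a single extreme value inflates σ and may not reach |Z| > 3, so the rule is only meaningful on reasonably long series.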
Filling missing values with the KNN algorithm: the K-nearest-neighbour method for filling missing data takes the characteristics of electric power data into account. It selects the K nearest data points around the missing value, assigns weights to these data according to their distance, and determines the missing value as the corresponding weighted average. Filling missing values with the mean: the mean of the indicator after outlier processing is used to fill the missing value;
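As an illustration of the weighted K-nearest-neighbour filling described above, the sketch below treats one user's readings as a time series and measures distance in sampling steps. The inverse-distance weighting, the default K, and the name `knn_fill` are assumptions made for the example; the claim only states that weights depend on distance.

```python
def knn_fill(series, k=4):
    """Fill None entries with an inverse-distance weighted mean of the
    k nearest observed points (distance measured in sampling steps)."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # The k observed points closest in time to the missing position.
        nearest = sorted(observed, key=lambda p: abs(p[0] - i))[:k]
        # Assumption: weight = 1 / distance, so closer points count more.
        weights = [1.0 / abs(j - i) for j, _ in nearest]
        total = sum(w * x for w, (_, x) in zip(weights, nearest))
        filled[i] = total / sum(weights)
    return filled
```

For example, `knn_fill([2.0, None, None, 8.0], k=2)` weights the nearer neighbour twice as heavily as the farther one for each gap.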
(3) Normalization
The electricity consumption behaviour features of users serve mainly to distinguish how users' consumption trends vary at different times. Because electricity customers are widely distributed and consumption differs greatly between users, the present invention normalizes the data row by row to avoid the influence of the magnitude of the consumption data. The formula is:
$$\overline{\overline{P_t^i}} = \frac{\overline{P_t^i} - \min_t \overline{P_t^i}}{\max_t \overline{P_t^i} - \min_t \overline{P_t^i}} \in [0, 1]$$
where i denotes the i-th user and t is one of the 96 sampling moments in a day. The formula subtracts the minimum power of the current row from each power value and then divides by the power range of the current row, giving the normalized value;
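The row-wise min-max normalization above can be sketched as follows. The function name `normalize_row` is hypothetical, and returning zeros for a constant row is an assumption, since the formula is undefined when the maximum equals the minimum.

```python
def normalize_row(powers):
    """Min-max normalize one user's daily power row into [0, 1]."""
    lowest, highest = min(powers), max(powers)
    span = highest - lowest
    if span == 0:
        # Assumption: a flat profile carries no trend, so map it to zeros.
        return [0.0 for _ in powers]
    return [(p - lowest) / span for p in powers]
```

Because each row is scaled by its own range, a large industrial user and a small household with the same daily shape produce the same normalized profile.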
(4) K-means clustering
Before clustering starts, a dimensionality reduction is also performed. The collected metering data contain 96 points per day, but every 4 adjacent points among these 96 differ little, so the power value at each whole hour is taken as representative, giving the power data of each user after dimensionality reduction;
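A minimal sketch of this dimensionality reduction, assuming the 96 points are 15-minute samples starting at 00:00 so that every fourth point falls on the hour; the name `hourly_profile` is hypothetical.

```python
def hourly_profile(day_profile, points_per_hour=4):
    """Keep one representative point per hour from a 96-point daily row."""
    if len(day_profile) % points_per_hour != 0:
        raise ValueError("profile length must be a multiple of points_per_hour")
    # Assumption: index 0 is 00:00, so indices 0, 4, 8, ... are whole hours.
    return day_profile[::points_per_hour]
```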
On the basis of the above processing, this step divides the users into categories, as follows:
S1. Determine the optimal number of clusters N by the silhouette coefficient method;
S2. Randomly select N users from the above data as centroids;
S3. Measure the distance from each remaining user to every centroid and assign the user to the category of the nearest centroid;
S4. Recalculate the centroid of each category according to the Euclidean distance method;
S5. Iterate S3-S4 until the new centroids are equal to the previous centroids or the change is less than a specified threshold, at which point the algorithm terminates;
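Steps S1 to S5 can be sketched in plain Python as below. This is an illustrative reading rather than the applicant's implementation: the function names, the fixed random seed, the tolerance, and the handling of empty or single-member clusters are all assumptions.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, n_clusters, tol=1e-6, max_iter=100, seed=0):
    rng = random.Random(seed)
    # S2: pick N users at random as the initial centroids.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S3: assign every user to the nearest centroid.
        labels = [min(range(n_clusters), key=lambda c: euclidean(p, centers[c]))
                  for p in points]
        # S4: recompute each centroid as the mean of its members.
        new_centers = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # assumption: keep an empty cluster's centroid
        # S5: stop once no centroid moves by more than the threshold.
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return labels, centers

def silhouette(points, labels):
    """Mean silhouette coefficient; single-member clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(euclidean(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())   # nearest other cluster
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

def best_cluster_count(points, candidates):
    # S1: choose the candidate N with the highest mean silhouette.
    return max(candidates, key=lambda n: silhouette(points, kmeans(points, n)[0]))
```

Each point would be one user's normalized, reduced daily profile; the pairwise silhouette computation is quadratic in the number of users, so a larger data set would need sampling or a library implementation.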
(5) Based on the result of the K-means clustering in step (4), analyse the electricity consumption characteristics of the users in each category and determine the consumption characteristics of each class of users;
(6) Based on the above clustering result, compare and analyse the different electricity consumption behaviour characteristics of each category of electricity customers in order to assist the business promotion of advertisement placement:
S1. Present the cluster analysis results;
S2. Analyse the user behaviour characteristics corresponding to the different categories, including electricity consumption characteristics and the distribution of peaks and valleys;
S3. Business promotion in support of advertisement placement:
Summarizing the above analysis of the behavioural characteristics of the different user categories, classify the main focus of advertisement placement.
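The peak and valley analysis in step (6) could be applied to each cluster centroid roughly as sketched below. The sketch assumes a 24-point hourly profile; the morning window (06:00 to 11:59), the evening window (18:00 to 23:59), the function names, and the sample values are hypothetical choices for illustration and do not come from the source.

```python
def peak_and_valley(profile):
    """Return (peak_hour, valley_hour) of an hourly load profile."""
    hours = range(len(profile))
    peak = max(hours, key=lambda t: profile[t])
    valley = min(hours, key=lambda t: profile[t])
    return peak, valley

def morning_exceeds_evening(profile, morning=range(6, 12), evening=range(18, 24)):
    """Compare mean consumption in two hypothetical time windows."""
    morning_mean = sum(profile[t] for t in morning) / len(morning)
    evening_mean = sum(profile[t] for t in evening) / len(evening)
    return morning_mean > evening_mean

# Illustrative centroid for a cluster that is most active in the morning.
centroid_profile = [0.1] * 6 + [0.8, 0.9, 1.0, 0.9, 0.7, 0.6] + [0.4] * 6 + [0.5] * 6
```

A category whose centroid peaks in the morning, like the one sketched here, corresponds to the kind of user the description infers to prefer morning household activity.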
2. The advertisement placement aided analysis method based on analysis of user electricity consumption behaviour according to claim 1, characterized in that: starting from the collected metering data, the data are cleaned and organized by denoising, deduplication and removal of empty values, and are then normalized; finally, user electricity consumption behaviour is classified by cluster analysis of the consumption data, the characteristics of the different categories are identified, and targeted analysis and advertisement placement decisions are made.
3. The advertisement placement aided analysis method based on analysis of user electricity consumption behaviour according to claim 1, characterized in that: to determine the specific types of advertisement for different time periods, more customer profile data must be combined.
4. The advertisement placement aided analysis method based on analysis of user electricity consumption behaviour according to claim 1, characterized in that: advertisement placement is further assisted by dividing advertisements between holidays and non-holidays.
5. The advertisement placement aided analysis method based on analysis of user electricity consumption behaviour according to claim 1, characterized in that: the different categories of users include the family class, ordinary commercial class, uninterrupted industrial class, off-peak industrial class, office worker class, no-load class, industry + family, young office worker class, and family class.
6. The advertisement placement aided analysis method based on analysis of user electricity consumption behaviour according to claim 1, characterized in that: the electricity consumption information of users in a transformer district is collected; in the collected data the date is stored in column form; following the implementation approach of the above technical solution, the data are first arranged by date from the earliest to the most recent, and the dates are then converted into row labels.
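One possible reading of the date rearrangement in claim 6 is sketched below. The input layout of (user, date, slot, power) records, ISO-formatted date strings, and the name `pivot_by_date` are all assumptions, since the claim does not describe the raw table.

```python
from collections import defaultdict

def pivot_by_date(records, slots_per_day=96):
    """Turn long-form (user_id, date, slot, power) records into one row per
    user and date, ordered from the earliest date to the most recent."""
    rows = defaultdict(lambda: [None] * slots_per_day)
    for user_id, day, slot, power in records:
        rows[(user_id, day)][slot] = power
    # Assumption: dates are ISO strings (YYYY-MM-DD), which sort chronologically.
    return sorted(rows.items(), key=lambda item: item[0])
```

Slots with no reading stay `None`, which marks them for the missing-value filling of step (2).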
CN201710270555.1A 2017-02-21 2017-04-24 Advertisement putting aided analysis method based on user power utilization behavioural analysis Pending CN107248086A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710092258 2017-02-21
CN2017100922582 2017-02-21

Publications (1)

Publication Number Publication Date
CN107248086A true CN107248086A (en) 2017-10-13

Family

ID=60016980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710270555.1A Pending CN107248086A (en) 2017-02-21 2017-04-24 Advertisement putting aided analysis method based on user power utilization behavioural analysis

Country Status (1)

Country Link
CN (1) CN107248086A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982489A (en) * 2012-11-23 2013-03-20 广东电网公司电力科学研究院 Power customer online grouping method based on mass measurement data
CN103093394A (en) * 2013-01-23 2013-05-08 广东电网公司信息中心 Clustering fusion method based on user electrical load data subdivision
CN103942606A (en) * 2014-03-13 2014-07-23 国家电网公司 Residential electricity consumption customer segmentation method based on fruit fly intelligent optimization algorithm
US20160274609A1 (en) * 2015-03-18 2016-09-22 Onzo Limited Classifying utility consumption of consumers

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038178A (en) * 2017-12-07 2018-05-15 北京邮电大学 A kind of user power utilization behavior visual analysis method, device and electronic equipment
CN109961197B (en) * 2017-12-22 2021-08-27 中国电力科学研究院有限公司 Regional comprehensive power utilization evaluation method and system based on power user label
CN109961197A (en) * 2017-12-22 2019-07-02 中国电力科学研究院有限公司 A kind of regional complex electricity consumption evaluation method and system based on power consumer label
CN108564390A (en) * 2017-12-29 2018-09-21 广东金赋科技股份有限公司 Data trend analysis method, electronic equipment and the computer storage media of a large amount of individuals
CN108776939A (en) * 2018-06-07 2018-11-09 上海电气分布式能源科技有限公司 The analysis method and system of user power utilization behavior
CN111353795A (en) * 2018-12-20 2020-06-30 北京沃东天骏信息技术有限公司 Advertisement effect measuring method, device, medium and equipment
CN109949181A (en) * 2019-03-22 2019-06-28 华立科技股份有限公司 The power grid type judgement method and device of algorithm are closed on based on KNN
CN109949181B (en) * 2019-03-22 2021-05-25 华立科技股份有限公司 Power grid type judgment method and device based on KNN proximity algorithm
CN110298552A (en) * 2019-05-31 2019-10-01 国网上海市电力公司 A kind of power distribution network individual power method for detecting abnormality of combination history electrical feature
CN110298552B (en) * 2019-05-31 2023-12-01 国网上海市电力公司 Power distribution network individual power abnormality detection method combining historical electricity utilization characteristics
CN110347957A (en) * 2019-06-12 2019-10-18 合肥大多数信息科技有限公司 One kind is from media power supply service method for pushing
CN110363596A (en) * 2019-07-24 2019-10-22 金禧 A kind of accurate advertisement put-on method and system
US11107097B2 (en) 2019-08-29 2021-08-31 Honda Motor Co., Ltd. System and method for completing trend mapping using similarity scoring
CN110967251A (en) * 2019-12-02 2020-04-07 湘潭大学 Method for identifying damage mode of wind power blade
CN110967251B (en) * 2019-12-02 2023-07-11 湘潭大学 Method for identifying damage mode of wind power blade
CN111353814A (en) * 2020-02-24 2020-06-30 上海佳投互联网技术集团有限公司 Method and system for identifying invalid advertisement users

Similar Documents

Publication Publication Date Title
CN107248086A (en) Advertisement putting aided analysis method based on user power utilization behavioural analysis
Rajabi et al. A comparative study of clustering techniques for electrical load pattern segmentation
Benítez et al. Dynamic clustering segmentation applied to load profiles of energy consumption from Spanish customers
Yildiz et al. Household electricity load forecasting using historical smart meter data with clustering and classification techniques
Rajabi et al. A pattern recognition methodology for analyzing residential customers load data and targeting demand response applications
Alahakoon et al. Advanced analytics for harnessing the power of smart meter big data
Choksi et al. Feature based clustering technique for investigation of domestic load profiles and probabilistic variation assessment: Smart meter dataset
CN109446193A (en) It opposes electricity-stealing model generating method and device
CN105117810A (en) Residential electricity consumption mid-term load prediction method under multistep electricity price mechanism
CN104992239A (en) Correlation coefficient-based industry electricity consumption law forecasting method
Gajowniczek et al. Electricity peak demand classification with artificial neural networks
CN107784518A (en) A kind of power customer divided method based on multidimensional index
CN114004296A (en) Method and system for reversely extracting monitoring points based on power load characteristics
Park et al. A novel load image profile-based electricity load clustering methodology
Lin et al. A hybrid economic indices based short-term load forecasting system
CN116976707B (en) User electricity consumption data anomaly analysis method and system based on electricity consumption data acquisition
Kojury-Naftchali et al. Identifying susceptible consumers for demand response and energy efficiency policies by time-series analysis and supplementary approaches
CN111126499A (en) Secondary clustering-based power consumption behavior pattern classification method
Xu et al. Spatial-temporal load forecasting using AMI data
Zufferey et al. Unsupervised learning methods for power system data analysis
CN112508260B (en) Medium-and-long-term load prediction method and device of distribution transformer based on comparative learning
Abrishami et al. Using real-world store data for foot traffic forecasting
CN112288496A (en) Load classification calculation method and tracking analysis method for power industry
Wang et al. Application of clustering technique to electricity customer classification for load forecasting
CN106651425A (en) User electricity stealing and electricity leakage behavior monitoring method considering business expanding installation data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171013