CN105069703B - A kind of electrical network mass data management method - Google Patents

Info

Publication number
CN105069703B
Authority
CN
China
Prior art keywords
data
attribute
power grid
decision tree
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510487734.1A
Other languages
Chinese (zh)
Other versions
CN105069703A (en
Inventor
刘志刚
魏晓光
陈剑飞
刘小宝
戴昭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC and Jinan Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN201510487734.1A
Publication of CN105069703A
Application granted
Publication of CN105069703B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a method for managing massive power grid data. The method comprises: building a power grid user data management system that integrates the data collected by each power grid subsystem and mines and analyzes power grid user data with a parallel computing framework; and, on the basis of that data management system, performing parallel load forecasting with a distributed load prediction algorithm. The method fuses and integrates the data of the various power grid user systems and migrates traditional computation methods to a distributed platform, meeting the processing requirements of massive data.

Description

A kind of electrical network mass data management method
Technical field
The present invention relates to smart grids, and in particular to a method for managing massive power grid data.
Background art
Collecting, transmitting and storing real-time power grid user data, and rapidly analyzing and processing the accumulated massive multi-source historical data, can effectively improve demand management; managing user data supports the safe, robust and reliable operation of the smart grid. As the number of sensors and smart devices keeps increasing, the data that these devices acquire and transmit is growing exponentially. The data includes not only the electricity consumption collected by smart meters but also the temperature, weather, humidity, geographic and wind speed information that various sensors collect at fixed frequencies. The complexity of user data is therefore increasing.
China's power generation and transmission technology differs little from that of other countries, but on the consumption side, and at the user end in particular, the differences are considerable. Because a suitable market mechanism has not yet formed, the conditions for deploying intelligent power consumption technology in China are not yet mature, and it is difficult to support effective integration of intelligent power distribution systems with user management systems. In general, massive data management for power grid users faces the following challenges. The rapid development of smart meters and Internet of Things technology produces massive data in widely varying forms, the data conventions of different organizations are inconsistent, and processing and integration are difficult. For massive data, how to build a model that expresses the data in a standardized way, and how to integrate data on the basis of that model, are problems that urgently need solving. Because data is collected in many different ways and the quality of the communication channels varies, the received data is of poor quality and control over the data is also insufficient; knowledge mined from such poor data is unreliable and cannot support accurate decisions. This problem has had adverse effects worldwide and is a serious concern for the information society. Data types are complex, and traditional relational databases and file storage formats can no longer keep up with the rapid growth of massive data.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes a method for managing massive power grid data, comprising:
building a power grid user data management system, integrating the data collected by each power grid subsystem, and mining and analyzing the power grid user data with a parallel computing framework; and, on the basis of the data management system, performing parallel load forecasting with a distributed load prediction algorithm.
Preferably, the architecture of the power grid user data management system is divided into an application layer, a data analysis and computation layer, and a data management layer. The power grid user data management system is built with Hadoop; a data storage system is established on the platform with HDFS and HBase; and the MapReduce parallel computing framework and the Storm in-memory parallel computing framework are built on the platform as the massive data computation and analysis system, which analyzes the massive data of power grid users. The data management layer collects and integrates data. The collected data includes data from smart meters, data acquisition and monitoring systems, and various sensors; integrating this data includes migrating it to cluster servers for management. During integration, a data transfer tool extracts and integrates the data: the data and historical data generated by each independent system are extracted and integrated into HBase with the data transfer tool, a Java persistence tool operates the column-oriented database, and the online data generated by applications based on distributed computing is written to HBase. The data analysis and computation layer stores, computes on and analyzes the massive data. Electrical load data and related data are stored in HBase. The parallel computing module MapReduce performs parallel batch computation and analysis on the massive data, while data-intensive iterative computation uses the memory-based parallel computing module Storm, which reads the data needed by the business into memory so that it can be queried directly from memory when needed.
Preferably, performing parallel load forecasting with the distributed load prediction algorithm on the basis of the data management system further comprises:
The training process of the algorithm is executed with three MapReduce jobs, the output of each serving as the input of the next. The decision model obtained after training is stored in the Hadoop distributed cluster. Training is divided into three parts: generating the data dictionary; generating the decision trees; and forming the decision tree ensemble;
Generating the data dictionary comprises describing the training sample data: a file is generated that describes the condition attributes and the decision attribute of the samples, recording the type of each condition attribute value, the position of the decision attribute, and whether the model to be created performs classification or regression. This process is completed by the first MapReduce job; each Map process reads one part of the experimental data and records the attribute types and the load value or class label of the data. The generated description file is stored as key/value pairs in the Hadoop file system HDFS;
Generating the decision trees comprises the following parallel process:
1) randomly draw from the original data set, with replacement, K sample data sets TS_1, TS_2, ..., TS_K, each the same size as the original data set; each sample data set serves as the training set of one decision tree, and the sample data sets differ from one another while each having the same size as the original data set;
2) determine, from the number M of attributes in the sample data, the number m of attributes randomly selected at each node, where m << M: m is the square root of M for a classification model and M/3 for a regression model; compute the information gain of each of the m attributes and select the best attribute for branching;
3) build the nodes recursively to generate a decision tree; the K decision trees are generated in parallel, one Map generating one decision tree; this process is completed by the second MapReduce job;
Forming the decision tree ensemble comprises combining the individual decision tree classifiers. Each decision tree produces one result: when the ensemble is used for classification, the final result is chosen by vote; when it is used for regression forecasting, the K trees give K values and the final value is their average. This process is completed by the third MapReduce job.
Preferably, in the deployment architecture of the HBase system, control centers act as the managers of the entire distributed real-time database and store metadata, including key information such as the division of labor among nodes, node state, data partitioning scheme, data block locations, task scheduling and security management. The control centers keep the metadata consistent with one another through a synchronization mechanism. The data analysis and computation layer nodes are logically peers: they deploy the same processes and perform the same logical operations, and they use a transaction-based redundant backup mechanism. The power grid user data management system uses HDFS as the underlying distributed file system and builds a time-series control component for massive power grid data to store the time-series data of the power grid business. The time-series control component builds a time-series data model, uniformly receives and stores the collected time-series data according to that model, and exposes a unified query interface;
In terms of storage, data is stored as key-value pairs, that is, in a column-oriented manner, with the column family as the basic unit of storage and access control. Empty columns occupy no space in actual storage, following a sparse table design. In the deployment of the data architecture, the traditional client/server model of multiple clients and a single server is abandoned in favor of a distributed multi-server cluster, in which all data is distributed across multiple computers in the cluster according to the replication factor. The bottom layer of the time-series control component relies on the column-oriented database; the handling of time-series data is abstracted into the basic read, write, insert, delete and modify operations on the HBase database. The top software layer consists of the client of the time-series control component and third-party application clients. All clients operate through the Java API; a type-parsing module decomposes each API call by function into one database operation or a sequence of database operations, these operation sets are invoked through RPC inside the control component, and the data operations are finally completed uniformly through the asynchronous HBase operation API.
Compared with the prior art, the present invention has the following advantages:
The present invention proposes a method for managing massive power grid data that fuses and integrates the data of the various power grid user systems and migrates traditional computation methods to a distributed platform, meeting the processing requirements of massive data.
Detailed description of the embodiments
The following is a detailed description of one or more embodiments of the present invention. The invention is described in connection with these embodiments but is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention covers numerous alternatives, modifications and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of them.
One aspect of the present invention provides a method for processing massive power grid user data. A Hadoop cluster is used to build the basic management system for massive data, the data collected by each power grid subsystem is integrated into massive data storage, and a parallel computing framework is used to mine and analyze the massive power grid user data quickly. Taking electrical load forecasting as an example application, traditional load forecasting is migrated to a distributed computing platform, and parallel load forecasting is performed with a decision-tree-based load prediction algorithm. In view of the practical needs of massive power grid user data analysis, the invention builds a power grid user data management system centered on analysis and computation, whose basic framework is divided into an application layer, a data analysis and computation layer, and a data management layer.
The framework builds the power grid user data management system with Hadoop. A massive data storage system is established on the platform with HDFS and HBase, and the MapReduce parallel computing framework and the Storm in-memory parallel computing framework are built on the platform as the massive data computation and analysis system, which analyzes the massive data of power grid users.
The data management layer collects and integrates data. The collected data comes from smart meters, data acquisition and monitoring systems, and various sensors. It includes not only data from inside the power grid but also a large amount of related data. Because the data is produced by equipment from different vendors, its forms vary widely and the data conventions of different organizations are inconsistent, so it forms a massive data stream that is difficult to process and integrate. Integrating this data means migrating the data generated by legacy systems to the cluster servers, where it can be managed efficiently.
To address this difficulty of data integration, the platform uses a data transfer tool to extract and integrate the data: the data and historical data generated by each independent system are extracted and integrated into HBase with the data transfer tool. A Java persistence tool operates the column-oriented database, and the online data generated by applications based on distributed computing is written to HBase.
The data analysis and computation layer provides storage, computation and analysis for the massive data. The distributed computing layer is built with Hadoop; the massive data is stored in the distributed file system HDFS and managed with HBase.
The platform stores electrical load data and related data in HBase. HBase uses the column as its storage unit, which makes it convenient to query an entire column of data. The prediction algorithm used here must repeatedly read and compute on entire columns during learning, so HBase's storage model matches the way the data is accessed.
Parallel batch computation and analysis of the massive data uses the parallel computing module MapReduce, while data-intensive iterative computation uses the memory-based parallel computing module Storm. Storm provides an in-memory parallel computing framework: the framework reads the data needed by the business into memory, and the data is queried directly from memory when needed. This is faster than the disk-based data access of MapReduce, shortening the run time of the business and reducing I/O operations.
Load forecasting is a key link in power grid planning and an important basis for substation and grid structure planning; high-precision short-term load forecasting can effectively reduce generation costs. The present invention uses an improved ensemble learning method with the decision tree as the basic learning unit. The ensemble contains multiple decision trees trained with the random subspace method; a sample to be classified is fed to every decision tree, each tree produces its own classification result, and the final result is chosen by vote among the trees. This overcomes some shortcomings of a single decision tree and offers good scalability and parallelism, so it can handle massive data quickly and has good prospects for electrical load forecasting in a massive data environment.
The whole load forecasting process executes the training of the algorithm with three MapReduce jobs, the output of each serving as the input of the next. The decision model obtained after training is stored in the Hadoop distributed cluster. Training is divided into three parts: generating the data dictionary; generating the decision trees; and forming the decision tree ensemble. Generating the data dictionary means describing the training sample data: a file is generated that describes the condition attributes and the decision attribute of the samples, recording the type of each condition attribute value, the position of the decision attribute, and whether the model to be created performs classification or regression. This process is completed by the first MapReduce job; each Map process reads one part of the experimental data and records the attribute types and the load value or class label of the data. The generated description file is stored as key/value pairs in the Hadoop file system HDFS for use by the subsequent MapReduce jobs.
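The data dictionary step can be illustrated with a small in-process sketch in Python. It is not the patented implementation: the function names, the tuple layout of a record and the rule for deciding between classification and regression (a numeric decision attribute implies regression) are assumptions made only for illustration, and plain Python functions stand in for the Map and Reduce phases.

```python
from collections import defaultdict

def map_describe(rows, decision_index):
    """Map step (hypothetical): emit (position, (role, type)) for one data split."""
    for row in rows:
        for pos, value in enumerate(row):
            kind = "numeric" if isinstance(value, (int, float)) else "categorical"
            role = "decision" if pos == decision_index else "condition"
            yield pos, (role, kind)

def reduce_describe(pairs):
    """Reduce step (hypothetical): merge observations into one entry per attribute."""
    seen = defaultdict(set)
    roles = {}
    for pos, (role, kind) in pairs:
        roles[pos] = role
        seen[pos].add(kind)
    dictionary = {}
    for pos in sorted(seen):
        # An attribute is numeric only if every observed value was numeric.
        kind = "numeric" if seen[pos] == {"numeric"} else "categorical"
        dictionary[pos] = {"role": roles[pos], "type": kind}
    return dictionary

def build_data_dictionary(splits, decision_index):
    """Run the map step over every split, then reduce into one description."""
    pairs = [p for split in splits for p in map_describe(split, decision_index)]
    attributes = reduce_describe(pairs)
    # Assumption: a numeric decision attribute (a load value) means regression.
    numeric_target = attributes[decision_index]["type"] == "numeric"
    task = "regression" if numeric_target else "classification"
    return {"attributes": attributes, "decision_index": decision_index, "task": task}
```

In a real deployment the resulting description would be written to HDFS as key/value pairs, as the text states; here it is simply returned as a dictionary.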
Generating the decision trees is the core of the whole parallel algorithm. Its parallelism lies in the following aspects: 1) Randomly draw from the original data set, with replacement, K sample data sets TS_1, TS_2, ..., TS_K, each the same size as the original data set. Because sampling is with replacement, the draws from the original data set can run in parallel without affecting one another. Each TS serves as the training set of one decision tree; the TS differ from one another but are each the same size as the original data set, which ensures that the decision trees differ while the amount of knowledge in the original data set is not lost.
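A minimal Python sketch of the sampling in step 1) follows. The function name and the seed parameter are illustrative assumptions; the only behavior taken from the text is that K sets are drawn with replacement and each is as large as the original data set.

```python
import random

def bootstrap_samples(dataset, k, seed=0):
    """Draw k training sets with replacement, each the size of the original."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    n = len(dataset)
    return [[dataset[rng.randrange(n)] for _ in range(n)] for _ in range(k)]
```

Each draw only reads the original data set, which is why the K draws could be distributed over independent workers.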
2) Determine, from the number M of attributes in the sample data, the number m of attributes randomly selected at each node (m << M): m is the square root of M for a classification model and M/3 for a regression model. Compute the information gain of each of the m attributes and select the best attribute for branching;
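Step 2) can be sketched in Python as follows. The sketch assumes categorical attributes and a classification target, uses entropy-based information gain as the selection criterion, and rounds m down to an integer; these choices and all function names are assumptions for illustration. A regression model would need a different split criterion, which is not shown.

```python
import math
import random
from collections import Counter

def subset_size(total_attributes, task):
    """m = sqrt(M) for classification, M/3 for regression (rounded down, at least 1)."""
    if task == "classification":
        return max(1, int(math.sqrt(total_attributes)))
    return max(1, total_attributes // 3)

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def choose_split(rows, labels, task="classification", rng=None):
    """Pick the best of m randomly chosen attributes at one node."""
    rng = rng or random.Random(0)
    total = len(rows[0])
    candidates = rng.sample(range(total), subset_size(total, task))
    return max(candidates, key=lambda a: information_gain(rows, labels, a))
```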
3) Build the nodes recursively to generate a decision tree. The K decision trees are generated in parallel, one Map generating one decision tree, which parallelizes the algorithm. This process is completed by the second MapReduce job, which has only a Map phase and no Reduce phase.
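The map-only structure of this job can be imitated in Python with a thread pool, one task per tree. This is only an analogy for the Hadoop job: the trainer below is a deliberately trivial placeholder (a single leaf that predicts the majority label of its sample), and the function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def train_tree(task):
    """One map task: turn (tree_id, sample labels) into (tree_id, tree)."""
    tree_id, labels = task
    # Placeholder tree: a real trainer would grow nodes recursively here.
    return tree_id, {"leaf": Counter(labels).most_common(1)[0][0]}

def map_only_job(samples, workers=4):
    """Train one tree per sample in parallel; there is no reduce phase."""
    tasks = list(enumerate(samples))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(train_tree, tasks))
```

Because each task depends only on its own sample, the tasks need no coordination, which is the property that lets the second job omit the Reduce phase.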
Forming the decision tree ensemble means combining the individual decision tree classifiers. Each decision tree produces one result. When the ensemble is used for classification, the final result is chosen by vote; when it is used for regression forecasting, the K trees give K values and the final value is their average. This process is completed by the third MapReduce job.
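The combination rule can be written in a few lines of Python. The function name and the string values of the task argument are assumptions; the voting and averaging rules are the ones stated above.

```python
from collections import Counter
from statistics import mean

def combine(predictions, task):
    """Majority vote for classification, arithmetic mean for regression."""
    if task == "classification":
        return Counter(predictions).most_common(1)[0][0]
    return mean(predictions)
```

For example, three trees predicting 300.0, 310.0 and 320.0 for a load value would yield 310.0 as the forecast. The text does not say how tied votes are resolved; in this sketch the label that appears first among the tied ones wins.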
The whole model is built on the Hadoop distributed cluster. Massive data is stored in a distributed manner and the algorithm is parallelized with MapReduce, so that the algorithm relies on the storage and computing capacity of the Hadoop cluster to mine the overall sample set S and compute predictions. The whole process executes in parallel, which can effectively improve forecasting accuracy and the capacity of the load forecasting system to handle massive data.
In the deployment architecture of the HBase system described above, control centers act as the managers of the entire distributed real-time database and store metadata, including key information such as the division of labor among nodes, node state, data partitioning scheme, data block locations, task scheduling and security management. Two control centers are generally deployed (more can also be used), and they keep the metadata consistent through a synchronization mechanism. This eliminates the risk that a single point of failure in the control center brings down the whole system and also lays the foundation for load balancing of concurrent requests. The data analysis and computation layer stores shards of the massive data and carries out the various computations; the number of data analysis and computation layer nodes is limited only by fixed conditions such as Ethernet bandwidth and the physical conditions of the machine room. The data analysis and computation layer nodes are logically peers: they deploy the same processes and perform the same logical operations, and each stores only the data belonging to its own partition according to the control center's partitioning rules, which achieves distributed storage. Because node failures are frequent in a distributed architecture, a transaction-based redundant backup mechanism is used between the nodes: the same transaction is synchronized to one or several other nodes (depending on the configured replication factor). This makes the data highly reliable and lays the foundation for load balancing of data access.
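The role of the replication factor can be illustrated with a short Python sketch. The text does not specify how replicas are placed, so the round-robin rule below, the function name and the node names are purely assumptions chosen to show that each partition ends up on as many distinct nodes as the replication factor requires.

```python
def place_replicas(partition_id, nodes, replication_factor):
    """Assign a partition to `replication_factor` distinct nodes (round-robin)."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    start = partition_id % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]
```

With four nodes and a replication factor of 2, partition 5 would be held by the second and third nodes, so the loss of either one leaves a surviving copy.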
The power grid user data management system uses HDFS as the underlying distributed file system. On this basis, a time-series control component for massive power grid data is built to store the time-series data of the power grid business. The time-series control component builds a time-series data model, uniformly receives and stores the collected time-series data according to that model, and exposes a unified query interface.
In terms of the concrete storage scheme, unlike the row-and-column table structure of a traditional relational database, data is stored as key-value pairs, that is, in a column-oriented manner, with the column family as the basic unit of storage and access control. Empty columns occupy no space in actual storage, following a sparse table design. This solves the problem of wasted space caused by differing sampling periods. In the deployment of the data architecture, the traditional client/server model of multiple clients and a single server is also abandoned. A distributed multi-server cluster is used instead: all data is distributed across multiple computers in the cluster according to the replication factor, which strengthens the safety of data storage and improves query efficiency.
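The sparse, column-family-oriented layout can be mimicked in Python with nested dictionaries. This is a toy in-memory model, not the HBase client API; the class and method names are hypothetical. It shows that a cell exists only if it was written, so rows with different sampling periods do not reserve space for missing columns.

```python
class SparseTable:
    """Toy model of row key -> column family -> qualifier -> value."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, family, qualifier, value):
        """Write one cell, creating the row and family on demand."""
        self._rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

    def get(self, row_key, family, qualifier, default=None):
        """Read one cell; absent cells cost no storage and return the default."""
        return self._rows.get(row_key, {}).get(family, {}).get(qualifier, default)

    def cell_count(self):
        """Number of cells actually stored."""
        return sum(len(cols) for fams in self._rows.values() for cols in fams.values())
```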
The bottom layer of the time-series control component relies on the column-oriented database. The handling of time-series data can be abstracted into basic operations on the HBase database such as read, write, insert, delete and modify. The top software layer consists of the client of the time-series control component and third-party application clients. All clients operate through the Java API. A type-parsing module decomposes each API call by function into one database operation or a sequence of database operations. These operation sets are invoked through RPC inside the control component, and the data operations are finally completed uniformly through the asynchronous HBase operation API.
A time-series data record consists of four fields: measurement object, timestamp, measured value and tags. The tags consist of one or more key/value pairs that further describe the measurement object; a measurement object combined with its tags is a measurement item. The tag design lets users easily query the values of the measurement items they care about. The control component stores data in a storage layer, which is a distributed storage system with a key/value structure. Storing time-series data efficiently in the distributed storage layer, so that tens of billions of data points can be stored with minimal memory and disk space, is the key problem a good storage structure design must solve. To this end, the HBase tables on which the distributed real-time database management layer relies should follow these design principles. The row key of the time-series control component should be of fixed length and should contain as much retrieval information as possible. The stored data generally includes a large number of measurement objects and tags, and these fields are of variable length; therefore an ID table is set up to store this information and assign each entry a globally unique number, and the numbers are combined with the timestamp to form the row key. Each row should store as much information as possible; for example, the data points collected over a period of time are merged and submitted as one row, which reduces the number of row keys in the table and speeds up row retrieval. As data grows over time, a stateless storage scheme is used to keep the system resilient.
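The four-field record and the tag-based query it enables can be sketched in Python. The class, the helper functions and the example metric and tag names are hypothetical; only the field list and the notion of a measurement item (measurement object plus tags) come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    metric: str        # measurement object
    timestamp: int     # seconds since the epoch (assumed unit)
    value: float       # measured value
    tags: tuple = ()   # sorted (key, value) pairs

    def item(self):
        """Measurement item: the measurement object combined with its tags."""
        return self.metric, self.tags

def make_record(metric, timestamp, value, **tags):
    """Build a record with its tags normalized to sorted pairs."""
    return Record(metric, timestamp, value, tuple(sorted(tags.items())))

def query(records, metric, **wanted):
    """Return the values of records matching the metric and every wanted tag."""
    return [
        r.value
        for r in records
        if r.metric == metric
        and all(dict(r.tags).get(k) == v for k, v in wanted.items())
    ]
```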
The keys and values of each measurement object and tag are numbered by hash mapping. To make data queries more efficient, this mapping is stored in the ID table in two copies: one maps measurement objects, tag keys and tag values to their hash numbers, and the other maps hash numbers back to measurement objects, tag keys and tag values. Every hash number has a fixed length of 3 bytes. The time-series data of the measurement objects is stored in another table, whose row key is: measurement object ID + base time + tag key ID + tag value ID. The base time field is the top-of-the-hour time corresponding to the time-series record to be stored; the base time is 4 bytes and every other field is 3 bytes. The time-series data within one hour is stored in one row of the table: a record is stored under the column given by its offset Δt from the row's base time, where Δt = record timestamp - base time. When a row is full, a new row is opened and storage continues.
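The row key layout described above can be made concrete with a Python sketch. Two points are assumptions rather than statements of the source: the sketch numbers names with a sequential counter instead of a hash, and it sorts tags by key before appending them. The class and function names are hypothetical.

```python
import struct

class UidTable:
    """Two-way mapping between names and fixed-width 3-byte IDs."""
    WIDTH = 3

    def __init__(self):
        self._to_id = {}
        self._to_name = {}

    def get_or_create(self, name):
        """Return the ID for a name, assigning the next number if it is new."""
        if name not in self._to_id:
            uid = (len(self._to_id) + 1).to_bytes(self.WIDTH, "big")
            self._to_id[name] = uid
            self._to_name[uid] = name
        return self._to_id[name]

    def name_of(self, uid):
        """Reverse lookup from ID back to name."""
        return self._to_name[uid]

def row_key_and_offset(uids, metric, timestamp, tags):
    """Return (row key, offset in seconds from the row's base time)."""
    base = timestamp - timestamp % 3600          # top of the hour
    key = uids.get_or_create(metric) + struct.pack(">I", base)  # 3 + 4 bytes
    for k, v in sorted(tags.items()):
        key += uids.get_or_create(k) + uids.get_or_create(v)    # 3 + 3 bytes per tag
    return key, timestamp - base
```

A record with one tag therefore gets a 13-byte row key, and all records of the same measurement item within the same hour share that key and differ only in the offset used as the column.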
Obviously, those skilled in the art should understand that the modules or steps of the invention described above can be implemented with a general-purpose computing system. They can be concentrated in a single computing system or distributed over a network formed by multiple computing systems. Optionally, they can be implemented as program code executable by a computing system, so that they can be stored in a storage system and executed by the computing system. The present invention is therefore not limited to any specific combination of hardware and software.
It should be understood that the above specific embodiments of the present invention serve only to illustrate or explain the principles of the invention and do not limit it. Any modification, equivalent replacement or improvement made without departing from the spirit and scope of the invention therefore falls within its scope of protection. Furthermore, the appended claims are intended to cover all variations and modifications that fall within the scope and boundaries of the claims, or the equivalents of such scope and boundaries.

Claims (2)

1. A method for managing massive power grid data, characterized by comprising:
building a power grid user data management system, integrating the data collected by each power grid subsystem, and mining and analyzing the power grid user data with a parallel computing framework; and, on the basis of the data management system, performing parallel load forecasting with a distributed load prediction algorithm;
wherein the architecture of the power grid user data management system is divided into an application layer, a data analysis and computation layer, and a data management layer; the power grid user data management system is built with Hadoop, a data storage system is established on the platform with HDFS and HBase, and the MapReduce parallel computing framework and the Storm in-memory parallel computing framework are built on the platform as the massive data computation and analysis system, which analyzes the massive data of power grid users; the data management layer collects and integrates data; the collected data includes data from smart meters, data acquisition and monitoring systems, and various sensors, and integrating this data includes migrating it to cluster servers for management; during integration, a data transfer tool extracts and integrates the data, the data and historical data generated by each independent system being extracted and integrated into HBase with the data transfer tool, a Java persistence tool operating the column-oriented database, and the online data generated by applications based on distributed computing being written to HBase; the data analysis and computation layer stores, computes on and analyzes the massive data; electrical load data and related data are stored in HBase; the parallel computing module MapReduce performs parallel batch computation and analysis on the massive data, and data-intensive iterative computation uses the memory-based parallel computing module Storm, which reads the data needed by the business into memory so that it can be queried directly from memory when needed;
It is described to manage system based on the data, it realizes that parallel load is predicted using distributed terminator prediction algorithm, further wraps It includes:
The training process of algorithm is executed using 3 MapReduce service class, the output of each MapReduce is latter as its A input, the decision-making module obtained after training are stored in the distributed type assemblies of Hadoop, are divided into three parts:It generates Data dictionary;Generate decision tree;Form decision tree set;
Wherein generating the data dictionary includes describing the sample data to be trained: a file is generated that describes the conditional attributes and the decision attribute of the samples, recording the type of each conditional attribute value, the position of the decision attribute, and whether the model to be created performs classification or regression; this process is completed by the first MapReduce, in which each Map process reads a part of the experimental data and records the attribute types of the data and the load value or type identifier; the generated description file is stored in key/value form in the Hadoop file system HDFS;
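The data dictionary step above can be sketched as a small map/reduce pair. This is only an illustrative, single-process Python sketch and not the patented Hadoop implementation: the function names, the record layout (tuples with the decision attribute at a known position) and the "numeric"/"categorical" labels are assumptions introduced here.

```python
from collections import defaultdict

def map_describe(split, decision_index):
    """Map step (hypothetical): emit (attribute position, value type) for one data split."""
    for record in split:
        for pos, value in enumerate(record):
            if pos == decision_index:
                continue  # the decision attribute is recorded separately
            kind = "numeric" if isinstance(value, (int, float)) else "categorical"
            yield pos, kind

def reduce_describe(pairs):
    """Reduce step (hypothetical): merge the types observed for each attribute position."""
    merged = defaultdict(set)
    for pos, kind in pairs:
        merged[pos].add(kind)
    # An attribute counts as numeric only if every observed value was numeric.
    return {pos: ("numeric" if kinds == {"numeric"} else "categorical")
            for pos, kinds in merged.items()}

def build_dictionary(splits, decision_index, task):
    """Combine the map outputs into one key/value description of the sample data."""
    pairs = [p for split in splits for p in map_describe(split, decision_index)]
    return {
        "attributes": reduce_describe(pairs),
        "decision_index": decision_index,
        "task": task,  # "classification" or "regression"
    }
```

For example, two splits holding (temperature, day type, load) records would yield a dictionary marking attribute 0 as numeric, attribute 1 as categorical, and position 2 as the decision attribute.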
Wherein the decision tree generation process includes the following parallel procedure:
1) K sample data sets TS1, TS2, ..., TSK, each equal in size to the original sample data set, are drawn at random with replacement from the original data set; each sample data set serves as the training set of one decision tree; the sample data sets differ from one another, and each is the same size as the original data set;
2) the number m of attributes randomly selected at each node is determined according to the number M of attributes in the sample data, where m << M; m is the square root of M in the classification model and one third of M in the regression model; the information content of each of the m attributes is calculated, and the best attribute is selected for branching;
3) nodes are established recursively to generate a decision tree; the K decision trees are generated in parallel, one Map generating one decision tree; this process is completed by the second MapReduce process;
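Steps 1) to 3) can be sketched in Python as follows. This is a minimal single-process sketch under stated assumptions, not the claimed MapReduce job: the claim only says the "information content" of each attribute is calculated, and entropy-based information gain is used here as one plausible reading; all function names, the `(features, label)` row layout and the `max_depth` limit are hypothetical.

```python
import math
import random
from collections import Counter

def bootstrap_samples(dataset, k, seed=0):
    """Step 1: draw K samples with replacement, each the same size as the original set."""
    rng = random.Random(seed)
    n = len(dataset)
    return [[dataset[rng.randrange(n)] for _ in range(n)] for _ in range(k)]

def attributes_per_node(total_attributes, task):
    """Step 2: m = sqrt(M) for classification, M / 3 for regression (at least 1)."""
    if task == "classification":
        return max(1, int(math.sqrt(total_attributes)))
    return max(1, total_attributes // 3)

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attr):
    """Assumed 'information content' measure: entropy reduction when splitting on attr."""
    base = entropy([label for _, label in rows])
    groups = {}
    for features, label in rows:
        groups.setdefault(features[attr], []).append(label)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def choose_split(rows, m, rng):
    """Pick m random candidate attributes and return the one with the highest gain."""
    candidates = rng.sample(range(len(rows[0][0])), m)
    return max(candidates, key=lambda a: information_gain(rows, a))

def build_tree(rows, m, rng, depth=0, max_depth=5):
    """Step 3: recursively create nodes; leaves hold the majority label."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    attr = choose_split(rows, m, rng)
    groups = {}
    for features, label in rows:
        groups.setdefault(features[attr], []).append((features, label))
    if len(groups) == 1:
        return majority  # the chosen attribute cannot separate the rows
    return (attr, {value: build_tree(group, m, rng, depth + 1, max_depth)
                   for value, group in groups.items()})
```

In the claimed system each bootstrap sample would be handed to its own Map task, so that the K calls to `build_tree` run in parallel rather than in a loop.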
Forming the decision tree set includes combining the individual decision tree classifiers, each decision tree producing one result; if the decision tree set is used for classification, its final result is chosen by voting; when it is used for regression prediction, the K trees give K values and the final value is the average over the trees; this process is completed by the third MapReduce.
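The combination rule described above (majority vote for classification, mean for regression) can be illustrated with a short Python sketch. The function name `aggregate` and the `task` flag are assumptions made for this example; in the claimed method the step runs as the third MapReduce job.

```python
from collections import Counter
from statistics import mean

def aggregate(predictions, task):
    """Combine the K per-tree results into one final result."""
    if task == "classification":
        # Voting: the label returned by the most trees wins.
        return Counter(predictions).most_common(1)[0][0]
    # Regression: the final value is the average of the K tree outputs.
    return mean(predictions)
```

With three trees predicting loads of 300.0, 310.0 and 320.0, the regression result is their mean; with labels "peak", "peak" and "valley", the vote selects "peak".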
2. The method according to claim 1, characterized in that, in the deployment architecture of the HBase system, the scheduling center serves as the manager of the entire distributed real-time database and stores metadata information, including key information such as the division of labor among nodes, node states, data partitioning mode, data block locations, task scheduling, and security management; the scheduling centers keep the metadata consistent with one another through a synchronization mechanism; the data analysis and computation layers are logically peers, deploying the same processes to complete the same logical operations; the data analysis and computation layer uses a transaction-based redundant backup mechanism; the power grid user data management system uses HDFS as the distributed file system for underlying storage; a timing control component oriented to power grid mass data is built to store the time series data in the power grid business; a time series data model is built by the timing control component, which uniformly receives and stores the acquired time series data according to a specific model and provides a unified query interface externally;
In terms of storage mode, data are stored in key-value form, that is, in a column-oriented manner, with the column family as the basic unit of storage and permission control; empty columns occupy no space in actual storage, following a sparse table design; in the deployment of the data architecture, the traditional C/S pattern of multiple clients and a single server is abandoned; a distributed multi-server cluster mode is used, in which all data are dispersed, according to the replication factor, across multiple computers in the cluster; the bottom layer of the timing control component depends on the column-oriented database, and the concrete processing of time series data is abstracted into the basic operations of reading, writing, adding, deleting and modifying on the HBase database; the top layer of the software consists of the client of the timing control component and third-party application clients; all clients perform concrete operations through the Java API, each API call being decomposed by the parsing module, according to its function, into one database operation or an ordered sequence of multiple database operations; these sets of database operations are invoked through RPC inside the control component, and the data operations are finally completed uniformly using the asynchronous HBase operation API.
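The sparse, column-family storage model and the decomposition of an API call into an ordered sequence of basic operations can be pictured with the following in-memory Python stand-in. It does not use or imitate the real HBase client API; the class `SparseTable`, the helper `decompose_rename`, and the row key format `meter001#20150810` are hypothetical names chosen only for illustration.

```python
class SparseTable:
    """In-memory stand-in for a sparse, column-family table (not the HBase API)."""

    def __init__(self, families):
        self.families = set(families)
        # Only cells that actually hold a value are stored, so empty columns cost nothing.
        self.cells = {}

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.cells[(row_key, family, qualifier)] = value

    def get(self, row_key, family, qualifier):
        return self.cells.get((row_key, family, qualifier))

    def delete(self, row_key, family, qualifier):
        self.cells.pop((row_key, family, qualifier), None)


def decompose_rename(row_key, family, old_qualifier, new_qualifier, value):
    """Hypothetical parser: one high-level call becomes an ordered list of basic operations."""
    return [
        ("delete", (row_key, family, old_qualifier)),
        ("put", (row_key, family, new_qualifier, value)),
    ]


def execute(table, operations):
    """Apply the basic operations in order, as the control component would dispatch them."""
    for name, args in operations:
        getattr(table, name)(*args)
```

A short usage example: after `table = SparseTable(["load"])` and `table.put("meter001#20150810", "load", "kw", 3.2)`, the table holds exactly one cell, and a lookup of a qualifier that was never written returns `None` without any space having been reserved for it.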
CN201510487734.1A 2015-08-10 2015-08-10 A kind of electrical network mass data management method Expired - Fee Related CN105069703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510487734.1A CN105069703B (en) 2015-08-10 2015-08-10 A kind of electrical network mass data management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510487734.1A CN105069703B (en) 2015-08-10 2015-08-10 A kind of electrical network mass data management method

Publications (2)

Publication Number Publication Date
CN105069703A CN105069703A (en) 2015-11-18
CN105069703B true CN105069703B (en) 2018-08-28

Family

ID=54499061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510487734.1A Expired - Fee Related CN105069703B (en) 2015-08-10 2015-08-10 A kind of electrical network mass data management method

Country Status (1)

Country Link
CN (1) CN105069703B (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105302500B (en) * 2015-11-24 2018-04-10 中国科学技术大学 A kind of distributed coding method based on dynamic banded structure
CN105608144B (en) * 2015-12-17 2019-02-26 山东鲁能软件技术有限公司 A kind of big data analysis stage apparatus and method based on multilayered model iteration
CN105608758B (en) * 2015-12-17 2018-03-27 山东鲁能软件技术有限公司 A kind of big data analysis platform device and method calculated based on algorithm configuration and distributed stream
CN106897306B (en) * 2015-12-21 2019-04-30 阿里巴巴集团控股有限公司 Database operation method and device
CN105678467A (en) * 2016-01-15 2016-06-15 国家电网公司 Regulation and control integrated data analysis and aid decision making system and method under ultrahigh-voltage alternating current and direct current networking
CN106021080B (en) * 2016-05-10 2018-10-19 国家电网公司 Using middleware database connection pool resource consumption trend intelligent Forecasting
CN106534251B (en) * 2016-09-23 2019-12-13 苏州浪潮智能科技有限公司 Task visual uploading and starting method based on Storm
CN106372256A (en) * 2016-09-30 2017-02-01 浙江大学 Distributed storage method for massive Argo data
CN106682106A (en) * 2016-12-05 2017-05-17 国网宁夏电力公司信息通信公司 Distributed management system based on massive electric power real-time data
CN106709035B (en) * 2016-12-29 2019-11-26 贵州电网有限责任公司电力科学研究院 A kind of pretreatment system of electric power multidimensional panoramic view data
CN106934014B (en) * 2017-03-10 2021-03-19 山东省科学院情报研究所 Hadoop-based network data mining and analyzing platform and method thereof
US20180314971A1 (en) * 2017-04-26 2018-11-01 Midea Group Co., Ltd. Training Machine Learning Models On A Large-Scale Distributed System Using A Job Server
CN107341084B (en) * 2017-05-16 2021-07-06 创新先进技术有限公司 Data processing method and device
CN107391596B (en) * 2017-06-29 2023-09-22 中国电力科学研究院 Power distribution network mass data fusion method and device
CN107341241A (en) * 2017-07-05 2017-11-10 深圳市樊溪电子有限公司 A kind of wind-powered electricity generation big data analysis system based on cloud computing
CN107330567A (en) * 2017-07-20 2017-11-07 云南电网有限责任公司电力科学研究院 Distribution switch-time load Forecasting Methodology based on big data technology
CN107483858A (en) * 2017-08-31 2017-12-15 益和电气集团股份有限公司 The distributed memory system and its distributed storage method of electricity consumption enterprise supervision video
CN107679133B (en) * 2017-09-22 2020-01-17 电子科技大学 Mining method applicable to massive real-time PMU data
CN107943831B (en) * 2017-10-23 2022-05-13 国家电网公司西北分部 HBase-based power grid historical data centralized storage method
FR3087055B1 (en) * 2018-10-04 2021-06-18 Voltalis ESTIMATE OF A PHYSICAL QUANTITY BY A DISTRIBUTED MEASUREMENT SYSTEM
CN109800271A (en) * 2019-02-23 2019-05-24 湖北理工学院 A kind of information collecting method based on big data
CN110232007A (en) * 2019-05-21 2019-09-13 昆明能讯科技有限责任公司 A kind of electric power enterprise information service monitoring method based on APM technology
CN110457330B (en) * 2019-08-21 2022-09-13 北京远舢智能科技有限公司 Time sequence data management platform
CN110502517B (en) * 2019-08-23 2022-01-28 中国南方电网有限责任公司 Distributed storage system for storing real-time operation data of power grid
CN110837516A (en) * 2019-11-07 2020-02-25 恩亿科(北京)数据科技有限公司 Data cutting and connecting method and device, computer equipment and readable storage medium
CN111400129B (en) * 2020-03-06 2022-02-11 广东电网有限责任公司 Distributed application performance monitoring and bottleneck positioning system, method and equipment
CN111597415B (en) * 2020-05-13 2023-05-26 云南电网有限责任公司电力科学研究院 Neural network-based distribution network account data penetration method and device
CN112199421B (en) * 2020-12-04 2021-03-09 中国电力科学研究院有限公司 Multi-source heterogeneous data fusion and measurement data multi-source mutual verification method and system
CN112905573A (en) * 2021-01-29 2021-06-04 杭州市电力设计院有限公司余杭分公司 Mass power grid data management and storage system
CN112653771B (en) * 2021-03-15 2021-06-01 浙江贵仁信息科技股份有限公司 Water conservancy data fragment storage method, on-demand method and processing system
CN116881200B (en) * 2023-09-07 2024-01-16 四川竺信档案数字科技有限责任公司 Multi-center distributed electronic archive data security management method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820670A (en) * 2015-03-13 2015-08-05 国家电网公司 Method for acquiring and storing big data of power information

Also Published As

Publication number Publication date
CN105069703A (en) 2015-11-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180828

Termination date: 20190810