CN107229693B - The method and system of big data system configuration parameter tuning based on deep learning - Google Patents

The method and system of big data system configuration parameter tuning based on deep learning

Info

Publication number
CN107229693B
CN107229693B (application CN201710361578.3A)
Authority
CN
China
Prior art keywords
parameter
neural network
layer
output
mapping
Prior art date
Legal status
Active
Application number
CN201710361578.3A
Other languages
Chinese (zh)
Other versions
CN107229693A (en)
Inventor
王宏志
王艺蒙
赵志强
孙旭冉
Current Assignee
Da Da Data Industry Co Ltd
Original Assignee
Da Da Data Industry Co Ltd
Priority date
Filing date
Publication date
Application filed by Da Da Data Industry Co Ltd filed Critical Da Da Data Industry Co Ltd
Priority to CN201710361578.3A priority Critical patent/CN107229693B/en
Publication of CN107229693A publication Critical patent/CN107229693A/en
Application granted granted Critical
Publication of CN107229693B publication Critical patent/CN107229693B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a method and a system for tuning the configuration parameters of a big data system based on deep learning. The method includes a neural network training step: a deep neural network is preliminarily constructed, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set; then, with the MapReduce time as the metric of the deep neural network, the weights of the neurons in each layer are adjusted by a parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement. The method further includes a configuration parameter prediction step: initial values of the at least one MapReduce parameter are set, the current test data is read, and both are input into the deep neural network obtained in the neural network training step to obtain the configuration parameters. The invention tunes the configuration parameters of the MapReduce framework with a deep neural network, avoiding manual adjustment, and the predicted parameters perform well in application.

Description

Method and system for optimizing configuration parameters of big data system based on deep learning
Technical Field
The invention relates to the technical field of computers, in particular to a method and a system for optimizing configuration parameters of a big data system based on deep learning.
Background
In recent years, big data exploration and analysis have developed rapidly in many fields. A big data system can be divided into three levels: (1) the infrastructure layer, the basic data-processing layer, which allocates hardware resources to the platform layer that executes computation tasks; (2) the platform layer, the core service layer, which offers the application layer an interface for conveniently processing data sets and manages the resources allocated by the infrastructure layer; (3) the application layer, the result-output layer, which predicts expert decisions and delivers the big data analysis results.
The platform layer links the layers above and below it and is the core of a big data system. MapReduce in the Hadoop system is a model at the platform layer. Hadoop is a distributed system infrastructure: a user can develop distributed programs without knowing the underlying distributed details, exploiting the full power of a cluster for high-speed computation and storage. MapReduce is the programming model under Hadoop for parallel processing of large data sets (larger than 1 TB); it greatly simplifies running programs on a distributed system, sparing programmers distributed parallel programming. Hadoop's MapReduce implementation splits a single job, dispatches map tasks to multiple nodes, and then reduces the partial results into a single data set loaded into the data warehouse.
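As a concrete illustration of the map/reduce split described above, the canonical word-count job can be sketched in a few lines of Python. This example is illustrative only and is not part of the patent; the function names are ours.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(document):
    # Map: emit one (key, value) pair per word; in Hadoop this work
    # is dispatched across many nodes.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group intermediate pairs by key, as the framework does
    # between the map and reduce phases.
    pairs = sorted(pairs, key=itemgetter(0))
    return {k: [v for _, v in grp] for k, grp in groupby(pairs, key=itemgetter(0))}

def reduce_phase(grouped):
    # Reduce: fold each key's values into a single result data set.
    return {word: sum(counts) for word, counts in grouped.items()}

counts = reduce_phase(shuffle(map_phase("big data big systems")))
# counts == {"big": 2, "data": 1, "systems": 1}
```

The same split (stateless map, keyed reduce) is what makes the job parallelizable across nodes.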
The configuration parameters strongly influence MapReduce performance. Well-chosen parameters make MapReduce run well, while misconfigured parameters are a main cause of performance degradation and failures in Hadoop's MapReduce system. To help platform administrators optimize system performance, the configuration parameters must be adjusted for different system features, different programs, and different input data in pursuit of faster performance. In the traditional approach, an administrator tunes the parameters one by one, or uses linear regression: parameter features are extracted and related to observed MapReduce performance so as to give an approximately optimal solution, and the configuration parameters are predicted to achieve better working performance.
However, administrators of a Hadoop system face two major challenges: (1) the behavior and characteristics of a large-scale distributed system are so complex that suitable configuration parameters are hard to find; (2) the system has hundreds of parameters, dozens of which dominate system performance, which makes tuning cumbersome. With the traditional approach, manual tuning or automatic tuning by regression is tedious, consumes a great deal of time, yields mediocre results, and leaves the system as a whole running slowly for a long time.
Disclosure of Invention
Aiming at the low efficiency and poor results of manual tuning and regression-based automatic tuning in the prior art, the technical problem solved by the invention is to provide a method and a system for tuning the configuration parameters of a big data system based on deep learning.
The invention provides a method for tuning the configuration parameters of a big data system based on deep learning, comprising a neural network training step and a configuration parameter prediction step; wherein,
the neural network training step comprises the following steps:
step 1-1, preliminarily constructing a deep neural network, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set;
step 1-2, with the MapReduce time as the metric of the deep neural network, adjusting the weights of the neurons in each layer according to a parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement;
the configuration parameter prediction step comprises the following steps:
step 2-1, setting initial values of the at least one MapReduce parameter and reading the current test data;
step 2-2, inputting the initial values of the at least one MapReduce parameter and the current test data into the deep neural network obtained in the neural network training step, to obtain the configuration parameters of the deep-learning-based big data system.
In the method for tuning the configuration parameters of a big data system based on deep learning, the number of the at least one MapReduce parameter is between 2 and 20.
The invention also provides a system for tuning the configuration parameters of a big data system based on deep learning, comprising a neural network training module and a configuration parameter prediction module; wherein,
the neural network training module preliminarily constructs a deep neural network, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set; with the MapReduce time as the metric of the deep neural network, it adjusts the weights of the neurons in each layer according to a parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement;
the configuration parameter prediction module inputs the set initial values of the at least one MapReduce parameter and the current test data into the deep neural network obtained in the neural network training step, to obtain the configuration parameters of the deep-learning-based big data system.
In the deep-learning-based big data system configuration parameter tuning system, the number of the at least one MapReduce parameter is between 2 and 20.
Implementing the method and system for tuning the configuration parameters of a big data system based on deep learning has the following beneficial effects: tuning the configuration parameters of the MapReduce framework with a deep neural network avoids the difficulties of manual adjustment and of searching for optimal parameters; by learning from historical parameters, the characteristics of each configuration parameter and the relationships among them can be captured more deeply, and the deep neural network's repeated learning, weight updating, and prediction yield the parameter configuration best suited to the application layer's requirements. The invention not only saves parameter-tuning time but also ensures that, with suitable parameters, the system's working time is allotted to compressing and decompressing data, which greatly reduces writing and transmission time, so that the whole system runs fast and achieves good working results.
Drawings
FIG. 1 is a flow chart of a method for configuration parameter tuning for big data systems based on deep learning according to a preferred embodiment of the present invention;
FIG. 2 is a schematic flow chart of the neural network training step in the method according to the preferred embodiment of the present invention;
FIG. 3 is a block diagram of a system for deep learning based big data system configuration parameter tuning, according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The invention provides a method for tuning the configuration parameters of a big data system with a deep neural network; introducing a deep neural network framework into the parameter configuration step saves time and cost and achieves good working results. The method focuses on learning and tuning the parameters of the map tasks and reduce tasks of a big data system. MapReduce is a complex process, so its workflow is briefly introduced below. The main steps are as follows:
map end (mapping end) working process
(1) Each input split, whose size equals one block of the distributed file system (HDFS) (64 MB by default), is processed by one map task. The map output is first placed in a circular memory buffer; the buffer's default size is 100 MB and is controlled by an io.-prefixed property. When the buffer is about to overflow (by default at 80% of the buffer size, controlled by the io.sort.spill.percent property), a spill file is created in the local file system and the buffered data is written to it.
(2) Before writing to disk, a thread first divides the data into as many partitions as there are reduce tasks, so that each reduce task corresponds to the data of one partition. The data in each partition is then sorted and, if a Combiner is set, a combine (merge) operation is applied to the sorted result.
(3) When the map task outputs its last record, there may be many spill files, which need to be merged. Sorting and combine operations are performed continually during merging.
(4) The data in each partition is transferred to the corresponding reduce task.
(II) Reduce-side working process
(1) Each reduce task receives data from different map tasks, and the data from each map is sorted. If the amount of data received at the reduce side is small enough, it is kept directly in memory (in a buffer whose size is controlled by a mapred.-prefixed property); once the data exceeds a certain proportion of the buffer size (determined by another mapred.-prefixed property), it is merged and spilled to disk.
(2) The reduction program defined at the application layer is executed, and the final data is output: compressed if required, written, and finally stored in HDFS.
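The map-side buffer behaviour described above can be checked with a little arithmetic. This sketch is purely illustrative: the function name `spill_trigger_mb` is ours, and the figures are the defaults quoted in the text (100 MB buffer, 80% spill threshold via io.sort.spill.percent).

```python
# Illustrative only: compute the buffer occupancy at which a map-side
# spill file is created, per the defaults quoted in the text.

def spill_trigger_mb(buffer_mb: float, spill_percent: float) -> float:
    """Buffer occupancy (MB) at which a spill file is created."""
    return buffer_mb * spill_percent

# Defaults from the text: 100 MB circular buffer, 80% threshold.
assert abs(spill_trigger_mb(100, 0.80) - 80.0) < 1e-9

# Enlarging either knob defers the spill, shifting the write/transfer
# cost that the tuning method tries to minimise:
assert abs(spill_trigger_mb(200, 0.80) - 160.0) < 1e-9
```

These are exactly the kind of interacting knobs whose joint effect the patent proposes to learn rather than tune by hand.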
Referring to fig. 1, a flowchart of a method for tuning configuration parameters of a deep learning-based big data system according to a preferred embodiment of the present invention is shown. As shown in fig. 1, the method for tuning configuration parameters of a deep learning-based big data system provided in this embodiment mainly includes a neural network training step and a configuration parameter predicting step:
First, in steps S101 to S102, the neural network training step is performed: a deep neural network is constructed, using the historical operating states provided by the administrator as the training set and the predicted optimal configuration parameters as the output. The MapReduce time cost serves as the final metric of the network structure, which is repeatedly adjusted via feedback to obtain the final deep neural network structure. The specific steps are as follows:
Step S101: a deep neural network is preliminarily constructed, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set. The historical data of the big data system is specifically the historical operating states provided by an administrator. Preferably, the at least one MapReduce parameter is selected from the important parameters in Table 1 below. In a concrete application, depending on the situation, the 20 parameters with the greatest influence on the system are obtained from the system administrator and added to the input and output lists; the selected parameters are listed in Table 1 below. The number of the at least one MapReduce parameter is preferably between 2 and 20.
Table 1. Important parameters
Step S102: with the MapReduce time as the metric of the deep neural network, the weights of the neurons in each layer are adjusted according to the parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement. In this step, the MapReduce time cost is the final metric of the network structure, which is repeatedly adjusted via feedback to obtain the final structure of the deep neural network.
Subsequently, in steps S103 to S104, the configuration parameter prediction step is performed: the trained deep neural network is used to predict the configuration parameters that optimize the working effect. The specific steps are as follows:
step S103, setting an initial value of the at least one mapping protocol parameter, and reading the current test data.
Step S104: the initial values of the at least one MapReduce parameter and the current test data are input into the deep neural network obtained in the neural network training step, yielding the configuration parameters of the deep-learning-based big data system.
Thus, after the map task and reduce task parameters are initialized, the deep neural network is introduced. The training set comes from historical task logs; the historical parameters are learned in a semi-supervised manner, and the parameters inside the deep neural network are obtained from the feedback of known historical operating states and working performance, so that the configuration parameters giving the best working performance can be predicted and tuned for different programs and different input data.
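The training-set construction from historical task logs described above can be sketched as follows. All names here (`JobLog`, `PARAM_NAMES`, the two example properties) are hypothetical stand-ins, since the patent's Table 1 of selected parameters is not reproduced in this text.

```python
# Hypothetical sketch: turn historical job logs into
# (parameter vector, observed MapReduce time) training pairs.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class JobLog:
    params: dict      # configuration parameters used for the run
    runtime_s: float  # observed MapReduce time (the feedback signal)

# Stand-ins for the 2-20 selected parameters of Table 1.
PARAM_NAMES = ["io.sort.mb", "io.sort.spill.percent"]

def to_training_pairs(logs: List[JobLog]) -> List[Tuple[list, float]]:
    """Encode each historical run as (parameter vector, runtime)."""
    return [([log.params[p] for p in PARAM_NAMES], log.runtime_s)
            for log in logs]

logs = [JobLog({"io.sort.mb": 100, "io.sort.spill.percent": 0.8}, 412.0),
        JobLog({"io.sort.mb": 200, "io.sort.spill.percent": 0.7}, 388.0)]
pairs = to_training_pairs(logs)
# pairs[0] == ([100, 0.8], 412.0)
```

The runtime in each pair plays the role of the feedback signal (the MapReduce time) against which the network's weights are later adjusted.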
Please refer to fig. 2, which is a flowchart illustrating a neural network training step in the method according to the preferred embodiment of the invention. As shown in fig. 2, the neural network training step includes:
first, in step S201, the flow starts;
subsequently, in step S202, a deep neural network is preliminarily constructed. It is an ordinary deep neural network trained by back propagation. Specifically, this step constructs a five-layer deep neural network with MapReduce parameters as input parameters and the optimal configuration parameters to be predicted as output parameters; the five layers comprise an input layer, an output layer, and three hidden layers.
Subsequently, in step S203, the historical data of the big data system is fed into the deep neural network as the training sample set. A training sample x is input, and the output of each hidden layer is x^l = f(u^l), where u^l = W^l x^{l-1} + b^l; the function f is the output activation function, W the weights, b the bias term, and l denotes the l-th layer. Because the parameters of the map and reduce processes cannot grow without bound and lie within fixed ranges, b is fixed to the upper limits of the parameters.
Subsequently, in step S204, it is judged whether the MapReduce time meets the time-cost requirement. A squared-error cost function measures the error: assuming the output has c parameter classes and the training sample set contains N samples, the error E_N between the MapReduce time and the specified time cost t is E_N = (1/2) Σ_{n=1}^{N} Σ_{k=1}^{c} (t_k^n − y_k^n)^2, where t_k^n is the k-th dimension of the target output of the n-th training sample, y_k^n is the k-th dimension of the actual output for the n-th sample, and c = 20. The errors between the layers of the network are computed; when the error is below a preset threshold the flow goes to step S206 to save the deep neural network, otherwise to step S205 to adjust the weights of each layer of neurons.
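The forward pass and the squared-error criterion of steps S203-S204 can be sketched in NumPy as follows. The hidden-layer widths and the sigmoid activation are assumptions for illustration; the patent fixes only the five-layer structure and c = 20 output dimensions.

```python
import numpy as np

# Sketch of the forward pass x_l = f(u_l), u_l = W_l @ x_{l-1} + b_l,
# and the squared-error cost E_N used as the stopping criterion.
rng = np.random.default_rng(0)
sizes = [20, 16, 16, 16, 20]   # input, three hidden layers, output (c = 20)
W = [rng.normal(0, 0.1, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(m) for m in sizes[1:]]

def f(u):
    # Assumed activation: sigmoid (the patent does not name f).
    return 1.0 / (1.0 + np.exp(-u))

def forward(x):
    for Wl, bl in zip(W, b):
        x = f(Wl @ x + bl)
    return x

def cost(Y, T):
    # E_N = 1/2 * sum_n sum_k (t_k^n - y_k^n)^2
    return 0.5 * np.sum((T - Y) ** 2)

x = rng.random(20)             # one sample of 20 MapReduce parameters
y = forward(x)
assert y.shape == (20,)        # c = 20 output dimensions
assert cost(y, y) == 0.0       # zero error when output equals target
```

With a sigmoid output, all predictions lie in (0, 1), so in practice each configuration parameter would be rescaled to its valid range (the text's fixed upper limits) before use.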
In step S205, the weights of the neurons in each layer are adjusted. Specifically, in this step the weight W of each layer is scaled via the sensitivity δ of its neurons, with the gradient ∂E_N/∂W^l = x^{l-1} (δ^l)^T, and W^l is adjusted along the direction that decreases E_N until the weights minimizing E_N are obtained;
where the sensitivity of the l-th layer is δ^l = (W^{l+1})^T δ^{l+1} ∘ f'(u^l), and the sensitivity of the output-layer neurons is δ^L = f'(u^L) ∘ (y^n − t^n), where L denotes the total number of layers, y^n is the actual output for the n-th sample, and t^n is the target output for the n-th sample.
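A minimal sketch of the step-S205 update follows, assuming a sigmoid activation and a small learning rate eta (η is an assumption — the text only says the weights are scaled via the sensitivity δ). It implements exactly the two sensitivity formulas above plus a gradient step on ∂E_N/∂W^l = x^{l-1}(δ^l)^T.

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = [20, 16, 16, 16, 20]   # five layers as in the text
W = [rng.normal(0, 0.1, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(m) for m in sizes[1:]]
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

def train_step(x, t, eta=0.1):
    # Forward pass, keeping each layer's output x_l.
    xs = [x]
    for Wl, bl in zip(W, b):
        xs.append(sigmoid(Wl @ xs[-1] + bl))
    y = xs[-1]
    # f'(u_l) for the sigmoid equals x_l * (1 - x_l).
    fp = [s * (1 - s) for s in xs[1:]]
    # Output-layer sensitivity: delta_L = f'(u_L) ∘ (y - t).
    delta = fp[-1] * (y - t)
    for l in reversed(range(len(W))):
        # Hidden-layer sensitivity uses the *pre-update* weights:
        # delta_l = (W_{l+1})^T delta_{l+1} ∘ f'(u_l).
        delta_prev = (W[l].T @ delta) * fp[l - 1] if l > 0 else None
        # Gradient step: dE/dW_l = delta_l x_{l-1}^T.
        W[l] -= eta * np.outer(delta, xs[l])
        b[l] -= eta * delta
        delta = delta_prev
    return 0.5 * np.sum((y - t) ** 2)   # error before this update

x, t = rng.random(20), rng.random(20)
e0 = train_step(x, t)
e1 = train_step(x, t)
assert e1 < e0   # one update reduces the error on this sample
```

Iterating this step until the error falls below the preset threshold corresponds to the S204/S205 loop of fig. 2.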
In step S206, the deep neural network is saved;
finally, in step S207, the flow of the neural network training step ends.
In the invention, both the MapReduce time and the configuration parameters serve as outputs. The MapReduce time can be understood as an intermediate output, while the configuration parameters are the essential output that is recorded and used. After the output time is compared against the ideal time, the weights are adjusted; adjusting the weights changes the time output and also changes the configuration parameter output, so the configuration parameters achieving the optimal time are obtained.
Referring to fig. 3, a block diagram of a system for tuning configuration parameters of a deep-learning-based big data system according to a preferred embodiment of the present invention is shown. As shown in fig. 3, the system 300 of this embodiment comprises a neural network training module 301 and a configuration parameter prediction module 302.
The neural network training module 301 is configured to preliminarily construct a deep neural network, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set; with the MapReduce time as the metric of the deep neural network, it adjusts the weights of the neurons in each layer according to the parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement. Preferably, the at least one MapReduce parameter is selected from Table 1 above, and its number is preferably between 2 and 20.
Specifically, the neural network training module 301 constructs a five-layer deep neural network with MapReduce parameters as input parameters and the optimal configuration parameters to be predicted as output parameters; the five layers comprise an input layer, an output layer, and three hidden layers. A training sample x is input, and the hidden-layer output is x^l = f(u^l), where u^l = W^l x^{l-1} + b^l; the function f is the output activation function, W the weights, b the bias term, and l denotes the l-th layer.
The neural network training module 301 further measures the error with a squared-error cost function: assuming the output has c parameter classes and the training sample set contains N samples, the error E_N between the MapReduce time and the specified time cost t is E_N = (1/2) Σ_{n=1}^{N} Σ_{k=1}^{c} (t_k^n − y_k^n)^2, where t_k^n is the k-th dimension of the target output of the n-th training sample and y_k^n is the k-th dimension of the actual output for the n-th sample.
The errors between the layers of the network are then computed. When the error is below a preset threshold the deep neural network is saved; otherwise the weight W of each layer of neurons is scaled via the neuron sensitivity δ, with ∂E_N/∂W^l = x^{l-1} (δ^l)^T;
where the sensitivity of the l-th layer is δ^l = (W^{l+1})^T δ^{l+1} ∘ f'(u^l), and the sensitivity of the output-layer neurons is δ^L = f'(u^L) ∘ (y^n − t^n), where L denotes the total number of layers, y^n is the actual output for the n-th sample, and t^n is the target output for the n-th sample.
The configuration parameter prediction module 302 is connected to the neural network training module 301 and inputs the set initial values of the at least one MapReduce parameter and the current test data into the deep neural network obtained in the neural network training step, yielding the configuration parameters of the deep-learning-based big data system.
In summary, the invention tunes the configuration parameters of the MapReduce framework with a deep neural network, avoiding the difficulties of manual adjustment and of searching for optimal parameters; by learning from historical parameters, the characteristics of each configuration parameter and the relationships among them can be captured more deeply, and the deep network's repeated learning, weight updating, and prediction yield the parameter configuration best suited to the application layer's requirements. The invention not only saves parameter-tuning time but also ensures that, with suitable parameters, the system's working time is allotted to compressing and decompressing data, which greatly reduces writing and transmission time, so the whole system runs fast and achieves good working results. Moreover, the method learns automatically from the input data of different infrastructure layers and the requirements raised by the application layer, and is highly adaptable.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the invention.

Claims (4)

1. A method for tuning configuration parameters of a big data system based on deep learning, characterized by comprising a neural network training step and a configuration parameter prediction step; wherein,
the neural network training step comprises the following steps:
step 1-1, preliminarily constructing a deep neural network, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set;
step 1-2, with the MapReduce time as the metric of the deep neural network, adjusting the weights of the neurons in each layer according to a parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement;
the configuration parameter prediction step comprises the following steps:
step 2-1, setting initial values of the at least one MapReduce parameter and reading the current test data;
step 2-2, inputting the initial values of the at least one MapReduce parameter and the current test data into the deep neural network obtained in the neural network training step to obtain the configuration parameters of the deep-learning-based big data system;
in the step 1-1:
constructing a five-layer deep neural network with mapping protocol parameters as input parameters, taking optimal configuration parameters to be predicted as output parameters, wherein the five-layer network respectively comprises an input layer, an output layer and three hidden layers, training samples x are input, and the output of the hidden layers is that y is xl=f(ul) Wherein u isl=Wlxl-1+blThe function f represents the output activation function, W represents the weight, b represents the bias term, and l represents the secondlA layer;
in the step 1-2:
using a square error cost function to measure errors, and assuming that the output parameter class is c and N training samples in the training sample set are shared, mapping the errors E between the reduction time and the specified time cost tNComprises the following steps:wherein,for the k-th dimension of the target output of the nth training sample,the k dimension of the actual output corresponding to the nth sample;
calculating the error between each layer of network, saving the deep neural network when the error is smaller than a preset threshold, otherwise, scaling the weight W of each layer of neuron through the sensitivity delta of the neuron:
wherein,and sensitivity of the l-th layer:the sensitivity of the neurons of the output layer is:wherein L represents the total number of layers, ynIs the actual output of the nth neuron, tnIs the target output of the nth neuron, signRepresenting a convolution.
2. The method for tuning configuration parameters of a big data system based on deep learning according to claim 1, wherein the number of the at least one MapReduce parameter is between 2 and 20.
3. A system for tuning configuration parameters of a big data system based on deep learning, characterized by comprising a neural network training module and a configuration parameter prediction module; wherein,
the neural network training module preliminarily constructs a deep neural network, with at least one MapReduce parameter as an input parameter, the optimal configuration parameters to be predicted as output parameters, and historical data of the big data system as the training sample set; with the MapReduce time as the metric of the deep neural network, it adjusts the weights of the neurons in each layer according to a parameter learning rule based on back propagation until the MapReduce time meets the time-cost requirement;
the configuration parameter prediction module inputs the set initial values of the at least one MapReduce parameter and the current test data into the deep neural network obtained in the neural network training step to obtain the configuration parameters of the deep-learning-based big data system;
the neural network training module is used for constructing a five-layer deep neural network with a mapping protocol parameter as an input parameter and an optimal configuration parameter to be predicted as an output parameter, the five-layer network comprises an input layer, an output layer and three hidden layers respectively, a training sample x is input, and the output of the hidden layer is xl=f(ul) Wherein u isl=Wlxl-1+blWherein, the function f represents the output activation function, W represents the weight, b represents the bias term, and l represents the l-th layer;
the neural network training module measures the error with a squared-error cost function: assuming the output parameters have c classes and the training sample set contains N training samples, the error E_N between the MapReduce time and the specified time cost t is E_N = (1/2) Σ_{n=1}^{N} Σ_{k=1}^{c} (t_k^n − y_k^n)², where t_k^n is the k-th dimension of the target output of the n-th training sample and y_k^n is the k-th dimension of the actual output for the n-th sample;
the error at each layer of the network is calculated; when the error is smaller than a preset threshold the deep neural network is saved, and otherwise the weights W of each layer of neurons are adjusted through the neuron sensitivities δ, with ∂E/∂W^l = δ^l (x^{l−1})^T,
where the sensitivity of the l-th layer is δ^l = (W^{l+1})^T δ^{l+1} ∘ f'(u^l) and the sensitivity of the output-layer neurons is δ^L = f'(u^L) ∘ (y^n − t^n), where L denotes the total number of layers, y^n is the actual output for the n-th sample, t^n is its target output, and the symbol ∘ denotes element-wise multiplication.
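The sensitivity formulas in claim 3 can be verified numerically: the gradient ∂E/∂W^l = δ^l (x^{l−1})^T implied by the recursion should match a central finite difference of the cost. The layer sizes, sigmoid activation, and random values below are assumptions made only for the check.

```python
import numpy as np

# Finite-difference check of the sensitivity-based gradient:
# delta^L = f'(u^L) o (y - t),  delta^l = (W^{l+1})^T delta^{l+1} o f'(u^l),
# dE/dW^l = delta^l (x^{l-1})^T.  Sizes and sigmoid are assumptions.

def f(u):
    return 1.0 / (1.0 + np.exp(-u))

def f_prime(u):
    s = f(u)
    return s * (1.0 - s)

rng = np.random.default_rng(1)
sizes = [3, 5, 5, 5, 2]
W = [rng.normal(0, 0.5, (sizes[l + 1], sizes[l])) for l in range(4)]
b = [rng.normal(0, 0.1, sizes[l + 1]) for l in range(4)]
x0, t = rng.random(3), rng.random(2)

def cost():
    x = x0
    for Wl, bl in zip(W, b):
        x = f(Wl @ x + bl)
    return 0.5 * np.sum((t - x) ** 2)   # E = 1/2 sum_k (t_k - y_k)^2

# analytic gradient via the sensitivity recursion
xs, us = [x0], []
for Wl, bl in zip(W, b):
    us.append(Wl @ xs[-1] + bl)
    xs.append(f(us[-1]))
deltas = [None] * 4
deltas[3] = f_prime(us[3]) * (xs[4] - t)
for l in range(2, -1, -1):
    deltas[l] = (W[l + 1].T @ deltas[l + 1]) * f_prime(us[l])
grads = [np.outer(deltas[l], xs[l]) for l in range(4)]

# central finite difference on one entry of each weight matrix
eps = 1e-6
for l in range(4):
    keep = W[l][0, 0]
    W[l][0, 0] = keep + eps
    e_plus = cost()
    W[l][0, 0] = keep - eps
    e_minus = cost()
    W[l][0, 0] = keep
    numeric = (e_plus - e_minus) / (2 * eps)
    assert abs(numeric - grads[l][0, 0]) < 1e-5   # analytic and numeric agree
```

The agreement at every layer confirms that the recursion and the weight-gradient formula are mutually consistent.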
4. The deep learning-based big data system configuration parameter tuning system according to claim 3, wherein the number of the at least one MapReduce parameter is 2 to 20.
CN201710361578.3A 2017-05-22 2017-05-22 The method and system of big data system configuration parameter tuning based on deep learning Active CN107229693B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710361578.3A CN107229693B (en) 2017-05-22 2017-05-22 The method and system of big data system configuration parameter tuning based on deep learning


Publications (2)

Publication Number Publication Date
CN107229693A CN107229693A (en) 2017-10-03
CN107229693B true CN107229693B (en) 2018-05-01

Family

ID=59933231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710361578.3A Active CN107229693B (en) 2017-05-22 2017-05-22 The method and system of big data system configuration parameter tuning based on deep learning

Country Status (1)

Country Link
CN (1) CN107229693B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992404B * 2017-12-31 2022-06-10 *** Communications Group Hubei Co Ltd Cluster computing resource scheduling method, device, equipment and medium
CN108363478B * 2018-01-09 2019-07-12 Peking University For wearable device deep learning application model load sharing system and method
CN110427356B * 2018-04-26 2021-08-13 China Mobile (Suzhou) Software Technology Co Ltd Parameter configuration method and equipment
CN108764568B * 2018-05-28 2020-10-23 Harbin Institute of Technology Data prediction model tuning method and device based on LSTM network
CN108990141B * 2018-07-19 2021-08-03 Zhejiang University of Technology Energy-collecting wireless relay network throughput maximization method based on deep multi-network learning
CN109041195A * 2018-07-19 2018-12-18 Zhejiang University of Technology A kind of energy-collecting type wireless relay network throughput maximization approach based on semi-supervised learning
CN109445935B * 2018-10-10 2021-08-10 Hangzhou Dianzi University Self-adaptive configuration method of high-performance big data analysis system in cloud computing environment
CN109815537B * 2018-12-19 2020-10-27 Tsinghua University High-flux material simulation calculation optimization method based on time prediction
CN109739950B * 2018-12-25 2020-03-31 China University of Political Science and Law Method and device for screening applicable legal provision
CN110134697B * 2019-05-22 2024-01-16 Nanjing University Method, device and system for automatically adjusting parameters of storage engine for key value
TWI752614B * 2020-09-03 2022-01-11 National Yang Ming Chiao Tung University Multiple telecommunication endpoints system and testing method thereof based on AI decision
CN113254472B * 2021-06-17 2021-11-16 Zhejiang Dahua Technology Co Ltd Parameter configuration method, device, equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504460A * 2014-12-09 2015-04-08 Beijing Didi Infinity Technology and Development Co Ltd Method and device for predicating user loss of car calling platform
CN106022521A * 2016-05-19 2016-10-12 Sichuan University Hadoop framework-based short-term load prediction method for distributed BP neural network
CN106202431A * 2016-07-13 2016-12-07 Huazhong University of Science and Technology A kind of Hadoop parameter automated tuning method and system based on machine learning
CN106648654A * 2016-12-20 2017-05-10 Shenzhen Institutes of Advanced Technology Data sensing-based Spark configuration parameter automatic optimization method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9886310B2 (en) * 2014-02-10 2018-02-06 International Business Machines Corporation Dynamic resource allocation in MapReduce


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Optimization and Research of BP Neural Network; Lü Qiongshuai; China Master's Theses Full-text Database, Information Science and Technology Section; 2012-04-15 (No. 4); pp. I140-69 *

Also Published As

Publication number Publication date
CN107229693A (en) 2017-10-03

Similar Documents

Publication Publication Date Title
CN107229693B (en) The method and system of big data system configuration parameter tuning based on deep learning
US20190279088A1 (en) Training method, apparatus, chip, and system for neural network model
CN111966684B (en) Apparatus, method and computer program product for distributed data set indexing
US10649996B2 (en) Dynamic computation node grouping with cost based optimization for massively parallel processing
CN110852421B (en) Model generation method and device
CN112884086B (en) Model training method, device, equipment, storage medium and program product
US20230385333A1 (en) Method and system for building training database using automatic anomaly detection and automatic labeling technology
CN103942108B (en) Resource parameters optimization method under Hadoop isomorphism cluster
US20210398013A1 (en) Method and system for performance tuning and performance tuning device
CN104750780A (en) Hadoop configuration parameter optimization method based on statistic analysis
CN107562804B (en) Data caching service system and method and terminal
US11568170B2 (en) Systems and methods of generating datasets from heterogeneous sources for machine learning
US20150347470A1 (en) Run-time decision of bulk insert for massive data loading
US9953057B2 (en) Partitioned join with dense inner table representation
CN116050540A (en) Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling
CN110688993B (en) Spark operation-based computing resource determination method and device
Sudharsan et al. Globe2train: A framework for distributed ml model training using iot devices across the globe
CN114443236A (en) Task processing method, device, system, equipment and medium
US11934927B2 (en) Handling system-characteristics drift in machine learning applications
CN106502842A (en) Data reconstruction method and system
CN115412401B (en) Method and device for training virtual network embedding model and virtual network embedding
US9600517B2 (en) Convert command into a BULK load operation
WO2022161081A1 (en) Training method, apparatus and system for integrated learning model, and related device
CN114968585A (en) Resource configuration method, device, medium and computing equipment
CN114817315B (en) Data processing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant