CN106506254A - A bottleneck node detection method for a large-scale stream data processing system

Info

Publication number
CN106506254A
Authority
CN
China
Prior art keywords
fuzzy
node
reasoning
bottleneck
stream data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610835764.1A
Other languages
Chinese (zh)
Other versions
CN106506254B (en)
Inventor
翟岩龙
吴煦
王子硕
扶聪
张鑫宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201610835764.1A
Publication of CN106506254A
Application granted
Publication of CN106506254B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a bottleneck node detection method for a large-scale stream data processing system, and belongs to the technical fields of big data computing, fuzzy logic, and stream preprocessing. The method (hereinafter "this method") relies on a bottleneck detection system based on fuzzy logic control (hereinafter "the system"), which comprises an initialization unit, a node state collection unit, a fuzzy inference unit, and a defuzzification unit. The steps of the method are: (1) the initialization unit initializes the fuzzy logic engine, sets the semantic labels and the membership function of each state quantity, loads the fuzzy rule set, and sets the decision parameters for the inference result; (2) the node state is obtained; (3) the input variables are fuzzified; (4) fuzzy inference is performed; (5) defuzzification is performed to obtain the decision result. The invention detects in time the change in system load caused by a change in traffic and promptly identifies bottleneck nodes so that they can be extended, so that only a cluster with optimal resource utilization is kept running and the cluster scale is reduced.

Description

A bottleneck node detection method for a large-scale stream data processing system
Technical field
The present invention relates to a bottleneck node detection method for a large-scale stream data processing system, and belongs to the technical fields of big data computing, fuzzy logic, and stream preprocessing.
Background art
With the development of real-time big data technology, many companies have begun to deploy their own stream data processing clusters, and keeping these clusters running is very costly. A stream data processing system is typically characterized by an unstable data stream size, and the complexity of the system changes rapidly with events. For the system to run normally even in the rare cases of heavy traffic, resources must be allocated according to the estimated maximum traffic when the cluster is configured. However, heavy traffic occurs only in a small fraction of cases; if resources are provisioned for peak demand, most resources are idle most of the time, the resource utilization of the system is very low, and resources are seriously wasted. Therefore, how to monitor a running cluster, quickly and efficiently detect the bottleneck nodes in it, and extend them is one of the key problems in the field of cloud computing infrastructure.
Mainstream stream data processing engines today are not designed to detect overload (bottleneck) on an individual node and extend it. Storm and S4, for example, run jobs using static configuration: when traffic is unstable they cannot dynamically allocate and reclaim resources as needed, and they can only detect the running state of the system as a whole. If extension is needed, the cluster must be stopped, the static configuration file edited as required, and resources reallocated before operation can continue. To meet the demand of current cloud computing platforms for scalability, researchers have studied several methods for detecting bottleneck nodes and integrated them on platforms such as Storm, and these methods have found many applications in the field of stream data processing.
Methods for bottleneck node detection and extension in stream data processing systems fall into three main classes. The first class is static judgment based on thresholds. This is a simple and intuitive method, but setting a static threshold correctly requires the user to understand the load trends of the application very well; moreover, thresholds are specific to each application, and the cloud platform cannot know how they were determined. The second class is automatic decision-making for auto-scaling based on reinforcement learning. This approach uses a Markov decision process model and the Q-learning algorithm, and judges the load state of the system with a decision model trained by machine learning. It has two defects: first, its initial performance is poor and it needs a very long training time; second, it needs a very large state space, since the number of states grows exponentially with the number of state variables, and performance degrades severely when there are too many states. The third class is based on control theory, which has been used for the automated management of systems such as web servers, storage systems, and data centers. Control-theoretic methods are usually divided into open-loop control and feedback/feed-forward control. Open-loop control has no feedback: it computes a value from the current system state and the system model, and does not check whether the effect of that value on the system output achieves the desired result. A controller with feedback, by contrast, observes the system output and corrects its calculation accordingly in order to obtain the desired result.
Integrating control theory into stream data processing systems has been studied extensively. Palden Lama, Xiaobo Zhou, et al., in the paper "Automated control in cloud computing: Challenges and opportunities", presented at the 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, proposed a control method that adjusts the number of virtual machines in a cluster based on average CPU utilization. This method uses fairly intuitive control logic and achieves automatic allocation of virtual machine resources, but its drawback is that it is too simple: it considers only the CPU usage parameter, and a single variable can hardly reflect the overall load state of a stream data processing system, so its results are unreliable and its error is large. The article "Self-Tuning Virtual Machines for Predictable eScience", presented at the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid in 2009, proposed a PI (Proportional-Integral) controller for controlling batch job resources. This method builds a feedback model that allocates resources according to the execution progress of a job. Although this controller works effectively, the model is mainly used to predict the execution progress and resource allocation of batch processing systems and is not fully suitable for stream data processing systems. Another widely used control technique is fuzzy logic. Fuzzy logic control maps the load parameters onto fuzzy sets, applies the defined fuzzy rule set to obtain the fuzzy variables corresponding to the result and their degrees of membership, and finally applies a defuzzification operation to obtain the final result of the fuzzy inference. A control system based on fuzzy logic theory is called a fuzzy controller. The paper "Fuzzy Modeling Based Resource Management for Virtualized Database Systems", presented at the IEEE 19th International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems in 2011, proposed a method that uses fuzzy logic to control dynamic resource allocation. That method uses CPU usage to represent the input load, and it verified the feasibility of fuzzy logic as a resource allocation controller. However, the paper mainly considers resource allocation at the business logic (database) layer, and it also suffers from an overly simple choice of input variables, so it cannot fully reflect how the system state changes under the highly variable data flow of a stream data processing system.
Although the existing resource control schemes above are effective in their respective application scenarios, most control-theory-based methods suffer from an overly simple choice of input variables; reinforcement-learning-based methods suffer from poor performance in the initialization stage and from learned models whose reliability is not guaranteed; and the simple, intuitive threshold-setting methods do not adapt to a wide range of application scenarios, because an independent threshold must be set for each application and setting it depends on understanding the complexity of that application. The purpose of the present invention is to solve these problems by proposing a control-theory-based bottleneck node detection method for a large-scale stream data processing system. The method selects multiple variables that sufficiently reflect the characteristics of a stream processing system to take part in the calculation. The invention can detect in time the change in system load caused by a change in traffic and promptly identify bottleneck nodes so that they can be extended, so that only a cluster with optimal resource utilization is kept running, reducing the cluster scale and saving resources.
Summary of the invention
The purpose of the present invention is to overcome the technical defects of existing scaling approaches, namely the insufficient selection of state variables and low reliability, by proposing a bottleneck node detection method for a large-scale stream data processing system.
The system on which the bottleneck node detection method relies, namely a bottleneck detection system based on fuzzy logic control (hereinafter "the system"), comprises an initialization unit, a node state collection unit, a fuzzy inference unit, and a defuzzification unit;
The bottleneck node detection method for a large-scale stream data processing system (hereinafter "this method") comprises the following steps:
Step 1: The initialization unit initializes the fuzzy logic engine, sets the semantic labels of the input variables and the membership function of each semantic label, loads the fuzzy rule set, and sets the decision parameters for the inference result;
Wherein the fuzzy logic engine is a program engine that implements the Fuzzy Control Language (FCL) standard (IEC 1131-7) and can perform fuzzy inference; fuzzylite, implemented in C++, or jFuzzyLogic, implemented in Java, can be used;
Semantic labels are the "truth values" used by fuzzy logic. Each input value (i.e., node state quantity) has its own semantic labels and corresponding membership functions; these should be set when the node is initialized, are typically recorded in a configuration file, and are read and parsed by the fuzzy logic engine;
Each semantic label of an input variable corresponds to a membership function whose value lies between 0 and 1. Suppose the range of an input variable x is (m, n); the function f(x, v) then denotes the membership in semantic label v when the variable takes the value x. In general, the membership function of a semantic label is trapezoidal or triangular in a rectangular coordinate system. The fuzzy rule set is the set of fuzzy inference rules written in advance; the fuzzy logic engine reads and parses these rules for use in the subsequent logical inference;
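The trapezoidal and triangular membership functions mentioned above can be sketched in Python as follows. This is only an illustrative sketch: the function names `triangular` and `trapezoidal` and the breakpoint parameters a, b, c, d are assumptions introduced here; they are not taken from the patent, from fuzzylite, or from jFuzzyLogic.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership that rises from a to a peak of 1 at b, then falls to 0 at c.

    Assumes a < b < c (hypothetical breakpoints chosen by the user).
    """
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership that rises from a to b, stays at 1 on [b, c], falls to 0 at d.

    Assumes a <= b <= c <= d; a == b or c == d gives a one-sided "shoulder".
    """
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)
```

Both functions return a value between 0 and 1, matching the range the text requires of f(x, v) for a fixed label v.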
Decision parameters for the inference result: two thresholds must also be set for the final result, a shrink threshold and an extend threshold, denoted threshold_scale_in and threshold_scale_out respectively. When the defuzzification result is less than threshold_scale_in, the current node can be reclaimed; when the defuzzification result is greater than threshold_scale_out, the current node needs to be extended;
Step 2: The node state collection unit obtains the node state;
To make a bottleneck judgment on a node, its current running state must first be obtained. For a stream data processing cluster, we select basic information such as the node's CPU usage, memory usage, and data tuple size;
Wherein the node state is represented by a tuple status_i = {C_i, M_i, S_i, Miss_i}, whose elements denote, for node i: the current CPU load C_i; the memory usage M_i; the size S_i of the data tuple currently being processed; and the number Miss_i of tuples that were recently not processed in time;
For a stream data processing engine that requires every data tuple to be processed strictly within the specified limit, i.e., that does not allow timeouts, a timed-out tuple means the system already needs to be extended, so for this kind of engine the value of Miss_i in the system is always 0. For stream data processing engines that tolerate some timeouts, a group of labels and corresponding membership functions can be set for the Miss item among the semantic labels;
Wherein C_i ranges from 0 to 100 and M_i ranges from 0 to 100; the ranges of S_i and Miss_i depend on the specific application scenario;
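The state tuple described in step 2 could be held in a small record type such as the one below. This is a sketch under stated assumptions: the class name `NodeStatus` and its field names are hypothetical, and the only range checks applied are the ones the text gives (0 to 100 for C_i and M_i, with S_i and Miss_i left application-specific apart from being non-negative, which is an assumption).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    """Hypothetical container for status_i = {C_i, M_i, S_i, Miss_i}."""

    cpu: float         # C_i: current CPU load, 0..100
    memory: float      # M_i: memory usage, 0..100
    tuple_size: float  # S_i: size of the tuple being processed (unit is application-specific)
    missed: int        # Miss_i: tuples recently not processed in time

    def __post_init__(self) -> None:
        # The text fixes the ranges of C_i and M_i; the other two depend on the application.
        if not 0.0 <= self.cpu <= 100.0:
            raise ValueError("cpu must be within 0..100")
        if not 0.0 <= self.memory <= 100.0:
            raise ValueError("memory must be within 0..100")
        # Non-negativity is an assumption made for this sketch.
        if self.tuple_size < 0 or self.missed < 0:
            raise ValueError("tuple_size and missed must be non-negative")
```

For an engine that does not allow timeouts, `missed` would simply always be 0, as the text notes.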
Step 3: The fuzzy inference unit fuzzifies the input variables, specifically:
The fuzzy inference unit uses the state tuple obtained in step 2 to set the inputs of the fuzzy logic engine and fuzzifies them using the defined membership functions; this step can be completed by the fuzzy logic engine. The specific procedure for fuzzifying the input variables is as follows:
Step 3.1: A membership function records the degree to which a variable belongs to a fuzzy set; for each fuzzy set of an input variable, a degree of membership must be computed separately;
Wherein the membership function is denoted μ_A(x), and the degree of membership is a real number between 0 and 1;
Computing a degree of membership for each fuzzy set of an input variable means, specifically:
Suppose there are n fuzzy sets A_1, A_2, ..., A_n; a degree of membership must then be computed for each of these n fuzzy sets, giving [μ_A1, μ_A2, ..., μ_An];
Step 3.2: For all input variables, compute the degrees of membership in their respective fuzzy sets;
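Steps 3.1 and 3.2 can be illustrated with a short Python sketch. In practice the fuzzy logic engine performs this step; the code below only shows the idea. The helper `trapezoid`, the label names, and every breakpoint in `CPU_SETS` are assumptions chosen for illustration and do not come from the patent.

```python
from typing import Callable, Dict

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Build a trapezoidal membership function (a <= b <= c <= d)."""
    def mu(x: float) -> float:
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


# Hypothetical fuzzy sets A_1..A_n for one input variable (CPU usage in percent).
CPU_SETS: Dict[str, Membership] = {
    "low": trapezoid(0, 0, 20, 50),
    "medium": trapezoid(20, 50, 50, 80),
    "high": trapezoid(50, 80, 100, 100),
}


def fuzzify(value: float, fuzzy_sets: Dict[str, Membership]) -> Dict[str, float]:
    """Step 3.1: one membership degree per fuzzy set of a single input variable."""
    return {label: mu(value) for label, mu in fuzzy_sets.items()}


def fuzzify_all(
    inputs: Dict[str, float],
    definitions: Dict[str, Dict[str, Membership]],
) -> Dict[str, Dict[str, float]]:
    """Step 3.2: repeat step 3.1 for every input variable."""
    return {name: fuzzify(inputs[name], sets) for name, sets in definitions.items()}
```

With these assumed breakpoints, a CPU usage of 65 belongs to "medium" and "high" with degree 0.5 each and to "low" with degree 0.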
Step 4: Fuzzy inference;
Wherein fuzzy inference is inference based on fuzzy rules. The premise of a fuzzy rule, i.e., the condition of the fuzzy inference, is a logical combination of fuzzy propositions; the conclusion of a fuzzy rule is a fuzzy proposition expressing the inference result. The degree to which each fuzzy proposition holds is given by the membership function of the qualitative value of the corresponding linguistic variable, i.e., the fuzzification result obtained in step 3;
Step 4 is, specifically:
Step 4.1: The fuzzy inference unit computes the conclusion of each fuzzy rule. Using the fuzzification result from step 3, it evaluates the logical combination of the fuzzy propositions in the rule's premise, then takes the min of the premise's degree of membership and the membership function of the conclusion proposition to obtain the fuzzy degree of the conclusion;
Step 4.2: The max is taken over the fuzzy degrees of the conclusions of all fuzzy rules from step 4.1 to obtain the fuzzy inference result;
At this point, since the fuzzy logic engine provides a complete implementation of fuzzy inference, we only need to define the fuzzy rule set and call the interface provided by the engine to obtain the inference result;
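The min and max operations of steps 4.1 and 4.2 can be sketched as follows. This is an illustrative sketch, not the engine's implementation: it assumes that the premise of each rule combines its propositions with AND (evaluated as min), that the output universe is sampled at a few discrete points, and that rules are written as (premise, output label) pairs. All names and numbers are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Rule = Tuple[Dict[str, str], str]  # ({variable: label, ...}, output label)


def infer(
    fuzzified: Dict[str, Dict[str, float]],
    rules: Sequence[Rule],
    output_sets: Dict[str, Callable[[float], float]],
    universe: Sequence[float],
) -> List[float]:
    """Return the aggregated output membership at each point of `universe`."""
    aggregated = [0.0] * len(universe)
    for premise, out_label in rules:
        # Premise: AND of its propositions, evaluated as min (an assumption here).
        strength = min(fuzzified[var][label] for var, label in premise.items())
        mu_out = output_sets[out_label]
        for k, y in enumerate(universe):
            # Step 4.1: min of premise degree and conclusion membership.
            clipped = min(strength, mu_out(y))
            # Step 4.2: max over the conclusions of all rules.
            aggregated[k] = max(aggregated[k], clipped)
    return aggregated


# Hypothetical example data.
fuzzified = {
    "cpu": {"low": 0.0, "high": 0.8},
    "mem": {"low": 0.0, "high": 0.6},
}
rules = [
    ({"cpu": "high", "mem": "high"}, "extend"),
    ({"cpu": "low", "mem": "low"}, "shrink"),
]
output_sets = {
    "shrink": lambda y: max(0.0, 1.0 - y / 50.0),
    "extend": lambda y: max(0.0, (y - 50.0) / 50.0),
}
universe = [0, 25, 50, 75, 100]
result = infer(fuzzified, rules, output_sets, universe)
```

Here the "extend" rule fires with strength min(0.8, 0.6) = 0.6 and the "shrink" rule does not fire, so `result` is the "extend" membership clipped at 0.6.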
Step 5: Defuzzification, obtaining the decision result;
Step 4 yields a set of membership values over the fuzzy sets of the result; this set must be defuzzified to reach a conclusion on whether a node is a bottleneck. Preferred defuzzification methods include the maximum membership method, the weighted average method, and the center of gravity (COG) method. The maximum membership method takes the result with the largest degree of membership as the final decision; it is simple to implement but imprecise. COG is more commonly used; it takes the position of the center of gravity of the result set as the result;
Fuzzy logic engines implement several defuzzification algorithms; in jFuzzyLogic, for example, it suffices to specify COG as the value of the DEFUZZIFY METHOD in the configuration file to use center-of-gravity defuzzification. The result of center-of-gravity defuzzification is a single number, which is compared with the two thresholds set in the initialization stage to reach the final decision: extend, shrink, or leave unchanged.
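Two of the defuzzification methods named above can be sketched on a discretely sampled output set. This is a minimal sketch under the assumption that the aggregated membership is available at sample points of the output universe; the function names are hypothetical, and a real engine such as jFuzzyLogic performs this step internally.

```python
from typing import Sequence


def centroid(universe: Sequence[float], memberships: Sequence[float]) -> float:
    """Center of gravity (COG) of a sampled fuzzy set: sum(y * mu) / sum(mu)."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("no rule fired: the aggregated membership is all zero")
    return sum(y * m for y, m in zip(universe, memberships)) / total


def max_membership(universe: Sequence[float], memberships: Sequence[float]) -> float:
    """Maximum membership method: the sample point with the largest degree."""
    best = max(range(len(universe)), key=lambda k: memberships[k])
    return universe[best]
```

The centroid weighs the whole shape of the result set, whereas the maximum membership method looks only at its peak, which is why the text describes the latter as simple but imprecise.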
At this point, steps 1 through 5 complete the bottleneck node detection method for a large-scale stream data processing system.
Beneficial effect
Compared with other bottleneck node detection methods for large-scale stream data processing systems, the present method has the following beneficial effects:
1. The bottleneck node detection method proposed by the present invention depends only on the current system state and the system state at earlier time points, and does not require computing an integral function over time;
2. The bottleneck node detection method proposed by the present invention does not require model training, so its accuracy and stability are not affected by training data;
3. The bottleneck node detection method proposed by the present invention simplifies computation, and its stability and reliability are both relatively good;
Description of the drawings
Fig. 1 is the execution flow chart of the bottleneck node detection method for a large-scale stream data processing system of the present invention;
Fig. 2 is an example diagram of the membership function of CPU utilization in the bottleneck node detection method for a large-scale stream data processing system of the present invention.
Specific embodiment
The method of the present invention is explained in depth below with reference to the accompanying drawings and a specific embodiment.
Embodiment 1
This embodiment describes in detail the flow of applying the present invention to bottleneck node detection in a stream data processing system.
Step A: Initialization.
The fuzzy logic engine used in this example is jFuzzyLogic, implemented in Java; the semantic labels and membership functions are initialized through the configuration file provided by jFuzzyLogic.
The node in this example only needs to process one incoming data stream. Let T denote the task; at time t_i, Task_i denotes the state of the processing being executed. We describe this state using the following parameters.
p_i(t): the size of the data tuple currently being processed
c_i(t): the CPU usage of the current node
m_i(t): the memory usage of the current node
miss_i(t): the number of tuples that were not processed in time and were missed
The purpose of using fuzzy logic for decision-making is to judge whether a node has become a bottleneck: if it has, the node needs to be extended; if its load is very low, the node can be reclaimed. We therefore define the output as the action to be performed on the node, with the output set Out = {extend, maintain, shrink}.
Using the chosen membership functions, we fuzzify the four input parameters above and the one output parameter. Semantic labels, i.e., fuzzy sets, are set for each chosen parameter. For CPU usage, the domain is 0%-100%; empirically, the set of semantic labels for CPU usage can be set to C = {very low, low, medium, high, very high}. We call this set the linguistic labels of CPU usage. For memory usage, the domain is 0%-100%, and the set can likewise be M = {very low, low, medium, high, very high}. For the number of timed-out (Miss) tuples, the interval in this example is 0-10, and its fuzzy set is E = {small, medium, large}. Likewise, for the size of the data tuple currently being processed, the domain in this example is 0 Mb-10 Mb, and its fuzzy set is P = {small, medium, large, very large}.
A membership function can be expressed as a piecewise function or drawn as a polyline graph. For convenience, we use polyline graphs to show the membership function of each dimension. Empirically, when CPU usage is below 5%, its degree of membership in "very low" is taken to be 1, i.e., it is considered 100% very low; when CPU usage is above 90%, its degree of membership in "very high" is taken to be 1; in other cases, i.e., between 5% and 90%, the degree of membership is as shown in Fig. 2.
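The two fixed points stated above (membership 1 in "very low" below 5%, membership 1 in "very high" above 90%) can be written as one-sided "shoulder" functions. In the sketch below only the 5% and 90% breakpoints come from the text; the points at which each membership reaches 0 (`zero_at`) are hypothetical placeholders, since the actual shape between 5% and 90% is given by Fig. 2, which is not reproduced here.

```python
def cpu_very_low(cpu: float, zero_at: float = 20.0) -> float:
    """Membership of CPU usage (percent) in "very low".

    Equals 1 up to 5% (from the text); falls linearly to 0 at `zero_at`,
    which is an assumed value, not taken from Fig. 2.
    """
    if cpu <= 5.0:
        return 1.0
    if cpu >= zero_at:
        return 0.0
    return (zero_at - cpu) / (zero_at - 5.0)


def cpu_very_high(cpu: float, zero_at: float = 75.0) -> float:
    """Membership of CPU usage (percent) in "very high".

    Equals 1 from 90% upward (from the text); rises linearly from 0 at
    `zero_at`, which is an assumed value, not taken from Fig. 2.
    """
    if cpu >= 90.0:
        return 1.0
    if cpu <= zero_at:
        return 0.0
    return (cpu - zero_at) / (90.0 - zero_at)
```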
Fuzzy rules are designed for combinations of the fuzzified variables according to intuitive understanding; for example, when CPU usage and memory usage are both very high, the node is considered to need extending. The following table is a subset of the fuzzy rule base used in this example; fuzzy inference is carried out with this rule base:
Fig. 1 is the execution flow chart of the system on which the proposed method relies.
As Fig. 1 shows, our system exists as a plug-in to the stream data processing system and obtains status data from the stream data processing system to perform the computation. The fuzzy logic engine jFuzzyLogic performs initialization by reading the definitions of the semantic labels and membership functions. The result is then obtained through the steps of fuzzification, fuzzy inference, and defuzzification.
Step B: The node state collection unit obtains the node state. The variables determined in step A can be obtained very conveniently at run time: the size of a data tuple is an attribute of the data stream, and CPU usage, memory usage, and network interface data traffic can all be obtained through system interfaces. These parameters have the greatest impact on a stream data processing engine: the processing performance of a node depends on its CPU computing performance and memory size, and the throughput of the system is affected by the volume of data traffic and the size of individual data tuples. For the configuration of mainstream processing nodes, we consider that disk I/O performance is not a principal factor affecting node throughput; even mechanical hard disk storage has transfer speeds sufficient for current stream data processing.
Step C: Input variable fuzzification. After the engine is initialized, the fuzzy inference unit can set the inputs through the Java interface.
Step D: Fuzzy inference. Fuzzy inference can likewise be carried out by calling the Java interface of jFuzzyLogic.
Step E: Defuzzification, obtaining the decision result.
In this example, defuzzification uses the commonly used COG (Center of Gravity) algorithm; the defuzzifier is set to COG in jFuzzyLogic. After COG defuzzification, a numerical value of the output variable (Out) is obtained. We set the decision threshold threshold_scale_in to 20 and threshold_scale_out to 80: when the defuzzification result is less than 20, the node is judged reclaimable; when the defuzzification result is greater than 80, the current node is considered to be in a bottleneck state and needs to be extended.
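The final comparison against the two thresholds can be sketched as below. The threshold names and the values 20 and 80 come from this embodiment; the `Action` enum and the function name `decide` are hypothetical, and treating a score exactly equal to a threshold as "maintain" follows from the strict "less than" and "greater than" wording of the text.

```python
from enum import Enum


class Action(Enum):
    EXTEND = "extend"
    MAINTAIN = "maintain"
    SHRINK = "shrink"


def decide(
    score: float,
    threshold_scale_in: float = 20.0,
    threshold_scale_out: float = 80.0,
) -> Action:
    """Map a defuzzified score to an action on the node."""
    if score < threshold_scale_in:
        return Action.SHRINK   # load is very low: the node can be reclaimed
    if score > threshold_scale_out:
        return Action.EXTEND   # the node is a bottleneck: extend it
    return Action.MAINTAIN     # otherwise leave the node unchanged
```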
The above is the preferred embodiment of the present invention, and the present invention should not be limited to the content disclosed in this embodiment and the accompanying drawings. Any equivalent or modification made without departing from the spirit disclosed by the present invention falls within the scope of protection of the present invention.

Claims (7)

1. A bottleneck node detection method for a large-scale stream data processing system, characterized in that: the system on which the method relies, namely a bottleneck detection system based on fuzzy logic control (hereinafter "the system"), comprises an initialization unit, a node state collection unit, a fuzzy inference unit, and a defuzzification unit;
the bottleneck node detection method for a large-scale stream data processing system comprises the following steps:
Step 1: The initialization unit initializes the fuzzy logic engine, sets the semantic labels of the input variables and the membership function of each semantic label, loads the fuzzy rule set, and sets the decision parameters for the inference result;
Step 2: The node state collection unit obtains the node state;
Step 3: The fuzzy inference unit fuzzifies the input variables;
Step 4: Fuzzy inference;
Step 5: Defuzzification, obtaining the decision result;
At this point, steps 1 through 5 complete the bottleneck node detection method for a large-scale stream data processing system.
2. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 1, the fuzzy logic engine is a program engine that implements the Fuzzy Control Language (FCL) standard (IEC 1131-7) and can perform fuzzy inference; fuzzylite, implemented in C++, or jFuzzyLogic, implemented in Java, can be used;
Semantic labels are the "truth values" used by fuzzy logic; each input value (i.e., node state quantity) has its own semantic labels and corresponding membership functions, which should be set when the node is initialized, are typically recorded in a configuration file, and are read and parsed by the fuzzy logic engine;
Each semantic label of an input variable corresponds to a membership function whose value lies between 0 and 1; supposing the range of an input variable x is (m, n), the function f(x, v) denotes the membership in semantic label v when the variable takes the value x; in general, the membership function of a semantic label is trapezoidal or triangular in a rectangular coordinate system; the fuzzy rule set is the set of fuzzy inference rules written in advance, which the fuzzy logic engine reads and parses for use in the subsequent logical inference;
Decision parameters for the inference result: two thresholds must also be set for the final result, a shrink threshold and an extend threshold, denoted threshold_scale_in and threshold_scale_out respectively; when the defuzzification result is less than threshold_scale_in, the current node can be reclaimed, and when the defuzzification result is greater than threshold_scale_out, the current node needs to be extended.
3. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 2, to make a bottleneck judgment on a node, the current running state of the node must first be obtained; for a stream data processing cluster, basic information such as the node's CPU usage, memory usage, and data tuple size is selected;
Wherein the node state is represented by a tuple status_i = {C_i, M_i, S_i, Miss_i}, whose elements denote, for node i: the current CPU load C_i; the memory usage M_i; the size S_i of the data tuple currently being processed; and the number Miss_i of tuples that were recently not processed in time;
For a stream data processing engine that requires every data tuple to be processed strictly within the specified limit, i.e., that does not allow timeouts, a timed-out tuple means the system already needs to be extended, so for this kind of engine the value of Miss_i in the system is always 0; for stream data processing engines that tolerate some timeouts, a group of labels and corresponding membership functions can be set for the Miss item among the semantic labels;
Wherein C_i ranges from 0 to 100 and M_i ranges from 0 to 100; the ranges of S_i and Miss_i depend on the specific application scenario.
4. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 3, the fuzzy reasoning unit takes the state tuple obtained in step 2 as the input quantities of the fuzzy logic processing engine and fuzzifies the input quantities using the defined membership functions; this step can be completed by the fuzzy logic processing engine; the specific flow of input variable fuzzification is as follows:
Step 3.1: a membership function records the degree to which a variable belongs to a fuzzy set; for a given input variable, a group of membership degrees must be computed, one for each of its fuzzy sets;
wherein the membership function is denoted μ_A(x), and a membership degree is a real number between 0 and 1;
computing a group of membership degrees for the fuzzy sets of an input variable is specifically:
assuming there are n fuzzy sets A_1, A_2, ..., A_n, the membership degree must be computed for each of these n fuzzy sets, giving [μ_A1, μ_A2, ..., μ_An];
Step 3.2: for every input variable, compute the membership degrees of its fuzzy sets in the same way.
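The fuzzification of steps 3.1 and 3.2 can be sketched as follows in Python. The trapezoidal shape of the membership functions, the labels `low`, `medium` and `high`, and all breakpoints are assumptions chosen for illustration; the claim does not fix any of them.

```python
def trapezoid(a, b, c, d):
    """Return a membership function that is 1 on [b, c] and falls to 0 at a and d."""
    def mu(x):
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)  # rising edge
        return (d - x) / (d - c)      # falling edge
    return mu


# Hypothetical fuzzy sets A_1..A_n for the CPU load C_i (range 0..100).
CPU_SETS = {
    "low": trapezoid(0, 0, 30, 50),
    "medium": trapezoid(30, 50, 60, 80),
    "high": trapezoid(60, 80, 100, 100),
}


def fuzzify(x, fuzzy_sets):
    """Step 3.1: compute [mu_A1(x), ..., mu_An(x)] for one input variable."""
    return {label: mu(x) for label, mu in fuzzy_sets.items()}


# Step 3.2 repeats this call for every input variable of the state tuple.
cpu_degrees = fuzzify(70, CPU_SETS)
```

With these assumed breakpoints, a CPU load of 70 belongs to `medium` and `high` to degree 0.5 each and to `low` to degree 0.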
5. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 4, fuzzy reasoning is reasoning based on fuzzy rules; the premise of a fuzzy rule, i.e. the condition of the fuzzy reasoning, is a logical combination of fuzzy propositions; the conclusion of a fuzzy rule is a fuzzy proposition representing the reasoning result; the degree to which each fuzzy proposition holds is expressed by the membership function of the qualitative value of the corresponding linguistic variable, i.e. the fuzzification result obtained in step 3.
6. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that step 4 is specifically:
Step 4.1: the fuzzy reasoning unit computes the conclusion of each fuzzy rule; using the fuzzification result obtained in step 3, it evaluates the logical combination of the fuzzy propositions in the rule's premise, and takes the min of the membership degree of the premise combination and the membership function of the conclusion proposition to obtain the fuzzy degree of the conclusion;
Step 4.2: take the max over the fuzzy degrees of the conclusions of all fuzzy rules from step 4.1 to obtain the fuzzy reasoning result;
the fuzzy logic engine already provides a complete implementation of fuzzy reasoning, so it is only necessary to define the fuzzy rule set and then call the interface provided by the engine to obtain the reasoning result.
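For readers without a fuzzy logic engine at hand, the min/max reasoning of steps 4.1 and 4.2 can be sketched in plain Python. The rule format, the output labels `idle` and `bottleneck`, the ramp-shaped output membership functions, the sample input degrees, and the use of min as the premise AND are all assumptions for this example.

```python
# Hypothetical output fuzzy sets over a 0..100 "bottleneck score" axis.
OUT_SETS = {
    "idle": lambda y: max(0.0, 1.0 - y / 50.0),
    "bottleneck": lambda y: max(0.0, (y - 50.0) / 50.0),
}

# Each rule: (premise as a list of (variable, label) joined by AND, conclusion label).
RULES = [
    ([("cpu", "high"), ("mem", "high")], "bottleneck"),
    ([("cpu", "low"), ("mem", "low")], "idle"),
]


def infer(fuzzified, rules, out_sets, universe):
    """Return the aggregated output membership sampled at each point of `universe`."""
    aggregated = []
    for y in universe:
        degree = 0.0
        for premise, conclusion in rules:
            # Step 4.1: premise degree (AND taken as min), then min with the conclusion's membership.
            strength = min(fuzzified[var][label] for var, label in premise)
            clipped = min(strength, out_sets[conclusion](y))
            # Step 4.2: max over the conclusions of all rules.
            degree = max(degree, clipped)
        aggregated.append(degree)
    return aggregated


# Assumed fuzzification result from step 3.
fuzzified = {
    "cpu": {"low": 0.0, "high": 0.5},
    "mem": {"low": 0.2, "high": 0.8},
}
universe = list(range(0, 101, 25))  # coarse sampling: 0, 25, 50, 75, 100
result = infer(fuzzified, RULES, OUT_SETS, universe)
```

Here only the first rule fires (with strength 0.5), so the aggregated result is zero up to a score of 50 and capped at 0.5 above it. A real deployment would use a much finer sampling of the output axis.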
7. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that:
what step 4 yields is a group of membership degree values over the fuzzy sets of the result, and this group of results must be defuzzified to conclude whether the node is a bottleneck; preferred defuzzification methods are the maximum membership method, the weighted average method, and the center of gravity method (Center of Gravity, COG); the maximum membership method takes the result with the largest membership degree among all results as the final decision, which is simple to implement but has poor precision; COG is more commonly used, and it takes the position of the center of gravity of the result set as the result;
multiple defuzzification algorithms are implemented in fuzzy logic engines; in jFuzzylogic, for example, it is only necessary to set the value of DEFUZZIFY METHOD to COG in the configuration file to use center-of-gravity defuzzification; the result of center-of-gravity defuzzification is a numerical value, which is compared with the two thresholds set in the initialization phase to make the final decision on whether to extend, to shrink, or to remain unchanged.
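A discrete version of center-of-gravity defuzzification, followed by the two-threshold decision, might look like the Python sketch below. The function names, the threshold values 30 and 70, and the convention that a score above the upper threshold means "extend" are assumptions; the claim only states that the value is compared with two thresholds set during initialization.

```python
def center_of_gravity(universe, degrees):
    """Discrete COG: membership-weighted mean of the sampled output points."""
    total = sum(degrees)
    if total == 0:
        raise ValueError("no rule fired, center of gravity is undefined")
    return sum(y * m for y, m in zip(universe, degrees)) / total


def decide(score, lower, upper):
    """Compare the crisp score with the two thresholds (direction is an assumption)."""
    if score > upper:
        return "extend"
    if score < lower:
        return "shrink"
    return "keep"


# Sampled fuzzy reasoning result (illustrative values, not measured data).
universe = [0, 25, 50, 75, 100]
degrees = [0.0, 0.0, 0.0, 0.5, 0.5]

score = center_of_gravity(universe, degrees)
action = decide(score, lower=30, upper=70)
```

For these sample values the center of gravity is 87.5, which lies above the assumed upper threshold, so the sketch decides to extend.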
CN201610835764.1A 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system Active CN106506254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610835764.1A CN106506254B (en) 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system

Publications (2)

Publication Number Publication Date
CN106506254A (en) 2017-03-15
CN106506254B (en) 2019-04-16

Family

ID=58291455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610835764.1A Active CN106506254B (en) 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system

Country Status (1)

Country Link
CN (1) CN106506254B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1345149A (en) * 2000-08-07 2002-04-17 香港科技大学 Flow-type data method and device
CN102404399A (en) * 2011-11-18 2012-04-04 浪潮电子信息产业股份有限公司 Fuzzy dynamic allocation method for cloud storage resource
CN102624870A (en) * 2012-02-01 2012-08-01 北京航空航天大学 Intelligent optimization algorithm based cloud manufacturing computing resource reconfigurable collocation method
CN103491024A (en) * 2013-09-27 2014-01-01 中国科学院信息工程研究所 Job scheduling method and device for streaming data
CN103530189A (en) * 2013-09-29 2014-01-22 中国科学院信息工程研究所 Automatic scaling and migrating method and device oriented to stream data
CN103853766A (en) * 2012-12-03 2014-06-11 中国科学院计算技术研究所 Online processing method and system oriented to streamed data
CN105069025A (en) * 2015-07-17 2015-11-18 浪潮通信信息***有限公司 Intelligent aggregation visualization and management and control system for big data
CN105721199A (en) * 2016-01-18 2016-06-29 中国石油大学(华东) Real-time cloud service bottleneck detection method based on kernel density estimation and fuzzy inference system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
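Lin Jianqiu et al.: "A matching algorithm for network streaming text data" (一种针对网络流式文本数据的匹配算法), Journal of Qiqihar University (《齐齐哈尔大学学报》) *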

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669436A (en) * 2018-12-06 2019-04-23 广州小鹏汽车科技有限公司 A kind of method for generating test case and device of the functional requirement based on electric car
CN109669436B (en) * 2018-12-06 2021-04-13 广州小鹏汽车科技有限公司 Test case generation method and device based on functional requirements of electric automobile
CN112148566A (en) * 2020-11-09 2020-12-29 中国平安人寿保险股份有限公司 Monitoring method and device of computing engine, electronic equipment and storage medium
CN112148566B (en) * 2020-11-09 2023-07-25 中国平安人寿保险股份有限公司 Method and device for monitoring computing engine, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN106506254B (en) 2019-04-16

Similar Documents

Publication Publication Date Title
CN102770826B (en) virtual machine power consumption measurement and management
JP2016100005A (en) Reconcile method, processor and storage medium
Yin et al. Cloudscout: A non-intrusive approach to service dependency discovery
KR101686919B1 (en) Method and apparatus for managing inference engine based on big data
CN106296315A (en) Context aware systems based on user power utilization data
CN105022823B (en) A kind of cloud service performance early warning event generation method based on data mining
Xiu et al. Sustainable development of port economy based on intelligent system dynamics
CN106506254B (en) A kind of bottleneck node detection method of extensive stream data processing system
CN115291806A (en) Processing method, processing device, electronic equipment and storage medium
Dogani et al. K-agrued: A container autoscaling technique for cloud-based web applications in kubernetes using attention-based gru encoder-decoder
Jiang et al. An energy-aware virtual machine migration strategy based on three-way decisions
Ruan et al. Cloud workload turning points prediction via cloud feature-enhanced deep learning
Wang et al. Data Factory: An Efficient Data Analysis Solution in the Era of Big Data
Chehida et al. Applied statistical model checking for a sensor behavior analysis
CN115495231A (en) Dynamic resource scheduling method and system under complex scene of high concurrent tasks
CN114443205B (en) Fault analysis method, device and non-transitory computer readable storage medium
Yongdnog et al. A scalable and integrated cloud monitoring framework based on distributed storage
CN110113301B (en) Intrusion detection system based on cloud computing
CN106713051A (en) Network management system
Du et al. OctopusKing: A TCT-aware task scheduling on spark platform
Xiao et al. YISHAN: Managing large-scale cloud database instances via machine learning
Zhu et al. CPU and network traffic anomaly detection method for cloud data center
Cai et al. Big data mining analysis method based on cloud computing
Kim et al. Apache storm configuration platform for dynamic sampling and filtering of data streams
Daud et al. Self-Configured Framework for scalable link prediction in twitter: Towards autonomous spark framework

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant