CN107479977B - Method and equipment for determining equipment performance - Google Patents


Info

Publication number
CN107479977B
Authority
CN
China
Prior art keywords
cluster
machines
machine
tasks
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710763494.2A
Other languages
Chinese (zh)
Other versions
CN107479977A (en
Inventor
黎志勇
翁忠会
张春创
唐锦坤
傅锋
曹鹏飞
张攀
陈锐均
梁莎民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Comba Network Systems Co Ltd
Original Assignee
Comba Telecom Systems China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Comba Telecom Systems China Ltd
Priority to CN201710763494.2A
Publication of CN107479977A
Application granted
Publication of CN107479977B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method and equipment for determining equipment performance, which address the problem that the prior-art way of processing network equipment performance data struggles to meet the ever-increasing demand for computing capacity. After a machine in the cluster determines that it is the host, it distributes a plurality of tasks for determining equipment performance to the machines in the cluster; each machine in the cluster performs analysis and calculation according to the tasks it receives to obtain the corresponding performance index values. Because the task originally handled by one machine is distributed among a plurality of machines, the workload each machine is responsible for is greatly reduced compared with the original single machine, the computing capacity required of any single machine is reduced, and the processing of massive performance data can be completed.

Description

Method and equipment for determining equipment performance
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for determining device performance.
Background
Network maintenance personnel use the data in the network management report system to judge whether network equipment is abnormal, so as to ensure the normal operation of the system.
The network management report system collects the performance data of a massive number of network devices, performs multi-dimensional and multi-granularity KPI (Key Performance Indicator) analysis and calculation on the data according to time, space, equipment type and the like to obtain statistical results, and displays those results visually and in real time.
Currently, a network management report system collects and processes data on a single machine. With the popularization of the internet and the development of wireless communication technology, the number of network devices keeps increasing; KPI calculation must be considered across multiple dimensions and granularities, each target device category contains hundreds of KPI formulas, and the computational complexity grows with the number of network devices.
A single machine has inherent limitations, such as a limited number of connections and limited disk IO processing capability, so the current way of processing network device performance data struggles to meet the ever-increasing demand for computing capacity.
Disclosure of Invention
The invention provides a method and equipment for determining equipment performance, which are used for solving the problem that the mode of processing the performance data of network equipment in the prior art is difficult to meet the requirement of higher and higher computing capacity.
The embodiment of the invention provides a method for determining the performance of equipment, which comprises the following steps:
after a machine in the cluster determines that it is the host, the machine divides the information required for determining equipment performance into a plurality of dimensions, wherein each dimension corresponds to one type of information, and the granularities of each dimension are divided according to the content of the corresponding information;
the machine combines the granularities of the dimensions to obtain a plurality of tasks, wherein each task comprises one granularity from each dimension;
the machine distributes the obtained plurality of tasks to the machines in the cluster, so that those machines determine the equipment performance index according to the tasks.
An embodiment of the present invention provides an apparatus for determining equipment performance, where the apparatus includes:
at least one processing unit, and at least one memory unit, wherein the memory unit stores program code that, when executed by the processing unit, causes the processing unit to perform the following:
after determining that the apparatus is the host, dividing the information required for determining equipment performance into a plurality of dimensions, wherein each dimension corresponds to one type of information, and the granularities of each dimension are divided according to the content of the corresponding information; combining the granularities of the dimensions to obtain a plurality of tasks, wherein each task comprises one granularity from each dimension; and distributing the obtained tasks to the machines in the cluster, so that the machines determine the equipment performance index according to the tasks.
An embodiment of the present invention provides a machine for determining equipment performance, where the machine includes:
a dividing module, used for dividing the information required for determining equipment performance into a plurality of dimensions after determining that the machine is the host, wherein each dimension corresponds to one type of information, and the granularities of each dimension are divided according to the content of the corresponding information;
a combination module, used for combining the granularities of the dimensions to obtain a plurality of tasks, wherein each task comprises one granularity from each dimension;
and a distribution module, used for distributing the obtained tasks to the machines in the cluster so that the machines determine the equipment performance index according to the tasks.
After a machine in the cluster determines that it is the host, it distributes a plurality of tasks for determining equipment performance to the machines in the cluster; each machine in the cluster performs analysis and calculation according to the tasks it receives to obtain the corresponding performance index values. Because the task originally handled by one machine is distributed among a plurality of machines, the workload each machine is responsible for is greatly reduced compared with the original single machine, the computing capacity required of any single machine is reduced, and the processing of massive performance data can be completed.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a system for determining device performance according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a method for forming a plurality of tasks according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a cluster host monitoring the machines in the cluster according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the redistribution of computing tasks when a machine goes down according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a first apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a second apparatus according to an embodiment of the present invention;
FIG. 7 is a logical representation of a machine operating environment for determining device performance according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating a method for determining device performance according to an embodiment of the present invention;
FIG. 9 is a flowchart illustrating the calculation performed by a machine in a computing cluster according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of the computing flow of the computing cluster system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a system for determining device performance according to an embodiment of the present invention includes: a plurality of machines 10.
The machine 10 is configured to, after determining that the machine is a host, divide information required for determining device performance into a plurality of dimensions, where each dimension corresponds to a type of information, and a granularity of each dimension is divided according to content of the corresponding information; combining the granularity of each dimension to obtain a plurality of tasks, wherein one task comprises one granularity of each dimension; the resulting plurality of tasks are distributed to machines 10 in the cluster such that machines 10 determine a device performance indicator based on the tasks.
After a machine 10 in the cluster determines that it is the host, it distributes a plurality of tasks for determining equipment performance to the machines in the cluster; each machine in the cluster performs analysis and calculation according to the tasks it receives to obtain the corresponding performance index values. Because the task originally handled by one machine is distributed among a plurality of machines, the workload each machine is responsible for is greatly reduced compared with the original single machine, the computing capacity required of any single machine is reduced, and the processing of massive performance data can be completed.
The performance index reflects the operation capability and the working state of the target device. Different working states or operation capabilities correspond to different performance indexes.
For example:
the main performance indexes of the CPU include: main frequency, frequency multiplier, external frequency, memory bus speed, expansion bus speed, working voltage, address bus width, data bus width, coprocessor, superscalar design, cache with a write-back (Write Back) structure, dynamic processing and the like;
the performance indexes of the hard disk mainly include: the rotation speed of the hard disk, the data transmission rate of the hard disk, the hard disk cache, the average seek time, the average access time and the like;
the performance indexes of the memory module mainly include: speed, capacity, parity, memory voltage, etc.;
the performance indexes of the graphics card mainly include: the core model, the core frequency, the number of vertex shaders, the number of pixel shaders, the number of rendering pipelines, the video memory model, the video memory frequency, the video memory bit width, the video memory capacity, the interface type and the like;
the network performance indexes mainly include: bandwidth, delay, bandwidth-delay product, etc.
The dimension mentioned above is the basis for classifying performance data; each dimension reflects a certain characteristic of the device performance data. Dimensions include, but are not limited to, some or all of the following:
device type, the KPI formula used to determine device performance, time, and space.
Different devices have different device types. For example, device types include, but are not limited to, some or all of the following:
TD base station, GSM base station, LTE base station, TD gateway, GSM gateway, LTE gateway, security gateway, signaling gateway, local controller, macro network base station, and distributed equipment.
Each device type can be divided into a plurality of service scenarios.
For example:
for the GSM base station, the service indexes are divided into categories such as GSM traffic, GSM access, GSM hold, GSM resource and GSM distribution.
Each service scenario of each device type corresponds to an independent KPI formula and relatively independent performance metadata, so each device type can be divided into relatively independent computing tasks.
In addition, the time dimension is defined in terms of the interval at which a device uploads performance data. For example, if a device has a traffic volume index and reports it every 5 minutes, then the hour granularity refers to the sum of all traffic volume values the device reports within one hour. Consider the task TD base station-traffic KPI1 formula-hour-network element: this task calculates, with the traffic KPI1 formula for the corresponding TD base station service scenario, the sum of the performance data uploaded within one hour by the devices of a given network element.
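By way of illustration only, the hourly roll-up of 5-minute reports can be sketched in Python as follows. The report format (a pair of minute offset and value), the function name `hourly_totals` and the sample data are assumptions made for this sketch and do not come from the patent; the only behaviour taken from the text is that the hourly value is the sum of the reports within that hour.

```python
from collections import defaultdict

def hourly_totals(reports):
    """Sum 5-minute traffic reports into per-hour totals.

    reports: iterable of (minute_offset, value) pairs, where minute_offset is
    the number of minutes since the start of the observation window
    (a hypothetical format chosen for this sketch).
    """
    totals = defaultdict(int)
    for minute_offset, value in reports:
        hour = minute_offset // 60  # hour bucket the report falls into
        totals[hour] += value
    return dict(totals)

# Twelve 5-minute reports per hour over two hours; every report carries value 2.
sample_reports = [(minute, 2) for minute in range(0, 120, 5)]
print(hourly_totals(sample_reports))  # {0: 24, 1: 24}
```

Coarser granularities such as the day could be derived the same way by changing the bucket size, provided the KPI in question is additive.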
In addition, the spatial dimension is a physical dimension whose granularities (network element, site and area) depend on one another.
A network element consists of one or more boards or frames and can independently perform certain transmission functions; put simply, it is an element or device in the network. A network element is the smallest unit that can be monitored and managed in network management. A site comprises a plurality of network elements, and an area comprises a plurality of sites. The performance index results for sites and areas can be obtained by aggregating the calculation results of the network elements.
Granularity is the specific content within a dimension, and different dimensions have different specific content.
For example, if the device types are divided into TD base station, GSM base station and home base station, then the TD base station is one granularity;
if time is divided into minutes, hours, days, weeks, months, quarters and years, then the hour is one granularity.
The following describes, with a detailed example, a scheme for determining a plurality of tasks according to an embodiment of the present invention.
First, each machine in the distributed computing cluster participates in collecting device performance data and in the subsequent parsing of that data. Cluster machines can collect data in two ways:
each machine collects the performance data of the target devices broadly and, on receiving its tasks, retrieves the data those tasks require from the uploaded performance data in order to perform the calculation;
or a machine in the cluster first receives the computing tasks distributed by the cluster host and then collects, in a targeted way, only the uploaded data that its tasks require.
The embodiment of the invention adopts the distributed computing cluster, and a plurality of machines in the cluster simultaneously collect and analyze the performance data of the equipment, so that the capability of the computing cluster for collecting and analyzing the basic performance data is greatly improved.
Assume the dimensions include device type, the KPI formula used to determine device performance, time, and space, where:
the device types include TD base station, GSM base station and LTE base station;
the KPI formulas used to determine device performance include KPI1, KPI2 and KPI3;
the time includes minutes, hours and days;
the space includes network elements, sites and areas.
As shown in fig. 2, a plurality of tasks are formed in the embodiment of the present invention as follows:
the machine takes one granularity from the device type dimension, such as TD base station; then takes one granularity from the KPI formula dimension, such as KPI1; then takes one granularity from the time dimension, such as hour. Because the granularities of the spatial dimension depend on one another, and the performance indexes of sites and areas can be calculated from those of the network elements, no granularity is selected separately in the spatial dimension. These are combined into one task: TD base station-KPI1-hour-network element-site-area. In this way, the machine decomposes a complex calculation flow into many relatively simple computing tasks.
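The combination step can be sketched as a Cartesian product over the independent dimensions. This is a minimal Python illustration rather than the patented implementation: the names `build_tasks` and `SPATIAL_CHAIN` and the string format of a task are assumptions, while the granularity values are those of the example.

```python
from itertools import product

DEVICE_TYPES = ["TD base station", "GSM base station", "LTE base station"]
KPI_FORMULAS = ["KPI1", "KPI2", "KPI3"]
TIME_GRANULARITIES = ["minute", "hour", "day"]
# The spatial granularities depend on one another, so they are appended as a
# fixed chain instead of being chosen independently.
SPATIAL_CHAIN = "network element-site-area"

def build_tasks(device_types, kpi_formulas, time_granularities):
    """Return one task label per combination of one granularity per dimension."""
    return [
        f"{device}-{kpi}-{time}-{SPATIAL_CHAIN}"
        for device, kpi, time in product(device_types, kpi_formulas, time_granularities)
    ]

tasks = build_tasks(DEVICE_TYPES, KPI_FORMULAS, TIME_GRANULARITIES)
print(len(tasks))  # 27 tasks: 3 device types x 3 formulas x 3 time granularities
print(tasks[1])    # TD base station-KPI1-hour-network element-site-area
```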
The following illustrates the total number of tasks obtained after the task division:
assuming the tasks are divided along the four dimensions of device type, KPI formula used to determine device performance, time and space, where there are 10 device types, 1000 KPI formulas and 5 time granularities, the number of independently calculated tasks is 10 × 1000 × 5 = 50,000 (the spatial dimension adds no factor, because no granularity is selected separately in it).
In the above, the embodiment of the present invention decomposes the calculation flow according to the KPI formulas corresponding to the service scenarios of each device type to obtain a plurality of tasks. Because the complex calculation flow is decomposed into a plurality of relatively independent computing tasks, each task is less complex than the original undivided task processed by one machine, so a single machine is sufficient to meet the computing capacity that a task requires.
In implementation, one machine in the cluster may be designated as the host, or a host may be elected from the machines in the cluster according to assigned priorities.
If priorities are used, there are many specific ways to select the host. For example, a machine in the cluster determines that it is the host as follows:
if the machine cannot communicate with any other machine in the cluster whose priority is higher than its own, the machine determines that it is the host.
There are many ways to set the priority. For example, it can be set manually according to machine performance. Alternatively, a number may be assigned to each machine and the priority determined from that number; for example, the machine with the larger number may be given priority as cluster host, or the machine with the higher weight within the cluster may be given priority as cluster host, and so on.
A specific example is given below, assuming that a smaller number means a higher priority:
the machines in the cluster are numbered, each with a unique number: A1, A2, A3, A4, A5 and so on. The program is configured so that the machine with the smallest number takes priority as cluster host and undertakes task distribution.
For example, in an election among A1, A2 and A3, each machine calls, through a program interface, the machines ranked ahead of it: A3 calls A2 and A1, and A2 calls A1.
If A1 confirms that it has not failed, it automatically serves as the host; A1 also replies when it receives the calls from A2 and A3;
if A2's call to the program interface of A1 fails and A2 confirms that it is itself working normally, A2 automatically serves as the host; if the call succeeds, A2 does not serve as the host. A2 also replies when it receives the call from A3;
if A3's calls to the program interfaces of both A1 and A2 fail and A3 confirms that it is itself working normally, A3 automatically serves as the host; if either call succeeds, A3 does not serve as the host;
at intervals, the machines in the cluster repeat this operation to determine whether the current host has failed; if it has, a new cluster host is elected promptly to replace the original one.
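The election rule can be sketched as follows. This is a simplified Python illustration under stated assumptions: the interface calls between machines are replaced by a set `reachable` of machines that answer calls (so network partitions, where different machines see different peers, are not modelled), and the names `is_host` and `elect_host` are hypothetical.

```python
def is_host(machine, priority_order, reachable):
    """Return True if `machine` should act as cluster host.

    priority_order: machine names, highest priority first (e.g. A1 before A2).
    reachable: set of machines that currently answer interface calls; this
    stands in for the real calls, which are not modelled in this sketch.
    """
    if machine not in reachable:
        return False  # a failed machine cannot serve as host
    higher_priority = priority_order[: priority_order.index(machine)]
    # Host only if no machine ranked ahead of this one answers the call.
    return not any(peer in reachable for peer in higher_priority)

def elect_host(priority_order, reachable):
    """Return the machine that determines itself to be host, or None."""
    for machine in priority_order:
        if is_host(machine, priority_order, reachable):
            return machine
    return None

order = ["A1", "A2", "A3"]
print(elect_host(order, {"A1", "A2", "A3"}))  # A1
print(elect_host(order, {"A2", "A3"}))        # A2, because A1 does not answer
print(elect_host(order, {"A3"}))              # A3
```

Running `elect_host` again at each interval with an updated `reachable` set corresponds to the periodic re-check described above.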
In the embodiment of the invention, the fault host can be replaced in time, so that the problem that the whole task cannot be completed because the cluster host has a fault can be effectively avoided.
In implementation, the obtained tasks may be distributed to the machines in the cluster at random, or a rule may be applied to make better use of machine resources. For example, the cluster host distributes the obtained plurality of tasks to the machines in the cluster as follows:
the machine in the cluster determines the number of tasks to distribute to each machine according to the weights of the machines and/or the number of machines in the cluster;
the machine in the cluster distributes the obtained tasks according to the number of tasks determined for each machine.
The cluster host may or may not take part in the task calculation. When it does, it first reserves the computing tasks it is itself responsible for and distributes the remaining tasks; when it does not, it distributes all tasks to the other machines in the cluster according to the above rule.
Specific examples are given below:
1. Assume three machines in the cluster, numbered A, B and C, with A as the cluster host, and assume that 6 tasks are finally obtained. Considering only the number of machines:
when the cluster host participates in the calculation, it divides the tasks obtained by decomposing the calculation flow equally among the three machines, and each machine gets 2 tasks.
When the cluster host does not participate in the calculation, it distributes the tasks to cluster machines B and C, and each of the two machines gets 3 tasks.
2. Assume three machines in the cluster, numbered A, B and C, with A as the cluster host and weights of 1, 3 and 2 respectively, and assume that 7 tasks are finally obtained. Considering both the number of machines and their weights:
when the cluster host participates in the calculation, it traverses the cluster machines cyclically. In the first round over A, B and C, the three machines obtain 1, 3 and 2 tasks respectively. The host then traverses the three machines again to hand out the remaining 1 task. Because the order in which the host traverses the machines is somewhat random, if the second-round order is A, B, C, machine A obtains the task and B and C obtain none; if the second-round order is C, A, B, machine C obtains the task and A and B obtain none.
When the cluster host does not participate in the calculation, it traverses the other cluster machines cyclically. In the first round over B and C, the two machines obtain 3 and 2 tasks respectively. The host then traverses the two machines again to hand out the remaining 2 tasks. If the second-round order is B, C, machine B obtains both tasks and C obtains none; if the second-round order is C, B, machine C obtains both tasks and B obtains none.
By the above traversal method, the tasks can be distributed to machines with different weights.
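The traversal in example 2 can be sketched as follows. This is one possible reading of the text, not the patented implementation: on each visit a machine receives at most as many tasks as its weight, and the traversal order of each round is passed in explicitly (the text says it is somewhat random) so that the result is reproducible. The function name `distribute_by_traversal` and the `round_orders` parameter are assumptions.

```python
def distribute_by_traversal(num_tasks, weights, round_orders):
    """Hand out tasks by repeatedly traversing the machines.

    weights: dict mapping machine name to a positive integer weight.
    round_orders: list of traversal orders, one per round; the last order is
    reused if more rounds are needed (an assumption of this sketch).
    Returns a dict mapping machine name to the number of tasks it receives.
    """
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    if not round_orders or any(not order for order in round_orders):
        raise ValueError("every round needs a non-empty traversal order")
    counts = {machine: 0 for machine in weights}
    remaining = num_tasks
    round_index = 0
    while remaining > 0:
        order = round_orders[min(round_index, len(round_orders) - 1)]
        for machine in order:
            give = min(weights[machine], remaining)  # at most `weight` tasks per visit
            counts[machine] += give
            remaining -= give
            if remaining == 0:
                break
        round_index += 1
    return counts

weights = {"A": 1, "B": 3, "C": 2}
# Host A participates; second round in order A, B, C: A takes the last task.
print(distribute_by_traversal(7, weights, [["A", "B", "C"], ["A", "B", "C"]]))
# Host A participates; second round in order C, A, B: C takes the last task.
print(distribute_by_traversal(7, weights, [["A", "B", "C"], ["C", "A", "B"]]))
```

When the host does not participate, the same function can be called with only the other machines in `weights` and in the traversal orders.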
3. Considering only the weights: assume a plurality of machines in the cluster, numbered A, B, C and so on, whose weights sum to 100, with A as the cluster host, and assume that 200 tasks are finally obtained:
when the cluster host participates in the calculation, the obtained tasks are divided into 100 parts; if the weight of cluster host A is 20, the host obtains 20 parts of the total, namely 40 tasks, and the other machines are handled in the same way. When the cluster host does not participate in the calculation, the obtained tasks are likewise divided into 100 parts; if the weight of cluster machine B is 10, machine B obtains 10 parts of the total, namely 20 tasks, and the other machines are handled in the same way.
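The weight-only rule amounts to giving each machine a share of tasks proportional to its weight. A minimal Python sketch follows; the function name `proportional_shares` and the four-machine weight table are assumptions, and the handling of leftover tasks when the division is not exact (one extra task each to the highest-weight machines) is a choice made for this sketch, since the text does not address that case.

```python
def proportional_shares(num_tasks, weights):
    """Split num_tasks among machines in proportion to their weights.

    weights: dict mapping machine name to a non-negative integer weight.
    Leftover tasks from integer division go to the highest-weight machines,
    one each (an assumption; the source does not cover this case).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    shares = {m: num_tasks * w // total_weight for m, w in weights.items()}
    leftover = num_tasks - sum(shares.values())
    for machine in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[machine] += 1
    return shares

# Hypothetical weights summing to 100, with host A at 20 and machine B at 10.
example_weights = {"A": 20, "B": 10, "C": 30, "D": 40}
print(proportional_shares(200, example_weights))
# {'A': 40, 'B': 20, 'C': 60, 'D': 80}
```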
The task counts in these examples are kept small for illustration; under normal conditions the number of tasks obtained by a cluster is larger than in the examples. In the embodiment of the invention, the number and the weights of the machines are taken into account during task allocation, so machine resources can be used more effectively and the tasks can be completed more quickly.
Optionally, in the embodiment of the present invention, a machine fault is handled as follows:
after the machine in the cluster distributes the obtained plurality of tasks to the machines in the cluster, the method further includes:
after determining that another machine in the cluster has failed, the machine in the cluster redistributes the tasks that were distributed to the failed machine.
A specific example of handling a machine failure is given below. Assume there are five machines in the system with the unique numbers A1, A2, A3, A4 and A5, and the program is configured so that the machine with the smallest number takes priority as cluster host and undertakes task distribution. The heartbeat upload interval is set to one minute.
Example: A1 works normally as the cluster host and receives the heartbeats uploaded by A2, A3, A4 and A5. After time T1, A2 has not uploaded a heartbeat for more than one minute, so A1 judges that A2 is down; A1 then retrieves from the database the task data that had been distributed to A2 and redistributes those tasks to A3, A4 and A5 for processing.
In the embodiment of the present invention, a cluster host monitors machines in a cluster, as shown in fig. 3.
In fig. 3, each machine in the cluster reports a heartbeat to the cluster host (i.e., the machine in the cluster that determines itself to be the host) periodically;
the cluster host judges whether the machine is in fault currently according to whether the heartbeat of the machine can be received.
If the heartbeat of one or more machines is not received within the set time length, the cluster host determines that the corresponding machines are down (i.e. have failed) and needs to redistribute their tasks, as shown in fig. 4.
In fig. 4, the cluster host does not receive the heartbeat of the second machine within the set time length, determines that the second machine is down, and distributes the task allocated to the second machine to other machines in the cluster. The distribution may be performed to only one machine or to a plurality of machines.
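Heartbeat monitoring and task redistribution can be sketched as follows. This is an illustrative Python sketch under stated assumptions: heartbeats are represented by the timestamp (in seconds) of the last one received, the timeout is the one-minute interval of the earlier example, and orphaned tasks are spread round-robin over the surviving machines. The text allows redistribution to one machine or to several, so round-robin is only one possible policy; all names are hypothetical.

```python
HEARTBEAT_TIMEOUT_SECONDS = 60  # one-minute interval, as in the example

def find_down_machines(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT_SECONDS):
    """Return machines whose last heartbeat is older than `timeout` seconds."""
    return sorted(m for m, seen in last_heartbeat.items() if now - seen > timeout)

def redistribute(assignments, down_machines):
    """Move the tasks of down machines to the surviving machines.

    assignments: dict mapping machine name to its list of tasks.
    Round-robin over survivors is an assumption of this sketch.
    """
    survivors = sorted(m for m in assignments if m not in down_machines)
    if not survivors:
        raise RuntimeError("no healthy machine left to take over the tasks")
    new_assignments = {m: list(assignments[m]) for m in survivors}
    orphaned = [task for m in down_machines for task in assignments.get(m, [])]
    for index, task in enumerate(orphaned):
        new_assignments[survivors[index % len(survivors)]].append(task)
    return new_assignments

# Host A1 last heard from A2 at t=100 s; the current time is t=170 s.
last_heartbeat = {"A2": 100, "A3": 150, "A4": 155, "A5": 158}
assignments = {"A2": ["t1", "t2", "t3"], "A3": ["t4"], "A4": ["t5"], "A5": ["t6"]}
down = find_down_machines(last_heartbeat, now=170)
print(down)                             # ['A2']
print(redistribute(assignments, down))  # A2's tasks move to A3, A4 and A5
```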
In the embodiment of the invention, the tasks of a failed machine are redistributed in this way, in a timely manner, to other machines that have not failed. This effectively prevents the computing tasks in the distributed computing cluster from going unfinished because a machine in the system is down, and ensures the fault tolerance of KPI calculation. In addition, because the logic units of the machines are relatively independent, machine resources in the cluster can be allocated according to the scale of the service, which improves computing capacity.
The aggregation operation refers to the calculation across network element, site and area. These are granularities of the spatial dimension and depend on one another, so an aggregation operation is required at this stage. On this basis, optionally, after the machine in the cluster distributes the obtained plurality of tasks to the machines in the cluster, the method further includes:
the machine in the cluster aggregates, according to an aggregation rule, the equipment performance indexes it has determined itself together with the equipment performance indexes received from other machines in the cluster.
For example, for the independent computing task TD base station-KPI1-hour-network element-site-area, the aggregation operation is: calculate the hourly report of KPI formula 1 for every TD base station network element, then calculate the KPI value of each site from the hourly results of its network elements, and then calculate the index of the area from the values of its sites.
Assume the machines in the cluster have the unique numbers A1, A2, A3, A4 and A5, and the program is configured so that the machine with the smallest number takes priority as cluster host and undertakes task distribution. A1 works normally and serves as the cluster host; A2, A3, A4 and A5 receive the tasks distributed by A1 and calculate the results. Index A is a cumulative-sum formula. Table 1 shows the area aggregation for the computing task TD base station-KPI1-hour-network element-site-area:
site 1 comprises 3 network elements, site 2 comprises 7 network elements, and area 1 comprises site 1 and site 2. The value of index A obtained for each network element is 1; after aggregation, the value of index A is 3 for site 1, 7 for site 2, and 10 for area 1.
Figure BDA0001393638660000111
TABLE 1
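The worked example can be reproduced with a short sketch. The identifiers (`ne1` to `ne10`, `site1`, `site2`, `area1`) and the helper `aggregate` are hypothetical names chosen for illustration; only the counts (3 and 7 network elements, each with index A equal to 1) come from the text.

```python
from collections import defaultdict

# Hypothetical topology matching the example: 3 network elements in site 1,
# 7 in site 2, and both sites in area 1.
site_of = {f"ne{i}": "site1" for i in range(1, 4)}
site_of.update({f"ne{i}": "site2" for i in range(4, 11)})
area_of = {"site1": "area1", "site2": "area1"}


def aggregate(values, parent_of):
    """Sum child values into their parent (index A is a cumulative sum)."""
    totals = defaultdict(int)
    for child, value in values.items():
        totals[parent_of[child]] += value
    return dict(totals)


ne_values = {ne: 1 for ne in site_of}          # index A is 1 for every network element
site_values = aggregate(ne_values, site_of)    # network element -> site
area_values = aggregate(site_values, area_of)  # site -> area
print(site_values)  # {'site1': 3, 'site2': 7}
print(area_values)  # {'area1': 10}
```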
Each machine in the cluster calculates a value of 1 for index A of each network element, and the machines in the cluster then aggregate at the site and area levels. The aggregation can be done in two ways: either the cluster host completes all of it on its own, or the other machines in the cluster first complete the aggregation within the tasks they are responsible for and upload the results to the cluster host, which then completes the remaining aggregation. The two calculation flows are illustrated by the following examples:
assume that each machine in the cluster has a unique number, A1, A2, A3, A4, and A5, and that the machine with the smaller number has priority to become the cluster host and perform task allocation. Assume that A1 works normally and acts as the cluster host; A2, A3, A4, and A5 receive the tasks allocated by A1 and calculate the results.
Example 1: A2, A3, A4, and A5 upload the calculated network element values of index A to A1, and A1 then completes the aggregation operation.
Example 2: A2, A3, A4, and A5 calculate the value of index A for each network element and then complete the site and area aggregation for their own tasks, obtaining network element, site, and area values of index A. They upload these values to the cluster host, which completes the remaining aggregation.
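The two examples can be compared in a small sketch. The data and the helpers `sum_by` and `merge` are assumptions made for illustration. For a cumulative-sum index both routes give the same result, because addition is associative; an index such as an average would need the partial results to carry counts as well.

```python
def sum_by(values, parent_of):
    """Aggregate child values into parent totals."""
    out = {}
    for child, v in values.items():
        parent = parent_of[child]
        out[parent] = out.get(parent, 0) + v
    return out


def merge(partials):
    """Add together several dictionaries of partial totals."""
    merged = {}
    for part in partials:
        for key, v in part.items():
            merged[key] = merged.get(key, 0) + v
    return merged


site_of = {"ne1": "site1", "ne2": "site1", "ne3": "site1",
           "ne4": "site2", "ne5": "site2"}
# Hypothetical network element results computed by each non-host machine.
worker_results = {
    "A2": {"ne1": 1, "ne4": 1},
    "A3": {"ne2": 1, "ne5": 1},
    "A4": {"ne3": 1},
}

# Example 1: the host receives raw network element values and aggregates alone.
centralized = sum_by(merge(worker_results.values()), site_of)

# Example 2: each machine pre-aggregates its own share, the host merges the rest.
partials = [sum_by(r, site_of) for r in worker_results.values()]
two_stage = merge(partials)

print(centralized)  # {'site1': 3, 'site2': 2}
print(two_stage)    # {'site1': 3, 'site2': 2}
```

The second route sends fewer values to the host when each machine holds many network elements of the same site.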
After the data is aggregated in this embodiment of the invention, the performance indexes of the equipment in an area can be displayed more intuitively and in real time.
As shown in fig. 5, a first apparatus according to an embodiment of the present invention includes:
at least one processing unit 500, and at least one memory unit 501, wherein the memory unit stores program code that, when executed by the processing unit, causes the processing unit to perform the following:
after determining that it is the host, dividing the information required for determining equipment performance into a plurality of dimensions, wherein each dimension corresponds to one type of information, and the granularity of each dimension is divided according to the content of the corresponding information; combining the granularities of the dimensions to obtain a plurality of tasks, wherein each task comprises one granularity of each dimension; and allocating the obtained tasks to machines in the cluster, so that the machines determine equipment performance indexes according to the tasks.
Optionally, the processing unit 500 is configured to determine itself as a host according to the following manner:
if it determines that it cannot communicate with any other machine in the cluster whose priority is higher than its own, determining itself to be the host.
Optionally, the processing unit 500 is configured to determine, according to the machine weight and/or the number of machines in the cluster, the number of tasks allocated to each machine in the cluster; and distributing the obtained multiple tasks according to the determined number of the tasks distributed to each machine in the cluster.
Optionally, the processing unit 500 is further configured to:
after the obtained tasks are allocated to the machines in the cluster, and upon determining that another machine in the cluster has failed, reallocate the tasks that were allocated to the failed machine.
Optionally, the processing unit 500 is further configured to:
after the obtained tasks are allocated to the machines in the cluster, aggregate, according to an aggregation rule, the equipment performance indexes it has determined itself and the equipment performance indexes received from other machines in the cluster.
As shown in fig. 6, a second apparatus according to an embodiment of the present invention includes:
a dividing module 600, configured to divide the information required for determining equipment performance into multiple dimensions after determining that the machine itself is the host, where each dimension corresponds to one type of information, and the granularity of each dimension is divided according to the content of the corresponding information;
a combination module 601, configured to combine the granularity of each dimension to obtain multiple tasks, where one task includes one granularity of each dimension;
an allocating module 602, configured to allocate the obtained multiple tasks to machines in the cluster, so that the machines determine the device performance index according to the tasks.
Optionally, the dimensions include some or all of the following:
device type, KPI formula used to determine device performance, time, zone.
Optionally, the dividing module 600 is specifically configured to determine itself as a host according to the following manner:
if it determines that it cannot communicate with any other machine in the cluster whose priority is higher than its own, determining itself to be the host.
Optionally, the allocating module 602 is specifically configured to:
determining the number of tasks allocated to each machine in the cluster according to the machine weight and/or the number of machines in the cluster;
and distributing the obtained multiple tasks according to the determined number of the tasks distributed to each machine in the cluster.
Optionally, the allocating module 602 is further configured to:
after the obtained tasks are allocated to the machines in the cluster, and upon determining that another machine in the cluster has failed, reallocate the tasks that were allocated to the failed machine.
Optionally, the allocating module 602 is further configured to:
after the obtained tasks are allocated to the machines in the cluster, aggregate, according to an aggregation rule, the equipment performance indexes determined by the machine itself and the equipment performance indexes received from other machines in the cluster.
As shown in fig. 7, the logic of the working environment of the cluster machine in the embodiment of the present invention is as follows:
the target equipment uploads its performance data, and the machines in the computing cluster receive and parse the uploaded data. The cluster host then decomposes the calculation flow of the cluster computing task, dividing it on the basis of the KPI (key performance indicator) formulas used to determine equipment performance, to form a plurality of tasks. The cluster host then allocates the obtained tasks to the cluster machines according to a set rule. After the cluster machines complete the calculation tasks they have received, the machines in the cluster perform a simple aggregation of the calculation results to obtain the required performance indexes, which are finally uploaded to a database.
Based on the same inventive concept, an embodiment of the present invention further provides a method for determining equipment performance. The equipment corresponding to the method is the plurality of task-processing machines of the embodiments of the present invention, and the method solves the problem on a principle similar to that of the equipment, so the implementation of the method may refer to the implementation of the equipment; repeated details are omitted.
As shown in fig. 8, an embodiment of the present invention provides a method for determining device performance, where the method includes:
step 800: after a machine in the cluster determines that it is the host, it divides the information required for determining equipment performance into a plurality of dimensions, wherein each dimension corresponds to one type of information, and the granularity of each dimension is divided according to the content of the corresponding information;
step 801: the machine in the cluster combines the granularities of the dimensions to obtain a plurality of tasks, wherein each task comprises one granularity of each dimension;
step 802: the machine in the cluster allocates the obtained tasks to the machines in the cluster, so that the machines determine equipment performance indexes according to the tasks.
Optionally, the dimensions include some or all of the following:
device type, key performance indicator (KPI) formula used to determine equipment performance, time, and area.
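Combining one granularity from each dimension amounts to taking a Cartesian product of the dimensions. The sketch below shows this under stated assumptions: the dictionary keys and most granularity values (`LTE base station`, `KPI2`, `day`) are hypothetical, and only the combination "TD base station, KPI 1, hour, network element-site-area" echoes the example in the description.

```python
from itertools import product

# Each dimension maps to its list of granularities (mostly hypothetical values).
dimensions = {
    "device_type": ["TD base station", "LTE base station"],
    "kpi_formula": ["KPI1", "KPI2"],
    "time": ["hour", "day"],
    "area": ["network element-site-area"],
}


def build_tasks(dims):
    """Return one task per combination, each holding one granularity per dimension."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in product(*(dims[n] for n in names))]


tasks = build_tasks(dimensions)
print(len(tasks))  # 8
print(tasks[0])
# {'device_type': 'TD base station', 'kpi_formula': 'KPI1', 'time': 'hour',
#  'area': 'network element-site-area'}
```

Because each task is a distinct combination, the tasks can be computed independently of one another, which is what allows them to be spread across machines.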
Optionally, the machines in the cluster determine themselves as hosts according to the following manner:
if the machine in the cluster cannot communicate with any other machine in the cluster whose priority is higher than its own, it determines itself to be the host.
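The host-determination rule can be sketched as a predicate that each machine evaluates locally. The names `is_host`, `priority`, and `reachable` are hypothetical, and the reachability check stands in for whatever communication test the machines actually use; following the A1 to A5 example in the description, a smaller number means a higher priority.

```python
def is_host(self_id, priority, reachable):
    """Return True if no higher-priority machine can be reached.

    priority: dict machine id -> rank, where a smaller rank means higher priority.
    reachable: callable(machine_id) -> bool, a hypothetical communication probe.
    """
    higher = [m for m in priority if priority[m] < priority[self_id]]
    return not any(reachable(m) for m in higher)


priority = {"A1": 1, "A2": 2, "A3": 3, "A4": 4, "A5": 5}
alive = {"A2", "A3", "A4", "A5"}  # assume A1 is down

hosts = [m for m in sorted(alive) if is_host(m, priority, lambda x: x in alive)]
print(hosts)  # ['A2']
```

When every machine sees the same set of reachable peers, exactly one machine satisfies the predicate; handling inconsistent views (for example, a network partition) is outside this sketch.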
Optionally, the allocating, by the machine in the cluster, of the obtained tasks to the machines in the cluster includes:
the machine in the cluster determines the number of tasks to allocate to each machine in the cluster according to the machine weights and/or the number of machines in the cluster;
the machine in the cluster allocates the obtained tasks according to the number of tasks determined for each machine in the cluster.
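One way to read weight-based allocation is as a weighted round robin carried out over several rounds, which is also how the claims describe it. The sketch below is an interpretation under stated assumptions, not the patent's exact procedure: the names `task_counts` and `distribute`, the integer weights, and the rule "each machine takes up to its weight per round, in allocation order" are all assumptions.

```python
def task_counts(num_tasks, weights, order):
    """Determine how many tasks each machine receives over multiple rounds.

    weights: dict machine id -> positive integer weight (assumed representation).
    order: list of machine ids giving the allocation order within a round.
    """
    if any(weights[m] < 1 for m in order):
        raise ValueError("weights must be positive integers")
    counts = {m: 0 for m in order}
    remaining = num_tasks
    while remaining > 0:
        # One round: each machine takes up to its weight, in allocation order.
        for m in order:
            take = min(weights[m], remaining)
            counts[m] += take
            remaining -= take
            if remaining == 0:
                break
    return counts


def distribute(tasks, counts, order):
    """Slice the task list according to the per-machine counts."""
    plan, start = {}, 0
    for m in order:
        plan[m] = tasks[start:start + counts[m]]
        start += counts[m]
    return plan


order = ["A1", "A2", "A3"]
weights = {"A1": 1, "A2": 2, "A3": 2}
tasks = [f"t{i}" for i in range(10)]
counts = task_counts(len(tasks), weights, order)
plan = distribute(tasks, counts, order)
print(counts)      # {'A1': 2, 'A2': 4, 'A3': 4}
print(plan["A1"])  # ['t0', 't1']
```

With equal weights this reduces to an even split by the number of machines, which matches the "and/or the number of machines" wording.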
Optionally, after the machine in the cluster allocates the obtained multiple tasks to the machine in the cluster, the method further includes:
upon determining that another machine in the cluster has failed, the machine in the cluster reallocates the tasks that were allocated to the failed machine.
Optionally, after the machine in the cluster allocates the obtained multiple tasks to the machine in the cluster, the method further includes:
the machine in the cluster aggregates, according to an aggregation rule, the equipment performance indexes it has determined itself and the equipment performance indexes received from other machines in the cluster.
As shown in fig. 9, the flow of processing tasks by the cluster machine in the embodiment of the present invention is as follows:
after the cluster receives a task start instruction, the cluster host divides the work into a plurality of tasks according to the KPI (key performance indicator) formulas used to determine equipment performance; the tasks do not affect one another. The cluster host then allocates the obtained tasks to the cluster machines, which complete the calculation. At the same time, the cluster host uploads the task records to the database and records the task state as incomplete.
After a cluster machine obtains the tasks allocated by the cluster host, it submits them to a thread pool and runs them in multi-threaded mode. It then performs the network element level calculation and places the results in a cache. Finally, it reads the network element level results from the cache and performs the aggregation operation.
After the network element level calculation and the aggregation operation are finished, the cluster host uploads the network element level and aggregated calculation results to the database. After the task is finished, its state in the database is updated to finished.
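The per-machine flow above (thread pool, cached network element results, aggregation, task state update) can be sketched with Python's standard `concurrent.futures` module. Everything specific here is an assumption for illustration: `ne_level_kpi` is a placeholder that returns 1 per network element, a plain dictionary stands in for both the cache and the database, and the host and worker roles are collapsed into one function for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def ne_level_kpi(ne_id):
    """Placeholder network element level calculation (a real KPI formula goes here)."""
    return ne_id, 1


def run_task(task, ne_ids, task_db, max_workers=4):
    """Run one task: compute per network element in threads, cache, then aggregate."""
    task_db[task] = "incomplete"          # task record created in the incomplete state
    cache = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order once each calculation is done.
        for ne_id, value in pool.map(ne_level_kpi, ne_ids):
            cache[ne_id] = value          # network element results go into the cache
    total = sum(cache.values())           # aggregation over the cached results
    task_db[task] = "finished"            # state updated once everything is done
    return cache, total


task_db = {}
cache, total = run_task("TD-KPI1-hour", ["ne1", "ne2", "ne3"], task_db)
print(total)    # 3
print(task_db)  # {'TD-KPI1-hour': 'finished'}
```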
In the embodiment of the present invention, a detailed flowchart of the method for determining the device performance may be shown in fig. 10.
Step 1000: the machines in the cluster collect and parse the performance data uploaded by the target equipment;
Step 1001: judge whether the current host is working normally; if so, go to step 1003; otherwise, go to step 1002;
Step 1002: the machines in the cluster reselect the cluster host according to the setting;
Step 1003: the cluster host divides the calculation flow into a plurality of tasks according to the KPI (key performance indicator) formulas required for determining equipment performance, and monitors the cluster machines to judge whether any of them has a fault;
Step 1004: the cluster host allocates the obtained tasks to the machines in the cluster in sequence according to a set rule;
Step 1005: the machines in the cluster receive the tasks allocated by the cluster host;
Step 1006: the cluster machines perform the calculation for the tasks;
Step 1007: the cluster machines upload the calculation results to the cluster host;
Step 1008: the cluster host performs a simple aggregation operation and stores the calculation results in storage.
The present application is described above with reference to block diagrams and/or flowchart illustrations of methods, apparatus (systems) and/or computer program products according to embodiments of the application. It will be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Accordingly, the subject application may also be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). Furthermore, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (9)

1. A method of determining device performance, comprising:
after a machine in the cluster determines that the machine is a host, dividing information required for determining the performance of equipment into a plurality of dimensions, wherein each dimension corresponds to a type of information, and the granularity of each dimension is divided according to the content of the corresponding information;
the machine in the cluster combines the granularity of each dimension to obtain a plurality of tasks, wherein one task comprises one granularity of each dimension;
the machine in the cluster distributes the obtained tasks to the machines in the cluster, so that the machines determine equipment performance indexes according to the tasks;
the dimensions include: the region, and the type of equipment, the Key Performance Indicator (KPI) formula used to determine the performance of the equipment, and part or all of the time; or the dimensions include: the type of equipment, the Key Performance Indicator (KPI) formula used for determining the performance of the equipment, and part or all of the time;
the cluster master distributes the obtained tasks to the machines in the cluster, and the method comprises the following steps:
the machine in the cluster determines the task number of each machine in the cluster through multi-round distribution; the machine in the cluster distributes the obtained multiple tasks according to the determined task quantity distributed to each machine in the cluster;
wherein, each round of distribution is carried out according to the following steps:
and determining the number of tasks which are distributed to each machine in the cluster and matched with the corresponding machine weight according to the distribution sequence and the number of machines in the cluster.
2. The method of claim 1, wherein the machines in the cluster determine themselves as hosts according to:
and if the machine in the cluster cannot be communicated with other machines in the cluster with the priority greater than the priority of the machine, determining the machine as the host.
3. The method of claim 1, wherein after the machines in the cluster assign the resulting plurality of tasks to the machines in the cluster, further comprising:
and after determining that other machines in the cluster have faults, the machines in the cluster redistribute the tasks distributed to the machines with the faults.
4. A method as claimed in any one of claims 1 to 3, wherein, after the machines in the cluster have allocated the plurality of tasks to the machines in the cluster, the method further comprises:
and the machines in the cluster aggregate the equipment performance indexes determined by the machines in the cluster and the received equipment performance indexes sent by other machines in the cluster according to an aggregation rule.
5. A machine for determining the performance of a device, the machine comprising:
at least one processing unit, and at least one memory unit, wherein the memory unit stores program code that, when executed by the processing unit, causes the processing unit to perform the following:
after the host is determined, dividing information required by the performance of the equipment into a plurality of dimensions, wherein each dimension corresponds to one type of information, and the granularity of each dimension is divided according to the content of the corresponding information; combining the granularity of each dimension to obtain a plurality of tasks, wherein one task comprises one granularity of each dimension; distributing the obtained tasks to machines in the cluster so that the machines determine equipment performance indexes according to the tasks;
the dimensions include: the region, and the type of equipment, the Key Performance Indicator (KPI) formula used to determine the performance of the equipment, and part or all of the time; or the dimensions include: the type of equipment, the Key Performance Indicator (KPI) formula used for determining the performance of the equipment, and part or all of the time;
the processing unit is specifically configured to:
determining the task number of each machine in the cluster through multi-round distribution; distributing the obtained multiple tasks according to the determined number of the tasks distributed to each machine in the cluster;
wherein, each round of distribution is carried out according to the following steps:
and determining the number of tasks which are distributed to each machine in the cluster and matched with the corresponding machine weight according to the distribution sequence and the number of machines in the cluster.
6. The machine according to claim 5, characterized in that said processing unit is particularly adapted to determine itself as a master according to:
and if the communication with other machines in the cluster with the priority greater than the self is determined to be unavailable, determining the self as the host.
7. The machine of claim 5, wherein the processing unit is further to:
after the obtained tasks are distributed to the machines in the cluster, after determining that other machines in the cluster have faults, the tasks distributed to the machines with the faults are distributed again.
8. The machine of claim 5, wherein the processing unit is further to:
and after distributing the obtained tasks to the machines in the cluster, aggregating the equipment performance indexes determined by the tasks and the received equipment performance indexes sent by other machines in the cluster according to an aggregation rule.
9. A machine for determining device performance, the machine comprising:
the dividing module is used for dividing the information required by the performance of the equipment into a plurality of dimensions after determining that the equipment is a host, wherein each dimension corresponds to one type of information, and the granularity of each dimension is divided according to the content of the corresponding information;
the combination module is used for combining the granularity of each dimension to obtain a plurality of tasks, wherein one task comprises one granularity of each dimension;
the distribution module is used for distributing the obtained tasks to the machines in the cluster so that the machines determine equipment performance indexes according to the tasks;
the dimensions include: the region, and the type of equipment, the Key Performance Indicator (KPI) formula used to determine the performance of the equipment, and part or all of the time; or the type of equipment, the key performance indicator KPI formula used to determine the performance of the equipment, or some or all of the time;
the allocation module is specifically used for determining the task number of each machine in the cluster through multi-round allocation; the machine in the cluster distributes the obtained multiple tasks according to the determined task quantity distributed to each machine in the cluster;
wherein, each round of distribution is carried out according to the following steps:
and determining the number of tasks which are distributed to each machine in the cluster and matched with the corresponding machine weight according to the distribution sequence and the number of machines in the cluster.
CN201710763494.2A 2017-08-30 2017-08-30 Method and equipment for determining equipment performance Active CN107479977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710763494.2A CN107479977B (en) 2017-08-30 2017-08-30 Method and equipment for determining equipment performance

Publications (2)

Publication Number Publication Date
CN107479977A CN107479977A (en) 2017-12-15
CN107479977B true CN107479977B (en) 2020-11-03

Family

ID=60603944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710763494.2A Active CN107479977B (en) 2017-08-30 2017-08-30 Method and equipment for determining equipment performance

Country Status (1)

Country Link
CN (1) CN107479977B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110308716B (en) * 2018-03-27 2022-02-25 上海汽车集团股份有限公司 Cluster-based method and device for automatically driving vehicle and vehicle

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102149119A (en) * 2011-04-27 2011-08-10 浪潮通信信息***有限公司 Speech service quality end-to-end analyzing method
CN102724065A (en) * 2012-05-22 2012-10-10 长沙中联消防机械有限公司 Network communication system and engineering mechanical equipment comprising same
CN104639350A (en) * 2013-11-11 2015-05-20 中兴通讯股份有限公司 Method and device for performance object aggregation path interface display in comprehensive network management
CN106326461A (en) * 2016-08-30 2017-01-11 杭州东方通信软件技术有限公司 Real time processing guarantee method and system based on network signaling record
CN106412124A (en) * 2016-12-01 2017-02-15 广州高能计算机科技有限公司 Task allocation system and task allocation method for parallel ordering cloud service platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
设备综合性能评价体系构建;陈霞,等;;《质量方法》;20170228(第2期);54-57 *

Also Published As

Publication number Publication date
CN107479977A (en) 2017-12-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200107

Address after: 510663 Shenzhou Road 10, Guangzhou Science City, Guangzhou economic and Technological Development Zone, Guangzhou, Guangdong

Applicant after: Jingxin Communication System (China) Co., Ltd.

Address before: 510663 Luogang District Science City, Guangzhou, Shenzhou Road, No. 10, Guangdong

Applicant before: Jingxin Communication System (China) Co., Ltd.

Applicant before: Jingxin Communication System (Guangzhou) Co., Ltd.

Applicant before: Jingxin Communication Technology (Guangzhou) Co., Ltd.

Applicant before: TIANJIN COMBA TELECOM SYSTEMS CO., LTD.

GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 510663 Shenzhou Road 10, Guangzhou Science City, Guangzhou economic and Technological Development Zone, Guangzhou, Guangdong

Patentee after: Jingxin Network System Co.,Ltd.

Address before: 510663 Shenzhou Road 10, Guangzhou Science City, Guangzhou economic and Technological Development Zone, Guangzhou, Guangdong

Patentee before: Comba Telecom System (China) Ltd.
