CN117650971A

CN117650971A - Method and device for preventing equipment failure of communication system

Info

Publication number: CN117650971A
Application number: CN202311660988.XA
Authority: CN
Inventors: 范志强; 吴振威; 熊云飞; 赵明明; 李海涛
Original assignee: Wuhan Fiberhome Technical Services Co Ltd
Current assignee: Wuhan Fiberhome Technical Services Co Ltd
Priority date: 2023-12-04
Filing date: 2023-12-04
Publication date: 2024-03-05
Anticipated expiration: 2043-12-04
Also published as: CN117650971B

Abstract

The present invention relates to the field of fault prevention, and in particular, to a method and apparatus for preventing a communication system device from a fault. Mainly comprises the following steps: preprocessing original characteristic data of the equipment into corresponding characteristic variables, setting corresponding weights for each characteristic variable, and generating weighted characteristic variables; according to the topological structure of the equipment, forming a sample of a single-chain structure by the weighted characteristic variables, putting the sample of the single-chain structure into a trunk node of a first isolated tree, and identifying abnormal sample points through the first isolated tree; and placing all the abnormal sample points into a trunk node of a second isolated tree, reversely identifying an abnormal sample cluster through the second isolated tree, and preventing faults of corresponding equipment according to the abnormal sample points and/or the abnormal sample cluster. The invention can improve the accuracy of the isolated forest algorithm in detecting and preventing the faults of the communication system equipment and realize the optimization of the fault prevention effect of the communication equipment.

Description

Method and device for preventing equipment failure of communication system

Technical Field

The present invention relates to the field of fault prevention, and in particular, to a method and apparatus for preventing a communication system device from a fault.

Background

The isolated forest (Isolation Forest, iForest) algorithm is an anomaly detection method based on ensemble learning and has linear time complexity. Unlike algorithms such as KMeans and DBSCAN, the isolated forest does not need to compute distance or density metrics, which greatly increases speed and reduces system overhead. It is highly accurate, and its speed advantage is especially clear on high-dimensional data, so it is now widely used in industry. Common scenarios include: attack detection in network security, financial transaction fraud detection, disease detection, noise data filtering (data cleansing), etc.

When the fault detection and prevention of the communication system equipment are carried out, the ideas of the isolated forest algorithm can be used for carrying out abnormal point identification. However, the isolated forest algorithm is mainly aimed at processing simple data points in continuous structured data. In an actual communication scenario, a single device of the communication system may include multiple nodes in a tree topology, rather than a single data point; meanwhile, the data used for fault detection in the communication system is not purely numerical data. Therefore, when the isolated forest algorithm is directly used for detecting and predicting the faults of the communication system equipment, the detection cannot be finished or the detection result is wrong due to the topological characteristic and the data characteristic of the equipment in the scene of the communication system.

In view of this, how to overcome the defects existing in the prior art, and solve the problem that the failure of the communication system cannot be prevented by directly using the isolated forest algorithm is a problem to be solved in the technical field.

Disclosure of Invention

Aiming at the defects or improvement demands of the prior art, the invention solves the problem that the fault prevention of the communication system cannot be directly carried out by using an isolated forest algorithm.

The embodiment of the invention adopts the following technical scheme:

In a first aspect, the present invention provides a method for preventing a communication system device from failure, specifically: preprocessing original characteristic data of the equipment into corresponding characteristic variables, setting corresponding weights for each characteristic variable, and generating weighted characteristic variables; according to the topological structure of the equipment, forming a sample of a single-chain structure by the weighted characteristic variables, putting the sample of the single-chain structure into a trunk node of a first isolated tree, and identifying abnormal sample points through the first isolated tree; and placing all the abnormal sample points into a trunk node of a second isolated tree, reversely identifying an abnormal sample cluster through the second isolated tree, and preventing faults of corresponding equipment according to the abnormal sample points and/or the abnormal sample cluster.

Preferably, preprocessing the original characteristic data of the device into corresponding characteristic variables specifically includes: for missing original characteristic data, complementing the corresponding characteristic variables; and/or, for original characteristic data of a non-numeric type, processing it into characteristic variables that can be quantitatively calculated; and/or, carrying out strengthening pretreatment on the original characteristic data of the numerical value type to obtain characteristic variables with higher distinction degree.

Preferably, the supplementing the corresponding feature variables for the missing original feature data specifically includes: for original characteristic data which is normally missing, corresponding characteristic variables are assigned to be the median value of the normal value range of the characteristic variables; and for the original characteristic data with abnormal missing, assigning the corresponding characteristic variable as an extreme value of the abnormal side of the characteristic variable.

Preferably, processing the original characteristic data of a non-numeric type into characteristic variables that can be quantitatively calculated specifically includes: for original characteristic data of the sequence type, mapping each characteristic value in the sequence to a characteristic variable with a specified numerical value according to the sequence characteristics; and for original characteristic data of the logic type, using each state of the logic characteristic as a characteristic variable and assigning each characteristic variable the corresponding state value.

Preferably, the strengthening pretreatment is performed on the original characteristic data of the numerical value type to obtain characteristic variables with higher distinction degree, which specifically includes: for original characteristic data with single-side abnormal characteristics, acquiring original characteristic data of a normal side exceeding a normal value range, and assigning corresponding characteristic variables as extreme values of corresponding sides of the normal value range; for the characteristic variable with the difference between the normal value range and the abnormal value range smaller than the specified difference, the gradient of the characteristic variable positioned in the normal value range is reduced, and the gradient of the characteristic variable positioned outside the normal value range is improved.

Preferably, the step of setting a corresponding weight for each feature variable to generate a weighted feature variable specifically includes: for a numerical or sequential characteristic variable, assigning a corresponding weight to each characteristic variable; for the logic type characteristic variables, the number of all the characteristic variables mapped by the original characteristic data of the logic type variable is obtained, the weight of the logic type variable is divided into a corresponding number of sub-weights according to the number of the characteristic variables, and the weight of each characteristic variable is designated as a sub-weight.

Preferably, forming the weighted characteristic variables into the sample of the single-chain structure according to the topology structure of the device specifically includes: taking a main control node of the equipment as a root node, and respectively acquiring an uplink tree structure of the uplink direction and a downlink tree structure of the downlink direction of the equipment; acquiring an uplink single chain from the root node to each leaf node in the uplink tree structure, and acquiring a downlink single chain from the root node to each leaf node in the downlink tree structure; and obtaining all one-to-one combinations of the uplink single chains and the downlink single chains, using the main control node as the connection point to connect the uplink single chain and the downlink single chain in each combination into a single-chain structure, and putting the weighted characteristic variables into the corresponding nodes of the single-chain structure.

Preferably, the placing the sample with the single-chain structure into the trunk node of the first isolated tree, and identifying the abnormal sample point through the first isolated tree specifically includes: taking all single-chain structure samples in each device as a sample set, taking out one sample from the sample set, and putting the sample into a trunk node of a first isolated tree; training the first isolated tree until a sample with isolated weighted feature variable values is obtained, and taking the obtained sample as an abnormal sample point; when all samples in the sample set of one device are taken out, all samples are put back into the sample set of the device, and the samples are taken out again to train the first isolated tree until the segmentation cannot be continued or the isolated tree reaches the designated height.

Preferably, the step of placing all the abnormal sample points into the trunk node of the second isolated tree, and reversely identifying the abnormal sample cluster through the second isolated tree specifically includes: placing all the abnormal sample points into a trunk node of a second isolated tree, performing iterative computation on the second isolated tree, and removing the isolated abnormal sample points according to the iterative computation result; and carrying out clustering calculation on the rest abnormal sample points, and acquiring an abnormal sample cluster according to a clustering calculation result.

In another aspect, the present invention provides an apparatus for preventing a failure of a communication system device, specifically: the apparatus comprises at least one processor and a memory connected through a data bus, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the processor, are used to carry out the method for preventing the fault of the communication system equipment in the first aspect.

Compared with the prior art, the invention has the beneficial effects that: the original characteristic data is preprocessed and weighted, so that the influence of the data characteristics on the evaluation accuracy is reduced; the weighted characteristic variables are formed into a sample of a single-chain structure, so that the influence of the topological characteristic on the evaluation accuracy is reduced; and predicting by using the isolated tree twice to obtain an abnormal sample point and an abnormal sample cluster, thereby further improving the evaluation accuracy. By the method, the accuracy of the isolated forest algorithm in detecting and preventing the faults of the communication system equipment is improved, and the effect of preventing the faults of the communication equipment is optimized.

Drawings

In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings that are required to be used in the embodiments of the present invention will be briefly described below. It is evident that the drawings described below are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.

FIG. 1 is a schematic diagram of a training process of an isolated forest algorithm in the prior art;

FIG. 2 is a schematic diagram of a process of determining outliers by an isolated forest algorithm in the prior art;

FIG. 3 is a flowchart of a method for preventing a communication system device from failure according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a single-sided exception data preprocessing in the method according to the embodiment of the present invention;

FIG. 5 is a schematic diagram of the method of the present invention;

FIG. 6 is a schematic flow chart of preprocessing original feature data in the method according to the embodiment of the present invention;

FIG. 7 is a flowchart of another method for preventing a device failure in a communication system according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a device topology in a scenario according to the method provided by the embodiment of the present invention;

FIG. 9 is a flowchart of another method for preventing a device failure of a communication system according to an embodiment of the present invention;

FIG. 10 is a flowchart of another method for preventing a device failure in a communication system according to an embodiment of the present invention;

FIG. 11 is a flowchart of another method for preventing a communication system device from being failed according to an embodiment of the present invention;

FIG. 12 is a flowchart of another method for preventing a communication system device from malfunctioning according to an embodiment of the present invention;

FIG. 13 is a schematic structural diagram of an apparatus for preventing a communication system device from failure according to an embodiment of the present invention;

wherein, the reference numerals are as follows:

11: a processor; 12: a memory.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

The present invention is an architecture of a specific functional system, so that in a specific embodiment, functional logic relationships of each structural module are mainly described, and specific software and hardware implementations are not limited.

In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other. The invention will be described in detail below with reference to the drawings and examples.

When the isolated forest algorithm is used to identify anomalies, an anomaly is defined as an outlier that is easily isolated. In data visualization, outliers appear as points that are sparsely distributed and far from the high-density population. Statistically, if a region of the data space contains only sparsely distributed points, the probability of a data point falling in that region is low, so the points in such regions can be considered abnormal.

The theoretical basis of the isolated forest algorithm has two points: 1. the proportion of outlier data to the total sample size is small, which determines that it is suitable for finding potential points of failure by discriminating outlier feature vectors, since the points of failure are always very few. 2. The characteristic value of the abnormal point is greatly different from that of the normal point, and the algorithm effect can be optimized through targeted fault characteristic data strengthening processing.

Based on this theoretical basis, the idea of the isolated forest algorithm is as follows: the data space is split with a random hyperplane, producing two subspaces after one split. Next, random hyperplanes are selected again to split each of the two subspaces obtained in the previous step. This is repeated until each subspace contains only one sample point. If a sample point is far from all other sample points, it can be separated from them with fewer splits.

Corresponding to the above algorithm idea, the existing isolated forest algorithm is divided into two steps:

Step 101: Training: sample from the training set, construct isolated trees (Isolation Tree, abbreviated as iTree), test each iTree in the isolated forest, and record the path length from the root node to each leaf node in the tree structure of the isolated tree.

Step 102: calculating an anomaly score: according to the anomaly score calculation formula, an anomaly score (anomaly score) for each sample point is calculated.

In step 101, each iTree is trained as follows:

Step 201: m points are randomly selected from the training set as sub-samples and placed in the trunk node of an iTree.

Step 202: A feature variable is randomly designated, and a segmentation point p is randomly generated within the data range of the current node; p lies between the maximum value and the minimum value of the designated feature variable in the current node data.

Step 203: A hyperplane is generated based on the segmentation point p, and the hyperplane splits the data space of the current node into 2 subspaces: points less than p under the currently selected feature variable are placed on the left branch of the current node, and points greater than or equal to p are placed on the right branch of the current node.

Step 204: Steps 202 and 203 are applied recursively on the left and right branches of the node to keep constructing new leaf nodes, until only one data item remains on a leaf node and the segmentation cannot continue, or the tree has grown to the specified height.

Further, during detection only the possible outliers with short path lengths are of interest; normal points with long paths are not. To simplify calculation, the tree height may be limited. In practical implementations, the specified height of the iTree is related to the number of sub-samples m. Preferably, the height limit of the iTree is generally log₂ m, which can be regarded as the average height of normal nodes in a general scenario.
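Steps 201 to 204 can be illustrated with a minimal Python sketch. This is not the implementation of the embodiment: the dictionary-based tree layout and the function names `build_itree`, `train_itree` and `path_length` are assumptions made for illustration, and the sketch only splits on features that still have more than one distinct value, which is a simplification.

```python
import math
import random


def build_itree(points, height=0, height_limit=None, rng=random):
    """Build one isolation tree (iTree) from a list of numeric feature vectors."""
    if height_limit is None:
        # Height limit of roughly log2(m), where m is the sub-sample size.
        height_limit = math.ceil(math.log2(max(len(points), 2)))
    # Step 204 stop conditions: at most one point left, or specified height reached.
    if len(points) <= 1 or height >= height_limit:
        return {"size": len(points)}
    # Step 202: randomly designate a feature (here: one that can still be split) ...
    splittable = [
        f for f in range(len(points[0]))
        if min(p[f] for p in points) < max(p[f] for p in points)
    ]
    if not splittable:  # all remaining points are identical
        return {"size": len(points)}
    feature = rng.choice(splittable)
    values = [p[feature] for p in points]
    # ... and draw a segmentation point p between its minimum and maximum.
    split = rng.uniform(min(values), max(values))
    # Step 203: points below p go to the left branch, the rest to the right.
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_itree(left, height + 1, height_limit, rng),
        "right": build_itree(right, height + 1, height_limit, rng),
    }


def train_itree(training_set, m, rng=random):
    """Step 201: draw m sub-samples and grow a tree from the trunk node."""
    return build_itree(rng.sample(training_set, m), rng=rng)


def path_length(tree, x, depth=0):
    """Number of edges from the trunk node to the leaf that x falls into."""
    if "feature" not in tree:
        return depth
    branch = "left" if x[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)
```

Calling `train_itree(data, 16)` on a list of feature vectors returns one tree; points that end in a leaf after few edges are the candidates for outliers.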

FIG. 1 shows the segmentation training of sub-samples in a certain scenario. The sample x_i lies in a region of higher density, so it is separated into its own subspace only after many splits; the sample x_0 falls in a region where sample points are sparsely distributed, so its isolation is completed after only a few splits. It can be seen that, in a single iTree, a shorter path from a leaf node to the root node means that the leaf node has undergone fewer splits and is more likely to be an outlier.

Since the segmentation process is completely random, an ensemble method is required for the result to converge: the segmentation is repeated from scratch many times, and the results of all runs are averaged. After a specified number t of isolated trees is obtained, the test data can be evaluated using the generated isolated trees. That is, an anomaly score s is calculated according to step 102.

For each sample x, the results of all isolated trees are combined, and the anomaly score is computed by the following formula:

$$s(x, m) = 2^{-\frac{h(x)}{c(m)}}$$

where h(x) is the average of the path lengths of sample x over the isolated trees, and c(m) is the average path length of an isolated tree for the given number of samples m.

The path length h(x) of the sample x is normalized by c(m), and the anomaly score of the sample x is obtained from the normalized result, as shown in FIG. 2:

(1) If the anomaly score of sample x is close to 1, sample x must be an outlier;

(2) If the anomaly score for sample x is much less than 0.5, sample x must not be an outlier;

(3) If the anomaly score for all samples is around 0.5, this indicates that there may be no outliers in the scene.
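The score calculation of step 102 can be sketched as follows. The closed form used for c(m) is taken from the commonly published Isolation Forest formulation (the average path length of an unsuccessful binary search tree lookup) and is an assumption here, since the present text does not spell it out; the function names are likewise illustrative.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def c(m):
    """Assumed normalizer: average path length of an unsuccessful
    binary-search-tree lookup over m samples."""
    if m <= 1:
        return 0.0
    if m == 2:
        return 1.0
    harmonic = math.log(m - 1) + EULER_GAMMA  # approximation of H(m - 1)
    return 2.0 * harmonic - 2.0 * (m - 1) / m


def anomaly_score(path_lengths, m):
    """Score in (0, 1] from the path lengths of one sample over all trees.

    path_lengths: one path length per isolated tree; m: sub-sample size (m > 1).
    """
    mean_h = sum(path_lengths) / len(path_lengths)  # h(x) averaged over trees
    return 2.0 ** (-mean_h / c(m))
```

Under these assumptions a sample whose average path length equals c(m) scores exactly 0.5, shorter paths push the score towards 1, and longer paths push it towards 0, which matches the three cases listed above.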

The fault prevention of communication equipment needs to find, among a large number of devices in the network, the devices whose performance and state values are outliers, which matches the main characteristics of the isolated forest algorithm. However, for the conventional isolated forest method to be applied directly to the fault prevention of communication equipment and achieve good results in engineering practice, the following problems must still be solved.

1. The individual communication devices are not a single point structure, typically a tree structure. Thus, a device cannot be simply considered a sample point. In fault prevention, the discovery of the fault point should be located at a certain node position in the equipment, instead of taking the tree structure of the whole equipment as a fault prevention unit.

2. The isolated forest algorithm is mainly aimed at continuous structured data, but some feature variables in communication equipment are not continuous data; they are logic values or sequence values, for example classification values and state values. These non-numerical feature variables must be converted into numerical forms that the isolated forest can handle.

3. The isolated forest algorithm has higher degree of distinction for scenes with very different characteristic values of abnormal points and normal points, but the difference between the normal value range and the abnormal value range of some characteristic variables of the communication equipment is not very large.

4. Communication devices have many feature variables with single-sided anomaly characteristics. For example, the normal range of device temperature is typically [40, 60] (unit: degrees), and higher temperatures are abnormal; however, a small number of devices run at around 30 degrees or even lower, and these normal values can be misidentified as anomalies because they are outliers.

5. The collection of feature variables (performance and state parameters) of communication equipment inevitably encounters a few missing values, including normal missing and abnormal missing, and different complement preprocessing is needed for the different cases.

6. In general, the iTree randomly designates a feature variable during calculation, i.e. in each round of splitting the relative probability of each feature variable being selected is the same, say 1. In practical scenarios, however, not all feature variables carry the same weight for fault identification.
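Problem 6 can be made concrete with a small sketch in which each feature variable's weight is used as its relative probability of being chosen in a round of splitting. Treating the weight as a selection probability is an assumption made for illustration, and the feature names and weight values below are hypothetical.

```python
import random


def pick_feature(weights, rng=random):
    """Pick one feature name, with probability proportional to its weight.

    weights: dict mapping feature name -> non-negative relative weight.
    With all weights equal to 1 this reduces to the uniform choice of a
    plain iTree.
    """
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical weights: temperature matters most for fault identification.
example_weights = {"temperature": 3.0, "optical_power": 2.0, "fan_state": 1.0}
```

With these example weights, `pick_feature(example_weights)` returns "temperature" about half of the time (3 out of a total weight of 6).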

Example 1:

In view of the characteristics of the communication equipment fault prevention problem, this embodiment optimizes the data preprocessing stage, the iTree sampling rules and the isolation computation stage of the isolated forest algorithm, so as to improve the effect of fault prevention for communication equipment.

As shown in fig. 3, the method for preventing the fault of the communication system device provided by the embodiment of the invention specifically includes the following steps:

Step 301: preprocessing the original characteristic data of the equipment into corresponding characteristic variables, setting corresponding weights for each characteristic variable, and generating weighted characteristic variables.

When fault prevention is carried out, the original feature data of each device must first be acquired, either through device reporting or through active collection by the network management device. In practical implementations, the original feature data may include non-numeric variables such as logic values or sequence values, single-sided anomalies, or missing data. Therefore, the original feature data must first be preprocessed to obtain the corresponding feature variables. Furthermore, in order to distinguish the importance of different feature variables in fault prevention, the feature variables need to be weighted to obtain weighted feature variables.

Step 302: according to the topological structure of the equipment, the weighted characteristic variables are formed into samples of a single-chain structure, the samples of the single-chain structure are placed into trunk nodes of a first isolated tree, and abnormal sample points are identified through the first isolated tree.

So that a fault can be located at a specific node of the device topology, in keeping with the topological characteristics of the device, this embodiment forms the weighted feature variables into samples of a single-chain structure according to the topology of the device. Using single-chain samples eliminates the influence of topologies such as tree structures on fault positioning.

After the samples of the single-chain structures are obtained, the samples of the single-chain structures can be sequentially placed into the first iTrees, each first iTree is trained through an isolated forest algorithm, the abnormal scores of the samples are calculated, and abnormal sample points are obtained through the abnormal scores of the samples.

Step 303: placing all the abnormal sample points into the trunk node of a second isolated tree, reversely identifying an abnormal sample cluster through the second isolated tree, and preventing faults of corresponding equipment according to the abnormal sample points and/or the abnormal sample cluster.

After the abnormal sample points are obtained, the equipment faults can be prevented directly according to the characteristic items represented by the abnormal sample points or the positions of the abnormal sample points.

Further, if a fault occurs at a position close to a root node of a certain device, all single-chain feature data passing through the node are abnormal, so that an abnormal sample cluster is formed around the node. In an actual scene, an isolated forest algorithm can be used again, an abnormal sample cluster is obtained through the second iTree, and faults of the equipment are prevented through the abnormal sample cluster.

After steps 301 to 303 provided in this embodiment, the problem existing in the isolated forest during the fault prevention of the communication system can be eliminated, and the fault prevention can be performed more accurately.
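The construction of single-chain samples used in step 302 (main control node as root, one uplink tree, one downlink tree, every uplink chain paired with every downlink chain) can be sketched as below. The adjacency-dictionary representation, the node names, and the orientation of the joined chain (uplink leaf, then main control node, then downlink leaf) are assumptions made for illustration.

```python
from itertools import product


def root_to_leaf_paths(tree, root):
    """All paths from root to each leaf.

    tree: dict mapping a node to the list of its child nodes.
    """
    children = tree.get(root, [])
    if not children:
        return [[root]]
    paths = []
    for child in children:
        for tail in root_to_leaf_paths(tree, child):
            paths.append([root] + tail)
    return paths


def single_chains(uplink_tree, downlink_tree, master):
    """Join every uplink chain with every downlink chain at the master node."""
    up_paths = root_to_leaf_paths(uplink_tree, master)
    down_paths = root_to_leaf_paths(downlink_tree, master)
    chains = []
    for up, down in product(up_paths, down_paths):
        # Reverse the uplink path so the chain runs
        # uplink leaf -> ... -> master -> ... -> downlink leaf.
        chains.append(up[::-1] + down[1:])
    return chains


# Hypothetical device: two uplink ports, one downlink card with two ports.
uplink = {"MCU": ["U1", "U2"]}
downlink = {"MCU": ["D1"], "D1": ["D1a", "D1b"]}
chains = single_chains(uplink, downlink, "MCU")
```

For this hypothetical device the sketch yields 2 x 2 = 4 chains, each passing through the main control node exactly once; the weighted feature variables of each node would then be attached to the corresponding position of the chain.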

To eliminate the problems 2-5, preprocessing of the raw feature data of the device into corresponding feature variables needs to be performed for different types of raw feature data.

(1) For missing original feature data, the corresponding feature variables are complemented.

(2) For original feature data of a non-numeric type, the data is processed into feature variables that can be quantitatively calculated.

(3) For original feature data of the numerical type, strengthening pretreatment is carried out to obtain feature variables with a higher degree of distinction.

For problem 5, for missing raw feature data, the corresponding feature variables need to be complemented.

(1) For original feature data that is normally missing, the corresponding feature variable is assigned the median of the normal value range of the feature variable.

Normal missing of original feature data occurs, for example, when a few specific device models do not have a certain feature or do not support providing a certain feature value. Such missing data is not caused by a fault and does not affect fault prevention and positioning. Therefore, in the method provided by this embodiment, the missing original feature data is treated as a normal value by default and is uniformly assigned the median of the normal value range of the feature variable.

For example, a service disk of a certain model has no temperature sensor and no temperature characteristic value. At this time, the temperature characteristic variable of the service disk of this model is uniformly assigned the median value 50 of the normal value range [40, 60].

(2) For original feature data that is abnormally missing, the corresponding feature variable is assigned the extreme value of the abnormal side of the feature variable.

Abnormal missing of original feature data means that the device fails to provide the corresponding original feature data; the causes include, but are not limited to, the device falling out of network management, feature value overflow, etc. When abnormal missing occurs, the device can be considered to be in an abnormal state and corresponding fault positioning is required. Therefore, in the method provided by this embodiment, the missing original feature data is treated as an abnormal value by default and is assigned the extreme value of the abnormal side of the feature variable.
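The two complement rules can be summarized in a short sketch. The function name, the `missing_kind` and `abnormal_side` labels, and the overall value range (0 to 120 in the usage below) are hypothetical; only the normal range [40, 60] and its median 50 come from the example in the text.

```python
def fill_missing(normal_range, value_range, abnormal_side, missing_kind):
    """Return the value used to complement a missing feature variable.

    normal_range:  (low, high) bounds of the normal value range
    value_range:   (min, max) extreme values the variable can take (assumed known)
    abnormal_side: "high" or "low", the side on which values are abnormal
    missing_kind:  "normal" (e.g. no sensor) or "abnormal" (data should exist)
    """
    low, high = normal_range
    if missing_kind == "normal":
        # Normally missing: treat as normal, use the median of the normal range.
        return (low + high) / 2
    if missing_kind == "abnormal":
        # Abnormally missing: treat as abnormal, use the extreme of the abnormal side.
        return value_range[1] if abnormal_side == "high" else value_range[0]
    raise ValueError(f"unknown missing_kind: {missing_kind!r}")
```

For the temperature example, `fill_missing((40, 60), (0, 120), "high", "normal")` gives 50.0, while the same call with `"abnormal"` gives the assumed upper extreme 120.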

For problem 2, original feature data of a non-numeric type needs to be processed into feature variables that can be quantitatively calculated.

(1) For original feature data of the sequence type, each feature value in the sequence is mapped to a feature variable with a specified numerical value according to the sequence characteristics.

Original feature data of the sequence type refers to data whose feature values are not numerical but have an inherent order, such as normal/transient interruption/long interruption, or normal/mild/moderate/severe.

To facilitate calculation and comparison, in the method provided in this embodiment each feature value in the sequence is mapped to a numerical value, such that the order of the values and the distances between them conform to the business logic. For example, "normal/mild/moderate/severe" is mapped to "0/1/3/6".
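A minimal sketch of this sequence mapping, using the 0/1/3/6 values from the example; the dictionary and function names are illustrative assumptions.

```python
# Mapping from the example in the text: order is preserved and the growing
# gaps (1, 2, 3) reflect the increasing severity between adjacent levels.
SEVERITY_SCALE = {"normal": 0, "mild": 1, "moderate": 3, "severe": 6}


def encode_sequence_value(value, scale=SEVERITY_SCALE):
    """Map one sequence-type feature value to its numerical feature variable."""
    if value not in scale:
        raise ValueError(f"unknown sequence value: {value!r}")
    return scale[value]
```

A different sequence-type feature would use its own scale, chosen so that the numerical distances match the business meaning of the levels.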

(2) For original feature data of the logic type, each state of the logic feature is used as a feature variable, and each feature variable is assigned the corresponding state value.

The raw feature data of a logic type generally represents a classification, e.g., a device switch state in boolean values, which can be considered as a logic type feature variable comprising two states. In the method provided by the embodiment, each state of the logic feature can be used as a feature variable, and a corresponding feature value is given to the feature variable corresponding to each state.

For example, logic-type original feature data can be converted into feature variables using One-Hot coding:

1. Count the number n of states of the original feature data X;

2. Split the original feature data X into n feature variables: X_1 to X_n.

For example, the original feature data X has the states A/B/C, and the states are mutually exclusive, so after transcoding the original feature data X becomes 3 numerical feature variables, each with a corresponding feature value:

For example, when the value of the original feature data X is A, the corresponding three feature variable values are respectively: X_1 = 0b001, X_2 = 0b010, X_3 = 0b100.
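The One-Hot conversion can be sketched as follows, assuming the conventional indicator form in which the variable of the active state is 1 and all others are 0. The second helper illustrates the sub-weight rule for logic-type variables stated in the disclosure; splitting the weight into equal parts is an assumption, as are all function names.

```python
def one_hot(value, states):
    """Split one logic-type value into len(states) indicator variables X_1..X_n."""
    if value not in states:
        raise ValueError(f"unknown state: {value!r}")
    return {f"X_{i}": int(value == s) for i, s in enumerate(states, start=1)}


def split_weight(weight, states):
    """Divide the weight of the logic-type variable into one sub-weight per
    derived feature variable (equal shares assumed)."""
    share = weight / len(states)
    return {f"X_{i}": share for i in range(1, len(states) + 1)}
```

For the states A/B/C, `one_hot("A", ["A", "B", "C"])` yields `{"X_1": 1, "X_2": 0, "X_3": 0}`, and a weight of 3.0 is split into three sub-weights of 1.0, so the logic-type variable as a whole keeps its original weight.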

For problems 3 and 4, reinforcement pretreatment is also required for the original characteristic data of the numerical type to obtain characteristic variables with higher distinction degree.

(1) For original characteristic data with single-side abnormal characteristics, acquiring original characteristic data of a normal side exceeding a normal value range, and assigning corresponding characteristic variables as extreme values of corresponding sides of the normal value range;

For feature variables with one-sided anomaly characteristics, values that fall outside the normal value range but do not indicate a fault can all be assigned a normal value. In practical implementations, this preprocessing may be performed using a ReLU-class activation function, for example ReLU or Softplus. Typical ReLU-class functions are shown in FIG. 4.

For example, the service disk temperature of a certain device is usually in the range of 40-60 degrees. Only a temperature above the upper limit of the normal value range is regarded as a fault; a temperature below the lower limit is not. However, a few normal devices run at around 30 degrees or even lower, that is, below the lower limit of the normal value range. For ease of calculation and comparison, the raw feature data may be processed using a standard ReLU function, so that feature values below 40 degrees, the lower limit of the normal value range, are all assigned 40 degrees. This prevents low temperatures from being identified as outliers.
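As a non-limiting illustration, the temperature example above can be sketched with a shifted standard ReLU. The function names `relu` and `clamp_normal_side` are hypothetical; the lower limit of 40 degrees is taken from the example.

```python
def relu(x):
    """Standard ReLU: negative inputs become 0, others pass through."""
    return max(0.0, x)


def clamp_normal_side(x, lower):
    """Raise values below `lower` up to `lower`; leave other values unchanged.

    Only the non-fault side of the normal range is flattened, so values
    above the upper limit keep their distance from the normal range.
    """
    return lower + relu(x - lower)


temperatures = [28.0, 35.0, 47.0, 72.0]
processed = [clamp_normal_side(t, lower=40.0) for t in temperatures]
```

A smooth variant such as Softplus could be substituted for `relu` if a continuous gradient at the limit is preferred.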

(2) For the characteristic variable with the difference between the normal value range and the abnormal value range smaller than the appointed difference value, the gradient of the characteristic variable positioned in the normal value range is reduced, and the gradient of the characteristic variable positioned outside the normal value range is improved.

Because communication equipment has high precision requirements, the difference between the normal value range and the abnormal value range of certain feature variables is small, so these feature values need strengthening preprocessing.

For example, the properties of a power function can be used to reduce the numerical gradient in the normal value range and increase the numerical gradient in the abnormal value range, thereby improving the effect of isolation splitting. In one actual scenario, the power function was adapted and parameterized according to the scene characteristics and expert experience to obtain the following gradient strengthening formula, which reduces the gradient in the normal value range and increases the gradient in the abnormal value range, thereby optimizing the isolation effect.

Wherein a is the median of the normal value range, and b is the normal threshold width.

Taking the derivative of f(x) with respect to x in the above formula: when x is within the normal value range, the derivative of f(x) is not more than 1, and when x is outside the normal value range, the derivative of f(x) is more than 1. Moreover, the farther x is from the normal threshold, the greater the derivative of f(x) with respect to x.

For example, the normal value range of the received optical power of the optical module of a certain device is [-28, -8] (unit: dBm), the median is -18, and the range width is 20. The original feature data are substituted into the formula:

After conversion, as shown in FIG. 5, the gradient within the normal range [-28, -8] is not more than 1, the gradient in the abnormal value range is more than 1, and the transition is smooth, so that abnormal values can be split off and isolated as early as possible in the iTree calculation.
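The gradient strengthening formula itself is not reproduced in this text. Purely as an illustration of the stated properties, the sketch below uses a hypothetical power-type function, f(x) = a + sign(u)·(b/2)·|u|^k / k with u = (x − a)/(b/2), whose derivative is |u|^(k−1). This function, the exponent `k` and the names `strengthen` and `slope` are assumptions made for this sketch and are not the formula of this embodiment.

```python
def strengthen(x, a, b, k=3):
    """Hypothetical gradient-strengthening transform (not the patent formula).

    a: median of the normal value range; b: width of the normal value range.
    The slope is |u|**(k - 1), i.e. at most 1 inside the range and above 1
    outside it, growing with the distance from the range.
    """
    half = b / 2.0
    u = (x - a) / half
    sign = 1.0 if u >= 0 else -1.0
    return a + sign * half * abs(u) ** k / k


def slope(x, a, b, k=3, h=1e-6):
    """Numerical derivative of `strengthen` by central differences."""
    return (strengthen(x + h, a, b, k) - strengthen(x - h, a, b, k)) / (2 * h)


# Received optical power example: normal range [-28, -8], a = -18, b = 20.
inside = slope(-10.0, a=-18.0, b=20.0)   # within the normal range
outside = slope(-3.0, a=-18.0, b=20.0)   # beyond the upper limit
farther = slope(0.0, a=-18.0, b=20.0)    # even farther from the limit
```

With these assumed parameters the slope is about 0.64 at -10, about 2.25 at -3 and about 3.24 at 0, which matches the qualitative behavior described for FIG. 5.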

In practical implementation, the above data preprocessing steps may be integrated using the process shown in FIG. 6. Either all preprocessing steps are executed, or only the preprocessing steps required by the actual service are selected.

In engineering practice for evaluating the performance state of communication equipment, feature parameters differ in importance, reliability and independence. Therefore, a weight value needs to be given to each feature variable according to the actual scene requirements, data statistics, expert experience and the like, so as to control the relative probability that each feature variable is selected in each round of splitting. Accordingly, to address problem 6, the weights of the feature variables need to be preprocessed before the iTree calculation, so that the relative probability that each feature variable is selected as the split point p in each round of splitting is quantitatively controlled.

(1) For a numerical or sequential type of feature variable, a corresponding weight is assigned to each feature variable.

For a numerical type characteristic variable, a corresponding weight can be directly assigned to the relevance of the characteristic variable to fault prevention. The feature variables with higher importance, reliability, independence and the like should have higher weights, i.e. the selected relative probabilities are larger.

The sequential characteristic variables in the non-numerical characteristic variables are correspondingly converted in the preprocessing, so that when the weights are set, the sequential characteristic variables can be processed in a consistent manner with the numerical characteristic variables, and corresponding weights can be directly formulated for the sequential characteristic variables.

(2) For the logic type characteristic variables, the number of all the characteristic variables mapped by the original characteristic data of the logic type variable is obtained, the weight of the logic type variable is divided into a corresponding number of sub-weights according to the number of the characteristic variables, and the weight of each characteristic variable is designated as a sub-weight.

In this embodiment, in order to convert a non-numeric feature variable of the logic type into corresponding numerical values, 1 piece of logic-type original feature data X needs to be split into n feature variables. If the splitting is performed directly, the influence weight of the original feature data X in the iTree becomes n times that of the other feature variables, that is, its probability of being randomly selected becomes n times the original probability. Therefore, before the iTree calculation, weights need to be assigned to X₁ to Xₙ so that the sum of the weights of X₁ to Xₙ equals the weight that the original feature data X should have. In practical implementation, the weight of the original feature data X can simply be divided equally among X₁ to Xₙ, so that the weight of each feature variable is reduced to 1/n of the weight of the original feature data X.
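As a non-limiting illustration, the weight handling described above can be sketched as follows. The feature names, the weight values and the functions `split_logic_weight` and `pick_split_feature` are hypothetical. The sketch assumes that "relative probability" means selection with probability proportional to the weight.

```python
import random


def split_logic_weight(name, weight, n_states):
    """Divide the weight of one logic-type feature equally among its n variables."""
    return {f"{name}_{i}": weight / n_states for i in range(1, n_states + 1)}


def pick_split_feature(weights, rng):
    """Choose the feature variable for one split, proportionally to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical weights for numerical feature variables.
weights = {"temperature": 2.0, "rx_power": 1.0}
# One logic-type feature with 3 states and total weight 1.0.
weights.update(split_logic_weight("switch_state", 1.0, 3))

rng = random.Random(1)
chosen = pick_split_feature(weights, rng)
```

Because the three sub-weights sum to 1.0, the logic-type feature as a whole is selected as often as `rx_power`, rather than three times as often.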

In a practical communication system, many devices do not have a single-point structure, and one device cannot simply be regarded as one sample point. To address problem 1, in this embodiment the data samples of the tree structure are decomposed into a plurality of single-chain structures by permutation and combination, and each single chain participates in the iTree calculation as one sample point.

As shown in fig. 7, the weighted feature variables may be composed into samples of a single-chain structure in the following manner.

Step 401: taking the main control node of the device as the root node, respectively acquire the uplink tree structure in the uplink direction and the downlink tree structure in the downlink direction of the device.

Any one of the child device nodes may be expressed in terms of an ID in the form of a Root-Slot number-interface number (Root-Slot-Port). For example: r represents a root node (master disk), R-11 represents an 11 th slot (slot) service disk, and R-11-3 represents a 3 rd PON port (port) under the 11 th slot (slot) service disk.

Taking the device shown in FIG. 8 as an example, in the downlink direction of the master disk there is a 3-level tree structure (R-1-2): master disk, service disk, passive optical network (Passive Optical Network, abbreviated PON) port. In the uplink direction of the master disk there is also a 3-level tree structure (2-1-R): uplink port, uplink disk, master disk.

Step 402: acquire an uplink single chain from the root node to each leaf node in the uplink tree structure, and acquire a downlink single chain from the root node to each leaf node in the downlink tree structure.

In order to obtain single-chain structures that contain all levels of the device, that is, single-chain structures running from a leaf node of the uplink tree structure to a leaf node of the downlink tree structure, the uplink tree structure and the downlink tree structure can be decomposed into a plurality of uplink single chains and a plurality of downlink single chains respectively. The uplink single chains and the downlink single chains are then combined and connected at the master disk, so that every single-chain structure in the device is obtained.

Taking FIG. 8 as an example, there are 2 single chains in the uplink direction: 1-1-R and 1-2-R; there are also 2 single chains in the downlink direction: R-1-1 and R-1-2.

Step 403: obtain all one-to-one combinations of the uplink single chains and the downlink single chains, connect the uplink single chain and the downlink single chain in each combination into a single-chain structure with the main control node as the connection point, and put the weighted feature variables into the corresponding nodes of the single-chain structure.

After all the uplink single chains and all the downlink single chains are obtained, they are combined one by one to form all the single-chain structures.

The combination yields 4 single-chain structures, that is, the device is decomposed into 4 single-chain structures: 1-1-R-1-1, 1-1-R-1-2, 1-2-R-1-1 and 1-2-R-1-2.

In each single-chain structure, all levels of the device are included, and there is one and only one node at each level of the device. All characteristic variables of each node on the single-chain structure are obtained, and then the nodes are connected together according to the node sequence of the single-chain structure, so that all characteristic data of a sample corresponding to the single-chain structure are formed.

After steps 401 to 403 provided in this embodiment, a tree-structured device may be decomposed into a plurality of samples expressed by one-dimensional vectors, which may be directly used for the iTree calculation.
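As a non-limiting illustration, steps 401 to 403 can be sketched as follows. The node labels (`U`, `U.p1`, `S`, `S.pon1` and so on), the adjacency-dictionary representation of the trees and all function names are hypothetical and serve only to mirror the two-by-two layout of the FIG. 8 example.

```python
from itertools import product


def root_to_leaf_paths(tree, node):
    """Return every path from `node` down to a leaf as a list of node IDs."""
    children = tree.get(node, [])
    if not children:
        return [[node]]
    return [[node] + path
            for child in children
            for path in root_to_leaf_paths(tree, child)]


def single_chains(uplink_tree, downlink_tree, root="R"):
    """Steps 401-403: join each uplink chain to each downlink chain at the root."""
    up_paths = root_to_leaf_paths(uplink_tree, root)
    down_paths = root_to_leaf_paths(downlink_tree, root)
    # Reverse the uplink path so it runs leaf -> root, then append the
    # downlink path without repeating the root node.
    return [up[::-1] + down[1:] for up, down in product(up_paths, down_paths)]


def chain_to_vector(chain, features):
    """Concatenate the weighted feature variables of the nodes in chain order."""
    return [value for node in chain for value in features[node]]


uplink_tree = {"R": ["U"], "U": ["U.p1", "U.p2"]}
downlink_tree = {"R": ["S"], "S": ["S.pon1", "S.pon2"]}
chains = single_chains(uplink_tree, downlink_tree)

features = {"U.p1": [0.1], "U.p2": [0.9], "U": [0.2], "R": [0.3, 0.4],
            "S": [0.5], "S.pon1": [0.6], "S.pon2": [0.7]}
sample = chain_to_vector(chains[0], features)
```

Each resulting chain contains exactly one node per device level, so every sample vector has the same length and can be compared feature by feature.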

The conventional practice for training a single iTree is to randomly select m samples from all training samples as a subsample and put them into the trunk node of the iTree. In the communication system, however, each trunk node of the iTree contains a plurality of single-chain samples, and the device child nodes corresponding to each sample are different, so the subsample cannot be extracted directly using the conventional method.

Meanwhile, due to problems 3 and 4, the existing method for extracting iTree subsamples may suffer from a swamping problem and a masking problem. Swamping refers to normal samples being wrongly predicted as abnormal: when normal samples lie very close to abnormal samples, the number of splits required to isolate the anomaly increases, making it harder to distinguish abnormal samples from normal samples. Masking refers to a large number of densely packed outliers forming an outlier cluster, which likewise requires more splits to isolate.

Thus, in this embodiment, subsamples also need to be extracted for the training of a single iTree, rather than using all samples directly.

As shown in fig. 9, the extraction of the subsamples may be performed in the following manner.

Step 501: taking all single-chain structure samples in each device as a sample set, taking out one sample from the sample set, and putting the sample into a trunk node of a first isolated tree.

As can be seen from the sample construction method in steps 401-403, in this embodiment each sample contains the data of one single-chain structure in one device, and each device contains a plurality of samples. Therefore, for each first iTree, only one single-chain sample is taken from the sample set composed of all samples of each device. This avoids the situation in which an abnormal device root node produces an abnormal cluster whose abnormal single-chain samples are difficult to isolate. Meanwhile, to avoid repetition, the single chains are traversed, and each sample that is taken out is not put back.

Step 502: training the first isolated tree until a sample with isolated weighted feature variable values is obtained, and taking the obtained sample as an abnormal sample point.

After a sample is extracted, it can be put into the first iTree in the same way as a sample in conventional iTree training. The first iTree is then trained, and abnormal samples are obtained from the training result. In this embodiment a single-chain structure is used as a sample, so if any node in the single-chain structure of a sample has an isolated weighted feature variable, that sample is regarded as an abnormal sample point.

Step 503: when all samples in the sample set of one device are taken out, all samples are put back into the sample set of the device, and the samples are taken out again to train the first isolated tree until the segmentation cannot be continued or the isolated tree reaches the designated height.

In an actual communication system, the topology of each device is different and the number of samples contained in each sample set is different, so after the sample set of one device is emptied, unprocessed samples may still exist in the sample sets of other devices. In order that all samples can be processed, when the sample set of one device is emptied, all of its samples need to be put back into the sample set of that device so that random extraction can continue.

When all samples have been split, or the isolated tree reaches the specified height, training can be considered complete and the training of the first iTree ends.

After steps 501-503 provided in this embodiment, samples for single first iTree training may be extracted, and single first iTree training may be completed based on the extracted samples.
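As a non-limiting illustration, the extraction rule in steps 501 and 503 (one sample per device, taken without being put back, with the whole set put back once it is empty) can be sketched as follows. The class `DevicePool`, the function `draw_subsample` and the device and sample names are hypothetical; the training of the first iTree itself is not shown.

```python
import random


class DevicePool:
    """Sample set of one device: draw without replacement, refill when empty."""

    def __init__(self, samples, rng):
        self._all = list(samples)
        self._rng = rng
        self._remaining = []

    def draw(self):
        if not self._remaining:
            # All samples of this device have been taken: put them all back
            # and shuffle so that extraction stays random.
            self._remaining = list(self._all)
            self._rng.shuffle(self._remaining)
        return self._remaining.pop()


def draw_subsample(pools):
    """Take exactly one single-chain sample from every device."""
    return {device: pool.draw() for device, pool in pools.items()}


rng = random.Random(0)
pools = {
    "dev_a": DevicePool(["a1", "a2"], rng),
    "dev_b": DevicePool(["b1", "b2", "b3"], rng),
}
rounds = [draw_subsample(pools) for _ in range(6)]
```

In this sketch `dev_a` is refilled after two rounds while `dev_b` still holds an untaken sample, which corresponds to the case described for devices with differently sized sample sets.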

The processing in steps 401-403 and steps 501-503 may be combined with the existing isolated forest algorithm and implemented in the manner shown in FIG. 10.

With this sample extraction method, the abnormal samples contained in an abnormal cluster caused by an abnormal device root node are distributed among different first iTrees, and every sample has an equal chance of being taken. However, after a tree-structured device is decomposed into a plurality of single-chain samples, an outlier located near the root node of the device may still cause a masking problem.

Therefore, after the abnormal sample points are found, the calculation results of all first iTrees also need to be integrated, and a second iTree calculation is performed on all the abnormal sample points in the manner shown in FIG. 11.

Step 601: place all the abnormal sample points into the trunk node of a second isolated tree, perform iterative computation on the second isolated tree, and remove the isolated abnormal sample points according to the result of the iterative computation.

The abnormal sample points obtained by using the first iTree include relatively isolated abnormal sample points and also include abnormal sample clusters composed of a group of abnormal sample points aggregated with each other. In this embodiment, the principle of the isolated forest algorithm is reversely used, all the obtained abnormal sample points are split through the second iTree, the isolated abnormal sample points and the aggregated abnormal sample points are distinguished, and then the identified isolated abnormal sample points are removed, so that the aggregated abnormal sample points can be obtained.

Step 602: perform clustering calculation on the remaining abnormal sample points, and acquire the abnormal sample clusters according to the clustering result.

After the isolated abnormal sample points are removed, the rest abnormal sample points are mutually gathered. In practical implementations, there may be one root node fault or multiple root node faults, corresponding to multiple groups of aggregated abnormal sample points. Therefore, clustering calculation is further needed for the clustered abnormal sample points to distinguish each group of clustered abnormal sample points, and each group of clustered abnormal sample points is used as an abnormal sample cluster.

After steps 601-602 provided in this embodiment, the abnormal sample clusters caused by root node faults can be found. Each abnormal sample cluster obtained may indicate an abnormality at the location of a device root node. Because a hidden root node fault affects a wider range, using abnormal sample clusters for fault prevention has higher practical value in engineering practice.

The processing in steps 601-602 may be combined with the existing isolated forest algorithm and implemented in the manner shown in FIG. 12.
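As a non-limiting illustration, steps 601 and 602 can be sketched as follows. The sketch makes several assumptions that are not stated in this embodiment: the isolation depth is averaged over many random trees, a point counts as isolated when its mean depth is below a hypothetical `ratio` times the median depth, and the clustering is a simple distance-threshold grouping with a hypothetical `radius`. The sample coordinates are invented for the example.

```python
import math
import random
from statistics import median


def isolation_depth(point, points, rng, depth=0, max_depth=12):
    """Number of random splits needed to separate `point` from `points`."""
    if len(points) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return depth
    cut = rng.uniform(lo, hi)
    same_side = [p for p in points if (p[dim] < cut) == (point[dim] < cut)]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)


def mean_depths(points, trees=200, seed=0):
    rng = random.Random(seed)
    return [sum(isolation_depth(p, points, rng) for _ in range(trees)) / trees
            for p in points]


def drop_isolated(points, ratio=0.5):
    """Step 601 (reverse use): remove points that are isolated quickly."""
    depths = mean_depths(points)
    cutoff = ratio * median(depths)
    return [p for p, d in zip(points, depths) if d >= cutoff]


def cluster(points, radius):
    """Step 602: group points that lie within `radius` of a cluster member."""
    clusters = []
    for p in points:
        near = [c for c in clusters if any(math.dist(p, q) <= radius for q in c)]
        merged = [p] + [q for c in near for q in c]
        clusters = [c for c in clusters if c not in near] + [merged]
    return clusters


abnormal = [
    (0.0, 0.0), (0.4, 0.9), (0.9, 0.3), (0.6, 0.6), (0.2, 0.5), (1.0, 1.0),
    (10.0, 10.0), (10.4, 10.9), (10.9, 10.3), (10.6, 10.6), (10.2, 10.5),
    (11.0, 11.0),
    (50.0, -40.0), (-30.0, 60.0),   # two relatively isolated abnormal points
]
clusters = cluster(drop_isolated(abnormal), radius=3.0)
```

In this invented data set, the two distant points are separated after very few splits and are removed, while the two dense groups remain and are returned as two abnormal sample clusters.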

The method for preventing the faults of the communication system equipment has the following beneficial effects:

1. The data samples of the tree structure are decomposed into a plurality of single-chain structures through permutation and combination, so that the problem that the isolated forest algorithm in the problem 1 cannot process the sample data of the tree structure is solved.

2. In an iTree, each device contributes only one single-chain sample at a time, and the selected sample is not put back until all samples of the device have been taken, after which all samples of the device are put back. This solves the problem, described in problem 2, that an abnormality near the root node causes the single-chain data associated with the abnormal node to aggregate and become difficult to isolate.

3. When a single iTree is trained, the weight problem caused by the transcoding of logic-type feature variables in problem 2 is controlled by controlling the probability with which feature variables are randomly selected.

4. When a single iTree is trained, the weight setting problem of the feature variables in problem 6 is controlled by controlling the probability with which feature variables are randomly selected.

5. After the training of all iTrees is finished, the abnormal sample points of the single-chain structure are found, an isolation calculation is performed on all abnormal sample points, and the aggregated abnormal sample clusters are found in reverse, thereby locating possible abnormal nodes close to the device root node. This solves the problem, described in problem 1, that the existing isolated forest algorithm cannot process the device topology of a tree structure.

Example 2:

On the basis of the method for communication system equipment failure prevention provided in embodiment 1, the invention also provides an apparatus for communication system equipment failure prevention, which can be used to implement the method. FIG. 13 is a schematic structural diagram of the apparatus. The apparatus of this embodiment includes one or more processors 11 and a memory 12. In FIG. 13, one processor 11 is taken as an example.

The processor 11 and the memory 12 may be connected by a bus or otherwise; connection by a bus is taken as an example in FIG. 13.

The memory 12, as a non-volatile computer-readable storage medium for the method for communication system equipment failure prevention, may be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as those implementing the method of embodiment 1. By running the non-volatile software programs, instructions and modules stored in the memory 12, the processor 11 executes the various functional applications and data processing of the apparatus, that is, implements the method for communication system equipment failure prevention of embodiment 1.

Memory 12 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 12 may optionally include memory located remotely from processor 11, which may be connected to processor 11 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The program instructions/modules are stored in the memory 12 and when executed by the one or more processors 11 perform the method of communication system device failure prevention in embodiment 1 described above, for example, performing the steps shown in fig. 3, 7 and 11 described above.

Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the embodiments may be implemented by a program instructing the associated hardware. The program may be stored on a computer-readable storage medium, which may include: Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk or an optical disk.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims

1. A method for communication system equipment failure prevention, comprising:

preprocessing original characteristic data of the equipment into corresponding characteristic variables, setting corresponding weights for each characteristic variable, and generating weighted characteristic variables;

according to the topological structure of the equipment, forming a sample of a single-chain structure by the weighted characteristic variables, putting the sample of the single-chain structure into a trunk node of a first isolated tree, and identifying abnormal sample points through the first isolated tree;

and placing all the abnormal sample points into a trunk node of a second isolated tree, reversely identifying an abnormal sample cluster through the second isolated tree, and preventing faults of corresponding equipment according to the abnormal sample points and/or the abnormal sample cluster.

2. The method for preventing faults of equipment of a communication system according to claim 1, wherein the preprocessing of the original characteristic data of the equipment into corresponding characteristic variables specifically comprises:

for the missing original characteristic data, the corresponding characteristic variables are complemented;

and/or, for raw feature data of a non-numeric type, processing into quantifiable calculated feature variables;

and/or, carrying out strengthening pretreatment on the original characteristic data of the numerical value type to obtain characteristic variables with higher distinction degree.

3. The method for preventing faults of communication system equipment according to claim 2, wherein the supplementing of the missing original characteristic data with corresponding characteristic variables specifically comprises:

for original characteristic data which is normally missing, corresponding characteristic variables are assigned to be the median value of the normal value range of the characteristic variables;

and for the original characteristic data with abnormal missing, assigning the corresponding characteristic variable as an extreme value of the abnormal side of the characteristic variable.

4. The method for preventing faults of communication system equipment according to claim 2, wherein for raw characteristic data of a non-numeric type, the processing is performed as a quantifiable calculated characteristic variable, specifically comprising:

for the original characteristic data of the sequence type, mapping each characteristic value in the sequence into a characteristic variable with a specified numerical value according to the sequence characteristics;

and for the original feature data of the logic type, according to the states corresponding to the logic feature, each state of the logic feature is used as a feature variable, and each feature variable is assigned the corresponding state value.

5. The method for preventing equipment faults of a communication system according to claim 2, wherein the strengthening pretreatment is carried out on the original characteristic data of a numerical type to obtain characteristic variables with higher distinction degree, and the method specifically comprises the following steps:

for original characteristic data with single-side abnormal characteristics, acquiring original characteristic data of a normal side exceeding a normal value range, and assigning corresponding characteristic variables as extreme values of corresponding sides of the normal value range;

for the characteristic variable with the difference between the normal value range and the abnormal value range smaller than the appointed difference value, the gradient of the characteristic variable positioned in the normal value range is reduced, and the gradient of the characteristic variable positioned outside the normal value range is improved.

6. The method for preventing equipment failure of a communication system according to claim 1, wherein the step of setting a corresponding weight for each feature variable to generate a weighted feature variable specifically comprises:

for a numerical or sequential characteristic variable, assigning a corresponding weight to each characteristic variable;

for the logic type characteristic variables, the number of all the characteristic variables mapped by the original characteristic data of the logic type variable is obtained, the weight of the logic type variable is divided into a corresponding number of sub-weights according to the number of the characteristic variables, and the weight of each characteristic variable is designated as a sub-weight.

7. The method for preventing faults of a communication system according to claim 1, wherein the step of forming the weighted feature variables into samples of a single-chain structure according to the topology of the device specifically comprises the steps of:

taking a main control node of the equipment as a root node, and respectively acquiring an uplink tree structure of the uplink direction and a downlink tree structure of the downlink direction of the equipment;

acquiring an uplink single chain from a root node to each leaf node in an uplink tree structure, and acquiring a downlink single chain from the root node to each leaf node in a downlink tree structure;

and obtaining all one-to-one combinations of the uplink single chain and the downlink single chain, using the main control node as a connection point, connecting the uplink single chain and the downlink single chain in each combination into a single-chain structure, and putting the weighted characteristic variables into the nodes corresponding to the single-chain structure.

8. The method for preventing faults of equipment of a communication system according to claim 1, wherein the step of placing the sample of the single-chain structure into a trunk node of a first isolated tree, and identifying abnormal sample points through the first isolated tree specifically comprises:

taking all single-chain structure samples in each device as a sample set, taking out one sample from the sample set, and putting the sample into a trunk node of a first isolated tree;

training the first isolated tree until a sample with isolated weighted feature variable values is obtained, and taking the obtained sample as an abnormal sample point;

when all samples in the sample set of one device are taken out, all samples are put back into the sample set of the device, and the samples are taken out again to train the first isolated tree until the segmentation cannot be continued or the isolated tree reaches the designated height.

9. The method for preventing faults of equipment of a communication system according to claim 1, wherein the step of placing all abnormal sample points into a trunk node of a second isolated tree and reversely identifying abnormal sample clusters through the second isolated tree specifically comprises the steps of:

placing all the abnormal sample points into a trunk node of a second isolated tree, performing iterative computation on the second isolated tree, and removing the isolated abnormal sample points according to the iterative computation result;

and carrying out clustering calculation on the rest abnormal sample points, and acquiring an abnormal sample cluster according to a clustering calculation result.

10. An apparatus for communication system equipment failure prevention, comprising:

at least one processor and a memory connected by a data bus, the memory storing instructions executable by the at least one processor, the instructions, when executed by the processor, being used to perform the method for communication system equipment failure prevention of any of claims 1-9.