CN117014069B

CN117014069B - Fault prediction method, device, electronic equipment, storage medium and program product

Info

Publication number: CN117014069B
Application number: CN202311242423.XA
Authority: CN
Inventors: 罗慧芬; 罗哲; 肖晨; 李�城
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2023-09-25
Filing date: 2023-09-25
Publication date: 2024-01-12
Anticipated expiration: 2043-09-25
Also published as: CN117014069A

Abstract

The application provides a fault prediction method, a fault prediction device, an electronic device, a computer readable storage medium and a computer program product; the method comprises the following steps: acquiring index data of the optical module in a global time sequence; performing local regression processing on the index data of the optical module based on the unit time sequences to obtain local fitting data corresponding to each unit time sequence; combining the local fitting data of the unit time sequences to obtain combined data, and performing global regression on the combined data to obtain global fitting data of the optical module; and carrying out probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located. According to the method and the device, the fault prediction of the link can be achieved through the index data of the optical module, so that the link stability is improved.

Description

Fault prediction method, device, electronic equipment, storage medium and program product

Technical Field

The present application relates to communications technologies, and in particular, to a fault prediction method, apparatus, electronic device, computer readable storage medium, and computer program product.

Background

An optical module is a component of a network data link; it connects network elements to one another and network elements to servers. The data centers of large enterprises typically run millions of interconnected optical modules, and aging optical modules, contaminated optical fibers, loose optical links and the like cause a steady stream of network anomalies. Link faults account for about 20% of all network faults. In the related art, the current faults of an optical communication link can be learned through detection and alarming, but this approach cannot prevent a fault from occurring: it can only raise an alarm after the fault has happened, which affects the stability of the optical communication link.

Disclosure of Invention

The embodiment of the application provides a fault prediction method, a device, electronic equipment, a computer readable storage medium and a computer program product, which can realize the fault prediction of a link through index data of an optical module so as to improve the link stability.

The technical scheme of the embodiment of the application is realized as follows:

the embodiment of the application provides a fault prediction method, which comprises the following steps:

acquiring index data of an optical module in a global time sequence, wherein the global time sequence consists of a plurality of unit time sequences;

Performing local regression processing on the index data of the optical module based on the unit time sequences to obtain local fitting data corresponding to each unit time sequence;

combining the local fitting data of the unit time sequences to obtain combined data, and performing global regression on the combined data to obtain global fitting data of the optical module;

and carrying out probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located.

The embodiment of the application provides a fault prediction device, which comprises:

the acquisition module is used for acquiring index data of the optical module in a global time sequence, wherein the global time sequence consists of a plurality of unit time sequences;

the first regression module is used for carrying out local regression processing on the index data of the optical module based on the unit time sequences to obtain local fitting data corresponding to each unit time sequence;

the second regression module is used for carrying out combination processing on the local fitting data of the plurality of unit time sequences to obtain combined data, and carrying out global regression processing on the combined data to obtain global fitting data of the optical module;

And the prediction module is used for carrying out probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located.

An embodiment of the present application provides an electronic device, including:

a memory for storing computer executable instructions;

and the processor is used for realizing the fault prediction method provided by the embodiment of the application when executing the computer executable instructions stored in the memory.

The embodiment of the application provides a computer readable storage medium, which stores computer executable instructions that, when executed by a processor, implement the fault prediction method provided by the embodiment of the application.

Embodiments of the present application provide a computer program product comprising computer-executable instructions that, when executed by a processor, implement the fault prediction method provided by the embodiments of the present application.

The embodiment of the application has the following beneficial effects:

According to the embodiments of the application, index data of the optical module in the global time sequence is acquired, and local regression processing based on the unit time sequences, followed by global regression processing, is performed on the index data to obtain global fitting data of the optical module. This dual-scale, two-pass linear regression fits the index data of the optical module with high accuracy while effectively reducing the consumption of computing resources. Probability mapping processing is then performed on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located; compared with a scheme that judges by an index threshold, this improves the accuracy of fault prediction.

Drawings

FIG. 1 is a schematic diagram of a failure prediction system provided in an embodiment of the present application;

fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

FIG. 3A is a first flow chart of a fault prediction method according to an embodiment of the present disclosure;

FIG. 3B is a second flow chart of a fault prediction method according to an embodiment of the present disclosure;

FIG. 3C is a third flow chart of a fault prediction method according to an embodiment of the present disclosure;

FIG. 3D is a fourth flowchart of a fault prediction method according to an embodiment of the present disclosure;

fig. 4 is a schematic architecture diagram of a network physical link of the fault prediction method provided in the embodiment of the present application;

fig. 5 is a link failure prediction alarm schematic diagram of the failure prediction method provided in the embodiment of the present application;

fig. 6 is a schematic diagram of a fault prediction flow of the fault prediction method provided in the embodiment of the present application;

FIG. 7 is a schematic diagram of an algorithm of a two-scale linear regression of a fault prediction method provided by an embodiment of the present application;

FIG. 8 is a sample selection schematic of a fault prediction method provided by an embodiment of the present application;

fig. 9 is a network operation schematic diagram of a fault prediction method provided in an embodiment of the present application;

fig. 10 is a schematic diagram of fault prediction of the fault prediction method provided in the embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.

In the following description, the terms "first", "second", "third" and the like are merely used to distinguish similar objects and do not represent a specific ordering of the objects, it being understood that the "first", "second", "third" may be interchanged with a specific order or sequence, as permitted, to enable embodiments of the application described herein to be practiced otherwise than as illustrated or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.

Before further describing embodiments of the present application in detail, the terms and expressions that are referred to in the embodiments of the present application are described, and are suitable for the following explanation.

1) Simple Network Management Protocol (SNMP): a standard protocol for managing network devices, which can be used for monitoring, configuration management, fault diagnosis and remote management of managed devices.

2) Network physical link (hereinafter simply "link"): a connection whose two ends are each one port of a switch.

3) Dominant fault: port flapping, port packet errors, or port shutdown on a network physical link. When a dominant fault occurs, signals transmitted over the affected link cannot be delivered, which degrades network quality.

4) Digital Diagnostic Monitoring (DDM): a technology used in optical modules. Its five core indexes are working temperature, working voltage, working current, transmitting power and receiving power. DDM is used to monitor the working state of the optical module and to help the system locate the module.

5) Huber regression: a robust linear regression method. In ordinary least squares regression, the presence of outliers can shift the regression line substantially; Huber regression introduces a parameter that reduces the influence of outliers on the regression model.

6) Spark: a fast, general-purpose and scalable engine for big data processing and analysis. Spark provides a distributed computing framework that loads data into memory for efficient processing; its core concept is the resilient distributed dataset (RDD), which allows distributed computation across a cluster.

7) Time series stationarity: the invariance of the statistical properties of a time series, that is, the statistical properties remain unchanged across different time periods. This includes strong stationarity, under which the mean, variance and autocorrelation structure of the time series are constant and do not change over time.

8) 3-sigma principle: a common method in statistics. The core idea is that if a process is stable and its results follow a normal distribution, about 99.73% of the data should fall within 3 standard deviations of the mean.

9) Applet: an application that can be used without being downloaded and installed. A user opens it by scanning a code or searching for it, so the user does not need to worry about installing too many applications; the application is available anywhere and at any time, with no need to install or uninstall it.
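As an illustration of terms 5) and 8) above, the following minimal sketch shows a Huber-style loss and a 3-sigma check. It is not taken from the embodiments: the function names, the parameter name `delta` and its default value are assumptions made for illustration only.

```python
import statistics


def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones.

    `delta` (an assumed name and default) is the parameter that limits
    how strongly an outlier can pull the regression line.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def three_sigma_outliers(values):
    """Return the values lying more than 3 standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > 3 * std]


# A residual of 10 costs 50.0 under the squared loss 0.5 * r**2,
# but only 9.5 under the Huber loss with delta = 1.0.
squared = 0.5 * 10.0 ** 2
robust = huber_loss(10.0)
```

The linear tail is what keeps a single extreme reading from dominating the fit, which is the behaviour the Huber regression definition describes.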

The related technology determines a classification threshold value of the working parameter according to a classification sample set corresponding to the working parameter of the optical module, and predicts whether the optical module corresponding to the sequence to be detected will fail in the future according to a comparison result of the classification threshold value and a plurality of measured values in the sequence to be detected.

In implementing the embodiments of the present application, the applicant found that when the number of optical modules exceeds the million level, the technical solutions of the related art have low prediction accuracy, high complexity and excessive consumption of computing resources. Specifically, the prediction accuracy of the index-threshold scheme in the related art is extremely low, so it cannot be applied in a production environment. The machine learning/deep learning classification schemes in the related art struggle to perform well in a production environment when positive and negative samples are extremely imbalanced. Fault prediction is a typical scenario in which abnormal samples are scarce: positive samples (sub-health samples) are extremely few and negative samples (healthy samples) are extremely many, so the related art is hard to apply to a large-scale live network, and a model with many parameters is hard to train well when abnormal samples are scarce. When the number of optical modules exceeds the million level, the number of time series exceeds the hundred-million level, and the prediction schemes in the related art consume considerable computing resources.

An exemplary application of the electronic device provided by the embodiment of the present application is described below, where the electronic device provided by the embodiment of the present application may be implemented as a terminal or a server.

Referring to fig. 1, fig. 1 relates to a server 200, a network 300, and a terminal 400. The terminal 400 is connected to the server 200 through the network 300, and the network 300 may be a wide area network or a local area network, or a combination of both.

In some embodiments, the server 200 may be a server to which an application program corresponds, for example: the application is fault detection software installed in the terminal 400, and the server 200 is a fault detection server for performing a fault prediction process and feeding back a fault prediction result to the terminal for display.

In some embodiments, the terminal 400 receives a failure prediction request for any link, and sends the failure prediction request to the server 200, where the server 200 obtains index data of an optical module included in the link in a global time sequence; performing local regression processing on index data of the optical module based on the unit time sequence to obtain local fitting data corresponding to each unit time sequence; combining the local fitting data of the plurality of unit time sequences to obtain combined data, and performing global regression on the combined data to obtain global fitting data of the optical module; and carrying out probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located. When the probability of the future failure is greater than the threshold value of the probability of the failure, the server 200 predicts that the link will fail in the future, and the server 200 returns the probability of the future failure and the prediction result of the link predicted to fail in the future to the terminal 400 for display.

As an example, the index data of a unit time series is in fact a set of discrete points: the abscissa of each point is a sampling time point of the unit time series, and the ordinate is the index data at that sampling time point. Because the discrete points come from a unit time series (rather than the global time series), they represent local information. The local regression processing comprises linear fitting and sampling. Specifically, a straight line is fitted to the discrete points representing the local information, so that the line characterizes how the index data changes with the sampling time point, and the index data of at least two sampling time points is sampled from the line as the local fitting data. The local fitting data thus expresses the local information of the discrete points of the unit time series.
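The local regression described above (linear fitting followed by sampling) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses ordinary least squares rather than any particular robust method, samples evenly spaced points between the first and last sampling time points, and the names `fit_line` and `local_fit` are hypothetical.

```python
def fit_line(ts, ys):
    """Ordinary least-squares line through the points (ts[k], ys[k]).

    Returns (slope, intercept). Assumes at least two distinct time points.
    """
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def local_fit(ts, ys, n_samples=2):
    """Fit a line to one unit time series and sample n_samples points from it.

    The sampled (time, value) pairs play the role of the local fitting data.
    """
    if n_samples < 2:
        raise ValueError("at least two sampling time points are required")
    slope, intercept = fit_line(ts, ys)
    step = (ts[-1] - ts[0]) / (n_samples - 1)
    sample_ts = [ts[0] + k * step for k in range(n_samples)]
    return [(t, slope * t + intercept) for t in sample_ts]


# One day at 5-minute granularity: 288 points, here lying on y = 2t + 1.
ts = list(range(288))
ys = [2 * t + 1 for t in ts]
local_data = local_fit(ts, ys)
```

With two samples per unit time series, 288 raw points are reduced to 2, which is where the saving in later computation comes from.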

As an example, the local fitting data of one unit time series is the index data of n sampling time points, and the merged data is obtained by merging the local fitting data of the plurality of unit time series. When the number of unit time series in the global time series is m, the merged data is the index data of $m \times n$ sampling time points, which can be regarded as $m \times n$ discrete points whose abscissa is the sampling time point of the unit time series and whose ordinate is the index data of that sampling time point.

As an example, because the discrete points of the merged data come from the global time series, they represent global information. The global regression processing likewise comprises linear fitting and sampling. Specifically, another straight line is fitted to the discrete points representing the global information, so that the line characterizes how the index data changes with the sampling time point, and the index data of at least two sampling time points is sampled from this line as the global fitting data. The global fitting data thus expresses the global information of the discrete points of the global time series.
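The two passes described above (a local fit per unit time series, merging, then a global fit) can be sketched together as follows. This is an illustrative sketch only: ordinary least squares is used for brevity, and a robust fit such as Huber regression could be substituted; the function names and the default of two samples per line are assumptions.

```python
def fit_line(points):
    """Ordinary least-squares line through (t, y) points; returns (slope, intercept)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def sample_line(slope, intercept, t_start, t_end, n):
    """Sample n >= 2 evenly spaced points from the line on [t_start, t_end]."""
    if n < 2:
        raise ValueError("at least two sampling time points are required")
    step = (t_end - t_start) / (n - 1)
    return [(t_start + k * step, slope * (t_start + k * step) + intercept)
            for k in range(n)]


def two_scale_fit(unit_series, n_local=2, n_global=2):
    """Local fit per unit time series, merge, then one global fit."""
    merged = []  # m * n_local points after merging
    for points in unit_series:
        slope, intercept = fit_line(points)
        merged.extend(
            sample_line(slope, intercept, points[0][0], points[-1][0], n_local))
    slope, intercept = fit_line(merged)
    return sample_line(slope, intercept, merged[0][0], merged[-1][0], n_global)


# Three unit time series (m = 3) of 288 points each, all on y = 2t + 1.
day = 288
unit_series = [[(t, 2 * t + 1) for t in range(d * day, (d + 1) * day)]
               for d in range(3)]
global_data = two_scale_fit(unit_series)
```

In this sketch the global fit sees only m × n_local = 6 points instead of 864, which illustrates why the second pass is inexpensive.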

In some embodiments, the terminal 400 sends a polling failure prediction request covering each link to the server 200, and for each link the server 200 acquires index data of an optical module included in the link in a global time sequence; performs local regression processing on the index data of the optical module based on the unit time sequences to obtain local fitting data corresponding to each unit time sequence; combines the local fitting data of the plurality of unit time sequences to obtain combined data, and performs global regression on the combined data to obtain global fitting data of the optical module; and carries out probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located. When the future fault probability is greater than the fault probability threshold, the server 200 predicts that the link will fail in the future, and the server 200 returns the future fault probability and the prediction result that the link will fail in the future to the terminal 400 for display.

In some embodiments, the server 200 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and artificial intelligence platforms. The terminal 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a smart television, a car terminal, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application. The database may be integrated on the server 200 or the database may be provided on a machine independent of the server 200, as embodiments of the present application are not limited.

In some embodiments, the terminal 400 may implement the fault prediction method provided in the embodiments of the present application by running a computer program. For example, the computer program may be a native program or a software module in an operating system; it may be a native application (APP), that is, a program that must be installed in the operating system to run, such as a fault detection APP; it may be an applet, that is, a program that can be used without being downloaded and installed; or it may be an applet that can be embedded in any APP. In general, the computer program described above may be any form of application, module or plug-in.

Referring to fig. 2, fig. 2 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. The electronic device may be a terminal or a server; taking the server as an example, the server shown in fig. 2 includes: at least one processor 210, a memory 250, at least one network interface 220, and a user interface 230. The various components in the server are coupled together by a bus system 240. It is understood that the bus system 240 is used to enable connection and communication between these components. In addition to a data bus, the bus system 240 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as bus system 240 in fig. 2.

The processor 210 may be an integrated circuit chip with signal processing capabilities such as a general purpose processor, such as a microprocessor or any conventional processor, or the like, a digital signal processor (DSP, digital Signal Processor), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.

The user interface 230 includes one or more output devices 231 that enable presentation of media content, which may include one or more speakers and/or one or more visual displays. The user interface 230 also includes one or more input devices 232, which may include user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.

The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 250 optionally includes one or more storage devices physically located remote from processor 210.

Memory 250 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 250 described in embodiments of the present application is intended to comprise any suitable type of memory.

In some embodiments, memory 250 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.

An operating system 251 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;

a network communication module 252 for reaching other electronic devices via one or more (wired or wireless) network interfaces 220, where exemplary network interfaces 220 include: Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB), etc.;

A presentation module 253 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 231 (e.g., a display screen, speakers, etc.) associated with the user interface 230;

an input processing module 254 for detecting one or more user inputs or interactions from one of the one or more input devices 232 and translating the detected inputs or interactions.

In some embodiments, the fault prediction device provided in the embodiments of the present application may be implemented in software. Fig. 2 shows the fault prediction device 255 stored in the memory 250, which may be software in the form of a program, a plug-in, or the like, and includes the following software modules: the acquisition module 2551, the first regression module 2552, the second regression module 2553, the prediction module 2554 and the training module 2555. These modules are logical, and thus may be arbitrarily combined or further split according to the functions implemented. The functions of the respective modules are described below.

In the following, the fault prediction method provided in the embodiment of the present application is described, and as before, the electronic device implementing the fault prediction method in the embodiment of the present application may be a terminal or a server, and a server is described as an example. The execution subject of the respective steps will not be repeated hereinafter. Referring to fig. 3A, the description is given with reference to steps 101 to 104 shown in fig. 3A.

In step 101, index data of the optical module in the global time sequence is acquired.

As an example, every large enterprise and platform has its own data centers, and each data center is built from tens of thousands of physical links (hereinafter referred to as links). Referring to fig. 4, each end of link a (the link formed by port 1 of switch a and port 1 of switch b) has an optical module. An optical module consists of optoelectronic devices, functional circuits, an optical interface and the like, where the optoelectronic devices comprise a transmitting part and a receiving part. In short, the transmitting end of an optical module converts an electrical signal into an optical signal, and after the optical signal has been transmitted through the optical fiber, the receiving end converts it back into an electrical signal. How to obtain the index data of the optical module in the global time sequence is described in detail below.

In some embodiments, the global time sequence is composed of a plurality of unit time sequences, referring to fig. 3B, and the acquiring of the index data of the optical module in the global time sequence in step 101 may be implemented by steps 1011 to 1015 shown in fig. 3B.

In step 1011, for each unit time series in the global time series: and acquiring the original index data of the optical module in the unit time sequence.

As an example, a time series is a sequence of discrete sampling time points, and index data may be acquired at a time granularity of 5 minutes. If the time period corresponding to the global time series is 3 days, the global time series corresponds to 864 sampling time points (3 × 288): minute 0 of the first day is the first sampling time point, and the last sampling time point of the third day is the 864th. The time period corresponding to a unit time series may be 1 day. Taking the first of the three days as an example, the unit time series corresponds to 288 sampling time points (24 × 60 / 5): minute 0 of the first day is the first sampling time point, and the last sampling time point of the first day is the 288th.
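The sampling-point counts in this example can be checked with a short sketch. The start date and the names `GRANULARITY` and `sampling_points` are assumptions made for illustration; the sketch also assumes that the first sampling time point falls exactly on 00:00 of the first day.

```python
from datetime import datetime, timedelta

GRANULARITY = timedelta(minutes=5)  # time granularity from the example


def sampling_points(start, duration):
    """Return the sampling time points covering `duration` from `start`."""
    count = duration // GRANULARITY  # timedelta // timedelta yields an int
    return [start + k * GRANULARITY for k in range(count)]


start = datetime(2023, 9, 25)  # hypothetical first day, 00:00
unit_points = sampling_points(start, timedelta(days=1))    # one unit time series
global_points = sampling_points(start, timedelta(days=3))  # global time series

print(len(unit_points), len(global_points))  # 288 864
```

Under the stated assumption, the 288th point of the first day falls on 23:55 of that day and the 864th on 23:55 of the third day.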

As an example, acquiring the original index data of the optical module in the unit time sequence specifically means acquiring, through the SNMP acquisition system, the original index data of the optical module at the 288 sampling time points corresponding to the unit time sequence.

In some embodiments, obtaining the original index data of the optical module in the unit time sequence in step 1011 may be implemented by the following technical scheme. The following processing is performed for each sampling time point of the unit time series: acquiring a light receiving power index, a light emitting power index and a current index of the optical module at the sampling time point; and forming the original index data of the optical module at the sampling time point from these three indexes. The original index data of the optical module at the plurality of sampling time points then forms the original index data of the optical module in the unit time sequence. In the embodiments of the application, the three indexes most sensitive to the working condition of the link, namely the light receiving power index, the light emitting power index and the current index, are used as the original index data, which improves the accuracy of the subsequent link fault prediction.

Continuing the above example, in which the unit time sequence includes 288 sampling time points, the light receiving power, the light emitting power and the working current of the optical module on the port are obtained. The sampling time point identification set of the unit time sequence is $I = \{i \mid 1 \le i \le 288,\ i \in \mathbb{Z}\}$, where $i$ is a sampling time point identification and $\mathbb{Z}$ represents the integer set. The index dimension set is $J = \{rx, tx, cur\}$, where rx is the light receiving power index, tx is the light emitting power index, and cur is the current index. The original index data of the unit time sequence can form a set $D = \{(t_i, x_i^{rx}, x_i^{tx}, x_i^{cur}) \mid i \in I\}$, where $t_i$ represents the time corresponding to the ith sampling time point, $x_i^{rx}$ represents the light receiving power index corresponding to the ith sampling time point, $x_i^{tx}$ represents the light emitting power index corresponding to the ith sampling time point, and $x_i^{cur}$ represents the current index corresponding to the ith sampling time point.

In step 1012, for each unit time sequence in the global time sequence: performing data cleaning treatment on the original index data of the optical module in the unit time sequence to obtain cleaning index data of the unit time sequence;

As an example, the raw index data collected in practice may contain null values, abnormal values and duplicates, so the raw index data needs to undergo data cleaning processing; the data cleaning process is described in detail below.

In some embodiments, the raw index data of the optical module in the unit time series includes the raw index data of the optical module at each sampling time point in the unit time series. In step 1012, performing data cleaning processing on the original index data of the optical module in the unit time sequence to obtain the cleaning index data of the unit time sequence can be realized by the following technical scheme: removing, from the original index data of the optical module at each sampling time point in the unit time sequence, the original index data that are null values or abnormal values, to obtain retained original index data; and de-duplicating the retained original index data at each sampling time point to obtain the cleaning index data of the unit time sequence. In the embodiments of the application, cleaning the original index data removes abnormal values, null values and repeated values, which improves the representation capability of the original index data and thereby the accuracy of fault prediction.

As an example, the original index data of the optical module at sampling time point i in the unit time series may be expressed as $(t_i, x_i^{rx}, x_i^{tx}, x_i^{cur})$, where $t_i$ represents the time corresponding to the ith sampling time point, $x_i^{rx}$ represents the light receiving power index corresponding to the ith sampling time point, $x_i^{tx}$ represents the light emitting power index corresponding to the ith sampling time point, and $x_i^{cur}$ represents the current index corresponding to the ith sampling time point. During the data cleaning process, null values are removed first, i.e., it is ensured that $x_i^{rx}$, $x_i^{tx}$ and $x_i^{cur}$ are all non-null. Outliers are then processed, i.e., it is ensured that $x_i^{rx}$, $x_i^{tx}$ and $x_i^{cur}$ all lie in their normal value intervals; the lower and upper limit values of the light receiving power index, the light emitting power index and the current index are preset experimentally, and these limits bound the normal value interval. After null values and outliers are removed, repeated values are removed, i.e., it is ensured that there is only one current index, one light receiving power index and one light emitting power index at the same sampling time point. This removes repeatedly collected index data: the original index data are sometimes collected twice at the same sampling time point, which yields two identical original index data.
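The cleaning order described above (null values, then abnormal values, then repeated values) can be sketched as follows. This is a minimal illustration only: the record layout, the function name `clean` and the numeric limits in `LIMITS` are assumptions, since the application states only that the limits are preset experimentally.

```python
from typing import Dict, List, Optional, Tuple

# One raw record: (sampling time point index, rx, tx, cur); a metric may be None.
Record = Tuple[int, Optional[float], Optional[float], Optional[float]]

# Hypothetical normal value intervals (lower, upper) per index dimension.
# The application gives no numbers; these are placeholders for illustration.
LIMITS: Dict[str, Tuple[float, float]] = {
    "rx": (-20.0, 5.0),
    "tx": (-10.0, 5.0),
    "cur": (0.0, 100.0),
}


def clean(records: List[Record]) -> List[Record]:
    """Drop records with null values, then out-of-range values, then repeats."""
    kept: List[Record] = []
    seen_time_points = set()
    for t, rx, tx, cur in records:
        values = {"rx": rx, "tx": tx, "cur": cur}
        # 1) remove null values
        if any(v is None for v in values.values()):
            continue
        # 2) remove values outside the normal value interval
        if any(not (LIMITS[k][0] <= v <= LIMITS[k][1]) for k, v in values.items()):
            continue
        # 3) keep only the first retained record per sampling time point
        if t in seen_time_points:
            continue
        seen_time_points.add(t)
        kept.append((t, rx, tx, cur))
    return kept
```

Applying the three checks per record in this order gives the same result as three sequential passes, because deduplication only ever sees records that survived the first two checks.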

In step 1013, for each unit time series in the global time series: and carrying out data compression processing on the cleaning index data to obtain compression index data of the unit time sequence.

As an example, the data cleaning process is followed by a row-column transformation process. When massive data is processed through Spark, data compression can be achieved by converting row data into column data, which significantly improves computing performance. Specifically, the 288 rows of a given optical module are aggregated in advance into one row of 288 columns, and the row-column transformation can be defined as $\{(t_i, x_i^j) \mid i\in I\} \rightarrow X^j=(x_1^j, x_2^j, \ldots, x_{288}^j)$, $j\in J$, where $J$ is the index dimension set (comprising a current index, a light receiving power index and a light emitting power index), $I$ is the sampling time point identification set, $t_i$ represents the time corresponding to the ith sampling time point, $x_i^j$ represents the cleaning index data of index dimension j corresponding to the ith sampling time point, and $X^j$ is the 288 columns (one per sampling time point) of index data corresponding to index j.
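The row-column transformation can be illustrated with a small sketch in plain Python rather than Spark (the application does not give its Spark code). The function name `rows_to_columns` and the tuple layout are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def rows_to_columns(rows: List[Tuple[int, float, float, float]]) -> Dict[str, List[float]]:
    """Aggregate per-time-point rows (t, rx, tx, cur) into one
    time-ordered column list per index dimension."""
    ordered = sorted(rows, key=lambda r: r[0])  # order by sampling time point
    return {
        "rx": [r[1] for r in ordered],
        "tx": [r[2] for r in ordered],
        "cur": [r[3] for r in ordered],
    }
```

For one optical module and one day, the input would hold 288 rows and each returned list would hold 288 values.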

In step 1014, for each unit time sequence in the global time sequence: and carrying out standardization processing on the compressed index data of the unit time sequence to obtain the index data of the unit time sequence.

In some embodiments, the compressed index data of the unit time series includes compressed index data of a plurality of index dimensions for each sampling time point in the unit time series; in step 1014, the compressed index data of the unit time series is normalized to obtain index data of the unit time series. The standardized treatment can be realized by the following technical scheme: carrying out standardization processing on the compressed index data of each index dimension to obtain standardized index data of each sampling time point in each index dimension; forming the standardized index data of each sampling time point in a plurality of index dimensions into index data of each sampling time point; the index data of the plurality of sampling time points are combined into index data of a unit time series.

Wherein the compressed index data for each index dimension performs the following processing: obtaining the maximum compression index data and the minimum compression index data in the compression index data of index dimensions at a plurality of sampling time points; acquiring a first difference value between the maximum compression index data and the minimum compression index data; acquiring a second difference value between the compressed index data of the index dimension of each sampling time point and the minimum compressed index data; and taking the ratio of the second difference value corresponding to each sampling time point to the first difference value as standardized index data of the sampling time point in the index dimension.

All compression index data can be standardized to the same level through the embodiment of the application, so that local fitting data and global fitting data which accord with time sequence stability can be fitted faster when double-scale fitting is carried out later.

By way of example, data normalization is defined by formula (1):

$$y_i^j=\frac{x_i^j-x_{\min}^j}{x_{\max}^j-x_{\min}^j},\quad i\in I,\ j\in J \tag{1}$$

wherein $J$ is the index dimension set, $y_i^j$ is the normalized index data of index dimension j at sampling time point i, $x_i^j$ is the compressed index data of index dimension j at sampling time point i, $x_{\min}^j$ is the minimum compressed index data among the compressed index data of all sampling time points corresponding to index dimension j, and $x_{\max}^j$ is the maximum compressed index data among the compressed index data of all sampling time points corresponding to index dimension j.
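As a small illustration of the normalization described above (ratio of the second difference to the first difference), the following sketch scales one index dimension to the range 0 to 1. The function name is hypothetical, and the handling of a constant series, for which the first difference is zero, is an assumption because the application does not address that case.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale the values of one index dimension to [0, 1]."""
    lowest, highest = min(values), max(values)
    first_difference = highest - lowest
    if first_difference == 0:
        # Assumption: a constant series maps to 0.0 (the ratio is undefined).
        return [0.0 for _ in values]
    # second difference (value - minimum) divided by first difference
    return [(v - lowest) / first_difference for v in values]
```

For example, `min_max_normalize([2.0, 4.0, 6.0])` gives `[0.0, 0.5, 1.0]`.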

The index data of the plurality of unit time series are composed into index data of the global time series in step 1015.

In step 102, the index data of the optical module is subjected to local regression processing based on the unit time series, so as to obtain local fitting data corresponding to each unit time series.

As an example, the global time series is composed of a plurality of unit time series. For example, a global time series is composed of the sampling time points from January 1, 2022 to January 3, 2022, and the unit time series are the sequence composed of the sampling time points of January 1, 2022, the sequence composed of the sampling time points of January 2, 2022, and the sequence composed of the sampling time points of January 3, 2022. The local regression processing executes linear regression processing for each of the 3 unit time series, so that local fitting data corresponding to each of the 3 unit time series is finally obtained.

In some embodiments, referring to fig. 3C, the performing, in step 102, the local regression processing based on the unit time series on the index data of the optical module to obtain the local fitting data corresponding to each unit time series may be implemented by performing the following processing on each unit time series in steps 1021 to 1023 shown in fig. 3C.

In step 1021, the index data of the unit time series is subjected to a unitary regression process to obtain a first slope of the index data and a first intercept of the index data.

In some embodiments, the unitary regression process is equivalent to performing linear fitting on a plurality of discrete points, where the abscissa of the discrete points in the coordinate axis is a sampling time point, and the ordinate is index data corresponding to the sampling time point, and a straight line (a first slope and a first intercept) representing a relationship between the index data and time can be obtained through fitting, and in step 1021, performing the unitary regression process on the index data of the unit time sequence to obtain the first slope of the index data and the first intercept of the index data, which can be achieved by the following technical scheme: acquiring an initial slope of index data and an initial intercept of the index data; determining a local loss based on the index data for each sampling time point, an initial intercept of the index data, and an initial slope of the index data; and updating the initial intercept of the index data and the initial slope of the index data by taking the minimized local loss as a constraint condition to obtain a first intercept of the index data and a first slope of the index data. According to the embodiment of the application, the first slope and the first intercept of the index data change representing the unit time sequence can be fitted based on the constraint condition of minimizing the local loss.

As an example, the embodiment of the present application performs linear regression based on a unitary equation, so parameters to be fitted are an intercept and a slope, an initial slope and an initial intercept are initialization data in a fitting process, and the initialization data needs to be updated in the fitting process, where an update target is to minimize a local loss, and a first intercept and a first slope when the local loss is minimized are obtained.

In some embodiments, determining the local loss based on the index data of each sampling time point, the initial intercept of the index data and the initial slope of the index data may be implemented by the following technical solution. The following processing is performed for each sampling time point: performing unitary mapping processing on the sampling time point based on the initial intercept of the index data and the initial slope of the index data to obtain a first mapping result; when the absolute difference between the first mapping result and the index data of the sampling time point is not greater than an absolute value threshold, obtaining a sampling-time-point local loss that is positively correlated with the square of the absolute difference; when the absolute difference between the first mapping result and the index data of the sampling time point is greater than the absolute value threshold, obtaining a first standard value that is positively correlated with the square of the absolute value threshold, multiplying the absolute difference by the absolute value threshold, and subtracting the first standard value from the multiplication result to obtain the sampling-time-point local loss. The sampling-time-point local losses of the sampling time points are then summed to obtain the local loss.

As an example, to reduce the problem that the conventional least square method is too sensitive to outliers, the embodiment of the present application uses the Huber regression loss function as the local loss to perform the optimization fit, see formula (2):

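$$\hat{y}_i^j=k^j t_i+b^j,\qquad (k^j,b^j)=\mathop{\arg\min}_{k^j,\,b^j\in\mathbb{R}}\ \sum_{i\in I}L_\delta\left(y_i^j,\hat{y}_i^j\right) \tag{2}$$

wherein $y_i^j$ is the index data of index dimension j of sampling time point i obtained by step 101, $\hat{y}_i^j$ is the index data obtained by fitting, $\hat{y}_i^j=k^j t_i+b^j$ (first mapping result), $k^j$ and $b^j$ are the slope and the intercept, respectively, $L_\delta$ is the Huber regression loss function, $\arg\min$ refers to obtaining by fitting the $k^j$ and $b^j$ that minimize the sum, $\mathbb{R}$ is the real number set, $L_\delta(y_i^j,\hat{y}_i^j)$ is the sampling-time-point local loss corresponding to sampling time point i, and $\sum_{i\in I}L_\delta(y_i^j,\hat{y}_i^j)$ is the local loss.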

The Huber regression loss function can be found in equation (3):

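$$L_\delta(y,\hat{y})=\begin{cases}\dfrac{1}{2}(y-\hat{y})^2, & |y-\hat{y}|\le\delta\\[6pt] \delta\,|y-\hat{y}|-\dfrac{1}{2}\delta^2, & |y-\hat{y}|>\delta\end{cases} \tag{3}$$

where $y$ is the index data of index dimension j of sampling time point i obtained by step 101, $\hat{y}$ is the index data obtained by fitting (first mapping result), $\delta$ is the absolute value threshold, and $L_\delta(y,\hat{y})$ is the Huber regression loss value (the sampling-time-point local loss at the sampling time point).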
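The Huber loss and a line fit that minimizes its sum can be sketched as below. The application does not say which optimizer updates the initial slope and intercept, so the iteratively reweighted least squares loop used here is an assumption chosen for brevity; the function names, the default threshold `delta=1.0` and the iteration count are likewise illustrative. The returned loss is simply the minimized Huber sum and is not claimed to be the "first error" of the application, which is not defined in this passage.

```python
from typing import List, Tuple


def huber(y: float, y_hat: float, delta: float) -> float:
    """Huber loss: quadratic within delta, linear beyond it."""
    r = abs(y - y_hat)
    if r <= delta:
        return 0.5 * r * r
    return delta * r - 0.5 * delta * delta


def fit_huber_line(t: List[float], y: List[float], delta: float = 1.0,
                   iters: int = 100) -> Tuple[float, float, float]:
    """Fit y ~ k * t + b by minimizing the summed Huber loss.

    Assumed optimizer: iteratively reweighted least squares, where a point
    with residual r gets weight 1 if |r| <= delta and delta / |r| otherwise.
    Requires delta > 0.
    """
    k, b = 0.0, sum(y) / len(y)  # initial slope and initial intercept
    for _ in range(iters):
        w = []
        for ti, yi in zip(t, y):
            r = abs(yi - (k * ti + b))
            w.append(1.0 if r <= delta else delta / r)
        sw = sum(w)
        mt = sum(wi * ti for wi, ti in zip(w, t)) / sw  # weighted mean of t
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
        var = sum(wi * (ti - mt) ** 2 for wi, ti in zip(w, t))
        if var == 0:
            break  # all time points identical: slope cannot be estimated
        k = sum(wi * (ti - mt) * (yi - my) for wi, ti, yi in zip(w, t, y)) / var
        b = my - k * mt
    loss = sum(huber(yi, k * ti + b, delta) for ti, yi in zip(t, y))
    return k, b, loss
```

Because residuals beyond the threshold contribute only linearly, a single extreme sample pulls the fitted line far less than it would under ordinary least squares.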

In step 1022, the first slope of the index data, the first intercept of the index data, and the first error of the index data are combined into first regression data of the index data.

By way of example, the $k^j$ (first slope) and $b^j$ (first intercept) obtained by fitting when the local loss is minimized, together with the first error $e^j$ of the index data, compose the first regression data of the index data. For index dimension j (for example, light emitting power), the first slope, the first intercept and the first error corresponding to index dimension j compose the first regression data of index dimension j, and the first regression data corresponding to the plurality of index dimensions j compose the first regression data of the index data.

In step 1023, mapping processing is performed on the target sampling time points of the unit time series based on the first regression data, so as to obtain local fitting data of the unit time series.

In some embodiments, the target sampling time points include a first sampling time point and a last sampling time point, and in step 1023, mapping processing is performed on the target sampling time points of the unit time sequence based on the first regression data to obtain local fitting data of the unit time sequence, which may be implemented by the following technical scheme: multiplying a first slope in the first regression data with a first sampling time point of a unit time sequence to obtain a first multiplication result; adding the first multiplication result and the first intercept to obtain local fitting data corresponding to a first sampling time point; multiplying the first slope in the first regression data with the last sampling time point of the unit time sequence to obtain a second multiplication result; adding the second multiplication result and the first intercept to obtain local fitting data corresponding to the last sampling time point; the local fitting data corresponding to the first sampling time point and the local fitting data corresponding to the last sampling time point are formed into local fitting data of a unit time sequence. According to the embodiment of the application, data compression can be realized, before compression, the unit time sequence is described by index data of 288 sampling time points, mapping processing can be performed only by index data of 2 sampling time points, and data processing complexity is reduced and data processing efficiency is improved.

As an example, mapping the target sampling time points of the unit time series based on the first regression data to obtain the local fitting data of the unit time series is in effect data compression. Since the index data is time-series stable, it can be fitted using linear regression. To reduce the data amount, the first regression data $(k^j, b^j, e^j)$ is used, where $k^j$ is the slope obtained by fitting, $b^j$ is the intercept obtained by fitting, $e^j$ is the first error, and $\varepsilon_1$ is an experimental value used as a threshold. When $e^j>\varepsilon_1$, the fitting result is poor, which indicates that the data collected from the optical module is extremely unstable; this is related to the port state of the equipment, offline manual operation and the like, and the data is treated as abnormal. The index data of index dimension j corresponding to the unit time sequence therefore needs to be removed, that is, the index data of the unit time sequence is replaced for fitting. When $e^j\le\varepsilon_1$, the fitting result is good, and the index data of index dimension $j$ of the unit time sequence is compressed into the fitting results $\hat{y}_1^j$ and $\hat{y}_{288}^j$ of the 1st sampling time point and the 288th sampling time point, see formula (4) and formula (5):

$$\hat{y}_1^j=k^j t_1+b^j,\quad j\in J \tag{4}$$

$$\hat{y}_{288}^j=k^j t_{288}+b^j,\quad j\in J \tag{5}$$

wherein $\hat{y}_1^j$ is the fitting result of the 1st sampling time point of index dimension j (the local fitting data corresponding to the first sampling time point), $\hat{y}_{288}^j$ is the fitting result of the 288th sampling time point of index dimension j (the local fitting data corresponding to the last sampling time point), $k^j$ is the slope obtained by fitting, $b^j$ is the intercept obtained by fitting, and $J$ is the index dimension set.
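The compression of one unit time sequence to its two endpoint values can be sketched as follows. The sketch assumes that sampling time points are expressed as indices 1 to 288 and that the error gate is a simple comparison against an experimentally chosen threshold; the function name and parameter names are hypothetical.

```python
from typing import Optional, Tuple


def local_fit(k: float, b: float, err: float, err_threshold: float,
              t_first: float = 1.0, t_last: float = 288.0) -> Optional[Tuple[float, float]]:
    """Map the first and last sampling time points through the fitted line.

    Returns None when the fit error exceeds the threshold, meaning the unit
    time sequence is treated as abnormal and is not compressed.
    """
    if err > err_threshold:
        return None
    return (k * t_first + b, k * t_last + b)
```

With a slope of 0.5 and an intercept of 1.0, the 288 samples of a day reduce to the pair `(1.5, 145.0)`.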

In step 103, the local fitting data of the plurality of unit time sequences are combined to obtain combined data, and global regression processing is performed on the combined data to obtain global fitting data of the optical module.

As an example, to capture the time series characteristics of the optical module over 3 consecutive days, the embodiment of the present application performs a second linear regression, i.e., a global linear regression, on the local fitting data of the unit time series of the 3 consecutive days. The local fitting data of the plurality of unit time series are first merged to obtain merged data. Specifically, formula (4) and formula (5) give the local fitting data $(\hat{y}_1^j, \hat{y}_{288}^j)$ of the optical module on day 1; local linear regression on the index data of day 2 gives the local fitting data $(\hat{y}_{289}^j, \hat{y}_{576}^j)$, and local linear regression on the index data of day 3 gives the local fitting data $(\hat{y}_{577}^j, \hat{y}_{864}^j)$. The fitting data of day 1, day 2 and day 3 are spliced along the time dimension to obtain a 6-column vector, so that the total of 864 columns of index data of day 1, day 2 and day 3 is compressed into 6 columns, a 144-fold compression.

In some embodiments, referring to fig. 3D, global regression processing is performed on the combined data in step 103 to obtain global fitting data of the optical module, which may be implemented through steps 1031 to 1033 shown in fig. 3D.

In step 1031, unitary regression processing is performed on the combined data to obtain a second slope of the index data and a second intercept of the index data.

In some embodiments, in step 1031, the unitary regression processing is performed on the combined data to obtain the second slope of the index data and the second intercept of the index data, which may be implemented by the following technical scheme: acquiring an initial slope of index data and an initial intercept of the index data; determining a global loss based on the index data for each sampling time point, an initial intercept of the index data, and an initial slope of the index data; and updating the initial intercept of the index data and the initial slope of the index data by taking the minimized global loss as a constraint condition to obtain a second intercept of the index data and a second slope of the index data.

As an example, the embodiment of the present application performs linear regression based on a unitary equation, so parameters to be fitted are an intercept and a slope, an initial slope and an initial intercept are initialization data in a fitting process, and the initialization data needs to be updated in the fitting process, where an update target is to minimize a global loss, and a second intercept and a second slope when the global loss is minimized are obtained.

As an example, determining the global loss based on the index data of each sampling time point, the initial intercept of the index data and the initial slope of the index data may be achieved by the following technical scheme. The following processing is performed for each sampling time point: performing unitary mapping processing on the sampling time point based on the initial intercept of the index data and the initial slope of the index data to obtain a second mapping result; when the absolute difference between the second mapping result and the index data of the sampling time point is not greater than an absolute value threshold, obtaining a sampling-time-point global loss that is positively correlated with the square of the absolute difference; when the absolute difference between the second mapping result and the index data of the sampling time point is greater than the absolute value threshold, obtaining a second standard value that is positively correlated with the square of the absolute value threshold, multiplying the absolute difference by the absolute value threshold, and subtracting the second standard value from the multiplication result to obtain the sampling-time-point global loss. The sampling-time-point global losses of the sampling time points are then summed to obtain the global loss. For details, reference may be made to the embodiment of calculating the local loss.

In step 1032, the second slope of the index data, the second intercept of the index data, and the second error of the index data are combined into second regression data of the index data.

The $K^j$ (second slope) and $B^j$ (second intercept) obtained by fitting when the global loss is minimized, together with the second error $E^j$ of the index data, compose the second regression data of the index data. For index dimension j (e.g., light emitting power), the second slope, the second intercept and the second error corresponding to index dimension j compose the second regression data of index dimension j, and the second regression data corresponding to the plurality of index dimensions j compose the second regression data of the index data.

In step 1033, mapping processing is performed based on the second regression data, so as to obtain global fitting data of the global time series.

As an example, the second regression data obtained in step 1032 may be subjected to a data cleaning process before step 1033 is performed. $\varepsilon_2$ is an experimental value used as a threshold. When the second error $E^j>\varepsilon_2$, the result of fitting with the Huber regression loss function is poor, which indicates that the index data of the optical module acquired over the 3 consecutive days is extremely unstable; the optical module may have just been brought online, be about to go offline, be under debugging, and so on. The index data of index dimension j corresponding to the unit time sequence therefore needs to be removed, that is, the index data of the global time sequence is replaced for fitting.

In some embodiments, in step 1033, mapping is performed based on the second regression data to obtain global fitting data of the global time sequence, which may be implemented by the following technical scheme: multiplying the second slope in the second regression data with the first sampling time point of the global time sequence to obtain a third multiplication result; adding the third multiplication result and the second intercept to obtain global fitting data corresponding to the first sampling time point; multiplying the second slope in the second regression data with the last sampling time point of the global time sequence to obtain a fourth multiplication result; adding the fourth multiplication result and the second intercept to obtain global fitting data corresponding to the last sampling time point; the global fitting data corresponding to the first sampling time point, the global fitting data corresponding to the last sampling time point and the second slope are formed into global fitting data.

As an example, a mapping process may be performed after the data cleaning process. The mapping process is equivalent to data compression: for index dimension $j$, the parameters $K^j$ and $B^j$ are used to fit columns 1 and 864, see formulas (6) and (7):

$$\hat{Y}_1^j=K^j t_1+B^j,\quad j\in J \tag{6}$$

$$\hat{Y}_{864}^j=K^j t_{864}+B^j,\quad j\in J \tag{7}$$

wherein $\hat{Y}_1^j$ is the global fitting result of the 1st sampling time point of index dimension j, $\hat{Y}_{864}^j$ is the global fitting result of the 864th sampling time point of index dimension j, $K^j$ is the second slope obtained by fitting, $B^j$ is the second intercept obtained by fitting, and $J$ is the index dimension set.

Thus, for index dimension $j$, the global fitting data of the optical module's time-series data over 3 consecutive days is represented as a 3-column vector $(\hat{Y}_1^j, \hat{Y}_{864}^j, K^j)$, $j\in J$, where $J$ is the index dimension set, $K^j$ is the second slope of index dimension $j$, $\hat{Y}_1^j$ is the global fitting result at sampling time point 1, and $\hat{Y}_{864}^j$ is the global fitting result at sampling time point 864. At this point, the total of 2592 columns (3 index dimensions) of time-series data of the optical module over 3 consecutive days can be represented as a 9-column vector (3 index dimensions), a compression ratio of 288 times.
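The global stage can be sketched as follows for a single index dimension. To keep the example short it fits the six merged points with ordinary least squares, which is a stand-in: the application minimizes a Huber loss at this stage as well. The time indices 1, 288, 289, 576, 577 and 864 for the merged endpoints, and the function names, are assumptions for illustration.

```python
from typing import List, Tuple


def ols_line(t: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y ~ k * t + b (stand-in for Huber)."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    var = sum((ti - mt) ** 2 for ti in t)
    k = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / var
    return k, my - k * mt


def global_fit(daily_endpoints: List[Tuple[float, float]],
               points_per_day: int = 288) -> Tuple[float, float, float]:
    """Merge per-day (first, last) local fits, fit one line over all days,
    and return (fit at first time point, fit at last time point, slope)."""
    t: List[float] = []
    y: List[float] = []
    for day, (first, last) in enumerate(daily_endpoints):
        t += [day * points_per_day + 1.0, (day + 1.0) * points_per_day]
        y += [first, last]
    slope, intercept = ols_line(t, y)
    t_first, t_last = 1.0, float(len(daily_endpoints) * points_per_day)
    return (slope * t_first + intercept, slope * t_last + intercept, slope)
```

With three days and three index dimensions, this yields 3 values per dimension, i.e., 9 values in place of 3 × 864 = 2592 samples, which matches the 288-fold compression stated above.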

In step 104, probability mapping processing is performed on the global fitting data of the optical module, so as to obtain the future failure probability of the link where the optical module is located.

In some embodiments, before probability mapping is performed on global fitting data of the optical module to obtain future fault probability of the optical module, acquiring index data of a global time sequence before actual fault occurrence as a positive sample, and acquiring index data of a global time sequence after actual fault repair as a negative sample; global fitting data corresponding to the positive samples and global fitting data corresponding to the negative samples are obtained; generating a distribution interval of the positive sample based on global fitting data of the positive sample, and generating a distribution interval of the negative sample based on global fitting data of the negative sample; taking a coincidence interval between a distribution interval of the positive sample and a distribution interval of the negative sample as an abnormal distribution interval; in step 104, probability mapping processing is performed on global fitting data of the optical module to obtain future fault probability of the optical module, which can be realized by the following technical scheme: when the global fitting data of the optical module is in an abnormal distribution interval, the probability of the corresponding abnormal distribution interval is used as the future fault probability of the link where the optical module is located.

As an example, a data set is first generated. In the work order system of the last half year, 3106 link alarms occurred in total. Time series data from 1 week to 3 weeks before the time of failure is sampled to generate positive samples (sub-health samples), time series data of the 2 weeks after the end of the offline processing of the failed link is sampled to generate negative samples (health samples), and a data set A is generated based on the positive samples and the negative samples. Double-scale linear regression processing is then performed to obtain global fitting data of the samples in the training set. For any index dimension $j\in J$, the global fitting data $G^j$ is an $N\times 3$ matrix whose rows are $(\hat{Y}_1^j, \hat{Y}_{864}^j, K^j)$, where $J$ is the index dimension set, $N$ is the number of optical modules in the training set samples, $\hat{Y}_1^j$ is the global fitting result at sampling time point 1, $\hat{Y}_{864}^j$ is the global fitting result at the 864th sampling time point, and $K^j$ is the second slope obtained by fitting. The global fitting data $G^j$ of the training set is then used for parameter estimation: the value ranges of $\hat{Y}_1^j$, $\hat{Y}_{864}^j$ and $K^j$ in the samples are each estimated using the 3sigma principle. Taking $K^j$ as an example ($\hat{Y}_1^j$ and $\hat{Y}_{864}^j$ are handled similarly), according to the 3sigma principle, its distribution interval in the positive samples is $S_+=[\mu_+-3\sigma_+,\ \mu_++3\sigma_+]$, where $\mu_+$ is the mean of the second slope $K^j$ in the positive samples and $\sigma_+$ is the standard deviation of the second slope $K^j$ in the positive samples. In the negative samples, its distribution interval is $S_-=[\mu_--3\sigma_-,\ \mu_-+3\sigma_-]$, where $\mu_-$ is the mean of the second slope $K^j$ in the negative samples and $\sigma_-$ is the standard deviation of the second slope $K^j$ in the negative samples. The coincidence interval of the distribution intervals $S_+$ and $S_-$ is defined as the distribution interval of outliers, whereby the abnormal distribution interval $S_{abn}$ of the second slope is obtained.
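The 3sigma parameter estimation for one feature (for example the second slope) can be sketched as follows, following the statement that the coincidence interval of the positive-sample and negative-sample distribution intervals is taken as the abnormal distribution interval. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the application does not say which estimator is used.

```python
from statistics import mean, pstdev
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


def three_sigma_interval(values: List[float]) -> Interval:
    """Distribution interval [mean - 3*std, mean + 3*std] of one feature."""
    mu, sigma = mean(values), pstdev(values)  # assumption: population std
    return (mu - 3 * sigma, mu + 3 * sigma)


def coincidence_interval(a: Interval, b: Interval) -> Optional[Interval]:
    """Overlap of two closed intervals, or None if they do not overlap."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None
```

A typical use would compute `three_sigma_interval` once on the positive samples and once on the negative samples, then pass both results to `coincidence_interval`.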

As an example, in the embodiment of the present application, the probability of future failure of the link where the optical module is located is determined from the index data of the optical module, specifically from three feature data in the global fitting data: the second slope $K^j$, the global fitting data corresponding to the first sampling time point, and the global fitting data corresponding to the last sampling time point. Each feature data has an abnormal distribution interval, calculated in the same way as for the second slope $K^j$. When at least one of the global fitting data corresponding to the first sampling time point and the global fitting data corresponding to the last sampling time point is in its abnormal distribution interval, and the second slope $K^j$ is also in its abnormal distribution interval, the probability that the link where the optical module is located fails within the next month is 90%. When only the second slope $K^j$ … When none of the three feature data falls into the abnormal distribution interval, the probability that the link where the optical module is located fails within the next month is less than 30%. When there are multiple index dimensions, a type of feature data is considered to fall into the abnormal distribution interval as long as the feature data of one index dimension falls into it.
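The two fully stated cases of this probability mapping can be sketched as a simple rule for one index dimension. Any combination that the passage does not fully specify is returned as "unspecified" rather than guessed; the function names and the string return values are illustrative assumptions.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]


def in_interval(x: float, interval: Optional[Interval]) -> bool:
    """True if x lies in the closed interval; False if no interval exists."""
    return interval is not None and interval[0] <= x <= interval[1]


def failure_level(y_first: float, y_last: float, slope: float,
                  abn_first: Optional[Interval], abn_last: Optional[Interval],
                  abn_slope: Optional[Interval]) -> str:
    """Map the three global-fit features to the stated one-month probability."""
    slope_abnormal = in_interval(slope, abn_slope)
    endpoint_abnormal = in_interval(y_first, abn_first) or in_interval(y_last, abn_last)
    if slope_abnormal and endpoint_abnormal:
        return "90%"
    if not slope_abnormal and not endpoint_abnormal:
        return "<30%"
    # Remaining combinations are not fully specified in the text above.
    return "unspecified"
```

With several index dimensions, one would first combine the per-dimension checks with a logical OR for each feature type, as the passage describes, and then apply the same rule.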

In the embodiment of the present application, the index data of the optical module in the global time sequence is obtained, and local regression processing based on the unit time sequence and global regression processing are performed on the index data to obtain global fitting data of the optical module. This double-scale, two-pass linear regression fits the index data of the optical module with high accuracy while effectively reducing the consumption of computing resources. Probability mapping processing is then performed on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located; compared with a scheme that judges directly by an index threshold, this can improve the accuracy of fault prediction.

In the following, an exemplary application of the embodiments of the present application in a practical application scenario will be described.

In some embodiments, the terminal receives a failure prediction request for any link, and sends the failure prediction request to the server, and the server acquires index data of an optical module included in the link in a global time sequence; performing local regression processing on index data of the optical module based on the unit time sequence to obtain local fitting data corresponding to each unit time sequence; combining the local fitting data of the plurality of unit time sequences to obtain combined data, and performing global regression on the combined data to obtain global fitting data of the optical module; and carrying out probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located. And when the future failure probability is larger than the failure probability threshold, the server predicts that the link will fail in the future, and returns the future failure probability and the prediction result of predicting that the link will fail in the future to the terminal for display.

In some embodiments, referring to fig. 4, an optical module is inserted at one port of a switch, and there is an optical module at each end of link A (the link formed by port 1 of switch a and port 1 of switch b). The SNMP acquisition system of the switch collects in real time the working temperature, the working voltage, the working current, the transmitting power and the receiving power of the optical module on the port, 5 continuous indexes in total. The enterprise data center has millions of optical modules, and about 20 billion raw data records are produced each day.

Referring to fig. 5, the following indexes are acquired based on the SNMP acquisition system: light receiving power, light emitting power and current. Based on these indexes, it is predicted whether each link of the whole network will have an overt fault such as port flapping or port packet errors within 1 month; the fault level, the alarm reason, the specific network equipment and port information and the like are displayed visually, and the corresponding personnel are notified to carry out offline troubleshooting to prevent the fault. When the algorithm predicts that a certain link is about to fail, a fault prediction alarm for the link is generated. For example, a link fault prediction alarm is generated for the link where port 1 of switch a and port 1 of switch b are located; the network work order system receives the early warning and the link is checked offline, which allows manual intervention in advance.

In some embodiments, referring to fig. 6, the optical module-based fault prediction process includes: data preprocessing, double-scale linear regression processing and pattern recognition. The data preprocessing consists of data cleaning, row-column transformation and data standardization; the double-scale linear regression process consists of local regression process and global regression process; pattern recognition consists of data set generation processing, feature extraction, parameter estimation, and parameter verification. Each process is explained in detail below.

The specific flow of data preprocessing is first described below.

The full data acquired for link A in one day by the SNMP system is defined as the data set $D$, and the cleaned data set is defined as $D'$.

One day contains 1440 minutes. With data collected at a granularity of 5 minutes, 288 numerical points are obtained per day. The sampling time point identifier set is $I = \{i \mid 1 \le i \le 288,\ i \in \mathbb{Z}\}$, where $i$ is the sampling time point identifier and $\mathbb{Z}$ is the set of integers. The index dimension set is $J = \{rx, tx, cur\}$, where $rx$ represents the received optical power index, $tx$ represents the transmitted optical power index, and $cur$ represents the current index. The full data set is $D = \{(t_i, x_i^{rx}, x_i^{tx}, x_i^{cur}) \mid i \in I\}$, where $D$ represents the full data set, $t_i$ represents the i-th sampling time point in the day, $x_i^{rx}$ is the value of the received optical power index at $t_i$, $x_i^{tx}$ is the value of the transmitted optical power index at $t_i$, and $x_i^{cur}$ is the value of the current index at $t_i$. For each index $j \in J$, the set $D$ can be described as $D^j = \{(t_i, x_i^j) \mid i \in I\}$, where $t_i$ represents the time corresponding to the i-th sampling time point and $x_i^j$ represents the original index data of index dimension $j$ at the i-th sampling time point.

First, data cleaning is performed. Null values are removed from the set $D$ so that every value $x_i^j$ (the index data of index $j$ at the i-th sampling time point) is non-null. A null value may be replaced with an average value; for example, if the transmitted optical power of the optical module at the i-th sampling time point is null, it is replaced with the average of the transmitted optical power of the optical module over a plurality of sampling time points. Outliers are then processed so that every $x_i^j$ lies in the normal value interval $[l_j, u_j]$, where $l_j$ and $u_j$ are the lower and upper limit values of the j-th index, i.e. $l_j \le x_i^j \le u_j$. An abnormal value may likewise be replaced with an average value; for example, if the transmitted optical power of the optical module at the i-th sampling time point is abnormal, it is replaced with the average of the transmitted optical power of the optical module over a plurality of sampling time points. The set after removal of null values and abnormal values is $\{(t_i, x_i^j) \mid x_i^j \neq \mathrm{null} \wedge l_j \le x_i^j \le u_j,\ i \in I,\ j \in J\}$. Here "$\wedge$" denotes the logical AND relationship, $x_i^j \neq \mathrm{null}$ means that all $x_i^j$ are non-null, $l_j \le x_i^j \le u_j$ means that all $x_i^j$ lie in the normal value interval, $t_i$ represents the i-th sampling time point, and $J$ is the index dimension set. Finally, repeated value removal is performed on the index data that has undergone null value and abnormal value removal, so that the resulting set contains no repeated values.
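As an illustrative sketch only, the cleaning steps described above (replace null and out-of-range readings with an average, then remove repeated values) might look as follows in Python. The function names, the choice of averaging over the valid readings of the same day, and the choice of treating repeated sampling time point identifiers as the duplicates are assumptions made for this example; the text does not fix these details.

```python
from statistics import mean


def clean_series(values, lower, upper):
    """Replace null (None) and out-of-range readings with the mean of the valid ones.

    Assumption: the replacement mean is taken over the valid readings in `values`.
    """
    valid = [v for v in values if v is not None and lower <= v <= upper]
    if not valid:
        raise ValueError("no valid readings to compute a replacement mean")
    fill = mean(valid)
    return [v if v is not None and lower <= v <= upper else fill for v in values]


def drop_duplicate_points(rows):
    """Keep the first (i, value) row for each sampling time point identifier i.

    Assumption: a "repeated value" is a repeated sampling time point identifier.
    """
    seen = set()
    unique = []
    for i, value in rows:
        if i not in seen:
            seen.add(i)
            unique.append((i, value))
    return unique
```

For instance, with hypothetical limits of -10 and 5, the series `[-2.0, None, -4.0, 99.0]` has both the null and the out-of-range reading replaced by the mean of the two valid readings.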

After data cleaning, row-to-column transformation is performed. When massive data is processed with Spark, converting row data into column data achieves data compression and significantly improves computing performance. Specifically, the 288 data points of a given optical module are aggregated in advance into one row with 288 columns, which greatly improves operation efficiency. Each element $(t_i, x_i^j)$ of the cleaned set is transformed to obtain the transformed set, and the row-to-column transformation is defined as $X^j = (x_1^j, x_2^j, \dots, x_{288}^j)$, $i \in I$, $j \in J$, where $I$ is the sampling time point identifier set, $J$ is the index dimension set, $t_i$ represents the i-th sampling time point, $x_i^j$ is the index data of index $j$ at the i-th sampling time point, and $X^j$ is the 288 columns of index data of index $j$ (one column per sampling time point).
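The row-to-column transformation can be sketched in plain Python as below. This is not the Spark job mentioned in the text; it only illustrates the reshaping of one-reading-per-row data into one 288-column row per optical module and index. The row layout `(module_id, index_name, i, value)` and the names used are hypothetical.

```python
from collections import defaultdict

POINTS_PER_DAY = 288  # 1440 minutes at a 5-minute collection granularity


def rows_to_columns(rows):
    """Pivot (module_id, index_name, i, value) rows into one 288-wide row per key.

    Sampling points with no reading stay as None (an assumption of this sketch).
    """
    wide = defaultdict(lambda: [None] * POINTS_PER_DAY)
    for module_id, index_name, i, value in rows:
        wide[(module_id, index_name)][i - 1] = value  # i runs from 1 to 288
    return dict(wide)
```

Each key of the result then carries a whole day of one index as a single record, which is the shape the later per-day regression consumes.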

The row-to-column transformation is followed by data normalization; see formula (8) and formula (9):

$$\tilde{x}_i^j = \frac{x_i^j - x_{\min}^j}{x_{\max}^j - x_{\min}^j},\quad i \in I,\ j \in J \tag{8}$$

$$x_{\min}^j = \min_{i \in I} x_i^j,\qquad x_{\max}^j = \max_{i \in I} x_i^j \tag{9}$$

wherein $\tilde{x}_i^j$ is the index data at time $i$ after normalization, $x_i^j$ is the index data at time $i$ before normalization, $x_{\min}^j$ is the minimum index data among the index data of all moments corresponding to index dimension $j$, $x_{\max}^j$ is the maximum index data among the index data of all moments corresponding to index dimension $j$, $i$ is the sampling time point identifier, $J$ is the index dimension set, and $I$ is the sampling time point identifier set.
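The normalization described here (difference from the minimum divided by the difference between maximum and minimum, per index dimension) can be illustrated with a short Python sketch. The function name is hypothetical, and the handling of a constant series, where the ratio is undefined, is an assumption of this example.

```python
def min_max_normalize(series):
    """Scale one index dimension's series into [0, 1] by min-max normalization."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # Constant series: the ratio is undefined; returning zeros is an assumption.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]
```

After this step all three indexes share the same value range, so the same loss threshold can be applied to each of them in the regression stage.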

The two-scale linear regression processing is described below. Referring to fig. 7, in local linear regression, for each index dimension $j \in J$, time series fitting is performed on the index data acquired in one day to obtain fitted values; the time series fitted values of 3 consecutive days are concatenated to obtain a compressed time series; and in global linear regression, feature extraction is performed on the time series of the 3 consecutive days.

In the embodiment of the application, the 3 indexes of the optical module that are used exhibit time series stability in a steady state, so a unary linear regression model can predict the trend of the time series well. The unary linear regression equation is defined in formula (10):

$$\hat{y} = a\,i + b + \varepsilon \tag{10}$$

wherein $a$ is the slope of the fitted straight line, $b$ is the intercept, $\varepsilon$ is the error term, $i$ is the sampling time point identifier, and $\hat{y}$ is the index data obtained by fitting.

To reduce the excessive sensitivity of the traditional least squares method to abnormal values, the embodiment of the application uses the Huber regression loss function for optimization; see formula (11):

$$(a_j, b_j) = \underset{a,\,b \in \mathbb{R}}{\arg\min} \sum_{i \in I} L_\delta\left(x_i^j, \hat{y}_i^j\right),\qquad \hat{y}_i^j = a\,i + b \tag{11}$$

wherein $x_i^j$ is the index data of index dimension $j$ at sampling time point $i$ obtained in step 101, $\hat{y}_i^j$ is the index data obtained by fitting (the first mapping result), $a$ and $b$ are the slope and intercept respectively, $L_\delta$ is the Huber regression loss function, $\arg\min$ denotes obtaining the $a_j$ and $b_j$ that minimize the sum, $\mathbb{R}$ is the set of real numbers, $L_\delta(x_i^j, \hat{y}_i^j)$ is the sampling-time-point local loss at sampling time point $i$, and $\sum_{i \in I} L_\delta(x_i^j, \hat{y}_i^j)$ is the local loss.

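The Huber regression loss function is given in formula (12):

$$L_\delta(y, \hat{y}) = \begin{cases} \dfrac{1}{2}\,(y - \hat{y})^2, & |y - \hat{y}| \le \delta \\[4pt] \delta\,|y - \hat{y}| - \dfrac{1}{2}\,\delta^2, & |y - \hat{y}| > \delta \end{cases} \tag{12}$$

where $y$ is the index data of index dimension $j$ at sampling time point $i$ obtained in step 101, $\hat{y}$ is the index data obtained by fitting (the first mapping result), $\delta$ is the absolute value threshold, and $L_\delta(y, \hat{y})$ is the Huber regression loss value (the sampling-time-point local loss at the sampling time point).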
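For illustration, the piecewise loss can be written as a small Python function, together with the summation over sampling time points that yields the local loss. The standard Huber form with the factor 1/2 is assumed here; the function and parameter names are hypothetical.

```python
def huber_loss(y, y_hat, delta):
    """Huber loss: quadratic for small residuals, linear beyond the threshold delta."""
    r = abs(y - y_hat)
    if r <= delta:
        return 0.5 * r * r
    return delta * r - 0.5 * delta * delta


def local_loss(values, slope, intercept, delta):
    """Sum the per-point Huber losses for the line y_hat = slope * i + intercept.

    `values` holds the index data for sampling time points i = 1, 2, ..., n.
    """
    return sum(
        huber_loss(y, slope * i + intercept, delta)
        for i, y in enumerate(values, start=1)
    )
```

Because the loss grows only linearly once the residual exceeds the threshold, a single extreme reading pulls the fitted line far less than it would under a squared-error loss.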

Local linear regression is described below. It is day-level processing, that is, linear regression is performed on the index data of each day.

First, time series fitting is performed on the normalized index data of the day to obtain the set $\{(a_j, b_j, \varepsilon_j) \mid j \in J\}$, where $a_j$ is the slope corresponding to index dimension $j$, $b_j$ is the intercept corresponding to index dimension $j$, and $\varepsilon_j$ is the error term. Data cleaning is then performed, and the set obtained after cleaning is $\{(a_j, b_j, \varepsilon_j) \mid \varepsilon_j \le \varepsilon_0,\ j \in J\}$, where $\varepsilon_0$ is an experimental value. When $\varepsilon_j > \varepsilon_0$, the result of fitting the index data with Huber regression is poor, which indicates that the data collected from the optical module is extremely unstable. This is related to the port state of the device, offline manual operation and the like, and is an abnormal phenomenon, so such data needs to be removed. Finally, data compression is performed. Because the index data is time-series stable, it can be fitted by linear regression, which reduces the data volume: for each index dimension $j$ in the set, the time series $(x_1^j, x_2^j, \dots, x_{288}^j)$ is compressed into the fitted values $\hat{y}_1^j$ and $\hat{y}_{288}^j$ at the 1st and the 288th sampling time points; see formula (13) and formula (14):

$$\hat{y}_1^j = a_j \cdot 1 + b_j,\quad j \in J \tag{13}$$

$$\hat{y}_{288}^j = a_j \cdot 288 + b_j,\quad j \in J \tag{14}$$

wherein $\hat{y}_1^j$ is the fitting result of the 1st sampling time point of index dimension $j$ (the local fitting data corresponding to the first sampling time point), $\hat{y}_{288}^j$ is the fitting result of the 288th sampling time point of index dimension $j$ (the local fitting data corresponding to the last sampling time point), $a_j$ is the slope obtained by fitting, $b_j$ is the intercept obtained by fitting, and $J$ is the index dimension set.
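The day-level fit and the compression to two endpoint values can be sketched as follows. The text does not state which optimizer is used to minimize the Huber loss; this example swaps in iteratively reweighted least squares (IRLS), a common way to do so, and the function names, default threshold and iteration count are assumptions.

```python
def fit_huber_line(values, delta=1.0, iterations=50):
    """Fit y = a * i + b over i = 1..n under the Huber loss.

    Uses IRLS: points whose residual exceeds delta get weight delta / |residual|,
    all other points keep weight 1. Returns (slope, intercept).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sampling points")
    xs = list(range(1, n + 1))
    weights = [1.0] * n
    a = b = 0.0
    for _ in range(iterations):
        sw = sum(weights)
        sx = sum(w * x for w, x in zip(weights, xs))
        sy = sum(w * y for w, y in zip(weights, values))
        sxx = sum(w * x * x for w, x in zip(weights, xs))
        sxy = sum(w * x * y for w, x, y in zip(weights, xs, values))
        a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
        b = (sy - a * sx) / sw
        residuals = [abs(y - (a * x + b)) for x, y in zip(xs, values)]
        weights = [1.0 if r <= delta else delta / r for r in residuals]
    return a, b


def compress_day(values, delta=1.0):
    """Reduce one day's series to the fitted values at the first and last points."""
    a, b = fit_huber_line(values, delta)
    return a * 1 + b, a * len(values) + b
```

With 288 readings per day, `compress_day` returns the two values that stand in for the whole day in the subsequent global regression.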

Thereby the fitting data of the optical module on day $T$ is obtained as $(\hat{y}_{1,T}^j, \hat{y}_{288,T}^j)$, where $\hat{y}_{1,T}^j$ is the fitting result of the 1st sampling time point of index dimension $j$ on day $T$ and $\hat{y}_{288,T}^j$ is the fitting result of the 288th sampling time point of index dimension $j$ on day $T$. Data compression is performed on the index data of day $T+1$ to obtain the fitting data $(\hat{y}_{1,T+1}^j, \hat{y}_{288,T+1}^j)$, where $\hat{y}_{1,T+1}^j$ is the fitting result of the 1st sampling time point of index dimension $j$ on day $T+1$ and $\hat{y}_{288,T+1}^j$ is the fitting result of the 288th sampling time point of index dimension $j$ on day $T+1$. Data compression is performed on the index data of day $T+2$ to obtain the fitting data $(\hat{y}_{1,T+2}^j, \hat{y}_{288,T+2}^j)$, where $\hat{y}_{1,T+2}^j$ is the fitting result of the 1st sampling time point of index dimension $j$ on day $T+2$ and $\hat{y}_{288,T+2}^j$ is the fitting result of the 288th sampling time point of index dimension $j$ on day $T+2$. The fitting data of days $T$, $T+1$ and $T+2$ are spliced along the time dimension to obtain the 6-column vector of the global time series corresponding to index dimension $j$: $(\hat{y}_{1,T}^j, \hat{y}_{288,T}^j, \hat{y}_{1,T+1}^j, \hat{y}_{288,T+1}^j, \hat{y}_{1,T+2}^j, \hat{y}_{288,T+2}^j)$. The 864 columns of data of days $T$, $T+1$ and $T+2$ are thus compressed into 6 columns, a compression ratio of 144 times.
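The splicing step and the stated compression ratio can be checked with a few lines of Python. The function name and the representation of each day's fitting data as a `(first, last)` pair are assumptions of this sketch.

```python
POINTS_PER_DAY = 288


def splice_days(day_fits):
    """Concatenate per-day (first, last) fitted values in time order."""
    merged = []
    for first, last in day_fits:
        merged.extend([first, last])
    return merged


def compression_ratio(num_days, merged):
    """Ratio between the original column count and the spliced column count."""
    return (num_days * POINTS_PER_DAY) // len(merged)
```

For 3 days the spliced vector has 6 columns, so 864 original columns per index are reduced by a factor of 144, matching the figure given in the text.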

To capture the time series characteristics of the optical module over 3 consecutive days, the embodiment of the application performs a second linear regression, namely global linear regression, on the time series fitting data of the 3 consecutive days.

First, time series fitting is performed: the slope and the intercept are fitted to the time series fitting data of the 3 consecutive days using the Huber regression loss function, giving the set $\{(A_j, B_j, E_j) \mid j \in J\}$, where $A_j$ is the slope obtained by linear regression on the data, $B_j$ is the intercept obtained by linear regression on the data, and $E_j$ is the error term. Data cleaning is then performed. $E_0$ is an experimental value; when $E_j > E_0$, the result of fitting the data with the Huber regression loss function is poor, which indicates that the data collected from the optical module over the 3 consecutive days is extremely unstable. The optical module may have just been brought online, may be about to be taken offline, or may be under debugging, so such data needs to be removed. Finally, feature extraction is performed. To observe the change of index dimension $j$ between column 1 and column 864 over the 3 consecutive days, the parameters $A_j$ and $B_j$ are used to fit column 1 and column 864; see formula (15) and formula (16):

$$\hat{Y}_1^j = A_j \cdot 1 + B_j,\quad j \in J \tag{15}$$

$$\hat{Y}_{864}^j = A_j \cdot 864 + B_j,\quad j \in J \tag{16}$$

Thereby the time series data of index $j$ of the optical module over 3 consecutive days is characterized by the 3-column vector $(A_j, \hat{Y}_1^j, \hat{Y}_{864}^j)$, $j \in J$, where $J$ is the index dimension set, $A_j$ is the rate at which index dimension $j$ changes from $\hat{Y}_1^j$ to $\hat{Y}_{864}^j$, the fitted value of column 1 is $\hat{Y}_1^j$, and the fitted value of column 864 is $\hat{Y}_{864}^j$. At this point, the 2592 columns of time series data of the optical module over 3 consecutive days (864 columns for each of the 3 indexes) are characterized by a 9-column vector, a compression ratio of 288 times.
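A minimal sketch of the global stage is given below. It assumes that the six spliced values sit at time positions 1, 288, 289, 576, 577 and 864 of the 3-day series, which the text implies but does not state, and it again uses iteratively reweighted least squares as a stand-in optimizer for the Huber loss. All names and defaults are hypothetical.

```python
def fit_huber(xs, ys, delta=1.0, iterations=50):
    """Huber-loss line fit y = a * x + b via iteratively reweighted least squares."""
    weights = [1.0] * len(xs)
    a = b = 0.0
    for _ in range(iterations):
        sw = sum(weights)
        sx = sum(w * x for w, x in zip(weights, xs))
        sy = sum(w * y for w, y in zip(weights, ys))
        sxx = sum(w * x * x for w, x in zip(weights, xs))
        sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
        a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
        b = (sy - a * sx) / sw
        residuals = [abs(y - (a * x + b)) for x, y in zip(xs, ys)]
        weights = [1.0 if r <= delta else delta / r for r in residuals]
    return a, b


def global_features(day_fits, points_per_day=288):
    """Turn per-day (first, last) fitted pairs into (slope, fitted first, fitted last).

    Assumption: day d (0-based) contributes points at positions
    d * points_per_day + 1 and (d + 1) * points_per_day.
    """
    xs, ys = [], []
    for day, (first, last) in enumerate(day_fits):
        xs += [day * points_per_day + 1, (day + 1) * points_per_day]
        ys += [first, last]
    slope, intercept = fit_huber(xs, ys)
    total = len(day_fits) * points_per_day
    return slope, slope * 1 + intercept, slope * total + intercept
```

Calling `global_features` once per index dimension yields 3 values each, that is, the 9-column feature vector for the three indexes.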

The abnormal pattern recognition process is described below. The index data is compressed from 2592 columns into 9 columns of feature vectors by a double-scale linear regression process. The embodiment of the application describes how to identify the working state of the optical module by 9 columns of feature vectors and predict whether the optical module will have an explicit fault within 1 month.

First, a data set is generated. For example, there are 3106 link alarms in the work order system within half a year. For each fault occurrence time $t_0$, the time series data from 3 weeks before to 1 week before $t_0$ is sampled to generate positive samples (sub-health samples); after the offline handling of the faulty link is finished (fault recovery time $t_1$), the 2-week time series data is sampled to generate negative samples (health samples); and data set A is thereby generated. The time windows for selecting positive and negative samples are shown in fig. 8. The training set, the test set and the verification set are divided in a ratio of 7:2:1.

Feature extraction is then performed: the features of the training set are extracted using the two-scale linear regression processing to generate the feature matrices of the training set. For each index dimension $j \in J$, its feature matrices are all $N \times 3$ matrices whose rows are $(A_j, \hat{Y}_1^j, \hat{Y}_{864}^j)$, where $J$ is the index dimension set, $N$ is the number of optical modules in the training set samples, $\hat{Y}_1^j$ is the global fitting result at the 1st sampling time point, $\hat{Y}_{864}^j$ is the global fitting result at the 864th sampling time point, and $A_j$ is the second slope obtained by fitting.

The global fitting data of the training set is then used for parameter estimation: the $3\sigma$ principle is used to estimate separately the value ranges of $A_j$, $\hat{Y}_1^j$ and $\hat{Y}_{864}^j$ in the positive and negative samples. Taking the second slope $A_j$ as an example ($\hat{Y}_1^j$ and $\hat{Y}_{864}^j$ are handled similarly), according to the $3\sigma$ principle its distribution interval in the positive samples is $[\mu_p - 3\sigma_p,\ \mu_p + 3\sigma_p]$, where $\mu_p$ is the mean of the second slope in the positive samples and $\sigma_p$ is the standard deviation of the second slope in the positive samples. In the negative samples its distribution interval is $[\mu_n - 3\sigma_n,\ \mu_n + 3\sigma_n]$, where $\mu_n$ is the mean of the second slope in the negative samples and $\sigma_n$ is the standard deviation of the second slope in the negative samples. The distribution interval of abnormal values is defined from these two distribution intervals, which gives the abnormal distribution interval of the second slope. When the second slope of a certain optical module falls in the abnormal distribution interval, the probability of an explicit fault occurring within 1 month is greater than 90%.

Finally, parameter verification is performed: the two-scale linear regression processing is applied to data set A and a confusion matrix is obtained experimentally, as shown in table 1, where "health" means that no explicit fault occurs on the link within 1 month and "sub-health" means that an explicit fault occurs on the link within 1 month.
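The 3-sigma estimation of a distribution interval for one feature can be sketched as follows. The function name is hypothetical, and the use of the population standard deviation is an assumption; the text does not say whether the population or the sample standard deviation is used.

```python
from statistics import mean, pstdev


def three_sigma_interval(values):
    """Return (mean - 3 * std, mean + 3 * std) for one feature across samples.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return (mu - 3 * sigma, mu + 3 * sigma)
```

The function would be applied once to the second slopes of the positive samples and once to those of the negative samples, giving the two intervals from which the abnormal distribution interval is derived.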

TABLE 1 confusion matrix

The accuracy of the embodiment of the application on data set A is 93.9%, and the recall of the embodiment of the application on data set A is 19%. The 19% recall is analyzed as follows: 1) some faults are instantaneous and cannot be predicted; 2) some faults leave traces before they occur, but these cannot be predicted from the time series data of received optical power, transmitted optical power and current, and finer-grained indexes are required. The 93.9% accuracy is analyzed as follows: when the embodiment of the application predicts that a certain link has a problem, the probability that the link fails within 1 month is more than 93.9%, so offline handling can be performed in advance to prevent an explicit fault from occurring and affecting network quality.
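The way the text explains its accuracy figure (the share of predicted problem links that actually fail) corresponds to what is usually called precision, and the recall is the share of actual faults that were predicted. As a hedged illustration, both can be computed from confusion-matrix counts as below; the function name is hypothetical and the counts in the usage note are made-up values, not the figures of table 1.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts.

    tp: links predicted sub-health that did fail within the horizon
    fp: links predicted sub-health that did not fail
    fn: links predicted health that did fail
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

With hypothetical counts of 8 true positives, 2 false positives and 32 false negatives, precision is 8 / 10 and recall is 8 / 40, which shows how a predictor can be reliable when it fires while still missing most faults.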

In some embodiments, referring to fig. 9, the live network has millions of optical modules, and about 20 billion time series data points are collected per day. After data computation, about 3 fault prediction alarm work orders per day are finally sent to the live network, and offline handling is performed to prevent the faults. Referring to fig. 10, from July to September, 205 link hidden-danger work orders were found in advance and handled by offline manual intervention, of which optical module problems accounted for 72%, port problems for 8%, line problems for 2%, and unrecorded causes for 18%.

It will be appreciated that in the embodiments of the present application, related data such as user information is referred to, and when the embodiments of the present application are applied to specific products or technologies, user permissions or consents need to be obtained, and the collection, use and processing of related data need to comply with related laws and regulations and standards of related countries and regions.

The following continues to describe an exemplary structure of the fault prediction device 255 provided by the embodiments of the present application implemented as software modules. In some embodiments, as shown in fig. 2, the software modules of the fault prediction device 255 stored in the memory 250 may include: an obtaining module 2551, configured to obtain index data of the optical module in the global time sequence; a first regression module 2552, configured to perform local regression processing on the index data of the optical module based on the unit time sequences to obtain local fitting data corresponding to each unit time sequence; a second regression module 2553, configured to perform merging processing on the local fitting data of the multiple unit time sequences to obtain merged data, and perform global regression processing on the merged data to obtain global fitting data of the optical module; and a prediction module 2554, configured to perform probability mapping processing on the global fitting data of the optical module to obtain a future fault probability of the link where the optical module is located.

In some embodiments, the obtaining module 2551 is further configured to: the following processing is performed for each unit time series in the global time series: acquiring original index data of the optical module in a unit time sequence; performing data cleaning treatment on the original index data of the optical module in the unit time sequence to obtain cleaning index data of the unit time sequence; performing data compression processing on the cleaning index data to obtain compression index data of a unit time sequence; carrying out standardization processing on the compressed index data of the unit time sequence to obtain index data of the unit time sequence; the index data of the plurality of unit time series are composed into the index data of the global time series.

In some embodiments, the obtaining module 2551 is further configured to: the following processing is performed for each sampling time point of the unit time series: acquiring a light receiving power index, a light emitting power index and a current index of the light module at a sampling time point; the method comprises the steps that a light receiving power index, a light emitting power index and a current index of an optical module at a sampling time point form original index data of the optical module at the sampling time point; and forming the original index data of the optical module at a plurality of sampling time points into the original index data of the optical module in a unit time sequence.

In some embodiments, the original index data of the optical module in the unit time sequence includes original index data of the optical module at each sampling time point in the unit time sequence; the obtaining module 2551 is further configured to: remove original index data belonging to null values and abnormal values from the original index data of the optical module at each sampling time point in the unit time sequence to obtain reserved original index data; and perform duplicate filtering on the original index data of each sampling time point in the reserved original index data to obtain cleaning index data of the unit time sequence.

In some embodiments, the compressed index data for the unit time series includes compressed index data for a plurality of index dimensions for each sampling time point in the unit time series; the acquisition module 2551 is further configured to: the following processing is performed for the compressed index data for each index dimension: obtaining the maximum compression index data and the minimum compression index data in the compression index data of index dimensions at a plurality of sampling time points; acquiring a first difference value between the maximum compression index data and the minimum compression index data; acquiring a second difference value between the compressed index data of the index dimension of each sampling time point and the minimum compressed index data; taking the ratio between the second difference value and the first difference value corresponding to each sampling time point as standardized index data of the sampling time points in the index dimension; the following processing is performed for each sampling time point: forming the standardized index data of the sampling time points in a plurality of index dimensions into index data of the sampling time points; the index data of the plurality of sampling time points are combined into index data of a unit time series.

In some embodiments, the first regression module 2552 is further to: the following processing is performed for each unit time series: performing unitary regression processing on the index data of the unit time sequence to obtain a first slope of the index data and a first intercept of the index data; forming first regression data of the index data from the first slope of the index data, the first intercept of the index data, and the first error of the index data; and mapping the target sampling time points of the unit time sequence based on the first regression data to obtain the local fitting data of the unit time sequence.

In some embodiments, the first regression module 2552 is further to: acquiring an initial slope of index data and an initial intercept of the index data; determining a local loss based on the index data for each sampling time point, an initial intercept of the index data, and an initial slope of the index data; and updating the initial intercept of the index data and the initial slope of the index data by taking the minimized local loss as a constraint condition to obtain a first intercept of the index data and a first slope of the index data.

In some embodiments, the first regression module 2552 is further to: the following processing is performed for each sampling time point: performing unitary mapping processing on the sampling time points based on the initial intercept of the index data and the initial slope of the index data to obtain a first mapping result; when the absolute value between the first mapping result and the index data of the sampling time point is not greater than an absolute value threshold value, acquiring local loss positively correlated with the square of the absolute value of the sampling time point; when the absolute value between the first mapping result and index data of the sampling time point is larger than an absolute value threshold, a first standard value which is positively correlated with the square of the absolute value threshold is obtained, the absolute value and the absolute value threshold are multiplied, the multiplication result and the first standard value are subtracted, and the local loss of the sampling time point is obtained; and carrying out summation processing on the local losses of the sampling time points to obtain the local losses.

In some embodiments, the first regression module 2552 is further to: multiplying a first slope in the first regression data with a first sampling time point of a unit time sequence to obtain a first multiplication result; adding the first multiplication result and the first intercept to obtain local fitting data corresponding to a first sampling time point; multiplying the first slope in the first regression data with the last sampling time point of the unit time sequence to obtain a second multiplication result; adding the second multiplication result and the first intercept to obtain local fitting data corresponding to the last sampling time point; the local fitting data corresponding to the first sampling time point and the local fitting data corresponding to the last sampling time point are formed into local fitting data of a unit time sequence.

In some embodiments, the second regression module 2553 is further to: performing unitary regression processing on the combined data to obtain a second slope of the index data and a second intercept of the index data; forming second regression data of the index data from the second slope of the index data, the second intercept of the index data, and the second error of the index data; and mapping processing is carried out based on the second regression data, so as to obtain global fitting data of the global time sequence.

In some embodiments, the apparatus further comprises: the training module 2555 is configured to obtain, before probability mapping is performed on global fitting data of the optical module to obtain a future failure probability of the optical module, index data of a global time sequence before an actual failure occurs as a positive sample, and obtain index data of a global time sequence after an actual failure repair as a negative sample; global fitting data corresponding to the positive samples and global fitting data corresponding to the negative samples are obtained; generating a distribution interval of the positive sample based on global fitting data of the positive sample, and generating a distribution interval of the negative sample based on global fitting data of the negative sample; taking a coincidence interval between a distribution interval of the positive sample and a distribution interval of the negative sample as an abnormal distribution interval; the prediction module 2554 is further configured to, when the global fitting data of the optical module is in an abnormal distribution interval, use a probability of the corresponding abnormal distribution interval as a future failure probability of a link where the optical module is located.
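The interval logic described for the training module and the prediction module can be sketched as follows. The paragraph above speaks of the coincidence interval of the positive-sample and negative-sample distribution intervals, which this example reads as their overlap. Returning `None` when the value lies outside the abnormal distribution interval, and the probability value passed in, are assumptions of the sketch, as are the function names.

```python
def coincidence_interval(positive, negative):
    """Return the overlap of two closed intervals (lo, hi), or None if disjoint."""
    lo = max(positive[0], negative[0])
    hi = min(positive[1], negative[1])
    return (lo, hi) if lo <= hi else None


def future_fault_probability(value, abnormal_interval, interval_probability):
    """Map a global fitting value to the probability tied to the abnormal interval.

    Returns None when no abnormal interval exists or the value lies outside it;
    the text does not specify the output for that case.
    """
    if abnormal_interval is None:
        return None
    lo, hi = abnormal_interval
    return interval_probability if lo <= value <= hi else None
```

In use, the abnormal interval would be computed once from the training samples and then applied to the second slope (or another global fitting value) of each optical module at prediction time.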

Embodiments of the present application provide a computer program product comprising computer-executable instructions stored in a computer-readable storage medium. The processor of the electronic device reads the computer-executable instructions from the computer-readable storage medium, and the processor executes the computer-executable instructions, so that the electronic device executes the fault prediction method described in the embodiments of the present application.

The present embodiments provide a computer-readable storage medium having stored therein computer-executable instructions that, when executed by a processor, cause the processor to perform the fault prediction methods provided by the embodiments of the present application, for example, as shown in fig. 3A-3D.

In some embodiments, the computer readable storage medium may be FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; it may also be any of various devices including one or any combination of the above memories.

In some embodiments, computer-executable instructions may be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, in the form of programs, software modules or scripts, and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.

As an example, computer-executable instructions may, but need not, correspond to files in a file system, and may be stored as part of a file that holds other programs or data, e.g., in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subroutines).

As an example, executable instructions may be deployed to be executed on one electronic device or on multiple electronic devices located at one site or, alternatively, on multiple electronic devices distributed across multiple sites and interconnected by a communication network.

In summary, the embodiment of the application obtains the index data of the optical module in the global time sequence, and performs local regression processing based on the unit time sequences and global regression processing on the index data of the optical module to obtain global fitting data of the optical module. The two-scale, two-pass linear regression fits the index data of the optical module with high accuracy while effectively reducing the consumption of computing resources. Probability mapping processing is then performed on the global fitting data of the optical module to obtain the future fault probability of the link where the optical module is located.

The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and scope of the present application are intended to be included within the scope of the present application.

Claims

1. A method of fault prediction, the method comprising:

acquiring index data of an optical module in a global time sequence, wherein the global time sequence consists of a plurality of unit time sequences, and the unit time sequences comprise a plurality of sampling time points;

the following processing is performed for each of the unit time series:

acquiring an initial slope of the index data of the unit time sequence and an initial intercept of the index data of the unit time sequence;

determining a local loss based on the index data for each of the sampling time points, an initial intercept of the index data, and an initial slope of the index data;

updating the initial intercept of the index data and the initial slope of the index data by taking the minimized local loss as a constraint condition to obtain a first intercept of the index data and a first slope of the index data;

forming first regression data of the index data from the first slope of the index data, the first intercept of the index data, and the first error of the index data;

mapping the target sampling time point of the unit time sequence based on the first regression data to obtain local fitting data of the unit time sequence;

2. The method according to claim 1, wherein the acquiring the index data of the optical module in the global time sequence includes:

the following processing is performed for each of the unit time series in the global time series:

acquiring original index data of the optical module in the unit time sequence;

performing data cleaning processing on the original index data of the optical module in the unit time sequence to obtain cleaning index data of the unit time sequence;

performing data compression processing on the cleaning index data to obtain compressed index data of the unit time sequence;

carrying out standardization processing on the compressed index data of the unit time sequence to obtain index data of the unit time sequence;

and forming a plurality of index data of the unit time sequence into index data of the global time sequence.

3. The method of claim 2, wherein the obtaining the original index data of the optical module in the unit time sequence includes:

the following processing is performed for each sampling time point of the unit time series:

acquiring a light receiving power index, a light emitting power index and a current index of the optical module at the sampling time point;

forming an original index data of the optical module at the sampling time point by using a light receiving power index, a light emitting power index and a current index of the optical module at the sampling time point;

and forming the original index data of the optical module at a plurality of sampling time points into the original index data of the optical module at the unit time sequence.

4. The method of claim 2, wherein the raw index data of the light module at the unit time series comprises raw index data of the light module at each sampling time point in the unit time series;

the step of performing data cleaning processing on the original index data of the optical module in the unit time sequence to obtain cleaning index data of the unit time sequence includes:

removing original index data belonging to null values and abnormal values from the original index data of each sampling time point in the unit time sequence of the optical module to obtain reserved original index data;

and performing duplicate filtering on the original index data of each sampling time point in the reserved original index data to obtain the cleaning index data of the unit time sequence.

5. The method of claim 2, wherein the compressed index data of the unit time series includes compressed index data of a plurality of index dimensions for each sampling time point in the unit time series;

the step of carrying out standardization processing on the compressed index data of the unit time sequence to obtain the index data of the unit time sequence comprises the following steps:

the following processing is performed for the compressed index data for each index dimension:

acquiring the maximum compression index data and the minimum compression index data in the compression index data of the index dimension at a plurality of sampling time points;

acquiring a first difference value between the maximum compression index data and the minimum compression index data;

acquiring a second difference value between the compressed index data of the index dimension and the minimum compressed index data of each sampling time point;

taking the ratio between the second difference value corresponding to each sampling time point and the first difference value as standardized index data of the sampling time point in the index dimension;

the following processing is performed for each of the sampling time points: forming the standardized index data of the sampling time point in a plurality of index dimensions into index data of the sampling time point;

and forming index data of a plurality of sampling time points into index data of the unit time sequence.
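By way of illustration only, the standardization of claim 5 corresponds to min-max scaling applied independently to each index dimension. The sketch below assumes the compressed index data is held as one row per sampling time point and one column per index dimension. The function name `min_max_normalize` is hypothetical, and returning 0.0 when the first difference value is zero is an assumption that the claim does not address.

```python
def min_max_normalize(compressed):
    """Standardize compressed index data per index dimension.

    compressed: list of rows, one per sampling time point; each row is a
    list of values, one per index dimension.
    """
    n_dims = len(compressed[0])
    columns = []
    for d in range(n_dims):
        column = [row[d] for row in compressed]
        minimum, maximum = min(column), max(column)
        first_diff = maximum - minimum  # first difference value
        if first_diff == 0:
            # ASSUMPTION: a constant dimension is mapped to 0.0.
            columns.append([0.0] * len(column))
        else:
            # second difference value divided by first difference value
            columns.append([(v - minimum) / first_diff for v in column])
    # Regroup by sampling time point to form the index data of the unit time sequence.
    return [list(row) for row in zip(*columns)]
```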

6. The method of claim 1, wherein the determining the local loss based on the index data for each of the sampling time points, the initial intercept of the index data, and the initial slope of the index data comprises:

the following processing is performed for each of the sampling time points:

performing univariate linear mapping processing on the sampling time point based on the initial intercept of the index data and the initial slope of the index data to obtain a first mapping result;

when the absolute value of the difference between the first mapping result and the index data of the sampling time point is not greater than an absolute value threshold, acquiring a local loss of the sampling time point that is positively correlated with the square of the absolute value;

when the absolute value of the difference between the first mapping result and the index data of the sampling time point is greater than the absolute value threshold, obtaining a first standard value that is positively correlated with the square of the absolute value threshold, multiplying the absolute value by the absolute value threshold, and subtracting the first standard value from the multiplication result to obtain the local loss of the sampling time point;

and summing the local losses of the plurality of sampling time points to obtain the local loss.
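By way of illustration only, the piecewise loss of claim 6 has the shape of the Huber loss. The claim states only that the quadratic term and the first standard value are "positively correlated" with the respective squares; the coefficient 0.5 used below is the conventional Huber choice and is an assumption, as are the function name `local_loss` and the default threshold `delta=1.0`.

```python
def local_loss(times, values, intercept, slope, delta=1.0):
    """Sum a Huber-style loss over the sampling time points of one unit time sequence.

    times, values: sampling time points and their index data.
    intercept, slope: current (initial) regression parameters.
    delta: the absolute value threshold (default is an assumed value).
    """
    total = 0.0
    for t, y in zip(times, values):
        first_mapping = slope * t + intercept   # univariate linear mapping
        abs_diff = abs(first_mapping - y)
        if abs_diff <= delta:
            # Quadratic branch; the 0.5 factor is an ASSUMPTION.
            total += 0.5 * abs_diff ** 2
        else:
            # Linear branch: product minus the first standard value (0.5 * delta^2).
            total += delta * abs_diff - 0.5 * delta ** 2
    return total
```

With this choice of coefficients the two branches meet continuously at the threshold, so a large deviation at a single sampling time point contributes linearly rather than quadratically to the loss.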

7. The method according to claim 1, wherein the target sampling time point includes a first sampling time point and a last sampling time point; and the mapping the target sampling time point of the unit time sequence based on the first regression data to obtain the local fitting data of the unit time sequence comprises:

multiplying a first slope in the first regression data with a first sampling time point of the unit time sequence to obtain a first multiplication result;

adding the first multiplication result and the first intercept to obtain local fitting data corresponding to the first sampling time point;

multiplying the first slope in the first regression data with the last sampling time point of the unit time sequence to obtain a second multiplication result;

adding the second multiplication result and the first intercept to obtain local fitting data corresponding to the last sampling time point;

and forming the local fitting data corresponding to the first sampling time point and the local fitting data corresponding to the last sampling time point into the local fitting data of the unit time sequence.
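By way of illustration only, the mapping of claim 7 evaluates the fitted line at the two end points of the unit time sequence. The function name `local_fitting_data` and the representation of the result as (time, value) pairs are assumptions.

```python
def local_fitting_data(first_slope, first_intercept, sampling_times):
    """Evaluate the locally fitted line at the first and last sampling time points."""
    t_first = sampling_times[0]
    t_last = sampling_times[-1]
    # first multiplication result + first intercept
    fit_first = first_slope * t_first + first_intercept
    # second multiplication result + first intercept
    fit_last = first_slope * t_last + first_intercept
    return [(t_first, fit_first), (t_last, fit_last)]
```

Each unit time sequence is thereby reduced to two points, which keeps the merged data passed to the global regression small.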

8. The method of claim 1, wherein the performing global regression processing on the merged data to obtain global fitting data of the optical module comprises:

performing univariate regression processing on the combined data to obtain a second slope of the index data and a second intercept of the index data;

forming second regression data of the index data from a second slope of the index data, a second intercept of the index data, and a second error of the index data;

and mapping processing is carried out based on the second regression data, so as to obtain global fitting data of the global time sequence.
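By way of illustration only, the global regression of claim 8 might be sketched as an ordinary least squares fit over the merged local fitting data. The claim names neither the estimator nor the definition of the second error, so both the use of ordinary least squares and the use of the root mean squared residual as the error are assumptions, as is the function name `global_regression`.

```python
def global_regression(merged):
    """Fit one line to merged local fitting data.

    merged: list of (time, local_fitted_value) pairs with at least two
    distinct times.
    Returns the second regression data as a dict.
    """
    n = len(merged)
    mean_t = sum(t for t, _ in merged) / n
    mean_y = sum(y for _, y in merged) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in merged)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in merged)
    slope = sxy / sxx                     # second slope (OLS is an ASSUMPTION)
    intercept = mean_y - slope * mean_t   # second intercept
    # ASSUMPTION: the second error is taken as the root mean squared residual.
    error = (sum((y - (slope * t + intercept)) ** 2 for t, y in merged) / n) ** 0.5
    return {"slope": slope, "intercept": intercept, "error": error}
```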

9. The method of claim 1, wherein prior to performing probability mapping processing on the global fitting data of the optical module to obtain a future fault probability of the optical module, the method further comprises:

acquiring index data of a global time sequence before an actual fault occurs as a positive sample, and acquiring index data of the global time sequence after the actual fault is repaired as a negative sample;

acquiring global fitting data corresponding to the positive sample and global fitting data corresponding to the negative sample;

generating a distribution interval of the positive sample based on the global fitting data of the positive sample, and generating a distribution interval of the negative sample based on the global fitting data of the negative sample;

taking a coincidence interval between the distribution interval of the positive sample and the distribution interval of the negative sample as an abnormal distribution interval;

the performing probability mapping processing on the global fitting data of the optical module to obtain the future fault probability of the optical module comprises:

when the global fitting data of the optical module is in the abnormal distribution interval, the probability corresponding to the abnormal distribution interval is used as the future fault probability of the link where the optical module is located.
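By way of illustration only, the interval logic of claim 9 might be sketched as follows. The claim does not state how a distribution interval is generated or how the probability corresponding to the abnormal distribution interval is determined. This sketch assumes that the global fitting data is a scalar (for example, a slope), that each distribution interval is the minimum-to-maximum range of the sample values, and that the interval probability is supplied by the caller. Both function names are hypothetical.

```python
def overlap_interval(positive_values, negative_values):
    """Return the coincidence interval of the positive and negative distribution intervals.

    ASSUMPTION: each distribution interval is the [min, max] range of its values.
    Returns None when the two intervals do not overlap.
    """
    pos_low, pos_high = min(positive_values), max(positive_values)
    neg_low, neg_high = min(negative_values), max(negative_values)
    low, high = max(pos_low, neg_low), min(pos_high, neg_high)
    return (low, high) if low <= high else None

def future_fault_probability(value, abnormal_interval, interval_probability):
    """Map global fitting data to a fault probability.

    Returns interval_probability when the value lies in the abnormal
    distribution interval; returns None otherwise, since the claim does
    not cover that case.
    """
    if abnormal_interval is not None and abnormal_interval[0] <= value <= abnormal_interval[1]:
        return interval_probability
    return None
```

A usage example under the same assumptions: with positive-sample values `[-0.9, -0.5, -0.2]` and negative-sample values `[-0.3, 0.0, 0.1]`, the abnormal distribution interval is `(-0.3, -0.2)`.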

10. A fault prediction device, the device comprising:

the acquisition module is used for acquiring index data of the optical module in a global time sequence, wherein the global time sequence consists of a plurality of unit time sequences, and the unit time sequences comprise a plurality of sampling time points;

a first regression module for performing the following processing for each of the unit time series: acquiring an initial slope of the index data of the unit time sequence and an initial intercept of the index data of the unit time sequence; determining a local loss based on the index data for each of the sampling time points, an initial intercept of the index data, and an initial slope of the index data; updating the initial intercept of the index data and the initial slope of the index data by taking the minimized local loss as a constraint condition to obtain a first intercept of the index data and a first slope of the index data; forming first regression data of the index data from the first slope of the index data, the first intercept of the index data, and the first error of the index data; mapping the target sampling time point of the unit time sequence based on the first regression data to obtain local fitting data of the unit time sequence;

11. An electronic device, the electronic device comprising:

a memory for storing computer executable instructions;

a processor for implementing the fault prediction method of any one of claims 1 to 9 when executing computer-executable instructions stored in the memory.

12. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the fault prediction method of any one of claims 1 to 9.