CN112882854B - Method and device for processing request exception - Google Patents

Info

Publication number
CN112882854B
Authority
CN
China
Prior art keywords
request
data
abnormal
processing stage
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911197832.6A
Other languages
Chinese (zh)
Other versions
CN112882854A (en)
Inventor
王梦杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201911197832.6A
Publication of CN112882854A
Publication of CN112882854B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method and a device for processing request exceptions. The method comprises: determining, based on the abnormality type of a request, first distribution information of the request and of its processing results in different processing stages within a preset request set; determining at least one abnormal processing stage of the request according to the first distribution information; determining, according to index data associated with the processing stage, second distribution information of the abnormal data of the abnormal processing stage within the abnormal data of similar abnormal requests; and determining, according to the second distribution information, the index data that causes the request to become abnormal. In embodiments of the invention, the cause of an abnormal request is analyzed through the associated index data, so that large numbers of abnormal requests can be avoided and system performance and user experience improved.

Description

Method and device for processing request exception
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for processing a request exception.
Background
In a large-scale distributed storage system comprising many machines, a large number of request exceptions, particularly slow requests, can occur for a variety of reasons. Although upper-layer applications have various mechanisms for reducing the impact of underlying slow requests, some of those slow requests may still propagate to the user layer and affect the user experience.
To reduce the occurrence of slow requests, their causes must be analyzed. Manual investigation of slow requests consumes a great deal of manpower, while existing methods for automatically analyzing slow requests rely on rules and thresholds summarized from expert experience. Such methods have the following disadvantages: 1. situations not covered by the rules cannot be analyzed, so the expert rules must be updated frequently; 2. a one-size-fits-all threshold may be unreasonable, being either too high or too low, and therefore cannot adapt flexibly to different situations. Existing analysis of the causes of request exceptions is thus still deficient, with the result that large numbers of slow requests occur and affect the user experience.
Disclosure of Invention
In view of the above problems, the present invention provides a method and an apparatus for processing request exceptions, the main purpose of which is to analyze the cause of a request exception through associated index data, so as to avoid the occurrence of large numbers of abnormal requests and improve the user experience.
In order to achieve the above purpose, the present invention mainly provides the following technical solutions:
In one aspect, the present invention provides a method for processing a request exception, which specifically includes:
determining, based on the abnormality type of a request, first distribution information of the request and of its processing results in different processing stages within a preset request set;
determining at least one abnormal processing stage of the request according to the first distribution information;
determining, according to index data associated with the processing stage, second distribution information of the abnormal data of the abnormal processing stage within the abnormal data of similar abnormal requests;
and determining, according to the second distribution information, the index data that causes the request to become abnormal.
In another aspect, the present invention provides a device for processing a request exception, specifically including:
a first statistics unit, configured to determine, based on the abnormality type of a request, first distribution information of the request and of its processing results in different processing stages within a preset request set;
a first determining unit, configured to determine at least one abnormal processing stage of the request according to the first distribution information obtained by the first statistics unit;
a second statistics unit, configured to determine, for the abnormal processing stage determined by the first determining unit and according to index data associated with the processing stage, second distribution information of the abnormal data of that abnormal processing stage within the abnormal data of similar abnormal requests;
and a second determining unit, configured to determine, according to the second distribution information obtained by the second statistics unit, the index data that causes the request to become abnormal.
In another aspect, the present invention provides a processor, where the processor is configured to run a program, and the program executes the method for processing the request exception.
In another aspect, the present invention provides an electronic device, where the electronic device includes a processor and a memory, where the memory is configured to store a program, and the processor is coupled to the memory and configured to run the program to perform the method for processing a request exception as described above.
In another aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the above-described method of processing a request exception.
By means of the above technical solution, the method and device for processing request exceptions provided by the invention determine, according to the abnormality type of a target request, how the processing results of its different processing stages are distributed within a request set, and from this analyze which processing stages cause the request to be abnormal. Each abnormal processing stage is then analyzed on the basis of its associated index data to identify the index data responsible for the abnormality in that stage, thereby obtaining the cause of the target request's abnormality in that stage. The invention thus determines the cause of a request exception by first locating the abnormal processing stages and then identifying the cause from the index data associated with each of them. Because this process mainly compares the target request against all requests in the request set, using big data or historical data, it avoids the analytical shortcomings of approaches that rely on expert experience, rules, or preset thresholds. The cause of the abnormality can therefore be analyzed accurately and objectively for the target request, reducing the occurrence of request exceptions and improving the user experience.
The foregoing is only an overview of the technical solution of the present invention. Specific embodiments are described below so that the technical means of the invention can be understood more clearly and implemented in accordance with this description, and so that the above and other objects, features and advantages of the invention become more readily apparent.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a flowchart of a method for processing a request exception according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating another method for handling a request exception according to an embodiment of the present invention;
FIG. 3 is a flow chart showing a method for determining a time-consuming cause of a slow request according to an embodiment of the present invention;
FIG. 4 is a block diagram showing a processing apparatus for exception request according to an embodiment of the present invention;
FIG. 5 is a block diagram showing another apparatus for handling request exceptions according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The embodiment of the invention provides a processing method for request exception, which is mainly used for analyzing specific reasons for causing the request exception, so that the processing flow of the request is optimized, the exception condition is avoided, and the user experience is improved. The specific steps of the analysis are shown in fig. 1, and the method comprises the following steps:
Step 101, determining first distribution information of the request and processing results thereof in different processing stages in a preset request set based on the abnormal type of the request.
Abnormality types of a request include, for example, slow requests (long-tail requests) whose response time is too long, requests whose response times out or fails, requests with abnormal parameters, and requests with abnormal data types. Because a request passes through multiple processing stages, an abnormality may have many possible causes. To analyze the cause accurately, it is first necessary to determine in which processing stage or stages the abnormality arises.
In this step, a large number of requests and their related data are stored in the preset request set, and the abnormal processing stage of the target request is determined through comparative analysis. A request, especially in a distributed system, passes through multiple processing stages. For example, a service in a distributed system may involve multiple servers arranged hierarchically, such as a front-end server, a middle-layer server and a back-end server. When a user initiates a request for the service, the request first reaches the front-end server and is then forwarded to the middle-layer server. If the middle-layer server can process the request, it responds directly; if it cannot, the request is forwarded to the back-end server for processing and response. The processing performed by each server can therefore correspond to a different processing stage of the request. In this embodiment, the processing information of the request at each processing stage is recorded for use in analyzing the cause of the abnormality.
The request in the preset request set may be a pre-specified request or may be all history requests, which is not particularly limited in this embodiment.
Since each request has one or more processing stages, the first distribution information determined in this step is obtained from the values of the abnormal data corresponding to the abnormality type of the request. For example, the abnormal data of a slow request is its delay time, while the abnormal data of a request that cannot be accessed may be the hardware working state, system processes, or the occurrence frequency of events. Using these values, the position of the target request among all requests in the set is analyzed, for example by determining its rank in the set under a given sort order, and the position of each processing stage of the target request among the same processing stage of all requests in the set is likewise determined. Together these positions form the first distribution information.
It can be seen that the first distribution information in this step is determined using a preset sort order and consists of ranking information for the values of the abnormal data within the preset request set. Common sort orders include ascending and descending order. For example, the first distribution information of a slow request includes its rank, by delay time, among all slow requests in the request set. It should be noted that the ranking data in the first distribution information of the present invention is of two types: the rank of the processing result of each individual processing stage, and the rank of the overall processing result obtained by combining all processing stages.
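The two kinds of ranking described above can be sketched as follows. This is an illustrative sketch only: the function names, the representation of a request as a mapping from stage name to delay time, and the choice of descending order are assumptions and do not appear in the original text.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each request maps a stage name to its delay in seconds.
Request = Dict[str, float]


def rank_desc(value: float, population: List[float]) -> int:
    """1-based rank of value when population is sorted from largest to smallest."""
    return 1 + sum(1 for v in population if v > value)


def first_distribution_info(
    target: Request, requests: List[Request]
) -> Tuple[int, Dict[str, int]]:
    """Return (overall rank, per-stage ranks) of the target request."""
    # Rank of the overall result, combining all processing stages.
    total_rank = rank_desc(
        sum(target.values()), [sum(r.values()) for r in requests]
    )
    # Rank of each stage among the same stage of every request that has it.
    stage_ranks = {
        stage: rank_desc(delay, [r[stage] for r in requests if stage in r])
        for stage, delay in target.items()
    }
    return total_rank, stage_ranks
```

A stage that only some requests pass through (for example a back-end stage) is ranked only against the requests that include it.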
Step 102, determining at least one exception processing stage of the request according to the first distribution information.
Specifically, the first distribution information records both the rank of the target request among all requests in the set and the rank of each of its processing stages among the same stage of all requests. To judge whether a processing stage is abnormal, the rank of the target request among all requests in the set can be taken as the standard against which each processing stage is measured. If the rank of a processing stage exceeds this standard, the processing in that stage may have caused the abnormality of the target request, and the stage is determined to be an abnormal processing stage. This step thus determines which of the target request's processing stages are abnormal processing stages.
Step 103, determining second distribution information of the abnormal data of the abnormal processing stage in the abnormal data of the similar abnormal requests according to the index data associated with the processing stage.
This step analyzes the abnormal processing stages determined in step 102 one by one, and determines second distribution information for each of them.
In this embodiment, the index data may be used to measure how the processing of the target request completed in a processing stage, or the resources it occupied, for example the occupancy of memory and disk or the occurrence frequency of system events and process events. The second distribution information mainly records, for the abnormal processing stage, the position of the target request's abnormal data among the abnormal data of requests that share the same index data. It is used to measure which index data the target request is strongly correlated with in the abnormal processing stage, that is, to determine the index data related to the abnormal processing of the target request.
Step 104, determining index data which causes the abnormality of the request according to the second distribution information.
One or more items of index data may be determined to be the cause of the request's abnormality. The determination may be made against a preset threshold. Alternatively, the average of the abnormal data of all requests associated with an item of index data may be calculated and used as the judgment standard for deciding whether that index data is a cause of the target request's abnormality.
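The average-based judgment mentioned above might look like the following minimal sketch. It assumes that the abnormal data is a delay time, that a larger value is more abnormal, and that the delays of the requests associated with each item of index data have already been collected; the function and variable names are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def suspect_index_data(
    target_delay: float, delays_by_index: Dict[str, List[float]]
) -> List[str]:
    """Return the index data whose average delay the target request exceeds.

    delays_by_index maps an index-data name (assumed naming) to the delays of
    all requests associated with that index data in the abnormal stage.
    """
    return [
        name
        for name, delays in delays_by_index.items()
        if delays and target_delay > mean(delays)
    ]
```

A preset threshold could be substituted for `mean(delays)` to obtain the other variant described in the text.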
According to the method for processing request exceptions provided by this embodiment of the invention, the distribution within the request set of the processing results of the target request's different processing stages is determined according to the abnormality type of the target request, and the processing stages that cause the request to be abnormal are identified from it. Each abnormal processing stage is then analyzed on the basis of its associated index data to identify the index data responsible for the abnormality in that stage, thereby obtaining the cause of the abnormality. The method thus determines the cause of a request exception by first locating the abnormal processing stages and then identifying the cause from the index data associated with each of them. Because this process mainly compares the target request against all requests in the request set, using big data or historical data, it avoids the analytical shortcomings of approaches that rely on expert experience, rules, or preset thresholds, so that the cause of the abnormality is analyzed accurately and objectively for the target request, the occurrence of request exceptions is reduced, and the user experience is improved.
Further, corresponding to the method for handling the request exception in the above embodiment, a preferred embodiment of the present invention will be described in detail with respect to each step shown in fig. 1, and the specific implementation process is shown in fig. 2, where the steps include:
step 201, determining first distribution data of the request in the similar exception request according to the exception type of the request.
The similar abnormal requests are either all requests in the preset request set, or those requests in the preset request set that exhibit the same type of abnormality. The first distribution data is the position of the abnormal data of the target request among the abnormal data of the similar abnormal requests, determined under a preset sort order. That is, the first distribution data is the rank, among the similar abnormal requests, of the overall processing result obtained by combining all processing stages.
Step 202, determining second distribution data corresponding to the processing result of the request in each processing stage in the same kind of exception request according to different processing stages of the request.
For each processing stage of the request, this step determines the position of the target request's abnormal data in that stage among the abnormal data of all requests that include the same stage. Note that the distribution data in this step uses the same sort order as step 201.
The second distribution data is a set of distribution positions corresponding to the target requests in each processing stage.
It can be seen that the first distribution information in the embodiment shown in fig. 1 specifically includes the first distribution data and the second distribution data.
Step 203, comparing the first distribution data with the second distribution data, and determining an exception processing stage of the request.
In the comparison, the rank of each processing stage in the second distribution data is compared with the rank in the first distribution data, which serves as the standard for judging abnormality. When the rank of a processing stage in the second distribution data exceeds the standard, that stage is determined to be an abnormal processing stage and is analyzed in the subsequent steps; otherwise the stage is considered normal and the subsequent steps are not needed for it. For example, suppose the delay of a slow request is 1 second and its first distribution data is 5th place in the request set. The request comprises three processing stages taking 0.2 seconds, 0.4 seconds and 0.4 seconds respectively, and the corresponding second distribution data are 3rd, 5th and 7th place. If an earlier rank indicates a greater abnormality, the abnormal processing stages of the slow request are determined to be the first and second processing stages.
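The worked example above can be reproduced with a short sketch. The function name and the stage labels are hypothetical; the comparison treats a stage whose rank is at or ahead of the overall rank as abnormal, which is what the example implies (the second stage, at 5th place, is flagged against an overall rank of 5th).

```python
from typing import Dict, List


def abnormal_stages(overall_rank: int, stage_ranks: Dict[str, int]) -> List[str]:
    """Stages whose rank is at or ahead of the request's overall rank.

    Ranks are 1-based and an earlier (smaller) rank means more abnormal.
    """
    return [stage for stage, rank in stage_ranks.items() if rank <= overall_rank]


# Overall rank 5; per-stage ranks 3, 5 and 7, as in the example.
print(abnormal_stages(5, {"stage_1": 3, "stage_2": 5, "stage_3": 7}))
# → ['stage_1', 'stage_2']
```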
Step 204, acquiring the abnormal data of the similar abnormal requests in the abnormal processing stage according to the index data associated with each abnormal processing stage, to obtain a first set.
Each processing stage of the target request has its own associated index data, which may be predefined.
In this step, requests that include the abnormal processing stage and have index data of the same type are searched for in the preset request set, and the abnormal data of the corresponding abnormality type is extracted from those requests to generate the first set.
Step 205, determining third distribution data of the request's abnormal data in the abnormal processing stage within the first set.
The third distribution data refers to an arrangement position corresponding to the abnormal data of the target request in the first set, and the ordering manner of the third distribution data is not necessarily related to the ordering manner in the step 201 and the step 202.
Step 206, obtaining request index data of the same type from an offline data set, and extracting the processing data corresponding to the abnormal processing stage from the request index data to obtain a second set.
The offline data set is obtained by consolidating all historical requests by processing stage and compiling the index data associated with each processing stage. Requests with index data of the same type are looked up in the offline data set and their request index data obtained. Here the request index data refers specifically to the values of the index data. The processing data corresponding to the abnormal processing stage is extracted from the request index data and combined into the second set. That is, the elements of the second set are the abnormal data values of requests that have the same processing stage, gathered on the basis of one type of index data.
Further, to improve the accuracy of the subsequent comparison, the request index data may be selected according to a first value, namely the value of the index data corresponding to the target request. The values of the selected request index data lie within a preset range centered on the first value, that is, they are close to the first value.
Step 207, determining fourth distribution data of the request's abnormal data in the abnormal processing stage within the second set.
This step is similar to the processing of step 205, and the data distribution ordering is the same as that of step 205.
It should be noted that, the fourth distribution data is obtained for one type of index data, and when the exception processing stage of the target request has a plurality of associated index data, the fourth distribution data may have a plurality of corresponding values.
Step 208, comparing the third distribution data with the fourth distribution data, and determining index data causing the request to generate an abnormality.
In this step, the fourth distribution data serves as the judgment standard against which the third distribution data is measured. When the third distribution data exceeds the standard, the index data corresponding to the fourth distribution data is determined to be a cause of the target request's abnormality. For example, if the first and second of a slow request's three processing stages are abnormal processing stages, third and fourth distribution data are determined for each of these two stages. The third distribution data is determined under a preset sorting rule, such as sorting by delay time or by CPU occupancy, and one value is obtained for each item of index data; the fourth distribution data is obtained from the data in the offline data set. If, for an item of index data such as the delay time or the CPU occupancy, the value of the fourth distribution data is greater than that of the third distribution data, that is, the third distribution data ranks ahead of the standard, that index data is determined to be a cause of the slow request's abnormality.
Further, after the cause of the target request's abnormality has been determined, which may be one or more items of index data, alarm information is generated from the cause, so that technicians can optimize and improve the abnormal processing stage according to the alarm information, thereby reducing the occurrence of request exceptions and improving the user experience.
Further, to describe more clearly how the embodiments shown in fig. 1 and fig. 2 analyze and determine the cause of a request exception, a slow request is taken as an example below, with the data sorted in descending order, from largest to smallest. The specific implementation is shown in fig. 3 and comprises the following steps:
Step 301, determining a corresponding percentile of the slow request in a preset request set according to the delay time of the slow request, so as to obtain first distribution data.
To calculate the percentile, the delay time of each request in the preset request set is first determined and the delay times are sorted in descending order. The target delay time closest to the delay time of the slow request is then located among them, and the percentile is calculated from the position of the target delay time in the sorted set, giving the first distribution data of the slow request.
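A minimal sketch of this percentile calculation is given below. The text does not state the exact formula, so the sketch assumes the percentile is the 1-based position of the closest delay time divided by the size of the set, times 100; the function name is hypothetical.

```python
from typing import Sequence


def reverse_percentile(delay: float, delays: Sequence[float]) -> float:
    """Percentile of delay within delays sorted from largest to smallest.

    A smaller result means the delay sits nearer the slow end of the set.
    """
    if not delays:
        raise ValueError("delays must not be empty")
    ordered = sorted(delays, reverse=True)
    # Locate the target delay time closest to the slow request's delay.
    nearest = min(range(len(ordered)), key=lambda i: abs(ordered[i] - delay))
    # Assumed formula: 1-based position over set size, as a percentage.
    return 100.0 * (nearest + 1) / len(ordered)
```

Matching on the closest delay time means the slow request itself need not be a member of the set being ranked.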
Step 302, determining a corresponding percentile of the delay time of the slow request in each processing stage in a preset request set according to different processing stages of the slow request, and obtaining second distribution data.
In this embodiment, the different processing stages of the slow request may be represented by span nodes in a trace tree. The trace tree is a tree structure formed from the individual calls made on behalf of one request along the call chain of the distributed system, and each span node represents a different processing stage.
Specifically, for each node, this step collects all requests in the preset request set that include the node and sorts them in descending order of their delay time at the node. It then determines the percentile of the slow request's delay time at the node among the delay times of all those requests. The calculation is the same as the percentile calculation in step 301 and is not repeated here. The percentiles for the different nodes obtained in this way form the second distribution data.
Step 303, comparing the values of the first distribution data and the second distribution data, and if the first distribution data is greater than or equal to the value of the second distribution data, determining that the processing stage corresponding to the second distribution data is an abnormal processing stage of the slow request.
Because the data is sorted in descending order, a smaller percentile, that is, an earlier position, indicates a longer delay time. Accordingly, when the first distribution data is greater than or equal to the value of the second distribution data, the processing stage ranks at least as early among its peers as the slow request does overall, which means its delay time is relatively large. The processing stage corresponding to the second distribution data can therefore be determined to be an abnormal processing stage of the slow request, namely a stage in which the slow request consumes a serious amount of time.
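Steps 301 to 303 can be sketched together as follows. The sketch is illustrative: it assumes each request is a mapping from span name to delay time, that the overall delay is the sum of the span delays, and the same rank-over-size percentile formula as an assumption; none of the names come from the original.

```python
from typing import Dict, List, Sequence


def reverse_percentile(delay: float, delays: Sequence[float]) -> float:
    """Percentile of the closest delay in a descending sort (assumed formula)."""
    ordered = sorted(delays, reverse=True)
    nearest = min(range(len(ordered)), key=lambda i: abs(ordered[i] - delay))
    return 100.0 * (nearest + 1) / len(ordered)


def abnormal_spans(slow: Dict[str, float], requests: List[Dict[str, float]]) -> List[str]:
    """Spans of the slow request flagged as abnormal processing stages."""
    # Step 301: first distribution data from the overall delay.
    first = reverse_percentile(
        sum(slow.values()), [sum(r.values()) for r in requests]
    )
    flagged = []
    for span, delay in slow.items():
        # Step 302: second distribution data among requests having this span.
        peers = [r[span] for r in requests if span in r]
        second = reverse_percentile(delay, peers)
        # Step 303: flag the span when first >= second.
        if first >= second:
            flagged.append(span)
    return flagged
```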
Step 304, according to the index data associated with each exception processing stage, obtaining the delay time of each request in the preset request set in the exception processing stage, and obtaining a first set.
The index data at least comprises indexes for measuring the working states of a processor, a memory, a disk and a network, the occurrence frequency of hardware faults, system events, process events and the like.
A processing stage is generally associated in advance with several items of index data. Based on the associated index data, this step obtains the delay time, in the abnormal processing stage, of each request in the preset request set. For example, if the abnormal processing stage is stage A and one item of index data associated with stage A is the memory occupancy, this step finds all requests in the preset request set that include stage A and are associated with the memory occupancy, and obtains their delay times in stage A to generate the first set.
Step 305, determining a percentile of delay times of slow requests in the exception processing stage in the first set, and obtaining third distribution data.
The elements of the first set are sorted in descending order of delay time. Based on this ordering, the percentile of the slow request's delay time in the abnormal processing stage within the first set is calculated. The calculation is the same as in step 301 and is not repeated here.
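Steps 304 and 305 might be sketched as below. The record layout (a `delays` mapping and a per-stage `indicators` mapping), the indicator names and the percentile formula are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Sequence


def build_first_set(
    requests: List[Dict[str, Any]], stage: str, indicator: str
) -> List[float]:
    """Delays in `stage` of requests that have the stage and the indicator."""
    selected = [
        r["delays"][stage]
        for r in requests
        if stage in r["delays"] and indicator in r["indicators"].get(stage, {})
    ]
    return sorted(selected, reverse=True)  # descending, as in step 305


def reverse_percentile(delay: float, delays: Sequence[float]) -> float:
    """Percentile of the closest delay in a descending sort (assumed formula)."""
    ordered = sorted(delays, reverse=True)
    nearest = min(range(len(ordered)), key=lambda i: abs(ordered[i] - delay))
    return 100.0 * (nearest + 1) / len(ordered)


def third_distribution(
    slow_stage_delay: float,
    requests: List[Dict[str, Any]],
    stage: str,
    indicator: str,
) -> float:
    """Third distribution data: the slow request's percentile in the first set."""
    return reverse_percentile(
        slow_stage_delay, build_first_set(requests, stage, indicator)
    )
```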
Step 306, obtaining the request index data in the offline data set, wherein the index data has the same index data, and the value of the index data is within a preset range centering on the value of the index data corresponding to the slow request.
If the index data is the memory occupancy rate, and the value of the index data is the memory occupancy rate=80%, then the acquired request index data is the index data with the memory occupancy rate of about 80% searched from the offline data set, the preset range can be set manually, the range size determines the number of the request index data, and meanwhile, the calculation accuracy of the subsequent steps is affected.
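As a minimal sketch of step 306, the following Python function selects offline records whose index value falls within a range centered on the slow request's value. The record layout (a dictionary per request), the key names `memory_occupancy` and `stage_A_delay`, and the parameter `half_width` are assumptions made for illustration only and are not specified by the embodiment.

```python
from typing import Dict, List

Record = Dict[str, float]


def select_request_index_data(
    offline_records: List[Record],
    index_name: str,
    slow_value: float,
    half_width: float,
) -> List[Record]:
    """Keep records whose `index_name` value lies in
    [slow_value - half_width, slow_value + half_width]."""
    low, high = slow_value - half_width, slow_value + half_width
    return [
        record
        for record in offline_records
        if index_name in record and low <= record[index_name] <= high
    ]


# Hypothetical offline data: memory occupancy in percent, stage delay in ms.
offline = [
    {"memory_occupancy": 60, "stage_A_delay": 12},
    {"memory_occupancy": 78, "stage_A_delay": 95},
    {"memory_occupancy": 80, "stage_A_delay": 110},
    {"memory_occupancy": 83, "stage_A_delay": 85},
    {"memory_occupancy": 95, "stage_A_delay": 300},
]

similar = select_request_index_data(offline, "memory_occupancy", 80, 5)
print([r["memory_occupancy"] for r in similar])
```

A wider `half_width` admits more records and gives a larger second set in step 307, at the cost of comparing the slow request with requests whose index value is less similar to its own.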
Step 307, extracting delay time corresponding to the exception processing stage in the request index data, to obtain a second set.
The second set is obtained by further screening, from the request index data obtained in step 306, the requests that have the same exception processing stage, and arranging the delay times of those requests in the exception processing stage in reverse order.
Step 308, determining the percentile of the delay time of the slow request in the exception processing stage in the second set, so as to obtain fourth distribution data.
The calculation method of this step is the same as that of step 305, and a specific calculation process is not described here again.
Step 309, comparing the values of the third distribution data and the fourth distribution data, and if the value of the third distribution data is less than or equal to the value of the fourth distribution data, determining that the index data corresponding to the fourth distribution data is the cause of time consumption of the slow request.
The third distribution data and the fourth distribution data are both obtained for the same index data. Because the fourth distribution data is based on the delay times of requests whose value of that index data is close to the value for the slow request, it can be regarded as a reference standard. When the value of the third distribution data is less than or equal to the value of the fourth distribution data, the index data has a relatively large influence on the time consumed in the exception processing stage, and the index data can therefore be determined to be one of the causes of the slow request's time consumption.
When the exception handling stage corresponds to a plurality of index data, the steps 304-309 are repeated for other index data to determine whether the other index data is the cause of slow request time consumption.
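Steps 304 to 309, repeated over several index data, can be sketched as follows. This is an illustrative reading rather than the embodiment's implementation: the dictionary record layout, the key names, the `half_width` parameter, the percentile definition (share of entries strictly larger than the value), and the choice to report "not a cause" when either set is empty are all assumptions.

```python
from typing import Dict, List

Record = Dict[str, float]


def reverse_percentile(value: float, population: List[float]) -> float:
    """Share (0 to 100) of entries strictly larger than `value` (assumed definition)."""
    larger = sum(1 for v in population if v > value)
    return 100.0 * larger / len(population)


def is_time_consuming_cause(
    slow: Record,
    request_set: List[Record],
    offline_set: List[Record],
    index_name: str,
    stage_key: str,
    half_width: float,
) -> bool:
    # Step 304: stage delays of requests associated with this index data.
    first_set = [r[stage_key] for r in request_set
                 if index_name in r and stage_key in r]
    # Step 306: offline records whose index value is close to the slow request's.
    low, high = slow[index_name] - half_width, slow[index_name] + half_width
    similar = [r for r in offline_set
               if index_name in r and low <= r[index_name] <= high]
    # Step 307: stage delays extracted from those records.
    second_set = [r[stage_key] for r in similar if stage_key in r]
    if not first_set or not second_set:
        return False  # assumption: no data means no conclusion
    # Steps 305 and 308: percentiles of the slow request in each set.
    third = reverse_percentile(slow[stage_key], first_set)
    fourth = reverse_percentile(slow[stage_key], second_set)
    # Step 309: the index data is a cause when third <= fourth.
    return third <= fourth


def time_consuming_causes(slow, request_set, offline_set,
                          index_names, stage_key, half_width) -> List[str]:
    """Repeat steps 304 to 309 for every index data of the stage."""
    return [name for name in index_names
            if is_time_consuming_cause(slow, request_set, offline_set,
                                       name, stage_key, half_width)]


slow = {"memory_occupancy": 80, "disk_usage": 50, "stage_A_delay": 90}
request_set = [
    slow,
    {"memory_occupancy": 82, "disk_usage": 48, "stage_A_delay": 100},
    {"memory_occupancy": 30, "disk_usage": 51, "stage_A_delay": 20},
    {"memory_occupancy": 35, "disk_usage": 49, "stage_A_delay": 15},
    {"memory_occupancy": 40, "disk_usage": 52, "stage_A_delay": 10},
]
offline_set = [
    {"memory_occupancy": 78, "disk_usage": 10, "stage_A_delay": 95},
    {"memory_occupancy": 81, "disk_usage": 12, "stage_A_delay": 110},
    {"memory_occupancy": 83, "disk_usage": 90, "stage_A_delay": 85},
    {"memory_occupancy": 79, "disk_usage": 11, "stage_A_delay": 120},
    {"memory_occupancy": 20, "disk_usage": 50, "stage_A_delay": 12},
    {"memory_occupancy": 25, "disk_usage": 52, "stage_A_delay": 14},
]
causes = time_consuming_causes(slow, request_set, offline_set,
                               ["memory_occupancy", "disk_usage"],
                               "stage_A_delay", 5)
print(causes)
```

In this made-up data, requests with a similar memory occupancy are typically just as slow in stage A, so the slow request's delay is ordinary among them and the memory occupancy is reported as a cause. Requests with a similar disk usage are fast, so the disk usage is not.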
In the method for processing a request exception provided by this embodiment, and in particular in determining the time-consuming causes of a slow request, the distribution of the target request in the request set at each processing stage is determined according to the exception type of the target request, the exception processing stages are identified from that distribution, and the cause of the exception is then analyzed for each exception processing stage. In other words, the embodiment of the invention determines the cause of a request exception by first finding the exception processing stages and then determining the cause from the index data corresponding to each exception processing stage. Throughout this process the target request is compared with all requests in a request set, drawing mainly on big data or historical data, which avoids the analysis defects of approaches that rely on expert experience, rules, preset thresholds and the like. The cause of the exception of the target request can therefore be analyzed accurately and objectively, the occurrence of request exceptions is reduced, and user experience is improved.
Further, as an implementation of the method shown in fig. 1 to 3, an embodiment of the present invention provides a processing device for request exception, where the device is mainly used to analyze an exception cause of a request through associated index data, so as to avoid occurrence of a large number of exception requests and improve user experience. For convenience of reading, the details of the foregoing method embodiment are not described one by one in the embodiment of the present apparatus, but it should be clear that the apparatus in this embodiment can correspondingly implement all the details of the foregoing method embodiment. The device is shown in fig. 4, and specifically comprises:
a first statistics unit 41, configured to determine, based on an abnormal type of a request, first distribution information of the request and processing results thereof in different processing stages in a preset request set;
A first determining unit 42, configured to determine at least one exception processing stage of the request according to the first distribution information obtained by the first statistics unit 41;
A second statistics unit 43, configured to determine, based on the index data associated with the processing stage, second distribution information of the exception data of the exception processing stage determined by the first determining unit 42 in the exception data corresponding to the similar exception request;
A second determining unit 44, configured to determine index data that causes the abnormality of the request according to the second distribution information obtained by the second statistics unit 43.
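The way the four units hand results to one another can be pictured with the following Python sketch. It is a structural illustration only: the class name `RequestExceptionDevice`, the callable signatures, and the stub units in the usage example are hypothetical and do not reflect how the units of fig. 4 are actually implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class RequestExceptionDevice:
    """Chains four callables that stand in for units 41 to 44."""

    first_statistics_unit: Callable[[Any], Dict[Any, Any]]              # unit 41
    first_determining_unit: Callable[[Dict[Any, Any]], List[str]]       # unit 42
    second_statistics_unit: Callable[[Any, List[str]], Dict[Any, Any]]  # unit 43
    second_determining_unit: Callable[[Dict[Any, Any]], List[str]]      # unit 44

    def analyze(self, request: Any) -> List[str]:
        first_info = self.first_statistics_unit(request)
        stages = self.first_determining_unit(first_info)
        second_info = self.second_statistics_unit(request, stages)
        return self.second_determining_unit(second_info)


# Stub units returning fixed, made-up distribution data.
device = RequestExceptionDevice(
    first_statistics_unit=lambda req: {"first": 20.0,
                                       "second": {"A": 0.0, "B": 40.0}},
    first_determining_unit=lambda info: [
        stage for stage, value in info["second"].items()
        if info["first"] >= value
    ],
    second_statistics_unit=lambda req, stages: {
        (stage, "memory_occupancy"): (20.0, 75.0) for stage in stages
    },
    second_determining_unit=lambda info: [
        name for (_, name), (third, fourth) in info.items() if third <= fourth
    ],
)
result = device.analyze({"id": 1})
print(result)
```

Keeping each unit as a separate callable mirrors the description, in which the output of one unit (distribution information or exception processing stages) is the input of the next.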
Further, as shown in fig. 5, the first statistics unit 41 includes:
a first statistics module 411, configured to determine, according to an exception type of a request, first distribution data of the request in an exception request of the same type;
and the second statistics module 412 is configured to determine, according to different processing stages of the request, second distribution data corresponding to a processing result of the request in each processing stage in the similar exception request.
Further, the first determining unit 42 is further configured to compare the first distribution data with the second distribution data to determine an exception processing stage of the request.
Further, as shown in fig. 5, the second statistics unit 43 includes:
A first obtaining module 431, configured to obtain, according to the index data associated with each exception processing stage, exception data of the same type of exception request in the exception processing stage, to obtain a first set;
A first calculation module 432, configured to determine third distribution data of the exception data of the request in the exception processing stage in the first set obtained by the first obtaining module 431;
A second obtaining module 433, configured to obtain request index data having the same type in an offline data set, and extract processing data corresponding to the exception processing stage in the request index data to obtain a second set, where the request index data is a value of the index data;
A second calculation module 434, configured to determine fourth distribution data of the exception data of the request in the exception processing stage in the second set obtained by the second obtaining module 433.
Further, the second obtaining module 433 is further configured to obtain, according to the first value of the index data corresponding to the request, the requested index data, where the value of the requested index data is within a preset range centered on the first value.
Further, the second determining unit 44 is further configured to compare the third distribution data with the fourth distribution data, and determine index data that causes the abnormality in the request.
Further, when the request is a slow request and the first distribution information is ordered in reverse order from big to small, the first statistics unit 41 is specifically configured to:
The first statistics module 411 is further configured to determine, according to a delay time of the slow request, a percentile corresponding to the slow request in a preset request set, so as to obtain first distribution data;
The second statistics module 412 is further configured to determine, according to different processing stages of the slow request, a percentile corresponding to a delay time of the slow request in each processing stage in a preset request set, so as to obtain second distribution data.
Further, as shown in fig. 5, the first determining unit 42 further includes:
A comparison module 421 for comparing the values of the first distribution data and the second distribution data;
A determining module 422, configured to determine, when the value of the first distribution data is greater than or equal to the value of the second distribution data, that the processing stage corresponding to the second distribution data is an abnormal processing stage of the request.
Further, when the request is a slow request and the second distribution information is ordered in reverse order from big to small, the second statistics unit 43 is specifically configured to:
The first obtaining module 431 is further configured to obtain a delay time of each request in the preset request set in the exception processing stage according to index data associated with each exception processing stage, so as to obtain a first set, where the index data at least includes metrics that measure the working states of a processor, a memory, a disk, and a network, and occurrence frequencies of hardware faults, system events, and process events;
the first calculating module 432 is further configured to determine a percentile of the delay time of the slow request in the exception processing stage in the first set obtained by the first obtaining module 431, so as to obtain third distribution data;
the second obtaining module 433 is further configured to obtain request index data in the offline data set, where the value of the index data is within a preset range centered on the value of the index data corresponding to the slow request; extracting delay time corresponding to the exception processing stage in the request index data to obtain a second set;
The second calculating module 434 is further configured to determine a percentile of the delay time of the slow request in the exception handling stage in the second set obtained by the second obtaining module 433, to obtain fourth distribution data.
Further, as shown in fig. 5, the second determining unit 44 further includes:
A comparison module 441, configured to compare the value of the third distribution data with the value of the fourth distribution data;
A determining module 442, configured to determine, when the value of the third distribution data is less than or equal to the value of the fourth distribution data, that the index data corresponding to the fourth distribution data is the cause of the slow request consuming time.
Further, the embodiment of the invention also provides a processor, which is used for running a program, wherein the program, when run, executes the method for processing the request exception as shown in fig. 1-3.
In addition, the embodiment of the invention also provides electronic equipment, which comprises a processor and a memory, wherein the memory is used for storing a program, and the processor is coupled to the memory and used for running the program to execute the method for processing the request exception as shown in fig. 1-3.
In addition, the embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to implement the method for processing the request exception as shown in fig. 1-3.
In the foregoing embodiments, each embodiment is described with its own emphasis, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
It will be appreciated that the relevant features of the methods and apparatus described above may be referenced to one another. In addition, the "first", "second", and the like in the above embodiments are used only to distinguish the embodiments, and do not indicate that one embodiment is better or worse than another.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In addition, the memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory comprises at least one memory chip.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises an element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (12)

1. A method for handling a request exception, the method comprising:
Determining first distribution information of the requests and processing results thereof in different processing stages in a preset request set based on the abnormal types of the requests, wherein the preset request set stores a plurality of requests and related data, the requests in the preset request set comprise pre-designated requests or all history requests, the distribution information refers to ranking in the set determined according to a certain ordering sequence, the processing results are used for indicating values of abnormal data, and the first distribution information refers to ranking of the values of the abnormal data in the preset request set;
determining at least one exception handling stage of the request according to the first distribution information;
Determining second distribution information of the abnormal data of each abnormal processing stage in the abnormal data of the same type of abnormal request according to the index data associated with the processing stage;
Determining index data which causes the request to generate abnormality according to the second distribution information;
The determining, based on the abnormal type of the request, first distribution information of the request and processing results thereof in different processing stages in a preset request set includes:
determining first distribution data of the requests in homogeneous abnormal requests according to the abnormal types of the requests, wherein the homogeneous abnormal requests refer to all requests in the preset request set or requests with homogeneous abnormal conditions in the preset request set;
Determining second distribution data corresponding to the processing result of the request in each processing stage in the similar abnormal requests according to different processing stages of the request;
The determining, according to the index data of the processing stages, second distribution information of the exception data of each exception processing stage in the exception data of the same type of exception request includes:
Acquiring the abnormal data of the same type of abnormal requests in the abnormal processing stage according to the index data associated with each abnormal processing stage to obtain a first set;
Determining third distribution data of the abnormal data of the request in the abnormal processing stage in the first set;
Acquiring request index data with the same type in an offline data set, extracting processing data corresponding to the abnormal processing stage in the request index data to obtain a second set, wherein the offline data set is a set obtained by integrating all historical requests according to the processing stages and counting index data associated with each processing stage;
determining fourth distribution data of the abnormal data of the request in the abnormal processing stage in the second set;
the determining at least one exception handling stage of the request according to the first distribution information comprises:
comparing the values of the first distribution data and the second distribution data;
If the value of the first distribution data is greater than or equal to the value of the second distribution data, determining that the processing stage corresponding to the second distribution data is an abnormal processing stage of the request;
the determining, according to the second distribution information, index data that causes the request to generate an exception includes:
Comparing the values of the third distribution data with the values of the fourth distribution data;
And if the value of the third distribution data is smaller than or equal to the value of the fourth distribution data, determining that the index data corresponding to the fourth distribution data is the reason for causing time consumption of the slow request.
2. The method of claim 1, wherein determining at least one exception handling phase of the request from the first distribution information comprises:
and comparing the first distribution data with the second distribution data, and determining an abnormal processing stage of the request.
3. The method of claim 1, wherein obtaining request metric data of the same type in the offline dataset comprises:
And acquiring the request index data according to the first value of the index data corresponding to the request, wherein the value of the request index data is within a preset range taking the first value as the center.
4. The method of claim 1, wherein determining, based on the second distribution information, index data that causes the request to be anomalous comprises:
And comparing the third distribution data with the fourth distribution data, and determining index data which causes the request to generate abnormality.
5. The method according to claim 1, wherein when the request is a slow request and the first distribution information is ordered in reverse order from big to small, the determining the first distribution information of the request and the processing results thereof in different processing stages in a preset request set based on the abnormal category of the request includes:
Determining a corresponding percentile of the slow request in a preset request set according to the delay time of the slow request to obtain the first distribution data;
and determining the corresponding percentile of the delay time of the slow request in each processing stage in the preset request set according to different processing stages of the slow request, and obtaining the second distribution data.
6. The method of claim 1, wherein when the request is a slow request and the second distribution information is ordered in reverse order from big to small, the determining, according to the index data of the processing stage, the second distribution information of the exception data of the exception processing stage in the exception data of the same type of exception request includes:
acquiring delay time of each request in a preset request set in the exception processing stage according to index data associated with each exception processing stage to obtain the first set;
Determining the percentile of the delay time of the slow request in the exception processing stage in the first set to obtain the third distribution data;
acquiring request index data which have the same index data in an offline data set and have values within a preset range centering on the values of the index data corresponding to the slow request;
Extracting delay time corresponding to the exception processing stage in the request index data to obtain the second set;
Determining the percentile of the delay time of the slow request in the exception processing stage in the second set, and obtaining the fourth distribution data.
7. A device for handling a request exception, the device comprising:
A first statistics unit, configured to determine, based on an abnormal type of a request, first distribution information of the request and processing results thereof in different processing stages in a preset request set, where the preset request set stores a plurality of requests and related data, where the requests in the preset request set include a pre-specified request or all history requests, the distribution information refers to a ranking in the set determined according to a certain ordering order, the processing results are used to indicate a value of abnormal data, and the first distribution information refers to a ranking of the value of the abnormal data in the preset request set;
A first determining unit, configured to determine at least one exception processing stage of the request according to the first distribution information obtained by the first statistics unit;
the second statistical unit is used for determining second distribution information of the abnormal data of each abnormal processing stage determined by the first determination unit in the abnormal data of the similar abnormal request according to the index data associated with the processing stage based on the abnormal processing stage determined by the first determination unit;
A second determining unit, configured to determine index data that causes the abnormality of the request according to the second distribution information obtained by the second statistics unit;
wherein the first statistical unit includes:
the first statistics module is used for determining first distribution data of the request in the similar abnormal requests according to the abnormal types of the request;
the second statistical module is used for determining second distribution data corresponding to the processing result of the request in each processing stage in the similar abnormal requests according to different processing stages of the request;
the second statistical unit includes:
the first acquisition module is used for acquiring the abnormal data of the similar abnormal requests in the abnormal processing stage according to the index data associated with each abnormal processing stage to obtain a first set;
a first calculation module, configured to determine third distribution data of the abnormal data of the request in the first set obtained by the first obtaining module in the abnormal processing stage;
The second acquisition module is used for acquiring request index data with the same type in the offline data set, extracting processing data corresponding to the abnormal processing stage in the request index data, and obtaining a second set;
A second calculation module, configured to determine fourth distribution data of the abnormal data of the request in the abnormal processing stage in the second set obtained by the second acquisition module;
The first determination unit includes:
the comparison module is used for comparing the values of the first distribution data and the second distribution data;
The determining module is used for determining that the processing stage corresponding to the second distribution data is the abnormal processing stage of the request when the value of the first distribution data is larger than or equal to the value of the second distribution data;
The second determination unit includes:
The comparison module is used for comparing the value of the third distribution data with the value of the fourth distribution data;
And the determining module is used for determining index data corresponding to fourth distribution data as a cause of time consumption of the slow request when the value of the third distribution data is smaller than or equal to the value of the fourth distribution data.
8. The apparatus of claim 7, wherein when the request is a slow request and the first distribution information is ordered in reverse order from big to small, the first statistics unit is configured to:
The first statistics module is further configured to determine, according to a delay time of a slow request, a percentile corresponding to the slow request in a preset request set, so as to obtain the first distribution data;
The second statistical module is further configured to determine, according to different processing stages of the slow request, a percentile corresponding to a delay time of the slow request in each processing stage in the preset request set, so as to obtain the second distribution data.
9. The apparatus of claim 7, wherein when the request is a slow request and the second distribution information is ordered in reverse order from big to small, the second statistics unit is specifically configured to:
The first obtaining module is further configured to obtain a delay time of each request in a preset request set in the exception processing stage according to index data associated with each exception processing stage, so as to obtain a first set;
the first calculation module is further configured to determine a percentile of a delay time of the slow request in the exception processing stage in the first set obtained by the first obtaining module, so as to obtain the third distribution data;
The second acquisition module is further used for acquiring request index data which have the same index data in the offline data set and the value of which is within a preset range centering on the value of the index data corresponding to the slow request; extracting delay time corresponding to the exception processing stage in the request index data to obtain a second set;
the second calculation module is further configured to determine a percentile of a delay time of the slow request in the exception processing stage in the second set obtained by the second obtaining module, so as to obtain the fourth distribution data.
10. A processor for running a program, wherein the program, when run, performs the method of handling a request exception as claimed in any one of claims 1 to 6.
11. An electronic device, comprising:
A memory for storing a program;
A processor coupled to the memory for running the program to perform the method of handling a request exception as claimed in any one of claims 1 to 6.
12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of handling a request exception as claimed in any one of claims 1 to 6.
CN201911197832.6A 2019-11-29 2019-11-29 Method and device for processing request exception Active CN112882854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911197832.6A CN112882854B (en) 2019-11-29 2019-11-29 Method and device for processing request exception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911197832.6A CN112882854B (en) 2019-11-29 2019-11-29 Method and device for processing request exception

Publications (2)

Publication Number Publication Date
CN112882854A CN112882854A (en) 2021-06-01
CN112882854B CN112882854B (en) 2024-06-11

Family

ID=76039387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911197832.6A Active CN112882854B (en) 2019-11-29 2019-11-29 Method and device for processing request exception

Country Status (1)

Country Link
CN (1) CN112882854B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115756782A (en) * 2022-11-15 2023-03-07 支付宝(杭州)信息技术有限公司 Large-scale alarm defense deploying method, device and equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107181607A (en) * 2016-03-11 2017-09-19 ***通信集团内蒙古有限公司 One kind is based on application system Fault Locating Method and device end to end
CN108259241A (en) * 2018-01-11 2018-07-06 上海有云信息技术有限公司 A kind of abnormal localization method and device of cloud platform monitoring system
CN108933695A (en) * 2018-06-25 2018-12-04 百度在线网络技术(北京)有限公司 Method and apparatus for handling information
CN108990092A (en) * 2018-08-21 2018-12-11 麒麟合盛网络技术股份有限公司 Communication abnormality localization method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080215576A1 (en) * 2008-03-05 2008-09-04 Quantum Intelligence, Inc. Fusion and visualization for multiple anomaly detection systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于异常分析的电力信息通信***运维策略;王志强;吴庆;张拯;胡斌;杨乐;宋潇杨;;陕西电力;20160420(04);全文 *

Also Published As

Publication number Publication date
CN112882854A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN111092757B (en) Abnormal data detection method, system and equipment
EP3425524A1 (en) Cloud platform-based client application data calculation method and device
CN110601900B (en) Network fault early warning method and device
US10031829B2 (en) Method and system for it resources performance analysis
CN109934268B (en) Abnormal transaction detection method and system
KR102097953B1 (en) Failure risk index estimation device and failure risk index estimation method
CN107220121B (en) Sandbox environment testing method and system under NUMA architecture
CN106445938B (en) Data detection method and device
CN113535454B (en) Log data anomaly detection method and device
CN115098740B (en) Data quality detection method and device based on multi-source heterogeneous data source
CN110377519B (en) Performance capacity test method, device and equipment of big data system and storage medium
CN112882854B (en) Method and device for processing request exception
US20190266136A1 (en) Data sampling in a storage system
CN112526905B (en) Processing method and system for index abnormity
CN108463813B (en) Method and device for processing data
CN112749035B (en) Abnormality detection method, abnormality detection device, and computer-readable medium
CN115658441B (en) Method, equipment and medium for monitoring abnormality of household service system based on log
WO2019046996A1 (en) Java software latency anomaly detection
CN110069379B (en) Screening method and screening device for monitoring indexes
CN110991241A (en) Abnormality recognition method, apparatus, and computer-readable medium
CN112445687A (en) Blocking detection method of computing equipment and related device
CN116319255A (en) Root cause positioning method, device, equipment and storage medium based on KPI
CN115690681A (en) Processing method of abnormity judgment basis, abnormity judgment method and device
CN111367781B (en) Instance processing method and device
CN114154668A (en) IT system capacity expansion prediction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant