CN113595783B - Fault positioning method, device, server and computer storage medium

Info

Publication number
CN113595783B
Authority
CN
China
Prior art keywords
network
determining
network equipment
level
hash
Prior art date
Legal status
Active
Application number
CN202110850027.XA
Other languages
Chinese (zh)
Other versions
CN113595783A (en
Inventor
张路晗
丁利锋
魏宇涛
Current Assignee
China Construction Bank Corp
Original Assignee
China Construction Bank Corp

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults
    • H04L41/0631: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/065: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application provides a fault positioning method, a fault positioning device, a server and a computer storage medium, wherein the fault positioning method comprises the following steps: firstly, determining a hash strategy and a hash function of each level of network equipment under a target service path; then, determining all levels of network nodes for forwarding the target service path flow according to the hash strategies and hash functions of all the network equipment; determining a transmission track and an interaction track forwarded by the target service path flow according to the network nodes of each level to obtain the network paths of the network nodes of each level; generating a network topological graph of the target service path according to the network path; then, aiming at each level of network equipment in the network topology, obtaining a pre-analysis result of the network equipment; and finally, fault positioning is carried out according to the pre-analysis results of all the network equipment to obtain a fault positioning result. Therefore, the purpose of quickly and accurately positioning and analyzing the fault is achieved.

Description

Fault positioning method, device, server and computer storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for locating a fault, a server, and a computer storage medium.
Background
At present, more and more users build their own cloud network environments through VPCs, private line gateways, VPN gateways and the like, and migrate their services to the cloud. Cloud services mostly rely on three-layer TOR/LEAF/SPINE networking, so for much of the network traffic on the cloud the traffic-bearing path has to be located by analyzing the hash strategies along the cloud service path before a faulty node of a cloud network gateway cluster can be located.
However, as service traffic increases, when the forwarding of service feedback data becomes abnormal, the faulty node cannot be located quickly because of the massive number of nodes and links in the cloud network.
Disclosure of Invention
In view of this, the present application provides a method, an apparatus, a server and a computer storage medium for locating a fault, which can quickly and accurately perform location analysis on the fault.
The first aspect of the present application provides a method for locating a fault, including:
determining hash strategies and hash functions of network equipment of each stage under a target service path;
determining all levels of network nodes for forwarding the target service path flow according to the hash strategies and hash functions of all the network equipment;
determining a transmission track and an interaction track forwarded by the target service path flow according to the network nodes at all levels to obtain a network path of the network nodes at all levels;
generating a network topological graph of the target service path according to the network path;
aiming at each level of network equipment in the network topology, obtaining a pre-analysis result of the network equipment;
and carrying out fault positioning according to the pre-analysis results of all the network equipment to obtain a fault positioning result.
Optionally, the determining, according to the hash policy and the hash function of all the network devices, each level of network nodes to which the target service path traffic is forwarded includes:
for each network device, determining a first unit identifier of a first hash array according to a first hash function of the network device, and determining a second unit identifier of a second hash array according to a second hash function; wherein each cell in the first hash array is used to store the second hash array; each unit in the second hash array is used as a pointer to point to a storage unit;
determining a target second hash array according to the first unit identification and the second unit identification;
traversing all the storage units pointed by the target second hash array, and determining first-level network equipment distributed by the target service path flow;
and determining each level of network nodes forwarded by the target service path flow according to all the first level of network equipment.
Optionally, the obtaining, for each level of network devices in the network topology, a pre-analysis result of the network device includes:
and for each level of network equipment in the network topology, when the number of fault logs in the network flow logs of the network equipment reaches a threshold value, recording the logs of the forwarding paths of each piece of network equipment, and taking the logs of the forwarding paths of all pieces of network equipment as the analysis result of the network equipment.
Optionally, the obtaining, for each level of network devices in the network topology, a pre-analysis result of the network device includes:
and aiming at each level of network equipment in the network topology, carrying out health check on the network equipment, and taking a health check result as a pre-analysis result of the network equipment.
Optionally, the obtaining, for each level of network devices in the network topology, a pre-analysis result of the network device includes:
and capturing the packet of each level of network equipment in the network topology by using a preset packet capturing tool to obtain a packet capturing result, and taking the packet capturing result as a pre-analysis result of the network equipment.
Optionally, the performing fault location according to the pre-analysis results of all the network devices to obtain fault location results further includes:
and storing the fault positioning result.
The second aspect of the present application provides a fault location device, including:
the determining unit is used for determining the hash strategy and the hash function of each level of network equipment under the target service path;
the network node determining unit is used for determining all levels of network nodes forwarded by the target service path flow according to the hash strategies and hash functions of all the network equipment;
the network path determining unit is used for determining a transmission track and an interaction track forwarded by the target service path flow according to the network nodes at all levels to obtain the network paths of the network nodes at all levels;
a generating unit, configured to generate a network topology map of the target service path according to the network path;
the pre-analysis unit is used for obtaining a pre-analysis result of the network equipment for each level of the network equipment in the network topology;
and the fault positioning unit is used for positioning the fault according to the pre-analysis results of all the network equipment to obtain a fault positioning result.
Optionally, the network node determining unit includes:
the identification determining unit is used for determining a first unit identification of the first hash array according to a first hash function of each network device and determining a second unit identification of the second hash array according to a second hash function; wherein each cell in the first hash array is used to store the second hash array; each unit in the second hash array is used as a pointer to point to a storage unit;
the hash array determining unit is used for determining a target second hash array according to the first unit identifier and the second unit identifier;
the network equipment determining unit is used for traversing all the storage units pointed by the target second hash array and determining first-level network equipment distributed by the target service path flow;
and the network node determining subunit is used for determining each level of network nodes forwarded by the target service path flow according to all the first level network devices.
Optionally, the pre-analysis unit includes:
and the recording unit is used for recording the log of the forwarding path of each network device when the number of fault logs in the network flow log of the network device reaches a threshold value aiming at each level of network device in the network topology, and taking the logs of the forwarding paths of all the network devices as the analysis result of the network device.
Optionally, the pre-analysis unit includes:
and the health check unit is used for carrying out health check on the network equipment aiming at each level of network equipment in the network topology and taking a health check result as a pre-analysis result of the network equipment.
Optionally, the pre-analysis unit includes:
and the packet capturing unit is used for capturing packets of the network equipment by using a preset packet capturing tool aiming at each level of the network equipment in the network topology to obtain a packet capturing result, and the packet capturing result is used as a pre-analysis result of the network equipment.
Optionally, the fault locating device further includes:
and the storage unit is used for storing the fault positioning result.
A third aspect of the present application provides a server comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of locating a fault as defined in any of the first aspects.
A fourth aspect of the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for locating a fault according to any one of the first aspects.
In view of the above, the present application provides a method, an apparatus, a server and a computer storage medium for locating a fault, where the method for locating a fault includes: firstly, determining a hash strategy and a hash function of each level of network equipment under a target service path; then, determining all levels of network nodes for forwarding the target service path flow according to the hash strategies and hash functions of all the network equipment; determining a transmission track and an interaction track forwarded by the target service path flow according to the network nodes of each level to obtain the network paths of the network nodes of each level; generating a network topological graph of the target service path according to the network path; then, aiming at each level of network equipment in the network topology, obtaining a pre-analysis result of the network equipment; and finally, fault positioning is carried out according to the pre-analysis results of all the network equipment to obtain a fault positioning result. Therefore, the purpose of quickly and accurately positioning and analyzing the fault is achieved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a detailed flowchart of a fault location method according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a complete on-cloud service access path of a private line gateway according to another embodiment of the present application;
Fig. 3 is a detailed flowchart of determining each level of network nodes for target service path traffic forwarding according to another embodiment of the present application;
Fig. 4 is a schematic view of a fault location device according to another embodiment of the present application;
Fig. 5 is a schematic diagram of a server implementing a fault location method according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first", "second", and the like, referred to in this application, are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence of functions performed by these devices, modules or units. The term "include" or any other variation thereof is intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The terms appearing in the present application are explained first:
Wireshark: network packet analysis software.
tcpdump: a tool that intercepts network packets and outputs the packet contents.
VPC: Virtual Private Cloud, an isolated network space on the cloud.
NAT gateway: a cloud service that provides high-performance Internet public network access to resources within the cloud.
VPN gateway: a cloud service that interconnects a user's local data center with resources on the cloud.
Private line gateway: a cloud service that provides a fast and reliable connection between a user data center and resources on the cloud.
At present, networks for cloud computing and data centers mostly adopt three-layer TOR/LEAF/SPINE networking because of their scale. Service network paths mostly rely on hash strategies for route forwarding, and multiple layers of hash strategies exist across the multi-level devices. Because the hash strategies of the devices at different levels are not fully consistent, the route forwarding path across the multi-level devices cannot be pinned down to specific devices, and the cloud network gateway cluster on the server side cannot locate faults quickly.
Therefore, an embodiment of the present application provides a method for locating a fault, as shown in fig. 1, specifically including the following steps:
S101, determining the hash strategy and hash function of each level of network device under a target service path.
Taking the five-tuple hash as an example, the source address sip, the destination address dip, the source port sport, the destination port dport and the transport protocol used in forwarding the target service traffic are obtained from the session information.
Hash functions commonly used at present are addition hash, bit operation hash (usually shift and xor), multiplication hash, division hash, table lookup hash (CRC series algorithm), and hybrid hash.
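By way of non-limiting illustration, the five-tuple and several of the hash families listed above might be sketched in Python as follows. The `FiveTuple` class, the function names, the byte layout and the constants (the 5-bit rotation, the multiplier 31) are assumptions made for this sketch and are not specified by the present application. The division hash corresponds to the final modulo step shared by all of the functions.

```python
import zlib
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class FiveTuple:
    """Session five-tuple; field names follow the text (sip, dip, sport, dport)."""
    sip: str
    dip: str
    sport: int
    dport: int
    proto: int  # IP protocol number, e.g. 6 for TCP (assumed encoding)

    def as_bytes(self) -> bytes:
        # Fixed-width serialization so that equal tuples always hash identically.
        return (
            IPv4Address(self.sip).packed
            + IPv4Address(self.dip).packed
            + self.sport.to_bytes(2, "big")
            + self.dport.to_bytes(2, "big")
            + self.proto.to_bytes(1, "big")
        )


def additive_hash(data: bytes, size: int) -> int:
    """Addition hash: sum the bytes, then reduce modulo the table size."""
    return sum(data) % size


def xor_shift_hash(data: bytes, size: int) -> int:
    """Bit-operation hash: 32-bit rotate-left by 5, then XOR in each byte."""
    h = 0
    for b in data:
        h = (((h << 5) | (h >> 27)) ^ b) & 0xFFFFFFFF
    return h % size


def multiplicative_hash(data: bytes, size: int) -> int:
    """Multiplication hash: multiply by a small odd constant for each byte."""
    h = 0
    for b in data:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h % size


def crc32_hash(data: bytes, size: int) -> int:
    """Table-lookup hash: CRC-32 as provided by the standard library."""
    return zlib.crc32(data) % size
```

Each function maps the same serialized five-tuple to a unit index in `range(size)`, so two devices configured with different functions may place the same flow in different units.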
S102, determining each level of network nodes for forwarding the target service path flow according to the hash strategies and hash functions of all the network devices.
At present, common cloud gateways in cloud services include a VPC gateway, an NAT gateway, a VPN gateway, a private line gateway, and the like.
Taking a dedicated line gateway as an example, as shown in fig. 2, a complete cloud service access path of the dedicated line gateway is CVM-master-cloud gateway-TOR switch-LEAF switch-span switch-dedicated line access switch-opposite terminal IDC, and routing based on a hash policy exists among the TOR switch, the LEAF switch, and the span switch, so that a traffic forwarding path is more complex, and therefore, it is very important to generate a network topology to each network node for fast fault location.
Optionally, in another embodiment of the present application, an implementation manner of step S102, as shown in fig. 3, includes:
S301, for each network device, determining a first unit identifier of a first hash array according to a first hash function of the network device, and determining a second unit identifier of a second hash array according to a second hash function.
Continuing with the above example, according to the obtained five-tuple hash information, the first unit identifier of the first hash array is determined based on the first hash function, and the second unit identifier of the second hash array is determined according to the second hash function.
Each unit in the first hash array is used for storing a second hash array; each unit in the second hash array acts as a pointer to a storage unit.
Taking the CRC_32 algorithm as an example, the first hash function is F = CRC_32(M->link) % A.
Here, F is the first unit identifier in the first hash array, CRC_32 is a 32-bit cyclic redundancy check function, M->link is the IP five-tuple hash information obtained from the session storage unit, and A is the length of the first hash array.
Taking the exclusive-or algorithm as an example, the second hash function is f = (sip ^ dip ^ ((sport << 16) + dport)) % a.
Here, f is the second unit identifier in the second hash array, and a is the length of the second hash array.
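As a minimal sketch, the two example hash functions above could be written in Python as shown below. It is assumed here that the addresses are IPv4 and are treated as 32-bit integers, that `link` is a byte string standing in for M->link, and that the device uses the standard CRC-32 polynomial implemented by `zlib.crc32`; an actual device may use a different CRC variant.

```python
import zlib
from ipaddress import IPv4Address


def first_unit_id(link: bytes, A: int) -> int:
    """F = CRC_32(M->link) % A, with `link` standing in for M->link."""
    return zlib.crc32(link) % A


def second_unit_id(sip: str, dip: str, sport: int, dport: int, a: int) -> int:
    """f = (sip ^ dip ^ ((sport << 16) + dport)) % a on integer addresses."""
    s = int(IPv4Address(sip))
    d = int(IPv4Address(dip))
    return (s ^ d ^ ((sport << 16) + dport)) % a
```

For instance, with sip 10.0.0.1, dip 10.0.0.2, sport 1, dport 2 and a = 16, the XOR of the addresses is 3, (1 << 16) + 2 is 65538, their XOR is 65537, and the second unit identifier is 65537 % 16 = 1.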
S302, determining a target second hash array according to the first unit identifier and the second unit identifier.
Specifically, the first unit identifier gives a position in the first hash array, and the second unit identifier gives a position in the second hash array. Since each unit of the first hash array is itself a second hash array, combining the first unit identifier and the second unit identifier determines the target second hash array.
S303, traversing all storage units pointed by the target second hash array, and determining the first-level network equipment distributed by the target service path flow.
Specifically, the first-level network device distributed by the target service path traffic may be determined by finding a value corresponding to the current quintuple hash information.
S304, determining each level of network nodes forwarded by the target service path flow according to all the first level network devices.
It is to be understood that the hash functions of different network devices are not necessarily identical.
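One possible reading of steps S301 to S303 is sketched below in Python. The data layout is an assumption: each storage unit is modeled as a dictionary from a five-tuple key to a device name, and the helper names `build_table`, `insert` and `lookup` are hypothetical. The lookup selects the target second hash array with F, then traverses its storage units starting at unit f until the entry for the current five-tuple is found.

```python
from typing import Dict, List, Optional

StorageUnit = Dict[bytes, str]             # five-tuple key -> device name
SecondArray = List[Optional[StorageUnit]]  # each unit points to a storage unit


def build_table(A: int, a: int) -> List[SecondArray]:
    """First hash array: A units, each holding a second hash array of a units."""
    return [[None] * a for _ in range(A)]


def insert(table: List[SecondArray], F: int, f: int,
           key: bytes, device: str) -> None:
    """Record that the flow identified by `key` is forwarded by `device`."""
    second = table[F]
    if second[f] is None:
        second[f] = {}
    second[f][key] = device


def lookup(table: List[SecondArray], F: int, f: int,
           key: bytes) -> Optional[str]:
    """Select the target second array with F, then traverse its storage
    units (starting at unit f) for the entry matching the five-tuple key."""
    second = table[F]
    a = len(second)
    for offset in range(a):
        unit = second[(f + offset) % a]
        if unit is not None and key in unit:
            return unit[key]
    return None
```

Starting the traversal at unit f means the common case is resolved on the first probe, while the remaining units are still visited if the entry is stored elsewhere in the target second hash array.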
S103, determining a transmission track and an interaction track of target service path flow forwarding according to each level of network nodes to obtain network paths of each level of network nodes.
S104, generating a network topology map of the target service path according to the network path.
In the present application, the specific network nodes through which traffic passes when each level of network device forwards the service are calculated from the hash strategy of each level of network device and the hash function of its forwarding path. The transmission track and interaction track of the service traffic on the cloud are thereby found automatically, and a clear forwarding network topology of the service path traffic is formed, which facilitates subsequent fault location and enables rapid troubleshooting.
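The path calculation and topology generation described above might be sketched as follows. This is only an illustration under stated assumptions: each level is described by a hypothetical pair of candidate devices and the hash function that level uses, the device names are invented, and the topology is represented as a simple directed adjacency map.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical per-level description: candidate devices plus the hash function
# that this level uses to pick one of them for a given flow key.
Level = Tuple[List[str], Callable[[bytes], int]]


def compute_path(start: str, levels: List[Level], flow_key: bytes) -> List[str]:
    """Pick one device per level using that level's own hash function."""
    path = [start]
    for devices, hash_fn in levels:
        path.append(devices[hash_fn(flow_key) % len(devices)])
    return path


def build_topology(paths: List[List[str]]) -> Dict[str, Set[str]]:
    """Merge forwarding paths into a directed adjacency map (the topology)."""
    graph: Dict[str, Set[str]] = {}
    for path in paths:
        for node in path:
            graph.setdefault(node, set())
        for src, dst in zip(path, path[1:]):
            graph[src].add(dst)
    return graph
```

Because every level carries its own hash function, the sketch accommodates levels whose hash strategies differ, which is the situation the method is meant to handle.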
S105, for each level of network device in the network topology, obtaining a pre-analysis result of the network device.
It should be noted that the service path on the cloud falls into three cases. In the first case, the client and the server are both on the cloud, and the service path on the cloud is a complete service path, namely CVM (client) -> host machine -> cloud gateway(s) -> host machine -> CVM (server). In the second case, the client is on the cloud, and the service path on the cloud only contains CVM (client) -> host machine -> cloud gateway -> TOR -> LEAF -> SPINE -> other switches. In the third case, the server is on the cloud, and the service path on the cloud only contains CVM (server) -> host machine -> cloud gateway -> TOR -> LEAF -> SPINE -> other switches.
Based on the network topology map of the target service path obtained in step S104, the present application troubleshoots the network nodes of each level from front to back, starting from the initial network node to which the service traffic is forwarded. In the first case, the initial node of the traffic path on the cloud is the CVM (client) and the end node is the CVM (server). In the second case, the initial node is the CVM (client) and the end node is a cloud gateway and a related server. In the third case, the initial node is a related switch and a cloud gateway, and the end node is the CVM (server).
Optionally, in another embodiment of the present application, an implementation manner of step S105 includes:
For each level of network device in the network topology, when the number of fault logs in the network flow log of the network device reaches a threshold, the log of the forwarding path of each network device is recorded, and the logs of the forwarding paths of all the network devices are taken as the pre-analysis result of the network device.
It should be noted that the network flow log may be started in, but is not limited to, the following ways. A log start threshold may be set for cloud gateway system faults: it is determined whether the operating parameter of the current fault of the cloud gateway system exceeds the log start threshold, and when it does, the network flow log is started and the acquired network flow log is stored in a bypass local storage device. Alternatively, a service data flow that starts the network flow log may be set: when that service data flow is detected, the network flow log is started and the acquired network flow log is stored in the bypass local storage device. This is not limited herein.
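A threshold-triggered flow log of this kind might look like the following Python sketch. The `FlowLogTrigger` class and its fields are hypothetical, and an in-memory list stands in for the bypass local storage device.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlowLogTrigger:
    """Starts forwarding-path logging once fault logs reach a threshold."""
    threshold: int
    fault_count: int = 0
    recording: bool = False
    recorded: List[str] = field(default_factory=list)

    def observe(self, line: str, is_fault: bool) -> None:
        """Feed one network flow log line into the trigger."""
        if is_fault:
            self.fault_count += 1
        if not self.recording and self.fault_count >= self.threshold:
            self.recording = True
        if self.recording:
            # Stand-in for writing to the bypass local storage device.
            self.recorded.append(line)
```

With a threshold of 2, the first fault line only increments the counter; recording begins with the line that brings the count to 2 and continues for every later line.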
Optionally, in another embodiment of the present application, an implementation manner of step S105 includes:
For each level of network device in the network topology, a health check is performed on the network device, and the health check result is taken as the pre-analysis result of the network device.
It should be noted that the health check may cover the CPU usage, memory usage, I/O, disk, network card traffic, network card inbound and outbound packet counts, network card inbound and outbound error packets and the like of the gateway physical server, as well as the port status and routing related information of the switch, which is not limited herein.
It can be understood that a threshold can also be set for each health check item, and when the current parameter value exceeds the threshold, alarm information is sent to operation and maintenance personnel.
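As a hedged illustration of per-item thresholds, the following Python sketch compares collected metrics against limits and produces one alarm message per violation. The metric names and limit values are hypothetical; how the metrics are collected and how the alarm reaches operation and maintenance personnel are outside the sketch.

```python
from typing import Dict, List

# Hypothetical thresholds; real limits depend on the deployment.
THRESHOLDS: Dict[str, float] = {
    "cpu_percent": 90.0,
    "memory_percent": 90.0,
    "nic_error_packets": 100.0,
}


def health_check(metrics: Dict[str, float],
                 thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return one alarm message per metric whose value exceeds its threshold."""
    alarms = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alarms.append(f"{name}={value} exceeds threshold {limit}")
    return alarms
```

An empty list means every reported item is within its limit; metrics that were not collected are skipped rather than treated as failures.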
Optionally, in another embodiment of the present application, an implementation manner of step S105 includes:
For each level of network device in the network topology, a preset packet capture tool is used to capture packets on the network device to obtain a packet capture result, and the packet capture result is taken as the pre-analysis result of the network device.
It should be noted that packet capture detection means capturing packets with a tool according to a preset packet capture offset. Common packet capture tools include tcpdump, and the packet capture result is then analyzed and output with a tool such as Wireshark, so as to locate the faulty device and the cause of the fault.
tcpdump is a powerful network packet capture tool for Unix systems. It allows a user to intercept and display TCP/IP and other packets sent to or received from the network to which the computer is connected, and to completely intercept the headers of the packets transmitted over the network for analysis. It supports filtering by network layer, protocol, host, network or port, and provides logical operators such as and, or and not to help eliminate useless information. The -w option saves the captured data for further analysis with tools such as Wireshark.
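For illustration, a capture command of the kind described above could be assembled in Python as follows. The helper name `tcpdump_args`, the interface name and the file path are hypothetical. The `-i` (interface), `-w` (write to file) and `-c` (packet count) options and the `host`, `port` and `and` filter keywords are standard tcpdump usage. The sketch only builds the argument list; running it, for example with `subprocess.run`, typically requires elevated privileges.

```python
from typing import List, Optional


def tcpdump_args(interface: str, pcap_path: str, host: Optional[str] = None,
                 port: Optional[int] = None,
                 count: Optional[int] = None) -> List[str]:
    """Build a tcpdump argument list that writes raw packets to a pcap file."""
    args = ["tcpdump", "-i", interface, "-w", pcap_path]
    if count is not None:
        args += ["-c", str(count)]  # stop after this many packets
    filters: List[str] = []
    if host is not None:
        filters += ["host", host]
    if port is not None:
        if filters:
            filters.append("and")   # combine host and port conditions
        filters += ["port", str(port)]
    return args + filters
```

The resulting pcap file can then be opened in Wireshark for analysis.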
It can be seen from the foregoing embodiments that, after a fault occurs, an analysis mode is selected automatically according to the fault scenario and the network topology map of the target service path obtained in step S104, and processing and analysis are performed at the corresponding network nodes. For a fault in the data flow pipeline of a specific service, the network flow log detection mode can be started for the specific data flow; for a specific fault, the network flow log detection mode can be started for that fault; for other kinds of faults, the health check detection mode and the packet capture detection mode can be combined for comprehensive analysis. For example, the switch uses the flow log mode and the gateway device uses the health check and packet capture modes, so that each level of network device is assigned the analysis mode currently best able to determine the fault.
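The selection of analysis modes could be sketched as a simple lookup. The scenario labels, the device kinds and the function name `select_modes` are hypothetical; only the pairing of switches with flow logs and of gateway devices with health checks and packet capture follows the example in the text.

```python
from typing import Dict, List, Optional

# Illustrative mapping based on the example in the text: switches use flow
# logs, gateway devices use health checks plus packet capture.
DEFAULT_MODES: Dict[str, List[str]] = {
    "switch": ["flow_log"],
    "gateway": ["health_check", "packet_capture"],
}


def select_modes(device_kind: str,
                 fault_scenario: Optional[str] = None) -> List[str]:
    """Choose the analysis modes for one device given the fault scenario."""
    # Faults tied to a specific data flow or a specific fault start flow logs.
    if fault_scenario in ("specific_flow", "specific_fault"):
        return ["flow_log"]
    # Other faults combine health check and packet capture by default.
    return DEFAULT_MODES.get(device_kind, ["health_check", "packet_capture"])
```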
S106, performing fault location according to the pre-analysis results of all the network devices to obtain a fault location result.
Specifically, all the pre-analysis results obtained in step S105 are aggregated and analyzed together with the fault condition of each network node on the cloud service path, and the faulty network node is finally located.
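A minimal sketch of this aggregation step is given below. It assumes that each pre-analysis result is a list of findings per node (an empty list meaning no problem was found) and that the path is ordered from the initial node to the end node; the function name `locate_faults` is hypothetical.

```python
from typing import Dict, List


def locate_faults(path: List[str],
                  pre_analysis: Dict[str, List[str]]) -> List[str]:
    """Walk the path front to back and return the nodes whose pre-analysis
    result contains at least one finding, in path order."""
    return [node for node in path if pre_analysis.get(node)]
```

Returning the nodes in path order means the first element is the faulty node closest to the initial node, which matches the front-to-back troubleshooting order.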
Optionally, in another embodiment of the present application, after the fault location result is obtained, the fault location result may also be stored.
This makes it easy to quickly determine the on-cloud service paths of different users and the network devices at each level that forward their traffic, which facilitates subsequent troubleshooting and fault location. Meanwhile, each fault location result is stored and the network nodes involved in the various abnormal scenarios are counted; further recording and analysis yields the faulty network devices at each level that correspond to common fault scenarios, so that similar faults can be located promptly. Based on these conclusions, users and operation and maintenance personnel can more easily locate problems in traffic forwarding on the cloud service path, or optimize the topology of the cloud service path and the nodes in it.
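Storing results and counting the nodes involved in each abnormal scenario might be sketched as follows. The `FaultHistory` class, its method names and the scenario labels are hypothetical, and an in-memory counter stands in for whatever persistent store a deployment would use.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List


class FaultHistory:
    """Keeps per-scenario counts of the nodes found faulty in past results."""

    def __init__(self) -> None:
        self._by_scenario: DefaultDict[str, Counter] = defaultdict(Counter)

    def store(self, scenario: str, faulty_nodes: List[str]) -> None:
        """Record one fault location result under its scenario label."""
        self._by_scenario[scenario].update(faulty_nodes)

    def usual_suspects(self, scenario: str, n: int = 3) -> List[str]:
        """Return up to n nodes most often faulty in this scenario."""
        return [node for node, _ in self._by_scenario[scenario].most_common(n)]
```

When a similar fault recurs, the nodes returned by `usual_suspects` can be checked first.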
According to the scheme, the application provides a fault positioning method, which comprises the following steps: firstly, determining a hash strategy and a hash function of each level of network equipment under a target service path; then, determining all levels of network nodes for forwarding the target service path flow according to the hash strategies and hash functions of all the network equipment; determining a transmission track and an interaction track of target service path flow forwarding according to each level of network nodes to obtain network paths of each level of network nodes; generating a network topological graph of a target service path according to the network path; then, aiming at each level of network equipment in the network topology, obtaining a pre-analysis result of the network equipment; and finally, fault positioning is carried out according to the pre-analysis results of all the network equipment, and a fault positioning result is obtained. Therefore, the purpose of quickly and accurately positioning and analyzing the fault is achieved.
Another embodiment of the present application provides a fault location device, as shown in fig. 4, specifically including:
a determining unit 401, configured to determine a hash policy and a hash function of each level of network equipment under the target traffic path.
A network node determining unit 402, configured to determine, according to the hash policies and hash functions of all network devices, network nodes at different levels to which the target service path traffic is forwarded.
Optionally, in another embodiment of the present application, an implementation manner of the network node determining unit 402 includes:
An identification determining unit, configured to, for each network device, determine a first unit identifier of a first hash array according to a first hash function of the network device, and determine a second unit identifier of a second hash array according to a second hash function.
Each unit in the first hash array stores a second hash array; each unit in the second hash array serves as a pointer to a storage unit.
A hash array determining unit, configured to determine a target second hash array according to the first unit identifier and the second unit identifier.
A network equipment determining unit, configured to traverse all the storage units pointed to by the target second hash array and determine the first-level network devices to which the target service path traffic is distributed.
A network node determining subunit, configured to determine, according to all the first-level network devices, the network nodes at each level to which the target service path traffic is forwarded.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 3, which is not described herein again.
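The two-level hash array lookup performed by the subunits of the network node determining unit 402 can be sketched as follows. This is only one possible reading of the description: the hash functions (`zlib.crc32` and `zlib.adler32`), the flow-key strings, the array sizes, and all function names are assumptions chosen for illustration, and the application does not specify them.

```python
import zlib


def first_hash(flow_key, size):
    # Assumed first hash function: CRC-32 of the flow key, reduced to an index.
    return zlib.crc32(flow_key.encode("utf-8")) % size


def second_hash(flow_key, size):
    # Assumed second hash function: Adler-32 of the flow key.
    return zlib.adler32(flow_key.encode("utf-8")) % size


def build_first_array(first_size, second_size):
    # Each unit of the first array stores a second array; each unit of a
    # second array refers to a storage unit (here, a Python list).
    return [[[] for _ in range(second_size)] for _ in range(first_size)]


def register(first_array, flow_key, device):
    # Record that traffic identified by flow_key is distributed to device.
    second_array = first_array[first_hash(flow_key, len(first_array))]
    storage_unit = second_array[second_hash(flow_key, len(second_array))]
    storage_unit.append((flow_key, device))


def first_level_devices(first_array, flow_key):
    # The first unit identifier selects the target second array.
    target = first_array[first_hash(flow_key, len(first_array))]
    # The second unit identifier gives the unit at which traversal starts.
    start = second_hash(flow_key, len(target))
    devices = []
    # Traverse every storage unit referenced by the target second array.
    for offset in range(len(target)):
        for key, device in target[(start + offset) % len(target)]:
            if key == flow_key:
                devices.append(device)
    return devices


table = build_first_array(4, 8)
register(table, "10.0.0.1->10.0.0.9:443", "lb-1")
register(table, "10.0.0.1->10.0.0.9:443", "lb-2")
register(table, "10.0.0.2->10.0.0.9:443", "lb-3")
```

Calling `first_level_devices(table, "10.0.0.1->10.0.0.9:443")` then returns the devices registered for that flow, which would be the starting point for determining the network nodes at each subsequent level.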
A network path determining unit 403, configured to determine a transmission track and an interaction track of the target service path traffic forwarding according to the network nodes at each level, to obtain the network paths of the network nodes at each level.
A generating unit 404, configured to generate a network topology map of the target service path according to the network path.
The pre-analysis unit 405 is configured to obtain a pre-analysis result of the network device for each level of network devices in the network topology.
Optionally, in another embodiment of the present application, an implementation manner of the pre-analysis unit 405 includes:
A recording unit, configured to, for each level of network equipment in the network topology, record a log of the forwarding path of each network device when the number of fault logs in the network traffic log of the network device reaches a threshold, and take the logs of the forwarding paths of all the network devices as the pre-analysis result of the network equipment.
For specific working processes of the units disclosed in the above embodiments of the present application, reference may be made to the contents of the corresponding method embodiments, which are not described herein again.
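The threshold-based recording described for the recording unit might look like the sketch below. The log layout (a `flow_log` list of entries with a boolean `fault` field and a `forwarding_path_log` list per device), the function name, and the threshold value are assumptions for illustration only.

```python
def pre_analyze_by_logs(devices, threshold):
    """Return forwarding-path logs for devices whose fault count reaches threshold.

    devices maps a device name to a dict with two assumed keys:
    "flow_log" (list of entries, each with a boolean "fault" field) and
    "forwarding_path_log" (list of log lines for the forwarding path).
    """
    results = {}
    for name, info in devices.items():
        # Count the fault entries in this device's network traffic log.
        fault_count = sum(1 for entry in info["flow_log"] if entry["fault"])
        if fault_count >= threshold:
            # Record a copy of the forwarding-path log as the pre-analysis result.
            results[name] = list(info["forwarding_path_log"])
    return results


sample_devices = {
    "lb-1": {
        "flow_log": [{"fault": True}, {"fault": True}, {"fault": False}],
        "forwarding_path_log": ["lb-1 -> app-1"],
    },
    "lb-2": {
        "flow_log": [{"fault": False}, {"fault": True}],
        "forwarding_path_log": ["lb-2 -> app-1"],
    },
}
log_results = pre_analyze_by_logs(sample_devices, threshold=2)
```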
Optionally, in another embodiment of the present application, an implementation manner of the pre-analysis unit 405 includes:
A health check unit, configured to, for each level of network equipment in the network topology, perform a health check on the network equipment and take the health check result as the pre-analysis result of the network equipment.
For specific working processes of the units disclosed in the above embodiments of the present application, reference may be made to the contents of the corresponding method embodiments, which are not described herein again.
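A health check of this kind could be sketched as follows. The application does not say how the check is performed; a TCP connection attempt is assumed here purely as an example, and the names `tcp_probe` and `health_check` and the result layout are hypothetical. The probe is passed in as a parameter so that a different check can be substituted.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    # Assumed check: the device is healthy if a TCP connection can be opened.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def health_check(devices, probe=tcp_probe):
    """Run the probe against every device and collect the results.

    devices maps a device name to a (host, port) tuple.
    """
    return {
        name: {"healthy": probe(host, port)}
        for name, (host, port) in devices.items()
    }


# Example with a stand-in probe, so that no real network access is needed.
def fake_probe(host, port):
    return port == 80


health_results = health_check(
    {"lb-1": ("192.0.2.10", 80), "lb-2": ("192.0.2.11", 8080)},
    probe=fake_probe,
)
```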
Optionally, in another embodiment of the present application, an implementation manner of the pre-analysis unit 405 includes:
A packet capturing unit, configured to, for each level of network equipment in the network topology, capture packets of the network equipment by using a preset packet capturing tool to obtain a packet capturing result, and take the packet capturing result as the pre-analysis result of the network equipment.
For specific working processes of the units disclosed in the above embodiments of the present application, reference may be made to the contents of the corresponding method embodiments, which are not described herein again.
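The application does not name the preset packet capturing tool. Assuming, for illustration only, that `tcpdump` is used, the capture command for one device could be assembled as in the sketch below; the function name, interface name, address, and output file are hypothetical.

```python
def build_capture_command(interface, device_ip, packet_count, output_file):
    """Build a tcpdump argument list that captures traffic to or from one device."""
    return [
        "tcpdump",
        "-i", interface,          # capture on this interface
        "-n",                     # do not resolve addresses to names
        "-c", str(packet_count),  # stop after this many packets
        "-w", output_file,        # write raw packets to this file
        "host", device_ip,        # filter: traffic to or from the device
    ]


capture_command = build_capture_command("eth0", "192.0.2.10", 100, "lb-1.pcap")
```

The resulting list can be passed to `subprocess.run(capture_command, check=True)`; capturing packets usually requires elevated privileges, so the command is only built here and not executed.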
A fault positioning unit 406, configured to perform fault positioning according to the pre-analysis results of all the network devices to obtain a fault positioning result.
For a specific working process of the unit disclosed in the above embodiment of the present application, reference may be made to the content of the corresponding method embodiment, as shown in fig. 1, which is not described herein again.
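How the pre-analysis results are combined into a fault positioning result is not detailed in this section. One simple possibility, shown here as a sketch under that assumption, is to walk each network path in forwarding order and report the first device whose pre-analysis result marks it as unhealthy. The `healthy` field, the function name, and the device names are hypothetical.

```python
def locate_faults(paths, pre_analysis):
    """Return the first unhealthy device found on each path, without duplicates.

    paths is a list of device-name lists in forwarding order; pre_analysis
    maps a device name to a dict with an assumed boolean "healthy" field.
    Devices without a pre-analysis result are treated as healthy.
    """
    suspects = []
    for path in paths:
        for device in path:
            healthy = pre_analysis.get(device, {}).get("healthy", True)
            if not healthy:
                if device not in suspects:
                    suspects.append(device)
                # Devices further along this path are not inspected.
                break
    return suspects


fault_paths = [
    ["client", "lb-1", "app-1"],
    ["client", "lb-2", "app-1"],
]
fault_results = locate_faults(
    fault_paths,
    {"lb-2": {"healthy": False}, "app-1": {"healthy": False}},
)
```

Stopping at the first unhealthy device on a path reflects the assumption that a fault upstream can explain symptoms observed downstream; other combination rules are equally possible.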
Optionally, in another embodiment of the present application, an implementation manner of the fault location device further includes:
A storage unit, configured to store the fault positioning result.
For specific working processes of the units disclosed in the above embodiments of the present application, reference may be made to the contents of the corresponding method embodiments, which are not described herein again.
According to the above scheme, the present application provides a fault location device: first, the determining unit 401 determines a hash policy and a hash function of each level of network equipment under a target service path; then, the network node determining unit 402 determines the network nodes at each level to which the target service path traffic is forwarded according to the hash policies and hash functions of all the network devices; the network path determining unit 403 determines a transmission track and an interaction track of the target service path traffic forwarding according to the network nodes at each level, to obtain the network paths of the network nodes at each level; the generating unit 404 generates a network topology graph of the target service path according to the network paths; then, the pre-analysis unit 405 obtains, for each level of network equipment in the network topology, a pre-analysis result of the network equipment; finally, the fault positioning unit 406 performs fault positioning according to the pre-analysis results of all the network devices to obtain a fault positioning result. Faults can thus be located and analyzed quickly and accurately.
Another embodiment of the present application provides a server, as shown in fig. 5, including:
one or more processors 501.
A storage device 502 on which one or more programs are stored.
The one or more programs, when executed by the one or more processors 501, cause the one or more processors 501 to implement a method of locating a fault as described in any of the above embodiments.
Another embodiment of the present application provides a computer storage medium, on which a computer program is stored, wherein when being executed by a processor, the computer program implements the method for locating a fault as described in any one of the above embodiments.
In the above embodiments disclosed in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present disclosure may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part. The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a live broadcast device, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other various media capable of storing program code.
Those skilled in the art can make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method for locating a fault, comprising:
determining a hash strategy and a hash function of each level of network equipment under a target service path;
for each network device, determining a first unit identifier of a first hash array according to a first hash function of the network device, and determining a second unit identifier of a second hash array according to a second hash function; wherein each cell in the first hash array is used to store the second hash array; each unit in the second hash array is used as a pointer to point to a storage unit;
determining a target second hash array according to the first unit identifier and the second unit identifier;
traversing all the storage units pointed by the target second hash array, and determining first-level network equipment distributed by the target service path flow;
determining each level of network nodes forwarded by the target service path flow according to all the first level network equipment;
determining a transmission track and an interaction track forwarded by the target service path flow according to the network nodes at all levels to obtain network paths of the network nodes at all levels;
generating a network topological graph of the target service path according to the network path;
aiming at each level of network equipment in the network topology, obtaining a pre-analysis result of the network equipment;
and carrying out fault positioning according to the pre-analysis results of all the network equipment to obtain a fault positioning result.
2. The method according to claim 1, wherein the obtaining, for each level of network device in the network topology, a pre-analysis result of the network device comprises:
and for each level of network equipment in the network topology, when the number of fault logs in the network flow logs of the network equipment reaches a threshold value, recording the logs of the forwarding paths of each piece of network equipment, and taking the logs of the forwarding paths of all pieces of network equipment as the pre-analysis result of the network equipment.
3. The method according to claim 1, wherein the obtaining, for each level of network device in the network topology, a pre-analysis result of the network device comprises:
and aiming at each level of network equipment in the network topology, carrying out health check on the network equipment, and taking a health check result as a pre-analysis result of the network equipment.
4. The method according to claim 1, wherein the obtaining, for each level of network device in the network topology, a pre-analysis result of the network device comprises:
and capturing the packet of each level of network equipment in the network topology by using a preset packet capturing tool to obtain a packet capturing result, and taking the packet capturing result as a pre-analysis result of the network equipment.
5. The method according to claim 1, wherein the fault location is performed according to the pre-analysis results of all the network devices, and after obtaining the fault location result, the method further comprises:
and storing the fault positioning result.
6. A fault locating device, comprising:
the determining unit is used for determining the hash strategy and the hash function of each level of network equipment under the target service path;
the identification determining unit is used for determining a first unit identification of the first hash array according to a first hash function of each network device and determining a second unit identification of the second hash array according to a second hash function; wherein each cell in the first hash array is used to store the second hash array; each unit in the second hash array is used as a pointer to point to a storage unit;
the hash array determining unit is used for determining a target second hash array according to the first unit identifier and the second unit identifier;
the network equipment determining unit is used for traversing all the storage units pointed by the target second hash array and determining first-level network equipment distributed by the target service path flow;
a network node determining subunit, configured to determine, according to all the first-level network devices, network nodes of each level to which the target service path traffic is forwarded;
the network path determining unit is used for determining a transmission track and an interaction track forwarded by the target service path flow according to the network nodes at all levels to obtain the network paths of the network nodes at all levels;
a generating unit, configured to generate a network topology map of the target service path according to the network path;
the pre-analysis unit is used for obtaining a pre-analysis result of the network equipment for each level of the network equipment in the network topology;
and the fault positioning unit is used for positioning the fault according to the pre-analysis results of all the network equipment to obtain a fault positioning result.
7. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of locating a fault of any of claims 1 to 5.
8. A computer storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements a method of locating a fault as claimed in any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110850027.XA CN113595783B (en) 2021-07-27 2021-07-27 Fault positioning method, device, server and computer storage medium

Publications (2)

Publication Number Publication Date
CN113595783A CN113595783A (en) 2021-11-02
CN113595783B true CN113595783B (en) 2022-12-13

Family

ID=78250338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110850027.XA Active CN113595783B (en) 2021-07-27 2021-07-27 Fault positioning method, device, server and computer storage medium

Country Status (1)

Country Link
CN (1) CN113595783B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114422337A (en) * 2022-03-09 2022-04-29 中国建设银行股份有限公司 Method and related device for network packet capturing and fault positioning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102025538A (en) * 2010-12-03 2011-04-20 中兴通讯股份有限公司 Method and device for realizing multicasting flow load sharing based on equal-cost multi-path (ECMP) routing
CN104137517A (en) * 2012-02-27 2014-11-05 瑞典爱立信有限公司 Peer, application and method for detecting faulty peer in peer-to-peer network
CN112073234A (en) * 2020-09-02 2020-12-11 腾讯科技(深圳)有限公司 Fault detection method, device, system, equipment and storage medium

Also Published As

Publication number Publication date
CN113595783A (en) 2021-11-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant