CN109802855B - Fault positioning method and device - Google Patents


Info

Publication number
CN109802855B
Authority
CN
China
Prior art keywords
network
node
test
fault
link
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811625982.8A
Other languages
Chinese (zh)
Other versions
CN109802855A (en)
Inventor
鄢国平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201811625982.8A
Publication of CN109802855A
Application granted
Publication of CN109802855B

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

After obtaining first information that identifies the originating node of a faulty link and second information that identifies the terminating node of the faulty link, a fault location device determines m dial test objects (m is greater than or equal to 3) and all nodes in the faulty link according to the first information, the second information, and a pre-generated logical network topology. The fault location device then performs dial tests on the test links formed by the determined dial test objects, determines a fault value for each node in the faulty link according to the dial test results, and from those fault values determines the node in the faulty link that has failed. Compared with the prior art, determining the failed node is simpler and faster, which effectively improves system performance.

Description

Fault positioning method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a fault location method and apparatus.
Background
Network Function Virtualization (NFV) technology transforms the functions of each network element used in a telecommunication network into independent applications, which can be flexibly deployed on a unified infrastructure platform constructed from standard servers, storage, switches, and other devices.
For a fault of a communication link in the NFV system, an Internet Protocol (IP) dyeing diagnosis method is often used to locate the fault. Specifically, a plurality of data packets that a Virtual Machine (VM) in the NFV system sends along the same transmission path are dyed (marked); different nodes on the transmission path detect and count the dyed data packets; and the location of the fault is determined according to the differences between the counts observed at the different nodes.
The method utilizes the real service flow to carry out statistics, and can fully reflect the condition of the real service. However, the above method requires time synchronization and complex operation, and requires a large number of dyeing operations, dyeing identification and statistical operations, resulting in low system performance.
Disclosure of Invention
The application provides a fault positioning method and device, which are used for solving the problems of complex operation and low system performance.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
In a first aspect, a fault location method is provided, which is applied to an NFV system. Specifically, after acquiring first information identifying an originating node of a faulty link and second information identifying a terminating node of the faulty link, a fault location device determines m dial test objects (m is an integer greater than or equal to 3) and all nodes in the faulty link according to the first information, the second information, and a pre-generated logical network topology. The fault location device then determines n test links (n is an integer greater than or equal to 2) according to the m dial test objects, where a test link is a transmission link between any two of the m dial test objects that are in the same network, and performs a dial test on each of the n test links. The fault location device determines a fault value of each node in the faulty link according to the dial test results, and then determines, according to the fault value of each node, the node in the faulty link that has failed. The originating node and the terminating node are both target nodes in the NFV system; a target node is a port of a virtual machine or a virtual network card; each dial test object is a target node; the m dial test objects include at least the originating node and the terminating node; and the larger the fault value, the higher the probability that the node has failed.
In summary, the fault location device determines the dial testing objects, and performs dial testing on the testing links formed between the dial testing objects, so as to determine the fault value of each node in the fault link according to the dial testing result, and further determine the node with the fault in the fault link. Compared with the prior art, the fault locating device of the embodiment of the application has the advantages that the operation is simple, the speed is higher, and the system performance is effectively improved.
In a possible implementation manner of the application, the method for determining the m dial test objects by the fault location device according to the first information, the second information, and the pre-generated logical network topology includes the following. The fault location device determines a switching nodes (a is an integer greater than or equal to 0; a switching node is a physical switch or a router) according to the first information, the second information, and the pre-generated logical network topology. When a is equal to 0, the originating node and the terminating node are located in the same physical machine; the fault location device selects m target nodes from the target nodes configured on that physical machine and determines the m target nodes as the m dial test objects. When a is greater than or equal to 1, the fault location device performs a first process on each of the a switching nodes to determine m target nodes, and determines the m target nodes as the m dial test objects. Specifically, the first process is that b physical machines are selected from the physical machines connected with the switching nodes, if b is greater than or equal to 2, the physical machine includes a physical machine, and if a is configured with a physical machine is equal to m target nodes, the physical machine is equal to 35c, and if a is greater than or equal to 2, the physical machine includes a physical machine, and if b is configured with a physical machine is an integer greater than or equal to an integer greater than an integer equal to an integer of the physical machine.
Besides the originating node and the terminating node, the fault positioning device also acquires other target nodes, and takes the acquired other target nodes as dial testing objects. In this way, the fault location device can use the selected other target nodes as reference, and further determine the node with the fault in the fault link.
In another possible implementation manner of the present application, the method for determining the n test links by the fault location device according to the m dial test objects includes the following. The fault location device determines that i of the m dial test objects are located in a first network and the other m-i dial test objects are located in a second network, where 1 ≤ i ≤ m, and either the network segment of the first network differs from the network segment of the second network or the routing protocol of the first network differs from the routing protocol of the second network. The fault location device then determines i × (i-1)/2 test links in the first network and (m-i) × (m-i-1)/2 test links in the second network, and also determines the number of inter-network links, which are links between a gateway of the first network and a gateway of the second network, where n ≤ i × (i-1)/2 + (m-i) × (m-i-1)/2.
When the nodes in the faulty link are located in a plurality of networks, the fault location device determines the transmission links between the dial test objects within each network as the test links of that network, which avoids communication between nodes in different networks and speeds up the subsequent dial tests on the test links.
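The per-network link counts can be illustrated with a minimal Python sketch. It enumerates only the intra-network test links, that is, every unordered pair of dial test objects that share a network. The network names, node names, and the helper `intra_network_test_links` are hypothetical and chosen for illustration; the inter-network (gateway to gateway) links are not modeled here.

```python
from itertools import combinations


def intra_network_test_links(objects_by_network):
    """Return every unordered pair of dial test objects that share a network."""
    links = []
    for objects in objects_by_network.values():
        # combinations() yields k * (k - 1) / 2 pairs for k objects.
        links.extend(combinations(sorted(objects), 2))
    return links


# Hypothetical example: m = 5 dial test objects, i = 3 of them in the first network.
objects_by_network = {
    "net1": ["vnic1", "vnic2", "vnic3"],
    "net2": ["vnic4", "vnic5"],
}
links = intra_network_test_links(objects_by_network)
print(len(links))  # 3*2/2 + 2*1/2 = 4
```

No pair in the result spans two networks, which matches the stated goal of avoiding communication between nodes in different networks during the dial test.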
In another possible implementation manner of the present application, the method for the fault location device to perform the dial test on each test link of the n test links includes that the fault location device performs the dial test on each test link of i × (i-1)/2 test links in the first network, performs the dial test on each test link of (m-i) × (m-i-1)/2 test links in the second network, and performs the dial test on inter-network links.
The fault location device dial-tests each test link within the network where its dial test objects are located, which avoids communication between nodes in different networks and speeds up the dial tests on the test links.
In another possible implementation manner of the present application, the method for determining the fault value of each node in the faulty link by the fault location device according to the dial test results includes: the fault location device performs a second process for each node to determine the fault value of the node. Specifically, the second process is as follows: for each of the n test links, the fault location device judges whether the node is on the test link; if the node is on the test link and the dial test of the test link fails, or if the node is not on the test link and the dial test of the test link succeeds, the fault value of the node is updated to the stored fault value of the node plus 1. The initial fault value of each node is 0.
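The second process can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the node names, the two-machine layout, and the helpers `fault_values` and `most_suspect` are assumptions made for the example, and the rule for picking the failed node (highest fault value) is inferred from the statement that a larger fault value means a higher probability of failure.

```python
def fault_values(faulty_link_nodes, test_results):
    """test_results: list of (nodes_on_test_link, dial_test_succeeded) pairs."""
    values = {node: 0 for node in faulty_link_nodes}  # initial fault value is 0
    for nodes_on_link, succeeded in test_results:
        for node in faulty_link_nodes:
            on_link = node in nodes_on_link
            # +1 when the node is on a failed test link,
            # or is absent from a successful test link.
            if (on_link and not succeeded) or (not on_link and succeeded):
                values[node] += 1
    return values


def most_suspect(values):
    """Assumed selection rule: the node(s) with the highest fault value."""
    top = max(values.values())
    return sorted(node for node, value in values.items() if value == top)


# Hypothetical layout: vnic1 and vnic3 on one machine (vsw1, nic1),
# vnic2 and vnic4 on another (vsw2, nic2), joined by switch "sw".
faulty_link = ["vnic1", "vsw1", "nic1", "sw", "nic2", "vsw2", "vnic2"]
cross = {"vsw1", "nic1", "sw", "nic2", "vsw2"}  # shared by every cross-machine path
results = [
    (cross | {"vnic1", "vnic2"}, False),
    ({"vnic1", "vsw1", "vnic3"}, True),
    (cross | {"vnic1", "vnic4"}, False),
    (cross | {"vnic2", "vnic3"}, False),
    ({"vnic2", "vsw2", "vnic4"}, False),
    (cross | {"vnic3", "vnic4"}, False),
]
values = fault_values(faulty_link, results)
print(most_suspect(values))  # ['vsw2']
```

In this made-up scenario every test link that crosses vsw2 fails and the only link that avoids it succeeds, so vsw2 accumulates the highest fault value.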
In another possible implementation manner of the present application, the method for generating the logical network topology by the fault location device in advance comprises: the fault positioning device acquires the characteristic information of each network node in the NFV system, and generates a logic network topology according to the acquired characteristic information of the network nodes and the connection relation between different network nodes. The characteristic information includes at least one of a network address, a name and an identifier, and the network node is a physical switch, a router, a physical network card, a virtual switch, a virtual network card or a virtual machine. All nodes of the failed link in the present application belong to network nodes in the NFV system.
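As a rough illustration, a logical network topology can be held as an undirected adjacency map built from the connection relations between network nodes. The Python sketch below also derives the nodes of a link between an originating node and a terminating node with a breadth-first search; the source does not state how the nodes of the faulty link are read from the topology, so the shortest-path search, the node names, and the helpers `build_topology` and `nodes_on_link` are assumptions.

```python
from collections import deque


def build_topology(connections):
    """Build an undirected adjacency map from (node, node) connection pairs."""
    adjacency = {}
    for a, b in connections:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def nodes_on_link(adjacency, origin, terminal):
    """Breadth-first search; returns the nodes from origin to terminal, or None."""
    previous = {origin: None}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node == terminal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical connections: virtual NICs, virtual switches, physical NICs, one switch.
topology = build_topology([
    ("vnic1", "vsw1"), ("vnic3", "vsw1"), ("vsw1", "nic1"),
    ("nic1", "sw"), ("sw", "nic2"),
    ("nic2", "vsw2"), ("vsw2", "vnic2"),
])
path = nodes_on_link(topology, "vnic1", "vnic2")
print(path)  # ['vnic1', 'vsw1', 'nic1', 'sw', 'nic2', 'vsw2', 'vnic2']
```

Characteristic information such as a network address or name could be stored alongside each node key; it is omitted here to keep the sketch short.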
In another possible implementation manner of the present application, the fault location device further updates the logical network topology after generating the logical network topology.
In a second aspect, a fault location device is provided, which is capable of implementing the functions of the first aspect and any one of its possible implementations. These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.
In a possible manner of this application, the fault location apparatus may include an obtaining unit, a determining unit, and a dial test unit, which may perform the corresponding functions in the fault location method according to the first aspect and any one of its possible implementation manners. For example: the obtaining unit is configured to obtain first information and second information, where the first information identifies an originating node of a faulty link, the second information identifies a terminating node of the faulty link, the originating node and the terminating node are both target nodes in an NFV system, and the target nodes are virtual machines or ports of virtual network cards; the determining unit is configured to determine m dial test objects and all nodes in the faulty link according to the first information and the second information obtained by the obtaining unit and a pre-generated logical network topology, where each dial test object is a target node, the m dial test objects include at least the originating node and the terminating node, and m is an integer greater than or equal to 3, and is further configured to determine n test links according to the m dial test objects, where n is an integer greater than or equal to 2 and a test link is a transmission link between any two of the m dial test objects that are in the same network; the dial test unit is configured to perform a dial test on each of the n test links determined by the determining unit; and the determining unit is further configured to determine a fault value of each node in the faulty link according to the dial test results of the dial test unit, where the higher the fault value, the higher the probability that the node has failed, and to determine, according to the fault value of each node, the node in the faulty link that has failed.
In a third aspect, a fault locating device is provided, the fault locating device comprising a processor and a memory; the memory is configured to store computer executable instructions, and when the fault location apparatus runs, the processor executes the computer executable instructions stored in the memory to implement the fault location method according to the first aspect and any one of the possible implementation manners of the first aspect.
Further optionally, the fault location device may further include a display for displaying, under the control of the processor of the fault location device, the node in the faulty link at which the fault occurs.
The fault location device may be any device in the NFV system, or may be a part of a device in the NFV system, for example, a chip system in the device. The chip system is configured to support the device in implementing the functions referred to in the first aspect and any one of its possible implementations, for example, processing data and/or information referred to in the above-mentioned fault location method. The chip system includes a chip and may also include other discrete devices or circuit structures.
In a fourth aspect, a computer-readable storage medium having computer instructions stored therein is also provided; when the instructions are run on the fault locating device, the fault locating device is caused to perform the fault locating method as described above in the first aspect and its various possible implementations.
In a fifth aspect, there is also provided a computer program product comprising computer instructions which, when executed by a processor of a fault location device, cause the fault location device to perform the fault location method as described in the first aspect and its various possible implementations.
It should be noted that all or part of the above instructions may be stored on the first computer storage medium, where the first computer storage medium may be packaged together with the processor of the fault location device, or may be packaged separately from the processor of the fault location device, and this application is not limited in this respect.
For a detailed description of the second, third, fourth, and fifth aspects and their various implementations in this application, reference may be made to the detailed description of the first aspect and its various implementations; likewise, for the beneficial effects of the second, third, fourth, and fifth aspects and their various implementations, reference may be made to the analysis of the beneficial effects of the first aspect and its various implementations, which is not repeated here.
In the present application, the names of the above-mentioned fault locating means do not limit the devices or functional modules themselves, which may appear under other names in practical implementations. Insofar as the functions of the respective devices or functional modules are similar to those of the present application, they fall within the scope of the claims of the present application and their equivalents.
These and other aspects of the present application will be more readily apparent from the following description.
Drawings
FIG. 1 is a schematic diagram of an architecture of the NFV system;
FIG. 2 is a schematic diagram of another architecture of the NFV system;
FIG. 3 is a schematic diagram of a physical machine in the NFV system;
FIG. 4 is a schematic hardware structure diagram of a fault location device according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of a fault location method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a logical network topology in an embodiment of the present application;
FIG. 7 is a schematic diagram of a failed link in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a fault location device according to an embodiment of the present application.
Detailed Description
In the embodiments of the present application, words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, "a plurality" means two or more unless otherwise specified.
In order to meet future competition and challenges and to avoid being reduced to mere pipes, operators proposed NFV technology, following the development trend of virtualization, cloud computing, and related technologies.
NFV technology converts each network element used in a telecommunication network into an independent application that can be flexibly deployed on a unified infrastructure platform built from standard servers, storage, switches, and other devices. Virtualization technology pools and virtualizes the infrastructure hardware resources and provides virtual resources to upper-layer applications, which decouples applications from hardware. Each application can therefore quickly add virtual resources to expand system capacity, or quickly release virtual resources to shrink system capacity, which greatly improves the elasticity of the network.
The foundation of NFV technology includes cloud computing technology and virtualization technology. By the virtualization technology, hardware equipment such as general computing/storage/network and the like can be decomposed into various virtual resources for use by various upper-layer applications, decoupling between the applications and the hardware can be realized, and the virtual resource supply speed is greatly increased. By the cloud computing technology, elastic expansion of application can be achieved, matching of virtual resources and service loads is achieved, utilization efficiency of the virtual resources is improved, and response rate of a system is improved.
The NFV system may be used in various networks, for example implemented in a data center network, operator network, or local area network.
Fig. 1 is a schematic diagram of an NFV system. As shown in fig. 1, the NFV system includes an NFV management and orchestration system (NFV MANO) 101, an NFV infrastructure layer (NFVI) 102, a plurality of Virtual Network Functions (VNF) 103, a plurality of Element Managers (EM) 104, a network service, VNF and infrastructure description 105, and an operations support system/business support system (OSS/BSS) 106.
NFV MANO 101 is used to perform monitoring and management of NFVI 102 and VNF 103.
The NFV MANO 101 includes an NFV orchestrator (NFVO) 1011, one or more VNF managers (VNFM) 1012, and a Virtualized Infrastructure Manager (VIM) 1013.
The NFVO 1011 may implement a network service on the NFVI 102, and may also execute a resource-related request from one or more VNFMs 1012, send configuration information to the VNFM 1012, and collect status information of the VNF 103. Additionally, NFVO 1011 may communicate with VIM 1013 to enable allocation and/or reservation of resources, as well as exchange configuration and status information for virtualized hardware resources.
VNFM 1012 may manage one or more VNFs 103. VNFM 1012 may perform various management functions such as instantiating, updating, querying, scaling, and/or terminating VNF 103, etc.
The VIM 1013 may perform functions for resource management, such as managing allocation of infrastructure resources (e.g., adding resources to virtual containers) and operational functions (e.g., collecting NFVI fault information).
VNFM 1012 and VIM 1013 may communicate with each other for resource allocation and exchange configuration and status information for virtualized hardware resources.
NFVI 102 includes a hardware resource layer, a virtualization layer, and a virtual resource layer. The hardware resources and/or software resources in NFVI 102 complete the deployment of the virtualized environment. In other words, the hardware resources and the virtualization layer are used to provide virtualized resources for VNF 103, for example in the form of virtual machines and other virtual containers.
The hardware resource layer includes computing hardware 1021, storage hardware 1022, and networking hardware 1023.
The computing hardware 1021 may be, among other things, commercially available hardware and/or custom hardware for providing processing and computing resources. The storage hardware 1022 may be storage capacity provided within a network or storage capacity resident on the storage hardware 1022 itself (local storage located within a server). In one implementation, the resources of computing hardware 1021 and storage hardware 1022 may be pooled together. Network hardware 1023 may be a switch, router, and/or any other network device configured with switching functionality. Network hardware 1023 may span multiple domains and may include multiple networks interconnected by one or more transport networks. A virtualization layer inside NFVI 102 may abstract hardware resources from the physical layer and decouple VNF 103 in order to provide virtualized resources to VNF 103.
The virtual resource layer includes virtual compute 1024, virtual storage 1025, and virtual network 1026. Virtual compute 1024 and virtual storage 1025 may be provided to VNF 103 in the form of virtual machines and/or other virtual containers. For example, one or more VNFs 103 may be deployed on one Virtual Machine (VM). The virtualization layer abstracts the network hardware 1023 to form the virtual network 1026, which may include virtual switches that provide connectivity between virtual machines. In addition, the transport network in the network hardware 1023 can be virtualized using a centralized control plane and a separate forwarding plane (e.g., a software defined network).
EM 104 performs traditional fault, configuration, accounting, performance, and security (FCAPS) management functions for VNF 103.
The OSS/BSS106 refers to the operator's existing operation and maintenance system OSS/BSS.
In hardware, the NFV system includes at least one physical machine, each of which may provide various types of hardware resources such as computing hardware, storage hardware, or network hardware. The virtualization layer pools the computing, storage, and network resources of a large number of physical machines and provides VMs for the application VNFs to use. The physical machine may be any kind of computer, such as a server.
Fig. 2 is another architecture diagram of the NFV system. As shown in fig. 2, the NFV system may include at least one switch (or router) 21, and at least one physical machine 22 connected to each switch (or router) 21, any two switches/routers 21 being connected through a network.
It is understood that the NFV system shown in fig. 2 is only one example. In practical implementation, a plurality of physical machines connected to the same switch/router may communicate directly or through other devices, which is not limited in this application.
At least one VM is configured in the physical machine 22, each VM accesses a network through at least one Virtual Network Interface Card (VNIC), different VNICs are connected through a virtual switch, and the virtual switch provides connection between the VNIC and a physical Network Interface Card (NIC) in the physical machine. The VM, VNIC and virtual switch are all in the virtual resource layer in fig. 1, and the physical network card is in the hardware resource layer in fig. 1.
In connection with FIG. 2, FIG. 3 shows a physical machine 22 configured with three VNICs: VNIC 1, VNIC 2, and VNIC 3. Each VNIC supports one VM, and all three are connected to one virtual switch, which is also connected to the NIC in physical machine 22.
In conjunction with fig. 2 and fig. 3, it can be seen that a service flow between any two VMs located on different physical machines needs to be transmitted through the VMs, the virtual switches, and the switches (or routers). During transmission, a communication link may fail, which affects the normal operation of the service.
Currently, the IP dyeing diagnosis method is often employed to locate communication link faults in NFV systems. The method gathers statistics on real service flows and can therefore fully reflect the condition of the real service. However, it requires time synchronization and complicated operation, and involves a large number of dyeing, dye identification, and statistical operations, resulting in low system performance.
In order to solve the above problem, an embodiment of the present application provides a fault location method. A fault location device generates a logical network topology of an NFV system in advance. After a faulty link occurs, the fault location device determines, according to the logical network topology, m dial test objects (m is greater than or equal to 3; a dial test object is a port of a virtual machine or a virtual network card in the NFV system) that include at least the originating node and the terminating node of the faulty link. The fault location device then performs dial tests on the n test links formed by the m dial test objects, determines a fault value of each node in the faulty link according to the dial test result of each test link (the fault value indicates the probability that a node has failed), and determines the failed node according to the fault value of each node. Compared with the prior art, the fault location device of the embodiment of the application determines the failed node faster, which effectively improves system performance.
In addition, the dial test operation of the fault location device can be implemented with a simple ping operation, which is highly general and widens the applicability of the fault location method provided by the embodiment of the application.
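A ping-based dial test might look like the Python sketch below. It is an illustration under stated assumptions: it assumes a Linux host with the iputils `ping` command (`-c` for the packet count, `-W` for the per-reply timeout in seconds), it pings from the machine running the script rather than from the source dial test object, and the helper names `ping_command` and `dial_test` are hypothetical. The `run` parameter exists only so that the command runner can be replaced with a stub.

```python
import subprocess


def ping_command(target_ip, count=3, timeout_s=2):
    """Build a Linux (iputils) ping command line; the flags are an assumption."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), target_ip]


def dial_test(target_ip, run=subprocess.run):
    """Return True when ping exits with status 0, i.e. a reply was received."""
    result = run(
        ping_command(target_ip),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class _StubResult:
    """Stand-in for subprocess.CompletedProcess, used to avoid real network traffic."""

    def __init__(self, returncode):
        self.returncode = returncode


# Usage with stubbed runners; 192.0.2.1 is a documentation-only address.
reachable = dial_test("192.0.2.1", run=lambda *args, **kwargs: _StubResult(0))
unreachable = dial_test("192.0.2.1", run=lambda *args, **kwargs: _StubResult(1))
print(reachable, unreachable)  # True False
```

Calling `dial_test(ip)` without the `run` argument executes the real `ping` command.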
The fault positioning method provided by the embodiment of the application is suitable for an NFV system. The structure of the NFV system can refer to fig. 1 or fig. 2.
The fault locating device may be any one of the devices in fig. 1 or fig. 2, and may also be an independently arranged device for implementing the fault locating method provided in the embodiment of the present application, which is not specifically limited in this embodiment of the present application.
In a specific implementation, the fault locating device has the components shown in fig. 4. Fig. 4 is a schematic diagram of a fault location device according to an embodiment of the present disclosure, and as shown in fig. 4, the fault location device may include at least one processor 41, a memory 42, a communication interface 43, and a communication bus 44. The following describes the components of the fault locating device in detail with reference to fig. 4:
The processor 41 is the control center of the fault location device, and may be a single processor or a combination of multiple processing elements. For example, the processor 41 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more Digital Signal Processors (DSPs) or one or more Field Programmable Gate Arrays (FPGAs).
Processor 41 may perform various functions of the fault locating device by, among other things, running or executing software programs stored in memory 42 and invoking data stored in memory 42.
In particular implementations, processor 41 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 4, as one embodiment.
In particular implementations, as an example, the fault locating device may include a plurality of processors, such as processor 41 and processor 45 shown in fig. 4. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 42 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 42 may be self-contained and coupled to the processor 41 via a communication bus 44. The memory 42 may also be integrated with the processor 41.
The memory 42 is used for storing software programs for executing the scheme of the application, and is controlled by the processor 41 to execute.
The communication interface 43 uses any transceiver-like device to communicate with other devices or communication networks, such as an Ethernet, a Radio Access Network (RAN), or a wireless local area network (WLAN). The communication interface 43 may include a receiving unit to implement a receiving function and a transmitting unit to implement a transmitting function.
The communication bus 44 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 4, but this does not indicate only one bus or one type of bus.
In a specific implementation, as an alternative implementation, the fault locating apparatus may further include an output device 46 and an input device 47.
Alternatively, the output device 46 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a Cathode Ray Tube (CRT) display device, a projector, or the like.
The input device 47 may collect information input at the fault locating device by a user of the fault locating device and send the collected input information to other devices (e.g., the processor 41). Alternatively, the input device 47 may be a mouse, keyboard, touch pad or sensing device, etc.
Illustratively, if the input device 47 is a touch pad, the touch pad may capture a touch event on or near the touch pad (e.g., a user operating on or near the touch pad using a finger, a stylus, or any other suitable object) and transmit the captured touch information to another device (e.g., the processor 41). A touch event of a user near the touch pad may be called a hover touch; hover touch means that the user does not need to directly contact the touch pad in order to select, move, or drag a target (e.g., an icon), but only needs to be in proximity to the device in order to perform the desired function. The touch pad can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
Since the output device 46 and the input device 47 are optional components, they are indicated by dashed boxes in fig. 4.
It is noted that the equipment structure shown in fig. 4 does not constitute a limitation of the fault locating device, which may comprise more or less components than those shown in fig. 4, or a combination of certain components, or a different arrangement of components, in addition to those shown in fig. 4.
The following describes a fault location method provided by the embodiment of the present application with reference to the NFV system shown in fig. 1 or fig. 2, the physical machine shown in fig. 3, and the fault location device shown in fig. 4.
Fig. 5 is a flowchart of a fault location method according to an embodiment of the present application. As shown in fig. 5, the fault location method may include:
S500, the fault positioning device generates a logical network topology.
The fault positioning device acquires the characteristic information of each network node in the NFV system, and generates a logic network topology according to the acquired characteristic information of the network nodes and the connection relation between different network nodes.
The characteristic information of the network node comprises at least one of a network address, a name and an identification; the network node is a physical switch, a router, a physical network card, a virtual switch, a virtual network card or a virtual machine.
Optionally, the fault location device obtains the feature information of network nodes such as VMs, virtual network cards, virtual switches, physical network cards, switches, and routers in the NFV system, together with the network topology information between these network nodes. The fault location device then generates a Directed Acyclic Graph (DAG) network topology, described by data in JavaScript Object Notation (JSON) format, according to the connection relationship of the VM, the virtual network card, the virtual switch, the physical network card, and the physical switch/router in sequence. The DAG network topology is the logical network topology of the NFV system.
Of course, in practical application, the fault location device may also generate a logical network topology according to the connection relationship of the switch/router, the physical network card, the virtual switch, the virtual network card, and the VM in sequence, which is not specifically limited in this embodiment of the present application.
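As a minimal sketch of what such a JSON-described DAG might look like, the following Python snippet builds a small topology in the VM, virtual network card, virtual switch, physical network card, switch/router order and serializes it. The field names (nodes, edges, type) and the node identifiers are assumptions made for illustration; the embodiment does not specify a schema.

```python
import json

# Hypothetical schema: every node has a type, and every edge points "upward"
# from the VM toward the physical switch/router, so the graph has no cycles.
topology = {
    "nodes": {
        "VM1": {"type": "vm"},
        "VNIC1": {"type": "vnic"},
        "vSwitchA": {"type": "virtual_switch"},
        "NIC1": {"type": "nic"},
        "Switch": {"type": "switch_router"},
    },
    "edges": [
        ["VM1", "VNIC1"],
        ["VNIC1", "vSwitchA"],
        ["vSwitchA", "NIC1"],
        ["NIC1", "Switch"],
    ],
}


def is_acyclic(topo):
    """Kahn's algorithm: return True if the directed graph has no cycle."""
    indegree = {name: 0 for name in topo["nodes"]}
    for _, dst in topo["edges"]:
        indegree[dst] += 1
    ready = [name for name, degree in indegree.items() if degree == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for src, dst in topo["edges"]:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    # Every node is visited exactly when no cycle blocks the ordering.
    return visited == len(topo["nodes"])


serialized = json.dumps(topology, indent=2)
```

Checking acyclicity before storing the topology is a design choice of this sketch, not a step stated in the embodiment.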
Illustratively, the fault locating device generates the logical network topology by performing the steps of:
Step one, the fault positioning device acquires the ID, name, state and stage (denoted by status hereinafter) of each physical machine in the NFV system.
The fault locating device acquires the ID, name (represented by hypervisor hostname), state (represented by state), and stage (represented by status) of each physical machine in the NFV system (here, the NFV system includes 4 physical machines) by executing the command "nova hypervisor-list". The obtained ID, name, state and stage of the physical machines may be represented as:
+----+--------------------------------------+-------+---------+
| ID | hypervisor hostname                  | state | status  |
+----+--------------------------------------+-------+---------+
| 1  | 0CAEAC27-D21D-B211-8642-001823E5F68B | down  | enabled |
| 2  | 00F2F976-0BB4-E511-95E4-7CAD2A3F47C4 | up    | enabled |
| 3  | E2A04327-D21D-B211-AC83-001823E5F68B | up    | enabled |
| 4  | 5C556027-D21D-B211-B6EF-001823E5F68B | up    | enabled |
+----+--------------------------------------+-------+---------+
Here, down indicates that the physical machine is offline (disconnected), up indicates that the physical machine is online, and enabled indicates that the physical machine is in the available stage.
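The tabular output above could be turned into records along the following lines. This is only a sketch: it assumes the pipe-delimited layout shown above, it works on captured text rather than invoking the command, and the function name parse_hypervisor_list is hypothetical.

```python
def parse_hypervisor_list(output):
    """Parse pipe-delimited table text into a list of dicts keyed by header."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


SAMPLE = """\
+----+--------------------------------------+-------+---------+
| ID | hypervisor hostname                  | state | status  |
+----+--------------------------------------+-------+---------+
| 1  | 0CAEAC27-D21D-B211-8642-001823E5F68B | down  | enabled |
| 2  | 00F2F976-0BB4-E511-95E4-7CAD2A3F47C4 | up    | enabled |
+----+--------------------------------------+-------+---------+
"""

machines = parse_hypervisor_list(SAMPLE)
# Only online machines are worth querying in the later steps.
online = [m["hypervisor hostname"] for m in machines if m["state"] == "up"]
```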
And step two, the fault positioning device acquires the information of the physical network card and the virtual switch in each physical machine according to the acquired ID, name, state and stage of the physical machine.
And the fault positioning device acquires the information of the physical network card in each physical machine by executing a command 'cps host-show'.
For example: for the physical machine with the ID of 4 (simply referred to as physical machine 4), the fault location device executes the command cps host-show 5C556027-D21D-B211-B6EF-001823E5F68B to obtain the information of the physical network card in the physical machine.
And step three, the fault positioning device acquires the information of the virtual machine, the virtual network card and the virtual switch in each physical machine.
And the fault positioning device acquires the information of the virtual machine in the physical machine by executing a command 'nova list-all-te-host'. For example: the fault positioning device acquires the information of the virtual machine in the physical machine with the ID of 2 (simply referred to as physical machine 2) by executing a command nova list-all-te-host 00F2F976-0BB4-E511-95E4-7CAD2A3F47C 4;
then, according to the ID and the running state of the virtual machine, the fault positioning device obtains the information of the virtual network card, the Media Access Control (MAC) address of the virtual machine, the IP address of the virtual machine and the ID of the subnet where the virtual machine is located by executing the command "neutron port-list --device_id". Next, according to the port ID of the virtual network card, the fault positioning device acquires the state of the virtual network card and the ID of the network where the virtual network card is located by executing the command "neutron port-show". Next, according to the subnet ID, the fault positioning device acquires the subnet mask and the gateway corresponding to the subnet by executing the command "neutron subnet-show". Finally, according to the ID of the network where the virtual network card is located, the fault positioning device acquires the information of the virtual switch by executing the command "neutron net-show".
And step four, the fault positioning device acquires the information of the VNF in the virtual machine.
And step five, the fault positioning device acquires the ports corresponding to the physical network card and the switch/router by executing a command display mac-address.
And after the fault positioning device executes the first step to the fifth step, the logic network topology can be generated.
Illustratively, assume that the NFV system includes 4 physical machines: physical machine 1, physical machine 2, physical machine 3 and physical machine 4. The physical machine 1 includes NIC 1; NIC 1 is connected to virtual switch A and, through virtual switch A, to VNIC 1 and VNIC 2; VNIC 1 supports the running of VM 1, and VNIC 2 supports the running of VM 2. The physical machine 2 includes NIC 2; NIC 2 is connected to virtual switch B and, through virtual switch B, to VNIC 3; VNIC 3 supports the running of VM 3. The physical machine 3 includes NIC 3; NIC 3 is connected to virtual switch C and, through virtual switch C, to VNIC 4; VNIC 4 supports the running of VM 4. The physical machine 4 includes NIC 4; NIC 4 is connected to virtual switch D and, through virtual switch D, to VNIC 5, VNIC 6 and VNIC 7; VNIC 5 supports the running of VM 5, VNIC 6 supports the running of VM 6, and VNIC 7 supports the running of VM 7. In this case, the logical network topology generated by the fault location device may be as shown in fig. 6.
Optionally, after generating the logical network topology, the fault location device displays the logical network topology.
After the logical network topology is generated, the fault location device may update the logical network topology periodically. Alternatively, after determining that a characteristic (e.g., a state or a connection relationship) of a network node (e.g., a first network node) in the logical network topology has changed, the fault location device may update the information corresponding to that network node in the logical network topology.
S501, after the fault link occurs, the fault positioning device obtains first information and second information.
Here, the first information is used to identify an originating node of the failed link, and the second information is used to identify a terminating node of the failed link.
Optionally, the first information is an IP address, number or name of the originating node, and the second information is an IP address, number or name of the terminating node.
It is easily understood that the originating node and the terminating node are both VMs in the NFV system or ports of VNICs in the NFV system, and the VMs in the NFV system or the ports of VNICs in the NFV system are referred to as target nodes in the embodiments of the present application.
For example, with reference to fig. 6 and as shown in fig. 7, if the link VM1 to VNIC 1 to virtual switch A to NIC 1 to switch/router to NIC2 to virtual switch B to VNIC3 to VM3 is the failed link, the fault location device obtains the information of VM1 and the information of VM3.
S502, the fault positioning device determines m dial-up test objects and all nodes in a fault link according to the first information, the second information and a logic network topology generated in advance.
m is an integer greater than or equal to 3.
The dial testing object is a target node in the NFV system, the target node is a VM in the NFV system or a port of a VNIC in the NFV system, and the m dial testing objects at least comprise an originating node and a terminating node. That is, the fault locating device determines m-2 target nodes other than the originating node and the terminating node.
In a specific implementation, the fault location device determines the m dial test objects by performing the following steps A and B, or by performing the following steps A and C.
And step A, the fault positioning device determines that the fault link comprises a switching nodes according to the first information, the second information and the logic network topology generated in advance.
Here, the switching node is a physical switch or router, and a is an integer greater than or equal to 0.
As shown in fig. 7, the failed link includes one switching node, so that the failure location device can determine that the failed link includes 1 switching node.
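Step A can be pictured as a path search over the logical network topology followed by a count of the switching nodes on that path. The sketch below uses a breadth-first search over a hand-written fragment of the topology in fig. 6; the link list, the node names and the choice of breadth-first search are assumptions for illustration, since the embodiment does not state how the path is computed.

```python
from collections import deque

# Hypothetical fragment of the topology in fig. 6, as undirected links.
LINKS = [
    ("VM1", "VNIC1"), ("VNIC1", "vSwitchA"), ("vSwitchA", "NIC1"),
    ("VM2", "VNIC2"), ("VNIC2", "vSwitchA"),
    ("NIC1", "Switch"), ("Switch", "NIC2"),
    ("NIC2", "vSwitchB"), ("vSwitchB", "VNIC3"), ("VNIC3", "VM3"),
]
SWITCHING_NODES = {"Switch"}  # physical switches/routers only


def find_path(links, src, dst):
    """Breadth-first search; return the node list from src to dst, or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


failed_link = find_path(LINKS, "VM1", "VM3")
# "a" is the number of switching nodes on the failed link (step A).
a = sum(1 for node in failed_link if node in SWITCHING_NODES)
```

The same search also yields all nodes in the failed link, which S504 later scores one by one.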
And step B, when a is equal to 0, the originating node and the terminating node are positioned in the same physical machine, the fault positioning device selects m target nodes from the target nodes configured by the physical machine, and determines the m target nodes as m dial testing objects.
Illustratively, as shown in fig. 7, if m is 3 and the failure link is VM 5 to VNIC5 to virtual switch D to VNIC 7 to VM 7, the failure link does not include a switching node, and thus the failure location device obtains VM 5, VM 6, and VM 7 from the physical machine 4 and determines VM 5, VM 6, and VM 7 as a dial-up test object.
And step C, when a is an integer greater than or equal to 1, the fault positioning device executes the following steps I and II on each determined switching node in the a switching nodes to determine m target nodes, and determines the m target nodes as m dial testing objects.
And step I, the fault positioning device selects b physical machines from the physical machines connected with the switching node.
If the physical machine connected to the switching node includes a target physical machine, the b physical machines include at least the target physical machine. The target physical machine is a first physical machine or a second physical machine, the first physical machine is provided with an originating node, and the second physical machine is provided with a terminating node.
For other physical machines except the target physical machine among the b physical machines, the fault locating device may be selected at will, or may be selected according to loads of the physical machines, which is not specifically limited in this embodiment of the present application.
If the physical machine connected to the switching node does not include the target physical machine, the fault location device may arbitrarily select b physical machines from the physical machines connected to the switching node, or may select b physical machines according to loads of the physical machines, which is not specifically limited in this embodiment of the present application.
And II, for each physical machine in the b physical machines, selecting c target nodes from the target nodes configured by the physical machine.
In the process of selecting c target nodes from the target nodes configured by the physical machine, if the physical machine is the first physical machine, c target nodes at least including the originating node (the originating node is known as the target node from the foregoing description) are selected from the target nodes configured by the first physical machine. Similarly, if the physical machine is the second physical machine, the c target nodes at least include a termination node (the termination node is also the target node).
In summary, the fault location device obtains m dial test objects according to the above steps A and C, where m = a × b × c, a is an integer greater than or equal to 1, b is an integer greater than or equal to 2, and c is an integer greater than or equal to 2.
In practical application, for a certain physical machine, if the number of target nodes configured for the physical machine is less than 2, the fault location device may determine the target nodes configured for the physical machine as dial testing objects.
For example, as shown in fig. 7, if VM1 to VNIC 1 to virtual switch a to NIC 1 to switch/router to NIC2 to virtual switch B to VNIC3 to VM3 are failure links, VM1 is an originating node, VM1 is located in physical machine 1, VM3 is a terminating node, and VM3 is located in physical machine 2, the method for the failure location device to determine the dial-up test object may be: the fault locating device determines that the fault link includes 1 switching node, selects b (for example, b is 3) physical machines, such as the physical machine 1, the physical machine 2, and the physical machine 3, from the physical machines connected to the switching node (i.e., from the physical machine 1, the physical machine 2, the physical machine 3, and the physical machine 4), then selects VM1 and VM 2 from the physical machine 1, selects VM3 from the physical machine 2, and selects VM 4 from the physical machine 3, so that the fault locating device determines the selected VM1, VM 2, VM3, and VM 4 as a dial-up test object.
Of course, this method is only one example, and is not a limitation on the method for determining the dial test object, and the fault location device may also select the dial test object from the physical machine 1, the physical machine 2, and the physical machine 4.
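The selection in step C (steps I and II) can be sketched as follows. The inventory layout, the function name and the "listing order" tie-break are assumptions for illustration; the embodiment allows the remaining machines to be chosen arbitrarily or by load. The sketch covers step C only, not the single-machine case of step B.

```python
# Hypothetical inventory: switching node -> physical machines -> target nodes.
MACHINES_BY_SWITCH = {
    "Switch": {
        "PM1": ["VM1", "VM2"],
        "PM2": ["VM3"],
        "PM3": ["VM4"],
        "PM4": ["VM5", "VM6", "VM7"],
    }
}


def select_dial_test_objects(switches, origin, terminus, b, c):
    """For each switching node, pick b machines and up to c targets each.

    Machines hosting the originating or terminating node are always picked,
    and those two nodes are always among the targets chosen on their machine.
    """
    must_have = {origin, terminus}
    selected = []
    for switch in switches:
        machines = MACHINES_BY_SWITCH[switch]
        # Target machines first, then the rest in listing order (arbitrary).
        ordered = sorted(machines, key=lambda pm: not must_have & set(machines[pm]))
        for pm in ordered[:b]:
            # Required endpoints first; a machine with fewer than c targets
            # simply contributes all of its targets.
            ranked = sorted(machines[pm], key=lambda t: t not in must_have)
            selected.extend(t for t in ranked[:c] if t not in selected)
    return selected


objects = select_dial_test_objects(["Switch"], "VM1", "VM3", b=3, c=2)
```

With b = 3 and c = 2 this reproduces the selection of VM1, VM2, VM3 and VM4 described in the example above.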
S503, the fault positioning device determines n test links according to the m test objects, and conducts dial test on each test link in the n test links.
A test link is a transmission link between any two of the m dial test objects that are located in the same network, and n is an integer greater than or equal to 2.
Specifically, the fault location device determines that i (i is not less than 1 and not more than m) of the m dial test objects are located in a first network and that the remaining m-i dial test objects are located in a second network. Accordingly, the fault location device determines i × (i-1)/2 test links in the first network and (m-i) × (m-i-1)/2 test links in the second network, and also determines the inter-network links.
The network segment of the first network is different from the network segment of the second network, or the routing protocol of the first network is different from the routing protocol of the second network. An inter-network link is a link between a gateway of a first network and a gateway of a second network.
Subsequently, the fault location device performs dial testing on the inter-network links, on each of the i × (i-1)/2 test links in the first network, and on each of the (m-i) × (m-i-1)/2 test links in the second network.
With reference to the above example, as shown in fig. 7, the dial test objects determined by the fault location device include VM1, VM 2, VM3, and VM 4. If VM1, VM 2, VM3, and VM 4 in fig. 7 are all located in the same network, the fault location device determines 6 test links, which are:
Test link 1: VM 1-VNIC 1-virtual switch A-VNIC 2-VM 2;
Test link 2: VM 1-VNIC 1-virtual switch A-NIC 1-switch/router-NIC 2-virtual switch B-VNIC 3-VM 3;
Test link 3: VM 1-VNIC 1-virtual switch A-NIC 1-switch/router-NIC 3-virtual switch C-VNIC 4-VM 4;
Test link 4: VM 2-VNIC 2-virtual switch A-NIC 1-switch/router-NIC 2-virtual switch B-VNIC 3-VM 3;
Test link 5: VM 2-VNIC 2-virtual switch A-NIC 1-switch/router-NIC 3-virtual switch C-VNIC 4-VM 4;
Test link 6: VM 3-VNIC 3-virtual switch B-NIC 2-switch/router-NIC 3-virtual switch C-VNIC 4-VM 4.
And after the fault positioning device determines the 6 test links, the 6 test links are dial-tested one by one in the network.
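The six test links above correspond to every unordered pair of the four dial test objects, i.e. 4 × 3 / 2 = 6. A sketch of that enumeration follows; it assumes all objects are in the same network, and the uplink chains and function names are hypothetical stand-ins for a lookup in the logical network topology.

```python
from itertools import combinations

# Hypothetical uplink chains (VM up to the physical switch/router), per fig. 7.
UPLINK = {
    "VM1": ["VM1", "VNIC1", "vSwitchA", "NIC1", "Switch"],
    "VM2": ["VM2", "VNIC2", "vSwitchA", "NIC1", "Switch"],
    "VM3": ["VM3", "VNIC3", "vSwitchB", "NIC2", "Switch"],
    "VM4": ["VM4", "VNIC4", "vSwitchC", "NIC3", "Switch"],
}


def path_between(src, dst):
    """Join two uplink chains at the first node they have in common."""
    up, down = UPLINK[src], UPLINK[dst]
    meet = next(node for node in up if node in down)
    return up[: up.index(meet) + 1] + down[: down.index(meet)][::-1]


def build_test_links(objects):
    """One test link per unordered pair of objects: m * (m - 1) / 2 links."""
    return {pair: path_between(*pair) for pair in combinations(objects, 2)}


links = build_test_links(["VM1", "VM2", "VM3", "VM4"])
```

Here links[("VM1", "VM2")] stays inside virtual switch A, while links[("VM1", "VM3")] passes through the switch/router, matching test links 1 and 2 above.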
Optionally, the fault positioning device can adopt a simple ping operation to realize the dial test of the test link, the mode has high universality, and the applicability of the fault positioning method provided by the embodiment of the application is effectively improved.
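A ping-based dial test might be wrapped as shown below. This is a sketch under stated assumptions: it assumes a Linux-style ping (-c for the packet count, -W for the reply timeout in seconds) and that the command is issued from the source dial test object; how the fault positioning device reaches that object (for example, through a remote shell) is outside the sketch. The function name dial_test and the injectable run parameter are hypothetical.

```python
import subprocess


def dial_test(target_ip, run=subprocess.run):
    """Return True if a single ICMP echo request to target_ip is answered.

    `run` is injectable so the function can be exercised without a network.
    """
    try:
        result = run(
            ["ping", "-c", "1", "-W", "1", target_ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False  # ping binary missing or not executable
    # ping exits with status 0 only when a reply was received.
    return result.returncode == 0
```

A single packet with a short timeout keeps each dial test cheap; a real deployment would likely retry before declaring a test link failed.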
S504, the fault locating device determines the fault value of each node in the fault link.
Wherein the failure value is used to indicate the probability of failure of the node. The larger the fault value, the higher the probability of the node failing. Correspondingly, the smaller the failure value, the lower the probability of the node failing.
Specifically, for each node in the failed link, the fault location device performs the following process to determine the fault value of the node. Taking node A as an example, for each of the n test links, the fault location device judges whether node A is on the test link. If node A is on the test link and the dial test of the test link fails, or if node A is not on the test link and the dial test of the test link succeeds, the fault value of node A is updated to the stored fault value of node A plus 1. Otherwise, the fault value of node A remains unchanged. The initial fault value of each node is 0.
With reference to the above example, as shown in fig. 7, for VM1, the fault location device determines that VM1 is on the test link 1, and if the test link 1 fails to dial, the fault value of VM1 is 1; subsequently, the fault positioning device determines that the VM1 is on the test link 2, and if the dial test of the test link 2 is successful, the fault value of the VM1 is still kept to be 1; then, the fault locating device determines that the VM1 is on the test link 3, and if the test link 3 is successfully dial-tested, the fault value of the VM1 is still kept to be 1; then, the fault positioning device determines that the VM1 is not on the test link 4, and if the dial test of the test link 4 is successful, the fault value of the VM1 is updated to be 2; then, the fault positioning device determines that the VM1 is not on the test link 5, and if the dial test of the test link 5 fails, the fault value of the VM1 is still kept to be 2; and finally, the fault positioning device determines that the VM1 is not on the test link 6, and if the dial test of the test link 6 is successful, the fault value of the VM1 is updated to be 3.
For other nodes in the faulty link, the method for determining the fault value by the fault location device is similar to the method for determining the fault value of VM1, and is not described here any more.
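The scoring rule of S504 can be sketched in a few lines. The test link paths follow the six test links listed earlier, and the dial test results are the hypothetical outcomes used in the VM1 example (test links 1 and 5 fail, the others succeed); the function and variable names are assumptions for illustration.

```python
TEST_LINKS = {
    1: ["VM1", "VNIC1", "vSwitchA", "VNIC2", "VM2"],
    2: ["VM1", "VNIC1", "vSwitchA", "NIC1", "Switch", "NIC2", "vSwitchB", "VNIC3", "VM3"],
    3: ["VM1", "VNIC1", "vSwitchA", "NIC1", "Switch", "NIC3", "vSwitchC", "VNIC4", "VM4"],
    4: ["VM2", "VNIC2", "vSwitchA", "NIC1", "Switch", "NIC2", "vSwitchB", "VNIC3", "VM3"],
    5: ["VM2", "VNIC2", "vSwitchA", "NIC1", "Switch", "NIC3", "vSwitchC", "VNIC4", "VM4"],
    6: ["VM3", "VNIC3", "vSwitchB", "NIC2", "Switch", "NIC3", "vSwitchC", "VNIC4", "VM4"],
}
# True means the dial test succeeded; values taken from the VM1 example.
RESULTS = {1: False, 2: True, 3: True, 4: True, 5: False, 6: True}


def fault_value(node, test_links, results):
    """Count test links whose outcome is consistent with `node` being faulty."""
    value = 0  # initial fault value
    for name, path in test_links.items():
        on_link = node in path
        failed = not results[name]
        # +1 if on a failed link, or off a successful link.
        if on_link == failed:
            value += 1
    return value


vm1_value = fault_value("VM1", TEST_LINKS, RESULTS)
```

For VM1 the increments come from test links 1, 4 and 6, giving the fault value 3 derived in the example.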
And S505, the fault locating device determines the node with the fault in the fault link according to the fault value of each node.
Optionally, the fault location device may determine the node corresponding to the maximum fault value as the node with the fault in the fault link, or may determine the node corresponding to the fault value greater than or equal to the preset threshold as the node with the fault in the fault link.
For example, if the fault locating device determines that the fault value of the NIC 1 is the maximum after performing dial test on the 6 test links, the fault locating device determines that the NIC 1 is a node in the fault link that has a fault.
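Both selection rules of S505 (maximum fault value, or fault value at or above a preset threshold) fit in one small function. The fault values below are invented for illustration only, and returning every node that ties for the maximum is an assumption of this sketch; the embodiment does not say how ties are handled.

```python
def locate_faulty_nodes(fault_values, threshold=None):
    """Return the node(s) judged faulty.

    Without a threshold, return every node sharing the maximum fault value;
    otherwise return every node whose fault value is >= threshold.
    """
    if not fault_values:
        return []
    cutoff = max(fault_values.values()) if threshold is None else threshold
    return [node for node, value in fault_values.items() if value >= cutoff]


# Hypothetical fault values for some nodes of the failed link.
values = {"VM1": 3, "VNIC1": 3, "vSwitchA": 4, "NIC1": 6, "Switch": 2}

by_maximum = locate_faulty_nodes(values)
by_threshold = locate_faulty_nodes(values, threshold=4)
```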
Optionally, after determining the failed node in the failed link, the fault location device highlights the failed node in the logical network topology, so that the operation and maintenance personnel can maintain the node in time.
In summary, after a failed link occurs, the fault location device determines, according to a pre-generated logical network topology, m (m is greater than or equal to 3) dial test objects (the dial test objects are target nodes in the NFV system) at least including an originating node of the failed link and a terminating node of the failed link, so that the fault location device can perform dial test on n test links composed of the m dial test objects, and further determine a fault value (the fault value is used for indicating the probability of the node failing) of each node in the failed link according to the dial test result of each test link, and further determine the node that fails according to the fault value of each node. Compared with the prior art, the fault locating device of the embodiment of the application has the advantages that the operation is simple, the speed is higher, and the system performance is effectively improved.
In addition, the dialing and testing operation of the fault positioning device can be realized by adopting simple ping operation, the universality is higher, and the applicability of the fault positioning method provided by the embodiment of the application is improved.
The embodiment of the present application further provides a fault location device, which may be any node in the NFV system (e.g., NFV-MANO) or a partial device of a certain node in the NFV system, for example, a system on chip in the NFV-MANO. Optionally, the chip system is configured to support the fault location device to implement functions related to the method embodiments, for example, to obtain, determine, or dial-test data and/or information related to the method. The chip system includes a chip and may also include other discrete devices or circuit structures.
The fault location device is used for executing the steps of the fault location method. The fault locating device provided by the embodiment of the application can comprise modules corresponding to the corresponding steps.
In the embodiment of the present application, the fault location device may be divided into the functional modules according to the method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 8 shows a schematic diagram of a possible structure of the fault locating device in the present embodiment, in the case of dividing each functional module according to each function. As shown in fig. 8, the fault location device 80 includes an acquisition unit 800, a determination unit 801, and a dial test unit 802.
The acquisition unit 800 is used to support the fault location device in performing the acquisition operations shown in fig. 5, for example S501, and/or other processes for the techniques described herein.
The determination unit 801 is used to support the fault location device in performing the determination operations shown in fig. 5, for example S502, S503 and S504, and/or other processes for the techniques described herein.
The dial testing unit 802 is used to support the fault location device in performing the dial testing operations shown in fig. 5, for example S503, and/or other processes for the techniques described herein.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the fault location device provided in the embodiment of the present application includes, but is not limited to, the above modules; for example, the fault location device may further include a generating unit 803, an updating unit 804, and a storage unit 805. The generating unit 803 may be used to support the fault location device in performing the generating operation shown in fig. 5, for example S500, and/or other processes for the techniques described herein. The updating unit 804 may be configured to support the fault location device in updating the logical network topology. The storage unit 805 may be used to store program codes and data of the fault location device.
Further optionally, the fault location device further includes a display unit 806, where the display unit 806 is used to support the fault location device to display the faulty link, and to display the logical network topology, and to display the node in the faulty link, etc.
The physical block diagram of the fault locating device provided by the present application can refer to fig. 4 described above. The acquiring unit 800, the determining unit 801, the dial testing unit 802, the generating unit 803, and the updating unit 804 may be the processor 41 in fig. 4, the display unit 806 may be the output device 46 in fig. 4, and the storage unit 805 may be the memory 42 in fig. 4.
Another embodiment of the present application further provides a computer-readable storage medium, which stores instructions that, when executed on a fault location device, cause the fault location device to perform the steps of the fault location method of the embodiment shown in fig. 5.
In another embodiment of the present application, there is also provided a computer program product comprising computer executable instructions stored in a computer readable storage medium; the processor of the fault locating device may read the computer executable instructions from the computer readable storage medium, and the processor executing the computer executable instructions causes the fault locating device to perform the steps of the fault locating method of the embodiment shown in fig. 5.
The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, e.g., from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave, etc.).
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part thereof that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A fault location method, applied to a Network Function Virtualization (NFV) system, comprising:
acquiring first information and second information, wherein the first information is used for identifying an originating node of a failed link, the second information is used for identifying a terminating node of the failed link, the originating node and the terminating node are both target nodes in the NFV system, and the target nodes are virtual machines or ports of virtual network cards;
determining, according to the first information, the second information, and a pre-generated logical network topology, m dial test objects and all nodes in the failed link, wherein each dial test object is one of the target nodes, the m dial test objects at least comprise the originating node and the terminating node, and m is an integer greater than or equal to 3;
determining n test links according to the m dial test objects, and performing a dial test on each of the n test links, wherein each test link is a transmission link between any two dial test objects in the same network, and n is an integer greater than or equal to 2;
determining a fault value of each node in the fault link according to a dial test result, wherein the larger the fault value is, the higher the probability of the node failing is;
and determining the failed node in the failed link according to the failure value of each node.
2. The method according to claim 1, wherein the determining m dial-up test objects according to the first information, the second information, and a pre-generated logical network topology specifically includes:
determining that the fault link comprises a switching nodes according to the first information, the second information and a pre-generated logic network topology, wherein the switching nodes are physical switches or routers, and a is an integer greater than or equal to 0;
when a is equal to 0, the originating node and the terminating node are located in the same physical machine, m target nodes are selected from target nodes configured by the physical machine, and the m target nodes are determined as the m dial testing objects;
when a is an integer greater than or equal to 1, performing the following processes on each of the a switching nodes to determine m target nodes, and determining the m target nodes as the m dial test objects:
selecting b physical machines from the physical machines connected to the switching node, wherein, if the physical machines connected to the switching node comprise a target physical machine, the b physical machines at least comprise the target physical machine, the target physical machine being a first physical machine on which the originating node is deployed or a second physical machine on which the terminating node is deployed; and, for each of the b physical machines, selecting c target nodes from the target nodes configured on that physical machine, wherein m = a × b × c, b is an integer greater than or equal to 2, and c is an integer greater than or equal to 2.
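For illustration only, and without limiting the claims, the selection in claim 2 can be sketched in Python as follows. The function names, the dictionary-based inputs, and the rule of preferring the target physical machines and the two endpoints when choosing are all assumptions made for this sketch; the claim only requires that they be included, not how the remaining machines and target nodes are picked. The sketch also assumes that no physical machine is attached to more than one switching node of the failed link, so that the result has a × b × c distinct entries.

```python
def prioritized(items, preferred):
    """Return items with the preferred ones first; sorted() is stable,
    so the original order is kept within each group."""
    return sorted(items, key=lambda item: item not in preferred)


def select_dial_test_objects(switches, machines_by_switch, targets_by_machine,
                             host_of, originating, terminating,
                             b=2, c=2, m_same_host=3):
    """Pick dial test objects for a failed link (hypothetical helper).

    switches: switching nodes on the failed link (a = len(switches))
    machines_by_switch: switching node -> physical machines connected to it
    targets_by_machine: physical machine -> target nodes configured on it
    host_of: target node -> physical machine that hosts it
    """
    endpoints = {originating, terminating}
    target_machines = {host_of[originating], host_of[terminating]}

    if not switches:
        # a == 0: both endpoints sit on the same physical machine.
        machine = host_of[originating]
        chosen = prioritized(targets_by_machine[machine], endpoints)[:m_same_host]
        if len(chosen) < m_same_host:
            raise ValueError("not enough target nodes on the physical machine")
        return chosen

    selected = []
    for switch in switches:
        # a >= 1: b machines per switching node, target machines first.
        machines = prioritized(machines_by_switch[switch], target_machines)[:b]
        if len(machines) < b:
            raise ValueError(f"switch {switch} has fewer than {b} physical machines")
        for machine in machines:
            # c target nodes per machine, endpoints first.
            targets = prioritized(targets_by_machine[machine], endpoints)[:c]
            if len(targets) < c:
                raise ValueError(f"machine {machine} has fewer than {c} target nodes")
            selected.extend(targets)
    return selected
```

With one switching node and b = c = 2, the sketch returns four dial test objects, consistent with m = a × b × c.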
3. The method according to claim 1 or 2, wherein the determining n test links according to the m dial-up test objects specifically includes:
determining that i of the m dial testing objects are located in a first network, and m-i of the m dial testing objects are located in a second network, wherein a network segment of the first network is different from a network segment of the second network, or a routing protocol of the first network is different from a routing protocol of the second network, and i is greater than or equal to 1 and less than or equal to m;
determining i × (i-1)/2 test links in the first network and (m-i) × (m-i-1)/2 test links in the second network;
determining the number of inter-network links, wherein the inter-network links are links between a gateway of the first network and a gateway of the second network;
wherein n = i × (i-1)/2 + (m-i) × (m-i-1)/2 + the number of inter-network links.
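As a worked illustration of the count in claim 3 (a sketch only; the function names and sample values are assumptions and do not appear in the claims), the number of test links can be computed and the intra-network pairs enumerated as follows:

```python
from itertools import combinations


def count_test_links(m, i, inter_network_links):
    """Return n = i*(i-1)/2 + (m-i)*(m-i-1)/2 + number of inter-network links."""
    if not 1 <= i <= m:
        raise ValueError("i must satisfy 1 <= i <= m")
    first = i * (i - 1) // 2               # unordered pairs in the first network
    second = (m - i) * (m - i - 1) // 2    # unordered pairs in the second network
    return first + second + inter_network_links


def intra_network_links(dial_test_objects):
    """Enumerate every unordered pair of dial test objects in one network."""
    return list(combinations(dial_test_objects, 2))


# Example: m = 5 objects, i = 3 in the first network, one gateway-to-gateway link.
n = count_test_links(m=5, i=3, inter_network_links=1)
first_network_pairs = intra_network_links(["vm1", "vm2", "vm3"])
```

In this example the first network contributes 3 test links, the second network contributes 1, and the inter-network link adds 1, so n is 5.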
4. The method according to claim 3, wherein the performing the dial test on each of the n test links specifically includes:
in the first network, performing dial test on each test link in the i × (i-1)/2 test links;
in the second network, performing dial-up test on each test link in the (m-i) × (m-i-1)/2 test links;
and performing a dial test on each inter-network link.
5. The method according to claim 1, wherein the determining the failure value of each node in the failed link according to the dial-up test result specifically includes:
for each of the nodes, performing the following process to determine a fault value for the node:
for each test link in the n test links, judging whether the node is on the test link;
if the node is on the test link and the dial test of the test link fails, or if the node is not on the test link and the dial test of the test link succeeds, updating the fault value of the node to the stored fault value of the node plus 1, wherein the initial value of the fault value of the node is 0.
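The update rule of claim 5 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: the function name `fault_values`, the representation of a test link as the set of nodes it traverses, and the sample topology (`vm1`, `vsw1`, `sw1`, and so on) are all hypothetical.

```python
def fault_values(nodes, test_links, results):
    """Compute a fault value for every node on the failed link.

    nodes: node identifiers on the failed link
    test_links: test link id -> set of node identifiers the test link traverses
    results: test link id -> True if the dial test succeeded, False if it failed
    """
    values = {node: 0 for node in nodes}  # initial fault value is 0
    for link_id, path in test_links.items():
        succeeded = results[link_id]
        for node in values:
            on_link = node in path
            # +1 if the node is on a failed test link,
            # or is absent from a successful test link.
            if (on_link and not succeeded) or (not on_link and succeeded):
                values[node] += 1
    return values


# Hypothetical failed link vm1 -> vsw1 -> sw1 -> vsw2 -> vm2.
failed_link_nodes = ["vm1", "vsw1", "sw1", "vsw2", "vm2"]
test_links = {
    "t1": {"vm1", "vsw1", "sw1", "vsw2", "vm2"},  # crosses the physical switch
    "t2": {"vm1", "vsw1", "vm3"},                 # stays on the first host
    "t3": {"vm2", "vsw2", "vm4"},                 # stays on the second host
}
results = {"t1": False, "t2": True, "t3": True}

values = fault_values(failed_link_nodes, test_links, results)
suspect = max(values, key=values.get)  # node with the largest fault value
```

Here `sw1` receives the largest fault value (3, against 2 for every other node), because it lies on the only failed test link and on none of the successful ones.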
6. The fault localization method according to any of claims 1-2, 4-5, characterized in that the logical network topology is pre-generated by:
acquiring feature information of each network node in the NFV system, wherein the feature information includes at least one of a network address, a name and an identifier, the network node is a physical switch, a router, a physical network card, a virtual switch, a virtual network card or a virtual machine, and all nodes of the fault link belong to network nodes in the NFV system;
and generating the logic network topology according to the acquired characteristic information of the network nodes and the connection relation among different network nodes.
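A minimal Python sketch of the topology generation in claim 6 is given below for illustration. The adjacency-map representation, the function names, and the sample node names are assumptions. The breadth-first search used to list the nodes between two endpoints is likewise an assumed technique, since the claims do not specify how the nodes of the failed link are derived from the topology.

```python
from collections import defaultdict, deque


def build_topology(node_info, connections):
    """Build a logical topology from per-node feature information
    (e.g. network address, name, identifier) and pairwise connections."""
    adjacency = defaultdict(set)
    for a, b in connections:
        if a not in node_info or b not in node_info:
            raise KeyError(f"connection ({a}, {b}) references an unknown node")
        adjacency[a].add(b)
        adjacency[b].add(a)
    return {
        node: {"info": info, "neighbors": sorted(adjacency[node])}
        for node, info in node_info.items()
    }


def path_between(topology, start, end):
    """Return the nodes on a shortest path from start to end, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in topology[node]["neighbors"]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical NFV fragment: two hosts behind one physical switch.
node_info = {name: {"name": name} for name in
             ["vm1", "vm3", "vsw1", "sw1", "vsw2", "vm2"]}
connections = [("vm1", "vsw1"), ("vm3", "vsw1"), ("vsw1", "sw1"),
               ("sw1", "vsw2"), ("vsw2", "vm2")]
topology = build_topology(node_info, connections)
link_nodes = path_between(topology, "vm1", "vm2")
```

Updating the logical network topology then amounts to rebuilding or patching this map whenever nodes or connections change.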
7. The fault localization method according to any one of claims 1-2, 4-5, further comprising:
updating the logical network topology.
8. A fault locating device located in a network function virtualization, NFV, system, the fault locating device comprising:
an obtaining unit, configured to obtain first information and second information, where the first information is used to identify an originating node of a failed link, the second information is used to identify a terminating node of the failed link, the originating node and the terminating node are both target nodes in the NFV system, and the target nodes are virtual machines or ports of virtual network cards;
a determining unit, configured to determine m dial test objects and all nodes in the faulty link according to the first information, the second information, and a pre-generated logical network topology acquired by the acquiring unit, where the dial test object is the target node, the m dial test objects at least include the originating node and the terminating node, m is an integer greater than or equal to 3, and determine n test links according to the m dial test objects, where n is an integer greater than or equal to 2;
a dial test unit, configured to perform a dial test on each of the n test links determined by the determining unit, wherein each test link is a transmission link between any two of the m dial test objects in the same network;
the determining unit is further configured to determine a fault value of each node in the failed link according to a dial test result of the dial test unit, wherein the larger the fault value is, the higher the probability of the node failing is, and to determine the failed node in the failed link according to the fault value of each node.
9. The fault localization device according to claim 8, wherein the determination unit is specifically configured to:
determining that the fault link comprises a switching nodes according to the first information, the second information and a pre-generated logic network topology, wherein the switching nodes are physical switches or routers, and a is an integer greater than or equal to 0;
when a is equal to 0, the originating node and the terminating node are located in the same physical machine, m target nodes are selected from target nodes configured by the physical machine, and the m target nodes are determined as the m dial testing objects;
when a is an integer greater than or equal to 1, performing the following processes on each of the a switching nodes to determine m target nodes, and determining the m target nodes as the m dial test objects:
selecting b physical machines from the physical machines connected to the switching node, wherein, if the physical machines connected to the switching node comprise a target physical machine, the b physical machines at least comprise the target physical machine, the target physical machine being a first physical machine on which the originating node is deployed or a second physical machine on which the terminating node is deployed; and, for each of the b physical machines, selecting c target nodes from the target nodes configured on that physical machine, wherein m = a × b × c, b is an integer greater than or equal to 2, and c is an integer greater than or equal to 2.
10. The fault localization arrangement according to claim 8 or 9, characterized in that the determination unit is specifically configured to:
determining that i of the m dial testing objects are located in a first network, and m-i of the m dial testing objects are located in a second network, wherein a network segment of the first network is different from a network segment of the second network, or a routing protocol of the first network is different from a routing protocol of the second network, and i is greater than or equal to 1 and less than or equal to m;
determining i × (i-1)/2 test links in the first network and (m-i) × (m-i-1)/2 test links in the second network;
determining the number of inter-network links, wherein the inter-network links are links between a gateway of the first network and a gateway of the second network;
wherein n = i × (i-1)/2 + (m-i) × (m-i-1)/2 + the number of inter-network links.
11. The fault locating device according to claim 10, wherein the dial testing unit is specifically configured to:
in the first network, performing dial test on each test link in the i × (i-1)/2 test links;
in the second network, performing dial-up test on each test link in the (m-i) × (m-i-1)/2 test links;
and performing a dial test on each inter-network link.
12. The fault localization device according to claim 8, wherein the determination unit is specifically configured to:
for each of the nodes, performing the following process to determine a fault value for the node:
for each test link in the n test links, judging whether the node is on the test link;
if the node is on the test link and the dial test of the test link fails, or if the node is not on the test link and the dial test of the test link succeeds, updating the fault value of the node to the stored fault value of the node plus 1, wherein the initial value of the fault value of the node is 0.
13. The fault localization arrangement according to any of claims 8-9, 11-12, characterized in that the fault localization arrangement further comprises a generating unit;
the generating unit is configured to generate the logical network topology in advance by:
acquiring feature information of each network node in the NFV system, wherein the feature information includes at least one of a network address, a name and an identifier, the network node is a physical switch, a router, a physical network card, a virtual switch, a virtual network card or a virtual machine, and all nodes of the fault link belong to network nodes in the NFV system;
and generating the logic network topology according to the acquired characteristic information of the network nodes and the connection relation among different network nodes.
14. The fault localization arrangement according to any of claims 8-9, 11-12, characterized in that the fault localization arrangement further comprises an updating unit;
the updating unit is used for updating the logic network topology.
15. A fault locating device, wherein the fault locating device comprises a processor and a memory; the memory is configured to store computer-executable instructions that, when executed by the fault location device, cause the fault location device to perform the fault location method of any one of claims 1-7.
16. A computer-readable storage medium having stored thereon computer instructions which, when run on a fault location device, cause the fault location device to perform the fault location method of any one of claims 1-7.
CN201811625982.8A 2018-12-28 2018-12-28 Fault positioning method and device Active CN109802855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811625982.8A CN109802855B (en) 2018-12-28 2018-12-28 Fault positioning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811625982.8A CN109802855B (en) 2018-12-28 2018-12-28 Fault positioning method and device

Publications (2)

Publication Number Publication Date
CN109802855A CN109802855A (en) 2019-05-24
CN109802855B true CN109802855B (en) 2020-08-07

Family

ID=66558072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811625982.8A Active CN109802855B (en) 2018-12-28 2018-12-28 Fault positioning method and device

Country Status (1)

Country Link
CN (1) CN109802855B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110868355B (en) * 2019-11-19 2022-05-13 广州丰石科技有限公司 Topology automatic discovery and fault delimitation method based on NFV network
CN113568789B (en) * 2020-04-28 2024-04-16 北京比特大陆科技有限公司 Chip detection method, detection device and electronic equipment
CN112684371B (en) * 2020-12-07 2023-11-21 深圳市道通科技股份有限公司 Fault positioning method, diagnosis equipment and automobile detection system and method for automobile bus
CN114221882A (en) * 2021-12-23 2022-03-22 锐捷网络股份有限公司 Method, device, equipment and storage medium for detecting fault link
CN115134248A (en) * 2022-05-23 2022-09-30 奇安信科技集团股份有限公司 Network topology difference detection method and device
CN116248573B (en) * 2022-12-01 2024-06-18 中国联合网络通信集团有限公司 Link splicing method, device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107278362A (en) * 2016-11-09 2017-10-20 华为技术有限公司 The method of Message processing, main frame and system in cloud computing system
CN108282376A (en) * 2018-04-20 2018-07-13 江南大学 A kind of LDDoS emulation modes based on lightweight virtualization
CN108833202A (en) * 2018-05-22 2018-11-16 华为技术有限公司 Faulty link detection method, device and computer readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006080899A1 (en) * 2005-01-28 2006-08-03 Agency For Science, Technology And Research Systems and methods for testing data link connectivity of an optical network employing transparent optical nodes
CN101022474A (en) * 2007-03-12 2007-08-22 华为技术有限公司 Network fault testing method and device
CN104270268B (en) * 2014-09-28 2017-12-05 曙光信息产业股份有限公司 A kind of distributed system network performance evaluation and method for diagnosing faults
EP3367612B1 (en) * 2016-08-25 2019-05-08 Huawei Technologies Co., Ltd. Dial testing method, dial testing system, and computing node
US10148564B2 (en) * 2016-09-30 2018-12-04 Juniper Networks, Inc. Multiple paths computation for label switched paths
CN108737205A (en) * 2017-04-18 2018-11-02 中兴通讯股份有限公司 Dial testing method, apparatus and system
CN108933694B (en) * 2018-06-09 2021-11-09 西安电子科技大学 Data center network fault node diagnosis method and system based on dial testing data

Also Published As

Publication number Publication date
CN109802855A (en) 2019-05-24

Similar Documents

Publication Publication Date Title
CN109802855B (en) Fault positioning method and device
JP5458308B2 (en) Virtual computer system, virtual computer system monitoring method, and network device
US10432460B2 (en) Network service scaling method and apparatus
CN106489251B (en) The methods, devices and systems of applied topology relationship discovery
CN105531970B (en) Method and system for the load that maps out the work in a network
US9686146B2 (en) Reconfiguring interrelationships between components of virtual computing networks
US9465641B2 (en) Selecting cloud computing resource based on fault tolerance and network efficiency
US10999132B1 (en) Detecting degraded network monitoring agents
TWI612786B (en) Nodes managing system, nodes managing method and computer-readable storage device
US8621057B2 (en) Establishing relationships among elements in a computing system
CN109995552B (en) VNF service instantiation method and device
BR112017017330B1 (en) METHOD AND APPARATUS FOR PROCESSING ALARM INFORMATION
US11005968B2 (en) Fabric support for quality of service
US10735253B2 (en) Alarm information reporting method and apparatus
US11349724B2 (en) Predictive analysis in a software defined network
CN113872997B (en) Container group POD reconstruction method based on container cluster service and related equipment
EP3806395A1 (en) Virtual network function (vnf) deployment method and apparatus
US11531564B2 (en) Executing multi-stage distributed computing operations with independent rollback workflow
CN111092828B (en) Network operation method, device, equipment and storage medium
KR20190114495A (en) Apparatus and method for network resource management in network fucntion virtualizaion environment
WO2017143935A1 (en) Connectivity testing method and device
US10374941B2 (en) Determining aggregation information
US10367711B2 (en) Protecting virtual computing instances from network failures
WO2022212050A1 (en) Route discovery for failure detection in computer networks
JP7450072B2 (en) Virtualization network service deployment method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant