CN114760224A - System, method, apparatus, and storage medium for monitoring status of network channels - Google Patents

Publication number
CN114760224A
Authority
CN
China
Prior art keywords
monitoring
network
network device
unit
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111600863.9A
Other languages
Chinese (zh)
Inventor
郝建明
宋泽锋
伍福生
李兴锋
简超
潘星明
张源
吴晨楠
姜雪娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Application filed by China Unionpay Co Ltd
Priority to CN202111600863.9A
Publication of CN114760224A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application relates to network operation and maintenance technology, and more particularly to a system and method for monitoring the status of a network channel. One aspect of the present application provides a system for monitoring the status of a network channel, the network channel including a plurality of nodes, the system comprising: a plurality of monitoring units, each disposed at one of the plurality of nodes and configured to monitor the status of a network device at the respective node to generate a monitoring result; a determination unit configured to determine, based on the monitoring result, whether the state of the network device is abnormal and to generate an automatic disposition command when the state is abnormal; and a processing unit configured to perform an exception handling operation on the network device at the respective node based on the automatic disposition command.

Description

System, method, apparatus, and storage medium for monitoring status of network channel
Technical Field
The present application relates to network operation and maintenance technologies, and in particular, to a system, a method, a computer device, and a computer-readable storage medium for monitoring a status of a network channel.
Background
With the rapid development of the Internet, fields such as online shopping, social media, online entertainment, mobile payment, and local life services have become inseparable from the service support of data centers, which brings both opportunities and challenges to data center development. As data centers grow, the types and number of network devices increase markedly and the devices become more widely distributed, which significantly increases the difficulty of monitoring and management in daily data center operation.
Disclosure of Invention
According to one aspect of the present application, there is provided a system for monitoring the status of a network channel, the network channel comprising a plurality of nodes, the system comprising: a plurality of monitoring units, each disposed at one of the plurality of nodes and configured to monitor the status of a network device at the respective node to generate a monitoring result; a determination unit configured to determine, based on the monitoring result, whether the state of the network device is abnormal and to generate an automatic disposition command when the state is abnormal; and a processing unit configured to perform an exception handling operation on the network device at the respective node based on the automatic disposition command.
Optionally, in the above system, the exception handling operation includes a primary/standby switching operation between network devices.
Optionally, in the above system, the network device comprises one or more of: switches, routers, firewalls, and intrusion prevention systems.
Optionally, in the above system, the monitoring unit has a binding relationship with the network device at the corresponding node and is connected to a port of the bound network device.
Optionally, in the above system, the monitoring unit is further configured to monitor the status of the network device at the respective node using service commands based on one or more of the following network protocols: ICMP, TCP, and HTTP.
Optionally, in the above system, the determination unit and the processing unit are disposed at a node where a core switch is located.
Optionally, in the above system, the determination unit is further configured to: periodically determine, based on the monitoring result received from the monitoring unit, whether the state of the network device at the corresponding node is abnormal; generate the automatic disposition command and alarm information in response to a determination that an abnormality has occurred; and issue the automatic disposition command to the processing unit.
Optionally, in the above system, the determination unit is further configured to: monitor whether the monitoring unit uploads the monitoring result within a set time; and generate alarm information in response to the monitoring unit not uploading the monitoring result within the set time.
Optionally, in the above system, the network channel is connected between data centers to provide a channel for data transmission between them.
Optionally, in the above system, the network device is connected to a power supply via a PDU device, and the processing unit is further configured to perform the primary/standby switching operation between the network devices by remotely controlling the PDU devices connected to a pair of network devices having a primary/standby relationship, so as to switch one of the PDU devices from a power-on state to a power-off state.
Optionally, in the above system, the PDU devices connected to a pair of network devices having a primary/standby relationship are remotely controlled in a synchronous manner.
According to another aspect of the present application, there is provided a method for monitoring the status of a network channel, the network channel comprising a plurality of nodes, the method comprising the following steps performed at a computer device: A. receiving monitoring results from a plurality of monitoring units, wherein each monitoring unit is deployed at one of the plurality of nodes and is configured to monitor the status of a network device at the respective node to generate a monitoring result; B. determining, based on the monitoring results, whether the state of the network device is abnormal and generating an automatic disposition command when the state is abnormal; and C. sending the automatic disposition command to a processing unit so that the processing unit performs an exception handling operation on the network device at the corresponding node based on the automatic disposition command.
According to still another aspect of the present application, there is provided a computer device comprising: a memory; a processor; and a computer program stored on the memory and executable on the processor, execution of the computer program resulting in the following operations: A. receiving monitoring results from a plurality of monitoring units, wherein each monitoring unit is deployed at one of the plurality of nodes and is configured to monitor the status of a network device at the respective node to generate a monitoring result; B. determining, based on the monitoring results, whether the state of the network device is abnormal and generating an automatic disposition command when the state is abnormal; and C. sending the automatic disposition command to a processing unit so that the processing unit performs an exception handling operation on the network device at the corresponding node based on the automatic disposition command.
According to yet another aspect of the application, there is provided a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, carrying out the method described above.
In some embodiments, a system for monitoring the status of a network channel is deployed to address the increasing operation and maintenance difficulty of current data centers, so that the states of network devices at multiple nodes on the network channel can be monitored around the clock without interruption, and any network device found to be abnormal can be handled automatically and quickly; that is, the abnormal network device is powered off and, at the same time, primary/standby switching of the network devices is achieved by remotely controlling the PDU devices.
By integrating the monitoring of network device states at multiple nodes on the network channel, the determination of abnormal network devices, and the automatic disposition of abnormal network devices into one operation, the operation and maintenance difficulty and management cost of the data center are significantly reduced, and the security and reliability of the data center are improved. In addition, compared with existing processes or means for handling abnormal devices, the integrated operation handles abnormal devices more efficiently, and the primary/standby switching of the abnormal network device reduces its impact on the data center.
Drawings
The foregoing and/or other aspects and advantages of the present application will become more apparent and more readily appreciated from the following description of the various aspects, taken in conjunction with the following drawings, wherein like or similar elements have like numerals. The drawings include:
FIG. 1 illustrates a schematic block diagram of a system for monitoring a status of a network channel in accordance with some embodiments of the present application.
FIG. 2 illustrates a deployment architecture diagram of a system for monitoring the status of a network channel according to some embodiments of the present application.
Fig. 3 is a flow diagram of a method for monitoring a status of a network channel in accordance with some embodiments of the present application.
FIG. 4 is a block diagram of a typical computer device.
Detailed Description
The present application is described more fully hereinafter with reference to the accompanying drawings, in which illustrative embodiments of the application are shown. This application may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, the embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the present application to those skilled in the art.
In the present specification, words such as "comprise" and "comprises" mean that in addition to elements and steps directly and unequivocally stated in the specification and claims, the technical solutions of the present application do not exclude other elements and steps not directly or unequivocally stated.
Terms such as "first" and "second" do not denote an order of the elements in time, space, size, etc., but rather are used to distinguish one element from another.
In this specification, a PDU (Power Distribution Unit) device refers to a power distribution unit that is connected between a network device and a power supply and serves as the power distribution outlet of a cabinet, thereby distributing power to the network device.
In this specification, network devices may include, but are not limited to, switches, routers, firewalls, Intrusion Prevention Systems (IPS), and the like. Embodiments of a deployment architecture for a system for monitoring the status of network channels are described below with switches and routers as examples. However, it should be noted that the embodiments described herein are equally applicable to a variety of other network devices.
FIG. 1 illustrates a schematic block diagram of a system for monitoring a status of a network channel in accordance with some embodiments of the present application. The system 10 for monitoring the status of a network channel shown in FIG. 1 comprises a monitoring unit 110, a determination unit 120, and a processing unit 130. Illustratively, the network channel may be connected between data centers to provide a channel for data transmission between them.
The monitoring unit 110 may be disposed at each of a plurality of nodes included in the network channel and configured to monitor the status of a network device at the respective node to generate a monitoring result. As an example, the monitoring unit 110 may be configured to monitor the communication status of the network device at the corresponding node; when the monitoring unit 110 detects that the network device at a node cannot communicate with the network devices at other nodes, it generates a corresponding monitoring result indicating that the network device at that node is abnormal.
Optionally, the monitoring unit 110 may have a binding relationship with the network device at the respective node and be connected to a port of the bound network device.
Optionally, the monitoring unit 110 is further configured to monitor the status of the network devices at the respective nodes using service commands based on one or more of the following network protocols: ICMP, TCP, and HTTP. For example, the monitoring unit 110 may use service commands such as Ping, Curl, and Nc, which operate over these protocols, to monitor the status of the network devices at the respective nodes. In one example, the monitoring unit 110 may use the Ping command to check whether the local network device can successfully exchange (e.g., send and receive) data packets with another network device, and then infer from the returned information whether the TCP/IP parameters of the local network device are set correctly, whether the device is running normally, whether the network is unobstructed, and the like, thereby generating a corresponding monitoring result. In another example, the monitoring unit 110 may use the Nc command to establish and listen on arbitrary TCP and UDP connections of the network device, thereby generating a corresponding monitoring result.
Using service commands based on several network protocols improves the comprehensiveness and accuracy of the monitoring process, because different network protocols cover different OSI layers.
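As a minimal sketch of one such check, the following Python function attempts a TCP connection to a device port, which is similar in spirit to a port check with Nc. The names `ProbeResult` and `tcp_probe`, the result fields, and the timeout value are illustrative assumptions and do not come from the application.

```python
import socket
from dataclasses import dataclass


@dataclass
class ProbeResult:
    target: str      # "host:port" of the probed device port (hypothetical format)
    protocol: str    # protocol used for the probe, here always "TCP"
    reachable: bool  # True if the TCP connection could be opened


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> ProbeResult:
    """Try to open a TCP connection and report whether it succeeded."""
    try:
        # create_connection raises OSError on refusal, timeout, or routing failure.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        reachable = False
    return ProbeResult(target=f"{host}:{port}", protocol="TCP", reachable=reachable)
```

A monitoring unit could combine a probe of this kind with ICMP and HTTP checks so that a failure visible at only one layer is still reported.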
In addition, a plurality of nodes in the network channel can support capacity expansion, for example, the nodes can be expanded according to monitoring requirements to improve the monitoring range, so that full-coverage monitoring of the network equipment is realized.
The monitoring unit 110 may be implemented as a monitoring server directly connected to a port of a network device, or any other device having a monitoring function.
The determination unit 120 may be configured to determine whether the state of the network device is abnormal based on the monitoring result and to generate an automatic disposition command when the state is abnormal. Optionally, the automatic disposition command may indicate the time at which the abnormality of the network device occurred, an identification of the abnormal network device, an identification of the PDU device socket to which the abnormal network device is connected, and the like. Optionally, the automatic disposition command may further indicate an identification of the PDU device socket to which the standby device corresponding to the abnormal network device is connected.
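The fields listed above could be carried in a small record such as the one sketched below. The class name `DispositionCommand`, the field names, and the sample values are hypothetical; the application does not specify a data format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DispositionCommand:
    occurred_at: datetime   # time at which the abnormality occurred
    device_id: str          # identification of the abnormal network device
    pdu_socket_id: str      # PDU socket the abnormal device is connected to
    # PDU socket of the corresponding standby device, if the command indicates it
    standby_pdu_socket_id: Optional[str] = None


# Sample values are made up for illustration only.
cmd = DispositionCommand(
    occurred_at=datetime(2021, 12, 24, 10, 0, 0),
    device_id="switch-11",
    pdu_socket_id="pdu-3/socket-7",
)
```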
Optionally, the determination unit 120 is further configured to perform the following operations: periodically determining whether the state of the network device at the corresponding node is abnormal based on the monitoring result received from the monitoring unit 110; generating an automatic disposition command in response to a determination that an abnormality has occurred; issuing the automatic disposition command to the processing unit 130; and generating alarm information in response to that determination so as to notify the relevant personnel and facilitate their subsequent handling. Optionally, the automatic disposition command and the alarm information may be generated substantially synchronously. As an example, when each monitoring unit 110 transmits a monitoring result to the determination unit 120, the determination unit 120 stores the monitoring result information and periodically (e.g., every 30 seconds) determines, based on the monitoring results, whether the state of the network device at the corresponding node is abnormal. When the state is judged to be abnormal, an automatic disposition command is generated and issued to the processing unit 130, and alarm information is generated at the same time.
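One judgment cycle of this kind can be sketched as follows. The record type `MonitoringResult`, the rule "the latest stored result for a device decides its state", and the callback names `dispatch` and `alarm` are assumptions made for illustration; the application does not define the judgment criterion.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MonitoringResult:
    device_id: str      # device the result refers to
    reachable: bool     # outcome reported by the monitoring unit
    reported_at: float  # upload time, in seconds


def judge(stored: List[MonitoringResult]) -> List[str]:
    """Return the ids of devices whose most recent stored result is abnormal."""
    latest: Dict[str, MonitoringResult] = {}
    for r in stored:
        prev = latest.get(r.device_id)
        if prev is None or r.reported_at >= prev.reported_at:
            latest[r.device_id] = r
    return sorted(d for d, r in latest.items() if not r.reachable)


def judgement_cycle(
    stored: List[MonitoringResult],
    dispatch: Callable[[str], None],  # hands a command to the processing unit
    alarm: Callable[[str], None],     # notifies the relevant personnel
) -> List[str]:
    """Run one cycle: dispatch a command and raise an alarm per abnormal device."""
    abnormal = judge(stored)
    for device_id in abnormal:
        dispatch(device_id)
        alarm(f"device {device_id} is abnormal")
    return abnormal
```

In a deployment along the lines of the example above, `judgement_cycle` would be invoked on a timer, for instance once every 30 seconds.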
Optionally, the determination unit 120 is further configured to perform the following operations: monitoring whether the monitoring unit 110 uploads a monitoring result within a set time; and generating alarm information in response to the monitoring unit 110 not uploading a monitoring result within the set time. As an example, the determination unit 120 may be configured to check every 15 seconds whether all the monitoring units 110 have uploaded monitoring results, thereby monitoring the state of the monitoring units 110 themselves, so that a failure of a monitoring unit 110 to obtain monitoring results does not go undetected. The reliability of the monitoring process performed by the monitoring unit 110 is thereby further improved.
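The upload check can be sketched as a comparison of each monitoring unit's last upload time against the set time. The function name `overdue_units` and the dictionary layout are hypothetical; the 15-second default mirrors the example interval given above.

```python
from typing import Dict, List

UPLOAD_TIMEOUT_S = 15.0  # example interval taken from the text


def overdue_units(
    last_upload: Dict[str, float],  # monitoring unit id -> time of last upload (seconds)
    now: float,
    timeout: float = UPLOAD_TIMEOUT_S,
) -> List[str]:
    """Return the ids of monitoring units that have not uploaded within the set time."""
    return sorted(unit for unit, t in last_upload.items() if now - t > timeout)
```

Each id returned would lead to alarm information being generated for the corresponding monitoring unit.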
The processing unit 130 may be configured to perform an exception handling operation on the network device at the respective node based on the automatic disposition command. Optionally, the exception handling operation includes implementing primary/standby switching between the network devices by remotely controlling the PDU device. As an example, the network device may be equipped with a high availability mechanism, i.e., the network devices are deployed in a primary/standby mode. During normal operation, services are provided by the primary device, and the standby device takes over the services when the primary device becomes abnormal.
In practice, however, even though the network device may be equipped with a high availability mechanism, when conditions such as performance degradation or blocking of the network device occur, the data center may appear unaffected on the surface while the operation of some services is in fact affected, so that a complete monitoring result may not be generated. In view of this, the monitoring unit 110 monitors the states of the network devices at the corresponding nodes using multiple network protocols and their corresponding service commands, so that abnormal states of the network devices that are otherwise hard to perceive are detected in time. This improves the comprehensiveness and accuracy of the monitoring process, because different network protocols cover different OSI layers.
Optionally, the processing unit 130 is further configured to perform primary/standby switching between the network devices via the PDU device, so as to implement exception handling for the network device at the corresponding node. As an example, based on the identification of the abnormal network device indicated by the received automatic disposition command, the processing unit 130 may remotely instruct the PDU device to disconnect the abnormal network device from the power supply. To shorten the primary/standby switching process, the processing unit 130 optionally controls the PDU devices connected to the pair of network devices in a synchronous or nearly synchronous manner. In this way, abnormal events can be handled quickly and without being perceived externally.
Optionally, the processing unit 130 may remotely control the PDU device over the SSH (Secure Shell) protocol, so that primary/standby switching is achieved through the high availability mechanism of the network device, thereby reducing the impact of the exception handling operation on the data center.
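The power-off step can be sketched against an abstract PDU interface. The `PduController` interface, the in-memory `FakePdu`, and the socket identifiers are all hypothetical; the actual commands a PDU accepts over SSH are vendor specific and are not described in the application, so no real remote call is shown.

```python
from typing import Dict, Iterable, Protocol


class PduController(Protocol):
    """Hypothetical interface; a real implementation might send commands over SSH."""

    def power_off(self, socket_id: str) -> None: ...


class FakePdu:
    """In-memory stand-in used only to make the sketch runnable."""

    def __init__(self, sockets: Iterable[str]) -> None:
        self.state: Dict[str, str] = {s: "on" for s in sockets}

    def power_off(self, socket_id: str) -> None:
        self.state[socket_id] = "off"


def handle_abnormal_device(
    pdu: PduController,
    device_to_socket: Dict[str, str],  # device id -> PDU socket id (assumed mapping)
    device_id: str,
) -> str:
    """Cut power to the abnormal device and return the socket that was switched off."""
    socket_id = device_to_socket[device_id]
    pdu.power_off(socket_id)
    return socket_id
```

Once the abnormal device loses power, the takeover by the standby device is left to the high availability mechanism of the network devices, as the text describes.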
Optionally, the determination unit 120 and the processing unit 130 described above may be disposed at a node where the core switch is located, so as to improve the reliability and efficiency of both the determination of abnormal network devices and the exception handling operation.
Although not shown in fig. 1, the system 10 for monitoring the status of the network channel may further include various other functional modules, such as a monitoring unit management module, a judgment unit management module, a processing unit management module, a device deployment information management module, an emergency manual management module, an alarm notification management module, and the like, so that multi-module integration can be implemented in the system 10 for monitoring the status of the network channel to provide more efficient and convenient operation and maintenance services for operation and maintenance personnel.
The system for monitoring the status of a network channel integrates the monitoring of network device states at multiple nodes on the network channel, the determination of abnormal network devices, and the automatic disposition of abnormal network devices, so that the operation and maintenance difficulty and management cost of the data center are significantly reduced and the security and reliability of the data center are improved. In addition, compared with existing processes or means for handling abnormal devices, the integrated operation handles abnormal devices more efficiently, and the primary/standby switching of the abnormal network device reduces its impact on the data center.
Fig. 2 illustrates a deployment architecture diagram of a system for monitoring the status of a network channel in accordance with some embodiments of the present application.
Referring to FIG. 2, each circle represents a monitoring node in a network channel, and a corresponding monitoring unit may be deployed at each of the monitoring nodes 1-11 as shown. Illustratively, the monitoring units deployed at monitoring nodes 1-11 may have the features and structure of the monitoring unit 110, wherein the monitoring unit deployed at monitoring node 11 is configured to monitor the state of switches in network A and generate monitoring results, the monitoring unit deployed at monitoring node 10 is configured to monitor the state of firewall switches in network A and generate monitoring results, and the monitoring unit deployed at monitoring node 9 is configured to monitor the state of routers in network B and generate monitoring results.
It is understood that the number and deployment locations of the monitoring nodes 1-11 shown in FIG. 2 are only exemplary; a person skilled in the art may deploy monitoring units at other monitoring nodes according to the actual needs of the data center, and the number of monitoring nodes or monitoring units is not limited to the specific example shown in FIG. 2. It will be appreciated that the monitoring nodes may be selected based on the network architecture of the data center to maximize coverage of the various network devices. In addition, the monitoring nodes support capacity expansion and are applicable to new cloud network architectures; for example, nodes can be added according to monitoring requirements to further extend the monitoring range, thereby achieving full-coverage monitoring of the network devices.
Optionally, the determination unit 120 and the processing unit 130 shown in FIG. 1 may be disposed at a node where the core switch is located (e.g., the monitoring node 5 shown in FIG. 2). The determination unit 120 may receive the monitoring results generated by the monitoring units 110 disposed at the respective monitoring nodes (e.g., the monitoring nodes 1 to 11), determine whether the state of the network device is abnormal based on the monitoring results, and generate an automatic disposition command when the state is abnormal, and the processing unit 130 may perform an exception handling operation on the network device at the corresponding monitoring node based on the automatic disposition command. With the determination unit 120 and the processing unit 130 deployed at the node where the core switch is located, the determination unit 120 can promptly identify the abnormal network device from the monitoring result and generate the automatic disposition command, and the processing unit 130 can promptly perform the exception handling operation on the network device at the corresponding monitoring node, thereby improving the reliability and efficiency of both the determination of abnormal network devices and the exception handling operation.
Optionally, the automatic disposition command may include an exception occurrence time of the network device, an identifier of the abnormal network device, an identifier of the PDU device socket connected to the standby device corresponding to the abnormal network device, and the like. Alternatively, the binding relationship between the abnormal network device and the standby device is stored at the processing unit 130, so that the processing unit can find the identifier of the PDU device socket connected to the corresponding standby device based on the identifier of the abnormal network device.
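The lookup described in the second alternative can be sketched with two tables held at the processing unit. The table names, device ids, and socket ids below are made up for illustration; the application does not say how the binding relationship is stored.

```python
from typing import Dict, Optional

# Hypothetical binding tables stored at the processing unit.
STANDBY_OF: Dict[str, str] = {"switch-11a": "switch-11b"}
PDU_SOCKET_OF: Dict[str, str] = {
    "switch-11a": "pdu-1/socket-4",
    "switch-11b": "pdu-2/socket-4",
}


def standby_socket(device_id: str) -> Optional[str]:
    """Find the PDU socket of the standby device bound to an abnormal device."""
    standby = STANDBY_OF.get(device_id)
    if standby is None:
        return None  # no binding relationship recorded for this device
    return PDU_SOCKET_OF.get(standby)
```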
Optionally, various network devices (e.g., a switch, a router, a firewall, and an IPS) shown in fig. 2 are connected to the power supply via the PDU device, so that the processing unit 130 can implement active/standby switching of the network devices by remotely controlling a socket specified in the PDU device, thereby completing exception handling of the network devices at the corresponding monitoring nodes.
For example, when the determination unit 120 determines, based on the monitoring result, that the switch at the monitoring node 11 is abnormal, it may generate an automatic disposition command and transmit it to the processing unit 130. Based on the identification of the abnormal switch indicated by the received automatic disposition command, the processing unit 130 instructs the PDU device to cut off the connection between its socket and the switch; at the same time. Fast switching between the primary device and the standby device is thereby achieved.
Optionally, the determination unit 120 may generate alarm information in response to a determination that an abnormality has occurred, so as to notify the relevant personnel and facilitate their subsequent handling. Meanwhile, the determination unit 120 may monitor whether the monitoring units 110 deployed at the monitoring nodes upload monitoring results within a set time, and generate alarm information in response to a monitoring unit 110 not uploading a monitoring result within the set time. For example, the determination unit 120 may check every 15 seconds whether all the monitoring units 110 have uploaded monitoring results, thereby monitoring the state of the monitoring units 110 themselves, so that a failure of a monitoring unit 110 to obtain monitoring results does not go undetected. The reliability of the monitoring process performed by the monitoring unit 110 is thereby further improved.
It is understood that the deployment architecture of the system for monitoring the status of the network channel shown in FIG. 2 above is only exemplary, and those skilled in the art can implement other deployment architectures according to the actual needs of the data center (e.g., multi-level, multi-protocol monitoring of network devices, or monitoring of decentralized network devices).
Fig. 3 is a flow diagram of a method for monitoring a status of a network channel in accordance with some embodiments of the present application.
In these embodiments, the steps in the method flow are implemented by means of the computer device shown in fig. 4 below or the above determination unit 120 shown in fig. 1.
In step 310, the monitoring results are received from the monitoring unit 110.
In step 320, it is determined whether the state of the network device is abnormal based on the monitoring result, and an automatic disposal command is generated when the state is abnormal. Optionally, the automatic disposal command may indicate the time at which the abnormality of the network device occurred, the identification of the abnormal network device, and the identification of the PDU device socket to which the abnormal network device is connected. Optionally, the automatic disposal command may further indicate the identification of the PDU device socket to which the standby device corresponding to the abnormal network device is connected.
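The fields that the command generated in step 320 may carry can be pictured as a small record. The sketch below is a hypothetical Python encoding, not a format defined by the application: the class name, field names, JSON serialization, and sample identifiers are all assumptions.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DisposalCommand:
    """Illustrative container for the fields listed in step 320."""
    occurred_at: str                          # time at which the abnormality occurred
    device_id: str                            # identification of the abnormal device
    socket_id: str                            # PDU socket of the abnormal device
    standby_socket_id: Optional[str] = None   # optional: PDU socket of the standby

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable for logging and comparison.
        return json.dumps(asdict(self), sort_keys=True)


command = DisposalCommand(
    occurred_at=datetime(2021, 12, 24, 8, 0, tzinfo=timezone.utc).isoformat(),
    device_id="switch-node-11",
    socket_id="pdu-a/3",
    standby_socket_id="pdu-b/3",
)
payload = command.to_json()
```

Keeping the standby socket optional mirrors the text, where that field is described as a further optional indication.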
In step 330, the automatic disposal command is sent to the processing unit 130 to cause the processing unit 130 to perform exception handling operations for the network devices at the respective nodes based on the automatic disposal command.
Optionally, step 320 may be performed in the following manner: periodically determining whether the state of the network device at the corresponding node is abnormal based on the monitoring result received from the monitoring unit 110; generating the automatic disposal command in response to a determination result that an abnormality has occurred; and issuing the automatic disposal command to the processing unit 130.
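Steps 310 to 330 can be sketched as one pass over a batch of received monitoring results. The Python below is a simplified illustration under stated assumptions: each result is a dictionary with hypothetical keys `device_id`, `reachable`, and `checked_at`, a device is treated as abnormal when `reachable` is false, and `send` stands in for the channel to the processing unit 130. The application does not specify the abnormality criterion or the message format.

```python
from typing import Callable, Dict, Iterable, List


def judge_and_dispatch(results: Iterable[Dict],
                       send: Callable[[Dict], None]) -> List[Dict]:
    """Judge one batch of monitoring results and dispatch a command per fault."""
    issued = []
    for result in results:              # step 310: results already received
        if result["reachable"]:         # step 320: assumed abnormality criterion
            continue
        command = {
            "device_id": result["device_id"],
            "occurred_at": result["checked_at"],
        }
        send(command)                   # step 330: hand over to the processing unit
        issued.append(command)
    return issued


# Usage: a list's append method plays the role of the processing unit's inbox.
sent: List[Dict] = []
batch = [
    {"device_id": "switch-11", "reachable": True, "checked_at": 100},
    {"device_id": "firewall-12", "reachable": False, "checked_at": 100},
]
issued = judge_and_dispatch(batch, sent.append)
```

For the periodic variant, a scheduler would call `judge_and_dispatch` once per period with the results collected since the previous call.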
Optionally, the method further includes generating alarm information in response to a determination result that an abnormality has occurred, so as to notify the relevant personnel and facilitate their subsequent handling.
Optionally, the method further includes monitoring whether the monitoring unit 110 uploads the monitoring result within a set time; and generating alarm information in response to an event that the monitoring unit 110 does not upload the monitoring result within a set time.
The method for monitoring the state of a network channel described above can monitor the state of network devices at multiple nodes on the network channel, identify abnormal network devices, and handle them automatically, which significantly reduces the operation and maintenance difficulty and the management cost of the data center and improves its security and reliability. In addition, compared with existing flows or means for handling abnormal devices, this integrated operation improves the efficiency of handling abnormal devices, and the active/standby switching of an abnormal network device reduces its impact on the data center.
FIG. 4 is a block diagram of a typical computer device. The computer device 40 shown in fig. 4 may be used to implement the method described in fig. 3 for monitoring the status of a network channel.
Referring to fig. 4, computer device 40 includes a memory 410 (e.g., non-volatile memory such as flash memory, ROM, hard drives, magnetic disks, optical disks, etc.), a processor 420, and a computer program 430 stored on memory 410 and executable on processor 420.
The processor 420 is configured to execute the computer program 430 stored in the memory 410 to implement the corresponding method flow for monitoring the status of a network channel.
According to another aspect of the present application, there is also provided a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the above-mentioned method flow for monitoring the status of a network channel.
Here, as the computer-readable storage medium, various types of computer storage media such as a disk (e.g., a magnetic disk, an optical disk, etc.), a card (e.g., a memory card, an optical card, etc.), a semiconductor memory (e.g., a ROM, a nonvolatile memory, etc.), a tape (e.g., a magnetic tape, a cassette tape, etc.), and the like can be used.
Where applicable, the various embodiments provided by the present application can be implemented using hardware, software, or a combination of hardware and software. Also, where applicable, the various hardware components and/or software components set forth herein may be combined into composite components comprising software, hardware, and/or both without departing from the scope of the present application. Where applicable, the various hardware components and/or software components set forth herein may be separated into sub-components comprising software, hardware, or both without departing from the scope of the present application. Further, where applicable, it is contemplated that software components may be implemented as hardware components, and vice versa.
Software according to the present application (such as program code and/or data) can be stored on one or more computer storage media. It is also contemplated that the software identified herein may be implemented using one or more general purpose or special purpose computers and/or computer systems that are networked and/or otherwise. Where applicable, the order of various steps described herein may be changed, combined into composite steps, and/or separated into sub-steps to provide features described herein.
The embodiments and examples set forth herein are presented to best explain embodiments in accordance with the present technology and its particular application and to thereby enable those skilled in the art to make and utilize the application. Those skilled in the art, however, will recognize that the foregoing description and examples have been presented for the purpose of illustration and example only. The description as set forth is not intended to cover all aspects of the disclosure or to limit the disclosure to the precise form disclosed.
In view of the above, the scope of the present application is to be determined by the following claims.

Claims (36)

1. A system for monitoring the status of a network channel, the network channel comprising a plurality of nodes, the system comprising:
a plurality of monitoring units, each of the monitoring units being disposed at one of the plurality of nodes and configured to monitor a status of a network device at the respective node to generate a monitoring result;
a determination unit configured to determine whether a state of the network device is abnormal based on the monitoring result and to generate an automatic disposal command when the state is abnormal; and
a processing unit configured to perform exception handling operations for network devices at respective nodes based on the automatic disposal command.
2. The system of claim 1, wherein the exception handling operation comprises an active/standby switching operation between network devices.
3. The system of claim 1, wherein the network device comprises one or more of: switches, routers, firewalls, and intrusion prevention systems.
4. The system of claim 1, wherein the monitoring unit is implemented on a server that establishes a binding relationship with a network device at a respective node by connecting to a port of the network device.
5. The system of claim 1, wherein the monitoring unit is further configured to monitor the status of the network devices at the respective nodes using service commands based on one or more of the following network protocols: ICMP, TCP, and HTTP.
6. The system of claim 1, wherein the determination unit and the processing unit are deployed at a node where a core switch is located.
7. The system of claim 1, wherein the determination unit is further configured to:
periodically determine whether a state of a network device at a corresponding node is abnormal based on the monitoring result received from the monitoring unit;
generate the automatic disposal command and alarm information in response to a determination result that an abnormality has occurred; and
issue the automatic disposal command to the processing unit.
8. The system of claim 1, wherein the determination unit is further configured to:
monitor whether the monitoring unit uploads the monitoring result within a set time; and
generate alarm information in response to an event that the monitoring unit does not upload the monitoring result within the set time.
9. The system of claim 1, wherein the network channel is connected between data centers to provide a channel for data transmission between the two.
10. The system of claim 2, wherein the network device is connected to a power source via a PDU device, the processing unit being further configured to perform the active/standby switching operation between network devices by:
remotely controlling PDU devices connected to a pair of network devices having an active/standby relationship, so as to switch one of the PDU devices from a power-on state to a power-off state.
11. The system of claim 10, wherein the remote control is performed in a synchronous manner for the PDU devices connected to the pair of network devices having the active/standby relationship.
12. A method for monitoring the status of a network channel, the network channel comprising a plurality of nodes, the method comprising the following steps performed at a computer device:
A. receiving monitoring results from a plurality of monitoring units, wherein each of the monitoring units is deployed at one of the plurality of nodes and is configured to monitor a status of a network device at the respective node to generate monitoring results;
B. determining whether a state of the network device is abnormal based on the monitoring result and generating an automatic disposal command when the state is abnormal; and
C. sending the automatic disposal command to a processing unit to cause the processing unit to perform exception handling operations for network devices at respective nodes based on the automatic disposal command.
13. The method of claim 12, wherein the exception handling operation comprises an active/standby switching operation between network devices.
14. The method of claim 12, wherein the network device comprises one or more of: switches, routers, firewalls, and intrusion prevention systems.
15. The method of claim 12, wherein the monitoring unit has a binding relationship with the network devices at the respective nodes and is connected with ports of the bound network devices.
16. The method of claim 12, wherein the monitoring unit is further configured to monitor the status of the network device at the respective node using service commands based on one or more of the following network protocols: ICMP, TCP, and HTTP.
17. The method of claim 12, wherein the determination unit and the processing unit are deployed at a node where a core switch is located.
18. The method of claim 12, wherein step B comprises:
B1. periodically determining whether the state of the network device at the corresponding node is abnormal based on the monitoring result received from the monitoring unit;
B2. generating the automatic disposal command in response to a determination result that an abnormality has occurred; and
B3. issuing the automatic disposal command to the processing unit.
19. The method of claim 12, wherein the method further comprises:
D. generating alarm information in response to a determination result that an abnormality has occurred.
20. The method of claim 12, wherein the method further comprises:
E. monitoring whether the monitoring unit uploads the monitoring result within a set time; and
F. generating alarm information in response to an event that the monitoring unit does not upload the monitoring result within the set time.
21. The method of claim 12, wherein the network channel is connected between data centers to provide a channel for data transmission between the two.
22. The method of claim 13, wherein the network device is connected to a power supply via a PDU device, and wherein the automatic disposal command causes the processing unit to perform the active/standby switching operation between network devices by:
remotely controlling PDU devices connected to a pair of network devices having an active/standby relationship, so as to switch one of the PDU devices from a power-on state to a power-off state.
23. The method of claim 22, wherein the remote control is performed in a synchronous manner for the PDU devices connected to the pair of network devices having the active/standby relationship.
24. A computer device, comprising:
a memory;
a processor; and
a computer program stored on the memory and executable on the processor, execution of the computer program resulting in the following operations:
A. receiving monitoring results from a plurality of monitoring units, wherein each of the monitoring units is deployed at one of the plurality of nodes and is configured to monitor a status of a network device at the respective node to generate monitoring results;
B. determining whether a state of the network device is abnormal based on the monitoring result and generating an automatic disposal command when the state is abnormal; and
C. sending the automatic disposal command to a processing unit to cause the processing unit to perform exception handling operations for network devices at respective nodes based on the automatic disposal command.
25. The device of claim 24, wherein the exception handling operation comprises an active/standby switching operation between network devices.
26. The device of claim 24, wherein the network device comprises one or more of: switches, routers, firewalls, and intrusion prevention systems.
27. The device of claim 24, wherein the monitoring unit has a binding relationship with the network devices at the respective nodes and is connected with ports of the bound network devices.
28. The device of claim 24, wherein the monitoring unit is further configured to monitor the status of the network device at the respective node using service commands based on one or more of the following network protocols: ICMP, TCP, and HTTP.
29. The device of claim 24, wherein the determination unit and the processing unit are deployed at a node where a core switch is located.
30. The device of claim 24, wherein execution of the computer program performs operation B by:
B1. periodically determining whether the state of the network device at the corresponding node is abnormal based on the monitoring result received from the monitoring unit;
B2. generating the automatic disposal command in response to a determination result that an abnormality has occurred; and
B3. issuing the automatic disposal command to the processing unit.
31. The device of claim 24, wherein execution of the computer program further results in the following operations:
D. generating alarm information in response to a determination result that an abnormality has occurred.
32. The device of claim 24, wherein execution of the computer program further results in the following operations:
E. monitoring whether the monitoring unit uploads the monitoring result within a set time; and
F. generating alarm information in response to an event that the monitoring unit does not upload the monitoring result within the set time.
33. The device of claim 24, wherein the network channel is connected between data centers to provide a channel for data transmission between the two.
34. The device of claim 25, wherein the network device is connected to a power source via a PDU device, execution of the computer program causing the active/standby switching operation between network devices to be performed by:
remotely controlling PDU devices connected to a pair of network devices having an active/standby relationship, so as to switch one of the PDU devices from a power-on state to a power-off state.
35. The device of claim 34, wherein execution of the computer program further causes the PDU devices connected to the pair of network devices having the active/standby relationship to be remotely controlled in a synchronous manner.
36. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 12-23.
CN202111600863.9A 2021-12-24 2021-12-24 System, method, apparatus, and storage medium for monitoring status of network channels Pending CN114760224A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111600863.9A CN114760224A (en) 2021-12-24 2021-12-24 System, method, apparatus, and storage medium for monitoring status of network channels

Publications (1)

Publication Number Publication Date
CN114760224A true CN114760224A (en) 2022-07-15

Family

ID=82324943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111600863.9A Pending CN114760224A (en) 2021-12-24 2021-12-24 System, method, apparatus, and storage medium for monitoring status of network channels

Country Status (1)

Country Link
CN (1) CN114760224A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007105271A (en) * 2005-10-14 2007-04-26 Kita Denshi Corp Slot machine
CN106789435A (en) * 2016-12-29 2017-05-31 深圳市深信服电子科技有限公司 A kind of method for monitoring state and its device, data center and many live data centers
CN106789323A (en) * 2017-01-05 2017-05-31 深圳奇迹智慧网络有限公司 A kind of communication network management method and its device
CN107181623A (en) * 2017-06-29 2017-09-19 国家电网公司 Information network equipment fault handling method and device
CN107634863A (en) * 2017-10-25 2018-01-26 北京百悟科技有限公司 Distributed monitoring device and method for domain name mapping disaster tolerance service
CN108628717A (en) * 2018-03-02 2018-10-09 北京辰森世纪科技股份有限公司 A kind of Database Systems and monitoring method
CN109617716A (en) * 2018-11-30 2019-04-12 新华三技术有限公司合肥分公司 Data center's abnormality eliminating method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170296A (en) * 2023-04-21 2023-05-26 北京智享嘉网络信息技术有限公司 Automatic operation and maintenance management system and method for network
CN116170296B (en) * 2023-04-21 2023-08-08 北京智享嘉网络信息技术有限公司 Automatic operation and maintenance management system and method for network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination