CN118118468A - Abnormal communication recovery method, device and storage medium - Google Patents

Abnormal communication recovery method, device and storage medium

Info

Publication number
CN118118468A
Authority
CN
China
Prior art keywords
node
signaling
communication
abnormal
media
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410454622.5A
Other languages
Chinese (zh)
Inventor
唐昶荣
吴超杰
喇建国
陈志才
王沣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Star Network Communication Technology Co ltd
Original Assignee
Shenzhen Star Network Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Star Network Communication Technology Co ltd
Priority to CN202410454622.5A
Publication of CN118118468A
Legal status: Pending

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The application discloses an abnormal communication recovery method, device and storage medium, and belongs to the technical field of communication. The method comprises the following steps: when response information of a first node based on a communication detection signal is not received within a preset time, judging that the first node has a fault, and acquiring signaling and media information of the first node from a database; electing a target node from second nodes based on a preset election algorithm; and sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving them. According to the application, when node communication is abnormal, a communication detection signal is sent to the node; when no response information is received within the preset time, the node is judged to have a fault, and a target node is elected from the remaining nodes to take over the communication task of the node, thereby improving the stability of cluster communication.

Description

Abnormal communication recovery method, device and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method, an apparatus, and a storage medium for recovering abnormal communications.
Background
Cluster communication is a communication mode in which multiple servers in a network share resources and cooperate to achieve more efficient and secure data transmission and information exchange. However, when a failed node appears in the cluster network, communication may become abnormal. To avoid abnormal communication, each node in the cluster is usually checked periodically, and when a failed node is detected, a reset instruction is sent to it to trigger a reset mechanism that recovers the node.
In the related art, the reset mechanism of a failed node must be triggered by reset instructions from other nodes: after receiving a reset instruction, the failed node starts a reset program so that it is restored to a normal node. However, when the abnormality in the failed node is a device fault or a network fault, the failed node cannot respond to the reset instruction and start the reset program. This results in low stability of cluster communication.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present application and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The application mainly aims to provide an abnormal communication recovery method, which aims to solve the technical problem of lower cluster communication stability.
In order to achieve the above object, the present application provides an abnormal communication recovery method comprising the steps of:
when response information of a first node based on a communication detection signal is not received within a preset time, judging that the first node has a fault, and acquiring signaling and media information of the first node from a database;
electing a target node from second nodes based on a preset election algorithm;
and sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving them.
Optionally, before the step of judging that the first node has a fault when the response information of the first node based on the communication detection signal is not received within the preset time, and acquiring the signaling and media information of the first node from the database, the method further includes:
Determining real-time operation parameters of a signaling system and a media system of a node corresponding to the heartbeat signal according to the received heartbeat signal;
judging that the first node is abnormal when the heartbeat signal of the first node is not received in preset heartbeat time or the real-time operation parameter is abnormal;
A communication detection signal is transmitted to the first node.
Optionally, before the step of determining, according to the received heartbeat signal, the real-time operation parameters of the signaling system and the media system of the node corresponding to the heartbeat signal, the method further includes:
Monitoring a signaling system and a media system in a current node, and acquiring the real-time operation parameters of the signaling system and the media system;
and sending the heartbeat signal to all communication nodes in the cluster based on a preset time period and the real-time operation parameters.
Optionally, after the step of monitoring the signaling system and the media system in the current node and acquiring the real-time operation parameters of the signaling system and the media system, the method further includes:
When receiving the communication detection signals sent by other communication nodes, determining real-time operation parameters of the signaling system and the media system, and sending the response information to the communication nodes based on the real-time operation parameters;
When the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, replacing the target IP address of the current node with an abnormal node IP, and controlling the restarting of the signaling system and the media system;
and executing the step of electing the target node from the second nodes based on the preset election algorithm.
Optionally, after the step of replacing the target IP address of the current node with the abnormal node IP and controlling the restart of the signaling system and the media system when the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, the method further includes:
After the signaling system and the media system in the current node are restarted, determining real-time operation parameters of the signaling system and the media system;
and when the real-time operation parameter is normal, sending a communication recovery signal to the target node, and replacing the abnormal node IP of the current node with the target IP address.
Optionally, the step of judging that the first node has a fault when the response information of the first node based on the communication detection signal is not received within the preset time, and acquiring the signaling and media information of the first node from the database, includes:
judging that the state of the first node is network abnormality and/or equipment abnormality when the response information of the first node based on the communication detection signal is not received within the preset time;
And acquiring the signaling and the media information of the first node from a database shared by clusters.
Optionally, after the step of sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving them, the method further includes:
when the target node is selected, acquiring the IP address of the first node;
After receiving the signaling and the media information, identifying the signaling and the media information, and reloading communication information according to the signaling and the media information;
and constructing a monitoring network of the signaling and the media information according to the IP address.
Optionally, the step of electing the target node in the second node based on a preset election algorithm includes:
sending an election signal to the sentinels of all second nodes in the cluster, wherein the sentinel of a second node returns an election response when it receives the election signal and the load of its second node is smaller than a preset load;
After receiving the election response, determining the load of the second node according to the election response;
and determining the second node with the minimum load, and selecting the second node as the target node.
In addition, in order to achieve the above object, the present application also provides an abnormal communication recovery apparatus including: a memory, a processor, and an abnormal communication restoration program stored on the memory and executable on the processor, the abnormal communication restoration program configured to implement the steps of the abnormal communication restoration method as described above.
In addition, in order to achieve the above object, the present application also provides a storage medium having stored thereon an abnormal communication recovery program which, when executed by a processor, implements the steps of the abnormal communication recovery method as described above.
When node communication is abnormal, the application sends a communication detection signal to the node to confirm the cause of the abnormality. When no response information is received within the preset time, the node is judged to have a fault; its signaling and media information are acquired from the cluster database, a target node is elected from the other nodes, and the signaling and media information are sent to the target node so that the target node takes over the communication task of the failed node. Electing a target node to resume communication in this way improves the stability of cluster communication when the server of a communication node has a network problem or an equipment fault. In addition, because the target node takes over the communication task of the abnormal node by a real-time hot standby method, session information can be reloaded and the call restored quickly, ensuring that the audio and video call service is not interrupted.
Drawings
FIG. 1 is a flow chart of a first embodiment of an abnormal communication recovery method of the present application;
FIG. 2 is a flowchart of a second embodiment of an abnormal communication recovery method according to the present application;
FIG. 3 is a flowchart of a third embodiment of an abnormal communication recovery method according to the present application;
FIG. 4 is a flowchart of a method for recovering abnormal communication according to a fourth embodiment of the present application;
FIG. 5 is a flowchart of a fifth embodiment of an abnormal communication recovery method according to the present application;
Fig. 6 is a schematic structural diagram of an abnormal communication recovery device of a hardware running environment according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
According to the application, when the node communication is abnormal, a communication detection signal is sent to the node, and when the response information is not received within the preset time, the node is judged to have a fault, and the target node is selected from the rest nodes, so that the communication task of the node is taken over, and the stability of the cluster communication is improved.
In order that the above-described aspects may be better understood, exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the application to those skilled in the art.
In order to better understand the above technical solutions, the following detailed description will refer to the accompanying drawings and specific embodiments.
Example 1
In audio and video communication, signaling negotiation and media transmission typically support only memory-level backup, so abnormal communication faults occur easily; for example, softphone software or an IP phone interface shows a call in progress but there is no sound, and the call cannot recover automatically. Although audio and video platforms generally support clustering, call recovery still cannot be achieved when a single service fails. In order to recover communication quickly and improve the stability of cluster communication, an embodiment of the present application provides an abnormal communication recovery method. Referring to fig. 1, fig. 1 is a schematic flow chart of a first embodiment of the abnormal communication recovery method of the present application.
In this embodiment, the abnormal communication recovery method includes:
Step S10: judging that the first node has a fault when response information of the first node based on the communication detection signal is not received within preset time, and acquiring signaling and media information of the first node from a database;
In this embodiment, the communication service deploys communication servers as a cluster, and each node corresponds to at least one service device, i.e., a node server. Communication nodes in the cluster can communicate with each other. Each communication node comprises a signaling system, a media system and a sentinel. The signaling system monitors, selects and operates communication according to the signaling of the communication node; the media system carries out information transmission during communication according to the media information; and the sentinel monitors the signaling system and the media system to ensure that they operate normally. In addition, the node servers of the communication nodes in the cluster share a database, which stores the signaling and media information of all nodes.
In a specific implementation, when the response information of the first node based on the communication detection signal is not received within the preset time, a sentinel in the communication node judges that the first node has a fault and that its state is network abnormality and/or equipment abnormality. The sentinel then acquires the signaling and the media information of the first node from the database shared by the cluster.
Optionally, because the distances between nodes differ, signal transmission times may also differ. To ensure quick recovery of abnormal communication, the sentinel in the cluster that first determines that the first node has a fault acquires the signaling and media information of the first node from the database and marks the information corresponding to the first node in the database, so that other sentinels do not acquire it again. It should be noted that the first node is the node subject to communication detection in the cluster, and the second nodes are the communication nodes in the cluster other than the first node.
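The mark-on-acquire behaviour described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `SharedDatabase` class, its in-memory storage, and the `claim` method are hypothetical stand-ins for the cluster-shared database, and a real deployment would need an atomic operation provided by the actual database.

```python
import threading


class SharedDatabase:
    """Hypothetical in-memory stand-in for the database shared by the cluster."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}     # node_id -> {"signaling": ..., "media": ...}
        self._claimed_by = {}  # node_id -> id of the sentinel that marked the record

    def store(self, node_id, signaling, media):
        """Save the signaling and media information of a node."""
        with self._lock:
            self._records[node_id] = {"signaling": signaling, "media": media}

    def claim(self, node_id, sentinel_id):
        """Return the node's record to the first sentinel that asks for it.

        Later callers get None because the record is already marked, which
        avoids repeated acquisition by other sentinels.
        """
        with self._lock:
            if node_id not in self._records or node_id in self._claimed_by:
                return None
            self._claimed_by[node_id] = sentinel_id
            return dict(self._records[node_id])
```

In this sketch the first sentinel to call `claim` for a failed node receives its record, and every later caller receives `None` and can stand down.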
Step S20: electing a target node from the second nodes based on a preset election algorithm;
In this embodiment, the communication node may send an election signal to all the sentinels of the second nodes in the cluster via the sentinels. And returning an election response to the communication node initiating the election when the sentinel of the second node receives the election signal and when the load of the corresponding second node is smaller than the preset load. The election response comprises the load of the corresponding second node and the real-time operation parameters of the signaling system and the media system.
Specifically, the communication node may directly select the target node according to the load condition of the second node, or may score the second node through a weighted summation algorithm, so as to select the target node according to the scoring result.
As an alternative embodiment of electing the target node, the communication node initiating the election may choose the target node based on the load of the second node. After receiving the election response, the communication node may determine a load of the second node according to the election response, determine the second node with the smallest load, and select the second node as the target node.
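The load-based election can be sketched as below. The names `ElectionResponse`, `respond_to_election` and `elect_least_loaded` are hypothetical, and load is assumed to be a single number where smaller means less busy; the text does not specify the unit.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ElectionResponse:
    node_id: str
    load: float  # assumed: smaller means less busy


def respond_to_election(node_id: str, load: float, preset_load: float) -> Optional[ElectionResponse]:
    """Sentinel of a second node: answer only when its load is below the preset load."""
    if load < preset_load:
        return ElectionResponse(node_id, load)
    return None


def elect_least_loaded(responses: Iterable[Optional[ElectionResponse]]) -> Optional[str]:
    """Initiating node: elect the responding second node with the smallest load."""
    candidates = [r for r in responses if r is not None]
    if not candidates:
        return None  # no second node is able to take over
    return min(candidates, key=lambda r: r.load).node_id
```

Nodes whose load is at or above the preset load never respond, so they are excluded before the minimum is taken.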
As another alternative embodiment of electing the target node, the communication node may score each second node according to its election response. The communication node may weight the real-time operation parameter of the signaling system, the real-time operation parameter of the media system, and the load by their respective preset weights, and sum the weighted values to obtain a final score for each second node. The score evaluates the communication processing capability of the second node more comprehensively than the load alone. The target node may then be elected based on the score.
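A weighted-sum score of this kind might look like the sketch below. The normalisation is an assumption made only for illustration: the signaling and media parameters are reduced to health values in [0, 1] (higher is better), load is a fraction in [0, 1] (lower is better), and the default weights are arbitrary. None of these values come from the source.

```python
def score_node(signaling_health, media_health, load, weights=(0.3, 0.3, 0.4)):
    """Weighted sum of the three inputs; a higher score means a better candidate.

    Load is inverted (1 - load) so that a lightly loaded node scores higher.
    """
    w_signaling, w_media, w_load = weights
    return (w_signaling * signaling_health
            + w_media * media_health
            + w_load * (1.0 - load))


def elect_by_score(responses, weights=(0.3, 0.3, 0.4)):
    """responses maps node_id -> (signaling_health, media_health, load)."""
    if not responses:
        return None
    return max(responses, key=lambda node_id: score_node(*responses[node_id], weights))
```

With these assumed weights, a node with slightly lower health but much lower load can outrank a healthier but busier node, which is the kind of trade-off the score is meant to capture.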
Step S30: and sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving the signaling and the media information.
In this embodiment, after determining the target node, the communication node sends the acquired signaling and media information to the signaling system and the media system of the target node through the sentinel respectively.
Further, when the communication node initiating the election is selected as the target node, the sentinel in the communication node will send signaling and media information to the signaling system and media system of the communication node.
For example, after any communication node in the cluster is selected as a target node and receives signaling and media information, the received signaling and media information is identified, and the communication information is reloaded according to the signaling and the media information. In addition, the sentinel of the node can acquire the IP address of the first node in the database shared by the clusters, and reconstruct a monitoring network of signaling and media information according to the IP address.
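The takeover on the target node can be sketched as follows. Everything here is illustrative: the `TargetNode` class, the per-call dictionaries, and the decision to skip calls that lack media information are assumptions. Rebuilding the monitoring network is interpreted simply as recording the failed node's IP address as one this node now serves.

```python
class TargetNode:
    """Hypothetical model of the elected node's takeover state."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.sessions = {}            # call_id -> reloaded communication information
        self.served_addresses = set()  # IP addresses this node now monitors

    def take_over(self, signaling, media, failed_node_ip):
        """Reload communication information and start serving the failed node's IP.

        signaling and media are assumed to be dicts keyed by call id.
        """
        for call_id, signaling_state in signaling.items():
            if call_id not in media:
                continue  # assumed policy: a call without media info cannot be restored
            self.sessions[call_id] = {
                "signaling": signaling_state,
                "media": media[call_id],
            }
        self.served_addresses.add(failed_node_ip)
        return sorted(self.sessions)
```

The returned list of call ids gives the caller a simple way to see which sessions were reloaded.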
According to the embodiment of the application, when the node communication is abnormal, the communication detection signal is sent to the node, and when the response information is not received within the preset time, the node is judged to have a fault, the target node is selected from the rest nodes, and the communication task of the node is taken over, so that the communication recovery speed is improved, and the stability of cluster communication is improved.
Example 2
Based on the same inventive concept, the present application also provides a second embodiment, referring to fig. 2, fig. 2 is a schematic flow chart of a second embodiment of the abnormal communication recovery method of the present application.
In this embodiment, when the response information of the first node based on the communication detection signal is not received within the preset time as in step S10, it is determined that the first node has a fault, and before the signaling and the media information of the first node are acquired in the database, the method further includes:
step S11: monitoring a signaling system and a media system in a current node, and acquiring real-time operation parameters of the signaling system and the media system;
Step S12: and sending the heartbeat signal to all communication nodes in the cluster based on a preset time period and the real-time operation parameters.
In this embodiment, the sentinels of each node in the cluster are deployed in a one-to-one correspondence with the signaling system and the media system, and are configured in one or more servers. Each sentinel monitors the signaling system and the media system in the current node, and real-time operation parameters of the signaling system and the media system are obtained in real time, so that normal operation of communication is ensured.
In specific implementation, in order to prevent failure of the server from causing monitoring failure of the sentry, the sentry needs to generate a heartbeat signal according to real-time operation parameters of the signaling system and the media system based on a preset time period, and send the heartbeat signal to all communication nodes in the cluster, so that information of normal operation of communication is transmitted to the cluster.
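A heartbeat that carries the real-time operation parameters can be sketched as below. The JSON layout, the field names, and the `send` callback are assumptions; the text does not specify a wire format or transport. A scheduler would call `broadcast_heartbeat` once per preset time period.

```python
import json
import time


def build_heartbeat(node_id, signaling_params, media_params, now=None):
    """Serialise the current node's real-time operation parameters (assumed JSON layout)."""
    return json.dumps(
        {
            "node_id": node_id,
            "timestamp": time.time() if now is None else now,
            "signaling": signaling_params,
            "media": media_params,
        },
        sort_keys=True,
    )


def broadcast_heartbeat(peers, send, payload):
    """Send one heartbeat to every communication node in the cluster.

    send(peer, payload) is a caller-supplied transport function (hypothetical).
    Returns the number of peers the heartbeat was sent to.
    """
    count = 0
    for peer in peers:
        send(peer, payload)
        count += 1
    return count
```

Passing the transport in as a callback keeps the sketch independent of any particular network library.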
In the embodiment of the application, on the basis of cluster deployment, the communication service monitors the signaling system and the media system of the current node through the sentinel and periodically sends heartbeat signals to all communication nodes in the cluster, thereby ensuring the normal operation of communication.
Since the system described in the second embodiment of the present application is a system for implementing the method in the first embodiment of the present application, based on the method described in the first embodiment of the present application, a person skilled in the art can understand the specific structure and the modification of the system, and therefore, the description thereof is omitted herein. All systems used in the method according to the first embodiment of the present application are within the scope of the present application.
Example 3
Based on the same inventive concept, the present application also provides a third embodiment, referring to fig. 3, and fig. 3 is a schematic flow chart of a third embodiment of the abnormal communication recovery method of the present application.
In this embodiment, when the response information of the first node based on the communication detection signal is not received within the preset time as in step S10, it is determined that the first node has a fault, and before the signaling and the media information of the first node are acquired in the database, the method further includes:
step S13: determining real-time operation parameters of a signaling system and a media system of a node corresponding to the heartbeat signal according to the received heartbeat signal;
step S14: judging that the first node is abnormal when the heartbeat signal of the first node is not received within a preset heartbeat time or the real-time operation parameter carried by the heartbeat signal is abnormal;
Step S15: a communication detection signal is transmitted to the first node.
In this embodiment, the communication node receives heartbeat signals sent by other communication nodes, and determines real-time operation parameters of the signaling system and the media system in the corresponding node according to the heartbeat signals. The communication node can judge whether the signaling system and the media system are operated normally or not according to the real-time operation parameters.
Specifically, when the heartbeat signal of the first node is not received within the preset heartbeat time, or the real-time operation parameter carried by the heartbeat signal is abnormal, the communication node judges that the first node is abnormal and sends a communication detection signal to the sentinel of the first node. The sentinel of the first node returns corresponding response information based on the communication detection signal. If the response information is received, the sentinel is working normally and the abnormality lies in the operation of the signaling system and the media system. If the response information is not received, the first node cannot respond to the communication detection signal, which means the first node has a network fault or an equipment fault.
Illustratively, when a sentinel detects that a node is abnormal, it sends a ping request to the abnormal node to determine whether the problem is a system fault or a network fault. Based on whether the abnormal node responds to the ping, the sentinel can infer the cause of the abnormality and take corresponding action.
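The two-stage diagnosis (heartbeat check first, then a communication detection signal) can be sketched as a small decision function. The `NodeState` names and the `probe` callback, which is assumed to return `True` when response information arrives within the preset time, are hypothetical.

```python
from enum import Enum


class NodeState(Enum):
    NORMAL = "normal"
    SERVICE_ABNORMAL = "signaling/media abnormal, sentinel alive"
    NODE_FAULT = "network or equipment fault"


def diagnose(seconds_since_heartbeat, heartbeat_timeout, params_ok, probe):
    """Classify a peer node from its heartbeat and, if needed, a detection probe.

    probe() is called only when the node is already suspected, mirroring the
    order of steps S13 to S15.
    """
    suspected = seconds_since_heartbeat > heartbeat_timeout or not params_ok
    if not suspected:
        return NodeState.NORMAL
    if probe():
        # The sentinel answered, so the node is reachable; the services are at fault.
        return NodeState.SERVICE_ABNORMAL
    # No response within the preset time: treat as a network or equipment fault.
    return NodeState.NODE_FAULT
```

Only the `NODE_FAULT` outcome leads to the database lookup and election of step S10 onward; a `SERVICE_ABNORMAL` node is still reachable.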
In the embodiment of the application, the communication node determines the running state of the first node through the heartbeat signal, and diagnoses the abnormal state of the first node by sending the communication detection signal when the running state of the first node is abnormal, so as to determine the abnormal communication reason of the first node.
Since the system described in the third embodiment of the present application is a system for implementing the method of the first embodiment of the present application, based on the method described in the first embodiment of the present application, a person skilled in the art can understand the specific structure and the modification of the system, and therefore, the description thereof is omitted herein. All systems used in the method according to the first embodiment of the present application are within the scope of the present application.
Example 4
Based on the same inventive concept, the present application also provides a fourth embodiment, referring to fig. 4, and fig. 4 is a schematic flow chart of a fourth embodiment of the abnormal communication recovery method of the present application.
In this embodiment, the signaling and the media information are sent to the target node as in step S30, where after the target node receives the signaling and the media information, after performing communication recovery based on the signaling and the media information, the method further includes:
step S31: when receiving the communication detection signals sent by other communication nodes, determining real-time operation parameters of the signaling system and the media system, and sending the response information to the communication nodes based on the real-time operation parameters;
step S32: when the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, replacing the target IP address of the current node with an abnormal node IP, and controlling the restarting of the signaling system and the media system;
Step S33: and executing the step of electing the target node from the second nodes based on the preset election algorithm.
In this embodiment, when the sentinel receives a communication detection signal from another communication node, this indicates that the other communication node has judged the signaling system and the media system of the current node to be abnormal. The sentinel determines the real-time operation parameters of the signaling system and the media system and sends response information based on them to that communication node.
In a specific implementation, the sentinel needs to further confirm the state of the signaling system and the media system of the current node. When the real-time operation parameters of the signaling system or the media system are abnormal and have not recovered within the preset time, the sentinel controls the signaling system and the media system to restart and replaces the target IP address of the current node with the abnormal node IP, so that the node temporarily stops accepting communication.
When the sentinel determines that the operation parameters of the signaling system and the media system of the current node are abnormal, the sentinel of the current node initiates the election of the target node. It should be noted that this method also applies when the sentinel detects the abnormality through its own monitoring of the signaling system and the media system.
Optionally, when the sentinel determines that the real-time operation parameters of the signaling system or the media system are abnormal, the signaling system or the media system can be reset through a reset mechanism provided in the communication node.
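The local handling in steps S32 and S33 can be sketched as a small state holder driven by periodic health observations. The `LocalSentinel` class, the grace period parameter, and the recorded action names are hypothetical; the sketch records the actions rather than performing real restarts or address changes.

```python
class LocalSentinel:
    """Hypothetical sentinel state for the current node."""

    def __init__(self, target_ip, abnormal_ip, grace_seconds):
        self.target_ip = target_ip      # address on which the node normally serves
        self.abnormal_ip = abnormal_ip  # placeholder address used while the node is abnormal
        self.current_ip = target_ip
        self.grace_seconds = grace_seconds  # the "preset time" allowed for self-recovery
        self.abnormal_since = None
        self.actions = []

    def observe(self, healthy, now):
        """Feed one health observation of the signaling and media systems."""
        if healthy:
            self.abnormal_since = None  # recovered on its own; nothing to do
            return
        if self.abnormal_since is None:
            self.abnormal_since = now   # start of the abnormal period
            return
        expired = now - self.abnormal_since >= self.grace_seconds
        if expired and self.current_ip == self.target_ip:
            # Stop accepting communication, restart both systems, then start the election.
            self.current_ip = self.abnormal_ip
            self.actions += ["restart_signaling", "restart_media", "start_election"]
```

The check on `current_ip` ensures the swap, restart and election are triggered once per abnormal period rather than on every later observation.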
Since the system described in the fourth embodiment of the present application is a system for implementing the method of the first embodiment of the present application, based on the method described in the first embodiment of the present application, a person skilled in the art can understand the specific structure and the modification of the system, and therefore, the description thereof is omitted herein. All systems used in the method according to the first embodiment of the present application are within the scope of the present application.
Example 5
Based on the same inventive concept, the present application also provides a fifth embodiment, referring to fig. 5, and fig. 5 is a schematic flow chart of a fifth embodiment of the abnormal communication recovery method of the present application.
In this embodiment, after step S32 of replacing the target IP address of the current node with the abnormal node IP and controlling the signaling system and the media system in the current node to restart when the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, the method further includes:
step S34: after the signaling system and the media system are restarted, determining the running states of the signaling system and the media system in the current node;
Step S35: when the running state is normal, sending a communication recovery signal to the target node, and replacing the abnormal node IP of the current node with the target IP address.
In this embodiment, after the signaling system and the media system are reset or restarted, the communication node determines, through the sentinel, whether the operation states of the signaling system and the media system are normal. When the running state is normal, the sentinel sends a communication recovery signal to the target node, and replaces the current abnormal node IP with the target IP address so as to recover the communication of the current node.
Optionally, when the signaling system and the media system are reset or restarted and the running state is still abnormal, the sentinel can reset or restart the signaling system and the media system again and send prompt information to the management terminal.
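Steps S34 and S35, together with the optional retry described above, can be sketched as follows. This is only an illustrative sketch, not the applicant's implementation: the names `NodeAddress`, `finish_restart`, and `max_extra_restarts` are hypothetical, and bounding the number of extra restarts is an assumption that the text does not state.

```python
from dataclasses import dataclass


@dataclass
class NodeAddress:
    target_ip: str    # address peers normally use to reach this node
    abnormal_ip: str  # placeholder address held while the node is isolated
    current_ip: str   # address currently in use


def finish_restart(addr, systems_ok, send_recovery_signal, restart, alert,
                   max_extra_restarts=2):
    """Post-restart check; returns True once the node is back in service.

    systems_ok() is assumed to return True only when both the signaling
    system and the media system report a normal running state.
    """
    for attempt in range(max_extra_restarts + 1):
        if systems_ok():
            send_recovery_signal()            # step S35: notify the target node
            addr.current_ip = addr.target_ip  # swap the abnormal IP back
            return True
        alert("signaling/media system still abnormal after restart")
        if attempt < max_extra_restarts:
            restart()                         # optional extra reset or restart
    return False
```

The callables passed in stand for whatever mechanisms the sentinel uses to probe, restart, and notify; they are placeholders so that the control flow can be read on its own.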
Because the system described in the fifth embodiment of the present application is a system for implementing the method of the first embodiment, a person skilled in the art can understand its specific structure and variations from the method described in the first embodiment, so they are not described again here. All systems used to carry out the method of the first embodiment of the present application fall within the scope of protection of the present application.
Example six
Referring to fig. 6, fig. 6 is a schematic structural diagram of the abnormal communication recovery apparatus in the hardware operating environment involved in an embodiment of the present application.
As shown in fig. 6, the abnormal communication recovery apparatus may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. Optionally, the memory 1005 may also be a storage device separate from the processor 1001.
It will be appreciated by those skilled in the art that the structure shown in fig. 6 does not constitute a limitation of the abnormal communication recovery apparatus, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in fig. 6, an operating system, a data storage module, a network communication module, a user interface module, and an abnormal communication recovery program may be included in the memory 1005 as one type of storage medium.
In the abnormal communication recovery apparatus shown in fig. 6, the network interface 1004 is mainly used for data communication with other apparatuses, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 may be provided in the abnormal communication recovery apparatus, which calls the abnormal communication recovery program stored in the memory 1005 through the processor 1001 and performs the following steps:
judging that the first node has a fault when response information of the first node based on the communication detection signal is not received within preset time, and acquiring signaling and media information of the first node from a database;
electing a target node from the second nodes based on a preset election algorithm;
And sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving the signaling and the media information.
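The three steps above can be read as one failover routine. The following is a minimal sketch under stated assumptions: `fail_over`, `probe`, `elect`, and `send` are hypothetical names, the shared database is modelled as a plain dictionary keyed by node name, and `probe` is assumed to return False when no response information arrives within the preset time.

```python
def fail_over(first_node, probe, database, second_nodes, elect, send):
    """Return the elected target node, or None if first_node answered in time."""
    if probe(first_node):
        return None                      # response received: no fault
    # Fault judged: fetch the failed node's signaling and media information.
    session_info = database[first_node]
    # Elect a target node among the remaining (second) nodes.
    target = elect(second_nodes)
    # Hand the information over so the target can recover the communication.
    send(target, session_info)
    return target
```

Passing the probe, election, and transport as callables keeps the sketch independent of any particular network stack or election rule.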
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
Determining real-time operation parameters of a signaling system and a media system of a node corresponding to the heartbeat signal according to the received heartbeat signal;
judging that the first node is abnormal when the heartbeat signal of the first node is not received in preset heartbeat time or the real-time operation parameter is abnormal;
A communication detection signal is transmitted to the first node.
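As an illustration of the abnormality check above, the sketch below flags a node when its last heartbeat is older than the preset heartbeat time or when its reported parameters look abnormal. The function name, the `(timestamp, params)` record layout, and the `is_param_abnormal` predicate are assumptions made for the example; the text does not specify which operation parameters are checked.

```python
def find_abnormal_nodes(last_heartbeat, now, heartbeat_timeout, is_param_abnormal):
    """Return the sorted ids of nodes that should receive a communication detection signal.

    last_heartbeat maps node id -> (timestamp of last heartbeat, reported params).
    """
    abnormal = []
    for node_id, (timestamp, params) in last_heartbeat.items():
        overdue = now - timestamp > heartbeat_timeout
        if overdue or is_param_abnormal(params):
            abnormal.append(node_id)
    return sorted(abnormal)
```

Each node id returned would then be sent a communication detection signal, and only a missing response to that signal leads to the fault judgment.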
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
Monitoring a signaling system and a media system in a current node, and acquiring the real-time operation parameters of the signaling system and the media system;
and sending the heartbeat signal to all communication nodes in the cluster based on a preset time period and the real-time operation parameters.
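A heartbeat that carries the real-time operation parameters might look like the sketch below. The JSON layout, the field names, and the helpers `build_heartbeat` and `broadcast` are hypothetical; the text only states that the heartbeat is sent to all communication nodes on a preset period and is based on the operation parameters.

```python
import json


def build_heartbeat(node_id, signaling_params, media_params, timestamp):
    """Serialize one heartbeat carrying the current operation parameters."""
    return json.dumps(
        {
            "node": node_id,
            "ts": timestamp,
            "signaling": signaling_params,
            "media": media_params,
        },
        sort_keys=True,
    )


def broadcast(peers, payload, send):
    """Send the same heartbeat to every peer; returns how many were sent."""
    for peer in peers:
        send(peer, payload)
    return len(peers)
```

A caller would invoke `build_heartbeat` and `broadcast` once per preset time period, for example from a timer owned by the sentinel.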
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
When receiving the communication detection signals sent by other communication nodes, determining real-time operation parameters of the signaling system and the media system, and sending the response information to the communication nodes based on the real-time operation parameters;
When the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, replacing the target IP address of the current node with an abnormal node IP, and controlling the restarting of the signaling system and the media system;
and executing the step of electing the target node from the second nodes based on the preset election algorithm.
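The "abnormal and not recovered within the preset time" condition can be tracked with a small state holder, sketched here. The class name `SelfCheck`, the three result strings, and the use of caller-supplied timestamps are assumptions for illustration only.

```python
class SelfCheck:
    """Tracks how long the local signaling/media systems have been abnormal."""

    def __init__(self, recovery_window):
        self.recovery_window = recovery_window  # the preset time
        self.abnormal_since = None

    def observe(self, healthy, now):
        """Return 'ok', 'waiting', or 'isolate' for the latest observation."""
        if healthy:
            self.abnormal_since = None
            return "ok"
        if self.abnormal_since is None:
            self.abnormal_since = now
        if now - self.abnormal_since >= self.recovery_window:
            # Caller would now replace the target IP with the abnormal node IP,
            # restart both systems, and trigger the election of a target node.
            return "isolate"
        return "waiting"
```

Returning a decision instead of performing the IP replacement directly keeps the timing logic separate from the side effects, which is a design choice of this sketch rather than something the text requires.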
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
After the signaling system and the media system in the current node are restarted, determining real-time operation parameters of the signaling system and the media system;
and when the real-time operation parameter is normal, sending a communication recovery signal to the target node, and replacing the abnormal node IP of the current node with the target IP address.
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
judging that the state of the first node is network abnormality and/or equipment abnormality when the response information of the first node based on the network detection signal is not received within preset time;
And acquiring the signaling and the media information of the first node from a database shared by clusters.
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
when the target node is selected, acquiring the IP address of the first node;
After receiving the signaling and the media information, identifying the signaling and the media information, and reloading communication information according to the signaling and the media information;
and constructing a monitoring network of the signaling and the media information according to the IP address.
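On the target node, the takeover steps above could be sketched as follows. This assumes that each record in the received information carries a `kind` ("signaling" or "media"), a `session_id`, and a `port`, and it reads "monitoring network" as a set of listening endpoints on the first node's IP address; both the record layout and that reading are interpretations, not statements from the text.

```python
def take_over(first_node_ip, records):
    """Reload sessions and derive the endpoints to listen on for the failed node."""
    sessions = {"signaling": {}, "media": {}}
    endpoints = set()
    for record in records:
        kind = record["kind"]
        if kind not in sessions:
            raise ValueError(f"unknown record kind: {kind!r}")
        # Identify the record and reload it into the matching session table.
        sessions[kind][record["session_id"]] = record
        # Listen on the first node's address so peers need no reconfiguration.
        endpoints.add((first_node_ip, record["port"]))
    return sessions, sorted(endpoints)
```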
Further, the abnormal communication recovery apparatus calls an abnormal communication recovery program stored in the memory 1005 through the processor 1001, and performs the following steps:
sending an election signal to the sentinels of all second nodes in the cluster, wherein the sentinel of a second node returns an election response when it receives the election signal and the load of the corresponding second node is smaller than a preset load;
After receiving the election response, determining the load of the second node according to the election response;
and determining the second node with the minimum load, and selecting the second node as the target node.
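The election described above reduces to filtering by the preset load and taking the minimum. A minimal sketch, assuming loads are comparable numbers and that ties are broken by node id (the tie-breaking rule is an assumption; the text does not define one):

```python
def elect_target(loads, preset_load):
    """Pick the least-loaded second node among those that would respond.

    loads maps node id -> current load. Only nodes whose load is below
    preset_load return an election response.
    """
    responses = {node: load for node, load in loads.items() if load < preset_load}
    if not responses:
        return None  # no second node is eligible to take over
    # Minimum load wins; node id breaks ties (assumed rule).
    return min(responses, key=lambda node: (responses[node], node))
```

For example, with loads of 0.4, 0.2, and 0.9 on nodes b, c, and d and a preset load of 0.8, node d does not respond and node c is elected.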
In addition, the present application also provides a computer-readable storage medium storing the abnormal communication recovery program, the abnormal communication recovery program being further executable by a processor for implementing the steps of the above-described embodiments of the abnormal communication recovery method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. An abnormal communication recovery method, applied to a communication node, comprising the steps of:
Judging that the first node has a fault when response information of the first node based on the communication detection signal is not received within preset time, and acquiring signaling and media information of the first node from a database;
electing a target node from the second nodes based on a preset election algorithm;
And sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving the signaling and the media information.
2. The abnormal communication recovery method according to claim 1, wherein when the response information of the first node based on the communication detection signal is not received within a preset time, the step of judging that the first node has a fault and acquiring signaling and media information of the first node in the database further comprises:
Determining real-time operation parameters of a signaling system and a media system of a node corresponding to the heartbeat signal according to the received heartbeat signal;
judging that the first node is abnormal when the heartbeat signal of the first node is not received in preset heartbeat time or the real-time operation parameter is abnormal;
A communication detection signal is transmitted to the first node.
3. The abnormal communication recovery method according to claim 2, wherein before the step of determining the operation parameters of the server of the node corresponding to the heartbeat signal according to the received heartbeat signal, the abnormal communication recovery method further comprises:
Monitoring a signaling system and a media system in a current node, and acquiring the real-time operation parameters of the signaling system and the media system;
and sending the heartbeat signal to all communication nodes in the cluster based on a preset time period and the real-time operation parameters.
4. The abnormal communication recovery method according to claim 3, wherein after the step of monitoring the signaling system and the media system in the current node and acquiring the real-time operation parameters of the signaling system and the media system, further comprising:
When receiving the communication detection signals sent by other communication nodes, determining real-time operation parameters of the signaling system and the media system, and sending the response information to the communication nodes based on the real-time operation parameters;
When the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, replacing the target IP address of the current node with an abnormal node IP, and controlling the restarting of the signaling system and the media system;
and executing the step of electing the target node from the second nodes based on the preset election algorithm.
5. The abnormal communication recovery method according to claim 4, wherein said step of replacing the target IP address of the current node with an abnormal node IP when the real-time operation parameters of the signaling system or the media system are abnormal and not recovered within the preset time, and controlling the restarting of the signaling system and the media system, further comprises:
After the signaling system and the media system in the current node are restarted, determining real-time operation parameters of the signaling system and the media system;
and when the real-time operation parameter is normal, sending a communication recovery signal to the target node, and replacing the abnormal node IP of the current node with the target IP address.
6. The abnormal communication recovery method of claim 1, wherein the step of judging that the first node has a fault and acquiring signaling and media information of the first node in a database when no response information of the first node based on a network detection signal is received within a preset time comprises:
judging that the state of the first node is network abnormality and/or equipment abnormality when the response information of the first node based on the network detection signal is not received within preset time;
And acquiring the signaling and the media information of the first node from a database shared by clusters.
7. The abnormal communication recovery method of claim 1, wherein after the step of sending the signaling and the media information to the target node, wherein the target node performs communication recovery based on the signaling and the media information after receiving the signaling and the media information, the method further comprises:
when the target node is selected, acquiring the IP address of the first node;
After receiving the signaling and the media information, identifying the signaling and the media information, and reloading communication information according to the signaling and the media information;
and constructing a monitoring network of the signaling and the media information according to the IP address.
8. The abnormal communication recovery method according to claim 1, wherein the step of electing the target node in the second node based on a preset election algorithm comprises:
sending an election signal to the sentinels of all second nodes in the cluster, wherein the sentinel of a second node returns an election response when it receives the election signal and the load of the corresponding second node is smaller than a preset load;
After receiving the election response, determining the load of the second node according to the election response;
and determining the second node with the minimum load, and selecting the second node as the target node.
9. An abnormal communication recovery apparatus, characterized by comprising: a memory, a processor, and an abnormal communication restoration program stored on the memory and executable on the processor, the abnormal communication restoration program configured to implement the steps of the abnormal communication restoration method according to any one of claims 1 to 8.
10. A storage medium having stored thereon an abnormal communication recovery program which, when executed by a processor, implements the steps of the abnormal communication recovery method according to any one of claims 1 to 8.
CN202410454622.5A 2024-04-16 2024-04-16 Abnormal communication recovery method, device and storage medium Pending CN118118468A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410454622.5A CN118118468A (en) 2024-04-16 2024-04-16 Abnormal communication recovery method, device and storage medium

Publications (1)

Publication Number Publication Date
CN118118468A true CN118118468A (en) 2024-05-31

Family

ID=91215753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410454622.5A Pending CN118118468A (en) 2024-04-16 2024-04-16 Abnormal communication recovery method, device and storage medium

Country Status (1)

Country Link
CN (1) CN118118468A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination