TWI468949B - Network server system and management method thereof - Google Patents

Publication number
TWI468949B
Authority
TW
Taiwan
Prior art keywords
controller
server
master
slave
signal
Prior art date
Application number
TW101139119A
Other languages
Chinese (zh)
Other versions
TW201416880A (en
Inventor
Lin Yu
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp
Priority to TW101139119A
Publication of TW201416880A
Application granted
Publication of TWI468949B

Landscapes

  • Hardware Redundancy (AREA)

Description

Network server system and management method thereof

The present invention relates to a network server system, and more particularly to a network server system that maintains normal operation by having multiple controllers operate in turn.

With the rapid development of computer technology and the Internet, cloud computing applications are becoming increasingly common. In a cloud computing network, users can use the computing, data access, and storage resources that cloud computing provides without knowing the location or other details of the underlying infrastructure. By using other computer resources in the cloud network, devices with lower computing power (such as mobile phones) can process data with those resources and thus perform functions that would otherwise require a high-performance computer (such as a server). In addition, users can go through a server to access services offered by other service-providing devices (such as audio and video) and so obtain a wider variety of services.

In general, in the network server system architecture used for cloud computing, a single Rack Management Controller (RMC) usually manages all the servers in a rack. Consequently, when that RMC behaves abnormally or fails, none of the servers it manages may be able to operate normally, which is inconvenient for users.

In view of the above problem, the present invention provides a network server system and a management method thereof in which multiple controllers operate in turn, so that the network server system continues to operate normally even when a controller behaves abnormally.

The present invention provides a network server system including at least one server, at least one distribution unit, a first controller, and a second controller. The at least one distribution unit is connected to the at least one server. The first controller is connected to the distribution unit and manages the at least one server through the distribution unit. The second controller is connected to the at least one distribution unit and to the first controller, and monitors the operating condition of the first controller by means of a first inquiry signal. When the first controller does not respond to the first inquiry signal with a first confirmation signal, the second controller replaces the first controller, manages the at least one server through the distribution unit, and restarts the first controller.

In an embodiment of the invention, the first controller monitors the second controller by means of a second inquiry signal and restarts the second controller when the second controller does not respond to the second inquiry signal with a second confirmation signal.

In an embodiment of the invention, after being restarted, the first controller/second controller determines whether the second controller/first controller is managing the server.

In an embodiment of the invention, whether the second controller/first controller is managing the server is determined by reading a master/slave flag associated with the second controller/first controller. Before the second controller/first controller begins managing the server, it first switches its corresponding master/slave flag to master; when the second controller/first controller stops managing the server, it switches its corresponding master/slave flag to slave.

In an embodiment of the invention, after the first controller/second controller is restarted, if the master/slave flag of the second controller/first controller is determined to be slave, the master/slave flag of the second controller/first controller is set to master. Conversely, if the master/slave flag of the second controller/first controller is determined to be master, the master/slave flag of the second controller/first controller is set to slave.

In an embodiment of the invention, when the first controller determines that the second controller is managing the server, it monitors the second controller by means of the second inquiry signal; when the second controller does not respond to the second inquiry signal with the second confirmation signal, the first controller replaces the second controller to manage the server and restarts the second controller.

In an embodiment of the invention, the first controller/second controller is further connected to a management center. Whichever of the first controller and the second controller is managing the server collects the server's work logs and uploads them to the management center.

From another point of view, the present invention provides a system management method suitable for a network server system. The method includes the following steps. First, the second controller monitors the operating condition of the first controller by means of a first inquiry signal. Next, when the first controller does not respond to the first inquiry signal with a first confirmation signal, the second controller replaces the first controller, manages the server through the distribution unit, and restarts the first controller.

In an embodiment of the invention, the method further includes monitoring the second controller by means of a second inquiry signal and restarting the second controller when the second controller does not respond to the second inquiry signal with a second confirmation signal.

In an embodiment of the invention, after the step of restarting the first controller/second controller, the method further includes determining whether the second controller/first controller is managing the server.

In an embodiment of the invention, the step of determining whether the second controller/first controller is managing the server includes reading a master/slave flag associated with the second controller/first controller. Before the second controller/first controller begins managing the server, it first switches its corresponding master/slave flag to master; when the second controller/first controller stops managing the server, it switches its corresponding master/slave flag to slave.

In an embodiment of the invention, after the step of restarting the first controller/second controller, the method further includes: if the master/slave flag of the second controller/first controller is determined to be slave, setting the master/slave flag of the second controller/first controller to master; and if the master/slave flag of the second controller/first controller is determined to be master, setting the master/slave flag of the second controller/first controller to slave.

In an embodiment of the invention, the step of determining whether the second controller/first controller is managing the server includes: when the second controller is determined to be managing the server, monitoring the second controller by means of the second inquiry signal; and, when the second controller does not respond to the second inquiry signal with the second confirmation signal, having the first controller replace the second controller to manage the server and restart the second controller.

In an embodiment of the invention, the method further includes, when one of the first controller and the second controller is managing the server, collecting the server's work logs and uploading them to the management center.

Based on the above, in the network server system and management method provided by the present invention, the second controller monitors the first controller, so that when the first controller behaves abnormally the second controller can promptly take its place in managing the servers. As a result, the network server system can keep operating through the second controller even when the first controller fails or behaves abnormally. In addition, after being restarted the first controller can monitor the second controller and, when the second controller behaves abnormally, manage the servers again, so that the network server system maintains normal operation by alternating between the first controller and the second controller.

To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.

Reference will now be made in detail to exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings. Wherever possible, elements/components/symbols with the same reference numerals in the drawings and the embodiments denote the same or similar parts.

FIG. 1 is a schematic diagram of a network server system according to an embodiment of the invention. In this embodiment, the network server system 100 includes a first controller 110_1, a second controller 110_2, a distribution unit 120_1, and servers 130_1_1~130_1_P (P being a positive integer). The distribution unit 120_1 is connected to the servers 130_1_1~130_1_P. The first controller 110_1 is connected to the distribution unit 120_1 and manages the servers 130_1_1~130_1_P through the distribution unit 120_1. The second controller 110_2 is connected to the distribution unit 120_1 and to the first controller 110_1, and monitors the operating condition of the first controller 110_1 by means of a first inquiry signal IS1.

When the first controller 110_1 does not respond to the first inquiry signal IS1 with a first confirmation signal CS1, the second controller 110_2 replaces the first controller 110_1 in managing the servers 130_1_1~130_1_P. After taking over management of the servers 130_1_1~130_1_P, the second controller 110_2 may restart the first controller 110_1.

Either the first controller 110_1 or the second controller 110_2 can determine whether the other controller is managing the servers 130_1_1~130_1_P through the distribution unit 120_1 by, for example, reading a master/slave flag associated with the other controller. Specifically, the master/slave flag represents a controller's management state with respect to the servers 130_1_1~130_1_P. For example, when the master/slave flag of the first controller 110_1 is "master", the first controller 110_1 is managing the servers 130_1_1~130_1_P. Conversely, when the master/slave flag of the first controller 110_1 is "slave", the first controller 110_1 is not managing the servers 130_1_1~130_1_P. In other words, when the master/slave flag of the first controller 110_1 is "master", the master/slave flag of the second controller 110_2 is "slave".

Therefore, when the second controller 110_2 replaces the first controller 110_1 in managing the servers 130_1_1~130_1_P, the master/slave flag of the first controller 110_1 is correspondingly switched to "slave", indicating that the first controller 110_1 is not managing the servers 130_1_1~130_1_P at this time, and the master/slave flag of the second controller 110_2 is correspondingly switched to "master", indicating that the second controller 110_2 is now managing the servers 130_1_1~130_1_P. In other words, when the first controller 110_1 (or the second controller 110_2) stops managing the servers 130_1_1~130_1_P, its master/slave flag is switched to "slave" to indicate that it is not managing the servers 130_1_1~130_1_P.
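The flag handover described here can be sketched in a few lines of Python. This is only an illustrative sketch: the `Role` enum, the `Controller` class, and the `take_over` function are hypothetical names, and the patent does not specify how or where the flags are stored.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    MASTER = "master"  # the controller is managing the servers
    SLAVE = "slave"    # the controller is not managing the servers


@dataclass
class Controller:
    name: str
    role: Role


def take_over(new_master: Controller, old_master: Controller) -> None:
    """Hand management over by switching both master/slave flags."""
    # The controller that stops managing the servers is marked slave.
    old_master.role = Role.SLAVE
    # The controller that starts managing the servers is marked master.
    new_master.role = Role.MASTER


first = Controller("110_1", Role.MASTER)
second = Controller("110_2", Role.SLAVE)
take_over(new_master=second, old_master=first)
print(first.role.value, second.role.value)  # slave master
```

After the handover exactly one controller carries the "master" flag, which is the property the other controller relies on when it reads the flag.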

In addition, after a controller is restarted, it can set its own master/slave flag according to the master/slave flag it reads from the other controller. For example, after the first controller 110_1 is restarted, it can read the master/slave flag of the second controller 110_2 to determine the second controller 110_2's current management state with respect to the servers 130_1_1~130_1_P. If the master/slave flag of the second controller 110_2 is "master", the first controller 110_1 knows that the second controller 110_2 is managing the servers 130_1_1~130_1_P. The first controller 110_1 can then switch its own master/slave flag to "slave" and begin monitoring the management state of the second controller 110_2.
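As a rough sketch of this post-restart decision, the hypothetical function below returns the flag a restarted controller would give itself. The "master" branch follows the example in the text; the "slave" branch (taking over when the peer is not managing the servers) is an assumption made for illustration and is not spelled out in this paragraph.

```python
def role_after_restart(peer_flag: str) -> str:
    """Return the flag a restarted controller sets for itself.

    peer_flag is the master/slave flag read from the other controller.
    """
    if peer_flag == "master":
        # The peer is managing the servers, so this controller stands by
        # and starts monitoring the peer.
        return "slave"
    if peer_flag == "slave":
        # Assumption: nobody is managing the servers, so take over.
        return "master"
    raise ValueError(f"unknown flag: {peer_flag!r}")


print(role_after_restart("master"))  # slave
```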

In an embodiment, the first controller 110_1 and the second controller 110_2 may be Rack Management Controllers (RMCs) in a cloud computing network system, and the distribution unit 120_1 may be a Power Distribution Unit (PDU). In general, while the network server system 100 is operating, the first controller 110_1 (for example, an RMC) alone is enough to manage the servers 130_1_1~130_1_P through the distribution unit 120_1 (for example, a PDU). Providing the second controller 110_2 and having it monitor the first controller 110_1 means that, when the first controller 110_1 behaves abnormally, the second controller 110_2 can assume the role of the first controller 110_1 and keep the network server system 100 working.

Specifically, the second controller 110_2 sends the first inquiry signal IS1 to the first controller 110_1 and determines whether the first controller 110_1 is operating normally according to whether it responds to the first inquiry signal IS1. When the first controller 110_1 is operating normally, it can return the first confirmation signal CS1 to the second controller 110_2 promptly upon receiving the first inquiry signal IS1. From the first confirmation signal CS1, the second controller 110_2 then knows that the first controller 110_1 is still operating normally. In addition, the second controller 110_2 may start a timer after sending the first inquiry signal IS1; when the first controller 110_1 does not return the first confirmation signal CS1 within a preset time, the second controller 110_2 determines that the operating condition of the first controller 110_1 is abnormal.

In other words, when the first controller 110_1 behaves abnormally or fails, it cannot respond to the first inquiry signal IS1 as promptly as it would in normal operation. The first controller 110_1 may therefore be late in returning the first confirmation signal CS1, or may be unable to respond to the first inquiry signal IS1 at all. Because the second controller 110_2 then does not receive the first confirmation signal CS1 from the first controller 110_1 (or receives it only after the preset time has elapsed), it determines that the operating condition of the first controller 110_1 is abnormal. Then, so that the network server system 100 can continue to operate normally, the second controller 110_2 replaces the first controller 110_1 in managing the servers 130_1_1~130_1_P; that is, while the first controller 110_1 is abnormal, the second controller 110_2 performs the work of the first controller 110_1 and keeps the network server system 100 running.
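The timed inquiry/confirmation exchange can be sketched as follows. This is a minimal simulation under stated assumptions, not the patented implementation: the signals are modeled as items on an in-process `queue.Queue`, the function names `peer_is_abnormal` and `make_peer` are hypothetical, and the timeout values are arbitrary. A real system would also need to match each confirmation to the inquiry that triggered it.

```python
import queue
import threading


def peer_is_abnormal(send_inquiry, ack_queue, timeout_s):
    """Send one inquiry signal and wait up to timeout_s for a confirmation."""
    send_inquiry()
    try:
        ack_queue.get(timeout=timeout_s)
    except queue.Empty:
        # No confirmation within the preset time: treat the peer as abnormal.
        return True
    return False


def make_peer(ack_queue, delay_s):
    """Build a simulated peer that confirms after delay_s seconds (None: never)."""
    def on_inquiry():
        if delay_s is None:
            return  # a hung controller never answers
        threading.Timer(delay_s, ack_queue.put, args=("CS1",)).start()
    return on_inquiry


healthy_q = queue.Queue()
print(peer_is_abnormal(make_peer(healthy_q, 0.01), healthy_q, timeout_s=1.0))  # False

hung_q = queue.Queue()
print(peer_is_abnormal(make_peer(hung_q, None), hung_q, timeout_s=0.1))  # True
```

A confirmation that arrives after the timeout is treated the same as no confirmation, which matches the "late or missing" cases described above.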

In an embodiment, the second controller 110_2 may send the first inquiry signal IS1 to the first controller 110_1 periodically or at preset points in time.

In short, because the second controller 110_2 monitors the first controller 110_1, the second controller 110_2 can promptly replace the first controller 110_1 in managing the servers 130_1_1~130_1_P when the first controller 110_1 behaves abnormally. As a result, the network server system 100 can keep operating through the second controller 110_2 even when the first controller 110_1 fails or behaves abnormally.

In other embodiments, the first controller 110_1 and the second controller 110_2 may further be connected to a management center that stores the work logs of the servers 130_1_1~130_1_P. For example, while the first controller 110_1 is managing the servers 130_1_1~130_1_P, it can collect the servers' work logs and upload them to the management center.

FIG. 2 is a flowchart of a system management method according to an embodiment of the invention. Referring to FIG. 2, the system management method of this embodiment applies to the network server system 100 of FIG. 1, and its detailed steps are described below with reference to the devices in FIG. 1. First, the network server system 100 uses the second controller 110_2 to monitor the operating condition of the first controller 110_1 by means of the first inquiry signal IS1 (step S210). Next, when the first controller 110_1 does not respond to the first inquiry signal IS1 with the first confirmation signal CS1, the second controller 110_2 replaces the first controller 110_1 in managing the servers 130_1_1~130_1_P and restarts the first controller 110_1 (step S220).

FIG. 3 is a schematic diagram of a network server system according to an embodiment of the invention. FIG. 4 is a flowchart of a system management method according to an embodiment of the invention. Referring to FIG. 3 and FIG. 4 together, the operations performed in steps S410 and S420 are the same as those in steps S210 and S220 of FIG. 2 and are not repeated here. After the second controller 110_2 has replaced the first controller 110_1 in managing the servers 130_1_1~130_1_P, the second controller 110_2 can restart the first controller 110_1 (step S430), because the first controller 110_1 has been judged to be operating abnormally, so that the first controller 110_1 can return to normal operation after the restart.

In an embodiment, after being restarted by the second controller 110_2, the first controller 110_1 can determine, for example from the master/slave flag, whether the second controller 110_2 is managing the servers 130_1_1~130_1_P. Because the distribution unit 120_1 is being managed by the second controller 110_2 at this time, the first controller 110_1, after determining that the second controller 110_2 is managing the servers 130_1_1~130_1_P, can monitor the second controller 110_2 by means of a second inquiry signal IS2 (step S440).

When the second controller 110_2 does not respond to the second inquiry signal IS2 with a second confirmation signal CS2, the first controller 110_1 replaces the second controller 110_2 in managing the servers 130_1_1~130_1_P (step S450). In other words, the second controller 110_2 is operating abnormally and cannot respond to the second inquiry signal IS2 sent by the first controller 110_1. So that the network server system 100 can continue to operate normally, the first controller 110_1 therefore replaces the second controller 110_2 in managing the distribution unit 120_1; while the second controller 110_2 is abnormal, the first controller 110_1 performs the work of the second controller 110_2 and keeps the network server system 100 running. Afterwards, the first controller 110_1 may restart the second controller 110_2 (step S460) so that the second controller 110_2 can return to normal operation after the restart.

In an embodiment, after being restarted by the first controller 110_1, the second controller 110_2 can determine, for example from the master/slave flag of the first controller 110_1, whether the first controller 110_1 is managing the servers 130_1_1~130_1_P.

After step S460, the method returns to step S410, and the second controller 110_2 again monitors the operating condition of the first controller 110_1.
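The alternating cycle of steps S410 to S460 can be sketched as a single monitoring round that swaps the two roles whenever the managing controller fails to confirm. This is one possible reading of the flowchart, written as a simulation: the `Controller` class, its `healthy` and `restarts` fields, and the `monitor_step` function are hypothetical, and a real controller would exchange signals rather than read a boolean.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    name: str
    flag: str            # "master" or "slave"
    healthy: bool = True
    restarts: int = 0

    def acknowledges(self) -> bool:
        """Simulated reply to an inquiry signal."""
        return self.healthy

    def restart(self) -> None:
        self.healthy = True
        self.restarts += 1


def monitor_step(monitor: Controller, managed: Controller):
    """Run one monitoring round; return (monitor, managed) for the next round."""
    if managed.acknowledges():
        return monitor, managed  # S410/S440: confirmation received, keep roles
    # S420/S450: the monitoring controller takes over the servers.
    managed.flag, monitor.flag = "slave", "master"
    # S430/S460: restart the failed controller, which then becomes the monitor.
    managed.restart()
    return managed, monitor


c1 = Controller("110_1", "master")
c2 = Controller("110_2", "slave")
monitor, managed = c2, c1

c1.healthy = False                       # first controller stops answering
monitor, managed = monitor_step(monitor, managed)
print(managed.name, monitor.name)        # 110_2 110_1

c2.healthy = False                       # later, the second controller fails
monitor, managed = monitor_step(monitor, managed)
print(managed.name, monitor.name)        # 110_1 110_2
```

After two failures the roles are back where they started, which illustrates the return from step S460 to step S410.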

In short, after the second controller 110_2 determines that the first controller 110_1 is operating abnormally and takes over management of the distribution unit 120_1 from it, the first controller 110_1 can monitor the second controller 110_2 once it has restarted. That is, the first controller 110_1 performs the monitoring originally performed by the second controller 110_2. When the second controller 110_2 in turn behaves abnormally, the first controller 110_1 can manage the servers 130_1_1~130_1_P again, so that the network server system 100 continues to operate normally through the alternating operation of the first controller 110_1 and the second controller 110_2.

In addition, in other embodiments, steps S440~S460 may also be performed while the first controller 110_1 is managing the servers 130_1_1~130_1_P. Specifically, while the second controller 110_2 monitors the management state of the first controller 110_1, the first controller 110_1 can also monitor whether the second controller 110_2 is operating normally by means of the second inquiry signal IS2, and restart the second controller 110_2 when it does not respond to the second inquiry signal IS2 with the second confirmation signal CS2. In other words, the two controllers can continuously monitor each other's operating state by means of inquiry signals.
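When both controllers monitor each other, the reaction to a missed confirmation depends on the role of the controller that failed. The sketch below illustrates that distinction only; the function name and the action labels are hypothetical and do not appear in the patent.

```python
from typing import List


def actions_on_missed_ack(peer_flag: str) -> List[str]:
    """Actions a controller takes when its peer misses a confirmation."""
    if peer_flag == "master":
        # The failed peer was managing the servers: take over, then restart it.
        return ["take_over_servers", "restart_peer"]
    # The failed peer was only standing by: restarting it is enough.
    return ["restart_peer"]


print(actions_on_missed_ack("master"))  # ['take_over_servers', 'restart_peer']
print(actions_on_missed_ack("slave"))   # ['restart_peer']
```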

In summary, in the network server system and management method provided by the present invention, the second controller monitors the first controller, so that when the first controller behaves abnormally the second controller can promptly take its place in managing the servers. As a result, the network server system can keep operating through the second controller even when the first controller fails or behaves abnormally. In addition, after being restarted the first controller can monitor the second controller and, when the second controller behaves abnormally, manage the servers again, so that the network server system maintains normal operation by alternating between the first controller and the second controller.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the invention. Anyone with ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the appended claims.

100‧‧‧Network server system

110_1‧‧‧First controller

110_2‧‧‧Second controller

120_1‧‧‧Distribution unit

130_1_1~130_1_P‧‧‧Servers

CS1‧‧‧First confirmation signal

CS2‧‧‧Second confirmation signal

IS1‧‧‧First inquiry signal

IS2‧‧‧Second inquiry signal

S210~S220, S410~S460‧‧‧Steps

FIG. 1 is a schematic diagram of a network server system according to an embodiment of the invention.

FIG. 2 is a flowchart of a system management method according to an embodiment of the invention.

FIG. 3 is a schematic diagram of a network server system according to an embodiment of the invention.

FIG. 4 is a flowchart of a system management method according to an embodiment of the invention.

Claims (14)

1. A network server system, comprising: at least one server; at least one distribution unit connected to the at least one server; a first controller connected to the distribution unit and configured to manage the at least one server through the distribution unit; and a second controller connected to the distribution unit and the first controller and configured to monitor the operating condition of the first controller through a first inquiry signal, wherein, when the first controller does not respond to the first inquiry signal with a first confirmation signal, the second controller replaces the first controller, manages the at least one server through the distribution unit, and restarts the first controller.

2. The network server system of claim 1, wherein the first controller monitors the second controller through a second inquiry signal and, when the second controller does not respond to the second inquiry signal with a second confirmation signal, restarts the second controller.

3. The network server system of claim 2, wherein the first controller/second controller, after being restarted, determines whether the second controller/first controller is managing the at least one server.

4. The network server system of claim 3, wherein determining whether the second controller/first controller is managing the at least one server is performed by reading a master/slave flag associated with the second controller/first controller; before the second controller/first controller starts managing the server, it first switches its corresponding master/slave flag to master, and after the second controller/first controller finishes managing the at least one server, it switches its corresponding master/slave flag to slave.

5. The network server system of claim 4, wherein, after the first controller/second controller is restarted, if the master/slave flag of the second controller/first controller is determined to be slave, the master/slave flag of the second controller/first controller is set to master; and if the master/slave flag of the second controller/first controller is determined to be master, the master/slave flag of the second controller/first controller is set to slave.

6. The network server system of claim 3, wherein the first controller, upon determining that the second controller is managing the server, monitors the second controller through the second inquiry signal and, when the second controller does not respond to the second inquiry signal with the second confirmation signal, replaces the second controller, manages the at least one server, and restarts the second controller.

7. The network server system of claim 3, wherein the first controller/second controller is further connected to a management center, and whichever of the first controller and the second controller is managing the at least one server collects the work logs of the server and uploads them to the management center.

8. A system management method suitable for a network server system, the method comprising: using a second controller to monitor the operating condition of a first controller through a first inquiry signal; and, when the first controller does not respond to the first inquiry signal with a first confirmation signal, using a second controller to replace the first controller, manage at least one server through the distribution unit, and restart the first controller.

9. The method of claim 8, further comprising monitoring the second controller through a second inquiry signal and, when the second controller does not respond to the second inquiry signal with a second confirmation signal, restarting the second controller.

10. The method of claim 9, further comprising, after the step of restarting the first controller/second controller: determining whether the second controller/first controller is managing the at least one server.

11. The method of claim 10, wherein the step of determining whether the second controller/first controller is managing the server comprises: reading a master/slave flag associated with the second controller/first controller, wherein, before the second controller/first controller starts managing the server, its corresponding master/slave flag is first switched to master, and after the second controller/first controller finishes managing the server, its corresponding master/slave flag is switched to slave.

12. The method of claim 11, further comprising, after the step of restarting the first controller/second controller: if the master/slave flag of the second controller/first controller is determined to be slave, setting the master/slave flag of the second controller/first controller to master; and if the master/slave flag of the second controller/first controller is determined to be master, setting the master/slave flag of the second controller/first controller to slave.

13. The method of claim 10, wherein the step of determining whether the second controller/first controller is managing the at least one server comprises: upon determining that the second controller is managing the at least one server, monitoring the second controller through the second inquiry signal and, when the second controller does not respond to the second inquiry signal with the second confirmation signal, having the first controller replace the second controller, manage the at least one server, and restart the second controller.

14. The method of claim 10, further comprising, when one of the first controller and the second controller is managing the server, collecting the work logs of the server and uploading them to a management center.
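The master/slave flag handling recited in claims 4 and 11 can be sketched as follows. This is only an illustration under stated assumptions: the claims do not say where the flag is stored, so the `FlagStore` class, the `Role` enum, and the function names are hypothetical, and the flag-toggling behavior of claims 5 and 12 is not modeled.

```python
from enum import Enum


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"


class FlagStore:
    """Hypothetical shared store holding one master/slave flag per controller."""

    def __init__(self, names):
        # Assumption: every controller starts out as slave.
        self._flags = {name: Role.SLAVE for name in names}

    def read(self, name):
        return self._flags[name]

    def write(self, name, role):
        self._flags[name] = role


def begin_managing(flags, name):
    """Switch the flag to master before the controller starts managing."""
    flags.write(name, Role.MASTER)


def end_managing(flags, name):
    """Switch the flag to slave after the controller stops managing."""
    flags.write(name, Role.SLAVE)


def peer_is_managing(flags, peer):
    """Check a restarted controller could run by reading its peer's flag."""
    return flags.read(peer) is Role.MASTER


flags = FlagStore(["110_1", "110_2"])
begin_managing(flags, "110_2")
print(peer_is_managing(flags, "110_2"))   # True
end_managing(flags, "110_2")
print(peer_is_managing(flags, "110_2"))   # False
```

Reading a flag rather than probing the servers directly lets a restarted controller decide whether to start monitoring its peer without interfering with servers that are already being managed.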
TW101139119A 2012-10-23 2012-10-23 Network sever system and management method thereof TWI468949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW101139119A TWI468949B (en) 2012-10-23 2012-10-23 Network sever system and management method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW101139119A TWI468949B (en) 2012-10-23 2012-10-23 Network sever system and management method thereof

Publications (2)

Publication Number Publication Date
TW201416880A TW201416880A (en) 2014-05-01
TWI468949B true TWI468949B (en) 2015-01-11

Family

ID=51293816

Family Applications (1)

Application Number Title Priority Date Filing Date
TW101139119A TWI468949B (en) 2012-10-23 2012-10-23 Network sever system and management method thereof

Country Status (1)

Country Link
TW (1) TWI468949B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW497071B (en) * 1998-10-27 2002-08-01 Ibm Method and apparatus for managing clustered computer systems
US6675199B1 (en) * 2000-07-06 2004-01-06 Microsoft Identification of active server cluster controller
TWI322948B (en) * 2006-11-07 2010-04-01 Inventec Corp Method of periodically performing fail back for a server having dual redundant storage controllers
CN102298436A (en) * 2010-06-24 2011-12-28 阿沃森特公司 System and Method for Identifying Power Connections in Computer Systems Having Redundant Power Supplies
TW201205304A (en) * 2010-07-23 2012-02-01 Quanta Comp Inc Server system and operation method thereof

Also Published As

Publication number Publication date
TW201416880A (en) 2014-05-01

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees