CN106446311A - CPU alarm circuit and alarm method - Google Patents


Info

Publication number
CN106446311A
Authority
CN
China
Prior art keywords
cpu
logical device
fault alarm
alarm signal
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510485887.2A
Other languages
Chinese (zh)
Other versions
CN106446311B (en)
Inventor
宛江明
吴聿旻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XFusion Digital Technologies Co Ltd
Original Assignee
Hangzhou Huawei Digital Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Huawei Digital Technologies Co Ltd filed Critical Hangzhou Huawei Digital Technologies Co Ltd
Priority to CN201510485887.2A priority Critical patent/CN106446311B/en
Publication of CN106446311A publication Critical patent/CN106446311A/en
Application granted granted Critical
Publication of CN106446311B publication Critical patent/CN106446311B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Hardware Redundancy (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a CPU alarm circuit and alarm method, and belongs to the field of CPU circuit design. The method comprises the steps that alarm signal pins of CPUs are connected with a logic device, the logic device receives fault alarm signals of the CPUs, detects the types of the fault alarm signals of the CPUs, and if the fault alarm signals are of unrepairable types, the logic device sends the fault alarm signals of the unrepairable types to all CPUs of a same hard partition; and if the fault alarm signals are of repairable types, the logic device sends the repairable fault alarm signals to all the CPUs of the same hard partition, thereby triggering other CPUs to perform fault repair. The problem of complex circuit design for connecting a plurality of CPUs to a same alarm bus in the prior art is solved; and the effects of simple circuit design and capability of accurately locating the faulted CPU through the logic device are achieved.

Description

CPU alarm circuit and alarm method
Technical field
The present invention relates to the field of CPU circuit design, and in particular to a CPU alarm circuit and an alarm method.
Background
A multi-way server is a server system provided with multiple CPUs. By extending the system with interconnect chips, 32-way, 64-way, and 128-way servers can be built. The interconnect chip may be a node controller (NC).
In a multi-way server, if one CPU fails, a fault alarm signal needs to be transmitted between the CPUs. The prior art provides an out-of-band alarm bus to which every CPU in the multi-way server is connected. In normal operation, the alarm bus is held at a high level. If a CPU fails, that CPU pulls the alarm bus down to a low level, which triggers a fault alarm signal on the alarm bus and thereby notifies the other CPUs in the multi-way server.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem: because multiple CPUs must be connected to the same alarm bus at the same time, a multi-way server with many CPUs also requires additional electronic components, such as driver devices. As a result, the circuit design for connecting multiple CPUs to the same alarm bus is very complex, which is unfavorable for industrial production.
Summary of the invention
To solve the problem in the prior art, embodiments of the present invention provide a CPU alarm circuit and an alarm method. The technical solutions are as follows:
In a first aspect, an embodiment of the present invention provides a CPU alarm circuit, the circuit including:
N CPUs connected through an interconnect chip, each CPU having its own alarm signal pin, where N is a power of 4;
a logic device;
wherein the alarm signal pin of each CPU is connected to the logic device.
In a first possible implementation of the first aspect, there are at least two logic devices and at least four CPUs;
each logic device is connected to the alarm signal pins of at least two CPUs, and the alarm signal pin of each CPU is connected to one logic device;
the CPU alarm circuit further includes a system management unit (SMU);
the SMU stores hard partition information;
the hard partition information indicates at least two hard partitions, each hard partition including at least one logic device and the CPUs connected to that logic device;
the SMU is further connected to the logic devices in each hard partition.
With reference to the first possible implementation of the first aspect, in a second possible implementation,
at least one hard partition includes at least two logic devices;
the at least two logic devices belonging to the same hard partition are connected to each other, and one of the logic devices in that hard partition is connected to the SMU.
With reference to the first or second possible implementation of the first aspect, in a third possible implementation, there are N/4 logic devices, and each logic device is connected to the alarm signal pins of four CPUs.
With reference to the first or second possible implementation of the first aspect, in a fourth possible implementation, each hard partition further includes a baseboard management controller (BMC);
the BMC is connected to a logic device belonging to the same hard partition.
With reference to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation, the logic device is a field programmable gate array (FPGA) circuit or a complex programmable logic device (CPLD) circuit.
In a second aspect, an embodiment of the present invention provides a multi-way server,
the multi-way server including the CPU alarm circuit provided in the first aspect.
In a third aspect, an embodiment of the present invention provides a CPU alarm method, applied to the CPU alarm circuit provided in the first aspect, the method including:
the logic device receives a fault alarm signal of a CPU;
the logic device detects the type of the fault alarm signal of the CPU;
if the type of the fault alarm signal is the unrepairable type, the logic device sends the unrepairable-type fault alarm signal to all CPUs in the same hard partition as the CPU;
if the type of the fault alarm signal is the repairable type, the logic device sends the repairable-type fault alarm signal to all CPUs in the same hard partition as the CPU, the repairable-type fault alarm signal being used to trigger the other CPUs to perform fault repair.
In a first possible implementation of the third aspect, the logic device detecting the type of the fault alarm signal of the CPU includes:
the logic device detects whether the duration of the fault alarm signal of the CPU exceeds a preset duration;
if the preset duration is exceeded, the type of the fault alarm signal is determined to be the unrepairable type;
if the preset duration is not exceeded, the type of the fault alarm signal is determined to be the repairable type.
In a second possible implementation of the third aspect, the CPU alarm circuit is that of any one of the first, second, third, or fourth possible implementations of the first aspect,
and before the logic device receives the fault alarm signal of the CPU, the method further includes:
the logic device reads hard partition information from the SMU, the hard partition information indicating at least two hard partitions.
The technical solutions provided by the embodiments of the present invention have the following beneficial effects:
The alarm signal pin of each CPU is connected to the logic device. The logic device receives a CPU's fault alarm signal and detects its type. If the fault alarm signal is of the unrepairable type, the logic device sends the unrepairable-type fault alarm signal to all CPUs in the hard partition where that CPU resides; if the fault alarm signal is of the repairable type, the logic device sends the repairable-type fault alarm signal to all CPUs in that hard partition, triggering the other CPUs to perform fault repair. This solves the problem of the complex circuit design required to connect multiple CPUs to the same alarm bus. Because the fault alarm pins of the CPUs are connected through the logic device, the circuit design is flexible and simple, and the faulty CPU can be accurately located through the logic device.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1A is a schematic structural diagram of a CPU alarm circuit provided by one embodiment of the present invention;
Fig. 1B is a schematic structural diagram of a CPU alarm circuit provided by another embodiment of the present invention;
Fig. 2 is a flowchart of a CPU alarm method provided by one embodiment of the present invention;
Fig. 3 is a flowchart of the sub-steps of a CPU alarm method provided by another embodiment of the present invention;
Fig. 4A is a schematic structural diagram of a CPU alarm circuit provided by one embodiment of the present invention;
Fig. 4B is a schematic structural diagram of a CPU alarm circuit provided by another embodiment of the present invention;
Fig. 5 is a flowchart of a CPU alarm method provided by another embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a multi-way server provided by one embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiments of the present invention provide a CPU alarm circuit and an alarm method applied in a multi-way server. The embodiments replace the alarm bus of the prior art with a logic device, which simplifies the circuit design and enables a flexible hard partition design when the multi-way server is a 32-way, 64-way, or 128-way server.
Refer to Fig. 1A, which shows a schematic structural diagram of a CPU alarm circuit provided by one embodiment of the present invention. As shown in Fig. 1A, the CPU alarm circuit includes N CPUs 10 and a logic device 20.
The N CPUs 10 are connected through an interconnect chip (not shown), and each CPU 10 has its own alarm signal pin; that is, the N CPUs 10 have N alarm signal pins in total. Fig. 1A uses N=8 as an example.
There is at least one logic device 20; Fig. 1A uses one logic device as an example. The logic device 20 is connected to the alarm signal pins of the 8 CPUs 10. Optionally, the logic device 20 also contains 8 registers (not shown), each corresponding to one connected CPU 10. Each register records the position of the CPU that produced a fault alarm signal.
The logic device 20 is a field programmable gate array (FPGA) circuit or a complex programmable logic device (CPLD) circuit.
In this embodiment, the 8 CPUs are considered to belong to the same hard partition, and the example assumes that only this one hard partition exists. The logic device 20 also ensures that fault alarm signals are transmitted only between the CPUs of this hard partition.
In summary, in the CPU alarm circuit provided by this embodiment, the logic device 20 is connected to the fault alarm signal of each CPU, which simplifies the circuit design, particularly when the multi-way server is a 32-way, 64-way, or 128-way server. In addition, because registers can be provided in the logic device 20, the CPU that raised the fault alarm can be accurately identified, whereas once the prior-art alarm bus has been pulled low, it is impossible to determine which CPU raised the alarm.
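The contrast between per-CPU registers and a shared alarm bus can be illustrated with a minimal Python sketch. The class `LogicDevice`, its methods, and the helper `shared_bus_is_low` are hypothetical names introduced only for illustration; the source does not specify how the registers are implemented inside the FPGA or CPLD.

```python
class LogicDevice:
    """Hypothetical model: one fault register (latch) per connected CPU."""

    def __init__(self, cpu_ids):
        self.cpu_ids = list(cpu_ids)
        self.fault_registers = {cpu: False for cpu in self.cpu_ids}

    def on_alarm(self, cpu_id):
        """Latch the register of the CPU whose alarm pin asserted."""
        if cpu_id not in self.fault_registers:
            raise ValueError(f"CPU {cpu_id} is not wired to this logic device")
        self.fault_registers[cpu_id] = True

    def faulted_cpus(self):
        """Return the CPUs whose registers are set, in wiring order."""
        return [cpu for cpu in self.cpu_ids if self.fault_registers[cpu]]


def shared_bus_is_low(pins_pulling_low):
    """Prior-art style bus: low as soon as any CPU pulls it low."""
    return any(pins_pulling_low)


device = LogicDevice(range(8))   # N = 8, as in Fig. 1A
device.on_alarm(5)
print(device.faulted_cpus())     # the register identifies the faulting CPU
```

With the shared bus, a fault on CPU 0 and a fault on CPU 5 both yield the same single low level, so the source of the alarm cannot be recovered from the bus alone.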
Refer to Fig. 1B, which shows a schematic structural diagram of a CPU alarm circuit provided by another embodiment of the present invention. As shown in Fig. 1B, the CPU alarm circuit includes N CPUs 10 and logic devices 20.
The N CPUs 10 are connected through an interconnect chip (not shown), and each CPU 10 has its own alarm signal pin; that is, the N CPUs 10 have N alarm signal pins in total. The figure uses N=8 as an example.
There is at least one logic device 20; Fig. 1B uses two logic devices 20 as an example. Each logic device 20 is connected to the alarm signal pins of at least two CPUs 10, and the alarm signal pin of each CPU 10 is connected to one logic device 20. Specifically, in Fig. 1B each logic device 20 is connected to the alarm signal pins of four CPUs 10, and the two logic devices 20 are connected to each other.
Optionally, each logic device 20 contains 4 registers (not shown), each corresponding to one connected CPU 10. Each register records the position of the CPU that produced a fault alarm signal.
The logic device 20 is an FPGA circuit or a CPLD circuit.
In this embodiment, the 8 CPUs are considered to belong to the same hard partition, and the example assumes that only this one hard partition exists. The logic devices 20 also ensure that fault alarm signals are transmitted only between the CPUs of this hard partition.
Optionally, the CPU alarm circuit further includes a baseboard management controller (BMC) 30. The BMC 30 is connected to one of the logic devices 20.
In summary, in the CPU alarm circuit provided by this embodiment, the logic devices 20 are connected to the fault alarm signal of each CPU, which simplifies the circuit design, particularly when the multi-way server is a 32-way, 64-way, or 128-way server. In addition, because registers can be provided in the logic devices 20, the CPU that raised the fault alarm can be accurately identified, whereas once the prior-art alarm bus has been pulled low, it is impossible to determine which CPU raised the alarm.
Refer to Fig. 2, which shows a flowchart of a CPU alarm method provided by one embodiment of the present invention. This embodiment is described using the example in which the CPU alarm method is applied to the CPU alarm circuit shown in Fig. 1A or Fig. 1B. The method includes:
Step 201: a CPU produces a fault alarm signal.
While the multi-way server is running, the fault alarm pin of each CPU defaults to a high level.
If a CPU fails, it produces a fault alarm signal and sends it through its alarm signal pin to the logic device connected to it.
There are two types of fault alarm signal: the unrepairable type and the repairable type. Optionally, the unrepairable-type fault alarm signal is represented by a long-duration low-level signal, and the repairable-type fault alarm signal is represented by a pulse signal 160 ns long.
Step 202: the logic device receives the fault alarm signal of the CPU.
The logic device receives the fault alarm signal produced by the CPU.
Step 203: the logic device detects the type of the fault alarm signal of the CPU.
The logic device detects the type of the CPU's fault alarm signal based on the fault alarm signal it received.
Optionally, this step includes the following sub-steps, as shown in Fig. 3:
Step 203a: the logic device detects whether the duration of the fault alarm signal of the CPU exceeds a preset duration.
Optionally, the preset duration is 160 ns.
Step 203b: if the preset duration is exceeded, the type of the fault alarm signal is determined to be the unrepairable type.
If the logic device detects that the duration of the received fault alarm signal exceeds 160 ns, it determines that the fault alarm signal is of the unrepairable type.
Step 203c: if the preset duration is not exceeded, the type of the fault alarm signal is determined to be the repairable type.
If the logic device detects that the duration of the received fault alarm signal does not exceed 160 ns, it determines that the fault alarm signal is of the repairable type.
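Sub-steps 203a to 203c amount to a single threshold comparison. The following Python sketch illustrates it; the names `FaultType`, `PRESET_DURATION_NS`, and `classify_fault_alarm` are assumptions made for this illustration, and how the logic device actually measures the signal duration (for example, by counting clock cycles) is not stated in the source.

```python
from enum import Enum


class FaultType(Enum):
    REPAIRABLE = "repairable"
    UNREPAIRABLE = "unrepairable"


# Optional preset duration given in the description (160 ns).
PRESET_DURATION_NS = 160


def classify_fault_alarm(duration_ns, preset_ns=PRESET_DURATION_NS):
    """Steps 203a-203c: exceeding the preset duration means unrepairable."""
    if duration_ns > preset_ns:
        return FaultType.UNREPAIRABLE
    return FaultType.REPAIRABLE
```

A signal lasting exactly the preset duration is classified as repairable, because the description treats only signals that exceed the preset duration as unrepairable.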
Step 204: if the type of the fault alarm signal is the unrepairable type, the logic device sends the unrepairable-type fault alarm signal to all CPUs in the same hard partition as the CPU.
If the logic device detects that the CPU's fault alarm signal is of the unrepairable type, it sends the unrepairable-type fault alarm signal to all CPUs in the same hard partition.
That is, the logic device drives the signals on the alarm signal pins of all CPUs to a low level, putting the multi-way server into a hung state.
Step 205: if the type of the fault alarm signal is the repairable type, the logic device sends the repairable-type fault alarm signal to all CPUs in the same hard partition as the CPU.
The repairable-type fault alarm signal is used to trigger the other CPUs to perform fault repair.
The logic device sends a 160 ns pulse signal to the other CPUs, triggering all CPUs to start the reliability, availability and serviceability (RAS) flow to perform fault repair. The repair process may execute predetermined operations, such as clearing the registers of each CPU, in an attempt to repair the fault.
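Steps 204 and 205 can be sketched as a small dispatch function. This is only an illustrative Python model: the function name `forward_alarm`, the string labels, and the tuple encoding of the driven signal are hypothetical. The description mentions both "all CPUs" and "the other CPUs" as recipients; the sketch follows the step wording and drives every CPU in the hard partition.

```python
def forward_alarm(fault_type, partition_cpus):
    """Return the signal the logic device drives on each CPU's alarm pin.

    fault_type: "unrepairable" or "repairable" (hypothetical labels).
    partition_cpus: CPUs in the same hard partition as the faulting CPU.
    """
    if fault_type == "unrepairable":
        # Step 204: hold the alarm pins low for a long duration.
        signal = ("hold_low", None)
    elif fault_type == "repairable":
        # Step 205: send a 160 ns pulse that triggers the RAS repair flow.
        signal = ("pulse", 160)
    else:
        raise ValueError(f"unknown fault type: {fault_type!r}")
    return {cpu: signal for cpu in partition_cpus}


# Example: a repairable fault in an 8-CPU hard partition.
driven = forward_alarm("repairable", range(8))
```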
One point to add: Fig. 1B provides two logic devices. If a CPU on the left side fails, the left logic device receives the fault alarm signal. If a fault alarm signal subsequently needs to be sent to all CPUs, the left logic device sends the fault alarm signal to the 4 CPUs connected to it and at the same time sends an instruction to the right logic device, which then sends the fault alarm signal to the other 4 CPUs. That is, the two logic devices can work together.
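The cooperation between the two logic devices of Fig. 1B can be modeled roughly as follows. The class `PairedLogicDevice` and its methods are hypothetical, and the inter-device instruction is represented by a direct method call; the source does not describe the actual link protocol between the devices.

```python
class PairedLogicDevice:
    """Hypothetical model of one of two cooperating logic devices."""

    def __init__(self, name, cpus):
        self.name = name
        self.cpus = list(cpus)
        self.peer = None
        self.driven = []  # (cpu, signal) pairs this device has driven

    def connect(self, other):
        """Link two logic devices in the same hard partition."""
        self.peer = other
        other.peer = self

    def drive_local(self, signal):
        """Send the fault alarm signal to the CPUs wired to this device."""
        self.driven.extend((cpu, signal) for cpu in self.cpus)

    def on_cpu_alarm(self, signal):
        """Handle an alarm from a locally wired CPU."""
        self.drive_local(signal)
        if self.peer is not None:
            # Models the instruction sent to the other logic device.
            self.peer.drive_local(signal)


left = PairedLogicDevice("left", [0, 1, 2, 3])
right = PairedLogicDevice("right", [4, 5, 6, 7])
left.connect(right)
left.on_cpu_alarm("unrepairable")
```

After the call, the left device has driven CPUs 0 to 3 and the right device has driven CPUs 4 to 7, so all 8 CPUs in the hard partition receive the signal.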
Another point to add: the BMC can provide a graphical user interface to maintenance personnel. If maintenance personnel need to check which CPU has failed, the BMC can read the registers in the logic device to learn which CPU failed and display this in the graphical user interface.
In summary, in the CPU alarm method provided by this embodiment, a CPU sends a fault alarm signal to the logic device through its alarm signal pin, and the logic device detects the type of the received fault alarm signal. If the CPU's fault alarm signal is of the unrepairable type, the logic device sends the unrepairable-type fault alarm signal to all CPUs; if it is of the repairable type, the logic device sends the repairable-type fault alarm signal to all CPUs. This solves the prior-art problem of the complex circuit design required to connect multiple CPUs to the same alarm bus, and makes it possible to quickly locate the CPU that produced the fault alarm signal in a multi-way server system.
The above embodiments are described using the example in which the whole multi-way server belongs to a single hard partition and only one hard partition exists. In practice, however, 32-way, 64-way, and even 128-way servers arise, and in the prior art a single mainboard can currently hold at most 4 CPUs. Fault alarming for the CPUs in a multi-way server therefore calls for a flexible hard partition design that simplifies the circuit design; refer to the following embodiments.
Refer to Fig. 4A, which shows a schematic structural diagram of a CPU alarm circuit provided by one embodiment of the present invention. As shown in Fig. 4A, the CPU alarm circuit includes N CPUs 10, logic devices 20, and a system management unit (SMU) 40.
The N CPUs 10 are connected through an interconnect chip (not shown), and each CPU 10 has its own alarm signal pin; that is, the N CPUs 10 have N alarm signal pins in total.
There are at least two logic devices 20 and at least four CPUs 10. Each logic device 20 is connected to the alarm signal pins of at least two CPUs 10, and the alarm signal pin of each CPU 10 is connected to one logic device 20. Fig. 4A uses N=32 as an example.
The SMU 40 stores hard partition information, which indicates at least two hard partitions 50. Each hard partition 50 includes at least one logic device 20 and at least two CPUs 10 connected to the logic device 20.
At least one hard partition 50 includes at least two logic devices 20; the at least two logic devices 20 belonging to the same hard partition are interconnected, and one of the logic devices 20 in that hard partition is connected to the SMU 40.
Fig. 4A uses 4 hard partitions 50 as an example. Each hard partition 50 has 2 logic devices 20 and 8 CPUs 10. Each logic device 20 is connected to the alarm signal pins of at least two CPUs 10, and the alarm signal pin of each CPU 10 is connected to one logic device 20. Specifically, in Fig. 4A each logic device 20 is connected to the alarm signal pins of 4 CPUs 10, and the two logic devices 20 in a hard partition are connected to each other.
Optionally, each logic device 20 contains 4 registers (not shown), each corresponding to one connected CPU 10. Each register records the position of the CPU that produced a fault alarm signal.
The logic device 20 is an FPGA circuit or a CPLD circuit.
In this embodiment, the 32 CPUs are considered to be divided into 4 hard partitions, each containing 8 CPUs and 2 logic devices. The logic devices 20 also ensure that fault alarm signals are transmitted only between the CPUs of the same hard partition.
Optionally, the CPU alarm circuit further includes a BMC 30. The BMC 30 is connected to one of the logic devices 20.
In summary, in the CPU alarm circuit provided by this embodiment, the hard partition information stored in the SMU 40 divides the CPUs into 4 different hard partitions, and each logic device 20 is connected to the fault alarm signals of the CPUs in the same hard partition. This simplifies the circuit design, particularly when the multi-way server is a 32-way, 64-way, or 128-way server, and allows a flexible design across multiple hard partitions. In addition, because registers can be provided in the logic devices 20, the CPU that raised the fault alarm can be accurately identified, whereas once the prior-art alarm bus has been pulled low, it is impossible to determine which CPU raised the alarm.
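The Fig. 4A layout (32 CPUs, 4 hard partitions, 2 logic devices per partition, 4 CPUs per logic device) can be written down as a nested list. The Python sketch below is one hypothetical encoding of the hard partition information; the actual storage format in the SMU and the CPU numbering 0 to 31 are assumptions, not details from the source.

```python
def build_fig4a_partitions(n_cpus=32, cpus_per_device=4, devices_per_partition=2):
    """Return partitions as lists of logic devices, each a list of CPU indices."""
    cpus_per_partition = cpus_per_device * devices_per_partition
    if n_cpus % cpus_per_partition != 0:
        raise ValueError("CPU count does not divide evenly into partitions")
    partitions = []
    next_cpu = 0
    for _ in range(n_cpus // cpus_per_partition):
        devices = []
        for _ in range(devices_per_partition):
            devices.append(list(range(next_cpu, next_cpu + cpus_per_device)))
            next_cpu += cpus_per_device
        partitions.append(devices)
    return partitions


partitions = build_fig4a_partitions()
# partitions[0] holds two logic devices wired to CPUs 0-3 and CPUs 4-7.
```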
In the Fig. 4A embodiment, each hard partition includes 8 CPUs and 2 corresponding logic devices, and each logic device is connected to the alarm signal pins of 4 CPUs. In practice, however, the number of CPUs and the number of logic devices are not necessarily the same in every hard partition, and a logic device is not necessarily connected to the alarm signal pins of 4 CPUs; refer to Fig. 4B.
Refer to Fig. 4B, which shows a schematic structural diagram of a CPU alarm circuit provided by another embodiment of the present invention. As shown in Fig. 4B, the CPU alarm circuit includes N CPUs, logic devices 20, and an SMU 40.
The N CPUs are connected through an interconnect chip (not shown), and each CPU has its own alarm signal pin; that is, the N CPUs have N alarm signal pins in total.
There are at least two logic devices 20, and the number of CPUs, N, is at least four. Each logic device 20 is connected to the alarm signal pins of at least two CPUs, and the alarm signal pin of each CPU is connected to one logic device 20. Fig. 4B uses N=32 as an example.
The SMU 40 stores hard partition information, which indicates at least two hard partitions. Each hard partition includes at least one logic device 20 and at least two CPUs connected to the logic device 20.
At least one hard partition includes at least two logic devices 20; the at least two logic devices 20 belonging to the same hard partition are interconnected, and one of the logic devices 20 in that hard partition is connected to the SMU 40.
Fig. 4B uses 5 hard partitions, hard partition 41 to hard partition 45, as an example.
Hard partition 41 includes 4 CPUs, CPU11 to CPU14, and one logic device 20; hard partition 42 includes 8 CPUs, CPU21 to CPU28, and one logic device 20; hard partition 43 includes 4 CPUs, CPU31 to CPU34, and two logic devices 20; hard partition 44 includes 8 CPUs, CPU41 to CPU48, and two logic devices 20; hard partition 45 includes 8 CPUs, CPU51 to CPU58, and two logic devices 20. Each logic device 20 is connected to the alarm signal pins of at least two CPUs, and the alarm signal pin of each CPU is connected to one logic device 20. Specifically, in Fig. 4B: in hard partition 41, the logic device 20 is connected to the alarm signal pins of 4 CPUs; in hard partition 42, the logic device 20 is connected to the alarm signal pins of 8 CPUs; in hard partition 43, each logic device 20 is connected to the alarm signal pins of 2 CPUs, and the two logic devices 20 are connected to each other; in hard partition 44, each logic device 20 is connected to the alarm signal pins of 4 CPUs, and the two logic devices 20 are connected to each other; in hard partition 45, each logic device 20 is connected to the alarm signal pins of 4 CPUs, and the two logic devices 20 are connected to each other.
Optionally, each logic device 20 contains as many registers (not shown) as the number of CPUs connected to it. That is, in hard partition 41 the logic device 20 contains 4 registers; in hard partition 42 the logic device 20 contains 8 registers; in hard partition 43 each logic device 20 contains 2 registers; and in hard partitions 44 and 45 each logic device 20 contains 4 registers. Each register corresponds to one connected CPU and records the position of the CPU that produced a fault alarm signal.
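The uneven layout of Fig. 4B can be checked with a short Python sketch. The dictionary `FIG_4B_LAYOUT` and the helper functions are hypothetical; the sketch records, for each hard partition, how many CPUs are wired to each of its logic devices, which per the description equals the number of registers in that device.

```python
# Hard partition number -> CPUs wired to each logic device in that partition.
# The register count of a logic device equals its connected CPU count.
FIG_4B_LAYOUT = {
    41: [4],      # one logic device, 4 CPUs (CPU11 to CPU14)
    42: [8],      # one logic device, 8 CPUs (CPU21 to CPU28)
    43: [2, 2],   # two logic devices, 2 CPUs each (CPU31 to CPU34)
    44: [4, 4],   # two logic devices, 4 CPUs each (CPU41 to CPU48)
    45: [4, 4],   # two logic devices, 4 CPUs each (CPU51 to CPU58)
}


def total_cpus(layout):
    """Total number of CPUs across all hard partitions."""
    return sum(sum(devices) for devices in layout.values())


def total_logic_devices(layout):
    """Total number of logic devices across all hard partitions."""
    return sum(len(devices) for devices in layout.values())


def every_device_has_at_least_two_cpus(layout):
    """Check the constraint that each logic device connects to >= 2 CPUs."""
    return all(n >= 2 for devices in layout.values() for n in devices)
```

Summing the entries gives 32 CPUs and 8 logic devices in total, which matches N=32 in Fig. 4B.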
Optionally, there are N/4 logic devices, and each logic device is connected to the alarm signal pins of four CPUs.
The logic device 20 is an FPGA circuit or a CPLD circuit.
In this embodiment, the 32 CPUs are considered to be divided into 5 hard partitions, with the hard partitions containing different numbers of CPUs. The logic devices 20 also ensure that fault alarm signals are transmitted only between the CPUs of the same hard partition.
Optionally, the CPU alarm circuit further includes a BMC 30. The BMC 30 is connected to one of the logic devices 20.
In summary, in the CPU alarm circuit provided by this embodiment, the hard partition information stored in the SMU 40 divides the CPUs into 5 different hard partitions, and each logic device 20 is connected to the fault alarm signals of the CPUs in the same hard partition. This simplifies the circuit design, particularly when the multi-way server is a 32-way, 64-way, or 128-way server, and allows a flexible design across multiple hard partitions. In addition, because registers can be provided in the logic devices 20, the CPU that raised the fault alarm can be accurately identified, whereas once the prior-art alarm bus has been pulled low, it is impossible to determine which CPU raised the alarm.
Refer to Fig. 5, which shows a flowchart of a CPU alarm method provided by one embodiment of the present invention. This embodiment is described using the example in which the method is applied to the CPU alarm circuit shown in Fig. 4A or Fig. 4B. The CPU alarm method includes:
Step 501: the logic device reads hard partition information from the SMU; the hard partition information indicates at least two hard partitions.
The hard partition information is stored in the SMU in advance, and the logic device reads the stored hard partition information directly from the SMU. The hard partition information covers at least two hard partitions; that is, it describes how the multiple CPUs are divided into hard partitions.
Step 502: according to the hard partition information read, the logic device obtains the alarm signal pin information of the CPUs corresponding to the logic device.
After system standby (STB) power-up, the logic device, according to the hard partition information stored on the SMU, configures the pins connecting the logic device and the CPUs as input and output pins. That is, the alarm signal pin of each CPU is configured as an output pin, and the logic device pin connected to that CPU's alarm signal pin is configured as an input pin.
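The pin configuration of step 502 can be sketched as a mapping from pins to directions. In this Python illustration the function `configure_alarm_pins`, the pin naming scheme, and the input format are all hypothetical; they only mirror the statement that CPU alarm signal pins are outputs and the matching logic device pins are inputs.

```python
def configure_alarm_pins(device_to_cpus):
    """Derive pin directions from hard partition wiring information.

    device_to_cpus: mapping of logic device name -> list of CPU names
    wired to it (assumed to be derived from the SMU's partition info).
    """
    directions = {}
    for device, cpus in device_to_cpus.items():
        for cpu in cpus:
            # The CPU drives its alarm signal pin.
            directions[(cpu, "alarm_pin")] = "output"
            # The logic device listens on the pin wired to that CPU.
            directions[(device, f"pin_for_{cpu}")] = "input"
    return directions


config = configure_alarm_pins({"ld0": ["cpu0", "cpu1"]})
```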
Step 503: A CPU generates a fault alarm signal.
During operation of the multi-way server, the fault alarm pin of each CPU is at high level by default.
If a CPU fails, that CPU generates a fault alarm signal and sends it through its alarm signal pin to the logic device connected to it.
Fault alarm signals are of two kinds: fault alarm signals of the unrepairable type and fault alarm signals of the repairable type. Optionally, a fault alarm signal of the unrepairable type is represented by a sustained low-level signal, and a fault alarm signal of the repairable type is represented by a pulse signal 160 ns long.
Step 504: The logic device receives the fault alarm signal of the CPU.
The logic device receives the fault alarm signal generated by the CPU. According to the hard partition information, the logic device arranges the transmission topology of CPU fault alarm signals by hard partition, ensuring that a CPU's fault alarm signal is transmitted only within the same hard partition and does not affect other hard partitions.
Step 505: The logic device detects the type of the fault alarm signal of the CPU.
The logic device detects the type of the CPU's fault alarm signal according to the fault alarm signal received.
This step is the same as step 203; refer to the embodiment of Fig. 2.
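The type detection can be sketched as a comparison of the low-level duration against a preset duration, which is how claim 9 describes it. The threshold value `PRESET_DURATION_NS` below is purely hypothetical, since the text gives only the 160 ns pulse length for the repairable type; the function name and the string labels are likewise assumptions.

```python
REPAIRABLE_PULSE_NS = 160    # pulse length stated in the description
PRESET_DURATION_NS = 1_000   # hypothetical threshold, not given in the source


def classify_alarm(low_duration_ns, preset_ns=PRESET_DURATION_NS):
    """Classify a fault alarm by how long the alarm pin stayed low."""
    if low_duration_ns <= 0:
        raise ValueError("the alarm pin was never driven low")
    # Longer than the preset duration: treated as a sustained low level.
    if low_duration_ns > preset_ns:
        return "unrepairable"
    # Otherwise the signal is taken to be a short pulse.
    return "repairable"
```

In hardware this would more likely be a counter clocked inside the logic device, with the comparison made when the pin returns high or when the counter passes the threshold.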
Step 506: If the type of the fault alarm signal is the unrepairable type, the logic device sends a fault alarm signal of the unrepairable type to all CPUs in the same hard partition as this CPU.
If the logic device detects that the fault alarm signal of this CPU is of the unrepairable type, the logic device sends a fault alarm signal of the unrepairable type to all CPUs in the same hard partition.
That is, the logic device drives the signals on the alarm signal pins of all CPUs in this hard partition to the low level, so that all CPUs in this hard partition enter a hung state.
Step 507: If the type of the fault alarm signal is the repairable type, the logic device sends a fault alarm signal of the repairable type to all CPUs in the same hard partition as this CPU.
The fault alarm signal of the repairable type is used to trigger the other CPUs to perform fault repair.
The logic device sends a 160 ns pulse signal to the other CPUs, triggering all CPUs to start the RAS procedure for fault repair. The fault repair process may perform predetermined operations, such as clearing the registers of each CPU, in an attempt to repair the fault.
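Steps 506 and 507 can be summarized in a short sketch that selects the receiving CPUs strictly from the faulty CPU's own hard partition and then chooses the output signal by alarm type. The function names, the signal labels, and the second partition with CPUs 51 to 58 are assumptions for illustration. The sketch also assumes that the faulty CPU itself is skipped, since its pin is already low.

```python
def alarm_targets(partitions, faulty_cpu):
    """Return the other CPUs in the hard partition that contains faulty_cpu.

    partitions: {partition_id: [cpu ids]} (hypothetical representation).
    """
    for cpus in partitions.values():
        if faulty_cpu in cpus:
            return [cpu for cpu in cpus if cpu != faulty_cpu]
    raise ValueError(f"CPU {faulty_cpu} is not in any hard partition")


def outputs_for(alarm_type, targets):
    """Map each target CPU to the signal the logic device drives on its pin."""
    if alarm_type == "unrepairable":
        signal = "hold_low"       # sustained low level, CPUs hang
    elif alarm_type == "repairable":
        signal = "pulse_160ns"    # short pulse, CPUs start the RAS procedure
    else:
        raise ValueError(f"unknown alarm type: {alarm_type}")
    return {cpu: signal for cpu in targets}


partitions = {1: [41, 42, 43, 44, 45, 46, 47, 48],
              2: [51, 52, 53, 54, 55, 56, 57, 58]}
targets = alarm_targets(partitions, 43)
driven = outputs_for("repairable", targets)
```

No CPU of partition 2 appears in `driven`, which reflects the requirement that an alarm is transmitted only within its own hard partition.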
One point to add: Fig. 4A provides 4 hard partitions, and each hard partition includes two logic devices. If a CPU on the left side fails, the left logic device receives the fault alarm signal. If a fault alarm signal subsequently needs to be sent to all CPUs, the left logic device sends the fault alarm signal to the 4 CPUs connected to itself and, at the same time, sends an instruction to the right logic device, which then sends the fault alarm signal to the other 4 CPUs. That is, the two logic devices can work together.
Another point to add: the BMC can provide a graphical user interface to maintenance personnel. If maintenance personnel need to check which CPU has failed, the BMC can read the registers in the logic device to learn which CPU failed, and display this accordingly in the graphical user interface.
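The register read by the BMC could, for example, be a bitmask with one bit per alarm input pin. The source does not describe the register layout, so the one-bit-per-pin encoding, the function name `faulty_cpus`, and the pin-to-CPU mapping below are assumptions made only to show how a faulty CPU might be located.

```python
def faulty_cpus(register_value, pin_to_cpu):
    """Decode a fault register into the list of CPUs whose alarm bit is set.

    register_value: integer read from the logic device (assumed bitmask).
    pin_to_cpu:     {input_pin_index: cpu_id} for that logic device.
    """
    return [cpu
            for pin, cpu in sorted(pin_to_cpu.items())
            if (register_value >> pin) & 1]


# Left logic device of the example: pins 0..3 wired to CPU41..CPU44.
pin_to_cpu = {0: 41, 1: 42, 2: 43, 3: 44}
located = faulty_cpus(0b0100, pin_to_cpu)
```

With bit 2 set, `located` contains only CPU43, which is the information the graphical user interface would then present.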
In an exemplary embodiment, assume that in Fig. 4B the logic device 20 reads the hard partition information from the SMU 40 and, according to the hard partition information, obtains the alarm signal pin information of the corresponding CPUs. Assume that CPU43 in hard partition 44 generates a fault alarm signal. The left logic device 20 then receives the fault alarm signal of CPU43, and the register in the left logic device 20 that is connected to CPU43 determines the position of CPU43 according to the fault alarm signal.
If the left logic device 20 detects that the fault alarm signal of CPU43 is of the unrepairable type, the left logic device 20 sends a fault alarm signal of the unrepairable type to CPU41, CPU42, and CPU44, which are connected to it. At the same time, the left logic device 20 sends an instruction to the right logic device 20, and the right logic device 20, according to the instruction, sends a fault alarm signal of the unrepairable type to CPU45, CPU46, CPU47, and CPU48, which are connected to it.
If the left logic device 20 detects that the fault alarm signal of CPU43 is of the repairable type, the left logic device 20 sends a fault alarm signal of the repairable type to CPU41, CPU42, and CPU44, which are connected to it. At the same time, the left logic device 20 sends an instruction to the right logic device 20, and the right logic device 20, according to the instruction, sends a fault alarm signal of the repairable type to CPU45, CPU46, CPU47, and CPU48, which are connected to it.
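The cooperation between the two logic devices in this example can be modeled as a small simulation. The class `LogicDevice`, its method names, and the `sent` log are hypothetical; the real devices would exchange the instruction over whatever link connects them, which the text does not detail.

```python
class LogicDevice:
    """Toy model of one logic device in a hard partition with two devices."""

    def __init__(self, name, cpus):
        self.name = name
        self.cpus = list(cpus)
        self.peer = None            # the other logic device in the partition
        self.fault_register = set() # CPUs that raised an alarm on this device
        self.sent = []              # log of (cpu, alarm_type) signals driven

    def on_cpu_alarm(self, cpu, alarm_type):
        """Handle an alarm raised by one of this device's own CPUs."""
        self.fault_register.add(cpu)
        for other in self.cpus:
            if other != cpu:
                self.sent.append((other, alarm_type))
        if self.peer is not None:
            self.peer.on_peer_instruction(alarm_type)

    def on_peer_instruction(self, alarm_type):
        """Forward the alarm to all local CPUs; never echo back to the peer."""
        for cpu in self.cpus:
            self.sent.append((cpu, alarm_type))


left = LogicDevice("left", [41, 42, 43, 44])
right = LogicDevice("right", [45, 46, 47, 48])
left.peer, right.peer = right, left
left.on_cpu_alarm(43, "unrepairable")
```

After the call, the left device has signaled CPU41, CPU42, and CPU44, the right device has signaled CPU45 to CPU48, and only the left device's register records CPU43 as the source.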
In summary, in the CPU alarm method provided by this embodiment, hard partition information is configured through the SMU, and the CPUs are connected to the logic devices according to the hard partition information. A CPU sends a fault alarm signal to the logic device through its alarm signal pin, and the logic device detects the type of the fault alarm signal received. If the fault alarm signal of the CPU is of the unrepairable type, the logic device sends a fault alarm signal of the unrepairable type to all CPUs in the same hard partition; if the fault alarm signal of the CPU is of the repairable type, the logic device sends a fault alarm signal of the repairable type to all CPUs in the same hard partition. This solves the prior-art problem of complex circuit design when multiple CPUs are connected to the same alarm bus. In a multi-way server system, the position of the CPU that generated the fault alarm signal can be located quickly, and flexible hard partitioning is achieved at the same time.
Refer to Fig. 6, which shows a schematic structural diagram of a multi-way server provided by an embodiment of the present invention. As shown in Fig. 6, the multi-way server 600 includes a CPU alarm circuit 610.
The CPU alarm circuit 610 is any one of the CPU alarm circuits shown in Fig. 1A, Fig. 1B, Fig. 4A, or Fig. 4B.
The embodiments of the present invention are described for illustration only and do not indicate the relative merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A CPU alarm circuit, characterized in that it is used in a multi-way server, the circuit comprising:
N CPUs connected through interconnect chips, each CPU including its own alarm signal pin, where N is a power of 4;
a logic device;
wherein the alarm signal pin of each CPU is connected to the logic device.
2. The circuit according to claim 1, characterized in that there are at least two logic devices and at least four CPUs;
each logic device is connected to the alarm signal pins of at least two CPUs, and the alarm signal pin of each CPU is connected to one logic device;
the CPU alarm circuit further comprises a System Management Unit (SMU);
the SMU stores hard partition information;
the hard partition information indicates at least two hard partitions, each hard partition including at least one logic device and the CPUs connected to that logic device;
the SMU is further connected to the logic device in each hard partition.
3. The circuit according to claim 2, characterized in that
at least one hard partition includes at least two logic devices;
the at least two logic devices belonging to the same hard partition are connected to each other, and one of the logic devices in that hard partition is connected to the SMU.
4. The circuit according to claim 2 or 3, characterized in that there are N/4 logic devices, and each logic device is connected to the alarm signal pins of four CPUs.
5. The circuit according to claim 2 or 3, characterized in that each hard partition further includes a baseboard management controller (BMC);
the BMC is connected to the logic device belonging to the same hard partition.
6. The circuit according to any one of claims 1 to 5, characterized in that the logic device is a field-programmable gate array (FPGA) circuit or a complex programmable logic device (CPLD) circuit.
7. A multi-way server, characterized in that the multi-way server includes the CPU alarm circuit according to any one of claims 1 to 6.
8. A CPU alarm method, characterized in that it is applied to the CPU alarm circuit according to any one of claims 1 to 6, the method comprising:
the logic device receiving a fault alarm signal of the CPU;
the logic device detecting the type of the fault alarm signal of the CPU;
if the type of the fault alarm signal is the unrepairable type, the logic device sending the fault alarm signal of the unrepairable type to all CPUs in the same hard partition as the CPU;
if the type of the fault alarm signal is the repairable type, the logic device sending the fault alarm signal of the repairable type to all CPUs in the same hard partition as the CPU, the fault alarm signal of the repairable type being used to trigger the other CPUs to perform fault repair.
9. The method according to claim 8, characterized in that the logic device detecting the type of the fault alarm signal of the CPU comprises:
the logic device detecting whether the duration of the fault alarm signal of the CPU exceeds a preset duration;
if the preset duration is exceeded, determining that the type of the fault alarm signal is the unrepairable type;
if the preset duration is not exceeded, determining that the type of the fault alarm signal is the repairable type.
10. The method according to claim 8, characterized in that the CPU alarm circuit is the CPU alarm circuit according to any one of claims 2 to 5,
and before the logic device receives the fault alarm signal of the CPU, the method further comprises:
the logic device reading hard partition information from the SMU, the hard partition information indicating at least two hard partitions.
CN201510485887.2A 2015-08-10 2015-08-10 CPU alarm circuit and alarm method Active CN106446311B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510485887.2A CN106446311B (en) 2015-08-10 2015-08-10 CPU alarm circuit and alarm method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510485887.2A CN106446311B (en) 2015-08-10 2015-08-10 CPU alarm circuit and alarm method

Publications (2)

Publication Number Publication Date
CN106446311A true CN106446311A (en) 2017-02-22
CN106446311B CN106446311B (en) 2019-09-13

Family

ID=58093501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510485887.2A Active CN106446311B (en) 2015-08-10 2015-08-10 CPU alarm circuit and alarm method

Country Status (1)

Country Link
CN (1) CN106446311B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120331041A1 (en) * 2011-06-27 2012-12-27 Hyun-Sung Shin Lookup table logic apparatus and server communicating with the same
CN102880583A (en) * 2012-08-01 2013-01-16 浪潮(北京)电子信息产业有限公司 Device and method for configuring dynamic link of multi-way server
CN103733180A (en) * 2013-09-29 2014-04-16 华为技术有限公司 Server control method and control device
CN104503947A (en) * 2014-12-16 2015-04-08 华为技术有限公司 Multi-server and signal processing method thereof

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108287774A (en) * 2018-02-28 2018-07-17 郑州云海信息技术有限公司 A kind of method for diagnosing faults of server, device, equipment and storage medium
CN109101009A (en) * 2018-09-06 2018-12-28 华为技术有限公司 Fault diagnosis system and server
WO2020048174A1 (en) * 2018-09-06 2020-03-12 华为技术有限公司 Fault diagnosis system and server
CN109101009B (en) * 2018-09-06 2020-08-14 华为技术有限公司 Fault diagnosis system and server
EP3835903A4 (en) * 2018-09-06 2021-10-13 Huawei Technologies Co., Ltd. Fault diagnosis system and server
US11347611B2 (en) 2018-09-06 2022-05-31 Xfusion Digital Technologies Co., Ltd. Fault diagnosis system and server
CN109302470A (en) * 2018-09-26 2019-02-01 郑州云海信息技术有限公司 A kind of road N server interacted system

Also Published As

Publication number Publication date
CN106446311B (en) 2019-09-13

Similar Documents

Publication Publication Date Title
CN107463456A (en) A kind of system and method for lifting double netcard NCSI management system switching efficiencies
CN104796273A (en) Method and device for diagnosing root of network faults
CN109491946A (en) A kind of chip and method for I2C bus extension
CN105807722B (en) Possesses the numerical control system of internal register runback bit function
CN105868149A (en) A serial port information transmission method and device
CN106446311A (en) CPU alarm circuit and alarm method
CN109586959A (en) A kind of method and device of fault detection
CN104283718B (en) The network equipment and the hardware fault diagnosis method for the network equipment
CN106547655B (en) The method and system of memory bar quantity on circuit for detecting plate
CN109101009A (en) Fault diagnosis system and server
CN104503947B (en) Multipath server and its signal processing method
CN108683528A (en) A kind of data transmission method, central server, server and data transmission system
CN105549696A (en) Rack-mounted server system with case management function
CN108334060B (en) A kind of bus failure injection device
CN106502944A (en) The heartbeat detecting method of computer, PCIE device and PCIE device
CN105119765B (en) A kind of Intelligent treatment fault system framework
US6330694B1 (en) Fault tolerant system and method utilizing the peripheral components interconnection bus monitoring card
CN107729173A (en) A kind of redriver parameter configuration monitoring methods for server
CN106100941A (en) Method and device based on distributed system test board intercard communication reliability
CN101686119A (en) Method, device and system of communication between single boards
CN105763366A (en) Method and device for realizing data communication based on aggregation link
US20070180329A1 (en) Method of latent fault checking a management network
CN103885441A (en) Self-adaptive fault diagnosis method for controller local area network
CN103763170B (en) Looped network protecting method and device
CN104656478B (en) The control circuit and control method of a kind of multi-power module

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200422

Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee after: HUAWEI TECHNOLOGIES Co.,Ltd.

Address before: 301, A building, room 3, building 301, foreshore Road, No. 310052, Binjiang District, Zhejiang, Hangzhou

Patentee before: Huawei Technologies Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20211222

Address after: 450046 Floor 9, building 1, Zhengshang Boya Plaza, Longzihu wisdom Island, Zhengdong New Area, Zhengzhou City, Henan Province

Patentee after: Super fusion Digital Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.
