CN109726055A - Method and computer device for detecting PCIe chip abnormality - Google Patents

Method and computer device for detecting PCIe chip abnormality

Info

Publication number
CN109726055A
Authority
CN
China
Legal status
Granted
Application number
CN201711044736.9A
Other languages
Chinese (zh)
Other versions
CN109726055B (en)
Inventor
柴峰
陈加怀
李道宏
Current Assignee
XFusion Digital Technologies Co Ltd
Original Assignee
Hangzhou Huawei Digital Technologies Co Ltd
Application filed by Hangzhou Huawei Digital Technologies Co Ltd
Priority to CN201711044736.9A
Publication of CN109726055A
Application granted
Publication of CN109726055B
Legal status: Active

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

A method and a computer device for detecting an abnormality of a PCIe chip. The computer device includes a CPU, a PCIe chip coupled to the CPU, and a CPLD coupled to the PCIe chip. A heartbeat signal pin of the PCIe chip is electrically connected to a signal detection pin of the CPLD; the PCIe chip outputs a heartbeat signal to the CPLD through the heartbeat signal pin, and the heartbeat signal indicates whether the PCIe chip is working normally. The CPLD determines, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip is working normally, and powers down the PCIe chip when it determines that the PCIe chip is working abnormally. In the embodiments of this application, whether the PCIe chip is working normally is detected in hardware, and when an abnormality is detected, the link between the PCIe chip and the CPU is also disconnected in hardware. This speeds up abnormality detection and makes link disconnection more timely.

Description

Method and computer device for detecting PCIe chip abnormality
Technical Field
The embodiments of this application relate to the field of communications technologies, and in particular to a method and a computer device for detecting an abnormality of a Peripheral Component Interconnect Express (PCIe) chip.
Background
PCIe is a high-performance, high-bandwidth serial interconnect standard. PCIe technology is widely used in computer devices such as PCs, servers, and data centers.
A PCIe chip is a chip that supports the PCIe standard, such as a PCIe switch chip or a non-transparent bridge (Non-Transparent Bridge, NTB) chip. A PCIe chip is usually connected to the central processing unit (Central Processing Unit, CPU) of a computer device and forwards the messages issued by the CPU. During operation, a PCIe chip may work abnormally because of factors such as code execution errors, heavy message load, or aging of its physical characteristics.
When the PCIe chip works abnormally, it can no longer process the messages issued by the CPU. If the PCIe link between the CPU and the PCIe chip is not disconnected at this point, the CPU, unaware that the PCIe chip is working abnormally, keeps waiting for the PCIe chip to return responses to those messages. Subsequent messages destined for the PCIe chip keep accumulating in the CPU's buffer until the buffer overflows and the CPU reports an error.
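The failure mode described above can be illustrated with a minimal sketch. The function name, the capacity, and the rule that a healthy chip drains each message immediately are all assumptions made for illustration; the patent does not specify the CPU's buffer size or queueing behavior.

```python
from typing import Optional


def first_overflowing_message(capacity: int,
                              num_messages: int,
                              chip_hangs_at: int) -> Optional[int]:
    """Return the index of the first message that overflows the CPU buffer.

    Hypothetical model: while the chip is healthy, each message is answered
    at once and nothing stays queued. From message index `chip_hangs_at`
    onward the chip never answers and the link stays up, so every message
    remains pending in the buffer.
    """
    pending = 0
    for index in range(num_messages):
        if index < chip_hangs_at:
            continue  # healthy chip: message handled, nothing queued
        pending += 1  # hung chip: message waits for a response forever
        if pending > capacity:
            return index  # buffer overflow: the CPU reports an error here
    return None  # the buffer never overflowed


if __name__ == "__main__":
    print(first_overflowing_message(capacity=4, num_messages=100, chip_hangs_at=10))
```

With a four-entry buffer and a chip that hangs at message 10, messages 10 to 13 fill the buffer and message 14 is the first to overflow it.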
In the prior art, the CPU uses software to detect whether the PCIe chip is working normally and, on detecting abnormal operation, uses software to disconnect the PCIe link between the CPU and the PCIe chip. Specifically, a software program on the CPU accesses the PCIe chip by polling; when the number of consecutive failed accesses reaches a preset count (for example, 3), the software program disconnects the PCIe link between the CPU and the PCIe chip.
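The prior-art polling scheme can be sketched as follows. This is only an illustration: `access_chip` is a hypothetical callable standing in for whatever access the software program performs, and the bound `max_polls` is added so the sketch terminates.

```python
from typing import Callable, Optional


def poll_until_link_down(access_chip: Callable[[], bool],
                         max_consecutive_failures: int = 3,
                         max_polls: int = 100) -> Optional[int]:
    """Return the 1-based poll number at which the link would be disconnected.

    `access_chip` returns True when the access succeeds. The failure counter
    resets on every success, so only consecutive failures count.
    """
    failures = 0
    for poll in range(1, max_polls + 1):
        if access_chip():
            failures = 0
        else:
            failures += 1
            if failures >= max_consecutive_failures:
                return poll  # software would disconnect the PCIe link here
    return None  # the chip never failed often enough in a row


if __name__ == "__main__":
    responses = iter([True, False, True, False, False, False])
    print(poll_until_link_down(lambda: next(responses)))
```

In this run the isolated failure at poll 2 is forgiven, and the link is dropped only at poll 6, after three failures in a row. Detection latency is therefore at least the preset count multiplied by the polling interval.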
Because the prior art relies on software both to detect the PCIe chip abnormality and to perform the link disconnection, detection is slow and the link is not disconnected promptly.
Summary
The embodiments of this application provide a method and a computer device for detecting an abnormality of a PCIe chip, which can be used to solve the problems of slow detection and untimely link disconnection that arise in the prior art, where PCIe chip abnormalities are detected and the link is disconnected by software.
In one aspect, an embodiment of this application provides a computer device. The computer device includes a CPU, a PCIe chip coupled to the CPU, and a complex programmable logic device (Complex Programmable Logic Device, CPLD) coupled to the PCIe chip. A heartbeat signal pin of the PCIe chip is electrically connected to a signal detection pin of the CPLD. The PCIe chip outputs a heartbeat signal to the CPLD through the heartbeat signal pin, and the heartbeat signal indicates whether the PCIe chip is working normally. The CPLD is configured to determine, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip is working normally, and to power down the PCIe chip when it determines that the PCIe chip is working abnormally.
In the solution provided by the embodiments of this application, the CPLD detects whether the PCIe chip is working normally according to the heartbeat signal output by the PCIe chip and, on detecting abnormal operation, powers down the PCIe chip so that the PCIe link between the PCIe chip and the CPU is disconnected. This implements both abnormality detection and link disconnection for the PCIe chip. Owing to its own characteristics, the CPLD processes faster than software and is not affected when the CPU becomes sluggish or even hangs because issued messages are blocked, so a PCIe chip abnormality can be detected quickly. In addition, disconnecting the PCIe link between the PCIe chip and the CPU by powering down the PCIe chip makes the disconnection faster and more timely.
In a possible design, the heartbeat signal is a square-wave signal of a preset frequency. The CPLD is further configured to determine that the PCIe chip is working abnormally when the duration of the high level or the duration of the low level of the heartbeat signal exceeds a preset threshold.
Using a square-wave signal as the heartbeat signal lets the CPLD distinguish changes in the form of the heartbeat signal simply and clearly, which helps it determine quickly and accurately whether the PCIe chip is working normally.
In a possible design, the CPLD is further configured to output a power-down signal to a power control chip when it determines that the PCIe chip is working abnormally, and the power control chip is configured to stop supplying power to the PCIe chip according to the power-down signal.
In a possible design, the CPLD is further configured to, after powering down the PCIe chip, power the PCIe chip up again according to a power-up signal if it receives the power-up signal output by the CPU.
In a possible design, the PCIe chip is a PCIe switch chip; alternatively, the PCIe chip is an NTB chip.
In another aspect, an embodiment of this application provides a method for detecting an abnormality of a PCIe chip. The method is applied to the CPLD of a computer device. The computer device includes a CPU, a PCIe chip coupled to the CPU, and a CPLD coupled to the PCIe chip. A heartbeat signal pin of the PCIe chip is electrically connected to a signal detection pin of the CPLD. The PCIe chip outputs a heartbeat signal to the CPLD through the heartbeat signal pin, and the heartbeat signal indicates whether the PCIe chip is working normally.
The method includes: the CPLD determines, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip is working normally; and when it determines that the PCIe chip is working abnormally, the CPLD powers down the PCIe chip.
In a possible design, the heartbeat signal is a square-wave signal of a preset frequency. The CPLD determining, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip is working normally includes: when the duration of the high level or the duration of the low level of the heartbeat signal exceeds a preset threshold, the CPLD determines that the PCIe chip is working abnormally.
In a possible design, the CPLD powering down the PCIe chip includes: the CPLD outputs a power-down signal to a power control chip, and the power control chip stops supplying power to the PCIe chip according to the power-down signal.
In a possible design, after the CPLD powers down the PCIe chip, the method further includes: when the CPLD receives a power-up signal output by the CPU, the CPLD powers the PCIe chip up again according to the power-up signal.
Compared with the prior art, in the solution provided by the embodiments of this application, the CPLD detects whether the PCIe chip is working normally according to the heartbeat signal output by the PCIe chip and, on detecting abnormal operation, powers down the PCIe chip so that the PCIe link between the PCIe chip and the CPU is disconnected. This implements both abnormality detection and link disconnection for the PCIe chip. Owing to its own characteristics, the CPLD processes faster than software and is not affected when the CPU becomes sluggish or even hangs because issued messages are blocked, so a PCIe chip abnormality can be detected quickly. In addition, disconnecting the PCIe link between the PCIe chip and the CPU by powering down the PCIe chip makes the disconnection faster and more timely.
Brief Description of the Drawings
Fig. 1 is a block diagram of a computer device provided by an embodiment of this application;
Fig. 2A and Fig. 2B are schematic diagrams of a heartbeat signal provided by an embodiment of this application;
Fig. 3 is a schematic diagram of an application scenario provided by an embodiment of this application;
Fig. 4 is a schematic diagram of another application scenario provided by an embodiment of this application;
Fig. 5 is a flowchart of a method for detecting an abnormality of a PCIe chip provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described in further detail below with reference to the accompanying drawings.
In the technical solution provided by the embodiments of this application, whether the PCIe chip is working normally is detected in hardware, and when the PCIe chip is found to be working abnormally, the PCIe link between the PCIe chip and the CPU is also disconnected in hardware. This speeds up abnormality detection and makes link disconnection more timely.
The embodiments of this application are described in further detail below, based on the common aspects described above.
Fig. 1 shows a block diagram of a computer device 10 provided by an embodiment of this application. The computer device 10 includes a CPU 11, a PCIe chip 12, and a CPLD 13.
The computer device 10 may be any electronic device that supports PCIe technology, such as a PC, a server, or a data center.
The CPU 11 is the computing core and control core of the computer device 10. Its main function is to interpret computer instructions and process the data in computer software.
The PCIe chip 12 is a chip that supports the PCIe standard, such as a PCIe switch chip, an NTB chip, or a PCIe network interface card. A PCIe link is established between the CPU 11 and the PCIe chip 12. The CPU 11 sends messages to the PCIe chip 12 over this link, and the PCIe chip 12 performs subsequent processing (such as forwarding or storing) on the messages it receives.
The CPLD 13 detects whether the PCIe chip 12 is working normally and, on detecting that the PCIe chip 12 is working abnormally, disconnects the link between the PCIe chip 12 and the CPU 11.
In this embodiment, as shown in Fig. 1, the CPU 11 is coupled to the PCIe chip 12, and the PCIe chip 12 is coupled to the CPLD 13.
The PCIe chip 12 has a heartbeat signal pin, and the CPLD 13 has a signal detection pin. The heartbeat signal pin of the PCIe chip 12 is electrically connected to the signal detection pin of the CPLD 13. The PCIe chip 12 outputs a heartbeat signal to the CPLD 13 through the heartbeat signal pin, and the CPLD 13 correspondingly receives the heartbeat signal output by the PCIe chip 12 through the signal detection pin.
The heartbeat signal indicates whether the PCIe chip 12 is working normally. The heartbeat signal may be a square-wave signal that alternates between high and low levels. When the PCIe chip 12 is working normally, it outputs a heartbeat signal of a first form to the CPLD 13; when the PCIe chip 12 is working abnormally, it outputs a heartbeat signal of a second form to the CPLD 13; the first form differs from the second form. The CPLD 13 can determine whether the PCIe chip 12 is working normally from the form of the heartbeat signal it receives. For example, when the PCIe chip 12 is working normally, the heartbeat signal of the first form that it outputs to the CPLD 13 is a square wave of alternating high and low levels. If the PCIe chip 12 becomes abnormal while it is outputting a high level, it keeps outputting the high level and no longer switches to the low level; if it becomes abnormal while it is outputting a low level, it keeps outputting the low level and no longer switches to the high level. Therefore, when the PCIe chip 12 is working abnormally, the heartbeat signal of the second form that it outputs to the CPLD 13 may be either a sustained high level or a sustained low level.
In one example, the heartbeat signal is a square-wave signal of a preset frequency. Using a square-wave signal as the heartbeat signal lets the CPLD 13 distinguish changes in the form of the heartbeat signal simply and clearly, which helps it determine quickly and accurately whether the PCIe chip 12 is working normally. The preset frequency may be an empirical value set in advance according to the actual situation, for example 10 Hz (hertz). In practice, the value of the preset frequency can be determined according to the frequency at which messages are sent between the CPU 11 and the PCIe chip 12, or according to how quickly an abnormality of the PCIe chip 12 needs to be detected. For example, when messages are sent frequently between the CPU 11 and the PCIe chip 12, an abnormality of the PCIe chip 12 must be detected sooner to avoid message blocking, so a somewhat larger preset frequency can be chosen.
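To make the square-wave heartbeat concrete, the sketch below generates the samples a CPLD might see on its detection pin. It assumes a 50% duty cycle and a hypothetical CPLD sampling clock, neither of which is stated in the patent; under that assumption a 10 Hz heartbeat holds each level for 0.05 s.

```python
from typing import List


def square_wave(frequency_hz: int, sample_rate_hz: int, num_samples: int) -> List[int]:
    """Sample an ideal 50%-duty square wave, starting at the high level.

    `sample_rate_hz` is a hypothetical CPLD sampling clock. Integer
    arithmetic keeps the level boundaries exact.
    """
    samples_per_half_period = sample_rate_hz // (2 * frequency_hz)
    if samples_per_half_period < 1:
        raise ValueError("sample rate too low for this heartbeat frequency")
    return [1 if (i // samples_per_half_period) % 2 == 0 else 0
            for i in range(num_samples)]


if __name__ == "__main__":
    wave = square_wave(frequency_hz=10, sample_rate_hz=1000, num_samples=200)
    print(wave[:50] == [1] * 50, wave[50:100] == [0] * 50)
```

Sampling a 10 Hz heartbeat at 1 kHz yields runs of 50 identical samples, so a healthy signal never holds one level for longer than 50 ticks. Raising the preset frequency shortens these runs, which is what allows a smaller detection threshold.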
In addition, whether the PCIe chip 12 is working normally can be judged by whether it can normally process the messages issued by the CPU 11. When the PCIe chip 12 can normally process the messages issued by the CPU 11, it is considered to be working normally; when it cannot, it is considered to be working abnormally.
In this embodiment, the CPLD 13 is configured to determine, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip 12 is working normally. For example, when the heartbeat signal obtained on the signal detection pin has the first form, the CPLD 13 determines that the PCIe chip 12 is working normally; when the heartbeat signal obtained on the signal detection pin has the second form, the CPLD 13 determines that the PCIe chip 12 is working abnormally.
Optionally, when the heartbeat signal is a square-wave signal of a preset frequency, the CPLD 13 is further configured to determine that the PCIe chip 12 is working abnormally when the duration of the high level or the duration of the low level of the heartbeat signal exceeds a preset threshold. The preset threshold may be an empirical value set in advance according to the actual situation; for example, it can be set according to the value of the preset frequency. For instance, when the preset frequency is 10 Hz, the preset threshold is 1 second or 0.5 seconds. Refer to Fig. 2A and Fig. 2B together. Fig. 2A is a schematic diagram of the heartbeat signal of the first form output when the PCIe chip 12 is working normally; this signal switches continuously between high and low levels. Fig. 2B is a schematic diagram of the heartbeat signal of the second form output when the PCIe chip 12 is working abnormally; this signal may be a sustained high level (part (a) of Fig. 2B) or a sustained low level (part (b) of Fig. 2B).
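The threshold rule above can be sketched as a run-length check over sampled levels. This is a software illustration under stated assumptions, not the CPLD logic itself: the sample list stands in for the detection pin, and the threshold is expressed in sample ticks of a hypothetical CPLD clock.

```python
from typing import Iterable, Optional


def first_fault_tick(samples: Iterable[int], threshold_ticks: int) -> Optional[int]:
    """Return the sample index at which a stuck level is first detected.

    A fault is declared as soon as the same level (high or low) has been
    held for more than `threshold_ticks` consecutive samples.
    """
    current_level = None
    run_length = 0
    for tick, level in enumerate(samples):
        if level == current_level:
            run_length += 1
        else:
            current_level = level  # an edge restarts the duration counter
            run_length = 1
        if run_length > threshold_ticks:
            return tick
    return None  # the signal kept toggling within the threshold


if __name__ == "__main__":
    healthy = [1, 1, 0, 0] * 5
    stuck_high = [1, 1, 0, 0, 1, 1, 1, 1, 1, 1]
    print(first_fault_tick(healthy, 3), first_fault_tick(stuck_high, 3))
```

The same counter handles both cases shown in Fig. 2B, because it tracks how long the current level has been held regardless of which level it is.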
The CPLD 13 is further configured to power down the PCIe chip 12 when it determines that the PCIe chip 12 is working abnormally. In this embodiment, when the CPLD 13 determines that the PCIe chip 12 is working abnormally, it immediately powers down the PCIe chip 12. Because the PCIe chip 12 has been powered down, the PCIe link between the PCIe chip 12 and the CPU 11 is disconnected as well. Triggering the disconnection by powering down the PCIe chip 12 makes the disconnection faster.
In one example, the CPLD 13 is further configured to output a power-down signal to a power control chip when it determines that the PCIe chip 12 is working abnormally. The power-down signal instructs the power control chip to stop supplying power to the PCIe chip 12. Optionally, the power-down signal may be a high-level signal, a low-level signal, or a combination of high and low levels. The power control chip stops supplying power to the PCIe chip 12 according to the power-down signal; for example, after receiving the power-down signal, the power control chip no longer outputs a high level to the PCIe chip 12 and thereby stops powering it.
Optionally, after detecting that the link to the PCIe chip 12 has been disconnected, the CPU 11 may execute an Advanced Error Reporting (AER) repair process. The CPU 11 sends a power-up signal to the CPLD 13; the power-up signal triggers the CPLD 13 to power the PCIe chip 12 up again. Optionally, the CPU 11 sends the power-up signal to the CPLD 13 by writing a register: the CPU 11 writes a value into a register of the CPLD 13, where the value indicates that the PCIe chip 12 is to be powered up again, and the CPLD 13 powers the PCIe chip 12 up again after reading the value written into its register. The CPLD 13 is further configured to, after powering down the PCIe chip 12, power the PCIe chip 12 up again according to the power-up signal if it receives the power-up signal output by the CPU 11. For example, after receiving the power-up signal, the CPLD 13 outputs a power-up signal to the power control chip; after receiving that signal, the power control chip again provides a high level to the PCIe chip 12 and thereby resumes powering it. After being powered up, the PCIe chip 12 attempts to re-establish the PCIe link by executing the link establishment procedure with the CPU 11. If the fault that caused the PCIe chip 12 to work abnormally has been cleared, the PCIe link is re-established successfully. If the fault has not been cleared (for example, the PCIe chip 12 has failed completely), re-establishing the PCIe link fails, but the fault no longer affects the CPU 11.
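The power-down and register-triggered power-up sequence can be modeled with a small sketch. The class names, the method names, and the register value `POWER_ON_COMMAND` are all hypothetical; the patent says only that the CPU writes a value into a CPLD register and does not give the value or the register layout.

```python
POWER_ON_COMMAND = 0x1  # hypothetical register value meaning "power up again"


class PowerControlChip:
    """Hypothetical model of the power control chip feeding the PCIe chip."""

    def __init__(self) -> None:
        self.rail_enabled = True

    def power_off(self) -> None:
        self.rail_enabled = False

    def power_on(self) -> None:
        self.rail_enabled = True


class Cpld:
    """Hypothetical model of the CPLD's power-control behavior."""

    def __init__(self, power_chip: PowerControlChip) -> None:
        self.power_chip = power_chip
        self.power_on_register = 0

    def report_fault(self) -> None:
        # Abnormal heartbeat: send the power-down signal.
        self.power_chip.power_off()

    def write_register(self, value: int) -> None:
        # CPU side: request a power-up by writing the register.
        self.power_on_register = value

    def service_register(self) -> bool:
        # CPLD side: act on the register; return True if power was restored.
        if self.power_on_register == POWER_ON_COMMAND and not self.power_chip.rail_enabled:
            self.power_chip.power_on()
            self.power_on_register = 0  # consume the request
            return True
        return False
```

In this model the CPLD restores power only when the CPU has explicitly asked for it, which mirrors the text: power-down is automatic, while power-up waits for the CPU's decision.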
The technical solution provided by the embodiments of this application is described below with reference to two application scenarios.
In one exemplary scenario, as shown in Fig. 3, the PCIe chip 12 is a PCIe switch chip 121. The PCIe switch chip 121 interconnects the CPU 11 with multiple PCIe devices. The PCIe switch chip 121 includes one input port and N output ports, where N is a positive integer. The CPU 11 is electrically connected to the input port, for example through a PCIe bus. Each of the N output ports is electrically connected to one PCIe device, for example through a PCIe bus. The PCIe switch chip 121 thus enables the CPU 11 to communicate with multiple PCIe devices at the same time.
When it starts working normally, the PCIe switch chip 121 sends a notification to the CPLD 13; the notification instructs the CPLD 13 to enable heartbeat detection. After enabling heartbeat detection, the CPLD 13 receives the heartbeat signal output by the PCIe switch chip 121 through the signal detection pin and determines from the heartbeat signal whether the PCIe switch chip 121 is working normally. When it determines that the PCIe switch chip 121 is working abnormally, the CPLD 13 powers down the PCIe switch chip 121, so that the PCIe link between the PCIe switch chip 121 and the CPU 11 is disconnected.
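The role of the start notification can be sketched as a gate in front of the duration check. The class and method names are hypothetical, and the form of the notification is not specified in the patent. The point of the gate is that a pin which sits idle while the switch chip is still starting up should not be judged as a stuck heartbeat.

```python
class GatedHeartbeatMonitor:
    """Hypothetical heartbeat monitor that ignores the pin until enabled."""

    def __init__(self, threshold_ticks: int) -> None:
        self.threshold_ticks = threshold_ticks
        self.enabled = False
        self._level = None
        self._run_length = 0

    def notify_started(self) -> None:
        # Called when the chip reports that it has started working normally.
        self.enabled = True
        self._level = None
        self._run_length = 0

    def sample(self, level: int) -> bool:
        """Feed one sampled level; return True if the chip should be powered down."""
        if not self.enabled:
            return False  # detection not enabled yet: never report a fault
        if level == self._level:
            self._run_length += 1
        else:
            self._level = level
            self._run_length = 1
        return self._run_length > self.threshold_ticks
```

Before `notify_started()` is called, any number of identical samples is ignored; afterwards, a level held for more than `threshold_ticks` samples returns True.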
In another exemplary scenario, as shown in Fig. 4, the PCIe chip 12 is an NTB chip 122. NTB chips 122 are typically used for data synchronization in dual-controller systems. As shown in Fig. 4, computer device 10 (a) and computer device 10 (b) form a dual-controller system. Computer device 10 (a) includes CPU 11 (a), NTB chip 122 (a), and CPLD 13 (a); computer device 10 (b) includes CPU 11 (b), NTB chip 122 (b), and CPLD 13 (b).
CPLD 13 (a) receives the heartbeat signal output by NTB chip 122 (a) through its signal detection pin and determines from that heartbeat signal whether NTB chip 122 (a) is working normally. CPLD 13 (b) receives the heartbeat signal output by NTB chip 122 (b) through its signal detection pin and determines from that heartbeat signal whether NTB chip 122 (b) is working normally. Suppose CPLD 13 (a) determines that NTB chip 122 (a) is working abnormally. CPLD 13 (a) then powers down NTB chip 122 (a), so that the PCIe link between NTB chip 122 (a) and CPU 11 (a) is disconnected and the PCIe link between NTB chip 122 (a) and NTB chip 122 (b) is disconnected as well, which prevents CPU 11 (b) from also becoming abnormal.
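The isolation effect in the dual-controller scenario can be illustrated with a short sketch. The link names and the rule that a link is up only while every chip on it is powered are simplifying assumptions made for illustration; CPU power states are not modeled.

```python
from typing import Dict


def links_up(ntb_powered: Dict[str, bool]) -> Dict[str, bool]:
    """Map each PCIe link in the Fig. 4 topology to whether it is up.

    `ntb_powered` holds the power state of NTB chips "a" and "b".
    Assumption: a link is up only if every NTB chip on it is powered.
    """
    return {
        "cpu_a<->ntb_a": ntb_powered["a"],
        "cpu_b<->ntb_b": ntb_powered["b"],
        "ntb_a<->ntb_b": ntb_powered["a"] and ntb_powered["b"],
    }


if __name__ == "__main__":
    # CPLD 13 (a) has powered down NTB chip 122 (a); controller (b) is untouched.
    print(links_up({"a": False, "b": True}))
```

Powering down NTB chip 122 (a) takes down both links that pass through it, while the link between CPU 11 (b) and NTB chip 122 (b) stays up.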
In the solution provided by the embodiments of this application, the CPLD 13 detects whether the PCIe chip 12 is working normally according to the heartbeat signal output by the PCIe chip 12 and, on detecting abnormal operation, powers down the PCIe chip 12 so that the PCIe link between the PCIe chip 12 and the CPU 11 is disconnected. This implements both abnormality detection and link disconnection for the PCIe chip 12. Owing to its own characteristics, the CPLD 13 processes faster than software and is not affected when the CPU 11 becomes sluggish or even hangs because issued messages are blocked, so an abnormality of the PCIe chip 12 can be detected quickly. In addition, disconnecting the PCIe link between the PCIe chip 12 and the CPU 11 by powering down the PCIe chip 12 makes the disconnection faster and more timely. The technical solution provided by the embodiments of this application improves the reliability and maintainability of managing the PCIe chip 12 and reduces fault time.
Fig. 5 shows a flowchart of a method for detecting an abnormality of a PCIe chip provided by an embodiment of this application. The method can be applied to the CPLD 13 provided in the embodiment of Fig. 1. The method may include the following steps:
Step 501: the CPLD 13 determines, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip 12 is working normally.
The PCIe chip 12 has a heartbeat signal pin, and the CPLD 13 has a signal detection pin. The heartbeat signal pin of the PCIe chip 12 is electrically connected to the signal detection pin of the CPLD 13. The PCIe chip 12 outputs a heartbeat signal to the CPLD 13 through the heartbeat signal pin, and the CPLD 13 correspondingly receives the heartbeat signal output by the PCIe chip 12 through the signal detection pin.
The heartbeat signal indicates whether the PCIe chip 12 is working normally. The heartbeat signal may be a square-wave signal that alternates between high and low levels. When the PCIe chip 12 is working normally, it outputs a heartbeat signal of a first form to the CPLD 13; when the PCIe chip 12 is working abnormally, it outputs a heartbeat signal of a second form to the CPLD 13; the first form differs from the second form. The CPLD 13 can determine whether the PCIe chip 12 is working normally from the form of the heartbeat signal it receives. For example, when the PCIe chip 12 is working normally, the heartbeat signal of the first form that it outputs to the CPLD 13 is a square wave of alternating high and low levels. If the PCIe chip 12 becomes abnormal while it is outputting a high level, it keeps outputting the high level and no longer switches to the low level; if it becomes abnormal while it is outputting a low level, it keeps outputting the low level and no longer switches to the high level. Therefore, when the PCIe chip 12 is working abnormally, the heartbeat signal of the second form that it outputs to the CPLD 13 may be either a sustained high level or a sustained low level.
In one example, the heartbeat signal is a square-wave signal of a preset frequency. Using a square-wave signal as the heartbeat signal lets the CPLD 13 distinguish changes in the form of the heartbeat signal simply and clearly, which helps it determine quickly and accurately whether the PCIe chip 12 is working normally. The preset frequency may be an empirical value set in advance according to the actual situation, for example 10 Hz (hertz). In practice, the value of the preset frequency can be determined according to the frequency at which messages are sent between the CPU 11 and the PCIe chip 12, or according to how quickly an abnormality of the PCIe chip 12 needs to be detected. For example, when messages are sent frequently between the CPU 11 and the PCIe chip 12, an abnormality of the PCIe chip 12 must be detected sooner to avoid message blocking, so a somewhat larger preset frequency can be chosen.
In addition, whether the PCIe chip 12 is working normally can be judged by whether it can normally process the messages issued by the CPU 11. When the PCIe chip 12 can normally process the messages issued by the CPU 11, it is considered to be working normally; when it cannot, it is considered to be working abnormally.
In this embodiment, the CPLD 13 determines, according to the heartbeat signal obtained on the signal detection pin, whether the PCIe chip 12 is working normally. For example, when the heartbeat signal obtained on the signal detection pin has the first form, the CPLD 13 determines that the PCIe chip 12 is working normally; when the heartbeat signal obtained on the signal detection pin has the second form, the CPLD 13 determines that the PCIe chip 12 is working abnormally.
Optionally, when the heartbeat signal is a square-wave signal of a preset frequency, the CPLD 13 determines that the PCIe chip 12 is working abnormally when the duration of the high level or the duration of the low level of the heartbeat signal exceeds a preset threshold. The preset threshold may be an empirical value set in advance according to the actual situation; for example, it can be set according to the value of the preset frequency. For instance, when the preset frequency is 10 Hz, the preset threshold is 1 second or 0.5 seconds. Refer to Fig. 2A and Fig. 2B together. Fig. 2A is a schematic diagram of the heartbeat signal of the first form output when the PCIe chip 12 is working normally; this signal switches continuously between high and low levels. Fig. 2B is a schematic diagram of the heartbeat signal of the second form output when the PCIe chip 12 is working abnormally; this signal may be a sustained high level (part (a) of Fig. 2B) or a sustained low level (part (b) of Fig. 2B).
Step 502, when determining PCIe 12 abnormal work of chip, CPLD 13 controls the lower electricity of PCIe chip 12.
In the embodiment of the present application, when CPLD 13 determines 12 abnormal work of PCIe chip, PCIe chip 12 is immediately controlled Lower electricity, since PCIe chip 12 is by lower electricity, the PCIe link between PCIe chip 12 and CPU 11 is also just disconnected.Pass through The lower electricity triggering chain rupture of PCIe chip 12 is controlled, so that chain rupture is more quick.
In one example, when determining PCIe 12 abnormal work of chip, CPLD 13 exports lower electricity to power supply control chip Signal, which, which is used to indicate power supply control chip, stops powering to PCIe chip 12.Optionally, lower electric signal can be One high level signal is also possible to the combination of a low level signal or a low and high level signal.Power supply control chip is used for Stop powering to PCIe chip 12 according to lower electric signal, such as power supply control chip is after receiving lower electric signal, power supply control Coremaking piece no longer exports high level to PCIe chip 12, to stop powering to PCIe chip 12.
Optionally, CPU 11 can execute AER after detecting that the link between PCIe chip 12 disconnects.CPU 11 Power on signal is sent to CPLD 13, which re-powers for triggering the control PCIe chip 12 of CPLD 13.CPLD 13 In the case where controlling PCIe chip 12 after electricity, if receiving the power on signal of the output of CPU 11, CPLD 13 is according to the power on signal Control PCIe chip 12 re-powers.Optionally, CPU 11 sends power on signal to CPLD 13 by the way of writing register, A numerical value can be written in CPU 11 in the register of CPLD 13, which is used to indicate control PCIe chip 12 and re-powers, After CPLD 13 reads the above-mentioned numerical value being written in its register, control PCIe chip 12 is re-powered.For example, CPLD 13 After receiving power on signal, power on signal is exported to power supply control chip, power supply control chip is receiving power on signal Later, power supply control chip provides high level to PCIe chip 12 again, to restart to power to PCIe chip 12.PCIe After power-up, by executing the link setup process between CPU 11, trial re-establishes PCIe link to chip 12.If caused The source of trouble of 12 abnormal work of PCIe chip has excluded, then PCIe link can rebuild success;If causing PCIe chip 12 different The source of trouble often to work does not exclude (such as PCIe chip 12 thoroughly breaks down), then PCIe link meeting reconstruction failure, but the source of trouble It will not be further continued for influencing CPU 11.
For details not disclosed in the above method embodiment, refer to the product embodiment shown in FIG. 1.
In the solution provided by the embodiments of this application, CPLD 13 detects, according to the heartbeat signal output by PCIe chip 12, whether PCIe chip 12 is working normally, and, on detecting that PCIe chip 12 is working abnormally, controls PCIe chip 12 to power down so that the PCIe link between PCIe chip 12 and CPU 11 is disconnected, thereby implementing both abnormality detection and link disconnection for PCIe chip 12. Owing to the characteristics of CPLD 13 itself, its processing is faster than software, and it is not affected by CPU 11 becoming sluggish or even hanging because message delivery is blocked, so an abnormality of PCIe chip 12 can be detected quickly. In addition, disconnecting the PCIe link between PCIe chip 12 and CPU 11 by powering down PCIe chip 12 makes the disconnection more timely and faster. The technical solution provided by the embodiments of this application improves the reliability and maintainability of managing PCIe chip 12 and reduces fault time.
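As a software illustration of the heartbeat check, the sketch below flags an abnormality when the sampled heartbeat level stays high or stays low for longer than a threshold. It is a simulation under stated assumptions, not CPLD firmware: the function name `first_abnormal_tick`, the idea of sampling the pin once per clock tick, and the threshold expressed in ticks are all hypothetical.

```python
from typing import Iterable, Optional


def first_abnormal_tick(samples: Iterable[int], max_level_ticks: int) -> Optional[int]:
    """Return the index of the first sample at which the heartbeat has held
    the same level for more than max_level_ticks consecutive samples,
    or None if that never happens."""
    run_length = 0
    last_level = None
    for tick, level in enumerate(samples):
        if level == last_level:
            run_length += 1          # level unchanged: the run grows
        else:
            last_level = level       # level toggled: start a new run
            run_length = 1
        if run_length > max_level_ticks:
            return tick              # held high or low for too long
    return None
```

A healthy square wave whose half-period is shorter than the threshold never triggers the check, while a heartbeat stuck at either level triggers it as soon as the run exceeds the threshold.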
An exemplary embodiment of this application further provides a CPLD 13 into which firmware is written, the firmware being used to implement the method provided by the embodiment of FIG. 5.
It should be understood that "multiple" as used herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it. "First", "second", and similar words used herein do not denote any order, quantity, or importance, and are used only to distinguish different objects.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the embodiments of this application. It should be understood that the foregoing is merely specific embodiments of this application and is not intended to limit the protection scope of the embodiments of this application; any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the embodiments of this application shall fall within the protection scope of the embodiments of this application.

Claims (10)

1. A computer device, characterized in that the computer device comprises: a central processing unit (CPU), a Peripheral Component Interconnect Express (PCIe) chip coupled to the CPU, and a complex programmable logic device (CPLD) coupled to the PCIe chip; wherein a heartbeat signal pin of the PCIe chip is electrically connected to a signal detection pin of the CPLD, the PCIe chip outputs a heartbeat signal to the CPLD through the heartbeat signal pin, and the heartbeat signal is used to indicate whether the PCIe chip is working normally;
the CPLD is configured to determine, according to the heartbeat signal obtained at the signal detection pin, whether the PCIe chip is working normally, and, when determining that the PCIe chip is working abnormally, to control the PCIe chip to power down.
2. The computer device according to claim 1, characterized in that the heartbeat signal is a square-wave signal of a preset frequency.
3. The computer device according to claim 2, characterized in that
the CPLD is further configured to determine that the PCIe chip is working abnormally when the high-level duration or the low-level duration of the heartbeat signal is greater than a preset threshold.
4. The computer device according to any one of claims 1 to 3, characterized in that
the CPLD is further configured to output a power-down signal to a power control chip when determining that the PCIe chip is working abnormally, the power control chip being configured to stop supplying power to the PCIe chip according to the power-down signal.
5. The computer device according to any one of claims 1 to 3, characterized in that
the CPLD is further configured to, after controlling the PCIe chip to power down, control the PCIe chip to power on again according to a power-on signal if the power-on signal output by the CPU is received.
6. The computer device according to any one of claims 1 to 3, characterized in that the PCIe chip is a PCIe switch chip; or the PCIe chip is a non-transparent bridge (NTB) chip.
7. A method for detecting an abnormality of a Peripheral Component Interconnect Express (PCIe) chip, characterized in that the method is applied to a complex programmable logic device (CPLD) of a computer device; the computer device comprises: a central processing unit (CPU), the PCIe chip coupled to the CPU, and the CPLD coupled to the PCIe chip; wherein a heartbeat signal pin of the PCIe chip is electrically connected to a signal detection pin of the CPLD, the PCIe chip outputs a heartbeat signal to the CPLD through the heartbeat signal pin, and the heartbeat signal is used to indicate whether the PCIe chip is working normally;
the method comprises:
determining, by the CPLD according to the heartbeat signal obtained at the signal detection pin, whether the PCIe chip is working normally; and
when determining that the PCIe chip is working abnormally, controlling, by the CPLD, the PCIe chip to power down.
8. The method according to claim 7, characterized in that the heartbeat signal is a square-wave signal of a preset frequency;
the determining, by the CPLD according to the heartbeat signal obtained at the signal detection pin, whether the PCIe chip is working normally comprises:
when the high-level duration or the low-level duration of the heartbeat signal is greater than a preset threshold, determining, by the CPLD, that the PCIe chip is working abnormally.
9. The method according to claim 7 or 8, characterized in that the controlling, by the CPLD, the PCIe chip to power down comprises:
outputting, by the CPLD, a power-down signal to a power control chip, the power control chip being configured to stop supplying power to the PCIe chip according to the power-down signal.
10. The method according to claim 7 or 8, characterized in that, after the CPLD controls the PCIe chip to power down, the method further comprises:
when the power-on signal output by the CPU is received, controlling, by the CPLD, the PCIe chip to power on again according to the power-on signal.
CN201711044736.9A 2017-10-31 2017-10-31 Method for detecting PCIe chip abnormity and computer equipment Active CN109726055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711044736.9A CN109726055B (en) 2017-10-31 2017-10-31 Method for detecting PCIe chip abnormity and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711044736.9A CN109726055B (en) 2017-10-31 2017-10-31 Method for detecting PCIe chip abnormity and computer equipment

Publications (2)

Publication Number Publication Date
CN109726055A true CN109726055A (en) 2019-05-07
CN109726055B CN109726055B (en) 2021-01-12

Family

ID=66293094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711044736.9A Active CN109726055B (en) 2017-10-31 2017-10-31 Method for detecting PCIe chip abnormity and computer equipment

Country Status (1)

Country Link
CN (1) CN109726055B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112867107A (en) * 2019-11-28 2021-05-28 华为技术有限公司 Wireless fidelity (WIFI) chip control method and related equipment thereof
CN113791368A (en) * 2021-09-10 2021-12-14 苏州浪潮智能科技有限公司 Method and device for automatically checking misplugging of interconnection cables of server and GPU (graphics processing Unit) box

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104125049A (en) * 2014-08-08 2014-10-29 浪潮电子信息产业股份有限公司 Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform
CN104461805A (en) * 2014-12-29 2015-03-25 浪潮电子信息产业股份有限公司 CPLD-based system state detecting method, CPLD and server mainboard
CN104639304A (en) * 2015-02-05 2015-05-20 南京阖云骥联信息科技有限公司 Dual-controller communication system based on internet of vehicles and dual-controller communication method based on internet of vehicles
JP2015225522A (en) * 2014-05-28 2015-12-14 富士ゼロックス株式会社 System and failure processing method
CN105912089A (en) * 2016-04-07 2016-08-31 浪潮电子信息产业股份有限公司 Battery redundancy method, device and system

Also Published As

Publication number Publication date
CN109726055B (en) 2021-01-12

Similar Documents

Publication Publication Date Title
JP5932146B2 (en) Method, computer system and apparatus for accessing PCI Express endpoint device
EP2854369A1 (en) Method and apparatus for detecting interface connection between devices
CN107948063B (en) Method for establishing aggregation link and access equipment
CN110740072A (en) fault detection method, device and related equipment
CN104796329B (en) A kind of link automatic switching method and device
CN101141327A (en) Method for detecting network node abnormality
CN106502814B (en) Method and device for recording error information of PCIE (peripheral component interface express) equipment
CN109240953A (en) A kind of method, pinboard and the system of adaptive switching hard disk
CN109726055A (en) Detect the method and computer equipment of PCIe chip exception
CN109271273A (en) A kind of method, abnormal restoring equipment and storage medium that communication abnormality restores
CN104601523A (en) Data transmitting method and device
CN104536514A (en) Server with selective switch management network connection function
CN104202199A (en) Method and system for detecting interface status and processing interface fault according to interface status
CN105306352A (en) Industrial field bus protocol gateway device
CN107290954A (en) A kind of dual hot redundancy method of control computer
CN104536853B (en) A kind of device ensureing dual controller storage device resource continuous availability
CN104486149B (en) A kind of finite state machine method for ground test
CN109828945A (en) A kind of service message processing method and system
CN106506265B (en) Detection fpga chip hangs dead method and device
CN105553865B (en) A kind of FC exchanger chips credit management test method
CN116670636A (en) Data access method, device and storage medium
CN105224426A (en) Physical host fault detection method, device and empty machine management method, system
CN107357698A (en) A kind of method and device of acquisition BMC Serial Port Informations
CN106452696A (en) Control system of server cluster
CN105591902A (en) Main-standby switching method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200421

Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Applicant after: HUAWEI TECHNOLOGIES Co.,Ltd.

Address before: 301, A building, room 3, building 301, foreshore Road, No. 310052, Binjiang District, Zhejiang, Hangzhou

Applicant before: Huawei Technologies Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211223

Address after: 450046 Floor 9, building 1, Zhengshang Boya Plaza, Longzihu wisdom Island, Zhengdong New Area, Zhengzhou City, Henan Province

Patentee after: Super fusion Digital Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.
