CN104125049A - Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform - Google Patents

Info

Publication number
CN104125049A
Authority
CN
China
Prior art keywords
pcie
cpu0
cpu1
cpu
brickland
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410387756.6A
Other languages
Chinese (zh)
Inventor
牟茜
刘振东
李萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410387756.6A
Publication of CN104125049A

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The invention discloses a redundancy implementation method for a PCIE (Peripheral Component Interconnect Express) device based on the BRICKLAND platform, and belongs to the technical field of computers. The method comprises the following steps: designating two adjacent CPUs as CPU0 and CPU1; connecting the PCIE DEVICE in the system to both CPU0 and CPU1 through a PCIE Switch; monitoring the states of CPU0 and CPU1 through an FPGA (Field Programmable Gate Array)/CPLD (Complex Programmable Logic Device); and controlling the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1, thereby determining whether the PCIE DEVICE is connected to CPU0 or CPU1. With this method, when a CPU of the server fails, the PCIE device is switched to another CPU, so the PCIE device attached to the failed CPU keeps working normally without shutting down the system, which improves the stability of the whole system.

Description

A PCIE device redundancy implementation method based on the BRICKLAND platform
 
Technical field
The present invention relates to the field of computer technology, and specifically to a PCIE device redundancy implementation method based on the BRICKLAND platform.
Background
As server technology develops, the requirements for server maintainability and ease of maintenance keep rising. In current Brickland platform servers, the PCIE controller is integrated inside the CPU. Some servers do not use all of the PCIE resources of the corresponding CPU, and the system already supports CPU online and offline functions. However, when a CPU goes offline, the PCIE devices under that CPU become unavailable, which interrupts the operation of those PCIE devices.
Current Brickland platform servers are mostly multiprocessor platforms. When a CPU goes offline or encounters another error, the PCIE slots under that CPU fail immediately, so the PCIE devices cannot work normally, which greatly reduces the stability of the whole system.
The English terms used in the text are explained as follows:
PCIE Slot: the PCI Express slot, which is the physical embodiment of the PCIE bus on the server mainboard. PCI Express, abbreviated PCI-E, is a computer bus in the PCI family. It retains the existing PCI programming concepts and communication standards but is built on a faster serial communication system. Intel is the main backer of this interface. PCIe is used only for internal interconnection. Because PCIe is based on the existing PCI system, an existing PCI system can be converted to PCIe by modifying only the physical layer, without modifying software. PCIe is faster and is intended to replace almost all existing internal buses (including AGP and PCI).
PCIE Passive Switch: a passive PCIE switch, which requires an event trigger.
CPU Online/Offline: advanced features of modern system architectures give processors error reporting and error correction capabilities. CPU architectures support partitioning, which allows the computing resources of a single CPU to meet the needs of virtual machines. Some OEMs already support hot-plugging of NUMA hardware, and inserting or removing a physical node requires processor hot-plug support. These advanced features require that the kernel be able to remove a CPU from use when necessary. For example, for RAS purposes, a CPU that executes malicious code must be taken offline to keep it off the system execution path; after the CPU is replaced, an online operation is needed to bring it back into the system execution path so that it can be used again.
Brickland platform: the server platform code-named 'Brickland', which consists of Intel Xeon series Ivy Bridge processors and the C602J server chipset.
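As an illustration of the CPU Online/Offline entry above, the following minimal Python sketch reads the online state of a logical CPU through the Linux sysfs CPU hotplug files. It is not part of the patented method, which monitors CPU state in hardware through an FPGA/CPLD. The function name `cpu_is_online` and the `sysfs_root` parameter are assumptions introduced here, and Linux logical CPU numbers are not the same thing as the physical CPU0 and CPU1 sockets of this document.

```python
from pathlib import Path


def cpu_is_online(cpu: int, sysfs_root: str = "/sys/devices/system/cpu") -> bool:
    """Return True if logical CPU `cpu` is reported online under `sysfs_root`.

    `sysfs_root` is a parameter (an assumption of this sketch) so that the
    function can also be pointed at a test directory.
    """
    cpu_dir = Path(sysfs_root) / f"cpu{cpu}"
    if not cpu_dir.is_dir():
        raise FileNotFoundError(f"no such CPU directory: {cpu_dir}")

    online_file = cpu_dir / "online"
    if not online_file.exists():
        # Some systems expose no 'online' file for a CPU that cannot be
        # taken offline (often cpu0); treat such a CPU as always online.
        return True

    # The file contains "1" for online and "0" for offline.
    return online_file.read_text().strip() == "1"
```

On a Linux host, `cpu_is_online(1)` would report whether logical CPU 1 is currently online; writing `0` or `1` to the same file (as root) is the usual way to take a logical CPU offline or bring it back online.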
Summary of the invention
The technical task of the present invention is to provide a PCIE device redundancy implementation method based on the BRICKLAND platform.
The technical task of the present invention is achieved in the following manner. The steps of the method are as follows:
Two adjacent CPUs are designated CPU0 and CPU1. The PCIE DEVICE in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states, thereby determining whether the PCIE DEVICE is connected to CPU0 or CPU1.
When CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch. In this state, Port1 of the PCIE Switch is closed.
When CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1. The PCIE DEVICE is then connected to CPU1 through Port1, which ensures that the PCIE DEVICE keeps working normally.
Compared with the prior art, the PCIE device redundancy implementation method based on the BRICKLAND platform of the present invention switches the PCIE device to another CPU when a CPU of the server system fails. The PCIE device under the failed CPU therefore keeps working normally without shutting down the system, which improves the stability of the whole system.
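The port selection rule described in this section can be sketched in software as follows. This is only a model for illustration: in the source the decision is made by FPGA/CPLD logic, and the names `CpuState`, `Port`, `select_port` and `port_states` are hypothetical. The source states that Port1 is closed while CPU0 is normal; closing Port0 after the switch to Port1 is an assumption of this sketch, and the case where both CPUs fail is not covered by the source and is not modeled.

```python
from enum import Enum


class CpuState(Enum):
    NORMAL = "normal"
    FAULT = "fault"  # CPU is offline or has reported another error


class Port(Enum):
    PORT0 = 0  # PCIE Switch port leading to CPU0
    PORT1 = 1  # PCIE Switch port leading to CPU1


def select_port(cpu0: CpuState) -> Port:
    """Choose the active PCIE Switch port from the monitored CPU0 state."""
    return Port.PORT0 if cpu0 is CpuState.NORMAL else Port.PORT1


def port_states(cpu0: CpuState) -> dict:
    """Return an open/closed state for each port.

    Assumption: exactly one port is open at a time, so Port0 is treated
    as closed once the switch has moved to Port1.
    """
    active = select_port(cpu0)
    return {port: ("open" if port is active else "closed") for port in Port}
```

For example, `port_states(CpuState.FAULT)` marks `Port.PORT1` as open, which corresponds to the PCIE DEVICE being reached through CPU1.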
Brief description of the drawings
Figure 1 is a schematic diagram of the PCIE redundancy design of the PCIE device redundancy implementation method based on the BRICKLAND platform.
Figure 2 is a connection diagram of the PCIE device redundancy implementation method based on the BRICKLAND platform when CPU0 is working normally.
Figure 3 is a connection diagram of the PCIE device redundancy implementation method based on the BRICKLAND platform when CPU0 is offline.
The English terms used in the figures are explained as follows:
PCIE DEVICE: PCIe devices include EPs (endpoints, such as network cards and graphics cards), Switches, and PCIe bridges. The PCIe bus uses point-to-point connections, so each PCIe port can connect to only one EP. A PCIe port can also connect to a Switch for link expansion, and a PCIe link expanded through a Switch can in turn connect further EPs or other Switches.
PCIE Switch: in the PCIe architecture, the Switch plays a central role. The PCIe bus uses Switches for link expansion, and within a Switch each port corresponds to a virtual PCI bridge.
Detailed description
Embodiment 1:
Two adjacent CPUs are designated CPU0 and CPU1. The PCIE device in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states. When CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch. In this state, Port1 of the PCIE Switch is closed.
Embodiment 2:
Two adjacent CPUs are designated CPU0 and CPU1. The PCIE device in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states. When CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1. The PCIE DEVICE is then connected to CPU1 through Port1, which ensures that the PCIE DEVICE keeps working normally.
Embodiment 3:
Two adjacent CPUs are designated CPU0 and CPU1. The PCIE Slot in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states. When CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch; in this state, Port1 of the PCIE Switch is closed. When CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1. The PCIE DEVICE is then connected to CPU1 through Port1, which ensures that the PCIE DEVICE keeps working normally.
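The sequence in Embodiment 3 (normal operation on Port0, followed by a CPU0 fault and a switch to Port1) can be walked through with a small simulation. This is a hedged sketch rather than the FPGA/CPLD design itself: the class `SwitchController`, its `on_sample` method and the sampled state sequence are all hypothetical, and the source does not describe how often CPU state is sampled or how the switch is commanded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SwitchController:
    """Software stand-in for the FPGA/CPLD logic that drives the PCIE Switch."""

    active_port: int = 0  # Port0 leads to CPU0, Port1 leads to CPU1
    events: List[str] = field(default_factory=list)

    def on_sample(self, cpu0_ok: bool) -> None:
        """Process one monitored CPU0 state sample and switch ports if needed."""
        wanted = 0 if cpu0_ok else 1
        if wanted != self.active_port:
            self.events.append(f"switch Port{self.active_port} -> Port{wanted}")
            self.active_port = wanted


# Hypothetical sample sequence: CPU0 is healthy twice, then goes offline.
ctrl = SwitchController()
for cpu0_ok in [True, True, False, False]:
    ctrl.on_sample(cpu0_ok)

print(ctrl.active_port)  # 1
print(ctrl.events)       # ['switch Port0 -> Port1']
```

Only the transition produces an event, so repeated samples of the same state leave the switch untouched. Because the rule follows the CPU0 state, a later healthy sample would move the model back to Port0; the source does not explicitly describe that return path.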
From the embodiments above, those skilled in the art can readily implement the present invention. It should be understood, however, that the present invention is not limited to the embodiments above. On the basis of the disclosed embodiments, those skilled in the art can combine different technical features in any way to obtain different technical solutions.

Claims (3)

1. A PCIE device redundancy implementation method based on the BRICKLAND platform, characterized in that the steps of the method are as follows:
Two adjacent CPUs are designated CPU0 and CPU1; the PCIE DEVICE in the system is connected to both CPU0 and CPU1 through a PCIE Switch; the system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states, thereby determining whether the PCIE DEVICE is connected to CPU0 or CPU1.
2. The PCIE device redundancy implementation method based on the BRICKLAND platform according to claim 1, characterized in that, when CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch; in this state, Port1 of the PCIE Switch is closed.
3. The PCIE device redundancy implementation method based on the BRICKLAND platform according to claim 1, characterized in that, when CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1; the PCIE DEVICE is then connected to CPU1 through Port1, which ensures that the PCIE DEVICE keeps working normally.
CN201410387756.6A 2014-08-08 2014-08-08 Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform Pending CN104125049A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410387756.6A CN104125049A (en) 2014-08-08 2014-08-08 Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410387756.6A CN104125049A (en) 2014-08-08 2014-08-08 Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform

Publications (1)

Publication Number Publication Date
CN104125049A true CN104125049A (en) 2014-10-29

Family

ID=51770323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410387756.6A Pending CN104125049A (en) 2014-08-08 2014-08-08 Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform

Country Status (1)

Country Link
CN (1) CN104125049A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104579802A (en) * 2015-02-15 2015-04-29 浪潮电子信息产业股份有限公司 Method for fast fault restoration of multipath server
CN105550075A (en) * 2015-12-11 2016-05-04 浪潮电子信息产业股份有限公司 Method for realizing memory equipment redundancy
CN105718333A (en) * 2016-01-26 2016-06-29 山东超越数控电子有限公司 Twin-channel server mainboard main-slave CPU switching device and switching control method thereof
CN106161169A (en) * 2016-09-30 2016-11-23 郑州云海信息技术有限公司 A kind of multi-host network exchange system
CN106250349A (en) * 2016-08-08 2016-12-21 浪潮(北京)电子信息产业有限公司 A kind of high energy efficiency heterogeneous computing system
CN107894961A (en) * 2017-12-07 2018-04-10 郑州云海信息技术有限公司 A kind of symmetric design framework of multichannel CPU external interfaces interconnection
CN109726055A (en) * 2017-10-31 2019-05-07 杭州华为数字技术有限公司 Detect the method and computer equipment of PCIe chip exception
CN117591457A (en) * 2024-01-17 2024-02-23 苏州元脑智能科技有限公司 PCIE expansion box, server, method, device and product for controlling data transmission

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1373427A (en) * 2001-03-01 2002-10-09 深圳市中兴通讯股份有限公司 Device and method for implementing dual system slots
CN101071407A (en) * 2007-06-22 2007-11-14 中兴通讯股份有限公司 Active-standby system and method for realizing interconnecting device switching of external devices therebetween
US20100229050A1 (en) * 2009-03-06 2010-09-09 Fujitsu Limited Apparatus having first bus and second bus connectable to i/o device, information processing apparatus and method of controlling apparatus
CN102486759A (en) * 2010-12-03 2012-06-06 国际商业机器公司 Cable redundancy and failover for multi-lane pci express io interconnections

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104579802A (en) * 2015-02-15 2015-04-29 浪潮电子信息产业股份有限公司 Method for fast fault restoration of multipath server
CN105550075A (en) * 2015-12-11 2016-05-04 浪潮电子信息产业股份有限公司 Method for realizing memory equipment redundancy
CN105718333A (en) * 2016-01-26 2016-06-29 山东超越数控电子有限公司 Twin-channel server mainboard main-slave CPU switching device and switching control method thereof
CN106250349A (en) * 2016-08-08 2016-12-21 浪潮(北京)电子信息产业有限公司 A kind of high energy efficiency heterogeneous computing system
CN106161169A (en) * 2016-09-30 2016-11-23 郑州云海信息技术有限公司 A kind of multi-host network exchange system
CN109726055A (en) * 2017-10-31 2019-05-07 杭州华为数字技术有限公司 Detect the method and computer equipment of PCIe chip exception
CN109726055B (en) * 2017-10-31 2021-01-12 华为技术有限公司 Method for detecting PCIe chip abnormity and computer equipment
CN107894961A (en) * 2017-12-07 2018-04-10 郑州云海信息技术有限公司 A kind of symmetric design framework of multichannel CPU external interfaces interconnection
CN117591457A (en) * 2024-01-17 2024-02-23 苏州元脑智能科技有限公司 PCIE expansion box, server, method, device and product for controlling data transmission
CN117591457B (en) * 2024-01-17 2024-04-19 苏州元脑智能科技有限公司 PCIE expansion box, server, method, device and product for controlling data transmission

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20141029
