CN104125049A - Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform - Google Patents
- Publication number
- CN104125049A
- Authority
- CN
- China
- Prior art keywords
- pcie
- cpu0
- cpu1
- cpu
- brickland
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Hardware Redundancy (AREA)
Abstract
The invention discloses a redundancy implementation method for a PCIE (Peripheral Component Interconnect Express) device based on the BRICKLAND platform, and belongs to the technical field of computers. The method comprises the following steps: designating two adjacent CPUs as CPU0 and CPU1; connecting the PCIE DEVICE in the system to both CPU0 and CPU1 through a PCIE Switch; monitoring the states of CPU0 and CPU1 through an FPGA (Field Programmable Gate Array)/CPLD (Complex Programmable Logic Device); and controlling the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1, thereby determining whether the PCIE DEVICE is connected to CPU0 or to CPU1. With this method, when one CPU of the server fails, the PCIE device is switched to another CPU, so that the PCIE device under the faulty CPU keeps working normally without shutting down the system, which improves the stability of the whole system.
Description
Technical field
The present invention relates to the field of computer technology, and specifically to a PCIE device redundancy implementation method based on the BRICKLAND platform.
Background art
As server technology develops, the requirements on server maintainability and ease of maintenance keep rising. On current Brickland platform servers, the PCIE controller is integrated inside the CPU, some servers do not use all of the PCIE resources of each CPU, and the system already supports CPU online and offline functions. However, when a CPU goes offline, the PCIE devices under the offline CPU become unavailable, which interrupts the operation of those PCIE devices.
Current Brickland platform servers are mostly multiprocessor platforms. When a CPU goes offline or encounters another error, the PCIE slots under that CPU fail immediately, so the PCIE devices can no longer work normally, which greatly reduces the stability of the whole system.
The English terms used in the text are explained as follows:
PCIE Slot: the PCI Express slot, which is the physical embodiment of the PCIE bus on the server motherboard. PCI Express, abbreviated PCI-E, is a computer bus of the PCI family; it retains the existing PCI programming concepts and communication standards but is built on a faster serial communication system. Intel is the main backer of this interface. PCIe is used only for internal interconnection. Because PCIe is based on the existing PCI system, an existing PCI system can be converted to PCIe by modifying only the physical layer, without modifying the software. PCIe is faster and is intended to replace almost all existing internal buses (including AGP and PCI).
PCIE Passive Switch: a passive PCIE switch, which requires an event trigger in order to switch;
CPU Online/Offline: advanced features of modern system architectures give processors the ability to report and correct errors. CPU architectures support partitioning, so that the computing resources of a single CPU can also meet the needs of virtual machines. Some OEMs already support hot plugging of NUMA hardware, and inserting and removing physical nodes requires processor hot-plug support. This advanced feature requires the kernel to be able to remove a CPU from use when necessary. For example, for RAS purposes, a CPU that is executing malicious code must be taken offline so that it is kept out of the system's execution path; after the CPU has been replaced, an online operation is needed to bring it back into the system's execution path so that it can be used again.
Brickland platform: the server platform code-named 'Brickland', which is made up of Ivy Bridge processors of the Intel Xeon series and the C602J server chipset.
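As an illustration of the CPU Online/Offline term above, the following minimal sketch assumes a Linux host, where logical CPUs are exposed through the sysfs CPU hotplug files at /sys/devices/system/cpu/cpuN/online. The patent text does not name an operating system, so the Linux path, the helper names, and the `root` parameter (added only so that the helpers can be pointed at a test directory) are assumptions made for illustration. This interface acts on logical CPUs, whereas the text above refers to a physical processor.

```python
from pathlib import Path

# Linux sysfs location of the per-CPU hotplug files (assumed host OS).
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def online_file(cpu: int, root: Path = SYSFS_CPU_ROOT) -> Path:
    """Return the path of the hotplug control file for one logical CPU."""
    return Path(root) / f"cpu{cpu}" / "online"


def is_cpu_online(cpu: int, root: Path = SYSFS_CPU_ROOT) -> bool:
    """Read the online state of a logical CPU.

    A CPU directory without an 'online' file cannot be taken offline
    (this is common for cpu0), so it is reported as online.
    """
    cpu_dir = Path(root) / f"cpu{cpu}"
    if not cpu_dir.is_dir():
        raise FileNotFoundError(f"no such CPU: {cpu}")
    path = online_file(cpu, root)
    if not path.exists():
        return True
    return path.read_text().strip() == "1"


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Request an online (True) or offline (False) operation.

    On a real host this write needs root privileges.
    """
    online_file(cpu, root).write_text("1" if online else "0")
```

On a real system, `set_cpu_online(1, False)` would request that logical CPU 1 be taken out of the scheduler's execution path, and `set_cpu_online(1, True)` would bring it back.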
Summary of the invention
The technical task of the present invention is to provide a PCIE device redundancy implementation method based on the BRICKLAND platform.
The technical task of the present invention is achieved in the following manner; the steps of the method are as follows:
Two adjacent CPUs are designated as CPU0 and CPU1. The PCIE DEVICE in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1, thereby determining whether the PCIE DEVICE is connected to CPU0 or to CPU1.
When CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch; at this time, Port1 of the PCIE Switch is closed.
When CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1, so that the PCIE DEVICE is connected to CPU1 through Port1, ensuring that the PCIE DEVICE keeps working normally.
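The port selection described above can be sketched in software as follows. This is only an illustrative model, not the FPGA/CPLD logic itself: the names `Port`, `select_port`, and `PassiveSwitch` are hypothetical. The text does not say what happens when both CPUs are down or when CPU0 recovers, so closing both ports in the first case and returning to Port0 in the second are assumptions.

```python
from enum import Enum
from typing import Optional


class Port(Enum):
    PORT0 = 0  # upstream link to CPU0
    PORT1 = 1  # upstream link to CPU1


def select_port(cpu0_ok: bool, cpu1_ok: bool) -> Optional[Port]:
    """Choose the active upstream port from the monitored CPU states."""
    if cpu0_ok:
        return Port.PORT0  # normal case: device on CPU0, Port1 closed
    if cpu1_ok:
        return Port.PORT1  # CPU0 offline or faulty: fail over to CPU1
    return None            # assumption: both CPUs down, no port opened


class PassiveSwitch:
    """Model of a passive switch: it changes port only when triggered."""

    def __init__(self) -> None:
        self.active: Optional[Port] = None

    def trigger(self, port: Optional[Port]) -> bool:
        """Apply a port selection; return True if the connection changed."""
        changed = port is not self.active
        self.active = port
        return changed
```

A monitor would call `switch.trigger(select_port(cpu0_ok, cpu1_ok))` each time it observes a CPU state change; the return value tells it whether the device was actually moved to a different CPU.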
Compared with the prior art, the PCIE device redundancy implementation method based on the BRICKLAND platform of the present invention makes it possible, when a CPU in the server system fails, to switch the PCIE device to another CPU, so that the PCIE device under the faulty CPU keeps working normally without shutting down the system, thereby improving the stability of the whole system.
Brief description of the drawings
Figure 1 is a schematic diagram of the PCIE redundancy design of the PCIE device redundancy implementation method based on the BRICKLAND platform.
Figure 2 is a connection diagram of the PCIE device redundancy implementation method based on the BRICKLAND platform when CPU0 is working normally.
Figure 3 is a connection diagram of the PCIE device redundancy implementation method based on the BRICKLAND platform when CPU0 is offline.
The English terms used in the figures are explained as follows:
PCIE DEVICE: PCIe devices include EPs (endpoints, such as network cards and graphics cards), Switches, and PCIe bridges. The PCIe bus uses point-to-point connections: each PCIe port can connect to only one EP, although a PCIe port can also connect to a Switch for link expansion. A PCIe link expanded through a Switch can in turn connect further EPs or other Switches.
PCIE Switch: in the PCIe architecture, the Switch occupies a central position. The PCIe bus uses Switches for link expansion; within a Switch, each port corresponds to one virtual PCI bridge.
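The topology rules in the two definitions above (one device per port, and link expansion through Switches) can be illustrated with a small data structure. The class names `Endpoint` and `Switch` and the device names in the usage note are hypothetical; the sketch models only the connection rules, not PCIe configuration space or enumeration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Endpoint:
    name: str  # e.g. a network card or a graphics card


@dataclass
class Switch:
    name: str
    # One entry per occupied port; each port stands for one virtual PCI bridge.
    ports: Dict[int, Union[Endpoint, "Switch"]] = field(default_factory=dict)

    def attach(self, port: int, device: Union[Endpoint, "Switch"]) -> None:
        """Attach a device to a port, enforcing one device per port."""
        if port in self.ports:
            raise ValueError(f"port {port} of {self.name} is already occupied")
        self.ports[port] = device

    def endpoints(self) -> List[str]:
        """Collect the names of all endpoints reachable below this switch."""
        found: List[str] = []
        for device in self.ports.values():
            if isinstance(device, Endpoint):
                found.append(device.name)
            else:
                found.extend(device.endpoints())  # nested switch
        return found
```

For example, attaching an `Endpoint("nic")` to port 0 of a switch and a second `Switch` to port 1 models link expansion; trying to attach another device to port 0 raises `ValueError`, which reflects the point-to-point rule.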
Embodiments
Embodiment 1:
Two adjacent CPUs are designated as CPU0 and CPU1. The PCIE device in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1. When CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch; at this time, Port1 of the PCIE Switch is closed.
Embodiment 2:
Two adjacent CPUs are designated as CPU0 and CPU1. The PCIE device in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1. When CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1, so that the PCIE DEVICE is connected to CPU1 through Port1, ensuring that the PCIE DEVICE keeps working normally.
Embodiment 3:
Two adjacent CPUs are designated as CPU0 and CPU1. The PCIE Slot in the system is connected to both CPU0 and CPU1 through a PCIE Switch. The system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1. When CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch; at this time, Port1 of the PCIE Switch is closed. When CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1, so that the PCIE DEVICE is connected to CPU1 through Port1, ensuring that the PCIE DEVICE keeps working normally.
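The behaviour of Embodiment 3 over time can be sketched as a replay of monitored CPU states. This is a hedged software model only: the function `run_monitor`, the sample format, and the port labels are hypothetical, and the handling of the case where both CPUs are down (no port opened) is an assumption that the text does not cover.

```python
from typing import Iterable, List, Optional, Tuple

PORT0, PORT1 = "Port0", "Port1"


def run_monitor(
    samples: Iterable[Tuple[bool, bool]]
) -> List[Tuple[int, Optional[str]]]:
    """Replay (cpu0_ok, cpu1_ok) samples and return the switch events.

    Each event is (sample index, newly active port); an event is recorded
    only when the active port actually changes.
    """
    active: Optional[str] = None
    events: List[Tuple[int, Optional[str]]] = []
    for tick, (cpu0_ok, cpu1_ok) in enumerate(samples):
        if cpu0_ok:
            wanted: Optional[str] = PORT0  # CPU0 healthy: use Port0
        elif cpu1_ok:
            wanted = PORT1                 # CPU0 failed: fail over to Port1
        else:
            wanted = None                  # assumption: nothing to connect to
        if wanted != active:
            events.append((tick, wanted))
            active = wanted
    return events


# CPU0 is healthy for two samples, then goes offline.
print(run_monitor([(True, True), (True, True), (False, True), (False, True)]))
# prints [(0, 'Port0'), (2, 'Port1')]
```

Recording only the changes mirrors the event-triggered nature of the passive switch: the port is touched when the monitored state calls for a different connection, not on every sample.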
From the above embodiments, those skilled in the art can readily implement the present invention. It should be understood, however, that the present invention is not limited to the above embodiments. On the basis of the disclosed embodiments, those skilled in the art can combine different technical features in any way to realize different technical solutions.
Claims (3)
1. A PCIE device redundancy implementation method based on the BRICKLAND platform, characterized in that the steps of the method are as follows:
Two adjacent CPUs are designated as CPU0 and CPU1; the PCIE DEVICE in the system is connected to both CPU0 and CPU1 through a PCIE Switch; the system monitors the states of CPU0 and CPU1 through an FPGA/CPLD and controls the connection state of the PCIE Switch ports according to the monitored states of CPU0 and CPU1, thereby determining whether the PCIE DEVICE is connected to CPU0 or to CPU1.
2. The PCIE device redundancy implementation method based on the BRICKLAND platform according to claim 1, characterized in that, when CPU0 is working normally, the FPGA/CPLD sets the PCIE Switch connection to Port0, and the PCIE DEVICE is connected to CPU0 through Port0 of the PCIE Switch; at this time, Port1 of the PCIE Switch is closed.
3. The PCIE device redundancy implementation method based on the BRICKLAND platform according to claim 1, characterized in that, when CPU0 goes offline or encounters another error, the FPGA/CPLD detects that CPU0 has failed and automatically switches the PCIE Switch to Port1, so that the PCIE DEVICE is connected to CPU1 through Port1, ensuring that the PCIE DEVICE keeps working normally.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410387756.6A CN104125049A (en) | 2014-08-08 | 2014-08-08 | Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410387756.6A CN104125049A (en) | 2014-08-08 | 2014-08-08 | Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104125049A true CN104125049A (en) | 2014-10-29 |
Family
ID=51770323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410387756.6A Pending CN104125049A (en) | 2014-08-08 | 2014-08-08 | Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104125049A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104579802A (en) * | 2015-02-15 | 2015-04-29 | 浪潮电子信息产业股份有限公司 | Method for fast fault restoration of multipath server |
CN105550075A (en) * | 2015-12-11 | 2016-05-04 | 浪潮电子信息产业股份有限公司 | Method for realizing memory equipment redundancy |
CN105718333A (en) * | 2016-01-26 | 2016-06-29 | 山东超越数控电子有限公司 | Twin-channel server mainboard main-slave CPU switching device and switching control method thereof |
CN106161169A (en) * | 2016-09-30 | 2016-11-23 | 郑州云海信息技术有限公司 | A kind of multi-host network exchange system |
CN106250349A (en) * | 2016-08-08 | 2016-12-21 | 浪潮(北京)电子信息产业有限公司 | A kind of high energy efficiency heterogeneous computing system |
CN107894961A (en) * | 2017-12-07 | 2018-04-10 | 郑州云海信息技术有限公司 | A kind of symmetric design framework of multichannel CPU external interfaces interconnection |
CN109726055A (en) * | 2017-10-31 | 2019-05-07 | 杭州华为数字技术有限公司 | Detect the method and computer equipment of PCIe chip exception |
CN117591457A (en) * | 2024-01-17 | 2024-02-23 | 苏州元脑智能科技有限公司 | PCIE expansion box, server, method, device and product for controlling data transmission |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1373427A (en) * | 2001-03-01 | 2002-10-09 | 深圳市中兴通讯股份有限公司 | Device and method for implementing dual system slots |
CN101071407A (en) * | 2007-06-22 | 2007-11-14 | 中兴通讯股份有限公司 | Active-standby system and method for realizing interconnecting device switching of external devices therebetween |
US20100229050A1 (en) * | 2009-03-06 | 2010-09-09 | Fujitsu Limited | Apparatus having first bus and second bus connectable to i/o device, information processing apparatus and method of controlling apparatus |
CN102486759A (en) * | 2010-12-03 | 2012-06-06 | 国际商业机器公司 | Cable redundancy and failover for multi-lane pci express io interconnections |
- 2014-08-08 CN CN201410387756.6A patent/CN104125049A/en active Pending
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104579802A (en) * | 2015-02-15 | 2015-04-29 | 浪潮电子信息产业股份有限公司 | Method for fast fault restoration of multipath server |
CN105550075A (en) * | 2015-12-11 | 2016-05-04 | 浪潮电子信息产业股份有限公司 | Method for realizing memory equipment redundancy |
CN105718333A (en) * | 2016-01-26 | 2016-06-29 | 山东超越数控电子有限公司 | Twin-channel server mainboard main-slave CPU switching device and switching control method thereof |
CN106250349A (en) * | 2016-08-08 | 2016-12-21 | 浪潮(北京)电子信息产业有限公司 | A kind of high energy efficiency heterogeneous computing system |
CN106161169A (en) * | 2016-09-30 | 2016-11-23 | 郑州云海信息技术有限公司 | A kind of multi-host network exchange system |
CN109726055A (en) * | 2017-10-31 | 2019-05-07 | 杭州华为数字技术有限公司 | Detect the method and computer equipment of PCIe chip exception |
CN109726055B (en) * | 2017-10-31 | 2021-01-12 | 华为技术有限公司 | Method for detecting PCIe chip abnormity and computer equipment |
CN107894961A (en) * | 2017-12-07 | 2018-04-10 | 郑州云海信息技术有限公司 | A kind of symmetric design framework of multichannel CPU external interfaces interconnection |
CN117591457A (en) * | 2024-01-17 | 2024-02-23 | 苏州元脑智能科技有限公司 | PCIE expansion box, server, method, device and product for controlling data transmission |
CN117591457B (en) * | 2024-01-17 | 2024-04-19 | 苏州元脑智能科技有限公司 | PCIE expansion box, server, method, device and product for controlling data transmission |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104125049A (en) | Redundancy implementation method of PCIE (Peripheral Component Interface Express) device based on BRICKLAND platform | |
US9760455B2 (en) | PCIe network system with fail-over capability and operation method thereof | |
EP2811413B1 (en) | Computer system, access method and apparatus for peripheral component interconnect express endpoint device | |
JP4934642B2 (en) | Computer system | |
US8677180B2 (en) | Switch failover control in a multiprocessor computer system | |
US8656228B2 (en) | Memory error isolation and recovery in a multiprocessor computer system | |
US8521929B2 (en) | Virtual serial port management system and method | |
US9772912B2 (en) | Configurable and fault-tolerant baseboard management controller arrangement | |
CN102622279B (en) | Redundancy control system, method and Management Controller | |
DE102015107990A1 (en) | Method and apparatus for dynamic node repair in a multiple node environment | |
US8843688B2 (en) | Concurrent repair of PCIE switch units in a tightly-coupled, multi-switch, multi-adapter, multi-host distributed system | |
CN103635884A (en) | System and method for using redundancy of controller operation | |
US11176297B2 (en) | Detection and isolation of faults to prevent propagation of faults in a resilient system | |
CN110109782B (en) | Method, device and system for replacing fault PCIe (peripheral component interconnect express) equipment | |
WO2020125041A1 (en) | Network switching method and device | |
US20190056970A1 (en) | Method for computer-aided coupling a processing module into a modular technical system and modular technical system | |
CN104579802A (en) | Method for fast fault restoration of multipath server | |
US20180267870A1 (en) | Management node failover for high reliability systems | |
CN112445739A (en) | Circuit and method for supporting non-inductive upgrading of BIOS | |
CN115550291A (en) | Reset system and method for switch, storage medium, and electronic device | |
CN106528320B (en) | Computer system | |
CN109684257B (en) | Remote memory expansion management system | |
CN105009086A (en) | Method for switching processors, computer, and switching apparatus | |
JP6135403B2 (en) | Information processing system and information processing system failure processing method | |
CN107861763A (en) | A kind of interruption routed environment restoration methods towards Feiteng processor sleep procedure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20141029 |