WO2014207893A1 - Computation circuit and computer - Google Patents

Computation circuit and computer

Info

Publication number
WO2014207893A1
Authority
WO
WIPO (PCT)
Prior art keywords
arithmetic
unit
redundant
arithmetic unit
fpga
Application number
PCT/JP2013/067816
Other languages
French (fr)
Japanese (ja)
Inventor
本村 哲朗
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 株式会社日立製作所
Priority to PCT/JP2013/067816
Publication of WO2014207893A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1629Error detection by comparing the output of redundant processing systems
    • G06F11/1641Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/18Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
    • G06F11/183Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components
    • G06F11/184Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components where the redundant components implement processing functionality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1629Error detection by comparing the output of redundant processing systems
    • G06F11/1641Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
    • G06F11/1645Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components and the comparison itself uses redundant hardware

Definitions

  • the present invention relates to a technique for improving the reliability of a redundant arithmetic circuit.
  • FPGA Field Programmable Gate Array
  • ASIC Application Specific Integrated Circuit
  • The SRAM type FPGA implements the user-defined internal configuration by loading bitstream data defining the internal configuration into a CRAM (Configuration RAM). Each bit of the bitstream data loaded into the CRAM indicates a circuit configuration defined by the user.
  • Patent Literature 1 and Patent Literature 2 are known as techniques that use triplexing to ensure redundancy.
  • the present invention provides a redundant arithmetic unit including a first arithmetic unit and a second arithmetic unit that operate on input data in parallel, an arithmetic result of the first arithmetic unit, and the second arithmetic unit.
  • In an arithmetic circuit having a redundant arithmetic unit that includes two arithmetic units, and one third arithmetic unit, the redundant arithmetic unit and the third arithmetic unit are combined into a triplicated arithmetic unit only when an error occurs. The operation result can then be restored by performing the operation on the triplicated arithmetic unit and performing majority processing in the majority circuit, which improves reliability while suppressing the manufacturing cost of the arithmetic circuit.
  • FIG. 1 is a block diagram illustrating an example of a computer system according to a first embodiment of this invention.
  • FIG. 2 is a block diagram illustrating an example of a matching engine (ME) server according to the first embodiment of this invention.
  • FIG. 3 is a block diagram illustrating an example of an FPGA board according to the first embodiment of this invention.
  • FIG. 4 is a sequence diagram illustrating an example of processing performed in the computer system according to the first embodiment of this invention.
  • FIG. 5 is a sequence diagram illustrating an example of processing performed within the FPGA board according to the first embodiment of this invention.
  • FIG. 6A is a block diagram illustrating an example of the FPGA according to the second embodiment of this invention.
  • FIG. 6B is a diagram illustrating the relationship between the output of the FPGA and the clock according to the second embodiment of this invention.
  • FIG. 7 is a block diagram illustrating an example of an FPGA board according to the third embodiment of this invention.
  • FIG. 8 is a flowchart illustrating an example of processing performed by the CPU of the FPGA board according to the third embodiment of this invention.
  • FIG. 1 shows a first embodiment of the present invention and is a block diagram showing an example of a computer system.
  • As the computer system of the present invention, an example is shown in which transactions such as securities trades are processed in parallel on an FPGA board equipped with a large number of FPGAs (Field Programmable Gate Arrays).
  • the transaction request (Order) received by the gateway device 12 is distributed to a plurality of nodes 10-1 to 10-n via the network 13.
  • Each of the nodes 10-1 to 10-n includes a primary matching engine (hereinafter referred to as ME) server 1-P and a secondary matching engine (ME in the figure) server 1-S.
  • the primary ME server 1-P receives Order, executes predetermined arithmetic processing by an FPGA board described later, and returns the processing result to the gateway device 12.
  • the gateway device 12 transmits the processing result to the Order transmission source.
  • The gateway device 12 distributes the received Orders to the nodes 10-1 to 10-n by a well-known method such as round robin or weighted round robin.
  • An Order is, for example, a transaction request for securities, and a large number of Orders are processed in parallel by the plurality of nodes 10-1 to 10-n.
  • the FPGA boards of the nodes 10-1 to 10-n perform processing.
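The distribution step performed by the gateway device 12 can be sketched as follows. This is a minimal illustration only: the description names round robin and weighted round robin as examples without specifying an implementation, so the function names, node labels, and integer weights below are assumptions made for the sketch.

```python
from itertools import cycle


def round_robin(nodes):
    """Return an endless iterator that visits the nodes in a fixed order."""
    return cycle(nodes)


def weighted_round_robin(weights):
    """Return an endless iterator in which each node appears `weight`
    times per cycle. `weights` maps a node label to a positive integer
    (hypothetical values, not taken from the description)."""
    expanded = [node for node, w in weights.items() for _ in range(w)]
    return cycle(expanded)


def distribute(orders, selector):
    """Pair each Order with the next node chosen by `selector`."""
    return [(order, next(selector)) for order in orders]
```

With three nodes and plain round robin, the fourth Order wraps around to the first node again; with weights of 2 and 1, the first node receives two of every three Orders.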
  • FIG. 2 is a block diagram showing an example of the ME server 1-P. Since the primary ME server 1-P and the secondary ME server 1-S have the same configuration, only the ME server 1-P will be described.
  • The ME server 1-P includes a CPU 2 that performs arithmetic processing, a memory 3 that stores data and programs, a NIC card (or network I/F) 6 that communicates with the network 13 (or the inter-server network 11), an FPGA board 5 that executes the processing of the Order, and a PCI express 4 serving as an interconnect (or bus) that connects the CPU 2 to the FPGA board 5 and the NIC card 6.
  • the FPGA board 5 is connected to the PCI express 4 via the socket 41, and the NIC card 6 is connected to the PCI express 4 via the socket 42.
  • Although an example is shown in which the PCI express 4 is used as the interconnect that connects the CPU 2 to the I/O devices, namely the FPGA board 5 and the NIC card 6, a PCI bus or the like may be used instead.
  • FIG. 3 is a block diagram showing an example of the FPGA board 5.
  • The FPGA board 5 includes: a data transfer FPGA 50 that transmits and receives data and commands to and from the PCI express 4; redundant FPGAs 51-1 to 51-n (Dual Modular Redundancy FPGAs, DMR FPGAs in the figure), each of which includes a plurality of FPGAs and executes arithmetic processing; a third FPGA 52 that performs an operation when an error occurs in any of the redundant FPGAs 51-1 to 51-n; an error detection / repair module 55 (Detect and Recovery Module in the figure) that functions as a control unit, monitoring the outputs of the redundant FPGAs 51-1 to 51-n to detect an error in the operation result and repairing the operation result when there is an error; and selectors 53, 54-1, and 54-2 (Sel1 to Sel3 in the figure) that switch the circuit connections.
  • the selector 53 (Sel1) functions as a first selector that switches the input of the third FPGA 52, and the selector 54-1 (Sel2) functions as a second selector that switches the primary FPGA 500-1 to be input to the majority circuit 57.
  • the selector 54-2 (Sel3) functions as a third selector for switching the secondary FPGA 500-2 to be input to the majority circuit 57.
  • the data transfer FPGA 50 and the error detection / repair module 55 are configured by a flash memory type FPGA.
  • the redundant FPGAs 51-1 to 51-n and the third FPGA 52 are of SRAM type.
  • The redundant FPGAs 51-1 to 51-n are collectively denoted by reference numeral 51. The same applies to other components: a collective reference numeral is written by omitting the suffix.
  • The data transfer FPGA 50 is connected to a memory 60 for holding data to be transmitted and received, a signal line 40 connected to the PCI express 4 via the socket 41, a signal line 45 for transferring data and commands to each redundant FPGA 51, a signal line 59 for receiving an output from the error detection / repair module 55, and a signal line 58 for receiving a command from the error detection / repair module 55.
  • When the data transfer FPGA 50 receives data (Order data) from the PCI express 4, it distributes the data to the plurality of redundant FPGAs 51. If no redundant FPGA 51 is available to receive the data at that time, the data is distributed by a well-known method such as queuing, in which the data is held in the memory 60.
  • In this example, the data transfer FPGA 50 and the redundant FPGAs 51 are connected through a PCI express signal line 45.
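The queuing behaviour of the data transfer FPGA 50 can be modelled in software roughly as below. This is a sketch under stated assumptions: the description says only that data is held in the memory 60 when no redundant FPGA 51 is free, so the single shared queue, the class name `Dispatcher`, and its methods are all hypothetical.

```python
from collections import deque


class Dispatcher:
    """Toy model of the data transfer FPGA 50: hand Orders to idle
    redundant FPGAs and queue the rest."""

    def __init__(self, unit_ids):
        self.idle = deque(unit_ids)  # redundant FPGAs ready to accept data
        self.pending = deque()       # stands in for the queue in memory 60
        self.assigned = {}           # unit id -> Order being processed

    def submit(self, order):
        """Assign the Order to an idle unit, or queue it if none is idle.
        Returns the chosen unit id, or None when the Order was queued."""
        if self.idle:
            unit = self.idle.popleft()
            self.assigned[unit] = order
            return unit
        self.pending.append(order)
        return None

    def complete(self, unit):
        """Mark `unit` as finished and give it the oldest queued Order,
        if there is one; otherwise return it to the idle pool."""
        self.assigned.pop(unit)
        if self.pending:
            self.assigned[unit] = self.pending.popleft()
        else:
            self.idle.append(unit)
```

A per-FPGA queue would fit the description equally well; the shared queue is chosen here only because it keeps the sketch short.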
  • the redundant FPGA 51-1 includes a primary FPGA (FPG Pri in the figure) 500-1 serving as an active system and a secondary FPGA (FPGA Sec in the figure) 500-2 serving as a standby system.
  • the primary FPGA 500-1 is connected with a memory 61-1 for holding data, calculation results, and the like
  • the secondary FPGA 500-2 is connected with a memory 61-2 for holding data, calculation results, and the like.
  • The primary FPGA 500-1 functions as a first arithmetic unit, the secondary FPGA 500-2 functions as a second arithmetic unit, and the third FPGA 52 functions as a third arithmetic unit; each FPGA executes the same arithmetic processing.
  • Order data is transmitted to the redundant FPGA 51-1 through the PCI express signal line 45.
  • the redundant FPGA 51-1 is connected with a signal line 58 for receiving a command from the error detection / repair module 55 and a signal line 46-1 for transmitting data to the selector 53.
  • the primary FPGA 500-1 and the secondary FPGA 500-2 each execute predetermined calculation processing on the data received from the data transfer FPGA 50, and output the calculation result to the comparison circuit 56-1. Note that the primary FPGA 500-1 and the secondary FPGA 500-2 respectively execute the same calculation process for one Order data.
  • The output side of the primary FPGA 500-1 is connected to the comparison circuit 56-1 of the error detection / repair module 55 through the signal line 70-1, and the output side of the secondary FPGA 500-2 is likewise connected to the comparison circuit 56-1 of the error detection / repair module 55 through the signal line 71-1.
  • Similarly, the redundant FPGAs 51-2 to 51-n are connected to the comparison circuits 56-2 to 56-n of the error detection / repair module 55 through the signal lines 70-2 to 70-n and 71-2 to 71-n.
  • the third FPGA 52 is configured in the same manner as the primary FPGA 500-1, and includes a memory 62.
  • the third FPGA 52 is connected to the selector 53 on the input side, and receives Order data from the redundant FPGA 51 in which an error has occurred via the signal lines 46-1 to 46-n.
  • the output side of the third FPGA 52 is connected to a majority circuit (Vota in the figure) 57 of the error detection / repair module 55.
  • The error detection / repair module 55 includes comparison circuits 56-1 to 56-n, one for each redundant FPGA 51, and a majority circuit 57 that receives the outputs of the redundant FPGA 51 selected by the selector (second selector) 54-1 and the selector (third selector) 54-2, together with the output of the third FPGA 52.
  • When the two inputs from the redundant FPGA 51 (the calculation results of the primary FPGA 500-1 and the secondary FPGA 500-2) match, the comparison circuit 56 outputs the calculation result on the signal line 59 and transmits it to the data transfer FPGA 50.
  • the data transfer FPGA 50 outputs the calculation result from the error detection / repair module 55 to the PCI express 4.
  • the comparison circuit 56 compares the calculation result of the primary FPGA 500-1 of the redundant FPGA 51 with the calculation result of the secondary FPGA 500-2, and detects an error if the calculation results do not match.
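The role of a comparison circuit 56 can be summarised in a few lines. This is an illustrative software model, not the circuit itself; the function name and the `(result, error)` return convention are assumptions.

```python
def compare(primary_result, secondary_result):
    """Model of a comparison circuit 56.

    Returns (result, error): the shared result and False when the two
    inputs match, or (None, True) when they differ."""
    if primary_result == secondary_result:
        return primary_result, False
    return None, True
```

A mismatch tells the module only that one of the two FPGAs is wrong, not which one; that is why a third result is needed before the output can be repaired.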
  • Via the signal line 58, the error detection / repair module 55 instructs the selectors 53, 54-1, and 54-2 to connect to the redundant FPGA 51 in which the error has occurred.
  • The selector 53 connects the redundant FPGA 51 in which the error has occurred to the third FPGA 52, the selector 54-1 connects the signal line 70 of that redundant FPGA 51 to the majority circuit 57, and the selector 54-2 connects the signal line 71 of that redundant FPGA 51 to the majority circuit 57.
  • Via the signal line 58, the error detection / repair module 55 instructs the redundant FPGA 51 in which an error has occurred to transmit its data to the third FPGA 52. After that, the error detection / repair module 55 causes the third FPGA 52 and the redundant FPGA 51 in which the error has occurred to execute the calculation for the Order data again.
  • At this point, the second selector 54-1 is connected to the signal line 70 of the redundant FPGA 51 in which an error has occurred, and the third selector 54-2 is connected to the signal line 71 of that redundant FPGA 51. Therefore, three calculation results are input to the majority circuit 57: the calculation result of the third FPGA 52, the calculation result of the primary FPGA 500-1 of the redundant FPGA 51 in which the error has occurred, and the calculation result of the secondary FPGA 500-2 of the same redundant FPGA 51.
  • In this way, a triplicated FPGA arithmetic circuit is temporarily configured from the two FPGAs 500-1 and 500-2 of the redundant FPGA 51 in which the error has occurred and the third FPGA 52. The majority circuit 57 then compares the operation results of the temporary triplicated circuit and performs majority processing, outputting the operation result with the largest number of matches to the signal line 59.
  • the data transfer FPGA 50 outputs the calculation result received from the error detection / repair module 55 (majority decision circuit 57) to the PCI express 4 as a calculation result for the Order data.
  • the CPU 2 and the NIC card 6 transmit the calculation result output from the FPGA board 5 to the PCI express 4 to the gateway device 12.
  • the error detection / repair module 55 notifies the data transfer FPGA 50 of the identifier of the redundant FPGA 51 in which an error has occurred.
  • the data transfer FPGA 50 notifies the CPU 2 of the identifier of the redundant FPGA 51 in which an error has occurred via the PCI express 4.
  • the error detection / repair module 55 may notify the data transfer FPGA 50 that an error has occurred.
  • When the CPU 2 receives the identifier of the redundant FPGA 51 in which an error has occurred from the FPGA board 5 via the PCI express 4, the CPU 2 switches the device that processes the Order data from the primary ME server 1-P to the secondary ME server 1-S.
  • With the redundant FPGA 51 configured from SRAM type FPGAs, it is thus possible to switch to the secondary ME server 1-S after correcting the error using the third FPGA 52.
  • As described above, errors are detected using n duplicated redundant FPGAs 51, and when an error occurs, the third FPGA 52 and the two FPGAs 500-1 and 500-2 of the redundant FPGA 51 in which the error has occurred form a temporary triplicated FPGA, which performs the calculation again on the same data. The three operation results of the triplicated FPGA are then compared by the majority circuit 57, and the operation result with the largest number of matches is output as the repaired operation result.
  • With n redundant FPGAs 51, each including the two FPGAs 500-1 and 500-2, and one third FPGA 52, the computation is performed again on the triplicated FPGA only when an error occurs, so that the operation result can be repaired.
  • Only one third FPGA 52 is required for triplication, so the manufacturing cost of the FPGA board 5 can be suppressed.
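The detect-then-repair flow summarised above can be sketched as a small software model. It assumes each FPGA is represented by a callable and that results are hashable values; the function names are hypothetical, and returning `None` when all three results differ is an assumption, since the description does not cover that case.

```python
from collections import Counter


def majority(results):
    """Return the value reported by at least two of the three units,
    or None if all three differ (behaviour assumed for this sketch)."""
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None


def process(order, primary, secondary, third):
    """Run `order` on the duplicated pair; triplicate and vote only on
    a mismatch. Returns (result, error_detected)."""
    a, b = primary(order), secondary(order)
    if a == b:
        return a, False
    # Mismatch: recompute on all three units with the same data, then vote.
    results = [primary(order), secondary(order), third(order)]
    return majority(results), True
```

In the normal case the third unit is never invoked, which mirrors the point that a single third FPGA 52 can serve all n redundant FPGAs 51.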
  • The error detection / repair module 55 resets the selectors 53, 54-1, and 54-2, disconnecting the redundant FPGA 51 in which the error has occurred from the third FPGA 52 and the majority circuit 57.
  • Although an example is shown in which the error detection / repair module 55 functions as a control unit that performs error detection with the comparison circuit 56 and control of the selectors 53, 54-1, and 54-2, the configuration is not limited to this.
  • A control circuit that controls the selectors 53, 54-1, and 54-2 based on the comparison result of the comparison circuit 56 may be provided independently.
  • FIG. 4 is a sequence diagram showing an example of processing performed in the computer system. In the illustrated example, an example in which an error has occurred in the redundant FPGA 51 is shown.
  • the gateway device 12 transfers the Order to a predetermined node 10 via the network 13 (S1). As described above, the gateway device 12 selects the node 10 as the transfer destination of Order by a load balancing method such as round robin. In the following description, an example is shown in which the gateway device 12 transfers Order to the node 10-1.
  • When the primary ME server 1-P receives the Order, the CPU 2 receives the Order through the NIC card 6 and stores it in the memory 3.
  • the CPU 2 executes a predetermined program (hereinafter referred to as a node control program) for processing the received Order by the FPGA board 5.
  • The CPU 2 that executes the node control program transfers the Order written in the memory 3 to the secondary ME server 1-S.
  • the secondary ME server 1-S also executes the node control program, and copies the Order received from the primary ME server 1-P to the memory 3. Then, the CPU 2 of the secondary ME server 1-S transmits a response (Ack in the figure) indicating that the Order copy has been saved to the primary ME server 1-P (S4).
  • the CPU 2 of the primary ME server 1-P that executes the node control program transmits the Order stored in the memory 3 to the FPGA board 5 (S3).
  • When the CPU 2 of the primary ME server 1-P receives the response from the secondary ME server 1-S, it sends a calculation start command to the FPGA board 5 (S5).
  • In the FPGA board 5, when the Order is received from the CPU 2, the data transfer FPGA 50 stores the Order in the queue set in the memory 60.
  • the data transfer FPGA 50 transmits Order to the redundant FPGA 51 corresponding to the queue to execute a predetermined calculation (S6).
  • The Order includes a transaction request for securities, and the redundant FPGA 51 of the FPGA board 5 executes, as the predetermined process, a matching process that establishes a transaction for the data included in the Order.
  • In the figure, ME denotes the matching engine, which is the function of the redundant FPGA 51.
  • the two computation results of the primary FPGA 500-1 and the secondary FPGA 500-2 of the redundant FPGA 51 are input to the comparison circuit 56, and an error occurs if the two computation results do not match (error 1 in the figure).
  • The error detection / repair module 55 of the FPGA board 5 connects the redundant FPGA 51 in which an error has occurred and the third FPGA 52 using the selectors 53, 54-1, and 54-2, and performs the operation again (S7).
  • The majority circuit 57 selects a calculation result from the outputs of the triplicated FPGA, and the FPGA board 5 outputs it as the error-corrected calculation result (S8), which is returned to the gateway device 12 via the network 13. Note that the calculation result output from the FPGA board 5 is received by the CPU 2 from the PCI express 4 and is output from the NIC card 6.
  • When the CPU 2 of the primary ME server 1-P that executes the node control program receives the identifier of the redundant FPGA 51 in which an error has occurred from the FPGA board 5, it instructs the standby secondary ME server 1-S to take over the processing (S9).
  • The next received Order is processed by the secondary ME server 1-S, which responds to the gateway device 12.
  • FIG. 5 is a sequence diagram illustrating an example of processing performed in the FPGA board 5 according to the first embodiment of this invention.
  • the data transfer FPGA 50 receives the order (S11).
  • The data transfer FPGA 50 transmits the data included in the Order to a predetermined redundant FPGA 51, and the primary FPGA 500-1 and the secondary FPGA 500-2 each perform the calculation on the received data (S12, S13).
  • The comparison circuit 56 of the error detection / repair module 55 determines whether the calculation results (Re in the figure) of the primary FPGA 500-1 and the secondary FPGA 500-2 match (S14). If they do not match, an error has occurred, and the error detection / repair module 55 switches the selectors 53, 54-1, and 54-2 so that, as described above, the primary and secondary FPGAs of the redundant FPGA 51 in which the error has occurred and the third FPGA 52 are triplicated and connected to the majority circuit 57 (S15).
  • the error detection / repair module 55 instructs the redundant FPGA 51 in which an error has occurred to transmit the current data to the third FPGA 52.
  • the redundant FPGA 51 in which the error has occurred transmits the currently held data to the third FPGA 52 (S16).
  • The error detection / repair module 55 instructs the triplicated set, that is, the redundant FPGA 51 and the third FPGA 52, to execute the operation on the currently held data (S17).
  • In the redundant FPGA 51 in which an error has occurred, the primary and secondary FPGAs 500-1 and 500-2 calculate the previous data again, and the third FPGA 52 calculates the data copied from the redundant FPGA 51. The operation results of these triplicated FPGAs are then input to the majority circuit 57 (S18).
  • The majority circuit 57 executes a majority vote on the three input operation results and outputs the operation result with the largest number of matches (S19).
  • the output calculation result is transferred to the data transfer FPGA 50 via the signal line 59 shown in FIG. 3, and is output to the PCI express 4 (S20).
  • As described above, the FPGA board 5 is composed of n redundant FPGAs 51, each including the two FPGAs 500-1 and 500-2, and one third FPGA 52, and the calculation is performed again on the triplicated FPGA only when an error occurs. This makes it possible to restore the calculation result and to improve reliability while suppressing the manufacturing cost of the FPGA board 5.
  • Although an example is shown in which communication between the FPGA board 5 of the primary ME server 1-P and the secondary ME server 1-S is transmitted and received through the NIC card 6 under the control of the CPU 2, the configuration is not limited to this.
  • the data transfer FPGA 50 of the FPGA board 5 may notify the secondary ME server 1-S of data transfer or processing takeover by remote DMA.
  • the majority circuit 57 can perform majority processing on the three calculation results, and can identify an FPGA having a different calculation result as the FPGA in which an error has occurred.
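Identifying the faulty FPGA from the three results amounts to finding the one dissenting input. The sketch below assumes the results arrive as a mapping from a unit label to a hashable value; the labels and the function name are hypothetical.

```python
from collections import Counter


def find_dissenter(results):
    """Given {unit_name: result} for three units, return the name of the
    single unit that disagrees with the other two. Returns None when all
    three agree or when no two results match."""
    value, count = Counter(results.values()).most_common(1)[0]
    if count != 2:
        return None
    return next(name for name, r in results.items() if r != value)
```

Note that this pinpoints a single faulty unit only; if two units fail in the same way, a vote of this kind would blame the healthy one.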
  • FIG. 6A is a block diagram illustrating an example of the FPGA 500 according to the second embodiment.
  • FIG. 6A shows the configuration of the primary FPGA 500-1, but the secondary FPGA 500-2 and the third FPGA 52 have the same configuration.
  • The primary FPGA 500-1 mainly includes a PCI express interface 501 connected to the PCI express 4, a controller 502, an arithmetic module A (503A), an arithmetic module B (503B), an internal selector 504 that switches between the output of the arithmetic module A and the output of the arithmetic module B in accordance with a clock signal CLK, and a memory interface 505 that controls the memory 61-1.
  • the arithmetic module A and the arithmetic module B are obtained by dividing the arithmetic unit of the primary FPGA 500-1 of the first embodiment.
  • the arithmetic module A performs processing from receiving data from the data line 510 (DIN in the figure) of the PCI express interface 501 until writing to the memory 61-1 via the memory interface 505.
  • the calculation module B reads data from the memory 61-1 via the memory interface 505, executes a predetermined matching process, and outputs a calculation result.
  • the matching process is the same as in the first embodiment, and the calculation result is the same as in the first embodiment.
  • the internal selector 504 switches between the output of the arithmetic module A and the output of the arithmetic module B in accordance with a command from the controller 502 and outputs the result to the signal line 70-1. For this reason, the controller 502 is connected with a control line 511 (CIN in the figure) of the PCI express interface 501 and a clock signal line 520 for inputting an external clock signal CLK.
  • As the external clock signal CLK, for example, a signal generated by the CPU 2 can be used.
  • the relationship between the clock signal CLK and the output of the internal selector 504 is shown in FIG. 6B.
  • When the clock signal CLK is “0”, the internal selector 504 selects the output of the arithmetic module A; when the clock signal CLK is “1”, the internal selector 504 selects the output of the arithmetic module B.
  • the external clock signal CLK is input to the primary FPGA 500-1, the secondary FPGA 500-2, and the third FPGA 52 of the redundant FPGAs 51-1 to 51-n shown in FIG. Then, after an error has occurred in the redundant FPGA 51, it is tripled with the third FPGA 52, and the majority circuit 57 receives the calculation results of the three FPGAs.
  • First, with the clock signal CLK at “1”, the majority circuit 57 determines whether or not the calculation results of the calculation module B match. Since not all of the operation results match for the redundant FPGA 51 in which an error has occurred, the error detection / repair module 55 can identify whether the FPGA that output a different operation result is the primary FPGA 500-1 or the secondary FPGA 500-2.
  • Next, with the clock signal CLK at “0”, the majority circuit 57 determines whether or not the calculation results of the calculation module A match.
  • If the results of the calculation module A match, the error detection / repair module 55 can determine that the error occurred in the calculation module B of the FPGA that output a different calculation result in the first determination by the majority circuit 57.
  • If they do not match, the error detection / repair module 55 can identify that the error occurred in the calculation module A of the FPGA that output a different calculation result in this (second) determination by the majority circuit 57.
  • As described above, the arithmetic unit of each FPGA constituting the redundant FPGA 51 is divided into a plurality of parts, and the operation results of the divided parts are switched by the internal selector 504 and output.
  • As a result, the error detection / repair module 55 can identify the part (arithmetic unit) in which the error has occurred, in addition to the FPGA in which the error has occurred.
  • Although an example is shown in which the error detection / repair module 55 identifies the FPGA in which an error has occurred (either the primary FPGA 500-1 or the secondary FPGA 500-2) and the part (arithmetic unit) in which the error has occurred, the part in which the error occurred may instead be identified by the data transfer FPGA 50 or the CPU 2.
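The two-pass localisation of the second embodiment can be sketched as follows. The model assumes each of the three FPGAs is represented by a pair `(out_a, out_b)` holding the outputs of its modules A and B; the function names are hypothetical, and the rule that agreement among the module A outputs implicates module B is an interpretation of the description rather than a quoted procedure.

```python
from collections import Counter


def select_output(clk, out_a, out_b):
    """Internal selector 504: CLK 0 selects module A, CLK 1 selects module B."""
    return out_b if clk == 1 else out_a


def dissenter(values):
    """Index of the single value that differs from the other two, else None."""
    top, count = Counter(values).most_common(1)[0]
    if count != 2:
        return None
    return next(i for i, v in enumerate(values) if v != top)


def locate_fault(units):
    """`units` is a list of three (out_a, out_b) pairs, one per FPGA.
    Returns (unit_index, module_name), or None if no single fault is found."""
    # First pass, CLK = 1: vote on the module B outputs.
    bad_b = dissenter([select_output(1, a, b) for a, b in units])
    if bad_b is None:
        return None
    # Second pass, CLK = 0: vote on the module A outputs.
    a_values = [select_output(0, a, b) for a, b in units]
    if len(set(a_values)) == 1:
        return bad_b, "B"  # module A outputs agree, so module B is at fault
    bad_a = dissenter(a_values)
    return (bad_a, "A") if bad_a is not None else None
```

For example, `locate_fault([(1, 10), (1, 99), (1, 10)])` points at module B of the second unit, because its module A output agrees with the others while its module B output does not.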
  • FIGS. 7 and 8 show the third embodiment.
  • FIG. 7 is a block diagram showing an example of the FPGA board 5.
  • FIG. 8 is a flowchart illustrating an example of processing performed by the CPU of the FPGA board 5.
  • In the first embodiment, n redundant FPGAs 51 and one third FPGA 52 are provided, and when an error occurs, the redundant FPGA 51 and the third FPGA 52 are combined to form a triplicated configuration.
  • the third embodiment shows an example in which one of the n redundant FPGAs 51-1 to 51-n is used as the third FPGA.
  • the redundant FPGA 51 includes a primary FPGA 500-1 and a secondary FPGA 500-2, and the outputs of the FPGAs 500-1 and 500-2 are connected to the comparison circuit 56 via signal lines 70 and 71, respectively.
  • the outputs of the FPGAs 500-1 and 500-2 are connected from the signal lines 70 and 71 to the majority circuit 57 via the selectors 54-1 and 54-2.
  • the local CPU 80 selects any one of the redundant FPGAs 51-1 to 51-n as the third FPGA.
  • Since one redundant FPGA 51 includes the two FPGAs 500-1 and 500-2, the local CPU 80 selects one of them as the third FPGA.
  • The local CPU 80 selects, in round-robin fashion, the redundant FPGA 51 that provides the third FPGA, and uses the primary FPGA 500-1 of that redundant FPGA 51 as the third FPGA.
  • The error detection / repair module 55 is obtained by omitting the control of the selectors 54-1 and 54-2 from the configuration of the first embodiment.
  • the error detection / repair module 55 notifies the local CPU 80 of the identifier of the redundant FPGA 51 in which the mismatch error has occurred via the signal line 58.
  • The local CPU 80 commands the selectors 54-1 and 54-2 to connect the signal lines 70 and 71 of the redundant FPGA 51 in which an error has occurred, and the signal line 70 of the redundant FPGA 51 functioning as the third FPGA, to the majority circuit 57. Then, the local CPU 80 copies the current data from the redundant FPGA 51 in which the error has occurred to the third FPGA. Thereafter, the local CPU 80 instructs the redundant FPGA 51 in which the error has occurred and the third FPGA to execute the calculation.
  • the repaired calculation result is output from the majority circuit 57 of the error detection / repair module 55. Thereafter, the local CPU 80 notifies the secondary ME server 1-S of the takeover of processing via the data transfer FPGA 50.
  • the flowchart of FIG. 8 is executed in a state where the activation of the FPGA board 5 is completed and the third FPGA is selected.
  • the local CPU 80 monitors the signal line 58 and waits until the error detection / repair module 55 notifies the error (S31).
  • the local CPU 80 receives the identifier from the signal line 58 and identifies the redundant FPGA 51 in which the error has occurred (S32).
  • the local CPU 80 commands the selectors 54-1 and 54-2 to connect the signal lines 70 and 71 of the redundant FPGA 51 in which the error has occurred, and the signal line 70 of the redundant FPGA 51 functioning as the third FPGA, to the majority circuit 57 (S33, S34).
  • the local CPU 80 copies the current data from the redundant FPGA 51 in which the error has occurred to the third FPGA (S35).
  • the local CPU 80 instructs the redundant FPGA 51 in which the error has occurred and the third FPGA to execute the calculation (S36).
  • the primary and secondary FPGAs 500-1 and 500-2 calculate the previous data again, and the third FPGA calculates the data copied from the redundant FPGA 51. Then, the calculation results of the three FPGAs are input to the majority circuit 57.
  • the majority circuit 57 executes a majority vote on the three input calculation results, and outputs the calculation result on which the largest number of inputs agree to the signal line 59 (S37).
  • the local CPU 80 instructs the standby secondary ME server 1-S to take over the processing (S38).
  • one of the n redundant FPGAs 51, each including the two FPGAs 500-1 and 500-2, is set as the third FPGA, and, as in the first embodiment, the calculation is performed again by the triplicated FPGAs only when an error occurs, so that the calculation result can be repaired. This makes it possible to improve reliability while suppressing the manufacturing cost of the FPGA board 5.
  • since the third FPGA can be selected from among the n redundant FPGAs 51, when an error occurs in the third FPGA its role can be taken over by the secondary FPGA 500-2 or by another redundant FPGA 51, so that the redundancy of the third FPGA can be ensured.
  • a plurality of redundant FPGAs 51 including two FPGAs 500-1 and 500-2 may be arranged, so that semiconductor design and manufacture can be facilitated.
  • the present invention is not limited to the above-described embodiments, and includes various modifications.
  • the above-described embodiments have been described in detail for easy understanding of the present invention, and are not necessarily limited to those having all the configurations described.
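The recovery flow of steps S31 to S38 listed above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function names (`vote`, `choose_third`, `recover`), the representation of each FPGA as a Python callable, the skipping of the faulty pair during round-robin selection, and the choice to return `None` when all three results differ are assumptions made for the sketch and are not stated in the description.

```python
from collections import Counter
from itertools import cycle


def vote(results):
    """Return the value shared by at least two of the three results, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None


def choose_third(rotation, faulty_id):
    """Advance a round-robin iterator until it yields a pair other than the faulty one.

    Assumes at least two pairs exist; skipping the faulty pair is an assumption.
    """
    for candidate in rotation:
        if candidate != faulty_id:
            return candidate


def recover(pairs, faulty_id, third_id, data):
    """Re-run `data` on the faulty pair plus the designated third unit and vote.

    `pairs` maps a pair identifier to (primary_fn, secondary_fn).
    """
    primary, secondary = pairs[faulty_id]   # S32-S34: route the faulty pair to the voter
    third = pairs[third_id][0]              # primary unit of the designated pair acts as third
    # S35-S36: the same data is given to all three units and computed again
    results = [primary(data), secondary(data), third(data)]
    return vote(results)                    # S37: majority decision
```

For example, with three pairs in which the secondary unit of pair 1 is faulty, `choose_third(cycle(pairs), 1)` yields pair 2, and `recover` returns the value on which the primary unit of pair 1 and the third unit agree.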

Abstract

Provided is a computation circuit, comprising: a redundant computation unit, further comprising a first computation unit and a second computation unit, which compute on inputted data in parallel; a comparison circuit which compares the computation result of the first computation unit and the computation result of the second computation unit; a third computation unit which computes the data of the redundant computation unit when the comparison result of the comparison circuit is not a match; and selectors which connect the output of the first computation unit, the output of the second computation unit, and the output of the third computation unit to a voter circuit when the comparison result of the comparison circuit is not a match.

Description

Arithmetic circuit and computer
The present invention relates to a technique for improving the reliability of a redundant arithmetic circuit.
In IT equipment produced in small volumes, it is desirable to use, as arithmetic units, FPGAs (Field Programmable Gate Arrays), which can be customized at low cost, in place of ASICs (Application Specific Integrated Circuits), which have high development costs.
An SRAM-type FPGA realizes a user-defined internal configuration by loading bitstream data that defines the internal configuration into a CRAM (Configuration RAM). The bits of the bitstream data loaded into the CRAM represent the user-defined circuit configuration.
In recent years, the integration density of SRAM-type FPGAs has increased as semiconductor processes have been miniaturized. On the other hand, as miniaturization has progressed, bit errors in the CRAM (SRAM) caused by environmental radiation (cosmic-ray neutrons and alpha rays at ground level) have tended to occur.
A 1-bit error can be recovered by existing techniques such as ECC, but when errors occur in two adjacent bits, they must be handled by multiplexing the FPGAs (for example, triplication). Patent Literature 1 and Patent Literature 2, for example, are known as techniques that use triplication to ensure redundancy.
Patent Literature 1: US Pat. No. 6,389,041
Patent Literature 2: JP 2013-46181 A
However, although triplicating the FPGAs ensures redundancy, it increases the cost of the arithmetic circuit or arithmetic device. In particular, in an arithmetic device that performs arithmetic processing in parallel on a large number of FPGAs, three FPGAs are used for each arithmetic process, so the cost of the arithmetic circuit or arithmetic device becomes excessive as the degree of parallelism increases.
The present invention comprises: a redundant arithmetic unit including a first arithmetic unit and a second arithmetic unit that compute on input data in parallel; a comparison circuit that compares the computation result of the first arithmetic unit with the computation result of the second arithmetic unit; a third arithmetic unit that computes on the data of the redundant arithmetic unit when the comparison result of the comparison circuit indicates a mismatch; and selectors that connect the output of the first arithmetic unit, the output of the second arithmetic unit, and the output of the third arithmetic unit to a majority circuit when the comparison result of the comparison circuit indicates a mismatch.
According to the present invention, in an arithmetic circuit having a redundant arithmetic unit that includes two arithmetic units, and one third arithmetic unit, computation is performed by the triplicated arithmetic units formed from the redundant arithmetic unit and the third arithmetic unit only when an error occurs, and the majority circuit performs majority processing, so that the computation result can be repaired. This makes it possible to improve reliability while suppressing the manufacturing cost of the arithmetic circuit.
A block diagram showing an example of a computer system according to a first embodiment of the present invention.
A block diagram showing an example of a matching engine (ME) server according to the first embodiment of the present invention.
A block diagram showing an example of an FPGA board according to the first embodiment of the present invention.
A sequence diagram showing an example of processing performed in the computer system according to the first embodiment of the present invention.
A sequence diagram showing an example of processing performed within the FPGA board according to the first embodiment of the present invention.
A block diagram showing an example of an FPGA according to a second embodiment of the present invention.
A diagram showing the relationship between the FPGA outputs and the clock according to the second embodiment of the present invention.
A block diagram showing an example of an FPGA board according to a third embodiment of the present invention.
A flowchart showing an example of processing performed by the CPU of the FPGA board according to the third embodiment of the present invention.
Hereinafter, an embodiment of the present invention will be described with reference to the accompanying drawings.
FIG. 1 is a block diagram showing an example of a computer system according to a first embodiment of the present invention. The computer system of this embodiment is an example in which transactions such as securities trades are processed in parallel on FPGA boards, each equipped with a large number of FPGAs (Field Programmable Gate Arrays).
Transaction requests (Orders) received by the gateway device 12 are distributed to a plurality of nodes 10-1 to 10-n via the network 13. Each of the nodes 10-1 to 10-n is composed of a primary matching engine (hereinafter, ME) server 1-P and a secondary matching engine (ME in the figure) server 1-S.
When a node is operating normally, the primary ME server 1-P receives the Order, executes predetermined arithmetic processing on the FPGA board described later, and returns the processing result to the gateway device 12. The gateway device 12 transmits the processing result to the sender of the Order. When a failure occurs in the primary (active) ME server 1-P, processing is taken over by the secondary (standby) ME server 1-S.
The primary ME server 1-P and the secondary ME server 1-S are connected by an inter-server network 11. The inter-server network 11 and the network 13 may be the same network. The gateway device 12 distributes received Orders to the nodes 10-1 to 10-n by a known technique such as round robin or weighted round robin. An Order is, for example, a securities transaction request, and a large number of Orders are processed in parallel by the nodes 10-1 to 10-n. The Orders are processed by the FPGA boards of the nodes 10-1 to 10-n.
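As a rough illustration of the round-robin and weighted round-robin distribution mentioned above, the following minimal sketch assigns Orders to nodes in a circular sequence. The function names, the string node labels, and the simple non-interleaved expansion of weights are assumptions made for illustration; the description does not specify how the gateway device 12 implements the distribution.

```python
from itertools import cycle


def distribute(orders, nodes):
    """Pair each order with a node, visiting the nodes in a fixed circular sequence."""
    rotation = cycle(nodes)
    return [(order, next(rotation)) for order in orders]


def weighted_nodes(weights):
    """Expand {node: weight} into a visiting sequence for a simple weighted round robin."""
    return [node for node, weight in weights.items() for _ in range(weight)]
```

Passing `weighted_nodes({"10-1": 2, "10-2": 1})` as the node sequence sends two Orders to node 10-1 for every Order sent to node 10-2.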
FIG. 2 is a block diagram showing an example of the ME server 1-P. Since the primary ME server 1-P and the secondary ME server 1-S have the same configuration, only the ME server 1-P will be described.
The ME server 1-P includes a CPU 2 that performs arithmetic processing, a memory 3 that holds data and programs, a NIC card (or network I/F) 6 that communicates with the network 13 (or the inter-server network 11), an FPGA board 5 that executes the processing of Orders, and PCI Express 4 serving as an interconnect (or bus) that connects the CPU 2 to the FPGA board 5 and the NIC card 6.
In the figure, the FPGA board 5 is connected to PCI Express 4 via a socket 41, and the NIC card 6 is connected to PCI Express 4 via a socket 42. Although an example is shown in which PCI Express 4 is used as the interconnect connecting the CPU 2 to the FPGA board 5 and the NIC card 6 as I/O devices, a PCI bus or the like may be used instead.
FIG. 3 is a block diagram showing an example of the FPGA board 5. The FPGA board 5 is composed of circuits including: a data transfer FPGA 50 that exchanges data and commands with PCI Express 4; redundant FPGAs (Dual Modular Redundancy FPGAs; DMR FPGAs in the figure) 51-1 to 51-n, each of which includes a plurality of FPGAs and executes arithmetic processing; a third FPGA (FPGA Third in the figure) 52 that performs computation when an error occurs in the redundant FPGAs 51-1 to 51-n; an error detection / repair module (Detect and Recovery Module in the figure) 55 that functions as a control unit, monitoring the outputs of the redundant FPGAs 51-1 to 51-n to detect errors in the computation results and repairing the computation result when an error is found; and selectors (Sel1 to Sel3 in the figure) 53, 54-1, and 54-2 that switch data inputs and outputs.
The selector 53 (Sel1) functions as a first selector that switches the input of the third FPGA 52, the selector 54-1 (Sel2) functions as a second selector that switches which primary FPGA 500-1 is input to the majority circuit 57, and the selector 54-2 (Sel3) functions as a third selector that switches which secondary FPGA 500-2 is input to the majority circuit 57.
In this embodiment, the data transfer FPGA 50 and the error detection / repair module 55 are implemented as flash-memory-type FPGAs, whereas the redundant FPGAs 51-1 to 51-n and the third FPGA 52 are SRAM-type FPGAs. In the following, the redundant FPGAs 51-1 to 51-n are referred to collectively by reference numeral 51. For other components as well, the collective reference numeral is written without the suffix.
The data transfer FPGA 50 has a memory 60 that holds data to be transmitted and received. Connected to the data transfer FPGA 50 are a signal line 40 that is connected to PCI Express 4 via the socket 41, a signal line 45 that transfers data and commands to each redundant FPGA 51, a signal line 59 that receives the output of the error detection / repair module 55, and a signal line 58 that receives commands and the like from the error detection / repair module 55.
When the data transfer FPGA 50 receives data (Order data) from PCI Express 4, it distributes the data to the plurality of redundant FPGAs 51. At this time, if no redundant FPGA 51 is available to be assigned the data, the data transfer FPGA 50 distributes the data to the redundant FPGAs 51 by a known technique such as queuing, in which the data is held in the memory 60. In this embodiment, the data transfer FPGA 50 and the redundant FPGAs 51 are connected by a PCI Express signal line 45.
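The queuing behavior described above might be modeled as in the following sketch, in which data that cannot be assigned immediately is held until a redundant FPGA becomes free. The `Dispatcher` class, its single shared queue, and the notion of a unit signalling completion are hypothetical; the description only states that a known technique such as queuing in the memory 60 is used.

```python
from collections import deque


class Dispatcher:
    """Toy model of the data transfer FPGA assigning Orders to redundant FPGAs."""

    def __init__(self, n_units):
        self.idle = deque(range(1, n_units + 1))  # identifiers of free redundant FPGAs
        self.pending = deque()                    # Orders held in memory until a unit is free
        self.running = {}                         # unit identifier -> Order being processed

    def submit(self, order):
        """Assign the order to a free unit, or queue it; return the unit or None."""
        if self.idle:
            unit = self.idle.popleft()
            self.running[unit] = order
            return unit
        self.pending.append(order)
        return None

    def complete(self, unit):
        """Mark a unit as finished; hand it the next queued order if any, and return it."""
        self.running.pop(unit)
        if self.pending:
            order = self.pending.popleft()
            self.running[unit] = order
            return order
        self.idle.append(unit)
        return None
```

With two units, a third submitted Order waits in `pending` and is handed to whichever unit completes first.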
Since the redundant FPGAs 51-1 to 51-n that execute the computation have the same configuration, only the redundant FPGA 51-1 will be described.
The redundant FPGA 51-1 includes a primary FPGA (FPGA Pri in the figure) 500-1 serving as the active system and a secondary FPGA (FPGA Sec in the figure) 500-2 serving as the standby system. A memory 61-1 that holds data, computation results, and the like is connected to the primary FPGA 500-1, and a memory 61-2 that holds data, computation results, and the like is connected to the secondary FPGA 500-2. The primary FPGA 500-1 functions as the first arithmetic unit, the secondary FPGA 500-2 functions as the second arithmetic unit, and the third FPGA 52 functions as the third arithmetic unit. Each of these FPGAs executes the same arithmetic processing.
Order data is transmitted to the redundant FPGA 51-1 via the PCI Express signal line 45. Also connected to the redundant FPGA 51-1 are the signal line 58, which receives commands and the like from the error detection / repair module 55, and a signal line 46-1, which transmits data to the selector 53.
In the redundant FPGA 51-1, the primary FPGA 500-1 and the secondary FPGA 500-2 each execute predetermined arithmetic processing on the data received from the data transfer FPGA 50 and output the computation results to the comparison circuit 56-1. The primary FPGA 500-1 and the secondary FPGA 500-2 each execute the same arithmetic processing on the data of a single Order.
The output side of the primary FPGA 500-1 is connected to the comparison circuit 56-1 of the error detection / repair module 55 via a signal line 70-1, and the output side of the secondary FPGA 500-2 is likewise connected to the comparison circuit 56-1 of the error detection / repair module 55 via a signal line 71-1.
The same applies to the other redundant FPGAs 51: the redundant FPGAs 51-2 to 51-n are connected to the comparison circuits 56-2 to 56-n of the error detection / repair module 55 via signal lines 70-2 to 70-n and 71-2 to 71-n.
The third FPGA 52 is configured in the same manner as the primary FPGA 500-1 and includes a memory 62. The input side of the third FPGA 52 is connected to the selector 53, and the third FPGA 52 receives Order data from the redundant FPGA 51 in which an error has occurred via the signal lines 46-1 to 46-n. The output side of the third FPGA 52 is connected to the majority circuit (Voter in the figure) 57 of the error detection / repair module 55.
The error detection / repair module 55 includes comparison circuits 56-1 to 56-n corresponding to the respective redundant FPGAs 51, and a majority circuit 57 that computes the majority of three outputs: the outputs of the primary FPGA 500-1 and the secondary FPGA 500-2 of the redundant FPGA 51 selected by the selector (second selector) 54-1 and the selector (third selector) 54-2, and the output of the third FPGA 52.
When the two inputs from the redundant FPGA 51 (the computation results of the primary FPGA 500-1 and the secondary FPGA 500-2) match, the comparison circuit 56 outputs this computation result on the signal line 59 to the data transfer FPGA 50. The data transfer FPGA 50 outputs the computation result received from the error detection / repair module 55 to PCI Express 4.
The comparison circuit 56 compares the computation result of the primary FPGA 500-1 of the redundant FPGA 51 with the computation result of the secondary FPGA 500-2, and detects an error if the results do not match. When the comparison circuit 56 detects an error, the error detection / repair module 55 commands the selectors 53, 54-1, and 54-2, via the signal line 58, to connect to the redundant FPGA 51 in which the error has occurred. The selector 53 connects the redundant FPGA 51 in which the error has occurred to the third FPGA 52, the selector 54-1 connects the signal line 70 of that redundant FPGA 51 to the majority circuit 57, and the selector 54-2 connects the signal line 71 of that redundant FPGA 51 to the majority circuit 57.
Next, the error detection / repair module 55 commands, via the signal line 58, that the data of the redundant FPGA 51 in which the error has occurred be transmitted to the third FPGA 52. Thereafter, the error detection / repair module 55 causes the third FPGA 52 and the redundant FPGA 51 in which the error has occurred to execute the computation on the Order data again.
The second selector 54-1 is connected to the signal line 70 of the redundant FPGA 51 in which the error has occurred, and the third selector 54-2 is connected to the signal line 71 of that redundant FPGA 51. Therefore, three computation results are input to the majority circuit 57: that of the third FPGA 52, that of the primary FPGA 500-1 of the redundant FPGA 51 in which the error has occurred, and that of the secondary FPGA 500-2 of the same redundant FPGA 51.
In the FPGA board 5 of this embodiment, when an error occurs, a triplicated FPGA arithmetic circuit is temporarily set up from the two FPGAs 500-1 and 500-2 of the redundant FPGA 51 in which the error has occurred and the third FPGA 52. The majority circuit 57 then compares the computation results of the temporary triplicated circuit and performs majority processing, outputting to the signal line 59 the computation result on which the largest number of FPGAs agree. The data transfer FPGA 50 outputs the computation result received from the error detection / repair module 55 (majority circuit 57) to PCI Express 4 as the computation result for the Order data. The CPU 2 and the NIC card 6 transmit the computation result output from the FPGA board 5 to PCI Express 4 to the gateway device 12.
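The compare-then-vote behavior described above can be illustrated with the following minimal software model. Each FPGA is represented by a Python callable, and the third unit is invoked only after the comparison fails. The function name `detect_and_repair`, the returned error flag, and the handling of the case where all three results differ (returning `None`) are assumptions for this sketch; the description does not say what the majority circuit 57 outputs in that case.

```python
from collections import Counter


def detect_and_repair(primary, secondary, third, data):
    """Return (result, error_detected) for one piece of Order data.

    Normal case: only the duplicated pair computes and the results are compared.
    On mismatch: all three units compute the same data again and a majority is taken.
    """
    a, b = primary(data), secondary(data)
    if a == b:                      # comparison circuit: results match, no error
        return a, False
    # Temporary triplication: recompute on the pair and on the third unit.
    results = [primary(data), secondary(data), third(data)]
    value, count = Counter(results).most_common(1)[0]
    return (value if count >= 2 else None), True
```

Because the third unit is called only on a mismatch, a single third unit can be shared among many duplicated pairs as long as errors are rare.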
The error detection / repair module 55 also notifies the data transfer FPGA 50 of the identifier of the redundant FPGA 51 in which the error has occurred. The data transfer FPGA 50 notifies the CPU 2, via PCI Express 4, of the identifier of the redundant FPGA 51 in which the error has occurred. Alternatively, the error detection / repair module 55 may simply notify the data transfer FPGA 50 that an error has occurred.
When the CPU 2 receives, from the FPGA board 5 via PCI Express 4, the identifier of the redundant FPGA 51 in which the error has occurred, it switches the device that processes Order data from the primary ME server 1-P to the secondary ME server 1-S. Thus, even if an error occurs in a redundant FPGA 51 composed of SRAM-type FPGAs, the error can be repaired using the third FPGA 52 before switching to the secondary ME server 1-S.
In the present invention, errors are detected using n duplicated redundant FPGAs 51. When an error occurs, a temporarily triplicated FPGA configuration is formed from the third FPGA 52 and the two FPGAs 500-1 and 500-2 of the redundant FPGA 51 in which the error has occurred, and the computation is performed again on the same data. The three computation results of the triplicated FPGAs are then compared by the majority circuit 57, and the computation result on which the largest number of FPGAs agree is output as the repaired computation result.
Therefore, with n redundant FPGAs 51, each including the two FPGAs 500-1 and 500-2, and one third FPGA 52, the FPGA board 5 can perform the computation again with triplicated FPGAs only when an error occurs and thereby repair the computation result. Since only one third FPGA 52 is needed for triplication, the manufacturing cost of the FPGA board 5 can be suppressed.
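The cost argument above can be made concrete by counting arithmetic FPGAs: fully triplicating n parallel computations requires 3n FPGAs, whereas n duplicated pairs sharing one third FPGA require 2n + 1. The following small sketch computes both counts; the function name and the example value of n are illustrative only, and the count ignores the data transfer FPGA and the error detection / repair module.

```python
def fpga_counts(n):
    """Return (full_triplication, shared_third, saved) arithmetic FPGA counts for n parallel units."""
    full_triplication = 3 * n      # three FPGAs for every parallel computation
    shared_third = 2 * n + 1       # n duplicated pairs plus a single shared third FPGA
    return full_triplication, shared_third, full_triplication - shared_third
```

For n = 8, for instance, this gives 24 FPGAs for full triplication against 17 with a shared third FPGA, and the saving of n - 1 FPGAs grows with the degree of parallelism.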
When the majority circuit 57 has output the computation result, the error detection / repair module 55 resets the selectors 53, 54-1, and 54-2 and releases the connections between the redundant FPGA 51 in which the error occurred, the third FPGA 52, and the majority circuit 57.
In the example of FIG. 3, the error detection / repair module 55 functions as a control unit that performs error detection by the comparison circuits 56 and control of the selectors 53, 54-1, and 54-2, but the invention is not limited to this. For example, a control circuit that controls the selectors 53, 54-1, and 54-2 based on the comparison results of the comparison circuits 56 may be provided separately.
FIG. 4 is a sequence diagram showing an example of processing performed in the computer system. The illustrated example shows a case in which an error has occurred in a redundant FPGA 51.
First, when the gateway device 12 receives an Order from an external computer (not shown), it transfers the Order to a given node 10 via the network 13 (S1). As described above, the gateway device 12 selects the node 10 to which the Order is transferred by a load-balancing technique such as round robin. The following description assumes that the gateway device 12 has transferred the Order to the node 10-1.
In the node 10-1, the primary ME server 1-P receives the Order: the CPU 2 receives the Order via the NIC card 6 and stores it in the memory 3. The CPU 2 executes a predetermined program (hereinafter, the node control program) that processes the received Order on the FPGA board 5. The CPU 2 executing the node control program transfers the Order written in the memory 3 to the secondary ME server 1-S.
The secondary ME server 1-S also executes the node control program and copies the Order received from the primary ME server 1-P into its memory 3. The CPU 2 of the secondary ME server 1-S then transmits a response (Ack in the figure) indicating that the copy of the Order has been saved to the primary ME server 1-P (S4).
The CPU 2 of the primary ME server 1-P executing the node control program transmits the Order stored in the memory 3 to the FPGA board 5 (S3). Then, upon receiving the response from the secondary ME server 1-S, the CPU 2 of the primary ME server 1-P transmits a command to start computation to the FPGA board 5 (S5).
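The ordering of these steps, in which computation starts only after the secondary server has acknowledged its copy of the Order, might be sketched as follows. The classes `SecondaryServer` and `FpgaBoard`, the function `handle_order`, and the string `"Ack"` are hypothetical stand-ins chosen for illustration, not interfaces defined by the description.

```python
class SecondaryServer:
    """Stand-in for the secondary ME server: keeps a copy of each Order."""

    def __init__(self):
        self.memory = []

    def replicate(self, order):
        self.memory.append(order)   # copy the Order into memory
        return "Ack"                # S4: acknowledge that the copy is saved


class FpgaBoard:
    """Stand-in for the FPGA board: holds Orders until told to start."""

    def __init__(self):
        self.queue = []
        self.started = []

    def load(self, order):
        self.queue.append(order)    # S3: the Order is sent to the board

    def start(self):
        self.started.extend(self.queue)  # S5: computation start command
        self.queue.clear()


def handle_order(order, secondary, board):
    """Primary-side flow: load the board, wait for the Ack, then start computation."""
    board.load(order)
    if secondary.replicate(order) != "Ack":
        return False                # without an Ack the computation is not started
    board.start()
    return True
```

Waiting for the acknowledgement before issuing the start command means that the standby server always holds a copy of any Order whose computation has begun.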
In the FPGA board 5, when the Order is received from the CPU 2, the data transfer FPGA 50 stores the Order in a queue set up in the memory 60. Then, upon receiving the command to start computation from the CPU 2, the data transfer FPGA 50 transmits the Order to the redundant FPGA 51 corresponding to the queue and causes it to execute the predetermined computation (S6). In this embodiment, the Order contains a securities transaction request, and the redundant FPGA 51 of the FPGA board 5 executes, as the predetermined processing, matching processing that concludes the transaction for the data contained in the Order. ME in the figure denotes the matching engine, which is the function of the redundant FPGA 51.
The two computation results of the primary FPGA 500-1 and the secondary FPGA 500-2 of the redundant FPGA 51 are input to the comparison circuit 56, and an error results if the two computation results do not match (error1 in the figure). The error detection / repair module 55 of the FPGA board 5 connects the redundant FPGA 51 in which the error has occurred and the third FPGA 52 through the selectors 53, 54-1, and 54-2 to form a triplicated configuration, and the computation is executed again (S7). In the error detection / repair module 55, as described above, the majority circuit 57 selects a computation result from the outputs of the triplicated FPGAs, and this result is output from the FPGA board 5 as the repaired computation result (S8) and returned to the gateway device 12 via the network 13. The computation result is output from the FPGA board 5 as follows: the CPU 2 receives the computation result from PCI Express 4 and outputs it from the NIC card 6.
 ノード制御プログラムを実行するプライマリのMEサーバ1-PのCPU2は、FPGAボード5からエラーが発生した冗長化FPGA51の識別子を受け付けると、待機系のセカンダリのMEサーバ1-Sに処理を引き継ぐよう指令する(S9)。ノード10-1では、次に受信したOrderをセカンダリのMEサーバ1-Sで処理し、ゲートウェイ装置12へ応答する。 When the CPU 2 of the primary ME server 1-P that executes the node control program receives the identifier of the redundant FPGA 51 in which an error has occurred from the FPGA board 5, it instructs the standby secondary ME server 1-S to take over the processing. (S9). In the node 10-1, the next received Order is processed by the secondary ME server 1-S and responds to the gateway device 12.
FIG. 5 shows the first embodiment of the present invention and is a sequence diagram of an example of the processing performed within the FPGA board 5. In the FPGA board 5, the data transfer FPGA 50 receives an Order arriving over the PCI Express 4 (S11). On receiving a computation start command from the CPU 2, the data transfer FPGA 50 sends the data contained in the Order to the designated redundant FPGA 51, and the primary FPGA 500-1 and the secondary FPGA 500-2 each perform the computation on the received data (S12, S13).
The comparison circuit 56 of the error detection/repair module 55 determines whether the computation results (Re in the figure) of the primary FPGA 500-1 and the secondary FPGA 500-2 match (S14). If they do not match, an error has occurred, and the error detection/repair module 55 switches the selectors 53, 54-1, and 54-2 so that, as described above, the primary and secondary FPGAs of the redundant FPGA 51 in which the error occurred and the third FPGA 52 form a triple-redundant set connected to the majority circuit 57 (S15).
The error detection/repair module 55 then instructs the redundant FPGA 51 in which the error occurred to send its current data to the third FPGA 52. That redundant FPGA 51 sends the data it currently holds to the third FPGA 52 (S16).
Next, the error detection/repair module 55 instructs the redundant FPGA 51 and the third FPGA 52, now operating as a triple-redundant set, to execute the computation on the data they currently hold (S17). In the redundant FPGA 51 in which the error occurred, the primary and secondary FPGAs 500-1 and 500-2 recompute the previous data, while the third FPGA 52 computes on the data copied from the redundant FPGA 51. The computation results of these three FPGAs are input to the majority circuit 57 (S18).
The majority circuit 57 takes a majority vote over the three input computation results and outputs the result on which the largest number of FPGAs agree (S19). The output result is transferred to the data transfer FPGA 50 via the signal line 59 shown in FIG. 3 and is output to the PCI Express 4 (S20).
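The sequence S12 to S19 can be sketched in Python as follows. This is an illustrative software model only, not the circuit itself: each FPGA is represented by a hypothetical callable, and the names `process` and `majority`, as well as the behavior when no two results agree, are assumptions that the description does not specify.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple

# Hypothetical model of one FPGA: takes input data, returns a computation result.
Unit = Callable[[Hashable], Hashable]


def majority(results: Sequence[Hashable]) -> Hashable:
    """Return the most frequent result (the role played by majority circuit 57)."""
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        # Assumption: the description does not say what happens in this case.
        raise RuntimeError("no two results agree; the result cannot be repaired")
    return value


def process(data: Hashable, primary: Unit, secondary: Unit, third: Unit) -> Tuple[Hashable, bool]:
    """Return (result, error_detected) for one piece of Order data."""
    r1, r2 = primary(data), secondary(data)   # S12, S13: duplexed computation
    if r1 == r2:                              # S14: comparison circuit 56
        return r1, False
    # S15 to S18: attach the third unit, give it the same data, recompute on all three.
    results = [primary(data), secondary(data), third(data)]
    return majority(results), True            # S19: majority vote
```

The third unit is only exercised on the mismatch path, which mirrors the point that triple redundancy is formed temporarily rather than permanently.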
Through the above processing, the FPGA board 5 is built from n redundant FPGAs 51, each containing the two FPGAs 500-1 and 500-2, and a single third FPGA 52. Only when an error occurs is the computation repeated on three FPGAs so that the result can be repaired, which improves reliability while holding down the manufacturing cost of the FPGA board 5.
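The cost argument can be made concrete by counting devices. The sketch below assumes that the comparison baseline is a conventional design in which every one of the n arithmetic units is permanently triple-redundant; that baseline and the function names are illustrative assumptions, not figures taken from the description.

```python
def fpgas_on_demand(n: int) -> int:
    """n duplexed pairs plus one shared third FPGA, as in the first embodiment."""
    return 2 * n + 1


def fpgas_always_tripled(n: int) -> int:
    """Assumed baseline: each of the n units is permanently triple-redundant."""
    return 3 * n
```

For n = 8, the on-demand arrangement needs 17 FPGAs where the assumed baseline needs 24.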
The above describes an example in which communication between the FPGA board 5 of the primary ME server 1-P and the secondary ME server 1-S is driven by the CPU 2 and carried over the NIC card 6, but the invention is not limited to this. For example, the data transfer FPGA 50 of the FPGA board 5 may use remote DMA to transfer data to the secondary ME server 1-S and to notify it of the processing takeover.
In addition, by taking a majority vote over the three computation results, the majority circuit 57 can identify the FPGA whose result differs from the others as the FPGA in which the error occurred.
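Identifying the faulty FPGA from the three results amounts to finding the single result that disagrees with the other two. A minimal sketch, assuming results are directly comparable values and using the hypothetical name `find_outlier`:

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def find_outlier(results: Sequence[Hashable]) -> Optional[int]:
    """Return the index of the one result that differs from the other two.

    Returns None when all three agree or when all three differ, since in
    neither case can a single faulty unit be singled out by voting.
    """
    if len(results) != 3:
        raise ValueError("expected exactly three results")
    counts = Counter(results)
    if len(counts) != 2:
        return None
    odd_value = min(counts, key=counts.get)  # the value that appears only once
    return list(results).index(odd_value)
```

With the results ordered as primary, secondary, third, a return value of 1 would point at the secondary FPGA.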
FIG. 6A shows the second embodiment and is a block diagram of an example of the FPGA 500. FIG. 6A shows the configuration of the primary FPGA 500-1; the secondary FPGA 500-2 and the third FPGA 52 have the same configuration.
The primary FPGA 500-1 mainly comprises a PCI Express interface 501 connected to the PCI Express 4, a controller 502, an arithmetic module A (503A), an arithmetic module B (503B), an internal selector 504 that switches between the output of arithmetic module A and the output of arithmetic module B according to the clock signal CLK, and a memory interface 505 that controls the memory 61-1.
Arithmetic modules A and B are obtained by dividing the arithmetic unit of the primary FPGA 500-1 of the first embodiment. Arithmetic module A handles the processing from receiving data on the data line 510 (DIN in the figure) of the PCI Express interface 501 to writing it to the memory 61-1 via the memory interface 505. Arithmetic module B reads data from the memory 61-1 via the memory interface 505, executes the predetermined matching process, and outputs the computation result. The matching process and its result are the same as in the first embodiment.
The internal selector 504 switches between the output of arithmetic module A and the output of arithmetic module B in response to commands from the controller 502 and drives the selected output onto the signal line 70-1. For this purpose, the controller 502 is connected to the control line 511 (CIN in the figure) of the PCI Express interface 501 and to the clock signal line 520, which carries the external clock signal CLK.
The external clock signal CLK can be, for example, a signal generated by the CPU 2. FIG. 6B shows the relationship between the clock signal CLK and the output of the internal selector 504. In the example of FIG. 6B, the internal selector 504 selects the output of arithmetic module A when CLK is "0" and the output of arithmetic module B when CLK is "1".
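The selection rule of FIG. 6B is a two-input multiplexer controlled by CLK. A minimal sketch, in which the function name and the use of plain Python values for the module outputs are assumptions made for illustration:

```python
def internal_selector(clk: int, module_a_output, module_b_output):
    """Model of internal selector 504: CLK = 0 selects module A, CLK = 1 selects module B."""
    if clk not in (0, 1):
        raise ValueError("clk must be 0 or 1")
    return module_a_output if clk == 0 else module_b_output
```

Because all three FPGAs receive the same CLK, they expose the same stage of the computation at the same time, which is what makes their outputs comparable.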
The external clock signal CLK is input to the primary FPGA 500-1 and the secondary FPGA 500-2 of each of the redundant FPGAs 51-1 to 51-n shown in FIG. 3, and to the third FPGA 52. After an error occurs in a redundant FPGA 51, it is combined with the third FPGA 52 into a triple-redundant set, and the computation results of the three FPGAs are input to the majority circuit 57.
At this point, with the clock signal CLK at "1", the majority circuit 57 determines whether the computation results of arithmetic module B match. Because the results from a redundant FPGA 51 in which an error has occurred do not all match, the error detection/repair module 55 can identify whether the FPGA that output the differing result is the primary FPGA 500-1 or the secondary FPGA 500-2.
Next, the three FPGAs are made to execute the computation again on the same data. This time the clock signal CLK is set to "0", and the majority circuit 57 determines whether the computation results of arithmetic module A match. If the three results match, the error detection/repair module 55 can conclude that the error occurred in arithmetic module B of the FPGA that output the differing result in the first majority decision. If the three results do not match, the error detection/repair module 55 can conclude that the error occurred in arithmetic module A of the FPGA that output the differing result in the current (second) majority decision.
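The two-pass localization can be sketched as follows. This is a simplified model under stated assumptions: the three outputs of each pass are given as lists ordered primary, secondary, third; the function names are hypothetical; and the case where all three outputs differ, which the description does not cover, raises an exception.

```python
from typing import Hashable, Optional, Sequence, Tuple


def outlier(results: Sequence[Hashable]) -> Optional[int]:
    """Index of the result that differs from the other two, or None if all agree."""
    a, b, c = results
    if a == b == c:
        return None
    if a == b:
        return 2
    if a == c:
        return 1
    if b == c:
        return 0
    raise ValueError("all three results differ")  # not covered by the description


def locate_fault(b_outputs: Sequence[Hashable],
                 a_outputs: Sequence[Hashable]) -> Optional[Tuple[int, str]]:
    """Return (FPGA index, faulty module) or None if no error is observed.

    b_outputs: module B outputs sampled with CLK = 1 (first pass).
    a_outputs: module A outputs sampled with CLK = 0 (second pass).
    """
    first = outlier(b_outputs)
    if first is None:
        return None                 # the final results agree
    second = outlier(a_outputs)
    if second is None:
        return first, "B"           # module A outputs agree, so module B failed
    return second, "A"              # module A already disagrees
```

The second pass is what separates a fault in the input stage (module A) from a fault in the matching stage (module B) of the same FPGA.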
As described above, in the second embodiment each FPGA making up the redundant FPGA 51 is divided into multiple arithmetic units, and the internal selector 504 switches among the computation results of the divided arithmetic units for output. This lets the error detection/repair module 55 identify not only the FPGA in which the error occurred but also the part (arithmetic unit) within it.
The above describes an example in which the error detection/repair module 55 identifies the FPGA in which the error occurred (either the primary FPGA 500-1 or the secondary FPGA 500-2) and the part (arithmetic unit) in which it occurred, but this identification may instead be performed by the data transfer FPGA 50 or the CPU 2.
FIGS. 7 and 8 show the third embodiment. FIG. 7 is a block diagram of an example of the FPGA board 5. FIG. 8 is a flowchart of an example of the processing performed by the CPU of the FPGA board 5.
FIG. 7 shows an example in which, in place of the third FPGA 52 shown in FIG. 3 of the first embodiment, one of the many redundant FPGAs 51 is used as the FPGA that completes the triple-redundant set, and a local CPU 80 connected to the signal line 58 controls the redundant FPGAs 51 and the error detection/repair module 55. The selector 53 shown in FIG. 3 of the first embodiment is removed, and operations such as copying data are carried out by the local CPU 80. The local CPU 80 has a local memory 81 that holds programs and data. The rest of the configuration is the same as in the first embodiment.
In the first embodiment, the board has n redundant FPGAs 51 and one third FPGA 52, and when an error occurs the affected redundant FPGA 51 and the third FPGA 52 are combined into a triple-redundant set.
In contrast, the third embodiment uses one of the n redundant FPGAs 51-1 to 51-n as the third FPGA. Each redundant FPGA 51 comprises a primary FPGA 500-1 and a secondary FPGA 500-2, whose outputs are connected to the comparison circuit 56 via the signal lines 70 and 71. The outputs of the FPGAs 500-1 and 500-2 are also connected from the signal lines 70 and 71 to the majority circuit 57 via the selectors 54-1 and 54-2.
On startup, the local CPU 80 selects one of the redundant FPGAs 51-1 to 51-n to serve as the third FPGA. Because each redundant FPGA 51 contains two FPGAs, 500-1 and 500-2, the local CPU 80 selects one of the two as the third FPGA. For example, the local CPU 80 chooses the redundant FPGA 51 that will provide the third FPGA by round robin and selects the primary FPGA 500-1 of that redundant FPGA 51 as the third FPGA.
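Round-robin selection of the redundant FPGA that lends its primary FPGA as the third FPGA might look like the sketch below. The function names are hypothetical, and two points are assumptions that go beyond the description: the pair in which the error occurred is skipped, and the selection can be repeated after startup.

```python
from itertools import cycle
from typing import Iterator


def round_robin(n: int) -> Iterator[int]:
    """Endless iterator over the redundant FPGA indices 1..n (51-1 to 51-n)."""
    if n < 2:
        raise ValueError("at least two redundant FPGAs are needed to lend a third FPGA")
    return cycle(range(1, n + 1))


def pick_third(order: Iterator[int], failed: int) -> int:
    """Return the next redundant FPGA in round-robin order that is not the failed one.

    The primary FPGA 500-1 of the returned redundant FPGA acts as the third FPGA.
    """
    return next(index for index in order if index != failed)
```

Sharing one iterator across calls keeps the rotation state, so successive selections spread the extra load over the pool.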
The error detection/repair module 55 is the configuration of the first embodiment with control of the selectors 54-1 and 54-2 omitted. When the comparison circuit 56 detects a mismatch between computation results, the error detection/repair module 55 notifies the local CPU 80, via the signal line 58, of the identifier of the redundant FPGA 51 in which the mismatch occurred.
The local CPU 80 instructs the selectors 54-1 and 54-2 to connect to the majority circuit 57 the signal lines 70 and 71 of the redundant FPGA 51 in which the error occurred and the signal line 70 of the redundant FPGA 51 that serves as the third FPGA. The local CPU 80 then copies the current data from the redundant FPGA 51 in which the error occurred to the third FPGA. After that, it instructs the redundant FPGA 51 in which the error occurred and the third FPGA to execute the computation.
Through the above processing, the majority circuit 57 of the error detection/repair module 55 outputs the repaired computation result. The local CPU 80 then notifies the secondary ME server 1-S of the processing takeover via the data transfer FPGA 50.
The above processing is described with reference to FIG. 8. The flowchart of FIG. 8 is executed once startup of the FPGA board 5 has completed and the third FPGA has been selected. The local CPU 80 monitors the signal line 58 and waits until the error detection/repair module 55 reports an error (S31). When the error detection/repair module 55 reports the identifier of the redundant FPGA 51 in which an error occurred, the local CPU 80 receives the identifier from the signal line 58 and identifies that redundant FPGA 51 (S32).
The local CPU 80 instructs the selectors 54-1 and 54-2 to connect to the majority circuit 57 the signal lines 70 and 71 of the redundant FPGA 51 in which the error occurred and the signal line 70 of the redundant FPGA 51 that serves as the third FPGA (S33, S34).
The local CPU 80 copies the current data from the redundant FPGA 51 in which the error occurred to the third FPGA (S35). It then instructs the redundant FPGA 51 in which the error occurred and the third FPGA to execute the computation (S36).
In the redundant FPGA 51 in which the error occurred, the primary and secondary FPGAs 500-1 and 500-2 recompute the previous data, while the third FPGA computes on the data copied from the redundant FPGA 51. The computation results of these three FPGAs are input to the majority circuit 57, which takes a majority vote over the three results and outputs the result on which the largest number of FPGAs agree to the signal line 59 (S37).
The local CPU 80 instructs the standby secondary ME server 1-S to take over processing (S38).
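One pass through the flowchart of FIG. 8 (S31 to S38) can be modeled in software as below. Everything here is a hypothetical stand-in: the error notifications on signal line 58 are modeled as a `Queue`, FPGAs as callables, the held data as a dictionary, and the takeover instruction as a callback. Raising an exception when no two results agree is also an assumption.

```python
from collections import Counter
from queue import Queue
from typing import Callable, Dict, Tuple

Fpga = Callable[[int], int]  # hypothetical model: data in, computation result out


def local_cpu_step(errors: Queue,
                   pairs: Dict[int, Tuple[Fpga, Fpga]],
                   data: Dict[int, int],
                   third: Fpga,
                   notify_takeover: Callable[[int], None]) -> int:
    """Handle one error notification and return the repaired result."""
    failed_id = errors.get()                    # S31, S32: wait for and read the identifier
    primary, secondary = pairs[failed_id]       # S33, S34: route three outputs to the voter
    copied = data[failed_id]                    # S35: copy current data to the third FPGA
    results = [primary(data[failed_id]),        # S36: recompute on all three FPGAs
               secondary(data[failed_id]),
               third(copied)]
    value, votes = Counter(results).most_common(1)[0]   # S37: majority vote
    if votes < 2:
        raise RuntimeError("no two results agree")      # assumption, not in the description
    notify_takeover(failed_id)                  # S38: hand over to the secondary ME server
    return value
```

Note that the takeover is issued after the repaired result has been produced, matching the order of S37 and S38.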
Through the above processing, one of the n redundant FPGAs 51, each containing the two FPGAs 500-1 and 500-2, serves as the third FPGA. As in the first embodiment, the computation is repeated on three FPGAs only when an error occurs so that the result can be repaired, which improves reliability while holding down the manufacturing cost of the FPGA board 5.
In the third embodiment, the third FPGA can be selected from among the n redundant FPGAs 51. If an error occurs in the third FPGA, its role can therefore be handed over to the secondary FPGA 500-2 or to another redundant FPGA 51, which provides redundancy for the third FPGA itself.
Furthermore, the third embodiment only requires arranging multiple redundant FPGAs 51, each containing the two FPGAs 500-1 and 500-2, which simplifies semiconductor design and manufacturing.
The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments have been described in detail to explain the invention clearly, and the invention is not necessarily limited to configurations that include every element described.

Claims (12)

  1.  An arithmetic circuit comprising:
     a redundant arithmetic unit including a first arithmetic unit and a second arithmetic unit that perform computation on input data in parallel;
     a comparison circuit that compares the computation result of the first arithmetic unit with the computation result of the second arithmetic unit;
     a third arithmetic unit that performs computation on the data of the redundant arithmetic unit when the comparison by the comparison circuit finds a mismatch; and
     a selector that connects the output of the first arithmetic unit, the output of the second arithmetic unit, and the output of the third arithmetic unit to a majority circuit when the comparison by the comparison circuit finds a mismatch.
  2.  The arithmetic circuit according to claim 1, further comprising:
     a plurality of the redundant arithmetic units;
     a plurality of the comparison circuits, each connected to a respective one of the plurality of redundant arithmetic units; and
     a control unit that, when the comparison result of one of the comparison circuits connected to the plurality of redundant arithmetic units is a mismatch, detects that an error has occurred in the corresponding redundant arithmetic unit,
     wherein the control unit:
     instructs the selector to connect the output of the first arithmetic unit and the output of the second arithmetic unit of the redundant arithmetic unit in which the error occurred to the majority circuit;
     causes the redundant arithmetic unit in which the error occurred to transfer its current data to the third arithmetic unit; and
     causes the redundant arithmetic unit to execute the computation again on the current data and causes the third arithmetic unit to perform the computation on the transferred data, and
     wherein the majority circuit receives, via the selector, the computation results of the redundant arithmetic unit and the third arithmetic unit, which temporarily form a triple-redundant set, and performs majority processing.
  3.  The arithmetic circuit according to claim 1,
     wherein the third arithmetic unit is constituted by the redundant arithmetic unit, and one of the first arithmetic unit and the second arithmetic unit is used as the third arithmetic unit.
  4.  The arithmetic circuit according to claim 1,
     wherein the third arithmetic unit is always connected to the majority circuit.
  5.  The arithmetic circuit according to claim 1,
     wherein the majority circuit receives the computation result of the first arithmetic unit, the computation result of the second arithmetic unit, and the computation result of the third arithmetic unit, and identifies the arithmetic unit that outputs a computation result differing from the others as the arithmetic unit in which an error has occurred.
  6.  The arithmetic circuit according to claim 1,
     wherein the first arithmetic unit, the second arithmetic unit, and the third arithmetic unit each include a first arithmetic module that receives an input, a second arithmetic module that receives the output of the first arithmetic module, and an internal selector that switches between the output of the first arithmetic module and the output of the second arithmetic module according to a clock, and
     wherein the same clock is input to the first arithmetic unit, the second arithmetic unit, and the third arithmetic unit.
  7.  A computer comprising a processor, a memory, and an arithmetic circuit connected to the processor,
     wherein the arithmetic circuit comprises:
     a redundant arithmetic unit including a first arithmetic unit and a second arithmetic unit that perform computation on input data in parallel;
     a comparison circuit that compares the computation result of the first arithmetic unit with the computation result of the second arithmetic unit;
     a third arithmetic unit that performs computation on the data of the redundant arithmetic unit when the comparison by the comparison circuit finds a mismatch; and
     a selector that connects the output of the first arithmetic unit, the output of the second arithmetic unit, and the output of the third arithmetic unit to a majority circuit when the comparison by the comparison circuit finds a mismatch.
  8.  The computer according to claim 7,
     wherein the arithmetic circuit further comprises:
     a plurality of the redundant arithmetic units;
     a plurality of the comparison circuits, each connected to a respective one of the plurality of redundant arithmetic units; and
     a control unit that, when the comparison result of one of the comparison circuits connected to the plurality of redundant arithmetic units is a mismatch, detects that an error has occurred in the corresponding redundant arithmetic unit,
     wherein the control unit:
     instructs the selector to connect the output of the first arithmetic unit and the output of the second arithmetic unit of the redundant arithmetic unit in which the error occurred to the majority circuit;
     causes the redundant arithmetic unit in which the error occurred to transfer its current data to the third arithmetic unit; and
     causes the redundant arithmetic unit to execute the computation again on the current data and causes the third arithmetic unit to perform the computation on the transferred data, and
     wherein the majority circuit receives, via the selector, the computation results of the redundant arithmetic unit and the third arithmetic unit, which temporarily form a triple-redundant set, and performs majority processing.
  9.  The computer according to claim 7,
     wherein the third arithmetic unit is constituted by the redundant arithmetic unit, and one of the first arithmetic unit and the second arithmetic unit is used as the third arithmetic unit.
  10.  The computer according to claim 7,
     wherein the third arithmetic unit is always connected to the majority circuit.
  11.  The computer according to claim 7,
     wherein the majority circuit receives the computation result of the first arithmetic unit, the computation result of the second arithmetic unit, and the computation result of the third arithmetic unit, and identifies the arithmetic unit that outputs a computation result differing from the others as the arithmetic unit in which an error has occurred.
  12.  The computer according to claim 7,
     wherein the first arithmetic unit, the second arithmetic unit, and the third arithmetic unit each include a first arithmetic module that receives an input, a second arithmetic module that receives the output of the first arithmetic module, and an internal selector that switches between the output of the first arithmetic module and the output of the second arithmetic module according to a clock, and
     wherein the same clock is input to the first arithmetic unit, the second arithmetic unit, and the third arithmetic unit.
PCT/JP2013/067816 2013-06-28 2013-06-28 Computation circuit and computer WO2014207893A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2013/067816 WO2014207893A1 (en) 2013-06-28 2013-06-28 Computation circuit and computer

Publications (1)

Publication Number Publication Date
WO2014207893A1 true WO2014207893A1 (en) 2014-12-31

Family

ID=52141291

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/067816 WO2014207893A1 (en) 2013-06-28 2013-06-28 Computation circuit and computer

Country Status (1)

Country Link
WO (1) WO2014207893A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017042894A1 (en) * 2015-09-08 2017-03-16 株式会社東芝 Multiplexing processing system, multiplexing processing method, and program
WO2018134023A1 (en) * 2017-01-23 2018-07-26 Zf Friedrichshafen Ag Redundant processor architecture
JP6490316B1 (en) * 2018-02-28 2019-03-27 三菱電機株式会社 Output judgment circuit
US20190228666A1 (en) * 2018-01-19 2019-07-25 Ge Aviation Systems Llc System and Method for Reconfiguring a System-On-Module for an Unmanned Vehicle
CN112084071A (en) * 2020-09-14 2020-12-15 海光信息技术有限公司 Calculation unit operation reinforcement method, parallel processor and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07239798A (en) * 1994-02-28 1995-09-12 Nec Commun Syst Ltd Fault tolerant system of computer
WO2010103564A1 (en) * 2009-03-10 2010-09-16 富士通株式会社 Transmission/reception device, transmission device, reception device, and data transmission/reception method

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10489239B2 (en) 2015-09-08 2019-11-26 Kabushiki Kaisha Toshiba Multiplexing system, multiplexing method, and computer program product
JPWO2017042894A1 (en) * 2015-09-08 2018-02-15 株式会社東芝 Multiplexing processing system, multiplexing processing method and program
CN107949831A (en) * 2015-09-08 2018-04-20 株式会社东芝 Multiplex processing system, multiplex processing method and program
WO2017042894A1 (en) * 2015-09-08 2017-03-16 株式会社東芝 Multiplexing processing system, multiplexing processing method, and program
CN107949831B (en) * 2015-09-08 2021-03-16 株式会社东芝 Multiplexing system, multiplexing method, and program
WO2018134023A1 (en) * 2017-01-23 2018-07-26 Zf Friedrichshafen Ag Redundant processor architecture
CN110192185A (en) * 2017-01-23 2019-08-30 Zf 腓德烈斯哈芬股份公司 The processor architecture of redundancy
US11281547B2 (en) 2017-01-23 2022-03-22 Zf Friedrichshafen Ag Redundant processor architecture
CN110192185B (en) * 2017-01-23 2023-06-23 Zf 腓德烈斯哈芬股份公司 Redundant processor architecture
US20190228666A1 (en) * 2018-01-19 2019-07-25 Ge Aviation Systems Llc System and Method for Reconfiguring a System-On-Module for an Unmanned Vehicle
JP6490316B1 (en) * 2018-02-28 2019-03-27 三菱電機株式会社 Output judgment circuit
CN112084071A (en) * 2020-09-14 2020-12-15 海光信息技术有限公司 Calculation unit operation reinforcement method, parallel processor and electronic equipment
CN112084071B (en) * 2020-09-14 2023-09-19 海光信息技术股份有限公司 Computing unit operation reinforcement method, parallel processor and electronic equipment

Similar Documents

Publication Publication Date Title
WO2014207893A1 (en) Computation circuit and computer
US7237144B2 (en) Off-chip lockstep checking
US20190260504A1 (en) Systems and methods for maintaining network-on-chip (noc) safety and reliability
US9342422B2 (en) Selectively coupling a PCI host bridge to multiple PCI communication paths
US7290169B2 (en) Core-level processor lockstepping
US20050240829A1 (en) Lockstep error signaling
US9582448B2 (en) Transmission apparatus and control unit
US9141493B2 (en) Isolating a PCI host bridge in response to an error event
US20070220367A1 (en) Fault tolerant computing system
US9477559B2 (en) Control device, control method and recording medium storing program thereof
JP5509637B2 (en) Fault tolerant system
US8356240B2 (en) Data transfering apparatus
KR20150088559A (en) Method and apparatus for restoring failure of network
JP6431197B2 (en) Snapshot processing methods and associated devices
JP2534430B2 (en) Methods for achieving match of computer system output with fault tolerance
CN105209982A (en) Method and apparatus for controlling a physical unit in an automation system
JP2014178793A (en) Information processing system
TW202005410A (en) Muiti-node device and backup communication method thereof
EP1988469B1 (en) Error control device
JP6710142B2 (en) Control system
JP6394727B1 (en) Control device, control method, and fault tolerant device
WO2023204786A1 (en) An avionic computer architecture
US11914338B2 (en) Redundant automation system and method for operating the redundant automation system
JP6653250B2 (en) Computer system
WO2018154664A1 (en) Control device and control method

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 13888485

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 13888485

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP