US20190278660A1 - Single event latchup recovery with state protection - Google Patents

Single event latchup recovery with state protection

Info

Publication number
US20190278660A1
Authority
US
United States
Prior art keywords
microprocessor
sel
state
power
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US15/917,523
Other versions
US10713118B2 (en)
Inventor
Joshua C. Swenson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hamilton Sundstrand Corp
Original Assignee
Hamilton Sundstrand Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hamilton Sundstrand Corp
Priority to US15/917,523 (patent US10713118B2)
Assigned to HAMILTON SUNDSTRAND CORPORATION. Assignment of assignors interest (see document for details). Assignor: Swenson, Joshua C.
Priority to EP19161303.3A (patent EP3537301A1)
Publication of US20190278660A1
Application granted
Publication of US10713118B2
Legal status: Active
Adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793Remedial or corrective actions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1415Saving, restoring, recovering or retrying at system level
    • G06F11/1441Resetting or repowering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751Error or fault detection not based on redundancy
    • G06F11/0754Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03KPULSE TECHNIQUE
    • H03K17/00Electronic switching or gating, i.e. not by contact-making and –breaking
    • H03K17/22Modifications for ensuring a predetermined initial state when the supply voltage has been applied
    • H03K17/24Storing the actual state when the supply voltage fails
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03KPULSE TECHNIQUE
    • H03K19/00Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
    • H03K19/003Modifications for increasing the reliability for protection
    • H03K19/0033Radiation hardening
    • H03K19/00338In field effect transistor circuits
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0736Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in functional embedded systems, i.e. in a data processing system designed as a combination of hardware and software dedicated to performing a certain function
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751Error or fault detection not based on redundancy
    • G06F11/0754Error or fault detection not based on redundancy by exceeding limits
    • G06F11/076Error or fault detection not based on redundancy by exceeding limits by exceeding a count or rate limit, e.g. word- or bit count limit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0796Safety measures, i.e. ensuring safe condition in the event of error, e.g. for controlling element
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/805Real-time
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/82Solving problems relating to consistency
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/86Event-based monitoring

Definitions

  • Exemplary embodiments pertain to the art of solid state power controllers, and more specifically, recovery from single event latchup with a state protection circuit.
  • Cosmic radiation can induce Single Event Latchup (SEL) in complex electronic devices.
  • SELs are induced by causing conduction from the circuit to the substrate that results in a 4 layer device or SCR turning on and carrying common mode current from multiple paths through the substrate to ground. This ‘latch’ results in collapsing the local power supply around the fault and disrupting the ability of the circuit to function at all.
  • The amount of circuitry affected depends on the location of the collapse and the power supply characteristics. In aerospace, this may be a particular problem due to higher radiation intensities and system criticality. Certain flight paths have an increased probability of SEL due to global magnetic variances and/or atmospheric conditions. Further, SEL may become more likely at certain polar orientations where cosmic radiation intensity is higher.
  • Solid state power controllers may switch power on and off to electrical loads (e.g., displays, components, etc.).
  • the SSPCs may be controlled by Peripheral Interface Controllers (PICs) that monitor voltage and current status, and drive field effect transistor gates to turn the power on and off in the load circuits.
  • PICs: Peripheral Interface Controllers
  • SEL affecting the PICs may cause loss of control and protection of the SSPC and the SSPC output to shift from their proper state to an erroneous state.
  • SEL can only be cleared by a power cycle of the affected device.
  • Known methods by necessity power cycle the control circuits resulting in a loss of control state. It is advantageous to maintain the control state during an SEL recovery to prevent system effects.
  • an apparatus that includes a single event latchup (SEL) recovery circuit, a microprocessor operatively connected with the SEL recovery circuit, and an output maintenance circuit that maintains a state of the microprocessor prior to a power cycle of the microprocessor.
  • The apparatus is configured to detect a SEL event or other fault via a watchdog circuit, initiate a power cycle of the microprocessor, retain a latch state from the microprocessor, and determine whether the microprocessor was restarted due to an SEL event. Responsive to determining that the microprocessor has failed to restart due to a persistent fault, the apparatus determines whether a predetermined power cycle limit is reached within a predetermined span of time, and selectively provides power to a load based on the latch state and the power cycle limit determination.
  • SEL: single event latchup
  • the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
  • selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
  • the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
  • the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
  • a method for recovering a circuit after a single event latchup includes: detecting a SEL event or other fault via a watchdog circuit; initiating a power cycle of a microprocessor; retaining a latch state from the microprocessor; determining, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event; responsive to determining that the microprocessor has failed to restart due to a persistent fault, determining whether a predetermined power cycle limit is reached within a predetermined span of time; and selectively providing power to a load based on the latch state and the power cycle limit determination.
  • a nontransitory computer readable storage medium storing instructions that, when executed by a processor, perform a method for recovering a circuit after a single event latchup (SEL).
  • the method includes: detecting a SEL event or other fault via a watchdog circuit; initiating a power cycle of a microprocessor; retaining a latch state from the microprocessor; determining, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event; responsive to determining that the microprocessor has failed to restart due to a persistent fault, determining whether a predetermined power cycle limit is reached within a predetermined span of time; and selectively providing power to a load based on the latch state and the power cycle limit determination.
  • FIG. 1 is a flow diagram of a system for SEL event latchup recovery with state protection according to one embodiment
  • FIG. 2 is a diagram of a circuit for SEL event latchup recovery with state protection according to another embodiment.
  • SEEs: Single Event Effects
  • SEEs are caused by a single, energetic particle.
  • SEEs can be soft errors or hard errors.
  • Soft errors can include, for example, a Single Event Upset (SEU), which is usually non-destructive and can be cleared by a reset pulse to the microprocessor.
  • An SEU can appear as a transient pulse in logic or support circuitry, or as a bit-flip in a memory cell or register.
  • A hard error can include, for example, a Single Event Latchup (SEL), burnout of power components (e.g., MOSFETs), gate rupture, frozen bits, and noise in CCDs.
  • FIG. 1 is a flow diagram of a system 100 for SEL event latchup recovery with state protection, according to an embodiment.
  • Embodiments of the present invention improve the existing SEL recovery systems with minimal added circuitry. By memorizing the output state of the controller, the SEL recovery of the control circuitry can be concluded without disrupting the overall system operation. By limiting the total number of SEL recovery attempts and shutting down the controller and latched states if SEL recovery is unsuccessful, safety of the electronics with which system 100 is installed is maintained.
  • As shown in FIG. 1, the system 100 includes a power supply 102, a counting circuit (counter 110), a watchdog circuit (watchdog timer 108), control circuitry 105 having a microprocessor 106 (hereafter “controller 105” and microprocessor 106, respectively), and a latch mechanism 113.
  • The watchdog timer 108 detects a malfunction condition of the control circuitry 105 and, in response to detecting a malfunction, triggers a power cycle operation of the microprocessor 106 by sending a power cycle pulse 109.
  • A malfunction condition can be, for example, an occurrence of atmospheric radiation causing a latched state error of one or more CMOS devices operating in the control circuitry 105.
  • While prior inventions may detect a SEL fault with a watchdog circuit, embodiments of the present invention protect and remember a state of a latch indicative of a state of the load prior to restarting. Accordingly, the load is not briefly lost due to a changed state in the microprocessor 106.
  • Responsive to an SEL event, the system 100 will cycle power to the control circuitry 105.
  • The power cycle will cause the control circuitry 105 to stop operating for a period of time during and following the power cycle.
  • The power is removed from the control circuitry 105 to clear the SEL and is then restored, allowing the control circuitry 105 to restart.
  • The control circuitry 105 may require time to fully return to operation due to software loading and health checks upon restarting.
  • During the period of time that the control circuitry 105 is not operable, the latch 113 maintains the state of the load present prior to the SEL event. Additional latches 113 may be present to maintain other control circuit state data.
  • The microprocessor 106, upon power up, may determine whether the power cycle was due to an SEL event or whether it was a non-SEL type restart (such as, for example, turning the system on for the first time) by reading the latch 113 outputs via monitor signals 116.
  • The system 100 is configured to 1) determine whether the system has recovered properly, 2) determine and remember the state prior to the recovery power cycle, and 3) resume active control of the load.
  • A persistent malfunction in the system may prevent the control circuitry 105 from restarting, leading to multiple power cycles where the system has not returned to a safe state after each power cycle.
  • The persistent malfunction could cause the control circuitry 105 to cycle on and off repeatedly, which may be damaging to the system 100.
  • The watchdog circuit 108 may remove power from both the control circuitry 105 and the latch circuitry 113.
  • The microprocessor is thus maintained in a nonfunctional, powered-off state until maintenance of the hardware is performed.
  • The counter 110 is configured to transmit a power down signal 111 to the power supply 102 if and only if a predetermined number of power cycles has been exceeded within a predetermined span of time.
  • The counter 110 may be configured to limit the system 100 to only 3 resets within a 60 second time period, where each reset is a power cycle of the control power 117 that shuts off the control power 117 and then restores power to the control circuitry 105 via the switch 104. After the third reset, the counter 110 may be configured to cause the power supply 102 to turn off, removing power 103 from both the control circuitry 105 and the latch circuitry 113. Removing power from the latch circuitry 113 ensures the load is placed in a safe off state when persistent faults prevent proper operation of the control circuitry 105. It should be appreciated that the predetermined number and predetermined time span are exemplary only and not limiting.
  • Recovering properly includes a full power cycle of the control circuitry 105 and a restart of the system 100 software (operating as part of the control circuitry 105), in which the controller acknowledges the current known state and determines whether the current state is a recovered state or a fresh restart state.
  • The software or hardware fault causes an incomplete restart where the system software (not shown) operating in the control circuitry 105 fails to execute or executes with errors.
  • The counter 110 shuts down the control circuitry 105 and latch 113.
  • The system 100 monitors the output signals 112 in a feedback loop (e.g., monitor signal 116) to determine what the control state was prior to power cycling. If the control circuitry 105 is operative without errors, the control circuitry 105 outputs a state signal 115 to the flip flop (described with respect to FIG. 2) indicative of a functional gate command, and outputs the clock signal 114 to the latch 113. With the clock signal 114 and the state signal 115, the latch 113 functions as a persistence mechanism that remembers the state of the control circuitry 105 prior to the reset.
  • FIG. 2 is a diagram of an exemplary circuit 200 for SEL event latchup recovery with state protection, according to another embodiment.
  • the circuit 200 may be, for example, the circuit functional for the system 100 as shown in FIG. 1 .
  • FIGS. 1 and 2 are now considered in conjunction with one another, according to an embodiment.
  • The circuit 200 includes two main functional portions: an SEL recovery portion 202 and an output maintenance circuit 204.
  • The SEL recovery portion 202 includes a watchdog circuit 209 with a counter mechanism (e.g., the watchdog timer 108 and counter 110 as shown in FIG. 1), and a switch (e.g., the switch 104 as shown in FIG. 1) configured to receive signals from the watchdog circuit 209 (e.g., power cycle pulse 109 as shown in FIG. 1), remove power from the controller 210, and then restore power to the controller 210. Responsive to determining that the circuit 200 has not restarted properly after multiple power cycle attempts, the watchdog circuit 209 sends a shutdown signal (e.g., power down 111 as shown in FIG. 1) to the power supply 207.
  • the circuit 200 further includes the output maintenance circuit 204 configured to remember the prior state of the system before a restart.
  • The states can include, for example, 1) the output states of the system 100, 2) a normal operation state associated with a non-erroneous shut down and restart of the control circuitry 105, and 3) a recovery state associated with a prior SEL event recovery where the control circuitry 105 has lost power due to an SEL event.
  • The flip flop 215, configured as the latch mechanism 113 of FIG. 1, receives the clock signal 114 and the state signal 115 from the controller 210 (operational as the control circuitry 105 and/or the microprocessor 106 of FIG. 1).
  • The controller 210 determines the desired output state 216 during normal operation. If a fault then appears that leads the SEL recovery portion 202 to power cycle the controller 210, the flip flop 215 retains the output state 216. After restarting, the controller 210 may read the previously set output state 216 via the monitor signal 116 in FIG. 1.
  • The output state 216 of the flip flop 215 operates, via the gate driver 212, a switch 213 providing the voltage feed 211 to the load 214.
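
The state-retention behavior of the flip flop described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class and method names (`DFlipFlop`, `Controller`, `command`, `read_monitor`) are hypothetical, the flip flop is modeled as an edge-triggered D type on its own supply rail, and the clock line is assumed to stay low while the controller is unpowered so that no spurious edge reaches the latch.

```python
from dataclasses import dataclass


@dataclass
class DFlipFlop:
    """Toy edge-triggered D flip-flop on its own supply (stands in for flip flop 215)."""
    powered: bool = True
    q: bool = False
    _prev_clk: bool = False

    def drive(self, d: bool, clk: bool) -> None:
        # Capture D only on a rising clock edge, and only while powered.
        if self.powered and clk and not self._prev_clk:
            self.q = d
        self._prev_clk = clk

    def remove_power(self) -> None:
        # Assumption: an unpowered latch drives the gate command low (load off).
        self.powered = False
        self.q = False

    @property
    def output(self) -> bool:
        return self.powered and self.q


class Controller:
    """Hypothetical controller that sets and reads back the latched state."""

    def __init__(self, latch: DFlipFlop):
        self.latch = latch

    def command(self, on: bool) -> None:
        # Present the state signal, then pulse the clock once.
        self.latch.drive(on, clk=False)
        self.latch.drive(on, clk=True)
        self.latch.drive(on, clk=False)

    def power_cycle(self) -> None:
        # Assumption: controller outputs idle low during the cycle, so the
        # latch sees no rising edge and keeps its previous output.
        self.latch.drive(False, clk=False)

    def read_monitor(self) -> bool:
        # Stand-in for reading the monitor signal after restart.
        return self.latch.output


latch = DFlipFlop()
ctrl = Controller(latch)
ctrl.command(True)          # load commanded on during normal operation
ctrl.power_cycle()          # recovery power cycle of the controller only
print(ctrl.read_monitor())  # latched state is still available after restart
latch.remove_power()        # persistent fault: latch power removed as well
print(latch.output)
```

In this model the load command survives a controller-only power cycle and drops only when power to the latch itself is removed, which mirrors the safe off state described for persistent faults.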

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)
  • Microcomputers (AREA)
  • Power Sources (AREA)

Abstract

An apparatus that includes a single event latchup (SEL) recovery circuit, a microprocessor operatively connected with the SEL recovery circuit, and an output maintenance circuit that maintains a state of the microprocessor prior to a power cycle of the microprocessor. The apparatus is configured to detect a SEL event or other fault via a watchdog circuit, initiate a power cycle of the microprocessor, retain a latch state from the microprocessor, and determine whether the microprocessor was restarted due to an SEL event. Responsive to determining that the microprocessor has failed to restart due to a persistent fault, the apparatus determines whether a predetermined power cycle limit is reached within a predetermined span of time, and selectively provides power to a load based on the latch state and the power cycle limit determination.
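
The power cycle limit determination in the abstract can be sketched as a sliding-window counter. The sketch below is illustrative only: `PowerCycleCounter`, `record_reset`, and `load_powered` are hypothetical names, the defaults of 3 resets within 60 seconds are the exemplary values given in the description, and it assumes that shutdown is requested as soon as the limit is reached (rather than exceeded), which is one possible reading of the text.

```python
from collections import deque


class PowerCycleCounter:
    """Counts recovery power cycles inside a sliding time window."""

    def __init__(self, limit: int = 3, window_s: float = 60.0):
        self.limit = limit
        self.window_s = window_s
        self._resets = deque()  # timestamps of recent power cycles

    def record_reset(self, now_s: float) -> bool:
        """Record one power cycle; return True if power-down should be requested."""
        self._resets.append(now_s)
        # Forget power cycles that fell out of the window.
        while self._resets and now_s - self._resets[0] > self.window_s:
            self._resets.popleft()
        # Assumption: the limit counts as "reached" at exactly `limit` resets.
        return len(self._resets) >= self.limit


def load_powered(latch_state: bool, power_down: bool) -> bool:
    """The load follows the latched state unless a power-down was requested."""
    return latch_state and not power_down


counter = PowerCycleCounter()
for t in (0.0, 10.0, 20.0):
    power_down = counter.record_reset(t)
print(load_powered(latch_state=True, power_down=power_down))
```

Resets spaced further apart than the window never accumulate, so an isolated SEL recovery leaves the load in its latched state.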

Description

    BACKGROUND
  • Exemplary embodiments pertain to the art of solid state power controllers, and more specifically, recovery from single event latchup with a state protection circuit.
  • Cosmic radiation can induce Single Event Latchup (SEL) in complex electronic devices. SELs are induced by causing conduction from the circuit to the substrate that results in a 4 layer device or SCR turning on and carrying common mode current from multiple paths through the substrate to ground. This ‘latch’ results in collapsing the local power supply around the fault and disrupting the ability of the circuit to function at all. The amount of circuitry affected depends on the location of the collapse and the power supply characteristics. In aerospace, this may be a particular problem due to higher radiation intensities and system criticality. Certain flight paths have an increased probability of SEL due to global magnetic variances and/or atmospheric conditions. Further, SEL may become more likely at certain polar orientations where cosmic radiation intensity is higher.
  • Solid state power controllers (SSPCs) may switch power on and off to electrical loads (e.g., displays, components, etc.). The SSPCs may be controlled by Peripheral Interface Controllers (PICs) that monitor voltage and current status, and drive field effect transistor gates to turn the power on and off in the load circuits. SEL affecting the PICs may cause loss of control and protection of the SSPC and the SSPC output to shift from their proper state to an erroneous state.
  • SEL can only be cleared by a power cycle of the affected device. Known methods by necessity power cycle the control circuits resulting in a loss of control state. It is advantageous to maintain the control state during an SEL recovery to prevent system effects.
  • BRIEF DESCRIPTION
  • Disclosed is an apparatus that includes a single event latchup (SEL) recovery circuit, a microprocessor operatively connected with the SEL recovery circuit, and an output maintenance circuit that maintains a state of the microprocessor prior to a power cycle of the microprocessor. The apparatus is configured to detect a SEL event or other fault via a watchdog circuit, initiate a power cycle of the microprocessor, retain a latch state from the microprocessor, and determine whether the microprocessor was restarted due to an SEL event. Responsive to determining that the microprocessor has failed to restart due to a persistent fault, the apparatus determines whether a predetermined power cycle limit is reached within a predetermined span of time, and selectively provides power to a load based on the latch state and the power cycle limit determination.
  • In any prior apparatus, the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
  • In any prior apparatus, selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
  • In any prior apparatus, the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
  • In any prior apparatus, the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
  • Also disclosed is a method for recovering a circuit after a single event latchup (SEL). The method includes: detecting a SEL event or other fault via a watchdog circuit; initiating a power cycle of a microprocessor; retaining a latch state from the microprocessor; determining, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event; responsive to determining that the microprocessor has failed to restart due to a persistent fault, determining whether a predetermined power cycle limit is reached within a predetermined span of time; and selectively providing power to a load based on the latch state and the power cycle limit determination.
  • In the method of any prior embodiment, the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
  • In the method of any prior embodiment, selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
  • In the method of any prior embodiment, the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
  • In the method of any prior embodiment, the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
  • Also disclosed is a nontransitory computer readable storage medium storing instructions that, when executed by a processor, perform a method for recovering a circuit after a single event latchup (SEL). The method includes: detecting a SEL event or other fault via a watchdog circuit; initiating a power cycle of a microprocessor; retaining a latch state from the microprocessor; determining, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event; responsive to determining that the microprocessor has failed to restart due to a persistent fault, determining whether a predetermined power cycle limit is reached within a predetermined span of time; and selectively providing power to a load based on the latch state and the power cycle limit determination.
  • In the nontransitory computer-readable storage medium of any prior embodiment, the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
  • In the nontransitory computer-readable storage medium of any prior embodiment, selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
  • In the nontransitory computer-readable storage medium of any prior embodiment, the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
  • In the nontransitory computer-readable storage medium of any prior embodiment, the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The following descriptions should not be considered limiting in any way. With reference to the accompanying drawings, like elements are numbered alike:
  • FIG. 1 is a flow diagram of a system for SEL event latchup recovery with state protection according to one embodiment; and
  • FIG. 2 is a diagram of a circuit for SEL event latchup recovery with state protection according to another embodiment.
  • DETAILED DESCRIPTION
  • A detailed description of one or more embodiments of the disclosed apparatus and method is presented herein by way of exemplification and not limitation with reference to the Figures.
  • Single Event Effects (SEEs) are caused by a single, energetic particle. SEEs can be soft errors or hard errors. Soft errors can include, for example, a Single Event Upset (SEU), which is usually non-destructive and can be cleared by a reset pulse to the microprocessor. An SEU can appear as a transient pulse in logic or support circuitry, or as a bit-flip in a memory cell or register. A hard error can include, for example, a Single Event Latchup (SEL), burnout of power components (e.g., MOSFETs), gate rupture, frozen bits, and noise in CCDs. An SEL that causes a high operating current that exceeds device specifications is potentially destructive. In such situations, an SEL can only be cleared by cycling power to the microprocessor, that is, removing and then restoring power. A reset operation of the microprocessor would not be sufficient. However, the power cycle will cause a loss of the control state. It is currently known to provide a watchdog timer used to verify valid operation of the control circuitry. In this type of watchdog circuit, a controller puts out a regular pulse to confirm proper operation of the system. When the controller experiences an SEL event such that it can no longer function, the controller stops providing pulses. When the pulse is not sensed for a given length of time, the watchdog initiates a power cycle. Such a method is explained in U.S. Patent Application No. 2017/0308441 A1, which is incorporated herein by reference. It is advantageous, therefore, to provide a system for single event latchup recovery with state protection and an included counter that limits the number of allowed power cycles to prevent oscillation in case of a hard (persisting) failure.
  • FIG. 1 is a flow diagram of a system 100 for SEL event latchup recovery with state protection, according to an embodiment. Embodiments of the present invention improve the existing SEL recovery systems with minimal added circuitry. By memorizing the output state of the controller, the SEL recovery of the control circuitry can be concluded without disrupting the overall system operation. By limiting the total number of SEL recovery attempts and shutting down the controller and latched states if SEL recovery is unsuccessful, safety of the electronics with which system 100 is installed is maintained.
  • As shown in FIG. 1, the system 100 includes a power supply 102, a counting circuit (counter 110), a watchdog circuit (watchdog timer 108), control circuitry 105 having a microprocessor 106 (hereafter “controller 105” and “microprocessor 106,” respectively), and a latch mechanism 113.
  • The watchdog timer 108 detects a malfunction condition of the control circuitry 105 and, in response to detecting a malfunction, triggers a power cycle operation of the microprocessor 106 by sending a power cycle pulse 109. A malfunction condition can be, for example, an occurrence of atmospheric radiation causing a latched state error of one or more CMOS devices operating in the control circuitry 105. While prior inventions may detect an SEL fault with a watchdog circuit, embodiments of the present invention protect and remember a state of a latch indicative of a state of the load prior to restarting. Accordingly, the load is not briefly lost due to a changed state in the microprocessor 106.
  • Responsive to an SEL event, the system 100 will cycle power to the control circuitry 105. The power cycle will cause the control circuitry 105 to stop operating for a period of time during and following the power cycle. The power is removed from the control circuitry 105 to clear the SEL and is then restored, allowing the control circuitry 105 to restart. The control circuitry 105 may require time to fully return to operation due to software loading and health checks upon restarting. During the period of time that the control circuitry 105 is not operable, the latch 113 maintains the state of the load present prior to the SEL event. Additional latches 113 may be present to maintain other control circuit state data.
  • Upon power up, the microprocessor 106 may determine whether the power cycle was due to an SEL event or whether the power cycle is a non-SEL type restart (such as, for example, turning the system on for the first time) by reading the latch 113 outputs via monitor signals 116. When the microprocessor 106 determines that the restart is due to an SEL event, according to embodiments, the system 100 is configured to 1) determine whether the system has recovered properly, 2) determine and remember the state prior to the recovery power cycle, and 3) resume active control of the load.
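  • The start-up decision described above can be sketched in software as follows. The disclosure does not specify how the latch outputs encode the restart cause, so the readback fields `recovery_state` and `load_on`, the `RestartCause` names, and the default-off behavior on a fresh start are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class RestartCause(Enum):
    FRESH_START = "fresh_start"      # non-SEL restart, e.g., first power-on
    SEL_RECOVERY = "sel_recovery"    # restart following an SEL power cycle


@dataclass(frozen=True)
class LatchReadback:
    """Hypothetical values read from the latch via the monitor signals."""
    recovery_state: bool  # assumed flag: True if an SEL power cycle occurred
    load_on: bool         # load command held by the latch before the restart


def classify_restart(latch: LatchReadback) -> RestartCause:
    """Decide whether this power-up follows an SEL recovery or a fresh start."""
    if latch.recovery_state:
        return RestartCause.SEL_RECOVERY
    return RestartCause.FRESH_START


def initial_load_command(latch: LatchReadback, default_on: bool = False) -> bool:
    """Resume the latched command after SEL recovery; otherwise use the default."""
    if classify_restart(latch) is RestartCause.SEL_RECOVERY:
        return latch.load_on
    return default_on
```

Under these assumptions, the controller resumes with the same load command the latch held through the power cycle, so the load does not see a change of state when active control resumes.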
  • At times, a persistent malfunction in the system may prevent the control circuitry 105 from restarting, leading to multiple power cycles where the system has not returned to a safe state after each power cycle. The persistent malfunction could cause the control circuitry 105 to repeatedly cycle on and off, which may be damaging to the system 100. To prevent rapid power cycling in such cases, in response to a persistent malfunction condition, the watchdog circuit 108 may remove power from both the control circuitry 105 and the latch circuitry 113. The microprocessor is thus maintained in a nonfunctional, powered-off state until maintenance of the hardware is performed. The counter 110 is configured to transmit a power down signal 111 to the power supply 102 if and only if a predetermined number of power cycles has been exceeded within a predetermined span of time. For example, the counter 110 may be configured to limit the system 100 to only 3 resets within a 60 second time period, where each reset is a power cycle of the control power 117 that shuts off the control power 117 and then restores power to the control circuitry 105 via a switch 104. After the third reset, the counter 110 may be configured to cause the power supply 102 to turn off, removing power 103 from both the control circuitry 105 and the latch circuitry 113. Removing power from the latch circuitry 113 ensures the load is placed in a safe off state when persistent faults prevent proper operation of the control circuitry 105. It should be appreciated that the predetermined number and predetermined time span are exemplary only and not limiting.
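  • The power cycle limit can be illustrated with a minimal sliding-window sketch. It is a software model under stated assumptions, not the disclosed counter circuit: the class name `PowerCycleCounter` and its method are hypothetical, the defaults of 3 cycles and 60 seconds are taken from the example above, and the power-down condition is read as "more than the limit within the window" (a stricter reading would use "at least the limit").

```python
from collections import deque


class PowerCycleCounter:
    """Hypothetical model of a counter that limits power cycles per time window."""

    def __init__(self, max_cycles: int = 3, window_s: float = 60.0) -> None:
        self.max_cycles = max_cycles
        self.window_s = window_s
        self._cycle_times = deque()  # timestamps of recent power cycles

    def record_cycle(self, now_s: float) -> bool:
        """Record a power cycle; return True if power-down should be asserted.

        Timestamps are assumed to be non-decreasing.
        """
        self._cycle_times.append(now_s)
        # Discard cycles that fall outside the sliding window.
        while now_s - self._cycle_times[0] > self.window_s:
            self._cycle_times.popleft()
        # Assumption: power-down only once the limit has been exceeded.
        return len(self._cycle_times) > self.max_cycles


# Example: three cycles in quick succession are tolerated, a fourth is not.
counter = PowerCycleCounter(max_cycles=3, window_s=60.0)
decisions = [counter.record_cycle(t) for t in (0.0, 10.0, 20.0, 30.0)]
```

Because old timestamps age out of the window, isolated SEL recoveries spread over a long period never trip the limit in this model; only a burst of cycles, as expected from a persistent fault, does.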
  • Recovering properly includes a full power cycle of the control circuitry 105 and a restart of the system 100 software (operating as part of the control circuitry 105), in which the controller acknowledges the current known state and determines whether the current state is a recovered state or a fresh restart state. In one case, a software or hardware fault causes an incomplete restart where the system software (not shown) operating in the control circuitry 105 fails to execute or executes with errors. After a predetermined number of incomplete restart cycles resulting in an incomplete state recovery, the counter 110 shuts down the control circuitry 105 and the latch 113.
  • The system 100 monitors the output signals 112 in a feedback loop (e.g., monitor signal 116) to determine what the control state was prior to power cycling. If the control circuitry 105 is operative without errors, the control circuitry 105 outputs a state signal 115 to the flip flop (described with respect to FIG. 2) indicative of a functional gate command, and outputs the clock signal 114 to the latch 113. With the clock signal 114 and the state signal 115, the latch 113 functions as a persistence mechanism that remembers the state of the control circuitry 105 prior to the reset.
  • FIG. 2 is a diagram of an exemplary circuit 200 for SEL event latchup recovery with state protection, according to another embodiment. The circuit 200 may be, for example, the circuit functional for the system 100 as shown in FIG. 1. FIGS. 1 and 2 are now considered in conjunction with one another, according to an embodiment. Referring now to FIG. 2, circuit 200 includes two main functional portions: an SEL recovery portion 202 and an output maintenance circuit 204. The SEL recovery portion 202 includes a watchdog circuit 209 with a counter mechanism (e.g., the watchdog timer 108 and counter 110 as shown in FIG. 1), and a switch (e.g., the switch 104 as shown in FIG. 1) configured to receive signals from the watchdog circuit 209 (e.g., power cycle pulse 109 as shown in FIG. 1) and to remove power from the controller 210 and then restore power to the controller 210. Responsive to determining that the circuit 200 has not restarted properly after multiple power cycle attempts, the watchdog circuit 209 sends a shutdown signal (e.g., power down 111 as shown in FIG. 1) to the power supply 207.
  • The circuit 200 further includes the output maintenance circuit 204 configured to remember the prior state of the system before a restart. The states can include, for example, 1) the output states of the system 100, 2) a normal operation state associated with a non-erroneous shut down and restart of the control circuitry 105, and 3) a recovery state associated with a prior SEL event recovery where the control circuitry 105 has lost power due to an SEL event.
  • According to an embodiment, the flip flop 215, configured as the latch mechanism 113 of FIG. 1, receives the clock signal 114 and the state signal 115 from the controller 210 (operational as the control circuitry 105 and/or the microprocessor 106 of FIG. 1). The controller 210 determines the desired output state 216 during normal operation. If a fault then appears that leads the SEL recovery portion 202 to power cycle the controller 210, the flip flop 215 retains the output state 216. After restarting, the controller 210 may read the previously set output state 216 via the monitor signal 116 in FIG. 1. The output state 216 of the flip flop 215 operates, via the gate driver 212, a switch 213 providing the voltage feed 211 to the load 214.
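  • The retention behavior of the flip flop can be illustrated with a minimal behavioral model. The disclosure does not state the flip flop type or its triggering, so the sketch assumes a rising-edge D flip flop that stays powered while the controller is power cycled, and assumes the clock and state inputs both read low while the controller is unpowered. The class name `DFlipFlop` and its method are hypothetical.

```python
class DFlipFlop:
    """Hypothetical rising-edge D flip flop powered independently of the controller."""

    def __init__(self) -> None:
        self.q = False           # output state (drives the gate driver)
        self._prev_clk = False   # previous clock level, used for edge detection

    def update(self, clk: bool, d: bool) -> bool:
        """Apply input levels; copy d to q only on a rising clock edge."""
        if clk and not self._prev_clk:
            self.q = d
        self._prev_clk = clk
        return self.q


ff = DFlipFlop()
ff.update(clk=False, d=True)   # state signal set, no clock edge yet
ff.update(clk=True, d=True)    # rising edge: load command latched on
on_before_fault = ff.q

# Controller loses power: both inputs drop low (assumption), no rising edge occurs.
ff.update(clk=False, d=False)
on_during_power_cycle = ff.q   # output is retained through the power cycle
```

Because the output changes only on a rising clock edge, the loss of the controller's signals during a power cycle does not alter the latched load command in this model.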
  • The term “about” is intended to include the degree of error associated with measurement of the particular quantity based upon the equipment available at the time of filing the application. For example, “about” can include a range of ±8% or 5%, or 2% of a given value.
  • The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the present disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
  • While the present disclosure has been described with reference to an exemplary embodiment or embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the present disclosure. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the present disclosure without departing from the essential scope thereof. Therefore, it is intended that the present disclosure not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this present disclosure, but that the present disclosure will include all embodiments falling within the scope of the claims.

Claims (15)

What is claimed is:
1. An apparatus comprising:
a single event latchup (SEL) recovery circuit;
a microprocessor operatively connected with the SEL recovery circuit, and an output maintenance circuit that maintains a state of the microprocessor prior to a power cycle of the microprocessor;
wherein the apparatus is configured to:
detect a SEL event or other fault via a watchdog circuit;
initiate a power cycle of the microprocessor;
retain a latch state from the microprocessor;
determine, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event;
responsive to determining that the microprocessor has failed to restart due to a persistent fault, determine whether a predetermined power cycle limit is reached within a predetermined span of time; and
selectively provide power to a load based on the latch state and the power cycle limit determination.
2. The apparatus of claim 1, wherein the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
3. The apparatus of claim 1, wherein selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
4. The apparatus of claim 1, wherein the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
5. The apparatus of claim 1, wherein the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
6. A method for recovering a circuit after a single event latchup (SEL) comprising:
detecting a SEL event or other fault via a watchdog circuit;
initiating a power cycle of a microprocessor;
retaining a latch state from the microprocessor;
determining, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event;
responsive to determining that the microprocessor has failed to restart due to a persistent fault, determining whether a predetermined power cycle limit is reached within a predetermined span of time; and
selectively providing power to a load based on the latch state and the power cycle limit determination.
7. The method of claim 6, wherein the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
8. The method of claim 6, wherein selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
9. The method of claim 6, wherein the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
10. The method of claim 6, wherein the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
11. A nontransitory computer readable storage medium storing instructions that, when executed by a processor, perform a method for recovering a circuit after a single event latchup (SEL) comprising:
detecting a SEL event or other fault via a watchdog circuit;
initiating a power cycle of a microprocessor;
retaining a latch state from the microprocessor;
determining, via the microprocessor and a latch mechanism, whether the microprocessor was restarted due to an SEL event;
responsive to determining that the microprocessor has failed to restart due to a persistent fault, determining whether a predetermined power cycle limit is reached within a predetermined span of time; and
selectively providing power to a load based on the latch state and the power cycle limit determination.
12. The nontransitory computer-readable storage medium of claim 10, wherein the watchdog circuit is configured to shut the power to the load off responsive to determining that the predetermined power cycle limit is reached within a predetermined span of time.
13. The nontransitory computer-readable storage medium 10, wherein selectively providing power to the load comprises transmitting a command state to a field effect transistor operable as part of a solid state power controller.
14. The nontransitory computer-readable storage medium 10, wherein the latch state comprises a normal operation state of the microprocessor, wherein the normal operation state is associated with a non-erroneous shut down and restart of the SEL recovery circuit.
15. The nontransitory computer-readable storage medium 10, wherein the latch state comprises a recovery state associated with a prior SEL event recovery, wherein the microprocessor has lost power due to an earlier SEL event within a predetermined span of time.
US15/917,523 2018-03-09 2018-03-09 Single event latchup recovery with state protection Active 2038-08-04 US10713118B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US15/917,523 US10713118B2 (en) 2018-03-09 2018-03-09 Single event latchup recovery with state protection
EP19161303.3A EP3537301A1 (en) 2018-03-09 2019-03-07 Single event latchup recovery with state protection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US15/917,523 US10713118B2 (en) 2018-03-09 2018-03-09 Single event latchup recovery with state protection

Publications (2)

Publication Number Publication Date
US20190278660A1 US20190278660A1 (en) 2019-09-12
US10713118B2 US10713118B2 (en) 2020-07-14

Family

ID=65817729

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/917,523 Active 2038-08-04 US10713118B2 (en) 2018-03-09 2018-03-09 Single event latchup recovery with state protection

Country Status (2)

Country Link
US (1) US10713118B2 (en)
EP (1) EP3537301A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5923830A (en) * 1997-05-07 1999-07-13 General Dynamics Information Systems, Inc. Non-interrupting power control for fault tolerant computer systems
US6064555A (en) * 1997-02-25 2000-05-16 Czajkowski; David Radiation induced single event latchup protection and recovery of integrated circuits
US20080151456A1 (en) * 2005-10-20 2008-06-26 Microchip Technology Incorporated Automatic Detection of a CMOS Device in Latch-Up and Cycling of Power Thereto
US20100268987A1 (en) * 2008-11-26 2010-10-21 Arizona Board of Regents, for and behalf of Arizona State University Circuits And Methods For Processors With Multiple Redundancy Techniques For Mitigating Radiation Errors
US20120159269A1 (en) * 2009-07-15 2012-06-21 Hitachi, Ltd. Measurement Device and Measurement Method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5341497A (en) 1991-10-16 1994-08-23 Ohmeda Inc. Method and apparatus for a computer system to detect program faults and permit recovery from such faults
US5993039A (en) 1997-03-26 1999-11-30 Avalon Imagining, Inc. Power-loss interlocking interface method and apparatus
US7453678B2 (en) 2004-08-24 2008-11-18 Hamilton Sunstrand Corporation Power interruption system for electronic circuit breaker
US20070091527A1 (en) 2005-10-20 2007-04-26 Microchip Technology Incorporated Automatic detection of a CMOS circuit device in latch-up and reset of power thereto
JP6322434B2 (en) 2014-02-17 2018-05-09 矢崎総業株式会社 Backup signal generation circuit for load control
EP2958210A1 (en) 2014-06-20 2015-12-23 Hamilton Sundstrand Corporation Power delivery system with mitigation for radiation induced single event latch-up microelectronic devices
US9928143B2 (en) 2016-04-20 2018-03-27 Hamilton Sundstrand Corporation System and method for managing single event latched (SEL) conditions

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11188421B2 (en) * 2018-07-30 2021-11-30 Honeywell International Inc. Method and apparatus for detecting and remedying single event effects
GB2580727B (en) * 2018-07-30 2022-08-31 Honeywell Int Inc Method and apparatus for detecting and remedying single event effects
US11409609B2 (en) * 2018-12-11 2022-08-09 Rolls-Royce Plc Single event effect mitigation

Also Published As

Publication number Publication date
EP3537301A1 (en) 2019-09-11
US10713118B2 (en) 2020-07-14

Similar Documents

Publication Publication Date Title
CN107589825B (en) Watchdog circuit, power IC and watchdog monitoring system
US8495433B2 (en) Microcomputer mutual monitoring system and a microcomputer mutual monitoring method
US10713118B2 (en) Single event latchup recovery with state protection
JPH09258821A (en) Device for monitoring microcomputer fault
US20110043323A1 (en) Fault monitoring circuit, semiconductor integrated circuit, and faulty part locating method
JP5094777B2 (en) In-vehicle electronic control unit
CN108832990B (en) Space single event effect instant recovery method for real-time communication equipment
JPS6128142B2 (en)
US20210159688A1 (en) Timer circuit with autonomous floating of pins and related systems, methods, and devices
EP4206697A1 (en) Self-locking and detection circuit and apparatus, and control method
US10838794B2 (en) Rate based fault restart scheme
US20160266564A1 (en) Industrial control system with integrated circuit elements partitioned for functional safety and employing watchdog timing circuits
KR101448013B1 (en) Fault-tolerant apparatus and method in multi-computer for Unmanned Aerial Vehicle
JP2006350425A (en) Single-event compensation circuit of semiconductor device
US11409609B2 (en) Single event effect mitigation
JP2009003663A (en) Power control device
US9274909B2 (en) Method and apparatus for error management of an integrated circuit system
CN107453316B (en) Safety circuit
EP4167092A1 (en) Built-in memory tests for aircraft processing systems
US20230327673A1 (en) Main board, hot plug control signal generator, and control signal generating method thereof
CN109885450B (en) Active satellite-borne computer health state monitoring and optimizing method and system
US11493982B2 (en) Microcontroller and power management integrated circuit application clustering for safe state management
CN114791830B (en) Method for controlling and automatically restarting a technical device
US11169892B1 (en) Detecting and reporting random reset faults for functional safety and other high reliability applications
EP2833240A2 (en) Electronic device

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: HAMILTON SUNDSTRAND CORPORATION, NORTH CAROLINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SWENSON, JOSHUA C.;REEL/FRAME:045172/0287

Effective date: 20180308

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4