CN105573876A - Middleware system of distributed fault tolerance flight control computer based on deterministic communication - Google Patents


Info

Publication number
CN105573876A
Authority
CN
China
Prior art keywords
module
node
self
flight control
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510920584.9A
Other languages
Chinese (zh)
Inventor
郭勇
陈宣文
牟明
刘帅
吴楠
马超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC filed Critical Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN201510920584.9A priority Critical patent/CN105573876A/en
Publication of CN105573876A publication Critical patent/CN105573876A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/22Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2252Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using fault dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/22Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G06F11/2236Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test CPU or processors
    • G06F11/2242Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test CPU or processors in multi-processor systems, e.g. one processor becoming the test master
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/22Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2284Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing by power-on test, e.g. power-on self test [POST]

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention relates to distributed embedded computer technology, in particular to a middleware system for a distributed fault-tolerant flight control computer based on deterministic communication. The system is arranged between the application and the hardware driver. It comprises a health management module, a member management module, a redundancy management module, a first input/output interface and a second input/output interface. The system reduces the difficulty of developing applications for multi-redundancy systems and improves the extensibility and usability of the distributed system.

Description

Middleware system for a distributed fault-tolerant flight control computer based on deterministic communication
Technical field
The present invention relates to distributed embedded computer technology, and specifically to a middleware system for a distributed fault-tolerant flight control computer based on deterministic communication.
Background
The fault-tolerant flight control computer system is the core component of a flight control system, and its safety and reliability directly affect the survivability of the aircraft. As a typical airborne fault-tolerant computer system, the flight control computer system has evolved from centralized fly-by-wire flight control computer systems, through distributed fault-tolerant computer systems based on bus communication, to distributed flight control computer systems based on switched networks.
In the past, flight control redundancy software typically bundled redundancy management, fault monitoring and fault handling, and was tightly coupled to the upper-layer application. As a result, the application software depended heavily on the hardware design and had to be redeveloped for each platform and each set of requirements. Software developed for a single system is hard to extend, and porting it between different systems is difficult.
Summary of the invention
To solve the problems described in the background, the present invention proposes a middleware system for a distributed fault-tolerant flight control computer based on deterministic communication. Applications interact with it only through defined data interfaces and middleware interfaces, which simplifies configuration of the distributed fault-tolerant computer and increases the extensibility of the system.
The technical solution of the present invention is as follows:
The present invention proposes a middleware system for a distributed fault-tolerant flight control computer based on deterministic communication, characterized in that it is arranged between the application program and the hardware driver and comprises a health management module, a member management module, a redundancy management module, a first I/O interface and a second I/O interface;
The health management module performs fault monitoring, fault handling and fault message recording for the computer node itself, and judges whether the computer node is a healthy node;
The member management module makes the information held by all healthy nodes consistent after each cycle update is completed;
The redundancy management module performs redundancy voting on the data collected by each computer node, passes the voted value to the application program through the first I/O interface for processing, and sends the result to the flight control unit;
The first I/O interface handles communication with the application program; the second I/O interface handles communication with the hardware driver.
The health management module described above comprises a power-on self-test module, a periodic self-test module and a hardware logic module;
The power-on self-test module checks the health of the node hardware. If the hardware is normal, the node initializes itself and enters itself into the trusted member list; otherwise it silences itself. The trusted member list is a list kept locally on the node that records whether the node itself is usable;
The periodic self-test module performs periodic self-tests on the node hardware, including a CPU test, a RAM test, a timer test and an interface test;
The hardware logic module raises alarms for software faults and monitors the working state of the hardware.
The member management module described above makes the information of all healthy nodes consistent after each cycle update by the following method:
1) In each cycle, every computer node synchronizes with the TTE network clock; the TTE network clock is the clock shared by all computer nodes;
2) Within a fixed time, the trusted member lists and application data results of all computer nodes are synchronized; the application data are the parameters and data used for flight control.
The redundancy management module described above comprises an input data acquisition module, an input data voting module and an output control module;
The input data acquisition module collects external device data through the hardware driver;
The input data voting module selects among the input external device data according to the majority voting principle and passes the selected value to the application program for computation;
The output control module sends the computation result of the application program to the flight control unit through the hardware driver.
The advantages of the invention are:
1. Because the middleware system is placed between the application program and the hardware driver, it effectively increases the extensibility and flexibility of the distributed flight control computer based on deterministic communication.
2. Applications developed on the middleware system need not be concerned with the redundancy configuration or network model of the system, which reduces the difficulty of application development and makes the applications portable.
Brief description of the drawings
Fig. 1 is a structural diagram of the middleware system of the present invention;
Fig. 2 is a diagram of the middleware system of the present invention in practical application;
Fig. 3 is the timing diagram for synchronizing the trusted member list and application data results;
Fig. 4 is the algorithm flowchart for synchronizing the trusted member list and application data results.
Detailed description
The system is described in detail below with reference to Figs. 1 and 2:
The distributed fault-tolerant computer system based on deterministic communication consists of distributed processing computer nodes (lockstep nodes), distributed interface computers (RDC nodes) and switches.
In this embodiment, three lockstep nodes and four RDC nodes form a distributed fault-tolerant computer system with 3×4 redundancy. The distributed nodes communicate over a unified TTE network. The system layout is shown in Fig. 1.
The lockstep modules mainly perform data processing, and the RDC modules perform interface input/output. The idea of layering is introduced into software development for the distributed fault-tolerant computer system to achieve interoperability between heterogeneous systems: a middleware layer, independent of both the hardware and the application, is encapsulated between the hardware layer and the application layer. It hides the interface differences between hardware devices, so that applications remain compatible with and supported on different underlying hardware. The middleware comprises three parts: health management, member management and redundancy management. For different nodes, the middleware only needs to adapt the interface for passing data up to and down from the application and the data communication interface to peripheral devices. The node software layering is shown in Fig. 2.
The present invention proposes a middleware system for a distributed fault-tolerant flight control computer based on deterministic communication, characterized in that it is arranged between the application program and the hardware driver and comprises a health management module, a member management module, a redundancy management module, a first I/O interface and a second I/O interface;
Health management module
The health management module covers fault monitoring, fault handling and fault message recording. It consists of a power-on self-test module (PUBIT), a periodic self-test module (CBIT) and a hardware logic module;
Specifically:
PUBIT: power-on self-test. When the system powers up, the module runs a power-on self-test to check the health of its hardware. If the hardware is normal, the module initializes its own trusted member list to include itself; otherwise it silences itself.
CBIT: periodic self-test. While the system is running, the hardware of each distributed computer node is tested periodically, including a CPU test, a RAM test, a timer test, an interface test and so on.
Hardware logic: includes the watchdog, hardware working state monitoring and so on.
During operation, each distributed node also monitors whether its own data and the data of peer nodes of the same type are within a threshold, which ensures that its own processed data are valid and trustworthy.
If a node itself fails, it is set to the silent state during health monitoring.
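The PUBIT/CBIT behaviour described above can be illustrated with a minimal sketch. The class name `HealthMonitor`, its fields and the test callables are hypothetical and not taken from the patent; the sketch only assumes that each hardware test reports pass or fail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class HealthMonitor:
    """Hypothetical per-node health manager (names are assumptions)."""
    node_id: str
    tests: Dict[str, Callable[[], bool]]          # e.g. CPU, RAM, timer, interface
    trusted_members: Set[str] = field(default_factory=set)
    silent: bool = False
    fault_log: List[str] = field(default_factory=list)

    def _run_tests(self, phase: str) -> bool:
        # Run every test, record each failure, report overall pass/fail.
        failed = [name for name, test in self.tests.items() if not test()]
        for name in failed:
            self.fault_log.append(f"{phase}: {name} test failed")
        return not failed

    def power_up_bit(self) -> bool:
        # PUBIT: a healthy node enters itself into its trusted member list,
        # an unhealthy node silences itself.
        if self._run_tests("PUBIT"):
            self.trusted_members.add(self.node_id)
        else:
            self.silent = True
        return not self.silent

    def cyclic_bit(self) -> bool:
        # CBIT: repeated every cycle; a failure silences the node.
        if not self.silent and not self._run_tests("CBIT"):
            self.trusted_members.discard(self.node_id)
            self.silent = True
        return not self.silent
```

A node would call `power_up_bit()` once at start-up and `cyclic_bit()` in every cycle; a silent node simply stops taking part in the broadcasts.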
Member management module
The member management module makes the information held by all healthy nodes consistent after each cycle update is completed. The method comprises the following steps:
1) In each cycle, every computer node synchronizes with the TTE network clock; the TTE network clock is the clock shared by all computer nodes;
2) Within a fixed time, the trusted member lists and application data results of all computer nodes are synchronized; the application data are the parameters and data used for flight control.
The timing sequence for synchronizing the trusted member list and application data results is shown in Fig. 3. Specifically:
1. At 100 µs after the task synchronization point, SM1 broadcasts a TT message to the system;
2. Before 140 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 finish receiving this broadcast and store it in the corresponding receive buffer;
3. At 200 µs, SM2 broadcasts a TT message to the system;
4. Before 240 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 (SM8 and SM9 can be added in the network configuration) finish receiving this broadcast and store it in the corresponding receive buffer;
5. At 300 µs, SM3 broadcasts a TT message to the system;
6. Before 340 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 (SM8 and SM9 can be added in the network configuration) finish receiving this broadcast and store it in the corresponding receive buffer;
7. At 400 µs, SM4 broadcasts a TT message to the system;
8. Before 440 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 (SM8 and SM9 can be added in the network configuration) finish receiving this broadcast and store it in the corresponding receive buffer;
9. At 500 µs, SM5 broadcasts a TT message to the system;
10. Before 540 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 (SM8 and SM9 can be added in the network configuration) finish receiving this broadcast and store it in the corresponding receive buffer;
11. At 600 µs, SM6 broadcasts a TT message to the system;
12. Before 640 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 (SM8 and SM9 can be added in the network configuration) finish receiving this broadcast and store it in the corresponding receive buffer;
13. At 700 µs, SM7 broadcasts a TT message to the system;
14. Before 740 µs, SM1, SM2, SM3, SM4, SM5, SM6 and SM7 (SM8 and SM9 can be added in the network configuration) finish receiving this broadcast and store it in the corresponding receive buffer.
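The timing above follows a regular pattern: node SMk broadcasts at k × 100 µs and every node has received the message before k × 100 + 40 µs. A small sketch can generate that table. The function name `tt_schedule` and its parameters are hypothetical; the 100 µs slot and 40 µs receive window are the values listed in the steps.

```python
from typing import List, Tuple


def tt_schedule(num_nodes: int = 7,
                slot_us: int = 100,
                rx_window_us: int = 40) -> List[Tuple[str, int, int]]:
    """Return (sender, send_time_us, receive_deadline_us) for each node.

    Times are offsets from the task synchronization point.
    """
    return [
        (f"SM{k}", k * slot_us, k * slot_us + rx_window_us)
        for k in range(1, num_nodes + 1)
    ]


if __name__ == "__main__":
    for sender, send_at, deadline in tt_schedule():
        print(f"{sender}: broadcast at {send_at} us, received before {deadline} us")
```

Calling `tt_schedule(num_nodes=9)` extends the same pattern to the optional SM8 and SM9 nodes, assuming they are given the next two slots; the source does not state their slot times.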
The algorithm flow for synchronizing the trusted member list and application data results is shown in Fig. 4. Specifically:
1. Determine whether the current transmission cycle is the cycle in which this node is expected to send; if so, go to step 2; if not, go to step 4;
2. Determine whether the counter is greater than the failure counter; if so, go to step 3; if not, reset the counter to 0;
3. Send the member information and change the node state to "sending";
4. Determine whether the scheduled time for receiving information from other nodes has arrived; if so, go to step 5; if not, wait;
5. Run the member information management algorithm;
6. Determine whether this is the last reception in the current cycle; if so, end this computation; if not, return to step 1.
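The source does not spell out the member information management algorithm used in step 5. As one possible illustration only, the sketch below assumes a simple agreement rule: a node stays in the trusted member list if it broadcast in this cycle and a majority of the received lists contain it. The function name `merge_member_lists` and the quorum rule are assumptions, not the patent's method.

```python
from collections import Counter
from typing import Dict, Set


def merge_member_lists(received: Dict[str, Set[str]]) -> Set[str]:
    """Merge the trusted member lists received in one cycle.

    `received` maps each sender to the set of node ids it trusts.
    Silent nodes send nothing, so they have no entry in `received`.
    """
    if not received:
        return set()
    # Count how many senders trust each node.
    votes = Counter(node for members in received.values() for node in members)
    quorum = len(received) // 2 + 1
    # Assumed rule: keep nodes that broadcast this cycle and reach the quorum.
    return {node for node, n in votes.items() if n >= quorum and node in received}
```

Because every healthy node receives the same broadcasts within the fixed time window, running the same deterministic rule on each node yields the same list everywhere, which is the consistency property the member management module is meant to provide.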
Redundancy management module
The redundancy management module performs redundancy voting on the data collected by each computer node, passes the voted value to the application program through the first I/O interface for processing, and sends the result to the flight control unit.
I/O interfaces
The I/O interfaces comprise the first I/O interface and the second I/O interface. The first I/O interface handles communication with the application program; the second I/O interface handles communication with the hardware driver.
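Redundancy voting under the majority voting principle can be sketched as follows. This is a minimal illustration that assumes the redundant samples are discrete values compared for exact equality; the function name `majority_vote` and the choice to return `None` when no strict majority exists are assumptions. Real analog inputs would usually need a tolerance band, which the source does not specify.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(samples: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of channels, else None."""
    if not samples:
        return None
    value, count = Counter(samples).most_common(1)[0]
    # A strict majority requires more than half of the channels to agree.
    return value if count > len(samples) // 2 else None
```

With three channels, one faulty sample is outvoted by the two that agree; with four channels, a two-against-two split yields no majority and the caller must decide how to handle it.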
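The position of the middleware between the application and the hardware driver can be sketched as one processing cycle. The class name `Middleware` and its callables are hypothetical; the driver functions and the voting function are passed in so the sketch stays independent of any particular hardware or voting rule.

```python
from typing import Any, Callable, Sequence


class Middleware:
    """Hypothetical layer between the application and the hardware driver."""

    def __init__(self,
                 read_inputs: Callable[[], Sequence[Any]],
                 vote: Callable[[Sequence[Any]], Any],
                 write_output: Callable[[Any], None]) -> None:
        self.read_inputs = read_inputs      # second I/O interface: driver input
        self.vote = vote                    # redundancy voting
        self.write_output = write_output    # second I/O interface: driver output

    def run_cycle(self, application: Callable[[Any], Any]) -> Any:
        samples = self.read_inputs()        # collect redundant external data
        voted = self.vote(samples)          # reduce them to one trusted value
        result = application(voted)         # first I/O interface: to the application
        self.write_output(result)           # send the result towards the flight control unit
        return result
```

The application only sees the voted value and returns a result, so it does not need to know how many redundant channels exist or how they are connected.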

Claims (4)

1. A middleware system for a distributed fault-tolerant flight control computer based on deterministic communication, characterized in that it is arranged between the application program and the hardware driver and comprises a health management module, a member management module, a redundancy management module, a first I/O interface and a second I/O interface;
The health management module performs fault monitoring, fault handling and fault message recording for the computer node itself, and judges whether the computer node is a healthy node;
The member management module makes the information held by all healthy nodes consistent after each cycle update is completed;
The redundancy management module performs redundancy voting on the data collected by each computer node, passes the voted value to the application program through the first I/O interface for processing, and sends the result to the flight control unit;
The first I/O interface handles communication with the application program; the second I/O interface handles communication with the hardware driver.
2. The middleware system for a distributed fault-tolerant flight control computer based on deterministic communication according to claim 1, characterized in that the health management module comprises a power-on self-test module, a periodic self-test module and a hardware logic module;
The power-on self-test module checks the health of the node hardware. If the hardware is normal, the node initializes itself and enters itself into the trusted member list; otherwise it silences itself. The trusted member list is a list kept locally on the node that records whether the node itself is usable;
The periodic self-test module performs periodic self-tests on the node hardware, including a CPU test, a RAM test, a timer test and an interface test;
The hardware logic module raises alarms for software faults and monitors the working state of the hardware.
3. The middleware system for a distributed fault-tolerant flight control computer based on deterministic communication according to claim 1 or 2, characterized in that the member management module makes the information of all healthy nodes consistent after each cycle update by the following method:
1) In each cycle, every computer node synchronizes with the TTE network clock; the TTE network clock is the clock shared by all computer nodes;
2) Within a fixed time, the trusted member lists and application data results of all computer nodes are synchronized; the application data are the parameters and data used for flight control.
4. The middleware system for a distributed fault-tolerant flight control computer based on deterministic communication according to claim 3, characterized in that the redundancy management module comprises an input data acquisition module, an input data voting module and an output control module;
The input data acquisition module collects external device data through the hardware driver;
The input data voting module selects among the input external device data according to the majority voting principle and passes the selected value to the application program for computation;
The output control module sends the computation result of the application program to the flight control unit through the hardware driver.
CN201510920584.9A 2015-12-10 2015-12-10 Middleware system of distributed fault tolerance flight control computer based on deterministic communication Pending CN105573876A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510920584.9A CN105573876A (en) 2015-12-10 2015-12-10 Middleware system of distributed fault tolerance flight control computer based on deterministic communication

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510920584.9A CN105573876A (en) 2015-12-10 2015-12-10 Middleware system of distributed fault tolerance flight control computer based on deterministic communication

Publications (1)

Publication Number Publication Date
CN105573876A true CN105573876A (en) 2016-05-11

Family

ID=55884042

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510920584.9A Pending CN105573876A (en) 2015-12-10 2015-12-10 Middleware system of distributed fault tolerance flight control computer based on deterministic communication

Country Status (1)

Country Link
CN (1) CN105573876A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7840832B2 (en) * 2006-05-16 2010-11-23 Saab Ab Fault tolerant control system
CN103825902A (en) * 2014-03-04 2014-05-28 中国民航大学 Reconstruction decision-making system and decision making method for comprehensive modularized avionics system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
徐亚军 主编: "《民航飞机自动飞行***》", 30 September 2013 *
杨伟等编著: "《容错飞行控制***》", 31 March 2007 *
金娟: "基于时间触发的容错通信中间件的研究与实现", 《中国优秀硕士学位论文全文数据库 工程科技II辑》 *

Similar Documents

Publication Publication Date Title
CN110795219B (en) Resource scheduling method and system suitable for multiple computing frameworks
CN102325192B (en) Cloud computing implementation method and system
Ren et al. A methodology towards virtualisation-based high performance simulation platform supporting multidisciplinary design of complex products
CN105095001B (en) Virtual machine abnormal restoring method under distributed environment
CN108270726B (en) Application instance deployment method and device
CN105373650B (en) IMA dynamic restructuring modeling methods based on AADL
CN105659562B (en) It is a kind of for hold barrier method and data processing system and include for holds hinder computer usable code storage equipment
Kim et al. Safer: System-level architecture for failure evasion in real-time applications
CN102214128B (en) Repurposable recovery environment
CN104205109B (en) The worker process of continuation and elasticity
CN104408071A (en) Distributive database high-availability method and system based on cluster manager
CN105933137A (en) Resource management method, device and system
CN104133734A (en) Distributed integrated modular avionic system hybrid dynamic reconfiguration system and method
CN107634855A (en) A kind of double hot standby method of embedded system
CN102110035B (en) DMI redundancy in multiple processor computer systems
CN103559108A (en) Method and system for carrying out automatic master and slave failure recovery on the basis of virtualization
CN104077199A (en) Shared disk based high availability cluster isolation method and system
CN102394774A (en) Service state monitoring and failure recovery method for controllers of cloud computing operating system
CN103440160A (en) Virtual machine recovering method and virtual machine migration method , device and system
WO2012005637A1 (en) Method for configuring a distributed avionics control system
CN112948063B (en) Cloud platform creation method and device, cloud platform and cloud platform implementation system
CN109684131B (en) Dynamic reconstruction method of hybrid structure network fault-tolerant system based on table driving
CN110874261A (en) Usability system, usability method, and storage medium storing program
CN105847053A (en) Method and system for automatically setting arbitrary bonding for multi-network card and multi-network segment under LINUX system
Pentyala Emergency communication system with Docker containers, OSM and Rsync

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160511