CN1295615C - Distribution type software reliability evaluation system having time restraint - Google Patents

Info

Publication number
CN1295615C
Authority
CN
China
Prior art keywords
module
information
simulator
node machine
program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2004100614067A
Other languages
Chinese (zh)
Other versions
CN1624664A (en)
Inventor
金海
李运发
李胜利
韩宗芬
戴志华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The present invention provides a distributed software reliability evaluation system with time constraints, comprising a reliability evaluation device, node machine simulators and a distributed software simulator. The reliability evaluation device is located at the master control end, and a node machine simulator is located on each node machine of the distributed system. The reliability evaluation device receives configuration files, carries out initial configuration, and controls the node machine simulators' simulation of the node machines, among other functions. The node machine simulators receive detection information sent by the reliability evaluation device and receive executable programs and available data files sent by the distributed software simulator, among other functions. The distributed software simulator generates the programs in a program list and the data files in a file list and distributes them at random to the node machine simulators, among other functions. The system uses reliability and mean error-free time as its main evaluation indexes, and offers simple installation, a user-friendly interface, good extensibility, controllable evaluation parameters, support for repeated consecutive evaluations, and platform independence.

Description

Distributed software reliability evaluation system with time constraints
Technical field
The invention belongs to the field of software performance evaluation and provides a new model and method for assessing the reliability of distributed software with time constraints.
Background art
In recent years, with the rapid development of science and technology, microelectronics and computer technology have penetrated every field, and the application of electronic information technology has entered a period of rapid growth. Distributed software systems with time constraints are subject to timing requirements on the one hand and provide redundancy of programs and data files on the other, and are therefore widely used in fields such as the military, aerospace, navigation and energy control. As part of a computer system, distributed software with time constraints keeps growing in scale, functionality and complexity as its range of application expands, and it has come to occupy an important position in computer technology and information processing. As the scale and complexity of such software increase, so does its error rate. Because the failure of distributed software may have catastrophic consequences, the software engineering and reliability engineering communities have been forced to pay close attention to the reliability of distributed software with time constraints, and the main concern for the overall system has gradually shifted from hardware to software.
The performance evaluation of distributed software usually covers real-time behavior, reliability, stability, ease of use and accuracy. The quality of each subsystem of the distributed software is assessed first. Once subsystem quality is assured, suitable metrics are selected, and the real-time behavior, reliability, stability, ease of use and accuracy of each subsystem are measured, graded and assessed. After the subsystems have been evaluated, the combined performance of the overall system is assessed to determine whether it actually meets practical needs.
Reliability is the most important index of distributed software quality and the ultimate goal of distributed software development. Distributed software reliability assessment evaluates the reliability of the software on the basis of a reliability model and assesses the metrics directly related to reliability (such as fault intensity, failure rate and mean time before failure). Research on distributed software reliability assessment plays an important role in measuring software reliability, evaluating software performance, controlling and managing the development of software products and the software production process, and improving the productivity of distributed software development.
However, current assessment models for distributed software focus on reliability alone and seldom reflect time constraints. Moreover, reliability assessment usually relies on a single index; that is, each assessment addresses only one performance index of software reliability.
Summary of the invention
The object of the invention is to overcome the shortcomings of existing distributed software reliability assessment by providing a distributed software reliability evaluation system with time constraints. The system avoids concentrating the assessment on a single index: it simultaneously assesses the two main indexes that reflect the reliability of distributed software with time constraints, namely reliability and mean error-free time.
The distributed software reliability evaluation system with time constraints provided by the invention comprises a reliability assessment device located at the master control end, a node machine simulator located on each node machine of the distributed system, and a distributed software simulator, wherein:
The reliability assessment device comprises a master control interface module, an information gathering module, an assessment algorithm module, a result display module and a distributed system restart module. The master control interface module initializes the system and controls the distributed system restart module, the information gathering module, the assessment algorithm module and the result display module. The distributed system restart module restarts the distributed software simulator and all node machine simulators before an assessment begins, after which the distributed software simulator and the node machine simulators carry out the random scheduling of programs and the random distribution of files again. The information gathering module receives control information from the master control interface module and determines its mode of operation accordingly; it sends detection information to the node machine simulators, receives their feedback, and forwards the feedback to the master control interface module and the assessment algorithm module. The assessment algorithm module receives the configuration information sent by the master control interface module and the feedback information sent by the information gathering module, applies the distributed software reliability assessment model with time constraints to calculate the reliability of the distributed software with time constraints, and sends the result to the result display module. The result display module displays the information sent by the master control interface module and the assessment algorithm module.
The node machine simulator receives the detection information sent by the reliability assessment device; receives the executable programs and available data files sent by the distributed software simulator and stores them in a designated directory; executes the received detection command to inspect the information on the local machine; displays the local machine's information on its interface in real time; reports the detection results to the reliability assessment device; and, on receiving a "Clear" command from the reliability assessment device, removes the executable program and available data file information on the node machine. The distributed software simulator generates the programs in the program list and the data files in the file list and assigns them at random to the node machine simulators on the node machines; on receiving a "Send" command from the master controller, it randomly redistributes the programs and data files.
The node machine simulator comprises a socket setup module, a message processing module and a display module. The socket setup module establishes the port connections between the node machine simulator and both the reliability assessment device and the distributed software simulator. The message processing module receives the programs, data files and detection information passed on by the socket setup module and stores the programs and data files in the designated directory. According to the detection information received, it inspects the programs and data files on the current node machine, determines whether they are executable programs and available data files, and sends the detection results to the socket setup module and the display module, which displays them.
The distributed software simulator comprises an initialization module, a socket setup module, and a program and file distribution module. The initialization module reads in the initialization file and sends its information to the socket setup module, which uses it to establish the connections between the node machine simulators and the distributed software simulator. The initialization information is also passed to the program and file distribution module, which randomly dispatches the programs of the distributed software to node machine simulators to run and randomly distributes the data files to node machine simulators.
The invention adopts a modular design to minimize the influence of the evaluation system's own behavior on the evaluation results. It also adopts a simulator-based design, so that the simulators reproduce the environment in which the distributed software runs while the evaluation system remains independent of the distributed software system under test, making a faithful assessment of the reliability of distributed software with time constraints possible. In summary, the invention has the following technical effects:
1. A new assessment model
Building on existing distributed software assessment theory and adding the time-constraint factor, the system derives a new theory and model suited to assessing the reliability of distributed software with time constraints. The model takes full account of the influence of the distribution of programs and data files on the reliability of such software, and uses techniques such as breadth-first search and step-by-step node expansion to ensure that the generated file spanning trees contain no duplicate paths.
2. Good extensibility
The system reads the topology of the distributed system under test from a configuration file, so its configuration is flexible. It can assess the reliability of distributed systems with linear, star, ring, mesh and other topologies, and it is not limited by the topology or size of the distributed system.
3. Controllable evaluation parameters
The evaluation parameters are set in the configuration file. Before finalizing a design, a system designer can use the system to assess its reliability and adjust the key factors that affect performance until customer requirements are met, avoiding the unnecessary cost and trouble caused by an unsound system design.
4. Continuity of assessment
Together, the node machine simulators and the distributed software simulator can simulate any distributed system to be evaluated, and the reliability of the distributed software with time constraints is assessed through master controller 2. When one assessment ends, the node machine simulators and the distributed software simulator can be restarted from the master control interface, and the random scheduling of programs and random assignment of files are carried out again in preparation for the next assessment. The whole operation is automated.
5. Platform independence
The system software is written in Java and runs on both Windows and Linux.
In addition, the invention is simple and convenient to install and has a user-friendly interface.
Description of drawings
Fig. 1 shows the topology of the distributed software reliability evaluation system with time constraints;
Fig. 2 shows the system architecture of the reliability assessment simulator;
Fig. 3 is the data flow diagram of the node machine simulator;
Fig. 4 is the data flow diagram of the distributed software simulator;
Fig. 5 is a schematic diagram of a test case.
Embodiment
As shown in Fig. 1, from the operational point of view the evaluation system comprises a reliability assessment device 5 located in master controller 2, node machine simulators 6.1, 6.2, ..., 6.n (collectively, node machine simulator 6) located on the node machines 4.1, 4.2, ..., 4.n of distributed system 1, and a distributed software simulator 7.
Reliability assessment device 5 is the core of the whole evaluation system. Its main functions are to: receive the configuration file and carry out initial configuration accordingly; control node machine simulator 6's simulation of the node machines, send it the program and data file information to be detected, and receive its feedback; control distributed software simulator 7's simulation of the distributed software and receive its feedback; display all feedback information; and, from the configuration and feedback information, apply the distributed software reliability assessment model with time constraints to calculate the reliability value of the distributed software with time constraints and display the result.
The distributed software reliability assessment model with time constraints uses a series of algorithms: the degree-1 reduction algorithm, the degree-2 reduction algorithm, the file spanning tree algorithm, the transmission time evaluation algorithms, and a Markov model. The concrete steps are:
1. From the configuration file, determine all executable distributed programs in the distributed software and form them into a set $\{p_1, p_2, p_3, \ldots, p_n\}$.
2. From the configuration file, determine all data files required by the distributed programs $p_i$ ($i = 1, 2, \ldots, n$) and form them into a set $\{F_1, F_2, F_3, \ldots, F_m\}$.
3. From the configuration file and all feedback information received, determine the locations in the distributed system of the data files required by each distributed program $p_i$ ($i = 1, 2, \ldots, n$).
4. From the configuration file and the locations of the data files on the node machines, simplify the distributed system graph using the degree-1 reduction algorithm and the degree-2 reduction algorithm.
5. From the simplified distributed system graph, the data files required by each distributed program $p_i$ ($i = 1, 2, \ldots, n$) and the locations of those files in the system, use the file spanning tree algorithm to generate all transmission paths of the data files required when $p_i$ executes.
6. From all transmission paths of the data files required when each distributed program $p_i$ ($i = 1, 2, \ldots, n$) executes, use the transmission time evaluation algorithm to calculate the time needed to transmit each data file along its path, and thereby determine the response time of the data files.
7. From the data files required when each distributed program $p_i$ ($i = 1, 2, \ldots, n$) executes, the locations of those files in the distributed system, the transmission capability and reliability of each link, and the time-constraint factor, use the Markov model and the formula for distributed program reliability to calculate the reliability of $p_i$.
8. From the reliabilities of the distributed programs $p_i$ ($i = 1, 2, \ldots, n$), calculate the reliability of the distributed software with time constraints.
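Steps 4 and 6 can be illustrated with a minimal Python sketch. The patent names the degree-1 reduction and transmission time evaluation algorithms but does not give their details, so everything here is an assumption: the graph is an adjacency map, a node is pruned when it has at most one neighbor and holds nothing the program needs, and a path's transmission time is taken to be the sum of file size divided by link capacity over its links. The function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Graph = Dict[str, Set[str]]  # node name -> set of neighboring node names


def degree1_reduce(graph: Graph, keep: Set[str]) -> Graph:
    """Repeatedly prune nodes with at most one neighbor that are not in `keep`.

    `keep` stands for the nodes holding the program or a required data file
    (assumed reading of degree-1 reduction; not taken from the patent text).
    """
    g = {node: set(nbrs) for node, nbrs in graph.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for node in list(g):
            if node not in keep and len(g[node]) <= 1:
                for nbr in g[node]:
                    g[nbr].discard(node)
                del g[node]
                changed = True
    return g


def path_transfer_time(path: List[str],
                       capacity: Dict[FrozenSet[str], float],
                       file_size: float) -> float:
    """Sum file_size / capacity over every link of the path (assumed model)."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        total += file_size / capacity[frozenset((a, b))]
    return total


def meets_time_constraint(path: List[str],
                          capacity: Dict[FrozenSet[str], float],
                          file_size: float,
                          deadline: float) -> bool:
    """True when the file can traverse the path within the deadline."""
    return path_transfer_time(path, capacity, file_size) <= deadline
```

Under these assumptions, pruning shrinks the graph before any path enumeration, and only paths that satisfy `meets_time_constraint` would count toward a program's reliability.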
Node machine simulator 6 provides the running environment for the distributed software within the evaluation system. Its main functions are to: receive detection information from reliability assessment device 5; receive the executable programs and available data files sent by distributed software simulator 7 and store them in a designated directory; execute the received detection command to inspect the information on the local machine; display the local machine's information on its interface in real time; report the detection results promptly to reliability assessment device 5; and, on receiving a "Clear" command from reliability assessment device 5, remove the executable program and available data file information on the node machine.
The main functions of distributed software simulator 7 are to generate the programs in the program list and the data files in the file list and assign them at random to the node machine simulators 6 on the node machines, and, on receiving a "Send" command from master controller 2, to randomly redistribute the programs and data files.
Reliability assessment device 5 first sends a control command over network 3 to node machines 4, instructing each node machine simulator 6 to begin its simulation. Once all node machine simulators 6 have completed their simulation, reliability assessment device 5 sends a control command over network 3 to distributed software simulator 7, which then distributes the distributed programs and data files at random to the node machine simulators 6 on node machines 4, completing the simulation of the distributed software. Reliability assessment device 5 then sends detection information over network 3 to each node machine simulator 6. On receiving it, each node machine simulator 6 carries out the detection specified by the detection command and feeds the result back to reliability assessment device 5. Finally, reliability assessment device 5 uses the configuration and feedback information and the distributed software reliability assessment model with time constraints to calculate the reliability value of the distributed software with time constraints and displays the result.
The topology of a distributed system may be linear, star, ring, mesh or some other shape, and the system may be deployed within a single local area network or across a wide area network; distributed systems with different topologies and deployments may place different demands on distributed software. Likewise, different distributed software may place different demands on the distributed system environment at run time. The evaluation of distributed software reliability with time constraints therefore uses simulators, which can simulate different distributed software with time constraints on the one hand and each node machine of the distributed system on the other.
The structure of each of the above parts is described in further detail below with reference to Fig. 2.
I. The reliability assessment device in the master controller
Reliability assessment device 5 comprises five parts: master control interface module 8, information gathering module 9, assessment algorithm module 10, result display module 11 and distributed system restart module 12.
The main functions of master control interface module 8 are to: accept the time-constraint factor as input; read in the configuration file describing distributed system graph G and the transmission capability and reliability of each link, and initialize the system; control distributed system restart module 12, information gathering module 9, assessment algorithm module 10 and result display module 11; and send configuration information to assessment algorithm module 10 and result display module 11.
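As an illustration of the configuration input, the following Python sketch parses a topology description into a link table. The patent does not specify the configuration file format, so the one-link-per-line layout (`nodeA nodeB capacity reliability`, with `#` comments), the `Link` type and the `parse_topology` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Link:
    capacity: float     # transmission capability of the link
    reliability: float  # probability that the link works, in [0, 1]


def parse_topology(text: str) -> Dict[FrozenSet[str], Link]:
    """Parse lines of the assumed form 'nodeA nodeB capacity reliability'."""
    links: Dict[FrozenSet[str], Link] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        a, b, cap, rel = line.split()
        reliability = float(rel)
        if not 0.0 <= reliability <= 1.0:
            raise ValueError(f"reliability out of range on line: {raw!r}")
        # Links are undirected, so the key is an unordered pair of node names.
        links[frozenset((a, b))] = Link(float(cap), reliability)
    return links
```

Keying links by an unordered pair keeps the graph undirected, which suits the linear, star, ring and mesh topologies mentioned earlier.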
The function of distributed system restart module 12 is to restart distributed software simulator 7 and all node machine simulators 6 before an assessment begins, so that the random scheduling of programs and random distribution of files are carried out again.
The functions of information gathering module 9 are to: receive control information from master control interface module 8 and determine its mode of operation accordingly; send detection information to node machine simulator 6 and receive its feedback (the available program set PA and available data file set FA of every node machine); and forward the feedback received to master control interface module 8 and assessment algorithm module 10.
The function of assessment algorithm module 10 is to receive the configuration information sent by master control interface module 8 and the feedback information sent by information gathering module 9 and, from this information, apply the distributed software reliability assessment model with time constraints to calculate the reliability of the distributed software with time constraints and send the value to result display module 11.
The functions of result display module 11 are to: receive start and stop commands from master control interface module 8; receive the assessment results sent by assessment algorithm module 10; send display information to master control interface module 8; and display the results received, namely the distributed system graph, the assessment result chart and the assessment log file.
The program flow of reliability assessment device 5 is as follows:
1. The user enters the time-constraint factor through master control interface module 8 and reads in the description file of the distributed system graph together with the transmission capability and reliability of each link; the system is then initialized.
2. Through master control interface module 8, the user sends control information to distributed system restart module 12. In response, distributed system restart module 12 first sends a restart command to node machine simulator 6 and, once node machine simulator 6 has finished its simulation, sends a start command to distributed software simulator 7.
3. Information gathering module 9 receives control information from master control interface module 8. In response, it sends detection information to node machine simulator 6, receives the feedback that node machine simulator 6 returns, and forwards that feedback to assessment algorithm module 10 and master control interface module 8.
4. Assessment algorithm module 10 receives the feedback information from information gathering module 9 and the configuration information from master control interface module 8. From this information it applies the distributed software reliability assessment model with time constraints to calculate the reliability of the distributed software with time constraints, and sends the result to result display module 11.
5. Result display module 11 receives the calculation result from assessment algorithm module 10 and the display command and configuration information from master control interface module 8. In response to the display command, it first sends display information to master control interface module 8 and then displays the calculation result and configuration information received. Three items are displayed: the distributed system graph, the assessment result chart and the assessment log file. The distributed system graph shows, in an intuitive form, the information of each node in the evaluated distributed system (node name, IP address, available file set and available program set) and the connections between nodes (the communication capability and reliability of each link). The assessment result chart shows the assessment result for the whole distributed software as a bar chart, along with the detailed reliability results for each program (the simplified distributed system graph for that program and the mean error-free time over all valid states). The assessment log is displayed as text so that system design engineers can analyze and improve the reliability of the distributed system and software.
6. Before the next assessment begins, distributed system restart module 12 restarts the distributed software simulator and all node machine simulators, and the executable programs and available data files are randomly distributed again. Each assessment may therefore cover a different distribution, which makes the assessment more general.
II. Node machine simulator
Node machine simulator 6 comprises three functional modules: socket setup module 13, message processing module 14 and display module 15.
Socket setup module 13 creates a server socket on port 8189 and waits for connections from reliability assessment device 5 in master controller 2 and from distributed software simulator 7. It receives the detection information sent by reliability assessment device 5 and passes it to message processing module 14; receives the executable programs and available data files sent by distributed software simulator 7 and passes them to message processing module 14; reports the detected information to reliability assessment device 5; and returns connection information to reliability assessment device 5 and distributed software simulator 7 respectively.
Message processing module 14 receives the executable programs and available data files passed on by socket setup module 13 and stores them in the designated directory. It also receives the detection information passed on by socket setup module 13, inspects the program and data file information on the current node machine accordingly, and sends the detection results to socket setup module 13 and display module 15. As noted above, socket setup module 13 sends the detection results to information gathering module 9.
Message processing module 14 handles information received from socket setup module 13 as follows:
1. If "FA" is received, a data file will follow. Message processing module 14 first receives the file name, then saves the received file content to a file with that name and records the file name in the txt file list.
2. If "PA" is received, an executable program will follow. Message processing module 14 first receives the program name, then saves the received content to a file with that name, runs the program, and records the program name in the exe file list.
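The two cases above can be sketched in Python as a single handler. This is a simplified illustration under stated assumptions: the list file names `file_list.txt` and `program_list.txt`, the `TAG name` line format and the `handle_incoming` function are hypothetical, and the sketch deliberately does not run a received program.

```python
from pathlib import Path

# Hypothetical names for the txt file list and the exe file list.
LIST_NAMES = {"FA": "file_list.txt", "PA": "program_list.txt"}


def handle_incoming(tag: str, name: str, content: bytes, directory: Path) -> Path:
    """Store a received data file ('FA') or program ('PA') and record its name.

    Running a received program is intentionally left out of this sketch.
    """
    if tag not in LIST_NAMES:
        raise ValueError(f"unknown tag: {tag!r}")
    directory.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a name cannot escape the directory.
    target = directory / Path(name).name
    target.write_bytes(content)
    with (directory / LIST_NAMES[tag]).open("a", encoding="utf-8") as handle:
        handle.write(f"{tag} {target.name}\n")
    return target
```

Appending one `TAG name` line per item keeps each list easy to scan later, when the node is asked which programs and files it holds.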
Message processing module 14 performs information detection as follows:
1. Take the txt file list from the designated directory that holds the executable programs and available data files, and read the file names that follow "FA" in the list. If the name read is empty, no data file is available on the current node machine; if it is not empty, data files are available on the current node machine.
2. Take the exe file list from the designated directory that holds the executable programs and available data files, and read the names that follow "PA" in the list. If the name read is empty, there is no program on the current node machine; if it is not empty, programs are present on the current node machine.
Display module 15 receives the detection results sent by message processing module 14 and displays them, showing the program and file information on the current node machine.
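The detection procedure above can be sketched in Python as reading the two list files and reporting the names found. The list file names (`file_list.txt`, `program_list.txt`) and the `TAG name` line format are assumptions made for illustration; the patent only states that file names follow "FA" and program names follow "PA".

```python
from pathlib import Path
from typing import Dict, List


def read_names(list_path: Path, tag: str) -> List[str]:
    """Return the names that follow `tag` in a list file; [] if none exist."""
    if not list_path.exists():
        return []
    names: List[str] = []
    for line in list_path.read_text(encoding="utf-8").splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2 and parts[0] == tag:
            names.append(parts[1])
    return names


def detect(directory: Path) -> Dict[str, List[str]]:
    """Report the available data files (FA) and programs (PA) on this node."""
    return {
        "FA": read_names(directory / "file_list.txt", "FA"),
        "PA": read_names(directory / "program_list.txt", "PA"),
    }
```

An empty list for either key corresponds to the "empty" outcome in the procedure: no available data file, or no program, on the current node machine.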
The data flow of the node machine simulator 6 is shown in Figure 3. Its program flow is as follows:
(1) Create a server socket, listen on the designated port 8189, and wait for connections from the reliability assessment device 5 in the master controller 2 and from the distributed software simulator 7.
(2) Determine whether a "Begin" connection request has been received; if so, go to step (3); otherwise, return to step (1).
(3) Establish the connections with the reliability assessment device 5 in the master controller 2 and with the distributed software simulator 7.
(4) Determine whether a "Start" command has been received; if so, go to step (7); otherwise, go to step (8).
(5) Determine whether a "Clear" command has been received; if so, go to step (6); otherwise, go to step (12).
(6) Clear all data file names in the text file list; clear all program names in the executable program list; update the display so that it shows only the node machine name and node IP address; go to step (12).
(7) Determine whether the information received is "PA"; if so, an executable program is about to arrive, so go to step (9); otherwise, go to step (8).
(8) Determine whether the information received is "FA"; if so, an available data file is about to arrive, so go to step (10); otherwise, go to step (5).
(9) Receive the program's display information and the program itself, then go to step (11).
(10) Receive the data file's display information and the data file itself, then go to step (11).
(11) Display the data files and programs received, and send this information to the master controller 2.
(12) Determine whether an "End" command has been received; if so, stop; otherwise, return to step (1).
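The command handling in the flow above can be illustrated with a small dispatch function. This is a simplified sketch, not the patented program: socket handling on port 8189 is omitted, the class `NodeState` and function `handle` are hypothetical names, and the returned status strings are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    """In-memory view of what the node machine simulator currently holds."""
    programs: list[str] = field(default_factory=list)
    data_files: list[str] = field(default_factory=list)
    running: bool = True


def handle(state: NodeState, command: str, name: str = "") -> str:
    """Apply one received command and return a short status string."""
    if command in ("Begin", "Start"):  # connection request / start of transfer
        return "ready"
    if command == "PA":                # an executable program follows
        state.programs.append(name)
        return f"program {name} received"
    if command == "FA":                # an available data file follows
        state.data_files.append(name)
        return f"data file {name} received"
    if command == "Clear":             # forget everything recorded so far
        state.programs.clear()
        state.data_files.clear()
        return "cleared"
    if command == "End":               # stop the simulator loop
        state.running = False
        return "stopped"
    return "ignored"
```

Separating the command logic from the socket code keeps the branching of steps (4) to (12) easy to exercise without a network connection.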
Three, distributed software simulator
The distributed software simulator 7 is divided into three functional modules: the initialization module 18, the socket setup module 16, and the program and file distribution module 17.
The initialization module 18 reads in the configuration file, which contains the IP address of each node machine, the program list and the file list. The socket setup module 16 creates a server socket on port 8188 and waits for connections. The program and file distribution module 17 randomly dispatches each branch of the distributed software to a node machine to run, and randomly distributes each data file to a node machine.
The distributed software simulator 7 must communicate with both the node machine simulators 6 and the reliability assessment device 5 in the master controller 2. Its data flow is shown in Figure 4, and its procedure is as follows:
(1) Read the initialization file, which contains the node set, program set and file set to be distributed.
(2) Create a server socket and listen on the designated port 8188.
(3) Determine whether a program is to be distributed; if so, go to step (4); otherwise, go to step (5).
(4) Select a node machine at random from the node set and establish a connection between the master control end and that node machine. The master control end sends the program's display information to the node machine, followed by the program name and the program. When finished, go to step (7).
(5) Determine whether a data file is to be distributed; if so, go to step (6); otherwise, go to step (1).
(6) Select a node machine at random from the node set and establish a connection between the master control end and that node machine. The master control end sends the data file's display information to the node machine, followed by the data file name and the data file. When finished, go to step (7).
(7) Determine whether all data files and programs have been sent; if so, finish; otherwise, go to step (3).
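The random distribution in steps (3) to (7) can be sketched as below. The function name `distribute` and the tuple layout are hypothetical, each item is sent to a single randomly chosen node, and the actual network transfer is left out; the source does not specify the random selection method, so a uniform choice is assumed.

```python
import random


def distribute(nodes, programs, data_files, rng=None):
    """Assign every program and data file to one randomly chosen node.

    Returns a list of (tag, item, node) tuples, where tag is "PA" for a
    program and "FA" for a data file.
    """
    if not nodes:
        raise ValueError("node set is empty")
    if rng is None:
        rng = random.Random()
    plan = []
    for program in programs:
        plan.append(("PA", program, rng.choice(nodes)))
    for data_file in data_files:
        plan.append(("FA", data_file, rng.choice(nodes)))
    return plan
```

Passing a seeded `random.Random` instance makes a distribution repeatable, which can be convenient when comparing evaluation runs.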
To ensure accurate results, the system goes through a trial-run period, generally 5 minutes, before assessment data are recorded. Only once the node machine simulators, the master controller 2 and the distributed software simulator in the distributed system are running stably does the recording of assessment data and assessment results begin. The present invention is explained in further detail below by way of a practical example.
The system of the present invention is installed at the "Internet and Cluster Computing Center"; the test environment is shown in Figure 5. The node machine simulator program is installed and run on each of several hosts of the center (internal network segment 192.168.1.0), which serve as the available resources of the distributed system. The distributed software simulator is installed on another machine (e.g. 192.161.1.91); it runs the simulated distributed software and randomly distributes the programs that make up this software, together with the data files they require, to the running node machines, thereby simulating the real environment in which distributed software runs. The reliability assessment device is installed on a further machine (IP 192.168.1.79). When run, it reads the structural information of the current distributed software from the configuration file, builds the distributed software system graph, collects the available information on the node machines in that graph (including the programs and the files the programs require), and tests the distributed software.
The concrete test procedure is as follows:
(1) Install and start the reliability assessment device. Copy the initialization file init.txt and the reliability assessment device program Reliability.exe to the E:\Reliability directory, then run Reliability.exe to start the reliability assessment device.
(2) Install and start the node machine simulators. Copy the node machine simulator program NodeEmulator.exe to the E:\Reliability directory on each node machine. On receiving the "Start" control command sent by the reliability assessment device, the node machine runs NodeEmulator.exe to start the node machine simulator. Only when the program on a node machine is running normally do the tasks assigned by the distributed software simulator have processor and memory resources available to complete them. One node machine simulator must be installed for each node machine in the simulated distributed system under evaluation.
(3) Install and start the distributed software simulator. Copy the program-list and file-list file list.txt and the distributed software simulator program DSEmulator.exe to the E:\Reliability directory. On receiving the "Start" control command sent by the reliability assessment device, DSEmulator.exe is run to start the distributed software simulator, which generates the programs and files named in list.txt and assigns them randomly to the node machines of the distributed system.
(4) In the reliability assessment device:
1) First select "Initialize → Restart DS" from the master control interface menu to restart the distributed system and redo the random assignment of programs and files.
2) Select "Initialize → From File" from the master control interface menu to read and edit the initialization file init.txt and obtain the distributed system graph. The information collection command "Begin" is then sent automatically to the node machine simulators to obtain the program and file distribution information.
3) Select "Test → Property" from the master control interface menu to change the time constraint factor, then select "Test → Start" to begin the evaluation.
4) When the evaluation has finished, select "Display → System Graph" from the master control interface menu to show the topology and details of the distributed software under test; or select "Display → Result Graph" to show the overall result of the evaluation as a bar chart, where clicking the number under a bar shows the reliability assessment graph of the corresponding branch; or select "Display → LogFile" to show the details of the test in text form, which helps the distributed system analyst analyze the results and improve the distributed system and software.
5) The user can select "Help" from the master control interface menu at any time to obtain help information.
6) After an evaluation has finished, select "Initialize → Restart DS" from the master control interface menu to restart the distributed system and redo the random assignment of programs and files, then return to step 1) for the next evaluation. Evaluations can be repeated in this way.
The above test used six hosts of the Internet and Cluster Computing Center as node machines, one host as the distributed software simulator and one host as the reliability assessment device. The node machine simulators and the distributed software simulator simulate the real environment in which distributed software runs. With this evaluation system, the reliability of a distributed software with a time constraint of 200 milliseconds (ms) was evaluated; the test results are shown in Table one and Table two.
Table one. Executable states of each program of the time-constrained distributed software on the node machines, and the corresponding mean time to failure (MTTF)
            State  MTTF (ms)    State  MTTF (ms)    State  MTTF (ms)    State  MTTF (ms)
PA 1:P 1    1      68.3333      2      55.0000      3      55.0000      4      66.6667
            5      66.6667      6      25.0000      7      50.0000      8      50.0000
            9      50.0000      10     50.0000      11     66.6667      12     50.0000
            13     50.0000
PA 2:P 2    1      131.6919     2      87.517624    3      87.517624    4      50.0000
PA 2:P 3    1      68.60523     2      43.56223     3      55.11448     4      60.228912
            5      35.545334    6      58.33334     7      25.316458    8      25.316458
            9      33.333366    10     50.00003     11     25.316458    12     50.000065
            13     25.316458    14     50.000065    15     50.00005
PA 3:P 3    1      52.11494     2      30.430899    3      43.56223     4      35.545334
            5      35.545334    6      33.333324    7      25.316458    8      25.316458
            9      25.316458    10     25.316458    11     33.333366    12     25.316458
PA 4:P 4    1      58.3333      2      38.333317    3      43.333313    4      43.333313
            5      51.66664     6      38.333317    7      34.999985    8      33.333324
            9      24.999987    10     33.333324    11     33.333324    12     24.999987
            13     24.999987    14     33.333324    15     33.333324    16     24.999987
            17     33.333324    18     24.999987    19     33.333366    20     33.333366
PA 5:P 5    1      109.58438    2      51.281986    3      109.58433    4      51.28204
            5      91.11619     6      51.281975    7      51.28202     8      51.28202
            9      91.11615     10     51.282005
PA 6:P 5    1      91.72598     2      25.316458    3      91.11619     4      91.1161
            5      91.11615
PA 6:P 6    1      56.392815    2      47.67582     3      25.316458    4      32.81376
Table two. Reliability of each program in the time-constrained distributed software (TCDS) and reliability of the TCDS
Program  Node machine  Reliability  Reliability of the program in the TCDS  Reliability of the TCDS
P 1      x 1           0.6941432    0.6937952                               0.28401545
P 2      x 2           0.8777859    0.8777859
P 3      x 2           0.6941418    0.9064509
         x 3           0.69413894
P 4      x 4           0.693063     0.693063
P 5      x 5           0.7801325    0.9515709
         x 6           0.7797352
P 6      x 6           0.7797335    0.7797335
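As a reading aid, the figures in Table two appear numerically consistent with a simple combination rule: a program placed on two node machines is treated as working if either copy works, and the TCDS works only if every program works. This rule is an inference from the numbers, not a statement of the patent's assessment model, and the names `NODE_RELIABILITY`, `program_reliability` and `system_reliability` are hypothetical. The check uses the per-node values from the third column, including 0.6941432 for P 1.

```python
from math import prod

# Per-node reliabilities copied from Table two: program -> values on its nodes.
NODE_RELIABILITY = {
    "P1": [0.6941432],
    "P2": [0.8777859],
    "P3": [0.6941418, 0.69413894],
    "P4": [0.693063],
    "P5": [0.7801325, 0.7797352],
    "P6": [0.7797335],
}


def program_reliability(values):
    """Assumed rule: a program works if at least one of its copies works."""
    return 1.0 - prod(1.0 - r for r in values)


def system_reliability(table):
    """Assumed rule: the software works only if every program works."""
    return prod(program_reliability(v) for v in table.values())


print(round(program_reliability(NODE_RELIABILITY["P3"]), 4))  # 0.9064
print(round(program_reliability(NODE_RELIABILITY["P5"]), 4))  # 0.9516
print(round(system_reliability(NODE_RELIABILITY), 4))         # 0.284
```

Under these assumptions the computed values agree with the tabulated 0.9064509, 0.9515709 and 0.28401545 to within rounding.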

Claims (3)

1. A distributed software reliability evaluation system with a time constraint, comprising a reliability assessment device (5) located at the master control end, node machine simulators (6) located on each node machine of the distributed system, and a distributed software simulator (7);
The reliability assessment device (5) comprises a master control interface module (8), an information collection module (9), an assessment algorithm module (10), a result display module (11) and a distributed system restart module (12); wherein the master control interface module (8) is used to initialize the system and to control the distributed system restart module (12), the information collection module (9), the assessment algorithm module (10) and the result display module (11); the distributed system restart module (12) is used, before an evaluation begins, to restart the distributed software simulator (7) and all node machine simulators (6), so that the distributed software simulator (7) and all node machine simulators (6) carry out the random scheduling of programs and the random distribution of files anew; the information collection module (9) is used to receive the control information sent by the master control interface module (8) and determine its mode of operation accordingly, to send detection information to the node machine simulators (6) and receive their feedback information, and to send this feedback information to the master control interface module (8) and the assessment algorithm module (10); the assessment algorithm module (10) is used to receive the configuration information sent by the master control interface module (8) and the feedback information sent by the information collection module (9), to calculate the reliability of the time-constrained distributed software using the time-constrained distributed software reliability assessment model, and to send the result to the result display module (11); the result display module (11) is used to display the information sent by the master control interface module (8) and the assessment algorithm module (10);
The node machine simulator (6) is used to: receive the detection information sent by the reliability assessment device (5); receive the executable programs and available data files sent by the distributed software simulator (7); store the received executable programs and available data files in a designated directory; run the received detection command and carry out information detection on the local machine; display the information of the local machine on its interface in real time; report the detection results to the reliability assessment device (5); and receive the "Clear" command sent by the reliability assessment device (5) and clear the executable program and available data file information on the node machine;
The distributed software simulator (7) is used to generate the programs in the program list and the data files in the file list and to assign them randomly to the node machine simulators (6) on the node machines; and to receive the "Send" command sent by the master controller (2) and randomly distribute the programs and data files anew.
2. The evaluation system according to claim 1, characterized in that: the node machine simulator (6) comprises a socket setup module (13), a message processing module (14) and a display module (15);
The socket setup module (13) is used to establish the port connections between the node machine simulator (6) and the reliability assessment device (5) and the distributed software simulator (7);
The message processing module (14) is used to receive the programs, data files and detection information sent by the socket setup module (13), to store the programs and data files in the designated directory, to detect the programs and data files on the current node machine according to the received detection information and determine whether they are executable programs or available data files, and to send the detection results to the socket setup module (13) and to the display module (15), which displays them.
3, evaluating system according to claim 1 and 2 is characterized in that: distributed software simulator (7) comprises initialization module (18), sets up socket module (16) and program and file distributing module (17);
Initialization module (18) is used to read in initialization files, and information sent to set up socket module (16), set up socket module (16) and set up connection between node machine simulator (6) and the distributed software simulator (7) according to this information, and initialization files are transmitted to executable program and available file distributing module (17), by program and file distributing module (17) branch of distributed software is dispatched to node machine simulator (6) randomly and goes up operation, data file is distributed on the node machine simulator (6) randomly.
CNB2004100614067A 2004-12-20 2004-12-20 Distribution type software reliability evaluation system having time restraint Expired - Fee Related CN1295615C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2004100614067A CN1295615C (en) 2004-12-20 2004-12-20 Distribution type software reliability evaluation system having time restraint

Publications (2)

Publication Number Publication Date
CN1624664A CN1624664A (en) 2005-06-08
CN1295615C true CN1295615C (en) 2007-01-17

Family

ID=34764510

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100614067A Expired - Fee Related CN1295615C (en) 2004-12-20 2004-12-20 Distribution type software reliability evaluation system having time restraint

Country Status (1)

Country Link
CN (1) CN1295615C (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5446800B2 (en) * 2009-12-04 2014-03-19 ソニー株式会社 Information processing apparatus, information processing method, and program
CN104503916A (en) * 2015-01-05 2015-04-08 中国石油大学(华东) Quantitative evaluation method for availability of system interface
CN106294511B (en) * 2015-06-10 2019-07-02 ***通信集团广东有限公司 A kind of storage method and device of Hadoop distributed file system
CN108959098B (en) * 2018-07-20 2021-11-05 大连理工大学 System and method for testing deadlock defects of distributed system program
CN110134407B (en) * 2019-05-21 2020-07-24 中电莱斯信息***有限公司 Configurable distributed software distribution method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001097034A1 (en) * 2000-06-14 2001-12-20 Seiko Epson Corporation Automatic evaluation method and automatic evaluation system and storage medium storing automatic evaluation program
CN1472652A (en) * 2003-07-17 2004-02-04 中国科学院计算技术研究所 Software breakdown testing method for dynamic resouce management
CN1490724A (en) * 2002-10-18 2004-04-21 上海贝尔有限公司 Virtual machine for embedded systemic software development

Also Published As

Publication number Publication date
CN1624664A (en) 2005-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20070117

Termination date: 20100120