JP7275922B2 - Information processing device, anomaly detection method and program

Info

Publication number: JP7275922B2
Application number: JP2019121104A
Authority: JP
Inventors: 厚大堀; 理若林; 貴司水上
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2019-06-28
Filing date: 2019-06-28
Publication date: 2023-05-18
Anticipated expiration: 2039-06-28
Also published as: JP2021006970A

Description

The present invention relates to an information processing apparatus equipped with a storage device such as a hard disk device, and more particularly to an information processing apparatus that detects an abnormal state of the storage device such as a failure, an abnormality detection method for the information processing apparatus, and a program for the abnormality detection method.

In an information processing apparatus such as a computer, abnormal states of the installed hard disk device, so-called hard disk failures, can cause problems such as bad sectors and file system failures. Such abnormal states include aging deterioration, human-caused faults such as accidental file deletion, destruction of files by computer viruses, and file system inconsistencies.

For example, Patent Document 1 discloses an information processing apparatus that, in order to detect hard disk failures reliably, executes write processing on a hard disk device, monitors those writes to track the state of the hard disk device, and determines that the hard disk device has failed when the write processing fails a certain number of times.

Patent Document 1: JP 2014-235503 A

In general, when a hard disk device fails, the damaged data itself can be recovered by using a disk redundancy technique typified by RAID (Redundant Arrays of Inexpensive Disks).

However, even if a hard disk device fails or data is lost, the information processing apparatus neither notifies user applications or the OS nor autonomously executes a recovery action. Consequently, a hard disk failure goes undetected until the device is tested with an inspection tool, such as a disk check like fsck or a disk health diagnostic typified by S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), and the abnormal state is found.

Furthermore, if the hard disk failure occurs in a directory area or file that a user application accesses, it can cause a wide variety of problems such as processing malfunctions, access delays, and segmentation faults.

In information processing apparatuses that are required to operate continuously 24 hours a day, such as servers and switching-system transmission equipment, there is concern that a hard disk failure will make it impossible to provide stable service. This can be a fatal problem particularly in systems with a single hard disk configuration without disk redundancy, such as blade servers.

Accordingly, an object of the present invention is to provide an information processing apparatus, an abnormality detection method, and a program capable of accurately and quickly detecting an abnormal state of a storage device.

An information processing apparatus of the present invention is an information processing apparatus that detects an abnormality of a storage device, comprising: a mount state monitoring unit that checks, at every predetermined period, whether the mount state of a file system has changed for each monitored partition of the storage device; a directory confirmation unit that checks, at every predetermined period, whether a directory exists in the storage device for each partition; an access confirmation unit that checks, at every predetermined period, whether a file can be written and deleted under the directory whose existence has been confirmed by the directory confirmation unit; an abnormality detection unit that determines that an abnormality of the storage device has been detected when any one of the following occurs: the mount state monitoring unit confirms a change in the mount state, the directory confirmation unit confirms that the directory does not exist, or the access confirmation unit confirms that the file cannot be written or deleted; a counter update unit that updates the count value of an execution counter when all of the following occur: the mount state monitoring unit confirms that the mount state is unchanged, the directory confirmation unit confirms that the directory exists, and the access confirmation unit confirms that the file can be written and deleted; and a count update confirmation unit that checks, at every predetermined period, whether the count value of the execution counter has been updated, wherein the abnormality detection unit determines that an abnormality of the storage device has been detected when the count update confirmation unit does not confirm an update of the count value of the execution counter.

An abnormality detection method of the present invention is an abnormality detection method for an information processing apparatus that detects an abnormality of a storage device, the method comprising: a mount state monitoring step of checking, at every predetermined period, whether the mount state of a file system has changed for each monitored partition of the storage device; a directory confirmation step of checking, at every predetermined period, whether a directory exists in the storage device for each partition; a disk access confirmation step of checking, at every predetermined period, whether a file can be written and deleted under the directory whose existence has been confirmed in the directory confirmation step; an abnormality detection step of determining that an abnormality of the storage device has been detected when any one of the following occurs: a change in the mount state is confirmed in the mount state monitoring step, non-existence of the directory is confirmed in the directory confirmation step, or it is confirmed in the disk access confirmation step that the file cannot be written or deleted; a counter update step of updating the count value of an execution counter when all of the following occur: the mount state is confirmed to be unchanged in the mount state monitoring step, the directory is confirmed to exist in the directory confirmation step, and the file is confirmed to be writable and deletable in the disk access confirmation step; a count update confirmation step of checking, at every predetermined period, whether the count value of the execution counter has been updated; and a determination step of determining that an abnormality of the storage device has been detected when an update of the count value of the execution counter is not confirmed in the count update confirmation step.

A program of the present invention is a program for an abnormality detection method of an information processing apparatus that detects an abnormality of a storage device, the program causing a computer to execute: a mount state monitoring step of checking, at every predetermined period, whether the mount state of a file system has changed for each monitored partition of the storage device; a directory confirmation step of checking, at every predetermined period, whether a directory exists in the storage device for each partition; a disk access confirmation step of checking, at every predetermined period, whether a file can be written and deleted under the directory whose existence has been confirmed in the directory confirmation step; an abnormality detection step of determining that an abnormality of the storage device has been detected when any one of the following occurs: a change in the mount state is confirmed in the mount state monitoring step, non-existence of the directory is confirmed in the directory confirmation step, or it is confirmed in the disk access confirmation step that the file cannot be written or deleted; a counter update step of updating the count value of an execution counter when all of the following occur: the mount state is confirmed to be unchanged in the mount state monitoring step, the directory is confirmed to exist in the directory confirmation step, and the file is confirmed to be writable and deletable in the disk access confirmation step; a count update confirmation step of checking, at every predetermined period, whether the count value of the execution counter has been updated; and a determination step of determining that an abnormality of the storage device has been detected when an update of the count value of the execution counter is not confirmed in the count update confirmation step.

According to the information processing apparatus, abnormality detection method, and program of the present invention, an abnormality of the storage device is determined to have been detected when a change in the mount state of the file system is confirmed, when the non-existence of a directory on the hard disk device is confirmed, or when it is confirmed that a file cannot be written or deleted. An abnormal state of the storage device can therefore be detected accurately and quickly.

FIG. 1 is a block diagram showing the configuration of an information processing apparatus according to the present invention.
FIG. 2 is a block diagram showing the units implemented by the operation of the CPU in FIG. 1.
FIG. 3 is a flowchart showing the hard disk failure monitoring process.
FIG. 4 is a flowchart showing the execution counter monitoring process.
FIG. 5 is a flowchart showing the mount state check.
FIG. 6 is a diagram illustrating the contents of a /proc/mounts file.
FIG. 7 is a flowchart showing the disk access check.
FIG. 8 is a flowchart showing the abnormality detection action and the Watchdog monitoring process together with the operation timing of the hard reset by IPMI.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 shows the configuration of an information processing apparatus 10 according to the present invention. The information processing apparatus 10 includes a CPU (Central Processing Unit) 11, a storage unit 12, a communication unit 13, an input/output interface 14, a hard disk device (hard disk drive) 15, an input device 16, an output device 17, a bus 18, IPMI (Intelligent Platform Management Interface, a hardware management interface) hardware 19, and a power supply device 20. The CPU 11, the storage unit 12, the communication unit 13, the input/output interface 14, and the IPMI hardware 19 are each connected to the bus 18. The hard disk device 15, the input device 16, and the output device 17 are connected to the input/output interface 14.

The CPU 11 is a control unit that controls the entire information processing apparatus 10 according to software including an OS (operating system). The storage unit 12 is a memory such as a RAM (Random Access Memory) and has an area in which the processing programs of the CPU 11 and various data are loaded or held. The communication unit 13 transmits and receives data to and from other devices via a network (not shown). The input/output interface 14 exchanges data with each of the hard disk device 15, the input device 16, and the output device 17. The input/output interface 14 has an interface of the type appropriate to each connected device.

The hard disk device 15 is a storage device that uses a disk such as a magnetic disk, an optical disk, or a magneto-optical disk. In this embodiment the hard disk device 15 is provided inside the information processing apparatus 10, but it may be connected externally. A hard disk failure monitoring program is installed on the hard disk device 15 together with the OS. In this embodiment, the OS is Linux (registered trademark). The hard disk failure monitoring program is started by the CPU 11 as a resident program when the OS boots, and it repeatedly monitors the hard disk device 15 for failures at predetermined execution timings, as described later.

The input device 16 includes a keyboard and a mouse. The output device 17 includes a display device.

The IPMI hardware 19 internally has a watchdog timer 19a as its Watchdog function. Using the Watchdog function, the IPMI hardware 19 detects whether the watchdog timer 19a has been reset within a predetermined timer time, and if it has not been reset, issues a hard reset to the power supply device 20. The power supply device 20 supplies power to each device in the information processing apparatus 10, including the CPU 11. On a hard reset it stops the power supply once and then supplies power again.

As operations of the hard disk failure monitoring program, the CPU 11 performs a mount state check, a disk access check, and a processing thread execution state check. The mount state check confirms that the mount state of the file system is normal. The disk access check confirms that no access abnormality, such as being unable to write, has occurred. The processing thread execution state check uses an execution counter, described later, to confirm that the processing thread executing the mount state check and the disk access check has not frozen (stopped processing) because of an access abnormality or the like.

The hard disk failure monitoring program includes a main routine and an execution counter monitoring routine. The main routine includes the mount state check, the disk access check, the abnormality detection action, and an execution counter update step. The disk access check includes a directory existence check, a file write check, and a file deletion check. The execution counter monitoring routine includes the processing thread execution state check.

As shown in FIG. 2, the CPU 11 implements a mount state monitoring unit 31 by executing the mount state check of the main routine, a directory confirmation unit 32 by executing the directory existence check, an access confirmation unit 33 by executing the file write check and the file deletion check, and an abnormality detection unit 34 by executing the abnormality detection action. The CPU 11 also implements a counter update unit 35 by executing the execution counter update step, and a counter update confirmation unit 36 by executing the execution counter monitoring routine.

Next, the general operation when the CPU 11 executes the hard disk failure monitoring program is described. In the main routine, as shown in FIG. 3, the routine first waits for a predetermined execution timing in step S101, and then the mount state check is executed (step S102). After the mount state check is executed, its result is evaluated (step S103).

If the result of the mount state check is normal, the disk access check is executed (step S104). After the disk access check is executed, its result is evaluated (step S105).

If the result of the disk access check is normal, the execution counter is incremented (step S106). The initial value of the execution counter is 0, and each increment raises the count value by, for example, 1. Execution then repeats from step S101.

On the other hand, if the result of the mount state check or the result of the disk access check is abnormal, a hard disk failure has been detected, so the abnormality detection action is executed (step S107). In the abnormality detection action, the CPU 11 causes the output device 17 to display that a hard disk failure has been detected.
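The control flow of the main routine in FIG. 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `ExecutionCounter` and `run_main_routine`, the injected check callables, and the `max_cycles` bound (added only so the example terminates) are all assumptions. Returning after the abnormality detection action is also an assumption, made because that action goes on to reboot the OS.

```python
import time
from typing import Callable


class ExecutionCounter:
    """Hypothetical holder for the execution counter (initial value 0)."""

    def __init__(self) -> None:
        self.value = 0


def run_main_routine(
    check_mount: Callable[[], bool],
    check_disk_access: Callable[[], bool],
    on_abnormal: Callable[[], None],
    counter: ExecutionCounter,
    interval_s: float = 0.0,
    max_cycles: int = 3,
) -> None:
    """Run the S101-S107 loop for at most max_cycles cycles."""
    for _ in range(max_cycles):
        time.sleep(interval_s)          # S101: wait for the execution timing
        if not check_mount():           # S102/S103: mount state check
            on_abnormal()               # S107: abnormality detection action
            return                      # assumption: stop after the action
        if not check_disk_access():     # S104/S105: disk access check
            on_abnormal()               # S107
            return
        counter.value += 1              # S106: increment the execution counter
```

Passing the checks in as callables keeps the loop independent of how each check is implemented, which also makes the flow easy to exercise with stubs.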

The CPU 11 executes the execution counter monitoring routine as a task separate from the main routine of the hard disk failure monitoring program.

In the execution counter monitoring routine, as shown in FIG. 4, the routine waits for a predetermined execution timing in step S111, and then an execution counter update check is executed (step S112). If the count value of the execution counter has been incremented from its previous value, that is, if step S106 of the main routine has been executed, the result is judged normal (step S113). In that case, execution repeats from step S111. If, on the other hand, the count value has not been incremented and remains at its previous value, the result is judged abnormal (step S113). In that case, some fault is assumed to have occurred in the processing of the main routine of the hard disk failure monitoring program, and the abnormality detection action is executed (step S114).
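The update check of steps S112 and S113 amounts to comparing the current count value with the value seen at the previous check. The sketch below illustrates this under stated assumptions: the class name `ExecutionCounterMonitor` is hypothetical, and the sketch assumes the monitoring interval is long enough for the main routine to complete at least one cycle between checks, since otherwise a healthy system would be flagged.

```python
class ExecutionCounterMonitor:
    """Hypothetical helper that remembers the count value seen at the last check."""

    def __init__(self, initial: int = 0) -> None:
        self._previous = initial

    def check(self, current: int) -> bool:
        """Return True (normal) if the counter advanced since the last check.

        Corresponds to S112/S113; a False result would lead to the
        abnormality detection action of S114.
        """
        advanced = current > self._previous
        self._previous = current
        return advanced
```

Because the monitor runs as a separate task, it can still report an abnormality when the main routine is blocked inside a stalled disk operation and never reaches its own error path.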

Next, the mount state check of step S102 is described in detail. As shown in FIG. 5, a mount information management file is first read from the storage unit 12 (step S121). The mount information management file is created in advance, per monitored disk partition, from the configuration information in a settings file, and is stored in the storage unit 12. As mount information, the file contains a device, a disk partition, a file system type, and mount options for each disk partition, and reading the file in step S121 yields this mount information. After step S121, the /proc/mounts file, a system file of the Linux kernel, is read (step S122). The /proc/mounts file lists all current mount information, for example as shown in FIG. 6.

After the /proc/mounts file is read, a string comparison between the mount information and the contents of the /proc/mounts file is performed for each monitored disk partition (step S123). Each line of the /proc/mounts file generally lists a device, a disk partition (mount point), a file system type, and mount options, in that order. The mount options contain either "rw", indicating a read-write mount, or "ro", indicating a read-only mount. In the line marked A in the /proc/mounts file of FIG. 6, the device is "/dev/sda11", the disk partition is "/var/crash", the file system type is "ext3", and the mount option is "rw". In step S123, the disk partition and the file system type are used as search keywords. If the /proc/mounts file has a line whose keywords match those of the mount information, the mount options on that line are compared to see whether they have changed. Based on this comparison, step S103 of FIG. 3 judges the mount state check result as normal for the hard disk device 15 if the mount options have not changed, and as abnormal if they have changed.

Note that a change in the mount state is not limited to a change between "rw", indicating a read-write mount, and "ro", indicating a read-only mount. It may also be a change in whether the mount exists or a change in access rights.
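The comparison of step S123 can be sketched as a small parser plus a lookup keyed on mount point and file system type. The names `MountInfo`, `parse_mounts`, `access_mode`, and `mount_state_ok` are hypothetical, and the sample lines in the usage note are illustrative rather than taken from FIG. 6. Treating a missing entry as abnormal is an assumption that follows the variation noted above, in which the presence or absence of a mount also counts as a change.

```python
from typing import List, NamedTuple, Optional


class MountInfo(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: str


def parse_mounts(text: str) -> List[MountInfo]:
    """Parse /proc/mounts-style text: device, mount point, fs type, options, ..."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            entries.append(MountInfo(*fields[:4]))
    return entries


def access_mode(options: str) -> Optional[str]:
    """Return "rw" or "ro" from a comma-separated option string, else None."""
    for opt in options.split(","):
        if opt in ("rw", "ro"):
            return opt
    return None


def mount_state_ok(expected: MountInfo, current: List[MountInfo]) -> bool:
    """True if the expected partition is mounted with an unchanged rw/ro mode."""
    for entry in current:
        if (entry.mount_point, entry.fs_type) == (expected.mount_point, expected.fs_type):
            return access_mode(entry.options) == access_mode(expected.options)
    return False  # assumption: a missing mount is treated as abnormal
```

For instance, with an expected entry of `MountInfo("/dev/sda11", "/var/crash", "ext3", "rw")`, a current line such as `/dev/sda11 /var/crash ext3 ro,relatime 0 0` would be reported as a changed mount state, which is the typical symptom when the kernel remounts a failing file system read-only.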

Next, the disk access check of step S104 is described in detail. As shown in FIG. 7, the existence of a directory is checked first (step S131). In step S131, the monitored disk partition is obtained from the mount information, and it is determined whether the directory of that disk partition exists, for example by whether the directory can be accessed. The result of that determination is then evaluated (step S132). If the directory exists, the file write check is executed (step S133). In the file write check of step S133, a test file is written to the directory. The test file should be small enough to be written to the directory easily, for example about 1 byte. After the file write check, it is determined whether the test file was written successfully (step S134). If the test file has been saved in the directory, the write succeeded, so the file deletion check is executed next (step S135). In the file deletion check, the written test file is deleted. After the deletion, it is determined whether the deletion succeeded (step S136). If the test file no longer exists in the directory, the deletion succeeded, and the disk access check ends.

On the other hand, if the directory is found not to exist in step S132, if writing the test file is found to have failed in step S134, or if the deletion of the test file performed in step S135 is found to have failed in step S136, the hard disk device 15 is in an abnormal state. Some fault is assumed to have occurred in the disk access check, and the abnormality detection action is executed (step S137). The determinations in steps S132, S134, and S136 correspond to the result evaluation of step S105 that follows the disk access check in FIG. 3.
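The sequence of FIG. 7 can be sketched with standard file operations as follows. The function name `disk_access_ok` and the probe file name are hypothetical. The call to `os.fsync` is an assumption added so that the write is pushed to the device rather than satisfied from the page cache; the text itself only requires that the test file be written and then deleted.

```python
import os


def disk_access_ok(directory: str, probe_name: str = ".hdd_probe.tmp") -> bool:
    """Return True if the directory exists and a 1-byte file can be written and deleted."""
    if not os.path.isdir(directory):        # S131/S132: directory existence check
        return False
    probe = os.path.join(directory, probe_name)
    try:
        with open(probe, "wb") as f:        # S133: write a 1-byte test file
            f.write(b"\0")
            f.flush()
            os.fsync(f.fileno())            # assumption: force the write to disk
    except OSError:
        return False                        # S134: write failed
    if not os.path.exists(probe):           # S134: file was not saved
        return False
    try:
        os.remove(probe)                    # S135: delete the test file
    except OSError:
        return False                        # S136: deletion failed
    return not os.path.exists(probe)        # S136: file must be gone
```

A False result from any stage corresponds to the abnormal branch leading to step S137. Note that a hung device may block these calls indefinitely instead of raising an error, which is the case the separate execution counter monitoring is meant to catch.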

The abnormality detection actions of steps S107, S114, and S137 execute the same processing. Specifically, as shown in FIG. 8, an abnormality notification is first sent by SNMP trap to the maintainer's computer (an external terminal) (step S141). This transmission informs the maintainer of the system that includes the information processing apparatus 10 that the apparatus is in an abnormal state. After step S141, a command to stop the Watchdog monitoring process is issued (step S142), and then a command to execute an OS reboot is issued (step S143).

The CPU 11 executes a Watchdog monitoring program as a task separate from the hard disk failure monitoring program. By executing the Watchdog monitoring program, the CPU 11 sends a timer reset signal to the IPMI hardware 19 at a predetermined repetition period (steps S151 and S152). The IPMI hardware 19 incorporates the timer 19a of the Watchdog function as hardware or software, and the timer 19a measures a predetermined timer time. In the IPMI hardware 19, the timer 19a is reset in response to the timer reset signal and measures the predetermined timer time again from its initial value (steps S161 and S162). The predetermined timer time is longer than the predetermined repetition period at which the timer reset signal is sent. If the IPMI hardware 19 does not receive the timer reset signal and the timer 19a therefore finishes measuring the predetermined timer time, the IPMI hardware 19 commands the power supply device 20 to perform a hard reset (step S163).
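The timer behavior described above can be modeled in software as a deadline that each timer reset signal pushes forward. This is only a model of the logic, not an IPMI interface: the class name `WatchdogTimer` and its methods are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class WatchdogTimer:
    """Hypothetical software model of watchdog timer 19a."""

    def __init__(self, timeout_s: float, now: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self._deadline = now + timeout_s

    def kick(self, now: float) -> None:
        """Timer reset signal received (S161/S162): restart the countdown."""
        self._deadline = now + self.timeout_s

    def expired(self, now: float) -> bool:
        """True once the timer time has elapsed without a reset (leads to S163)."""
        return now >= self._deadline
```

With a timer time of 10 seconds and a reset at t = 4, the model does not expire at t = 12 but does at t = 14. As long as resets arrive at a period shorter than the timer time, the deadline is never reached, which matches the requirement that the timer time exceed the repetition period.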

For the Watchdog monitoring process stop command of step S142, a process stop signal such as SIGKILL is sent, and in response the Watchdog monitoring process stops sending the timer reset signal (step S153). Immediately after that stop, the OS reboot command of step S143 causes the information processing apparatus 10 to restart the OS. The CPU 11 shuts down the currently running OS and then restarts it. If the OS restart completes without abnormality, the CPU 11 also runs the Watchdog monitoring program again and therefore sends the timer reset signal to the IPMI hardware 19 at the predetermined repetition timing. Thus, as long as the OS restarts normally, the timer of the Watchdog function never finishes measuring the predetermined timer time.

However, if the hard disk device 15 has failed, the OS reboot processing of step S143 may be greatly delayed or may not execute at all, so the timer 19a of the Watchdog function may finish measuring the predetermined timer time before the OS restarts. When the measurement of the predetermined timer time finishes, the hard reset of step S163 is commanded. In response to the hard reset command, the power supply device 20 stops the power supply once and then supplies power again. As a result, the OS of the information processing apparatus 10 is restarted.
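The two recovery paths described above reduce to a comparison between the timer time and the time the OS reboot needs before timer reset signals resume. The sketch below states that comparison explicitly. The function name `recovery_outcome`, the string labels, and the use of `None` for a reboot that never completes are all hypothetical, and the sketch assumes both durations are measured from the last timer reset signal sent before the Watchdog monitoring process was stopped.

```python
from typing import Optional


def recovery_outcome(timer_time_s: float, reboot_time_s: Optional[float]) -> str:
    """Return which path restarts the system after the abnormality detection action.

    reboot_time_s is the time until timer reset signals resume after the OS
    reboot, or None if the reboot never completes.
    """
    if reboot_time_s is not None and reboot_time_s < timer_time_s:
        return "os_reboot"      # resets resume before the timer expires
    return "hard_reset"         # timer expires first: S163 power cycle
```

This makes the sizing constraint visible: the timer time has to be longer than a normal OS reboot, or every reboot would end in a hard reset.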

Thus, in this embodiment, if an OS reboot has been commanded but is not actually executed normally, the timer 19a of the Watchdog function finishes measuring the predetermined timer time, and the information processing apparatus 10 is forcibly restarted by a hard reset that first turns the power off. In other words, because the hard reset function of the IPMI hardware 19 is implemented, the information processing apparatus 10 can forcibly perform a hard reset by itself even if the OS reboot processing fails during the abnormality detection action. This prevents user programs from continuing to run in the abnormal state of a hard disk failure and allows the apparatus to recover to a normal state autonomously.

Furthermore, even if the processing of the main routine of the hard disk failure monitoring program itself stops because the disk access check in the main routine is blocked by a failure of the hard disk device 15, the hard disk failure can still be detected by checking the count value of the execution counter in step S112.
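The execution counter check can be pictured as a heartbeat comparison: a separate checker remembers the last count value it saw and reports a stall when the value has not moved since the previous check. The sketch below assumes this simple comparison; the class name `ExecutionCounterCheck` is hypothetical, and the embodiment's exact counter handling may differ.

```python
class ExecutionCounterCheck:
    """Detects a stalled main routine by watching its execution counter."""

    def __init__(self, initial: int = 0) -> None:
        self.last_seen = initial

    def stalled(self, current: int) -> bool:
        """Return True if the counter was not updated since the last check."""
        is_stalled = current == self.last_seen
        self.last_seen = current
        return is_stalled
```

A main routine that completes its checks increments the counter each cycle, so `stalled` returns `False`; if the routine hangs on disk access, the counter stops changing and the next check returns `True`.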

In addition, in the embodiment described above, whether the mount state of the file system has changed, whether the directory exists, and whether a file can be written and deleted are checked repeatedly, so an abnormal state of the hard disk device can be detected accurately and quickly.
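One cycle of the three checks might look like the following sketch. It is an assumption-laden illustration rather than the embodiment's implementation: `os.path.ismount` stands in for the mount state check, the probe file name `.disk_check_probe` is made up, and the function name `check_partition` and its return strings are hypothetical.

```python
import os


def check_partition(directory: str, was_mounted: bool) -> str:
    """Run the three checks in order; return 'ok' or the first failure found."""
    # 1. Mount state: compare against the state recorded when monitoring began.
    if os.path.ismount(directory) != was_mounted:
        return "mount_state_changed"
    # 2. Directory existence.
    if not os.path.isdir(directory):
        return "directory_missing"
    # 3. File write and delete under the confirmed directory.
    probe = os.path.join(directory, ".disk_check_probe")
    try:
        with open(probe, "w") as f:
            f.write("probe")
            f.flush()
            os.fsync(f.fileno())  # push the write through to the device
        os.remove(probe)
    except OSError:
        return "write_or_delete_failed"
    return "ok"
```

A monitoring loop would call this once per monitored partition every period, treat any result other than `"ok"` as an abnormality, and update the execution counter only when every partition returns `"ok"`.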

Although the embodiment described above uses a hard disk device as the storage device, the present invention is not limited to this and can also be applied to an information storage device equipped with a storage device other than a disk, such as an SSD (Solid State Drive) that uses semiconductor memory.

Also, in the embodiment described above, the information storage device is provided with the mount state monitoring unit, the directory confirmation unit, and the access confirmation unit, but only one of these may be provided. The access confirmation unit checks whether a file can be written and deleted under the directory whose existence has been confirmed by the directory confirmation unit; in an information storage device provided with only the access confirmation unit, this check is performed on the assumption that the directory of the monitored partition exists.

10 Information processing apparatus
11 CPU
12 Storage unit
13 Communication unit
14 Input/output interface
15 Hard disk device
16 Input device
17 Output device
18 Bus
19 IPMI hardware
19a Watchdog timer
20 Power supply device
31 Mount state monitoring unit
32 Directory confirmation unit
33 Access confirmation unit
34 Abnormality detection unit
35 Counter update unit
36 Counter update confirmation unit

Claims

1. An information processing apparatus that detects an abnormality of a storage device, comprising:
a mount state monitoring unit that checks, at every predetermined period, whether the mount state of a file system has changed for each monitored partition of the storage device;
a directory confirmation unit that checks, at every predetermined period, whether a directory exists in the storage device for each partition;
an access confirmation unit that checks, at every predetermined period, whether a file can be written and deleted under the directory whose existence has been confirmed by the directory confirmation unit;
an abnormality detection unit that determines that an abnormality of the storage device has been detected when any one of the following occurs: the mount state monitoring unit confirms a change in the mount state, the directory confirmation unit confirms that the directory does not exist, or the access confirmation unit confirms that the file cannot be written or deleted;
a counter update unit that updates the count value of an execution counter when all of the following occur: the mount state monitoring unit confirms that the mount state has not changed, the directory confirmation unit confirms that the directory exists, and the access confirmation unit confirms that the file can be written and deleted; and
a count update confirmation unit that checks, at every predetermined period, whether the count value of the execution counter has been updated,
wherein the abnormality detection unit determines that an abnormality of the storage device has been detected when the count update confirmation unit does not confirm an update of the count value of the execution counter.

2. The information processing apparatus according to claim 1, wherein the abnormality detection unit transmits an abnormality occurrence notification to a terminal external to the information processing apparatus when it determines that an abnormality of the storage device has been detected.

3. The information processing apparatus according to claim 1 or 2, further comprising a control unit that boots an operating system stored in the storage device,
wherein the abnormality detection unit commands the control unit to restart the operating system when it determines that an abnormality of the storage device has been detected.

4. The information processing apparatus according to claim 3, further comprising:
a watchdog monitoring processing unit that transmits a timer reset signal at a predetermined repetition period; and
a hard reset unit including a watchdog timer that is reset in response to the timer reset signal and measures, from an initial value, a predetermined timer time longer than the predetermined repetition period,
wherein the abnormality detection unit stops the watchdog monitoring processing unit from transmitting the timer reset signal when it determines that an abnormality of the storage device has been detected, and
the hard reset unit, when the watchdog timer finishes measuring the predetermined timer time, executes a hard reset that forcibly stops, once, the supply of power from the power supply device of the information processing apparatus into the information processing apparatus and then resumes the power supply of the power supply device so as to cause the control unit to restart the operating system.

5. An abnormality detection method for an information processing apparatus that detects an abnormality of a storage device, the method comprising:
a mount state monitoring step of checking, at every predetermined period, whether the mount state of a file system has changed for each monitored partition of the storage device;
a directory confirmation step of checking, at every predetermined period, whether a directory exists in the storage device for each partition;
a disk access confirmation step of checking, at every predetermined period, whether a file can be written and deleted under the directory whose existence has been confirmed in the directory confirmation step;
an abnormality detection step of determining that an abnormality of the storage device has been detected when any one of the following occurs: a change in the mount state is confirmed in the mount state monitoring step, non-existence of the directory is confirmed in the directory confirmation step, or impossibility of writing or deleting the file is confirmed in the disk access confirmation step;
a counter update step of updating the count value of an execution counter when all of the following occur: no change in the mount state is confirmed in the mount state monitoring step, existence of the directory is confirmed in the directory confirmation step, and possibility of writing and deleting the file is confirmed in the disk access confirmation step;
a count update confirmation step of checking, at every predetermined period, whether the count value of the execution counter has been updated; and
a determination step of determining that an abnormality of the storage device has been detected when no update of the count value of the execution counter is confirmed in the count update confirmation step.

6. A program for an abnormality detection method of an information processing apparatus that detects an abnormality of a storage device, the program causing a computer to execute:
a mount state monitoring step of checking, at every predetermined period, whether the mount state of a file system has changed for each monitored partition of the storage device;
a directory confirmation step of checking, at every predetermined period, whether a directory exists in the storage device for each partition;
a disk access confirmation step of checking, at every predetermined period, whether a file can be written and deleted under the directory whose existence has been confirmed in the directory confirmation step;
an abnormality detection step of determining that an abnormality of the storage device has been detected when any one of the following occurs: a change in the mount state is confirmed in the mount state monitoring step, non-existence of the directory is confirmed in the directory confirmation step, or impossibility of writing or deleting the file is confirmed in the disk access confirmation step;
a counter update step of updating the count value of an execution counter when all of the following occur: no change in the mount state is confirmed in the mount state monitoring step, existence of the directory is confirmed in the directory confirmation step, and possibility of writing and deleting the file is confirmed in the disk access confirmation step;
a count update confirmation step of checking, at every predetermined period, whether the count value of the execution counter has been updated; and
a determination step of determining that an abnormality of the storage device has been detected when no update of the count value of the execution counter is confirmed in the count update confirmation step.