KR100938676B1 - Event priority level setting method

Info

Publication number: KR100938676B1
Application number: KR1020080073644A
Authority: KR
Inventors: 임태환
Original assignee: 임태환
Priority date: 2008-07-28
Filing date: 2008-07-28
Publication date: 2010-01-25

Abstract

PURPOSE: A method for determining the failure level of an operating system is provided that determines the failure level accurately by taking environment variables into account, thereby securing stable system operation. CONSTITUTION: Event information is checked to confirm whether a failure event has occurred (S101~S107). The components of the failure event are subdivided (S109). Field-specific environment variables suited to each component are set (S111). The type of the failure event is defined (S113). The environment variables are applied to the defined type (S115). The final failure level is determined (S117).

Description

Method for determining the failure level of an operating system {Event Priority Level Setting Method}

The present invention relates to a method for determining the failure level of an operating system, that is, for accurately determining the failure level (importance) when a failure occurs in an operating system (business processing equipment, software, systems, business processes, and so on) responsible for operations in IT (information technology) system management or in industry.

In general, IT system management mostly relies on operating systems such as computer systems running a given OS. When an internal software failure, a business-process failure, or a hardware failure occurs in such an operating system, operation of the entire system is severely disrupted, so it is very important to detect the failure promptly and determine its failure level accurately.

Likewise, in industries that mass-produce products, and in production facilities such as factories in particular, mass production is possible only while the facility stays in normal operation, and keeping product quality consistently uniform is very important. In these industries too, it is therefore important to determine quickly whether a piece of equipment, or software or an OS, has failed, in order to sustain production and maintain quality.

When a failure event occurs, accurately determining its failure level (importance) is a key factor in handling it. Follow-up actions such as the alarm level, the handling priority, and the estimation of expected damage are decided according to the failure level, so the level must be determined accurately when a failure occurs.

In IT system management and in industry generally, when a failure event occurs because operating equipment malfunctions or behaves abnormally, the conventional method determines the failure level as follows: it receives an event (Event) from the operating system to decide whether a failure has occurred and, if so, assigns the importance (failure level) designated in advance for that failure event type (Event_id).

That is, an importance is defined in advance in 1:1 correspondence with each event type, according to the event type (Event_ID) and event description (Event_description) information, and when a failure event occurs the corresponding importance is assigned directly as the failure level.

However, this conventional technique does not take into account the compound circumstances the system operator actually faces (time of the event, failed resource, operating hours, planned outages, and so on), so its result sometimes differs greatly from the actual operational importance.

In addition, the failure level derived by the system often differs from the level a person would derive, so the importance frequently has to be rechecked by a person. When that recheck is skipped, an important failure event may be overlooked and a major accident may result.

The present invention is therefore proposed to solve the problems of the conventional failure level determination method described above.

The problem to be solved by the present invention is to provide a method for determining the failure level of an operating system that accurately determines the failure level (importance) when a failure occurs in an operating system (business processing equipment, software, systems, business processes, and so on) responsible for operations in IT (information technology) system management or in industry.

Another problem to be solved by the present invention is to provide a method for determining the failure level of an operating system that determines the correct level of a failure event occurring in the operating system and thereby prompts appropriate handling, securing stable system operation.

To solve the above problems, a preferred embodiment of the "method for determining the failure level of an operating system" according to the present invention is

a method that is connected to an operating system and determines the failure level when a failure occurs in the operating system, characterized by comprising:

a first step of establishing a connection with the operating system and then receiving event information generated by the operating system to check whether it is a failure event;

a second step of subdividing the components of the failure event when the check shows a failure event;

a third step of setting field-specific environment variables suited to the components and defining the type of the failure event; and

a fourth step of determining the final failure level by applying the environment variables to the defined type.

Here, in the present invention,

the event description part of the failure event components is preferably subdivided into resources, numeric fields, and general fields, and

the environment variables for each component include weekday_operation_start_time, weekday_operation_end_time, weekday_operating_hours_importance, weekday_non_operating_hours_importance, weekend_operation_start_time, weekend_operation_end_time, weekend_operating_hours_importance, weekend_non_operating_hours_importance, holiday_operation_start_time, holiday_operation_end_time, holiday_operating_hours_importance, holiday_non_operating_hours_importance, and the planned outage schedule.

The failure event types are: type "M", applicable in all cases; types "R" and "S", applicable when there is one "resource" whose importance is to be considered; types "T" and "U", for cases with multiple such "resources"; type "N", applicable when there is a "numeric field" whose importance is to be considered; and type "X", indicating that other special handling is required.

The final failure level is determined according to the failure event type as follows.

If the type is "M", the final level is determined by considering the environment variables of the event type. If the type is "R", the final level is determined by considering the environment variables of the resource; if there is no resource, the environment variables of the event type are considered instead. If the type is "S", the final level is determined by considering the environment variables of the resource; if there is no resource, the lowest level is applied. If the type is "T", the importance of the resource with the highest importance is applied, considering each resource's environment variables; if several resources share the highest importance, the one appearing first takes precedence, and if there is no resource, the event type importance is applied. If the type is "U", the resources are handled as for "T", except that if there is no resource the final level is the lowest level. If the type is "N", the importance is determined from the value of the numeric field, per event type or per event type and failed resource; if the value falls outside the specified ranges, the event type importance is applied. If the type is "X", a special processing program for the event type is run to determine the failure level; if no such program is designated, the lowest level is assigned.

According to the present invention, when a failure event occurs in an operating system (business processing equipment, software, systems, business processes, and so on) responsible for operations in IT (information technology) system management or in industry, the failure level can be determined accurately because environment variables are taken into account.

In addition, because the correct level of a failure event occurring in the operating system can be determined, appropriate handling can be prompted when the event occurs, which secures stable system operation.

A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings. Where a detailed description of a related known function or configuration would unnecessarily obscure the gist of the present invention, that description is omitted.

Fig. 1 is a schematic configuration diagram of a failure level determination system to which the present invention is applied. It consists of an operating system (100) and a failure level determination system (200).

The operating system (100) is a collective term for business processing equipment, software, systems, business processes, and the like. It controls and runs the overall operation of a device or system and, when it detects a failure, generates a failure event either in real time or periodically.

The failure level determination system (200) is connected to the operating system (100). When it receives a failure event transmitted from the operating system (100), it analyzes the failure event information, determines the failure level, and displays that level.

The failure level determination system (200) comprises: a data interface (210) that, after establishing a connection with the operating system (100), requests event information or receives transmitted event information; a database (220) that stores the received event information together with the various information needed to grade a failure event (for example, information for subdividing the event description part of a failure event, field-specific environment variable information, type definition information for defining event types, and information for determining the final failure level); a failure occurrence determination unit (230) that analyzes the event information received through the data interface (210) to decide whether a failure has occurred; a failure level determination unit (240) that, when event information is received through the data interface (210), determines the level of the failure event using the information stored in the database (220) and controls the display of that level; and a failure level display unit (250) that works with the failure level determination unit (240) to display the failure level when the operating system (100) fails.

The failure level determination system (200) configured as described above is preferably implemented as a single module. Implemented that way, it only needs a connection to a given operating system to serve as a device that determines the failure level when a failure event occurs in that system.

Fig. 2 is a flowchart showing a preferred embodiment of the "method for determining the failure level of an operating system" according to the present invention. It shows the process by which the failure level determination unit (240) in the failure level determination system (200) determines the failure level in software; S denotes a step.

As shown there, the "method for determining the failure level of an operating system" according to the present invention consists of: a first step (S101 ~ S107) of establishing a connection with the operating system (100) and then receiving event information generated by the operating system (100) to check whether it is a failure event; a second step (S109) of subdividing the components of the failure event when the check shows a failure event; a third step (S111 ~ S113) of setting field-specific environment variables suited to the components and defining the type of the failure event; and a fourth step (S117 ~ S119) of determining the final failure level by applying the environment variables to the defined type.

In this method, with the operating system (100) and the failure level determination system (200) physically connected (for example, by a communication line), step S101 attempts to establish a connection with the operating system (100) over that link.

If a response to the connection request is then received from the operating system (100), the connection is judged to be normal. If no response is received from the operating system (100), the method returns to step S101 and retries the connection with the operating system.
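The retry behaviour of step S101 can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the patented implementation: `try_connect` is a hypothetical callable standing in for the connection request, and `max_attempts` and `delay_seconds` are parameters added here so that the loop terminates, whereas the description simply returns to S101 until a response arrives.

```python
import time
from typing import Callable


def connect_with_retry(try_connect: Callable[[], bool],
                       max_attempts: int = 5,
                       delay_seconds: float = 0.0) -> int:
    """Repeat step S101 until the operating system answers.

    try_connect is a hypothetical callable that returns True when a
    connection response was received. Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        if try_connect():
            return attempt          # response received: connection is normal
        time.sleep(delay_seconds)   # no response: go back to S101
    # Assumption: give up after max_attempts instead of looping forever.
    raise ConnectionError(f"no response after {max_attempts} attempts")
```

A caller would pass whatever function actually opens the link to the operating system and proceed to step S103 once the call returns.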

Once the connection with the operating system (100) has been established normally, step S103 requests event (Event) information from the operating system (100) and receives the event information it transmits.

When event information is received, it is analyzed in steps S105 and S107 to decide whether it is a failure event. If it is, the method proceeds to step S109 and subdivides the event description part of the event's components. The decision itself uses a known event detection method that is well established in the event monitoring field, so its detailed description is omitted.

The event description part is subdivided as follows.

Events that occur during operation of a typical system have the structure shown in [Table 1] below.

[Table 1] Event structure in a typical operating system

| Occurrence time | Generating (detecting) system | Event type (Event_id) | Event description (Event_description) |
| --- | --- | --- | --- |
| 2008.3.24 13:45:32 | SYS1 | IEF 4501 | JOBA ABEND S037 REASON=OC |
| 2008.3.24 13:46:21 | TEC1 | Ora_down_RDB_50 | Oracle database Instance ORADVE down |
| 2008.3.24 13:47:47 | NMS22 | nw_node_down21 | Router NR5334 is unavailable now |
| 2008.3.24 13:49:15 | APM3 | perf_degrad_35 | response time for BIZ3 is 6.4sec(over5) |

Of these event components, the description part is subdivided into "resource", "numeric field", and "general field". [Table 2] below shows an example of this subdivision.

[Table 2] Example of subdividing the description part of the event components

| Occurrence time | Generating (detecting) system | Event type (Event_id) | Event description (Event_description) |
| --- | --- | --- | --- |
| 2008.3.24 13:45:32 | SYS1 | IEF4501 | JOBA (resource) ABEND S037 REASON=OC |
| 2008.3.24 13:46:21 | TEC1 | Ora_down_RDB_50 | Oracle database Instance ORADVE (resource) down |
| 2008.3.24 13:47:47 | NMS22 | nw_node_down21 | Router NR5334 (resource) is unavailable now |
| 2008.3.24 13:49:15 | APM3 | perf_degrad_35 | response time for BIZ3 (resource) is 6.4 (numeric) sec(over5) |
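One way the subdivision of step S109 might look in code is sketched below. The description does not say how the segmentation is implemented; this sketch assumes one regular expression per Event_id, with named groups `resource` and `value`. The `PATTERNS` table, the `Segmented` type, and the `segment` function are hypothetical names introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical per-Event_id patterns. The named groups mark the resource
# and the numeric field; the remaining text is treated as general fields.
PATTERNS = {
    "IEF4501": re.compile(r"^(?P<resource>\S+) ABEND"),
    "Ora_down_RDB_50": re.compile(r"Instance (?P<resource>\S+) down"),
    "nw_node_down21": re.compile(r"^Router (?P<resource>\S+) is unavailable"),
    "perf_degrad_35": re.compile(
        r"response time for (?P<resource>\S+) is (?P<value>\d+(?:\.\d+)?)"),
}


class Segmented(NamedTuple):
    resource: Optional[str]   # "resource" part, if any
    value: Optional[float]    # "numeric field" part, if any
    general: str              # everything else ("general fields")


def segment(event_id: str, description: str) -> Segmented:
    """Split an event description into resource, numeric and general parts."""
    pattern = PATTERNS.get(event_id)
    match = pattern.search(description) if pattern else None
    if match is None:
        # Unknown Event_id or no match: the whole text is a general field.
        return Segmented(None, None, description)
    groups = match.groupdict()
    resource = groups.get("resource")
    raw_value = groups.get("value")
    value = float(raw_value) if raw_value is not None else None
    # Remove the extracted tokens once each to obtain the general fields.
    general = description
    for token in (resource, raw_value):
        if token:
            general = general.replace(token, "", 1)
    return Segmented(resource, value, " ".join(general.split()))
```

Applied to the fourth sample event, `segment("perf_degrad_35", "response time for BIZ3 is 6.4sec(over5)")` yields the resource `BIZ3` and the numeric value `6.4`.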

Next, in step S111, field-specific environment variables appropriate to each component are set. [Table 3] below shows the environment variables that correspond to each component.

[Table 3] Environment variables per component

| Component | Environment variables (importance according to environment) |
| --- | --- |
| Generating (detecting) system | weekday_operation_start_time, weekday_operation_end_time, weekday_operating_hours_importance, weekday_non_operating_hours_importance, weekend_operation_start_time, weekend_operation_end_time, weekend_operating_hours_importance, weekend_non_operating_hours_importance, holiday_operation_start_time, holiday_operation_end_time, holiday_operating_hours_importance, holiday_non_operating_hours_importance, planned outage schedule |
| Event_id | weekday_operation_start_time, weekday_operation_end_time, weekday_operating_hours_importance, weekday_non_operating_hours_importance, weekend_operation_start_time, weekend_operation_end_time, weekend_operating_hours_importance, weekend_non_operating_hours_importance, holiday_operation_start_time, holiday_operation_end_time, holiday_operating_hours_importance, holiday_non_operating_hours_importance, planned outage schedule |
| Resource | weekday_operation_start_time, weekday_operation_end_time, weekday_operating_hours_importance, weekday_non_operating_hours_importance, weekend_operation_start_time, weekend_operation_end_time, weekend_operating_hours_importance, weekend_non_operating_hours_importance, holiday_operation_start_time, holiday_operation_end_time, holiday_operating_hours_importance, holiday_non_operating_hours_importance, planned outage schedule |
| Numeric field per Event_id | Importance assigned to each numeric interval, "per Event_id" or "per Event_id and failed resource" |
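The environment variables listed for a component could be held in a small data structure and evaluated against the event's occurrence time, as in the sketch below. Several points are assumptions made for illustration and are not stated in the description: a smaller number means a higher importance (consistent with the level 1 to 3 examples elsewhere in the document), holidays are supplied as a set of dates, and an event inside a planned outage window receives a fixed `outage_importance`. The class and field names are English renderings chosen here.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time
from typing import List, Set, Tuple


@dataclass
class DayProfile:
    start: time                 # operation start time
    end: time                   # operation end time
    operating_importance: int   # importance during operating hours
    off_hours_importance: int   # importance outside operating hours


@dataclass
class EnvironmentVariables:
    weekday: DayProfile
    weekend: DayProfile
    holiday: DayProfile
    holidays: Set[date] = field(default_factory=set)  # assumed holiday calendar
    planned_outages: List[Tuple[datetime, datetime]] = field(default_factory=list)
    outage_importance: int = 9  # assumption: level used during a planned outage

    def importance_at(self, when: datetime) -> int:
        """Return the environment-dependent importance at a given time."""
        if any(begin <= when < end for begin, end in self.planned_outages):
            return self.outage_importance
        if when.date() in self.holidays:
            profile = self.holiday
        elif when.weekday() >= 5:   # 5 = Saturday, 6 = Sunday
            profile = self.weekend
        else:
            profile = self.weekday
        in_hours = profile.start <= when.time() < profile.end
        return (profile.operating_importance if in_hours
                else profile.off_hours_importance)
```

The same structure can be instantiated once per generating system, per Event_id, and per resource, since all three components carry the same set of variables.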

Next, in step S113, an event type is defined for each kind of failure event that has occurred. [Table 4] below shows the event types defined per kind of event.

[Table 4] Event types

| Event type | Description | Importance handling |
| --- | --- | --- |
| M | Applicable in all cases | The Event_id importance is used as the final importance. |
| R | One "resource" subject to importance consideration exists | If the resource exists, apply the importance of the resource. If there is no such resource, apply the event type importance. |
| S | One "resource" subject to importance consideration exists | If the resource exists, apply the importance of the resource. If there is no resource, the final importance is "lowest level (no level)". |
| T | Multiple "resources" subject to importance consideration exist | If resources exist, apply the importance of the resource with the highest importance. If several resources share the highest importance, the earlier resource takes precedence. If there is no resource, apply the event type importance. |
| U | Multiple "resources" subject to importance consideration exist | If resources exist, apply the importance of the resource with the highest importance. If several resources share the highest importance, the earlier resource takes precedence. If there is no resource, the final importance is "lowest level (no level)". |
| N | A "numeric field" subject to importance consideration exists | Importance is determined per numeric interval, per event type or per event type and failed resource. Example: 50 or more and less than 80: importance 3; 80 or more and less than 95: importance 2; 95 or more and less than 101: importance 1; otherwise: apply the importance of the event type. |
| X | Other special handling required | A special processing program is designated per event type. |
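The numeric-interval rule given as the example for type "N" can be written out directly. The sketch below uses the three intervals from that example; the names `EXAMPLE_INTERVALS` and `grade_numeric` are hypothetical, and in practice the intervals would be configured per Event_id or per Event_id and failed resource.

```python
from typing import List, Tuple

# (lower bound inclusive, upper bound exclusive, importance),
# copied from the type "N" example.
EXAMPLE_INTERVALS: List[Tuple[float, float, int]] = [
    (50, 80, 3),
    (80, 95, 2),
    (95, 101, 1),
]


def grade_numeric(value: float,
                  event_type_importance: int,
                  intervals: List[Tuple[float, float, int]] = EXAMPLE_INTERVALS
                  ) -> int:
    """Return the importance of the interval containing value.

    If the value lies outside every interval, fall back to the
    importance of the event type.
    """
    for low, high, importance in intervals:
        if low <= value < high:
            return importance
    return event_type_importance
```

With these intervals, a value of 85 is graded 2, while a value of 6.4 falls outside every interval and receives the event type importance.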

Once the event type has been defined for the failure event, steps S115 and S117 apply the environment variables per event type to determine the final failure level.

That is, if the failure event type is "M", the final level is determined by considering the environment variables of the event type. If the failure event type is "R", the final level is determined by considering the environment variables of the resource; if there is no resource, the environment variables of the event type are considered instead.

If the failure event type is "S", the final level is determined by considering the environment variables of the resource; if there is no resource, the lowest level is applied as the final level. If the failure event type is "T", the importance of the resource with the highest importance is applied, considering each resource's environment variables; if several resources share the highest importance, the one appearing first takes precedence, and if there is no resource, the event type importance is applied to determine the final failure level.

For failure event types "M", "R", "S", and "T", the environment variables of the event-generating system are also considered: if the final importance derived above is higher than the importance of the event-generating system with its environment variables considered, the importance of the event-generating system becomes the final importance. In other words, the final importance cannot exceed the importance of the event-generating system with its environment variables considered.
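The rules for types "M", "R", "S", and "T", including the cap imposed by the event-generating system, can be sketched as below. The sketch assumes that every importance passed in has already been evaluated against the relevant environment variables, that a smaller number means a higher importance, and that "lowest level (no level)" is represented by the sentinel `LOWEST`; all three are assumptions, as are the function names.

```python
from typing import Sequence

LOWEST = 99  # assumption: sentinel standing for "lowest level (no level)"


def base_importance(event_type: str,
                    event_importance: int,
                    resource_importances: Sequence[int]) -> int:
    """Importance before the system cap, for types M, R, S and T."""
    if event_type == "M":
        return event_importance
    if event_type not in ("R", "S", "T"):
        raise ValueError(f"unsupported type: {event_type}")
    if not resource_importances:
        # No resource: S falls to the lowest level, R and T use the event type.
        return LOWEST if event_type == "S" else event_importance
    if event_type == "T":
        # Several resources: the most important one (smallest number) wins.
        return min(resource_importances)
    return resource_importances[0]  # R and S consider a single resource


def final_importance(event_type: str,
                     event_importance: int,
                     resource_importances: Sequence[int],
                     system_importance: int) -> int:
    """Apply the cap: the result may not be more important than the system."""
    derived = base_importance(event_type, event_importance, resource_importances)
    # Smaller number = more important, so "cannot exceed" becomes max().
    return max(derived, system_importance)
```

For example, a type "T" event whose resources have importances 3, 1, and 2 on a system of importance 2 ends up at level 2, because the derived level 1 is capped by the system.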

If the failure event type is "U", the importance of the resource with the highest importance is applied, considering each resource's environment variables; if several resources share the highest importance, the one appearing first takes precedence, and if there is no resource, the final failure level is the lowest level.

If the failure event type is "N", the importance is determined from the value of the numeric field, per event type or per event type and failed resource; if the value falls outside the specified ranges, the event type importance is applied to determine the final failure level.

Finally, if the failure event type is "X", a special processing program for the event type is run to determine the failure level; if no special processing program is designated, the lowest level is assigned as the failure level.
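The type "X" rule amounts to a lookup of a per-Event_id handler with a fallback to the lowest level. A minimal sketch follows, assuming a registry keyed by Event_id. The registry, the decorator, the sentinel `LOWEST`, the Event_id `CUSTOM_001`, and its handler logic are all hypothetical and serve only to show the dispatch.

```python
from typing import Callable, Dict

LOWEST = 99  # assumption: sentinel standing for "lowest level (no level)"

# Hypothetical registry: Event_id -> special processing program.
SPECIAL_HANDLERS: Dict[str, Callable[[str], int]] = {}


def register_handler(event_id: str):
    """Decorator that designates a special processing program for an Event_id."""
    def decorator(func: Callable[[str], int]) -> Callable[[str], int]:
        SPECIAL_HANDLERS[event_id] = func
        return func
    return decorator


@register_handler("CUSTOM_001")  # made-up Event_id for illustration
def grade_custom_001(description: str) -> int:
    # Made-up rule: treat descriptions containing "CRITICAL" as level 1.
    return 1 if "CRITICAL" in description else 3


def grade_special(event_id: str, description: str) -> int:
    """Run the designated program, or return the lowest level if none exists."""
    handler = SPECIAL_HANDLERS.get(event_id)
    if handler is None:
        return LOWEST
    return handler(description)
```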

Then, in step S119, the determined failure level is reported to the administrator so that follow-up measures can be planned, and the failure level is displayed.

Applying the failure level of the conventional method and the failure level of the method according to the present invention to failure events that occurred in an actual operating system gives [Table 5] below.

[Table 5] Importance determination examples

| Event | Conventional importance determination | Importance determination of the present invention |
| --- | --- | --- |
| RAD017 SEOUL21 rader detected a flying object at 56km ahead, in latitude 37°92' North and in longitude 126°70' East | Level 2: An unidentified object has been detected on radar. | Level 1: An unidentified object has been detected 56 km ahead of the Seoul radar, at latitude 37°92' North and longitude 126°70' East. |
| RAD017 DAIGU47 rader detected a flying object at 96km ahead, in latitude 36°50' North and in longitude 128°00' East | Level 2: An unidentified object has been detected on radar. | Level 3: An unidentified object has been detected 96 km ahead of the Daegu radar, at latitude 36°50' North and longitude 128°00' East. |

In conclusion, because the conventional failure level determination method merely maps the event kind to a predetermined importance, it assigns a uniform failure level even to cases that differ, as shown in [Table 5] above, so an important event may be overlooked and lead to a major accident.

In contrast, the present invention takes environment variables (for example, Seoul or Daegu) into account, so that even the same event is assigned the failure level best suited to its environment, as noted above. This makes it possible to prevent in advance the problem of missing an important event and causing a major accident.
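The contrast drawn above can be illustrated with the two RAD017 events of [Table 5]. The sketch below only replays the levels reported in the table; the dictionary layout, the function names, and the fallback to the event-kind level for an unlisted site are assumptions made for illustration.

```python
# Level fixed per event kind, as in the conventional method (value from Table 5).
CONVENTIONAL_LEVEL = {"RAD017": 2}

# Level per (event kind, site), as reported for the present invention in Table 5.
ENV_AWARE_LEVEL = {
    ("RAD017", "SEOUL21"): 1,
    ("RAD017", "DAIGU47"): 3,
}


def conventional_level(kind: str, site: str) -> int:
    """The site is ignored: every event of the same kind gets the same level."""
    return CONVENTIONAL_LEVEL[kind]


def env_aware_level(kind: str, site: str) -> int:
    """Use the site-specific level; falling back to the kind level is an assumption."""
    return ENV_AWARE_LEVEL.get((kind, site), CONVENTIONAL_LEVEL[kind])
```

With these values, the conventional lookup returns level 2 for both sites, while the environment-aware lookup separates the Seoul event (level 1) from the Daegu event (level 3).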

The present invention is not limited to the specific preferred embodiments described above. Any person having ordinary skill in the art to which the invention pertains can make various modifications without departing from the gist of the invention as claimed, and such modifications fall within the scope of the claims.

The present invention described above can accurately determine the failure level when applied to operating systems in IT-related or industrial fields, and can accurately determine the failure level of any system for which a data interface is available. It can therefore be extended to every field that uses an operating system, such as the defense industry and the environmental industry.

FIG. 1 is a block diagram showing the schematic configuration of a failure determination system to which the present invention is applied.

FIG. 2 is a flowchart showing the method for determining the failure level of an operating system according to the present invention.

<Explanation of reference numerals for the main parts of the drawings>

100: Operating system

200: Failure level determination system

210: Data interface unit

220: Database

230: Failure occurrence determination unit

240: Failure level determination unit

250: Failure level display unit

Claims

(Deleted)

In a method, connected to an operating system that operates equipment, devices, or the like, for determining a failure level when a failure occurs in the operating system,

the method includes a fourth step of determining a final failure level by applying the environment variables to the defined type,

wherein the per-element environment variables include weekday_operation_start_time, weekday_operation_end_time, weekday_operating_hours_importance, weekday_non_operating_hours_importance, weekend_operation_start_time, weekend_operation_end_time, weekend_operating_hours_importance, weekend_non_operating_hours_importance, holiday_operation_start_time, holiday_operation_end_time, holiday_operating_hours_importance, holiday_non_operating_hours_importance, and a planned outage schedule.
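As an illustration of how the environment variables listed above might be held and consulted, the following sketch groups them into one profile per day category and selects an importance for a given time. The class and function names are hypothetical. It further assumes that operating hours do not cross midnight and that a public holiday takes precedence over a weekend. The claim does not say how a planned outage changes the level, so the sketch only reports whether a time falls inside one.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time
from typing import List, Set, Tuple


@dataclass
class DayProfile:
    start: time                    # *_operation_start_time
    end: time                      # *_operation_end_time
    operating_importance: int      # *_operating_hours_importance
    non_operating_importance: int  # *_non_operating_hours_importance


@dataclass
class EnvVars:
    weekday: DayProfile
    weekend: DayProfile
    holiday: DayProfile
    # Planned outage schedule as (start, end) pairs.
    planned_outages: List[Tuple[datetime, datetime]] = field(default_factory=list)


def importance_at(env: EnvVars, when: datetime, holidays: Set[date]) -> int:
    """Pick the importance that applies at `when`."""
    if when.date() in holidays:      # assumption: a holiday overrides a weekend
        profile = env.holiday
    elif when.weekday() >= 5:        # Saturday is 5, Sunday is 6
        profile = env.weekend
    else:
        profile = env.weekday
    if profile.start <= when.time() < profile.end:
        return profile.operating_importance
    return profile.non_operating_importance


def in_planned_outage(env: EnvVars, when: datetime) -> bool:
    """Report whether `when` falls inside any planned outage window."""
    return any(start <= when < end for start, end in env.planned_outages)
```

Keeping the three day categories in the same `DayProfile` shape mirrors the parallel naming of the variables in the claim and keeps the selection logic to a single comparison per category.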

The method for determining the failure level of an operating system, wherein the type of the failure event is one of:

type "M", which is applicable in all cases; types "R" and "S", which are applicable when there is one "resource" subject to importance consideration; types "T" and "U", which apply when there are multiple "resources" subject to importance consideration; type "N", which is applicable when there is a "numeric field" subject to importance consideration; and type "X", which indicates a type requiring other special processing.

The method of claim 4, wherein the final failure level is determined according to the type of the failure event as follows:

when the failure event type is "M", the final level is determined in consideration of the environment variables of the event kind; when the failure event type is "R", the final level is determined in consideration of the environment variables of the resource, and if there is no resource, in consideration of the environment variables of the event kind; when the failure event type is "S", the final level is determined in consideration of the environment variables of the resource, and if there is no resource, the lowest level is applied as the final level; when the failure event type is "T", the importance of the resource having the highest importance is applied in consideration of the environment variables of the resources, the preceding resource being applied first if several resources share the highest importance, and if there is no resource, the final failure level is determined by applying the importance of the event kind; when the failure event type is "U", the importance of the resource having the highest importance is applied in consideration of the environment variables of the resources, the preceding resource being applied first if several resources share the highest importance, and if there is no resource, the final failure level is determined as the lowest level; when the failure event type is "N", the importance is determined according to the value of the numeric field for each event kind, or for each combination of event kind and failed resource, and if the occurring value is outside the specified range, the final failure level is determined by applying the importance of the event kind; and when the failure event type is "X", the failure level is determined by running the special processing program for the event kind, and if no special processing program is specified, the failure level is determined as the lowest level.
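The per-type rules of this claim can be sketched as a single dispatch function. This is a minimal sketch under stated assumptions, not the patented implementation: levels are integers with 1 as the most severe and 5 as the lowest, every importance passed in is assumed to have already had its environment variables applied, numeric ranges are half-open, and all names (`final_level`, `pick_top_resource`, the parameter names) are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

LOWEST_LEVEL = 5  # assumption: 1 is the most severe level, 5 the lowest

Resource = Tuple[str, int]             # (resource name, importance)
NumericRange = Tuple[float, float, int]  # (low, high, importance), low <= v < high


def pick_top_resource(resources: Sequence[Resource]) -> Resource:
    """Most important resource; on a tie the preceding (earlier) one wins,
    because min() keeps the first minimal item it encounters."""
    return min(resources, key=lambda r: r[1])


def final_level(
    event_type: str,
    kind_level: int,
    resources: Sequence[Resource] = (),
    numeric_value: Optional[float] = None,
    numeric_ranges: Sequence[NumericRange] = (),
    special_program: Optional[Callable[[], int]] = None,
) -> int:
    if event_type == "M":
        return kind_level
    if event_type in ("R", "S"):       # single-resource types
        if resources:
            return resources[0][1]
        return kind_level if event_type == "R" else LOWEST_LEVEL
    if event_type in ("T", "U"):       # multiple-resource types
        if resources:
            return pick_top_resource(resources)[1]
        return kind_level if event_type == "T" else LOWEST_LEVEL
    if event_type == "N":              # numeric-field type
        if numeric_value is not None:
            for low, high, level in numeric_ranges:
                if low <= numeric_value < high:
                    return level
        return kind_level              # value outside every specified range
    if event_type == "X":              # special-processing type
        return special_program() if special_program is not None else LOWEST_LEVEL
    raise ValueError(f"unknown failure event type: {event_type!r}")
```

The pairs "R"/"S" and "T"/"U" differ only in their fallback when no resource is present, which is why each pair shares one branch here.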

The method of claim 5, wherein, when the failure event type is "M", "R", "S", or "T", the environment variables of the event-generating system are taken into account, and when the final importance derived above is higher than the importance of the event-generating system with its environment variables considered, the importance of the event-generating system is determined as the final importance.
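The limiting rule of this claim can be sketched as a small post-processing step. The sketch assumes integer levels in which a smaller number means higher importance (1 is the most severe), so "higher than the system's importance" becomes a numerically smaller value; the function and constant names are hypothetical.

```python
# Types for which the importance of the event-generating system acts as a limit.
CAPPED_TYPES = {"M", "R", "S", "T"}


def apply_system_cap(event_type: str, derived_level: int, system_level: int) -> int:
    """Limit the derived level by the event-generating system's level.

    Assumption: a smaller number means higher importance, so a derived level
    is "higher" than the system level when it is numerically smaller.
    """
    if event_type in CAPPED_TYPES and derived_level < system_level:
        return system_level
    return derived_level
```

For example, a type "R" event whose derived level is 1 on a system whose importance is 3 ends up at level 3, while a type "U" event is left unchanged because the claim does not list that type.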