KR19990013520A

KR19990013520A - Distributed data processing system management method and apparatus

Info

Publication number: KR19990013520A
Application number: KR1019980026491A
Authority: KR
Inventors: 길리스그렘싱클레어; 고든제프리존; 자파리-랜그루디사다르; 서튼케빈존
Original assignee: 포맨제프리엘; 인터내셔널비지네스머신즈코포레이션
Priority date: 1997-07-26
Filing date: 1998-07-01
Publication date: 1999-02-25
Also published as: GB9715760D0; GB2328043A; KR100308678B1

Abstract

The present invention relates to an apparatus for managing a distributed data processing system having a plurality of application server computer systems interconnected via a network. Each application server has a database application comprising a database, communicates data from the database over the network to a plurality of client computer systems and to other application servers, and generates a log file recording occurrences of the data communication. The apparatus stores log files received from the application servers over the network in memory; generates a first event message upon detecting an alert condition in a stored log file; generates a second event message upon detecting an absence of database replication by the database application in one or more of the application servers; generates a third event message upon detecting a failure condition in the database application in one or more of the application servers; and generates a fourth event message upon detecting a failure condition in one or more connections between the application servers.

Description

Distributed data processing system management method and apparatus

The present invention relates to a method and apparatus for managing a distributed data processing system.

Some conventional data processing environments comprise a number of user terminals connected to a central host data processing system. Such an environment is commonly referred to as a centralized or host environment.

Distributed data processing environments, in which user terminals are connected to a plurality of server data processing systems, are becoming increasingly popular.

In both of the above cases, the cost of system management is measured by the ratio of administrators (or operations support staff) to users. In a typical distributed environment, such as one providing a Lotus Notes service or a similar distributed client-server database application, that ratio is relatively high. For example, a single Lotus Notes administrator may find it difficult to manage more than 200 users of a fully functional Lotus Notes service (Lotus and Lotus Notes are trademarks of Lotus Development Corporation). By comparison, in a typical host environment such as OfficeVision (OfficeVision is a trademark of IBM Corporation), a single administrator may comfortably manage several thousand users.

In a typical distributed environment employing a distributed database management system, a group of administrators collectively performs the operational tasks associated with managing servers such as groupware or e-mail servers. Both e-mail and groupware applications typically generate several megabytes of information during normal day-to-day operation. This information is typically stored in log format. It is desirable to process the logs in order to identify error conditions and thereby eliminate, or at least reduce, application server failures. However, processing such logs is a difficult activity. It is therefore desirable to improve the automation of server management in distributed environments.

In accordance with the present invention, there is provided an apparatus for managing a distributed data processing system comprising a plurality of application server computer systems interconnected via a network, each application server having a database application comprising a database, means for communicating data from the database over the network to a plurality of client computer systems and to other application servers, and means for generating a log file recording occurrences of the data communication. The apparatus comprises: storage means; monitor means for storing in the storage means log files received from the application servers via the network and for generating a first event message upon detection of an alert condition in a stored log file; replication tracking means for generating a second event message upon detecting an absence of database replication by the database application in one or more of the application servers; server probe means for generating a third event message upon detection of a failure condition in the database application in one or more of the application servers; and mail probe means for generating a fourth event message upon detection of a failure condition in one or more connections between the application servers.

The apparatus preferably comprises an administration terminal having means for displaying the first, second, third and fourth event messages. In a preferred embodiment of the present invention, the monitor means comprises a filter for selectively communicating the event messages to the administration terminal. The replication tracking means preferably comprises: sampling means for sampling the database of each application server; first comparison means for determining whether corresponding samples match each other; and second comparison means for generating the second event message if the number of matches identified by the first comparison means is less than a predefined acceptance threshold. The server probe means preferably comprises restart means for making successive attempts to resolve the failure condition, and means for generating the third event message after a predetermined number of failed attempts by the restart means to resolve the failure condition. The mail probe means preferably comprises: means for sending a test message to a reflecting one of the application servers via others of the application servers; means for receiving the test message from the reflecting one of the application servers via the others of the application servers; and means for recording the time taken for the test message to pass through the reflecting one of the application servers and the time taken for it to pass through each application server en route to and from the reflecting one of the application servers.
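The sample-compare-threshold logic of the replication tracking means can be illustrated with a minimal Python sketch. This is only one possible reading of the paragraph above: the function names, the in-memory representation of samples, and the wording of the event message are assumptions made for illustration and do not appear in the specification.

```python
from typing import Hashable, Mapping, Optional, Sequence


def count_matching_samples(samples_by_server: Mapping[str, Sequence[Hashable]]) -> int:
    """First comparison: count sample positions on which all servers agree."""
    # zip(*...) groups the i-th sample from every server into one tuple.
    columns = zip(*samples_by_server.values())
    return sum(1 for column in columns if len(set(column)) == 1)


def replication_event(samples_by_server: Mapping[str, Sequence[Hashable]],
                      acceptance_threshold: int) -> Optional[str]:
    """Second comparison: raise the second event message below the threshold."""
    matches = count_matching_samples(samples_by_server)
    if matches < acceptance_threshold:
        # Hypothetical message format; the specification does not define one.
        return f"EVENT 2: {matches} matching samples, threshold {acceptance_threshold}"
    return None


# Hypothetical samples taken from the databases of servers 50, 60 and 70.
samples = {
    "server50": ["doc-a", "doc-b", "doc-c"],
    "server60": ["doc-a", "doc-x", "doc-c"],
    "server70": ["doc-a", "doc-b", "doc-c"],
}
```

With these samples, two of the three positions agree across all servers, so an acceptance threshold of 3 would produce the second event message while a threshold of 2 would not.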

It will be appreciated that the present invention extends to a distributed data processing system comprising: a plurality of application server computer systems interconnected via a network, each application server having a database application comprising a database, means for communicating data from the database over the network to a plurality of client computer systems and to other application servers, and means for generating a log file recording occurrences of the data communication; and an apparatus as described above.

Viewing the present invention from another aspect, there is provided a method for managing a distributed data processing system comprising a plurality of application server computer systems interconnected via a network, each application server having a database application comprising a database, means for communicating data from the database over the network to a plurality of client computer systems and to other application servers, and means for generating a log file recording occurrences of the data communication. The method comprises: storing log files received from the application servers via the network in storage means of a management server computer system; generating, by the management server, a first event message upon detection of an alert condition in a stored log file; generating, by the management server, a second event message upon detecting an absence of database replication by the database application in one or more of the application servers; generating, by the management server, a third event message upon detection of a failure condition in the database application in one or more of the application servers; and generating, by the management server, a fourth event message upon detection of a failure condition in one or more connections between the application servers.
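The third event message is described earlier as being raised only after a predetermined number of failed restart attempts. A minimal Python sketch of that retry behaviour follows. The callables `is_healthy` and `restart`, the default of three attempts, and the message text are all hypothetical stand-ins; the specification does not say how a failure is detected or how a restart is performed.

```python
from typing import Callable, Optional


def probe_server(is_healthy: Callable[[], bool],
                 restart: Callable[[], None],
                 max_attempts: int = 3) -> Optional[str]:
    """Try to clear a failure condition; report only if every attempt fails."""
    if is_healthy():
        return None  # no failure condition, nothing to report
    for _ in range(max_attempts):
        restart()
        if is_healthy():
            return None  # failure resolved without operator involvement
    # Hypothetical message format for the third event message.
    return f"EVENT 3: failure persists after {max_attempts} restart attempts"


class FakeServer:
    """Test double: becomes healthy after a given number of restarts."""

    def __init__(self, restarts_needed: int) -> None:
        self.restarts_needed = restarts_needed
        self.restarts_done = 0

    def is_healthy(self) -> bool:
        return self.restarts_done >= self.restarts_needed

    def restart(self) -> None:
        self.restarts_done += 1
```

The design point is that transient faults are absorbed by the probe itself, so the administration terminal only sees faults that restarts could not clear.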

FIG. 1 is a block diagram of a distributed data processing system.

FIG. 2 is a more detailed block diagram of the data processing system of FIG. 1.

FIG. 3 is a block diagram of the DSM server of the system shown in FIG. 2.

FIG. 4 is a block diagram of an application server of the system shown in FIG. 2.

FIG. 5 is a block diagram of the high-level architecture of the DSM server.

FIG. 6 is a block diagram of the software stored in an application server of the system shown in FIG. 2.

FIG. 7 is a functional block diagram of the DSM server.

FIG. 8 is a block diagram, in flowchart form, of the server probe function of the DSM server.

FIG. 9 is a block diagram, in flowchart form, of the mail probe function of the DSM server.

FIG. 10 is a block diagram of another distributed data processing environment embodying the present invention.

* Explanation of reference numerals for the main parts of the drawings

10: DSM server 20: administration terminal

40-70: application servers 90-93: client user terminals

A preferred embodiment of the present invention will now be described with reference to the accompanying drawings.

Referring first to FIG. 1, a distributed data processing system embodying the present invention comprises a plurality of application server computer systems 40-70 and a Distributed System Monitor (DSM) server computer system 10, all interconnected via a network 5.

Referring now to FIG. 2, each application server 40-70 provides services to a set of client user terminals 90-93. The DSM server 10 is also connected to an administration terminal 20.

Referring to FIG. 3, the DSM server 10 comprises a system random access memory (RAM) 200, a system read only memory (ROM) 210, a central processing unit (CPU) 220, a mass storage device 230 comprising one or more large-capacity magnetic disks or similar data recording media, one or more removable storage means 240 such as floppy disk drives or CD-ROM drives, a network adapter 250, a keyboard adapter 260, a pointing device adapter 270, and a display adapter 280, all interconnected via a bus architecture 290. The CPU 220 is a Pentium 100 MHz CPU (Pentium is a trademark of Intel Corporation). It will be appreciated that other embodiments of the present invention may use an equivalent of the Pentium 100 MHz CPU to perform the function of the CPU 220. The RAM 200 has a capacity of at least 48 megabytes. A keyboard 300 is connected to the bus architecture 290 via the keyboard adapter 260. Similarly, a pointing device 310, such as a mouse, touch screen, tablet or tracker ball, is connected to the bus architecture 290 via the pointing device adapter 270. Likewise, a display output device 320, such as a cathode ray tube (CRT) display or liquid crystal display (LCD) panel, is connected to the bus architecture 290 via the display adapter 280. In addition, the DSM server 10 is connected to the terminal 20 and the servers 40-70 via the network adapter 250.

Basic input/output system (BIOS) software is stored in the ROM 210 to enable data communication between the CPU 220, the mass storage device 230, the RAM 200, the ROM 210, the removable storage 240 and the adapters 250-280 via the bus architecture 290. The mass storage device 230 stores operating system software and application software, the latter including the DSM software. Further application software may be loaded into the DSM server 10 via the removable storage 240 or the network adapter 250. The operating system software enables the DSM server 10 to select and run the application software. The application software stored on the DSM server 10 includes Lotus Notes Release 4. Lotus Notes 4 is a document-based database management system. Details of Lotus Notes 4 can be found in Mastering Lotus Notes 4 by Brown, Brown, Koutchouk and Brown, published in 1996 by Sybex, Inc. As will be described shortly, in operation the DSM server 10 uses Lotus Notes 4 to communicate with the application servers 40-70.

In some embodiments of the present invention, the terminal 20 may be integral to the DSM server 10, with the monitoring and control functions of the terminal 20 provided through the display 320 and the input devices 300 and 310 of the DSM server 10.

Referring to FIG. 4, each application server 40-70 comprises a system random access memory (RAM) 700, a system read only memory (ROM) 710, a central processing unit (CPU) 720, a mass storage device 730 comprising one or more large-capacity magnetic disks or similar data recording media, one or more removable storage means 740 such as floppy disk drives or CD-ROM drives, a network adapter 750, a keyboard adapter 760, a pointing device adapter 770, and a display adapter 780, all interconnected via a bus architecture 790. The CPU 720 may be a Pentium 100 MHz CPU or an equivalent. A keyboard 800 is connected to the bus architecture 790 via the keyboard adapter 760. Similarly, a pointing device 810, such as a mouse, touch screen, tablet or tracker ball, is connected to the bus architecture 790 via the pointing device adapter 770. Likewise, a display output device 820, such as a cathode ray tube (CRT) display or liquid crystal display (LCD) panel, is connected to the bus architecture 790 via the display adapter 780. In addition, each application server 40-70 is connected to the DSM server 10 and to the remote client terminals 90-93 via the network adapter 750.

Basic input/output system (BIOS) software is stored in the ROM 710 to enable data communication between the CPU 720, the mass storage device 730, the RAM 700, the ROM 710, the removable storage 740 and the adapters 750-780 via the bus architecture 790. The mass storage device 730 stores application software and operating system software. The application software includes distributed client-server database applications such as Lotus Notes 4 and Lotus cc:Mail. In operation, each application server 40-70 uses its resident client-server database application to communicate with both the remote client terminals 90-93 and the DSM server 10. Further application software may also be loaded into each application server 40-70 via the removable storage 740 or the network adapter 750. In operation, the operating system software enables each application server 40-70 to select and run the application software.

Referring back to FIG. 2, the application server 40 is a cc:Mail server running Lotus cc:Mail on the OS/2 operating system platform produced by IBM Corporation (OS/2 is a trademark of IBM Corporation) to provide a cc:Mail service to users of the connected client terminals 90. The application server 50 is a Notes server running Lotus Notes 4 on the Windows NT Server operating system produced by Microsoft Corporation (Windows and Windows NT are trademarks of Microsoft Corporation) to provide a Notes service to users of the connected client terminals 91. The application server 60 is a Notes server running Lotus Notes 4 on the OS/2 operating system to provide a Notes service to users of the connected client terminals 92. The application server 70 is a Notes server running Lotus Notes 4 on a UNIX or AIX operating system (UNIX is licensed exclusively through X/Open Company Limited; AIX is a trademark of IBM Corporation) to provide a Notes service to users of the connected client terminals 93. It will be appreciated that, in other embodiments of the present invention, there may be fewer or more than four application servers running on one or more of the aforementioned or other operating system platforms.

DSM Server: General Principles

It will be appreciated from FIG. 2 that, in terms of the system hierarchy, the DSM server sits between the application servers 40-70 and the administration terminal 20. In operation, the DSM server acts as a mid-level system manager. In operation, the application servers 40-70 record in log files the associated data transfers, such as messages and e-mail to and from the connected client terminals 90-93. The log files maintained by the application servers 40-70 are directed to the DSM server 10. The DSM server 10 processes the received log files to reduce the amount of reporting information sent to the administration terminal 20. Provided that the application servers 40-70 can route such log files to the DSM server 10 and, in the case of the Notes servers 50-70, that they use the Notes communication protocol, the operating system platform is immaterial.

Referring now to FIG. 5, the high-level architecture of the DSM server 10 comprises a first layer 11 for performing Process, Action, Notify and Report functions. Beneath the first layer 11 is a second functional layer 12 for performing Log, Analyze and Filter functions. Beneath the second layer 12 is a Lotus Notes layer 13. In operation, the Notes layer 13 enables the DSM server 10 to communicate with the application servers 50-70. To facilitate such communication, the Notes layer 13 includes a Notes mail message transfer agent (MTA) 14, a cc:Mail MTA 15, a Simple Mail Transfer Protocol (SMTP) mail MTA 16, and an X.400 mail MTA (not shown). The message transfer agents remove the need for special mail gateways to communicate with non-Notes mail systems. Beneath the Notes layer 13 is a network layer 17 for interfacing the DSM server 10 to the application servers 40-70. The mass storage device 230 comprises a Notes data store 21, a cc:Mail store 22 and an archive data store 23.

Data from the application servers 40-70, such as STATREP and LOG.NSF files 110 or other mail system log files 100, is received by the DSM server 10 at the network layer 17 and passed via the MTAs 14-16 of the Notes layer 13 to the second layer 12 for processing by the Log, Analyze and Filter functions. The Log function records the incoming data in the mass storage device 230. The filtered data is passed from the second layer 12 to the first layer 11 for processing by the Process, Action, Notify and Report functions. In response to the received data, the Action function may generate corrective commands 81 that are returned to the application servers 40-70. Depending on the system configuration, a delay may be imposed, in respect of one or more of the functions, on the passage of data from the second layer 12 to the first layer 11. For example, in some embodiments of the present invention, the Report function may be set to activate only once a week, with the data remaining logged in the mass storage device 230 until the report falls due.

The high-level architecture of each application server 40-70 comprises a mail layer 41 providing Notes or cc:Mail functionality, as the case may be, on an OS/2, NT or AIX operating system platform. Beneath the mail layer 41 is a log layer 42 for supplying log files to the DSM server 10. Beneath the log layer 42 is a network layer 43 for interfacing with the network layer 17 of the DSM server 10.
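The flow of data from the second layer (Log, Analyze, Filter) to the first layer (here only Notify) can be sketched in Python as follows. This is an illustrative model only: the class and function names, the keyword-based analysis, and the alert vocabulary are assumptions, since the specification does not state how alert conditions are recognised in log data.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Hypothetical alert vocabulary; the specification does not define one.
ALERT_WORDS = ("error", "failed")


@dataclass
class SecondLayer:
    """Stands in for the Log, Analyze and Filter functions of layer 12."""

    archive: List[str] = field(default_factory=list)  # models mass storage 230

    def log(self, line: str) -> str:
        # Log: every incoming line is recorded, whether or not it is alarming.
        self.archive.append(line)
        return line

    @staticmethod
    def analyze(line: str) -> bool:
        # Analyze: flag a line if it contains any alert word (case-insensitive).
        lowered = line.lower()
        return any(word in lowered for word in ALERT_WORDS)

    def handle(self, lines: Iterable[str]) -> List[str]:
        # Filter: only flagged lines are passed up to the first layer.
        return [line for line in map(self.log, lines) if self.analyze(line)]


def first_layer_notify(filtered: Iterable[str]) -> List[str]:
    """Stands in for the Notify function of layer 11."""
    return [f"NOTIFY: {line}" for line in filtered]
```

Logging everything but forwarding only flagged lines mirrors the stated aim of reducing the amount of reporting information that reaches the administration terminal while still keeping the full record for later reports.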

Lotus Collection Agent

Referring to FIG. 6, each Notes application server 50-70 runs Notes 910 on an operating system 900 such as AIX, OS/2 or NT. In each Notes application server 50-70, Notes 910 comprises a NOTES.INI file 911 and a number of Notes tasks 913-916. The tasks 913-916 include a Router task 913, a Replicator task 914 and a Reporter task 915. Each Notes application server 50-70 also includes a Notes collection agent 912, which operates as a task within Notes 910. The NOTES.INI file defines the tasks to be started in Notes 910 when the host Notes application server 50-70 boots. Because the Notes collection agent 912 is specified in the NOTES.INI file, it is active whenever the host application server 50-70 is operational. When active, the Notes collection agent 912 enables the DSM server 10 to communicate with the Notes application servers 50-70 using Notes protocols. Each Notes application server 50-70 writes its operational statistics to a database file called LOG.NSF. At definable intervals, the Notes collection agent 912 copies information from LOG.NSF to an intermediate database, from which the information is formatted and mailed to the DSM server 10 via the mail layer 41, as shown at 110 in FIG. 5. The information collected from each Notes application server 50-70 falls into one of four categories: log data, server tasks, e-mail routing, and replication. For each category, the user can select one of the following options.

(1) Collect and process all documents in the category (default).

(2) Collect all documents of this type now, but process them later.

(3) Discard all documents in this category.

The user can change the action for a document category at any time. The Notes collection agent 912 maintains a time and date stamp recording the last successful poll of the LOG.NSF file. This stamp is written to the NOTES.INI file 911. If one of the tasks 913-916 is started with no time stamp or with an invalid time stamp, the Notes collection agent 912 generates a time stamp before processing data. The parameters used by the Notes collection agent 912 can be viewed and configured through each Notes server 50-70. Statistics are also recorded in the STATREP database and routed to the DSM server 10 for processing, as shown by mail flow 110 in FIG. 5.
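The three per-category options and the last-poll time stamp can be combined in a small Python sketch. Everything named here is hypothetical: the enum, the tuple layout of a document, and in particular the use of the time stamp to skip documents written before the last successful poll, which is an assumption about why the stamp is kept rather than something the specification states.

```python
from datetime import datetime
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class CategoryAction(Enum):
    PROCESS_NOW = 1        # option (1), the default
    COLLECT_FOR_LATER = 2  # option (2)
    DISCARD = 3            # option (3)


# (category, time written, body); a hypothetical stand-in for a LOG.NSF document.
Document = Tuple[str, datetime, str]


def poll_log(documents: Iterable[Document],
             actions: Dict[str, CategoryAction],
             last_poll: datetime) -> Tuple[List[Document], List[Document], datetime]:
    """Split new documents by category action and return the updated time stamp."""
    to_process: List[Document] = []
    held_back: List[Document] = []
    newest = last_poll
    for doc in documents:
        category, written_at, _body = doc
        if written_at <= last_poll:
            continue  # assumed: already seen on an earlier poll
        newest = max(newest, written_at)
        action = actions.get(category, CategoryAction.PROCESS_NOW)
        if action is CategoryAction.PROCESS_NOW:
            to_process.append(doc)
        elif action is CategoryAction.COLLECT_FOR_LATER:
            held_back.append(doc)
        # DISCARD: the document is dropped, but the stamp still advances.
    return to_process, held_back, newest
```

Categories with no explicit entry in `actions` fall back to immediate processing, matching the stated default.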

cc:Mail Collection Agent

The application server 40 acts as a cc:Mail post office router. To the application server 40, the DSM server 10 appears as a peer post office via the cc:Mail MTA 16. The application server 40 also includes a cc:Mail collection agent. The cc:Mail collection agent and the cc:Mail MTA 16 cooperate so that cc:Mail log files can be mailed by the application server 40 to the DSM server 10. The cc:Mail collection agent is similar in function to the Notes collection agent 912 described above. In operation, the cc:Mail MTA 16 and the cc:Mail collection agent cooperate to collect cc:Mail router log data from the application server 40 and to route that log data to the DSM server 10 without interrupting normal router functions. This process runs as a cc:Mail call list entry, through which the cc:Mail logs are collected and delivered to the DSM server 10 at predetermined intervals. The DSM server 10 can therefore process these log files off-line from the cc:Mail message router service provided by the application server 40.

DSM Server Functions

Referring now to FIG. 7, in operation the DSM server 10 acts as a mid-level system manager for managing the activities of the application servers 40-70. To facilitate this management, the first layer 11 of the DSM architecture includes the following process functions:

(a) a monitor function 540;

(b) a server probe function 550;

(c) an e-mail probe function 560; and

(d) a database replication tracking function 570.

The functions 540-570 are performed by the CPU 220 when it is configured by corresponding software held in the mass storage device 230 of the DSM server 10. It will be appreciated that, in other embodiments of the present invention, similar functions may be provided by hardware or by a combination of hardware and software.

Referring again to FIG. 2, the log files maintained by the application servers 40-70 are directed to the DSM server 10, as indicated generally by communication path 80. The DSM server 10 stores these log files in the mass storage device 230. The log files for each application server 40-70 are stored on a separate disk of the mass storage device 230.

Referring to FIG. 7, on receipt of the log files, the monitor function 540 of the DSM server 10 filters the messages contained in each log. The filtered messages are sent from the DSM server 10 to the terminal 20 for display to an administrator, as indicated generally at 90 in FIG. 2. The server probe function 550 and the e-mail probe function 560 automatically adjust selected ones of the application servers 40-70.

Integrating the functions 540-570 in the DSM server 10 enables the DSM server 10 to analyze the log files generated by the application servers 40-70, to identify operational problems within the application servers 40-70, to identify message routing problems within the environment, and to store the log files of the application servers 40-70.

Each function 540-570 is associated with one or more of a number of input tables 500-530. Specifically, input table 500 is associated with the server probe function 550 and the e-mail probe function 560; input table 510 is associated with the monitor function 540; input table 520 is associated with the server probe function 550; and input table 530 is associated with the database replication tracking function 570. The outputs of the functions 540-570 are delivered to a database 580 stored in the mass storage device 230 and to the management terminal 20. The database replication tracking function 570 also monitors databases 610, 600 and 590 stored in the mass storage device 230. The input tables 500-530 contain parameters and thresholds specified by system management staff. These parameters and thresholds relate to the boundaries or levels of service agreed with end users. For example, if mail is expected to be delivered within three minutes under a service level agreement, that level is set as a threshold in the probe input table 500 for input to the e-mail probe function 560. If mail delivery exceeds this time, the e-mail probe generates an alert. A report of the incident is then recorded in an output report database stored in the mass storage device 230.

Monitor Function

Conventionally, log files from application servers are rarely examined or acted upon in full in practical environments because, as mentioned earlier, doing so manually is time consuming. Referring again to FIG. 1, as described above, all server log files are routed from the application servers 40-70 to the DSM server 10. The DSM server 10 includes a knowledge base stored in the mass storage device 230. The knowledge base enables the monitor function 540 of the DSM server 10 to analyze the contents of the log files received from the application servers 40-70 and to determine which action is appropriate in each case. The analysis is a continuous, ongoing activity in which the DSM server 10 acts on behalf of the administrator on the information analyzed by the monitor function 540, filtering and relaying messages from the application servers 40-70 to the management terminal 20. The monitor function 540 can be configured to pass only selected items from the log files to the management terminal 20. This feature can be used to keep purely informational messages from reaching the management terminal 20, so that management staff can respond more quickly to critical messages.
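The filtering role of the monitor function can be illustrated with a minimal sketch. The specification does not disclose the format of the knowledge base, so the rule table below (regular expression, severity, action), the sample log lines, and the names `KNOWLEDGE_BASE`, `classify`, and `filter_log` are all hypothetical.

```python
import re

# Hypothetical knowledge base: (pattern, severity, action) rules.
# Severity 1 is the most serious; unmatched lines are treated as informational.
KNOWLEDGE_BASE = [
    (re.compile(r"unable to connect", re.I), 1, "alert"),
    (re.compile(r"router: .*dead mail", re.I), 2, "alert"),
    (re.compile(r"session (opened|closed)", re.I), 5, "ignore"),
]


def classify(line):
    """Return (severity, action) for one log line using the first matching rule."""
    for pattern, severity, action in KNOWLEDGE_BASE:
        if pattern.search(line):
            return severity, action
    return 5, "ignore"


def filter_log(lines, max_severity=2):
    """Keep only the lines worth relaying to the management terminal."""
    forwarded = []
    for line in lines:
        severity, action = classify(line)
        if action == "alert" and severity <= max_severity:
            forwarded.append((severity, line))
    return forwarded
```

Lowering `max_severity` narrows what reaches the terminal, which corresponds to configuring the monitor function to pass only selected items.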

Error messages contained in the log files from the application servers 40-70 (for example, communication, router, security, resource, and server environment error messages) are captured by the DSM server 10 and reported via Notes mail, Simple Network Management Protocol (SNMP) trap protocol data units (PDUs), and logging to a Notes database. When such alerts are processed, control of the application servers 40-70 can be passed to predefined user exits associated with the DSM server 10.

Each application server 40-70 corresponds to a different dedicated disk in the mass storage device 230 of the DSM server 10. The dedicated disk is used to record all information relating to the corresponding server 40-70. Specifically, the DSM server 10 organizes the information in a standard format using sub-directories named DATA, STATUS and REPORTS.

A summary status file is generated by the DSM server 10 and stored in the STATUS sub-directory for access by management staff who wish to review the recent activity of the application servers 40-70. The graphical user interface (GUI) provided by Notes 4 allows management staff to view, through a database navigator, all the information collected by the DSM server 10 from the application servers 40-70.

Information from the server logs and statistics can be summarized weekly and monthly to give administrators the information they need to manage current data processing requirements and to plan for future needs. The information is held in Notes databases in the form of Notes statistics, operating system statistics, network statistics, and response time summaries. Application server log files and statistics can be archived automatically, once a month, to a desired destination via the DSM server 10.

Server Probe Function

The server probe function 550 monitors each Notes application server 50-70, via its Notes collection agent 912, to ensure that the Notes tasks 913-916 specified in the NOTES.INI file of that server are active. Any task 913-916 that is not running (possibly because of a failure) is started automatically by the server probe function 550. In addition, once a day, the server probe function 550 records and summarizes the response time from the DSM server 10 to each Notes application server 50-70, via the Notes collection agent 912 in that server.

If any Notes application server 50-70 goes off-line for any reason, the server probe function 550 raises a severity 1 alert. The severity 1 alert is sent by the DSM server 10 to the management terminal 20. The server probe function 550 continuously checks, via the Notes collection agent 912, that the specified tasks 913-916 are active and functioning correctly. If there is a problem with a task 913-916, the server probe function 550 automatically attempts to restart that task via the Notes collection agent 912. After a predefined number of failed attempts to restart a particular task 913-916, the server probe function 550 routes an alert to the management terminal 20. Failures of a host application server 50-70 are recorded by the DSM server 10 whether they occur within or outside the agreed service hours.

The server probe function 550 will now be described with reference to the flowchart of FIG. 8. In operation, the server probe function 550 issues commands to a target application server 50-70 using the Notes protocol. The commands issued by the server probe function 550 are processed within Notes 910 by the Notes collection agent 912 of the target application server 50-70. Initially, at block 1000, the server probe function 550 resets a restart count to zero. Next, at block 1010, the server probe function 550 sends a show task configuration command to the Notes collection agent 912. The show task configuration command captures, from the NOTES.INI file in the target application server 50-70, which of the tasks 913-916 should be active. At block 1020, the server probe function 550 sends a show active tasks command to the target application server 50-70. The show active tasks command captures which of the tasks 913-916 are active on the target application server 50-70. At block 1030, the server probe function 550 compares the task configuration with the active tasks. If the task configuration is the same as the active tasks, the server probe function 550 ends at block 1080. If the task configuration differs from the active tasks, indicating that one or more of the tasks 913-916 has failed, then, at block 1040, the server probe function 550 determines whether the restart count equals a predetermined threshold number of attempts to restart failed tasks. If so, at block 1050, the server probe function 550 generates an alert message for delivery to the management terminal 20. If not, at block 1060, the server probe function 550 increments the restart count and, at block 1070, attempts to restart the tasks that are specified in the task configuration but were reported as inactive. The server probe function 550 continues around the loop defined by blocks 1020, 1030, 1040, 1060 and 1070 until all required tasks are active or the threshold is reached.
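The flow of FIG. 8 can be sketched as a short loop. This is a minimal illustration, not the patented implementation: the three callables stand in for commands sent through the Notes collection agent, the default of three restart attempts is an arbitrary assumption, and "differs" is interpreted here as "a configured task is missing from the active list".

```python
def server_probe(show_task_configuration, show_active_tasks, restart_task,
                 max_restarts=3):
    """Return None when all configured tasks are active, else an alert string."""
    restart_count = 0                              # block 1000
    configured = set(show_task_configuration())    # block 1010
    while True:
        active = set(show_active_tasks())          # block 1020
        missing = configured - active              # block 1030
        if not missing:
            return None                            # block 1080: nothing to do
        if restart_count == max_restarts:          # block 1040
            # block 1050: give up and alert the management terminal
            return f"ALERT: tasks not running: {sorted(missing)}"
        restart_count += 1                         # block 1060
        for task in sorted(missing):               # block 1070
            restart_task(task)
```

The task configuration is read once, outside the loop, while the active task list is re-read on every pass, matching the loop over blocks 1020 to 1070.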

E-mail Probe Function

The e-mail probe function 560 tests mail routes in the network of Notes application servers 50-70 by measuring, against a predefined threshold, the time taken for a test message in the form of a Lotus Notes document to complete a return trip to a reflecting server 50-70. An example of a test report generated by the e-mail probe function 560 is provided in Appendix A. The e-mail probe function 560 generates an alert if the threshold is exceeded. The e-mail probe function 560 also generates a report that includes the time spent passing through each e-mail application server along the route. Specifically, the e-mail probe captures the local date and time from each server 50-70 on entry and on exit. The entry and exit local date and time for each server are recorded in the Notes document that forms the test message. In some mail applications, it is important for management staff to know how long mail takes to be delivered and whether there are mail delivery problems. The e-mail probe function converts problems arising in the Notes mail network into alerts for transmission to the management terminal 20. The e-mail probe function 560 also generates mail tracking reports automatically.
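A minimal sketch of the timing report follows. The function name `mail_probe_report`, the tuple layout of `hops`, and the returned dictionary are hypothetical; the three-minute default simply reuses the service level example given earlier in the description.

```python
from datetime import datetime, timedelta


def mail_probe_report(hops, sent, returned, threshold=timedelta(minutes=3)):
    """Summarize one test message.

    hops: list of (server, entry_time, exit_time), each pair stamped with that
          server's own local clock.
    sent, returned: departure and arrival times measured on the probing server.
    """
    per_server = [
        (server, (exit_time - entry_time).total_seconds())
        for server, entry_time, exit_time in hops
    ]
    round_trip = returned - sent
    return {
        "per_server_seconds": per_server,
        "round_trip_seconds": round_trip.total_seconds(),
        "alert": round_trip > threshold,
    }
```

Because each server's transit time is computed from two stamps taken on the same local clock, clock or time zone differences between servers do not distort the per-server figures; only the round trip relies on the probing server's clock.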

Replication Tracking Function

In some Notes applications, it is important that data stored in a database is shadowed between different application servers 50-70. Such shadowing can be achieved through the Notes replication task 914. The Notes replication tracking function 570 of the DSM server 10 checks the databases on the application servers 40-70 to confirm that they are synchronized after the Notes replication task 914 has been run by two or more Notes application servers 50-70. If the databases are not synchronized, the DSM server 10 sends an alert to the management terminal 20. The Notes replication tracking function 570 verifies that databases with the same replica ID have the same contents after Notes replication activity has taken place.

For example, referring to FIG. 1, suppose that a database ABC.NSF is stored on application server 50 and has been replicated onto application server 60 by the replication task 914. Application server 50 and application server 60 then both hold a copy of ABC.NSF. Now suppose that a client user connected to application server 50 is modifying the copy of ABC.NSF stored on application server 50 while, at the same time, a client user connected to application server 60 is modifying the copy of ABC.NSF stored on application server 60. The replication task 914 on application server 50 periodically replicates the modified copy of ABC.NSF on application server 60. Likewise, the replication task 914 on application server 60 periodically replicates the modified copy of ABC.NSF on application server 50. The frequency of replication can be preset according to user requirements. For example, if the database contains relatively important information that is frequently modified by client users, correspondingly frequent replication may be appropriate. Conversely, if the information in the database is less important, replication can be set to occur less often. From a system management point of view, it is desirable to ensure that the replication tasks on application servers 50 and 60 are set to replicate ABC.NSF often enough to keep pace with the regularity with which the end user clients of application servers 50 and 60 independently modify ABC.NSF. This problem is addressed by the replication tracking function 570 of the DSM server 10.

Referring to FIG. 9, in operation, the replication tracking function 570 is initialized at block 1100 by setting a SAMPLING INTERVAL to the desired number of samples to be taken from the copies of ABC.NSF on application servers 50 and 60, setting a TARGET COUNT to the number of matches to be found within the sampling interval between the copies of ABC.NSF on application servers 50 and 60, and resetting the running totals HIT COUNT and SAMPLE COUNT to zero. At block 1110, the replication tracking function 570 samples both the ABC.NSF stored on application server 50 and the ABC.NSF stored on application server 60. At block 1120, the replication tracking function 570 compares the two samples. If the two samples match, the replication tracking function 570 increments the hit count at block 1130 and proceeds to block 1140. If the two samples do not match, the replication tracking function 570 proceeds directly to block 1140, where the sample count is incremented. At block 1150, the replication tracking function 570 compares the sample count with the sampling interval. If the sample count does not equal the sampling interval, the replication tracking function 570 returns to block 1110 to collect the next pair of samples. If the sample count has reached the sampling interval, the replication tracking function 570 compares the hit count with the target count at block 1160. If the hit count is less than the target count, then, at block 1170, the replication tracking function 570 issues an alert to the management terminal indicating that the copies of ABC.NSF stored on application servers 50 and 60 are not synchronized. Otherwise, the replication tracking function 570 terminates.
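The counting logic of FIG. 9 can be sketched as below. This is an illustrative reading only: `sample_pair` is a hypothetical callable that returns one sample from each copy of the database, and how a "sample" is drawn from an NSF file is not specified in the text, so it is left abstract here.

```python
def replication_tracking(sample_pair, sampling_interval, target_count):
    """Return None if the copies look synchronized, else an alert string."""
    hit_count = 0                                  # block 1100
    sample_count = 0
    while True:
        sample_a, sample_b = sample_pair()         # block 1110
        if sample_a == sample_b:                   # block 1120
            hit_count += 1                         # block 1130
        sample_count += 1                          # block 1140
        if sample_count == sampling_interval:      # block 1150
            break
    if hit_count < target_count:                   # block 1160
        # block 1170: report to the management terminal
        return (f"ALERT: copies not synchronized "
                f"({hit_count} of {sample_count} samples matched)")
    return None
```

Setting the target count below the sampling interval tolerates a few mismatches, for instance samples taken while a replication cycle is still in progress; setting the two equal demands that every sample match.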

It will be appreciated that the replication tracking function 570 can be used to track the replication of more than one database. Likewise, it will be appreciated that the replication tracking function 570 can be used to track the replication of more than two copies of each database. It will also be appreciated that the replication tracking function 570 can apply different test parameters (for example, sampling interval and target count) to different databases or groups of databases. It will further be appreciated that, depending on customer requirements, the samples making up a sampling interval may be taken over a relatively short period of time or over a relatively long period of time.

Application Server Log Data Storage

In embodiments of the present invention, because the log files are held in the mass storage device 230 of the DSM server 10 rather than on the application servers 40-70, the application servers 40-70 can devote more of their resources to client activity.

DSM Server Hierarchy and Scalability

The DSM server 10 can operate as a module. Therefore, to accommodate a large number of different application servers, a plurality of DSM servers can be used, each serving a different group of application servers. Referring now to FIG. 10, an example of such an arrangement includes a number of application servers 400-450 and a number of DSM servers 460-480. Each DSM server 460-480 is connected to a different group of the application servers 400-450. The DSM servers 460-480 are each connected to a master DSM server 490. In operation, all MTA communication with the application servers 400-450 is handled by the DSM servers 460-480, whereas the DSM servers 460-480 communicate with the master DSM server 490 via the Notes MTA only. In the example of the present invention described above with reference to FIG. 10, there are three DSM servers 460-480. In other embodiments of the present invention, however, there may be only two DSM servers reporting to the master DSM server 490. Likewise, in other embodiments of the present invention, there may be more than three DSM servers, reporting either to the master DSM server 490 or to one or more further layers of intermediate DSM servers arranged in a hierarchy terminating at the master DSM server 490.

Typically, operations centers are consolidated into one or two geographical locations for cost reasons. This means that system management control is exercised over a relatively large area from a relatively small number of operations centers. The arrangement shown in FIG. 10 is particularly suited to this scenario.

In summary, what has been described above by way of example of the present invention is apparatus for managing a distributed data processing system having a plurality of application server computer systems interconnected via a network, in which each application server has a database application comprising a database, communicates data from the database via the network to a plurality of client computer systems and to other application servers, and generates a log file recording the occurrence of such data communications. The apparatus stores, in memory, the log files received from the application servers via the network; generates a first event message on detecting an alert condition in the stored log files; generates a second event message on detecting an absence of database replication by the database applications in one or more of the application servers; generates a third event message on detecting a failure condition in the database application in one or more of the application servers; and generates a fourth event message on detecting a failure condition in one or more of the connections between the application servers.

The apparatus and method for managing a distributed data processing system according to the present invention, as described above, can eliminate or at least reduce application server failures and can improve the automation of server management in a distributed environment.

Appendix A

Claims

Apparatus for managing a distributed data processing system comprising a plurality of application server computer systems interconnected via a network, each application server having a database application comprising a database, means for communicating data from the database via the network to a plurality of client computer systems and to other application servers, and means for generating a log file recording the occurrence of said data communications, the apparatus comprising:

storage means;

monitor means for storing, in the storage means, the log files received from the application servers via the network, and for generating a first event message on detecting an alert condition in the stored log files;

replication tracking means for generating a second event message on detecting an absence of database replication by the database applications in one or more of the application servers;

server probe means for generating a third event message on detecting a failure condition in the database application in one or more of the application servers; and

mail probe means for generating a fourth event message on detecting a failure condition in one or more of the connections between the application servers.

The apparatus of claim 1, further comprising:

a management terminal having means for displaying the first, second, third and fourth event messages.

The apparatus of claim 2,

wherein the monitor means comprises a filter for selectively communicating the event messages to the management terminal.

The apparatus of claim 1, 2 or 3,

wherein the replication tracking means comprises:

sampling means for sampling the database of each application server;

first comparison means for determining whether corresponding samples match each other; and

second comparison means for generating the second event message if the number of matches identified by the first comparison means is less than a predefined acceptance threshold.
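The sample, compare and threshold sequence recited for the replication tracking means can be sketched as follows. This is an illustrative sketch only: `count_matches` and `check_replication` are hypothetical names, and it assumes that each server's samples arrive as an ordered list in which position i on one server corresponds to position i on every other server.

```python
def count_matches(samples_by_server):
    """Count sample positions at which every server's copy holds the same value.

    samples_by_server: dict mapping server name -> ordered list of samples
    (assumed layout; the source does not specify how samples are represented).
    """
    copies = list(samples_by_server.values())
    if not copies:
        return 0
    return sum(1 for group in zip(*copies) if len(set(group)) == 1)


def check_replication(samples_by_server, acceptance_threshold):
    """Return a second-event description if too few samples match, else None."""
    matches = count_matches(samples_by_server)
    if matches < acceptance_threshold:
        return f"second event: only {matches} corresponding samples match"
    return None
```

For example, with two servers whose three samples agree in two positions, a threshold of 3 produces an event, while a threshold of 2 does not.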

The apparatus of claim 1, 2, 3 or 4,

wherein the server probe means comprises restart means for making successive attempts to resolve the fault condition, and means for generating the third event message after a predetermined number of failed attempts by the restart means to resolve the fault condition.
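The retry-then-report behaviour of the server probe means can be sketched in a few lines. This is a minimal sketch under stated assumptions: `probe_server`, `is_healthy` and `restart` are hypothetical names, and health checking and restarting are supplied by the caller as callables because the text does not say how either is performed.

```python
def probe_server(is_healthy, restart, max_attempts=3):
    """Try to clear a fault by restarting; report only after repeated failure.

    is_healthy: hypothetical callable returning True when no fault is present.
    restart: hypothetical callable that attempts to resolve the fault.
    Returns None if the server is (or becomes) healthy, otherwise a
    description of the third event message.
    """
    if is_healthy():
        return None
    for _ in range(max_attempts):
        restart()
        if is_healthy():
            return None
    return f"third event: fault persists after {max_attempts} restart attempts"
```

Deferring the event until the predetermined number of attempts has failed means that faults the restart means can clear on its own never reach the operator.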

The apparatus of claim 1, 2, 3, 4 or 5,

wherein the mail probe means comprises:

means for sending a test message to a reflecting one of the application servers via others of the application servers;

means for receiving the test message from the reflecting application server via others of the application servers; and

means for recording the time taken for the test message to pass through the reflecting application server and the time taken for it to pass through each application server en route to and from the reflecting application server.
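The round trip recited for the mail probe means (out through intermediate servers, through the reflecting server, and back) can be sketched as a per-server timing record. This is an illustrative sketch only: `trace_round_trip` and `transit_time` are hypothetical names, the return route is assumed to be the outbound route reversed, and treating a lost message as the fault condition is an assumption, since the claim only recites recording the times.

```python
def trace_round_trip(route, reflector, transit_time):
    """Record per-server transit times for a test message sent out and back.

    route: intermediate servers on the way to the reflector, in order.
    reflector: the reflecting application server.
    transit_time: hypothetical callable(server) -> seconds taken to pass
    through that server, or None if the message was lost there.
    Returns (timings, event) where event is None on success.
    """
    # Assumption: the message returns along the same servers in reverse order.
    path = list(route) + [reflector] + list(reversed(route))
    timings = []
    for server in path:
        seconds = transit_time(server)
        if seconds is None:
            return timings, f"fourth event: test message lost at {server}"
        timings.append((server, seconds))
    return timings, None
```

Recording a time for every server on the path, rather than one end-to-end figure, is what lets a slow or failed connection be attributed to a particular server.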

A distributed data processing system comprising:

a plurality of application server computer systems interconnected via a network, each application server comprising a database application including a database, means for communicating data from the database via the network to a plurality of client computer systems and to other application servers, and means for generating a log file recording occurrences of the data communication; and

an apparatus as claimed in any preceding claim.

A method for managing a distributed data processing system comprising a plurality of application server computer systems interconnected via a network, each application server comprising a database application including a database, means for communicating data from the database via the network to a plurality of client computer systems and to other application servers, and means for generating a log file recording occurrences of the data communication, the method comprising the steps of:

(a) storing log files received from the application servers via the network in memory means of a management server computer system;

(b) generating, by the management server, a first event message on detection of an alert condition in the stored log files;

(c) generating, by the management server, a second event message on detecting an absence of database replication by the database application in one or more of the application servers;

(d) generating, by the management server, a third event message on detection of a fault condition in the database application in one or more of the application servers; and

(e) generating, by the management server, a fourth event message on detection of a fault condition in one or more connections between the application servers.

The method of claim 8,

wherein step (b) comprises:

(f) filtering alert conditions in the data file.

The method of claim 8 or 9,

wherein step (c) comprises:

(g) sampling each copy of the database;

(h) determining whether corresponding samples match each other; and

(i) generating the second event message if the number of matches identified by the first comparison means is less than a predefined acceptance threshold.

The method of any one of claims 8 to 10,

wherein step (d) comprises:

(j) making successive attempts to resolve the fault condition; and

(k) generating the third event message after a predetermined number of failed attempts to resolve the fault condition.

The method of any one of claims 8 to 11,

wherein step (e) comprises:

(l) sending a test message to a reflecting one of the application servers via others of the application servers;

(m) receiving the test message from the reflecting application server via others of the application servers; and

(n) recording the time taken for the test message to pass through the reflecting application server and the time taken for it to pass through each application server en route to and from the reflecting application server.