CN113014445B - Operation and maintenance method, device and platform for server and electronic equipment - Google Patents

Publication number
CN113014445B
Authority
CN
China
Prior art keywords
server
maintenance
node
version
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110173331.5A
Other languages
Chinese (zh)
Other versions
CN113014445A (en)
Inventor
张凯
李海英
王迪
林男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110173331.5A
Publication of CN113014445A
Application granted
Publication of CN113014445B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/50: Testing arrangements
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H04L43/10: Active monitoring, e.g. heartbeat, ping or trace-route

Abstract

The present disclosure provides an operation and maintenance method for a server, which relates to the field of financial technology and other technical fields. The method comprises the following steps: screening at least one server from a pre-configured server list as an operation and maintenance object for the current task; and executing the operation and maintenance operations related to the current task on the at least one server in batch. The disclosure also provides an operation and maintenance device for a server, an operation and maintenance platform, an electronic device, a computer-readable storage medium and a computer program product.

Description

Operation and maintenance method, device and platform for server and electronic equipment
Technical Field
The present disclosure relates to the field of financial technologies and other technologies, and in particular, to an operation and maintenance method for a server, an operation and maintenance device for a server, an operation and maintenance platform, an electronic device, a computer-readable storage medium, and a computer program product.
Background
The operation and maintenance of a server generally includes: performing technical maintenance on the server, such as hardware configuration, software installation, and racking and unracking in the machine room; testing the working environment of the server; configuring, managing, routinely monitoring, health-checking, diagnosing, and handling faults on physical machines that run virtualization products; checking a program version used in actual production to determine whether it runs normally; and inspecting the installation log generated by the program version under test to find error messages.
In implementing the concept of the present disclosure, the inventors found at least the following problem in the related art: operation and maintenance personnel usually execute operation and maintenance operations through a command line or a script, and therefore cannot execute them on a plurality of operation and maintenance objects simultaneously in batch.
Disclosure of Invention
In view of the above, the present disclosure provides an operation and maintenance method for a server, an operation and maintenance device for a server, an operation and maintenance platform, an electronic device, a computer-readable storage medium, and a computer program product.
One aspect of the present disclosure provides an operation and maintenance method for a server, including: screening at least one server from a pre-configured server list as an operation and maintenance object for the current task; and executing the operation and maintenance operations related to the current task on the at least one server in batch.
According to an embodiment of the present disclosure, the at least one server is screened out from the server list based on a filtering condition input by the user.
According to an embodiment of the present disclosure, the at least one server is screened out from the server list based on a filtering condition input by the user through a graphical user interface.
According to an embodiment of the present disclosure, executing the operation and maintenance operations related to the current task on the at least one server in batch includes: for each server in the at least one server, determining all test objects contained in the current server; and for each test object, performing the following operations: acquiring a handle associated with the current test object; and automatically accessing the current test object through the handle to detect whether the current test object currently works normally.
According to an embodiment of the present disclosure, the method further includes performing the following operations for each test object: acquiring a browser driver; and starting a corresponding browser based on the browser driver, wherein the current test object is automatically accessed in the browser through the handle.
According to an embodiment of the present disclosure, executing the operation and maintenance operations related to the current task on the at least one server in batch includes: determining all service nodes related to the at least one server, wherein all the service nodes are back-end nodes of an Nginx service node; and for each of the service nodes, performing the following operations: performing an anomaly detection operation on the current node at intervals of a first preset time to obtain a corresponding detection result; and in response to the detection result indicating that the current node is in an abnormal state, adding the node information of the current node to an unhealthy node list and prohibiting the current node from continuing to provide services.
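By way of illustration only, the periodic anomaly detection described above might be sketched as follows. The names `NodePool`, `probe_once`, and `run_detection`, as well as the injected `is_abnormal` check, are hypothetical and are not taken from the disclosure; how an abnormal state is actually detected is left to the caller.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class NodePool:
    """Tracks which back-end nodes may currently serve traffic (hypothetical)."""
    healthy: List[str] = field(default_factory=list)
    unhealthy: List[str] = field(default_factory=list)

    def mark_unhealthy(self, node: str) -> None:
        # Move the node to the unhealthy list so that it is no longer used.
        if node in self.healthy:
            self.healthy.remove(node)
        if node not in self.unhealthy:
            self.unhealthy.append(node)


def probe_once(pool: NodePool, is_abnormal: Callable[[str], bool]) -> None:
    """Run one detection round over every node that is still in service."""
    for node in list(pool.healthy):  # iterate over a copy; the pool is mutated
        if is_abnormal(node):
            pool.mark_unhealthy(node)


def run_detection(
    pool: NodePool,
    is_abnormal: Callable[[str], bool],
    interval_s: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Repeat the detection round, waiting the first preset time between rounds."""
    for _ in range(rounds):
        probe_once(pool, is_abnormal)
        sleep(interval_s)
```

In this sketch the sleep function is injectable so that the loop can be exercised without real waiting; a production scheduler would likely run indefinitely rather than for a fixed number of rounds.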
According to an embodiment of the present disclosure, the method further includes: in response to the detection result indicating that the current node is in an abnormal state, executing a system restart operation or a master/slave node switching operation on the current node.
According to an embodiment of the present disclosure, the method further includes: in the case where the system restart operation has been executed on the nodes in the unhealthy node list, testing, through a Ping command at intervals of a second preset time, whether each node in the unhealthy node list has recovered to a healthy state, to obtain a corresponding test result; and adding the node information of each node that the test result indicates has recovered to a healthy state back to a healthy node list, and re-enabling that node so that it continues to provide the corresponding services.
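As a non-limiting sketch, one recovery round over the unhealthy node list could look like the following. The functions `is_reachable` and `recover_nodes` are hypothetical names; `is_reachable` assumes a Unix-like `ping` that accepts `-c 1` (one echo request), which is an assumption about the host system rather than something stated in the disclosure.

```python
import subprocess
from typing import Callable, List, Tuple


def is_reachable(host: str, timeout_s: float = 3.0) -> bool:
    """Send one ICMP echo request; assumes a Unix-like `ping -c 1`."""
    try:
        completed = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply in time, or the ping binary is unavailable.
        return False
    return completed.returncode == 0


def recover_nodes(
    healthy: List[str],
    unhealthy: List[str],
    ping: Callable[[str], bool] = is_reachable,
) -> Tuple[List[str], List[str]]:
    """Return updated (healthy, unhealthy) lists after one recovery round."""
    new_healthy = list(healthy)
    still_unhealthy = []
    for node in unhealthy:
        if ping(node):
            new_healthy.append(node)      # recovered: put back into service
        else:
            still_unhealthy.append(node)  # keep testing in the next round
    return new_healthy, still_unhealthy
```

A caller would invoke `recover_nodes` once per second preset time interval; passing a custom `ping` callable allows a different reachability check to be substituted.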
According to an embodiment of the present disclosure, the method further includes: verifying the content of the program version to be delivered, to determine the accuracy of the program version.
According to an embodiment of the present disclosure, the program version to be delivered is the current version, and the previous version related to the current version is a program version that has already been delivered for use. Verifying the content of the program version to be delivered includes: acquiring a first image name corresponding to the current version and a second image name corresponding to the previous version; determining whether a first image file matching the first image name and a second image file matching the second image name exist in an image repository; in response to determining that the first image file and the second image file exist in the image repository, finding the first image file and the second image file and comparing the two to determine at least one of the following program lists involved in the first image file: newly added programs, deleted programs, and modified programs; and obtaining the comparison result and determining, based on it, whether the current version is accurate.
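For illustration, the comparison step might be sketched as below, under the assumption that each image file has already been reduced to a mapping from program path to a content digest. That representation, and the name `compare_program_lists`, are hypothetical; the disclosure does not specify how the two image files are compared.

```python
from typing import Dict, List


def compare_program_lists(
    current: Dict[str, str], previous: Dict[str, str]
) -> Dict[str, List[str]]:
    """Compare two {program path: content digest} mappings.

    Returns the newly added, deleted, and modified programs of the
    current version relative to the previous version.
    """
    current_paths = set(current)
    previous_paths = set(previous)
    return {
        "added": sorted(current_paths - previous_paths),
        "deleted": sorted(previous_paths - current_paths),
        "modified": sorted(
            path
            for path in current_paths & previous_paths
            if current[path] != previous[path]
        ),
    }
```

Sorting the three lists keeps the comparison result deterministic, which makes it easier to compare against an expected change list.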
According to an embodiment of the present disclosure, finding the first image file and the second image file includes: starting a first container and a second container, where the first container is used to run the first image file and the second container is used to run the second image file; finding, based on the first container, a first folder where the first image file is located; finding the first image file in the first folder; finding, based on the second container, a second folder where the second image file is located; and finding the second image file in the second folder.
According to an embodiment of the present disclosure, in response to the comparison result indicating that the content actually updated in the current version is consistent with the content expected to be updated, the current version is determined to be accurate.
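A minimal sketch of this consistency check follows, assuming that both the actual comparison result and the expected update are expressed as dictionaries with the keys `added`, `deleted`, and `modified`. The key names and the function `version_is_accurate` are illustrative assumptions only.

```python
from typing import Dict, List

CHANGE_KINDS = ("added", "deleted", "modified")


def version_is_accurate(
    actual: Dict[str, List[str]], expected: Dict[str, List[str]]
) -> bool:
    """Return True when the actual changes match the expected changes exactly.

    Order within each list is ignored; a missing key means "no changes".
    """
    return all(
        set(actual.get(kind, [])) == set(expected.get(kind, []))
        for kind in CHANGE_KINDS
    )
```

With this definition, any unexpected addition, deletion, or modification, or any expected change that is missing, causes the version to be judged inaccurate.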
According to an embodiment of the present disclosure, the method further includes: by each of the servers in the server list, converting the logs generated by that server into JSON format and then storing the converted logs in an ES database; and providing personalized services that meet specific requirements for the logs stored in the ES database.
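The log conversion step might, for example, be sketched as follows. The log line layout (`YYYY-MM-DD HH:MM:SS LEVEL message`), the field names, and the function `log_line_to_json` are assumptions made for illustration; the disclosure does not define a log format, and storing the resulting document in the ES database is outside this sketch.

```python
import json
import re
from typing import Optional

# Assumed layout: "2021-02-05 10:00:00 ERROR install failed"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def log_line_to_json(line: str, host: str) -> Optional[str]:
    """Convert one raw log line into a JSON document, or None if unparseable."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    document = {"host": host, **match.groupdict()}
    return json.dumps(document, ensure_ascii=False)
```

Structured fields such as `level` and `host` are what would make keyword queries and aggregation over the stored logs practical.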
According to an embodiment of the present disclosure, the personalized service includes at least one of: a log classification and summarization service, a log aggregation analysis service, a keyword-based log query service, a data drill-down service, and a dashboard customization service.
Another aspect of the present disclosure provides an operation and maintenance device for a server, including: an operation and maintenance object screening module, configured to screen at least one server from a pre-configured server list as an operation and maintenance object for the current task; and an operation and maintenance operation batch execution module, configured to execute the operation and maintenance operations related to the current task on the at least one server in batch.
Yet another aspect of the present disclosure provides an operation and maintenance platform, including: a front-end device, configured to provide a graphical user interface so that a user can create and submit an operation and maintenance task; and a back-end framework, configured to implement the operation and maintenance method for the server described above.
Yet another aspect of the present disclosure provides an electronic device including: one or more processors; memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described above.
Yet another aspect of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.
Yet another aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the method as described above.
According to the embodiments of the present disclosure, one or more servers can be screened, according to actual operation and maintenance requirements, from a pre-configured full server list or a specific server list (for example, a list configured for all servers that need to be restarted at a specific moment) as the operation and maintenance objects for the current task, and the operation and maintenance operations related to the current task are then executed on those objects in batch. The same operation, or the same series of operations, can thus be executed on a plurality of different operation and maintenance objects in batch. This at least partially solves the technical problem in the related art that operation and maintenance operations executed through a command line or a script cannot be performed on a plurality of operation and maintenance objects simultaneously in batch, and thereby improves operation and maintenance efficiency.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario suitable for an operation and maintenance method, device, and platform for a server according to an embodiment of the present disclosure;
FIG. 2 schematically shows a structural block diagram of an operation and maintenance platform and a scene illustration of the operation and maintenance platform performing operation and maintenance on a server according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flowchart of an operation and maintenance method for a server according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flowchart of a specific implementation of an operation and maintenance method for a server according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a detailed implementation flowchart of executing operation and maintenance operations in batch, including testing whether a target Web application can run normally, according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a detailed implementation flowchart of executing operation and maintenance operations in batch, including testing whether a target Web application can run normally, according to another embodiment of the present disclosure;
FIG. 7 schematically illustrates a detailed implementation flowchart of executing operation and maintenance operations in batch, including anomaly detection for a service node, according to an embodiment of the present disclosure;
FIG. 8 schematically illustrates a detailed implementation flowchart of executing operation and maintenance operations in batch, including anomaly detection, anomaly repair, and re-enabling of a service node, according to another embodiment of the present disclosure;
FIG. 9 schematically illustrates the execution process of performing anomaly detection, anomaly repair, and re-enabling for each service node in the batch-executed operation and maintenance operations shown in FIG. 8;
FIG. 10 schematically illustrates a detailed implementation flowchart of the operation and maintenance method for a server according to another embodiment of the present disclosure, including verifying the content accuracy of the application version to be delivered;
FIG. 11 schematically shows a detailed implementation flowchart for finding a first image file and a second image file in an operation and maintenance method for a server according to an embodiment of the present disclosure;
FIG. 12 schematically illustrates a detailed implementation flowchart including automatic analysis of the installation logs generated in the test process of each Web application in an operation and maintenance method for a server according to an embodiment of the present disclosure;
FIG. 13 schematically illustrates a structural block diagram of an operation and maintenance device for a server according to an embodiment of the present disclosure; and
FIG. 14 schematically shows a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
Some block diagrams and/or flowcharts are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon for use by or in connection with an instruction execution system.
The embodiments of the present disclosure provide an operation and maintenance method for a server, and an operation and maintenance device and an operation and maintenance platform capable of implementing the method. The operation and maintenance method comprises the following steps: screening at least one server from a pre-configured server list as an operation and maintenance object for the current task; and executing the operation and maintenance operations related to the current task on the at least one server in batch.
The embodiments of the present disclosure describe the operation and maintenance method for a server as applied to the financial field by way of example; the method of the present disclosure can also be applied to other fields, which is not limited by the embodiments.
Fig. 1 schematically illustrates an application scenario suitable for an operation and maintenance method, apparatus, and platform for a server according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of an application scenario in which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, but does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the terminal devices 101, 102, 103 may perform data interaction with the server 105 and other servers (such as servers 106, 107, 108) or server clusters communicatively connected with the server 105 through the network 104 to receive or send messages and the like.
The network 104 is a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105 or between the terminal devices 101, 102, 103 and the servers 106, 107, 108. The network 104 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a finance-type application, a web browser application, a search-type application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only). Financial applications are for example: an online bank APP, an electronic wallet APP or other bank financing APP. The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, notebook computers, desktop computers, smart bands or other electronic devices, and the like.
The servers 105, 106, 107, 108 may be servers that provide various services. For example, the servers 105, 106, 107, 108 may be target servers for providing service support for financial applications accessed by users using the terminal devices 101, 102, 103; or the server 105 may be a reverse proxy server 105 providing service support for financial applications accessed by the user using the terminal devices 101, 102, 103, and the reverse proxy server 105 may be used to forward the user request to access the target servers 106, 107, 108. The servers 105, 106, 107, and 108 may analyze and process data such as a received user request, and feed back a processing result (for example, a web page, information, or data obtained or generated according to the user request) to the terminal devices 101, 102, and 103.
For example, when a user accesses a financial Web service application through a browser, the browser sends an HTTP request to a Web server. After receiving the request, the server parses it and sends the parsed request to the Web back-end framework. The back-end framework processes the request (for example, through database interaction and service processing) and, when processing is finished, returns an HTTP response object to the server. Finally, the server returns the received HTTP response to the browser, and the browser obtains the desired page.
The user can also perform installation or update of various client applications through the terminal devices 101, 102, 103, and the client application to be installed or a new version of the client application that needs to be updated can be obtained by accessing the servers 105, 106, 107, 108. Based on the daily operation and maintenance work of the operation and maintenance personnel on the servers 105, 106, 107 and 108, the deployment of various application programs and the test verification of the updated versions of the application programs can be realized, and the application programs are released after the verification of the updated versions of the application programs is passed.
The operation and maintenance of the server generally includes technical maintenance of hardware configuration, software installation, machine room shelf loading and unloading, etc. of the server, testing of a working environment of the server, configuration, management, daily monitoring, health inspection, fault diagnosis, fault processing, etc. of a physical machine running a virtualization technology product, detection of a program version applied in actual production to determine whether the program version can run normally, and inspection of an installation log generated by the program version under test to find out error reporting information, etc.
For a server running the Linux operating system, operation and maintenance operations are generally performed through a command line or an automation script. For example, to deploy an application program to a specific server in a server group, a user needs to log in to that server and then, through the command line, load the application program to be deployed into the installation directory of that server.
However, only one server can be operated at a time by adopting a command line mode, and the operation and maintenance operation cannot be simultaneously executed on a plurality of operation and maintenance objects in batches, so that the working efficiency is low, and the risk of misoperation exists. As the number of servers of an enterprise increases to thousands, operation and maintenance work is more challenging, and the traditional manual operation and maintenance and the mode of relying on an automatic script for server operation and maintenance cannot meet the requirements of the operation and maintenance work.
In view of the above, a first exemplary embodiment of the present disclosure provides an improved operation and maintenance method and platform for a server, so as to at least partially solve the technical problem in the related art that an operation and maintenance operation cannot be simultaneously performed on multiple operation and maintenance objects in a batch manner by using a command line or a script to perform the operation and maintenance operation.
Fig. 2 schematically shows a structural block diagram of an operation and maintenance platform and a scene schematic of the operation and maintenance platform performing operation and maintenance on a server according to an embodiment of the disclosure.
FIG. 2 illustrates a scene in which the operation and maintenance platform 200 provided in the embodiment of the present disclosure performs operation and maintenance on the servers 105, 106, 107, and 108. The operation and maintenance platform 200 includes: a front-end device 201, configured to provide a graphical user interface so that a user can create and submit an operation and maintenance task; and a back-end framework 202, configured to implement the operation and maintenance method for a server, which is described in detail later with reference to FIGS. 3 to 12 and is not repeated here.
The front-end device 201 of the operation and maintenance platform 200 provides a graphical user interface, for example, an interface visually illustrating the creation of the operation and maintenance task, and an input box or a selection box is provided for each object, operation condition, screening parameter, and the like of the operation and maintenance task.
According to an embodiment of the present disclosure, the user may input the filtering condition through a graphical user interface displayed by the front-end device 201, so that the back-end framework 202 may screen out at least one server from the server list.
As shown in FIG. 2, the back-end framework 202 includes: a routing layer 202a, a control layer 202b, a model layer 202c, a database 202d, and a view layer 202e.
The routing layer 202a is used for searching the corresponding processing program in the control layer 202b according to the request address.
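By way of example only, the lookup performed by the routing layer could be sketched as a table that maps request addresses to handlers in the control layer. The class `Router`, its methods, and the address `/tasks/restart` are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class Router:
    """Maps a request address to its handler in the control layer (hypothetical)."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def register(self, address: str, handler: Handler) -> None:
        self._routes[address] = handler

    def dispatch(self, address: str, request: dict) -> dict:
        handler = self._routes.get(address)
        if handler is None:
            # No processing program is registered for this address.
            return {"status": 404, "error": f"no handler for {address}"}
        return handler(request)
```

An exact-match dictionary is the simplest form of such a lookup; real Web frameworks typically also support path parameters and HTTP method matching.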
The control layer 202b is mainly used for receiving an operation and maintenance operation request, controlling the operation and maintenance operation to execute predetermined logic, returning the result of the operation and maintenance operation, and interacting with the view layer 202e and the model layer 202c. The main operation logic of the operation and maintenance method provided by the embodiment of the present disclosure is implemented in the control layer 202b, and includes: control logic for testing whether the target Web application can run normally; control logic for performing anomaly detection on the service nodes; control logic for performing anomaly detection and anomaly repair on the service nodes; control logic for verifying the content accuracy of the application version to be delivered; control logic for automatically analyzing logs; and the like.
The model layer 202c is mainly used for interacting with the database 202d and performing operations such as adding, deleting, querying, and modifying data in the database 202d.
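As one possible illustration, a model layer that adds, deletes, queries, and modifies server records could be sketched with SQLite from the Python standard library. The choice of SQLite, the table layout, and the class `ServerModel` are assumptions; the disclosure does not name a database product or schema.

```python
import sqlite3
from typing import List, Tuple


class ServerModel:
    """Minimal add/delete/query/modify wrapper over a servers table (hypothetical)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS servers ("
            "name TEXT PRIMARY KEY, ip TEXT NOT NULL, grp TEXT NOT NULL)"
        )

    def add(self, name: str, ip: str, grp: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO servers (name, ip, grp) VALUES (?, ?, ?)",
                (name, ip, grp),
            )

    def delete(self, name: str) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM servers WHERE name = ?", (name,))

    def update_group(self, name: str, grp: str) -> None:
        with self.conn:
            self.conn.execute(
                "UPDATE servers SET grp = ? WHERE name = ?", (grp, name)
            )

    def find_by_group(self, grp: str) -> List[Tuple[str, str]]:
        cursor = self.conn.execute(
            "SELECT name, ip FROM servers WHERE grp = ? ORDER BY name", (grp,)
        )
        return cursor.fetchall()
```

Parameterized queries are used throughout so that values supplied through the graphical user interface are never interpolated into SQL text.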
The view layer 202e is mainly used for writing HTML, CSS, and JavaScript code and the like.
For example, when the operation and maintenance platform 200 tests whether the target Web application can run normally, the operation and maintenance personnel create and submit, through the front-end device 201, an operation and maintenance task for testing whether the target Web application can run normally. In response to the request of the front-end device, the back-end framework 202 of the operation and maintenance platform 200 performs operations S501, S502, S503, S601, S602, and S603 (shown in FIGS. 5 and 6). Finally, after performing each operation, the back-end framework 202 returns the operation result to the front-end device 201, completing one operation and maintenance task.
Fig. 3 schematically shows a flowchart of an operation and maintenance method for a server according to an embodiment of the present disclosure.
The operation and maintenance method for the server provided by the embodiment of the present disclosure may be executed by the operation and maintenance platform 200. The operation and maintenance platform 200 may include a front-end device 201 and a back-end framework 202. In particular, the operation and maintenance method may be performed by the back-end framework 202. The operation and maintenance platform 200 of the embodiment of the present disclosure has access rights to all operation and maintenance objects associated with it. Based on different operation and maintenance requirements, all operation and maintenance objects can be divided into different groups. In actual operation and maintenance, both the number of servers in each server group and the number of server groups can be increased or decreased according to the operation and maintenance requirements.
The operation and maintenance method provided by the embodiment of the disclosure can simultaneously carry out batch operation and maintenance on a plurality of servers in the server group, so that the operation and maintenance work efficiency is high.
In one embodiment, the servers requiring operation and maintenance are configured with password-free host login and are managed in advance in units of groups, so as to realize batched operation and maintenance.
As shown in fig. 3, the operation and maintenance method provided by the embodiment of the present disclosure includes the following operations: S301 and S302.
In operation S301, at least one server is screened from a preconfigured server list as an operation and maintenance object for the task.
In operation S302, the operation and maintenance operations related to the task are performed in batch on the at least one server.
For example, the server list includes information of a plurality of servers, and the information of each server may include, but is not limited to, at least one of the following information: server name, IP address of the server, group of servers, etc. The basis for server grouping may include, but is not limited to: a type of service, a type of operation and maintenance operation needed, etc.
In order to facilitate batched operation and maintenance, the servers can be configured in advance for password-free login, and this configuration can itself be performed in batches.
According to the embodiment of the disclosure, the at least one server is screened out from the preconfigured server list based on the filtering condition input by the user.
According to an embodiment of the disclosure, an operation and maintenance platform may include a front-end device for providing a graphical user interface through which a user may create and submit an operation and maintenance task. The user here may be an operation and maintenance worker. In an embodiment, the front-end device may display a Web user interface (Web UI). The user may input a filtering condition (e.g., IP addresses of all servers recorded in a specific server list) through a graphical user interface displayed on the front-end device, and the back-end framework may filter out at least one server from the server list based on the filtering condition input by the user through the graphical user interface displayed on the front-end device when performing operation S301. Then, the back-end framework performs operation S302 to perform the operation and maintenance operations related to the task in batch on the at least one server.
Based on the operation and maintenance method and platform provided by the embodiment of the disclosure, one or more servers can be screened from a pre-configured full server list or a specific server list (for example, a list configured for all servers that need to execute a restart operation at a specific time) according to actual operation and maintenance requirements, and the operation and maintenance operations related to the task are then executed in batch on the screened operation and maintenance object or objects. The same operation, or the same series of operations, can thus be executed on a plurality of different operation and maintenance objects at the same time. This at least partially solves the technical problem in the related art that operation and maintenance operations executed through a command line or a script cannot be executed in batch on a plurality of operation and maintenance objects, and thereby improves the operation and maintenance efficiency.
Fig. 4 schematically shows a flowchart of a specific implementation of an operation and maintenance method for a server according to an embodiment of the present disclosure.
As shown in fig. 4, in a specific embodiment, a detailed implementation process of the operation and maintenance method provided in the embodiment of the present disclosure includes the following operations: S401, S402 and S403. Operations S402 and S403 in this embodiment correspond to operations S301 and S302 in the foregoing embodiment, respectively, and are not described again here.
In operation S401, a server list is obtained, where the server list includes information of a plurality of servers, and the information of each server includes, but is not limited to, at least one of the following information: server name, IP address of the server, grouping of servers, etc.
The server list may be created/configured in advance and stored in the database of the operation and maintenance platform. Operation S401 may be performed by the backend framework.
In operation S402, the server list is filtered based on the filtering condition input by the user, so as to filter out one or more servers meeting the filtering condition.
Referring back to fig. 2, the user may input the filter condition through a graphical user interface displayed by the front-end device 201 of the operation and maintenance platform 200, and in response to the user inputting the filter condition through the front-end device, the back-end framework may perform operations S402 and S403.
In operation S403, the operation and maintenance operations related to the task are executed in batch on the screened servers.
In the embodiment of the disclosure, the user may also input parameters of the operation and maintenance task through a graphical user interface of the front-end device of the operation and maintenance platform, and the back-end framework may execute operation and maintenance operations related to the task in batch on the screened servers according to the parameters input by the user.
By the embodiment of the disclosure, an operation and maintenance person can input the filtering condition through the front-end device of the operation and maintenance platform, the back-end framework can find out all servers meeting the condition from the preset server list, and simultaneously execute the same or similar operation and maintenance operations on all the found servers, so that batched operation and maintenance operations can be realized, and the operation and maintenance work efficiency can be improved.
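By way of illustration only, operations S401 to S403 may be sketched as follows in Python. The Server record, the screen_servers and run_batch functions, the sample server list and the restart placeholder are hypothetical names introduced for this example; the disclosure does not specify how the back-end framework implements the screening or the batch execution, and the use of a thread pool is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Server:
    # Fields mirror the server information named in the text.
    name: str
    ip: str
    group: str


def screen_servers(server_list: Iterable[Server], **conditions: str) -> List[Server]:
    """S402: keep the servers whose fields equal every given filter condition."""
    return [
        s for s in server_list
        if all(getattr(s, field) == value for field, value in conditions.items())
    ]


def run_batch(servers: List[Server], operation: Callable[[Server], str]) -> Dict[str, str]:
    """S403: apply the same operation to every screened server concurrently."""
    if not servers:
        return {}
    with ThreadPoolExecutor(max_workers=min(8, len(servers))) as pool:
        results = list(pool.map(operation, servers))
    return {s.name: r for s, r in zip(servers, results)}


# S401: a hypothetical server list; a real platform would load it from its database.
SERVER_LIST = [
    Server("web-01", "10.0.0.1", "web"),
    Server("web-02", "10.0.0.2", "web"),
    Server("db-01", "10.0.0.3", "db"),
]


def restart(server: Server) -> str:
    # Placeholder only: a real operation would run a remote command on the server.
    return f"restarted {server.ip}"


targets = screen_servers(SERVER_LIST, group="web")
report = run_batch(targets, restart)
```

In this sketch the filtering condition is an exact match on one or more fields; other conditions (for example, address ranges) could be substituted without changing the batch execution step.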
FIG. 5 schematically shows a detailed implementation flowchart of performing an operation and maintenance operation in bulk including testing whether a target Web application can run normally according to an embodiment of the present disclosure.
At present, there are more and more types of browsers on the market, and mainstream browsers such as IE, Chrome, Firefox, Safari and Opera are updated frequently, so that a large amount of work is required to test whether a Web application is compatible with, and works well in, the operating environments of different browsers and of different versions of the same browser.
It should be appreciated that, with an operation and maintenance method based on command lines and scripts, it is difficult to determine within a short time whether all test environments are normal.
In view of this, the embodiments of the present disclosure provide an operation and maintenance method that can determine in batch whether a test environment (e.g., a Web test environment) is ready, that is, whether an application to be tested (e.g., a Web application) can run normally.
In this embodiment, executing the operation and maintenance operations related to the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) includes: testing in batch whether the target Web application can run normally.
As shown in fig. 5, performing the operation and maintenance operations related to the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) may include performing the following operations for each server of the at least one server: S501, S502, and S503.
In operation S501, all test objects in the current server are determined.
It should be understood that, in the present operation S501, the all test objects may include all applications to be tested and all functions to be tested included in each application to be tested.
Operations S502 and S503 are performed for each test object.
In operation S502, a handle associated with a current test object is acquired.
In operation S503, the current test object is automatically accessed through the handle to detect whether the current test object can currently operate normally.
Illustratively, in operation S501, all test objects contained in each server that is to perform the batch operation are determined. In this embodiment, all test objects are one or more target Web applications that need to be tested, for example, the target Web applications are internet banking application programs.
In operation S502 and operation S503, determining whether each test object can normally operate may specifically include obtaining a handle associated with the current test object, and then automatically accessing the current test object through the handle to detect whether the test object can normally operate currently.
It should be understood that in embodiments of the present disclosure, a Handle (Handle) may be used to represent an identifier of an object or item. The above objects or items may include: a module, an application instance, a window, a control, a bitmap, a resource or a file, etc.
In this embodiment, a handle associated with each test object may be preset, where the handle points to an operation command packet to be executed for the current test object. The user (operation and maintenance personnel) does not need to operate the current test object manually; manual clicking is replaced by the operation command packet pointed to by the handle, for example, opening a browser and opening the website or link of the current test object, so that the page of the test object and the page elements to be tested on that page are displayed through the browser.
FIG. 6 is a flowchart illustrating a detailed implementation of performing an operation and maintenance operation in bulk, including testing whether a target Web application can run normally according to another embodiment of the present disclosure.
In addition to the operations S501 to S503 shown in fig. 5, the specific flow of executing the operation and maintenance operations in batch may further include the following operations as shown in fig. 6: S601, S602, and S603.
In operation S601, a browser driver is acquired.
In operation S602, a corresponding browser is launched based on a browser driver.
It should be noted that, in the embodiment of the present disclosure, operation S601 and operation S602 may be performed before operation S502. In addition, operation S501 may be performed before operation S601 or after operation S602, and the embodiment of the present disclosure is not limited herein.
In operation S603, a current test object is automatically accessed through a handle in the browser.
In the embodiment of the present disclosure, a user operation may be simulated, and operation S601 and operation S602 are automatically performed, so as to automatically open a browser, automatically open a page of an application to be tested through the browser based on an associated handle, and perform a related operation and maintenance operation such as clicking.
It should be understood that in the embodiment of the present disclosure, in the process of automatically accessing the test object through the handle, if the test object can be displayed normally and can respond normally to a specific operation and maintenance operation, it is determined that the test object can currently work normally. Otherwise, it is determined that the test object cannot work normally. Because whether a plurality of test objects operate normally can be detected automatically and in batch, the technical problem in the related art that whether all test environments are normal cannot be determined in a short time can be at least partially solved.
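The flow of operations S502, S503 and S601 to S603 may be illustrated by the following minimal Python sketch. It is an assumption-laden example rather than the disclosed implementation: the Browser interface, the COMMAND_PACKETS table, the handle name, the example URL and the FakeBrowser stand-in are all hypothetical, and a real deployment would wrap an actual browser driver behind the same two methods.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class Browser(Protocol):
    """Hypothetical minimal browser interface; a real one would wrap a browser driver."""

    def open(self, url: str) -> bool: ...

    def click(self, element: str) -> bool: ...


# A command is (action, argument); each handle names a preset operation command packet.
Command = Tuple[str, str]
COMMAND_PACKETS: Dict[str, List[Command]] = {
    "handle-login-page": [
        ("open", "https://example.invalid/login"),
        ("click", "submit"),
    ],
}


def check_test_object(browser: Browser, handle: str) -> bool:
    """S502/S503 (S603): resolve the handle, replay its commands, report normal or not."""
    for action, argument in COMMAND_PACKETS[handle]:
        ok = browser.open(argument) if action == "open" else browser.click(argument)
        if not ok:
            return False  # the page did not display or did not respond
    return True


def check_all(launch_browser: Callable[[], Browser], handles: List[str]) -> Dict[str, bool]:
    """S601/S602: obtain and launch the browser, then check every test object."""
    browser = launch_browser()
    return {handle: check_test_object(browser, handle) for handle in handles}


class FakeBrowser:
    """Stand-in used only to exercise the sketch without a real browser."""

    def __init__(self, broken_elements=()):
        self.broken = set(broken_elements)

    def open(self, url: str) -> bool:
        return True

    def click(self, element: str) -> bool:
        return element not in self.broken
```

For example, check_all(FakeBrowser, ["handle-login-page"]) reports the test object as normal, whereas a browser whose "submit" element does not respond causes the same test object to be reported as abnormal.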
Fig. 7 schematically illustrates a detailed implementation flowchart of performing operation and maintenance operations in bulk, including performing exception detection on a service node according to an embodiment of the present disclosure.
In the related art, when a server provides a service to a terminal device, the service is in most cases provided externally through Nginx. Nginx (engine x) is an HTTP and reverse proxy Web server. The inventors further found that, generally, if a server node behind Nginx becomes abnormal, Nginx still forwards requests of the terminal device to the abnormal server node; only when the number of failed attempts reaches the maximum number of failures does Nginx mark the unresponsive back-end server node as an abnormal node and forward the request to another server node. This wastes the forwarding attempts made to the abnormal node as well as online traffic.
In view of this, the embodiments of the present disclosure further provide an operation and maintenance method for performing anomaly detection on a service node, which can avoid the waste of forwarding times and forwarding traffic.
In this embodiment, executing the operation and maintenance operations related to the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) includes: performing exception detection on the service nodes.
As shown in fig. 7, performing the operation and maintenance operations related to the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) may include performing the following operations for each server of the at least one server: S701, S702, and S703.
In operation S701, all service nodes involved by at least one server (i.e., all operation and maintenance objects targeted by the task) are determined, where all service nodes are backend nodes of the Nginx service node.
In this embodiment, all service nodes involved in providing services by one server are backend nodes of the Nginx service nodes, and the Nginx service nodes forward the request of the terminal device to the backend nodes.
The following operations S702 and S703 are performed for each of all the serving nodes.
In operation S702, an anomaly detection operation is performed on the current node at intervals of a first preset time to obtain a corresponding detection result.
The detection result is as follows: the current node is in a normal state or in an abnormal state.
In operation S703, in response to the detection result indicating that the current node is in an abnormal state, the node information of the current node is listed in the non-healthy node list and the current node is prohibited from being used to continue providing services.
In an embodiment, all service nodes are in a normal state by default in the initial state, and the node information of all service nodes is in a healthy node list. When a node is detected to be in an abnormal state, the node information of that node is moved from the healthy node list into a non-healthy node list. The Nginx service node cannot forward service data or service requests to a node in the non-healthy node list, so that a node in an abnormal state is prohibited from providing services, and the waste of forwarding attempts and forwarding traffic caused by the Nginx service node forwarding to an abnormal node is avoided.
Through operations S701 to S703, automatic abnormality detection is realized, and the Nginx service node is prohibited from forwarding service requests or service data to a node in an abnormal state. During abnormality detection, one or more abnormal nodes among all the service nodes can be detected in real time and added to the non-healthy node list, and the Nginx service node is prohibited from forwarding service data or service requests to them, so that the waste of forwarding attempts and forwarding traffic is avoided. This at least partially solves the technical problem in the related art that, when a server node behind Nginx is abnormal, Nginx still forwards requests of the terminal device to the abnormal server node, wasting forwarding attempts and forwarding traffic.
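The bookkeeping of operations S702 and S703 may be sketched as follows in Python. This is a simplified illustration under stated assumptions: the NodeRegistry class, its method names and the node addresses are hypothetical, the detection itself is passed in as a callable because the disclosure does not fix a particular detection technique, and the scheduling of one round every first preset time is left to the caller.

```python
from typing import Callable, Iterable, List, Set


class NodeRegistry:
    """Tracks which backend nodes may still receive forwarded requests."""

    def __init__(self, nodes: Iterable[str]):
        # In the initial state every node is assumed to be normal.
        self.healthy: Set[str] = set(nodes)
        self.unhealthy: Set[str] = set()

    def detect_once(self, is_normal: Callable[[str], bool]) -> None:
        """One detection round (S702) followed by the list update (S703)."""
        for node in sorted(self.healthy):  # sorted() copies, so the set may be mutated
            if not is_normal(node):
                self.healthy.discard(node)
                self.unhealthy.add(node)

    def forward_targets(self) -> List[str]:
        """Nodes to which the proxy is still allowed to forward."""
        return sorted(self.healthy)


# Hypothetical usage: the second node fails its check and is taken out of rotation.
registry = NodeRegistry(["10.0.0.1:8080", "10.0.0.2:8080"])
registry.detect_once(lambda node: node != "10.0.0.2:8080")
```

After this round only the first node remains a forwarding target, which corresponds to prohibiting the abnormal node from continuing to provide services.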
Fig. 8 schematically shows a detailed implementation flowchart of performing operation and maintenance operations in bulk, including performing exception detection, exception repair, and restart on a service node according to another embodiment of the present disclosure. Fig. 9 schematically illustrates an execution process diagram of performing exception detection, exception repair, and restart for each service node in the batch execution operation and maintenance operation shown in fig. 8.
According to another embodiment of the present disclosure, as shown in fig. 8, the operation and maintenance operations executed in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) include, in addition to exception detection on the service nodes, an operation of performing exception repair on the service nodes, and may further include a restart operation.
As shown in fig. 8, executing the operation and maintenance operations related to the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) may further include, in addition to executing operations S701, S702, and S703 in the above embodiment for each server in the at least one server, the following operations: S804, S805, and S806. Moreover, operations S701 to S703 shown in fig. 8 are the same as operations S701 to S703 shown in fig. 7, respectively, and are not repeated herein.
In operation S804, in response to the detection result indicating that the current node is in an abnormal state, a system restart operation or a master/slave node switching operation is performed on the current node.
In one embodiment, in the case that the current node is in an abnormal state, an operation of restarting the system is performed on the current service node.
In another embodiment, to ensure the reliability of the service provided by the server, a primary node and a standby node are usually configured for each service node, and the standby node is enabled in case of a failure/abnormality of the primary node. Therefore, if the current node is in an abnormal state, the operation of restarting the system or the operation of switching the active and standby nodes is performed, so that the node in the abnormal state is restored to a normal state as soon as possible by restarting, or is replaced by a normal standby node.
In operation S805, in a case that a system restart operation has been performed on the nodes listed in the non-healthy node list, whether each node listed in the non-healthy node list has recovered to a healthy state is tested by a Ping command at intervals of a second preset time, and a corresponding test result is obtained.
Ping (Packet Internet Groper) is a service command that works at the application layer of the TCP/IP network architecture and tests the network connection status by sending ICMP echo requests to a target host.
In operation S806, the node information of each node whose test result indicates that it has been restored to the healthy state is re-listed in the healthy node list, and the node restored to the healthy state is re-enabled, so that the corresponding service continues to be provided.
In other embodiments, performing the operation and maintenance operations involved in the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task in this time) includes performing operations S701, S702, S703, and S804 on each server in the at least one server.
As shown in fig. 9, performing the operation and maintenance operations related to the task in batch on at least one server (i.e., all the operation and maintenance objects targeted by the task) includes performing operations S701, S702, S703, S804, S805, and S806 on each server of the at least one server, and further includes operations S901, S902, and S903. It can be understood that operations S701, S702, S703, S804, S805, and S806 shown in fig. 9 correspond to operations S701, S702, S703, S804, S805, and S806 shown in fig. 8, respectively, and are not described herein again.
In operation S901, it is determined whether the current node is in an abnormal state according to the detection result.
When the detection result indicates that the current node is in the abnormal state, operation S703 and operation S804 are executed in sequence. In other embodiments, operations S703 and S804 may be executed in parallel. The embodiments of the present disclosure do not limit the specific order of execution of the two.
In operation S902, node information of the current node is listed in a healthy node list.
In an embodiment, in the case where operation S901 determines that the current node is in the normal state, operation S902 is performed. Alternatively, in another embodiment, all service nodes are assumed by default to be in a normal state, and the node information of all service nodes is listed in the healthy node list; when a node in the healthy node list is detected to be abnormal, the node is removed from the healthy node list and added to the non-healthy node list.
In operation S903, it is determined whether the current node is healthy according to the test result.
In an embodiment, operation S903 is performed after operation S805. In operation S903, in the case that the current node is determined to be healthy again according to the test result, operation S806 is performed. Nodes whose test results indicate that they are still in an abnormal state (not yet healthy) after the system restart operation remain in the non-healthy node list.
In an embodiment, all service nodes are in a normal state by default in the initial state, and the node information of all service nodes is in the healthy node list. When a node is detected to be in an abnormal state, the node information of that node is moved from the healthy node list to the non-healthy node list. When a test result shows that a node has been restored to the healthy state after a recovery operation (such as a restart operation) is performed on the nodes listed in the non-healthy node list, the restored node is moved back from the non-healthy node list into the healthy node list. The operation and maintenance method provided by the embodiment of the disclosure allows the Nginx service node to forward service data or service requests to nodes in the healthy node list and prohibits it from forwarding service data or service requests to nodes in the non-healthy node list, so that the waste of forwarding attempts and forwarding traffic caused by the Nginx service node forwarding to an abnormal service node is avoided. Meanwhile, an exception recovery operation (for example, a system restart operation or a master/standby node switching operation) is performed on the abnormal node, and the membership of the healthy node list and the non-healthy node list can be adjusted in real time according to the result of the exception recovery, so that exception maintenance of all service nodes and dynamic adjustment of forwarding resources can be realized.
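The periodic recovery test of operations S805, S903 and S806 may be sketched as follows in Python. The function names and node addresses are hypothetical, the ping_probe helper assumes a Unix-style ping executable that accepts the -c option (other platforms use different options), and the loop that repeats the round every second preset time is omitted.

```python
import subprocess
from typing import Callable, Set, Tuple


def ping_probe(host: str) -> bool:
    """Assumed probe: one ICMP echo via a Unix-style `ping -c 1`; True if answered."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False  # no usable ping executable on this machine
    return result.returncode == 0


def recovery_round(
    healthy: Set[str],
    unhealthy: Set[str],
    probe: Callable[[str], bool] = ping_probe,
) -> Tuple[Set[str], Set[str]]:
    """S805/S903/S806: probe unhealthy nodes and move the recovered ones back."""
    recovered = {node for node in unhealthy if probe(node)}
    # Nodes that still fail the probe stay in the non-healthy list.
    return healthy | recovered, unhealthy - recovered


# Hypothetical usage with a stubbed probe instead of a real network test.
new_healthy, new_unhealthy = recovery_round(
    {"10.0.0.1"}, {"10.0.0.2", "10.0.0.3"}, probe=lambda host: host == "10.0.0.2"
)
```

Passing the probe in as a parameter keeps the list adjustment independent of the way health is actually tested, so that the Ping-based test could be replaced by an application-level check without changing the rest of the round.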
Fig. 10 schematically shows a detailed implementation flowchart including checking the content accuracy of the application version to be delivered in an operation and maintenance method for a server according to another embodiment of the present disclosure.
To ensure that the content of the delivered Web application version is correct, embodiments of the present disclosure provide an operation and maintenance method that verifies the content accuracy of the application version to be delivered.
According to an embodiment of the present disclosure, in addition to the operations S301 and S302, as shown in fig. 10, the operation and maintenance method provided by the embodiment of the present disclosure further includes: performing a content check on the program version to be delivered to determine the accuracy of the program version.
The program version to be delivered is the current version; the previous version associated with the current version is the program version that has already been delivered for use.
The content check of the program version to be delivered includes the following operations: S1001, S1002, S1003, and S1004.
It should be noted that, in the embodiment of the present disclosure, the execution sequence of operations S1001, S1002, S1003, and S1004 and operations S301 and S302 is not limited.
In operation S1001, a first image name corresponding to a current version and a second image name corresponding to a previous version are obtained.
In operation S1002, it is determined whether a first image file matching the first image name and a second image file matching the second image name exist in the image repository.
According to an embodiment of the present disclosure, the image repository is located in the cloud platform and is configured to store PaaS (Platform as a Service) images.
In operation S1003, in response to determining that the first image file and the second image file exist in the image repository, the first image file and the second image file are found and compared to determine at least one of the following program lists involved in the first image file: newly added programs, deleted programs, and modified programs.
In operation S1004, the comparison result is acquired, and whether the current version is accurate is determined based on the comparison result.
If the comparison result indicates that the actually updated content in the current version is consistent with the expected updated content, the current version is determined to be accurate.
Fig. 11 schematically shows a detailed implementation flowchart for finding out the first image file and the second image file in the operation and maintenance method for the server according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 11, finding out the first image file and the second image file specifically includes the following operations: S1101, S1102, S1103, S1104, and S1105.
In operation S1101, a first container and a second container are started, wherein the first container is used to run a first image file, and the second container is used to run a second image file.
In operation S1102, a first folder in which the first image file is located is found based on the first container.
In operation S1103, a first image file is found from a first folder.
In operation S1104, a second folder in which the second image file is located is found based on the second container.
In operation S1105, a second image file is found from a second folder.
In an embodiment, after the contents of the first image file and the second image file are compared, the first container and the second container are deleted.
In this embodiment, the first image file of the current version and the second image file of the previous version are compared to determine the program involved in the actual change of the first image file of the current version compared with the second image file of the previous version, so that the accuracy of the current version is determined by comparing the program involved in the actual change with the content of the expected update, thereby ensuring the correctness of the current version delivered for production.
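The comparison of operations S1003 and S1004 may be illustrated by the following Python sketch. It assumes that the program files of the two versions have already been extracted (for example, from the first and second containers) into mappings from program path to file content; the function names, the sample paths and the use of SHA-256 digests to detect modification are assumptions made for this example only.

```python
import hashlib
from typing import Dict, List


def digest(content: bytes) -> str:
    """Content fingerprint used to decide whether a program was modified."""
    return hashlib.sha256(content).hexdigest()


def compare_versions(previous: Dict[str, bytes], current: Dict[str, bytes]) -> Dict[str, List[str]]:
    """S1003: list the programs added, deleted and modified in the current version."""
    common = current.keys() & previous.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "deleted": sorted(previous.keys() - current.keys()),
        "modified": sorted(p for p in common if digest(current[p]) != digest(previous[p])),
    }


def is_accurate(actual: Dict[str, List[str]], expected: Dict[str, List[str]]) -> bool:
    """S1004: the version is accurate when the actual changes equal the expected ones."""
    return actual == expected


# Hypothetical file sets standing in for the contents of the two image files.
previous_version = {"app/login.js": b"v1", "app/pay.js": b"v1", "app/old.js": b"v1"}
current_version = {"app/login.js": b"v1", "app/pay.js": b"v2", "app/new.js": b"v1"}
changes = compare_versions(previous_version, current_version)
```

Here the expected update content would be supplied in the same three-list form, so that any unplanned addition, deletion or modification causes the check to fail.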
Fig. 12 schematically shows a detailed implementation flowchart including automatic analysis of logs in an operation and maintenance method for a server according to an embodiment of the present disclosure.
Because thousands of version installations are required during the testing of each Web application version, a large number of version installation logs are generated. The related art relies mainly on manual inspection of the installation logs, in which case error information is often missed and the logs are often not inspected in a timely manner. Some related technologies use open-source log collection tools to collect the logs and then analyze them, but such log collection tools suffer from high CPU utilization.
In view of this, the embodiments of the present disclosure further provide an operation and maintenance method for log analysis, which can reduce the CPU utilization.
According to an embodiment of the present disclosure, in addition to the above operations S301 and S302, as shown in fig. 12, the operation and maintenance method provided by the embodiment of the present disclosure further includes the following operations: S1201 and S1202.
In operation S1201, each of the servers involved in the server list converts the logs it generates into JSON format and then stores them in the ES database.
That is, in operation S1201, the server itself may convert the generated logs into JSON format and store them in the ES database. The ES (Elasticsearch) database is a highly scalable, open-source full-text search and analysis engine that can store, search, and analyze massive amounts of data. The ES database operates on data in JSON format through an HTTP interface, and its minimum unit of data storage is a document, which is essentially JSON text. Storing the JSON-format messages directly in the Elasticsearch database reduces CPU utilization.
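The conversion in operation S1201 may be sketched as follows in Python. The log line layout (date, time, level, message), the field names, the host and index names and both function names are assumptions made for this example, since the disclosure does not specify the log format. The sketch only builds the newline-delimited JSON body accepted by the Elasticsearch bulk interface and does not send it; in practice the body would be posted to the _bulk endpoint over HTTP.

```python
import json
import re
from typing import Dict, Iterable, Optional

# Assumed line layout: "<date> <time> <LEVEL> <message>".
LOG_PATTERN = re.compile(r"^(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def log_line_to_doc(line: str, host: str) -> Optional[Dict[str, str]]:
    """Convert one raw log line into a JSON-serializable document, or None if unparsable."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    doc = match.groupdict()
    doc["host"] = host  # record which server produced the log
    return doc


def build_bulk_body(docs: Iterable[Dict[str, str]], index: str) -> str:
    """Build a newline-delimited body: one action line plus one document line per log."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n" if lines else ""


doc = log_line_to_doc("2021-02-08 10:00:00 ERROR install failed: missing package", "web-01")
body = build_bulk_body([doc], "install-logs")
```

Because each server emits documents that are already JSON text, no separate collection agent has to parse the raw logs again before they are stored.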
In operation S1202, a personalized service satisfying a specific requirement is provided with respect to a log saved in an ES database.
According to an embodiment of the present disclosure, the personalized service includes at least one of: a log classification and summarization service, a log aggregation analysis service, a keyword-based log query service, a data drill-down service, and a dashboard customization service.
In this way, log classification and summarization, log aggregation analysis, keyword-based log query, and data drill-down can be realized in a customized dashboard.
It should be noted that, in the embodiment of the present disclosure, the execution sequence of operations S1201 and S1202 and operations S301 and S302 is not limited.
The embodiment of the disclosure also provides an operation and maintenance device for the server.
Fig. 13 schematically shows a block diagram of an operation and maintenance device for a server according to an embodiment of the present disclosure.
As shown in fig. 13, the operation and maintenance device 1300 includes: an operation and maintenance object screening module 1301 and an operation and maintenance operation batch execution module 1302.
The operation and maintenance object screening module 1301 is configured to screen at least one server from a preconfigured server list as an operation and maintenance object for the task.
According to an embodiment of the present disclosure, the server list contains information of a plurality of servers, and the information of each server includes, but is not limited to, at least one of the following information: server name, IP address of the server, grouping of servers, etc. The server list may be created/configured in advance and stored in the database of the operation and maintenance platform.
According to the embodiment of the disclosure, at least one server is screened out from the server list based on the filtering condition input by the user.
According to the embodiment of the disclosure, at least one server is screened out from the server list based on the filtering condition input by the user through the graphical user interface.
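The screening described above can be sketched as follows. This is an illustrative example under stated assumptions: the Server record, the sample entries, and the filter parameters (name_pattern, ip_prefix, group) are hypothetical and merely mirror the server name, IP address, and grouping information mentioned for the server list.

```python
import fnmatch
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """One entry of the preconfigured server list (fields are assumptions)."""
    name: str
    ip: str
    group: str


def screen_servers(server_list, name_pattern="*", ip_prefix="", group=None):
    """Return the servers that satisfy every filtering condition supplied by the user."""
    selected = []
    for server in server_list:
        if not fnmatch.fnmatchcase(server.name, name_pattern):
            continue
        if not server.ip.startswith(ip_prefix):
            continue
        if group is not None and server.group != group:
            continue
        selected.append(server)
    return selected


SERVER_LIST = [
    Server("web-01", "10.0.1.11", "test"),
    Server("web-02", "10.0.1.12", "test"),
    Server("db-01", "10.0.2.21", "test"),
    Server("web-03", "10.0.3.13", "prod"),
]
```

Calling screen_servers(SERVER_LIST, name_pattern="web-*", group="test") selects the two test-group web servers as operation and maintenance objects; a graphical user interface would simply collect the same filter values from form fields.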
The operation and maintenance operation batch execution module 1302 is configured to execute the operation and maintenance operations related to the task in batch for at least one server.
According to an embodiment of the present disclosure, referring to fig. 2 and fig. 13, a user may input parameters of an operation and maintenance task through a graphical user interface of the front-end device 201 of the operation and maintenance platform 200, and the operation and maintenance operation batch execution module 1302 may execute operation and maintenance operations related to the task in batch on a selected server according to the parameters input by the user.
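One possible way to execute an operation on the selected servers in batch is sketched below. It is an assumption-laden illustration: run_batch and restart_service are hypothetical names, servers are represented by plain strings, and a thread pool is only one of several ways in which batch execution might be realized.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(servers, operation, params, max_workers=8):
    """Apply operation(server, params) to every server and collect one outcome per server."""

    def safe_call(server):
        # A failure on one server must not abort the rest of the batch.
        try:
            return ("ok", operation(server, params))
        except Exception as exc:
            return ("failed", str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(safe_call, servers))  # map preserves input order
    return dict(zip(servers, outcomes))


def restart_service(server, params):
    """Hypothetical operation and maintenance operation used only for this demonstration."""
    if server == "db-01":
        raise RuntimeError("connection refused")
    return f"{params['service']} restarted on {server}"


results = run_batch(["web-01", "db-01"], restart_service, {"service": "nginx"})
```

Here params stands in for the task parameters entered by the user through the graphical user interface, and results records, for each server, whether the operation succeeded.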
The task includes, but is not limited to, at least one of the following: a task of testing whether a target Web application can run normally; a task of performing anomaly detection on service nodes; a task of performing anomaly detection and anomaly repair on service nodes; a task of checking the content accuracy of a pre-delivery application program version; and a task of automatically analyzing logs.
The operation and maintenance device 1300 further includes the following modules: a test object ready automatic check module, a service anomaly detection and automatic repair module, a mirror image difference comparison module, and a program version installation log analysis module.
The test object ready automatic check module is configured to detect whether a Web application can work normally. The test object ready automatic check module may include functional sub-modules capable of performing operations S501 to S503 described in the above first embodiment, or functional sub-modules capable of performing operations S501 to S503 and S601 to S603. The test object ready automatic check module may belong to the operation and maintenance operation batch execution module, or may be provided separately.
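The readiness check can be sketched as follows, assuming that the handle of each test object is a URL and that page access is delegated to an injected callable. The names check_test_objects, open_page, and fake_open_page are hypothetical; in practice a browser driver (for example, a browser automation tool such as Selenium WebDriver) could play the role of open_page, which is replaced here by a stand-in so that the sketch runs without a browser.

```python
def check_test_objects(handles, open_page):
    """Check each test object through its handle.

    handles:   mapping of test object name -> handle (assumed here to be a URL)
    open_page: callable(handle) -> page title; expected to raise on failure
    Returns a mapping of test object name -> True if the object appears to work.
    """
    report = {}
    for name, handle in handles.items():
        try:
            title = open_page(handle)
            report[name] = bool(title)  # an empty title is treated as "not ready"
        except Exception:
            report[name] = False
    return report


def fake_open_page(url):
    """Stand-in for a real browser driver, used only for this demonstration."""
    pages = {"http://test.example/app-a": "App A Login"}
    if url not in pages:
        raise ConnectionError(f"cannot reach {url}")
    return pages[url]


report = check_test_objects(
    {
        "app-a": "http://test.example/app-a",
        "app-b": "http://test.example/app-b",
    },
    fake_open_page,
)
```

Injecting the page-opening callable keeps the check logic independent of any particular browser or driver version.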
The service anomaly detection and automatic repair module is configured to test whether a service node related to the server is abnormal and to repair the abnormality. The service anomaly detection and automatic repair module may include functional sub-modules capable of performing operations S701 to S706 described in the above first embodiment. The service anomaly detection and automatic repair module may belong to the operation and maintenance operation batch execution module, or may be provided separately.
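The bookkeeping behind anomaly detection and recovery can be illustrated with the sketch below. It assumes that the probe and the reachability test are supplied as callables; NodeHealthTracker and its method names are hypothetical, and the periodic scheduling (the first and second preset times) and the actual repair action, such as a restart or a master/standby switch, are deliberately left out.

```python
class NodeHealthTracker:
    """Keeps a healthy node list and a non-healthy node list for backend service nodes."""

    def __init__(self, nodes):
        self.healthy = list(nodes)
        self.unhealthy = []

    def probe_round(self, probe):
        """One detection round: probe(node) returns True when the node is normal."""
        for node in list(self.healthy):
            if not probe(node):
                # The node is withdrawn from service and listed as non-healthy.
                self.healthy.remove(node)
                self.unhealthy.append(node)

    def recovery_round(self, is_reachable):
        """One recovery round: is_reachable(node) stands in for a Ping test."""
        for node in list(self.unhealthy):
            if is_reachable(node):
                self.unhealthy.remove(node)
                self.healthy.append(node)


tracker = NodeHealthTracker(["node-a", "node-b", "node-c"])
down = {"node-b"}  # simulated failure
tracker.probe_round(lambda node: node not in down)
after_probe = (list(tracker.healthy), list(tracker.unhealthy))

down.clear()  # simulated repair
tracker.recovery_round(lambda node: node not in down)
after_recovery = (list(tracker.healthy), list(tracker.unhealthy))
```

A scheduler would call probe_round and recovery_round at the respective preset intervals, and only nodes in the healthy list would be offered to the load balancer.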
The mirror image difference comparison module is configured to check the content accuracy of a pre-delivery application program version. The mirror image difference comparison module includes functional sub-modules capable of performing operations S1001 to S1004 described in the above first embodiment. The mirror image difference comparison module may belong to the operation and maintenance operation batch execution module, or may be provided separately.
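The comparison of two image files can be sketched as a comparison of two program manifests. This is an illustration under assumptions: each image is represented as a mapping from program path to a content digest, and the function names and sample digests are hypothetical; how the manifests are extracted from the image files is outside the scope of the sketch.

```python
def compare_program_lists(previous, current):
    """Compare two manifests (program path -> content digest).

    Returns the newly added, deleted, and modified programs of the current version.
    """
    prev_paths, cur_paths = set(previous), set(current)
    return {
        "added": sorted(cur_paths - prev_paths),
        "deleted": sorted(prev_paths - cur_paths),
        "modified": sorted(
            path for path in cur_paths & prev_paths if current[path] != previous[path]
        ),
    }


def version_is_accurate(diff, expected):
    """The version is treated as accurate when the actual changes equal the expected ones."""
    return all(
        sorted(expected.get(kind, [])) == diff[kind]
        for kind in ("added", "deleted", "modified")
    )


previous_manifest = {"bin/a": "h1", "bin/b": "h2", "bin/c": "h3"}
current_manifest = {"bin/a": "h1", "bin/b": "h2x", "bin/d": "h4"}
diff = compare_program_lists(previous_manifest, current_manifest)
```

Comparing digests rather than file contents keeps the check inexpensive, and the expected change list would typically come from the release notes of the current version.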
The program version installation log analysis module is configured to automatically analyze logs and to provide personalized services meeting specific requirements for the logs stored in the ES database. According to an embodiment of the present disclosure, the personalized service includes at least one of: a log classification and summarization service, a log aggregation analysis service, a keyword-based log query service, a data drill-down service, and a custom dashboard service. The program version installation log analysis module may belong to the operation and maintenance operation batch execution module, or may be provided separately.
It should be noted that the embodiments of the operation and maintenance device are similar to those of the operation and maintenance method and achieve similar technical effects, which are not described again here.
Any of the modules and units according to embodiments of the present disclosure, or at least part of the functionality of any of them, may be implemented in one module. Any one or more of the modules and units according to the embodiments of the present disclosure may be split into a plurality of modules for implementation. Any one or more of the modules and units according to the embodiments of the present disclosure may be implemented at least partly as a hardware circuit, e.g. a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or by any one of the three implementation manners of software, hardware, and firmware, or by a suitable combination of any of them. Alternatively, one or more of the modules and units according to embodiments of the present disclosure may be implemented at least partly as computer program modules which, when executed, perform the respective functions.
For example, any of the operation and maintenance object screening module 1301, the operation and maintenance operation batch execution module 1302, the test object ready automatic check module, the service anomaly detection and automatic repair module, the mirror image difference comparison module, and the program version installation log analysis module may be combined and implemented in one module, or any one of the modules may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the operation and maintenance object screening module 1301, the operation and maintenance operation batch execution module 1302, the test object ready automatic check module, the service anomaly detection and automatic repair module, the mirror image difference comparison module, and the program version installation log analysis module may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or by any one of the three implementation manners of software, hardware, and firmware, or by a suitable combination of any of them.
Alternatively, at least one of the operation and maintenance object screening module 1301, the operation and maintenance operation batch execution module 1302, the test object ready automatic check module, the service anomaly detection and automatic repair module, the mirror image difference comparison module, and the program version installation log analysis module may be at least partially implemented as a computer program module which, when executed, performs the corresponding function.
The embodiment of the disclosure also provides the electronic equipment. The electronic device includes: one or more processors; and a memory for storing one or more programs. Wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the above-described operations and maintenance methods for the server.
Fig. 14 schematically shows a block diagram of an electronic device according to an embodiment of the disclosure. The electronic device shown in fig. 14 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 14, an electronic device 1400 according to an embodiment of the present disclosure includes a processor 1401, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1402 or a program loaded from a storage portion 1408 into a Random Access Memory (RAM) 1403. Processor 1401 may comprise, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1401 may also include onboard memory for caching purposes. Processor 1401 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the present disclosure.
In the RAM 1403, various programs and data necessary for the operation of the electronic device 1400 are stored. The processor 1401, the ROM 1402, and the RAM 1403 are connected to each other by a bus 1404. The processor 1401 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1402 and/or the RAM 1403. Note that the above-described programs may also be stored in one or more memories other than the ROM 1402 and the RAM 1403. The processor 1401 may also perform various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in the one or more memories described above.
According to an embodiment of the present disclosure, the electronic device 1400 may also include an input/output (I/O) interface 1405, which is also connected to the bus 1404. The electronic device 1400 may also include one or more of the following components connected to the I/O interface 1405: an input portion 1406 including a keyboard, a mouse, and the like; an output portion 1407 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage portion 1408 including a hard disk and the like; and a communication portion 1409 including a network interface card such as a Local Area Network (LAN) card, a modem, or the like. The communication portion 1409 performs communication processing via a network such as the internet. A drive 1410 is also connected to the I/O interface 1405 as necessary. A removable medium 1411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1410 as necessary, so that a computer program read out therefrom is installed into the storage portion 1408 as necessary.
According to an embodiment of the present disclosure, the method flow according to an embodiment of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network via the communication portion 1409 and/or installed from the removable medium 1411. The computer program, when executed by the processor 1401, performs the above-described functions defined in the system of the embodiment of the present disclosure. The above described systems, devices, apparatuses, modules, units, etc. may be implemented by computer program modules according to embodiments of the present disclosure.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method provided by the embodiments of the present disclosure, when the computer program product runs on an electronic device, the program code is configured to cause the electronic device to implement the operation and maintenance method for a server provided by the embodiments of the present disclosure.
When the computer program is executed by the processor 1401, the above-described functions defined in the system/apparatus of the embodiment of the present disclosure are performed. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of signals over a network medium, downloaded and installed via the communication portion 1409, and/or installed from the removable media 1411. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, C, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1402 and/or the RAM 1403 described above and/or one or more memories other than the ROM 1402 and the RAM 1403.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or associations of the features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or associations are not expressly recited in the present disclosure. In particular, various combinations and/or associations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (15)

1. An operation and maintenance method for a server, comprising:
screening at least one server from a pre-configured server list as an operation and maintenance object for a task;
executing operation and maintenance operations related to the task on the at least one server in batch; and
performing content checking on a program version for pre-delivery use to determine the accuracy of the program version,
wherein executing the operation and maintenance operations related to the task on the at least one server in batch comprises:
determining all service nodes involved in the at least one server, wherein all the service nodes are backend nodes of an Nginx service node;
for each of all the service nodes, performing the following operations:
performing an anomaly detection operation on the current node at intervals of a first preset time to obtain a corresponding detection result; and
in response to the detection result indicating that the current node is in an abnormal state, listing node information of the current node in a non-healthy node list and prohibiting the current node from continuing to provide services,
wherein, in performing content checking on the program version for pre-delivery use to determine the accuracy of the program version,
the program version for pre-delivery use is a current-stage version;
a previous-stage version related to the current-stage version is a program version that has already been delivered for use; and
performing content checking on the program version for pre-delivery use comprises:
acquiring a first image name corresponding to the current-stage version and a second image name corresponding to the previous-stage version;
determining whether a first image file matching the first image name and a second image file matching the second image name exist in an image repository;
in response to determining that the first image file and the second image file exist in the image repository, finding the first image file and the second image file and comparing the two found image files to determine at least one of the following program lists involved in the first image file: newly added programs, deleted programs, and modified programs; and
determining whether the current-stage version is accurate based on the comparison result.
2. The method of claim 1, wherein:
the at least one server is screened from the server list based on a filtering condition input by a user.
3. The method of claim 2, wherein:
the at least one server is screened from the server list based on the filtering condition input by the user through a graphical user interface.
4. The method according to claim 1, wherein executing the operation and maintenance operations related to the task on the at least one server in batch further comprises: for each of the at least one server,
determining all test objects contained in the current server;
for each test object, performing the following operations:
acquiring a handle associated with the current test object; and
automatically accessing the current test object through the handle to detect whether the current test object can currently work normally.
5. The method of claim 4, further comprising, for each test object, performing the following operations:
acquiring a browser driver;
starting a corresponding browser based on the browser driver; and
wherein the current test object is automatically accessed in the browser through the handle.
6. The method of claim 1, further comprising: in response to the detection result indicating that the current node is in an abnormal state,
performing a system restart operation or a master/standby node switching operation on the current node.
7. The method of claim 6, further comprising: in the event that the system restart operation is performed on a node listed in the non-healthy node list,
testing, by a Ping command at intervals of a second preset time, whether each node listed in the non-healthy node list has recovered to a healthy state, to obtain a corresponding test result; and
re-listing, in a healthy node list, the node information of each node that the test result indicates has recovered to the healthy state, and restarting the node that has recovered to the healthy state so that it continues to provide the corresponding services.
8. The method of claim 1, wherein:
starting a first container and a second container, wherein the first container is used for running the first image file, and the second container is used for running the second image file;
finding a first folder where the first image file is located based on the first container;
finding the first image file in the first folder;
finding a second folder where the second image file is located based on the second container; and
finding the second image file in the second folder.
9. The method of claim 1, wherein:
the current-stage version is determined to be accurate in response to the comparison result indicating that the actually updated content in the current-stage version is consistent with the expected updated content.
10. The method of claim 1, further comprising:
converting, by each of the servers involved in the server list, the logs generated by that server into a JSON format and then storing the converted logs in an ES database; and
providing personalized services meeting specific requirements for the logs stored in the ES database.
11. The method of claim 10, wherein the personalized service comprises at least one of: a log classification and summarization service, a log aggregation analysis service, a keyword-based log query service, a data drill-down service, and a custom dashboard service.
12. An operation and maintenance device for a server, comprising:
an operation and maintenance object screening module, configured to screen at least one server from a pre-configured server list as an operation and maintenance object for a task;
an operation and maintenance operation batch execution module, configured to execute operation and maintenance operations related to the task on the at least one server in batch; and
a mirror image difference comparison module, configured to perform content checking on a program version for pre-delivery use to determine the accuracy of the program version,
wherein executing the operation and maintenance operations related to the task on the at least one server in batch comprises:
determining all service nodes involved in the at least one server, wherein all the service nodes are backend nodes of an Nginx service node;
for each of all the service nodes, performing the following operations:
performing an anomaly detection operation on the current node at intervals of a first preset time to obtain a corresponding detection result; and
in response to the detection result indicating that the current node is in an abnormal state, listing node information of the current node in a non-healthy node list and prohibiting the current node from continuing to provide services, and
wherein, in performing content checking on the program version for pre-delivery use to determine the accuracy of the program version,
the program version for pre-delivery use is a current-stage version;
a previous-stage version related to the current-stage version is a program version that has already been delivered for use; and
performing content checking on the program version for pre-delivery use comprises:
acquiring a first image name corresponding to the current-stage version and a second image name corresponding to the previous-stage version;
determining whether a first image file matching the first image name and a second image file matching the second image name exist in an image repository;
in response to determining that the first image file and the second image file exist in the image repository, finding the first image file and the second image file and comparing the two found image files to determine at least one of the following program lists involved in the first image file: newly added programs, deleted programs, and modified programs; and
determining whether the current-stage version is accurate based on the comparison result.
13. An operation and maintenance platform comprising:
a front-end device for providing a graphical user interface so that a user can create and submit an operation and maintenance task; and
a backend framework for implementing the operation and maintenance method for the server according to any one of claims 1 to 11.
14. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-11.
15. A computer-readable storage medium storing computer-executable instructions for implementing the method of any one of claims 1 to 11 when executed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110173331.5A CN113014445B (en) 2021-02-08 2021-02-08 Operation and maintenance method, device and platform for server and electronic equipment

Publications (2)

Publication Number Publication Date
CN113014445A (en) 2021-06-22
CN113014445B (en) 2022-11-11

Family

ID=76384139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110173331.5A Active CN113014445B (en) 2021-02-08 2021-02-08 Operation and maintenance method, device and platform for server and electronic equipment

Country Status (1)

Country Link
CN (1) CN113014445B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113535208A (en) * 2021-08-05 2021-10-22 浙江万朋教育科技股份有限公司 Method for realizing visual centralized maintenance and updating test application program based on python
CN113992491B (en) * 2021-09-29 2024-04-02 中通服科信信息技术有限公司 Application server group operation and maintenance management system, method and device
CN114760314A (en) * 2022-04-06 2022-07-15 中国工商银行股份有限公司 Server management method, device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019033088A1 (en) * 2017-08-11 2019-02-14 ALTR Solutions, Inc. Immutable datastore for low-latency reading and writing of large data sets
CN110457204A (en) * 2019-07-05 2019-11-15 深圳壹账通智能科技有限公司 Code test method, device, computer equipment and storage medium
CN110661831A (en) * 2018-06-29 2020-01-07 复旦大学 Big data test field security initialization method based on trusted third party
CN112084008A (en) * 2020-09-10 2020-12-15 浪潮云信息技术股份公司 Method for rapidly deploying cloud pipe system based on container technology
CN112087516A (en) * 2020-09-10 2020-12-15 星辰天合(北京)数据科技有限公司 Storage upgrading method and device based on Docker virtualization technology

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9928161B1 (en) * 2016-09-08 2018-03-27 Fmr Llc Automated quality assurance testing of browser-based applications
CN106953758A (en) * 2017-03-20 2017-07-14 北京搜狐新媒体信息技术有限公司 A kind of dynamic allocation management method and system based on Nginx servers
CN109165024A (en) * 2018-07-26 2019-01-08 天讯瑞达通信技术有限公司 A kind of method of operation platform automatic deployment and monitoring server system
CN110460476B (en) * 2019-08-23 2022-08-02 福建广电网络集团股份有限公司 Network operation and maintenance management method
CN111177617A (en) * 2019-12-06 2020-05-19 上海上讯信息技术股份有限公司 Web direct operation and maintenance method and device based on operation and maintenance management system and electronic equipment
CN111371615B (en) * 2020-03-04 2023-07-14 深信服科技股份有限公司 Online server, method and system for updating operation and maintenance tool and readable storage medium

Also Published As

Publication number Publication date
CN113014445A (en) 2021-06-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant