CN113806429A - Canvas type log analysis method based on large data stream processing framework - Google Patents


Info

Publication number
CN113806429A
CN113806429A
Authority
CN
China
Prior art keywords
canvas
log analysis
operator
workflow
data stream
Prior art date
Legal status
Pending
Application number
CN202010533924.3A
Other languages
Chinese (zh)
Inventor
陈飞
赖键锋
廖子渊
Current Assignee
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd filed Critical Sangfor Technologies Co Ltd
Priority to CN202010533924.3A
Publication of CN113806429A
Legal status: Pending

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06F ELECTRIC DIGITAL DATA PROCESSING › G06F16/00 Information retrieval; database structures therefor; file system structures therefor
    • G06F16/258 Data format conversion from or to a database
    • G06F16/2471 Distributed queries
    • G06F16/248 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a canvas-based log analysis method, system, and storage medium based on a big data stream processing framework. The method comprises the following steps: acquiring the operators in the canvas work area according to the user's drag operations on the canvas; forming a log analysis workflow from the operators; converting the workflow into a corresponding JSON file; and generating, from the JSON file, a distributed program that runs on an executor, so that the executor executes the distributed program to analyze the log data. This addresses the problems of the prior art, in which log analysis using the open-source programs of a big data stream processing framework is complex to operate, cannot guarantee reliability and stability, and is difficult to maintain, and achieves simple and efficient log analysis on a big data stream processing framework.

Description

Canvas-based log analysis method based on a big data stream processing framework
Technical Field
The application relates to the technical field of big data, and in particular to a canvas-based log analysis method based on a big data stream processing framework.
Background
With the rapid development of internet technology, the volume of generated data has grown explosively, and how to leverage this big data in production and operations is of great significance to enterprises.
At present, a big data stream processing platform can be used to process data generated in an enterprise's production processes and thereby analyze streaming data. However, existing big data stream processing platforms mainly perform data analysis through the component operation flows of open-source systems; they are complex to operate, require an understanding of the underlying framework and principles, and are therefore difficult to use for business personnel without a big data background.
Disclosure of Invention
The embodiments of the application provide a canvas-based log analysis method, system, and storage medium based on a big data stream processing framework, aiming to solve the prior-art problems that log analysis using the open-source programs of a big data stream processing platform is complex to operate, cannot guarantee reliability and stability, and is difficult to maintain, and to achieve simple and efficient log analysis on a big data stream processing framework.
To achieve the above object, the application provides a canvas-based log analysis method based on a big data stream processing framework, comprising the following steps:
acquiring the operators in the canvas work area according to the user's drag operations on the canvas;
forming a log analysis workflow from the operators;
converting the workflow into a corresponding JSON file;
and generating, from the JSON file, a distributed program that runs on an executor, so that the executor executes the distributed program to analyze the log data.
Optionally, the step of generating, from the JSON file, a distributed program that runs on an executor comprises:
converting the JSON file into a DSL description file;
generating a corresponding workflow diagram from the DSL description file;
and generating, from the workflow diagram, the distributed program executed by the executor.
Optionally, the step of generating, from the workflow diagram, the distributed program executed by the executor is preceded by:
performing operator integration, verification, and rule matching on the workflow diagram.
Optionally, the step of generating, from the workflow diagram, the distributed program executed by the executor is preceded by:
constructing an execution environment according to the type of the workflow diagram and setting the parameter configuration.
Optionally, the step of constructing an execution environment according to the type of the workflow diagram comprises:
judging the type of the workflow diagram according to the source operators it contains;
if the type of the workflow diagram is a batch processing type, constructing the execution environment corresponding to batch processing;
and if the type of the workflow diagram is a stream processing type, constructing the execution environment corresponding to stream processing.
Optionally, after the step of constructing an execution environment according to the type of the workflow diagram and setting the parameter configuration, the method comprises:
collecting the data for parameter configuration;
and performing formatting and field operations on the data.
Optionally, after the step of acquiring the operators in the canvas work area according to the user's drag operations on the canvas, the method comprises:
if the operators include a user-defined operator,
acquiring the JAR dependency file corresponding to the user-defined operator;
and storing the JAR dependency file and writing its storage path into the JSON file.
In addition, to achieve the above object, the application further provides a canvas-based log analysis system based on a big data stream processing framework, wherein the system includes a canvas-based log analysis terminal based on a big data stream processing framework.
The terminal comprises a memory, a processor, and a canvas-based log analysis program based on a big data stream processing framework that is stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the following steps:
acquiring the operators in the canvas work area according to the user's drag operations on the canvas;
forming a log analysis workflow from the operators;
converting the workflow into a corresponding JSON file;
and generating, from the JSON file, a distributed program that runs on an executor, so that the executor executes the distributed program to analyze the log data.
In addition, to achieve the above object, the application further provides a canvas-based log analysis apparatus based on a big data stream processing framework, the apparatus comprising:
an acquisition module, configured to acquire the operators in the canvas work area according to the user's drag operations on the canvas;
an engine module, configured to convert the workflow into a corresponding JSON file;
and an executor module, configured to generate, from the JSON file, a distributed program that runs on the executor, so that the executor executes the distributed program to analyze the log data.
In addition, to achieve the above object, the application further provides a computer-readable storage medium storing a canvas-based log analysis program based on a big data stream processing framework, which, when executed by a processor, implements the method described above.
In this embodiment, the operators in the canvas work area are acquired according to the user's drag operations on the canvas, a log analysis workflow is formed from the operators, the workflow is converted into a corresponding JSON file, and a distributed program that runs on an executor is generated from the JSON file, so that the executor executes the distributed program to analyze the log data. Log analysis is completed by dragging operators into the canvas work area; no code needs to be written for the big data stream processing program, and canvas-based log analysis on a big data stream processing framework can be achieved without first studying the framework and its principles, thereby simplifying big data log analysis.
Drawings
Fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present application;
FIG. 2 is a schematic flowchart illustrating an embodiment of a canvas-based log analysis method based on a big data stream processing framework according to the present application;
FIG. 3 is a schematic flowchart illustrating another embodiment of a canvas-based log analysis method based on a big data stream processing framework according to the present application;
FIG. 4 is a schematic flowchart illustrating a canvas-based log analysis method based on a big data stream processing framework according to another embodiment of the present application;
FIG. 5 is a flow chart illustrating the construction of an execution environment according to the type of a workflow diagram according to the present application;
FIG. 6 is a flowchart illustrating the steps of constructing an execution environment according to the type of a workflow diagram and setting parameter configurations according to the present application;
Fig. 7 is a schematic flowchart of a further embodiment of a canvas-based log analysis method based on a big data stream processing framework according to the present application.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The main solution of the embodiments of the application is as follows: acquiring the operators in the canvas work area according to the user's drag operations on the canvas; forming a log analysis workflow from the operators; converting the workflow into a corresponding JSON file; and generating, from the JSON file, a distributed program that runs on an executor, so that the executor executes the distributed program to analyze the log data. In the prior art, when a big data stream processing platform is used for log analysis, importing from the data source, processing the data, and producing the analysis results are all implemented by writing code on the platform; log analysis implemented this way is complex to operate, cannot guarantee reliability and stability, and is difficult to maintain. Therefore, by dragging encapsulated big data stream processing framework operators into the canvas work area, the application constructs a distributed program that can run on an executor, so that log analysis based on a big data stream processing framework can be achieved without writing code on the platform, achieving simple and efficient log analysis on a big data stream processing framework.
As shown in fig. 1, fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 1, the terminal may include: a processor 1001, such as a CPU; a network interface 1004; a user interface 1003; a memory 1005; and a communication bus 1002. The communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the terminal may further include a camera, RF (Radio Frequency) circuitry, sensors, audio circuitry, a WiFi module, detectors, and so forth. Of course, the terminal may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, and a temperature sensor, which are not described here again.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 does not constitute a limitation of the terminal device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer-readable storage medium, may include therein an operating system, a network communication module, a user interface module, and a canvas-type log analysis program based on a big data stream processing framework.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with it; the user interface 1003 is mainly used for connecting to a client (user side) and performing data communication with it; and the processor 1001 may be configured to call the canvas-based log analysis program based on a big data stream processing framework stored in the memory 1005 and perform the following operations:
acquiring the operators in the canvas work area according to the user's drag operations on the canvas;
forming a log analysis workflow from the operators;
converting the workflow into a corresponding JSON file;
and generating, from the JSON file, a distributed program that runs on an executor, so that the executor executes the distributed program to analyze the log data.
Referring to fig. 2, fig. 2 is a schematic flowchart of an embodiment of a canvas-based log analysis method based on a big data stream processing framework according to the present application, where the canvas-based log analysis method based on a big data stream processing framework includes:
step S10, acquiring an operator of the canvas work area according to the dragging operation of the user on the canvas;
step S20, forming a workflow of log analysis according to the operator;
the canvas is an operation interface of the canvas-type log analysis system based on the big data stream processing frame, the canvas is designed to be divided into an upper part and a lower part, wherein an operator of the big data stream processing frame subjected to packaging processing is designed at the upper end of the interface as a basic operator, the lower part is a canvas working area, and a user selects the operator corresponding to the log analysis in a mouse clicking mode and drags the operator to the working area of the canvas. It will be appreciated that the design of the canvas may be selected by the operator according to the actual requirements. The acquisition is to acquire an operator of the working area by setting a program inside the system. The types of the operators comprise a source operator (Kafka source operator), a processing operator (parsing rule operator, association rule operator, big data stream processing framework SQL operator, Union operator), a destination operator (search engine destination operator, Kafka destination operator) and a self-defined operator (Async operation operator). When log analysis is performed by dragging data in a canvas, different operators are required to complete the analysis of the log.
The user needs to connect operators of different types according to the log analysis work to be processed, and the operators are connected according to the execution sequence to form a corresponding workflow.
Step S30, converting the workflow into a corresponding Json format file;
the Json format file is a data exchange format file and can be easily analyzed and generated by a computer. Its grammatical form includes a Json object, a Json array, and a Json nest. In the canvas-type log analysis method and system based on the big data stream processing framework, the workflow formed by connecting operators in the working area of the canvas is converted into the Json format file corresponding to the workflow.
Step S40: generating, from the JSON file, the distributed program that runs on the executor, so that the executor executes the distributed program to analyze the log data.
After the JSON file is generated, the system transmits it to the engine layer, which further processes it to generate a distributed program that can run on the executor.
In the present application, the Flink stream data processing platform is taken as an example. When a user uses the log analysis system to analyze data, the Flink source operator encapsulated by the system can be dragged onto the canvas to import the data to be processed into Flink, and the corresponding processing operators can then be dragged from the system's control bar according to the analysis to be performed on the data. Generally, when the data comes from an external data source, the external data can be imported into Flink by dragging a source operator. A processing operator (transformation) is then dragged in to convert the data, and the converted data is written to an external data source through a destination operator. It should be noted that a Flink job generally comprises a source operator, processing operators, and a destination operator; that is, the import, analysis, and storage of streaming data results are realized in the present application by dragging these three kinds of operators. There may be multiple processing operators, and the flow direction of the data is determined by the execution order of the operators.
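The source, transformation, and destination structure of such a job can be mimicked conceptually with plain Python generators. This is a toy illustration only, not the Flink API; the log record format is invented for the example.

```python
def source(lines):
    # Stands in for a source operator (e.g., a Kafka source): emits raw records.
    yield from lines

def transform(records):
    # Stands in for a processing operator: parse each raw line into fields.
    for r in records:
        level, _, msg = r.partition(" ")
        yield {"level": level, "msg": msg}

def sink(records, out):
    # Stands in for a destination operator: write results to an external store.
    out.extend(records)

out = []
sink(transform(source(["ERROR disk full", "INFO started"])), out)
```

Chaining the three stages mirrors how the canvas connects a source operator, a processing operator, and a destination operator into one pipeline.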
In this embodiment, when the system detects that a user has dragged operators, it acquires the operators in the canvas work area and the connections between them, obtains the workflow of the log analysis to be executed from those connections, converts the workflow into a JSON file, and then generates from the JSON file a distributed program that can run on the executor, so that the executor completes the log analysis. The user can analyze logs by dragging encapsulated operators on the canvas; the analysis can be completed without writing program code on the big data stream processing platform, achieving simple and efficient log analysis on a big data stream processing framework.
Referring to fig. 3, fig. 3 is a flowchart of the step of generating, from the JSON file, a distributed program that runs on an executor according to another embodiment of the present application; the step includes:
Step S41: converting the JSON file into a DSL description file;
Step S42: generating a corresponding workflow diagram from the DSL description file;
Step S43: generating, from the workflow diagram, the distributed program executed by the executor.
A DSL is a language for describing domain-specific objects, rules, and modes of operation. For example, when the JSON contains an SQL statement, the statement is ultimately handed to the corresponding database for processing; the database reads the useful information from the SQL statement and returns the corresponding results. By converting the JSON document into a DSL description, a workflow graph used at execution time is generated, and the corresponding distributed program that runs on the executor is generated from that workflow graph. The workflow graph is a data structure representing a job and is recognized by the data flow engine of the big data stream processing framework.
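One common way to turn such a job description into an execution-ordered graph is a topological sort over the operator edges. The sketch below is illustrative; the field names ("operators", "edges") are assumptions and this is not the patent's DSL.

```python
import json
from collections import defaultdict, deque

# Illustrative JSON job description (field names are assumed for the example).
doc = json.loads("""{
  "operators": ["kafka_source", "parse_rule", "es_sink"],
  "edges": [["kafka_source", "parse_rule"], ["parse_rule", "es_sink"]]
}""")

def build_execution_order(doc):
    """Topologically sort the workflow graph so operators run in dependency order."""
    indeg = {op: 0 for op in doc["operators"]}
    succ = defaultdict(list)
    for src, dst in doc["edges"]:
        succ[src].append(dst)
        indeg[dst] += 1
    queue = deque(op for op, d in indeg.items() if d == 0)
    order = []
    while queue:
        op = queue.popleft()
        order.append(op)
        for nxt in succ[op]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    return order

order = build_execution_order(doc)
```

The resulting order guarantees every operator is scheduled only after all of its upstream operators, which is the property a data flow engine needs from a job graph.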
The step of generating, from the workflow diagram, the distributed program executed by the executor is preceded by:
Step S44: performing operator integration, verification, and rule matching on the workflow diagram.
It is understood that a workflow diagram generated through the DSL language may contain duplicate operators (for example, multiple source operators), so further analysis and optimization of the workflow diagram is required; this analysis includes operator integration, verification, and rule matching on the workflow diagram. For example, when a user needs to analyze the data of multiple logs, multiple Kafka source operators are dragged in to import the corresponding log data; during processing, if the processing operators used are the same (performing the same operation), only the command corresponding to one of them needs to be selected to analyze the data, avoiding the unreasonable resource allocation caused by repeated operations.
In this embodiment, when multiple data sources, that is, duplicate operator nodes, exist in the workflow graph, the system analyzes whether the operators refer to the same data source and optimizes them accordingly, preventing the same operation instruction from being executed repeatedly on the executor, which would slow down the system's log analysis.
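The operator-integration idea, merging source operators with identical configuration so each data source is read only once, might look like the following sketch; the operator ids and configuration keys are invented for illustration.

```python
def integrate_sources(operators):
    """Merge operators whose type and configuration are identical.

    Returns the deduplicated operator list and a mapping from each removed
    duplicate id to the canonical id it was merged into (so that edges
    referencing the duplicate can be redirected afterwards).
    """
    seen = {}    # (type, frozen config) -> canonical operator id
    merged = {}  # duplicate id -> canonical id
    result = []
    for op in operators:
        key = (op["type"], tuple(sorted(op["config"].items())))
        if key in seen:
            merged[op["id"]] = seen[key]
        else:
            seen[key] = op["id"]
            result.append(op)
    return result, merged

ops = [
    {"id": "s1", "type": "kafka_source", "config": {"topic": "logs"}},
    {"id": "s2", "type": "kafka_source", "config": {"topic": "logs"}},   # duplicate
    {"id": "s3", "type": "kafka_source", "config": {"topic": "audit"}},  # distinct
]
result, merged = integrate_sources(ops)
```

Only truly identical sources are merged; a source operator reading a different topic is kept, which matches the intent of avoiding repeated reads without losing distinct data sources.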
Referring to fig. 4, fig. 4 is a schematic flowchart of another embodiment of the present application; before the step of performing operator integration, verification, and rule matching on the workflow diagram, the method includes:
Step S45: constructing an execution environment according to the type of the workflow diagram and setting the parameter configuration.
In this embodiment, the workflow diagrams generated from the DSL description are of two types: a workflow diagram corresponding to a stream processing job and a workflow diagram corresponding to a batch processing job. The configuration parameters of the job are set according to these two workflow diagram types.
Referring to fig. 5, the step of constructing an execution environment according to the type of the workflow diagram includes:
Step S451: judging the type of the workflow graph according to the source operators it contains;
Step S452: if the type of the workflow diagram is a batch processing type, constructing the execution environment corresponding to batch processing;
Step S453: if the type of the workflow diagram is a stream processing type, constructing the execution environment corresponding to stream processing.
The data (that is, the logs) of a batch processing job is bounded, while the data of a stream processing job is unbounded. It will be appreciated that batch processing may be performed sequentially or in parallel over a series of related tasks; for example, an online music service may take the songs a user listened to over the past year as a batch data source and, through analysis and computation, output a summary of the user's listening behavior. Stream processing deals with a series of continuously changing data; its data source is dynamic, and log analysis, such as real-time recommendation, must be performed on that source in real time.
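Judging the job type from the boundedness of its source operators could be sketched as follows; which source types count as bounded or unbounded here is an assumption for illustration, not taken from the patent.

```python
# Assumed classification: file/JDBC sources deliver bounded data,
# Kafka/socket sources deliver unbounded streams.
BOUNDED_SOURCES = {"file_source", "jdbc_source"}

def job_type(source_ops):
    """Pick the execution environment from the workflow's source operators."""
    if all(op in BOUNDED_SOURCES for op in source_ops):
        return "batch"    # construct a batch execution environment
    return "stream"       # construct a stream execution environment

t1 = job_type(["file_source"])   # bounded data -> batch job
t2 = job_type(["kafka_source"])  # unbounded data -> streaming job
```

If even one source is unbounded the whole job must run as a stream job, which is why the check requires all sources to be bounded before choosing the batch environment.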
In the application, the type of the data flow graph to be processed is judged according to the types of the source operators contained in the workflow graph, so that the engine is directed to construct the corresponding execution environment. The configuration parameters of the job are then set, the API encapsulation of the operators and the transfer and conversion of their schemas are completed by traversing the whole pipeline, and the result is finally submitted to the distributed engine for execution.
In a specific embodiment, the application further provides a batch-stream fusion method based on asynchronous IO, which constructs an execution environment capable of performing stream processing and batch processing simultaneously and supports querying and using batch data during stream processing. Specifically, by introducing an asynchronous IO operator, the system can support typical batch-stream fusion scenarios, interact efficiently with external systems, and access an external database to query the full historical data while processing stream data. Batch-stream fusion can support a typical Lambda architecture: on one hand, stream data is processed in real time by the stream processing engine, producing real-time results that enter the stream data middleware for real-time query and display, while the stream data is also written into the data storage middleware for full analysis; on the other hand, the stream data is stored by the batch processing engine in a database such as MongoDB or Redis, and during stream processing the results of the batch processing engine are obtained by asynchronously querying that database.
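The asynchronous-IO lookup of batch results during stream processing can be illustrated with Python's asyncio; the in-memory dict below merely stands in for an external store such as MongoDB or Redis, and all names are invented for the sketch.

```python
import asyncio

# Pretend batch-computed results, standing in for MongoDB/Redis contents.
HISTORY = {"user1": 42, "user2": 7}

async def lookup(user):
    await asyncio.sleep(0)  # stands in for a real asynchronous DB round-trip
    return HISTORY.get(user, 0)

async def enrich(stream):
    # Issue all lookups concurrently instead of blocking the stream on each
    # one, which is the point of the asynchronous IO operator.
    results = await asyncio.gather(*(lookup(u) for u in stream))
    return list(zip(stream, results))

enriched = asyncio.run(enrich(["user1", "user2", "user3"]))
```

Records whose key is missing from the batch store still pass through with a default value, so the stream is never stalled waiting on history that does not exist.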
Referring to fig. 6, the step of constructing an execution environment according to the type of the workflow diagram and setting the parameter configuration includes:
Step S46: collecting the data for parameter configuration;
Step S47: performing formatting and field operations on the data.
In this embodiment, according to the different job flows, the data requiring parameter configuration is formatted and field operations are performed on it.
In this embodiment, execution environments corresponding to the different workflow diagram types are constructed so that the workflow diagram can be parameterized, and a scenario combining the batch processing and stream processing job types is provided, so that batch data can be queried and used during stream processing.
Referring to fig. 7, fig. 7 is a schematic flowchart of a further embodiment of the present application; after the step of acquiring the operators in the canvas work area according to the user's drag operations on the canvas, the method includes:
Step S50: determining whether the operators include a user-defined operator;
in this embodiment, the operator may be a specific operation performed by the user in the executor according to the operator by dragging the operator to the working area of the drawing board, and the performing operation may be a specific operation in an alternative embodiment, such as an addition, a subtraction, a multiplication, a division, or the like operation, or an operation for finding a maximum value, a minimum value, an average value, or an operation for constructing a mathematical model, such as a cyclic operation model, through the operator. It can be understood that, in the system, a large data stream processing framework and an API, which are frequently used in log analysis processing, are encapsulated into an operator which can realize log analysis on a canvas in a dragging manner, that is, the operator is designed in a toolbar area of the canvas according to a functional attribute, a user-defined operator is a log processed by a user according to a required function, and when the operator of the toolbar cannot meet the analysis requirement, the user-defined operator is dragged to a working area of the canvas.
Step S60: acquiring the JAR dependency file corresponding to the user-defined operator;
Step S70: storing the JAR dependency file and writing its storage path into the JSON file.
It can be understood that when a user needs a user-defined operator to complete log analysis, the system does not contain the program that the user-defined operator must execute; that is, there is no operation in the system, runnable on the executor, corresponding to the operation the user-defined operator is meant to realize. For this reason, the user can import the JAR dependency file corresponding to the user-defined operator into the system through the log analysis system's import command, and the storage path of that JAR dependency file is written into the JSON file, so that when the JSON file is parsed, the client program can submit the JARs contained in the job flow to the cluster according to the resource request parameters in the JSON file and allocate resources.
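Recording a custom operator's JAR dependency and writing its storage path into the JSON job document might be sketched as below; the "custom_jars" field and the helper function are hypothetical, not the patent's schema.

```python
import json
import os
import shutil
import tempfile

def register_custom_jar(job_doc: dict, jar_path: str, store_dir: str) -> dict:
    """Copy the operator's JAR into storage and record its path in the job doc."""
    os.makedirs(store_dir, exist_ok=True)
    stored = os.path.join(store_dir, os.path.basename(jar_path))
    shutil.copyfile(jar_path, stored)
    # Record the storage path so the client program can submit the JAR
    # to the cluster when the JSON document is parsed.
    job_doc.setdefault("custom_jars", []).append(stored)
    return job_doc

tmp = tempfile.mkdtemp()
jar = os.path.join(tmp, "my_async_op.jar")
open(jar, "wb").close()  # placeholder file standing in for a real JAR
job = register_custom_jar({"operators": []}, jar, os.path.join(tmp, "store"))
doc = json.dumps(job)    # the path travels with the job description
```

Keeping the path inside the job document means the cluster submission step needs no side channel: everything the client must upload is named in the JSON it already parses.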
In this embodiment, a user can complete log analysis by importing customized operators, so the method is not limited to the operators already in the system and is convenient and fast to operate.
In addition, in order to achieve the above object, the present application further provides a canvas-based log analysis system based on a big data stream processing framework. The system includes a canvas-based log analysis terminal based on the big data stream processing framework, wherein:
the canvas-based log analysis terminal based on the big data stream processing framework comprises a memory, a processor, and a canvas-based log analysis program based on the big data stream processing framework that is stored on the memory and runnable on the processor; the program, when executed by the processor, implements the following steps:
acquiring an operator of a canvas working area according to a dragging operation of a user on the canvas;
forming a workflow of log analysis according to the operator;
converting the workflow into a corresponding Json format file;
and generating a distributed program operated on the executor according to the Json format file so that the executor executes the distributed program to analyze the log data.
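The four steps above can be sketched end to end: the canvas workflow is a small graph of operators and edges, serialized to Json and handed to the engine. The operator names, parameters, and field names below are hypothetical, not the patented schema.

```python
import json

# Hypothetical canvas workflow: dragged operators as nodes, drawn links as edges.
workflow = {
    "type": "stream",  # batch vs. stream later decides the execution environment
    "operators": [
        {"id": "src", "op": "kafka_source", "params": {"topic": "logs"}},
        {"id": "flt", "op": "filter", "params": {"field": "level", "equals": "ERROR"}},
        {"id": "snk", "op": "es_sink", "params": {"index": "error-logs"}},
    ],
    "edges": [["src", "flt"], ["flt", "snk"]],
}

json_text = json.dumps(workflow)  # the Json format file handed to the engine

def downstream_of(doc: str, op_id: str):
    """Parse the Json file and list the operators fed by `op_id` (sketch only)."""
    g = json.loads(doc)
    return [dst for src, dst in g["edges"] if src == op_id]
```

An engine walking such a graph in topological order could then emit the corresponding framework calls to build the distributed program.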
In addition, in order to achieve the above object, the present application also provides a canvas-based log analysis apparatus based on a big data stream processing framework, the apparatus including:
the acquisition module is used for acquiring the operators of the canvas working area;
the engine module is used for forming a workflow according to the operators and converting it into a corresponding Json format file;
and the executor module is used for generating, from the Json format file, a distributed program running on the executor, so that the executor executes the distributed program to analyze the log data.
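The division of labor among the three modules can be pictured as a small pipeline; the class names, method names, and the trivial filter logic below are illustrative assumptions, not the apparatus itself.

```python
import json

class AcquisitionModule:
    """Collects the operators the user dragged into the canvas working area."""
    def acquire(self, canvas_state):
        return canvas_state["dropped_operators"]

class EngineModule:
    """Forms a workflow from the operators and converts it to a Json file."""
    def to_json(self, operators):
        workflow = {
            "operators": operators,
            # Sketch: chain operators in drop order into a linear workflow.
            "edges": [[a["id"], b["id"]] for a, b in zip(operators, operators[1:])],
        }
        return json.dumps(workflow)

class ExecutorModule:
    """Turns the Json file into a runnable job (stubbed) and executes it."""
    def run(self, json_text, log_lines):
        ops = json.loads(json_text)["operators"]
        # Stub: apply each 'filter' operator in sequence to the log data.
        for op in ops:
            if op["op"] == "filter":
                log_lines = [l for l in log_lines if op["params"]["keyword"] in l]
        return log_lines
```

In a real system the executor module would submit a framework job to a cluster rather than filter lines in-process; the stub only shows how the three modules hand data along.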
In addition, in order to achieve the above object, the present application also provides a computer readable storage medium on which a canvas-based log analysis program based on a big data stream processing framework is stored; when executed by a processor, the program implements the method as described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A canvas-based log analysis method based on a big data stream processing framework, the method comprising:
acquiring an operator of a canvas working area according to a dragging operation of a user on the canvas;
forming a workflow of log analysis according to the operator;
converting the workflow into a corresponding Json format file;
and generating a distributed program operated on the executor according to the Json format file so that the executor executes the distributed program to analyze the log data.
2. The canvas-based log analysis method of claim 1, wherein the step of generating, from the Json format file, a distributed program running on an executor comprises:
converting the Json format file into a DSL description file;
generating a corresponding workflow diagram according to the DSL description file;
and generating a distributed program operated by the executor according to the workflow diagram.
3. The canvas-based log analysis method of claim 2, wherein the step of generating an executor-run distributed program from the workflow graph is preceded by:
and performing operator integration, verification, and rule matching operations on the workflow diagram.
4. The canvas-based log analysis method of claim 2, wherein the step of performing operator integration, verification, rule matching operations on the workflow graph is preceded by:
and constructing an execution environment according to the type of the workflow diagram, and setting parameter configuration.
5. The canvas-based log analysis method of claim 4, wherein the step of constructing an execution environment in accordance with a workflow diagram type comprises:
judging the type of the workflow diagram according to a source operator contained in the workflow diagram;
if the type of the workflow diagram is a batch processing type, constructing an execution environment corresponding to batch processing;
and if the type of the workflow diagram is a flow processing type, constructing an execution environment corresponding to flow processing.
6. The canvas-based log analysis method according to claim 4, wherein the step of constructing an execution environment according to the type of the workflow diagram and setting the parameter configuration is followed by:
collecting data for parameter configuration;
and carrying out formatting and field operation on the data.
7. The canvas-based log analysis method of claim 1, wherein the step of obtaining the operator of the canvas work area according to the user's drag operation in the canvas is followed by:
if the operators contain a custom operator;
acquiring the jar dependency file corresponding to the custom operator;
and storing the jar dependency file and writing its storage path into the Json file.
8. A canvas-based log analysis system based on a big data stream processing framework, characterized by comprising a canvas-based log analysis terminal based on the big data stream processing framework; wherein:
the canvas-based log analysis terminal based on the big data stream processing framework comprises a memory, a processor, and a canvas-based log analysis program based on the big data stream processing framework that is stored on the memory and runnable on the processor; the program, when executed by the processor, implements the following steps:
acquiring an operator of a canvas working area according to a dragging operation of a user on the canvas;
forming a workflow of log analysis according to the operator;
converting the workflow into a corresponding Json format file;
and generating a distributed program operated on the executor according to the Json format file so that the executor executes the distributed program to analyze the log data.
9. An apparatus for canvas-based log analysis based on a big data stream processing framework, the apparatus comprising:
the acquisition module is used for acquiring operators in the canvas working area according to the dragging operation of a user on the canvas;
the engine module is used for forming a workflow of log analysis according to the operators and converting the workflow into a corresponding Json format file;
and the actuator module is used for generating the Json format file into a distributed program running on the actuator so that the actuator executes the distributed program to analyze the log data.
10. A computer-readable storage medium, wherein the storage medium has stored thereon a big-data-stream-processing-framework-based canvas-style log analysis program, which when executed by a processor implements the method of any of claims 1-7.
CN202010533924.3A 2020-06-11 2020-06-11 Canvas type log analysis method based on large data stream processing framework Pending CN113806429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010533924.3A CN113806429A (en) 2020-06-11 2020-06-11 Canvas type log analysis method based on large data stream processing framework


Publications (1)

Publication Number Publication Date
CN113806429A true CN113806429A (en) 2021-12-17

Family

ID=78892158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010533924.3A Pending CN113806429A (en) 2020-06-11 2020-06-11 Canvas type log analysis method based on large data stream processing framework

Country Status (1)

Country Link
CN (1) CN113806429A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678790A (en) * 2016-07-29 2018-02-09 华为技术有限公司 Flow calculation methodologies, apparatus and system
CN109522138A (en) * 2018-11-14 2019-03-26 北京中电普华信息技术有限公司 A kind of processing method and system of distributed stream data
CN110704290A (en) * 2019-09-27 2020-01-17 百度在线网络技术(北京)有限公司 Log analysis method and device
CN110941467A (en) * 2019-11-06 2020-03-31 第四范式(北京)技术有限公司 Data processing method, device and system
CN111209309A (en) * 2020-01-13 2020-05-29 腾讯科技(深圳)有限公司 Method, device and equipment for determining processing result of data flow graph and storage medium


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115499303A (en) * 2022-08-29 2022-12-20 浪潮软件科技有限公司 Log analysis tool based on Flink
CN115357309A (en) * 2022-10-24 2022-11-18 深信服科技股份有限公司 Data processing method, device and system and computer readable storage medium
CN115357309B (en) * 2022-10-24 2023-07-14 深信服科技股份有限公司 Data processing method, device, system and computer readable storage medium
CN116501386A (en) * 2023-03-31 2023-07-28 中国船舶集团有限公司第七一九研究所 Automatic calculation program solving method based on data pool and related device
CN116501386B (en) * 2023-03-31 2024-01-26 中国船舶集团有限公司第七一九研究所 Automatic calculation program solving method based on data pool and related device

Similar Documents

Publication Publication Date Title
CN109582660B (en) Data blood margin analysis method, device, equipment, system and readable storage medium
US20210318851A1 (en) Systems and Methods for Dataset Merging using Flow Structures
CN105550268B (en) Big data process modeling analysis engine
JP5298117B2 (en) Data merging in distributed computing
CN109656963B (en) Metadata acquisition method, apparatus, device and computer readable storage medium
US10318595B2 (en) Analytics based on pipes programming model
US11314808B2 (en) Hybrid flows containing a continous flow
CN113806429A (en) Canvas type log analysis method based on large data stream processing framework
CN110908641B (en) Visualization-based stream computing platform, method, device and storage medium
US9886477B2 (en) Generating imperative-language query code from declarative-language query code
CN115617327A (en) Low code page building system, method and computer readable storage medium
WO2016082468A1 (en) Data graphing method, device and database server
WO2016018942A1 (en) Systems and methods for an sql-driven distributed operating system
CN112199086A (en) Automatic programming control system, method, device, electronic device and storage medium
KR20150092586A (en) Method and Apparatus for Processing Exploding Data Stream
AU2017254506B2 (en) Method, apparatus, computing device and storage medium for data analyzing and processing
US20170242665A1 (en) Generation of hybrid enterprise mobile applications in cloud environment
CN106293891B (en) Multidimensional investment index monitoring method
US20190213007A1 (en) Method and device for executing the distributed computation task
CN112286957B (en) API application method and system of BI system based on structured query language
WO2021253641A1 (en) Shading language translation method
WO2021068692A1 (en) Method, apparatus and device for workflow migration, and computer-readable storage medium
CN113419789A (en) Method and device for generating data model script
US20180121526A1 (en) Method, apparatus, and computer-readable medium for non-structured data profiling
CN113962597A (en) Data analysis method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination