CN111597058A - Data stream processing method and system - Google Patents

Data stream processing method and system

Info

Publication number
CN111597058A
Authority
CN
China
Prior art keywords
processing
processing node
data
node
topology
Prior art date
Legal status
Granted
Application number
CN202010307212.XA
Other languages
Chinese (zh)
Other versions
CN111597058B (en)
Inventor
周源
贾晓捷
冯萌萌
王佳佳
Current Assignee
Weimeng Chuangke Network Technology China Co Ltd
Original Assignee
Weimeng Chuangke Network Technology China Co Ltd
Priority date
Filing date
Publication date
Application filed by Weimeng Chuangke Network Technology China Co Ltd filed Critical Weimeng Chuangke Network Technology China Co Ltd
Priority to CN202010307212.XA priority Critical patent/CN111597058B/en
Publication of CN111597058A publication Critical patent/CN111597058A/en
Application granted granted Critical
Publication of CN111597058B publication Critical patent/CN111597058B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/547 Remote procedure calls [RPC]; Web services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 Configuration management of networks or network elements
    • H04L 41/0803 Configuration setting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/06 Network architectures or network communication protocols for network security for supporting key management in a packet data network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/2866 Architectures; Arrangements
    • H04L 67/288 Distributed intermediate devices, i.e. intermediate devices for interaction with other intermediate devices on the same level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F9/00
    • G06F 2209/54 Indexing scheme relating to G06F9/54
    • G06F 2209/547 Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the invention discloses a data stream processing method and system, which are used to solve the problem that, in existing data stream processing methods, communication among processing nodes depends on a centralized node, making deployment heavy and scaling out or in inconvenient. The method comprises the following steps: each processing node acquires its own configuration file, the configuration file comprising topology parameters; each processing node builds a topology framework according to its own topology parameters; when any processing node receives a data stream, that processing node processes the data stream according to the topology architecture and outputs processing result information. In the embodiment of the invention, each processing node spontaneously builds the topology framework from its own topology parameters; every processing node is independent, concerns itself only with its own topology parameters and not with the behavior of other processing nodes, and there is no centralized processing node in the topology framework, so the deployed framework is relatively lightweight and convenient to scale out or in.

Description

Data stream processing method and system
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a data stream processing method and system.
Background
With the rise of Internet big data, the development of big-data processing technology has accelerated. Different kinds of data place different requirements on processing techniques. A stream processing system acquires data in real time, computes on the acquired data in real time, and quickly feeds the results back to the user once computation completes, so as to achieve fast response, low latency and reliability. Stream processing systems therefore have the characteristics of high speed, high efficiency and high fault tolerance, and can process data information accurately. In practical applications, stream processing systems can be applied to scenarios such as fire alarms and gas leakage alarms.
A commonly used stream processing framework is Storm, a distributed real-time computing framework. Among the platform technologies for big-data stream processing, Storm is known for good real-time behavior and high performance, as well as high scalability, stability and reliability, and it has received wide attention and use in industry. As a stream data processing engine, Storm schedules tasks with a round-robin algorithm and performs its computation in memory, so that every message can be processed with fast response, which makes it well suited to real-time stream processing.
However, Storm realizes mutual discovery among processing nodes through a centralized node; that is, intercommunication among processing nodes relies on the centralized node. For example, starting Storm requires ZooKeeper. As a result, Storm deployment is cumbersome, and scaling out or in is inconvenient.
Disclosure of Invention
The embodiment of the invention provides a data stream processing method and a data stream processing system, which are used to solve the problem that, in existing data stream processing methods, communication among processing nodes depends on a centralized node, making deployment heavy and scaling out or in inconvenient.
The embodiment of the invention adopts the following technical scheme:
in a first aspect, a data stream processing method is provided, where the method includes:
each processing node acquires a configuration file of the processing node, wherein the configuration file comprises topology parameters;
each processing node builds a topology framework according to the topology parameters of the processing node;
when any processing node receives the data stream, the processing node processes the data stream according to the topology architecture and outputs processing result information.
In a second aspect, a data stream processing system is provided, the system comprising a plurality of processing nodes, wherein each processing node comprises an acquisition module, a construction module and a service processing module, wherein:
the acquisition module is used for acquiring a configuration file of the processing node, wherein the configuration file comprises topology parameters;
the building module is used for building a topological framework according to the topological parameters of the processing nodes;
and the service processing module is used for performing service processing on the data stream according to the topology framework and outputting processing result information when the processing node receives a data stream.
In a third aspect, a data stream processing system is provided, which includes: a memory storing computer program instructions;
a processor, wherein the computer program instructions, when executed by the processor, implement the data stream processing method described above.
In a fourth aspect, a computer-readable storage medium is provided, which comprises instructions that, when executed on a computer, cause the computer to perform the data stream processing method described above.
At least one of the technical solutions adopted in the embodiments of the invention can achieve the following beneficial effects:
according to the embodiment of the invention, each processing node spontaneously builds the topology framework according to the own topology parameters, each processing node is independent, only the own topology parameters are concerned, the behaviors of other processing nodes are not concerned, and no centralized processing node exists in the topology framework, so that the deployed framework is relatively light and convenient for capacity expansion and contraction.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic flow chart of a data stream processing method provided in an embodiment of the present specification;
fig. 2 is a first schematic diagram of a practical application scenario of the data stream processing method provided in an embodiment of the present specification;
fig. 3 is a second schematic diagram of a practical application scenario of the data stream processing method provided in an embodiment of the present specification;
fig. 4 is a third schematic diagram of a practical application scenario of the data stream processing method provided in an embodiment of the present specification;
fig. 5 is a fourth schematic diagram of a practical application scenario of the data stream processing method provided in an embodiment of the present specification;
fig. 6 is a fifth schematic diagram of a practical application scenario of the data stream processing method provided in an embodiment of the present specification;
fig. 7 is a sixth schematic diagram of a practical application scenario of the data stream processing method provided in an embodiment of the present specification;
FIG. 8 is a block diagram of a data stream processing system according to one embodiment of the present disclosure;
fig. 9 is a second schematic structural diagram of a data stream processing system according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, the technical solutions of the present application will be clearly and completely described below with reference to the specific embodiments of the present specification and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step are within the scope of the present application.
The embodiment of the invention provides a data stream processing method and a data stream processing system, which are used to solve the problem that, in existing data stream processing methods, communication among processing nodes depends on a centralized node, making deployment heavy and scaling out or in inconvenient. Embodiments of the present invention provide a data stream processing method, and the execution subject of the method may be, but is not limited to, an application program or a system capable of being configured to execute the method provided by the embodiments of the present invention.
Fig. 1 is a flowchart of a data stream processing method according to an embodiment of the present invention, where the method in fig. 1 may be executed by a system, and as shown in fig. 1, the method may include:
in step 110, each processing node obtains its own configuration file.
Wherein the configuration file may include topology parameters.
The topology parameters may include: identification information of the processing node; the data to be processed by and the processing result data of the processing node; the key value with which the processing node reads from the message middleware and the key list with which the processing node writes to the message middleware; the type of message middleware the processing node reads from; the number of threads of the processing node; an instruction for controlling whether the processing node executes service processing; and the like.
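For illustration only, the topology parameters listed above could be held in a simple parameter object such as the following sketch; all class, field and enum names here are assumptions made for this example and are not taken from the patent.

    import java.util.List;

    // Minimal sketch of the per-node topology parameters described above.
    // All names are illustrative assumptions, not the patent's actual code.
    public class TopologyParameters {
        public enum QueueType { BLOCKING_QUEUE, REDIS, KAFKA, MEMCACHEQ }

        private String nodeId;               // identification information of the processing node
        private String readEventName;        // key value used to read data from the message middleware
        private List<String> emitEventNames; // key list used to write results to the message middleware
        private QueueType readEventType;     // type of message middleware the node reads from
        private int threadNum;               // number of worker threads for this node
        private boolean closed;              // "close" instruction: the node performs no processing at all
        private boolean skipped;             // "skip" instruction: the node forwards data without processing it
        // getters and setters omitted for brevity
    }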
Step 120: each processing node builds the topology framework according to its own topology parameters.
For example, the topology parameters may include the data to be processed by and the processing result data of the processing node, the key value with which the processing node reads from the message middleware, and the key value with which the processing node writes data into the message middleware. This step can be implemented as follows:
each processing node determines the association relations among the processing nodes according to the data to be processed and the processing result data, so as to generate a topology graph.
The topology graph is a directed acyclic graph, which is used to determine the data flow direction, to check the performance pressure on processing nodes, and the like.
Illustratively, as shown in FIG. 2, assume that:
the processing result data of the processing node (node for short) 1 is the data to be processed of the nodes 3 and 4, the processing result data of the node 2 is the data to be processed of the nodes 4 and 5, the processing result data of the node 4 is the data to be processed of the node 6, the processing result data of the node 4 is the data to be processed of the nodes 5 and 7, and the processing result data of the node 7 is the data to be processed of the node 6.
Then, the association relationship between the nodes 1 to 7 can be determined, and a directed acyclic graph, i.e. a topological graph, is generated according to the association relationship.
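As an illustration, one way such a topology graph could be derived is to match each node's emitted event names against the read event names declared by the other nodes; the sketch below follows that idea. The class and method names are assumptions for this example, not the patent's code.

    import java.util.*;

    // Sketch: derive the directed topology graph by matching event names --
    // node A points to node B when A emits an event name that B declares as its input.
    public class TopologyGraphBuilder {

        // readKeyByNode: node id -> the key it reads; emitKeysByNode: node id -> the keys it emits
        public static Map<String, Set<String>> build(Map<String, String> readKeyByNode,
                                                     Map<String, List<String>> emitKeysByNode) {
            // index: event name -> ids of the nodes that read it
            Map<String, Set<String>> readers = new HashMap<>();
            readKeyByNode.forEach((node, key) ->
                    readers.computeIfAbsent(key, k -> new HashSet<>()).add(node));

            // edges: producer node -> consumer nodes (a directed acyclic graph if the keys form no cycle)
            Map<String, Set<String>> edges = new HashMap<>();
            emitKeysByNode.forEach((node, keys) -> {
                Set<String> downstream = new HashSet<>();
                for (String key : keys) {
                    downstream.addAll(readers.getOrDefault(key, Collections.emptySet()));
                }
                edges.put(node, downstream);
            });
            return edges;
        }
    }

For instance, if node 1 emits "Key1-3" and node 3 declares "Key1-3" as its read key, the builder records the edge node 1 → node 3.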
According to the topology graph, among the processing nodes, a first processing node that writes data into the message middleware with a key value establishes communication with a second processing node that reads data from the message middleware with the same key value, thereby building the topology architecture.
As shown in fig. 3, the message middleware is embedded in the system in the form of an interface plug-in; commonly used message middleware includes Kafka, Redis, BlockingQueue (provided by the Java Development Kit, JDK), memcacheq, and the like. By means of plug-in extension, communication can be carried out inside the JVM (Java Virtual Machine) or across JVMs.
In a specific implementation, the embodiment of the invention uses BlockingQueue, with which data can only be transferred within one JVM. Embodiments of the invention may instead use other message middleware, with which data can be passed across machines.
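For illustration, such a pluggable middleware layer could expose a small keyed read/write interface, with the in-JVM variant backed by the JDK BlockingQueue, roughly as sketched below; the interface and class names are assumptions, not the patent's code.

    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.LinkedBlockingQueue;

    // Sketch of a pluggable message-middleware interface; Kafka or Redis plug-ins
    // could implement the same interface to pass data across machines.
    interface MessageMiddleware {
        void write(String key, String data) throws InterruptedException;
        String read(String key) throws InterruptedException;
    }

    // In-JVM implementation backed by the JDK BlockingQueue: data stays inside one JVM.
    class InJvmBlockingQueueMiddleware implements MessageMiddleware {
        private final Map<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();

        private BlockingQueue<String> queue(String key) {
            return queues.computeIfAbsent(key, k -> new LinkedBlockingQueue<>());
        }

        @Override public void write(String key, String data) throws InterruptedException {
            queue(key).put(data);     // the upstream node only writes; it never pushes to downstream nodes
        }

        @Override public String read(String key) throws InterruptedException {
            return queue(key).take(); // the downstream node pulls when it is idle (pull mode)
        }
    }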
The key value adopted by the first processing node for writing data into the message middleware is identical to the key value adopted by the second processing node for reading data from the message middleware.
For example, as shown in fig. 2, assuming that the first processing node is node 1 and the second processing node is node 3, node 1 writes its processing result data into the message middleware using the key value Key1-3, and node 3 reads its data to be processed from the message middleware using the key value Key1-3. Similarly, assuming that the first processing node is node 4 and the second processing node is node 5, node 4 writes its processing result data into the message middleware using the key value Key4-5, and node 5 reads its data to be processed from the message middleware using the key value Key4-5. By analogy, the processing nodes establish communication by adopting the same key values, thereby building the topology architecture.
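Building on the middleware sketch above, the Key1-3 example could be wired as follows; again, the names are purely illustrative.

    // Usage sketch of the key matching described above: node 1 writes under "Key1-3"
    // and node 3 reads with the same key, so the two nodes communicate through the
    // middleware without knowing about each other and without a centralized coordinator.
    public class KeyMatchedWiring {
        public static void main(String[] args) throws InterruptedException {
            MessageMiddleware middleware = new InJvmBlockingQueueMiddleware();

            // node 1: writes its processing result data with key "Key1-3"
            new Thread(() -> {
                try {
                    middleware.write("Key1-3", "result-of-node-1");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();

            // node 3: reads its data to be processed with the same key "Key1-3"
            System.out.println("node 3 received: " + middleware.read("Key1-3"));
        }
    }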
Step 130, when any processing node receives the data stream, the processing node processes the data stream according to the topology architecture and outputs processing result information.
In the embodiment of the invention, the relations among the processing nodes and the logic of the topology graph can be realized with Java annotations. Each processing node concerns itself only with the data it reads and the data it outputs, not with the behavior of other processing nodes, and the constructed topology architecture has no centralized processing node, so the deployed framework is lightweight and convenient to scale out or in.
As an embodiment, step 110 may be specifically implemented as:
a package providing a class of processing nodes is scanned to obtain class definitions, annotations of which are attached to the class of processing nodes.
As shown in FIG. 3, this step may be performed by a class scanner on the processing node scanning packets that provide a class of processing nodes.
And analyzing the annotation defined by the class to obtain the processing node class and the processing node class object.
And instantiating the processing node class.
Before executing the instantiation process on the processing node class, step 110 further includes:
and judging that the processing node class is legal according to the annotation defined by the class. The method specifically comprises the following steps: judging whether the processing node class is legal or not according to the annotation defined by the class; and if so, instantiating the processing node class.
The instantiation processing is carried out on the processing node class, and the method comprises two modes: first, a normal instantiation process (i.e., normal class initialization as shown in FIG. 4) by the system framework; second, an instantiation process in which dependent processing node class objects are injected into the spring framework is implemented by annotation using processing node classes (i.e., spring class initialization as shown in fig. 4).
The embodiment of the invention adopts the processing node class to realize that the dependent processing node class object is injected into a spring frame through annotation so as to instantiate the processing node class.
The Spring framework was created in response to the complexity of software development; it is a lightweight inversion of control (IoC) and aspect-oriented programming (AOP) container framework.
Spring facilitates loose coupling through a technique called inversion of control (IoC). When IoC is applied, the other processing node classes that a processing node class depends on are passed to it passively, rather than the processing node class creating or looking up the dependent processing node class objects itself. IoC can be regarded as the opposite of JNDI: instead of the processing node class looking up its dependencies from the Spring framework, the Spring framework proactively passes the dependencies to it at initialization, without waiting for the processing node class to request them.
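One common way to realize this kind of annotation-driven discovery and injection with Spring is to meta-annotate the node annotation with @Component so that a component scan finds every node class and injects its dependencies; the sketch below shows that pattern. The annotation name @ProcessorNode, the package, and the SharedCodec dependency are assumptions for illustration, not the patent's code.

    package com.example.streamnodes;           // hypothetical package used for the component scan

    import java.lang.annotation.*;
    import org.springframework.context.annotation.AnnotationConfigApplicationContext;
    import org.springframework.context.annotation.ComponentScan;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.stereotype.Component;

    // Meta-annotating the node annotation with @Component makes every class carrying it
    // eligible for Spring's classpath scanning, so the nodes are instantiated by Spring.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Component
    @interface ProcessorNode {
        String desc() default "";
    }

    @Component
    class SharedCodec { }                       // code shared between stream and non-stream services

    @ProcessorNode(desc = "example node")
    class ExampleNode {
        private final SharedCodec codec;        // dependency injected passively by the Spring container
        ExampleNode(SharedCodec codec) { this.codec = codec; }
    }

    @Configuration
    @ComponentScan(basePackageClasses = ExampleNode.class)
    class NodeScanConfig { }

    class Bootstrap {
        public static void main(String[] args) {
            AnnotationConfigApplicationContext ctx = new AnnotationConfigApplicationContext(NodeScanConfig.class);
            ExampleNode node = ctx.getBean(ExampleNode.class); // instantiated with SharedCodec injected
            System.out.println("node ready: " + node);
            ctx.close();
        }
    }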
Illustratively, the annotation sample @Processor(desc = "private source distribution information processing", readEventName = "direct_message_report", readEventType = QueueType.REDIS, emitEventName = {"origin_report"}, threadNum = 2) includes fields such as the processing node name, the key with which the processing node reads the message middleware, the key list with which the processing node outputs to the message middleware, the message middleware type, the thread number of the current processing node, and switch information such as the close switch and the skip switch.
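A declaration matching the fields in that sample might look roughly like the following sketch; the enum constants and the names of the close/skip switch fields are assumptions, since the patent does not spell them out.

    import java.lang.annotation.*;

    // Sketch of a processing-node annotation carrying the topology parameters described above.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Processor {
        String desc() default "";            // human-readable description (processing node name)
        String readEventName();              // key used to read from the message middleware
        QueueType readEventType();           // type of message middleware to read from
        String[] emitEventName() default {}; // key list used to write to the message middleware
        int threadNum() default 1;           // number of threads for the current processing node
        boolean close() default false;       // close switch: shut the node's processing off entirely
        boolean skip() default false;        // skip switch: forward data without processing it
    }

    enum QueueType { BLOCKING_QUEUE, REDIS, KAFKA, MEMCACHEQ }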
When the processing nodes are built, the embodiment of the invention can use the Spring framework to manage the initialization of and the dependencies among the processing nodes. On one hand this provides a way to share code among processing nodes; on the other hand it solves the problem of sharing code between the stream processing service and non-stream processing services, which greatly facilitates code development.
In an implemented project in which the stream processing service and an HTTP service coexist, the number of code lines was reduced by about 30%. In terms of deployment, the embodiment of the invention realizes data communication between processing nodes through message middleware (mainly various message queues), which reduces the deployment difficulty of the stream processing service and avoids the stream processing service having to rely on a centralized processing node for discovery. The data transmission mode is a pull mode: the upstream node is only responsible for writing data into the message middleware, and the downstream node actively pulls data from the message middleware when it is idle.
In addition, the stream processing service provided by the embodiment of the invention can be deployed together with other types of services, such as HTTP services and RPC services, so that no additional physical server resources are occupied; resources are shared, resource occupation is reduced, and server utilization is improved.
As an embodiment, the topology parameter includes an instruction for controlling the processing node to execute service processing, and before executing step 130, the data stream processing method provided in the embodiment of the present invention may further include:
when any processing node receives the data stream, the processing node executes the operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing.
Specifically, the instruction for controlling the processing node to execute the service processing may include a close instruction and/or a skip instruction, and the processing node executes an operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing, which may be specifically implemented as:
the processing node executes the operation of closing the self-execution service processing according to the closing instruction; and/or the processing node does not execute the operation of service processing according to the skipping instruction and directly outputs data.
Illustratively, as shown in fig. 5, in step 510 it is determined whether the processing node shuts down its own service processing according to the close instruction; if yes, the flow ends; if not, step 520 is executed.
In step 520, the processing node reads data from the message middleware.
In step 530, it is determined whether the processing node skips service processing according to the skip instruction; if yes, step 550 is executed; if not, step 540 is executed.
In step 540, the processing node performs the business processing and writes the processed data into the message middleware as its output.
In step 550, the processing node outputs the input data directly and writes it into the message middleware.
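A simplified sketch of the per-node loop in fig. 5 is given below: honor the close switch, read from the middleware, honor the skip switch, and otherwise process the data and write the result back. The class name, the Function-based business step and the switch fields are assumptions for illustration.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.function.Function;

    // Sketch of the processing-node loop corresponding to steps 510-550 in fig. 5.
    public class ProcessingNodeLoop {
        private final BlockingQueue<String> input = new LinkedBlockingQueue<>();  // stands in for the middleware read side
        private final BlockingQueue<String> output = new LinkedBlockingQueue<>(); // stands in for the middleware write side
        private final Function<String, String> businessLogic;
        private volatile boolean closed;   // close instruction from the topology parameters
        private volatile boolean skipped;  // skip instruction from the topology parameters

        public ProcessingNodeLoop(Function<String, String> businessLogic) {
            this.businessLogic = businessLogic;
        }

        public void run() throws InterruptedException {
            while (true) {
                if (closed) {                   // step 510: close instruction set -> stop processing
                    return;
                }
                String data = input.take();     // step 520: read data from the message middleware
                if (skipped) {
                    output.put(data);           // step 550: skip -> output the input data directly
                } else {
                    output.put(businessLogic.apply(data)); // step 540: process, then write the result
                }
            }
        }
    }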
In the embodiment of the invention, the topology parameters carry instructions for controlling whether the processing nodes execute service processing, so the starting or stopping of data processing at each processing node can be controlled flexibly to meet the requirements of different application scenarios.
As an embodiment, the data stream processing method provided in the embodiment of the present invention may further include:
when a first processing node inputs source data and outputs first intermediate data, the first processing node performs packet processing on the first intermediate data to obtain a first data packet and outputs the first data packet, wherein the first data packet comprises first identification information.
When a second processing node inputs the first data packet and outputs second intermediate data, the second processing node performs packet processing on the second intermediate data and the first identification information to obtain and output a second data packet, wherein the second data packet comprises second identification information.
Illustratively, as shown in fig. 6, assume that the first data packet contains the first identification information (currentProcessRequestID 11), and that the identification information parentProcessRequestID 10 in fig. 6 was generated by the processing node preceding the first processing node. The second processing node parses the first data packet to obtain the first identification information (currentProcessRequestID 11) generated by the previous processing node, and the second processing node packages the second intermediate data together with the first identification information to obtain the second data packet and generates the second identification information (currentProcessRequestID 12).
And acquiring a data flow log generated in the process of converting the source data into the second intermediate data according to the inheritance relationship between the first identification information and the second identification information.
This step can be implemented as follows: as shown in fig. 7, following the above example, by applying the inheritance relationship between the first identification information and the second identification information, and so on, the identification information generated by the first processing node through the tenth processing node can be obtained, namely, in sequence: RequestID1, RequestID2, RequestID3, RequestID4, RequestID5, RequestID6, RequestID7, RequestID8, RequestID9, RequestID10.
By acquiring the inheritance relationships among these pieces of identification information, the data flow log of the process by which the source data is converted into the intermediate data generated by the tenth processing node is finally obtained.
In the embodiment of the invention, when data is output from the previous processing node it is packaged once, and a requestID is recorded in the package. When the data reaches the current processing node, the data packet is unpacked to obtain the requestID of the previous processing node, and the requestID of the current processing node is generated. The flow information of the data can then be obtained from the inheritance relationship between the two requestIDs, so the source data can be traced back from any intermediate data, which makes data lookup convenient.
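For illustration, the request-ID wrapping described above could be as simple as the following sketch, in which each node unwraps the incoming packet, remembers the parent ID and generates its own; the class and field names are assumptions, not the patent's code.

    import java.util.UUID;

    // Sketch of a data packet carrying the lineage IDs described above.
    public class DataPacket {
        public final String parentProcessRequestId;   // requestID generated by the previous processing node
        public final String currentProcessRequestId;  // requestID generated by the current processing node
        public final Object payload;                  // intermediate data produced by the current node

        private DataPacket(String parentProcessRequestId, Object payload) {
            this.parentProcessRequestId = parentProcessRequestId;
            this.currentProcessRequestId = UUID.randomUUID().toString();
            this.payload = payload;
        }

        // Wrap this node's output, inheriting the chain from the packet the node consumed
        // (pass null for a source node). Logging the parent -> current edge lets the data
        // flow log be reconstructed later, so any intermediate datum can be traced to its source.
        public static DataPacket emit(DataPacket consumed, Object intermediateData) {
            String parent = (consumed == null) ? null : consumed.currentProcessRequestId;
            DataPacket next = new DataPacket(parent, intermediateData);
            System.out.println("flow edge: " + parent + " -> " + next.currentProcessRequestId);
            return next;
        }
    }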
The data stream processing method according to the embodiment of the present specification is described in detail above with reference to fig. 1 to 7, and the system according to the embodiment of the present specification is described in detail below with reference to fig. 8.
Fig. 8 shows a schematic structural diagram of a system provided in an embodiment of the present specification. As shown in fig. 8, the system 800 may include a plurality of processing nodes, wherein each processing node comprises an acquisition module, a construction module and a service processing module, wherein:
the obtaining module 810 is configured to obtain a configuration file of the processing node, where the configuration file includes a topology parameter;
the building module 820 is configured to build a topology framework according to the topology parameters of the processing nodes;
and the service processing module 830 is configured to, when the processing node receives the data stream, perform service processing on the data stream according to the topology, and output processing result information.
In one embodiment, the topology parameters include the data to be processed by and the processing result data of the processing node, the key value with which the processing node reads from the message middleware, and the key value with which the processing node writes data into the message middleware;
the building module 820 may include:
a determining unit, configured to determine the association relations between this processing node and the other processing nodes according to the data to be processed and the processing result data of the processing node, so as to generate a topology graph;
and an establishing unit, configured to, according to the topology graph, establish communication between a first processing node that writes data into the message middleware with a key value and a second processing node that reads data from the message middleware with the same key value, so as to build the topology architecture.
In one embodiment, the obtaining module 810 may include:
a class scanner, configured to scan the package providing the processing node classes to obtain the class definitions, where annotations are attached to the processing node classes;
a parsing unit, configured to parse the annotation on the class definition to obtain the processing node class and the processing node class object;
and a processing unit, specifically configured to have the processing node class use annotations so that the dependent processing node class objects are injected through the Spring framework.
In one embodiment, the obtaining module 810 may include:
and a judging unit, configured to judge that the processing node class is legal according to the annotation on the class definition.
In one embodiment, the topology parameters include instructions that control the processing nodes to perform traffic processing, and the system 800 may include:
an executing module 840, configured to, when any processing node receives a data stream, execute, by the processing node, an operation corresponding to an instruction for controlling the processing node to execute service processing.
In one embodiment, the instructions controlling the processing node to perform the traffic processing comprise a close instruction and/or a skip instruction; the execution module 840 may include:
an execution unit, configured to have the processing node shut down its own service processing according to the close instruction;
and/or have the processing node skip service processing according to the skip instruction and output the data directly.
In one embodiment, the system 800 may include:
a first packet processing module 850, configured to, when a first processing node inputs source data and outputs first intermediate data, perform packet processing on the first intermediate data by the first processing node to obtain a first data packet and output the first data packet, where the first data packet includes first identification information;
a second packet processing module 860, configured to, when a second processing node inputs the first data packet and outputs second intermediate data, perform packet processing on the second intermediate data and the first identification information by the second processing node to obtain a second data packet and output the second data packet, where the second data packet includes second identification information;
the log obtaining module 870 is configured to obtain a data flow log generated in a process of converting the source data into the second intermediate data according to an inheritance relationship between the first identification information and the second identification information.
In the embodiment of the invention, each processing node spontaneously builds the topology framework from its own topology parameters; every processing node is independent, concerns itself only with its own topology parameters and not with the behavior of other processing nodes, and there is no centralized processing node in the topology framework, so the deployed framework is relatively lightweight and convenient to scale out or in.
A data stream processing system according to an embodiment of the present invention will be described in detail below with reference to fig. 9. Referring to fig. 9, at the hardware level the data stream processing system includes a processor and, optionally, an internal bus, a network interface, and memory. As shown in fig. 9, the memory may include volatile memory, such as random-access memory (RAM), and may also include non-volatile memory, such as at least one disk memory. Of course, the data stream processing system may also include the hardware needed to implement other services.
The processor, the network interface, and the memory may be interconnected by an internal bus, which may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 9, but this does not indicate only one bus or one type of bus.
And the memory is used for storing programs. In particular, the program may include program code comprising computer operating instructions. The memory may include both memory and non-volatile storage and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it, forming the data stream processing system at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the operations of the method embodiments described herein.
The method disclosed in the embodiments of figs. 1 to 8 and executed by the data stream processing system may be applied to or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The data stream processing system shown in fig. 8 may also execute the method shown in fig. 1 to 7, so as to implement the functions of the data stream processing method in the embodiments shown in fig. 1 to 7, which are not described herein again.
Of course, besides software implementation, the data stream processing system of the present application does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the processes of the method embodiments, and can achieve the same technical effects, and in order to avoid repetition, the details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for processing a data stream, comprising:
each processing node acquires a configuration file of the processing node, wherein the configuration file comprises topology parameters;
each processing node builds a topology framework according to the topology parameters of the processing node;
when any processing node receives the data stream, the processing node processes the data stream according to the topology architecture and outputs processing result information.
2. The method of claim 1, wherein the topology parameters comprise the data to be processed by and the processing result data of the processing node, the key value with which the processing node reads from the message middleware, and the key value with which the processing node writes data into the message middleware;
each processing node builds a topology framework according to its own topology parameters, which comprises the following steps:
each processing node determines the association relations among the processing nodes according to the data to be processed and the processing result data, so as to generate a topology graph;
according to the topology graph, among the processing nodes, a first processing node that writes data into the message middleware with a key value establishes communication with a second processing node that reads data from the message middleware with the same key value, thereby building the topology architecture.
3. The method of claim 1, wherein each processing node obtains its own configuration file, comprising:
scanning the package providing the processing node classes to obtain the class definitions, where annotations are attached to the processing node classes;
parsing the annotation on the class definition to obtain the processing node class and the processing node class object;
instantiating the processing node class, which specifically includes: having the processing node class use annotations so that the dependent processing node class objects are injected through the Spring framework.
4. The method of claim 3, wherein before instantiating the processing node class, the method further comprises: judging that the processing node class is legal according to the annotation on the class definition.
5. The method according to claim 1, wherein the topology parameters include instructions for controlling the processing nodes to perform traffic processing, and before any processing node performs traffic processing on the data stream according to the topology architecture, the method includes:
when any processing node receives the data stream, the processing node executes the operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing.
6. The method according to claim 5, characterized in that the instructions controlling the processing nodes to perform the traffic processing comprise a close instruction and/or a skip instruction;
the processing node executes the operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing, and the operation comprises the following steps:
the processing node shuts down its own service processing according to the close instruction;
and/or the processing node skips service processing according to the skip instruction and outputs the data directly.
7. The method according to claim 1, characterized in that it comprises:
when a first processing node inputs source data and outputs first intermediate data, the first processing node performs packet processing on the first intermediate data to obtain a first data packet and outputs the first data packet, wherein the first data packet comprises first identification information;
when a second processing node inputs the first data packet and outputs second intermediate data, the second processing node performs packet processing on the second intermediate data and the first identification information to obtain and output a second data packet, wherein the second data packet comprises second identification information;
and acquiring a data flow log generated in the process of converting the source data into the second intermediate data according to the inheritance relationship between the first identification information and the second identification information.
8. A data flow processing system is characterized by comprising a plurality of processing nodes, wherein each processing node comprises an acquisition module, a construction module and a service processing module, wherein:
the acquisition module is used for acquiring a configuration file of the processing node, wherein the configuration file comprises topology parameters;
the building module is used for building a topological framework according to the topological parameters of the processing nodes;
and the service processing module is used for performing service processing on the data stream according to the topology framework and outputting processing result information when the processing node receives a data stream.
9. A data stream processing system, comprising:
a memory storing computer program instructions;
a processor, wherein the computer program instructions, when executed by the processor, implement the data stream processing method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that,
the computer-readable storage medium comprises instructions which, when executed on a computer, cause the computer to carry out the data stream processing method of any one of claims 1 to 7 when executed.
CN202010307212.XA 2020-04-17 2020-04-17 Data stream processing method and system Active CN111597058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010307212.XA CN111597058B (en) 2020-04-17 2020-04-17 Data stream processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010307212.XA CN111597058B (en) 2020-04-17 2020-04-17 Data stream processing method and system

Publications (2)

Publication Number Publication Date
CN111597058A true CN111597058A (en) 2020-08-28
CN111597058B CN111597058B (en) 2023-10-17

Family

ID=72181470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010307212.XA Active CN111597058B (en) 2020-04-17 2020-04-17 Data stream processing method and system

Country Status (1)

Country Link
CN (1) CN111597058B (en)

Patent Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130159477A1 (en) * 2010-07-05 2013-06-20 Saab Ab Method for configuring a distributed avionics control system
CN102065136A (en) * 2010-12-10 2011-05-18 中国科学院软件研究所 P2P (Peer-to-Peer) network safety data transmission method and system
CN103368770A (en) * 2013-06-18 2013-10-23 华中师范大学 Gateway level topology-based self-adaptive ALM overlay network constructing and maintaining method
CN103491129A (en) * 2013-07-05 2014-01-01 华为技术有限公司 Service node configuration method and service node pool logger and system
CN103560943A (en) * 2013-10-31 2014-02-05 北京邮电大学 Network analytic system and method supporting real-time mass data processing
CN104038364A (en) * 2013-12-31 2014-09-10 华为技术有限公司 Distributed flow processing system fault tolerance method, nodes and system
CN105574082A (en) * 2015-12-08 2016-05-11 曙光信息产业(北京)有限公司 Storm based stream processing method and system
US20190138524A1 (en) * 2016-04-25 2019-05-09 Convida Wireless, Llc Data stream analytics at service layer
WO2018072708A1 (en) * 2016-10-21 2018-04-26 中兴通讯股份有限公司 Cloud platform service capacity reduction method, apparatus, and cloud platform
CN108268305A (en) * 2017-01-04 2018-07-10 ***通信集团四川有限公司 For the system and method for virtual machine scalable appearance automatically
US20190012466A1 (en) * 2017-07-10 2019-01-10 Burstiq Analytics Corporation Secure adaptive data storage platform
CN107678852A (en) * 2017-10-26 2018-02-09 携程旅游网络技术(上海)有限公司 Method, system, equipment and the storage medium calculated in real time based on flow data
CN108594810A (en) * 2018-04-08 2018-09-28 百度在线网络技术(北京)有限公司 Method, apparatus, storage medium, terminal device and the automatic driving vehicle of data processing
CN108881369A (en) * 2018-04-24 2018-11-23 中国科学院信息工程研究所 A kind of method for interchanging data and cloud message-oriented middleware system of the cloud message-oriented middleware based on data-oriented content
CN108595699A (en) * 2018-05-09 2018-09-28 国电南瑞科技股份有限公司 The Stream Processing method of wide-area distribution type data in electric power scheduling automatization system
CN108900320A (en) * 2018-06-04 2018-11-27 佛山科学技术学院 A kind of internet test envelope topological structure large scale shrinkage in size method and device
US20210289015A1 (en) * 2018-07-10 2021-09-16 Nokia Technologies Oy Dynamic multiple endpoint generation
CN109104318A (en) * 2018-08-23 2018-12-28 广东轩辕网络科技股份有限公司 The dispositions method and system of method for realizing cluster self-adaption deployment, the self-adaption deployment big data cluster based on cloud platform
CN109194919A (en) * 2018-09-19 2019-01-11 图普科技(广州)有限公司 A kind of camera data flow distribution system, method and its computer storage medium
CN109992561A (en) * 2019-02-14 2019-07-09 石化盈科信息技术有限责任公司 Industrial real-time computing technique, storage medium and calculating equipment
CN110113399A (en) * 2019-04-24 2019-08-09 华为技术有限公司 Load balancing management method and relevant apparatus
CN110245108A (en) * 2019-07-15 2019-09-17 北京一流科技有限公司 It executes body creation system and executes body creation method
CN110716744A (en) * 2019-10-21 2020-01-21 中国科学院空间应用工程与技术中心 Data stream processing method, system and computer readable storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
尚雷明;游红;赵小敏;: "拓扑自适应的移动自组网P2P中间件***", 计算机与信息技术, no. 06 *
张文彬;王春梅;王静;陈托;智佳;: "基于Spark的有效载荷参数解析处理方法", 计算机工程与设计, no. 02, pages 587 - 591 *
王斌;马颖;: "基于软件定义网络的自适应数据流处理模型", 计算机工程与设计, no. 12, pages 3601 - 3604 *
蒋晨晨;季一木;孙雁飞;王汝传;: "基于Storm的面向大数据实时流查询***设计研究", 南京邮电大学学报(自然科学版), no. 03, pages 100 - 105 *
陈松林;秦燕;: "自私节点下无线多跳网的节能博弈拓扑研究", 计算机科学, no. 12, pages 182 - 185 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326292A (en) * 2021-06-25 2021-08-31 深圳前海微众银行股份有限公司 Data stream merging method, device, equipment and computer storage medium
CN113326292B (en) * 2021-06-25 2024-06-07 深圳前海微众银行股份有限公司 Data stream merging method, device, equipment and computer storage medium
CN114116065A (en) * 2021-11-29 2022-03-01 中电金信软件有限公司 Method and device for acquiring topological graph data object and electronic equipment
CN114281297A (en) * 2021-12-09 2022-04-05 上海深聪半导体有限责任公司 Transmission management method, device, equipment and storage medium for multi-audio stream

Also Published As

Publication number Publication date
CN111597058B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
US10257033B2 (en) Virtualized network functions and service chaining in serverless computing infrastructure
CN110704037B (en) Rule engine implementation method and device
Black et al. Infopipes: An abstraction for multimedia streaming
Jussila et al. Model checking dynamic and hierarchical UML state machines
Lohmann et al. Wendy: A tool to synthesize partners for services
US20170048008A1 (en) Method and apparatus for verification of network service in network function virtualization environment
CN111597058A (en) Data stream processing method and system
WO2023151436A1 (en) Sql statement risk detection
CN111741120A (en) Traffic mirroring method, device and equipment
CN110457132B (en) Method and device for creating functional object and terminal equipment
CN106681781B (en) Method and system for realizing real-time computing service
US11347630B1 (en) Method and system for an automated testing framework in design pattern and validating messages
CN113094026B (en) Code processing method and device
Arts et al. System description: Verification of distributed Erlang programs
CN113419952A (en) Cloud service management scene testing device and method
US10581994B2 (en) Facilitating communication between an origin machine and a target machine
Subramonian et al. Reusable models for timing and liveness analysis of middleware for distributed real-time and embedded systems
US20070240164A1 (en) Command line pipelining
CN116668542B (en) Service execution method based on heterogeneous resource binding under enhanced service architecture
US9385935B2 (en) Transparent message modification for diagnostics or testing
CN114168347A (en) Information processing method, information processing apparatus, server, and storage medium
US20150089471A1 (en) Input filters and filter-driven input processing
Hammer et al. PSCS4CPP: A Generative PSCS Implementation for C++
CN116775033A (en) Code deployment and processing method and device
Davis et al. Spring Integration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant