CN108549714B - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN108549714B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810361696.9A
Other languages
Chinese (zh)
Other versions
CN108549714A (en)
Inventor
Xu Jian (许建)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Chengying Data Technology Co ltd
Original Assignee
Hangzhou Chengying Data Technology Co ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data processing method and a data processing device, wherein the method comprises the following steps: monitoring a WEB page data submission event, and acquiring operation information and data information related to the WEB page data submission event in real time according to the monitored WEB page data submission event; and synchronously executing the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area, wherein the data stored in the real-time data storage area is used for providing query and service of real-time data, and the data stored in the historical data storage area is used for providing query and service of historical data.

Description

Data processing method and device
Technical Field
The present application relates to the field of data storage technologies, and in particular, to a data processing method and apparatus.
Background
In recent years, electronic information data has played an increasingly important role in business operations, and in practice it needs to be analyzed efficiently, promptly, and accurately. Conventional data warehouses employ an Extract-Transform-Load (ETL) tool to periodically extract data from a data source and process it for loading into the data warehouse. The extraction period in this traditional mode is usually once a month, once a week, or once a day, so only queries and services based on historical data can be supported, and changes to data in the data source cannot be captured in real time. Real-time data warehouses were proposed for this reason, but in existing real-time data warehouse storage methods, importing real-time data and querying real-time data at the same time causes query contention. The resulting conflicts seriously affect the precision and efficiency of On-Line Transaction Processing (OLTP) and On-Line Analytical Processing (OLAP), and reduce the performance of the real-time data warehouse.
The real-time data warehouse provided by the prior art in fact remains at the level of traditional ETL data loading: data is extracted from different business systems in a passive or pseudo-passive manner. That is, a real-time data warehouse based on dynamic mirroring only makes data acquisition approximately real-time. Such a solution optimizes only the data warehouse itself. The ETL behavior still has to be triggered and the data is still extracted from the business database, so true real-time data acquisition is difficult to achieve. Moreover, because different business systems use different databases, the data synchronization process during extraction is extremely complicated, which makes the system harder to keep stable and demands high technical ability from implementers and developers.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present application provide a data processing method, an apparatus, a computing device, and a computer-readable storage medium, so as to achieve real-time data acquisition.
An aspect of the present embodiment provides a data processing method, where the method includes:
monitoring a WEB page data submission event, and acquiring operation information and data information related to the WEB page data submission event in real time according to the monitored WEB page data submission event;
and synchronously executing the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area, wherein the data stored in the real-time data storage area is used for providing query and service of real-time data, and the data stored in the historical data storage area is used for providing query and service of historical data.
Optionally, the acquiring, in real time, data information and operation information associated with the WEB page data submission event includes:
acquiring configuration information and URL strings of a WEB page monitored to have a data submission event;
analyzing the configuration information to obtain operation information;
and analyzing the URL string to acquire data information.
Optionally, the WEB page data submission event includes:
a WEB page data submission event of the button-click type; or
a WEB page data submission event of the link-click type.
Optionally, after obtaining the operation information and the data information associated with the WEB page data submission event, the method further includes:
determining a preset dependence rule according to the operation information;
and processing the data information in real time according to the preset dependence rule.
Optionally, the real-time processing comprises:
and cleaning the data information according to a preset cleaning rule.
Optionally, the real-time processing further includes:
and filtering the data information according to a pre-established data model.
Optionally, the method further comprises:
and loading the original data into the historical data storage area according to a preset rule.
Optionally, the method further comprises:
receiving a query for real-time data and/or a request for service;
and providing a result of query and/or service of the real-time data according to the data stored in the real-time data storage area.
Optionally, the method further comprises:
receiving a query for historical data and/or a request for service;
and providing the result of the inquiry and/or service of the historical data according to the data stored in the historical data storage area.
Another aspect of the embodiments of the present application further provides a data processing apparatus, where the apparatus includes: the system comprises a WEB page data configuration and analysis module and a data processing module, wherein the WEB page data configuration and analysis module is configured to monitor a WEB page data submission event and acquire operation information and data information associated with the WEB page data submission event in real time according to the monitored WEB page data submission event; the data processing module is configured to synchronously execute the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area, wherein the data stored in the real-time data storage area is used for providing query and service of real-time data, and the data stored in the historical data storage area is used for providing query and service of historical data.
Optionally, the apparatus further includes a data operation module configured to, after the operation information and data information associated with the WEB page data submission event are obtained, determine a preset dependency rule according to the operation information and process the data information in real time according to the preset dependency rule.
Optionally, the apparatus further comprises a data initialization module configured to load raw data into the historical data storage area according to a predetermined rule.
Optionally, the apparatus further comprises an interface module configured to receive a request for a query and/or service of real-time data and to provide results of the query and/or service of real-time data in accordance with data stored in the real-time data storage area.
Optionally, the interface module is further configured to receive a request for a query and/or service of historical data and provide results of the query and/or service of historical data according to the data saved in the historical data storage area.
In another aspect, the present invention further provides a computing device, which includes a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor executes the instructions to implement the data processing method.
In another aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which computer instructions are stored, and the computer instructions, when executed by a processor, implement the data processing method.
The data processing method and the data processing device can achieve real-time data acquisition, omit complex mirror image design in the prior art, and improve the stability of a database system. The method and the device provided by the application are more friendly to the support of historical data query, and the workload and the difficulty of ETL design are reduced.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating a data processing method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a data processing method according to another embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of a data processing method according to another embodiment of the present application;
FIG. 4 is a block diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a data processing apparatus according to another embodiment of the present application;
FIG. 6 is a block diagram of a computing device according to an embodiment of the present application.
Detailed Description
While the present application is susceptible to embodiments and details, it should be understood that the present application is not limited to the details of the particular embodiments disclosed, but is capable of many modifications and variations, as will be apparent to those of ordinary skill in the art, without departing from the spirit of the application.
In the present application, the terms "first", "second", "third", "fourth", and the like are used only for distinguishing one from another, and do not indicate importance, order, existence of one another, and the like.
The need for real-time processing of data can be divided into two main categories: operational processing and analytical processing. Online Transaction Processing (OLTP) is typical operational processing, and Online Analytical Processing (OLAP) is typical analytical processing. A conventional database is mainly used to implement OLTP, with the emphasis on data calculation, record insertion, deletion, and modification, and simple query and statistics. The main task of OLTP is to execute transactions, and its main concerns are the timeliness, integrity, and correctness of transactions, but it has serious deficiencies in analyzing and processing data. The lack of integration and of clear subjects in ordinary business databases shows mainly in the following aspects. First, the siloed division of business database systems leads to disordered and scattered data distribution that lacks uniform definition and planning, and data definitions may be ambiguous across different business databases. Second, the databases and tables defined in a business database lack clear subjects and cannot meet the needs of data analysis. In addition, a large amount of data is scattered across different tables, different databases, and different database servers, so the efficiency of data analysis and processing cannot be guaranteed. A traditional database is therefore limited by its own conditions and cannot serve as a large-scale platform for comprehensive data analysis. A new supporting theory and technology is urgently needed, namely data warehouse technology.
ETL (Extract-Transform-Load), a data warehouse technology, is the process of extracting, cleaning, and converting data from the original business system and then loading it into a data warehouse, and it is an important link in building a data warehouse. Its purpose is to integrate the scattered, disordered, and non-uniform data in business databases and provide a basis for analysis and decision making. ETL mainly uses the processing capability of a conversion server: data is extracted from the business database, cleaned and converted in the conversion server, and then loaded into the target database. ETL design is generally divided into three parts: data extraction, data cleaning and conversion, and data loading.
Data cleansing generally refers to condensing data by removing duplicate records and converting the remainder into a standard, acceptable format. The standard model of data cleansing is to input data to a data cleansing processor, "cleanse" the data through a series of steps, and then output the cleansed data in the desired format. Data cleansing handles problems such as missing values, out-of-bounds values, inconsistent codes, and duplicate data with respect to the accuracy, completeness, consistency, uniqueness, timeliness, and validity of the data.
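As a minimal sketch of the cleansing model described above, the following Java example takes raw values in, passes them through a few steps (skip missing values, convert to a standard format, remove duplicates), and returns the cleansed values. The class name Cleanser and the choice of "trimmed lower case" as the standard format are assumptions made only for illustration; the application does not prescribe them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical cleansing processor: raw values in, cleansed values out. */
final class Cleanser {

    /** Skips missing values, normalizes the rest, and removes duplicates in input order. */
    static List<String> cleanse(List<String> raw) {
        Set<String> seen = new LinkedHashSet<>();
        for (String value : raw) {
            if (value == null) {
                continue; // missing value
            }
            // Assumed standard format: trimmed, lower case.
            String normalized = value.trim().toLowerCase(Locale.ROOT);
            if (!normalized.isEmpty()) {
                seen.add(normalized); // a Set keeps only the first copy of a duplicate
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Arrays.asList is used because it accepts null elements.
        System.out.println(cleanse(Arrays.asList(" Ming", "ming", null, "", "Zi ")));
        // prints [ming, zi]
    }
}
```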
An embodiment of the present application provides a data processing method, as shown in fig. 1, the method includes:
step 101: monitoring a WEB page data submission event, and acquiring operation information and data information related to the WEB page data submission event in real time according to the monitored WEB page data submission event;
step 102: and synchronously executing the operation indicated by the operation information on the data indicated by the data information in the real-time data storage area and the historical data storage area.
The data stored in the real-time data storage area is used for providing query and service of real-time data, and the data stored in the historical data storage area is used for providing query and service of historical data.
The data stored in the real-time data storage area has a certain life cycle and is used for providing query and service of the real-time data. The length of the life cycle can be adjusted according to requirements. Since a large amount of real-time data is generated every day, in order to prevent the feedback speed of the real-time data query and service from being affected by an excessive amount of data in the real-time data storage area, the life cycle of the real-time data storage area is preferably less than or equal to 24 hours.
The historical data storage area comprises all newly added data in the real-time data storage area besides the historical data and is used for providing inquiry and service aiming at the historical data, so that the newly added data in the real-time data storage area does not need to be imported into the historical data storage area when the inquiry and service aiming at the historical data are carried out, namely the historical data storage area and the real-time data storage area are completely decoupled.
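The relationship between the two storage areas can be sketched in Java as follows. This is an in-memory illustration under stated assumptions, not the storage design of the application: the class DualAreaStore, its method names, and the use of hash maps are all hypothetical. It shows each write applied to both areas in one step, and a life cycle that expires entries from the real-time area only.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch: a real-time area with a life cycle plus a historical area. */
final class DualAreaStore {
    private final Map<String, String> history = new HashMap<>();
    private final Map<String, String> realTime = new HashMap<>();
    private final Map<String, Long> writtenAt = new HashMap<>(); // write time of each real-time key
    private final long lifeCycleMillis;

    DualAreaStore(long lifeCycleMillis) {
        this.lifeCycleMillis = lifeCycleMillis;
    }

    /** Applies an insert or update to both areas in one synchronized step. */
    synchronized void upsert(String key, String value, long nowMillis) {
        realTime.put(key, value);
        writtenAt.put(key, nowMillis);
        history.put(key, value);
    }

    /** Applies a delete to both areas in one synchronized step. */
    synchronized void delete(String key) {
        realTime.remove(key);
        writtenAt.remove(key);
        history.remove(key);
    }

    /** Drops real-time entries older than the life cycle; the historical area is untouched. */
    synchronized void expire(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = writtenAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > lifeCycleMillis) {
                realTime.remove(entry.getKey());
                it.remove();
            }
        }
    }

    synchronized String queryRealTime(String key) { return realTime.get(key); }

    synchronized String queryHistory(String key) { return history.get(key); }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000; // a 24-hour life cycle
        DualAreaStore store = new DualAreaStore(day);
        store.upsert("ming", "1", 0L);
        store.expire(day + 1);
        System.out.println(store.queryRealTime("ming")); // null: expired from the real-time area
        System.out.println(store.queryHistory("ming"));  // 1: still in the historical area
    }
}
```

Because every write already reaches the historical area, expiring an entry from the real-time area never requires copying it anywhere, which is the decoupling the paragraph above describes.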
Optionally, the WEB page data submission event includes:
a WEB page data submission event of the button-click type; or
a WEB page data submission event of the link-click type.
Specifically, if a button-click data submission event is detected on a WEB page, the data associated with the WEB page data submission event is acquired from the form in real time. Under the HTTP/1.1 protocol, POST data is mainly submitted in the following two modes:
first, application/x-www-form-urlencoded, the most common POST data submission mode: when data is submitted through a native <form> of the browser and the enctype attribute is not set, the data is ultimately submitted in the application/x-www-form-urlencoded mode;
second, multipart/form-data, which is also a common POST data submission mode: when a form is used to upload a file, the enctype of the form must be set to multipart/form-data. This mode mainly submits data by file upload, for example, by uploading an Excel sheet.
If a link-click data submission event is detected on a WEB page, the data associated with the WEB page data submission event is acquired in real time from the parameters attached to the link.
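To make the first submission mode concrete, the following Java sketch builds the body that a form with no enctype attribute would send: field names and values are percent-encoded and joined as name=value pairs separated by "&". The class FormBodyEncoder and the field names are hypothetical; URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that serializes form fields in the urlencoded body format. */
final class FormBodyEncoder {

    /** Encodes each field as name=value and joins the pairs with '&'. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("name", "ming");
        fields.put("note", "a b&c");
        System.out.println(encode(fields)); // prints name=ming&note=a+b%26c
    }
}
```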
In step 102, the operations executed synchronously on the data indicated by the data information in the real-time data storage area and the historical data storage area include insertion, deletion, updating of data, and the like. For example, the following Java statements insert and update data in key-value form:
import java.util.HashMap;
import java.util.Map;
Map<String, Integer> map = new HashMap<>(); // define a Map object (Java generics require Integer, not int)
map.put("ming", 1); // set one entry: key "ming", value 1
map.put("zi", 2); // set another entry: key "zi", value 2
map.get("ming"); // get the value 1 for the key "ming"
map.put("ming", 3); // the key "ming" already exists, so the original value 1 is overwritten by the new value 3
The method provided by the embodiment realizes the real-time acquisition of the associated data by actively monitoring the WEB page data submission event, abandons the batch processing mode adopted by the traditional ETL, and truly meets the real-time requirement of the data. The real-time data acquired by the method is synchronously stored in a real-time data storage area and a historical data storage area of the data warehouse.
Another embodiment of the present application provides a data processing method, as shown in fig. 2, the method includes:
step 201: monitoring a WEB page data submission event of a click button type, and acquiring configuration information and a URL string of a WEB page from a form in real time according to the monitored WEB page data submission event of the click button type;
step 202: analyzing the configuration information to obtain operation information;
step 203: analyzing the URL string to acquire data information;
step 204: and synchronously executing the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area according to the operation information and the data information.
The process of the WEB page data submission event of the click button type in step 201 is similar to the process of submitting a WEB form by clicking a button. The operation information is equivalent to various kinds of dimension information of the operation of the form, and the data information is equivalent to the content of the form.
The data information includes field information, field data information, and the like, which together constitute complete data. The data is transmitted as data streams between the processing modules or over a network. A Uniform Resource Locator (URL) is the standard address of a resource on the internet; it is used to fully describe web pages and other resources on the internet and can also identify local resources. Each web page or resource on the internet can be uniquely identified by a URL. A URL is a string of characters in the format: protocol://[username:password@]host[:port][/path][?query][#fragment]. The protocol field specifies the transport protocol, such as HTTP or FTP; the host field specifies the host name or IP address of the server storing the resource; the username and password fields specify the user name and password required to connect to the server; the port field specifies the port number used by the transport protocol; the path field specifies the address of a directory or file on the host; the query field specifies the parameters passed to a dynamic web page; and the fragment field specifies a fragment within the network resource. Fields enclosed in brackets [ ] are optional. When a client program uses a URL to request access to an information resource on an internet server, it must specify the protocol used for the request, the server being requested, and the identifier and storage path of the requested resource; all of this is provided by the URL. By parsing the URL string submitted by the WEB page, the required data information, such as field information and field data information, can be extracted.
The operation information is the various dimension information about the operation on the data. Information related to the data operation, such as the operation type, the operator, and the operation time, can be parsed from the configuration information of the WEB page. The operation types include insert, delete, update, and the like. If no operation information is acquired from the WEB page, the operation defaults to insert. The configuration information may further include the name and type of the database to be configured and monitored, the table name, the required fields, and the corresponding URL page. Optionally, the configuration information further includes the downstream users and transmission modes of the data stored in the real-time data storage area and the historical data storage area.
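One way to extract field information and field data from a submitted URL string is sketched below in Java, using the standard java.net.URI and java.net.URLDecoder classes. The class UrlDataParser, the example host, and the field names are hypothetical, and the sketch assumes the data is carried in the query part of the URL; URLDecoder.decode(String, Charset) requires Java 10 or later.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: extracts field names and field data from a URL string. */
final class UrlDataParser {

    /** Splits a raw query such as "name=ming&age=3" into decoded field/value pairs. */
    static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return fields; // the query part is optional in a URL
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return fields;
    }

    /** Parses the whole URL string and returns the fields carried in its query part. */
    static Map<String, String> parseUrl(String url) {
        return parseQuery(URI.create(url).getRawQuery());
    }

    public static void main(String[] args) {
        String url = "http://example.com:8080/order/submit?name=ming&city=Hang%20Zhou#top";
        System.out.println(parseUrl(url)); // prints {name=ming, city=Hang Zhou}
    }
}
```

The raw query is split before decoding so that an encoded "&" or "=" inside a value is not mistaken for a separator.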
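The parsing of operation information, including the default to insert, might look like the following Java sketch. The configuration is modeled as a plain key-value map, and the keys "operationType", "operator", and "table" are hypothetical names chosen for illustration; the application does not specify a configuration format.

```java
import java.util.Locale;
import java.util.Map;

/** Operation types named in the text. */
enum OperationType { INSERT, DELETE, UPDATE }

/** Hypothetical holder for operation information parsed from page configuration. */
final class OperationInfo {
    final OperationType type;
    final String operator;
    final String table;

    private OperationInfo(OperationType type, String operator, String table) {
        this.type = type;
        this.operator = operator;
        this.table = table;
    }

    /** Reads the operation information; a missing operation type defaults to INSERT. */
    static OperationInfo fromConfig(Map<String, String> config) {
        OperationType type = OperationType.INSERT; // default when nothing is configured
        String raw = config.get("operationType");
        if (raw != null && !raw.trim().isEmpty()) {
            // valueOf throws IllegalArgumentException for an unknown type name.
            type = OperationType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        }
        return new OperationInfo(type, config.getOrDefault("operator", "unknown"), config.get("table"));
    }

    public static void main(String[] args) {
        OperationInfo info = fromConfig(Map.of("table", "orders"));
        System.out.println(info.type + " on " + info.table); // prints INSERT on orders
    }
}
```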
By adopting the data processing method provided by the embodiment, the real-time synchronization of the data in the data warehouse can be realized, the updating of the data does not depend on the service database any more, and the complete decoupling with the service database is realized. The method has the advantages that the defects of the traditional ETL are eliminated, the complex mirror image structure of the dynamic storage area is omitted, the stability of the data warehouse is guaranteed, and meanwhile, the real-time query and the historical query of the data are realized.
Another embodiment of the present application provides a data processing method, as shown in fig. 3, the method includes:
step 301: monitoring a WEB page data submission event of a click link type, and acquiring operation information and data information related to the WEB page data submission event from parameters attached to a link in real time according to the monitored WEB page data submission event of the click link type;
step 302: determining a dependency rule according to the operation information;
step 303: processing the data information in real time according to the dependency rule;
step 304: and synchronously executing corresponding data operation in the real-time data storage area and the historical data storage area according to the operation information and the data information processed in real time.
In the above embodiment, new data may be obtained incrementally, event by event: each recorded link click corresponds to a new data submission event, and the events are not necessarily related to one another.
In this embodiment of the application, the obtaining, in step 301, operation information and data information associated with the WEB page data submission event from parameters attached to a link in real time according to the monitored WEB page data submission event of the click link type may include:
acquiring the configuration information and URL string of the WEB page on which the link-click data submission event was detected; obtaining the operation information by parsing the configuration information; and obtaining the data information by parsing the URL string.
The dependency rules determine which real-time processing modules the data stream flows through, in what order, and which real-time processing steps are executed. For example, data validity detection is performed first to delete invalid data; the data is then type-converted according to a preset rule (for example, data in specific formats such as dates and currency is converted into text in a preset format) to keep the data consistent; finally, the data information is filtered so that only the data corresponding to fields related to the tables of the data warehouse is extracted. The dependency rule is determined from the operation information parsed from the configuration information, and it in turn determines which real-time processing, for example data cleaning rules such as data encryption and data interception, is applied to the newly acquired data information. After steps such as cleaning and conversion, the data is synchronized into the corresponding storage area of the data warehouse.
By analyzing the configuration information of the WEB page detected to have the data submission event, the operation information associated with the submission event can be determined, and further, which mode of processing is to be performed on the data is determined, for example: distribution, dependency, etc. processing modes.
If the processing mode of the data can be determined to be real-time distribution according to the operation information, the data can flow to different modules of the data warehouse in the form of data streams to be operated simultaneously.
For example, if the operation information indicates that the processing mode of the data is dependency, the corresponding dependency rule is obtained, and the dependency rule indicates which real-time processing should be applied to the data information, such as encryption or data interception. The operation information also determines whether result data or an execution result (success or failure) is returned.
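A dependency rule can be pictured as an ordered list of processing steps selected by the operation information. The Java sketch below is one possible reading under stated assumptions: the class DependencyRules, the use of an operation name as the lookup key, and the two example steps are all hypothetical. Each step receives the data produced by the previous step, so the order in the list is the order of execution.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical registry mapping an operation to its ordered real-time processing steps. */
final class DependencyRules {
    private final Map<String, List<UnaryOperator<Map<String, String>>>> rules = new HashMap<>();

    /** Registers the steps for an operation; list order is execution order. */
    void register(String operation, List<UnaryOperator<Map<String, String>>> steps) {
        rules.put(operation, new ArrayList<>(steps));
    }

    /** Runs the data through every step registered for the operation, in order. */
    Map<String, String> process(String operation, Map<String, String> data) {
        Map<String, String> current = new HashMap<>(data);
        for (UnaryOperator<Map<String, String>> step : rules.getOrDefault(operation, List.of())) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Example step 1 (assumed validity rule): drop fields with empty values.
        UnaryOperator<Map<String, String>> dropEmpty = m -> {
            Map<String, String> out = new HashMap<>(m);
            out.values().removeIf(String::isEmpty);
            return out;
        };
        // Example step 2 (assumed interception rule): keep the first 5 characters of "note".
        UnaryOperator<Map<String, String>> truncateNote = m -> {
            Map<String, String> out = new HashMap<>(m);
            out.computeIfPresent("note", (k, v) -> v.length() > 5 ? v.substring(0, 5) : v);
            return out;
        };
        DependencyRules rules = new DependencyRules();
        rules.register("insert", List.of(dropEmpty, truncateNote));
        System.out.println(rules.process("insert",
                Map.of("name", "ming", "note", "hello world", "fax", "")));
    }
}
```

An operation with no registered rule passes the data through unchanged, which is a design choice of this sketch rather than something the text specifies.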
Optionally, the real-time processing in step 303 includes:
cleaning the data information according to a preset cleaning rule; and/or
filtering the data information according to a pre-established data model.
The preset cleaning rules include rules for data interception, data decryption/encryption, data association, data validity detection (such as mobile phone number detection), data type conversion, and the like.
The step of filtering the data according to the pre-established data model refers to the step of filtering the full-field data acquired from the WEB page according to the table structure in the data warehouse, and only extracting the fields related to the table of the data warehouse.
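The cleaning and filtering steps can be illustrated with the following Java sketch. The class CleanAndFilter is hypothetical, and so are its rules: the mobile number check assumes an 11-digit number starting with 1, the date conversion assumes input such as 2018/4/20, and the table structure is reduced to a set of column names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical cleaning and filtering helpers for submitted field data. */
final class CleanAndFilter {
    // Assumed validity rule: 11 digits, starting with 1.
    private static final Pattern MOBILE = Pattern.compile("1\\d{10}");
    private static final DateTimeFormatter INPUT_DATE = DateTimeFormatter.ofPattern("uuuu/M/d");

    /** Validity detection for a mobile phone number. */
    static boolean isValidMobile(String value) {
        return value != null && MOBILE.matcher(value).matches();
    }

    /** Type conversion: rewrites a date such as 2018/4/20 as text in yyyy-MM-dd form. */
    static String normalizeDate(String value) {
        return LocalDate.parse(value, INPUT_DATE).format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    /** Filtering: keeps only the fields that are columns of the warehouse table. */
    static Map<String, String> filterFields(Map<String, String> allFields, Set<String> tableColumns) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : allFields.entrySet()) {
            if (tableColumns.contains(field.getKey())) {
                kept.put(field.getKey(), field.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> submitted = new LinkedHashMap<>();
        submitted.put("mobile", "13800138000");
        submitted.put("date", normalizeDate("2018/4/20"));
        submitted.put("csrfToken", "abc"); // page-only field, not a warehouse column
        System.out.println(isValidMobile(submitted.get("mobile")));        // prints true
        System.out.println(filterFields(submitted, Set.of("mobile", "date")));
        // prints {mobile=13800138000, date=2018-04-20}
    }
}
```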
The method of this embodiment can convert the data associated with the WEB page data submission event, acquired from upstream in real time, into the target data required by OLTP or OLAP according to the preset cleaning rules. Data conversion includes data number conversion, data type conversion, data summarization, data splicing, and the like. In this method, data acquired from upstream can be processed as a data stream as soon as it is received, which greatly improves data processing efficiency. Data that has undergone conversion, cleaning, combination, and similar processing is updated to the storage space of the data warehouse, and this processing mode better supports the various queries and service requests for real-time data and historical data.
In a specific application scenario, a data submission event is captured by monitoring a link click action occurring in a WEB page, and the location of a data file and the related information of how the browser should process the data file are analyzed from configuration information and a URL in real time. And determining a corresponding dependency rule according to the operation information analyzed and obtained in the configuration information, so as to determine which real-time processing should be performed on the data information in the data file. For example, determining the combination order of the real-time data processing according to the operation information obtained by parsing is as follows:
first, the data is cleaned: the data file is encrypted and/or decrypted, invalid data is deleted through data validity detection, data types are converted according to a preset rule to keep the data consistent, and so on;
then, the data is filtered according to a pre-established data model: the data in the data file is filtered according to the table structure in the data warehouse, and only the fields related to the tables of the data warehouse are extracted, that is, the useful target data is intercepted;
finally, the processed target data is stored in the real-time data storage area and the historical data storage area simultaneously.
The dependency rules of the data streams determined according to different operation information and data information are different, so that corresponding real-time processing steps for the data may also be different, and developers of the database can make adaptive designs according to actual needs, thereby configuring different combinations and sequences.
Optionally, when the data warehouse is initialized, the original data existing in the business database is loaded into the historical data storage area according to a predetermined rule, for example, the original data existing in the business database may be all loaded into the historical data storage area at one time, so that the newly established data warehouse can also provide the result of querying and/or serving the historical data before the data warehouse is initialized.
Optionally, the method further comprises:
receiving a query of real-time data and/or a request of service at any time;
providing results of queries and/or services based on the data stored in the real-time data store.
Optionally, the method further comprises:
receiving historical data query and/or service request at any time;
providing results of queries and/or services based on the data stored in the historical data store.
Newly added data is updated synchronously in the historical data storage area and the real-time data storage area, and the two areas are completely decoupled: the real-time data storage area retains only data within the configured time, while the historical data is obtained directly from upstream and has no interaction with the real-time data storage area. Therefore, while query precision is preserved, the efficiency of real-time data queries is improved, real-time and historical data queries are supported at the same time, and the query contention caused by importing and querying real-time data simultaneously is avoided entirely.
An embodiment of the present application discloses a data processing apparatus, as shown in fig. 4, the apparatus 400 includes: a WEB page data configuration and analysis module 401 and a data processing module 402, wherein the WEB page data configuration and analysis module 401 is configured to monitor a WEB page data submission event, and obtain operation information and data information associated with the WEB page data submission event in real time according to the monitored WEB page data submission event; the data processing module 402 is configured to synchronously perform the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area, wherein the data stored in the real-time data storage area is used for providing query and service of real-time data, and the data stored in the historical data storage area is used for providing query and service of historical data.
By actively capturing WEB page data submission events, the apparatus provided by this embodiment acquires the associated data in real time; it abandons the batch processing mode of traditional ETL and genuinely meets the real-time requirement on the data. In addition to the historical data, the historical data storage area contains all data newly added to the real-time data storage area, and it is used for providing query and service for historical data. Consequently, the newly added data in the real-time data storage area does not need to be imported into the historical data storage area when queries and services for historical data are performed; that is, the historical data storage area and the real-time data storage area are completely decoupled.
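As recited in claim 2, the operation information is obtained by analyzing the configuration information of the WEB page, and the data information by analyzing the URL string. A minimal sketch of that parsing step follows; the configuration key `operation`, the example URL, and the helper name `parse_submission` are assumptions made for illustration, since the source does not specify the concrete formats.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlsplit


def parse_submission(page_config: Dict[str, str], url: str) -> Tuple[str, Dict[str, str]]:
    """Return (operation information, data information) for one submission event."""
    # Hypothetical layout: the page configuration names the operation directly.
    operation = page_config["operation"]
    # The data information is read from the query component of the URL string.
    data = dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    return operation, data


operation, data = parse_submission(
    {"operation": "insert"},
    "https://example.com/orders/submit?order_id=42&amount=19.90",
)
```

Values come back as strings; any type conversion is left to the later real-time processing steps.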
An embodiment of the present application discloses a data processing apparatus. As shown in fig. 5, the apparatus 500 includes a WEB page data configuration and analysis module 501, a data processing module 502, and a data operation module 503. The WEB page data configuration and analysis module 501 is configured to monitor a WEB page data submission event and, according to the monitored WEB page data submission event, acquire in real time the operation information and data information associated with it. The data processing module 502 is configured to synchronously execute the operation indicated by the operation information on the data indicated by the data information in the real-time data storage area and the historical data storage area. The data operation module 503 is configured to, after the operation information and data information associated with the WEB page data submission event are acquired, determine a preset dependency rule according to the operation information and process the data information in real time according to the preset dependency rule. The data stored in the real-time data storage area is used for providing query and service of real-time data, and the data stored in the historical data storage area is used for providing query and service of historical data.
In this embodiment of the application, newly added data can be acquired incrementally on a per-event basis: each record of a clicked link corresponds to a new data submission event, and the events are not necessarily related to one another. The corresponding dependency rule is determined according to the operation information carried in the configuration information, which in turn determines the real-time processing to apply to the acquired newly added data information, for example data cleaning rules such as data encryption and data interception. After steps such as cleaning and conversion, the data is synchronized into the corresponding storage module of the data warehouse. The apparatus of this embodiment can convert the data associated with the WEB page data submission event, acquired from upstream in real time, into the target data required by OLTP or OLAP according to a preset cleaning rule. The conversion of the data includes data number conversion, data type conversion, data summarizing calculation, data splicing, and the like. As soon as the data operation module 503 receives data acquired from upstream, it can process that data as a stream, which greatly improves the efficiency of data processing. The data that has undergone conversion, cleaning, combination, and similar processing is updated into the storage space of the data warehouse, and this processing mode better supports the various queries and service requests for real-time data and historical data.
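The way a dependency rule selects the real-time processing steps can be sketched as a lookup from operation information to an ordered list of steps. The rule table, the field names `remark` and `amount`, and both step functions are hypothetical; the sketch also reads "data interception" as truncation of a field, which is an assumption about the translated term.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[Record], Record]


def truncate_remark(record: Record, limit: int = 10) -> Record:
    """Cleaning step: keep only the first `limit` characters of a free-text field."""
    out = dict(record)
    if "remark" in out:
        out["remark"] = str(out["remark"])[:limit]
    return out


def convert_amount(record: Record) -> Record:
    """Conversion step: data type conversion from string to float."""
    out = dict(record)
    if "amount" in out:
        out["amount"] = float(out["amount"])
    return out


# Hypothetical preset dependency rules: operation -> ordered processing steps.
DEPENDENCY_RULES: Dict[str, List[Step]] = {
    "insert": [truncate_remark, convert_amount],
    "update": [convert_amount],
    "delete": [],
}


def process(operation: str, record: Record) -> Record:
    """Apply, in order, the steps that the dependency rule assigns to the operation."""
    steps = DEPENDENCY_RULES.get(operation)
    if steps is None:
        raise ValueError(f"no dependency rule for operation: {operation}")
    for step in steps:
        record = step(record)
    return record


cleaned = process("insert", {"amount": "19.90", "remark": "deliver after six pm"})
```

Because each step takes one record and returns a new one, records can be processed individually as they arrive from upstream instead of being collected into batches.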
Optionally, the apparatus 500 further comprises a data initialization module configured to load raw data into the historical data storage area according to a predetermined rule. Thus, the newly established data warehouse is also able to provide the results of queries and/or services of historical data prior to initialization of the data warehouse.
Optionally, the apparatus further comprises an interface module configured to receive a request for a query and/or service of real-time data and to provide the result of the query and/or service of real-time data according to the data stored in the real-time data storage area. Optionally, the interface module is further configured to receive a request for a query and/or service of historical data and to provide the result of the query and/or service of historical data according to the data stored in the historical data storage area. Newly added data is updated synchronously in both the historical data storage area and the real-time data storage area, so the two areas are completely decoupled: the real-time data storage area retains only the data within a configured time window, while the historical data storage area obtains its data directly from upstream and never interacts with the real-time data storage area. As a result, real-time data queries become more efficient without loss of query accuracy, real-time data queries and historical data queries are supported at the same time, and the query contention caused by real-time data import and real-time data queries running concurrently is avoided entirely.
In one embodiment according to the present application, there is provided a computing device 600 as shown in fig. 6, including but not limited to a memory 601, a processor 602, and computer instructions stored on the memory 601 and executable on the processor 602, the processor 602 implementing the data processing method as described above when executing the instructions.
The above is an illustrative scheme of the computing device of this embodiment. It should be noted that the technical solution of the computing device 600 belongs to the same concept as the aforementioned data processing method; for details not described in the technical solution of the computing device, refer to the description of the technical solution of the aforementioned data processing method.
In one embodiment according to the present application, there is provided a storage medium having stored thereon computer instructions which, when executed by a processor, implement a data processing method as set forth in the preceding description.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be appropriately expanded or narrowed as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals, in accordance with legislation and patent practice.
The above is an illustrative scheme of the readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the aforementioned data processing method; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the aforementioned data processing method.
It should be noted that, for simplicity of description, the above method embodiments are described as a series of combined acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, since according to the present application some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules involved are not necessarily required by the present application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. The alternative embodiments do not set out every detail exhaustively, nor do they limit the application to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical application, and thereby to enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (16)

1. A method of data processing, the method comprising:
monitoring a WEB page data submission event, and acquiring operation information and data information related to the WEB page data submission event in real time according to the monitored WEB page data submission event;
and synchronously executing the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area, wherein the data stored in the real-time data storage area is used for providing query and service of real-time data, the data stored in the historical data storage area is used for providing query and service of historical data, and the operation indicated by the operation information comprises but is not limited to insertion, deletion and updating of data.
2. The method according to claim 1, wherein the obtaining data information and operation information associated with the WEB page data submission event in real time comprises:
acquiring configuration information and URL strings of a WEB page monitored to have a data submission event;
analyzing the configuration information to obtain operation information;
and analyzing the URL string to acquire data information.
3. The method according to claim 1 or 2, wherein the WEB page data submission event comprises:
clicking a WEB page data submission event of a button type; or
clicking the WEB page data submission event of the link type.
4. The method according to claim 1 or 2, wherein the obtaining of the operation information and the data information associated with the WEB page data submission event further comprises:
determining a preset dependence rule according to the operation information;
and processing the data information in real time according to the preset dependence rule.
5. The method of claim 4, wherein the real-time processing comprises:
and cleaning the data information according to a preset cleaning rule.
6. The method of claim 4, wherein the real-time processing further comprises:
and filtering the data information according to a pre-established data model.
7. The method according to claim 1 or 2, characterized in that the method further comprises:
and loading the original data into the historical data storage area according to a preset rule.
8. The method according to claim 1 or 2, characterized in that the method further comprises:
receiving a query for real-time data and/or a request for service;
and providing a result of query and/or service of the real-time data according to the data stored in the real-time data storage area.
9. The method according to claim 1 or 2, characterized in that the method further comprises:
receiving a query for historical data and/or a request for service;
and providing the result of the inquiry and/or service of the historical data according to the data stored in the historical data storage area.
10. A data processing apparatus, characterized in that the apparatus comprises: a WEB page data configuration and analysis module and a data processing module, wherein the WEB page data configuration and analysis module is configured to monitor a WEB page data submission event and acquire operation information and data information associated with the WEB page data submission event in real time according to the monitored WEB page data submission event; the data processing module is configured to synchronously execute the operation indicated by the operation information on the data indicated by the data information in a real-time data storage area and a historical data storage area, wherein the data stored in the real-time data storage area is used for providing query and service of real-time data, the data stored in the historical data storage area is used for providing query and service of historical data, and the operation indicated by the operation information includes but is not limited to insertion, deletion and updating of data.
11. The apparatus according to claim 10, further comprising a data operation module, wherein after the data operation module acquires operation information and data information associated with the WEB page data submission event, the data operation module determines a preset dependency rule according to the operation information, and processes the data information in real time according to the preset dependency rule.
12. The apparatus of claim 10 or 11, further comprising a data initialization module configured to load raw data into the historical data storage according to a predetermined rule.
13. The apparatus of claim 10 or 11, further comprising an interface module configured to receive a request for a query and/or service of real-time data and to provide results of the query and/or service of real-time data based on data stored in the real-time data store.
14. The apparatus of claim 13, wherein the interface module is further configured to receive requests for queries and/or services for historical data and provide results of the queries and/or services for historical data based on data stored in the historical data store.
15. A computing device comprising a memory, a processor and computer instructions stored on the memory and executable on the processor, wherein the processor implements the data processing method of any one of claims 1 to 9 when executing the instructions.
16. A computer-readable storage medium having stored thereon computer instructions, characterized in that the instructions, when executed by a processor, implement the data processing method of any one of claims 1 to 9.
CN201810361696.9A 2018-04-20 2018-04-20 Data processing method and device Expired - Fee Related CN108549714B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810361696.9A CN108549714B (en) 2018-04-20 2018-04-20 Data processing method and device

Publications (2)

Publication Number Publication Date
CN108549714A CN108549714A (en) 2018-09-18
CN108549714B true CN108549714B (en) 2020-12-11

Family

ID=63512031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810361696.9A Expired - Fee Related CN108549714B (en) 2018-04-20 2018-04-20 Data processing method and device

Country Status (1)

Country Link
CN (1) CN108549714B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110968823A (en) * 2018-09-30 2020-04-07 华为技术有限公司 Application client starting method, service server and client equipment
CN109871378A (en) * 2019-02-21 2019-06-11 杭州市商务委员会(杭州市粮食局) The data acquisition and processing (DAP) method and system of big data platform
CN110781188B (en) * 2019-10-23 2022-09-02 泰康保险集团股份有限公司 Form information processing method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667205A (en) * 2009-09-28 2010-03-10 河南电力试验研究院 Method for memorizing real time measure point data facing quick review
CN102637197A (en) * 2012-02-28 2012-08-15 中北大学 File management method of real-time data acquisition and storage system
CN102646130A (en) * 2012-03-12 2012-08-22 华中科技大学 Method for storing and indexing mass historical data
CN103957248A (en) * 2014-04-21 2014-07-30 中国科学院软件研究所 Public real-time data management cloud service platform based on Internet of Things

Also Published As

Publication number Publication date
CN108549714A (en) 2018-09-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201211