CN108549714A - A kind of data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN108549714A
Authority
CN
China
Prior art keywords
data
real
information
web page
inquiry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810361696.9A
Other languages
Chinese (zh)
Other versions
CN108549714B (en)
Inventor
许建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Orange Eagle Data Technology Co Ltd
Original Assignee
Hangzhou Orange Eagle Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Orange Eagle Data Technology Co Ltd
Priority to CN201810361696.9A
Publication of CN108549714A
Application granted
Publication of CN108549714B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a data processing method and device. The method includes: monitoring web page data submission events and, when such an event is detected, obtaining in real time the operation information and data information associated with the event; and, in both a real-time data storage area and a historical data storage area, synchronously performing the operation indicated by the operation information on the data indicated by the data information. The data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.

Description

Data processing method and device
Technical field
This application relates to the technical field of data storage, and in particular to a data processing method and device.
Background
In recent years, electronic information data has played an increasingly important role in business operations, and practical applications require such data to be analyzed efficiently, promptly, and accurately. A traditional data warehouse uses Extract-Transform-Load (ETL) tools to periodically extract data from data sources and load it into the data warehouse after processing. The extraction period of this traditional approach is usually once a month, once a week, or once a day, so it can only support queries and services based on historical data and cannot capture changes in the source data in real time. Real-time data warehouses emerged for this reason. In existing real-time data warehouses, however, real-time data is imported under a pre-access approach at the same time as real-time data queries are running, which causes query contention. The resulting conflicts seriously affect the accuracy and efficiency of online transaction processing (On-Line Transaction Processing, OLTP) and online analytical processing (On-Line Analytical Processing, OLAP), and reduce the performance of the real-time data warehouse.
The real-time data warehouses provided by the prior art in fact remain at the level of traditional ETL data loading. Data is still extracted from the various business systems in a passive or pseudo-passive manner; that is, a real-time data warehouse based on dynamic mirroring makes data acquisition only approximately real-time. This solution merely makes some optimizations inside the data warehouse: ETL is still trigger-driven, and data is still extracted from the business databases. Truly real-time data acquisition is therefore hard to achieve. In addition, because different business systems use different types of databases, the data synchronization process becomes extremely complex during extraction, which makes system stability harder to maintain and requires implementers and developers to have a high level of technical skill.
Summary of the invention
In order to solve the problems in the prior art, the embodiments of this application provide a data processing method, a device, a computing device, and a computer-readable storage medium, so as to achieve truly real-time data acquisition.
In one aspect, an embodiment of this application provides a data processing method, the method including:
monitoring web page data submission events and, according to a detected web page data submission event, obtaining in real time the operation information and data information associated with that event;
in a real-time data storage area and a historical data storage area, synchronously performing the operation indicated by the operation information on the data indicated by the data information, wherein the data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
Optionally, obtaining in real time the data information and operation information associated with the web page data submission event includes:
obtaining the configuration information and URL string of the web page on which the data submission event was detected;
parsing the configuration information to obtain the operation information;
parsing the URL string to obtain the data information.
Optionally, the web page data submission event includes:
a web page data submission event of the button-click type; or
a web page data submission event of the link-click type.
Optionally, after obtaining the operation information and data information associated with the web page data submission event, the method further includes:
determining a preset dependency rule according to the operation information;
processing the data information in real time according to the preset dependency rule.
Optionally, the real-time processing includes:
cleaning the data information according to preset cleaning rules.
Optionally, the real-time processing further includes:
filtering the data information according to a pre-established data model.
Optionally, the method further includes:
loading initial data into the historical data storage area according to a predetermined rule.
Optionally, the method further includes:
receiving a request for a query and/or service on real-time data;
providing the result of the query and/or service on real-time data according to the data stored in the real-time data storage area.
Optionally, the method further includes:
receiving a request for a query and/or service on historical data;
providing the result of the query and/or service on historical data according to the data stored in the historical data storage area.
In another aspect, an embodiment of this application provides a data processing device, the device including a web page data configuration parsing module and a data processing module. The web page data configuration parsing module is configured to monitor web page data submission events and, according to a detected web page data submission event, obtain in real time the operation information and data information associated with that event. The data processing module is configured to synchronously perform, in the real-time data storage area and the historical data storage area, the operation indicated by the operation information on the data indicated by the data information, wherein the data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
Optionally, the device further includes a data operation module configured to, after the operation information and data information associated with the web page data submission event are obtained, determine a preset dependency rule according to the operation information and process the data information in real time according to the preset dependency rule.
Optionally, the device further includes a data initialization module configured to load initial data into the historical data storage area according to a predetermined rule.
Optionally, the device further includes an interface module configured to receive a request for a query and/or service on real-time data and to provide the result of that query and/or service according to the data stored in the real-time data storage area.
Optionally, the interface module is further configured to receive a request for a query and/or service on historical data and to provide the result of that query and/or service according to the data stored in the historical data storage area.
In another aspect, an embodiment of this application provides a computing device including a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor implements the above data processing method when executing the instructions.
In another aspect, an embodiment of this application provides a computer-readable storage medium storing computer instructions that implement the above data processing method when executed by a processor.
The data processing method and device provided by this application achieve truly real-time data acquisition, eliminate the complex mirror design of the prior art, and improve the stability of the database system. They also support queries on historical data better and reduce the workload and difficulty of ETL design.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a data processing method according to one embodiment of this application;
Fig. 2 is a schematic flowchart of a data processing method according to another embodiment of this application;
Fig. 3 is a schematic flowchart of a data processing method according to a further embodiment of this application;
Fig. 4 is a schematic structural diagram of a data processing device according to one embodiment of this application;
Fig. 5 is a schematic structural diagram of a data processing device according to another embodiment of this application;
Fig. 6 is a schematic structural diagram of a computing device according to a specific embodiment of this application.
Detailed description
The details of this application are described below through embodiments with reference to the accompanying drawings, to aid understanding of its content. The application can, however, be implemented in many ways other than the specific embodiments; those skilled in the art can make similar generalizations in combination with the prior art without departing from the substance of this application, so the application is not limited by the specific embodiments disclosed below.
In this application, terms such as "first", "second", "third", and "fourth" are used only to distinguish items from one another and do not indicate importance, order, or any dependence of one on another.
Real-time data processing needs fall into two main types: operational processing and analytical processing. Online transaction processing (On-Line Transaction Processing, OLTP) is a typical operational process, while online analytical processing (On-Line Analytical Processing, OLAP) is a typical analytical process. Traditional databases are mainly used to implement OLTP and focus on computing data, inserting, deleting, and modifying records, and simple queries and statistics. The main task of OLTP is transaction processing; its main concerns are the timeliness, integrity, and correctness of transactions, and it has serious shortcomings in data analysis. The lack of integration and of clear subjects in ordinary business databases shows mainly in the following respects. First, the compartmentalization of business database systems leads to disordered and scattered data distribution without unified definition or planning, and data definitions may be ambiguous across different business databases. Second, databases and tables in business systems are defined without a clear subject and cannot meet the needs of data analysis. In addition, large amounts of data are scattered across different tables, databases, and database servers, so the efficiency of data analysis cannot be guaranteed. Limited by these conditions, traditional databases cannot serve as a platform for large-scale comprehensive data analysis, and a new theory and technology is urgently needed to provide support: data warehouse technology.
Extract-Transform-Load (ETL), a data warehouse technology, is the process of extracting data from the original business systems, cleaning and transforming it, and loading it into the data warehouse; it is an important part of building a data warehouse. Its purpose is to integrate the scattered, messy, and inconsistently standardized data in business databases in order to provide a basis for decision analysis. ETL mainly relies on the processing capacity of a transformation server: after data is extracted from the business database, it is cleaned and transformed on the transformation server and then loaded into the target database. ETL design usually has three parts: data extraction, data cleaning and transformation, and data loading.
Data cleaning generally refers to the process of reducing the data to remove duplicate records and converting the remainder into a standard, acceptable format. In the standard model, data is fed into a data cleaning processor, "cleaned" through a series of steps, and then output in the desired format. Data cleaning addresses problems such as missing values, out-of-range values, inconsistent codes, and duplicate data from the perspectives of accuracy, integrity, consistency, uniqueness, timeliness, and validity.
One embodiment of this application provides a data processing method. As shown in Fig. 1, the method includes:
Step 101: monitor web page data submission events and, according to a detected web page data submission event, obtain in real time the operation information and data information associated with that event;
Step 102: in the real-time data storage area and the historical data storage area, synchronously perform the operation indicated by the operation information on the data indicated by the data information.
The data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
The data stored in the real-time data storage area has a limited life cycle, whose length can be adjusted as required. Because a large amount of real-time data is generated every day, the life cycle of the real-time data storage area is preferably 24 hours or less, so that an excessive data volume does not slow down the response to real-time data queries and services.
In addition to historical data, the historical data storage area also contains all the newly added data in the real-time data storage area, and is used to provide queries and services on historical data. Therefore, when a query or service on historical data is performed, the newly added data in the real-time data storage area does not need to be imported into the historical data storage area; that is, the two storage areas are fully decoupled.
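The relationship between the two storage areas can be illustrated with a minimal sketch. It is not the patented implementation: the class name DualStoreSketch, the in-memory maps standing in for the storage areas, and the fixed 24-hour constant are all assumptions made for illustration. Every write goes to both areas, and only the real-time area evicts entries once their life cycle has elapsed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: two independent in-memory maps stand in for the
// real-time and historical storage areas of the data warehouse.
final class DualStoreSketch {
    // Assumed life cycle of the real-time area (the text suggests 24 hours or less).
    static final long LIFE_CYCLE_MILLIS = 24L * 60 * 60 * 1000;

    private static final class Entry {
        final String value;
        final long writtenAt;

        Entry(String value, long writtenAt) {
            this.value = value;
            this.writtenAt = writtenAt;
        }
    }

    private final Map<String, Entry> realTimeArea = new HashMap<>();
    private final Map<String, String> historyArea = new HashMap<>();

    // Apply the same write to both areas, so neither imports from the other.
    void put(String key, String value, long nowMillis) {
        realTimeArea.put(key, new Entry(value, nowMillis));
        historyArea.put(key, value);
    }

    // Drop real-time entries whose life cycle has elapsed; history is untouched.
    void evictExpired(long nowMillis) {
        realTimeArea.values().removeIf(e -> nowMillis - e.writtenAt > LIFE_CYCLE_MILLIS);
    }

    String queryRealTime(String key) {
        Entry e = realTimeArea.get(key);
        return e == null ? null : e.value;
    }

    String queryHistory(String key) {
        return historyArea.get(key);
    }
}
```

Because the historical area receives each write directly, evicting from the real-time area never requires a transfer between the two.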
Optionally, the web page data submission event includes:
a web page data submission event of the button-click type; or
a web page data submission event of the link-click type.
Specifically, if a data submission event of the button-click type is detected on the monitored web page, the data associated with the event is obtained from the form in real time. The POST data submission methods commonly used over HTTP/1.1 mainly include the following two:
1. application/x-www-form-urlencoded is the most common POST data submission method, namely the browser's native <form>. If the enctype attribute is not set, the data is ultimately submitted as application/x-www-form-urlencoded;
2. multipart/form-data is another common POST data submission method. When a form is used to upload files, the enctype of the <form> must be set to multipart/form-data. This method submits data by uploading a file, for example submitting data by uploading an Excel spreadsheet.
If a data submission event of the link-click type is detected on the monitored web page, the data associated with the event is obtained in real time from the parameters attached to the link.
In step 102, the operations synchronously performed in the real-time data storage area and the historical data storage area on the data indicated by the data information include insertion, deletion, and update of data. For example, the following statements insert and update data in key-value form:
import java.util.HashMap;
import java.util.Map;

Map<String, Integer> map = new HashMap<>();  // define a Map object (value type must be Integer, not int)
map.put("ming", 1);   // insert: key "ming" maps to value 1
map.put("zi", 2);     // insert: key "zi" maps to value 2
map.get("ming");      // returns 1, the value of key "ming"
map.put("ming", 3);   // update: key "ming" already exists, so value 1 is overwritten with 3
The method provided in this embodiment obtains the associated data in real time by actively monitoring web page data submission events, abandoning the batch processing used by traditional ETL and truly satisfying the real-time requirement on data. The real-time data obtained in this way is stored synchronously into the real-time data storage area and the historical data storage area of the data warehouse.
Another embodiment of this application provides a data processing method. As shown in Fig. 2, the method includes:
Step 201: monitor web page data submission events of the button-click type and, according to a detected event of that type, obtain the configuration information and URL string of the web page from the form in real time;
Step 202: parse the configuration information to obtain the operation information;
Step 203: parse the URL string to obtain the data information;
Step 204: according to the operation information and the data information, synchronously perform the operation indicated by the operation information on the data indicated by the data information in the real-time data storage area and the historical data storage area.
The button-click web page data submission event in step 201 is similar to submitting a web form by clicking a button. The operation information corresponds to the various dimensions of the operation on the form, and the data information corresponds to the form content.
The data information includes field information, field data information, and so on; together these make up the complete data. The data is passed between processing modules or over the network as a data stream. A uniform resource locator (Uniform Resource Locator, URL) is the standard address of a resource on the Internet. It fully describes web pages and other resources on the Internet and can also identify local resources; a URL uniquely identifies each web page or resource on the Internet. A URL consists of a sequence of characters in the format protocol://[username:password@]host[:port][/path][?query][#fragment]. The protocol field specifies the transport protocol, for example HTTP or FTP; the host field specifies the host name or IP address of the server that stores the resource; the username and password fields specify the user name and password needed to connect to the server; the port field specifies the port number used by the transport protocol; the path field specifies the address of a directory or file on the host; the query field carries the parameters passed to a dynamic web page; the fragment field specifies a fragment within the resource. Fields shown in square brackets [] are optional. When a client program requests an information resource on an Internet server by URL, it needs to determine the protocol to use, the server to contact, the identifier of the requested resource, and its storage path, all of which the URL provides. By parsing the URL string submitted by the web page, the required data information can be extracted, for example field information and field data information.
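As an illustration of extracting data information from a URL string, the following sketch splits the query component into field names and field values using the standard java.net.URI and java.net.URLDecoder classes. The class name UrlDataInfoSketch, the method name parseDataInfo, and the example URL are hypothetical; the source does not specify how its parser is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of URL string parsing; not the patent's own parser.
final class UrlDataInfoSketch {
    // Split the query component of a URL into field name -> field value pairs.
    static Map<String, String> parseDataInfo(String url) {
        Map<String, String> fields = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery(); // null when the URL has no query
        if (query == null || query.isEmpty()) {
            return fields;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    // Undo percent-encoding and '+'-for-space encoding of form data.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

For example, parseDataInfo("http://example.com/order/submit?name=ming&city=Hang+zhou#top") yields the fields name and city, with the fragment ignored and the plus sign decoded to a space.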
The operation information consists of the various dimensions of the data operation. Information related to the data operation can be parsed from the configuration information of the web page, for example the operation type, the operator, and the operation time; operation types include insertion, deletion, update, and so on. If no operation information can be obtained from the web page, the operation defaults to insertion. The configuration information may also include the name of the database to be monitored, the database type, the table name, the required fields, the corresponding URL page, and so on. Optionally, the configuration information further includes information on the downstream users of the data stored in the real-time and historical data storage areas and on the transmission mode.
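The default-to-insertion behaviour can be sketched as follows, assuming the configuration information has already been parsed into key-value pairs. The key name "operationType" and the class OperationInfoSketch are hypothetical, since the source does not define a configuration format.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: resolve the operation type from parsed configuration.
final class OperationInfoSketch {
    enum OperationType { INSERT, DELETE, UPDATE }

    // "operationType" is an assumed configuration key, not one named by the source.
    static OperationType resolve(Map<String, String> config) {
        String raw = config.get("operationType");
        if (raw == null || raw.trim().isEmpty()) {
            return OperationType.INSERT; // no operation information: default to insertion
        }
        // Throws IllegalArgumentException for an unrecognized operation type.
        return OperationType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}
```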
With the data processing method provided in this embodiment, the data in the data warehouse can be synchronized in real time. Data updates no longer depend on the business database, so the data warehouse is fully decoupled from it. The method overcomes the shortcomings of traditional ETL and omits the complex mirror structure of the dynamic area, providing both real-time and historical data queries while ensuring the stability of the data warehouse.
A further embodiment of this application provides a data processing method. As shown in Fig. 3, the method includes:
Step 301: monitor web page data submission events of the link-click type and, according to a detected event of that type, obtain in real time, from the parameters attached to the link, the operation information and data information associated with the event;
Step 302: determine a dependency rule according to the operation information;
Step 303: process the data information in real time according to the dependency rule;
Step 304: according to the operation information and the data information after real-time processing, synchronously perform the corresponding data operation in the real-time data storage area and the historical data storage area.
In the above embodiment, newly added data may be obtained in an event-incremental manner: each link click corresponds to one new data submission event, and there is no necessary connection between events.
In the embodiments of this application, obtaining in real time the operation information and data information from the parameters attached to the link in step 301 may include:
obtaining the configuration information and URL string of the web page on which the link-click data submission event was detected; obtaining the operation information by parsing the configuration information; and obtaining the data information by parsing the URL string.
The dependency rule determines the order in which a given kind of data stream flows through the real-time processing modules and which real-time processing steps are executed. For example: first, data validity checking is performed and invalid data is deleted; next, data types are converted according to preset rules (for example, data in specific formats such as dates and currency is converted to text of a preset format) to keep the data consistent; then the data information is filtered so that only the data corresponding to fields used by the data warehouse tables is extracted. The corresponding dependency rule is determined from the operation information parsed from the configuration information, which in turn determines which real-time processing the newly obtained data information should undergo, for example data cleaning rules such as data encryption and data truncation. After cleaning and conversion, the data is synchronized to the corresponding storage areas of the data warehouse.
By parsing the configuration information of the web page on which the data submission event was detected, the operation information associated with the submission event can be determined, and with it the mode in which the data will be processed, for example distribution or dependency.
If the operation information indicates that the processing mode is real-time distribution, the data can flow as a data stream to different modules of the data warehouse and be operated on simultaneously.
If the operation information indicates that the processing mode is dependency, the corresponding dependency rule is obtained, and this rule specifies which real-time processing the data information requires, for example encryption or data truncation. The operation information also determines whether result data or only an execution result (success or failure) is returned.
Optionally, the real-time processing in step 303 includes:
cleaning the data information according to preset cleaning rules; and/or
filtering the data information according to a pre-established data model.
The preset cleaning rules for data cleaning include: data truncation, data decryption/encryption, data association, data validity checking (for example, mobile phone number checking), and data type conversion rules.
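Two of the cleaning rules listed above, validity checking and truncation, might look like the following sketch. The 11-digit pattern is an assumed, simplified definition of a valid mobile number chosen for illustration, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of two cleaning rules; the rule definitions are assumptions.
final class CleaningRulesSketch {
    // Assumed validity rule: 11 digits beginning with 1 (a simplified mobile number check).
    private static final Pattern MOBILE = Pattern.compile("1\\d{10}");

    // Data validity checking: true only when the whole value matches the pattern.
    static boolean isValidMobile(String value) {
        return value != null && MOBILE.matcher(value).matches();
    }

    // Data truncation: keep at most maxLength characters.
    static String truncate(String value, int maxLength) {
        return value.length() <= maxLength ? value : value.substring(0, maxLength);
    }
}
```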
Filtering data according to the pre-established data model means filtering the full set of field data obtained from the web page against the table structure in the data warehouse and extracting only the fields used by the data warehouse tables.
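A minimal sketch of this filtering step follows, assuming the table structure is available as a set of column names. The names ModelFilterSketch, filter, and tableColumns are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: keep only the submitted fields that the warehouse table defines.
final class ModelFilterSketch {
    static Map<String, String> filter(Map<String, String> submitted, Set<String> tableColumns) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : submitted.entrySet()) {
            if (tableColumns.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept; // the input map is left unmodified
    }
}
```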
The method of this embodiment converts the data associated with the web page data submission event, obtained in real time from upstream, into the target data required by OLTP or OLAP according to the preset cleaning rules. Data conversion includes conversion of the number of data items, data type conversion, data aggregation, data concatenation, and so on. In this method, as soon as data obtained upstream is received it can be processed as a data stream, which greatly improves the efficiency of data processing. After data conversion, data cleaning, data merging, and similar processing, the data is updated into the storage space of the data warehouse. This approach better supports the various queries and service requests on real-time and historical data.
In a specific application scenario, a data submission event is captured by monitoring a link click on a web page, and the location of the data file and information on how the browser should handle the data file are parsed in real time from the configuration information and the URL. The corresponding dependency rule is determined from the operation information parsed from the configuration information, which determines which real-time processing should be applied to the data information in the data file. For example, according to the parsed operation information, the real-time processing steps are combined in the following order:
First, the data is cleaned. The purpose of data cleaning is to encrypt and/or decrypt the data file, delete invalid data through data validity checking, convert data types according to preset rules to keep the data consistent, and so on;
Next, the data is filtered according to the pre-established data model. The purpose of data filtering is to filter the data in the data file against the table structure in the data warehouse and extract only the fields used by the data warehouse tables, keeping only the useful target data;
Finally, the processed target data is stored in the real-time data storage area and the historical data storage area at the same time.
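The three steps above can be sketched as an ordered list of processing functions followed by a write to both storage areas. This is an illustrative design rather than the patented one: the class PipelineSketch, the use of java.util.function.UnaryOperator to represent a processing step, and the in-memory maps standing in for the storage areas are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: a dependency rule modeled as an ordered list of steps.
final class PipelineSketch {
    private final List<UnaryOperator<Map<String, String>>> steps = new ArrayList<>();
    final Map<String, Map<String, String>> realTimeArea = new HashMap<>();
    final Map<String, Map<String, String>> historyArea = new HashMap<>();

    // Append a real-time processing step; steps run in the order they are added.
    PipelineSketch then(UnaryOperator<Map<String, String>> step) {
        steps.add(step);
        return this;
    }

    // Run every step in order, then store the result in both areas.
    void process(String key, Map<String, String> dataInfo) {
        Map<String, String> current = new HashMap<>(dataInfo);
        for (UnaryOperator<Map<String, String>> step : steps) {
            current = step.apply(current);
        }
        realTimeArea.put(key, current);
        historyArea.put(key, new HashMap<>(current)); // independent copy per area
    }
}
```

A different dependency rule would then correspond to a different sequence of then(...) calls, which matches the idea that the combination and order of steps is configurable.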
Dependent Rule according to data flow determined by different operation information and data information is different, so accordingly The real-time processing steps to data may also be different, the developer of database can make adaptability according to actual needs Design, to configure different combinations and sequence.
Optionally, when data warehouse initializes, existing initial data in service database is carried according to pre-defined rule Enter to the history data store area, such as disposably can be loaded into existing initial data in service database described History data store area, to the historical data before making newly-established data warehouse also be capable of providing data warehouse initialization The result of inquiry and/or service.
Optionally, the method further includes:
receiving, at any time, a request for query and/or service of real-time data;
providing the query and/or service result according to the data stored in the real-time data storage area.
Optionally, the method further includes:
receiving, at any time, a request for query and/or service of historical data;
providing the query and/or service result according to the data stored in the historical data storage area.
Newly added data is updated synchronously in the historical data storage area and the real-time data storage area. The two storage areas are fully decoupled: the real-time data storage area retains only the data within a configured time period, while historical data is obtained directly from upstream and has no interaction with the real-time data storage area. As a result, the efficiency of real-time data queries is improved without sacrificing query precision, real-time data queries and historical data queries are supported at the same time, and the query contention caused by importing real-time data while real-time data queries are running is avoided entirely.
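The decoupling described above can be sketched in Python as two independent in-memory stores that are written together, where only the real-time store applies a retention window. This is a simplified illustration under stated assumptions: the class name DualStore, the retention_seconds parameter, and the explicit now timestamps are hypothetical, and timestamps are assumed to be non-decreasing across writes.

```python
from collections import deque
from typing import Deque, List, Tuple


class DualStore:
    """Writes go to both stores; only the real-time store expires old data."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self.realtime: Deque[Tuple[float, dict]] = deque()  # (timestamp, record)
        self.history: List[Tuple[float, dict]] = []

    def write(self, record: dict, now: float) -> None:
        """Update both storage areas synchronously with the same new record."""
        self.realtime.append((now, record))
        self.history.append((now, record))
        self._evict(now)

    def _evict(self, now: float) -> None:
        """Drop real-time entries older than the configured retention period."""
        cutoff = now - self.retention_seconds
        while self.realtime and self.realtime[0][0] < cutoff:
            self.realtime.popleft()

    def query_realtime(self, now: float) -> List[dict]:
        """Serve a real-time query from the real-time store only."""
        self._evict(now)
        return [rec for _, rec in self.realtime]

    def query_history(self, start: float, end: float) -> List[dict]:
        """Serve a historical query from the history store only."""
        return [rec for ts, rec in self.history if start <= ts <= end]
```

Because the history store is fed by the same write rather than by a later import from the real-time store, a historical query never has to read from, or wait on, the real-time store.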
One embodiment of the application discloses a data processing device. As shown in FIG. 4, the device 400 includes a WEB page data configuration parsing module 401 and a data processing module 402. The WEB page data configuration parsing module 401 is configured to monitor WEB page data submission events and, according to a monitored WEB page data submission event, to obtain in real time the operation information and data information associated with that event. The data processing module 402 is configured to perform the operation indicated by the operation information synchronously on the data indicated by the data information in the real-time data storage area and in the historical data storage area, wherein the data stored in the real-time data storage area is used to provide query and service for real-time data, and the data stored in the historical data storage area is used to provide query and service for historical data.
The device provided by the above embodiment acquires the associated data in real time by actively capturing WEB page data submission events, abandoning the batch processing used by traditional ETL and thereby genuinely meeting the requirement for real-time data. In addition to historical data, the historical data storage area also contains all of the newly added data held in the real-time data storage area, so that it can provide query and service for historical data. Consequently, when a query or service for historical data is performed, the newly added data in the real-time data storage area does not need to be imported into the historical data storage area; that is, the historical data storage area and the real-time data storage area are fully decoupled.
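As a rough illustration of what the parsing module does once a submission event has been captured, the following Python sketch derives the data information from the query part of the URL string and the operation information from the page's configuration information. The configuration format (a dictionary with an "operation" key), the function names, and the example URL are assumptions for illustration; the application does not specify them.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlsplit


def parse_data_info(url: str) -> Dict[str, str]:
    """Extract the submitted fields from the query part of the URL string."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def parse_operation_info(config: Dict[str, str]) -> str:
    """Read the operation from the (hypothetical) page configuration."""
    return config["operation"]


def on_submit_event(url: str, config: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (operation information, data information) for one captured event."""
    return parse_operation_info(config), parse_data_info(url)
```

A call such as on_submit_event("https://example.com/order/submit?user_id=7&amount=12.5", {"operation": "insert"}) would yield the operation name together with a dictionary of the submitted fields, which the data processing module could then apply to both storage areas.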
One embodiment of the application discloses a data processing device. As shown in FIG. 5, the device 500 includes a WEB page data configuration parsing module 501, a data processing module 502 and a data operation module 503. The WEB page data configuration parsing module 501 is configured to monitor WEB page data submission events and, according to a monitored WEB page data submission event, to obtain in real time the operation information and data information associated with that event. The data processing module 502 is configured to perform the operation indicated by the operation information synchronously on the data indicated by the data information in the real-time data storage area and in the historical data storage area. The data operation module 503 is configured to, after the operation information and data information associated with the WEB page data submission event are obtained, determine a preset dependency rule according to the operation information and process the data information in real time according to the preset dependency rule. The data stored in the real-time data storage area is used to provide query and service for real-time data, and the data stored in the historical data storage area is used to provide query and service for historical data.
In the embodiments of the application, newly added data may be obtained in an event-based incremental manner: each link click corresponds to a new data submission event, and the events are not necessarily related to one another. The corresponding dependency rule is determined according to the operation information carried in the configuration information, which in turn determines what real-time processing should be applied to the newly obtained data information, for example data cleaning rules such as data encryption and data truncation. After cleaning and conversion, the data is synchronized to the corresponding storage modules of the data warehouse. The device of this embodiment can convert the data associated with the WEB page data submission event, obtained in real time from upstream, into the target data required by OLTP or OLAP according to preset cleaning rules. Data conversion includes: conversion of the number of data items, conversion of data types, summary calculation of data, data splicing, and so on. As soon as the data operation module 503 receives data obtained from upstream, it can process the data promptly as a data stream, which greatly improves the efficiency of data processing. The data processed by data conversion, data cleaning, data merging and the like is updated into the storage space of the data warehouse; this processing approach better supports the various query and service requests for real-time data and historical data.
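The stream-style conversion described above can be sketched as a Python generator that handles each upstream event as soon as it arrives. The sketch shows three of the listed conversions (data type conversion, summary calculation and data splicing) under assumed field names ("first", "last", "amount"); these names and the convert function are hypothetical and are not taken from the application.

```python
from typing import Dict, Iterable, Iterator


def convert(events: Iterable[Dict[str, str]]) -> Iterator[dict]:
    """Convert upstream events one at a time, without waiting for a batch."""
    running_total = 0.0
    for event in events:
        amount = float(event["amount"])      # data type conversion
        running_total += amount              # summary calculation
        yield {
            "full_name": f'{event["first"]} {event["last"]}',  # data splicing
            "amount": amount,
            "running_total": running_total,
        }
```

Because the function is a generator, each converted record is available to the storage step immediately after its event is read, which mirrors the contrast with batch-oriented ETL drawn in the text.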
Optionally, the device 500 further includes a data initialization module configured to load original data into the historical data storage area according to a predetermined rule. In this way, a newly established data warehouse can also provide query and/or service results for historical data that predates the initialization of the data warehouse.
Optionally, the device further includes an interface module configured to receive a request for query and/or service of real-time data, and to provide the query and/or service result for real-time data according to the data stored in the real-time data storage area. Optionally, the interface module is further configured to receive a request for query and/or service of historical data, and to provide the query and/or service result for historical data according to the data stored in the historical data storage area. Newly added data is updated synchronously in the historical data storage area and the real-time data storage area. The two storage areas are fully decoupled: the real-time data storage area retains only the data within a configured time period, while historical data is obtained directly from upstream and has no interaction with the real-time data storage area. As a result, the efficiency of real-time data queries is improved without sacrificing query precision, real-time data queries and historical data queries are supported at the same time, and the query contention caused by importing real-time data while real-time data queries are running is avoided entirely.
One embodiment according to the application provides a computing device 600 as shown in FIG. 6, including but not limited to a memory 601, a processor 602, and computer instructions stored in the memory 601 and executable on the processor 602; when the processor 602 executes the instructions, the foregoing data processing method is implemented.
The above is an exemplary scheme of the computing device of this embodiment. It should be noted that the technical solution of the computing device 600 and the technical solution of the foregoing data processing method belong to the same concept; for details not described in the technical solution of the computing device, refer to the description of the technical solution of the data processing method above.
One embodiment according to the application provides a storage medium on which computer instructions are stored; when the instructions are executed by a processor, the foregoing data processing method is implemented.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content covered by the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The above is an exemplary scheme of the readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the foregoing data processing method belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the data processing method above.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions; however, those skilled in the art should understand that the application is not limited by the described order of actions, because according to the application certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by the application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
The preferred embodiments of the application disclosed above are intended only to help illustrate the application. The alternative embodiments do not describe every detail exhaustively, nor do they limit the application to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were chosen and described in detail in order to better explain the principles and practical application of the application, so that those skilled in the art can well understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (16)

1. A data processing method, characterized in that the method includes:
monitoring WEB page data submission events, and obtaining in real time, according to a monitored WEB page data submission event, the operation information and data information associated with the WEB page data submission event;
performing the operation indicated by the operation information synchronously on the data indicated by the data information in the real-time data storage area and in the historical data storage area, wherein the data stored in the real-time data storage area is used to provide query and service for real-time data, and the data stored in the historical data storage area is used to provide query and service for historical data.
2. The method according to claim 1, characterized in that obtaining in real time the data information and operation information associated with the WEB page data submission event includes:
obtaining the configuration information and the URL string of the WEB page in which the data submission event was detected;
parsing the configuration information to obtain the operation information;
parsing the URL string to obtain the data information.
3. The method according to claim 1 or 2, characterized in that the WEB page data submission event includes:
a WEB page data submission event of the button-click type; or
a WEB page data submission event of the link-click type.
4. The method according to claim 1 or 2, characterized in that, after the operation information and data information associated with the WEB page data submission event are obtained, the method further includes:
determining a preset dependency rule according to the operation information;
processing the data information in real time according to the preset dependency rule.
5. The method according to claim 4, characterized in that the real-time processing includes:
cleaning the data information according to a preset cleaning rule.
6. The method according to claim 4, characterized in that the real-time processing further includes:
filtering the data information according to a pre-established data model.
7. The method according to claim 1 or 2, characterized in that the method further includes:
loading original data into the historical data storage area according to a predetermined rule.
8. The method according to claim 1 or 2, characterized in that the method further includes:
receiving a request for query and/or service of real-time data;
providing the query and/or service result for real-time data according to the data stored in the real-time data storage area.
9. The method according to claim 1 or 2, characterized in that the method further includes:
receiving a request for query and/or service of historical data;
providing the query and/or service result for historical data according to the data stored in the historical data storage area.
10. A data processing device, characterized in that the device includes a WEB page data configuration parsing module and a data processing module, wherein the WEB page data configuration parsing module is configured to monitor WEB page data submission events and, according to a monitored WEB page data submission event, to obtain in real time the operation information and data information associated with the WEB page data submission event; and the data processing module is configured to perform the operation indicated by the operation information synchronously on the data indicated by the data information in the real-time data storage area and in the historical data storage area, wherein the data stored in the real-time data storage area is used to provide query and service for real-time data, and the data stored in the historical data storage area is used to provide query and service for historical data.
11. The device according to claim 10, characterized in that the device further includes a data operation module, the data operation module being configured to, after the operation information and data information associated with the WEB page data submission event are obtained, determine a preset dependency rule according to the operation information and process the data information in real time according to the preset dependency rule.
12. The device according to claim 10 or 11, characterized in that the device further includes a data initialization module, the data initialization module being configured to load original data into the historical data storage area according to a predetermined rule.
13. The device according to claim 10 or 11, characterized in that the device further includes an interface module, the interface module being configured to receive a request for query and/or service of real-time data, and to provide the query and/or service result for real-time data according to the data stored in the real-time data storage area.
14. The device according to claim 13, characterized in that the interface module is further configured to receive a request for query and/or service of historical data, and to provide the query and/or service result for historical data according to the data stored in the historical data storage area.
15. A computing device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, characterized in that the processor implements the data processing method according to any one of claims 1 to 9 when executing the instructions.
16. A computer-readable storage medium on which computer instructions are stored, characterized in that the data processing method according to any one of claims 1 to 9 is implemented when the instructions are executed by a processor.
CN201810361696.9A 2018-04-20 2018-04-20 Data processing method and device Expired - Fee Related CN108549714B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810361696.9A CN108549714B (en) 2018-04-20 2018-04-20 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810361696.9A CN108549714B (en) 2018-04-20 2018-04-20 Data processing method and device

Publications (2)

Publication Number Publication Date
CN108549714A (en) 2018-09-18
CN108549714B (en) 2020-12-11

Family

ID=63512031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810361696.9A Expired - Fee Related CN108549714B (en) 2018-04-20 2018-04-20 Data processing method and device

Country Status (1)

Country Link
CN (1) CN108549714B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871378A (en) * 2019-02-21 2019-06-11 杭州市商务委员会(杭州市粮食局) The data acquisition and processing (DAP) method and system of big data platform
CN110781188A (en) * 2019-10-23 2020-02-11 泰康保险集团股份有限公司 Form information processing method and device, electronic equipment and storage medium
US20210218828A1 (en) * 2018-09-30 2021-07-15 Huawei Technologies Co., Ltd. Method for Starting Application Client, Service Server, and Client Device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667205A (en) * 2009-09-28 2010-03-10 河南电力试验研究院 Method for memorizing real time measure point data facing quick review
CN102637197A (en) * 2012-02-28 2012-08-15 中北大学 File management method of real-time data acquisition and storage system
CN102646130A (en) * 2012-03-12 2012-08-22 华中科技大学 Method for storing and indexing mass historical data
CN103957248A (en) * 2014-04-21 2014-07-30 中国科学院软件研究所 Public real-time data management cloud service platform based on Internet of Things

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210218828A1 (en) * 2018-09-30 2021-07-15 Huawei Technologies Co., Ltd. Method for Starting Application Client, Service Server, and Client Device
CN109871378A (en) * 2019-02-21 2019-06-11 杭州市商务委员会(杭州市粮食局) The data acquisition and processing (DAP) method and system of big data platform
CN110781188A (en) * 2019-10-23 2020-02-11 泰康保险集团股份有限公司 Form information processing method and device, electronic equipment and storage medium
CN110781188B (en) * 2019-10-23 2022-09-02 泰康保险集团股份有限公司 Form information processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN108549714B (en) 2020-12-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201211