CN108549714A - Data processing method and device - Google Patents
Data processing method and device
- Publication number
- CN108549714A CN201810361696.9A
- Authority
- CN
- China
- Prior art keywords
- data
- real
- information
- web page
- inquiry
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a data processing method and device. The method includes: monitoring WEB page data submission events and, upon detecting such an event, obtaining in real time the operation information and data information associated with it; and synchronously performing, in a real-time data storage area and a historical data storage area, the operation indicated by the operation information on the data indicated by the data information. The data stored in the real-time data storage area is used to serve queries and services on real-time data, and the data stored in the historical data storage area is used to serve queries and services on historical data.
Description
Technical field
This application relates to the technical field of data storage, and in particular to a data processing method and device.
Background
In recent years, electronic information data has played an increasingly important role in business operations, and practical applications require that it be analyzed efficiently, promptly, and accurately. A traditional data warehouse uses an extract-transform-load (ETL) tool to extract data from data sources periodically, process it, and load it into the warehouse. The extraction period is typically once a month, once a week, or once a day, so a traditional warehouse supports only queries and services based on historical data and cannot capture changes in the source data in real time. Real-time data warehouses emerged to address this. In existing real-time data warehouses, however, real-time data import and real-time data queries run at the same time and compete with each other; the resulting conflicts seriously affect the accuracy and efficiency of online transaction processing (OLTP) and online analytical processing (OLAP) and degrade the performance of the warehouse.
The real-time data warehouses provided by the prior art in practice remain at the level of traditional ETL data loading. They still obtain data by extracting it, passively or pseudo-passively, from the various business systems; that is, a real-time data warehouse based on dynamic mirroring achieves only approximately real-time data acquisition. This solution merely optimizes the inside of the data warehouse: ETL remains trigger-driven, and data is still extracted from the business databases. As a result, truly real-time data acquisition is hard to achieve. Moreover, because different business systems use different types of databases, data synchronization during extraction becomes extremely complex, which makes system stability harder to maintain and demands a high level of technical skill from implementers and developers.
Summary of the invention
To solve the problems in the prior art, embodiments of this application provide a data processing method, a device, a computing device, and a computer-readable storage medium that achieve truly real-time data acquisition.
In one aspect, an embodiment of this application provides a data processing method, the method including:
monitoring WEB page data submission events and, upon detecting such an event, obtaining in real time the operation information and data information associated with the WEB page data submission event;
synchronously performing, in a real-time data storage area and a historical data storage area, the operation indicated by the operation information on the data indicated by the data information, where the data stored in the real-time data storage area is used to serve queries and services on real-time data, and the data stored in the historical data storage area is used to serve queries and services on historical data.
Optionally, obtaining in real time the data information and operation information associated with the WEB page data submission event includes:
obtaining the configuration information and the URL string of the WEB page on which the data submission event was detected;
parsing the configuration information to obtain the operation information;
parsing the URL string to obtain the data information.
Optionally, the WEB page data submission event includes:
a WEB page data submission event of the button-click type; or
a WEB page data submission event of the link-click type.
Optionally, after the operation information and data information associated with the WEB page data submission event are obtained, the method further includes:
determining a preset dependency rule according to the operation information;
processing the data information in real time according to the preset dependency rule.
Optionally, the real-time processing includes:
cleaning the data information according to preset cleaning rules.
Optionally, the real-time processing further includes:
filtering the data information according to a pre-established data model.
Optionally, the method further includes:
loading initial data into the historical data storage area according to a predetermined rule.
Optionally, the method further includes:
receiving a request for a query and/or service on real-time data;
providing the result of the query and/or service on real-time data according to the data held in the real-time data storage area.
Optionally, the method further includes:
receiving a request for a query and/or service on historical data;
providing the result of the query and/or service on historical data according to the data held in the historical data storage area.
In another aspect, an embodiment of this application further provides a data processing device, the device including a WEB page data configuration parsing module and a data processing module. The WEB page data configuration parsing module is configured to monitor WEB page data submission events and, upon detecting such an event, obtain in real time the operation information and data information associated with the WEB page data submission event. The data processing module is configured to synchronously perform, in a real-time data storage area and a historical data storage area, the operation indicated by the operation information on the data indicated by the data information, where the data stored in the real-time data storage area is used to serve queries and services on real-time data, and the data stored in the historical data storage area is used to serve queries and services on historical data.
Optionally, the device further includes a data operation module, which is configured to determine a preset dependency rule according to the operation information after the operation information and data information associated with the WEB page data submission event are obtained, and to process the data information in real time according to the preset dependency rule.
Optionally, the device further includes a data initialization module, which is configured to load initial data into the historical data storage area according to a predetermined rule.
Optionally, the device further includes an interface module, which is configured to receive a request for a query and/or service on real-time data and to provide the result of the query and/or service according to the data held in the real-time data storage area.
Optionally, the interface module is further configured to receive a request for a query and/or service on historical data and to provide the result of the query and/or service according to the data held in the historical data storage area.
In another aspect, an embodiment of this application further provides a computing device including a memory, a processor, and computer instructions that are stored in the memory and executable on the processor; the processor implements the above data processing method when executing the instructions.
In another aspect, an embodiment of this application further provides a computer-readable storage medium on which computer instructions are stored; the instructions implement the above data processing method when executed by a processor.
The data processing method and device provided by this application achieve truly real-time data acquisition, omit the complex mirror design of the prior art, and improve the stability of the database system. They also support queries on historical data better and reduce the workload and difficulty of ETL design.
Brief description of the drawings
Fig. 1 is a flow diagram of the data processing method of one embodiment of this application;
Fig. 2 is a flow diagram of the data processing method of another embodiment of this application;
Fig. 3 is a flow diagram of the data processing method of another embodiment of this application;
Fig. 4 is a structural diagram of the data processing device of one embodiment of this application;
Fig. 5 is a structural diagram of the data processing device of another embodiment of this application;
Fig. 6 is a structural diagram of the computing device of one specific embodiment of this application.
Detailed description
The details of this application are described below through embodiments with reference to the drawings, to aid understanding of its content. The application can, however, be implemented in many ways other than the specific embodiments described; those skilled in the art can make similar extensions in combination with the prior art without departing from the substance of the application, so the application is not limited by the specific embodiments disclosed below.
In this application, "first", "second", "third", "fourth", and the like are used only to distinguish items from one another; they do not indicate importance, order, or that one is a precondition for another.
The demands of real-time data processing fall mainly into two types: operational processing and analytical processing. Online transaction processing (OLTP) is the typical operational process, and online analytical processing (OLAP) is the typical analytical process. Traditional databases are mainly used for OLTP; they focus on computing data, on inserting, deleting, and modifying records, and on simple queries and statistics. The main task of OLTP is transaction processing, and its main concerns are the timeliness, integrity, and correctness of transactions; it is seriously deficient in the analytical processing of data. Ordinary business databases lack integration and a clear subject orientation, which shows in the following ways. First, the compartmentalization of business database systems leads to disordered and scattered data distribution without unified definition and planning, so data definitions may be ambiguous across different business databases. Second, the databases and tables of a business database are defined without a specific subject, so they cannot meet the needs of data analysis. In addition, large amounts of data are scattered across different tables, databases, and database servers, so the efficiency of data analysis cannot be guaranteed. Limited by these conditions, traditional databases cannot serve as a platform for the comprehensive analysis of large-scale data; a new theory and technology is urgently needed to provide support, and this is data warehouse technology.
Extract-transform-load (ETL), a data warehouse technology, is the process of extracting data from the original business systems, cleaning and transforming it, and loading it into the data warehouse; it is an important part of building a data warehouse. Its purpose is to integrate the scattered, messy, and non-uniform data in the business databases so as to provide a basis for analysis in decision making. ETL mainly uses the processing power of a transformation server: after data is extracted from the business database, it is cleaned and transformed on the transformation server and then loaded into the target database. ETL design is usually divided into three parts: data extraction, data cleaning and transformation, and data loading.
Data cleaning generally refers to the process of reducing data by removing duplicate records and converting the remainder into a standard, acceptable format. In the standard data cleaning model, data is fed into a data cleaning processor, "cleaned" through a series of steps, and then output in the desired format. Data cleaning handles problems such as missing values, out-of-range values, inconsistent codes, and duplicate data from the standpoint of accuracy, completeness, consistency, uniqueness, timeliness, and validity.
One embodiment of this application provides a data processing method. As shown in Fig. 1, the method includes:
Step 101: Monitor WEB page data submission events and, upon detecting such an event, obtain in real time the operation information and data information associated with the WEB page data submission event;
Step 102: Synchronously perform, in the real-time data storage area and the historical data storage area, the operation indicated by the operation information on the data indicated by the data information.
The data stored in the real-time data storage area is used to serve queries and services on real-time data, and the data stored in the historical data storage area is used to serve queries and services on historical data.
The data stored in the real-time data storage area has a limited life cycle, whose length can be adjusted as required. Because a large amount of real-time data is generated every day, the life cycle is preferably no more than 24 hours, so that an excessive volume of data in the real-time data storage area does not slow the response to real-time queries and services.
In addition to historical data, the historical data storage area also contains all data newly added to the real-time data storage area, and it serves queries and services on historical data. Newly added data therefore does not need to be imported from the real-time data storage area into the historical data storage area when a historical query or service is performed; that is, the historical data storage area and the real-time data storage area are fully decoupled.
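The life cycle of the real-time data storage area can be illustrated with a minimal sketch. The class `RealTimeStore`, its in-memory map, and the explicit `nowMillis` argument (used so the behaviour is deterministic) are assumptions made for illustration; the application does not specify how expiry is implemented.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical real-time storage area whose entries expire after a fixed life cycle. */
final class RealTimeStore {
    /** A stored value together with the time at which it was written. */
    private static final class Stamped {
        final String value;
        final long writtenAtMillis;

        Stamped(String value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final long lifeCycleMillis;
    private final Map<String, Stamped> entries = new LinkedHashMap<>();

    RealTimeStore(long lifeCycleMillis) {
        this.lifeCycleMillis = lifeCycleMillis;
    }

    void put(String key, String value, long nowMillis) {
        entries.put(key, new Stamped(value, nowMillis));
    }

    /** Returns the value, or null if the key is absent or older than the life cycle. */
    String get(String key, long nowMillis) {
        evictExpired(nowMillis);
        Stamped s = entries.get(key);
        return s == null ? null : s.value;
    }

    /** Drops every entry whose age exceeds the configured life cycle. */
    private void evictExpired(long nowMillis) {
        Iterator<Stamped> it = entries.values().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().writtenAtMillis > lifeCycleMillis) {
                it.remove();
            }
        }
    }
}
```

With a life cycle of 24 hours, a value written at time 0 is still returned one second later and is no longer returned once more than 24 hours have passed. The historical data storage area would keep the same record without any expiry.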
Optionally, the WEB page data submission event includes:
a WEB page data submission event of the button-click type; or
a WEB page data submission event of the link-click type.
Specifically, if the monitored WEB page raises a data submission event of the button-click type, the data associated with the WEB page data submission event is obtained in real time from the form. The ways of submitting POST data that are commonly used over HTTP/1.1 mainly include the following two:
1. application/x-www-form-urlencoded is the most common way to submit POST data: when data is submitted through a browser's native <form> and the enctype attribute is not set, the data is submitted as application/x-www-form-urlencoded;
2. multipart/form-data is another common way to submit POST data: when a form is used to upload a file, the enctype of the <form> must be set to multipart/form-data. This method submits data mainly by uploading a file, for example by uploading an Excel spreadsheet.
If the monitored WEB page raises a data submission event of the link-click type, the data associated with the WEB page data submission event is obtained in real time from the parameters attached to the link.
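As a minimal sketch of how the fields of a button-click submission could be recovered, the following decodes an application/x-www-form-urlencoded body into field names and values using the standard `java.net.URLDecoder`. The class name `FormBodyParser` and the sample field names are hypothetical, and multipart/form-data bodies are not handled here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that decodes an application/x-www-form-urlencoded body. */
final class FormBodyParser {
    /** Splits "a=1&b=2" into an ordered map of decoded field names and values. */
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (body == null || body.isEmpty()) {
            return fields;
        }
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawKey = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(rawKey), decode(rawValue));
        }
        return fields;
    }

    /** Percent-decodes one component; '+' is decoded as a space. */
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, `FormBodyParser.parse("name=ming&city=New+York&note=a%26b")` yields the fields `name`, `city`, and `note`, with the `+` and `%26` escapes decoded to a space and an ampersand.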
In step 102, the operations that are synchronously performed in the real-time data storage area and the historical data storage area on the data indicated by the data information include inserting, deleting, and updating data. For example, the following Java statements insert and update data in key-value form:
import java.util.HashMap;
import java.util.Map;

Map<String, Integer> map = new HashMap<>();  // define a Map object (type arguments must be reference types, so Integer, not int)
map.put("ming", 1);   // insert one entry: key "ming", value 1
map.put("zi", 2);     // insert another entry: key "zi", value 2
map.get("ming");      // returns 1, the value for key "ming"
map.put("ming", 3);   // update: key "ming" already exists, so the old value 1 is overwritten and the new value is 3
The method of this embodiment acquires the associated data in real time by actively monitoring WEB page data submission events. It abandons the batch processing used by traditional ETL and meets the requirement for truly real-time data. The real-time data obtained by this method is stored synchronously in the real-time data storage area and the historical data storage area of the data warehouse.
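The synchronous execution of one operation in both storage areas can be sketched as follows. This is an illustration under stated assumptions only: `DualStoreWriter`, its `Operation` enum, and the use of two in-memory maps as stand-ins for the storage areas are hypothetical and are not taken from the application.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical writer that applies each operation to both storage areas together. */
final class DualStoreWriter {
    enum Operation { INSERT, UPDATE, DELETE }

    // In-memory stand-ins for the two storage areas of the data warehouse.
    final Map<String, String> realTimeArea = new HashMap<>();
    final Map<String, String> historicalArea = new HashMap<>();

    /** Applies the same operation to both areas inside one synchronized call. */
    synchronized void apply(Operation op, String key, String value) {
        applyTo(realTimeArea, op, key, value);
        applyTo(historicalArea, op, key, value);
    }

    private static void applyTo(Map<String, String> area, Operation op, String key, String value) {
        switch (op) {
            case INSERT:
            case UPDATE:
                area.put(key, value);   // put() inserts a new key or overwrites an existing one
                break;
            case DELETE:
                area.remove(key);
                break;
        }
    }
}
```

Calling `apply(Operation.INSERT, "ming", "1")` followed by `apply(Operation.UPDATE, "ming", "3")` leaves the value `"3"` under the key `"ming"` in both areas. A real warehouse would need a transactional or otherwise fault-tolerant write path, which this sketch does not attempt.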
Another embodiment of this application provides a data processing method. As shown in Fig. 2, the method includes:
Step 201: Monitor WEB page data submission events of the button-click type and, upon detecting such an event, obtain the configuration information and URL string of the WEB page from the form in real time;
Step 202: Parse the configuration information to obtain the operation information;
Step 203: Parse the URL string to obtain the data information;
Step 204: According to the operation information and the data information, synchronously perform, in the real-time data storage area and the historical data storage area, the operation indicated by the operation information on the data indicated by the data information.
The button-click WEB page data submission event in step 201 is similar to submitting a WEB form by clicking a button. The operation information corresponds to the various dimensions of the operation on the form, and the data information corresponds to the form content.
The data information includes field information, field data, and the like, which together constitute the complete data. The data is transmitted between processing modules or over the network as a data stream. A uniform resource locator (URL) is the standard address of a resource on the Internet; it fully describes web pages and other resources on the Internet and can also identify local resources. A URL uniquely identifies each web page or resource on the Internet. A URL consists of a sequence of characters in the format protocol://[username:password@]host[:port][/path][?query][#fragment]. The protocol field specifies the transport protocol, such as HTTP or FTP; the host field specifies the host name or IP address of the server that stores the resource; the username and password fields specify the user name and password needed to connect to the server; the port field specifies the port number used by the transport protocol; the path field specifies the address of a directory or file on the host; the query field specifies the parameters passed to a dynamic web page; and the fragment field specifies a fragment within the resource. Fields in square brackets [] are optional. When a client program uses a URL to request an information resource from an Internet server, it needs to determine the protocol to use, the server to contact, the identifier of the requested resource, and its storage path, all of which are provided by the URL. By parsing the URL string submitted by the WEB page, the required data information, such as field information and field data, can be extracted.
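Parsing a URL string into data information can be sketched with the standard `java.net.URI` class, which exposes the scheme, host, port, path, query, and fragment described above. The class `UrlDataExtractor` and the example URL are hypothetical; treating each query parameter as one field name and field value is an assumption made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that extracts field names and values from a URL's query part. */
final class UrlDataExtractor {
    /** Returns the decoded query parameters of the URL in their original order. */
    static Map<String, String> extractFields(String url) {
        Map<String, String> fields = new LinkedHashMap<>();
        String rawQuery = URI.create(url).getRawQuery();  // text between '?' and '#', or null
        if (rawQuery == null || rawQuery.isEmpty()) {
            return fields;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawKey = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(rawKey), decode(rawValue));
        }
        return fields;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For the hypothetical URL `http://example.com:8080/orders/submit?user=ming&amount=3#top`, the extracted fields are `user` and `amount`; the other components remain available through `URI.getHost()`, `URI.getPort()`, `URI.getPath()`, and `URI.getFragment()`.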
The operation information is the various dimensions of the data operation. Information related to the data operation, such as the operation type, the operator, and the operation time, can be parsed from the configuration information of the WEB page; the operation types include insert, delete, and update. If no operation information can be obtained from the WEB page, the operation defaults to insert. The configuration information may also include the name of the database to be monitored, the database type, the table name, the required fields, the corresponding URL page, and so on. Optionally, the configuration information further includes information about the downstream users of the data stored in the real-time data storage area and the historical data storage area and about how the data is delivered to them.
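The default-to-insert behaviour can be sketched as follows. The configuration is modelled as a plain string map, and the key name `operationType` is a hypothetical choice, since the application does not define a configuration format. Treating an unrecognized value the same as a missing one is also an assumption of this sketch.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser that derives the operation type from page configuration. */
final class OperationInfoParser {
    enum OperationType { INSERT, UPDATE, DELETE }

    /**
     * Reads the assumed "operationType" key. Falls back to INSERT when the key
     * is absent, and (by assumption) also when its value is not recognized.
     */
    static OperationType operationType(Map<String, String> config) {
        String raw = config.get("operationType");
        if (raw == null) {
            return OperationType.INSERT;
        }
        try {
            return OperationType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return OperationType.INSERT;
        }
    }
}
```

A configuration containing `operationType=delete` therefore yields `DELETE`, while an empty configuration yields `INSERT`.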
With the data processing method of this embodiment, the data in the data warehouse can be synchronized in real time, and data updates no longer depend on the business database, achieving full decoupling from it. The method overcomes the shortcomings of traditional ETL and omits the complex mirror structure of the dynamic area, so that real-time queries and historical queries are both supported while the stability of the data warehouse is preserved.
Another embodiment of this application provides a data processing method. As shown in Fig. 3, the method includes:
Step 301: Monitor WEB page data submission events of the link-click type and, upon detecting such an event, obtain in real time, from the parameters attached to the link, the operation information and data information associated with the WEB page data submission event;
Step 302: Determine a dependency rule according to the operation information;
Step 303: Process the data information in real time according to the dependency rule;
Step 304: According to the operation information and the data information that has been processed in real time, synchronously perform the corresponding data operation in the real-time data storage area and the historical data storage area.
In the above embodiment, newly added data may be obtained incrementally on a per-event basis: each link click corresponds to one new data submission event, and the events are not necessarily related to one another.
In this embodiment of the application, obtaining in step 301 the operation information and data information associated with the WEB page data submission event from the parameters attached to the link may include:
obtaining the configuration information and URL string of the WEB page on which the link-click data submission event was detected; obtaining the operation information by parsing the configuration information; and obtaining the data information by parsing the URL string.
The dependency rule determines which real-time processing modules a data stream passes through, in what order, and which real-time processing steps are performed. For example: first a validity check is run and invalid data is deleted; then data types are converted according to preset rules (for example, data in specific formats such as dates and currency is converted to text in a preset format) to keep the data consistent; then the data information is filtered so that only the data for the fields that appear in the data warehouse tables is extracted. The dependency rule is determined from the operation information parsed from the configuration information, which in turn determines which real-time processing the newly acquired data information undergoes, for example cleaning rules such as data encryption and data truncation. After cleaning and conversion, the data is synchronized to the corresponding storage areas of the data warehouse.
By parsing the configuration information of the WEB page on which the data submission event was detected, the operation information associated with the submission event can be determined, and from it the mode in which the data is to be processed, for example distribution or dependency.
If the operation information indicates that the processing mode is real-time distribution, the data flows as a data stream to different modules of the data warehouse, which operate on it at the same time.
If, for example, the operation information indicates that the processing mode is dependency, the corresponding dependency rule is obtained, and the rule specifies which real-time processing the data information must undergo, such as encryption or data truncation. The operation information also determines whether the result data or only the execution result (success or failure) is returned.
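One way to picture a dependency rule is as an ordered list of step names that is resolved against a registry of processing steps. The sketch below makes that assumption; the class `DependencyRulePipeline` and the step names `trim` and `dropEmpty` are hypothetical stand-ins for the cleaning steps named in the text.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical pipeline that runs the steps named by a dependency rule, in order. */
final class DependencyRulePipeline {
    // Assumed registry: step name -> transformation of one record (field name -> value).
    private static final Map<String, UnaryOperator<Map<String, String>>> STEPS = new HashMap<>();

    static {
        // Strips leading and trailing whitespace from every value.
        STEPS.put("trim", record -> {
            Map<String, String> out = new HashMap<>();
            record.forEach((field, value) -> out.put(field, value.trim()));
            return out;
        });
        // Removes fields whose value is the empty string.
        STEPS.put("dropEmpty", record -> {
            Map<String, String> out = new HashMap<>(record);
            out.values().removeIf(String::isEmpty);
            return out;
        });
    }

    /** Applies each named step to the record in the order given by the rule. */
    static Map<String, String> run(List<String> rule, Map<String, String> record) {
        Map<String, String> current = record;
        for (String stepName : rule) {
            UnaryOperator<Map<String, String>> step = STEPS.get(stepName);
            if (step == null) {
                throw new IllegalArgumentException("unknown step: " + stepName);
            }
            current = step.apply(current);
        }
        return current;
    }
}
```

The order carried by the rule matters: for a record whose `note` field holds only spaces, the rule `[trim, dropEmpty]` removes the field, whereas `[dropEmpty, trim]` keeps it with an empty value.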
Optionally, the real-time processing in step 303 includes:
cleaning the data information according to preset cleaning rules; and/or
filtering the data information according to a pre-established data model.
The preset cleaning rules include: data truncation, data decryption/encryption, data association, data validity checks (for example, mobile phone number checks), and data type conversion rules.
Filtering the data according to the pre-established data model means filtering the full set of field data obtained from the WEB page against the table structure in the data warehouse and extracting only the fields that appear in the data warehouse tables.
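A validity check and a model-based filter can be sketched as follows. The phone number pattern (11 digits starting with 1) is an assumed simplification, and the class `CleanAndFilter` and the column names in the example are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helpers for one validity check and for filtering by table columns. */
final class CleanAndFilter {
    // Assumed shape of a mobile phone number: 11 digits, the first being 1.
    private static final Pattern PHONE = Pattern.compile("1\\d{10}");

    static boolean isValidPhone(String phone) {
        return phone != null && PHONE.matcher(phone).matches();
    }

    /** Returns a copy of the record that keeps only fields present in the table's column set. */
    static Map<String, String> filterToTable(Map<String, String> record, Set<String> tableColumns) {
        Map<String, String> kept = new LinkedHashMap<>(record);
        kept.keySet().retainAll(tableColumns);  // removes every key that is not a table column
        return kept;
    }
}
```

Given a submitted record with the fields `user`, `phone`, and `csrfToken` and a table whose columns are `user` and `phone`, `filterToTable` returns only the first two fields and leaves the original record unchanged.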
The method of this embodiment converts the data obtained in real time from upstream, which is associated with the WEB page data submission event, into the target data needed for OLTP or OLAP according to the preset cleaning rules. Data conversion includes converting the number of data items, converting data types, aggregating data, concatenating data, and so on. In this method, as soon as data acquired upstream is received, it can be processed as a data stream, which greatly speeds up data processing. Data that has been converted, cleaned, merged, and so on is updated into the storage space of the data warehouse; this way of processing better supports the various query and service requests for real-time data and historical data.
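Data type conversion to a preset text format can be sketched with the standard `java.time` and `java.math` classes. The source format `yyyy/MM/dd`, the ISO target format, and the two-decimal amount format are assumptions chosen for illustration, as is the class name `TypeConversion`.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical conversions of date and currency values to preset text formats. */
final class TypeConversion {
    // Assumed format in which dates arrive from the WEB page.
    private static final DateTimeFormatter SOURCE_DATE = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    /** Converts a date such as "2018/04/20" to ISO text, "2018-04-20". */
    static String normalizeDate(String raw) {
        return LocalDate.parse(raw, SOURCE_DATE).format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    /** Removes thousands separators and fixes the amount to two decimal places. */
    static String normalizeAmount(String raw) {
        return new BigDecimal(raw.replace(",", ""))
                .setScale(2, RoundingMode.HALF_UP)
                .toPlainString();
    }
}
```

Input that does not match the assumed formats raises `DateTimeParseException` or `NumberFormatException`, which a cleaning step could treat as invalid data to be deleted.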
In one specific application scenario, a data submission event is captured by monitoring the action of clicking a link in a WEB page, and the location of the data file, together with information on how the browser should handle that data file, is parsed in real time from the configuration information and the URL. The corresponding dependent rule is determined according to the operation information obtained by parsing the configuration information, thereby determining which real-time processing should be applied to the data information in the data file. For example, the operation information obtained by parsing determines the following sequence of real-time data processing steps:
First, the data is cleaned. The purpose of data cleaning is to encrypt and/or decrypt the data file, to delete invalid data through data validity checks, to convert data types according to preset rules so that the data is consistent, and so on;
Next, the data is filtered according to the pre-established data model. The purpose of data filtering is to filter the data in the data file according to the table structure in the data warehouse, extracting only the fields involved in the data warehouse tables and thereby keeping only the useful target data.
Finally, the processed target data is stored simultaneously in the real-time data storage area and the historical data storage area.
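The scenario above, from the captured event to the simultaneous write, can be sketched as follows. This is a minimal illustration under stated assumptions: the configuration information is assumed to be JSON with hypothetical `operation` and `key_field` entries, the data information is assumed to sit in the URL query string, and plain dictionaries stand in for the two storage areas.

```python
import json
from urllib.parse import urlsplit, parse_qsl


def parse_submit_event(config_text, url):
    """Return (operation_info, data_info) for one captured submit event."""
    # Assumption: the page's configuration information is a JSON document.
    operation_info = json.loads(config_text)
    # Assumption: the data information is carried in the URL query string.
    data_info = dict(parse_qsl(urlsplit(url).query))
    return operation_info, data_info


def apply_to_both_stores(operation_info, data_info, realtime_store, history_store):
    """Perform the indicated operation on both storage areas in the same call."""
    key = data_info[operation_info["key_field"]]
    for store in (realtime_store, history_store):
        if operation_info["operation"] == "insert":
            store[key] = dict(data_info)
        elif operation_info["operation"] == "delete":
            store.pop(key, None)
        else:
            raise ValueError("unsupported operation")


# Dictionaries stand in for the real-time and historical data storage areas.
realtime, history = {}, {}
op, data = parse_submit_event(
    '{"operation": "insert", "key_field": "order_id"}',
    "https://shop.example/submit?order_id=A17&amount=19.90",
)
apply_to_both_stores(op, data, realtime, history)
```

Cleaning and filtering would run between parsing and the write; they are left out here to keep the focus on the parsing step and the dual write.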
Because the dependent rules determined for the data flow differ according to the operation information and data information, the corresponding real-time processing steps may also differ. The database developer can adapt the design to actual needs and configure different combinations and orders of steps.
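One way to make the combination and order of steps configurable is a lookup table from operation to an ordered list of step names. The sketch below is only an assumption about how such a configuration might look; the step functions, the `DEPENDENT_RULES` table, and the string-reversal stand-in for decryption are all hypothetical.

```python
def decrypt(record):
    # Stand-in for real decryption: reverse the string stored under "payload".
    if "payload" in record:
        return {**record, "payload": record["payload"][::-1]}
    return record


def drop_empty(record):
    # Remove fields submitted with an empty value.
    return {k: v for k, v in record.items() if v != ""}


def cast_amount(record):
    # Data type conversion for one assumed numeric field.
    if "amount" in record:
        return {**record, "amount": float(record["amount"])}
    return record


STEPS = {"decrypt": decrypt, "drop_empty": drop_empty, "cast_amount": cast_amount}

# Hypothetical dependent rules: each operation selects its own ordered steps.
DEPENDENT_RULES = {
    "insert": ["decrypt", "drop_empty", "cast_amount"],
    "update": ["drop_empty", "cast_amount"],
    "delete": [],
}


def process(operation, record):
    """Run the steps configured for this operation, in the configured order."""
    for name in DEPENDENT_RULES[operation]:
        record = STEPS[name](record)
    return record


inserted = process("insert", {"payload": "olleh", "amount": "2.5", "note": ""})
updated = process("update", {"order_id": "A17", "amount": "5", "note": ""})
deleted = process("delete", {"order_id": "A17"})
```

Changing the processing for an operation then means editing one list in `DEPENDENT_RULES` rather than the processing code.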
Optionally, when the data warehouse is initialized, the existing original data in the business database is loaded into the historical data storage area according to a predetermined rule. For example, the existing original data in the business database can be loaded into the historical data storage area in a single pass, so that the newly established data warehouse can also provide results for queries and/or services on historical data from before the data warehouse was initialized.
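A one-time initial load of this kind might look like the sketch below. It uses in-memory SQLite databases from Python's standard `sqlite3` module as stand-ins for the business database and the historical data storage area; the `orders` table and its columns are hypothetical.

```python
import sqlite3


def initial_load(business_db, history_db):
    """Copy all existing rows of the assumed orders table in a single pass."""
    rows = business_db.execute("SELECT id, amount FROM orders").fetchall()
    history_db.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
    history_db.commit()
    return len(rows)


# Stand-in for the business database, holding data from before initialization.
business = sqlite3.connect(":memory:")
business.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount REAL)")
business.executemany("INSERT INTO orders VALUES (?, ?)", [("A1", 10.0), ("A2", 25.5)])

# Stand-in for the historical data storage area of the new data warehouse.
history = sqlite3.connect(":memory:")
history.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount REAL)")

loaded = initial_load(business, history)
```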
Optionally, the method further includes:
receiving, at any time, a request for a query and/or service on real-time data;
providing the result of the query and/or service according to the data stored in the real-time data storage area.
Optionally, the method further includes:
receiving, at any time, a request for a query and/or service on historical data;
providing the result of the query and/or service according to the data stored in the historical data storage area.
Newly added data is updated synchronously in the historical data storage area and the real-time data storage area. The two areas are completely decoupled: the real-time data storage area retains only the data within a configured time period, while the historical data is all obtained directly from upstream and has no interaction with the real-time data storage area. Thus, while data query accuracy is guaranteed, the efficiency of real-time data queries is improved, and real-time data queries and historical data queries are supported at the same time. This completely avoids the query contention caused when real-time data import and real-time data queries run simultaneously.
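The relationship between the two storage areas can be sketched as follows. This is an illustrative model, not the patent's storage design: the class `DualStore`, its in-memory lists, and the use of caller-supplied numeric timestamps are assumptions.

```python
class DualStore:
    """Two independent areas that receive every new record in the same call."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.realtime = []  # (timestamp, record) pairs inside the retention window
        self.history = []   # (timestamp, record) pairs, never expired

    def write(self, timestamp, record):
        # Both areas are fed from upstream; neither one reads from the other.
        self.realtime.append((timestamp, record))
        self.history.append((timestamp, record))
        self._expire(timestamp)

    def _expire(self, now):
        # The real-time area keeps only data within the configured period.
        cutoff = now - self.retention_seconds
        self.realtime = [(ts, r) for ts, r in self.realtime if ts >= cutoff]

    def query_realtime(self):
        return [r for _, r in self.realtime]

    def query_history(self):
        return [r for _, r in self.history]


store = DualStore(retention_seconds=60)
store.write(0, {"id": 1})
store.write(30, {"id": 2})
store.write(100, {"id": 3})
recent = store.query_realtime()
everything = store.query_history()
```

Because expiry in the real-time area simply discards old entries, nothing ever has to be moved from the real-time area into the historical area.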
An embodiment of the present application discloses a data processing device. As shown in FIG. 4, the device 400 includes a WEB page data configuration parsing module 401 and a data processing module 402. The WEB page data configuration parsing module 401 is configured to monitor WEB page data submission events and, according to a monitored WEB page data submission event, obtain in real time the operation information and data information associated with that event. The data processing module 402 is configured to synchronously perform, in the real-time data storage area and the historical data storage area, the operation indicated by the operation information on the data indicated by the data information. The data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
The device provided by the above embodiment can obtain the associated data in real time by actively capturing WEB page data submission events, abandoning the batch processing approach used by traditional ETL and truly meeting the real-time requirements for data. Besides historical data, the historical data storage area also contains all the newly added data present in the real-time data storage area, and it is used to provide queries and services on historical data. Therefore, when queries and services on historical data are performed, there is no need to import the newly added data from the real-time data storage area into the historical data storage area; that is, the historical data storage area and the real-time data storage area are completely decoupled.
An embodiment of the present application discloses a data processing device. As shown in FIG. 5, the device 500 includes a WEB page data configuration parsing module 501, a data processing module 502, and a data operation module 503. The WEB page data configuration parsing module 501 is configured to monitor WEB page data submission events and, according to a monitored WEB page data submission event, obtain in real time the operation information and data information associated with that event. The data processing module 502 is configured to synchronously perform, in the real-time data storage area and the historical data storage area, the operation indicated by the operation information on the data indicated by the data information. The data operation module 503 is configured to, after the operation information and data information associated with the WEB page data submission event are obtained, determine a preset dependent rule according to the operation information and process the data information in real time according to the preset dependent rule. The data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
In the embodiments of the present application, newly added data may be obtained in an event-based incremental manner: each link-click data submission event corresponds to one new data record, and the events are not necessarily related to one another. The corresponding dependent rule is determined according to the operation information carried in the configuration information, thereby determining which real-time processing should be applied to the newly obtained data information, for example data cleaning rules such as data encryption and data truncation. After cleaning and conversion, the data is synchronized to the corresponding storage modules of the data warehouse. The device of this embodiment can convert the data obtained in real time from upstream, associated with the WEB page data submission event, into the target data required for OLTP or OLAP according to the preset cleaning rules. Data conversion includes: conversion of the number of data items, conversion of data types, data aggregation calculations, data concatenation, and the like. As soon as the data operation module 503 receives data obtained upstream, it can process the data promptly as a data stream, which greatly improves the efficiency of data processing. After data conversion, data cleaning, data merging, and so on, the processed data is updated into the storage space of the data warehouse. This processing approach can better support the various query and service requests for real-time data and historical data.
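The four kinds of data conversion listed above can each be illustrated with a small function. The patent does not define these conversions in detail, so the interpretations below, along with every field name, are assumptions made for illustration.

```python
def split_items(record):
    """Conversion of the number of data items: one submission becomes one row per item."""
    return [{"order_id": record["order_id"], "item": item}
            for item in record["items"].split(",")]


def convert_types(record):
    """Conversion of data types: submitted strings become numbers."""
    return {**record, "quantity": int(record["quantity"]), "price": float(record["price"])}


def summarize(records):
    """Data aggregation calculation: total value across typed rows."""
    return sum(r["quantity"] * r["price"] for r in records)


def splice(record):
    """Data concatenation: join two submitted fields into one target field."""
    return {**record, "region": record["province"] + "-" + record["city"]}


rows = split_items({"order_id": "A17", "items": "pen,ink"})
typed = [convert_types(r) for r in (
    {"order_id": "A17", "quantity": "2", "price": "1.50"},
    {"order_id": "A17", "quantity": "1", "price": "4.00"},
)]
total = summarize(typed)
joined = splice({"province": "Zhejiang", "city": "Hangzhou"})
```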
Optionally, the device 500 further includes a data initialization module configured to load original data into the historical data storage area according to a predetermined rule. The newly established data warehouse can thus also provide results for queries and/or services on historical data from before the data warehouse was initialized.
Optionally, the device further includes an interface module configured to receive a request for a query and/or service on real-time data, and to provide the result of the query and/or service on real-time data according to the data stored in the real-time data storage area. Optionally, the interface module is further configured to receive a request for a query and/or service on historical data, and to provide the result of the query and/or service on historical data according to the data stored in the historical data storage area. Newly added data is updated synchronously in the historical data storage area and the real-time data storage area. The two areas are completely decoupled: the real-time data storage area retains only the data within a configured time period, while the historical data is all obtained directly from upstream and has no interaction with the real-time data storage area. Thus, while data query accuracy is guaranteed, the efficiency of real-time data queries is improved, and real-time data queries and historical data queries are supported at the same time. This completely avoids the query contention caused when real-time data import and real-time data queries run simultaneously.
An embodiment of the present application provides a computing device 600, shown in FIG. 6, which includes, but is not limited to, a memory 601, a processor 602, and computer instructions stored in the memory 601 and executable on the processor 602. When the processor 602 executes the instructions, the data processing method described above is implemented.
The above is an exemplary scheme of the computing device of this embodiment. It should be noted that the technical solution of the computing device 600 and the technical solution of the data processing method described above belong to the same concept. For details not described in the technical solution of the computing device, refer to the description of the technical solution of the data processing method above.
An embodiment of the present application provides a storage medium on which computer instructions are stored. When the instructions are executed by a processor, the data processing method described above is implemented.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, and so on. It should be noted that the content covered by the computer-readable medium may be expanded or reduced as appropriate according to the requirements of legislation and patent practice in a jurisdiction. For example, in certain jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above is an exemplary scheme of the readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the data processing method described above belong to the same concept. For details not described in the technical solution of the storage medium, refer to the description of the technical solution of the data processing method above.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.
The preferred embodiments of the present application disclosed above are intended only to help explain the present application. The optional embodiments do not describe every detail exhaustively, nor do they limit the application to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were chosen and described in detail in order to better explain the principles and practical applications of the present application, so that those skilled in the art can well understand and use it. The present application is limited only by the claims and their full scope and equivalents.
Claims (16)
1. A data processing method, characterized in that the method comprises:
monitoring WEB page data submission events, and obtaining in real time, according to a monitored WEB page data submission event, operation information and data information associated with the WEB page data submission event;
synchronously performing, in a real-time data storage area and a historical data storage area, the operation indicated by the operation information on the data indicated by the data information, wherein the data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
2. The method according to claim 1, characterized in that obtaining in real time the data information and operation information associated with the WEB page data submission event comprises:
obtaining the configuration information and the URL string of the WEB page in which the data submission event was detected;
parsing the configuration information to obtain the operation information;
parsing the URL string to obtain the data information.
3. The method according to claim 1 or 2, characterized in that the WEB page data submission event comprises:
a WEB page data submission event of the button-click type; or
a WEB page data submission event of the link-click type.
4. The method according to claim 1 or 2, characterized in that, after the operation information and data information associated with the WEB page data submission event are obtained, the method further comprises:
determining a preset dependent rule according to the operation information;
processing the data information in real time according to the preset dependent rule.
5. The method according to claim 4, characterized in that the real-time processing comprises:
cleaning the data information according to a preset cleaning rule.
6. The method according to claim 4, characterized in that the real-time processing further comprises:
filtering the data information according to a pre-established data model.
7. The method according to claim 1 or 2, characterized in that the method further comprises:
loading original data into the historical data storage area according to a predetermined rule.
8. The method according to claim 1 or 2, characterized in that the method further comprises:
receiving a request for a query and/or service on real-time data;
providing the result of the query and/or service on real-time data according to the data stored in the real-time data storage area.
9. The method according to claim 1 or 2, characterized in that the method further comprises:
receiving a request for a query and/or service on historical data;
providing the result of the query and/or service on historical data according to the data stored in the historical data storage area.
10. A data processing device, characterized in that the device comprises a WEB page data configuration parsing module and a data processing module, wherein the WEB page data configuration parsing module is configured to monitor WEB page data submission events and to obtain in real time, according to a monitored WEB page data submission event, operation information and data information associated with the WEB page data submission event; and the data processing module is configured to synchronously perform, in a real-time data storage area and a historical data storage area, the operation indicated by the operation information on the data indicated by the data information, wherein the data stored in the real-time data storage area is used to provide queries and services on real-time data, and the data stored in the historical data storage area is used to provide queries and services on historical data.
11. The device according to claim 10, characterized in that the device further comprises a data operation module configured to, after the operation information and data information associated with the WEB page data submission event are obtained, determine a preset dependent rule according to the operation information and process the data information in real time according to the preset dependent rule.
12. The device according to claim 10 or 11, characterized in that the device further comprises a data initialization module configured to load original data into the historical data storage area according to a predetermined rule.
13. The device according to claim 10 or 11, characterized in that the device further comprises an interface module configured to receive a request for a query and/or service on real-time data, and to provide the result of the query and/or service on real-time data according to the data stored in the real-time data storage area.
14. The device according to claim 13, characterized in that the interface module is further configured to receive a request for a query and/or service on historical data, and to provide the result of the query and/or service on historical data according to the data stored in the historical data storage area.
15. A computing device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, characterized in that the processor implements the data processing method according to any one of claims 1 to 9 when executing the instructions.
16. A computer-readable storage medium on which computer instructions are stored, characterized in that the instructions implement the data processing method according to any one of claims 1 to 9 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810361696.9A CN108549714B (en) | 2018-04-20 | 2018-04-20 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108549714A (en) | 2018-09-18 |
CN108549714B (en) | 2020-12-11 |
Family
ID=63512031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810361696.9A (Expired - Fee Related; CN108549714B (en)) | Data processing method and device | 2018-04-20 | 2018-04-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108549714B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109871378A (en) * | 2019-02-21 | 2019-06-11 | 杭州市商务委员会(杭州市粮食局) | The data acquisition and processing (DAP) method and system of big data platform |
CN110781188A (en) * | 2019-10-23 | 2020-02-11 | 泰康保险集团股份有限公司 | Form information processing method and device, electronic equipment and storage medium |
US20210218828A1 (en) * | 2018-09-30 | 2021-07-15 | Huawei Technologies Co., Ltd. | Method for Starting Application Client, Service Server, and Client Device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101667205A (en) * | 2009-09-28 | 2010-03-10 | 河南电力试验研究院 | Method for memorizing real time measure point data facing quick review |
CN102637197A (en) * | 2012-02-28 | 2012-08-15 | 中北大学 | File management method of real-time data acquisition and storage system |
CN102646130A (en) * | 2012-03-12 | 2012-08-22 | 华中科技大学 | Method for storing and indexing mass historical data |
CN103957248A (en) * | 2014-04-21 | 2014-07-30 | 中国科学院软件研究所 | Public real-time data management cloud service platform based on Internet of Things |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210218828A1 (en) * | 2018-09-30 | 2021-07-15 | Huawei Technologies Co., Ltd. | Method for Starting Application Client, Service Server, and Client Device |
CN109871378A (en) * | 2019-02-21 | 2019-06-11 | 杭州市商务委员会(杭州市粮食局) | The data acquisition and processing (DAP) method and system of big data platform |
CN110781188A (en) * | 2019-10-23 | 2020-02-11 | 泰康保险集团股份有限公司 | Form information processing method and device, electronic equipment and storage medium |
CN110781188B (en) * | 2019-10-23 | 2022-09-02 | 泰康保险集团股份有限公司 | Form information processing method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108549714B (en) | 2020-12-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
He et al. | Drain: An online log parsing approach with fixed depth tree | |
CN101937469B (en) | Information capture method of video website | |
CN102667761B (en) | Scalable cluster database | |
CN103810224B (en) | information persistence and query method and device | |
CN105320740B (en) | The acquisition methods and acquisition system of wechat article and public platform | |
CN108549714A (en) | A kind of data processing method and device | |
WO2021114454A1 (en) | Method and apparatus for detecting crawler request | |
CN109284435B (en) | Internet-oriented user interaction trace capturing, storing and retrieving system and method | |
CN105468737A (en) | Web service big data analysis method, cloud computing platform and mining system | |
CN110716950B (en) | Caliber system establishment method, caliber system establishment device, caliber system establishment equipment and computer storage medium | |
CN111523072A (en) | Page access data statistical method and device, electronic equipment and storage medium | |
CN110727663A (en) | Data cleaning method, device, equipment and medium | |
CN108647357A (en) | The method and device of data query | |
CN113420026B (en) | Database table structure changing method, device, equipment and storage medium | |
CN113259467B (en) | Webpage asset fingerprint tag identification and discovery method based on big data | |
CN112084270A (en) | Data blood margin processing method and device, storage medium and equipment | |
CN106603690A (en) | Data analysis device, data analysis processing system and data analysis method | |
CN110968571A (en) | Big data analysis and processing platform for financial information service | |
CN113656673A (en) | Master-slave distributed content crawling robot for advertisement delivery | |
CN103399968B (en) | A kind of micro-blog information acquisition method and system | |
CN115017182A (en) | Visual data analysis method and equipment | |
CN103248511A (en) | Analyses method, device and system for single-point service performance | |
CN109214640B (en) | Method and device for determining index result and computer readable storage medium | |
CN112671878B (en) | Block chain information subscription method, device, server and storage medium | |
CN107679097A (en) | A kind of distributed data processing method, system and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20201211 |