CN111552696A

CN111552696A - Data processing method and device based on big data, computer equipment and medium

Info

Publication number: CN111552696A
Application number: CN202010185774.1A
Authority: CN
Inventors: 郑如刚
Original assignee: OneConnect Financial Technology Co Ltd Shanghai
Current assignee: OneConnect Smart Technology Co Ltd; OneConnect Financial Technology Co Ltd Shanghai
Priority date: 2020-03-16
Filing date: 2020-03-16
Publication date: 2020-08-18

Abstract

The application belongs to the field of data processing and discloses a data processing method and device based on big data, computer equipment, and a readable storage medium. The method comprises: receiving data information sent by a user terminal; numbering the data information to obtain the number information of each piece of data information, and storing the numbered data information in a database as data to be copied; copying the data to be copied into a message queue cluster as data to be packaged; receiving a data acquisition request sent by a request terminal, and packaging the data to be packaged according to the data acquisition request to obtain data to be verified; querying, as comparison data, the data to be copied whose number information is consistent with that of the data to be verified; and comparing the data to be verified with the comparison data for consistency to obtain a comparison result, and processing the data to be verified according to the comparison result. The method addresses the technical problem that data information read from the message queue cluster may be inconsistent with the data information in the database.

Description

Data processing method and device based on big data, computer equipment and medium

Technical Field

The present application relates to the field of data processing, and in particular, to a data processing method and apparatus based on big data, a computer device, and a storage medium.

Background

With the development of cloud computing and big data, high-concurrency, high-load data processing is increasingly common, and more and more data is transmitted between systems. In conventional data transmission, data is passed step by step: one system must finish its operation before the data is transmitted to the next system, so data processing between systems takes a long time and timely replies cannot be achieved. Message queues are created by applications and serve as a technique for exchanging information between distributed applications. A message queue can reside on disk or in memory and stores messages until they are read by a system. With a message queue, each system can execute its operations independently and multiple operations can proceed simultaneously without step-by-step transmission, which saves data transmission time and shortens data processing time considerably. However, the data read from the message queue may not be the original data, so the data information read from the message queue cluster may be inconsistent with the data information in the database.

Disclosure of Invention

In view of the above technical problem, the present application provides a data processing method, an apparatus, a computer device and a storage medium based on big data, so as to solve the technical problem in the prior art that data information read from a message queue cluster is inconsistent with data information in a database.

A big data based data processing method, the method comprising:

receiving data information sent by a user terminal;

numbering the data information to obtain the numbering information of each data information, and storing the data information after numbering as data to be copied in a database;

copying the data to be copied into a message queue cluster as data to be packaged;

receiving a data acquisition request sent by a request terminal, and packaging the data to be packaged according to the data acquisition request to obtain the data to be verified;

querying, as comparison data, the data to be copied whose number information is consistent with that of the data to be verified;

and carrying out consistency comparison on the data to be verified and the comparison data to obtain a comparison result, and processing the data to be verified according to the comparison result.

A big-data based data processing apparatus, the apparatus comprising:

the receiving module is used for receiving data information sent by a user terminal;

the numbering module is used for numbering the data information to obtain the numbering information of each data information, and storing the data information after numbering as data to be copied in a database;

the pending module is used for copying the data to be copied to the message queue cluster as data to be packaged;

the packaging module is used for receiving a data acquisition request sent by a request terminal and packaging the data to be packaged according to the data acquisition request to obtain the data to be verified;

the query module is used for querying, as comparison data, the data to be copied whose number information is consistent with that of the data to be verified;

and the comparison module is used for carrying out consistency comparison on the data to be verified and the comparison data to obtain a comparison result and processing the data to be verified according to the comparison result.

A computer device comprises a memory and a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the big data based data processing method when executing the computer program.

A computer-readable storage medium, which stores a computer program that, when executed by a processor, implements the steps of the above-described big-data based data processing method.

According to the data processing method, the data processing device, the computer equipment and the storage medium based on big data, the received data information is numbered according to its size and receiving time, which ensures that each row of data information in the database is unique. The number information of the data to be verified in the message queue cluster then makes it easy to find the corresponding data information in the database, so that data information with the same number information in different storage locations can be compared for consistency. The data information in the message queue cluster is further processed according to the comparison result, which solves the technical problem in the prior art that the data information read from the message queue cluster is inconsistent with the data information in the database.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive labor.

FIG. 1 is a schematic diagram of an application environment of a big data-based data processing method;

FIG. 2 is a schematic flow chart of a big data based data processing method;

FIG. 3 is a schematic flow chart of step 204 in FIG. 2;

FIG. 4 is a schematic flow chart of step 302 in FIG. 3;

FIG. 5 is a schematic flow chart of step 304 in FIG. 3;

FIG. 6 is a schematic flow chart of step 206 in FIG. 2;

FIG. 7 is a schematic flow chart of step 212 in FIG. 2;

FIG. 8 is another flowchart of step 212 of FIG. 2;

FIG. 9 is a schematic diagram of a big data based data processing apparatus;

FIG. 10 is a diagram of a computer device in one embodiment.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The data processing method based on big data provided by the embodiment of the invention can be applied to the application environment shown in FIG. 1. The application environment may include terminals 102 and 106, a server 104, and a network. The network provides a medium for communication links between the terminals 102, 106 and the server 104, and may include various connection types, such as wired links, wireless communication links, or fiber optic cables.

The user may use the terminals 102, 106 to interact with the server 104 over the network to receive or send messages. The terminal 102 may have various communication client applications installed, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, and social platform software.

The terminals 102, 106 may be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, and desktop computers.

The server 104 may be a server that provides various services, such as a background server that provides support for pages displayed on the terminals 102, 106.

It should be noted that the data processing method based on big data provided in the embodiments of the present application is generally executed by a server/terminal, and accordingly, the data processing apparatus based on big data is generally disposed in the server/terminal device.

It should be understood that the number of terminals, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

The terminal 102 communicates with the server 104 through the network. The server 104 receives the data information sent by the terminal 102, numbers the data information according to the time at which it was received and its size, stores the numbered data information in the database, and copies it from the database to the message queue cluster as the data to be packaged. The server 104 then obtains the data to be verified according to the key information sent by the request terminal, compares the data to be verified with the corresponding data information in the database to obtain a comparison result, and processes the data to be verified according to the comparison result. The terminals 102 and 106 and the server 104 are connected through a network, which may be a wired network or a wireless network. The terminals 102 and 106 may be, but are not limited to, personal computers, notebook computers, smart phones, tablet computers and portable wearable devices, and the server 104 may be implemented as an independent server or as a server cluster formed by a plurality of servers.

In one embodiment, as shown in FIG. 2, a data processing method based on big data is provided. Taking the application of the method to the server in FIG. 1 as an example, the method includes the following steps:

step 202, receiving data information sent by the user terminal.

The data information sent by the user terminal may be a massive data stream, such as service logs, monitoring data, and user behavior data. These data streams are collected and aggregated in real time or in batches and then sent to the server.

Step 204, numbering the data information to obtain the number information of each piece of data information, and storing the numbered data information in a database as data to be copied.

Numbering the received data information ensures that each piece of data information in the database is unique and facilitates subsequent data queries. For example, after the server receives a piece of document information, it may number the document information according to the numbering conditions and then store the numbered document information in the database. The number information for a piece of data may be, for example: Z20191115041645.

Step 206, copying the data to be copied into the message queue cluster as the data to be packaged.

After the data information is numbered, the numbered data information is copied to the message queue cluster. A request terminal can subsequently request the corresponding data information directly from the message queue cluster of the server, and processing and serving the data information through the message queue cluster saves data processing time.

Step 208, receiving a data acquisition request sent by the request terminal, and packaging the data to be packaged according to the data acquisition request to obtain the data to be verified.

After receiving the data acquisition request sent by the request terminal, the server packages the data to be packaged according to the request. The packaged data information can be accessed through an interface on the server, which saves storage space and shortens the time for transmitting the data information from the message queue cluster to the request terminal.

Step 210, querying, as comparison data, the data to be copied whose number information is consistent with that of the data to be verified.

Data information in the message queue cluster may change for various reasons. To ensure that the data to be verified sent to the request terminal is consistent with the original data information, the data information corresponding to the number information of the data to be verified needs to be queried from the database. Although the data information is copied from the database to the message queue cluster and packaged, its number information remains unchanged, so as long as the number information is the same, the contents of the data information in the database and in the message queue cluster should in theory be consistent. The number information thus makes it easy to retrieve the same data stored in different places.

Step 212, comparing the data to be verified with the comparison data for consistency to obtain a comparison result, and processing the data to be verified according to the comparison result.

The comparison data obtained from the database is then compared with the data to be verified to obtain a comparison result, which may be either consistent or inconsistent. Further, the field values of the same fields in the comparison data and in the data to be verified can be obtained by traversal and compared one by one. If the field values of all fields are the same, the data to be verified is consistent with the original data information. Otherwise, the data to be verified has been changed, and developers need to investigate the cause of the data error, or the corresponding original data information needs to be obtained again from the database and packaged, before the data information is sent to the request terminal.
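The field-by-field traversal described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the application: records are assumed to be plain Python dictionaries, and the names `mismatched_fields`, `is_consistent`, and the sample field names are hypothetical.

```python
def mismatched_fields(to_verify: dict, comparison: dict) -> list[str]:
    """Return the names of fields whose values differ between the two records."""
    # Traverse every field that appears in either record.
    fields = sorted(set(to_verify) | set(comparison))
    # Sentinel so that a field missing on one side never equals a real value.
    missing = object()
    return [
        name
        for name in fields
        if to_verify.get(name, missing) != comparison.get(name, missing)
    ]


def is_consistent(to_verify: dict, comparison: dict) -> bool:
    """The records are consistent only if no field differs."""
    return not mismatched_fields(to_verify, comparison)
```

Returning the list of differing fields, rather than only a boolean, is a design choice of this sketch: it gives developers a starting point when they investigate the cause of a data error.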

According to the data processing method based on big data, the received data information is numbered according to its size and receiving time, which ensures that each row of data information in the database is unique. The number information of the data to be verified in the message queue cluster makes it easy to find the corresponding data information in the database, so that data information with the same number information in different storage locations can be compared for consistency. Finally, the data information in the message queue cluster is further processed according to the comparison result, which solves the technical problem in the prior art that the data information read from the message queue cluster is inconsistent with the data information in the database.

In one embodiment, as shown in FIG. 3, step 204 comprises:

step 302, detecting the size of the data information to obtain the size indicator of the data information.

Because different pieces of data information differ in size, representing the size of each piece of data information with a size indicator classifies the data information by size, which makes the data information easier to locate when it is subsequently queried and processed. For example, the symbol Z is used as the size indicator of data information whose size is not larger than 100KB.

Step 304, generating the number information of the data information according to the size indicator and the receiving time of the data information.

Each piece of data information has a receiving time, which is the time when the server receives the data information. Using the size indicator and the receiving time as the number information allows the data information to be classified and queried by size, and allows data with the same or similar receiving times to be classified and stored together.

For example, if the size of the currently received data information is detected to be 200KB and the receiving time is 20190312150553, the number information generated by appending the receiving time to the size indicator is: Y20190312150553.

Further, if the receiving times of at least two pieces of data information in the same size range fall within a time T, sequence numbers are generated for the data information within the time T according to the quantity of data information within the time T.

Multiple pieces of data information may arrive at the same or similar receiving times. For example, when T is 1s, multiple pieces of data may be received within 1s. If these pieces of data information are in the same size range, they cannot be clearly distinguished because their receiving times are similar or identical, so a sequence number may be generated for each piece according to the quantity of data information received within the time period T, for example: Y20190312151230001, Y20190312151230002, Y20190312151230003, and so on.

According to the embodiment, the number information of the data information is generated according to the size and the receiving time of the data information, the data information is conveniently classified and stored or inquired according to the classification, the time for inquiring the data by the server is saved, and the data processing efficiency of the server is improved.

In one embodiment, as shown in FIG. 4, step 302 comprises:

step 402, analyzing the size of the data information.

A size range of the data information is then obtained.

Step 404, if the size of the data information is not greater than 100KB, the size indicator is Z.

Step 406, if the size of the data information is greater than 100KB and less than 1M, the size indicator is Y.

Step 408, if the size of the data information is not smaller than 1M, the size indicator is X.

In this embodiment, when the size of the data information is detected to be not greater than 100KB, the number of the data information is Z plus the receiving time of the data information; when the size is detected to be greater than 100KB and less than 1M, the number is Y plus the receiving time; when the size is detected to be not less than 1M, the number is X plus the receiving time.

For example, when the size of the current data information is 153KB and the receiving time is 20190312151230, the number is Y20190312151230.

In this embodiment, the number information of data information in different size ranges begins with different symbols, so that data information in different size ranges can be clearly distinguished. A user can tell the size range of a piece of data information at a glance from its number information and use it to judge whether the data information is the data required.
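As a minimal sketch of the size classification in steps 402 to 408 combined with the receiving time, the following Python functions build a number such as Y20190312151230. The function names are hypothetical, and the sketch assumes that 1 KB is 1024 bytes and that "1M" means 1 MB, neither of which is stated in the embodiment.

```python
from datetime import datetime

KB = 1024        # assumption: 1 KB = 1024 bytes
MB = 1024 * KB   # assumption: "1M" in the text means 1 MB


def size_indicator(size_bytes: int) -> str:
    """Map a data size to the size indicator Z, Y, or X."""
    if size_bytes <= 100 * KB:   # step 404: not greater than 100KB
        return "Z"
    if size_bytes < MB:          # step 406: greater than 100KB and less than 1M
        return "Y"
    return "X"                   # step 408: not smaller than 1M


def number_info(size_bytes: int, received_at: datetime) -> str:
    """Concatenate the size indicator and the receiving time (YYYYMMDDhhmmss)."""
    return size_indicator(size_bytes) + received_at.strftime("%Y%m%d%H%M%S")
```

For instance, `number_info(153 * KB, datetime(2019, 3, 12, 15, 12, 30))` reproduces the number from the 153KB example above.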

In one embodiment, as shown in FIG. 5, step 304 comprises:

step 502, if the receiving time of at least two data messages with the same size range is within T time, generating a serial number for the data messages within T time according to the quantity of the data messages within T time;

and step 504, generating number information of the data information according to the serial number, the size reference symbol and the receiving time.

Duplicate data information may arrive within a short time, and the sequence number distinguishes such data information. The sequence number is generated according to the quantity of data information received within the time T, from large to small, and the number information of the data information is generated from the sequence number, the size indicator and the receiving time. In this way the data information can be distinguished and is not confused because of repeated number information.
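The sequence numbering of steps 502 and 504 might look like the sketch below. It makes several assumptions that the embodiment leaves open: T is one second, so data information with the same size indicator and the same formatted receiving time shares a prefix; the whole batch is known before numbers are assigned; the sequence number is a three-digit suffix issued in arrival order; and the function name `assign_numbers` is hypothetical.

```python
from collections import Counter, defaultdict


def assign_numbers(prefixes: list[str]) -> list[str]:
    """Append a sequence number to prefixes that occur at least twice.

    Each prefix is a size indicator plus receiving time, e.g. 'Y20190312151230'.
    """
    totals = Counter(prefixes)      # how many items share each prefix within T
    issued = defaultdict(int)       # sequence numbers handed out so far
    numbers = []
    for prefix in prefixes:
        if totals[prefix] < 2:
            # A single item within T is already unique, so no suffix is needed.
            numbers.append(prefix)
            continue
        issued[prefix] += 1
        numbers.append(f"{prefix}{issued[prefix]:03d}")
    return numbers
```

A streaming server would not know the final count in advance, so a production variant would probably always append a counter. The batch form is used here only because it mirrors the "at least two" condition in step 502.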

In one embodiment, as shown in FIG. 6, step 206 comprises:

step 602, reading the data acquisition request to obtain the key information. The key information may be a keyword in the data information, such as the age of the user.

Step 604, extracting the data to be packaged corresponding to the key information from the message queue cluster, and packaging the extracted data to generate a data class as the data to be verified. The key information in the data acquisition request is obtained and used to query the message queue cluster, so that the required data information is quickly located and extracted, and the data information is packaged into a data class as the data to be verified. The data class is a data type that hides the attributes and implementation details of the data information object and controls the access level at which the server can read and modify it, which ensures that the data to be verified is not tampered with before being sent to the request terminal.

In this embodiment, the data information obtained according to the key information is packaged, which ensures that the attribute data in the data to be verified is not tampered with before the data to be verified is sent to the request terminal.
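One way to picture the packaging step is a read-only wrapper around each extracted record. The sketch below is only an illustration under assumptions: the message queue cluster is stood in for by a dictionary keyed by number information, the key information is treated as a field name, and `DataToVerify` and `encapsulate` are hypothetical names.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class DataToVerify:
    """Packaged record: attributes can be read but not reassigned."""
    number: str
    fields: MappingProxyType  # read-only view of the extracted attributes


def encapsulate(queue_records: dict, key: str) -> list[DataToVerify]:
    """Wrap every queued record that contains the requested key."""
    return [
        # dict(record) copies the record so later queue changes do not leak in.
        DataToVerify(number, MappingProxyType(dict(record)))
        for number, record in queue_records.items()
        if key in record
    ]
```

Attempting `item.fields["age"] = 31` on a packaged item raises `TypeError`, and reassigning `item.number` raises `dataclasses.FrozenInstanceError`, which approximates the access control the embodiment attributes to the data class.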

In one embodiment, as shown in FIG. 7, step 212 includes:

step 702, converting the data to be verified into a temporary data file.

The server generates a temporary data file from the data to be verified. Because the packaged data class needs to be transmitted to another system, it is written to a temporary data file and compared with the original data information stored in the database, which ensures that the data information transmitted to the other system is accurate.

Step 704, comparing the consistency of the temporary data file and the comparison data in a step-by-step comparison mode to obtain a comparison result.

The server examines the packaged data information, queries the corresponding data information in the database according to its number information, and compares the two to determine whether they are consistent, thereby obtaining the comparison result.

The comparison result may be that the data information in the database is consistent with the packaged data information, or that it is inconsistent.

According to this embodiment, the data to be verified is compared for consistency with the original data information before it is sent to the request terminal, and whether to send the data to be verified to the request terminal is then determined according to the comparison result, which ensures that the data information transmitted to another system is accurate.
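Steps 702 and 704 can be sketched as writing the data to be verified to a temporary file and then checking it against the comparison data one field at a time. The embodiment does not specify the file format or what "step-by-step" means, so this sketch assumes JSON serialization and a field-by-field comparison. All function names are hypothetical.

```python
import json
import os
import tempfile


def write_temp_file(to_verify: dict) -> str:
    """Step 702: serialize the data to be verified into a temporary JSON file."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as handle:
        json.dump(to_verify, handle, sort_keys=True)
        return handle.name


def compare_with_database(temp_path: str, comparison: dict) -> bool:
    """Step 704: compare the temporary file with the comparison data field by field."""
    with open(temp_path, encoding="utf-8") as handle:
        from_file = json.load(handle)
    for name in sorted(set(from_file) | set(comparison)):
        if name not in from_file or name not in comparison:
            return False  # a field exists on only one side
        if from_file[name] != comparison[name]:
            return False  # the field values differ
    return True


def consistent_via_temp_file(to_verify: dict, comparison: dict) -> bool:
    """Run both steps and always remove the temporary file afterwards."""
    path = write_temp_file(to_verify)
    try:
        return compare_with_database(path, comparison)
    finally:
        os.remove(path)
```

The sketch only handles values that survive a JSON round trip unchanged, such as strings, integers, and booleans.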

In one embodiment, as shown in fig. 8, step 212 further includes:

step 802, analyzing a comparison result;

and step 804, if the comparison result is that the data to be verified is consistent with the comparison data, sending the data to be verified to the request terminal.

If the data information in the database is consistent with the data to be verified obtained by packaging, the data to be verified is sent to the request terminal.

Step 806, if the comparison result is that the data to be verified is inconsistent with the comparison data, removing the data to be packaged corresponding to the data to be verified from the message queue cluster, copying the data to be copied corresponding to the data to be verified from the database into the message queue cluster as the data to be packaged, and re-packaging the data to be verified.

If the data information in the database is inconsistent with the data to be verified, the data to be packaged in the message queue cluster is shown to be damaged and can be deleted. The original data information corresponding to the key information is then copied into the message queue cluster again as the data to be packaged, packaged again into data to be verified, and compared for consistency again. This is repeated until the comparison result shows that the data to be verified is consistent with the comparison data, at which point the data to be verified is sent to the request terminal.

In this embodiment, different processing operations are performed on the data to be verified according to the comparison result. When the data to be verified is damaged, the data information is copied to the message queue cluster again, which ensures that the data information obtained by the request terminal is consistent with the data information initially received by the server.
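The branch between steps 804 and 806 amounts to a compare-and-repair loop, sketched below. The database and the message queue cluster are stood in for by dictionaries keyed by number information, which is an assumption for illustration only. The `max_attempts` limit is also an addition of this sketch, since the embodiment repeats until the data is consistent without naming a bound.

```python
import copy


def fetch_verified(
    number: str, database: dict, queue: dict, max_attempts: int = 3
) -> dict:
    """Return the queued record once it matches the database copy."""
    for _ in range(max_attempts):
        to_verify = queue.get(number)
        comparison = database[number]
        if to_verify == comparison:
            # Step 804: consistent, so the record can go to the request terminal.
            return to_verify
        # Step 806: clear the damaged entry and copy the original in again.
        queue.pop(number, None)
        queue[number] = copy.deepcopy(comparison)
    raise RuntimeError(
        f"record {number} is still inconsistent after {max_attempts} attempts"
    )
```

Using `copy.deepcopy` keeps the queue entry independent of the database record, so a later change to one cannot silently alter the other.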

It should be understood that although the steps in the flowcharts of FIGS. 2-8 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in FIGS. 2-8 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 9, a big-data-based data processing apparatus is provided, which corresponds one-to-one to the big-data-based data processing method in the above-described embodiment. The big data-based data processing device comprises:

a receiving module 902, configured to receive data information sent by a user terminal;

a numbering module 904, configured to number the data information to obtain numbering information of each data information, and store the data information after the numbering process in a database as data to be copied;

a pending module 906, configured to copy data to be copied to the message queue cluster as data to be encapsulated;

the encapsulation module 908 is configured to receive a data acquisition request sent by a request terminal, and encapsulate data to be encapsulated according to the data acquisition request to obtain data to be verified;

the query module 910 is configured to query, as comparison data, the data to be copied whose number information is consistent with that of the data to be verified;

the comparison module 912 is configured to perform consistency comparison on the data to be verified and the comparison data to obtain a comparison result, and process the data to be verified according to the comparison result.

Further, the numbering module 904 includes:

the size detection submodule is used for detecting the size of the data information to obtain a size reference symbol of the data information;

and the number generation submodule is used for generating the number information of the data information according to the size reference symbol and the receiving time of the data information.

Further, the number generation submodule includes:

a sequence number generating unit, configured to generate a sequence number for the data information within the T time according to the number of the data information within the T time if the receiving time of the at least two data information with the same size range is within the T time;

and the number generating unit is used for generating the number information of the data information according to the serial number, the size reference symbol and the receiving time.

Further, the encapsulation module 908 includes:

the request reading submodule is used for reading a data acquisition request to obtain key information;

and the packaging submodule is used for extracting the data to be packaged corresponding to the key information from the message queue cluster, packaging the extracted data to be packaged and generating a data class as the data to be verified.
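
The two submodules above can be pictured with the following sketch. It assumes, purely for illustration, that the key information is the number carried in the request and that the message queue cluster can be stood in for by an in-memory dictionary; `DataToVerify`, `read_key_info` and `encapsulate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DataToVerify:
    """Hypothetical 'data class' wrapping one item pulled from the queue."""
    number: str     # number information assigned when the data was stored
    payload: bytes  # content as currently held in the message queue


def read_key_info(request: Dict[str, str]) -> str:
    # Assumption: the key information is simply the number carried in the request.
    return request["number"]


def encapsulate(request: Dict[str, str], queue: Dict[str, bytes]) -> Optional[DataToVerify]:
    # `queue` is an in-memory stand-in for the message queue cluster.
    key = read_key_info(request)
    payload = queue.get(key)
    if payload is None:
        return None  # nothing in the queue matches the request
    return DataToVerify(number=key, payload=payload)
```

Keeping the number inside the encapsulated object is what later allows the matching row in the database to be looked up.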

Further, the comparison module 912 includes:

the conversion submodule is used for converting the data to be verified into a temporary data file;

and the comparison submodule is used for comparing the consistency of the temporary data file and the comparison data in a step-by-step comparison mode to obtain a comparison result.
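
The text does not define the step-by-step comparison in detail. The sketch below shows one plausible reading, in which the data to be verified is written to a temporary file and then compared with the comparison data in stages: first by length, then chunk by chunk. The chunk size and the function names `to_temp_file` and `compare_stepwise` are assumptions.

```python
import tempfile
from pathlib import Path


def to_temp_file(payload: bytes) -> Path:
    # Write the data to be verified into a temporary file and return its path.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".tmp") as f:
        f.write(payload)
        return Path(f.name)


def compare_stepwise(temp_path: Path, reference: bytes, chunk_size: int = 4096) -> bool:
    # Step 1: data of different lengths can never be consistent.
    if temp_path.stat().st_size != len(reference):
        return False
    # Step 2: compare chunk by chunk, stopping at the first mismatch.
    with temp_path.open("rb") as f:
        offset = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return True  # reached the end without finding a difference
            if chunk != reference[offset:offset + len(chunk)]:
                return False
            offset += len(chunk)
```

Stopping at the first mismatch avoids reading the rest of a large file once an inconsistency is known.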

Further, the comparison module further comprises:

the result analysis submodule is used for analyzing the comparison result;

the loop submodule is used for, if the comparison result is that the data to be verified is inconsistent with the comparison data, clearing the data to be encapsulated corresponding to the data to be verified from the message queue cluster, copying the data to be copied corresponding to the data to be verified from the database into the message queue cluster as new data to be encapsulated, encapsulating it again to obtain the data to be verified, and repeating the consistency comparison until the comparison result is that the data to be verified is consistent with the comparison data;

and the sending submodule is used for sending the data to be verified to the request terminal if the comparison result shows that the data to be verified is consistent with the comparison data.
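
The clear, recopy and re-compare loop described above can be sketched as follows. Dictionaries stand in for the database and the message queue cluster, and a plain equality check stands in for the consistency comparison. The `max_rounds` limit is an added safeguard that the text does not mention, and `serve_request` is a hypothetical name.

```python
from typing import Dict


def serve_request(number: str, database: Dict[str, bytes], queue: Dict[str, bytes],
                  max_rounds: int = 3) -> bytes:
    """Return the queue data for `number` once it matches the database copy."""
    for _ in range(max_rounds):
        to_verify = queue.get(number)  # stand-in for (re-)encapsulation
        reference = database[number]   # comparison data with the same number
        if to_verify == reference:
            return to_verify           # consistent: send to the request terminal
        # Inconsistent: clear the stale entry and recopy it from the database.
        queue.pop(number, None)
        queue[number] = database[number]
    # Assumed safeguard: give up instead of looping forever.
    raise RuntimeError(f"data {number!r} still inconsistent after {max_rounds} rounds")
```

In this sketch the database is treated as the source of truth, so a single recopy normally restores consistency on the next round.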

The big-data-based data processing apparatus numbers the received data information according to its size and receiving time, which ensures that each row of data information in the database is unique. The number information of the data to be verified in the message queue cluster can then be used to locate the corresponding data information in the database, so that data information with the same number information but stored in different locations can be compared for consistency. Finally, the data information in the message queue cluster is further processed according to the comparison result, which solves the prior-art technical problem that data information read from the message queue cluster may be inconsistent with the data information in the database.

In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing user order data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a data processing method based on big data. The received data information is numbered according to its size and receiving time, which ensures that each row of data information in the database is unique. The number information of the data to be verified in the message queue cluster can then be used to locate the corresponding data information in the database, data information with the same number information but stored in different locations is compared for consistency, and the data information in the message queue cluster is further processed according to the comparison result. This solves the prior-art technical problem that data information read from the message queue cluster may be inconsistent with the data information in the database.

As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the steps of the big data based data processing method in the above embodiments, such as the steps 202 to 212 shown in fig. 2, or when the processor executes the computer program, the processor implements the functions of the modules/units of the big data based data processing apparatus in the above embodiments, such as the functions of the modules 902 to 912 shown in fig. 9. To avoid repetition, further description is omitted here.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the big data based data processing method in the above-described embodiments, such as the steps 202 to 212 shown in fig. 2, or implements the functions of the modules/units of the big data based data processing apparatus in the above-described embodiments, such as the functions of the modules 902 to 912 shown in fig. 9. The received data information is numbered according to its size and receiving time, which ensures that each row of data information in the database is unique. The number information of the data to be verified in the message queue cluster can then be used to locate the corresponding data information in the database, data information with the same number information but stored in different locations is compared for consistency, and finally the data information in the message queue cluster is further processed according to the comparison result. This solves the prior-art technical problem that data information read from the message queue cluster may be inconsistent with the data information in the database.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of the present specification.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make several changes, modifications and equivalent substitutions of some technical features without departing from the spirit and scope of the present invention, and such changes or substitutions do not cause the essence of the corresponding technical solution to depart from the spirit and scope of the technical solutions of the embodiments of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A big data-based data processing method, the method comprising:

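receiving data information sent by a user terminal;

numbering the data information to obtain number information of each piece of data information, and storing the numbered data information in a database as data to be copied;

copying the data to be copied into a message queue cluster as data to be encapsulated;

receiving a data acquisition request sent by a request terminal, and encapsulating the data to be encapsulated according to the data acquisition request to obtain data to be verified;

querying, as comparison data, the data to be copied whose number information is consistent with that of the data to be verified;

and performing a consistency comparison between the data to be verified and the comparison data to obtain a comparison result, and processing the data to be verified according to the comparison result.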

2. The method of claim 1, wherein the numbering the data information comprises:

detecting the size of the data information to obtain a size reference symbol of the data information;

and generating the number information of the data information according to the size reference symbol and the receiving time of the data information.

3. The method of claim 2, wherein the detecting the size of the data information to obtain the size reference symbol of the data information comprises:

if the size of the data information is not greater than 100KB, the size reference symbol is Z;

if the size of the data information is greater than 100KB and less than 1MB, then the size reference symbol is Y;

and if the size of the data information is not less than 1MB, the size reference symbol is X.

4. The method of claim 2, wherein the generating the number information of the data information according to the size reference symbol and the receiving time of the data information comprises:

if at least two pieces of data information in the same size range are received within a time T, generating a sequence number for each piece of data information received within the time T according to the quantity of data information received within the time T;

and generating the number information of the data information according to the sequence number, the size reference symbol and the receiving time.

5. The method according to claim 1, wherein the encapsulating the data to be encapsulated according to the data acquisition request to obtain the data to be verified comprises:

reading the data acquisition request to obtain key information;

and extracting the data to be encapsulated corresponding to the key information from the message queue cluster, and encapsulating the extracted data to generate a data class as the data to be verified.

6. The method according to claim 1, wherein the consistency comparison between the data to be verified and the comparison data to obtain a comparison result comprises:

converting the data to be verified into a temporary data file;

and comparing the consistency of the temporary data file and the comparison data in a step-by-step comparison mode to obtain the comparison result.

7. The method according to claim 1, wherein the processing the data to be verified according to the comparison result comprises:

if the comparison result is that the data to be verified is inconsistent with the comparison data, clearing the data to be encapsulated corresponding to the data to be verified from the message queue cluster, copying the data to be copied corresponding to the data to be verified from the database to the message queue cluster as new data to be encapsulated, encapsulating it again to obtain the data to be verified, and repeating the consistency comparison until the comparison result is that the data to be verified is consistent with the comparison data;

and if the comparison result is that the data to be verified is consistent with the comparison data, sending the data to be verified to the request terminal.

8. A big data based data processing apparatus, comprising:

9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.