CN115220914A - Data processing method, system and computing equipment

Data processing method, system and computing equipment

Info

Publication number
CN115220914A
CN115220914A (application CN202210793157.9A)
Authority
CN
China
Prior art keywords
message
consumption
server
program
messages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210793157.9A
Other languages
Chinese (zh)
Inventor
周杨
田强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Chezhiying Technology Co ltd
Original Assignee
Beijing Chezhiying Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chezhiying Technology Co ltd
Priority to CN202210793157.9A
Publication of CN115220914A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F 11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/445 Program loading or initiating
    • G06F 9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/546 Message passing systems or structures, e.g. queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a data processing method, system, and computing device. The method is executed in a message queue server that is communicatively connected to a plurality of consumption servers, and comprises the following steps: acquiring a plurality of messages generated from corpus data; and configuring a predetermined number of consumer program instances for each consumption server, so that each consumption server processes messages concurrently through its predetermined number of consumer program instances and pushes them to an NLP system for computation. At run time, each consumer program instance requests messages from the message queue server and processes them. The technical solution of the invention enables massive corpus data to be processed concurrently, improving data processing efficiency.

Description

Data processing method, system and computing equipment
Technical Field
The present invention relates to the field of computer and internet technologies, and in particular, to a data processing method, a data processing system, and a computing device.
Background
An NLP (Natural Language Processing) corpus synchronization and computation system is designed to rapidly synchronize massive word-of-mouth review data to an NLP system for computation. Specifically, such a system must synchronize and compute over more than 5 million word-of-mouth reviews, each averaging about 7,000 characters. The NLP system is itself complex, involving multiple subsystems such as sentiment analysis, a rule engine, and word-cluster matching, so processing a single review takes about 2 seconds on average. Processing 5 million reviews serially would therefore take about 115 days (5,000,000 × 2 s = 10,000,000 s ≈ 115.7 days), which is far too long. Under such data volumes, NLP synchronization and computation is time-consuming, involves a large number of subsystem calls and network transfers, and is prone to timeouts and abnormal program interruptions.
Therefore, a data processing method and system are needed to address the above problems.
Disclosure of Invention
To this end, the present invention provides a data processing method, a data processing system, and a computing device that solve, or at least alleviate, the above problems.
According to one aspect of the present invention, there is provided a data processing method, performed in a message queue server communicatively coupled to a plurality of consumption servers, the method comprising: acquiring a plurality of messages generated from corpus data; and configuring a predetermined number of consumer program instances for each consumption server, so that each consumption server concurrently processes messages through its predetermined number of consumer program instances and pushes them to an NLP system for computation; wherein, at run time, each consumer program instance requests messages from the message queue server and processes them.
Optionally, in the data processing method according to the present invention, configuring a predetermined number of consumer program instances for each consumption server comprises: acquiring configuration information, determining the predetermined number from the configuration information, and configuring that number of consumer program instances for each consumption server.
Optionally, in the data processing method according to the invention, each consumption server is adapted to dynamically adjust the number of its consumer program instances according to its own message processing speed.
Optionally, in the data processing method according to the invention, each consumption server is adapted to: retry, based on a predetermined time interval, when processing a message fails; and write the message into a dead-letter queue after the number of retries reaches a predetermined retry count.
Optionally, in the data processing method according to the present invention, retrying based on a predetermined time interval comprises: returning the message to the message queue server, so that another consumer program instance requests the message from the message queue server again and processes it.
Optionally, in the data processing method according to the present invention, the predetermined time interval is 10 seconds and the predetermined retry count is 4.
Optionally, in the data processing method according to the present invention, the message queue server is communicatively connected to a data storage device, and acquiring the plurality of messages to be processed comprises: acquiring, from the data storage device, a plurality of messages generated from corpus data, the corpus data including word-of-mouth identifiers.
According to an aspect of the present invention, there is provided a data processing system comprising: a plurality of consumption servers; and a message queue server communicatively connected to the plurality of consumption servers, adapted to acquire a plurality of messages generated from corpus data and to configure a predetermined number of consumer program instances for each consumption server, so that each consumption server concurrently processes messages through its predetermined number of consumer program instances and pushes them to an NLP system for computation; wherein, at run time, each consumer program instance requests messages from the message queue server and processes them.
Optionally, in the data processing system according to the invention, the message queue server is further adapted to: acquire configuration information, determine the predetermined number from the configuration information, and configure that number of consumer program instances for each consumption server.
Optionally, in the data processing system according to the invention, each consumption server is adapted to dynamically adjust the number of its consumer program instances according to its own message processing speed.
Optionally, in the data processing system according to the invention, each consumption server is adapted to: retry, based on a predetermined time interval, when processing a message fails; and write the message into a dead-letter queue after the number of retries reaches a predetermined retry count.
Optionally, in the data processing system according to the present invention, retrying based on the predetermined time interval comprises: returning the message to the message queue server, so that another consumer program instance requests the message from the message queue server again and processes it.
Optionally, the data processing system according to the present invention further comprises a data storage device storing a plurality of messages generated from corpus data, the corpus data including word-of-mouth identifiers; the message queue server is communicatively connected to the data storage device and is adapted to obtain the messages generated from the corpus data from the data storage device.
According to an aspect of the invention, there is provided a computing device comprising: at least one processor; and a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor and comprise instructions for performing the data processing method described above.
According to an aspect of the present invention, there is provided a readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the data processing method as described above.
According to the data processing method and system of the present invention, the message queue server acquires a plurality of messages generated from corpus data and configures a predetermined number of consumer program instances for each consumption server, so that the messages can be processed concurrently by the consumer program instances of the plurality of consumption servers and pushed to the NLP system for computation. The invention can therefore process massive corpus data concurrently, greatly improving data processing efficiency.
Furthermore, each consumption server can dynamically adjust the number of its consumer program instances according to its own message processing speed, so that the consumer program instances can be scaled dynamically according to the actual message processing progress.
In addition, a consumption server in the present invention returns a message that failed processing to the message queue, so that consumer program instances of other consumption servers can continue processing it. This avoids the situation where a message that failed because of an exception on the current consumption server or consumer program instance can never be processed. Moreover, retrying failed messages up to a predetermined retry count further ensures the fault tolerance and reliability of data synchronization while avoiding livelock.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which are indicative of various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, like reference numerals generally refer to like parts or elements.
FIG. 1 shows a schematic diagram of a data processing system 100, in accordance with one embodiment of the present invention;
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention;
FIG. 3 shows a flow diagram of a data processing method 300 according to one embodiment of the invention;
FIG. 4 shows a flow diagram for processing a message according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a schematic diagram of a data processing system 100, according to one embodiment of the invention.
As shown in FIG. 1, data processing system 100 includes a message queue server 120 and a plurality of consumption servers 130. Each consumption server 130 is communicatively connected to the message queue server 120, for example through a wired or wireless network connection.
In one implementation, the message queue server may be implemented as a RabbitMQ server. The consumption server may be implemented as an MQ consumption server.
In one embodiment, the data processing system 100 further includes a data storage device 110, and the data storage device 110 stores a plurality of messages (messages to be processed) generated based on the corpus data. The message queue server 120 is communicatively connected to the data storage device 110, and may obtain a plurality of messages generated based on the corpus data from the data storage device 110.
In one embodiment, the corpus data may be implemented as word-of-mouth data; in particular, the corpus data may include word-of-mouth identifiers.
In an embodiment of the present invention, the message queue server 120 may configure a predetermined number of consumer program instances for each consumption server 130, so that each consumption server 130 can start that number of consumer program instances and process messages concurrently through them. Processing a message via a consumer program instance pushes the message to the NLP system for computation. Because the consumer program instances of the plurality of consumption servers 130 process messages in parallel, multiple messages can be processed concurrently and synchronized to the NLP system for computation, which improves data processing efficiency.
It should be noted that, at run time, each consumer program instance of each consumption server 130 requests messages from the message queue server and processes them in order to push them to the NLP system for computation.
In one embodiment, the message queue server 120 may acquire configuration information, determine from it the number of consumer program instances to configure for each consumption server 130 (i.e., the predetermined number), and then configure that number of consumer program instances for each consumption server 130.
In one implementation, the configuration information includes a concurrency parameter whose value is, for example, 220. Accordingly, the predetermined number determined from the configuration information is 220, and the message queue server 120 configures 220 consumer program instances for each consumption server 130.
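The patent itself contains no source code. Purely as an illustration, the following Python sketch (using the pika RabbitMQ client; the queue name, configuration keys, and the push_to_nlp helper are all hypothetical, not taken from the patent) shows one way a consumption server might start the configured number of consumer program instances, each holding its own connection and channel:

```python
import json
import threading

import pika

CONFIG = {"mq_host": "localhost", "queue": "corpus.queue", "concurrency": 220}

def push_to_nlp(message: dict) -> None:
    """Hypothetical stub: push one corpus message to the NLP system for computation."""

def run_consumer_instance(config: dict) -> None:
    """One consumer program instance: requests messages from the queue server and processes them."""
    connection = pika.BlockingConnection(pika.ConnectionParameters(host=config["mq_host"]))
    channel = connection.channel()
    channel.basic_qos(prefetch_count=1)  # request one message at a time

    def on_message(ch, method, properties, body):
        push_to_nlp(json.loads(body))                   # process: push to the NLP system
        ch.basic_ack(delivery_tag=method.delivery_tag)  # acknowledge only on success

    channel.basic_consume(queue=config["queue"], on_message_callback=on_message)
    channel.start_consuming()

if __name__ == "__main__":
    # Start the predetermined number of consumer program instances on this server.
    # (A real deployment would likely share connections rather than open 220 of them.)
    for _ in range(CONFIG["concurrency"]):
        threading.Thread(target=run_consumer_instance, args=(CONFIG,), daemon=True).start()
    threading.Event().wait()  # keep the main thread alive
```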
In one embodiment, each consumption server 130 may dynamically adjust the number of its consumer program instances based on its own message processing speed. For example, when a consumption server 130 determines that it is processing messages slowly, it may dynamically create and start a new consumer program instance. Once started, the new instance automatically requests messages from the message queue server and processes them, further increasing the speed at which the consumption server processes messages. A minimal sketch of such a scaling loop follows.
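Again as an illustration only, the loop below samples throughput and adds one instance when it drops; the rate threshold and sampling window are invented for the example, and a production system would need a thread-safe counter and an upper bound on the instance count:

```python
import threading
import time
from typing import Callable

class InstanceScaler:
    """Sketch: grow the number of consumer program instances when throughput drops."""

    def __init__(self, start_instance: Callable[[], None],
                 min_rate_per_sec: float = 50.0, window_sec: float = 10.0):
        self.start_instance = start_instance  # e.g. wraps run_consumer_instance above
        self.min_rate = min_rate_per_sec      # assumed threshold, not from the patent
        self.window = window_sec
        self.processed = 0                    # consumers increment this after each ack

    def monitor(self) -> None:
        """Sample throughput every window; add one instance whenever it is too low."""
        while True:
            before = self.processed
            time.sleep(self.window)
            rate = (self.processed - before) / self.window
            if rate < self.min_rate:  # the server judges itself slow to process messages
                threading.Thread(target=self.start_instance, daemon=True).start()
```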
It should be noted that, in the embodiment of the present invention, the number of messages a consumer program instance requests from the message queue server at a time may also be determined from the configuration information. For example, in one implementation, a consumer program instance requests one message from the message queue server at a time and processes it.
In one embodiment, when a consumer program instance fails to process a message because of an exception, the failed message is retried, i.e., reprocessed, based on a predetermined time interval. Specifically, the consumer program instance returns the failed message to the message queue server 120, and another consumer program instance then requests that message from the message queue server again and processes it. A processing failure can be caused by abnormal operation of the consumption server 130 or the consumer program instance handling the message, or by abnormal operation of the NLP system.
After the number of retries for a failed message reaches the predetermined retry count, if processing still fails, the message is written into a dead-letter queue as a dead-letter message. That is, a failed message is retried at most the predetermined number of times, so the maximum number of processing attempts for a message (the first attempt plus the retries) is the predetermined retry count plus 1.
Thus, according to embodiments of the present invention, by returning a failed message to the message queue, the consumption server 130 allows consumer program instances of other consumption servers 130 to continue processing it. This avoids the situation where a message that failed because of an exception on the current consumption server 130 or consumer program instance can never be processed. A sketch of this retry-then-dead-letter flow appears below.
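The following sketch extends the earlier on_message callback (reusing the hypothetical push_to_nlp stub) with this behavior. The retry-counter header, exchange names, and the in-callback sleep are illustrative choices, not the patent's specification; real deployments usually implement the delay with a TTL or delayed-message queue instead of blocking the consumer:

```python
import json
import time

import pika

MAX_RETRIES = 4            # predetermined retry count
RETRY_DELAY_SECONDS = 10   # predetermined time interval

def on_message(ch, method, properties, body):
    headers = dict(properties.headers or {})
    retries = headers.get("x-retries", 0)
    try:
        push_to_nlp(json.loads(body))                   # may raise on NLP/system failure
        ch.basic_ack(delivery_tag=method.delivery_tag)
    except Exception:
        ch.basic_ack(delivery_tag=method.delivery_tag)  # drop this delivery, then re-publish
        props = pika.BasicProperties(headers={**headers, "x-retries": retries + 1},
                                     delivery_mode=2)   # persistent
        if retries < MAX_RETRIES:
            time.sleep(RETRY_DELAY_SECONDS)             # crude fixed-interval delay
            # Return the message to its queue so any instance can pick it up again.
            ch.basic_publish(exchange="", routing_key="corpus.queue",
                             body=body, properties=props)
        else:
            # Retries exhausted: write the message to the dead-letter queue.
            ch.basic_publish(exchange="corpus.dlx", routing_key="corpus.dead",
                             body=body, properties=props)
```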
In an embodiment of the invention, the message queue server 120 is adapted to perform the data processing method 300. The data processing method 300 of the present invention will be described in detail below.
In one embodiment, the data storage device 110, the message queue server 120, and the consumption servers 130 of the present invention may each be implemented as a computing device. When the message queue server 120 is implemented as a computing device, the data processing method 300 of the present invention can be executed on that computing device.
FIG. 2 shows a block diagram of a computing device 200, according to one embodiment of the invention.
As shown in FIG. 2, in a basic configuration 202, a computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 204 may include one or more levels of cache, such as a level-one cache 210 and a level-two cache 212, a processor core 214, and registers 216. The example processor core 214 may include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An example memory controller 218 may be used with the processor 204, or in some implementations the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, system memory 206 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 206 may include an operating system 220, one or more applications 222, and program data 224. The application 222 is in effect a plurality of program instructions that direct the processor 204 to perform corresponding operations. In some implementations, the application 222 can be arranged to cause the processor 204 to operate with the program data 224 on the operating system.
Computing device 200 may also include a storage interface bus 234. The storage interface bus 234 enables communication from storage devices 232 (e.g., removable storage 236 and non-removable storage 238) to the basic configuration 202 via the bus/interface controller 230. At least a portion of the operating system 220, the applications 222, and the program data 224 may be stored on the removable storage 236 and/or the non-removable storage 238, and loaded into system memory 206 via the storage interface bus 234 and executed by the one or more processors 204 when the computing device 200 is powered on or the applications 222 are to be executed.
Computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to the basic configuration 202 via the bus/interface controller 230. The example output device 242 includes a graphics processing unit 248 and an audio processing unit 250. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more a/V ports 252. Example peripheral interfaces 244 can include a serial interface controller 254 and a parallel interface controller 256, which can be configured to facilitate communications with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 258. An example communication device 246 may include a network controller 260, which may be arranged to facilitate communications with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or direct-wired connection, and various wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
In the computing device 200 according to the present invention, the application 222 includes a plurality of program instructions for executing the data processing method 300. These instructions can direct the processor 204 to execute the method 300, so that the computing device 200 processes massive data concurrently and thereby improves data processing efficiency.
FIG. 3 shows a flow diagram of a data processing method 300 according to one embodiment of the invention. The method 300 is suitable for execution in a message queue server 120, such as the aforementioned computing device 200.
As shown in fig. 3, the method 300 begins at step S310.
In step S310, a plurality of messages generated based on corpus data are acquired.
Specifically, the message queue server 120 is communicatively connected to the data storage device 110, and may obtain a plurality of messages generated based on the corpus data from the data storage device 110.
In one embodiment of the invention, the corpus data may be implemented as word-of-mouth data; in particular, the corpus data may include word-of-mouth identifiers.
In step S320, a predetermined number of consumer program instances are configured for each consumption server 130, so that each consumption server 130 can start that number of consumer program instances and process messages concurrently through them. Processing a message via a consumer program instance pushes the message to the NLP system for computation. Because the consumer program instances of the plurality of consumption servers 130 process messages in parallel, multiple messages can be processed concurrently and synchronized to the NLP system for computation, which improves data processing efficiency.
It should be noted that, at run time, each consumer program instance of each consumption server 130 requests messages from the message queue server and processes them in order to push them to the NLP system for computation.
According to an embodiment of the present invention, the message queue server 120 may acquire configuration information, determine from it the number of consumer program instances to configure for each consumption server 130 (i.e., the predetermined number), and then configure that number of consumer program instances for each consumption server 130.
In one implementation, the configuration information includes a concurrency parameter whose value is, for example, 220. Accordingly, the predetermined number determined from the configuration information is 220, and the message queue server 120 configures 220 consumer program instances for each consumption server 130.
According to an embodiment of the present invention, each consumption server 130 may dynamically adjust the number of its consumer program instances according to its own message processing speed. For example, when a consumption server 130 determines that it is processing messages slowly, it may dynamically create and start a new consumer program instance, which then automatically requests messages from the message queue server and processes them.
It should be noted that, in the embodiment of the present invention, the number of messages a consumer program instance requests from the message queue server at a time may also be determined from the configuration information. For example, in one implementation, a consumer program instance requests one message from the message queue server at a time and processes it.
According to one embodiment of the invention, when a consumer program instance fails to process a message because of an exception, the failed message is retried, i.e., reprocessed, based on a predetermined time interval. Specifically, the consumer program instance returns the failed message to the message queue server 120, and another consumer program instance then requests that message from the message queue server again and processes it. A processing failure can be caused by abnormal operation of the consumption server 130 or the consumer program instance handling the message, or by abnormal operation of the NLP system.
In addition, after the number of retries for a failed message reaches the predetermined retry count, if the message still fails to be processed, it is written into the dead-letter queue as a dead-letter message. That is, a failed message is retried at most the predetermined number of times, so the maximum number of processing attempts (the first attempt plus the retries) is the predetermined retry count plus 1.
Thus, according to embodiments of the present invention, by returning a failed message to the message queue, the consumption server 130 allows consumer program instances of other consumption servers 130 to continue processing it. This avoids the situation where a message that failed because of an exception on the current consumption server 130 or consumer program instance can never be processed.
In one embodiment, the predetermined time interval may be, for example, 10 seconds, and the predetermined retry count may be, for example, 4, giving a maximum of 5 processing attempts.
FIG. 4 shows a flow diagram for processing a message according to one embodiment of the invention.
As shown in fig. 4, the message queue server 120 may be connected to a message producer to receive messages from it. Here, the message producer is the program that delivers messages. In implementations of the invention, the message producer may be implemented as a data storage device, or as a program module coupled to a data storage device, that sends messages generated from corpus data to the message queue server.
In one implementation, as shown in fig. 4, the message queue server 120 includes a message exchange (Exchange) and a plurality of message queues (Queue). The exchange sends messages to the corresponding message queues according to predetermined rules, and each message can be sent (routed) to one or more message queues via the exchange. Through a message queue, a message is delivered to a consumer program instance of a consumption server 130 connected to that queue, which then processes the message. It will be appreciated that the consumer program instance acts as the message consumer, i.e., the program that receives a message and is responsible for processing (consuming) it by pushing it to the NLP system for computation.
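As an illustrative sketch of this topology (the exchange type, names, and routing keys are assumptions for the example, not taken from the patent), the producer side might declare the exchange and queues and route messages as follows:

```python
import json

import pika

def declare_topology(channel) -> None:
    """Declare one exchange and bind work queues to it by routing key."""
    channel.exchange_declare(exchange="corpus.exchange", exchange_type="direct", durable=True)
    for queue, key in [("corpus.queue", "corpus"),
                       ("corpus.queue.priority", "corpus.priority")]:
        channel.queue_declare(queue=queue, durable=True)
        channel.queue_bind(queue=queue, exchange="corpus.exchange", routing_key=key)

def publish_corpus_message(channel, payload: dict, routing_key: str = "corpus") -> None:
    """Producer side: the exchange routes the message to the matching queue(s)."""
    channel.basic_publish(exchange="corpus.exchange", routing_key=routing_key,
                          body=json.dumps(payload),
                          properties=pika.BasicProperties(delivery_mode=2))  # persistent
```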
As described above, when a consumer program instance of a consumption server 130 encounters an exception while processing a message, the failed message is retried based on a predetermined time interval. After the number of retries reaches the predetermined retry count, if processing still fails, the failed message is treated as a dead-letter message and written into a dead-letter queue. In one implementation, as shown in fig. 4, the consumer program instance of the consumption server 130 sends the dead-letter message to a dead-letter exchange, which sends (routes) it to the corresponding dead-letter queue; here, the dead-letter queue may specifically be the dead-letter queue of the NLP system. In one implementation, the dead-letter exchange routes a dead-letter message to the corresponding dead-letter queue according to the message's routing key. From the dead-letter queue, the message can then be delivered to a dead-letter-queue consumption server connected to that queue, which processes and consumes it. This dead-letter-queue consumption server may specifically be the NLP system's dead-letter-queue consumption server.
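The sketch below wires up such a dead-letter path. The x-dead-letter-exchange and x-dead-letter-routing-key queue arguments are standard RabbitMQ features; the specific names are again invented for the example:

```python
import pika

def declare_dead_letter_topology(channel) -> None:
    """Declare a dead-letter exchange (DLX) and the queue it routes dead letters to."""
    channel.exchange_declare(exchange="corpus.dlx", exchange_type="direct", durable=True)
    channel.queue_declare(queue="nlp.dead.letter.queue", durable=True)
    channel.queue_bind(queue="nlp.dead.letter.queue", exchange="corpus.dlx",
                       routing_key="corpus.dead")

    # Point the work queue at the DLX so rejected messages are routed there
    # automatically, in addition to any explicit publishes by consumers.
    # (These arguments must be present on the queue's very first declaration;
    # redeclaring an existing queue with different arguments fails.)
    channel.queue_declare(queue="corpus.queue", durable=True, arguments={
        "x-dead-letter-exchange": "corpus.dlx",
        "x-dead-letter-routing-key": "corpus.dead",
    })
```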
According to the data processing method and system of the present invention, the message queue server acquires a plurality of messages generated from corpus data and configures a predetermined number of consumer program instances for each consumption server, so that the messages can be processed concurrently by the consumer program instances of the plurality of consumption servers and pushed to the NLP system for computation. The invention can therefore process massive corpus data concurrently, greatly improving data processing efficiency.
Furthermore, each consumption server can dynamically adjust the number of its consumer program instances according to its own message processing speed, so that the consumer program instances can be scaled dynamically according to the actual message processing progress.
In addition, a consumption server in the present invention returns a message that failed processing to the message queue, so that consumer program instances of other consumption servers can continue processing it. This avoids the situation where a message that failed because of an exception on the current consumption server or consumer program instance can never be processed. Moreover, retrying failed messages up to the predetermined retry count further ensures the fault tolerance and reliability of data synchronization while avoiding livelock.
B9. The system of B8, wherein the message queue server is further adapted to: acquire configuration information, determine the predetermined number from the configuration information, and configure that number of consumer program instances for each consumption server.
B10. The system of B8 or B9, wherein each consumption server is adapted to: dynamically adjust the number of its consumer program instances according to its own message processing speed.
B11. The system of any one of B8-B10, wherein each consumption server is adapted to: retry, based on a predetermined time interval, when processing a message fails; and write the message into a dead-letter queue after the number of retries reaches a predetermined retry count.
B12. The system of B11, wherein retrying based on the predetermined time interval comprises: returning the message to the message queue server, so that another consumer program instance requests the message from the message queue server again and processes it.
B13. The system of any one of B8-B12, further comprising: a data storage device storing a plurality of messages generated from corpus data, the corpus data including word-of-mouth identifiers; wherein the message queue server is communicatively connected to the data storage device and is adapted to obtain the messages generated from the corpus data from the data storage device.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy disks, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the mobile terminal generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to execute the data processing method of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media may comprise readable storage media and communication media. Readable storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with examples of this invention. The required structure for constructing such a system is apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed to reflect the intent: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed by way of illustration and not limitation with respect to the scope of the invention, which is defined by the appended claims.

Claims (10)

1. A data processing method implemented in a message queue server communicatively coupled to a plurality of consumption servers, the method comprising:
acquiring a plurality of messages generated based on corpus data;
respectively configuring a predetermined number of consumer program instances for each consumption server, so that each consumption server concurrently processes messages through its predetermined number of consumer program instances to push the messages to an NLP system for computation;
wherein, at run time, the consumer program instances request messages from the message queue server and process them.
2. The method of claim 1, wherein configuring a predetermined number of consumer program instances for each consumption server comprises:
acquiring configuration information, determining the predetermined number from the configuration information, and configuring that number of consumer program instances for each consumption server.
3. The method of claim 1 or 2, wherein each consumption server is adapted to:
dynamically adjusting the number of consumer program instances on the consumption server according to the consumption server's own message processing speed.
4. The method of any one of claims 1-3, wherein each consumption server is adapted to:
retrying based on a predetermined time interval when processing the message fails;
and writing the message into a dead-letter queue after the number of retries reaches a predetermined retry count.
5. The method of claim 4, wherein retrying based on a predetermined time interval comprises:
returning the message to the message queue server, so that another consumer program instance requests the message from the message queue server again and processes it.
6. The method of claim 4 or 5, wherein the predetermined time interval is 10 seconds and the predetermined retry count is 4.
7. The method of any of claims 1-6, wherein the message queue server is communicatively coupled to a data storage device, and retrieving the plurality of pending messages comprises:
acquiring a plurality of messages generated based on the corpus data from the data storage device;
the corpus data including a word-of-mouth identifier.
8. A data processing system comprising:
a plurality of consumption servers;
a message queue server communicatively connected to the plurality of consumption servers, adapted to acquire a plurality of messages generated from corpus data and to configure a predetermined number of consumer program instances for each consumption server, so that each consumption server concurrently processes messages through its predetermined number of consumer program instances to push the messages to an NLP system for computation;
wherein, at run time, the consumer program instances request messages from the message queue server and process them.
9. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor and comprise instructions for performing the data processing method of any one of claims 1-7.
10. A readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the data processing method of any one of claims 1-7.
CN202210793157.9A 2022-07-05 2022-07-05 Data processing method, system and computing equipment Pending CN115220914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210793157.9A CN115220914A (en) 2022-07-05 2022-07-05 Data processing method, system and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210793157.9A CN115220914A (en) 2022-07-05 2022-07-05 Data processing method, system and computing equipment

Publications (1)

Publication Number Publication Date
CN115220914A 2022-10-21

Family

ID=83610583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210793157.9A Pending CN115220914A (en) 2022-07-05 2022-07-05 Data processing method, system and computing equipment

Country Status (1)

Country Link
CN (1) CN115220914A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453665A (en) * 2023-10-09 2024-01-26 行吟信息科技(上海)有限公司 Data processing method, device, equipment and storage medium
CN117453665B (en) * 2023-10-09 2024-06-25 行吟信息科技(上海)有限公司 Data processing method, device, equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination