CN114510299A - Method, device and storage medium for processing artificial intelligence service

Info

Publication number
CN114510299A
Authority
CN
China
Prior art keywords
message queue
event data
artificial intelligence
processing
request event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011280138.3A
Other languages
Chinese (zh)
Inventor
邓成东
蒋宁
王素文
王洪斌
吴海英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Finite Element Technology Co Ltd
Original Assignee
Beijing Finite Element Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Finite Element Technology Co Ltd filed Critical Beijing Finite Element Technology Co Ltd
Priority to CN202011280138.3A
Publication of CN114510299A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4482 Procedural
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Abstract

The application discloses a method, a device and a storage medium for processing artificial intelligence services. The method comprises the following steps: receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests; sending the request event data to a first message queue for caching the service requests; monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data; monitoring the second message queue, and sending the batch data in the second message queue to the artificial intelligence processing system; receiving processing results corresponding to the request event data from the artificial intelligence processing system, and sending the processing results to a third message queue for caching the processing results; and monitoring the third message queue, and feeding back the processing results to the corresponding service requests.

Description

Method, device and storage medium for processing artificial intelligence service
Technical Field
The present application relates to the field of internet artificial intelligence technology, and in particular, to a method, an apparatus, and a storage medium for processing artificial intelligence services.
Background
In the field of artificial intelligence computation, workloads are dominated by the recognition and processing of unstructured data such as graphics and images, which involve large data volumes and parallel computation over image units; these characteristics mean that the traditional CPU computing mode cannot meet the requirements on time and computational efficiency. Because the GPU is built around graphics-oriented numerical computation and excels at highly parallel numerical computation, whether graphical or non-graphical, servers equipped with GPUs are commonly used to run algorithm models in the artificial intelligence field. However, the algorithm model serves an application system, and if the application system still calls the algorithm model service in the traditional synchronous request-response mode, the parallel processing capability of the GPU server cannot be used effectively, which reduces operational efficiency. It is therefore necessary to design a method by which a business application system can handle synchronous requests asynchronously and submit them to the algorithm model in batches for computation.
No effective solution has yet been proposed for the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response mode, which fails to exploit the model's parallel processing capability and degrades service processing efficiency.
Disclosure of Invention
Embodiments of the present disclosure provide a method, an apparatus, and a storage medium for processing artificial intelligence services, so as to at least solve the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response manner, which cannot effectively utilize the model's parallel processing capability and affects service processing efficiency.
According to an aspect of the embodiments of the present disclosure, there is provided a method for processing an artificial intelligence service, including: receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests; sending the request event data to a first message queue for caching the service request; monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data; monitoring the second message queue, and sending batch data in the second message queue to an artificial intelligence processing system; receiving a processing result corresponding to the request event data from the artificial intelligence processing system, and sending the processing result to a third message queue for caching the processing result; and monitoring the third message queue, and feeding back a processing result to the corresponding service request.
According to another aspect of the embodiments of the present disclosure, there is also provided a storage medium including a stored program, wherein, when the program runs, a processor performs the method of any one of the above.
According to another aspect of the embodiments of the present disclosure, there is also provided an apparatus for processing an artificial intelligence service, including: the request receiving module is used for receiving concurrent service requests related to the artificial intelligence service and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests; the first sending module is used for sending the request event data to a first message queue used for caching the service request; the second sending module is used for monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue and sending the batch data to a second message queue for caching the batch data; the third sending module is used for monitoring the second message queue and sending the batch data in the second message queue to the artificial intelligence processing system; the fourth sending module is used for receiving the processing result corresponding to the request event data from the artificial intelligence processing system and sending the processing result to a third message queue for caching the processing result; and the feedback module is used for monitoring the third message queue and feeding back the processing result to the corresponding service request.
According to another aspect of the embodiments of the present disclosure, there is also provided an apparatus for processing an artificial intelligence service, including: a processor; and a memory coupled to the processor for providing instructions to the processor for processing the following processing steps: receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests; sending the request event data to a first message queue for caching the service request; monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data; monitoring the second message queue, and sending batch data in the second message queue to an artificial intelligence processing system; receiving a processing result corresponding to the request event data from the artificial intelligence processing system, and sending the processing result to a third message queue for caching the processing result; and monitoring the third message queue, and feeding back a processing result to the corresponding service request.
In the embodiment of the present disclosure, in the process of processing parallel service requests, the service application system first caches the request event data corresponding to the service requests in a first message queue, then constructs the request event data cached in the first message queue into batch data and sends the batch data to a second message queue, then sends the batch data in the second message queue to an artificial intelligence processing system for parallel processing, and caches the processing results in a third message queue. Finally, the processing results cached in the third message queue are fed back to the corresponding service requests. Compared with the prior art, after receiving multiple parallel requests, the service application system can construct batch data from those requests and send the batch data asynchronously to the artificial intelligence processing system, which processes the requests in parallel by virtue of its strong parallel processing capability; the processing results are then received from the artificial intelligence processing system and fed back to the corresponding service requests. The service application system of this embodiment thus achieves the technical effects of asynchronous response and batch processing of service requests, and further solves the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response mode, cannot effectively utilize the model's parallel processing capability, and affects service processing efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the disclosure and together with the description serve to explain the disclosure and not to limit the disclosure. In the drawings:
fig. 1 is a hardware configuration block diagram of a computing device for implementing the method according to embodiment 1 of the present disclosure;
FIG. 2 is a schematic diagram of a system for processing artificial intelligence services according to embodiment 1 of the present disclosure;
FIG. 3 is a schematic flow chart of a method for processing artificial intelligence services according to a first aspect of embodiment 1 of the present disclosure;
FIG. 4 is a schematic diagram of an apparatus for processing artificial intelligence services according to embodiment 2 of the disclosure; and
fig. 5 is a schematic diagram of an apparatus for processing artificial intelligence services according to embodiment 3 of the present disclosure.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings in the embodiments of the present disclosure. It is to be understood that the described embodiments are merely some, rather than all, of the embodiments of the present disclosure. All other embodiments obtained by a person skilled in the art from the embodiments disclosed herein without creative effort shall fall within the protection scope of the present disclosure.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with the present embodiment, there is provided an embodiment of a method for processing artificial intelligence services. It should be noted that the steps illustrated in the flowchart of the figure may be performed in a computer system, such as by a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the one here.
The method embodiment provided by the present embodiment may be executed in a server or similar computing device. FIG. 1 illustrates a block diagram of a hardware architecture of a computing device for implementing the method of processing artificial intelligence services. As shown in fig. 1, the computing device may include one or more processors (which may include, but are not limited to, processing devices such as a microprocessor MCU or a programmable logic device FPGA), a memory for storing data, and a transmission device for communication functions. In addition, the computing device may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computing device may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
It should be noted that the one or more processors and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single, stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computing device. As referred to in the embodiments of the disclosure, the data processing circuitry may act as a kind of processor control (for example, selection of a variable-resistance termination path connected to an interface).
The memory may be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the method for processing artificial intelligence services in the embodiments of the present disclosure. The processor executes various functional applications and data processing by running the software programs and modules stored in the memory, that is, implements the above-mentioned method for processing artificial intelligence services. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory may further include memory located remotely from the processor, which may be connected to the computing device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device is used for receiving or transmitting data via a network. Specific examples of such networks may include wireless networks provided by communication providers of the computing devices. In one example, the transmission device includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computing device.
It should be noted here that, in some alternative embodiments, the computing device shown in fig. 1 described above may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both. It should be noted that FIG. 1 is only one specific example and is intended to illustrate the types of components that may be present in the computing device described above.
Fig. 2 is a schematic diagram of a system for processing artificial intelligence services according to the present embodiment. Referring to fig. 2, the system includes a business application system and an AI algorithm system (hereinafter referred to as the artificial intelligence processing system). The business application system mainly includes an external service request receiving module and a plurality of queues. As shown in fig. 2, the plurality of queues include: the Queue-one queue, which caches individual external service requests, and whose data objects are divided into business objects and signal objects, where a business object carries the concrete business data through the queue, and a signal object is constructed automatically by the system and put into the queue when the business object data in the queue have not reached a specific number (for example, 100 items) but a specific time has elapsed (for example, 1 second), so that the batch-commit condition can be judged; the Queue-more queue, which caches the converted batch requests; and the Queue-resp queue, which caches the result data returned by the artificial intelligence processing system. The business application system receives external service requests, which are mainly parallel service requests related to artificial intelligence services such as image processing and speech recognition. The artificial intelligence processing system comprises a plurality of computing units, can process the batch requests in parallel, and sends the processing results back to the business application system. Finally, the business application system receives the processing results and feeds them back to the corresponding service requests. The hardware configuration described above can be applied to hardware facilities such as the computing units in this system.
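To make the illustrative sketches that accompany this description concrete, a minimal messaging abstraction is assumed below. It is not part of the patent: Queue-one, Queue-more and Queue-resp could be queues or topics in any message middleware with send/subscribe semantics. The sketches are written in Java, since the Thread, Map and listener class names in this description suggest a Java implementation.

```java
// Assumed minimal messaging abstraction used by the sketches in this description.
// In practice, Queue-one, Queue-more and Queue-resp could be queues or topics in
// any message middleware that offers send/subscribe semantics.
public interface MessageSender {
    void send(Object message); // enqueue a message for the corresponding queue's listener
}
```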
In the above operating environment, according to the first aspect of the present embodiment, a method for processing an artificial intelligence service is provided, and the method is implemented by the service application system shown in fig. 2. Fig. 3 shows a flow diagram of the method, which, with reference to fig. 3, comprises:
s302: receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests;
s304: sending the request event data to a first message queue for caching the service request;
s306: monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data;
s308: monitoring the second message queue, and sending batch data in the second message queue to an artificial intelligence processing system;
s310: receiving a processing result corresponding to the request event data from the artificial intelligence processing system, and sending the processing result to a third message queue for caching the processing result; and
s312: and monitoring the third message queue, and feeding back a processing result to the corresponding service request.
As described in the background art, in the field of artificial intelligence computation, workloads are dominated by the recognition and processing of unstructured data such as graphics and images, which involve large data volumes and parallel computation over image units; these characteristics mean that the traditional CPU computing mode cannot meet the requirements on time and computational efficiency. Because the GPU is built around graphics-oriented numerical computation and excels at highly parallel numerical computation, whether graphical or non-graphical, servers equipped with GPUs are commonly used to run algorithm models in the artificial intelligence field. However, the algorithm model serves an application system, and if the application system still calls the algorithm model service in the traditional synchronous request-response mode, the parallel processing capability of the GPU server cannot be used effectively, which reduces operational efficiency. It is therefore necessary to design a method by which a business application system can handle synchronous requests asynchronously and submit them to the algorithm model in batches for computation.
To address the technical problem described in the background art, in step S302 of the technical solution of this embodiment, the service application system first receives concurrent service requests related to artificial intelligence services, for example through its external service request receiving module. As shown in fig. 2, there may be multiple concurrent service requests, such as external request req01 and external request req02. These service requests are associated with artificial intelligence services, for example requests for processing services related to Artificial Intelligence (AI) such as image processing and speech recognition. The business application system then determines, according to the service requests, the request event data that needs to be submitted to a preset artificial intelligence processing system. In a specific example, the service application system parses the service requests and determines a plurality of request event data items (ReqEventData) to be submitted to the preset artificial intelligence processing system, such as the picture data or voice data that needs to be processed. The attributes of the request event data include, but are not limited to: a request attribute, a response attribute, a current-thread attribute, and an attribute identifying whether the object is a signal object.
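A minimal sketch of such a request event object follows. The class name ReqEventData and the four attribute categories come from this description; the concrete field types and names are assumptions.

```java
// Minimal sketch of the request event object described above.
// Assumption: Java implementation; field names are illustrative.
public class ReqEventData {
    private final Object request;        // request attribute: the payload submitted to the AI system
    private volatile Object response;    // response attribute: filled with the processing result later
    private final Thread ownerThread;    // current-thread attribute: the request thread to wake up
    private final boolean signalObject;  // whether this is a signal object rather than a business object

    public ReqEventData(Object request, Thread ownerThread, boolean signalObject) {
        this.request = request;
        this.ownerThread = ownerThread;
        this.signalObject = signalObject;
    }
    public Object getRequest() { return request; }
    public Object getResponse() { return response; }
    public void setResponse(Object response) { this.response = response; }
    public Thread getOwnerThread() { return ownerThread; }
    public boolean isSignalObject() { return signalObject; }
}
```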
Further, in step S304, the service application system sends request event data (which may be a plurality of request event data) to a first message Queue (corresponding to the Queue-one Queue in fig. 2) for caching the service request.
Further, in step S306, the business application system listens to the first message queue (the Queue-one queue). In practice, referring to fig. 2, a data processing class OneQueueListener is set for the first message queue to listen to the Queue-one queue; the cached request event data is subscribed to (consumed) from the first message queue, constructed into batch data, and sent to a second message queue (corresponding to the Queue-more queue in fig. 2) for caching the batch data.
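Building on the ReqEventData and MessageSender sketches above, the batching logic of this listener might look as follows. The onMessage callback signature, the BatchEventData shape and the threshold constant are assumptions; the class names and the two commit conditions follow this description.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed carrier object for one batch of M request events.
record BatchEventData(List<ReqEventData> events) { }

// Sketch of the Queue-one listener: count consumed business objects and commit a
// batch when the threshold is reached, or when a signal object arrives while data
// is pending. The onMessage entry point stands in for the middleware's callback.
public class OneQueueListener {
    private static final int BATCH_THRESHOLD = 100; // example batch-commit threshold from the text
    private final List<ReqEventData> buffer = new ArrayList<>();
    private final MessageSender queueMore; // destination: the Queue-more queue

    public OneQueueListener(MessageSender queueMore) {
        this.queueMore = queueMore;
    }

    public synchronized void onMessage(ReqEventData event) {
        if (!event.isSignalObject()) {
            buffer.add(event); // business object: count it toward the current batch
        }
        // Commit when the count M reaches the threshold, or when a signal object arrives.
        if (buffer.size() >= BATCH_THRESHOLD || (event.isSignalObject() && !buffer.isEmpty())) {
            queueMore.send(new BatchEventData(new ArrayList<>(buffer)));
            buffer.clear();
        }
    }
}
```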
Further, in step S308, the service application system listens to the second message queue; as shown in fig. 2, a data processing class MoreQueueListener is set for the second message queue to listen to the Queue-more queue. The second message queue caches the batch data constructed from the request event data, and the business application system can subscribe to (consume) the batch data in the second message queue and send it to the artificial intelligence processing system, for example through an interface preset by the artificial intelligence processing system. Referring to fig. 2, the artificial intelligence processing system (AI algorithm system) includes a plurality of computing units and makes full use of the GPU's thousands of mutually independent numerical computing units to process the requests in parallel; the business application system waits synchronously until the response of the AI algorithm system interface is obtained, and then parses the response data to obtain the plurality of processing results.
Further, in step S310, the business application system receives the processing results corresponding to the request event data from the artificial intelligence processing system and sends the processing results to a third message queue (corresponding to the Queue-resp queue in fig. 2) for caching the processing results.
Finally, in step S312, the service application system listens to the third message queue; for example, as shown in fig. 2, a data processing class RespQueueListener is set for the third message queue to listen to the Queue-resp queue. The service application system then subscribes to (consumes) the cached processing results in the third message queue and feeds each processing result back to the corresponding service request.
Therefore, in the process of processing parallel service requests, the service application system first caches the request event data corresponding to the service requests in the first message queue, then constructs the request event data cached in the first message queue into batch data and sends the batch data to the second message queue, then sends the batch data in the second message queue to the artificial intelligence processing system for parallel processing, and caches the processing results in the third message queue. Finally, the processing results cached in the third message queue are fed back to the corresponding service requests. Compared with the prior art, after receiving multiple parallel requests, the service application system can construct batch data from those requests and send the batch data asynchronously to the artificial intelligence processing system, which processes the requests in parallel by virtue of its strong parallel processing capability; the processing results are then received from the artificial intelligence processing system and fed back to the corresponding service requests. The service application system of this embodiment thus achieves the technical effects of asynchronous response and batch processing of service requests, and further solves the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response mode, cannot effectively utilize the model's parallel processing capability, and affects service processing efficiency.
Optionally, constructing the batch data according to the request event data buffered in the first message queue includes: counting the number of request event data items already cached in the first message queue; and constructing the batch data according to that number and a preset threshold.
Specifically, in the operation of constructing the batch data according to the request event data buffered in the first message queue, the business application system first counts the number of request event data items buffered in the first message queue and then constructs the batch data according to that number and a preset threshold. In one embodiment, the request event data buffered in the first message queue is counted, and when the number of buffered request event data items reaches a batch-commit threshold (for example, 100), the buffered request event data is constructed into one batch data object. In this way, the technical effect of batch processing of service requests can be achieved.
Optionally, the method further comprises: when the number of request event data items has not reached the threshold, constructing the batch data according to the request event data buffered in the first message queue within a predetermined time interval.
Specifically, in actual operation, the number of request event data items may fail to reach the threshold, that is, the number of service requests is too small. In this case, in order to still construct batch data, the service application system may construct the batch data from the request event data buffered in the first message queue within a predetermined time interval. In practice, the business application system may set up a timed task thread that constructs a signal object and writes it into the Queue-one queue every fixed interval N (for example, 50 milliseconds), so that a batch is committed at least once per interval. This prevents request event data from waiting indefinitely when the commit threshold cannot be reached, and thus improves processing efficiency.
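A sketch of such a timed task thread is shown below; the use of a ScheduledExecutorService and the SignalTimer name are assumptions, while the 50-millisecond interval and the signal-object mechanism follow the example in the text.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of the timed task thread: write a signal object into Queue-one every
// fixed interval N so buffered requests never wait indefinitely.
public final class SignalTimer {
    public static void start(MessageSender queueOne) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> queueOne.send(new ReqEventData(null, null, true)), // a signal object
                50, 50, TimeUnit.MILLISECONDS); // N = 50 ms, as in the example above
    }
}
```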
Optionally, receiving the concurrent service requests related to the artificial intelligence service further includes: determining the threads respectively corresponding to the service requests. After the request event data is sent to the first message queue for caching the service requests, the method further includes: setting the plurality of threads corresponding to the service requests associated with the request event data to a blocked state. Feeding back the processing result to the corresponding service request includes: determining the thread corresponding to the request event data, setting the thread to an awake state, and feeding back the processing result to the corresponding service request through the thread.
Specifically, after receiving concurrent service requests related to the artificial intelligence service, the service application system needs to determine threads respectively corresponding to the service requests, that is, allocate a Thread (Thread) to each service request.
After the request event data is sent to the first message queue for caching the service requests, the threads corresponding to the service requests associated with the request event data are set to a blocked state. For example, if the request event data of service request req01 and service request req02 has already been cached in the first message queue, the threads corresponding to service request req01 and service request req02 are set to the blocked state.
In the operation of feeding back the processing results to the corresponding service requests, the service application system first determines the thread corresponding to the request event data and sets the thread to an awake state. For example, if the processing results corresponding to service requests req01 and req02 have been cached in the third message queue, the service application system sets the threads corresponding to service requests req01 and req02 to the awake state, and then feeds the processing results back to the corresponding service requests through those threads; for example, the processing result related to service request req01 is fed back to req01 through the thread corresponding to req01.
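The blocked and awake states could be realised as in the following sketch, which reuses the ReqEventData sketch above. The LockSupport-based mechanism and the RequestFlow class name are assumptions; the description only requires that the request thread blocks after enqueueing its event and is woken once the result is ready.

```java
import java.util.concurrent.locks.LockSupport;

// Sketch of the blocked/awake states of a request thread. After enqueueing its
// event, the thread parks; the Queue-resp side unparks it once the response
// attribute has been filled. The loop guards against spurious wakeups.
public final class RequestFlow {
    public static Object submitAndWait(ReqEventData event, MessageSender queueOne) {
        queueOne.send(event);        // cache the request in Queue-one (step S304)
        while (event.getResponse() == null) {
            LockSupport.park(event); // blocked state: wait until awakened
        }
        return event.getResponse();  // awake state: the result is ready (step S312)
    }

    // Called on the Queue-resp side to set the request thread to the awake state.
    public static void wake(ReqEventData event) {
        LockSupport.unpark(event.getOwnerThread());
    }
}
```

Because the response field in the ReqEventData sketch is volatile, the woken thread is guaranteed to observe the result written by the consumer side.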
Therefore, in the process of processing parallel service requests, a corresponding thread can be allocated to each service request, and the processing results are finally fed back to each service request through the corresponding threads, which improves the processing efficiency of the service application system.
Optionally, the request event data includes a response attribute, and sending the processing results to a third message queue for caching the processing results includes: adding the processing results to the response attributes of the corresponding request event data respectively; and sending the request event data with the added processing results to the third message queue for caching the processing results.
Specifically, the request event data may include a response attribute, which is used to record the processing result corresponding to the service request. In the operation of sending the processing results to the third message queue for caching the processing results, the service application system first adds each processing result to the response attribute of the corresponding request event data, so that each request event data item records its processing result. The service application system then sends the request event data with the added processing results to the third message queue for caching the processing results. In this way, each request event data item is associated with its processing result, which makes it convenient to feed the corresponding processing result back to the service request and improves efficiency.
Optionally, the method further comprises: caching the request event data in a global in-memory data storage structure in key-value pair form, where the key of the key-value pair is a thread ID.
Specifically, the technical solution of this embodiment further includes caching each of the multiple request event data items in a global in-memory data storage structure Map (that is, a storage structure in key-value pair form), where the Key of the Map is the thread ID of the request object. This makes it easy to find the thread corresponding to a service request.
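A sketch of this global key-value store follows; the ConcurrentHashMap choice and the EventStore class name are assumptions, while keying by the request thread's ID comes directly from this description.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the global in-memory key-value store described above.
// The key is the thread ID of the request object.
public final class EventStore {
    private static final ConcurrentHashMap<Long, ReqEventData> MAP = new ConcurrentHashMap<>();

    private EventStore() { }

    public static void put(ReqEventData event) {
        MAP.put(event.getOwnerThread().getId(), event); // key = thread ID of the request object
    }

    // Used by the woken request thread to retrieve its own event by its thread ID.
    public static ReqEventData takeForCurrentThread() {
        return MAP.remove(Thread.currentThread().getId()); // look up, then clear the entry
    }
}
```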
Optionally, feeding back the processing result to the corresponding service request through the thread includes: determining the request event data corresponding to the thread according to the thread ID in the key-value pair; obtaining the processing result data in the response attribute from the request event data; and feeding back the processing result to the corresponding service request through the thread.
Specifically, in the operation of feeding back the processing result to the service request through the thread, the service application system first obtains the corresponding request event data from the global in-memory data storage structure Map via the thread's ID, then obtains the result data from the response attribute of the request event data, and finally feeds the processing result data back to the corresponding service request through the thread. In addition, the result data may be subjected to secondary processing after being acquired, such as processing of the data format. In this way, the processing result can be quickly obtained and fed back to the corresponding service request.
In addition, the application includes a specific example of the scheme; the steps of the example are as follows:
1. and (4) sorting the queues. The Queue-one is a Queue for caching an external single service request, and data objects in the Queue are divided into service objects and signal objects; the Queue-more is a Queue for caching the converted batch requests; the Queue-resp is a Queue for caching result data responded by the AI algorithm system.
2. The external service request receiving module. After a plurality of external concurrent requests are received, each request is held by a thread object ReqThread; the service application system parses the request content and constructs the data ReqEventData to be submitted to the AI algorithm system, sends the ReqEventData to Queue-one, and then blocks the request thread ReqThread, which waits until it is awakened. The core attributes of the ReqEventData object should include the request attribute, the response attribute, the current-thread attribute, and the attribute identifying whether it is a signal object.
3. The Queue-one queue data processing class OneQueueListener. After subscribing to the queue and receiving the data ReqEventData, it counts the data; when the count M reaches the batch-commit threshold (for example, 100) or a signal object arrives, it constructs the M pieces of data into a batch data object BatchEventData and sends the BatchEventData to Queue-more. In addition, to avoid the situation where M never reaches the commit value and waits indefinitely, a timed task thread is designed that constructs a signal object and writes it into the Queue-one queue at a fixed interval N (for example, 50 milliseconds).
4. The Queue-more queue data processing class MoreQueueListener. After subscribing to the queue and receiving the data BatchEventData, it constructs the batch of M ReqEventData items in the BatchEventData into one request and submits it to the AI algorithm system interface; the AI algorithm system makes full use of the GPU's thousands of mutually independent numerical computing units for parallel processing, and the listener waits synchronously until the response of the AI algorithm system interface is obtained. It then parses the response data to obtain M results and traverses them, filling each result into the response attribute of the corresponding request object. Finally, it constructs the M request objects into a data object RespEventData and sends it to Queue-resp.
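A sketch of this listener is given below, reusing the earlier sketches; the RespEventData shape and the AiClient interface standing in for the AI algorithm system's preset interface are assumptions.

```java
import java.util.List;

// Assumed carrier for the M answered request events.
record RespEventData(List<ReqEventData> events) { }

// Assumed client for the AI algorithm system's batch interface.
interface AiClient {
    List<Object> processBatch(List<ReqEventData> events); // one result per request, in order
}

// Sketch of the Queue-more listener: submit the whole batch in a single synchronous
// call, then copy each of the M results into its request's response attribute.
public class MoreQueueListener {
    private final AiClient ai;
    private final MessageSender queueResp; // destination: the Queue-resp queue

    public MoreQueueListener(AiClient ai, MessageSender queueResp) {
        this.ai = ai;
        this.queueResp = queueResp;
    }

    public void onMessage(BatchEventData batch) {
        List<ReqEventData> events = batch.events();
        // One call carrying all M requests; the AI system fans them out across the
        // GPU's computing units while this listener waits synchronously for the response.
        List<Object> results = ai.processBatch(events);
        for (int i = 0; i < events.size(); i++) {
            events.get(i).setResponse(results.get(i)); // fill the response attribute
        }
        queueResp.send(new RespEventData(events)); // forward the answered batch to Queue-resp
    }
}
```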
5. The Queue-resp queue data processing class RespQueueListener. After subscribing to the queue and receiving the data RespEventData, it traverses the M request objects ReqEventData in the data and caches each request object in a global in-memory data storage structure Map, whose Key is the thread ID of the request object; the ReqThread thread is then awakened and resumes execution from its blocking wait.
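A sketch of this listener, reusing the EventStore and RequestFlow sketches above, might look as follows:

```java
// Sketch of the Queue-resp listener: cache each answered request object in the
// global Map keyed by its thread ID, then wake the blocked ReqThread.
public class RespQueueListener {
    public void onMessage(RespEventData resp) {
        for (ReqEventData event : resp.events()) {
            EventStore.put(event);   // cache under the request thread's ID
            RequestFlow.wake(event); // resume the ReqThread from its blocking wait
        }
    }
}
```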
6. After the ReqThread thread is awakened, it obtains its ReqEventData from the global in-memory data storage structure Map via its own thread ID and, after secondary processing, synchronously returns the data in the response attribute to the corresponding external request (reqN).
In an audio and video quality inspection business scenario, the technical scheme requires AI computing capabilities provided by AI algorithm services, such as speech-to-text, identity card OCR (optical character recognition), face recognition, and face comparison, and improves processing speed by relying on the parallel, stateless processing capability of a server equipped with a GPU (graphics processing unit).
After the upper-layer business system generates audio and video files, the files are uploaded through an AI algorithm proxy service interface for quality inspection; the concurrent service requests are packaged into batches by the method described above, and the AI algorithm service's computation interface is then called asynchronously. The AI algorithm service subdivides and refines the batch of request data so as to make full use of the GPU's multiple computing units and complete the computation rapidly in one pass. In this way, the upper-layer business's requirement on data processing timeliness can be met while computing resources are fully utilized.
Further, referring to fig. 1, according to a second aspect of the present embodiment, there is provided a storage medium. The storage medium comprises a stored program, wherein the method of any of the above is performed by a processor when the program is run.
Therefore, according to this embodiment, in the process of processing parallel service requests, the service application system first caches the request event data corresponding to the service requests in the first message queue, then constructs the request event data cached in the first message queue into batch data and sends the batch data to the second message queue, then sends the batch data in the second message queue to the artificial intelligence processing system for parallel processing, and caches the processing results in the third message queue. Finally, the processing results cached in the third message queue are fed back to the corresponding service requests. Compared with the prior art, after receiving multiple parallel requests, the service application system can construct batch data from those requests and send the batch data asynchronously to the artificial intelligence processing system, which processes the requests in parallel by virtue of its strong parallel processing capability; the processing results are then received from the artificial intelligence processing system and fed back to the corresponding service requests. The service application system of this embodiment thus achieves the technical effects of asynchronous response and batch processing of service requests, and further solves the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response mode, cannot effectively utilize the model's parallel processing capability, and affects service processing efficiency.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
Fig. 4 shows an apparatus 400 for processing artificial intelligence services according to the present embodiment, the apparatus 400 corresponding to the method according to the first aspect of embodiment 1. Referring to fig. 4, the apparatus 400 includes: a request receiving module 410, configured to receive concurrent service requests related to artificial intelligence services, and determine request event data that needs to be submitted to a preset artificial intelligence processing system according to the service requests; a first sending module 420, configured to send request event data to a first message queue for caching service requests; a second sending module 430, configured to monitor the first message queue, construct batch data according to the request event data cached in the first message queue, and send the batch data to a second message queue for caching the batch data; a third sending module 440, configured to monitor the second message queue, and send batch data in the second message queue to the artificial intelligence processing system; a fourth sending module 450, configured to receive a processing result corresponding to the request event data from the artificial intelligence processing system, and send the processing result to a third message queue for caching the processing result; and a feedback module 460, configured to monitor the third message queue, and feed back a processing result to the corresponding service request.
Optionally, the second sending module 430 includes: a calculation submodule for counting the number of request event data items already cached in the first message queue; and a first construction submodule for constructing the batch data according to that number and a preset threshold.
Optionally, the apparatus 400 further comprises: a second construction submodule for constructing the batch data according to the request event data cached in the first message queue within a predetermined time interval when the number of request event data items does not reach the threshold.
Optionally, the request receiving module includes: a thread allocation submodule for determining the threads respectively corresponding to the service requests.
The apparatus 400 further comprises: a blocking setting submodule for setting the plurality of threads corresponding to the service requests associated with the request event data to a blocked state after the request event data is sent to the first message queue for caching the service requests. The feedback module includes: a thread feedback submodule for determining the thread corresponding to the request event data, setting the thread to an awake state, and feeding back the processing result to the corresponding service request through the thread.
Optionally, the request event data includes a response attribute, and the fourth sending module includes: an attribute adding submodule for adding the processing results to the response attributes of the corresponding request event data respectively; and a sending submodule for sending the request event data with the added processing results to the third message queue for caching the processing results.
Optionally, the apparatus 400 further comprises: a key-value pair storage module for caching the request event data in a global in-memory data storage structure in key-value pair form, where the key of the key-value pair is a thread ID.
Optionally, the thread feedback submodule includes: an event data determining submodule for determining the request event data corresponding to the thread according to the thread ID in the key-value pair; an attribute acquisition submodule for acquiring the processing result data in the response attribute from the request event data; and a feedback submodule for feeding back the processing result to the corresponding service request through the thread.
Therefore, according to this embodiment, in the process of processing parallel service requests, the service application system first caches the request event data corresponding to the service requests in the first message queue, then constructs the request event data cached in the first message queue into batch data and sends the batch data to the second message queue, then sends the batch data in the second message queue to the artificial intelligence processing system for parallel processing, and caches the processing results in the third message queue. Finally, the processing results cached in the third message queue are fed back to the corresponding service requests. Compared with the prior art, after receiving multiple parallel requests, the service application system can construct batch data from those requests and send the batch data asynchronously to the artificial intelligence processing system, which processes the requests in parallel by virtue of its strong parallel processing capability; the processing results are then received from the artificial intelligence processing system and fed back to the corresponding service requests. The service application system of this embodiment thus achieves the technical effects of asynchronous response and batch processing of service requests, and further solves the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response mode, cannot effectively utilize the model's parallel processing capability, and affects service processing efficiency.
Example 3
Fig. 5 shows an apparatus 500 for processing artificial intelligence services according to the present embodiment, the apparatus 500 corresponding to the method according to the first aspect of embodiment 1. Referring to fig. 5, the apparatus 500 includes: a processor 510; and a memory 520 coupled to processor 510 for providing processor 510 with instructions to process the following process steps: receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests; sending the request event data to a first message queue for caching the service request; monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data; monitoring the second message queue, and sending batch data in the second message queue to an artificial intelligence processing system; receiving a processing result corresponding to the request event data from the artificial intelligence processing system, and sending the processing result to a third message queue for caching the processing result; and monitoring the third message queue, and feeding back a processing result to the corresponding service request.
Optionally, constructing the batch data according to the request event data buffered in the first message queue includes: counting the number of request event data items already cached in the first message queue; and constructing the batch data according to that number and a preset threshold.
Optionally, the memory 520 is further configured to provide the processor 510 with instructions for the following processing steps: when the number of request event data items has not reached the threshold, constructing the batch data according to the request event data buffered in the first message queue within a predetermined time interval.
Optionally, receiving the concurrent service requests related to the artificial intelligence service further includes: determining the threads respectively corresponding to the service requests. After the request event data is sent to the first message queue for caching the service requests, the method further includes: setting the plurality of threads corresponding to the service requests associated with the request event data to a blocked state. Feeding back the processing result to the corresponding service request includes: determining the thread corresponding to the request event data, setting the thread to an awake state, and feeding back the processing result to the corresponding service request through the thread.
Optionally, the request event data includes a response attribute, and sending the processing results to a third message queue for caching the processing results includes: adding the processing results to the response attributes of the corresponding request event data respectively; and sending the request event data with the added processing results to the third message queue for caching the processing results.
Optionally, the memory 520 is further configured to provide the processor 510 with instructions for the following processing steps: caching the request event data in a global in-memory data storage structure in key-value pair form, where the key of the key-value pair is a thread ID.
Optionally, feeding back the processing result to the corresponding service request through the thread includes: determining the request event data corresponding to the thread according to the thread ID in the key-value pair; obtaining the processing result data in the response attribute from the request event data; and feeding back the processing result to the corresponding service request through the thread.
Therefore, according to this embodiment, in the process of processing parallel service requests, the service application system first caches the request event data corresponding to the service requests in the first message queue, then constructs the request event data cached in the first message queue into batch data and sends the batch data to the second message queue, then sends the batch data in the second message queue to the artificial intelligence processing system for parallel processing, and caches the processing results in the third message queue. Finally, the processing results cached in the third message queue are fed back to the corresponding service requests. Compared with the prior art, after receiving multiple parallel requests, the service application system can construct batch data from those requests and send the batch data asynchronously to the artificial intelligence processing system, which processes the requests in parallel by virtue of its strong parallel processing capability; the processing results are then received from the artificial intelligence processing system and fed back to the corresponding service requests. The service application system of this embodiment thus achieves the technical effects of asynchronous response and batch processing of service requests, and further solves the technical problem that the prior art invokes an artificial intelligence algorithm model in the traditional synchronous request-response mode, cannot effectively utilize the model's parallel processing capability, and affects service processing efficiency.
The serial numbers of the above embodiments of the present invention are merely for description and do not represent the relative merits of the embodiments.
In the above embodiments of the present invention, each embodiment is described with its own emphasis; for parts not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the units is only one kind of logical functional division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through certain interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method of processing artificial intelligence services, comprising:
receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests;
sending the request event data to a first message queue for caching service requests;
monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data;
monitoring the second message queue, and sending the batch data in the second message queue to the artificial intelligence processing system;
receiving a processing result corresponding to the request event data from the artificial intelligence processing system, and sending the processing result to a third message queue for caching the processing result; and
monitoring the third message queue, and feeding back the processing result to the corresponding service request.
2. The method of claim 1, wherein constructing a batch of data from the request event data buffered in the first message queue comprises:
counting the quantity of the request event data buffered in the first message queue; and
constructing the batch data according to the quantity and a preset threshold value.
3. The method of claim 2, further comprising:
under the condition that the quantity of the request event data does not reach the threshold value, constructing the batch data from the request event data buffered in the first message queue within a preset time interval.
4. The method of claim 1, wherein receiving the concurrent service requests related to the artificial intelligence service further comprises determining threads respectively corresponding to the service requests; wherein, after sending the request event data to the first message queue for caching the service requests, the method further comprises:
setting the threads corresponding to the service requests related to the request event data to a blocked state; and
wherein feeding back the processing result to the corresponding service request comprises: determining the thread corresponding to the request event data, setting the thread to an awakened state, and feeding back the processing result to the corresponding service request through the thread.
5. The method of claim 1, wherein the request event data includes a response attribute, and wherein sending the processing result to the third message queue for caching the processing result comprises:
adding the processing results to the response attributes of the corresponding request event data respectively; and
sending the request event data with the added processing results to the third message queue for caching the processing results.
6. The method of claim 4, further comprising:
caching the request event data in an in-memory data storage structure in the form of global key-value pairs, wherein the key of each key-value pair is a thread ID.
7. The method of claim 6, wherein feeding back the processing result to the corresponding service request by the thread comprises:
determining the request event data corresponding to the thread according to the thread ID in the key-value pair;
obtaining the processing result data from the response attribute of the request event data; and
feeding back the processing result to the corresponding service request through the thread.
8. A storage medium comprising a stored program, wherein, when the program is run by a processor, the method of any one of claims 1 to 7 is performed.
9. An apparatus for processing artificial intelligence services, comprising:
a request receiving module, configured to receive concurrent service requests related to an artificial intelligence service and determine, according to the service requests, request event data to be submitted to a preset artificial intelligence processing system;
a first sending module, configured to send the request event data to a first message queue for caching the service requests;
a second sending module, configured to monitor the first message queue, construct batch data from the request event data cached in the first message queue, and send the batch data to a second message queue for caching the batch data;
a third sending module, configured to monitor the second message queue and send the batch data in the second message queue to the artificial intelligence processing system;
a fourth sending module, configured to receive processing results corresponding to the request event data from the artificial intelligence processing system and send the processing results to a third message queue for caching the processing results; and
a feedback module, configured to monitor the third message queue and feed back the processing results to the corresponding service requests.
10. An apparatus for processing artificial intelligence services, comprising:
a processor; and
a memory, coupled to the processor, for providing the processor with instructions for processing the following steps:
receiving concurrent service requests related to artificial intelligence services, and determining request event data needing to be submitted to a preset artificial intelligence processing system according to the service requests;
sending the request event data to a first message queue for caching service requests;
monitoring the first message queue, constructing batch data according to the request event data cached in the first message queue, and sending the batch data to a second message queue for caching the batch data;
monitoring the second message queue, and sending the batch data in the second message queue to the artificial intelligence processing system;
receiving a processing result corresponding to the request event data from the artificial intelligence processing system, and sending the processing result to a third message queue for caching the processing result; and
monitoring the third message queue, and feeding back the processing result to the corresponding service request.
CN202011280138.3A 2020-11-16 2020-11-16 Method, device and storage medium for processing artificial intelligence service Pending CN114510299A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011280138.3A CN114510299A (en) 2020-11-16 2020-11-16 Method, device and storage medium for processing artificial intelligence service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011280138.3A CN114510299A (en) 2020-11-16 2020-11-16 Method, device and storage medium for processing artificial intelligence service

Publications (1)

Publication Number Publication Date
CN114510299A true CN114510299A (en) 2022-05-17

Family

ID=81546017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011280138.3A Pending CN114510299A (en) 2020-11-16 2020-11-16 Method, device and storage medium for processing artificial intelligence service

Country Status (1)

Country Link
CN (1) CN114510299A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024022087A1 (en) * 2022-07-27 2024-02-01 BOE Technology Group Co., Ltd. Method and apparatus for data processing, and storage medium
CN117938787A (en) * 2024-03-25 2024-04-26 National University of Defense Technology Simulation system for batch transmission of messages and implementation method thereof


Similar Documents

Publication Publication Date Title
CN111163018B (en) Network equipment and method for reducing transmission delay thereof
CN111414516A (en) Live broadcast room message processing method and device, electronic equipment and storage medium
CN110297944B (en) Distributed XML data processing method and system
CN114510299A (en) Method, device and storage medium for processing artificial intelligence service
CN108228625B (en) Push message processing method and device
CN114461382A (en) Flexibly configurable computing power scheduling implementation method and device and storage medium
CN109788251B (en) Video processing method, device and storage medium
CN109710502B (en) Log transmission method, device and storage medium
CN104065684B (en) Information processing method, electronic equipment and terminal device
CN113783913A (en) Message pushing management method and device
CN114285906B (en) Message processing method and device, electronic equipment and storage medium
CN116303303A (en) Batch data processing method, device, equipment and medium
CN111901561B (en) Video data processing method, device and system in monitoring system and storage medium
CN115334001A (en) Data resource scheduling method and device based on priority relation
CN115357179A (en) Display screen management method, display screen management device, electronic equipment and storage medium
CN113079152B (en) Data transmission method, device and medium
CN105792235A (en) Data flow statistical method and device
CN113094131A (en) Prompt resource display method, device, terminal, server and storage medium
CN114138476A (en) Processing method and device of pooled resources, electronic equipment and medium
CN113076380A (en) Data synchronization method, device, system, equipment and storage medium
CN111988403A (en) Request processing method and system of electronic equipment, storage medium and electronic equipment
CN111277302A (en) Electronic business card interchange method and device, computer equipment and storage medium
CN113190745B (en) Information sending method, information sending device, electronic equipment and computer readable medium
CN114676189A (en) Data chart display method and device, storage medium and electronic device
CN117956025A (en) Information pushing method and device, nonvolatile storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination