CN112416575A - Algorithm model scheduling system and method for urban brain AI calculation - Google Patents

Algorithm model scheduling system and method for urban brain AI calculation

Info

Publication number
CN112416575A
CN112416575A (application CN202011204562.XA)
Authority
CN
China
Prior art keywords
algorithm
model
module
resources
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011204562.XA
Other languages
Chinese (zh)
Inventor
Mei Yiduo (梅一多)
Zheng Xinying (郑新颖)
He Bin (何彬)
Luo Jianmeng (罗建萌)
Wang Bo (王博)
Wang Yinle (王崟乐)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongguancun Smart City Co Ltd
Original Assignee
Zhongguancun Smart City Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongguancun Smart City Co Ltd
Priority to CN202011204562.XA
Publication of CN112416575A
Legal status: Pending

Classifications

    • G06F9/505: Allocation of resources (e.g. of the CPU) to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering the load
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine (e.g. CPUs, servers, terminals)
    • G06F11/3024: Monitoring arrangements where the monitored computing-system component is a central processing unit [CPU]
    • G06F11/3034: Monitoring arrangements where the monitored computing-system component is a storage system (e.g. DASD based or network based)
    • G06F11/3037: Monitoring arrangements where the monitored computing-system component is a memory (e.g. virtual memory, cache)
    • G06F8/65: Software deployment; updates
    • G06F8/71: Software maintenance or management; version control; configuration management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an algorithm model scheduling system and method for urban brain AI calculation. The system comprises: a resource monitoring module, for monitoring, in real time, the resource usage on which the AI algorithms and AI models depend; an automatic packing module, for containerized deployment of different AI algorithms and AI models; and a resource allocation module, for comparing the resource usage monitored by the resource monitoring module against a preset threshold and, based on the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models according to a preset resource allocation strategy. By monitoring resource usage in real time and adjusting the allocated resources, the system provides optimal resource configuration for the AI algorithms and AI models, ensuring execution efficiency while avoiding resource waste. Containerized deployment allows the different algorithms and models supplied by third-party vendors to be packaged, deployed independently, and run independently, so their resources do not interfere with one another.

Description

Algorithm model scheduling system and method for urban brain AI calculation
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to an algorithm model scheduling system and method for urban brain AI (artificial intelligence) calculation.
Background
A smart city is a demand-driven, data-based solution for the optimal allocation of urban resources: limited resources are applied to effectively unlimited urban services. Using real-time data and other information, a city's public resources are comprehensively allocated and regulated, ultimately making city operation intelligent and optimizing operational efficiency. The "urban brain", as the city-level platform and core component of smart-city infrastructure, arose from the demands of rapid smart-city development and has become a prerequisite for the comprehensive development of new smart cities. At present, the urban brain aggregates multi-source urban data from government, enterprises, and society into an urban-brain data lake; using AI algorithms and AI models, it can dynamically sense the city's operating signs and monitor the city's operating state in real time, covering fields such as government services, traffic operation, the ecological environment, social governance, medical care, and education.
At present, the resources consumed by the AI models and AI algorithms of the various fields in the urban brain are pre-allocated in advance; they cannot be monitored in real time or adjusted dynamically, so optimal resource configuration cannot be achieved. For example, when an emergency occurs, the required resources surge within a short time; once they exceed the pre-allocated limit, the system cannot respond in time, causing blocking and delay.
Disclosure of Invention
Aiming at this technical problem, the invention provides an algorithm model scheduling system and method for urban brain AI calculation. By monitoring system resource consumption in real time, it achieves unified resource management, makes efficient use of the capabilities of the AI models and AI algorithms, improves the AI construction of the urban brain, and promotes the development of smart cities.
The technical solution for solving the above technical problem is as follows:
An algorithm model scheduling system for urban brain AI computation, comprising:
the resource monitoring module is used for monitoring, in real time, the resource usage on which the AI algorithms and AI models depend;
the automatic packing module is used for containerized deployment of different AI algorithms and AI models;
and the resource allocation module is used for comparing the resource usage monitored by the resource monitoring module against a preset threshold and, based on the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models according to a preset resource allocation strategy.
The invention has the following beneficial effects: by monitoring resource usage in real time and adjusting the allocated resources, optimal resource configuration is provided for the AI algorithms and AI models, ensuring execution efficiency while avoiding resource waste; through containerized deployment of the AI algorithms and AI models, the different algorithms and models supplied by third-party vendors can be packaged, deployed independently, and run independently, so their resources do not interfere with one another.
On the basis of the above technical solution, the invention can be further improved as follows.
Further, the system further comprises:
and the self-repairing module is used for restoring a containerized AI algorithm or AI model according to a preset recovery strategy when it crashes abnormally.
The advantage of this further solution is high availability of the core AI algorithms and AI models: real-time monitoring discovers abnormal conditions in time and repairs them automatically.
Further, the system further comprises:
the service discovery module is used for registering the AI algorithms and AI models with a unified registration center;
and the load balancing module is used for distributing the resource usage of each load through a load balancing mechanism.
The advantage of this further solution is that the service discovery and load balancing modules provide a more convenient communication mode and a more reasonable processing mechanism for the AI algorithms and AI models.
Further, the system further comprises:
the automatic release module is used for rolling-updating the AI algorithms and AI models according to the current load conditions, completing pipeline-based automatic release of the AI algorithms and AI models;
and the automatic rollback module is used for automatically rolling back to the previous version when release of the current version of an AI algorithm or AI model fails.
The advantage of this further solution is that automatic release and automatic rollback improve the availability of the AI algorithms and AI models, achieving 24/7 high availability.
Further, the system further comprises:
the permission control module is used for setting, for different users, access permissions on the resources of different namespaces;
and the configuration management module is used for unified configuration management of common resources across different environments, with configurations persisted and updated in real time.
The advantage of this further solution is more flexible control and management of the AI algorithms and AI models through permission control and configuration management, avoiding mutual interference.
Further, the system further comprises:
and the storage orchestration module is used for allocating storage resources according to the size of the data files.
The advantage of this further solution is that storage resources are allocated sensibly according to file-size characteristics, so storage is not wasted and data-read speed improves.
Further, the system further comprises:
the timed task module is used for periodically triggering synchronization tasks for third-party vendor data;
and the message trigger module is used for setting a data synchronization order based on business requirements, with a message mechanism ensuring that data synchronization executes in that order.
The advantage of this further solution is that the timed tasks and message triggering shield the technical details of third-party vendors, and the most suitable third-party technology is used for each scenario.
To achieve the above object, the invention further provides an algorithm model scheduling method for urban brain AI computation, comprising:
monitoring, in real time, the resource usage on which the AI algorithms and AI models depend;
deploying different AI algorithms and AI models in containers;
and comparing the monitored resource usage against a preset threshold and, based on the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models according to a preset resource allocation strategy.
Further, the method further comprises:
and, when a containerized AI algorithm or AI model crashes abnormally, restoring it according to a preset recovery strategy.
Further, the method further comprises:
registering the AI algorithms and AI models with a unified registration center;
and distributing the resource usage of each load through a load balancing mechanism.
Further, the method further comprises:
rolling-updating the AI algorithms and AI models according to the current load conditions, completing pipeline-based automatic release of the AI algorithms and AI models;
and automatically rolling back to the previous version when release of the current version of an AI algorithm or AI model fails.
Further, the method further comprises:
setting, for different users, access permissions on the resources of different namespaces;
and performing unified configuration management of common resources across different environments, with configurations persisted and updated in real time.
Further, the method further comprises:
and allocating storage resources according to the size of the data file.
Further, the method further comprises:
periodically triggering synchronization tasks for third-party vendor data;
and setting a data synchronization order based on business requirements, with a message mechanism ensuring that data synchronization executes in that order.
Drawings
Fig. 1 is a block diagram of an algorithm model scheduling system for urban brain AI computation according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a working principle of an algorithm model scheduling system for urban brain AI computation according to an embodiment of the present invention;
fig. 3 is a flowchart of an algorithm model scheduling method for urban brain AI computation according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the accompanying drawings; the embodiments are given by way of illustration only and are not intended to limit the scope of the invention.
The algorithm model scheduling system for urban brain AI calculation provided by the embodiment of the present invention is applicable to an "urban brain" system that has one or more providers and users of AI algorithms and models across multiple fields. As shown in fig. 1, the system includes:
the resource monitoring module is used for monitoring, in real time, the resource usage on which the AI algorithms and AI models depend;
the automatic packing module is used for containerized deployment of different AI algorithms and AI models;
and the resource allocation module is used for comparing the resource usage monitored by the resource monitoring module against a preset threshold and, based on the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models according to a preset resource allocation strategy.
Specifically, the resource monitoring module serves as a monitoring component that tracks, in real time, the usage of the essential resources on which the AI algorithms and AI models depend, such as CPU load, memory load, disk load, network traffic, and GPU load. Once resource usage is found to exceed the early-warning threshold, the resource allocation module is invoked to automatically reallocate resources to the relevant AI algorithms and AI models so as to respond in real time. For example, as shown in fig. 2, when traffic surges beyond the processing capacity provisioned for the AI algorithm and AI model, the resource allocation module, acting on the resource monitoring module's request and following a preset expansion strategy, horizontally expands the AI algorithm and AI model beyond the baseline resources used for normal traffic, so as to handle the traffic peak; when traffic falls, it horizontally contracts them and recycles the unneeded resources, so that resources are used sensibly.
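The threshold comparison and horizontal expansion/contraction described above can be sketched as a simple scaling rule. The Python sketch below is purely illustrative and not part of the patent; the names `scale_decision`, `threshold_high`, `threshold_low`, and `max_replicas` are assumptions:

```python
import math

def scale_decision(usage, threshold_high=0.8, threshold_low=0.3,
                   replicas=1, max_replicas=10):
    """Return a new replica count for a containerized AI model, given the
    monitored resource-usage ratio (0.0-1.0) of its current replicas."""
    if usage > threshold_high:
        # horizontal expansion: grow proportionally to the overload
        return min(max_replicas, math.ceil(replicas * usage / threshold_high))
    if usage < threshold_low and replicas > 1:
        # horizontal contraction: release one unneeded replica at a time
        return max(1, replicas - 1)
    return replicas
```

For instance, two replicas running at 90% usage against an 80% threshold would expand to three, while three replicas at 20% usage would contract to two.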
The automatic packing module containerizes the AI algorithms and AI models supplied by different providers, encapsulating the underlying implementation details. By adapting to different scenarios and selecting the appropriate model for each, the best effect is achieved.
With the above algorithm model scheduling system for urban brain AI calculation, the allocated resources are adjusted by monitoring resource usage in real time, providing optimal resource configuration for the AI algorithms and AI models and ensuring execution efficiency while avoiding resource waste; through containerized deployment, the different algorithms and models supplied by third-party vendors can be packaged, deployed independently, and run independently, so their resources do not interfere with one another.
Optionally, in this embodiment, as shown in fig. 1, the system further includes:
and the self-repairing module is used for restoring a containerized AI algorithm or AI model according to a preset recovery strategy when it crashes abnormally.
Specifically, when a containerized AI algorithm or AI model crashes abnormally, the self-repairing module restarts it in real time: the old resources are recycled, new resources are allocated, the image is pulled again, and a container is created, so that the AI algorithm and AI model are served again. The recovery strategy for this process can be set in advance.
Through the self-repairing module, high availability of the core AI algorithms and AI models is achieved; real-time monitoring discovers abnormal conditions in time and repairs them automatically.
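The restart-on-crash behavior of the self-repairing module can be sketched as a bounded retry loop. This sketch is illustrative only; `self_repair`, `max_restarts`, and the exponential backoff are assumed details standing in for the preset recovery strategy:

```python
import time

def self_repair(run_model, max_restarts=3, backoff=0.01):
    """Run a containerized AI model callable; on an abnormal crash,
    recycle and restart it under a simple bounded-retry recovery strategy."""
    for attempt in range(max_restarts + 1):
        try:
            return run_model()
        except RuntimeError:
            if attempt == max_restarts:
                raise                      # recovery strategy exhausted
            # stand-in for: recycle old resources, re-pull image, recreate container
            time.sleep(backoff * (2 ** attempt))
```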
Optionally, in this embodiment, as shown in fig. 1, the system further includes:
the service discovery module is used for registering the AI algorithms and AI models with a unified registration center;
and the load balancing module is used for distributing the resource usage of each load through a load balancing mechanism.
Specifically, each AI algorithm and AI model registers itself with the unified registration center, so that other AI algorithms and AI models can conveniently discover the interfaces to call, communicate with each other, cooperate, and handle the business requirements of different scenarios.
The load balancing mechanism prevents all traffic from landing on the same instance; instead, traffic is distributed sensibly according to the actual usage of each load.
Through the service discovery and load balancing modules, a more convenient communication mode and a more reasonable processing mechanism can be provided for the AI algorithms and AI models.
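A minimal sketch of a registration center with least-loaded balancing follows; the `Registry` class and its method names are hypothetical, since the patent does not prescribe a concrete data structure:

```python
class Registry:
    """Unified registration center: AI algorithms/models register their
    endpoints so peers can discover them; reported loads drive balancing."""
    def __init__(self):
        self.instances = {}  # service name -> {endpoint: current load}

    def register(self, name, endpoint):
        self.instances.setdefault(name, {})[endpoint] = 0

    def report_load(self, name, endpoint, load):
        self.instances[name][endpoint] = load

    def pick(self, name):
        # least-loaded balancing: route the next request to the
        # lightest registered instance of the named service
        return min(self.instances[name], key=self.instances[name].get)
```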
Optionally, in this embodiment, as shown in fig. 1, the system further includes:
the automatic release module is used for rolling-updating the AI algorithms and AI models according to the current load conditions, completing pipeline-based automatic release of the AI algorithms and AI models;
and the automatic rollback module is used for automatically rolling back to the previous version when release of the current version of an AI algorithm or AI model fails.
Specifically, the AI algorithms and AI models are released automatically through a continuous-integration (DevOps) pipeline. When a release is needed, rolling updates proceed according to the current load conditions, avoiding unavailability of the AI algorithms and AI models to the greatest extent.
When a release fails, the automatic rollback module automatically rolls back to the previous version, preventing a failed release from making an AI algorithm or AI model unavailable.
The automatic release and automatic rollback modules improve the availability of the AI algorithms and AI models, achieving 24/7 high availability.
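The rolling update with automatic rollback can be sketched as follows; `rolling_update` and `health_check` are hypothetical names, and a real system would update replicas gradually while they continue serving traffic:

```python
def rolling_update(replicas, new_version, health_check):
    """Update replicas one at a time; if an updated replica fails its
    health check, roll every replica back to the previous version."""
    previous = list(replicas)          # snapshot for automatic rollback
    for i in range(len(replicas)):
        replicas[i] = new_version
        if not health_check(new_version):
            replicas[:] = previous     # release failed: restore old version
            return False
    return True
```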
Optionally, in this embodiment, as shown in fig. 1, the system further includes:
the permission control module is used for setting, for different users, access permissions on the resources of different namespaces;
and the configuration management module is used for unified configuration management of common resources across different environments, with configurations persisted and updated in real time.
Specifically, the permission control module isolates resource quotas through namespaces and assigns the available permissions through accounts and roles, so that different AI algorithms and AI models do not interfere with one another.
The configuration management module applies unified configuration management to common resources across environments such as development, testing, pre-production, and production, and persists the configuration. Configurations are updated in real time and take effect immediately.
Through the permission control and configuration management modules, the AI algorithms and AI models are controlled and managed more flexibly, and mutual interference is avoided.
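A minimal sketch of namespace-scoped permission control follows, assuming a hypothetical `PermissionControl` class that grants roles per user and namespace:

```python
class PermissionControl:
    """Namespace-scoped access control: each user is granted roles per
    namespace, isolating the resources of different AI algorithms/models."""
    def __init__(self):
        self.grants = {}  # (user, namespace) -> set of granted roles

    def grant(self, user, namespace, role):
        self.grants.setdefault((user, namespace), set()).add(role)

    def allowed(self, user, namespace, role):
        # access is denied unless an explicit grant exists
        return role in self.grants.get((user, namespace), set())
```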
Optionally, in this embodiment, as shown in fig. 1, the system further includes:
and the storage orchestration module is used for allocating storage resources according to the size of the data files.
Specifically, the data come in many types, including image, video, text, and voice data. Allocating storage resources sensibly according to the size characteristics of the data files avoids wasting storage and improves data-read speed.
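Size-based storage allocation can be sketched as a tiering rule; the tier names and size cutoffs below are invented purely for illustration and are not specified by the patent:

```python
def storage_class(size_bytes):
    """Choose a storage tier by data-file size, so that large media files
    (e.g. video) and small text/config files use different resources."""
    MB = 1 << 20
    if size_bytes < 1 * MB:
        return "small-object-store"   # text and configuration files
    if size_bytes < 100 * MB:
        return "block-storage"        # images and audio
    return "large-file-store"         # video archives
```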
Optionally, in this embodiment, as shown in fig. 1, the system further includes:
the timed task module is used for periodically triggering synchronization tasks for third-party vendor data;
and the message trigger module is used for setting a data synchronization order based on business requirements, with a message mechanism ensuring that data synchronization executes in that order.
Specifically, the timed task module triggers data synchronization tasks on a schedule. To decouple the integration with third-party vendors, their data is synchronized via timed tasks, which are configured through the interface of a timed-task configuration platform.
Because data synchronization is ordered and tasks have execution prerequisites, the message trigger module uses message triggering for the data synchronization order, the task execution order, and so on, ensuring timeliness and avoiding pointless waiting.
Through the timed task and message trigger modules, the technical details of third-party vendors are shielded, and the most suitable third-party technology is used for each scenario.
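The ordered, message-triggered synchronization can be sketched with a FIFO queue; `run_sync` is a hypothetical name, and a production system would use a message broker rather than an in-process queue:

```python
from queue import Queue

def run_sync(tasks):
    """Execute third-party data-synchronization tasks strictly in the
    order in which business requirements enqueued them (FIFO messages)."""
    q, done = Queue(), []
    for t in tasks:
        q.put(t)                 # the message trigger preserves ordering
    while not q.empty():
        done.append(q.get()())   # each task runs after its predecessor
    return done
```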
The embodiment of the present invention further provides an algorithm model scheduling method for urban brain AI computation, as shown in fig. 3, the method includes:
S1, monitoring, in real time, the resource usage on which the AI algorithms and AI models depend;
S2, deploying different AI algorithms and AI models in containers;
and S3, comparing the monitored resource usage against a preset threshold and, based on the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models according to a preset resource allocation strategy.
Optionally, in this embodiment, the method further includes:
and S4, when a containerized AI algorithm or AI model crashes abnormally, restoring it according to a preset recovery strategy.
Optionally, in this embodiment, the method further includes:
S5, registering the AI algorithms and AI models with a unified registration center;
and S6, distributing the resource usage of each load through a load balancing mechanism.
Optionally, in this embodiment, the method further includes:
S7, rolling-updating the AI algorithms and AI models according to the current load conditions, completing pipeline-based automatic release of the AI algorithms and AI models;
and S8, automatically rolling back to the previous version when release of the current version of an AI algorithm or AI model fails.
Optionally, in this embodiment, the method further includes:
S9, setting, for different users, access permissions on the resources of different namespaces;
and S10, performing unified configuration management of common resources across different environments, with configurations persisted and updated in real time.
Optionally, in this embodiment, the method further includes:
and S11, allocating storage resources according to the size of the data file.
Optionally, in this embodiment, the method further includes:
S12, periodically triggering synchronization tasks for third-party vendor data;
and S13, setting a data synchronization order based on business requirements, with a message mechanism ensuring that data synchronization executes in that order.
The following provides a specific application example based on the above embodiment:
A vehicle-identification algorithm supplied by algorithm provider A is used in the services of registered user A in a certain field. Under normal conditions, the algorithm can process 500 camera streams. When an emergency occurs, traffic surges and 1500 camera streams must be processed simultaneously; the system detects the traffic peak in real time and expands capacity threefold in real time to handle the surge of processing tasks.
When the traffic peak passes and the number of streams processed in real time returns to normal, the system automatically contracts capacity, avoiding resource waste and using resources sensibly and effectively.
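The threefold expansion in this example is simply the ceiling of the peak-to-normal traffic ratio; the following sketch (with hypothetical names) makes the arithmetic explicit:

```python
import math

def expansion_factor(normal_streams, peak_streams):
    """Horizontal expansion factor for a traffic surge: 500 camera
    streams surging to 1500 requires a threefold expansion."""
    return math.ceil(peak_streams / normal_streams)
```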
The reader should understand that in the description of this specification, reference to the description of the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, for the specific working processes of the modules and units in the foregoing system embodiment, reference may be made to the corresponding processes in the foregoing method embodiment, which are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the part of the technical solution of the present invention that contributes to the prior art, or the technical solution in whole or in part, can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An algorithm model scheduling system for urban brain AI computation, comprising:
the resource monitoring module, which is used for monitoring in real time the usage of the resources on which the AI algorithms and AI models depend;
the automatic packing module, which is used for performing containerized deployment of different AI algorithms and AI models;
and the resource allocation module, which is used for comparing the resource usage monitored by the resource monitoring module with a preset threshold and, according to the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models following a preset resource allocation strategy.
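The threshold comparison and reallocation performed by the resource allocation module of claim 1 can be illustrated by a minimal sketch. The patent discloses no code, so the function name, the two thresholds, and the scaling step below are illustrative assumptions only:

```python
# Hypothetical sketch of the resource allocation module in claim 1:
# compare monitored usage against preset thresholds and rescale the
# resources granted to an AI algorithm or AI model accordingly.

def reallocate(monitored_usage: float, allocated: float,
               high_threshold: float = 0.8, low_threshold: float = 0.3,
               step: float = 1.5) -> float:
    """Return a new resource allocation based on current utilization."""
    utilization = monitored_usage / allocated
    if utilization > high_threshold:   # overloaded: scale resources up
        return allocated * step
    if utilization < low_threshold:    # underused: scale resources down
        return allocated / step
    return allocated                   # within bounds: keep the allocation

# Example: 9 units used out of 10 allocated exceeds the 0.8 threshold,
# so the allocation is scaled up by the step factor.
print(reallocate(9.0, 10.0))  # → 15.0
```

The preset resource allocation strategy of the claim corresponds here to the pair of thresholds plus the scaling step; a real scheduler would derive these from per-model policy rather than constants.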
2. The system of claim 1, further comprising:
and the self-repairing module, which is used for restoring a containerized AI algorithm or AI model according to a preset recovery strategy when it experiences an abnormal delay.
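The self-repair behavior of claim 2 can be sketched as a heartbeat check: any containerized model whose last heartbeat is older than a tolerance is handed to a recovery callback. The names, the heartbeat representation, and the delay tolerance are assumptions, not taken from the patent:

```python
import time

def self_repair(heartbeats, max_delay=5.0, restart=lambda name: None):
    """Restart any containerized model whose heartbeat is abnormally delayed.

    heartbeats maps a model name to the timestamp of its last heartbeat;
    restart stands in for the preset recovery strategy of claim 2.
    """
    now = time.time()
    restarted = []
    for name, last_beat in heartbeats.items():
        if now - last_beat > max_delay:  # abnormal delay detected
            restart(name)                # apply the recovery strategy
            restarted.append(name)
    return restarted
```

In a containerized deployment this role is typically delegated to the orchestrator's liveness checks rather than hand-rolled loops; the sketch only shows the decision rule.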
3. The system of claim 1, further comprising:
the service discovery module, which is used for registering the AI algorithms and AI models with a unified registration center;
and the load balancing module, which is used for distributing the resource usage across the loads through a load balancing mechanism.
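The registration center and load balancing mechanism of claim 3 can be illustrated together: instances register under a model name, and a round-robin cycle spreads requests across them. The class, endpoint strings, and the round-robin choice are illustrative assumptions; the patent does not specify a balancing algorithm:

```python
import itertools

class Registry:
    """Sketch of a unified registration center with round-robin balancing."""

    def __init__(self):
        self._instances = {}

    def register(self, model, endpoint):
        # service discovery: record each instance under its model name
        self._instances.setdefault(model, []).append(endpoint)

    def balancer(self, model):
        # load balancing: cycle over the registered instances in turn
        return itertools.cycle(self._instances[model])

reg = Registry()
reg.register("detector", "10.0.0.1")
reg.register("detector", "10.0.0.2")
pick = reg.balancer("detector")
print(next(pick), next(pick), next(pick))  # → 10.0.0.1 10.0.0.2 10.0.0.1
```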
4. The system of claim 1, further comprising:
the automatic release module, which is used for performing rolling updates of the AI algorithms and AI models according to the current load, completing automated pipeline releases of the AI algorithms and AI models;
and the automatic rollback module, which is used for automatically rolling back to the previous version when the release of the current version of an AI algorithm or AI model fails.
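The release-with-rollback behavior of claim 4 reduces to a simple invariant: if deploying the new version fails, redeploy the previous one. The function and the failure signal (an exception) are illustrative assumptions; real pipelines would gate on health checks:

```python
def release(deploy, current_version, new_version):
    """Attempt a release; on failure, automatically roll back.

    deploy stands in for the rolling-update step of the release pipeline
    and is assumed to raise an exception when the release fails.
    """
    try:
        deploy(new_version)
        return new_version
    except Exception:
        deploy(current_version)  # automatic rollback to the previous version
        return current_version

# Example with a deploy step that rejects version "v2":
def deploy(version):
    if version == "v2":
        raise RuntimeError("health check failed")

print(release(deploy, "v1", "v2"))  # → v1 (rolled back)
```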
5. The system of claim 1, further comprising:
the permission control module, which is used for setting, for different users, access permissions on the resources of different namespaces;
and the configuration management module, which is used for performing unified configuration management of public resources for different environments, persisting and updating the configuration in real time.
6. The system of claim 1, further comprising:
and the storage orchestration module, which is used for allocating storage resources according to the size of the data files.
7. The system of any one of claims 1 to 6, further comprising:
the timed task module, which is used for periodically triggering synchronization tasks for third-party vendor data;
and the message triggering module, which is used for setting a synchronization order for the data based on service requirements, a message mechanism ensuring that data synchronization is executed in that order.
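The ordered synchronization of claim 7 can be sketched with a queue that fixes the execution sequence of the vendor-data tasks. The queue-based realization and the task names are assumptions; the patent only requires that a message mechanism preserve the configured order:

```python
from collections import deque

def run_sync(order, sync_fn):
    """Execute data-synchronization tasks strictly in the configured order.

    order is the synchronization sequence set from service requirements;
    sync_fn performs one vendor's data synchronization.
    """
    queue = deque(order)  # the message queue fixes the sequence
    done = []
    while queue:
        task = queue.popleft()
        sync_fn(task)     # each sync completes before the next is taken
        done.append(task)
    return done

# Example: two vendors synchronized in the required order.
print(run_sync(["vendor_a", "vendor_b"], lambda task: None))
```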
8. An algorithm model scheduling method for urban brain AI calculation is characterized by comprising the following steps:
monitoring in real time the usage of the resources on which the AI algorithms and AI models depend;
performing containerized deployment of different AI algorithms and AI models;
and comparing the monitored resource usage with a preset threshold and, according to the comparison result, automatically reallocating resources to the relevant AI algorithms and AI models following a preset resource allocation strategy.
9. The method of claim 8, further comprising:
and when a containerized AI algorithm or AI model experiences an abnormal delay, restoring it according to a preset recovery strategy.
10. The method of claim 8, further comprising:
registering the AI algorithms and AI models with a unified registration center;
and distributing the resource usage across the loads through a load balancing mechanism.
CN202011204562.XA 2020-11-02 2020-11-02 Algorithm model scheduling system and method for urban brain AI calculation Pending CN112416575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011204562.XA CN112416575A (en) 2020-11-02 2020-11-02 Algorithm model scheduling system and method for urban brain AI calculation


Publications (1)

Publication Number Publication Date
CN112416575A true CN112416575A (en) 2021-02-26

Family

ID=74827321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011204562.XA Pending CN112416575A (en) 2020-11-02 2020-11-02 Algorithm model scheduling system and method for urban brain AI calculation

Country Status (1)

Country Link
CN (1) CN112416575A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776005A * 2016-11-23 2017-05-31 华中科技大学 Resource management system and method for containerized applications
CN107256178A * 2017-04-27 2017-10-17 北京数人科技有限公司 Container management platform
CN107943559A * 2017-11-21 2018-04-20 广东奥飞数据科技股份有限公司 Big data resource scheduling system and method
CN109615081A * 2018-09-26 2019-04-12 阿里巴巴集团控股有限公司 Model prediction system and method
CN111026774A * 2019-12-03 2020-04-17 深圳前海环融联易信息科技服务有限公司 Data sequence synchronization method and device, computer equipment and storage medium
CN111614785A * 2020-06-03 2020-09-01 成都智视慧语科技有限公司 Edge AI (Artificial Intelligence) computing cluster based on micro-container cloud


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ma Wei (chief editor), Li Weiwei (associate editor): "Hotel Information Intelligentization" (National Tourism Higher Education Institutions Premium Course Series Textbook, 《全国旅游高等院校精品课程系列教材 酒店信息智能化》), 31 August 2018 *

Similar Documents

Publication Publication Date Title
US9319281B2 (en) Resource management method, resource management device, and program product
CN102981931B (en) Backup method and device for virtual machine
CN102902600B (en) Efficient application-aware disaster recovery
CN202798798U (en) High availability system based on cloud computing technology
CN109656742B (en) Node exception handling method and device and storage medium
CN100549964C Information processing device and interrupt processing control method
CN102262591B (en) Garbage collection method and system for memory copy system
CA2686384C (en) Dynamic cli mapping for clustered software entities
CN111338854A (en) Kubernetes cluster-based method and system for quickly recovering data
CN103248696B (en) Dynamic configuration method for virtual resource under a kind of cloud computing environment
CN110597635B (en) Graphics processing resource allocation method, graphics processing resource allocation device, computer equipment and storage medium
CN105760519A (en) Cluster file system and file lock allocation method thereof
CN106572137B (en) Distributed service resource management method and device
CN109324893B (en) Method and device for allocating memory
CN103634128A (en) A configuration method of a virtual machine placing strategy and an apparatus
JP2009230596A (en) User data protection method for server device, server device, and computer program
CN103309796A (en) Monitoring method and device of component object model (COM) object
CN110727508A (en) Task scheduling system and scheduling method
JP2004038516A (en) Work processing system, operation management method and program for performing operation management
CN108304178B (en) Unity 3D-based data set reference method and system
CN113132176B (en) Method for controlling edge node, node and edge computing system
CN115297124A (en) System operation and maintenance management method and device and electronic equipment
CN109656920B (en) Serial number processing method, system, device and storage medium
CN112416575A (en) Algorithm model scheduling system and method for urban brain AI calculation
CN115080309A (en) Data backup system, method, storage medium, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210226