WO2017127976A1 - Method for training and scheduling incremental learning cloud system and related device - Google Patents

Method for training and scheduling incremental learning cloud system and related device

Info

Publication number
WO2017127976A1
Authority
WO
WIPO (PCT)
Prior art keywords
training
task
priority
cloud
model
Prior art date
Application number
PCT/CN2016/071970
Other languages
French (fr)
Chinese (zh)
Inventor
邵云峰
姚骏
薛希俊
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2016/071970
Priority to CN201680018168.2A (CN108027889B)
Publication of WO2017127976A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/19: Recognition using electronic means
    • G06V30/192: Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194: References adjustable by an adaptive method, e.g. learning

Definitions

  • the present invention relates to the field of data processing, and in particular, to a training, scheduling method, and related device for an incremental learning cloud system.
  • the identification information includes the correct rate value, the newly added data amount, and the newly added data type, and the method further includes:
  • the recognition model is provided by the training cloud
  • the recognition cloud can obtain the recognition model in two ways: one is to obtain it directly from the training cloud, that is, the recognition model is sent to the recognition cloud by the training cloud in the cloud system where the recognition cloud is located; the other is to obtain it from the storage device, that is, the recognition model is read by the recognition cloud from the storage device, and the recognition model stored in the storage device is sent by the training cloud in the cloud system where the recognition cloud is located.
  • the training cloud generally does not store all recognition models, but only some recently trained recognition models; all recognition models trained by the training cloud are stored in the storage device, so that the recognition cloud can obtain any recognition model.
  • the identification information includes at least one of a correct rate value, a new data amount, and a new data category.
  • the identification information includes the correct rate value, the newly added data amount, and the newly added data category
  • the first processing module is further configured to:
  • a second receiving module configured to receive unidentified data, which is sent by the user equipment UE or provided by the storage device;
  • the second processing module is further configured to: when the identification information exceeds a preset recognition threshold, send a model training request to the training cloud, so that the training cloud trains the recognition model, where the model training request carries the identification information and the type of the recognition model.
  • the unidentified data stored in the storage device being transmitted by the UE to the storage device.
  • the identification model is read by the identification device from the storage device, and the identification model stored in the storage device is sent by the training device in the cloud system where the identification device is located.
  • FIG. 4 is a diagram of an embodiment of a training method based on an incremental learning cloud system according to an embodiment of the present invention
  • FIG. 2 is an architectural diagram of an incremental learning cloud system according to an embodiment of the present invention.
  • the training cloud mainly trains the existing recognition model, and the parameters required for the training include the data identified by the recognition model and the identification information, wherein the identified data of the recognition model is provided by the storage cloud, as shown in the storage cloud of FIG. 2 .
  • the identification information is provided by the recognition cloud, and the training cloud updates the recognition model based on the two types of information as parameters to provide a more accurate recognition model.
  • the training cloud also backs up the recognition model and its parameters to the storage cloud, as shown by the solid thin arrow from the training cloud to the storage cloud in FIG. 2, so that the storage cloud can provide the recognition model to the recognition cloud.
  • the priority of the training task corresponds to the execution priority of the training task.
  • Other ways of indicating the priority number of the execution priority level are not limited herein.
  • calculating the priority number of the training task according to the identification information may include:
  • the task parameter is assigned to the training task when it is created, and will change with the execution of the training task, such as adding some parameters, changing the value of the original parameter, and the like.
  • the training task is an executing task or a non-executing task; when the training task is an executing task, the task parameters include a task importance parameter and a running time estimation parameter; when the training task is a non-executing task, the task parameters include a task importance parameter, a model parameter, a waiting time parameter, and a running time estimation parameter.
  • the task importance parameter is a manually set parameter, and different values may be set according to the actual data being processed;
  • the model parameter represents the ratio of the model parameter transmission time to the calculation time, or the proportion of the size of the recognition model of the training task relative to all recognition models;
  • the waiting time parameter indicates the length of the waiting time, or the waiting time of the training task as a percentage of the waiting time of all tasks;
  • the running time estimation parameter indicates the estimated length of the calculation time, or the calculation time of the training task as a percentage of the total calculation time of all tasks.
  • for a non-executing task, the priority number is equal to w1 * newly added data amount + w2 * newly added data category + w3 * correct rate value + task parameter of the non-executing task, where the task parameter of the non-executing task may be equal to w7 * task importance parameter - w8 * model parameter + w9 * waiting time parameter - w10 * running time estimation parameter;
  • for an executing task, the priority number is equal to w4 * newly added data amount + w5 * newly added data category + w6 * correct rate value + task parameter of the executing task, where the task parameter of the executing task may be equal to w11 * task importance parameter - w12 * running time estimation parameter.
  • w1 to w3, and w4 to w6 are first weighting factors
  • w7 to w10 and w11 and w12 are second weighting factors
  • w1 to w12 represent weights of corresponding parameters
  • the values of w1 to w12 can be set according to the needs of the training tasks.
  • when two training tasks have the same recognition model type, the training task with the higher trigger priority or the higher execution priority is selected as the training task for that type of model, where the trigger priority is the priority order, set according to the type of the recognition model, of the correct rate value, the newly added data amount, and the newly added data category in the identification information.
  • the training cloud calculates the allocated resources of each candidate task in the candidate task set according to the priority number.
  • the training cloud determines the candidate task that allocates the number of resources not less than the minimum resource number of the candidate task as the priority task.
  • resources are allocated to the set formed by the priority tasks and the non-priority tasks on the basis of the allocated resource numbers calculated for the candidate tasks, so all priority tasks can meet their minimum resource requirements; the non-priority tasks fall into two cases: in the first case, the calculation did not use up all the resources but left some margin, and this margin can cover the shortfall to the minimum resource number of some or all non-priority tasks even after the allocation is completed; in the second case, the calculation has already used up all the resources, and in the actual allocation the non-priority tasks are not allocated their minimum number of resources.
  • in this way the resources can be utilized most effectively, so that training tasks with high priority are executed first with sufficient resources, thereby improving execution efficiency when there are multiple training tasks.
  • the specific steps of data identification include: the new service data (ie, the unidentified data) is stored in the storage cloud;
  • the new business data is identified in the recognition cloud according to the training model that has been obtained, and the recognition result is stored in the storage cloud;
  • the recognition cloud organizes the data into different forms according to the type of trigger: when the correct rate is lower than the threshold, the correctly identified data of all categories and the incorrectly identified data are used for training together; when the newly added data amount is higher than the threshold, the correctly identified data of all categories and the newly added data are used for training together; when the newly added data category is above the threshold, the correctly identified data of all categories and the data of the new categories are used for training together.
  • Table 3 below shows the settings for identification information and task parameters:
  • w1 to w12 of the first weighting factor and the second weighting factor are both set to 1
  • the priority number threshold C1 is set to 0.2
  • the scheduling condition is set to be scheduled every 0.1 hour or when a new task joins or a task is completed.
  • the training resource is set to 100 units of resources, and the minimum number of resources and the optimal number of resources occupied by the specific types of identification models are as shown in Table 4 below:
  • the number of resources to which the "face recognition" task is assigned is 100.
  • the difference between the number of resources allocated to the training task and the minimum number of resources is compared with the threshold number V1 of the resource:
  • the specific resource allocation process includes:
  • the first receiving module 501 can implement step 301 in the embodiment shown in FIG. 3.
  • the identification information includes at least one of a correct rate value, a newly added data amount, and a newly added data category. It can be seen that the identification information is obtained statistically by using a certain type of recognition model to identify certain data; the description of the various values in the identification information is similar to the description of step 301 in the embodiment shown in FIG. 3, and details are not described herein again.
  • the resource allocation module 503 is configured to allocate a training resource to the training task according to the priority number, and execute a corresponding training task according to the execution priority level.
  • after the first receiving module 501 receives the model training request, the first processing module 502 generates a corresponding training task and also obtains the corresponding identification information according to the model training request; the first processing module 502 then calculates a priority number of the training task according to the identification information of the training task and the type of its recognition model, the priority number indicating the execution priority level of the training task. When there are multiple training tasks, because the priority number of a training task is determined by two types of data, namely the identification information of the training task and the type of its recognition model, the resource allocation module 503 schedules the training tasks in the training cloud according to the calculated priority numbers, so that the training resources can be reasonably shared by multiple training tasks, thereby improving training efficiency.
  • the training device for the incremental learning cloud system in the embodiment of the present invention is described above.
  • the identification device for the incremental learning cloud system in the embodiment of the present invention may be one application server or a plurality of application servers; the UE may be connected to the identification device through a network, and the identification device is mainly used to identify data from the UE according to the recognition model generated by the training device.
  • FIG. 6 shows an embodiment of an identification device according to an embodiment of the present invention; the identification device may include:
  • the second receiving module 601 can implement step 401 in the embodiment shown in FIG. 4.
  • the second receiving module 601 is configured to receive the unidentified data read from the storage device, where the unidentified data stored in the storage device is sent by the UE to the storage device; the source of the unidentified data is similar to the description of step 401 in the embodiment shown in FIG. 4, and details are not described herein again.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the field of data processing and provides a training and scheduling method for an incremental learning cloud system and a related device. The method comprises: receiving, by a training cloud, a model training request sent by a recognition cloud; generating a corresponding training task according to identification information and the type of a recognition model (302); calculating, on the basis of the identification information, a priority number of the training task (303), the priority number of the training task corresponding to the execution priority level of the training task; and allocating, according to the priority number, training resources to the training task and executing the corresponding training task according to the execution priority level (304). The invention calculates the priority number of a training task to determine its priority level and schedules the training tasks in the training cloud according to the calculated priority numbers, so that training resources can be reasonably shared by multiple training tasks, thereby increasing training efficiency.

Description

Training and scheduling method for incremental learning cloud system and related device
Technical Field
The present invention relates to the field of data processing, and in particular, to a training and scheduling method for an incremental learning cloud system and a related device.
Background
Machine learning is a method for discovering valuable information in massive amounts of data. With the development of network and sensor technologies, the amount of data keeps growing, and both the amount and the types of data increase over time; therefore, the recognition model used to identify the data also needs to be updated to adapt to the newly added data amount and data types.
At present, an incremental learning method is commonly used: a data storage stores the data provided by a data source; in a data recognition process, a predictor predicts the data from the data storage according to a prediction model provided by a model trainer; and the prediction model update process of the model training device includes training the existing prediction model with the data provided by the data storage to obtain an updated prediction model and providing the updated prediction model to the predictor. Such incremental learning is usually performed intermittently, that is, incremental learning is performed only after a certain trigger condition is reached, for example, when the amount of newly added data reaches a certain amount or a fixed period of time has elapsed; in general, there is a certain time interval between two incremental learning runs.
However, because model training is computationally very intensive, special devices such as a large number of graphics processing units (GPUs) or field-programmable gate arrays (FPGAs) are required to accelerate the model training so that it keeps up with the data being processed by the predictor, and such special devices are costly. Moreover, because the trigger condition is simply the amount of newly added data, when multiple training tasks are triggered there is no corresponding reasonable execution strategy, and the tasks are processed merely in the chronological order in which they were triggered.
Summary of the Invention
The embodiments of the present invention provide a training and scheduling method for an incremental learning cloud system and a related device, which allocate training resources according to trigger conditions and trigger order, so that the training resources can be reasonably shared by multiple training tasks, thereby improving training efficiency.
In view of this, a first aspect of the embodiments of the present invention provides a scheduling method for an incremental learning cloud system, involving a training cloud used for the training and scheduling of training tasks. The scheduling process of the training cloud is as follows: first, a model training request sent by the recognition cloud in the cloud system where the training cloud is located is received, where the model training request carries identification information and the type of a recognition model; then a training task corresponding to the identification information and the type of the recognition model is generated; next, the priority number of the training task is calculated from the identification information, where the priority number corresponds to an execution priority level, that is, a larger priority number means a higher priority level and a smaller priority number means a lower priority level; finally, on the basis of the priority numbers, training resources are allocated to the training tasks according to their corresponding priority numbers, and the corresponding training tasks are executed according to their execution priority levels.
It can be seen that when there are multiple training tasks, because the priority number of a training task is determined by two types of data, namely the identification information of the training task and the type of its recognition model, scheduling the training tasks in the training cloud according to the calculated priority numbers enables the training resources to be reasonably shared by the multiple training tasks, thereby improving training efficiency.
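As a non-authoritative illustration of this flow, the following minimal Python sketch outlines the receive, generate, prioritize, and allocate loop; the names TrainingTask, compute_priority, and allocate_resources are hypothetical placeholders for the calculation and allocation steps detailed later in this document.

```python
from dataclasses import dataclass

@dataclass
class TrainingTask:
    model_type: str            # e.g. "face_recognition"
    identification_info: dict  # correct rate value, newly added data amount/category
    priority: float = 0.0

def handle_model_training_request(request, pending_tasks, compute_priority, allocate_resources):
    """Sketch of the first-aspect flow: generate a training task from the request,
    compute its priority number, then (re)allocate training resources."""
    # 1. Generate a training task from the identification information and model type.
    task = TrainingTask(model_type=request["model_type"],
                        identification_info=request["identification_info"])
    pending_tasks.append(task)

    # 2. Compute the priority number of every pending task from its identification info.
    for t in pending_tasks:
        t.priority = compute_priority(t)

    # 3. Allocate training resources by priority number; higher priority numbers run first.
    return allocate_resources(sorted(pending_tasks, key=lambda t: t.priority, reverse=True))
```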
In some embodiments, the identification information includes at least one of a correct rate value, a newly added data amount, and a newly added data category.
In some embodiments, two training tasks may have the same recognition model type, in which case both tasks would in fact train the same model, and one of them needs to be eliminated. Specifically, the identification information includes the correct rate value, the newly added data amount, and the newly added data category, and the method further includes:
When two training tasks have the same recognition model type, the training task with the higher trigger priority or the higher execution priority is selected as the training task for that type of model, where the trigger priority is the priority order, set according to the type of the recognition model, of the correct rate value, the newly added data amount, and the newly added data category in the identification information. This mechanism of eliminating duplicate training tasks for the same model prevents a model from being trained repeatedly within a short time, thereby saving training resources and increasing training efficiency.
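A minimal sketch of this elimination step, assuming each task is represented as a dictionary carrying its model type, trigger priority, and execution priority (the field names are illustrative only):

```python
def deduplicate_same_model_tasks(tasks):
    """For each recognition-model type, keep only the training task with the highest
    trigger priority, breaking ties by execution priority."""
    best_per_type = {}
    for task in tasks:
        kept = best_per_type.get(task["model_type"])
        candidate_key = (task["trigger_priority"], task["execution_priority"])
        if kept is None or candidate_key > (kept["trigger_priority"], kept["execution_priority"]):
            best_per_type[task["model_type"]] = task
    return list(best_per_type.values())
```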
In some embodiments, the manner in which the training cloud calculates the priority number includes: first determining the task parameters corresponding to the training task; then determining a first weighting factor for the identification information of the training task and a second weighting factor for the task parameters; and finally calculating the priority number of the training task from the identification information, the task parameters, the first weighting factor, and the second weighting factor. The priority number of a training task is related to its identification information and task parameters, and different training tasks have different first and second weighting factors for their identification information and task parameters; these factors make the calculated priority number of the training task more accurate, so that the task can be scheduled better.
In some embodiments, the task parameters differ according to whether the training task is an executing task or a non-executing task: an executing task is a training task that is currently being executed, while a non-executing task is a training task that is ready but has not yet started execution. When the training task is an executing task, the task parameters include a task importance parameter and a running time estimation parameter; when the training task is a non-executing task, the task parameters include a task importance parameter, a model parameter, a waiting time parameter, and a running time estimation parameter. Subdividing the calculation in this way makes the calculation of the priority number more precise and the priority levels of the training tasks more reasonable.
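Based on the weighted-sum form given earlier in this document (w1 to w6 weighting the identification information and w7 to w12 weighting the task parameters), a hedged Python sketch of the priority-number calculation could look as follows; the concrete weights and parameter values are placeholders to be set according to the needs of the training tasks.

```python
def priority_number(ident, params, w, executing):
    """Compute the priority number of a training task.

    ident:     dict with 'new_data_amount', 'new_data_category', 'correct_rate'
    params:    dict with 'importance', 'model', 'waiting_time', 'runtime_estimate'
    w:         dict of weights keyed 'w1' .. 'w12'
    executing: True for an executing task, False for a non-executing task
    """
    if executing:
        # Executing task: w4..w6 weight the identification information,
        # w11 and w12 weight the task parameters.
        info_part = (w['w4'] * ident['new_data_amount'] +
                     w['w5'] * ident['new_data_category'] +
                     w['w6'] * ident['correct_rate'])
        task_part = w['w11'] * params['importance'] - w['w12'] * params['runtime_estimate']
    else:
        # Non-executing task: w1..w3 weight the identification information,
        # w7..w10 weight the task parameters.
        info_part = (w['w1'] * ident['new_data_amount'] +
                     w['w2'] * ident['new_data_category'] +
                     w['w3'] * ident['correct_rate'])
        task_part = (w['w7'] * params['importance'] - w['w8'] * params['model'] +
                     w['w9'] * params['waiting_time'] - w['w10'] * params['runtime_estimate'])
    return info_part + task_part
```

Setting all twelve weights to 1, as in the worked example with the priority number threshold C1 set to 0.2 mentioned elsewhere in this document, reduces the priority number to a plain sum of these terms.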
In some embodiments, after the priority number of each training task is calculated, resources can be allocated to the training tasks. The specific process may include: first, determining the minimum number of resources and the optimal number of resources required by each training task according to the recognition model corresponding to the training task, where the minimum and optimal resource numbers of each training task remain unchanged within one scheduling round; second, setting a preset priority number threshold to further divide the training tasks, and determining the training tasks whose priority numbers differ from the maximum priority number by less than the preset priority number threshold as candidate tasks in a candidate task set;
then, calculating the number of allocated resources of each candidate task in the candidate task set according to the priority numbers, determining a candidate task whose number of allocated resources is not less than its minimum number of resources as a priority task, and determining a candidate task whose number of allocated resources is less than its minimum number of resources, and whose difference between the minimum number of resources and the allocated number of resources is less than a preset resource adjustment threshold, as a non-priority task;
finally, after the priority tasks and the non-priority tasks are determined, resources are allocated according to the priority tasks and the non-priority tasks.
In some embodiments, in the process of determining the priority tasks and the non-priority tasks, a candidate task whose difference between its minimum number of resources and its allocated number of resources is greater than the preset resource adjustment threshold is removed from the candidate task set. Because such training tasks lack a relatively large number of resources, it would be difficult to allocate enough resources to them even if they remained candidate tasks; removing them releases resources that would otherwise have been allocated to them, so that more resources are allocated to the priority tasks and the non-priority tasks, which can then be executed better.
In some embodiments, the priority tasks and the non-priority tasks in the candidate task set are arranged in descending order of priority number, with the priority tasks placed before the non-priority tasks, that is, even if a priority task has a smaller priority number than a non-priority task, it is still ranked before the non-priority task. This arrangement makes the scheduling of the training tasks more reasonable: training tasks that can be executed with fewer resources are satisfied first, so that as many training tasks as possible are executed within the same time period, improving training efficiency.
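The following sketch illustrates the candidate-set construction and the priority/non-priority classification described above; each task is assumed to carry its priority number, minimum resource number, and optimal resource number, and the proportional-to-priority rule used to estimate the allocated resources is one plausible reading of "calculating the allocated resources according to the priority number".

```python
def classify_candidates(tasks, total_resources, priority_threshold, adjust_threshold):
    """Select candidate tasks and split them into priority and non-priority tasks.

    tasks: non-empty list of dicts with 'priority', 'min_resources', 'optimal_resources'.
    """
    max_priority = max(t['priority'] for t in tasks)
    # Candidate tasks: priority number within priority_threshold of the maximum.
    candidates = [t for t in tasks if max_priority - t['priority'] < priority_threshold]

    # Estimate each candidate's allocated resources in proportion to its priority number.
    total_priority = sum(t['priority'] for t in candidates)
    for t in candidates:
        t['allocated'] = total_resources * t['priority'] / total_priority

    priority_tasks, non_priority_tasks = [], []
    for t in candidates:
        shortfall = t['min_resources'] - t['allocated']
        if shortfall <= 0:
            priority_tasks.append(t)          # meets its minimum resource number
        elif shortfall < adjust_threshold:
            non_priority_tasks.append(t)      # small shortfall: kept as a non-priority task
        # otherwise the candidate is removed from the candidate task set entirely

    # Descending priority number, with every priority task placed before the non-priority tasks.
    priority_tasks.sort(key=lambda t: t['priority'], reverse=True)
    non_priority_tasks.sort(key=lambda t: t['priority'], reverse=True)
    return priority_tasks, non_priority_tasks
```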
In some embodiments, the process of allocating resources to the priority tasks and the non-priority tasks may include:
a first resource allocation, in which the training cloud allocates to the priority tasks and the non-priority tasks their respective minimum resource numbers in the order of their priority numbers; if resources remain after the first allocation, a second resource allocation may be performed, in which the training cloud distributes the resources remaining after the first allocation to the priority tasks in proportion to their priority numbers. This allocation manner makes the resource allocation more reasonable; the first and second allocations are both directed at the priority tasks.
In some embodiments, if resources still remain after the second resource allocation, a third allocation may be performed, in which the training cloud distributes the resources remaining after the second allocation, beyond the optimal resource numbers of the priority tasks, to the non-priority tasks in proportion to their priority numbers. Allocating resources to the non-priority tasks only after the priority tasks have been allocated their optimal resource numbers improves the efficiency of resource utilization.
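A hedged sketch of the three successive allocation rounds, continuing the task representation used in the classification sketch above and treating the training resources as a single numeric pool:

```python
def allocate(priority_tasks, non_priority_tasks, total_resources):
    """Three-round allocation: minimum resources first, then top up the priority tasks
    towards their optimum, then hand any remainder to the non-priority tasks."""
    all_tasks = priority_tasks + non_priority_tasks
    for t in all_tasks:
        t['granted'] = 0.0
    remaining = total_resources

    # Round 1: give every task (priority tasks first, then non-priority tasks, each group
    # already in priority-number order) its minimum resource number while resources last.
    for t in all_tasks:
        grant = min(t['min_resources'], remaining)
        t['granted'] += grant
        remaining -= grant

    # Round 2: share what is left among the priority tasks in proportion to their
    # priority numbers, but never beyond each task's optimal resource number.
    if remaining > 0 and priority_tasks:
        total_p = sum(t['priority'] for t in priority_tasks)
        for t in priority_tasks:
            extra = max(0.0, min(remaining * t['priority'] / total_p,
                                 t['optimal_resources'] - t['granted']))
            t['granted'] += extra
        remaining = total_resources - sum(t['granted'] for t in all_tasks)

    # Round 3: anything still left over goes to the non-priority tasks,
    # again in proportion to their priority numbers.
    if remaining > 0 and non_priority_tasks:
        total_np = sum(t['priority'] for t in non_priority_tasks)
        for t in non_priority_tasks:
            t['granted'] += remaining * t['priority'] / total_np

    return all_tasks
```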
In some embodiments, the types of recognition models include at least one of face recognition, image classification, speech analysis, and video classification. The type of the recognition model determines the task parameters of the corresponding training task and the corresponding priority number; it should be understood that recognition models are not limited to the above four types.
A second aspect of the embodiments of the present invention further provides a training method for an incremental learning cloud system, which is mainly applied to the recognition cloud and the training cloud of the incremental learning cloud system. The training method may include: first, the recognition cloud receives unidentified data, where the unidentified data is sent by a UE or provided by a storage device, so the unidentified data has two sources; then, the recognition cloud identifies the unidentified data according to a recognition model, where the recognition model is provided by the training cloud in the cloud system where the recognition cloud is located; next, the identification information of the recognition model used to identify the data is collected statistically during the recognition process, and when the identification information exceeds a preset recognition threshold, the recognition cloud sends a model training request to the training cloud so that the training cloud trains the recognition model, where the model training request carries the identification information and the type of the recognition model.
It can be seen that whether the training cloud trains the recognition model is initiated by the recognition cloud: from the statistics on how the recognition model performs in the process of identifying the unidentified data, that is, the collected identification information, the recognition cloud judges whether model training is needed, and if so, it sends a model training request to the training cloud so that the training cloud trains the recognition model. This makes the training of recognition models targeted, since what is trained is exactly the kind of recognition model that most needs training, and thus makes model training more reasonable.
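A minimal sketch of the triggering logic on the recognition-cloud side, assuming the identification information and the recognition thresholds are plain dictionaries and that send_model_training_request stands in for whatever transport the cloud system actually uses; the comparison directions follow the thresholds described later in this document (accuracy below its threshold, newly added data amount or categories above theirs).

```python
def maybe_request_training(model_type, ident_info, thresholds, send_model_training_request):
    """Send a model training request when any identification statistic crosses its threshold."""
    exceeded = (
        ident_info['correct_rate'] < thresholds['correct_rate'] or        # accuracy too low
        ident_info['new_data_amount'] > thresholds['new_data_amount'] or  # too much new data
        ident_info['new_data_category'] > thresholds['new_data_category'] # too many new categories
    )
    if exceeded:
        # The request carries the identification information and the recognition model type.
        send_model_training_request({'model_type': model_type,
                                     'identification_info': ident_info})
    return exceeded
```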
In some embodiments, besides carrying the identification information in the model training request sent to the training cloud, the recognition cloud may also send the identification information to the storage device for storage, so that the recognition log can be reviewed later and the identification information can be provided to the training cloud.
In some embodiments, the unidentified data received by the recognition cloud comes from the storage device; specifically, the recognition cloud receives the unidentified data read from the storage cloud, where the unidentified data stored in the storage device is sent to the storage device by the UE. This applies to cases where the amount of data is very large, the data is special, or the current recognition load of the recognition cloud is high: the data is stored in the storage cloud in advance so that the recognition cloud can later obtain the unidentified data for recognition.
In some embodiments, although the recognition models are all provided by the training cloud, the recognition cloud can obtain a recognition model in two ways: one is to obtain it directly from the training cloud, that is, the recognition model is sent to the recognition cloud by the training cloud in the cloud system where the recognition cloud is located; the other is to obtain it from the storage device, that is, the recognition model is read by the recognition cloud from the storage device, where the recognition model stored in the storage device is sent by the training cloud in the cloud system where the recognition cloud is located. The training cloud generally does not store all recognition models but only some recently trained ones; all recognition models trained by the training cloud are stored in the storage device, so that the recognition cloud can obtain any recognition model.
In some embodiments, the identification information includes at least one of a correct rate value, a newly added data amount, and a newly added data category, and the recognition threshold includes at least one of a correct rate threshold, a newly added data amount threshold, and a newly added category threshold.
A third aspect of the embodiments of the present invention further provides a training device for an incremental learning cloud system, comprising:
a first receiving module, configured to receive a model training request sent by the recognition cloud in the cloud system where the training cloud is located, where the model training request carries identification information and the type of a recognition model;
a first processing module, configured to generate a corresponding training task according to the identification information and the type of the recognition model;
the first processing module is further configured to calculate a priority number of the training task from the identification information, where the priority number of the training task corresponds to the execution priority level of the training task;
a resource allocation module, configured to allocate training resources to the training task according to the priority number and to execute the corresponding training task according to the execution priority level.
In some embodiments, the identification information includes at least one of a correct rate value, a newly added data amount, and a newly added data category.
In some embodiments, the identification information includes the correct rate value, the newly added data amount, and the newly added data category, and the first processing module is further configured to:
when two training tasks have the same recognition model type, select the training task with the higher trigger priority or the higher execution priority as the training task for that type of model, where the trigger priority is the priority order, set according to the type of the recognition model, of the correct rate value, the newly added data amount, and the newly added data category in the identification information.
In some embodiments, the first processing module is specifically configured to:
determine the task parameters corresponding to the training task;
determine a first weighting factor for the identification information and a second weighting factor for the task parameters;
calculate the priority number of the training task according to the identification information, the task parameters, the first weighting factor, and the second weighting factor.
In some embodiments, the training task is an executing task or a non-executing task;
when the training task is an executing task, the task parameters include a task importance parameter and a running time estimation parameter;
when the training task is a non-executing task, the task parameters include a task importance parameter, a model parameter, a waiting time parameter, and a running time estimation parameter.
In some embodiments, the resource allocation module is specifically configured to:
determine the minimum number of resources and the optimal number of resources required by the training task according to the recognition model corresponding to the training task;
determine the training tasks whose priority numbers differ from the maximum priority number by less than a preset priority number threshold as candidate tasks in a candidate task set;
calculate the number of allocated resources of each candidate task in the candidate task set according to the priority numbers;
determine a candidate task whose number of allocated resources is not less than its minimum number of resources as a priority task;
determine a candidate task whose number of allocated resources is less than its minimum number of resources, and whose difference between the minimum number of resources and the allocated number of resources is less than a preset resource adjustment threshold, as a non-priority task;
allocate corresponding resources to the candidate tasks in the candidate task set according to the priority tasks and the non-priority tasks.
In some embodiments, the resource allocation module is further configured to:
remove, from the candidate task set, a candidate task whose difference between its minimum number of resources and its allocated number of resources is greater than the preset resource adjustment threshold.
In some embodiments, the priority tasks and the non-priority tasks in the candidate task set are arranged in descending order of priority number, with the priority tasks placed before the non-priority tasks.
In some embodiments, the resources are not fully allocated in a single allocation but in several successive allocations, and the resource allocation module is specifically configured to:
in a first resource allocation, allocate to the priority tasks and the non-priority tasks their respective minimum resource numbers in the order of their priority numbers;
in a second resource allocation, distribute the resources remaining after the first allocation to the priority tasks in proportion to their priority numbers.
In some embodiments, if resources still remain after the second allocation, a third allocation may be performed, and the resource allocation module is specifically configured to:
in a third resource allocation, distribute the resources remaining after the second allocation, beyond the optimal resource numbers of the priority tasks, to the non-priority tasks in proportion to their priority numbers.
In some embodiments, the types of recognition models include at least one of face recognition, image classification, speech analysis, and video classification. A fourth aspect of the embodiments of the present invention further provides an identification device for an incremental learning cloud system, which may include:
a second receiving module, configured to receive unidentified data, where the unidentified data is sent by a user equipment (UE) or provided by a storage device;
a second processing module, configured to identify the unidentified data according to a recognition model, where the recognition model is provided by the training device in the cloud system where the identification device is located;
a statistics module, configured to collect identification information for the unidentified data according to the recognition model;
the second processing module is further configured to, when the identification information exceeds a preset recognition threshold, send a model training request to the training cloud so that the training cloud trains the recognition model, where the model training request carries the identification information and the type of the recognition model.
In some embodiments, the identification device further includes:
a sending module, configured to send the identification information to the storage device.
In some embodiments, the second receiving module is specifically configured to:
receive the unidentified data read from the storage device, where the unidentified data stored in the storage device is sent to the storage device by the UE.
In some embodiments, the recognition model is sent to the identification device by the training device in the cloud system where the identification device is located; or
the recognition model is read by the identification device from the storage device, where the recognition model stored in the storage device is sent by the training device in the cloud system where the identification device is located.
In some embodiments, the identification information includes at least one of a correct rate value, a newly added data amount, and a newly added data category, and the recognition threshold includes at least one of a correct rate threshold, a newly added data amount threshold, and a newly added category threshold.
It can be seen from the above technical solutions that the embodiments of the present invention have the following advantages: the embodiments of the present invention are applied to an incremental learning cloud system in which the training cloud, after receiving a model training request, generates a corresponding training task and also obtains the corresponding identification information according to the model training request, and then calculates the priority number of the training task according to the identification information of the training task and the type of its recognition model, where the priority number indicates the execution priority level of the training task. When there are multiple training tasks, because the priority number of a training task is determined by two types of data, namely the identification information of the training task and the type of its recognition model, scheduling the training tasks in the training cloud according to the calculated priority numbers enables the training resources to be reasonably shared by multiple training tasks, thereby improving training efficiency.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an existing incremental learning system;
FIG. 2 is an architectural diagram of an incremental learning cloud system according to an embodiment of the present invention;
FIG. 3 is a diagram of an embodiment of a scheduling method for an incremental learning cloud system according to an embodiment of the present invention;
FIG. 4 is a diagram of an embodiment of a training method based on an incremental learning cloud system according to an embodiment of the present invention;
FIG. 5 is a diagram of an embodiment of a training device according to an embodiment of the present invention;
FIG. 6 is a diagram of an embodiment of an identification device according to an embodiment of the present invention;
FIG. 7 is a diagram of another embodiment of a training device according to an embodiment of the present invention;
FIG. 8 is a diagram of another embodiment of an identification device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention provide a training and scheduling method for an incremental learning cloud system and a related device, which allocate training resources according to trigger conditions and trigger order, so that the training resources can be reasonably shared by multiple training tasks, thereby improving training efficiency.
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention.
Detailed descriptions are provided below.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and accompanying drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable in appropriate cases, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. Moreover, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or modules is not necessarily limited to the steps or modules expressly listed, but may include other steps or modules not expressly listed or inherent to such a process, method, product, or device.
With the development of network and sensor technologies, the amount of data is increasing, and both the amount and the types of data grow over time, so an incremental learning method is needed. Referring to FIG. 1, FIG. 1 is a schematic structural diagram of an existing incremental learning system. The flow indicated by the hollow thick arrows is the data storage process: data generated by the data source is stored by the data storage. The flow indicated by the solid thick arrows is the data prediction process: data read from the data storage is predicted in the predictor according to the prediction model that has already been trained. The flow indicated by the solid thin arrows is the incremental learning process: the data in the data storage and the existing prediction model are used in the model trainer for model updating to obtain an updated prediction model. The incremental learning of this system is usually performed intermittently, that is, incremental learning is performed only after certain conditions are met, and there is usually a time interval between two adjacent incremental learning runs.
It can be seen that when the prediction model in this incremental system is updated, the update depends on the data in the data storage and the existing prediction model, and an update is performed only when a preset condition is met, for example, once every fixed period of time; there is a time interval between two adjacent update processes.
To address the above problems, an embodiment of the present invention proposes an incremental learning cloud system, obtained by migrating the above incremental system to the cloud. Referring to FIG. 2, FIG. 2 is an architectural diagram of the incremental learning cloud system according to an embodiment of the present invention, in which the recognition cloud includes the functions of the prior-art predictor, the storage cloud includes the functions of the prior-art data storage, and the training cloud includes the functions of the prior-art model trainer. Users provide various unidentified data to the recognition cloud or the storage cloud through various applications.
In terms of the functions of each cloud:
The unidentified data of the recognition cloud has two sources: one is unidentified data provided by users, and the other is unidentified data provided by the storage cloud; the second source is in essence also provided by users, only stored in the storage cloud in advance. The two sources are indicated by the solid thick arrows in FIG. 2. The recognition model likewise has two sources: one is provided by the training cloud, and the other comes from the storage cloud; the second source is in essence also provided by the training cloud, only stored in the storage cloud in advance, as indicated by the hollow thick arrows in FIG. 2. It can be seen that with the second manner, the recognition cloud identifying data and the training cloud training the recognition model do not conflict with each other, so the model training process can proceed seamlessly without a fixed time interval. In addition, the recognition cloud identifies the unidentified data according to the recognition model and then stores the identified data in the storage cloud by category, as indicated by the thin arrow from the recognition cloud to the storage cloud in FIG. 2; the recognition cloud also provides the identification information obtained by using the recognition model to the training cloud, as indicated by the solid thin arrow between the recognition cloud and the training cloud in FIG. 2. This information includes, but is not limited to, the newly added data amount, the newly added data category, and the correct rate value.
The training cloud mainly trains existing recognition models. The parameters required for training include the data already identified by the recognition model and the identification information: the identified data is provided by the storage cloud, as indicated by the thin arrow from the storage cloud to the training cloud in FIG. 2, and the identification information is provided by the recognition cloud. The training cloud updates the recognition model using these two types of information as parameters, so as to provide a more accurate recognition model. In addition, the training cloud backs up the recognition model and its parameters to the storage cloud, as indicated by the solid thin arrow from the training cloud to the storage cloud in FIG. 2, so that the storage cloud can provide the recognition model to the recognition cloud.
The storage cloud mainly provides storage functions, such as backing up the recognition models and their parameters from the training cloud, storing the categorized data from the recognition cloud, and storing the unidentified data from users. In addition, the storage cloud provides the training cloud with the data identified by the recognition models, and provides the recognition cloud with unidentified data and the corresponding recognition models.
下面对本发明实施例的用于增量式学习云***的调度方法进行介绍,请参阅图3,图3为本发明实施例的用于增量式学习云***的调度方法的一个实施例图,如图1所示,本发明实施例的用于增量式学习云***的调度方法,可包括以下内容:The following is a description of a scheduling method for an incremental learning cloud system according to an embodiment of the present invention. Referring to FIG. 3, FIG. 3 is a schematic diagram of an embodiment of a scheduling method for an incremental learning cloud system according to an embodiment of the present invention. As shown in FIG. 1 , a scheduling method for an incremental learning cloud system according to an embodiment of the present invention may include the following content:
301. The training cloud receives a model training request sent by the recognition cloud in the cloud system where the training cloud is located.
The model training request indicates the identification information and the type of the recognition model. In practice, unidentified data falls into several categories, and different categories of unidentified data need to be recognised with different types of recognition models; the model training request sent by the recognition cloud can therefore carry the type of the recognition model according to the kind of data currently being recognised.
It should be noted that the identification information may include at least one of the correct rate value, the newly added data amount and the newly added data category. The identification information is obtained statistically after a certain amount of data has been recognised with a recognition model of a given type: the correct rate value reflects the probability that the recognition model correctly recognised the data in the current recognition pass; the newly added data amount indicates how much of the data actually recognised with this model exceeds the data amount defined in the model; and the newly added data category indicates the data categories encountered during recognition that are not defined in the model. Which of these are actually used as the basis for training the recognition model can be determined according to the actual recognition model and is not limited here.
The newly added data amount may be expressed as the percentage of new data relative to existing data; the newly added data category as the percentage of new categories relative to existing categories, or as the number of new categories; and the correct rate value as the difference between the current recognition error and the expected error. Generally, when this difference exceeds a certain level, the recognition model needs to be trained.
It should also be noted that the type of the recognition model may include at least one of face recognition, image classification, speech analysis and video classification, corresponding to different kinds of unidentified data: a face recognition model corresponds to face recognition data, an image classification model to image classification data, a speech analysis model to speech analysis data, and a video classification model to video classification data. It should be understood that besides these there are many other types, such as biometric recognition (fingerprint recognition, iris recognition and the like), text classification (for example classifying papers or novels), and the analysis of various kinds of audio and graphics; even some custom recognition tasks can have a corresponding recognition model.
302. The training cloud generates a corresponding training task according to the identification information and the type of the recognition model.
It can be understood that once the training cloud knows the identification information and the type of the recognition model, it can generate the corresponding training task from these two pieces of information; the training task is the training of that recognition model, using the identification information as its basis.
It should be noted that a training task is generated when the identification information exceeds a preset identification information threshold. For example, if at least one of the correct rate value, the newly added data amount and the newly added data category exceeds its preset threshold, the recognition model can no longer cope with the current recognition environment and needs to be updated so as to improve its correct rate and cover a larger data amount and more data categories.
303. The training cloud calculates the priority number of the training task from the identification information.
The priority number of a training task corresponds to the execution priority level of that task. In this embodiment a larger priority number indicates a higher execution priority; of course, a scheme in which a smaller priority number means a higher priority, or any other way of expressing the execution priority level as a number, may also be used, and this is not limited here.
Optionally, calculating the priority number of the training task from the identification information may include:
The training cloud determines the task parameters corresponding to the training task.
The task parameters are assigned when the training task is created and change as the task is executed, for example by adding parameters or changing the values of existing ones. Optionally, a training task is either an executing task or a non-executing task. When the training task is an executing task, its task parameters include a task importance parameter and a runtime estimate parameter; when it is a non-executing task, its task parameters include a task importance parameter, a model parameter, a waiting time parameter and a runtime estimate parameter.
The task importance parameter is set manually and can differ according to the data actually being processed. The model parameter expresses the model parameter transfer time as a percentage of the computation time, or the size of the training task's recognition model as a percentage of the size of all recognition models. The waiting time parameter expresses the waiting time, or the task's waiting time as a percentage of the waiting time of all tasks. The runtime estimate parameter expresses the estimated computation time, or the task's computation time as a percentage of the computation time of all tasks.
It can be seen that the task parameters are attributes of a training task: when the priority number is calculated, different task parameters lead to different priority numbers and therefore to different execution priority levels.
The training cloud determines a first weighting factor for the identification information and a second weighting factor for the task parameters.
It should be noted that although both the identification information and the task parameters form the basis of the subsequent priority-number calculation, they carry different weighting factors depending on the type of the recognition model. The weighting factors of the identification information and of the task parameters may in general be the same or different; likewise, within the identification information, the weighting factors of the correct rate value, the newly added data amount and the newly added data category may be the same or different, and similarly for the individual task parameters. The concrete values depend on the actual requirements.
The training cloud calculates the priority number of the training task from the identification information, the task parameters, the first weighting factor and the second weighting factor.
For example, for a non-executing task the priority number equals w1*newly added data amount + w2*newly added data category + w3*correct rate value + the task parameter term of the non-executing task, where that term may equal w7*task importance parameter - w8*model parameter + w9*waiting time parameter - w10*runtime estimate parameter. For an executing task the priority number equals w4*newly added data amount + w5*newly added data category + w6*correct rate value + the task parameter term of the executing task, where that term may equal w11*task importance parameter - w12*runtime estimate parameter.
Here w1 to w3 and w4 to w6 are first weighting factors, while w7 to w10 and w11 to w12 are second weighting factors; w1 to w12 represent the weights of the corresponding parameters and can be set according to the needs of the training task.
Once the identification information, the task parameters, the first weighting factor and the second weighting factor have been determined, the priority number can be calculated. If, for example, a larger priority number means a higher priority, the calculation is: multiply the identification information by the first weighting factor, multiply the task parameters by the second weighting factor, and add the two products to obtain the priority number. For schemes in which a larger priority number does not mean a higher priority, for example where a smaller priority number means a higher priority, other calculations may be used, as long as they reflect the correct execution priority of the training task.
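As an illustration of the formulas above, the following Python sketch computes a priority number. The dictionary-based interface and all field names are assumptions made for this example, not taken from the patent; the weights default to 1.0, as in the worked example later in this description.

```python
def priority_number(task, w=None):
    """Priority number of a training task (illustrative sketch).

    `task` is a dict with the identification information and task parameters;
    `w` maps 'w1'..'w12' to the first/second weighting factors (default 1.0).
    """
    w = w or {f"w{i}": 1.0 for i in range(1, 13)}
    if task["executing"]:
        # executing task: w4..w6 on the identification information, w11/w12 on the task parameters
        p = (w["w4"] * task["new_data"] + w["w5"] * task["new_categories"]
             + w["w6"] * task["correct_rate_value"]
             + w["w11"] * task["importance"] - w["w12"] * task["runtime_estimate"])
    else:
        # non-executing task: w1..w3 on the identification information, w7..w10 on the task parameters
        p = (w["w1"] * task["new_data"] + w["w2"] * task["new_categories"]
             + w["w3"] * task["correct_rate_value"]
             + w["w7"] * task["importance"] - w["w8"] * task["model_param"]
             + w["w9"] * task["waiting_time"] - w["w10"] * task["runtime_estimate"])
    return max(p, 0.0)  # negative values are normalised to 0, as in the worked examples below
```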
It should be noted that if two training tasks target the same type of recognition model, one of them must be dealt with. Optionally, where the identification information includes the correct rate value, the newly added data amount and the newly added data category, the method includes:
When the recognition-model types of two training tasks are the same, selecting as the training task for that model type the task with the higher trigger priority or the higher execution priority, where the trigger priority is the precedence of the correct rate value, the newly added data amount and the newly added data category in the identification information, set per recognition-model type.
It can be seen that if two training tasks target the same type of recognition model, the precedence of the correct rate value, the newly added data amount and the newly added data category in their identification information has to be considered, or alternatively the priority numbers can be compared directly and the task with the higher priority selected. This avoids training one recognition model twice: since generally only the most recently trained recognition model is kept, one of the two trainings would in effect be wasted, so introducing this mechanism removes that waste and improves efficiency.
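A minimal sketch of this de-duplication rule, under the same assumed task representation as the earlier sketch: when several pending tasks target the same recognition-model type, only the one with the higher priority is kept.

```python
def deduplicate(tasks, priority_of):
    """Keep at most one pending training task per recognition-model type."""
    kept = {}
    for task in tasks:
        model_type = task["model_type"]
        best = kept.get(model_type)
        if best is None or priority_of(task) > priority_of(best):
            kept[model_type] = task
    return list(kept.values())
```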
304. The training cloud allocates training resources to the training tasks according to their priority numbers and executes the corresponding training tasks in order of execution priority.
Once the priority numbers have been determined, the training cloud can allocate training resources to the tasks according to their priority numbers and execute them in the order of execution priority derived from those numbers. If, for instance, there are five training tasks with five different execution priority levels, they are executed from the highest level to the lowest; of course, if the training cloud is able to train two or more recognition models at the same time, several training tasks can likewise be executed at once in the order of their priorities.
It can be seen that in this embodiment the training cloud generates a corresponding training task after receiving a model training request, obtains the corresponding identification information from that request, and then calculates the task's priority number from the identification information and the type of the recognition model; the priority number represents the task's execution priority. When there are several training tasks, the priority number of each is determined by two kinds of data, namely the identification information and the type of the recognition model, and the training tasks in the training cloud are scheduled according to the calculated priority numbers, so that the training resources can be shared reasonably among multiple training tasks and the training efficiency is improved.
Optionally, step 304 in FIG. 3 may further include:
a. The training cloud determines the minimum number of resources and the optimal number of resources required by a training task according to the recognition model that the task targets.
Once the recognition model is determined, the computing and storage resources needed to update it can be roughly determined, and hence the minimum and optimal numbers of resources needed to train it. For each training task, whether executing or non-executing, the hardware resources actually needed to run it can be derived from its minimum and optimal resource numbers; these hardware resources include the kind and quantity of hardware resources and the time for which they are used.
b. The training cloud determines as candidate tasks of the candidate task set those training tasks whose priority number differs from the maximum priority number by less than a preset priority-number threshold.
The priority-number threshold is used to decide whether a training task is a candidate task or a non-candidate task. In execution order, candidate tasks come before non-candidate tasks; the distinction is made so that resources can be allocated better, and when resources are limited they are allocated preferentially to the training tasks that need to be executed first.
c. The training cloud calculates the number of resources to be allocated to each candidate task in the candidate task set according to the priority numbers.
When calculating the resources to be allocated, mainly the candidate tasks are considered, and the allocation is proportional to the priority numbers. For example, with five candidate tasks and 100 resources in total, the allocation by priority number might be 26, 22, 20, 17 and 15; in general, when the tasks are ordered from the highest priority to the lowest, the resources they receive decrease accordingly.
d. The training cloud determines as priority tasks those candidate tasks whose allocated resource number is not less than their minimum resource number.
After the calculation, the number of resources allocated to each candidate task is compared with its minimum resource number. If the allocated number is not less than the minimum, the candidate task is a priority task, meaning that the resources allocated according to the above calculation can support its execution.
e. The training cloud determines as non-priority tasks those candidate tasks whose allocated resource number is less than their minimum resource number, and for which the difference between the minimum resource number and the allocated resource number is less than a preset resource adjustment threshold.
When the number of resources allocated to a candidate task is less than its minimum resource number, but the absolute value of the difference between the allocated number and the minimum number is less than the resource adjustment threshold, the task remains a candidate task but is classified as a non-priority task and is executed after the priority tasks.
f. The training cloud allocates resources to the candidate tasks in the candidate task set according to their classification as priority or non-priority tasks.
Once the priority and non-priority tasks have been determined within the candidate task set, resources are allocated to the set of priority tasks and to the set of non-priority tasks. This allocation is based on the resource numbers calculated for the candidate tasks in the previous step, so all priority tasks can meet their minimum resource requirements. For the non-priority tasks there are two cases: in the first, not all resources are taken into account when the allocation is calculated, leaving some margin, and this margin can cover the shortfall to the minimum resource number for some or all non-priority tasks, possibly with resources still left over after the allocation; in the second, the calculation already accounts for all resources, and during the actual allocation the non-priority tasks do not receive enough resources to reach their minimum resource number.
Optionally, the training cloud removes from the candidate task set those candidate tasks for which the difference between the minimum resource number and the allocated resource number is greater than the preset resource adjustment threshold.
When the difference between a candidate task's minimum resource number and its allocated resource number is greater than the preset resource adjustment threshold, there are not enough resources to support its execution even if it is kept among the candidate tasks, so such training tasks can be removed from the candidate task set. For the second case of step f this leads to two different branches: in branch one, the resources released by the removed tasks allow some, but not all, of the non-priority tasks to reach their minimum resource number; in branch two, all non-priority tasks can reach their minimum resource number and there may still be resources left over.
Optionally, to make the resource calculation easier, the priority tasks and non-priority tasks in the candidate task set are arranged in descending order of priority number, with the priority tasks placed before the non-priority tasks.
Optionally, in the case of branch two above, step f may include at least two rounds of resource allocation:
In the first round, the training cloud allocates to the priority tasks and the non-priority tasks, in order of priority number, their respective minimum resource numbers.
In the second round, the training cloud distributes the resources remaining after the first round to the priority tasks in proportion to their priority numbers.
It can be understood that if resources still remain after the second round, a third round can be carried out:
In the third round, the training cloud distributes to the non-priority tasks, in proportion to their priority numbers, the resources remaining after the second round that exceed the optimal resource numbers of the priority tasks.
In other words, the first round gives the priority and non-priority tasks their minimum resource numbers; the second round allocates the resources left after the first round preferentially to the priority tasks, so that if enough remains every priority task is brought up to its optimal resource number, and otherwise only the first part of the priority tasks reaches it. If resources are still left once all priority tasks have reached their optimal resource numbers, the third round distributes them to the non-priority tasks in the same way; if resources remain after all non-priority tasks have reached their optimal resource numbers, they continue to be allocated to the candidate tasks removed in the preceding step, and if there is still a surplus after those reach their optimal resource numbers, it can be allocated to non-candidate tasks.
It can be seen that with the above resource allocation, in the case of multiple training tasks and especially when resources are limited, the resources are used most effectively: training tasks with high execution priority can be executed first with sufficient resources, which improves the execution efficiency when there are many training tasks.
It should be noted that, in this embodiment of the present invention, because the tasks in the task list run for a relatively long time, a period is also set for a more reasonable use of resources: every time this period elapses, the priority numbers of both executing and non-executing tasks are recalculated, so that steps a to f above can be carried out again and the resources adjusted.
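The following is a simplified sketch of steps a to f and the multi-round allocation. It assumes that the first round can always satisfy the minimum resource numbers of the remaining candidates; the task fields, function names and the handling of leftover resources (which, as in the worked examples later, can then be offered to removed or non-candidate tasks) are illustrative only, not the patent's reference implementation.

```python
def allocate(tasks, total_resources, c1, v1):
    """tasks: list of dicts with 'name', 'priority', 'min_res' and 'best_res'."""
    if not tasks:
        return {}
    # b. candidate tasks: priority number within c1 of the maximum priority number
    p_max = max(t["priority"] for t in tasks)
    candidates = [t for t in tasks if p_max - t["priority"] < c1]

    # c. pre-allocate the total proportionally to the priority numbers
    p_sum = sum(t["priority"] for t in candidates) or 1.0
    share = {t["name"]: total_resources * t["priority"] / p_sum for t in candidates}

    # d./e./f. split the candidates into priority and non-priority tasks;
    # candidates whose shortfall reaches v1 drop out of the candidate set
    priority_tasks, non_priority_tasks = [], []
    for t in candidates:
        shortfall = t["min_res"] - share[t["name"]]
        if shortfall <= 0:
            priority_tasks.append(t)
        elif shortfall < v1:
            non_priority_tasks.append(t)

    # priority tasks come before non-priority tasks, each group in descending priority order
    priority_tasks.sort(key=lambda t: -t["priority"])
    non_priority_tasks.sort(key=lambda t: -t["priority"])

    # first round: give every remaining candidate its minimum resource number
    alloc = {t["name"]: t["min_res"] for t in priority_tasks + non_priority_tasks}
    remaining = total_resources - sum(alloc.values())

    # second and third rounds: spread what is left proportionally, first over the
    # priority tasks and then over the non-priority tasks, capped at the optimal number
    for group in (priority_tasks, non_priority_tasks):
        if remaining <= 0 or not group:
            continue
        p_sum = sum(t["priority"] for t in group) or 1.0
        for t in group:
            extra = min(remaining * t["priority"] / p_sum,
                        t["best_res"] - alloc[t["name"]])
            alloc[t["name"]] += extra
        remaining = total_resources - sum(alloc.values())

    return alloc  # any leftover can then go to removed candidates or non-candidate tasks
```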
The scheduling method of the embodiment of the present invention has been described above; the training method of the embodiment of the invention is described next. The method is based on the incremental learning cloud system shown in FIG. 2. Referring to FIG. 4, FIG. 4 is a diagram of an embodiment of the training method based on the incremental learning cloud system. As shown in FIG. 4, the training method may include:
401. The recognition cloud receives unidentified data.
The unidentified data is sent by a UE or provided by a storage device.
Optionally, when the unidentified data is provided by a storage device, the recognition cloud receives the unidentified data read from the storage cloud, the unidentified data stored in the storage device having been sent to the storage device by the UE.
402. The recognition cloud recognises the unidentified data according to the recognition model.
The recognition model is provided by the training cloud in the cloud system where the recognition cloud is located.
Optionally, after the training cloud has finished training a recognition model, the model can be handled in two ways: it can be backed up in the storage cloud, or it can be provided directly to the recognition cloud. Accordingly, the recognition model being provided by the training cloud in the cloud system where the recognition cloud is located includes:
the recognition model being sent to the recognition cloud by the training cloud in the cloud system where the recognition cloud is located; or
the recognition model being read by the recognition cloud from the storage device, the recognition model stored in the storage device having been sent by the training cloud in the cloud system where the recognition cloud is located.
403. The recognition cloud compiles identification information for the unidentified data with the recognition model.
After receiving the unidentified data and the corresponding recognition model, the recognition cloud uses the model to recognise the unidentified data and compiles the identification information resulting from that recognition.
Optionally, the recognition cloud sends the identification information to the storage device. That is, besides using the identification information to judge whether the recognition model needs training, the recognition cloud can also send it to the storage device for storage, so that it can serve as a reference when the same recognition model later recognises unidentified data and produces new identification information.
404. The recognition cloud sends a model training request to the training cloud.
When the identification information exceeds a preset identification threshold, the recognition cloud sends a model training request to the training cloud so that the training cloud can train the recognition model; the model training request carries the identification information and the type of the recognition model.
The identification information includes at least one of the correct rate value, the newly added data amount and the newly added data category, and the identification threshold includes at least one of a correct-rate threshold, a newly-added-data-amount threshold and a newly-added-category threshold.
It can be seen that as soon as any one of the correct rate value, the newly added data amount and the newly added data category in the identification information exceeds its corresponding threshold, the recognition cloud is triggered to issue a model training request, which carries the type of the recognition model and the identification information so that the training cloud can train the model on that basis. In other words, besides the usual fixed interval or specific trigger conditions, model training in this embodiment is in fact triggered by the recognition cloud. Since the recognition cloud can tell from the identification information whether the current recognition model needs training, triggering model training through the recognition cloud ensures that the model being trained is the one that currently needs training the most, which makes model training more reasonable and ultimately improves the efficiency of data recognition.
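A minimal sketch of this trigger, assuming a simple dictionary representation of the statistics, the thresholds and the request; it only illustrates the "any one threshold exceeded" rule described above, and the field names are illustrative.

```python
def maybe_request_training(model_type, info, thresholds, send_request):
    """info and thresholds both carry 'correct_rate_value', 'new_data' and 'new_categories'."""
    exceeded = (info["correct_rate_value"] > thresholds["correct_rate_value"]
                or info["new_data"] > thresholds["new_data"]
                or info["new_categories"] > thresholds["new_categories"])
    if exceeded:
        # the request carries the identification information and the recognition-model type
        send_request({"model_type": model_type, "identification_info": info})
    return exceeded
```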
The training method and the scheduling method of the embodiments of the invention have been described above; they are now illustrated with a practical example.
First the trigger conditions are set. Four recognition models are used in this example, and the identification threshold includes a correct-rate threshold, a newly-added-data-amount threshold and a newly-added-category threshold, i.e. each recognition-model type has these three thresholds, as shown in Table 1 below:
Table 1 (the per-model-type trigger thresholds; reproduced as an image in the original publication, so the individual values are not recoverable here)
It should be noted that if, for example, the correct rate of "face recognition" is 89% at some moment, the training task "face recognition: correct rate too low" is triggered and enters the training task scheduler for scheduling; if the newly added data amount of "face recognition" is 11% at some moment, the training task "face recognition: newly added data amount" is triggered and enters the training task scheduler; if the number of new categories of "face recognition" is 1 at some moment, the training task "face recognition: new categories" is triggered and enters the training task scheduler; and if the correct rate of "image classification" is 84% at some moment, the training task "image classification: correct rate too low" is triggered and enters the training task scheduler.
The specific steps of data recognition include: newly arriving service data (i.e. unidentified data) is stored in the storage cloud;
the new service data is recognised in the recognition cloud according to the training model already obtained, and the recognition results are stored in the storage cloud;
in the recognition cloud, the data is marked, with respect to the current recognition model, for the confidence of the recognition, the correctness of the recognition and whether it belongs to a newly added category, and the marking results are stored in the storage cloud;
in the recognition cloud, statistics are compiled on the correct rate, the newly added data and the newly added categories, and these statistics are periodically backed up in the storage cloud; the difference between the correct rate and the expected correct rate is the correct rate value.
When the correct rate value exceeds the correct-rate threshold, or the newly added data amount exceeds the newly-added-data-amount threshold, or the newly added data categories exceed the newly-added-category threshold, the training cloud is triggered to train by issuing a model training request; the training cloud generates a training task from the model training request and trains the model of the task according to the task's priority.
The specific training steps may include:
The recognition cloud organises data in different forms depending on the trigger type: if the correct rate is below the threshold, all kinds of correctly recognised data and the incorrectly recognised data are used for training together; if the newly added data amount exceeds the threshold, the various kinds of already-trained correct data and the new data are used together; if the newly added data categories exceed the threshold, the various kinds of already-trained correct data and the data of the new categories are used together.
The storage cloud uploads data to the training cloud, organised according to the information provided by the recognition cloud;
The training cloud runs different training schemes depending on the trigger type: when the correct rate is below the threshold or the newly added data amount exceeds the threshold, the incremental learning mode without model expansion is used; when the newly added data categories exceed the threshold, the incremental learning mode with model expansion is used.
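A small sketch of the trigger-type mapping in these steps; the storage keys and trigger names are hypothetical. Note that the worked examples later additionally use plain non-incremental training for tasks that are newly started.

```python
def organise_training_data(trigger, store):
    """Pick the data mix for a triggered training task (store keys are illustrative)."""
    if trigger == "correct_rate_too_low":
        return store["correct_by_class"] + store["incorrect"]
    if trigger == "new_data_volume":
        return store["trained_correct"] + store["new_data"]
    if trigger == "new_data_categories":
        return store["trained_correct"] + store["new_category_data"]
    raise ValueError(f"unknown trigger type: {trigger}")

def training_mode(trigger):
    """Non-expanding incremental learning, except for new categories (model expansion)."""
    return ("incremental_with_expansion" if trigger == "new_data_categories"
            else "incremental_without_expansion")
```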
The scheduling of specific training tasks is described next.
First, the trigger-type priority of each recognition model can be set:
"Face recognition": new categories > correct rate too low > newly added data amount
"Image classification": new categories > correct rate too low > newly added data amount
"Speech analysis": correct rate too low > new categories > newly added data amount
"Video classification": no fixed priority is set; it is scheduled directly by priority number.
Table 2 below gives the correct-rate thresholds:
Table 2
Correct-rate threshold: face recognition 95%; image classification 90%; speech analysis 90%; video classification 90%
Table 3 below gives the settings of the identification information and the task parameters:
Table 3 (reproduced as an image in the original publication, so the individual values are not recoverable here)
Here w1 to w12 in the first and second weighting factors are all set to 1, the priority-number threshold C1 is set to 0.2, the scheduling condition is set to scheduling every 0.1 hour or whenever a new task joins or a task completes, and the training resources are set to 100 units of resources. The minimum and optimal resource numbers occupied by each recognition-model type are shown in Table 4 below:
Table 4
Minimum resources: face recognition 30; image classification 50; speech analysis 30; video classification 10
Optimal resources: face recognition 50; image classification 100; speech analysis 60; video classification 20
The resource adjustment threshold V1 can be set to 10.
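For reference, the settings of this worked example can be written out as plain data; the structure and key names below are only one possible layout, chosen to match the earlier sketches, and are not part of the patent.

```python
EXAMPLE_CONFIG = {
    "weights": {f"w{i}": 1.0 for i in range(1, 13)},   # w1..w12 all set to 1
    "priority_threshold_c1": 0.2,
    "resource_adjust_threshold_v1": 10,
    "total_resources": 100,
    "reschedule_every_hours": 0.1,   # also rescheduled when a task joins or completes
    "resources": {   # (minimum, optimal) resources per recognition-model type (Table 4)
        "face_recognition": (30, 50),
        "image_classification": (50, 100),
        "speech_analysis": (30, 60),
        "video_classification": (10, 20),
    },
    "correct_rate_threshold": {   # Table 2
        "face_recognition": 0.95,
        "image_classification": 0.90,
        "speech_analysis": 0.90,
        "video_classification": 0.90,
    },
}
```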
The following takes the case of a single service as an example. During data recognition, the correct rate of "face recognition" at some moment is 84%, so the training task "face recognition: correct rate too low" is triggered; it is a non-executing task and enters the training task scheduler for scheduling. Assume that the state of each service at this moment is as shown in Table 5 below:
Table 5
                              Face recognition   Image classification   Speech analysis   Video classification
In task scheduling queue            yes                  no                   no                  no
Newly added data amount             0.05                 0.01                 0.02                0.03
Newly added data categories         0                    0                    0.1                 0.1
Correct rate value                  0.06                 0.01                 0.02                0.02
Task importance parameter           0.2                  0.2                  0.1                 0.1
Model parameter                     0.1                  0.1                  0.1                 0.1
Waiting time parameter              0                    0                    0                   0
Runtime estimate parameter          0.1                  0.3                  0.1                 0.5
First the priority number of each current task is calculated: the priority number of "face recognition" equals 0.11.
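As a quick check, the priority-number sketch given earlier reproduces this value from the Table 5 figures (all weights being 1.0 in this example); the dictionary keys are the illustrative ones used in that sketch.

```python
face_task = {
    "executing": False,          # triggered as a not-yet-executed task
    "new_data": 0.05, "new_categories": 0.0, "correct_rate_value": 0.06,
    "importance": 0.2, "model_param": 0.1,
    "waiting_time": 0.0, "runtime_estimate": 0.1,
}
# 0.05 + 0 + 0.06 + 0.2 - 0.1 + 0 - 0.1 = 0.11
assert abs(priority_number(face_task) - 0.11) < 1e-9
```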
The specific resource allocation process includes:
First the task list {("face recognition", 30, 50)} is determined;
then the task priority-number list {("face recognition", 30, 50, 0.11)} is determined;
since 0.11 - 0.11 is less than 0.2, the tasks whose priority number differs from the maximum by less than 0.2 are listed as candidate tasks, giving {("face recognition", 30, 50, 0.11)};
the number of resources that each candidate task can receive is calculated in proportion to the priority numbers;
since there is only one candidate task, the "face recognition" task is allocated 100 resources.
The difference between the resources allocated to each training task and its minimum resource number is compared with the resource threshold V1:
the remaining tasks are {("face recognition", 30, 50, 0.11)};
the set of priority tasks is P = {("face recognition", 30, 50, 0.11)};
the set of non-priority tasks is Q = {};
next, each candidate task is allocated its minimum resource number:
giving {("face recognition", 30, 30, 50)}.
Next, 70 resources remain, and the remaining resources are distributed to the priority tasks in proportion to their priority numbers:
giving {("face recognition", 100, 30, 50)}.
Then, since the optimal resource number for face recognition is 50, the 50 excess resources would be distributed to other priority tasks in proportion to their priority numbers:
giving {("face recognition", 50, 30, 50)}.
There are no other tasks, so this round of scheduling ends, and the tasks now run as:
{("face recognition", 50)}.
Since "face recognition" is a newly started task with the trigger type "correct rate too low", all kinds of correctly recognised data and the incorrectly recognised data are used for training together, in a non-incremental training mode.
The following takes the case of two simultaneous services as an example. At some moment the correct rate of "face recognition" is 89%, so the training task "face recognition: correct rate too low" is triggered; it is a non-executing task and enters the training task scheduler for scheduling. Assume that the state of each service at this moment is as shown in Table 6 below:
Table 6 (reproduced as an image in the original publication, so the individual values are not recoverable here)
First the priority number of each current task is calculated: the priority number of "face recognition" equals 0.25, and the priority number of "image classification" equals -0.11, which is treated as 0 (negative values are normalised to 0).
The specific resource allocation process includes:
First the task list {("face recognition", 30, 50), ("image classification", 50, 100)} is determined;
then the task priority-number list is determined:
{("face recognition", 30, 50, 0.25), ("image classification", 50, 100, 0)};
since 0.25 - 0.25 is less than 0.2 and 0.25 - 0 is greater than 0.2, the tasks are listed as candidates according to the rule that the priority number differs from the maximum by less than 0.2:
{("face recognition", 30, 50, 0.25), ("image classification", 50, 100, 0)}.
The number of resources that each candidate task can receive is calculated in proportion to the priority numbers;
the total number of resources is 100. Since the priority number of "image classification" is 0, it receives 0 resources by the priority-number split, while it needs at least 50 resources; the difference of 50 between its minimum resource number and its share is greater than the threshold V1 = 10, so "image classification" is eliminated.
The remaining tasks are {("face recognition", 30, 50, 0.25)};
the set of priority tasks is P = {("face recognition", 30, 50, 0.25)};
the set of non-priority tasks is Q = {} (no task has a minimum resource number between its share by priority number and that share plus V1);
next, each candidate task is allocated its minimum resource number:
giving {("face recognition", 30, 30, 50)};
then 70 resources remain, and the remaining resources are distributed to the priority tasks in proportion to their priority numbers:
{("face recognition", 100, 30, 50)};
then, since the optimal resource number for face recognition is 50, the 50 excess resources would be distributed to other priority tasks in proportion to their priority numbers:
giving {("face recognition", 50, 30, 50)}.
Searching the non-candidate tasks in order of priority number for a task whose minimum resource number is less than the remaining resources, that task is scheduled with the lowest priority number: {("image classification", 50, 50, 100)}.
There are no other tasks, so this round of scheduling ends:
giving {("face recognition", 50), ("image classification", 50)}.
Since "image classification" is a newly started task with the trigger type "correct rate too low", all kinds of correctly recognised data and the incorrectly recognised data are used for training together, in a non-incremental training mode.
The following takes the case of three services as an example. At some moment the number of new categories for "speech analysis" is 10, so the training task "speech analysis: new categories" is triggered; taking speech analysis as a non-executing task, it enters the training task scheduler for scheduling. Assume that the state of each service at this moment is as shown in the table below:
Table 7
                              Face recognition   Image classification   Speech analysis   Video classification
In task scheduling queue            yes                  yes                  yes                 no
Newly added data amount             0.07                 0.03                 0.04                0.05
Newly added data categories         0                    0                    0.1                 0.1
Correct rate value                  0.04                 0.06                 0.03                0.03
Task importance parameter           0.2                  0.2                  0.5                 0.1
Model parameter                     0.1                  0.1                  0.1                 0.1
Waiting time parameter              0                    0                    0                   0
Runtime estimate parameter          0.05                 0.3                  0.1                 0.5
First the priority number of each current task is calculated: the priority number of "face recognition" equals 0.16, the priority number of "image classification" equals -0.11, which is treated as 0, and the priority number of "speech analysis" equals 0.47.
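The earlier priority-number sketch reproduces these three values from Table 7 (negative results being normalised to 0); note that the 0.16 / 0 / 0.47 figures correspond to the non-executing form of the formula, which is therefore assumed for all three tasks here.

```python
tasks = {
    "face_recognition":     {"executing": False, "new_data": 0.07, "new_categories": 0.0,
                             "correct_rate_value": 0.04, "importance": 0.2, "model_param": 0.1,
                             "waiting_time": 0.0, "runtime_estimate": 0.05},
    "image_classification": {"executing": False, "new_data": 0.03, "new_categories": 0.0,
                             "correct_rate_value": 0.06, "importance": 0.2, "model_param": 0.1,
                             "waiting_time": 0.0, "runtime_estimate": 0.3},
    "speech_analysis":      {"executing": False, "new_data": 0.04, "new_categories": 0.1,
                             "correct_rate_value": 0.03, "importance": 0.5, "model_param": 0.1,
                             "waiting_time": 0.0, "runtime_estimate": 0.1},
}
for name, expected in [("face_recognition", 0.16), ("image_classification", 0.0),
                       ("speech_analysis", 0.47)]:
    assert abs(priority_number(tasks[name]) - expected) < 1e-9
```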
The specific resource allocation process includes:
First the task list is determined:
{("face recognition", 30, 50), ("image classification", 50, 100), ("speech analysis", 30, 60)}.
Then the task priority-number list is determined:
{("speech analysis", 30, 60, 0.47), ("face recognition", 30, 50, 0.16), ("image classification", 50, 100, 0)}.
Since 0.47 - 0.16 is greater than 0.2, 0.47 - 0 is greater than 0.2 and 0.47 - 0.47 is less than 0.2, the tasks whose priority number differs from the maximum by less than 0.2 are listed as candidate tasks:
giving {("speech analysis", 30, 60, 0.47)}.
The number of resources that each candidate task can receive is calculated in proportion to the priority numbers:
the total number of resources is 100; based on the difference between each task's allocated resources and its minimum resources:
the remaining tasks are {("speech analysis", 30, 60, 0.47)};
the set of priority tasks is P = {("speech analysis", 30, 60, 0.47)};
the set of non-priority tasks is Q = {};
next, each candidate task is allocated its minimum resource number:
giving {("speech analysis", 30, 30, 60)}.
Next, 70 resources remain, and the remaining resources are distributed to the priority tasks in proportion to their priority numbers:
{("speech analysis", 100, 30, 60)};
then, since the optimal resource number for speech analysis is 60, the 40 excess resources would be distributed to other priority tasks in proportion to their priority numbers:
giving {("speech analysis", 60, 30, 60)}.
Searching the non-candidate tasks in order of priority number for a task whose minimum resource number is less than the remaining resources, that task is scheduled with the lowest priority number: {("face recognition", 30, 30, 50)}.
There are no other tasks, so this round of scheduling ends:
giving {("speech analysis", 60), ("face recognition", 30)}.
"Image classification" is scheduled out of the training.
Since "speech analysis" is a newly started task with the trigger type "new categories", all kinds of correct data and the data of the new categories are used for training together, in an incremental training mode.
The training method for the incremental learning cloud system of the embodiment of the present invention has been described above; the training device for the incremental learning cloud system according to an embodiment of the present invention is described next. The training device may consist of one application server or of several application servers, and is mainly used for generating and training recognition models. Referring to FIG. 5, FIG. 5 is a diagram of an embodiment of the training device according to an embodiment of the present invention; the training device may include:
a first receiving module 501, configured to receive a model training request sent by the recognition cloud in the cloud system where the training cloud is located, the model training request carrying identification information and the type of a recognition model;
The first receiving module 501 can implement step 301 of the embodiment shown in FIG. 3. The identification information includes at least one of the correct rate value, the newly added data amount and the newly added data category; it is obtained statistically after a certain amount of data has been recognised with a recognition model of a given type. The explanation of the various values in the identification information is similar to the description of step 301 in the embodiment shown in FIG. 3 and is not repeated here.
In addition, the type of the recognition model may include at least one of face recognition, image classification, speech analysis and video classification, corresponding to different kinds of unidentified data; for details, see the description of step 301 in the embodiment shown in FIG. 3, which is not repeated here.
a first processing module 502, configured to generate a corresponding training task according to the identification information and the type of the recognition model;
the first processing module 502 is further configured to calculate the priority number of the training task from the identification information, the priority number of the training task corresponding to the execution priority level of the training task;
The first processing module 502 can implement steps 302 and 303 of the embodiment shown in FIG. 3. It mainly handles two aspects. The first is generating the corresponding training task according to the identification information and the type of the recognition model; the conditions under which a training task is generated are described for step 302 of the embodiment shown in FIG. 3 and are not repeated here. The second is calculating the priority number of each training task, which expresses the priority level of the corresponding task.
One way of calculating it is to first determine the task parameters corresponding to the training task, then determine the first weighting factor of the identification information and the second weighting factor of the task parameters, and then calculate the priority number of the training task from the identification information, the task parameters, the first weighting factor and the second weighting factor. The detailed calculation is described for step 303 of the embodiment shown in FIG. 3 and is not repeated here.
It should be noted that the priority-number calculation uses different task parameters for different task types, and the way the priority number is calculated also differs. For example, training tasks are divided into executing tasks and non-executing tasks according to whether they are being executed: when the training task is an executing task, its task parameters include a task importance parameter and a runtime estimate parameter; when it is a non-executing task, its task parameters include a task importance parameter, a model parameter, a waiting time parameter and a runtime estimate parameter.
In addition, if two training tasks target the same type of recognition model, one of them must be dealt with. Optionally, where the identification information includes the correct rate value, the newly added data amount and the newly added data category, the method includes:
when the recognition-model types of two training tasks are the same, selecting as the training task for that model type the task with the higher trigger priority or the higher execution priority, where the trigger priority is the precedence of the correct rate value, the newly added data amount and the newly added data category in the identification information, set per recognition-model type.
The resource allocation module 503 is configured to allocate training resources to the training tasks according to the priority numbers, and execute the corresponding training tasks according to their execution priority levels.
The resource allocation module 503 can implement step 304 in the embodiment shown in FIG. 3. The resource allocation module 503 mainly schedules the execution of the training tasks that carry priority numbers. Its function includes two parts; one part is to classify the generated training tasks, and this classification is not a single pass but may involve multiple rounds. The specific resource allocation process is similar to step a to step f in the description of step 304 in the embodiment shown in FIG. 3. The main idea is to first determine candidate tasks, then pre-allocate resources to the candidate tasks, further divide the candidate tasks into priority tasks and non-priority tasks according to the pre-allocation result, and then allocate resources from the highest task level to the lowest. The specific allocation process is not repeated here.
In addition, the number of tasks in the candidate task set does not stay constant. During allocation, a candidate task for which the difference between its minimum resource number and its allocated resource number is greater than a preset resource adjustment threshold is removed from the candidate task set. Because the gap between the minimum resource number and the allocated resource number of such a training task is too large, a large amount of additional resources would have to be supplemented before it could run; placing it at a higher priority would occupy too many scarce resources and is therefore unreasonable.
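The candidate screening and the priority/non-priority split described above can be sketched as follows; the threshold values and the dictionary fields are illustrative assumptions.

```python
# Sketch of candidate selection and the priority / non-priority classification.
def classify_candidates(tasks, priority_threshold, adjust_threshold):
    """tasks: list of dicts with 'priority', 'min_resources', 'allocated' (pre-allocation result)."""
    max_priority = max(t["priority"] for t in tasks)
    # Candidate tasks: priority number close enough to the maximum priority number.
    candidates = [t for t in tasks if max_priority - t["priority"] < priority_threshold]

    priority_tasks, non_priority_tasks = [], []
    for t in candidates:
        gap = t["min_resources"] - t["allocated"]
        if gap <= 0:
            priority_tasks.append(t)         # pre-allocation already covers the minimum
        elif gap < adjust_threshold:
            non_priority_tasks.append(t)     # close to runnable, keep as non-priority
        # else: gap exceeds the adjustment threshold, drop the task from the candidate set
    return priority_tasks, non_priority_tasks
```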
In addition, the priority tasks and the non-priority tasks in the candidate task set are arranged in descending order of priority number, and the priority tasks are placed before the non-priority tasks.
After the task levels are divided, actual resources are allocated to the training tasks in each set. The allocation is performed not in a single round but in multiple rounds. For example, the resource allocation module 503 is specifically configured to: in a first resource allocation, allocate to the priority tasks and the non-priority tasks their respective minimum resource numbers in the order of their priority numbers; then, in a second resource allocation, allocate the resources remaining after the first resource allocation to the priority tasks in proportion to their priority numbers. If resources still remain, a third resource allocation can be performed, in which the remaining resources that exceed the optimal resource numbers of the priority tasks after the second resource allocation are allocated to the non-priority tasks in proportion to their priority numbers.
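For illustration, the three rounds of allocation described above can be sketched as follows, assuming integer resource units and floor rounding of the proportional shares; the data layout is illustrative.

```python
# Sketch of the three-round resource allocation.
def allocate_rounds(priority_tasks, non_priority_tasks, total):
    """Each task is a dict with 'priority', 'min_resources', 'best_resources'.
    Returns a mapping from task id to allocated resource count."""
    ordered = sorted(priority_tasks, key=lambda t: t["priority"], reverse=True) + \
              sorted(non_priority_tasks, key=lambda t: t["priority"], reverse=True)
    alloc = {id(t): 0 for t in ordered}

    # Round 1: give every task its minimum resource number, highest priority number first.
    remaining = total
    for t in ordered:
        give = min(t["min_resources"], remaining)
        alloc[id(t)] += give
        remaining -= give

    # Round 2: split what is left among priority tasks in proportion to priority number,
    # capped at each task's optimal (best) resource number.
    if remaining > 0 and priority_tasks:
        weight = sum(t["priority"] for t in priority_tasks)
        for t in priority_tasks:
            share = int(remaining * t["priority"] / weight)
            alloc[id(t)] += max(min(share, t["best_resources"] - alloc[id(t)]), 0)
        remaining = total - sum(alloc.values())

    # Round 3: any surplus beyond the priority tasks' optimal resources goes to
    # non-priority tasks, again in proportion to priority number.
    if remaining > 0 and non_priority_tasks:
        weight = sum(t["priority"] for t in non_priority_tasks)
        for t in non_priority_tasks:
            alloc[id(t)] += int(remaining * t["priority"] / weight)

    return alloc
```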
It can be seen that, in this embodiment, after the first receiving module 501 receives a model training request, the first processing module 502 generates a corresponding training task and obtains the corresponding identification information from the model training request. The first processing module 502 then calculates the priority number of the training task according to the identification information of the training task and the type of the recognition model of the training task, where the priority number indicates the execution priority level of the training task. When there are multiple training tasks, because the priority number of a training task is determined by two types of data, namely the identification information and the type of the training model, the resource allocation module 503 schedules the training tasks in the training cloud according to the calculated priority numbers, so that the training resources can be reasonably shared by multiple training tasks, improving training efficiency.
The training device for the incremental learning cloud system in the embodiments of the present invention has been described above. The following describes an identification device for the incremental learning cloud system in the embodiments of the present invention. The identification device may be composed of one application server or multiple application servers, a UE may connect to the identification device through a network, and the identification device is mainly configured to identify data from the UE according to the recognition model generated by the training device. Referring to FIG. 6, FIG. 6 is a diagram of an embodiment of the identification device according to an embodiment of the present invention. The identification device may include:
a second receiving module 601, configured to receive unidentified data, where the unidentified data is sent by a user equipment UE or provided by a storage device.
The second receiving module 601 can implement step 401 in the embodiment shown in FIG. 4. The second receiving module 601 is mainly configured to receive the unidentified data read from the storage device, where the unidentified data stored in the storage device is sent by the UE to the storage device. The source of the unidentified data is similar to the description of step 401 in the embodiment shown in FIG. 4 and is not repeated here.
a second processing module 602, configured to identify the unidentified data according to a recognition model, where the recognition model is provided by a training device in the cloud system in which the identification device is located.
The second processing module 602 can implement step 402 in the embodiment shown in FIG. 4. After the training device finishes training the recognition model, there are two ways to handle the recognition model: one is to back up the recognition model in the storage device of the cloud system, and the other is to provide the trained recognition model directly to the identification device. That is, the recognition model is sent to the identification device by the training device in the cloud system in which the identification device is located; or the recognition model is read by the identification device from the storage device, where the recognition model stored in the storage device is sent by the training device in the cloud system in which the identification device is located.
a statistics module 603, configured to collect identification information for the unidentified data according to the recognition model.
The statistics module 603 can implement step 403 in the embodiment shown in FIG. 4, that is, it collects the identification information obtained after the unidentified data is identified by using the recognition model. The identification information may include at least one of a correct rate value, a newly added data amount, and a newly added data category, and the identification threshold includes at least one of a correct rate threshold, a newly added data amount threshold, and a newly added category threshold.
The second processing module 602 is further configured to: when the identification information exceeds a preset identification threshold, send a model training request to the training cloud, so that the training cloud trains the recognition model, where the model training request carries the identification information and the type of the recognition model. The conditions for issuing the model training request are similar to the description of step 404 in the embodiment shown in FIG. 4 and mainly rely on the preset identification threshold, so they are not repeated here.
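A minimal sketch of this threshold check on the identification side is given below; the field names, threshold values, comparison directions, and the transport callback are assumptions, since this embodiment only requires comparing the collected identification information with preset identification thresholds.

```python
# Illustrative trigger for sending a model training request to the training cloud.
THRESHOLDS = {"correct_rate": 0.90, "new_data_amount": 10000, "new_data_categories": 5}

def needs_retraining(stats: dict) -> bool:
    """stats: identification information collected while recognizing unidentified data."""
    return (stats.get("correct_rate", 1.0) < THRESHOLDS["correct_rate"]            # accuracy dropped
            or stats.get("new_data_amount", 0) > THRESHOLDS["new_data_amount"]      # too much new data
            or stats.get("new_data_categories", 0) > THRESHOLDS["new_data_categories"])

def maybe_request_training(stats: dict, model_type: str, send_to_training_cloud):
    """send_to_training_cloud: transport callback, assumed to be provided elsewhere."""
    if needs_retraining(stats):
        send_to_training_cloud({"model_type": model_type, "identification_info": stats})
```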
The identification device further includes a sending module 604, configured to send the identification information to the storage device. This allows the identification information to serve as a reference when the same recognition model later identifies unidentified data and generates identification information.
The following describes the structure of the training device in the embodiments of the present invention. Referring to FIG. 7, FIG. 7 is a diagram of an embodiment of the training device according to an embodiment of the present invention. The training device 7 may include at least one processor 701, at least one receiver 702, and at least one transmitter 703, all connected to a bus. The training device according to the embodiments of the present invention may have more or fewer components than those shown in FIG. 7, two or more components may be combined, or the components may have different configurations or arrangements, and each component may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application-specific integrated circuits.
Specifically, for the embodiment shown in FIG. 5, the processor 701 can implement the functions of the first processing module 502 and the resource allocation module 503 in the embodiment shown in FIG. 5, the receiver 702 can implement the function of the first receiving module 501 in the embodiment shown in FIG. 5, and the transmitter 703 can implement the function, in the embodiment shown in FIG. 5, of the training device sending the recognition model to the storage device or the identification device in the cloud system.
The following describes the structure of the identification device in the embodiments of the present invention. Referring to FIG. 8, FIG. 8 is a diagram of an embodiment of the identification device according to an embodiment of the present invention. The identification device 8 may include at least one processor 801, at least one receiver 802, and at least one transmitter 803, all connected to a bus. The identification device according to the embodiments of the present invention may have more or fewer components than those shown in FIG. 8, two or more components may be combined, or the components may have different configurations or arrangements, and each component may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application-specific integrated circuits.
Specifically, for the embodiment shown in FIG. 6, the processor 801 can implement the functions of the second processing module 602 and the statistics module 603 in the embodiment shown in FIG. 6, the receiver 802 can implement the function of the second receiving module 601 in the embodiment shown in FIG. 6, and the transmitter 803 can implement the function of the sending module 604 in the embodiment shown in FIG. 6. A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of the units is merely a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing embodiments are merely intended to describe the technical solutions of the present invention, and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some of the technical features therein, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (32)

  1. A scheduling method for an incremental learning cloud system, comprising:
    receiving, by a training cloud, a model training request sent by a recognition cloud in the cloud system in which the training cloud is located, wherein the model training request carries identification information and a type of a recognition model;
    generating, by the training cloud, a corresponding training task according to the identification information and the type of the recognition model;
    calculating, by the training cloud, a priority number of the training task by using the identification information, wherein the priority number of the training task corresponds to an execution priority level of the training task; and
    allocating, by the training cloud, training resources to the training task according to the priority number, and executing the corresponding training task according to the execution priority level.
  2. The scheduling method for an incremental learning cloud system according to claim 1, wherein the identification information comprises at least one of a correct rate value, a newly added data amount, and a newly added data category.
  3. The scheduling method for an incremental learning cloud system according to claim 2, wherein the identification information comprises the correct rate value, the newly added data amount, and the newly added data category, and the method further comprises:
    when the types of the recognition models of two training tasks are the same, selecting the training task with a higher trigger priority or a higher execution priority as the training task of the recognition model of that type, wherein the trigger priority is a priority order, set according to the type of the recognition model, of the correct rate value, the newly added data amount, and the newly added data category in the identification information.
  4. The scheduling method for an incremental learning cloud system according to claim 2, wherein the calculating, by the training cloud, a priority number of the training task by using the identification information comprises:
    determining, by the training cloud, a task parameter corresponding to the training task;
    determining, by the training cloud, a first weighting factor of the identification information and a second weighting factor of the task parameter; and
    calculating, by the training cloud, the priority number of the training task according to the identification information, the task parameter, the first weighting factor, and the second weighting factor.
  5. The scheduling method for an incremental learning cloud system according to claim 4, wherein the training task is an executing task or a non-executing task;
    when the training task is an executing task, the task parameter comprises a task importance level parameter and a running time estimation parameter; and
    when the training task is a non-executing task, the task parameter comprises a task importance level parameter, a model parameter, a waiting time parameter, and a running time estimation parameter.
  6. The scheduling method for an incremental learning cloud system according to any one of claims 1 to 5, wherein the allocating, by the training cloud, training resources to the training task according to the priority number comprises:
    determining, by the training cloud, a minimum resource number and an optimal resource number required by the training task according to the recognition model corresponding to the training task;
    determining, by the training cloud, training tasks corresponding to priority numbers whose difference from a maximum priority number is smaller than a preset priority number threshold as candidate tasks in a candidate task set;
    calculating, by the training cloud, an allocated resource number of each candidate task in the candidate task set according to the priority numbers;
    determining, by the training cloud, a candidate task whose allocated resource number is not smaller than the minimum resource number of the candidate task as a priority task;
    determining, by the training cloud, a candidate task whose allocated resource number is smaller than the minimum resource number of the candidate task, and for which the difference between the minimum resource number of the candidate task and the allocated resource number of the candidate task is smaller than a preset resource adjustment threshold, as a non-priority task; and
    allocating, by the training cloud, corresponding resources to the candidate tasks in the candidate task set according to the priority tasks and the non-priority tasks.
  7. The scheduling method for an incremental learning cloud system according to claim 6, wherein the method further comprises:
    removing, by the training cloud, from the candidate task set a candidate task for which the difference between the minimum resource number of the candidate task and the allocated resource number of the candidate task is greater than the preset resource adjustment threshold.
  8. The scheduling method for an incremental learning cloud system according to claim 7, wherein the priority tasks and the non-priority tasks in the candidate task set are arranged in descending order of priority number, and the priority tasks are placed before the non-priority tasks.
  9. The scheduling method for an incremental learning cloud system according to claim 8, wherein the allocating corresponding resources to the candidate tasks in the candidate task set according to the priority tasks and the non-priority tasks comprises:
    in a first resource allocation, allocating, by the training cloud, to the priority tasks and the non-priority tasks their respective minimum resource numbers in the order of the priority numbers; and
    in a second resource allocation, allocating, by the training cloud, the resources remaining after the first resource allocation to the priority tasks in proportion to the priority numbers.
  10. The scheduling method for an incremental learning cloud system according to claim 9, wherein the method further comprises:
    in a third resource allocation, allocating, by the training cloud, the remaining resources that exceed the optimal resource numbers of the priority tasks after the second resource allocation to the non-priority tasks in proportion to the priority numbers.
  11. The scheduling method for an incremental learning cloud system according to any one of claims 1 to 10, wherein the type of the recognition model comprises at least one of face recognition, image classification, voice analysis, and video classification.
  12. A training method for an incremental learning cloud system, comprising:
    receiving, by a recognition cloud, unidentified data, wherein the unidentified data is sent by a user equipment UE or provided by a storage device;
    identifying, by the recognition cloud, the unidentified data according to a recognition model, wherein the recognition model is provided by a training cloud in the cloud system in which the recognition cloud is located;
    collecting, by the recognition cloud, identification information for the unidentified data according to the recognition model; and
    when the identification information exceeds a preset identification threshold, sending, by the recognition cloud, a model training request to the training cloud, so that the training cloud trains the recognition model, wherein the model training request carries the identification information and the type of the recognition model.
  13. The training method for an incremental learning cloud system according to claim 12, wherein the method further comprises:
    sending, by the recognition cloud, the identification information to the storage device.
  14. The training method for an incremental learning cloud system according to claim 12, wherein the receiving, by the recognition cloud, unidentified data comprises:
    receiving, by the recognition cloud, the unidentified data read from the storage device, wherein the unidentified data stored in the storage device is sent by the UE to the storage device.
  15. The training method for an incremental learning cloud system according to claim 12, wherein the recognition model being provided by the training cloud in the cloud system in which the recognition cloud is located comprises:
    the recognition model is sent to the recognition cloud by the training cloud in the cloud system in which the recognition cloud is located; or
    the recognition model is read by the recognition cloud from the storage device, wherein the recognition model stored in the storage device is sent by the training cloud in the cloud system in which the recognition cloud is located.
  16. The training method for an incremental learning cloud system according to any one of claims 12 to 15, wherein the identification information comprises at least one of a correct rate value, a newly added data amount, and a newly added data category, and the identification threshold comprises at least one of a correct rate threshold, a newly added data amount threshold, and a newly added category threshold.
  17. A training device for an incremental learning cloud system, comprising:
    a first receiving module, configured to receive a model training request sent by a recognition cloud in the cloud system in which the training cloud is located, wherein the model training request carries identification information and a type of a recognition model;
    a first processing module, configured to generate a corresponding training task according to the identification information and the type of the recognition model,
    wherein the first processing module is further configured to calculate a priority number of the training task by using the identification information, and the priority number of the training task corresponds to an execution priority level of the training task; and
    a resource allocation module, configured to allocate training resources to the training task according to the priority number, and execute the corresponding training task according to the execution priority level.
  18. The training device according to claim 17, wherein the identification information comprises at least one of a correct rate value, a newly added data amount, and a newly added data category.
  19. The training device according to claim 18, wherein the identification information comprises the correct rate value, the newly added data amount, and the newly added data category, and the first processing module is further configured to:
    when the types of the recognition models of two training tasks are the same, select the training task with a higher trigger priority or a higher execution priority as the training task of the recognition model of that type, wherein the trigger priority is a priority order, set according to the type of the recognition model, of the correct rate value, the newly added data amount, and the newly added data category in the identification information.
  20. The training device according to claim 18, wherein the first processing module is specifically configured to:
    determine a task parameter corresponding to the training task;
    determine a first weighting factor of the identification information and a second weighting factor of the task parameter; and
    calculate the priority number of the training task according to the identification information, the task parameter, the first weighting factor, and the second weighting factor.
  21. The training device according to claim 20, wherein the training task is an executing task or a non-executing task;
    when the training task is an executing task, the task parameter comprises a task importance level parameter and a running time estimation parameter; and
    when the training task is a non-executing task, the task parameter comprises a task importance level parameter, a model parameter, a waiting time parameter, and a running time estimation parameter.
  22. The training device according to any one of claims 17 to 21, wherein the resource allocation module is specifically configured to:
    determine a minimum resource number and an optimal resource number required by the training task according to the recognition model corresponding to the training task;
    determine training tasks corresponding to priority numbers whose difference from a maximum priority number is smaller than a preset priority number threshold as candidate tasks in a candidate task set;
    calculate an allocated resource number of each candidate task in the candidate task set according to the priority numbers;
    determine a candidate task whose allocated resource number is not smaller than the minimum resource number of the candidate task as a priority task;
    determine a candidate task whose allocated resource number is smaller than the minimum resource number of the candidate task, and for which the difference between the minimum resource number of the candidate task and the allocated resource number of the candidate task is smaller than a preset resource adjustment threshold, as a non-priority task; and
    allocate corresponding resources to the candidate tasks in the candidate task set according to the priority tasks and the non-priority tasks.
  23. The training device according to claim 22, wherein the resource allocation module is further configured to:
    remove from the candidate task set a candidate task for which the difference between the minimum resource number of the candidate task and the allocated resource number of the candidate task is greater than the preset resource adjustment threshold.
  24. The training device according to claim 23, wherein the priority tasks and the non-priority tasks in the candidate task set are arranged in descending order of priority number, and the priority tasks are placed before the non-priority tasks.
  25. The training device according to claim 24, wherein the resource allocation module is specifically configured to:
    in a first resource allocation, allocate to the priority tasks and the non-priority tasks their respective minimum resource numbers in the order of the priority numbers; and
    in a second resource allocation, allocate the resources remaining after the first resource allocation to the priority tasks in proportion to the priority numbers.
  26. The training device according to claim 25, wherein the resource allocation module is specifically configured to:
    in a third resource allocation, allocate the remaining resources that exceed the optimal resource numbers of the priority tasks after the second resource allocation to the non-priority tasks in proportion to the priority numbers.
  27. The training device according to any one of claims 17 to 25, wherein the type of the recognition model comprises at least one of face recognition, image classification, voice analysis, and video classification.
  28. An identification device for an incremental learning cloud system, comprising:
    a second receiving module, configured to receive unidentified data, wherein the unidentified data is sent by a user equipment UE or provided by a storage device;
    a second processing module, configured to identify the unidentified data according to a recognition model, wherein the recognition model is provided by a training device in the cloud system in which the identification device is located; and
    a statistics module, configured to collect identification information for the unidentified data according to the recognition model,
    wherein the second processing module is further configured to: when the identification information exceeds a preset identification threshold, send a model training request to the training cloud, so that the training cloud trains the recognition model, and the model training request carries the identification information and the type of the recognition model.
  29. The identification device according to claim 28, wherein the identification device further comprises:
    a sending module, configured to send the identification information to the storage device.
  30. The identification device according to claim 28, wherein the second receiving module is specifically configured to:
    receive the unidentified data read from the storage device, wherein the unidentified data stored in the storage device is sent by the UE to the storage device.
  31. The identification device according to claim 28, wherein the recognition model is sent to the identification device by a training device in the cloud system in which the identification device is located; or
    the recognition model is read by the identification device from the storage device, wherein the recognition model stored in the storage device is sent by the training device in the cloud system in which the identification device is located.
  32. The identification device according to any one of claims 28 to 31, wherein the identification information comprises at least one of a correct rate value, a newly added data amount, and a newly added data category, and the identification threshold comprises at least one of a correct rate threshold, a newly added data amount threshold, and a newly added category threshold.
PCT/CN2016/071970 2016-01-25 2016-01-25 Method for training and scheduling incremental learning cloud system and related device WO2017127976A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2016/071970 WO2017127976A1 (en) 2016-01-25 2016-01-25 Method for training and scheduling incremental learning cloud system and related device
CN201680018168.2A CN108027889B (en) 2016-01-25 2016-01-25 Training and scheduling method for incremental learning cloud system and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/071970 WO2017127976A1 (en) 2016-01-25 2016-01-25 Method for training and scheduling incremental learning cloud system and related device

Publications (1)

Publication Number Publication Date
WO2017127976A1 true WO2017127976A1 (en) 2017-08-03

Family

ID=59396828

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/071970 WO2017127976A1 (en) 2016-01-25 2016-01-25 Method for training and scheduling incremental learning cloud system and related device

Country Status (2)

Country Link
CN (1) CN108027889B (en)
WO (1) WO2017127976A1 (en)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175677A (en) * 2019-04-16 2019-08-27 平安普惠企业管理有限公司 Automatic update method, device, computer equipment and storage medium
CN111064746A (en) * 2019-12-30 2020-04-24 深信服科技股份有限公司 Resource allocation method, device, equipment and storage medium
CN111738404B (en) * 2020-05-08 2024-01-12 深圳市万普拉斯科技有限公司 Model training task processing method and device, electronic equipment and storage medium
CN112084017B (en) * 2020-07-30 2024-04-19 北京聚云科技有限公司 Memory management method and device, electronic equipment and storage medium
CN114844901B (en) * 2022-05-23 2023-01-31 成都睿信天和科技有限公司 Big data cleaning task processing method based on artificial intelligence and cloud computing system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100088138A1 (en) * 2008-10-07 2010-04-08 International Business Machines Corporation Method and system for integrated short-term activity resource staffing levels and long-term resource action planning for a portfolio of services projects
CN103442076A (en) * 2013-09-04 2013-12-11 上海海事大学 Usability guarantee method for cloud storage system
CN103593680A (en) * 2013-11-19 2014-02-19 南京大学 Dynamic hand gesture recognition method based on self incremental learning of hidden Markov model
CN104331421A (en) * 2014-10-14 2015-02-04 安徽四创电子股份有限公司 High-efficiency processing method and system for big data
CN104965763A (en) * 2015-07-21 2015-10-07 国家计算机网络与信息安全管理中心 Aging perception task scheduling system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103489033A (en) * 2013-09-27 2014-01-01 南京理工大学 Incremental type learning method integrating self-organizing mapping and probability neural network
CN104361311B (en) * 2014-09-25 2017-09-12 南京大学 The visiting identifying system of multi-modal online increment type and its recognition methods


Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105006A (en) * 2018-10-26 2020-05-05 杭州海康威视数字技术股份有限公司 Deep learning network training system and method
CN111105006B (en) * 2018-10-26 2023-08-04 杭州海康威视数字技术股份有限公司 Deep learning network training system and method
CN113767662A (en) * 2019-08-30 2021-12-07 Oppo广东移动通信有限公司 Method, device and system for determining type of wireless signal
CN111104222B (en) * 2019-12-16 2023-06-30 上海众源网络有限公司 Task processing method, device, computer equipment and storage medium
CN111104222A (en) * 2019-12-16 2020-05-05 上海众源网络有限公司 Task processing method and device, computer equipment and storage medium
CN111176852A (en) * 2020-01-15 2020-05-19 上海依图网络科技有限公司 Resource allocation method, device, chip and computer readable storage medium
CN111176852B (en) * 2020-01-15 2024-04-16 上海依图网络科技有限公司 Resource allocation method, device, chip and computer readable storage medium
CN111813523A (en) * 2020-07-09 2020-10-23 北京奇艺世纪科技有限公司 Duration pre-estimation model generation method, system resource scheduling method, device, electronic equipment and storage medium
CN112686387A (en) * 2020-11-24 2021-04-20 中国电子科技集团公司电子科学研究院 Common technical model training and scheduling method and device and readable storage medium
CN112559147A (en) * 2020-12-08 2021-03-26 和美(深圳)信息技术股份有限公司 Dynamic matching algorithm, system and equipment based on GPU resource occupation characteristics
CN112559147B (en) * 2020-12-08 2024-04-19 和美(深圳)信息技术股份有限公司 Dynamic matching method, system and equipment based on GPU (graphics processing Unit) occupied resource characteristics
CN113205177B (en) * 2021-04-25 2022-03-25 广西大学 Electric power terminal identification method based on incremental collaborative attention mobile convolution
CN113205177A (en) * 2021-04-25 2021-08-03 广西大学 Electric power terminal identification method based on incremental collaborative attention mobile convolution
WO2024041400A1 (en) * 2022-08-20 2024-02-29 抖音视界有限公司 Model training task scheduling method and apparatus, and electronic device
CN116089883A (en) * 2023-01-30 2023-05-09 北京邮电大学 Training method for improving classification degree of new and old categories in existing category increment learning
CN116089883B (en) * 2023-01-30 2023-12-19 北京邮电大学 Training method for improving classification degree of new and old categories in existing category increment learning
CN116155750A (en) * 2023-04-19 2023-05-23 之江实验室 Deep learning job resource placement method, system, equipment and storage medium
CN117527807A (en) * 2023-11-21 2024-02-06 扬州万方科技股份有限公司 Multi-micro-cloud task scheduling method, device and equipment
CN117527807B (en) * 2023-11-21 2024-05-31 扬州万方科技股份有限公司 Multi-micro-cloud task scheduling method, device and equipment

Also Published As

Publication number Publication date
CN108027889A (en) 2018-05-11
CN108027889B (en) 2020-07-28

Similar Documents

Publication Publication Date Title
WO2017127976A1 (en) Method for training and scheduling incremental learning cloud system and related device
CN110837410B (en) Task scheduling method and device, electronic equipment and computer readable storage medium
CN107977268B (en) Task scheduling method and device for artificial intelligence heterogeneous hardware and readable medium
EP3736692B1 (en) Using computational cost and instantaneous load analysis for intelligent deployment of neural networks on multiple hardware executors
WO2020258920A1 (en) Network slice resource management method and apparatus
CN108984301B (en) Self-adaptive cloud resource allocation method and device
US11436050B2 (en) Method, apparatus and computer program product for resource scheduling
CN112148468B (en) Resource scheduling method and device, electronic equipment and storage medium
CN111176852A (en) Resource allocation method, device, chip and computer readable storage medium
CN105900064A (en) Method and apparatus for scheduling data flow task
US20060112388A1 (en) Method for dynamic scheduling in a distributed environment
CN104168318A (en) Resource service system and resource distribution method thereof
CN109005130B (en) Network resource allocation scheduling method and device
CN105607952B (en) Method and device for scheduling virtualized resources
CN112996116B (en) Resource allocation method and system for guaranteeing quality of power time delay sensitive service
CN110502321A (en) A kind of resource regulating method and system
CN113946431B (en) Resource scheduling method, system, medium and computing device
WO2024021489A1 (en) Task scheduling method and apparatus, and kubernetes scheduler
Muthusamy et al. Cluster-based task scheduling using K-means clustering for load balancing in cloud datacenters
WO2019144775A1 (en) Resource scheduling method and system based on tdma system
CN112559187A (en) Method and system for dynamically allocating tasks to mobile edge computing server
Choi et al. An enhanced data-locality-aware task scheduling algorithm for hadoop applications
CN111343288A (en) Job scheduling method and system and computing device
CN110780991B (en) Deep learning task scheduling method and device based on priority
CN112559147A (en) Dynamic matching algorithm, system and equipment based on GPU resource occupation characteristics

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16886863

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16886863

Country of ref document: EP

Kind code of ref document: A1