WO2019079994A1 - Core scheduling method and terminal - Google Patents

Core scheduling method and terminal

Info

Publication number
WO2019079994A1
Authority
WO
WIPO (PCT)
Prior art keywords
core
weight value
cores
neural network
convolutional neural
Prior art date
Application number
PCT/CN2017/107614
Other languages
French (fr)
Chinese (zh)
Inventor
曹海恒
谭利文
杜明亮
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to PCT/CN2017/107614 priority Critical patent/WO2019079994A1/en
Priority to CN201780064697.0A priority patent/CN109937410B/en
Publication of WO2019079994A1 publication Critical patent/WO2019079994A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers

Definitions

  • Embodiments of the present invention relate to the field of chip systems, and in particular, to a core scheduling method and a terminal.
  • A convolutional neural network is a feedforward neural network whose artificial neurons respond to stimuli within a portion of the receptive field; it performs particularly well on large-scale image processing.
  • CNN: convolutional neural network.
  • To improve the computing power of the terminal, the system chip on the terminal often includes multiple heterogeneous cores, so that different services can be executed on different cores.
  • Embodiments of the present invention provide a core scheduling method and a terminal, which are used to provide an adapted core for a convolutional neural network model.
  • A first aspect of the embodiments of the present invention provides a core scheduling method, including: acquiring a target model parameter, where the target model parameter represents the computational density of a convolutional neural network model; convolutional neural network models of different computational densities are suited to running on different cores. Then, according to the target model parameter, core weight values of at least two cores are determined from a preset first correspondence, the core weight values of the at least two cores corresponding to the target model parameter. The at least two cores are heterogeneous cores on the terminal; because heterogeneous cores have different hardware characteristics, different cores are suited to running convolutional neural network models of different computational densities.
  • The first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model. A core adapted to running the convolutional neural network model can therefore be determined from the at least two cores based on their core weight values.
  • In embodiments of the present invention, the core weight value may be used directly, that is, the core with the largest core weight value is selected to run the convolutional neural network model; or it may be used indirectly, for example by correcting the core weight value with other parameters to obtain a modified weight value, which is then used to determine the core that runs the convolutional neural network model.
  • In this way, an adapted core can be determined to run a convolutional neural network model of a specific computational density. If a core with a larger core weight value runs the convolutional neural network model more efficiently, then the core determined according to the core weight values runs the model efficiently.
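A minimal sketch of the direct use of core weight values might look as follows; the first-correspondence table, the core names, and all weight values are hypothetical illustrations, not taken from the patent:

```python
# Hypothetical first correspondence: a range of the target model parameter
# (here, the weight-parameter count of the CNN model) maps to one core
# weight value per heterogeneous core. All numbers are illustrative.
FIRST_CORRESPONDENCE = [
    ((0, 1_000_000),            {"CPU": 8, "GPU": 5, "DSP": 6, "NPU": 3}),
    ((1_000_000, 50_000_000),   {"CPU": 4, "GPU": 7, "DSP": 5, "NPU": 8}),
    ((50_000_000, float("inf")), {"CPU": 2, "GPU": 6, "DSP": 3, "NPU": 9}),
]

def core_weights(target_model_param: int) -> dict:
    """Look up the core weight values corresponding to the target model parameter."""
    for (low, high), weights in FIRST_CORRESPONDENCE:
        if low <= target_model_param < high:
            return weights
    raise ValueError("target model parameter outside the preset correspondence")

def schedule_core(target_model_param: int) -> str:
    """Direct use of core weight values: pick the core with the largest one."""
    weights = core_weights(target_model_param)
    return max(weights, key=weights.get)
```

With these illustrative tables, a small model lands on the CPU and a very large one on the NPU; the indirect use described above would instead correct these weights before picking a core.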
  • The method of this implementation further includes: acquiring a current state parameter of the terminal, where the state parameter is a dynamically changing parameter. The acquired current state parameter therefore reflects the operating environment of the cores on the terminal, and different operating environments affect how well different cores run the convolutional neural network model.
  • Then, the parameter weight values of the at least two cores are determined from a preset second correspondence according to the state parameter, the parameter weight values of the at least two cores corresponding to the state parameter, where the second correspondence includes the correspondence between the state parameter and the parameter weight values of the at least two cores. The parameter weight value indicates the priority with which a core is selected to run the convolutional neural network model under that state parameter. In this way, the parameter weight value reflects the influence of dynamic environmental factors on the terminal on how the cores run the convolutional neural network model.
  • Determining, according to the core weight values of the at least two cores, the core that runs the convolutional neural network model from the at least two cores then includes: for each core, correcting the core weight value with the parameter weight value to obtain a first modified weight value, where the first modified weight value indicates the priority with which the core is selected to run the convolutional neural network model. The first modified weight value thus carries both the influence of the core weight value and the influence of the parameter weight value. The core that runs the convolutional neural network model is then determined from the at least two cores according to their first modified weight values, making it possible to identify the core better suited to running the model.
  • The acquired current state parameter reflects the terminal's current dynamic operating environment, and the parameter weight value of each core is determined from the state parameter and the second correspondence; the parameter weight value reflects the influence of the state parameter on each core's ability to run the convolutional neural network model, so the core better suited to running the model under that state parameter is scheduled preferentially. The first modified weight value, obtained by correcting the core weight value with the state-derived parameter weight value, takes more factors into account and better reflects each core's suitability, so the core determined from the first modified weight value runs the convolutional neural network model with better results.
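The first correction step described above can be sketched as combining each core's core weight value with its parameter weight value; the patent does not fix the correction formula, so multiplication is chosen here as one plausible option, and all names and numbers are illustrative:

```python
def first_modified_weights(core_w: dict, param_w: dict) -> dict:
    """Correct each core's core weight value with its parameter weight value
    to obtain first modified weight values (multiplicative combination is an
    assumption; the patent only states that a correction is applied)."""
    return {core: core_w[core] * param_w.get(core, 1.0) for core in core_w}

def pick_core(modified: dict) -> str:
    """Select the core with the largest (first) modified weight value."""
    return max(modified, key=modified.get)
```

For example, a heavily loaded NPU can receive a low parameter weight value, letting a less loaded CPU win despite a smaller core weight value.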
  • In one implementation, the current state parameter of the terminal is the current core usage rate of each core. Determining the parameter weight values of the at least two cores from the preset second correspondence according to the state parameter then includes: for each core, determining a performance weight value from the preset second correspondence according to its core usage rate, the performance weight value of each core corresponding to its core usage rate. The performance weight value indicates the priority with which the core is selected to run the convolutional neural network model at that core usage rate, and the second correspondence includes the correspondence between each core's usage rate and its performance weight value. The performance weight value thus determined reflects the extent to which a core's usage rate affects its running of the convolutional neural network model.
  • When the acquired state parameter is the core usage rate, the performance weight value determined from the core usage rate and the second correspondence makes the core usage rate one of the factors considered when scheduling a core; the first modified weight value, obtained by correcting the core weight value with the performance weight value, therefore takes into account the impact of core usage on running the convolutional neural network model.
  • The method of this implementation further includes: acquiring the current remaining power value of the terminal. Then, according to the remaining power value, the power consumption weight values of the at least two cores are determined from a preset third correspondence, the power consumption weight values of the at least two cores corresponding to the remaining power value. The third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores; the power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at that remaining power value. The power consumption weight value thus determined reflects the degree to which the terminal's remaining power influences running the convolutional neural network model.
  • Determining the core that runs the convolutional neural network model from the at least two cores according to their first modified weight values then includes: for each core, correcting the first modified weight value with the power consumption weight value to obtain a second modified weight value, which indicates the priority with which the core is selected to run the convolutional neural network model. The core that runs the model is then determined from the at least two cores according to their second modified weight values. Because the power consumption weight value of each core is determined from the terminal's current remaining power value and the third correspondence, the second modified weight value additionally takes the terminal's remaining power into account.
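The second correction can be sketched in the same way, applying the power consumption weight value to the first modified weight value (again, the combination formula and all numbers are assumptions, not specified by the patent):

```python
def second_modified_weights(first_modified: dict, power_w: dict) -> dict:
    """Correct each core's first modified weight value with its power
    consumption weight value to obtain second modified weight values
    (multiplicative combination is an assumed formula)."""
    return {core: first_modified[core] * power_w.get(core, 1.0)
            for core in first_modified}
```

At low battery, a power-hungry core can be assigned a small power consumption weight value, steering the final selection toward a more energy-efficient core.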
  • The method of this implementation further includes: acquiring the current core usage rate of each core, and then, for each core, determining performance parameters from the second correspondence according to its core usage rate, the performance parameters of each core corresponding to its core usage rate. The second correspondence includes the correspondence between each core's performance parameters and its core usage rate. A core has different operation requirements at different usage rates; by presetting the second correspondence, the operating mode of a core can be controlled through different performance parameters at different usage rates.
  • The method of this implementation further includes: running the convolutional neural network model on the target core using the performance parameters of the target core, where the target core is the core determined to run the convolutional neural network model. In this way, the specific operating mode of the core can be controlled through specific performance parameters, so that the core running the convolutional neural network model operates in a user-set manner that meets the user's control requirements for the core.
  • The performance parameters include one of thread priority information, sleep time information, and thread count information. The thread priority information is the priority of the child threads when the core runs the convolutional neural network model; the sleep time information is the sleep time between two runs of the convolutional neural network model on the core; and the thread count information is the number of threads used when the core runs the convolutional neural network model.
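A hypothetical second correspondence from core usage rate to these performance parameters might look like the following; all thresholds, priorities, and counts are illustrative choices, not values from the patent:

```python
# Hypothetical mapping: the busier the core, the lower the thread priority,
# the longer the sleep between two model runs, and the fewer worker threads.
USAGE_TO_PERF_PARAMS = [
    ((0, 50),   {"thread_priority": "high",   "sleep_ms": 0,  "threads": 4}),
    ((50, 80),  {"thread_priority": "normal", "sleep_ms": 5,  "threads": 2}),
    ((80, 101), {"thread_priority": "low",    "sleep_ms": 20, "threads": 1}),
]

def performance_params(core_usage_pct: int) -> dict:
    """Return the performance parameters for a core at the given usage rate (%)."""
    for (low, high), params in USAGE_TO_PERF_PARAMS:
        if low <= core_usage_pct < high:
            return params
    raise ValueError("usage rate outside 0-100%")
```

The model would then be launched on the target core with these parameters, throttling it on a busy core and letting it run flat out on an idle one.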
  • In another implementation, the current state parameter of the terminal is the terminal's current remaining power value. Determining the parameter weight values of the at least two cores from the preset second correspondence according to the state parameter then includes: determining, according to the remaining power value, the power consumption weight values of the at least two cores from the preset second correspondence, the power consumption weight values of the at least two cores corresponding to the remaining power value, where the power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at that remaining power value, and the second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores. The power consumption weight value thus determined reflects the degree to which the terminal's current remaining power influences running the convolutional neural network model on different cores. When the acquired state parameter is the remaining power value, the power consumption weight value determined from it and the second correspondence makes the remaining power one of the factors considered when scheduling a core; the first modified weight value obtained by correcting the core weight value with the power consumption weight value therefore takes the terminal's remaining power into account.
  • With reference to any one of the first to sixth implementation manners of the first aspect, in a seventh implementation manner of the first aspect of the embodiments of the present invention, the target model parameter is the number of weight parameters of the convolutional neural network model. The number of weight parameters accurately reflects the computational density of the convolutional neural network model.
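As an illustrative sketch of using the weight-parameter count as the target model parameter, one might simply sum the kernel sizes of the model's convolutional layers; the layer shapes and the helper function are hypothetical:

```python
def weight_parameter_count(conv_layer_shapes) -> int:
    """Count the weight parameters of a CNN's convolutional layers as a proxy
    for its computational density. Each shape is a hypothetical tuple:
    (out_channels, in_channels, kernel_height, kernel_width)."""
    return sum(out_c * in_c * kh * kw
               for (out_c, in_c, kh, kw) in conv_layer_shapes)
```

This count would then be the value looked up in the first correspondence to obtain the core weight values.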
  • With reference to any one of the first to seventh implementation manners of the first aspect, in an eighth implementation manner of the first aspect of the embodiments of the present invention, the at least two cores include at least two of a CPU, a GPU, a DSP, and a systolic array processor. The systolic array processor may include, for example, a neural network processor (NPU) or a tensor processing unit (TPU). These computing cores have different characteristics, so the same convolutional neural network model can run on them with different execution efficiencies; the core scheduling method of the embodiments of the present invention can be applied to these cores to effectively determine the one best suited to run the model.
  • With reference to any one of the first to eighth implementation manners of the first aspect, in a ninth implementation manner of the first aspect of the embodiments of the present invention, determining the core weight values of the at least two cores from the preset first correspondence according to the target model parameter includes: determining, in the preset first correspondence, the target model parameter interval in which the target model parameter falls; then determining, in the first correspondence, the core weight value intervals of the at least two cores, the core weight value intervals corresponding to the target model parameter interval, where the first correspondence includes the correspondence between target model parameter intervals and the core weight value intervals of the at least two cores, and the target model parameter interval contains the target model parameter; and, for each core, determining the core weight value from the core weight value interval such that the relative position of the core weight value within the core weight value interval is the same as the relative position of the target model parameter within the target model parameter interval.
  • Because the target model parameter intervals and core weight value intervals in the first correspondence are numerical ranges, they cover more specific parameter values; determining the core weight values of the at least two cores by position mapping reflects the correspondence with the target model parameter more accurately, and the core weight values of different cores are easier to distinguish, so the core weight value better reflects the priority of core selection.
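The position-mapping step above can be sketched as linear interpolation: the target model parameter's relative position within its parameter interval is reproduced within each core's weight value interval (function name and all interval bounds are illustrative):

```python
def map_position(param: float, param_lo: float, param_hi: float,
                 weight_lo: float, weight_hi: float) -> float:
    """Map the target model parameter's relative position in its parameter
    interval [param_lo, param_hi) to the same relative position in a core's
    weight value interval [weight_lo, weight_hi)."""
    fraction = (param - param_lo) / (param_hi - param_lo)
    return weight_lo + fraction * (weight_hi - weight_lo)
```

For example, a parameter halfway through its interval yields a core weight value halfway through that core's weight value interval, so different cores' weight values stay well separated.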
  • an embodiment of the present invention provides a core scheduling method, including:
  • determining, according to a target model parameter, core weight values of at least two cores from a preset first correspondence, the core weight values of the at least two cores corresponding to the target model parameter, and the at least two cores being heterogeneous cores on the terminal, where the first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and the core weight value indicates the priority with which a core is selected to run the convolutional neural network model.
  • Then, the convolutional neural network model is assigned to run on different cores based on the core weight values of the at least two cores.
  • an embodiment of the present invention provides a core scheduling method, including:
  • acquiring a task type parameter, where the task type parameter indicates the type of a computing task; and
  • determining, according to the task type parameter, core weight values of at least two cores from a preset fourth correspondence, the core weight values of the at least two cores corresponding to the task type parameter, and the at least two cores being heterogeneous cores on the terminal, where the fourth correspondence includes the correspondence between the task type parameter and the core weight values of the at least two cores, and the core weight value indicates the priority with which a core is selected to run the computing task.
  • An embodiment of the present invention further provides a terminal, where the terminal has the functions of the terminal in the foregoing methods. The functions can be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the functions described above.
  • the terminal includes:
  • an obtaining unit, configured to acquire a target model parameter, where the target model parameter represents the computational density of a convolutional neural network model;
  • a weight value determining unit configured to determine, according to the target model parameter, a core weight value of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter
  • the at least two cores are heterogeneous cores on the terminal, and the first correspondence relationship includes a correspondence between the target model parameters and core weight values of the at least two cores, where the core weight values are used to indicate that the core is Selecting the priority to run the convolutional neural network model;
  • a core determining unit configured to determine, according to the core weight values of the at least two cores, a core that runs the convolutional neural network model from the at least two cores.
  • the terminal includes: a processor and a memory.
  • the processor may be configured to support a terminal to perform a corresponding function in the method of the first aspect described above.
  • the processor is configured to: acquire target model parameters, the target model parameters are used to represent a computing density of a convolutional neural network model; and determine at least two from the preset first correspondence according to the target model parameters Core core weight values, the core weight values of the at least two cores correspond to the target model parameters, the at least two cores are heterogeneous cores on the terminal, and the first correspondence relationship includes the target model a correspondence between the parameter and the core weight value of the at least two cores, the core weight value being used to indicate a priority of the core selected to run the convolutional neural network model; according to the core weight values of the at least two cores, A core running the convolutional neural network model is determined from the at least two cores.
  • an embodiment of the present invention provides a computer readable storage medium having instructions stored therein that, when run on a computer, cause the computer to perform the methods described in the above aspects.
  • an embodiment of the present invention provides a computer program product comprising instructions that, when run on a computer, cause the computer to perform the methods described in the above aspects.
  • a chip arrangement comprising a processing unit for performing the method of the first aspect described above.
  • a chip arrangement comprising a processor and a memory.
  • the memory includes instructions that the processor runs to perform the methods described in the various aspects above.
  • a chip system comprising a processor for supporting a terminal to implement the functions involved in the first to third aspects above, such as transmitting or processing the data and/or information involved in the foregoing methods.
  • the chip system further includes a memory for storing necessary program instructions and data of the network device.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • In the above technical solutions, a target model parameter is acquired, where the target model parameter represents the computational density of a convolutional neural network model; then, according to the target model parameter, core weight values of at least two cores are determined from a preset first correspondence, the core weight values of the at least two cores corresponding to the target model parameter, and the at least two cores being heterogeneous cores on the terminal, where the first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and the core weight value indicates the priority with which a core is selected to run the convolutional neural network model. The core that runs the convolutional neural network model is then determined from the at least two cores based on their core weight values.
  • Heterogeneous cores on the terminal have different characteristics, and different cores are suited to running convolutional neural network models of different computational densities. Because the first correspondence includes the correspondence between the target model parameter, which represents the computational density of a convolutional neural network model, and the core weight values of the at least two heterogeneous cores on the terminal, the core weight values of the at least two cores can be determined from the preset first correspondence once the target model parameter of a model is obtained. The core weight value indicates the priority with which a core is selected to run the convolutional neural network model, so it can be used to determine a core suitable for running the model; the core that runs the convolutional neural network model is determined from the at least two cores according to their core weight values. In this way, an adapted core can be determined to run a convolutional neural network model of a specific computational density: if a core with a higher core weight value runs the model more efficiently, the core identified from the core weight values runs the convolutional neural network model efficiently.
  • FIG. 1 is a schematic diagram of a convolutional neural network according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a unit of a convolutional neural network according to another embodiment of the present invention.
  • FIG. 3 is a usage scenario diagram related to a core scheduling method according to another embodiment of the present invention.
  • FIG. 4 is a flowchart of a method for a core scheduling method according to another embodiment of the present invention.
  • FIG. 5 is a flowchart of a method for a core scheduling method according to another embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of hardware of a terminal according to another embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of a terminal according to another embodiment of the present invention.
  • A convolutional neural network is a feedforward neural network whose artificial neurons respond to stimuli within a portion of the receptive field; it performs excellently on large-scale image processing.
  • A convolutional neural network consists of one or more convolutional layers and a fully connected layer at the top (corresponding to a classical neural network), and also includes associated weights and pooling layers.
  • This structure enables the convolutional neural network to take advantage of the two-dimensional structure of the input data.
  • convolutional neural networks can give better results in terms of image and speech recognition.
  • This model can also be trained using backpropagation algorithms.
  • convolutional neural networks require fewer parameters to estimate, making them an attractive deep learning structure.
  • Convolutional neural networks differ from ordinary neural networks in that they include a feature extractor composed of convolutional layers and subsampling layers. In a convolutional layer of a convolutional neural network, a neuron is connected only to a portion of the neurons in the adjacent layer. A convolutional layer of a CNN usually contains several feature maps; each feature map is composed of neurons arranged in a rectangle, and the neurons of the same feature map share weights: the shared weights are the convolution kernel.
  • The convolution kernel is generally initialized as a matrix of random fractional values, and it learns reasonable weights during the training of the network.
  • the immediate benefit of shared weights is the reduction of connections between layers of the network while reducing the risk of overfitting.
  • Subsampling, also called pooling, usually takes two forms: mean pooling and max pooling. Subsampling can be seen as a special convolution process. Together, convolution and subsampling greatly simplify the model's complexity and reduce its parameters.
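The max-pooling form of subsampling can be illustrated with a minimal sketch (a hypothetical example, not code from the patent), halving each spatial dimension by keeping the maximum of every 2x2 block:

```python
def max_pool_2x2(matrix):
    """2x2 max pooling (subsampling) over a 2D list with even dimensions:
    each non-overlapping 2x2 block is replaced by its maximum value."""
    return [
        [max(matrix[i][j], matrix[i][j + 1],
             matrix[i + 1][j], matrix[i + 1][j + 1])
         for j in range(0, len(matrix[0]), 2)]
        for i in range(0, len(matrix), 2)
    ]
```

Mean pooling would be the analogous operation with the average of each block instead of the maximum.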
  • Neural networks, including convolutional neural networks, consist of stacked layers, each composed of nodes. Operations are performed in the nodes, whose mode of operation is roughly similar to that of a human neuron: a node activates and emits a signal when it receives sufficient stimulus. A node combines input data with a set of coefficients (or weight parameters) that amplify or suppress each input, thereby assigning its importance in the learning task. As shown in FIG. 1, the bias input is 1, X1 to Xm are the input data, and W0 to Wm are the weight parameters. The sum of the products of the input data and the weight parameters enters the node's activation function, which determines whether, and how far, the signal continues to propagate through the network, and thus how the signal affects the network's final result. A unit of the convolutional neural network can be as shown in FIG. 2, where h(x) is the output data and X1 to Xm are the input data.
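The node computation of FIGs. 1 and 2 can be sketched as a weighted sum passed through an activation function; a sigmoid is chosen here purely for illustration, since the figures do not fix a particular activation:

```python
import math

def neuron_output(inputs, weights, bias):
    """Compute a node's output: the bias plus the sum of products of inputs
    and weight parameters, passed through a sigmoid activation (illustrative
    choice of activation function)."""
    z = bias + sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-z))
```

With all-zero stimulus and zero bias the sigmoid sits at its midpoint, and a strongly positive weighted sum drives the output toward 1, i.e. the node "fires".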
  • Many such units connected together form a neural network model.
  • convolutional neural networks have been applied in many directions, such as speech recognition, face recognition, general object recognition, motion analysis, and natural language processing.
  • convolutional neural networks have more and more applications on mobile terminals.
  • the application forms of smart albums include image classification, feature extraction, face clustering, etc.
  • the computational characteristics of these applications include a large number of matrix operations.
  • The convolutional neural network model is a specific network model (or algorithm) obtained after training a convolutional neural network, and a method using it has the characteristics of a convolutional neural network. A convolutional neural network model has a specific computational density and can be used to execute a specific application service.
  • A terminal device has a variety of cores (also known as processors or computing units) that form its system chip. The cores in the embodiments of the present invention are mainly heterogeneous cores, whose types include but are not limited to the following.
  • CPU (Central Processing Unit): the computing and control core of a computer; its main functions are to interpret computer instructions and to process data in computer software.
  • GPU (Graphics Processing Unit): also known as a display core, visual processor, or display chip; a processor that performs graphics computation on personal computers, workstations, game consoles, and mobile devices such as tablets and smartphones.
  • DSP (Digital Signal Processor).
  • A systolic array processor is an application-specific integrated circuit (ASIC) that employs a systolic array structure, in which data "flows" rhythmically between the processing units of the array in a predetermined manner. All processing units process the data flowing through them in parallel at the same time, so the processor can achieve a high parallel processing speed.
  • The systolic array processor may specifically be a Neural-network Processing Unit (NPU), a Tensor Processing Unit (TPU), an Intelligent Processing Unit (IPU), or the like.
  • The NPU realizes the integration of storage and computation through synaptic weights, thereby greatly improving operational efficiency.
  • TPU (Tensor Processing Unit): artificial intelligence aims to give machines human-like intelligence, and machine learning is a powerful way to implement artificial intelligence; machine learning studies how to make computers learn automatically.
  • The TPU is a chip dedicated to machine learning. It can serve as a programmable artificial intelligence (AI) accelerator for the TensorFlow platform and is essentially an accelerator with a systolic array structure; its internal instruction set can still run when a TensorFlow program changes or its algorithm is updated.
  • The TPU provides high-throughput, low-precision computation for model inference (forward computation) rather than model training, with higher energy efficiency (TOPS/W).
  • The TPU can also be called an Intelligent Processing Unit (IPU).
  • FIG. 3 is a usage scenario diagram related to a core scheduling method according to an embodiment of the present invention.
• A system on chip (SoC) is disposed on the terminal. The system chip includes at least two cores, and the at least two cores are heterogeneous cores.
• The at least two cores may include a CPU, a GPU, a DSP, a systolic array processor, and the like. Systolic array processors include, but are not limited to, neural network processors, tensor processors, and the like. These chips, which perform computation on the terminal, can be called cores, and different cores have different energy efficiency ratios.
  • the terminal can perform different application services by using a specific algorithm.
  • the method in the embodiment of the present invention involves running a convolutional neural network model, and the terminal can perform different application services using the convolutional neural network model.
• When the terminal performs different application services, it encounters different requirements.
• For example, a real-time scene application (such as camera preview) requires real-time recognition of images and thus has high performance requirements, whereas a gallery application classifying imported images in the background has lower real-time requirements for the operation, the requirement there being to reduce power consumption.
• Therefore, when the terminal runs a specific convolutional neural network model, it needs to perform effective core scheduling according to the computing requirements (for example, performance, power consumption, etc.) and schedule a core to run the convolutional neural network model to perform a specific service. This is beneficial to the execution of application services on the terminal, for example, producing faster or more energy efficient execution.
• Accordingly, an embodiment of the present invention provides a core scheduling method for providing an adapted core for a convolutional neural network model, so as to run the convolutional neural network model efficiently.
  • FIG. 4 is a flowchart of a method for a core scheduling method according to an embodiment of the present invention. Referring to the above content and FIG. 4, the method of the embodiment of the present invention includes:
  • Step 401 Acquire target model parameters.
  • the target model parameters are used to represent the computational density of a convolutional neural network model.
• The terminal acquires the target model parameters, from which the computational density of the specific convolutional neural network model can be determined. Because different cores are suited to running convolutional neural network models of different computational densities, a core can be selected according to the target model parameters to run the convolutional neural network model having those parameters.
  • the target model parameter is the number of weight parameters of the convolutional neural network model.
  • the target model parameter can also be the number of layers of the convolutional neural network model, and the number of layers of the convolutional neural network model can represent the depth of the convolutional neural network model.
  • the target model parameters can also be other parameters, which can reflect the computational density of the convolutional neural network model.
  • the computational density of the convolutional neural network model can also be called the complexity of the convolutional neural network model.
• There are various implementations of step 401. Several examples are given below:
  • Example 1 The terminal acquires a convolutional neural network model, and analyzes the convolutional neural network model to obtain the target model parameters of the convolutional neural network model.
• Example 2: A parsing device acquires a convolutional neural network model and parses it to obtain the target model parameters of the convolutional neural network model, and then the parsing device sends the target model parameters to the terminal, so that the terminal acquires them.
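As an illustration of the parsing in the examples above, counting a model's weight parameters can be sketched as below. This is a minimal sketch under assumed conventions (a model described as a list of layer weight-tensor shapes); the shapes and the function name are hypothetical, not from the patent.

```python
from functools import reduce
from operator import mul

def count_weights(layer_shapes):
    """Total number of weight parameters across all layers: the product
    of each layer's weight-tensor dimensions, summed over layers."""
    return sum(reduce(mul, shape, 1) for shape in layer_shapes)

# Hypothetical model: two conv layers (out_ch, in_ch, kH, kW) and one
# fully connected layer (out_features, in_features).
layers = [(64, 3, 3, 3), (128, 64, 3, 3), (10, 128)]
print(count_weights(layers))  # 76736
```

The resulting count would then serve as the target model parameter matched against the first correspondence in step 402.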
  • Step 402 Determine core weight values of at least two cores from the preset first correspondence according to the target model parameters.
• The core weight values of the at least two cores correspond to the target model parameters, and the at least two cores are heterogeneous cores on the terminal. The first correspondence includes the correspondence between the target model parameters and the core weight values of the at least two cores, and the core weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model.
  • the terminal is preset with a first correspondence, where the first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores.
• The terminal may determine the core weight values of the at least two cores from the first correspondence according to the target model parameters. The correspondence between the core weight values of the at least two cores and the target model parameters is precisely the correspondence, included in the first correspondence, between the target model parameters and the core weight values of the at least two cores.
• The core weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model, so that the terminal can use the core weight values to schedule a suitable core to run the convolutional neural network model. For example, if the first core runs a convolutional neural network model of a specific computational density more efficiently than the second core does, in other words, if the first core is more suitable than the second core for running that convolutional neural network model, then the core weight value of the first core is set higher than the core weight value of the second core, to indicate that, for running this convolutional neural network model, the first core is selected with a greater priority than the second core.
  • the first correspondence relationship may be pre-established using the target model parameter and the core weight values of the at least two cores.
  • the core weight value of the core may be determined from the first correspondence according to the target model parameter.
• The core weight value is specifically used to indicate the degree to which the core's hardware characteristics, architectural characteristics, or computing mode are adapted to running the convolutional neural network model.
• The at least two cores in the first correspondence are heterogeneous cores on the terminal with different characteristics, so different operational effects are produced when the convolutional neural network model is run; setting core weight values is therefore feasible.
• As for which convolutional neural network model a core is suited to run, this can be obtained by pre-testing. For example, a test efficiency parameter is set, representing the time taken to run the convolutional neural network model to perform a specific service. A convolutional neural network model with a specific computational density is then run on different cores to obtain the test efficiency parameters of the different cores, and a larger core weight value is configured for the core with the better test efficiency parameter.
  • the core weight value may be indicative of the suitability of the core hardware characteristics for running a convolutional neural network model.
  • At least two cores in step 402 are heterogeneous cores on the terminal, and different heterogeneous cores have different hardware characteristics, so that different heterogeneous cores are suitable for running convolutional neural network models with different computational densities. It is more practical to schedule heterogeneous cores to run a convolutional neural network model.
• The at least two cores may be at least two of a CPU, a GPU, a DSP, and a systolic array processor. That is, the heterogeneous cores set on the terminal may be any two, any three, any four, or all of the CPU, GPU, DSP, and systolic array processor. It can be understood that the heterogeneous cores set on the terminal can also be other cores.
  • the systolic array processor can be a specific chip such as an NPU or a TPU.
  • core weight values of the at least two cores determined in step 402 have multiple specific implementation forms, as follows:
• The core weight value may take a numerical form: a percentage, for example 10%, 30%, etc.; a fraction; or a decimal, for example 0.5, 1.0, 1.5, etc.
  • the core weight value is in the form of a level representation, such as a first priority, a fifth priority, and the like.
  • the target model parameter and the core weight value have multiple representations.
• The target model parameter or the core weight value may be a specific value; for example, the target model parameter may be 1000, 2000, etc., and the core weight value may be 0.5, 0.2, and so on.
• The target model parameter or the core weight value may also be a numerical interval, that is, a numerical range; for example, the target model parameter may be the interval [10000, 15000], [15000, 20000], etc., and the core weight value may be the interval [0.1, 0.6], [0.6, 0.8], etc.
  • the embodiment of the present invention does not limit the specific representation form of the target model parameter and the core weight value in the first correspondence relationship.
• There are various specific implementations of step 402; two examples are given below:
• In the first example, the target model parameters and the core weight values included in the first correspondence are specific values. The specific implementation of step 402 then includes: matching the target model parameter of step 401 against the target model parameters in the first correspondence; if the match succeeds, the core weight values of the at least two cores corresponding to the matched target model parameter are determined from the first correspondence.
• In the second example, the first correspondence includes target model parameters and core weight values in the form of target model parameter intervals and core weight value intervals. The specific implementation of step 402 then includes:
  • Step A1 In the preset first correspondence, the target model parameter interval in which the target model parameter of step 401 is located is determined.
• For example, the target model parameter is the number of weight parameters, and the first correspondence includes the correspondence between the weight parameter number interval [10 million, 30 million] and the CPU core weight value interval [0.2, 0.4], as well as the GPU core weight value interval [0.1, 0.3]. Suppose the target model parameter of the convolutional neural network model acquired by the terminal is 20 million; in the first correspondence, this target model parameter is determined to fall within the weight parameter number interval [10 million, 30 million].
  • Step A2 In the first correspondence, determine a core weight value interval of at least two cores.
• The core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes the correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter. Continuing the example, the first correspondence includes the correspondence between the weight parameter number interval [10 million, 30 million] and the CPU core weight value interval [0.2, 0.4], and between the weight parameter number interval [10 million, 30 million] and the GPU core weight value interval [0.1, 0.3]. Accordingly, the CPU core weight value interval [0.2, 0.4] and the GPU core weight value interval [0.1, 0.3] corresponding to the weight parameter number interval are determined.
  • Step A3 For each core, the core weight value is determined from the core weight value interval.
• The position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval. Continuing the example, the target model parameter of 20 million lies at one-half of the target model parameter interval [10 million, 30 million], so for the CPU the core weight value 0.3 at one-half of the core weight value interval [0.2, 0.4] is determined, and for the GPU the core weight value 0.2 at one-half of the core weight value interval [0.1, 0.3] is determined.
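Steps A1 to A3 can be sketched as a linear interpolation, assuming the "same position in the interval" rule is a literal linear mapping (the function and variable names are ours, not the patent's). With 20 million weight parameters in the interval [10 million, 30 million], which lies at one-half, this yields the CPU and GPU weight values 0.3 and 0.2:

```python
def interpolate_weight(param, param_interval, weight_interval):
    """Map the target model parameter's relative position in its interval
    to the same relative position in the core weight value interval."""
    lo, hi = param_interval
    w_lo, w_hi = weight_interval
    pos = (param - lo) / (hi - lo)        # e.g. 20M in [10M, 30M] -> 0.5
    return w_lo + pos * (w_hi - w_lo)

cpu_w = interpolate_weight(20_000_000, (10_000_000, 30_000_000), (0.2, 0.4))
gpu_w = interpolate_weight(20_000_000, (10_000_000, 30_000_000), (0.1, 0.3))
print(round(cpu_w, 3), round(gpu_w, 3))  # 0.3 0.2
```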
• In a third example, the first correspondence includes target model parameters that are numerical intervals and core weight values that are specific values. The specific implementation of step 402 then includes: determining, in the preset first correspondence, the target model parameter interval in which the target model parameter of step 401 is located, and then determining, in the first correspondence, the core weight values of the at least two cores corresponding to that target model parameter interval.
  • Step 403 Determine a core of the running convolutional neural network model from at least two cores according to core weight values of at least two cores.
• The core weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model, so the core to run the convolutional neural network model can be determined from the at least two cores according to the core weight values of the at least two cores. That is, the core weight values are used to determine a core suited to running the convolutional neural network model of step 401, so that this core can be scheduled to run the convolutional neural network model and execute a specific application service.
• There are a number of specific implementations of step 403; a few examples are given below:
  • the core with the largest core weight value is determined from the at least two cores, and the core with the largest core weight value is used to run the convolutional neural network model.
  • the core weight value is used to indicate that the core is selected to run the convolutional neural network model, so that the core with a large core weight value is scheduled to run the convolutional neural network model in preference to the core with a small core weight value.
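The first example of step 403 amounts to selecting the maximum over the core weight values. A minimal sketch with hypothetical values:

```python
# Hypothetical core weight values determined in step 402.
core_weights = {"CPU": 0.3, "GPU": 0.2, "NPU": 0.6}

# Schedule the core with the largest core weight value.
selected = max(core_weights, key=core_weights.get)
print(selected)  # NPU
```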
• In a second example, in order to perform step 403, the terminal further performs other steps to obtain a parameter for correcting the core weight value, uses that parameter to correct the core weight value to obtain a modified weight value, and determines the core to run the convolutional neural network model from the at least two cores by using the modified weight value. The modified weight value introduces more reference factors and thus reflects the appropriate core to schedule better than the core weight value alone does. Details are as follows:
  • the method of this example also includes:
  • Step B1 Obtain the current state parameter of the terminal.
• The state parameter is a dynamically changing parameter on the terminal, and it reflects the specific operating environment in which the terminal runs the convolutional neural network model.
• A core running the convolutional neural network model will produce different effects under different operating environments.
  • the status parameters include, but are not limited to, the remaining power value of the terminal, the core usage rate, the temperature of the core, and the like.
  • Step B2 Determine parameter weight values of at least two cores from the preset second correspondence according to the state parameters.
• The parameter weight values of the at least two cores correspond to the state parameter, and the second correspondence includes the correspondence between the state parameter and the parameter weight values of the at least two cores. The parameter weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model under the state parameter.
  • a second correspondence is preset on the terminal, where the second correspondence includes a correspondence between the state parameter and the parameter weight value of the at least two cores.
• Thus the terminal may determine the parameter weight values of the at least two cores from the preset second correspondence according to the state parameter, the parameter weight values of the at least two cores corresponding to the state parameter. The correspondence between the parameter weight values of the at least two cores and the state parameter is the correspondence, included in the second correspondence, between the state parameter and the parameter weight values of the at least two cores.
  • the parameter weight value is used to indicate the priority of the core selected to run the convolutional neural network model under the state parameter.
• A core with a larger parameter weight value takes precedence over a core with a smaller parameter weight value for running the convolutional neural network model. Therefore, the terminal can use the parameter weight value to correct the core weight value of step 402, so that the modified weight value further takes into account the state parameter of the terminal and better reflects which core is suited to running the convolutional neural network model.
• The at least two cores of step B2 and the at least two cores of step 402 refer to the same cores.
• As for which core is more suitable for running the convolutional neural network model under different state parameters, this can also be obtained by pre-testing. For example, a test efficiency parameter is set, a convolutional neural network model with a specific computational density is run on different cores while each core is under the specific state parameter of the terminal, and the test efficiency parameters of the different cores are obtained. A larger parameter weight value is then configured for the core with the better test efficiency parameter.
• The parameter weight values of the at least two cores determined in step B2 have multiple specific representation forms, such as a percentage, a fraction, or a level representation.
• The state parameter and the parameter weight value may also have multiple representations; for example, each may be a specific numerical value or a numerical interval. For details, refer to the detailed description of the representations of the target model parameter and the core weight value in the first correspondence.
• There are various concrete implementations of step B2; two examples are given below:
• In the first example, the state parameters and the parameter weight values included in the second correspondence are specific values. The specific implementation of step B2 then includes: matching the state parameter of step B1 against the state parameters in the second correspondence; if the match succeeds, the parameter weight values of the at least two cores corresponding to the matched state parameter are determined from the second correspondence.
• In the second example, the second correspondence includes state parameters and parameter weight values in the form of numerical intervals. The specific implementation of step B2 then includes: determining, in the preset second correspondence, the state parameter interval in which the state parameter of step B1 is located; then determining, in the second correspondence, the parameter weight value intervals of the at least two cores, where the parameter weight value intervals of the at least two cores correspond to the state parameter interval and the second correspondence includes the correspondence between the state parameter interval and the parameter weight value intervals of the at least two cores; and finally, for each core, determining the parameter weight value from the parameter weight value interval, where the position of the parameter weight value in the parameter weight value interval is the same as the position of the state parameter of step B1 in the state parameter interval.
• After the parameter weight values are determined, step 403 is performed; in this example, step 403 specifically includes step B3 and step B4, as follows:
  • Step B3 For each core, the core weight value is corrected by using the parameter weight value to obtain the first modified weight value.
  • the first modified weight value is used to indicate the priority of the core selected to run the convolutional neural network model.
• A core with a larger first modified weight value takes priority over a core with a smaller first modified weight value for running the convolutional neural network model.
• The specific correction manner can be set in advance. For example, the parameter weight value and the core weight value are multiplied to obtain the first modified weight value; or the core weight value is corrected using the parameter weight value according to a preset correction relationship to obtain the first modified weight value. For instance, if the core weight value is the third priority, the parameter weight value is the fifth priority, and the preset correction relationship is that the higher of the two levels becomes the first modified weight value, then the first modified weight value is the third priority.
• "Each core" in step B3 means each of the aforementioned at least two cores. That is, step B3 determines, for each of the at least two cores, the first modified weight value of that core by using the parameter weight value of the core to correct the core weight value of the core.
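Assuming the multiplicative correction mentioned above (one of the possible correction manners; all numeric values here are hypothetical), step B3 can be sketched per core as:

```python
# Core weight values from step 402 (model adaptation) and parameter
# weight values from step B2 (current terminal state) - hypothetical.
core_weight = {"CPU": 0.3, "GPU": 0.2}
param_weight = {"CPU": 0.5, "GPU": 0.9}   # e.g. the GPU is mostly idle

# Step B3: first modified weight = core weight * parameter weight.
first_modified = {c: core_weight[c] * param_weight[c] for c in core_weight}
print(max(first_modified, key=first_modified.get))  # GPU
```

Here the GPU's idle state outweighs the CPU's higher core weight value, which is exactly the effect the correction is meant to capture.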
  • Step B4 Determine a core of the running convolutional neural network model from at least two cores according to the first modified weight value of the at least two cores.
• Thus the terminal may determine, from the at least two cores, the core to run the convolutional neural network model according to the first modified weight values of the at least two cores, thereby determining a core suited to running the convolutional neural network model.
• For example, the core with the largest first modified weight value is determined from the at least two cores and is used to run the convolutional neural network model.
• Alternatively, other parameters are further used to correct the first modified weight value, and the further modified weight value is used to determine the core to run the convolutional neural network model. This is similar to performing steps B1 and B2 again, except that the other parameters are acquired and the first modified weight value is corrected.
• The parameter weight value is obtained according to the current state parameter of the terminal, and the current state parameter reflects the specific operating environment in which the terminal runs the convolutional neural network model, so the parameter weight value reflects the influence of the terminal's current environment on a core running the convolutional neural network model. The core weight value is determined according to the target model parameter representing the computational density of a convolutional neural network model, so the core weight value reflects the influence of the core's hardware characteristics on running the convolutional neural network model. The first modified weight value, obtained by correcting the core weight value with the parameter weight value, therefore takes more factors into account, and a more suitable core for running the convolutional neural network model can be determined according to it.
• There are various implementations of steps B1 and B2; two implementations are given below:
• In implementation 1, the current state parameter of the terminal in step B1 is the current core usage rate of each core. Step B2 then specifically includes: determining, for each core, a performance weight value from the preset second correspondence according to the core usage rate.
• The performance weight value of each core corresponds to the core usage rate of that core, the performance weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model under the core's current usage rate, and the second correspondence includes the correspondence between the core usage rate of each core and the performance weight value of each core.
• The core usage rate refers to the proportion of the core's resources occupied by the programs running on the terminal, indicating how busy the core is. The higher a core's usage rate, the more programs are running on the core, and vice versa.
  • the core usage rate can be a specific value, such as 10%, 2%, and the like.
• The performance weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model under the core's current usage rate.
  • the core performance weight value reflects the extent to which the core's current computing resources are available.
• A large performance weight value indicates that the core's current computing resources are highly available, so a core with a large performance weight value is preferentially scheduled to run the convolutional neural network model.
• The core usage rate and the performance weight value may also have multiple representations; for example, each may be a specific numerical value or a numerical interval. For details, refer to the detailed description of the representations of the target model parameter and the core weight value in the first correspondence.
• "Each core" in this implementation refers to each of the at least two cores; that is, for each of the at least two cores, the performance weight value of the core is determined from the preset second correspondence according to the core usage rate of the core. This is the specific form that step B2 takes in implementation 1.
• In implementation 2, the current state parameter of the terminal in step B1 is the current remaining power value of the terminal. Step B2 then specifically includes: determining, according to the remaining power value, the power consumption weight values of the at least two cores from the preset second correspondence. The power consumption weight values of the at least two cores correspond to the remaining power value, the power consumption weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model under the remaining power value, and the second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores.
• The remaining power value is the value of the power remaining on the terminal; its representation includes, but is not limited to, a percentage, ampere-hours, and the like.
• The power consumption weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model under the terminal's current remaining power value, and a core with a large power consumption weight value is preferentially used to run the convolutional neural network model.
• The setting of the power consumption weight value may also take the power consumption of each core into account. For example, as the remaining power value decreases, the power consumption weight value of a core with large power consumption decreases more than that of a core with small power consumption. In this way, the power consumption weight value better reflects how appropriate it is for each core to run the convolutional neural network model under the terminal's current remaining power value.
• The remaining power value and the power consumption weight value may also have multiple representations; for example, each may be a specific numerical value or a numerical interval. For details, refer to the detailed description of the representations of the target model parameter and the core weight value in the first correspondence.
• "Each core" in this implementation likewise refers to each of the at least two cores; that is, for each of the at least two cores, the power consumption weight value of the core is determined from the preset second correspondence according to the remaining power value. This is the specific form that step B2 takes in implementation 2.
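Implementation 2 can be sketched as a lookup from the remaining power value to per-core power consumption weight values. The table below is hypothetical (battery thresholds, cores, and weights are illustrative), built so that as the battery drains, the weight of the higher-power core falls faster, as described above:

```python
# Preset second correspondence: (inclusive lower bound of remaining
# power in percent, power consumption weight value per core).
SECOND_CORRESPONDENCE = [
    (50, {"CPU": 0.5, "GPU": 0.5}),  # ample power: no bias
    (20, {"CPU": 0.6, "GPU": 0.3}),  # low power: penalize the GPU
    (0,  {"CPU": 0.8, "GPU": 0.1}),  # critical: strongly prefer the CPU
]

def power_weights(remaining_percent):
    """Return the power consumption weight values for the first
    (highest) power band the remaining power value falls into."""
    for lower_bound, weights in SECOND_CORRESPONDENCE:
        if remaining_percent >= lower_bound:
            return weights
```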
• In a third example, the first modified weight value may be further corrected to obtain a second modified weight value, so that the core to run the convolutional neural network model is determined from the at least two cores according to the second modified weight values of the at least two cores.
• For example, the current remaining power value of the terminal may be acquired, the power consumption weight values of the at least two cores may be determined according to the remaining power value and another preset correspondence, and then, for each core, the first modified weight value is corrected using the power consumption weight value to obtain the second modified weight value. For the determination of the power consumption weight value, refer to the detailed description in implementation 2 above.
• Alternatively, the core usage rate of each core may be obtained, so that for each core a performance weight value is determined according to the core usage rate and another preset correspondence, and the performance weight value is then used to correct the first modified weight value to obtain the second modified weight value. For the determination of the performance weight value, refer to the detailed description in implementation 1 above.
  • the method of the example further includes:
  • Step C1 Obtain a current remaining power value of the terminal.
  • Step C2 Determine power consumption weight values of at least two cores from a preset third correspondence according to the remaining power value.
• The power consumption weight values of the at least two cores correspond to the remaining power value, the third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores, and the power consumption weight value is used to indicate the priority at which a core is selected to run the convolutional neural network model under the remaining power value.
• In this example, step B4 specifically includes step C3 and step C4, the details of which are as follows:
  • Step C3 For each core, the first modified weight value is corrected by using the power consumption weight value to obtain a second modified weight value.
  • the second modified weight value is used to indicate the priority of the core selected to run the convolutional neural network model
  • Step C4 Determine a core of the running convolutional neural network model from at least two cores according to the second modified weight value of the at least two cores.
• the core with the largest second modified weight value may be determined from the at least two cores, and that core is used to run the convolutional neural network model.
• alternatively, the second modified weight value may be further corrected using other parameters to obtain a further modified weight value, and the core running the convolutional neural network model is then determined using the further corrected weight value.
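• the two-stage correction described above can be sketched as follows. The multiplication used to combine the weight values is an assumption (the embodiments only say each weight value is used to "correct" the previous one), and all numeric weights are illustrative.

```python
# Hypothetical sketch of the weight-correction pipeline: the core weight
# value is corrected by the performance weight value (first modified
# weight), then by the power consumption weight value (second modified
# weight), and the core with the largest result is scheduled.
# Combining by multiplication is an assumption, not the patent's method.

def pick_core(core_weights, perf_weights, power_weights):
    second = {}
    for core, w in core_weights.items():
        first_modified = w * perf_weights[core]              # performance correction
        second[core] = first_modified * power_weights[core]  # power correction
    return max(second, key=second.get), second

core_weights = {"CPU": 0.83, "GPU": 1.0, "NPU": 0.66}   # from the first correspondence
perf_weights = {"CPU": 0.4, "GPU": 0.9, "NPU": 1.0}     # from current core usage
power_weights = {"CPU": 1.0, "GPU": 0.8, "NPU": 1.0}    # from remaining battery

best, scores = pick_core(core_weights, perf_weights, power_weights)
print(best)  # GPU (0.72 beats NPU's 0.66 and CPU's 0.33)
```

• under these illustrative values the GPU's static advantage survives both corrections, but a low battery (a smaller GPU power weight) could tip the choice to the NPU.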
• the core scheduling method in some examples of the present invention further includes determining a performance parameter. The core running the convolutional neural network algorithm may have multiple different operation modes; to control the specific operation mode of the core, the core is made to run the convolutional neural network model using the determined performance parameter.
  • the determination of the performance parameter needs to be combined with the specific use state of the core.
  • the performance parameter of the core may be determined according to the core usage rate of the core. The details are as follows:
  • the core scheduling method further includes:
  • Step D1 Obtain the current core usage rate of each core.
• the terminal first obtains the core usage rate, so that the performance parameter can be determined according to the core usage rate.
  • Each core here is each core of at least two cores described above.
  • the terminal can read the current core usage rate of the core through an application programming interface (API) provided by the operating system.
  • Step D2 For each core, the performance parameter is determined from the second correspondence according to the core usage rate.
  • the performance parameter of each core corresponds to the core usage rate of each core, and the second correspondence relationship includes the correspondence between the performance parameters of each core and the core usage rate of each core. Performance parameters are used to indicate how the core operates.
  • the terminal is preset with a second correspondence, where the second correspondence includes a correspondence between a performance parameter of each core and a core usage rate of each core. Therefore, after obtaining the core usage rate of each core, the terminal determines performance parameters from the second correspondence according to the core usage rate for each core, so that performance parameters of the at least two cores can be obtained.
  • the core usage rate in the second correspondence may be a specific value, for example, 10%, 23%, or the like, or a numerical interval, such as [10%, 35%], etc., which is not specifically limited in the embodiment of the present invention.
• the terminal matches the current core usage rate of the core against the core usage rates in the second correspondence; if the current core usage rate is the same as a core usage rate in the second correspondence, or falls within a core usage rate interval in the second correspondence, the matching is successful, and the terminal may determine, from the second correspondence, the performance parameter corresponding to the successfully matched core usage rate.
  • the foregoing operations are performed for each core to obtain performance parameters for each core.
  • Step D3 After determining the core of the running convolutional neural network model from at least two cores according to the core weight values of the at least two cores, running the convolutional neural network model on the target core using the performance parameters of the target core.
  • the target core is the core of running the convolutional neural network model.
  • the terminal obtains the performance parameters of each core. After determining the target core of the running convolutional neural network model, the terminal can use the performance parameters of the target core on the target core when running the convolutional neural network model using the target core.
• through the setting of the performance parameters, the specific operating conditions of the target core when running the convolutional neural network model are controlled, to meet the operational requirements for the core.
  • the performance parameter includes one or more of thread priority information, sleep time information, and number of threads.
  • the thread priority information is the priority information of the child thread when the core runs the convolutional neural network model;
• the sleep time information is the interval between two consecutive runs of the convolutional neural network model by the core;
• the thread number information is the number of threads used when the core runs the convolutional neural network model.
• when the target core runs the convolutional neural network model using the thread priority information, the target core schedules the sub-threads based on the priority information of the sub-threads indicated by the thread priority information.
• when the target core runs the convolutional neural network model using the sleep time information, after the target core finishes running the convolutional neural network model, the target core does not run another convolutional neural network model during the interval indicated by the sleep time information.
• when the target core runs the convolutional neural network model using the thread number information, the target core generates the number of threads indicated by the thread number information, and then uses those threads to run the convolutional neural network model.
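• as an illustration of how the thread number and sleep time parameters could govern a run, the hedged sketch below spawns the indicated number of threads and then idles for the sleep interval. Python's threading module does not expose sub-thread priorities, so that parameter is omitted; the worker computation is a stand-in, not the patent's model.

```python
import threading
import time

# Minimal sketch (not the patent's implementation): run a "model" with the
# configured number of threads, then sleep for the configured interval
# before another run may start.
def run_model(inputs, n_threads, sleep_s):
    results = [0.0] * n_threads

    def worker(i, chunk):
        results[i] = sum(x * x for x in chunk)   # stand-in for a model slice

    # split the input across the configured number of threads
    chunks = [inputs[i::n_threads] for i in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(i, c))
               for i, c in enumerate(chunks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    time.sleep(sleep_s)   # do not start another run during this interval
    return sum(results)

print(run_model([1.0, 2.0, 3.0, 4.0], n_threads=2, sleep_s=0.0))  # 30.0
```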
  • the determination of the performance parameter and the determination of the performance weight value use the current core usage rate of the core, so that in some embodiments, the determination of the performance parameter and the determination of the performance weight value can be implemented in the same step. At the same time, the current core usage of each core can also be obtained only once.
• the core usage rate is used to determine the performance parameters that control core operation, so the target core can run the convolutional neural network model using its own performance parameters, and the operation mode of the target core matches its current core usage rate.
• when the correspondence between the performance parameters of each core and the core usage rate of each core included in the second correspondence is preset in a manner that improves execution efficiency, the target core can be operated efficiently.
  • a core scheduling method comprising:
  • Step E1 Acquire the target model parameters.
  • the target model parameters are used to represent the computational density of a convolutional neural network model.
• for a specific implementation of step E1, reference may be made to the detailed description of step 401.
  • Step E2 Determine core weight values of at least two cores from the preset first correspondence according to the target model parameters.
• the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on the terminal, the first correspondence relationship includes a correspondence between the target model parameter and the core weight values of the at least two cores, and the core weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model.
• for a specific implementation of step E2, reference may be made to the detailed description of step 402.
  • the correspondence between the target model parameters in the first correspondence relationship and the core weight values of the at least two cores may be preset according to the effects of the different core cooperative running convolutional neural network models.
  • Step E3 Assign the convolutional neural network model to different cores according to the core weight values of at least two cores.
• a plurality of cores may cooperatively run the convolutional neural network model; that is, the convolutional neural network model is allocated to run on different cores. A core weight value is still determined for each core, and after the convolutional neural network model is allocated to run on different cores, the proportion of the convolutional neural network model run by each core is determined by its core weight value. For example, for a core with a large core weight value and a core with a small core weight value, the majority of the convolutional neural network model is allocated to the core with the large core weight value, and a small part of the convolutional neural network model is allocated to the core with the small core weight value, so that the core with the large core weight value plays the role of the main core.
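• a proportional split is one plausible reading of "the proportion is determined by the core weight value"; the sketch below divides a model's work units across cores in proportion to their (illustrative) core weight values, with any rounding remainder given to the last core.

```python
# Hypothetical sketch of allocating portions of a model across cooperating
# cores in proportion to their core weight values. The proportional rule
# and the example weights are assumptions for illustration.

def allocate(n_units, core_weights):
    total = sum(core_weights.values())
    alloc, assigned = {}, 0
    cores = list(core_weights)
    for i, core in enumerate(cores):
        if i == len(cores) - 1:
            alloc[core] = n_units - assigned          # remainder to the last core
        else:
            alloc[core] = int(n_units * core_weights[core] / total)
            assigned += alloc[core]
    return alloc

# the core with the large weight value receives the majority of the work
print(allocate(10, {"GPU": 1.0, "CPU": 0.25}))  # {'GPU': 8, 'CPU': 2}
```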
  • a core scheduling method comprising:
  • Step F1 Obtain a task type parameter.
  • the task type parameter is used to indicate the type of the calculation task.
  • the calculation task may be, for example, image recognition, voice recognition, image classification, etc.
• the task type parameter may be text information, such as the text "image recognition" or "speech recognition", or information such as letters or numbers, such as "001", as long as it can identify the type of the specific computing task.
  • Step F2 Determine, according to the task type parameter, a core weight value of at least two cores from a preset fourth correspondence.
• the core weight values of the at least two cores correspond to the task type parameter, the at least two cores are heterogeneous cores on the terminal, the fourth correspondence relationship includes a correspondence between the task type parameter and the core weight values of the at least two cores, and the core weight value is used to indicate the priority at which the core is selected to run the computing task.
• the correspondence between the task type parameter and the core weight values of the at least two cores is preset; after the task type parameter is acquired, the acquired task type parameter is matched against the task type parameters in the fourth correspondence. If the matching succeeds, the core weight values of the at least two cores corresponding to the successfully matched task type parameter are determined.
  • Step F3 Determine the core of the running computing task from the at least two cores according to the core weight values of the at least two cores.
• the core running the computing task can be determined from the at least two cores according to the core weight values of the at least two cores, for example, by selecting the core with the largest core weight value to run the computing task, or by determining the core running the computing task from the at least two cores after the core weight values of the at least two cores are modified according to other parameters.
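• a hypothetical fourth correspondence can be sketched as a simple lookup table; the task identifiers and weight values below are illustrative assumptions, not values from the embodiments.

```python
# Hypothetical fourth correspondence: task type parameter -> core weight
# values of the heterogeneous cores (all entries are illustrative only).
fourth_correspondence = {
    "image recognition":  {"CPU": 0.3, "GPU": 0.8, "NPU": 1.0},
    "speech recognition": {"CPU": 0.7, "GPU": 1.0, "NPU": 0.6},
    "001":                {"CPU": 1.0, "GPU": 0.4, "NPU": 0.2},  # e.g. a small task
}

def schedule(task_type):
    weights = fourth_correspondence[task_type]   # match the task type parameter
    return max(weights, key=weights.get)         # largest core weight value wins

print(schedule("image recognition"))  # NPU
```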
  • the heterogeneous core features on the terminal are different, and different cores are suitable for running convolutional neural network models with different computational densities.
• the first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, where the target model parameter is used to represent the computational density of a convolutional neural network model, and the at least two cores are heterogeneous cores on the terminal. After the target model parameter of a convolutional neural network model is acquired, the core weight values of the at least two cores may be determined from the preset first correspondence according to the target model parameter.
  • the core weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model, and the core weight value can be used to determine the core suitable for running the convolutional neural network model.
  • the core of the running convolutional neural network model can be determined from at least two cores according to the core weight values of at least two cores.
• an adapted core can thus be determined to run a convolutional neural network model with a specific computational density; if the core with the higher core weight value can run the convolutional neural network model more efficiently, the core determined according to the core weight value can run the convolutional neural network model efficiently.
• FIG. 5 is a flowchart of a core scheduling method according to an embodiment of the present invention. The method may be applied to a terminal, and the method of the embodiment shown in FIG. 5 may be implemented based on the method shown in FIG. 4. In FIG. 5, the systolic array processor is an NPU, and the target model parameter is the number of weight parameters, as an example.
  • the method of the embodiment of the present invention includes:
  • Step 501 The terminal acquires a convolutional neural network model.
• the terminal can use a specific algorithm to execute an application service; for example, the convolutional neural network model can be used to execute a specific application service. To this end, the terminal first acquires the convolutional neural network model to be used.
• a terminal may acquire a convolutional neural network model in multiple ways: the terminal acquires a convolutional neural network model sent by another device, or the terminal establishes a convolutional neural network model locally.
  • Terminals can use convolutional neural network models for a variety of application services, such as running convolutional neural network models for image and speech recognition.
• examples of running a convolutional neural network model to perform image services are image classification, image feature extraction, face clustering, and the like; the computational characteristics of these operations include a large number of matrix operations, so they are suitable for being executed using a convolutional neural network model.
• the convolutional neural network model can be established through training; for example, a large amount of relevant data is collected and used to perform convolutional neural network training to obtain a convolutional neural network model.
  • the device that performs the steps of training the convolutional neural network model may be a terminal or a device such as a server.
  • Step 502 The terminal acquires the number of weight parameters.
  • the number of weight parameters is used to represent the computational density of a convolutional neural network model
  • Convolutional neural network models are used for computations, such as image processing, where different convolutional neural network models have different characteristics. For example, different convolutional neural network models may have different computational densities.
  • the computational density can be determined by the number of weighting parameters in the convolutional neural network model, ie the number of weighting parameters of the convolutional neural network model can indicate the computational density of the convolutional neural network model.
  • the content of the weighting parameters of the convolutional neural network model reference can be made to the above description.
• a convolutional neural network model with high computational density is suitable for running on a GPU; such a model can be, for example, a large-matrix convolutional neural network model;
• a convolutional neural network model with low computational density is suitable for running on a CPU;
• such a model can be, for example, a small-matrix, serial, or for-loop convolutional neural network model.
  • Different cores have different computational characteristics, and different types of convolutional neural network models have different computational densities.
  • Which type of convolutional neural network model is suitable for which core to operate can be known from empirical data or experimental tests.
• convolutional neural network models such as the ResNet-18 residual network used for classification, feature extraction, or object detection are suitable for running on an NPU or GPU; convolutional neural network models belonging to small networks, such as non-human-face (for example, dog face/cat face) recognition and ID card image recognition, are suitable for running on the CPU.
• after the terminal acquires the convolutional neural network model, the number of weight parameters of the convolutional neural network model is obtained, so that the core running the convolutional neural network model can be scheduled according to the number of weight parameters of the convolutional neural network model.
  • the specific implementation manner of the number of weight parameters of the terminal acquiring the convolutional neural network model may be:
• the terminal analyzes the acquired convolutional neural network model using a parser, and parses out the number of weight parameters of the convolutional neural network model.
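• a parser of this kind essentially sums the parameter counts of the model's layers. The sketch below assumes a simple layer-list representation (the format and layer shapes are illustrative, not the patent's parser).

```python
# Hypothetical sketch of counting a model's weight parameters from its
# layer shapes; the layer-list format below is assumed for illustration.

def count_weight_params(layers):
    total = 0
    for layer in layers:
        if layer["type"] == "conv":
            k, c_in, c_out = layer["kernel"], layer["in"], layer["out"]
            total += k * k * c_in * c_out + c_out   # kernel weights + biases
        elif layer["type"] == "fc":
            total += layer["in"] * layer["out"] + layer["out"]
    return total

model = [
    {"type": "conv", "kernel": 3, "in": 3, "out": 64},
    {"type": "fc", "in": 64, "out": 10},
]
print(count_weight_params(model))  # 1792 + 650 = 2442
```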
  • Step 503 Determine core weight values of at least two cores from the preset first correspondence according to the number of weight parameters.
  • the core weight value of the at least two cores corresponds to the number of weight parameters, and the at least two cores are heterogeneous cores on the terminal.
  • the core weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model.
  • a plurality of heterogeneous cores may be disposed on the terminal, and the core is a computing unit for performing calculations, and the types of the heterogeneous cores include but are not limited to CPUs, GPUs, DSPs, NPUs, and the like.
  • each core corresponds to a core weight value
  • multiple cores correspond to multiple core weight values.
  • the first correspondence includes a correspondence between the number of weight parameters and the core weight values of the at least two cores. That is, the parameters of the first correspondence include the number of weight parameters and the core weight value.
  • the specific form of the two parameters may be a specific value or a numerical range.
  • the determined core weight value can be a specific value.
  • the correspondence between the number of weight parameters and the core weight value may be a correspondence between specific values, or may be a correspondence between numerical ranges.
• the core weight values may belong to a plurality of heterogeneous core types, and a plurality of core weight values may be determined, where different core weight values belong to different heterogeneous cores.
• a specific implementation of step 503 may be as follows: after the terminal obtains the number of weight parameters of the convolutional neural network model, the terminal matches the obtained number of weight parameters against the numbers of weight parameters in the first correspondence; if the obtained number of weight parameters is the same as a number of weight parameters in the first correspondence, or falls within a weight parameter number range in the first correspondence, the matching succeeds. Then, the core weight value corresponding to the matched number of weight parameters is determined from the first correspondence.
  • the specific core weight value may be calculated using the weight parameter number of the convolutional neural network model.
  • the core weight value is determined using a linear mapping method according to the number of weight parameters of the convolutional neural network model within the corresponding core weight value range.
  • the establishment of the first correspondence may be established based on experimental test data or empirical data.
• the first correspondence may be obtained from the storage device of the terminal, or may be obtained by the terminal from another device.
• a specific example is given below to illustrate step 503, as follows:
• a convolutional neural network model with fewer than 50 million weight parameters is a small network model suitable for running on the CPU; therefore, for this type of convolutional neural network model, the core weight value of the CPU can be set to 1, the core weight value of the GPU is linearly set between 0 and 1 according to the number of weight parameters of the convolutional neural network model, and the core weight value of the NPU is linearly set between 0 and 0.5 according to the number of weight parameters.
• a convolutional neural network model with 50 million to 200 million weight parameters is a medium-sized network model suitable for running on the GPU; therefore, the core weight value of the GPU is 1, the core weight value of the CPU is linearly set between 1 and 0.5 according to the number of weight parameters of the convolutional neural network model, and the core weight value of the NPU is linearly set between 0.5 and 1.0 according to the number of weight parameters.
• a convolutional neural network model with 200 million to 500 million weight parameters is a large-scale network model suitable for running on the dedicated acceleration device, the NPU; therefore, the core weight value of the NPU is 1, the core weight value of the CPU is linearly set between 0.5 and 0 according to the number of weight parameters of the convolutional neural network model, and the core weight value of the GPU is linearly set between 1.0 and 0.5 according to the number of weight parameters.
  • the method for determining the core weight value of a core using linear mapping is as follows:
• the target core weight value intervals of the at least two cores are determined, where the target core weight value intervals of the at least two cores respectively correspond to the target weight parameter number interval.
  • the at least two cores are heterogeneous cores on the terminal.
  • the core weight value is determined from the core weight value interval, and the position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval.
• the number of weight parameters of the convolutional neural network model obtained by the terminal is 100 million, and the heterogeneous cores of the terminal are a CPU, a GPU and an NPU.
  • the core weight value is calculated as follows:
• the terminal matches the number of weight parameters of the convolutional neural network model, 100 million, against the weight parameter number intervals in the correspondence of Table 1, and determines that it falls within the target weight parameter number interval "50Million~200Million" in Table 1.
• the terminal determines the core weight values of the CPU, GPU and NPU by linear mapping within each core's target core weight value interval, according to the position of the number of weight parameters of the convolutional neural network model within the target weight parameter number interval.
  • the core weight value of the CPU is obtained by linear mapping.
  • the calculated core weight value of the CPU is 0.83.
• similarly, the core weight value of the NPU is calculated to be 0.66, and the core weight value of the GPU is 1.
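• the linear mapping in this example can be reproduced as follows: with 100 million weight parameters in the 50M~200M interval, the position is (100 − 50)/(200 − 50) = 1/3, and each core's weight is read off at that position of its weight value interval (the two-decimal values quoted above appear to truncate the NPU result).

```python
# Reproducing the worked example: linearly map the number of weight
# parameters (in millions) to a core weight within each core's interval.

def linear_map(params, p_lo, p_hi, w_start, w_end):
    pos = (params - p_lo) / (p_hi - p_lo)    # position within the parameter interval
    return w_start + pos * (w_end - w_start)  # same position within the weight interval

cpu = linear_map(100, 50, 200, 1.0, 0.5)   # CPU interval runs 1 -> 0.5
npu = linear_map(100, 50, 200, 0.5, 1.0)   # NPU interval runs 0.5 -> 1.0
gpu = 1.0                                   # the matched (target) core's weight is 1
print(round(cpu, 2), round(npu, 4), gpu)   # 0.83 0.6667 1.0
```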
  • the core weight value of the core may also be directly determined from the first correspondence.
  • the number of weight parameters of the convolutional neural network model is still 100 million.
• the terminal matches the number of weight parameters of the convolutional neural network model, 100 million, against the numbers of weight parameters in the correspondence of Table 1, and determines that the number 100 million falls within the target weight parameter number interval "50Million~200Million".
  • the method for determining the core weight value from the corresponding relationship may be multiple in the embodiment of the present invention.
• if the core weight value in the first correspondence is a specific value, it may be directly determined; if the core weight value in the first correspondence is a numerical range, linear mapping needs to be performed according to the number of weight parameters of the convolutional neural network model to obtain the core weight value of the core.
• the core weight value reflects the extent to which the core hardware characteristics are suited to the current convolutional neural network model: the higher the core weight value, the more appropriate the core is for running the convolutional neural network model. Therefore, the terminal can schedule the core running the convolutional neural network model according to the core weight value; for example, in the above example, the terminal can schedule the GPU, which has the largest core weight value, to run the convolutional neural network model obtained in step 501.
• the core operating environment also affects the running of the convolutional neural network model.
• the core weight value reflects the correlation between the static characteristics of the core and the convolutional neural network model; if only the core hardware characteristics are considered when scheduling the core to run the convolutional neural network model, the obtained running effect is not necessarily the best.
• the core operating environment parameters are the dynamic characteristics of the core; combining the static and dynamic characteristics of the core makes it possible to select the core currently best suited to run the convolutional neural network model. A specific combination may be to use the dynamic parameters of the core to adjust the core weight value to obtain a new weight value, and to schedule the core according to the new weight value.
  • the core scheduling method of the embodiment of the present invention further includes the following steps.
• the core usage rate of each core and the remaining power value of the terminal are used as examples of the dynamic parameters; that is, load-balancing judgments in the performance and power consumption dimensions are added to correct the weight value and decide which core to schedule.
  • Step 504 Obtain the current core usage rate of each core.
  • the core usage rate is a dynamically changing parameter.
• the core performance state is one of the dynamic characteristics of the core: the same core has different computing capabilities in different performance states, so the core performance state affects the running of the convolutional neural network model. Taking the current performance state of the core as one of the considerations of the core scheduling strategy can therefore make the scheduled core run the current convolutional neural network model more efficiently.
• the core usage rate is an important performance state parameter of the core, so the core usage rate can be used as one of the considerations of the core scheduling strategy.
  • the core usage rate represents the core load situation.
  • a specific implementation manner in which the terminal obtains the current usage rate of the multiple heterogeneous cores may be, for example:
  • the terminal invokes the preset core usage reading program or uses the API of the core usage rate provided by the terminal system to read the core usage rate of each core on the terminal.
• the cores whose core usage rate needs to be read are the cores available to run the convolutional neural network model on the terminal, that is, the cores described in step 503 above.
• for example, the terminal reads, through the API provided by the terminal system, that the core usage rate of the GPU is 25%; that is, the GPU has 75% of its computing resources available.
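• a core-usage API of this kind typically derives the rate from two time samples of busy and idle counters. The sketch below uses a Linux /proc/stat-style field layout (user, nice, system, idle, in jiffies) as an illustrative assumption; the sample values reproduce the 25%/75% figure above.

```python
# Hypothetical sketch of deriving a core usage rate from two counter
# samples, the way a core-usage API commonly works under the hood.
# Field layout assumed to follow Linux /proc/stat: (user, nice, system, idle).

def usage_rate(sample0, sample1):
    idle_delta = sample1[3] - sample0[3]
    total_delta = sum(sample1) - sum(sample0)
    busy_delta = total_delta - idle_delta
    return busy_delta / total_delta

# two illustrative samples, 100 jiffies apart, 75 of them idle
s0 = (100, 0, 50, 300)
s1 = (110, 0, 65, 375)
print(usage_rate(s0, s1))  # 0.25 -> 75% of computing resources available
```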
  • Step 505 Determine, for each core, a performance weight value and a performance parameter from a preset second correspondence according to the core usage rate.
  • the performance weight value of each core corresponds to the core usage rate of each core, and the performance parameters of each core correspond to the core usage rate of each core.
  • the performance weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model under the current core usage of the core, and the core performance weight value reflects the current computing resource availability of the core. Different cores can have different performance weight values.
  • Running the convolutional neural network model on the core requires running with specific performance parameters of the core, including one or more of thread priority information, sleep time information, and number of threads.
  • the thread priority information is the priority information of the child thread when the core runs the convolutional neural network model;
• the sleep time information is the interval between two consecutive runs of the convolutional neural network model by the core;
• the thread number information is the number of threads used when the core runs the convolutional neural network model.
  • the performance weight value and performance parameter obtained in step 505 may also appear in the form of a list.
  • the second correspondence includes a correspondence between core usage of each core and performance weight values of each core.
  • the second correspondence also includes a correspondence between performance parameters of each core and core usage of each core.
  • the second correspondence is a correspondence between core usage rate, performance weight value, and performance parameter. That is, the parameters of the second correspondence include core usage rate, performance weight value, and performance parameter.
  • the specific form of these three parameters may be a specific value or a numerical interval.
  • the determined performance weight value can be a specific value.
  • the performance weight value may belong to one or more core types, and the determined performance weight value may also be one or more, wherein different performance weight values belong to different cores.
  • the performance parameter of the core may be one or more, which is not specifically limited in the present invention.
• the specific implementation of step 505 can be as follows:
• after the terminal obtains the current core usage rate of the core, the terminal matches the current core usage rate against the core usage rates in the second correspondence; if the current core usage rate is the same as a core usage rate in the second correspondence, or falls within a numerical range of core usage rate in the second correspondence, the matching succeeds. Then, the performance weight value corresponding to the successfully matched core usage rate is determined from the second correspondence.
• when the core usage rate and the performance weight value in the second correspondence are numerical ranges, the current core usage rate of the core can be used to calculate a specific performance weight value; that is, the performance weight value is determined, within the corresponding numerical range of performance weight values, according to the current core usage rate of the core.
  • the terminal may also determine, in the second correspondence, a performance parameter corresponding to the core usage rate that is successfully matched, and the performance parameter may be a specific value.
• the performance parameters of a core can be set in the second correspondence according to specific core usage rates, which requires system tuning and measurement. For example, when the core usage rate is low, more threads can be used to obtain higher processing performance; when the core usage rate is high, the neural network computing request is processed with fewer threads, so that the impact on the already high core usage rate is as small as possible.
• the second correspondence may be established in advance based on experimental test data or empirical data.
  • the obtaining of the second correspondence by the terminal may be obtained from the memory of the terminal, or may be obtained by the terminal from other devices.
• A specific example is given below to illustrate step 505, as follows:
• the performance weight value is divided into three intervals, that is, three grades, and the core usage rate and performance parameters are likewise divided into three grades, where each grade of core usage rate is a numerical interval and each grade of performance parameter is a specific value.
• core usage rates that are too high are not considered, to prevent the core load from becoming too heavy and affecting the operation of the terminal.
• the performance weight value is determined from the performance weight value interval such that its position in that interval is the same as the position of the core's current core usage rate in the target core usage interval.
  • the performance weight value is calculated as follows:
  • the current core usage rate of the GPU is 25%.
  • the method for determining performance parameters is as follows:
  • the performance parameter corresponding to the target core usage interval is determined.
  • the current core usage rate of the GPU is 25%.
  • the performance parameters are set as follows:
  • the performance parameters corresponding to the target core usage interval of 2% to 30% are determined as follows: the thread priority information is 0, the sleep time information is 400 ms, and the thread number information is 2.
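• The grade table and linear mapping described above can be sketched as follows. All interval boundaries, weight ranges, and performance-parameter values here are illustrative assumptions, not values fixed by this embodiment:

```python
# Hypothetical second correspondence for one core (e.g. the GPU). Each row:
# (usage_low, usage_high, weight_low, weight_high, performance_parameters).
SECOND_CORRESPONDENCE = [
    (0.02, 0.30, 0.6, 1.0, {"thread_priority": 0, "sleep_ms": 400, "threads": 2}),
    (0.30, 0.60, 0.3, 0.6, {"thread_priority": 0, "sleep_ms": 800, "threads": 1}),
    (0.60, 1.00, 0.0, 0.3, {"thread_priority": -1, "sleep_ms": 1200, "threads": 1}),
]

def lookup_performance(usage, table):
    """Match the current core usage rate to a grade, then take the performance
    weight value at the same relative position inside the weight interval."""
    for lo, hi, w_lo, w_hi, params in table:
        if lo <= usage <= hi:
            pos = (usage - lo) / (hi - lo)       # position in the usage interval
            weight = w_lo + pos * (w_hi - w_lo)  # same position in the weight interval
            return weight, params
    return None, None                            # usage outside all grades
```

For a current GPU usage rate of 25%, this returns the performance parameters of the 2% to 30% grade (thread priority 0, sleep time 400 ms, 2 threads) together with a specific performance weight value from that grade's interval.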
• the correspondence in step 505 may include the performance weight value but not the performance parameter, or include the performance parameter but not the performance weight value.
• the core usage rate in step 505 is one of the state parameters of the terminal; the state parameters of the terminal may further include the remaining power value of the terminal, the temperature of the core, and the like.
• the second correspondence may also be a correspondence between other state parameters and the parameter weight values of the at least two cores, the parameter weight values being used to indicate the priority with which a core is selected to run the convolutional neural network model under the specific state parameters.
  • Step 506 For each core, the core weight value is corrected by using the performance weight value to obtain the first modified weight value.
• the first modified weight value is used to indicate the priority with which the core is selected to run the convolutional neural network model.
• a core with a larger first modified weight value is better suited to run the convolutional neural network model than a core with a smaller first modified weight value.
• when the terminal runs the convolutional neural network model on a specific core, the physical characteristics of the core must match the characteristics of the convolutional neural network model, and the current performance state of the core must also be suitable for running the model. To achieve this, both the core's performance parameters and its hardware characteristics need to be considered.
• the specific implementation is to correct the core weight values of the multiple heterogeneous cores by using their performance weight values, so that the resulting first modified weight value combines the core weight value and the performance weight value. The first modified weight value of a core thus reflects the extent to which the core's hardware characteristics and current core usage rate make it suitable for running the convolutional neural network model. Compared with the core weight value, which reflects only the static characteristics of the core, the first modified weight value better reflects the core's fit for the convolutional neural network model, so scheduling based on the first modified weight value selects a more efficient core.
• For example, the GPU has a core weight value of 1 and a performance weight value of 0.7. The core weight value and the performance weight value are multiplied to obtain a first modified weight value of 0.7 for the GPU.
  • Step 507 Acquire a remaining power value of the terminal.
  • the core of the terminal runs the convolutional neural network model, it will generate power consumption. Because the power consumption of different cores is different, running the same convolutional neural network model on different cores will generate different power consumption. In order not to affect the continuous use of the terminal by the user, it is necessary to regard the remaining power of the terminal as one of the consideration factors of the scheduling core. This is especially important on terminals with small electrical energy storage.
  • the terminal needs to obtain the remaining power value on the terminal, and the remaining power value is used to indicate how much power the terminal currently has.
  • the specific manner in which the terminal obtains the remaining power value may be, for example, the terminal uses the power detection program on the terminal to detect the current remaining power of the terminal, and obtains the remaining power value.
  • Step 508 Determine, according to the remaining power value, the power consumption weight values of the at least two cores from the preset third correspondence.
• the power consumption weight values of the at least two cores correspond to the remaining power value.
  • the power consumption weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model under the remaining power value.
• a core with a larger power consumption weight value is better suited to run the convolutional neural network model than a core with a smaller power consumption weight value.
• the core scheduled to run the convolutional neural network model may be one with lower power consumption, but its computing power may not be superior to that of the other cores; that is, different parameters of the terminal need to be weighed together.
• the power consumption weight value of the core is determined by comprehensively considering the remaining power of the terminal and the power consumption of the core. Specifically, it is determined by using the third correspondence, where the parameters of the third correspondence include the remaining power value and the power consumption weight value, and in the third correspondence the power consumption weight values are set with the power consumption of each core taken into account.
  • the third correspondence includes a correspondence between the remaining power value and the power consumption weight value of the at least two cores. That is, the parameters of the third correspondence include the remaining power value and the power consumption weight value.
  • the specific form of these two parameters may be a specific value or a range of values.
  • the determined power weight value can be a specific value.
  • the power consumption weight value may belong to one or more core types, and the determined power consumption weight value may also be one or more, wherein different power consumption weight values belong to different cores.
• when the remaining power value and the power consumption weight value are both numerical intervals in the third correspondence, the terminal's current remaining power value can be used to calculate a specific power consumption weight value: the value is determined within the corresponding power consumption weight value interval in a linear-mapping manner, according to the position of the current remaining power value in its interval.
• A specific example is given below to illustrate step 508, as follows:
  • the power consumption weight value is divided into two ranges, that is, divided into two grades, and the remaining power value range is also divided into two grades.
• the power consumption weight values corresponding to the low remaining-power range are set to the lowest; that is, as long as the remaining power is greater than 8%, the power consumption weight values can be set to the higher grade.
• the power consumption weight value is calculated as follows: it is determined from the target power consumption weight value interval such that its position in that interval is the same as the position of the terminal's remaining power value in the target remaining power value interval. For example:
• the target power consumption weight value intervals corresponding to the target remaining power value interval are determined: 0.8 to 1.0 for the CPU, 0 to 0.8 for the GPU, and 0.8 to 1.0 for the NPU;
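• A sketch of this lookup, using the per-core interval figures from the example; the 8% grade boundary and the low-grade intervals are assumptions added for illustration:

```python
# Hypothetical third correspondence: two remaining-power grades, each mapping
# every core to a power consumption weight value interval.
THIRD_CORRESPONDENCE = [
    # (power_low, power_high, {core: (weight_low, weight_high)})
    (0.08, 1.00, {"CPU": (0.8, 1.0), "GPU": (0.0, 0.8), "NPU": (0.8, 1.0)}),
    (0.00, 0.08, {"CPU": (0.2, 0.4), "GPU": (0.0, 0.1), "NPU": (0.1, 0.3)}),
]

def power_consumption_weights(remaining, table):
    """Pick the grade containing the remaining power value, then map its
    relative position onto each core's power consumption weight interval."""
    for lo, hi, intervals in table:
        if lo <= remaining <= hi:
            pos = (remaining - lo) / (hi - lo)  # position in the power interval
            return {core: w_lo + pos * (w_hi - w_lo)
                    for core, (w_lo, w_hi) in intervals.items()}
    return {}
```

With 54% power remaining, the position in the 8% to 100% interval is 0.5, which gives the GPU a power consumption weight value of 0.4, matching the GPU figure used later in this example.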
  • Step 509 For each core, the first modified weight value is corrected by using the power consumption weight value to obtain a second modified weight value.
• the second modified weight value is used to indicate the priority with which the core is selected to run the convolutional neural network model; a core with a larger second modified weight value is better suited to run the convolutional neural network model than a core with a smaller one.
• correcting the first modified weight value with the power consumption weight value means correcting the core's first modified weight value according to the remaining power value of the terminal and the power consumption of the core.
  • the second correction weight value may appear in the form of a list.
• the first modified weight value of a core reflects the degree to which the core's hardware computing characteristics and current core usage rate suit it to running the convolutional neural network model; correcting it with the core's power consumption weight value yields a second modified weight value that combines still more parameters for generating the core scheduling strategy. The parameters behind a core's second modified weight value may include the core's hardware computing characteristics, the computational density of the convolutional neural network model, the core's usage rate, the remaining power of the terminal, and the core's power consumption. The second modified weight value can therefore better reflect how suitable the core is for running the convolutional neural network model, and the core that can run the model efficiently can be determined more accurately from the second modified weight values of the different cores.
• For example, the power consumption weight value of the GPU is 0.4 and the first modified weight value of the GPU is 0.7. The terminal multiplies the first modified weight value by the power consumption weight value to obtain a second modified weight value of 0.28 for the GPU.
• in practice, the core weight value may be corrected first with the performance weight value, or first with the power consumption weight value, or with both the performance weight value and the power consumption weight value at the same time.
  • the embodiment of the present invention does not specifically limit this.
  • Step 510 Determine, from the at least two cores, a target core with the second modified weight value being the largest.
• The target core is the core used to run the convolutional neural network model.
• the second modified weight values can be used for core scheduling: the second modified weight values of the cores are compared, and the target core with the largest second modified weight value is selected. Because a core's second modified weight value reflects how well the core suits the convolutional neural network model, the model is appropriately run on the target core.
• For example, the terminal selects the GPU, which has the largest second modified weight value, to run the convolutional neural network model acquired in step 501 and perform the specific application service.
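• The two corrections and the final selection can be sketched together. The GPU figures (core weight 1, performance weight 0.7, power consumption weight 0.4, giving 0.28) come from the examples above; the CPU and NPU inputs are assumptions added for illustration:

```python
def second_modified_weight(core_w, perf_w, power_w):
    # Core weight corrected by the performance weight, then by the power weight.
    return core_w * perf_w * power_w

second_weights = {
    "CPU": second_modified_weight(0.2, 0.9, 0.9),  # assumed inputs
    "GPU": second_modified_weight(1.0, 0.7, 0.4),  # figures from the example
    "NPU": second_modified_weight(0.3, 0.8, 0.9),  # assumed inputs
}

# Step 510: the target core is the one with the largest second modified weight.
target_core = max(second_weights, key=second_weights.get)
```

With these inputs the GPU's 0.28 is the largest second modified weight value, so the GPU is scheduled, as in the example.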
  • Step 511 Run a convolutional neural network model on the target core using the performance parameters of the target core.
  • Running the convolutional neural network model on the target core involves a specific way of running, such as how to use the core thread to run the convolutional neural network model.
  • the performance parameter of the core is determined according to the current core usage rate of the core, that is, the performance parameter of the target core has been determined, and the terminal can use the performance parameter to run the convolutional neural network model on the target core.
• each item of the performance parameters is used in running the convolutional neural network model. For example, the number of concurrent threads on the target core is controlled according to the thread number information; according to the sleep time information, after a network computing request is executed the core sleeps for the indicated interval, during which it does not run the next convolutional neural network model; and the priority of the sub-threads on the target core is controlled according to the thread priority information.
• after the target core runs the convolutional neural network model, the sleep API of the system is called and the core sleeps for a period of time, during which no other convolutional neural network model is run; after that period, the next new convolutional neural network model is processed. By setting the sleep time information, the sleep duration of the target core keeps the core usage rate at a reasonable level.
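• A minimal sketch of such a runner, assuming the performance parameters carry a thread count and a sleep time in milliseconds. The function and parameter names are illustrative, and thread priority is omitted because plain Python threads do not expose it portably:

```python
import queue
import threading
import time

def serve_requests(requests, run_model, threads=2, sleep_ms=400):
    """Process computing requests with at most `threads` concurrent workers;
    after finishing a request, a worker sleeps for `sleep_ms` so the target
    core's usage rate stays at a reasonable level."""
    pending = queue.Queue()
    for req in requests:
        pending.put(req)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                req = pending.get_nowait()
            except queue.Empty:
                return                        # no more requests for this worker
            out = run_model(req)
            with lock:
                results.append(out)
            time.sleep(sleep_ms / 1000.0)     # idle interval from the sleep time info

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return results
```

The thread number information bounds concurrency, and the sleep after each request is the interval during which the core does not take the next model.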
• In the method of the embodiment of the present invention, after the weight parameters of the convolutional neural network are obtained, the number of weight parameters represents the computational density of the convolutional neural network model. According to the preset first correspondence, the core weight values of the multiple cores are determined from the number of weight parameters; a core weight value indicates the priority with which a core is selected to run the convolutional neural network model. The core weight values are then corrected with the dynamic parameters of the terminal: a performance weight value is determined from the second correspondence according to each core's current core usage rate, indicating the priority with which the core is selected under that usage rate, and a power consumption weight value is determined from the third correspondence according to the terminal's remaining power value, indicating the priority with which the core is selected under that remaining power value. The performance weight value and the power consumption weight value are used to correct the core weight value, yielding a second modified weight value. Among the multiple cores, the target core with the largest second modified weight value is the core best suited to running the convolutional neural network model, so scheduling the target core to run the model improves operating efficiency and reduces power consumption.
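• Putting steps 501 through 510 together, the whole scheduling decision can be sketched compactly. Every numeric table here is an assumed placeholder; real values would come from the pre-established correspondences:

```python
# Hypothetical first correspondence: ranges of weight-parameter counts
# (computational density) mapped to per-core core weight values.
FIRST_CORRESPONDENCE = [
    (0,         1_000_000, {"CPU": 1.0, "GPU": 0.6, "NPU": 0.4}),
    (1_000_000, 10 ** 12,  {"CPU": 0.3, "GPU": 0.8, "NPU": 1.0}),
]

def schedule_core(num_weight_params, performance_weights, power_weights):
    """Look up core weights by model density, correct them with the performance
    and power consumption weights, and pick the core with the largest result."""
    for lo, hi, core_weights in FIRST_CORRESPONDENCE:
        if lo <= num_weight_params < hi:
            second = {c: core_weights[c] * performance_weights[c] * power_weights[c]
                      for c in core_weights}
            return max(second, key=second.get)
    raise ValueError("model size outside the first correspondence")
```

Under these placeholder tables, a small model tends toward the CPU row while a dense model tends toward the GPU or NPU, with the dynamic weights shifting the final choice.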
  • FIG. 6 is a schematic structural diagram of hardware of a terminal according to an embodiment of the present invention. As shown in FIG. 6, for the convenience of description, only the parts related to the embodiment of the present invention are shown. For the specific technical details not disclosed, please refer to the method part of the embodiment of the present invention.
• the terminal may be any terminal device, including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, or an in-vehicle computer. The following takes a mobile phone as an example:
  • FIG. 6 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present invention.
• the mobile phone includes: a radio frequency (RF) circuit 610, a memory 620, an input unit 630, a display unit 640, a sensor 650, an audio circuit 660, a wireless fidelity (WiFi) module 670, a central processing unit 680, a power supply 690, and other components.
  • the mobile phone may further include a graphics processor 681, a digital signal processor 682, a pulse array processor 683, etc.
• the pulse array processor may specifically be a neural network processor, a tensor processor, an intelligent processor, or the like.
  • the structure of the handset shown in FIG. 6 does not constitute a limitation to the handset, and may include more or less components than those illustrated, or some components may be combined, or different components may be arranged.
• the RF circuit 610 can be used for receiving and transmitting signals in the course of sending and receiving information or during a call. Specifically, after downlink information of the base station is received, it is handed to the central processing unit 680 for processing; in addition, uplink data is sent to the base station.
  • RF circuit 610 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • RF circuitry 610 can also communicate with the network and other devices via wireless communication.
  • the above wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (Code Division). Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), E-mail, Short Messaging Service (SMS), and the like.
  • the memory 620 can be used to store software programs and modules, and the central processing unit 680 executes various functional applications and data processing of the mobile phone by running software programs and modules stored in the memory 620.
  • the memory 620 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may be stored according to Data created by the use of the mobile phone (such as audio data, phone book, etc.).
• memory 620 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the input unit 630 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function controls of the handset.
• specifically, the input unit 630 may include a touch panel 631 and other input devices 632.
• the touch panel 631, also referred to as a touch screen, can collect touch operations by the user on or near it (such as operations performed with a finger, a stylus, or another suitable object on or near the touch panel 631), and drive the corresponding connecting device according to a preset program.
  • the touch panel 631 can include two parts: a touch detection device and a touch controller.
• the touch detection device detects the user's touch orientation, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the central processing unit 680, and it can also receive and execute commands sent by the central processing unit 680.
  • the touch panel 631 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 630 may also include other input devices 632.
  • other input devices 632 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 640 can be used to display information input by the user or information provided to the user as well as various menus of the mobile phone.
  • the display unit 640 can include a display panel 641.
  • the display panel 641 can be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the touch panel 631 can cover the display panel 641. When the touch panel 631 detects a touch operation thereon or nearby, the touch panel 631 transmits to the central processing unit 680 to determine the type of the touch event, and then the central processing unit 680 according to the touch. The type of event provides a corresponding visual output on display panel 641.
• Although in FIG. 6 the touch panel 631 and the display panel 641 are two independent components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 631 may be integrated with the display panel 641 to implement the input and output functions of the phone.
  • the handset can also include at least one type of sensor 650, such as a light sensor, motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 641 according to the brightness of the ambient light, and the proximity sensor may close the display panel 641 and/or when the mobile phone moves to the ear. Or backlight.
• As one kind of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes), and can detect the magnitude and direction of gravity when stationary. It can be used for applications that identify the attitude of the mobile phone (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer or tapping). The mobile phone may also be configured with other sensors such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, which are not described here.
  • Audio circuit 660, speaker 661, and microphone 662 provide an audio interface between the user and the handset.
• On one hand, the audio circuit 660 can transmit the electrical signal converted from received audio data to the speaker 661, which converts it into a sound signal for output; on the other hand, the microphone 662 converts a collected sound signal into an electrical signal, which the audio circuit 660 receives and converts into audio data. The audio data is then output to the central processing unit 680 for processing and sent to another mobile phone via the RF circuit 610, or output to the memory 620 for further processing.
• WiFi is a short-range wireless transmission technology. Through the WiFi module 670, the mobile phone can help users send and receive e-mail, browse web pages, and access streaming media, providing wireless broadband Internet access.
• Although FIG. 6 shows the WiFi module 670, it can be understood that it is not an essential part of the mobile phone and can be omitted as needed without changing the essence of the invention.
• the central processing unit 680 is the control center of the handset. It connects the various parts of the entire handset using various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing software programs and/or modules stored in the memory 620 and invoking data stored in the memory 620, thereby monitoring the mobile phone as a whole.
  • the central processing unit 680 can include one or more processing units; preferably, the central processing unit 680 can integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, and an application. Programs, etc., the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the central processing unit 680.
  • the handset also includes a power source 690 (such as a battery) that supplies power to the various components.
  • the power source can be logically coupled to the central processing unit 680 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • the mobile phone may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
• In the embodiment of the present invention, the central processing unit 680 included in the terminal may be configured to: acquire a target model parameter, where the target model parameter is used to represent the computational density of a convolutional neural network model; determine, according to the target model parameter, the core weight values of at least two cores from the preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on the terminal, the first correspondence includes the correspondence between target model parameters and the core weight values of the at least two cores, and a core weight value is used to indicate the priority with which a core is selected to run the convolutional neural network model; and determine, from the at least two cores based on their core weight values, the core to run the convolutional neural network model.
  • the central processing unit 680 is further configured to: obtain a current state parameter of the terminal, where the state parameter is a dynamically changing parameter; and determine, according to the state parameter, the parameter weight of the at least two cores from the preset second correspondence. a value, at least two core parameter weight values corresponding to the state parameter, the second correspondence relationship comprising a correspondence between the state parameter and at least two core parameter weight values, the parameter weight value being used to indicate that under the state parameter, the core is selected The priority of running the convolutional neural network model; for each core, the core weight value is corrected using the parameter weight value to obtain a first modified weight value, and the first modified weight value is used to indicate that the core is selected to run the convolutional neural network model Priority; determining the core of the running convolutional neural network model from at least two cores based on the first modified weight values of the at least two cores.
• Optionally, when the current state parameter of the terminal is the current core usage rate of each core, the central processing unit 680 is further configured to: for each core, determine a performance weight value from the preset second correspondence according to the core usage rate, where the performance weight value of each core corresponds to its core usage rate, the performance weight value indicates the priority with which the core is selected to run the convolutional neural network model under its current core usage rate, and the second correspondence includes the correspondence between the core usage rate of each core and the performance weight value of each core.
• Optionally, the central processing unit 680 is further configured to: determine, according to the remaining power value, the power consumption weight values of the at least two cores from the preset third correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores, and a power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model under the remaining power value; for each core, correct the first modified weight value with the power consumption weight value to obtain a second modified weight value, which indicates the priority with which the core is selected to run the convolutional neural network model; and determine, from the at least two cores based on their second modified weight values, the core to run the convolutional neural network model.
• Optionally, the central processing unit 680 is further configured to: obtain the current core usage rate of each core; for each core, determine performance parameters from the second correspondence according to the core usage rate, where the performance parameters of each core correspond to its core usage rate and the second correspondence includes the correspondence between the performance parameters of each core and the core usage rate of each core; and, after the core to run the convolutional neural network model is determined from the at least two cores according to their core weight values, run the convolutional neural network model on the target core using the performance parameters of the target core, where the target core is the core that runs the convolutional neural network model.
• Optionally, when the current state parameter of the terminal is the current remaining power value of the terminal, the central processing unit 680 is further configured to: determine, according to the remaining power value, the power consumption weight values of the at least two cores from the preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, a power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model under the remaining power value, and the second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores.
• the central processing unit 680 is further configured to: determine, in the preset first correspondence, the target model parameter interval in which the target model parameter is located; determine, in the first correspondence, the core weight value intervals of the at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes the correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter; and, for each core, determine the core weight value from the core weight value interval, where the position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval.
  • the central processing unit 680 is further configured to perform the steps 401 to 403 described above.
  • the central processing unit 680 is further configured to perform steps 501 to 511 described above.
• the central processing unit 680 acquires a target model parameter, where the target model parameter is used to represent the computational density of a convolutional neural network model. The central processing unit 680 then determines the core weight values of at least two cores from the preset first correspondence according to the target model parameter, where the core weight values of the at least two cores correspond to the target model parameter, and the at least two cores are heterogeneous cores on the terminal. The first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and the core weight value is used to indicate the priority of a core being selected to run the convolutional neural network model.
• the central processing unit 680 then determines the core for running the convolutional neural network model from the at least two cores based on their core weight values.
• the heterogeneous cores on the terminal have different hardware characteristics, and different cores are suitable for running convolutional neural network models with different computational densities.
  • the first correspondence relationship includes a correspondence relationship between the target model parameter and the core weight value of at least two cores, wherein the target model parameter is used to represent a calculation density of a convolutional neural network model
• the at least two cores are heterogeneous cores on the terminal, and after the target model parameter of a convolutional neural network model is obtained, the core weight values of the at least two cores may be determined from the preset first correspondence according to the target model parameter.
  • the core weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model, and the core weight value can be used to determine the core suitable for running the convolutional neural network model.
  • the core of the running convolutional neural network model can be determined from at least two cores according to the core weight values of at least two cores.
• through the core weight values of different cores, an adapted core can be determined to run a convolutional neural network model with a specific computational density; if cores with higher core weight values can run the convolutional neural network model more efficiently, the core identified from the core weight values can run the convolutional neural network model efficiently.
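As an illustration of this lookup, the following Python sketch (not part of the patent; the intervals, weight values, and core names are invented for the example) shows a first correspondence mapping a model's weight-parameter count to per-core weight values, and selecting the core with the highest value:

```python
# Hypothetical sketch of the first correspondence and core selection.
# Intervals, weight values, and the choice of cores are illustrative only.

# First correspondence: weight-parameter-count interval -> core weight values.
FIRST_CORRESPONDENCE = [
    ((0, 1_000_000), {"CPU": 0.9, "GPU": 0.4, "DSP": 0.7}),
    ((1_000_000, 50_000_000), {"CPU": 0.3, "GPU": 0.9, "DSP": 0.6}),
]

def select_core(target_model_param):
    """Pick the core with the highest core weight value for this model."""
    for (low, high), core_weights in FIRST_CORRESPONDENCE:
        if low <= target_model_param < high:
            # Direct use of the core weight values: take the maximum.
            return max(core_weights, key=core_weights.get)
    raise ValueError("no interval covers the target model parameter")

print(select_core(500_000))     # a low-density model lands on the CPU here
print(select_core(10_000_000))  # a high-density model lands on the GPU here
```

In practice the table would be populated from offline profiling of each heterogeneous core; here the small model maps to the CPU and the large model to the GPU purely by construction of the example.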
  • FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
• the terminal may be integrated in the terminal shown in FIG. 6.
• the terminal shown in FIG. 7 may be used to execute the steps executed by the terminal in FIG. 4 or FIG. 5.
  • a terminal includes:
  • An obtaining unit 701 configured to acquire a target model parameter, where the target model parameter is used to represent a computing density of a convolutional neural network model;
  • the weight value determining unit 702 is configured to determine, according to the target model parameter, a core weight value of at least two cores from the preset first correspondence relationship, where the core weight values of the at least two cores correspond to the target model parameters, and at least two cores For the heterogeneous core on the terminal, the first correspondence includes a correspondence between the target model parameter and the core weight values of at least two cores, and the core weight value is used to indicate the priority of the core selected to run the convolutional neural network model;
  • the core determining unit 703 is configured to determine, according to the core weight values of the at least two cores, a core of the running convolutional neural network model from the at least two cores.
  • the obtaining unit 701 is further configured to acquire a current state parameter of the terminal, where the state parameter is a dynamically changing parameter;
  • the weight value determining unit 702 is further configured to determine, according to the state parameter, a parameter weight value of at least two cores from the preset second correspondence relationship, where the parameter weight values of the at least two cores correspond to the state parameters, and the second correspondence relationship includes The correspondence between the state parameter and at least two core parameter weight values, the parameter weight value is used to indicate the priority of the core selected to run the convolutional neural network model under the state parameter;
  • Core determining unit 703, comprising a correction module 704 and a core determining module 705;
  • the correction module 704 is configured to, for each core, use a parameter weight value to correct the core weight value to obtain a first modified weight value, where the first modified weight value is used to indicate a priority of the core selected to run the convolutional neural network model;
  • the core determining module 705 is configured to determine, according to the first modified weight value of the at least two cores, a core of the running convolutional neural network model from the at least two cores.
  • the current state parameter of the terminal is the current core usage rate of each core
• the weight value determining unit 702 is further configured to determine, for each core, a performance weight value from the preset second correspondence according to the core usage rate, where the performance weight value of each core corresponds to the core usage rate of that core, the performance weight value is used to indicate the priority of the core being selected to run the convolutional neural network model under the core's current usage rate, and the second correspondence includes the correspondence between the core usage rate of each core and the performance weight value of each core.
  • the obtaining unit 701 is further configured to acquire a current remaining power value of the terminal;
• the weight value determining unit 702 is further configured to determine, according to the remaining power value, the power consumption weight values of at least two cores from the preset third correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value;
• the third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores, and the power consumption weight value is used to indicate the priority of the core being selected to run the convolutional neural network model under the remaining power value;
• the correction module 704 is further configured to: for each core, correct the first modified weight value by using the power consumption weight value to obtain a second modified weight value, where the second modified weight value is used to indicate the priority of the core being selected to run the convolutional neural network model;
  • the core determining module 705 is further configured to determine, according to the second modified weight value of the at least two cores, a core of the running convolutional neural network model from the at least two cores.
  • the terminal further includes a parameter determining unit 706 and an operating unit 707;
  • the obtaining unit 701 is further configured to obtain a current core usage rate of each core.
• the parameter determining unit 706 is configured to determine, for each core, performance parameters from the second correspondence according to the core usage rate, where the performance parameters of each core correspond to the core usage rate of that core, and the second correspondence includes the correspondence between the performance parameters of each core and the core usage rate of each core;
• the running unit 707 is configured to: after the core determining unit determines the core for running the convolutional neural network model from the at least two cores according to the core weight values of the at least two cores, run the convolutional neural network model on the target core using the performance parameters of the target core, the target core being the core determined for running the convolutional neural network model.
  • the performance parameter includes one or more of thread priority information, sleep time information, and number of threads;
• the thread priority information is the priority information of the sub-threads used when the core runs the convolutional neural network model;
• the sleep time information is the interval between two runs of the convolutional neural network model on the core;
• the thread number information is the number of threads used when the core runs the convolutional neural network model.
  • the current state parameter of the terminal is the current remaining power value of the terminal
• the weight value determining unit 702 is further configured to determine, according to the remaining power value, the power consumption weight values of the at least two cores from the preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value;
• the power consumption weight value is used to indicate the priority of the core being selected to run the convolutional neural network model under the remaining power value, and the second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores.
  • the target model parameter is the number of weight parameters of the convolutional neural network model.
• the at least two cores include at least two of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a systolic array processor.
• the weight value determining unit 702 is further configured to: determine, in the preset first correspondence, the target model parameter interval in which the target model parameter is located; determine, in the first correspondence, the core weight value intervals of the at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes the correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter; and, for each core, determine the core weight value from the core weight value interval, where the position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval.
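The interval-based determination described above is linear interpolation: the core weight value sits at the same relative position inside its weight-value interval as the target model parameter does inside its parameter interval. A minimal sketch, with illustrative numbers:

```python
# Hypothetical sketch of deriving a core weight value from its interval by
# linear interpolation; all numbers are illustrative, not from the patent.

def interpolate_core_weight(param, param_interval, weight_interval):
    p_low, p_high = param_interval
    w_low, w_high = weight_interval
    # Relative position of the target model parameter in its interval (0..1).
    pos = (param - p_low) / (p_high - p_low)
    # The core weight value takes the same relative position in its interval.
    return w_low + pos * (w_high - w_low)

# A model with 3M weight parameters, halfway through the (2M, 4M) interval,
# gets a core weight value halfway through the (0.4, 0.8) interval, i.e. 0.6.
print(round(interpolate_core_weight(3_000_000, (2_000_000, 4_000_000), (0.4, 0.8)), 6))
```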
• the obtaining unit 701 acquires a target model parameter, where the target model parameter is used to represent the computational density of a convolutional neural network model. The weight value determining unit 702 then determines the core weight values of at least two cores from the preset first correspondence according to the target model parameter, where the core weight values of the at least two cores correspond to the target model parameter, and the at least two cores are heterogeneous cores on the terminal. The first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and the core weight value is used to indicate the priority of a core being selected to run the convolutional neural network model.
• the core determining unit 703 then determines the core for running the convolutional neural network model from the at least two cores based on their core weight values.
• the heterogeneous cores on the terminal have different hardware characteristics, and different cores are suitable for running convolutional neural network models with different computational densities.
• the first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, where the target model parameter is used to represent the computational density of a convolutional neural network model.
• the at least two cores are heterogeneous cores on the terminal, and after the target model parameter of a convolutional neural network model is obtained, the core weight values of the at least two cores may be determined from the preset first correspondence according to the target model parameter.
  • the core weight value is used to indicate the priority at which the core is selected to run the convolutional neural network model, and the core weight value can be used to determine the core suitable for running the convolutional neural network model.
  • the core of the running convolutional neural network model can be determined from at least two cores according to the core weight values of at least two cores.
• through the core weight values of different cores, an adapted core can be determined to run a convolutional neural network model with a specific computational density; if cores with higher core weight values can run the convolutional neural network model more efficiently, the core identified from the core weight values can run the convolutional neural network model efficiently.
• the embodiment of the invention further provides a chip device, the chip device comprising a processing unit configured to perform the methods shown in FIG. 4 and FIG. 5 above.
  • the embodiment of the invention further provides a chip device, which comprises a processor and a memory.
  • the memory includes instructions that are executed by the processor for performing the methods illustrated in Figures 4 and 5 above.
• the chip device may be a chip in the terminal, and the chip includes a processing unit and a communication unit; the processing unit may be, for example, a processor, and the processor may be the central processing unit 680 described above.
  • the communication unit may be, for example, an input/output interface, a pin or a circuit, etc., and the communication unit includes a system bus.
• the chip further includes a storage unit, where the storage unit may be a memory inside the chip, such as a register, a cache, a random access memory (RAM), an EEPROM, or a FLASH;
• the storage unit may also be a memory located outside the chip, which may be any of the types of memory 620 described above.
  • the processor is coupled to a memory that can execute instructions stored in the memory to cause the chip device to perform the methods illustrated in Figures 4 and 5 above.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
• the computer instructions can be stored in a computer readable storage medium or transferred from one computer readable storage medium to another computer readable storage medium; for example, the computer instructions can be transferred from a website site, computer, server, or data center to another website site, computer, server, or data center by wire (eg, coaxial cable, fiber optic, or digital subscriber line (DSL)) or wirelessly (eg, infrared, radio, or microwave).
• the computer readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the usable medium may be a magnetic medium (eg, a floppy disk, a hard disk, a magnetic tape), an optical medium (eg, a DVD), or a semiconductor medium (such as a solid state disk (SSD)).

Abstract

Provided are a core scheduling method and a related device. The method comprises: acquiring a target model parameter, wherein the target model parameter is used for indicating the computation density of a convolutional neural network model; determining core weight values of at least two cores from a pre-set first correlation according to the target model parameter, wherein the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on a terminal, the first correlation comprises a correlation between the target model parameter and the core weight values of the at least two cores, and the core weight value is used for indicating the priority level of a core being selected to run the convolutional neural network model; and determining a core for running the convolutional neural network model from the at least two cores according to the core weight values of the at least two cores. By means of the core weight values of different cores, an adapted core can be determined to run a convolutional neural network model with a specific computation density.

Description

Core scheduling method and terminal

Technical field
Embodiments of the present invention relate to the field of chip system technologies, and in particular, to a core scheduling method and a terminal.
Background technique
A convolutional neural network (CNN) is a feedforward neural network whose artificial neurons can respond to surrounding units within part of their coverage, and which performs excellently on large-scale image processing. Convolutional neural networks have more and more applications on terminals, such as image classification, feature extraction, and face clustering.
In order to improve the computing power of the terminal, the system chip on the terminal often includes multiple heterogeneous cores, so that different services are executed on different cores. Existing solutions have no effective core-scheduling mechanism for service processing that runs a convolutional neural network: a core is simply determined from the chip system on the terminal, and the acquired convolutional neural network model is then run on that core for service processing.
Different convolutional neural network models often have different characteristics, and so do different cores. Existing schemes do not exploit the characteristics of different cores to execute a specific convolutional neural network model on an adapted core, which makes running a specific convolutional neural network model inefficient and wastes computing resources on the terminal.
Summary of the invention
Embodiments of the present invention provide a core scheduling method and a terminal, which are used to provide an adapted core for a convolutional neural network model.
A first aspect of the embodiments of the present invention provides a core scheduling method, including: acquiring a target model parameter, where the target model parameter is used to represent the computational density of a convolutional neural network model, and convolutional neural network models with different computational densities are suitable for running on different cores; and determining, according to the target model parameter, the core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, and the at least two cores are heterogeneous cores on the terminal, whose different hardware characteristics make different cores suitable for running convolutional neural network models with different computational densities. The first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and the core weight value is used to indicate the priority of a core being selected to run the convolutional neural network model, so that a core suitable for running the convolutional neural network model can be determined from the core weight values. The core for running the convolutional neural network model is thus determined from the at least two cores according to the core weight values of the at least two cores.

When determining the core for running the convolutional neural network model, the core weight values may be used directly, that is, the core with the largest core weight value is determined to run the convolutional neural network model; or indirectly, for example, other parameters are used to correct the core weight values, corrected weight values are obtained, and the corrected weight values are then used to determine the core for running the convolutional neural network model.
In this way, through the core weight values of different cores, an adapted core can be determined to run a convolutional neural network model with a specific computational density; if cores with higher core weight values can run the convolutional neural network model more efficiently, the core determined from the core weight values can run the convolutional neural network model efficiently.
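The direct and indirect uses of the core weight values described above can be sketched as follows (hypothetical Python; the cores and weight values are illustrative, and multiplication stands in for the unspecified correction operation):

```python
# Hypothetical sketch of direct vs. indirect use of core weight values.
# Core names, weights, and the multiplicative correction are illustrative.

core_weights = {"CPU": 0.3, "GPU": 0.9, "DSP": 0.6}

# Direct use: choose the core with the largest core weight value.
direct_choice = max(core_weights, key=core_weights.get)

# Indirect use: correct the core weight values with another weight first
# (e.g. a parameter weight reflecting the terminal's current state), then
# choose the core with the largest corrected value.
param_weights = {"CPU": 1.0, "GPU": 0.5, "DSP": 1.0}  # GPU heavily loaded, say
corrected = {core: core_weights[core] * param_weights[core] for core in core_weights}
indirect_choice = max(corrected, key=corrected.get)

print(direct_choice)    # the GPU wins on raw core weight values
print(indirect_choice)  # the DSP wins once the state correction is applied
```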
With reference to the first aspect of the embodiments of the present invention, in a first implementation manner of the first aspect, the method further includes: acquiring a current state parameter of the terminal, where the state parameter is a dynamically changing parameter. The acquired current state parameter can therefore reflect the operating environment of the cores on the terminal, and different operating environments also affect how different cores run a convolutional neural network model. To this end, the parameter weight values of the at least two cores are determined from a preset second correspondence according to the state parameter, where the parameter weight values of the at least two cores correspond to the state parameter, the second correspondence includes the correspondence between the state parameter and the parameter weight values of the at least two cores, and the parameter weight value is used to indicate the priority of a core being selected to run the convolutional neural network model under the state parameter. The parameter weight value thus reflects the influence of dynamic environmental factors on the terminal on running the convolutional neural network model on a core.
Determining, according to the core weight values of the at least two cores, the core for running the convolutional neural network model from the at least two cores then includes: for each core, correcting the core weight value with the parameter weight value to obtain a first modified weight value, where the first modified weight value is used to indicate the priority of the core being selected to run the convolutional neural network model. The first modified weight value thus carries both the core weight value and the parameter weight value as influencing factors. The core for running the convolutional neural network model is then determined from the at least two cores according to the first modified weight values of the at least two cores, which can identify a core that is more suitable for running the convolutional neural network model.
In this way, the acquired current state parameter of the terminal can reflect the terminal's current dynamic operating environment. The parameter weight value of each core can be determined from the state parameter and the second correspondence, and this value reflects the influence of the state parameter on running the convolutional neural network model on a core: under the state parameter, a core with a larger weight value is more suitable for running the convolutional neural network model and is scheduled for it with higher priority. The first modified weight value, obtained by correcting the core weight value with the state parameter, takes more factors into account and better reflects how suitable a core is for running the convolutional neural network model, so the core determined from the first modified weight value runs the convolutional neural network model more effectively.
With reference to the first implementation manner of the first aspect, in a second implementation manner of the first aspect, the current state parameter of the terminal is the current core usage rate of each core. Determining the parameter weight values of the at least two cores from the preset second correspondence according to the state parameter then includes: for each core, determining a performance weight value from the preset second correspondence according to the core usage rate, where the performance weight value of each core corresponds to the core usage rate of that core, the performance weight value is used to indicate the priority of the core being selected to run the convolutional neural network model under the core's current usage rate, and the second correspondence includes the correspondence between the core usage rate of each core and the performance weight value of each core. The performance weight value determined in this way reflects the degree to which the core's current usage rate affects running the convolutional neural network.
In this way, the acquired state parameter is the core usage rate, so the performance weight value determined from the core usage rate and the second correspondence makes the core's usage rate one of the reference factors for scheduling, and the first modified weight value obtained by correcting the core weight value with the performance weight value takes into account the influence of the core usage rate on running the convolutional neural network model.
With reference to the second implementation manner of the first aspect, in a third implementation manner of the first aspect, the method further includes: acquiring the current remaining power value of the terminal, and then determining, according to the remaining power value, the power consumption weight values of the at least two cores from a preset third correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores, and the power consumption weight value is used to indicate the priority of the core being selected to run the convolutional neural network model under the remaining power value. The power consumption weight value determined in this way reflects the degree to which the terminal's remaining power affects running the convolutional neural network model.

Determining, according to the first modified weight values of the at least two cores, the core for running the convolutional neural network model from the at least two cores then includes: for each core, correcting the first modified weight value with the power consumption weight value to obtain a second modified weight value, where the second modified weight value is used to indicate the priority of the core being selected to run the convolutional neural network model; and determining, according to the second modified weight values of the at least two cores, the core for running the convolutional neural network model from the at least two cores.
In this way, the power consumption weight value of each core is determined from the terminal's current remaining power value and the third correspondence. The second modified weight value, obtained by correcting the first modified weight value with the power consumption weight value, further takes into account the influence of the terminal's remaining power on running the convolutional neural network model on different cores, so the core determined from the second modified weight value runs the convolutional neural network model more effectively.
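The two-step correction described above (a performance weight from each core's usage rate, then a power consumption weight from the remaining battery) might look like the following sketch. Everything here is illustrative: the patent does not fix the correspondence tables or the correction operation, for which multiplication is assumed:

```python
# Hypothetical end-to-end correction chain. The correspondence tables and
# the use of multiplication as the "correction" are assumptions made for
# this sketch; the patent leaves the exact operation open.

def performance_weight(core_usage):
    # Second correspondence (illustrative): a busier core is less preferred.
    return 1.0 - core_usage

def power_weight(core, remaining_battery):
    # Third correspondence (illustrative): at low battery, prefer the
    # low-power DSP over the GPU.
    if remaining_battery < 0.2:
        return {"CPU": 0.6, "GPU": 0.3, "DSP": 1.0}[core]
    return 1.0

def select_core(core_weights, core_usages, remaining_battery):
    # First correction: performance weight; second correction: power weight.
    second_modified = {
        core: weight
        * performance_weight(core_usages[core])
        * power_weight(core, remaining_battery)
        for core, weight in core_weights.items()
    }
    return max(second_modified, key=second_modified.get)

core_weights = {"CPU": 0.3, "GPU": 0.9, "DSP": 0.6}
core_usages = {"CPU": 0.5, "GPU": 0.1, "DSP": 0.2}
print(select_core(core_weights, core_usages, remaining_battery=0.8))  # GPU
print(select_core(core_weights, core_usages, remaining_battery=0.1))  # DSP
```

With a full battery the lightly loaded GPU keeps its lead; at 10% battery the power correction tips the choice to the DSP, matching the intent of the third implementation manner.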
With reference to the first aspect of the embodiments of the present invention, in a fourth implementation manner of the first aspect, the method further includes: acquiring the current core usage rate of each core, and then, for each core, determining a performance parameter from a second correspondence based on the core usage rate, where the performance parameter of each core corresponds to that core's usage rate. The second correspondence includes a correspondence between the performance parameter of each core and the core usage rate of each core. A core has different operating requirements at different usage rates; by presetting the second correspondence, cores at different usage rates can be made to operate under different performance parameters.
Accordingly, after the core to run the convolutional neural network model is determined from the at least two cores based on their core weight values, the method further includes: running the convolutional neural network model on the target core using the target core's performance parameter, where the target core is the core determined to run the convolutional neural network model.
In this way, the specific operating mode of the core can be controlled through concrete performance parameters, so that the core running the convolutional neural network model operates in the manner set by the user, meeting the user's need to control the core.
With reference to the fourth implementation manner of the first aspect of the embodiments of the present invention, in a fifth implementation manner of the first aspect, the performance parameter includes one or more of thread priority information, sleep time information, and thread count information. The thread priority information is the priority of the child threads when the core runs the convolutional neural network model; the sleep time information is the interval between two runs of the convolutional neural network model on the core; the thread count information is the number of threads the core uses to run the convolutional neural network model.
In this way, the threads, run timing, child threads, and so on of the core running the convolutional neural network model can be controlled.
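A minimal sketch of this implementation follows. The field names, the usage-rate thresholds, and the lookup function are all hypothetical; the embodiment only names the three kinds of performance parameter and leaves the concrete second correspondence to be preset.

```python
# Sketch: a container for the three performance parameters named in this
# implementation, plus a toy second-correspondence lookup keyed on core usage rate.
# All thresholds and values are invented for illustration.

from dataclasses import dataclass

@dataclass
class PerfParams:
    thread_priority: int   # priority of the child threads running the model
    sleep_ms: int          # interval between two consecutive model runs
    num_threads: int       # number of threads used to run the model

def params_for_usage(core_usage_percent):
    """Toy lookup: a busier core gets fewer, lower-priority threads and longer sleeps."""
    if core_usage_percent < 50:
        return PerfParams(thread_priority=10, sleep_ms=0, num_threads=4)
    return PerfParams(thread_priority=5, sleep_ms=20, num_threads=2)
```

A scheduler would call `params_for_usage` with the target core's current usage rate and then launch the model under the returned parameters.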
With reference to the first implementation manner of the first aspect of the embodiments of the present invention, in a sixth implementation manner of the first aspect, the terminal's current state parameter is the terminal's current remaining power value. Determining parameter weight values of the at least two cores from a preset second correspondence based on the state parameter therefore includes: determining, based on the remaining power value, power consumption weight values of the at least two cores from the preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, a power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at that remaining power value, and the second correspondence includes a correspondence between the remaining power value and the power consumption weight values of the at least two cores. A power consumption weight value determined in this way reflects the degree to which the terminal's current remaining power value affects running the convolutional neural network model on different cores.
In this way, the acquired state parameter is the terminal's remaining power value, so the power consumption weight value determined from the remaining power value and the second correspondence makes the terminal's remaining power one of the factors considered in core scheduling, and the first modified weight value obtained by correcting the core weight value with this power consumption weight value can take into account the effect of the terminal's remaining power on running the convolutional neural network model.
With reference to the first aspect of the embodiments of the present invention or any one of the first to sixth implementation manners of the first aspect, in a seventh implementation manner of the first aspect, the target model parameter is the number of weight parameters of the convolutional neural network model. The number of weight parameters accurately reflects the computational density of the convolutional neural network model.
With reference to the first aspect of the embodiments of the present invention or any one of the first to seventh implementation manners of the first aspect, in an eighth implementation manner of the first aspect, the at least two cores include at least two of a CPU, a GPU, a DSP, and a systolic array processor. The systolic array processor may include, for example, a neural-network processing unit (NPU) or a tensor processing unit (TPU). These computing cores have different characteristics, so the same convolutional neural network model may execute with different efficiency on each of them; applying the core scheduling method of the embodiments of the present invention to these cores effectively identifies a core on which the model runs well.
With reference to the first aspect of the embodiments of the present invention or any one of the first to eighth implementation manners of the first aspect, in a ninth implementation manner of the first aspect, determining core weight values of the at least two cores from the preset first correspondence based on the target model parameter includes: determining, in the preset first correspondence, the target model parameter interval in which the target model parameter lies; then determining, in the first correspondence, the core weight value intervals of the at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes a correspondence between target model parameter intervals and the core weight value intervals of the at least two cores, and the target model parameter interval contains the target model parameter; and, for each core, determining a core weight value from its core weight value interval, where the position of the core weight value within the core weight value interval is the same as the position of the target model parameter within the target model parameter interval.
In this way, the target model parameter intervals and core weight value intervals in the first correspondence are numerical ranges and can therefore cover more concrete parameter values. Core weight values determined by position mapping more accurately reflect the correspondence with the target model parameter, and the core weight values of different cores are easier to tell apart, so the core weight value better reflects the priority with which a core is selected.
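The position-mapping step of the ninth implementation can be sketched as follows. The interval endpoints are made-up example values rather than anything specified by the embodiments; only the mapping rule (same relative position in both intervals) is taken from the text above.

```python
# Sketch: the core weight value sits at the same relative position inside its core
# weight value interval as the target model parameter sits inside its target model
# parameter interval. All interval endpoints below are invented examples.

def map_position(value, src_lo, src_hi, dst_lo, dst_hi):
    frac = (value - src_lo) / (src_hi - src_lo)   # relative position in source interval
    return dst_lo + frac * (dst_hi - dst_lo)      # same relative position in target

# Example first-correspondence entry: models with 1e6..5e6 weight parameters
# map onto these per-core weight value intervals.
param_lo, param_hi = 1_000_000, 5_000_000
weight_intervals = {"CPU": (0.1, 0.3), "GPU": (0.4, 0.8), "NPU": (0.5, 1.0)}

n_params = 3_000_000   # target model parameter, midpoint of its interval here
core_weights = {core: map_position(n_params, param_lo, param_hi, lo, hi)
                for core, (lo, hi) in weight_intervals.items()}
# midpoint of the parameter interval -> midpoint of each weight interval
```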
In a second aspect, an embodiment of the present invention provides a core scheduling method, including:
acquiring a target model parameter, where the target model parameter represents the computational density of a convolutional neural network model;
determining, based on the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on a terminal, the first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model; and
assigning the convolutional neural network model to run on different cores based on the core weight values of the at least two cores.
In a third aspect, an embodiment of the present invention provides a core scheduling method, including:
acquiring a task type parameter, where the task type parameter indicates the type of a computing task;
determining, based on the task type parameter, core weight values of at least two cores from a preset fourth correspondence, where the core weight values of the at least two cores correspond to the task type parameter, the at least two cores are heterogeneous cores on a terminal, the fourth correspondence includes a correspondence between the task type parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the computing task; and
determining, based on the core weight values of the at least two cores, the core to run the computing task from the at least two cores.
In a fourth aspect, an embodiment of the present invention provides a terminal having the functions performed by the terminal in the foregoing methods. These functions may be implemented in hardware, or in hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.
In a possible implementation manner, the terminal includes:
an acquiring unit, configured to acquire a target model parameter, where the target model parameter represents the computational density of a convolutional neural network model;
a weight value determining unit, configured to determine, based on the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on the terminal, the first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model; and
a core determining unit, configured to determine, based on the core weight values of the at least two cores, the core to run the convolutional neural network model from the at least two cores.
In another possible implementation manner, the terminal includes a processor and a memory. The processor may be configured to support the terminal in performing the corresponding functions of the method of the first aspect. For example, the processor is configured to: acquire a target model parameter, where the target model parameter represents the computational density of a convolutional neural network model; determine, based on the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on the terminal, the first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model; and determine, based on the core weight values of the at least two cores, the core to run the convolutional neural network model from the at least two cores.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the methods of the foregoing aspects.
In a sixth aspect, an embodiment of the present invention provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the methods of the foregoing aspects.
In a seventh aspect, a chip apparatus is provided, where the chip apparatus includes a processing unit configured to perform the method of the first aspect.
In an eighth aspect, a chip apparatus is provided, where the chip apparatus includes a processor and a memory. The memory includes instructions, and the processor runs the instructions to perform the methods of the foregoing aspects.
In a ninth aspect, a chip system is provided, where the chip system includes a processor configured to support a terminal in implementing the functions involved in the foregoing first to third aspects, for example, sending or processing the data and/or information involved in the foregoing methods. In a possible design, the chip system further includes a memory configured to store the program instructions and data necessary for the network device. The chip system may consist of chips, or may include chips and other discrete devices.
In the technical solutions provided by the embodiments of the present invention, a target model parameter is acquired, where the target model parameter represents the computational density of a convolutional neural network model. Then, based on the target model parameter, core weight values of at least two cores are determined from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, and the at least two cores are heterogeneous cores on a terminal. The first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model. The core to run the convolutional neural network model is then determined from the at least two cores based on their core weight values.
The heterogeneous cores on a terminal have different characteristics, and different cores are suited to running convolutional neural network models of different computational densities. Suppose a first correspondence is preset that includes a correspondence between a target model parameter and core weight values of at least two cores, where the target model parameter represents the computational density of a convolutional neural network model and the at least two cores are heterogeneous cores on the terminal. Then, after the target model parameter of a convolutional neural network model is acquired, the core weight values of the at least two cores can be determined from the preset first correspondence based on that parameter. A core weight value indicates the priority with which a core is selected to run the convolutional neural network model, so the core weight values identify the core suited to running it. In this way, the core to run the convolutional neural network model is determined from the at least two cores based on their core weight values. Through the core weight values of the different cores, a well-adapted core can be chosen to run a convolutional neural network model of a given computational density: if cores with higher core weight values run the model more efficiently, the core determined from the core weight values runs the convolutional neural network model efficiently.
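The scheduling idea above can be sketched end to end as follows. The table entries and the bucketing of the number of weight parameters are invented for illustration; a real first correspondence would be preset per chip.

```python
# Minimal end-to-end sketch: look up each heterogeneous core's weight value for a
# model of the given computational density (here, bucketed by number of weight
# parameters), then pick the core with the highest weight value.

FIRST_CORRESPONDENCE = [
    # (upper bound on number of weight parameters, {core: core weight value})
    (1_000_000,    {"CPU": 0.8, "GPU": 0.5, "NPU": 0.3}),  # small model: CPU preferred
    (10_000_000,   {"CPU": 0.4, "GPU": 0.8, "NPU": 0.6}),  # medium model: GPU preferred
    (float("inf"), {"CPU": 0.1, "GPU": 0.5, "NPU": 0.9}),  # dense model: NPU preferred
]

def core_weights_for(n_weight_params):
    """Return the core weight values corresponding to the target model parameter."""
    for upper_bound, weights in FIRST_CORRESPONDENCE:
        if n_weight_params <= upper_bound:
            return weights
    raise ValueError("unreachable: last bucket is unbounded")

def schedule(n_weight_params):
    """Determine the core to run the model: the one with the highest weight value."""
    weights = core_weights_for(n_weight_params)
    return max(weights, key=weights.get)
```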
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a schematic diagram of a convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a unit of a convolutional neural network according to another embodiment of the present invention;
FIG. 3 is a usage scenario diagram related to a core scheduling method according to another embodiment of the present invention;
FIG. 4 is a flowchart of a core scheduling method according to another embodiment of the present invention;
FIG. 5 is a flowchart of a core scheduling method according to another embodiment of the present invention;
FIG. 6 is a schematic diagram of a hardware structure of a terminal according to another embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a terminal according to another embodiment of the present invention.
DESCRIPTION OF EMBODIMENTS
The terms "first", "second", "third", "fourth", and so on (if any) in the specification, claims, and accompanying drawings of the present invention are used to distinguish between similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described herein. Moreover, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
To facilitate understanding of the embodiments of the present invention, some technical terms are introduced below; the embodiments described later may refer to these introductions.
1. Convolutional Neural Network
A convolutional neural network is a feedforward neural network whose artificial neurons can respond to surrounding units within part of their coverage; it performs outstandingly on large-scale image processing.
A convolutional neural network consists of one or more convolutional layers and fully connected layers at the top (corresponding to a classical neural network), and also includes associated weights and pooling layers. This structure enables a convolutional neural network to exploit the two-dimensional structure of the input data. Compared with other deep learning architectures, convolutional neural networks give better results in image and speech recognition. The model can also be trained with the backpropagation algorithm. Compared with other deep feedforward neural networks, a convolutional neural network has fewer parameters to estimate, which makes it an attractive deep learning architecture.
A convolutional neural network differs from an ordinary neural network in that it contains a feature extractor composed of convolutional layers and sub-sampling layers. In a convolutional layer of a convolutional neural network, a neuron is connected only to some of the neurons in the adjacent layer. A convolutional layer of a CNN usually contains several feature maps (featureMap); each feature map consists of neurons arranged in a rectangle, and the neurons of the same feature map share weights. The shared weights are the convolution kernel. A convolution kernel is generally initialized as a matrix of small random numbers, and during training of the network it learns reasonable weights. The direct benefit of shared weights (convolution kernels) is fewer connections between the layers of the network, which also reduces the risk of overfitting. Sub-sampling, also called pooling, usually takes two forms: mean pooling and max pooling. Sub-sampling can be seen as a special convolution process. Convolution and sub-sampling greatly simplify model complexity and reduce the number of model parameters.
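The two pooling forms mentioned above can be illustrated on a single 2x2 window of a feature map (the window values are made-up example data):

```python
# Sketch: mean pooling averages the values in a sub-sampling window;
# max pooling keeps only the largest value.

def mean_pool(window):
    return sum(window) / len(window)

def max_pool(window):
    return max(window)

window = [1.0, 3.0, 2.0, 6.0]   # one 2x2 region of a feature map, flattened
```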
2. Weight Parameters of a Convolutional Neural Network
Neural networks, including convolutional neural networks, are composed of multiple stacked layers, each made up of nodes. Computation takes place in the nodes, which operate roughly like human neurons: they activate and emit a signal when they receive enough stimulus. A node combines the input data with a set of coefficients (called weight parameters), amplifying or suppressing the inputs to assign them importance in the algorithm's learning task. As shown in FIG. 1, the input bit 1 and X1 to Xm are input data, and W0 to Wm are weight parameters. The sum of the products of the input data and the weight parameters is fed into the node's activation function, which decides whether and how far the signal continues to propagate through the network, and thus how the signal affects the network's final result.
A unit of the convolutional neural network may be as shown in FIG. 2, where h(x) is the output data and X1 to Xm are the input data.
When multiple units such as the one shown in FIG. 2 are combined in a hierarchical structure, a neural network model is formed.
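The single unit described above can be sketched as follows. The choice of sigmoid as the activation function is an assumption made for illustration; FIG. 2 does not fix a particular activation.

```python
# Sketch of one unit: the weighted sum of the inputs (plus the bias weight W0 on the
# constant input 1) is passed through an activation function to produce h(x).

import math

def unit(inputs, weights):
    """inputs: [x1..xm]; weights: [w0, w1..wm], with w0 the weight on the input bit 1."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation, a common choice
```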
3. Applications of Convolutional Neural Networks
In recent years, convolutional neural networks have been applied in many directions, for example in speech recognition, face recognition, generic object recognition, motion analysis, and natural language processing.
Convolutional neural networks also have more and more applications on mobile terminals. In smart photo albums in particular, their application forms include image classification, feature extraction, and face clustering; the computational characteristic of these applications is that they involve a large number of matrix operations.
4. Convolutional Neural Network Model
A convolutional neural network model is a concrete network model (or algorithm instance) obtained by training a convolutional neural network. A convolutional neural network model has the characteristics of a convolutional neural network, has a specific computational density, and can be used to execute a specific application service.
5. System Chip
A device is provided with multiple kinds of cores (also called processors or computing units), which make up the system chip. The cores in the embodiments of the present invention are mainly heterogeneous cores, whose types include but are not limited to the following:
1) A central processing unit (CPU) is a very-large-scale integrated circuit and is the computing core (Core) and control unit of a computer. Its functions are mainly to interpret computer instructions and to process data in computer software.
2) A graphics processing unit (GPU), also called a display core, visual processor, or display chip, is a microprocessor dedicated to image computation on personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones).
3) A digital signal processor (DSP) is a chip capable of implementing digital signal processing technology. A DSP chip internally uses a Harvard architecture that separates program and data, has a dedicated hardware multiplier, makes extensive use of pipelined operation, and provides special DSP instructions, which can be used to implement various digital signal processing algorithms quickly.
4) Systolic array processor
A systolic array processor is an application-specific integrated circuit (ASIC) that adopts a systolic array structure, in which data flows rhythmically between the processing elements of the array in a predetermined "pipelined" manner. As the data flows, all processing elements process the data passing through them in parallel at the same time, so the processor can achieve a very high parallel processing speed.
A systolic array processor may specifically be a neural-network processing unit (NPU), a tensor processing unit (TPU), an intelligence processing unit (IPU), or the like.
4.1) A neural-network processing unit (NPU) simulates human neurons and synapses at the circuit level and directly processes large-scale neurons and synapses with a deep learning instruction set, with one instruction completing the processing of a group of neurons. Compared with the von Neumann architecture used in CPUs, in which storage and computation are separated, the NPU integrates storage and computation through synaptic weights, greatly improving operating efficiency.
4.2) Tensor processing unit (TPU). Artificial intelligence aims to endow machines with human intelligence, and machine learning is a powerful way to realize artificial intelligence; machine learning is the study of how to make computers learn automatically. The TPU is a chip dedicated to machine learning: it can be a programmable artificial intelligence (AI) accelerator for the TensorFlow platform and is essentially an accelerator with a systolic array structure. Its internal instruction set can still run when the TensorFlow program changes or the algorithm is updated. A TPU provides high-throughput, low-precision computation for the forward pass of a model rather than for model training, and has higher energy efficiency (TOPS/W). A TPU may also be called an intelligence processing unit (IPU).
图3为本发明实施例提供的核心调度方法涉及的使用场景图。在该使用场景中,在终端上设有系统芯片(System On Chip,SoC),该系统芯片包括至少两个核心,该至少两个核心为异构核心。该至少两个核心可以包括CPU、GPU、DSP、脉动阵列处理器等。脉动阵列处理器包括但不限于神经网络处理器、张量处理器等。这些芯片可以称之为核心,用于在终端上进行计算。其中,不同的核心有不同的能效比。FIG. 3 is a diagram of a usage scenario involved in the core scheduling method according to an embodiment of the present invention. In this scenario, a System on Chip (SoC) is provided on the terminal; the system chip includes at least two cores, and the at least two cores are heterogeneous cores. The at least two cores may include a CPU, a GPU, a DSP, a systolic array processor, and the like. Systolic array processors include, but are not limited to, neural-network processors, tensor processors, and the like. These chips may be called cores and are used to perform computation on the terminal. Different cores have different energy efficiency ratios.
终端可以使用具体的算法执行不同的应用业务,本发明实施例的方法涉及运行卷积神经网络模型,终端可以使用卷积神经网络模型执行不同的应用业务。The terminal can perform different application services by using a specific algorithm. The method in the embodiment of the present invention involves running a convolutional neural network model, and the terminal can perform different application services using the convolutional neural network model.
终端在执行不同的应用业务时,会遇到不同的要求,例如,实时场景应用(比如相机预览)要求对图像进行实时识别,对性能的要求较高;而图库对导入图片进行分类在后台进行,此时,对运算的实时性要求较低,更偏向要求减少功耗。When the terminal performs different application services, it encounters different requirements. For example, a real-time scenario application (such as camera preview) requires real-time recognition of images and places high demands on performance, whereas the gallery classifies imported pictures in the background; in that case the real-time requirement on computation is lower, and reducing power consumption is preferred.
因此,终端在运行具体的卷积神经网络模型时,需要根据计算要求(例如,性能、功耗等)进行有效的核心调度,调度核心运行该卷积神经网络模型以执行具体的业务。这将有益于在终端上对应用业务的执行,例如产生更加高效或者节能的执行效果等。Therefore, when the terminal runs a specific convolutional neural network model, it needs to perform effective core scheduling according to computing requirements (for example, performance, power consumption, etc.), and the scheduling core runs the convolutional neural network model to perform specific services. This would be beneficial for the execution of application services on the terminal, such as producing more efficient or energy efficient execution effects.
为此,本发明实施例提供了一种核心调度方法,用于为卷积神经网络模型提供适配的核心,以高效运行该卷积神经网络模型。To this end, an embodiment of the present invention provides a core scheduling method for providing an adaptive core for a convolutional neural network model to efficiently run the convolutional neural network model.
图4为本发明实施例提供的一种核心调度方法的方法流程图。参考上文的内容和图4,本发明实施例的方法包括:FIG. 4 is a flowchart of a method for a core scheduling method according to an embodiment of the present invention. Referring to the above content and FIG. 4, the method of the embodiment of the present invention includes:
步骤401:获取目标模型参数。Step 401: Acquire target model parameters.
其中,目标模型参数用于表示一卷积神经网络模型的计算密度。Among them, the target model parameters are used to represent the computational density of a convolutional neural network model.
终端获取目标模型参数,通过该目标模型参数即可确定具体的卷积神经网络模型的计算密度。因不同核心适于运行不同计算密度的卷积神经网络模型,从而可根据目标模型参数选择核心,以运行具有该目标模型参数的卷积神经网络模型。The terminal acquires the target model parameter, from which the computational density of the specific convolutional neural network model can be determined. Because different cores are suited to running convolutional neural network models of different computational densities, a core can be selected according to the target model parameter to run the convolutional neural network model having that target model parameter.
可以理解,目标模型参数的具体形式有多种,例如,目标模型参数为卷积神经网络模型的权重参数数量。该目标模型参数还可以为卷积神经网络模型的层(layer)数,卷积神经网络模型的层数可表示卷积神经网络模型的深度。目标模型参数还可以是其它的参数,这些参数都可以反映卷积神经网络模型的计算密度,卷积神经网络模型的计算密度也可以称为卷积神经网络模型的复杂度。It can be understood that there are various specific forms of the target model parameters, for example, the target model parameter is the number of weight parameters of the convolutional neural network model. The target model parameter can also be the number of layers of the convolutional neural network model, and the number of layers of the convolutional neural network model can represent the depth of the convolutional neural network model. The target model parameters can also be other parameters, which can reflect the computational density of the convolutional neural network model. The computational density of the convolutional neural network model can also be called the complexity of the convolutional neural network model.
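As an illustration of how the two example target model parameters mentioned above could be computed, the following sketch (not taken from the patent; the layer shapes and the 8x8 feature-map size are invented for illustration) counts the weight parameters and the layers of a hypothetical small CNN:

```python
# Hypothetical example: deriving two candidate "target model parameters"
# (total weight-parameter count and layer count) for a small CNN.

def conv_weight_count(in_ch, out_ch, kh, kw, bias=True):
    """Weights in one convolutional layer: kh*kw*in_ch*out_ch (+ biases)."""
    return kh * kw * in_ch * out_ch + (out_ch if bias else 0)

def fc_weight_count(in_features, out_features, bias=True):
    """Weights in one fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

# A hypothetical 4-layer model: two conv layers followed by two FC layers.
layers = [
    conv_weight_count(3, 32, 3, 3),    # conv1
    conv_weight_count(32, 64, 3, 3),   # conv2
    fc_weight_count(64 * 8 * 8, 256),  # fc1 (assumes an 8x8 feature map)
    fc_weight_count(256, 10),          # fc2
]

num_layers = len(layers)   # one possible target model parameter (depth)
num_weights = sum(layers)  # another possible target model parameter
print(num_layers, num_weights)  # 4 1070794
```

Either quantity reflects the model's computational density; which one is used as the target model parameter is a design choice.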
关于卷积神经网络模型和卷积神经网络模型的计算密度可参考上文介绍技术术语的部分。For the convolutional neural network model and its computational density, reference may be made to the section above introducing technical terms.
步骤401的具体实现方式有多种,现举出其中的几种示例,如下:There are various implementations of step 401. Several examples are given below, as follows:
示例一:终端获取卷积神经网络模型,通过分析该卷积神经网络模型,得到该卷积神经网络模型的目标模型参数。Example 1: The terminal acquires a convolutional neural network model, and analyzes the convolutional neural network model to obtain the target model parameters of the convolutional neural network model.
示例二:解析设备获取卷积神经网络模型,解析设备解析该卷积神经网络模型,以得到该卷积神经网络模型的目标模型参数,然后解析设备向终端发送该目标模型参数,以使终端获取到该目标模型参数。Example 2: a parsing device acquires the convolutional neural network model and parses it to obtain the target model parameter of the convolutional neural network model; the parsing device then sends the target model parameter to the terminal, so that the terminal acquires the target model parameter.
步骤402:根据目标模型参数,从预设的第一对应关系中确定至少两个核心的核心权重值。Step 402: Determine core weight values of at least two cores from the preset first correspondence according to the target model parameters.
其中,该至少两个核心的核心权重值与目标模型参数对应,该至少两个核心为终端上的异构核心,该第一对应关系包括目标模型参数和至少两个核心的核心权重值的对应关系,核心权重值用于表示核心被选择以运行卷积神经网络模型的优先程度。The core weight values of the at least two cores correspond to the target model parameter; the at least two cores are heterogeneous cores on the terminal; the first correspondence includes the correspondence between target model parameters and the core weight values of the at least two cores; and the core weight value indicates the priority with which a core is selected to run the convolutional neural network model.
终端预先设置有第一对应关系,该第一对应关系包括目标模型参数和至少两个核心的核心权重值的对应关系。这样,终端在获取到目标模型参数后,即可根据该目标模型参数,从该第一对应关系中确定该至少两个核心的核心权重值,确定出的该至少两个核心的核心权重值与目标模型参数对应,该至少两个核心的核心权重值与目标模型参数的对应关系即为第一对应关系包括的目标模型参数和至少两个核心的核心权重值的对应关系。The terminal is preset with a first correspondence, which includes the correspondence between target model parameters and the core weight values of the at least two cores. In this way, after acquiring the target model parameter, the terminal can determine the core weight values of the at least two cores from the first correspondence according to the target model parameter; the determined core weight values correspond to the target model parameter, and this correspondence between the core weight values of the at least two cores and the target model parameter is exactly the correspondence, included in the first correspondence, between target model parameters and core weight values.
核心权重值用于表示核心被选择以运行卷积神经网络模型的优先程度,从而终端可以利用核心权重值来实现调度适宜的核心来运行该卷积神经网络模型。例如,若第一核心运行一具体计算密度的卷积神经网络模型的效率高于第二核心运行该卷积神经网络模型的效率,换言之,第一核心比第二核心更加适宜于运行该卷积神经网络模型,则将第一核心的核心权重值设置得比第二核心的核心权重值高,以表示为了运行该卷积神经网络模型,第一核心被选择的优先程度大于第二核心被选择的优先程度。从而,可以使用该目标模型参数和该至少两个核心的核心权重值预先建立第一对应关系。当终端获取到具体的目标模型参数时,即可根据该目标模型参数从第一对应关系中确定出核心的核心权重值。The core weight value indicates the priority with which a core is selected to run the convolutional neural network model, so the terminal can use the core weight values to schedule a suitable core to run the model. For example, if the first core runs a convolutional neural network model of a specific computational density more efficiently than the second core does, in other words, if the first core is more suitable than the second core for running that model, then the core weight value of the first core is set higher than that of the second core, indicating that, for running this model, the first core is selected with a higher priority than the second core. The first correspondence can thus be established in advance using the target model parameter and the core weight values of the at least two cores. When the terminal acquires a specific target model parameter, it can determine the cores' core weight values from the first correspondence according to that target model parameter.
为了能实现利用核心权重值来调度适宜的核心来运行该卷积神经网络模型,在有的实施例中,需要根据核心的硬件特性或架构特性或计算方式对运行具体的卷积神经网络模型的适宜程度来设置核心权重值。此时,核心权重值具体用于表示核心的硬件特性对运行卷积神经网络模型的适配程度、或者核心的架构特性对运行卷积神经网络模型的适配程度、或者核心的计算方式对运行卷积神经网络模型的适配程度。To make it possible to schedule a suitable core using the core weight values, in some embodiments the core weight values need to be set according to how well a core's hardware characteristics, architectural characteristics, or computation mode suit running the specific convolutional neural network model. In this case, the core weight value specifically indicates the degree to which the core's hardware characteristics, architectural characteristics, or computation mode are adapted to running the convolutional neural network model.
第一对应关系中的至少两个核心为终端上的异构核心,核心特性不相同,从而运行卷积神经网络模型时会产生不同的运行效果,所以核心权重值的设定是可行的。The at least two cores in the first correspondence are heterogeneous cores on the terminal with different core characteristics, so they produce different running effects when running the convolutional neural network model; setting distinct core weight values is therefore feasible.
至于哪一核心适于运行哪一计算密度的卷积神经网络模型,可以根据预先进行试验测试得到,例如,设置测试效率参数,该测试效率参数表示运行卷积神经网络模型执行具体业务所用的时间。然后,将具有具体计算密度的卷积神经网络模型在不同核心上运行,以得到不同核心的测试效率参数。然后为测试效率参数大的核心配置较大的核心权重值。As for which core is suited to running a convolutional neural network model of which computational density, this can be obtained from tests performed in advance. For example, a test efficiency parameter is defined, representing the time taken to run the convolutional neural network model to perform a specific service; a convolutional neural network model of a specific computational density is then run on the different cores to obtain each core's test efficiency parameter, and a larger core weight value is configured for the core with the larger test efficiency parameter.
核心权重值具体来说,可以是表示核心的硬件特性对运行一卷积神经网络模型的适宜程度。步骤402中的至少两个核心为终端上的异构核心,不同的异构核心具有不同的硬件特性,从而不同的异构的核心适宜于运行不同计算密度的卷积神经网络模型。对异构核心进行调度以运行卷积神经网络模型才更有实际意义。Specifically, the core weight value may indicate how well a core's hardware characteristics suit running a convolutional neural network model. The at least two cores in step 402 are heterogeneous cores on the terminal; different heterogeneous cores have different hardware characteristics, so different heterogeneous cores are suited to running convolutional neural network models of different computational densities. Scheduling heterogeneous cores to run a convolutional neural network model is therefore of practical significance.
可选地,该至少两个核心可以是CPU、GPU、DSP、脉动阵列处理器中的至少两个。例如,在终端上设置的异构核心可以为CPU、GPU、DSP、脉动阵列处理器中任意两个,或者任意三个,或者任意四个,或者包括全部这些芯片。可以理解,在终端上设置的异构核心还可以是其它的核心。脉动阵列处理器可以是NPU或TPU等具体的芯片。Optionally, the at least two cores may be at least two of a CPU, a GPU, a DSP, and a systolic array processor. For example, the heterogeneous cores provided on the terminal may be any two, any three, any four, or all of these chips. It can be understood that the heterogeneous cores provided on the terminal may also be other cores. The systolic array processor may be a specific chip such as an NPU or a TPU.
可以理解,在步骤402中确定出的至少两个核心的核心权重值有多种具体的实现形式,举例如下:It can be understood that the core weight values of the at least two cores determined in step 402 have multiple specific implementation forms, as follows:
在一个示例中,核心权重值为数值的形式,具体可以为:百分比的形式,例如,10%、30%等;分数的形式,例如,Figure PCTCN2017107614-appb-000001 或 Figure PCTCN2017107614-appb-000002 等;小数的形式,例如,0.5、1.0、1.5等;In one example, the core weight value takes a numerical form, which may specifically be: a percentage, for example, 10%, 30%, etc.; a fraction, for example, Figure PCTCN2017107614-appb-000001 or Figure PCTCN2017107614-appb-000002, etc.; or a decimal, for example, 0.5, 1.0, 1.5, etc.;
在另一个示例中,核心权重值为等级表示的形式,例如,第一优先级、第五优先级等。In another example, the core weight value is in the form of a level representation, such as a first priority, a fifth priority, and the like.
可以理解,在第一对应关系中,目标模型参数和核心权重值有多种表现形式,在一些示例中,该目标模型参数或者核心权重值可以是具体的数值,例如目标模型参数为1000、2000等,核心权重值可以为0.5、0.2等。在其它的示例中,该目标模型参数或者核心权重值还可以是数值区间,数值区间即为数值范围,例如目标模型参数为区间[10000,15000]、[15000,20000]等,该核心权重值为区间[0.1,0.6]、[0.6,0.8]等。本发明实施例对第一对应关系中的目标模型参数和核心权重值的具体表现形式不作限定。It can be understood that, in the first correspondence, the target model parameter and the core weight value can take multiple forms. In some examples, the target model parameter or the core weight value is a specific value; for example, the target model parameter may be 1000, 2000, etc., and the core weight value may be 0.5, 0.2, etc. In other examples, the target model parameter or the core weight value may be a numerical interval, i.e., a range of values; for example, the target model parameter may be the interval [10000, 15000] or [15000, 20000], and the core weight value may be the interval [0.1, 0.6] or [0.6, 0.8]. The embodiments of the present invention do not limit the specific forms of the target model parameter and the core weight value in the first correspondence.
关于步骤402的具体实现方式有多种,下文将举出其中两个示例:There are various specific implementations for step 402, two examples of which are exemplified below:
示例一:Example 1:
第一对应关系包括的目标模型参数和核心权重值为具体的数值,此时,步骤402的具体实现方式包括:使用步骤401的目标模型参数和第一对应关系的目标模型参数进行匹配,若匹配相同,则从第一对应关系中确定出与匹配相同的目标模型参数对应的至少两个核心的核心权重值。The target model parameter and the core weight values included in the first correspondence are specific values. In this case, a specific implementation of step 402 includes: matching the target model parameter of step 401 against the target model parameters in the first correspondence; if a match is found, the core weight values of the at least two cores corresponding to the matched target model parameter are determined from the first correspondence.
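A minimal sketch of this exact-value matching follows; the table contents are invented for illustration, since the patent does not specify concrete values:

```python
# Hypothetical first correspondence: exact target-model-parameter values
# mapped to per-core weight values (Example 1 of step 402).
first_correspondence = {
    1000: {"CPU": 0.5, "GPU": 0.3, "NPU": 0.2},
    2000: {"CPU": 0.2, "GPU": 0.3, "NPU": 0.5},
}

def lookup_core_weights(target_param):
    # Return the core weight values whose key matches exactly, else None.
    return first_correspondence.get(target_param)

print(lookup_core_weights(2000))  # {'CPU': 0.2, 'GPU': 0.3, 'NPU': 0.5}
```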
示例二:Example 2:
第一对应关系包括的目标模型参数和核心权重值为目标模型参数区间和核心权重值区间的形式。此时,步骤402的具体实现方式包括:The first correspondence relationship includes a target model parameter and a core weight value in the form of a target model parameter interval and a core weight value interval. At this time, the specific implementation manner of step 402 includes:
步骤A1:在预设的第一对应关系中,确定步骤401的目标模型参数所在的目标模型参数区间。Step A1: In the preset first correspondence, the target model parameter interval in which the target model parameter of step 401 is located is determined.
例如,目标模型参数为权重参数数量,第一对应关系包括权重参数数量区间[1千万,3千万]和CPU的核心权重值区间[0.2,0.4]、GPU的核心权重值区间[0.1,0.3]的对应关系。终端获取到的卷积神经网络模型的目标模型参数为1.5千万,则可在第一对应关系中,确定卷积神经网络模型的目标模型参数1.5千万落入权重参数数量区间[1千万,3千万]。For example, the target model parameter is the number of weight parameters, and the first correspondence includes the correspondence between the weight-parameter-count interval [10 million, 30 million] and the CPU's core weight value interval [0.2, 0.4] and the GPU's core weight value interval [0.1, 0.3]. The target model parameter of the convolutional neural network model acquired by the terminal is 15 million; it can then be determined, in the first correspondence, that this target model parameter of 15 million falls within the weight-parameter-count interval [10 million, 30 million].
步骤A2:在第一对应关系中,确定至少两个核心的核心权重值区间。Step A2: In the first correspondence, determine a core weight value interval of at least two cores.
其中,至少两个核心的核心权重值区间和目标模型参数区间对应,第一对应关系包括目标模型参数区间和至少两个核心的核心权重值区间的对应关系,目标模型参数区间包括目标模型参数。The core weight value interval of the at least two cores corresponds to the target model parameter interval, and the first correspondence relationship includes a correspondence relationship between the target model parameter interval and the core weight value interval of the at least two cores, and the target model parameter interval includes the target model parameter.
例如,第一对应关系包括权重参数数量区间[1千万,3千万]和CPU的核心权重值区间[0.2,0.4]的对应关系、以及权重参数数量区间[1千万,3千万]和GPU的核心权重值区间[0.1,0.3]的对应关系。确定出卷积神经网络模型的目标模型参数落入权重参数数量区间[1千万,3千万]后,在第一对应关系中,确定与该权重参数数量区间对应的CPU的核心权重值区间[0.2,0.4]、以及GPU的核心权重值区间[0.1,0.3]。For example, the first correspondence includes the correspondence between the weight-parameter-count interval [10 million, 30 million] and the CPU's core weight value interval [0.2, 0.4], and the correspondence between the weight-parameter-count interval [10 million, 30 million] and the GPU's core weight value interval [0.1, 0.3]. After it is determined that the target model parameter of the convolutional neural network model falls within the weight-parameter-count interval [10 million, 30 million], the CPU's core weight value interval [0.2, 0.4] and the GPU's core weight value interval [0.1, 0.3] corresponding to that weight-parameter-count interval are determined in the first correspondence.
步骤A3:对每一核心,从核心权重值区间中确定核心权重值。Step A3: For each core, the core weight value is determined from the core weight value interval.
其中,核心权重值在核心权重值区间中的位置和目标模型参数在目标模型参数区间中的位置相同。Wherein, the position of the core weight value in the core weight value interval and the position of the target model parameter in the target model parameter interval are the same.
例如,目标模型参数1.5千万在目标模型参数区间[1千万,3千万]中的位置为位于四分之一处,从而针对CPU,从核心权重值区间[0.2,0.4]中确定位于四分之一处的核心权重值0.25;针对GPU,从核心权重值区间[0.1,0.3]中确定位于四分之一处的核心权重值0.15。For example, the position of the target model parameter of 15 million within the target model parameter interval [10 million, 30 million] is at the one-quarter point, since (15 - 10)/(30 - 10) = 1/4; therefore, for the CPU, the core weight value 0.25 at the one-quarter point is determined from the core weight value interval [0.2, 0.4], and for the GPU, the core weight value 0.15 at the one-quarter point is determined from the core weight value interval [0.1, 0.3].
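Steps A1 to A3 can be sketched as a simple linear interpolation: each core's weight is taken at the same relative position within its weight interval as the target model parameter occupies within the parameter interval. The intervals below mirror the example above; the target value of 20 million (the midpoint of the parameter interval) is invented for illustration:

```python
# Interval-based lookup (steps A1-A3), with illustrative table contents.

def position_in(value, lo, hi):
    # Step A1: relative position of the value inside [lo, hi].
    return (value - lo) / (hi - lo)

def weight_at(pos, lo, hi):
    # Step A3: the weight at the same relative position inside [lo, hi].
    return lo + pos * (hi - lo)

param_interval = (10_000_000, 30_000_000)                  # weight-parameter-count interval
weight_intervals = {"CPU": (0.2, 0.4), "GPU": (0.1, 0.3)}  # step A2: per-core intervals

target = 20_000_000                          # hypothetical target model parameter
pos = position_in(target, *param_interval)   # 0.5: midpoint of the interval
weights = {core: weight_at(pos, *itv) for core, itv in weight_intervals.items()}
print(weights["CPU"], weights["GPU"])        # CPU ~0.3, GPU ~0.2
```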
在另一个示例中,第一对应关系包括的目标模型参数为数值区间,第一对应关系包括的核心权重值为具体数值。此时,步骤402的具体实现方式包括:在预设的第一对应关系中,确定步骤401的目标模型参数所在的目标模型参数区间,然后,在该第一对应关系中,确定与该目标模型参数区间对应的至少两个核心的核心权重值。In another example, the first corresponding relationship includes a target model parameter that is a numerical interval, and the first corresponding relationship includes a core weight value that is a specific value. At this time, the specific implementation manner of step 402 includes: determining, in a preset first correspondence, a target model parameter interval in which the target model parameter of step 401 is located, and then determining, in the first correspondence relationship, the target model The core weight value of at least two cores corresponding to the parameter interval.
步骤403:根据至少两个核心的核心权重值,从至少两个核心中确定运行卷积神经网络模型的核心。Step 403: Determine a core of the running convolutional neural network model from at least two cores according to core weight values of at least two cores.
终端在确定出至少两个核心的核心权重值后,因核心权重值用于表示核心被选择以运行卷积神经网络模型的优先程度,从而可以根据至少两个核心的核心权重值,从至少两个核心中确定运行卷积神经网络模型的核心,利用核心权重值,确定适宜于运行步骤401中的卷积神经网络模型的核心,从而可以调度该核心运行该卷积神经网络模型,以执行具体的应用业务。After the terminal determines the core weight values of the at least two cores, since the core weight value indicates the priority with which a core is selected to run the convolutional neural network model, the terminal can determine, from the at least two cores according to their core weight values, the core for running the convolutional neural network model. That is, using the core weight values, the terminal determines the core suited to running the convolutional neural network model of step 401, and can then schedule that core to run the model to perform the specific application service.
关于步骤403,有多种具体的实现方式,下面即举出其中几个示例:There are a number of specific implementations for step 403, and a few examples are given below:
步骤403的示例一:Example one of step 403:
从该至少两个核心中确定核心权重值最大的核心,该核心权重值最大的核心用于运行卷积神经网络模型。因核心权重值用于表示核心被选择以运行卷积神经网络模型的优先程度,从而核心权重值大的核心优先于核心权重值小的核心被调度运行卷积神经网络模型。The core with the largest core weight value is determined from the at least two cores, and that core is used to run the convolutional neural network model. Because the core weight value indicates the priority with which a core is selected, a core with a larger core weight value is scheduled to run the convolutional neural network model in preference to a core with a smaller core weight value.
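This selection rule reduces to taking the maximum over the determined weight values; a minimal sketch with invented values:

```python
# Example 1 of step 403: select the core with the largest core weight value.
core_weights = {"CPU": 0.25, "GPU": 0.15, "NPU": 0.6}  # illustrative values

selected = max(core_weights, key=core_weights.get)
print(selected)  # NPU
```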
步骤403的示例二:Example 2 of step 403:
在本示例中,为了具体执行步骤403,终端还需先执行其它的步骤,以得到修正核心权重值的参数,然后使用该参数修正核心权重值,得到修正的权重值,通过使用该修正的权重值来从该至少两个核心中确定运行卷积神经网络模型的核心。该修正的权重值因引入更多的参考因素,从而比核心权重值更能反映出适宜调度的核心。详情如下:In this example, to carry out step 403, the terminal first performs additional steps to obtain a parameter for correcting the core weight value, then uses that parameter to correct the core weight value to obtain a modified weight value, and uses the modified weight value to determine, from the at least two cores, the core for running the convolutional neural network model. Because the modified weight value introduces more reference factors, it reflects the core suited for scheduling better than the core weight value alone. The details are as follows:
本示例的方法还包括:The method of this example also includes:
步骤B1:获取终端当前的状态参数。Step B1: Obtain the current state parameter of the terminal.
其中,状态参数为动态变化的参数。Among them, the state parameter is a dynamically changing parameter.
状态参数为终端上动态变化的参数,状态参数反应了终端运行卷积神经网络模型时的具体运行环境,受不同运行环境影响,核心运行卷积神经网络模型将产生不同的效果。获取终端的当前的状态参数,将该状态参数作为调度核心需要的参考因素之一,从而可以确定出更适宜运行卷积神经网络模型的核心。The state parameter is a dynamically changing parameter on the terminal and reflects the specific running environment when the terminal runs the convolutional neural network model; under different running environments, a core running the convolutional neural network model will produce different effects. The terminal acquires its current state parameter and uses it as one of the reference factors for core scheduling, so that a core better suited to running the convolutional neural network model can be determined.
该状态参数包括但不限于终端的剩余电量值、核心使用率、核心的温度等等。The status parameters include, but are not limited to, the remaining power value of the terminal, the core usage rate, the temperature of the core, and the like.
步骤B2:根据状态参数,从预设的第二对应关系中确定至少两个核心的参数权重值。Step B2: Determine parameter weight values of at least two cores from the preset second correspondence according to the state parameters.
其中,该至少两个核心的参数权重值和状态参数对应,第二对应关系包括状态参数和至少两个核心的参数权重值的对应关系,参数权重值用于表示在状态参数下,核心被选择以运行卷积神经网络模型的优先程度。The parameter weight values of the at least two cores correspond to the state parameter; the second correspondence includes the correspondence between state parameters and the parameter weight values of the at least two cores; and the parameter weight value indicates the priority with which, under the state parameter, a core is selected to run the convolutional neural network model.
在终端上预先设置有第二对应关系,该第二对应关系包括状态参数和至少两个核心的参数权重值的对应关系。这样,终端在获取到终端当前的状态参数后,即可根据状态参数,从预设的第二对应关系中确定至少两个核心的参数权重值,该至少两个核心的参数权重值和状态参数对应。该至少两个核心的参数权重值和状态参数的对应关系即为第二对应关系包括的状态参数和至少两个核心的参数权重值的对应关系。其中,参数权重值用于表示在状态参数下,核心被选择以运行卷积神经网络模型的优先程度。参数权重值大的核心优先于参数权重值小的核心运行卷积神经网络模型。从而,终端可以利用该参数权重值来修正步骤402中的核心权重值,以使得修正后的权重值进一步考虑到了终端上的状态参数,更能反映出适宜运行卷积神经网络模型的核心。A second correspondence is preset on the terminal, including the correspondence between state parameters and the parameter weight values of the at least two cores. In this way, after acquiring the terminal's current state parameter, the terminal can determine the parameter weight values of the at least two cores from the preset second correspondence according to the state parameter; these parameter weight values correspond to the state parameter, and this correspondence is exactly the correspondence, included in the second correspondence, between state parameters and the parameter weight values of the at least two cores. The parameter weight value indicates the priority with which, under the state parameter, a core is selected to run the convolutional neural network model; a core with a larger parameter weight value runs the model in preference to a core with a smaller parameter weight value. The terminal can therefore use the parameter weight values to correct the core weight values of step 402, so that the corrected weight values further take the terminal's state parameter into account and better reflect the core suited to running the convolutional neural network model.
其中,步骤B2的至少两个核心和步骤402的至少两个核心所指相同。Wherein at least two cores of step B2 and at least two cores of step 402 are referred to the same.
至于在不同状态参数下,哪一核心更适宜于运行卷积神经网络模型,可以通过预先进行试验测试得到,例如,设置测试效率参数,将一具有具体计算密度的卷积神经网络模型在不同核心上运行,该核心处于具体的终端的状态参数下,然后得到不同核心的测试效率参数。然后为测试效率参数大的核心配置较大的参数权重值。As for which core is better suited to running the convolutional neural network model under different state parameters, this can be obtained from tests performed in advance. For example, a test efficiency parameter is defined; a convolutional neural network model of a specific computational density is run on the different cores while each core is under a specific terminal state parameter; the test efficiency parameters of the different cores are then obtained, and a larger parameter weight value is configured for the core with the larger test efficiency parameter.
可以理解,在步骤B2中确定出的至少两个核心的参数权重值有多种具体的实现形式,例如百分比的形式、分数的形式或等级表示的形式等,具体可参阅上文中对核心权重值的具体的实现形式的详细描述。It can be understood that the parameter weight values of the at least two cores determined in step B2 have multiple specific implementation forms, such as a form of a percentage, a form of a score, or a form of a level representation. For details, refer to the core weight value in the above. A detailed description of the specific implementation form.
可以理解,在第二对应关系中,状态参数和参数权重值也可以有多种表现形式,例如可以为具体的数值或者数值区间,具体可参阅上文中对第一对应关系中的目标模型参数和核心权重值的表现形式的详细描述。It can be understood that, in the second correspondence, the state parameter and the parameter weight value may also have multiple representations, for example, may be a specific numerical value or a numerical interval. For details, refer to the target model parameters in the first correspondence relationship. A detailed description of the representation of the core weight values.
关于步骤B2的具体实现方式有多种,下文将举出其中两个示例:There are various concrete implementations for step B2, and two examples are given below:
在一个示例中,第二对应关系包括的状态参数和参数权重值为具体的数值,此时步骤B2的具体实现方式包括:使用步骤B1的状态参数和第二对应关系中的状态参数进行匹配,若匹配相同,则从第二对应关系中确定与匹配相同的状态参数对应的至少两个核心的参数权重值。In one example, the state parameter and the parameter weight values included in the second correspondence are specific values. In this case, a specific implementation of step B2 includes: matching the state parameter of step B1 against the state parameters in the second correspondence; if a match is found, the parameter weight values of the at least two cores corresponding to the matched state parameter are determined from the second correspondence.
在另一个示例中,第二对应关系包括的状态参数和参数权重值为数值区间。此时步骤B2的具体实现方式包括:在预设的第二对应关系中,确定步骤B1的状态参数所在的状态参数区间;然后,在第二对应关系中,确定至少两个核心的参数权重值区间,该至少两个核心的参数权重值区间与该状态参数区间对应,该第二对应关系包括该状态参数区间和该至少两个核心的参数权重值区间的对应关系。跟着,对每一核心,从参数权重值区间中确定参数权重值。其中,参数权重值在参数权重值区间中的位置和步骤B1的状态参数在该状态参数区间中的位置相同。具体的内容,可参考上文中对步骤402的具体实现方式中的示例二的详细描述。In another example, the second correspondence includes a state parameter and a parameter weight value value range. The specific implementation manner of the step B2 includes: determining, in the preset second correspondence, the state parameter interval in which the state parameter of the step B1 is located; and then determining, in the second correspondence, the parameter weight values of the at least two cores. The interval, the parameter weight value interval of the at least two cores corresponds to the state parameter interval, and the second correspondence relationship includes a correspondence between the state parameter interval and the parameter weight value interval of the at least two cores. Next, for each core, the parameter weight value is determined from the parameter weight value interval. Wherein, the position of the parameter weight value in the parameter weight value interval and the position of the state parameter of step B1 in the state parameter interval are the same. For specific content, reference may be made to the above detailed description of the second example in the specific implementation of step 402.
步骤B1和B2执行后,再执行步骤403,此时,步骤403具体包括步骤B3和步骤B4,如下:After the steps B1 and B2 are executed, the step 403 is performed. In this case, the step 403 specifically includes the step B3 and the step B4, as follows:
步骤B3:对每一核心,使用参数权重值修正核心权重值,得到第一修正权重值。Step B3: For each core, the core weight value is corrected by using the parameter weight value to obtain the first modified weight value.
其中,第一修正权重值用于表示核心被选择以运行卷积神经网络模型的优先程度。第一修正权重值大的核心优先于第一修正权重值小的核心运行卷积神经网络模型。The first modified weight value indicates the priority with which a core is selected to run the convolutional neural network model; a core with a larger first modified weight value runs the convolutional neural network model in preference to a core with a smaller first modified weight value.
关于使用参数权重值修正核心权重值的具体修正方式有多种,该具体的修正方式可以预先进行设置。例如,将参数权重值和核心权重值相乘得到第一修正权重值;或者是根据预设的修正关系使用参数权重值修正核心权重值得到第一修正权重值,例如核心权重值为第三优先级,参数权重值为第五优先级,预设的修正关系为确定两个权重值的最高级为第一修正权重值,从而第一修正权重值为第三优先级。There are various specific ways of correcting the core weight value using the parameter weight value, and the specific correction method can be set in advance. For example, the parameter weight value and the core weight value may be multiplied to obtain the first modified weight value; alternatively, the core weight value may be corrected with the parameter weight value according to a preset correction relationship to obtain the first modified weight value. For example, if the core weight value is the third priority, the parameter weight value is the fifth priority, and the preset correction relationship is to take the higher level of the two weight values as the first modified weight value, then the first modified weight value is the third priority.
步骤B3中的“对每一核心”,表示针对前述至少两个核心中的每一核心。步骤B3即为对前述至少两个核心,分别确定具体的核心,使用该核心的参数权重值修正该核心的核心权重值,得到该核心的第一修正权重值。"For each core" in step B3, means for each of the aforementioned at least two cores. Step B3 is to determine a specific core for each of the at least two cores, and use the parameter weight value of the core to modify the core weight value of the core to obtain a first modified weight value of the core.
步骤B4:根据至少两个核心的第一修正权重值,从至少两个核心中确定运行卷积神经网络模型的核心。Step B4: Determine a core of the running convolutional neural network model from at least two cores according to the first modified weight value of the at least two cores.
终端在得到至少两个核心的第一修正权重值后,即可根据该至少两个核心的第一修正权重值,从该至少两个核心中确定运行卷积神经网络模型的核心,从而确定出适宜于运行卷积神经网络模型的核心。After obtaining the first modified weight values of the at least two cores, the terminal can determine, from the at least two cores according to their first modified weight values, the core for running the convolutional neural network model, thereby determining the core suited to running the model.
步骤B4的具体实现方式有多种,例如,从至少两个核心中确定第一修正权重值最大的核心,该第一修正权重值最大的核心用于运行卷积神经网络模型。或者是,进一步地使用其它参数来修正第一修正权重值,使用进一步修正的权重值来确定运行卷积神经网络模型的核心。类似于再一次执行步骤B1至B2,只是获取的是其它参数,被修正的是第一修正权重值。Step B4 can be implemented in multiple ways. For example, the core with the largest first modified weight value is determined from the at least two cores, and that core is used to run the convolutional neural network model. Alternatively, other parameters are further used to correct the first modified weight value, and the further corrected weight value is used to determine the core for running the convolutional neural network model; this is similar to performing steps B1 and B2 again, except that other parameters are acquired and it is the first modified weight value that is corrected.
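Steps B3 and B4 can be sketched with the multiplicative correction mentioned as one option above: each first modified weight is the product of the core weight and the parameter weight, and the core with the largest result is chosen. All numbers are invented for illustration (a low NPU parameter weight could, for example, model a heavily loaded or hot NPU):

```python
# Step B3 (multiplicative correction) followed by step B4 (max selection).
core_weights  = {"CPU": 0.3, "GPU": 0.2, "NPU": 0.6}  # from step 402
param_weights = {"CPU": 1.0, "GPU": 0.9, "NPU": 0.4}  # from step B2

# Step B3: first modified weight = core weight * parameter weight, per core.
first_modified = {c: core_weights[c] * param_weights[c] for c in core_weights}

# Step B4: select the core with the largest first modified weight value.
chosen = max(first_modified, key=first_modified.get)
print(first_modified, chosen)
```

Note how the correction can flip the decision: the NPU has the largest core weight (0.6), but after multiplying by its low parameter weight the CPU (0.3) is chosen instead.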
The parameter weight value is obtained from the terminal's current state parameter, which reflects the specific operating environment at the time the terminal runs the convolutional neural network model; the parameter weight value therefore reflects the influence of the terminal's current environment on a core running the convolutional neural network model. The core weight value, by contrast, is determined from the target model parameter representing the computational density of a convolutional neural network model, and therefore reflects the influence of a core's hardware characteristics on running that model. The first modified weight value, obtained by correcting the core weight value with the parameter weight value, thus takes more factors into account, and from it a core better suited to running the convolutional neural network model can be determined.
Steps B1 to B2 can be implemented in various ways; two implementations are given below.
Implementation 1:
In step B1, the terminal's current state parameter is the current core usage rate of each core.
In this case, step B2 specifically includes: for each core, determining a performance weight value from a preset second correspondence according to the core usage rate.
The performance weight value of each core corresponds to the core usage rate of that core, and indicates the priority with which the core is selected to run the convolutional neural network model at the core's current usage rate. The second correspondence includes the correspondence between each core's core usage rate and each core's performance weight value.
The core usage rate refers to the core resources occupied by programs running on the terminal and indicates how busy the core is. The higher a core's usage rate, the more programs are running on that core, and vice versa. The core usage rate may be a specific value, for example 10% or 2%.
The performance weight value indicates the priority with which a core is selected to run the convolutional neural network model at the core's current usage rate, and reflects the extent to which the core's computing resources are currently available. A large performance weight value indicates that the core's computing resources are highly available, so that core is scheduled preferentially; that is, a core with a large performance weight value is preferred for running the convolutional neural network model.
It can be understood that in the second correspondence, the core usage rate and the performance weight value may also take various forms, for example specific values or value intervals; for details, refer to the description above of how the target model parameter and the core weight value are represented in the first correspondence.
Each core in this implementation refers to each of the aforementioned at least two cores; for each core, that core's performance weight value is determined from the preset second correspondence according to the core's usage rate.
For "determining the performance weight value from the preset second correspondence according to the core usage rate", refer to the two examples of the specific implementation of step B2 described above.
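Implementation 1 can be sketched as an interval lookup. The interval boundaries and weight values below are invented for illustration; in practice the second correspondence would be preset on the terminal.

```python
# Hypothetical second correspondence: core usage rate intervals mapped
# to performance weight values (a less busy core gets a larger weight).
SECOND_CORRESPONDENCE = [
    ((0.0, 0.35), 0.9),   # usage in [0%, 35%)   -> weight 0.9
    ((0.35, 0.7), 0.5),   # usage in [35%, 70%)  -> weight 0.5
    ((0.7, 1.01), 0.1),   # usage in [70%, 100%] -> weight 0.1
]

def performance_weight(core_usage):
    """Match the core's current usage rate against the preset intervals
    and return the corresponding performance weight value."""
    for (low, high), weight in SECOND_CORRESPONDENCE:
        if low <= core_usage < high:
            return weight
    raise ValueError("usage rate outside all preset intervals")

# One performance weight value per core, from each core's usage rate.
weights = {core: performance_weight(u)
           for core, u in {"CPU": 0.8, "GPU": 0.1, "NPU": 0.4}.items()}
```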
Implementation 2:
In step B1, the terminal's current state parameter is the terminal's current remaining power value.
In this case, step B2 specifically includes: determining, according to the remaining power value, the power consumption weight values of the at least two cores from a preset second correspondence.
The power consumption weight values of the at least two cores correspond to the remaining power value; a power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at that remaining power value. The second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores.
The remaining power value is the amount of electric charge remaining on the terminal. It may be expressed, for example, as a percentage or in ampere-hours.
The power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at the terminal's current remaining power value; a core with a large power consumption weight value is preferred for running the model. In some embodiments, the power consumption weight value may also be set with reference to a core's power draw: for example, as the remaining power value decreases, the power consumption weight value of a high-power core decreases by more than that of a low-power core. The power consumption weight value then better reflects how suitable each core is for running the convolutional neural network model at the terminal's current remaining power value.
It can be understood that in the second correspondence, the remaining power value and the power consumption weight value may also take various forms, for example specific values or value intervals; for details, refer to the description above of how the target model parameter and the core weight value are represented in the first correspondence.
Each core in this implementation refers to each of the aforementioned at least two cores; for each core, that core's power consumption weight value is determined from the preset second correspondence according to the remaining power value.
For "determining, according to the remaining power value, the power consumption weight values of the at least two cores from the preset second correspondence", refer to the two examples of the specific implementation of step B2 described above.
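A sketch of implementation 2, including the optional behaviour described above where the weight of a power-hungry core falls faster as the battery drains; the battery intervals, core names, and weight values are invented for illustration.

```python
# Hypothetical correspondence: remaining battery intervals mapped to
# per-core power consumption weight values. As battery drops, the
# power-hungry GPU's weight falls faster than the CPU's.
POWER_CORRESPONDENCE = [
    ((50, 101), {"CPU": 0.8, "GPU": 0.9}),  # battery in [50%, 100%]
    ((20, 50),  {"CPU": 0.7, "GPU": 0.4}),  # battery in [20%, 50%)
    ((0, 20),   {"CPU": 0.6, "GPU": 0.1}),  # battery in [0%, 20%)
]

def power_weights(remaining_percent):
    """Return each core's power consumption weight value for the
    terminal's current remaining battery percentage."""
    for (low, high), weights in POWER_CORRESPONDENCE:
        if low <= remaining_percent < high:
            return weights
    raise ValueError("remaining power value outside all preset intervals")
```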
It can be understood that in step B4, the first modified weight value may be further corrected to obtain a second modified weight value, so that the core that runs the convolutional neural network model is determined from the at least two cores according to the second modified weight values of the at least two cores.
For example, in implementation 1 of steps B1 to B2 above, the terminal's current remaining power value may also be acquired, and the power consumption weight values of the at least two cores determined according to the remaining power value and another preset correspondence; then, for each core, the first modified weight value is corrected using the power consumption weight value to obtain a second modified weight value. For the determination of the power consumption weight value, refer to the detailed description in implementation 2 above.
Correspondingly, in some examples, in implementation 2 of steps B1 to B2 above, the core usage rate of each core may also be acquired; then, for each core, a performance weight value is determined according to that core's usage rate and another preset correspondence, and the performance weight value is used to correct the first modified weight value to obtain a second modified weight value. For the determination of the performance weight value, refer to the detailed description in implementation 1 above.
For a clearer understanding of the above scheme of correcting the weight value twice, one specific example is given below:
In implementation 1 of steps B1 to B2 above, the method of this example further includes:
Step C1: Acquire the terminal's current remaining power value.
Step C2: Determine, according to the remaining power value, the power consumption weight values of the at least two cores from a preset third correspondence.
The power consumption weight values of the at least two cores correspond to the remaining power value; the third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores; and a power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at that remaining power value.
Step B4 then specifically includes steps C3 and C4, detailed as follows:
Step C3: For each core, correct the first modified weight value using the power consumption weight value to obtain a second modified weight value.
The second modified weight value indicates the priority with which the core is selected to run the convolutional neural network model.
Step C4: Determine, from the at least two cores and according to the second modified weight values of the at least two cores, the core that is to run the convolutional neural network model.
As for the specific implementation of step C4, in some examples the core with the largest second modified weight value may be determined from the at least two cores and used to run the convolutional neural network model. In other examples, the second modified weight value is further corrected using other parameters to obtain a further corrected weight value, which is then used to determine the core that runs the convolutional neural network model.
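The two-stage correction of steps C3 and C4 can be sketched as follows, assuming (as one of the correction options named earlier) that correction is done by multiplication; all weight values are invented for illustration.

```python
# Steps C3-C4 sketch: correct each core's first modified weight value
# with its power consumption weight value (here by multiplication),
# then select the core with the largest second modified weight value.

def second_correction(first_modified, power_weights):
    return {core: first_modified[core] * power_weights[core]
            for core in first_modified}

first_modified = {"CPU": 0.5, "GPU": 0.9, "NPU": 0.8}
power = {"CPU": 0.9, "GPU": 0.2, "NPU": 0.7}   # e.g. battery is low

second_modified = second_correction(first_modified, power)
target_core = max(second_modified, key=second_modified.get)
```

Note how the low battery reverses the ranking: the GPU had the largest first modified weight, but its small power consumption weight demotes it.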
Optionally, the core scheduling method in some examples of the present invention further includes determining a performance parameter. A core may run the convolutional neural network algorithm in a number of different ways; to control how a core actually runs, the core can be made to run the convolutional neural network model using a determined performance parameter. The performance parameter is determined in combination with the core's specific usage state; specifically, it may be determined according to the core's core usage rate. The details are as follows:
In some examples of the present invention, the core scheduling method further includes:
Step D1: Acquire the current core usage rate of each core.
The terminal first acquires the core usage rate so that the performance parameter used by the core can be determined from it. Each core here is each of the aforementioned at least two cores.
Specifically, the terminal can read a core's current usage rate through an application programming interface (API) provided by the operating system.
Step D2: For each core, determine the performance parameter from the second correspondence according to the core usage rate.
The performance parameter of each core corresponds to the core usage rate of that core, and the second correspondence includes the correspondence between each core's performance parameter and each core's core usage rate. The performance parameter indicates how the core is to run.
The terminal is preset with a second correspondence that includes the correspondence between each core's performance parameter and each core's core usage rate. After acquiring the core usage rate of each core, the terminal determines, for each core, the performance parameter from the second correspondence according to the core usage rate, thereby obtaining the performance parameters of the at least two cores.
The core usage rate in the second correspondence may be a specific value, for example 10% or 23%, or a value interval, for example [10%, 35%]; this is not specifically limited in the embodiments of the present invention.
Specifically, for each core, the terminal matches the core's current usage rate against the core usage rates in the second correspondence. If the core's current usage rate equals a core usage rate in the second correspondence, or falls within a core usage rate interval in the second correspondence, the match succeeds, and the terminal can determine from the second correspondence the performance parameter corresponding to the matched core usage rate. The foregoing operation is performed for each core, yielding each core's performance parameter.
Step D3: After determining, from the at least two cores and according to the core weight values of the at least two cores, the core that runs the convolutional neural network model, run the convolutional neural network model on the target core using the target core's performance parameter.
The target core is the core that runs the convolutional neural network model.
The terminal acquires the performance parameter of each core. After determining the target core that is to run the convolutional neural network model, the terminal can run the model on the target core using the target core's performance parameter, so that the target core's specific running behaviour is controlled through the setting of the performance parameter, meeting the user's operational requirements for the core.
The performance parameter includes one or more of thread priority information, sleep time information, and thread count information. The thread priority information is the priority of the child threads used when the core runs the convolutional neural network model; the sleep time information is the interval between the core running two convolutional neural network models; the thread count information is the number of threads the core uses when running the convolutional neural network model.
In one example, the target core runs the convolutional neural network model using the thread priority information: the target core uses child threads and schedules them according to the priority indicated by the thread priority information.
In another example, the target core runs the convolutional neural network model using the sleep time information: after the target core has run one convolutional neural network model, it does not run another convolutional neural network model within the interval indicated by the sleep time information.
In another example, the target core runs the convolutional neural network model using the thread count information: the target core creates the number of threads indicated by the thread count information and then uses those threads to run the convolutional neural network model.
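How the thread count information might steer execution can be sketched as follows. The performance parameter record, the stand-in worker function, and the use of a thread pool are all illustrative assumptions; a real implementation would run actual model kernels on the target core.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative performance parameter record: thread priority, sleep
# time between two model runs, and the number of threads to use.
PERF_PARAMS = {"thread_priority": 5, "sleep_time_s": 0.01, "num_threads": 4}

def run_model_with_params(chunks, params):
    """Run a stand-in for the convolutional model over input chunks,
    using the thread count indicated by the performance parameter."""
    def worker(chunk):            # stand-in for per-thread CNN work
        return sum(x * x for x in chunk)
    with ThreadPoolExecutor(max_workers=params["num_threads"]) as pool:
        return list(pool.map(worker, chunks))

results = run_model_with_params([[1, 2], [3, 4]], PERF_PARAMS)
```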
It can be understood that because both the determination of the performance parameter and the determination of the performance weight value use a core's current usage rate, in some embodiments the two determinations can be carried out in the same step; in that case, each core's current usage rate also needs to be acquired only once.
Determining the performance parameter that controls a core's operation from the core usage rate allows the target core to run the convolutional neural network model using its own performance parameter, so that the target core's running behaviour matches its current usage. When the correspondence in the second correspondence between each core's performance parameter and each core's usage rate is preset in a manner that improves execution efficiency, the target core can run efficiently.
In other embodiments of the present invention, a core scheduling method is further provided, the method comprising:
Step E1: Acquire the target model parameter.
The target model parameter represents the computational density of a convolutional neural network model.
For the specific implementation of step E1, refer to the detailed description of step 401.
Step E2: Determine, according to the target model parameter, the core weight values of at least two cores from a preset first correspondence.
The core weight values of the at least two cores correspond to the target model parameter; the at least two cores are heterogeneous cores on the terminal; the first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores; and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model.
For the specific implementation of step E2, refer to the detailed description of step 402. In this embodiment, the correspondence in the first correspondence between the target model parameter and the core weight values of the at least two cores may be preset according to how well different cores run the convolutional neural network model cooperatively.
Step E3: Distribute the convolutional neural network model to run on different cores according to the core weight values of the at least two cores.
In this embodiment, multiple cores may run the convolutional neural network model cooperatively, that is, the model is distributed to run on different cores. Because a core weight value has been determined for each core, after the model is distributed across cores according to each core's weight value, the share of the model each core runs is determined by its core weight value. For example, compared with a core with a small core weight value, a core with a large core weight value is assigned the larger part of the convolutional neural network model, while the core with the small weight value is assigned the smaller part; the core with the large weight value thus acts as the main core.
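Distributing shares of the model in proportion to core weight values, as step E3 describes, might be sketched as follows; representing the model simply as a list of layers is an assumption made for illustration.

```python
# Step E3 sketch: split a model (represented here as a list of layers)
# across cores in proportion to their core weight values, so the core
# with the largest weight receives the largest share and acts as the
# main core.

def distribute_layers(layers, core_weights):
    total = sum(core_weights.values())
    assignment, start = {}, 0
    cores = sorted(core_weights, key=core_weights.get, reverse=True)
    for i, core in enumerate(cores):
        if i == len(cores) - 1:          # last core takes the remainder
            count = len(layers) - start
        else:
            count = round(len(layers) * core_weights[core] / total)
        assignment[core] = layers[start:start + count]
        start += count
    return assignment

parts = distribute_layers(list(range(10)), {"GPU": 0.75, "CPU": 0.25})
```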
In other embodiments of the present invention, a core scheduling method is further provided, the method comprising:
Step F1: Acquire a task type parameter.
The task type parameter indicates the type of a computing task.
The computing task may be, for example, image recognition, speech recognition, or image classification. The task type parameter may be text, for example the words "image recognition" or "speech recognition", or it may be letters or digits, for example "001", as long as it identifies the specific type of computing task.
Step F2: Determine, according to the task type parameter, the core weight values of at least two cores from a preset fourth correspondence.
The core weight values of the at least two cores correspond to the task type parameter; the at least two cores are heterogeneous cores on the terminal; the fourth correspondence includes the correspondence between the task type parameter and the core weight values of the at least two cores; and a core weight value indicates the priority with which a core is selected to run the computing task.
In this embodiment, the correspondence between the task type parameter and the core weight values of the at least two cores is preset. After the task type parameter is acquired, it is matched against the task type parameters in the fourth correspondence; once the match succeeds, the core weight values of the at least two cores corresponding to the matched task type parameter can be determined.
Step F3: Determine, from the at least two cores and according to the core weight values of the at least two cores, the core that runs the computing task.
Because a core weight value indicates the priority with which a core is selected to run the computing task, the core that runs the computing task can be determined from the at least two cores according to their core weight values: for example, the core with the largest core weight value is selected to run the computing task, or the core weight values of the at least two cores are first corrected and adjusted according to other parameters and the core that runs the computing task is then determined from the at least two cores.
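The task-type variant (steps F1 to F3) might look like the following; the table entries and weight values are invented for illustration.

```python
# Hypothetical fourth correspondence: task type parameter -> core
# weight values of the heterogeneous cores on the terminal.
FOURTH_CORRESPONDENCE = {
    "image_recognition":  {"CPU": 0.2, "GPU": 0.5, "NPU": 0.9},
    "speech_recognition": {"CPU": 0.4, "GPU": 0.7, "NPU": 0.6},
}

def core_for_task(task_type):
    """Steps F2-F3: look up the core weight values for the task type
    and select the core with the largest weight to run the task."""
    weights = FOURTH_CORRESPONDENCE[task_type]
    return max(weights, key=weights.get)
```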
In summary, the heterogeneous cores on a terminal have different characteristics, and different cores are suited to running convolutional neural network models of different computational densities. If a first correspondence is preset that includes the correspondence between the target model parameter and the core weight values of at least two cores, where the target model parameter represents the computational density of a convolutional neural network model and the at least two cores are heterogeneous cores on the terminal, then after the target model parameter of a convolutional neural network model is acquired, the core weight values of the at least two cores can be determined from the preset first correspondence according to that parameter. A core weight value indicates the priority with which a core is selected to run the convolutional neural network model, so a core suited to running the model can be determined from the core weight values. In this way, the core that runs the convolutional neural network model can be determined from the at least two cores according to their core weight values. Through the core weight values of different cores, an adapted core can be determined to run a convolutional neural network model of a specific computational density; if cores with larger core weight values run the convolutional neural network model more efficiently, the core determined from the core weight values can run the model efficiently.
FIG. 5 is a flowchart of a core scheduling method according to an embodiment of the present invention. The method may be applied on a terminal, and the method of the embodiment shown in FIG. 5 may be implemented based on the method of the embodiment shown in FIG. 4. In the embodiment of FIG. 5, by way of example, the systolic array processor is an NPU and the target model parameter is the number of weight parameters.
Referring to FIG. 5 and the content described above, the method of the embodiment of the present invention includes:
Step 501: The terminal acquires a convolutional neural network model.
The terminal may use a specific algorithm to execute an application service; for example, a specific application service may be executed by means of a convolutional neural network model. To this end, the terminal first acquires the convolutional neural network model to be used.
The terminal may acquire the convolutional neural network model in various ways: for example, the terminal acquires a convolutional neural network model sent by another device, or the terminal builds a convolutional neural network model locally.
The terminal can use the convolutional neural network model for a variety of application services, for example running the model for image and speech recognition. Examples of running a convolutional neural network model to execute an image service include image classification, image feature extraction, and face clustering; these operations are characterized computationally by a large number of matrix operations and are therefore well suited to execution with a convolutional neural network model.
The convolutional neural network model may be built through training: for example, a large amount of relevant data is collected and used for convolutional neural network training to obtain a convolutional neural network model. The device that performs the training steps may be the terminal or a device such as a server.
Step 502: The terminal acquires the number of weight parameters.
The number of weight parameters represents the computational density of a convolutional neural network model.
Convolutional neural network models are used for computation, for example image processing, and different models have different characteristics; for example, different convolutional neural network models may have different computational densities. The computational density can be determined by the number of weight parameters in the model; that is, the number of weight parameters of a convolutional neural network model indicates its computational density. For the weight parameters of a convolutional neural network model, refer to the description above.
Generally, a computationally dense convolutional neural network model is suitable for running on a GPU; such a model may be, for example, a convolutional neural network model with large matrices.

A computationally sparse convolutional neural network model is suitable for running on a CPU; such a model may be, for example, a convolutional neural network model with small matrices, serial computation, or for loops.
Different cores have different computational characteristics, and different types of convolutional neural network models have different computational densities. Which type of convolutional neural network model is best run on which core can be determined from empirical data or experimental tests.
Specifically, convolutional neural network models for classification (such as ResNet-18, a residual network), feature extraction, or object detection are suitable for running on an NPU or a GPU, whereas small-network convolutional neural network models such as non-human-face recognition (for example, dog-face or cat-face recognition) and ID-card image recognition are suitable for running on a CPU.
Therefore, in this embodiment of the present invention, after obtaining a convolutional neural network model, the terminal obtains the weight parameter quantity of that model, so that the scheduling of the core that will run the model can be decided according to that quantity.
A specific implementation in which the terminal obtains the weight parameter quantity of the convolutional neural network model may be as follows: the terminal analyzes the obtained convolutional neural network model with a parser, and thereby determines the number of weight parameters of the model.
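The parser itself is not specified in this embodiment. As an illustrative sketch only, assuming the parser has already extracted the shapes of the model's weight tensors (as common framework formats allow), the weight parameter quantity is the sum of the tensor element counts:

```python
from functools import reduce

def count_weight_parameters(weight_tensor_shapes):
    """Sum the element counts of all weight tensors of a model.

    `weight_tensor_shapes` is assumed to be an iterable of shape tuples,
    e.g. [(64, 3, 7, 7), (64,), ...]; a real parser would read these
    shapes from the model file format.
    """
    total = 0
    for shape in weight_tensor_shapes:
        total += reduce(lambda a, b: a * b, shape, 1)
    return total

# A toy two-layer model: a 3x3 convolution with 16 filters over 3 channels
# plus bias, and a 10x16 fully connected layer plus bias.
shapes = [(16, 3, 3, 3), (16,), (10, 16), (10,)]
print(count_weight_parameters(shapes))  # 16*27 + 16 + 160 + 10 = 618
```

The shape-tuple representation is an assumption made for illustration; only the resulting count matters to the scheduling method.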
Step 503: Determine core weight values of at least two cores from a preset first correspondence according to the weight parameter quantity.
The core weight values of the at least two cores correspond to the weight parameter quantity, and the at least two cores are heterogeneous cores on the terminal. A core weight value indicates the priority with which a core is selected to run the convolutional neural network model.
In this embodiment of the present invention, a plurality of heterogeneous cores may be provided on the terminal. A core is a computing unit used to perform computation, and the types of heterogeneous cores include, but are not limited to, CPU, GPU, DSP, and NPU. After step 503 is performed, each core corresponds to one core weight value, so the plurality of cores corresponds to a plurality of core weight values. In some embodiments, the plurality of core weight values may take the form of a list, for example {W_CPU = 1.0; W_GPU = 0.5; W_NPU = 0.1}, where W_CPU indicates that the core weight value of the CPU is 1.0, W_GPU indicates that the core weight value of the GPU is 0.5, and W_NPU indicates that the core weight value of the NPU is 0.1.
The first correspondence includes the correspondence between the weight parameter quantity and the core weight values of the at least two cores; that is, the parameters of the first correspondence are the weight parameter quantity and the core weight values. In the first correspondence, each of these two parameters may take the form of a specific value or of a value range, while a determined core weight value is a specific value. The correspondence between the weight parameter quantity and a core weight value may therefore be a correspondence between specific values or a correspondence between value ranges. The core weight values in the first correspondence may belong to a plurality of heterogeneous cores, and a plurality of core weight values may be determined, with different core weight values belonging to different heterogeneous cores.
Step 503 may be implemented as follows: after obtaining the weight parameter quantity of the convolutional neural network model, the terminal matches the obtained quantity against the weight parameter quantities in the first correspondence. The match succeeds when the obtained quantity equals a weight parameter quantity in the first correspondence, or falls within a weight parameter quantity interval in the first correspondence. The terminal then determines, in the first correspondence, the core weight values corresponding to the matched weight parameter quantity.

If, in the first correspondence, weight parameter quantity intervals correspond to core weight value intervals, then after the corresponding core weight value interval is determined, a specific core weight value may be computed from the weight parameter quantity of the model, for example by linearly mapping the weight parameter quantity into the corresponding core weight value interval.
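The data layout of the first correspondence is not mandated by this embodiment. As an illustrative sketch, the matching step can be expressed as a lookup over rows that hold either a specific quantity (modeled here as an interval with equal endpoints) or a quantity interval:

```python
def match_row(rows, quantity):
    """Return the payload of the first row matching `quantity`.

    Each row is (lo, hi, payload): a specific value is modeled as lo == hi,
    an interval as lo <= quantity < hi.  Returns None when nothing matches.
    """
    for lo, hi, payload in rows:
        if (lo == hi == quantity) or (lo <= quantity < hi):
            return payload
    return None

# Hypothetical first correspondence; quantities in millions of weight parameters.
rows = [
    (0,   50,  "small network"),
    (50,  200, "medium network"),
    (200, 500, "large network"),
]
print(match_row(rows, 100))  # -> medium network
```

The string payloads stand in for the per-core weight values or weight value intervals that the real correspondence would store.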
The first correspondence may be established based on experimental test data or empirical data. The terminal may obtain the first correspondence by reading it from a storage device of the terminal, or by obtaining it from another device.
A specific example of step 503 is given below.

A correspondence is pre-stored in the terminal, as shown in Table 1.

Table 1:
Weight parameter quantity    CPU core weight      GPU core weight       NPU core weight
less than 50 million         1                    0 to 1 (linear)       0 to 0.5
50 to 200 million            1 to 0.5 (linear)    1                     0.5 to 1.0
200 to 500 million           0.5 to 0 (linear)    1.0 to 0.5 (linear)   1
In the correspondence shown in Table 1, a convolutional neural network model whose weight parameter quantity is less than 50 million is a small network model, which is suitable for running on a CPU. For such models, the core weight value of the CPU may therefore be set to 1, the core weight value of the GPU is set linearly between 0 and 1 according to the weight parameter quantity of the model, and the core weight value of the NPU is set between 0 and 0.5 according to the weight parameter quantity of the model.

A convolutional neural network model with 50 million to 200 million weight parameters is a medium network model, which is suitable for running on a GPU. The core weight value of the GPU is therefore 1, the core weight value of the CPU is set linearly between 1 and 0.5 according to the weight parameter quantity of the model, and the core weight value of the NPU is set between 0.5 and 1.0 according to the weight parameter quantity of the model.

A convolutional neural network model with 200 million to 500 million weight parameters is a large network model, which is suitable for running on a dedicated acceleration device, the NPU. The core weight value of the NPU is therefore 1, the core weight value of the CPU is set linearly between 0.5 and 0 according to the weight parameter quantity of the model, and the core weight value of the GPU is set linearly between 1.0 and 0.5 according to the weight parameter quantity of the model.
The method for determining the core weight value of a core by linear mapping is as follows:

1) In the correspondence of Table 1, determine the target weight parameter quantity interval in which the weight parameter quantity of the convolutional neural network model lies.

2) In the correspondence of Table 1, determine the core weight value intervals of the at least two cores, where the target core weight value intervals of the at least two cores each correspond to the target weight parameter quantity interval. The at least two cores are heterogeneous cores on the terminal.

3) For each core, determine the core weight value from its core weight value interval, such that the position of the core weight value within the core weight value interval is the same as the position of the weight parameter quantity within the target weight parameter quantity interval.
Take as an example a convolutional neural network model, obtained by the terminal, whose weight parameter quantity is 100 million, where the heterogeneous cores of the terminal are a CPU, a GPU, and an NPU. The core weight values are calculated as follows:

1) The terminal matches the weight parameter quantity of the model, 100 million, against the weight parameter quantity intervals in the correspondence of Table 1, and determines that 100 million lies in the target weight parameter quantity interval "50 million to 200 million" in Table 1.

2) In the correspondence of Table 1, the core weight value intervals of the heterogeneous cores corresponding to the target weight parameter quantity interval "50 million to 200 million" are determined to be: 1 for the GPU, 1 to 0.5 for the CPU, and 0.5 to 1.0 for the NPU.

3) To determine the core weight value of each core from its core weight value interval, the terminal linearly maps the position of the weight parameter quantity within the target weight parameter quantity interval onto the determined core weight value intervals of the CPU, GPU, and NPU, obtaining the core weight value of each core. In the correspondence of Table 1, the position of the weight parameter quantity 100 million within the target interval of 50 million to 200 million is (100 - 50) / (200 - 50) = 1/3. Based on this 1/3, the core weight value of the CPU is obtained by linear mapping within the CPU's target core weight value interval of 1 to 0.5: denoting the core weight value of the CPU as Sa, (Sa - 1) / (0.5 - 1) = 1/3, which gives a CPU core weight value of approximately 0.83. By the same method, the core weight value of the NPU is calculated to be approximately 0.67.
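The linear mapping of this worked example can be sketched as follows (the interval endpoints are those of the "50 million to 200 million" row of Table 1 as described above):

```python
def linear_map(quantity, q_lo, q_hi, w_lo, w_hi):
    """Map `quantity`'s relative position in [q_lo, q_hi] onto the core
    weight interval running from w_lo to w_hi (which may descend)."""
    position = (quantity - q_lo) / (q_hi - q_lo)
    return w_lo + (w_hi - w_lo) * position

# 100-million-parameter model; target interval 50M to 200M.
q = 100
cpu = linear_map(q, 50, 200, 1.0, 0.5)   # CPU interval descends 1 -> 0.5
npu = linear_map(q, 50, 200, 0.5, 1.0)   # NPU interval ascends 0.5 -> 1.0
gpu = 1.0                                # GPU weight is fixed at 1 in this row
print(round(cpu, 2), gpu, round(npu, 2))  # 0.83 1.0 0.67
```

Because w_lo may exceed w_hi, the same function covers both descending intervals (CPU) and ascending intervals (NPU).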
In other examples of the present invention, a core weight value may also be determined directly from the first correspondence.

For example, again take a convolutional neural network model with a weight parameter quantity of 100 million:

1) The terminal matches the weight parameter quantity of the model, 100 million, against the weight parameter quantities in the correspondence of Table 1, and determines that 100 million lies in the target weight parameter quantity interval "50 million to 200 million".

2) In the correspondence of Table 1, the core weight value of the GPU corresponding to the target weight parameter quantity interval "50 million to 200 million" is 1, so the core weight value of the GPU can be obtained directly.
As can be seen from the foregoing description, there are several ways to determine a core weight value from the correspondence in this embodiment of the present invention. When a core weight value in the first correspondence is a specific value, it can be determined directly; when it is a value range, linear mapping based on the weight parameter quantity of the convolutional neural network model is required to obtain the core weight value of the core.
A core weight value reflects the degree to which a core's hardware characteristics suit the current convolutional neural network model: the higher its core weight value, the better suited a core is to running the model. The terminal can therefore schedule the core that runs the convolutional neural network model according to the core weight values; for example, in the example above, the terminal may select the core with the largest core weight value, the GPU, to run the convolutional neural network model obtained in step 501.
However, the hardware characteristics of different cores suit different convolutional neural network models, and a core's runtime environment also affects how it runs a model. A core weight value reflects only the correlation between the core's static characteristics and the convolutional neural network model; if the core to run the model were scheduled based only on the cores' hardware characteristics, the resulting performance would not necessarily be the best. To select a core better suited to running the current convolutional neural network model, the cores' runtime environment parameters, that is, their dynamic characteristics, must also be considered. Considering the static and dynamic characteristics of the cores together makes it possible to select the core currently best suited to running the convolutional neural network model. One specific way of combining them is to use a core's dynamic parameters to adjust its core weight value, obtaining a new weight value, and to schedule cores according to that new weight value.
To this end, the core scheduling method in this embodiment of the present invention further includes the following steps.

In the following, the core usage rate and the remaining battery level of the terminal are used as example dynamic parameters; that is, load-balancing judgments in the performance and power-consumption dimensions are added to correct the weight values and decide which core to schedule.
Step 504: Obtain the current core usage rate of each core.

The core usage rate is a dynamically changing parameter.
Different runtime environments on the terminal have different effects on how a core runs a convolutional neural network model. A core's performance state is one of its dynamic characteristics: the same core has different computing capabilities in different performance states, so the performance state affects how the model runs. Taking the current performance state of each core into account when generating the core scheduling policy therefore allows the scheduled core to run the current convolutional neural network model more efficiently.

The core usage rate is an important performance-state parameter of a core, and can thus serve as one of the considerations of the core scheduling policy. The usage rate of a core represents its load.
A specific implementation in which the terminal obtains the current usage rates of the plurality of heterogeneous cores may be, for example, as follows: the terminal invokes a preset core usage reading program, or uses an API provided by the terminal's operating system for reading core usage, to read the core usage rate of each core on the terminal. The cores whose usage rates need to be read are the candidate cores for running the convolutional neural network model on the terminal, that is, the cores referred to in step 503.

For example, the terminal may read through an API provided by the terminal's operating system that the core usage rate of the GPU is 25%, meaning that 75% of the GPU's computing resources remain available.
Step 505: For each core, determine a performance weight value and performance parameters from a preset second correspondence according to the core usage rate.

The performance weight value of each core corresponds to that core's usage rate, and the performance parameters of each core correspond to that core's usage rate.

The performance weight value indicates the priority with which a core, at its current usage rate, is selected to run the convolutional neural network model; a core's performance weight value reflects how much of the core's computing resources are currently available. Different cores may have different performance weight values.
Running a convolutional neural network model on a core requires specific performance parameters of the core, which include one or more of thread priority information, sleep time information, and thread count information.

The thread priority information is the priority of the child threads when the core runs the convolutional neural network model; the sleep time information is the interval between two runs of a convolutional neural network model on the core; and the thread count information is the number of threads the core uses to run the convolutional neural network model.
The performance weight values and performance parameters obtained in step 505 may also take the form of a list.

The second correspondence includes the correspondence between each core's usage rate and its performance weight value, and also the correspondence between each core's usage rate and its performance parameters. In other words, the second correspondence relates core usage rates to performance weight values and performance parameters; that is, its parameters are the core usage rate, the performance weight value, and the performance parameters. Each of these three parameters may take the form of a specific value or of a value interval, while a determined performance weight value is a specific value. The performance weight values in the second correspondence may belong to one or more cores, and one or more performance weight values may be determined, with different performance weight values belonging to different cores. There may be one or more kinds of performance parameters for a core, which is not specifically limited in the present invention.
Step 505 may be implemented as follows:

1) Determine, according to the second correspondence, the performance weight value corresponding to the core's current usage rate.

After obtaining the current usage rate of a core, the terminal matches it against the core usage rates in the second correspondence. The match succeeds when the current usage rate equals a core usage rate in the second correspondence, or falls within a value range of core usage rates in the second correspondence. The terminal then determines, in the second correspondence, the performance weight value corresponding to the matched core usage rate.

If, in the second correspondence, core usage rates correspond to value ranges of performance weight values, then after the corresponding value range is determined, a specific performance weight value may be computed from the core's current usage rate, for example by linearly mapping the current usage rate into the value range of the corresponding performance weight value.
2) Determine, according to the second correspondence, the performance parameters corresponding to the core's current usage rate.

In the matching operation above, when the current usage rate equals a core usage rate in the second correspondence, or falls within a value range of core usage rates in the second correspondence, the match succeeds. The terminal may then also determine, in the second correspondence, the performance parameters corresponding to the matched core usage rate; these performance parameters may be specific values.

The performance parameters of a core may be set in the second correspondence for specific core usage rates, which requires system tuning to obtain. For example, when core usage is lowest, more threads may be used to achieve higher processing performance; when core usage is high, fewer threads are used to handle neural network computation requests, so that the impact on the already high core usage is as small as possible.
The second correspondence may be established in advance based on experimental test data or empirical data. The terminal may obtain the second correspondence by reading it from a memory of the terminal, or by obtaining it from another device.
A specific example of step 505 is given below.

A correspondence is pre-stored in the terminal, as shown in Table 2.

Table 2:
(Table 2 is provided as an image in the original publication. It divides the core usage rate into three tiers and associates each tier with a performance weight value range and a set of performance parameters; the tier used in the example below, a core usage rate of 2% to 30%, corresponds to a performance weight value range of 0.8 to 0.5, a thread priority of 0, a sleep time of 400 ms, and a thread count of 2.)
As shown in Table 2, in this correspondence the performance weight value is divided into three ranges, that is, into three tiers; the core usage rate and the performance parameters are likewise divided into three tiers, where each tier of core usage rate is a value range and the performance parameters of each tier are specific values. In this example, excessively high core usage rates are not considered, so as to prevent the cores from being overloaded and affecting the operation of the terminal.
The performance weight value of a core is determined as follows:

1) For each core, determine, in the correspondence of Table 2, the target core usage rate interval in which the core's current usage rate lies.

2) For each core, determine, in the correspondence of Table 2, the performance weight value interval corresponding to that target core usage rate interval.

3) For each core, determine the performance weight value from the performance weight value interval, such that the position of the performance weight value within the performance weight value interval is the same as the position of the core's current usage rate within the target core usage rate interval.
Taking a GPU with a current core usage rate of 25% as an example, its performance weight value is calculated as follows:

1) In the correspondence of Table 2, determine that the GPU's current usage rate of 25% lies in the target core usage rate interval of 2% to 30%.

2) In the correspondence of Table 2, determine that the performance weight value interval corresponding to this target core usage rate interval is 0.8 to 0.5.

3) The relative position of the GPU's current usage rate of 25% within the target core usage rate range of 2% to 30% is (25 - 2) / (30 - 2) = 23/28. Based on this 23/28, linear mapping is performed within the performance weight value interval of 0.8 to 0.5: denoting the GPU's performance weight value as Sx, (Sx - 0.8) / (0.5 - 0.8) = 23/28. This gives a GPU performance weight value of approximately 0.55.
The performance parameters are determined as follows:

1) In the second correspondence, determine the target core usage rate interval in which the current core usage rate lies.

2) In the second correspondence, determine the performance parameters corresponding to that target core usage rate interval.

Again taking a GPU with a current core usage rate of 25% as an example, the performance parameters are determined as follows:

1) In the correspondence of Table 2, determine that the GPU's current usage rate of 25% lies in the target core usage rate interval of 2% to 30%.

2) In the correspondence of Table 2, determine that the performance parameters corresponding to the target core usage rate interval of 2% to 30% are: thread priority information 0, sleep time information 400 ms, and thread count information 2.
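The two lookups of this example can be sketched together as follows. Only the 2% to 30% tier quoted above is reproduced; the other tiers of Table 2 are omitted, so this is an illustration of the mechanism rather than the full correspondence:

```python
def performance_lookup(usage, tiers):
    """Find the tier containing `usage`, interpolate the performance weight
    within that tier's weight range, and return the tier's performance
    parameters unchanged."""
    for u_lo, u_hi, w_lo, w_hi, params in tiers:
        if u_lo <= usage <= u_hi:
            position = (usage - u_lo) / (u_hi - u_lo)
            weight = w_lo + (w_hi - w_lo) * position
            return weight, params
    return None  # usage too high: the core is not considered for scheduling

# The one tier of Table 2 known from the text: usage 2%-30%, weight 0.8->0.5.
tiers = [
    (2, 30, 0.8, 0.5, {"thread_priority": 0, "sleep_ms": 400, "threads": 2}),
]
weight, params = performance_lookup(25, tiers)
print(round(weight, 2), params["threads"])  # 0.55 2
```

Note that the performance parameters are returned as stored, tier by tier, while the performance weight value is interpolated within the tier, matching the two determination methods described above.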
It can be understood that, in some embodiments of the present invention, the correspondence in step 505 may include the performance weight values but not the performance parameters, or may include the performance parameters but not the performance weight values. Alternatively, the second correspondence may be split into two correspondences, one between core usage rates and performance weight values, and the other between core usage rates and performance parameters.

It can also be understood that the core usage rate in step 505 is just one example of a state parameter of the terminal; the state parameters of the terminal may further include the remaining battery level of the terminal, the temperature of a core, and so on. Accordingly, the second correspondence may also be a correspondence between another state parameter and parameter weight values of the at least two cores, where a parameter weight value indicates the priority with which a core, under the specific state parameter, is selected to run the convolutional neural network model.
Step 506: For each core, correct the core weight value using the performance weight value to obtain a first modified weight value.

The first modified weight value indicates the priority with which a core is selected to run the convolutional neural network model: a core with a larger first modified weight value is better suited to running the model than a core with a smaller one.

When the terminal is to run a convolutional neural network model on a specific core, not only must the physical characteristics of the core match the characteristics of the model, but the core's current performance state must also be suitable for running the model; the core's performance state must therefore be considered together with its hardware characteristics. A specific implementation is to correct the core weight values of the plurality of heterogeneous cores using their performance weight values. The resulting first modified weight value combines the core weight value and the performance weight value, and thus reflects the degree to which both the core's hardware characteristics and its current usage rate suit running the convolutional neural network model. Compared with the core weight value alone, which reflects only the core's static characteristics, the first modified weight value better reflects how well the core fits the convolutional neural network model. Scheduling the core that runs the model according to the first modified weight value therefore allows a more efficient core to be selected.
使用核心的性能权重值修正该核心的核心权重值的方式有多种,例如将核心的性能权重值和该核心的核心权重值进行相乘,或者进行加权运算等,以得到第一修正权重值。There are various ways to use a core's performance weight value to correct its core weight value, for example, multiplying the performance weight value by the core weight value, or performing a weighted operation, to obtain the first modified weight value.
例如,GPU的核心权重值为1,该GPU的性能权重值为0.7,使用该核心权重值和性能权重值进行相乘,得到该GPU的第一修正权重值0.7。For example, the GPU has a core weight value of 1, and the GPU has a performance weight value of 0.7. The core weight value and the performance weight value are multiplied to obtain a first modified weight value of the GPU of 0.7.
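下面给出一段示意代码(非本申请原文内容,核心名称与数值均为示例):按步骤506,将每一核心的核心权重值与性能权重值相乘,得到第一修正权重值。A minimal illustrative sketch (not part of the original text; core names and values are examples): per step 506, each core's core weight value is multiplied by its performance weight value to obtain the first modified weight value.

```python
def first_modified_weights(core_weights, perf_weights):
    """按核心名逐一相乘 / multiply the two weights per core name."""
    return {core: core_weights[core] * perf_weights[core]
            for core in core_weights}

core_weights = {"CPU": 0.6, "GPU": 1.0, "NPU": 0.8}   # 核心权重值(示例 / example values)
perf_weights = {"CPU": 0.9, "GPU": 0.7, "NPU": 1.0}   # 性能权重值(示例 / example values)

w1 = first_modified_weights(core_weights, perf_weights)
print(round(w1["GPU"], 2))   # 1 * 0.7 = 0.7,与正文示例一致 / matches the example in the text
```

如正文所述,也可以换用加权运算等其他修正方式。As noted in the text, a weighted operation could be substituted for the multiplication.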
步骤507:获取终端的剩余电量值。Step 507: Acquire a remaining power value of the terminal.
终端的核心运行卷积神经网络模型时将产生电量消耗,因不同核心的功耗不同,在不同的核心上运行同一卷积神经网络模型,将会产生不同的电量消耗。为了不影响用户对终端的连续使用,需要将终端的剩余电量作为调度核心的考虑因素之一。这在电能存储较小的终端上尤其需要考虑。When the core of the terminal runs the convolutional neural network model, it will generate power consumption. Because the power consumption of different cores is different, running the same convolutional neural network model on different cores will generate different power consumption. In order not to affect the continuous use of the terminal by the user, it is necessary to regard the remaining power of the terminal as one of the consideration factors of the scheduling core. This is especially important on terminals with small electrical energy storage.
为此,终端需要获取其上的剩余电量值,该剩余电量值用于表示该终端当前剩余多少电量。To this end, the terminal needs to obtain the remaining power value on the terminal, and the remaining power value is used to indicate how much power the terminal currently has.
终端获取剩余电量值的具体方式,例如可以为:终端使用终端上的电量检测程序检测终端当前的剩余电量,得到剩余电量值。The specific manner in which the terminal obtains the remaining power value may be, for example, the terminal uses the power detection program on the terminal to detect the current remaining power of the terminal, and obtains the remaining power value.
步骤508:根据剩余电量值,从预设的第三对应关系中确定至少两个核心的功耗权重值。Step 508: Determine, according to the remaining power value, the power consumption weight values of the at least two cores from the preset third correspondence.
其中,至少两个核心的功耗权重值和剩余电量值对应。功耗权重值用于表示在剩余电量值下,核心被选择以运行卷积神经网络模型的优先程度。功耗权重值大的核心比功耗权重值小的核心适于运行卷积神经网络模型。The power consumption weight values of the at least two cores correspond to the remaining power value. The power consumption weight value is used to indicate the priority with which a core is selected to run the convolutional neural network model under the remaining power value. A core with a larger power consumption weight value is more suitable for running the convolutional neural network model than a core with a smaller one.
因不同的异构核心的功耗特性不同,在运行同一卷积神经网络模型时将会产生不同的电量消耗,可以将不同核心的功耗特性作为是否调度该核心的考虑因素之一。Due to the different power consumption characteristics of different heterogeneous cores, different power consumption will be generated when running the same convolutional neural network model, and the power consumption characteristics of different cores can be regarded as one of the considerations for scheduling the core.
若只是考虑核心的功耗的话,可能为卷积神经网络模型调度的核心是功耗较小的核心,但是该核心的计算能力可能并不优于其它的核心,即需要综合终端上的不同参数来考虑,才能产生更优的核心调度策略。为此,在本发明实施例中,综合考虑了终端的剩余电量和核心的功耗来决定核心的功耗权重值。具体的方式,为利用第三对应关系来确定核心的功耗权重值,该第三对应关系的参数包括剩余电量值和功耗权重值,并在第三对应关系中,功耗权重值的设定考虑到核心的功耗。If only the core's power consumption were considered, the core scheduled for the convolutional neural network model might be the one with the lowest power consumption, but its computing power might not be superior to that of the other cores; different parameters on the terminal need to be considered together to produce a better core scheduling strategy. Therefore, in the embodiment of the present invention, the remaining power of the terminal and the power consumption of the cores are jointly considered to determine each core's power consumption weight value. Specifically, the power consumption weight value of a core is determined by using the third correspondence, whose parameters include the remaining power value and the power consumption weight value; in the third correspondence, the power consumption weight values are set with the power consumption of each core taken into account.
第三对应关系包括剩余电量值和该至少两个核心的功耗权重值的对应关系。即第三对应关系的参数包括剩余电量值和功耗权重值。这两个参数的具体形式可以是具体的数值,也可以是数值的范围。确定出的功耗权重值可以为具体的数值。在第三对应关系中功耗权重值所属的核心类型可以为一个或者多个,确定出的功耗权重值的数量也可以为一个或多个,其中不同的功耗权重值属于不同的核心。The third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores. That is, the parameters of the third correspondence include the remaining power value and the power consumption weight value. The specific form of these two parameters may be a specific value or a range of values. The determined power consumption weight value may be a specific value. In the third correspondence, the power consumption weight values may belong to one or more core types, and the number of determined power consumption weight values may also be one or more, where different power consumption weight values belong to different cores.
若在第三对应关系中,为剩余电量值区间和功耗权重值的数值区间对应,确定出对应的功耗权重值的数值范围后,可以使用终端的当前剩余电量值计算出具体的功耗权重值,例如,根据终端的当前剩余电量值在对应的剩余电量值区间中的位置,使用线性映射方式在功耗权重值区间确定出功耗权重值。If, in the third correspondence, remaining-power-value intervals correspond to power-consumption-weight-value intervals, then after the corresponding weight-value interval is determined, the terminal's current remaining power value may be used to calculate a specific power consumption weight value; for example, a linear mapping determines the power consumption weight value within the weight-value interval according to the position of the terminal's current remaining power value within the corresponding remaining-power-value interval.
下面即举出一具体示例来说明步骤508,如下: A specific example is given below to illustrate step 508, as follows:
表3:Table 3:
(表3原为图像,未能提取。由下文示例可知其部分对应关系:CPU 剩余电量值 10%~100% 对应功耗权重值 0.8~1.0;GPU 剩余电量值 0%~50% 对应功耗权重值 0~0.8;NPU 剩余电量值 8%~100% 对应功耗权重值 0.8~1.0。)(Table 3 was an image that could not be extracted. From the worked example below, part of its correspondence can be recovered: CPU remaining power 10%~100% maps to power consumption weight 0.8~1.0; GPU 0%~50% maps to 0~0.8; NPU 8%~100% maps to 0.8~1.0.)
如表3所示,在该对应关系中,功耗权重值被划分为两个范围,即被分为两个档次,剩余电量值范围也被划分为两个档次。As shown in Table 3, in the corresponding relationship, the power consumption weight value is divided into two ranges, that is, divided into two grades, and the remaining power value range is also divided into two grades.
考虑到NPU功耗较低,其低功耗权重值对应的剩余电量范围设置的最低,即只要电量大于8%,都可以将功耗权重值设置为较高水平。Considering that the NPU's power consumption is low, the remaining-power threshold corresponding to its weight ranges is set the lowest; that is, as long as the remaining power is greater than 8%, the NPU's power consumption weight value can be set to the higher level.
功耗权重值的计算方法如下:The power weight value is calculated as follows:
1)在表3的对应关系中,确定终端的剩余电量值所在的目标剩余电量值区间;1) In the correspondence relationship of Table 3, determining a target remaining power value interval in which the remaining power value of the terminal is located;
2)在表3的对应关系中,确定与该目标剩余电量值区间对应的目标功耗权重值区间;2) in the correspondence relationship of Table 3, determining a target power consumption weight value interval corresponding to the target remaining power value interval;
3)对每一核心,从目标功耗权重值区间中确定功耗权重值,功耗权重值在目标功耗权重值区间中的位置和终端的剩余电量值在目标剩余电量值区间中的位置相同。3) For each core, the power consumption weight value is determined from the target power weight value interval, the position of the power weight value in the target power weight value interval, and the position of the terminal remaining power value in the target remaining power value interval the same.
以终端的当前剩余电量值为40%为例,其功耗权重值的计算如下:Taking the current remaining power value of the terminal as 40% as an example, the power consumption weight value is calculated as follows:
1)在表3的对应关系中,确定终端的剩余电量值40%所在的目标剩余电量区间:对应CPU的为10%~100%,对应GPU的为0%~50%,对应NPU的为8%~100%;1) In the corresponding relationship of Table 3, determine the target remaining power consumption interval where the remaining power value of the terminal is 40%: 10% to 100% for the corresponding CPU, 0% to 50% for the corresponding GPU, and 8 for the corresponding NPU. %~100%;
2)对每一核心,在表3的对应关系中,确定与该目标剩余电量区间对应的目标功耗权重值区间:对应CPU的为0.8~1.0,对应GPU的为0~0.8,对应NPU的为0.8~1.0;2) For each core, in the correspondence relationship of Table 3, the target power consumption weight value interval corresponding to the target remaining power consumption interval is determined: the corresponding CPU is 0.8 to 1.0, and the corresponding GPU is 0 to 0.8, corresponding to the NPU. It is 0.8 to 1.0;
3)终端的剩余电量值40%在对应CPU的目标剩余电量值范围10%~100%内的相对位置关系为(40-10)/(100-10)=1/3。根据该1/3,在对应CPU的目标功耗权重值范围0.8~1.0内进行线性映射,设CPU的功耗权重值为Sy,则(Sy-0.8)/(1-0.8)=1/3。计算得到CPU的功耗权重值Sy≈0.86。使用相同的计算方法,计算得到GPU的功耗权重值为0.64,NPU的功耗权重值为0.87。3) The relative position of the terminal's remaining power value 40% within the CPU's target remaining-power-value range 10%~100% is (40-10)/(100-10)=1/3. According to this 1/3, a linear mapping is performed within the CPU's target power-consumption-weight range 0.8~1.0: letting the CPU's power consumption weight value be Sy, then (Sy-0.8)/(1-0.8)=1/3, which gives Sy≈0.86. Using the same calculation method, the GPU's power consumption weight value is calculated to be 0.64 and the NPU's to be 0.87.
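步骤508的线性映射计算可以用如下示意代码表达(非本申请原文内容;区间数值取自上文示例,属推断,并非表3原文):The linear mapping of step 508 can be sketched as follows (not part of the original text; the interval values are inferred from the worked example above and are assumptions, not the original Table 3):

```python
# 第三对应关系(示意):核心 -> (剩余电量值区间, 功耗权重值区间)
# Third correspondence (sketch): core -> (remaining-power interval, weight interval)
THIRD_CORRESPONDENCE = {
    "CPU": ((10, 100), (0.8, 1.0)),
    "GPU": ((0, 50), (0.0, 0.8)),
    "NPU": ((8, 100), (0.8, 1.0)),
}

def power_weight(core, remaining_pct):
    (p_lo, p_hi), (w_lo, w_hi) = THIRD_CORRESPONDENCE[core]
    # 1)~2) 确定目标剩余电量值区间及对应的目标功耗权重值区间(此处直接查表)
    # steps 1)-2): look up the target intervals for this core
    # 3) 按剩余电量值在区间中的相对位置,线性映射到功耗权重值区间
    # step 3): linear mapping by relative position within the interval
    ratio = (remaining_pct - p_lo) / (p_hi - p_lo)
    return w_lo + ratio * (w_hi - w_lo)

for core in ("CPU", "GPU", "NPU"):
    print(core, power_weight(core, 40))
# GPU 为 0.64,与正文一致;CPU ≈ 0.867(正文近似记为0.86),NPU ≈ 0.87
# GPU gives 0.64 as in the text; CPU ≈ 0.867 (the text writes ≈0.86), NPU ≈ 0.87
```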
步骤509:对每一核心,使用功耗权重值修正第一修正权重值,得到第二修正权重值。Step 509: For each core, the first modified weight value is corrected by using the power consumption weight value to obtain a second modified weight value.
第二修正权重值用于表示核心被选择以运行卷积神经网络模型的优先程度;第二修正权重值大的核心比第二修正权重值小的核心适于运行卷积神经网络模型。The second modified weight value is used to indicate the priority with which a core is selected to run the convolutional neural network model; a core with a larger second modified weight value is more suitable for running the convolutional neural network model than a core with a smaller one.
因功耗权重值反映了终端的剩余电量值和核心的功耗,使用功耗权重值修正第一修正权重值,即为使用终端的剩余电量值和核心的功耗对核心的第一修正权重值进行修正。Since the power consumption weight value reflects the remaining power value of the terminal and the power consumption of the core, the first correction weight value is corrected by using the power consumption weight value, that is, the first correction weight of the core is used for the remaining power value of the terminal and the power consumption of the core. The value is corrected.
在本发明实施例中该第二修正权重值可以为列表的形式出现。In the embodiment of the present invention, the second correction weight value may appear in the form of a list.
如上所述,核心的第一修正权重值反映了该核心的硬件计算特性和该核心的当前核心使用率适于运行卷积神经网络模型的程度,使用核心的功耗权重值修正核心的第一修正权重值后,得到的核心的第二修正权重值结合了更多的参数来进行核心调度策略的生成。根据上述对各权重值的描述可知,该核心的第二修正权重值涉及到的参数有核心的硬件计算特性、卷积神经网络模型的计算密度、核心使用率、核心所处终端的剩余电量、核心的功耗,从而核心的第二修正权重值更能反映出该核心运行卷积神经网络模型的适宜程度,根据不同核心的第二修正权重值,可以更准确地确定出可以高效运行该卷积神经网络模型的核心。As described above, a core's first modified weight value reflects the degree to which the core's hardware computing characteristics and its current core usage rate are suited to running the convolutional neural network model. After the first modified weight value is corrected with the core's power consumption weight value, the resulting second modified weight value combines still more parameters for generating the core scheduling strategy. From the descriptions of the weight values above, the parameters involved in a core's second modified weight value include the core's hardware computing characteristics, the computational density of the convolutional neural network model, the core usage rate, the remaining power of the terminal where the core resides, and the core's power consumption. The second modified weight value therefore better reflects how suitable the core is for running the convolutional neural network model, and based on the second modified weight values of the different cores, the core that can run the convolutional neural network model efficiently can be determined more accurately.
使用功耗权重值修正第一修正权重值的具体方式有多种,例如将核心的功耗权重值和第一修正权重值进行相乘,或者使用加权运算等,以得到该核心的第二修正权重值。There are various ways to use the power consumption weight value to correct the first modified weight value, for example, multiplying the core's power consumption weight value by its first modified weight value, or using a weighted operation, to obtain the core's second modified weight value.
例如,GPU的功耗权重值为0.4,该GPU的第一修正权重值为0.7,终端使用该功耗权重值乘以第一修正权重值,得到该GPU的第二修正权重值0.28。For example, the power consumption weight value of the GPU is 0.4, and the first modified weight value of the GPU is 0.7. The terminal multiplies the first modified weight value by the power consumption weight value to obtain the GPU's second modified weight value of 0.28.
可以理解,在本发明实施例中,可以先使用性能权重值修正核心权重值,也可以先使用功耗权重值修正核心权重值,也可以同时使用性能权重值和功耗权重值修正核心权重值,本发明实施例对此不作具体限定。It can be understood that, in the embodiment of the present invention, the core weight value may be first modified by using the performance weight value, or the core weight value may be first modified by using the power weight value, or the core weight value may be corrected by using the performance weight value and the power weight value simultaneously. The embodiment of the present invention does not specifically limit this.
步骤510:从该至少两个核心中确定第二修正权重值最大的目标核心。Step 510: Determine, from the at least two cores, a target core with the second modified weight value being the largest.
该目标核心用于运行卷积神经网络模型的核心。This target core is used to run the core of the convolutional neural network model.
终端在得到多个异构核心的第二修正权重值后,该第二修正权重值可用于进行核心的调度。比较这些核心的第二修正权重值,选择第二修正权重值最大的目标核心,因核心的第二修正权重值反映了该核心适于运行卷积神经网络模型的合适程度,故适于在目标核心上运行该卷积神经网络模型。After the terminal obtains the second modified weight values of the multiple heterogeneous cores, these values can be used for core scheduling. The terminal compares the cores' second modified weight values and selects the target core with the largest second modified weight value; because a core's second modified weight value reflects how suitable the core is for running the convolutional neural network model, the convolutional neural network model is suitably run on the target core.
例如,多个核心的第二修正权重值为{WCPU=0.4;WGPU=0.8;WNPU=0.5},则终端选择第二修正权重值最大的GPU运行步骤501获取的卷积神经网络模型,以执行具体的应用业务。For example, if the second correction weight value of the plurality of cores is {W CPU =0.4; W GPU =0.8; W NPU =0.5}, the terminal selects the convolutional neural network model acquired by the GPU running step 501 with the second largest modified weight value. To perform specific application business.
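步骤509~510可用如下示意代码串联起来(非本申请原文内容,数值均为示例;其中GPU的0.7×0.4=0.28与步骤509的示例一致):Steps 509-510 can be chained in the following sketch (not part of the original text; all values are examples, and the GPU's 0.7×0.4=0.28 matches the example in step 509):

```python
# 数值均为示例 / values are illustrative
first_weights = {"CPU": 0.5, "GPU": 0.7, "NPU": 0.5}   # 第一修正权重值 / first modified weights
power_weights = {"CPU": 0.8, "GPU": 0.4, "NPU": 1.0}   # 功耗权重值 / power consumption weights

# 步骤509:用功耗权重值修正第一修正权重值,得到第二修正权重值
# step 509: correct the first modified weight with the power weight
second_weights = {c: first_weights[c] * power_weights[c] for c in first_weights}

# 步骤510:选出第二修正权重值最大的目标核心
# step 510: pick the target core with the largest second modified weight
target_core = max(second_weights, key=second_weights.get)

print(round(second_weights["GPU"], 2))  # 0.28,与步骤509的示例一致 / matches step 509
print(target_core)                      # NPU
```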
步骤511:在目标核心上使用目标核心的性能参数运行卷积神经网络模型。Step 511: Run a convolutional neural network model on the target core using the performance parameters of the target core.
在目标核心上运行卷积神经网络模型,涉及到具体的运行方式,例如怎么使用核心上线程来运行该卷积神经网络模型。在上述步骤中因根据核心的当前核心使用率确定了该核心的性能参数,即目标核心的性能参数已确定出,终端可以在目标核心上使用该性能参数来运行卷积神经网络模型。Running the convolutional neural network model on the target core involves a specific way of running, such as how to use the core thread to run the convolutional neural network model. In the above steps, the performance parameter of the core is determined according to the current core usage rate of the core, that is, the performance parameter of the target core has been determined, and the terminal can use the performance parameter to run the convolutional neural network model on the target core.
确定的性能参数有多种时,则具体使用每一种性能参数来运行卷积神经网络模型。例如,依据性能参数的线程数目信息,控制目标核心的多线程的并发数量;依据性能参数的睡眠时间信息,执行完本次网络计算请求后,控制核心的睡眠时间,即在睡眠时间信息指示的间隔时间内核心不运行下一卷积神经网络模型;依据性能参数的线程优先级信息,控制目标核心中子线程的优先级。When multiple performance parameters are determined, each of them is used when running the convolutional neural network model. For example, the thread-count information of the performance parameters controls the number of concurrent threads on the target core; according to the sleep-time information, after the current network computing request is executed, the core's sleep time is controlled, that is, the core does not run the next convolutional neural network model during the interval indicated by the sleep-time information; and the thread-priority information of the performance parameters controls the priority of the sub-threads on the target core.
例如,目标核心运行完成卷积神经网络模型后,在睡眠时间信息的作用下,调用系统的睡眠API,睡眠一段时间,在该段时间内不运行另一卷积神经网络模型,该段时间之后,再去处理下一次新的卷积神经网络模型。这样,在核心使用率高时,通过睡眠时间信息的设定使得目标核心睡眠较长时间,以维持核心使用率在合理水平。For example, after the target core finishes running the convolutional neural network model, it calls the system's sleep API under the effect of the sleep-time information and sleeps for a period of time, during which it does not run another convolutional neural network model; after that period, it processes the next new convolutional neural network model. In this way, when the core usage rate is high, the sleep-time setting makes the target core sleep longer, so as to keep the core usage rate at a reasonable level.
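步骤511中按性能参数(线程数目信息、睡眠时间信息)运行模型的方式可以示意如下(非本申请原文内容,为假设性实现,字段名均为示例):The way step 511 runs the model according to the performance parameters (thread-count information and sleep-time information) can be sketched as follows (not part of the original text; a hypothetical implementation with example field names):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_model(perf, batches, compute):
    """perf: 性能参数(示例字段)/ performance parameters (example fields)."""
    results = []
    with ThreadPoolExecutor(max_workers=perf["threads"]) as pool:
        for batch in batches:
            # 依据线程数目信息控制目标核心上子线程的并发数量
            # the thread-count information caps concurrent worker threads
            results.append(list(pool.map(compute, batch)))
            # 执行完本次网络计算请求后,按睡眠时间信息睡眠,
            # 在该间隔内不运行下一卷积神经网络模型
            # sleep per the sleep-time information between requests
            time.sleep(perf["sleep_s"])
    return results

perf = {"threads": 2, "sleep_s": 0.01}   # 性能参数(示例 / example values)
print(run_model(perf, [[1, 2], [3, 4]], lambda x: x * x))
# [[1, 4], [9, 16]]
```

此处用简单的平方计算代替真实的卷积神经网络计算请求,仅示意调度流程。A simple squaring function stands in for a real convolutional neural network computing request; only the scheduling flow is illustrated.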
综上所述,本发明实施例的方法,可以在获取了卷积神经网络的权重参数数量后,因该权重参数数量表示卷积神经网络模型的计算密度,根据预设的第一对应关系,可以根据该权重参数数量确定出多个核心的核心权重值,核心权重值用于表示核心被选择以运行卷积神经网络模型的优先程度,再通过核心所在终端的动态参数来修正核心权重值,例如,通过核心的核心使用率根据第二对应关系确定出性能权重值,性能权重值用于表示在核心当前的核心使用率下,核心被选择以运行所述卷积神经网络模型的优先程度,通过终端的剩余电量值根据第三对应关系确定出功耗权重值,功耗权重值用于表示在所述剩余电量值下,核心被选择以运行所述卷积神经网络模型的优先程度,先后使用性能权重值和功耗权重值修正核心权重值得到第二修正权重值,在该多个核心中,第二修正权重值最大的目标核心为最适宜于运行卷积神经网络模型的核心,可以调度该目标核心运行卷积神经网络模型,可以提高运行的效率,并能减少功耗。In summary, with the method of the embodiment of the present invention, after the number of weight parameters of the convolutional neural network is acquired, because this number represents the computational density of the convolutional neural network model, the core weight values of multiple cores can be determined from the preset first correspondence according to the number of weight parameters, where a core weight value indicates the priority with which a core is selected to run the convolutional neural network model. The core weight values are then corrected with dynamic parameters of the terminal where the cores reside: for example, a performance weight value is determined from the second correspondence according to a core's core usage rate, indicating the priority with which the core is selected to run the convolutional neural network model under its current core usage rate, and a power consumption weight value is determined from the third correspondence according to the terminal's remaining power value, indicating the priority with which the core is selected to run the convolutional neural network model under that remaining power value. Correcting the core weight values successively with the performance weight values and the power consumption weight values yields the second modified weight values. Among the multiple cores, the target core with the largest second modified weight value is the core best suited to running the convolutional neural network model; scheduling the target core to run the convolutional neural network model can improve running efficiency and reduce power consumption.
图6为本发明实施例提供的一种终端的硬件结构示意图。如图6所示,为了便于说明,仅示出了与本发明实施例相关的部分,具体技术细节未揭示的,请参照本发明实施例方法部分。该终端可以为包括手机、平板电脑、PDA(Personal Digital Assistant,个人数字助理)、POS(Point of Sales,销售终端)、车载电脑等任意终端设备,以终端为手机为例:FIG. 6 is a schematic structural diagram of hardware of a terminal according to an embodiment of the present invention. As shown in FIG. 6, for the convenience of description, only the parts related to the embodiment of the present invention are shown. For the specific technical details not disclosed, please refer to the method part of the embodiment of the present invention. The terminal may be any terminal device including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales), an in-vehicle computer, and the terminal is a mobile phone as an example:
图6示出的是与本发明实施例提供的终端相关的手机的部分结构的框图。参考图6,手机包括:射频(Radio Frequency,RF)电路610、存储器620、输入单元630、显示单元640、传感器650、音频电路660、无线保真(wireless fidelity,WiFi)模块670、中央处理器680、以及电源690等部件。FIG. 6 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present invention. Referring to FIG. 6, the mobile phone includes: a radio frequency (RF) circuit 610, a memory 620, an input unit 630, a display unit 640, a sensor 650, an audio circuit 660, a wireless fidelity (WiFi) module 670, and a central processing unit. 680, and power supply 690 and other components.
在一些实施例中,该手机还可以包括图形处理器681、数字信号处理器682、脉动阵列处理器683等,该脉动阵列处理器具体可以为神经网络处理器、张量处理器、智能处理器等。In some embodiments, the mobile phone may further include a graphics processor 681, a digital signal processor 682, a systolic array processor 683, and the like; the systolic array processor may specifically be a neural network processor, a tensor processor, an intelligent processor, or the like.
本领域技术人员可以理解,图6中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。It will be understood by those skilled in the art that the structure of the handset shown in FIG. 6 does not constitute a limitation to the handset, and may include more or less components than those illustrated, or some components may be combined, or different components may be arranged.
下面结合图6对手机的各个构成部件进行具体的介绍:The following describes the components of the mobile phone in detail with reference to FIG. 6:
RF电路610可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给中央处理器680处理;另外,将设计上行的数据发送给基站。通常,RF电路610包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(Low Noise Amplifier,LNA)、双工器等。此外,RF电路610还可以通过无线通信与网络和其他设备通信。上述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯***(Global System of Mobile communication,GSM)、通用分组无线服务(General Packet Radio Service,GPRS)、码分多址(Code Division Multiple Access,CDMA)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、长期演进(Long Term Evolution,LTE)、电子邮件、短消息服务(Short Messaging Service,SMS)等。The RF circuit 610 can be used for transmitting and receiving information or during a call, and receiving and transmitting the signal. Specifically, after receiving the downlink information of the base station, it is processed by the central processing unit 680; in addition, the uplink data is designed to be sent to the base station. Generally, RF circuit 610 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, RF circuitry 610 can also communicate with the network and other devices via wireless communication. The above wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (Code Division). Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), E-mail, Short Messaging Service (SMS), and the like.
存储器620可用于存储软件程序以及模块,中央处理器680通过运行存储在存储器620的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器620可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作***、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器620可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。The memory 620 can be used to store software programs and modules, and the central processing unit 680 executes various functional applications and data processing of the mobile phone by running software programs and modules stored in the memory 620. The memory 620 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may be stored according to Data created by the use of the mobile phone (such as audio data, phone book, etc.). Moreover, memory 620 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
输入单元630可用于接收输入的数字或字符信息,以及产生与手机的用户设置以及功能控制有关的键信号输入。具体地,输入单元630可包括触控面板631以及其他输入设备 632。触控面板631,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板631上或在触控面板631附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板631可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给中央处理器680,并能接收中央处理器680发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板631。除了触控面板631,输入单元630还可以包括其他输入设备632。具体地,其他输入设备632可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。The input unit 630 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function controls of the handset. Specifically, the input unit 630 may include the touch panel 631 and other input devices. 632. The touch panel 631, also referred to as a touch screen, can collect touch operations on or near the user (such as the user using a finger, a stylus, or the like on the touch panel 631 or near the touch panel 631. Operation), and drive the corresponding connecting device according to a preset program. Optionally, the touch panel 631 can include two parts: a touch detection device and a touch controller. Wherein, the touch detection device detects the touch orientation of the user, and detects a signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts the touch information into contact coordinates, and sends the touch information. The central processor 680 is provided and can receive commands from the central processing unit 680 and execute them. In addition, the touch panel 631 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves. In addition to the touch panel 631, the input unit 630 may also include other input devices 632. 
In particular, other input devices 632 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
显示单元640可用于显示由用户输入的信息或提供给用户的信息以及手机的各种菜单。显示单元640可包括显示面板641,可选的,可以采用液晶显示器(Liquid Crystal Display,LCD)、有机发光二极管(Organic Light-Emitting Diode,OLED)等形式来配置显示面板641。进一步的,触控面板631可覆盖显示面板641,当触控面板631检测到在其上或附近的触摸操作后,传送给中央处理器680以确定触摸事件的类型,随后中央处理器680根据触摸事件的类型在显示面板641上提供相应的视觉输出。虽然在图6中,触控面板631与显示面板641是作为两个独立的部件来实现手机的输入和输出功能,但是在某些实施例中,可以将触控面板631与显示面板641集成而实现手机的输入和输出功能。The display unit 640 can be used to display information input by the user or information provided to the user, as well as the various menus of the mobile phone. The display unit 640 may include a display panel 641; optionally, the display panel 641 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 631 may cover the display panel 641. When the touch panel 631 detects a touch operation on or near it, it transmits the operation to the central processing unit 680 to determine the type of the touch event, and the central processing unit 680 then provides a corresponding visual output on the display panel 641 according to the type of the touch event. Although in FIG. 6 the touch panel 631 and the display panel 641 are two independent components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 631 may be integrated with the display panel 641 to realize the input and output functions of the mobile phone.
手机还可包括至少一种传感器650,比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板641的亮度,接近传感器可在手机移动到耳边时,关闭显示面板641和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。The mobile phone may further include at least one sensor 650, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel 641 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 641 and/or the backlight when the phone is moved to the ear. As one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in all directions (generally on three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications for recognizing the phone's posture (such as horizontal/vertical screen switching, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tapping). As for other sensors that may also be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, details are not described herein again.
音频电路660、扬声器661,传声器662可提供用户与手机之间的音频接口。音频电路660可将接收到的音频数据转换后的电信号,传输到扬声器661,由扬声器661转换为声音信号输出;另一方面,传声器662将收集的声音信号转换为电信号,由音频电路660接收后转换为音频数据,再将音频数据输出中央处理器680处理后,经RF电路610以发送给比如另一手机,或者将音频数据输出至存储器620以便进一步处理。The audio circuit 660, the speaker 661, and the microphone 662 provide an audio interface between the user and the mobile phone. The audio circuit 660 can transmit the electrical signal converted from received audio data to the speaker 661, which converts it into a sound signal for output; on the other hand, the microphone 662 converts a collected sound signal into an electrical signal, which the audio circuit 660 receives and converts into audio data. The audio data is then output to the central processing unit 680 for processing and sent via the RF circuit 610 to, for example, another mobile phone, or output to the memory 620 for further processing.
WiFi属于短距离无线传输技术,手机通过WiFi模块670可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图6示出了WiFi模块670,但是可以理解的是,其并不属于手机的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。WiFi is a short-range wireless transmission technology, and the mobile phone can help users to send and receive emails, browse web pages, and access streaming media through the WiFi module 670, which provides users with wireless broadband Internet access. Although FIG. 6 shows the WiFi module 670, it can be understood that it does not belong to the essential configuration of the mobile phone, and can be omitted as needed within the scope of not changing the essence of the invention.
中央处理器680是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器620内的软件程序和/或模块,以及调用存储在存储器620内的数据,执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,中央处理器680可包括一个或多个处理单元;优选的,中央处理器680可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到中央处理器680中。The central processing unit 680 is the control center of the mobile phone. It connects the various parts of the entire phone using various interfaces and lines, and performs the phone's various functions and processes data by running or executing software programs and/or modules stored in the memory 620 and invoking data stored in the memory 620, thereby monitoring the phone as a whole. Optionally, the central processing unit 680 may include one or more processing units; preferably, it may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, applications, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the central processing unit 680.
手机还包括给各个部件供电的电源690(比如电池),优选的,电源可以通过电源管理***与中央处理器680逻辑相连,从而通过电源管理***实现管理充电、放电、以及功耗管理等功能。The handset also includes a power source 690 (such as a battery) that supplies power to the various components. Preferably, the power source can be logically coupled to the central processing unit 680 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
尽管未示出,手机还可以包括摄像头、蓝牙模块等,在此不再赘述。Although not shown, the mobile phone may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
在本发明实施例中,该终端所包括的中央处理器680可以设置用于:获取目标模型参数,目标模型参数用于表示一卷积神经网络模型的计算密度;根据目标模型参数,从预设的第一对应关系中确定至少两个核心的核心权重值,至少两个核心的核心权重值与目标模型参数对应,至少两个核心为终端上的异构核心,第一对应关系包括目标模型参数和至少两个核心的核心权重值的对应关系,核心权重值用于表示核心被选择以运行卷积神经网络模型的优先程度;根据至少两个核心的核心权重值,从至少两个核心中确定运行卷积神经网络模型的核心。In the embodiment of the present invention, the central processing unit 680 included in the terminal may be configured to: acquire a target model parameter, where the target model parameter is used to represent a computing density of a convolutional neural network model; and according to the target model parameter, from the preset The core correspondence value of at least two cores is determined in the first correspondence relationship, the core weight values of the at least two cores correspond to the target model parameters, and at least two cores are heterogeneous cores on the terminal, and the first correspondence relationship includes target model parameters. Corresponding to at least two core core weight values, the core weight value is used to indicate the priority of the core selected to run the convolutional neural network model; and determined from at least two cores based on the core weight values of at least two cores Run the core of the convolutional neural network model.
Optionally, the central processing unit 680 may be further configured to: acquire a current state parameter of the terminal, where the state parameter is a dynamically changing parameter; determine, according to the state parameter, parameter weight values of the at least two cores from a preset second correspondence, where the parameter weight values of the at least two cores correspond to the state parameter, the second correspondence includes a correspondence between the state parameter and the parameter weight values of the at least two cores, and a parameter weight value is used to represent, under the state parameter, the priority with which a core is selected to run the convolutional neural network model; for each core, modify the core weight value using the parameter weight value to obtain a first modified weight value, where the first modified weight value is used to represent the priority with which the core is selected to run the convolutional neural network model; and determine, according to the first modified weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
Optionally, the current state parameter of the terminal is the current core usage of each core, and the central processing unit 680 may be further configured to: for each core, determine a performance weight value from the preset second correspondence according to the core usage, where the performance weight value of each core corresponds to the core usage of that core, the performance weight value is used to represent, under the current core usage of the core, the priority with which the core is selected to run the convolutional neural network model, and the second correspondence includes a correspondence between the core usage of each core and the performance weight value of each core.
Optionally, the current remaining power value of the terminal is acquired, and the central processing unit 680 may be further configured to: determine, according to the remaining power value, power consumption weight values of the at least two cores from a preset third correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the third correspondence includes a correspondence between the remaining power value and the power consumption weight values of the at least two cores, and a power consumption weight value is used to represent, under the remaining power value, the priority with which a core is selected to run the convolutional neural network model; for each core, modify the first modified weight value using the power consumption weight value to obtain a second modified weight value, where the second modified weight value is used to represent the priority with which the core is selected to run the convolutional neural network model; and determine, according to the second modified weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
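The two-stage correction above can be sketched in a few lines. Note the hedge: the patent does not fix what "modify" means arithmetically, so this sketch assumes multiplication, and every numeric value and core name below is invented for illustration.

```python
# Hypothetical sketch of the weight-correction chain: the core weight value
# is first modified by a state-dependent parameter weight (e.g., a
# performance weight from core usage), then by a power consumption weight
# from the remaining power value. Multiplication is an assumption here.

def first_modified(core_weight: float, parameter_weight: float) -> float:
    return core_weight * parameter_weight

def second_modified(first_modified_weight: float, power_weight: float) -> float:
    return first_modified_weight * power_weight

def select_core(core_weights, parameter_weights, power_weights):
    """Select the core whose second modified weight value is largest."""
    scores = {
        core: second_modified(
            first_modified(core_weights[core], parameter_weights[core]),
            power_weights[core],
        )
        for core in core_weights
    }
    return max(scores, key=scores.get)

# Invented example: the GPU has the best base weight, but heavy GPU load
# (low performance weight) and a low-battery penalty shift the choice.
core_w = {"CPU": 0.5, "GPU": 0.9, "DSP": 0.7}
perf_w = {"CPU": 0.8, "GPU": 0.2, "DSP": 0.9}
power_w = {"CPU": 0.6, "GPU": 0.5, "DSP": 0.9}
print(select_core(core_w, perf_w, power_w))  # -> "DSP"
```

The point of the chain is that a statically preferred core can lose out once the dynamic state of the terminal (usage, battery) is folded in.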
Optionally, the central processing unit 680 may be further configured to: acquire the current core usage of each core; for each core, determine a performance parameter from the second correspondence according to the core usage, where the performance parameter of each core corresponds to the core usage of that core, and the second correspondence includes a correspondence between the performance parameter of each core and the core usage of each core; and after the core for running the convolutional neural network model is determined from the at least two cores according to the core weight values of the at least two cores, run the convolutional neural network model on a target core using the performance parameter of the target core, where the target core is the core for running the convolutional neural network model.
Optionally, the current state parameter of the terminal is the current remaining power value of the terminal, and the central processing unit 680 may be further configured to: determine, according to the remaining power value, power consumption weight values of the at least two cores from the preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, a power consumption weight value is used to represent, under the remaining power value, the priority with which a core is selected to run the convolutional neural network model, and the second correspondence includes a correspondence between the remaining power value and the power consumption weight values of the at least two cores.
Optionally, the central processing unit 680 may be further configured to: determine, in the preset first correspondence, a target model parameter interval in which the target model parameter is located; determine, in the first correspondence, core weight value intervals of the at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes a correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter; and for each core, determine a core weight value from the core weight value interval, where the position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval.
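The "same position in the interval" rule above is, in effect, linear interpolation. A minimal sketch, with invented interval endpoints (the patent text does not supply concrete numbers):

```python
# Hypothetical sketch: the core weight value occupies the same relative
# position in its core weight value interval as the target model parameter
# occupies in its target model parameter interval.

def position_in_interval(value: float, low: float, high: float) -> float:
    """Relative position of value within [low, high], as a fraction in [0, 1]."""
    return (value - low) / (high - low)

def core_weight_from_intervals(target_param, param_interval, weight_interval):
    p = position_in_interval(target_param, *param_interval)
    w_low, w_high = weight_interval
    return w_low + p * (w_high - w_low)

# Invented example: a model with 30M weight parameters sits halfway through
# the (20M, 40M) parameter interval, so the core weight sits halfway through
# its (0.6, 0.8) weight interval.
w = core_weight_from_intervals(30_000_000, (20_000_000, 40_000_000), (0.6, 0.8))
print(round(w, 3))  # -> 0.7
```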
Optionally, the central processing unit 680 may be further configured to perform steps 401 to 403 described above.
Optionally, the central processing unit 680 may be further configured to perform steps 501 to 511 described above.
In summary, the central processing unit 680 acquires a target model parameter, where the target model parameter is used to represent the computing density of a convolutional neural network model. Then, the central processing unit 680 determines, according to the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, and the at least two cores are heterogeneous cores on the terminal. The first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value is used to represent the priority with which a core is selected to run the convolutional neural network model. The central processing unit 680 thus determines, according to the core weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores. The heterogeneous cores on the terminal have different characteristics, and different cores are suited to running convolutional neural network models of different computing densities.
If a first correspondence is preset, where the first correspondence includes a correspondence between the target model parameter and the core weight values of at least two cores, the target model parameter is used to represent the computing density of a convolutional neural network model, and the at least two cores are heterogeneous cores on the terminal, then after the target model parameter of a convolutional neural network model is acquired, the core weight values of the at least two cores can be determined from the preset first correspondence according to the target model parameter. A core weight value is used to represent the priority with which a core is selected to run the convolutional neural network model, so a core suited to running the convolutional neural network model can be determined from the core weight values. In this way, the core for running the convolutional neural network model can be determined from the at least two cores according to the core weight values of the at least two cores. Through the core weight values of the different cores, an adapted core can be determined to run a convolutional neural network model with a specific computing density; if cores with larger core weight values run the convolutional neural network model more efficiently, the core determined according to the core weight values can run the convolutional neural network model efficiently.
FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present invention. The terminal may be integrated on the terminal shown in FIG. 6, and the terminal shown in FIG. 7 may be used to perform the steps performed by the terminal shown in FIG. 4 or FIG. 5.
Referring to FIG. 7, a terminal according to an embodiment of the present invention includes:
an obtaining unit 701, configured to acquire a target model parameter, where the target model parameter is used to represent the computing density of a convolutional neural network model;
a weight value determining unit 702, configured to determine, according to the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on the terminal, the first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value is used to represent the priority with which a core is selected to run the convolutional neural network model; and
a core determining unit 703, configured to determine, according to the core weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
Optionally,
the obtaining unit 701 is further configured to acquire a current state parameter of the terminal, where the state parameter is a dynamically changing parameter;
the weight value determining unit 702 is further configured to determine, according to the state parameter, parameter weight values of the at least two cores from a preset second correspondence, where the parameter weight values of the at least two cores correspond to the state parameter, the second correspondence includes a correspondence between the state parameter and the parameter weight values of the at least two cores, and a parameter weight value is used to represent, under the state parameter, the priority with which a core is selected to run the convolutional neural network model;
the core determining unit 703 includes a modification module 704 and a core determining module 705;
the modification module 704 is configured to, for each core, modify the core weight value using the parameter weight value to obtain a first modified weight value, where the first modified weight value is used to represent the priority with which the core is selected to run the convolutional neural network model; and
the core determining module 705 is configured to determine, according to the first modified weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
Optionally,
the current state parameter of the terminal is the current core usage of each core; and
the weight value determining unit 702 is further configured to, for each core, determine a performance weight value from the preset second correspondence according to the core usage, where the performance weight value of each core corresponds to the core usage of that core, the performance weight value is used to represent, under the current core usage of the core, the priority with which the core is selected to run the convolutional neural network model, and the second correspondence includes a correspondence between the core usage of each core and the performance weight value of each core.
Optionally,
the obtaining unit 701 is further configured to acquire the current remaining power value of the terminal;
the weight value determining unit 702 is further configured to determine, according to the remaining power value, power consumption weight values of the at least two cores from a preset third correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the third correspondence includes a correspondence between the remaining power value and the power consumption weight values of the at least two cores, and a power consumption weight value is used to represent, under the remaining power value, the priority with which a core is selected to run the convolutional neural network model;
the modification module 704 is further configured to, for each core, modify the first modified weight value using the power consumption weight value to obtain a second modified weight value, where the second modified weight value is used to represent the priority with which the core is selected to run the convolutional neural network model; and
the core determining module 705 is further configured to determine, according to the second modified weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
Optionally,
the terminal further includes a parameter determining unit 706 and a running unit 707;
the obtaining unit 701 is further configured to acquire the current core usage of each core;
the parameter determining unit 706 is configured to, for each core, determine a performance parameter from the second correspondence according to the core usage, where the performance parameter of each core corresponds to the core usage of that core, and the second correspondence includes a correspondence between the performance parameter of each core and the core usage of each core; and
the running unit 707 is configured to, after the core determining unit determines, according to the core weight values of the at least two cores, the core for running the convolutional neural network model from the at least two cores, run the convolutional neural network model on a target core using the performance parameter of the target core, where the target core is the core for running the convolutional neural network model.
Optionally,
the performance parameter includes one or more of thread priority information, sleep time information, and thread count information;
the thread priority information is the priority level information of the child threads when the core runs the convolutional neural network model;
the sleep time information is the interval between two runs of the convolutional neural network model on the core; and
the thread count information is information on the number of threads used when the core runs the convolutional neural network model.
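The performance parameters listed above can be pictured as a small record looked up from the second correspondence by core usage. A minimal sketch, assuming an invented table: the field names, the usage ranges, and every numeric value below are hypothetical, not taken from the patent.

```python
# Hypothetical sketch of performance parameters (thread priority, sleep
# time between two model runs, thread count) selected by core usage.
from dataclasses import dataclass

@dataclass
class PerformanceParams:
    thread_priority: int   # priority level of the worker (child) threads
    sleep_time_ms: int     # interval between two consecutive model runs
    num_threads: int       # number of threads used to run the model

# Invented second correspondence: core usage (%) ranges -> performance params.
SECOND_CORRESPONDENCE = [
    ((0, 50), PerformanceParams(thread_priority=10, sleep_time_ms=0, num_threads=4)),
    ((50, 80), PerformanceParams(thread_priority=5, sleep_time_ms=20, num_threads=2)),
    ((80, 101), PerformanceParams(thread_priority=1, sleep_time_ms=100, num_threads=1)),
]

def performance_params(core_usage_percent: int) -> PerformanceParams:
    """Look up the performance parameters matching the current core usage."""
    for (low, high), params in SECOND_CORRESPONDENCE:
        if low <= core_usage_percent < high:
            return params
    raise ValueError("core usage out of range")

print(performance_params(60).num_threads)  # busy core -> fewer threads (2)
```

The intent captured here is that a heavily loaded core gets fewer, lower-priority threads and longer pauses between runs, so the model competes less with existing work.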
Optionally,
the current state parameter of the terminal is the current remaining power value of the terminal; and
the weight value determining unit 702 is further configured to determine, according to the remaining power value, power consumption weight values of the at least two cores from the preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, a power consumption weight value is used to represent, under the remaining power value, the priority with which a core is selected to run the convolutional neural network model, and the second correspondence includes a correspondence between the remaining power value and the power consumption weight values of the at least two cores.
Optionally,
the target model parameter is the number of weight parameters of the convolutional neural network model.
Optionally,
the at least two cores include at least two of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a systolic array processor.
Optionally,
the weight value determining unit 702 is further configured to: determine, in the preset first correspondence, a target model parameter interval in which the target model parameter is located; determine, in the first correspondence, core weight value intervals of the at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes a correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter; and for each core, determine a core weight value from the core weight value interval, where the position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval.
In summary, the obtaining unit 701 acquires a target model parameter, where the target model parameter is used to represent the computing density of a convolutional neural network model. Then, the weight value determining unit 702 determines, according to the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, and the at least two cores are heterogeneous cores on the terminal. The first correspondence includes a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value is used to represent the priority with which a core is selected to run the convolutional neural network model. The core determining unit 703 thus determines, according to the core weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores. The heterogeneous cores on the terminal have different characteristics, and different cores are suited to running convolutional neural network models of different computing densities.
If a first correspondence is preset, where the first correspondence includes a correspondence between the target model parameter and the core weight values of at least two cores, the target model parameter is used to represent the computing density of a convolutional neural network model, and the at least two cores are heterogeneous cores on the terminal, then after the target model parameter of a convolutional neural network model is acquired, the core weight values of the at least two cores can be determined from the preset first correspondence according to the target model parameter. A core weight value is used to represent the priority with which a core is selected to run the convolutional neural network model, so a core suited to running the convolutional neural network model can be determined from the core weight values. In this way, the core for running the convolutional neural network model can be determined from the at least two cores according to the core weight values of the at least two cores. Through the core weight values of the different cores, an adapted core can be determined to run a convolutional neural network model with a specific computing density; if cores with larger core weight values run the convolutional neural network model more efficiently, the core determined according to the core weight values can run the convolutional neural network model efficiently.
An embodiment of the present invention further provides a chip apparatus, where the chip includes a processing unit configured to perform the methods shown in FIG. 4 and FIG. 5 above.
An embodiment of the present invention further provides a chip apparatus, where the chip apparatus includes a processor and a memory. The memory includes instructions, and the processor runs the instructions to perform the methods shown in FIG. 4 and FIG. 5 above.
In this embodiment of the present invention, the chip apparatus may be a chip in the terminal, and the chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor, and the processor may be the central processing unit 680 described above. The communication unit may be, for example, an input/output interface, a pin, or a circuit, and the communication unit includes a system bus. Optionally, the chip further includes a storage unit. The storage unit may be a memory inside the chip, for example, a register, a cache, a random access memory (RAM), an EEPROM, or a FLASH; the storage unit may alternatively be a memory located outside the chip, and that memory may be any of the various types of memory 620 described above. The processor is connected to the memory and can run the instructions stored in the memory, so that the chip apparatus performs the methods shown in FIG. 4 and FIG. 5 above.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transferred from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or in a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Claims (24)

  1. A core scheduling method, comprising:
    acquiring a target model parameter, wherein the target model parameter is used to represent the computing density of a convolutional neural network model;
    determining, according to the target model parameter, core weight values of at least two cores from a preset first correspondence, wherein the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on a terminal, the first correspondence comprises a correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value is used to represent the priority with which a core is selected to run the convolutional neural network model; and
    determining, according to the core weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
  2. The method according to claim 1, wherein
    the method further comprises:
    acquiring a current state parameter of the terminal, wherein the state parameter is a dynamically changing parameter; and
    determining, according to the state parameter, parameter weight values of the at least two cores from a preset second correspondence, wherein the parameter weight values of the at least two cores correspond to the state parameter, the second correspondence comprises a correspondence between the state parameter and the parameter weight values of the at least two cores, and a parameter weight value is used to represent, under the state parameter, the priority with which a core is selected to run the convolutional neural network model; and
    the determining, according to the core weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores comprises:
    for each core, modifying the core weight value using the parameter weight value to obtain a first modified weight value, wherein the first modified weight value is used to represent the priority with which the core is selected to run the convolutional neural network model; and
    determining, according to the first modified weight values of the at least two cores, a core for running the convolutional neural network model from the at least two cores.
  3. The method according to claim 2, wherein
    the current state parameter of the terminal is the current core usage of each core; and
    the determining, according to the state parameter, parameter weight values of the at least two cores from a preset second correspondence comprises:
    for each core, determining a performance weight value from the preset second correspondence according to the core usage, wherein the performance weight value of each core corresponds to the core usage of that core, the performance weight value is used to represent, under the current core usage of the core, the priority with which the core is selected to run the convolutional neural network model, and the second correspondence comprises a correspondence between the core usage of each core and the performance weight value of each core.
  4. 根据权利要求3所述的方法,其特征在于,The method of claim 3 wherein:
    所述方法还包括:The method further includes:
    获取所述终端当前的剩余电量值;Obtaining a current remaining power value of the terminal;
    根据所述剩余电量值,从预设的第三对应关系中确定所述至少两个核心的功耗权重值,所述至少两个核心的功耗权重值和所述剩余电量值对应,所述第三对应关系包括所述剩余电量值和所述至少两个核心的功耗权重值的对应关系,所述功耗权重值用于表示在所述剩余电量值下,核心被选择以运行所述卷积神经网络模型的优先程度; Determining, according to the remaining power value, a power consumption weight value of the at least two cores from a preset third correspondence, where the power consumption weight values of the at least two cores and the remaining power value correspond to The third correspondence includes a correspondence between the remaining power value and a power consumption weight value of the at least two cores, the power consumption weight value being used to indicate that, under the remaining power value, a core is selected to run the The priority of the convolutional neural network model;
    所述根据所述至少两个核心的第一修正权重值,从所述至少两个核心中确定运行所述卷积神经网络模型的核心,包括:Determining, according to the first modified weight value of the at least two cores, a core that runs the convolutional neural network model from the at least two cores, including:
    对每一核心,使用功耗权重值修正第一修正权重值,得到第二修正权重值,所述第二修正权重值用于表示核心被选择以运行所述卷积神经网络模型的优先程度;For each core, the first modified weight value is corrected using the power consumption weight value to obtain a second modified weight value, and the second modified weight value is used to indicate the priority of the core selected to run the convolutional neural network model;
    根据所述至少两个核心的第二修正权重值,从所述至少两个核心中确定运行所述卷积神经网络模型的核心。Determining a core running the convolutional neural network model from the at least two cores according to a second modified weight value of the at least two cores.
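The selection chain of claims 2–4 (a base core weight corrected by a performance weight, then by a power consumption weight, with the highest final weight winning) can be sketched as follows. This is a minimal illustration only: the multiplicative form of the correction, the core names, and all weight values are assumptions, since the claims specify only that each weight "corrects" the previous one.

```python
# Hypothetical sketch of the weight-correction chain of claims 2-4.
# Multiplication is one plausible realization of "correcting" a weight;
# the patent does not mandate it.

def select_core(core_weights, perf_weights, power_weights):
    """Return the core with the highest second modified weight.

    core_weights  : base weight per core (first correspondence)
    perf_weights  : per-core weight derived from current core usage
    power_weights : per-core weight derived from remaining battery level
    """
    second_modified = {
        core: core_weights[core] * perf_weights[core] * power_weights[core]
        for core in core_weights
    }
    return max(second_modified, key=second_modified.get)

choice = select_core(
    {"CPU": 0.4, "GPU": 0.8, "DSP": 0.6},   # base weights for this model
    {"CPU": 0.5, "GPU": 0.9, "DSP": 0.7},   # e.g. GPU currently lightly loaded
    {"CPU": 0.9, "GPU": 0.6, "DSP": 0.8},   # e.g. low battery penalizes GPU
)
```

With these illustrative values the GPU still wins (0.8 × 0.9 × 0.6 = 0.432), but a deeper battery penalty would shift the choice to the DSP or CPU.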
  5. 根据权利要求1所述的方法,其特征在于,The method of claim 1 wherein
    所述方法还包括:The method further includes:
    获取每一核心当前的核心使用率;Get the current core usage of each core;
    对所述每一核心，根据核心使用率从第二对应关系中确定性能参数，所述每一核心的性能参数和所述每一核心的核心使用率对应，所述第二对应关系包括所述每一核心的性能参数和所述每一核心的核心使用率的对应关系；For each core, a performance parameter is determined from the second correspondence according to its core usage rate, where the performance parameter of each core corresponds to the core usage rate of that core, and the second correspondence includes the correspondence between the performance parameter and the core usage rate of each core;
    所述根据所述至少两个核心的核心权重值，从所述至少两个核心中确定运行所述卷积神经网络模型的核心之后，所述方法还包括：After the determining, according to the core weight values of the at least two cores, of a core running the convolutional neural network model from the at least two cores, the method further includes:
    在目标核心上使用所述目标核心的性能参数运行所述卷积神经网络模型,所述目标核心为所述运行所述卷积神经网络模型的核心。The convolutional neural network model is run on the target core using performance parameters of the target core, the target core being the core of the running the convolutional neural network model.
  6. 根据权利要求5所述的方法,其特征在于,The method of claim 5 wherein:
    所述性能参数包括线程优先级信息、睡眠时间信息、线程数目信息中的一种或多种;The performance parameter includes one or more of thread priority information, sleep time information, and number of threads;
    所述线程优先级信息为核心运行卷积神经网络模型时子线程的优先级别信息;The thread priority information is priority information of the child thread when the core runs the convolutional neural network model;
    所述睡眠时间信息为核心运行两个卷积神经网络模型间隔的时间；The sleep time information is the interval between two consecutive runs of the convolutional neural network model on the core;
    所述线程数目信息为核心运行卷积神经网络模型时使用的线程数目信息。The thread number information is information on the number of threads used when the core runs the convolutional neural network model.
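Claims 5 and 6 look up per-core performance parameters (thread priority, sleep time, thread count) from the second correspondence by core usage rate. A minimal sketch of such a lookup table, in which the usage bands, field names, and all values are hypothetical examples rather than figures from the patent:

```python
from dataclasses import dataclass

@dataclass
class PerfParams:
    """Illustrative container for the performance parameters of claim 6."""
    thread_priority: int   # priority of worker sub-threads during inference
    sleep_ms: int          # interval between two consecutive model runs
    num_threads: int       # worker threads used to run the model

# Second correspondence: core-usage bands -> performance parameters.
# A busier core gets fewer, lower-priority threads and longer sleeps.
usage_table = [
    (0.50, PerfParams(thread_priority=10, sleep_ms=0,   num_threads=4)),
    (0.80, PerfParams(thread_priority=5,  sleep_ms=20,  num_threads=2)),
    (1.01, PerfParams(thread_priority=1,  sleep_ms=100, num_threads=1)),
]

def lookup(usage):
    """Return the parameters of the first band whose upper bound exceeds usage."""
    for upper, params in usage_table:
        if usage < upper:
            return params
```

A call such as `lookup(0.6)` would then return the mid-band parameters, which the terminal applies when running the model on the selected target core.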
  7. 根据权利要求2所述的方法,其特征在于,The method of claim 2 wherein:
    所述终端当前的状态参数为所述终端当前的剩余电量值;The current state parameter of the terminal is a current remaining power value of the terminal;
    所述根据所述状态参数,从预设的第二对应关系中确定所述至少两个核心的参数权重值,包括:Determining, according to the state parameter, a parameter weight value of the at least two cores from a preset second correspondence, including:
    根据所述剩余电量值，从预设的第二对应关系中确定所述至少两个核心的功耗权重值，所述至少两个核心的功耗权重值和所述剩余电量值对应，所述功耗权重值用于表示在所述剩余电量值下，核心被选择以运行所述卷积神经网络模型的优先程度，所述第二对应关系包括所述剩余电量值和所述至少两个核心的功耗权重值的对应关系。Determining, according to the remaining power value, power consumption weight values of the at least two cores from the preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at the remaining power value, and the second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores.
  8. 根据权利要求1-7任一项所述的方法,其特征在于,Method according to any of claims 1-7, characterized in that
    所述目标模型参数为所述卷积神经网络模型的权重参数数量。The target model parameter is the number of weight parameters of the convolutional neural network model.
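Claim 8 uses the number of weight parameters as the target model parameter measuring the model's computational density. Under the common convention that a convolution layer with `in_ch` input channels, `out_ch` output channels, and a k×k kernel holds `in_ch · out_ch · k²` weights (biases ignored, and the layer description format below being an assumption, not part of the patent), such a count can be sketched as:

```python
def count_weight_params(layers):
    """Count weight parameters of a CNN described as (in_ch, out_ch, k)
    convolution layers -- a proxy for the 'target model parameter' of
    claim 8. Bias terms are ignored for simplicity."""
    return sum(i * o * k * k for (i, o, k) in layers)

# A toy 3-layer CNN: 3->16, 16->32, 32->64 channels, all 3x3 kernels.
n = count_weight_params([(3, 16, 3), (16, 32, 3), (32, 64, 3)])
```

The resulting count is then located within the target model parameter intervals of the first correspondence (see claim 10).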
  9. 根据权利要求1-8任一项所述的方法,其特征在于,Method according to any of claims 1-8, characterized in that
    所述至少两个核心包括中央处理器CPU、图形处理器GPU、数字信号处理器DSP、脉动阵列处理器中的至少两个。The at least two cores include at least two of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a systolic array processor.
  10. 根据权利要求1-9任一项所述的方法,其特征在于,Method according to any of claims 1-9, characterized in that
    所述根据所述目标模型参数,从预设的第一对应关系中确定至少两个核心的核心权重值,包括: Determining, according to the target model parameter, a core weight value of at least two cores from a preset first correspondence, including:
    在预设的第一对应关系中,确定所述目标模型参数所在的目标模型参数区间;Determining, in a preset first correspondence, a target model parameter interval in which the target model parameter is located;
    在所述第一对应关系中，确定至少两个核心的核心权重值区间，所述至少两个核心的核心权重值区间和所述目标模型参数区间对应，所述第一对应关系包括所述目标模型参数区间和所述至少两个核心的核心权重值区间的对应关系，所述目标模型参数区间包括所述目标模型参数；Determining, in the first correspondence, core weight value intervals of at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes the correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter;
    对每一核心,从核心权重值区间中确定核心权重值,所述核心权重值在所述核心权重值区间中的位置和所述目标模型参数在所述目标模型参数区间中的位置相同。For each core, a core weight value is determined from the core weight value interval, and the position of the core weight value in the core weight value interval is the same as the position of the target model parameter in the target model parameter interval.
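Claim 10's interval mapping places the core weight at the same relative position inside the core weight value interval as the target model parameter occupies inside its parameter interval, i.e. a linear interpolation. A sketch with hypothetical interval endpoints (the patent does not give concrete values):

```python
def map_to_weight(param, param_interval, weight_interval):
    """Map a target model parameter into a core-weight interval so that
    its relative position in the weight interval equals its relative
    position in the parameter interval (linear interpolation)."""
    p_lo, p_hi = param_interval
    w_lo, w_hi = weight_interval
    frac = (param - p_lo) / (p_hi - p_lo)   # relative position in [0, 1]
    return w_lo + frac * (w_hi - w_lo)

# A model with 30M weight parameters lies halfway through the assumed
# [20M, 40M] parameter interval, so it maps halfway into a core's
# assumed [0.4, 0.8] weight interval.
w = map_to_weight(30e6, (20e6, 40e6), (0.4, 0.8))   # ~0.6
```

Each core has its own weight interval for the same parameter interval, so a single parameter count yields a distinct weight per core.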
  11. 一种终端,其特征在于,包括:A terminal, comprising:
    获取单元,用于获取目标模型参数,所述目标模型参数用于表示一卷积神经网络模型的计算密度;An obtaining unit, configured to acquire a target model parameter, where the target model parameter is used to represent a calculated density of a convolutional neural network model;
    权重值确定单元，用于根据所述目标模型参数，从预设的第一对应关系中确定至少两个核心的核心权重值，所述至少两个核心的核心权重值与所述目标模型参数对应，所述至少两个核心为终端上的异构核心，所述第一对应关系包括所述目标模型参数和所述至少两个核心的核心权重值的对应关系，核心权重值用于表示核心被选择以运行所述卷积神经网络模型的优先程度；a weight value determining unit, configured to determine, according to the target model parameter, core weight values of at least two cores from a preset first correspondence, where the core weight values of the at least two cores correspond to the target model parameter, the at least two cores are heterogeneous cores on the terminal, the first correspondence includes the correspondence between the target model parameter and the core weight values of the at least two cores, and a core weight value indicates the priority with which a core is selected to run the convolutional neural network model;
    核心确定单元,用于根据所述至少两个核心的核心权重值,从所述至少两个核心中确定运行所述卷积神经网络模型的核心。And a core determining unit, configured to determine, according to the core weight values of the at least two cores, a core that runs the convolutional neural network model from the at least two cores.
  12. 根据权利要求11所述的终端,其特征在于,The terminal of claim 11 wherein:
    所述获取单元,还用于获取所述终端当前的状态参数,所述状态参数为动态变化的参数;The acquiring unit is further configured to acquire a current state parameter of the terminal, where the state parameter is a dynamically changing parameter;
    所述权重值确定单元，还用于根据所述状态参数，从预设的第二对应关系中确定所述至少两个核心的参数权重值，所述至少两个核心的参数权重值和所述状态参数对应，所述第二对应关系包括所述状态参数和所述至少两个核心的参数权重值的对应关系，参数权重值用于表示在所述状态参数下，核心被选择以运行所述卷积神经网络模型的优先程度；The weight value determining unit is further configured to determine, according to the state parameter, parameter weight values of the at least two cores from a preset second correspondence, where the parameter weight values of the at least two cores correspond to the state parameter, the second correspondence includes the correspondence between the state parameter and the parameter weight values of the at least two cores, and a parameter weight value indicates the priority with which a core is selected to run the convolutional neural network model under the state parameter;
    所述核心确定单元,包括修正模块和核心确定模块;The core determining unit includes a correction module and a core determining module;
    所述修正模块，用于对每一核心，使用参数权重值修正核心权重值，得到第一修正权重值，所述第一修正权重值用于表示核心被选择以运行所述卷积神经网络模型的优先程度；The correction module is configured to, for each core, correct the core weight value using the parameter weight value to obtain a first modified weight value, where the first modified weight value indicates the priority with which the core is selected to run the convolutional neural network model;
    所述核心确定模块,用于根据所述至少两个核心的第一修正权重值,从所述至少两个核心中确定运行所述卷积神经网络模型的核心。The core determining module is configured to determine, according to the first modified weight value of the at least two cores, a core that runs the convolutional neural network model from the at least two cores.
  13. 根据权利要求12所述的终端,其特征在于,The terminal according to claim 12, characterized in that
    所述终端当前的状态参数为每一核心当前的核心使用率;The current state parameter of the terminal is the current core usage rate of each core;
    所述权重值确定单元，还用于对所述每一核心，根据核心使用率从预设的第二对应关系中确定性能权重值，所述每一核心的性能权重值和所述每一核心的核心使用率对应，所述性能权重值用于表示在核心当前的核心使用率下，核心被选择以运行所述卷积神经网络模型的优先程度，所述第二对应关系包括所述每一核心的核心使用率和所述每一核心的性能权重值的对应关系。The weight value determining unit is further configured to determine, for each core, a performance weight value from the preset second correspondence according to its core usage rate, where the performance weight value of each core corresponds to the core usage rate of that core, the performance weight value indicates the priority with which the core is selected to run the convolutional neural network model at the core's current usage rate, and the second correspondence includes the correspondence between the core usage rate and the performance weight value of each core.
  14. 根据权利要求13所述的终端,其特征在于, The terminal of claim 13 wherein:
    所述获取单元,还用于获取所述终端当前的剩余电量值;The obtaining unit is further configured to acquire a current remaining power value of the terminal;
    所述权重值确定单元，还用于根据所述剩余电量值，从预设的第三对应关系中确定所述至少两个核心的功耗权重值，所述至少两个核心的功耗权重值和所述剩余电量值对应，所述第三对应关系包括所述剩余电量值和所述至少两个核心的功耗权重值的对应关系，所述功耗权重值用于表示在所述剩余电量值下，核心被选择以运行所述卷积神经网络模型的优先程度；The weight value determining unit is further configured to determine, according to the remaining power value, power consumption weight values of the at least two cores from a preset third correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the third correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores, and a power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at the remaining power value;
    所述修正模块，还用于对每一核心，使用功耗权重值修正第一修正权重值，得到第二修正权重值，所述第二修正权重值用于表示核心被选择以运行所述卷积神经网络模型的优先程度；The correction module is further configured to, for each core, correct the first modified weight value using the power consumption weight value to obtain a second modified weight value, where the second modified weight value indicates the priority with which the core is selected to run the convolutional neural network model;
    所述核心确定模块,还用于根据所述至少两个核心的第二修正权重值,从所述至少两个核心中确定运行所述卷积神经网络模型的核心。The core determining module is further configured to determine, according to the second modified weight value of the at least two cores, a core that runs the convolutional neural network model from the at least two cores.
  15. 根据权利要求11所述的终端,其特征在于,The terminal of claim 11 wherein:
    所述终端还包括参数确定单元和运行单元;The terminal further includes a parameter determining unit and an operating unit;
    所述获取单元,还用于获取每一核心当前的核心使用率;The obtaining unit is further configured to obtain a current core usage rate of each core;
    所述参数确定单元，用于对所述每一核心，根据核心使用率从第二对应关系中确定性能参数，所述每一核心的性能参数和所述每一核心的核心使用率对应，所述第二对应关系包括所述每一核心的性能参数和所述每一核心的核心使用率的对应关系；The parameter determining unit is configured to determine, for each core, a performance parameter from the second correspondence according to its core usage rate, where the performance parameter of each core corresponds to the core usage rate of that core, and the second correspondence includes the correspondence between the performance parameter and the core usage rate of each core;
    所述运行单元，用于在所述核心确定单元根据所述至少两个核心的核心权重值，从所述至少两个核心中确定运行所述卷积神经网络模型的核心之后，在目标核心上使用所述目标核心的性能参数运行所述卷积神经网络模型，所述目标核心为所述运行所述卷积神经网络模型的核心。The running unit is configured to, after the core determining unit determines, according to the core weight values of the at least two cores, a core running the convolutional neural network model from the at least two cores, run the convolutional neural network model on a target core using the performance parameters of the target core, the target core being the determined core running the convolutional neural network model.
  16. 根据权利要求15所述的终端,其特征在于,The terminal of claim 15 wherein:
    所述性能参数包括线程优先级信息、睡眠时间信息、线程数目信息中的一种或多种;The performance parameter includes one or more of thread priority information, sleep time information, and number of threads;
    所述线程优先级信息为核心运行卷积神经网络模型时子线程的优先级别信息;The thread priority information is priority information of the child thread when the core runs the convolutional neural network model;
    所述睡眠时间信息为核心运行两个卷积神经网络模型间隔的时间；The sleep time information is the interval between two consecutive runs of the convolutional neural network model on the core;
    所述线程数目信息为核心运行卷积神经网络模型时使用的线程数目信息。The thread number information is information on the number of threads used when the core runs the convolutional neural network model.
  17. 根据权利要求12所述的终端,其特征在于,The terminal according to claim 12, characterized in that
    所述终端当前的状态参数为所述终端当前的剩余电量值;The current state parameter of the terminal is a current remaining power value of the terminal;
    所述权重值确定单元，还用于根据所述剩余电量值，从预设的第二对应关系中确定所述至少两个核心的功耗权重值，所述至少两个核心的功耗权重值和所述剩余电量值对应，所述功耗权重值用于表示在所述剩余电量值下，核心被选择以运行所述卷积神经网络模型的优先程度，所述第二对应关系包括所述剩余电量值和所述至少两个核心的功耗权重值的对应关系。The weight value determining unit is further configured to determine, according to the remaining power value, power consumption weight values of the at least two cores from a preset second correspondence, where the power consumption weight values of the at least two cores correspond to the remaining power value, the power consumption weight value indicates the priority with which a core is selected to run the convolutional neural network model at the remaining power value, and the second correspondence includes the correspondence between the remaining power value and the power consumption weight values of the at least two cores.
  18. 根据权利要求11-17任一项所述的终端,其特征在于,A terminal according to any one of claims 11-17, characterized in that
    所述目标模型参数为所述卷积神经网络模型的权重参数数量。The target model parameter is the number of weight parameters of the convolutional neural network model.
  19. 根据权利要求11-18任一项所述的终端,其特征在于,A terminal according to any one of claims 11-18, characterized in that
    所述至少两个核心包括中央处理器CPU、图形处理器GPU、数字信号处理器DSP、脉动阵列处理器中的至少两个。The at least two cores include at least two of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), and a systolic array processor.
  20. 根据权利要求11-19任一项所述的终端,其特征在于,A terminal according to any one of claims 11 to 19, characterized in that
    所述权重值确定单元，还用于在预设的第一对应关系中，确定所述目标模型参数所在的目标模型参数区间；在所述第一对应关系中，确定至少两个核心的核心权重值区间，所述至少两个核心的核心权重值区间和所述目标模型参数区间对应，所述第一对应关系包括所述目标模型参数区间和所述至少两个核心的核心权重值区间的对应关系，所述目标模型参数区间包括所述目标模型参数；对每一核心，从核心权重值区间中确定核心权重值，所述核心权重值在所述核心权重值区间中的位置和所述目标模型参数在所述目标模型参数区间中的位置相同。The weight value determining unit is further configured to: determine, in a preset first correspondence, a target model parameter interval in which the target model parameter is located; determine, in the first correspondence, core weight value intervals of at least two cores, where the core weight value intervals of the at least two cores correspond to the target model parameter interval, the first correspondence includes the correspondence between the target model parameter interval and the core weight value intervals of the at least two cores, and the target model parameter interval includes the target model parameter; and for each core, determine a core weight value from its core weight value interval, where the position of the core weight value within the core weight value interval is the same as the position of the target model parameter within the target model parameter interval.
  21. 一种计算机可读存储介质,包括指令,当其在计算机上运行时,使得计算机执行如权利要求1-10任意一项所述的方法。A computer readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any of claims 1-10.
  22. 一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行如权利要求1-10任意一项所述的方法。A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any of claims 1-10.
  23. 一种终端,其特征在于,包括:A terminal, comprising:
    处理器和存储器;Processor and memory;
    通过调用所述存储器存储的操作指令,所述处理器被设置使得所述终端执行如权利要求1-10任意一项所述的方法。The processor is arranged to cause the terminal to perform the method of any of claims 1-10 by invoking an operational instruction stored by the memory.
  24. 一种芯片装置,其特征在于,所述装置包括处理单元;A chip device, characterized in that the device comprises a processing unit;
    其中,所述处理单元,用于执行如权利要求1-10任一项所述的方法。 The processing unit is configured to perform the method according to any one of claims 1-10.
PCT/CN2017/107614 2017-10-25 2017-10-25 Core scheduling method and terminal WO2019079994A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/107614 WO2019079994A1 (en) 2017-10-25 2017-10-25 Core scheduling method and terminal
CN201780064697.0A CN109937410B (en) 2017-10-25 2017-10-25 Core scheduling method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/107614 WO2019079994A1 (en) 2017-10-25 2017-10-25 Core scheduling method and terminal

Publications (1)

Publication Number Publication Date
WO2019079994A1 true WO2019079994A1 (en) 2019-05-02

Family

ID=66247167

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/107614 WO2019079994A1 (en) 2017-10-25 2017-10-25 Core scheduling method and terminal

Country Status (2)

Country Link
CN (1) CN109937410B (en)
WO (1) WO2019079994A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442919B (en) * 2019-07-12 2022-12-27 西安空间无线电技术研究所 Microwave component micro-discharge numerical simulation method based on GPU (graphics processing Unit) architecture
TWI819480B (en) * 2022-01-27 2023-10-21 緯創資通股份有限公司 Acceleration system and dynamic configuration method thereof
CN114237859B (en) * 2022-02-25 2022-05-13 中瓴智行(成都)科技有限公司 Distributed intelligent terminal GPU (graphics processing Unit) computing power improving method, terminal, system and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012155010A1 (en) * 2011-05-11 2012-11-15 Advanced Micro Devices, Inc. Automatic load balancing for heterogeneous cores
CN103119580A (en) * 2010-09-25 2013-05-22 英特尔公司 Application scheduling in heterogeneous multiprocessor computing platforms
CN103443769A (en) * 2011-03-11 2013-12-11 英特尔公司 Dynamic core selection for heterogeneous multi-core systems
CN105224502A (en) * 2015-09-28 2016-01-06 浪潮(北京)电子信息产业有限公司 GPU-based deep learning method and system
CN106201651A (en) * 2016-06-27 2016-12-07 鄞州浙江清华长三角研究院创新中心 The simulator of neuromorphic chip

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289436B (en) * 2010-06-18 2013-12-25 阿里巴巴集团控股有限公司 Method and device for determining weighted value of search term and method and device for generating search results
CN103299277B (en) * 2011-12-31 2016-11-09 华为技术有限公司 GPU system and processing method thereof
EP3039544B1 (en) * 2013-10-03 2018-12-12 Huawei Technologies Co., Ltd. Method and system for assigning a computational block of a software program to cores of a multi-processor system
CN103645954B (en) * 2013-11-21 2018-12-14 华为技术有限公司 CPU scheduling method, apparatus and system based on a heterogeneous multi-core system
EP3035204B1 (en) * 2014-12-19 2018-08-15 Intel Corporation Storage device and method for performing convolution operations
US10234930B2 (en) * 2015-02-13 2019-03-19 Intel Corporation Performing power management in a multicore processor
US9766673B2 (en) * 2015-02-27 2017-09-19 Intel Corporation Supercapacitor-based power supply protection for multi-node systems
CN105930902B (en) * 2016-04-18 2018-08-10 中国科学院计算技术研究所 Neural network processing method and system

Also Published As

Publication number Publication date
CN109937410B (en) 2021-02-23
CN109937410A (en) 2019-06-25


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17929744

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17929744

Country of ref document: EP

Kind code of ref document: A1