CN109978129A - Scheduling method and related apparatus

Info

Publication number
CN109978129A
CN109978129A (application CN201711467704.XA)
Authority
CN
China
Prior art keywords
computing device
target
network model
server
circuit
Prior art date
Legal status: Granted (the status listed is an assumption and is not a legal conclusion)
Application number
CN201711467704.XA
Other languages: Chinese (zh)
Other versions: CN109978129B (en)
Inventor
Inventor not disclosed
Current Assignee
Cambricon Technologies Corp Ltd
Beijing Zhongke Cambrian Technology Co Ltd
Original Assignee
Beijing Zhongke Cambrian Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zhongke Cambrian Technology Co Ltd
Priority to CN201711467704.XA (granted as CN109978129B)
Priority to US16/767,415 (granted as US11568269B2)
Priority to EP18895350.9A (granted as EP3731089B1)
Priority to PCT/CN2018/098324 (published as WO2019128230A1)
Publication of CN109978129A
Application granted; publication of CN109978129B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)

Abstract

Embodiments of the present application disclose a scheduling method and related apparatus. The method is applied to a server comprising multiple computing devices and includes: receiving multiple operation requests; if the multiple operation requests correspond to a single target neural network model, selecting from the multiple computing devices a target parallel computing device corresponding to that model; computing the multiple operation requests in parallel on the target parallel computing device to obtain multiple final operation results; and sending each final operation result among the multiple final operation results to the corresponding electronic device. By selecting a target parallel computing device in the server to process multiple operation requests in parallel, the embodiments improve the operational efficiency of the server.

Description

Scheduling method and related apparatus
Technical field
This application relates to the field of computer technology, and in particular to a scheduling method and related apparatus.
Background
Neural networks are the basis of many current artificial intelligence applications. As the range of application of neural networks expands further, servers or cloud computing services are used to store various neural network models and to perform computation for the operation requests submitted by users. Faced with numerous neural network models and large batches of requests, how to improve the operational efficiency of the server is a technical problem to be solved by those skilled in the art.
Summary of the invention
Embodiments of the present application propose a scheduling method and related apparatus that can select computing devices in a server to execute multiple operation requests, improving the operational efficiency of the server.
In a first aspect, an embodiment of the present application provides a scheduling method. The method is applied to a server comprising multiple computing devices and includes:
receiving multiple operation requests;
if the multiple operation requests correspond to a single target neural network model, selecting from the multiple computing devices a target parallel computing device corresponding to the target neural network model;
performing parallel computation on the multiple operation requests with the target parallel computing device to obtain multiple final operation results;
sending each final operation result among the multiple final operation results to the corresponding electronic device.
In a second aspect, an embodiment of the present application provides a server comprising multiple computing devices, including:
a receiving unit for receiving multiple operation requests;
a scheduling unit for selecting, if the multiple operation requests correspond to a single target neural network model, a target parallel computing device corresponding to the target neural network model from the multiple computing devices;
an arithmetic unit for performing parallel computation on the multiple operation requests with the target parallel computing device to obtain multiple final operation results;
a transmission unit for sending each final operation result among the multiple final operation results to the corresponding electronic device.
In a third aspect, an embodiment of the present application provides another server, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for some or all of the steps described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to perform the method of the first aspect.
With the above scheduling method and related apparatus, if a server holds multiple operation requests that all target the same neural network model, a target parallel computing device corresponding to that target neural network model can be selected from the multiple computing devices. The operation data corresponding to the multiple requests are then batch-processed on the target parallel computing device, the final operation results of the computation are separated to obtain the final operation result corresponding to each request, and each result is sent to the corresponding electronic device. This avoids the time wasted by repeatedly invoking the target neural network model, thereby improving the overall operational efficiency of the server.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
In the drawings:
Fig. 1 is a schematic structural diagram of a server provided by an embodiment of the present application;
Fig. 1a is a schematic structural diagram of a computing unit provided by an embodiment of the present application;
Fig. 1b is a schematic structural diagram of a main processing circuit provided by an embodiment of the present application;
Fig. 1c is a schematic diagram of data distribution in a computing unit provided by an embodiment of the present application;
Fig. 1d is a schematic diagram of data return in a computing unit provided by an embodiment of the present application;
Fig. 1e is a schematic diagram of the operation of a neural network structure provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of a scheduling method provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of another scheduling method provided by an embodiment of the present application;
Fig. 4 is a schematic structural diagram of another server provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of yet another server provided by an embodiment of the present application.
Detailed description of embodiments
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings in the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the application without creative effort shall fall within the protection scope of the application.
It should be understood that, when used in this specification and the appended claims, the terms "include" and "comprise" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
The embodiments of the present application propose a scheduling method and related apparatus that can select computing devices in a server to execute multiple operation requests, improving the operational efficiency of the server. The application is described in further detail below with reference to specific embodiments and the accompanying drawings.
Please refer to Fig. 1, a schematic structural diagram of a server provided by an embodiment of the present application. As shown in Fig. 1, the server includes multiple computing devices; a computing device includes but is not limited to a server computer, and may also be a personal computer (PC), a network PC, a minicomputer, a mainframe computer, etc.
In this application, the computing devices included in the server establish wired or wireless connections with one another to transmit data, and each computing device includes at least one computing carrier, for example: a central processing unit (CPU), a graphics processing unit (GPU), a processor board, etc. The server involved in this application may also be a cloud server that provides cloud computing services for electronic devices.
Each computing carrier may include at least one computing unit used for neural network operations, such as a processing chip. The specific structure of the computing unit is not limited; please refer to Fig. 1a, a schematic structural diagram of a computing unit. As shown in Fig. 1a, the computing unit includes a main processing circuit, basic processing circuits and branch processing circuits. Specifically, the main processing circuit is connected to the branch processing circuits, and each branch processing circuit is connected to at least one basic processing circuit.
The branch processing circuits are configured to transmit and receive data from the main processing circuit or the basic processing circuits.
Referring to Fig. 1b, a schematic structural diagram of the main processing circuit: the main processing circuit may include a register and/or an on-chip buffer circuit, and may further include a control circuit, a vector operator circuit, an ALU (arithmetic and logic unit) circuit, an accumulator circuit, a DMA (direct memory access) circuit and the like. In practical applications, the main processing circuit may also add other circuits such as a conversion circuit (e.g., a matrix transposition circuit), a data rearrangement circuit or an activation circuit.
The main processing circuit further includes a data transmitting circuit and a data receiving circuit or interface. A data distribution circuit and a data broadcasting circuit may be integrated into the data transmitting circuit; in practical applications, they may also be provided separately. The data transmitting circuit and the data receiving circuit may likewise be integrated into a single data transceiving circuit. Broadcast data are data that need to be sent to every basic processing circuit. Distribution data are data that need to be selectively sent to some of the basic processing circuits; the specific selection may be determined by the main processing circuit according to its load and the computation involved. In the broadcast mode, the broadcast data are sent to each basic processing circuit in broadcast form (in practical applications, the broadcast data may be sent to each basic processing circuit in a single broadcast or in several broadcasts; the specific embodiments of this application do not limit the number of broadcasts). In the distribution mode, the distribution data are selectively sent to some of the basic processing circuits.
When distributing data, the control circuit of the main processing circuit transmits data to some or all of the basic processing circuits; the data may be identical or different. Specifically, if data are sent by distribution, the data received by each receiving basic processing circuit may differ, while some basic processing circuits may also receive identical data.
Specifically, when broadcasting data, the control circuit of the main processing circuit transmits data to some or all of the basic processing circuits, and every basic processing circuit that receives data receives identical data. That is, broadcast data may include data that all basic processing circuits need to receive, whereas distribution data may include data that only some basic processing circuits need to receive. The main processing circuit may send the broadcast data to all branch processing circuits via one or more broadcasts, and the branch processing circuits forward the broadcast data to all basic processing circuits.
Optionally, the vector operator circuit of the main processing circuit can perform vector operations, including but not limited to: addition, subtraction, multiplication and division of two vectors; addition, subtraction, multiplication and division of a vector and a constant; or arbitrary operations on each element of a vector. Continuous operations may specifically be vector-and-constant addition, subtraction, multiplication or division, activation operations, accumulation operations, etc.
Each basic processing circuit may include a basic register and/or a basic on-chip buffer circuit, and may further include one of, or any combination of, an inner product operator circuit, a vector operator circuit, an accumulator circuit and the like. The inner product operator circuit, vector operator circuit and accumulator circuit may be integrated circuits, or they may be circuits provided separately.
The connection structure between the branch processing circuits and the basic circuits can be arbitrary and is not limited to the H-type structure of Fig. 1b. Optionally, data flow from the main processing circuit to the basic circuits by broadcast or distribution, and from the basic circuits back to the main processing circuit by gathering (gather). Broadcast, distribution and gather are defined as follows:
The data transfer modes from the main processing circuit to the basic circuits may include the following:
The main processing circuit is connected to multiple branch processing circuits, and each branch processing circuit is in turn connected to multiple basic circuits.
The main processing circuit is connected to one branch processing circuit, which reconnects to a further branch processing circuit, and so on, so that multiple branch processing circuits are connected in series; each branch processing circuit is then connected to multiple basic circuits.
The main processing circuit is connected to multiple branch processing circuits, and each branch processing circuit is in turn connected in series to multiple basic circuits.
The main processing circuit is connected to one branch processing circuit, which reconnects to a further branch processing circuit, and so on, so that multiple branch processing circuits are connected in series; each branch processing circuit is then connected in series to multiple basic circuits.
When distributing data, the main processing circuit transmits data to some or all of the basic circuits, and the data received by each receiving basic circuit may differ;
when broadcasting data, the main processing circuit transmits data to some or all of the basic circuits, and each receiving basic circuit receives identical data;
when gathering data, some or all of the basic circuits transmit data to the main processing circuit. It should be noted that the computing unit shown in Fig. 1a may be a standalone physical chip; in practical applications, it may also be integrated into another chip (such as a CPU or GPU), and the specific embodiments of this application do not limit the physical form of the chip device.
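For illustration, the three transfer modes can be modeled in a short Python sketch. The BasicCircuit class, its fields, and the chunk-per-circuit pairing are assumptions made for this sketch; the patent defines the modes at the circuit level only.

```python
# A minimal sketch of broadcast, distribute and gather between a main
# processing circuit and its basic circuits; class and field names are
# illustrative assumptions.
from typing import Any, List

class BasicCircuit:
    def __init__(self) -> None:
        self.data: Any = None     # data received from the main processing circuit
        self.result: Any = None   # partial result to be gathered back

def broadcast(circuits: List[BasicCircuit], data: Any) -> None:
    # Broadcast: every basic circuit receives identical data.
    for c in circuits:
        c.data = data

def distribute(circuits: List[BasicCircuit], chunks: List[Any]) -> None:
    # Distribute: selected basic circuits receive possibly different chunks.
    for c, chunk in zip(circuits, chunks):
        c.data = chunk

def gather(circuits: List[BasicCircuit]) -> List[Any]:
    # Gather: some or all basic circuits return their results to the main circuit.
    return [c.result for c in circuits if c.result is not None]
```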
Referring to Fig. 1c, a schematic diagram of data distribution in a computing unit: the arrows in Fig. 1c indicate the direction of data distribution. As shown in Fig. 1c, after the main processing circuit receives external data, the data is split and distributed to multiple branch processing circuits, and each branch processing circuit sends the split data to the basic processing circuits.
Referring to Fig. 1d, a schematic diagram of data return in a computing unit: the arrows in Fig. 1d indicate the direction of data return. As shown in Fig. 1d, the basic processing circuits return data (e.g., inner product results) to the branch processing circuits, and the branch processing circuits return the data to the main processing circuit.
The input data may specifically be vectors, matrices or multi-dimensional (three-dimensional, four-dimensional or higher) data; a specific value in the input data may be referred to as an element of the input data.
An embodiment of the present disclosure also provides a calculation method using the computing unit shown in Fig. 1a. The calculation method is applied to neural network computation; specifically, the computing unit may be used to perform operations on the input data and weight data of one or more layers of a multi-layer neural network.
Specifically, the computing unit described above is used to perform operations on the input data and weight data of one or more layers of a trained multi-layer neural network;
or the computing unit is used to perform operations on the input data and weight data of one or more layers of a multi-layer neural network in forward operation.
The operations include but are not limited to one of, or any combination of: convolution, matrix-matrix multiplication, matrix-vector multiplication, bias, fully connected operation, GEMM, GEMV, and activation.
GEMM refers to the matrix-matrix multiplication operation in the BLAS library, usually expressed as: C = alpha * op(S) * op(P) + beta * C, where S and P are the two input matrices, C is the output matrix, alpha and beta are scalars, and op represents some operation on matrix S or P. In addition, some auxiliary integers serve as parameters describing the width and height of matrices S and P;
GEMV refers to the matrix-vector multiplication operation in the BLAS library, usually expressed as: C = alpha * op(S) * P + beta * C, where S is the input matrix, P is the input vector, C is the output vector, alpha and beta are scalars, and op represents some operation on matrix S.
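For concreteness, the two BLAS forms can be written as a minimal NumPy sketch; taking op to be optional transposition is an assumption for illustration, transposition being one common choice of op in BLAS.

```python
# A minimal NumPy sketch of the GEMM and GEMV forms above; op() is modeled
# as optional transposition, which is an illustrative assumption.
import numpy as np

def gemm(alpha, S, P, beta, C, trans_s=False, trans_p=False):
    """C = alpha * op(S) @ op(P) + beta * C"""
    S = S.T if trans_s else S
    P = P.T if trans_p else P
    return alpha * (S @ P) + beta * C

def gemv(alpha, S, p, beta, c, trans_s=False):
    """c = alpha * op(S) @ p + beta * c"""
    S = S.T if trans_s else S
    return alpha * (S @ p) + beta * c

S = np.arange(6.0).reshape(2, 3)
P = np.ones((3, 2))
print(gemm(2.0, S, P, 0.5, np.zeros((2, 2))))      # 2*S@P + 0.5*0
print(gemv(1.0, S, np.ones(3), 0.0, np.zeros(2)))  # row sums of S
```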
This application does not limit the connection relationships between the computing carriers in a computing device, which may be homogeneous or heterogeneous, nor the connection relationships between the computing units in a computing carrier. Executing parallel tasks on such heterogeneous computing carriers or computing units can improve operational efficiency.
The computing device shown in Fig. 1 includes at least one computing carrier, and a computing carrier in turn includes at least one computing unit. That is, which target computing device is selected in this application depends on the connection relationships between the computing devices, on the support provided by the specific physical hardware in each computing device, such as the deployed neural network models and the network resources, and on the attribute information of the operation request. Computing carriers of the same type can be deployed in the same computing device; for example, the computing carriers used for forward propagation can be deployed on the same computing device instead of on different computing devices, which effectively reduces the overhead of communication between computing devices and helps improve operational efficiency. A specific neural network model can also be deployed on a specific computing carrier: when the server receives an operation request for the specified neural network, it simply calls the computing carrier corresponding to that network to execute the request, saving the time of determining how to process the task and improving operational efficiency.
In this application, publicly available and widely used neural network models serve as the specified neural network models (for example, among convolutional neural networks (CNN): LeNet, AlexNet, ZFNet, GoogLeNet, VGG, ResNet).
Optionally, the operation demand of each specified neural network model in a specified neural network model set and the hardware attributes of each computing device among the multiple computing devices are obtained, yielding multiple operation demands and multiple hardware attributes; according to the multiple operation demands and the multiple hardware attributes, each specified neural network model in the set is deployed on its corresponding specified computing device.
The specified neural network model set includes multiple specified neural network models. The hardware attributes of a computing device include its own network bandwidth, memory capacity, processor clock frequency, etc., and also include the hardware attributes of the computing carriers or computing units in the computing device. That is, selecting the computing device corresponding to the operation demand of each specified neural network model according to the hardware attributes of the computing devices can avoid server failures caused by untimely processing and improves the server's capacity to support operations.
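As an illustration of this matching step, the sketch below pairs each model's operation demand with a device whose hardware attributes cover it. The demand and attribute fields, and the first-fit rule, are assumptions, since the application does not fix a matching policy.

```python
# A minimal first-fit sketch of deploying specified models onto computing
# devices by operation demand vs. hardware attributes; field names are assumed.
def deploy(models, devices):
    plan = {}
    for m in models:
        for d in devices:
            # A device fits if its memory and bandwidth cover the model's demand.
            if (d["memory_gb"] >= m["memory_gb"]
                    and d["bandwidth_gbps"] >= m["bandwidth_gbps"]):
                plan[m["name"]] = d["id"]
                break
    return plan

models = [{"name": "VGG", "memory_gb": 8, "bandwidth_gbps": 10}]
devices = [{"id": "dev1", "memory_gb": 4, "bandwidth_gbps": 10},
           {"id": "dev2", "memory_gb": 16, "bandwidth_gbps": 25}]
print(deploy(models, devices))   # {'VGG': 'dev2'}
```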
The input neurons and output neurons mentioned in this application do not refer to the neurons in the input layer and the output layer of the entire neural network. For any two adjacent layers in the network, the neurons in the lower layer of the network feed-forward operation are the input neurons, and the neurons in the upper layer of the network feed-forward operation are the output neurons. Taking a convolutional neural network as an example: suppose a convolutional neural network has L layers, and K = 1, 2, ..., L-1. For layer K and layer K+1, layer K is called the input layer and its neurons are the input neurons; layer K+1 is called the output layer and its neurons are the output neurons. That is, except for the top layer, each layer can serve as an input layer, and the next layer is the corresponding output layer.
The operations mentioned above are all operations of a single layer of the neural network. For a multi-layer neural network, the implementation proceeds as shown in Fig. 1e, where dashed arrows indicate the backward operation and solid arrows indicate the forward operation. In the forward operation, once the execution of the previous layer of the artificial neural network is complete, the output neurons obtained by that layer serve as the input neurons of the next layer for computation (or certain operations are performed on those output neurons before they become the next layer's input neurons), and the weights are likewise replaced by the next layer's weights. In the backward operation, once the backward operation of the previous layer is complete, the input-neuron gradients obtained by that layer serve as the output-neuron gradients of the next layer for computation (or certain operations are performed on those input-neuron gradients before they become the next layer's output-neuron gradients), and the weights are likewise replaced by the next layer's weights.
The forward operation of a neural network is the computation from input data to final output data. The backward operation propagates in the direction opposite to the forward operation: starting from the loss between the final output data and the expected output data, or from the corresponding loss function, it computes backward along the reverse of the forward operation. By cycling through forward and backward operations in this way and correcting each layer's weights by gradient descent on the loss or loss function, the layer weights are adjusted; this is the learning and training process of the neural network, and it reduces the loss of the network's output.
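One forward/backward cycle of this training process can be sketched as follows; the fully connected layers, the ReLU standing in for the "certain operations", and the plain gradient-descent update are illustrative assumptions rather than the patent's computing-unit implementation.

```python
# A minimal sketch of one forward pass and one backward pass over a stack of
# fully connected layers with ReLU; the update rule is plain gradient descent.
import numpy as np

def forward(layers, x):
    activations = [x]
    for W in layers:
        x = np.maximum(0.0, x @ W)   # this layer's output feeds the next layer
        activations.append(x)
    return activations

def backward(layers, activations, grad_out, lr=0.01):
    # Gradients flow in the direction opposite to the forward operation.
    for i in reversed(range(len(layers))):
        grad_out = grad_out * (activations[i + 1] > 0)   # ReLU gradient
        grad_W = activations[i].T @ grad_out
        grad_out = grad_out @ layers[i].T   # input-neuron gradient for the layer below
        layers[i] -= lr * grad_W            # weight correction from the loss gradient

rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 8)), rng.standard_normal((8, 2))]
acts = forward(layers, rng.standard_normal((5, 4)))
backward(layers, acts, grad_out=acts[-1])   # gradient of 0.5*||output||^2
```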
Please refer to Fig. 2, a schematic flowchart of a scheduling method provided by an embodiment of the present application. As shown in Fig. 2, the method is applied to the server shown in Fig. 1 and involves the electronic devices that are allowed to access the server. An electronic device may include various handheld devices with wireless communication capability, in-vehicle devices, wearable devices, computing devices or other processing devices connected to a wireless modem, as well as various forms of user equipment (UE), mobile stations (MS), terminal devices, etc.
201: Receive multiple operation requests.
In this application, the server receives the multiple operation requests sent by the electronic devices allowed to access it. Neither the number of electronic devices nor the number of operation requests sent by each electronic device is limited; that is, the multiple operation requests may be sent by one electronic device or by multiple electronic devices.
An operation request includes attribute information such as the processing task (a training task or a test task) and the target neural network model involved in the operation. A training task is used to train the target neural network model, i.e., to perform forward and backward operations on the model until training is complete; a test task is used to perform one forward operation according to the target neural network model.
The target neural network model may be a neural network model uploaded by the user when the operation request is sent through the electronic device, or a neural network model stored in the server, etc. The application likewise does not limit the number of target neural network models; that is, each operation request may correspond to at least one target neural network model.
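For illustration, an operation request's attribute information might be represented as follows; the field names and types are assumptions, since the application specifies only that the processing task and the target neural network model are carried.

```python
# A minimal sketch of an operation request; field names are illustrative.
from dataclasses import dataclass
from typing import Any

@dataclass
class OperationRequest:
    request_id: str
    sender: str        # electronic device that sent the request
    task: str          # "training" or "test"
    model_name: str    # target neural network model, e.g. "ResNet"
    data: Any          # image data, speech data, or a training set

req = OperationRequest("r1", "device-42", "test", "ResNet", data=[[0.1, 0.2]])
```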
202: If the multiple operation requests correspond to a single target neural network model, select from the multiple computing devices a target parallel computing device corresponding to the target neural network model.
That is, if there are multiple operation requests and they all target the same neural network model, the target parallel computing device corresponding to that target neural network model can be selected from the multiple computing devices, so that the operation data corresponding to the multiple operation requests can be computed concurrently on the target parallel computing device, avoiding the time wasted by repeatedly invoking the target neural network model.
Optionally, if the processing task of a target operation request is a test task, computing devices that include the forward operation of the target neural network model corresponding to the processing task are selected from the multiple computing devices, yielding multiple target computing devices; if the processing task of the target operation request is a training task, computing devices that include the forward operation and backward training of the target neural network model are selected from the multiple computing devices, yielding the multiple target computing devices; the target parallel computing device is then chosen from the multiple target computing devices.
Here, the target operation request is any one of the multiple operation requests. That is, if the processing task of the target operation request is a test task, a target computing device is a computing device usable for the forward operation of the target neural network model; when the processing task is a training task, a target computing device is a computing device usable for the forward operation and backward training of the target neural network model. The target parallel computing device is then chosen from the multiple target computing devices; handling operation requests with dedicated computing devices improves both the accuracy and the efficiency of the computation.
For example, suppose the server includes a first computing device and a second computing device, where the first computing device only supports the forward operation of a specified neural network model, while the second computing device can execute both the forward operation and the backward training of that specified neural network model. When the target neural network model in a received target operation request is the specified neural network model and the processing task is a test task, the first computing device is determined to be a target computing device.
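The selection rule of this example can be sketched as a filter over per-device capability records; the record layout is an assumption for illustration.

```python
# A minimal sketch of selecting target computing devices by processing task;
# the capability records are illustrative assumptions.
def select_target_devices(devices, model_name, task):
    if task == "test":
        # A test task only needs the model's forward operation.
        return [d for d in devices if model_name in d["forward_models"]]
    # A training task needs forward operation and backward training.
    return [d for d in devices
            if model_name in d["forward_models"]
            and model_name in d["backward_models"]]

devices = [
    {"id": "first",  "forward_models": {"ResNet"}, "backward_models": set()},
    {"id": "second", "forward_models": {"ResNet"}, "backward_models": {"ResNet"}},
]
print([d["id"] for d in select_target_devices(devices, "ResNet", "test")])
print([d["id"] for d in select_target_devices(devices, "ResNet", "training")])
```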
203: Perform concurrent computation on the multiple operation requests with the target parallel computing device to obtain multiple final operation results.
The application does not limit the operation data corresponding to each operation request: they may be image data for image recognition, or speech data for speech recognition, etc. When the processing task is a test task, the operation data are the data uploaded by the user; when the processing task is a training task, the operation data may be a training set uploaded by the user or a training set stored in the server that corresponds to the target neural network model.
Multiple intermediate results may be produced during the computation, and the final operation result corresponding to each operation request can be obtained from these intermediate results.
204: Send each final operation result among the multiple final operation results to the corresponding electronic device.
It can be understood that if there are multiple operation requests in the server and they all target the same neural network model, the target parallel computing device corresponding to that model can be selected from the multiple computing devices, the operation data corresponding to the multiple requests are batch-processed on the target parallel computing device, the final operation results of the computation are separated to obtain the final operation result corresponding to each request, and each result is sent to the corresponding electronic device. This avoids the time wasted by repeatedly invoking the target neural network model, thereby improving the overall operational efficiency of the server.
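The batch-then-split step can be sketched as follows, assuming the target parallel computing device exposes the model as a batched callable; the interface is an assumption for illustration.

```python
# A minimal sketch of batching the operation data of several requests into one
# pass through the model and splitting the output back per request.
import numpy as np

def batch_compute(model, request_inputs):
    sizes = [len(x) for x in request_inputs]
    batch = np.concatenate([np.asarray(x) for x in request_inputs])  # one batch
    out = model(batch)                                               # single pass
    # Separate the batched output into one final result per request.
    return np.split(out, np.cumsum(sizes)[:-1])

results = batch_compute(lambda b: b * 2.0,
                        [np.zeros((2, 4)), np.ones((3, 4))])
print([r.shape for r in results])   # [(2, 4), (3, 4)]
```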
Optionally, the method further includes: waiting for a first preset duration and detecting whether the target parallel computing device has obtained the multiple final operation results; if not, selecting from the idle computing devices among the multiple computing devices a backup computing device corresponding to the target neural network model, and computing the multiple operation requests concurrently on the backup computing device.
That is, when the first preset duration elapses, if the target parallel computing device has not completed the operation requests, a backup computing device is selected from the idle computing devices among the multiple computing devices, and the backup computing device executes the multiple operation requests, improving operational efficiency.
Optionally, after the multiple operation requests are computed concurrently on the backup computing device, the method further includes: obtaining the multiple final operation results from whichever of the target parallel computing device and the backup computing device obtains them first, and sending a pause instruction to whichever of the target parallel computing device and the backup computing device has not obtained the final operation results.
The pause instruction instructs whichever of the target parallel computing device and the backup computing device has not obtained the final operation results to pause executing the corresponding operation instruction. That is, the backup computing device also executes the multiple operation requests; whichever of the backup computing device and the target parallel computing device first obtains the final operation results supplies the results corresponding to the operation instruction, and a pause instruction is sent to the device that has not obtained them, pausing the computation of the device that has not completed the operation instruction and thereby saving power.
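The timeout-backup-pause flow can be sketched with threads standing in for computing devices; the callable interface and the Event-based pause are assumptions for illustration.

```python
# A minimal sketch of the fallback flow: start the backup device after the
# first preset duration, keep whichever result arrives first, and pause the
# device that has not finished. Threads stand in for computing devices.
import threading

def run_with_backup(primary, backup, first_timeout):
    # primary/backup: callables taking a pause Event; they return a result,
    # or None if they observe the pause before finishing.
    result, done = {}, threading.Event()
    pause = threading.Event()                 # stands in for the pause instruction

    def run(device, name):
        r = device(pause)
        if r is not None and not done.is_set():
            result["value"], result["by"] = r, name
            done.set()

    threading.Thread(target=run, args=(primary, "target")).start()
    if not done.wait(first_timeout):          # first preset duration elapsed
        threading.Thread(target=run, args=(backup, "backup")).start()
    done.wait()                               # first result wins
    pause.set()                               # pause the unfinished device
    return result

slow = lambda pause: None if pause.wait(2.0) else "slow result"
fast = lambda pause: "backup result"
print(run_with_backup(slow, fast, first_timeout=0.5))
```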
Optionally, the method further includes: waiting for a second preset duration and detecting whether the target parallel computing device has obtained the multiple final operation results; if not, sending a fault instruction.
The fault instruction informs operation and maintenance personnel that the faulty computing device has failed, and the second preset duration is greater than the first preset duration. That is, when the second preset duration elapses, if the multiple final operation results of the target parallel computing device have not been received, it is judged that the target parallel computing device has failed and the corresponding operation and maintenance personnel are informed, improving the capability to handle faults.
Optionally, the method further includes: updating the hash table of the multiple computing devices every target time threshold.
A hash table is a data structure accessed directly according to a key value. In this application, the IP addresses of the multiple computing devices serve as the key values, and a hash function (a mapping function) maps them to positions in the hash table; that is, once a target computing device is determined, the physical resources allocated to it can be found quickly. The specific form of the hash table is not limited: it may be a manually configured static hash table, or hardware resources allocated according to IP address. Updating the hash table of the multiple computing devices every target time threshold improves the accuracy and efficiency of lookups.
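A sketch of such an IP-keyed table, refreshed every target time threshold, follows; the resource-probing callback is an assumption for illustration (a Python dict is itself a hash table).

```python
# A minimal sketch of a periodically refreshed hash table keyed by the IP
# address of each computing device; probe_resources is an assumed callback.
import time

class DeviceTable:
    def __init__(self, update_interval: float):
        self.update_interval = update_interval
        self.table = {}          # IP address -> allocated physical resources
        self.last_update = 0.0

    def refresh(self, probe_resources):
        if time.time() - self.last_update >= self.update_interval:
            self.table = dict(probe_resources())
            self.last_update = time.time()

    def lookup(self, ip: str):
        return self.table.get(ip)   # O(1) access by key value

table = DeviceTable(update_interval=60.0)
table.refresh(lambda: {"10.0.0.1": {"cores": 8}, "10.0.0.2": {"cores": 16}})
print(table.lookup("10.0.0.2"))
```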
Please refer to Fig. 3, a schematic flowchart of another scheduling method provided by an embodiment of the present application. As shown in Fig. 3, the method is applied to the server shown in Fig. 1 and involves the electronic devices allowed to access the server.
301: Receive multiple operation requests.
302: If the multiple operation requests correspond to multiple target neural network models, select from the multiple computing devices multiple target serial computing devices, one corresponding to each target neural network model among the multiple target neural network models.
That is, if there are multiple operation requests directed at multiple target neural network models, a target serial computing device corresponding to each target neural network model can be selected from the multiple computing devices, which helps improve the operational efficiency of each operation request. Moreover, each target serial computing device already has the corresponding target neural network model deployed, which saves the time of network initialization and improves operational efficiency.
303: Compute, on each target computing device among the multiple target computing devices, the operation request corresponding to that target computing device, obtaining the multiple final operation results.
304: Send each final operation result among the multiple final operation results to the corresponding electronic device.
It can be understood that if there are multiple operation requests in the server directed at multiple target neural network models, a target serial computing device corresponding to each target neural network model can be selected from the multiple computing devices, and each target serial computing device executes its corresponding operation request, which improves the operational efficiency of each operation request. Moreover, each target serial computing device already has the corresponding target neural network model deployed, which saves the time of network initialization and improves operational efficiency.
Optionally, an auxiliary scheduling algorithm is chosen from an auxiliary scheduling algorithm set according to the attribute information of each operation request among the multiple operation requests, and the target parallel computing device is selected from the multiple computing devices according to the auxiliary scheduling algorithm.
The auxiliary scheduling algorithm set includes but is not limited to: the round-robin scheduling algorithm, the weighted round-robin algorithm, the least-connections algorithm, the weighted least-connections algorithm, the locality-based least-connections algorithm, the locality-based least-connections with replication algorithm, the destination-address hashing algorithm, and the source-address hashing algorithm. The application does not limit how the auxiliary scheduling algorithm is chosen according to the attribute information. For example, if multiple target computing devices handle operation requests of the same kind, the auxiliary scheduling algorithm may be the round-robin scheduling algorithm. If different target computing devices differ in their capacity to bear load, more operation requests should be assigned to the highly configured, lightly loaded target computing devices, so the auxiliary scheduling algorithm may be the weighted round-robin algorithm. If the workloads assigned to the target computing devices differ, the auxiliary scheduling algorithm may be the least-connections scheduling algorithm, which dynamically selects the target computing device with the fewest backlogged connections to handle the current request and maximizes the utilization of the target computing devices; it may also be the weighted least-connections scheduling algorithm.
That is, on the basis of the scheduling methods of the embodiments of Fig. 2 or Fig. 3, the computing device that finally executes the operation request is chosen in combination with an auxiliary scheduling algorithm, further improving the operational efficiency of the server.
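For illustration, three of the listed algorithms can be sketched as follows; the weight and connection-count inputs are assumptions about how device state is tracked.

```python
# A minimal sketch of round-robin, weighted round-robin and least-connections
# selection; device records and weights are illustrative.
import itertools

def round_robin(devices):
    # Polling dispatch: cycle through the devices in order.
    return itertools.cycle(devices)

def weighted_round_robin(devices, weights):
    # Better-configured, lightly loaded devices appear proportionally more often.
    return itertools.cycle([d for d, w in zip(devices, weights)
                            for _ in range(w)])

def least_connections(devices, active):
    # Pick the device with the fewest backlogged connections.
    return min(devices, key=lambda d: active[d])

rr = weighted_round_robin(["dev1", "dev2"], [3, 1])
print([next(rr) for _ in range(4)])                           # dev1 x3, then dev2
print(least_connections(["dev1", "dev2"], {"dev1": 7, "dev2": 2}))
```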
Optionally, the method further includes: waiting for the first preset duration and detecting whether each target computing device among the multiple target computing devices has obtained its corresponding final operation result; if not, treating each target computing device that has not obtained its final operation result as a delayed computing device; selecting a backup computing device from the idle computing devices among the multiple computing devices according to the target neural network model corresponding to the delayed computing device; and computing the operation request corresponding to the delayed computing device on the backup computing device.
That is, when the first preset duration elapses, if a target computing device has not obtained its corresponding final operation result, a backup computing device is selected from the idle computing devices among the multiple computing devices, and the backup computing device executes the operation request corresponding to the delayed computing device, improving operational efficiency.
Optionally, after the operation request corresponding to the delayed computing device is computed on the backup computing device, the method further includes: obtaining the final operation result from whichever of the delayed computing device and the backup computing device obtains it first, and sending a pause instruction to whichever of the delayed computing device and the backup computing device has not obtained the final operation result.
The pause instruction instructs whichever of the delayed computing device and the backup computing device has not obtained the final operation result to pause executing the corresponding operation instruction. That is, the backup computing device also executes the operation request; whichever of the backup computing device and the delayed computing device first obtains the final operation result supplies the result corresponding to the operation instruction, and a pause instruction is sent to the device that has not obtained it, pausing the computation of the device that has not completed the operation instruction and thereby saving power.
Optionally, the method further includes: waiting for the second preset duration and detecting whether each target computing device among the multiple target computing devices has obtained its corresponding final operation result; if not, treating each delayed computing device that has not obtained its final operation result as a faulty computing device and sending a fault instruction.
The fault instruction informs operation and maintenance personnel that the faulty computing device has failed. That is, when the second preset duration elapses, if the final operation result of a target computing device has not been received, it is judged that the target computing device has failed and the corresponding operation and maintenance personnel are informed, improving the capability to handle faults.
Consistent with the embodiments of Fig. 2 and Fig. 3 above, please refer to Fig. 4, a schematic structural diagram of another server provided by the present application; the server includes multiple computing devices. As shown in Fig. 4, the server 400 includes:
a receiving unit 401 for receiving multiple operation requests;
a scheduling unit 402 for selecting, if the multiple operation requests correspond to a single target neural network model, a target parallel computing device corresponding to the target neural network model from the multiple computing devices;
an arithmetic unit 403 for performing parallel computation on the multiple operation requests with the target parallel computing device to obtain multiple final operation results;
a transmission unit 404 for sending each final operation result among the multiple final operation results to the corresponding electronic device.
Optionally, if the multiple operation requests correspond to multiple target neural network models, the scheduling unit 402 is further configured to select from the multiple computing devices multiple target serial computing devices, one corresponding to each target neural network model among the multiple target neural network models; and the arithmetic unit 403 is further configured to compute, on each target computing device among the multiple target computing devices, the operation request corresponding to that target computing device, obtaining the multiple final operation results.
Optionally, the scheduling unit 402 is specifically configured to: if the processing task of a target operation request is a test task, select from the multiple computing devices the computing devices that include the forward operation of the target neural network model corresponding to the processing task, obtaining multiple target computing devices, the target operation request being any one of the multiple operation requests; if the processing task is a training task, select from the multiple computing devices the computing devices that include the forward operation and backward training of the target neural network model, obtaining the multiple target computing devices; and choose the target parallel computing device from the multiple target computing devices.
Optionally, the scheduling unit 402 is specifically configured to choose an auxiliary scheduling algorithm from an auxiliary scheduling algorithm set according to the attribute information of each operation request among the multiple operation requests, the auxiliary scheduling algorithm set including at least one of: the round-robin scheduling algorithm, the weighted round-robin algorithm, the least-connections algorithm, the weighted least-connections algorithm, the locality-based least-connections algorithm, the locality-based least-connections with replication algorithm, the destination-address hashing algorithm, and the source-address hashing algorithm; and to select the target parallel computing device from the multiple computing devices according to the auxiliary scheduling algorithm.
Optionally, the server further includes a detection unit 405 for waiting for a first preset duration and detecting whether the target parallel computing device has obtained the multiple final operation results; if the detection unit 405 finds that it has not, the scheduling unit 402 selects from the idle computing devices among the multiple computing devices a backup computing device corresponding to the target neural network model, and the arithmetic unit 403 computes the multiple operation requests concurrently on the backup computing device.
Optionally, the server 400 further includes an acquiring unit 406 for obtaining the multiple final operation results from whichever of the target parallel computing device and the backup computing device obtains them first; the transmission unit 404 sends a pause instruction to whichever of the target parallel computing device and the backup computing device has not obtained the final operation results.
Optionally, the detection unit 405 is further configured to wait for a second preset duration and detect whether the target parallel computing device has obtained the multiple final operation results; if not, a fault instruction is sent by the transmission unit 404. The fault instruction informs operation and maintenance personnel that the target parallel computing device has failed, and the second preset duration is greater than the first preset duration.
Optionally, the server 400 further includes an updating unit 407 for updating the hash table of the server every target time threshold.
Optionally, the acquiring unit 406 is further configured to obtain the operation demand of each specified neural network model in a specified neural network model set and the hardware attributes of each computing device among the multiple computing devices, yielding multiple operation demands and multiple hardware attributes;
the server 400 further includes a deployment unit 408 for deploying, according to the multiple operation demands and the multiple hardware attributes, each specified neural network model in the specified neural network model set on its corresponding specified computing device.
Optionally, a computing device includes at least one computing carrier, and a computing carrier includes at least one computing unit.
It can be understood that if there are multiple operation requests in the server and they all target the same neural network model, the target parallel computing device corresponding to that model can be selected from the multiple computing devices, the operation data corresponding to the multiple requests are batch-processed on the target parallel computing device, the final operation results of the computation are separated to obtain the final operation result corresponding to each request, and each result is sent to the corresponding electronic device. This avoids the time wasted by repeatedly invoking the target neural network model, thereby improving the overall operational efficiency of the server.
In one embodiment, as shown in Fig. 5, the present application discloses another server 500, including a processor 501, a memory 502, a communication interface 503 and one or more programs 504, where the one or more programs 504 are stored in the memory 502 and configured to be executed by the processor; the programs 504 include instructions for executing some or all of the steps described in the above scheduling method.
Another embodiment of the invention provides a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to perform the implementations described for the scheduling method.
Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the invention.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the terminal and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division of the units is only a division by logical function; in actual implementation there may be other ways of division, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially the part that contributes to the existing technology, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage media include various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
It should be noted that in attached drawing or specification text, the implementation for not being painted or describing is affiliated technology Form known to a person of ordinary skill in the art, is not described in detail in field.In addition, the above-mentioned definition to each element and method is simultaneously It is not limited only to various specific structures, shape or the mode mentioned in embodiment, those of ordinary skill in the art can carry out letter to it It singly changes or replaces.
Above specific embodiment has carried out further specifically the purpose of the application, technical scheme and beneficial effects It is bright, it should be understood that the above is only the specific embodiments of the application, are not intended to limit this application, all the application's Within spirit and principle, any modification, equivalent substitution, improvement and etc. done be should be included within the scope of protection of this application.

Claims (13)

1. A scheduling method, wherein the method is based on a server comprising multiple computing devices, and the method comprises:
receiving multiple operation requests;
if the multiple operation requests correspond to one target neural network model, selecting, from the multiple computing devices, a target parallel computing device corresponding to the target neural network model;
performing parallel computation on the multiple operation requests based on the target parallel computing device to obtain multiple final operation results;
sending each final operation result of the multiple final operation results to a corresponding electronic device.
2. The method according to claim 1, wherein the method further comprises:
if the multiple operation requests correspond to multiple target neural network models, selecting, from the multiple computing devices, multiple target serial computing devices, one corresponding to each target neural network model of the multiple target neural network models;
computing, by each target computing device of the multiple target computing devices, the operation request corresponding to that target computing device, to obtain the multiple final operation results.
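As a reading aid only, the grouping in claim 2 can be sketched as follows; the pick_device callable and the request fields are hypothetical stand-ins, since the claim does not prescribe an interface:

    from collections import defaultdict

    # Sketch of the claim-2 branch: group requests by target model and run each
    # group on that model's own (serial) computing device.
    def schedule_per_model(requests, pick_device):
        groups = defaultdict(list)
        for req in requests:
            groups[req["model"]].append(req)             # one bucket per target model
        results = {}
        for model, group in groups.items():
            device = pick_device(model)                  # device for this model
            for req in group:
                results[req["id"]] = device(req["data"])
        return results

    # Toy usage: every "device" just tags the data with its model name.
    reqs = [{"id": "r1", "model": "cnn", "data": 1}, {"id": "r2", "model": "rnn", "data": 2}]
    print(schedule_per_model(reqs, lambda m: lambda data: (m, data)))
    # {'r1': ('cnn', 1), 'r2': ('rnn', 2)}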
3. The method according to claim 1 or 2, wherein selecting, from the multiple computing devices, the target parallel computing device corresponding to the target neural network model comprises:
if the processing task of a target operation request is a test task, selecting, from the multiple computing devices, the computing devices that include the forward operation of the target neural network model corresponding to the task, to obtain multiple target computing devices, the target operation request being any operation request of the multiple operation requests;
if the processing task is a training task, selecting, from the multiple computing devices, the computing devices that include both the forward operation and the backward training of the target neural network model, to obtain the multiple target computing devices;
selecting the target parallel computing device from the multiple target computing devices.
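For illustration, one assumed reading of claim 3's filtering step follows, with a hypothetical per-device table of the operations deployed for each model:

    # Sketch of the claim-3 filter: a test task only needs devices hosting the
    # model's forward operation; a training task also needs backward training.
    def candidate_devices(devices, model, task):
        needed = {"forward"} if task == "test" else {"forward", "backward"}
        return [d for d in devices if needed <= set(d["ops"].get(model, ()))]

    devices = [
        {"name": "dev0", "ops": {"resnet": ("forward",)}},             # inference only
        {"name": "dev1", "ops": {"resnet": ("forward", "backward")}},  # training capable
    ]
    print([d["name"] for d in candidate_devices(devices, "resnet", "training")])  # ['dev1']
    print([d["name"] for d in candidate_devices(devices, "resnet", "test")])      # ['dev0', 'dev1']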
4. The method according to claim 1, wherein selecting, from the multiple computing devices, the target parallel computing device corresponding to the target neural network model comprises:
selecting an auxiliary scheduling algorithm from an auxiliary scheduling algorithm set according to attribute information of each operation request of the multiple operation requests, the auxiliary scheduling algorithm set including at least one of the following: a round-robin scheduling algorithm, a weighted round-robin algorithm, a least-connections algorithm, a weighted least-connections algorithm, a locality-based least-connections scheduling algorithm, a locality-based least-connections with replication algorithm, a destination address hashing algorithm, and a source address hashing algorithm;
selecting the target parallel computing device from the multiple computing devices according to the auxiliary scheduling algorithm.
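The claim names standard load-balancing algorithms without fixing an implementation; purely as illustration, minimal sketches of two of them (least connections and round robin) over hypothetical device records:

    import itertools

    # Least connections: pick the candidate device with the fewest active requests.
    def least_connections(devices):
        return min(devices, key=lambda d: d["active"])

    # Round robin: hand out the candidate devices in cyclic order.
    def make_round_robin(devices):
        cycle = itertools.cycle(devices)
        return lambda: next(cycle)

    devices = [{"name": "dev0", "active": 2}, {"name": "dev1", "active": 0}]
    print(least_connections(devices)["name"])            # dev1
    next_device = make_round_robin(devices)
    print(next_device()["name"], next_device()["name"])  # dev0 dev1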
5. The method according to any one of claims 1-4, wherein the method further comprises:
waiting a first preset duration and detecting whether the target parallel computing device has obtained the multiple final operation results; if not, selecting, from the idle computing devices among the multiple computing devices, a spare computing device corresponding to the target neural network model;
performing concurrent operation on the multiple operation requests based on the spare computing device.
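A minimal sketch of claim 5's fallback, assuming devices expose start/done/results methods (an invented interface; the claim does not prescribe one); the race between the two devices is resolved in the sketch after claim 7:

    import time

    # Sketch of claim 5: if the target device misses the first deadline, start a
    # spare idle device on the same batch of requests.
    def run_with_spare(batch, target, idle_devices, first_timeout):
        target.start(batch)
        time.sleep(first_timeout)                        # the first preset duration
        if target.done():
            return target.results(), None
        spare = idle_devices.pop()                       # idle device hosting the model
        spare.start(batch)                               # concurrent recomputation
        return None, spare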
6. The method according to claim 5, wherein after performing concurrent operation on the multiple operation requests based on the spare computing device, the method further comprises:
obtaining the multiple final operation results from whichever of the target parallel computing device and the spare computing device obtains them first;
sending a pause instruction to whichever of the target parallel computing device and the spare computing device has not obtained the final operation results.
7. The method according to claim 5 or 6, wherein the method further comprises:
waiting a second preset duration and detecting whether the target parallel computing device has obtained the multiple final operation results; if not, sending a fault instruction, the fault instruction being used to inform operation and maintenance personnel that the target parallel computing device has failed, and the second preset duration being greater than the first preset duration.
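Continuing the previous sketch, claims 6 and 7 resolve the race between the target and spare devices; the pause and fault notifications below are stand-ins for whatever signalling the server actually uses:

    import time

    def notify_operations(message):
        print("FAULT:", message)                         # stand-in for alerting O&M staff

    # Sketch of claims 6-7: take whichever device finishes first, pause the other,
    # and report a fault if nothing arrives within the second (longer) deadline.
    def resolve_race(target, spare, second_timeout, poll=0.01):
        deadline = time.monotonic() + second_timeout
        while time.monotonic() < deadline:
            for winner, loser in ((target, spare), (spare, target)):
                if winner.done():
                    loser.pause()                        # pause instruction to the laggard
                    return winner.results()
            time.sleep(poll)
        notify_operations("target parallel computing device failed")
        return None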
8. The method according to any one of claims 1-7, wherein the method further comprises:
updating a hash table of the server every target time threshold.
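The claim does not state what the hash table stores; assuming it maps each model to the devices currently hosting it, a periodic refresh could be sketched as:

    import threading

    # Sketch of claim 8: rebuild the model -> devices hash table every `period`
    # seconds (the table's contents are an assumption for this example).
    def refresh_hash_table(server, period):
        def rebuild():
            table = {}
            for dev in server["devices"]:
                for model in dev["ops"]:
                    table.setdefault(model, []).append(dev["name"])
            server["hash_table"] = table
            threading.Timer(period, rebuild).start()     # re-arm for the next interval
        rebuild()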
9. The method according to any one of claims 1-8, wherein the method further comprises:
obtaining the operation demand of each specified neural network model in a specified neural network model set, and the hardware attributes of each computing device among the multiple computing devices, to obtain multiple operation demands and multiple hardware attributes;
deploying, according to the multiple operation demands and the multiple hardware attributes, each specified neural network model in the specified neural network model set on the specified computing device corresponding to that specified neural network model.
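As one assumed matching rule (the claim leaves the demand-to-attribute matching unspecified), a greedy first-fit placement of models onto devices using a single scalar demand/capacity measure:

    # Sketch of claim 9 as greedy first-fit: place each model on the first device
    # whose remaining capacity covers the model's operation demand.
    def deploy(models, devices):
        placement = {}
        for model in models:
            for dev in devices:
                if dev["capacity"] >= model["demand"]:
                    dev["capacity"] -= model["demand"]   # reserve the resources
                    placement[model["name"]] = dev["name"]
                    break
        return placement

    models = [{"name": "m1", "demand": 4}, {"name": "m2", "demand": 2}]
    devices = [{"name": "dev0", "capacity": 3}, {"name": "dev1", "capacity": 8}]
    print(deploy(models, devices))                       # {'m1': 'dev1', 'm2': 'dev0'}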
10. The method according to any one of claims 1-9, wherein each computing device includes at least one computing carrier, and each computing carrier includes at least one computing unit.
11. A server, wherein the server comprises multiple computing devices, and the server further comprises units for executing the method according to any one of claims 1-10.
12. A server, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for executing the steps of the method according to any one of claims 1-10.
13. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method according to any one of claims 1-10.
CN201711467704.XA 2017-12-28 2017-12-28 Scheduling method and related device Active CN109978129B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201711467704.XA CN109978129B (en) 2017-12-28 2017-12-28 Scheduling method and related device
US16/767,415 US11568269B2 (en) 2017-12-28 2018-08-02 Scheduling method and related apparatus
EP18895350.9A EP3731089B1 (en) 2017-12-28 2018-08-02 Scheduling method and related apparatus
PCT/CN2018/098324 WO2019128230A1 (en) 2017-12-28 2018-08-02 Scheduling method and related apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711467704.XA CN109978129B (en) 2017-12-28 2017-12-28 Scheduling method and related device

Publications (2)

Publication Number Publication Date
CN109978129A true CN109978129A (en) 2019-07-05
CN109978129B CN109978129B (en) 2020-08-25

Family

ID=67075514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711467704.XA Active CN109978129B (en) 2017-12-28 2017-12-28 Scheduling method and related device

Country Status (1)

Country Link
CN (1) CN109978129B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5432887A (en) * 1993-03-16 1995-07-11 Singapore Computer Systems Neural network system and method for factory floor scheduling
CN102360309A (en) * 2011-09-29 2012-02-22 中国科学技术大学苏州研究院 Scheduling system and scheduling execution method of multi-core heterogeneous system on chip
CN102665049A (en) * 2012-03-29 2012-09-12 中国科学院半导体研究所 Programmable visual chip-based visual image processing system
CN107239829A (en) * 2016-08-12 2017-10-10 北京深鉴科技有限公司 A kind of method of optimized artificial neural network
CN107018184A (en) * 2017-03-28 2017-08-04 华中科技大学 Distributed deep neural network cluster packet synchronization optimization method and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611065A (en) * 2020-05-29 2020-09-01 远光软件股份有限公司 Calling method and device of machine learning algorithm, storage medium and electronic equipment
CN111611065B (en) * 2020-05-29 2023-08-11 远光软件股份有限公司 Calling method and device of machine learning algorithm, storage medium and electronic equipment
CN114510339A (en) * 2022-04-20 2022-05-17 苏州浪潮智能科技有限公司 Computing task scheduling method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN109978129B (en) 2020-08-25

Similar Documents

Publication Title
CN107196869B (en) The adaptive load balancing method, apparatus and system of Intrusion Detection based on host actual loading
CN109426549A (en) Distribution is interconnected for the accelerator of virtual environment
Meng et al. Dedas: Online task dispatching and scheduling with bandwidth constraint in edge computing
CN105512083B (en) Method for managing resource, apparatus and system based on YARN
CN105450618B (en) A kind of operation method and its system of API server processing big data
CN108667878A (en) Server load balancing method and device, storage medium, electronic equipment
CN104901898B (en) A kind of load-balancing method and device
CN105518625A (en) Computation hardware with high-bandwidth memory interface
CN110780914A (en) Service publishing method and device
CN112866059A (en) Nondestructive network performance testing method and device based on artificial intelligence application
CN108694089A (en) Use the parallel computation framework of non-greedy dispatching algorithm
TWI786527B (en) User code operation method of programming platform, electronic equipment and computer-readable storage medium
CN109978129A (en) Dispatching method and relevant apparatus
US11568269B2 (en) Scheduling method and related apparatus
TWI768167B (en) Integrated circuit chip device and related products
CN109729113A (en) Manage method, server system and the computer program product of dedicated processes resource
CN109978149A (en) Dispatching method and relevant apparatus
Liu et al. Task offloading with execution cost minimization in heterogeneous mobile cloud computing
CN109976809A (en) Dispatching method and relevant apparatus
CN109976887A (en) Dispatching method and relevant apparatus
JP7081529B2 (en) Application placement device and application placement program
CN115361382B (en) Data processing method, device, equipment and storage medium based on data group
CN104468379B (en) Virtual Hadoop clustered nodes system of selection and device based on most short logical reach
CN109617960A (en) A kind of web AR data presentation method based on attributed separation
CN105279026A (en) P2PT based method for healthy networking in distributed cloud computing environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 100000 room 644, No. 6, No. 6, South Road, Beijing Academy of Sciences
Applicant after: Zhongke Cambrian Technology Co., Ltd
Address before: 100000 room 644, No. 6, No. 6, South Road, Beijing Academy of Sciences
Applicant before: Beijing Zhongke Cambrian Technology Co., Ltd.
GR01 Patent grant