CN107563512A - Data processing method, device and storage medium - Google Patents
Data processing method, device and storage medium

- Publication number: CN107563512A (application number CN201710735990.7A)
- Authority: CN (China)
- Prior art keywords: computing architecture, learning model, deep learning, current layer, layer
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications

- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management (section Y: general tagging of new technological developments; Y02: technologies for mitigation or adaptation against climate change; Y02D: climate change mitigation technologies in information and communication technologies)
Abstract

The invention discloses a data processing method, device and storage medium. The data processing method includes: obtaining data to be processed; inputting the data to be processed into a deep learning model embedded in a terminal for processing, the deep learning model comprising a multilayer computing architecture; during processing, detecting whether the current layer of the computing architecture has finished running; if it is detected that the current layer has finished running, obtaining the storage region corresponding to the current layer and clearing it, the storage region being used to store the running data generated while the current layer runs; and, when clearing is complete, taking the next layer as the current layer and returning to the step of detecting whether the current layer has finished running, until a result is output. This data processing method facilitates realizing deep learning models offline in a terminal, with high flexibility and high computational efficiency.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a data processing method, device and storage medium.
Background
The concept of deep learning originates from research on artificial neural networks; a multilayer perceptron with multiple hidden layers is one kind of deep learning structure. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of data. Its proposal made artificial neural networks once again one of the most important algorithms in machine learning: a pre-training stage is added to traditional artificial neural network training, in which each layer of the network is first trained once with unsupervised learning, after which the whole network is trained with supervised learning. In other words, the computer simulates the mechanisms of the human brain through a deep neural network in order to learn, judge and make decisions.
Intelligent applications realized with deep learning methods have become the first choice in the field of artificial intelligence because of their high accuracy and excellent performance. However, deploying existing deep learning applications requires large computing clusters built from large numbers of CPUs and GPUs; in general, because their computation consumes too much memory, they cannot run offline directly in a user's mobile phone application.
Summary of the invention

An object of the invention is to provide a data processing method, device and storage medium, so as to solve the technical problem that existing deep learning models cannot be realized offline directly in a terminal because their computation consumes too much memory.
In order to solve the above technical problem, an embodiment of the present invention provides the following technical solution:

A data processing method, including:

obtaining data to be processed;

inputting the data to be processed into a deep learning model embedded in a terminal for processing, the deep learning model comprising a multilayer computing architecture;

during processing, detecting whether the current layer of the computing architecture has finished running;

if it is detected that the current layer has finished running, obtaining the storage region corresponding to the current layer and clearing it, the storage region being used to store the running data generated while the current layer runs; and

when clearing is complete, taking the next layer as the current layer and returning to the step of detecting whether the current layer has finished running, until a result is output.
In order to solve the above technical problem, an embodiment of the present invention also provides the following technical solution:

A data processing device, including:

an acquisition module, for obtaining data to be processed;

a processing module, for inputting the data to be processed into an embedded deep learning model for processing, the deep learning model comprising a multilayer computing architecture;

a detection module, for detecting, during processing, whether the current layer of the computing architecture has finished running; and

a clearing module, for obtaining, if the detection module detects that the current layer has finished running, the storage region corresponding to the current layer and clearing it, the storage region being used to store the running data generated while the current layer runs; and, when clearing is complete, taking the next layer as the current layer and triggering the detection module to again detect whether the current layer has finished running, until a result is output.
In order to solve the above technical problem, an embodiment of the present invention also provides the following technical solution:

A storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to perform the steps of any of the data processing methods described above.
In the data processing method, device and storage medium provided by the invention, data to be processed is obtained and input into an embedded deep learning model comprising a multilayer computing architecture. During processing, it is detected whether the current layer has finished running; if so, the storage region corresponding to that layer (which stores the running data generated while the layer runs) is obtained and cleared. When clearing is complete, the next layer becomes the current layer and the detection step is repeated until a result is output. An automatic clearing function for intermediate results is thus enabled during computation, which effectively solves the problem of deep learning models consuming too much computation memory, and facilitates realizing deep learning models offline in a terminal, with high flexibility and high computational efficiency.
Brief description of the drawings

The technical solution of the invention and its other beneficial effects will become apparent from the following detailed description of embodiments of the invention, taken in conjunction with the accompanying drawings.

Fig. 1 is a schematic flowchart of a data processing method provided by an embodiment of the invention;

Fig. 2a is another schematic flowchart of a data processing method provided by an embodiment of the invention;

Fig. 2b is a schematic block diagram of deploying a CNN model on an electronic device, provided by an embodiment of the invention;

Fig. 2c is a schematic diagram of a deep learning model with an n-layer network structure, provided by an embodiment of the invention;

Fig. 3a is a schematic structural diagram of a data processing device provided by an embodiment of the invention;

Fig. 3b is a schematic structural diagram of an embedding module provided by an embodiment of the invention;

Fig. 3c is another schematic structural diagram of a data processing device provided by an embodiment of the invention;

Fig. 3d is a schematic structural diagram of a clearing module provided by an embodiment of the invention;

Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the invention.
Embodiments

The technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. Based on the embodiments of the invention, all other embodiments obtained by those skilled in the art without creative work fall within the scope of protection of the invention.

Embodiments of the invention provide a data processing method, device, storage medium and electronic equipment, each of which will be described in detail below.

The present embodiment is described from the perspective of the data processing device. The data processing device may be realized as an independent entity, or may be integrated into electronic equipment such as a terminal or a server; the electronic equipment may include smartphones, tablet computers, personal computers and the like.
A data processing method includes: obtaining data to be processed; inputting the data to be processed into an embedded deep learning model for processing, the deep learning model comprising a multilayer computing architecture; during processing, detecting whether the current layer of the computing architecture has finished running; if it is detected that the current layer has finished running, obtaining the storage region corresponding to the current layer and clearing it, the storage region being used to store the running data generated while the current layer runs; and, when clearing is complete, taking the next layer as the current layer and returning to the step of detecting whether the current layer has finished running, until a result is output.
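The layer-by-layer loop described above can be sketched as follows. This is a minimal illustration in Python (the patent's actual component is a C++ framework); the function and buffer names are hypothetical, not from the patent.

```python
def run_model(layers, data):
    """Run a multilayer model, clearing each layer's storage region
    as soon as that layer has finished running, so that only the
    final output survives (the "automatic clearing" of the method)."""
    buffers = {}
    current = data
    for i, layer in enumerate(layers):
        # run the current layer; its running data goes into a buffer
        buffers[i] = {"scratch": [current], "output": layer(current)}
        current = buffers[i]["output"]
        # the layer has finished: clear its storage region before
        # taking the next layer as the current layer
        buffers[i]["scratch"].clear()
        del buffers[i]
    return current

# toy 3-layer "computing architecture": each layer is a function
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
result = run_model(layers, 5)   # ((5 + 1) * 2) - 3 = 9
```

The point of the sketch is only the ordering: a layer's buffer is released immediately after that layer completes, rather than after the whole forward pass.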
As shown in Fig. 1, the specific flow of the data processing method may be as follows:

S101: obtain the data to be processed.

In this embodiment, the data to be processed may include multimedia data such as images and speech; it may be currently captured data or data downloaded from the network.
S102: input the data to be processed into the embedded deep learning model for processing, the deep learning model comprising a multilayer computing architecture.

In this embodiment, the deep learning model may include neural network models such as a CNN (Convolutional Neural Network), a DBN (Deep Belief Network), an RNN (Recurrent Neural Network), a recursive neural tensor network or an autoencoder. In general, the deep learning model includes a multilayer computing architecture; the layers may constitute an input layer, an output layer and at least one hidden intermediate layer, and these layers combine to form the whole framework of the deep learning model, realizing functions such as image recognition, face detection and image segmentation. The data to be processed may be input into the deep learning model through a designated input interface, which may be a user-defined three-dimensional data structure whose ordering rule may be c/h/w. In general, different objects to be processed have different c, h and w. For example, when the data to be processed is an RGB color image, c = the number of image channels (3), h = the image height and w = the image width, and the content input through the designated input interface is the pixel values of the image (0 to 255); when the data to be processed is speech, c = h = 1 and w = the number of samples, and the content input through the designated input interface is the sample values (-1 to 1), where the number of samples is determined by the sampling frequency and the sampling duration.
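As a sketch of the c/h/w input structure just described (Python, with a hypothetical helper name; the patent does not define this function):

```python
def make_input_blob(kind, **dims):
    """Build the three-dimensional c/h/w input structure.
    Image: c = 3 channels, h = height, w = width, values 0..255.
    Speech: c = h = 1, w = number of samples, values -1..1."""
    if kind == "image":
        c, h, w = 3, dims["height"], dims["width"]
    elif kind == "speech":
        # number of samples = sampling frequency * sampling duration
        c, h, w = 1, 1, dims["sample_rate"] * dims["seconds"]
    else:
        raise ValueError(kind)
    return {"c": c, "h": h, "w": w, "data": [0.0] * (c * h * w)}

img = make_input_blob("image", height=224, width=224)
snd = make_input_blob("speech", sample_rate=16000, seconds=2)
```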
It should be pointed out that existing deep learning models are all constructed from components similar to caffe (Convolutional Architecture for Fast Feature Embedding), torch and tensorflow, and the use of these components depends on third-party libraries. If an existing deep learning model were grafted directly onto an electronic device to run, the cross-platform use of those libraries would inevitably be involved, and the running speed of the electronic device would be low. Therefore, the deep learning model embedded in the present embodiment is constructed from a specific component, such as an ncnn component. The ncnn component is a high-performance neural-network forward-computation framework tailored to the hardware characteristics of electronic devices, designed specifically for running CNN models on electronic devices; it does not depend on third-party libraries in use, thereby solving the cross-platform library problem that arises when existing CNN models run on electronic devices and greatly improving the running speed of the electronic device.
Of course, the specific component should be made in advance. Specifically, cmake (a cross-platform build tool) may first be called to generate the project file of the development platform of the corresponding operating system (in general, electronic devices with different operating systems correspond to different project files), after which the project file is loaded through development tools such as an SDK (Software Development Kit) and an IDE (Integrated Development Environment), and the static or dynamic library of the specific component is compiled according to actual requirements. In this way, when the specific component is subsequently used to construct a deep learning model and to execute the constructed model, that static or dynamic library can be used directly as the calling library, without depending on third-party libraries. In addition, constructing the deep learning model in the electronic device from the compiled specific component should also be done in advance; that is, before the above step S102, the data processing method may further include:
1-1: obtain the parameter information of each layer of the computing architecture from the trained deep learning model.

For example, the above step 1-1 may specifically include:

converting the trained deep learning model using a preset conversion strategy to obtain converted files;

calling a default loading interface to load the converted files in the electronic device; and

obtaining the parameter information of each layer of the computing architecture from the loaded content.
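A toy sketch of this convert-then-load step, under assumptions: the trained model is represented as a list of (name, type, weights) tuples and the files are JSON, neither of which is specified by the patent (which speaks only of a model file and a network file in the component's required format).

```python
import json

def convert_model(trained):
    """Split a trained model into a network file (layer structure)
    and a model file (per-layer weights), as the conversion strategy
    is described: parse the model, then write the parsed data out."""
    network = [{"name": n, "type": t} for n, t, _ in trained]
    weights = {n: w for n, _, w in trained}
    return json.dumps(network), json.dumps(weights)

def load_model(network_file, model_file):
    """Default loading interface: read both files back and attach each
    layer's parameter information to its structure entry."""
    layers = json.loads(network_file)
    weights = json.loads(model_file)
    for layer in layers:
        layer["weights"] = weights[layer["name"]]
    return layers

trained = [("conv1", "Convolution", [0.1, 0.2]),
           ("fc1", "InnerProduct", [0.3])]
net_f, mod_f = convert_model(trained)
layers = load_model(net_f, mod_f)
```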
In this embodiment, the trained deep learning model is obtained by training an existing deep learning model with samples; it is usually expressed as protobuf (a data interchange format from Google) serialized data or as specific binary data. The preset conversion strategy is mainly used to convert the existing deep learning model into files that meet the format required by the specific component; it may be a hand-written model conversion program, which may include the two operations of parsing the deep learning model and writing the parsed data into files. The converted files may include a model file and a network file: the model file is the file that meets the model structure required by the specific component, and the network file is the file that meets the network structure required by the specific component.

The default loading interface is mainly used to load files; it can be called directly from the compiled static or dynamic library, and files of different formats call different default loading interfaces. The default loading interface may take either of two load modes, loading the file path or loading the file content, and the specific choice depends on the memory size of the electronic device: when memory is sufficient, loading the file path may be selected, accessing the model resource as a file in the normal fashion; when memory is insufficient, loading the file content may be selected, accessing the model resource in the form of memory. The parameter information may include information such as the types and number of the parameters participating in the computation and the weight of each parameter; this information is contained in the converted files and can be read smoothly only after they are loaded in the electronic device.
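The choice between the two load modes can be sketched as below. The patent only says the choice depends on the device's memory size; the specific headroom rule used here is an assumption for illustration, not part of the disclosure.

```python
def choose_load_mode(free_memory, file_size, headroom=4):
    """Pick a load mode as described above: with sufficient memory,
    load by file path (access the model resource as a normal file);
    with insufficient memory, load the file content and access the
    model resource in memory form. The headroom factor is assumed."""
    return "path" if free_memory >= headroom * file_size else "content"

mode_big = choose_load_mode(free_memory=400, file_size=50)    # "path"
mode_small = choose_load_mode(free_memory=100, file_size=50)  # "content"
```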
1-2: generate a calculation path according to the parameter information of each layer of the computing architecture, the calculation path including the compute functions of the layers participating in the operation and the running order between the layers.

In this embodiment, the calculation path may be obtained by sorting the compute functions of all or some of the layers in layer-by-layer transfer order. It may cover all layers of the computing architecture or only some of them, depending mainly on the task requirements: a task that needs the functions of all layers of the deep learning model requires a calculation path generated for all layers, while a task that only needs the functions of some layers may have a calculation path generated only for those layers.
For example, the above step 1-2 may specifically include:

filtering out the layers participating in the operation from each layer of the computing architecture using a preset screening parameter; and

generating the calculation path according to the parameter information of the filtered layers.

In this embodiment, the preset screening parameter is mainly used to filter out the layers that are not needed. It can be set when the static or dynamic library is compiled, so that it can be called directly from the library when the calculation path is generated, giving high flexibility. Of course, to accommodate all task requirements, the preset screening parameter may be defined such that a certain value (such as 0) means that no filtering is needed.
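The screening-and-ordering step above can be sketched as follows (Python; the task-tag scheme is a hypothetical illustration of "layers participating in the operation", not the patent's actual mechanism):

```python
def build_calculation_path(layers, screen_param=0):
    """Filter the layers with a preset screening parameter and chain
    the survivors in layer order. screen_param = 0 means no filtering
    (all layers participate), matching the convention in the text."""
    if screen_param == 0:
        selected = layers
    else:
        selected = [l for l in layers if screen_param in l["tasks"]]
    # the calculation path: compute functions in running order
    return [l["fn"] for l in selected]

layers = [
    {"name": "conv",  "tasks": {1, 2}, "fn": lambda x: x + 1},
    {"name": "head1", "tasks": {1},    "fn": lambda x: x * 10},
    {"name": "head2", "tasks": {2},    "fn": lambda x: x * 100},
]
path = build_calculation_path(layers, screen_param=1)  # task-1 layers only
x = 1
for fn in path:
    x = fn(x)   # (1 + 1) * 10 = 20
```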
1-3: store the calculation path in the electronic device, so as to embed the deep learning model in the electronic device.

At this point, the above step S102 may specifically include:

processing the data to be processed according to the calculation path.

In this embodiment, because the calculation path is generated from some number of layers of the deep learning model's computing architecture, when the data to be processed is handled using the calculation path, the functions of all or some of the layers of the deep learning model can be realized.
In addition, the process of embedding the deep learning model in the electronic device involves not only the setting of the calculation procedure but also the allocation of the threads that perform the calculation operations. That is, the above step of embedding the deep learning model in the electronic device may also include:

detecting the number of processors of the electronic device; and

determining a corresponding number of threads according to the number of processors, and storing the number of threads in the electronic device.

In this embodiment, the number of threads used in the calculation process can be controlled through a designated thread-control interface; the number of threads may be set equal to the number of processors, or by some other rule. The designated thread-control interface may be compiled into the dynamic or static library in advance so that it can be called directly. The above step of processing the data to be processed according to the calculation path may then further include:

calling a number of idle threads equal to the number of threads; and

processing the data to be processed according to the calculation path using the called idle threads.

In this embodiment, the number of threads may be made equal to the number of CPUs. For a multi-CPU electronic device, multiple calculation tasks can be dispatched simultaneously to different CPU cores through different threads and computed in parallel, greatly improving the processing speed.
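A minimal sketch of the "thread count = processor count" rule just described, using Python's standard thread pool (the real component is a C++ framework; this only illustrates the dispatch pattern):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def run_parallel(tasks):
    """Set the thread count equal to the number of processors and
    dispatch independent calculation tasks to the pool, so multi-core
    CPUs can compute in parallel."""
    num_threads = os.cpu_count() or 1   # thread count = CPU count
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda t: t(), tasks))

results = run_parallel([lambda: 1 + 1, lambda: 2 * 3, lambda: 10 - 4])
```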
In addition, embedding the deep learning model in the electronic device also involves the allocation of calculation memory. That is, the above step of embedding the deep learning model in the electronic device may further include:

determining the space size of the corresponding storage region according to the parameter information of each layer of the computing architecture.

In this embodiment, the parameter information may also include information such as the number of output parameters (that is, the parameter types of the operation result) and the weight of each output parameter. In general, the required memory space can be determined and allocated according to each layer's output parameters, calculation parameters and so on, ensuring that the calculation of that layer can proceed normally. The above step of processing the data to be processed according to the calculation path may then further include: using the called idle threads, performing the corresponding calculation on the data to be processed in the corresponding storage region according to the calculation path.
S103: during processing, detect whether the current layer of the computing architecture has finished running; if so, perform the following step S104; if not, re-execute step S103.

In this embodiment, "finished running" means that all the compute functions corresponding to the current layer in the calculation path have finished executing.
S104: obtain the storage region corresponding to the current layer and clear it, the storage region being used to store the running data generated while the current layer runs; when clearing is complete, take the next layer as the current layer and return to the above step S103, until a result is output.

In this embodiment, the final output result can be output through a designated output interface stored in advance in the static or dynamic library. The designated output interface may also be a three-dimensional data structure c/h/w, and different processing tasks output different content. For example, for an image recognition task, c = the image category number and h = w = the confidence dimension of that category (usually 1); for a speech recognition task, c = the speech category number and h = w = the confidence dimension of that category (usually 1).
Considering that in most deep learning models the calculation data of the intermediate layers is not needed (only the calculation data of the last layer is needed as the output result), an automatic recovery mechanism for intermediate data can be adopted to prevent the calculation data of the intermediate layers from occupying too much memory: whenever any intermediate layer finishes running, the data generated by that layer's calculation can be cleared automatically, reducing memory occupation as far as possible. It should be pointed out that not all the data in a layer's storage region needs to be cleared every time an intermediate layer finishes running. For example, for a multi-task neural network model, when the operation result of a single task has been calculated, that operation result can be retained temporarily and deleted only after all tasks are complete, thereby avoiding repeated computation of the same operation result. For such cases, reference counting can be used to decide which operation results need to be deleted: a to-be-referenced count is set for each operation result, and only when the count is 0 (or some other specified value) does the operation result need to be cleared. That is, the above step of clearing the storage region may specifically include:

clearing the running data other than the operation result;

obtaining the current to-be-referenced count of the operation result; and

clearing the operation result when the to-be-referenced count equals a preset threshold; or, when the to-be-referenced count is greater than the preset threshold, returning to the operation of obtaining the storage region corresponding to the current layer.
In this embodiment, the preset threshold may be 0 or some other manually specified value. The clearing operation can be performed through a designated clearing interface, which is stored in advance in the compiled static or dynamic library and called directly from the library when used. The to-be-referenced count mainly refers to the number of times the next layer still needs to use the result. Because the next layer can only use the operation result of the previous layer, other running data (calculation process data, for example) can be cleared directly, while an operation result whose to-be-referenced count has not yet reached the preset threshold still has use value for the next layer and need not be cleared.

In general, the whole clearing operation can be considered complete only when all running data has been cleared. Considering that the clearing operation and the calculation of the next layer can be performed in parallel, the to-be-referenced count of an operation result is continuously updated during clearing. That is, when it is detected that the current layer has finished running, the data processing method may also include:

setting the to-be-referenced count of the operation result according to the next layer of the computing architecture;

after the to-be-referenced count has been set successfully, obtaining the reference information of the operation result while the next layer is running; and

updating the to-be-referenced count according to the reference information.
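This reference-count bookkeeping can be sketched as follows (Python; the Blob class and its method names are hypothetical illustrations of the mechanism). A result's count starts at the number of consumers in the next layer, is decremented on each use, and the result is cleared when the count reaches zero; a dual-task head with results C1 and C2 therefore gives its shared input an initial count of 2.

```python
class Blob:
    """An operation result with a to-be-referenced count: it may be
    cleared only when the count reaches the threshold (here, zero)."""
    def __init__(self, value, refs):
        self.value, self.refs, self.cleared = value, refs, False

    def consume(self):
        v = self.value
        self.refs -= 1
        if self.refs == 0:          # no later layer still needs it
            self.value, self.cleared = None, True
        return v

# network A -> B -> C1 + C2: B feeds two task heads, so its initial
# to-be-referenced count is 2; A feeds only B, so its count is 1
a = Blob(5, refs=1)
b = Blob(a.consume() + 1, refs=2)   # A count 1 -> 0, A cleared
c1 = b.consume() * 10               # B count 2 -> 1, B kept
c2 = b.consume() * 100              # B count 1 -> 0, B cleared
```

No result is ever copied into the next layer's storage region; consumers read the shared value and only the bookkeeping changes, which is the memory saving the passage below describes.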
In this embodiment, the reference information mainly refers to the number of references. Specifically, the initial to-be-referenced count of the current layer's operation result can be preset according to the number of operation results of the next layer; in general, the initial count can be made equal to the number of operation results of the next layer. For example, for a dual-task deep learning model with network structure A->B->C1+C2, the operation results of the last layer are C1 and C2 and the operation result of the intermediate layer is B, so the initial to-be-referenced count of B can be set to 2 and that of A to 1. Then, in the actual calculation process, the to-be-referenced count of each operation result can be updated in real time according to how it is actually referenced. The update method may be: each time the next layer uses the current layer's operation result, the corresponding count is decremented by one. For example, when the C1 result is computed, B's to-be-referenced count is updated to 1; when the C2 result is computed, B's count is updated to 0, and B can then be cleared. Compared with the prior-art approach of directly copying the operation result into the storage region of the next layer for use, the present embodiment replaces copying with reference counting: no data is copied, the method is simple, and memory occupation is reduced.
As can be seen from the above, in the data processing method provided by this embodiment, data to be processed is obtained and input into an embedded deep learning model comprising a multilayer computing architecture. During processing, it is detected whether the current layer has finished running; if so, the storage region corresponding to that layer (which stores the running data generated while the layer runs) is obtained and cleared. When clearing is complete, the next layer becomes the current layer and the detection step is repeated until a result is output. An automatic clearing function for intermediate results is thus enabled during computation, which effectively solves the problem of deep learning models consuming too much calculation memory, and facilitates realizing deep learning models offline in a terminal, with high flexibility and high computational efficiency.
The method described in the above embodiment is described in further detail below with an example.

In this embodiment, the description takes as an example the case where the data processing device is integrated in an electronic device.

As shown in Figs. 2a and 2b, the specific flow of a data processing method may be as follows:

S201: the electronic device converts the trained deep learning model using a preset conversion strategy to obtain converted files.
For example, the deep learning model can be CNN models, and the default switching strategy refers mainly to model transformation tools, its
Can include to deep learning solution to model analyse and by data after parsing be written as file the two operation.It can adjust in advance
With the project file of cmake generation respective operations system (such as Android system) development platforms, afterwards, opened by SDK and IDE etc.
The hair tool loads project file, and the model transformation tools is compiled out according to the actual requirements.It should be noted that in compiling mould
While type crossover tool, the static library or dynamic base of specific components (such as ncnn components), namely ncnn storehouses can be compiled out,
So, should subsequently when the deep learning model constructed using ncnn component construction depth learning models and execution
Ncnn storehouses can is directly as calling storehouse to use, without relying on third party library.
The ncnn library may include a number of build parameters, for example the cmake parameters -DNCNN_STDIO=OFF and/or -DNCNN_STRING=OFF for disabling file loading and string output, the parameter -DWITH_LAYER_xxx=OFF for excluding a built-in layer from compilation, and the cmake parameter -DNCNN_OPENMP=ON for enabling parallel processing on a multi-core CPU. The ncnn library may also provide a number of interfaces, for example the Net::load_param interface for loading an ncnn network file, the Net::load_model interface for loading an ncnn model file, the Net::create_extractor interface for creating a feedforward network calculator, the Extractor::extractor interface for invoking the feedforward network calculator, the Net::set_light_mode interface for enabling the automatic clearing of intermediate results, the Net::set_num_threads interface for controlling the number of threads used, the Extractor::input interface for inputting pending data, and the Extractor::extract interface for outputting the operation result, and so on. Both the Extractor::input and Extractor::extract interfaces use a three-dimensional data structure whose dimension order may be c/h/w; in general, c, h and w differ from one processing object to another.
S202: The electronic device calls a preset loading interface to load the converted file into the electronic device, and obtains the parameter information of each layer's computing architecture from the loaded content.
For example, the converted file may include a model file and a network file; the Net::load_param interface of the ncnn library may be called to load the ncnn network file, and the Net::load_model interface of the ncnn library may be called to load the ncnn model file. The parameter passed to the interface may be a file path or the file content itself. The parameter information may include information such as the types, quantities and weights of the calculation parameters participating in the computation and of the output parameters.
S203: The electronic device filters out the computing architectures that participate in the operation from each layer's computing architecture using a preset screening parameter, and generates a computation path according to the parameter information of the filtered computing architectures, the computation path including the calculation functions of the participating computing architectures and the order of operations between the computing architectures.
For example, screening may be performed using the -DWITH_LAYER_xxx=OFF parameter, where xxx denotes the layer to be filtered out, i.e., a layer whose computing architecture does not participate in the operation; when xxx is 0, all layers' computing architectures are considered to participate in the operation. The Net::create_extractor interface may then be called to generate the computation path according to the types, quantities and weights of the calculation parameters participating in the computation in each layer's computing architecture, so as to create a feedforward network calculator in the electronic device.
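As an illustration of what such a computation path amounts to, the sketch below filters each layer by a screening flag and records the surviving calculation functions in layer order, which then becomes the order of execution. The types and names are hypothetical, not ncnn internals:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one layer's computing architecture.
struct LayerDesc {
    std::string name;
    bool enabled;  // false if excluded by a screening parameter (cf. -DWITH_LAYER_xxx=OFF)
    std::function<float(float)> fn;  // the layer's calculation function
};

// Build the computation path: the enabled layers' functions in layer order.
std::vector<std::function<float(float)>> build_path(const std::vector<LayerDesc>& layers) {
    std::vector<std::function<float(float)>> path;
    for (const auto& l : layers)
        if (l.enabled) path.push_back(l.fn);
    return path;
}

// Run the path: each layer consumes the previous layer's output.
float run_path(const std::vector<std::function<float(float)>>& path, float x) {
    for (const auto& fn : path) x = fn(x);
    return x;
}
```

The layer-by-layer ordering is what fixes the operation order between computing architectures; the feedforward calculator simply walks the path from the first enabled layer to the last.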
S204: The electronic device obtains its own processor count and determines a corresponding thread count according to that processor count; at the same time, it determines the size of the corresponding storage region according to the parameter information of each layer's computing architecture.
For example, the Net::set_num_threads interface may be called to set the required thread count; for an eight-core electronic device, the number of threads participating in the computation may be set to 8. The required storage space can be determined according to each layer's output parameters, calculation parameters and so on; in general, the more output parameters and calculation parameters there are, the larger the required storage space. Referring to Fig. 2c, which shows a deep learning model with an n-layer network structure: the second layer needs the five operation results of the first layer as its calculation parameters and produces four operation results as its output parameters, while the third layer needs the four operation results of the second layer as its calculation parameters and produces three operation results as its output parameters; generally, therefore, the storage space required by the second layer is larger than that required by the third layer. Of course, the influence of other factors, such as parameter type, on the storage space also needs to be considered. In addition, since the CPUs of existing electronic devices are mostly of the ARM architecture (a 32-bit reduced instruction set computer), the memory of each channel can further be 16-byte aligned to speed up reading.
S205: The electronic device stores the computation path, the thread count and the size of the storage region, so as to embed the deep learning model in the electronic device.
For example, the computation path, the thread count and the size of the storage region can be used as the necessary data for creating the feedforward network calculator in the electronic device, and the calling relationship between the Extractor::extractor interface and the feedforward network calculator can be established, thereby embedding the deep learning model in the electronic device using the ncnn component. Because the design of the ncnn component fully takes the hardware performance of electronic devices into account, parsing and restructuring the deep learning model with the ncnn component makes the model better suited to the electronic device.
It should be noted that steps S203 and S204 have no fixed execution order; they may be performed simultaneously or one after the other.
S206: The electronic device obtains the pending data.
For example, for a CNN model the pending data may be an image, which may be a black-and-white image or a color image.
S207: The electronic device calls idle threads equal in number to the thread count and, using the called idle threads, computes the pending data in the storage region of the corresponding size according to the computation path.
For example, the pending data may first be input into the deep learning model through the Extractor::input interface. For instance, when the pending data is an RGB color image (or a black-and-white image), the input has c = the number of image channels, namely 3 (or 1 for a black-and-white image), h = the image height, and w = the image width; when the pending data is speech, the input has c = h = 1 and w = the number of samples. The feedforward network calculator may then be invoked through the Extractor::extractor interface to compute the pending data.
S208: During the computation, the electronic device detects whether the current-layer computing architecture has finished running; if so, the following steps S209-S210 and S211-S212 are performed simultaneously; if not, step S208 is executed again.
S209: The electronic device obtains the storage region corresponding to the current-layer computing architecture, and clears the running data in that storage region other than the operation result.
For example, once the current-layer computing architecture has finished computing, the Net::set_light_mode interface may be called to clear the running data. During clearing, the running data other than the operation results can be cleared directly, while for the operation results it is necessary to decide whether clearing is needed with reference to the to-be-referenced count.
S210: The electronic device obtains the current to-be-referenced count of the operation result and detects whether the to-be-referenced count equals a preset threshold; if so, the following step S211 is performed; if not, the process returns to step S209.
S211: The electronic device clears the operation result; when the clearing is complete, the next-layer computing architecture is taken as the current-layer computing architecture, and the process returns to step S209, until a processing result is output.
For example, the preset threshold may be 0: only when the to-be-referenced count is 0 is the operation result considered to need clearing. In general, only after all the running data in the storage region of the current-layer computing architecture has been cleared can the clearing of the next-layer computing architecture's storage region begin. The Extractor::extract interface may be called to output the processing result, where, when the pending data is an image, the output has c = the image class number and h = w = the dimension of that class's confidence (usually 1); when the pending data is speech, c = the speech class number and h = w = the dimension of that class's confidence (usually 1).
S212: The electronic device sets the to-be-referenced count of the current-layer computing architecture's operation result according to the next-layer computing architecture.
For example, for a deep learning model with the network structure A->B->C1+C2, at initial setup the to-be-referenced count of A may be 1, that of B may be 2, and that of C1 and C2 may be 3.
S213: After the to-be-referenced count has been set successfully, the electronic device obtains the reference information of the next-layer computing architecture for the operation result at run time, and updates the to-be-referenced count according to the reference information.
For example, the reference information mainly refers to the number of references made. For instance, when the result of B is computed, the result of A is referenced once during the computation, so A's to-be-referenced count is updated from 1 to 0; likewise, when the result of C1 is computed, B's to-be-referenced count is updated from 2 to 1, and when the result of C2 is computed, B's to-be-referenced count is updated from 1 to 0. In this way an operation result that has not yet been fully used is prevented from being cleared, ensuring that each operation result can be fully referenced by the next-layer computing architecture.
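The counting scheme of steps S212-S213 can be sketched as follows for the A->B->C1+C2 example. The structure is hypothetical; in practice this bookkeeping would live inside the feedforward network calculator:

```cpp
#include <map>
#include <string>

// Hypothetical to-be-referenced counts for a network such as A -> B -> C1 + C2:
// A is used once (by B); B is used twice (by C1 and C2).
struct RefCounter {
    std::map<std::string, int> pending;  // to-be-referenced count per operation result

    void set_count(const std::string& result, int uses) { pending[result] = uses; }

    // Called each time a next-layer computing architecture references `result`;
    // returns true when the count reaches 0, i.e. the result may be cleared.
    bool use(const std::string& result) {
        return --pending[result] == 0;
    }
};
```

A result is therefore cleared exactly when its last consumer has read it, which is the behavior described above for A and B.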
In summary, in the data processing method provided by this embodiment, the electronic device can convert a trained deep learning model using a preset conversion strategy to obtain a converted file; call a preset loading interface to load the converted file into the electronic device and obtain the parameter information of each layer's computing architecture from the loaded content; filter out the computing architectures participating in the operation from each layer's computing architecture using a preset screening parameter, and generate a computation path according to the parameter information of the filtered computing architectures, the computation path including the calculation functions of the participating computing architectures and the order of operations between them; at the same time, obtain its own processor count, determine a corresponding thread count according to the processor count, and determine the size of the corresponding storage region according to the parameter information of each layer's computing architecture; and then store the computation path, the thread count and the size of the storage region, so as to embed the deep learning model in the electronic device. The deep learning model can thus be embedded in the electronic device using the ncnn component, which is implemented entirely in pure C++ without any third-party library and whose library memory footprint is small. In subsequent actual use, the electronic device can obtain pending data, call idle threads equal in number to the thread count and, using the called idle threads, compute the pending data in the storage region of the corresponding size according to the computation path. During the computation it detects whether the current-layer computing architecture has finished running; if so, it obtains the storage region corresponding to the current-layer computing architecture, clears the running data in that region other than the operation result, meanwhile obtains the operation result's current to-be-referenced count and detects whether it equals the preset threshold, and if so calls a preset clearing interface to clear the operation result; when the clearing is complete, it takes the next-layer computing architecture as the current-layer computing architecture and returns to the step of obtaining the storage region corresponding to the current-layer computing architecture, until a processing result is output. At the same time, the electronic device can set the to-be-referenced count of the current-layer computing architecture's operation result according to the next-layer computing architecture and, after the count has been set successfully, obtain the next-layer computing architecture's reference information for the operation result at run time and update the to-be-referenced count according to the reference information. An automatic clearing function for intermediate results is thereby enabled during the computation of the deep learning model, which alleviates the problem of the model consuming too much computing memory and facilitates offline deployment of the deep learning model on the terminal; flexibility is high and, since multi-threaded parallel computation is employed, processing concurrency is good and computational efficiency is high.
Based on the method described in the above embodiments, this embodiment provides a further description from the perspective of the data processing apparatus, which may be integrated in an electronic device.
Referring to Fig. 3a, which shows in detail the data processing apparatus provided by an embodiment of the present invention, the apparatus may include: an acquisition module 10, a processing module 20, a detection module 30 and a clearing module 40, wherein:
(1) Acquisition module 10
The acquisition module 10 is configured to obtain pending data.
In this embodiment, the pending data may include multimedia data such as images and speech, and may be currently captured data or data downloaded from the Internet.
(2) Processing module 20
The processing module 20 is configured to input the pending data into the embedded deep learning model for processing, the deep learning model including multiple layers of computing architecture.
In this embodiment, the deep learning model may include neural network models such as a CNN, a DBN, an RNN, a recursive neural tensor network and an autoencoder. Generally, the deep learning model includes multiple layers of computing architecture, which may constitute an input layer, an output layer and at least one hidden intermediate layer; these layers combine to form the overall framework of the deep learning model, implementing functions such as image recognition, face detection and image segmentation. The pending data may be input into the deep learning model through a designated input interface, which may be a user-defined three-dimensional data structure whose dimension order may be c/h/w; in general, c, h and w differ from one processing object to another. For example, when the pending data is an RGB color image, c = the number of image channels, namely 3, h = the image height and w = the image width, and the content input through the designated input interface is the pixel values of the image (0-255); when the pending data is speech, c = h = 1 and w = the number of samples, the content input through the designated input interface is the sampled values (-1 to 1), and the number of samples is determined by the sampling frequency and the sampling duration.
It should be pointed out that existing deep learning models are all constructed with components such as caffe, torch and tensorflow, and the use of these components depends on third-party libraries. If an existing deep learning model were grafted directly onto an electronic device to run, the cross-platform use of libraries would inevitably be involved and the running speed of the electronic device would be low. Therefore, the embedded deep learning model in this embodiment is constructed with a specific component, such as the ncnn component. The ncnn component is a high-performance neural network forward-computation framework tailored to the hardware characteristics of electronic devices; it is dedicated to CNN models running on electronic devices and needs no third-party library in use, thereby solving the cross-platform library problem that existing CNN models encounter on electronic devices and greatly increasing the running speed of the electronic device.
Of course, the specific component should be prepared in advance. Specifically, cmake can first be called to generate the project files of the development platform of the corresponding operating system; generally, electronic devices with different operating systems correspond to different project files. The project files are then loaded by development tools such as an SDK and an IDE, and the static or dynamic library of the specific component is compiled as required, so that when a deep learning model is later constructed with the specific component and executed, the static or dynamic library can be used directly as the calling library, without depending on any third-party library. In addition, constructing the deep learning model in the electronic device with the compiled specific component should also be done in advance. That is, referring to Fig. 3b, the data processing apparatus may further include an embedding module 50, which may specifically include a second acquisition submodule 51, a generation submodule 52 and a storage submodule 53, wherein:
The second acquisition submodule 51 is configured to obtain the parameter information of each layer's computing architecture from a trained deep learning model before the processing module 20 inputs the pending data into the embedded deep learning model for processing.
For example, the second acquisition submodule 51 may specifically be configured to:
convert the trained deep learning model using a preset conversion strategy to obtain a converted file;
call a preset loading interface to load the converted file into the electronic device; and
obtain the parameter information of each layer's computing architecture from the loaded content.
In this embodiment, the trained deep learning model is obtained by training an existing deep learning model with samples; it is usually expressed as protobuf (a data interchange format from Google) serialized data or as specific binary data. The preset conversion strategy is mainly used to convert the existing deep learning model into a file that meets the format required by the designated component. It may be a manually written model conversion program, which may comprise two operations: parsing the deep learning model and writing the parsed data into a file. The converted file may include a model file and a network file, the model file being a file that meets the model structure required by the designated component and the network file being a file that meets the network structure required by the designated component.
The preset loading interface is mainly used to load files and can be called directly from the compiled static or dynamic library; files of different formats call different preset loading interfaces. The preset loading interface may adopt either of two loading modes, loading a file path or loading file content, the choice depending on the memory size of the electronic device: when memory is sufficient, loading a file path may be selected, accessing the model resource in the normal manner; when memory is insufficient, loading file content may be selected, accessing the model resource in the form of memory. The parameter information may include information such as the types, quantities and weights of the parameters participating in the computation; this information is contained in the converted file and can be read smoothly only after the file has been loaded into the electronic device.
The generation submodule 52 is configured to generate a computation path according to the parameter information of each layer's computing architecture, the computation path including the calculation functions of the computing architectures participating in the operation and the order of operations between the computing architectures.
In this embodiment, the computation path may be obtained by ordering the calculation functions of all or some of the layers' computing architectures according to the layer-by-layer transfer order. It may cover all computing architectures or only some of them, depending mainly on the task requirements: for a task that needs all the layer functions of the deep learning model, the generation submodule 52 needs to generate the computation path for all computing architectures, while for a task that needs only some of the model's layer functions, the generation submodule 52 may generate the computation path for only some of the computing architectures.
For example, the generation submodule 52 may specifically be configured to:
filter out the computing architectures participating in the operation from each layer's computing architecture using a preset screening parameter; and
generate the computation path according to the parameter information of the filtered computing architectures.
In this embodiment, the preset screening parameter is mainly used to filter out the layers that are not needed. It can be set when the static or dynamic library is compiled, so that it can be called directly from the library when the computation path is generated, giving high flexibility. Of course, to accommodate all task requirements, the preset screening parameter may be defined such that, when it takes a certain value (for example 0), no filtering is considered necessary.
The storage submodule 53 is configured to store the computation path in the electronic device, so as to embed the deep learning model in the electronic device.
In this case, the processing module 20 may specifically be configured to process the pending data according to the computation path.
In this embodiment, since the computation path is generated from several layers of the deep learning model's computing architecture, when the processing module 20 processes the pending data using the computation path, all or some of the layer functions of the deep learning model can be realized.
In addition, the process of embedding the deep learning model in the electronic device involves not only the setup of the computation process but also the allocation of the threads that perform the computation. That is, referring to Fig. 3c, the embedding module 50 may further include a determination submodule 54 configured to:
detect the processor count of the electronic device; and determine a corresponding thread count according to the processor count, and store the thread count in the electronic device.
In this embodiment, the thread count required by the computation process can be controlled through a designated thread-control interface; the thread count may be set equal to the processor count or according to other rules. The designated thread-control interface can be compiled into the dynamic or static library in advance so that it can be called directly. In this case, the processing module 20 may further be configured to: call idle threads equal in number to the thread count, and process the pending data according to the computation path using the called idle threads.
In this embodiment, the determination submodule 54 may make the thread count equal to the number of CPUs. For a multi-CPU electronic device, multiple computation tasks can then be assigned to different threads and computed in parallel on different CPU cores at the same time, which greatly increases processing speed.
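A minimal sketch of this thread assignment, using standard C++ threads (an illustration only; it is not the OpenMP scheduling that -DNCNN_OPENMP=ON enables in the ncnn library), might look like:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Split `tasks` evenly across as many threads as the device reports CPU cores,
// each thread computing its own slice in parallel; here each "task" simply
// doubles a value to stand in for a layer's computation.
std::vector<int> run_parallel(std::vector<int> tasks) {
    unsigned n = std::max(1u, std::thread::hardware_concurrency());  // thread count = CPU count
    std::vector<std::thread> pool;
    size_t chunk = (tasks.size() + n - 1) / n;
    for (unsigned t = 0; t < n; ++t) {
        size_t lo = t * chunk, hi = std::min(tasks.size(), lo + chunk);
        if (lo >= hi) break;
        pool.emplace_back([&tasks, lo, hi] {
            for (size_t i = lo; i < hi; ++i) tasks[i] *= 2;
        });
    }
    for (auto& th : pool) th.join();  // wait for all slices before returning
    return tasks;
}
```

Because the slices are disjoint, no locking is needed; each core works on its own range of tasks, mirroring the parallel assignment described above.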
In addition, embedding the deep learning model in the electronic device also involves the allocation of computing memory. That is, the determination submodule 54 may further be configured to:
determine the size of the corresponding storage region according to the parameter information of each layer's computing architecture.
In this embodiment, the parameter information may also include information such as the quantity of the output parameters (namely, the operation results) and the weights of each output parameter. Generally, the determination submodule 54 can determine the required storage space according to the layer's output parameters, calculation parameters and so on, to ensure that the layer's computation can proceed normally. In this case, the processing module 20 may be configured to: using the called idle threads, compute the pending data accordingly in the corresponding storage region according to the computation path.
(3) Detection module 30
The detection module 30 is configured to detect, during processing, whether the current-layer computing architecture has finished running.
In this embodiment, finishing running means that all the calculation functions corresponding to the current-layer computing architecture in the computation path have been executed.
(4) Clearing module 40
The clearing module 40 is configured to, if the detection module detects that the current-layer computing architecture has finished running, obtain the storage region corresponding to the current-layer computing architecture and clear the storage region, the storage region being used to store the running data produced by the current-layer computing architecture at run time; and, when the clearing is complete, take the next-layer computing architecture as the current-layer computing architecture and trigger the detection module to perform again the step of detecting whether the current-layer computing architecture has finished running, until a processing result is output.
In this embodiment, the final output result can be output through a designated output interface stored in advance in the static or dynamic library. The designated output interface may also be a three-dimensional data structure c/h/w, and its output content differs with the processing task: for an image recognition task, c = the image class number and h = w = the dimension of that class's confidence (usually 1); for a speech recognition task, c = the speech class number and h = w = the dimension of that class's confidence (usually 1).
Considering that in most deep learning models the computed data of the intermediate layers is not needed, i.e., only the computed data of the last layer is needed as the output result, an automatic intermediate-data reclamation mechanism can be used to prevent the intermediate layers' computed data from occupying too much memory: whenever any intermediate-layer computing architecture finishes running, the data produced by that layer's computation can be cleared automatically, reducing memory usage as much as possible. It should be pointed out that it is not necessary to clear all the data in a layer's storage region every time an intermediate-layer computing architecture finishes running. For example, for a multi-task neural network model, when the operation result of a single task has been computed, that operation result can be retained temporarily and deleted only after all the tasks have been completed, thereby avoiding repeated computation of the same operation result. For such cases, reference counting can be used to determine which operation results need to be deleted; a to-be-referenced count can be set for each operation result, and the operation result needs to be cleared only when its to-be-referenced count is 0 or another specified value. That is, referring to Fig. 3d, the running data may include operation results, and the clearing module 40 may specifically include a first clearing submodule 41, a first acquisition submodule 42 and a second clearing submodule 43, wherein:
the first clearing submodule 41 is configured to clear the running data other than the operation result;
the first acquisition submodule 42 is configured to obtain the current to-be-referenced count of the operation result; and
the second clearing submodule 43 is configured to clear the operation result when the to-be-referenced count equals the preset threshold, and to return to the operation of obtaining the storage region corresponding to the current-layer computing architecture when the to-be-referenced count is greater than the preset threshold.
In this embodiment, the preset threshold may be manually set to 0 or another specified value. The clearing module 40 may perform the clearing operation through a designated clearing interface, which is stored in advance in the compiled static or dynamic library and called directly from the library when used. The to-be-referenced count mainly refers to the number of times the next-layer computing architecture still needs to use the result. Since the next-layer computing architecture can only use the operation results of the previous layer, other running data, such as intermediate computation data, can be cleared directly by the first clearing submodule 41; an operation result whose to-be-referenced count has not reached the preset threshold still has use value for the next-layer computing architecture, so the second clearing submodule 43 may refrain from clearing it.
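The division of labor between the first and second clearing submodules can be sketched as follows. The structures here are illustrative stand-ins, not the implementation behind ncnn's Net::set_light_mode:

```cpp
#include <vector>

// Hypothetical storage region of one layer: scratch running data plus the
// layer's operation result together with its to-be-referenced count.
struct LayerStorage {
    std::vector<float> scratch;  // running data other than the operation result
    std::vector<float> result;   // the layer's operation result
    int pending_refs;            // to-be-referenced count
};

// Clear a finished layer: scratch data is always dropped immediately, while the
// operation result is dropped only once its count has fallen to the threshold.
void clear_layer(LayerStorage& s, int threshold = 0) {
    s.scratch.clear();
    s.scratch.shrink_to_fit();   // first clearing submodule: unconditional
    if (s.pending_refs <= threshold) {
        s.result.clear();
        s.result.shrink_to_fit();  // second clearing submodule: count-gated
    }
}
```

Calling `clear_layer` twice, once while the result is still referenced and once after its count has reached the threshold, reproduces the two-stage behavior described above.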
Generally, only when all the running data has been cleared can the whole clearing operation be considered complete. Considering that the clearing operation and the computation of the next-layer computing architecture can be performed in parallel, the to-be-referenced count of an operation result is constantly updated during the clearing process. That is, the data processing apparatus may further include an update module 60 configured to:
when the detection module 30 detects that the current-layer computing architecture has finished running, set the to-be-referenced count of the operation result according to the next-layer computing architecture; and
after the to-be-referenced count has been set successfully, obtain the next-layer computing architecture's reference information for the operation result at run time, and update the to-be-referenced count according to the reference information.
In this embodiment, the reference information mainly refers to the number of references. Specifically, the initial to-be-referenced count of the current layer's operation result may be preset according to the number of operation results of the next-layer computing architecture; generally, the update module 60 may set this initial count equal to the number of operation results of the next-layer computing architecture. For example, for a dual-task deep learning model with network structure A->B->C1+C2, the operation results of the last layer are C1 and C2 and the operation result of the intermediate layer is B, so the initial to-be-referenced count of B may be set to 2, and that of A may be set to 1. Afterwards, during actual calculation, the update module 60 may update the to-be-referenced count of each operation result in real time according to how it is actually referenced. The update method may be: each time the next-layer computing architecture uses the current layer's operation result, the corresponding count decreases by one. For example, when the C1 result is computed, the count of B is updated to 1; when the C2 result is computed, the count of B is updated to 0, and B may then be cleared. Compared with the prior-art approach of copying the operation result into the storage region of the next-layer computing architecture for use, this embodiment uses reference counting instead of copying; no data copy is required, the method is simple, and memory usage is reduced.
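The A->B->C1+C2 example above can be sketched as follows. This is a minimal illustration of the counting scheme, not the patent's implementation; the `consume` helper is hypothetical:

```python
# Reference counts initialised from the number of consumers of each result:
# B feeds both C1 and C2, so its count starts at 2; A feeds only B.
refcount = {"A": 1, "B": 2}

def consume(name):
    """Called each time a next-layer architecture reads a stored result;
    when the count reaches zero the result can be cleared."""
    refcount[name] -= 1
    return refcount[name] == 0   # True -> safe to free now

consume("A")                    # B reads A -> A can be freed
assert consume("B") is False    # C1 reads B -> count 1, keep B
assert consume("B") is True     # C2 reads B -> count 0, free B
```

The count reaching zero is exactly the "to-be-referenced count equal to the preset threshold" condition with the threshold set to 0.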
In specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily and implemented as the same entity or several entities. For the specific implementation of the above units, refer to the foregoing method embodiments, which are not repeated here.
As can be seen from the above, in the data processing apparatus provided in this embodiment, the acquisition module 10 obtains the data to be processed, and the processing module 20 inputs the data to be processed into the embedded deep learning model for processing, the deep learning model including a multi-layer computing architecture. During processing, the detection module 30 detects whether the current-layer computing architecture has finished running; if so, the cleaning module 40 obtains the storage region corresponding to the current-layer computing architecture and clears it, the storage region being used to store the service data generated when the current-layer computing architecture runs. Afterwards, when cleaning is complete, the next-layer computing architecture is taken as the current-layer computing architecture, and the step of detecting whether the current-layer computing architecture has finished running is performed again, until the result is output. Intermediate results are thus cleared automatically during calculation, which effectively alleviates the excessive memory consumption of deep learning model calculation and facilitates the offline implementation of deep learning models on terminals, with high flexibility and high computational efficiency.
Accordingly, an embodiment of the present invention further provides a data processing system, including any data processing apparatus provided by the embodiments of the present invention; the data processing apparatus may be integrated in an electronic device.
The electronic device may: obtain data to be processed; input the data to be processed into the deep learning model embedded in the terminal for processing, the deep learning model including a multi-layer computing architecture; during processing, detect whether the current-layer computing architecture has finished running; if it is detected that the current-layer computing architecture has finished running, obtain the storage region corresponding to the current-layer computing architecture and clear the storage region, the storage region being used to store the service data generated when the current-layer computing architecture runs; and when cleaning is complete, take the next-layer computing architecture as the current-layer computing architecture and return to the step of detecting whether the current-layer computing architecture has finished running, until the result is output.
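The layer-by-layer loop described above (run a layer, clear its storage region, advance to the next layer) might be sketched as follows. This is an illustrative sketch under assumed interfaces; `run_model` and the layer callables are hypothetical stand-ins, not the patent's actual code:

```python
def run_model(layers, data):
    """Run a multi-layer model, clearing each layer's storage
    region as soon as that layer has finished running."""
    storage = {}
    result = data
    for i, layer in enumerate(layers):
        storage[i] = {"input": result}   # the layer's working storage
        result = layer(result)           # current layer finishes running
        del storage[i]                   # clear its storage region at once
    return result

# Example: three trivial "layers"
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
print(run_model(layers, 5))  # -> 9
```

At any point in the loop only one layer's working storage is alive, which is the source of the memory saving the embodiment claims.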
For the specific implementation of each device, refer to the foregoing embodiments, which are not repeated here.
Since the data processing system may include any data processing apparatus provided by the embodiments of the present invention, it can achieve the beneficial effects of any such apparatus; refer to the foregoing embodiments, which are not repeated here.
Accordingly, an embodiment of the present invention further provides an electronic device. FIG. 4 shows a schematic structural diagram of the electronic device involved in the embodiment of the present invention. Specifically:
The electronic device may include components such as a processor 701 with one or more processing cores, a memory 702 with one or more computer-readable storage media, a power supply 703, and an input unit 704. Those skilled in the art will understand that the electronic device structure shown in FIG. 4 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or use a different component arrangement. Here:
The processor 701 is the control center of the electronic device. It connects the various parts of the whole electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 702 and invoking the data stored in the memory 702, thereby monitoring the electronic device as a whole. Optionally, the processor 701 may include one or more processing cores. Preferably, the processor 701 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, applications, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 701.
The memory 702 may be used to store software programs and modules; the processor 701 performs various functional applications and data processing by running the software programs and modules stored in the memory 702. The memory 702 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, applications required by at least one function (such as a sound playback function or an image playback function), and the like, and the data storage area may store data created according to the use of the electronic device, and the like. In addition, the memory 702 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Correspondingly, the memory 702 may further include a memory controller to provide the processor 701 with access to the memory 702.
The electronic device further includes a power supply 703 that supplies power to the components. Preferably, the power supply 703 may be logically connected to the processor 701 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 703 may further include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The electronic device may further include an input unit 704, which may be used to receive input digit or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control.
Although not shown, the electronic device may further include a display unit and the like, which are not repeated here. Specifically, in this embodiment, the processor 701 in the electronic device loads the executable files corresponding to the processes of one or more applications into the memory 702 according to the following instructions, and the processor 701 runs the applications stored in the memory 702, thereby implementing various functions as follows:
obtaining data to be processed;
inputting the data to be processed into the deep learning model embedded in the terminal for processing, the deep learning model including a multi-layer computing architecture;
during processing, detecting whether the current-layer computing architecture has finished running;
if it is detected that the current-layer computing architecture has finished running, obtaining the storage region corresponding to the current-layer computing architecture and clearing the storage region, the storage region being used to store the service data generated when the current-layer computing architecture runs; and
when cleaning is complete, taking the next-layer computing architecture as the current-layer computing architecture, and returning to the step of detecting whether the current-layer computing architecture has finished running, until the result is output.
The electronic device can achieve the beneficial effects of any data processing apparatus provided by the embodiments of the present invention; refer to the foregoing embodiments, which are not repeated here.
Those of ordinary skill in the art will understand that all or part of the steps in the various methods of the above embodiments may be completed by instructions, or by instructions controlling related hardware; the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
Therefore, an embodiment of the present invention provides a storage medium storing a plurality of instructions that can be loaded by a processor to perform the steps in any data processing method provided by the embodiments of the present invention. For example, the instructions may perform the following steps:
obtaining data to be processed;
inputting the data to be processed into the embedded deep learning model for processing, the deep learning model including a multi-layer computing architecture;
during processing, detecting whether the current-layer computing architecture has finished running;
if it is detected that the current-layer computing architecture has finished running, obtaining the storage region corresponding to the current-layer computing architecture and clearing the storage region, the storage region being used to store the service data generated when the current-layer computing architecture runs; and
when cleaning is complete, taking the next-layer computing architecture as the current-layer computing architecture, and returning to the step of detecting whether the current-layer computing architecture has finished running, until the result is output.
For the specific implementation of each operation, refer to the foregoing embodiments, which are not repeated here.
The storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
Since the instructions stored in the storage medium can perform the steps in any data processing method provided by the embodiments of the present invention, they can achieve the beneficial effects of any such method; refer to the foregoing embodiments, which are not repeated here.
The data processing method, apparatus, storage medium, electronic device, and system provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the descriptions of the above embodiments are only intended to help understand the method and core idea of the present invention. Meanwhile, those skilled in the art may make changes to the specific implementations and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (15)
- 1. A data processing method, characterized by comprising:
obtaining data to be processed;
inputting the data to be processed into an embedded deep learning model for processing, the deep learning model comprising a multi-layer computing architecture;
during processing, detecting whether a current-layer computing architecture has finished running;
if it is detected that the current-layer computing architecture has finished running, obtaining a storage region corresponding to the current-layer computing architecture and clearing the storage region, the storage region being used to store service data generated when the current-layer computing architecture runs; and
when cleaning is complete, taking the next-layer computing architecture as the current-layer computing architecture, and returning to the step of detecting whether the current-layer computing architecture has finished running, until a result is output.
- 2. The data processing method according to claim 1, characterized in that the service data comprises an operation result, and clearing the storage region comprises:
clearing the service data other than the operation result;
obtaining a current to-be-referenced count of the operation result; and
when the to-be-referenced count is equal to a preset threshold, clearing the operation result; when the to-be-referenced count is greater than the preset threshold, returning to the operation of obtaining the storage region corresponding to the current-layer computing architecture.
- 3. The data processing method according to claim 2, characterized by further comprising, when it is detected that the current-layer computing architecture has finished running:
setting the to-be-referenced count of the operation result according to the next-layer computing architecture;
after the to-be-referenced count is set successfully, obtaining reference information of the next-layer computing architecture on the operation result during running; and
updating the to-be-referenced count according to the reference information.
- 4. The data processing method according to any one of claims 1-3, characterized by further comprising, before inputting the data to be processed into the embedded deep learning model for processing:
obtaining parameter information of each layer of computing architecture from a trained deep learning model;
generating a calculation path according to the parameter information of each layer of computing architecture, the calculation path comprising calculation functions of the computing architectures participating in the operation and an operation order between the computing architectures; and
storing the calculation path in an electronic device, so as to embed the deep learning model in the electronic device;
wherein inputting the data to be processed into the embedded deep learning model for processing comprises: processing the data to be processed according to the calculation path.
- 5. The data processing method according to claim 4, characterized in that generating a calculation path according to the parameter information of each layer of computing architecture comprises:
screening out the computing architectures participating in the operation from each layer of computing architecture using a preset screening parameter; and
generating the calculation path according to the parameter information of the screened-out computing architectures.
- 6. The data processing method according to claim 4, characterized in that obtaining parameter information of each layer of computing architecture from a trained deep learning model comprises:
converting the trained deep learning model using a preset conversion strategy to obtain a converted file;
calling a preset loading interface to load the converted file in the electronic device; and
obtaining the parameter information of each layer of computing architecture from the loaded content.
- 7. The data processing method according to claim 4, characterized in that embedding the deep learning model in the electronic device further comprises:
detecting a processor quantity of the electronic device; and
determining a corresponding thread quantity according to the processor quantity, and storing the thread quantity in the electronic device;
wherein processing the data to be processed according to the calculation path comprises: calling idle threads equal in number to the thread quantity, and processing the data to be processed according to the calculation path using the called idle threads.
- 8. A data processing apparatus, characterized by comprising:
an acquisition module, configured to obtain data to be processed;
a processing module, configured to input the data to be processed into an embedded deep learning model for processing, the deep learning model comprising a multi-layer computing architecture;
a detection module, configured to detect, during processing, whether a current-layer computing architecture has finished running; and
a cleaning module, configured to: if the detection module detects that the current-layer computing architecture has finished running, obtain a storage region corresponding to the current-layer computing architecture and clear the storage region, the storage region being used to store service data generated when the current-layer computing architecture runs; and when cleaning is complete, take the next-layer computing architecture as the current-layer computing architecture and trigger the detection module to return to the step of detecting whether the current-layer computing architecture has finished running, until a processing result is output.
- 9. The data processing apparatus according to claim 8, characterized in that the service data comprises an operation result, and the cleaning module specifically comprises:
a first cleaning submodule, configured to clear the service data other than the operation result;
a first acquisition submodule, configured to obtain a current to-be-referenced count of the operation result; and
a second cleaning submodule, configured to clear the operation result when the to-be-referenced count is equal to a preset threshold, and to return to the operation of obtaining the storage region corresponding to the current-layer computing architecture when the to-be-referenced count is greater than the preset threshold.
- 10. The data processing apparatus according to claim 9, characterized by further comprising an update module, configured to:
when the detection module detects that the current-layer computing architecture has finished running, set the to-be-referenced count of the operation result according to the next-layer computing architecture;
after the to-be-referenced count is set successfully, obtain reference information of the next-layer computing architecture on the operation result during running; and
update the to-be-referenced count according to the reference information.
- 11. The data processing apparatus according to any one of claims 8-10, characterized by further comprising an embedding module, the embedding module specifically comprising:
a second acquisition submodule, configured to obtain parameter information of each layer of computing architecture from a trained deep learning model before the processing module inputs the data to be processed into the embedded deep learning model for processing;
a generation submodule, configured to generate a calculation path according to the parameter information of each layer of computing architecture, the calculation path comprising calculation functions of the computing architectures participating in the operation and an operation order between the computing architectures; and
a storage submodule, configured to store the calculation path in an electronic device, so as to embed the deep learning model in the electronic device;
wherein the processing module is specifically configured to process the data to be processed according to the calculation path.
- 12. The data processing apparatus according to claim 11, characterized in that the generation submodule is specifically configured to:
screen out the computing architectures participating in the operation from each layer of computing architecture using a preset screening parameter; and
generate the calculation path according to the parameter information of the screened-out computing architectures.
- 13. The data processing apparatus according to claim 11, characterized in that the second acquisition submodule is specifically configured to:
convert the trained deep learning model using a preset conversion strategy to obtain a converted file;
call a preset loading interface to load the converted file in the electronic device; and
obtain the parameter information of each layer of computing architecture from the loaded content.
- 14. The data processing apparatus according to claim 11, characterized in that the embedding module further comprises a determination submodule, configured to:
detect a processor quantity of the electronic device; and determine a corresponding thread quantity according to the processor quantity, and store the thread quantity in the electronic device;
wherein the processing module is specifically configured to: call idle threads equal in number to the thread quantity, and process the data to be processed according to the calculation path using the called idle threads.
- 15. A storage medium, characterized in that the storage medium stores a plurality of instructions, the instructions being suitable for being loaded by a processor to perform the steps in the data processing method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710735990.7A CN107563512B (en) | 2017-08-24 | 2017-08-24 | Data processing method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710735990.7A CN107563512B (en) | 2017-08-24 | 2017-08-24 | Data processing method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107563512A true CN107563512A (en) | 2018-01-09 |
CN107563512B CN107563512B (en) | 2023-10-17 |
Family
ID=60976926
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710735990.7A Active CN107563512B (en) | 2017-08-24 | 2017-08-24 | Data processing method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107563512B (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108279881A (en) * | 2018-02-11 | 2018-07-13 | 深圳竹信科技有限公司 | Cross-platform realization framework based on deep learning predicted portions and method |
CN108764487A (en) * | 2018-05-29 | 2018-11-06 | 北京百度网讯科技有限公司 | For generating the method and apparatus of model, the method and apparatus of information for identification |
CN110009110A (en) * | 2019-03-19 | 2019-07-12 | 福建天晴数码有限公司 | STN network optimized approach, storage medium based on the library class UTXO |
WO2019141014A1 (en) * | 2018-01-16 | 2019-07-25 | 腾讯科技(深圳)有限公司 | Chip-based instruction set processing method and apparatus, and storage medium |
CN110276322A (en) * | 2019-06-26 | 2019-09-24 | 湖北亿咖通科技有限公司 | A kind of image processing method and device of the unused resource of combination vehicle device |
CN110503644A (en) * | 2019-08-27 | 2019-11-26 | 广东工业大学 | Defects detection implementation method, defect inspection method and relevant device based on mobile platform |
CN110647996A (en) * | 2018-06-08 | 2020-01-03 | 上海寒武纪信息科技有限公司 | Execution method and device of universal machine learning model and storage medium |
CN111145076A (en) * | 2019-12-27 | 2020-05-12 | 深圳鲲云信息科技有限公司 | Data parallelization processing method, system, equipment and storage medium |
WO2020093205A1 (en) * | 2018-11-05 | 2020-05-14 | 深圳市欢太科技有限公司 | Deep learning computation method and related device |
CN111291882A (en) * | 2018-12-06 | 2020-06-16 | 北京百度网讯科技有限公司 | Model conversion method, device, equipment and computer storage medium |
CN112149908A (en) * | 2020-09-28 | 2020-12-29 | 深圳壹账通智能科技有限公司 | Vehicle driving prediction method, system, computer device and readable storage medium |
CN112740174A (en) * | 2018-10-17 | 2021-04-30 | 北京比特大陆科技有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
US11036480B2 (en) | 2018-06-08 | 2021-06-15 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
WO2021120177A1 (en) * | 2019-12-20 | 2021-06-24 | 华为技术有限公司 | Method and apparatus for compiling neural network model |
CN113676353A (en) * | 2021-08-19 | 2021-11-19 | 杭州华橙软件技术有限公司 | Control method and device for equipment, storage medium and electronic device |
CN114594103A (en) * | 2022-04-12 | 2022-06-07 | 四川大学 | Method and system for automatically detecting surface defects of nuclear industrial equipment and automatically generating reports |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011156741A1 (en) * | 2010-06-10 | 2011-12-15 | Carnegie Mellon University | Synthesis system for pipelined digital circuits with multithreading |
WO2015066331A1 (en) * | 2013-11-04 | 2015-05-07 | Google Inc. | Systems and methods for layered training in machine-learning architectures |
WO2015192812A1 (en) * | 2014-06-20 | 2015-12-23 | Tencent Technology (Shenzhen) Company Limited | Data parallel processing method and apparatus based on multiple graphic procesing units |
CN106951926A (en) * | 2017-03-29 | 2017-07-14 | 山东英特力数据技术有限公司 | The deep learning systems approach and device of a kind of mixed architecture |
- 2017-08-24 CN CN201710735990.7A patent/CN107563512B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011156741A1 (en) * | 2010-06-10 | 2011-12-15 | Carnegie Mellon University | Synthesis system for pipelined digital circuits with multithreading |
WO2015066331A1 (en) * | 2013-11-04 | 2015-05-07 | Google Inc. | Systems and methods for layered training in machine-learning architectures |
WO2015192812A1 (en) * | 2014-06-20 | 2015-12-23 | Tencent Technology (Shenzhen) Company Limited | Data parallel processing method and apparatus based on multiple graphic procesing units |
CN106951926A (en) * | 2017-03-29 | 2017-07-14 | 山东英特力数据技术有限公司 | The deep learning systems approach and device of a kind of mixed architecture |
Non-Patent Citations (1)
Title |
---|
ZHANG Huyin; WANG Sisi; QIAN Long; ZHOU Tianying: "Research on an energy-saving mechanism for path resource management in SDN data center networks" * |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019141014A1 (en) * | 2018-01-16 | 2019-07-25 | 腾讯科技(深圳)有限公司 | Chip-based instruction set processing method and apparatus, and storage medium |
US10877924B2 (en) | 2018-01-16 | 2020-12-29 | Tencent Technology (Shenzhen) Company Limited | Instruction set processing method based on a chip architecture and apparatus, and storage medium |
CN108279881A (en) * | 2018-02-11 | 2018-07-13 | 深圳竹信科技有限公司 | Cross-platform realization framework based on deep learning predicted portions and method |
CN108279881B (en) * | 2018-02-11 | 2021-05-28 | 深圳竹信科技有限公司 | Cross-platform implementation framework and method based on deep learning prediction part |
CN108764487A (en) * | 2018-05-29 | 2018-11-06 | 北京百度网讯科技有限公司 | For generating the method and apparatus of model, the method and apparatus of information for identification |
US11210608B2 (en) | 2018-05-29 | 2021-12-28 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating model, method and apparatus for recognizing information |
US11726754B2 (en) | 2018-06-08 | 2023-08-15 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
US11379199B2 (en) | 2018-06-08 | 2022-07-05 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
CN110647996A (en) * | 2018-06-08 | 2020-01-03 | 上海寒武纪信息科技有限公司 | Execution method and device of universal machine learning model and storage medium |
US11334329B2 (en) | 2018-06-08 | 2022-05-17 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
US11334330B2 (en) | 2018-06-08 | 2022-05-17 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
US11403080B2 (en) | 2018-06-08 | 2022-08-02 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
US11307836B2 (en) | 2018-06-08 | 2022-04-19 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
CN110647996B (en) * | 2018-06-08 | 2021-01-26 | 上海寒武纪信息科技有限公司 | Execution method and device of universal machine learning model and storage medium |
US11036480B2 (en) | 2018-06-08 | 2021-06-15 | Shanghai Cambricon Information Technology Co., Ltd. | General machine learning model, and model file generation and parsing method |
CN112740174A (en) * | 2018-10-17 | 2021-04-30 | 北京比特大陆科技有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN112740174B (en) * | 2018-10-17 | 2024-02-06 | 北京比特大陆科技有限公司 | Data processing method, device, electronic equipment and computer readable storage medium |
WO2020093205A1 (en) * | 2018-11-05 | 2020-05-14 | 深圳市欢太科技有限公司 | Deep learning computation method and related device |
CN112714917A (en) * | 2018-11-05 | 2021-04-27 | 深圳市欢太科技有限公司 | Deep learning calculation method and related equipment |
CN112714917B (en) * | 2018-11-05 | 2024-05-10 | 深圳市欢太科技有限公司 | Deep learning calculation method and related equipment |
CN111291882A (en) * | 2018-12-06 | 2020-06-16 | 北京百度网讯科技有限公司 | Model conversion method, device, equipment and computer storage medium |
CN110009110B (en) * | 2019-03-19 | 2020-11-13 | 福建天晴数码有限公司 | STN network optimization method based on UTXO-like library and storage medium |
CN110009110A (en) * | 2019-03-19 | 2019-07-12 | 福建天晴数码有限公司 | STN network optimized approach, storage medium based on the library class UTXO |
CN110276322A (en) * | 2019-06-26 | 2019-09-24 | 湖北亿咖通科技有限公司 | A kind of image processing method and device of the unused resource of combination vehicle device |
CN110503644A (en) * | 2019-08-27 | 2019-11-26 | 广东工业大学 | Defects detection implementation method, defect inspection method and relevant device based on mobile platform |
CN110503644B (en) * | 2019-08-27 | 2023-07-25 | 广东工业大学 | Defect detection implementation method based on mobile platform, defect detection method and related equipment |
WO2021120177A1 (en) * | 2019-12-20 | 2021-06-24 | 华为技术有限公司 | Method and apparatus for compiling neural network model |
CN111145076A (en) * | 2019-12-27 | 2020-05-12 | 深圳鲲云信息科技有限公司 | Data parallelization processing method, system, equipment and storage medium |
CN112149908A (en) * | 2020-09-28 | 2020-12-29 | 深圳壹账通智能科技有限公司 | Vehicle driving prediction method, system, computer device and readable storage medium |
CN112149908B (en) * | 2020-09-28 | 2023-09-08 | 深圳壹账通智能科技有限公司 | Vehicle driving prediction method, system, computer device, and readable storage medium |
CN113676353A (en) * | 2021-08-19 | 2021-11-19 | 杭州华橙软件技术有限公司 | Control method and device for equipment, storage medium and electronic device |
CN114594103A (en) * | 2022-04-12 | 2022-06-07 | 四川大学 | Method and system for automatically detecting surface defects of nuclear industrial equipment and automatically generating reports |
CN114594103B (en) * | 2022-04-12 | 2023-05-16 | 四川大学 | Automatic detection and report generation method and system for surface defects of nuclear industrial equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107563512B (en) | 2023-10-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107563512A (en) | Data processing method, device and storage medium | |
CN111310936A (en) | Machine learning training construction method, platform, device, equipment and storage medium | |
CN112200297B (en) | Neural network optimization method, device and processor | |
CN106951926A (en) | Deep learning system method and device based on a hybrid architecture | |
CN110309911A (en) | Neural network model verification method, device, computer equipment and storage medium | |
CN110689116B (en) | Neural network pruning method and device, computer equipment and storage medium | |
CN106547627A (en) | Method and system for accelerating Spark MLlib data processing | |
CN111782637A (en) | Model construction method, device and equipment | |
CN111190741A (en) | Scheduling method, device and storage medium based on deep learning node calculation | |
CN111831359B (en) | Weight precision configuration method, device, equipment and storage medium | |
CN112115801B (en) | Dynamic gesture recognition method and device, storage medium and terminal equipment | |
CN114936085A (en) | ETL scheduling method and device based on deep learning algorithm | |
US20230120227A1 (en) | Method and apparatus having a scalable architecture for neural networks | |
CN111831355A (en) | Weight precision configuration method, device, equipment and storage medium | |
CN113010312A (en) | Hyper-parameter tuning method, device and storage medium | |
CN109117475A (en) | Text rewriting method and related device | |
CN114490116B (en) | Data processing method and device, electronic equipment and storage medium | |
CN111368707A (en) | Face detection method, system, device and medium based on feature pyramid and dense block | |
CN114065948A (en) | Method and device for constructing pre-training model, terminal equipment and storage medium | |
CN109002885A (en) | Convolutional neural network pooling unit and pooling calculation method | |
CN112990461B (en) | Method, device, computer equipment and storage medium for constructing neural network model | |
CN106934854A (en) | Optimization method and device based on a creator model | |
CN113626035B (en) | Neural network compiling method facing RISC-V equipment based on TVM | |
CN110442753A (en) | Graph database automatic creation method and device based on OPC UA | |
CN115794137A (en) | GPU-oriented artificial intelligence model deployment method and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||