CN107563512A - Data processing method, device, and storage medium - Google Patents

Data processing method, device, and storage medium Download PDF

Info

Publication number
CN107563512A
CN107563512A (application CN201710735990.7A)
Authority
CN
China
Prior art keywords
computing architecture
learning model
deep learning
current layer
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710735990.7A
Other languages
Chinese (zh)
Other versions
CN107563512B (en)
Inventor
倪辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Tencent Technology Shanghai Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Tencent Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd and Tencent Technology Shanghai Co Ltd
Priority to CN201710735990.7A
Publication of CN107563512A
Application granted
Publication of CN107563512B
Legal status: Active
Anticipated expiration legal status listed

Links

Classifications

    • Y — General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02 — Technologies or applications for mitigation or adaptation against climate change
    • Y02D — Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a data processing method, device, and storage medium. The data processing method includes: obtaining data to be processed; feeding the data to be processed into a deep learning model embedded in a terminal for processing, the deep learning model including a multi-layer computing architecture; during processing, detecting whether the current layer of the computing architecture has finished running; if it is detected that the current layer has finished running, obtaining the storage region corresponding to the current layer and clearing it, the storage region being used to store the run-time data produced by the current layer while running; and, once clearing is complete, taking the next layer as the current layer and returning to the step of detecting whether the current layer has finished running, until a processing result is output. This data processing method facilitates running deep learning models offline on terminal devices, with high flexibility and high computational efficiency.

Description

Data processing method, device, and storage medium
Technical field
The present invention relates to the field of computer technology, and in particular to a data processing method, device, and storage medium.
Background technology
The concept of deep learning originates from research on artificial neural networks; a multilayer perceptron with multiple hidden layers is one kind of deep learning structure.
Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features), in order to discover distributed feature representations of data. Its emergence has again made artificial neural networks one of the most important algorithms in machine learning: a pre-training stage is added before conventional artificial neural network training, in which each layer of the network is first trained once with unsupervised learning, after which the whole network is trained with supervised learning. In other words, the computer simulates the mechanisms of the human brain through a deep neural network in order to learn, judge, and make decisions.
Intelligent applications based on deep learning methods have become the first choice in the field of artificial intelligence because of their high accuracy and excellent performance. However, deploying existing deep learning applications requires large computing stations built from large numbers of CPUs and GPUs; because their compute memory consumption is too large, they generally cannot run offline directly inside a user's mobile phone application.
The content of the invention
It is an object of the present invention to provide a data processing method, device, and storage medium, so as to solve the technical problem that existing deep learning models cannot run offline directly on terminal devices because their compute memory consumption is too large.
To solve the above technical problem, embodiments of the present invention provide the following technical solution:
A data processing method, including:
obtaining data to be processed;
feeding the data to be processed into a deep learning model embedded in a terminal for processing, the deep learning model including a multi-layer computing architecture;
during processing, detecting whether the current layer of the computing architecture has finished running;
if it is detected that the current layer of the computing architecture has finished running, obtaining the storage region corresponding to the current layer and clearing it, the storage region being used to store the run-time data produced by the current layer while running; and, once clearing is complete, taking the next layer as the current layer and returning to the step of detecting whether the current layer has finished running, until a processing result is output.
To solve the above technical problem, embodiments of the present invention further provide the following technical solution:
A data processing device, including:
an acquisition module, for obtaining data to be processed;
a processing module, for feeding the data to be processed into the embedded deep learning model for processing, the deep learning model including a multi-layer computing architecture;
a detection module, for detecting, during processing, whether the current layer of the computing architecture has finished running;
a cleaning module, for obtaining the storage region corresponding to the current layer and clearing it if the detection module detects that the current layer has finished running, the storage region being used to store the run-time data produced by the current layer while running; and, once clearing is complete, taking the next layer as the current layer and triggering the detection module to repeat the step of detecting whether the current layer has finished running, until a processing result is output.
To solve the above technical problem, embodiments of the present invention further provide the following technical solution:
A storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to perform the steps in any of the data processing methods described above.
With the data processing method, device, and storage medium provided by the invention, data to be processed is obtained and fed into the embedded deep learning model, which includes a multi-layer computing architecture. During processing, it is detected whether the current layer of the computing architecture has finished running; if so, the storage region corresponding to the current layer — used to store the run-time data that layer produces while running — is obtained and cleared. Once clearing is complete, the next layer becomes the current layer and the detection step is repeated, until a processing result is output. Automatic clearing of intermediate results is thereby enabled during computation, which largely resolves the problem of excessive compute memory consumption in deep learning models, facilitates running such models offline on terminal devices, and offers high flexibility and high computational efficiency.
Brief description of the drawings
The technical solution of the present invention, together with its other beneficial effects, will become apparent from the following detailed description of embodiments of the present invention taken in conjunction with the accompanying drawings.
Fig. 1 is a schematic flowchart of a data processing method provided in an embodiment of the present invention;
Fig. 2a is another schematic flowchart of a data processing method provided in an embodiment of the present invention;
Fig. 2b is a schematic framework diagram of deploying a CNN model on an electronic device, provided in an embodiment of the present invention;
Fig. 2c is a schematic diagram of a deep learning model with an n-layer network structure, provided in an embodiment of the present invention;
Fig. 3a is a schematic structural diagram of a data processing device provided in an embodiment of the present invention;
Fig. 3b is a schematic structural diagram of an embedding module provided in an embodiment of the present invention;
Fig. 3c is another schematic structural diagram of a data processing device provided in an embodiment of the present invention;
Fig. 3d is a schematic structural diagram of a cleaning module provided in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of electronic equipment provided in an embodiment of the present invention.
Embodiment
The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative work fall within the scope of protection of the present invention.
Embodiments of the present invention provide a data processing method, device, storage medium, and electronic equipment, each of which will be described in detail below.
This embodiment is described from the perspective of the data processing device. The data processing device may be implemented as an independent entity, or may be integrated into electronic equipment such as a terminal or a server; the electronic equipment may include a smartphone, a tablet computer, a personal computer, and the like.
A data processing method includes: obtaining data to be processed; feeding the data to be processed into the embedded deep learning model for processing, the deep learning model including a multi-layer computing architecture; during processing, detecting whether the current layer of the computing architecture has finished running; if it is detected that the current layer has finished running, obtaining the storage region corresponding to the current layer and clearing it, the storage region being used to store the run-time data produced by the current layer while running; and, once clearing is complete, taking the next layer as the current layer and returning to the step of detecting whether the current layer has finished running, until a processing result is output.
As shown in Fig. 1, the specific flow of the data processing method can be as follows:
S101: obtain the data to be processed.
In this embodiment, the data to be processed may include multimedia data such as images and voice; it may be data collected at the moment or data downloaded from the Internet.
S102: feed the data to be processed into the embedded deep learning model for processing, the deep learning model including a multi-layer computing architecture.
In this embodiment, the deep learning model may include neural network models such as a CNN (Convolutional Neural Network), a DBN (Deep Belief Network), an RNN (Recurrent Neural Network), a recursive neural tensor network, and an autoencoder. Typically, the deep learning model includes a multi-layer computing architecture whose layers form an input layer, an output layer, and at least one hidden intermediate layer; these layers combine into the whole framework of the deep learning model, so as to realize functions such as image recognition, face detection, and image segmentation. The data to be processed may be fed into the deep learning model through a designated input interface, which may be a user-defined three-dimensional data structure whose ordering rule is c/h/w. Generally, c, h, and w differ for different kinds of input. For example, when the data to be processed is an RGB color image, c = number of image channels (3), h = image height, w = image width, and the content fed through the designated input interface is the image's pixel values (0–255); when the data to be processed is voice, c = h = 1 and w = number of samples, the content fed through the designated input interface is the sample values (−1 to 1), where the number of samples is determined by the sampling frequency and the sampling duration.
It should be pointed out that existing deep learning models are all built from components such as caffe (Convolutional Architecture for Fast Feature Embedding, a convolutional neural network framework), torch, and tensorflow, and using these components requires third-party libraries. Grafting an existing deep learning model directly onto electronic equipment would therefore inevitably involve cross-platform library use and make the equipment run slowly. For this reason, the deep learning model embedded in this embodiment is built with a specific component, for example the ncnn component. The ncnn component is a high-performance neural network forward-computation framework tailored to the hardware characteristics of electronic equipment — a CNN framework meant to run on the equipment itself — and it does not depend on any third-party library at run time, which solves the cross-platform library problems that arise when existing CNN models run on electronic equipment and greatly increases the running speed of the equipment.
Of course, the specific component should be built in advance. Specifically, cmake (cross-platform make, a cross-platform build tool) may first be called to generate a project file for the development platform of the corresponding operating system — electronic equipment running different operating systems generally corresponds to different project files. The project file is then loaded through development tools such as an SDK (Software Development Kit) or an IDE (Integrated Development Environment), and the static or dynamic library of the specific component is compiled out as required. In this way, when the specific component is later used to build a deep learning model and to execute the model that has been built, the static or dynamic library can be used directly as the calling library, without depending on any third-party library. In addition, building the deep learning model in the electronic equipment from the compiled specific component should also be done in advance; that is, before step S102, the data processing method may further include:
1-1: obtain the parameter information of each layer of the computing architecture from the trained deep learning model.
For example, step 1-1 may specifically include:
converting the trained deep learning model using a preset conversion strategy to obtain converted files;
calling a preset loading interface to load the converted files in the electronic equipment;
obtaining the parameter information of each layer of the computing architecture from the loaded content.
In this embodiment, the trained deep learning model is obtained by training an existing deep learning model with samples; it is usually expressed as protobuf (a data interchange format from Google) serialized data or specific binary data. The preset conversion strategy is mainly used to convert the existing deep learning model into files that meet the format required by the specific component; it may be a hand-written model conversion program that includes two operations: parsing the deep learning model, and writing the parsed data into files. The converted files may include a model file and a network file, where the model file is a file that meets the model structure required by the specific component, and the network file is a file that meets the network structure required by the specific component.
The preset loading interface is mainly used to load files; it can be called directly from the compiled static or dynamic library, and files of different formats call different preset loading interfaces. The preset loading interface may adopt either of two loading modes — loading a file path or loading file content — with the choice depending on the memory size of the electronic equipment: when memory is sufficient, the file-path mode may be selected and the model resource accessed in the normal way; when memory is insufficient, the file-content mode may be selected and the model resource accessed as in-memory content. The parameter information may include information such as the types, number, and weights of the parameters that participate in the computation; this information is contained in the converted files and can be read smoothly only after the files are loaded in the electronic equipment.
1-2: generate a calculating path according to the parameter information of each layer of the computing architecture, the calculating path including the calculating functions of the computing architectures that participate in the operation and the running order between the computing architectures.
In this embodiment, the calculating path may be obtained by sorting the calculating functions of all or some of the layers of the computing architecture according to the layer-to-layer transfer order. It may cover all the computing architectures or only some of them, mainly depending on the task requirements: a task that needs all the layer functions of the deep learning model requires a calculating path generated for all the computing architectures, while a task that needs only some of the layer functions of the deep learning model can have a calculating path generated for only some of the computing architectures.
For example, step 1-2 may specifically include:
selecting the computing architectures that participate in the operation from the layers using a preset screening parameter;
generating the calculating path according to the parameter information of the selected computing architectures.
In this embodiment, the preset screening parameter is mainly used to filter out the layers that do not need to be used. It can be set when the static or dynamic library is compiled, so that it can be called directly from the library when the calculating path is generated, which gives high flexibility. Of course, to accommodate all task requirements, the preset screening parameter may be defined such that some value (for example 0) means no filtering is needed.
1-3: store the calculating path in the electronic equipment, so as to embed the deep learning model in the electronic equipment.
At this point, step S102 may specifically include:
processing the data to be processed according to the calculating path.
In this embodiment, since the calculating path is generated from several layers of the deep learning model's computing architecture, all or some of the layer functions of the deep learning model can be realized when the data to be processed is processed using the calculating path.
In addition, the process of embedding the deep learning model in the electronic equipment involves not only setting up the computation flow but also allocating the threads that execute the calculating operations. That is, the step of "embedding the deep learning model in the electronic equipment" may further include:
detecting the number of processors of the electronic equipment;
determining a corresponding number of threads according to the number of processors, and storing the number of threads in the electronic equipment.
In this embodiment, the number of threads needed for computation can be controlled through a designated thread-control interface; the number of threads may be set equal to the number of processors, or follow some other set rule. The designated thread-control interface can be compiled into the dynamic or static library in advance for direct calling. At this point, the step of "processing the data to be processed according to the calculating path" may further include:
calling idle threads equal in number to the number of threads;
processing the data to be processed according to the calculating path using the called idle threads.
In this embodiment, the number of threads may be made equal to the number of CPUs. For multi-CPU electronic equipment, multiple calculating tasks can be assigned through different threads to different CPU cores at the same time for parallel computation, which greatly increases processing speed.
In addition, embedding the deep learning model in the electronic equipment also involves allocating the computation memory. That is, the step of "embedding the deep learning model in the electronic equipment" may further include:
determining the space size of the corresponding storage region according to the parameter information of each layer of the computing architecture.
In this embodiment, the parameter information may also include information such as the number of output parameters (i.e., the parameter types of the operation result) and the weights of each output parameter. Generally, the required storage space can be determined and allocated according to a layer's output parameters, calculation parameters, and so on, so as to ensure that the layer's computation can proceed normally. At this point, the step of "processing the data to be processed according to the calculating path" may further include: using the called idle threads, performing the corresponding computations on the data to be processed in the corresponding storage regions according to the calculating path.
S103: during processing, detect whether the current layer of the computing architecture has finished running; if so, perform the following step S104; if not, re-execute step S103.
In this embodiment, "finished running" means that all the calculating functions corresponding to the current layer in the calculating path have finished executing.
S104: obtain the storage region corresponding to the current layer of the computing architecture, and clear the storage region, the storage region being used to store the run-time data produced by the current layer while running; once clearing is complete, take the next layer as the current layer and return to step S103, until a processing result is output.
In this embodiment, the final output result can be exported through a designated output interface stored in advance in the static or dynamic library. The designated output interface may also be a three-dimensional data structure c/h/w, and different processing tasks output different content. For example, for an image recognition task, c = image category number and h = w = the confidence dimension of that category (usually 1); for a voice recognition task, c = voice category number and h = w = the confidence dimension of that category (usually 1).
Considering that in most deep learning models the calculating data of the intermediate layers is not needed — only the calculating data of the last layer is needed as the output result — an automatic intermediate-data recovery mechanism may be used to prevent intermediate-layer data from occupying too much memory: whenever any intermediate layer of the computing architecture finishes running, the data produced by that layer's computation can be cleared automatically, reducing memory occupation as much as possible. It should be pointed out that it is not necessary to clear all the data in a layer's storage region every time an intermediate layer finishes running. For example, for a multi-task neural network model, when the operation result of an individual task has been computed, that result may be retained temporarily until all tasks are completed and only then deleted, avoiding repeated computation of the same result. For this case, reference counting can be used to decide which operation results need to be deleted: each operation result is given a pending-reference count, and the result needs to be cleared only when the count is 0 (or some other designated value). That is, the step of "clearing the storage region" may specifically include:
clearing the run-time data other than the operation result;
obtaining the current pending-reference count of the operation result;
clearing the operation result when the pending-reference count equals a preset threshold; and, when the pending-reference count is greater than the preset threshold, returning to the operation of obtaining the storage region corresponding to the current layer of the computing architecture.
In this embodiment, the preset threshold may be 0 or another manually set value. The clearing operation can be performed through a designated clearing interface, which is stored in advance in the compiled static or dynamic library and called directly from the library when used. The pending-reference count mainly refers to the number of times the next layer of the computing architecture still needs to use the result. Since the next layer can only use the operation result of the previous layer, other run-time data — for example intermediate calculating data — can simply be cleared directly, while an operation result whose pending-reference count has not reached the preset threshold still has value to the next layer and need not be cleared.
Generally, the whole clearing operation can be considered complete only when all the run-time data has been cleared. Considering that the clearing operation and the computation of the next layer can execute in parallel, the pending-reference count of an operation result is continuously updated during clearing. That is, when it is detected that the current layer of the computing architecture has finished running, the data processing method may further include:
setting the pending-reference count of the operation result according to the next layer of the computing architecture;
after the pending-reference count is set successfully, obtaining the reference information produced as the next layer of the computing architecture uses the operation result while running;
updating the pending-reference count according to the reference information.
In this embodiment, the reference information mainly refers to the number of references. Specifically, the initial pending-reference count of the current layer's operation result can be preset according to the number of operation results of the next layer of the computing architecture; generally, the initial count can be made equal to that number. For example, for a dual-task deep learning model with network structure A->B->C1+C2, the operation results of the last layer are C1 and C2 and the operation result of the intermediate layer is B, so the initial pending-reference count of B can be set to 2 and that of A to 1. During actual computation, the pending-reference count of each operation result can then be updated in real time according to how the results are actually referenced. The update method can be: each time the next layer uses the current layer's operation result, the corresponding pending-reference count is decremented by one. For example, when C1 is computed, B's pending-reference count is updated to 1; when C2 is computed, B's pending-reference count is updated to 0, at which point B can be cleared. Compared with the prior-art approach of directly copying an operation result into the storage region of the next layer for use, this embodiment replaces copying with reference counting: no data needs to be copied, the method is simple, and memory occupation is reduced.
As can be seen from the above, in the data processing method provided by this embodiment, data to be processed is obtained and fed into the embedded deep learning model, which includes a multi-layer computing architecture. During processing, it is detected whether the current layer of the computing architecture has finished running; if so, the storage region corresponding to the current layer — used to store the run-time data that layer produces while running — is obtained and cleared. Once clearing is complete, the next layer becomes the current layer and the detection step is repeated, until a processing result is output. Automatic clearing of intermediate results is thereby enabled during computation, which largely resolves the problem of excessive compute memory consumption in deep learning models, facilitates running such models offline on terminal devices, and offers high flexibility and high computational efficiency.
The method described in the above embodiment is described in further detail below by way of example.

In this embodiment, the description takes as an example the case where the data processing apparatus is integrated in an electronic device.

As shown in Figs. 2a and 2b, a data processing method may proceed as follows:

S201: The electronic device converts the trained deep learning model using a preset conversion strategy to obtain a converted file.
For example, the deep learning model can be a CNN model, and the preset conversion strategy mainly refers to a model conversion tool, which can include two operations: parsing the deep learning model and writing the parsed data into files. cmake can be called in advance to generate the project files for the development platform of the corresponding operating system (such as Android); the project files are then loaded by development tools such as an SDK and an IDE, and the model conversion tool is compiled according to actual requirements. It should be noted that, while compiling the model conversion tool, the static or dynamic library of a specific component (such as the ncnn component), namely the ncnn library, can also be compiled. In this way, when the deep learning model is subsequently built with the ncnn component and executed, the ncnn library can be used directly as the calling library, without depending on any third-party library.
The ncnn library can involve multiple cmake parameters, such as -DNCNN_STDIO=OFF and/or -DNCNN_STRING=OFF for disabling the file-loading and string-output functions, -DWITH_LAYER_xxx=OFF for not compiling the corresponding built-in layer, and -DNCNN_OPENMP=ON for enabling multi-core CPU parallel processing. The ncnn library can also include multiple interfaces, such as the Net::load_param interface for loading the ncnn network file, the Net::load_model interface for loading the ncnn model file, the Net::create_extractor interface for creating the feedforward network calculator, the Extractor interface for invoking the feedforward network calculator, the Net::set_light_mode interface for enabling the automatic clearing of intermediate results, the Net::set_num_threads interface for controlling the number of threads used, the Extractor::input interface for inputting the data to be processed, and the Extractor::extract interface for outputting the operation result. The data handled by the Extractor::input and Extractor::extract interfaces are three-dimensional structures whose ordering can be c/h/w; generally, different processing objects have different values of c, h, and w.
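The call sequence implied by the interface list above can be illustrated with a small self-contained mock. The `MockNet` and `MockExtractor` classes below mimic only the call order described in this document; they are not the real ncnn API, and the file names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Self-contained mock of the interface sequence described above.
// The bodies are placeholders, NOT the real ncnn implementation.
struct Mat { int c = 0, h = 0, w = 0; };  // c/h/w three-dimensional layout

struct MockNet {
    bool param_loaded = false, model_loaded = false, light_mode = false;
    int num_threads = 1;
    int load_param(const std::string&) { param_loaded = true; return 0; }
    int load_model(const std::string&) { model_loaded = true; return 0; }
    void set_light_mode(bool on) { light_mode = on; }   // auto-clear intermediates
    void set_num_threads(int n) { num_threads = n; }
};

struct MockExtractor {
    Mat in, out;
    int input(const Mat& m) { in = m; return 0; }
    // Stand-in "forward pass": emits a 1x1 confidence for one class.
    int extract(Mat& m) { m.c = 1; m.h = 1; m.w = 1; out = m; return 0; }
};

// The order described in the patent: load param/model, configure,
// feed a c/h/w input, extract the result.
Mat run_pipeline() {
    MockNet net;
    net.load_param("model.param");   // hypothetical file names
    net.load_model("model.bin");
    net.set_light_mode(true);        // enable intermediate-result clearing
    net.set_num_threads(8);          // e.g. an eight-core device
    MockExtractor ex;
    Mat image; image.c = 3; image.h = 224; image.w = 224;  // RGB input
    ex.input(image);
    Mat result;
    ex.extract(result);
    return result;
}
```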
S202: The electronic device calls a preset loading interface to load the converted file in the electronic device, and obtains the parameter information of each layer's computing architecture from the loaded content.

For example, the converted file can include a model file and a network file: the Net::load_param interface of the ncnn library can be called to load the ncnn network file, and the Net::load_model interface to load the ncnn model file. The parameter passed to the interface can be a file path or the file content. The parameter information can include information such as the types, quantities, and weights of the calculation parameters participating in the computation and of the output parameters.

S203: The electronic device filters out the computing architectures that participate in operations from each layer's computing architecture using a preset screening parameter, and generates a calculation path according to the parameter information of the selected computing architectures, the calculation path including the calculation functions of the participating computing architectures and the operation order between them.
For example, screening can be performed with the -DWITH_LAYER_xxx=OFF parameter, where xxx denotes the layer to be filtered out, namely a layer whose computing architecture does not participate in the operations; when xxx is 0, all layers' computing architectures are considered to participate. The Net::create_extractor interface can then be called to generate the calculation path from the types, quantities, and weights of the calculation parameters participating in each layer's computing architecture, thereby creating a feedforward network calculator in the electronic device.
S204: The electronic device obtains its own number of processors and determines the corresponding number of threads accordingly; at the same time, it determines the space size of the corresponding storage region according to the parameter information of each layer's computing architecture.
For example, the Net::set_num_threads interface can be called to set the required number of threads; for an eight-core electronic device, the number of threads participating in computation can be set to 8. The required storage space can be determined from each layer's output parameters, calculation parameters, and so on; generally, the more output parameters and calculation parameters there are, the larger the required storage space. Referring to Fig. 2c, which shows a deep learning model with an n-layer network structure: the second layer needs the five operation results of the first layer as its calculation parameters and produces four operation results as its output parameters, while the third layer needs the four operation results of the second layer as its calculation parameters and produces three operation results as its output parameters; therefore, in general, the storage space required by the second layer is larger than that required by the third layer. Of course, the influence of other factors, such as parameter types, on the storage space must also be considered. In addition, since the CPUs of existing electronic devices are mostly of the ARM architecture (32-bit reduced instruction set), the memory of each channel can further be aligned to 16 bytes to speed up reading.
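The 16-byte channel alignment and the c/h/w-based sizing mentioned above can be sketched as follows. The helper names `align_to_16` and `channel_aligned_size` are illustrative, not from the patent.

```cpp
#include <cassert>
#include <cstddef>

// Round a byte count up to the next multiple of 16, as described above for
// per-channel memory alignment on ARM CPUs.
size_t align_to_16(size_t bytes) {
    return (bytes + 15) & ~static_cast<size_t>(15);
}

// Illustrative estimate of a layer's storage region: each of `c` channels
// holds h*w elements of `elem_size` bytes, padded to a 16-byte boundary.
size_t channel_aligned_size(int c, int h, int w, size_t elem_size) {
    return static_cast<size_t>(c) *
           align_to_16(static_cast<size_t>(h) * w * elem_size);
}
```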
S205: The electronic device stores the calculation path, the number of threads, and the space size of the storage region, so as to embed the deep learning model in the electronic device.

For example, the calculation path, the number of threads, and the space size of the storage region can be used as the necessary data for creating a feedforward network calculator in the electronic device, and the calling relation between the Extractor interface and the feedforward network calculator can be established, thereby embedding the deep learning model in the electronic device by means of the ncnn component. Since the ncnn component fully takes the hardware performance of electronic devices into account in its design, parsing and reassembling the deep learning model with the ncnn component makes the model better suited to the electronic device.

It should be noted that steps S203 and S204 have no fixed execution order; they can be performed simultaneously or one after the other.

S206: The electronic device obtains the data to be processed.

For example, for a CNN model, the data to be processed can be an image, which can be a black-and-white or color image.

S207: The electronic device calls idle threads equal in number to the determined number of threads and, using the called idle threads, computes the data to be processed in the storage region of the corresponding space size according to the calculation path.
For example, the data to be processed can first be input into the deep learning model through the Extractor::input interface. For instance, when the data to be processed is an RGB color image (or a black-and-white image), the input has c = the number of image channels, h = the image height, and w = the image width; when the data to be processed is voice, the input has c = h = 1 and w = the number of samples. Afterwards, the feedforward network calculator can be invoked through the Extractor interface to compute the data to be processed.
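The c/h/w conventions above can be summarized in a short sketch. The `Shape` struct and helper functions are assumptions introduced for illustration; only the shape rules themselves (image: c = channels; voice: c = h = 1, w = number of samples) come from the text.

```cpp
#include <cassert>

// c/h/w input shapes per the conventions described above (illustrative).
struct Shape { int c, h, w; };

Shape image_shape(int channels, int height, int width) {
    return {channels, height, width};   // image: c = channels, h/w = size
}
Shape voice_shape(int num_samples) {
    return {1, 1, num_samples};         // voice: c = h = 1, w = samples
}

// Offset of element (ci, hi, wi) in a flat buffer stored in c/h/w order.
int chw_offset(const Shape& s, int ci, int hi, int wi) {
    return (ci * s.h + hi) * s.w + wi;
}
```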
S208: During computation, the electronic device detects whether the current layer's computing architecture has finished running; if so, the following steps S209–S210 and S211–S212 are performed in parallel; if not, step S208 is re-executed.

S209: The electronic device obtains the storage region corresponding to the current layer's computing architecture, and clears the running data in that storage region other than the operation results.

For example, if the current layer's computing architecture has finished computing, the Net::set_light_mode interface can be called to clear the running data. During clearing, running data other than operation results can be cleared directly; for operation results, whether clearing is needed must be judged in combination with the pending reference count.

S210: The electronic device obtains the current pending reference count of the operation result and detects whether it is equal to a preset threshold; if so, the following step S211 is performed; if not, execution returns to step S209.

S211: The electronic device clears the operation result; when clearing is complete, the next layer's computing architecture is taken as the current layer's computing architecture, and execution returns to step S209, until the processing result is output.
For example, the preset threshold can be 0; only when the pending reference count is 0 is the operation result considered to need clearing. Generally, only after all the running data in the storage region of the current layer's computing architecture has been cleared can the clearing of the next layer's storage region begin. The Extractor::extract interface can be called to output the processing result, where, when the data to be processed is an image, the output has c = the image class index and h = w = the confidence dimension of that class (usually 1); when the data to be processed is voice, c = the voice class index and h = w = the confidence dimension of that class (usually 1).

S212: The electronic device sets the pending reference count of the current layer's operation result according to the next layer's computing architecture.

For example, for a deep learning model with the network structure A->B->C1+C2, at initial setting the pending reference count of A can be 1, that of B can be 2, and those of C1 and C2 can be 3.

S213: After the pending reference count is set successfully, the electronic device obtains the reference information of the next layer's computing architecture for the operation result while it runs, and updates the pending reference count according to the reference information.

For example, the reference information mainly refers to the number of references. For instance, when B is computed, one result of A is referenced during the computation, so A's pending reference count is updated from 1 to 0; likewise, when C1 is computed, B's pending reference count can be updated from 2 to 1, and when C2 is computed, B's pending reference count can be updated from 1 to 0. In this way, operation results that have not yet been fully used are prevented from being cleared, ensuring that each operation result can be fully referenced by the next layer's computing architecture.
As can be seen from the above, in the data processing method provided by this embodiment, the electronic device can convert the trained deep learning model using a preset conversion strategy to obtain a converted file; call a preset loading interface to load the converted file in the electronic device and obtain the parameter information of each layer's computing architecture from the loaded content; filter out the computing architectures participating in operations from each layer's computing architecture using a preset screening parameter and generate a calculation path according to the parameter information of the selected computing architectures, the calculation path including the calculation functions of the participating computing architectures and the operation order between them; at the same time, obtain its own number of processors, determine the corresponding number of threads, and determine the space size of the corresponding storage region according to the parameter information of each layer's computing architecture; and then store the calculation path, the number of threads, and the space size of the storage region, so as to embed the deep learning model in the electronic device. In this way, the ncnn component can be used to embed the deep learning model in the electronic device; the component is implemented entirely in pure C++ without third-party libraries, and its library memory footprint is small. In subsequent actual use, the electronic device can obtain the data to be processed, call idle threads equal in number to the number of threads, and use the called idle threads to compute the data to be processed in the storage region of the corresponding space size according to the calculation path. During computation, it detects whether the current layer's computing architecture has finished running;
if so, it obtains the storage region corresponding to the current layer's computing architecture and clears the running data in that region other than the operation results, while obtaining the current pending reference count of the operation result and detecting whether it is equal to a preset threshold; if it is, a preset cleaning interface is called to clear the operation result, and when clearing is complete, the next layer's computing architecture is taken as the current layer's computing architecture and execution returns to the step of obtaining the storage region corresponding to the current layer's computing architecture, until the processing result is output. At the same time, the electronic device can set the pending reference count of the current layer's operation result according to the next layer's computing architecture; after the count is set successfully, it can obtain the reference information of the next layer's computing architecture for the operation result while it runs and update the pending reference count accordingly. The automatic clearing of intermediate results during the computation of the deep learning model is thus enabled, which largely solves the problem of the deep learning model consuming too much computation memory, facilitates the offline implementation of deep learning models in terminals, offers high flexibility, and, by adopting multi-threaded parallel computation, achieves good processing concurrency and high computational efficiency.
According to the method described in the above embodiments, this embodiment will be further described from the perspective of the data processing apparatus, which can be integrated in an electronic device.

Referring to Fig. 3a, which describes in detail the data processing apparatus provided by an embodiment of the present invention: it can include an acquisition module 10, a processing module 20, a detection module 30, and a cleaning module 40, wherein:
(1) Acquisition module 10

The acquisition module 10 is configured to obtain the data to be processed.

In this embodiment, the data to be processed can include multimedia data such as images and voice; it can be data captured on the spot or data downloaded from the network.

(2) Processing module 20

The processing module 20 is configured to input the data to be processed into the embedded deep learning model for processing, the deep learning model including a multi-layer computing architecture.
In this embodiment, the deep learning model can include neural network models such as a CNN, a DBN, an RNN, a recursive neural tensor network, and an autoencoder. Generally, the deep learning model includes a multi-layer computing architecture, and the computing architectures can constitute an input layer, an output layer, and at least one hidden intermediate layer; these layers combine to form the overall framework of the deep learning model, to realize functions such as image recognition, face detection, and image segmentation. The data to be processed can be input into the deep learning model through a designated input interface, which can accept a user-defined three-dimensional data structure whose ordering can be c/h/w; generally, different processing objects have different values of c, h, and w. For example, when the data to be processed is an RGB color image, c = the number of image channels, 3, h = the image height, and w = the image width, and the content input through the designated input interface is the image pixel values (0 to 255); when the data to be processed is voice, c = h = 1 and w = the number of samples, the content input through the designated input interface is the sampled values (−1 to 1), and the number of samples is determined by the sampling frequency and the sampling duration.
It should be pointed out that existing deep learning models are all built with components such as caffe, torch, and tensorflow, and the use of these components depends on third-party libraries. If an existing deep learning model were grafted directly onto an electronic device to run, the cross-platform use of libraries would inevitably be involved, and the running speed of the electronic device would be low. Therefore, the deep learning model embedded in this embodiment is built with a specific component, such as the ncnn component, which is a high-performance neural network forward computation framework formulated for the hardware characteristics of electronic devices, specifically for CNN models running on electronic devices. It needs no third-party library in use, which solves the cross-platform library problem that existing CNN models encounter on electronic devices and greatly improves the running speed of the electronic device.
Of course, the specific component should be prepared in advance. Specifically, cmake can first be called to generate the project files of the development platform of the corresponding operating system; generally, electronic devices with different operating systems correspond to different project files. Afterwards, the project files are loaded by development tools such as an SDK and an IDE, and the static or dynamic library of the specific component is compiled according to actual requirements. In this way, when the deep learning model is subsequently built with the specific component and executed, the static or dynamic library can be used directly as the calling library, without depending on any third-party library. In addition, building the deep learning model in the electronic device from the compiled specific component should also be carried out in advance. Referring to Fig. 3b, the data processing apparatus can also include an embedding module 50, which can specifically include a second acquisition submodule 51, a generation submodule 52, and a storage submodule 53, wherein:
The second acquisition submodule 51 is configured to obtain the parameter information of each layer's computing architecture from the trained deep learning model before the processing module 20 inputs the data to be processed into the embedded deep learning model for processing.

For example, the second acquisition submodule 51 can specifically be configured to:

convert the trained deep learning model using a preset conversion strategy to obtain a converted file;

call a preset loading interface to load the converted file in the electronic device;

obtain the parameter information of each layer's computing architecture from the loaded content.
In this embodiment, the trained deep learning model is obtained by training an existing deep learning model with samples, and it is usually expressed as protobuf (a data interchange format from Google) serialized data or specific binary data. The preset conversion strategy is mainly used to convert an existing deep learning model into files meeting the format required by the designated component; it can be a manually written model conversion program, which can include two operations: parsing the deep learning model and writing the parsed data into files. The converted file can include a model file and a network file, where the model file is a file meeting the model structure required by the designated component and the network file is a file meeting the network structure required by the designated component.

The preset loading interface is mainly used to load files and can be called directly from the compiled static or dynamic library; files of different formats call different preset loading interfaces. The preset loading interface can take either of two loading modes, loading a file path or loading file content; the choice depends on the memory size of the electronic device. When memory is sufficient, the file-path mode can be selected to access the model resources in the normal way; when memory is insufficient, the file-content mode can be selected to access the model resources in memory form. The parameter information can include information such as the types, quantities, and weights of the parameters participating in the computation; this information is contained in the converted file and can only be read smoothly after the file has been loaded in the electronic device.
The generation submodule 52 is configured to generate a calculation path according to the parameter information of each layer's computing architecture, the calculation path including the calculation functions of the computing architectures participating in operations and the operation order between the computing architectures.

In this embodiment, the calculation path can be obtained by sorting the calculation functions of all or some of the layers' computing architectures in layer-by-layer order. It can cover all computing architectures or only some of them, depending mainly on the task requirements: for a task that needs all the layer functions of the deep learning model, the generation submodule 52 needs to generate the calculation path for all computing architectures; for a task that needs only some of the layer functions, the generation submodule 52 can generate the calculation path for only some of the computing architectures.
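The notion of a calculation path as an ordered list of per-layer calculation functions can be sketched as follows. The `LayerFn` type and the toy scaling/bias layers are assumptions introduced for illustration, not the patent's actual data structures.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A "calculation path": an ordered list of per-layer calculation functions
// executed in layer order (the operation order between computing architectures).
using LayerFn = std::function<std::vector<float>(const std::vector<float>&)>;

std::vector<float> run_path(const std::vector<LayerFn>& path,
                            std::vector<float> data) {
    for (const LayerFn& layer : path)
        data = layer(data);   // each layer consumes the previous layer's result
    return data;
}

// Example path: a scaling layer followed by a bias layer.
std::vector<float> demo() {
    std::vector<LayerFn> path;
    path.push_back([](const std::vector<float>& in) {
        std::vector<float> out;
        for (float v : in) out.push_back(v * 2.0f);
        return out;
    });
    path.push_back([](const std::vector<float>& in) {
        std::vector<float> out;
        for (float v : in) out.push_back(v + 1.0f);
        return out;
    });
    return run_path(path, {1.0f, 2.0f});
}
```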
For example, the generation submodule 52 can specifically be configured to:

filter out the computing architectures participating in operations from each layer's computing architecture using a preset screening parameter;

generate the calculation path according to the parameter information of the selected computing architectures.

In this embodiment, the preset screening parameter is mainly used to filter out the layers that do not need to be used. It can be set when the static or dynamic library is compiled, so that it can be called directly from the library when the calculation path is generated, giving high flexibility. Of course, to accommodate all task requirements, the preset screening parameter can be defined such that a certain value (such as 0) means no filtering is needed.

The storage submodule 53 is configured to store the calculation path in the electronic device, so as to embed the deep learning model in the electronic device.

At this point, the processing module 20 can specifically be configured to process the data to be processed according to the calculation path.

In this embodiment, since the calculation path is generated from some or all layers of the deep learning model's computing architecture, when the processing module 20 processes the data to be processed using the calculation path, all or some of the layer functions of the deep learning model can be realized.
In addition, the process of embedding the deep learning model in the electronic device involves not only the setup of the calculation process but also the allocation of the threads that execute the calculation operations. Referring to Fig. 3c, the embedding module 50 can also include a determination submodule 54, configured to:

detect the number of processors of the electronic device; determine the corresponding number of threads according to the number of processors; and store the number of threads in the electronic device.

In this embodiment, the number of threads needed by the calculation process can be controlled through a designated thread-control interface; the number of threads can be set equal to the number of processors or follow other setting rules. The designated thread-control interface can be compiled into the dynamic or static library in advance so that it can be called directly. In this case, the processing module 20 can further be configured to: call idle threads equal in number to the number of threads; and process the data to be processed according to the calculation path using the called idle threads.

In this embodiment, the determination submodule 54 can set the number of threads equal to the number of CPUs. For a multi-CPU electronic device, multiple calculation tasks can thus be assigned to different threads, and the electronic device can compute them in parallel on different CPU cores at the same time, greatly improving the processing speed.
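The thread allocation described above — thread count taken from the core count, work sliced across threads — can be sketched as follows. This is a generic `std::thread` sketch of the idea, not the patent's scheduler; `parallel_sum` is a stand-in workload introduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Pick the thread count from the hardware core count, give each thread a
// slice of the work, join, and combine the partial results.
long parallel_sum(const std::vector<long>& data) {
    if (data.empty()) return 0;
    unsigned n = std::thread::hardware_concurrency();  // ~ processor count
    if (n == 0) n = 1;                                 // value may be unknown
    if (n > data.size()) n = static_cast<unsigned>(data.size());
    std::vector<long> partial(n, 0);
    std::vector<std::thread> pool;
    const size_t chunk = (data.size() + n - 1) / n;
    for (unsigned t = 0; t < n; ++t) {
        pool.emplace_back([&, t] {
            const size_t begin = t * chunk;
            const size_t end = std::min(begin + chunk, data.size());
            for (size_t i = begin; i < end; ++i) partial[t] += data[i];
        });
    }
    for (std::thread& th : pool) th.join();            // wait for all slices
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```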
In addition, embedding the deep learning model in the electronic device also involves the allocation of calculation memory; that is, the determination submodule 54 can also be configured to:

determine the space size of the corresponding storage region according to the parameter information of each layer's computing architecture.

In this embodiment, the parameter information can also include information such as the quantity of output parameters (namely the parameter types of the operation results) and the weights of each output parameter. Generally, the determination submodule 54 can determine the required storage space from the layer's output parameters, calculation parameters, and so on, to ensure that the layer's calculation can proceed normally. In this case, the processing module 20 can be configured to: use the called idle threads to compute the data to be processed accordingly in the corresponding storage region according to the calculation path.
(3) Detection module 30

The detection module 30 is configured to detect, during processing, whether the current layer's computing architecture has finished running.

In this embodiment, "finished running" means that all the calculation functions in the calculation path corresponding to the current layer's computing architecture have been executed.

(4) Cleaning module 40

The cleaning module 40 is configured to: if the detection module detects that the current layer's computing architecture has finished running, obtain the storage region corresponding to the current layer's computing architecture and clear that storage region, the storage region being used to store the running data generated while the current layer's computing architecture runs; and, when clearing is complete, take the next layer's computing architecture as the current layer's computing architecture and trigger the detection module to again detect whether the current layer's computing architecture has finished running, until the processing result is output.
In this embodiment, the final output result can be exported through a designated output interface stored in advance in the static or dynamic library. This designated output interface can also use a three-dimensional c/h/w data structure, and different processing tasks output different content. For example, for an image recognition task, c = the image class index and h = w = the confidence dimension of that class (usually 1); for a voice recognition task, c = the voice class index and h = w = the confidence dimension of that class (usually 1).
Considering that in most deep learning models the calculation data of the intermediate layers is not needed — only the calculation data of the last layer is needed as the output result — an automatic intermediate-data reclamation mechanism can be adopted to prevent the intermediate layers' calculation data from occupying too much memory: whenever any intermediate layer's computing architecture finishes running, the data generated by that layer's calculation can be cleared automatically, reducing memory occupation as far as possible. It should be pointed out that not every intermediate layer's storage region needs to be fully cleared after that layer finishes running. For example, for a multi-task neural network model, when the operation result of a single task has been computed, that result can be retained temporarily until all tasks have completed and only then deleted, avoiding repeated computation of the same operation result. For such cases, reference counting can be used to determine which operation results need to be deleted; a pending reference count can be set for each operation result, and the operation result needs to be cleared only when the count is 0 or another designated value. Referring to Fig. 3d, the running data can include operation results, and the cleaning module 40 can specifically include a first cleaning submodule 41, a first acquisition submodule 42, and a second cleaning submodule 43, wherein:
The first cleaning submodule 41 is configured to clear the running data other than the operation results.

The first acquisition submodule 42 is configured to obtain the current pending reference count of the operation result.

The second cleaning submodule 43 is configured to clear the operation result when the pending reference count is equal to a preset threshold, and to return to the operation of obtaining the storage region corresponding to the current layer's computing architecture when the pending reference count is greater than the preset threshold.

In this embodiment, the preset threshold can be manually set to 0 or another designated value. The cleaning module 40 can perform the clearing operation through a designated cleaning interface, which is stored in advance in the compiled static or dynamic library and called directly from the library in use. The pending reference count mainly refers to the number of times the next layer's computing architecture still needs to use the result. Since the next layer's computing architecture uses only the operation results of the previous layer's computing architecture, other running data, such as intermediate process data, can be cleared directly by the first cleaning submodule 41; an operation result whose pending reference count has not reached the preset threshold still has use value to the next layer's computing architecture, so the second cleaning submodule 43 does not clear it.
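The two-tier clearing rule described above can be condensed into a single decision function. This is an illustrative sketch; the name `should_clear` and its parameters are assumptions, not the patent's interface.

```cpp
#include <cassert>

// Two-tier clearing rule: running data other than operation results is always
// cleared; an operation result is cleared only once its pending reference
// count has reached the preset threshold (e.g. 0).
bool should_clear(bool is_operation_result, int pending_refs, int threshold) {
    if (!is_operation_result) return true;   // process data: clear directly
    return pending_refs <= threshold;        // result: clear only at threshold
}
```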
Generally, the entire cleaning operation is considered complete only after all running data has been cleaned up. Considering that the cleaning operation and the calculation operation of the next-layer computing architecture may be performed in parallel, the to-be-referenced count of an operation result is continuously updated during the cleaning process. That is, the data processing apparatus may further include an update module 60, configured to:
when the detection module 30 detects that the current-layer computing architecture has finished running, set the to-be-referenced count of the operation result according to the next-layer computing architecture; and
after the to-be-referenced count is set successfully, acquire reference information of the next-layer computing architecture on the operation result during running, and update the to-be-referenced count according to the reference information.
In this embodiment, the reference information mainly refers to the number of references. Specifically, the initial to-be-referenced count of the current layer's operation result may be preset according to the number of operation results of the next-layer computing architecture; generally, the update module 60 may set the initial to-be-referenced count equal to that number. For example, for a double-task deep learning model with the network structure A->B->C1+C2, the operation results of the last layer are C1 and C2 and the operation result of the intermediate layer is B, so the initial to-be-referenced count of B may be set to 2 and that of A may be set to 1. Then, during the actual calculation, the update module 60 may update the to-be-referenced count of each operation result in real time according to how it is actually referenced. The update method may be: each time the next-layer computing architecture uses the current layer's operation result, the corresponding to-be-referenced count is decremented by one. For example, when C1 is calculated, the to-be-referenced count of B is updated to 1; when C2 is calculated, it is updated to 0, at which point B may be cleaned up. Compared with the prior-art approach of directly copying the operation result into the storage region of the next-layer computing architecture for use, this embodiment replaces copying with reference counting: no data needs to be copied, the method is simple, and memory usage is reduced.
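The count update for the double-task example A->B->C1+C2 can be illustrated as follows. This is a minimal sketch under assumed names (`ref_counts`, `consume`); in the apparatus itself, the counts live in the update module 60.

```python
# Each result starts with a to-be-referenced count equal to the number of
# next-layer results that consume it: B feeds C1 and C2, A feeds only B.
ref_counts = {"A": 1, "B": 2}

def consume(name):
    """Decrement a result's to-be-referenced count when a next-layer task
    uses it; a count of 0 means the result may now be cleaned up."""
    ref_counts[name] -= 1
    return ref_counts[name] == 0  # True -> safe to free

consume("A")                    # B is computed from A; A may be freed
freed_after_c1 = consume("B")   # C1 uses B: count 2 -> 1, keep B
freed_after_c2 = consume("B")   # C2 uses B: count 1 -> 0, free B
```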
In specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily and implemented as the same entity or several entities. For the specific implementation of the above units, reference may be made to the foregoing method embodiments, which are not repeated here.
As can be seen from the above, in the data processing apparatus provided by this embodiment, the acquisition module 10 obtains the data to be processed, and the processing module 20 inputs the data into the embedded deep learning model for processing, the deep learning model including a multilayer computing architecture. During processing, the detection module 30 detects whether the current-layer computing architecture has finished running; if so, the cleaning module 40 acquires the storage region corresponding to the current-layer computing architecture and cleans it up, the storage region being used to store running data produced by the current-layer computing architecture during running. When the cleaning is complete, the next-layer computing architecture is taken as the current-layer computing architecture, and the step of detecting whether the current-layer computing architecture has finished running is performed again, until the processing result is output. Automatic cleanup of intermediate results during calculation is thereby enabled, which effectively alleviates the problem of excessive memory consumption when a deep learning model performs calculation, facilitates offline deployment of deep learning models on terminals, and offers high flexibility and high computational efficiency.
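The overall workflow just summarized (process layer by layer, free each layer's storage as soon as it finishes, keep only the final output) can be sketched as follows; the layer callables and the `run_model` interface are illustrative assumptions, not the patent's actual module interfaces.

```python
def run_model(layers, data):
    """Run a multilayer model, releasing each layer's intermediate data
    as soon as that layer has finished running."""
    current = data
    for layer in layers:
        output = layer(current)  # detect: the current layer has finished running
        del current              # clean up the finished layer's storage region
        current = output         # the next layer becomes the current layer
    return current               # only the last layer's result is retained

# Two toy "computing architectures" standing in for model layers.
double = lambda xs: [v * 2 for v in xs]
increment = lambda xs: [v + 1 for v in xs]

result = run_model([double, increment], [1, 2, 3])
# result == [3, 5, 7]
```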
Accordingly, an embodiment of the present invention further provides a data processing system including any data processing apparatus provided by the embodiments of the present invention; the data processing apparatus may be integrated in an electronic device.
The electronic device may obtain data to be processed; input the data to be processed into a deep learning model embedded in the terminal for processing, the deep learning model including a multilayer computing architecture; during processing, detect whether the current-layer computing architecture has finished running; if it is detected that the current-layer computing architecture has finished running, acquire the storage region corresponding to the current-layer computing architecture and clean it up, the storage region being used to store running data produced by the current-layer computing architecture during running; and, when the cleaning is complete, take the next-layer computing architecture as the current-layer computing architecture and return to the step of detecting whether the current-layer computing architecture has finished running, until the processing result is output.
For the specific implementation of each of the above devices, reference may be made to the foregoing embodiments, which are not repeated here.
Since the data processing system may include any data processing apparatus provided by the embodiments of the present invention, it can achieve the beneficial effects of any such apparatus; reference may be made to the foregoing embodiments, and details are not repeated here.
Accordingly, an embodiment of the present invention further provides an electronic device. FIG. 4 is a schematic structural diagram of the electronic device involved in the embodiment of the present invention. Specifically:
The electronic device may include components such as a processor 701 having one or more processing cores, a memory 702 having one or more computer-readable storage media, a power supply 703, and an input unit 704. Those skilled in the art will understand that the structure shown in FIG. 4 does not limit the electronic device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Wherein:
The processor 701 is the control center of the electronic device, and connects the various parts of the entire electronic device through various interfaces and lines. By running or executing software programs and/or modules stored in the memory 702 and calling data stored in the memory 702, it performs the various functions of the electronic device and processes data, thereby monitoring the electronic device as a whole. Optionally, the processor 701 may include one or more processing cores. Preferably, the processor 701 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, applications, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 701.
The memory 702 may be used to store software programs and modules; the processor 701 runs the software programs and modules stored in the memory 702 to perform various function applications and data processing. The memory 702 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system and the application required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the electronic device. In addition, the memory 702 may include high-speed random access memory, and may further include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Correspondingly, the memory 702 may further include a memory controller to provide the processor 701 with access to the memory 702.
The electronic device further includes a power supply 703 that supplies power to the components. Preferably, the power supply 703 may be logically connected to the processor 701 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system. The power supply 703 may further include any component such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The electronic device may further include an input unit 704, which may be used to receive input digit or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control.
Although not shown, the electronic device may further include a display unit and the like, which are not described here. Specifically, in this embodiment, the processor 701 in the electronic device loads, according to the following instructions, executable files corresponding to the processes of one or more applications into the memory 702, and the processor 701 runs the applications stored in the memory 702, thereby implementing various functions as follows:
obtaining data to be processed;
inputting the data to be processed into a deep learning model embedded in the terminal for processing, the deep learning model including a multilayer computing architecture;
during processing, detecting whether the current-layer computing architecture has finished running;
if it is detected that the current-layer computing architecture has finished running, acquiring the storage region corresponding to the current-layer computing architecture, and cleaning up the storage region, the storage region being used to store running data produced by the current-layer computing architecture during running; and, when the cleaning is complete, taking the next-layer computing architecture as the current-layer computing architecture, and returning to the step of detecting whether the current-layer computing architecture has finished running, until the processing result is output.
The electronic device can achieve the beneficial effects of any data processing apparatus provided by the embodiments of the present invention; reference may be made to the foregoing embodiments, and details are not repeated here.
Those of ordinary skill in the art will understand that all or part of the steps of the various methods in the above embodiments may be completed by instructions, or by instructions controlling related hardware; the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
Therefore, an embodiment of the present invention provides a storage medium storing a plurality of instructions, which can be loaded by a processor to perform the steps of any data processing method provided by the embodiments of the present invention. For example, the instructions may perform the following steps:
obtaining data to be processed;
inputting the data to be processed into an embedded deep learning model for processing, the deep learning model including a multilayer computing architecture;
during processing, detecting whether the current-layer computing architecture has finished running;
if it is detected that the current-layer computing architecture has finished running, acquiring the storage region corresponding to the current-layer computing architecture, and cleaning up the storage region, the storage region being used to store running data produced by the current-layer computing architecture during running; and, when the cleaning is complete, taking the next-layer computing architecture as the current-layer computing architecture, and returning to the step of detecting whether the current-layer computing architecture has finished running, until the processing result is output.
For the specific implementation of each of the above operations, reference may be made to the foregoing embodiments, which are not repeated here.
The storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
Since the instructions stored in the storage medium can perform the steps of any data processing method provided by the embodiments of the present invention, they can achieve the beneficial effects of any such method; reference may be made to the foregoing embodiments, and details are not repeated here.
The data processing method, apparatus, storage medium, electronic device, and system provided by the embodiments of the present invention have been described above in detail. Specific examples are used herein to explain the principles and implementations of the present invention, and the above description of the embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, those skilled in the art may make changes to the specific implementations and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (15)

  1. A data processing method, comprising:
    obtaining data to be processed;
    inputting the data to be processed into an embedded deep learning model for processing, the deep learning model comprising a multilayer computing architecture;
    during processing, detecting whether a current-layer computing architecture has finished running; and
    if it is detected that the current-layer computing architecture has finished running, acquiring a storage region corresponding to the current-layer computing architecture, and cleaning up the storage region, the storage region being used to store running data produced by the current-layer computing architecture during running; and, when the cleaning is complete, taking a next-layer computing architecture as the current-layer computing architecture, and returning to the step of detecting whether the current-layer computing architecture has finished running, until a processing result is output.
  2. The data processing method according to claim 1, wherein the running data comprises an operation result, and cleaning up the storage region comprises:
    cleaning up the running data other than the operation result;
    acquiring a current to-be-referenced count of the operation result; and
    when the to-be-referenced count is equal to a preset threshold, cleaning up the operation result; when the to-be-referenced count is greater than the preset threshold, returning to the operation of acquiring the storage region corresponding to the current-layer computing architecture.
  3. The data processing method according to claim 2, further comprising, upon detecting that the current-layer computing architecture has finished running:
    setting the to-be-referenced count of the operation result according to the next-layer computing architecture;
    after the to-be-referenced count is set successfully, acquiring reference information of the next-layer computing architecture on the operation result during running; and
    updating the to-be-referenced count according to the reference information.
  4. The data processing method according to any one of claims 1-3, further comprising, before inputting the data to be processed into the embedded deep learning model for processing:
    obtaining parameter information of each layer's computing architecture from a trained deep learning model;
    generating a calculation path according to the parameter information of each layer's computing architecture, the calculation path comprising calculation functions of the computing architectures participating in the operation and an operation order among the computing architectures; and
    storing the calculation path in an electronic device, so as to embed the deep learning model in the electronic device;
    wherein inputting the data to be processed into the embedded deep learning model for processing comprises: processing the data to be processed according to the calculation path.
  5. The data processing method according to claim 4, wherein generating the calculation path according to the parameter information of each layer's computing architecture comprises:
    screening out, from each layer's computing architecture, the computing architectures participating in the operation by using a preset screening parameter; and
    generating the calculation path according to the parameter information of the screened-out computing architectures.
  6. The data processing method according to claim 4, wherein obtaining the parameter information of each layer's computing architecture from the trained deep learning model comprises:
    converting the trained deep learning model by using a preset conversion strategy to obtain a converted file;
    calling a preset loading interface to load the converted file in the electronic device; and
    obtaining the parameter information of each layer's computing architecture from the loaded content.
  7. The data processing method according to claim 4, wherein embedding the deep learning model in the electronic device further comprises:
    detecting a processor quantity of the electronic device; and
    determining a corresponding thread quantity according to the processor quantity, and storing the thread quantity in the electronic device;
    wherein processing the data to be processed according to the calculation path comprises: calling idle threads equal in number to the thread quantity; and processing the data to be processed according to the calculation path by using the called idle threads.
  8. A data processing apparatus, comprising:
    an acquisition module, configured to obtain data to be processed;
    a processing module, configured to input the data to be processed into an embedded deep learning model for processing, the deep learning model comprising a multilayer computing architecture;
    a detection module, configured to detect, during processing, whether a current-layer computing architecture has finished running; and
    a cleaning module, configured to: if the detection module detects that the current-layer computing architecture has finished running, acquire a storage region corresponding to the current-layer computing architecture, and clean up the storage region, the storage region being used to store running data produced by the current-layer computing architecture during running; and, when the cleaning is complete, take a next-layer computing architecture as the current-layer computing architecture, and trigger the detection module again to detect whether the current-layer computing architecture has finished running, until a processing result is output.
  9. The data processing apparatus according to claim 8, wherein the running data comprises an operation result, and the cleaning module specifically comprises:
    a first cleaning submodule, configured to clean up the running data other than the operation result;
    a first acquisition submodule, configured to acquire a current to-be-referenced count of the operation result; and
    a second cleaning submodule, configured to clean up the operation result when the to-be-referenced count is equal to a preset threshold, and to return to the operation of acquiring the storage region corresponding to the current-layer computing architecture when the to-be-referenced count is greater than the preset threshold.
  10. The data processing apparatus according to claim 9, further comprising an update module, configured to:
    when the detection module detects that the current-layer computing architecture has finished running, set the to-be-referenced count of the operation result according to the next-layer computing architecture; and
    after the to-be-referenced count is set successfully, acquire reference information of the next-layer computing architecture on the operation result during running, and update the to-be-referenced count according to the reference information.
  11. The data processing apparatus according to any one of claims 8-10, further comprising an embedding module, the embedding module specifically comprising:
    a second acquisition submodule, configured to obtain parameter information of each layer's computing architecture from a trained deep learning model before the processing module inputs the data to be processed into the embedded deep learning model for processing;
    a generation submodule, configured to generate a calculation path according to the parameter information of each layer's computing architecture, the calculation path comprising calculation functions of the computing architectures participating in the operation and an operation order among the computing architectures; and
    a storage submodule, configured to store the calculation path in an electronic device, so as to embed the deep learning model in the electronic device;
    wherein the processing module is specifically configured to: process the data to be processed according to the calculation path.
  12. The data processing apparatus according to claim 11, wherein the generation submodule is specifically configured to:
    screen out, from each layer's computing architecture, the computing architectures participating in the operation by using a preset screening parameter; and
    generate the calculation path according to the parameter information of the screened-out computing architectures.
  13. The data processing apparatus according to claim 11, wherein the second acquisition submodule is specifically configured to:
    convert the trained deep learning model by using a preset conversion strategy to obtain a converted file;
    call a preset loading interface to load the converted file in the electronic device; and
    obtain the parameter information of each layer's computing architecture from the loaded content.
  14. The data processing apparatus according to claim 11, wherein the embedding module further comprises a determination submodule, configured to:
    detect a processor quantity of the electronic device; determine a corresponding thread quantity according to the processor quantity; and store the thread quantity in the electronic device;
    wherein the processing module is specifically configured to: call idle threads equal in number to the thread quantity; and process the data to be processed according to the calculation path by using the called idle threads.
  15. 15. a kind of storage medium, it is characterised in that the storage medium is stored with a plurality of instruction, and the instruction is suitable to processor Loaded, the step in the data processing method described in 1 to 7 any one is required with perform claim.
CN201710735990.7A 2017-08-24 2017-08-24 Data processing method, device and storage medium Active CN107563512B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710735990.7A CN107563512B (en) 2017-08-24 2017-08-24 Data processing method, device and storage medium


Publications (2)

Publication Number Publication Date
CN107563512A true CN107563512A (en) 2018-01-09
CN107563512B CN107563512B (en) 2023-10-17

Family

ID=60976926


Country Status (1)

Country Link
CN (1) CN107563512B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011156741A1 (en) * 2010-06-10 2011-12-15 Carnegie Mellon University Synthesis system for pipelined digital circuits with multithreading
WO2015066331A1 (en) * 2013-11-04 2015-05-07 Google Inc. Systems and methods for layered training in machine-learning architectures
WO2015192812A1 (en) * 2014-06-20 2015-12-23 Tencent Technology (Shenzhen) Company Limited Data parallel processing method and apparatus based on multiple graphic procesing units
CN106951926A (en) * 2017-03-29 2017-07-14 山东英特力数据技术有限公司 The deep learning systems approach and device of a kind of mixed architecture


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Huyin; Wang Sisi; Qian Long; Zhou Tianying: "Research on an Energy-Saving Mechanism for Path Resource Management in SDN Data Center Networks" *

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019141014A1 (en) * 2018-01-16 2019-07-25 腾讯科技(深圳)有限公司 Chip-based instruction set processing method and apparatus, and storage medium
US10877924B2 (en) 2018-01-16 2020-12-29 Tencent Technology (Shenzhen) Company Limited Instruction set processing method based on a chip architecture and apparatus, and storage medium
CN108279881A (en) * 2018-02-11 2018-07-13 深圳竹信科技有限公司 Cross-platform realization framework based on deep learning predicted portions and method
CN108279881B (en) * 2018-02-11 2021-05-28 深圳竹信科技有限公司 Cross-platform implementation framework and method based on deep learning prediction part
CN108764487A (en) * 2018-05-29 2018-11-06 北京百度网讯科技有限公司 For generating the method and apparatus of model, the method and apparatus of information for identification
US11210608B2 (en) 2018-05-29 2021-12-28 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for generating model, method and apparatus for recognizing information
US11726754B2 (en) 2018-06-08 2023-08-15 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
US11379199B2 (en) 2018-06-08 2022-07-05 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
CN110647996A (en) * 2018-06-08 2020-01-03 上海寒武纪信息科技有限公司 Execution method and device of universal machine learning model and storage medium
US11334329B2 (en) 2018-06-08 2022-05-17 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
US11334330B2 (en) 2018-06-08 2022-05-17 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
US11403080B2 (en) 2018-06-08 2022-08-02 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
US11307836B2 (en) 2018-06-08 2022-04-19 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
CN110647996B (en) * 2018-06-08 2021-01-26 上海寒武纪信息科技有限公司 Execution method and device of universal machine learning model and storage medium
US11036480B2 (en) 2018-06-08 2021-06-15 Shanghai Cambricon Information Technology Co., Ltd. General machine learning model, and model file generation and parsing method
CN112740174A (en) * 2018-10-17 2021-04-30 北京比特大陆科技有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN112740174B (en) * 2018-10-17 2024-02-06 北京比特大陆科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium
WO2020093205A1 (en) * 2018-11-05 2020-05-14 深圳市欢太科技有限公司 Deep learning computation method and related device
CN112714917A (en) * 2018-11-05 2021-04-27 深圳市欢太科技有限公司 Deep learning calculation method and related equipment
CN112714917B (en) * 2018-11-05 2024-05-10 深圳市欢太科技有限公司 Deep learning calculation method and related equipment
CN111291882A (en) * 2018-12-06 2020-06-16 北京百度网讯科技有限公司 Model conversion method, device, equipment and computer storage medium
CN110009110B (en) * 2019-03-19 2020-11-13 福建天晴数码有限公司 STN network optimization method based on UTXO-like library and storage medium
CN110009110A (en) * 2019-03-19 2019-07-12 福建天晴数码有限公司 STN network optimized approach, storage medium based on the library class UTXO
CN110276322A (en) * 2019-06-26 2019-09-24 湖北亿咖通科技有限公司 A kind of image processing method and device of the unused resource of combination vehicle device
CN110503644A (en) * 2019-08-27 2019-11-26 广东工业大学 Defect detection implementation method based on mobile platform, defect detection method and relevant device
CN110503644B (en) * 2019-08-27 2023-07-25 广东工业大学 Defect detection implementation method based on mobile platform, defect detection method and related equipment
WO2021120177A1 (en) * 2019-12-20 2021-06-24 华为技术有限公司 Method and apparatus for compiling neural network model
CN111145076A (en) * 2019-12-27 2020-05-12 深圳鲲云信息科技有限公司 Data parallelization processing method, system, equipment and storage medium
CN112149908A (en) * 2020-09-28 2020-12-29 深圳壹账通智能科技有限公司 Vehicle driving prediction method, system, computer device and readable storage medium
CN112149908B (en) * 2020-09-28 2023-09-08 深圳壹账通智能科技有限公司 Vehicle driving prediction method, system, computer device, and readable storage medium
CN113676353A (en) * 2021-08-19 2021-11-19 杭州华橙软件技术有限公司 Control method and device for equipment, storage medium and electronic device
CN114594103A (en) * 2022-04-12 2022-06-07 四川大学 Method and system for automatically detecting surface defects of nuclear industrial equipment and automatically generating reports
CN114594103B (en) * 2022-04-12 2023-05-16 四川大学 Automatic detection and report generation method and system for surface defects of nuclear industrial equipment

Also Published As

Publication number Publication date
CN107563512B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN107563512A (en) A kind of data processing method, device and storage medium
CN111310936A (en) Machine learning training construction method, platform, device, equipment and storage medium
CN112200297B (en) Neural network optimization method, device and processor
CN106951926A (en) A kind of deep learning system method and device based on mixed architecture
CN110309911A (en) Neural network model verification method, device, computer equipment and storage medium
CN110689116B (en) Neural network pruning method and device, computer equipment and storage medium
CN106547627A (en) A kind of method and system for accelerating Spark MLlib data processing
CN111782637A (en) Model construction method, device and equipment
CN111190741A (en) Scheduling method, device and storage medium based on deep learning node calculation
CN111831359B (en) Weight precision configuration method, device, equipment and storage medium
CN112115801B (en) Dynamic gesture recognition method and device, storage medium and terminal equipment
CN114936085A (en) ETL scheduling method and device based on deep learning algorithm
US20230120227A1 (en) Method and apparatus having a scalable architecture for neural networks
CN111831355A (en) Weight precision configuration method, device, equipment and storage medium
CN113010312A (en) Hyper-parameter tuning method, device and storage medium
CN109117475A (en) A kind of method and relevant device of text rewriting
CN114490116B (en) Data processing method and device, electronic equipment and storage medium
CN111368707A (en) Face detection method, system, device and medium based on feature pyramid and dense block
CN114065948A (en) Method and device for constructing pre-training model, terminal equipment and storage medium
CN109002885A (en) A kind of convolutional neural network pooling unit and pooling calculation method
CN112990461B (en) Method, device, computer equipment and storage medium for constructing neural network model
CN106934854A (en) Based on creator model optimization method and devices
CN113626035B (en) Neural network compiling method facing RISC-V equipment based on TVM
CN110442753A (en) A kind of graph database auto-creating method and device based on OPC UA
CN115794137A (en) GPU-oriented artificial intelligence model deployment method and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant