CN109558329A - Program detection method, apparatus, device, and readable storage medium - Google Patents

Publication number
CN109558329A
Authority
CN
China
Prior art keywords
program
winograd
test data
convolution
fpga
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811514703.0A
Other languages
Chinese (zh)
Inventor
曹芳
赵雅倩
郭振华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Inspur Smart Computing Technology Co Ltd
Original Assignee
Guangdong Inspur Big Data Research Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Inspur Big Data Research Co Ltd
Priority to CN201811514703.0A
Publication of CN109558329A
Priority to PCT/CN2019/103639 (WO2020119188A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a program detection method. The method comprises: obtaining test data when a Winograd program detection instruction is received; performing a convolution calculation on the test data using a target algorithm program of a convolutional neural network to obtain a convolution result; sending the test data to an FPGA, so that the FPGA performs a fast convolution calculation on the test data using the Winograd program; receiving the fast convolution result sent by the FPGA and calculating the similarity between the fast convolution result and the convolution result; and determining that the Winograd program is correct when the similarity is greater than a threshold. The method makes it possible to check the Winograd program on the FPGA. The invention also discloses a program detection apparatus, device, and readable storage medium with the corresponding technical effects.

Description

Program detection method, apparatus, device, and readable storage medium
Technical field
The present invention relates to the field of computer application technology, and in particular to a program detection method, apparatus, device, and readable storage medium.
Background
In recent years, convolutional neural networks (CNNs) have been applied ever more widely to computer vision tasks. A CNN generally comprises multiple layers, and the output feature map of each layer is the input feature map of the next. Computation in current state-of-the-art CNNs is dominated by the convolutional layers.
An FPGA (Field-Programmable Gate Array) offers high performance, low energy consumption, and reconfigurability, and has therefore attracted attention as an effective hardware accelerator for CNNs. With the target convolution algorithm, each element of the output feature map is computed separately through a series of multiply-accumulate operations. This consumes a large number of the FPGA's DSP (multiplier) resources, which are limited and scarce on an FPGA board and cannot supply the number of multiplications that the target convolution algorithm requires.
The Winograd algorithm is a fast algorithm for convolutional neural networks. It exploits the structural similarity between elements to generate several elements of the output feature map together. This reduces the number of multiplications, lowers the algorithmic complexity considerably, and improves CNN performance on an FPGA.
However, implementing the Winograd algorithm in FPGA code makes the program more complex, and mistakes are easy to make during development. If the Winograd part computes a wrong result, the accuracy of the entire CNN algorithm suffers. To make verification more reliable, the Winograd program usually has to be tested with many different test inputs. During development, however, the complexity of the Winograd algorithm makes it hard to work out the expected convolution result for each test input. Storing in advance a table that maps test inputs to expected results does not solve the problem either: the amount of test data is small, the outcome is subject to chance, and the test process is complicated and hard to carry out.
In summary, how to check effectively whether a Winograd program is correct is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The object of the present invention is to provide a program detection method, apparatus, device, and readable storage medium that check the Winograd program on an FPGA, so as to ensure the accuracy of the entire CNN algorithm.
To solve the above technical problem, the present invention provides the following technical solution:
A program detection method, comprising:
obtaining test data when a Winograd program detection instruction is received;
performing a convolution calculation on the test data using a target algorithm program of a convolutional neural network to obtain a convolution result, the target algorithm program being an algorithm program that implements the convolutional neural network in a sliding-window manner;
sending the test data to an FPGA, so that the FPGA performs a fast convolution calculation on the test data using the Winograd program;
receiving the fast convolution result sent by the FPGA, and calculating the similarity between the fast convolution result and the convolution result;
determining that the Winograd program is correct when the similarity is greater than a threshold.
Preferably, performing the convolution calculation on the test data using the target algorithm program of the convolutional neural network to obtain the convolution result comprises:
performing the convolution calculation on the test data using the target algorithm program, and taking the first-layer result of the convolutional neural network as the convolution result;
correspondingly, the FPGA performing the fast convolution calculation on the test data using the Winograd program comprises:
the FPGA performing the fast convolution calculation on the test data using the Winograd program, and taking the first-layer result of the convolutional neural network obtained by the fast calculation as the fast convolution result.
Preferably, the method further comprises:
obtaining the filter parameters of the convolutional neural network;
setting the filter parameters in both the target convolution algorithm program and the Winograd program.
Preferably, sending the test data to the FPGA comprises:
creating the FPGA board running environment and initializing the board parameters;
sending the test data to the FPGA.
Preferably, the FPGA performing the fast convolution calculation on the test data using the Winograd program comprises:
the FPGA starting the kernel and performing the fast convolution calculation on the test data using the Winograd program.
Preferably, calculating the similarity between the fast convolution result and the convolution result comprises:
calculating the ratio of the fast convolution result to the convolution result, and determining the similarity from the ratio;
or calculating the difference between the fast convolution result and the convolution result, and determining the similarity from the difference.
Preferably, when the similarity is less than or equal to the threshold, the method further comprises:
determining that the Winograd program is erroneous.
A program detection apparatus, comprising:
a test data obtaining module, configured to obtain test data when a Winograd program detection instruction is received;
a convolution calculation module, configured to perform a convolution calculation on the test data using a target algorithm program of a convolutional neural network to obtain a convolution result, the target algorithm program being an algorithm program that implements the convolutional neural network in a sliding-window manner;
a test data sending module, configured to send the test data to an FPGA, so that the FPGA performs a fast convolution calculation on the test data using the Winograd program;
a similarity calculation module, configured to receive the fast convolution result sent by the FPGA and to calculate the similarity between the fast convolution result and the convolution result;
a detection result determining module, configured to determine that the Winograd program is correct when the similarity is greater than a threshold.
A program detection device, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above program detection method when executing the computer program.
A readable storage medium on which a computer program is stored, the computer program implementing the steps of the above program detection method when executed by a processor.
With the method provided by the embodiments of the present invention, test data is obtained when a Winograd program detection instruction is received; a convolution calculation is performed on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, the target algorithm program being an algorithm program that implements the convolutional neural network in a sliding-window manner; the test data is sent to the FPGA, so that the FPGA performs a fast convolution calculation on the test data using the Winograd program; the fast convolution result sent by the FPGA is received, and the similarity between the fast convolution result and the convolution result is calculated; and when the similarity is greater than a threshold, the Winograd program is determined to be correct.
The target algorithm program of the deep convolutional neural network, that is, the implementation of the sliding-window algorithm, can express the convolution exactly with nested loops alone, so its code is simple and its probability of error is small. The Winograd program, in turn, is a fast-algorithm program for the same convolutional neural network. A correctly written Winograd program and the target algorithm program, given the same input data, should therefore produce two convolution results that are identical or that differ only within an allowed range; in other words, the results are similar. On this basis, after the Winograd program has been written to the FPGA and a Winograd program detection instruction is received, the test data for the check is obtained first. The CPU then performs a convolution calculation on the test data using the target algorithm program of the convolutional neural network and obtains the convolution result. At the same time, the test data can be sent to the FPGA. After the FPGA obtains the test data, it performs a fast convolution calculation on the test data using the Winograd program and sends the fast convolution result to the CPU. After the CPU receives the fast convolution result sent by the FPGA, it calculates the similarity between the fast convolution result and the convolution result, and when the similarity is greater than the threshold it determines that the Winograd program is correct. In this way, the target algorithm program running on the CPU is used to check the Winograd program on the FPGA. This safeguards the accuracy of the Winograd part, improves the accuracy of the CNN algorithm on the FPGA, and in turn improves the accuracy of computer vision tasks carried out on the FPGA.
Correspondingly, the embodiments of the present invention also provide a program detection apparatus, device, and readable storage medium corresponding to the above program detection method. They have the technical effects described above, which are not repeated here.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are introduced briefly below. The drawings described below are merely some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is an implementation flowchart of a program detection method in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the function that creates the board running environment;
Fig. 3 is a schematic diagram of the function that initializes the board parameters;
Fig. 4 is a detailed flowchart of a program detection method in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a program detection apparatus in an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a program detection device in an embodiment of the present invention;
Fig. 7 is a detailed schematic structural diagram of a program detection device in an embodiment of the present invention.
Detailed description of the embodiments
The core of the present invention is to provide a program detection method that combines the strengths of the Winograd algorithm and the target algorithm. It uses the result produced by the target algorithm program to check the result produced by the Winograd algorithm program, and from that determines whether the Winograd algorithm program expresses the Winograd algorithm correctly.
Brief introduction to the fast Winograd algorithm: let F(m, r) denote a one-dimensional convolution with output size m and filter size r, so that the input tile holds m + r - 1 elements, and let F(m×m, r×r) denote a two-dimensional convolution with output size m×m and filter size r×r. The Winograd fast filtering algorithm for the one-dimensional convolution F(m, r) can be written in matrix form as $Y = A^T[(Gg) \odot (B^T d)]$, where $\odot$ denotes element-wise multiplication. Nesting F(m, r) with itself yields the Winograd fast filtering algorithm for the minimal two-dimensional convolution F(m×m, r×r), which can be expressed as $Y = A^T[(G g G^T) \odot (B^T d B)]A$. Here g is the filter data, d is the input data, and G, A, B are three transformation matrices. Taking F(4,3) as an example, the values of G, B, A are as follows:
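The matrices displayed at this point in the original are not reproduced in this text. For reference, the commonly published F(4,3) transformation matrices, which correspond row by row to the transform expressions that follow (the last row of $B^T$ reads the sixth input element), are:

$$
B^T = \begin{bmatrix}
4 & 0 & -5 & 0 & 1 & 0 \\
0 & -4 & -4 & 1 & 1 & 0 \\
0 & 4 & -4 & -1 & 1 & 0 \\
0 & -2 & -1 & 2 & 1 & 0 \\
0 & 2 & -1 & -2 & 1 & 0 \\
0 & 4 & 0 & -5 & 0 & 1
\end{bmatrix},
\quad
G = \begin{bmatrix}
\tfrac{1}{4} & 0 & 0 \\
-\tfrac{1}{6} & -\tfrac{1}{6} & -\tfrac{1}{6} \\
-\tfrac{1}{6} & \tfrac{1}{6} & -\tfrac{1}{6} \\
\tfrac{1}{24} & \tfrac{1}{12} & \tfrac{1}{6} \\
\tfrac{1}{24} & -\tfrac{1}{12} & \tfrac{1}{6} \\
0 & 0 & 1
\end{bmatrix},
\quad
A^T = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 0 \\
0 & 1 & -1 & 2 & -2 & 0 \\
0 & 1 & 1 & 4 & 4 & 0 \\
0 & 1 & -1 & 8 & -8 & 1
\end{bmatrix}
$$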
When the Winograd algorithm is implemented in code on the FPGA, the computation follows the formulas above. Taking one-dimensional F(4,3) as an example, with d = [d1, d2, d3, d4, d5, d6] and g = [g0, g1, g2], computing B^T d requires six arithmetic expressions in code:
float trans_input0 = 4.0f*d1 - 5.0f*d3 + d5;
float trans_input1 = -4.0f*d2 - 4.0f*d3 + d4 + d5;
float trans_input2 = 4.0f*d2 - 4.0f*d3 - d4 + d5;
float trans_input3 = -2.0f*d2 - d3 + 2.0f*d4 + d5;
float trans_input4 = 2.0f*d2 - d3 - 2.0f*d4 + d5;
float trans_input5 = 4.0f*d2 - 5.0f*d4 + d6;
Computing Gg requires six arithmetic expressions in code:
float trans_filter0 = one_over_4*g0;
float trans_filter1 = minus_one_over_6*g0 + minus_one_over_6*g1 + minus_one_over_6*g2;
float trans_filter2 = minus_one_over_6*g0 + one_over_6*g1 + minus_one_over_6*g2;
float trans_filter3 = one_over_24*g0 + one_over_12*g1 + one_over_6*g2;
float trans_filter4 = one_over_24*g0 - one_over_12*g1 + one_over_6*g2;
float trans_filter5 = g2;
Assuming the element-wise product of B^T d and Gg is [mul0, mul1, mul2, mul3, mul4, mul5], computing the final result Y requires four arithmetic expressions in code:
float result0 = mul0 + mul1 + mul2 + mul3 + mul4;
float result1 = mul1 - mul2 + 2.0f*mul3 - 2.0f*mul4;
float result2 = mul1 + mul2 + 4.0f*mul3 + 4.0f*mul4;
float result3 = mul1 - mul2 + 8.0f*mul3 - 8.0f*mul4 + mul5;
As shown above, the Winograd code for one-dimensional F(4,3) requires 6+6+4=16 hand-written arithmetic expressions. Similarly, two-dimensional F(4x4,3x3) contains at least (6x6+6x6)+(6x3+6x6)+(4x6+4x4)=166 arithmetic expressions. The number of expressions that must be written by hand rises sharply, and each expression combines the four arithmetic operations with different constants and different variables. The number and complexity of these expressions increase the probability of coding errors.
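The sixteen one-dimensional expressions can be exercised in a few lines. The following is a minimal Python sketch, not the patent's FPGA code: it uses zero-based indices (d[0] to d[5]), and the function names `winograd_f43_1d` and `direct_1d` are hypothetical. It shows that the three transforms reproduce a direct sliding-window convolution of a 6-element tile with a 3-tap filter.

```python
def winograd_f43_1d(d, g):
    """Return the 4 outputs of F(4,3) for a 6-element tile d and 3-tap filter g."""
    assert len(d) == 6 and len(g) == 3
    # Input transform B^T d (six expressions).
    t = [
        4.0 * d[0] - 5.0 * d[2] + d[4],
        -4.0 * d[1] - 4.0 * d[2] + d[3] + d[4],
        4.0 * d[1] - 4.0 * d[2] - d[3] + d[4],
        -2.0 * d[1] - d[2] + 2.0 * d[3] + d[4],
        2.0 * d[1] - d[2] - 2.0 * d[3] + d[4],
        4.0 * d[1] - 5.0 * d[3] + d[5],
    ]
    # Filter transform G g (six expressions).
    f = [
        g[0] / 4.0,
        -(g[0] + g[1] + g[2]) / 6.0,
        (-g[0] + g[1] - g[2]) / 6.0,
        g[0] / 24.0 + g[1] / 12.0 + g[2] / 6.0,
        g[0] / 24.0 - g[1] / 12.0 + g[2] / 6.0,
        g[2],
    ]
    # Element-wise product: the only six multiplications between data and filter.
    m = [a * b for a, b in zip(t, f)]
    # Output transform A^T m (four expressions).
    return [
        m[0] + m[1] + m[2] + m[3] + m[4],
        m[1] - m[2] + 2.0 * m[3] - 2.0 * m[4],
        m[1] + m[2] + 4.0 * m[3] + 4.0 * m[4],
        m[1] - m[2] + 8.0 * m[3] - 8.0 * m[4] + m[5],
    ]


def direct_1d(d, g):
    """Reference: sliding-window convolution in the CNN sense (no filter flip)."""
    return [sum(d[i + k] * g[k] for k in range(len(g)))
            for i in range(len(d) - len(g) + 1)]
```

Comparing the two functions on the same tile is a small-scale version of the check this document describes: a single wrong index or sign in any of the sixteen expressions makes the outputs diverge from the reference.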
Brief introduction to the traditional convolution algorithm: if F(4x4,3x3) is computed with the traditional convolution algorithm, for example in a sliding-window manner, the program needs only four nested for loops; the code is simple and the probability of error is small. The traditional algorithm program is as follows:
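The original listing is not reproduced in this text. As a stand-in, here is a minimal Python sketch of a sliding-window convolution built from four nested loops; it assumes a single channel, stride 1, and no padding, and the name `conv2d_sliding_window` is hypothetical.

```python
def conv2d_sliding_window(image, kernel):
    """2D convolution in the CNN sense (no kernel flip), stride 1, no padding."""
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for oy in range(out_h):              # loop 1: output rows
        for ox in range(out_w):          # loop 2: output columns
            acc = 0.0
            for ky in range(k_h):        # loop 3: kernel rows
                for kx in range(k_w):    # loop 4: kernel columns
                    acc += image[oy + ky][ox + kx] * kernel[ky][kx]
            out[oy][ox] = acc
    return out
```

For F(4x4,3x3) the input tile is 6x6 and the kernel 3x3, so the function returns a 4x4 output. The loop body contains one multiply-accumulate and no hand-derived constants, which is why this form is hard to get wrong.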
The analysis above shows that the Winograd (matrix multiplication) code implemented in the kernel on the FPGA side is complicated to write and error-prone, whereas the traditional convolution code implemented on the host side is simple to write and has a very low error rate. The present method combines the advantages of the two algorithms to improve the Winograd implementation. The specific approach is as follows:
The Winograd expressions in the kernel on the FPGA side remain unchanged, but the first-layer CNN convolution result they compute is passed back to the host side. Meanwhile, the traditional convolution is implemented on the host side, where a separate thread computes the first-layer CNN convolution result. When both calculations have finished, the host-side result of the traditional convolution algorithm is compared with the Winograd result passed back from the kernel side. If the two results differ only slightly, within the expected tolerance, the Winograd result is correct and the kernel-side CNN program continues to run. If the difference exceeds the expected tolerance, the Winograd expressions contain an error, and the program must be interrupted for inspection and modification.
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment 1:
Referring to Fig. 1, which is a flowchart of a program detection method in an embodiment of the present invention: the method can be applied to a CPU and comprises the following steps:
S101: obtain test data when a Winograd program detection instruction is received.
Here, the Winograd program is a fast-algorithm program that implements the convolutional neural network.
After the developer has finished writing the Winograd program and has written it to the FPGA, a Winograd program detection instruction can be sent to the CPU through a visual interface or from the command line. When the CPU receives the instruction, it obtains the data for testing the Winograd program. The test data may be, for example, image data or a matrix. It can be passed in through an external interface or read directly from a storage device.
S102: perform a convolution calculation on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result.
Here, the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding-window manner.
In the embodiments of the present invention, the target algorithm program of the convolutional neural network can be written in advance. Once the test data has been obtained, the target algorithm program is used to perform a convolution calculation on it and obtain the convolution result. The target algorithm program may alternatively implement the convolutional neural network with a Fourier (FFT) or im2col approach.
The sliding-window algorithm is the most intuitive and simplest method. The im2col algorithm is implemented by almost all current mainstream computing frameworks, including Caffe and MXNet; it converts the entire convolution into a GEMM operation, and GEMM is highly optimized in the various BLAS libraries. The FFT algorithm relies on the Fourier transform and the fast Fourier transform, which are calculation methods commonly used in classical image processing. Because the sliding-window, Fourier, and im2col algorithms are all common, their processing logic is not described further here.
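To make the im2col idea concrete, the following minimal Python sketch unrolls every input patch into a row and then multiplies the resulting matrix by the flattened kernel. It is an illustration under simplifying assumptions (single channel, stride 1, no padding), the function names are hypothetical, and plain dot products stand in for the optimized BLAS GEMM call that a real framework would make.

```python
def im2col(image, k_h, k_w):
    """Unroll each k_h x k_w patch of the image into one row of a matrix."""
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    rows = []
    for oy in range(out_h):
        for ox in range(out_w):
            rows.append([image[oy + ky][ox + kx]
                         for ky in range(k_h) for kx in range(k_w)])
    return rows, out_h, out_w


def conv2d_im2col(image, kernel):
    """Convolution as (patch matrix) x (flattened kernel vector)."""
    k_h, k_w = len(kernel), len(kernel[0])
    patches, out_h, out_w = im2col(image, k_h, k_w)
    flat_kernel = [v for row in kernel for v in row]
    # Matrix-vector product; with several filters this becomes a full GEMM.
    flat_out = [sum(a * b for a, b in zip(patch, flat_kernel))
                for patch in patches]
    # Fold the flat result back into an out_h x out_w feature map.
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]
```

Any of these reference implementations can serve as the trusted side of the comparison, provided it uses the same filter values and data layout as the Winograd program.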
S103: send the test data to the FPGA, so that the FPGA performs a fast convolution calculation on the test data using the Winograd program.
After the test data is obtained, it must also be sent to the FPGA. The FPGA in the embodiments of the present invention may be a chip or a device with programmable logic gates. After the FPGA receives the test data, it performs a fast convolution calculation on it using the Winograd program and obtains the fast convolution result, which it can then return to the CPU.
Sending the test data to the FPGA specifically comprises:
Step 1: create the FPGA board running environment and initialize the board parameters;
Step 2: send the test data to the FPGA.
For ease of description, Step 1 and Step 2 are explained together below.
The FPGA board running environment is created first, and the board parameters are then initialized, after which the test data can be sent to the FPGA. Both creating the board running environment and initializing the board parameters can be done by calling functions packaged by Intel to operate the board, for example the function for creating the board running environment shown in Fig. 2 and the function for initializing the board parameters shown in Fig. 3.
After the FPGA receives the test data, it can start the kernel and perform the fast convolution calculation on the test data using the Winograd program. The kernel is a real-time operating system on the FPGA that provides event scheduling and synchronization, inter-process communication (message passing), memory management, and process management. Once the fast convolution result is obtained, it is returned to the CPU.
S104: receive the fast convolution result sent by the FPGA, and calculate the similarity between the fast convolution result and the convolution result.
After the CPU receives the fast convolution result returned by the FPGA, it can calculate the similarity between the fast convolution result and the convolution result.
Specifically, when the programs for the target algorithm and the Winograd algorithm of the convolutional neural network perform a convolution calculation on the same input data, their results should be identical or highly similar. Because the target algorithm program's code is comparatively simple and not error-prone, the convolution result that it computes from the test data serves as the reference value. Once the fast convolution result of the Winograd program is available, the similarity between the convolution result and the fast convolution result determines whether the Winograd program is correct.
The similarity can be calculated in ways that include, but are not limited to, the following two; in practice, either one can be chosen:
Mode 1: calculate the ratio of the fast convolution result to the convolution result and determine the similarity from the ratio. How close the ratio of two values is to 1 indicates how similar they are: the closer the ratio is to 1, the higher the similarity. Accordingly, once the fast convolution result and the convolution result have been obtained, their ratio can be calculated and the similarity determined from it. Specifically, the ratio is kept within (0, 1] by taking whichever of the two quotients (fast convolution result over convolution result, or convolution result over fast convolution result) is less than or equal to 1. A ratio of 1 corresponds to a similarity of 100%; a ratio in (0, 1) is converted to a percentage, and that percentage is taken directly as the similarity.
Mode 2: calculate the difference between the fast convolution result and the convolution result and determine the similarity from the difference. Specifically, a difference of 0 can be defined as a similarity of 100%, with other differences mapped to other similarities: for example, a difference of 1 gives a similarity of 99% and a difference of 2 gives 98%. The mapping follows a fixed proportion, and the larger the difference, the lower the similarity.
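The two modes can be sketched in Python as follows. The text describes them for a pair of values, so several details here are assumptions made for illustration rather than part of the described method: the treatment of zero or opposite-sign pairs in Mode 1, the 1% step per unit of difference in Mode 2 (taken from the 99%/98% example), and the choice in the hypothetical helper `tensor_similarity` to take the worst pairwise similarity as the similarity of two whole result lists.

```python
def ratio_similarity(fast, ref):
    """Mode 1: smaller magnitude over larger magnitude, so the ratio is in (0, 1]."""
    if fast == ref:
        return 1.0
    # Assumption: a zero paired with a non-zero value, or values of opposite
    # sign, are treated as completely dissimilar.
    if fast == 0 or ref == 0 or (fast > 0) != (ref > 0):
        return 0.0
    a, b = abs(fast), abs(ref)
    return min(a, b) / max(a, b)


def difference_similarity(fast, ref, step=0.01):
    """Mode 2: similarity drops by `step` per unit of absolute difference."""
    return max(0.0, 1.0 - step * abs(fast - ref))


def tensor_similarity(fast_values, ref_values, pairwise):
    """Assumption: two result lists are as similar as their worst element pair."""
    return min(pairwise(f, r) for f, r in zip(fast_values, ref_values))
```

Taking the minimum is a deliberately strict aggregation: a single wrong output element is enough to pull the overall similarity below a threshold such as 99.9%.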
S105: when the similarity is greater than the threshold, determine that the Winograd program is correct.
In the embodiments of the present invention, a threshold can be set and compared with the similarity to determine whether the Winograd program is correct. Specifically, when the similarity is greater than the threshold, the Winograd program is determined to be correct; when the similarity is less than or equal to the threshold, the Winograd program is determined to be erroneous. The threshold may be set to, for example, 99%, 99.9%, or 99.999%.
It should be noted that, on the basis of the above embodiment, the embodiments of the present invention also provide corresponding improvements. Steps in the preferred or improved embodiments that are the same as, or correspond to, steps in the above embodiment can be cross-referenced, as can the corresponding beneficial effects; they are not repeated one by one here.
Preferably, when determining whether the Winograd program is correct, the difference or the ratio can, following the same principle as the similarity calculation, be compared directly with a preset judgment threshold. Specifically, after the difference between the convolution result and the fast convolution result is calculated, the Winograd program is determined to be correct if the difference is less than 10^-3; alternatively, it is determined to be correct if the ratio between the convolution result and the fast convolution result is greater than 0.999. The judgment thresholds 10^-3 and 0.999 can of course be adjusted according to the actual precision requirement.
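A direct threshold check of this kind might look like the Python sketch below. The function names are hypothetical, and applying the check to every element of the two result lists is an assumption; the text itself only states the thresholds 10^-3 and 0.999.

```python
def passes_difference_check(fast_values, ref_values, max_diff=1e-3):
    """Correct if every element differs from the reference by less than max_diff."""
    if len(fast_values) != len(ref_values):
        return False
    return all(abs(f - r) < max_diff for f, r in zip(fast_values, ref_values))


def passes_ratio_check(fast_values, ref_values, min_ratio=0.999):
    """Correct if every element's ratio (smaller over larger) exceeds min_ratio."""
    if len(fast_values) != len(ref_values):
        return False
    for f, r in zip(fast_values, ref_values):
        if f == r:
            continue
        # Assumption: a zero or sign mismatch cannot satisfy a ratio test.
        if f == 0 or r == 0 or (f > 0) != (r > 0):
            return False
        a, b = abs(f), abs(r)
        if min(a, b) / max(a, b) <= min_ratio:
            return False
    return True
```

The two checks behave differently near zero: an absolute difference stays meaningful for tiny values, whereas a ratio can fail on values that differ only by floating-point rounding. That is one practical reason to tune the thresholds to the precision actually required.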
Preferably, because a convolutional neural network contains several convolutional layers, and the convolution operation can be invoked recursively to reduce the amount of code, only the first-layer convolution results need to be compared when testing the Winograd program. Specifically, step S102 can be: using the target algorithm program, perform convolution on the test data and take the first-layer result of the convolutional neural network as the convolution result. Correspondingly, in step S104, the FPGA performing fast convolution on the test data using the Winograd program is specifically: the FPGA performs fast convolution on the test data using the Winograd program and takes the first-layer result of the convolutional neural network obtained by the fast calculation as the fast convolution result. In this way, the verification time of the Winograd program can be shortened.
Preferably, the filter parameters of the convolutional neural network can also be set before the Winograd program is tested. Specifically, the filter parameters of the convolutional neural network are obtained and set in both the target convolution algorithm program and the Winograd program. This ensures that the filter parameters in the target convolution algorithm program and the Winograd program are consistent, and also makes it possible to test the accuracy of the Winograd program separately under different filter parameters.
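Testing under several filter parameter sets can be sketched as a loop that hands the same filter to both programs. This is only an illustration: `direct_conv` stands in for the target convolution algorithm program, `fast_conv_stub` is a placeholder for the accelerated path (it is not a Winograd implementation and does not touch an FPGA), and all names are hypothetical.

```python
def direct_conv(data, filt):
    """Reference path: sliding-window convolution."""
    return [sum(data[i + k] * filt[k] for k in range(len(filt)))
            for i in range(len(data) - len(filt) + 1)]


def fast_conv_stub(data, filt):
    """Placeholder for the accelerated path; accumulates taps in reverse order."""
    return [sum(data[i + k] * filt[k] for k in reversed(range(len(filt))))
            for i in range(len(data) - len(filt) + 1)]


def check_filter_sets(data, filter_sets, reference=direct_conv,
                      fast=fast_conv_stub, tol=1e-3):
    """Feed each filter set to both programs and record pass/fail per set."""
    report = {}
    for name, filt in filter_sets.items():
        ref_out = reference(data, filt)      # the same filter goes to both paths
        fast_out = fast(data, filt)
        report[name] = (len(ref_out) == len(fast_out) and
                        all(abs(r - f) < tol for r, f in zip(ref_out, fast_out)))
    return report
```

Passing one filter object to both paths is what keeps the parameters consistent; a mismatch between the two programs would then show up as a failed entry in the report.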
Embodiment two:
To help those skilled in the art better understand the program detection method provided by the embodiments of the present invention, the method is described in detail below using a specific application scenario as an example.
Please refer to FIG. 4, which is a specific flow chart of a program detection method in an embodiment of the present invention.
The Winograd algorithm expression of the kernel Winograd program on the FPGA side remains unchanged, but the computed first-layer CNN convolution result is passed back to the host side (i.e., the CPU or processor mentioned above). Meanwhile, the target convolution calculation is implemented on the host side, where a separate thread computes the first-layer CNN convolution result. After both calculations finish, the result of the target convolution algorithm on the host side is compared with the Winograd result passed back from the kernel side. If the difference between the results is very small and within the expected tolerance, the Winograd result is correct and the CNN program on the kernel side continues to run; if the difference exceeds the expected range, the Winograd algorithm expression is in error, and the program needs to be interrupted for inspection and modification.
Specifically, the test data and the filter data (i.e., the filter parameters mentioned above) are first loaded into the CPU cache. Two threads are then started on the host side: thread 1 calculates the convolution according to the target convolution algorithm, and thread 2 launches the kernel so that the CNN calculation is accelerated on the FPGA board.
After it starts, thread 2 first creates the FPGA board running environment and initializes the board parameters, then writes the test data and the filter data into the FPGA board cache, and then starts the FPGA kernel program to perform the computation. That is, the kernel program calculates the convolution according to the Winograd algorithm and, once the first-layer CNN convolution is obtained, returns the convolution result to the host side.
After it starts, thread 1 first obtains the input and filter data and then calculates the first-layer CNN convolution result according to the target convolution algorithm. It then receives the Winograd convolution result data returned from the kernel side and compares the convolution results obtained by the two methods. If the difference is less than 10^-3, the Winograd algorithm program is expressed correctly. If the difference exceeds the expected range, there is a problem in how the Winograd algorithm expression of the kernel is written, and the program needs to be interrupted for inspection and modification. Adding the host-side verification program to check the Winograd result makes it possible to judge whether the Winograd algorithm program running on the FPGA is correct and, if it is wrong, to correct it, which further ensures that the calculation result of the entire CNN network is correct.
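The two-thread host flow can be sketched roughly as below. This is a simplified illustration, not the host program of this embodiment: no FPGA board is created or initialized, `simulated_kernel` and `broken_kernel` are hypothetical stand-ins for what the kernel side would return, and the comparison is done after both threads finish rather than inside thread 1.

```python
from concurrent.futures import ThreadPoolExecutor


def target_conv(data, filt):
    """Host-side reference: sliding-window convolution with nested loops."""
    return [sum(data[i + k] * filt[k] for k in range(len(filt)))
            for i in range(len(data) - len(filt) + 1)]


def simulated_kernel(data, filt):
    """Stand-in for the FPGA kernel; a real host would launch the board here."""
    return [y + 1e-6 for y in target_conv(data, filt)]  # tiny rounding-like offset


def broken_kernel(data, filt):
    """Stand-in for a mis-written Winograd expression (drops the last filter tap)."""
    return target_conv(data, filt[:-1] + [0.0])


def verify_first_layer(data, filt, kernel, tol=1e-3):
    """Run both paths concurrently and compare their first-layer results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        ref_future = pool.submit(target_conv, data, filt)    # thread 1
        fast_future = pool.submit(kernel, data, filt)        # thread 2
        reference, fast = ref_future.result(), fast_future.result()
    if len(reference) != len(fast):
        return False
    return max((abs(r - f) for r, f in zip(reference, fast)), default=0.0) < tol


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    filt = [0.5, -1.0, 0.25]
    print(verify_first_layer(data, filt, simulated_kernel))  # within tolerance
    print(verify_first_layer(data, filt, broken_kernel))     # outside tolerance
```

A caller could treat a `False` return as the signal to interrupt the run and inspect the kernel's Winograd expression.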
Embodiment three:
Corresponding to the above method embodiment, an embodiment of the present invention also provides a program detection device. The program detection device described below and the program detection method described above can be cross-referenced.
Referring to FIG. 5, the device comprises the following modules:
A test data acquisition module 101, configured to obtain test data when a Winograd program detection instruction is received, wherein the Winograd program is a fast algorithm program that implements a convolutional neural network;
A convolution calculation module 102, configured to perform convolution on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, wherein the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding-window manner;
A test data sending module 103, configured to send the test data to the FPGA so that the FPGA performs fast convolution on the test data using the Winograd program;
A similarity calculation module 104, configured to receive the fast convolution result sent by the FPGA and calculate the similarity between the fast convolution result and the convolution result;
A detection result determination module 105, configured to determine that the Winograd program is correct when the similarity is greater than a threshold.
Test is obtained when receiving the instruction of Winograd Programmable detection using device provided by the embodiment of the present invention Data;Using the target algorithm program of convolutional neural networks, convolutional calculation is carried out to test data, obtains convolution results;Target Algorithm routine is the algorithm routine that convolutional neural networks are realized in a manner of sliding window;Test data is sent to FPGA, so as to FPGA Fast convolution calculating is carried out to test data using Winograd program;The fast convolution of FPGA transmission is received as a result, and calculating The similarity of fast convolution result and convolution results;When similarity is greater than threshold value, determine that Winograd program is correct.
Due to the target algorithm program of depth nerve convolutional network, that is, realize the implementation process of sliding window algorithm, it is only necessary to make Convolution algorithm can accurately be given expression to loop nesting, and have code simple, the small advantage of error probability.And because Winograd program is to realize the fast algorithm program of convolutional neural networks, that is to say, that the Winograd program correctly expressed With target algorithm program when calculating the convolutional calculation result of same input data, obtained two convolution results should one It causes or is maintained within the scope of different, that is, there is similitude.Based on this, after FPGA is written in Winograd program, when When receiving the instruction of Winograd Programmable detection, the test data for inspection is obtained first.Then convolution mind is utilized in CPU Target algorithm program through network carries out convolutional calculation to test data, obtains convolution results.It at the same time, can be by test data It is sent to FPGA.After FPGA obtains test data, fast convolution calculating is carried out to test data using Winograd program, so Fast convolution calculated result is sent to CPU afterwards.After CPU receives the fast convolution result that FPGA is sent, fast convolution is calculated As a result with the similarity of convolution results;When similarity is greater than threshold value, determine that Winograd program is correct.In this way, can pass through The target algorithm program in CPU is operated in, the Winograd program in FPGA is detected, has ensured Winograd algorithm portion The accuracy rate divided, can promote the accuracy of the CNN algorithm in FPGA, further increase the carried out computer vision on FPGA and appoint The accuracy rate of business.
In a specific embodiment of the present invention, the convolution calculation module 102 is specifically configured to, when the FPGA performs fast convolution on the test data using the Winograd program and takes the first-layer result of the convolutional neural network obtained by the fast calculation as the fast convolution result, perform convolution on the test data using the target algorithm program and take the first-layer result of the convolutional neural network as the convolution result.
In a specific embodiment of the present invention, the device further comprises:
A filter setting module, configured to obtain the filter parameters of the convolutional neural network and set the filter parameters in both the target convolution algorithm program and the Winograd program.
In a specific embodiment of the present invention, the test data sending module 103 is specifically configured to create the FPGA board running environment and initialize the board parameters, and to send the test data to the FPGA so that the FPGA starts the kernel and performs fast convolution on the test data using the Winograd program.
In a specific embodiment of the present invention, the similarity calculation module 104 is specifically configured to calculate the ratio of the fast convolution result to the convolution result and determine the similarity from the ratio; or to calculate the difference between the fast convolution result and the convolution result and determine the similarity from the difference.
In a specific embodiment of the present invention, the detection result determination module 105 is specifically configured to determine that the Winograd program is incorrect when the similarity is less than or equal to the threshold.
Embodiment four:
Corresponding to the above method embodiment, an embodiment of the present invention also provides a program detection equipment. The program detection equipment described below and the program detection method described above can be cross-referenced.
Referring to FIG. 6, the program detection equipment comprises:
A memory D1, configured to store a computer program;
A processor D2, configured to implement the steps of the program detection method of the above method embodiment when executing the computer program.
Specifically, please refer to FIG. 7, which is a schematic structural diagram of a program detection equipment provided in this embodiment. The program detection equipment may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 322, a memory 332, and one or more storage media 330 (for example, one or more mass storage devices) storing application programs 342 or data 344. The memory 332 and the storage medium 330 may provide transient or persistent storage. The program stored in the storage medium 330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the data processing equipment. Further, the central processing unit 322 may be configured to communicate with the storage medium 330 and to execute, on the program detection equipment 301, the series of instruction operations in the storage medium 330.
The program detection equipment 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, for example Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps of the program detection method described above can be implemented by the structure of the program detection equipment.
Embodiment five:
Corresponding to the above method embodiment, an embodiment of the present invention also provides a readable storage medium. The readable storage medium described below and the program detection method described above can be cross-referenced.
A readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps of the program detection method of the above method embodiment.
The readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or any other readable storage medium capable of storing program code.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

Claims (10)

1. A program detection method, characterized by comprising:
obtaining test data when a Winograd program detection instruction is received;
performing convolution on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, wherein the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding-window manner;
sending the test data to an FPGA so that the FPGA performs fast convolution on the test data using the Winograd program;
receiving the fast convolution result sent by the FPGA, and calculating the similarity between the fast convolution result and the convolution result;
determining that the Winograd program is correct when the similarity is greater than a threshold.
2. The program detection method according to claim 1, characterized in that performing convolution on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result comprises:
performing convolution on the test data using the target algorithm program, and taking the first-layer result of the convolutional neural network as the convolution result;
correspondingly, the FPGA performing fast convolution on the test data using the Winograd program comprises:
the FPGA performing fast convolution on the test data using the Winograd program, and taking the first-layer result of the convolutional neural network obtained by the fast calculation as the fast convolution result.
3. The program detection method according to claim 1, characterized by further comprising:
obtaining the filter parameters of the convolutional neural network;
setting the filter parameters in both the target convolution algorithm program and the Winograd program.
4. The program detection method according to claim 1, characterized in that sending the test data to the FPGA comprises:
creating the FPGA board running environment, and initializing the board parameters;
sending the test data to the FPGA.
5. The program detection method according to claim 1, characterized in that the FPGA performing fast convolution on the test data using the Winograd program comprises:
the FPGA starting the kernel and performing fast convolution on the test data using the Winograd program.
6. The program detection method according to any one of claims 1 to 5, characterized in that calculating the similarity between the fast convolution result and the convolution result comprises:
calculating the ratio of the fast convolution result to the convolution result, and determining the similarity from the ratio;
or, calculating the difference between the fast convolution result and the convolution result, and determining the similarity from the difference.
7. The program detection method according to claim 6, characterized in that, when the similarity is less than or equal to the threshold, the method further comprises:
determining that the Winograd program is incorrect.
8. A program detection device, characterized by comprising:
a test data acquisition module, configured to obtain test data when a Winograd program detection instruction is received;
a convolution calculation module, configured to perform convolution on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, wherein the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding-window manner;
a test data sending module, configured to send the test data to an FPGA so that the FPGA performs fast convolution on the test data using the Winograd program;
a similarity calculation module, configured to receive the fast convolution result sent by the FPGA and calculate the similarity between the fast convolution result and the convolution result;
a detection result determination module, configured to determine that the Winograd program is correct when the similarity is greater than a threshold.
9. A program detection equipment, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the program detection method according to any one of claims 1 to 7 when executing the computer program.
10. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, and the computer program, when executed by a processor, implements the steps of the program detection method according to any one of claims 1 to 7.
CN201811514703.0A 2018-12-10 2018-12-10 A kind of program detecting method, device, equipment and readable storage medium storing program for executing Pending CN109558329A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811514703.0A CN109558329A (en) 2018-12-10 2018-12-10 A kind of program detecting method, device, equipment and readable storage medium storing program for executing
PCT/CN2019/103639 WO2020119188A1 (en) 2018-12-10 2019-08-30 Program detection method, apparatus and device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811514703.0A CN109558329A (en) 2018-12-10 2018-12-10 A kind of program detecting method, device, equipment and readable storage medium storing program for executing

Publications (1)

Publication Number Publication Date
CN109558329A true CN109558329A (en) 2019-04-02

Family

ID=65869926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811514703.0A Pending CN109558329A (en) 2018-12-10 2018-12-10 A kind of program detecting method, device, equipment and readable storage medium storing program for executing

Country Status (2)

Country Link
CN (1) CN109558329A (en)
WO (1) WO2020119188A1 (en)

Also Published As

Publication number Publication date
WO2020119188A1 (en) 2020-06-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190402