CN117093376A - Intelligent recognition model adaptation method applied to domestic GPU environment - Google Patents
- Publication number
- CN117093376A (application CN202311352128.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5044—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/61—Installation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Stored Programmes (AREA)
Abstract
The invention relates to the field of artificial intelligence and domestic basic platforms, and discloses an intelligent recognition model adaptation method applied to a domestic GPU environment, which comprises the following steps: s1: detecting the basic environment of the hardware equipment; s2: instruction set business architecture adaptation; s3: deep learning framework adaptation; s4: training, tuning and inference of the intelligent recognition model; s5: improving the performance stability of the intelligent recognition model; s6: intelligent auditing application verification. The invention can fully evaluate the suitability and reliability of the domestic hardware platform against actual business application requirements and ensure that the platform can meet project requirements. Taking potential development and optimization requirements into account, it comprehensively examines whether every aspect of the hardware equipment's capability provides good support.
Description
Technical Field
The invention belongs to the field of artificial intelligence and domestic basic platforms, and particularly relates to an intelligent recognition model adaptation method applied to domestic GPU environments.
Background
With the development of artificial intelligence and big data technology, intelligent recognition and assisted auditing are increasingly widely applied to the production and release of resources on various platforms and websites. Current intelligent recognition technology is mainly implemented on foreign GPU graphics cards such as NVIDIA's, while domestic chips, AI accelerator cards and similar hardware products have relatively low performance, poor compatibility and a low degree of adaptation. Although domestic GPUs such as Cambricon support mainstream deep learning frameworks, they lack adaptation techniques and a software ecosystem that interfaces with mainstream AI frameworks, and problems such as instruction set support remain to be solved.
Disclosure of Invention
The invention aims to adapt an artificial intelligence model developed with a mainstream deep learning framework to a domestic GPU platform, and provides an intelligent recognition model adaptation method applied to a domestic GPU environment. Taking potential development and optimization requirements into account, it comprehensively examines whether every aspect of the hardware equipment's capability provides good support.
The invention provides the following technical scheme: an intelligent recognition model adaptation method applied to a domestic GPU environment comprises the following steps:
s1: detecting a basic environment of hardware equipment;
s2: instruction set business architecture adaptation;
s3: deep learning framework adaptation;
s4: training, tuning and inference of the intelligent recognition model;
s5: improving the performance stability of the intelligent recognition model;
s6: intelligent identification application verification.
The step S1: the specific steps of hardware equipment basic environment detection include:
s1.1: the specific method for adapting the hardware firmware and the driver comprises the following steps:
s1.1.a: installing the firmware and driver; if the firmware or driver version is found to be too low during installation, downloading and installing a higher version; if driver installation fails, for example because the card drops offline, reinstalling the driver;
s1.1.b: confirm the valid installation of firmware and driver using the terminal command;
s1.2: the specific method for adapting the dependency component library comprises the following steps:
s1.2.a: acquiring a source code;
s1.2.b: installing a cross-compilation tool capable of supporting multiple target architectures;
s1.2.c: configuring the compile options and managing the compilation process with a build system;
s1.2.d: running the build command to compile the dependency library and generate the dependency library for the target architecture;
s1.2.e: installing the compiled dependency library and confirming its valid installation through a terminal command.
The step S2: the specific steps of the instruction set business architecture adaptation include:
s2.1: the instruction set business architecture compatibility test method specifically comprises the following steps:
installing the analysis and processing toolkits required by the data of the business scenario; starting the business service, testing it, and checking through a command whether the related dependencies are installed successfully; if the installation is successful, the compatibility test is passed and the method proceeds to step S2.2;
otherwise, performing the adaptation steps for the related toolkit: 1) obtaining the source code; 2) configuring the compile options; 3) generating the compiled library for the target architecture; 4) installing and testing, then proceeding to step S2.2;
s2.2: the instruction set service QPS performance test method specifically comprises the following steps:
the same set of business logic code and algorithm models is used to deploy the business modules on the original platform and the target platform respectively; the response speed and throughput of the hardware platform are tested with the algorithms and data of the business; and the result of the instruction set service QPS performance test is judged according to the business requirements and the test results.
The step S3: the specific steps of the deep learning framework adaptation include:
s3.1: selecting a deep learning framework supported by a domestic acceleration platform;
s3.2: compiling, building and installing the mainstream deep learning framework from source;
s3.3: running the official example demo code of the deep learning framework to verify validity.
The step S4: the specific steps of intelligent recognition model training, tuning and inference include:
s4.1: installing the dependency environment required for training and inference of the intelligent model;
s4.2: for the business scenario, preparing a data set, dividing it into a training set and a test set, and generating classification labels;
s4.3: implementing the algorithm model on the original platform and the target platform respectively, keeping the model structure and parameters consistent;
s4.4: reading in the training data to start training, and saving the model file after training is completed;
s4.5: loading the trained intelligent model file, converting the model into a format supported by the domestic platform, encapsulating a model inference interface, modifying the preprocessing and post-processing code of the original platform, and performing model inference and prediction through the encapsulated interface.
The step S5: the specific steps of improving the performance stability of the intelligent recognition model comprise:
s5.1: the intelligent model performance evaluation method comprises the following steps:
s5.1a: for the business scenario, constructing the test data set required for intelligent model evaluation and uploading it to the different platforms. On each platform, the same test data, algorithm model and evaluation criteria are used to run the model recognition test on the business data to be recognized, and the recognition results are counted. The evaluation criteria comprise four evaluation indexes of the intelligent model: precision, recall, F1 and mAP. Precision and recall reflect the accuracy and the completeness of the recognition model's predictions, and F1 combines the two into a single index; mAP reflects the mean average precision of the recognition model in multi-category prediction scenarios. The performance of the recognition model on the different platforms is measured by these evaluation criteria;
s5.1b: running inference on the same picture 10 times and observing the results, so as to address unstable model output and poor inference performance on the test set;
s5.2: the specific method for improving the model output performance and the inference effect comprises the following steps:
s5.2a: checking the quality of the training data to ensure that the data are accurate and sufficient. Data quality can be improved by means such as data cleaning and data augmentation;
s5.2b: adjusting the complexity of the model to avoid overfitting. Model complexity can be controlled by adding regularization terms, reducing the number of model parameters and the like;
s5.2c: using cross-validation and similar techniques to evaluate the performance of the model and avoid overfitting. The data set may be divided into several training sets and validation sets, with the validation sets used to evaluate the performance of the model;
s5.2d: tuning the parameters of the model to optimize its performance. The optimal hyperparameter combination can be searched for by grid search, random search and the like;
s5.2e: increasing the amount of training data to improve the generalization capability of the model. The amount of training data can be increased by data augmentation, data synthesis and the like;
s5.2f: using techniques such as transfer learning to enhance the generalization capability of the model. A pre-trained model can be used as the base model and adapted to new tasks by fine-tuning and the like;
s5.2g: performing model tuning on the test set to improve the generalization capability of the model. The performance of the model can be evaluated using the validation set, and model tuning is then performed on the test set;
s5.3: performance improvement verification: integrating and normalizing the tensors in the inference queue using a bidirectional data binding method;
s5.4: repeating steps S5.2 and S5.3 to verify the performance improvement.
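As an illustration of the evaluation indexes named in S5.1a, the following minimal Python sketch computes precision, recall and F1 for one class and compares two platforms on the same labels. It is not taken from the patent: the function names `precision_recall_f1` and `compare_platforms`, the platform names and the sample labels are all hypothetical, and mAP is left out because it additionally requires ranked confidence scores.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def compare_platforms(y_true, preds_by_platform, positive):
    """Evaluate every platform's predictions against the same ground truth."""
    return {
        name: precision_recall_f1(y_true, preds, positive)
        for name, preds in preds_by_platform.items()
    }


if __name__ == "__main__":
    # Hypothetical labels: 1 = content flagged, 0 = content passed.
    truth = [1, 1, 1, 0, 0]
    results = compare_platforms(
        truth,
        {
            "original_platform": [1, 1, 0, 1, 0],
            "target_platform": [1, 1, 1, 0, 0],
        },
        positive=1,
    )
    for platform, (p, r, f1) in results.items():
        print(f"{platform}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Feeding both platforms the same ground-truth list mirrors the requirement that the same test data and evaluation criteria be used on each platform.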
The step S6: the specific steps of intelligent identification application verification include:
s6.1: the method for monitoring the source code safety comprises the following steps:
first, for the AI computing accelerator card of the domestic GPU platform, selecting a supported deep learning framework version, and then, in combination with the target business scenario, performing security risk detection on the source code of the open-source framework to prevent security problems caused by vulnerabilities;
s6.2: the intelligent model development method specifically comprises the following steps:
s6.2.a: packaging the interface of the mainstream deep learning framework to realize a unified development interface;
s6.2.b: data preprocessing, including filtering, cleaning, augmentation and the like;
s6.2.c: constructing a proper deep neural network model by combining service data and requirements;
s6.2.d: initializing model training, and storing the model after training and verification are completed;
s6.3: the intelligent model deployment method specifically comprises the following steps:
s6.3.a: model migration: the trained and verified model is converted into a format of a domestic hardware platform environment, and an offline model is generated;
s6.3.b: model optimization: according to the characteristics of the deployment environment, performing operations such as model pruning, quantization and distillation to reduce the size of the model and improve its performance on the specific hardware;
s6.3.c: deployment environment preparation: the method comprises the steps of installing necessary software libraries, configuring hardware equipment, setting network connection and the like;
s6.3.d: model deployment: deploying the optimized model into a target environment, and testing;
s6.3.e: model monitoring and updating: in the process of model deployment and operation, continuously monitoring the performance and functions of the model, and updating and optimizing the model according to the needs;
s6.3.f: inference application development: developing an intelligent recognition application according to the auditing business requirements and data flow, calling the offline model to automatically audit actual samples in the business, returning the recognition results to the business processing flow, and displaying them in the application interface.
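The model optimization step S6.3.b names quantization without specifying a scheme. As a purely illustrative sketch, the Python below shows one common form, symmetric per-tensor int8 quantization of a list of weights. The choice of scheme and the function names `quantize_int8` and `dequantize` are assumptions made for this example, not the method the patent or any vendor toolchain uses.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with a single scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # A scale of 1.0 is used for an all-zero tensor to avoid division by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical weight values
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("int8 values:", q)
    print("largest reconstruction error:", worst)
```

Each weight is stored in one byte instead of four, at the cost of a reconstruction error of at most about half the scale factor per weight, which is why accuracy is normally re-checked after this step.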
The invention has the following beneficial effects:
the invention can fully evaluate the suitability and reliability of the domestic hardware platform in the actual application requirement, and ensure that the domestic hardware platform can meet the project requirement. Combining the potential development and optimization requirements, comprehensively examining whether all aspects of the capability of the hardware equipment can form good support.
Drawings
FIG. 1 is a schematic diagram of the invention;
FIG. 2 is a flow chart of the instruction set business architecture adaptation;
FIG. 3 is a flow chart of the algorithm model adaptation.
Detailed Description
The embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note that in the description below, detailed descriptions of known functions and designs are omitted where they might obscure the present invention.
Examples
FIG. 1 is a schematic diagram of an intelligent recognition model adaptation method applied to a domestic GPU environment, and the method specifically comprises the following steps:
s1: detecting a basic environment of hardware equipment;
s2: instruction set business architecture adaptation;
s3: deep learning framework adaptation;
s4: training, tuning and inference of the intelligent recognition model;
s5: improving the performance stability of the intelligent recognition model;
s6: intelligent identification application verification.
In this example, the test environment is a domestic GPU, the Cambricon intelligent accelerator card, model MLU370-X8; the non-domestic graphics card used for comparison is an NVIDIA GPU, model NVIDIA 3080Ti; and the deep learning framework is Baidu's PaddlePaddle framework.
The step S1: hardware device basic environment detection:
s1.1: adapting the hardware firmware and drivers:
first, installing the firmware and driver on the GPU hardware; if the firmware or driver version is found to be too low during installation, downloading and installing a higher version; if driver installation fails, for example because the card drops offline, reinstalling the driver;
after the installation is completed, confirming that the firmware and driver are validly installed using a terminal command (cnmon);
s1.2: adapting the dependency component library:
step 1, source code acquisition: the source code of the dependency library is found on the project's official website or in its GitHub repository;
step 2, installing a cross-compilation tool: for the current project, a cross-compilation tool capable of supporting multiple target architectures, such as GCC (GNU Compiler Collection), is installed;
step 3, configuring compile options: for the current project, the compilation process is managed by a build system such as autoconf or cmake. Configuring the compilation tools for the project's target architecture involves setting environment variables, and possibly other flags and options, that point to the cross compiler;
step 4, running the build command (such as make) to compile the dependency library and generate the dependency library for the target architecture, then performing installation and testing. The installation and testing procedures differ in their specifics from project to project; the general procedure is to install the compiled library on the target system and then run the officially provided demo program to confirm that the installation is valid.
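The confirmation steps in S1.1 and S1.2 lend themselves to scripting. The following is a minimal Python sketch, not part of the patent text. It assumes that the tools being checked (here `cnmon` and `gcc`) are on the PATH and that an exit status of 0 indicates a working installation; the helper names `command_succeeds` and `check_environment` are hypothetical.

```python
import shutil
import subprocess
import sys


def command_succeeds(cmd, timeout=30):
    """Return True if the executable can be found and exits with status 0."""
    if shutil.which(cmd[0]) is None:
        return False  # tool is not installed or not on PATH
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_environment(checks):
    """Run every named check and map its name to True or False."""
    return {name: command_succeeds(cmd) for name, cmd in checks.items()}


if __name__ == "__main__":
    # Assumed commands; adjust them to the tools actually used on the platform.
    checks = {
        "driver (cnmon)": ["cnmon"],
        "compiler (gcc)": ["gcc", "--version"],
        "python": [sys.executable, "--version"],
    }
    for name, ok in check_environment(checks).items():
        print(f"{name}: {'OK' if ok else 'MISSING or FAILED'}")
```

A failed check only says that the command was absent or returned a non-zero status; diagnosing a too-old driver version would still require reading the tool's output.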
The step S2: as shown in fig. 2, the instruction set business architecture adaptation specifically includes:
s2.1: instruction set business architecture compatibility test:
installing the analysis and processing toolkits required by the data of the business scenario; starting the business service, testing it, and checking through a command whether the related dependencies are installed successfully; if the installation is successful, the compatibility test is passed and the method proceeds to step S2.2;
otherwise, performing the adaptation steps for the related toolkit: 1) obtaining the source code; 2) configuring the compile options; 3) generating the compiled library for the target architecture; 4) installing and testing, then proceeding to step S2.2;
s2.2: instruction set service QPS performance test:
the same set of business logic code and algorithm models is used to deploy the business modules on the original platform and the target platform respectively; the algorithms and data of the business are wrapped as an interface service using FastAPI, and the response speed and throughput of the hardware platform are tested; and the result of the instruction set service QPS performance test is judged according to the business requirements and the test results.
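A QPS test of the kind described in S2.2 can be sketched with the Python standard library alone. In the sketch below, `handler` stands in for whatever function sends one request to the deployed service (the FastAPI service itself is not shown), and `measure_qps`, `meets_requirement` and the threshold values are hypothetical names and numbers chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_qps(handler, payloads, workers=4):
    """Call handler(payload) for every payload; report throughput and latency."""
    if not payloads:
        raise ValueError("payloads must not be empty")

    def timed_call(payload):
        start = time.perf_counter()
        handler(payload)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = sorted(pool.map(timed_call, payloads))
    wall = time.perf_counter() - wall_start

    p95_index = max(0, int(round(0.95 * len(latencies))) - 1)
    return {
        "requests": len(latencies),
        "qps": len(latencies) / wall,
        "mean_latency_s": sum(latencies) / len(latencies),
        "p95_latency_s": latencies[p95_index],
    }


def meets_requirement(report, min_qps, max_p95_s):
    """Judge the test result against business thresholds."""
    return report["qps"] >= min_qps and report["p95_latency_s"] <= max_p95_s


if __name__ == "__main__":
    # Stand-in for a real request; a real test would call the service endpoint.
    def fake_request(_payload):
        time.sleep(0.002)

    report = measure_qps(fake_request, list(range(100)), workers=4)
    print(report)
    print("passes:", meets_requirement(report, min_qps=50, max_p95_s=0.5))
```

Running the same script with the same payloads against the original and the target platform gives two reports that can be compared directly.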
The step S3: deep learning framework adaptation:
s3.1: mainstream framework adaptation:
step 1, compiling, building and installing the mainstream deep learning framework from source; this example uses Baidu's PaddlePaddle (Paddle) framework, and the compilation and installation steps comprise:
1) Preparing the related dependencies:
mm_v0.1_aarch64-kylin10.tar;
cntoolkit-3.1.4-1.ky10.aarch64.rpm;
cnnl-static-1.14.2-1.ky10.aarch64.rpm;
cnnl-1.14.2-1.ky10.aarch64.rpm;
cncl-1.5.2-1.ky10.aarch64.rpm;
2) Entering the container and compiling, with the following commands:
gh repo clone Cambricon/mlu-ops
cd mlu-ops/bangc-ops;
./build.sh;
copying the header file to the position under the new;
3) Compiling Paddle:
the Paddle repository corresponding to CTR2.5 is the PaddlePaddle 2.4 version repository;
3.1) Using the rpm packages prepared in step 1) to update the underlying libraries, the commands are:
ARG CNTOOLKIT_VERSION=3.1.4-1;
ARG CNNL_VERSION=1.14.2-1;
ARG CNCL_VERSION=1.5.2-1;
ARG MLUOPS_VERSION=0.4.1-1;
3.2) Entering the working directory, the command is:
cd Paddle;
3.3) Creating a build directory, the command is:
mkdir build && cd build;
3.4) Executing cmake, the command is:
cmake .. -DPY_VERSION=3.7 -DPYTHON_EXECUTABLE=`which python3` -DWITH_ARM=ON -DWITH_TESTING=OFF -DON_INFER=ON -DWITH_XBYAK=OFF -DCMAKE_CXX_FLAGS="-Wno-error -w" -DWITH_MLU=ON;
step 2, running the official example demo code of the deep learning framework to verify validity; the verification code is as follows:
cd Paddle;
pip install build/python/dist/paddlepaddle_mlu-0.0.0-cp37-cp37m-arm;
python;
import paddle;
paddle.utils.run_check();
The step S4: as shown in FIG. 3, the intelligent recognition model training, tuning and inference are implemented as follows:
s4.1: installing the dependency environment required for training and inference of the intelligent model, comprising: a Docker image containing the Cambricon GPU driver and dependency libraries, installation of the Cambricon MLU driver, Paddle MLU, and YOLOX for subsequent Cambricon adaptation;
s4.2: for the business scenario, preparing a data set; because the sample classes in the data set are imbalanced, dividing the training set and the test set by stratified splitting so that the class proportions in the two sets are similar, and generating classification labels;
s4.3: implementing the algorithm model on the original platform and the target platform respectively, specifying the same loss function, optimizer and evaluation indexes, and keeping the model structure and parameters consistent;
s4.4: reading in the training data, starting training, and saving the model file after training is finished;
s4.5: loading the trained intelligent model file, converting the model from the Paddle framework format into ONNX format, converting the ONNX model into the MagicMind format supported by the domestic platform, encapsulating a model inference interface, modifying the preprocessing and post-processing code of the original platform, and performing model inference and prediction through the encapsulated interface.
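The class-proportion-preserving split described in S4.2 can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions rather than the patent's implementation: the function name `stratified_split`, the 20% test ratio and the sample labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(samples, labels, test_ratio=0.2, seed=0):
    """Split so that each class keeps roughly the same share in both sets."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Classes with a single sample stay entirely in the training set.
        n_test = max(1, round(len(indices) * test_ratio)) if len(indices) > 1 else 0
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


if __name__ == "__main__":
    # Hypothetical imbalanced data set: 90 "normal" and 10 "violation" samples.
    labels = ["normal"] * 90 + ["violation"] * 10
    samples = [f"img_{i}.jpg" for i in range(len(labels))]
    train, test = stratified_split(samples, labels, test_ratio=0.2)
    print("train:", Counter(label for _, label in train))
    print("test: ", Counter(label for _, label in test))
```

Splitting each class separately is what keeps the minority class represented in the test set; a plain random split of an imbalanced data set can leave it with very few or no minority samples.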
The step S5: improving the performance stability of the intelligent recognition model:
s5.1: the intelligent model performance evaluation method comprises the following steps:
s5.1a: for the business scenario, constructing the test data set required for intelligent model evaluation and uploading it to the different platforms. On each platform, the same test data, algorithm model and evaluation criteria are used to run the model recognition test on the business data to be audited, and the recognition results are counted. The evaluation criteria comprise four evaluation indexes of the intelligent model: precision, recall, F1 and mAP. Precision and recall reflect the accuracy and the completeness of the recognition model's predictions, and F1 combines the two into a single index; mAP reflects the mean average precision of the recognition model in multi-category prediction scenarios. The performance of the recognition model on the different platforms is measured by these evaluation criteria;
s5.1b: running inference on the same picture 10 times and observing the results, so as to address unstable model output and poor inference performance on the test set;
s5.2: performance improvement verification: integrating and normalizing the tensors in the inference queue using a bidirectional data binding method;
s5.3: repeating step S5.1b to verify the performance improvement.
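The repeated-inference check in S5.1b can be expressed as a small helper. The Python sketch below assumes that the model is reachable through a callable `infer(image)` returning a flat sequence of numeric scores; that interface, the function name `check_inference_stability` and the tolerance value are hypothetical choices made for this example.

```python
def check_inference_stability(infer, image, runs=10, tolerance=1e-6):
    """Run infer(image) several times and compare each output with the first.

    Returns (stable, max_deviation), where max_deviation is the largest
    absolute element-wise difference from the first run.
    """
    outputs = [list(infer(image)) for _ in range(runs)]
    reference = outputs[0]
    max_deviation = 0.0
    for output in outputs[1:]:
        if len(output) != len(reference):
            # A change in output shape counts as unstable.
            return False, float("inf")
        for ref_value, value in zip(reference, output):
            max_deviation = max(max_deviation, abs(ref_value - value))
    return max_deviation <= tolerance, max_deviation


if __name__ == "__main__":
    # Stand-in model that always returns the same class scores.
    def fake_infer(_image):
        return [0.1, 0.9]

    stable, deviation = check_inference_stability(fake_infer, "sample.jpg")
    print("stable:", stable, "max deviation:", deviation)
```

Running the check before and after the S5.2 adjustment gives a simple numeric way to confirm whether the output has become stable.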
The step S6: intelligent identification application verification:
s6.1: source code security monitoring:
first, for the AI computing accelerator card of the domestic GPU platform, selecting a supported deep learning framework version, and then, in combination with the target business scenario, performing security risk detection on the source code of the open-source framework to prevent security problems caused by vulnerabilities;
s6.2: intelligent model development:
step 1, encapsulating the interfaces of the mainstream deep learning framework to provide a unified development interface;
step 2, preprocessing the data, including filtering, cleaning, augmentation and the like;
step 3, constructing a deep neural network model based on the intelligent recognition business data and requirements;
step 4, starting model training, and saving the model weight file after training and verification are completed;
step 5, deploying the trained intelligent recognition model, wherein the specific method comprises the following steps:
1) Model migration: the trained and verified model is converted into the format of the domestic hardware platform environment, and an offline model is generated;
2) Model optimization: according to the characteristics of the deployment environment, operations such as model pruning, quantization and distillation are performed to reduce the size of the model and improve its performance on the specific hardware;
step 6, preparing the deployment environment, including installing the necessary software libraries, configuring the hardware equipment, setting up network connections and the like;
step 7, executing model deployment: deploying the optimized model into the target environment and testing it;
step 8, model monitoring and updating: during model deployment and operation, continuously monitoring the performance and functions of the model, and updating and optimizing the model as needed;
step 9, inference application development: developing an intelligent recognition application according to the intelligent recognition business requirements and data flow, calling the offline model to automatically recognize actual samples in the business, returning the recognition results to the business processing flow, and displaying them in the application interface.
While illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments; all changes within the spirit and scope of the invention as defined by the appended claims are protected.
Claims (9)
1. The intelligent recognition model adaptation method applied to the domestic GPU environment is characterized by comprising the following steps of:
s1: detecting a basic environment of hardware equipment;
s2: instruction set business architecture adaptation;
s3: deep learning framework adaptation;
s4: training, tuning and inference of the intelligent auditing model;
s5: improving the performance stability of the intelligent auditing model;
s6: intelligent auditing application verification.
2. The intelligent recognition model adaptation method applied to domestic GPU environment according to claim 1, wherein the steps of S1: the specific steps of hardware equipment basic environment detection include:
s1.1: the specific method for adapting the hardware firmware and the driver comprises the following steps:
s1.1.a: installing the firmware and driver; if the firmware or driver version is found to be too low during installation, downloading and installing a higher version; if driver installation fails because the card drops offline, reinstalling the driver;
s1.1.b: confirm the valid installation of firmware and driver using the terminal command;
s1.2: the specific method for adapting the dependency component library comprises the following steps:
s1.2.a: acquiring a source code;
s1.2.b: installing a cross-compilation tool capable of supporting multiple target architectures;
s1.2.c: configuring compiling options, and managing compiling processes by constructing a system;
s1.2.d: operating and constructing a command compiling dependency library to generate a target architecture compiling dependency library;
s1.2.e: and installing the compiled dependency library, and confirming the effective installation of the dependency library through a terminal command.
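The confirmation steps S1.1.b and S1.2.e can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the check is a terminal command whose zero exit status indicates a valid installation. The helper name `confirm_installed` is hypothetical, and the claims do not name the vendor command for any particular domestic GPU, so the Python interpreter is used as a stand-in.

```python
import subprocess
import sys


def confirm_installed(command, timeout=30):
    """Run a terminal command and report whether it succeeded.

    Returns (ok, output); ok is True when the command exits with status 0.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # A missing binary or a hang is treated as a failed installation check.
        return False, str(exc)
    return result.returncode == 0, (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    # Placeholder check: the running Python interpreter stands in for the
    # vendor's driver or library query tool, which the claims do not name.
    ok, output = confirm_installed([sys.executable, "--version"])
    print("installed" if ok else "missing", output)
```

In practice the command list would be replaced by whatever query tool the hardware vendor ships, and the captured output could be parsed for a minimum version.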
3. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 1, wherein the specific steps of S2, instruction set business architecture adaptation, include:
S2.1: an instruction set business architecture compatibility test, specifically comprising:
installing the analysis and processing toolkits relevant to the data of the business scenario; starting the business service, testing it, and checking through a command whether the relevant dependencies were installed successfully; if the installation succeeded, the compatibility test is passed and the method proceeds to step S2.2;
otherwise, carrying out the adaptation step for the relevant toolkits;
S2.2: an instruction set service QPS performance test, specifically comprising:
deploying the business module on the original platform and on the target platform using the same set of business logic code and algorithm models; testing the response speed and throughput of the hardware platform with the algorithms and data of the service; and judging the instruction set service QPS performance test result against the service requirements and the test results.
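The QPS performance test of S2.2 measures response speed and throughput for the same workload on each platform. A minimal sketch of such a measurement is given below; the function `measure_qps` and the stand-in `fake_service` are hypothetical names, and the thread pool is an assumed way of generating concurrent load, not something the claims specify.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_qps(handler, requests, workers=4):
    """Send every request to handler and return throughput and latency stats."""

    def timed_call(request):
        start = time.perf_counter()
        handler(request)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = sorted(pool.map(timed_call, requests))
    wall_elapsed = time.perf_counter() - wall_start

    if not latencies:
        raise ValueError("at least one request is required")
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "requests": len(latencies),
        "qps": len(latencies) / wall_elapsed if wall_elapsed > 0 else float("inf"),
        "mean_latency_s": sum(latencies) / len(latencies),
        "p95_latency_s": latencies[p95_index],
    }


def fake_service(request):
    # Stand-in for the deployed business module; sleeps to mimic work.
    time.sleep(0.001)
    return request


if __name__ == "__main__":
    print(measure_qps(fake_service, range(200)))
```

Running the same call against the original platform's service and the target platform's service yields two result dictionaries that can be compared against the service requirement.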
4. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 1, wherein the specific steps of S3, deep learning framework adaptation, include:
S3.1: mainstream framework adaptation, specifically comprising:
S3.1.a: compiling, building and installing the mainstream deep learning framework from source code;
S3.1.b: running the official example demo code of the deep learning framework to verify that it works.
5. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 1, wherein the specific steps of S4, intelligent recognition model training, tuning and inference, include:
S4.1: installing the dependency environment required for training and inference of the intelligent model;
S4.2: for the business scenario, preparing a data set, dividing it into a training set and a test set, and generating classification labels;
S4.3: implementing the algorithm model on both the original platform and the target platform, keeping the model structure and parameters consistent;
S4.4: reading in the training data to start training, and saving the model file after training is completed;
S4.5: loading the trained intelligent model file, converting the model into a format supported by the domestic platform, encapsulating a model inference interface, modifying the original platform's preprocessing and post-processing code, and performing model inference and prediction through the encapsulated interface.
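Step S4.2 (dividing the data into training and test sets and generating classification labels) can be illustrated with the sketch below. It assumes each sample is a `(path, class_name)` pair and uses a seeded shuffle so that both platforms see the same split; the function name `split_dataset`, the 20% test ratio and the sample file names are illustrative assumptions, not values from the claims.

```python
import random


def split_dataset(samples, test_ratio=0.2, seed=0):
    """Shuffle (path, class_name) pairs and split them into train and test sets.

    Returns (train, test, label_map), where label_map maps class names to ints.
    """
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be between 0 and 1")

    # Build a deterministic class-name-to-index mapping.
    class_names = sorted({name for _, name in samples})
    label_map = {name: index for index, name in enumerate(class_names)}

    labeled = [(path, label_map[name]) for path, name in samples]
    # A fixed seed keeps the split identical across platforms.
    random.Random(seed).shuffle(labeled)

    test_size = max(1, int(len(labeled) * test_ratio))
    return labeled[test_size:], labeled[:test_size], label_map


if __name__ == "__main__":
    samples = [("img_%d.jpg" % i, "pass" if i % 2 else "reject") for i in range(10)]
    train, test, label_map = split_dataset(samples)
    print(len(train), len(test), label_map)
```

Fixing the seed matters here because the later cross-platform comparison only makes sense if both platforms train and evaluate on identical partitions.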
6. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 1, wherein the specific steps of S5, improving the performance stability of the intelligent recognition model, include:
S5.1: intelligent model performance evaluation, comprising:
S5.1a: for the service scenario, constructing the test data set required for intelligent model evaluation and uploading it to the different platforms; on each platform, using the same test data, algorithm model and evaluation standard to run model recognition tests on the service data to be recognized, and collecting statistics on the recognition results; the evaluation standard comprises four evaluation indexes of the intelligent model: precision, recall, F1 and mAP; precision and recall reflect the accuracy and the completeness of the recognition model's predictions, and F1 combines the two into a single index; mAP reflects the average precision of the recognition model in multi-category prediction scenarios; the performance of the recognition model on the different platforms is measured by this evaluation standard;
S5.1b: running inference on the same picture 10 times and observing the inference results, so as to resolve the problems of unstable model output and poor inference performance on the test set;
S5.2: performance improvement verification: integrating and normalizing the tensors in the inference queue using a bidirectional data binding method;
S5.3: repeating step S5.1 to perform performance improvement verification.
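The evaluation in S5.1 can be sketched as below for the binary case. The code computes precision, recall and F1 from true and predicted labels and checks whether repeated inference on one sample gives identical outputs. mAP is omitted for brevity. The function names are hypothetical, and exact equality in `is_stable` is an assumption that suits class labels; floating-point outputs would need a tolerance instead.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def is_stable(infer, sample, runs=10):
    """Run inference on the same sample several times and check the outputs agree."""
    outputs = [infer(sample) for _ in range(runs)]
    return all(output == outputs[0] for output in outputs)


if __name__ == "__main__":
    print(precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
    print(is_stable(lambda image: len(image) % 2, "sample.jpg"))
```

Computing the same three numbers on the original and the target platform, with the same test set, gives a direct basis for the cross-platform comparison the claim describes.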
7. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 1, wherein the specific steps of S6, intelligent recognition application verification, include:
S6.1: source code security monitoring, comprising:
first, selecting a deep learning framework version supported by the AI computing accelerator card of the domestic GPU platform; then, in light of the target service scenario, performing security risk detection on the source code of the open-source framework to prevent security problems caused by vulnerabilities;
S6.2: intelligent model development;
S6.3: intelligent model deployment.
8. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 6, wherein the intelligent model development comprises the following steps:
S6.2.a: wrapping the interfaces of the mainstream deep learning frameworks to provide a unified development interface;
S6.2.b: data preprocessing, including data filtering, data cleaning and data augmentation;
S6.2.c: constructing a suitable deep neural network model based on the service data and requirements;
S6.2.d: initializing model training, and saving the model after training and verification.
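One possible reading of the unified development interface in S6.2.a is an adapter pattern: application code talks to a single abstract interface, and each framework gets its own adapter behind it. The sketch below is an assumption about how this could look, not the method claimed. All class and function names are hypothetical, and `EchoBackend` is a toy adapter that wraps no real framework.

```python
from abc import ABC, abstractmethod


class InferenceBackend(ABC):
    """Common interface that each framework-specific adapter would implement."""

    @abstractmethod
    def load(self, model_path):
        """Load a model file in the backend's own format."""

    @abstractmethod
    def predict(self, batch):
        """Return one prediction per item in the batch."""


class EchoBackend(InferenceBackend):
    """Toy adapter used only to show the pattern; it wraps no real framework."""

    def load(self, model_path):
        self.model_path = model_path

    def predict(self, batch):
        # Placeholder logic: the "prediction" is just the length of each item.
        return [len(item) for item in batch]


# Registry mapping a backend name to its adapter class.
BACKENDS = {"echo": EchoBackend}


def create_backend(name, model_path):
    """Instantiate the named adapter and load the model into it."""
    backend = BACKENDS[name]()
    backend.load(model_path)
    return backend


if __name__ == "__main__":
    backend = create_backend("echo", "model.bin")
    print(backend.predict(["ab", "c"]))
```

With this layout, supporting another framework means adding one adapter class and one registry entry, while the calling code stays unchanged.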
9. The intelligent recognition model adaptation method applied to a domestic GPU environment according to claim 6, wherein the intelligent model deployment comprises the following steps:
S4.3.a: model migration: converting the trained and verified model into a format supported by the domestic hardware platform environment, and generating an offline model;
S4.3.b: model optimization: according to the characteristics of the deployment environment, performing model pruning, quantization and distillation to reduce the model size and improve its performance on the specific hardware;
S4.3.c: deployment environment preparation: installing the necessary software libraries, configuring hardware devices and setting up network connections;
S4.3.d: model deployment: deploying the optimized model into the target environment and testing it;
S4.3.e: model monitoring and updating: during model deployment and operation, continuously monitoring the performance and functions of the model, and updating and optimizing the model as needed;
S4.3.f: inference application development: according to the intelligent recognition service requirements and data flow, developing an intelligent auditing application, calling the offline model to automatically audit actual samples in the service, returning the auditing result to the service processing flow, and displaying the auditing result in the application interface.
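Of the optimization operations named in the model optimization step (pruning, quantization, distillation), quantization is the easiest to show in a few lines. The claims do not say which quantization scheme is used, so the sketch below assumes symmetric per-tensor int8 quantization purely as an illustration; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Returns (quantized, scale) such that weight is approximately q * scale.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    print(quantized, scale)
    print([abs(w - r) for w, r in zip(weights, restored)])
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per weight; whether that error is acceptable would be checked with the same evaluation indexes used earlier in the method.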
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311352128.XA CN117093376A (en) | 2023-10-19 | 2023-10-19 | Intelligent recognition model adaptation method applied to domestic GPU environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117093376A true CN117093376A (en) | 2023-11-21 |
Family
ID=88777581
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311352128.XA Pending CN117093376A (en) | 2023-10-19 | 2023-10-19 | Intelligent recognition model adaptation method applied to domestic GPU environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117093376A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113076143A (en) * | 2021-04-21 | 2021-07-06 | 扬州万方电子技术有限责任公司 | Artificial intelligence environment adaptation method and compatibility testing method for domestic platform |
CN114186697A (en) * | 2021-12-10 | 2022-03-15 | 北京百度网讯科技有限公司 | Method and device for generating and applying deep learning model based on deep learning framework |
CN114330696A (en) * | 2021-12-31 | 2022-04-12 | 中国联合网络通信集团有限公司 | Multi-frame deep learning model processing method and device and electronic equipment |
CN114707667A (en) * | 2022-04-29 | 2022-07-05 | 中国电子科技集团公司第二十八研究所 | Data-driven automatic model training and application system |
CN116483730A (en) * | 2023-05-10 | 2023-07-25 | 公安部第一研究所 | Service system automatic test method based on domestic software and hardware and open source test tool |
WO2023160290A1 (en) * | 2022-02-23 | 2023-08-31 | 京东方科技集团股份有限公司 | Neural network inference acceleration method, target detection method, device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||