CN117093376A - Intelligent recognition model adaptation method applied to domestic GPU environment - Google Patents

Intelligent recognition model adaptation method applied to domestic GPU environment

Info

Publication number
CN117093376A
Authority
CN
China
Prior art keywords
model
intelligent
domestic
steps
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311352128.XA
Other languages
Chinese (zh)
Inventor
马文胜
韩丽萍
李海宁
何涛
贺梓然
戴军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Party Member Education Center Of Organization Department Of Shandong Provincial Committee Of Communist Party Of China
Original Assignee
Party Member Education Center Of Organization Department Of Shandong Provincial Committee Of Communist Party Of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Party Member Education Center Of Organization Department Of Shandong Provincial Committee Of Communist Party Of China filed Critical Party Member Education Center Of Organization Department Of Shandong Provincial Committee Of Communist Party Of China
Priority to CN202311352128.XA priority Critical patent/CN117093376A/en
Publication of CN117093376A publication Critical patent/CN117093376A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/61Installation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to the field of artificial intelligence and domestic basic platforms, and discloses an intelligent recognition model adaptation method applied to a domestic GPU environment, which comprises the following steps: S1: hardware device basic environment detection; S2: instruction set business architecture adaptation; S3: deep learning framework adaptation; S4: intelligent recognition model training, tuning and inference; S5: improving the performance stability of the intelligent recognition model; S6: intelligent auditing application verification. The invention can fully evaluate the suitability and reliability of a domestic hardware platform against actual business application requirements and ensure that the platform can meet project requirements, while comprehensively examining, in combination with potential development and optimization needs, whether all aspects of the hardware's capabilities provide good support.

Description

Intelligent recognition model adaptation method applied to domestic GPU environment
Technical Field
The invention belongs to the field of artificial intelligence and domestic basic platforms, and particularly relates to an intelligent recognition model adaptation method applied to domestic GPU environments.
Background
With the development of artificial intelligence and big data technology, intelligent recognition and auxiliary auditing are increasingly widely applied to the production and release of resources on various platforms and websites. Current intelligent recognition technology is mainly implemented on foreign GPU cards such as NVIDIA products, while domestic chips, AI accelerator cards and other hardware products still lag in performance, compatibility and degree of adaptation. As a result, although domestic GPUs support mainstream deep learning frameworks, they still lack adaptation techniques and a software ecosystem that interfaces with the mainstream AI frameworks, and problems such as instruction set support remain to be solved.
Disclosure of Invention
The invention aims to adapt artificial intelligence models developed with mainstream deep learning frameworks to domestic GPU platforms, and provides an intelligent recognition model adaptation method applied to a domestic GPU environment. In combination with potential development and optimization needs, the method comprehensively examines whether all aspects of the hardware's capabilities can provide good support.
The invention provides the following technical scheme: an intelligent recognition model adaptation method applied to a domestic GPU environment comprises the following steps:
s1: detecting a basic environment of hardware equipment;
s2: instruction set business architecture adaptation;
s3: deep learning framework adaptation;
s4: training, optimizing and reasoning of the intelligent recognition model;
s5: improving the performance stability of the intelligent recognition model;
s6: intelligent identification application verification.
The step S1: the specific steps of hardware equipment basic environment detection include:
s1.1: the specific method for adapting the hardware firmware and the driver comprises the following steps:
s1.1.a: installing the firmware and the driver; if the firmware or driver version is too old during installation, downloading and installing a newer driver version; if driver installation fails due to problems such as the card dropping off, reinstalling the driver;
s1.1.b: confirm the valid installation of firmware and driver using the terminal command;
s1.2: the specific method for adapting the dependency component library comprises the following steps:
s1.2.a: acquiring a source code;
s1.2.b: installing a cross-compilation tool capable of supporting multiple target architectures;
s1.2.c: configuring the compilation options and managing the compilation process with a build system;
s1.2.d: running the build command to compile the dependency library and generate a compiled library for the target architecture;
s1.2.e: installing the compiled dependency library, and confirming through a terminal command that the dependency library is correctly installed.
The step S2: the specific steps of the instruction set business architecture adaptation include:
s2.1: the instruction set business architecture compatibility test method specifically comprises the following steps:
installing the relevant analysis and processing toolkits for the business-scenario data; starting the business service, testing, and checking through a command whether the relevant dependencies are successfully installed; if the installation succeeds, passing the compatibility test and going to step S2.2;
otherwise, performing the adaptation steps for the related toolkit: 1) obtaining the source code; 2) configuring the compilation options; 3) generating a compiled library for the target architecture; 4) after installation and testing, going to step S2.2;
s2.2: the instruction set service QPS performance test method specifically comprises the following steps:
the same set of business logic codes and algorithm models are respectively used for the original platform and the target platform to deploy business modules; testing the response speed and throughput of the hardware platform according to the algorithm and data in the service; and judging the performance test result of the instruction set service QPS according to the service requirement and the test result.
The step S3: the specific steps of the deep learning framework adaptation include:
s3.1: selecting a deep learning framework supported by a domestic acceleration platform;
s3.2: performing source code compiling, constructing and installing on the mainstream deep learning framework;
s3.3: running the official example demo code of the deep learning framework to verify validity.
The step S4: the specific steps of intelligent recognition model training tuning and reasoning include:
s4.1: installing a dependent environment required by training and reasoning of the intelligent model;
s4.2: aiming at a business scene, preparing a data set, dividing a training set and a testing set, and generating a classification label;
s4.3: respectively implementing algorithm models on the original platform and the target platform, and keeping the model structure consistent with the parameters;
s4.4: reading in training data to start training, and storing a model file after training is completed;
s4.5: loading a trained intelligent model file, converting the model format into a format supported by a domestic platform, encapsulating a model reasoning interface, modifying an original platform preprocessing code and a post-processing code, and carrying out model reasoning and prediction by using the encapsulated interface.
The step S5: the specific steps of improving the performance stability of the intelligent recognition model comprise:
s5.1: the intelligent model performance evaluation method comprises the following steps:
s5.1a: for the business scenario, constructing the test data set required for intelligent model evaluation and uploading it to the different platforms. On each platform, using the same set of test data, algorithm model and evaluation criteria, performing model recognition tests on the business data to be recognized and collecting the recognition results. The evaluation criteria comprise four indicators: the precision, recall, F1 and mAP of the intelligent model. Precision and recall reflect the accuracy and completeness of the recognition model's predictions, and F1 combines them into a single indicator; mAP reflects the average precision of the recognition model in multi-category prediction scenarios. The performance of the recognition model on the different platforms is measured through these evaluation criteria;
s5.1b: running inference on the same picture 10 times, observing the inference results, and identifying and addressing the problems of unstable model output and poor inference performance on the test set;
s5.2: the specific methods for improving the model output performance and the inference effect include the following (an illustrative tuning sketch follows step S5.4):
s5.2a: checking the quality of the training data to ensure its accuracy and sufficiency; data quality can be improved by means such as data cleaning and data augmentation;
s5.2b: adjusting the complexity of the model to avoid overfitting; model complexity can be controlled by adding regularization terms, reducing the number of model parameters, and similar means;
s5.2c: using techniques such as cross-validation to evaluate the performance of the model and avoid overfitting; the data set can be divided into multiple training and validation sets, with the validation sets used to evaluate the model;
s5.2d: tuning the model's hyper-parameters to optimize its performance; the optimal hyper-parameter combination can be searched for by means such as grid search and random search;
s5.2e: increasing the amount of training data to improve the generalization ability of the model; the training data volume can be increased by means such as data augmentation and data synthesis;
s5.2f: using techniques such as transfer learning to enhance the generalization ability of the model; a pre-trained model can be used as the base model and adapted to new tasks by fine-tuning;
s5.2g: performing model tuning on the test set to improve the generalization ability of the model; the validation set can be used to evaluate model performance, and model tuning is then performed on the test set;
s5.3: performance improvement verification: integrating and normalizing tensors in an inference queue by using a bidirectional data binding method;
s5.4: repeating steps S5.2 and S5.3 and performing performance improvement verification.
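As an illustrative sketch of the cross-validation and grid-search tuning in steps S5.2c and S5.2d, the snippet below uses scikit-learn; the estimator and parameter grid are placeholders chosen for brevity and are not the recognition model of the invention:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

def tune(x_train, y_train):
    param_grid = {
        "n_estimators": [100, 300],
        "max_depth": [None, 10, 20],       # lower depth = simpler model, less overfitting
    }
    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid,
        cv=5,                              # 5-fold cross-validation on the training data
        scoring="f1_macro")
    search.fit(x_train, y_train)
    return search.best_estimator_, search.best_params_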
The step S6: the specific steps of intelligent identification application verification include:
s6.1: the method for monitoring the source code safety comprises the following steps:
firstly, for the AI computing accelerator card of the domestic GPU platform, selecting a supported deep learning framework version; then, in combination with the target business scenario, performing security risk detection on the source code of the open-source framework to prevent security problems caused by vulnerabilities;
s6.2: the intelligent model development method specifically comprises the following steps:
s6.2.a: packaging the interface of the mainstream deep learning framework to realize a unified development interface;
s6.2.b: data preprocessing, including filtering, cleaning, augmentation and the like;
s6.2.c: constructing a proper deep neural network model by combining service data and requirements;
s6.2.d: initializing model training, and storing the model after training and verification are completed;
s6.3: the intelligent model deployment method specifically comprises the following steps:
s6.3.a: model migration: the trained and verified model is converted into a format of a domestic hardware platform environment, and an offline model is generated;
s6.3.b: model optimization: according to the characteristics of the deployment environment, performing operations such as model pruning, quantization and distillation to reduce the model size and improve the model's performance on the specific hardware;
s6.3.c: deployment environment preparation: the method comprises the steps of installing necessary software libraries, configuring hardware equipment, setting network connection and the like;
s6.3.d: model deployment: deploying the optimized model into a target environment, and testing;
s6.3.e: model monitoring and updating: in the process of model deployment and operation, continuously monitoring the performance and functions of the model, and updating and optimizing the model according to the needs;
s6.3.f: inference application development: according to the auditing business requirements and data flow, developing the intelligent recognition application, calling the offline model on actual samples in the automatic auditing service, returning the recognition results to the business processing flow, and displaying them in the application interface.
The invention has the following beneficial effects:
the invention can fully evaluate the suitability and reliability of the domestic hardware platform in the actual application requirement, and ensure that the domestic hardware platform can meet the project requirement. Combining the potential development and optimization requirements, comprehensively examining whether all aspects of the capability of the hardware equipment can form good support.
Drawings
FIG. 1 is a schematic diagram of the invention;
FIG. 2 is an instruction set business architecture adaptation flow diagram;
fig. 3 is a flow chart of algorithm model adaptation.
Detailed Description
The following description of the embodiments of the invention is presented in conjunction with the accompanying drawings so that those skilled in the art can better understand the invention. It should be expressly noted that, in the description below, detailed descriptions of known functions and designs are omitted where they might obscure the essence of the present invention.
Examples
FIG. 1 is a schematic diagram of an intelligent recognition model adaptation method applied to a domestic GPU environment, and the method specifically comprises the following steps:
s1: detecting a basic environment of hardware equipment;
s2: instruction set business architecture adaptation;
s3: deep learning framework adaptation;
s4: training, optimizing and reasoning of the intelligent recognition model;
s5: improving the performance stability of the intelligent recognition model;
s6: intelligent identification application verification.
In this example, the test environment is a domestic GPU, a Cambricon MLU intelligent accelerator card of model MLU370-X8; the non-domestic graphics card used for comparison is an NVIDIA GPU of model NVIDIA 3080Ti; and the deep learning framework is the Baidu PaddlePaddle framework.
The step S1: hardware device basic environment detection:
s1.1: adapting the hardware firmware and drivers:
firstly, installing the firmware and the driver on the GPU hardware; if the firmware or driver version is too old during installation, downloading and installing a newer driver version; if driver installation fails due to problems such as the card dropping off, reinstalling the driver;
after the installation is completed, confirming that the firmware and the driver are effectively installed by using a terminal command (cnmon);
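As an illustrative sketch, this confirmation step can also be scripted; the snippet below simply invokes the vendor monitoring tool cnmon and inspects its output, and the assumption that card entries contain the string "MLU" is ours, not the patent's:

# Minimal sketch: confirm firmware/driver visibility by invoking cnmon.
# Output parsing is an assumption; the authoritative check is reading the report manually.
import shutil
import subprocess

def check_mlu_driver(tool: str = "cnmon") -> bool:
    if shutil.which(tool) is None:
        print(f"{tool} not found; firmware/driver may not be installed")
        return False
    result = subprocess.run([tool], capture_output=True, text=True, timeout=30)
    if result.returncode != 0:
        print(f"{tool} exited with code {result.returncode}: {result.stderr.strip()}")
        return False
    print(result.stdout)               # manual inspection of driver/firmware versions
    return "MLU" in result.stdout      # assumption: card entries mention "MLU"

if __name__ == "__main__":
    check_mlu_driver()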
s1.2: adaptation dependent component library:
Step 1, source code acquisition: obtain the source code of the dependency library from the project's official website or GitHub repository. Step 2, install a cross-compilation tool: for the current project, install a cross-compilation toolchain that supports the required target architectures, such as GCC (the GNU Compiler Collection). Step 3, configure compilation options: for the current project, manage the compilation process with a build system such as autoconf or cmake; configuring the compilation tools for the project's target architecture involves setting environment variables and possibly other flags and options so that they point to the cross compiler. Step 4, run the build command (such as make) to compile the dependency library and generate a compiled library for the target architecture, then perform installation and testing; the installation and test procedures differ from project to project, but the general procedure is to install the compiled library onto the target system and then run the officially provided demo program to confirm that the installation is valid.
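The configure/build/install flow above can be driven from a small script; the sketch below assumes a cmake-based dependency with a hypothetical toolchain file and directory layout (autoconf-based projects would run ./configure instead) and is not tied to any particular library:

# Sketch of the cross-compilation flow for one dependency library.
import subprocess
from pathlib import Path

def build_dependency(src_dir: str, toolchain_file: str, install_prefix: str) -> None:
    build_dir = Path(src_dir) / "build"
    build_dir.mkdir(exist_ok=True)
    subprocess.run(
        ["cmake", "..",
         f"-DCMAKE_TOOLCHAIN_FILE={toolchain_file}",    # points at the cross compiler
         f"-DCMAKE_INSTALL_PREFIX={install_prefix}"],
        cwd=build_dir, check=True)
    subprocess.run(["make", "-j8"], cwd=build_dir, check=True)       # compile
    subprocess.run(["make", "install"], cwd=build_dir, check=True)   # install to the target prefix

# Example call with hypothetical paths:
# build_dependency("/opt/src/somelib", "/opt/toolchains/aarch64.cmake", "/usr/local")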
The step S2: as shown in fig. 2, the instruction set business architecture adaptation specifically includes:
s2.1: instruction set business architecture compatibility test:
installing the relevant analysis and processing toolkits for the business-scenario data; starting the business service, testing, and checking through a command whether the relevant dependencies are successfully installed; if the installation succeeds, passing the compatibility test and going to step S2.2;
otherwise, performing the adaptation steps for the related toolkit: 1) obtaining the source code; 2) configuring the compilation options; 3) generating a compiled library for the target architecture; 4) after installation and testing, going to step S2.2;
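The "check whether the relevant dependencies are installed" step can be illustrated with a short script; the package list below is a placeholder and must be replaced with the toolkits actually required by the business service:

# Sketch: verify that the required analysis/processing packages can be imported.
import importlib.util

REQUIRED = ["numpy", "opencv-python"]        # placeholder package names
IMPORT_NAMES = {"opencv-python": "cv2"}      # pip name -> import name

def missing_dependencies(packages=REQUIRED):
    missing = []
    for pkg in packages:
        name = IMPORT_NAMES.get(pkg, pkg)
        if importlib.util.find_spec(name) is None:
            missing.append(pkg)
    return missing

if __name__ == "__main__":
    gaps = missing_dependencies()
    print("compatibility test passed" if not gaps else f"missing dependencies: {gaps}")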
s2.2: instruction set service QPS performance test:
the same set of business logic code and algorithm models is used to deploy the business modules on the original platform and the target platform respectively; for the algorithms and data in the business, the interface is wrapped as a FastAPI service, and the response speed and throughput of the hardware platform are tested; the instruction set service QPS performance test result is then judged according to the business requirements and the test results.
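A minimal sketch of the QPS measurement, assuming the business interface has already been wrapped as a FastAPI service; the endpoint URL and request payload are placeholders for the actual business interface:

# Fire N concurrent requests at the inference endpoint and report QPS and mean latency.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://127.0.0.1:8000/predict"        # hypothetical FastAPI endpoint
PAYLOAD = {"image_path": "sample.jpg"}       # placeholder request body

def one_call(_):
    start = time.perf_counter()
    resp = requests.post(URL, json=PAYLOAD, timeout=30)
    resp.raise_for_status()
    return time.perf_counter() - start

def measure_qps(total_requests=200, concurrency=8):
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(one_call, range(total_requests)))
    elapsed = time.perf_counter() - t0
    print(f"QPS: {total_requests / elapsed:.1f}, "
          f"mean latency: {1000 * sum(latencies) / len(latencies):.1f} ms")

if __name__ == "__main__":
    measure_qps()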
The step S3: deep learning framework adaptation:
s3.1: mainstream frame adaptation:
Step 1: compile, build and install the mainstream deep learning framework from source. The Baidu PaddlePaddle framework is used in this example, and the compilation and installation steps are as follows:
1) Preparing a correlation dependence:
mm_v0.1_aarch64-kylin10.tar;
cntoolkit-3.1.4-1.ky10.aarch64.rpm;
cnnl-static-1.14.2-1.ky10.aarch64.rpm;
cnnl-1.14.2-1.ky10.aarch64.rpm;
cncl-1.5.2-1.ky10.aarch64.rpm;
2) Compile inside a container; the commands are as follows:
gh repo clone Cambricon/mlu-ops
cd mlu-ops/bangc-ops;
./build.sh;
copy the generated header files to the corresponding location in the installation directory;
3) Compile Paddle:
the Paddle repository corresponding to CTR2.5 is the PaddlePaddle 2.4 version library;
3.1) use the rpm packages prepared in step 1) to update the underlying libraries; the version arguments are:
ARG CNTOOLKIT_VERSION=3.1.4-1;
ARG CNNL_VERSION=1.14.2-1;
ARG CNCL_VERSION=1.5.2-1;
ARG MLUOPS_VERSION=0.4.1-1;
3.2) enter the working directory; the command is:
cd Paddle;
3.3) create the build directory; the command is:
mkdir build && cd build;
3.4) run cmake; the command is:
cmake .. -DPY_VERSION=3.7 -DPYTHON_EXECUTABLE=`which python3` -DWITH_ARM=ON -DWITH_TESTING=OFF -DON_INFER=ON -DWITH_XBYAK=OFF -DCMAKE_CXX_FLAGS="-Wno-error -w" -DWITH_MLU=ON;
Step 2: run the official example demo code of the deep learning framework to verify validity; the verification commands are as follows:
cd Paddle;
pip install build/python/dist/paddlepaddle_mlu-0.0.0-cp37-cp37m-arm;
python;
import paddle;
paddle.utils.run_check();
the step S4: as shown in FIG. 3, the intelligent recognition model training tuning and reasoning implementation steps are as follows:
s4.1: installing the dependency environment required for training and inference of the intelligent model, including the Cambricon GPU driver and the Docker image of the dependent libraries; installing the Cambricon MLU driver, Paddle-MLU, and the Cambricon-adapted YOLOX;
s4.2: for the business scenario, preparing the data set; because the sample classes in the data set are imbalanced, splitting the training set and the test set with a stratified splitting method so that the class proportions in the training set and the test set are similar, and generating the classification labels;
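A minimal sketch of the stratified split, using scikit-learn; the file paths and labels are placeholders for the business data set:

# Split while preserving per-class proportions (stratify=labels).
from sklearn.model_selection import train_test_split

def split_dataset(image_paths, labels, test_ratio=0.2, seed=42):
    return train_test_split(
        image_paths, labels,
        test_size=test_ratio,
        stratify=labels,        # keeps class ratios similar in train and test sets
        random_state=seed)

# Example with placeholder data:
# x_train, x_test, y_train, y_test = split_dataset(paths, labels)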
s4.3: respectively implementing algorithm models on an original platform and a target platform, designating the same loss function, optimizer and evaluation index, and keeping the model structure and parameters consistent;
s4.4: reading in training data, starting training, and storing a model file after the training is finished;
s4.5: loading the trained intelligent model file, converting the Paddle model format into the ONNX format, converting the ONNX model into the MagicMind format supported by the domestic platform, encapsulating the model inference interface, modifying the original platform's preprocessing and post-processing code, and carrying out model inference and prediction through the encapsulated interface.
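A sketch of the Paddle-to-ONNX step and a thin wrapped inference interface. The paddle2onnx command-line flags follow its public usage; the subsequent ONNX-to-MagicMind conversion uses vendor tooling and is only indicated by a comment; the onnxruntime session below is a portable stand-in for cross-checking the exported model, not the MLU runtime itself:

import subprocess

import numpy as np
import onnxruntime as ort

def export_to_onnx(model_dir="inference_model", onnx_path="model.onnx"):
    # Convert an exported Paddle inference model to ONNX.
    subprocess.run(
        ["paddle2onnx",
         "--model_dir", model_dir,
         "--model_filename", "model.pdmodel",
         "--params_filename", "model.pdiparams",
         "--save_file", onnx_path,
         "--opset_version", "11"],
        check=True)
    # The ONNX file is then converted to the platform's MagicMind offline format
    # with the vendor converter (not shown here).
    return onnx_path

class RecognitionModel:
    # Encapsulated inference interface: preprocess -> run -> postprocess.
    def __init__(self, onnx_path: str):
        self.session = ort.InferenceSession(onnx_path)
        self.input_name = self.session.get_inputs()[0].name

    def predict(self, image: np.ndarray) -> np.ndarray:
        batch = image[np.newaxis].astype(np.float32)       # minimal preprocessing: add a batch dim
        outputs = self.session.run(None, {self.input_name: batch})
        return outputs[0]                                   # postprocessing left to the caller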
The step S5: and the performance stability of the intelligent recognition model is improved:
s5.1: the intelligent model performance evaluation method comprises the following steps:
s5.1a: for the business scenario, constructing the test data set required for intelligent model evaluation and uploading it to the different platforms. On each platform, using the same set of test data, algorithm model and evaluation criteria, performing model recognition tests on the business data to be audited and collecting the recognition results. The evaluation criteria comprise four indicators: the precision, recall, F1 and mAP of the intelligent model. Precision and recall reflect the accuracy and completeness of the recognition model's predictions, and F1 combines them into a single indicator; mAP reflects the average precision of the recognition model in multi-category prediction scenarios. The performance of the recognition model on the different platforms is measured through these evaluation criteria;
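As an illustrative sketch of the four indicators, assuming a multi-class classification setting with more than two classes; detection-style mAP (IoU-based box matching) requires the full VOC/COCO procedure and is not reproduced here:

import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, average_precision_score
from sklearn.preprocessing import label_binarize

def evaluate(y_true, y_pred, y_score, classes):
    # y_true/y_pred: class indices; y_score: per-class scores of shape (n_samples, n_classes)
    precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
    recall = recall_score(y_true, y_pred, average="macro", zero_division=0)
    f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
    y_true_bin = label_binarize(y_true, classes=classes)
    # mean of per-class average precision, i.e. classification-style mAP
    m_ap = np.mean([average_precision_score(y_true_bin[:, i], y_score[:, i])
                    for i in range(len(classes))])
    return {"precision": precision, "recall": recall, "F1": f1, "mAP": m_ap}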
s5.1b: running inference on the same picture 10 times, observing the inference results, and identifying and addressing the problems of unstable model output and poor inference performance on the test set;
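A minimal sketch of this repeatability check, using a wrapped inference interface such as the RecognitionModel sketch in S4.5 above; the numerical tolerance is an assumption:

import numpy as np

def check_inference_stability(model, image, runs=10, atol=1e-5):
    # Run the same picture through the model ten times and compare the outputs.
    outputs = [model.predict(image) for _ in range(runs)]
    reference = outputs[0]
    stable = all(np.allclose(out, reference, atol=atol) for out in outputs[1:])
    print("inference output stable across runs" if stable
          else "inference output varies between runs - investigate")
    return stable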
s5.2: performance improvement verification: integrating and normalizing tensors in an inference queue by using a bidirectional data binding method;
s5.3: repeating step S5.1b and performing performance improvement verification.
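The "bidirectional data binding" integration in step S5.2 is described only at a high level; the sketch below is one possible reading, stated as an assumption rather than the patent's implementation: the submitting side and the inference worker share the same record objects, so the normalized tensor and the result written by the worker are visible to both sides.

from dataclasses import dataclass
from queue import Queue
from typing import Optional

import numpy as np

@dataclass
class InferenceRecord:
    tensor: np.ndarray                     # filled in by the producer
    result: Optional[np.ndarray] = None    # written back ("bound") by the worker

def normalize(batch: np.ndarray) -> np.ndarray:
    # simple standardization; the real normalization is model specific
    return (batch - batch.mean()) / (batch.std() + 1e-8)

def drain_queue(queue: "Queue[InferenceRecord]", model) -> None:
    # single-threaded sketch: normalize each queued tensor and bind the result
    # back onto the shared record so the producer sees it immediately
    while not queue.empty():
        record = queue.get()
        record.tensor = normalize(record.tensor)
        record.result = model.predict(record.tensor)
        queue.task_done()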
The step S6: intelligent identification application verification:
s6.1: and (3) source code safety monitoring:
firstly, for the AI computing accelerator card of the domestic GPU platform, selecting a supported deep learning framework version; then, in combination with the target business scenario, performing security risk detection on the source code of the open-source framework to prevent security problems caused by vulnerabilities;
s6.2: and (3) developing an intelligent model:
Step 1, encapsulate the interfaces of the mainstream deep learning framework to provide a unified development interface; Step 2, data preprocessing, including filtering, cleaning and augmentation (a preprocessing sketch follows step 9 below); Step 3, construct a deep neural network model in combination with the intelligent recognition business data and requirements; Step 4, start model training, and save the model weight file after training and verification are completed; Step 5, deploy the trained intelligent recognition model, the specific method comprising the following steps:
1) Model migration: the trained and verified model is converted into a format of a domestic hardware platform environment, and an offline model is generated;
2) Model optimization: according to the characteristics of the deployment environment, performing operations such as model pruning, quantization and distillation to reduce the model size and improve the model's performance on the specific hardware;
Step 6, prepare the deployment environment, including installing the necessary software libraries, configuring the hardware devices and setting up network connections; Step 7, execute model deployment: deploy the optimized model into the target environment and test it; Step 8, model monitoring and updating: during model deployment and operation, continuously monitor the performance and functions of the model and update and optimize the model as needed; Step 9, inference application development: according to the intelligent recognition business requirements and data flow, develop the intelligent recognition application, call the offline model on actual samples in the automatic recognition service, return the recognition results to the business processing flow, and display them in the application interface.
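The data preprocessing mentioned in step 2 of the intelligent model development can be sketched with PaddlePaddle's transform utilities; the image size and normalization statistics below are placeholders for the actual business data:

from paddle.vision import transforms as T

train_transforms = T.Compose([
    T.Resize((640, 640)),                  # unify input size
    T.RandomHorizontalFlip(0.5),           # augmentation
    T.ToTensor(),                          # HWC uint8 -> CHW float tensor in [0, 1]
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

# Applied to a PIL image or numpy array, e.g.:
# tensor = train_transforms(image)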
While the foregoing describes illustrative embodiments of the present invention to facilitate the understanding of the present invention by those skilled in the art, it should be understood that the present invention is not limited to the scope of these embodiments; various changes are within the scope of protection of the appended claims insofar as they fall within the spirit and scope of the present invention as defined by those claims.

Claims (9)

1. The intelligent recognition model adaptation method applied to the domestic GPU environment is characterized by comprising the following steps of:
s1: detecting a basic environment of hardware equipment;
s2: instruction set business architecture adaptation;
s3: deep learning framework adaptation;
s4: training, optimizing and reasoning by using an intelligent auditing model;
s5: the performance stability of the intelligent auditing model is improved;
s6: and (5) verifying the intelligent auditing application.
2. The intelligent recognition model adaptation method applied to domestic GPU environment according to claim 1, wherein the steps of S1: the specific steps of hardware equipment basic environment detection include:
s1.1: the specific method for adapting the hardware firmware and the driver comprises the following steps:
s1.1.a: installing the firmware and the driver; if the firmware or driver version is too old during installation, downloading and installing a newer driver version; if driver installation fails due to problems such as the card dropping off, reinstalling the driver;
s1.1.b: confirm the valid installation of firmware and driver using the terminal command;
s1.2: the specific method for adapting the dependency component library comprises the following steps:
s1.2.a: acquiring a source code;
s1.2.b: installing a cross-compilation tool capable of supporting multiple target architectures;
s1.2.c: configuring the compilation options and managing the compilation process with a build system;
s1.2.d: running the build command to compile the dependency library and generate a compiled library for the target architecture;
s1.2.e: installing the compiled dependency library, and confirming through a terminal command that the dependency library is correctly installed.
3. The intelligent recognition model adaptation method applied to domestic GPU environment according to claim 1, wherein the step S2 is: the specific steps of the instruction set business architecture adaptation include:
s2.1: the instruction set business architecture compatibility test method specifically comprises the following steps:
installing the relevant analysis and processing toolkits for the business-scenario data; starting the business service, testing, and checking through a command whether the relevant dependencies are successfully installed; if the installation succeeds, passing the compatibility test and going to step S2.2;
otherwise, carrying out the adaptation step of the related tool kit;
s2.2: the instruction set service QPS performance test method specifically comprises the following steps:
the same set of business logic codes and algorithm models are respectively used for the original platform and the target platform to deploy business modules; testing the response speed and throughput of the hardware platform according to the algorithm and data in the service; and judging the performance test result of the instruction set service QPS according to the service requirement and the test result.
4. The intelligent recognition model adaptation method applied to the domestic GPU environment according to claim 1, wherein the step S3: the specific steps of the deep learning framework adaptation include:
s3.1: the main flow frame adaptation method specifically comprises the following steps:
s3.1.a: performing source code compiling, constructing and installing on the mainstream deep learning framework;
s3.1.b: according to the deep learning framework, official example demo code is run, verifying validity.
5. The intelligent recognition model adaptation method applied to the domestic GPU environment according to claim 1, wherein the step S4 is: the specific steps of intelligent recognition model training tuning and reasoning include:
s4.1: installing a dependent environment required by training and reasoning of the intelligent model;
s4.2: aiming at a business scene, preparing a data set, dividing a training set and a testing set, and generating a classification label;
s4.3: respectively implementing algorithm models on the original platform and the target platform, and keeping the model structure consistent with the parameters;
s4.4: reading in training data to start training, and storing a model file after training is completed;
s4.5: loading a trained intelligent model file, converting the model format into a format supported by a domestic platform, encapsulating a model reasoning interface, modifying an original platform preprocessing code and a post-processing code, and carrying out model reasoning and prediction by using the encapsulated interface.
6. The intelligent recognition model adapting method applied to the domestic GPU environment according to claim 1, wherein the step S5 is: the specific steps of improving the performance stability of the intelligent recognition model comprise:
s5.1: the intelligent model performance evaluation method comprises the following steps:
s5.1a: for the business scenario, constructing the test data set required for intelligent model evaluation and uploading it to the different platforms; on the different platforms, using the same set of test data, algorithm model and evaluation criteria, performing model recognition tests on the business data to be recognized and collecting the recognition results; the evaluation criteria comprise four indicators: the precision, recall, F1 and mAP of the intelligent model; precision and recall reflect the accuracy and completeness of the recognition model's predictions, and F1 combines them into a single indicator; mAP reflects the average precision of the recognition model in multi-category prediction scenarios, and the performance of the recognition model on the different platforms is measured through these evaluation criteria;
s5.1b: running inference on the same picture 10 times, observing the inference results, and identifying and addressing the problems of unstable model output and poor inference performance on the test set;
s5.2: performance improvement verification: integrating and normalizing tensors in an inference queue by using a bidirectional data binding method;
s5.3: repeating step S5.1 and performing performance improvement verification.
7. The intelligent recognition model adapting method applied to the domestic GPU environment according to claim 1, wherein the step S6 is: the specific steps of intelligent identification application verification include:
s6.1: the method for monitoring the source code safety comprises the following steps:
firstly, for the AI computing accelerator card of the domestic GPU platform, selecting a supported deep learning framework version; then, in combination with the target business scenario, performing security risk detection on the source code of the open-source framework to prevent security problems caused by vulnerabilities;
s6.2: developing an intelligent model;
s6.3: and (5) intelligent model deployment.
8. The intelligent recognition model adapting method applied to the domestic GPU environment according to claim 6, wherein the intelligent model development comprises the following steps:
s6.2.a: packaging the interface of the mainstream deep learning framework to realize a unified development interface;
s6.2.b: data preprocessing, including data filtering, data cleaning and data augmentation;
s6.2.c: constructing a proper deep neural network model by combining service data and requirements;
s6.2.d: initializing model training, and storing the model after training and verification.
9. The intelligent recognition model adapting method applied to the domestic GPU environment according to claim 6, wherein the intelligent model deployment comprises the following steps:
s6.3.a: model migration: converting the trained and verified model into the format of the domestic hardware platform environment and generating an offline model;
s6.3.b: model optimization: according to the characteristics of the deployment environment, performing model pruning, quantization and distillation to reduce the model size and improve the model's performance on the specific hardware;
s6.3.c: deployment environment preparation: installing the necessary software libraries, configuring the hardware devices and setting up network connections;
s6.3.d: model deployment: deploying the optimized model into the target environment and testing it;
s6.3.e: model monitoring and updating: during model deployment and operation, continuously monitoring the performance and functions of the model and updating and optimizing the model as needed;
s6.3.f: inference application development: according to the intelligent recognition business requirements and data flow, developing the intelligent auditing application, calling the offline model on actual samples in the automatic auditing service, returning the auditing results to the business processing flow, and displaying the auditing results in the application interface.
CN202311352128.XA 2023-10-19 2023-10-19 Intelligent recognition model adaptation method applied to domestic GPU environment Pending CN117093376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311352128.XA CN117093376A (en) 2023-10-19 2023-10-19 Intelligent recognition model adaptation method applied to domestic GPU environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311352128.XA CN117093376A (en) 2023-10-19 2023-10-19 Intelligent recognition model adaptation method applied to domestic GPU environment

Publications (1)

Publication Number Publication Date
CN117093376A true CN117093376A (en) 2023-11-21

Family

ID=88777581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311352128.XA Pending CN117093376A (en) 2023-10-19 2023-10-19 Intelligent recognition model adaptation method applied to domestic GPU environment

Country Status (1)

Country Link
CN (1) CN117093376A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076143A (en) * 2021-04-21 2021-07-06 扬州万方电子技术有限责任公司 Artificial intelligence environment adaptation method and compatibility testing method for domestic platform
CN114186697A (en) * 2021-12-10 2022-03-15 北京百度网讯科技有限公司 Method and device for generating and applying deep learning model based on deep learning framework
CN114330696A (en) * 2021-12-31 2022-04-12 中国联合网络通信集团有限公司 Multi-frame deep learning model processing method and device and electronic equipment
CN114707667A (en) * 2022-04-29 2022-07-05 中国电子科技集团公司第二十八研究所 Data-driven automatic model training and application system
CN116483730A (en) * 2023-05-10 2023-07-25 公安部第一研究所 Service system automatic test method based on domestic software and hardware and open source test tool
WO2023160290A1 (en) * 2022-02-23 2023-08-31 京东方科技集团股份有限公司 Neural network inference acceleration method, target detection method, device, and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination