WO2020200213A1 - Image generation method, neural network compression method, and related apparatus and device - Google Patents

Image generation method, neural network compression method, and related apparatus and device

Info

Publication number
WO2020200213A1
WO2020200213A1 (PCT/CN2020/082599)
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
sample
image
generated
generator
Prior art date
Application number
PCT/CN2020/082599
Other languages
English (en)
French (fr)
Inventor
陈汉亭
王云鹤
刘传建
韩凯
许春景
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP20783785.7A priority Critical patent/EP3940591A4/en
Publication of WO2020200213A1 publication Critical patent/WO2020200213A1/zh
Priority to US17/488,735 priority patent/US20220019855A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0495Quantised networks; Sparse networks; Compressed networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/778Active pattern-learning, e.g. online learning of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to an image generation method, a neural network compression method, and related devices and equipment.
  • Machine learning models such as neural networks (NN) and deep neural networks (DNN) have been applied in various fields, such as image classification, object detection, and speech recognition.
  • However, running a machine learning model requires huge computing resources, making it difficult to apply the model directly on small mobile devices such as mobile phones, tablets, on-board units (OBU), and cameras. In this case, the machine learning model needs to be compressed to reduce its computing resource requirements and accelerate its operation.
  • However, existing neural network compression and acceleration algorithms are usually computed based on the training samples of the machine learning model to be compressed.
  • real training samples are often protected by privacy policies or laws and cannot be obtained by third parties.
  • Moreover, the structure of the machine learning model to be compressed is often invisible, with only its input and output interfaces exposed; therefore, most neural network compression techniques cannot be used when real training samples are unavailable. How to generate training samples is thus a technical problem that urgently needs to be solved in order to compress neural networks without real training samples.
  • A generative adversarial network (GAN) usually includes a generator and a discriminator; the two networks learn through an adversarial game to produce better output.
  • The generator captures the underlying distribution of real training samples and generates new samples;
  • the discriminator is a binary classifier used to determine whether an input sample is a real sample or a generated sample.
  • Such a generator can generate samples similar to real training samples, based on existing real training samples.
  • However, the generator needs real training samples for training; when real training samples cannot be obtained, the GAN cannot be trained, and samples similar to real training samples cannot be obtained.
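The adversarial game between the two networks can be sketched with toy one-dimensional "networks" (a hypothetical illustration for intuition only, not the networks described in this application): the discriminator D scores how likely an input is to be real, and the generator G maps noise to samples that try to fool D.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy 1-D "networks": D is a logistic unit, G is an affine map of noise.
w_d, b_d = 2.0, -1.0          # discriminator parameters (illustrative)
w_g, b_g = 0.5, 0.0           # generator parameters (illustrative)

def D(x):  # probability that x is a real sample
    return sigmoid(w_d * x + b_d)

def G(z):  # generated sample from noise z
    return w_g * z + b_g

real = rng.normal(3.0, 1.0, size=64)   # real samples (exactly what is
                                       # unavailable in the data-free setting
                                       # considered by this application)
fake = G(rng.normal(size=64))

# Discriminator objective: push D(real) -> 1 and D(fake) -> 0.
loss_d = -np.mean(np.log(D(real) + 1e-8) + np.log(1 - D(fake) + 1e-8))
# Generator objective: fool D, i.e. push D(fake) -> 1.
loss_g = -np.mean(np.log(D(fake) + 1e-8))
print(loss_d, loss_g)
```

In standard GAN training these two losses are minimized alternately; the point of this application is that the discriminator loss cannot be computed without `real`, which motivates training the generator against a pre-trained, fixed discriminator instead.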
  • the embodiments of the present application provide an image generation method, a neural network compression method, and related devices and equipment, which can generate a sample image similar to a real image and realize neural network compression without a real image.
  • an embodiment of the present application provides a method for training an image generator, including:
  • In this method, the training device inputs a first matrix into an initial image generator, which is a deep neural network, to obtain a generated image; inputs the generated image into a preset discriminator to obtain a discrimination result, where the preset discriminator is trained on first training data that includes real images and the classifications corresponding to the real images; and further updates the initial image generator according to the discrimination result to obtain a target image generator.
  • the initial image generator may be an initialized deep neural network, or it may be a deep neural network obtained during training.
  • the training device can be a terminal device, such as a mobile phone, a tablet computer, a desktop computer, a portable notebook, AR/VR, a vehicle-mounted terminal, etc., or a server or cloud.
  • The above method can train the target image generator without the real images used to train the preset discriminator. The trained target image generator can generate sample images with characteristics similar to those real images, and the sample images can replace the training data of the preset discriminator, enabling compression of the preset discriminator and other functions that require its training data.
  • the preset discriminator may be an image recognition network, and the image recognition network may be used to recognize the classification of the input image.
  • The preset discriminator may be a face attribute recognition network, which can be used to recognize attributes of the person depicted in an input face image, such as age, race, gender, and emotion.
  • the discrimination result may include the probability that the generated image is predicted to be each of the M categories, where M is an integer greater than 1.
  • The first implementation in which the training device updates the initial image generator according to the discrimination result may be: determine the maximum probability among the probabilities corresponding to the M classifications, and determine the classification corresponding to the maximum probability as the true result of the generated image; further, update the initial image generator according to the difference between the discrimination result and the true result.
  • In this way, the training device derives the true result from the preset discriminator's discrimination result on the generated image and updates the initial image generator based on the difference between the two, so the target image generator can be trained without real images; taking the classification with the maximum probability as the true result reduces the difference between the discrimination result and the true result and improves the computational efficiency of the training process.
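A minimal sketch of this first implementation, assuming the preset discriminator exposes softmax probabilities over the M classifications (the values below are illustrative, not from the application):

```python
import numpy as np

# Discrimination result: probability that the generated image belongs to
# each of M = 4 classifications (hypothetical softmax output).
probs = np.array([0.10, 0.65, 0.20, 0.05])

# Take the classification with the maximum probability as the pseudo
# "true result" of the generated image.
pseudo_label = int(np.argmax(probs))

# One-hot encode it, then measure the difference between the discrimination
# result and this true result with cross-entropy; this loss is what would be
# back-propagated into the generator (the discriminator stays fixed).
one_hot = np.zeros_like(probs)
one_hot[pseudo_label] = 1.0
ce_loss = -np.sum(one_hot * np.log(probs + 1e-12))
print(pseudo_label, ce_loss)   # cross-entropy reduces to -log(max prob)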
  • In a second implementation, before updating the initial image generator according to the discrimination result, the training device extracts features of the generated image through the preset discriminator; the update may then be: determine the maximum probability among the probabilities corresponding to the M classifications, determine the classification corresponding to the maximum probability as the true result of the generated image, and update the initial image generator according to both the difference between the discrimination result and the true result and the features.
  • The above method can train the target image generator without real images, and, by constraining the features of the generated image in the same feature space the preset discriminator uses for real input images, makes the sample images generated by the trained target image generator closer to real images.
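One plausible way to realize this feature constraint, sketched below under the assumption (not stated explicitly above) that real inputs tend to excite the discriminator's intermediate features strongly, is to reward large activation values:

```python
import numpy as np

# Features of the generated image as extracted by an intermediate layer of
# the preset discriminator (illustrative random values standing in for a
# hypothetical 64-dimensional feature vector).
rng = np.random.default_rng(1)
features = rng.normal(size=(64,))

# Assumed constraint, not necessarily the application's exact formula:
# minimize the negative mean absolute activation, i.e. reward generated
# images whose features are as strongly activated as real images' would be.
feature_loss = -np.mean(np.abs(features))
print(feature_loss)
```

This term would be added to the classification-difference loss when updating the generator; its weight relative to the other terms is a tuning choice the text does not fix.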
  • The third implementation in which the training device updates the initial image generator according to the discrimination result may be: determine the maximum probability among the probabilities corresponding to the M classifications, and determine the classification corresponding to the maximum probability as the true result of the generated image; according to N discrimination results corresponding one-to-one to N first matrices, obtain the average probability of each of the M classifications over the N discrimination results, where N is a positive integer; and update the initial image generator according to both the difference between the discrimination result and the true result and the average probabilities.
  • The above method can train the target image generator without real images; by constraining the discrimination results of the generated images during training, the target image generator generates sample images of each classification in a balanced manner, preventing it from falling into a local optimum.
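The balance constraint can be illustrated by averaging the N discrimination results and scoring how evenly the classifications are covered; the entropy-style term below is one plausible choice, not necessarily the exact formula of this application:

```python
import numpy as np

# N = 3 discrimination results (rows), one per first matrix, over M = 4
# classifications (illustrative values).
probs_n = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
])

# Average probability of each classification over the N discrimination results.
mean_probs = probs_n.mean(axis=0)

# Assumed balance term: the (negated) entropy of the mean distribution.
# Minimizing it pushes mean_probs toward uniform, so the generator produces
# all classifications with similar frequency instead of collapsing onto one.
balance_loss = np.sum(mean_probs * np.log(mean_probs + 1e-12))
print(mean_probs, balance_loss)
```

A generator that emitted only one classification would drive `mean_probs` toward a one-hot vector and make this loss approach zero (its maximum), so the term penalizes exactly the mode-collapse behavior described above.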
  • In a fourth implementation, before updating the initial image generator according to the discrimination result, the training device extracts features of the generated image through the preset discriminator; the update may then be: determine the maximum probability among the probabilities corresponding to the M classifications and determine the corresponding classification as the true result of the generated image; according to N discrimination results corresponding one-to-one to N first matrices, obtain the average probability of each of the M classifications over the N discrimination results; further, jointly update the initial image generator according to the difference between the discrimination result and the true result, the features, and the average probabilities.
  • An embodiment of the present application provides an image generation method, including: an execution device inputs a second matrix into a target image generator to obtain a sample image, where the target image generator is obtained by any one of the image generator training methods described in the first aspect.
  • the execution device can be a terminal device, such as a mobile phone, a tablet computer, a desktop computer, a portable notebook, AR/VR, a vehicle terminal, etc., or a server or cloud.
  • The target image generator trained by the training method described in the first aspect can generate sample images with characteristics similar to the real images used to train the preset discriminator; the sample images can replace the training data of the preset discriminator, enabling compression of the preset discriminator and other functions that require its training data.
  • an embodiment of the present application also provides a neural network compression method, including:
  • In this method, the compression device acquires a sample image generated by the image generation method according to any one of the second aspect, where the preset discriminator is the neural network to be compressed; inputs the sample image into the neural network to be compressed to obtain the classification corresponding to the sample image; and further compresses the neural network to be compressed according to the sample image and its corresponding classification to obtain a compressed neural network, where the compressed neural network has fewer parameters than the neural network to be compressed.
  • the compression device may be a server or a cloud.
  • A sample image with characteristics similar to the real images used to train the preset discriminator is obtained by the image generation method described in the second aspect, and the neural network to be compressed is compressed according to the sample image and its corresponding classification; compression of the neural network to be compressed can thus be realized without training data.
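The data-free compression step can be sketched as knowledge distillation with toy linear "networks" (hypothetical, for illustration only): the network to be compressed acts as a teacher that labels the generated sample images, and a student network is fitted to those labels.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

samples = rng.normal(size=(128, 8))        # generated sample "images"
W_teacher = rng.normal(size=(8, 3))        # network to be compressed (toy)
teacher_probs = softmax(samples @ W_teacher)
labels = teacher_probs.argmax(axis=1)      # classifications of the samples

# Student: here also a linear map for simplicity; in practice the student
# would have fewer parameters (pruned, quantized, or low-rank). It is fitted
# to the teacher's classifications by gradient descent on cross-entropy.
W_student = np.zeros((8, 3))
for _ in range(200):
    p = softmax(samples @ W_student)
    grad = samples.T @ (p - np.eye(3)[labels]) / len(samples)
    W_student -= 0.5 * grad

student_acc = (softmax(samples @ W_student).argmax(axis=1) == labels).mean()
print(student_acc)
```

No real training data appears anywhere above: the only inputs are the generated samples and the teacher's own outputs, which is the point of pairing the image generation method with compression.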
  • an embodiment of the present application also provides an image processing method, including:
  • In this method, the terminal receives an input image; the input image is input into a compressed neural network and processed through it to obtain a processing result, where the compressed neural network is obtained by the neural network compression method described in the third aspect; finally, the processing result is output.
  • The content of the processing result depends on the function of the compressed neural network, which in turn depends on the function of the neural network to be compressed; the result may be, for example, the classification result or recognition result of the image.
  • For example, if the neural network to be compressed is a face attribute recognition network used to identify attributes of the person depicted in an input face image, such as gender, age, and race, then the compressed neural network can identify the gender, age, race, etc. of the person depicted in the input image, and the processing result may include the recognized gender, age, and race.
  • The compressed neural network has a simpler network structure and fewer parameters, and occupies less storage resources during operation; it can therefore be applied to lightweight terminals.
  • the terminal may be a mobile phone, a tablet computer, a desktop computer, a portable notebook, AR/VR, a vehicle-mounted terminal, or other terminal equipment.
  • An embodiment of the present application provides a method for training a text generator, including: a training device inputs a first matrix into an initial text generator, which is a deep neural network, to obtain generated text; inputs the generated text into a preset discriminator to obtain a discrimination result, where the preset discriminator is trained on first training data that includes real text and the classifications corresponding to the real text; and further updates the initial text generator according to the discrimination result to obtain a target text generator.
  • The initial text generator may be an initialized deep neural network, or a deep neural network obtained during training.
  • the training device can be a terminal device, such as a mobile phone, a tablet computer, a desktop computer, a portable notebook, AR/VR, a vehicle-mounted terminal, etc., or a server or cloud.
  • The preset discriminator may be a text recognition network used to recognize the classification of the input text, where the classification may be based on criteria such as intent or subject.
  • The above method can train the target text generator without the real text used to train the preset discriminator. The trained target text generator can generate sample text with characteristics similar to that real text, and the sample text can replace the training data of the preset discriminator, enabling compression of the preset discriminator and other functions that require its training data.
  • the discrimination result may include the probability that the generated text is predicted to be each of M categories, where M is an integer greater than 1.
  • The first implementation in which the training device updates the initial text generator according to the discrimination result may be: determine the maximum probability among the probabilities corresponding to the M classifications, and determine the classification corresponding to the maximum probability as the true result of the first generated sample; further, update the initial text generator according to the difference between the discrimination result and the true result.
  • In this way, the training device derives the true result from the preset discriminator's discrimination result on the generated text and updates the initial text generator based on the difference between the two, so the target text generator can be trained without real text; taking the classification with the maximum probability as the true result reduces the difference between the discrimination result and the true result and improves the computational efficiency of the training process.
  • In a second implementation, before updating the initial text generator according to the discrimination result, the training device extracts features of the generated text through the preset discriminator; the update may then be: determine the maximum probability among the probabilities corresponding to the M classifications, determine the classification corresponding to the maximum probability as the true result of the first generated sample, and update the initial text generator according to both the difference between the discrimination result and the true result and the features.
  • The above method can train the target text generator without real text, and, by constraining the features of the generated text that the preset discriminator extracts (as it would from real input text), makes the sample text generated by the trained target text generator closer to real text.
  • The third implementation in which the training device updates the initial text generator according to the discrimination result may be: determine the maximum probability among the probabilities corresponding to the M classifications, and determine the classification corresponding to the maximum probability as the true result of the first generated sample; according to N discrimination results corresponding one-to-one to N first matrices, obtain the average probability of each of the M classifications over the N discrimination results; and update the initial text generator according to both the difference between the discrimination result and the true result and the average probabilities.
  • The above method can train the target text generator without real text; by constraining the discrimination results of the generated text during training, the target text generator generates sample text of each classification in a balanced manner, avoiding falling into a local optimum.
  • In a fourth implementation, before updating the initial text generator according to the discrimination result, the training device extracts features of the generated text through the preset discriminator; the update may then be: determine the maximum probability among the probabilities corresponding to the M classifications and determine the corresponding classification as the true result of the first generated sample; according to N discrimination results corresponding one-to-one to N first matrices, obtain the average probability of each of the M classifications over the N discrimination results; further, update the initial text generator according to the difference between the discrimination result and the true result, the features, and the average probabilities.
  • an embodiment of the present application provides a text generation method, including: an execution device inputs a second matrix into a target text generator to obtain sample text.
  • the target text generator is obtained by any one of the text generator training methods described in the fourth aspect.
  • the execution device can be a terminal device, such as a mobile phone, a tablet computer, a desktop computer, a portable notebook, AR/VR, a vehicle terminal, etc., or a server or cloud.
  • The target text generator trained by the training method described in the fifth aspect can generate sample text with characteristics similar to the real text used to train the preset discriminator; the sample text can replace the training data of the preset discriminator, enabling compression of the preset discriminator and other functions that require its training data.
  • An embodiment of the present application provides a neural network compression method, including: a compression device obtains sample text generated by the text generation method according to any one of the fifth aspect, where the preset discriminator is the neural network to be compressed; the sample text is input into the neural network to be compressed to obtain the classification corresponding to the sample text; and the neural network to be compressed is then compressed according to the sample text and its corresponding classification to obtain a compressed neural network, where the compressed neural network has fewer parameters than the neural network to be compressed.
  • Sample text with characteristics similar to the real text used to train the preset discriminator is obtained by the text generation method described in the sixth aspect, and the neural network to be compressed is compressed according to the sample text and its corresponding classification; compression of the neural network to be compressed can thus be realized without training data.
  • an embodiment of the present application provides a text processing method.
  • In this method, a terminal receives input text; the input text is input into a compressed neural network and processed through it to obtain a processing result, where the compressed neural network is obtained by the neural network compression method described in the sixth aspect; further, the processing result is output.
  • the content of the processing result depends on the function of the compressed neural network, which in turn depends on the function of the neural network to be compressed; the result may be, for example, a classification result or a recognition result for the text.
  • for example, the neural network to be compressed is a text recognition network used to recognize the intention described by the input text. The compressed neural network can then recognize the intention of the input text and perform the operation corresponding to the recognized intention; for example, on recognizing that the intention is to "connect the call", the terminal (such as a mobile phone) can connect the current call.
  • the compressed neural network has a simpler network structure, fewer parameters, and at the same time occupies less storage resources during its operation, and thus can be applied to lightweight terminals.
  • the terminal may be a mobile phone, a tablet computer, a desktop computer, a portable notebook, AR/VR, a vehicle terminal or other terminal equipment.
  • an embodiment of the present application also provides a method for training a sample generator.
  • the method includes: a training device inputs a first matrix into an initial sample generator to obtain a first generated sample, where the initial sample generator is a deep neural network; the first generated sample is input into a preset discriminator to obtain a discrimination result, where the preset discriminator is obtained through training on first training data that includes real samples and the classifications corresponding to those real samples; and the parameters of the initial sample generator are updated according to the discrimination result of the first generated sample to obtain the target sample generator.
  • the initial sample generator may be an initialized deep neural network, or it may be a deep neural network obtained during training.
  • the above method can train the target sample generator without the real samples used to train the preset discriminator, and the trained target sample generator can be used to generate second generated samples with characteristics similar to those real samples.
  • the second generated samples can replace the training data of the preset discriminator, enabling compression of the preset discriminator and other functions that require its training data.
  • the discrimination result may include the probability that the first generated sample is predicted to be each of the M categories, where M is an integer greater than 1.
  • the first implementation manner of updating the initial sample generator according to the discrimination result may be: determining the maximum probability among the probabilities corresponding to the M categories, and taking the category corresponding to the maximum probability as the real result of the first generated sample; further, the initial sample generator is updated according to the difference between the discrimination result and the real result.
  • the training device determines the real result from the preset discriminator's discrimination result for the first generated sample and updates the initial sample generator based on the difference between the discrimination result and the real result, thereby training the target sample generator without real samples. Determining the real result as the classification corresponding to the maximum probability reduces the difference between the discrimination result and the real result and improves the computational efficiency of the training process.
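  As a concrete (non-limiting) sketch of this first implementation, the update signal can be computed as a cross-entropy loss between the discrimination result and the argmax pseudo-label; the function names and the use of NumPy are illustrative assumptions, not the embodiment's actual implementation:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def one_hot_loss(logits):
    # The category with maximum probability is taken as the "real result"
    # of the first generated sample; the loss is the cross-entropy between
    # the discrimination result and that pseudo-label.
    probs = softmax(logits)
    pseudo = probs.argmax(axis=1)
    return -np.mean(np.log(probs[np.arange(len(probs)), pseudo] + 1e-12))
```

  Minimizing this loss pushes the generator toward samples that the preset discriminator classifies with high confidence, which is the stated effect of reducing the difference between the discrimination result and the real result.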
  • before updating the initial sample generator according to the discrimination result, the method further includes extracting features of the first generated sample through the preset discriminator.
  • the second implementation of updating the initial sample generator according to the discrimination result may be: determining the maximum probability among the probabilities corresponding to the M categories, and taking the category corresponding to the maximum probability as the real result of the first generated sample; further, the initial sample generator is updated according to the difference between the discrimination result and the real result, together with the extracted features.
  • the above method can train the target sample generator without real samples; in the process of training the initial sample generator, it exploits the fact that the preset discriminator extracts characteristic features from real inputs, so by constraining the features of the first generated sample, the second generated samples produced by the trained target sample generator become closer to real samples.
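  A hedged sketch of one possible feature constraint; the embodiment only requires that the extracted features be used in the update, and the specific L1-activation reward below is an assumption borrowed from common data-free learning practice:

```python
import numpy as np

def feature_loss(features):
    # A trained discriminator tends to produce strong activations for real
    # inputs, so one hypothetical constraint rewards large L1 activations of
    # the features extracted from the first generated sample.
    return -np.mean(np.abs(features))
```

  The term is negative, so adding it to the total loss rewards generated samples whose intermediate features look as "active" as those of real inputs.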
  • the third implementation manner of updating the initial sample generator according to the discrimination result may be: determining the maximum probability among the probabilities corresponding to the M categories, and taking the category corresponding to the maximum probability as the real result of the first generated sample; obtaining, from the N discrimination results corresponding one-to-one to N first matrices, the average probability of each of the M classifications; and updating the initial sample generator according to both the difference between the discrimination result and the real result and the average probabilities.
  • the above method can train the target sample generator without real samples; in the process of training the initial sample generator, constraining the discrimination results of the first generated samples makes the target sample generator produce second generated samples of each classification in a balanced manner, preventing the target sample generator from falling into a local optimum.
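  One way to realize the constraint on the average probabilities, sketched under the assumption that the balance term is the negative entropy of the mean prediction (an illustrative choice, not mandated by the text):

```python
import numpy as np

def balance_loss(probs):
    # probs: N discrimination results, one row per first matrix, M columns.
    # The average probability of each classification is pushed toward the
    # uniform distribution by minimizing the negative entropy of the mean.
    mean_p = probs.mean(axis=0)
    return np.sum(mean_p * np.log(mean_p + 1e-12))
```

  The loss is smallest when every classification appears with equal average probability, which is exactly the balance property claimed above.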
  • before updating the initial sample generator according to the discrimination result, the method further includes extracting features of the first generated sample through the preset discriminator.
  • the fourth implementation of updating the initial sample generator according to the discrimination result may be: determining the maximum probability among the probabilities corresponding to the M categories, and taking the category corresponding to the maximum probability as the real result of the first generated sample; obtaining, from the N discrimination results corresponding one-to-one to N first matrices, the average probability of each of the M classifications; further, the initial sample generator is updated according to the difference between the discrimination result and the real result, the extracted features, and the average probabilities together.
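  The fourth implementation combines the three signals into one objective. A self-contained illustrative sketch, with hypothetical weights `alpha` and `beta` (the embodiment does not specify how the terms are weighted):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def generator_loss(logits, features, alpha=0.1, beta=5.0):
    # Combined objective: cross-entropy against the argmax pseudo-labels,
    # a feature (activation) term, and a class-balance term computed from
    # the average probability of each classification.
    probs = softmax(logits)
    pseudo = probs.argmax(axis=1)
    ce = -np.mean(np.log(probs[np.arange(len(probs)), pseudo] + 1e-12))
    feat = -np.mean(np.abs(features))            # rewards strong activations
    mean_p = probs.mean(axis=0)
    balance = np.sum(mean_p * np.log(mean_p + 1e-12))  # negative entropy
    return ce + alpha * feat + beta * balance
```

  A batch whose confident predictions cover all classifications evenly scores lower than a batch collapsed onto a single classification, which is the behavior the combined update is designed to produce.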
  • an embodiment of the present application also provides a sample generation method.
  • the method includes: an execution device inputs a second matrix into a target sample generator to obtain a second generated sample, where the target sample generator is obtained by training with any one of the sample generator training methods of the seventh aspect.
  • the above method can train the target sample generator without the real samples used to train the preset discriminator, and the target sample generator obtained through training can be used to generate second generated samples with characteristics similar to those real samples.
  • the second generated sample can replace the training data of the preset discriminator, so as to realize the compression of the preset discriminator and other functions that require the training data of the preset discriminator.
  • an embodiment of the present application also provides a neural network compression method, including: a compression device obtains a second generated sample, the second generated sample being generated by any of the sample generation methods described in the eighth aspect, where the preset discriminator is the neural network to be compressed; the second generated sample is input into the neural network to be compressed to obtain the classification corresponding to the second generated sample; further, the neural network to be compressed is compressed according to the second generated sample and its corresponding classification to obtain the compressed neural network, where the compressed neural network has fewer parameters than the neural network to be compressed.
  • the compression device can be a server or a cloud.
  • in the neural network compression method, the training of the target sample generator may be executed by the training device; the device that executes the sample generation process using the target sample generator and the device that compresses the neural network to be compressed according to the second generated sample may be the same device or different devices.
  • the sample generation method described in the tenth aspect can obtain a second generated sample with characteristics similar to the real samples used to train the neural network to be compressed; the neural network to be compressed is then compressed according to the second generated sample and its corresponding classification, thereby realizing compression of the neural network to be compressed without training samples.
  • the discrimination result includes the probability that the first generated sample is predicted to be each of the M categories, and M is an integer greater than 1.
  • the real result is determined as the classification corresponding to the maximum probability according to the discrimination result, which reduces the difference between the discrimination result and the real result, and improves the computational efficiency of the training process.
  • a specific implementation for the compression device to compress the neural network to be compressed may be: the compression device inputs the second generated sample into the neural network to be compressed to obtain the real result corresponding to the second generated sample; an initial neural network is trained with second training data to obtain the compressed neural network, where the second training data includes the second generated sample and its corresponding real result, the initial neural network is a deep neural network, and the initial neural network has fewer model parameters than the neural network to be compressed.
  • the neural network compression method uses the target sample generator to generate second generated samples with characteristics similar to the real samples used to train the neural network to be compressed, and uses the results predicted by the neural network to be compressed on those samples as labels. By training a low-complexity neural network on the second generated samples and their labels, a neural network consistent with the function of the neural network to be compressed is obtained; that neural network is the compressed neural network, so compression of the neural network to be compressed is realized without training samples.
  • the compressed neural network can be applied to lightweight devices such as terminals, thereby reducing computing loss, reducing storage overhead, and improving computing efficiency.
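  A minimal sketch of this distillation-style compression, with a linear softmax student standing in for the smaller deep neural network of the embodiment (the function names, the linear student, and the gradient-descent details are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def distill(teacher_fn, student_W, samples, lr=0.1, epochs=200):
    # Train a low-complexity student on generated samples, using the classes
    # predicted by the network to be compressed (teacher_fn) as labels.
    labels = teacher_fn(samples)                  # "real result" from teacher
    onehot = np.eye(student_W.shape[1])[labels]
    for _ in range(epochs):
        probs = softmax(samples @ student_W)
        grad = samples.T @ (probs - onehot) / len(samples)
        student_W = student_W - lr * grad         # cross-entropy gradient step
    return student_W
```

  No real training data appears anywhere: the samples come from the generator and the labels come from the network being compressed, which is the core of the data-free compression claim.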
  • another implementation manner for the compression device to compress the neural network to be compressed may be: the compression device inputs the second generated sample into the neural network to be compressed to obtain the real result of the second generated sample; according to the importance of the neurons in the neural network to be compressed, the neurons whose importance is less than a first threshold are removed to obtain a simplified neural network; furthermore, the simplified neural network is trained with third training data to obtain the compressed neural network, where the third training data includes the second generated sample and its corresponding real result.
  • the neural network compression method uses the target sample generator to generate second generated samples with characteristics similar to the real samples used to train the neural network to be compressed, and uses the results predicted by the neural network to be compressed on those samples as labels. Redundant connections in the neural network to be compressed are removed by a pruning algorithm to obtain a simplified neural network; the simplified neural network is then trained with the second generated samples as input and the corresponding real results as labels to obtain the compressed neural network, realizing compression of the neural network to be compressed without training samples.
  • the compressed neural network can be applied to lightweight devices such as terminals, thereby reducing the complexity of the neural network to be compressed, improving computing efficiency, and reducing storage overhead.
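  A minimal sketch of the neuron-removal step for one hidden layer, assuming (as one hypothetical importance measure; the text does not fix one) the L1 norm of each hidden neuron's incoming weights:

```python
import numpy as np

def prune_neurons(W_in, W_out, threshold):
    # Importance of each hidden neuron = L1 norm of its incoming weights.
    # Neurons below the first threshold are removed, shrinking both the
    # incoming and outgoing weight matrices consistently.
    importance = np.abs(W_in).sum(axis=0)      # one value per hidden neuron
    keep = importance >= threshold
    return W_in[:, keep], W_out[keep, :]
```

  The simplified network returned here would then be fine-tuned on the second generated samples and their labels, as the embodiment describes.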
  • the compression method of the neural network described above can realize the compression of the image recognition network, where the image recognition network is used to recognize the classification of the input image.
  • in this case, the “initial sample generator” is the “initial image generator”
  • the “neural network to be compressed” is the “image recognition network”
  • the “first generated sample” is the “generated image”
  • the “real sample” is the “real image”
  • the “judgment result” is the “judgment classification”
  • the “real result” is the “real classification”
  • the “second generated sample” is the “sample image”
  • the compression method of the above neural network can realize compression of a face attribute recognition network, where the face attribute recognition network is used to recognize attributes of the person described by the input face image, such as age, race, gender, and mood.
  • in this case, the “initial sample generator” is the “initial face image generator”
  • the “neural network to be compressed” is the “face attribute recognition network”
  • the “first generated sample” is the “generated face image”
  • the “real sample” is the “real face image”
  • the “judgment result” is the “judgment attribute”
  • the “real result” is the “real attribute”
  • the “second generated sample” is the “sample face image”
  • the compression method of the above neural network can realize compression of a text recognition network, where the text recognition network is used to recognize the classification of the input text; for example, the classification can be based on intention, subject matter, or other criteria.
  • in this case, the “initial sample generator” is the “initial text generator”
  • the “neural network to be compressed” is the “text recognition network”
  • the “first generated sample” is the “generated text”
  • the “real sample” is the “real text”
  • the “judgment result” is the “judgment intent”
  • the “real result” is the “true intent”
  • the “second generated sample” is the “sample text”
  • an embodiment of the present application also provides a data processing method, including:
  • the terminal receives input data; the input data is input into the compressed neural network and processed through it to obtain a processing result, where the compressed neural network is obtained by the neural network compression method described in the eleventh aspect; finally, the processing result is output.
  • the content of the processing result depends on the function of the compressed neural network, which in turn depends on the function of the neural network to be compressed; the result may be, for example, a classification result or a recognition result for the input data.
  • for example, the input data is a face image and the neural network to be compressed is a face attribute recognition network used to identify attributes of the person described in the input face image, such as gender, age, and race; the compressed neural network can then recognize the gender, age, race, etc. of the person described by the input image, and the processing result may include these recognized attributes.
  • the compressed neural network has a simpler network structure, fewer parameters, and at the same time occupies less storage resources during its operation, and thus can be applied to lightweight terminals.
  • an embodiment of the present application also provides a training device for an image generator, which includes a module for executing the method in the first aspect.
  • an embodiment of the present application also provides a training device for an image generator.
  • the device includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to execute the method in the first aspect.
  • an embodiment of the present application also provides an image generation device, which includes a module for executing the method in the second aspect.
  • an embodiment of the present application also provides an image generation device, which includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to implement the method in the second aspect.
  • an embodiment of the present application also provides a neural network compression device, which includes a module for executing the method in the third aspect.
  • an embodiment of the present application also provides a neural network compression device. The device includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to perform the method in the third aspect.
  • an embodiment of the present application further provides an image processing device (or terminal), including a module for executing the method in the fourth aspect.
  • an embodiment of the present application also provides an image processing device (or terminal), including: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to implement the method in the fourth aspect.
  • an embodiment of the present application also provides a training device for a text generator, the device including a module for executing the method in the fifth aspect.
  • an embodiment of the present application also provides a training device for a text generator, which includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to implement the method in the fifth aspect.
  • an embodiment of the present application also provides a text generation device, which includes a module for executing the method in the sixth aspect.
  • an embodiment of the present application also provides a text generation device, which includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to implement the method in the sixth aspect.
  • an embodiment of the present application further provides a neural network compression device, which includes a module for executing the method in the seventh aspect.
  • an embodiment of the present application also provides a neural network compression device, which includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to execute the method in the seventh aspect.
  • an embodiment of the present application also provides a text processing device (or terminal), including a module for executing the method in the eighth aspect.
  • an embodiment of the present application also provides a text processing device (or terminal), including: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to execute the method in the eighth aspect.
  • an embodiment of the present application also provides a training device for a sample generator, the device including a module for executing the method in the ninth aspect.
  • an embodiment of the present application also provides a training device for a sample generator. The device includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to execute the method in the ninth aspect.
  • an embodiment of the present application further provides a sample generation device, which includes a module for executing the method in the tenth aspect.
  • an embodiment of the present application also provides a sample generation device, which includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to implement the method in the tenth aspect.
  • an embodiment of the present application also provides a neural network compression device, which includes a module for executing the method in the eleventh aspect.
  • an embodiment of the present application also provides a neural network compression device, which includes: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to execute the method in the eleventh aspect.
  • an embodiment of the present application further provides a data processing device (or terminal), which includes a module for executing the method in the twelfth aspect.
  • an embodiment of the present application also provides a data processing device (or terminal), including: a memory for storing a program; a processor for executing the program stored in the memory; when the program stored in the memory is executed, the processor is used to execute the method in the twelfth aspect.
  • an embodiment of the present application further provides a computer-readable medium storing program code for execution by a device, the program code including instructions for executing the method described in any one of the first to twelfth aspects.
  • the embodiments of the present application also provide a computer program product containing instructions which, when the computer program product runs on a computer, cause the computer to execute the method described in any one of the first to twelfth aspects above.
  • an embodiment of the present application further provides a chip. The chip includes a processor and a data interface; the processor reads instructions stored in a memory through the data interface and executes the method described in any one of the first to twelfth aspects.
  • the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory.
  • the processor is configured to execute the method described in any one of the first to twelfth aspects.
  • an embodiment of the present application further provides an electronic device, which includes the device described in any one of the above-mentioned thirteenth to twenty-fifth aspects.
  • FIG. 1 is a schematic block diagram of a system architecture in an embodiment of the application
  • FIG. 2 is a schematic block diagram of a convolutional neural network in an embodiment of the application
  • FIG. 3 is a schematic diagram of a chip hardware structure in an embodiment of the application.
  • 4A is a schematic flowchart of a method for training a sample generator in an embodiment of the present invention
  • 4B is a schematic explanatory diagram of a training method of a sample generator in an embodiment of the present invention.
  • FIG. 5A is a schematic flowchart of a neural network compression method in an embodiment of the present invention.
  • 5B is a schematic diagram of the principle of a neural network compression method in an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of another neural network compression method in an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart of a data processing method in an embodiment of the present invention.
  • FIG. 8 is a schematic block diagram of a training device of a sample generator in an embodiment of the present invention.
  • FIG. 9 is a schematic block diagram of a sample generating apparatus in an embodiment of the present invention.
  • FIG. 10 is a schematic block diagram of a neural network compression device in an embodiment of the present invention.
  • Figure 11 is a schematic block diagram of a data processing device in an embodiment of the present invention.
  • FIG. 12 is a schematic block diagram of another training device of a sample generator in an embodiment of the present invention.
  • Figure 13 is a schematic block diagram of another sample generating device in an embodiment of the present invention.
  • FIG. 14 is a schematic block diagram of another neural network compression device in an embodiment of the present invention.
  • Fig. 15 is a schematic block diagram of another data processing device in an embodiment of the present invention.
  • the image generation method provided in the embodiments of the present application can be applied to scenes such as training and compression of a neural network whose input is an image.
  • the image generation method of the embodiment of the present application can be applied to scene A and scene B described below; the following briefly introduces each in turn.
  • the customer trains an image recognition network on training data. The image recognition network can recognize the classification of an input image; images may be classified according to the type and shape of the object described, such as airplane, car, bird, cat, deer, dog, frog, horse, boat, and truck. The training data of the image recognition network includes real pictures and the classifications corresponding to those pictures.
  • the compression equipment of the service provider (for example, a cloud platform that provides neural network compression services) needs the training data used to train the image recognition network in order to compress it. However, if the compression device is to complete the compression of the image recognition network without that data, it needs to generate training data by itself.
  • to this end, an initial image generator, which is a deep neural network, can be constructed to process an input random vector and output a generated image.
  • the generated image output by the initial image generator is input into the image recognition network to obtain the judgment classification corresponding to the generated image, that is, the probabilities that the generated image is predicted to be each of the above-mentioned classifications. It should be understood that early in the training of the initial image generator, the generated image is far from a real image, and the image recognition network determines the generated image with low confidence, that is, the probabilities of the generated image being recognized as each classification differ little.
  • the training device can determine the real classification corresponding to the input random vector and update the parameters of the initial image generator according to the difference between the real classification and the judgment classification, so that the difference between the classification determined by the image recognition network for the generator's output and its real classification becomes smaller and smaller, finally yielding the target image generator. The sample image output by the target image generator for an input random vector can then approximate the real images used to train the image recognition network. Furthermore, the compression device can compress and accelerate the image recognition network through distillation or pruning algorithms based on the generated sample images.
  • in other words, the target image generator obtained by the image generation method provided in the embodiment of the application does not need the training data of the image recognition network: the generated image is input into the trained image recognition network to obtain its judgment classification, and the target image generator is trained on the difference between the judgment classification and the real classification; moreover, the sample images generated by the target image generator can approximate the real images used to train the image recognition network.
  • the customer obtains the face attribute recognition network through training data.
  • the face attribute recognition network can recognize, from an input face image, attributes of the person described, such as gender, race, age, and emotion; different attributes belong to different classifications. The training data of the face attribute recognition network includes real face pictures and the attributes corresponding to them. Similar to the above-mentioned image recognition network, customers can ask a service provider offering neural network compression services to compress the trained face attribute recognition network.
  • the compression equipment of the service provider (for example, a cloud platform that provides neural network compression services) needs to use the training data for training the face attribute recognition network in the process of compressing the face attribute recognition network. However, it is often difficult for customers to provide training data for the face attribute recognition network. At this time, if the compression device is to complete the compression of the face attribute recognition network, it needs to generate the training data by itself.
  • an initial face image generator can be constructed.
  • the initial face image generator is a deep neural network, which can process the input random vector and output a face image.
  • the generated face image output by the initial face image generator is input into the face attribute recognition network to obtain the judgment attribute, that is, the probability that the generated face image is predicted to have each attribute among the multiple attribute classifications. It should be understood that early in the training of the initial face image generator, there is a large gap between the generated face image and a real face image, and the face attribute recognition network determines the generated face image with low confidence, that is, the probabilities of the generated face image being recognized as each type of attribute differ little.
• the training device can determine the real attribute corresponding to the generated face image obtained from the input random vector, and update the parameters of the initial face image generator according to the difference between the real attribute and the judgment attribute, so that the difference between the judgment attribute that the face attribute recognition network determines for the generated face image output by the updated generator and the real attribute becomes smaller and smaller, finally obtaining the target face image generator.
  • the sample face image output by the target face image generator processing the input random vector can approximate the real face image used by the training face attribute recognition network.
  • the compression device can realize compression and acceleration of the face attribute recognition network based on the generated sample face image through a distillation or pruning algorithm.
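The distillation step mentioned above can be sketched as follows. This is a minimal, self-contained illustration rather than the exact algorithm of this application: the trained face attribute recognition network (the teacher) and a smaller student are stand-ins implemented as toy two-class models, and the student is fitted to reproduce the teacher's soft outputs on randomly generated samples.

```python
import math, random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

# Stand-in "teacher": the trained network's soft class probabilities.
def teacher(x):
    return softmax([2.0 * x[0] - x[1], x[1] - x[0]])

# Student: a small linear model distilled from the teacher.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

def student(x):
    return softmax([W[j][0] * x[0] + W[j][1] * x[1] + b[j] for j in range(2)])

random.seed(0)
lr = 0.2
for _ in range(4000):
    x = [random.gauss(0, 1), random.gauss(0, 1)]  # stands in for a generated sample
    t = teacher(x)   # soft labels produced by the teacher
    p = student(x)
    for j in range(2):
        g = p[j] - t[j]  # gradient of cross-entropy w.r.t. the j-th logit
        W[j][0] -= lr * g * x[0]
        W[j][1] -= lr * g * x[1]
        b[j] -= lr * g
```

In a real compression pipeline, the teacher would be the trained face attribute recognition network and the student a smaller network; the key point illustrated here is that only generated samples, not the original training data, are needed.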
• the target face image generator obtained by the image generation method provided in the embodiments of this application does not need to use the real training data of the face attribute recognition network: the generated face image is input into the trained face attribute recognition network to obtain its judgment attribute, and the target face image generator is trained using the difference between the judgment attribute and the real attribute of the generated face image. Moreover, the sample face image obtained by the target face image generator can approximate the real face images used to train the face attribute recognition network.
• the sample image generated by the image generation method provided in the embodiments of this application can also be used as training data in other training scenarios for machine learning models that take images as input, which is not limited in this application. It should also be understood that a face image is one type of image.
  • the text generation method provided in the embodiments of the present application can be applied to scenarios such as training and compression of neural networks whose input is text.
  • the text generation method of the embodiment of the present application can be applied to the C scenario shown below, and the C scenario is briefly introduced below.
  • the training data of the neural network to be compressed includes real text and classifications corresponding to the real text.
  • a text generation network can be trained to generate sample text that can replace real text, and then based on the generated sample text, the compression and acceleration of the neural network to be compressed can be achieved through distillation or pruning algorithms.
  • the text can be classified by intent.
  • the intent includes: turn on the light, turn off the light, turn on the air conditioner, turn off the air conditioner, turn on the audio, etc.
• the text recognition network can be applied to scenarios such as smart homes.
• after receiving the voice, the control center converts the voice into text. Then, the control center recognizes the intention of the text through the text recognition network, and controls the corresponding device in the smart home to perform the operation associated with that intention, for example, turning on the air conditioner.
  • the intention may also include other classification methods, which are not limited in the embodiment of the present application.
  • the text can be classified in other ways.
  • the text can be classified by subject to achieve classification management of the text, which is not limited in the embodiment of the present application.
  • the embodiment of the present application takes the text recognition network used to recognize the intention of the input text as an example for illustration.
  • the training data of the text recognition network includes the real text and the classification corresponding to the real text.
  • Customers can request a service provider with neural network compression services to compress the trained text recognition network.
  • the service provider's compression equipment (for example, a cloud platform that provides neural network compression services) needs to use the training data for training the text recognition network in the process of compressing the text recognition network.
  • the customer did not provide training data for the text recognition network.
  • the compression device needs to generate training data by itself if it wants to complete the compression of the text recognition network.
  • an initial text generator can be constructed.
  • the initial text generator is a deep neural network that can process the input random vector and output the generated text.
• the generated text output by the initial text generator is input into the text recognition network to obtain the judgment intention corresponding to the generated text, that is, the probability that the generated text is predicted to be each of the above-mentioned intents. It should be understood that early in the training of the initial text generator, the generated text has a large gap with real text, and the text recognition network has low accuracy on the generated text; that is, the probabilities of the generated text being recognized as each of the above intents differ little from one another.
• the training device can determine the real intention corresponding to the input random vector, and update the parameters of the initial text generator according to the difference between the real intention and the judgment intention, so that the difference between the judgment intention that the text recognition network determines for the generated text output by the updated generator and the real intention becomes smaller and smaller, finally obtaining the target text generator.
  • the sample text output by the target text generator processing the input random vector can approximate the real text used by the training text recognition network.
  • the compression device can realize the compression and acceleration of the text recognition network based on the generated sample text through a distillation or pruning algorithm.
• the target text generator obtained by the text generation method provided in the embodiments of this application does not need to use the real training data of the text recognition network: the generated text is input into the trained text recognition network to obtain its judgment intention, and the target text generator is trained using the difference between the judgment intention and the real intention of the generated text. Moreover, the sample text obtained by the target text generator can approximate the real text used to train the text recognition network.
• the training method of the sample generator provided in the embodiments of this application involves computer vision processing or natural language processing, and can be specifically applied to data processing methods such as data training, machine learning, and deep learning. It performs symbolic and formalized intelligent information modeling, extraction, preprocessing, and training on training data (such as the first matrix in this application), finally obtaining a trained target sample generator.
• the sample generation method provided in the embodiments of this application can use the above trained target sample generator: input data (such as the second matrix in this application) is input into the trained target sample generator to obtain output data (such as the second generated sample in this application).
• it should be noted that the training method of the target sample generator and the sample generation method provided in the embodiments of this application are inventions based on the same concept, and can also be understood as two parts of a system, or two stages of an overall process: the model training stage and the model application stage.
• image recognition uses image processing, machine learning, computer graphics, and other related methods to recognize, from an image, the category to which the image belongs, the attributes of the image, and so on. For example, in scene A, the classification to which the image belongs is recognized; in scene B, the attributes of the face image are recognized.
• text recognition, also known as natural language recognition, uses linguistics, computer science, artificial intelligence, and other related methods to recognize the intention, emotion, or other attributes expressed by text. For example, in scene C, the intention expressed by the text is recognized.
  • a neural network can be composed of neural units.
• a neural unit can refer to an arithmetic unit that takes x_s and an intercept of 1 as inputs, and the output of the arithmetic unit can be: h_{W,b}(x) = f(W^T x) = f(∑_{s=1}^{n} W_s x_s + b), where s = 1, 2, …, n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer.
  • the activation function can be a sigmoid function.
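The computation of a single neural unit described above can be illustrated as follows, using the sigmoid as the activation function f (the input and weight values are arbitrary):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(xs, ws, b):
    """Output of a single neural unit: f(sum_s W_s * x_s + b),
    with f chosen here as the sigmoid activation."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return sigmoid(z)

print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))  # sigmoid(0.0) = 0.5
```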
  • a neural network is a network formed by connecting many of the above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
• a deep neural network (DNN), also known as a multi-layer neural network, can be understood as a neural network with many hidden layers; there is no special metric for "many" here.
• the layers inside a DNN can be divided into three categories: the input layer, the hidden layers, and the output layer.
  • the first layer is the input layer
  • the last layer is the output layer
• the layers in the middle are all hidden layers.
• the layers are fully connected; that is to say, any neuron in the i-th layer must be connected to any neuron in the (i+1)-th layer.
• although a DNN looks complicated, the work of each layer is not complicated.
• the parameters in the DNN are defined as follows, taking the coefficient W as an example: suppose that in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as W^3_{24}, where the superscript 3 represents the layer in which the coefficient W is located, and the subscript corresponds to the output index 2 of the third layer and the input index 4 of the second layer.
• in summary, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W^L_{jk}. It should be noted that the input layer has no W parameters. In deep neural networks, more hidden layers make the network more capable of portraying complex situations in the real world.
  • Training a deep neural network is also a process of learning a weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (a weight matrix formed by vectors W of many layers).
• a convolutional neural network (CNN) is a deep neural network with a convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolutional layer and a sub-sampling layer.
  • the feature extractor can be seen as a filter, and the convolution process can be seen as using a trainable filter to convolve with an input image or convolution feature map.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
• in a convolutional layer, a neuron can be connected to only some of the neurons in adjacent layers.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units.
• neural units in the same feature plane share weights, and the shared weights here are the convolution kernels. Weight sharing can be understood as meaning that the way image information is extracted is independent of location. The underlying principle is that the statistics of one part of an image are the same as those of other parts, which means that image information learned in one part can also be used in another part; therefore, the same learned image information can be used for all positions on the image. In the same convolutional layer, multiple convolution kernels can be used to extract different image information. Generally, the greater the number of convolution kernels, the richer the image information reflected by the convolution operation.
  • the convolution kernel can be initialized in the form of a random-sized matrix. During the training of the convolutional neural network, the convolution kernel can obtain reasonable weights through learning. In addition, the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
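The sliding-window computation performed by a convolution kernel can be sketched in a few lines. This is a minimal stride-1, "valid" convolution without padding; the edge-detecting kernel is just one illustrative choice of weight matrix.

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as commonly used in
    CNNs): slide the kernel over the image with stride 1 and sum the
    element-wise products at each position."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A vertical-edge-detecting kernel applied to a 3x4 image
img = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]
edge = [[1, -1],
        [1, -1]]
print(conv2d(img, edge))  # [[0, 2, 0], [0, 2, 0]]
```

The large responses appear exactly at the vertical edge in the middle of the image, illustrating how one weight matrix extracts one specific kind of image information.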
• recurrent neural networks (RNNs) are used to process sequence data.
• in the traditional neural network model, from the input layer to the hidden layer to the output layer, the layers are fully connected, while the nodes within each layer are disconnected from one another.
• although this ordinary neural network has solved many problems, it is still powerless for many others. For example, to predict the next word of a sentence, you generally need to use the preceding words, because the words in a sentence are not independent of one another. RNNs are called recurrent neural networks because the current output of a sequence is also related to the previous outputs.
• the specific form is that the network memorizes the previous information and applies it to the calculation of the current output; that is, the nodes in the hidden layer are no longer unconnected but connected, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous moment.
  • RNN can process sequence data of any length.
  • the training of RNN is the same as the training of traditional CNN or DNN.
• the error backpropagation algorithm is also used, but there is a difference: if the RNN is unrolled, the parameters, such as W, are shared, which is not the case with the traditional neural network mentioned above.
• in addition, the output of each step depends not only on the network of the current step, but also on the network states of the previous steps. This learning algorithm is called backpropagation through time (BPTT).
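The weight sharing across time steps described above can be illustrated with a minimal single-unit RNN (the parameter values here are arbitrary):

```python
import math

def rnn_forward(xs, w_xh, w_hh, b):
    """Minimal single-unit RNN: the same parameters (w_xh, w_hh, b) are
    shared across every time step, and each hidden state depends on both
    the current input and the previous hidden state."""
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_xh * x + w_hh * h + b)
        states.append(h)
    return states

s = rnn_forward([1.0, 0.0, 0.0], w_xh=1.0, w_hh=0.5, b=0.0)
# Even after the input drops to zero, later states remain nonzero,
# because the recurrence carries earlier information forward in time.
```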
• since the preset discriminator is a trained neural network, it can accurately identify the classification to which a real sample belongs.
• if the preset discriminator can accurately identify the classification to which the first generated sample belongs, the characteristics of the first generated sample can be considered similar to those of a real sample, that is, the first generated sample is close to a real sample.
• the loss function, also called the objective function, measures the difference between the network's predicted value and the target value.
• convolutional neural networks can use the backpropagation (BP) algorithm to modify the values of the parameters in the initial sample generator during training, so that the reconstruction error loss of the initial sample generator becomes smaller and smaller. Specifically, the input signal is propagated forward until an error loss is produced at the output, and the parameters in the initial sample generator are updated by backpropagating the error loss information, so that the error loss converges.
• the backpropagation algorithm is a backpropagation movement dominated by the error loss, and aims to obtain the optimal parameters of the target sample generator, such as the weight matrices.
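The idea of forward propagation producing an error loss and backpropagation updating the parameters can be illustrated with the smallest possible example: a single weight trained by gradient descent (the squared-error loss and values here are purely illustrative).

```python
# Toy illustration of the idea behind backpropagation: forward-propagate
# to get an error loss, then update the parameter in the direction that
# shrinks that loss. The "network" is a single weight w with prediction
# y = w * x and squared-error loss.
x, target = 2.0, 6.0   # one training pair; the optimum is w = 3
w, lr = 0.0, 0.05
for _ in range(100):
    y = w * x                    # forward pass
    loss = (y - target) ** 2     # error loss at the output
    grad = 2 * (y - target) * x  # d(loss)/dw, propagated backward
    w -= lr * grad               # gradient-descent update
print(round(w, 4))  # converges to 3.0
```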
• a generative adversarial network (GAN) is a deep learning model.
• the model includes at least two modules: one is a generative model (also called a generative network in the embodiments of this application), and the other is a discriminative model (also called a discriminative network in the embodiments of this application); the two modules learn from each other to produce better output.
• both the generative model and the discriminative model can be neural networks, specifically deep neural networks or convolutional neural networks.
• the basic principle of a GAN is as follows, taking a GAN that generates pictures as an example: suppose there are two networks, G (generator) and D (discriminator), where G is a network that generates pictures; it receives a random noise z and generates a picture from this noise, denoted G(z). D is a discriminative network used to determine whether a picture is "real". Its input is x, where x represents a picture, and the output D(x) represents the probability that x is a real picture: an output of 1 means the picture is certainly real, and an output of 0 means it cannot be real.
• the goal of the generative network G is to generate pictures as real as possible to deceive the discriminative network D, while the goal of the discriminative network D is to distinguish the pictures generated by G from real pictures.
  • G and D constitute a dynamic "game” process, that is, the "confrontation” in the "generative confrontation network”.
• in an ideal state, the result of this game is an excellent generative model G, which can be used to generate pictures.
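The "game" between G and D can be made concrete through the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))], evaluated here for hypothetical discriminator outputs:

```python
import math

def value_fn(d_real, d_fake):
    """GAN value function V(D, G): average of log D(x) over real samples
    plus average of log(1 - D(G(z))) over generated samples.
    D tries to maximize this value, while G tries to minimize it."""
    term_real = sum(math.log(p) for p in d_real) / len(d_real)
    term_fake = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return term_real + term_fake

# A discriminator that separates real from fake well gives a high value...
v_confident = value_fn([0.9, 0.95], [0.05, 0.1])
# ...while at the theoretical equilibrium D outputs 0.5 everywhere,
# giving exactly -2*log(2).
v_equilibrium = value_fn([0.5, 0.5], [0.5, 0.5])
```

As G improves, D's outputs on generated samples drift toward 0.5, and the value function falls toward its equilibrium level: this is the "confrontation" in numerical form.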
  • the pixel value of the image can be a red-green-blue (RGB) color value, and the pixel value can be a long integer representing the color.
• for example, the pixel value is 256*Red+100*Green+76*Blue, where Blue represents the blue component, Green represents the green component, and Red represents the red component. In each color component, a smaller value means lower brightness, and a larger value means higher brightness.
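The pixel value formula above can be written directly in code; note that the 256/100/76 weighting is the one stated in the text, not a standard color-space conversion:

```python
def pixel_value(red, green, blue):
    """Pack colour components into a single long-integer pixel value
    using the weighting given in the text: 256*Red + 100*Green + 76*Blue.
    Each component ranges from 0 (darkest) to 255 (brightest)."""
    return 256 * red + 100 * green + 76 * blue

print(pixel_value(255, 0, 0))  # a pure red pixel -> 65280
```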
  • the pixel values can be grayscale values.
  • an embodiment of the present invention provides a system architecture 100.
  • the data collection device 160 is used to collect or generate training data.
  • the training data includes: a first matrix, where the first matrix can be a random matrix, a random vector, etc.;
  • the training data is stored in the database 130, and the training device 120 trains the target sample generator 101 based on the training data maintained in the database 130 and the preset discriminator 121.
• the training process may include: the training device 120 inputs the first matrix into the initial sample generator 122 to obtain the first generated sample, where the initial sample generator 122 is a deep neural network; further, the first generated sample is input into the preset discriminator 121 to obtain a discrimination result, where the preset discriminator 121 is obtained through training with first training data that includes real samples and the classifications corresponding to the real samples; further, the real result of the first generated sample is determined according to the discrimination result, and the initial sample generator 122 is updated according to the difference between the real result of the first generated sample and the discrimination result, obtaining the target sample generator 101.
  • the target sample generator 101 may be a target image generator in scene A, a target face image generator in scene B, or a target text generator in scene C.
  • the target sample generator 101 can be used to implement the sample generation method provided in the embodiments of the present application, that is, input the second matrix into the target sample generator 101 to obtain the second generated sample.
  • the second generated sample can be used to implement the neural network compression method provided in the embodiment of the present application.
• for example, when the neural network to be compressed 102 is the preset discriminator 121, the second generated sample replaces the training data of the neural network to be compressed 102 (that is, the preset discriminator 121 in this application), and the neural network to be compressed 102 is compressed through a distillation algorithm or a pruning algorithm.
• for details, please refer to the relevant description in the third embodiment below, which will not be expanded here.
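One way to read the training signal described above (the generator updated by the gap between the discrimination result and the "real result" determined from it) is sketched below. Treating the discriminator's argmax class as a pseudo label is an illustrative simplification, not necessarily the exact loss of the embodiments:

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

def generator_training_loss(discriminator_logits):
    """Turn the discriminator's output on a generated sample into a
    training signal for the generator: take the argmax class as the
    pseudo 'real result', then penalize by the cross-entropy between
    that hard label and the discriminator's soft output."""
    probs = softmax(discriminator_logits)
    pseudo_label = probs.index(max(probs))  # "real result" read off the output
    return -math.log(probs[pseudo_label])

# A confident discrimination result gives a small loss; an uncertain one
# (the generated sample resembles no class clearly) gives a larger loss,
# pushing the generator toward samples the discriminator classifies firmly.
loss_confident = generator_training_loss([4.0, 0.0, 0.0])
loss_uncertain = generator_training_loss([0.1, 0.0, 0.05])
```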
• for example, when the target sample generator 101 is the target image generator, the preset discriminator 121 and the neural network to be compressed 102 are both image recognition networks; the second matrix is input into the target image generator to generate image samples, which can be further applied to the compression and acceleration of the image recognition network.
• for example, when the target sample generator 101 is the target face image generator, the preset discriminator 121 and the neural network to be compressed 102 are both face attribute recognition networks; the second matrix is input into the target face image generator to generate face image samples, which can be further applied to the compression and acceleration of the face attribute recognition network.
• for example, when the target sample generator 101 is the target text generator, the preset discriminator 121 and the neural network to be compressed 102 are both text recognition networks; the second matrix is input into the target text generator to generate text samples, which can be further applied to the compression and acceleration of the text recognition network.
  • the target sample generator is obtained by training a deep neural network.
  • the preset discriminator in the embodiment of the present application is a deep neural network model obtained through pre-training.
  • the training data maintained in the database 130 may not all come from the collection of the data collection device 160, and may also be received from other devices.
• the training device 120 does not necessarily train the target sample generator 101 entirely based on the training data maintained in the database 130; it may also obtain training data from the cloud or generate training data for model training. The above description should not be construed as a limitation on the embodiments of this application.
• the target sample generator 101 trained by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 1, which can be a terminal, such as a mobile phone terminal, a tablet computer, a laptop, an AR/VR device, or a vehicle-mounted terminal, and can also be a server or a cloud.
  • the execution device 110 may execute the sample generation method, image generation method, text generation method, etc. in the embodiments of the present application.
  • the compression device 170 is equipped with an I/O interface 172 for data interaction with external devices.
  • the user can input data to the I/O interface 172 through the client device 140.
• in the embodiment of the present application, the input data may include: the neural network 102 to be compressed, together with a request for the compression device 170 to compress the neural network 102 to be compressed.
• the compression device 170 can call data, code, and the like in the data storage system 150 for corresponding processing, and can also store the data, instructions, and the like obtained from the corresponding processing in the data storage system 150.
• the I/O interface 172 returns the processing result, such as the compressed neural network obtained by the aforementioned neural network compression method, to the client device 140, so that the client device 140 can provide it to the user device 180.
• the user equipment 180 may be a lightweight terminal that needs to use the compressed neural network, such as a mobile phone terminal, a notebook computer, an AR/VR terminal, or a vehicle-mounted terminal, to respond to the needs of the end user, for example, performing image recognition on an image input by the end user and outputting the recognition result to the end user, or performing text classification on text input by the end user and outputting the classification result to the end user.
• it is worth noting that the training device 120 can generate a corresponding target sample generator 101 for different goals or tasks based on different training data, and the corresponding target sample generator 101 can be used to achieve the aforementioned sample generation or complete the aforementioned tasks, thereby providing the user with the desired result.
  • the customer can manually set input data (for example, the neural network to be compressed in the embodiment of the present application), and the manual setting can be operated through the interface provided by the I/O interface 172.
• in another case, the client device 140 can automatically send input data to the I/O interface 172. If automatic sending of the input data requires the user's authorization, the user can set the corresponding permission in the client device 140. The customer can view the results output by the compression device 170 on the client device 140.
• after the client device 140 receives the compressed neural network, it can transmit the compressed neural network to the user device 180, which can be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, an AR/VR device, or a vehicle-mounted terminal.
  • the user equipment 180 runs the compressed neural network to realize the function of the compressed neural network.
• in another case, the compression device 170 may also directly provide the compressed neural network to the user equipment 180, which is not limited here.
  • FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • the data storage system 150 is an external memory relative to the compression device 170. In other cases, the data storage system 150 can also be placed in the compression device 170.
  • the target sample generator 101 is obtained by training according to the training device 120.
• the target sample generator 101 can be the target image generator in scene A, the target face image generator in scene B, or the target text generator in scene C. Specifically, the target sample generator 101, the target image generator, the target face image generator, and the target text generator provided in the embodiments of the present application can all be machine learning models such as convolutional neural networks or recurrent neural networks.
• the compressed neural network may be the compressed image recognition network in scene A, the compressed face attribute recognition network in scene B, or the compressed text recognition network in scene C.
  • the aforementioned neural network may be a deep neural network machine learning model such as a convolutional neural network or a recurrent neural network.
  • a convolutional neural network is a deep neural network with a convolutional structure and a deep learning architecture.
• the deep learning architecture refers to performing multiple levels of learning at different abstraction levels through machine learning algorithms.
• as a deep learning architecture, a CNN is a feed-forward artificial neural network in which each neuron can respond to the image input into it.
  • a convolutional neural network (CNN) 200 may include an input layer 210, a convolutional layer/pooling layer 220 (the pooling layer is optional), and a neural network layer 230.
• the convolutional layer/pooling layer 220 shown in Figure 2 may include layers 221-226 as examples. In one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer. In another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 221 can include many convolution operators.
• the convolution operator, also called a kernel, functions in image processing as a filter that extracts specific information from the input image matrix.
• the convolution operator can essentially be a weight matrix, which is usually predefined. During convolution on an image, the weight matrix is usually moved along the horizontal direction of the input image one pixel at a time (or two pixels at a time, and so on, depending on the value of the stride) to extract specific features from the image.
• the size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during the convolution operation, the weight matrix extends over the entire depth of the input image. Therefore, convolving with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple homogeneous matrices, are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolutional image, where the dimension can be understood as determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
• the multiple weight matrices have the same size (rows × columns), so the feature maps extracted by these weight matrices also have the same size, and the multiple extracted feature maps of the same size are then combined to form the output of the convolution operation.
• in practical applications, the weight values in these weight matrices need to be obtained through extensive training; each weight matrix formed by the trained weight values can be used to extract information from the input image, enabling the convolutional neural network 200 to make correct predictions.
• the initial convolutional layer (such as 221) often extracts more general features, which can also be called low-level features; as the depth of the convolutional neural network 200 increases, the features extracted by the subsequent convolutional layers (for example, 226) become more and more complex, such as high-level semantic features, and features with higher semantics are more applicable to the problem to be solved.
• a pooling layer may follow a convolutional layer: it can be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain a smaller size image.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
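  • As a sketch (not from the patent), average and maximum pooling over fixed sub-regions can be written as follows; the 2×2 window size is an arbitrary example:

```python
import numpy as np

def pool2d(image, size, mode="max"):
    """Downsample: each output pixel is the max or average of a size×size sub-region."""
    h, w = image.shape[0] // size, image.shape[1] // size
    blocks = image[:h * size, :w * size].reshape(h, size, w, size)
    reduce = np.max if mode == "max" else np.mean
    return reduce(reduce(blocks, axis=3), axis=1)

image = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(image, 2, "max"))  # [[ 5.  7.] [13. 15.]]
print(pool2d(image, 2, "avg"))  # [[ 2.5  4.5] [10.5 12.5]]
```

The output image is smaller than the input, matching the description above.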
  • After processing by the convolutional layer/pooling layer 220, the convolutional neural network 200 is not yet able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 220 only extracts features and reduces the parameters brought by the input image. To generate the final output information (the required class information or other related information), the convolutional neural network 200 needs to use the neural network layer 230 to generate one output or a group of outputs of the required number of classes. Therefore, the neural network layer 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 2) and an output layer 240. The parameters contained in the multiple hidden layers can be pre-trained based on relevant training data of a specific task type; for example, the task type can include image generation, text generation, sample generation, and so on.
  • After the multiple hidden layers in the neural network layer 230, the final layer of the entire convolutional neural network 200 is the output layer 240.
  • the output layer 240 has a loss function similar to categorical cross entropy, which is specifically used to calculate the prediction error.
  • the convolutional neural network 200 shown in FIG. 2 is only used as an example of a convolutional neural network. In specific applications, the convolutional neural network may also exist in the form of other network models.
  • FIG. 3 is a hardware structure of a chip provided by an embodiment of the present invention.
  • the chip includes a neural network processor 30.
  • the chip can be set in the execution device 110 as shown in FIG. 1 to complete the calculation work of the calculation module 171.
  • the chip can also be set in the training device 120 as shown in FIG. 1 to complete the training work of the training device 120 and output the target model/rule 101.
  • the algorithms of each layer in the convolutional neural network as shown in Figure 2 can be implemented in the chip as shown in Figure 3.
  • the neural network processor 30 may be any processor suitable for large-scale XOR operation processing such as NPU, TPU, or GPU.
  • NPU can be mounted on the host CPU (Host CPU) as a coprocessor, and the host CPU assigns tasks to it.
  • the core part of the NPU is the arithmetic circuit 303.
  • the arithmetic circuit 303 is controlled by the controller 304 to extract matrix data in the memory (301 and 302) and perform multiplication and addition operations.
  • the arithmetic circuit 303 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 303 is a two-dimensional systolic array. The arithmetic circuit 303 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 303 is a general-purpose matrix processor.
  • the arithmetic circuit 303 fetches the weight data of the matrix B from the weight memory 302 and caches it on each PE in the arithmetic circuit 303.
  • the arithmetic circuit 303 fetches the input data of matrix A from the input memory 301, and performs matrix operations based on the input data of matrix A and the weight data of matrix B, and the partial or final result of the obtained matrix is stored in the accumulator 308 .
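  • The data flow described above (weights of matrix B cached, inputs of matrix A streamed in, partial results accumulated) can be sketched in NumPy; the matrix sizes are arbitrary examples:

```python
import numpy as np

A = np.arange(6, dtype=float).reshape(2, 3)    # input data of matrix A (input memory 301)
B = np.arange(12, dtype=float).reshape(3, 4)   # weight data of matrix B (weight memory 302)

acc = np.zeros((2, 4))        # plays the role of the accumulator 308
for k in range(A.shape[1]):   # each step contributes one rank-1 partial result
    acc += np.outer(A[:, k], B[k, :])

assert np.array_equal(acc, A @ B)  # the accumulated partial results equal the matrix product
```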
  • the unified memory 306 is used to store input data and output data.
  • the weight data is directly transferred to the weight memory 302 through the direct memory access controller (DMAC, Direct Memory Access Controller) 305 of the storage unit.
  • the input data is also transferred to the unified memory 306 through the DMAC.
  • DMAC: Direct Memory Access Controller.
  • the bus interface unit (BIU, Bus Interface Unit) 310 is used for the interaction between the DMAC and the instruction fetch buffer (Instruction Fetch Buffer) 309; the bus interface unit 310 is also used by the instruction fetch memory 309 to obtain instructions from the external memory, and by the storage unit access controller 305 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 306, or to transfer the weight data to the weight memory 302, or to transfer the input data to the input memory 301.
  • the vector calculation unit 307 has multiple arithmetic processing units which, if necessary, further process the output of the arithmetic circuit 303, such as vector multiplication, vector addition, exponential operations, logarithmic operations, size comparison, and so on.
  • the vector calculation unit 307 is mainly used for calculation of non-convolutional layers or fully connected layers (FC, fully connected layers) in the neural network. Specifically, it can process: Pooling (pooling), Normalization (normalization), etc. calculations.
  • the vector calculation unit 307 may apply a nonlinear function to the output of the arithmetic circuit 303, such as a vector of accumulated values, to generate the activation value.
  • the vector calculation unit 307 generates a normalized value, a combined value, or both.
  • the vector calculation unit 307 stores the processed vector to the unified memory 306.
  • the vector processed by the vector calculation unit 307 can be used as the activation input of the arithmetic circuit 303, for example, for use in subsequent layers in a neural network, as shown in FIG. 2, if the current processing layer is a hidden layer 1 (231), the vector processed by the vector calculation unit 307 can also be used for calculation in the hidden layer 2 (232).
  • the instruction fetch buffer 309 connected to the controller 304 is used to store instructions used by the controller 304;
  • the unified memory 306, the input memory 301, the weight memory 302, and the fetch memory 309 are all On-Chip memories.
  • the external memory is independent of the NPU hardware architecture.
  • each layer in the convolutional neural network shown in FIG. 2 can be executed by the arithmetic circuit 303 or the vector calculation unit 307.
  • the following embodiments 1, 2, 3, 4, and 5 can be applied to the above-mentioned scene A, scene B, and scene C.
  • in scene A, the "initial sample generator" is the "initial image generator"
  • the "preset discriminator" is the "image recognition network"
  • the "first generated sample" is the "generated image"
  • the "real sample" is the "real image"
  • the "judgment result" is the "judgment classification"
  • the "true result" is the "true classification"
  • the "second generated sample" is the "sample image"
  • in scene B, the "initial sample generator" is the "initial face image generator"
  • the "preset discriminator" is the "face attribute recognition network"
  • the "first generated sample" is the "generated face image"
  • the "real sample" is the "real face image"
  • the "judgment result" is the "judgment attribute"
  • the "true result" is the "real attribute"
  • the "second generated sample" is the "sample face image"
  • in scene C, the "initial sample generator" is the "initial text generator"
  • the "preset discriminator" is the "text recognition network"
  • the "first generated sample" is the "generated text"
  • the "real sample" is the "real text"
  • the "judgment result" is the "judgment intent"
  • the "true result" is the "true intent"
  • the "second generated sample" is the "sample text"
  • FIG. 4A is a training method of a sample generator provided in Embodiment 1 of the present invention
  • FIG. 4B is a schematic explanatory diagram of a training method of a sample generator, and the method may be specifically executed by the training device 120 shown in FIG. 1.
  • the method can be processed by the CPU, or jointly by the CPU and the GPU; alternatively, the GPU may not be used and another processor suitable for neural network calculations may be used, such as the neural network processor 30 shown in FIG. 3, which is not restricted in this application.
  • the method may include some or all of the following steps:
  • the first matrix may be a stochastic matrix, a random vector, or other forms of matrix, which is not limited in this application.
  • the first matrix may be generated by the training device 120, or may be pre-generated by other functional modules before the training device 120, or may be obtained from the database 130, etc., which is not limited in this application.
  • a stochastic matrix is also called a probability matrix or Markov matrix; each element in the matrix is a non-negative real number representing a probability, and the sum of all elements in the matrix is 1.
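  • A random vector satisfying this description (non-negative elements summing to 1) can be generated, for example, as follows; the length 10 and the seed are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.random(10)   # non-negative random elements
z = z / z.sum()      # normalize so that all elements sum to 1

assert np.all(z >= 0) and np.isclose(z.sum(), 1.0)
```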
  • the training device in this application may use a single sample, multiple samples, or all samples for one training process of the sample generator, which is not limited in the embodiment of this application.
  • N samples are used in a training process
  • the training device receives N random vectors, which can be expressed as {z_1, z_2, ..., z_i, ..., z_N}, where N is a positive integer and the i-th of the N random vectors is denoted as z_i; i is the index of the random vector in the group of random vectors, and i is a positive integer not greater than N.
  • the first matrix in the embodiment of the present application is a general matrix, and the values of the elements in each of the N first matrices may differ from one another; for example, z_1 ≠ z_2 ≠ z_3 ≠ ... ≠ z_N.
  • the initial sample generator may be an initialized deep neural network, or a deep neural network generated during the training process.
  • the training device processes the input first matrix through the initial sample generator, and the initial sample generator outputs the generated image.
  • the first generated sample is a matrix composed of multiple pixel value points.
  • when the N random vectors {z_1, z_2, ..., z_i, ..., z_N} are input to the initial sample generator, N first generated samples {x_1, x_2, ..., x_i, ..., x_N} are obtained, where the random vectors correspond one-to-one to the first generated samples; that is, the initial sample generator processes the input random vector z_i to obtain the first generated sample x_i.
  • the “first generated sample” is a sample generated by the training device through the initial sample generator based on the input first matrix; the “real sample” is the sample used for training the preset discriminator.
  • S406 Input the first generated sample into a preset discriminator to obtain a discrimination result, where the discrimination result includes the probability that the first generated sample is predicted to be each of M classifications, and the preset discriminator is obtained by training on first training data, where the first training data includes real samples and the classifications corresponding to the real samples.
  • the preset discriminator is a deep neural network pre-trained through the first training data, and the preset discriminator can identify the classification of the input sample.
  • the preset discriminator is a known model, and its training is a method used in training models in the prior art, which is not limited in this application.
  • the N first generated samples {x_1, x_2, ..., x_i, ..., x_N} are input into the preset discriminator to obtain N discrimination results {y_1, y_2, ..., y_i, ..., y_N}, where the N first generated samples correspond one-to-one to the N discrimination results; that is, the preset discriminator processes the input first generated sample x_i to obtain the discrimination result y_i.
  • S408 Determine the maximum probability among the probabilities corresponding to each of the M categories, and determine the category corresponding to the maximum probability as the true result of the first generated sample.
  • the discrimination result y_i corresponding to the first generated sample x_i may be used to determine the true result t_i corresponding to x_i.
  • the true result t_i may be the classification corresponding to the maximum probability value in the discrimination result y_i; that is, the probability of the classification corresponding to the maximum probability among the M classifications is set to 1, and the probabilities of the other classifications are set to 0.
  • N real results can be obtained; the discrimination results correspond one-to-one to the real results, and the N real results can be expressed as {t_1, t_2, ..., t_i, ..., t_N}.
  • the preset discriminator is an image recognition network as shown in scene A, and the M categories include dog, cat, and chicken.
  • if the discrimination result of the generated image input to the image recognition network is: the probability of a dog is 0.5, the probability of a cat is 0.2, and the probability of a chicken is 0.3, the discrimination result can be expressed as {0.5, 0.2, 0.3}; the true result of the generated image is a dog, which can be expressed as {1, 0, 0}.
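  • The dog/cat/chicken example above corresponds to a simple argmax-and-one-hot operation:

```python
import numpy as np

y = np.array([0.5, 0.2, 0.3])  # discrimination result over (dog, cat, chicken)

# The true result keeps only the classification with the maximum probability.
t = np.zeros_like(y)
t[np.argmax(y)] = 1.0
print(t)  # [1. 0. 0.]
```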
  • In other embodiments, other manners may also be used to determine which of the M classifications is taken as the true result corresponding to the first generated sample, which is not limited in the embodiment of the present application.
  • S410 Update the parameters of the sample generator according to the true result and the discrimination result of the first generated sample to obtain an updated sample generator.
  • the training of a GAN network requires iterative training of a discriminator and a generator, and the discriminator needs to be trained based on real samples and generated samples output by the generator.
  • the preset discriminator is a trained deep neural network, and the real samples used for training the preset discriminator are not available.
  • the preset discriminator is used to identify the classification of the generated samples; when the first generated sample generated by the initial sample generator can be accurately identified by the preset discriminator (that is, when the difference between the preset discriminator's discrimination result for the first generated sample and the true result tends to 0), the generated samples obtained by the sample generator are considered able to replace the real samples used to train the discriminator.
  • the training of the sample generator may adopt a back propagation algorithm to modify the size of the parameters in the sample generator during the training process, so that the reconstruction error loss of the sample generator becomes smaller and smaller.
  • the training device can determine the loss corresponding to the N first generated samples according to the difference between the discrimination result of each first generated sample and its real result, and update the parameters of the sample generator through an optimization algorithm according to that loss; during the training process, the parameters of the preset discriminator remain unchanged, and only the parameters of the sample generator are updated.
  • the optimization algorithm may be a gradient descent method or other optimization algorithms, which is not limited in the embodiment of the present application.
  • a loss function can be used to calculate the loss corresponding to the N first generated samples; the loss function can include a first loss term determined by the difference between the discrimination result and the true result, where the first loss term can be the mean absolute error (MAE), mean squared error (MSE), or root mean squared error (RMSE) between the discrimination result and the true result, or the cross entropy of the discrimination result and the true result; it may also have other forms, which is not limited in this application.
  • In a possible implementation, the first loss term L_c can be expressed by cross entropy: L_c = (1/N) Σ_{i=1..N} H_c(y_i, t_i), where H_c(y_i, t_i) is the cross entropy of the discrimination result y_i and the true result t_i, which can be expressed as: H_c(y_i, t_i) = -Σ_{j=1..M} t_{i,j} log(y_{i,j}).
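  • A sketch of this first loss term in NumPy (the epsilon guard is an implementation detail for numerical stability, not from the patent):

```python
import numpy as np

def first_loss(y, t, eps=1e-12):
    """Average cross entropy between discrimination results y and true results t, both (N, M)."""
    return -np.mean(np.sum(t * np.log(y + eps), axis=1))

y = np.array([[0.5, 0.2, 0.3],
              [0.1, 0.8, 0.1]])  # discrimination results
t = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])  # one-hot true results
print(first_loss(y, t))  # equals (-log 0.5 - log 0.8) / 2 ≈ 0.4581
```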
  • the loss function can also include a second loss term determined by the features of the first generated sample.
  • In a possible implementation, the training method of the sample generator further includes: the training device extracts the features of the first generated sample through the preset discriminator, where the feature of the first generated sample may be the feature output by any convolutional layer in the preset discriminator after the first generated sample is input to the preset discriminator.
  • the feature of the first generated sample is the feature of the output of the last convolutional layer in the preset discriminator, that is, the high-order feature of the first generated sample.
  • high-level features are high-level semantic features.
  • high-level features may be semantic features.
  • the feature corresponding to the generated sample x_i can be expressed as f_i; the N first generated samples yield N features {f_1, f_2, ..., f_i, ..., f_N}, where f_i = {f_{i,1}, f_{i,2}, ..., f_{i,k}, ..., f_{i,P}} and P is a positive integer.
  • In a possible implementation, the second loss term L_f can be expressed as: L_f = -(1/N) Σ_{i=1..N} ||f_i||_1, where ||f_i||_1 represents the 1-norm of the matrix f_i, that is, the sum of the absolute values of all the elements of f_i.
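  • Assuming the second loss term is the negated mean 1-norm of the features (an assumption; the negation makes minimizing the loss encourage strong feature responses), it can be sketched as:

```python
import numpy as np

def second_loss(features):
    """Negated mean 1-norm of the feature vectors f_i, shape (N, P)."""
    return -np.mean(np.sum(np.abs(features), axis=1))

f = np.array([[1.0, -2.0],
              [0.5, 0.5]])
print(second_loss(f))  # -((3.0 + 1.0) / 2) = -2.0
```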
  • the second loss item L f may also include other forms, for example, the second loss item L f is the average value of the 2-norm of N features, etc., which is not limited in the embodiment of the present application.
  • the average probability values of the generated image being predicted to be a cat and a chicken are 0.225 and 0.4 respectively.
  • In a possible implementation, the loss function may further include a third loss term L_in determined from the average, over the N discrimination results, of the probability of each of the M classifications. L_in can be expressed as: L_in = Σ_{j=1..M} v_j log(v_j), where v_j = (1/N) Σ_{i=1..N} y_{i,j} is the average probability that the N first generated samples are predicted to be classification j.
  • In other embodiments, L_in may also have other forms of representation; for example, the third loss term L_in may be the mean absolute error, mean squared error, or root mean squared error between the average probability vector v and 1/M, which is not limited in the embodiment of this application.
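  • Assuming the third loss term is the negated information entropy of the average probability vector v (an assumption consistent with the description above), a sketch is:

```python
import numpy as np

def third_loss(y, eps=1e-12):
    """Negated information entropy of the average predicted probability over (N, M) results."""
    v = np.mean(y, axis=0)             # average probability of each classification
    return np.sum(v * np.log(v + eps))

balanced = np.array([[0.5, 0.5], [0.5, 0.5]])
skewed = np.array([[1.0, 0.0], [1.0, 0.0]])
assert third_loss(balanced) < third_loss(skewed)  # evenly spread predictions give a lower loss
```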
  • the training device can update the parameters of the sample generator according to the difference between the judgment result of the first generated sample and the real result.
  • In a possible implementation, the training device can update the model parameters of the initial sample generator by using the first loss term determined by the difference between the discrimination results of the N first generated samples and the real results. In this case, the loss function can be expressed as: L = L_c.
  • the training device may update the parameters of the initial image generator according to the difference between the discrimination result of the first generated sample and the real result and the characteristics of the first generated sample.
  • In a possible implementation, the training device can update the model parameters of the initial sample generator by using the first loss term determined by the difference between the discrimination results of the N first generated samples and the real results, together with the second loss term determined by the features of the N first generated samples. In this case, the loss function can be expressed as: L = L_c + αL_f, where α is a weighting coefficient.
  • the training device can update the parameters of the initial image generator according to the difference between the discrimination result of the first generated sample and the real result, the characteristics of the first generated sample, and the average probability of each of the M categories.
  • In a possible implementation, the training device can update the model parameters of the initial sample generator by using the first loss term determined by the difference between the discrimination results of the N first generated samples and the real results, the second loss term determined by the features of the N first generated samples, and the third loss term determined from the average probability of each of the M classifications obtained by statistics over the discrimination results of the N first generated samples. In this case, the loss function can be expressed as: L = L_c + αL_f + βL_in, where α and β are weighting coefficients.
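  • Combining the three terms, one generator update step could be sketched as follows; the weighting coefficients alpha and beta are illustrative placeholders, not values from the patent:

```python
import numpy as np

def total_loss(y, t, features, alpha=0.1, beta=5.0, eps=1e-12):
    """L = L_c + alpha * L_f + beta * L_in, computed from discrimination results,
    one-hot true results, and discriminator features of N generated samples."""
    l_c = -np.mean(np.sum(t * np.log(y + eps), axis=1))   # first loss term
    l_f = -np.mean(np.sum(np.abs(features), axis=1))      # second loss term
    v = np.mean(y, axis=0)
    l_in = np.sum(v * np.log(v + eps))                    # third loss term
    return l_c + alpha * l_f + beta * l_in

y = np.array([[0.5, 0.2, 0.3], [0.1, 0.8, 0.1]])
t = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
f = np.array([[1.0, -2.0], [0.5, 0.5]])
loss = total_loss(y, t, f)  # scalar; used to update only the generator's parameters
print(loss)
```

During training, this loss would be backpropagated to the sample generator while the preset discriminator's parameters are kept fixed.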
  • the preset discriminator in the embodiment of the present application may be a neural network, a deep neural network, a convolutional neural network, a recurrent neural network, etc., which is not limited in the embodiment of the present application.
  • the image recognition network can be a convolutional neural network
  • the face attribute recognition network can be a convolutional neural network
  • the text recognition network can be a recurrent neural network.
  • the sample generation method in the embodiment of the present application may be executed by the execution device 110.
  • the execution device 110 is configured with a target sample generator.
  • the method can also be processed by the CPU, or jointly by the CPU and the GPU; alternatively, the GPU may not be used and another processor suitable for neural network calculations may be used, such as the neural network processor 30 shown in FIG. 3, which is not restricted in this application.
  • the method includes: inputting the second matrix into the target sample generator to obtain the second generated sample.
  • the second matrix and the first matrix have the same format and type, that is, the order of the matrix is the same, and the type of the matrix may include a random matrix or a random vector.
  • the target sample generator is a target sample generator obtained through training in the first embodiment above. For the specific training method, refer to the related description in the first embodiment above, and will not be repeated here.
  • the first generated sample is the generated sample generated by the initial sample generator, and there is a big difference between the attributes of the first generated sample and the attributes of the real sample; and the second generated sample is the generated sample generated by the target sample generator.
  • the target sample generator is a trained model that has learned the attributes of the real samples; therefore, the attributes of the second generated sample are close to those of the real samples, and the second generated sample can replace the real samples used to train the preset discriminator.
  • the second generated sample can be used instead of the real sample to implement neural network model training, compression, etc.
  • the compression device can obtain a second generated sample generated by the sample generation method described in the second embodiment, where the preset discriminator at this time is the neural network to be compressed; the compression device inputs the second generated sample to the neural network to be compressed to obtain the classification corresponding to the second generated sample; further, the neural network to be compressed is compressed according to the second generated sample and its corresponding classification.
  • the neural network to be compressed is a black box, which only provides input and output interfaces.
  • the structure and parameters of the neural network to be compressed are unknown.
  • the real samples for training the neural network to be compressed are not available, and the second generated samples generated by the target sample generator in the embodiment of this application are unlabeled.
  • with reference to the compression method of the neural network shown in FIG. 5A and the schematic diagram of that compression method shown in FIG. 5B, compressing the neural network to be compressed using the unlabeled second generated samples through a distillation algorithm is introduced; FIG. 5B is illustrated by taking the sample generator as the target image generator in scene A as an example.
  • the compression method of the neural network in the embodiment of the present application may be executed by the compression device 170.
  • the method can also be processed by the CPU, or jointly by the CPU and the GPU; alternatively, the GPU may not be used and another processor suitable for neural network calculations may be used, such as the neural network processor 30 shown in FIG. 3, which is not restricted in this application.
  • the method may include but is not limited to the following steps:
  • the neural network to be compressed is the preset discriminator in the first and second embodiments above.
  • a target sample generator suitable for the neural network to be compressed is obtained through training, so that the second generated samples output by the target sample generator have characteristics similar to the training samples of the neural network to be compressed; further, the method described in the second embodiment can generate the second generated samples for the neural network to be compressed. For the training method of the target sample generator suitable for the neural network to be compressed, refer to the related description in the first embodiment; for the method for generating the second generated samples, refer to the related description in the second embodiment, which will not be repeated here.
  • the N second matrices {z′_1, z′_2, ..., z′_i, ..., z′_N} are input to the target sample generator adapted to the neural network to be compressed, to obtain N second generated samples {x′_1, x′_2, ..., x′_i, ..., x′_N}.
  • S502 Input the second generated sample to the neural network to be compressed, and obtain the real result corresponding to the second generated sample;
  • the N second generated samples are input to the neural network to be compressed, to obtain the true result corresponding to each of the N second generated samples, {y′_1, y′_2, ..., y′_i, ..., y′_N}.
  • the true result may be the classification corresponding to the largest probability among the probabilities corresponding to each of the M classifications, that is, the probability that the second generated sample is identified as the classification corresponding to the largest probability is 1, and the probability of being the other classifications is all 0.
  • the probabilities corresponding to the M classifications obtained by processing the second generated sample by the to-be-compressed network can also be directly used as the true result.
  • the second generated sample has similar characteristics to the real samples used in the training of the neural network to be compressed, and is a reliable sample
  • because the neural network to be compressed is a trained neural network, when it is given a reliable input (the second generated sample), a reliable output can be obtained; that is, the output obtained by processing the second generated sample with the neural network to be compressed is the real result corresponding to the second generated sample, and can be used as the label of the second generated sample.
  • S506 Train an initial neural network through second training data to obtain a compressed neural network, where the second training data includes the second generated sample and the real result corresponding to the second generated sample, the initial neural network is a deep neural network, and the initial neural network has fewer model parameters than the neural network to be compressed.
  • a distillation algorithm is used to compress the neural network to be compressed.
  • a neural network, namely the initial neural network, is constructed with a simpler structure and fewer model parameters than the neural network to be compressed.
  • through the teacher-student learning strategy, the original complex neural network to be compressed is compressed into a low-complexity student network; without losing too much model accuracy, the low-complexity student network can achieve high computational efficiency and low storage overhead.
  • How to construct the initial neural network, determining the hyperparameters of the initial neural network, and training the compressed neural network based on the second training data are existing technologies, which are not limited.
  • N second generated samples are input to the initial neural network, and the prediction results corresponding to the N second generated samples are obtained, {y^s_1, y^s_2, ..., y^s_i, ..., y^s_N}.
  • the compression device can determine the loss corresponding to the N second generated samples according to the difference between the predicted result of each second generated sample and the real result, and update the parameters of the initial neural network through the optimization algorithm according to the loss corresponding to the N second generated samples .
  • the loss function L1 is used to calculate the loss corresponding to the N second generated samples.
  • the loss function L1 can also be the mean absolute error, mean squared error, or root mean squared error between the predicted result and the real result, or the cross entropy of the predicted result and the real result; it may also have other forms, which is not limited in this application.
  • In a possible implementation, the loss function L1 can be expressed as: L1 = -(1/N) Σ_{i=1..N} Σ_{j=1..M} y′_{i,j} log(y^s_{i,j}), where y′_{i,j} represents the probability that the second generated sample x′_i is predicted to be classification j by the neural network to be compressed, y^s_{i,j} represents the probability that x′_i is predicted to be classification j by the initial neural network, j is the index of the classification, j is a positive integer not greater than M, and M is a positive integer greater than 1.
  • L1 may also include other forms, which are not limited in the embodiment of the present application.
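  • The distillation step can be sketched as follows; the teacher/student probability arrays are illustrative values, and in practice they would be the outputs of the neural network to be compressed and the initial neural network on the second generated samples:

```python
import numpy as np

def distill_loss(teacher_probs, student_probs, eps=1e-12):
    """Cross entropy between teacher labels y' and student predictions y^s, both (N, M)."""
    return -np.mean(np.sum(teacher_probs * np.log(student_probs + eps), axis=1))

teacher = np.array([[0.9, 0.1], [0.2, 0.8]])  # "real results" produced by the teacher
student = np.array([[0.8, 0.2], [0.3, 0.7]])  # predictions of the low-complexity student
print(distill_loss(teacher, student))
assert distill_loss(teacher, teacher) < distill_loss(teacher, student)  # loss falls as the student matches the teacher
```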
  • the target sample generator generates second generated samples with characteristics similar to the real samples used for training the neural network to be compressed, and the result predicted for each second generated sample by the neural network to be compressed is used as its label. By training a low-complexity neural network with the second generated samples and their labels, a neural network consistent with the function of the neural network to be compressed is obtained; this neural network is the compressed neural network, and compression of the neural network to be compressed is thus realized without training samples.
  • the compressed neural network can be applied to lightweight devices such as terminals, thereby reducing computing loss, reducing storage overhead, and improving computing efficiency.
  • the specific structure of the neural network to be compressed is known.
  • the real samples for training the neural network to be compressed are not available and the second generated sample generated by the target sample generator in the embodiment of this application is unlabeled.
  • the redundant connections in the neural network to be compressed are discarded by a pruning algorithm to obtain a simplified neural network, and the second generated samples are labeled by the neural network to be compressed; the labeled generated samples are then used as training data to train the simplified neural network to obtain the compressed neural network, thereby reducing the complexity of the neural network to be compressed, improving computing efficiency, and reducing storage overhead.
  • This method can be executed by the compression device 170.
• the method may be processed by the CPU, jointly processed by the CPU and GPU, or, instead of a GPU, by another processor suitable for neural network calculations, such as the neural network processor 30 shown in FIG. 3; this application is not restricted in this regard.
  • the compression method of the neural network may include some or all of the following steps:
  • S604 Input the second generated sample to the neural network to be compressed, and obtain the real result corresponding to the second generated sample.
• the parameters of the neural network include the weight parameters in each convolutional layer. It should be understood that the greater the absolute value of a weight, the greater the contribution of the neuron corresponding to that weight parameter to the output of the neural network, and the more important it is to the neural network.
• the compression device prunes some or all of the convolutional layers in the neural network to be compressed, that is, removes the neurons corresponding to the weight parameters whose absolute value is less than the first threshold in each convolutional layer, to obtain a simplified neural network.
• the compression device can sort the neurons in the neural network to be compressed according to their importance, and then remove multiple neurons of lower importance from the neural network to be compressed, thereby obtaining a simplified neural network.
• importance refers to the magnitude of a neuron's contribution to the output result; a neuron that makes a larger contribution to the output result has greater importance.
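The magnitude-based pruning described above can be sketched as follows; the threshold value and array shapes are illustrative assumptions:

```python
import numpy as np

def prune_by_magnitude(weights, threshold):
    """Zero out weights whose absolute value falls below the threshold,
    removing the corresponding low-contribution connections."""
    mask = np.abs(weights) >= threshold  # True for weights to keep
    return weights * mask, mask

# Toy weight matrix of one convolutional layer (flattened view).
w = np.array([[0.50, -0.02, 0.30],
              [0.01,  0.80, -0.05]])
pruned, mask = prune_by_magnitude(w, threshold=0.1)
```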
  • S608 Train the simplified neural network through the third training data to obtain a compressed neural network, where the third training data includes a second generated sample and a real result corresponding to the second generated sample.
  • the simplified neural network has fewer parameters and a simpler network structure.
• the N second generated samples are input to the simplified neural network, and the prediction results corresponding to the N second generated samples are obtained: {y_1^h, y_2^h, …, y_i^h, …, y_N^h}.
  • the compression device can determine the loss corresponding to the N second generated samples according to the difference between the predicted result of each second generated sample and the real result, and update the simplified neural network through the optimization algorithm according to the loss corresponding to the N second generated samples Parameters.
  • the loss function L2 is used to calculate the loss corresponding to the N second generated samples.
• the loss function L2 can also be the mean absolute error, mean square error, or root mean square error between the predicted result and the real result, or the cross entropy between the predicted result and the real result, and may also take other forms, which is not limited in this application.
• the loss function L2 can be expressed as:
• L2 = −(1/N) Σ_{i=1}^{N} Σ_{j=1}^{M} t_{i,j} log y_{i,j}^h, where y_{i,j}^h is the probability that the simplified neural network predicts the second generated sample x′_{i} to be category j, and t_{i,j} is the one-hot label derived from the real result
• y′_{i,j} represents the probability that the second generated sample x′_{i} is predicted to be category j by the neural network to be compressed
• j is the index of the classification, where M is a positive integer greater than 1 and j is a positive integer no greater than M.
• L2 may also take other forms, which are not limited in the embodiments of the present application.
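Step S608 can be illustrated with a minimal training loop in which a one-layer softmax model stands in for the simplified neural network and is fitted to pseudo-labels; the teacher is replaced here by a toy labeling rule, and all names and hyperparameters are illustrative assumptions rather than the embodiment's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical stand-ins: N "second generated samples" and the labels the
# neural network to be compressed would assign to them (here a toy rule).
X = rng.normal(size=(64, 8))
teacher_labels = (X[:, 0] > 0).astype(int)

W = np.zeros((8, 2))                        # "simplified network": one linear layer
for _ in range(200):
    probs = softmax(X @ W)                  # predicted results on the generated samples
    onehot = np.eye(2)[teacher_labels]      # real results used as labels
    grad = X.T @ (probs - onehot) / len(X)  # gradient of the cross-entropy loss
    W -= 0.5 * grad                         # optimization-algorithm update

acc = (softmax(X @ W).argmax(axis=1) == teacher_labels).mean()
```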
  • the method may further include:
• S610 Determine whether to continue compression according to the parameter amount of the currently compressed neural network model.
• the compression device can also comprehensively determine whether to continue compression based on the parameters of the currently obtained compressed neural network and the accuracy of the model. If so, steps S606 and S608 are repeated to further compress the compressed neural network. Otherwise, step S612 is performed: output the compressed neural network.
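The decision in step S610 can be sketched as an iterative prune-and-retrain loop that stops once the parameter count reaches a target or further compression would cost too much accuracy; the stand-in functions and thresholds below are illustrative assumptions:

```python
def compress_until_target(model_params, accuracy_fn, prune_step_fn,
                          target_params, min_accuracy):
    """Repeat prune-and-retrain (steps S606/S608) while the model is still
    larger than the target and accuracy stays acceptable (step S610)."""
    params, acc = model_params, accuracy_fn(model_params)
    while params > target_params:
        candidate = prune_step_fn(params)
        cand_acc = accuracy_fn(candidate)
        if cand_acc < min_accuracy:   # stop: further compression hurts too much
            break
        params, acc = candidate, cand_acc
    return params, acc

# Toy stand-ins: each prune step halves the parameter count; accuracy decays.
final, acc = compress_until_target(
    model_params=1000,
    accuracy_fn=lambda p: 0.99 if p >= 250 else 0.80,
    prune_step_fn=lambda p: p // 2,
    target_params=100,
    min_accuracy=0.9,
)
```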
• the target sample generator generates a second generated sample whose characteristics are similar to those of the real samples used to train the neural network to be compressed, and the prediction of the neural network to be compressed on the second generated sample is used as the label.
• the pruning algorithm removes redundant connections in the neural network to be compressed to obtain a simplified neural network.
• the second generated sample is used as the input of the simplified neural network.
• the real result obtained by processing the input second generated sample through the neural network to be compressed is used as the label; the second generated sample and its label are used to train the simplified neural network to obtain the compressed neural network, which realizes compression of the neural network to be compressed without real training samples.
• the compressed neural network can be applied to lightweight devices such as terminals, thereby reducing the complexity of the neural network to be compressed, improving computing efficiency, and reducing storage overhead.
• the first embodiment is the training stage of the target sample generator (the stage executed by the training device 120 as shown in FIG. 1), and the specific training is performed according to the first embodiment or any of its possible implementation manners of the training of the sample generator.
• the second embodiment can be understood as the application stage of the target sample generator (the stage executed by the execution device 110 as shown in FIG. 1).
• the compression device 170 can compress the preset discriminator based on the second generated sample, and then obtain the compressed model, that is, the compressed preset discriminator.
  • the compressed neural network can be sent to the client device 140, and the client device 140 sends the compressed neural network to the user device 180 (terminal).
• the compression device 170 may also send the compressed neural network to the user equipment 180 directly.
  • the user equipment 180 may run the compressed neural network to realize the function of the compressed neural network.
• S704 Input the received data into the compressed neural network, and process the input data through the compressed neural network to obtain an output result.
  • the output mode includes but is not limited to output through text, image, voice, video, etc.
  • the compressed neural network is obtained by compressing the neural network compression method described in the third or fourth embodiment above.
  • the input data can be images, texts, etc., and are related to the specific functions of the neural network to be compressed.
• for the compression process of the neural network to be compressed, reference may be made to the relevant descriptions in the third or fourth embodiment above; details are not repeated in the embodiments of this application.
• the data processing method is specifically an image processing method, including: a terminal receives an input image; the input image is input into a compressed neural network, and the compressed neural network processes the image to obtain a processing result.
  • the content of the processing result depends on the function of the compressed neural network
  • the function of the compressed neural network depends on the function of the neural network to be compressed, which may be the classification result, recognition result, etc. of the image.
• if the neural network to be compressed is a face attribute recognition network used to identify attributes of the person depicted in an input face image, such as gender, age, and race, then the compressed neural network can likewise identify the gender, age, race, etc. of the person depicted in the input image.
• the processing result may include the recognized gender, age, and race of the person in the input image.
• the data processing method is specifically a text processing method, including: a terminal receives input text; the input text is input into a compressed neural network, and the compressed neural network processes the input text to obtain a processing result.
  • the content of the processing result depends on the function of the compressed neural network
  • the function of the compressed neural network depends on the function of the neural network to be compressed, which may be a classification result of a text, a recognition result, etc.
• if the neural network to be compressed is a text recognition network used to recognize the intention described by the input text, then the compressed neural network can recognize the intention of the input text and perform the operation corresponding to the recognized intention; for example, upon recognizing that the intention is to "answer the call", the terminal (such as a mobile phone) can connect the current call.
  • Fig. 8 is a schematic block diagram of a training device of a sample generator in an embodiment of the present invention.
  • the training device 800 of the sample generator shown in FIG. 8 (the device 800 may specifically be the training device 120 of FIG. 1), which may include:
  • the obtaining unit 801 is configured to obtain the first matrix
  • the generating unit 802 is configured to input the first matrix into the initial sample generator to obtain the first generated sample, and the initial sample generator is a deep neural network;
  • the discrimination unit 803 is configured to input the first generated sample into the preset discriminator to obtain the discrimination result, wherein the preset discriminator is obtained through training of the first training data, and the first training data includes the real sample and the corresponding real sample classification;
  • the updating unit 804 is used to update the parameters of the sample generator according to the discrimination result of the first generated sample to obtain the updated sample generator.
  • the discrimination result may include the probability that the first generated sample is predicted to be each of the M categories, where M is an integer greater than 1.
  • the true result of the first generated sample may be the classification corresponding to the highest probability among the probabilities respectively corresponding to the M classifications;
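Selecting the true result as the classification with the highest probability amounts to an argmax over the discrimination result, e.g.:

```python
import numpy as np

# Discrimination result: probabilities of one first generated sample over M = 3 classifications.
discrimination_result = np.array([0.05, 0.70, 0.25])
true_result = int(discrimination_result.argmax())  # classification with the highest probability
```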
  • the apparatus 800 may further include:
  • the feature extraction unit 805 is configured to extract features of the first generated sample through a preset discriminator
  • the probability average unit 805 is configured to obtain the average probability of each of the M classifications in the N discrimination results according to the N discrimination results one-to-one corresponding to the N first matrices;
  • the updating unit 804 is specifically configured to jointly update the initial sample generator according to the difference between the discrimination result and the real result, the characteristic value of the feature, and the probability average value.
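One way to combine the three signals named above (the difference between the discrimination result and the real result, the feature magnitude, and the balance of the average class probability) into a single generator loss is sketched below; the weights alpha and beta and the exact terms are illustrative assumptions, not the embodiment's definition:

```python
import numpy as np

def generator_loss(probs, features, alpha=0.1, beta=5.0):
    """Combined update signal for the sample generator, sketched after the
    three terms in the text: (1) cross-entropy between each discrimination
    result and its argmax pseudo-label, (2) the negative mean feature
    magnitude, encouraging strong feature activations, (3) the negative
    information entropy of the average class probability, encouraging class
    balance across generated samples."""
    n = probs.shape[0]
    labels = probs.argmax(axis=1)
    one_hot_term = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))
    feature_term = -np.mean(np.abs(features))
    p_mean = probs.mean(axis=0)                              # average probability per class
    entropy_term = np.sum(p_mean * np.log(p_mean + 1e-12))   # equals -H(p_mean)
    return one_hot_term + alpha * feature_term + beta * entropy_term

# Toy discrimination results and extracted features for two generated samples.
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
features = np.array([[1.0, -2.0], [0.5, 0.0]])
loss = generator_loss(probs, features)
```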
• the updating unit 804 is further configured to implement any one of the first to fourth implementations of step S410 in the foregoing embodiment 1.
  • FIG. 9 is a schematic block diagram of a sample generating apparatus in an embodiment of the present invention.
  • the sample generating apparatus 900 shown in FIG. 9 (the apparatus 900 may specifically be the execution device 110 of FIG. 1) may include:
  • the obtaining unit 901 is used to obtain a target sample generator
  • the generating unit 902 is configured to input the second matrix into the target sample generator to obtain the second generated sample.
  • the second matrix and the first matrix have the same format and type, that is, the order of the matrix is the same, and the type of the matrix may include a random matrix or a random vector.
  • the target sample generator is a target sample generator trained by the training method of the sample generator described in the first embodiment. For the specific training method, please refer to the relevant description in the first embodiment above, and will not be repeated here.
• the sample generating device 900 can receive the target sample generator sent by the device 800, or can itself obtain the target sample generator by performing the training method of the sample generator described in the first embodiment, which is not limited in this embodiment.
  • FIG. 10 is a schematic block diagram of a neural network compression device in an embodiment of the present invention.
  • the neural network compression device 1000 shown in FIG. 10 (the device 1000 may specifically be the compression device 170 of FIG. 1) may include:
  • the obtaining unit 1001 is configured to obtain the second generated sample, and specifically may be used to receive the second generated sample sent by the sample generating apparatus 900.
• the second generated sample may be obtained by the execution device 110 or the sample generating device 900 inputting the second matrix into the target sample generator.
• for the second generated sample, the neural network to be compressed is used as the preset discriminator, and the target sample generator is trained through the training method of the sample generator described above.
• the compression unit 1002 is configured to replace the real sample with the second generated sample, use the output obtained by inputting the second generated sample into the neural network to be compressed as the classification corresponding to the second generated sample, and compress the neural network to be compressed.
  • FIG. 11 is a schematic block diagram of a data processing device 1100 (terminal) in an embodiment of the present invention.
  • the data processing device 1100 shown in FIG. 11 (the device 1100 may specifically be the user equipment 180 of FIG. 1), which may include:
  • the receiving unit 1101 is used to receive input data
  • the processing unit 1102 is configured to input the input data into a compressed neural network, and process the input data through the compressed neural network to obtain a processing result, where the compressed neural network is Obtained by the neural network compression method of claim 15;
  • the output unit 1103 is used to output the processing result.
  • FIG. 12 is a schematic diagram of the hardware structure of a training device for a sample generator provided by an embodiment of the present application.
  • the training apparatus 1200 of the sample generator shown in FIG. 12 includes a memory 1201, a processor 1202, a communication interface 1203, and a bus 1204.
  • the memory 1201, the processor 1202, and the communication interface 1203 implement communication connections between each other through the bus 1204.
  • the memory 1201 may be a read only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM).
  • the memory 1201 may store a program. When the program stored in the memory 1201 is executed by the processor 1202, the processor 1202 and the communication interface 1203 are used to execute each step of the training method of the sample generator in Embodiment 1 of the present application.
• the processor 1202 may adopt a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, to execute related programs to realize the functions required by the units in the training device of the sample generator of the embodiment of the present application, or to execute the training method of the sample generator of the method embodiment of the present application.
  • the processor 1202 may also be an integrated circuit chip with signal processing capability. In the implementation process, each step of the training method of the sample generator of the present application can be completed by hardware integrated logic circuits in the processor 1202 or instructions in the form of software.
• the above-mentioned processor 1202 may also be a general-purpose processor, a digital signal processor (Digital Signal Processing, DSP), an application specific integrated circuit (ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
• the storage medium is located in the memory 1201, and the processor 1202 reads the information in the memory 1201 and, in combination with its hardware, completes the functions required by the units included in the training device of the sample generator of the embodiment of the present application, or executes the training method of the sample generator of the method embodiment of the present application.
  • the communication interface 1203 uses a transceiver device such as but not limited to a transceiver to implement communication between the device 1200 and other devices or a communication network.
  • the training data (such as the first matrix described in Embodiment 1 of the present application) and the preset discriminator can be obtained through the communication interface 1203.
  • the bus 1204 may include a path for transferring information between various components of the device 1200 (for example, the memory 1201, the processor 1202, and the communication interface 1203).
• the obtaining unit 801 of the training device 800 of the sample generator may be equivalent to the communication interface 1203 in the training device 1200 of the sample generator, and the generating unit 802, the discrimination unit 803, and the updating unit 804 may be equivalent to the processor 1202.
  • FIG. 13 is a schematic block diagram of another sample generating apparatus in an embodiment of the present invention
• the sample generating apparatus 1300 shown in FIG. 13 includes a memory 1301, a processor 1302, a communication interface 1303, and a bus 1304.
  • the memory 1301, the processor 1302, and the communication interface 1303 implement communication connections between each other through the bus 1304.
  • the memory 1301 may be a read only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM).
  • the memory 1301 may store a program. When the program stored in the memory 1301 is executed by the processor 1302, the processor 1302 and the communication interface 1303 are used to execute each step of the training method of the sample generator in the second embodiment of the present application.
• the processor 1302 may adopt a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, to execute related programs to realize the functions required by the units in the sample generating device 1300 of the embodiment of the present application, or to implement the sample generation method described in the second embodiment of the method of the present application.
  • the processor 1302 may also be an integrated circuit chip with signal processing capability. In the implementation process, each step of the sample generation method of the present application can be completed by an integrated logic circuit of hardware in the processor 1302 or instructions in the form of software.
• the aforementioned processor 1302 may also be a general-purpose processor, a digital signal processor (Digital Signal Processing, DSP), an application specific integrated circuit (ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
• the storage medium is located in the memory 1301, and the processor 1302 reads the information in the memory 1301 and, in combination with its hardware, completes the functions required by the units included in the sample generating device of the embodiment of the present application, or performs the sample generation method of the method embodiment of the present application.
  • the communication interface 1303 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 1300 and other devices or communication networks. For example, data (such as the second matrix described in Embodiment 1 of the present application), a preset discriminator or a neural network to be compressed can be obtained through the communication interface 1303.
  • the bus 1304 may include a path for transferring information between various components of the device 1300 (for example, the memory 1301, the processor 1302, and the communication interface 1303).
• the obtaining unit 901 in the sample generating device 900 is equivalent to the communication interface 1303 in the sample generating device 1300, and the generating unit 902 may be equivalent to the processor 1302.
  • FIG. 14 is a schematic diagram of the hardware structure of another neural network compression device in an embodiment of the present invention.
  • the neural network compression device 1400 shown in FIG. 14 (the device 1400 may specifically be a computer device) includes a memory 1401, a processor 1402, a communication interface 1403, and a bus 1404. Among them, the memory 1401, the processor 1402, and the communication interface 1403 implement communication connections between each other through the bus 1404.
  • the memory 1401 may be a read only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM).
  • the memory 1401 may store a program. When the program stored in the memory 1401 is executed by the processor 1402, the processor 1402 and the communication interface 1403 are used to execute the neural network compression steps in the third and fourth embodiments of the present application.
• the processor 1402 may adopt a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, to execute related programs to realize the functions required by the units in the neural network compression device of the embodiment of the present application, or to execute the neural network compression method of the method embodiment of the present application.
  • the processor 1402 may also be an integrated circuit chip with signal processing capability. In the implementation process, the various steps of the neural network compression method of the present application can be completed by the integrated logic circuit of hardware in the processor 1402 or instructions in the form of software.
• the aforementioned processor 1402 may also be a general-purpose processor, a digital signal processor (Digital Signal Processing, DSP), an application specific integrated circuit (ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
• the storage medium is located in the memory 1401, and the processor 1402 reads the information in the memory 1401 and, in combination with its hardware, completes the functions required by the units included in the neural network compression device 1000 of the embodiment of the present application, or executes the neural network compression method of the method embodiment of the present application.
  • the communication interface 1403 uses a transceiver device such as but not limited to a transceiver to implement communication between the device 1400 and other devices or communication networks. For example, data (such as the second generated sample described in Embodiment 3 or Embodiment 4 of the present application) and the neural network to be compressed can be obtained through the communication interface 1403.
  • the bus 1404 may include a path for transferring information between various components of the device 1400 (for example, the memory 1401, the processor 1402, and the communication interface 1403).
• the obtaining unit 1001 in the neural network compression device 1000 is equivalent to the communication interface 1403 in the neural network compression device 1400, and the compression unit 1002 can be equivalent to the processor 1402.
  • FIG. 15 is a schematic block diagram of another data processing device in an embodiment of the present invention
• the data processing device 1500 shown in FIG. 15 includes a memory 1501, a baseband chip 1502, a radio frequency module 1503, a peripheral system 1504, and a sensor 1505.
  • the baseband chip 1502 includes at least one processor 15021, such as a CPU, a clock module 15022, and a power management module 15023;
  • the peripheral system 1504 includes a camera 15041, an audio module 15042, a touch screen 15043, etc.
• the sensor 1505 may include a light sensor 15051, an acceleration sensor 15052, a fingerprint sensor 15053, etc.; the modules included in the peripheral system 1504 and the sensor 1505 can be increased or decreased according to actual needs.
• any two of the above connected modules can be specifically connected by a bus, which can be an industry standard architecture (industry standard architecture, ISA) bus, a peripheral component interconnect (peripheral component interconnect, PCI) bus, an extended industry standard architecture (extended industry standard architecture, EISA) bus, etc.
  • the radio frequency module 1503 may include an antenna and a transceiver (including a modem).
• the transceiver is used to convert the electromagnetic waves received by the antenna into an electric current and finally into a digital signal.
• the transceiver is also used to convert the digital signal output by the mobile phone into an electric current and then into electromagnetic waves, and finally the electromagnetic waves are emitted into free space through the antenna.
  • the radio frequency module 1503 may also include at least one amplifier for amplifying signals.
• the radio frequency module 1503 can be used for wireless transmission, such as Bluetooth transmission, Wireless-Fidelity (WI-FI) transmission, third-generation mobile communication technology (3G) transmission, fourth-generation mobile communication technology (4G) transmission, etc.
  • the touch screen 15043 can be used to display information input by the user or show information to the user.
  • the touch screen 15043 can include a touch panel and a display panel.
• the display panel may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED) display, or other forms.
  • the touch panel can cover the display panel. When the touch panel detects a touch operation on or near it, it transmits the operation to the processor 15021 to determine the type of the touch event, and the processor 15021 then provides a corresponding visual output on the display panel according to the type of the touch event.
  • the touch panel and the display panel are used as two independent components to realize the input and output functions of the terminal 1500. However, in some embodiments, the touch panel and the display panel may be integrated to realize the input and output functions of the terminal 1500.
  • the camera 15041 is used to obtain images for input to the compressed neural network.
  • the compressed neural network is a deep neural network used to process images, for example, the neural network obtained by compressing the image recognition network in scene A.
  • the audio input module 15042 may specifically be a microphone, which can acquire voice.
  • the terminal 1500 can convert speech into text, and then input the text into the compressed neural network.
  • the compressed neural network is a deep neural network used to process text, for example, the neural network obtained by compressing the text recognition network in scene C.
  • the sensor 1505 may include a light sensor 15051, an acceleration sensor 15052, and a fingerprint sensor 15053.
  • the light sensor 15051 is used to obtain the light intensity of the environment, and the acceleration sensor 15052 (such as a gyroscope) can obtain the movement status of the terminal 1500.
  • the fingerprint sensor 15053 can input fingerprint information; the sensor 1505 senses the relevant signal and quantizes the signal into a digital signal and transmits it to the processor 15021 for further processing.
  • the memory 1501 may be a high-speed RAM memory, or a non-volatile memory (non-volatile memory), such as at least one disk memory.
  • the memory 1501 may also include at least one storage device located far away from the aforementioned processor 15021.
  • the memory 1501 may specifically include a storage instruction area and a storage data area. The storage instruction area may store an operating system, a user interface program, a communication interface program, and other programs; the storage data area may store data required by the processor to perform related operations, or data generated by performing related operations.
  • the processor 15021 is the control center of the terminal 1500. It uses various interfaces and lines to connect the various parts of the entire terminal, and executes various functions of the terminal 1500 by running programs stored in the memory 1501 and calling data stored in the memory 1501.
  • the processor 15021 may include one or more application processors, and the application processors mainly process an operating system, a user interface, and application programs.
  • the processor 15021 reads the information in the memory 1501, and, in combination with its hardware, completes the functions required by the units included in the data processing apparatus 1100 of the embodiments of the present application, or executes the data processing method of the method embodiments of the present application.
  • the user realizes the communication function of the terminal 1500 through the radio frequency module 1503.
  • the terminal 1500 can receive the compressed neural network or other data sent by the client device 180 or the compression device 170.
  • although the devices 1200, 1300, 1400, and 1500 shown in FIG. 12, FIG. 13, FIG. 14, and FIG. 15 only show a memory, a processor, and a communication interface, those skilled in the art should understand that, in the specific implementation process, the devices 1200, 1300, 1400, and 1500 also include other components necessary for normal operation. At the same time, according to specific needs, those skilled in the art should understand that the apparatuses 1200, 1300, 1400, and 1500 may also include hardware devices that implement other additional functions. In addition, those skilled in the art should understand that the apparatuses 1200, 1300, 1400, and 1500 may also include only the components necessary to implement the embodiments of the present application, rather than all the components shown in FIG. 12, FIG. 13, FIG. 14, and FIG. 15.
  • the device 1200 is equivalent to the training device 120 in FIG. 1;
  • the device 1300 is equivalent to the execution device 110 in FIG. 1;
  • the device 1400 is equivalent to the compression device 170 in FIG. 1, and the device 1500 is equivalent to the user equipment 180 in FIG. 1.
  • a person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
  • the computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium, such as a data storage medium, or a communication medium that includes any medium that facilitates the transfer of a computer program from one place to another (for example, according to a communication protocol) .
  • computer-readable media may generally correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media, such as signals or carrier waves.
  • Data storage media can be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, codes, and/or data structures for implementing the techniques described in this application.
  • the computer program product may include a computer-readable medium.
  • such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage devices, magnetic disk storage devices or other magnetic storage devices, flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium.
  • if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or the wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • it should be understood, however, that the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media.
  • as used herein, magnetic disks and optical discs include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • the term "processor" as used herein may refer to any of the foregoing structures, such as digital signal processors (DSP), application-specific integrated circuits (ASIC), or field programmable logic arrays (FPGA), or to any other structure suitable for implementing the techniques described herein.
  • the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided in dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
  • the technology may be fully implemented in one or more circuits or logic elements.
  • the technology of this application can be implemented in a variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or a set of ICs (for example, a chipset).
  • Various components, modules, or units are described in this application to emphasize the functional aspects of the device for performing the disclosed technology, but they do not necessarily need to be implemented by different hardware units.
  • rather, various units can be combined with appropriate software and/or firmware into a codec hardware unit, or provided by a collection of interoperating hardware units (including one or more processors as described above).


Abstract

Disclosed are an image generation method, a neural network compression method, and related apparatuses and devices in the field of artificial intelligence. The image generation method includes: inputting a first matrix into an initial image generator to obtain a generated image; inputting the generated image into a preset discriminator to obtain a discrimination result, where the preset discriminator is obtained by training with real images and classifications corresponding to the real images; and updating the initial image generator according to the discrimination result to obtain a target image generator; then, a second matrix is input into the target image generator to obtain a sample image. Further disclosed is a neural network compression method, in which the preset discriminator is compressed based on the sample images obtained by the above image generation method.

Description

图像生成方法、神经网络的压缩方法及相关装置、设备
本申请要求于2019年3月31日提交中国国家知识产权局、申请号为201910254752.3、申请名称为“图像生成方法、神经网络的压缩方法及相关装置、设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能技术领域,特别涉及一种图像生成方法、神经网络的压缩方法及相关装置、设备。
背景技术
随着深度学习技术的发展，越来越多的机器学习模型需要被部署在手机、平板电脑、车载单元（on board unit，OBU）、摄像头等小型移动设备上，此时，需要对机器学习模型进行压缩，以降低机器学习模型对计算资源的需求，加速机器学习模型的运行。
现有的神经网络的压缩和加速算法常常基于该待压缩机器学习模型的训练样本来计算，然而，在现实生活中，真实训练样本往往受到隐私政策或法律的保护，不可被第三方获得。此外，需要被压缩的机器学习模型的结构也常常是不可见的，只有输入和输出的接口被提供。因而，在真实训练样本不可获得的情况下，大部分神经网络的压缩技术没有办法使用。因此，如何生成训练样本，是在无真实训练样本的情况下实现神经网络的压缩亟待解决的技术问题。
现有技术中通常使用生成式对抗网络(generative adversarial networks,GAN)来实现训练样本的生成,GAN通常包括生成器和判别器,通过这两个网络互相博弈学习,从而产生更好的输出。其中,生成器捕捉真实训练样本的潜在分布,并生成新的样本;判别器是一个二分类器,用于判别输入样本是真实样本还是生成样本。通过迭代优化生成器和判别器,当判别器无法正确判别输入样本的数据来源时,可以认为这个生成器已经学到了真实训练数据的分布。这个生成器就可以基于已有的真实训练样本,生成和真实训练样本类似的样本。然而,生成器需要使用真实训练样本来进行训练,对于真实训练样本无法获得的情况下,无法实现GAN的训练,也无法得到和真实训练样本类似的样本。
发明内容
本申请实施例提供一种图像生成方法、神经网络的压缩方法及相关装置、设备,可以实现在无真实图像的情况下,生成与真实图像类似的样本图像,并实现神经网络的压缩。
第一方面,本申请实施例提供了一种图像生成器的训练方法,包括:
训练设备将第一矩阵输入初始图像生成器,得到生成图像,该初始图像生成器为深度神经网络;将生成图像输入预设判别器,得到判别结果,其中,预设判别器是经过第一训 练数据训练得到的,该第一训练数据包括真实图像和该真实图像对应的分类;进而,根据判别结果更新初始图像生成器,得到目标图像生成器。
应理解,该初始图像生成器可以是初始化的深度神经网络,也可以是在训练过程中得到的过程中的深度神经网络。
训练设备可以是终端设备,例如手机、平板电脑、台式计算机、便携式笔记本、AR/VR、车载终端等,也可以是服务器或者云端等。
上述方法不需要训练预设判别器使用的真实图像,就可以训练得到目标图像生成器,且训练得到的目标图像生成器可以用于生成与预设判别器训练所使用的真实图像的特性相似的样本图像,该样本图像可以替代预设判别器的训练数据,实现预设判别器的压缩等需要预设判别器的训练数据的功能。
应理解,在一种应用场景中,预设判别器可以是图像识别网络,该图像识别网络可以用于识别输入的图像的所属分类。在另一种应用场景中,预设判别器可以是人脸属性识别网络,该人脸属性识别网络可以用于识别输入的人脸图像描述的人物的属性,例如年龄、种族、性别和情绪等。
在一个可选的实现方式中,判别结果可以包括生成图像被预测为M个分类中每一个分类的概率,M为大于1的整数。
在一个可选的实现方式中,训练设备根据判别结果更新初始图像生成器的第一种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为生成图像的真实结果;进而,根据判别结果与真实结果的差异更新初始图像生成器。
上述方法,训练设备根据预设判别器对生成图像的判别结果确定真实结果,基于判别结果和真实结果之间的差异更新初始图像生成器,进而实现无真实图像下训练得到目标图像生成器,且根据判别结果确定真实结果为最大概率对应的分类,减少判别结果和真实结果之间的差异,提高训练过程的运算效率。
在一个可选的实现方式中,在根据判别结果更新初始图像生成器之前,该方法还包括训练设备通过预设判别器提取生成图像的特征;训练设备根据判别结果更新初始图像生成器的第二种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为生成图像的真实结果;进而,根据判别结果与真实结果的差异以及特征更新初始图像生成器。
执行上述方法可实现无真实图像下训练得到目标图像生成器,且在对初始图像生成器进行训练的过程中考虑到输入的真实图像通过预设判别器提取的特征所具有的特性,通过约束生成图像的特征,使得训练得到的目标图像生成器生成的样本图像更接近真实图像。
在一个可选的实现方式中,训练设备根据判别结果更新初始图像生成器的第三种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为生成图像的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值,N为正整数;根据判别结果与真实结果的差异以及概率平均值更新初始图像生成器。
上述方法可实现无真实图像下训练得到目标图像生成器,且在对初始图像生成器进行 训练的过程中通过约束生成图像的判别结果,使得目标图像生成器可以均衡地产生各个分类的样本图像,避免目标图像生成器陷入局部最优。
在一个可选的实现方式中,在根据判别结果更新初始图像生成器之前,该方法还包括训练设备通过预设判别器提取生成图像的特征;根据判别结果更新初始图像生成器的第四种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为生成图像的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值;进而,根据判别结果与真实结果的差异、特征的特征值以及概率平均值共同更新初始图像生成器。
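上述第四种实现方式中同时使用的三项信号（判别结果与真实结果的差异、特征、概率平均值），可以用如下示意性代码草图帮助理解。此处各损失项的具体形式与加权系数alpha、beta均为本示例的假设，并非本申请限定的实现：

```python
import numpy as np

def generator_loss(logits, features, alpha=0.1, beta=5.0):
    """生成器训练信号的示意性实现（各项形式均为假设）。

    logits:   (N, M)，预设判别器对N个生成样本输出的M个分类的打分
    features: (N, D)，预设判别器为每个生成样本提取的中间特征
    """
    # 对每个样本做softmax，得到被预测为各分类的概率（即判别结果）
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)

    # (1) 差异项：把最大概率对应的分类当作“真实结果”，
    #     以交叉熵衡量判别结果与真实结果的差异
    pseudo = probs.argmax(axis=1)
    one_hot_loss = -np.mean(np.log(probs[np.arange(len(pseudo)), pseudo] + 1e-12))

    # (2) 特征项：鼓励生成样本在判别器中产生较大的特征响应
    feature_loss = -np.mean(np.abs(features))

    # (3) 概率平均值项：对N个判别结果求每个分类的概率平均值，
    #     最大化其熵，使各分类的生成样本尽量均衡
    mean_probs = probs.mean(axis=0)
    balance_loss = np.sum(mean_probs * np.log(mean_probs + 1e-12))

    return one_hot_loss + alpha * feature_loss + beta * balance_loss
```

在该假设形式下，覆盖各个分类且被判别器高置信识别的一批生成样本会得到更小的损失值，从而推动生成器均衡地产生各分类的样本。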
第二方面,本申请实施例提供了一种图像生成方法,包括:执行设备将第二矩阵输入目标图像生成器,得到样本图像,其中,目标图像生成器为通过第一方面所述的任意一种图像生成器的训练方法得到的。
执行设备可以是终端设备,例如手机、平板电脑、台式计算机、便携式笔记本、AR/VR、车载终端等,也可以是服务器或者云端等。
上述方法,通过第一方面所述的训练方法训练得到的目标图像生成器可以生成与预设判别器训练所使用的真实图像的特性相似的样本图像,该样本图像可以替代预设判别器的训练数据,实现预设判别器的压缩等需要预设判别器的训练数据的功能。
第三方面,本申请实施例还提供了一种神经网络压缩方法,包括:
压缩设备获取样本图像,其中,样本图像通过如第二方面任一项所述的图像生成方法生成,其中,预设判别器为待压缩神经网络;将所述样本图像输入到所述待压缩神经网络,得到所述样本图像对应的分类;进而,根据样本图像和样本图像对应的分类对待压缩神经网络进行压缩,得到压缩后的神经网络,其中,压缩后的神经网络的参数少于待压缩神经网络的参数。
其中，压缩设备可以是服务器或者云端等。
上述方法,通过第二方面所述的图像生成方法得到与预设判别器训练所使用的真实图像的特性相似的样本图像,根据该样本图像和该样本图像对应的分类对待压缩神经网络进行压缩,进而,无需训练数据即可实现待压缩神经网络的压缩。
第四方面,本申请实施例还提供了一种图像处理方法,包括:
终端接收输入图像;将该输入图像输入到压缩后的神经网络,通过该压缩后的神经网络对输入图像进行处理,得到处理结果,其中,该压缩后的神经网络是通过如第三方面所述的神经网络压缩方法得到的;最终,输出处理结果。
应理解,处理结果的内容依赖于压缩后的神经网络的功能,而压缩后的神经网络的功能依赖于待压缩神经网络的功能,可以是对图像的分类结果、识别结果等。例如,待压缩神经网络为人脸属性识别网络,用于识别输入的人脸图像所描述的人的属性,比如性别、年龄、种族等,那么,压缩后的神经网络可以识别输入图像描述人的性别、年龄、种族等,该处理结果可以包括输入图像被识别到的性别、年龄和种族。
还应理解,压缩后的神经网络相对于待压缩神经网络具有更简单的网络结构,更少的参数,同时在其运行时占用较少的存储资源,因而可以应用于轻量级的终端。
还应理解,终端可以是手机、平板电脑、台式计算机、便携式笔记本、AR/VR、车载 终端或其他终端设备等。
第五方面,本申请实施例提供了一种文本生成器的训练方法,包括:训练设备将第一矩阵输入初始文本生成器,得到生成文本,该初始文本生成器为深度神经网络;将生成文本输入预设判别器,得到判别结果,其中,预设判别器是经过第一训练数据训练得到的,该第一训练数据包括真实文本和该真实文本对应的分类;进而,根据判别结果更新初始文本生成器,得到目标文本生成器。
应理解,该初始样本生成器可以是初始化的深度神经网络,也可以是在训练过程中得到的过程中的深度神经网络。
训练设备可以是终端设备,例如手机、平板电脑、台式计算机、便携式笔记本、AR/VR、车载终端等,也可以是服务器或者云端等。
在一种应用场景中,预设判别器可以是文本识别网络,该文本识别网络用于识别输入的文本的分类,该分类可以以意图、所属学科等标准进行划分。
上述方法不需要训练预设判别器使用真实文本,就可以训练得到目标文本生成器,且训练得到的目标文本生成器可以用于生成与预设判别器训练所使用的真实文本的特性相似的样本文本,该样本文本可以替代预设判别器的训练数据,实现预设判别器的压缩等需要预设判别器的训练数据的功能。
在一个可选的实现方式中,判别结果可以包括生成文本被预测为M个分类中每一个分类的概率,M为大于1的整数。
在一个可选的实现方式中,训练设备根据判别结果更新初始文本生成器的第一种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;进而,根据判别结果与真实结果的差异更新初始文本生成器。
执行上述方法,训练设备根据预设判别器对生成文本的判别结果确定真实结果,基于判别结果和真实结果之间的差异更新初始文本生成器,进而实现无真实文本下训练得到目标文本生成器,且训练设备根据判别结果确定真实结果为最大概率对应的分类,减少判别结果和真实结果之间的差异,提高训练过程的运算效率。
在一个可选的实现方式中,在根据判别结果更新初始文本生成器之前,该方法还包括训练设备通过预设判别器提取生成文本的特征。此时,训练设备根据判别结果更新初始文本生成器的第二种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;进而,根据判别结果与真实结果的差异以及特征更新初始文本生成器。
上述方法,可实现无真实文本下训练得到目标文本生成器,且在对初始文本生成器进行训练的过程中考虑到输入的真实文本通过预设判别器提取的特征所具有的特性,通过约束生成文本的特征,使得训练得到的目标文本生成器生成的样本文本更接近真实文本。
在一个可选的实现方式中,训练设备根据判别结果更新初始文本生成器的第三种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值;根据判别结果与真实结果的差 异以及概率平均值共同更新初始文本生成器。
上述方法,可实现无真实文本下训练得到目标文本生成器,且在对初始文本生成器进行训练的过程中通过约束生成文本的判别结果,使得目标文本生成器可以均衡地产生各个分类的样本文本,避免目标文本生成器陷入局部最优。
在一个可选的实现方式中,在根据判别结果更新初始文本生成器之前,该方法还包括训练设备通过预设判别器提取生成文本的特征。此时,训练设备根据判别结果更新初始文本生成器的第四种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值;进而,根据判别结果与真实结果的差异、特征以及概率平均值共同更新初始文本生成器。
第六方面，本申请实施例提供了一种文本生成方法，包括：执行设备将第二矩阵输入目标文本生成器，得到样本文本。其中，目标文本生成器为通过第五方面所述的任意一种文本生成器的训练方法得到的。
执行设备可以是终端设备,例如手机、平板电脑、台式计算机、便携式笔记本、AR/VR、车载终端等,也可以是服务器或者云端等。
上述方法,通过第五方面所述的训练方法训练得到的目标文本生成器可以生成与预设判别器训练所使用的真实文本的特性相似的样本文本,该样本文本可以替代预设判别器的训练数据,实现预设判别器的压缩等需要预设判别器的训练数据的功能。
第七方面，本申请实施例提供了一种神经网络压缩方法，包括：压缩设备获取样本文本，其中，样本文本通过如第六方面任一项所述的文本生成方法生成，其中，预设判别器为待压缩神经网络；将样本文本输入到所述待压缩神经网络，得到该样本文本对应的分类；进而根据所述样本文本和所述样本文本对应的分类对所述待压缩神经网络进行压缩，得到压缩后的神经网络，其中，所述压缩后的神经网络的参数少于待压缩神经网络的参数。
上述方法,通过第六方面所述的文本生成方法得到与预设判别器训练所使用的真实文本的特性相似的样本文本,根据该样本文本和该样本文本对应的分类对待压缩神经网络进行压缩,进而,无需训练数据即可实现待压缩神经网络的压缩。
第八方面，本申请实施例提供了一种文本处理方法，终端接收输入文本；将该输入文本输入到压缩后的神经网络，通过该压缩后的神经网络对输入文本进行处理，得到处理结果，其中，该压缩后的神经网络是通过如第七方面所述的神经网络压缩方法得到的；进而，输出该处理结果。
应理解，处理结果的内容依赖于压缩后的神经网络的功能，而压缩后的神经网络的功能依赖于待压缩神经网络的功能，可以是对文本的分类结果、识别结果等。例如，待压缩神经网络为文本识别网络，用于识别输入文本所描述的意图，那么，压缩后的神经网络可以识别输入文本的意图，进而执行该识别到的意图对应的操作，例如，在识别到意图为“接通电话”时，终端（如手机）可以接通当前的呼叫。
还应理解,压缩后的神经网络相对于待压缩神经网络具有更简单的网络结构,更少的参数,同时在其运行时占用较少的存储资源,因而可以应用于轻量级的终端。
应理解,终端可以是手机、平板电脑、台式计算机、便携式笔记本、AR/VR、车载终 端或其他终端设备等。
第九方面,本申请实施例还提供了一种样本生成器的训练方法,该方法包括:训练设备将第一矩阵输入初始样本生成器,得到第一生成样本,初始样本生成器为深度神经网络;将第一生成样本输入预设判别器,得到判别结果,其中,预设判别器是经过第一训练数据训练得到的,第一训练数据包括真实样本和该真实样本对应的分类;进而,根据第一生成样本的判别结果更新初始样本生成器的参数,得到目标样本生成器。
应理解,该初始样本生成器可以是初始化的深度神经网络,也可以是在训练过程中得到的过程中的深度神经网络。
上述方法不需要训练预设判别器使用的真实样本，就可以训练得到目标样本生成器，且训练得到的目标样本生成器可以用于生成与预设判别器训练所使用的真实样本的特性相似的第二生成样本，该第二生成样本可以替代预设判别器的训练数据，实现预设判别器的压缩等需要预设判别器的训练数据的功能。
在一个可选的实现方式中,判别结果可以包括第一生成样本被预测为M个分类中每一个分类的概率,M为大于1的整数。
在一个可选的实现方式中,根据判别结果更新初始样本生成器的第一种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;进而,根据判别结果与真实结果的差异更新初始样本生成器。
执行上述方法,训练设备根据预设判别器对第一生成样本的判别结果确定真实结果,基于判别结果和真实结果之间的差异更新初始样本生成器,进而实现无真实样本下训练得到目标样本生成器,且根据判别结果确定真实结果为最大概率对应的分类,减少判别结果和真实结果之间的差异,提高训练过程的运算效率。
在一个可选的实现方式中,在根据判别结果更新初始样本生成器之前,该方法还包括通过预设判别器提取第一生成样本的特征。此时,根据判别结果更新初始样本生成器的第二种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;进而,根据判别结果与真实结果的差异以及特征更新初始样本生成器。
上述方法，可实现无真实样本下训练得到目标样本生成器，且在对初始样本生成器进行训练的过程中考虑到输入的真实样本通过预设判别器提取的特征所具有的特性，通过约束第一生成样本的特征，使得训练得到的目标样本生成器生成的第二生成样本更接近真实样本。
在一个可选的实现方式中,根据判别结果更新初始样本生成器的第三种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值;根据判别结果与真实结果的差异以及概率平均值共同更新初始样本生成器。
上述方法，可实现无真实样本下训练得到目标样本生成器，且在对初始样本生成器进行训练的过程中通过约束第一生成样本的判别结果，使得目标样本生成器可以均衡地产生各个分类的第二生成样本，避免目标样本生成器陷入局部最优。
在一个可选的实现方式中,在根据判别结果更新初始样本生成器之前,该方法还包括通过预设判别器提取第一生成样本的特征。此时,根据判别结果更新初始样本生成器的第四种实现方式可以是:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值;进而,根据判别结果与真实结果的差异、特征的特征值以及概率平均值共同更新初始样本生成器。
第十方面，本申请实施例还提供了一种样本生成方法，该方法包括：执行设备将第二矩阵输入目标样本生成器，得到第二生成样本，其中，目标样本生成器为通过上述第九方面中任一种样本生成器的训练方法训练得到的。
上述方法不需要训练预设判别器使用真实样本,就可以训练得到目标样本生成器,且通过训练得到的目标样本生成器可以用于生成与预设判别器训练所使用的真实样本特性相似的第二生成样本,该第二生成样本可以替代预设判别器的训练数据,实现预设判别器的压缩等需要预设判别器的训练数据的功能。
第十一方面，本申请实施例还提供了一种神经网络的压缩方法，包括：压缩设备获取第二生成样本，第二生成样本通过如第十方面所述的任一种样本生成方法生成，其中，预设判别器为待压缩神经网络；将第二生成样本输入到所述待压缩神经网络，得到该第二生成样本对应的分类；进而，根据第二生成样本和该第二生成样本对应的分类对待压缩神经网络进行压缩，得到压缩后的神经网络，其中，压缩后的神经网络的参数少于待压缩神经网络的参数。
压缩设备可以是服务器或者云端等。
应理解，在本申请实施例的另一种实现中，上述神经网络的压缩方法中训练得到目标样本生成器的训练过程也可以由训练设备执行，执行通过目标样本生成器生成第二生成样本的过程的设备和执行根据第二生成样本对待压缩神经网络进行压缩的设备可以是同一设备或不同设备。
执行上述方法，通过第十方面所述的样本生成方法可以得到与待压缩神经网络训练所使用的真实样本的特性相似的第二生成样本，根据该第二生成样本和该第二生成样本对应的分类对待压缩神经网络进行压缩，进而在无训练样本的情况下，实现待压缩神经网络的压缩。
在一个可选的实现方式中,判别结果包括第一生成样本被预测为M个分类中每一个分类的概率,M为大于1的整数。
执行上述方法,根据判别结果确定真实结果为最大概率对应的分类,减少判别结果和真实结果之间的差异,提高训练过程的运算效率。
在一个可选的实现方式中，压缩设备对待压缩神经网络进行压缩的一种具体实现可以是：压缩设备将第二生成样本输入到待压缩神经网络，得到第二生成样本对应的真实结果；通过第二训练数据训练初始神经网络，得到压缩后的神经网络；其中，第二训练数据包括第二生成样本和第二生成样本对应的真实结果，初始神经网络为深度神经网络，初始神经网络的模型参数少于待压缩神经网络的模型参数。
可见,本申请实施例提供的神经网络的压缩方法,通过目标样本生成器生成与待压缩神经网络训练所使用的真实样本特性相似的第二生成样本,以该第二生成样本通过待压缩 神经网络预测得到的结果作为标签。根据该第二生成样本以及其标签,通过训练一个复杂度低的神经网络,得到与待压缩神经网络功能一致的神经网络,该神经网络即压缩后的神经网络,实现在无训练样本的情况下待压缩神经网络的压缩。该压缩后的神经网络可以应用于终端等轻量级的设备,从而减少运算耗损,减少存储开销,并提高运算效率。
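上述“以第二生成样本为输入、以待压缩神经网络的预测结果为标签训练低复杂度网络”的蒸馏思路，可以用如下示意性草图说明。此处教师网络（待压缩神经网络）与学生网络（压缩后的神经网络）都被简化为单个线性层，样本、学习率、迭代次数等均为本示例的假设，并非本申请的实际实现：

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def distill(teacher_w, student_w, samples, lr=0.5, steps=800):
    """以教师网络的输出为软标签，用交叉熵梯度下降训练学生网络。"""
    for _ in range(steps):
        t = softmax(samples @ teacher_w)           # 教师输出：作为标签
        s = softmax(samples @ student_w)           # 学生预测
        grad = samples.T @ (s - t) / len(samples)  # 交叉熵对学生权重的梯度
        student_w = student_w - lr * grad
    return student_w
```

训练结束后，学生网络对同一批样本的分类结果应与教师网络基本一致，而其参数规模可以远小于教师网络，即得到功能一致的压缩后的神经网络。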
在一个可选的实现方式中,压缩设备对待压缩神经网络进行压缩的另一种实现方式可以是:执行设备将第二生成样本输入待压缩神经网络,得到第二生成样本的真实结果;根据待压缩的神经网络中神经元的重要性,去除待压缩的神经网络中重要性小于第一阈值的神经元,得到简化后的神经网络;进而,通过第三训练数据对简化后的神经网络进行训练,得到压缩后的神经网络,第三训练数据包括第二生成样本和第二生成样本对应的真实结果。
可见,本申请实施例提供的神经网络的压缩方法,通过目标样本生成器生成与待压缩神经网络训练所使用的真实样本特性相似的第二生成样本,将该第二生成样本通过待压缩神经网络预测得到的结果作为标签,通过剪枝算法去除待压缩神经网络中冗余的连接,得到一个简化的神经网络,以第二生成样本作为简化的神经网络的输入,待压缩神经网络对输入的第二生成样本处理得到的真实结果作为标签,利用第二生成样本及其标签对简化的神经网络进行训练,得到压缩后的神经网络,实现在无训练样本的情况下待压缩神经网络的压缩。该压缩后的神经网络可以应用于终端等轻量级的设备,进而降低待压缩神经网络的复杂度和提高运算效率、减少存储开销。
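上述剪枝步骤中“去除重要性小于第一阈值的神经元”可以用如下示意性草图表达。以输出神经元输入权重的L1范数作为重要性度量只是一种常见的假设性选择，本申请并未限定具体的度量方式：

```python
import numpy as np

def prune_neurons(weights, threshold):
    """按重要性裁剪一层中的输出神经元。

    weights: (D_in, D_out) 的权重矩阵，每一列对应一个输出神经元；
    神经元重要性取其输入权重的L1范数，小于threshold的列被去除。
    返回裁剪后的权重矩阵以及被保留神经元的下标，
    之后可用第二生成样本及其标签对简化后的网络进行再训练。
    """
    importance = np.abs(weights).sum(axis=0)
    keep = np.flatnonzero(importance >= threshold)
    return weights[:, keep], keep
```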
应理解,在具体实施例中A场景所描述的实施例中,上述神经网络的压缩方法可实现对图像识别网络的压缩,其中,图像识别网络用于识别输入的图像的所属的分类,此时,“初始样本生成器”即为“初始图像生成器”,“待压缩神经网络”即为“图像识别网络”,“第一生成样本”即为“生成图像”,“真实样本”即为“真实图像”,“判定结果”即为“判定分类”,“真实结果”即为“真实分类”,“第二生成样本”即为“样本图像”。
在具体实施例中B场景所描述的实施例中，上述神经网络的压缩方法可实现对人脸属性识别网络的压缩，其中，该人脸属性识别网络用于识别输入的人脸图像描述的人物的属性，例如年龄、种族、性别和情绪等。“初始样本生成器”即为“初始人脸图像生成器”，“待压缩神经网络”即为“人脸属性识别网络”，“第一生成样本”即为“生成人脸图像”，“真实样本”即为“真实人脸图像”，“判定结果”即为“判定属性”，“真实结果”即为“真实属性”，“第二生成样本”即为“样本人脸图像”。
在具体实施例中C场景所描述的实施例中，上述神经网络的压缩方法可实现对文本识别网络的压缩，其中，该文本识别网络用于识别输入的文本的分类，例如，分类可以以意图、所属学科等标准进行划分。“初始样本生成器”即为“初始文本生成器”，“待压缩神经网络”即为“文本识别网络”，“第一生成样本”即为“生成文本”，“真实样本”即为“真实文本”，“判定结果”即为“判定意图”，“真实结果”即为“真实意图”，“第二生成样本”即为“样本文本”。
第十二方面,本申请实施例还提供了一种数据处理方法,包括:
终端接收输入数据;将该输入数据输入到压缩后的神经网络,通过该压缩后的神经网络对输入数据进行处理,得到处理结果,其中,该压缩后的神经网络是通过如第十一方面所述的神经网络压缩方法得到的;最终,输出处理结果。
应理解,处理结果的内容依赖于压缩后的神经网络的功能,而压缩后的神经网络的功能依赖于待压缩神经网络的功能,可以是对输入数据的分类结果、识别结果等。例如,输入数据为人脸图像;待压缩神经网络为人脸属性识别网络,用于识别输入的人脸图像所描述的人的属性,比如性别、年龄、种族等;那么,压缩后的神经网络可以识别输入图像描述人的性别、年龄、种族等,该处理结果可以包括输入图像被识别到的性别、年龄和种族。
还应理解,压缩后的神经网络相对于待压缩神经网络具有更简单的网络结构,更少的参数,同时在其运行时占用较少的存储资源,因而可以应用于轻量级的终端。
第十三方面,本申请实施例还提供一种图像生成器的训练装置,该装置包括用于执行第一方面中的方法的模块。
第十四方面,本申请实施例还提供一种图像生成器的训练装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第一方面中的方法。
第十五方面,本申请实施例还提供一种图像生成装置,该装置包括用于执行第二方面中的方法的模块。
第十六方面,本申请实施例还提供一种图像生成装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第二方面中的方法。
第十七方面,本申请实施例还提供一种神经网络压缩装置,该装置包括用于执行第三方面中的方法的模块。
第十八方面,本申请实施例还提供一种神经网络压缩装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第三方面中的方法。
第十九方面,本申请实施例还提供一种图像处理装置(或终端),包括用于执行第四方面中的方法的模块。
第二十方面,本申请实施例还提供一种图像处理装置(或终端),包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第四方面中的方法。
第二十一方面,本申请实施例还提供一种文本生成器的训练装置,该装置包括用于执行第五方面中的方法的模块。
第二十二方面,提供一种文本生成器的训练装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第五方面中的方法。
第二十三方面，本申请实施例还提供一种文本生成装置，该装置包括用于执行第六方面中的方法的模块。
第二十四方面,本申请实施例还提供一种文本生成装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第六方面中的方法。
第二十五方面,本申请实施例还提供一种神经网络压缩装置,该装置包括用于执行第 七方面中的方法的模块。
第二十六方面,本申请实施例还提供一种神经网络压缩装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第七方面中的方法。
第二十七方面,本申请实施例还提供一种文本处理装置(或终端),包括用于执行第八方面中的方法的模块。
第二十八方面,本申请实施例还提供一种文本处理装置(或终端),包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第八方面中的方法。
第二十九方面,本申请实施例还提供一种样本生成器的训练装置,该装置包括用于执行第九方面中的方法的模块。
第三十方面,本申请实施例还提供一种样本生成器的训练装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第九方面中的方法。
第三十一方面,本申请实施例还提供一种样本生成装置,该装置包括用于执行第十方面中的方法的模块。
第三十二方面,本申请实施例还提供一种样本生成装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第十方面中的方法。
第三十三方面,本申请实施例还提供一种神经网络的压缩装置,该装置包括用于执行第十一方面中的方法的模块。
第三十四方面,本申请实施例还提供一种神经网络的压缩装置,该装置包括:存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第十一方面中的方法。
第三十五方面,本申请实施例还提供一种数据处理装置(或终端),包括用于执行第十二方面中的方法的模块。
第三十六方面,本申请实施例还提供一种数据处理装置(或终端),包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当存储器存储的程序被执行时,处理器用于执行第十二方面中的方法。
第三十七方面,本申请实施例还提供一种计算机可读介质,该计算机可读介质存储用于设备执行的程序代码,该程序代码包括用于执行第一至十二方面中任意一个方面所述的方法。
第三十八方面,本申请实施例还提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述第一至十二方面中任意一个方面所述的方法。
第三十九方面,本申请实施例还提供一种芯片,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,执行上述第一至十二方面中任意一个方面所述的方法。
可选地,作为一种实现方式,所述芯片还可以包括存储器,所述存储器中存储有指令,所述处理器用于执行所述存储器上存储的指令,当所述指令被执行时,所述处理器用于执行第一至十二方面中任意一个方面所述的方法。
第四十方面，本申请实施例还提供一种电子设备，该电子设备包括上述第十三方面至第三十六方面中的任意一个方面中所述的装置。
附图说明
为了更清楚地说明本申请实施例或背景技术中的技术方案,下面将对本申请实施例或背景技术中所需要使用的附图进行说明。
图1为本申请实施例中一种系统架构的示意性框图；
图2为本申请实施例中一种卷积神经网络的示意性框图;
图3为本申请实施例中一种芯片硬件结构示意图;
图4A为本发明实施例中一种样本生成器的训练方法的流程示意图;
图4B为本发明实施例中一种样本生成器的训练方法的示意性说明图;
图5A为本发明实施例中一种神经网络的压缩方法的流程示意图;
图5B为本发明实施例中一种神经网络的压缩方法的原理示意图;
图6为本发明实施例中另一种神经网络的压缩方法的流程示意图;
图7为本发明实施例中一种数据处理方法的流程示意图;
图8为本发明实施例中一种样本生成器的训练装置的示意性框图;
图9为本发明实施例中一种样本生成装置的示意性框图;
图10为本发明实施例中一种神经网络的压缩装置的示意性框图;
图11为本发明实施例中一种数据处理装置的示意性框图；
图12为本发明实施例中另一种样本生成器的训练装置的示意性框图;
图13为本发明实施例中另一种样本生成装置的示意性框图;
图14为本发明实施例中另一种神经网络的压缩装置的示意性框图;
图15为本发明实施例中另一种数据处理装置的示意性框图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
本申请实施例提供的图像生成方法可以应用在输入为图像的神经网络的训练、压缩等的场景。具体而言,本申请实施例的图像生成方法可以应用在如下所示的A场景和B场景中,下面分别对A场景和B场景进行简单的介绍。
A场景:
客户通过训练数据训练得到图像识别网络,该图像识别网络可以识别输入图像的分类,该图像可以以图像描述的物体的种类、形状等进行分类,例如飞机、汽车、鸟、猫、鹿、狗、青蛙、马、船、卡车等,该图像识别网络的训练数据包括真实图片和真实图片对应的分类。客户可以向具有神经网络压缩服务的服务商请求对已经训练完成的图像识别网络进 行压缩。服务商的压缩设备(例如提供神经网络压缩服务的云平台)在对图像识别网络进行压缩的过程中需要使用训练该图像识别网络的训练数据。然而,客户往往难以提供该图像识别网络的训练数据,此时,压缩设备若要完成对图像识别网络的压缩,则需要自己生成训练数据。
此时,可以构建一个初始图像生成器,该初始图像生成器为深度神经网络,可以对输入的随机向量进行处理,输出生成图像。将初始图像生成器输出的生成图像输入到图像识别网络,可以得到该生成图像对应的判定分类,即该生成图像的被预测为上述多个分类中每一个分类的概率。应理解,在初始图像生成器的初始学习过程中,生成图像与真实图像差距较大,图像识别网络对生成图像的判定准确度低,即生成图像被识别为上述各个分类的概率相差较小。训练设备可以确定该输入的随机向量对应的真实分类,根据真实分类和判定分类之间的差异更新初始图像生成器的参数,使得更新参数后的图像生成器输出的生成图像,被图像识别网络识别到的判定分类与其真实分类之间的差异越来越小,从而最后得到目标图像生成器。可以认为目标图像生成器对输入的随机向量进行处理输出的样本图像可以逼近训练图像识别网络所使用的真实图像。进而,压缩设备可以基于生成的样本图像通过蒸馏或剪枝算法实现对该图像识别网络的压缩和加速。
可见,本申请实施例提供的图像训练方法得到的目标图像生成器不需要使用图像识别网络的训练数据,将生成图像输入到已训练好的图像识别网络,得到该生成图像的判别分类,进而以生成图像的判定分类与真实分类的差异训练得到目标图像生成器;而且,目标图像生成器生成的样本图像可以逼近训练图像识别网络所使用的真实图像。
B场景:
客户通过训练数据训练得到人脸属性识别网络,该人脸属性识别网络可以根据输入的人脸图像识别该图像描述的人物的属性,例如性别、种族、年龄和情绪等属性,不同的属性属于不同的分类,该人脸属性识别网络的训练数据包括真实人脸图片和真实人脸图片对应的属性。同上述图像识别网络相似,客户可以向具有神经网络压缩服务的服务商请求对已经训练完成的人脸属性识别网络进行压缩。服务商的压缩设备(例如提供神经网络压缩服务的云平台)在对人脸属性识别网络进行压缩的过程中需要使用该训练该人脸属性识别网络的训练数据。然而,客户往往难以提供人脸属性识别网络的训练数据,此时,压缩设备若要完成对该人脸属性识别网络的压缩,则需要自己生成训练数据。
此时,可以构建一个初始人脸图像生成器,该初始人脸图像生成器为深度神经网络,可以对输入的随机向量进行处理,输出生成人脸图像。将初始人脸图像生成器输出的生成人脸图像输入到人脸属性识别网络,可以得到判定属性,即该生成人脸图像被预测为多种属性分类中每一种属性的概率。应理解,在初始人脸图像生成器的初始训练过程中,生成人脸图像与真实人脸图像差距较大,人脸属性识别网络对生成人脸图像的判定准确度低,即生成人脸图像被识别为各个种类的属性的概率相差较小。训练设备可以确定该输入的随机向量通过初始人脸图像得到的生成人脸图像对应的真实属性,根据真实属性和判定属性之间的差异更新初始人脸图像生成器的参数,使得更新参数后的人脸图像生成器输出的生成人脸图像被人脸属性识别网络识别到的判定属性与真实属性之间的差异越来越小,最后得到目标人脸图像生成器。可以认为目标人脸图像生成器对输入的随机向量进行处理输出 的样本人脸图像可以逼近训练人脸属性识别网络所使用的真实人脸图像。进而,压缩设备可以基于生成的样本人脸图像,通过蒸馏或剪枝算法实现对该人脸属性识别网络的压缩和加速。
可见,本申请实施例提供的图像生成方法得到的目标人脸图像生成器不需要使用人脸属性识别网络的真实训练数据,将生成人脸图像输入到已训练好的人脸属性识别网络,得到该生成人脸图像的判定属性,以生成人脸图像的判定属性与真实属性的差异训练得到目标人脸图像生成器;而且,通过目标人脸图像生成器得到的样本人脸图像可以逼近训练人脸属性识别网络所使用的真实人脸图像。
不限于上述A场景和B场景,本申请实施例提供的图像生成方法生成的样本图像,还可以作为训练数据应用于其他以图像为输入的机器学习模型的训练场景中,本申请不作限定。还应理解,人脸图像为图像中的一种。
本申请实施例提供的文本生成方法能够应用在输入为文本的神经网络的训练、压缩等的场景。具体而言,本申请实施例的文本生成方法能够应用在如下所示的C场景,下面对C场景进行简单的介绍。
C场景:
该待压缩神经网络的训练数据包括真实文本和真实文本对应的分类。此时,可以训练一个文本生成网络,以生成可以替代真实文本的样本文本,进而基于生成的样本文本通过蒸馏或剪枝算法实现对该待压缩神经网络压缩和加速。
客户通过训练数据训练得到文本识别网络,该文本识别网络可以识别输入文本的分类。其中,在一种应用场景中该文本可以是以意图进行分类,例如意图包括:打开灯、关闭灯、打开空调、关闭空调、打开音响等,该文本识别网络可以应用于智能家居,智能家居的控制中心接收到语音后,将语音转换为文本,进而,控制中心通过文本识别网络识别输入的文本的意图,进而,根据文本识别网络识别到意图控制智能家居中的设备执行该意图对应的操作,比如,控制打开空调等。应理解,意图还可以包括其他的分类方式,本申请实施例不作限定。还应理解,在另一种应用场景中,该文本可以以其他方式进行分类,例如,文本可以以学科进行分类,以实现对文本的分类管理,本申请实施例不作限定。本申请实施例以文本识别网络用于识别输入文本的意图为例来说明。
该文本识别网络的训练数据包括真实文本和真实文本对应的分类。客户可以向具有神经网络压缩服务的服务商请求对已经训练完成的文本识别网络进行压缩。服务商的压缩设备(例如提供神经网络压缩服务的云平台)在对文本识别网络进行压缩的过程中需要使用该训练该文本识别网络的训练数据。然而,客户并没有提供文本识别网络的训练数据,此时,压缩设备若要完成对文本识别网络的压缩需要自己生成训练数据。
此时,可以构建一个初始文本生成器,该初始文本生成器为深度神经网络,可以对输入的随机向量进行处理,输出生成文本。将初始文本生成器输出的生成文本输入到文本识别网络,可以得到该生成文本对应的判定意图,即该生成文本的被预测为上述多个意图中每一个意图的概率。应理解,在初始文本生成器的初始训练过程中,生成文本与真实文本差距较大,文本识别网络对生成文本的判定准确度低,即生成文本被识别为上述各个意图的概率相差较小。训练设备可以确定该输入的随机向量对应的真实分类,根据真实意图和 判定意图之间的差异更新初始文本生成器的参数,使得更新参数后的文本生成器输出的生成文本被文本识别网络识别到的判定意图与真实意图之间的差异越来越小,得到目标文本生成器。可以认为目标文本生成器对输入的随机向量进行处理输出的样本文本可以逼近训练文本识别网络所使用的真实文本。进而,压缩设备可以基于生成的样本文本通过蒸馏或剪枝算法实现对该文本识别网络压缩和加速。
可见,本申请实施例提供的文本生成方法得到的目标文本生成器不需要使用文本识别网络的训练数据,将生成文本输入到已训练好的文本识别网络,得到该生成文本的判定意图,以生成文本的判定意图与真实意图的差异训练得到目标文本生成器;而且,通过目标文本生成器得到的样本文本可以逼近训练文本识别网络所使用的真实文本。
下面从模型训练侧和模型应用侧对本申请提供的方法进行描述:
本申请实施例提供的样本生成器的训练方法，涉及计算机视觉的处理或自然语言的处理，具体可以应用于数据训练、机器学习、深度学习等数据处理方法，对训练数据（如本申请中的第一矩阵）进行符号化和形式化的智能信息建模、抽取、预处理、训练等，最终得到训练好的目标样本生成器；并且，本申请实施例提供的样本生成方法可以运用上述训练好的目标样本生成器，将输入数据（如本申请中的第二矩阵）输入到所述训练好的目标样本生成器中，得到输出数据（如本申请中的第二生成样本）。需要说明的是，本申请实施例提供的目标样本生成器的训练方法和样本生成方法是基于同一个构思产生的发明，也可以理解为一个系统中的两个部分，或一个整体流程的两个阶段：如模型训练阶段和模型应用阶段。
由于本申请实施例涉及大量神经网络的应用,为了便于理解,下面先对本申请实施例涉及的相关术语及神经网络等相关概念进行介绍。
(1)图像识别
本申请实施例中,图像识别是利用图像处理和机器学习、计算机图形学等相关方法,根据图像识别图像所属的分类或者图像的属性等。例如,场景A中,识别图像所属的分类;又例如,场景B中,识别人脸图像的属性。
(2)文本识别
本申请实施例中，文本识别也称为自然语言识别，是利用语言学、计算机科学、人工智能等相关方法，根据文本识别文本所表达的意图、情感或者其他属性等。例如，场景C中，识别文本所表达的意图。
(3)神经网络
神经网络可以是由神经单元组成的，神经单元可以是指以x_s和截距1为输入的运算单元，该运算单元的输出可以为：

h_{W,b}(x) = f(W^T x) = f\left( \sum_{s=1}^{n} W_s x_s + b \right)

其中，s=1、2、……、n，n为大于1的自然数，W_s为x_s的权重，b为神经单元的偏置。f为神经单元的激活函数（activation functions），用于将非线性特性引入神经网络中，来将神经单元中的输入信号转换为输出信号。该激活函数的输出信号可以作为下一层卷积层的输入。激活函数可以是sigmoid函数。神经网络是将许多个上述单一的神经单元联结在一起形成的网络，即一个神经单元的输出可以是另一个神经单元的输入。每个神经单元的输入可以与前一层的局部接受域相连，来提取局部接受域的特征，局部接受域可以是由若干个神经单元组成的区域。
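上式描述的单个神经单元的计算可以用如下几行示意性代码表达（以sigmoid作为激活函数f的一个例子）：

```python
import numpy as np

def neuron(x, w, b):
    """单个神经单元：对输入加权求和并加上偏置，再通过sigmoid激活函数。"""
    s = np.dot(w, x) + b             # 即 sum_s W_s * x_s + b
    return 1.0 / (1.0 + np.exp(-s))  # 激活函数 f 取 sigmoid
```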
(4)深度神经网络
深度神经网络（deep neural network，DNN），也称多层神经网络，可以理解为具有很多层隐含层的神经网络，这里的“很多”并没有特别的度量标准。从DNN按不同层的位置划分，DNN内部的神经网络可以分为三类：输入层，隐含层，输出层。一般来说第一层是输入层，最后一层是输出层，中间的层数都是隐含层。层与层之间是全连接的，也就是说，第i层的任意一个神经元一定与第i+1层的任意一个神经元相连。虽然DNN看起来很复杂，但是就每一层的工作来说，其实并不复杂，简单来说就是如下线性关系表达式：y = α(Wx + b)，其中，x是输入向量，y是输出向量，b是偏移向量，W是权重矩阵（也称系数），α()是激活函数。每一层仅仅是对输入向量x经过如此简单的操作得到输出向量y。由于DNN层数多，则系数W和偏移向量b的数量也就很多了。这些参数在DNN中的定义如下所述：以系数W为例：假设在一个三层的DNN中，第二层的第4个神经元到第三层的第2个神经元的线性系数定义为W^3_{24}，上标3代表系数W所在的层数，而下标对应的是输出的第三层索引2和输入的第二层索引4。总结就是：第L-1层的第k个神经元到第L层的第j个神经元的系数定义为W^L_{jk}。需要注意的是，输入层是没有W参数的。在深度神经网络中，更多的隐含层让网络更能够刻画现实世界中的复杂情形。理论上而言，参数越多的模型复杂度越高，“容量”也就越大，也就意味着它能完成更复杂的学习任务。训练深度神经网络的过程也就是学习权重矩阵的过程，其最终目的是得到训练好的深度神经网络的所有层的权重矩阵（由很多层的向量W形成的权重矩阵）。
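上述逐层的线性变换加激活的计算过程，可以用如下示意性代码草图表达（激活函数此处示例性地取ReLU）：

```python
import numpy as np

def dnn_forward(x, layers):
    """逐层计算 y = alpha(W x + b)。

    layers: [(W1, b1), (W2, b2), ...]，对应各隐含层与输出层；
    输入层没有W参数，只提供输入向量x。
    """
    for W, b in layers:
        x = np.maximum(0.0, W @ x + b)  # 线性变换后通过激活函数alpha（此处为ReLU）
    return x
```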
(5)卷积神经网络
卷积神经网络(CNN,convolutional neuron network)是一种带有卷积结构的深度神经网络。卷积神经网络包含了一个由卷积层和子采样层构成的特征抽取器。该特征抽取器可以看作是滤波器,卷积过程可以看作是使用一个可训练的滤波器与一个输入的图像或者卷积特征平面(feature map)做卷积。卷积层是指卷积神经网络中对输入信号进行卷积处理的神经元层。在卷积神经网络的卷积层中,一个神经元可以只与部分邻层神经元连接。一个卷积层中,通常包含若干个特征平面,每个特征平面可以由一些矩形排列的神经单元组成。同一特征平面的神经单元共享权重,这里共享的权重就是卷积核。共享权重可以理解为提取图像信息的方式与位置无关。这其中隐含的原理是:图像的某一部分的统计信息与其他部分是一样的。即意味着在某一部分学习的图像信息也能用在另一部分上。所以对于图像上的所有位置,都能使用同样的学习得到的图像信息。在同一卷积层中,可以使用多个卷积核来提取不同的图像信息,一般地,卷积核数量越多,卷积操作反映的图像信息越丰富。
卷积核可以以随机大小的矩阵的形式初始化，在卷积神经网络的训练过程中卷积核可以通过学习得到合理的权重。另外，共享权重带来的直接好处是减少卷积神经网络各层之间的连接，同时又降低了过拟合的风险。
(6)循环神经网络(RNN,recurrent neural networks)是用来处理序列数据的。在传统的神经网络模型中,是从输入层到隐含层再到输出层,层与层之间是全连接的,而对于每一层层内之间的各个节点是无连接的。这种普通的神经网络虽然解决了很多难题,但是却仍然对很多问题却无能无力。例如,你要预测句子的下一个单词是什么,一般需要用到前面的单词,因为一个句子中前后单词并不是独立的。RNN之所以称为循环神经网路,即一个序列当前的输出与前面的输出也有关。具体的表现形式为网络会对前面的信息进行记忆并应用于当前输出的计算中,即隐含层本层之间的节点不再无连接而是有连接的,并且隐含层的输入不仅包括输入层的输出还包括上一时刻隐含层的输出。理论上,RNN能够对任何长度的序列数据进行处理。对于RNN的训练和对传统的CNN或DNN的训练一样。同样使用误差反向传播算法,不过有一点区别:即,如果将RNN进行网络展开,那么其中的参数,如W,是共享的;而如上举例上述的传统神经网络却不是这样。并且在使用梯度下降算法中,每一步的输出不仅依赖当前步的网络,还依赖前面若干步网络的状态。该学习算法称为基于时间的反向传播算法Back propagation Through Time(BPTT)。
既然已经有了卷积神经网络,为什么还要循环神经网络?原因很简单,在卷积神经网络中,有一个前提假设是:元素之间是相互独立的,输入与输出也是独立的,比如猫和狗。但现实世界中,很多元素都是相互连接的,比如股票随时间的变化,再比如一个人说了:我喜欢旅游,其中最喜欢的地方是云南,以后有机会一定要去。这里填空,人类应该都知道是填“云南”。因为人类会根据上下文的内容进行推断,但如何让机器做到这一步?RNN就应运而生了。RNN旨在让机器像人一样拥有记忆的能力。因此,RNN的输出就需要依赖当前的输入信息和历史的记忆信息。
(7)损失函数
在训练深度神经网络的过程中，因为希望深度神经网络的输出尽可能的接近真正想要预测的值，所以可以通过比较当前网络的预测值和真正想要的目标值，再根据两者之间的差异情况来更新每一层神经网络的权重向量（当然，在第一次更新之前通常会有初始化的过程，即为深度神经网络中的各层预先配置参数），比如，如果网络的预测值高了，就调整权重向量让它预测低一些，不断的调整，直到深度神经网络能够预测出真正想要的目标值或与真正想要的目标值非常接近的值。因此，就需要预先定义“如何比较预测值和目标值之间的差异”，这便是损失函数（loss function）或目标函数（objective function），它们是用于衡量预测值和目标值的差异的重要方程。其中，以损失函数举例，损失函数的输出值（loss）越高表示差异越大，那么深度神经网络的训练就变成了尽可能缩小这个loss的过程。
例如，本申请中，在对初始样本生成器的优化过程中，因为希望初始样本生成器输出的第一生成样本尽可能的接近真实样本。由于预设判别器为训练好的神经网络，预设判别器可以准确识别真实样本所属的分类。当预设判别器可以准确识别第一生成样本所属的分类时，可以认为第一生成样本和真实样本特性相似，即接近真实样本。因此，通过比较预设判别器对第一生成样本的判定结果与真正想要的真实结果，再根据两者之间的差异情况来更新初始样本生成器中每一层神经网络的权重向量（当然，在第一次更新之前通常会有初始化的过程，即为初始样本生成器中的各层预先配置参数），比如，如果预设判别器的判定结果的值高了，就调整权重向量让它的值低一些，不断的调整，直到预设判别器能够预测出与真实结果非常接近的值。因此，就需要预先定义“如何比较判定结果和真实结果之间的差异”，这便是损失函数（loss function）或目标函数（objective function），它们是用于衡量判定结果和真实结果的差异的重要方程。其中，以损失函数举例，损失函数的输出值（loss）越高表示差异越大，那么初始样本生成器的训练就变成了尽可能缩小这个loss的过程。
(8)反向传播算法
卷积神经网络可以采用误差反向传播(back propagation,BP)算法在训练过程中修正初始样本生成器中参数的大小,使得初始样本生成器的重建误差损失越来越小。具体地,前向传递输入信号直至输出会产生误差损失,通过反向传播误差损失信息来更新初始样本生成器中参数,从而使误差损失收敛。反向传播算法是以误差损失为主导的反向传播运动,旨在得到最优的目标样本生成器的参数,例如权重矩阵。
(9)生成式对抗网络
生成式对抗网络(generative adversarial networks,GAN)是一种深度学习模型。该模型中至少包括两个模块:一个模块是生成模型(generative model,本申请实施例中也称生成网络),另一个模块是判别模型(discriminative model,本申请实施例中也称为判别网络),通过这两个模块互相博弈学习,从而产生更好的输出。生成模型和判别模型都可以是神经网络,具体可以是深度神经网络,或者卷积神经网络。GAN的基本原理如下:以生成图片的GAN为例,假设有两个网络,G(generator)和D(discriminator),其中G是一个生成图片的网络,它接收一个随机的噪声z,通过这个噪声生成图片,记做G(z);D是一个判别网络,用于判别一张图片是不是“真实的”。它的输入参数是x,x代表一张图片,输出D(x)代表x为真实图片的概率,如果为1,就代表100%是真实的图片,如果为0,就代表不可能是真实的图片。在对该生成式对抗网络进行训练的过程中,生成网络G的目标就是尽可能生成真实的图片去欺骗判别网络D,而判别网络D的目标就是尽量把G生成的图片和真实的图片区分开来。这样,G和D就构成了一个动态的“博弈”过程,也即“生成式对抗网络”中的“对抗”。最后博弈的结果,在理想的状态下,G可以生成足以“以假乱真”的图片G(z),而D难以判定G生成的图片究竟是不是真实的,即D(G(z))=0.5。这样就得到了一个优异的生成模型G,它可以用来生成图片。
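上述G与D交替博弈的训练过程可以用如下一维玩具示例的草图说明。此处“真实图片”被简化为服从N(3,1)分布的一维数据，G与D都只有两个参数，学习率等超参数均为本示例的假设，仅用于演示交替更新的原理：

```python
import numpy as np

def gan_step(g, d, rng, lr=0.05, batch=64):
    """玩具GAN的一次交替更新。

    生成器: x = g[0] * z + g[1]，z ~ N(0, 1)
    判别器: D(x) = sigmoid(d[0] * x + d[1])，表示x为真实数据的概率
    """
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    real = rng.normal(3.0, 1.0, batch)   # “真实样本”
    z = rng.normal(size=batch)
    fake = g[0] * z + g[1]               # G生成的“假样本”

    # 先更新D：让D(real)趋近1、D(fake)趋近0（梯度上升）
    dr, df = sig(d[0] * real + d[1]), sig(d[0] * fake + d[1])
    d[0] += lr * np.mean((1 - dr) * real - df * fake)
    d[1] += lr * np.mean((1 - dr) - df)

    # 再更新G：最大化log D(fake)，让假样本骗过D
    df = sig(d[0] * fake + d[1])
    g[0] += lr * np.mean((1 - df) * d[0] * z)
    g[1] += lr * np.mean((1 - df) * d[0])
    return g, d
```

在理想情况下，交替迭代足够多步后，生成分布会被推向真实数据分布附近，此时D难以区分真假样本，即G学到了真实数据的分布。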
(10)像素值
图像的像素值可以是一个红绿蓝(RGB)颜色值,像素值可以是表示颜色的长整数。例如,像素值为256*Red+100*Green+76Blue,其中,Blue代表蓝色分量,Green代表绿色分量,Red代表红色分量。各个颜色分量中,数值越小,亮度越低,数值越大,亮度越高。对于灰度图像来说,像素值可以是灰度值。
下面介绍本申请实施例提供的系统架构。
参见附图1，本发明实施例提供了一种系统架构100。如所述系统架构100所示，数据采集设备160用于采集或生成训练数据，本申请实施例中训练数据包括：第一矩阵，其中，第一矩阵可以是随机矩阵、随机向量等；并将训练数据存入数据库130，训练设备120基于数据库130中维护的训练数据和预设判别器121训练得到目标样本生成器101。训练过程可包括，训练设备120将第一矩阵输入初始样本生成器122，得到第一生成样本，初始样本生成器122为深度神经网络；进一步地，将第一生成样本输入预设判别器121，得到判别结果，其中，该预设判别器121是经过第一训练数据训练得到的，第一训练数据包括真实样本和真实样本对应的分类；进而，根据判别结果确定第一生成样本的真实结果，根据第一生成样本的真实结果与判别结果的差异等更新初始样本生成器122，得到目标样本生成器101。其中，目标样本生成器101可以是在场景A中的目标图像生成器、场景B中目标人脸图像生成器或场景C中目标文本生成器。训练设备120基于训练数据得到目标样本生成器101的详细描述可参见下述实施例一中相关描述，此处不对此展开。该目标样本生成器101能够用于实现本申请实施例提供的样本生成方法，即，将第二矩阵输入到所述目标样本生成器101，即可得到第二生成样本，具体可以参见下述实施例二中相关描述，此处不对此展开。进一步地，该第二生成样本能够用于实现本申请实施例提供的神经网络的压缩方法，此时，待压缩神经网络102即为上述预设判别器121，将第二生成样本替代待压缩神经网络102（即本申请中预设判别器121）的训练数据通过蒸馏算法或剪枝算法实现对待压缩神经网络102的压缩，具体可以参见下述实施例三中相关描述，此处不对此展开。
应理解,在本申请实施例的A场景中,目标样本生成器101即为目标图像生成器,预设判别器121和待压缩神经网络102同为图像识别网络,将第二矩阵输入到目标图像生成器中,即可生成图像样本,该图像样本可以进一步地应用于图像识别网络的压缩和加速。在本申请实施例的B场景中,目标样本生成器101即为目标人脸图像生成器,预设判别器121和待压缩神经网络102同为人脸属性识别网络,将第二矩阵输入到目标人脸图像生成器中,即可生成人脸图像样本,该人脸图像样本可以进一步地应用于人脸属性识别网络的压缩和加速。在本申请实施例的C场景中,目标样本生成器101即为目标文本生成器,预设判别器121和待压缩神经网络102同为文本识别网络,将第二矩阵输入到目标文本生成器中,即可生成文本样本,该文本样本可以进一步地应用于文本识别网络的压缩和加速。
在本申请提供的实施例中,该目标样本生成器是通过训练深度神经网络得到的。本申请实施例中的预设判别器为经过预先训练得到的深度神经网络模型。需要说明的是,在实际的应用中,所述数据库130中维护的训练数据不一定都来自于数据采集设备160的采集,也有可能是从其他设备接收得到的。另外需要说明的是,训练设备120也不一定完全基于数据库130维护的训练数据进行目标样本生成器101的训练,也有可能从云端获取或者自己生成训练数据进行模型训练,上述描述不应该作为对本申请实施例的限定。
根据训练设备120训练得到的目标样本生成器101可以应用于不同的***或设备中,如应用于图1所示的执行设备110,所述执行设备110可以是终端,如手机终端,平板电脑,笔记本电脑,AR/VR,车载终端等,还可以是服务器或者云端等。执行设备110可以执行本申请实施例中样本生成方法、图像生成方法、文本生成方法等。在附图1中,压缩设备170配置有I/O接口172,用于与外部设备进行数据交互,用户可以通过客户设备140向I/O接口172输入数据,所述输入数据在本申请实施例中可以包括:待压缩神经网络102,向压缩设备170请求对待压缩神经网络102进行压缩。
在压缩设备170的计算模块171执行计算等相关的处理过程中,压缩设备170可以调用数据存储系统150中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统150中。
最后,I/O接口172将处理结果,如上述神经网络的压缩方法得到的压缩后的神经网络返回给客户设备140,从而客户设备140可以提供给用户设备180。该用户设备180可以是需要使用压缩后的神经网络的轻量级终端,如手机终端、笔记本电脑、AR/VR终端或车载终端等,以响应终端用户的相应需求,如对终端用户输入的图像进行图像识别输出识别结果给该终端用户,或对终端用户输入的文本进行文本分类输出分类结果给该终端用户等。
值得说明的是,训练设备120可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标样本生成器101,该相应的目标样本生成器101即可以用于实现上述样本生成或完成上述任务,从而为用户提供所需的结果。
在附图1中所示情况下,客户可以手动给定输入数据(如本申请实施例中待压缩神经网络),该手动给定可以通过I/O接口172提供的界面进行操作。另一种情况下,客户设备140可以自动地向I/O接口172发送输入数据,如果要求客户设备140自动发送输入数据需要获得用户的授权,则用户可以在客户设备140中设置相应权限。客户可以在客户设备140查看压缩设备170输出的结果。
客户设备140在接收到压缩后的神经网络后,可以将压缩后的神经网络传输给用户设备180,用户设备180可以是终端,如手机终端,平板电脑,笔记本电脑,AR/VR,车载终端等,用户设备180运行压缩后的神经网络,以实现该压缩后的神经网络的功能。应理解,也可以由压缩设备170向用户设备180直接提供该压缩后的神经网络,对此不作限定。
值得注意的是,附图1仅是本发明实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在附图1中,数据存储系统150相对压缩设备170是外部存储器,在其它情况下,也可以将数据存储系统150置于压缩设备170中。
如图1所示,根据训练设备120训练得到目标样本生成器101,该目标样本生成器101可以是A场景中的目标图像生成器、B场景中的目标人脸图像生成器、C场景中目标文本生成器,具体的,本申请实施例提供的目标样本生成器101、目标图像生成器、目标人脸图像生成器和目标文本生成器都可以是卷积神经网络或循环神经网络等机器学习模型。
根据执行设备110压缩得到压缩后的神经网络,该压缩后的神经网络可以是A场景中图像识别网络经过压缩后的神经网络、B场景中人脸属性识别网络经过压缩后的神经网络、C场景中文本识别网络经过压缩后的神经网络等,上述神经网络可以是卷积神经网络或循环神经网络等深度神经网络机器学习模型。
如前文的基础概念介绍所述,卷积神经网络是一种带有卷积结构的深度神经网络,是一种深度学习(deep learning)架构,深度学习架构是指通过机器学习的算法,在不同的抽象层级上进行多个层次的学习。作为一种深度学习架构,CNN是一种前馈(feed-forward)人工神经网络,该前馈人工神经网络中的各个神经元可以对输入其中的图像作出响应。
如图2所示,卷积神经网络(CNN)200可以包括输入层210,卷积层/池化层220(其中池化层为可选的),以及神经网络层230。
卷积层/池化层220:
卷积层:
如图2所示卷积层/池化层220可以包括如示例221-226层,举例来说:在一种实现中,221层为卷积层,222层为池化层,223层为卷积层,224层为池化层,225为卷积层,226为池化层;在另一种实现方式中,221、222为卷积层,223为池化层,224、225为卷积层,226为池化层。即卷积层的输出可以作为随后的池化层的输入,也可以作为另一个卷积层的输入以继续进行卷积操作。
下面将以卷积层221为例,介绍一层卷积层的内部工作原理。
卷积层221可以包括很多个卷积算子,卷积算子也称为核,其在图像处理中的作用相当于一个从输入图像矩阵中提取特定信息的过滤器,卷积算子本质上可以是一个权重矩阵,这个权重矩阵通常被预先定义,在对图像进行卷积操作的过程中,权重矩阵通常在输入图像上沿着水平方向一个像素接着一个像素(或两个像素接着两个像素……这取决于步长stride的取值)的进行处理,从而完成从图像中提取特定特征的工作。该权重矩阵的大小应该与图像的大小相关,需要注意的是,权重矩阵的纵深维度(depth dimension)和输入图像的纵深维度是相同的,在进行卷积运算的过程中,权重矩阵会延伸到输入图像的整个深度。因此,和一个单一的权重矩阵进行卷积会产生一个单一纵深维度的卷积化输出,但是大多数情况下不使用单一权重矩阵,而是应用多个尺寸(行×列)相同的权重矩阵,即多个同型矩阵。每个权重矩阵的输出被堆叠起来形成卷积图像的纵深维度,这里的维度可以理解为由上面所述的“多个”来决定。不同的权重矩阵可以用来提取图像中不同的特征,例如一个权重矩阵用来提取图像边缘信息,另一个权重矩阵用来提取图像的特定颜色,又一个权重矩阵用来对图像中不需要的噪点进行模糊化等。该多个权重矩阵尺寸(行×列)相同,经过该多个尺寸相同的权重矩阵提取后的特征图的尺寸也相同,再将提取到的多个尺寸相同的特征图合并形成卷积运算的输出。
这些权重矩阵中的权重值在实际应用中需要经过大量的训练得到,通过训练得到的权重值形成的各个权重矩阵可以用来从输入图像中提取信息,从而使得卷积神经网络200进行正确的预测。
当卷积神经网络200有多个卷积层的时候,初始的卷积层(例如221)往往提取较多的一般特征,该一般特征也可以称之为低级别的特征;随着卷积神经网络200深度的加深,越往后的卷积层(例如226)提取到的特征越来越复杂,比如高级别的语义之类的特征,语义越高的特征越适用于待解决的问题。
池化层:
由于常常需要减少训练参数的数量,因此卷积层之后常常需要周期性的引入池化层,在如图2中220所示例的221-226各层,可以是一层卷积层后面跟一层池化层,也可以是多层卷积层后面接一层或多层池化层。在图像处理过程中,池化层的唯一目的就是减少图像的空间大小。池化层可以包括平均池化算子和/或最大池化算子,以用于对输入图像进行采样得到较小尺寸的图像。平均池化算子可以在特定范围内对图像中的像素值进行计算产生平均值作为平均池化的结果。最大池化算子可以在特定范围内取该范围内值最大的像素作为最大池化的结果。另外,就像卷积层中用权重矩阵的大小应该与图像尺寸相关一样,池化层中的运算符也应该与图像的大小相关。通过池化层处理后输出的图像尺寸可以小于输入池化层的图像的尺寸,池化层输出的图像中每个像素点表示输入池化层的图像的对应子区域的平均值或最大值。
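上述平均池化与最大池化的计算过程可以用如下极简的 Python 代码示意。这里仅处理 k×k 不重叠窗口的情形,函数名等为本文为说明而引入的假设:

```python
def pool2d(img, k, mode="max"):
    """对二维图像 img 按 k×k 不重叠窗口做池化,输出尺寸约缩小为原来的 1/k。
    mode="max" 取窗口内最大值(最大池化),否则取窗口内平均值(平均池化)。"""
    h, w = len(img), len(img[0])
    out = []
    for i in range(0, h - h % k, k):
        row = []
        for j in range(0, w - w % k, k):
            win = [img[i + a][j + b] for a in range(k) for b in range(k)]
            row.append(max(win) if mode == "max" else sum(win) / len(win))
        out.append(row)
    return out
```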
神经网络层230:
在经过卷积层/池化层220的处理后,卷积神经网络200还不足以输出所需要的输出信息。因为如前所述,卷积层/池化层220只会提取特征,并减少输入图像带来的参数。然而为了生成最终的输出信息(所需要的类信息或其他相关信息),卷积神经网络200需要利用神经网络层230来生成一个或者一组所需要的类的数量的输出。因此,在神经网络层230中可以包括多层隐含层(如图2所示的231、232至23n)以及输出层240,该多层隐含层中所包含的参数可以根据具体的任务类型的相关训练数据进行预先训练得到,例如该任务类型可以包括图像生成、文本生成、样本生成等等……
在神经网络层230中的多层隐含层之后,也就是整个卷积神经网络200的最后层为输出层240,该输出层240具有类似分类交叉熵的损失函数,具体用于计算预测误差,一旦整个卷积神经网络200的前向传播(如图2由210至240方向的传播为前向传播)完成,反向传播(如图2由240至210方向的传播为反向传播)就会开始更新前面提到的各层的权重值以及偏差,以减少卷积神经网络200的损失,即卷积神经网络200通过输出层输出的结果和理想结果之间的误差。
需要说明的是,如图2所示的卷积神经网络200仅作为一种卷积神经网络的示例,在具体的应用中,卷积神经网络还可以以其他网络模型的形式存在。
下面介绍本申请实施例提供的一种芯片硬件结构。
图3为本发明实施例提供的一种芯片硬件结构,该芯片包括神经网络处理器30。该芯片可以被设置在如图1所示的执行设备110中,用以完成计算模块171的计算工作。该芯片也可以被设置在如图1所示的训练设备120中,用以完成训练设备120的训练工作并输出目标模型/规则101。如图2所示的卷积神经网络中各层的算法均可在如图3所示的芯片中得以实现。
神经网络处理器30可以是NPU,TPU,或者GPU等一切适合用于大规模异或运算处理的处理器。以NPU为例:NPU可以作为协处理器挂载到主CPU(Host CPU)上,由主CPU为其分配任务。NPU的核心部分为运算电路303,通过控制器304控制运算电路303提取存储器(301和302)中的矩阵数据并进行乘加运算。
在一些实现中,运算电路303内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路303是二维脉动阵列。运算电路303还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路303是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路303从权重存储器302中取矩阵B的权重数据,并缓存在运算电路303中的每一个PE上。运算电路303从输入存储器301中取矩阵A的输入数据,根据矩阵A的输入数据与矩阵B的权重数据进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)308中。
统一存储器306用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(DMAC,Direct Memory Access Controller)305,被搬运到权重存储器302中。输入数据也通过DMAC被搬运到统一存储器306中。
总线接口单元(BIU,Bus Interface Unit)310,用于DMAC和取指存储器(Instruction Fetch Buffer)309的交互;总线接口单元310还用于取指存储器309从外部存储器获取指令;总线接口单元310还用于存储单元访问控制器305从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器306中,或将权重数据搬运到权重存储器302中,或将输入数据搬运到输入存储器301中。
向量计算单元307包括多个运算处理单元,在需要的情况下,对运算电路303的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。向量计算单元307主要用于神经网络中非卷积层,或全连接层(FC,fully connected layers)的计算,具体可以处理:Pooling(池化),Normalization(归一化)等的计算。例如,向量计算单元307可以将非线性函数应用到运算电路303的输出,例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元307生成归一化的值、合并值,或二者均有。
在一些实现中,向量计算单元307将经处理的向量存储到统一存储器306。在一些实现中,经向量计算单元307处理过的向量能够用作运算电路303的激活输入,例如用于神经网络中后续层中的使用,如图2所示,若当前处理层是隐含层1(231),则经向量计算单元307处理过的向量还可以被用到隐含层2(232)中的计算。
控制器304连接的取指存储器(instruction fetch buffer)309,用于存储控制器304使用的指令;
统一存储器306,输入存储器301,权重存储器302以及取指存储器309均为On-Chip存储器。外部存储器独立于该NPU硬件架构。
其中,图2所示的卷积神经网络中各层的运算可以由运算电路303或向量计算单元307执行。
下面详细描述本申请实施例涉及的方法。
需要说明的是,下述实施例一、二、三、四、五中可以应用于上述场景A、场景B、场景C中,对应于A场景描述的实施例,“初始样本生成器”即为“初始图像生成器”,“预设判别器”即为“图像识别网络”,“第一生成样本”即为“生成图像”,“真实样本”即为“真实图像”,“判定结果”即为“判定分类”,“真实结果”即为“真实分类”,“第二生成样本”即为“样本图像”。
对应于场景B描述的实施例,“初始样本生成器”即为“初始人脸图像生成器”,“预设判别器”即为“人脸属性识别网络”,“第一生成样本”即为“生成人脸图像”,“真实样本”即为“真实人脸图像”,“判定结果”即为“判定属性”,“真实结果”即为“真实属性”,“第二生成样本”即为“样本人脸图像”。
在场景C中,“初始样本生成器”即为“初始文本生成器”,“预设判别器”即为“文本识别网络”,“第一生成样本”即为“生成文本”,“真实样本”即为“真实文本”,“判定结果”即为“判定意图”,“真实结果”即为“真实意图”,“第二生成样本”即为“样本文本”。
实施例一:
图4A为本发明实施例一提供的一种样本生成器的训练方法,图4B为样本生成器的训练方法的示意性说明图,该方法具体可以由如图1所示的训练设备120执行。
可选的,该方法可以由CPU处理,也可以由CPU和GPU共同处理,也可以不用GPU,而使用其他适合用于神经网络计算的处理器,如图3所示的神经网络处理器30,本申请不做限制。该方法可以包括如下部分或全部步骤:
S402:接收第一矩阵。
其中,第一矩阵可以是随机矩阵(stochastic matrix)、随机向量或者其他形式的矩阵,本申请不做限制。第一矩阵可以由训练设备120生成,也可以在训练设备120之前由其他功能模块预先生成,还可以从数据库130中获取等,本申请不作限定。
其中,随机矩阵也称概率矩阵、马尔科夫矩阵等,随机矩阵中每一个元素都是一个表示概率的非负实数,随机矩阵中每一行元素之和为1。
应理解,本申请中训练设备对样本生成器的一次训练过程可以采用单个样本、多个样本或者全部样本,本申请实施例不作限定。例如,若一次训练过程采用N个样本,则训练设备接收N个随机向量,可以表示为{z_1, z_2, …, z_i, …, z_N},其中,N为正整数,该N个随机向量中第i个向量表示为z_i,i为该组随机向量中随机向量的索引,i为不大于N的正整数。
应理解,本申请实施例中第一矩阵为泛指的矩阵,N个第一矩阵中各个第一矩阵内元素的值可以互不相同。例如,z_1≠z_2≠z_3≠…≠z_N。
S404:将第一矩阵输入初始样本生成器,得到第一生成样本,初始样本生成器为深度神经网络。
应理解,初始样本生成器可以是初始化的深度神经网络,也可以是训练过程中产生的深度神经网络。
还应理解,训练设备通过初始样本生成器对输入的第一矩阵进行处理,初始样本生成器输出第一生成样本。其中,第一生成样本为多个像素值点组成的矩阵。
例如,将N个随机向量{z_1, z_2, …, z_i, …, z_N}分别输入到初始样本生成器,得到N个第一生成样本{x_1, x_2, …, x_i, …, x_N},其中,随机向量与第一生成样本一一对应,即,初始样本生成器对输入的随机向量z_i进行处理后得到第一生成样本x_i。
本申请中,“第一生成样本”为训练设备基于输入的第一矩阵通过初始样本生成器生成的样本;“真实样本”为训练预设判别器所使用的样本。
S406:将第一生成样本输入预设判别器,得到判别结果,其中,判别结果包括第一生成样本被预测为M个分类中每一个分类的概率,预设判别器是经过第一训练数据训练得到的,第一训练数据包括真实样本和该真实样本对应的分类。
应理解,本申请中,预设判别器为通过第一训练数据预先训练得到的深度神经网络,该预设判别器可以识别输入的样本的分类。该预设判别器为已知的模型,其训练为现有技术中训练模型采用的方法,本申请不作限定。
将N个第一生成样本{x_1, x_2, …, x_i, …, x_N}输入预设判别器,得到N个判别结果{y_1, y_2, …, y_i, …, y_N},其中,N个第一生成样本与N个判别结果一一对应,即,预设判别器对输入的第一生成样本x_i进行处理,得到判别结果y_i。本申请中判别结果包括第一生成样本被预测为M个分类中每一个分类的概率,即判别结果y_i = {y_{i,1}, y_{i,2}, …, y_{i,j}, …, y_{i,M}},其中,y_{i,j}表示生成样本x_i被预测为分类j的概率,j为分类的索引,M为大于1的正整数,j为不大于M的正整数。
S408:确定M个分类中各个分类对应的概率中的最大概率,将该最大概率对应的分类确定为第一生成样本的真实结果。
其中,根据第一生成样本x_i对应的判别结果y_i可以确定第一生成样本x_i对应的真实结果t_i。此时,真实结果t_i可以是判别结果y_i中最大概率值对应的分类,也即,将M个分类中概率最大值对应的分类的概率置1,其他分类的概率都置0。根据N个第一生成样本的判别结果可以得到N个真实结果,应理解,判别结果与真实结果一一对应。N个真实结果可以表示为{t_1, t_2, …, t_i, …, t_N}。
例如,预设判别器为如场景A所示的图像识别网络,M个分类包括狗、猫和鸡。此时,若输入到图像识别网络的生成图像的判别结果为:狗的概率0.5、猫的概率0.2、鸡的概率0.3,该判别结果可以表示为{0.5, 0.2, 0.3};则生成图像的真实结果为狗,可以表示为{1, 0, 0}。
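上述“将最大概率对应的分类置1、其余分类置0”的伪标签构造过程,可以用如下 Python 代码示意(函数名为本文为说明而引入的假设):

```python
def to_pseudo_label(probs):
    """将判别结果(M 个分类的概率)转换为真实结果(one-hot 伪标签):
    最大概率对应的分类置 1,其余分类置 0。"""
    k = max(range(len(probs)), key=lambda j: probs[j])
    return [1 if j == k else 0 for j in range(len(probs))]

# 正文示例:狗 0.5、猫 0.2、鸡 0.3 -> 真实结果为“狗”
print(to_pseudo_label([0.5, 0.2, 0.3]))  # 输出 [1, 0, 0]
```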
在S408的另一种实现中,也可以确定M个分类中任意一个分类为该第一生成样本对应的真实结果,本申请实施例不作限定。
S410:根据第一生成样本的真实结果和判别结果更新样本生成器的参数,得到更新后的样本生成器。
现有技术中,GAN网络的训练,需要迭代训练判别器和生成器,且判别器需要基于真实样本和生成器输出的生成样本进行训练。本申请实施例中,预设判别器为已训练好的深度神经网络,且训练该预设判别器所用的真实样本不可获得。本申请中,通过预设判别器来识别生成样本的分类,在初始样本生成器生成的第一生成样本可以被预设判别器精确地识别时(即预设判别器对第一生成样本的判别结果与真实结果之间的差异趋于0时),认为样本生成器得到的生成样本可以代替训练判别器所使用的真实样本。本申请实施例中,样本生成器的训练可以采用反向传播算法在训练过程中修正样本生成器中参数的大小,使得样本生成器的重建误差损失越来越小。
S410的具体实现中,训练设备可以根据每一个第一生成样本的判别结果和真实结果的差异确定该N个第一生成样本对应的损失,根据N个第一生成样本对应的损失,通过优化算法更新样本生成器的参数。在训练过程中,预设判别器的参数保持不变,仅仅更新样本生成器的参数。优化算法可以是梯度下降法(gradient descent)或其他优化算法,本申请实施例不作限定。
本申请实施例中,可以用损失函数计算N个第一生成样本对应的损失,损失函数可以包括由判别结果和真实结果的差异确定的第一损失项,其中,第一损失项可以是判定结果和真实结果之间平均绝对误差(mean absolute error,MAE)、均方误差(mean squared error,MSE)或均方根误差(root mean squared error,RMSE)等,也可以是判定结果和真实结果的交叉熵,还可以具有其他的形式,本申请不作限定。
例如,本申请实施例中可以通过交叉熵来表示第一损失项L_c,则:

L_c = (1/N)·Σ_{i=1}^{N} H_c(y_i, t_i)
其中,H_c(y_i, t_i)为判定结果y_i和真实结果t_i的交叉熵,可以表示为:

H_c(y_i, t_i) = −Σ_{j=1}^{M} t_{i,j}·log(y_{i,j})
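作为示意,第一损失项 L_c(N 个样本的交叉熵平均值)可以用如下 Python 代码计算,其中函数名与数值稳定项 eps 均为本文为说明而引入的假设:

```python
import math

def cross_entropy(y, t, eps=1e-12):
    """H_c(y_i, t_i) = -sum_j t_{i,j} * log(y_{i,j});eps 用于避免 log(0)。"""
    return -sum(tj * math.log(yj + eps) for yj, tj in zip(y, t))

def loss_lc(ys, ts):
    """第一损失项 L_c:N 个样本交叉熵的平均值。"""
    return sum(cross_entropy(y, t) for y, t in zip(ys, ts)) / len(ys)
```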
由于真实样本输入到深度神经网络中,提取的特征的绝对值通常比较大,为使得生成样本和真实样本具有相似的特性,可选地,损失函数还可以包括由第一生成样本的特征确定的第二损失项。此时,该样本生成器的训练方法还包括:训练设备通过预设判别器提取第一生成样本的特征,其中,该第一生成样本的特征可以是将第一生成样本输入到预设判别器后由预设判别器中任意一个卷积层输出的特征。可选地,该第一生成样本的特征为由预设判别器中最后一层卷积层输出的特征,即为第一生成样本的高阶特征。应理解,高阶特征即为高层语义特征,比如,对于文本来说,高阶特征可以是语义特征等。其中,生成样本x_i对应的特征可以表示为f_i。N个第一生成样本可以得到N个特征{f_1, f_2, …, f_i, …, f_N}。其中,f_i = {f_{i,1}, f_{i,2}, …, f_{i,k}, …, f_{i,P}},P为正整数。
该第二损失项L_f可以表示为:

L_f = −(1/N)·Σ_{i=1}^{N} ‖f_i‖_1
其中,‖f_i‖_1表示特征f_i的1范数,即为f_i中所有元素的绝对值之和:

‖f_i‖_1 = Σ_{k=1}^{P} |f_{i,k}|
应理解,第二损失项L_f还可以采用其他形式,例如,第二损失项L_f为N个特征的2范数的平均值等,本申请实施例不作限定。
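第二损失项 L_f(特征 1 范数平均值的相反数)可以用如下 Python 代码示意,函数名为本文假设:

```python
def loss_lf(features):
    """第二损失项 L_f = -(1/N) * sum_i ||f_i||_1,
    其中 ||f_i||_1 为特征 f_i 中所有元素的绝对值之和。
    取负号是因为希望生成样本的特征绝对值尽可能大(loss 越小越好)。"""
    n = len(features)
    return -sum(sum(abs(v) for v in f) for f in features) / n
```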
应理解,若样本生成器产生各个分类的第一生成样本的概率相同,则对于M个分类中任意一个分类,比如分类j,多个第一生成样本被预测为分类j的概率趋于1/M,此时,M个分类的信息熵最大。可选地,该样本生成器的训练方法还包括:计算N个第一生成样本被预测为M个分类中每一个分类的概率平均值,得到M个分类在该N个判别结果中的概率平均值V = {v_1, v_2, …, v_j, …, v_M},其中,v_j为N个第一生成样本被预测为分类j的概率平均值:

v_j = (1/N)·Σ_{i=1}^{N} y_{i,j}
例如,预设判别器为如场景A所示的图像识别网络,假设M=3,M个分类包括狗、猫和鸡,N=4。此时,若生成图像F1的判别结果为y_1 = {0.5, 0.2, 0.3}(即狗的概率0.5、猫的概率0.2、鸡的概率0.3)、生成图像F2的判别结果为y_2 = {0.3, 0.4, 0.3}、生成图像F3的判别结果为y_3 = {0.5, 0.1, 0.4}、生成图像F4的判别结果为y_4 = {0.2, 0.2, 0.6},则生成图像被预测为狗的平均概率值为(0.5+0.3+0.5+0.2)/4=0.375,同理,生成图像被预测为猫和鸡的平均概率值分别为0.225、0.4。
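上述各分类概率平均值 v_j 的计算可以用如下 Python 代码复现本段示例(函数名为本文假设):

```python
def class_prob_average(results):
    """v_j:N 个判别结果中,每个分类 j 的概率平均值。"""
    n, m = len(results), len(results[0])
    return [sum(y[j] for y in results) / n for j in range(m)]

# 本段示例中四张生成图像的判别结果(分类顺序:狗、猫、鸡)
ys = [[0.5, 0.2, 0.3], [0.3, 0.4, 0.3], [0.5, 0.1, 0.4], [0.2, 0.2, 0.6]]
print(class_prob_average(ys))  # 约为 [0.375, 0.225, 0.4]
```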
损失函数还可以包括由M个分类中每一个分类在该N个判别结果中的概率平均值确定的第三损失项L_in,第三损失项L_in可以表示为:

L_in = Σ_{j=1}^{M} v_j·log(v_j)
应理解,第三损失项L_in还可以具有其他的表示形式,例如,第三损失项L_in可以是概率平均值V和1/M之间的平均绝对误差、均方误差或均方根误差等,本申请实施例不做限定。
S410的第一种实现:
训练设备可以根据第一生成样本的判定结果和真实结果之间的差异更新样本生成器的参数,具体实现中,训练设备可以利用N个第一生成样本的判定结果和真实结果之间的差异确定的第一损失项,更新初始样本生成器的模型参数。此时,损失函数可以表述为:
L = L_c             (7)
S410的第二种实现:
训练设备可以根据第一生成样本的判别结果与真实结果的差异以及第一生成样本的特征更新初始样本生成器的参数。具体实现中,训练设备可以利用N个第一生成样本的判定结果和真实结果之间的差异确定的第一损失项和N个第一生成样本的特征确定的第二损失项,更新初始样本生成器的模型参数。此时,损失函数可以表述为:
L = L_c + αL_f               (8)
S410的第三种实现:
训练设备可以根据第一生成样本的判别结果与真实结果的差异、第一生成样本的特征以及M个分类中每一个分类的概率平均值更新初始样本生成器的参数。具体实现中,训练设备可以利用N个第一生成样本的判定结果和真实结果之间的差异确定的第一损失项、N个第一生成样本的特征确定的第二损失项和根据N个第一生成样本的判定结果统计得到的M个分类中每一个分类的概率平均值确定的第三损失项,更新初始样本生成器的模型参数。此时,损失函数可以表述为:
L = L_c + αL_f + βL_in                (9)
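总损失 L = L_c + α·L_f + β·L_in 的计算可以用如下 Python 代码示意,其中 α、β 的默认取值为本文假设的超参数,本申请并未限定具体数值:

```python
import math

def total_loss(ys, ts, features, alpha=0.1, beta=5.0):
    """总损失 L = L_c + alpha*L_f + beta*L_in 的极简示意实现。
    ys:N 个判别结果;ts:N 个 one-hot 真实结果;features:N 个样本特征。"""
    n, m = len(ys), len(ys[0])
    eps = 1e-12
    # L_c:交叉熵平均值
    lc = sum(-sum(t * math.log(y + eps) for y, t in zip(yi, ti))
             for yi, ti in zip(ys, ts)) / n
    # L_f:特征 1 范数平均值的相反数
    lf = -sum(sum(abs(v) for v in f) for f in features) / n
    # L_in:各分类概率平均值的负信息熵,最小化它会使分类分布趋于均匀
    v = [sum(y[j] for y in ys) / n for j in range(m)]
    lin = sum(vj * math.log(vj + eps) for vj in v)
    return lc + alpha * lf + beta * lin
```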
应理解,本申请实施例中预设判别器可以是神经网络、深度神经网络、卷积神经网络、循环神经网络等,本申请实施例不做限定。例如,在场景A中,图像识别网络可以是卷积神经网络;又例如,在场景B中,人脸属性识别网络可以是卷积神经网络;又例如,在场景C中,文本识别网络可以是循环神经网络。
实施例二:
本申请实施例样本生成方法可以由执行设备110执行。其中,执行设备110配置有目标样本生成器。可选的,该方法还可以由CPU处理,也可以由CPU和GPU共同处理,也可以不用GPU,而使用其他适合用于神经网络计算的处理器,如图3所示的神经网络处理器30,本申请不做限制。
该方法包括:将第二矩阵输入到目标样本生成器中,得到第二生成样本。
其中,第二矩阵和第一矩阵具有相同的格式和类型,即矩阵的阶数相同,矩阵的类型可以包括随机矩阵或随机向量等。目标样本生成器为通过上述实施例一训练得到的目标样本生成器,具体训练方法,可以参见上述实施例一中相关描述,此处不再赘述。
应理解,第一生成样本为初始样本生成器生成的生成样本,第一生成样本的属性与真实样本的属性之间具有较大的差异;而,第二生成样本是目标样本生成器生成的生成样本,目标样本生成器是训练后的模型,已经学习到了真实样本的属性,因此第二生成样本的属性接近真实样本,可以替代预设判别器的真实样本。
应理解,可以应用第二生成样本代替真实样本实现神经网络模型的训练、压缩等,例如,压缩设备可以获取第二生成样本,第二生成样本通过上述实施例二所述的样本生成方法生成,其中,此时预设判别器即为待压缩神经网络;压缩设备将第二生成样本输入到待压缩神经网络,得到第二生成样本对应的分类;进而,根据第二生成样本和该第二生成样本对应的分类对待压缩神经网络进行压缩,具体可参见下述实施例三和实施例四中相关描述。
实施例三:
本申请实施例中,待压缩神经网络为一个黑盒子,只提供了输入和输出的接口,待压缩神经网络的结构和参数都是未知的,训练该待压缩神经网络的真实样本不可获得且本申请实施例中目标样本生成器生成的第二生成样本是无标签的。下面结合图5A所示的神经网络的压缩方法以及图5B所示的神经网络的压缩方法的原理示意图,介绍通过蒸馏算法实现利用未标注的第二生成样本对待压缩神经网络的压缩,其中,图5B是以样本生成器为场景A中目标图像生成器为例来说明,本申请实施例中神经网络的压缩方法可以由压缩设备170执行。可选的,该方法还可以由CPU处理,也可以由CPU和GPU共同处理,也可以不用GPU,而使用其他适合用于神经网络计算的处理器,如图3所示的神经网络处理器30,本申请不做限制。
该方法可以包括但不限于如下步骤:
S502:获取第二生成样本。
其中,本申请实施例中,待压缩神经网络即为上述实施例一和实施例二中预设判别器,通过上述实施例一所述的方法,训练得到适用于待压缩神经网络的目标样本生成器,使得目标样本生成器输出的第二生成样本与待压缩神经网络的训练样本具有相似的特性;进而,通过上述实施例二所述的方法,可以生成待压缩神经网络的第二生成样本。适用于待压缩神经网络的目标样本生成器的训练方法可以参见上述实施例一中相关描述、第二生成样本的生成方法可以参见上述实施例二中相关描述,此处不再赘述。
具体地,将N个第二矩阵{z'_1, z'_2, …, z'_i, …, z'_N}输入到适用于待压缩神经网络的目标样本生成器,得到N个第二生成样本{x'_1, x'_2, …, x'_i, …, x'_N}。
S504:将第二生成样本输入到待压缩神经网络,得到第二生成样本对应的真实结果;
将N个第二生成样本输入到待压缩神经网络中,得到N个第二生成样本分别对应的真实结果{y'_1, y'_2, …, y'_i, …, y'_N}。
其中,真实结果可以是M个分类分别对应的概率中最大概率对应的分类,即,第二生成样本识别为最大概率对应的分类的概率为1,其他分类的概率都为0。在本申请实施例中,也可以直接将待压缩网络对第二生成样本处理得到的M个分类分别对应的概率作为真实结果。
应理解,第二生成样本与待压缩神经网络的训练所采用的真实样本具有相似的特性,为可靠的样本,且待压缩神经网络为经过训练的神经网络,待压缩神经网络对输入的可靠样本(第二生成样本)可以得到可靠的输出,即待压缩神经网络对第二生成样本进行处理得到的输出即为第二生成样本对应的真实结果,可以作为第二生成样本的标签。
S506:通过第二训练数据训练初始神经网络,得到压缩后的神经网络;其中,第二训练数据包括第二生成样本和该第二生成样本对应的真实结果,该初始神经网络为深度神经网络,该初始神经网络的模型参数少于待压缩神经网络的模型参数。
应理解,本申请实施例中,采用蒸馏算法对待压缩神经网络进行压缩。此时,构建一个神经网络,即初始神经网络,该初始神经网络相对于待压缩神经网络具有更简单的结构、更少的模型参数。将该初始神经网络作为学生网络,待压缩神经网络作为教师网络,通过教师-学生的学习策略来实现将原来的复杂的待压缩神经网络压缩成一个低复杂度的学生网络,在不损失太多模型准确率的情况下,低复杂度的学生网络能够具备高运算效率以及较少的存储开销。
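上述教师-学生蒸馏流程可以用如下极简的 Python 代码示意:以一个固定函数代替教师网络(即待压缩神经网络),以线性 softmax 模型作为学生网络,以教师输出作为标签、交叉熵为损失做随机梯度下降。其中教师的形式、学习率、迭代次数等均为本文为说明而引入的假设,并非本申请限定的实现:

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def teacher(x):
    # 假想的“教师网络”(即待压缩神经网络):这里用固定规则代替,仅作示意
    return softmax([x[0], x[1]])

def student_predict(w, x):
    # 学生网络:线性变换 + softmax
    return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in w])

def distill(samples, lr=0.5, epochs=200):
    # 以教师输出为软标签,对学生网络参数 w 做随机梯度下降
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(epochs):
        for x in samples:
            t = teacher(x)                 # 教师输出作为该样本的标签
            s = student_predict(w, x)
            for j in range(2):             # 交叉熵对 logit 的梯度为 s_j - t_j
                for k in range(2):
                    w[j][k] -= lr * (s[j] - t[j]) * x[k]
    return w

random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(50)]
w = distill(samples)
```

训练后,学生网络对输入的分类倾向应与教师网络一致。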
关于如何构建初始神经网络、初始神经网络超参数的确定、基于第二训练数据训练得到压缩后的神经网络等为现有技术,对此不作限定。
例如,在S506的一种具体实现中,将N个第二生成样本输入到初始神经网络,得到N个第二生成样本分别对应的预测结果{y_1^s, y_2^s, …, y_i^s, …, y_N^s}。压缩设备可以根据每一个第二生成样本的预测结果和真实结果的差异确定该N个第二生成样本对应的损失,根据N个第二生成样本对应的损失,通过优化算法更新初始神经网络的参数。损失函数L1用于计算该N个第二生成样本对应的损失,损失函数L1可以是预测结果和真实结果之间的平均绝对误差、均方误差或均方根误差等,也可以是预测结果和真实结果的交叉熵,还可以具有其他的形式,本申请不作限定。例如,损失函数L1可以表示为:

L1 = (1/N)·Σ_{i=1}^{N} H_c(y_i^s, y'_i)

其中,H_c(y_i^s, y'_i)为预测结果y_i^s和真实结果y'_i的交叉熵,可以表示为:

H_c(y_i^s, y'_i) = −Σ_{j=1}^{M} y'_{i,j}·log(y^s_{i,j})

其中,第二生成样本x'_i的真实结果y'_i = {y'_{i,1}, y'_{i,2}, …, y'_{i,j}, …, y'_{i,M}},第二生成样本x'_i的预测结果y_i^s = {y^s_{i,1}, y^s_{i,2}, …, y^s_{i,j}, …, y^s_{i,M}}。其中,y'_{i,j}表示第二生成样本x'_i由待压缩神经网络预测为分类j的概率,y^s_{i,j}表示第二生成样本x'_i由初始神经网络预测为分类j的概率,j为分类的索引,M为大于1的正整数,j为不大于M的正整数。
L1还可以包括其他形式,本申请实施例不作限定。
可见,本申请实施例通过目标样本生成器生成与待压缩神经网络训练所使用的真实样本特性相似的第二生成样本,以该第二生成样本通过待压缩神经网络预测得到的结果作为标签。根据该第二生成样本以及其标签,通过训练一个复杂度低的神经网络,得到与待压缩神经网络功能一致的神经网络,该神经网络即压缩后的神经网络,实现在无训练样本的情况下待压缩神经网络的压缩。该压缩后的神经网络可以应用于终端等轻量级的设备,从而减少运算耗损,减少存储开销,并提高运算效率。
实施例四:
本申请实施例中,待压缩神经网络的具体结构已知,然而,训练该待压缩神经网络的真实样本不可获得而且本申请实施例中目标样本生成器生成的第二生成样本是无标签的。 下面结合图6所示的神经网络的压缩方法,通过剪枝算法丢掉待压缩神经网络中冗余的连接,得到一个简化的神经网络,利用待压缩神经网络对第二生成样本进行标注,以第二生成样本作为训练数据对简化的神经网络进行训练,得到压缩后的神经网络,进而降低待压缩神经网络的复杂度和提高运算效率、减少存储开销,该方法可以由压缩设备170执行。可选的,所述方法可以由CPU处理,也可以由CPU和GPU共同处理,也可以不用GPU,而使用其他适合用于神经网络计算的处理器,例如图3所示的神经网络处理器30,本申请不做限制。
该神经网络的压缩方法可以包括如下部分或全部步骤:
S602:获取第二生成样本。
具体可以参见上述实施例三中S502中相关描述,本申请实施例不再赘述。
S604:将第二生成样本输入到待压缩神经网络,得到第二生成样本对应的真实结果。
具体可以参见上述实施例三中S504中相关描述,本申请实施例不再赘述。
S606:根据待压缩神经网络中神经元的重要性,去除待压缩神经网络中重要性小于第一阈值的神经元,得到简化后的神经网络。
神经网络的参数包括各个卷积层中的权重参数。应理解,权重绝对值越大,则该权重参数对应的神经元对神经网络的输出的贡献越大,进而对神经网络来说越重要。
基于这一思想,在S606的一种具体实现中,压缩设备对待压缩神经网络中部分或全部卷积层进行剪枝,即去除各个卷积层中绝对值小于第一阈值的权重参数对应的神经元,得到简化后的神经网络。
在本申请实施例的另一种实现中,压缩设备可以对待压缩神经网络中神经元按照重要性进行排序,进而去除待压缩神经网络中重要性靠后的多个神经元,进而得到简化后的神经网络。
需要说明的是,本申请实施例中,重要性指神经元对输出结果的贡献大小,对输出结果贡献大的神经元具有更大的重要性。
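按权重绝对值大小去除连接的剪枝思想可以用如下 Python 代码示意。其中函数名与阈值为本文为说明而引入的假设,真实实现通常还需要结合第三训练数据对剪枝后的网络进行训练或微调:

```python
def prune_by_magnitude(weights, threshold):
    """按权重绝对值剪枝:绝对值小于阈值的权重置零(视为去除对应连接)。
    weights 为一个二维权重矩阵(按行列组织的列表)。"""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def sparsity(weights):
    """统计剪枝后矩阵中零元素的比例(稀疏度)。"""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total
```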
S608:通过第三训练数据对简化后的神经网络进行训练,得到压缩后的神经网络,该第三训练数据包括第二生成样本和该第二生成样本对应的真实结果。
应理解,相对于待压缩神经网络,简化后的神经网络具有更少的参数,更简洁的网络结构。
关于如何基于第三训练数据训练得到压缩后的神经网络为现有技术,对此不作限定。
例如,在S608的一种具体实现中,将N个第二生成样本输入到简化后的神经网络,得到N个第二生成样本分别对应的预测结果{y_1^h, y_2^h, …, y_i^h, …, y_N^h}。压缩设备可以根据每一个第二生成样本的预测结果和真实结果的差异确定该N个第二生成样本对应的损失,根据N个第二生成样本对应的损失,通过优化算法更新简化后的神经网络的参数。损失函数L2用于计算该N个第二生成样本对应的损失,损失函数L2可以是预测结果和真实结果之间的平均绝对误差、均方误差或均方根误差等,也可以是预测结果和真实结果的交叉熵,还可以具有其他的形式,本申请不作限定。例如,损失函数L2可以表示为:

L2 = (1/N)·Σ_{i=1}^{N} H_c(y_i^h, y'_i)

其中,H_c(y_i^h, y'_i)为预测结果y_i^h和真实结果y'_i的交叉熵,可以表示为:

H_c(y_i^h, y'_i) = −Σ_{j=1}^{M} y'_{i,j}·log(y^h_{i,j})

其中,第二生成样本x'_i的真实结果y'_i = {y'_{i,1}, y'_{i,2}, …, y'_{i,j}, …, y'_{i,M}},第二生成样本x'_i的预测结果y_i^h = {y^h_{i,1}, y^h_{i,2}, …, y^h_{i,j}, …, y^h_{i,M}}。其中,y'_{i,j}表示第二生成样本x'_i由待压缩神经网络预测为分类j的概率,y^h_{i,j}表示第二生成样本x'_i由简化后的神经网络预测为分类j的概率,j为分类的索引,M为大于1的正整数,j为不大于M的正整数。
L2还可以包括其他形式,本申请实施例不作限定。
可选地,该方法还可以包括:
S610:根据当前压缩后的神经网络模型的参数量判断是否继续进行压缩。
可选地,压缩设备也可以根据当前得到的压缩后的神经网络的参数量和模型准确率综合决定是否继续进行压缩,若是,则可以重复执行上述步骤S606和S608描述的方法,对待压缩神经网络进行进一步地压缩。否则执行步骤S612:输出压缩后的神经网络。
可见,本申请实施例通过目标样本生成器生成与待压缩神经网络训练所使用的真实样本特性相似的第二生成样本,将该第二生成样本通过待压缩神经网络预测得到的结果作为标签,通过剪枝算法去除待压缩神经网络中冗余的连接,得到一个简化的神经网络,以第二生成样本作为简化的神经网络的输入,待压缩神经网络对输入的第二生成样本处理得到的真实结果作为标签,利用第二生成样本及其标签对简化的神经网络进行训练,得到压缩后的神经网络,实现在无训练样本的情况下待压缩神经网络的压缩。该压缩后的神经网络可以应用于终端等轻量级的设备,进而降低待压缩神经网络的复杂度和提高运算效率、减少存储开销。
可以理解,实施例一为该目标样本生成器的训练阶段(如图1所示的训练设备120执行的阶段),具体训练采用由实施例一以及实施例一基础上任意一种可能的实现方式中提供的样本生成器的训练方法进行;而实施例二则可以理解为是该目标样本生成器的应用阶段(如图1所示的执行设备110执行的阶段),具体可以体现为采用由实施例一训练得到的目标样本生成器,并根据输入的第二矩阵,从而得到输出的第二生成样本,即场景A中样本图像、场景B中样本人脸图像、场景C中样本文本;而实施例三和实施例四则可以理解为是第二生成样本的应用阶段(如图1所示的压缩设备170执行的阶段),压缩设备170可以根据第二生成样本,对预设判别器进行压缩,进而得到压缩后的模型,即压缩后的预设判别器。
实施例五:
在压缩设备170得到压缩后的神经网络后,可以将该压缩后的神经网络发送给客户设备140,由客户设备140将该压缩后的神经网络发送给用户设备180(终端)。可选地,压缩设备170也可以将压缩后的神经网络发送至用户设备180。用户设备180可以运行该压缩后的神经网络以实现该压缩后的神经网络的功能。下面结合图7描述本申请实施例提供的一种数据处理方法,该方法可以包括但不限于如下部分或全部步骤:
S702:接收输入的数据;
S704:将接收到的数据输入到压缩后的神经网络,通过压缩后的神经网络对输入数据进行处理,得到处理结果;
S706:输出该处理结果。
其中,输出的方式包括但不限于通过文本、图像、语音、视频等方式输出。
其中,压缩后的神经网络为通过上述实施例三或实施例四所述的神经网络压缩方法压缩得到的。该输入数据可以是图像、文本等,与待压缩神经网络的具体功能有关。关于压缩后的神经网络的压缩可以参见上述实施例三或实施例四中相关描述,本申请实施例不再赘述。
在本申请实施例的一种应用场景中,该数据处理方法具体为图像处理方法,包括:终端接收输入图像;将该输入图像输入到压缩后的神经网络,通过该压缩后的神经网络对输入图像进行处理,得到处理结果。其中,该处理结果的内容依赖于压缩后的神经网络的功能,而压缩后的神经网络的功能依赖于待压缩神经网络的功能,可以是对图像的分类结果、识别结果等。例如,待压缩神经网络为人脸属性识别网络,用于识别输入的人脸图像所描述的人的属性,比如性别、年龄、种族等,那么,压缩后的神经网络可以识别输入图像描述人的性别、年龄、种族等,该处理结果可以包括输入图像被识别到的性别、年龄和种族。
在本申请实施例的另一种应用场景中,该数据处理方法具体为文本处理方法,包括:终端接收输入文本;将该输入文本输入到压缩后的神经网络,通过该压缩后的神经网络对输入文本进行处理,得到处理结果。其中,该处理结果的内容依赖于压缩后的神经网络的功能,而压缩后的神经网络的功能依赖于待压缩神经网络的功能,可以是对文本的分类结果、识别结果等。例如,待压缩神经网络为文本识别网络,用于识别输入文本所描述的意图,那么,压缩后的神经网络可以识别输入文本的意图,进而执行该识别到的意图对应的操作,例如,在识别到意图为“接通电话”时,终端(如手机)可以接通当前的呼叫。
下面结合附图介绍本申请实施例涉及的装置。
图8为本发明实施例中一种样本生成器的训练装置的示意性框图。图8所示的样本生成器的训练装置800(该装置800具体可以是图1训练设备120),可以包括:
获取单元801,用于获取第一矩阵;
生成单元802,用于将第一矩阵输入初始样本生成器,得到第一生成样本,初始样本生成器为深度神经网络;
判别单元803,用于将第一生成样本输入预设判别器,得到判别结果,其中,预设判别器是经过第一训练数据训练得到的,第一训练数据包括真实样本和该真实样本对应的分类;
更新单元804,用于根据第一生成样本的判别结果更新样本生成器的参数,得到更新后的样本生成器。
可选地,判别结果可以包括第一生成样本被预测为M个分类中每一个分类的概率,M为大于1的整数。可选地,所述第一生成样本的真实结果可以是M个分类分别对应的概率中最大概率对应的分类;
在一种可选的实现方式中,装置800还可以包括:
特征提取单元805,用于通过预设判别器提取所述第一生成样本的特征;
概率平均单元806,用于根据与N个第一矩阵一一对应的N个判别结果,得到M个分类中每一个分类在该N个判别结果中的概率平均值;
此时,更新单元804具体用于:根据判别结果与所述真实结果的差异、特征的特征值以及概率平均值共同更新所述初始样本生成器。
可选地,更新单元804还用于实现如上述实施例一中步骤S410的第一至三种实现中任意一种实现,具体可参见上述实施例一中相关描述,本申请实施例不再赘述。
本申请实施例中各个单元的具体实现可以参见上述实施例一中相关描述,此处不再赘述。
图9为本发明实施例中一种样本生成装置的示意性框图,图9所示的样本生成装置900(该装置900具体可以是图1执行设备110),可以包括:
获取单元901,用于获取目标样本生成器;
生成单元902,用于将第二矩阵输入到目标样本生成器中,得到第二生成样本。
第二矩阵和第一矩阵具有相同的格式和类型,即矩阵的阶数相同,矩阵的类型可以包括随机矩阵或随机向量等。目标样本生成器为通过实施例一描述的样本生成器的训练方法训练得到的目标样本生成器,具体训练方法,可以参见上述实施例一中相关描述,此处不再赘述。
样本生成装置900可以接收装置800发送的目标样本生成器,也可以执行实施例一描述的样本生成器的训练方法训练得到目标样本生成器,本申请实施例不作限定。
本申请实施例中各个单元的具体实现可以参见上述实施例二中相关描述,此处不再赘述。
图10为本发明实施例中一种神经网络的压缩装置的示意性框图,图10所示的神经网络的压缩装置1000(该装置1000具体可以是图1压缩设备170),可以包括:
获取单元1001,用于获取第二生成样本,具体可以用于接收样本生成装置900发送的第二生成样本。
其中,第二生成样本可以是执行设备110或样本生成装置900将第二矩阵输入所述目标样本生成器得到的。其中,第二生成样本是以待压缩神经网络作为预设判别器生成的,目标样本生成器是通过上述样本生成器的训练方法训练得到的,具体实现可以参见上述实施例一中相关描述,此处不再赘述。
压缩单元1002,用于以所述第二生成样本代替所述真实样本,以所述第二生成样本输入到所述待压缩神经网络后的输出作为所述第二生成样本对应的分类,对所述待压缩神经网络进行压缩。
上述各个功能单元的具体实现可以参见上述实施例三和/或实施例四中相关描述,本申请实施例不再赘述。
图11为本发明实施例中一种数据处理装置1100(终端)的示意性框图,图11所示的数据处理装置1100(该装置1100具体可以是图1用户设备180),可以包括:
接收单元1101,用于接收输入数据;
处理单元1102,用于将所述输入数据输入到压缩后的神经网络,通过所述压缩后的神经网络对所述输入数据进行处理,得到处理结果,其中,所述压缩后的神经网络是通过如权利要求15所述的神经网络压缩方法得到的;
输出单元1103,用于输出处理结果。
上述各个功能单元的具体实现可以参见上述实施例五中相关描述,本申请实施例不再赘述。
图12是本申请实施例提供的一种样本生成器的训练装置的硬件结构示意图。图12所示的样本生成器的训练装置1200(该装置1200具体可以是一种计算机设备)包括存储器1201、处理器1202、通信接口1203以及总线1204。其中,存储器1201、处理器1202、通信接口1203通过总线1204实现彼此之间的通信连接。
存储器1201可以是只读存储器(Read Only Memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(Random Access Memory,RAM)。存储器1201可以存储程序,当存储器1201中存储的程序被处理器1202执行时,处理器1202和通信接口1203用于执行本申请实施例一的样本生成器的训练方法的各个步骤。
处理器1202可以采用通用的中央处理器(Central Processing Unit,CPU),微处理器,应用专用集成电路(Application Specific Integrated Circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的样本生成器的训练装置中的单元所需执行的功能,或者执行本申请方法实施例的样本生成器的训练方法。
处理器1202还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的样本生成器的训练方法的各个步骤可以通过处理器1202中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1202还可以是通用处理器、数字信号处理器(Digital Signal Processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1201,处理器1202读取存储器1201中的信息,结合其硬件完成本申请实施例的样本生成器的训练装置中包括的单元所需执行的功能,或者执行本申请方法实施例的样本生成器的训练方法。
通信接口1203使用例如但不限于收发器一类的收发装置,来实现装置1200与其他设备或通信网络之间的通信。例如,可以通过通信接口1203获取训练数据(如本申请实施例一所述的第一矩阵)和预设判别器。
总线1204可包括在装置1200各个部件(例如,存储器1201、处理器1202、通信接口1203)之间传送信息的通路。
应理解,样本生成器的训练装置800中的获取单元801可以相当于样本生成器的训练装置1200中的通信接口1203,生成单元802、判别单元803和更新单元804可以相当于处理器1202。
上述各个功能器件的具体实现可以参见上述实施例一中相关描述,本申请实施例不再赘述。
图13为本发明实施例中另一种样本生成装置的示意性框图;图13所示的样本生成装置1300(该装置1300具体可以是一种计算机设备)包括存储器1301、处理器1302、通信接口1303以及总线1304。其中,存储器1301、处理器1302、通信接口1303通过总线1304实现彼此之间的通信连接。
存储器1301可以是只读存储器(Read Only Memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(Random Access Memory,RAM)。存储器1301可以存储程序,当存储器1301中存储的程序被处理器1302执行时,处理器1302和通信接口1303用于执行本申请实施例二的样本生成方法的各个步骤。
处理器1302可以采用通用的中央处理器(Central Processing Unit,CPU),微处理器,应用专用集成电路(Application Specific Integrated Circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的样本生成装置1300中的单元所需执行的功能,或者执行本申请方法实施例二所述的样本生成方法。
处理器1302还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的样本生成方法的各个步骤可以通过处理器1302中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1302还可以是通用处理器、数字信号处理器(Digital Signal Processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1301,处理器1302读取存储器1301中的信息,结合其硬件完成本申请实施例的样本生成装置中包括的单元所需执行的功能,或者执行本申请方法实施例的样本生成方法。
通信接口1303使用例如但不限于收发器一类的收发装置,来实现装置1300与其他设备或通信网络之间的通信。例如,可以通过通信接口1303获取数据(如本申请实施例一所述的第二矩阵)、预设判别器或待压缩神经网络。
总线1304可包括在装置1300各个部件(例如,存储器1301、处理器1302、通信接口1303)之间传送信息的通路。
应理解,样本生成装置900中的获取单元901相当于样本生成装置1300中的通信接口1303,生成单元902可以相当于处理器1302。
上述各个功能单元的具体实现可以参见上述实施例二中相关描述,本申请实施例不再赘述。
图14为本发明实施例中另一种神经网络的压缩装置的硬件结构示意图。图14所示的神经网络的压缩装置1400(该装置1400具体可以是一种计算机设备)包括存储器1401、处理器1402、通信接口1403以及总线1404。其中,存储器1401、处理器1402、通信接口1403通过总线1404实现彼此之间的通信连接。
存储器1401可以是只读存储器(Read Only Memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(Random Access Memory,RAM)。存储器1401可以存储程序,当存储器1401中存储的程序被处理器1402执行时,处理器1402和通信接口1403用于执行本申请实施例三、实施例四中的神经网络的压缩方法的各个步骤。
处理器1402可以采用通用的中央处理器(Central Processing Unit,CPU),微处理器,应用专用集成电路(Application Specific Integrated Circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的神经网络的压缩装置中的单元所需执行的功能,或者执行本申请方法实施例的神经网络的压缩方法。
处理器1402还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的神经网络的压缩方法的各个步骤可以通过处理器1402中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1402还可以是通用处理器、数字信号处理器(Digital Signal Processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1401,处理器1402读取存储器1401中的信息,结合其硬件完成本申请实施例的神经网络的压缩装置1000中包括的单元所需执行的功能,或者执行本申请方法实施例的神经网络的压缩方法。
通信接口1403使用例如但不限于收发器一类的收发装置,来实现装置1400与其他设备或通信网络之间的通信。例如,可以通过通信接口1403获取数据(如本申请实施例三或实施例四所述的第二生成样本)和待压缩神经网络。
总线1404可包括在装置1400各个部件(例如,存储器1401、处理器1402、通信接口1403)之间传送信息的通路。
应理解,神经网络的压缩装置1000中的获取单元1001相当于神经网络的压缩装置1400中的通信接口1403,压缩单元1002可以相当于处理器1402。
上述各个功能单元的具体实现可以参见上述实施例三和/或实施例四中相关描述,本申请实施例不再赘述。
图15为本发明实施例中另一种数据处理装置的示意性框图;图15所示的数据处理装置1500(该装置1500具体可以是一种终端)包括存储器1501、基带芯片1502、射频模块1503、外围系统1504和传感器1505。基带芯片1502包括至少一个处理器15021,例如CPU,时钟模块15022和电源管理模块15023;外围系统1504包括摄像头15041、音频模块15042、触摸显示屏15043等,进一步地,传感器1505可以包括光线传感器15051、加速度传感器15052、指纹传感器15053等;外围系统1504和传感器1505包括的模块可以视实际需要来增加或者减少。上述任意两个相连接的模块可以具体通过总线相连,该总线可以是工业标准体系结构(英文:industry standard architecture,简称:ISA)总线、外部设备互连(英文:peripheral component interconnect,简称:PCI)总线或扩展标准体系结构(英文:extended industry standard architecture,简称:EISA)总线等。
射频模块1503可以包括天线和收发器(包括调制解调器),该收发器用于将天线接收到的电磁波转换为电流并且最终转换为数字信号,相应地,该收发器还用于将该手机将要输出的数字信号转换为电流然后转换为电磁波,最后通过该天线将该电磁波发射到自由空间中。射频模块1503还可包括至少一个用于放大信号的放大器。通常情况下,可以通过该射频模块1503进行无线传输,如蓝牙(英文:Bluetooth)传输、无线保真(英文:Wireless-Fidelity,简称:WI-FI)传输、第三代移动通信技术(英文:3rd-Generation,简称:3G)传输、第四代移动通信技术(英文:the 4th Generation mobile communication,简称:4G)传输等。
触摸显示屏15043可用于显示由用户输入的信息或向用户展示信息,触摸显示屏15043可包括触控面板和显示面板,可选的,可以采用液晶显示器(英文:Liquid Crystal Display,简称:LCD)、有机发光二极管(英文:Organic Light-Emitting Diode,简称:OLED)等形式来配置显示面板。进一步的,触控面板可覆盖显示面板,当触控面板检测到在其上或附近的触摸操作后,传送给处理器15021以确定触摸事件的类型,随后处理器15021根据触摸事件的类型在显示面板上提供相应的视觉输出。触控面板与显示面板是作为两个独立的部件来实现终端1500的输入和输出功能,但是在某些实施例中,可以将触控面板与显示面板集成而实现终端1500的输入和输出功能。
摄像头15041用于获取图像,以输入到压缩后的神经网络。应理解,此情况下,压缩后的神经网络是用于实现对图像进行处理的深度神经网络。如,对场景A中图像识别网络压缩后的神经网络。
音频输入模块15042具体可以为麦克风,可以获取语音。本申请实施例中,终端1500可以将语音转换为文本,进而将该文本输入到压缩后的神经网络。应理解,此情况下,压缩后的神经网络是用于实现对文本进行处理的深度神经网络。如,对场景C中文本识别网络压缩后的神经网络。
传感器1505可以包括光线传感器15051、加速度传感器15052、指纹传感器15053,其中,光线传感器15051用于获取环境的光强,加速度传感器15052(比如陀螺仪等)可以获取终端1500的运动状态,指纹传感器15053可以获取输入的指纹信息;传感器1505感应到相关信号后将该信号量化为数字信号并传递给处理器15021做进一步处理。
存储器1501可以是高速RAM存储器,也可以是非易失性存储器(non-volatile memory),例如至少一个磁盘存储器。存储器1501可选的还可以包括至少一个位于远离前述处理器15021的存储装置,该存储器1501可以具体包括存储指令区和存储数据区,其中,存储指令区可存储操作系统、用户接口程序、通信接口程序等程序,该存储数据区可存储该处理器在执行相关操作所需要的数据,或者执行相关操作所产生的数据。
处理器15021是终端1500的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行存储在存储器1501内的程序,以及调用存储在存储器1501内的数据,执行终端1500的各项功能。可选的,处理器15021可包括一个或多个应用处理器,该应用处理器主要处理操作系统、用户界面和应用程序等。在本申请实施例中,处理器15021读取存储器1501中的信息,结合其硬件完成本申请实施例的数据处理装置1100中包括的单元所需执行的功能,或者执行本申请方法实施例的数据处理方法。
通过射频模块1503可以实现该终端1500的通信功能,具体地,终端1500可以接收客户设备140或者压缩设备170发送的压缩后的神经网络或其他数据。
需要说明的是,上述各个操作的具体实现及有益效果还可以对应参照上述图2至图14中提供的实施例及其可能的实施例的相应描述。
上述各个功能单元的具体实现可以参见上述实施例五中相关描述,本申请实施例不再赘述。
应注意,尽管图12、图13、图14和图15所示的装置1200、1300、1400和1500仅仅示出了存储器、处理器、通信接口,但是在具体实现过程中,本领域的技术人员应当理解,装置1200、1300、1400和1500还包括实现正常运行所必须的其他器件。同时,根据具体需要,本领域的技术人员应当理解,装置1200、1300、1400和1500还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,装置1200、1300、1400和1500也可仅仅包括实现本申请实施例所必须的器件,而不必包括图12、图13、图14和图15中所示的全部器件。
可以理解,所述装置1200相当于图1中的所述训练设备120,所述装置1300相当于图1中的所述执行设备110,装置1400相当于图1中的所述压缩设备170、装置1500相当于图1中的所述用户设备180。本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本领域技术人员能够领会,结合本文公开描述的各种说明性逻辑框、模块和算法步骤所描述的功能可以硬件、软件、固件或其任何组合来实施。如果以软件来实施,那么各种说明性逻辑框、模块、和步骤描述的功能可作为一或多个指令或代码在计算机可读媒体上存储或传输,且由基于硬件的处理单元执行。计算机可读媒体可包含计算机可读存储媒体,其对应于有形媒体,例如数据存储媒体,或包括任何促进将计算机程序从一处传送到另一处的媒体(例如,根据通信协议)的通信媒体。以此方式,计算机可读媒体大体上可对应于(1)非暂时性的有形计算机可读存储媒体,或(2)通信媒体,例如信号或载波。数据存储媒体可为可由一或多个计算机或一或多个处理器存取以检索用于实施本申请中描述的技术的指令、代码和/或数据结构的任何可用媒体。计算机程序产品可包含计算机可读媒体。
作为实例而非限制,此类计算机可读存储媒体可包括RAM、ROM、EEPROM、CD-ROM或其它光盘存储装置、磁盘存储装置或其它磁性存储装置、快闪存储器或可用来存储指令或数据结构的形式的所要程序代码并且可由计算机存取的任何其它媒体。并且,任何连接被恰当地称作计算机可读媒体。举例来说,如果使用同轴缆线、光纤缆线、双绞线、数字订户线(DSL)或例如红外线、无线电和微波等无线技术从网站、服务器或其它远程源传输指令,那么同轴缆线、光纤缆线、双绞线、DSL或例如红外线、无线电和微波等无线技术包含在媒体的定义中。但是,应理解,所述计算机可读存储媒体和数据存储媒体并不包括连接、载波、信号或其它暂时媒体,而是实际上针对于非暂时性有形存储媒体。如本文中所使用,磁盘和光盘包含压缩光盘(CD)、激光光盘、光学光盘、数字多功能光盘(DVD)和蓝光光盘,其中磁盘通常以磁性方式再现数据,而光盘利用激光以光学方式再现数据。以上各项的组合也应包含在计算机可读媒体的范围内。
可通过例如一或多个数字信号处理器(DSP)、通用微处理器、专用集成电路(ASIC)、现场可编程逻辑阵列(FPGA)或其它等效集成或离散逻辑电路等一或多个处理器来执行指令。因此,如本文中所使用的术语“处理器”可指前述结构或适合于实施本文中所描述的技术的任一其它结构中的任一者。另外,在一些方面中,本文中所描述的各种说明性逻辑框、模块、和步骤所描述的功能可以提供于经配置以用于编码和解码的专用硬件和/或软件模块内,或者并入在组合编解码器中。而且,所述技术可完全实施于一或多个电路或逻辑元件中。
本申请的技术可在各种各样的装置或设备中实施,包含无线手持机、集成电路(IC)或一组IC(例如,芯片组)。本申请中描述各种组件、模块或单元是为了强调用于执行所揭示的技术的装置的功能方面,但未必需要由不同硬件单元实现。实际上,如上文所描述,各种单元可结合合适的软件和/或固件组合在编码解码器硬件单元中,或者通过互操作硬件单元(包含如上文所描述的一或多个处理器)来提供。
以上所述,仅为本申请示例性的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应该以权利要求的保护范围为准。

Claims (28)

  1. 一种图像生成方法,其特征在于,包括:
    将第一矩阵输入初始图像生成器,得到生成图像,所述初始图像生成器为深度神经网络;
    将所述生成图像输入预设判别器,得到判别结果,其中,所述预设判别器是经过第一训练数据训练得到的,所述第一训练数据包括真实图像和所述真实图像对应的分类;
    根据所述判别结果更新所述初始图像生成器,得到目标图像生成器;
    将第二矩阵输入所述目标图像生成器,得到样本图像。
  2. 根据权利要求1所述的方法,其特征在于,所述判别结果包括所述生成图像被预测为M个分类中每一个分类的概率,M为大于1的整数。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述判别结果更新所述初始图像生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述生成图像的真实结果;
    根据所述判别结果与所述真实结果的差异更新所述初始图像生成器。
  4. 根据权利要求2所述的方法,其特征在于,在所述根据所述判别结果更新所述初始图像生成器之前,所述方法还包括:
    通过所述预设判别器提取所述生成图像的特征;
    所述根据所述判别结果更新所述初始图像生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述生成图像的真实结果;根据所述判别结果与所述真实结果的差异以及所述特征更新所述初始图像生成器。
  5. 根据权利要求2所述的方法,其特征在于,所述根据所述判别结果更新所述初始图像生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述生成图像的真实结果;
    根据与N个所述第一矩阵一一对应的N个判别结果,得到所述M个分类中每一个分类在所述N个判别结果中的概率平均值,N为正整数;
    根据所述判别结果与所述真实结果的差异以及所述概率平均值更新所述初始图像生成器。
  6. 根据权利要求2所述的方法,其特征在于,在所述根据所述判别结果更新所述初始图像生成器之前,所述方法还包括:
    通过所述预设判别器提取所述生成图像的特征;
    所述根据所述判别结果更新所述初始图像生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述生成图像的真实结果;根据与N个所述第一矩阵一一对应的N个判别结果,得到所述M个分类中每一个分类在所述N个判别结果中的概率平均值;以及,
    根据所述判别结果与所述真实结果的差异、所述特征的特征值以及所述概率平均值更新所述初始图像生成器。
  7. 一种神经网络的压缩方法,其特征在于,包括:
    获取样本图像,所述样本图像是通过如权利要求1-6任一项所述的图像生成方法生成的,其中,所述预设判别器为待压缩神经网络;
    将所述样本图像输入到所述待压缩神经网络,得到所述样本图像对应的分类;
    根据所述样本图像和所述样本图像对应的分类对所述待压缩神经网络进行压缩,得到压缩后的神经网络,其中,所述压缩后的神经网络的参数少于所述待压缩神经网络的参数。
  8. 一种图像处理方法,其特征在于,包括:
    接收输入图像;
    将所述输入图像输入到压缩后的神经网络,通过所述压缩后的神经网络对所述输入图像进行处理,得到处理结果,其中,所述压缩后的神经网络是通过如权利要求7所述的神经网络压缩方法得到的;
    输出处理结果。
  9. 一种样本生成方法,其特征在于,包括:
    将第一矩阵输入初始样本生成器,得到第一生成样本,所述初始样本生成器为深度神经网络;
    将所述第一生成样本输入预设判别器,得到判别结果,其中,所述预设判别器是经过第一训练数据训练得到的,所述第一训练数据包括真实样本和该真实样本对应的分类;
    根据所述第一生成样本的判别结果更新所述初始样本生成器的参数,得到目标样本生成器;
    将第二矩阵输入目标样本生成器,得到第二生成样本。
  10. 根据权利要求9所述的方法,其特征在于,所述判别结果包括所述第一生成样本被预测为M个分类中每一个分类的概率,M为大于1的整数。
  11. 根据权利要求10所述的方法,其特征在于,所述根据所述判别结果更新所述初始样本生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述第一生成样本的真实结果;
    根据所述判别结果与所述真实结果的差异更新所述初始样本生成器。
  12. 根据权利要求10所述的方法,其特征在于,在所述根据所述判别结果更新所述初始样本生成器之前,所述方法还包括:
    通过所述预设判别器提取所述第一生成样本的特征;
    所述根据所述判别结果更新所述初始样本生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述第一生成样本的真实结果;根据所述判别结果与所述真实结果的差异以及所述特征更新所述初始样本生成器。
  13. 根据权利要求10所述的方法,其特征在于,所述根据所述判别结果更新所述初始样本生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述第一生成样本的真实结果;
    根据与N个第一矩阵一一对应的N个判别结果,得到所述M个分类中每一个分类在所述N个判别结果中的概率平均值,N为正整数;
    根据所述判别结果与所述真实结果的差异以及所述概率平均值更新所述初始样本生成器。
  14. 根据权利要求10所述的方法,其特征在于,在所述根据所述判别结果更新所述初始样本生成器之前,所述方法还包括:
    通过所述预设判别器提取所述第一生成样本的特征;
    所述根据所述判别结果更新所述初始样本生成器,具体包括:
    确定所述M个分类中各个分类对应的概率中的最大概率,将所述最大概率对应的分类确定为所述第一生成样本的真实结果;根据与N个第一矩阵一一对应的N个判别结果,得到所述M个分类中每一个分类在所述N个判别结果中的概率平均值;以及,根据所述判别结果与所述真实结果的差异、所述特征以及所述概率平均值更新所述初始样本生成器。
  15. 一种神经网络的压缩方法,其特征在于,包括:
    获取第二生成样本,所述第二生成样本通过如权利要求9-14任一项所述的样本生成方法生成,其中,所述预设判别器为待压缩神经网络;
    将所述第二生成样本输入到所述待压缩神经网络,得到所述第二生成样本对应的分类;
    根据所述第二生成样本和所述第二生成样本对应的分类对所述待压缩神经网络进行压缩,得到压缩后的神经网络,其中,所述压缩后的神经网络的参数少于所述待压缩神经网络的参数。
  16. 一种数据处理方法,其特征在于,包括:
    接收输入数据;
    将所述输入数据输入到压缩后的神经网络,通过所述压缩后的神经网络对所述输入数据进行处理,得到处理结果,其中,所述压缩后的神经网络是通过如权利要求15所述的神经网络压缩方法得到的;
    输出处理结果。
  17. 一种图像生成装置,其特征在于,包括:存储器和处理器,所述存储器用于存储程序,所述处理器执行所述存储器存储的程序,当存储器存储的程序被执行时,所述处理器用于执行如权利要求1-6任一项所述的图像生成方法。
  18. 一种神经网络的压缩装置,其特征在于,包括:存储器和处理器,所述存储器用于存储程序,所述处理器执行所述存储器存储的程序,当存储器存储的程序被执行时,所述处理器用于执行如权利要求7所述的神经网络的压缩方法。
  19. 一种终端,其特征在于,包括存储器和处理器,所述存储器用于存储程序,所述处理器执行所述存储器存储的程序,当存储器存储的程序被执行时,所述处理器用于执行如权利要求8所述的图像处理方法。
  20. 一种样本生成装置,其特征在于,包括:存储器和处理器,所述存储器用于存储程序,所述处理器执行所述存储器存储的程序,当存储器存储的程序被执行时,所述处理器用于执行如权利要求9-14任一项所述的样本生成方法。
  21. 一种神经网络的压缩装置,其特征在于,包括:存储器和处理器,所述存储器用于存储程序,所述处理器执行所述存储器存储的程序,当存储器存储的程序被执行时,所述处理器用于执行如权利要求15所述的神经网络的压缩方法。
  22. 一种终端,其特征在于,包括存储器和处理器,所述存储器用于存储程序,所述处理器执行所述存储器存储的程序,当存储器存储的程序被执行时,所述处理器用于执行如权利要求16所述的数据处理方法。
  23. 一种计算机可读存储介质,其特征在于,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码用于执行如权利要求1-6任一项所述的图像生成方法。
  24. 一种计算机可读存储介质,其特征在于,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码用于执行如权利要求7所述的神经网络的压缩方法。
  25. 一种计算机可读存储介质,其特征在于,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码用于执行如权利要求8所述的图像处理方法。
  26. 一种计算机可读存储介质,其特征在于,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码用于执行如权利要求9-14任一项所述的样本生成方法。
  27. 一种计算机可读存储介质,其特征在于,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码用于执行如权利要求15所述的神经网络的压缩方法。
  28. 一种计算机可读存储介质,其特征在于,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码用于执行如权利要求16所述的数据处理方法。
PCT/CN2020/082599 2019-03-31 2020-03-31 图像生成方法、神经网络的压缩方法及相关装置、设备 WO2020200213A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20783785.7A EP3940591A4 (en) 2019-03-31 2020-03-31 IMAGE GENERATING METHODS, NEURAL NETWORK COMPRESSION METHODS AND RELATED APPARATUS AND METHODS
US17/488,735 US20220019855A1 (en) 2019-03-31 2021-09-29 Image generation method, neural network compression method, and related apparatus and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910254752.3 2019-03-31
CN201910254752.3A CN110084281B (zh) 2019-03-31 2019-03-31 图像生成方法、神经网络的压缩方法及相关装置、设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/488,735 Continuation US20220019855A1 (en) 2019-03-31 2021-09-29 Image generation method, neural network compression method, and related apparatus and device

Publications (1)

Publication Number Publication Date
WO2020200213A1 true WO2020200213A1 (zh) 2020-10-08

Family

ID=67414008

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/082599 WO2020200213A1 (zh) 2019-03-31 2020-03-31 Image generation method, neural network compression method, and related apparatus and device

Country Status (4)

Country Link
US (1) US20220019855A1 (zh)
EP (1) EP3940591A4 (zh)
CN (2) CN117456297A (zh)
WO (1) WO2020200213A1 (zh)


Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117456297A (zh) 2019-03-31 2024-01-26 华为技术有限公司 Image generation method, neural network compression method, and related apparatus and device
US11455531B2 (en) * 2019-10-15 2022-09-27 Siemens Aktiengesellschaft Trustworthy predictions using deep neural networks based on adversarial calibration
CN110910310B (zh) * 2019-10-25 2021-04-30 南京大学 Identity-information-based face image reconstruction method
CN110796619B (zh) * 2019-10-28 2022-08-30 腾讯科技(深圳)有限公司 Image processing model training method and apparatus, electronic device, and storage medium
CN111797970B (zh) * 2019-12-24 2024-06-28 华为技术有限公司 Method and apparatus for training a neural network
CN111260655B (zh) * 2019-12-31 2023-05-12 深圳云天励飞技术有限公司 Image generation method and apparatus based on a deep neural network model
CN111414993B (zh) * 2020-03-03 2024-03-01 三星(中国)半导体有限公司 Convolutional neural network pruning and convolution computation method and apparatus
CN113537483A (zh) * 2020-04-14 2021-10-22 杭州海康威视数字技术股份有限公司 Domain adaptation method and apparatus, and electronic device
CN113570508A (zh) * 2020-04-29 2021-10-29 上海耕岩智能科技有限公司 Image inpainting method and apparatus, storage medium, and terminal
CN111753091A (zh) * 2020-06-30 2020-10-09 北京小米松果电子有限公司 Classification method, classification model training method, apparatus, device, and storage medium
KR20220008035A (ko) 2020-07-13 2022-01-20 삼성전자주식회사 Method and apparatus for detecting forged fingerprints
CN112016591A (zh) * 2020-08-04 2020-12-01 杰创智能科技股份有限公司 Image recognition model training method and image recognition method
CN112052948B (zh) * 2020-08-19 2023-11-14 腾讯科技(深圳)有限公司 Network model compression method and apparatus, storage medium, and electronic device
CN112381209B (zh) * 2020-11-13 2023-12-22 平安科技(深圳)有限公司 Model compression method, ***, terminal, and storage medium
CN112906870B (zh) * 2021-03-17 2022-10-18 清华大学 Few-shot-based cloud service method and apparatus for network model compression
CN113242440A (zh) * 2021-04-30 2021-08-10 广州虎牙科技有限公司 Live streaming method, client, ***, computer device, and storage medium
KR102540763B1 (ko) * 2021-06-03 2023-06-07 주식회사 딥브레인에이아이 Machine-learning-based training method for lip-sync video generation, and lip-sync video generation apparatus for performing the same
CN113313697B (zh) * 2021-06-08 2023-04-07 青岛商汤科技有限公司 Image segmentation and classification method, model training method therefor, and related apparatus and medium
CN113780534B (zh) * 2021-09-24 2023-08-22 北京字跳网络技术有限公司 Network model compression method, image generation method, apparatus, device, and medium
CN113961674B (zh) * 2021-12-21 2022-03-22 深圳市迪博企业风险管理技术有限公司 Method and apparatus for semantic matching between key information and listed-company announcement text
CN114282684A (zh) * 2021-12-24 2022-04-05 支付宝(杭州)信息技术有限公司 Method and apparatus for training a user-related classification model and classifying users
CN114819141A (zh) * 2022-04-07 2022-07-29 西安电子科技大学 Intelligent pruning method and *** for deep network compression
CN115240006B (zh) * 2022-07-29 2023-09-19 南京航空航天大学 Convolutional neural network optimization method and apparatus for object detection, and network structure
CN116994309B (zh) * 2023-05-06 2024-04-09 浙江大学 Fairness-aware face recognition model pruning method
CN116543135B (zh) * 2023-05-17 2023-09-29 北京博瑞翔伦科技发展有限公司 Complex-scene-based image data collection method and ***

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284786A (zh) * 2018-10-10 2019-01-29 西安电子科技大学 SAR image terrain classification method based on generative adversarial networks with distribution and structure matching
CN109344921A (zh) * 2019-01-03 2019-02-15 湖南极点智能科技有限公司 Image recognition method, apparatus, and device based on a deep neural network model
CN109543753A (zh) * 2018-11-23 2019-03-29 中山大学 License plate recognition method based on an adaptive blur restoration mechanism
CN110084281A (zh) * 2019-03-31 2019-08-02 华为技术有限公司 Image generation method, neural network compression method, and related apparatus and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11003995B2 (en) * 2017-05-19 2021-05-11 Huawei Technologies Co., Ltd. Semi-supervised regression with generative adversarial networks
CN107563428B (zh) * 2017-08-25 2019-07-02 西安电子科技大学 Polarimetric SAR image classification method based on generative adversarial networks
CN109508669B (zh) * 2018-11-09 2021-07-23 厦门大学 Facial expression recognition method based on generative adversarial networks
WO2020175446A1 (ja) * 2019-02-28 2020-09-03 富士フイルム株式会社 Learning method, learning system, trained model, program, and super-resolution image generation device


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114387357A (zh) * 2020-10-16 2022-04-22 北京迈格威科技有限公司 Image processing method and apparatus, electronic device, and storage medium
CN112949706A (zh) * 2021-02-25 2021-06-11 平安科技(深圳)有限公司 OCR training data generation method and apparatus, computer device, and storage medium
CN112949706B (zh) * 2021-02-25 2024-01-05 平安科技(深圳)有限公司 OCR training data generation method and apparatus, computer device, and storage medium
CN113657350A (zh) * 2021-05-12 2021-11-16 支付宝(杭州)信息技术有限公司 Face image processing method and apparatus
CN114943101A (zh) * 2022-05-18 2022-08-26 广州大学 Privacy-preserving generative model construction method
CN114943101B (zh) * 2022-05-18 2024-05-17 广州大学 Privacy-preserving generative model construction method
CN117746214A (zh) * 2024-02-07 2024-03-22 青岛海尔科技有限公司 Text adjustment method, apparatus, and storage medium for large-model-based image generation
CN117746214B (zh) * 2024-02-07 2024-05-24 青岛海尔科技有限公司 Text adjustment method, apparatus, and storage medium for large-model-based image generation

Also Published As

Publication number Publication date
EP3940591A1 (en) 2022-01-19
EP3940591A4 (en) 2022-05-18
CN117456297A (zh) 2024-01-26
CN110084281B (zh) 2023-09-12
US20220019855A1 (en) 2022-01-20
CN110084281A (zh) 2019-08-02

Similar Documents

Publication Publication Date Title
WO2020200213A1 (zh) Image generation method, neural network compression method, and related apparatus and device
WO2021042828A1 (zh) Neural network model compression method and apparatus, storage medium, and chip
WO2020238293A1 (zh) Image classification method, and neural network training method and apparatus
WO2021120719A1 (zh) Neural network model update method, and image processing method and apparatus
WO2022083536A1 (zh) Neural network construction method and apparatus
WO2020221200A1 (zh) Neural network construction method, and image processing method and apparatus
WO2021043112A1 (zh) Image classification method and apparatus
WO2020228376A1 (zh) Text processing method, and model training method and apparatus
WO2020216227A9 (zh) Image classification method, and data processing method and apparatus
WO2021022521A1 (zh) Data processing method, and method and device for training a neural network model
WO2021155792A1 (zh) Processing apparatus, method, and storage medium
WO2021057056A1 (zh) Neural architecture search method, image processing method and apparatus, and storage medium
WO2022042713A1 (zh) Deep learning training method and apparatus for a computing device
WO2022001805A1 (zh) Neural network distillation method and apparatus
WO2021218517A1 (zh) Method for obtaining a neural network model, and image processing method and apparatus
WO2021147325A1 (zh) Object detection method and apparatus, and storage medium
WO2021164750A1 (zh) Convolutional layer quantization method and apparatus
WO2021008206A1 (zh) Neural network structure search method, and image processing method and apparatus
CN113326930B (zh) Data processing method, neural network training method, and related apparatus and device
CN113705769A (zh) Neural network training method and apparatus
WO2021018245A1 (zh) Image classification method and apparatus
CN110222718B (zh) Image processing method and apparatus
WO2021051987A1 (zh) Neural network model training method and apparatus
WO2024041479A1 (zh) Data processing method and apparatus
WO2023231794A1 (zh) Neural network parameter quantization method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20783785

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020783785

Country of ref document: EP

Effective date: 20211012