CN109903350B - Image compression method and related device - Google Patents


Info

Publication number
CN109903350B
Authority
CN
China
Prior art keywords
image
neural network
compressed
target
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711289667.8A
Other languages
Chinese (zh)
Other versions
CN109903350A (en)
Inventor
不公告发明人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to CN201711289667.8A priority Critical patent/CN109903350B/en
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to KR1020197037566A priority patent/KR102434728B1/en
Priority to KR1020197037574A priority patent/KR102434729B1/en
Priority to US16/482,710 priority patent/US11593658B2/en
Priority to EP18868807.1A priority patent/EP3627397B1/en
Priority to EP19215859.0A priority patent/EP3660628B1/en
Priority to EP19215858.2A priority patent/EP3667569A1/en
Priority to KR1020197023878A priority patent/KR102434726B1/en
Priority to PCT/CN2018/095548 priority patent/WO2019076095A1/en
Priority to EP19215860.8A priority patent/EP3660706B1/en
Publication of CN109903350A publication Critical patent/CN109903350A/en
Priority to US16/529,041 priority patent/US10540574B2/en
Application granted granted Critical
Publication of CN109903350B publication Critical patent/CN109903350B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

An embodiment of the application discloses an image compression method comprising the following steps: obtaining an original image with a first resolution; compressing the original image based on a target model to obtain a compressed image with a second resolution; identifying the compressed image based on a recognition neural network model to obtain reference label information; obtaining a loss function according to target label information and the reference label information; when the loss function converges to a first threshold or the current number of training iterations of the compressed neural network is greater than or equal to a second threshold, obtaining a target original image with the first resolution and using the target model as the compressed neural network model corresponding to completed training of the compressed neural network; and compressing the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution. The embodiments of the application can improve the effectiveness of image compression and the accuracy of recognition.

Description

Image compression method and related device
Technical Field
The present application relates to the field of image compression technologies, and in particular, to an image compression method and a related apparatus.
Background
With the advent of the big data age, data is growing explosively, and huge amounts of data carry information among people. Images, as the visual basis on which humans perceive the world, are an important means of acquiring, expressing and transmitting information.
In the prior art, image compression effectively reduces the data volume and improves the transmission rate of images. However, a compressed image can hardly retain all the information of the original image, so how to compress images remains a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiments of the application provide an image compression method and a related device, which can train a compressed neural network used for image compression and improve the effectiveness of image compression and the accuracy of recognition.
In a first aspect, an embodiment of the present application provides an image compression method, including:
acquiring an original image with a first resolution, wherein the original image is any one training image in a compressed training image set of a compressed neural network, and label information of the original image is used as target label information;
compressing the original image based on a target model to obtain a compressed image with a second resolution, wherein the second resolution is smaller than the first resolution, and the target model is a current neural network model of the compressed neural network;
identifying the compressed image based on an identified neural network model to obtain reference label information, wherein the identified neural network model is a corresponding neural network model when the training of the identified neural network is completed;
obtaining a loss function according to the target tag information and the reference tag information;
when the loss function converges to a first threshold value or the current training times of the compressed neural network are greater than or equal to a second threshold value, acquiring a target original image with the first resolution, and taking the target model as the compressed neural network model corresponding to completed training of the compressed neural network;
and compressing the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the method further includes:
and when the loss function is not converged to the first threshold value or the current training times of the compressed neural network are smaller than the second threshold value, updating the target model according to the loss function to obtain an updated model, taking the updated model as the target model, taking the next training image as the original image, and executing the step of obtaining the original image with the first resolution.
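The training procedure of the first aspect and its first possible implementation manner can be sketched as the loop below. This is a minimal illustration under stated assumptions, not the implementation of the application: `compress`, `recognize`, `loss_fn` and `update` are hypothetical callables standing in for the compressed neural network, the trained recognition neural network model, the loss function and the weight update, and "converges to the first threshold" is read here as the loss being less than or equal to that threshold.

```python
from itertools import cycle


def train_compression_model(model, training_set, compress, recognize, loss_fn,
                            update, first_threshold, second_threshold):
    """Return (model, iterations) once either stopping condition holds."""
    iterations = 0
    # Take the next training image as the original image on every pass.
    for original_image, target_label in cycle(training_set):
        compressed = compress(model, original_image)   # compress with the target model
        reference_label = recognize(compressed)        # recognize the compressed image
        loss = loss_fn(target_label, reference_label)  # compare with the target label
        iterations += 1
        # Assumption: convergence means loss <= first_threshold.
        if loss <= first_threshold or iterations >= second_threshold:
            return model, iterations                   # target model becomes the final model
        model = update(model, loss)                    # otherwise update the target model


# Toy stand-ins (hypothetical): the "model" is a single scale factor.
def compress(model, image):
    return [model * p for p in image]


def recognize(compressed):
    return sum(compressed)


def loss_fn(target, reference):
    return abs(target - reference)


def update(model, loss):
    return model + 0.1  # fixed step, only for illustration


training_set = [([1.0, 1.0], 2.0), ([2.0, 2.0], 4.0)]
final_model, n = train_compression_model(
    0.5, training_set, compress, recognize, loss_fn, update,
    first_threshold=1e-6, second_threshold=100)
```

In this toy run the loop stops on the loss condition; lowering `second_threshold` would make it stop on the iteration count instead.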
With reference to the first aspect or the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the identifying the compressed image based on the identifying neural network model to obtain reference tag information includes:
preprocessing the compressed image to obtain an image to be identified;
and identifying the image to be identified based on the identification neural network model to obtain the reference label information.
With reference to the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the preprocessing includes size processing, and the preprocessing the compressed image to obtain an image to be recognized includes:
and when the image size of the compressed image is smaller than the basic image size of the recognition neural network, filling pixel points in the compressed image according to the basic image size to obtain the image to be recognized.
With reference to the first aspect or the first possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the compressed training image set includes at least a recognition training image set, and the method further includes:
and training the recognition neural network by adopting the recognition training image set to obtain the recognition neural network model, wherein each training image in the recognition training image set at least comprises label information with the type consistent with that of the target label information.
With reference to the first aspect or the first possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, after the compressing the target original image based on the compressed neural network model to obtain the target compressed image of the second resolution, the method further includes:
and identifying the target compressed image based on the recognition neural network model to obtain the label information of the target original image, and storing the label information of the target original image.
With reference to the first aspect or the first possible implementation manner of the first aspect, in a sixth possible implementation manner of the first aspect, the compressed training image set includes multiple dimensions, and the compressing the original image based on the target model to obtain a compressed image at a second resolution includes:
identifying the original image based on the target model to obtain a plurality of image information, wherein each dimension corresponds to one image information;
and compressing the original image based on the target model and the image information to obtain the compressed image.
In a second aspect, an embodiment of the present application provides an image compression apparatus, including a processor, and a memory connected to the processor, wherein:
the memory is used for storing a first threshold, a second threshold, a current neural network model and training times of a compressed neural network, a compressed training image set of the compressed neural network and label information of each training image in the compressed training image set, an identified neural network model and a compressed neural network model, the current neural network model of the compressed neural network is used as a target model, the compressed neural network model is a corresponding target model when the compressed neural network training is completed, and the identified neural network model is a corresponding neural network model when the identified neural network training is completed;
the processor is configured to acquire an original image with a first resolution, where the original image is any training image in the compressed training image set, and tag information of the original image is used as target tag information; compressing the original image based on the target model to obtain a compressed image with a second resolution, wherein the second resolution is smaller than the first resolution; identifying the compressed image based on the identification neural network model to obtain reference label information; obtaining a loss function according to the target tag information and the reference tag information; when the loss function converges to the first threshold value or the training times is greater than or equal to the second threshold value, acquiring a target original image with the first resolution, and determining that the target model is the compressed neural network model; and compressing the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the processor is further configured to update the target model according to the loss function to obtain an updated model when the loss function does not converge to the first threshold, or the training frequency is smaller than the second threshold, take the updated model as the target model, take a next training image as the original image, and perform the step of obtaining the original image with the first resolution.
With reference to the second aspect or the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the processor is specifically configured to perform preprocessing on the compressed image to obtain an image to be identified; and identifying the image to be identified based on the identification neural network model to obtain the reference label information.
With reference to the second possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the preprocessing includes a size processing, and the memory is further configured to store a basic image size of the identified neural network; and the processor is specifically configured to, when the image size of the compressed image is smaller than the size of the basic image, fill pixel points in the compressed image according to the size of the basic image to obtain the image to be identified.
With reference to the second aspect or the first possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the compressed training atlas includes at least an identification training atlas, and the processor is further configured to train the identification neural network with the identification training atlas to obtain the identification neural network model, where each training image in the identification training atlas includes at least tag information of a type consistent with that of the target tag information.
With reference to the second aspect or the first possible implementation manner of the second aspect, in a fifth possible implementation manner of the second aspect, the processor is further configured to identify the target compressed image based on the identified neural network model, so as to obtain tag information of the target original image; the memory is also used for storing the label information of the target original image.
With reference to the second aspect or the first possible implementation manner of the second aspect, in a sixth possible implementation manner of the second aspect, the compressed training atlas includes multiple dimensions, and the processor is specifically configured to identify the original image based on the target model to obtain multiple pieces of image information, where each dimension corresponds to one piece of image information; and compressing the original image based on the target model and the image information to obtain the compressed image.
In a third aspect, an embodiment of the present application provides another electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for some or all of the steps described in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method of the first aspect.
With the image compression method and related device described above, a compressed image of the original image is obtained based on the target model, reference label information of the compressed image is obtained based on the recognition neural network model, and a loss function is obtained according to the target label information included in the original image and the reference label information. When the loss function converges to a first threshold value or the current number of training iterations of the compressed neural network is greater than or equal to a second threshold value, training of the compressed neural network used for image compression is complete, the target model is used as the compressed neural network model, and a target compressed image of the target original image can be obtained based on the compressed neural network model. In other words, a loss function is obtained from the reference label value produced by the trained recognition neural network model and the target label value included in the original image. Training is complete when the loss function meets a preset condition or the current number of training iterations of the compressed neural network reaches a preset threshold. Otherwise, the weights of the compressed neural network are repeatedly adjusted through training, that is, the image content represented by each pixel in the same image is adjusted, so that the loss of the compressed neural network is reduced. This improves the effectiveness of image compression and thereby helps improve the accuracy of recognition.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a schematic diagram illustrating an operation of a neural network according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of an image compression method provided in an embodiment of the present application;
FIG. 2A is a schematic view of a scene of a size processing method according to an embodiment of the present application;
FIG. 2B is a schematic flowchart of a single-layer neural network operation method according to an embodiment of the present disclosure;
FIG. 2C is a schematic structural diagram of an apparatus for performing inverse training of a compressed neural network according to an embodiment of the present disclosure;
FIG. 2D is a schematic structural diagram of an H-tree module according to an embodiment of the present disclosure;
FIG. 2E is a schematic structural diagram of a main operation module according to an embodiment of the present disclosure;
FIG. 2F is a schematic structural diagram of an operation module according to an embodiment of the present disclosure;
FIG. 2G is an example block diagram of a compressed neural network reverse training provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of an image compression method provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
The embodiment of the application provides an image compression method and a related device, a compression neural network for image compression can be trained, and the effectiveness of image compression and the accuracy of identification are improved. The present application is described in further detail below with reference to specific embodiments and with reference to the attached drawings.
The input neurons and output neurons mentioned in this application do not refer to the neurons in the input layer and the output layer of the whole neural network. Rather, for any two adjacent layers in the network, the neurons in the lower layer of the feedforward operation are the input neurons, and the neurons in the upper layer of the feedforward operation are the output neurons. Taking a convolutional neural network as an example, suppose the network has L layers and K = 1, 2, ..., L-1. For the K-th layer and the (K+1)-th layer, the K-th layer is referred to as the input layer, whose neurons are the input neurons, and the (K+1)-th layer is referred to as the output layer, whose neurons are the output neurons. That is, every layer except the topmost layer can serve as an input layer, and the next layer is the corresponding output layer.
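The layer naming convention above can be made concrete with a short sketch. The function name `adjacent_layer_pairs` is hypothetical and only enumerates, for an L-layer network, which layer plays the input role and which plays the output role for each K.

```python
def adjacent_layer_pairs(num_layers):
    """For an L-layer network, list (input_layer, output_layer) = (K, K + 1)
    for K = 1, 2, ..., L - 1, using 1-based layer numbers."""
    return [(k, k + 1) for k in range(1, num_layers)]


pairs = adjacent_layer_pairs(4)
# Every layer except the topmost one appears once as an input layer.
input_layers = [k for k, _ in pairs]
```

For a 4-layer network this gives the pairs (1, 2), (2, 3) and (3, 4), so layers 2 and 3 act as an output layer in one pair and as an input layer in the next.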
The operations mentioned above are operations in one layer of the neural network, and for the multi-layer neural network, the implementation process is shown in fig. 1, in which the arrows in the dotted line represent the inverse operations, and the arrows in the solid line represent the forward operations. In the forward operation, after the execution of the artificial neural network of the previous layer is completed, the output neuron obtained from the previous layer is used as the input neuron of the next layer to perform operation (or the output neuron is subjected to some operation and then used as the input neuron of the next layer), and meanwhile, the weight value is replaced by the weight value of the next layer. In the inverse operation, after the inverse operation of the artificial neural network of the previous layer is completed, the input neuron gradient obtained by the previous layer is used as the output neuron gradient of the next layer for operation (or the input neuron gradient is subjected to some operation and then used as the output neuron gradient of the next layer), and meanwhile, the weight value is replaced by the weight value of the next layer.
The forward propagation stage of the neural network corresponds to the forward operation and is the process from input data to output data. The backward propagation stage corresponds to the backward operation and is the process in which the error between the final result data and the expected output data is passed backward through the layers of the forward propagation stage. Through repeated forward and backward propagation, the weights of each layer are corrected by gradient descent on the error. This adjustment of the weights is the learning and training process of the neural network, and it reduces the error of the network output.
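The interplay of forward and backward operations described above can be illustrated with a deliberately tiny network. The sketch below assumes scalar layers without activation functions (each layer only multiplies by one weight) and a squared-error loss; these simplifications, the function names and the learning rate are illustrative assumptions, not details taken from the application.

```python
def forward(weights, x):
    """Forward operation: each layer's output is the next layer's input."""
    activations = [x]
    for w in weights:
        activations.append(w * activations[-1])
    return activations


def backward(weights, activations, target):
    """Backward operation: the input gradient of a layer becomes the
    output gradient of the layer below it."""
    grad_out = activations[-1] - target        # derivative of 0.5 * (y - t) ** 2
    weight_grads = [0.0] * len(weights)
    for k in reversed(range(len(weights))):
        weight_grads[k] = grad_out * activations[k]
        grad_out = grad_out * weights[k]       # gradient passed to the lower layer
    return weight_grads


def train(weights, x, target, lr=0.1, steps=50):
    """Repeat forward and backward propagation, correcting weights by gradient descent."""
    errors = []
    for _ in range(steps):
        acts = forward(weights, x)
        errors.append(abs(acts[-1] - target))
        grads = backward(weights, acts, target)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights, errors


weights, errors = train([0.5, 0.5], x=1.0, target=1.0)
```

Comparing `errors[0]` with `errors[-1]` shows the output error shrinking as the weights are adjusted.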
In the application, neither the types of compressed training image sets of the compressed neural network nor the number of training images in each type is limited. The more types and the more images there are, the more training iterations are performed and the lower the loss rate of image compression, which helps improve the accuracy of image recognition.
The compressed training atlas may include multiple dimensions such as images at multiple angles, images at multiple light intensities, or images captured by multiple different types of image capture devices. When the compressed neural network is trained according to the compressed training image sets corresponding to different dimensions, the effectiveness of image compression under different conditions is improved, and the application range of the image compression method is expanded.
The label information included in the training images of the compressed training image set is not limited by the application. It marks the image part that the training targets and can be used to detect whether training of the compressed neural network is complete. For example, in driving images captured by road video surveillance, the label information is target license plate information. A driving image is input into the compressed neural network to obtain a compressed image, and the compressed image is identified based on the recognition neural network model to obtain reference license plate information. If the reference license plate information matches the target license plate information, training of the compressed neural network can be determined to be complete; otherwise, when the current number of training iterations of the compressed neural network is smaller than a preset threshold, the compressed neural network needs to be trained further.
The type of the label information is not limited, and the label information can be license plate information, face information, traffic sign information, object classification information and the like.
The recognition neural network model is the data obtained when training of the recognition neural network used for image recognition is completed. The training method for the recognition neural network is not limited: batch gradient descent (BGD), stochastic gradient descent (SGD), mini-batch gradient descent (mini-batch SGD) and the like may be used, and one training period consists of a single forward operation and a backward gradient propagation.
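The three training algorithms named above differ mainly in how many samples contribute to each weight update. The following sketch assumes a one-parameter model y = w * x with a mean squared error loss, chosen only to keep the example short; the function name, learning rate and sample data are hypothetical.

```python
import random


def gradient_descent(samples, batch_size, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x by minimising the mean squared error.

    batch_size == len(samples) -> batch gradient descent (BGD)
    batch_size == 1            -> stochastic gradient descent (SGD)
    anything in between        -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    data = list(samples)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # One forward operation and one backward gradient per update.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w


samples = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_bgd = gradient_descent(samples, batch_size=len(samples))
w_sgd = gradient_descent(samples, batch_size=1)
w_mini = gradient_descent(samples, batch_size=2)
```

On this noise-free data all three variants approach the same weight; they differ in how many updates are made per pass over the data.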
Optionally, the recognition training image set is used to train the recognition neural network to obtain the recognition neural network model.
Each training image in the recognition training image set includes at least label information of the same type as the target label information of each training image in the compressed training image set. That is, the recognition neural network model can recognize a compressed image output by the compressed neural network (whether still to be trained or already trained).
For example, if the type of the tag information of the compressed training image is a license plate, the type of the tag information of the recognition training image at least comprises the license plate, so that the recognition neural network model is ensured to recognize the compressed image output by the compressed neural network, and the license plate information is obtained.
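The label type requirement in this example amounts to a subset check. The sketch below uses a hypothetical helper, `recognizer_covers`, and represents label types as plain strings, which is an assumption made only for illustration.

```python
def recognizer_covers(compression_label_types, recognition_label_types):
    """True if every label type used by the compressed training images is also
    present in the recognition training images."""
    return set(compression_label_types) <= set(recognition_label_types)


ok = recognizer_covers({"license_plate"}, {"license_plate", "face"})
missing = recognizer_covers({"license_plate"}, {"face"})
```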
Optionally, the compressed training image set includes at least the recognition training image set.
Because the images in a training image set are affected by factors such as angle, lighting, or image acquisition device, training the recognition neural network with the recognition training image set improves the accuracy of the recognition neural network model, which in turn improves the training efficiency of the compressed neural network and the effectiveness of image compression.
Referring to fig. 2, fig. 2 is a schematic flowchart of an image compression method according to an embodiment of the present disclosure. As shown in fig. 2, the method includes:
201: an original image of a first resolution is acquired.
The first resolution is the input resolution of the compressed neural network, and the second resolution, which is smaller than the first resolution, is the output resolution of the compressed neural network. That is, the compression ratio (the ratio of the second resolution to the first resolution) of an image input to the compressed neural network is fixed: compressing different images with the same compressed neural network model yields images with the same compression ratio.
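As a worked example of the fixed compression ratio, the sketch below assumes that a resolution is given as a (width, height) pair in pixels and that the ratio is taken over pixel counts. Both the representation and the sample resolutions are assumptions for illustration; the application does not prescribe them.

```python
from fractions import Fraction


def compression_ratio(first_resolution, second_resolution):
    """Ratio of the second (output) resolution to the first (input) resolution,
    computed over pixel counts. Resolutions are (width, height) tuples."""
    w1, h1 = first_resolution
    w2, h2 = second_resolution
    if w2 * h2 >= w1 * h1:
        raise ValueError("the second resolution must be smaller than the first")
    return Fraction(w2 * h2, w1 * h1)


# Hypothetical input and output resolutions of one compression network.
ratio = compression_ratio((1280, 720), (640, 360))
```

Because the input and output resolutions belong to the network rather than to a particular image, every image compressed by the same model gets this same ratio.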
The original image is any training image in a compressed training image set of the compressed neural network, and the label information of the original image is used as the target label information. The label information is not limited in the application, and the label information can be obtained by marking through human identification, or can be obtained by inputting an original image into an identification neural network and carrying out identification based on an identification neural network model.
202: and compressing the original image based on the target model to obtain a compressed image with a second resolution.
Wherein the target model is a current neural network model of the compressed neural network, that is, the target model is a current parameter of the compressed neural network. And compressing the original image with the resolution equal to the input resolution of the compressed neural network based on the target model to obtain a compressed image with the resolution equal to the output resolution of the compressed neural network.
Optionally, the compressing the original image based on the target model to obtain a compressed image with a second resolution includes: identifying the original image based on the target model to obtain a plurality of image information, wherein each dimension corresponds to one image information; and compressing the original image based on the target model and the image information to obtain the compressed image.
For example, the training image includes multiple dimensions, the original image is identified based on the target model, the image information corresponding to each dimension can be determined, and then the original image is compressed according to each image information, so that the accuracy of image compression under different dimensions is improved.
203: and identifying the compressed image based on the identifying neural network model to obtain reference label information.
The identification method is not limited in the present application, and may include two parts, i.e., feature extraction and feature identification, and the result obtained by the feature identification is used as reference tag information, for example: after the driving image is compressed, obtaining the reference label information corresponding to the driving compressed image as a license plate number; and obtaining reference label information corresponding to the face compressed image as a face identification result after the face image is compressed.
Optionally, the identifying the compressed image based on the identified neural network model to obtain the reference tag information includes: preprocessing the compressed image to obtain an image to be identified; and identifying the image to be identified based on the identification neural network model to obtain the reference label information.
Preprocessing includes, but is not limited to, any one or more of the following: data format conversion processing (such as normalization processing, integer data conversion and the like), data deduplication processing, data exception processing, data missing filling processing and the like. By preprocessing the compressed image, the recognition efficiency and accuracy of image recognition can be improved.
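Three of the preprocessing operations listed above can be sketched on flat lists of pixel values. The helper names, the use of `None` for a missing value, the fill value and the 8-bit value range are all assumptions made for this illustration.

```python
def fill_missing(pixels, fill_value=0):
    """Data missing filling: replace missing entries (None) with fill_value."""
    return [fill_value if p is None else p for p in pixels]


def normalize(pixels, max_value=255):
    """Data format conversion: scale integer pixel values into [0, 1]."""
    return [p / max_value for p in pixels]


def deduplicate(images):
    """Data deduplication: keep the first occurrence of each identical image."""
    seen, unique = set(), []
    for image in images:
        key = tuple(image)
        if key not in seen:
            seen.add(key)
            unique.append(image)
    return unique


images = [[0, None, 255], [0, None, 255], [51, 102, None]]
prepared = [normalize(fill_missing(img)) for img in deduplicate(images)]
```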
Likewise, the acquiring the original image of the first resolution includes: receiving an input image; and preprocessing the input image to obtain the original image. By preprocessing the input image, the compression efficiency of image compression can be improved.
The preprocessing described above also includes size processing, because a neural network has a fixed size requirement: it can only process images whose size equals the basic image size of that neural network. The basic image size of the compressed neural network is taken as the first basic image size, and the basic image size of the recognition neural network is taken as the second basic image size; that is, the input image size required by the compressed neural network equals the first basic image size, and the input image size required by the recognition neural network equals the second basic image size. The compressed neural network can compress an image to be compressed that meets the first basic image size to obtain a compressed image; the recognition neural network can recognize an image to be recognized that meets the second basic image size to obtain the reference label information.
The specific size processing method is not limited in the present application, and may include a method of clipping or filling pixel points, a method of scaling according to the size of the basic image, a method of down-sampling the input image, and the like.
Clipping peripheral pixel points means cropping away non-key information areas at the periphery of the image. Down-sampling is a process of reducing the sampling rate of a signal, for example, taking the average of 4 adjacent pixel points as the value of one pixel point at the corresponding position of the processed image, thereby reducing the image size.
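The down-sampling example can be sketched as follows. This is only an illustration under stated assumptions: the "4 adjacent pixel points" are taken to be a non-overlapping 2x2 block, the image is a single-channel list of rows, and the function name `downsample_2x2` is hypothetical rather than part of the application.

```python
def downsample_2x2(image):
    """Average each non-overlapping 2x2 block of a single-channel image.

    `image` is a list of rows of numbers. Assumption: height and width are
    even, so the image divides exactly into 2x2 blocks.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    result = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            # Sum the 4 adjacent pixel points of the current block.
            total = (image[r][c] + image[r][c + 1]
                     + image[r + 1][c] + image[r + 1][c + 1])
            row.append(total / 4)
        result.append(row)
    return result


image = [
    [0, 4, 8, 8],
    [4, 8, 8, 8],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
small = downsample_2x2(image)  # a 4x4 image becomes a 2x2 image
```

Each output pixel replaces four input pixels, so both the height and the width are halved.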
Optionally, the preprocessing the compressed image to obtain an image to be recognized includes: and when the image size of the compressed image is smaller than the basic image size of the recognition neural network, filling pixel points in the compressed image according to the basic image size to obtain the image to be recognized.
The present application does not limit the filled pixel points, which may take any value in any color mode, for example rgb(0, 0, 0). The specific position of pixel filling is also not limited and may be any position outside the compressed image; that is, the compressed image itself is not processed, and the image is instead expanded by pixel filling, so that the compressed image is not deformed, which helps improve the efficiency and accuracy of image recognition.
For example, as shown in fig. 2A, the compressed image is placed at the upper left of the image to be recognized, and the positions of the image to be recognized except for the compressed image are filled with pixel points.
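A minimal sketch of the pixel filling described above, assuming the image is a list of rows of RGB tuples and that the fill value is rgb(0, 0, 0). The function name `pad_top_left` and its parameters are hypothetical and chosen only for illustration.

```python
def pad_top_left(image, base_height, base_width, fill=(0, 0, 0)):
    """Place `image` at the upper left of a base_height x base_width canvas.

    The original pixels are copied unchanged, so the image is not deformed;
    every other position is filled with `fill`.
    """
    height, width = len(image), len(image[0])
    if height > base_height or width > base_width:
        raise ValueError("image is larger than the basic image size")
    padded = []
    for r in range(base_height):
        if r < height:
            # Keep the original row and fill the remainder on the right.
            padded.append(list(image[r]) + [fill] * (base_width - width))
        else:
            # Rows below the original image are entirely fill pixels.
            padded.append([fill] * base_width)
    return padded


compressed = [[(255, 0, 0), (0, 255, 0)]]   # a 1x2 compressed image
to_recognize = pad_top_left(compressed, 2, 3)
```

Filling instead of scaling keeps the pixel values of the compressed image untouched, which matches the stated goal of not deforming it.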
Similarly, the preprocessing the input image to obtain the original image includes: when the image size of the input image is smaller than the first basic image size of the compressed neural network, filling pixel points into the input image according to the first basic image size to obtain the original image. Pixel filling enables the original image to be compressed to be recognized by the recognition neural network to obtain reference label information, and pixel filling does not change the compression rate of the input image, which helps improve the efficiency and accuracy of training the compressed neural network.
204: Obtaining a loss function according to the target label information and the reference label information.
In this application, the loss function describes the magnitude of the error between the target label information and the reference label information. The label information includes multiple dimensions, and the loss is generally calculated using a squared-error formula:
Figure BDA0001497600370000101
wherein: c is the dimension of the label information, t_k is the k-th dimension of the reference label information, and y_k is the k-th dimension of the target label information.
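Since the formula itself is only available as a figure, the following is a hedged sketch of one common squared-error form that is consistent with the definitions of c, t_k and y_k: the sum over the c dimensions of (t_k - y_k)^2. Any constant scaling factor (such as 1/2 or 1/c) that the original figure may contain is not reproduced here; the function name `squared_error` is hypothetical.

```python
def squared_error(reference, target):
    """Sum of squared differences between reference and target labels.

    `reference` holds t_1..t_c and `target` holds y_1..y_c. Assumption: no
    constant scaling factor is applied; the formula in the original figure
    may include one.
    """
    if len(reference) != len(target):
        raise ValueError("label vectors must have the same dimension c")
    return sum((t - y) ** 2 for t, y in zip(reference, target))


loss = squared_error([0.5, 0.5], [1.0, 0.0])  # (0.5-1)^2 + (0.5-0)^2
```

A constant factor would change the magnitude of the loss but not which model minimizes it, so the first threshold would simply be scaled accordingly.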
205: judging whether the loss function converges to a first threshold or whether the current training times of the compressed neural network is greater than or equal to a second threshold, if so, executing step 206; if not, go to step 207.
In the training method of the compressed neural network, the training period for each training image consists of a single forward operation and a single reverse gradient propagation; the threshold of the loss function is set as the first threshold, and the threshold of the training times of the compressed neural network is set as the second threshold. That is, if the loss function converges to the first threshold or the training times is greater than or equal to the second threshold, the training of the compressed neural network is completed, and the target model at that point is taken as the compressed neural network model; otherwise, the back-propagation stage of the compressed neural network is entered according to the loss function, i.e. the target model is updated according to the loss function, and training proceeds with the next training image, i.e. step 202-.
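The stopping rule can be sketched as a loop. This is a simplified illustration, not the method of the application: "converges to the first threshold" is assumed to mean that the loss is less than or equal to the first threshold, and `compress`, `recognize`, `loss_fn` and `update` are hypothetical callables standing in for the compressed neural network, the recognition neural network, the loss function and the back-propagation update.

```python
def train(images, labels, model, compress, recognize, loss_fn, update,
          first_threshold, second_threshold):
    """Train until the loss reaches first_threshold or the number of
    training iterations reaches second_threshold."""
    iterations = 0
    while True:
        index = iterations % len(images)             # next training image
        compressed = compress(model, images[index])  # step 202
        reference = recognize(compressed)            # step 203
        loss = loss_fn(reference, labels[index])     # step 204
        iterations += 1
        # Step 205: stop on either condition.
        if loss <= first_threshold or iterations >= second_threshold:
            return model, iterations, loss
        model = update(model, loss)                  # step 207


# Toy usage: the "model" is a single gain that should approach 1.0.
toy = dict(
    compress=lambda m, x: m * x,
    recognize=lambda z: z,
    loss_fn=lambda r, y: (r - y) ** 2,
    update=lambda m, loss: m + 0.5 * (1.0 - m),  # stand-in for backprop
)
model, iterations, loss = train([2.0], [2.0], 0.0,
                                first_threshold=0.01, second_threshold=100,
                                **toy)
capped = train([2.0], [2.0], 0.0,
               first_threshold=0.01, second_threshold=3, **toy)
```

In the first call the loss condition ends training; in the second call the iteration cap ends training first.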
The present application does not limit the reverse training method of the compressed neural network. Optionally, refer to the flowchart of the single-layer neural network operation method provided in fig. 2B; the method of fig. 2B may be applied to the apparatus for performing reverse training of the compressed neural network whose structure is shown in fig. 2C.
As shown in fig. 2C, the apparatus includes an instruction cache unit 21, a controller unit 22, a direct memory access unit 23, an H-tree module 24, a master operation module 25, and a plurality of slave operation modules 26, which may be implemented by hardware circuits (e.g., application specific integrated circuits ASIC).
The instruction cache unit 21 reads in instructions through the direct memory access unit 23 and caches the instructions that have been read; the controller unit 22 reads instructions from the instruction cache unit 21 and translates them into microinstructions that control the behavior of other modules, such as the direct memory access unit 23, the master operation module 25, and the slave operation modules 26; the direct memory access unit 23 can access the external address space and directly read and write data to each cache unit inside the device, thereby completing the loading and storing of data.
Fig. 2D schematically shows the structure of the H-tree module 24. As shown in fig. 2D, the H-tree module 24 constitutes the data path between the master operation module 25 and the plurality of slave operation modules 26 and has an H-tree structure. The H-tree is a binary tree path formed by a plurality of nodes; each node sends upstream data identically to its two downstream nodes, merges the data returned by the two downstream nodes, and returns the merged data to its upstream node. For example, in the reverse operation of the neural network, the vectors returned by the two downstream nodes are added into one vector at the current node and returned to the upstream node. At the stage where the calculation of each layer of the artificial neural network starts, the input gradient in the master operation module 25 is sent to each slave operation module 26 through the H-tree module 24; after the calculation of the slave operation modules 26 is completed, the partial sums of the output gradient vector output by the slave operation modules 26 are added pairwise in the H-tree module 24, that is, all partial sums of the output gradient vector are summed to form the final output gradient vector.
Fig. 2E schematically shows the structure of the master operation module 25. As shown in fig. 2E, the master operation module 25 includes an operation unit 251, a data dependency judgment unit 252, and a neuron buffer unit 253.
The neuron buffer unit 253 buffers the input data and output data used by the master operation module 25 during calculation. The operation unit 251 performs the various operation functions of the master operation module. The data dependency judgment unit 252 is the port through which the operation unit 251 reads and writes the neuron buffer unit 253, and it ensures that reads and writes of data in the neuron buffer unit 253 have no consistency conflicts. Specifically, the data dependency judgment unit 252 determines whether a dependency exists between the data of a microinstruction that has not yet been executed and the data of a microinstruction that is being executed; if not, the microinstruction is allowed to issue immediately; otherwise, the microinstruction is allowed to issue only after all microinstructions on which it depends have finished executing. For example, all microinstructions sent to the data dependency judgment unit 252 may be stored in an instruction queue inside the unit; in this queue, if the read data range of a read instruction conflicts with the write data range of a write instruction located earlier in the queue, the read instruction must wait until the write instruction on which it depends has been executed. The data dependency judgment unit 252 is also responsible for reading the input gradient vector from the neuron buffer unit 253 and sending it to the slave operation modules 26 through the H-tree module 24, while the output data of the slave operation modules 26 is sent directly to the operation unit 251 through the H-tree module 24. The instructions output by the controller unit 22 are sent to the operation unit 251 and the data dependency judgment unit 252 to control their behavior.
Fig. 2F schematically illustrates the structure of the slave operation module 26. As shown in fig. 2F, each slave operation module 26 includes an operation unit 261, a data dependency determination unit 262, a neuron buffer unit 263, a weight buffer unit 264, and a weight gradient buffer unit 265.
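The pairwise addition performed along the H-tree can be sketched in software as a level-by-level reduction. This is only an illustration of the data flow, not of the hardware: the function name `h_tree_reduce` is hypothetical, and carrying an unpaired vector up to the next level is an assumption for inputs whose count is not a power of two.

```python
def h_tree_reduce(partial_sums):
    """Add vectors pairwise, level by level, until one vector remains.

    Each loop iteration corresponds to one level of tree nodes, where every
    node adds the two vectors returned by its downstream nodes.
    """
    if not partial_sums:
        raise ValueError("at least one partial sum is required")
    level = [list(v) for v in partial_sums]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            left, right = level[i], level[i + 1]
            next_level.append([a + b for a, b in zip(left, right)])
        if len(level) % 2:
            # Assumption: an unpaired vector is passed up unchanged.
            next_level.append(level[-1])
        level = next_level
    return level[0]


# Four slave modules, each returning a 2-element partial sum.
total = h_tree_reduce([[1, 2], [3, 4], [5, 6], [7, 8]])
```

With four slave operation modules the reduction takes two levels: two additions at the first level and one at the root.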
The arithmetic unit 261 receives the microinstruction issued by the controller unit 22 and performs arithmetic logic operations.
The data dependency relationship determination unit 262 is responsible for reading and writing operations to the cache unit during the calculation process. The data dependency determination unit 262 ensures that there is no consistency conflict with the reading and writing of the cache unit. Specifically, the data dependency determining unit 262 determines whether there is a dependency between the unexecuted microinstruction and the data of the microinstruction being executed, and if not, allows the microinstruction to be immediately issued, otherwise, the microinstruction is allowed to be issued only after all the microinstructions depended by the microinstruction are completely executed. For example, all microinstructions destined for the data dependency unit 262 may be stored in an instruction queue within the data dependency unit 262 in which a read instruction must wait until the dependent write instruction is executed if the read data range of the read instruction conflicts with the write data range of the write instruction located earlier in the queue.
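The queue check described above can be sketched as a range-overlap test. This is a simplified software analogy under stated assumptions: data ranges are modeled as half-open address intervals `(start, end)`, and the names `ranges_overlap` and `can_issue` are hypothetical.

```python
def ranges_overlap(first, second):
    """Return True if two half-open ranges (start, end) share an address."""
    return first[0] < second[1] and second[0] < first[1]


def can_issue(read_range, earlier_write_ranges):
    """A read may issue only if it conflicts with no earlier pending write.

    `earlier_write_ranges` lists the write data ranges of the write
    instructions located ahead of the read in the instruction queue.
    """
    return not any(ranges_overlap(read_range, write_range)
                   for write_range in earlier_write_ranges)


pending_writes = [(4, 8)]
independent = can_issue((0, 4), pending_writes)  # no shared address
dependent = can_issue((2, 6), pending_writes)    # addresses 4 and 5 overlap
```

A read that fails this check would wait in the queue until the conflicting write has been executed.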
The neuron buffer unit 263 buffers the input gradient vector data and the partial sum of the output gradient vector calculated by the slave operation module 26.
The weight buffer unit 264 buffers the weight vector required by the slave computing module 26 during the calculation process. For each slave module, only the column of the weight matrix corresponding to that slave module 26 is stored.
The weight gradient buffer unit 265 buffers the weight gradient data required by the corresponding slave operation module when updating the weights. Each slave operation module 26 stores only the weight gradient data corresponding to the weight vector it stores.
The slave operation modules 26 implement the first half, which can be computed in parallel, of the process of calculating the output gradient vector in the reverse training of each layer of the artificial neural network, as well as the updating of the weights. Taking a fully-connected layer (MLP) of an artificial neural network as an example, the process is out_gradient = w * in_gradient, where the multiplication of the weight matrix w and the input gradient vector in_gradient can be divided into independent parallel computation subtasks. out_gradient and in_gradient are column vectors; each slave operation module computes only the product of the corresponding scalar elements of in_gradient and the corresponding columns of the weight matrix w. Each output vector obtained is a partial sum of the final result to be accumulated, and the partial sums are added pairwise in the H-tree to obtain the final result. The calculation thus becomes a parallel process of computing partial sums followed by a process of accumulation. Each slave operation module 26 calculates a partial sum of the output gradient vector, and all the partial sums are summed in the H-tree module 24 to obtain the final output gradient vector. Each slave operation module 26 also multiplies the input gradient vector by the output values of each layer from the forward operation to calculate the weight gradient, so as to update the weights stored in that slave operation module 26. Forward operation and reverse training are the two main processes of a neural network algorithm. To train (update) the weights in the network, the neural network first needs to compute the forward output of an input vector in the network formed by the current weights, which is the forward process, and then reversely trains (updates) the weights of each layer, layer by layer, according to the difference between the output value and the labeled value of the input vector.
The output vectors of each layer and the derivative values of the activation functions are saved during the forward calculation. These data are needed by the reverse training process, so they are guaranteed to exist when reverse training begins. The output values of each layer in the forward operation are existing data when the reverse operation starts; they can be cached in the master operation module through the direct memory access unit and sent to the slave operation modules through the H-tree. The master operation module 25 performs subsequent calculation based on the output gradient vector; for example, the output gradient vector is multiplied by the derivative of the activation function from the forward operation to obtain the input gradient value of the next layer. The derivative of the activation function from the forward operation is existing data when the reverse operation starts and can be cached in the master operation module through the direct memory access unit.
According to the embodiment of the invention, an instruction set for executing the forward operation of the artificial neural network on the device is also provided. The instruction set comprises a CONFIG instruction, a COMPUTE instruction, an IO instruction, a NOP instruction, a JUMP instruction and a MOVE instruction, wherein:
the CONFIG instruction configures the various constants required by the calculation of the current layer before the calculation of each layer of the artificial neural network starts;
the COMPUTE instruction completes the arithmetic logic calculation of each layer of artificial neural network;
the IO instruction reads input data required by calculation from an external address space and stores the data back to the external space after the calculation is finished;
the NOP instruction is responsible for emptying the microinstructions currently loaded into all internal microinstruction cache queues, and all instructions before the NOP instruction are guaranteed to be finished. NOP instructions do not contain any operations themselves;
the JUMP instruction controls the jump of the address of the next instruction that the controller reads from the instruction cache unit, and is used to implement jumps in the control flow;
the MOVE instruction is responsible for carrying data at one address in the internal address space of the device to another address in the internal address space of the device, and the process is independent of the arithmetic unit and does not occupy the resources of the arithmetic unit in the execution process.
Fig. 2G is an example block diagram of compressed neural network reverse training provided in an embodiment of the present application. The process of calculating the output gradient vector is out_gradient = w * in_gradient, where the matrix-vector multiplication of the weight matrix w and the input gradient vector in_gradient can be divided into independent parallel computation subtasks; each slave operation module 26 calculates a partial sum of the output gradient vector, and all the partial sums are summed in the H-tree module 24 to obtain the final output gradient vector. In fig. 2G, the output gradient vector of the previous layer is multiplied by the corresponding derivative of the activation function to obtain the input data of the current layer, which is then multiplied by the weight matrix to obtain the output gradient vector. The process of calculating the weight update gradient is dw = x * in_gradient, in which each slave operation module 26 calculates the update gradient for the part of the weights held by that module. The slave operation module 26 multiplies the input gradient by the input neurons from the forward operation to calculate the weight update gradient dw, and then updates the weight w, according to the learning rate set by the instruction, using w, dw, and the weight update gradient dw' used in the previous weight update.
Referring to fig. 2G, the input gradient (input gradient0, …, input gradient3 in fig. 2G) is the output gradient vector of the (n+1)-th layer. It is first multiplied by the derivative values of the n-th layer from the forward operation (f'(out0), …, f'(out3) in fig. 2G) to obtain the input gradient vector of the n-th layer; this step is completed in the master operation module 25, and the result is sent through the H-tree module 24 to the slave operation modules 26 and temporarily stored in the neuron buffer unit 263 of each slave operation module 26. Then, the input gradient vector is multiplied by the weight matrix to obtain the output gradient vector of the n-th layer. In this process, the i-th slave operation module calculates the product of the i-th scalar in the input gradient vector and the column vector [w_i0, …, w_iN] in the weight matrix, and the resulting output vectors are added pairwise in the H-tree module 24 to obtain the final output gradient vector (output gradient0, …, output gradient3 in fig. 2G).
Meanwhile, each slave operation module 26 needs to update the weights stored in that module. The process of calculating the weight update gradient is dw_ij = x_j * in_gradient_i, where x_j is the j-th element of the input vector of the n-th layer in the forward operation (i.e., the output of the (n-1)-th layer), and in_gradient_i is the i-th element of the input gradient vector of the n-th layer in the reverse operation (i.e., the product of the input gradient and the derivative f' in fig. 2G). The input of the n-th layer in the forward operation is data that already exists when reverse training begins; it is sent to the slave operation modules 26 through the H-tree module 24 and temporarily stored in the neuron buffer unit 263. Then, in the slave operation module 26, after the calculation of the partial sum of the output gradient vector is completed, the i-th scalar of the input gradient vector is multiplied by the input vector of the n-th layer from the forward operation to obtain the weight update gradient vector dw, and the weights are updated accordingly.
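The two stages described for fig. 2G can be sketched as follows, assuming vectors are plain Python lists and that slave module i holds the weight vector [w_i0, …, w_iN]. The function names are hypothetical, and a plain sequential accumulation stands in for the pairwise addition in the H-tree module.

```python
def layer_input_gradient(upstream_gradient, activation_derivatives):
    """Element-wise product of the (n+1)-th layer's output gradient and the
    n-th layer's activation derivatives f'(out)."""
    return [g * d for g, d in zip(upstream_gradient, activation_derivatives)]


def slave_partial_sum(scalar, weight_vector):
    """Work of one slave module: its scalar times its stored weight vector."""
    return [scalar * w for w in weight_vector]


def output_gradient(in_gradient, slave_weights):
    """Sum the partial sums of all slave modules into the output gradient."""
    partials = [slave_partial_sum(g, w)
                for g, w in zip(in_gradient, slave_weights)]
    result = [0.0] * len(slave_weights[0])
    for partial in partials:
        # Sequential accumulation; the hardware adds pairwise in a tree.
        result = [a + b for a, b in zip(result, partial)]
    return result


in_gradient = layer_input_gradient([2.0, 4.0], [0.5, 0.5])
slave_weights = [[1.0, 0.0, 2.0],   # held by slave module 0
                 [0.5, 1.0, 0.0]]   # held by slave module 1
out_gradient = output_gradient(in_gradient, slave_weights)
```

Because each slave module needs only one scalar of the input gradient and its own weight vector, the partial sums can be computed independently and in parallel.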
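The weight gradient computed in one slave module can be sketched as below. The gradient formula dw_ij = x_j * in_gradient_i follows the text; the update rule shown is plain gradient descent and is an assumption, since the text states that w, dw, dw' and the learning rate are used but does not give the exact update formula. The function names are hypothetical.

```python
def weight_gradient(in_gradient_i, forward_input):
    """dw_ij = x_j * in_gradient_i for the weights held by slave module i."""
    return [x_j * in_gradient_i for x_j in forward_input]


def gradient_descent_update(weights, dw, learning_rate):
    """Assumed update rule: w = w - learning_rate * dw.

    The previous gradient dw' mentioned in the text is not used here,
    because how it enters the update is not specified.
    """
    return [w - learning_rate * g for w, g in zip(weights, dw)]


x = [1.0, 3.0]                       # input of the n-th layer (forward pass)
dw = weight_gradient(2.0, x)         # in_gradient_i = 2.0
new_weights = gradient_descent_update([1.0, 1.0], dw, learning_rate=0.5)
```

Each slave module can run this step on its own weights without exchanging data with other slave modules.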
As shown in fig. 2B, an IO instruction is pre-stored at the first address of the instruction cache unit. The controller unit reads this IO instruction from the first address of the instruction cache unit, and according to the translated microinstruction, the direct memory access unit reads all instructions related to the single-layer artificial neural network reverse training from the external address space and caches them in the instruction cache unit.
The controller unit then reads in the next IO instruction from the instruction cache unit, and according to the translated microinstruction, the direct memory access unit reads all data required by the master operation module from the external address space into the neuron cache unit of the master operation module, the data including the input neurons and activation function derivative values from the forward operation and the input gradient vector.
The controller unit then reads in the next IO instruction from the instruction cache unit, and according to the translated microinstruction, the direct memory access unit reads all weight data and weight gradient data required by the slave operation modules from the external address space and stores them in the weight cache unit and the weight gradient cache unit of the corresponding slave operation module, respectively.
The controller unit then reads in the next CONFIG instruction from the instruction cache unit, and the operation unit configures the values of its internal registers according to the parameters in the translated microinstruction, including the various constants required by the neural network calculation of this layer, the precision setting of the calculation of this layer, the learning rate used when updating the weights, and the like.
The controller unit then reads in the next COMPUTE instruction from the instruction cache unit, and according to the translated microinstruction, the master operation module sends the input gradient vector and the input neurons from the forward operation to each slave operation module through the H-tree module, where they are stored in the neuron cache unit of the slave operation module.
According to the microinstruction decoded from the COMPUTE instruction, the operation unit of each slave operation module reads the weight vector (i.e., the part of the columns of the weight matrix stored by that slave operation module) from the weight cache unit, completes the vector-times-scalar operation of the weight vector and the input gradient vector, and returns the partial sum of the output vector through the H-tree; meanwhile, the slave operation module multiplies the input gradient vector by the input neurons to obtain the weight gradient and stores it in the weight gradient cache unit.
In the H-tree module, the output gradient partial sums returned by the slave operation modules are added pairwise, stage by stage, to obtain the complete output gradient vector.
The master operation module obtains the value returned by the H-tree module, reads the activation function derivative values from the forward operation from the neuron cache unit according to the microinstruction decoded from the COMPUTE instruction, multiplies the derivative values by the returned output vector to obtain the input gradient vector for the next layer of reverse training, and writes it back to the neuron cache unit.
The controller unit then reads in the next COMPUTE instruction from the instruction cache unit, and according to the decoded microinstruction, each slave operation module reads the weight w from the weight cache unit, reads the current weight gradient dw and the weight gradient dw' used in the previous weight update from the weight gradient cache unit, and updates the weight w.
Finally, the controller unit reads in the next IO instruction from the instruction cache unit, and according to the translated microinstruction, the direct memory access unit stores the output gradient vector in the neuron cache unit to the designated address in the external address space, and the operation ends.
For the multilayer artificial neural network, the implementation process is similar to that of a single-layer neural network, after the previous layer artificial neural network is executed, the calculation instruction of the next layer performs the calculation process by using the output gradient vector calculated in the main calculation module as the input gradient vector of the next layer training, and the weight address and the weight gradient address in the instruction are changed to the address corresponding to the current layer.
By adopting the device for executing the reverse training of the neural network, the support for the forward operation of the multilayer artificial neural network is effectively improved. In addition, by adopting a dedicated on-chip cache for multilayer neural network reverse training, the reusability of input neurons and weight data is fully exploited, repeated reads of these data from memory are avoided, the memory access bandwidth is reduced, and memory bandwidth is prevented from becoming the bottleneck of the forward operation performance of the multilayer artificial neural network.
206: Acquiring a target original image with the first resolution, and compressing the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution.
The target original image is an image whose label information type is consistent with that of the training images (i.e., an image belonging to the same dataset). If the loss function converges to the first threshold or the training times is greater than or equal to the second threshold, the training of the compressed neural network is completed; the target original image can then be input directly to the compressed neural network for image compression to obtain the target compressed image, and the target compressed image can be recognized by the recognition neural network.
Optionally, after the compressing the target original image based on the compressed neural network model to obtain the target compressed image at the second resolution, the method further includes: and identifying the target compressed image based on the identification neural network model to obtain the label information of the target original image, and storing the label information of the target original image.
That is to say, after the training of the compressed neural network is completed, the compressed image can be recognized based on the recognition neural network model, which improves on the efficiency and accuracy of manually recognizing label information.
207: Updating the target model according to the loss function to obtain an updated model, taking the updated model as the target model, taking the next training image as the original image, and executing step 202.
It can be understood that the loss function is obtained from the reference label value produced by the trained recognition neural network model and the target label value included in the original image. Training is completed when the loss function meets a preset condition or the current training times of the compressed neural network exceeds a preset threshold; otherwise, the weights of the compressed neural network are repeatedly adjusted by training the compressed neural network, that is, the image content represented by each pixel in the same image is adjusted, so that the loss of the compressed neural network is reduced. Using the compressed neural network model obtained by training for image compression improves the effectiveness of image compression and thereby helps improve the accuracy of recognition.
Referring to fig. 3, fig. 3 is a schematic structural diagram of an image compression apparatus according to an embodiment of the present application, and as shown in fig. 3, the apparatus 300 includes: a processor 301 and a memory 302 connected to the processor 301.
In this embodiment of the application, the memory 302 is configured to store a first threshold, a second threshold, the current neural network model and the training times of the compressed neural network, the compressed training atlas of the compressed neural network and the label information of each training image in the compressed training atlas, the recognition neural network model, and the compressed neural network model, where the current neural network model of the compressed neural network is used as the target model, the compressed neural network model is the target model at the time the training of the compressed neural network is completed, and the recognition neural network model is the neural network model at the time the training of the recognition neural network is completed.
The processor 301 is configured to acquire an original image with a first resolution, where the original image is any training image in the compressed training atlas, and the label information of the original image is used as target label information; compress the original image based on the target model to obtain a compressed image with a second resolution, where the second resolution is lower than the first resolution; recognize the compressed image based on the recognition neural network model to obtain reference label information; obtain a loss function according to the target label information and the reference label information; when the loss function converges to the first threshold or the training times is greater than or equal to the second threshold, acquire a target original image with the first resolution, and determine that the target model is the compressed neural network model; and compress the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution.
Optionally, the processor 301 is further configured to, when the loss function does not converge to the first threshold, or the training times is smaller than the second threshold, update the target model according to the loss function to obtain an updated model, use the updated model as the target model, use the next training image as the original image, and perform the step of acquiring the original image with the first resolution.
Optionally, the processor 301 is specifically configured to perform preprocessing on the compressed image to obtain an image to be identified; and identifying the image to be identified based on the identification neural network model to obtain the reference label information.
Optionally, the preprocessing includes a size processing, and the memory 302 is further configured to store a basic image size of the identified neural network; the processor 301 is specifically configured to, when the image size of the compressed image is smaller than the size of the basic image, fill pixel points in the compressed image according to the size of the basic image to obtain the image to be identified.
Optionally, the compressed training atlas at least includes an identification training atlas, and the processor 301 is further configured to train the identification neural network by using the identification training atlas to obtain the identification neural network model, where each training image in the identification training atlas at least includes tag information of a type consistent with the type of the target tag information.
Optionally, the processor 301 is further configured to identify the target compressed image based on the identified neural network model, so as to obtain tag information of the target original image;
the memory 302 is also used for storing the label information of the target original image.
Optionally, the compressed training image set includes multiple dimensions, and the processor 301 is specifically configured to recognize the original image based on the target model to obtain multiple pieces of image information, where each dimension corresponds to one piece of image information, and to compress the original image based on the target model and the multiple pieces of image information to obtain the compressed image.
It can be understood that a compressed image of the original image is obtained based on the target model, reference label information of the compressed image is obtained based on the recognition neural network model, and a loss function is obtained according to the target label information of the original image and the reference label information. When the loss function converges to the first threshold or the current number of training iterations of the compressed neural network is greater than or equal to the second threshold, training of the compressed neural network used for image compression is complete, the target model is taken as the compressed neural network model, and a target compressed image of the target original image can be obtained based on the compressed neural network model. In other words, the loss function is obtained from the reference label value produced by the trained recognition neural network model and the target label value of the original image. Training is complete when the loss function satisfies the preset condition or the current number of training iterations of the compressed neural network reaches the preset threshold; otherwise, the compressed neural network continues to be trained and its weights are adjusted repeatedly, that is, the image content represented by each pixel in the same image is adjusted, so that the loss of the compressed neural network is reduced. This improves the effectiveness of image compression and thereby helps improve recognition accuracy.
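The passage above does not fix the form of the loss function. As one common choice for classification labels, assumed here purely for illustration, the loss can be the cross-entropy between the target label and the class probabilities output by the recognition model. The function name and the representation of labels (an integer class index and a list of probabilities) are hypothetical.

```python
import math


def cross_entropy_loss(target_class, reference_probs, eps=1e-12):
    """Cross-entropy between a target class index and the reference
    class probabilities predicted for the compressed image.

    Assumption: the target label is a single class index; eps guards
    against taking the logarithm of zero.
    """
    p = max(reference_probs[target_class], eps)
    return -math.log(p)
```

The loss is zero when the recognition model assigns probability 1 to the target class and grows as that probability shrinks, so a lower loss means the compressed image preserved more of the information the recognizer relies on.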
In one embodiment, the application discloses an electronic device comprising the image compression device.
In one embodiment, the present application discloses an electronic device, as shown in fig. 4, the electronic device 400 includes a processor 401, a memory 402, a communication interface 403, and one or more programs 404, wherein the one or more programs 404 are stored in the memory 402 and configured to be executed by the processor 401, and the program 404 includes instructions for performing some or all of the steps described in the image compression method.
The electronic device includes, but is not limited to, a robot, a computer, a printer, a scanner, a tablet computer, a smart terminal, a mobile phone, a vehicle data recorder, a navigator, a sensor, a webcam, a cloud server, a camera, a video camera, a projector, a watch, an earphone, a mobile storage device, a wearable device, a vehicle, a household appliance, and a medical device.
The vehicle includes an airplane, a ship and/or a motor vehicle; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, an electric cooker, a humidifier, a washing machine, an electric lamp, a gas stove and a range hood; the medical device includes a nuclear magnetic resonance apparatus, a B-mode ultrasound apparatus and/or an electrocardiograph.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network Personal Computers (PCs), minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
In another embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the implementation described in the image compression method.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functions. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method can be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It is to be noted that, in the attached drawings or in the description, the implementation modes not shown or described are all the modes known by the ordinary skilled person in the field of technology, and are not described in detail. Further, the above definitions of the various elements and methods are not limited to the various specific structures, shapes or arrangements of parts mentioned in the examples, which may be easily modified or substituted by those of ordinary skill in the art.
The embodiments described above illustrate the present invention in further detail. It should be understood that they are only illustrative of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. An image compression method, comprising:
acquiring an original image with a first resolution, wherein the original image is any one training image in a compressed training image set of a compressed neural network, and label information of the original image is taken as target label information;
compressing the original image based on a target model to obtain a compressed image with a second resolution, wherein the second resolution is smaller than the first resolution, and the target model is a current neural network model of the compressed neural network;
recognizing the compressed image based on a recognition neural network model to obtain reference label information, wherein the recognition neural network model is the neural network model corresponding to completion of training of a recognition neural network;
obtaining a loss function according to the target label information and the reference label information;
when the loss function converges to a first threshold or the current number of training iterations of the compressed neural network is greater than or equal to a second threshold, acquiring a target original image with the first resolution, and taking the target model as the compressed neural network model corresponding to completion of training of the compressed neural network, wherein the type of the label information of the target original image is consistent with the type of the label information of the original image;
and compressing the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution.
2. The method of claim 1, further comprising:
and when the loss function has not converged to the first threshold or the current number of training iterations of the compressed neural network is smaller than the second threshold, updating the target model according to the loss function to obtain an updated model, taking the updated model as the target model, taking the next training image as the original image, and executing the step of acquiring the original image with the first resolution.
3. The method according to claim 1 or 2, wherein the recognizing the compressed image based on the recognition neural network model to obtain the reference label information comprises:
preprocessing the compressed image to obtain an image to be recognized;
and recognizing the image to be recognized based on the recognition neural network model to obtain the reference label information.
4. The method according to claim 3, wherein the preprocessing comprises size processing, and the preprocessing the compressed image to obtain the image to be recognized comprises:
and when the image size of the compressed image is smaller than the basic image size of the recognition neural network, filling pixel points in the compressed image according to the basic image size to obtain the image to be recognized.
5. The method of claim 1 or 2, wherein the compressed training image set includes at least a recognition training image set, the method further comprising:
and training the recognition neural network by adopting the recognition training image set to obtain the recognition neural network model, wherein each training image in the recognition training image set at least comprises label information with the type consistent with that of the target label information.
6. The method according to claim 1 or 2, wherein after the compressing the target original image based on the compressed neural network model to obtain the target compressed image of the second resolution, the method further comprises:
and recognizing the target compressed image based on the recognition neural network model to obtain the label information of the target original image, and storing the label information of the target original image.
7. The method of claim 1 or 2, wherein the compressed training image set includes a plurality of dimensions, and wherein the compressing the original image based on the target model to obtain a compressed image with a second resolution comprises:
recognizing the original image based on the target model to obtain a plurality of pieces of image information, wherein each dimension corresponds to one piece of image information;
and compressing the original image based on the target model and the image information to obtain the compressed image.
8. An image compression apparatus comprising a processor and a memory coupled to the processor, wherein:
the memory is configured to store a first threshold, a second threshold, a current neural network model and a number of training iterations of a compressed neural network, a compressed training image set of the compressed neural network and label information of each training image in the compressed training image set, a recognition neural network model, and a compressed neural network model, wherein the current neural network model of the compressed neural network is used as a target model, the compressed neural network model is the target model corresponding to completion of training of the compressed neural network, and the recognition neural network model is the neural network model corresponding to completion of training of a recognition neural network;
the processor is configured to: acquire an original image with a first resolution, wherein the original image is any training image in the compressed training image set, and label information of the original image is used as target label information; compress the original image based on the target model to obtain a compressed image with a second resolution, wherein the second resolution is smaller than the first resolution; recognize the compressed image based on the recognition neural network model to obtain reference label information; obtain a loss function according to the target label information and the reference label information; when the loss function converges to the first threshold or the number of training iterations is greater than or equal to the second threshold, acquire a target original image with the first resolution and determine that the target model is the compressed neural network model, wherein the type of the label information of the target original image is consistent with the type of the label information of the original image; and compress the target original image based on the compressed neural network model to obtain a target compressed image with the second resolution.
9. The apparatus of claim 8, wherein the processor is further configured to, when the loss function has not converged to the first threshold or the number of training iterations is smaller than the second threshold, update the target model according to the loss function to obtain an updated model, take the updated model as the target model, take the next training image as the original image, and perform the step of acquiring the original image with the first resolution.
10. The apparatus according to claim 8 or 9, wherein the processor is specifically configured to preprocess the compressed image to obtain an image to be recognized, and to recognize the image to be recognized based on the recognition neural network model to obtain the reference label information.
11. The apparatus of claim 10, wherein the pre-processing comprises size processing,
the memory is further used for storing a basic image size of the recognition neural network;
and the processor is specifically configured to, when the image size of the compressed image is smaller than the basic image size, fill pixel points into the compressed image according to the basic image size to obtain the image to be recognized.
12. The apparatus according to claim 8 or 9, wherein the compressed training image set includes at least a recognition training image set, the processor is further configured to train the recognition neural network with the recognition training image set to obtain the recognition neural network model, and each training image in the recognition training image set includes at least label information of a type consistent with the type of the target label information.
13. The apparatus according to claim 8 or 9, wherein the processor is further configured to recognize the target compressed image based on the recognition neural network model to obtain label information of the target original image;
the memory is further configured to store the label information of the target original image.
14. The apparatus according to claim 8 or 9, wherein the compressed training image set includes a plurality of dimensions, and the processor is specifically configured to recognize the original image based on the target model to obtain a plurality of pieces of image information, each dimension corresponding to one piece of image information, and to compress the original image based on the target model and the plurality of pieces of image information to obtain the compressed image.
15. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1-7.
16. A computer-readable storage medium, having stored thereon a computer program comprising program instructions, which, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-7.
CN201711289667.8A 2017-10-20 2017-12-07 Image compression method and related device Active CN109903350B (en)

Priority Applications (11)

Application Number Priority Date Filing Date Title
CN201711289667.8A CN109903350B (en) 2017-12-07 2017-12-07 Image compression method and related device
PCT/CN2018/095548 WO2019076095A1 (en) 2017-10-20 2018-07-13 Processing method and apparatus
US16/482,710 US11593658B2 (en) 2017-10-20 2018-07-13 Processing method and device
EP18868807.1A EP3627397B1 (en) 2017-10-20 2018-07-13 Processing method and apparatus
EP19215859.0A EP3660628B1 (en) 2017-10-20 2018-07-13 Dynamic voltage frequency scaling device and method
EP19215858.2A EP3667569A1 (en) 2017-10-20 2018-07-13 Processing method and device, operation method and device
KR1020197037566A KR102434728B1 (en) 2017-10-20 2018-07-13 Processing method and apparatus
KR1020197037574A KR102434729B1 (en) 2017-10-20 2018-07-13 Processing method and apparatus
EP19215860.8A EP3660706B1 (en) 2017-10-20 2018-07-13 Convolutional operation device and method
KR1020197023878A KR102434726B1 (en) 2017-10-20 2018-07-13 Treatment method and device
US16/529,041 US10540574B2 (en) 2017-12-07 2019-08-01 Image compression method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711289667.8A CN109903350B (en) 2017-12-07 2017-12-07 Image compression method and related device

Publications (2)

Publication Number Publication Date
CN109903350A CN109903350A (en) 2019-06-18
CN109903350B true CN109903350B (en) 2021-08-06

Family

ID=66939820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711289667.8A Active CN109903350B (en) 2017-10-20 2017-12-07 Image compression method and related device

Country Status (1)

Country Link
CN (1) CN109903350B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110808738B (en) * 2019-09-16 2023-10-20 平安科技(深圳)有限公司 Data compression method, device, equipment and computer readable storage medium
CN113657136B (en) * 2020-05-12 2024-02-13 阿里巴巴集团控股有限公司 Identification method and device
CN112954011B (en) * 2021-01-27 2023-11-10 上海淇玥信息技术有限公司 Image resource compression method and device and electronic equipment
CN113065579B (en) * 2021-03-12 2022-04-12 支付宝(杭州)信息技术有限公司 Method and device for classifying target object
CN113422950B (en) * 2021-05-31 2022-09-30 北京达佳互联信息技术有限公司 Training method and training device, image data processing method and device, electronic device, and storage medium
CN117440172B (en) * 2023-12-20 2024-03-19 江苏金融租赁股份有限公司 Picture compression method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105809704A (en) * 2016-03-30 2016-07-27 北京小米移动软件有限公司 Method and device for identifying image definition
CN105976400A (en) * 2016-05-10 2016-09-28 北京旷视科技有限公司 Object tracking method and device based on neural network model
CN106096670A (en) * 2016-06-17 2016-11-09 北京市商汤科技开发有限公司 Concatenated convolutional neural network training and image detecting method, Apparatus and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105163121B (en) * 2015-08-24 2018-04-17 西安电子科技大学 Big compression ratio satellite remote sensing images compression method based on depth autoencoder network
US10432953B2 (en) * 2016-02-05 2019-10-01 Deepmind Technologies Limited Compressing images using neural networks
US10204286B2 (en) * 2016-02-29 2019-02-12 Emersys, Inc. Self-organizing discrete recurrent network digital image codec
CN106296692A (en) * 2016-08-11 2017-01-04 深圳市未来媒体技术研究院 Image significance detection method based on antagonism network
CN107018422B (en) * 2017-04-27 2019-11-05 四川大学 Still image compression method based on depth convolutional neural networks
CN107301668B (en) * 2017-06-14 2019-03-15 成都四方伟业软件股份有限公司 A kind of picture compression method based on sparse matrix, convolutional neural networks
CN107403415B (en) * 2017-07-21 2021-04-09 深圳大学 Compressed depth map quality enhancement method and device based on full convolution neural network
CN107403166B (en) * 2017-08-02 2021-01-26 广东工业大学 Method and device for extracting pore characteristics of face image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
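Theory and Methods of Image Compression (Coding) Based on Neural Networks and SVM;高绪慧;《China Master's Theses Full-text Database, Information Science and Technology》;20080215(No. 2);I136-59 *
Think your images are too large?! Convolutional neural networks easily achieve lossless compression to 20%;全球人工智能;《https://www.sohu.com/a/163460325_642762》;20170810;1-7 *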

Also Published As

Publication number Publication date
CN109903350A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109903350B (en) Image compression method and related device
CN110050267B (en) System and method for data management
CN111860812B (en) Apparatus and method for performing convolutional neural network training
CN107895191B (en) Information processing method and related product
US10540574B2 (en) Image compression method and related device
WO2021077557A1 (en) Magnetic resonance image reconstruction method and apparatus, device, and medium
EP3564863B1 (en) Apparatus for executing lstm neural network operation, and operational method
CN106855952B (en) Neural network-based computing method and device
CN111260025A (en) Apparatus and method for performing LSTM neural network operations
WO2022228425A1 (en) Model training method and apparatus
EP3451238A1 (en) Apparatus and method for executing pooling operation
CN111160547B (en) Device and method for artificial neural network operation
CN112115801B (en) Dynamic gesture recognition method and device, storage medium and terminal equipment
CN112749666B (en) Training and action recognition method of action recognition model and related device
CN114359289A (en) Image processing method and related device
US11521007B2 (en) Accelerator resource utilization by neural networks
CN111047020B (en) Neural network operation device and method supporting compression and decompression
CN114139630A (en) Gesture recognition method and device, storage medium and electronic equipment
CN114925320A (en) Data processing method and related device
CN110647356A (en) Arithmetic device and related product
CN111652349A (en) Neural network processing method and related equipment
CN111860772B (en) Device and method for executing artificial neural network mapping operation
CN114090466A (en) Instruction processing device and method, computer equipment and storage medium
CN114546491A (en) Data operation method, data operation device and data processor
CN114254563A (en) Data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant