TW202032423A - Method for image processing and apparatus thereof

Method for image processing and apparatus thereof

Info

Publication number
TW202032423A
TW202032423A
Authority
TW
Taiwan
Prior art keywords
image
adversarial
processing
feature
data
Prior art date
Application number
TW108146511A
Other languages
Chinese (zh)
Inventor
韓江帆
董瀟逸
張瑞茂
羅平
張衛明
俞能海
王曉剛
Original Assignee
大陸商北京市商湯科技開發有限公司
Priority date
Filing date
Publication date
Application filed by 大陸商北京市商湯科技開發有限公司
Publication of TW202032423A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses an image processing method and apparatus. The method includes: performing feature extraction on an image to be processed to obtain a first feature image; fusing the first feature image with adversarial data to obtain a second feature image; and decoding the second feature image to obtain an adversarial image, where the class of the adversarial image is the same as the class indicated by the adversarial data. Corresponding apparatuses are also disclosed.

Description

Image processing method and device

The present disclosure relates to, but is not limited to, the field of image processing technology, and in particular to an image processing method and device.

In the field of image recognition, neural-network-based methods have outperformed earlier image processing techniques and work well in many applications, such as face recognition, object recognition, and handwritten text recognition. Previously, the recognition accuracy of neural networks on these tasks was low, and few people cared when recognition went wrong. As deep learning algorithms have improved, recognition accuracy has risen greatly, and studying the mistakes neural networks make has become valuable. One such class of mistakes involves adversarial images.

An adversarial image is an image that, after a small adjustment, causes a neural network to output a wrong result; the small adjustment is called adversarial data. Different adversarial data yield different adversarial images. Feeding a generated adversarial image into the attacked neural network exposes that network's defects. In addition, adversarial images can be used for adversarial training of neural networks, improving their resistance to interference.

The present disclosure provides an image processing method and device.

In a first aspect, an image processing method is provided, including: performing feature extraction on an image to be processed to obtain a first feature image; fusing the first feature image with adversarial data to obtain a second feature image; and decoding the second feature image to obtain an adversarial image.

In one possible implementation, fusing the first feature image with the adversarial data to obtain the second feature image includes: preprocessing the adversarial data to obtain a third feature image; fusing the third feature image with the first feature image to obtain a fourth feature image; and convolving the fourth feature image to obtain the second feature image.

In another possible implementation, preprocessing the adversarial data to obtain the third feature image includes: encoding the adversarial data to obtain encoded adversarial data; and padding the encoded adversarial data with a preset value so that the third feature image obtained after padding has the same size as the first feature image.

In yet another possible implementation, encoding the adversarial data to obtain the encoded adversarial data includes: performing one-hot encoding on the adversarial data to obtain the encoded adversarial data.

In yet another possible implementation, fusing the third feature image with the first feature image to obtain the fourth feature image includes: concatenating the third feature image and the first feature image along the channel dimension to obtain the fourth feature image.

In yet another possible implementation, fusing the first feature image with the adversarial data to obtain the second feature image further includes: performing feature extraction on the adversarial data to obtain a weight matrix for the first feature image; and multiplying the first feature image element-wise by the weight matrix along the channel dimension to obtain the second feature image.

In yet another possible implementation, performing feature extraction on the adversarial data to obtain the weight matrix of the first feature image includes: linearly transforming the adversarial data to obtain linearly transformed adversarial data; and nonlinearly transforming the linearly transformed adversarial data to obtain the weight matrix of the first feature image.

In yet another possible implementation, linearly transforming the adversarial data to obtain the linearly transformed adversarial data includes: obtaining weights for the adversarial data; and computing a weighted sum of the adversarial data according to the weights to obtain the linearly transformed adversarial data.

In yet another possible implementation, nonlinearly transforming the linearly transformed adversarial data to obtain the weight matrix of the first feature image includes: substituting the linearly transformed adversarial data into an activation function to obtain the weight matrix of the first feature image.

In yet another possible implementation, performing feature extraction on the image to be processed to obtain the first feature image includes: convolving the image to be processed to obtain the first feature image.

In yet another possible implementation, decoding the second feature image to obtain the adversarial image includes: convolving the second feature image to obtain a fifth feature image; fusing the second feature image with the fifth feature image to obtain a sixth feature image; and deconvolving the sixth feature image to obtain the adversarial image.

In yet another possible implementation, a multi-target adversarial generation network performs the feature extraction on the image to be processed to obtain the first feature image, the fusion of the first feature image with the adversarial data to obtain the second feature image, and the decoding of the second feature image to obtain the adversarial image.

In yet another possible implementation, the multi-target adversarial generation network is trained by back-propagation based on a loss function given by:

$L(x) = L_{cls}(H(F_\theta(x,t)), t) + \alpha L_{re}(x, F_\theta(x,t));$

where $F_\theta(x,t)$ is the adversarial image, $t$ is the adversarial data, $L_{cls}$ is the cross-entropy loss function, $H(F_\theta(x,t))$ is the class obtained by inputting the adversarial image into the attacked neural network, $\alpha$ is a natural number, and $L_{re} = \|x - F_\theta(x,t)\|_p$ with $p \in \{0, 1, 2, \infty\}$.

In a second aspect, an image processing device is provided, including: a first processing unit configured to perform feature extraction on an image to be processed to obtain a first feature image; a fusion processing unit configured to fuse the first feature image with adversarial data to obtain a second feature image; and a second processing unit configured to decode the second feature image to obtain an adversarial image.

In one possible implementation, the fusion processing unit includes: a preprocessing subunit configured to preprocess the adversarial data to obtain a third feature image; a first fusion processing subunit configured to fuse the third feature image with the first feature image to obtain a fourth feature image; and a first processing subunit configured to convolve the fourth feature image to obtain the second feature image.

In another possible implementation, the preprocessing subunit is further configured to encode the adversarial data to obtain encoded adversarial data, and to pad the encoded adversarial data with a preset value so that the third feature image obtained after padding has the same size as the first feature image.

In yet another possible implementation, the preprocessing subunit is further configured to perform one-hot encoding on the adversarial data to obtain the encoded adversarial data.

In yet another possible implementation, the first fusion processing subunit is further configured to concatenate the third feature image and the first feature image along the channel dimension to obtain the fourth feature image.

In yet another possible implementation, the fusion processing unit further includes: a feature extraction subunit configured to perform feature extraction on the adversarial data to obtain a weight matrix for the first feature image; and a second processing subunit configured to multiply the first feature image element-wise by the weight matrix along the channel dimension to obtain the second feature image.

In yet another possible implementation, the feature extraction subunit is further configured to linearly transform the adversarial data to obtain linearly transformed adversarial data, and to nonlinearly transform the linearly transformed adversarial data to obtain the weight matrix of the first feature image.

In yet another possible implementation, the feature extraction subunit is further configured to obtain weights for the adversarial data, and to compute a weighted sum of the adversarial data according to the weights to obtain the linearly transformed adversarial data.

In yet another possible implementation, the feature extraction subunit is further configured to substitute the linearly transformed adversarial data into an activation function to obtain the weight matrix of the first feature image.

In yet another possible implementation, the first processing unit includes a third processing subunit configured to convolve the image to be processed to obtain the first feature image.

In yet another possible implementation, the second processing unit includes: a fourth processing subunit configured to convolve the second feature image to obtain a fifth feature image; a second fusion subunit configured to fuse the second feature image with the fifth feature image to obtain a sixth feature image; and a fifth processing subunit configured to deconvolve the sixth feature image to obtain the adversarial image.

In yet another possible implementation, the device further includes a multi-target adversarial generation network configured to perform feature extraction on the image to be processed to obtain the first feature image, to fuse the first feature image with the adversarial data to obtain the second feature image, and to decode the second feature image to obtain the adversarial image.

In yet another possible implementation, the device further includes a training unit configured to train the multi-target adversarial generation network by back-propagation based on a loss function given by:

$L(x) = L_{cls}(H(F_\theta(x,t)), t) + \alpha L_{re}(x, F_\theta(x,t));$

where $F_\theta(x,t)$ is the adversarial image, $t$ is the adversarial data, $L_{cls}$ is the cross-entropy loss function, $H(F_\theta(x,t))$ is the class obtained by inputting the adversarial image into the attacked neural network, $\alpha$ is a natural number, and $L_{re} = \|x - F_\theta(x,t)\|_p$ with $p \in \{0, 1, 2, \infty\}$.

In a third aspect, the present disclosure provides an image processing device, including a processor and a memory coupled to the processor, where the memory stores program instructions that, when executed by the processor, cause the processor to perform the method of any implementation of the first aspect.

In a fourth aspect, the present disclosure provides a computer-readable storage medium storing a computer program that includes program instructions which, when executed by a processor of a batch processing device, cause the processor to perform the method of any implementation of the first aspect.

In a fifth aspect, the present disclosure provides a computer program product which, when executed by a processor, implements the method of any implementation of the first aspect.

In the present disclosure, the adversarial data is fused with the feature image of the image to obtain a fused feature image, and the fused feature image is then decoded to obtain an adversarial image containing the adversarial data. The generated adversarial image can be used, among other things, to attack an already trained neural network and expose its defects, and the trained network can then be trained on the adversarial images to remedy those defects, improving its robustness.

1‧‧‧image processing device
11‧‧‧first processing unit
111‧‧‧third processing subunit
12‧‧‧fusion processing unit
121‧‧‧preprocessing subunit
122‧‧‧first fusion subunit
123‧‧‧first processing subunit
124‧‧‧feature extraction subunit
125‧‧‧second processing subunit
13‧‧‧second processing unit
131‧‧‧fourth processing subunit
132‧‧‧second fusion subunit
133‧‧‧fifth processing subunit
14‧‧‧training unit
2‧‧‧image processing device
21‧‧‧processor
22‧‧‧input device
23‧‧‧output device
24‧‧‧memory

To explain the technical solutions in the embodiments of the present disclosure or the background art more clearly, the drawings required by the embodiments or the background art are described below.

FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the disclosure;

FIG. 2 is a schematic flowchart of a method for extracting image features provided by an embodiment of the disclosure;

FIG. 3 is a schematic flowchart of fusing adversarial data with a first feature image provided by an embodiment of the disclosure;

FIG. 4 is a schematic structural diagram of a module for fusing adversarial data with a first feature image provided by an embodiment of the disclosure;

FIG. 5 is another schematic flowchart of fusing adversarial data with a first feature image provided by an embodiment of the disclosure;

FIG. 6 is a schematic structural diagram of another module for fusing adversarial data with a first feature image provided by an embodiment of the disclosure;

FIG. 7 is a schematic flowchart of a method for decoding a feature image provided by an embodiment of the disclosure;

FIG. 8 is a schematic structural diagram of an image processing device provided by an embodiment of the disclosure;

FIG. 9 is a schematic diagram of the hardware structure of an image processing device provided by an embodiment of the disclosure.

To enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.

The terms "first", "second", and the like in the specification, claims, and drawings of the present disclosure are used to distinguish different objects, not to describe a specific order. In addition, the terms "including" and "having", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.

Reference to an "embodiment" herein means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of the present disclosure. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

The multi-target adversarial generation network provided by the present disclosure (hereinafter, the first neural network) can adjust an image to be processed according to input adversarial data to obtain a corresponding adversarial image. The adversarial image can be used to attack an already trained neural network (hereinafter, the second neural network): attacking the second neural network with adversarial images exposes its defects, and the second neural network can then be trained on the adversarial images to remedy those defects and improve its robustness.

To explain the technical solutions in the embodiments of the present disclosure or the background art more clearly, the drawings required by the embodiments or the background art are described below. The embodiments of the present disclosure are described below in conjunction with the drawings in the embodiments of the present disclosure.

Please refer to FIG. 1, which is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.

101. Perform feature extraction on the image to be processed to obtain a first feature image.

An adversarial image is an image that, after a small adjustment, causes a neural network to output a wrong result; the small adjustment is called adversarial data, and different adversarial data yield different adversarial images. The embodiments of the present disclosure provide a multi-target adversarial generation network for generating adversarial images, i.e., the first neural network processes the adversarial data and the image to be processed to obtain an adversarial image containing the adversarial data. Feature extraction is performed on the image to be processed, reducing the image size while obtaining the feature image of the image to be processed.

The feature extraction may include, but is not limited to, convolution. In one possible implementation, for any pixel of the image to be processed, the center of the convolution kernel is aligned with that pixel, each element of the kernel is multiplied by the corresponding pixel of the image, and the products are summed to give the convolution value at that pixel. Applying this convolution to every pixel of the image to be processed reduces the image size and extracts the first feature image.
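
As an illustration of the sliding-window computation just described, the following minimal NumPy sketch computes the convolution value at every pixel of a zero-padded image. The 3*3 kernel and the random input are hypothetical stand-ins, not values from the disclosure; a strided convolution, as used by the encoder later, would additionally shrink the output.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide `kernel` over `image`; at each pixel, align the kernel center
    with that pixel, multiply overlapping values, and sum the products."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))  # zero-pad the borders
    out = np.empty(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + kh, j:j + kw]   # region centered at (i, j)
            out[i, j] = np.sum(window * kernel)   # multiply and accumulate
    return out

# Hypothetical 3x3 kernel applied to a random 8x8 single-channel image.
kernel = np.array([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
feature = conv2d_single(np.random.rand(8, 8), kernel)
```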

102. Fuse the first feature image with the adversarial data to obtain a second feature image.

To attack the second neural network effectively, adversarial images with different labels (classes) are needed to probe it and expose its defects. The first feature image is fused with the adversarial data, and the resulting second feature image has the same class as the adversarial data; by setting different adversarial data, different adversarial images can be obtained. In one possible implementation, feature extraction is performed on the adversarial data to obtain its feature information, which is then fused with the first feature image to obtain a second feature image containing the feature information of the adversarial data.

103. Decode the second feature image to obtain an adversarial image.

After the processing of 102, the second feature image already contains the feature information of the adversarial data; decoding it therefore yields the adversarial image corresponding to the adversarial data. The decoding may be any of the following: deconvolution, bilinear interpolation, or unpooling. In one possible implementation, the second feature image is deconvolved several times to obtain an adversarial image of the same size as the image to be processed.

In the embodiments of the present disclosure, the adversarial data is fused with the image's feature image to obtain a fused feature image, which is then decoded to obtain an adversarial image whose class is the same as that of the adversarial data. The generated adversarial images can attack an already trained neural network to expose its defects, and the trained network can then be trained on the adversarial images to remedy those defects, improving its robustness.

Please refer to FIG. 2, which is a schematic flowchart of a method for extracting image features provided by an embodiment of the present disclosure.

201. Obtain the image to be processed.

An adversarial image is an input image that, after a subtle adjustment, causes a neural network to output a wrong result. In image recognition, this can be understood as a picture originally classified by a neural network into one class (say, "panda") that, after changes so subtle the human eye may not even perceive them, is suddenly misclassified into another class (say, "gibbon"). That is, by subtly adjusting the image to be processed, an adversarial image that attacks the second neural network can be obtained. Optionally, the image to be processed may come from the data set used to train the second neural network.

202. Perform feature extraction on the image to be processed to obtain the first feature image.

The first neural network provided by the embodiments of the present disclosure is used to generate adversarial images, i.e., it processes the adversarial data and the image to be processed to obtain an adversarial image of the same class as the adversarial data. First, the encoding layer of the first neural network performs feature extraction on the image to be processed. The feature extraction can be implemented in a variety of ways, such as convolution or pooling, which the embodiments of the present disclosure do not specifically limit. In some possible implementations, the image encoding layer includes multiple convolutional layers and performs convolution on the image to be processed layer by layer to complete the feature extraction, where each convolutional layer extracts different feature content and semantic information. Concretely, the feature extraction abstracts the image's features step by step while gradually discarding relatively minor features, so features extracted later are smaller in size and more condensed in content and semantic information. Convolving the image stage by stage through the multiple convolutional layers and extracting the corresponding features finally yields a fixed-size feature image; in this way, the main content information of the image to be processed (i.e., its feature image) is obtained while the image size is reduced, lowering the system's computation and increasing its speed. In one possible implementation, the convolution works as follows: the convolutional layer slides a convolution kernel over the image to be processed, multiplies the pixels of the image by the corresponding kernel values, and sums all the products to give the output pixel value at the position of the kernel's center pixel, finally sliding over all pixels of the image to be processed and extracting the first feature image. It should be understood that the present disclosure does not specifically limit the number of convolutional layers; optionally, the number of convolutional layers is 2.

In one possible implementation, a Batch Norm layer follows the convolutional layer; it adds trainable parameters to normalize the data, which speeds up training and removes correlations in the data, highlighting the differences in distribution between feature data. Processing through a ReLU activation layer then adds nonlinearity to the data, mapping the current feature space into another space so the data can be classified better, and it largely alleviates the vanishing-gradient problem of the image segmentation network during learning.
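
A minimal PyTorch sketch of such an encoding layer, assuming the optional choice of 2 convolutional layers, each followed by Batch Norm and ReLU; the channel counts, kernel size, and stride are illustrative assumptions rather than values fixed by the disclosure.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Two Conv-BatchNorm-ReLU stages; each stride-2 convolution halves the
    spatial size while extracting progressively more abstract features."""
    def __init__(self, in_ch=3, mid_ch=32, out_ch=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(mid_ch),  # normalizes activations, speeds up training
            nn.ReLU(inplace=True),   # adds nonlinearity
            nn.Conv2d(mid_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.layers(x)  # the first feature image

first_feature = Encoder()(torch.randn(1, 3, 513, 513))  # e.g. a 513*513 input
```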

In one possible implementation, before the image to be processed is input to the first neural network, it can be preprocessed, and the preprocessed image is then input to the first neural network for feature extraction to obtain the first feature image. In some possible implementations, the preprocessing includes scaling: for example, if the input size of the first neural network is fixed at 513*513, an image to be processed that is larger than 513*513 can be shrunk to 513*513, and an image smaller than 513*513 can be enlarged to 513*513. It should be understood that, depending on actual needs, the image to be processed may also be input to the first neural network without preprocessing; the present disclosure does not specifically limit this.

In the embodiments of the present disclosure, the image encoding layer performs feature extraction on the image to be processed, extracting the feature image of the image while reducing its size.

Please refer to FIG. 3, which is a schematic flowchart of fusing the adversarial data with the first feature image provided by an embodiment of the present disclosure.

301. Preprocess the adversarial data to obtain a third feature image.

As stated in 201, an adversarial image is an image that, after a small adjustment, causes a neural network to output a wrong result; the small adjustment is called adversarial data. Different adversarial data yield different adversarial images, and attacking the second neural network with different adversarial images exposes different defects. In some possible implementations, the recognition result of the second neural network on the image to be processed is 1; if the adversarial data is set to 2 and the adversarial image corresponding to that adversarial data is input to the second neural network, the final recognition result should be 2. That is, the image to be processed needs to be adjusted according to the adversarial data to obtain the corresponding adversarial image, e.g., by adding some artificial noise to the image to "fool" the second neural network so that the class it outputs matches the class of the adversarial data.

As shown in FIG. 4, an embodiment of the present disclosure provides an adversarial data encoding and feature fusion module, which performs feature extraction on the adversarial data to obtain its feature information and then fuses that feature information with the first feature image to obtain an adversarial image of the same class as the adversarial data. In the process of image recognition by a neural network, categorical features are common; for example, gender can be male or female, and a home country can be China, the United States, France, and so on. Clearly, such feature values are not continuous but discrete and unordered. Therefore, before the adversarial data is processed, it needs to be encoded to digitize its features. It should be understood that this encoding is also done by the adversarial data encoding and feature fusion module (not shown in FIG. 4). In some possible implementations, one-hot encoding is applied to the adversarial data to obtain the encoded adversarial data. One-hot encoding uses an N-bit status register to encode N states, each state having its own register bit; after one-hot encoding, the adversarial data is represented as a binary vector, where N is a positive integer.

In the neural network's processing of the image to be processed, the first feature image obtained by feature extraction is a matrix. So that the encoded adversarial data can be fused with the first feature image, as shown in FIG. 4, the adversarial data encoding and feature fusion module first pads the encoded adversarial data so that the third feature image obtained after padding has the same size as the first feature image. In some possible implementations, the first feature image is 4*4 and the encoded adversarial data is 3*1; the third feature image is built from the encoded adversarial data so that it is 4*4, where every element of the third feature image other than the encoded adversarial data takes a preset value, optionally 0. The present disclosure does not specifically limit the position of the encoded adversarial data within the third feature image.
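
A sketch of this preprocessing step in PyTorch, matching the 4*4 / 3*1 example above; placing the one-hot code at the start of the plane and using a single channel are illustrative assumptions, since the disclosure fixes neither the position of the code nor the channel layout.

```python
import torch
import torch.nn.functional as F

def build_third_feature(target_class, num_classes, h, w, fill_value=0.0):
    """One-hot encode the adversarial data, then pad it with a preset value
    until it matches the H x W size of the first feature image."""
    code = F.one_hot(torch.tensor(target_class), num_classes).float()  # N-bit code
    plane = torch.full((h * w,), fill_value)  # preset value, e.g. 0
    plane[:num_classes] = code                # code position is illustrative
    return plane.view(1, 1, h, w)             # a 1-channel H x W feature image

# Adversarial data "class 2" out of 3 classes, padded to a 4*4 feature image.
third_feature = build_third_feature(target_class=2, num_classes=3, h=4, w=4)
```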

302. Fuse the third feature image with the first feature image to obtain a fourth feature image.

As shown in FIG. 4, the adversarial data encoding and feature fusion module fuses the third feature image with the first feature image. In some possible implementations, the third feature image and the first feature image are concatenated along the channel dimension to obtain the fourth feature image.

303. Convolve the fourth feature image to obtain the second feature image.

As shown in FIG. 4, after the first feature image and the third feature image are fused into the fourth feature image, the fourth feature image is further convolved to adjust its numbers of rows, columns, and dimensions and to obtain a second feature image of a predetermined size. In some possible examples, the fourth feature image obtained by the channel-dimension concatenation in 302 has size (K+C)*H*W, where K+C is the channel dimension of the fourth feature image; convolution reduces the channel dimension of the fourth feature image to C, giving a second feature image of size C*H*W.
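
Putting steps 302 and 303 together, a minimal sketch of the concatenation-based fusion; the 1*1 kernel is an assumption, since the disclosure only requires a convolution that reduces the (K+C) channels back to C.

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Concatenate along the channel dimension, then convolve the
    (K+C) x H x W result down to a C x H x W second feature image."""
    def __init__(self, c_feat, k_code):
        super().__init__()
        self.reduce = nn.Conv2d(c_feat + k_code, c_feat, kernel_size=1)

    def forward(self, first_feature, third_feature):
        fourth = torch.cat([first_feature, third_feature], dim=1)  # step 302
        return self.reduce(fourth)                                 # step 303

second_feature = ConcatFusion(c_feat=64, k_code=1)(
    torch.randn(1, 64, 4, 4), torch.randn(1, 1, 4, 4))
```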

In the embodiments of the present disclosure, the adversarial data is encoded and padded to obtain third data of the same size as the first feature image; fusing the first feature image with the third feature image merges the feature information of the adversarial data and of the image to be processed into the fourth feature image; finally, convolving the fourth feature image gives a second feature image of a specific size.

Please refer to FIG. 5, which is another schematic flowchart of fusing the adversarial data with the first feature image provided by an embodiment of the present disclosure.

501. Perform feature extraction on the adversarial data to obtain the weight matrix of the first feature image.

As shown in FIG. 6, the adversarial data encoding and feature fusion module provided by this embodiment first applies a linear transformation to the adversarial data through a multilayer perceptron. A multilayer perceptron has an input layer and an output layer, with possibly multiple hidden layers in between; the present disclosure does not specifically limit the number of hidden layers. Each layer of the multilayer perceptron is fully connected to the previous layer (i.e., every neuron of one layer is connected to all neurons of the next), and each neuron in a hidden layer has a corresponding weight and bias. The parameters of the multilayer perceptron are therefore the weights and biases of its neurons, whose values are obtained by training the multilayer perceptron; optionally, the training method may be gradient descent or back-propagation.

When the adversarial data is fed to the multilayer perceptron, the weights and biases of the multilayer perceptron (i.e., the weights of the adversarial data) are obtained, and the adversarial data is then combined in a weighted sum according to these weights and biases to obtain the linearly transformed adversarial data. In one possible implementation, the weights and biases of the multilayer perceptron are $w_i$ and $b_i$ respectively, where $i$ indexes the neurons, and the adversarial data is $x$; each neuron of the multilayer perceptron then outputs the affine combination $w_i x + b_i$, and these outputs together form the linearly transformed adversarial data.

If no further processing were applied to the linearly transformed adversarial data, the output signal would be merely a simple linear function. Linear functions have limited complexity, little capacity to learn complex mappings from data, and cannot learn and process complex data types such as images, video, audio, and speech. Therefore, a nonlinear transformation of the linearly transformed adversarial data is needed to solve complex problems such as image processing and video processing. A nonlinear activation function is connected after the multilayer perceptron; applying it to the linearly transformed adversarial data allows complex mappings to be handled. In some possible implementations, the linearly transformed adversarial data is substituted into the sigmoid function, realizing the nonlinear transformation and yielding the weight matrix of the first feature image, which is used to re-weight the first feature image along the channel dimension.

In one possible implementation, the multilayer perceptron's processing of the adversarial data is given by:

$t' = F_{en}(W, t) = \sigma(W_2\,\delta(W_1 t))$

where $\sigma$ is the sigmoid function, $\delta$ is the ReLU activation function, $W_1 \in \mathbb{R}^{U \times K}$ and $W_2 \in \mathbb{R}^{U \times C}$ are fully connected layers, $U$ is the number of neurons in the fully connected layers, $t' \in \mathbb{R}^{C}$, and $C$ is the channel dimension of the first feature image.

502. Multiply the first feature image element-wise by the weight matrix along the channel dimension to obtain the second feature image.

Because the convolution kernel convolves the image to be processed through a local receptive field, the data within each channel of the resulting first feature image cannot exploit texture information outside its own channel. Finding the correlations between the channels of the first feature image, assigning each channel a different weight, and then performing a weighted combination solves this problem well and increases the useful information in the feature image. Therefore, the first feature image is multiplied element-wise by the weight matrix along the channel dimension to obtain the second feature image.
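
Steps 501 and 502 thus amount to a squeeze-and-excitation-style channel reweighting driven by the adversarial data. The sketch below follows the formula $t' = \sigma(W_2\,\delta(W_1 t))$ above; the hidden width U and the channel count are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Map the K-dim adversarial code through two fully connected layers
    (ReLU, then sigmoid) to a C-dim weight vector, and rescale each channel
    of the first feature image by its weight."""
    def __init__(self, k_code, c_feat, u_hidden=16):
        super().__init__()
        self.w1 = nn.Linear(k_code, u_hidden)  # W1: the linear transformation
        self.w2 = nn.Linear(u_hidden, c_feat)  # W2
        self.relu, self.sigmoid = nn.ReLU(), nn.Sigmoid()

    def forward(self, first_feature, code):
        t = self.sigmoid(self.w2(self.relu(self.w1(code))))  # t' = σ(W2 δ(W1 t))
        return first_feature * t.view(1, -1, 1, 1)           # channel-wise product

second_feature = AttentionFusion(k_code=3, c_feat=64)(
    torch.randn(1, 64, 8, 8), torch.randn(1, 3))
```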

In the embodiments of the present disclosure, feature extraction is performed on the adversarial data to obtain the weight matrix of the first feature image, and the first feature image is then multiplied by the weight matrix along the channel dimension to obtain the second feature image.

It should be understood that 301 to 303 and 501 to 502 describe two adversarial data encoding and feature fusion modules provided by the present disclosure; either module can be connected to the other modules of the first neural network (e.g., the encoding layer). Which module is used in a given application is not specifically limited by the present disclosure.

Please refer to FIG. 7, which is a schematic flowchart of a method for decoding a feature image provided by an embodiment of the present disclosure.

701. Convolve the second feature image to obtain a fifth feature image.

The second feature image obtained through the above processing fuses the feature information of the adversarial data with that of the image to be processed; at this point, decoding the second feature image yields the adversarial image corresponding to the adversarial data.

The deeper a neural network is, the harder it is to train and the harder it is to optimize; if suitable weights cannot be learned well during training, a deep network may even perform worse than a relatively shallow one. Adding residual blocks to the first neural network addresses these training and optimization difficulties and improves the efficiency of the first neural network. Therefore, in the first neural network, a residual block is connected after the adversarial data encoding and feature fusion module; the residual block may consist of multiple convolutional layers or multiple fully connected layers, which the present disclosure does not specifically limit. In one possible implementation, the residual block contains 6 convolutional layers connected in series, i.e., the output of one convolutional layer is the input of the next. The first two convolutional layers convolve the second feature image in turn to obtain a seventh feature image; the seventh feature image is fused with the second feature image to obtain an eighth feature image; convolution through the third and fourth convolutional layers in turn gives a ninth feature image; the ninth feature image is fused with the eighth feature image to obtain a tenth feature image; and convolution through the fifth and sixth convolutional layers in turn gives the fifth feature image. Optionally, the fusion may be feature addition.
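
A sketch of this six-convolution residual block with the two skip additions described above; the channel count and kernel size are illustrative, and the ReLU activations between convolutions are an assumption the disclosure does not spell out.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Six serial convolutions; the running features are added back to the
    block input after the 2nd convolution and again after the 4th."""
    def __init__(self, ch=64):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv2d(ch, ch, kernel_size=3, padding=1) for _ in range(6)])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, second):
        seventh = self.relu(self.convs[1](self.relu(self.convs[0](second))))
        eighth = seventh + second   # fusion by feature addition
        ninth = self.relu(self.convs[3](self.relu(self.convs[2](eighth))))
        tenth = ninth + eighth      # second fusion
        return self.convs[5](self.relu(self.convs[4](tenth)))  # fifth feature

fifth_feature = ResidualBlock()(torch.randn(1, 64, 8, 8))
```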

702. Fuse the second feature image with the fifth feature image to obtain a sixth feature image.

As with the fusion in 701, the second feature image and the fifth feature image are added to obtain the sixth feature image.

703. Deconvolve the sixth feature image to obtain the adversarial image.

The forward pass of a convolutional layer corresponds to the backward pass of a deconvolutional layer, and the backward pass of a convolutional layer corresponds to the forward pass of a deconvolutional layer; therefore, deconvolving the sixth feature image decodes it and yields the adversarial image. It should be noted that the number of deconvolutional layers matches the number of convolutional layers in 202.
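
A sketch of steps 702 and 703, assuming two transposed convolutions to mirror the two convolutional layers of the earlier encoder sketch; the channel counts and kernel parameters are illustrative.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Add the fifth feature image back to the second (step 702), then apply
    as many transposed convolutions as the encoder had convolutions (703)."""
    def __init__(self, in_ch=64, mid_ch=32, out_ch=3):
        super().__init__()
        self.deconvs = nn.Sequential(
            nn.ConvTranspose2d(in_ch, mid_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(mid_ch, out_ch, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, second, fifth):
        sixth = second + fifth      # step 702: fusion by feature addition
        return self.deconvs(sixth)  # step 703: the adversarial image

adversarial = Decoder()(torch.randn(1, 64, 128, 128), torch.randn(1, 64, 128, 128))
```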

In the embodiments of the present disclosure, the second feature image is decoded through the above processing, yielding the adversarial image corresponding to the adversarial data.

The following embodiment is a method for training the first neural network provided by an embodiment of the present disclosure.

在應用第一神經網路生成對抗圖像之前,需要通過反向傳播梯度的方法對第一神經網路進行訓練,以優化第一神經網路中的權重。在一種可能實現的方式中,將第一神經網路生成的對抗圖像輸入至第二神經網路,第二神經網路將對對抗圖像進行判別,並給出相應的類別。根據第二神 經網路給出的類別與對抗資料的類別之間的誤差,以及對抗圖像與待處理圖像之間的歐氏距離,調整第一神經網路中的權重,不斷反覆運算上述過程,直至收斂,完成對第一神經網路的訓練,其中,上述權重參數包括:神經網路中的卷積核的數量、卷積核的大小、神經元的權重大小、神經元的偏置等。可選地,上述訓練過程所用的損失函數可參見下式: Before applying the first neural network to generate the confrontation image, the first neural network needs to be trained by the back propagation gradient method to optimize the weights in the first neural network. In a possible implementation manner, the confrontation image generated by the first neural network is input to the second neural network, and the second neural network will distinguish the confrontation image and give a corresponding category. According to the second god The error between the category given by the network and the category of the confrontation data, and the Euclidean distance between the confrontation image and the image to be processed, adjust the weights in the first neural network, and continue to repeat the above process until Convergence and complete the training of the first neural network, where the above weight parameters include: the number of convolution kernels in the neural network, the size of the convolution kernel, the weight of the neuron, the bias of the neuron, etc. Optionally, the loss function used in the above training process may refer to the following formula:

L(x) = L_cls(H(F_θ(x,t)), t) + α·L_re(x, F_θ(x,t))

其中，F_θ(x,t)為對抗圖像，t為所述對抗資料，L_cls為交叉熵損失函數，α為自然數，L_re = ∥x − F_θ(x,t)∥_p，p ∈ {0,1,2,∞}，H(F_θ(x,t))為將對抗圖像輸入至第二神經網路（即被攻擊的神經網路）得到的類別。重構損失函數L_re = ∥x − F_θ(x,t)∥_p用以測量對抗圖像與待處理圖像之間的差異，可選地，將p取為2，即為對抗圖像與待處理圖像之間的歐氏距離。通過給重構損失函數設定一個約束ε，使對抗圖像與待處理圖像之間的歐氏距離較小，即L_re = ∥x − F_θ(x,t)∥_p < ε。通過選取合適的權重α，可提高第一神經網路的訓練速度。 Where F_θ(x,t) is the confrontation image, t is the confrontation data, L_cls is the cross-entropy loss function, α is a natural number, L_re = ∥x − F_θ(x,t)∥_p with p ∈ {0,1,2,∞}, and H(F_θ(x,t)) is the category obtained by inputting the confrontation image into the second neural network (i.e., the attacked neural network). The reconstruction loss function L_re = ∥x − F_θ(x,t)∥_p measures the difference between the confrontation image and the image to be processed; optionally, p is taken as 2, giving the Euclidean distance between the confrontation image and the image to be processed. By setting a constraint ε on the reconstruction loss function, the Euclidean distance between the confrontation image and the image to be processed is kept small, that is, L_re = ∥x − F_θ(x,t)∥_p < ε. By selecting an appropriate weight α, the training speed of the first neural network can be increased.
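As a non-authoritative sketch of how this loss might be computed in PyTorch, where attacked_model stands in for the second neural network H, and the example values alpha=10.0 and p=2 are assumptions rather than values fixed by the disclosure:

import torch.nn.functional as F

def generator_loss(x, x_adv, t, attacked_model, alpha=10.0, p=2):
    # L(x) = L_cls(H(F_theta(x, t)), t) + alpha * L_re(x, F_theta(x, t))
    logits = attacked_model(x_adv)      # H(F_theta(x, t)): prediction of the attacked network
    l_cls = F.cross_entropy(logits, t)  # classification term pushes x_adv toward class t
    # reconstruction term; p = 2 gives the Euclidean distance to the original image
    l_re = (x - x_adv).flatten(1).norm(p=p, dim=1).mean()
    return l_cls + alpha * l_re

Back-propagating this loss through the first neural network, iteration after iteration, corresponds to the training procedure described above.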

上述詳細闡述了本公開實施例的方法,下面提供了本公開實施例的裝置。 The foregoing describes the method of the embodiment of the present disclosure in detail, and the device of the embodiment of the present disclosure is provided below.

請參閱圖8,圖8為本公開實施例提供的一種圖像處理裝置的結構示意圖,該裝置1包括:第一處理單元11、融合處理單元12、第二處理單元13及訓練單元14,其中: Please refer to FIG. 8. FIG. 8 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure. The device 1 includes: a first processing unit 11, a fusion processing unit 12, a second processing unit 13 and a training unit 14, wherein :

第一處理單元11,配置為對待處理圖像進行特徵提取處理,得到第一特徵圖像; The first processing unit 11 is configured to perform feature extraction processing on the image to be processed to obtain a first feature image;

融合處理單元12,配置為對所述第一特徵圖像與對抗資料進行融合處理,得到第二特徵圖像; The fusion processing unit 12 is configured to perform fusion processing on the first feature image and the confrontation data to obtain a second feature image;

第二處理單元13,配置為對所述第二特徵圖像進行解碼處理,得到對抗圖像,其中,所述對抗圖像的類別與所述對抗資料的類別相同; The second processing unit 13 is configured to decode the second characteristic image to obtain a confrontation image, wherein the type of the confrontation image is the same as the type of the confrontation data;

訓練單元14，配置為基於損失函數進行反向傳播訓練多目標對抗生成網路。 The training unit 14 is configured to train the multi-target confrontation generation network through back-propagation based on a loss function.

在一種可能實現的方式中，所述融合處理單元12包括：預處理子單元121，配置為對所述對抗資料進行預處理，得到第三特徵圖像；第一融合處理子單元122，配置為對所述第三特徵圖像與所述第一特徵圖像進行融合處理，得到第四特徵圖像；第一處理子單元123，配置為對所述第四特徵圖像進行卷積處理，得到所述第二特徵圖像。 In a possible implementation manner, the fusion processing unit 12 includes: a preprocessing subunit 121, configured to preprocess the confrontation data to obtain a third feature image; a first fusion processing subunit 122, configured to perform fusion processing on the third feature image and the first feature image to obtain a fourth feature image; and a first processing subunit 123, configured to perform convolution processing on the fourth feature image to obtain the second feature image.

在另一種可能實現的方式中，所述預處理子單元121，還配置為對所述對抗資料進行編碼處理，得到編碼處理後的對抗資料；以及對所述編碼處理後的對抗資料填充預設值，使填充後得到的第三特徵圖像的尺寸與所述第一特徵圖像的尺寸相同。 In another possible implementation manner, the preprocessing subunit 121 is further configured to perform encoding processing on the confrontation data to obtain encoded confrontation data, and to fill the encoded confrontation data with a preset value so that the size of the third feature image obtained after filling is the same as the size of the first feature image.

在又一種可能實現的方式中,所述預處理子單元121,還配置為對所述對抗資料進行獨熱編碼處理,得到所述編碼處理後的對抗資料。 In another possible implementation manner, the preprocessing subunit 121 is further configured to perform one-hot encoding processing on the confrontation data to obtain the encoded confrontation data.

在又一種可能實現的方式中，所述第一融合處理子單元122，還配置為將所述第三特徵圖像與所述第一特徵圖像在通道維度上進行拼接處理，得到所述第四特徵圖像。 In another possible implementation manner, the first fusion processing subunit 122 is further configured to splice the third feature image and the first feature image in the channel dimension to obtain the fourth feature image.
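A hedged sketch of this concatenation-style fusion path (subunits 121 to 123) follows; broadcasting the one-hot code across the spatial dimensions is one possible reading of "filling with a preset value", and the conv argument is a hypothetical convolution that maps the concatenated channels back to the feature width:

import torch
import torch.nn.functional as F

def concat_fusion(feat1, t, num_classes, conv):
    # feat1: first feature image (N, C, H, W); t: confrontation data as class indices (N,)
    n, _, h, w = feat1.shape
    onehot = F.one_hot(t, num_classes).float()                      # encoded confrontation data
    feat3 = onehot.view(n, num_classes, 1, 1).expand(-1, -1, h, w)  # third feature image, same H x W as feat1
    feat4 = torch.cat([feat1, feat3], dim=1)                        # fourth feature image: channel-dimension splice
    return conv(feat4)                                              # convolution yields the second feature image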

在又一種可能實現的方式中，所述融合處理單元12還包括：特徵提取子單元124，配置為對所述對抗資料進行特徵提取處理，得到所述第一特徵圖像的權重矩陣；第二處理子單元125，配置為將所述第一特徵圖像與所述權重矩陣在通道維度上進行點乘，得到所述第二特徵圖像。 In another possible implementation manner, the fusion processing unit 12 further includes: a feature extraction subunit 124, configured to perform feature extraction processing on the confrontation data to obtain the weight matrix of the first feature image; and a second processing subunit 125, configured to perform dot multiplication on the first feature image and the weight matrix in the channel dimension to obtain the second feature image.

在又一種可能實現的方式中，所述特徵提取子單元124，還配置為對所述對抗資料進行線性變換，得到線性變換後的對抗資料；以及對所述線性變換後的對抗資料進行非線性變換，得到所述第一特徵圖像的權重矩陣。 In yet another possible implementation manner, the feature extraction subunit 124 is further configured to perform linear transformation on the confrontation data to obtain linearly transformed confrontation data, and to perform nonlinear transformation on the linearly transformed confrontation data to obtain the weight matrix of the first feature image.

在又一種可能實現的方式中，所述特徵提取子單元124，還配置為獲取所述對抗資料的權重；以及根據所述權重對所述對抗資料進行加權求和，得到所述線性變換後的對抗資料。 In another possible implementation manner, the feature extraction subunit 124 is further configured to obtain the weight of the confrontation data, and to perform weighted summation on the confrontation data according to the weight to obtain the linearly transformed confrontation data.

在又一種可能實現的方式中，所述特徵提取子單元124，還配置為將所述線性變換後的對抗資料代入啟動函數，得到所述第一特徵圖像的權重矩陣。 In another possible implementation manner, the feature extraction subunit 124 is further configured to substitute the linearly transformed confrontation data into an activation function to obtain the weight matrix of the first feature image.
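Correspondingly, a minimal sketch of the weight-matrix path (subunits 124 and 125) might read as follows, assuming a sigmoid as the activation function (the disclosure only requires some activation) and a single linear layer as the weighted summation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    def __init__(self, num_classes: int, channels: int):
        super().__init__()
        self.num_classes = num_classes
        self.fc = nn.Linear(num_classes, channels)  # linear transform: weighted summation of the code

    def forward(self, feat1: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        onehot = F.one_hot(t, self.num_classes).float()
        weights = torch.sigmoid(self.fc(onehot))    # nonlinear transform gives the weight matrix
        # channel-wise multiplication with the first feature image
        return feat1 * weights.view(weights.size(0), -1, 1, 1)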

在又一種可能實現的方式中,所述第一處理單元11包括:第三處理子單元111,配置為對所述待處理圖像進行卷積處理,得到所述第一特徵圖像。 In another possible implementation manner, the first processing unit 11 includes: a third processing subunit 111 configured to perform convolution processing on the image to be processed to obtain the first characteristic image.

在又一種可能實現的方式中，所述第二處理單元13包括：第四處理子單元131，配置為對所述第二特徵圖像進行卷積處理，得到第五特徵圖像；第二融合子單元132，配置為將所述第二特徵圖像與所述第五特徵圖像融合處理，得到第六特徵圖像；第五處理子單元133，配置為對所述第六特徵圖像進行反卷積處理，得到所述對抗圖像。 In another possible implementation manner, the second processing unit 13 includes: a fourth processing subunit 131, configured to perform convolution processing on the second feature image to obtain a fifth feature image; a second fusion subunit 132, configured to fuse the second feature image with the fifth feature image to obtain a sixth feature image; and a fifth processing subunit 133, configured to perform deconvolution processing on the sixth feature image to obtain the confrontation image.

在又一種可能實現的方式中，所述裝置1還包括：所述多目標對抗生成網路，配置為對所述待處理圖像進行特徵提取處理，得到所述第一特徵圖像；以及對所述第一特徵圖像與對抗資料進行融合處理，得到第二特徵圖像；以及對所述第二特徵圖像進行解碼處理，得到所述對抗圖像。 In another possible implementation manner, the device 1 further includes the multi-target confrontation generation network, configured to perform feature extraction processing on the image to be processed to obtain the first feature image, perform fusion processing on the first feature image and the confrontation data to obtain the second feature image, and perform decoding processing on the second feature image to obtain the confrontation image.

在又一種可能實現的方式中,所述損失函數為: In another possible implementation manner, the loss function is:

L(x) = L_cls(H(F_θ(x,t)), t) + α·L_re(x, F_θ(x,t));

其中，F_θ(x,t)為所述對抗圖像，t為所述對抗資料，L_cls為交叉熵損失函數，H(F_θ(x,t))為將所述對抗圖像輸入至被攻擊的神經網路得到的類別，α為自然數，L_re = ∥x − F_θ(x,t)∥_p，p ∈ {0,1,2,∞}。 Where F_θ(x,t) is the confrontation image, t is the confrontation data, L_cls is the cross-entropy loss function, H(F_θ(x,t)) is the category obtained by inputting the confrontation image into the attacked neural network, α is a natural number, and L_re = ∥x − F_θ(x,t)∥_p with p ∈ {0,1,2,∞}.

在一些實施例中，本公開實施例提供的裝置具有的功能或包含的單元可以用於執行上文方法實施例描述的方法，其具體實現可以參照上文方法實施例的描述，為了簡潔，這裡不再贅述。 In some embodiments, the functions or units included in the apparatus provided in the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments. For specific implementation, refer to the description of the above method embodiments; for brevity, details are not repeated here.

圖9為本公開實施例提供的一種圖像處理裝置的硬體結構示意圖。該處理裝置2包括處理器21，還可以包括輸入裝置22、輸出裝置23和記憶體24。該輸入裝置22、輸出裝置23、記憶體24和處理器21之間通過匯流排相互連接。 FIG. 9 is a schematic diagram of the hardware structure of an image processing device provided by an embodiment of the present disclosure. The processing device 2 includes a processor 21, and may further include an input device 22, an output device 23, and a memory 24. The input device 22, the output device 23, the memory 24, and the processor 21 are connected to one another through a bus.

記憶體包括但不限於是隨機存取記憶體（random access memory，RAM）、唯讀記憶體（read-only memory，ROM）、可擦除可程式設計唯讀記憶體（erasable programmable read only memory，EPROM）、或可擕式唯讀記憶體（compact disc read-only memory，CD-ROM），該記憶體用於相關指令及資料。 The memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM); the memory is used for related instructions and data.

輸入裝置用於輸入資料和/或信號,以及輸出裝置用於輸出資料和/或信號。輸出裝置和輸入裝置可以是獨立的器件,也可以是一個整體的器件。 The input device is used to input data and/or signals, and the output device is used to output data and/or signals. The output device and the input device can be independent devices or a whole device.

處理器可以包括一個或多個處理器，例如包括一個或多個中央處理器（central processing unit，CPU），在處理器是一個CPU的情況下，該CPU可以是單核CPU，也可以是多核CPU。 The processor may include one or more processors, for example, one or more central processing units (CPU). Where the processor is a CPU, the CPU may be a single-core CPU or a multi-core CPU.

記憶體用於儲存網路設備的程式碼和資料。 The memory is used to store the code and data of the network equipment.

處理器用於調用該記憶體中的程式碼和資料,執行上述方法實施例中的步驟。具體可參見方法實施例中的描述,在此不再贅述。 The processor is used to call the program code and data in the memory to execute the steps in the above method embodiment. For details, please refer to the description in the method embodiment, which will not be repeated here.

可以理解的是，圖9僅僅示出了一種圖像處理裝置的簡化設計。在實際應用中，圖像處理裝置還可以分別包含必要的其他元件，包含但不限於任意數量的輸入/輸出裝置、處理器、控制器、記憶體等，而所有可以實現本公開實施例的圖像處理裝置都在本公開的保護範圍之內。 It can be understood that FIG. 9 only shows a simplified design of an image processing device. In practical applications, the image processing device may also contain other necessary components, including but not limited to any number of input/output devices, processors, controllers, memories, and the like, and all image processing devices that can implement the embodiments of the present disclosure fall within the protection scope of the present disclosure.

本公開提供了一種電腦程式產品，所述電腦程式產品被處理器執行時，能夠執行前述任意技術方案提供的圖像處理方法，例如，圖1至圖3及圖7所示的方法。 The present disclosure provides a computer program product which, when executed by a processor, is capable of performing the image processing method provided by any of the foregoing technical solutions, for example, the methods shown in FIG. 1 to FIG. 3 and FIG. 7.

本領域普通技術人員可以意識到,結合本文中所公開的實施例描述的各示例的單元及演算法步驟,能夠以電子硬體、或者電腦軟體和電子硬體的結合來實現。這些功能究竟以硬體還是軟體方式來執行,取決於技術方案的特定應用和設計約束條件。專業技術人員可以對每個特定的應用來使用不同方法來實現所描述的功能,但是這種實現不應認為超出本公開的範圍。 A person of ordinary skill in the art can be aware that the units and algorithm steps described in the examples in combination with the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of the present disclosure.

所屬領域的技術人員可以清楚地瞭解到,為描述的方便和簡潔,上述描述的系統、裝置和單元的具體工作過程,可以參考前述方法實施例中的對應過程,在此不再贅述。 Those skilled in the art can clearly understand that, for the convenience and conciseness of description, the specific working process of the above-described system, device, and unit can refer to the corresponding process in the foregoing method embodiment, which is not repeated here.

在本公開所提供的幾個實施例中，應該理解到，所揭露的系統、裝置和方法，可以通過其它的方式實現。例如，以上所描述的裝置實施例僅僅是示意性的，例如，所述單元的劃分，僅僅為一種邏輯功能劃分，實際實現時可以有另外的劃分方式，例如多個單元或元件可以結合或者可以集成到另一個系統，或一些特徵可以忽略，或不執行。另一點，所顯示或討論的相互之間的耦合或直接耦合或通信連接可以是通過一些介面，裝置或單元的間接耦合或通信連接，可以是電性，機械或其它的形式。 In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division of the units is only a logical function division, and there may be other division manners in actual implementation. For example, multiple units or elements may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be indirect couplings or communication connections through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.

所述作為分離部件說明的單元可以是或者也可以不是物理上分開的，作為單元顯示的部件可以是或者也可以不是物理單元，即可以位於一個地方，或者也可以分佈到多個網路單元上。可以根據實際的需要選擇其中的部分或者全部單元來實現本實施例方案的目的。 The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place, or they may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

另外,在本公開各個實施例中的各功能單元可以集成在一個處理單元中,也可以是各個單元單獨物理存在,也可以兩個或兩個以上單元集成在一個單元中。 In addition, the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.

在上述實施例中，可以全部或部分地通過軟體、硬體、固件或者其任意組合來實現。當使用軟體實現時，可以全部或部分地以電腦程式產品的形式實現。所述電腦程式產品包括一個或多個電腦指令。在電腦上載入和執行所述電腦程式指令時，全部或部分地產生按照本公開實施例所述的流程或功能。所述電腦可以是通用電腦、專用電腦、電腦網路、或者其他可程式設計裝置。所述電腦指令可以儲存在電腦可讀儲存介質中，或者通過所述電腦可讀儲存介質進行傳輸。所述電腦指令可以從一個網站、電腦、伺服器或資料中心通過有線（例如同軸電纜、光纖、數位用戶線路（digital subscriber line，DSL））或無線（例如紅外、無線、微波等）方式向另一個網站、電腦、伺服器或資料中心進行傳輸。所述電腦可讀儲存介質可以是電腦能夠存取的任何可用介質或者是包含一個或多個可用介質集成的伺服器、資料中心等資料儲存裝置。所述可用介質可以是磁性介質（例如，軟碟、硬碟、磁帶）、光介質（例如，數位通用光碟（digital versatile disc，DVD））、或者半導體介質（例如固態硬碟（solid state disk，SSD））等。 In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present disclosure are generated in whole or in part. The computer may be a general-purpose computer, a dedicated computer, a computer network, or other programmable devices. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).

本領域普通技術人員可以理解實現上述實施例方法中的全部或部分流程，該流程可以由電腦程式來指令相關的硬體完成，該程式可儲存於電腦可讀取儲存介質中，該程式在執行時，可包括如上述各方法實施例的流程。而前述的儲存介質包括：唯讀記憶體（read-only memory，ROM）或隨機存取記憶體（random access memory，RAM）、磁碟或者光碟等各種可儲存程式碼的介質。 A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, may include the processes of the foregoing method embodiments. The aforementioned storage medium includes various media that can store program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks, or optical discs.

圖1代表圖為流程圖，無元件符號簡單說明 The representative drawing, FIG. 1, is a flowchart; there are no reference numerals requiring a brief description.

Claims (15)

1. 一種圖像處理方法，包括： An image processing method, including:

對待處理圖像進行特徵提取處理，得到第一特徵圖像； performing feature extraction processing on the image to be processed to obtain a first feature image;

對所述第一特徵圖像與對抗資料進行融合處理，得到第二特徵圖像； performing fusion processing on the first feature image and the confrontation data to obtain a second feature image;

對所述第二特徵圖像進行解碼處理，得到對抗圖像，其中，所述對抗圖像的類別與所述對抗資料的類別相同。 performing decoding processing on the second feature image to obtain a confrontation image, wherein the category of the confrontation image is the same as the category of the confrontation data.

2. 根據請求項1所述的方法，其中，所述對所述第一特徵圖像與對抗資料進行融合處理，得到第二特徵圖像，包括： The method according to claim 1, wherein the performing fusion processing on the first feature image and the confrontation data to obtain the second feature image includes:

對所述對抗資料進行預處理，得到第三特徵圖像； preprocessing the confrontation data to obtain a third feature image;

對所述第三特徵圖像與所述第一特徵圖像進行融合處理，得到第四特徵圖像； performing fusion processing on the third feature image and the first feature image to obtain a fourth feature image;

對所述第四特徵圖像進行卷積處理，得到所述第二特徵圖像。 performing convolution processing on the fourth feature image to obtain the second feature image.

3. 根據請求項2所述的方法，其中，所述對所述對抗資料進行預處理，得到第三特徵圖像，包括： The method according to claim 2, wherein the preprocessing the confrontation data to obtain the third feature image includes:

對所述對抗資料進行編碼處理，得到編碼處理後的對抗資料； encoding the confrontation data to obtain encoded confrontation data;

對所述編碼處理後的對抗資料填充預設值，使填充後得到的第三特徵圖像的尺寸與所述第一特徵圖像的尺寸相同。 filling the encoded confrontation data with a preset value, so that the size of the third feature image obtained after filling is the same as the size of the first feature image.

4. 根據請求項3所述的方法，其中，所述對所述對抗資料進行編碼處理，得到編碼處理後的對抗資料，包括： The method according to claim 3, wherein the encoding the confrontation data to obtain the encoded confrontation data includes:

對所述對抗資料進行獨熱編碼處理，得到所述編碼處理後的對抗資料。 performing one-hot encoding processing on the confrontation data to obtain the encoded confrontation data.

5. 根據請求項2所述的方法，其中，所述對所述第三特徵圖像與所述第一特徵圖像進行融合處理，得到第四特徵圖像，包括： The method according to claim 2, wherein the performing fusion processing on the third feature image and the first feature image to obtain the fourth feature image includes:

將所述第三特徵圖像與所述第一特徵圖像在通道維度上進行拼接處理，得到所述第四特徵圖像。 splicing the third feature image and the first feature image in the channel dimension to obtain the fourth feature image.

6. 根據請求項1所述的方法，其中，所述將所述第一特徵圖像與對抗資料進行融合處理，得到第二特徵圖像，還包括： The method according to claim 1, wherein the fusing the first feature image with the confrontation data to obtain the second feature image further includes:

對所述對抗資料進行特徵提取處理，得到所述第一特徵圖像的權重矩陣； performing feature extraction processing on the confrontation data to obtain a weight matrix of the first feature image;

將所述第一特徵圖像與所述權重矩陣在通道維度上進行點乘，得到所述第二特徵圖像。 performing dot multiplication on the first feature image and the weight matrix in the channel dimension to obtain the second feature image.
7. 根據請求項6所述的方法，其中，所述對所述對抗資料進行特徵提取處理，得到所述第一特徵圖像的權重矩陣，包括： The method according to claim 6, wherein the performing feature extraction processing on the confrontation data to obtain the weight matrix of the first feature image includes:

對所述對抗資料進行線性變換，得到線性變換後的對抗資料； performing linear transformation on the confrontation data to obtain linearly transformed confrontation data;

對所述線性變換後的對抗資料進行非線性變換，得到所述第一特徵圖像的權重矩陣。 performing nonlinear transformation on the linearly transformed confrontation data to obtain the weight matrix of the first feature image.

8. 根據請求項7所述的方法，其中，所述對所述對抗資料進行線性變換，得到線性變換後的對抗資料，包括： The method according to claim 7, wherein the linearly transforming the confrontation data to obtain the linearly transformed confrontation data includes:

獲取所述對抗資料的權重； obtaining the weight of the confrontation data;

根據所述權重對所述對抗資料進行加權求和，得到所述線性變換後的對抗資料。 performing a weighted summation on the confrontation data according to the weight to obtain the linearly transformed confrontation data.

9. 根據請求項7所述的方法，其中，所述對所述線性變換後的對抗資料進行非線性變換，得到所述第一特徵圖像的權重矩陣，包括： The method according to claim 7, wherein the nonlinearly transforming the linearly transformed confrontation data to obtain the weight matrix of the first feature image includes:

將所述線性變換後的對抗資料代入啟動函數，得到所述第一特徵圖像的權重矩陣。 substituting the linearly transformed confrontation data into an activation function to obtain the weight matrix of the first feature image.

10. 根據請求項1所述的方法，其中，所述對待處理圖像進行特徵提取處理，得到第一特徵圖像，包括： The method according to claim 1, wherein the performing feature extraction processing on the image to be processed to obtain the first feature image includes:

對所述待處理圖像進行卷積處理，得到所述第一特徵圖像。 performing convolution processing on the image to be processed to obtain the first feature image.

11. 根據請求項1所述的方法，其中，所述對所述第二特徵圖像進行解碼處理，得到對抗圖像，包括： The method according to claim 1, wherein the decoding the second feature image to obtain the confrontation image includes:

對所述第二特徵圖像進行卷積處理，得到第五特徵圖像； performing convolution processing on the second feature image to obtain a fifth feature image;

將所述第二特徵圖像與所述第五特徵圖像融合處理，得到第六特徵圖像； fusing the second feature image with the fifth feature image to obtain a sixth feature image;

對所述第六特徵圖像進行反卷積處理，得到所述對抗圖像。 performing deconvolution processing on the sixth feature image to obtain the confrontation image.

12. 根據請求項1至11任意一項所述的方法，其中，基於多目標對抗生成網路對所述待處理圖像進行特徵提取處理，得到所述第一特徵圖像； The method according to any one of claims 1 to 11, wherein feature extraction processing is performed on the image to be processed based on a multi-target confrontation generation network to obtain the first feature image;

對所述第一特徵圖像與對抗資料進行融合處理，得到第二特徵圖像；以及 fusion processing is performed on the first feature image and the confrontation data to obtain the second feature image; and

對所述第二特徵圖像進行解碼處理，得到所述對抗圖像。 decoding processing is performed on the second feature image to obtain the confrontation image.

13. 根據請求項12所述的方法，其中，所述多目標對抗生成網路基於損失函數進行反向傳播訓練得到，所述損失函數為： The method according to claim 12, wherein the multi-target confrontation generation network is obtained by back-propagation training based on a loss function, and the loss function is:

L(x) = L_cls(H(F_θ(x,t)), t) + α·L_re(x, F_θ(x,t));

其中，F_θ(x,t)為所述對抗圖像，t為所述對抗資料，L_cls為交叉熵損失函數，H(F_θ(x,t))為將所述對抗圖像輸入至被攻擊的神經網路得到的類別，α為自然數，L_re = ∥x − F_θ(x,t)∥_p，p ∈ {0,1,2,∞}。 Where F_θ(x,t) is the confrontation image, t is the confrontation data, L_cls is the cross-entropy loss function, H(F_θ(x,t)) is the category obtained by inputting the confrontation image into the attacked neural network, α is a natural number, and L_re = ∥x − F_θ(x,t)∥_p with p ∈ {0,1,2,∞}.
14. 一種圖像處理裝置，包括：處理器和記憶體，所述處理器和所述記憶體耦合；其中，所述記憶體儲存有程式指令，所述程式指令被所述處理器執行時，使所述處理器執行如請求項1至13任意一項所述的方法。 An image processing device, including a processor and a memory coupled to the processor, wherein the memory stores program instructions which, when executed by the processor, cause the processor to execute the method according to any one of claims 1 to 13.

15. 一種電腦可讀儲存介質，其中，所述電腦可讀儲存介質中儲存有電腦程式，所述電腦程式包括程式指令，所述程式指令當被批次處理裝置的處理器執行時，使所述處理器執行如請求項1至13任意一項所述的方法。 A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program includes program instructions that, when executed by a processor of a batch processing device, cause the processor to execute the method according to any one of claims 1 to 13.
TW108146511A 2019-01-31 2019-12-18 Method for image processing and apparatus thereof TW202032423A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910099484.2A CN109902723A (en) 2019-01-31 2019-01-31 Image processing method and device
CN201910099484.2 2019-01-31

Publications (1)

Publication Number Publication Date
TW202032423A true TW202032423A (en) 2020-09-01

Family

ID=66944495

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108146511A TW202032423A (en) 2019-01-31 2019-12-18 Method for image processing and apparatus thereof

Country Status (3)

Country Link
CN (1) CN109902723A (en)
TW (1) TW202032423A (en)
WO (1) WO2020155614A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI769724B (en) * 2021-03-04 2022-07-01 鴻海精密工業股份有限公司 Image feature extraction method and image feature extraction device, electronic device and storage media

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902723A (en) * 2019-01-31 2019-06-18 北京市商汤科技开发有限公司 Image processing method and device
CN110378854B (en) * 2019-07-17 2021-10-26 上海商汤智能科技有限公司 Robot image enhancement method and device
CN111079761B (en) * 2019-11-05 2023-07-18 北京航空航天大学青岛研究院 Image processing method, device and computer storage medium
CN111275057B (en) * 2020-02-13 2023-06-20 腾讯科技(深圳)有限公司 Image processing method, device and equipment
CN111709879B (en) * 2020-06-17 2023-05-26 Oppo广东移动通信有限公司 Image processing method, image processing device and terminal equipment
US12019747B2 (en) * 2020-10-13 2024-06-25 International Business Machines Corporation Adversarial interpolation backdoor detection
CN112434744B (en) * 2020-11-27 2023-05-26 北京奇艺世纪科技有限公司 Training method and device for multi-modal feature fusion model
CN112669240B (en) * 2021-01-22 2024-05-10 深圳市格灵人工智能与机器人研究院有限公司 High-definition image restoration method and device, electronic equipment and storage medium
CN113657521B (en) * 2021-08-23 2023-09-19 天津大学 Method for separating two mutually exclusive components in image
CN113792723B (en) * 2021-09-08 2024-01-16 浙江力石科技股份有限公司 Optimization method and system for identifying stone carving characters

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9940539B2 (en) * 2015-05-08 2018-04-10 Samsung Electronics Co., Ltd. Object recognition apparatus and method
US9965717B2 (en) * 2015-11-13 2018-05-08 Adobe Systems Incorporated Learning image representation by distilling from multi-task networks
US11205103B2 (en) * 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
CN106952239A (en) * 2017-03-28 2017-07-14 厦门幻世网络科技有限公司 image generating method and device
CN107563493A (en) * 2017-07-17 2018-01-09 华南理工大学 A kind of confrontation network algorithm of more maker convolution composographs
CN107292352B (en) * 2017-08-07 2020-06-02 北京中星微人工智能芯片技术有限公司 Image classification method and device based on convolutional neural network
CN107862377A (en) * 2017-11-14 2018-03-30 华南理工大学 A kind of packet convolution method that confrontation network model is generated based on text image
CN108257116A (en) * 2017-12-30 2018-07-06 清华大学 A kind of method for generating confrontation image
CN108090521B (en) * 2018-01-12 2022-04-08 广州视声智能科技股份有限公司 Image fusion method and discriminator of generative confrontation network model
CN108305238B (en) * 2018-01-26 2022-03-29 腾讯科技(深圳)有限公司 Image processing method, image processing device, storage medium and computer equipment
CN108492265A (en) * 2018-03-16 2018-09-04 西安电子科技大学 CFA image demosaicing based on GAN combines denoising method
CN108665506B (en) * 2018-05-10 2021-09-28 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer storage medium and server
CN108765512B (en) * 2018-05-30 2022-04-12 清华大学深圳研究生院 Confrontation image generation method based on multi-level features
CN109191409B (en) * 2018-07-25 2022-05-10 北京市商汤科技开发有限公司 Image processing method, network training method, device, electronic equipment and storage medium
CN109902723A (en) * 2019-01-31 2019-06-18 北京市商汤科技开发有限公司 Image processing method and device

Also Published As

Publication number Publication date
CN109902723A (en) 2019-06-18
WO2020155614A1 (en) 2020-08-06

Similar Documents

Publication Publication Date Title
TW202032423A (en) Method for image processing and apparatus thereof
TWI753327B (en) Image processing method, processor, electronic device and computer-readable storage medium
CN109711422B (en) Image data processing method, image data processing device, image data model building method, image data model building device, computer equipment and storage medium
CN112001914A (en) Depth image completion method and device
CN111079532A (en) Video content description method based on text self-encoder
CN111340814A (en) Multi-mode adaptive convolution-based RGB-D image semantic segmentation method
CN113628059B (en) Associated user identification method and device based on multi-layer diagram attention network
CN112233012B (en) Face generation system and method
CN112614110B (en) Method and device for evaluating image quality and terminal equipment
CN112861976B (en) Sensitive image identification method based on twin graph convolution hash network
CN114119975A (en) Language-guided cross-modal instance segmentation method
CN116580257A (en) Feature fusion model training and sample retrieval method and device and computer equipment
US20230394306A1 (en) Multi-Modal Machine Learning Models with Improved Computational Efficiency Via Adaptive Tokenization and Fusion
CN113779186A (en) Text generation method and device
Uddin et al. A perceptually inspired new blind image denoising method using L1 and perceptual loss
Xia et al. Combination of multi‐scale and residual learning in deep CNN for image denoising
CN117392260B (en) Image generation method and device
CN117894038A (en) Method and device for generating object gesture in image
Xu et al. Depth map denoising network and lightweight fusion network for enhanced 3d face recognition
CN113159053A (en) Image recognition method and device and computing equipment
CN116127925B (en) Text data enhancement method and device based on destruction processing of text
da Silva et al. No‐reference video quality assessment method based on spatio‐temporal features using the ELM algorithm
CN112001865A (en) Face recognition method, device and equipment
CN112784967B (en) Information processing method and device and electronic equipment
WO2022178975A1 (en) Noise field-based image noise reduction method and apparatus, device, and storage medium