RU2559773C2

RU2559773C2 - Method of searching for digital image containing digital watermark

Info

Publication number: RU2559773C2
Application number: RU2013155158/08A
Authority: RU
Inventors: Владимир Алексеевич Иванов; Дмитрий Александрович Кирюхин; Сергей Владимирович Радаев; Алексей Александрович Пронкин; Геннадий Валерьевич Романишин; Евгений Николаевич Битков; Иван Владимирович Иванов
Priority date: 2013-12-11
Filing date: 2013-12-11
Publication date: 2015-08-10
Also published as: RU2013155158A

Abstract

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to a method of searching for digital images containing a digital watermark. The method of searching for a digital image containing a digital watermark includes preliminary processing of a digital image, classifying the image into one of two classes, selecting fragments of the digital image containing recurrent elements, converting said fragments from a pixel map of the RGB colour scheme into a pixel map of a colour scheme expressed through the wavelength of the continuous spectrum in the visible optical range, analysing the selected image fragments, forming an attribute space for each image fragment; further, from the calculated values, generating an eigen image vector containing statistical image characteristics, presented through the wavelength of the continuous spectrum in the visible optical range, and associating the digital image with one of the two classes: a digital image containing a digital watermark and a digital image without a digital watermark.

EFFECT: enabling operation of the method in conditions without prior information on the law of embedding a digital watermark, as well as a low probability of false alarm.

2 dwg

Description

The invention relates to steganography, specifically to methods for identifying digital images (DIs) that contain a digital watermark (DWM). It can be used to distinguish an original DI, copyright-protected by a DWM embedded in it, from its copies, and to search for DIs in various storage formats that contain additional digital information when nothing is known a priori about the law by which that information was embedded or about its presence in the DI.

A method is known for identifying DI data in the JPEG storage format (US Patent No. 0040015697, IPC G06K 009/00, 2004) that makes it possible to establish whether a received DI was actually sent by a known source and whether the file contents were even slightly modified in transit. To encode the verification information, a unique hash is computed from the first part of the DI data contained in the compressed JPEG DI, so that any distortion of that part of the data would later show up as a different hash computed from the received file. The hash function yields an integrity check value, which is recorded in the first part of the DI data. This value is then encrypted into a signature string, and the signature string is embedded in the next part of the DI data. The process is repeated until all parts of the DI data have been processed. The signature string corresponding to the last part of the data is embedded in that part itself. Because embedding the integrity check value does not change the data sequence of the JPEG file, any decoder can still decode the DI. The DI file is then sent to the intended recipient.

To decode the embedded verification information about the authenticity of the sender of the JPEG file, the recipient computes the hash from the first part of the received DI data. The second part of the data indicates the location where the signature string for the first part was embedded, and the signature is extracted from there. The signature string is then decrypted to recover the hash (integrity check) value contained in the data itself. The two numbers are compared. If the first check value matches the value contained in the recovered signature string, which was previously embedded by the author, the data of the first part of the DI is judged authentic. The process is repeated for each subsequent part of the data until all parts of the DI data have been processed.
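The part-by-part verification described above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the cited patent's procedure: HMAC-SHA256 stands in for "encrypting the hash into a signature string", the signatures are kept in a separate list instead of being embedded in the JPEG data, and the names `sign_parts` and `verify_parts` are hypothetical.

```python
import hashlib
import hmac


def _signature(part, key):
    # Hash the part, then key the digest; HMAC is an assumed stand-in
    # for the "encryption" step of the cited method.
    digest = hashlib.sha256(part).digest()
    return hmac.new(key, digest, hashlib.sha256).digest()


def sign_parts(parts, key):
    """Return one signature per data part (sender side)."""
    return [_signature(part, key) for part in parts]


def verify_parts(parts, signatures, key):
    """Return the indices of parts whose recomputed signature does not match."""
    tampered = []
    for index, (part, signature) in enumerate(zip(parts, signatures)):
        if not hmac.compare_digest(_signature(part, key), signature):
            tampered.append(index)
    return tampered
```

A receiver that gets an empty list from `verify_parts` would treat every part as authentic; any index in the list marks a part that changed after signing.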

Also known is a method for identifying a DI containing a multiple DWM (US Patent No. 20050058320, IPC G06K 009/00, 2005). It comprises a step of embedding in the document (DI) additional information consisting of two types of DWM; a step of reading the embedded DWMs from the document (DI) being identified; a step of comparing the energy characteristics of the two types of DWM read against a reference; and a step of deciding whether the document (DI) being identified is an unauthorized copy.

The analogues above are used in copyright protection and distinguish original documents (original DIs) from copies obtained by printing and scanning. Their drawback is that they work only when a priori information about the DWM embedding law is available. Without it they become ineffective, and it is impossible to tell whether the document (DI) being identified is a copy or an original.

The closest in technical essence to the claimed invention (the prototype) is a method for identifying a digital image containing a digital watermark (patent RU No. 2304306, IPC G06K 009/00, 2007), which includes preprocessing of the DI, formation of the eigen characteristic vector (ECV) of the DI, and classification of the image into one of two classes.

This method is carried out in two stages, called "training" and "analysis". Preprocessing of the DI includes forming a training sample of DIs containing randomly embedded DWMs and forming, for each DI, three two-dimensional arrays of pixel intensity values as a pixel map of the RGB (red, green, blue) colour scheme. Formation of the ECV then involves a multi-level two-dimensional discrete wavelet transform (at least three levels are required) applied separately to the intensity array of each of the three colour components, followed by computing higher-order statistics of the distribution of wavelet coefficients in the different subbands of the transform. Simultaneously, the prediction error of the wavelet coefficient values is computed in the different subbands of the n-th level of the transform and in the subbands of the following (n+1)-th level, for the vertical, horizontal and diagonal subbands respectively. The statistics in question are the sample mean, sample variance, skewness and kurtosis. All computed values are included in the ECV of the DI. Once the array of ECVs for all DIs in the training sample has been formed, classification into one of two classes proceeds as follows: a classifier based on discriminant analysis is trained to linearly discriminate the DIs of the training sample into two classes, DIs containing a DWM and DIs not containing a DWM. This completes "training", and "analysis" begins.
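The prototype's feature extraction can be illustrated with a short sketch. It assumes a simple averaging Haar wavelet (the prototype does not name the wavelet in this text), an image channel stored as a list of rows whose sides are divisible by 2 to the power of the number of levels, and hypothetical function names. The prediction-error statistics mentioned above are omitted.

```python
def haar_level(channel):
    """One level of an averaging 2-D Haar transform.

    `channel` is a list of equal-length rows with even height and width.
    Returns (approx, detail_rows, detail_cols, detail_diag), each half-size.
    """
    approx, d_rows, d_cols, d_diag = [], [], [], []
    for r in range(0, len(channel), 2):
        top, bottom = channel[r], channel[r + 1]
        a, dr, dc, dd = [], [], [], []
        for c in range(0, len(top), 2):
            p, q, s, t = top[c], top[c + 1], bottom[c], bottom[c + 1]
            a.append((p + q + s + t) / 4.0)
            dr.append((p + q - s - t) / 4.0)  # change between the two rows
            dc.append((p - q + s - t) / 4.0)  # change between the two columns
            dd.append((p - q - s + t) / 4.0)  # diagonal change
        approx.append(a)
        d_rows.append(dr)
        d_cols.append(dc)
        d_diag.append(dd)
    return approx, d_rows, d_cols, d_diag


def four_moments(values):
    """Return (sample mean, sample variance, skewness, kurtosis) of a flat list."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    variance = m2 * n / (n - 1) if n > 1 else 0.0
    if m2 == 0.0:
        return mean, variance, 0.0, 0.0  # constant data: shape statistics set to 0
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, variance, m3 / m2 ** 1.5, m4 / m2 ** 2


def subband_features(channel, levels=3):
    """Concatenate the four moments of every detail subband over `levels` levels."""
    features = []
    current = channel
    for _ in range(levels):
        current, *details = haar_level(current)
        for band in details:
            flat = [v for row in band for v in row]
            features.extend(four_moments(flat))
    return features
```

With three levels and three detail subbands per level, one colour channel contributes 36 values; running the function on each of the R, G and B arrays and concatenating the results gives one vector per image.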

"Analysis" includes all the procedures described above, except that the ECV formed is now used to classify the images under analysis into one of two classes, based on the discrimination results for all ECVs obtained from the DIs of the training sample during "training".

This method is used in copyright protection and identifies DIs containing a DWM in the absence of a priori information about the DWM embedding law. Its drawback is a high probability of false alarm (type I error), caused by the non-stationarity of the DI (a two-dimensional signal).

The objective of the invention is to develop a method of searching for a digital image containing a digital watermark that operates in the absence of a priori information about the DWM embedding law while keeping the probability of false alarm (type I error) low.

This objective is achieved in that, in the method for identifying a digital image containing a digital watermark, the following procedures are introduced in sequence between preprocessing of the DI and classification of the image into one of two classes: selecting fragments of the DI that contain repeating elements; converting the DI fragments from a pixel map of the RGB (red, green, blue) colour scheme into a pixel map of a colour scheme expressed through the wavelength of the continuous spectrum of the visible optical range; analysing the selected image fragments; and forming the eigen characteristic vector of the image, which contains statistical characteristics of the image represented through wavelengths of the continuous spectrum of the visible optical range.

Introducing the new procedures makes it possible to identify a DI containing a DWM in the absence of a priori information about the law and location of DWM embedding. In particular, converting the DI fragments from a pixel map of the RGB (red, green, blue) colour scheme into a pixel map expressed through the wavelength of the continuous spectrum of the visible optical range makes it possible to eliminate the non-stationarity of the two-dimensional signal (DI).

The simplest repeating element in a DI is a single pixel, in the case of a monochrome (homogeneous) fragment (a car body part, a section of a house wall, a road or the sky). It is known a priori that, for example, the body parts of the overwhelming majority of cars are painted a single colour and therefore have a single shade. If elements (pixels) of a different shade are found in such monochrome (homogeneous) fragments, it can be assumed that the DI contains an embedded DWM.

Another example of a repeating element is a geometric figure or complex pattern that repeats periodically or non-periodically. Such elements are present, for example, in a DI fragment showing wallpaper, paving tiles or a Penrose tiling (Nauka i Zhizn (Science and Life) magazine, 2013, issue No. 6, "A picture of the world on a sheet of paper", p. 40). Detecting a violation of geometric shape or a distortion of the pattern in individual elements of a fragment whose repeating elements are known to be identical can be a signal to examine the DI in more detail for an embedded DWM.

A section of a rainbow is a classic fragment containing repeating elements with a smooth transition through the shades of the entire visible spectrum. Because the ordering of colour shades represented by RGB codes differs from their ordering in the continuous spectrum of the visible optical range, embedding a DWM into the pixel intensity values of any of the three colour components of the RGB (red, green, blue) pixel map will, with high probability, disrupt the ordering of shades in the continuous spectrum of the visible optical range.

The analysis of the prior art established that there are no analogues characterized by a set of features identical to all the features of the claimed technical solution, which indicates that the claimed method meets the "novelty" condition of patentability.

A search for known solutions in this and related fields of technology, aimed at identifying features that coincide with the features distinguishing the claimed subject matter from the prototype, showed that those features do not follow explicitly from the prior art. Nor does the prior art disclose the distinguishing essential features as producing the same technical result that is achieved in the claimed method. The claimed invention therefore meets the "inventive step" condition of patentability.

The claimed method is illustrated by the drawings, which show:

Fig. 1 - block diagram of an implementation of the method of searching for a DI containing a DWM;

Fig. 2 - comparison of simulation results for the prototype method and the claimed method.

The claimed method is implemented as follows (Fig. 1).

In the DI preprocessing procedure, additional information is embedded in the DIs using various embedding algorithms in order to train the classifier. A three-dimensional array of DI pixel intensity values is then formed as a pixel map of the RGB (red, green, blue) colour scheme (Miano D. Image compression formats and algorithms in action. - Moscow: "Triumph", 2003) (block 1).

Next, fragments of the DI containing repeating elements are selected (block 2). This task is treated as texture-colour segmentation, assuming that the source data are given in the RGB (red, green, blue) colour scheme and that the monochromaticity (homogeneity) of regions is determined from estimates of their brightness, colour and texture characteristics. Properties of the Fourier spectrum are used to find periodicity in the DI. Overall, the texture-colour feature space is obtained by combining two subspaces, colour features and texture features. The colour features used are hue, saturation and lightness (HSL). This colour feature space coincides with the ordinary RGB (red, green, blue) colour space up to a coordinate transformation (Chochia P.A. Pyramidal image segmentation algorithm. Information Processes, vol. 10, No. 1, 2010, pp. 23-35).
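The two ingredients named above, HSL colour features and Fourier-based periodicity detection, can be sketched as follows. This is only an illustration under stated assumptions: pixels are 8-bit RGB triples, periodicity is checked on a single row of values with a naive DFT, and the function names are hypothetical. It is not the cited pyramidal segmentation algorithm.

```python
import cmath
import colorsys


def hsl_features(rgb):
    """Map an 8-bit (r, g, b) triple to (hue, saturation, lightness) in [0, 1]."""
    r, g, b = (component / 255.0 for component in rgb)
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)  # colorsys uses H, L, S order
    return hue, saturation, lightness


def dominant_period(signal):
    """Return the period, in samples, of the strongest non-constant DFT component.

    Returns None when the signal is constant (no periodic structure).
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [value - mean for value in signal]
    best_k, best_magnitude = None, 0.0
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            value * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, value in enumerate(centred)
        )
        # The small margin ignores floating-point noise in empty frequency bins.
        if abs(coefficient) > best_magnitude + 1e-9:
            best_k, best_magnitude = k, abs(coefficient)
    return None if best_k is None else n / best_k
```

A row sampled across a tiled pattern would yield a period close to the tile width, while a row from a uniformly coloured region would yield `None`.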

The DI is then segmented according to the pyramidal transformation algorithm (Chochia P.A. Pyramidal image segmentation algorithm. Information Processes, vol. 10, No. 1, 2010, pp. 23-35). The result of segmentation is a DI in the RGB (red, green, blue) colour scheme consisting of adjacent non-overlapping fragments whose geometric arrangement corresponds exactly to the fragments produced by the pyramidal transformation algorithm (ibid.).

Each DI fragment is then converted from a pixel map of the RGB (red, green, blue) colour scheme into a pixel map of a colour scheme expressed through the wavelength of the continuous spectrum of the visible optical range (block 3). For example, as one variant of this conversion, the image can be represented by a description of the continuous spectrum in terms of visible-range wavelength using a known method (Tatarinov A., Ignatenko A. Spectral colour and its reconstruction from RGB. Computer Graphics and Multimedia, online journal, issue No. 4 (3)/2006).
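To make the idea of a wavelength-valued pixel map concrete, here is a deliberately crude sketch. It is not the spectral reconstruction method cited above: it assumes that hue can be mapped linearly onto wavelength, with hue 0 degrees (red) at 700 nm and hue 270 degrees (violet) at 400 nm. These endpoints and the function name are assumptions chosen for illustration.

```python
import colorsys

RED_NM = 700.0       # assumed wavelength for hue 0 degrees
VIOLET_NM = 400.0    # assumed wavelength for hue 270 degrees
MAX_HUE_DEG = 270.0  # hues beyond this (purples, magentas) are not spectral


def rgb_to_wavelength_nm(rgb):
    """Approximate a dominant wavelength in nm for an 8-bit RGB triple.

    Returns None for greys, which have no hue, and for purples and magentas,
    which do not correspond to a single spectral wavelength.
    """
    r, g, b = (component / 255.0 for component in rgb)
    hue, _lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    if saturation == 0.0:
        return None
    hue_deg = hue * 360.0
    if hue_deg > MAX_HUE_DEG:
        return None
    return RED_NM - (RED_NM - VIOLET_NM) * hue_deg / MAX_HUE_DEG
```

Applying the function to every pixel of a fragment gives a single-valued map in place of three RGB planes. A real implementation would need a policy for the pixels that return `None`.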

The selected image fragments are then analysed (block 4). As a result of this analysis, a feature space is formed for each image fragment.

The feature for a monochrome (homogeneous) image fragment is the average distance (average difference) between image pixels, expressed through the wavelength of the continuous spectrum of the visible optical range.

The average distance (average difference) between the central pixel and the pixels adjacent to it, expressed through wavelengths of the continuous spectrum of the visible optical range, is calculated by the formula:

where S_avg.i is the average distance (average difference) for the i-th pixel;

z_i are the values of the central pixels, expressed through wavelengths of the continuous spectrum of the visible optical range;

m_j are the values of the pixels adjacent to the central pixel, expressed through wavelengths of the continuous spectrum of the visible optical range;

N is the number of adjacent pixels (N ∈ [1, 8]).
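The formula itself is not reproduced in this text. One reading consistent with the definitions above is the mean of |z_i - m_j| over the N existing neighbours of the pixel. The sketch below implements that reading. The absolute value, the rectangular grid of wavelength values and the function names are assumptions.

```python
def mean_neighbour_difference(wavelengths, row, col):
    """Mean absolute difference between pixel (row, col) and its 8-connected neighbours.

    `wavelengths` is a rectangular list of rows holding one wavelength per pixel.
    Border pixels simply have fewer neighbours (N between 1 and 8).
    """
    rows, cols = len(wavelengths), len(wavelengths[0])
    centre = wavelengths[row][col]
    differences = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the central pixel itself
            r, c = row + d_row, col + d_col
            if 0 <= r < rows and 0 <= c < cols:
                differences.append(abs(centre - wavelengths[r][c]))
    if not differences:
        return 0.0  # assumed convention for a 1x1 fragment with no neighbours
    return sum(differences) / len(differences)


def per_pixel_differences(wavelengths):
    """Return the mean neighbour difference of every pixel, row by row, as a flat list."""
    return [
        mean_neighbour_difference(wavelengths, r, c)
        for r in range(len(wavelengths))
        for c in range(len(wavelengths[0]))
    ]
```

In a perfectly uniform fragment every value is zero, so a single pixel of a different shade stands out both at its own position and in the values of its neighbours.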

Once the average distances (average differences) have been calculated for every pixel in the image fragment under consideration, the average distance (average difference) for the fragment as a whole is calculated:

where S_k is the average distance (average difference) for the k-th fragment;

M is the total number of pixels in the k-th fragment.

The average distances (average differences) S_k are then calculated for all k monochrome (homogeneous) fragments that together make up the whole image.

The overall average distance (overall average difference) S_total for the image as a whole is then calculated:

where L is the number of monochrome (homogeneous) fragments in the image.
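The two averaging formulas are not reproduced in this text. From the definitions, a natural reading is that S_k is the arithmetic mean of the M per-pixel values of fragment k and S_total is the arithmetic mean of the L fragment values. The sketch below follows that reading. The unweighted average over fragments and the function names are assumptions.

```python
def fragment_average(per_pixel_values):
    """S_k: mean of the per-pixel average differences of one fragment (M = len)."""
    return sum(per_pixel_values) / len(per_pixel_values)


def image_average(fragments):
    """S_total: mean of S_k over the L fragments of the image.

    `fragments` is a list in which each item holds the per-pixel values of
    one fragment. Every fragment counts equally, whatever its size.
    """
    fragment_values = [fragment_average(values) for values in fragments]
    return sum(fragment_values) / len(fragment_values)
```

For example, two fragments with per-pixel values `[0.0, 2.0]` and `[4.0, 4.0, 4.0]` have S_k of 1.0 and 4.0, which gives S_total of 2.5 under this reading.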

Далее вычисляют дисперсию среднего расстояния (средней разности) пикселей анализируемого изображения для каждого монохромного (однородного) фрагмента по следующей формуле:Then, the variance of the average distance (average difference) of the pixels of the analyzed image for each monochrome (homogeneous) fragment is calculated according to the following formula:

where D_k(X) is the variance of the average distance (average difference) of the k-th fragment;

x_i are the pixel values of the image, expressed as wavelengths of the continuous spectrum of the visible optical range, in the k-th image fragment under consideration;

N is the total number of digital image pixels in the k-th fragment under consideration, obtained after the digital image has been segmented into monochrome (homogeneous) fragments.

Then the mean value of the variance of the average distance (average difference) over the whole image is calculated by the formula:
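
The aggregation steps over fragments can be sketched as follows. This is a minimal illustration under assumptions: S_total is taken to be the arithmetic mean of the S_k values over the L fragments, D_k(X) is taken to be the population variance (divisor N) of the pixel values x_i in the k-th fragment, and the image-level value is the arithmetic mean of the D_k(X). The function names and sample values are hypothetical, and the divisor (N rather than N - 1) is an assumption.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def overall_mean_distance(fragment_distances: Sequence[float]) -> float:
    """S_total: mean of the per-fragment average distances S_k over L fragments."""
    return fmean(fragment_distances)


def fragment_variance(pixel_values: Sequence[float]) -> float:
    """D_k(X): population variance of the N pixel values of one fragment."""
    return pvariance(pixel_values)


def mean_variance(fragments: Sequence[Sequence[float]]) -> float:
    """Mean of D_k(X) over all fragments of the image."""
    variances: List[float] = [fragment_variance(pixels) for pixels in fragments]
    return fmean(variances)


# Hypothetical S_k values and flattened fragments (wavelengths in nanometers).
print(overall_mean_distance([16.0, 4.0]))
print(mean_variance([[500.0, 510.0], [520.0, 520.0, 520.0]]))
```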

Next, the calculated values are used to form the eigen characteristic vector of the image, which contains the statistical characteristics of the image represented through wavelengths of the continuous spectrum of the visible optical range (block 5):

The procedure for forming the eigen characteristic vector of the image, which contains the statistical characteristics of the image represented through wavelengths of the continuous spectrum of the visible optical range (block 5), is connected by a feedback loop to the digital image preprocessing procedure (block 1). This indicates that all of the procedures described above are performed separately on each image in the training set.

For a periodic fragment, the feature is the correspondence of the geometric shapes of the repeating elements.

For digital image fragments containing repeating smooth transitions between shades of the spectrum, the feature is the smoothness of a second-order function. In this case the function is the dependence of the intensity value of a digital image pixel on its coordinate.
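
One possible reading of this smoothness feature is sketched below. The text does not define how smoothness is measured, so the sketch assumes that it can be judged from the discrete second difference of the intensity profile along one coordinate; the function names and the threshold value are hypothetical.

```python
from typing import List, Sequence


def second_differences(profile: Sequence[float]) -> List[float]:
    """Discrete second difference f[i-1] - 2*f[i] + f[i+1] of an intensity profile."""
    return [profile[i - 1] - 2 * profile[i] + profile[i + 1]
            for i in range(1, len(profile) - 1)]


def is_smooth(profile: Sequence[float], threshold: float) -> bool:
    """True if no second difference exceeds the (assumed) threshold in magnitude."""
    return all(abs(d) <= threshold for d in second_differences(profile))


# A linear ramp has zero second differences; a single outlier breaks smoothness.
print(is_smooth([0, 1, 2, 3, 4], threshold=0.5))
print(is_smooth([0, 1, 2, 10, 4], threshold=0.5))
```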

After the array of features has been formed, the digital images are classified by known methods based on pattern recognition theory (Gonzalez R., Woods R., Eddins S. Digital Image Processing Using MATLAB. Tekhnosfera, Moscow, 2006, p. 502) into two classes: digital images containing a digital watermark and digital images not containing a digital watermark (block 6).
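
The text leaves the choice of classifier open ("known methods"). As one illustrative possibility, and not necessarily the method used by the authors, a nearest-centroid classifier over feature vectors can be sketched as follows; the class labels, function names, and feature values are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a set of equally sized feature vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def train(clean_vectors: Sequence[Sequence[float]],
          stego_vectors: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Build one centroid per class from the two training sets."""
    return {"clean": centroid(clean_vectors), "stego": centroid(stego_vectors)}


def classify(model: Dict[str, List[float]], vector: Sequence[float]) -> str:
    """Assign the vector to the class whose centroid is nearest (Euclidean)."""
    return min(model, key=lambda label: math.dist(model[label], vector))


# Hypothetical two-component feature vectors for each training class.
model = train([[1.0, 0.1], [1.2, 0.2]], [[3.0, 0.9], [3.2, 1.1]])
print(classify(model, [1.1, 0.15]))
print(classify(model, [3.1, 1.0]))
```

Any other two-class method from pattern recognition theory could be substituted without changing the surrounding steps.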

The validity of the theoretical premises was verified using simulation models of the prototype system and of a system implementing the claimed method of searching for a digital image containing a digital watermark.

The measure of effectiveness of methods of searching for digital images containing a digital watermark is the probability of false alarm (type I error), P_f.a.
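
An empirical estimate of this measure can be sketched as follows, assuming that a false alarm is a watermark-free image classified as containing a watermark and that the probability is estimated as the share of such errors among all watermark-free images in the control sample. The labels and the tiny example data are hypothetical and are not results from the experiments.

```python
from typing import Sequence


def false_alarm_probability(true_labels: Sequence[str],
                            predicted_labels: Sequence[str]) -> float:
    """Share of truly clean images that were classified as 'stego'."""
    clean_total = 0
    false_alarms = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == "clean":
            clean_total += 1
            if predicted == "stego":
                false_alarms += 1
    if clean_total == 0:
        raise ValueError("the sample contains no clean images")
    return false_alarms / clean_total


# Hypothetical control sample: four clean images, two with a watermark.
truth = ["clean", "clean", "clean", "clean", "stego", "stego"]
predicted = ["clean", "stego", "clean", "clean", "stego", "clean"]
print(false_alarm_probability(truth, predicted))
```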

To assess the performance of the developed method, experiments were carried out on detecting digital images containing a digital watermark. For this purpose, training sets for the two classes of digital images and a control sample were formed. The training set for the "clean" class consisted of 500 files in various digital image storage formats (JPEG, JPEG 2000, BMP). The training set for the "stego" class consisted of similar files with embedded digital watermarks of maximum volume. The control sample included 2,000 files not included in the training sets, 1,000 of which contained a digital watermark of maximum volume.

To study how the probability of false alarm P_f.a. depends on the volume of the digital watermark used in training, additional training samples were formed containing digital watermarks of various volumes (K_зап).

The results presented in Fig. 2 confirm a significant positive effect from introducing the new method.

The industrial applicability of the invention follows from the fact that a device implementing the proposed method can be built from modern electronic components while achieving the purpose stated in the invention.

Claims

A method of searching for a digital image containing a digital watermark, in which the digital image is pre-processed and the image is classified into one of two classes, characterized in that fragments of the digital image containing repeating elements are selected; the fragments of the digital image are converted from a pixel map in the RGB color scheme into a pixel map in a color scheme expressed through the wavelength of the continuous spectrum of the visible optical range; the selected image fragments are analyzed, as a result of which a feature space is formed for each image fragment; then, from the calculated values, an eigen characteristic vector of the image is formed, containing the statistical characteristics of the image represented through wavelengths of the continuous spectrum of the visible optical range; and the digital image is assigned to one of two classes: a digital image containing a digital watermark, or a digital image not containing a digital watermark.