RU2680358C1

RU2680358C1 - Method of recognition of content of compressed immobile graphic messages in jpeg format

Info

Publication number: RU2680358C1
Application number: RU2018117646A
Authority: RU
Inventors: Владимир Алексеевич Иванов; Алексей Валентинович Скурнович; Андрей Михайлович Ревякин
Priority date: 2018-05-14
Filing date: 2018-05-14
Publication date: 2019-02-19

Abstract

FIELD: data processing. SUBSTANCE: the invention relates to the field of data recognition. The method for recognizing the content of a compressed still graphic message (NGS) is based on a sequence of operations in which a JPEG file is decoded only up to the dequantization procedure, an array of DCT coefficient values of the Y color component is formed, central moments are calculated from the distribution of these coefficients, a characteristic feature vector is formed and its values are normalized; the normalized values are then used in a linear predictive rule, and a decision is made as to which of the recognized classes the compressed NGS in JPEG format belongs. EFFECT: reduced processing time for a compressed NGS in JPEG format through a smaller number of operations, while ensuring correct recognition of the content. 1 cl, 4 dwg, 1 tbl

Description

The invention relates to the field of data recognition and can be used for preprocessing and recognizing the content of compressed still graphic messages (NGS) in JPEG format when analyzing large volumes of multimedia information.

For convenience in describing the method for recognizing the content of compressed NGS in JPEG format, a number of definitions are introduced.

Compressed NGS in JPEG format are understood to be still digital images compressed in accordance with the JFIF specification and presented as JPEG files; JPEG is the digital image compression standard defined in ISO/IEC 10918-1 [GOST R ISO/IEC 19794-5-2013]. To compress the content of an NGS in JPEG format (a digital image in JPEG format), three basic operations are performed in sequence: the discrete cosine transform (DCT), rounding (quantization) of the DCT coefficients, and their subsequent entropy coding (with RLE and Huffman codes) [ISO/IEC 10918-1].
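The first two of these operations can be illustrated with a minimal sketch. It assumes a single 8×8 block, the usual JPEG normalization of the DCT, and a uniform quantization table; the names `dct_2d`, `quantize`, `flat_block` and `uniform_q` are illustrative only, and the level shift and entropy coding of a real encoder are omitted.

```python
import math

N = 8  # JPEG processes the image in 8x8 blocks


def dct_2d(block):
    """Forward 8x8 DCT-II with the normalization used by JPEG."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def quantize(coeffs, q_table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[u][v] / q_table[u][v]) for v in range(N)]
            for u in range(N)]


# Illustrative input: a flat block and a uniform table (not real JPEG tables).
flat_block = [[16] * N for _ in range(N)]
uniform_q = [[16] * N for _ in range(N)]
quantized = quantize(dct_2d(flat_block), uniform_q)
```

For a flat block only the DC coefficient is nonzero after quantization, which is why the quantized arrays compress well under run-length and Huffman coding.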

In the present invention, the content of compressed NGS in JPEG format is understood to be the substantive part of messages or information [GOST R 43.0.7-2011].

A digital image is a matrix of pixels organized in rows and columns. A digital image of M by N gray-level or color values consists of M × N pixels [GOST R ISO/IEC 19794-9-2009].

A pixel is the smallest element of a display surface to which color, intensity, and other image characteristics can be assigned independently [GOST 27459-87, Information processing systems. Computer graphics. Terms and definitions, p. 3].

The RGB color model is an additive color model that, as a rule, describes a method of color synthesis for color reproduction (Color synthesis // Photo and Cinema Technology: Encyclopedia / Editor-in-chief E. A. Iofis. Moscow: Soviet Encyclopedia, 1981. 274 p.).

Raster graphics is the area of computer graphics in which images are generated from an array of pixels arranged in rows and columns [GOST 27459-87, Information processing systems. Computer graphics. Terms and definitions, p. 2].

To solve the problem of recognizing the content of compressed NGS in JPEG format, different methods may use different representations of the image: raster graphics, vector graphics, fractal graphics, and combinations thereof.

A method is known for recognizing text information in a vector-raster image (patent RU No. 2309456 of 27.10.2007), which includes the following steps: splitting the image until regions (fragments) containing the largest continuous, logically connected text are obtained; splitting into regions presumed to contain text, for subsequent analysis of whether neighboring regions can be merged into larger fragments; splitting text objects into individual characters and groups of characters at the presumed locations of spaces or other unidentifiable characters; analyzing and assembling groups of characters into lines; splitting into individual characters and groups of characters for subsequent conversion of the absolute character coordinates into groups separated by spaces and enlarged inter-character gaps; processing and analyzing raster objects to detect images of text in non-text objects; and analysis to detect vector objects other than separators, including those that extend beyond the object.

The closest in technical essence to the claimed method, and chosen as the prototype, is a method for recognizing the content of messages in graphic formats (patent RU No. 2479028 of 10.04.2013). To recognize the content of compressed NGS in JPEG format, this method proceeds as follows. At the first stage, the raster volume of the image contained in the NGS is determined and messages that are Web design elements (banners) are filtered out: the received graphic file is decoded into a graphic-format message in the RGB color scheme; the graphic-format message is converted into a two-dimensional array of elements describing the structure of the image raster; the raster volume is determined and compared with a threshold value; and messages that are Web design elements are filtered out. At the second stage, the values of features characterizing the entropy of graphic-format messages are evaluated and a decision is made on the content of the digital image: the value of a resulting informative feature characterizing the content of the NGS is calculated, for which a multilevel scheme of transformations of the structural features of the object is proposed in order to obtain values characterizing the entropy of the NGS; the obtained value of the informative feature is compared with threshold values; and a decision is made on the content type of the analyzed compressed NGS.

The technical problem of this analog and prototype is the long processing time (low efficiency) for each compressed NGS in JPEG format, because every procedure for converting the compressed NGS into the RGB color scheme must be performed to obtain the raster of the digital image, together with the low probability of correctly recognizing the content of the compressed NGS, because only one informative feature is used.

To solve this technical problem, a method for recognizing the content of compressed NGS in JPEG format is proposed. It reduces the processing time (increases the efficiency) for each compressed NGS in JPEG format by reducing the number of processing operations, namely by eliminating the procedures for dequantizing the coefficient arrays and subsequently converting them to the RGB color scheme, and it increases the probability of correctly recognizing the content of the compressed NGS by using several informative features.

In the claimed method, this problem is solved as follows. Based on analysis of the service part of the JPEG file, the volume of its raster is determined; the information part of the JPEG file is Huffman-decoded; a two-dimensional array of the discrete cosine transform coefficients of the Y color component is formed; and, additionally, a training sample is formed for two classes of compressed still graphic messages in JPEG format, depending on the type of content. Next, the central moments of the distribution of the DCT coefficients of the Y color component of each file in the training sample are calculated as features, and an intrinsic characteristic vector of features is formed for each file in the training sample. Two-dimensional feature arrays are then formed for each class of files in the training sample, and the arithmetic mean and standard deviation are calculated over the feature array of the training sample. After that, the feature values are normalized and used to form a linear predictive rule, with which the coefficients of the linear predictive function are calculated and stored. Then, from the normalized feature values of the intrinsic characteristic vector of each compressed still graphic message in JPEG format to be recognized and the stored coefficients of the linear predictive function, the value of the linear predictive function is obtained and compared with a threshold, and a decision is made as to which of the recognized classes the analyzed message belongs. Finally, arrays of compressed still graphic messages in JPEG format are formed according to class membership.

The new set of essential features makes it possible to achieve the stated technical result in processing compressed NGS in JPEG format by eliminating the procedures for dequantizing the coefficient arrays and subsequently converting them to the RGB color scheme, and by using additional informative features.

The analysis of the prior art established that there are no analogs characterized by a set of features identical to all the features of the claimed method for recognizing the content of compressed NGS in JPEG format. Therefore, the claimed invention meets the "novelty" condition of patentability.

A search for known solutions in this and related fields of technology, aimed at identifying features that coincide with the features distinguishing the claimed subject matter from the prototype, showed that those features do not follow explicitly from the prior art. Nor does the prior art reveal any known influence of the transformations provided for by the essential features of the claimed invention on achieving the stated technical result. Therefore, the claimed invention meets the "inventive step" condition of patentability.

The industrial applicability of the invention follows from the fact that a device implementing the proposed method can be built on a modern component base, namely modern high-performance programmable logic integrated circuits (FPGAs) such as the Xilinx Spartan-6 LX45 or the Xilinx Virtex-7 2000T, which provide high-speed processing of an image stream (Ugryumov E. P. Programmable logic arrays, programmable array logic, gate arrays / Digital Circuit Design: textbook for universities. 2nd ed. BHV-Petersburg, 2004. Chapter 7, 357 p.).

The claimed method is illustrated by the drawings, in which:

FIG. 1 shows the general structure of the system for recognizing the content of compressed NGS in JPEG format;

FIG. 2 shows the logic diagram of the stages of training the system for recognizing the content of compressed NGS in JPEG format and of recognition itself;

FIG. 3 compares the processing time for compressed NGS in JPEG format under the prototype and under the claimed method;

FIG. 4 compares the probability of recognizing the content of compressed NGS in JPEG format under the prototype and under the claimed method.

The proposed method for recognizing the content of compressed NGS in JPEG format rests on theoretical premises in the form of statistical properties identified in the arrays of DCT coefficients, which are inherent in the structure of compressed NGS in JPEG format with different content, combined with a linear method of data recognition with training. Accordingly, the method includes two main stages (FIG. 1): training the system, and recognizing the content of compressed NGS in JPEG format on the basis of the stored training results by dividing the NGS into classes S1 and S2 according to the type of content.

The claimed method is implemented as follows (FIG. 2).

1. From the array of JPEG files, the service area of the next file to be processed is read. It contains what is needed to decode the information area of the file correctly, i.e. the dimensions of the pixel array, the address of the information area (content area), and the Huffman code tables.

2. Based on the image-size data from the service area, the raster volume of the image […] is determined. In the prototype method, the proposed threshold value of the raster volume is […], which is determined from analysis of numerous NGS in real data transmission channels.
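Steps 1 and 2 can be sketched as a scan of the JPEG marker segments for the frame header. This is only an illustration: the function name `read_frame_size` and the synthetic byte string `sample` are hypothetical, and taking the raster volume as width times height is an assumption, since the corresponding expression is not reproduced in this text.

```python
import struct

# Start-of-frame markers: 0xC0..0xCF except DHT (0xC4), JPG (0xC8), DAC (0xCC).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def read_frame_size(data: bytes):
    """Return (width, height) from the first SOF segment of a JPEG byte string."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker")
        marker = data[pos + 1]
        if marker == 0xFF:          # fill byte preceding a marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Frame header payload: precision (1 byte), height (2), width (2).
            _precision, height, width = struct.unpack(">BHH", data[pos + 4:pos + 9])
            return width, height
        if marker == 0xDA:          # SOS: entropy-coded data follows, stop here
            break
        pos += 2 + length           # skip the marker and its segment
    raise ValueError("no SOF segment found")


# Synthetic header for illustration: SOI, a short APP0-like segment, then SOF0.
sample = (b"\xff\xd8"
          + b"\xff\xe0" + struct.pack(">H", 4) + b"\x00\x00"
          + b"\xff\xc0" + struct.pack(">HBHHB", 11, 8, 480, 640, 1) + b"\x01\x11\x00")

width, height = read_frame_size(sample)
raster_volume = width * height      # assumption: volume = number of pixels
```

Only the header is touched at this point, so the size check costs far less than decoding the image.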

3. The information part of the JPEG file is Huffman-decoded.

4. The runs in the content area of the compressed NGS are decoded (RLE decoding).

5. From the content area of the compressed NGS in JPEG format obtained after RLE decoding, a two-dimensional array of DCT coefficients is formed for the Y color component, which carries brightness. Field experiments showed that it is this component that contains the main information about the content of the compressed image.
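The run-length decoding in step 4 can be sketched for a single block as follows. The sketch assumes that Huffman decoding has already produced (zero run, value) pairs for the 63 AC coefficients in zigzag order, with (0, 0) as the end-of-block code and (15, 0) as a run of sixteen zeros, as in baseline JPEG; the name `expand_ac_runs` is illustrative.

```python
def expand_ac_runs(pairs):
    """Expand (zero_run, value) pairs into the 63 AC coefficients of one block."""
    coeffs = []
    for run, value in pairs:
        if (run, value) == (0, 0):      # EOB: the rest of the block is zero
            break
        if (run, value) == (15, 0):     # ZRL: sixteen consecutive zeros
            coeffs.extend([0] * 16)
            continue
        coeffs.extend([0] * run)        # zeros preceding the nonzero value
        coeffs.append(value)
    if len(coeffs) > 63:
        raise ValueError("too many coefficients for one block")
    coeffs.extend([0] * (63 - len(coeffs)))   # pad after EOB
    return coeffs


ac = expand_ac_runs([(0, 5), (2, -3), (0, 0)])
```

The resulting quantized coefficients are used as they are; no dequantization or inverse DCT is applied, which is the source of the time saving claimed for the method.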

6. The central moments of the distribution of the DCT coefficients of the Y color component are calculated, in general form, according to expression (1):

$$\mu_s = \frac{1}{n}\sum_{i} n_i\,(x_i - \bar{x})^s, \qquad (1)$$

where $s$ is the order of the moment; $n$ is the sample size; $n_i$ is the frequency of occurrence of the value $x_i$; and $\bar{x}$ is the sample mean.

The dictionary of features […] is built from the values of central moments on the premise that the main statistical characteristics describing the distribution of a random variable are central moments of certain orders [Gmurman V. E. Probability Theory and Mathematical Statistics: textbook for universities. 9th ed. Moscow: Vysshaya Shkola, 2003. 479 p.].

It is noted that higher-order moments make it possible to characterize, and to "strengthen the role" of, large but improbable values of a random variable. Experiments showed that it is in these characteristics of the random variable that the main differences between NGS with different content are observed. For this reason, the proposed method uses central moments of orders 2 to 10 to obtain point estimates of the distribution of a random variable whose distribution law is unknown. The choice of exactly these orders for building the recognition system rests on field experiments conducted beforehand, in which the effectiveness of separating NGS into classes with combinations of features was evaluated.
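As a minimal sketch of this feature computation, the central moment of order s can be taken as the frequency-weighted average of the s-th power of the deviation from the sample mean, evaluated for orders 2 to 10. The function name `central_moments` and the short coefficient list are illustrative; a real input would be the full array of quantized Y-component DCT coefficients of one file.

```python
from collections import Counter


def central_moments(values, orders=range(2, 11)):
    """Central moments of the empirical distribution of `values`."""
    counts = Counter(values)              # frequency of each distinct value
    n = sum(counts.values())              # sample size
    mean = sum(x * k for x, k in counts.items()) / n
    return [sum(k * (x - mean) ** s for x, k in counts.items()) / n
            for s in orders]


coefficients = [0, 0, 0, 0, 1, -1, 2, -2]   # toy data, not from the patent
features = central_moments(coefficients)     # nine values, orders 2..10
```

Working from the frequency table rather than the raw array keeps the cost proportional to the number of distinct coefficient values, which is small for quantized data.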

7. An intrinsic characteristic vector (SHV) of features is formed for each JPEG file read. It includes the values of central moments of various orders calculated from the distribution of the DCT coefficients of the Y color component, which characterize the frequency-domain properties of the NGS being processed:

[…] (2)

The system for recognizing the content of compressed NGS in JPEG format is trained on the basis of Fisher's linear discriminant analysis model [Gorelik A. L., Skripkin V. A. Recognition Methods: textbook for universities. 4th ed. Moscow: Bukinist, 2004. 262 p.]. The training stage consists of the following steps:

8. A training sample is formed for two classes (S1 and S2) of compressed NGS in JPEG format, depending on the type of content.

The number of NGS of each class in the training sample is determined from Bernoulli trials, as a consequence of the law of large numbers [Wentzel E. S. Probability Theory: textbook. 11th ed. Moscow: KNORUS, 2010. 664 p.]:

[…] (3)

where […] is the probability being estimated (of correct classification of either class S1 or class S2); ε is the accuracy with which the probability is determined; Φ(∙) is the Laplace function; […]; and n is the number of observations (the number of compressed NGS of a given class in the training sample).

Provided that the probability of a false alarm does not exceed […], and setting the accuracy […] with confidence […], at least […] compressed NGS of class S2 are needed to train the classifier. Under the same conditions, but given that the probability of detecting compressed NGS in JPEG format of class S1 must be at least […], […] compressed NGS in JPEG format of class S1 must be used for training.
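A sample-size estimate of this kind can be sketched with the standard de Moivre-Laplace approximation for Bernoulli trials, in which the required number of observations is n ≥ p(1 − p)(t/ε)², with t chosen so that 2Φ(t) equals the required confidence. That expression (3) has exactly this form is an assumption, since it is not reproduced in this text; the function name `required_sample_size` and the input values below are illustrative and are not the values used in the patent.

```python
import math
from statistics import NormalDist


def required_sample_size(p, eps, confidence):
    """Smallest n with P(|frequency - p| <= eps) >= confidence (normal approx.)."""
    # For the Laplace function Phi (integral from 0 to t), 2*Phi(t) = confidence
    # corresponds to the standard normal quantile at (1 + confidence) / 2.
    t = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil(p * (1 - p) * (t / eps) ** 2)


# Illustrative inputs only.
n_needed = required_sample_size(p=0.5, eps=0.05, confidence=0.95)
```

Because p(1 − p) is largest at p = 0.5, a probability close to 0 or 1 (such as a small false-alarm rate) requires fewer files for the same accuracy and confidence.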

9. From the vectors of all compressed NGS in JPEG format included in the training sample, two-dimensional feature arrays […] are formed for each class of files in the training sample.

10. Over the feature array of the training sample, without division into classes, the arithmetic mean […] and the standard deviation […] are calculated for each j-th feature.

11. The feature values (the j-th feature of the i-th compressed NGS in JPEG format) in the training sample arrays are normalized in accordance with expression (4):

[…] (4)

where […] is the original value of the j-th feature of the i-th compressed NGS in JPEG format in the training sample.

Normalizing the elements of the feature vectors makes them dimensionless and brings their values into a definite range.
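Steps 10 and 11 can be sketched as a per-feature standardization: subtract the mean of the feature over the whole training sample and divide by its standard deviation. That expression (4) has this form, and that the population standard deviation is used rather than the sample one, are assumptions; the names `fit_normalizer` and `normalize` and the toy data are illustrative.

```python
from statistics import mean, pstdev


def fit_normalizer(rows):
    """Per-feature mean and standard deviation over all training rows."""
    columns = list(zip(*rows))            # one tuple per feature j
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def normalize(row, means, stds):
    """Standardize one feature vector; assumes no feature has zero deviation."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


# Toy training rows: one row per file, one column per feature.
training = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
means, stds = fit_normalizer(training)
normalized = [normalize(r, means, stds) for r in training]
```

The stored `means` and `stds` would be reused unchanged at recognition time, so that a new file is scaled in the same way as the training files.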

12. The normalized feature values are used to form a linear predictive rule of the following form:

[...] (5)

where [...] is the vector of feature values of the object being recognized;

[...] is the inverse covariance matrix for the two classes S1 and S2.

The average covariance matrix [...] for the two classes S1 and S2 is estimated in accordance with expression (6):

[...] (6)

where n₁ and n₂ are the numbers of compressed NGS in JPEG format in the corresponding pair of classes in the training set;

[...] are two-dimensional arrays whose rows contain the feature values of objects of the k-th class, calculated in accordance with expression (7):

[...] (7)

where [...] is the array of feature values of objects of the k-th class from the training set;

[...] is an array whose columns contain the mean feature values of the k-th class.
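The covariance estimate of step 12 can be sketched as below. Expressions (6) and (7) are not reproduced in this text, so the code assumes the standard pooled within-class estimate: each class array is centered on its own feature means (assumed form of (7)), and the two scatter matrices are summed and divided by n₁ + n₂ − 2 (assumed form of (6)). The function names and sample data are hypothetical.

```python
def center(rows):
    """Subtract the class mean of each feature from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def scatter(centered):
    """Return the p x p matrix X^T X of a centered n x p array."""
    p = len(centered[0])
    return [
        [sum(r[a] * r[b] for r in centered) for b in range(p)]
        for a in range(p)
    ]

def pooled_covariance(class1, class2):
    """Average covariance matrix of two classes (assumed pooled form)."""
    s1 = scatter(center(class1))
    s2 = scatter(center(class2))
    denom = len(class1) + len(class2) - 2
    return [[(a + b) / denom for a, b in zip(r1, r2)]
            for r1, r2 in zip(s1, s2)]

# Hypothetical normalized feature arrays, four files per class.
class_s1 = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
class_s2 = [[4.0, 4.0], [6.0, 4.0], [4.0, 6.0], [6.0, 6.0]]

cov = pooled_covariance(class_s1, class_s2)
print(cov)
```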

13. The coefficients of the linear predictive function are calculated. This function is the equation of the separating surface, which in general form can be written as expression (8):

[...] (8)

where [...], [...] are the coefficients of the linear predictive function obtained from expression (5).
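One way to obtain the coefficients of step 13 is sketched below. Expressions (5) and (8) are not reproduced in this text, so the code assumes the classical two-class linear discriminant: the weight vector is the inverse covariance matrix applied to the difference of the class means, and the free term places the boundary midway between the class means. The function names and numbers are hypothetical, and the small solver does not handle a singular covariance matrix.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def lda_coefficients(cov, mean1, mean2):
    """Return (free term, weights) of the assumed linear function."""
    diff = [a - b for a, b in zip(mean1, mean2)]
    weights = solve(cov, diff)          # equals inverse(cov) * diff
    bias = -0.5 * sum(w * (a + b)
                      for w, a, b in zip(weights, mean1, mean2))
    return bias, weights

# Hypothetical covariance matrix and class means.
cov = [[2.0, 0.0], [0.0, 0.5]]
bias, weights = lda_coefficients(cov, [1.0, 1.0], [-1.0, 0.0])
print(bias, weights)
```

Solving the linear system avoids forming the inverse matrix explicitly, which gives the same weights with fewer operations.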

14. The classifier training results are saved in the form of the coefficients [...], [...] of the linear predictive function.

Consequently, the recognition stage requires the information obtained at the classifier training stage: [...], [...].
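Persisting the training results could look like the following sketch. The JSON layout, the file name, and the function names are hypothetical. Storing the means and standard deviations alongside the coefficients is an assumption, made because the recognition stage has to normalize new feature vectors with the training-set statistics.

```python
import json
import os
import tempfile

def save_model(path, bias, weights, means, stds):
    """Store the coefficients and normalization statistics as JSON."""
    model = {"bias": bias, "weights": weights,
             "means": means, "stds": stds}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(model, fh)

def load_model(path):
    """Read back a model written by save_model."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)

# Illustrative values only, not taken from the patent.
path = os.path.join(tempfile.mkdtemp(), "classifier.json")
save_model(path, -1.0, [1.0, 2.0], [4.0, 16.0], [2.0, 4.0])
restored = load_model(path)
print(restored)
```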

15. At the stage of recognizing the content of a compressed NGS in JPEG format, the feature values of the characteristic vector of the compressed NGS being recognized are normalized in accordance with expression (4), using the results obtained in block 11.

16. The normalized feature values of the characteristic vector of the compressed NGS in JPEG format being recognized are substituted into the linear predictive rule obtained in block 12.

17. The value of the linear predictive function (8) obtained in block 13 is calculated using the classifier training results in the form of the coefficients [...], [...] of the linear predictive function.

18. The compressed NGS in JPEG format are divided into classes by type of content in accordance with the following rule: if [...], the compressed NGS is assigned to class S1; if [...], to class S2.
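The recognition stage (steps 15 to 18) can be sketched as a single function. The decision conditions of step 18 are not reproduced in this text, so the sketch assumes a zero threshold with values above it assigned to S1. The function name `classify`, the parameter names, and all numbers are hypothetical.

```python
def classify(features, means, stds, bias, weights, threshold=0.0):
    """Normalize a feature vector, evaluate the linear function, decide.

    Assumption: a value above the threshold means class S1,
    anything else means class S2.
    """
    # Step 15: normalize with the training-set statistics.
    z = [(x - m) / s for x, m, s in zip(features, means, stds)]
    # Steps 16 and 17: value of the linear predictive function.
    score = bias + sum(w * v for w, v in zip(weights, z))
    # Step 18: threshold decision.
    label = "S1" if score > threshold else "S2"
    return label, score

# Illustrative model parameters, not taken from the patent.
means, stds = [4.0, 16.0], [2.0, 4.0]
bias, weights = -1.0, [1.0, 2.0]

first = classify([8.0, 20.0], means, stds, bias, weights)
second = classify([2.0, 12.0], means, stds, bias, weights)
print(first, second)
```

Recognition costs one subtraction, one division, and one multiplication per feature, so its running time is dominated by the partial JPEG decoding rather than by the classifier.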

An experimental verification of the prototype method and of the claimed method for recognizing the content of compressed NGS in JPEG format was carried out on a computer using the MATLAB technical computing software package with additional function libraries implemented in C++, with the following initial data:

1) 500 compressed NGS in JPEG format of class S1, each file 500 to 3,000 KB in size, containing digital images of text (digital photographs of books, newspapers, textbooks);

2) 500 compressed NGS in JPEG format of class S2, each file 500 to 3,000 KB in size, containing digital images of landscapes and portraits;

3) the compressed NGS in JPEG format are undistorted and of good quality, with a raster size of at least the threshold value of [...] pixels.

Table 1

                   Number of NGS, class   [...]   [...]   Processing time of one NGS
Prototype method   500, S1                0.92    0.08    0.6 s
                   500, S2                0.92    0.08
Claimed method     500, S1                0.97    0.07    0.5 s
                   500, S2                0.93    0.03

A comparison of the main indicators of the prototype method and the claimed method shows that the claimed method raises the probability of correct recognition from 92% to 97% (Fig. 3) and shortens the processing time (Fig. 4) when recognizing the content of compressed NGS in JPEG format of two different classes: those containing text and those not containing text.

Thus, the efficiency of the claimed method compared with the prototype method increased by 16.7%, and the probability of correct recognition of the content of compressed NGS in JPEG format rose by 5 percentage points (from 92% to 97%), which achieves the claimed technical result.
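The two figures can be checked with simple arithmetic. The 16.7% value appears to correspond to the relative reduction of the per-file processing time reported in Table 1 (0.6 s for the prototype, 0.5 s for the claimed method); this reading is an assumption, since the text does not define how efficiency is measured. The variable names are hypothetical.

```python
# Processing time of one NGS, from Table 1 (seconds).
t_prototype = 0.6
t_claimed = 0.5

# Relative reduction of processing time, in percent.
time_reduction = (t_prototype - t_claimed) / t_prototype * 100

# Gain in the probability of correct recognition, in percentage points.
recognition_gain = (0.97 - 0.92) * 100

print(round(time_reduction, 1), round(recognition_gain, 1))
```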

The claimed method for recognizing the content of compressed NGS in JPEG format enables preliminary recognition of the content of compressed NGS and is based on differences in the statistical properties of the DCT coefficients of the luminance component Y. It shortens the processing time of each compressed NGS in JPEG format by reducing the number of decoding operations, because dequantization of the coefficient arrays and their subsequent conversion to the RGB color scheme are omitted. It also raises the probability of correct recognition of NGS content in JPEG format through the use of several informative features.
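The statistical features referred to above are central moments of the distribution of the DCT coefficients of component Y. A minimal sketch of such a feature vector follows. The choice of moment orders 2, 3, and 4 is an assumption made for illustration, the function names are hypothetical, and the input is taken to be a flat list of still-quantized coefficient values obtained after Huffman decoding.

```python
def central_moment(values, order):
    """Return the central moment of the given order for a sample."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n

def feature_vector(coefficients, orders=(2, 3, 4)):
    """Build a feature vector from central moments (orders are assumed)."""
    return [central_moment(coefficients, k) for k in orders]

# Hypothetical quantized DCT coefficients of component Y.
coefficients = [-2, 0, 0, 2]

features = feature_vector(coefficients)
print(features)
```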

Claims

A method for recognizing the content of a compressed still graphic message in JPEG format, consisting in that, based on an analysis of the service part of the JPEG file, the size of its raster is determined, the information part of the JPEG file is Huffman-decoded, and a two-dimensional array of values of the discrete cosine transform coefficients of the color component Y is formed, characterized in that a training set is formed for two classes of compressed still graphic messages in JPEG format depending on the type of content; central moments of the distribution of the discrete cosine transform coefficients of the color component Y of each training-set file are calculated as features; a characteristic feature vector of each training-set file is formed; two-dimensional feature arrays are then formed for each class of training-set files; the arithmetic mean and the standard deviation are calculated in the feature array of the training set; the feature values are then normalized and used to form a linear predictive rule, by means of which the coefficients of a linear predictive function are calculated and stored; then, from the normalized feature values of the characteristic vector of each compressed still graphic message in JPEG format being recognized and the stored coefficients of the linear predictive function, the value of the linear predictive function is obtained, which is compared with a threshold, and a decision is made as to which of the recognized classes the analyzed compressed still graphic message in JPEG format belongs; after which arrays of compressed still graphic messages in JPEG format are formed according to their class membership.