KR20190098801A

KR20190098801A - Classificating method for image of trademark using machine learning

Info

Publication number: KR20190098801A
Application number: KR1020180011743A
Authority: KR
Inventors: 문경혜; 조수원
Original assignee: 문경혜; 주식회사 투아트
Priority date: 2018-01-31
Filing date: 2018-01-31
Publication date: 2019-08-23
Also published as: KR102495721B1

Abstract

According to one embodiment of the present invention, provided is a trademark image classification method utilizing machine learning, which comprises the steps of: preparing learning data using a plurality of published trademark thumbnail images; and performing category classification of the learning data using a convolutional neural network (CNN) algorithm. The learning data includes a label generated based on a figure code assigned to the trademark thumbnail image.

Description

Trademark image classification method using machine learning {CLASSIFICATING METHOD FOR IMAGE OF TRADEMARK USING MACHINE LEARNING}

The present invention relates to a trademark image classification method, and more specifically to a trademark image classification method using machine learning.

Public awareness of intellectual property rights has been growing in recent years. Awareness of trademark infringement in particular has grown, and rights holders increasingly seek to prevent trademark disputes in advance and to actively protect the intellectual property rights in their trademarks.

As a result of this stronger protection, small business owners who were unaware that they had been infringing another party's trademark rights are suffering losses.

This is supported by the fact that the number of legal proceedings related to trademark infringement is increasing every year.

The damage caused by trademark infringement arises at the point when the trademark has actually been launched and is generating revenue.

To prevent this, it is therefore necessary to check whether a trademark infringes other trademarks from the stage at which it is designed, which precedes its launch.

However, although awareness of intellectual property protection has grown, determining whether a right is being infringed remains difficult.

Determining whether an intellectual property right has been infringed requires expertise in trademark law. Consulting a patent attorney with that expertise is costly, which puts it out of reach of small and medium-sized enterprises and individuals, and it is also difficult to make the judgment oneself by searching trademark records directly.

KIPRIS, the Korean patent information network, currently supports trademark search and offers two search methods: general search and smart search.

General search is a text-based method that searches with simple words or search expressions. Smart search selects from a total of 19 search fields, such as trademark name, goods classification, application number, application date, and applicant, or combines several of them.

The search fields include right type, mark type, administrative status, trademark name, classification information, similar group, designated goods, application number, registration number, date, and name.

In addition, KIPRIS PLUS provides a service interface (API) through which intermediate and advanced users, institutions, information providers, and system developers can use search index information for the trademark information that the Korean Intellectual Property Office has published and made available to the public.

However, all of these are text-based searches. For a trademark image that has not yet been registered, none of the information that KIPRIS provides exists; only the image to be registered is available. A method that searches for similar trademarks from an input image is therefore needed.

The prior art document listed below discloses a user-mode trademark name search method and system that, when searching for trademark names containing specific characters, specifies the relative position and length of those characters within the whole trademark name, thereby increasing the usefulness of trademark search and enabling efficient search. It does not disclose the technical gist of the present invention.

Korean Patent Application Publication No. 10-2009-0017360

The trademark image classification method using machine learning according to an embodiment of the present invention has the following objective, in order to solve the problems described above.

The objective is to provide an image-based trademark search method using machine learning in order to overcome the problems of conventional text-based trademark search.

The problems to be solved by the present invention are not limited to those mentioned above; other problems not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

A trademark image classification method using machine learning according to an embodiment of the present invention comprises: preparing training data using a plurality of published trademark thumbnail images; and performing category classification of the training data using a convolutional neural network (CNN) algorithm, wherein the training data includes a label generated based on a figure code assigned to each trademark thumbnail image.

By enabling image-based trademark search, the trademark image classification method using machine learning according to an embodiment of the present invention can be expected to improve the convenience and accuracy of searching for prior trademarks.

The effects of the present invention are not limited to those mentioned above; other effects not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

FIG. 1 is the field-by-field search screen of the KIPRIS trademark search.
FIG. 2 is a screen showing the fields used for image search in the field-by-field search of the KIPRIS trademark search.
FIG. 3 is a screen of the European Patent Office image search system.
FIG. 4 is a screen of European Patent Office image search results.
FIG. 5 shows the structures of an ordinary neural network and a convolutional neural network.
FIG. 6 shows a convolutional neural network architecture.
FIG. 7 shows example descriptions of figure codes.
FIG. 8 shows an example of a trademark that has figure codes.
FIG. 9 shows an example of a trademark consisting only of text.
FIG. 10 shows example descriptions of similar group codes.
FIG. 11 shows the proposed CNN structure.
FIG. 12 shows the 32 outputs pooled to a size of 50x50.
FIG. 13 shows the 64 outputs pooled to a size of 25x25.
FIG. 14 shows the 128 outputs pooled to a size of 13x13.
FIG. 15 shows the structure of the proposed trademark search system.
FIG. 16 shows the results for shifted images as a function of convolution filter size.
FIG. 17 shows the results for original images (dots) and shifted images (squares).
FIG. 18 shows the results for original images (dots), partially erased images (squares), and shifted images (triangles).
FIG. 19 shows examples of trademark images that were recognized successfully.
FIG. 20 shows examples of trademark images that were not recognized.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure in which they appear, and redundant descriptions of them are omitted.

In describing the present invention, detailed descriptions of related known technology are omitted where they are judged likely to obscure the gist of the invention. The accompanying drawings are intended only to aid understanding of the idea of the invention, and the idea of the invention should not be construed as limited by them.

<Configuration and Problems of the Patent Trademark Search System>

Trademark search system

Patent and trademark search is provided through KIPRIS (Korea Intellectual Property Rights Information Service), which compiles information on domestic and foreign intellectual property rights into a database that anyone can use free of charge over the Internet.

KIPRIS serves information on domestic intellectual property rights (patents, utility models, designs, trademarks, and so on) and on foreign intellectual property rights from the United States, Japan, Europe, China, and elsewhere.

Domestic intellectual property records are loaded into KIPRIS and made available within one day of gazette publication; foreign records are made available within two weeks to one month of acquisition.

KIPRIS offers an integrated search that searches all served intellectual property rights (patents, utility models, designs, trademarks, foreign patents) at once without distinguishing the type of right, and it also lets users select the type of right to search and use the corresponding search service.

Among the search systems that KIPRIS provides, trademark search in particular deals with marks that contain figures and images. The drawback is that trademark images are searched by manually attaching tags to each image and then running a text-based search over those tags. By contrast, the image search system of the European Patent Office accepts an image directly by drag and drop and searches for similar images.

Accordingly, this patent aims to build an image-based search system using machine learning image recognition, in order to solve the problems of the text-based trademark image search currently used by KIPRIS.

Machine learning

Warren McCulloch and Walter Pitts created a computational model for neural networks based on mathematics and an algorithm called threshold logic. This model laid the groundwork for two distinct approaches to neural network research: one focused on neurological processing in the brain, the other on the application of artificial neural networks.

In the late 1940s, the psychologist Donald Hebb proposed a basic hypothesis of learning based on the principle of neural plasticity, now known as Hebbian learning. Hebbian learning is a typical unsupervised learning rule, and its variants were early models of long-term potentiation. These ideas began to be applied to computational models in 1948 with Turing's B-type machines.

Farley and Wesley A. Clark were the first to use computational machines to simulate a Hebbian network, at MIT. Other neural network computational machines were built by Rochester, Holland, Habit, and Duda.

Frank Rosenblatt created the perceptron, an algorithm for pattern recognition based on a two-layer learning computer network that uses simple addition and subtraction. Using mathematical notation, Rosenblatt also described circuitry that is not part of the basic perceptron, such as the exclusive-or circuit. The mathematical computation for such a circuit became possible only after Paul Werbos created the backpropagation algorithm.

Neural network research stagnated after Marvin Minsky and Seymour Papert published their work on machine learning. They identified two problems with artificial neural networks. The first was that single-layer neural networks cannot process the exclusive-or circuit. The second was that computers were not sophisticated enough to handle effectively the long run times required by large neural networks. Research progressed slowly until computers became sufficiently fast and the backpropagation algorithm, which handles the exclusive-or problem efficiently, was developed.

In the mid-1980s, parallel distributed processing became popular under the name connectionism. The textbook by David E. Rumelhart and James McClelland gave a full account of how to use connectionism to simulate neural processes on computers.

Neural networks as used in artificial intelligence have traditionally been regarded as simplified models of neural processing in the brain, although the relationship between such models and the biological structure of the brain is debated, because it is unclear to what degree artificial neural networks mirror brain function. Artificial neural networks are gradually overtaking other machine learning methods such as SVMs in popularity, and since the emergence of deep learning in the 2000s they have attracted renewed interest.

Machine learning refers to algorithms that let a computer learn on its own from large amounts of data, and it is used in many fields such as image recognition and speech recognition. Machine learning is currently divided into three broad categories: supervised learning, unsupervised learning, and reinforcement learning.

In supervised learning, given a set of example pairs (x, y), x ∈ X, y ∈ Y, the aim is to choose, from the allowed functions f: X → Y, the one that best fits the examples. In other words, a function is inferred from the given data. The cost function measures how far the given data deviate from the inferred function, and it implicitly contains prior knowledge about the problem.

A commonly used cost function is the mean squared error, which averages the squared difference between the network output f(x) and the target value y over all example pairs. Minimizing this cost by gradient descent for the class of neural networks called multilayer perceptrons yields the backpropagation algorithm, which is widely used to train neural networks.
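As a minimal sketch of the cost function and its minimization, the following Python code computes the mean squared error and applies plain gradient descent to a one-variable linear model. The linear model, the learning rate, the step count, and the sample data are illustrative assumptions, not part of the described method; a multilayer perceptron would propagate the same kind of gradient through its layers.

```python
def mse(predictions, targets):
    """Mean squared error between network outputs f(x) and targets y."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)


def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit f(x) = w * x + b by gradient descent on the MSE (illustrative model)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]  # points on y = 2x + 1
w, b = fit_line(xs, ys)
print(w, b, mse([w * x + b for x in xs], ys))
```

With these sample points the fitted parameters approach w = 2 and b = 1, and the cost approaches zero.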

Tasks within the supervised learning paradigm include pattern recognition and regression analysis. Supervised learning can also be applied to sequential data, as in speech recognition and motion recognition. It can be thought of as learning with a teacher, in the form of a function that gives continuous feedback on the quality of the answers obtained so far.

In unsupervised learning, data x is given and some cost function of the data x and the network output f is minimized. The cost function depends on the task (what is to be modeled) and on the a priori assumptions (implicit properties of the model, its parameters, and the observed variables). As a simple example, consider the model f(x) = a, where a is a constant, with the cost function C = E[(x - f(x))²]. Minimizing this cost yields a value of a equal to the mean of the data. The cost function can be much more complicated than this, and its form depends on the application.
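The constant-model example can be checked numerically. The sketch below, in Python, evaluates C = E[(x - a)²] over a grid of candidate constants and confirms that the minimum lies at the sample mean; the data values and the candidate grid are arbitrary assumptions chosen for illustration.

```python
def cost(a, data):
    """Empirical cost C = E[(x - a)^2] for the constant model f(x) = a."""
    return sum((x - a) ** 2 for x in data) / len(data)


data = [2.0, 4.0, 9.0]
mean = sum(data) / len(data)

# Candidate constants from mean - 2.0 to mean + 2.0 in steps of 0.1.
candidates = [mean + step / 10 for step in range(-20, 21)]
best = min(candidates, key=lambda a: cost(a, data))
print(best, mean)
```

Analytically, setting dC/da = -2 E[x - a] to zero gives a = E[x], which is what the grid search finds.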

For example, in a compression task the cost function could be related to the mutual information between x and f(x), whereas in statistical modeling it could be related to the posterior probability of the model given the data.

Tasks within the unsupervised learning paradigm are generally estimation problems. Applications include clustering, estimation of probability distributions, data compression, and Bayesian spam filtering.

In reinforcement learning, data x is not given; instead it is generated as an agent interacts with the environment. At each point in time t, the agent takes an action y_t, and the environment produces an observation x_t and an instantaneous cost c_t according to some unknown dynamics. The aim is to find a policy for selecting actions that minimizes the expected long-term (cumulative) cost. The dynamics of the environment and the long-term cost of each policy are usually unknown, but they can be estimated.

Formally, the environment is modeled as a Markov decision process with states s_1, ..., s_n ∈ S and actions a_1, ..., a_n ∈ A, together with the instantaneous cost distribution P(c_t | s_t), the observation distribution P(x_t | s_t), and the state transition distribution P(s_{t+1} | s_t, a_t). A policy is defined as the conditional distribution over actions given the observations. Together, the two define a Markov chain. The aim is to find the policy that minimizes the cost, that is, the Markov chain with the lowest cost.

Tasks within the reinforcement learning paradigm include control problems, games, and sequential decision problems.

Convolutional Neural Networks (CNN)

A neural network transforms an input vector through a series of hidden layers. Each hidden layer consists of neurons, and each neuron is connected to every neuron in the previous layer. Neurons within the same layer are not connected to one another and operate independently. The last layer is called the output layer and, in a classification problem, represents the class scores.

Ordinary neural networks are not well suited to images. In the CIFAR-10 dataset, each image is 32x32x3 (32 wide, 32 high, 3 color channels), so a single neuron in the first hidden layer needs 32x32x3 = 3,072 weights. With larger images the same structure becomes impractical: a 200x200x3 image requires 200x200x3 = 120,000 weights for that same neuron. Moreover, a layer contains many such neurons, so the number of parameters grows very quickly.
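The weight counts above follow directly from the image dimensions. The short Python sketch below reproduces them; the function names and the layer width of 1,000 neurons are illustrative assumptions, not values from the description.

```python
def weights_per_neuron(width, height, channels):
    """Weights one fully connected neuron needs to see the whole image."""
    return width * height * channels


def weights_per_layer(width, height, channels, neurons):
    """Total weights of a fully connected layer (biases not counted)."""
    return weights_per_neuron(width, height, channels) * neurons


print(weights_per_neuron(32, 32, 3))         # CIFAR-10 image
print(weights_per_neuron(200, 200, 3))       # larger image
print(weights_per_layer(200, 200, 3, 1000))  # hypothetical 1,000-neuron layer
```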

A convolutional neural network (CNN) is a special kind of neural network that is widely applied to pattern recognition problems such as computer vision, speech recognition, and natural language processing. CNNs were originally inspired by the work of Hubel and Wiesel and have been developed continuously by many researchers since. A CNN exploits the fact that its input is an image to arrange the architecture more sensibly. Unlike an ordinary neural network, its layers have three dimensions: width, height, and depth. The neurons in a layer are connected only to part of the previous layer rather than to all of its neurons.

The above concerns how layers are arranged in a CNN. A CNN layer is not a single operation: three operations, convolution, ReLU, and pooling, together make up one layer.

A convolution computes dot products between a filter and local regions of the image. A common implementation pattern exploits this to compute the forward pass of a convolutional layer as one large matrix multiplication. When convolutions are repeated, the gradient becomes progressively smaller and converges to zero.

To counter this, the ReLU function f(x) = max(0, x) is applied.

Pooling reduces the size of the image in order to reduce the number of parameters and the amount of computation in the network. The two variants are max pooling and mean pooling, and max pooling is generally used. After the repeated convolutional layers, an ordinary neural network finally classifies the image into categories.
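The three operations can be illustrated with a small pure-Python sketch. It is a simplified single-channel version under stated assumptions: the convolution is computed without padding and without flipping the kernel (as is usual in CNN libraries), and the pooling keeps partial windows at the border so that odd sizes round up. The sample image and kernel are arbitrary.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and take dot products (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Apply f(x) = max(0, x) element-wise."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Max pooling with stride equal to the window size; border windows may be partial."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size)
                 if i + di < h and j + dj < w)
             for j in range(0, w, size)]
            for i in range(0, h, size)]


image = [[1, 2, 0, 1],
         [0, 3, 1, 2],
         [2, 0, 4, 1],
         [1, 1, 0, 5]]
kernel = [[1, 0],
          [0, -1]]

features = conv2d(image, kernel)   # 3x3 feature map
activated = relu(features)         # negative responses set to 0
pooled = max_pool(activated)       # 2x2 map after pooling
print(features, activated, pooled)
```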

<Proposed Trademark Image Search Method and System>

Classification of trademarks

Trademark image search determines whether any existing trademark among a large number of trademark images is similar to a given one, so supervised or unsupervised learning is more suitable than reinforcement learning. Trademarks moreover have their own classification scheme, so labels already exist for all registered trademarks. Supervised learning is therefore the appropriate way to classify trademark images.

One existing scheme for classifying trademarks is the figure code.

A figure code is a six-digit number that describes an image appearing in a trademark, and figure codes are used to classify a large number of trademarks.

The figure code has two problems. The first is that it sometimes distinguishes between one instance and several instances of the same image, such as "one tree or shrub" versus "two trees or shrubs" in FIG. 7, which makes it difficult to classify trademarks directly on this basis. A new figure code is therefore derived from the existing one by excluding the quantity-related part and the entries for non-image marks such as sound marks and smell marks. This new code serves as the classification criterion for trademarks and provides the labels of the training data for supervised learning. The new figure code keeps only the first four of the original six digits, which correspond to the major and middle categories. Under this scheme, all trademarks were classified into 146 categories.
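A minimal sketch of the reduction from six digits to four might look as follows in Python. It assumes that figure codes are stored as six-character digit strings; the sample codes are hypothetical values chosen only to show that two codes differing in the last two digits fall into the same category.

```python
def to_category(figure_code):
    """Keep the first four digits (major and middle category) of a six-digit code."""
    if len(figure_code) != 6 or not figure_code.isdigit():
        raise ValueError("expected a six-digit figure code")
    return figure_code[:4]


# Hypothetical codes that differ only in the last two digits.
print(to_category("050101"), to_category("050102"))
```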

The second problem is that a single trademark can be assigned several figure codes rather than one, as in FIG. 8. For example, if the image in a trademark contains both a star and a moon, the trademark has one figure code for the star and another for the moon, so a single trademark effectively carries several labels. In this patent, the first of the figure codes assigned to a trademark is defined as that trademark's figure code, so that each trademark has exactly one label.
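The single-label rule can be sketched as follows, again assuming six-digit code strings. The helper names, the handling of text-only marks (returning `None`), the mapping of labels to integer class indices, and the sample codes are illustrative assumptions rather than details given in the description.

```python
def label_for(figure_codes):
    """Use the first figure code of a trademark, cut to four digits, as its label."""
    if not figure_codes:
        return None  # text-only mark without a figure code
    return figure_codes[0][:4]


def build_class_index(labels):
    """Map each distinct label to an integer class index for training."""
    return {label: i for i, label in enumerate(sorted(set(labels)))}


# Hypothetical trademark with two figure codes; only the first one is used.
print(label_for(["010101", "010701"]))
print(build_class_index(["0501", "0101", "0501"]))
```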

Similar group code

The similar group code identifies the name of the goods or services as set out in the official notice on the names and classification of goods and services. The main purpose of a trademark search is to check for infringement, and even a similar trademark image does not infringe if it is not used in the same line of business. Therefore, even if two images have matching figure codes as classified above and are similar, there is no trademark infringement when their similar group codes differ. All trademarks whose similar group code does not match that of the query image are accordingly excluded.
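The exclusion step can be sketched as a simple filter. The record layout (dictionaries with `figure_category` and `similar_group` keys), the assumption of one similar group code per trademark, and the sample values are all hypothetical and serve only to illustrate the rule.

```python
def filter_candidates(query, candidates):
    """Keep candidates in the same figure category whose similar group code matches."""
    return [c for c in candidates
            if c["figure_category"] == query["figure_category"]
            and c["similar_group"] == query["similar_group"]]


query = {"figure_category": "0501", "similar_group": "G0301"}
candidates = [
    {"id": "A", "figure_category": "0501", "similar_group": "G0301"},
    {"id": "B", "figure_category": "0501", "similar_group": "S1201"},  # other business
    {"id": "C", "figure_category": "0101", "similar_group": "G0301"},  # other image class
]
print([c["id"] for c in filter_candidates(query, candidates)])
```

Only candidate A survives, because B differs in similar group code and C in figure category.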

Learning method

Training data are prepared from the trademark thumbnail images held by the Korean Intellectual Property Office. Each thumbnail is 100 pixels wide and 100 pixels high, and the Office holds more than 3 million trademark images in total. Some of them, however, consist only of text and have no figure code, as in FIG. 9.

In fact, trademarks without a figure code outnumber those with one.

In this patent, therefore, only the 26,791 trademarks that were filed in 2012 and have a figure code were used for training.

The prepared 2012 trademark images are classified into groups of similar trademarks using a convolutional neural network (CNN). As shown in FIG. 11, the CNN extracts features by repeating convolution with a 3x3 filter followed by pooling three times. Max pooling was used. After the three feature extraction stages, category classification is completed by a neural network consisting of one layer and an output layer.
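The feature map sizes listed for FIGS. 12 to 14 (32 maps of 50x50, 64 of 25x25, 128 of 13x13) can be reproduced with a short shape calculation. The sketch below assumes that each convolution preserves the spatial size and that each 2x2 pooling with stride 2 rounds odd sizes up; these padding choices are assumptions that are consistent with the figures but are not stated explicitly in the description.

```python
def trace_shapes(size=100, channels=(32, 64, 128)):
    """Return (height, width, channels) after each convolution + pooling stage."""
    shapes = []
    for ch in channels:
        size = (size + 1) // 2  # 2x2 pooling with stride 2, rounding up
        shapes.append((size, size, ch))
    return shapes


shapes = trace_shapes()
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]  # inputs to the classifier
print(shapes, flat)
```

Under these assumptions the classifier after the third stage receives 13 x 13 x 128 = 21,632 values and maps them to the 146 categories.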

Experimental Results and Analysis

The experiments were run with Python 2.7.12 and Tensorflow-gpu 1.0.1 on a GeForce GTX TITAN X (Pascal) graphics card.

From the 26,791 images used to train the CNN, 1,339 images were drawn at random for testing. The reported result is the average over epochs 51 to 100, which gave a high success rate of 97.56%. To check whether the classifier is also robust to trademark images that are partly erased or shifted to a different position, 670 images were drawn at random from the training images and tested. Partly erased trademarks were still recognized fairly robustly, at 86.18%, but shifted images gave a poor result of about 46.25%. The results for the original, erased, and shifted images are shown in FIG. 18.
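The evaluation protocol above (a random test subset and an average over epochs 51 to 100) can be sketched as follows. The helper names, the fixed seed, and the per-epoch accuracy values are hypothetical placeholders; only the counts 26,791 and 1,339 and the epoch window come from the text.

```python
import random


def sample_test_indices(n_total, n_test, seed=0):
    """Draw n_test distinct image indices at random (the seed is an assumption)."""
    rng = random.Random(seed)
    return rng.sample(range(n_total), n_test)


def mean_accuracy(per_epoch_acc, first_epoch=51, last_epoch=100):
    """Average accuracy over an inclusive, 1-based epoch window."""
    window = per_epoch_acc[first_epoch - 1:last_epoch]
    return sum(window) / len(window)


test_idx = sample_test_indices(26791, 1339)
test_fraction = round(1339 / 26791, 3)

# Placeholder accuracies for 100 epochs, NOT results from the source.
accuracies = [0.5] * 50 + [0.9] * 50
print(len(test_idx), test_fraction, mean_accuracy(accuracies))
```

The test subset is about 5% of the training images, and the 670-image robustness subset is about 2.5%.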

Two methods were tried to improve recognition of shifted trademarks. The first is to train with a larger convolution filter. The second is to generate new images from each original trademark image by a regular procedure and train on all of them. In the present invention, each trademark image was shifted by 5 pixels and by 10 pixels up, down, left, right, and diagonally, giving 25 training images per trademark including the original. Enlarging the filter had only a slight effect, as shown in FIG. 16.
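The shift-based augmentation can be sketched in plain Python. The count of 25 images matches a full grid of horizontal and vertical offsets drawn from -10, -5, 0, 5, and 10 pixels, which is the reading used here. Representing the image as a grayscale list of rows and filling vacated pixels with white (255) are assumptions, and the function names are hypothetical.

```python
def shift_image(img, dx, dy, fill=255):
    """Translate a 2D grayscale image by (dx, dy) pixels.

    Pixels shifted in from outside the frame take the value `fill`
    (white background is an assumption; the source does not say).
    """
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def shifted_variants(img, steps=(5, 10), fill=255):
    """All combinations of horizontal and vertical offsets, original included."""
    offsets = sorted({0, *steps, *(-s for s in steps)})
    return [shift_image(img, dx, dy, fill) for dy in offsets for dx in offsets]


# A blank 100x100 thumbnail with one dark pixel at row 50, column 50.
img = [[255] * 100 for _ in range(100)]
img[50][50] = 0

variants = shifted_variants(img)
print(len(variants))  # 25
```

Every variant keeps the label of the original image, so the labeled training set grows 25-fold without any new annotation.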

Training on the shifted images reached more than 80% within 5 epochs, as shown in FIG. 17, and finally converged to about 85%. Shifting by 3, 6, 9, 12, and 15 pixels instead of the 5 and 10 pixels used in the experiment is expected to give sufficiently good training results. The increase in training time caused by the larger amount of data is not a serious problem: even with 25 times as much training data, one epoch took about 7 minutes. Even with shifts of 3, 6, 9, 12, and 15 pixels, reaching 50 epochs is estimated to take less than a day.
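The timing argument can be checked with a rough calculation. It assumes that epoch time scales linearly with the number of training images, which the source does not state. The source also does not say how the five larger shift amounts would be combined, so two readings are shown: a full grid of offsets (the reading that yields 25 images for shifts of 5 and 10 pixels) and one image per shift amount in each of eight directions. All function names are hypothetical.

```python
def variants_full_grid(steps):
    """Images per trademark when every x offset is combined with every y offset."""
    return (2 * len(steps) + 1) ** 2


def variants_eight_directions(steps):
    """Images per trademark with one shift per step in each of 8 directions."""
    return 8 * len(steps) + 1


def hours_for(epochs, variants, base_variants=25, base_minutes_per_epoch=7.0):
    """Estimated training time, assuming time is linear in the image count."""
    minutes = epochs * base_minutes_per_epoch * variants / base_variants
    return minutes / 60.0


steps = (3, 6, 9, 12, 15)
baseline_hours = hours_for(50, variants_full_grid((5, 10)))
grid_hours = hours_for(50, variants_full_grid(steps))
eight_dir_hours = hours_for(50, variants_eight_directions(steps))
print(round(baseline_hours, 1), round(grid_hours, 1), round(eight_dir_hours, 1))
```

Under these assumptions, 50 epochs at the measured 7 minutes per epoch take about 5.8 hours. With the five larger shift amounts the estimate is about 9.6 hours for the eight-direction reading (41 images per trademark) and about 28 hours for the full-grid reading (121 images per trademark), so the "less than a day" figure depends on which reading is intended.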

FIG. 19 shows trademarks that were recognized successfully. Their categories are: (a) and (b) star; (c) water drop; (d), (e), and (f) wing; (g), (h), and (i) heart; (j) insect; (k) and (l) mountain.

FIG. 20 shows trademarks that were not recognized correctly. (a) should have been recognized as a water drop but was misclassified as a heart, presumably because it resembles an upside-down heart. (b) and (c) should have been recognized as insects but were misclassified as fruit and wing, respectively. (b) presumably resembles a fruit such as a strawberry or an apple, and (c) was presumably classified as a wing because of the dragonfly's wings. For (c), however, the correct answer is expected to be found once the classification is extended to three figure codes.

Conclusion

This patent proposes a trademark image classification system using a CNN. Trademark images are classified according to figure codes, the existing classification scheme for trademarks, and the trademark images used for training were those registered in Kipris.

The trademark image classification system proposed in this patent has the advantage of being robust to partial loss and to changes of position, and can therefore cope with them well.

Future work includes finding a method that is robust not only to the position of the image but also to its size, extending the approach to foreign trademark images, and studying more efficient training methods for that setting.

The embodiments described in this specification and the accompanying drawings merely illustrate part of the technical idea included in the present invention. The embodiments disclosed herein are intended to explain the technical idea of the present invention, not to limit it, so it is evident that the scope of the technical idea is not limited by these embodiments. All modifications and specific embodiments that a person of ordinary skill in the art can readily infer within the scope of the technical idea contained in the specification and drawings of the present invention should be construed as falling within the scope of rights of the present invention.

Claims

A trademark image classification method utilizing machine learning, comprising:
preparing learning data using a plurality of published trademark thumbnail images; and
performing category classification of the learning data using a CNN (Convolutional Neural Network) algorithm,
wherein the learning data includes a label generated based on a figure code assigned to the trademark thumbnail image.