KR102376652B1

KR102376652B1 - Method and system for real-time analysis of product data and updating product information using AI

Info

Publication number: KR102376652B1
Application number: KR1020210105132A
Authority: KR
Inventors: 남궁지환; 서동우; 최진욱
Original assignee: 헤드리스 주식회사
Priority date: 2021-08-10
Filing date: 2021-08-10
Publication date: 2022-03-21

Abstract

The present invention relates to a method for analyzing product data in real time and updating product information by using AI. The method includes the steps of: collecting product data by collecting user data to be used for machine learning through an embedded script module and a user log collector module in order to perform user modeling; evaluating a product based on the product data collected through a real-time data analysis processing system module connected to the user log collector module and a managed ETL service module connected to the real-time data analysis processing system module; providing, by an interactive query service module, an interactive query service for a data catalog based on data loaded from the managed ETL service; and linking the evaluated product to at least one medium and managing synchronization with the medium in real time. The present invention is thereby capable of generating cluster ideas through AI.

Description

Method and system for real-time analysis of product data and updating product information using AI

The present invention relates to analyzing product data in real time using AI and updating, in real time, information optimized for each linked medium.

Conventionally, in order to analyze a user's usage patterns and recommend products, technology has been provided that collects the user's usage information through various media, analyzes the collected information as big data, and recommends optimal products in real time.

However, such prior art could not reflect product data that changes in real time. In addition, the prior art made no attempt to refine data such as users' usage patterns by means of artificial intelligence.

Background art includes Korean Patent Application Publication No. 10-2020-0130761 (published November 20, 2020).

An object of the present invention is to analyze product data in real time using AI and to update and provide, in real time, information optimized for each linked medium.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

In order to achieve the above object, a method for real-time analysis of product data and provision of product information according to the present invention is a method, performed by a device, for analyzing product data in real time and updating product information using AI, the method comprising: collecting product data by collecting, for user modeling, user data to be used for machine learning through an embedded script module and a user log collector module; evaluating a product based on the product data collected through a real-time data analysis processing system module connected to the user log collector module and a managed ETL service module connected to the real-time data analysis processing system module, wherein evaluating the product includes scoring the product and includes continuously performing, through MLOps, optimization of product topic models and addition and management of new models, and wherein the managed ETL service module includes a data catalog; providing, in an interactive query service module, an interactive query service for the data catalog based on data loaded from the managed ETL service, wherein the interactive query service module is connected to a cloud machine learning platform module and machine learning on the data is performed in the cloud machine learning platform module; and linking the evaluated product to one or more media and managing synchronization with the media in real time, wherein the real-time synchronization management is performed through DevOps.

In the method for real-time analysis of product data and provision of product information according to the present invention, the embedded script module includes JavaScript, and the user log collector module includes a Java daemon, which is a process that runs in background mode on the kernel in order to handle periodic service requests and that periodically acquires the data required for user modeling.

In the method according to the present invention, data is clustered in the user modeling, and the clustering is one of common-taste clustering, purchasing-power clustering, and demographic clustering.

The method according to the present invention further comprises: calculating a bidding score; setting a bidding limit based on a budget input by a shopping mall or an individual seller; and filtering the product data by at least one of the seller, brand, and category of the product, wherein calculating the bidding score comprises: determining a product score based on the product data; calculating a bidding weight based on the product score; and determining an upper price of the bidding score.

In the method according to the present invention, the interactive query service module includes a data visualization module that visualizes and provides the data.

Other specific details of the present invention are included in the detailed description and the drawings.

According to the present invention configured as described above, product data is analyzed in real time through AI, so that information optimized for each linked medium can be updated in real time.

According to the present invention, in collecting product data, an architecture and engineering capable of processing very large volumes of data in near real time can be provided, and flexible and broad usability can be achieved by implementing an MSA (MicroService Architecture) and a headless service.

According to the present invention, in evaluating (scoring) products and allocating exposure placements based on the collected product data, product topic models can be continuously optimized and new models can be added and managed through MLOps, and cluster ideas can be generated through AI by machine learning on the product data of each cluster.

According to the present invention, for media linkage and real-time synchronization (sync) management, a middleware application that can flexibly accommodate the APIs and EP formats of major media can be provided, and DevOps and engineering for keeping product data synchronized in real time can be provided.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a diagram for explaining the operation of user modeling (100).
FIG. 2 is a diagram for explaining the operation of analyzing (200) a product score.
FIG. 3 is a diagram for explaining the operation of determining a bidding score.
FIG. 4 is a flowchart illustrating a method of analyzing product data in real time using AI and updating, in real time, information optimized for each linked medium.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, various changes may be made to the embodiments, and the scope of rights of the patent application is therefore not restricted or limited by these embodiments. All changes, equivalents, and substitutes of the embodiments should be understood to fall within the scope of rights.

Specific structural or functional descriptions of the embodiments of the present invention are disclosed for illustrative purposes only, and the embodiments may be modified and implemented in various forms. Accordingly, the embodiments are not limited to a particular disclosed form, and the scope of this specification includes changes, equivalents, and substitutes that fall within its technical spirit.

Terms such as "first" and "second" may be used to describe various components, but these terms should be interpreted only for the purpose of distinguishing one component from another. For example, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

When a component is described as being "connected" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may be present.

The terms used in the embodiments are for descriptive purposes only and should not be construed as limiting. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. For example, the singular form of a noun corresponding to an item may include one or more of the items unless the relevant context clearly indicates otherwise. In this specification, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B, or C", "at least one of A, B, and C", and "at least one of A, B, or C" may include any one of the items listed together in the corresponding phrase, or all possible combinations thereof. Terms such as "first" and "second" may be used simply to distinguish one component from another and do not limit the components in other respects (e.g., importance or order). When one (e.g., a first) component is referred to as "coupled" or "connected" to another (e.g., a second) component, with or without the term "functionally" or "communicatively", it means that the component can be connected to the other component directly (e.g., by wire), wirelessly, or through a third component.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the embodiments belong. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless expressly defined in this application, are not to be interpreted in an idealized or overly formal sense.

In addition, in the description with reference to the accompanying drawings, like components are given like reference numerals regardless of the figure number, and redundant description thereof is omitted. In describing the embodiments, detailed description of related known art is omitted where it is judged that it would unnecessarily obscure the gist of the embodiments.

The embodiments may be implemented in various types of electronic devices, such as personal computers, laptop computers, tablet computers, smartphones, televisions, smart home appliances, intelligent vehicles, kiosks, and wearable devices.

Each of the components described herein (e.g., a module or a program) may include a single entity or a plurality of entities. According to various embodiments, one or more of the components or operations may be omitted, or one or more other components or operations may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into a single component. In this case, the integrated component may perform one or more functions of each of the plurality of components in the same or a similar manner as they were performed by the corresponding component before the integration. According to various embodiments, operations performed by a module, program, or other component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

The term "module" as used herein may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed component, or a minimum unit of such a component or a part thereof, that performs one or more functions. For example, according to one embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).

Various embodiments of this specification may be implemented as software (e.g., a program or an application) including one or more instructions stored in a storage medium (e.g., memory) readable by a machine. For example, a processor of the machine may invoke at least one of the one or more instructions stored in the storage medium and execute it. This enables the machine to be operated to perform at least one function according to the at least one invoked instruction. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" only means that the storage medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between data being stored semi-permanently and data being stored temporarily in the storage medium.

Methods according to the various embodiments disclosed herein may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read-only memory (CD-ROM)), or distributed online (e.g., downloaded or uploaded) through an application store or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product may be at least temporarily stored, or temporarily created, in a machine-readable storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

FIG. 1 is a diagram for explaining the operation of user modeling (100).

In user modeling (100), FRM (financial risk management), which concerns how often a user purchases products and how much the user purchases at a time, may be considered. Similar-target modeling, look-alike modeling, and the like may also be considered in user modeling (100). For example, look-alike modeling can help discover new, unique audiences through automated data analysis. The process begins with selecting a trait or segment, a time interval, and first-party and third-party data sources. These selections provide the inputs to the algorithmic model. When the analytics process runs, it finds eligible users based on the shared characteristics of the selected population, and upon completion this data can be used in the trait builder. It is also possible to build segments that combine algorithmic traits with rules-based traits and to add other qualification requirements using Boolean expressions and comparison operators. Look-alike modeling thus provides a dynamic way to extract value from all available trait data.

Using look-alike modeling can provide the following main benefits.

● Data accuracy: The algorithm runs regularly, which keeps the results current and relevant.

● Automation: There is no need to manage a large set of static rules. The algorithm finds the audiences.

● Savings in time and effort: With the modeling process, there is no need to guess which traits/segments will work or to spend campaign resources on discovering new audiences. The model does this automatically.

● Reliability: Modeling works with server-side discovery and qualification processes that evaluate the operator's own data and the selected third-party data to which it has access, so site visitors do not have to be observed in order to qualify them for a trait.
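
As a rough illustration of the look-alike idea described above, the following Python sketch scores candidate users by cosine similarity to the average trait vector of a seed population and keeps those above a cutoff. The specification does not disclose the actual algorithm; the trait vectors, the `find_lookalikes` helper, and the 0.9 threshold are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def find_lookalikes(seed, candidates, threshold=0.9):
    """Return candidate ids similar to the seed population, best match first.

    seed / candidates: dict of user id -> trait vector (hypothetical format).
    threshold: assumed similarity cutoff, not taken from the specification.
    """
    dims = len(next(iter(seed.values())))
    # Average trait vector of the selected (seed) population.
    centroid = [sum(v[i] for v in seed.values()) / len(seed) for i in range(dims)]
    scored = {uid: cosine(vec, centroid) for uid, vec in candidates.items()}
    matches = [uid for uid, score in scored.items() if score >= threshold]
    return sorted(matches, key=lambda uid: -scored[uid])

seed = {"u1": [1.0, 0.0, 1.0], "u2": [0.9, 0.1, 0.8]}
candidates = {
    "c1": [1.0, 0.0, 0.9],
    "c2": [0.0, 1.0, 0.0],
    "c3": [0.8, 0.2, 1.0],
}
lookalikes = find_lookalikes(seed, candidates)
print(lookalikes)
```

In this toy data, `c1` and `c3` share the seed users' dominant traits while `c2` does not, so only the first two qualify.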

In addition, common-taste clustering, purchasing-power clustering, and demographic clustering may be used in user modeling (100). Clustering is used in unsupervised learning: users can be grouped by identifying common tastes or characteristics, and sellers can carry out targeted marketing and advertising on that basis. In this clustering process, common taste, purchasing power, demographics, and the like may be used as criteria for dividing users into groups.

As for the specific clustering procedure, K-Means is described here as an example of a representative clustering algorithm in the field of artificial intelligence. K-Means is an unsupervised learning algorithm, so the dataset carries no labels. The user therefore decides, based on domain knowledge, how many clusters to divide the data into and sets the value of K accordingly. This K is the number of means referred to in the algorithm's name, that is, the number of cluster centers (centroids). (1) The K centers are usually placed at random in the data space. (2) The distance between each center and every data point is then calculated, and each data point is assigned to the cluster of its nearest center. (3) Once all data points have been clustered, the center of each cluster is recalculated. (4) The distances between the new centers and the data points are calculated again, and any data point that now has a closer cluster center is reassigned to that center. Steps (2) to (4) are repeated until the cluster centers no longer change. Because K-Means sets the initial center positions at random, it may converge to a local minimum rather than the optimal solution, so it is advisable to verify the result over multiple runs.
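
Steps (1) to (4) can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in the invention: it assumes the initial centers are drawn from the data points themselves (one common way to place them at random), and it stops when the assignments stop changing, which implies that the centers no longer move. The sample points, `kmeans`, and `dist2` are illustrative names.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step (1): choose K initial centers at random (here: K distinct data points).
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Steps (2) and (4): assign each point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points
        ]
        # Unchanged assignments mean the centers would not move either.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step (3): recompute each center as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster became empty
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, assignment

# Hypothetical 2-D user features, e.g. (purchase frequency, basket size).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids, assignment)
```

Re-running with different `seed` values is a simple way to check for the local-minimum problem mentioned above; on data with less clearly separated groups, different seeds can yield different clusterings.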

In this user modeling (100), the user (end user, 110) is connected to an embedded script module (120) (e.g., JavaScript), and the embedded script module (120) is connected to a user log collector module (130) (e.g., a Java daemon). Data to be used for machine learning through user modeling (100) can be collected from the user (110) via the embedded script module (120) and the user log collector module (130). For example, the data can be obtained through ready-made functions included in the JavaScript. The Java daemon, for example, is a process that runs in background mode on the kernel in order to handle periodic service requests, and it can periodically acquire the data required by user modeling (100).
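
The periodic, background collection described above can be sketched as follows. The specification names a Java daemon; this Python version is only an analogy using a daemon thread, and the `UserLogCollector` class, its `sink` callback, the event fields, and the flush interval are hypothetical.

```python
import threading

class UserLogCollector:
    """Buffers user events and hands them to a sink at a fixed interval."""

    def __init__(self, sink, interval_sec=1.0):
        self._sink = sink              # callable that receives a list of events
        self._interval = interval_sec
        self._buffer = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        # daemon=True: the thread runs in the background and never blocks exit.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def record(self, event):
        """Called for each event reported by the embedded script."""
        with self._lock:
            self._buffer.append(event)

    def flush(self):
        """Send all buffered events to the sink; return how many were sent."""
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._sink(batch)
        return len(batch)

    def _run(self):
        # Event.wait returns False on timeout, so this flushes periodically
        # until stop() sets the event.
        while not self._stop.wait(self._interval):
            self.flush()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
        self.flush()  # deliver whatever is still buffered

batches = []
collector = UserLogCollector(batches.append, interval_sec=0.05)
collector.start()
collector.record({"user_id": "u1", "event": "view", "product_id": "p42"})
collector.record({"user_id": "u1", "event": "purchase", "product_id": "p42"})
collector.stop()
print(sum(len(b) for b in batches), "events collected")
```

Batching behind a lock keeps `record` cheap for the caller, while the final flush in `stop` ensures no buffered event is lost on shutdown.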

FIG. 2 is a diagram for explaining the operation of analyzing (200) a product score. To analyze the product score, the following may be used: a real-time data analysis processing system module (210) capable of running a streaming data pipeline, a distributed search engine module (220) that may be a full-text search and analytics engine, a cloud object storage module (230), and a managed ETL service module (240) that may be a data catalog.

The real-time data analysis processing system module (210) streams data through a pipeline: it is configured to pipeline and stream the data being collected from users in real time so that the data can be analyzed in real time.

The distributed search engine module (220) may be an engine for full-text search and analysis.

As for the cloud object storage module (230): as a business grows, it comes to manage pools of data that expand rapidly from many sources yet remain isolated, and that are used by numerous applications and business processes. Many companies struggle with a fragmented storage portfolio that adds complexity to business applications and slows innovation. Object storage, such as the cloud object storage module (230), provides highly scalable and cost-effective storage that can store any type of data in its native format.

In the managed ETL service module 240, ETL stands for extraction, transformation, and loading of data. The managed ETL service module 240 extracts the data obtained through user log collection from the cloud object storage, transforms it into the target format, and loads it into the target service (the interactive query service).

As for the data catalog: for a company to gain a competitive advantage from its data, users must be able to find, understand, and appropriately use that data quickly. Various IT technologies are introduced for this purpose, but IT cannot understand data from a business perspective and instead carries much of the responsibility for data quality and regulatory functions. A data catalog is a tool that helps business users understand, in catalog form, the knowledge and information about the data and processes held by the organization. Here, a process means an activity related to the production, management, or consumption of data. The data catalog translates the technical details of the data dictionaries (metadata) of multiple organizations into a form that is simple and easy for users to understand. It thereby helps users identify and understand the available data sets, and provides transparency about metadata such as data definitions, synonyms, core business attributes, and usage methods. When a user has a data problem to resolve, the data catalog makes it easy to find the person responsible for, or the owner of, the data in question and to put the data to business use. The data catalog may also be structured so that users can easily see how data has been used, including data flows and dependencies. In this process, data quality scores and metrics can be provided and usage restrictions can be set.
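The extract, transform, and load stages described above can be sketched as follows. This is a minimal illustration only: the log format, the `RAW_LOGS` sample, and the function names are assumptions made for the example, and an in-memory dictionary stands in for the interactive query service.

```python
import json
from collections import defaultdict

# Hypothetical raw log lines, standing in for objects read from cloud object storage.
RAW_LOGS = [
    '{"user": "u1", "product": "p1", "event": "impression"}',
    '{"user": "u2", "product": "p1", "event": "conversion"}',
    'not valid json',
    '{"user": "u1", "product": "p2", "event": "impression"}',
]


def extract(lines):
    """Extract: parse each raw line, skipping lines that are not valid JSON."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def transform(records):
    """Transform: aggregate event counts per product."""
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        counts[rec["product"]][rec["event"]] += 1
    return {product: dict(events) for product, events in counts.items()}


def load(table, target):
    """Load: write the aggregated rows into the target store (a dict here)."""
    target.update(table)
    return target


query_store = load(transform(extract(RAW_LOGS)), {})
```

In this sketch the malformed line is dropped during extraction, so only well-formed events reach the target store.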

Determining the score of a product in the real-time data analysis processing system module and the managed ETL service module may include determining four categories relating to impressions and conversions. The score may further be determined through an analysis that combines the number of impressions, the number of conversions, and bi-matrix analysis; the number of views, the number of conversions, and bi-matrix analysis; and the user customer model together with product score data. Here, the score may be determined for each individual product rather than for a group of products.
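The specification does not give a scoring formula, so the following is only a hedged sketch of per-product scoring. The weighted combination of view rate and conversion rate, the weights `w_view` and `w_conv`, and the sample statistics are all assumptions chosen for illustration.

```python
def product_score(impressions, views, conversions, w_view=0.3, w_conv=0.7):
    """Hypothetical score: weighted mix of view rate and conversion rate."""
    if impressions == 0:
        return 0.0
    view_rate = views / impressions
    conv_rate = conversions / views if views else 0.0
    return w_view * view_rate + w_conv * conv_rate


# Assumed per-product statistics; each product is scored individually, not as a group.
stats = {
    "p1": {"impressions": 1000, "views": 100, "conversions": 10},
    "p2": {"impressions": 500, "views": 200, "conversions": 5},
}
scores = {pid: product_score(**s) for pid, s in stats.items()}
```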

FIG. 3 is a diagram for explaining an operation for determining a bidding score.

The interactive query service module 310 receives the data loaded by the managed ETL service module 240. The interactive query service module 310 may also transmit data to the data visualization module 320, which visualizes the data so that the user can analyze it. The interactive query service module 310 may further transmit the data loaded from the managed ETL service module 240 to the cloud machine learning platform module 330.

The product service module 340 may transmit data received from the cloud machine learning platform module 330 to a distributed search engine. The product service module 340 may also provide product-related services to the shopping mall 350-1 and the individual seller 350-2. The product-related service may provide a score that sets the bidding weight and the upper price based on the product score, may provide data for setting a bidding limit according to the operator's budget and circumstances, and is configured to filter data based on basic product information (for example, through filtering criteria such as seller, brand, and category).
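A minimal sketch of the filtering and bidding-limit functions follows. The `Product` record, the proportional budget split in `bid_limits`, and the sample catalog are assumptions for illustration; the specification states only that a bidding limit is set according to the budget and that data is filtered by seller, brand, or category.

```python
from dataclasses import dataclass


@dataclass
class Product:
    product_id: str
    seller: str
    brand: str
    category: str
    score: float  # product score, assumed to lie in [0, 1]


def filter_products(products, seller=None, brand=None, category=None):
    """Keep products matching every criterion that is not None."""
    return [
        p for p in products
        if (seller is None or p.seller == seller)
        and (brand is None or p.brand == brand)
        and (category is None or p.category == category)
    ]


def bid_limits(products, budget):
    """Split the budget across products in proportion to score (assumed rule)."""
    total = sum(p.score for p in products)
    if total == 0:
        return {p.product_id: 0.0 for p in products}
    return {p.product_id: budget * p.score / total for p in products}


catalog = [
    Product("p1", "s1", "acme", "shoes", 0.6),
    Product("p2", "s1", "acme", "bags", 0.3),
    Product("p3", "s2", "zen", "shoes", 0.2),
]
shoes = filter_products(catalog, category="shoes")
limits = bid_limits(shoes, budget=100.0)
```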

Meanwhile, REST (Representational State Transfer) is a style of software architecture for distributed hypermedia systems such as the World Wide Web. The term was introduced in Roy Fielding's 2000 doctoral dissertation; Fielding is one of the principal authors of HTTP. The concept has spread widely in networking culture. In the strict sense, REST is a collection of network architecture principles, where "network architecture principles" refers to the overall way in which resources are defined and addressed. In the looser sense, it refers to a very simple interface for transmitting data on the Web over HTTP without a separate messaging layer such as SOAP or session tracking through cookies. The two senses overlap in some respects and conflict in others. Following Fielding's REST architectural style, it is also possible to design very large software systems that are not based on HTTP or the WWW. It is likewise possible to design systems using simple XML and HTTP interfaces instead of remote procedure calls. Systems that follow Fielding's REST principles are often called RESTful, and a RESTful API is an API that follows the REST principles.

FIG. 4 is a flowchart illustrating a method of analyzing product data in real time using AI and updating, in real time, information optimized for each linked medium.

In step 410, for user modeling, product data may be collected by collecting user data to be used for machine learning through the embedded script module and the user log collector module.

In step 420, the product may be evaluated based on the product data collected through the real-time data analysis processing system module connected to the user log collector module and the managed ETL service module connected to the real-time data analysis processing system module. Evaluating the product includes scoring the product and continuously performing, through MLOps, model optimization for the product subject and additional management of new models. The managed ETL service module may include a data catalog.

In step 430, based on the data loaded from the managed ETL service, the interactive query service module may provide an interactive query service for the data catalog. The interactive query service module is connected to the cloud machine learning platform module, and machine learning on the data may be performed in the cloud machine learning platform module.

In step 440, the evaluated product may be linked to one or more media, and synchronization with the media may be managed in real time. Managing the synchronization in real time may be performed through DevOps.

In another embodiment of the present invention, the Word2vec algorithm used in machine learning is a text-mining algorithm that determines the proximity between words by examining the words that precede and follow each word. Word2vec is an unsupervised learning algorithm. As its name suggests, it may be a quantitative technique that represents the meaning of a word as a vector. Word2vec can represent each word as a vector in a space of about 200 dimensions, so that a vector corresponding to each word can be obtained.

Word2vec may enable a dramatic improvement in accuracy in natural language processing compared with other conventional algorithms. It can learn the meaning of a word from the relationship between that word and its neighboring words in the sentences of an input corpus. Word2vec is based on an artificial neural network and starts from the premise that words appearing in the same context have similar meanings. It learns from text documents: for each word, the other words appearing nearby (roughly 5 to 10 words before and after) are presented to the neural network as related words. Because words with related meanings are likely to appear close together in a document, the vectors of two such words gradually move closer as training is repeated.

Word2Vec, developed at Google, uses distributed representations built on the distributional hypothesis. For example, suppose the word "puppy" frequently appears together with words such as "cute", "pretty", and "charming". If these words are vectorized in accordance with the distributional hypothesis, they receive similar vectors; that is, they become semantically close words.

Methods for obtaining vectors through distributed representations were already widely used, but Word2Vec attracted particular attention because of its efficiency. Being based on a simple artificial neural network model, it keeps the required computation low (computationally cheap) even when the training data grows beyond one billion words, because the training process is easily parallelized and good-quality word vector representations can be obtained in a short time. Word2Vec, with its greatly improved speed, offers two training methods: CBoW and Skip-Gram.

First, CBoW (Continuous Bag of Words) predicts the word in the middle from the words around it. The word to be predicted is called the center word, and the words used for the prediction are called context words. To predict the center word, one must decide how many words before and after it to consider; this span is called the window, and a sliding window is used to build the training data set. Word2Vec training slides the window over the corpus, looks at the context words of each center word, and updates the vector of each word. The vectors of words that do not appear in the window are moved away from the center word vector, while the vectors of the context words that do appear are moved closer to it. Because the Word2Vec network using CBoW has only one hidden layer, it is a shallow neural network rather than a deep learning model. Moreover, unlike an ordinary hidden layer, this layer has no activation function and performs only a lookup-table operation, so it is called a projection layer. The one-hot vector of each context word is multiplied by the weight matrix W, and the resulting vectors meet at the projection layer, where they are averaged. The average vector is then multiplied by a second weight matrix W', producing a vector with the same dimension as the one-hot vectors, and the softmax function is applied to it to obtain the score vector. The n-th element of the score vector is the probability that the n-th word is the center word. Training then proceeds so as to minimize, by means of a loss function, the error between the score vector (the predicted value) and the actual center word vector.
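The CBoW forward pass described above (embedding lookup, averaging at the projection layer, multiplication by W', softmax, and a loss against the true center word) can be sketched in plain Python. The toy vocabulary size, the weight values, and the function names are assumptions for illustration; cross-entropy is used as the loss, which is a common choice that the text does not specify here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cbow_forward(context_ids, W, W_out):
    """Return P(center word | context) for one training example.

    W     : V x N input weights; row i is the embedding of word i, so
            multiplying a one-hot vector by W is just a row lookup.
    W_out : N x V output weights (W' in the text).
    """
    n = len(W[0])
    # Projection layer: average of the context embeddings, no activation function.
    hidden = [sum(W[i][k] for i in context_ids) / len(context_ids) for k in range(n)]
    vocab = len(W_out[0])
    scores = [sum(hidden[k] * W_out[k][j] for k in range(n)) for j in range(vocab)]
    return softmax(scores)


def cross_entropy(probs, target_id):
    """Loss between the predicted distribution and the true center word."""
    return -math.log(probs[target_id])


# Toy vocabulary of 4 words with embedding size 2 (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
W_out = [[1.0, 0.0, 0.0, 0.5], [0.0, 1.0, 0.0, 0.5]]
probs = cbow_forward([0, 1], W, W_out)
loss = cross_entropy(probs, target_id=3)
```

Training would then adjust W and W' to reduce `loss`; the update step is omitted here.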

Skip-Gram works on much the same principle as CBoW, but it predicts which context words occur given the center word. Like CBoW, Skip-Gram is a very simple artificial neural network model; only its input and output layers are reversed relative to CBoW. Several papers have compared the performance of these two similar methods, and Skip-Gram is generally reported to perform better than CBoW.
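The reversed roles of input and output in the two methods can be seen in how training examples are generated from the same sliding window. The sketch below is illustrative; the function names and the sample sentence are assumptions.

```python
def context_windows(tokens, window):
    """Return (center, context_words) for every position in the token list."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((center, context))
    return examples


def skipgram_pairs(tokens, window):
    """Skip-Gram: one (input=center, target=context word) pair per context word."""
    return [(c, w) for c, ctx in context_windows(tokens, window) for w in ctx]


def cbow_examples(tokens, window):
    """CBoW: one (input=context words, target=center) example per position."""
    return [(ctx, c) for c, ctx in context_windows(tokens, window)]


tokens = ["the", "puppy", "is", "cute"]
sg = skipgram_pairs(tokens, window=1)
cb = cbow_examples(tokens, window=1)
```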

Both methods described above, CBoW and Skip-Gram, project the input word into an N-dimensional vector and are then trained to predict the output word using the softmax function. This can make training very slow, because an exact computation must consider every word in the data set at once and the amount of computation becomes very large. As Word2Vec came into practical use, two methods were proposed to overcome this drawback: hierarchical softmax and negative sampling.

Hierarchical softmax is purely a way to compute the softmax function quickly; it builds a binary tree that takes into account the frequency of occurrence of every word.
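One common way to build such a frequency-aware binary tree is a Huffman tree, in which frequent words receive short paths. The sketch below builds only the tree codes, not the per-node classifiers that hierarchical softmax trains; the word frequencies are hypothetical.

```python
import heapq
import itertools


def huffman_codes(freqs):
    """Build a Huffman tree from word frequencies; return each word's binary code."""
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(counter), {w: ""}) for w, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merge the two least frequent subtrees; prefix their codes with 0 and 1.
        merged = {w: "0" + c for w, c in left.items()}
        merged.update({w: "1" + c for w, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(counter), merged))
    return heap[0][2]


codes = huffman_codes({"the": 50, "puppy": 20, "cute": 20, "zebra": 10})
```

With such a tree, the probability of a word is computed along its path from the root, so the cost per word grows with the code length rather than with the vocabulary size.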

The analysis unit may input the user customer modeling data into this neural network and compute, as a first vector value, a vector value representing the context information of a keyword. The keyword analysis unit may split the customer modeling data into words, input them into the neural network, and compute, as second vector values, vector values representing the context information of each filtered result. The analysis unit may compute the similarity between the first vector value and each of the plurality of second vector values, extract the second vector value with the highest similarity, and extract the user customer model corresponding to that second vector value.
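The selection of the most similar second vector can be sketched as follows. The text does not name a similarity measure, so cosine similarity is assumed here; the vectors and the model names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(first, candidates):
    """Return the name of the candidate vector most similar to `first`."""
    return max(candidates, key=lambda name: cosine(first, candidates[name]))


first_vector = [0.9, 0.1, 0.0]
second_vectors = {
    "model_a": [1.0, 0.0, 0.0],
    "model_b": [0.0, 1.0, 0.0],
    "model_c": [0.0, 0.0, 1.0],
}
match = best_match(first_vector, second_vectors)
```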

The system may be a communication device capable of network access, such as a personal computer (a desktop or laptop computer), a mobile device (a smartphone, a PDA (Personal Digital Assistant), a PMP (Portable Multimedia Player), or a tablet PC), a wearable device, or a server, and there is no limit on its type or number.

The processor may be configured to process the instructions of a computer program by performing basic arithmetic, logic, and input/output operations, such as those of an image optimization method for compressed package files. The instructions may be provided to the processor by the memory or the network interface, or through the bus. The processor may be configured to execute program code, and the program code may be stored in a recording device such as the memory.

The processor may be configured to implement the functions according to the present embodiment. Depending on how each function is implemented, the processor may be configured such that some components are omitted, additional components not shown are included, or two or more components are combined.

The memory is a computer-readable recording medium and may include RAM (random access memory), ROM (read only memory), and a permanent mass storage device such as a disk drive. The memory may store an OS, program execution code, and other information. These software components may be loaded from a separate computer-readable recording medium such as a floppy drive, disk, tape, DVD/CD-ROM drive, or memory card. In another embodiment, the software components may be loaded into the memory through the network interface rather than from a computer-readable recording medium.

The network interface may be a computer hardware component for connecting each component that performs a function according to the present embodiment to a computer network. The network interface may connect to the computer network over a wireless or wired connection.

The bus may enable communication and data transfer between components. The bus may be configured using one or more of a high-speed serial bus, a parallel bus, a SAN (Storage Area Network), and other communication methods.

An artificial intelligence learning model may be used. The convolutional neural network consists of a feature extraction network and a classification network. The feature extraction network processes the input signal through convolution layers and pooling layers stacked in sequence. A convolution layer comprises a convolution operation, a convolution filter, and an activation function. The size of the convolution filter is adjusted according to the matrix size of the input, but a 9x9 matrix is generally used. The activation function is generally, but not limited to, a ReLU function, a sigmoid function, or a tanh function. A pooling layer reduces the matrix size of its input by grouping the pixels of a given region and extracting a representative value. The pooling operation generally, but not necessarily, uses the average or the maximum value. The operation is performed over a square matrix, generally a 9x9 matrix. Convolution layers and pooling layers alternate repeatedly until the input becomes sufficiently small while still preserving its distinguishing differences.
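A single convolution layer followed by a pooling layer can be sketched in plain Python. For readability this example assumes a 2x2 filter and a 2x2 pooling window on a 5x5 input rather than the 9x9 sizes mentioned above; the filter values and the input are hypothetical, and the "convolution" is the cross-correlation form commonly used in CNN libraries.

```python
def relu(x):
    """ReLU activation: negative values become zero."""
    return x if x > 0 else 0.0


def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation form) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            relu(sum(image[r + i][c + j] * kernel[i][j]
                     for i in range(kh) for j in range(kw)))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature, size):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(feature[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature[0]) - size + 1, size)
        ]
        for r in range(0, len(feature) - size + 1, size)
    ]


image = [
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [1, 2, 3, 4, 5],
]
kernel = [[1, 0], [0, -1]]  # hypothetical diagonal-difference filter
features = conv2d(image, kernel)   # 4x4 feature map
pooled = max_pool(features, 2)     # 2x2 map after pooling
```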

The classification network has hidden layers and an output layer. In the convolutional neural network there are generally three or more hidden layers, each with 100 nodes, although more or fewer nodes may be used as the case requires. The activation function of the hidden layers may be, but is not limited to, a ReLU function, a sigmoid function, or a tanh function. The output layer of the convolutional neural network may have 100 nodes in total. The output layer of the classification network uses the softmax function as its activation function. The softmax function normalizes the outputs so that all output nodes sum to 1; combined with one-hot encoding, the output node with the largest value is set to 1 and the remaining output nodes are set to 0. In this way a single output can be selected from among the 100 outputs.
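The relationship between the softmax output and the one-hot selection can be sketched as follows, assuming three output nodes instead of 100 for brevity. The logit values are hypothetical.

```python
import math


def softmax(logits):
    """Map raw outputs to probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot_argmax(probs):
    """Set the largest output to 1 and all others to 0."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return [1 if i == best else 0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
selected = one_hot_argmax(probs)
```

Softmax itself produces graded probabilities; the hard 1-or-0 selection is a separate step applied to its output.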

The training signal for training the convolutional neural network may be generated manually by the control device administrator and may include the information to be corrected. The control device administrator may manually enter the information to be corrected based on a judgment that a training signal needs to be generated, and the database information may thereby be strengthened or weakened depending on the type of training signal.

The training signal is generated from the error between the correct answer and the output value and, depending on the case, may use SGD with the delta rule, a batch method, or a method following the backpropagation algorithm. The convolutional neural network learns by modifying its existing weights according to this training signal, and may use momentum in some cases. A cost function may be used to compute the error, and the cross-entropy function may be used as the cost function.
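A weight update driven by the cross-entropy error, using SGD with momentum, can be sketched as follows. To keep the example short, the trainable parameters are the logits themselves rather than the weights of a full network; the learning rate, momentum value, and iteration count are assumptions.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Cost between the predicted distribution and the correct class."""
    return -math.log(probs[target])


def sgd_momentum_step(weights, velocity, grads, lr=0.1, momentum=0.9):
    """One update: v <- momentum * v - lr * grad, then w <- w + v."""
    new_v = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_w = [w + v for w, v in zip(weights, new_v)]
    return new_w, new_v


logits = [0.0, 0.0, 0.0]   # toy parameters, trained directly
velocity = [0.0, 0.0, 0.0]
target = 2
losses = []
for _ in range(20):
    probs = softmax(logits)
    losses.append(cross_entropy(probs, target))
    # Gradient of cross-entropy with respect to the logits: probs - one_hot(target).
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    logits, velocity = sgd_momentum_step(logits, velocity, grads)
```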

The convolutional neural network may learn, based on the training signal, to modify the information in the database. The pre-trained convolutional neural network has three or more hidden layers, and each hidden layer may have 50 or more hidden nodes. The activation function of each hidden node may be, but is not limited to, a ReLU function, a sigmoid function, or a tanh function. The output nodes may use a softmax function together with one-hot encoding. Under one-hot encoding the output selects exactly one class, and a command may be executed according to the selected class.

The above description merely illustrates the technical idea disclosed in this specification, and those of ordinary skill in the art to which the disclosed embodiments pertain will be able to make various modifications and variations without departing from the essential characteristics of those embodiments.

Accordingly, the embodiments disclosed in this specification are intended to explain, not to limit, the technical idea disclosed herein, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the technical idea disclosed in this specification should be interpreted according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of this specification.

Claims

A method, performed by a device, for analyzing product data in real time and updating product information using AI, the method comprising:
collecting product data by collecting, for user modeling, user data to be used for machine learning through an embedded script module and a user log collector module;
evaluating a product based on the product data collected through a real-time data analysis processing system module connected to the user log collector module and a managed ETL service module connected to the real-time data analysis processing system module, wherein evaluating the product includes scoring the product, evaluating the product includes continuously performing model optimization for the product subject and additional management of new models through MLOps, and the managed ETL service module includes a data catalog;
providing, by an interactive query service module, an interactive query service for the data catalog based on data loaded from the managed ETL service, wherein the interactive query service module is connected to a cloud machine learning platform module, and machine learning on the data is performed in the cloud machine learning platform module; and
linking the evaluated product to one or more media and managing synchronization with the media in real time, wherein managing the synchronization in real time is performed through DevOps,
wherein the embedded script module includes JavaScript,
wherein the user log collector module includes a Java daemon, which is a process executed in background mode on a kernel to handle periodic service requests and which periodically acquires the data required for the user modeling,
wherein the data is clustered in the user modeling,
wherein the method further comprises:
calculating a bidding score;
setting a bidding limit based on a budget input by a shopping mall or an individual seller; and
filtering the product data by at least one of a seller, a brand, and a category of the product,
wherein the interactive query service module includes a data visualization module that visualizes and provides data,
wherein the user modeling is based on Look-Alike Modeling,
wherein, in the Look-Alike Modeling, a trait or segment, a time interval, and first-party and third-party data sources are selected in order to find new and unique audiences through automated data analysis, the selections provide the input to an algorithmic model, an analytics process is executed, and users are found based on the shared characteristics of the selected population,
wherein the traits are used to build segments that combine rules-based traits, and include Boolean expressions and comparison operators, and
wherein the algorithm in the Look-Alike model is configured to run at regular intervals, to extract values from all available trait data, and to operate together with a server-side discovery and qualification process that evaluates the device's own data and selected third-party data to which access is permitted, without spending time guessing at traits or spending campaign resources to discover new audiences.

(Deleted)