WO2015129983A1 - Device and method for recommending movie on basis of distributed mining of fuzzy association rules

Info

Publication number: WO2015129983A1
Application number: PCT/KR2014/010252
Authority: WO
Inventors: 김민성
Original assignee: 에스케이플래닛 주식회사
Priority date: 2014-02-26
Filing date: 2014-10-29
Publication date: 2015-09-03
Also published as: KR20150101341A; KR102167593B1

Abstract

Disclosed are a device and a method for recommending a movie on the basis of distributed mining of fuzzy association rules. First rating data comprising a movie rating is acquired, the acquired first rating data is converted into second rating data by using a fuzzy membership function, and a list of associated movies is generated by applying the converted second rating data to mining of fuzzy association rules. A movie is recommended in the descending order of ratings by using the list of associated movies that has been generated, thus enabling recommendation of a movie that is associated with a user to whom a movie is to be recommended.

Description

Movie recommendation device and method based on distributed fuzzy association rule mining

The present invention relates to a movie recommendation device and method based on distributed fuzzy association rule mining, which convert users' movie rating information into linguistic information and recommend movies through the associations among that linguistic information. In particular, it relates to a device and method that generate a related movie list by applying movie rating data to fuzzy association rule mining and use the generated list to recommend movies to a recommendation target user.

This application claims the benefit of the filing date of Korean Patent Application No. 10-2014-0022934, filed on February 26, 2014, the entire contents of which are incorporated herein.

Conventional movie recommendation algorithms typically apply association rule mining to users' movie purchase histories and recommend in the form 'movies watched by people who watched this movie'. Alternatively, they compute the similarity between movies and recommend by extracting 'movies similar to this movie'.

However, when the rating log for movies is large and there are many users, association rules are difficult to compute with conventional methods, so examples of applying fuzzy association rule mining to the movie recommendation domain are hard to find.

Accordingly, there is a pressing need for a movie recommendation technique based on distributed fuzzy association rule mining that converts users' movie ratings into linguistic information, obtains the associations among those ratings through distributed fuzzy association rule mining on the linguistic information, and uses the obtained associations to recommend movies suited to the user.

Related prior art includes Korean Patent Application Publication No. 10-2013-0009360A, published July 30, 2013 (title: Method and system for providing a movie recommendation service).

An object of the present invention is to recommend movies suited to a user's preferences by replacing the ratings that users leave for movies with linguistic evaluation information and recommending movies associated with the replaced evaluation information.

Another object of the present invention is to provide users with a more reliable movie recommendation function by efficiently processing large volumes of movie rating information using a data processing scheme suited to a distributed framework.

To achieve the above objects, a movie recommendation device according to the present invention includes: a data acquisition unit that acquires first rating data including ratings for movies; an association list generation unit that converts the acquired first rating data into second rating data and generates a related movie list by applying the converted second rating data to fuzzy association rule mining; and a movie recommendation unit that recommends movies to a recommendation target user using the related movie list.

The association list generation unit may obtain fuzzy membership values by substituting a rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and may convert the first rating data into the second rating data by replacing the rating with the linguistic label corresponding to the obtained fuzzy membership values.

The association list generation unit may generate one or more of a fuzzy confidence and a fuzzy correlation using fuzzy association rule mining, and may generate the related movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

The association list generation unit may include an association combination generation unit that generates per-movie association combinations by combining the converted second rating data according to fuzzy association rules, and a fuzzy support calculation unit that generates a per-movie rating history by organizing the second rating data by movie and calculates a per-movie fuzzy support using the generated rating history.

The fuzzy support calculation unit may calculate the per-movie fuzzy support using a reference value obtained by normalizing the fuzzy membership values.

The association list generation unit may calculate an association-combination fuzzy support for each per-movie association combination by combining at least two of the fuzzy membership functions, and may calculate the fuzzy confidence using one or more of the per-movie fuzzy support and the calculated association-combination fuzzy support.

The association list generation unit may calculate the fuzzy correlation using one or more of the per-movie fuzzy support, the fuzzy confidence, and the squared value of the per-movie fuzzy support.

The movie recommendation unit may rank the generated related movie lists according to a preset importance and recommend movies in order, starting from the highest-ranked related movie list.

The related movie list may include one or more of the title, genre, director, country, production year, and image of each movie.

A movie recommendation method according to the present invention includes: acquiring input data including ratings for movies; converting the acquired input data into second rating data and generating a related movie list by applying the converted second rating data to fuzzy association rule mining; and recommending movies to a recommendation target user using the generated related movie list.

Generating the related movie list may include obtaining fuzzy membership values by substituting a rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and the rating may be replaced with the linguistic label corresponding to the obtained fuzzy membership values to produce the second rating data.

Generating the related movie list may include generating one or more of a fuzzy confidence and a fuzzy correlation using association rule mining, and generating the related movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

Generating the related movie list may include generating one or more of a fuzzy confidence and a fuzzy correlation using fuzzy association rule mining, and generating the related movie list based on one of the generated fuzzy confidence and fuzzy correlation.

Generating the related movie list may include calculating an association-combination fuzzy support for each per-movie association combination by combining at least two of the fuzzy membership functions, and calculating the fuzzy confidence using one or more of the per-movie fuzzy support and the calculated association-combination fuzzy support.

Generating the related movie list may include calculating the fuzzy correlation using one or more of the per-movie fuzzy support, the fuzzy confidence, and the squared value of the per-movie fuzzy support.

Recommending the movies may include ranking the generated related movie lists according to a preset importance, and the movies may be recommended in order, starting from the highest-ranked related movie list.

According to the present invention, rating information for movies is acquired from a large number of users and a list of related movies is extracted from it, so that movies matching the preferences of a recommendation target user can be recommended using that user's own ratings.

In addition, because the present invention derives associations after replacing movie ratings with linguistic information, it can generate and provide a variety of recommended movie lists based on the different directions of association between linguistic labels.

FIG. 1 is a block diagram illustrating a movie recommendation device according to an embodiment of the present invention.

FIG. 2 is a block diagram illustrating the association list generation unit of the movie recommendation device of FIG. 1.

FIG. 3 is a flowchart illustrating a movie recommendation method according to an embodiment of the present invention.

FIG. 4 is a flowchart illustrating a process of generating a related movie list according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating an example of users' first rating data for movies.

FIGS. 6A to 6C are diagrams illustrating fuzzy membership functions for generating second rating data according to the present invention.

FIG. 7 is a diagram illustrating the first rating data of FIG. 5 expressed as fuzzy membership values and second rating data using the fuzzy membership function of FIG. 6A.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In the following description and the accompanying drawings, detailed descriptions of well-known functions or configurations that could obscure the subject matter of the present invention are omitted. Note also that the same components are denoted by the same reference numerals wherever possible throughout the drawings.

The terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define terms in order to describe the invention in the best way. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas, and it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing. Terms such as "first" and "second" are used to describe various components only for the purpose of distinguishing one component from another and are not used to limit those components.

Referring to FIG. 1, a movie recommendation device (100) according to an embodiment of the present invention may include a data acquisition unit (110), an association list generation unit (120), and a movie recommendation unit (130).

The data acquisition unit (110) may acquire first rating data including users' ratings for movies.

The first rating data may include information obtainable from an input log, namely a user ID identifying the user, a movie ID identifying the movie, and a numeric rating. First rating data may also be obtained from other forms of input log, provided that the user ID, movie ID, and rating can be extracted from them. For example, to process the first rating data as a single transaction, it may be written in the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)'. Here, User1 is a user ID; m1, m3, and m100 are movie IDs; and r_1_1, r_1_3, and r_1_100 are ratings, where r_n_m denotes the rating given by the n-th user to the m-th movie.
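The transaction form described above can be sketched as follows. This is a minimal illustration only: the function name `build_transactions` and the sample log records are hypothetical and do not appear in the specification.

```python
from collections import defaultdict

def build_transactions(log_records):
    """Group (user_id, movie_id, rating) records into one transaction per user."""
    transactions = defaultdict(list)
    for user_id, movie_id, rating in log_records:
        transactions[user_id].append((movie_id, rating))
    return dict(transactions)

# Hypothetical input log: user ID, movie ID, numeric rating.
log_records = [
    ("User1", "m1", 8),
    ("User1", "m3", 3),
    ("User2", "m1", 9),
]
transactions = build_transactions(log_records)
```

Each dictionary entry then corresponds to one transaction of the form 'User1(m1, r_1_1), (m3, r_1_3), ...'.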

The association list generation unit (120) may convert the acquired first rating data into second rating data and generate a related movie list by applying the converted second rating data to fuzzy association rule mining.

Fuzzy association rule mining applies fuzzy theory to association rule mining: instead of the binary logic in which each object either belongs or does not belong to a set, the degree to which each object belongs to the set is expressed by a membership function. Thus, as in association rule mining, the association between consumed items can be computed from user logs, which allows the association between items registered in a market or store to be calculated.

Such fuzzy association rules are usually computed on a single machine, but large-scale recommendation requires distributed processing of this logic. In the present invention, Hadoop MapReduce may therefore be used to compute fuzzy association rule mining more efficiently on a distributed framework.

In MapReduce, the problem is solved by defining a <key, value> pair for each mapper and reducer stage. A <key, value> pair is the basic unit in which data is processed, and the key and value may be defined as arbitrary structures or classes so that complex data can be handled.
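The <key, value> flow can be illustrated with a small in-memory simulation. This sketch does not use the Hadoop API; `mapper`, `reducer`, and `run_job` are hypothetical names, and the choice of the movie ID as the key is an assumption made for illustration.

```python
from collections import defaultdict

def mapper(user_id, rated_movies):
    """Emit one <movie_id, (user_id, rating)> pair per rated movie."""
    for movie_id, rating in rated_movies:
        yield movie_id, (user_id, rating)

def reducer(movie_id, values):
    """Collect all (user_id, rating) values that share the same key."""
    return movie_id, sorted(values)

def run_job(transactions):
    """Simulate the shuffle phase in memory: group mapper output by key."""
    shuffled = defaultdict(list)
    for user_id, rated_movies in transactions.items():
        for key, value in mapper(user_id, rated_movies):
            shuffled[key].append(value)
    return dict(reducer(key, values) for key, values in shuffled.items())

# Hypothetical per-user transactions.
transactions = {"user1": [("m1", 8), ("m3", 3)], "user2": [("m1", 9)]}
per_movie = run_job(transactions)
```

In a real cluster the grouping by key is performed by the framework between the map and reduce stages; here it is done with a dictionary so that the example stays self-contained.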

Fuzzy membership values may be obtained by substituting a rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and the rating may be replaced with the linguistic label corresponding to the obtained fuzzy membership values to produce the second rating data.

When a rating included in the first rating data is substituted into a fuzzy membership function, a fuzzy membership value between 0 and 1 is obtained for the corresponding linguistic label, and the first rating data is converted into the second rating data by replacing the rating with the linguistic label that has the largest fuzzy membership value. For example, if a rating of 8 in the first rating data yields a fuzzy membership value of 0.3 for the label 'normal' and 0.7 for the label 'good', the rating 8 is replaced with 'good' to generate the second rating data.
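The conversion can be sketched with triangular membership functions. The breakpoints below (a 0 to 10 scale with peaks at 0, 5, and 10) are assumptions chosen for illustration; the specification does not state them, so the resulting membership values differ from the 0.3 and 0.7 of the example above.

```python
def triangular(x, left, peak, right):
    """Triangular membership; left == peak or peak == right gives a shoulder."""
    if x < left or x > right:
        return 0.0
    if x == peak:
        return 1.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical (left, peak, right) breakpoints for each linguistic label.
LABELS = {
    "dislike": (0, 0, 5),
    "normal": (0, 5, 10),
    "good": (5, 10, 10),
}

def fuzzify(rating):
    """Return the label with the largest membership and all membership values."""
    memberships = {
        label: triangular(rating, *params) for label, params in LABELS.items()
    }
    best_label = max(memberships, key=memberships.get)
    return best_label, memberships

label, memberships = fuzzify(8)
```

With these assumed breakpoints, a rating of 8 has a larger membership in 'good' than in 'normal', so it is replaced with 'good'.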

Each fuzzy membership function corresponds to a linguistic label such as 'dislike', 'normal', or 'good', and the range and number of the fuzzy membership functions may be specified separately.

The association list generation unit (120) may generate one or more of a fuzzy confidence and a fuzzy correlation using fuzzy association rule mining, and may generate the related movie list based on at least one of the generated fuzzy confidence and fuzzy correlation.

The association list generation unit (120) may also generate per-movie association combinations by combining the converted second rating data according to fuzzy association rules, generate a per-movie rating history by organizing the second rating data by movie, and calculate a per-movie fuzzy support using the generated rating history.

The per-movie association combinations may be generated with a preset association rule length according to the fuzzy association rules. For example, when the association rule length is 2, a movie combination for movies m1 and m3 is generated as 'm1, m3, user1, r_1_1, r_1_3', and the per-user ratings for that movie combination are gathered to collect data such as 'm1, m3, (user1, r_1_1, r_1_3), (user7, r_7_1, r_7_3), ... , (userN, r_N_1, r_N_3)'. Here, r_n_m denotes the rating given by the n-th user to the m-th movie.
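A minimal sketch of generating combinations of a preset length and gathering the per-user entries for each combination is given below. The function name `build_pair_records` and the sample data are hypothetical; ordering each combination by movie ID is an assumption made so that the same pair always maps to the same key.

```python
from collections import defaultdict
from itertools import combinations

def build_pair_records(transactions, rule_length=2):
    """Collect, per movie combination, the (user, label, ...) entries of each user."""
    pair_records = defaultdict(list)
    for user_id, rated in transactions.items():
        # Sort by movie ID so that (m1, m3) and (m3, m1) share one key.
        for combo in combinations(sorted(rated), rule_length):
            movies = tuple(movie for movie, _ in combo)
            labels = tuple(label for _, label in combo)
            pair_records[movies].append((user_id,) + labels)
    return dict(pair_records)

# Hypothetical second rating data: linguistic labels instead of numeric ratings.
transactions = {
    "user1": [("m1", "good"), ("m3", "dislike")],
    "user7": [("m3", "good"), ("m1", "good"), ("m5", "normal")],
}
pair_records = build_pair_records(transactions)
```

The entry for the key ('m1', 'm3') then has the shape 'm1, m3, (user1, r_1_1, r_1_3), (user7, r_7_1, r_7_3)'.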

The per-movie rating history may be built, for example, by gathering second rating data of the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)' by movie into the form 'm1, (user1, r_1_1), (user2, r_2_1), ... , (userN, r_N_1)'.

The per-movie fuzzy support may be calculated using a reference value obtained by normalizing the fuzzy membership values. For example, each fuzzy membership value may be normalized using Equation 1 below. The terms of Equation 1 are the membership value that a_j takes for the l-th membership function, the normalized value, and t_i[a_j], which is the value of the i-th record in the transaction database T.

[Equation 1]
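A minimal sketch of one common normalization is given below, assuming (this is not confirmed by the text) that each membership value is divided by the sum of the membership values over all membership functions for the same rating. The function name `normalize_memberships` and the sample values are hypothetical.

```python
def normalize_memberships(memberships):
    """Scale membership values so they sum to 1 (assumed normalization)."""
    total = sum(memberships.values())
    if total == 0:
        # No membership function fires for this rating; keep all zeros.
        return {label: 0.0 for label in memberships}
    return {label: value / total for label, value in memberships.items()}

# Hypothetical raw membership values for one rating.
raw = {"dislike": 0.0, "normal": 0.6, "good": 0.9}
normalized = normalize_memberships(raw)
```

Under this assumption the normalized values for 'normal' and 'good' are approximately 0.4 and 0.6, so that a single rating contributes a total weight of 1 across its labels.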

Using the reference value obtained by normalizing the fuzzy membership values in this way, the per-movie fuzzy support may be calculated as in the following equation.

The fuzzy support may be calculated using Equation 2 below. The terms of Equation 2 are the membership value that a_j takes for the l-th membership function, the normalized value, t_i[a_j], which is the value of the i-th record in the transaction database T, and FS_<A,X>, which is the fuzzy support calculated by substituting the calculated reference value.

[Equation 2]
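The following sketch assumes a definition of fuzzy support that is common in the fuzzy association rule literature, namely the sum of the normalized membership values of a label over all transactions divided by the number of transactions |T|. Whether Equation 2 has exactly this form is an assumption; `fuzzy_support` and the sample records are hypothetical.

```python
def fuzzy_support(records, label, num_transactions):
    """Sum the normalized memberships of `label` and divide by |T| (assumed form)."""
    if num_transactions == 0:
        return 0.0
    total = sum(record.get(label, 0.0) for record in records)
    return total / num_transactions

# Hypothetical normalized membership values of three users for one movie,
# in a transaction database that holds four transactions in total.
records_m1 = [
    {"good": 0.6, "normal": 0.4},
    {"good": 1.0},
    {"dislike": 1.0},
]
fs_m1_good = fuzzy_support(records_m1, "good", num_transactions=4)
```

A user who did not rate the movie contributes nothing to the sum but still counts toward |T|, which is why the number of transactions is passed separately.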

In addition, an association-combination fuzzy support may be calculated for each per-movie association combination by combining at least two of the fuzzy membership functions, and the fuzzy confidence may be calculated using one or more of the per-movie fuzzy support and the calculated association-combination fuzzy support.

The association-combination fuzzy support may be expressed in the form 'm1, m2, MF_1, MF_2, FS(m1, MF_1, m2, MF_2)', where m denotes a movie, MF a fuzzy membership function, and FS a fuzzy support. For example, the fuzzy confidence may be calculated using Equation 3 below. The terms of Equation 3 are the membership value that a_j takes for the l-th membership function, the normalized value, t_i[a_j], which is the value of the i-th record in the transaction database T, FS_<A,X> and FS_<C,Z>, which are fuzzy supports calculated by substituting the reference value, and FC_<<A,X>,<B,Y>>, which is the fuzzy confidence.

[Equation 3]
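A minimal sketch of the fuzzy confidence follows. It assumes the usual definition from the fuzzy association rule literature, in which the support of the combination (the per-transaction product of the membership values, summed and divided by |T|) is divided by the support of the antecedent. This form, the function names, and the sample values are assumptions, not taken from the equation.

```python
def combination_support(pair_records, label_a, label_b, num_transactions):
    """Support of (movie A is label_a, movie B is label_b), assumed product form."""
    total = sum(
        memberships_a.get(label_a, 0.0) * memberships_b.get(label_b, 0.0)
        for memberships_a, memberships_b in pair_records
    )
    return total / num_transactions

def fuzzy_confidence(support_combination, support_antecedent):
    """Ratio of the combination support to the antecedent support."""
    if support_antecedent == 0:
        return 0.0
    return support_combination / support_antecedent

# Hypothetical normalized memberships of two users who rated both m1 and m2.
pair_records = [
    ({"good": 1.0}, {"good": 0.5, "normal": 0.5}),
    ({"good": 0.5, "normal": 0.5}, {"good": 1.0}),
]
fs_pair = combination_support(pair_records, "good", "good", num_transactions=4)
fc = fuzzy_confidence(fs_pair, support_antecedent=0.5)
```

Here `support_antecedent` stands for the per-movie fuzzy support of the first movie and label; its value of 0.5 is supplied directly for illustration.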

In addition, the fuzzy correlation may be calculated using one or more of the per-movie fuzzy support, the fuzzy confidence, and the squared value of the per-movie fuzzy support. For example, the fuzzy correlation may be calculated using Equation 4 below. The terms of Equation 4 are the membership value that a_j takes for the l-th membership function, the normalized value, t_i[a_j], which is the value of the i-th record in the transaction database T, FS_<A,X> and FS_<C,Z>, which are fuzzy supports calculated by substituting the reference value, FC_<<A,X>,<B,Y>>, which is the fuzzy confidence, and CORR_<<A,X>,<B,Y>>, which is the fuzzy correlation.

[Equation 4]
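The following sketch assumes that the fuzzy correlation is a Pearson-style correlation of the membership values, built from the fuzzy supports, the support of the squared membership values, and the support of the combination. This is one formulation found in the literature and is an assumption here; `fuzzy_correlation` and the sample values are hypothetical.

```python
import math

def fuzzy_correlation(values_a, values_b):
    """Correlation of two membership-value columns over the same transactions."""
    n = len(values_a)
    fs_a = sum(values_a) / n                                    # support of A
    fs_b = sum(values_b) / n                                    # support of B
    fs_ab = sum(a * b for a, b in zip(values_a, values_b)) / n  # support of A and B
    fs_a_sq = sum(a * a for a in values_a) / n                  # support of squared values
    fs_b_sq = sum(b * b for b in values_b) / n
    var_a = fs_a_sq - fs_a ** 2
    var_b = fs_b_sq - fs_b ** 2
    if var_a <= 0 or var_b <= 0:
        return 0.0  # a constant column carries no correlation information
    return (fs_ab - fs_a * fs_b) / math.sqrt(var_a * var_b)

# Hypothetical membership values over four transactions.
same_direction = fuzzy_correlation([1.0, 0.5, 0.0, 0.5], [1.0, 0.5, 0.0, 0.5])
opposite_direction = fuzzy_correlation([1.0, 0.5, 0.0, 0.5], [0.0, 0.5, 1.0, 0.5])
```

Values near 1 indicate that the two labels tend to occur together, and values near -1 that they tend to exclude each other.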

The movie recommendation unit (130) may recommend movies to the recommendation target user using the related movie list. For example, by showing the movies in the related movie list to the target user in descending order of association, movies can be recommended starting with those best suited to the user.

The generated related movie lists may be ranked according to a preset importance, and movies may be recommended in order, starting from the highest-ranked related movie list. For example, using the linguistic labels in the second rating data, a 'good -> good' or 'dislike -> good' relation between movies can be used to draw in more users who like the movies suggested by the recommendation service, whereas a 'good -> dislike' or 'dislike -> dislike' relation is better suited to arousing curiosity than to directly inducing a purchase. A relation involving 'normal' is difficult to connect to a direct recommendation function and may be unnecessary information from the standpoint of the recommendation service. Accordingly, 'good -> good' and 'dislike -> good' relations may be given a high rank, while 'good -> dislike', 'dislike -> dislike', and relations involving 'normal' may be given a relatively low rank.
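The ranking can be sketched as a sort on a preset importance table. The exact order inside the high-rank and low-rank groups is an assumption, as are the names `IMPORTANCE` and `rank_lists`; the text only states which relations rank high and which rank low.

```python
# Hypothetical preset importance; a smaller number means a higher rank.
IMPORTANCE = {
    ("good", "good"): 0,
    ("dislike", "good"): 1,
    ("good", "dislike"): 2,
    ("dislike", "dislike"): 3,
}
LOWEST_RANK = len(IMPORTANCE)  # used for any relation involving 'normal'

def rank_lists(related_lists):
    """Sort related movie lists so the most important relation comes first."""
    return sorted(
        related_lists,
        key=lambda item: IMPORTANCE.get(item["relation"], LOWEST_RANK),
    )

related_lists = [
    {"relation": ("good", "dislike"), "movies": ["B"]},
    {"relation": ("normal", "good"), "movies": ["C"]},
    {"relation": ("good", "good"), "movies": ["B", "D"]},
]
ranked = rank_lists(related_lists)
```

Because `sorted` is stable, lists that share the same importance keep their original relative order.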

When the same movie appears in more than one related movie list, the duplicate may be deleted from the lower-ranked list. For example, if movie B appears both in the list of movies rated 'good' by users who rated movie A 'good' and in the list of movies rated 'dislike' by users who rated movie A 'good', the ranks of the two lists are checked and movie B is deleted from the lower-ranked list, that is, the list of movies rated 'dislike' by users who rated movie A 'good'.
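The duplicate removal can be sketched as a single pass over lists that are already ordered by rank. The function name `drop_duplicates` and the sample lists are hypothetical.

```python
def drop_duplicates(ranked_lists):
    """Keep each movie only in the highest-ranked list in which it appears."""
    seen = set()
    result = []
    for relation, movies in ranked_lists:
        kept = [movie for movie in movies if movie not in seen]
        seen.update(kept)
        result.append((relation, kept))
    return result

# Lists for movie A, ordered from highest to lowest rank.
ranked_lists = [
    ("good -> good", ["B", "D"]),
    ("good -> dislike", ["B", "E"]),
]
deduplicated = drop_duplicates(ranked_lists)
```

Movie B remains in the 'good -> good' list and is removed from the 'good -> dislike' list.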

With such a movie recommendation device, movies associated with a user are recommended on the basis of the rating information that users leave for movies, so that movies suited to the user's preferences can be recommended.

도 2를 참조하면, 도 1의 영화 추천 장치 중 연관 목록 생성부(120)는 연관 조합 생성부(210) 및 퍼지 지지도 계산부(220)를 포함한다.Referring to FIG. 2, the association list generator 120 of the movie recommendation apparatus of FIG. 1 includes an association combination generator 210 and a fuzzy support calculator 220.

The association combination generator 210 may generate per-movie association combinations by combining the converted second rating data according to fuzzy association rules.

In this case, the per-movie association combinations may be generated with a preset association rule length. For example, when the rule length is 2, a movie combination for movies m1 and m3 is generated as 'm1, m3, user1, r_1_1, r_1_3', and the per-user ratings for that combination are gathered to collect data such as 'm1, m3, (user1, r_1_1, r_1_3), (user7, r_7_1, r_7_3), ... , (userN, r_N_1, r_N_3)'. Here, r_n_m denotes the rating given by the n-th user to the m-th movie.
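For a rule length of 2, the grouping described above can be sketched as below. The function name and the nested-dictionary input layout are assumptions made for illustration. The pairs are unordered here, so the direction of a rule would have to be decided in a later step.

```python
from collections import defaultdict
from itertools import combinations


def build_pair_combinations(transactions):
    """Group per-user ratings by movie pair (association rule length 2).

    transactions maps user id -> {movie id: rating or linguistic label}.
    Returns {(movie_a, movie_b): [(user, rating_a, rating_b), ...]}.
    """
    pairs = defaultdict(list)
    for user, ratings in transactions.items():
        # Every pair of movies rated by the same user yields one record.
        for movie_a, movie_b in combinations(sorted(ratings), 2):
            pairs[(movie_a, movie_b)].append((user, ratings[movie_a], ratings[movie_b]))
    return dict(pairs)


transactions = {
    "user1": {"m1": "dislike", "m3": "like"},
    "user7": {"m1": "like", "m3": "like", "m5": "neutral"},
}
pairs = build_pair_combinations(transactions)
```

Here `pairs[("m1", "m3")]` collects the records of user1 and user7, which corresponds to the 'm1, m3, (user1, ...), (user7, ...)' form given in the text.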

The fuzzy support calculator 220 may generate a per-movie rating history by organizing the second rating data by movie, and may calculate the per-movie fuzzy support using the generated per-movie rating history.

In this case, the per-movie rating history may be produced by gathering second rating data of the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)' by movie into the form 'm1, (user1, r_1_1), (user2, r_2_1), ... , (userN, r_N_1)'.
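The regrouping from per-user transactions to per-movie histories amounts to inverting the user-to-movie mapping. A minimal sketch, with a hypothetical function name and input layout:

```python
from collections import defaultdict


def build_movie_histories(transactions):
    """Invert {user: {movie: rating}} into {movie: [(user, rating), ...]}."""
    histories = defaultdict(list)
    for user, ratings in transactions.items():
        for movie, rating in ratings.items():
            histories[movie].append((user, rating))
    return dict(histories)


transactions = {
    "user1": {"m1": "dislike", "m3": "like"},
    "user2": {"m1": "like"},
}
histories = build_movie_histories(transactions)
```

In a distributed setting the same regrouping would typically be expressed as a group-by-key step with the movie ID as the key; the text does not specify the framework, so the sketch stays in plain Python.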

In this case, the per-movie fuzzy support may be calculated using a reference value obtained by normalizing the fuzzy membership values. For example, each fuzzy membership value may be normalized using Equation 1 described above.

In addition, the fuzzy support may be calculated using Equation 2 described above.
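Equations 1 and 2 are given earlier in the document and are not reproduced in this passage, so the following is only a sketch under stated assumptions: normalization is assumed to scale one rating's membership values so that they sum to 1, and the fuzzy support of a (movie, label) pair is assumed to be the normalized membership averaged over all users. The function names and the `n_users` parameter are hypothetical.

```python
def normalize(memberships):
    """Assumed form of Equation 1: scale membership values so they sum to 1."""
    total = sum(memberships.values())
    if total == 0:
        return dict.fromkeys(memberships, 0.0)
    return {label: value / total for label, value in memberships.items()}


def fuzzy_support(history, label, n_users):
    """Assumed form of Equation 2: mean normalized membership of `label`.

    history is a list of (user, memberships) for one movie. Users who did not
    rate the movie are absent from the history and contribute 0 to the sum.
    """
    return sum(normalize(memberships)[label] for _, memberships in history) / n_users


history_m4 = [
    ("user1", {"dislike": 0.0, "neutral": 0.33, "like": 0.67}),
    ("user2", {"dislike": 0.0, "neutral": 0.0, "like": 1.0}),
]
support = fuzzy_support(history_m4, "like", n_users=4)  # (0.67 + 1.0) / 4
```

If the actual Equations 1 and 2 differ (for example, in the choice of denominator), only the bodies of these two functions would need to change.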

Referring to FIG. 3, the movie recommendation method according to an embodiment of the present invention acquires first rating data including users' ratings of movies (S310).

In this case, the first rating data may include information obtainable from an input log, namely a user ID identifying the user, a movie ID identifying the movie, and a rating in numeric form. The first rating data can also be acquired from other types of input log, as long as the user ID, movie ID, and rating can be extracted from them. For example, in order to process the first rating data as a single transaction, it may be expressed in the form 'User1(m1, r_1_1), (m3, r_1_3), ... , (m100, r_1_100)'. Here, User1 is a user ID; m1, m3, and m100 are movie IDs; and r_1_1, r_1_3, and r_1_100 are ratings, where the rating given by the n-th user to the m-th movie is written r_n_m.
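Step S310 can be illustrated with a small parser that turns log lines into one transaction per user. The log format is not specified in the text, so the comma-separated 'user_id,movie_id,rating' layout and the function name below are purely hypothetical.

```python
from collections import defaultdict


def parse_rating_log(lines):
    """Build {user_id: {movie_id: rating}} from 'user_id,movie_id,rating' lines.

    The comma-separated layout is an assumption; any log from which the three
    fields can be extracted would serve equally well.
    """
    transactions = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, movie_id, rating = line.split(",")
        transactions[user_id][movie_id] = float(rating)
    return dict(transactions)


log = ["user1,m1,2", "user1,m3,9", "user2,m1,7", ""]
first_rating_data = parse_rating_log(log)
```

Each top-level entry corresponds to one transaction of the form 'User1(m1, r_1_1), (m3, r_1_3), ...'.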

In addition, the movie recommendation method according to an embodiment of the present invention converts the acquired first rating data into second rating data and generates a related movie list by applying the converted second rating data to fuzzy association rule mining (S320).

In addition, the fuzzy support may be calculated using Equation 2 described above.

In this case, the association combination fuzzy support may be expressed in the form 'm1, m2, MF_1, MF_2, FS(m1, MF_1, m2, MF_2)', where m denotes a movie, MF a fuzzy membership function, and FS the fuzzy support. For example, the fuzzy confidence may be calculated using Equation 3 described above.
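Equation 3 is not reproduced in this passage. The sketch below therefore assumes a formulation that is common in fuzzy association rule mining: the joint membership of a pair is the minimum of the two memberships, the combination support is that minimum averaged over all users, and the confidence is the combination support divided by the antecedent's support. All names are hypothetical, and the document's actual equations may differ.

```python
def combo_fuzzy_support(pair_rows, label_a, label_b, n_users):
    """Assumed FS(m_a, label_a, m_b, label_b): mean of min(membership_a, membership_b).

    pair_rows is [(user, memberships_a, memberships_b), ...] for one movie pair.
    """
    total = sum(min(ma[label_a], mb[label_b]) for _, ma, mb in pair_rows)
    return total / n_users


def fuzzy_confidence(combo_support, antecedent_support):
    """Assumed confidence of 'a -> b': FS(a, b) / FS(a), or 0 if FS(a) is 0."""
    if antecedent_support == 0:
        return 0.0
    return combo_support / antecedent_support


rows = [
    ("user1", {"like": 0.5, "dislike": 0.5}, {"like": 1.0, "dislike": 0.0}),
    ("user7", {"like": 1.0, "dislike": 0.0}, {"like": 0.5, "dislike": 0.5}),
]
fs_pair = combo_fuzzy_support(rows, "like", "like", n_users=2)  # (0.5 + 0.5) / 2
fs_m1_like = (0.5 + 1.0) / 2  # support of (m1, 'like') over the same two users
confidence = fuzzy_confidence(fs_pair, fs_m1_like)
```

The minimum is only one possible way to combine two memberships; a product would be an equally plausible reading of "combining at least two fuzzy membership functions".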

In addition, the fuzzy correlation may be calculated using one or more of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support. For example, the fuzzy correlation may be calculated using Equation 4 described above.

In addition, the movie recommendation method according to an embodiment of the present invention recommends movies to a recommendation target user by using the related movie list (S330). For example, the movies in the related movie list may be shown to the recommendation target user in descending order of association, so that the movies best suited to the user are recommended first.

In this case, the generated related movie lists may be ranked according to a preset importance, and movies may be recommended in order, starting from the highest-ranked related movie list. For example, using the linguistic labels included in the second rating data, an association between movies of the form 'like -> like' or 'dislike -> like' can be used to draw in more users who are pleased with the movies recommended by the service, whereas associations of the form 'like -> dislike' and 'dislike -> dislike' may be better suited to arousing curiosity than to directly inducing a purchase. In addition, associations that involve 'neutral' are difficult to connect to any direct function of the recommendation service and may therefore be unnecessary information from the service's point of view. Accordingly, 'like -> like' and 'dislike -> like' associations may be assigned a high importance rank, while 'like -> dislike', 'dislike -> dislike', and associations involving 'neutral' may be assigned a relatively low rank.

With such a movie recommendation method, users of a movie recommendation service can be provided with reliable recommendations based on large-volume user logs.

Referring to FIG. 4, the process of generating a related movie list according to an embodiment of the present invention obtains second rating data by replacing the ratings included in the first rating data with linguistic labels (S410).

In this case, a fuzzy membership value may be obtained by substituting the rating into one or more fuzzy membership functions, including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and the first rating data may be converted into the second rating data by replacing the rating with the linguistic label corresponding to the obtained fuzzy membership value.
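The three membership function families named above have standard textbook definitions, sketched below. The parameter names, and the rule of choosing the label with the largest membership value in `to_label`, are assumptions made for illustration; the text does not state how the label is selected when several memberships are nonzero.

```python
import math


def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 0 at a, 1 on [b, c], 0 at d (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, mean, sigma):
    """Gaussian membership centred on `mean` with width `sigma`."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))


def to_label(memberships):
    """Pick the linguistic label with the largest membership value (assumed rule)."""
    return max(memberships, key=memberships.get)


label = to_label({"dislike": 0.0, "neutral": 0.33, "like": 0.67})
```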

In addition, the process of generating a related movie list according to an embodiment of the present invention generates per-movie association combinations by combining the converted second rating data according to fuzzy association rules (S420).

In addition, the process generates a per-movie rating history by organizing the second rating data by movie (S430), and calculates the per-movie fuzzy support using the generated per-movie rating history (S440).

In addition, using the per-movie association combinations generated in step S420 and the per-movie fuzzy support calculated in step S440, the process calculates an association combination fuzzy support for each per-movie association combination by combining at least two of the fuzzy membership functions (S450).

In addition, the process calculates the fuzzy confidence using one or more of the per-movie fuzzy support and the calculated association combination fuzzy support (S460).

In addition, the process calculates the fuzzy correlation using one or more of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support (S470).

In addition, the process generates the related movie list on the basis of one of the calculated fuzzy confidence and fuzzy correlation (S480).
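The final step (S480) can be sketched as grouping scored rules by their antecedent and ordering the consequent movies by score. The tuple layout of a rule, the function name, and the `min_score` cut-off are assumptions; the text states only that the list is generated on the basis of the fuzzy confidence or the fuzzy correlation.

```python
from collections import defaultdict


def build_related_lists(rules, min_score=0.0):
    """Group rules into related movie lists ordered by descending score.

    rules is an iterable of (movie_a, label_a, movie_b, label_b, score), where
    score is the fuzzy confidence or fuzzy correlation of 'a -> b'.
    Returns {(movie_a, label_a, label_b): [movie_b, ...]}.
    """
    grouped = defaultdict(list)
    for movie_a, label_a, movie_b, label_b, score in rules:
        if score >= min_score:  # hypothetical threshold, not stated in the text
            grouped[(movie_a, label_a, label_b)].append((score, movie_b))
    return {
        key: [movie for _, movie in sorted(items, key=lambda item: -item[0])]
        for key, items in grouped.items()
    }


rules = [
    ("m1", "like", "m3", "like", 0.4),
    ("m1", "like", "m2", "like", 0.8),
    ("m1", "like", "m5", "dislike", 0.6),
    ("m1", "like", "m4", "like", 0.1),
]
related = build_related_lists(rules, min_score=0.2)
```

Each resulting key corresponds to one related movie list, such as "movies liked by users who liked m1".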

Referring to FIGS. 5, 6A to 6C, and 7, when users have rated movies as shown in FIG. 5, the fuzzy membership functions shown in FIGS. 6A to 6C may be used to generate the fuzzy membership values and the second rating data shown in FIG. 7.

For example, when the fuzzy membership function of FIG. 6A is used, FIG. 5 shows that user 1 gave movie 1 a rating of 2. In this case, FIG. 7 shows the fuzzy membership values of user 1 for movie 1 as dislike: 1.0, neutral: 0.0, like: 0.0. Accordingly, the rating of 2 in the first rating data is replaced with the linguistic label 'dislike' to produce the second rating data.

Likewise, user 1 gave movie 4 a rating of 8. In this case, FIG. 7 shows the fuzzy membership values of user 1 for movie 4 as dislike: 0.0, neutral: 0.33, like: 0.67. Accordingly, the rating of 8 in the first rating data is replaced with the linguistic label 'like' to produce the second rating data.
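The two conversions above can be reproduced numerically with piecewise-linear membership functions. FIG. 6A is not available in this text, so the breakpoints below (2, 5, 6, and 9 on the rating axis) are hypothetical values chosen only because they yield the membership values quoted for ratings 2 and 8; the actual functions in the figure may differ.

```python
def falling(x, full, zero):
    """1 up to `full`, then linear down to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """0 up to `zero`, then linear up to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def memberships(rating):
    """Membership values per label, using hypothetical breakpoints 2, 5, 6, 9."""
    return {
        "dislike": round(falling(rating, 2, 5), 2),
        # 'neutral' rises on [2, 5], stays at 1 on [5, 6], and falls on [6, 9].
        "neutral": round(min(rising(rating, 2, 5), falling(rating, 6, 9)), 2),
        "like": round(rising(rating, 6, 9), 2),
    }


def to_label(rating):
    """Replace a numeric rating with the label of largest membership (assumed rule)."""
    values = memberships(rating)
    return max(values, key=values.get)


movie1 = memberships(2)  # user 1, movie 1
movie4 = memberships(8)  # user 1, movie 4
```

With these assumed breakpoints, a rating of 2 maps to 'dislike' and a rating of 8 maps to 'like', as in the text.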

The movie recommendation method according to the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and any type of hardware device specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. Such hardware devices may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

As described above, the movie recommendation device and method based on distributed fuzzy association rule mining according to the present invention are not limited to the configurations and methods of the embodiments described above; all or part of the embodiments may be selectively combined so that various modifications can be made.

According to the present invention, first rating data including ratings of movies is converted into second rating data, the converted second rating data is applied to fuzzy association rule mining to generate a related movie list, and movies are recommended using the generated list, so that movies suited to the preferences of the recommendation target user can be recommended effectively. Furthermore, because a data processing scheme suited to a distributed framework is used so that the recommendation function can be applied to a large number of users, the invention can also be applied to large-scale rating log data and can thereby provide a more reliable service.

Claims

A movie recommendation device comprising:

a data acquisition unit that acquires first rating data including ratings of movies;

an association list generator that converts the acquired first rating data into second rating data and generates a related movie list by applying the converted second rating data to fuzzy association rule mining; and

a movie recommendation unit that recommends a movie to a recommendation target user by using the related movie list.
The movie recommendation device of claim 1,

wherein the association list generator

obtains a fuzzy membership value by substituting the rating into one or more fuzzy membership functions including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and

converts the first rating data into the second rating data by replacing the rating with a linguistic label corresponding to the obtained fuzzy membership value.
The movie recommendation device of claim 1,

wherein the association list generator

generates one or more of a fuzzy confidence and a fuzzy correlation by using the fuzzy association rule mining, and

generates the related movie list on the basis of at least one of the generated fuzzy confidence and fuzzy correlation.
The movie recommendation device of claim 3,

wherein the association list generator comprises:

an association combination generator that generates per-movie association combinations by combining the converted second rating data according to fuzzy association rules; and

a fuzzy support calculator that generates a per-movie rating history by organizing the second rating data by movie and calculates a per-movie fuzzy support using the generated per-movie rating history.
The movie recommendation device of claim 4,

wherein the fuzzy support calculator

calculates the per-movie fuzzy support using a reference value obtained by normalizing the fuzzy support value.
The movie recommendation device of claim 4,

wherein the association list generator

calculates an association combination fuzzy support for the per-movie association combinations by combining at least two of the fuzzy membership functions, and

calculates the fuzzy confidence using one or more of the per-movie fuzzy support and the calculated association combination fuzzy support.
The movie recommendation device of claim 4,

wherein the association list generator

calculates the fuzzy correlation using one or more of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support.
The movie recommendation device of claim 1,

wherein the movie recommendation unit

ranks the generated related movie lists according to a preset importance and recommends movies in order, starting from the highest-ranked related movie list.
The movie recommendation device of claim 1,

wherein the related movie list

includes one or more of a title, genre, director, country, production year, and image of a movie.
A movie recommendation method comprising:

acquiring first rating data including ratings of movies;

converting the acquired first rating data into second rating data and generating a related movie list by applying the converted second rating data to fuzzy association rule mining; and

recommending a movie to a recommendation target user by using the related movie list.
The movie recommendation method of claim 10,

wherein generating the related movie list

includes obtaining a fuzzy membership value by substituting the rating into one or more fuzzy membership functions including a triangular membership function, a trapezoidal membership function, and a Gaussian membership function, and

the first rating data is converted into the second rating data by replacing the rating with a linguistic label corresponding to the obtained fuzzy membership value.
The movie recommendation method of claim 10,

wherein generating the related movie list comprises:

generating one or more of a fuzzy confidence and a fuzzy correlation by using the fuzzy association rule mining; and

generating the related movie list on the basis of at least one of the generated fuzzy confidence and fuzzy correlation.
The movie recommendation method of claim 12,

wherein generating the related movie list comprises:

generating per-movie association combinations by combining the converted second rating data according to fuzzy association rules; and

generating a per-movie rating history by organizing the second rating data by movie, and calculating a per-movie fuzzy support using the generated per-movie rating history.
The movie recommendation method of claim 13,

wherein generating the related movie list comprises:

calculating an association combination fuzzy support for the per-movie association combinations by combining at least two of the fuzzy membership functions; and

calculating the fuzzy confidence using one or more of the per-movie fuzzy support and the calculated association combination fuzzy support.
The movie recommendation method of claim 13,

wherein generating the related movie list

includes calculating the fuzzy correlation using one or more of the per-movie fuzzy support, the fuzzy confidence, and the square of the per-movie fuzzy support.
The movie recommendation method of claim 10,

wherein recommending the movie

includes ranking the generated related movie lists according to a preset importance, and

movies are recommended in order, starting from the highest-ranked related movie list.
A computer-readable recording medium on which a program for executing the method of any one of claims 10 to 16 is recorded.