KR101924448B1

KR101924448B1 - Real estate clustering method and apparatus, system and method for estimating market price of real estate using the same

Info

Publication number: KR101924448B1
Application number: KR1020170096188A
Authority: KR
Inventors: 구름; 탁온식
Original assignee: 주식회사 빅밸류
Priority date: 2017-07-28
Filing date: 2017-07-28
Publication date: 2018-12-04

Abstract

The present invention relates to a real estate market clustering apparatus, and a real estate market price estimation method using the same. According to embodiments of the present invention, the real estate market clustering apparatus comprises: a real transaction real estate information acquisition unit which acquires real transaction real estate information including information related to a building and a real transaction price for a real transaction price real estate positioned in a predetermined area; and a clustering unit which clusters the predetermined area by using the real transaction real estate information. The clustering unit includes: a cluster analysis unit for calculating a cluster tendency coefficient for each of a first cluster and a second cluster by applying real estate real transaction information for the first cluster and the second cluster included in a first cluster set generated in the predetermined area to a cluster analysis model; a cluster comparison unit for generating a similarity score between the first cluster and the second cluster by using the cluster tendency coefficient for the first cluster and the second cluster; and a cluster merging unit for merging the first cluster and the second cluster into a third cluster on the basis of the similarity score, for example, when the similarity score is equal to or greater than a threshold value.

Description

REAL ESTATE CLUSTERING METHOD AND APPARATUS, SYSTEM AND METHOD FOR ESTIMATING MARKET PRICE OF REAL ESTATE USING THE SAME

Technical Field

The present invention relates to a method and apparatus for clustering real estate, and to a system and method for estimating real estate market prices using the same. More particularly, it relates to a method and apparatus that use big data to cluster real estate with similar attributes, and to a system and method for estimating real estate market prices using the same.

In the real estate market, the mobility of the real estate itself, which is the unit of goods, is extremely low compared with general commodity markets. Therefore, a real estate consumer who wishes to reside in (or trade) a property generally predicts the market price of the desired property by referring to a cluster of a certain range that includes it (hereinafter, "surrounding cluster").

FIG. 1 is a diagram showing a plurality of clusters set in a predetermined area according to a conventional example. Conventional real estate market analysis has generally clustered the market by an arbitrary radius, or simply by administrative district. For example, in the case of Seoul, the regions of the real estate market were clustered on the basis of the 25 districts (gu), as shown in FIG. 1. Alternatively, real estate regions were clustered simply by direction, such as the central, eastern, western, southern, and northern regions.

Such conventional analysis methods have the limitation that they do not reflect the attributes of individual properties, such as type, area, and year of construction. To address this, there have been some cases in which the real estate market is clustered by reflecting individual property attributes through a clustering algorithm such as K-means. However, because the initial points are set randomly on every run, these approaches cannot provide consistent reference information to real estate consumers.

Korean Registered Patent No. KR 10-0541625 B1

According to one aspect of the present invention, there is provided a real estate market clustering apparatus that measures the similarity between clusters based on actual transaction prices and building-related information, and merges similar clusters to generate an improved cluster set.

In addition, a real estate market clustering method and an associated computer storage medium are provided.

According to another aspect of the present invention, there are provided a real estate market price estimation system and method using the real estate market clustering method and apparatus.

A real estate market clustering method using big data according to one aspect of the present invention may include: a) acquiring real estate transaction information, including the actual transaction price and building-related information, for transacted properties located in a predetermined area; b) generating a first cluster set in the predetermined area; c) calculating a cluster tendency coefficient for each of a first cluster and a second cluster included in the first cluster set by applying the real estate transaction information for the first and second clusters to a cluster analysis model; d) generating a similarity score between the first cluster and the second cluster using their cluster tendency coefficients; e) merging the first cluster and the second cluster into a third cluster based on the similarity score; and f) generating a second cluster set that includes the third cluster. Here, the cluster tendency coefficient may represent the influence of the building-related information on the actual transaction price.

In one embodiment, the real estate market clustering method may further include: repeating steps c) to f); and completing the repetition based on the number of transacted properties located in at least one cluster included in the second cluster set.

In one embodiment, merging the first cluster and the second cluster into the third cluster may be further based on the location of a first transacted property included in the first cluster and the location of a second transacted property included in the second cluster.

In one embodiment, generating the first cluster set in the predetermined area may generate the first cluster set based on at least one fixed point located in the predetermined area.

In one embodiment, the fixed point may be designated based on the degree of cluster bias of the first cluster set.

In one embodiment, the cluster analysis model may calculate the cluster tendency coefficient based on at least one of the actual transaction price, exclusive-use area, land-right area, individual officially assessed land price, transaction date, and construction date of the transacted property.

In one embodiment, the merging step merges the first cluster and the second cluster into the third cluster when the similarity score is equal to or greater than a threshold, and the threshold may be set based on the number of clusters included in the first cluster set.
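Steps c) to f), their repetition, and the threshold rule can be sketched as a greedy merge loop. This is only an illustrative sketch under stated assumptions: `fit` and `similarity` are hypothetical callables supplied by the caller (the text here does not fix either), cluster identifiers are assumed to be integers, and the stopping rule (no pair reaches the threshold, or a merge would exceed `max_size` properties) is one possible reading of "based on the number of transacted properties". The location condition of the other embodiment is not modeled.

```python
from itertools import combinations


def merge_clusters(clusters, fit, similarity, threshold, max_size):
    """Greedily merge the most similar pair of clusters until none qualifies.

    clusters:   dict mapping an integer cluster id to a list of records.
    fit:        hypothetical callable, records -> tendency coefficients.
    similarity: hypothetical callable, (coeffs_a, coeffs_b) -> score.
    threshold:  merge only when score >= threshold.
    max_size:   assumed cap on the number of records in a merged cluster.
    """
    clusters = {cid: list(recs) for cid, recs in clusters.items()}  # copy
    next_id = max(clusters) + 1  # assumes integer ids
    while len(clusters) > 1:
        # Step c: coefficients for every cluster in the current set.
        coeffs = {cid: fit(recs) for cid, recs in clusters.items()}
        best = None
        # Step d: score every pair; remember the best admissible one.
        for a, b in combinations(clusters, 2):
            if len(clusters[a]) + len(clusters[b]) > max_size:
                continue
            score = similarity(coeffs[a], coeffs[b])
            if score >= threshold and (best is None or score > best[0]):
                best = (score, a, b)
        if best is None:
            break  # nothing left to merge
        # Steps e and f: replace the pair with a merged third cluster.
        _, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return clusters
```

Merging one pair per pass keeps the coefficients fresh for every comparison, at the cost of refitting all clusters on each iteration.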

In one embodiment, the cluster analysis model is expressed by the following equation:

ln(Y) = α1·ln(X1) + α2·ln(X2) + α3·ln(X3) + α4·X4 + α0

Here, Y denotes the actual transaction price, X1 the exclusive-use area, X2 the land-right area, X3 the individual officially assessed land price, and X4 the number of years in use. α1 is the cluster tendency coefficient of X1, α2 that of X2, α3 that of X3, and α4 that of X4, and α0 may represent the cluster tendency coefficient of real-estate-related information other than X1, X2, X3, and X4.
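As an illustration of how the coefficients of this model could be obtained for one cluster, the sketch below fits the equation by ordinary least squares with NumPy. The estimation method is an assumption (the text states only the model form), and the function and argument names are hypothetical.

```python
import numpy as np


def fit_cluster_tendency(price, excl_area, land_area, land_price, years_used):
    """Fit ln(Y) = a1*ln(X1) + a2*ln(X2) + a3*ln(X3) + a4*X4 + a0.

    Ordinary least squares is an assumed estimator, not one named in the
    source. Returns the coefficients in the order [a1, a2, a3, a4, a0].
    """
    y = np.log(np.asarray(price, dtype=float))
    design = np.column_stack([
        np.log(np.asarray(excl_area, dtype=float)),   # ln(X1)
        np.log(np.asarray(land_area, dtype=float)),   # ln(X2)
        np.log(np.asarray(land_price, dtype=float)),  # ln(X3)
        np.asarray(years_used, dtype=float),          # X4 enters linearly
        np.ones(len(y)),                              # intercept a0
    ])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs
```

A cluster needs at least five transactions with positive prices, areas, and land prices for the fit to be determined; smaller clusters would need a different treatment.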

In one embodiment, the similarity score is expressed by the following equation (the equation image is not reproduced in this text):

Here, the similarity score is the similarity score between the first cluster and the second cluster, A is the cluster tendency coefficient for the first cluster, B is the cluster tendency coefficient for the second cluster, and N may represent the number of cluster tendency coefficients calculated by the cluster analysis model.

A computer program for executing the real estate market clustering method according to any one of the above embodiments may be recorded on a computer-readable recording medium.

A real estate market price estimation method that uses a cluster set generated by the real estate market clustering method according to any one of the above embodiments may further include: generating a transaction price estimation learning model for each cluster included in the cluster set, using the acquired transacted-property information; and estimating the market price of a property to be evaluated by applying the building-related information of that property to the transaction price estimation learning model associated with it. Generating the transaction price estimation learning model may include: generating transacted-property pairs from transacted properties that lie within a predetermined distance of one another; calculating a transaction price ratio for each transacted-property pair; and a modeling step of generating the transaction price estimation learning model based on the transaction price ratio and the building-related information of the transacted-property pair. Here, a transacted property within the predetermined distance may be applied as the independent transaction in a first transacted-property pair and as the dependent transaction in a second transacted-property pair.
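The pair-generation and price-ratio steps can be sketched as follows. The sketch assumes planar coordinates, Euclidean distance, and a ratio defined as dependent price divided by independent price; none of these choices is stated in the text, and the record keys (`id`, `x`, `y`, `price`) are hypothetical. Ordered pairs are used so that the same property appears as the independent transaction in one pair and as the dependent transaction in another.

```python
from itertools import permutations
from math import hypot


def make_price_ratio_pairs(properties, max_dist):
    """Build ordered (independent_id, dependent_id, ratio) tuples.

    properties: list of dicts with hypothetical keys 'id', 'x', 'y', 'price'.
    max_dist:   the predetermined distance, in the same units as x and y.
    The ratio direction (dependent / independent) is an assumption.
    """
    pairs = []
    for indep, dep in permutations(properties, 2):
        distance = hypot(indep["x"] - dep["x"], indep["y"] - dep["y"])
        if distance <= max_dist:
            pairs.append((indep["id"], dep["id"], dep["price"] / indep["price"]))
    return pairs
```

The resulting ratios, together with the building-related information of both properties in each pair, would form the training rows for the per-cluster learning model.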

A computer program for executing the real estate market price estimation method according to the above embodiment may be recorded on a computer-readable recording medium.

A real estate market clustering apparatus according to one aspect of the present invention includes: a transacted-property information acquisition unit that acquires transacted-property information, including the actual transaction price and building-related information, for transacted properties located in a predetermined area; and a clustering unit that clusters the predetermined area using the transacted-property information. The clustering unit may include: a cluster analysis unit that calculates a cluster tendency coefficient for each of a first cluster and a second cluster included in a first cluster set generated in the predetermined area, by applying the real estate transaction information for the first and second clusters to a cluster analysis model; a cluster comparison unit that generates a similarity score between the first cluster and the second cluster using their cluster tendency coefficients; and a cluster merging unit that merges the first cluster and the second cluster into a third cluster based on the similarity score.

In one embodiment, the cluster merging unit may further base the merging on the location of a first transacted property included in the first cluster and the location of a second transacted property included in the second cluster.

In one embodiment, the clustering unit may generate the first cluster set in the predetermined area based on at least one fixed point located in the predetermined area.

In one embodiment, the fixed point may be designated based on the degree of cluster bias of the real estate market cluster set.

In one embodiment, the cluster merging unit merges the first cluster and the second cluster into the third cluster when the similarity score is equal to or greater than a threshold, and the threshold may be set based on the number of clusters included in the first cluster set.

A real estate market price estimation system that uses a cluster set generated by the real estate market clustering apparatus according to the embodiments may include: the real estate market clustering apparatus according to any one of the above embodiments; a learning model generation unit that generates a transaction price estimation learning model for each cluster included in the cluster set, using the acquired transacted-property information; and a market price estimation unit that estimates the market price of a property to be evaluated by applying the building-related information of that property to the transaction price estimation learning model associated with it. The learning model generation unit may include: a transacted-property pair generation unit that generates transacted-property pairs from transacted properties that lie within a predetermined distance of one another; a transaction price ratio calculation unit that calculates a transaction price ratio for each transacted-property pair; and a modeling unit that generates the transaction price estimation learning model based on the transaction price ratio and the building-related information of the transacted-property pair. Here, a transacted property within the predetermined distance may be applied as the independent transaction in a first transacted-property pair and as the dependent transaction in a second transacted-property pair.

According to an embodiment of the present invention, the real estate market can be classified more efficiently by using the actual transaction prices of transacted properties in a predetermined area together with information associated with those properties. In particular, the present invention can help a user select the cluster that serves as the basis for estimating the market price of non-standardized housing types such as multi-household houses.

In addition, according to an embodiment of the present invention, fixing the initial center points makes the clustering result consistent, so that a consistent real estate market classification can be provided to the user.

In addition, according to an embodiment of the present invention, a learning model is generated by learning from big data such as the building-related information and actual transaction prices of transacted properties around the property to be evaluated, and using this model enables accurate prediction of real estate market prices. In particular, generating the transaction price estimation learning model from the clusters selected as described above can further improve the accuracy and efficiency of market price estimation.

The real estate market price estimation system of the present invention is particularly well suited to estimating market prices for non-standardized housing types such as multi-household houses and villas.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description of the claims.

FIG. 1 is a diagram showing a predetermined area clustered on the basis of administrative districts, according to a conventional example.
FIG. 2 is a block diagram of a real estate market price estimation system according to an embodiment of the present invention.
FIG. 3 is a diagram showing initial cluster center points in a predetermined area for generating an initial cluster set, according to an embodiment of the present invention.
FIGS. 4A and 4B are diagrams each showing a real estate market cluster set established by the real estate market clustering method, according to an embodiment of the present invention.
FIG. 5 is a block diagram of the learning model generation unit 200 according to an embodiment of the present invention.
FIG. 6 is a diagram showing a simplified map for explaining a method of estimating the market price of a property to be evaluated, according to an embodiment of the present invention.
FIG. 7 is a flowchart of a real estate market clustering method according to an embodiment of the present invention.
FIG. 8 is a flowchart of a real estate market price estimation method according to an embodiment of the present invention.
The figures depict various embodiments of the present invention for purposes of illustration only. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects. Those skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods described herein may be employed without departing from the principles of the invention described herein.

Embodiments will be described with reference to the accompanying drawings. However, the principles disclosed herein may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. In the detailed description, detailed descriptions of well-known features and techniques may be omitted to avoid unnecessarily obscuring the features of the embodiments.

In this specification, a "unit", "module", "apparatus", "system", or the like refers to a computer-related entity such as hardware, a combination of hardware and software, or software. For example, a unit, module, apparatus, or system herein may be, but is not limited to, a running process, a processor, an object, an executable, a thread of execution, a program, and/or a computer. For example, both an application running on a computer and the computer itself may correspond to a unit, module, apparatus, or system herein.

In this specification, big data means a large collection of structured, unstructured, or semi-structured data, and may include building-related information, actual transaction price information, and the like. Structured data is data stored in fixed fields, such as relational databases and spreadsheets. Unstructured data is data not stored in fixed fields, such as text documents, images, video, and audio data. Semi-structured data is data that is not stored in fixed fields but includes metadata or a schema, such as XML, HTML, and text.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the drawings.

FIG. 2 is a block diagram of a real estate market price estimation system using big data according to an embodiment of the present invention. As shown in FIG. 2, the real estate market price estimation system 1000 includes a real estate clustering apparatus 10, a learning model generation unit 200, and a market price estimation unit 300, and the real estate clustering apparatus 10 includes a transacted-property information acquisition unit 100 and a clustering unit 150. In another embodiment, the system 1000 may further include a user interface providing unit 500.

Software (an application) for performing the real estate market clustering method may be installed and executed on the apparatus 10 of the present invention, and components such as the clustering unit 150 may be controlled by that software as it runs on the apparatus 10.

The apparatus 10 may be a separate terminal or a module that forms part of a terminal. Components such as the clustering unit 150 may be formed as an integrated module or may consist of one or more modules. Conversely, each component may be implemented as a separate module.

The apparatus 10 may be mobile or fixed. The apparatus 10 may take the form of a server or an engine, and may be referred to by other terms such as device, apparatus, terminal, user equipment (UE), mobile station (MS), wireless device, or handheld device.

The apparatus 10 may run or build various software on top of an operating system (OS), that is, a system. The operating system is a system program that allows software to use the hardware of the apparatus, and may include mobile operating systems such as Android OS, iOS, Windows Mobile OS, Bada OS, Symbian OS, and BlackBerry OS, as well as computer operating systems such as the Windows family, Linux family, Unix family, MAC, AIX, and HP-UX.

The apparatus 10 may use a pre-built real estate database 400. The real estate database 400 stores real estate information including actual transaction price information and building-related information. In this specification, building-related information may include, but is not limited to, the building register (including the exclusive-use area, land-right area, and the like), the land register, the land use plan confirmation, officially announced prices (of land, detached houses, multi-unit houses, multi-family houses, multi-household houses, apartments, row houses, and the like), the transaction year, and the construction year. Building-related information does not include actual transaction price information.

The real estate database 400 may further include information suited to market conditions, such as macroeconomic variables (for example, the base interest rate and price indices), real-estate-related taxes such as acquisition tax, stamp tax, and comprehensive real estate holding tax, real estate lending regulations such as LTV and DTI, and government actions on real estate such as the policy measures of May 23, 2003 or October 29, 2003. The information may also be updated according to the usage environment.

In one embodiment, the apparatus 10 may further include the real estate database 400 internally. In that case, the stored database information may be updated from an external database or from user input.

In another embodiment, the apparatus 10 may be a computing device capable of using external database information, such as that of a cloud server. In this case, the apparatus 10 may be any kind of terminal capable of connecting to an external real estate database 400 and performing data communication at the user's request. For example, the apparatus 10 may be a wireless terminal (e.g., a mobile phone, PDA (Personal Digital Assistant), laptop, or PMP (Portable Multimedia Player)) that connects wirelessly to a data network (e.g., the Internet), or a wired terminal (e.g., a PC or laptop) that connects to a data network by wire.

It will be apparent to those skilled in the art that the apparatus 10 may include other components not described herein. The apparatus 10 may also include other hardware elements necessary for the operations described herein, including network interfaces and protocols, and input devices for data entry.

The transacted-property information acquisition unit 100 acquires actual transaction price information and building-related information for transacted properties located in a predetermined area. The predetermined area is the area whose real estate market is to be analyzed, and may be an administrative district such as Seoul or an arbitrary area designated by a user.

The transacted-property information acquisition unit 100 may filter out some of the transacted properties based on the actual transaction price information and the building-related information. In one embodiment, the transacted-property information acquisition unit 100 may further include an information processing unit 110. The information processing unit 110 may normalize the transacted properties using the actual transaction price information and the exclusive-use area information of each property. When the information processing unit 110 standardizes the actual transaction price per unit area, transacted properties whose standardized value is less than -5 or greater than 5 may be filtered out. As a result, transacted properties with distorted information are removed, which allows more accurate real estate market analysis and transaction price prediction.
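The filtering rule above can be illustrated with a z-score on the price per unit area. The sketch assumes that "standardize" means subtracting the mean and dividing by the population standard deviation over the properties being compared; the function name and return shape are hypothetical.

```python
import numpy as np


def filter_distorted_transactions(prices, areas, limit=5.0):
    """Return (keep_mask, z) for transacted properties.

    A property is kept when the z-score of its price per unit of
    exclusive-use area lies within [-limit, limit]. The z-score
    definition used here is an assumption about "standardization".
    """
    unit_price = np.asarray(prices, dtype=float) / np.asarray(areas, dtype=float)
    spread = unit_price.std()
    if spread == 0:
        # All unit prices identical: nothing can be an outlier.
        z = np.zeros_like(unit_price)
    else:
        z = (unit_price - unit_price.mean()) / spread
    keep = np.abs(z) <= limit
    return keep, z
```

With a limit of 5, only extreme records are dropped; note that a z-score above 5 cannot occur in very small samples, so the rule only has an effect on areas with a few dozen transactions or more.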

In some embodiments, the transacted-property information acquisition unit 100 may filter out some of the transacted properties based on the density of transacted properties or the time of the transaction.

The building-related information described herein may include, but is not limited to, at least one of: address, whether the unit is underground, floor, area, structure, total floor area, floor area ratio, construction year, number of households, presence of a parking lot, number of parking spaces per household, presence of an elevator, number of elevators, individual officially assessed land price, type of housing, neighborhood facilities, and right to sunlight.

In one example, the structure of a transacted property may be coded as 1 for reinforced concrete and 0 otherwise; however, the present invention is not limited thereto, and the value assigned to each structure code may vary depending on the algorithm.

The total floor area may be taken as the total floor area recorded in the title section, and the construction year may be taken as the number of days obtained by subtracting the use approval date in the title section from the current date. The number of households may follow the entry in the title section. For the individual officially assessed land price, the most recent value may be used. Here, "title section" may refer to the title section of the building register.

The clustering unit 150 clusters the predetermined area into a plurality of clusters. Specifically, the clustering unit 150 clusters the predetermined area into a cluster set containing a plurality of clusters, based on the real estate market tendency of the predetermined area. The real estate market tendency relates to the degree to which building-related information influences the formation of actual transaction prices. This is described in more detail in connection with step S153.

The clustering unit 150 may cluster the predetermined area using at least one clustering algorithm among K-means, expectation-maximization, density-based spatial clustering of applications with noise (DBSCAN), and neural networks; however, it is not limited thereto and may use various other clustering algorithms.

FIG. 3 shows the center points used to generate an initial cluster set according to an embodiment of the present invention, and FIGS. 4a and 4b each show a real estate market cluster set established by the real estate market clustering method according to an embodiment of the present invention.

In one embodiment, the clustering unit 150 clusters the predetermined area using the K-means algorithm. In this case, the clustering unit 150 first designates arbitrary initial center points (initial points) in the predetermined area to generate an initial cluster set. The number of initial center points may be set based on, for example, the size of the predetermined area. For example, as shown in FIG. 3, for Seoul the clustering unit 150 sets 25 initial center points and generates from them an initial cluster set containing 25 clusters.

In one embodiment, at least one of the initial center points may be a fixed point. For example, when 25 initial center points are designated as in FIG. 3, the central point P may be a fixed point. In this case, the remaining 24 initial center points may be placed in the predetermined area relative to point P by at least one initial center setting algorithm among the maximum average distance algorithm, the triangle height algorithm, the average linkage algorithm, the centroid linkage algorithm, and the Ward algorithm; however, the invention is not limited thereto, and the remaining 24 initial center points may be designated using various other initial center setting algorithms.

In one example, referring to FIG. 3, the clustering unit 150 first designates the fixed point P and then designates the remaining 24 initial center points relative to the fixed point P using the maximum average distance algorithm. In this case, for a given predetermined area such as Seoul, not only point P but also the remaining 24 initial center points are designated as fixed points. As a result, the same cluster set is generated every time the predetermined area is clustered, so that consistent cluster information can be provided to the user.
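As a non-limiting illustration, deterministic placement of initial center points around a fixed point might be sketched as follows. The specification does not define the maximum average distance algorithm in detail; the greedy rule used here (repeatedly choose the candidate whose average distance to the already chosen centers is largest) is an assumed interpretation, and the function name `pick_initial_centers` and the candidate list are hypothetical.

```python
from math import dist


def pick_initial_centers(fixed_point, candidates, k):
    """Return k initial centers starting from fixed_point.

    Each further center is the candidate with the largest average distance
    to the centers chosen so far (assumed reading of "maximum average
    distance"). Ties are broken by coordinates so the result is repeatable.
    """
    centers = [fixed_point]
    remaining = [c for c in candidates if c != fixed_point]
    while len(centers) < k and remaining:
        best = max(
            remaining,
            key=lambda c: (sum(dist(c, s) for s in centers) / len(centers), c),
        )
        centers.append(best)
        remaining.remove(best)
    return centers
```

Because no random choice is involved, the same fixed point and candidate set always yield the same centers, which is the property the passage relies on for consistent cluster sets.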

In one embodiment, the fixed point may be designated so that the cluster bias of the initial cluster set is minimized. The cluster bias indicates how densely the clusters are concentrated within the predetermined area: the farther apart the clusters are from one another, the lower the cluster bias.

In one example, when the fixed point P and the remaining initial center points are designated in Seoul using the maximum average distance algorithm, the fixed point P may be placed at the center of Seoul, as in FIG. 3, so that the cluster bias is minimized. The 24 remaining initial center points are then also placed relative to the center of Seoul, in the arrangement shown in FIG. 3.

As a result, an initial cluster set such as that of FIG. 4a is generated in the predetermined area (Seoul) based on the initial center points shown in FIG. 3. The fixed point may be set based on the size, shape, and the like of the predetermined area, but is not limited thereto; it may also be set or updated by user input.

As shown in FIG. 2, the clustering unit 150, which clusters the predetermined area into a cluster set containing a plurality of clusters, includes a cluster analysis unit 152, a cluster comparison unit 153, and a cluster merging unit 154.

The cluster analysis unit 152 calculates cluster tendency coefficients for each of any two clusters (A and B) within the predetermined area. The cluster analysis model used for this purpose is a statistical model for analyzing the influence of building-related information on the formation of the actual transaction prices of transacted properties. The cluster analysis model may take the form of a regression analysis, but is not limited thereto and may be formed by various statistical analysis methods.

In one embodiment, the cluster analysis model may be formed by regression analysis in the form of [Equation 1] below, but is not limited thereto. For example, the form of the cluster analysis model may be changed by using other building-related information that can influence the formation of actual transaction prices, such as the right to sunlight or the presence of an elevator.

[Equation 1]

ln(Y) = α1·ln(X1) + α2·ln(X2) + α3·ln(X3) + α4·X4 + α0

Here, Y is the dependent variable, XN are the independent variables (N is an integer), that is, the items in parentheses that affect Y, and αN are their coefficients. In one embodiment, Y is the actual transaction price, X1 is the exclusive-use area, X2 is the land-right area, X3 is the individual officially assessed land price, and X4 is the number of years in use (transaction year minus construction year). N may vary depending on the items to be reflected. For example, as existing building-related information items are removed, or as other building-related information items are added or substituted, the value of N and the form of [Equation 1] may change.

α0, α1, ..., α4 are the cluster tendency coefficients of a given cluster and indicate the relationship between the building-related information and the actual transaction price. In one embodiment, when the cluster analysis model is generated by regression analysis, the cluster tendency coefficients correspond to the standardized coefficients of the regression model. In this case, each cluster tendency coefficient indicates the influence of the corresponding building-related information on the actual transaction price.

Using the cluster analysis model, the cluster analysis unit 152 generates cluster tendency coefficients {A0, A1, A2, A3, A4} and {B0, B1, B2, B3, B4} for any two clusters (A and B) in the cluster set. Here, A1 indicates the degree to which exclusive-use area influences the formation of actual transaction prices in cluster A, and B1 indicates the same for cluster B. For example, if A1 is 0.7 and B1 is 0.5, the influence of exclusive-use area on price formation can be analyzed as being greater in cluster A than in cluster B. A0 and B0 are the averages for clusters A and B and indicate the degree to which items other than X1 to X4 influence the formation of actual transaction prices. The value of any item for a cluster, or a cluster tendency coefficient, may be a numeric value such as an integer, a floating-point number, or a binary value, or it may be categorical.
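As a non-limiting illustration, the coefficients of [Equation 1] for one cluster might be obtained by ordinary least squares as sketched below. The function names, the row layout, and the use of the normal equations are assumptions made for this sketch. It returns raw regression coefficients; standardized coefficients, which the specification mentions, would additionally require scaling each variable by its standard deviation.

```python
from math import exp, log


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def cluster_tendency_coefficients(rows):
    """Fit Equation 1 and return [a0, a1, a2, a3, a4].

    Each row is (price, exclusive_area, land_right_area, land_price, years_in_use).
    """
    design = [[1.0, log(x1), log(x2), log(x3), x4] for _, x1, x2, x3, x4 in rows]
    target = [log(y) for y, *_ in rows]
    k = 5
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, target)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict_price(coeffs, x1, x2, x3, x4):
    """Evaluate Equation 1 for one property and return the price Y."""
    a0, a1, a2, a3, a4 = coeffs
    return exp(a0 + a1 * log(x1) + a2 * log(x2) + a3 * log(x3) + a4 * x4)
```

Running `cluster_tendency_coefficients` once per cluster yields the two coefficient vectors that the comparison step consumes.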

The cluster comparison unit 153 compares the two clusters (A and B) based on their cluster tendency coefficients and measures the similarity between them. The cluster comparison unit 153 may measure the similarity between the two clusters using at least one similarity measurement algorithm among Euclidean distance, cosine distance, and Mahalanobis distance; however, it is not limited thereto and may use various other similarity measurement algorithms.

In one embodiment, the cluster comparison unit 153 may measure the similarity between clusters A and B using the cosine distance measurement algorithm. In this case, the cluster comparison unit 153 generates the similarity score according to [Equation 2] below.

[Equation 2]

Ai denotes a cluster tendency coefficient of cluster A, and Bi denotes a cluster tendency coefficient of cluster B. N denotes the number of cluster tendency coefficients calculated for clusters A and B by the cluster analysis model. In one embodiment, when the cluster tendency coefficients are calculated by a cluster analysis model of the form of [Equation 1], N is 5. The cluster comparison unit 153 determines whether clusters A and B are similar based on the similarity score between them.

For example, the similarity score between cluster A and cluster B may be calculated as a value between 0 and 1; in this case, the closer the score is to 1, the more similar the cluster tendencies of cluster A and cluster B are considered to be.
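As a non-limiting illustration, a similarity score over two coefficient vectors might be computed as sketched below. The body of [Equation 2] is not reproduced in this text, so the sketch assumes the standard cosine similarity, that is, the sum of Ai·Bi divided by the product of the Euclidean norms of the two vectors; the function name `similarity_score` is hypothetical.

```python
from math import sqrt


def similarity_score(coeffs_a, coeffs_b):
    """Cosine similarity of two equal-length coefficient vectors (assumed form)."""
    dot = sum(a * b for a, b in zip(coeffs_a, coeffs_b))
    norm_a = sqrt(sum(a * a for a in coeffs_a))
    norm_b = sqrt(sum(b * b for b in coeffs_b))
    return dot / (norm_a * norm_b)
```

Under this assumed form the score depends only on the direction of the coefficient vectors, so two clusters whose coefficients are proportional to each other score 1 regardless of scale.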

The cluster merging unit 154 merges clusters A and B based on the similarity score. When the similarity score between cluster A and cluster B is equal to or greater than a threshold, the real estate market tendencies of the two clusters are similar, and the cluster merging unit 154 therefore merges them into a single cluster. In one example, the threshold may be set to 0.998, in which case the cluster merging unit 154 may merge two clusters (A and B) having a similarity score of 0.998 or more into cluster C.

As a result, cluster C contains more transacted properties than either cluster A or cluster B, so the learning model generation unit 200 can generate a transaction price estimation learning model for cluster C that is improved, in terms of statistical sample size, over the transaction price estimation learning models for clusters A and B.

In one embodiment, the threshold may depend on the number of clusters in the cluster set. For example, the threshold may be 0.998 when the cluster set contains 25 clusters, 0.995 when it contains 22 clusters, and 0.98 when it contains 20 clusters. Such stepwise merging prevents clusters that are not actually similar from being merged, so that improved clusters that better reflect the tendencies of the actual real estate market can be generated.

In some embodiments, the threshold may be set according to various factors such as the actual transaction information of real estate in the predetermined area, the frequency and density of actual transactions in the predetermined area, and the population and size of the predetermined area.

In one embodiment, when merging cluster A and cluster B, the cluster merging unit 154 may rely not only on the similarity score between them but also on the location of any transacted property in cluster A and the location of any transacted property in cluster B. When the similarity score between cluster A and cluster B is equal to or greater than the threshold, the cluster merging unit 154 may further determine whether the two clusters are adjacent.

The cluster merging unit 154 determines that cluster A and cluster B are adjacent when both contain transacted properties in the same administrative district. Referring to FIGS. 4a and 4b, for two clusters (A and B) whose similarity score is equal to or greater than the threshold, cluster A contains a transacted property HN located in Songpa-gu, and cluster B also contains a transacted property HM located in Songpa-gu. In this case, the cluster merging unit 154 determines that the two clusters are both similar and adjacent, and merges clusters A and B into cluster C.
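As a non-limiting illustration, the combined similarity and adjacency test might be sketched as follows. Representing each cluster by the collection of administrative district names of its transacted properties is an assumption of this sketch, as are the function names.

```python
def are_adjacent(districts_a, districts_b):
    """Two clusters are treated as adjacent when they share at least one district."""
    return not set(districts_a).isdisjoint(districts_b)


def should_merge(score, threshold, districts_a, districts_b):
    """Merge only when the clusters are both similar enough and adjacent."""
    return score >= threshold and are_adjacent(districts_a, districts_b)
```

For example, `should_merge(0.999, 0.998, ["Songpa-gu", "Gangdong-gu"], ["Songpa-gu"])` evaluates to `True`, while the same score with no shared district does not lead to a merge.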

Because the cluster merging unit 154 additionally takes into account the locations of the transacted properties in the clusters when merging clusters A and B, the merged cluster better reflects the locality of the actual real estate market, making it possible to provide improved real estate market analysis and improved market price prediction products and services for properties under evaluation.

The clustering unit 150 clusters the predetermined area into a cluster set that includes the merged cluster C and provides this cluster set to the learning model generation unit 200.

In one embodiment, the clustering unit 150 may keep merging clusters, based on the similarity score or on the locations of transacted properties, until no cluster in the cluster set contains fewer transacted properties than a sample criterion. The sample criterion is the standard for judging whether a cluster contains enough samples (transacted properties) to yield statistically meaningful results.

In one example, when the sample criterion is 1000, a cluster must contain at least 1000 transacted properties to yield statistically meaningful results. In this case, the clustering unit 150 may continue to merge clusters repeatedly as long as at least one cluster in the cluster set contains fewer than 1000 transacted properties. Once every cluster contains at least as many transacted properties as the sample criterion, the repeated merging ends, and the resulting final cluster set is provided to the learning model generation unit 200.
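As a non-limiting illustration, the repeated merging until every cluster meets the sample criterion might be sketched as follows. The choice of which undersized cluster to merge first and the `pick_partner` callback, which stands in for the similarity and adjacency checks, are assumptions of this sketch.

```python
def merge_until_sampled(sizes, pick_partner, sample_criterion=1000):
    """Merge undersized clusters until all meet the sample criterion.

    `sizes` maps a cluster id to its number of transacted properties.
    `pick_partner(cluster_id, sizes)` is a hypothetical callback returning
    the id of a different cluster to merge the given cluster into.
    """
    sizes = dict(sizes)  # work on a copy
    while len(sizes) > 1:
        small = [c for c, n in sizes.items() if n < sample_criterion]
        if not small:
            break  # every cluster now meets the sample criterion
        src = min(small, key=lambda c: (sizes[c], c))  # smallest cluster first
        dst = pick_partner(src, sizes)
        moved = sizes.pop(src)
        sizes[dst] += moved
    return sizes
```

The loop also stops when a single cluster remains, so it terminates even if the whole area holds fewer properties than the criterion.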

The learning model generation unit 200 may generate a transaction price estimation learning model for each cluster in the cluster set, using the acquired actual transaction prices and building-related information of the transacted properties. For example, when the clustering unit 150 generates a cluster set of 16 clusters including cluster C, the learning model generation unit 200 may generate 16 transaction price estimation learning models, one per cluster, including the model for cluster C.

The learning model generation unit 200 may generate the transaction price estimation learning model by machine learning, and the machine learning method may include, but is not limited to, at least one of multiple regression analysis, an artificial neural network, M5P (decision tree), a Bayesian network, and CART.

FIG. 5 is a block diagram of the learning model generation unit 200 according to an embodiment of the present invention, and FIG. 6 is a simplified map for explaining a method of estimating the market price of a property under evaluation according to an embodiment of the present invention. Referring to FIG. 5, the learning model generation unit 200 may include a transacted property pair generation unit 210, a transaction price ratio calculation unit 220, and a modeling unit 230.

Referring to FIG. 6, a property under evaluation E is shown together with transacted properties H1 to H6 for which actual transactions took place within a recent predetermined period (for example, two years or six months). The property under evaluation E and the transacted properties H1 to H6 are located in the same cluster C. Although not shown, properties that had no transaction during the predetermined period may be located between the property under evaluation and the transacted properties.

The transacted property pair generation unit 210 may generate transacted property pairs from transacted properties that lie within a predetermined distance of one another. Referring to FIG. 6, the transacted property pair generation unit 210 may generate an H1-H2 pair from transacted property H1 and transacted property H2. Pairs H1-H3, H1-H4, and so on may be generated in the same way. Taking the H1-H2 pair as an example, H1, written first, is regarded as the dependent transaction, and H2, written second, as the independent transaction. Accordingly, the H1-H2 pair and the H2-H1 pair may each be generated separately.

The transaction price ratio calculation unit 220 may calculate a transaction price ratio for each transacted property pair. The transaction price ratio may relate to the value obtained by dividing the first actual transaction price of the first transacted property by the second actual transaction price of the second transacted property. For example, when the actual transaction price of H1 is 1.2 million won and that of H2 is 1 million won, the transaction price ratio of the H1-H2 pair may be 1.20.
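As a non-limiting illustration, ordered pair generation and ratio calculation might be sketched as follows. The planar coordinates, the tuple layout, and the function name `price_ratio_pairs` are assumptions of this sketch.

```python
from itertools import permutations
from math import dist


def price_ratio_pairs(properties, max_distance):
    """Return {(dependent, independent): price ratio} for nearby ordered pairs.

    `properties` maps a name to (x, y, price). Both orderings of a pair are
    produced, so H1-H2 and H2-H1 appear as separate entries.
    """
    pairs = {}
    for dep, ind in permutations(properties, 2):
        x_dep, y_dep, price_dep = properties[dep]
        x_ind, y_ind, price_ind = properties[ind]
        if dist((x_dep, y_dep), (x_ind, y_ind)) <= max_distance:
            pairs[(dep, ind)] = price_dep / price_ind
    return pairs
```

With H1 priced at 1,200,000 won and H2 at 1,000,000 won, the entry for ("H1", "H2") is 1.2, matching the example in the text.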

The modeling unit 230 may generate the transaction price estimation learning model based on the transaction price ratios of the transacted property pairs and the building-related information of the transacted property pairs. For example, the modeling unit 230 may generate the transaction price estimation learning model using at least one of multiple regression analysis, an artificial neural network, M5P (decision tree), a Bayesian network, and CART (Classification And Regression Tree) for machine learning, but is not limited thereto and may apply various machine learning methods.

In one embodiment, the modeling unit 230 may generate a random number for each generated transacted property pair, sort the pairs by it, and group them into a training data group and a validation data group for machine learning.
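As a non-limiting illustration, the random-number, sort, and grouping procedure might be sketched as follows. The 80/20 split, the fixed seed, and the function name `split_pairs` are assumptions of this sketch; the specification does not state a split ratio.

```python
import random


def split_pairs(pairs, train_fraction=0.8, seed=0):
    """Assign each pair a random key, sort by it, and cut into two groups."""
    rng = random.Random(seed)
    keyed = sorted((rng.random(), i) for i in range(len(pairs)))
    cut = int(len(pairs) * train_fraction)
    training = [pairs[i] for _, i in keyed[:cut]]
    validation = [pairs[i] for _, i in keyed[cut:]]
    return training, validation
```

Seeding the generator makes the grouping repeatable, which can be convenient when comparing models trained on the same data.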

In one embodiment, the modeling unit 230 may define the transaction price ratio using [Equation 3] below, but is not limited thereto. For example, [Equation 3] may be modified depending on the type of real estate transaction, such as jeonse (lump-sum deposit lease) or monthly rent.

[Equation 3]

Y (transaction price ratio) = β1 * X1 (whether B1 is underground)
    + β2 * X2 (floor of B1)
    + β3 * X3 (area of B1)
    + β4 * X4 (structure of B1)
    + β5 * X5 (total floor area of B1)
    + β6 * X6 (floor area ratio of B1)
    + β7 * X7 (construction year of B1)
    + β8 * X8 (number of households of B1)
    + β9 * X9 (number of parking spaces of B1)
    + β10 * X10 (number of elevators of B1)
    + β11 * X11 (individual officially assessed land price of B1)
    + ...
    - { β12 * X12 (whether B2 is underground)
    + β13 * X13 (floor of B2)
    + β14 * X14 (area of B2)
    + β15 * X15 (structure of B2)
    + β16 * X16 (total floor area of B2)
    + β17 * X17 (floor area ratio of B2)
    + β18 * X18 (construction year of B2)
    + β19 * X19 (number of households of B2)
    + β20 * X20 (number of parking spaces of B2)
    + β21 * X21 (number of elevators of B2)
    + β22 * X22 (individual officially assessed land price of B2)
    + ... }
    + ε

Here, Y is the dependent variable, XN are the independent variables (N is an integer), βN are the regression coefficients (N is an integer), ε is the error term, and each parenthesized label indicates the attribute of the reference transaction or the dependent transaction to which the term corresponds. N may vary depending on the attributes to be reflected. For example, [Equation 3] uses whether the unit is underground, floor, area, and so on, but in other embodiments other building-related information, such as other information from the building register or information from the land register, may be added or substituted.

In another embodiment, when calculating the transaction price ratio, the modeling unit 230 obtains the individual attributes of each transaction (for example, transaction S and transaction T): whether underground (s1, t1), floor (s2, t2), area (s3, t3), structure (s4, t4), total floor area (s5, t5), and so on. It may then set the independent variables to (s1-t1), (s2-t2), (s3-t3), and so on, and set the dependent variable to (price of S / price of T) - 1. By performing the analysis with the error term (ε) set to 0, only the adjustment ratio attributable to differences in individual attributes may be reflected.
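As a non-limiting illustration, one training sample in this difference form might be built as follows. The attribute ordering and the function name `difference_sample` are assumptions of this sketch.

```python
def difference_sample(attrs_s, price_s, attrs_t, price_t):
    """Build (features, target) for a pair of transactions S and T.

    Features are the attribute differences (s1-t1, s2-t2, ...), and the
    target is (price of S / price of T) - 1.
    """
    features = [s - t for s, t in zip(attrs_s, attrs_t)]
    target = price_s / price_t - 1
    return features, target
```

When S and T have identical attributes, every feature is 0, so a model fitted without a constant term predicts a target of 0, that is, equal prices.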

The market price estimation unit 300 may estimate the market price of the property under evaluation by applying the building-related information of the property under evaluation to the transaction price estimation learning model.

The market price estimation unit 300 may additionally apply to the transaction price estimation learning model the building-related information of a predetermined number of transacted properties located within a predetermined distance of the property under evaluation. Referring to FIG. 6, the building-related information and actual transaction prices of the transacted properties H1 to H3 that fall within distance L2 may be applied to the learning model together to estimate the market price.

When the market price estimation unit 300 applies the building-related information of the property under evaluation to the learning model, the initial output is expressed as a ratio relative to another transacted property (for example, the transaction price ratio described above). The market price of the property under evaluation can therefore be estimated by multiplying this ratio by the actual transaction price of that transacted property.

For example, when 20 transacted properties are used for market price estimation, 20 candidate market prices may be calculated. The market price estimation unit 300 may take the average of the candidate market prices as the final estimated market price.
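As a non-limiting illustration, turning predicted ratios into a final estimate might be sketched as follows. The function name `estimate_market_price` and the assumption that the learning model has already produced one ratio per reference property are specific to this sketch.

```python
from statistics import mean


def estimate_market_price(predicted_ratios, reference_prices):
    """Return (final estimate, candidate prices).

    Each candidate is a predicted ratio multiplied by the actual transaction
    price of the corresponding reference property; the final estimate is the
    arithmetic mean of the candidates.
    """
    candidates = [r * p for r, p in zip(predicted_ratios, reference_prices)]
    return mean(candidates), candidates
```

Returning the candidates alongside the mean makes it possible to inspect how much the individual reference properties disagree.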

The user interface providing unit 500 provides the user with a user interface that presents the calculated market price information. The user interface may be provided to the user through the web or an application. The user interface providing unit 500 may provide a map 510 to the user, and the user can select on the map 510 the property whose market price is to be checked. For the property selected by the user, the user interface providing unit 500 may provide the estimated market price or the building register information. To this end, the user interface providing unit 500 may provide, together with or as an alternative to the map 510, a building register information display unit 520 or a detailed property display unit 530.

The building register information display unit 520 may display information such as the use, area, and configuration of the selected property. The detailed property display unit 530 may display detailed real estate information (for example, unit numbers) for the selected property. In the case of a multi-household house, the detailed property display unit 530 may display the approximate location of the parking lot or of each unit. The user can select the desired unit and check the estimated market price of the selected unit.

FIG. 7 is a flowchart of a real estate market clustering method according to an embodiment of the present invention, and FIG. 8 is a flowchart of a real estate market price estimation method according to an embodiment of the present invention. The real estate market clustering method may be implemented by the components of the apparatus 10 described above. Likewise, the real estate market price estimation method may be implemented by the components of the system 1000 described above.

In one embodiment, the real estate market clustering method may include: a) acquiring real estate real-transaction information including the real transaction price and building-related information of real-transaction properties located within a predetermined area (S100); b) generating a first cluster set in the predetermined area (S151); c) calculating a cluster tendency coefficient for each of a first cluster and a second cluster included in the first cluster set by applying the real estate real-transaction information for the first cluster and the second cluster to a cluster analysis model (S152); d) generating a similarity score between the first cluster and the second cluster using the cluster tendency coefficients for the first cluster and the second cluster (S153); e) merging the first cluster and the second cluster into a third cluster when the similarity score is equal to or greater than a threshold (S154); and f) generating a second cluster set including the third cluster (S155). As described above, the cluster tendency coefficient indicates the influence of building-related information on the real transaction price.
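
Step S152 can be sketched in Python for the log-linear cluster analysis model recited in the claims, ln(Y) = α1ln(X1) + α2ln(X2) + α3ln(X3) + α4(X4) + α0. The sketch below assumes that the coefficients are obtained by ordinary least squares, which the text does not state; the function name `fit_cluster_tendency`, the synthetic data, and the coefficient values are all hypothetical.

```python
import math
import random


def fit_cluster_tendency(rows):
    """Least-squares fit of ln(Y) = a1*ln(X1) + a2*ln(X2) + a3*ln(X3) + a4*X4 + a0.

    rows: list of (price, exclusive_area, land_right_area, land_price, years_in_use).
    Returns [a1, a2, a3, a4, a0]. Ordinary least squares is an assumption here.
    """
    design = [[math.log(x1), math.log(x2), math.log(x3), x4, 1.0]
              for _, x1, x2, x3, x4 in rows]
    target = [math.log(row[0]) for row in rows]
    n = 5
    # Augmented normal equations: (X^T X | X^T y).
    aug = [[sum(r[i] * r[j] for r in design) for j in range(n)]
           + [sum(r[i] * t for r, t in zip(design, target))]
           for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Synthetic, noise-free transactions for one cluster (hypothetical values).
rng = random.Random(0)
true_coeffs = [0.6, 0.2, 0.3, -0.01, 1.5]
rows = []
for _ in range(200):
    x1 = rng.uniform(20, 100)      # exclusive-use area
    x2 = rng.uniform(10, 50)       # land-right area
    x3 = rng.uniform(1e6, 5e6)     # individual officially announced land price
    x4 = rng.uniform(0, 30)        # years in use
    ln_y = (true_coeffs[0] * math.log(x1) + true_coeffs[1] * math.log(x2)
            + true_coeffs[2] * math.log(x3) + true_coeffs[3] * x4 + true_coeffs[4])
    rows.append((math.exp(ln_y), x1, x2, x3, x4))

coeffs = fit_cluster_tendency(rows)
print([round(c, 4) for c in coeffs])
```

Solving the normal equations directly keeps the sketch free of external dependencies; a numerical library's least-squares routine would be the more robust choice for real data, and a cluster with too few or collinear observations would make the elimination fail.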

In one embodiment, the real estate market clustering method may further include repeating steps S151 to S155 (S156), and completing step S156 based on the number of real-transaction properties located in at least one cluster included in the cluster set (S157). Here, the sample criterion is a criterion for determining whether a cluster contains samples (real-transaction properties) from which statistically meaningful results can be derived.

In one example, when the sample criterion is 1000, a cluster must contain at least 1000 real-transaction properties to yield statistically meaningful results. In this case, if at least one cluster in the new cluster set generated in step S155 contains fewer real-transaction properties than the sample criterion, clusters are merged again and a new cluster set is regenerated. Then, once every cluster contains at least as many real-transaction properties as the sample criterion, the repetition is completed (S157) and the final cluster set is generated.

As a result, clusters that contain only 10 or 50 real-transaction properties, and would therefore yield statistically meaningless results, are eliminated. In addition, this prevents the problem of step S156 continuing until only a single cluster finally remains in the predetermined area.
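
The repetition and its stopping rule (S156, S157) can be sketched as follows. This is a simplified illustration, not the method as claimed: it assumes a caller-supplied `similarity` function in place of the coefficient-based similarity score, and it assumes that each round merges the smallest undersized cluster with its most similar partner. The names `merge_until_sample_criterion` and `mean_price_similarity` are hypothetical.

```python
def merge_until_sample_criterion(clusters, similarity, sample_criterion):
    """Merge clusters until each holds at least `sample_criterion` properties.

    clusters: dict mapping a cluster id to a list of properties.
    similarity: function(list_a, list_b) -> float, higher means more alike.
    The loop also stops if only one cluster is left.
    """
    clusters = {cid: list(props) for cid, props in clusters.items()}
    while len(clusters) > 1:
        smallest = min(clusters, key=lambda cid: len(clusters[cid]))
        if len(clusters[smallest]) >= sample_criterion:
            break  # every cluster meets the sample criterion (S157)
        partner = max((cid for cid in clusters if cid != smallest),
                      key=lambda cid: similarity(clusters[smallest], clusters[cid]))
        merged = clusters.pop(smallest) + clusters.pop(partner)
        clusters[f"{smallest}+{partner}"] = merged
    return clusters


def mean_price_similarity(a, b):
    """Hypothetical stand-in: clusters with closer mean prices are more alike."""
    return -abs(sum(a) / len(a) - sum(b) / len(b))


# Each property is represented only by a price (hypothetical data).
initial = {"A": [100, 100, 100], "B": [105, 105],
           "C": [300, 300, 300, 300, 300], "D": [310]}
final = merge_until_sample_criterion(initial, mean_price_similarity, sample_criterion=4)
print({cid: len(props) for cid, props in final.items()})
```

Stopping as soon as the smallest cluster meets the criterion is what keeps the loop from collapsing the whole area into one cluster.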

The number of real-transaction properties corresponding to the sample criterion may be set according to various factors, such as the area of the cluster, the area of the predetermined area, and the frequency and density of real-transaction properties. Alternatively, it may be set by user input and may be updated later.

In one embodiment, the step of merging the first cluster and the second cluster into the third cluster (S154) may be further based on the location of a first real-transaction property included in the first cluster and the location of a second real-transaction property included in the second cluster.
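
A merge decision that combines the similarity score with property locations might look like the sketch below. Two points are assumptions: cosine similarity stands in for the similarity score, whose equation is not reproduced in this text, and the location condition is taken to be a maximum gap between the closest pair of properties. The names `similarity_score`, `min_distance`, `should_merge`, and `max_gap` are hypothetical.

```python
import math


def similarity_score(coeffs_a, coeffs_b):
    """Cosine similarity of two cluster tendency coefficient vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(coeffs_a, coeffs_b))
    norm = (math.sqrt(sum(x * x for x in coeffs_a))
            * math.sqrt(sum(y * y for y in coeffs_b)))
    return dot / norm if norm else 0.0


def min_distance(points_a, points_b):
    """Smallest distance between a property in one cluster and one in the other."""
    return min(math.dist(p, q) for p in points_a for q in points_b)


def should_merge(coeffs_a, coeffs_b, points_a, points_b, threshold, max_gap):
    """Merge only if the clusters are similar enough and close enough (S154)."""
    return (similarity_score(coeffs_a, coeffs_b) >= threshold
            and min_distance(points_a, points_b) <= max_gap)


# Hypothetical coefficient vectors and property coordinates (in metres).
coeffs_a = [0.60, 0.20, 0.30, -0.010, 1.50]
coeffs_b = [0.58, 0.22, 0.31, -0.012, 1.45]
points_a = [(0, 0), (100, 50)]
points_b = [(300, 200), (900, 900)]
print(should_merge(coeffs_a, coeffs_b, points_a, points_b, threshold=0.99, max_gap=500))
```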

In one embodiment, as shown in FIG. 7, a real estate market price estimation method using big data, which uses the cluster set generated by the real estate market clustering method, includes: acquiring real estate real-transaction information including the real transaction price and building-related information of real-transaction properties located within a predetermined area (S100); clustering the predetermined area into a cluster set (S150); generating a transaction price estimation learning model for each cluster included in the cluster set using the acquired real-transaction property information (S200); and estimating the market price of a property to be evaluated by applying the building-related information of the property to be evaluated to the transaction price estimation learning model associated with that property (S300), wherein step S150 includes steps S151 to S155.

In this embodiment, the cluster set used to generate the transaction price estimation learning models is the new cluster set generated in step S155. Accordingly, a set of transaction price estimation learning models is generated whose number corresponds to the number of clusters included in the new cluster set.

When a user wishes to estimate the market price of a property to be evaluated that is located in any one cluster of the new cluster set generated in step S155, the market price is estimated by applying the building-related information of the property to the transaction price estimation learning model for the cluster in which the property is located (that is, the transaction price estimation learning model associated with the property to be evaluated).

In another embodiment, step S150 may further include steps S156 and S157. In this case, steps S152 to S155 may be performed on the cluster set generated in step S155 (the second cluster set) to generate a new cluster set (for example, a third cluster set). A set of transaction price estimation learning models corresponding to the cluster set completed in step S157 is then generated (S200). In step S300, the transaction price estimation learning model associated with the property to be evaluated is based on the third cluster set.

In one embodiment, generating the transaction price estimation learning model may include: generating real-transaction property pairs for real-transaction properties located within a predetermined distance among the real-transaction properties; calculating a transaction price ratio for each real-transaction property pair; and a modeling step of generating the transaction price estimation learning model based on the transaction price ratios and the building-related information of the real-transaction property pairs. Here, the transaction price ratio may relate to the value obtained by dividing the first real transaction price of a first real-transaction property by the second real transaction price of a second real-transaction property.

The modeling step may generate the transaction price estimation learning model using at least one of multiple regression analysis, an artificial neural network, M5P (decision tree), a Bayesian network, and CART.

In one embodiment, a real-transaction property within the predetermined distance may be applied as the independent transaction in a first real-transaction property pair and as the dependent transaction in a second real-transaction property pair.
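
The pair generation and transaction price ratio can be sketched as follows. Ordered pairs are used so that each property appears once in each position, which mirrors a property serving as the independent transaction in one pair and the dependent transaction in another; treating the `first` position as the independent transaction is an assumption, as are the field names and sample data.

```python
import math
from itertools import permutations


def build_pairs(properties, max_distance):
    """Return ordered property pairs within `max_distance` and their price ratios.

    properties: list of dicts with keys 'id', 'xy' (coordinates), and 'price'.
    The ratio is the first property's price divided by the second's.
    """
    pairs = []
    for first, second in permutations(properties, 2):
        if math.dist(first["xy"], second["xy"]) <= max_distance:
            pairs.append({
                "first": first["id"],    # assumed: independent transaction
                "second": second["id"],  # assumed: dependent transaction
                "price_ratio": first["price"] / second["price"],
            })
    return pairs


# Hypothetical real-transaction properties (coordinates in metres).
properties = [
    {"id": "P1", "xy": (0, 0), "price": 300.0},
    {"id": "P2", "xy": (30, 40), "price": 200.0},
    {"id": "P3", "xy": (1000, 1000), "price": 250.0},
]
pairs = build_pairs(properties, max_distance=100)
for pair in pairs:
    print(pair)
```

The resulting ratios, together with the building-related information of both properties in each pair, would form the training data for the modeling step.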

In this specification, a database system or database apparatus may be entirely hardware, or may be partly hardware and partly software.

A method according to an embodiment of the present invention may be implemented in the form of a computer program for performing a series of processes, and the computer program may be recorded on a computer-readable recording medium. The computer-readable recording medium on which a program for implementing the real estate market clustering method according to the embodiments is recorded includes every kind of recording device that stores data readable by a computer. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium may also be distributed over computer systems connected by a network, so that computer-readable code is stored and executed in a distributed manner.

The present invention has been described above with reference to the embodiments shown in the drawings, but these are merely illustrative, and those of ordinary skill in the art will understand that various modifications and variations of the embodiments are possible. Such modifications, however, should be regarded as falling within the technical protection scope of the present invention. Accordingly, the true technical protection scope of the present invention should be determined by the technical spirit of the appended claims.

Claims

A real estate market clustering method using big data, executed by a real estate market clustering apparatus,
the method comprising:
a) acquiring, by a real-transaction property information acquisition unit, real estate real-transaction information including the real transaction price and building-related information of real-transaction properties located within a predetermined area;
b) generating, by a clustering unit, a first cluster set in the predetermined area;
c) calculating, by the clustering unit, a cluster tendency coefficient for each of a first cluster and a second cluster included in the first cluster set by applying the real estate real-transaction information for the first cluster and the second cluster to a cluster analysis model;
d) generating, by the clustering unit, a similarity score between the first cluster and the second cluster using the cluster tendency coefficients for the first cluster and the second cluster;
e) merging, by the clustering unit, the first cluster and the second cluster into a third cluster based on the similarity score; and
f) generating, by the clustering unit, a second cluster set including the third cluster,
wherein the cluster tendency coefficient indicates the relationship between building-related information and the real transaction price,
wherein the cluster analysis model is a model that calculates the cluster tendency coefficient based on at least one of the real transaction price, exclusive-use area, land-right area, individual officially announced land price, transaction time, and construction time of a real-transaction property, and is expressed by the following equation,
ln(Y) = α1ln(X1) + α2ln(X2) + α3ln(X3) + α4(X4) + α0
where Y is the real transaction price, X1 is the exclusive-use area, X2 is the land-right area, X3 is the individual officially announced land price, and X4 is the number of years in use; α1 is the cluster tendency coefficient of X1, α2 is the cluster tendency coefficient of X2, α3 is the cluster tendency coefficient of X3, α4 is the cluster tendency coefficient of X4, and α0 is the cluster tendency coefficient of real estate related information other than X1, X2, X3, and X4.

The method according to claim 1, further comprising:
repeating, by the clustering unit, steps c) to f); and
completing the repeating step based on the number of real-transaction properties located in at least one cluster included in the second cluster set.

The method according to claim 1,
wherein the step of merging the first cluster and the second cluster into the third cluster
is further based on the location of a first real-transaction property included in the first cluster and the location of a second real-transaction property included in the second cluster.

The method according to claim 1,
wherein the step of generating the first cluster set in the predetermined area
generates the first cluster set based on at least one fixed point located within the predetermined area.

The method according to claim 4,
wherein the fixed point is designated based on the degree of cluster bias of the first cluster set.

(Deleted)

The method according to claim 1,
wherein the merging step
merges the first cluster and the second cluster into the third cluster when the similarity score is equal to or greater than a threshold,
and the threshold is set based on the number of clusters included in the first cluster set.

(Deleted)

The method according to claim 1,
wherein the similarity score is expressed by the following equation,

where the similarity score is the similarity score between the first cluster and the second cluster, A is the cluster tendency coefficient for the first cluster, B is the cluster tendency coefficient for the second cluster, and N is the number of cluster tendency coefficients calculated by the cluster analysis model.

A computer-readable recording medium on which is recorded a computer program for executing the real estate market clustering method according to any one of claims 1 to 5, 7, and 9.

A real estate market price estimation method for estimating the market price of a property to be evaluated by clustering a real estate market, the method comprising:
acquiring, by a real-transaction property information acquisition unit, real estate real-transaction information including the real transaction price and building-related information of real-transaction properties located within a predetermined area;
generating, by a clustering unit, a first cluster set in the predetermined area;
calculating, by the clustering unit, a cluster tendency coefficient for each of a first cluster and a second cluster included in the first cluster set by applying the real estate real-transaction information for the first cluster and the second cluster to a cluster analysis model;
generating, by the clustering unit, a similarity score between the first cluster and the second cluster using the cluster tendency coefficients for the first cluster and the second cluster;
merging, by the clustering unit, the first cluster and the second cluster into a third cluster based on the similarity score;
generating, by the clustering unit, a second cluster set including the third cluster;
generating, by a learning model generation unit, a transaction price estimation learning model for each cluster included in the cluster set using the acquired real-transaction property information; and
further comprising estimating, by a market price estimation unit, the market price of the property to be evaluated by applying the building-related information of the property to be evaluated to the transaction price estimation learning model associated with the property to be evaluated,
wherein the step of generating the transaction price estimation learning model comprises:
generating real-transaction property pairs for real-transaction properties located within a predetermined distance among the real-transaction properties;
calculating a transaction price ratio for each real-transaction property pair; and
a modeling step of generating the transaction price estimation learning model based on the transaction price ratios and the building-related information of the real-transaction property pairs,
wherein a real-transaction property within the predetermined distance is applied as the independent transaction in a first real-transaction property pair and as the dependent transaction in a second real-transaction property pair.

A computer-readable recording medium on which is recorded a computer program for executing the real estate market price estimation method according to claim 11.

A real estate market clustering apparatus comprising:
a real-transaction property information acquisition unit that acquires real-transaction property information including the real transaction price and building-related information of real-transaction properties located within a predetermined area; and
a clustering unit that clusters the predetermined area using the real-transaction property information,
wherein the clustering unit comprises:
a cluster analysis unit that calculates a cluster tendency coefficient for each of a first cluster and a second cluster included in a first cluster set generated in the predetermined area by applying the real estate real-transaction information for the first cluster and the second cluster to a cluster analysis model;
a cluster comparison unit that generates a similarity score between the first cluster and the second cluster using the cluster tendency coefficients for the first cluster and the second cluster; and
a cluster merging unit that merges the first cluster and the second cluster into a third cluster based on the similarity score,
wherein the cluster analysis model is a model that calculates the cluster tendency coefficient based on at least one of the real transaction price, exclusive-use area, land-right area, individual officially announced land price, transaction time, and construction time of a real-transaction property, and is expressed by the following equation,
ln(Y) = α1ln(X1) + α2ln(X2) + α3ln(X3) + α4(X4) + α0
where Y is the real transaction price, X1 is the exclusive-use area, X2 is the land-right area, X3 is the individual officially announced land price, and X4 is the number of years in use,
and α1 is the cluster tendency coefficient of X1, α2 is the cluster tendency coefficient of X2, α3 is the cluster tendency coefficient of X3, α4 is the cluster tendency coefficient of X4, and α0 is the cluster tendency coefficient of real estate related information other than X1, X2, X3, and X4.

The apparatus according to claim 13,
wherein the cluster merging unit
performs the merging further based on the location of a first real-transaction property included in the first cluster and the location of a second real-transaction property included in the second cluster.

The apparatus according to claim 13,
wherein the clustering unit
generates the first cluster set based on at least one fixed point located within the predetermined area.

The apparatus according to claim 15,
wherein the fixed point is designated based on the degree of cluster bias of the real estate market cluster set.

(Deleted)

The apparatus according to claim 13,
wherein the cluster merging unit
merges the first cluster and the second cluster into the third cluster when the similarity score is equal to or greater than a threshold,
and the threshold is set based on the number of clusters included in the first cluster set.

(Deleted)

The apparatus according to claim 13,
wherein the similarity score is expressed by the following equation,

where the similarity score is the similarity score between the first cluster and the second cluster, A is the cluster tendency coefficient for the first cluster, B is the cluster tendency coefficient for the second cluster, and N is the number of cluster tendency coefficients calculated by the cluster analysis model.

A real estate market price estimation system for estimating the market price of a property to be evaluated by clustering a real estate market, the system comprising:
a real-transaction property information acquisition unit that acquires real-transaction property information including the real transaction price and building-related information of real-transaction properties located within a predetermined area;
a clustering unit that clusters the predetermined area into a cluster set including a plurality of clusters using the real-transaction property information;
a learning model generation unit that generates a transaction price estimation learning model for each cluster included in the cluster set using the acquired real-transaction property information; and
a market price estimation unit that estimates the market price of the property to be evaluated by applying the building-related information of the property to be evaluated to the transaction price estimation learning model associated with the property to be evaluated,
wherein the clustering unit comprises:
a cluster analysis unit that calculates a cluster tendency coefficient for each of a first cluster and a second cluster included in a first cluster set generated in the predetermined area by applying the real estate real-transaction information for the first cluster and the second cluster to a cluster analysis model;
a cluster comparison unit that generates a similarity score between the first cluster and the second cluster using the cluster tendency coefficients for the first cluster and the second cluster; and
a cluster merging unit that merges the first cluster and the second cluster into a third cluster based on the similarity score,
wherein the learning model generation unit comprises:
a real-transaction property pair generation unit that generates real-transaction property pairs for real-transaction properties located within a predetermined distance among the real-transaction properties;
a transaction price ratio calculation unit that calculates a transaction price ratio for each real-transaction property pair; and
a modeling unit that generates the transaction price estimation learning model based on the transaction price ratios and the building-related information of the real-transaction property pairs,
wherein a real-transaction property within the predetermined distance is applied as the independent transaction in a first real-transaction property pair and as the dependent transaction in a second real-transaction property pair.