JP2020098388A

JP2020098388A - Demand prediction method, demand prediction program, and demand prediction device

Info

Publication number: JP2020098388A
Application number: JP2018235377A
Authority: JP
Inventors: Hiroko Suzuki; Isamu Watabe
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-12-17
Filing date: 2018-12-17
Publication date: 2020-06-25
Anticipated expiration: 2038-12-17
Also published as: JP7139932B2

Abstract

To increase the accuracy of demand prediction for a new item before its sale. SOLUTION: A demand prediction device extracts, on the basis of a preset condition, characteristic words indicating the attributes of each item from documents that describe the attributes of an existing item whose sale has already started or of a new item whose sale has not started. From the appearance frequency of the characteristic words contained in each document, the demand prediction device generates clustering information indicating, for each item, a combination of the degrees to which the item contains the characteristic words. The demand prediction device trains a prediction model that predicts demand for the new item, using learning data in which the generated clustering information is set as the explanatory variable and the sales record of the existing item is set as the objective variable. SELECTED DRAWING: Figure 1

Description

The present invention relates to a demand prediction method, a demand prediction program, and a demand prediction device.

Product demand is generally predicted by projecting the future from trends in past sales. At the research and development or planning stage before a new product is launched, however, no order data or sales data exists for the product to be predicted, so its sales cannot be predicted in this way. Instead, similar products that have already been released are searched for, and their past sales data is used to predict the sales of the new product.

For example, a technique is known in which, before or shortly after the launch of a new product, the person running the prediction searches for a plurality of similar products and specifies a weight for each, and demand is predicted by calculating the weighted sum of the past sales of the similar products using those weights. Another known technique extracts similar products released in the past by using product attribute data and social media data containing comments about products, and predicts the sales of a new product from the sales record of those similar products.

JP 2008-186413 A
JP 2000-3388 A
JP 2013-182415 A

With the above techniques, however, it is difficult to improve the accuracy of demand prediction for a new product before its release.

For example, when the person running the prediction specifies the similar products and their weights, the result depends heavily on subjective, person-specific judgment, and it is not possible to grasp quantitatively which characteristics of the similar products affect the demand for the new product. The accuracy of the demand prediction for the new product is therefore not necessarily high. In addition, neither a weighted sum of the past sales of similar products nor social media data can correctly predict demand for a new product that cannot be expressed as a simple combination of past similar products.

In one aspect, an object is to provide a demand prediction method, a demand prediction program, and a demand prediction device capable of improving the accuracy of demand prediction for a new product.

In a first proposal, in the demand prediction method, a computer executes a process of extracting, on the basis of preset conditions, characteristic words indicating the attributes of each product from documents that describe the attributes of an existing product whose sale has started or of a new product whose sale has not started. The computer executes a process of generating, from the appearance frequency of the characteristic words contained in each document, clustering information indicating, for each product, a combination of the degrees to which the product has the characteristic words. The computer executes a process of training a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as the explanatory variable and the sales record of the existing product is set as the objective variable.

According to one embodiment, the accuracy of demand prediction for a new product can be improved.

FIG. 1 is a diagram illustrating the demand prediction device according to the first embodiment.
FIG. 2 is a functional block diagram illustrating the functional configuration of the demand prediction device according to the first embodiment.
FIG. 3 is a diagram illustrating an example of a plan document stored in the plan document DB.
FIG. 4 is a diagram illustrating an example of information stored in the sales information DB.
FIG. 5 is a diagram illustrating an example of information stored in the monthly sales information DB.
FIG. 6 is a diagram illustrating the learning phase according to the first embodiment.
FIG. 7 is a diagram illustrating the application phase according to the first embodiment.
FIG. 8 is a flowchart illustrating the flow of processing.
FIG. 9 is a diagram illustrating the demand prediction device according to the second embodiment.
FIG. 10 is a diagram illustrating the learning phase according to the second embodiment.
FIG. 11 is a diagram illustrating the application phase according to the second embodiment.
FIG. 12 is a diagram illustrating an example of smoothing.
FIG. 13 is a diagram illustrating another example of smoothing.
FIG. 14 is a diagram illustrating a smoothing result.
FIG. 15 is a diagram illustrating the effect.
FIG. 16 is a diagram illustrating a comparative example of the effect.
FIG. 17 is a diagram illustrating a hardware configuration example.

Embodiments of the demand prediction method, the demand prediction program, and the demand prediction device disclosed in the present application are described in detail below with reference to the drawings. The present invention is not limited by these embodiments. The embodiments can also be combined as appropriate to the extent that no contradiction arises.

[Explanation of demand prediction device]
FIG. 1 is a diagram illustrating the demand prediction device 10 according to the first embodiment. The demand prediction device 10 illustrated in FIG. 1 is an example of a computer device that predicts demand for a new product before its release. The demand prediction device 10 trains a prediction model in a learning phase and, in an application phase, predicts demand using the trained prediction model.

As illustrated in FIG. 1, in the learning phase the demand prediction device 10 extracts words (keywords) that represent the content from text information such as the plan documents of existing products already released and the plan documents of new products created during pre-release research and development or planning. The demand prediction device 10 then performs clustering using the extracted keywords to generate clusters. After that, the demand prediction device 10 trains a prediction model for demand prediction using learning data in which the cluster results of the existing products are the explanatory variables and the sales information of the existing products is the objective variable.

In the application phase after training is complete, the demand prediction device 10 inputs the cluster result of the new product, generated in the learning phase, into the trained prediction model. The demand prediction device 10 then acquires the output of the prediction model as the demand prediction.

In this way, the demand prediction device 10 performs clustering with the text information associated with products as input, obtains clusters that each form a semantically coherent group of keywords, and feeds the cluster results into the prediction model as explanatory variables. This makes it possible to calculate quantitatively how strongly each cluster influences sales. The demand prediction device 10 can therefore improve the accuracy of demand prediction for a new product.

[Functional configuration]
FIG. 2 is a functional block diagram illustrating the functional configuration of the demand prediction device 10 according to the first embodiment. As illustrated in FIG. 2, the demand prediction device 10 includes a communication unit 11, a storage unit 12, and a control unit 30.

The communication unit 11 is a processing unit that controls communication with other devices, and is, for example, a communication interface. For example, the communication unit 11 receives instructions to start various processes and various data from an administrator, and transmits learning results, prediction results, and the like to an administrator terminal.

The storage unit 12 is an example of a storage device that stores various data and the programs executed by the control unit 30, and is, for example, a memory or a hard disk. The storage unit 12 includes a plan document DB 13, a sales information DB 14, a monthly sales information DB 15, a text information DB 16, a weight information DB 17, a cluster DB 18, a learning data DB 19, a learning result DB 20, and a prediction result DB 21.

The plan document DB 13 is a database that stores the plan document data of existing products already released and the plan document data of new products that have not yet been released and were created at the research and development or planning stage. Specifically, the plan document DB 13 stores the data of each plan document, which contains keywords representing information about the product such as material names, the target age group, and the content of the product.

FIG. 3 is a diagram illustrating an example of a plan document stored in the plan document DB 13. As illustrated in FIG. 3, a plan document contains an item a and an item b, which represent, for example, the features or the description of the product. Item a contains a document 1a that describes the information about item a in detail, and item b contains a document 1b that describes the information about item b in detail. The documents 1a and 1b contain keywords. For example, item a is an item that describes the product features, and item b is an item that states the target of the product.

The sales information DB 14 is a database that stores the sales information of existing products. Specifically, the sales information DB 14 stores the sales of each existing product on its release date. FIG. 4 is a diagram illustrating an example of information stored in the sales information DB 14. As illustrated in FIG. 4, the sales information DB 14 stores "product", "release date", and "sales" in association with one another.

Here, "product" is the product name of a released existing product, "release date" is the date on which sale started, and "sales" is, for example, the number of units sold. The example in FIG. 4 shows that product 1 went on sale on June 10, 2018 and that 100 units were sold on that day. The sales information DB 14 is not limited to sales on the release date and can also store the sales on some other specific sale date. In this embodiment, product 1, product 2, and product 3 are described as existing products.

The monthly sales information DB 15 is a database that stores the monthly sales information of existing products. FIG. 5 is a diagram illustrating an example of information stored in the monthly sales information DB 15. As illustrated in FIG. 5, the monthly sales information DB 15 stores "product", "first month", "second month", and "third month" in association with one another.

Here, "product" is the product name of a released existing product, and "first month" and so on are, for example, the number of units sold in the first, second, and third months after release. The example in FIG. 5 shows that product 1 sold 1000 units in the first month after release, 300 units from the first to the second month, and 100 units from the second to the third month. The information is not limited to monthly figures; daily or yearly figures can also be used.

The text information DB 16 is a database that stores data about the plan documents of existing products and new products. Specifically, the text information DB 16 stores, for each product, which documents are contained in item a and in item b.

For example, the text information DB 16 stores "product", "item a", and "item b" in association with one another. Here, "product" is the product name, and "item a" and "item b" are information indicating where the documents from which the keywords used for cluster classification are extracted are located. As an example, when the document 1a contained in item a of product 1 and the document 1b contained in item b are the extraction sources, the text information DB 16 stores "product 1, document 1a, document 1b" as "product, item a, item b".

The weight information DB 17 is a database that stores information about the weights of the keywords contained in the text information. Specifically, the weight information DB 17 stores the weight of each keyword extracted from the plan documents and the like.

The cluster DB 18 is a database that stores information about the clusters into which the products, including both existing products and new products, are classified. Specifically, the cluster DB 18 stores the result of clustering the products by keyword. That is, the cluster DB 18 stores, for each product, the cluster IDs assigned to the clusters.

The learning data DB 19 is a database that stores the learning data used to train the monthly prediction models. Specifically, the learning data DB 19 stores a plurality of learning data items in which the cluster IDs of each product are set as the explanatory variables and the sales of each product in each month after release are set as the objective variable. For example, the learning data DB 19 stores learning data for training the prediction model for first-month sales, learning data for training the prediction model for second-month sales, and learning data for training the prediction model for third-month sales.

The learning result DB 20 is a database that stores the training results of the monthly prediction models. For example, the learning result DB 20 stores the discrimination results (classification results) of the learning data obtained by the control unit 30 and the various parameters learned by multiple regression analysis, machine learning, or the like. For example, the learning result DB 20 stores the various parameters needed to construct the prediction model for first-month sales, the prediction model for second-month sales, and the prediction model for third-month sales.

The prediction result DB 21 is a database that stores the sales predictions for new products obtained with the trained prediction models. Specifically, the prediction result DB 21 stores, for each new product, the predicted sales for the first month, the second month, and the third month.

The control unit 30 is a processing unit that controls the entire demand prediction device 10, and is, for example, a processor. The control unit 30 includes a learning processing unit 40 and a prediction processing unit 50. The learning processing unit 40 and the prediction processing unit 50 are examples of electronic circuits included in the processor or of processes executed by the processor.

The learning processing unit 40 is a processing unit that trains a prediction model for each month, and includes a word extraction unit 41, a weight calculation unit 42, a selection unit 43, a clustering unit 44, a learning data generation unit 45, and a learning unit 46.

The word extraction unit 41 is a processing unit that extracts the keywords appearing in each item of the plan documents of existing products and new products. For example, the word extraction unit 41 performs morphological analysis or the like on the document 1a in item a of the plan document of product 1 and extracts K1a, K2a, K3a, K4a, and so on as keywords. The word extraction unit 41 likewise performs morphological analysis or the like on the document 1b in item b of the plan document of product 1 and extracts K1b, K2b, K3b, K4b, K5b, and so on as keywords.

In this way, the word extraction unit 41 extracts keywords from the plan documents of existing products and of new products, stores the extraction results in the text information DB 16, and outputs them to the weight calculation unit 42.
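The per-item extraction step can be sketched as follows. This is only an illustration: a regular-expression word split stands in for the morphological analysis named in the text, and the function name `extract_keywords`, the item names, and the sample plan text are all assumptions that do not appear in the source.

```python
import re

def extract_keywords(plan):
    """Return {item_name: [token, ...]} for one plan document.

    `plan` maps an item name to its free text. A regex word split is a
    stand-in for real morphological analysis (an assumption of this sketch).
    """
    return {
        item: re.findall(r"[a-z0-9]+", text.lower())
        for item, text in plan.items()
    }

# Hypothetical plan document for product 1 (sample text is made up).
plan_product1 = {
    "item_a": "Low-sugar oat cookie",
    "item_b": "Targets health-conscious adults",
}
keywords = extract_keywords(plan_product1)
print(keywords["item_a"])
```

Keeping the keywords grouped by item matches the way the later steps treat item a and item b separately.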

The weight calculation unit 42 is a processing unit that calculates the weight of each keyword. Specifically, the weight calculation unit 42 calculates the TF-IDF (term frequency-inverse document frequency) of each keyword extracted by the word extraction unit 41. In the above example, for product 1, the weight calculation unit 42 calculates the TF-IDF of the keyword "K1a" in the document 1a of item a as the weight of K1a.

In this way, the weight calculation unit 42 calculates the weight of each keyword extracted from the plan documents of existing products and of new products, stores the calculation results in the weight information DB 17, and outputs them to the selection unit 43.
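The text does not state which TF-IDF variant is used. The sketch below assumes one common form, tf = (occurrences in the document) / (document length) and idf = log(N / document frequency); the function name `tfidf` and the keyword lists are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: {doc_id: [keyword, ...]} -> {doc_id: {keyword: weight}}."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each keyword.
    df = Counter(k for words in docs.values() for k in set(words))
    weights = {}
    for doc_id, words in docs.items():
        counts = Counter(words)
        total = len(words)
        weights[doc_id] = {
            k: (c / total) * math.log(n_docs / df[k])
            for k, c in counts.items()
        }
    return weights

# Hypothetical keyword lists for item a of three products.
docs = {
    "doc1a": ["K2a", "K3a", "K2a"],
    "doc2a": ["K1a", "K3a"],
    "doc3a": ["K2a", "K3a"],
}
w = tfidf(docs)
print(w["doc2a"]["K1a"])
```

Under this variant a keyword that appears in every document, such as K3a here, receives a weight of zero, which is why a weight threshold can later remove uninformative words.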

The selection unit 43 is a processing unit that selects keywords. Specifically, from the keywords extracted by the word extraction unit 41, the selection unit 43 excludes keywords whose weight is below a predetermined value, keywords that appear in a stop word list, and keywords whose part of speech is designated for exclusion.

In this way, the selection unit 43 performs selection on the keywords of the existing products and on the keywords of the new products, and outputs the selection results to the clustering unit 44. The stop word list is a list of words that are not treated as keywords, and the parts of speech to be excluded are, for example, particles. These can be set in advance by an administrator or the like, or a general-purpose dictionary can be used.
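The three exclusion rules can be sketched as a single filter. The threshold value, the stop word list, the part-of-speech labels, and the function name `select_keywords` are all illustrative assumptions; the source only says that they are preset.

```python
def select_keywords(candidates, min_weight, stop_words, excluded_pos):
    """candidates: list of (keyword, weight, part_of_speech) tuples."""
    return [
        kw
        for kw, weight, pos in candidates
        if weight >= min_weight          # rule 1: weight threshold
        and kw not in stop_words         # rule 2: stop word list
        and pos not in excluded_pos      # rule 3: excluded parts of speech
    ]

candidates = [
    ("K1a", 0.05, "noun"),     # dropped: weight below the threshold
    ("K2a", 0.70, "noun"),
    ("K3a", 0.10, "noun"),
    ("the", 0.30, "article"),  # dropped: on the stop word list
    ("of", 0.40, "particle"),  # dropped: excluded part of speech
]
selected = select_keywords(
    candidates, min_weight=0.1, stop_words={"the"}, excluded_pos={"particle"}
)
print(selected)  # ['K2a', 'K3a']
```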

The clustering unit 44 is a processing unit that clusters the products using the keywords selected by the selection unit 43. That is, the clustering unit 44 generates clustering information indicating, for each product, a combination of the degrees to which the product has the characteristic words.

For example, the clustering unit 44 performs clustering on the products and assigns a cluster ID to each cluster. Among the keywords belonging to item a, the clustering unit 44 sets the ID "C1a" for the cluster that has K1a and K3a and the ID "C2a" for the cluster that has K2a and K3a. Similarly, among the keywords belonging to item b, the clustering unit 44 sets the ID "C1b" for the cluster that has K1b and K2b and the ID "C2b" for the cluster that has K2b and K3b.

The clustering unit 44 then associates the products with the cluster IDs and generates, for each product, the values of "item a (C1a, C2a), item b (C1b, C2b)". For example, when the keywords of product 1 are "K2a, K3a" in item a and "K1b, K2b" in item b, the clustering unit 44 generates "0, 1, 1, 0" as "item a (C1a, C2a), item b (C1b, C2b)" for product 1.

In this way, the clustering unit 44 clusters all products, including existing products and new products, stores the clustering results in the cluster DB 18, and outputs them to the learning data generation unit 45. Any of various common clustering methods can be used.
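The mapping from a product's keywords to its 0/1 cluster vector can be sketched as follows. The cluster definitions and the keywords of product 1 come from the example in the text; treating membership as "the product contains every keyword of the cluster" is a simplifying assumption of this sketch, since the source allows any common clustering method. The names `CLUSTERS` and `cluster_vector` are hypothetical.

```python
# Cluster ID -> keywords that define the cluster, grouped by item.
CLUSTERS = {
    "item_a": [("C1a", {"K1a", "K3a"}), ("C2a", {"K2a", "K3a"})],
    "item_b": [("C1b", {"K1b", "K2b"}), ("C2b", {"K2b", "K3b"})],
}

def cluster_vector(product_keywords):
    """Return a 0/1 list ordered as C1a, C2a, C1b, C2b."""
    vector = []
    for item, clusters in CLUSTERS.items():
        present = product_keywords.get(item, set())
        for _cluster_id, required in clusters:
            # 1 if the product has every keyword of the cluster, else 0.
            vector.append(1 if required <= present else 0)
    return vector

product1 = {"item_a": {"K2a", "K3a"}, "item_b": {"K1b", "K2b"}}
print(cluster_vector(product1))  # [0, 1, 1, 0]
```

The printed vector agrees with the "0, 1, 1, 0" given for product 1 in the text.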

The learning data generation unit 45 is a processing unit that generates learning data from the clustering results. Specifically, the learning data generation unit 45 generates learning data in which the clustering results of the existing products are the explanatory variables and the monthly sales information is the objective variable, and stores the learning data in the learning data DB 19.

In the above example, the learning data generation unit 45 generates learning data in which "0, 1, 1, 0", the values of "item a (C1a, C2a), item b (C1b, C2b)" for product 1, is the explanatory variable and each of the first-month sales "1000", the second-month sales "300", and the third-month sales "100" stored in the monthly sales information DB 15 is the objective variable.

That is, the learning data generation unit 45 generates "(0, 1, 1, 0), 1000" as learning data for the prediction model that predicts first-month sales, "(0, 1, 1, 0), 300" as learning data for the prediction model that predicts second-month sales, and "(0, 1, 1, 0), 100" as learning data for the prediction model that predicts third-month sales. In this way, the learning data generation unit 45 uses the cluster IDs to which the existing products belong to generate the learning data for each monthly prediction model.
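The pairing of one cluster vector with each month's sales can be sketched as follows, using the values given for product 1 in the text. The function name `build_learning_data` and the dictionary layout are assumptions made for illustration.

```python
def build_learning_data(cluster_vectors, monthly_sales):
    """Return {month: [(features, sales), ...]} for existing products.

    cluster_vectors: {product: [0/1, ...]}
    monthly_sales:   {product: [month-1 sales, month-2 sales, ...]}
    """
    datasets = {}
    for product, sales_by_month in monthly_sales.items():
        features = tuple(cluster_vectors[product])
        for month, sales in enumerate(sales_by_month, start=1):
            datasets.setdefault(month, []).append((features, sales))
    return datasets

cluster_vectors = {"product1": [0, 1, 1, 0]}
monthly_sales = {"product1": [1000, 300, 100]}
data = build_learning_data(cluster_vectors, monthly_sales)
print(data[1])  # [((0, 1, 1, 0), 1000)]
```

Each month ends up with its own data set, so a separate prediction model can be trained per month.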

The learning unit 46 is a processing unit that trains the prediction model for each month. Specifically, the learning unit 46 performs supervised learning that generates a prediction model for each month, using the monthly learning data stored in the learning data DB 19. For example, the learning unit 46 performs the learning process using multiple regression analysis.

The learning unit 46 then stores the training results in the learning result DB 20. The timing at which the learning process ends can be set freely, for example when training with at least a predetermined number of learning data items has been completed, or when the error between the objective variable (label) and the output of the prediction model falls below a threshold.
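A minimal sketch of fitting one monthly model by multiple regression is shown below. It is not the patent's implementation: it solves the least-squares normal equations in plain Python, and a tiny ridge term is added (an assumption of this sketch) so that the system stays solvable when there are fewer products than clusters. Only the vectors of products 1 and 2 and the first-month sales of product 1 come from the text; the vector of product 3 and the sales of products 2 and 3 are made up.

```python
def fit_linear(rows, targets, ridge=1e-8):
    """Least-squares fit with intercept; returns [intercept, coef_1, ...]."""
    X = [[1.0] + [float(v) for v in r] for r in rows]
    n = len(X[0])
    # Normal equations: (X^T X + ridge * I) w = X^T y
    A = [[sum(x[i] * x[j] for x in X) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - s) / A[i][i]
    return w

def predict(w, row):
    return w[0] + sum(wi * v for wi, v in zip(w[1:], row))

# First-month learning data: (C1a, C2a, C1b, C2b) -> units sold.
rows = [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 1]]  # third row is hypothetical
first_month_sales = [1000, 800, 600]               # 800 and 600 are hypothetical
model = fit_linear(rows, first_month_sales)
print(round(predict(model, rows[0])))
```

With so few rows the fit simply reproduces the training values; in practice many more existing products would be needed for the coefficients to say anything about how each cluster influences sales.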

The prediction processing unit 50 is a processing unit that predicts the sales of a new product using the trained prediction models. For example, the prediction processing unit 50 reads the various parameters from the learning result DB 20 and constructs the prediction model for the first month, the prediction model for the second month, and the prediction model for the third month. The prediction processing unit 50 then inputs the cluster IDs "1, 0, 1, 0" associated with the new product into each prediction model as features, acquires each output, and stores the outputs in the prediction result DB 21. In this way, the prediction processing unit 50 predicts the first-month, second-month, and third-month sales of the new product.
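Applying the monthly models to the new product's cluster vector can be sketched as follows. The vector "1, 0, 1, 0" comes from the text; the intercepts and coefficients in `MODELS` are invented placeholders standing in for parameters that would be read from the learning result DB 20, and the function name is hypothetical.

```python
# Hypothetical fitted parameters per month:
# (intercept, [coefficient for C1a, C2a, C1b, C2b]).
MODELS = {
    1: (100.0, [200.0, 50.0, 300.0, -20.0]),
    2: (40.0, [60.0, 10.0, 90.0, -5.0]),
    3: (10.0, [20.0, 5.0, 30.0, 0.0]),
}

def predict_monthly_sales(models, features):
    """Evaluate each month's linear model on one cluster vector."""
    return {
        month: intercept + sum(c * x for c, x in zip(coefs, features))
        for month, (intercept, coefs) in models.items()
    }

new_product = [1, 0, 1, 0]  # cluster vector of the new product
forecast = predict_monthly_sales(MODELS, new_product)
print(forecast)  # {1: 600.0, 2: 190.0, 3: 60.0}
```

Because the features are 0/1 cluster memberships, each coefficient reads directly as the contribution of one cluster to that month's predicted sales.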

[Specific example]
Next, specific examples of the learning phase and the application phase are described with reference to FIGS. 6 and 7. FIG. 6 is a diagram illustrating the learning phase according to the first embodiment. FIG. 7 is a diagram illustrating the application phase according to the first embodiment.

(Learning phase)
As illustrated in FIG. 6, the text information DB 16 stores, for each product, the document belonging to item a and the document belonging to item b. Specifically, the text information DB 16 stores "product 1, document 1a, document 1b", "product 2, document 2a, document 2b", "product 3, document 3a, document 3b", and "new product, document a, document b" as "product, item a, item b".

The word extraction unit 41 then extracts keywords from the documents of each product (S1). For example, the word extraction unit 41 extracts the keywords "K1a, K2a, K3a" from the documents of item a and the keywords "K1b, K2b, K3b" from the documents of item b.

続いて、重み算出部４２が、各キーワードのＴＦＩＤＦを算出し、選定部４３は、キーワードの選定を実行する（Ｓ２）。例えば、商品１については、Ｋ１ａは該当なし、Ｋ２ａは重み（０．７）、Ｋ３ａは重み（０．１）、Ｋ１ｂは重み（０．８）、Ｋ２ｂは重み（０．６）、Ｋ３ｂは該当なしと選定される。商品２については、Ｋ１ａは重み（０．８）、Ｋ２ａは該当なし、Ｋ３ａは重み（０．７）、Ｋ１ｂは該当なし、Ｋ２ｂは重み（０．１）、Ｋ３ｂは重み（０．８）と選定される。 Subsequently, the weight calculation unit 42 calculates the TF-IDF of each keyword, and the selection unit 43 selects the keywords (S2). For example, for product 1, the selection is: K1a not applicable, K2a weight 0.7, K3a weight 0.1, K1b weight 0.8, K2b weight 0.6, and K3b not applicable. For product 2, the selection is: K1a weight 0.8, K2a not applicable, K3a weight 0.7, K1b not applicable, K2b weight 0.1, and K3b weight 0.8.

商品３については、Ｋ１ａは該当なし、Ｋ２ａは重み（０．８）、Ｋ３ａは重み（０．３）、Ｋ１ｂは該当なし、Ｋ２ｂは重み（０．３）、Ｋ３ｂは重み（０．５）と選定される。新商品については、Ｋ１ａは重み（０．７）、Ｋ２ａは該当なし、Ｋ３ａは重み（０．９）、Ｋ１ｂは重み（０．７）、Ｋ２ｂは重み（０．６）、Ｋ３ｂは該当なしと選定される。 For product 3, the selection is: K1a not applicable, K2a weight 0.8, K3a weight 0.3, K1b not applicable, K2b weight 0.3, and K3b weight 0.5. For the new product, the selection is: K1a weight 0.7, K2a not applicable, K3a weight 0.9, K1b weight 0.7, K2b weight 0.6, and K3b not applicable.
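The weighting and selection steps (S1 and S2) can be sketched as follows. This is a minimal illustration, not the patented implementation: the tokenized documents, the stop-word list, the threshold value, and the exact TF-IDF variant (relative term frequency times `log(N / df)`) are all assumptions, since the text does not specify them.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return {product: {word: weight}} for docs = {product: [tokens]}.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each word once per document
    weights = {}
    for product, tokens in docs.items():
        counts = Counter(tokens)
        weights[product] = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return weights

def select_keywords(word_weights, stop_words, threshold):
    """Keep words that are not stop words and whose weight reaches the threshold."""
    return {
        word: weight
        for word, weight in word_weights.items()
        if word not in stop_words and weight >= threshold
    }

# Hypothetical tokenized "item a" documents for three existing products.
docs_item_a = {
    "product1": ["drink", "citrus", "citrus", "summer"],
    "product2": ["drink", "cocoa", "winter", "winter"],
    "product3": ["drink", "citrus", "winter"],
}
weights = tfidf(docs_item_a)
selected = select_keywords(weights["product1"], stop_words={"drink"}, threshold=0.1)
```

A word that appears in every document (here `drink`) receives weight 0 under this variant, which is the usual reason such words drop out during selection.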

そして、クラスタリング部４４は、重み算出やキーワード選定の結果を用いて、既存商品および新商品のクラスタリングを実行する（Ｓ３）。例えば、クラスタリング部４４は、項目ａについて、Ｋ１ａとＫ３ａを含む商品をクラスタＣ１ａに分類し、Ｋ２ａとＫ３ａを含む商品をクラスタＣ２ａに分類する。また、クラスタリング部４４は、項目ｂについて、Ｋ１ｂとＫ２ｂを含む商品をクラスタＣ１ｂに分類し、Ｋ２ｂとＫ３ｂを含む商品をクラスタＣ２ｂに分類する。 Then, the clustering unit 44 executes the clustering of the existing product and the new product by using the result of the weight calculation and the keyword selection (S3). For example, for the item a, the clustering unit 44 classifies the products including K1a and K3a into the cluster C1a and classifies the products including K2a and K3a into the cluster C2a. Further, for the item b, the clustering unit 44 classifies the products including K1b and K2b into the cluster C1b and classifies the products including K2b and K3b into the cluster C2b.

この結果、商品１は、項目ａに関してＣ２ａに属し、項目ｂに関してＣ１ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，１，０」が生成される。商品２は、項目ａに関してＣ１ａに属し、項目ｂに関してＣ２ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，０，１」が生成される。 As a result, the product 1 belongs to C2a for the item a and belongs to C1b for the item b, so that “0, 1, 1, 0” is generated as the clustering result “C1a, C2a, C1b, C2b”. Since the product 2 belongs to C1a regarding the item a and belongs to C2b regarding the item b, “1,0,0,1” is generated as the clustering result “C1a, C2a, C1b, C2b”.

商品３は、項目ａに関してＣ２ａに属し、項目ｂに関してＣ２ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，０，１」が生成される。新商品は、項目ａに関してＣ１ａに属し、項目ｂに関してＣ１ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，１，０」が生成される。 Since the product 3 belongs to C2a regarding the item a and belongs to C2b regarding the item b, “0, 1, 0, 1” is generated as the clustering result “C1a, C2a, C1b, C2b”. Since the new product belongs to C1a for item a and belongs to C1b for item b, “1,0,1,0” is generated as the clustering result “C1a, C2a, C1b, C2b”.
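The clustering result for each product is an indicator vector over the clusters in the fixed order C1a, C2a, C1b, C2b. A minimal sketch of that encoding, using the memberships stated above (the function and variable names are illustrative, not from the source):

```python
CLUSTER_ORDER = ["C1a", "C2a", "C1b", "C2b"]

def to_indicator_vector(memberships, order=CLUSTER_ORDER):
    """Return 1 for each cluster the product belongs to, 0 otherwise."""
    members = set(memberships)
    return [1 if cluster in members else 0 for cluster in order]

# Cluster memberships as described in the text.
assignments = {
    "product1": ["C2a", "C1b"],
    "product2": ["C1a", "C2b"],
    "product3": ["C2a", "C2b"],
    "new_product": ["C1a", "C1b"],
}
vectors = {name: to_indicator_vector(m) for name, m in assignments.items()}
```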

その後、学習データ生成部４５は、既存商品と新商品とを含むクラスタリング結果のうち既存商品のクラスタリング結果と、既存商品の売上実績とを用いて、学習データを生成する（Ｓ４とＳ５）。 After that, the learning data generation unit 45 generates learning data by using the clustering result of the existing product among the clustering results including the existing product and the new product and the sales record of the existing product (S4 and S5).

例えば、学習データ生成部４５は、商品１について３つの学習データを生成する。すなわち、学習データ生成部４５は、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、１か月目の売上「１０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、２か月目の売上「３００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、３か月目の売上「１００」を目的変数に設定した学習データを生成する。 For example, the learning data generation unit 45 generates three pieces of learning data for product 1. That is, it generates learning data in which the clustering result “0, 1, 1, 0” of product 1 is the explanatory variable and the first-month sales “1000” is the objective variable. It likewise generates learning data with the same explanatory variable and the second-month sales “300” as the objective variable, and learning data with the same explanatory variable and the third-month sales “100” as the objective variable.

同様に、学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、１か月目の売上「１５００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、２か月目の売上「１０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、３か月目の売上「７００」を目的変数に設定した学習データを生成する。 Similarly, the learning data generation unit 45 generates learning data in which the clustering result “1, 0, 0, 1” of product 2 is the explanatory variable and the first-month sales “1500” is the objective variable. It likewise generates learning data with the same explanatory variable and the second-month sales “1000” as the objective variable, and learning data with the same explanatory variable and the third-month sales “700” as the objective variable.

同様に、学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、１か月目の売上「５０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、２か月目の売上「２０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、３か月目の売上「５００」を目的変数に設定した学習データを生成する。 Similarly, the learning data generation unit 45 generates learning data in which the clustering result “0, 1, 0, 1” of product 3 is the explanatory variable and the first-month sales “5000” is the objective variable. It likewise generates learning data with the same explanatory variable and the second-month sales “2000” as the objective variable, and learning data with the same explanatory variable and the third-month sales “500” as the objective variable.
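The nine pieces of learning data described above amount to one training set per month, each pairing a product's cluster vector with that month's sales. A minimal sketch using the figures from the text (the function name and data layout are illustrative assumptions):

```python
def build_training_sets(cluster_vectors, monthly_sales):
    """Return one list of (explanatory vector, objective value) pairs per month."""
    n_months = len(next(iter(monthly_sales.values())))
    training_sets = [[] for _ in range(n_months)]
    for product, sales in monthly_sales.items():
        for month_index, units in enumerate(sales):
            training_sets[month_index].append((cluster_vectors[product], units))
    return training_sets

# Clustering results and monthly sales of the existing products, as given in the text.
cluster_vectors = {
    "product1": [0, 1, 1, 0],
    "product2": [1, 0, 0, 1],
    "product3": [0, 1, 0, 1],
}
monthly_sales = {
    "product1": [1000, 300, 100],
    "product2": [1500, 1000, 700],
    "product3": [5000, 2000, 500],
}
month1, month2, month3 = build_training_sets(cluster_vectors, monthly_sales)
```

The new product is deliberately absent from `monthly_sales`: it takes part in clustering but contributes no training rows because it has no sales record.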

そして、学習部４６は、生成された各学習データを用いて、各予測モデルを学習する（Ｓ６）。すなわち、学習部４６は、クラスタリング結果を商品の特徴を表すベクトルデータ、月別の売上情報を正解情報として、予測モデル（学習モデル）を学習する。 Then, the learning unit 46 learns each prediction model using each generated learning data (S6). That is, the learning unit 46 learns a prediction model (learning model) using the clustering result as vector data representing the features of the product and the monthly sales information as the correct answer information.

例えば、学習部４６は、商品１の学習データ「（０，１，１，０）、１０００」と、商品２の学習データ「（１，０，０，１）、１５００」と、商品３の学習データ「（０，１，０，１）、５０００」とを用いて、重回帰分析により、１か月目の売上予測を行う予測モデルを学習する。 For example, the learning unit 46 uses the learning data “(0,1,1,0), 1000” of product 1, the learning data “(1,0,0,1), 1500” of product 2, and the learning data “(0,1,0,1), 5000” of product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the first month.

同様に、学習部４６は、商品１の学習データ「（０，１，１，０）、３００」と、商品２の学習データ「（１，０，０，１）、１０００」と、商品３の学習データ「（０，１，０，１）、２０００」とを用いて、重回帰分析により、２か月目の売上予測を行う予測モデルを学習する。 Similarly, the learning unit 46 uses the learning data “(0,1,1,0), 300” of product 1, the learning data “(1,0,0,1), 1000” of product 2, and the learning data “(0,1,0,1), 2000” of product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the second month.

同様に、学習部４６は、商品１の学習データ「（０，１，１，０）、１００」と、商品２の学習データ「（１，０，０，１）、７００」と、商品３の学習データ「（０，１，０，１）、５００」とを用いて、重回帰分析により、３か月目の売上予測を行う予測モデルを学習する。 Similarly, the learning unit 46 uses the learning data “(0,1,1,0), 100” of product 1, the learning data “(1,0,0,1), 700” of product 2, and the learning data “(0,1,0,1), 500” of product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the third month.
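Fitting one linear model per month might look like the sketch below. The text names multiple regression analysis but gives no further detail, so several choices here are assumptions: there is no intercept term, and a small ridge term is added because three training rows cannot uniquely determine four coefficients. With so little toy data, the fitted models should not be expected to reproduce the illustrative outputs shown in the figures.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear(rows, targets, ridge=1e-6):
    """Least squares via the normal equations (X^T X + ridge * I) w = X^T y."""
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(n)]
    return solve_linear_system(xtx, xty)

def predict(coefficients, row):
    return sum(c * v for c, v in zip(coefficients, row))

# Explanatory variables (cluster vectors) and monthly sales from the text.
rows = [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 1]]
sales_by_month = [[1000, 1500, 5000], [300, 1000, 2000], [100, 700, 500]]
models = [fit_linear(rows, targets) for targets in sales_by_month]
```

In the application phase, each model in `models` would be evaluated with `predict` on the new product's vector.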

（適用フェーズ） (Application phase)
適用フェーズでは、図７に示すように、予測処理部５０は、図６のＳ３で得られたクラスタリング結果のうち、新商品のクラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，１，０」を抽出する。 In the application phase, as illustrated in FIG. 7, the prediction processing unit 50 extracts, from the clustering results obtained in S3 of FIG. 6, “1, 0, 1, 0” as the clustering result “C1a, C2a, C1b, C2b” of the new product.

そして、予測処理部５０は、新商品のクラスタリング結果「１，０，１，０」を、１か月用の予測モデル、２か月用の予測モデル、３か月用の予測モデルのそれぞれに入力する。その後、予測処理部５０は、１か月用の予測モデルの出力値「２５００」、２か月用の予測モデルの出力値「２０」、３か月用の予測モデルの出力値「３００」を取得する。 Then, the prediction processing unit 50 inputs the clustering result “1, 0, 1, 0” of the new product into each of the prediction model for the first month, the prediction model for the second month, and the prediction model for the third month. After that, the prediction processing unit 50 acquires the output value “2500” of the first-month model, the output value “20” of the second-month model, and the output value “300” of the third-month model.

この結果、予測処理部５０は、新商品が発売されてから１か月目の売上予測を「２５００個」、１か月目から２か月目までの売上予測を「２０個」、２か月目から３か月目までの売上予測を「３００個」と予測する。 As a result, the prediction processing unit 50 predicts sales of “2500 units” for the first month after the new product is released, “20 units” from the first month to the second month, and “300 units” from the second month to the third month.

［処理の流れ］ [Process flow]
図８は、処理の流れを示すフローチャートである。図８に示すように、学習処理部４０は、処理開始が指示されると（Ｓ１０１：Ｙｅｓ）、企画書ＤＢ１３に記憶される企画書のデータを読み込む（Ｓ１０２）。 FIG. 8 is a flowchart showing the flow of processing. As shown in FIG. 8, when the start of processing is instructed (S101: Yes), the learning processing unit 40 reads the plan document data stored in the plan document DB 13 (S102).

続いて、学習処理部４０は、各企画書のデータからキーワードを抽出し（Ｓ１０３）、キーワードの重みを算出する（Ｓ１０４）。そして、学習処理部４０は、ストップワードリストや重みを用いて、キーワードの選定を実行する（Ｓ１０５）。 Then, the learning processing unit 40 extracts a keyword from the data of each plan (S103) and calculates the weight of the keyword (S104). Then, the learning processing unit 40 executes keyword selection using the stop word list and the weight (S105).

その後、学習処理部４０は、選定されたキーワードを用いてクラスタリングを実行し（Ｓ１０６）、商品とクラスタＩＤとを対応付けたクラスタリング結果を生成する（Ｓ１０７）。 After that, the learning processing unit 40 executes clustering using the selected keyword (S106) and generates a clustering result in which the product and the cluster ID are associated with each other (S107).

そして、学習処理部４０は、新商品を含むクラスタリングのうち、既存商品のクラスタリング結果を説明変数として抽出し（Ｓ１０８）、各商品の月別の売上情報を用いて学習データを生成する（Ｓ１０９）。続いて、学習処理部４０は、学習データを用いて、予測モデルの学習を実行する（Ｓ１１０）。 Then, the learning processing unit 40 extracts the clustering result of the existing products from the clustering including the new products as an explanatory variable (S108), and generates learning data by using the monthly sales information of each product (S109). Subsequently, the learning processing unit 40 executes learning of the prediction model using the learning data (S110).

その後、学習が完了すると（Ｓ１１１：Ｙｅｓ）、予測処理部５０は、学習フェーズにおけるクラスタリング結果のうち、新商品のクラスタリング結果を学習済みの予測モデルに入力して（Ｓ１１２）、予測結果を取得する（Ｓ１１３）。 After that, when learning is completed (S111: Yes), the prediction processing unit 50 inputs the clustering result of the new product among the clustering results in the learning phase into the learned prediction model (S112), and acquires the prediction result. (S113).

実施例１では、新商品を含めた状態でクラスタリングを実行する例を説明したが、この手法は、学習時に、新商品の企画書などが作成されている場合には有効な手法である。しかし、学習時に新商品の企画書がない場合も考えられる。そこで、実施例２では、学習時には既存商品の企画書を用いて予測モデルを学習し、予測時に新商品の企画書を用いて予測を実行する例を説明する。 In the first embodiment, an example in which clustering is executed in a state in which a new product is included has been described, but this method is an effective method when a plan for a new product is created at the time of learning. However, it is possible that there is no plan for a new product when learning. Therefore, in the second embodiment, an example will be described in which a prediction model is learned using a plan for an existing product at the time of learning and prediction is performed using a plan for a new product at the time of prediction.

［実施例２にかかる需要予測装置の説明］ [Description of Demand Forecasting Device According to Second Embodiment]
図９は、実施例２にかかる需要予測装置１０を説明する図である。図９に示すように、学習フェーズでは、需要予測装置１０は、すでに発売されている既存商品の企画書から、内容を表すキーワードを抽出する。そして、需要予測装置１０は、抽出した単語を用いて、既存商品のクラスタリングを行ってクラスタを生成する。その後、需要予測装置１０は、既存商品のクラスタ結果を説明変数に設定し、既存商品の売上情報を目的変数に設定した学習データを用いて、需要予測を行う予測モデルを学習する。 FIG. 9 is a diagram illustrating the demand prediction device 10 according to the second embodiment. As shown in FIG. 9, in the learning phase, the demand prediction device 10 extracts keywords representing the content from the plan documents of existing products that have already been released. Then, the demand prediction device 10 uses the extracted words to cluster the existing products and generate clusters. After that, the demand prediction device 10 learns a prediction model for demand prediction using learning data in which the cluster results of the existing products are the explanatory variables and the sales information of the existing products is the objective variable.

学習完了後の適用フェーズでは、需要予測装置１０は、新商品の企画書からキーワードを抽出し、キーワードの一致数などを用いて、各商品と新商品とのテキスト間の類似度を算出する。そして、需要予測装置１０は、各商品と新商品とのテキスト間の類似度を用いて、既存商品のクラスタＩＤごとの重み付け加算を算出する。すなわち、需要予測装置１０は、各商品が属するクラスタへの新商品の依存度を算出する。その後、需要予測装置１０は、重み付け加算の結果を各予測モデルに入力して、予測モデルの出力結果を需要予測として取得する。 In the application phase after learning is completed, the demand prediction apparatus 10 extracts a keyword from the plan of the new product and calculates the degree of similarity between the texts of each product and the new product by using the number of matching keywords. Then, the demand prediction device 10 calculates the weighted addition for each cluster ID of the existing product by using the similarity between the texts of each product and the new product. That is, the demand prediction device 10 calculates the degree of dependence of a new product on the cluster to which each product belongs. After that, the demand prediction device 10 inputs the result of the weighted addition into each prediction model, and acquires the output result of the prediction model as the demand prediction.

［具体例］ [Concrete example]
次に、図１０と図１１を用いて、学習フェーズと適用フェーズの具体例を説明する。図１０は、実施例２にかかる学習フェーズを説明する図である。図１１は、実施例２にかかる適用フェーズを説明する図である。 Next, specific examples of the learning phase and the application phase will be described with reference to FIGS. 10 and 11. FIG. 10 is a diagram illustrating the learning phase according to the second embodiment. FIG. 11 is a diagram illustrating the application phase according to the second embodiment.

（学習フェーズ） (Learning phase)
実施例１と異なる点は、新商品の情報は用いずに、既存商品の情報のみを用いて、既存商品のみをクラスタリングする点である。具体的には、図１０に示すように、テキスト情報ＤＢ１６は、商品ごとに項目ａに属する文書情報である「商品、項目ａ、項目ｂ」として「商品１、文書１ａ、文書１ｂ」、「商品２、文書２ａ、文書２ｂ」、「商品３、文書３ａ、文書３ｂ」を記憶する。 The difference from the first embodiment is that only the existing products are clustered, using only information on the existing products and no information on the new product. Specifically, as shown in FIG. 10, the text information DB 16 stores “product 1, document 1a, document 1b”, “product 2, document 2a, document 2b”, and “product 3, document 3a, document 3b” as the per-product document information “product, item a, item b”.

そして、単語抽出部４１は、各商品の文書からキーワードを抽出する（Ｓ１０）。例えば、単語抽出部４１は、項目ａの文書からキーワード「Ｋ１ａ、Ｋ２ａ、Ｋ３ａ」を抽出し、項目ｂの文書からキーワード「Ｋ１ｂ、Ｋ２ｂ、Ｋ３ｂ」を抽出する。 Then, the word extracting unit 41 extracts a keyword from the document of each product (S10). For example, the word extraction unit 41 extracts the keyword “K1a, K2a, K3a” from the document of item a and the keyword “K1b, K2b, K3b” from the document of item b.

続いて、重み算出部４２が、各キーワードのＴＦＩＤＦを算出し、選定部４３は、キーワードの選定を実行する（Ｓ１１）。例えば、商品１については、Ｋ１ａは該当なし、Ｋ２ａは重み（０．７）、Ｋ３ａは重み（０．１）、Ｋ１ｂは重み（０．８）、Ｋ２ｂは重み（０．６）、Ｋ３ｂは該当なしと選定される。商品２については、Ｋ１ａは重み（０．８）、Ｋ２ａは該当なし、Ｋ３ａは重み（０．７）、Ｋ１ｂは該当なし、Ｋ２ｂは重み（０．１）、Ｋ３ｂは重み（０．８）と選定される。商品３については、Ｋ１ａは該当なし、Ｋ２ａは重み（０．８）、Ｋ３ａは重み（０．３）、Ｋ１ｂは該当なし、Ｋ２ｂは重み（０．３）、Ｋ３ｂは重み（０．５）と選定される。 Then, the weight calculation unit 42 calculates the TF-IDF of each keyword, and the selection unit 43 selects the keywords (S11). For example, for product 1, the selection is: K1a not applicable, K2a weight 0.7, K3a weight 0.1, K1b weight 0.8, K2b weight 0.6, and K3b not applicable. For product 2, the selection is: K1a weight 0.8, K2a not applicable, K3a weight 0.7, K1b not applicable, K2b weight 0.1, and K3b weight 0.8. For product 3, the selection is: K1a not applicable, K2a weight 0.8, K3a weight 0.3, K1b not applicable, K2b weight 0.3, and K3b weight 0.5.

そして、クラスタリング部４４は、重み算出やキーワード選定の結果を用いて、既存商品のクラスタリングを実行する（Ｓ１２）。例えば、クラスタリング部４４は、実施例１と同様、項目ａについてクラスタＣ１ａとＣ２ａに分類し、項目ｂについてクラスタＣ１ｂとＣ２ｂに分類する。 Then, the clustering unit 44 uses the results of weight calculation and keyword selection to perform clustering of existing products (S12). For example, the clustering unit 44 classifies item a into clusters C1a and C2a and classifies item b into clusters C1b and C2b, as in the first embodiment.

この結果、商品１について、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，１，０」が生成される。商品２について、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，０，１」が生成される。商品３について、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，０，１」が生成される。 As a result, “0, 1, 1, 0” is generated as the clustering result “C1a, C2a, C1b, C2b” for the product 1. For the product 2, “1,0,0,1” is generated as the clustering result “C1a, C2a, C1b, C2b”. For product 3, “0, 1, 0, 1” is generated as the clustering result “C1a, C2a, C1b, C2b”.

その後、学習データ生成部４５は、既存商品のクラスタリング結果と既存商品の売上実績とを用いて、学習データを生成する（Ｓ１３）。例えば、学習データ生成部４５は、商品１について、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、１か月目の売上「１０００」、２か月目の売上「３００」、３か月目の売上「１００」のそれぞれを目的変数に設定した３つの学習データを生成する。 After that, the learning data generation unit 45 generates learning data by using the clustering result of the existing product and the sales record of the existing product (S13). For example, the learning data generation unit 45 sets the clustering result “0, 1, 1, 0” of the product 1 as the explanatory variable for the product 1, and sales of the first month “1000” and sales of the second month. Three learning data in which each of “300” and sales “100” at the third month is set as an objective variable is generated.

同様に、学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、１か月目の売上「１５００」、２か月目の売上「１０００」、３か月目の売上「７００」それぞれを目的変数に設定した、３つの学習データを生成する。同様に、学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、１か月目の売上「５０００」、２か月目の売上「２０００」、３か月目の売上「５００」それぞれを目的変数に設定した、３つの学習データを生成する。 Similarly, the learning data generation unit 45 sets the clustering result “1,0,0,1” of the product 2 as an explanatory variable, and sales for the first month “1500” and sales for the second month “1000”. Three learning data in which each of the sales “700” of the third month is set as an objective variable are generated. Similarly, the learning data generation unit 45 sets the clustering result “0, 1, 0, 1” of the product 3 as an explanatory variable, and sales for the first month “5000” and sales for the second month “2000”. Three learning data sets in which the sales variables “500” for the third month are set as the objective variables are generated.

そして、学習部４６は、生成された各学習データを用いて、各予測モデルを学習する（Ｓ１４）。例えば、学習部４６は、商品１の学習データ「（０，１，１，０）、１０００」と、商品２の学習データ「（１，０，０，１）、１５００」と、商品３の学習データ「（０，１，０，１）、５０００」とを用いて、重回帰分析により、１か月目の売上予測を行う予測モデルを学習する。 Then, the learning unit 46 uses the generated learning data to learn each prediction model (S14). For example, the learning unit 46 stores the learning data “(0,1,1,0), 1000” of the product 1, the learning data “(1,0,0,1), 1500” of the product 2, and the product 3 Using the learning data “(0,1,0,1), 5000”, a prediction model for predicting the sales for the first month is learned by multiple regression analysis.

（適用フェーズ） (Application phase)
適用フェーズでは、実施例１と異なり、新商品の情報を用いて、新商品と既存商品とのテキスト間類似度を算出し、新商品が分類済みのクラスタにどれだけ関連するかを示す特徴量を算出する。そして、新商品の特徴量を入力として予測を実行する。 In the application phase, unlike the first embodiment, the information on the new product is used to calculate the inter-text similarity between the new product and each existing product, and from it a feature amount indicating how strongly the new product relates to the already classified clusters. The prediction is then executed with the feature amount of the new product as input.

図１１に示すように、テキスト情報ＤＢ１６は、新商品の企画書データ「商品、項目ａ、項目ｂ」として「新商品、文書ａ、文書ｂ」を記憶する。この状態で、予測処理部５０は、新商品の企画書データの各文書からキーワードを抽出し、ＴＦＩＤＦなどを用いた各キーワードの重みの算出、キーワードの選定などを実行する（Ｓ２０）。例えば、予測処理部５０は、新商品の項目ａの文書ａからキーワード「Ｋ１ａ、Ｋ３ａ」を抽出し、項目ｂの文書ｂからキーワード「Ｋ１ｂ、Ｋ２ｂ」を抽出する。そして、予測処理部５０は、Ｋ１ａの重み「０．７」、Ｋ３ａの重み「０．９」、Ｋ１ｂの重み「０．７」、Ｋ２ｂの重み「０．６」を算出する。ここでは、Ｋ２ａとＫ３ｂは、抽出されなかったとする。 As shown in FIG. 11, the text information DB 16 stores “new product, document a, document b” as the plan document data “product, item a, item b” of the new product. In this state, the prediction processing unit 50 extracts keywords from each document of the new product's plan document data, calculates the weight of each keyword using TF-IDF or the like, and selects the keywords (S20). For example, the prediction processing unit 50 extracts the keywords “K1a, K3a” from document a of item a of the new product and the keywords “K1b, K2b” from document b of item b. Then, the prediction processing unit 50 calculates the weight “0.7” for K1a, the weight “0.9” for K3a, the weight “0.7” for K1b, and the weight “0.6” for K2b. Here, it is assumed that K2a and K3b were not extracted.

続いて、予測処理部５０は、コサイン類似度などの手法を用いて、新商品の重み情報と各既存商品の重み情報との間のテキスト間類似度を算出する（Ｓ２１）。例えば、予測処理部５０は、既存の商品１について、項目ａのテキスト間類似度「０．１１」と項目ｂのテキスト間類似度「１．００」を算出する。同様に、予測処理部５０は、既存の商品２について、項目ａのテキスト間類似度「０．９８」と項目ｂのテキスト間類似度「０．０８」を算出する。また、予測処理部５０は、既存の商品３について、項目ａのテキスト間類似度「０．２８」と項目ｂのテキスト間類似度「０．３３」を算出する。なお、コサイン類似度を用いたテキスト間類似度に限らず、全キーワードのうち一致する割合など一般的な類似度の算出手法を用いることもできる。 Subsequently, the prediction processing unit 50 calculates the inter-text similarity between the weight information of the new product and the weight information of each existing product, using a method such as cosine similarity (S21). For example, for existing product 1, the prediction processing unit 50 calculates an inter-text similarity of “0.11” for item a and “1.00” for item b. Similarly, for existing product 2, it calculates “0.98” for item a and “0.08” for item b. For existing product 3, it calculates “0.28” for item a and “0.33” for item b. The inter-text similarity is not limited to cosine similarity; a general similarity measure, such as the proportion of matching keywords among all keywords, can also be used.
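The similarities quoted above can be reproduced from the keyword weights with a plain cosine similarity, as in the sketch below. One assumption is made: a keyword marked "not applicable" is treated as weight 0.0. The vector layout (K1, K2, K3 per item) and the variable names are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Keyword weights [K1, K2, K3] per item, taken from the text ("not applicable" -> 0.0).
item_a = {
    "new_product": [0.7, 0.0, 0.9],
    "product1": [0.0, 0.7, 0.1],
    "product2": [0.8, 0.0, 0.7],
    "product3": [0.0, 0.8, 0.3],
}
item_b = {
    "new_product": [0.7, 0.6, 0.0],
    "product1": [0.8, 0.6, 0.0],
    "product2": [0.0, 0.1, 0.8],
    "product3": [0.0, 0.3, 0.5],
}

similarity = {
    product: (
        cosine_similarity(item_a["new_product"], item_a[product]),
        cosine_similarity(item_b["new_product"], item_b[product]),
    )
    for product in ("product1", "product2", "product3")
}
```

Rounded to two decimals, `similarity` gives (0.11, 1.00) for product 1, (0.98, 0.08) for product 2, and (0.28, 0.33) for product 3, matching the values in the text.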

その後、予測処理部５０は、既存商品のクラスタＩＤのダミー変数を、テキスト間類似度で重み付け加算を行う（Ｓ２２）。上記例で説明すると、予測処理部５０は、商品１に対して、図１０のＳ１２で生成されたクラスタリング結果を参照し、商品１が属する項目ａのクラスタＣ２ａにテキスト間類似度「０．１１」、商品１が属する項目ｂのクラスタＣ１ｂにテキスト間類似度「１．００」を設定する。 After that, the prediction processing unit 50 performs a weighted addition of the dummy variables of the existing products' cluster IDs, using the inter-text similarities as weights (S22). In the above example, for product 1 the prediction processing unit 50 refers to the clustering result generated in S12 of FIG. 10, sets the inter-text similarity “0.11” for cluster C2a of item a, to which product 1 belongs, and sets the inter-text similarity “1.00” for cluster C1b of item b, to which product 1 belongs.

同様に、予測処理部５０は、商品２に対して、図１０のＳ１２で生成されたクラスタリング結果を参照し、商品２が属する項目ａのクラスタＣ１ａにテキスト間類似度「０．９８」、商品２が属する項目ｂのクラスタＣ２ｂにテキスト間類似度「０．０８」を設定する。また、予測処理部５０は、商品３に対して、図１０のＳ１２で生成されたクラスタリング結果を参照し、商品３が属する項目ａのクラスタＣ２ａにテキスト間類似度「０．２８」、商品３が属する項目ｂのクラスタＣ２ｂにテキスト間類似度「０．３３」を設定する。 Similarly, for product 2 the prediction processing unit 50 refers to the clustering result generated in S12 of FIG. 10, sets the inter-text similarity “0.98” for cluster C1a of item a, to which product 2 belongs, and sets the inter-text similarity “0.08” for cluster C2b of item b, to which product 2 belongs. For product 3, it refers to the same clustering result, sets the inter-text similarity “0.28” for cluster C2a of item a, to which product 3 belongs, and sets the inter-text similarity “0.33” for cluster C2b of item b, to which product 3 belongs.

その後、予測処理部５０は、各既存商品のクラスタＩＤに対して設定されたテキスト間類似度を加算して、新商品の特徴量（特徴ベクトル）を生成する。上記例で説明すると、予測処理部５０は、項目ａのＣ１ａに対して、商品２のテキスト間類似度「０．９８」を設定し、項目ａのＣ２ａに対して、商品１のテキスト間類似度「０．１１」と商品３のテキスト間類似度「０．２８」を加算した「０．３９」を設定する。同様に、予測処理部５０は、項目ｂのＣ１ｂに対して、商品１のテキスト間類似度「１．００」を設定し、項目ｂのＣ２ｂに対して、商品２のテキスト間類似度「０．０８」と商品３のテキスト間類似度「０．３３」を加算した「０．４２」を設定する。 After that, the prediction processing unit 50 adds up the inter-text similarities set for the cluster IDs of the existing products to generate the feature amount (feature vector) of the new product. In the above example, the prediction processing unit 50 sets product 2's inter-text similarity “0.98” for C1a of item a, and sets “0.39”, the sum of product 1's inter-text similarity “0.11” and product 3's inter-text similarity “0.28”, for C2a of item a. Similarly, the prediction processing unit 50 sets product 1's inter-text similarity “1.00” for C1b of item b, and sets “0.42”, the sum of product 2's inter-text similarity “0.08” and product 3's inter-text similarity “0.33”, for C2b of item b.
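The weighted addition (S22) can be sketched as summing, for each cluster, the similarities of the existing products that belong to it. The sketch recomputes the cosine similarities from the keyword weights rather than using the two-decimal values: summing the unrounded similarities is what yields 0.42 for C2b, whereas the rounded values 0.08 and 0.33 would add to 0.41. Treating "not applicable" as 0.0 and all names below are assumptions.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# (cluster ID, item it belongs to), in the order used by the feature vector.
CLUSTERS = [("C1a", "a"), ("C2a", "a"), ("C1b", "b"), ("C2b", "b")]

# Keyword weights [K1, K2, K3] per item ("not applicable" -> 0.0, an assumption).
existing_weights = {
    "product1": {"a": [0.0, 0.7, 0.1], "b": [0.8, 0.6, 0.0]},
    "product2": {"a": [0.8, 0.0, 0.7], "b": [0.0, 0.1, 0.8]},
    "product3": {"a": [0.0, 0.8, 0.3], "b": [0.0, 0.3, 0.5]},
}
new_weights = {"a": [0.7, 0.0, 0.9], "b": [0.7, 0.6, 0.0]}

# Cluster memberships of the existing products from the learning phase.
membership = {
    "product1": {"C2a", "C1b"},
    "product2": {"C1a", "C2b"},
    "product3": {"C2a", "C2b"},
}

def new_product_features(new_weights, existing_weights, membership):
    """Sum, per cluster, the similarity between the new product and each member."""
    features = []
    for cluster, item in CLUSTERS:
        total = 0.0
        for product, clusters in membership.items():
            if cluster in clusters:
                total += cosine_similarity(new_weights[item],
                                           existing_weights[product][item])
        features.append(total)
    return features

features = [round(x, 2)
            for x in new_product_features(new_weights, existing_weights, membership)]
```

Here `features` comes out as `[0.98, 0.39, 1.0, 0.42]`, the explanatory vector that is then passed to each monthly model.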

そして、予測処理部５０は、加算結果「０．９８，０．３９，１．００，０．４２」を説明変数として、学習済みである１か月用の予測モデル、２か月用の予測モデル、３か月用の予測モデルそれぞれに入力して、予測結果を取得する（Ｓ２３）。上記例で説明すると、予測処理部５０は、新商品が発売されてから１か月目の売上予測を「２５００個」、１か月目から２か月目までの売上予測を「２０個」、２か月目から３か月目までの売上予測を「３００個」と予測する。 Then, the prediction processing unit 50 inputs the addition result “0.98, 0.39, 1.00, 0.42” as the explanatory variable into each of the trained prediction models for the first month, the second month, and the third month, and acquires the prediction results (S23). In the above example, the prediction processing unit 50 predicts sales of “2500 units” for the first month after the new product is released, “20 units” from the first month to the second month, and “300 units” from the second month to the third month.

ところで、実施例１と実施例２では、月別の予測モデルを用いて、月別の需要予測を行う場合を説明したが、これに限定されるものではない。例えば、月別の需要予測結果を用いてスムージングを実行することにより、予測されていない期間の予測結果を推測することができる。 By the way, in the first and second embodiments, the case of performing the monthly demand forecast using the monthly forecast model has been described, but the present invention is not limited to this. For example, by performing smoothing using the monthly demand forecast result, it is possible to infer the forecast result of the period not forecasted.

例えば、需要予測装置１０の予測処理部５０は、学習データの時間粒度（例えば４週）が予測したい時間粒度（例えば週次）よりも大きい場合は、予測結果が得られたのち、予測したい時間粒度に合わせて等分割する。例を挙げると、予測処理部５０は、４週分の予測値４分割して、階段状の週次の予測値に変換する。 For example, when the time granularity (for example, 4 weeks) of the learning data is larger than the time granularity (for example, weekly) desired to be predicted, the prediction processing unit 50 of the demand prediction device 10 obtains the prediction result, and then the desired time for prediction. Divide into equal parts according to the grain size. For example, the prediction processing unit 50 divides the prediction value for four weeks into four, and converts the prediction value into a stepwise weekly prediction value.
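The equal-split step can be sketched as follows. The four-week forecast values are hypothetical and the function name is illustrative.

```python
def split_evenly(period_forecasts, parts=4):
    """Spread each coarse-period forecast evenly over `parts` finer periods."""
    finer = []
    for value in period_forecasts:
        finer.extend([value / parts] * parts)
    return finer

# Hypothetical forecasts for three consecutive four-week periods.
four_week_forecasts = [2000, 800, 400]
weekly_steps = split_evenly(four_week_forecasts)  # 12 step-shaped weekly values
```

The split preserves each period's total, so the weekly series is a step function that the later curve fit can smooth.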

続いて、予測処理部５０は、発売後の売上が特定の確率分布に従って時間的に変化すると仮定する。なお、一般に売上は減衰していくので、ワイブル分布、対数正規分布、対数ロジスティック分布などの確率分布を使用する。 Subsequently, the prediction processing unit 50 assumes that post-launch sales change over time according to a specific probability distribution. Since sales generally decay over time, a probability distribution such as the Weibull distribution, the lognormal distribution, or the log-logistic distribution is used.

そして、予測処理部５０は、複数の予測値全体にフィットするように、一般的な手法を用いて確率分布のパラメータを計算し、得られたパラメータを用いて、任意の時刻での新たな予測値（補間・補正した予測値）を計算する。つまり、時刻を代入すると新たな予測値が得られる。 Then, the prediction processing unit 50 calculates the parameters of the probability distribution using a general method so as to fit all of the plurality of predicted values, and uses the obtained parameters to make a new prediction at any time. Calculate the value (predicted value after interpolation/correction). That is, a new predicted value can be obtained by substituting the time.
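One way to realize this fitting step is sketched below for the Weibull case. The text only says that a general method is used, so the specifics are assumptions: the sales curve is modeled as an amplitude times the Weibull density, the shape and scale are chosen by a grid search that minimizes the squared error, and the amplitude is obtained in closed form for each grid point. The weekly input values are hypothetical.

```python
import math

def weibull_pdf(t, shape, scale):
    """Weibull probability density for t > 0."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def fit_weibull_curve(times, values, shapes, scales):
    """Grid-search least squares fit of values ~ amp * weibull_pdf(t, shape, scale)."""
    best = None
    for shape in shapes:
        for scale in scales:
            density = [weibull_pdf(t, shape, scale) for t in times]
            denom = sum(d * d for d in density)
            if denom == 0.0:
                continue
            # Closed-form least squares amplitude for this (shape, scale).
            amp = sum(y * d for y, d in zip(values, density)) / denom
            sse = sum((y - amp * d) ** 2 for y, d in zip(values, density))
            if best is None or sse < best[0]:
                best = (sse, amp, shape, scale)
    _, amp, shape, scale = best
    return amp, shape, scale

def predict_at(t, amp, shape, scale):
    """Interpolated or corrected forecast at an arbitrary time t."""
    return amp * weibull_pdf(t, shape, scale)

# Hypothetical step-shaped weekly forecasts, evaluated at week midpoints.
weekly = [500.0] * 4 + [200.0] * 4 + [100.0] * 4
times = [week + 0.5 for week in range(len(weekly))]
shapes = [0.5 + 0.25 * i for i in range(11)]   # 0.5 .. 3.0
scales = [1.0 + 0.5 * i for i in range(19)]    # 1.0 .. 10.0
params = fit_weibull_curve(times, weekly, shapes, scales)
smoothed_week_3 = predict_at(2.5, *params)
```

Once the parameters are fixed, `predict_at` can be called for any time, which is how forecasts at a granularity different from the training data are obtained. A continuous optimizer would give a finer fit than the grid; the grid is used here only to keep the sketch dependency-free.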

ここで、図１２から図１４を用いて具体例を説明する。図１２は、スムージングの例を説明する図であり、図１３は、スムージングの別例を説明する図である。図１４は、スムージング結果を説明する図である。 Here, a specific example will be described with reference to FIGS. 12 to 14. FIG. 12 is a diagram illustrating an example of smoothing, and FIG. 13 is a diagram illustrating another example of smoothing. FIG. 14 is a diagram illustrating a smoothing result.

図１２では、スムージングを不規則変動の低減に用いる場合を説明する。例えば、週別の売上を独立に予測するため、予測結果に不規則変化（予測誤差）が含まれると予測結果がばらつく可能性がある。このため、複数週の予測結果を総合的に判断して予測値の補間や補正を行うことにより、不規則変化を抑えることができる。 In FIG. 12, a case where smoothing is used to reduce irregular fluctuation will be described. For example, since the sales for each week are independently predicted, the prediction result may vary if the prediction result includes an irregular change (prediction error). Therefore, irregular changes can be suppressed by comprehensively determining the prediction results of a plurality of weeks and performing interpolation and correction of the prediction values.

図１２に示す図は、週次の売上実績を目的変数として週次の予測モデルを学習し、学習済みの週次の予測モデルを用いて発売日から週次（各週）の売上予測を算出し、週次の予測値に対しスムージングを行った結果である。このようにすることで、予測値を補正することができるので、需要予測の精度を向上させることができる。なお、入力の予測値の単位は週次に限定されない。月次の予測値や４週分の予測値に対し同様の処理を行うことも可能である。 In the diagram shown in FIG. 12, a weekly forecast model is learned by using weekly sales results as an objective variable, and a weekly (each week) sales forecast is calculated from the release date using the learned weekly forecast model. , Is the result of smoothing the weekly predicted value. By doing so, the predicted value can be corrected, so that the accuracy of the demand prediction can be improved. The unit of the input predicted value is not limited to weekly. It is also possible to perform the same processing on the monthly forecast value and the forecast values for four weeks.

図１３では、学習データの時間粒度と予測の時間粒度のギャップ調整に用いる場合を説明する。例えば、予測モデルの学習に使用するデータの単位（例えば４週）と、予測が必要となる単位（例えば週次）にギャップがある場合がある。 In FIG. 13, a case will be described where it is used for adjusting the gap between the time granularity of learning data and the time granularity of prediction. For example, there may be a gap between the unit of data used for learning the prediction model (for example, 4 weeks) and the unit that requires prediction (for example, weekly).

例を挙げると、４週でしかデータを取得して管理していないので、週次データが存在しないときが挙げられる。また、週次のデータは存在するが、数量が少量のためデータ量が不足し、そのままではうまく予測モデルを作成することができず、より大きい括り（４週）で集計して学習に使用する必要が発生したときなどが挙げられる。これらの場合、一般的な手法では、学習データと同じ単位での予測結果しか得られない。 For example, since the data is acquired and managed only in 4 weeks, there is a case where there is no weekly data. In addition, although there is weekly data, the amount of data is insufficient due to the small amount of data, so it is not possible to create a prediction model as it is, and it is aggregated in a larger group (4 weeks) and used for learning. For example, when the need arises. In these cases, the general method can only obtain the prediction result in the same unit as the learning data.

そこで、予測結果を補間および補正することにより、学習データとは異なる任意の時間粒度での予測結果を計算する。図１３では、４週分データで学習し得られた４週分の予測値に対してスムージングを行い、新たに週次の予測値が得られた結果である。つまり、４週分の予測値を４分割して、階段状の週次の予測値に一度変換したものを図示する。 Therefore, the prediction result is calculated at an arbitrary time granularity different from the learning data by interpolating and correcting the prediction result. FIG. 13 shows the result of smoothing the prediction values for four weeks obtained by learning from the data for four weeks and newly obtaining weekly prediction values. That is, the prediction value for four weeks is divided into four, and is once converted into the stepwise weekly prediction value.

このように、予測結果では得られない予測値を推定することができるので、学習データに依存することなく、需要予測を行うことができる。なお、週次に限らず、日次など任意の単位で算出を行うことができる。また、入力の予測値の単位は４週に限定されず、月次の予測値や週次の予測値に対し、同様の処理を行うこともできる。 In this way, since it is possible to estimate a predicted value that cannot be obtained from the prediction result, it is possible to perform demand prediction without depending on the learning data. Note that the calculation can be performed not only on a weekly basis but also on an arbitrary unit such as a daily basis. Further, the unit of the input predicted value is not limited to four weeks, and similar processing can be performed on the monthly predicted value and the weekly predicted value.

Applying this smoothing method to the first or second embodiment yields the result shown in FIG. 14. Specifically, weekly predicted values can be estimated from the monthly prediction results obtained from the monthly prediction models.

Although embodiments of the present invention have been described above, the present invention may be implemented in various forms other than those embodiments.

[Effects of the Embodiments]

The effects of the above embodiments will be described with reference to FIGS. 15 and 16. FIG. 15 is a diagram explaining the effects. FIG. 16 is a diagram explaining a comparison of the effects. As shown in FIG. 15, the method according to the above embodiments performs text mining and clustering on the aim, target demographic, features, campaign information, and the like described in the plan document, sets the result as the explanatory variable, and sets product attributes such as past sales results as the objective variable. Machine learning is then performed using the resulting learning data.
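The structure of such learning data can be sketched as follows. This is only an illustration under stated assumptions: the model type is not specified in this passage, so a linear model without an intercept, fitted by plain gradient descent, is used as a stand-in, and the cluster-membership degrees and sales figures are hypothetical.

```python
def fit_linear(features, targets, lr=0.5, epochs=200):
    """Fit y ~ w . x by gradient descent on the mean squared error."""
    n, d = len(features), len(features[0])
    weights = [0.0] * d
    for _ in range(epochs):
        grads = [0.0] * d
        for x, y in zip(features, targets):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(d):
                grads[j] += 2.0 * err * x[j] / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


def predict(weights, x):
    """Apply the fitted linear model to one explanatory-variable vector."""
    return sum(w * xi for w, xi in zip(weights, x))


# Explanatory variables: degree to which each existing product belongs
# to each of two clusters (hypothetical clustering information).
existing_clusters = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.8, 0.2]]
# Objective variable: past sales results of the same products (hypothetical).
existing_sales = [100.0, 300.0, 200.0, 140.0]

model = fit_linear(existing_clusters, existing_sales)

# A new product has no sales history, only its clustering information.
new_product_clusters = [0.25, 0.75]
new_product_forecast = predict(model, new_product_clusters)
```

The point of the sketch is that the new product needs only its plan-document-derived clustering information at prediction time; sales figures are required only for the existing products used in training.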

Through such learning, the method according to the above embodiments can learn the initial demand for a new product and how that demand changes over time, as well as cumulative growth and demand trends (patterns). In other words, the method can learn the demand patterns of similar products by decomposing them into combinations of elements.

As a result, as shown in FIG. 16, the error rate can be reduced compared with general method A, which performs constant-value prediction, and general method B, which performs machine learning using only fixed-form information and numerical information. According to a simulation, using the method according to the embodiments reduces the median error rate to 19.3%.
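For reference, a median error rate of this kind could be computed as in the sketch below. The definition of the error rate is not given in this passage, so the absolute error divided by the actual value is assumed here, and the sample figures are hypothetical and unrelated to the 19.3% result.

```python
from statistics import median


def median_error_rate(actual, predicted):
    """Median of |predicted - actual| / actual over all products.

    Assumption: this relative-error definition is illustrative; the
    source does not state which error rate was used.
    """
    rates = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return median(rates)


# Hypothetical actual and predicted sales for three products.
actual_sales = [100.0, 200.0, 400.0]
predicted_sales = [90.0, 250.0, 400.0]
rate = median_error_rate(actual_sales, predicted_sales)
```

Using the median rather than the mean keeps a few badly predicted products from dominating the comparison between methods.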

[Data, numerical values, etc.]
The numerical values, data examples, number of data items, label settings, and the like used in the above embodiments are merely examples and may be changed as desired. A keyword is one example of a characteristic word. An existing product is a past product; it may be a product whose sale has already ended or a product that is still on sale. Besides sales, the objective variable may be, for example, the number of contracts, such as mobile phone contracts. Although the above embodiments describe generating monthly prediction models from monthly sales information, the invention is not limited to this; various prediction models can be generated by using daily, weekly, or yearly sales information.

Although an example using item a and item b of the plan document data has been described, the invention is not limited to this. One or more items may be used, and the entire plan document may also be used as a single item. Besides plan documents, product descriptions and the like may also be used.

[System]
The processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and drawings may be changed as desired unless otherwise specified.

The components of each illustrated device are functional concepts and need not be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to that shown in the drawings. All or part of them may be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the learning processing unit 40 and the prediction processing unit 50 may be implemented in separate devices.

Furthermore, all or any part of the processing functions performed by each device may be implemented by a CPU and a program analyzed and executed by that CPU, or may be implemented as hardware using wired logic.

[Hardware]
FIG. 17 is a diagram illustrating a hardware configuration example. As shown in FIG. 17, the demand prediction device 10 includes a communication device 10a, an HDD (Hard Disk Drive) 10b, a memory 10c, and a processor 10d. The units shown in FIG. 17 are connected to one another by a bus or the like.

The communication device 10a is a network interface card or the like and communicates with other servers. The HDD 10b stores the DBs and the programs that run the functions shown in FIG. 2.

The processor 10d reads a program that executes the same processing as each processing unit shown in FIG. 2 from the HDD 10b or the like and loads it into the memory 10c, thereby running a process that executes the functions described with reference to FIG. 2 and other figures. That is, this process executes the same functions as the processing units of the demand prediction device 10. Specifically, the processor 10d reads, from the HDD 10b or the like, a program having the same functions as the learning processing unit 40, the prediction processing unit 50, and the like. The processor 10d then runs a process that executes the same processing as the learning processing unit 40, the prediction processing unit 50, and the like.

In this way, the demand prediction device 10 operates as an information processing device that executes the demand prediction method by reading and executing the program. The demand prediction device 10 can also implement the same functions as the above embodiments by reading the program from a recording medium with a medium reading device and executing the program thus read. The program referred to in this other embodiment is not limited to being executed by the demand prediction device 10. For example, the present invention applies equally when another computer or server executes the program, or when these execute the program in cooperation.

10 Demand prediction device
11 Communication unit
12 Storage unit
13 Plan document DB
14 Sales information DB
15 Monthly sales information DB
16 Text information DB
17 Weight information DB
18 Cluster DB
19 Learning data DB
20 Learning result DB
21 Prediction result DB
30 Control unit
40 Learning processing unit
41 Word extraction unit
42 Weight calculation unit
43 Selection unit
44 Clustering unit
45 Learning data generation unit
46 Learning unit
50 Prediction processing unit

Claims

A demand prediction method in which a computer executes a process comprising:
extracting, from each document describing attributes of an existing product that is already on sale or a new product that is not yet on sale, characteristic words indicating the attributes of each product based on a preset condition;
generating, from the appearance frequency of the characteristic words contained in each document, clustering information indicating, for each product, a combination of the degrees to which the product has the characteristic words; and
learning a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as an explanatory variable and the sales results of the existing product are set as an objective variable.

The demand prediction method according to claim 1, wherein
the extracting extracts the characteristic words from each document corresponding to each of a plurality of existing products and from a document corresponding to the new product,
the generating generates the clustering information by clustering the plurality of existing products and the new product using the characteristic words extracted from each, and
the learning generates a plurality of pieces of learning data using, of the clustering information, the clustering information corresponding to each of the plurality of existing products, and learns the prediction model using the plurality of pieces of learning data.

The demand prediction method according to claim 2, wherein the computer further executes a process of inputting, of the clustering information, the clustering information corresponding to the new product into the learned prediction model, and acquiring the output of the learned prediction model as the demand prediction for the new product.

The demand prediction method according to claim 1, wherein
the extracting extracts the characteristic words from each document corresponding to each of a plurality of existing products,
the generating generates the clustering information by clustering the plurality of existing products using the characteristic words extracted from each, and
the learning generates a plurality of pieces of learning data using the clustering information of the plurality of existing products, and learns the prediction model using the plurality of pieces of learning data.

The demand prediction method according to claim 4, wherein the computer further executes a process comprising:
extracting the characteristic words from a document corresponding to the new product;
calculating a degree of similarity between the new product and each of the plurality of existing products, using the characteristic words of the new product and the characteristic words of each of the plurality of existing products;
associating, using the degree of similarity, the new product with each cluster into which the plurality of existing products have been clustered; and
inputting the result of associating the new product with each cluster into the learned prediction model, and acquiring the output of the learned prediction model as the demand prediction for the new product.

The demand prediction method according to claim 1, wherein the learning learns prediction models that each predict demand for the new product for a respective predetermined period, using pieces of learning data in which the clustering information is set as the explanatory variable and the sales results of the existing product for each predetermined period are respectively set as the objective variable.

The demand prediction method according to claim 6, wherein the computer further executes a process of inputting the clustering information corresponding to the new product into each of the learned prediction models, and acquiring the output of each learned prediction model as the demand prediction for the new product for each predetermined period.

The demand prediction method according to claim 7, wherein the computer further executes a process of performing smoothing using the demand prediction results for the new product predicted for each predetermined period, thereby interpolating prediction results between the predetermined periods.

A demand prediction program that causes a computer to execute a process comprising:
extracting, from each document describing attributes of an existing product that is already on sale or a new product that is not yet on sale, characteristic words indicating the attributes of each product based on a preset condition;
generating, from the appearance frequency of the characteristic words contained in each document, clustering information indicating, for each product, a combination of the degrees to which the product has the characteristic words; and
learning a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as an explanatory variable and the sales results of the existing product are set as an objective variable.

A demand prediction device comprising:
an extraction unit that extracts, from each document describing attributes of an existing product that is already on sale or a new product that is not yet on sale, characteristic words indicating the attributes of each product based on a preset condition;
a generation unit that generates, from the appearance frequency of the characteristic words contained in each document, clustering information indicating, for each product, a combination of the degrees to which the product has the characteristic words; and
a learning unit that learns a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as an explanatory variable and the sales results of the existing product are set as an objective variable.