JP2018151789A

JP2018151789A - Information processing apparatus, information processing method, program, and advertisement information processing system

Info

Publication number: JP2018151789A
Application number: JP2017046663A
Authority: JP
Inventors: Takeshi Tamura; Shinji Ikemiya; Takuro Mori; Kazuya Kudo; Mari Kinume
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2017-03-10
Filing date: 2017-03-10
Publication date: 2018-09-27
Anticipated expiration: 2037-03-10
Also published as: JP6739379B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processing apparatus, an information processing method, a program, and an advertisement information processing system capable of accurately and easily grasping the relevance between queries. SOLUTION: The information processing apparatus includes a calculation unit that calculates a degree of association between each pair of queries among a plurality of queries used for network search, on the basis of the number of users who have searched for both queries of the pair; a generation unit that generates graph data indicating, for each pair of queries, whether the two queries are associated and the degree of association between them; and a classification unit that classifies the queries on the basis of the graph data. SELECTED DRAWING: Figure 2

Description

The present invention relates to an information processing apparatus, an information processing method, a program, and an advertisement information processing system.

Techniques for analyzing the keywords (queries) that users enter in web search have long been studied. For example, analyzing the relationships between queries using a list of previously entered queries makes it possible to provide a highly convenient search service (see, for example, Patent Document 1).

Patent Document 1: JP2015-97026A

In query analysis, what matters is how accurately the relationships between queries can be grasped. In addition, because an enormous number of queries must be analyzed, the analysis process also needs to be kept simple.

The present invention has been made in view of these circumstances, and one of its objects is to provide an information processing apparatus, an information processing method, a program, and an advertisement information processing system that can accurately and easily grasp the relationships between queries.

One aspect of the present invention is an information processing apparatus including: a calculation unit that calculates a degree of association between each pair of queries among a plurality of queries used for network search, based on the number of users who have searched for both queries of the pair; a generation unit that generates graph data indicating, for each pair of queries, whether the two queries are associated and the degree of association between them; and a classification unit that classifies the queries based on the graph data.

According to one aspect of the present invention, the relationships between queries can be grasped accurately and easily.

FIG. 1 is a configuration diagram of the information processing system 1 in the first embodiment.
FIG. 2 is a diagram showing the functional configuration of the information processing apparatus 7 in the first embodiment.
FIG. 3 shows graph data indicating the relationships between queries in the first embodiment.
FIG. 4 is a flowchart showing an example of the processing of the information processing apparatus 7 in the first embodiment.
FIG. 5 is a diagram explaining an example of node processing in the first embodiment.
FIG. 6 is a diagram explaining an example of node processing in the first embodiment.
FIG. 7 is a diagram explaining an example of node processing in the first embodiment.
FIG. 8 is a diagram showing a tree structure indicating the parent-child relationships between nodes in the first embodiment.
FIG. 9 is a flowchart showing an example of the processing of the information processing apparatus 7 in the second embodiment.
FIG. 10 shows graph data indicating the relationships between queries in the second embodiment.
FIG. 11 is a diagram explaining an example of node processing in the second embodiment.
FIG. 12 is a flowchart showing an example of the processing of the information processing apparatus 7 in the third embodiment.
FIG. 13 shows graph data indicating the relationships between queries in the third embodiment.
FIG. 14 is a diagram explaining an example of node processing in the third embodiment.
FIG. 15 is a configuration diagram of the information processing system 1A in the first embodiment.

Hereinafter, embodiments of the information processing apparatus, information processing method, program, and advertisement information processing system of the present invention will be described with reference to the drawings. The information processing apparatus calculates a duplicate search score indicating the degree of association between queries transmitted from users' terminal devices, and classifies the queries based on the calculated duplicate search scores.

<First Embodiment>
FIG. 1 is a configuration diagram of an information processing system 1 in the first embodiment. The information processing system 1 includes, for example, one or more terminal devices 3, one or more search servers 5, and one or more information processing apparatuses 7. The terminal device 3 and the search server 5 are connected to each other by a network NW and communicate with each other via the network NW. The network NW includes, for example, a WAN (Wide Area Network), a LAN (Local Area Network), the Internet, a dedicated line, a wireless base station, a provider, and the like.

[Terminal device]
The terminal device 3 is operated by a user of the search service. The terminal device 3 is a computer device such as a personal computer, a mobile phone such as a smartphone, a tablet terminal, or a PDA (Personal Digital Assistant). On the terminal device 3, a browser or application program operating in response to user operations transmits a query requesting information to the search server 5 and receives the search information associated with that query from the search server 5.

[Search server]
The search server 5 receives a query from the terminal device 3 and provides search results based on the received query. The search server 5 includes a search database (not shown) that associates queries with content reference information (for example, URLs). When the search server 5 receives a query from the terminal device 3, it extracts from the search database the reference information for the content associated with the query and transmits that reference information to the terminal device 3.

The search server 5 stores, in a storage unit (not shown), history information that associates each query received from the terminal device 3 with identification information of the user of the terminal device 3 that sent the query. The user identification information is, for example, information on a cookie (HTTP cookie) managed for each web browser on the terminal device 3, or the IP address of the terminal device 3. Such information can be regarded as identifying the user who entered the query. When the user is logged in while accessing the search server 5, the login ID may be used as the user identification information.

[Information processing apparatus]
The information processing apparatus 7 acquires the history information from the search server 5 and classifies queries using the acquired history information. FIG. 2 is a diagram showing the functional configuration of the information processing apparatus 7. The information processing apparatus 7 includes, for example, an acquisition unit 10, a relevance calculation unit 12 (calculation unit), a generation unit 14, a classification unit 16, and a storage unit 18. The functional units of the information processing apparatus 7 may be distributed across a plurality of devices. For example, the relevance calculation unit 12 may be realized by a device separate from the other functional units. The storage unit 18 may be a storage device such as a NAS (Network Attached Storage).

The relevance calculation unit 12, the generation unit 14, and the classification unit 16 are realized, for example, by a processor such as a CPU (Central Processing Unit) executing a program (software) stored in the storage unit 18. The program may, for example, be downloaded from an application server via the network NW, or may be preinstalled in the information processing apparatus 7. These functional units may also be realized by hardware such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array), or by software and hardware working together. The storage unit 18 is realized by, for example, a RAM (Random Access Memory), a ROM (Read Only Memory), an HDD (Hard Disk Drive), a flash memory, or a hybrid storage device combining two or more of these.

The acquisition unit 10 acquires from the search server 5 the history information in which each query transmitted from a terminal device 3 is associated with the identification information of the user of that terminal device 3, and stores the history information in the storage unit 18.

The relevance calculation unit 12 calculates the degree of association between each pair of queries among the plurality of queries used for network search, based on the number of users who have searched for both queries of the pair. For example, the relevance calculation unit 12 reads the history information from the storage unit 18 and calculates a duplicate search score indicating the degree of association between queries. The duplicate search score Score is calculated, for example, by the following equation (1).

In equation (1), A_user is the number of users who searched for query A, B_user is the number of users who searched for query B, ALL_user is the total number of search users, and A_user ∧ B_user is the number of users who searched for both query A and query B. A larger duplicate search score Score indicates a stronger association between query A and query B. The score also tends to be higher the closer the numbers of users who searched for query A and for query B are to each other. The relevance calculation unit 12 stores in the storage unit 18 information that associates the two queries for which a duplicate search score was calculated with the calculated score (hereinafter referred to as "duplicate search score information").
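As a rough illustration of the quantities named above, the following Python sketch derives A_user, B_user, ALL_user, and A_user ∧ B_user from history information given as (user ID, query) pairs. Equation (1) itself is not reproduced in this text, so `illustrative_score` uses a lift-style ratio purely as a stand-in; it is an assumption, not the formula of the embodiment. The function names and the sample history are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_search_counts(history):
    """Count users per query and per query pair from (user_id, query) pairs."""
    queries_by_user = defaultdict(set)
    for user_id, query in history:
        queries_by_user[user_id].add(query)  # repeated searches count once per user

    users_per_query = defaultdict(int)
    users_per_pair = defaultdict(int)
    for queries in queries_by_user.values():
        for query in queries:
            users_per_query[query] += 1
        # Every unordered pair searched by the same user is a co-search.
        for a, b in combinations(sorted(queries), 2):
            users_per_pair[(a, b)] += 1

    all_user = len(queries_by_user)
    return all_user, dict(users_per_query), dict(users_per_pair)


def illustrative_score(a_user, b_user, both, all_user):
    # ASSUMPTION: lift-style ratio used only as a placeholder for equation (1).
    return (both * all_user) / (a_user * b_user)


# Hypothetical history information: (user identification, query).
history = [
    ("u1", "hotel"), ("u1", "flight"), ("u1", "hotel"),
    ("u2", "hotel"), ("u2", "flight"),
    ("u3", "hotel"),
    ("u4", "recipe"),
]
all_user, per_query, per_pair = co_search_counts(history)
score = illustrative_score(
    per_query["flight"], per_query["hotel"], per_pair[("flight", "hotel")], all_user
)
```

Grouping queries by user first keeps the pair counting linear in the number of users, which matters when the history covers a very large number of queries.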

The generation unit 14 generates graph data indicating, for each pair of queries among the plurality of queries, whether the two queries are associated and the degree of association between them. The generation unit 14 also selects, from the queries in the generated graph data, a query that is associated with exactly one other query, and sets a parent-child relationship between the selected query and the query associated with it. For example, the generation unit 14 generates graph data such as that shown in FIG. 3 using the history information read from the storage unit 18.

In FIG. 3, a total of eight queries are shown as nodes A to H. Links L1 to L9 indicate which nodes are associated with each other. Two nodes joined by one of the links L1 to L9 represent two queries that were both searched for by the same user. The numerical values shown in parentheses in FIG. 3 are the duplicate search scores of the respective links.

For example, FIG. 3 shows that node A and node D are connected by link L4 with a duplicate search score of "6". It also shows that node A is connected to node B by link L5 with a duplicate search score of "8", to node C by link L3 with a duplicate search score of "4", to node D by link L4 with a duplicate search score of "6", and to node E by link L2 with a duplicate search score of "5". In other words, node A is most strongly related to node B.
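The links of node A described above can be held in a simple weighted adjacency map. The sketch below is one possible representation, not necessarily the data format used by the generation unit 14; `build_graph` is a hypothetical helper, and only the four links whose scores are stated in the text are included.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected weighted graph from (node, node, score) triples."""
    graph = defaultdict(dict)
    for a, b, score in links:
        graph[a][b] = score
        graph[b][a] = score  # a link associates both queries with each other
    return dict(graph)


# Links of node A as described for FIG. 3: L2, L3, L4, and L5.
links_of_a = [
    ("A", "E", 5),  # L2
    ("A", "C", 4),  # L3
    ("A", "D", 6),  # L4
    ("A", "B", 8),  # L5
]
graph = build_graph(links_of_a)

# The neighbour with the highest duplicate search score is the most related query.
strongest = max(graph["A"], key=graph["A"].get)
```

With these values, `strongest` is node B, matching the observation in the text.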

The classification unit 16 classifies the queries based on the graph data generated by the generation unit 14. The details of the query classification processing performed by the classification unit 16 will be described later.

The storage unit 18 stores the history information acquired by the acquisition unit 10 from the search server 5, the duplicate search score information calculated by the relevance calculation unit 12, the graph data generated by the generation unit 14, the results of the query classification performed by the classification unit 16, and the like.

[Processing of the information processing apparatus]
Next, the operation of the information processing apparatus 7 will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the processing of the information processing apparatus 7.

First, the acquisition unit 10 acquires the history information from the search server 5 and stores it in the storage unit 18 (step S101).

Next, the relevance calculation unit 12 reads the history information from the storage unit 18 and calculates duplicate search scores based on the read history information (step S103). The relevance calculation unit 12 stores the duplicate search score information in the storage unit 18.

Next, the generation unit 14 reads the duplicate search score information from the storage unit 18 and generates graph data indicating the relationships between queries (step S105). For example, the generation unit 14 generates graph data such as that shown in FIG. 3.

Next, the generation unit 14 processes nodes that are connected to exactly one other node (hereinafter referred to as "terminal nodes") (step S107). For example, the generation unit 14 selects a terminal node in the graph data and sets a parent-child relationship in which the selected terminal node is the "child node" and the node it is connected to is the "parent node".

FIG. 5 is a diagram explaining an example of node processing applied to the graph data shown in FIG. 3. The graph data shown in FIG. 3 contains two terminal nodes (node D and node F). As shown in step S107 (first time) in FIG. 5, the generation unit 14 selects, for example, node D as the terminal node to be processed, and sets a parent-child relationship in which node D is the "child node" and node A, to which node D is connected, is the "parent node" of node D. A node D whose parent-child relationship has been set is treated as deleted from the graph data in the subsequent processing. In FIG. 5, deleted nodes are drawn with dotted lines.

Next, the generation unit 14 determines whether all terminal nodes have been processed (step S109). When it determines that not all terminal nodes have been processed, it sets the parent-child relationship described above for an unprocessed terminal node. In the example shown in FIG. 5, node F remains as an unprocessed terminal node after node D has been processed. Therefore, as shown in step S107 (second time) in FIG. 5, the generation unit 14 selects node F as the terminal node to be processed, and sets a parent-child relationship in which node F is the "child node" and node E, to which node F is connected, is the "parent node" of node F.

As a result of the processing of node F, node E becomes a terminal node connected only to node A. Therefore, as shown in step S107 (third time) in FIG. 5, the generation unit 14 selects node E as the terminal node to be processed, and sets a parent-child relationship in which node E is the "child node" and node A, to which node E is connected, is the "parent node" of node E.

When, on the other hand, the generation unit 14 determines that all terminal nodes have been processed, it determines whether all nodes included in the graph data have been processed (step S111). When it determines that not all nodes have been processed, it selects, from among the unprocessed nodes (here, queries associated with two or more other queries), the node connected to the fewest nodes as the processing target, and deletes the link with the lowest duplicate search score among the links connected to the selected node (step S113). Because deleting the link produces new terminal nodes, the generation unit 14 applies the terminal-node processing described above to these new terminal nodes.

In the example shown in FIG. 5, after terminal node E has been processed, nodes A, B, C, G, and H remain unprocessed. The generation unit 14 therefore determines that not all nodes have been processed and performs the link deletion described above. In the example shown in FIG. 5, the nodes connected to the fewest nodes are nodes A, C, G, and H, each connected to two nodes, and these are the candidates for processing. The generation unit 14 selects one of nodes A, C, G, and H as the processing target (node A in the example shown in FIG. 5) and, of the links L5 and L3 connected to the selected node A, deletes link L3, which has the lower duplicate search score. With link L3 deleted, node A and node C become terminal nodes. When several of the links connected to the selected node share the lowest duplicate search score, the generation unit 14 may delete any one of them.

FIG. 6 is a diagram explaining an example of node processing after link L3 has been deleted in the example shown in FIG. 5. As shown in step S107 (fourth time) in FIG. 6, the generation unit 14 selects node A as the terminal node to be processed, and sets a parent-child relationship in which node A is the "child node" and node B, to which node A is connected, is the "parent node" of node A. Then, as shown in step S107 (fifth time) in FIG. 6, the generation unit 14 selects node C as the terminal node to be processed, and sets a parent-child relationship in which node C is the "child node" and node B, to which node C is connected, is the "parent node" of node C.

In the example shown in FIG. 6, after terminal node C has been processed, nodes B, G, and H remain unprocessed. The nodes connected to the fewest nodes are therefore nodes B, G, and H, each connected to two nodes, and these are the candidates for processing. The generation unit 14 selects one of nodes B, G, and H as the processing target (node B in the example shown in FIG. 6) and, of the links L7 and L8 connected to the selected node B, deletes link L7, which has the lower duplicate search score. With link L7 deleted, node B and node G become terminal nodes.

FIG. 7 is a diagram explaining an example of node processing after link L7 has been deleted in the example shown in FIG. 6. As shown in step S107 (sixth time) in FIG. 7, the generation unit 14 selects node B as the terminal node to be processed, and sets a parent-child relationship in which node B is the "child node" and node H, to which node B is connected, is the "parent node" of node B. Then, as shown in step S107 (seventh time) in FIG. 7, the generation unit 14 selects node G as the terminal node to be processed, and sets a parent-child relationship in which node G is the "child node" and node H, to which node G is connected, is the "parent node" of node G.

When the generation unit 14 determines that all nodes have been processed, it generates a tree structure indicating the parent-child relationships between the queries (step S115). FIG. 8 shows a tree structure that summarizes the parent-child relationships set in FIGS. 5 to 7. In the tree structure shown in FIG. 8, the nodes are arranged in layers from the first layer to the fifth layer.
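Steps S107 to S115 can be sketched in Python as a loop that repeatedly peels off terminal nodes and, when none remain, cuts the weakest link of the least-connected node. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the text allows any candidate to be chosen in step S113, whereas the code breaks ties alphabetically so that it follows the walk-through of FIGS. 5 to 7. The scores of L2 to L5 are taken from the text; the scores marked "hypothetical" are invented for illustration (the text states only that L7 scores lower than L8), and the B-C link and its label L6 are inferred from the walk-through.

```python
def build_parent_map(links):
    """Return {child: parent} by peeling terminal nodes from a weighted graph."""
    adj = {}
    for a, b, score in links:
        adj.setdefault(a, {})[b] = score
        adj.setdefault(b, {})[a] = score

    parents = {}
    while True:
        # Steps S107/S109: process terminal nodes (exactly one connected node).
        leaves = sorted(n for n in adj if len(adj[n]) == 1)
        if leaves:
            child = leaves[0]
            parent = next(iter(adj[child]))
            parents[child] = parent
            del adj[parent][child]  # the child is treated as deleted
            del adj[child]
            continue
        # Step S111: stop when no node with two or more links remains.
        candidates = [n for n in adj if len(adj[n]) >= 2]
        if not candidates:
            return parents
        # Step S113: least-connected node, then its lowest-scoring link.
        # ASSUMPTION: ties are broken alphabetically.
        node = min(candidates, key=lambda n: (len(adj[n]), n))
        weakest = min(adj[node], key=lambda m: (adj[node][m], m))
        del adj[node][weakest]
        del adj[weakest][node]


def layer(node, parents):
    """Layer of a node in the tree; the root is in layer 1."""
    depth = 1
    while node in parents:
        node = parents[node]
        depth += 1
    return depth


links = [
    ("E", "F", 3),  # L1, hypothetical score
    ("A", "E", 5),  # L2
    ("A", "C", 4),  # L3
    ("A", "D", 6),  # L4
    ("A", "B", 8),  # L5
    ("B", "C", 7),  # L6 (inferred link), hypothetical score
    ("B", "G", 2),  # L7, hypothetical score (lower than L8)
    ("B", "H", 9),  # L8, hypothetical score
    ("G", "H", 6),  # L9, hypothetical score
]
parents = build_parent_map(links)
```

With these inputs the resulting parent map places node H at the root and node F in the fifth layer, which agrees with the five layers described for FIG. 8.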

Next, the classification unit 16 classifies the queries using the tree structure generated by the generation unit 14 (step S117). For example, the classification unit 16 classifies the queries based on their layer in the tree structure. The classification unit 16 may classify queries located in the same layer as belonging to the same group. Alternatively, the classification unit 16 may classify all queries at or below a preset layer as belonging to the same group. Any criterion may be used for classifying the queries. The information processing apparatus 7 then ends the processing of this flowchart.
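The two layer-based criteria mentioned above can be illustrated with the parent-child relationships set in the walk-through of FIGS. 5 to 7 (D and E under A, F under E, A and C under B, B and G under H). The sketch below is only one way to express them; the function names and the choice of layer 3 as the preset layer are assumptions made for the example.

```python
from collections import defaultdict

# Parent-child relationships set in steps S107 (first to seventh time).
PARENTS = {"D": "A", "F": "E", "E": "A", "A": "B", "C": "B", "B": "H", "G": "H"}


def layer(node, parents):
    """Layer of a node in the tree; the root is in layer 1."""
    depth = 1
    while node in parents:
        node = parents[node]
        depth += 1
    return depth


def group_by_layer(parents):
    """Criterion 1: queries in the same layer form one group."""
    nodes = set(parents) | set(parents.values())
    groups = defaultdict(list)
    for node in sorted(nodes):
        groups[layer(node, parents)].append(node)
    return dict(groups)


def group_at_or_below(parents, preset_layer):
    """Criterion 2: all queries at or below a preset layer form one group."""
    nodes = set(parents) | set(parents.values())
    return sorted(n for n in nodes if layer(n, parents) >= preset_layer)


by_layer = group_by_layer(PARENTS)
lower_group = group_at_or_below(PARENTS, 3)  # preset layer 3 is an arbitrary choice
```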

According to the first embodiment described above, the relationships between queries can be grasped accurately and easily.

<Second Embodiment>
The second embodiment will now be described. The information processing apparatus 7 of the second embodiment differs from that of the first embodiment in the query classification processing performed by the classification unit 16. Therefore, for the configuration and related matters, the drawings and related descriptions of the first embodiment apply, and a detailed description is omitted.

[Processing of the information processing apparatus]
Next, the operation of the information processing apparatus 7 will be described with reference to FIG. 9. FIG. 9 is a flowchart showing an example of the processing of the information processing apparatus 7 in the second embodiment.

First, the acquisition unit 10 acquires the history information from the search server 5 and stores it in the storage unit 18 (step S201).

Next, the relevance calculation unit 12 reads the history information from the storage unit 18 and calculates duplicate search scores based on the read history information (step S203). The relevance calculation unit 12 stores the duplicate search score information in the storage unit 18.

Next, the generation unit 14 reads the duplicate search score information from the storage unit 18 and generates graph data indicating the relationships between queries (step S205). For example, the generation unit 14 generates graph data such as that shown in FIG. 10. In the graph data shown in FIG. 10, a total of ten queries are shown as nodes A to J.

Next, the classification unit 16 randomly selects one node in the generated graph data as the processing target (step S207). The classification unit 16 then classifies into one group the queries that are connected to the selected node within a predetermined number of links (step S209).

FIG. 11 is a diagram explaining an example of node processing applied to the graph data shown in FIG. 10. As shown in steps S207 and S209 (first time) in FIG. 11, the classification unit 16 selects, for example, node J as the node to be processed. Next, taking node J as the reference, the classification unit 16 classifies the nodes connected to it within, for example, three links as belonging to the same group (group J). The nodes connected within three links are node I, which is directly connected to node J via link L11 (one link); node H, which is connected to node J via links L11 and L10 (two links); node G, which is connected to node J via links L11, L10, and L9 (three links); and node B, which is connected to node J via links L11, L10, and L8 (three links).

Next, the classification unit 16 determines whether classification has been completed for all nodes (step S211). If it determines that classification has not been completed for all nodes, the classification unit 16 randomly selects one node to be processed from among the nodes that have not yet been classified and performs the above classification process again.

In the example shown in FIG. 11, nodes A, C, D, E, and F remain unclassified after the classification based on node J is completed. The classification unit 16 therefore determines that classification has not been completed for all nodes, randomly selects one node to be processed from among nodes A, C, D, E, and F, and performs the above classification process. As shown in steps S207 and S209 (second iteration) of FIG. 11, the classification unit 16 selects, for example, node F as the node to be processed.

Next, taking node F as the reference, the classification unit 16 classifies, for example, the nodes connected within three links as nodes belonging to the same group (the F group). The nodes connected within three links are node E, connected directly to node F via link L1 (one link); node A, connected to node F via links L1 and L2 (two links); node D, connected to node F via links L1, L2, and L4 (three links); node B, connected to node F via links L1, L2, and L5 (three links); and node C, connected to node F via links L1, L2, and L3 (three links).

Here, node B belongs to both the group based on node J (the J group) and the group based on node F (the F group). When repeated random selection of queries causes one node to belong to multiple groups in this way, the classification unit 16 decides which group the node is classified into based on the magnitude of the duplicate search scores between that node and each of the nodes it is connected to.

For example, compare the link attached to node B on the path from node J to node B in the classification based on node J (link L8 (10) in the example of FIG. 11) with the link attached to node B on the path from node F to node B in the classification based on node F (link L5 (8) in the example of FIG. 11). The duplicate search score of link L8 (10) is higher than that of link L5 (8). Node B is therefore considered to be more strongly tied to the J group, and the classification unit 16 classifies node B into the J group. If the duplicate search scores of link L8 (reached from node J) and link L5 (reached from node F) are equal, the classification unit 16 may classify node B into either group.
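
The loop of steps S207 to S211, together with the rule for a node that falls into two groups, can be sketched as follows. This is an illustration under assumptions, not the publication's implementation: the random selection is replaced by an explicit list of reference nodes so that the run is reproducible, the link list covers only the links named in the text (L6 and L7 are omitted), and only the two scores stated in the text (10 for L8 and 8 for L5) are supplied. The names `LINKS`, `SCORES`, `reach`, and `classify` are hypothetical.

```python
from collections import deque

# Links named in the text for FIG. 10 / FIG. 11 (L6 and L7 are not described).
LINKS = {
    "L1": ("F", "E"), "L2": ("E", "A"), "L3": ("A", "C"),
    "L4": ("A", "D"), "L5": ("A", "B"), "L8": ("H", "B"),
    "L9": ("H", "G"), "L10": ("I", "H"), "L11": ("J", "I"),
}
# Only these two duplicate search scores are stated in the text.
SCORES = {"L8": 10, "L5": 8}

def reach(links, start, max_links):
    """BFS from start; return {node: last link on the path} (None for start)."""
    adj = {}
    for name, (a, b) in links.items():
        adj.setdefault(a, []).append((b, name))
        adj.setdefault(b, []).append((a, name))
    dist, via = {start: 0}, {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_links:
            continue
        for nxt, name in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                via[nxt] = name
                queue.append(nxt)
    return via

def classify(links, seeds, max_links, scores):
    """Group nodes around each seed; contested nodes go to the higher score."""
    group_of, entry_link = {}, {}
    for seed in seeds:
        if seed in group_of:
            continue  # already classified, so not eligible as a reference node
        for node, link in reach(links, seed, max_links).items():
            if node not in group_of:
                group_of[node], entry_link[node] = seed, link
            elif (entry_link[node] is not None
                  and scores[link] > scores[entry_link[node]]):
                # Node reached from two groups: keep the stronger link.
                group_of[node], entry_link[node] = seed, link
            # On a tie the first assignment is kept (either group is allowed).
    return group_of

GROUPS = classify(LINKS, ["J", "F"], 3, SCORES)
```

With reference nodes J and F, node B is reached through L8 from the J group and through L5 from the F group, and ends up in the J group because 10 is greater than 8. Other seed orders would need scores for further links, which the text does not give.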

On the other hand, if the classification unit 16 determines that classification has been completed for all nodes, it stores the classification result in the storage unit 18. The information processing apparatus 7 then ends the processing of this flowchart.

According to the second embodiment described above, the relationships between queries can be grasped accurately and easily. In addition, the processing can be simplified by randomly selecting one node to be processed and performing the classification process with that node as the reference.

<Third Embodiment>
The third embodiment is described below. Compared with the first embodiment, the information processing apparatus 7 of the third embodiment differs in the query classification process performed by the classification unit 16. For the configuration and other aspects, the drawings and related description of the first embodiment are therefore incorporated, and a detailed description is omitted.

[Processing of the information processing apparatus]
Next, the operation of the information processing apparatus 7 is described with reference to FIG. 12. A query that is associated with many other queries may be a keyword that was searched fraudulently, for example by spam. If history information containing such a query is processed, the relationships between queries become complicated and the accuracy of the classification process may decrease. The information processing apparatus 7 of this embodiment therefore deletes any query associated with a predetermined number of queries or more, and performs the classification process on the remaining queries. FIG. 12 is a flowchart showing an example of the processing of the information processing apparatus 7 in the third embodiment.

First, the acquisition unit 10 acquires history information from the search server 5 and stores it in the storage unit 18 (step S301).

Next, the relevance calculation unit 12 reads the history information from the storage unit 18 and calculates duplicate search scores based on it (step S303). The relevance calculation unit 12 stores the duplicate search score information in the storage unit 18.
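
As a rough illustration of step S303, the sketch below counts, for each pair of queries, the number of distinct users who searched both. The exact scoring formula is defined in the first embodiment and is not reproduced in this section, so treating the raw user count as the score is an assumption; the records in `HISTORY` and the function name `duplicate_search_scores` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical history records: (user id, query).
HISTORY = [
    ("u1", "camera"), ("u1", "lens"), ("u1", "camera"),
    ("u2", "camera"), ("u2", "lens"), ("u2", "tripod"),
    ("u3", "tripod"),
]

def duplicate_search_scores(history):
    """Count, per query pair, the distinct users who searched both queries."""
    queries_by_user = {}
    for user, query in history:
        queries_by_user.setdefault(user, set()).add(query)
    scores = Counter()
    for queries in queries_by_user.values():
        # Sorting gives each unordered pair one canonical key.
        for pair in combinations(sorted(queries), 2):
            scores[pair] += 1
    return scores

SCORES = duplicate_search_scores(HISTORY)
```

Repeated searches by the same user are counted once, because the score here depends on the number of users rather than the number of searches.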

Next, the generation unit 14 reads the duplicate search score information from the storage unit 18 and generates graph data indicating the relationships between queries (step S305). For example, the generation unit 14 generates graph data such as that shown in FIG. 13. In the graph data of FIG. 13, a total of ten queries are represented as nodes A to J.

Next, the generation unit 14 deletes from the generated graph data any node whose number of connected nodes is a predetermined number or more (step S307). FIG. 14 is a diagram illustrating an example of node processing for the graph data shown in FIG. 13. As shown in step S307 of FIG. 14, the generation unit 14 deletes node B, whose number of connected nodes is the predetermined number or more (five or more in the example of FIG. 14). As a result, node I becomes a node with no connections. In this case, the classification unit 16 classifies node I as a node belonging to the I group.
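
Step S307 can be sketched as a degree filter over an adjacency map. FIG. 13 is not reproduced in the text, so the graph below is a hypothetical one chosen to be consistent with the description: node B's neighbours other than I, and the link between G and H, are assumptions. The names `GRAPH` and `remove_hubs` are illustrative.

```python
# Hypothetical adjacency consistent with the description of FIG. 13 / FIG. 14.
# B's neighbours other than I and the G-H link are assumptions.
GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "G", "H", "I"},
    "C": {"A", "B", "E"},
    "D": {"A"},
    "E": {"C", "F"},
    "F": {"E"},
    "G": {"B", "H", "J"},
    "H": {"B", "G", "J"},
    "I": {"B"},
    "J": {"G", "H"},
}

def remove_hubs(adj, min_degree):
    """Drop every node with at least min_degree neighbours, and its links."""
    hubs = {node for node, nbrs in adj.items() if len(nbrs) >= min_degree}
    return {node: nbrs - hubs for node, nbrs in adj.items() if node not in hubs}

PRUNED = remove_hubs(GRAPH, 5)  # threshold of five, as in the FIG. 14 example
```

After pruning, node I has an empty neighbour set, which corresponds to the case in the text where node I forms a group on its own.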

Next, the generation unit 14 processes the terminal nodes, that is, the nodes that have exactly one connected node (step S309). In the example of FIG. 14, the graph data after the deletion of node B contains two terminal nodes (node D and node F). As shown in step S309 (first iteration) of FIG. 14, the generation unit 14 selects, for example, node D as the terminal node to be processed, and sets a parent-child relationship in which the selected node D is the "child node" and node A, to which node D is connected, is the "parent node" of node D.

Next, the generation unit 14 determines whether all terminal nodes have been processed (step S311). If it determines that not all terminal nodes have been processed, the generation unit 14 sets the above parent-child relationship for an unprocessed terminal node. In the example of FIG. 14, node F remains as an unprocessed terminal node after node D has been processed. Therefore, as shown in step S309 (second iteration) of FIG. 14, the generation unit 14 selects node F as the terminal node to be processed, and sets a parent-child relationship in which the selected node F is the "child node" and node E, to which node F is connected, is the "parent node" of node F.
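
The terminal-node processing of steps S309 and S311 can be sketched as a single pass that records, for each node with exactly one neighbour, that neighbour as its parent. The adjacency map is a hypothetical graph consistent with the description of FIG. 14 after node B is deleted (the G-H link is an assumption), and `set_parent_child` is an illustrative name.

```python
# Hypothetical graph after node B has been deleted (G-H link is an assumption).
ADJ = {
    "A": {"C", "D"},
    "C": {"A", "E"},
    "D": {"A"},
    "E": {"C", "F"},
    "F": {"E"},
    "G": {"H", "J"},
    "H": {"G", "J"},
    "I": set(),
    "J": {"G", "H"},
}

def set_parent_child(adj):
    """Map each terminal node (exactly one neighbour) to that neighbour."""
    return {
        node: next(iter(nbrs))
        for node, nbrs in adj.items()
        if len(nbrs) == 1
    }

PARENT = set_parent_child(ADJ)  # child node -> parent node
```

Node I has no neighbours and is therefore not a terminal node in this sense. A component consisting of only two linked nodes would make each the other's child; the text does not discuss that case, and the sketch does not handle it specially.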

On the other hand, when the generation unit 14 determines that all terminal nodes have been processed, the classification unit 16 randomly selects one node to be processed from among the nodes in the graph data (step S313). The classification unit 16 then classifies the queries connected to the selected node within a predetermined number of links into one group (step S315). For example, as shown in steps S313 and S315 (first iteration) of FIG. 14, the classification unit 16 selects node J as the node to be processed. Taking node J as the reference, the classification unit 16 then classifies, for example, the nodes connected within three links as nodes belonging to the same group (the J group). The nodes connected within three links are node H, connected directly to node J via link L11 (one link), and node G, connected directly to node J via link L12 (one link).

Next, the classification unit 16 determines whether classification has been completed for all nodes (step S317). If it determines that classification has not been completed for all nodes, the classification unit 16 randomly selects one node to be processed from among the nodes that have not yet been classified and performs the above classification process again.

In the example shown in FIG. 14, nodes A, C, and E remain unclassified after the classification based on node J is completed. The classification unit 16 therefore determines that classification has not been completed for all nodes, randomly selects one node to be processed from among nodes A, C, and E, and performs the above classification process. As shown in steps S313 and S315 (second iteration) of FIG. 14, the classification unit 16 selects, for example, node C as the node to be processed. Taking node C as the reference, the classification unit 16 then classifies, for example, the nodes connected within three links as nodes belonging to the same group (the C group). The nodes connected within three links are node A, connected directly to node C via link L3 (one link), and node E, connected directly to node C via link L13 (one link).

In addition, node D, which was set as a child node of node A (parent node) in the terminal node processing above, and node F, which was set as a child node of node E (parent node), are classified as nodes belonging to the same group as their parent nodes (the C group).
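
Steps S313 to S317, followed by the assignment of child nodes to their parents' groups, can be sketched as below. As before, this rests on assumptions: the graph is a hypothetical one consistent with the description of FIG. 14 (the G-H link is assumed), random selection is replaced by an explicit list of reference nodes, and handling of nodes reached from two groups is left out. The function names are illustrative.

```python
from collections import deque

# Hypothetical graph after deleting node B (G-H link is an assumption).
ADJ = {
    "A": {"C", "D"}, "C": {"A", "E"}, "D": {"A"},
    "E": {"C", "F"}, "F": {"E"}, "G": {"H", "J"},
    "H": {"G", "J"}, "I": set(), "J": {"G", "H"},
}
PARENT = {"D": "A", "F": "E"}  # child node -> parent node (steps S309-S311)

def group_within(adj, start, max_links, skip):
    """BFS from start, ignoring nodes in skip; return the set of reached nodes."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_links:
            continue
        for nxt in adj[node]:
            if nxt not in dist and nxt not in skip:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return set(dist)

def classify_with_children(adj, parent, seeds, max_links):
    """Group non-child nodes around seeds, then attach children to parents."""
    group_of = {}
    for seed in seeds:
        if seed in group_of or seed in parent:
            continue  # only unclassified, non-child nodes act as references
        for node in group_within(adj, seed, max_links, skip=set(parent)):
            group_of.setdefault(node, seed)  # overlap resolution omitted
    for child, par in parent.items():
        group_of[child] = group_of[par]  # assumes every parent was classified
    return group_of

GROUPS = classify_with_children(ADJ, PARENT, ["J", "C", "I"], 3)
```

With reference nodes J, C, and I, the result places J, H, and G in the J group; C, A, E, D, and F in the C group; and I in its own group, as in the example of FIG. 14.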

According to the third embodiment described above, the relationships between queries can be grasped accurately and easily. In addition, deleting queries that are associated with many other queries prevents the relationships between queries from becoming overly complicated and improves the accuracy of the query classification process.

<Fourth Embodiment>
The fourth embodiment is described below. It differs from the first embodiment in that the information processing system of the fourth embodiment further includes an advertisement information processing apparatus. For the configuration and other aspects, the drawings and related description of the first embodiment are therefore incorporated, and a detailed description is omitted.

FIG. 15 is a configuration diagram of the advertisement information processing system 1A. The advertisement information processing system 1A includes, for example, an advertisement information processing apparatus 9 in addition to the terminal device 3, the search server 5, and the information processing apparatus 7 shown in FIG. 1. The advertisement information processing apparatus 9 is connected to the search server 5 and the information processing apparatus 7.

[Advertisement information processing apparatus]
Based on the query classification result output from the information processing apparatus 7, the advertisement information processing apparatus 9 determines the advertisement information corresponding to the query that the search server 5 received from the terminal device 3. The advertisement information processing apparatus 9 outputs the determined advertisement information to the search server 5. The search server 5 transmits the advertisement information input from the advertisement information processing apparatus 9 to the terminal device 3 together with the search results for the query received from the terminal device 3.

For example, the advertisement information processing apparatus 9 includes a storage unit (not shown) that stores a plurality of pieces of advertisement information and the queries associated with each of them. Based on the classification result output by the information processing apparatus 7 of the first to third embodiments, the advertisement information processing apparatus 9 outputs to the search server 5 either the advertisement information associated with the query that the search server 5 received from the terminal device 3, or the advertisement information associated with another query classified into the same group as that query.
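
The lookup described above can be sketched as follows. The classification result and the advertisement table are hypothetical data, and returning directly associated advertisements first, with the same-group lookup as a fallback, is one possible reading of "either ... or"; the text does not specify a priority. The name `select_ads` is illustrative.

```python
# Hypothetical classification result: query -> group label.
GROUP_OF = {
    "J": "J", "H": "J", "G": "J",
    "C": "C", "A": "C", "E": "C", "D": "C", "F": "C",
    "I": "I",
}
# Hypothetical stored advertisement information: query -> advertisements.
ADS_BY_QUERY = {"A": ["ad-1"], "G": ["ad-2"]}

def select_ads(query, group_of, ads_by_query):
    """Return ads tied to the query, else ads tied to same-group queries."""
    if query in ads_by_query:
        return list(ads_by_query[query])
    group = group_of.get(query)
    if group is None:
        return []  # query was never classified
    ads = []
    for other in sorted(group_of):
        if other != query and group_of[other] == group:
            ads.extend(ads_by_query.get(other, []))
    return ads
```

For instance, query D has no advertisement of its own here, so the lookup falls back to query A in the same group.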

According to the advertisement information processing system 1A of the fourth embodiment, advertisement information suited to the query entered by the user of the terminal device 3 can be provided to the terminal device 3. This makes it possible to realize a service with a high advertising effect.

The above embodiments describe methods of classifying queries based on the duplicate search scores between queries. When classifying queries, information on the time at which each query was searched may also be used. For example, the classification unit 16 may generate graph data indicating the relationships between queries by using information in which the history information acquired from the search server 5 is associated with information on the time at which the search server 5 received each query from the terminal device 3. In this graph data, the temporal order of searches between associated queries is represented by a directed graph. In addition, the time at which the search server 5 received each query from the terminal device 3 (the search time), or the difference in search time between queries, is attached to each node or link. Generating such graph data makes it possible to grasp time-series information about searches, for example that a certain user searched for query B after searching for query A, and so allows queries to be classified in finer detail.
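
One way to picture such time-aware graph data is a directed edge from each query to the query the same user searched next, with the time difference attached to the edge. This is only a sketch under assumptions: the log records are hypothetical, and linking only consecutive searches by the same user is a simplification that the text does not prescribe. The names `LOG` and `directed_edges` are illustrative.

```python
from collections import defaultdict

# Hypothetical log records: (user id, query, time received in seconds).
LOG = [
    ("u1", "A", 0), ("u1", "B", 30),
    ("u2", "A", 100), ("u2", "B", 160), ("u2", "C", 400),
]

def directed_edges(log):
    """Return {(earlier query, later query): [time differences in seconds]}."""
    by_user = defaultdict(list)
    for user, query, received_at in log:
        by_user[user].append((received_at, query))
    gaps = defaultdict(list)
    for events in by_user.values():
        events.sort()  # order each user's searches by time
        for (t0, q0), (t1, q1) in zip(events, events[1:]):
            if q0 != q1:
                gaps[(q0, q1)].append(t1 - t0)
    return dict(gaps)

EDGES = directed_edges(LOG)
```

Here the edge from A to B records that both users searched for B after A, along with how long each took.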

Although modes for carrying out the present invention have been described above using embodiments, the present invention is in no way limited to these embodiments, and various modifications and substitutions can be made without departing from the gist of the present invention.

Description of symbols: 1 ... information processing system, 1A ... advertisement information processing system, 3 ... terminal device, 5 ... search server, 7 ... information processing apparatus, 9 ... advertisement information processing apparatus, 10 ... acquisition unit, 12 ... relevance calculation unit, 14 ... generation unit, 16 ... classification unit, 18 ... storage unit, NW ... network

Claims

An information processing apparatus comprising:
a calculation unit that calculates a degree of association between each pair of queries among a plurality of queries used for network searches, based on the number of users who searched for both queries of the pair;
a generation unit that generates graph data indicating whether each pair of queries is associated and the degree of association between each pair of queries; and
a classification unit that classifies the queries based on the graph data generated by the generation unit.

The information processing apparatus according to claim 1, wherein
the generation unit further selects, from among the plurality of queries in the graph data, a query that has exactly one associated query, and sets a parent-child relationship between the selected query and the query associated with the selected query, and
the classification unit classifies the queries based on the parent-child relationship set by the generation unit.

The information processing apparatus according to claim 2, wherein
for a query that has two or more associated queries in the graph data, the generation unit deletes the association with the query having the lowest degree of association so as to generate a query that has exactly one associated query, and sets a parent-child relationship between the generated query and the query associated with the generated query.

The information processing apparatus according to claim 1, wherein
the classification unit randomly selects one query from among the plurality of queries in the graph data, and classifies the queries connected to the selected query within a predetermined number of links into one group.

The information processing apparatus according to claim 4, wherein
when one query is classified into a plurality of groups as a result of performing the random selection of a query a plurality of times, the classification unit determines the one group into which the one query is classified based on the degree of association between the one query and the queries associated with the one query.

The information processing apparatus according to claim 1, wherein
the generation unit deletes, from among the plurality of queries in the graph data, any query associated with a predetermined number of queries or more, and
the classification unit classifies the queries other than the deleted query.

The information processing apparatus according to claim 6, wherein
the generation unit further selects, from among the plurality of queries in the graph data, a query that has exactly one associated query, and sets a parent-child relationship between the selected query and the query associated with the selected query, and
the classification unit randomly selects, from among the plurality of queries, one query other than the selected query, and classifies into one group the queries connected to the randomly selected query within a predetermined number of links, together with the queries for which a parent-child relationship has been set with the randomly selected query or with the queries connected within the predetermined number of links.

An information processing method in which a computer:
calculates a degree of association between each pair of queries among a plurality of queries used for network searches, based on the number of users who searched for both queries of the pair;
generates graph data indicating whether each pair of queries is associated and the degree of association between each pair of queries; and
classifies the queries based on the graph data.

A program that causes a computer to:
calculate a degree of association between each pair of queries among a plurality of queries used for network searches, based on the number of users who searched for both queries of the pair;
generate graph data indicating whether each pair of queries is associated and the degree of association between each pair of queries; and
classify the queries based on the graph data.

An advertisement information processing system comprising:
the information processing apparatus according to any one of claims 1 to 7;
a search server that receives a query from a terminal device of a user and transmits a search result corresponding to the received query to the terminal device; and
an advertisement information processing apparatus that determines, based on the query classification result output from the information processing apparatus, advertisement information associated with the query received by the search server,
wherein the search server further transmits the advertisement information determined by the advertisement information processing apparatus to the terminal device.