JP5682480B2

JP5682480B2 - Information processing apparatus, information processing method, and information processing program

Info

Publication number: JP5682480B2
Application number: JP2011146736A
Authority: JP
Inventors: 昌彦杉村; 秋吾中村; 正樹石原; 孝之馬場; 進遠藤; 祐介上原; 大器増本; 茂美長田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-06-30
Filing date: 2011-06-30
Publication date: 2015-03-11
Anticipated expiration: 2031-06-30
Also published as: JP2013015920A

Description

The present invention relates to an information processing apparatus, an information processing method, and an information processing program for processing information.

In today's information-oriented society, gathering information from websites on a network is common. A website contains multiple web pages connected to one another by links. A person gathering information must therefore start at the top-level web page of the website, follow the links to the linked web pages, check their contents one by one, and judge whether each page contains the information being sought.

Conventionally, to make information gathering more efficient, there is a technique that outputs summary information of a web page as a pop-up before the web page is accessed, so that the user can judge from the displayed summary information whether the page contains the information being sought (see, for example, Patent Document 1 below).

There is also a technique for calculating the frequency with which words appear in a document (see, for example, Patent Document 2 below), a technique for assigning an importance to information based on how long the accessed information was displayed (see, for example, Patent Document 3 below), and a technique for identifying the block in a document that is most relevant to a search key (see, for example, Patent Document 4 below).

Patent Document 1: JP 2003-281093 A
Patent Document 2: JP 2000-112990 A
Patent Document 3: JP 2009-151627 A
Patent Document 4: JP 2008-269069 A

However, in the conventional techniques described above, the summary information of a web page was information prepared in advance by the creator of the web page, information at a specific location such as the top of the web page, or a snapshot of the web page. As a result, the summary information is sometimes unsuited to the needs of viewers. In addition, the summary information cannot be determined in response to changes in the needs of viewers.

To solve the above problems of the conventional techniques, an object of the present invention is to provide an information processing apparatus, an information processing method, and an information processing program capable of determining summary information in consideration of how a web page is accessed.

To solve the above problems and achieve the object, one aspect of the present invention proposes an information processing apparatus, an information processing method, and an information processing program that: acquire, for each access to a browsing target page, the search keyword used at the access source when transitioning to the browsing target page and the time during which the browsing target page was viewed at the access source; calculate, for each search keyword, the importance of the search keyword in the browsing target page, based on the time during which the browsing target page was viewed at the access sources that reached it using the acquired search keyword; calculate, for each text region, the importance of the text region in the browsing target page, based on the importance calculated for each search keyword and on the number of occurrences of each search keyword in each text region of the browsing target page; and determine, based on the importance calculated for each text region, a specific text region to serve as the summary information of the browsing target page.

Further, to solve the above problems and achieve the object, one aspect of the present invention proposes an information processing apparatus, an information processing method, and an information processing program that: acquire, for each access to a browsing target page group, the search keyword used at the access source when transitioning to the browsing target page group and the time during which the browsing target page group was viewed at the access source; calculate, for each search keyword, the importance of the search keyword in the browsing target page group, based on the time during which the browsing target page group was viewed at the access sources that reached it using the acquired search keyword; calculate, for each text region, the importance of the text region in the browsing target page group, based on the importance calculated for each search keyword and on the number of occurrences of each search keyword in each text region of the browsing target page group; and determine, based on the importance calculated for each text region, a specific text region to serve as the summary information of the browsing target page group.

According to one aspect of the present invention, summary information can be determined in consideration of how a web page is accessed.

FIG. 1 is an explanatory diagram showing how the information processing apparatus determines the summary information of a web page.
FIG. 2 is an explanatory diagram showing a configuration example of the system.
FIG. 3 is a block diagram showing a hardware configuration example of the information processing apparatus 100 according to the embodiment.
FIG. 4 is an explanatory diagram showing the contents stored in the access log DB 201.
FIG. 5 is an explanatory diagram showing the contents stored in the search keyword DB 202.
FIG. 6 is an explanatory diagram showing the contents stored in the region importance DB 203.
FIG. 7 is a block diagram showing the functional configuration of the information processing apparatus 100.
FIG. 8 is an explanatory diagram showing a specific example of how the information processing apparatus 100 acquires search keywords and stay times.
FIG. 9 is an explanatory diagram showing a specific example of how the information processing apparatus 100 calculates the importance of each search keyword.
FIG. 10 is an explanatory diagram showing a specific example of how the information processing apparatus 100 calculates the region importance of each text region.
FIG. 11 is an explanatory diagram showing a specific example of how the information processing apparatus 100 provides summary information.
FIG. 12 is a flowchart showing the details of the search keyword extraction processing.
FIG. 13 is a flowchart showing the details of the region importance calculation processing.

Embodiments of an information processing apparatus, an information processing method, and an information processing program according to the present invention are described in detail below with reference to the accompanying drawings. The information processing apparatus determines, as the summary information of a web page, a text region within the web page in which many viewers are interested. To do so, the information processing apparatus first acquires the search keyword that a viewer used on the way to the web page and the time during which the web page was displayed on the browsing terminal (hereinafter, "stay time").

Next, the information processing apparatus calculates an importance indicating how strongly viewers are interested in that search keyword on that web page. The information processing apparatus then treats a text region in the web page that contains more of the higher-importance search keywords as a text region of strong viewer interest, and determines it as the summary information of the web page.

As a result, the information processing apparatus can provide viewers of the web page with summary information in which many viewers were interested. Because a viewer of the web page can refer to this summary information, the viewer can more easily judge whether the web page contains the information being sought.

(Determination of web page summary information by the information processing apparatus)
First, how the information processing apparatus determines the summary information of a web page is described with reference to FIG. 1.

FIG. 1 is an explanatory diagram showing how the information processing apparatus determines the summary information of a web page. In FIG. 1, the information processing apparatus 100 determines, as the summary information of the web page WP, a text region within the web page WP in which many viewers S are interested.

To do so, as shown in (a) of FIG. 1, the information processing apparatus 100 first calculates an importance indicating how strongly the viewers S are interested in each search keyword on the web page WP. Here, the information processing apparatus 100 stores information about accesses to the web page WP as an access log. The web page WP is viewed by multiple viewers S. Assume that each viewer S uses a browsing terminal to enter a search keyword into a search site, reaches the web page WP from the search result page of the search site, and views the web page WP.

For example, assume that viewer S (A) reaches the web page WP using the search keyword "interference" and views it for 90 seconds, that viewer S (B) reaches the web page WP using the search keyword "interference" and views it for 120 seconds, and that viewer S (C) reaches the web page WP using the search keyword "simulation" and views it for 60 seconds.

(1) Here, the information processing apparatus 100 refers to the access log of the web page WP and acquires, for each access to the web page WP, the search keyword used by the viewer S and the stay time on the web page WP.

(2) Next, the information processing apparatus 100 calculates the importance of each search keyword on the web page WP. Here, the information processing apparatus 100 regards a search keyword with a large total stay time across the viewers S as a search keyword in which many viewers S are interested, and assigns it a high importance. Specifically, for example, the information processing apparatus 100 uses the sum of the stay times as the importance. In the example above, the importance of "interference" is 90 + 120 = 210 and the importance of "simulation" is 60.
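As an illustrative sketch only (not part of the disclosed embodiment), step (2) can be expressed in Python as follows. The function name `keyword_importance` and the tuple layout `(keyword, stay_seconds)` are assumptions introduced here for illustration; the input values are the three accesses from the FIG. 1 example.

```python
from collections import defaultdict


def keyword_importance(accesses):
    """Sum the stay time (seconds) per search keyword for one web page."""
    importance = defaultdict(int)
    for keyword, stay_seconds in accesses:
        importance[keyword] += stay_seconds
    return dict(importance)


# Accesses from the FIG. 1 example: (search keyword, stay time in seconds).
accesses = [("interference", 90), ("interference", 120), ("simulation", 60)]
importance = keyword_importance(accesses)
print(importance)  # {'interference': 210, 'simulation': 60}
```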

Next, as shown in (b) of FIG. 1, the information processing apparatus 100 calculates, for each text region in the web page WP (here, three text regions F1 to F3), a region importance indicating how strongly the viewers S are interested in that text region, and determines the summary information.

(1) Here, the information processing apparatus 100 acquires the data of each text region in the web page WP. In this example, the information processing apparatus 100 acquires the data of the text regions F1 to F3.

(2) Next, based on the acquired data of the text regions F1 to F3, the information processing apparatus 100 calculates, for each text region, a region importance indicating how strongly the viewers S are interested in it. Here, the region importance is the sum of the importances of the search keywords contained in the text region, counted once per occurrence. Specifically, for example, the text region F2 contains two occurrences of the search keyword "interference" (importance 210) and one occurrence of the search keyword "simulation" (importance 60), so the region importance of the text region F2 is 210 + 210 + 60 = 480.

The information processing apparatus 100 can thereby identify, based on the calculated region importances, the data of the text region in which the viewers S are interested, and determine the identified data as the summary information. Here, the data of the text region F2, which has the highest region importance, becomes the summary information of the web page WP.
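The region importance calculation and the choice of the summary region can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function names and the texts of the regions are hypothetical (the source states only that F2 contains "interference" twice and "simulation" once), and plain substring counting stands in for whatever keyword matching the apparatus actually uses.

```python
def region_importance(text, importance):
    """Sum keyword importances, counted once per occurrence in the text."""
    return sum(text.count(keyword) * weight
               for keyword, weight in importance.items())


def choose_summary(regions, importance):
    """Return the id of the highest-scoring region and all region scores."""
    scores = {region_id: region_importance(text, importance)
              for region_id, text in regions.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Keyword importances from the FIG. 1 example.
importance = {"interference": 210, "simulation": 60}

# Hypothetical region texts; only the keyword counts of F2 come from the source.
regions = {
    "F1": "overview of the design tool.",
    "F2": "the interference simulation detects interference between parts.",
    "F3": "run a simulation before assembly.",
}

best, scores = choose_summary(regions, importance)
print(best, scores)  # F2 {'F1': 0, 'F2': 480, 'F3': 60}
```

Substring counting would also match inflected forms such as "simulations"; a real implementation would likely tokenize the text first.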

As a result, as shown in (c) of FIG. 1, the link source page LP of the web page WP can display summary information in which many of the viewers S who reached the web page WP were interested. Specifically, for example, the information processing apparatus 100 uses JavaScript (Java is a registered trademark) to embed the data of the text region F2, which was determined as the summary information, in the HTML (HyperText Markup Language) document so that it is displayed as a pop-up PU when the mouse pointer P is placed over the link L to the web page WP on the browsing terminal.
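One way such a pop-up could be embedded is sketched below in Python, which here only generates the HTML fragment. The function name `link_with_popup`, the CSS class names, and the inline `onmouseover`/`onmouseout` handlers are assumptions chosen for illustration; the source does not specify how the script is written.

```python
from html import escape


def link_with_popup(url, label, summary):
    """Build a link whose hover reveals `summary` in a hidden sibling span."""
    return (
        '<span class="summary-link">'
        f'<a href="{escape(url, quote=True)}" '
        'onmouseover="this.nextElementSibling.style.display=\'block\'" '
        'onmouseout="this.nextElementSibling.style.display=\'none\'">'
        f'{escape(label)}</a>'
        f'<span class="popup" style="display:none">{escape(summary)}</span>'
        '</span>'
    )


# Hypothetical URL, label, and summary text for the link L on the page LP.
fragment = link_with_popup(
    "http://example.com/wp.html",
    "Web page WP",
    "Text of region F2 <summary>",
)
print(fragment)
```

Escaping the summary text keeps markup characters in the text region from breaking the link source page.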

A viewer S who is about to view the web page WP for the first time can therefore judge the content of the web page WP before accessing it, based on summary information in which many viewers S were interested. As a result, the viewer S can judge whether the web page WP contains the information being sought without examining the web page WP in detail, which makes information gathering more efficient. In addition, because the summary information in which many viewers S were interested is determined automatically, the creator of the web page WP is spared the effort of predicting the interests of the viewers S and setting the summary information manually.

(System configuration example)
Next, a configuration example of a system that includes the information processing apparatus 100 shown in FIG. 1, a browsing terminal used by a viewer S of the web page WP, and a search server is described with reference to FIG. 2.

FIG. 2 is an explanatory diagram showing a configuration example of the system. As shown in FIG. 2, the system includes the information processing apparatus 100, a browsing terminal 210, and a search server 220. Although FIG. 2 shows a single browsing terminal 210, the system may include multiple browsing terminals 210.

The information processing apparatus 100 stores the data of each web page WP in a website. The data of a web page WP is a document published on the network N, for example, an HTML document or an XML (Extensible Markup Language) document. The information processing apparatus 100 has an access log DB (DataBase) 201 that stores information about accesses from the browsing terminal 210, a search keyword DB 202 that stores the importance of search keywords for each web page WP in the website, and a region importance DB 203 that stores the region importance of each text region in a web page WP.

The browsing terminal 210 is a terminal that, in response to operations by the viewer S, accesses the search server 220 and searches for web pages WP. In response to operations by the viewer S, the browsing terminal 210 also accesses the data of a web page WP stored in the information processing apparatus 100 via a link L displayed on a search result page produced by the search server 220. The search server 220 is a server that searches for web pages WP on the network N based on the search keyword entered by the viewer S at the browsing terminal 210.

(Hardware configuration example of the information processing apparatus 100)
Next, a hardware configuration example of the information processing apparatus 100 shown in FIGS. 1 and 2 is described with reference to FIG. 3.

FIG. 3 is a block diagram showing a hardware configuration example of the information processing apparatus 100 according to the embodiment. In FIG. 3, the information processing apparatus 100 includes a CPU (Central Processing Unit) 301, a ROM (Read-Only Memory) 302, a RAM (Random Access Memory) 303, a magnetic disk drive 304, a magnetic disk 305, an optical disk drive 306, an optical disk 307, a display 308, an I/F (Interface) 309, a keyboard 310, a mouse 311, a scanner 312, and a printer 313. These components are connected to one another by a bus 320.

Here, the CPU 301 controls the information processing apparatus 100 as a whole. The ROM 302 stores programs such as a boot program. The ROM 302 also stores the data of the web pages WP in the website. The RAM 303 is used as a work area for the CPU 301. The RAM 303 also stores the access log DB 201, the search keyword DB 202, and the region importance DB 203.

The magnetic disk drive 304 controls the reading and writing of data on the magnetic disk 305 under the control of the CPU 301. The magnetic disk 305 stores the data written under the control of the magnetic disk drive 304.

The optical disk drive 306 controls the reading and writing of data on the optical disk 307 under the control of the CPU 301. The optical disk 307 stores the data written under the control of the optical disk drive 306, and allows a computer to read the data stored on it.

The display 308 displays a cursor, icons, and toolboxes, as well as data such as documents, images, and function information. For example, a CRT, a TFT liquid crystal display, or a plasma display can be used as the display 308.

The interface (hereinafter, "I/F") 309 is connected through a communication line to a network N such as a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet, and is connected to other apparatuses via the network N. The I/F 309 serves as the interface between the network N and the inside of the apparatus, and controls the input and output of data to and from external apparatuses. For example, a modem or a LAN adapter can be used as the I/F 309.

The keyboard 310 has keys for entering characters, numbers, and various instructions, and is used to input data. It may instead be a touch-panel input pad or a numeric keypad. The mouse 311 is used to move the cursor, select a range, and move or resize windows. A trackball, a joystick, or the like may be used instead, as long as it provides the same functions as a pointing device.

The scanner 312 optically reads an image and loads the image data into the information processing apparatus 100. The scanner 312 may be given an OCR (Optical Character Reader) function. The printer 313 prints image data and document data. For example, a laser printer or an inkjet printer can be used as the printer 313.

(Contents stored in the access log DB 201)
Next, the contents of the access log DB 201 stored in the RAM 303 are described with reference to FIG. 4.

FIG. 4 is an explanatory diagram showing the contents stored in the access log DB 201. As shown in FIG. 4, the access log DB 201 has a date/time item, a URL item, and a referrer item associated with each host item, and forms one record per access.

The host item stores an identifier that identifies the browsing terminal 210 that accessed the web page WP. Specifically, for example, the identifier is an IP address. The date/time item stores the date and time at which the web page WP was accessed. The URL item stores an identifier that identifies the web page WP. The referrer item stores the link source page LP of the web page WP identified by the identifier in the URL item.

Access logs stored by web servers generally also contain information such as the amount of data transferred, the communication protocol used for viewing, the web browser used for viewing, and the OS of the browsing terminal 210, but these are omitted here for simplicity.
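For illustration, one record of the access log DB 201 could be modeled in Python as follows. This is a sketch under assumptions: the class name `AccessRecord`, the field names, the tab-separated line layout, and the sample addresses and URLs are all hypothetical and are not taken from FIG. 4.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessRecord:
    host: str              # identifier of the browsing terminal, e.g. an IP address
    accessed_at: datetime  # date and time of the access
    url: str               # identifier of the accessed web page
    referrer: str          # link source page of the accessed web page


def parse_record(line):
    """Parse one tab-separated log line (hypothetical layout) into a record."""
    host, stamp, url, referrer = line.rstrip("\n").split("\t")
    return AccessRecord(host, datetime.fromisoformat(stamp), url, referrer)


record = parse_record(
    "192.0.2.10\t2011-06-30T10:15:00\t/wp.html\t"
    "http://search.example/search?q=interference"
)
print(record.host, record.accessed_at, record.url)
```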

(Contents stored in the search keyword DB 202)
Next, the contents of the search keyword DB 202 stored in the RAM 303 are described with reference to FIG. 5.

FIG. 5 is an explanatory diagram showing the contents stored in the search keyword DB 202. As shown in FIG. 5, the search keyword DB 202 has a search keyword item associated with each page name item, and forms one record per web page WP.

The page name item stores the name of the web page WP. The search keyword item stores, for each search keyword, an importance indicating how strongly the viewers S are interested in the web page WP indicated by the page name item. For example, the sum of the stay times of the viewers S on the web page WP indicated by the page name item is stored as the importance. Alternatively, the number of accesses to the web page WP indicated by the page name item may be used as the importance.
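The two importance measures mentioned above (summed stay time, or number of accesses) can be contrasted with a small Python sketch. The function name `build_keyword_db`, the `use_access_count` flag, and the nested-dictionary layout are assumptions made for illustration, as is the reading that accesses are counted per search keyword.

```python
from collections import Counter, defaultdict


def build_keyword_db(accesses, use_access_count=False):
    """Map page name -> {search keyword: importance}.

    Importance is the summed stay time by default, or the number of
    accesses per keyword when use_access_count is True.
    """
    db = defaultdict(Counter)
    for page, keyword, stay_seconds in accesses:
        db[page][keyword] += 1 if use_access_count else stay_seconds
    return {page: dict(counts) for page, counts in db.items()}


# (page name, search keyword, stay time in seconds), after the FIG. 1 example.
accesses = [
    ("WP", "interference", 90),
    ("WP", "interference", 120),
    ("WP", "simulation", 60),
]
by_stay = build_keyword_db(accesses)
by_count = build_keyword_db(accesses, use_access_count=True)
print(by_stay)   # {'WP': {'interference': 210, 'simulation': 60}}
print(by_count)  # {'WP': {'interference': 2, 'simulation': 1}}
```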

(Contents stored in the region importance DB 203)
Next, the contents of the region importance DB 203 stored in the RAM 303 are described with reference to FIG. 6.

FIG. 6 is an explanatory diagram showing the contents stored in the region importance DB 203. As shown in FIG. 6, the region importance DB 203 has a region importance item associated with each region item, and forms one record per text region in the web page WP.

The region item stores an identifier that identifies a text region in the web page WP. The region importance item stores a region importance indicating how strongly the viewers S are interested in the text region identified by the identifier in the region item. For example, the sum of the importances of the search keywords contained in the text region is stored as the region importance. When the region importance is calculated, the search keywords contained in adjacent text regions may also be taken into account.

(Functional configuration example of the information processing apparatus 100)
Next, a functional configuration example of the information processing apparatus 100 is described with reference to FIG. 7.

FIG. 7 is a block diagram showing the functional configuration of the information processing apparatus 100. The information processing apparatus 100 includes an acquisition unit 701, a first calculation unit 702, a second calculation unit 703, a determination unit 704, an embedding unit 705, and an output unit 706. These functions, which serve as the control unit (acquisition unit 701 through output unit 706), are realized, for example, by causing the CPU 301 to execute a program stored in a storage device such as the ROM 302, the RAM 303, the magnetic disk 305, or the optical disk 307 shown in FIG. 3, or by the I/F 309.

取得部７０１は、閲覧対象ページについてのアクセス元で閲覧対象ページに遷移する際に使われた検索キーワードおよびアクセス元で閲覧対象ページを閲覧していた時間を閲覧対象ページへのアクセスごとに取得する機能を有する。ここで、閲覧対象ページとは、上述したウェブページＷＰである。アクセス元とは、上述した閲覧端末２１０である。検索キーワードとは、閲覧対象ページにたどり着くまでに閲覧端末２１０で入力された検索キーワードであり、例えば、閲覧端末２１０で入力されて検索サーバ２２０に送信された検索キーワードである。閲覧対象ページを閲覧していた時間とは、閲覧端末２１０が閲覧対象ページを表示していた時間であり、上述した滞在時間である。 The acquisition unit 701 acquires, for each access to the browsing target page, the search keyword used when the browsing source page is changed to the browsing target page and the time during which the browsing target page was browsed at the access source. It has a function. Here, the browsing target page is the web page WP described above. The access source is the browsing terminal 210 described above. The search keyword is a search keyword input at the browsing terminal 210 before reaching the browsing target page. For example, the search keyword is a search keyword input at the browsing terminal 210 and transmitted to the search server 220. The time when the browsing target page is browsed is the time when the browsing terminal 210 is displaying the browsing target page, and is the stay time described above.

Specifically, for example, the acquisition unit 701 refers to the access log DB 201 to acquire the search keyword entered at the browsing terminal 210 that accessed the web page WP and the time during which the browsing terminal 210 displayed the web page WP. The information processing apparatus 100 can thereby acquire the search keyword used for the access and the stay time, both of which indicate how strongly the viewer S is interested in the web page WP.

The acquisition unit 701 also has a function of acquiring, for each access to the browsing target page, the search keyword and the browsing time only for accesses in which the number of pages traversed at the access source before reaching the browsing target page is equal to or less than a specified number. Specifically, for example, the acquisition unit 701 uses the search keyword and the stay time for calculating the importance only when the number of pages from the search site to the web page WP is equal to or less than the specified number.
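The page-count condition can be illustrated with a minimal Python sketch. The function names, the list-of-URLs representation of a route, and the choice to count the target page itself are assumptions made for illustration; the description does not fix how the pages are counted.

```python
def pages_to_reach(path, target):
    """Count pages visited after the search site, up to and including the target.

    `path` is a hypothetical list of page URLs in visiting order.
    Returns None when the target does not appear on the route.
    """
    return path.index(target) + 1 if target in path else None


def usable_for_importance(path, target, max_pages):
    """True when the target was reached within the specified number of pages."""
    hops = pages_to_reach(path, target)
    return hops is not None and hops <= max_pages
```

With a limit of 1, only accesses that land on the target page directly from the search site would contribute to the importance.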

Because only search keywords and stay times closely related to the web page WP are then used for calculating the importance, the importance can be calculated accurately. The acquired data is stored in a storage area such as the RAM 303, the magnetic disk 305, or the optical disk 307.

The first calculation unit 702 has a function of calculating, for each search keyword acquired by the acquisition unit 701, the importance of that search keyword in the browsing target page, based on the time during which the access sources that reached the browsing target page with that keyword browsed the page. Specifically, for example, the first calculation unit 702 calculates the sum of the stay times on the web page WP for each search keyword as the importance of that search keyword. The information processing apparatus 100 can thereby calculate an importance that indicates how strongly the viewers S of the web page WP are interested in each search keyword.

The first calculation unit 702 also has a function of calculating the importance of each search keyword in the browsing target page based only on those browsing times that are equal to or less than a threshold, among the browsing times of the access sources that reached the browsing target page with the acquired search keyword. Specifically, for example, when the browsing time of a single access is equal to or greater than the threshold, the first calculation unit 702 does not use that browsing time for calculating the importance.

As a result, for example, the information processing apparatus 100 does not use, for calculating the importance, browsing time accrued while the web page WP is displayed on the browsing terminal 210 but the viewer S is not actually viewing it (for example, while the viewer S is away from the desk or having a meal). The information processing apparatus 100 can therefore calculate the importance accurately.

The first calculation unit 702 also has a function of calculating the importance of each search keyword in the browsing target page based only on those browsing times that are equal to or greater than a threshold, among the browsing times of the access sources that reached the browsing target page with the acquired search keyword. Specifically, for example, when the browsing time of a single access is equal to or less than the threshold, the first calculation unit 702 does not use that browsing time for calculating the importance.

As a result, for example, the information processing apparatus 100 does not use, for calculating the importance, browsing time accrued when the web page WP was displayed on the browsing terminal 210 but the viewer S had no interest in it (for example, when the viewer S merely skimmed the page). The importance can therefore be calculated accurately. The calculation result is stored in the search keyword DB 202.
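The two threshold rules can be combined into one filter, sketched below in Python. The function name, the parameter names `min_seconds` and `max_seconds`, and the `(keyword, seconds)` tuple format are hypothetical; the description does not give concrete threshold values.

```python
from collections import defaultdict


def keyword_importance(visits, min_seconds=None, max_seconds=None):
    """Sum stay time per search keyword, skipping visits outside the thresholds.

    `visits` is a hypothetical list of (keyword, stay_seconds) pairs,
    one pair per access to the browsing target page.
    """
    totals = defaultdict(int)
    for keyword, seconds in visits:
        if max_seconds is not None and seconds >= max_seconds:
            continue  # page left open too long: viewer probably not reading
        if min_seconds is not None and seconds <= min_seconds:
            continue  # page closed too quickly: viewer probably skimmed
        totals[keyword] += seconds
    return dict(totals)
```

Leaving both thresholds as `None` reduces the function to the plain sum of stay times per keyword.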

The second calculation unit 703 has a function of calculating, for each text region of the browsing target page, the importance of that text region in the browsing target page, based on the importance of each search keyword calculated by the first calculation unit 702 and the number of times each search keyword appears in the text region. Here, the importance of a text region in the browsing target page is the region importance described above.

Specifically, for example, the second calculation unit 703 calculates the sum of the importance values of the search keywords contained in a text region as its region importance. The second calculation unit 703 may additionally take into account the sum of the importance values of the search keywords contained in adjacent text regions when calculating the region importance. The calculation result is stored in the region importance DB 203. The information processing apparatus 100 can thereby calculate a region importance that indicates how strongly the viewers S are interested in each text region in the web page WP.

The determination unit 704 has a function of determining a specific text region to serve as the summary information of the browsing target page, based on the importance of each text region calculated by the second calculation unit 703. Specifically, for example, the determination unit 704 determines the text region with the highest region importance in the web page WP as the text region that serves as the summary information of the web page WP.
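Selecting the region with the highest region importance is a simple maximum search. The following Python sketch assumes the regions and their importance values are held in two parallel lists; the function name and the tie-breaking rule (first region wins) are illustrative choices, not taken from the description.

```python
def pick_summary(regions, importances):
    """Return the text region whose region importance is highest.

    `regions` and `importances` are parallel lists; on a tie the
    earliest region is returned. Raises ValueError if the lists are empty.
    """
    best = max(range(len(regions)), key=importances.__getitem__)
    return regions[best]
```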

As a result, the website creator is spared the effort of investigating which text regions interest the viewers S in order to decide on summary information. Even when the needs of the viewers S change and the text regions that interest them change accordingly, the creator can easily determine a text region that serves as summary information matching the new needs. The determination result is stored in a storage area such as the RAM 303, the magnetic disk 305, or the optical disk 307.

The embedding unit 705 has a function of embedding the data in the specific text region determined by the determination unit 704 into the link source page LP of the browsing target page, in a form that can be called from the link source page LP. Specifically, for example, the embedding unit 705 uses JavaScript to embed the data in the HTML document of the link source page LP so that the summary information is displayed in a pop-up PU when the mouse hovers over the link L to the web page WP.

As a result, the viewer S of the website can learn the summary information of the linked web page WP before accessing it. This makes it easier for the viewer S to decide which information to read, which improves the convenience of the website.

The embedding unit 705 also has a function of embedding the data in the specific text region determined by the determination unit 704 into an area above that text region in the browsing target page. Specifically, for example, the embedding unit 705 embeds the summary information of the web page WP in an area of the web page WP above the text region that serves as the summary information.

As a result, the viewer S of the web page WP can learn the summary information of the web page WP without reading the whole page. The summary information may also be embedded in the HTML document of the web page WP using a meta description tag so that it is displayed in the snippet on a search site.

The output unit 706 has a function of outputting the specific text region determined by the determination unit 704 as the summary information of the browsing target page. Specifically, for example, the output unit 706 transmits the summary information to the browsing terminal 210. Output forms include, for example, display on the display 308, printing on the printer 313, and transmission to an external device via the I/F 309. The output may also be stored in a storage area such as the RAM 303, the magnetic disk 305, or the optical disk 307.

As a result, the information processing apparatus 100 can output the summary information upon receiving a request from the browsing terminal 210. The information processing apparatus 100 can also notify its user of the browsing information.

The acquisition unit 701 through the output unit 706 may also treat a set of web pages WP as the browsing target page. Specifically, for example, the acquisition unit 701 through the determination unit 704 treat the set of web pages WP as a single web page WP, identify within the set the data of the text region in which many viewers S are interested, and determine that data as the summary information of the set. As a result, when an article spans a plurality of web pages WP, summary information can be determined for the plurality of web pages WP as a whole (that is, for the entire article).

Specifically, for example, the embedding unit 705 embeds the summary information in the link source page LP that links to any web page WP in the set so that the summary information is displayed in the pop-up PU when the mouse hovers over the link L to that web page WP. As a result, the viewer S of the website can learn the summary information of the linked set of web pages WP before accessing the set.

Specifically, for example, the embedding unit 705 also embeds the summary information in the top-level web page WP of the set of web pages WP. As a result, the viewer S of the set of web pages WP can learn the summary information of the set without reading the entire set.

Specifically, for example, the output unit 706 also outputs the summary information of the set of web pages WP. As a result, the information processing apparatus 100 can output the summary information of the set of web pages WP upon receiving a request for summary information from the browsing terminal 210. The information processing apparatus 100 can also notify its user of the browsing information for the set of web pages WP.

(Specific example of determination of summary information by the information processing apparatus 100)
Next, a specific example of the determination of summary information by the information processing apparatus 100 will be described with reference to FIGS. 8 to 11.

(Specific example of acquisition of search keywords and stay times by the information processing apparatus 100)
First, a specific example of the acquisition of search keywords and stay times by the information processing apparatus 100 will be described with reference to FIG. 8.

FIG. 8 is an explanatory diagram illustrating a specific example of the acquisition of search keywords and stay times by the information processing apparatus 100. Here, the information processing apparatus 100 refers to the access log DB 201 to acquire the search keywords and the stay times.

(1) Specifically, the information processing apparatus 100 first refers to a plurality of records in the access log DB 201 whose host items have the same content, and acquires the route of a single access. Here, the information processing apparatus 100 acquires the route by which the browsing terminal 210 identified by "aa.bb.ne.jp" accessed the web page WP "index.html" from the search site "http://xxx.co.jp/search="interference"". It also acquires the route by which the browsing terminal 210 identified by "aa.bb.ne.jp" accessed the web page WP "mokuji.html" from the web page WP "index.html".

(2) The information processing apparatus 100 then acquires the stay times based on the times at which each web page WP on the acquired route was accessed. For example, the stay time on the web page WP "index.html" is the time at which the web page WP "mokuji.html" was accessed minus the time at which the web page WP "index.html" was accessed. The stay time on the last accessed web page WP, "mokuji.html", is, for example, the time at which the web page WP was closed on the browsing terminal 210 minus the time at which the web page WP "mokuji.html" was accessed.
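The subtraction rule in step (2) can be sketched in Python as follows. The function name, the `(url, access_time)` pair format, and the use of plain numbers of seconds for times are assumptions for illustration; how the closing time of the last page is obtained is outside this sketch.

```python
def dwell_times(path, closed_at):
    """Compute the stay time on each page of one access route.

    `path` is a hypothetical list of (url, access_time) pairs in visiting
    order, and `closed_at` is the time the last page was closed.
    """
    result = []
    for i, (url, accessed) in enumerate(path):
        # The stay ends when the next page is accessed, or when the
        # last page is closed.
        left = path[i + 1][1] if i + 1 < len(path) else closed_at
        result.append((url, left - accessed))
    return result
```

For a route that reaches "index.html" at second 0 and "mokuji.html" at second 90, with the page closed at second 150, this yields stay times of 90 and 60 seconds.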

(3) The address of the search site contains the search keyword used for the search, as in "http://xxx.co.jp/search="interference"". For simplicity, the URL is shown here as containing the search keyword verbatim, but in practice the URL contains a code representing the search keyword. By referring to the search keyword contained in the URL of the search site, the information processing apparatus 100 acquires the search keyword used at the browsing terminal 210 for each web page WP on the acquired route.
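As a hedged illustration of step (3), the sketch below decodes keywords from a search-site URL with Python's standard `urllib.parse`. The query parameter name `q` and the URL layout are assumptions; real search sites differ, and the simplified URL form used in the description is not a standard query string.

```python
from urllib.parse import parse_qs, urlsplit


def search_keywords(referrer_url, param="q"):
    """Extract search keywords from the query string of a search-site URL.

    `param` is the (assumed) name of the query parameter that carries the
    keywords. parse_qs decodes percent-encoding and turns '+' into spaces.
    """
    query = urlsplit(referrer_url).query
    values = parse_qs(query).get(param, [])
    return [keyword for value in values for keyword in value.split()]
```

A percent-encoded keyword such as `%E5%B9%B2%E6%B8%89` would be decoded back to its original characters by `parse_qs` before being returned.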

When a single access involves two search keywords, "interference" and "simulation", the information processing apparatus 100 may record the full stay time for each keyword, or may divide the stay time between the keywords. When the search keyword is the compound word "interference simulation", the information processing apparatus 100 may acquire "interference simulation" as a single search keyword, or may acquire "interference" and "simulation" as two separate search keywords.
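The two ways of attributing one stay time to several keywords can be sketched as below. The function name and the `split` flag are hypothetical, and an even split is assumed because the description does not say how the time is divided.

```python
def credit_keywords(keywords, seconds, split=False):
    """Attribute one stay time to the keywords of a single access.

    With split=False every keyword receives the full stay time; with
    split=True the stay time is divided evenly (an assumed policy).
    """
    if not keywords:
        return {}
    share = seconds / len(keywords) if split else seconds
    return {keyword: share for keyword in keywords}
```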

(Specific example of calculation of importance for each search keyword by the information processing apparatus 100)
Next, a specific example of the calculation of the importance for each search keyword by the information processing apparatus 100 will be described with reference to FIG. 9.

FIG. 9 is an explanatory diagram illustrating a specific example of the calculation of the importance for each search keyword by the information processing apparatus 100. Here, the information processing apparatus 100 calculates, for each web page WP in the website, the importance of each search keyword acquired as in FIG. 8. For simplicity, the following describes the case in which one web page WP in the website is taken as the target page and the importance of each search keyword on the target page is calculated.

Part (a) of FIG. 9 shows the access routes from the browsing terminals 210, the search keywords used by the browsing terminals 210, and the stay times of the browsing terminals 210 on each web page WP, all acquired by the information processing apparatus 100 in the same manner as in FIG. 8.

As shown in part (a) of FIG. 9, on route 1 the target page was displayed for 90 seconds on a browsing terminal 210 that reached it with the search keyword "interference". On route 2, the target page was displayed for 60 seconds on a browsing terminal 210 that reached it with the search keyword "simulation". On route 3, it was displayed for 90 seconds on a browsing terminal 210 that reached it with the search keyword "virtual". On route 4, it was displayed for 60 seconds on a browsing terminal 210 that reached it with the search keyword "interference". On route 5, it was displayed for 40 seconds on a browsing terminal 210 that reached it with the search keyword "interference".

As shown in part (b) of FIG. 9, the information processing apparatus 100 calculates the importance of each search keyword. For example, the sum of the stay times on the web page WP can be used as the importance of a search keyword. In this case, the importance of the search keyword "interference" is 190 (90 + 60 + 40), the importance of the search keyword "simulation" is 60, and the importance of the search keyword "virtual" is 90.
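The figures for routes 1 to 5 can be reproduced with a few lines of Python. The variable names and the `(keyword, seconds)` tuple layout are illustrative only.

```python
from collections import Counter

# (search keyword, stay time in seconds) for routes 1 to 5 of FIG. 9(a)
routes = [
    ("interference", 90),
    ("simulation", 60),
    ("virtual", 90),
    ("interference", 60),
    ("interference", 40),
]

# Importance of a keyword = sum of the stay times of the routes that used it
importance = Counter()
for keyword, seconds in routes:
    importance[keyword] += seconds

print(dict(importance))
# {'interference': 190, 'simulation': 60, 'virtual': 90}
```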

For simplicity, the description above takes one web page WP in the website as the target page, but the same processing may be performed with each of the web pages WP as the target page. The calculated importance is stored in the search keyword DB 202.

(Specific example of calculation of region importance for each text region by the information processing apparatus 100)
Next, a specific example of the calculation of the region importance for each text region by the information processing apparatus 100 will be described with reference to FIG. 10.

FIG. 10 is an explanatory diagram illustrating a specific example of the calculation of the region importance for each text region by the information processing apparatus 100. Here, the information processing apparatus 100 calculates the region importance for each text region of the web page WP based on the search keyword importance calculated in FIG. 9. For simplicity, the following describes the case in which one web page WP in the website is taken as the target page and the region importance of each text region on the target page is calculated.

(1) First, the information processing apparatus 100 identifies the text regions in the target page and acquires the data of each text region. Specifically, the information processing apparatus 100 identifies the text regions F1 to F12, one per paragraph, from the line break codes in the HTML document, and acquires the data of each of the text regions F1 to F12.
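One possible way to obtain paragraph-level text regions is sketched below with Python's standard `html.parser`. Treating every non-empty line of the extracted text as one region is a simplifying assumption, and the class and function names are hypothetical; a production version would also need to skip script and style content.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the character data of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_regions(html_doc):
    """Split the text of an HTML document into regions at line breaks."""
    collector = _TextCollector()
    collector.feed(html_doc)
    collector.close()
    text = "".join(collector.chunks)
    # One region per non-empty line of the extracted text
    return [line.strip() for line in text.splitlines() if line.strip()]
```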

(2) The information processing apparatus 100 then calculates the region importance for each text region based on the number of times each search keyword appears in the text regions F1 to F12 and the importance of each search keyword. For example, for each search keyword contained in a text region, the information processing apparatus 100 calculates the product of its number of appearances and its importance, and uses the sum of these products as the region importance. In this case, for example, the region importance of the text region F12 is 340.
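The sum of products in step (2) can be sketched in Python as follows. The function name is hypothetical, and plain substring counting with `str.count` stands in for whatever keyword matching the apparatus actually uses.

```python
def region_importance(text, keyword_importance):
    """Sum of (number of appearances x importance) over all search keywords.

    `keyword_importance` maps each search keyword to its importance.
    str.count counts non-overlapping substring matches, which is a
    simplification of real keyword matching.
    """
    return sum(
        text.count(keyword) * weight
        for keyword, weight in keyword_importance.items()
    )
```

With the keyword importance values 190, 60, and 90 from FIG. 9, a region that contains "interference", "simulation", and "virtual" once each would score 190 + 60 + 90 = 340, which matches the value given for F12; the actual contents of F12 are not stated in this passage.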

Alternatively, for example, the information processing apparatus 100 calculates, for each search keyword contained in each of the text regions F1 to F12, the product of its number of appearances and its importance, and sums these products for each text region. The information processing apparatus 100 then uses, as the region importance of each text region, the value calculated for that text region plus a fraction (for example, 80%) of the values calculated for the adjacent text regions. In this case, for example, the region importance of the region F11 is the sum of the value 0 calculated for the region F11, 72 (80% of the value 90 calculated for the region F10), and 272 (80% of the value 340 calculated for the region F12), which is 344.
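The neighbor-weighted variant can be sketched as below. The function name and the parameter `neighbor_ratio` are hypothetical, and the list is assumed to hold the per-region values in document order so that adjacent list entries are adjacent text regions.

```python
def smoothed_importance(raw, neighbor_ratio=0.8):
    """Add a fraction of each adjacent region's value to a region's own value.

    `raw` holds the per-region sums of (appearances x importance) in
    document order; the first and last regions have only one neighbor.
    """
    result = []
    for i, own in enumerate(raw):
        neighbors = 0
        if i > 0:
            neighbors += raw[i - 1]
        if i + 1 < len(raw):
            neighbors += raw[i + 1]
        result.append(own + neighbor_ratio * neighbors)
    return result
```

Calling `smoothed_importance([90, 0, 340])` applies the rule to three consecutive regions with the values given for F10, F11, and F12.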

As a result, the information processing apparatus 100 can identify, based on the calculated importance, the text region of the target page in which the viewers S are strongly interested, and can determine the data of the identified text region as the summary information.

For simplicity, the description above takes one web page WP in the website as the target page, but the same processing may be performed with each of the web pages WP as the target page. The calculated region importance is stored in the region importance DB 203.

(Specific example of provision of summary information by the information processing apparatus 100)
Next, a specific example of the provision of summary information by the information processing apparatus 100 will be described with reference to FIG. 11.

FIG. 11 is an explanatory diagram illustrating a specific example of the provision of summary information by the information processing apparatus 100. The information processing apparatus 100 provides the summary information determined in FIG. 10 to the viewer S of the website.

As shown in FIG. 11, for example, the information processing apparatus 100 displays, on the link source page LP of the web page WP, the summary information (including the data surrounding the summary information) in which many viewers S who reached the web page WP were interested. Specifically, the information processing apparatus 100 uses JavaScript to embed the summary information in the HTML document in advance so that it is displayed as a pop-up PU when the mouse pointer P is placed over the link L to the web page WP on the browsing terminal 210.
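As a rough illustration of embedding summary text into a link, the Python sketch below generates an anchor element whose `title` attribute carries the summary. Note that this substitutes the browser's native tooltip for the JavaScript pop-up described in the text; the function name is hypothetical, and only the escaping step carries over directly to a JavaScript-based version.

```python
from html import escape


def link_with_summary(href, label, summary):
    """Return an anchor whose title attribute carries the summary text.

    All three values are HTML-escaped so that quotes or angle brackets
    in the summary cannot break the surrounding markup.
    """
    return '<a href="{}" title="{}">{}</a>'.format(
        escape(href, quote=True),
        escape(summary, quote=True),
        escape(label),
    )
```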

As a result, a viewer S who is about to view the linked web page WP can judge its content based on the summary information in which many viewers S were interested. In addition, because the summary information in which many viewers S were interested is determined automatically, the creator of the web page WP is spared the effort of predicting the interests of the viewers S and setting the summary information manually.

The information processing apparatus 100 may also embed the summary information of the web page WP in the web page WP itself. As a result, a viewer S who accesses the web page WP can grasp its outline without viewing the whole page. The information processing apparatus 100 may also embed the summary information in the HTML document using a meta description tag so that it is displayed in the snippet on a search site.

(Processing content of the search keyword extraction process)
Next, the details of the search keyword extraction process will be described with reference to FIG. 12. The search keyword extraction process is the process performed by the information processing apparatus 100 as illustrated in FIGS. 8 and 9.

FIG. 12 is a flowchart illustrating the details of the search keyword extraction process. First, the CPU 301 extracts the access routes from the access log DB 201 (step S1201). Next, the CPU 301 selects an unprocessed web page WP as the target page (step S1202). The CPU 301 then selects the access routes that pass through the target page (step S1203).

Next, the CPU 301 calculates the importance of each search keyword on the target page based on the selected routes (step S1204). The CPU 301 then determines whether any unprocessed web page WP remains (step S1205).

If an unprocessed web page WP remains (step S1205: Yes), the CPU 301 returns to step S1202. If no unprocessed web page WP remains (step S1205: No), the CPU 301 stores the processing result in the search keyword DB 202 (step S1206) and ends the search keyword extraction process.
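The loop of steps S1202 to S1206 can be sketched in Python as follows. The function name, the dictionary layout of a route, and the use of a returned dictionary in place of the search keyword DB 202 are all assumptions for illustration; step S1201 (extracting the routes from the access log) is taken as already done.

```python
def extract_keyword_importance(pages, routes):
    """Per-page keyword importance, following the loop of FIG. 12.

    `routes` is a hypothetical list of dicts, each with a 'keyword' and
    a 'visits' list of (page, stay_seconds) pairs.
    """
    keyword_db = {}
    for page in pages:                      # S1202 / S1205: next unprocessed page
        totals = {}
        for route in routes:                # S1203: routes passing through the page
            for visited, seconds in route["visits"]:
                if visited == page:         # S1204: accumulate stay time per keyword
                    keyword = route["keyword"]
                    totals[keyword] = totals.get(keyword, 0) + seconds
        keyword_db[page] = totals
    return keyword_db                       # S1206: result to be stored
```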

As a result, the information processing apparatus 100 can calculate the importance of each search keyword for each web page WP in the website. In the search keyword extraction process, a set of web pages WP may also be treated as a single web page WP, and the importance of each search keyword may be calculated for the set as a whole.

(Processing content of the region importance calculation process)
Next, the details of the region importance calculation process will be described with reference to FIG. 13. The region importance calculation process is the process performed by the information processing apparatus 100 as illustrated in FIG. 10.

FIG. 13 is a flowchart showing the details of the area importance calculation process. First, the CPU 301 selects an unprocessed web page WP as the target page (step S1301). The CPU 301 then identifies all text areas contained in the target page (step S1302).

Next, the CPU 301 refers to the search keyword DB 202 and the data of the identified text areas, and calculates an area importance for each identified text area (step S1303). The CPU 301 then determines whether any unprocessed web page WP remains (step S1304).

If an unprocessed web page WP remains (step S1304: Yes), the CPU 301 returns to step S1301. If none remains (step S1304: No), the CPU 301 stores the processing results in the area importance DB 203 (step S1305) and ends the area importance calculation process.

In this way, the information processing apparatus 100 can calculate an importance for each text area of each web page WP in the website. The area importance calculation process may also treat a set of web pages WP as a single web page WP and calculate the area importance of each text area over the entire set.
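As an illustration of step S1303, the following sketch scores each text area from keyword importances and keyword occurrence counts. The weighted sum used here is an assumption chosen for simplicity (the text only says that both quantities are referred to), and the names `area_importance` and `pick_summary_area` are hypothetical.

```python
def area_importance(keyword_scores, text_areas):
    """Score each text area of one page.

    keyword_scores: {keyword: importance} for the page
    text_areas:     {area_id: text of that area}
    Assumption: score = sum(importance * occurrence count) over keywords.
    """
    scores = {}
    for area_id, text in text_areas.items():
        # str.count gives the number of non-overlapping occurrences.
        scores[area_id] = sum(
            weight * text.count(keyword)
            for keyword, weight in keyword_scores.items()
        )
    return scores


def pick_summary_area(keyword_scores, text_areas):
    """Return the id of the highest-scoring text area."""
    scores = area_importance(keyword_scores, text_areas)
    return max(scores, key=scores.get)
```

Under these assumptions, the highest-scoring area would be the one whose data becomes the summary information. Real pages would likely need tokenization and case handling, which this sketch omits.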

As described above, the information processing apparatus 100 uses the search keywords by which viewers reached a web page WP and the time they stayed on the web page WP to identify the text area within the web page WP that many viewers S are interested in. The information processing apparatus 100 then selects the data of that text area as the summary information of the web page WP.

This spares the website creator the effort of investigating which text areas interest viewers S in order to decide on summary information. Moreover, even when the needs of the website's viewers S change and the text area they are interested in changes with them, the creator can easily determine a text area that serves as summary information matching the changed needs.

The information processing apparatus 100 also embeds the determined summary information in the link source page LP of the web page WP. A viewer S of the website can therefore see the summary information of the linked web page WP before accessing it. This makes information gathering easier for viewers S and improves the convenience of the website.
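One simple way to make a summary available from the link source page is to attach it to the link itself. The sketch below uses the HTML `title` attribute as a tooltip; this mechanism and the function name `link_with_summary` are assumptions for illustration, and the embodiment may use a different pop-up mechanism.

```python
from html import escape


def link_with_summary(href, label, summary):
    """Build an anchor tag whose tooltip shows the linked page's summary."""
    return (
        f'<a href="{escape(href, quote=True)}" '
        f'title="{escape(summary, quote=True)}">{escape(label)}</a>'
    )
```

Escaping the summary matters because it is text extracted from another page and may contain quotes or angle brackets.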

The information processing apparatus 100 also embeds the summary information of a web page WP in that web page WP itself. A viewer S of the web page WP can then learn its summary information without reading the entire page, which makes information gathering more efficient. In addition, by embedding the summary information in the HTML document of the web page WP using a meta description tag, the summary information can be made to appear in a search site's snippet, making information gathering by viewers S on the search site more efficient.
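A meta description tag carrying the summary could be generated as in the following sketch. The function name `meta_description_tag` is hypothetical, and whether a search site actually shows this text in its snippet is decided by the search site.

```python
from html import escape


def meta_description_tag(summary):
    """Return a meta description tag for the HTML head of the page."""
    # quote=True also escapes double and single quotes inside the attribute.
    return f'<meta name="description" content="{escape(summary, quote=True)}">'
```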

Furthermore, when the browsing time of a single access is equal to or greater than a threshold, the information processing apparatus 100 does not use that browsing time in calculating importance. As a result, browsing time from situations in which the web page WP is displayed on the browsing terminal 210 but the viewer S is not actually viewing it (for example, the viewer S is away from the desk or having a meal) is excluded from the calculation. The information processing apparatus 100 can therefore calculate importance accurately.

Likewise, when the browsing time of a single access is equal to or less than a threshold, the information processing apparatus 100 does not use that browsing time in calculating importance. As a result, browsing time from situations in which the web page WP was displayed on the browsing terminal 210 but the viewer S had no interest in it (for example, the viewer S merely skimmed the page) is excluded from the calculation. Importance can therefore be calculated accurately.
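The two exclusions above amount to keeping only browsing times strictly between a lower and an upper threshold. The sketch below shows this; the function name and the default values of 5 seconds and 1800 seconds are assumptions for illustration, as the text gives no concrete threshold values.

```python
def usable_dwell_times(dwell_times, lower_sec=5.0, upper_sec=1800.0):
    """Drop browsing times at or below lower_sec or at or above upper_sec."""
    return [t for t in dwell_times if lower_sec < t < upper_sec]
```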

The information processing apparatus 100 also uses, in calculating importance, the search keywords and stay times of accesses in which the number of pages traversed from the search site to the web page WP is equal to or less than a specified number. Because search keywords and stay times closely related to the web page WP are used, importance can be calculated accurately.
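The page-count condition can be sketched as a filter on access routes. Here a route is assumed to be a list of page identifiers whose first element is the search site, and the page count is assumed to be the target page's position in that list; both conventions, the function name, and the default limit of 2 are hypothetical.

```python
def within_page_limit(route, target, max_pages=2):
    """True if target is reached within max_pages pages of the search site.

    route[0] is assumed to be the search site, so the index of the target
    equals the number of pages visited up to and including the target.
    """
    if target not in route:
        return False
    return route.index(target) <= max_pages
```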

Furthermore, the information processing apparatus 100 treats a set of web pages WP as a single web page WP and, from within that set, selects the data of the text area that many viewers S are interested in as the summary information of the set. Thus, when an article spans multiple web pages WP, summary information can be determined for those web pages WP as a whole.
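Treating a set of pages as one page can be illustrated by merging per-page keyword scores before scoring text areas. The sketch assumes per-page scores are additive (for example, summed browsing times), which the text does not state; the function name `group_keyword_importance` is hypothetical.

```python
from collections import Counter


def group_keyword_importance(per_page, group):
    """Merge {page: {keyword: importance}} over the pages in group."""
    total = Counter()
    for page in group:
        # Counter.update adds values for keywords that are already present.
        total.update(per_page.get(page, {}))
    return dict(total)
```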

The information processing method described in this embodiment can be implemented by executing a prepared program on a computer such as a personal computer or a workstation. The information processing program is recorded on a computer-readable recording medium such as a hard disk, flexible disk, CD-ROM, MO, or DVD, and is executed by being read from the recording medium by the computer. The information processing program may also be distributed via a network such as the Internet.

The following supplementary notes are further disclosed with respect to the embodiment described above.

(Supplementary Note 1) An information processing apparatus comprising:
acquisition means for acquiring, for each access to a browsing target page, a search keyword used at the access source when transitioning to the browsing target page and the time for which the browsing target page was browsed at the access source;
first calculation means for calculating, for each search keyword, the importance of the search keyword in the browsing target page, based on the time for which the browsing target page reached via the search keyword acquired by the acquisition means was browsed at the access source;
second calculation means for calculating, for each text area, the importance of the text area in the browsing target page, based on the importance of each search keyword in the browsing target page calculated by the first calculation means and the number of occurrences of each search keyword in each text area of the browsing target page; and
determination means for determining, based on the importance of each text area in the browsing target page calculated by the second calculation means, a specific text area to serve as summary information of the browsing target page.

(Supplementary Note 2) The information processing apparatus according to Supplementary Note 1, further comprising embedding means for embedding the data in the specific text area determined by the determination means in a link source page of the browsing target page, in a form that can be called from the link source page.

(Supplementary Note 3) The information processing apparatus according to Supplementary Note 1, further comprising embedding means for embedding the data in the specific text area determined by the determination means in an area of the browsing target page above the text area.

(Supplementary Note 4) The information processing apparatus according to any one of Supplementary Notes 1 to 3, wherein the first calculation means calculates, for each search keyword, the importance of the search keyword in the browsing target page based on, among the times for which the browsing target page reached via the search keyword acquired by the acquisition means was browsed at the access source, those times that are equal to or less than a threshold.

(Supplementary Note 5) The information processing apparatus according to any one of Supplementary Notes 1 to 3, wherein the first calculation means calculates, for each search keyword, the importance of the search keyword in the browsing target page based on, among the times for which the browsing target page reached via the search keyword acquired by the acquisition means was browsed at the access source, those times that are equal to or greater than a threshold.

(Supplementary Note 6) The information processing apparatus according to any one of Supplementary Notes 1 to 3, wherein the first calculation means calculates, for each search keyword, the importance of the search keyword in the browsing target page based on, among the times for which the browsing target page reached via the search keyword acquired by the acquisition means was browsed at the access source, those times that are equal to or greater than a first threshold and equal to or less than a second threshold.

(Supplementary Note 7) The information processing apparatus according to any one of Supplementary Notes 1 to 6, wherein the acquisition means acquires, for each access to the browsing target page, a search keyword for which the number of pages traversed at the access source before reaching the browsing target page is equal to or less than a specified number, and the time for which the browsing target page was browsed at the access source.

(Supplementary Note 8) An information processing apparatus comprising:
acquisition means for acquiring, for each access to a browsing target page group, a search keyword used at the access source when transitioning to the browsing target page group and the time for which the browsing target page was browsed at the access source;
first calculation means for calculating, for each search keyword, the importance of the search keyword in the browsing target page group, based on the time for which the browsing target page was browsed at the access source for the browsing target page group reached via the search keyword acquired by the acquisition means;
second calculation means for calculating, for each text area, the importance of the text area in the browsing target page group, based on the importance of each search keyword in the browsing target page group calculated by the first calculation means and the number of occurrences of each search keyword in each text area of the browsing target page group; and
determination means for determining, based on the importance of each text area in the browsing target page group calculated by the second calculation means, a specific text area to serve as summary information of the browsing target page group.

(Supplementary Note 9) An information processing method in which a computer executes a process comprising:
acquiring, for each access to a browsing target page, a search keyword used at the access source when transitioning to the browsing target page and the time for which the browsing target page was browsed at the access source;
calculating, for each search keyword, the importance of the search keyword in the browsing target page, based on the time for which the browsing target page reached via the acquired search keyword was browsed at the access source;
calculating, for each text area, the importance of the text area in the browsing target page, based on the importance of each search keyword in the browsing target page and the number of occurrences of each search keyword in each text area of the browsing target page; and
determining, based on the importance of each text area in the browsing target page, a specific text area to serve as summary information of the browsing target page.

(Supplementary Note 10) An information processing method in which a computer executes a process comprising:
acquiring, for each access to a browsing target page group, a search keyword used at the access source when transitioning to the browsing target page group and the time for which the browsing target page was browsed at the access source;
calculating, for each search keyword, the importance of the search keyword in the browsing target page group, based on the time for which the browsing target page was browsed at the access source for the browsing target page group reached via the acquired search keyword;
calculating, for each text area, the importance of the text area in the browsing target page group, based on the importance of each search keyword in the browsing target page group and the number of occurrences of each search keyword in each text area of the browsing target page group; and
determining, based on the importance of each text area in the browsing target page group, a specific text area to serve as summary information of the browsing target page group.

(Supplementary Note 11) An information processing program that causes a computer to execute a process comprising:
acquiring, for each access to a browsing target page, a search keyword used at the access source when transitioning to the browsing target page and the time for which the browsing target page was browsed at the access source;
calculating, for each search keyword, the importance of the search keyword in the browsing target page, based on the time for which the browsing target page reached via the acquired search keyword was browsed at the access source;
calculating, for each text area, the importance of the text area in the browsing target page, based on the importance of each search keyword in the browsing target page and the number of occurrences of each search keyword in each text area of the browsing target page; and
determining, based on the importance of each text area in the browsing target page, a specific text area to serve as summary information of the browsing target page.

(Supplementary Note 12) An information processing program that causes a computer to execute a process comprising:
acquiring, for each access to a browsing target page group, a search keyword used at the access source when transitioning to the browsing target page group and the time for which the browsing target page was browsed at the access source;
calculating, for each search keyword, the importance of the search keyword in the browsing target page group, based on the time for which the browsing target page was browsed at the access source for the browsing target page group reached via the acquired search keyword;
calculating, for each text area, the importance of the text area in the browsing target page group, based on the importance of each search keyword in the browsing target page group and the number of occurrences of each search keyword in each text area of the browsing target page group; and
determining, based on the importance of each text area in the browsing target page group, a specific text area to serve as summary information of the browsing target page group.

Description of Symbols
100 Information processing apparatus
S Viewer
210 Browsing terminal
701 Acquisition unit
702 First calculation unit
703 Second calculation unit
704 Determination unit
705 Embedding unit

Claims

An information processing apparatus comprising:
acquisition means for acquiring, for each access to a browsing target page, a search keyword input at the access source when transitioning to the browsing target page and the time for which the browsing target page was browsed at the access source;
first calculation means for calculating, for each search keyword, the importance of the search keyword in the browsing target page, based on the time for which the browsing target page reached via the search keyword acquired by the acquisition means was browsed at the access source;
second calculation means for calculating, for each text area, the importance of the text area in the browsing target page, based on the importance of each search keyword in the browsing target page calculated by the first calculation means and the number of occurrences of each search keyword in each text area of the browsing target page; and
determination means for determining, based on the importance of each text area in the browsing target page calculated by the second calculation means, a specific text area to serve as summary information of the browsing target page.

The information processing apparatus according to claim 1, further comprising embedding means for embedding the data in the specific text area determined by the determination means within a link source page of the browsing target page, in a form that can be called from the link source page.

The information processing apparatus according to claim 1, further comprising embedding means for embedding the data in the specific text area determined by the determination means in an area of the browsing target page above the text area.

The information processing apparatus according to any one of claims 1 to 3, wherein the acquisition means acquires, for each access to the browsing target page, a search keyword for which the number of pages traversed at the access source before reaching the browsing target page is equal to or less than a specified number, and the time for which the browsing target page was browsed at the access source.

An information processing apparatus comprising:
acquisition means for acquiring, for each access to a browsing target page group, a search keyword input at the access source when transitioning to the browsing target page group and the time for which the browsing target page was browsed at the access source;
first calculation means for calculating, for each search keyword, the importance of the search keyword in the browsing target page group, based on the time for which the browsing target page was browsed at the access source for the browsing target page group reached via the search keyword acquired by the acquisition means;
second calculation means for calculating, for each text area, the importance of the text area in the browsing target page group, based on the importance of each search keyword in the browsing target page group calculated by the first calculation means and the number of occurrences of each search keyword in each text area of the browsing target page group; and
determination means for determining, based on the importance of each text area in the browsing target page group calculated by the second calculation means, a specific text area to serve as summary information of the browsing target page group.

An information processing method in which a computer executes a process comprising:
acquiring, for each access to a browsing target page, a search keyword input at the access source when transitioning to the browsing target page and the time for which the browsing target page was browsed at the access source;
calculating, for each search keyword, the importance of the search keyword in the browsing target page, based on the time for which the browsing target page reached via the acquired search keyword was browsed at the access source;
calculating, for each text area, the importance of the text area in the browsing target page, based on the importance of each search keyword in the browsing target page and the number of occurrences of each search keyword in each text area of the browsing target page; and
determining, based on the importance of each text area in the browsing target page, a specific text area to serve as summary information of the browsing target page.

An information processing program that causes a computer to execute a process comprising:
acquiring, for each access to a browsing target page, a search keyword input at the access source when transitioning to the browsing target page and the time for which the browsing target page was browsed at the access source;
calculating, for each search keyword, the importance of the search keyword in the browsing target page, based on the time for which the browsing target page reached via the acquired search keyword was browsed at the access source;
calculating, for each text area, the importance of the text area in the browsing target page, based on the importance of each search keyword in the browsing target page and the number of occurrences of each search keyword in each text area of the browsing target page; and
determining, based on the importance of each text area in the browsing target page, a specific text area to serve as summary information of the browsing target page.