JP5400813B2

JP5400813B2 - Address search device and address search method

Info

Publication number: JP5400813B2
Application number: JP2011011057A
Authority: JP
Inventors: 航小黒; 敏典高山
Original assignee: Zenrin Datacom Co Ltd
Current assignee: Zenrin Datacom Co Ltd
Priority date: 2011-01-21
Filing date: 2011-01-21
Publication date: 2014-01-29
Anticipated expiration: 2031-01-21
Also published as: JP2012155356A

Description

The present invention relates to an address search device and an address search method.

In recent years, search services that can retrieve a wide variety of information have come into wide use. Such a service searches a search database using a search character string entered by the user and outputs information related to that string. For example, an address search system is known that displays an address related to the search character string entered by the user, together with a map of the area around that address (Patent Document 1).

In an address search system, however, the search character string is not necessarily entered exactly as the address is written. The search is therefore generally executed after the search character string has been split into words that may form part of an address. Specifically, search words are generated from the search character string, and addresses with more matches against the search words are output higher in the search results. Suppose, for example, that “Tokyo Nakano” (東京中野) is entered as the search character string and the search words “Tokyo” (東京) and “Nakano” (中野) are generated. The address split at word level as “Tokyo / to / Nakano / ku / Nakano / ...” (東京／都／中野／区／中野／...) contains “Tokyo” once and “Nakano” twice, so its number of matches with the search words is 3. In contrast, the address “Shizuoka / ken / Hamamatsu / shi / Higashi / ku / Nakano / ...” (静岡／県／浜松／市／東／区／中野／...) contains “Nakano” only once, so its number of matches is 1. If only the number of matches is considered, “Tokyo-to Nakano-ku Nakano ...” is therefore output above “Shizuoka-ken Hamamatsu-shi Higashi-ku Nakano ...” for the search character string “Tokyo Nakano”.

Patent Document 1: JP 2003-186880 A

However, depending on how the search character string is split, the search results may differ from what the user intended. Suppose, for example, that “Higashinakano” (東中野) is entered as the search character string and the search words “Higashi” (東) and “Nakano” (中野) are generated. The address “Tokyo / to / Nakano / ku / Higashinakano / ...” (東京／都／中野／区／東中野／...) contains “Nakano” once, so its number of matches with the search words is 1. The address “Shizuoka / ken / Hamamatsu / shi / Higashi / ku / Nakano / ...” (静岡／県／浜松／市／東／区／中野／...) contains “Higashi” once and “Nakano” once, so its number of matches is 2. If only the number of matches is considered, “Shizuoka-ken Hamamatsu-shi Higashi-ku Nakano” is therefore output above “Tokyo-to Nakano-ku Higashinakano” for the search character string “Higashinakano”. In other words, an address containing the municipality-level word “Higashi” and the oaza (大字) “Nakano” is ranked above an address containing the single oaza “Higashinakano”, which is identical to the search character string.
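The two rankings described above can be reproduced with a few lines of Python. This is only an illustrative sketch of match counting as the background section describes it; the function name `match_count` and the list layout are assumptions made for the example, not part of the patent.

```python
def match_count(search_words, address_words):
    """Count how many word-level address tokens equal one of the search words."""
    wanted = set(search_words)
    return sum(1 for word in address_words if word in wanted)

# Word-level addresses taken from the examples in the text.
TOKYO_NAKANO = ["東京", "都", "中野", "区", "中野"]
TOKYO_HIGASHINAKANO = ["東京", "都", "中野", "区", "東中野"]
HAMAMATSU_NAKANO = ["静岡", "県", "浜松", "市", "東", "区", "中野"]

# Query "東京中野" split into "東京" / "中野": the intended address wins.
print(match_count(["東京", "中野"], TOKYO_NAKANO))       # 3
print(match_count(["東京", "中野"], HAMAMATSU_NAKANO))   # 1

# Query "東中野" split into "東" / "中野": the unintended address wins.
print(match_count(["東", "中野"], TOKYO_HIGASHINAKANO))  # 1
print(match_count(["東", "中野"], HAMAMATSU_NAKANO))     # 2
```

The second pair of counts is the failure the invention sets out to avoid: the split, not the scoring, is what pushes the Hamamatsu address ahead.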

The present invention has been made in view of these circumstances, and its object is to enable address search that takes the structure of addresses into account.

An address search device according to one aspect of the present invention comprises: a dictionary storage unit that stores words that can appear in an address in association with hierarchy information indicating the level of the address hierarchy; a cost storage unit that stores cost information indicating how likely address hierarchy levels are to follow one another; a search master storage unit that stores address data for searching; a search request receiving unit that receives a search request including a search character string for searching for an address; a search character string dividing unit that, from among the combinations of search words obtained by splitting the search character string with the words stored in the dictionary storage unit, outputs a combination of search words that is highly likely to be contiguous, based on the cost information; a search unit that searches the address data stored in the search master storage unit for address data containing the search words output from the search character string dividing unit; and a search result output unit that outputs the search results of the search unit according to their degree of match with the search words.

In the present invention, a “unit” does not simply mean a physical means; it also covers the case where the function of the “unit” is realized by software. The function of one “unit” or device may be realized by two or more physical means or devices, and the functions of two or more “units” or devices may be realized by one physical means or device.

According to the present invention, address search that takes the structure of addresses into account becomes possible.

FIG. 1 is a diagram showing the configuration of the address search device of the present embodiment.
FIG. 2 is a diagram showing an example of address data.
FIG. 3 is a diagram showing an example of dictionary data.
FIG. 4 is a diagram showing an example of the search master.
FIG. 5 is a diagram showing an example of cost information.
FIG. 6 is a diagram showing an example of cost calculation.
FIG. 7 is a flowchart showing an example of dictionary data generation processing.
FIG. 8 is a flowchart showing an example of search master generation processing.
FIG. 9 is a flowchart showing an example of search processing.
FIG. 10 is a diagram showing an example of cost calculation.
FIG. 11 is a diagram showing an example of cost calculation.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

== Configuration ==
FIG. 1 is a diagram showing the configuration of the address search device of the present embodiment. The address search device 10 is an information processing device that outputs addresses highly relevant to a search character string entered by a user. The search character string may be sent from a user terminal to the address search device 10 over a network, or may be entered on the address search device 10 itself. That is, the address search device 10 may be a server that performs searches based on search character strings entered on user terminals, or a device that operates stand-alone, such as a car navigation device.

As shown in FIG. 1, the address search device 10 includes an address data storage unit 20, a dictionary generation unit 22, a dictionary storage unit 24, a master generation unit 26, a search master storage unit 28, a cost information storage unit 30, a search request receiving unit 32, a search character string dividing unit 34, a search unit 36, a search result output unit 38, and a region information storage unit 40. The address data storage unit 20, the dictionary storage unit 24, the search master storage unit 28, the cost information storage unit 30, and the region information storage unit 40 can be realized in the address search device 10 using storage areas such as memory or a storage device. The dictionary generation unit 22, the master generation unit 26, the search request receiving unit 32, the search character string dividing unit 34, the search unit 36, and the search result output unit 38 can be realized in the address search device 10 by a processor executing a program stored in memory.

The address data storage unit 20 stores address data delimited by hierarchy level. The address hierarchy consists of, for example, in order from the top, “prefecture” (都道府県), “municipality” (市区町村), “oaza (text)” (大字（文字）), “aza/chome” (字丁目), and “numeric part (lot number)”. FIG. 2 is a diagram showing an example of the address data stored in the address data storage unit 20. As shown in FIG. 2, the address “Tokyo-to Nakano-ku Nakano” (東京都中野区中野), for example, is delimited into per-level data: “Tokyo”, “to”, “Nakano”, “ku”, “Nakano”. The address data storage unit 20 also stores the suffix of each level. A suffix here is a term appended after a level of the address, such as “to” (都), “ken” (県), “shi” (市), or “ku” (区). As long as the hierarchy level can be determined, the suffixes themselves need not be stored in the address data storage unit 20. For example, the data of each level may be stored in the address data storage unit 20 together with tag information identifying the level. Alternatively, the data of each level may be stored at a position predetermined for that level.

The dictionary generation unit 22 generates, from the address data stored in the address data storage unit 20, dictionary data in which address hierarchy information is attached to words that can appear in an address, and stores the generated dictionary data in the dictionary storage unit 24. FIG. 3 is a diagram showing an example of the dictionary data stored in the dictionary storage unit 24; it shows the dictionary data generated from the address data of FIG. 2. For example, from the address data “Tokyo-to Nakano-ku Nakano”, dictionary data is generated in which the hierarchy information “prefecture” is attached to the word “Tokyo”. Likewise, dictionary data is generated in which “municipality” is attached to the word “Nakano”, and further dictionary data in which “oaza (text)” is attached to the word “Nakano”. Dictionary data is generated in the same way from the other address data. The dictionary storage unit 24 can also register words that are different spellings of the same address (variations). For example, when there is an oaza (text) word “Kasumigaseki” written 霞が関, the spellings 霞ヶ関 and 霞関 can be registered as variation words. Variation words may be created manually, or may be generated by applying predetermined normalization to words cut out of the address data. The dictionary storage unit 24 can also store a set of variations, for example 霞が関, 霞ヶ関, and 霞関, in association with one another.
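As a rough illustration of the dictionary data described above, the following Python sketch maps each word to the set of hierarchy levels at which it occurs and copies those levels to registered variations. The input layout (lists of `(word, level)` pairs), the level names, and the function `build_dictionary` are assumptions made for this example; the patent does not prescribe a data format.

```python
from collections import defaultdict

# Hypothetical input: each address is a list of (word, hierarchy level) pairs.
ADDRESSES = [
    [("東京", "prefecture"), ("中野", "municipality"), ("中野", "oaza")],
    [("東京", "prefecture"), ("千代田", "municipality"), ("霞が関", "oaza")],
]

# Hypothetical variation table: base spelling -> alternative spellings.
VARIANTS = {"霞が関": ["霞ヶ関", "霞関"]}

def build_dictionary(addresses, variants=None):
    """Map each word to the set of hierarchy levels it appears at."""
    dictionary = defaultdict(set)
    for address in addresses:
        for word, level in address:
            dictionary[word].add(level)
    # Register each variation under the same levels as its base word.
    for base, alternatives in (variants or {}).items():
        if base in dictionary:
            for alternative in alternatives:
                dictionary[alternative] |= dictionary[base]
    return dict(dictionary)

dictionary = build_dictionary(ADDRESSES, VARIANTS)
print(sorted(dictionary["中野"]))   # ['municipality', 'oaza']
print(sorted(dictionary["霞ヶ関"]))  # ['oaza']
```

Keeping a set of levels per word matters because the same surface word, such as “Nakano”, can be both a municipality and an oaza.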

The master generation unit 26 splits the address data stored in the address data storage unit 20 using the words stored in the dictionary storage unit 24, thereby generating address data divided at word level, and stores it in the search master storage unit 28. The master generation unit 26 can also include word variations in the word-level address data. For example, as in the example above, when the word in the original address data is 霞が関 (“Kasumigaseki”), the variations 霞ヶ関 and 霞関 can be included. The master generation unit 26 can, for instance, refer to the dictionary storage unit 24 and store the variation words registered there in the search master storage unit 28. Variation words may be created manually, or may be generated by applying predetermined normalization to the words of the original address data.

The master generation unit 26 also generates address data divided at character level from the address data stored in the address data storage unit 20 and stores it in the search master storage unit 28. In addition, the master generation unit 26 can store display address data and supplementary information such as latitude and longitude in the search master storage unit 28. The master generation unit 26 further generates an index that allows fast word-by-word search of the word-level address data and stores it in the search master storage unit 28. Similarly, it generates an index that allows fast character-by-character search of the character-level address data and stores it in the search master storage unit 28. FIG. 4 is a diagram showing an example of the search master stored in the search master storage unit 28; it shows the search master generated from the address data of FIG. 2 and the dictionary data of FIG. 3.

The cost information storage unit 30 stores cost information indicating how likely address hierarchy levels are to follow one another. This “likelihood of contiguity” expresses how readily the words assigned to each level connect to one another. Suffixes are also treated as levels. For example, the municipality word “Nakano” is highly likely to be followed immediately by the municipality-suffix word “ku”, whereas it is unlikely to be followed by the prefecture word “Tokyo”.

FIG. 5 is a diagram showing an example of the cost information stored in the cost information storage unit 30. FIG. 5 shows the cost information with the preceding word on the vertical axis and the following word on the horizontal axis. In this embodiment, the cost information is stored in the cost information storage unit 30 as a “cost” whose value is smaller the higher the likelihood of contiguity. For example, when the level of the preceding word is “municipality”, the cost is 100 when the level of the following word is “prefecture” or “prefecture suffix”. This indicates that a “prefecture” or “prefecture suffix” is extremely unlikely to come immediately after a “municipality”. When the level of the preceding word is “municipality” and the level of the following word is “municipality suffix”, the cost is 1. This indicates that a “municipality suffix” is the level most likely to come immediately after a “municipality”. In other words, the cost information of FIG. 5 shows that the word most likely to follow the municipality “Nakano” is a municipality suffix such as “ku”, whereas “Tokyo” or “to” is extremely unlikely to follow. In the example of FIG. 5, costs of 1 to 9 are set between levels that naturally connect. Costs of 10 to 70 are set between levels that do not connect in a formal address but preserve the order of the address. A cost of 100 is set between levels whose order is broken. Among levels that naturally connect, and among levels whose order is preserved, the cost is lower the closer the levels are to each other. The cost values shown in FIG. 5 are only an example.
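One way to picture the cost information is as a lookup table keyed by (preceding level, following level). The Python sketch below contains only the cost values that are actually quoted in this description; the full table of FIG. 5 is not reproduced here, so the fallback `DEFAULT_COST` and the English level names are assumptions made for the example.

```python
# Transition costs quoted in the description (preceding level, following level).
COST = {
    ("municipality", "prefecture"): 100,
    ("municipality", "prefecture_suffix"): 100,
    ("municipality", "municipality_suffix"): 1,
    ("prefecture", "municipality"): 10,
    ("prefecture", "oaza"): 30,
    ("municipality", "municipality"): 10,
    ("municipality", "oaza"): 20,
}

# Assumption: pairs not quoted in the text are treated as order-breaking.
DEFAULT_COST = 100

def path_cost(levels):
    """Sum the transition costs along a sequence of hierarchy levels.

    A sequence with a single level has no transitions, so its cost is 0.
    """
    return sum(COST.get(pair, DEFAULT_COST) for pair in zip(levels, levels[1:]))

print(path_cost(["municipality", "municipality_suffix"]))  # 1
print(path_cost(["prefecture", "municipality", "oaza"]))   # 30
print(path_cost(["oaza"]))                                 # 0
```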

The search request receiving unit 32 receives from the user a search request that includes a search character string for searching for an address. For example, the search request receiving unit 32 can receive, over a network such as the Internet, a search request entered on a mobile terminal, personal computer, or the like. The search request receiving unit 32 can also accept a search request entered by the user on the address search device 10 itself.

The search character string dividing unit 34 splits the search character string included in the search request using the words stored in the dictionary storage unit 24, thereby generating search word lists, each of which is a combination of search words. Of the generated search word lists, it outputs the one whose total cost, based on the costs stored in the cost information storage unit 30, is smallest. The search character string dividing unit 34 also outputs a search character list, which is the combination of search characters obtained by splitting the search character string character by character. The cost of a single word that is not joined to any other word is taken to be 0.

For example, when the search character string is “Tokyo Nakano” (東京中野) and, as shown in FIG. 3, the words “Tokyo” (東京), “Nakano” (中野), “Naka” (中), and “No” (野) are stored in the dictionary storage unit 24, the search character string dividing unit 34 can generate the combinations “Tokyo / Nakano” and “Tokyo / Naka / No” as search word lists. According to the dictionary data of FIG. 3, “Tokyo” is a “prefecture”, “Nakano” is a “municipality” or an “oaza (text)”, “Naka” is a “municipality”, and “No” is an “oaza (text)”. The three patterns of level transitions shown in FIG. 6 are therefore possible. When the cost of each pattern in FIG. 6 is obtained from the cost information of FIG. 5, the combination “Tokyo / Nakano” costs 10 for “prefecture” → “municipality” and 30 for “prefecture” → “oaza (text)”, and the combination “Tokyo / Naka / No” costs 30. The combination “Tokyo / Nakano” thus has the smallest cost, so in this example the search character string dividing unit 34 outputs the search word list “Tokyo / Nakano”.
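The selection described above can be sketched in Python as a brute-force search: enumerate every split of the string into dictionary words, try every level assignment for each split, and keep the split with the smallest summed transition cost. This is a minimal sketch, not the patented implementation; the small `DICTIONARY`, the partial `COST` table (only values quoted in this description), `DEFAULT_COST`, and the function names are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical dictionary: word -> hierarchy levels it can belong to.
DICTIONARY = {
    "東京": {"prefecture"},
    "中野": {"municipality", "oaza"},
    "東中野": {"oaza"},
    "東": {"municipality"},
    "中": {"municipality"},
    "野": {"oaza"},
}

# Transition costs quoted in the description; other pairs fall back to 100.
COST = {
    ("prefecture", "municipality"): 10,
    ("prefecture", "oaza"): 30,
    ("municipality", "municipality"): 10,
    ("municipality", "oaza"): 20,
}
DEFAULT_COST = 100

def segmentations(text):
    """Yield every way of splitting text into dictionary words."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in DICTIONARY:
            for rest in segmentations(text[end:]):
                yield [head] + rest

def best_split(text):
    """Return (cost, words) for the cheapest split, or None if no split exists."""
    best = None
    for words in segmentations(text):
        # Try every combination of levels the words could belong to.
        for levels in product(*(sorted(DICTIONARY[w]) for w in words)):
            cost = sum(COST.get(p, DEFAULT_COST) for p in zip(levels, levels[1:]))
            if best is None or cost < best[0]:
                best = (cost, words)
    return best

print(best_split("東京中野"))  # (10, ['東京', '中野'])
print(best_split("東中野"))    # (0, ['東中野'])
```

Exhaustive enumeration is fine for short address queries; a dynamic-programming (Viterbi-style) search over the same costs would be the usual choice if inputs were long.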

The search unit 36 searches the search master stored in the search master storage unit 28 using the search word list and the search character list output from the search character string dividing unit 34. Specifically, the search unit 36 searches for address data containing the words in the search word list, and also for address data containing the characters in the search character list. The search unit 36 then outputs search results that include information indicating the degree of match with each of the search word list and the search character list. This information includes, for example, the number of words that match the search words and the number of characters that match the search characters.

The search result output unit 38 outputs the search results of the search unit 36 according to their degree of match with the search words and search characters. For example, for the address data stored in the search master storage unit 28, the search result output unit 38 calculates a score (relevance) with respect to the search character string based on the number of words matching the search words and the number of characters matching the search characters, and outputs the search results so that address-related information is displayed in descending order of score. When calculating the score, the search result output unit 38 can weight word matches more heavily than character matches. Specifically, addresses with more words matching the search words can be given higher scores, and among addresses with the same number of matching words, addresses with more characters matching the search characters can be given higher scores.
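The word-first, character-second ordering described above maps naturally onto tuple comparison. The following Python sketch is one possible reading of that rule; how matches are counted (here, every address token or character that appears in the query is counted) and the names `score` and `rank` are assumptions, since the description does not fix those details.

```python
def score(search_words, search_chars, address_words):
    """Return (word matches, character matches) for one address.

    Tuples compare element by element, so word matches dominate and
    character matches only break ties.
    """
    wanted_words = set(search_words)
    wanted_chars = set(search_chars)
    word_hits = sum(1 for w in address_words if w in wanted_words)
    char_hits = sum(1 for c in "".join(address_words) if c in wanted_chars)
    return (word_hits, char_hits)

def rank(search_words, search_chars, addresses):
    """Sort addresses from highest to lowest score."""
    return sorted(
        addresses,
        key=lambda words: score(search_words, search_chars, words),
        reverse=True,
    )

TOKYO = ["東京", "都", "中野", "区", "東中野"]
HAMAMATSU = ["静岡", "県", "浜松", "市", "東", "区", "中野"]

# Query "東中野" kept as a single search word, plus its characters.
print(score(["東中野"], list("東中野"), TOKYO))      # (1, 6)
print(score(["東中野"], list("東中野"), HAMAMATSU))  # (0, 3)
```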

For example, the search result output unit 38 can output the display address data contained in the search master in descending order of score. It can also output map information corresponding to the address data with the highest score. For address data with the same score, the search result output unit 38 can output the search results in an order determined by region information. Region information here means, for example, the population density or real-time congestion level of each region, and is stored in the region information storage unit 40. For example, among address data with the same score, the search result output unit 38 can rank address data in regions with higher population density higher. The search result output unit 38 may also rank address data in regions closer to the current position of the user terminal or of the address search device 10 higher. Further, among address data with the same score, the search result output unit 38 can rank shorter address data higher. For example, when the address data “Tokyo-to Chuo-ku Nihonbashi” (東京都中央区日本橋) and “Osaka-fu Osaka-shi Chuo-ku Nipponbashi” (大阪府大阪市中央区日本橋) are both found for the search character string “Chuo-ku Nihonbashi” (中央区日本橋), the search result output unit 38 can output the search results so that “Tokyo-to Chuo-ku Nihonbashi”, which has fewer words, is ranked higher.
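The "shorter address first" tie-break can be expressed as a secondary sort key. In the Python sketch below, the record layout (`display`, `words`, `score`) and the score values are hypothetical and chosen only to show the ordering; population density or distance could be added as further key components in the same way.

```python
def order_results(results):
    """Sort by score (descending), then by number of words (ascending)."""
    return sorted(results, key=lambda r: (-r["score"], len(r["words"])))

# Hypothetical search results for the query "中央区日本橋"; both match 3 words.
results = [
    {
        "display": "大阪府大阪市中央区日本橋",
        "words": ["大阪", "府", "大阪", "市", "中央", "区", "日本橋"],
        "score": 3,
    },
    {
        "display": "東京都中央区日本橋",
        "words": ["東京", "都", "中央", "区", "日本橋"],
        "score": 3,
    },
]

for result in order_results(results):
    print(result["display"])
# 東京都中央区日本橋
# 大阪府大阪市中央区日本橋
```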

== Processing ==
An example of processing in the address search device 10 will be described.

FIG. 7 is a flowchart showing an example of dictionary data generation processing. The dictionary generation unit 22 refers to the address data stored in the address data storage unit 20 and obtains address words from the address character string of each level (S701). The dictionary generation unit 22 then attaches hierarchy information to the obtained address words to generate dictionary data, and stores it in the dictionary storage unit 24 (S702).

When generating the dictionary data, the dictionary generation unit 22 may apply predetermined normalization to the address data stored in the address data storage unit 20. Normalization here means, for example, converting katakana to hiragana, converting 「ヶ」 to 「が」, or converting “2-chome 10-5” (２丁目１０−５) to “2-10-5”. The dictionary generation unit 22 can also store a run of numeric parts in the dictionary storage unit 24 as a single word. For example, it can store “2-10-5” in the dictionary storage unit 24 as one word.
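The three normalization examples named above can be sketched in Python as follows. This is an assumed, simplified rule set for illustration only: the function name `normalize`, the order of the steps, and the regular expression are not taken from the patent, and real address data would need more rules (full-width digits and dash variants, for instance).

```python
import re

# Katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana.
KATAKANA_TO_HIRAGANA = {code: code - 0x60 for code in range(0x30A1, 0x30F7)}

def normalize(text):
    """Apply the three example normalizations to an address string."""
    # Replace the small "ヶ" first, before the generic katakana mapping
    # would turn it into a different character.
    text = text.replace("ヶ", "が")
    text = text.translate(KATAKANA_TO_HIRAGANA)
    # "2丁目10-5" -> "2-10-5": turn the chome marker into a hyphen.
    text = re.sub(r"(\d+)丁目", r"\1-", text)
    return text

print(normalize("霞ヶ関"))     # 霞が関
print(normalize("ナカノ"))     # なかの
print(normalize("2丁目10-5"))  # 2-10-5
```

Applying the same function to both the stored address words and the incoming search string keeps the two sides comparable.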

FIG. 8 is a flowchart showing an example of search master generation processing. The master generation unit 26 refers to the address data storage unit 20 and the dictionary storage unit 24 and generates address data divided at word level, using the same boundaries as the words registered in the dictionary data (S801). If the words stored in the dictionary data have been normalized, the address data generated by the master generation unit 26 can likewise be normalized. The master generation unit 26 also refers to the address data storage unit 20 and the dictionary storage unit 24 and generates address data divided at character level (S802). The master generation unit 26 then generates indexes for the word-level address data and the character-level address data (S803) and stores them, together with the address data, in the search master storage unit 28 (S804). The master generation unit 26 also stores display address data and supplementary information in the search master storage unit 28.
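The description does not say what kind of index is built in S803. A common choice for this sort of lookup is an inverted index, and the Python sketch below assumes that structure: one index from words to address ids and one from characters to address ids. The function name `build_master` and the use of list positions as address ids are assumptions made for the example.

```python
from collections import defaultdict

def build_master(addresses):
    """Build word-level and character-level inverted indexes.

    addresses: list of word lists, already split on dictionary boundaries.
    Returns (word_index, char_index); each maps a token to the set of
    address ids (list positions) that contain it.
    """
    word_index = defaultdict(set)
    char_index = defaultdict(set)
    for address_id, words in enumerate(addresses):
        for word in words:
            word_index[word].add(address_id)
            for char in word:
                char_index[char].add(address_id)
    return dict(word_index), dict(char_index)

addresses = [
    ["東京", "都", "中野", "区", "東中野"],
    ["静岡", "県", "浜松", "市", "東", "区", "中野"],
]
word_index, char_index = build_master(addresses)
print(sorted(word_index["中野"]))    # [0, 1]
print(sorted(word_index["東中野"]))  # [0]
print(sorted(char_index["東"]))      # [0, 1]
```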

FIG. 9 is a flowchart showing an example of search processing. First, the search request receiving unit 32 receives a search request including a search character string (S901). When the search request has been received, the search character string dividing unit 34 generates and outputs a search word list and a search character list from the search character string. Specifically, the search character string dividing unit 34 splits the search character string using the words stored in the dictionary storage unit 24 to generate candidate search word lists (S902). It then calculates the cost of each candidate based on the cost information stored in the cost information storage unit 30 (S903), and outputs the search word list with the lowest cost together with the search character list (S904). The search character string dividing unit 34 can generate the search word lists after applying to the search character string the same normalization that was applied to the words stored in the dictionary data.

The search unit 36 then searches the search master storage unit 28 using the search word list and the search character list (S905). Based on the search results of the search unit 36, the search result output unit 38 outputs address data in descending order of degree of match with the search word list and the search character list (S906).

The division of the search character string is now described using concrete examples. The address data, dictionary data, and search master are assumed to be in the states shown in FIGS. 2 to 4.

FIG. 10 shows an example in which the search character string contains “東中野” (Higashi-Nakano). The search character string dividing unit 34 refers to the dictionary storage unit 24 and generates candidate search word lists from the search character string “東中野”. As shown in FIG. 10, three candidates are generated: “東中野”, “東／中野”, and “東／中／野”. The search character string dividing unit 34 calculates a cost for each candidate. “東中野” is a single word at the “oaza (text)” level, so its cost is 0. “東／中野” has a cost of 10 when read as “municipality” → “municipality” and a cost of 20 when read as “municipality” → “oaza (text)”. “東／中／野” is “municipality” → “municipality” → “oaza (text)”, for a cost of 30. The search character string dividing unit 34 therefore outputs “東中野”, which has the lowest cost, as the search word list. As this example shows, a one-word search word list has the lowest cost. When a one-word search word list exists, the search character string dividing unit 34 may therefore output it without calculating the costs of the other candidates.
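The candidate generation and cost calculation in this example can be reproduced with a short sketch. It is not the patented implementation: the dictionary contents and the level names `municipality` and `oaza` are hypothetical stand-ins for the data of FIGS. 2 to 4, the transition costs are read off the worked examples, the penalty for unlisted transitions is invented, and taking the minimum over a candidate's possible level assignments is an assumption (the text lists both 10 and 20 for “東／中野” without saying which is used).

```python
from itertools import product

# Hypothetical dictionary data: word -> address levels the word can belong to.
DICTIONARY = {
    "東中野": ["oaza"],
    "中野東": ["oaza"],
    "中野": ["municipality", "oaza"],
    "東": ["municipality"],
    "中": ["municipality"],
    "野": ["oaza"],
}

# Transition costs between consecutive levels, read off the worked examples.
TRANSITION_COST = {
    ("municipality", "municipality"): 10,
    ("municipality", "oaza"): 20,
    ("oaza", "municipality"): 100,
}
UNLISTED_COST = 1000  # assumption: penalty for any transition not listed above


def segmentations(text):
    """Yield every way to split text into dictionary words."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in DICTIONARY:
            for rest in segmentations(text[end:]):
                yield [head] + rest


def cost(words):
    """Lowest total transition cost over all level assignments (assumption)."""
    totals = []
    for levels in product(*(DICTIONARY[w] for w in words)):
        pairs = zip(levels, levels[1:])
        totals.append(sum(TRANSITION_COST.get(p, UNLISTED_COST) for p in pairs))
    return min(totals)


def best_segmentation(text):
    """Return the candidate word list with the lowest cost."""
    return min(segmentations(text), key=cost)
```

Under these assumptions, `best_segmentation("東中野")` returns `["東中野"]` because a single word has no transitions and therefore costs 0, while `cost(["東", "中", "野"])` is 30, matching the figures quoted in the text.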

Once the search word list “東中野” has been output, the search unit 36 uses it to search the search master storage unit 28. To simplify the explanation, the search character list is ignored here. The search unit 36 searches the word-level address data in the search master with the search word list “東中野” as the key. The address data “東京／都／中野／区／東中野／・・・” matches, whereas the address data “静岡／県／浜松／市／東／区／中野／町／・・・” does not. The search result output unit 38 therefore outputs the address data with the highest score, “東京都中野区東中野・・・” (Higashi-Nakano, Nakano-ku, Tokyo).

Now consider what would happen if the search character string “東中野” were divided into “東／中野”. The address data “東京／都／中野／区／中野／・・・” would match the search word “中野” twice. The address data “東京／都／中野／区／東中野／・・・” would match “中野” once. The address data “静岡／県／浜松／市／東／区／中野／町／・・・” would match “東” once and “中野” once, for a total of two matches. If the score were determined simply by the number of matches with the search words, the address data “静岡県浜松市東区中野町・・・” (in Hamamatsu, Shizuoka Prefecture) would score higher than “東京都中野区東中野・・・”. In other words, dividing the user's input “東中野” into “東／中野” without considering how likely the address levels are to occur consecutively could produce search results contrary to the user's intent.
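The match counts in this comparison can be checked with a minimal sketch. It assumes the simplest possible scoring, namely counting how often each search word occurs among an address's word-level tokens; the function `match_count` and the sample table `ADDRESSES` are hypothetical and contain only the three addresses discussed above.

```python
# Word-level address data for the three addresses discussed in the text.
ADDRESSES = {
    "東京都中野区中野": ["東京", "都", "中野", "区", "中野"],
    "東京都中野区東中野": ["東京", "都", "中野", "区", "東中野"],
    "静岡県浜松市東区中野町": ["静岡", "県", "浜松", "市", "東", "区", "中野", "町"],
}


def match_count(search_words, address_words):
    """Total number of occurrences of the search words in one address."""
    return sum(address_words.count(word) for word in search_words)


def rank(search_words):
    """Addresses sorted by descending match count (ties keep table order)."""
    scored = [(match_count(search_words, words), name)
              for name, words in ADDRESSES.items()]
    return sorted(scored, key=lambda item: -item[0])


naive = rank(["東", "中野"])   # the undesirable split
intended = rank(["東中野"])    # the split chosen by the cost calculation
```

With the split “東／中野”, the Hamamatsu address scores 2 and the Higashi-Nakano address only 1; with the single word “東中野”, only the Higashi-Nakano address matches.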

In this embodiment, by contrast, the cost information is used to take into account how likely address levels are to occur consecutively, so the search word list “東中野” is generated from the user's input “東中野”. It is therefore possible to output search results that better reflect the user's intent.

FIG. 11 shows another concrete example, in which the search character string contains “中野東” (Nakano-Higashi). The search character string dividing unit 34 refers to the dictionary storage unit 24 and generates candidate search word lists from the search character string “中野東”. As shown in FIG. 11, three candidates are generated: “中野東”, “中野／東”, and “中／野／東”. The search character string dividing unit 34 calculates a cost for each candidate. “中野東” is a single word at the “oaza (text)” level, so its cost is 0. “中野／東” has a cost of 10 when read as “municipality” → “municipality” and a cost of 100 when read as “oaza (text)” → “municipality”. “中／野／東” is “municipality” → “oaza (text)” → “municipality”, for a cost of 120. The search character string dividing unit 34 therefore outputs “中野東”, which has the lowest cost, as the search word list.

The search master storage unit 28 is then searched with the search word list “中野東”, and the address data “広島県広島市安芸区中野東町・・・” (in Aki-ku, Hiroshima), which contains “中野東” as a word, is output.

Now consider what would happen if the search character string “中野東” were divided into “中野／東”. The number of matches with the search word list “中野／東” would be 0 for the address data “広島／県／広島／市／安芸／区／中野東／町・・・” but 2 for the address data “静岡／県／浜松／市／東／区／中野／町／・・・”. That is, address data whose levels run in the order “東”, “中野”, the reverse of the search character string “中野東”, would be output at the top of the search results.

In this embodiment, by contrast, the cost information is used to take into account how likely address levels are to occur consecutively, so the search word list “中野東” is generated from the user's input “中野東”, and search results that better reflect the user's intent can be output.

In short, according to this embodiment, search words are generated from the search character string with the structure of addresses taken into account, and the address data is searched on the basis of those search words, so that search results reflecting the user's intent can be output.

In this embodiment, the search character string can also be normalized by the same rules used when the dictionary was generated before the search words are produced, which improves the accuracy of matching the search words against the search master.
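The point of sharing one normalization rule can be shown with a small sketch. The patent only speaks of a predetermined rule, so the rule used here, Unicode NFKC folding via the standard `unicodedata` module, is purely an assumption, and the names `normalize`, `DICTIONARY_WORDS`, and `in_dictionary` are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Assumed rule: NFKC folds full-width digits and letters to ASCII."""
    return unicodedata.normalize("NFKC", text).strip()


# The dictionary side is normalized once, when the dictionary is generated.
DICTIONARY_WORDS = {normalize(w) for w in ["中野", "\uff12丁目"]}  # "２丁目"


def in_dictionary(search_word: str) -> bool:
    """Normalize the query with the same rule before the lookup."""
    return normalize(search_word) in DICTIONARY_WORDS
```

Because both sides pass through the same function, a query typed with a half-width “2” and one typed with a full-width “２” find the same dictionary entry.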

In this embodiment, a run of numbers contained in an address can also be treated as a single word when the dictionary data is generated. For example, when the search character string contains “２−１０−５”, matching is performed on the word “２−１０−５” rather than separately on “２”, “１０”, and “５”, which makes the search more accurate.
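One way to keep a numeric run together is a regular expression that treats digits joined by hyphen-like separators as one token. This is a sketch under assumptions: the patent does not say how the run is detected, and the set of separators accepted here (ASCII hyphen, U+2212 minus sign, U+FF0D full-width hyphen) is a guess. The function name `split_number_runs` is hypothetical.

```python
import re

# Digits (half- or full-width) joined by assumed hyphen-like separators.
NUMBER_RUN = re.compile(r"\d+(?:[-\u2212\uFF0D]\d+)*")


def split_number_runs(text):
    """Split text so that each numeric run such as 2-10-5 is one token."""
    tokens, pos = [], 0
    for match in NUMBER_RUN.finditer(text):
        if match.start() > pos:
            tokens.append(text[pos:match.start()])  # text before the run
        tokens.append(match.group())                # the run as one word
        pos = match.end()
    if pos < len(text):
        tokens.append(text[pos:])                   # trailing text
    return tokens
```

For instance, `split_number_runs("中野2-10-5")` yields `["中野", "2-10-5"]`, so the block and lot numbers are matched as a unit rather than digit by digit.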

In this embodiment, the search master can also contain several differently written words for the same address, which raises the rate at which search character strings match the search master. For example, if the original address data is “霞が関” (Kasumigaseki) and the variants “霞ヶ関” and “霞関” are registered in the search master, a match is possible not only when the search character string contains “霞が関” but also when it contains “霞ヶ関” or “霞関”.
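Registering spelling variants can be sketched as indexing every variant of a word under the same address. The variant table, the function name `index_with_variants`, and the address ID are hypothetical; the patent does not describe how variants are stored.

```python
# Hypothetical variant table: original spelling -> all registered spellings.
VARIANTS = {"霞が関": ["霞が関", "霞ヶ関", "霞関"]}


def index_with_variants(address_id, words, variants, index):
    """Add every spelling variant of each word to a word-level index."""
    for word in words:
        for spelling in variants.get(word, [word]):
            index.setdefault(spelling, set()).add(address_id)
    return index


index = index_with_variants("a1", ["東京", "都", "千代田", "区", "霞が関"],
                            VARIANTS, {})
```

After this call, looking up any of the three spellings in `index` returns the same address, so the query does not have to use the official notation.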

This embodiment is provided to facilitate understanding of the present invention and is not to be construed as limiting it. The present invention may be modified or improved without departing from its gist, and the present invention includes its equivalents.

For example, in this embodiment the address data storage unit 20 is included in the address search device 10, but the address data may instead be input from outside the address search device 10.

Also, in this embodiment the search words generated from the search character string are used to search a search master containing address data divided at the word level, but the search target is not limited to this. For example, plain text containing the search words generated from the search character string may be searched. In that case, the plain text to be searched is the address data for search, and the storage unit that stores such plain text is the search master storage unit. Nor is the target limited to plain text: any data that can contain a character string forming part of an address can serve as the address data for search.

DESCRIPTION OF SYMBOLS
10 Address search device
20 Dictionary generation unit
22 Address data storage unit
24 Dictionary storage unit
26 Master generation unit
28 Search master storage unit
30 Cost information storage unit
32 Search request receiving unit
34 Search character string dividing unit
36 Search unit
38 Search result output unit
40 Regional information storage unit

Claims

An address search device comprising:
a dictionary storage unit that stores words that can be included in an address in association with hierarchy information indicating a level of the address hierarchy;
a cost storage unit that stores cost information indicating how likely levels of the address hierarchy are to occur consecutively;
a search master storage unit that stores address data for search;
a search request receiving unit that receives a search request including a search character string for searching for an address;
a search character string dividing unit that, for each of a plurality of combinations of search words obtained by dividing the search character string by the words stored in the dictionary storage unit, calculates a cost indicating the likelihood of consecutive occurrence based on the hierarchy information associated with the search words and on the cost information, and outputs, based on the calculated costs, a combination of search words with a high likelihood of consecutive occurrence;
a search unit that searches the address data stored in the search master storage unit for address data containing the search words output from the search character string dividing unit; and
a search result output unit that outputs the search results of the search unit according to the degree of match with the search words.

The address search device according to claim 1, further comprising:
an address data storage unit that stores address data divided by hierarchy level; and
a dictionary generation unit that generates, from the address data divided by hierarchy level, the words that can be included in an address and the hierarchy information, and stores them in the dictionary storage unit.

The address search device according to claim 2,
wherein the dictionary generation unit generates, from the address data divided by hierarchy level, the words that can be included in an address, normalized according to a predetermined rule.

The address search device according to claim 3,
wherein the search character string dividing unit outputs the combination of search words normalized according to the predetermined rule.

The address search device according to any one of claims 2 to 4,
wherein the dictionary generation unit stores a run of numbers included in an address in the dictionary storage unit as a single word.

The address search device according to any one of claims 1 to 5,
wherein the search master storage unit can include, in the address data for search, a plurality of differently written words for the same address.

The address search device according to any one of claims 1 to 6, further comprising:
a regional information storage unit that stores regional information indicating information about each region,
wherein the search result output unit outputs the search results in an order that depends on the regional information.

The address search device according to any one of claims 1 to 7,
wherein the search result output unit outputs the search results in an order that depends on the length of the address data containing the search words.

An address search method in which an information processing device:
stores, in a dictionary storage unit, words that can be included in an address in association with hierarchy information indicating a level of the address hierarchy;
stores, in a cost storage unit, cost information indicating how likely levels of the address hierarchy are to occur consecutively;
stores address data for search in a search master storage unit;
receives a search request including a search character string for searching for an address;
for each of a plurality of combinations of search words obtained by dividing the search character string by the words stored in the dictionary storage unit, calculates a cost indicating the likelihood of consecutive occurrence based on the hierarchy information associated with the search words and on the cost information, and outputs, based on the calculated costs, a combination of search words with a high likelihood of consecutive occurrence;
searches the address data stored in the search master storage unit for address data containing the output search words; and
outputs the search results according to the degree of match with the search words.