CN103714081A - Method and device for recognizing proprietary place names - Google Patents

Info

Publication number
CN103714081A
Authority
CN
China
Prior art keywords
search word
user
proper noun
confidence level
described search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210376874.8A
Other languages
Chinese (zh)
Other versions
CN103714081B (en
Inventor
李扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method and a device for recognizing proprietary place names. The method includes acquiring a search word input by a user and feature information of the user; utilizing the search word for inquiry, and determining reliability of the search word according to inquiry results and the feature information; judging whether the reliability is larger than a preset threshold value and the search word accords with predefined name rules or not, and if yes, recognizing the search word as a proprietary place name. Compared with the prior art, the method and the device have the advantages that proprietary place name recognition in the field of LBS (location based service) is effectively performed by utilizing geographic locations and operating behavior features of users, and recall rate and accuracy rate of proprietary place name recognition of map retrieval service can be increased, so that recall rate and accuracy rate of map retrieval are increased, and processing speed of map retrieval is increased.

Description

A method and device for recognizing proper nouns
[Technical Field]
The present invention relates to the technical field of map information processing, and in particular to a method and device for recognizing proper nouns.
[Background]
With the development of social informatization, demand for location-based services (LBS) built on place-name information keeps growing. People increasingly rely on LBS for geographic information retrieval in order to find the information they need quickly and accurately. Geographic information retrieval builds on conventional information retrieval: it makes full use of geography-related information and combines the characteristics and processing methods of geographic information to help users complete retrieval more effectively.
Geographic information retrieval technology serves two main purposes. Place-name retrieval returns information such as the geospatial location of a place name or entity object, for example the geographic position of a point of interest (POI) or a planned route. Type or relation retrieval returns the locations of points of interest that satisfy a condition, for example restaurants or entertainment venues near a given place.
A place name is the proper name of a natural or human-geographic entity at a particular spatial location. Because people mostly rely on place names to express and receive geographic positions in everyday communication, place names are often needed to express the query content even in type or relation retrieval. Place-name retrieval has therefore become the most widely used application of geographic information retrieval technology. However, existing place-name retrieval technology sometimes fails to understand the geospatial semantics of a place name correctly, which leads to inaccurate results. A proper noun can be matched directly to accurate geographic location information, so using proper nouns improves the accuracy of place-name retrieval and returns results quickly.
Existing proper noun recognition methods mainly count, over the large search corpus of the whole Internet, the co-occurrence frequency of the lexical items (terms) contained in a search word, and use that frequency to judge whether the search word is a proper noun. However, because data sets and domain experience differ, most search words in the LBS field have little corresponding data in Internet-wide data, so the statistics are not significant and the recall of proper noun recognition is not high enough. The statistics are also easily disturbed by related data from other fields, so the accuracy of proper noun recognition is not high either.
[Summary of the Invention]
In view of this, the present invention provides a method and device for recognizing proper nouns that can improve the recall and accuracy of proper noun recognition in map retrieval services, thereby improving the recall and accuracy of map retrieval and increasing its processing speed.
The specific technical solution is as follows:
A method for recognizing proper nouns, comprising the following steps:
obtaining a search word input by a user and feature information of the user;
performing a query with the search word, and determining a confidence of the search word according to the query result and the feature information;
judging whether the confidence is greater than a preset threshold and the search word conforms to a predefined naming rule, and if so, recognizing the search word as a proper noun.
According to a preferred embodiment of the present invention, the feature information comprises:
the search mode selected by the user when inputting the search word;
the map-area operation mode selected by the user before or after inputting the search word; and/or
the geographic position of the user.
According to a preferred embodiment of the present invention, performing a query with the search word and determining the confidence of the search word according to the query result and the feature information specifically comprises:
judging whether a result matching the search word can be found under the search mode and whether the search mode is a predefined search mode, and if so, increasing the confidence of the search word.
According to a preferred embodiment of the present invention, performing a query with the search word and determining the confidence of the search word according to the query result and the feature information specifically comprises:
judging whether the user has performed an action of the map-area operation mode and whether, after the user completes that action, a result matching the search word can be found within the resulting map area, and if so, increasing the confidence of the search word.
According to a preferred embodiment of the present invention, performing a query with the search word and determining the confidence of the search word according to the query result and the feature information specifically comprises:
judging whether a result matching the search word can be found within a preset distance threshold of the geographic position, and if so, increasing the confidence of the search word.
According to a preferred embodiment of the present invention, after the search word is recognized as a proper noun, the method further comprises:
forming a proper noun set from the search words recognized as proper nouns, and dynamically adjusting the proper noun set according to the user's subsequent query results.
According to a preferred embodiment of the present invention, dynamically adjusting the proper noun set according to the user's subsequent query results specifically comprises:
when the user queries with a search word in the proper noun set but obtains no matching result, reducing the confidence of that search word;
when the confidence falls below the preset threshold, recognizing the search word as a non-proper noun and deleting it from the proper noun set.
A device for recognizing proper nouns, comprising:
an acquisition module, configured to obtain a search word input by a user and feature information of the user;
a query processing module, configured to perform a query with the search word and determine a confidence of the search word according to the query result and the feature information;
a comprehensive judging module, configured to judge whether the confidence is greater than a preset threshold and whether the search word conforms to a predefined naming rule, and if so, to recognize the search word as a proper noun.
According to a preferred embodiment of the present invention, the acquisition module comprises:
a second obtaining submodule, configured to obtain the search mode selected by the user when inputting the search word;
a third obtaining submodule, configured to obtain the map-area operation mode selected by the user before or after inputting the search word; or
a fourth obtaining submodule, configured to obtain the geographic position of the user.
According to a preferred embodiment of the present invention, the query processing module comprises:
a first processing submodule, configured to judge whether the search mode obtained by the second obtaining submodule is a predefined search mode and whether a matching result can be found with the search word under that search mode, and if so, to increase the confidence of the search word.
According to a preferred embodiment of the present invention, the query processing module comprises:
a second processing submodule, configured to judge whether the third obtaining submodule has obtained an action of the map-area operation mode and whether, after the user completes that action, a matching result can be found with the search word within the resulting map area, and if so, to increase the confidence of the search word.
According to a preferred embodiment of the present invention, the query processing module comprises:
a third processing submodule, configured to judge whether a result matching the search word can be found within a preset distance threshold of the geographic position, and if so, to increase the confidence of the search word.
According to a preferred embodiment of the present invention, the device further comprises:
an adjusting module, configured to dynamically adjust, according to the user's subsequent query results, the proper noun set formed from the search words that the comprehensive judging module has recognized as proper nouns.
According to a preferred embodiment of the present invention, the adjusting module is specifically configured to, when the user queries with a search word in the proper noun set but obtains no matching result, recognize the search word as a non-proper noun and delete it from the proper noun set.
As can be seen from the above technical solutions, the method and device for recognizing proper nouns provided by the present invention use the user's geographic position and operation behavior features to recognize proper nouns effectively in the LBS field, and can improve the recall and accuracy of proper noun recognition in map retrieval services. The present invention uses the user's geographic position, the selected search mode and the map-area operation mode performed by the user to determine the tendency of the recognition result, and dynamically adjusts the proper nouns according to retrieval results. This safeguards the accuracy of proper noun recognition, thereby improving the recall and accuracy of map retrieval and increasing its processing speed.
[Brief Description of the Drawings]
Fig. 1 is a flowchart of the proper noun recognition method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the proper noun recognition method provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of the proper noun recognition device provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic diagram of the proper noun recognition device provided by Embodiment 4 of the present invention.
[Detailed Description]
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described below with reference to the drawings and specific embodiments.
Embodiment 1
Fig. 1 is a flowchart of the proper noun recognition method provided by this embodiment. As shown in Fig. 1, the method comprises:
Step S101: obtain the search word input by the user and the feature information of the user.
When providing services such as map search and LBS positioning, a map search service provider usually offers several search modes, map-area operation modes and the like, so that users can choose a suitable mode according to their actual situation.
Search modes generally include bus search, search-box search, nearby search and so on. When using an LBS positioning service, the user inputs a search word (query) under the selected search mode. Under the bus search mode, the search word is mostly a bus stop name; under the nearby search and search-box search modes, it is mostly a point-of-interest name.
Map-area operation modes, by which the user manipulates the displayed map area, generally include dragging the map area, zooming in and zooming out. By zooming or dragging, the user can adjust the position and extent of the displayed map. For example, when the map area shows the whole of Beijing, the user can zoom in and drag so that it shows a particular region, or even a particular residential community or business district.
When the user uses a mobile terminal such as a mobile phone for an LBS positioning service, the system can first obtain the user's geographic location through a positioning module in the terminal, such as GPS (Global Positioning System). Using the obtained location to assist proper noun recognition can further improve accuracy and coverage.
The feature information of the user therefore comprises: the search mode selected when the user inputs the search word, the map-area operation mode selected before or after the user inputs the search word, the geographic position of the user, and similar information.
Step S102: perform a query with the search word, and determine the confidence of the search word according to the query result and the feature information.
A query is performed with the search word to judge whether a query result matching the search word exists, and the confidence of the search word is determined according to the feature information of the user.
A default confidence is set for each search word; alternatively, the previous confidence of the search word can be read from a database. Different weights can be set for different search modes or map-area operation modes. If a query with the search word under a given mode yields a matching result, the weight of that mode is added to the existing confidence of the search word.
For example, when the user inputs "Wudaokou bus terminal" under the bus search mode and the search finds a matching bus stop in the map, the weight of the bus search mode is added to the existing confidence of the search word. If the default confidence of the search word is 0 and the weight of the bus search mode is 3, the confidence of the search word is determined to be 3.
The weights of the search modes can be, but are not limited to being, set as: bus search ≥ nearby search > search-box search. The weights of the map-area operation modes can be, but are not limited to being, set as: zoom out ≥ zoom in > drag.
If no matching query result can be found for the search word input by the user, the confidence of the search word is reduced or set to zero, because in that case the user's input can usually be regarded as a non-proper noun or an erroneous lexical item.
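The confidence update of step S102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `update_confidence`, the mode keys, the penalty of 1 and every weight except the bus-search weight of 3 from the example are hypothetical values chosen only to satisfy the orderings given in the text.

```python
# Hypothetical weights; only the bus-search value 3 comes from the example.
# They respect: bus >= nearby > search box, and zoom out >= zoom in > drag.
SEARCH_MODE_WEIGHTS = {"bus": 3, "nearby": 2, "search_box": 1}
MAP_OP_WEIGHTS = {"zoom_out": 3, "zoom_in": 2, "drag": 1}


def update_confidence(confidence, mode, has_match,
                      weights=SEARCH_MODE_WEIGHTS, penalty=1):
    """Return the new confidence of a search word after one query."""
    if has_match:
        # A matching result under this mode adds the mode's weight.
        return confidence + weights.get(mode, 0)
    # No matching result: lower the confidence, but never below zero
    # (the floor at zero is an assumption of this sketch).
    return max(0, confidence - penalty)


# Default confidence 0 plus the bus-search weight 3 gives 3.
print(update_confidence(0, "bus", True))
```

Keeping the weights in plain dictionaries makes it easy to tune them per deployment, which matches the statement that the orderings are examples rather than fixed values.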
Step S103: judge whether the confidence is greater than the preset threshold and the search word conforms to the predefined naming rule. If so, proceed to step S104 and recognize the search word as a proper noun; otherwise, proceed to step S105 and recognize the search word as a non-proper noun.
The predefined naming rule is used to exclude lexical items that are obviously not proper nouns. A search word that contains stop words such as " " or " " is generally not considered a proper noun, for example "hospitals of Beijing" or "which Tongrentang pharmacy to go to". The predefined naming rule can include, but is not limited to, the form place name + suffix (for example "Peking University") or the form place name + business scope + suffix (for example "Beijing Tumour Hospital").
The preset threshold is set according to the needs of the actual application scenario. If the confidence is greater than the preset threshold and the search word conforms to the predefined naming rule, the search word is recognized as a proper noun; otherwise, it is recognized as a non-proper noun.
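The decision in step S103 combines two independent conditions, which the following sketch makes explicit. The vocabularies `PLACE_NAMES`, `SUFFIXES` and `STOP_WORDS` and both function names are hypothetical, and the English stop words stand in for the original ones only for illustration; a real system would presumably use a gazetteer and a segmenter rather than simple string checks.

```python
# Hypothetical vocabularies, for illustration only.
PLACE_NAMES = ("Beijing", "Peking", "Xi'an")
SUFFIXES = ("University", "Hospital")
STOP_WORDS = ("which", "where", "of")


def matches_name_rule(query):
    """Check the 'place name + [business scope] + suffix' pattern."""
    tokens = query.split()
    if any(tok.lower() in STOP_WORDS for tok in tokens):
        return False  # a stop word rules the query out
    # str.startswith / str.endswith accept a tuple of alternatives.
    return query.startswith(PLACE_NAMES) and query.endswith(SUFFIXES)


def is_proper_noun(query, confidence, threshold):
    """Both conditions must hold: confidence above threshold and rule match."""
    return confidence > threshold and matches_name_rule(query)
```

For instance, `is_proper_noun("Peking University", 3, 2)` holds, while a query containing a stop word fails the rule regardless of its confidence.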
Embodiment 2
Fig. 2 is a flowchart of the proper noun recognition method provided by this embodiment. As shown in Fig. 2, the method comprises:
Step S201: obtain the search word input by the user and the feature information of the user.
The processing of this step is the same as step S101 in Embodiment 1 and is not repeated here. The step finally yields the user's search word, the selected search mode, the selected map-area operation mode and the user's geographic position.
Step S202: for the search mode, judge whether the search mode is a predefined search mode. If so, proceed to step S204 and judge whether a result matching the search word can be found; if so, proceed to step S206 and increase the confidence of the search word.
If a result matching the search word can be found under the search mode, and the search mode is a predefined search mode, the search word is considered more likely to be a proper noun.
Step S203: for the map-area operation mode, judge whether the user has performed an action of the map-area operation mode. If so, proceed to step S204 and judge whether a result matching the search word can be found; if so, proceed to step S206 and increase the confidence of the search word.
The map-area operation mode usually records the action the user performed before initiating the search request for the corresponding search word. After zooming or dragging the map area to a suitable extent, the user initiates a search request to the search engine with the search word input before or after that action.
If the user has performed an action of the map-area operation mode and, after the user completes that action, a result matching the search word can be found within the resulting map area, the confidence of the search word is increased. In other words, if a matching search result can be found within the current map area, the search word is considered more likely to be a proper noun.
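The map-area check of step S203 amounts to testing whether any matching result lies inside the area the user ended up viewing. A minimal sketch follows, assuming the map area is modeled as a latitude/longitude bounding box and each match is a `(name, lat, lon)` tuple. `MapArea` and `matches_in_area` are hypothetical names, and areas crossing the 180° meridian are not handled.

```python
from dataclasses import dataclass


@dataclass
class MapArea:
    """Visible map area as a latitude/longitude bounding box (assumed model)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        return (self.south <= lat <= self.north
                and self.west <= lon <= self.east)


def matches_in_area(area, matches):
    """Keep only the (name, lat, lon) matches that fall inside the area."""
    return [m for m in matches if area.contains(m[1], m[2])]
```

A non-empty return value corresponds to the "yes" branch of the check, after which the confidence of the search word would be increased.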
Step S205: judge whether a matching result can be found within the preset distance threshold of the user's geographic position. If so, proceed to step S206 and increase the confidence of the search word.
If a point of interest matching the search word exists within the preset range around the obtained user position, the search word is considered more likely to be a proper noun.
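The distance check of step S205 can be sketched with the haversine great-circle formula. The patent does not say how distance is computed, so the formula, the function names and the spherical-Earth radius are assumptions; the default of 5 km mirrors the nearby-search radius used in the Xi'an example of this embodiment.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def has_match_nearby(user_pos, match_positions, max_km=5.0):
    """True if any matching result lies within max_km of the user."""
    return any(haversine_km(*user_pos, lat, lon) <= max_km
               for lat, lon in match_positions)
```

Here `user_pos` is a `(lat, lon)` pair and `match_positions` is an iterable of such pairs for the results that matched the search word.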
It is worth mentioning that the order of step S202 and step S204, and of step S203 and step S204, can be interchanged. The order among the three checks (steps S202 and S204, steps S203 and S204, and step S205) can also be interchanged. In the end, each check whose judgment is positive raises the confidence of the search word, and a check whose judgment is negative leaves the confidence unchanged.
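Because each positive check simply adds to the confidence, the order of the checks does not affect the final value. The sketch below illustrates this; the function name `accumulate_confidence` and the weights in the usage note are hypothetical.

```python
def accumulate_confidence(confidence, signals):
    """Add the weight of every satisfied check to the confidence.

    signals: iterable of (condition_met, weight) pairs, one per check
    (search mode, map-area operation, distance). Unsatisfied checks
    leave the confidence unchanged.
    """
    for met, weight in signals:
        if met:
            confidence += weight
    return confidence
```

With hypothetical signals `[(True, 3), (False, 2), (True, 1)]` and a starting confidence of 0, the result is 4 whether the list is traversed forwards or backwards.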
Step S207: judge whether the confidence is greater than the preset threshold and the search word conforms to the predefined naming rule. If so, proceed to step S208; otherwise, proceed to step S209.
In this embodiment, steps S207 to S209 are processed in the same way as steps S103 to S105 in Embodiment 1 and are not repeated here.
Step S210: form a proper noun set from the search words recognized as proper nouns, and dynamically adjust the proper noun set according to the user's subsequent query results.
When the user queries with a search word in the proper noun set but obtains no matching result, the confidence of that search word is reduced. When the confidence falls below the preset threshold, the search word is recognized as a non-proper noun and deleted from the proper noun set.
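The dynamic adjustment of step S210 can be sketched as a small container that tracks the confidence of each recognized proper noun. The class name `ProperNounSet`, its methods and the penalty of 1 per failed query are assumptions of this sketch; the patent only states that the confidence is reduced and that the word is removed once it falls below the threshold.

```python
class ProperNounSet:
    """Proper noun set with per-word confidence (hypothetical structure)."""

    def __init__(self, threshold, penalty=1):
        self.threshold = threshold
        self.penalty = penalty      # assumed reduction per failed query
        self.confidence = {}        # search word -> confidence

    def add(self, word, confidence):
        self.confidence[word] = confidence

    def record_query(self, word, has_match):
        """Update after a follow-up query; return True if the word is kept."""
        if word not in self.confidence:
            return False
        if not has_match:
            self.confidence[word] -= self.penalty
            if self.confidence[word] < self.threshold:
                # Below the threshold: now treated as a non-proper noun.
                del self.confidence[word]
                return False
        return True
```

With a threshold of 3 and a starting confidence of 4, the first failed query lowers the confidence to 3 and keeps the word; the second lowers it to 2 and removes the word from the set.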
For example, when a user uses an LBS product such as Baidu Maps, the user is first located to obtain the geographic position; suppose the user is in Xi'an. The user then searches for "Xi'an Nephrosis Hospital" through nearby search, so the search is carried out preferentially in the user's vicinity (5 km). If a POI named "Xi'an Nephrosis Hospital" exists near the user, and the search word is determined to conform to the predefined naming rule with a confidence greater than the preset threshold, the search word "Xi'an Nephrosis Hospital" is recognized as a proper noun. If a user later searches through nearby search or the search box, and searching for the recognized proper noun "Xi'an Nephrosis Hospital" as a whole returns no result, the search word is converted to a non-proper noun. The search then looks for "nephrosis hospital" near "Xi'an" and obtains a series of results. By recognizing proper nouns, the present invention improves both the efficiency and the accuracy of map search.
The above is a detailed description of the method provided by the present invention. The proper noun recognition device provided by the present invention is described in detail below.
Embodiment 3
Fig. 3 is a schematic diagram of the proper noun recognition device provided by this embodiment. As shown in Fig. 3, the device comprises an acquisition module 301, a query processing module 302 and a comprehensive judging module 303.
The acquisition module 301 is configured to obtain the search word input by the user and the feature information of the user.
When providing services such as map search and LBS positioning, a map search service provider usually offers several search modes, map-area operation modes and the like, so that users can choose a suitable mode according to their actual situation.
Search modes generally include bus search, search-box search, nearby search and so on. When using an LBS positioning service, the user inputs a search word (query) under the selected search mode. Under the bus search mode, the search word is mostly a bus stop name; under the nearby search and search-box search modes, it is mostly a point-of-interest name.
Map-area operation modes, by which the user manipulates the displayed map area, generally include dragging the map area, zooming in and zooming out. By zooming or dragging, the user can adjust the position and extent of the displayed map. For example, when the map area shows the whole of Beijing, the user can zoom in and drag so that it shows a particular region, or even a particular residential community or business district.
When the user uses a mobile terminal such as a mobile phone for an LBS positioning service, the system can first obtain the user's geographic location through a positioning module in the terminal, such as GPS (Global Positioning System). Using the obtained location to assist proper noun recognition can further improve accuracy and coverage.
The feature information of the user therefore comprises: the search mode selected when the user inputs the search word, the map-area operation mode selected before or after the user inputs the search word, the geographic position of the user, and similar information.
The query processing module 302 is configured to perform a query with the search word and determine the confidence of the search word according to the query result and the feature information.
The query processing module 302 performs a query with the search word, judges whether a query result matching the search word exists, and determines the confidence of the search word according to the feature information of the user.
A default confidence is set for each search word; alternatively, the previous confidence of the search word can be read from a database. Different weights can be set for different search modes or map-area operation modes. If a query with the search word under a given mode yields a matching result, the weight of that mode is added to the existing confidence of the search word.
For example, when the user inputs "Wudaokou bus terminal" under the bus search mode and the query processing module 302 finds a matching bus stop in the map, the weight of the bus search mode is added to the existing confidence of the search word. If the default confidence of the search word is 0 and the weight of the bus search mode is 3, the confidence of the search word is determined to be 3.
The weights of the search modes can be, but are not limited to being, set as: bus search ≥ nearby search > search-box search. The weights of the map-area operation modes can be, but are not limited to being, set as: zoom out ≥ zoom in > drag.
If no matching query result can be found for the search word input by the user, the query processing module 302 reduces the confidence of the search word or sets it to zero, because in that case the user's input can usually be regarded as a non-proper noun or an erroneous lexical item.
The comprehensive judging module 303 is configured to judge whether the confidence is greater than the preset threshold and the search word conforms to the predefined naming rule, and if so, to recognize the search word as a proper noun; otherwise, to recognize the search word as a non-proper noun.
The predefined naming rule is used to exclude lexical items that are obviously not proper nouns. A search word that contains stop words such as " " or " " is generally not considered a proper noun, for example "hospitals of Beijing" or "which Tongrentang pharmacy to go to". The predefined naming rule can include, but is not limited to, the form place name + suffix (for example "Peking University") or the form place name + business scope + suffix (for example "Beijing Tumour Hospital").
The preset threshold is set according to the needs of the actual application scenario. If the confidence is greater than the preset threshold and the search word conforms to the predefined naming rule, the comprehensive judging module 303 recognizes the search word as a proper noun; otherwise, the comprehensive judging module 303 recognizes the search word as a non-proper noun.
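The division of labor among modules 301 to 303 can be sketched as a small composition of interchangeable parts. All class, field and parameter names here are hypothetical, and the scoring function and naming rule are passed in as plain callables so that the sketch stays independent of any particular query backend.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Request:
    """Output of the acquisition module: search word plus user features."""
    query: str
    search_mode: Optional[str] = None
    map_operation: Optional[str] = None
    position: Optional[Tuple[float, float]] = None


class RecognitionDevice:
    def __init__(self, score: Callable[[Request], float],
                 name_rule: Callable[[str], bool], threshold: float):
        self.score = score          # stands in for query processing module 302
        self.name_rule = name_rule  # rule applied by judging module 303
        self.threshold = threshold

    def recognize(self, request: Request) -> bool:
        """Return True if the search word is recognized as a proper noun."""
        confidence = self.score(request)
        return confidence > self.threshold and self.name_rule(request.query)
```

A caller would build a `Request` from the user's input, then call `recognize`; swapping in a different `score` callable changes how the confidence is computed without touching the final judgment.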
Embodiment tetra-
Fig. 4 is the recognition device schematic diagram of the proper noun that provides of the present embodiment, and as shown in Figure 4, this device comprises: acquisition module 401, query processing module 402, comprehensive judge module 403 and adjusting module 404.
The acquisition module 401 is configured to obtain the search word input by the user and the user's characteristic information, and specifically comprises: a first obtaining submodule 4011, a second obtaining submodule 4012, a third obtaining submodule 4013 and a fourth obtaining submodule 4014.
The first obtaining submodule 4011 is configured to obtain the search word input by the user.
The second obtaining submodule 4012 is configured to obtain the search pattern selected by the user when the user inputs the search word.
Search patterns generally include public transport search, search box search, nearby search and the like. When using an LBS positioning service, the user inputs a search word (query) under the selected search pattern. Under the public transport search pattern, the search word input by the user is mostly a bus stop name; under the nearby search and search box search patterns, it is mostly a point of interest name or the like.
The third obtaining submodule 4013 is configured to obtain the map-area operation mode selected by the user before or after the user inputs the search word.
The map-area operation modes with which a user operates the map generally include dragging the map area, zooming in on the map area, zooming out of the map area and the like. By zooming in, zooming out or dragging, the user can adjust the map position and range that are displayed. For example, when the map area shows the whole of Beijing, the user can zoom in and drag so that the displayed map area becomes a particular region, or even a particular residential community or business district.
The fourth obtaining submodule 4014 is configured to obtain the geographic position of the user.
When the user uses a mobile terminal such as a mobile phone for an LBS positioning service, the system can first obtain the user's geographic position information through a locating module in the terminal, such as GPS (Global Positioning System). Using the obtained geographic position to assist proper noun recognition can further improve accuracy and coverage.
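The information gathered by the four obtaining submodules could be held in one record. The following is a minimal sketch under stated assumptions: the class name `UserFeatures`, its field names, and the string labels for search patterns and map operations are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class UserFeatures:
    """Search word plus the user's characteristic information (assumed layout)."""
    search_word: str                                          # submodule 4011
    search_pattern: Optional[str] = None                      # submodule 4012, e.g. "bus"
    map_operations: List[str] = field(default_factory=list)   # submodule 4013, e.g. "drag"
    location: Optional[Tuple[float, float]] = None            # submodule 4014, (lat, lon)


# Example: a nearby search made after zooming in, with no position fix.
features = UserFeatures("Peking University",
                        search_pattern="nearby",
                        map_operations=["zoom_in"])
```

Every field other than the search word is optional here because the text treats the three kinds of characteristic information as alternatives that may or may not be available.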
The query processing module 402 is configured to perform a query using the search word and to determine the confidence level of the search word according to the query result and the characteristic information. It specifically comprises: a first processing submodule 4021, a second processing submodule 4022 and a third processing submodule 4023.
The first processing submodule 4021 is configured to judge whether the search pattern obtained by the second obtaining submodule is a predefined search pattern and whether a matching result can be obtained by querying with the search word under that search pattern; if so, the confidence level of the search word is increased.
If a result matching the search word can be obtained under the search pattern, and the search pattern is a predefined search pattern, the search word is considered more likely to be a proper noun.
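The check performed by the first processing submodule might look like the sketch below. The pattern labels in `PREDEFINED_PATTERNS`, the increment `PATTERN_BONUS`, and the `query` callback standing in for the search backend are assumptions; the patent does not specify how much the confidence level is increased.

```python
from typing import Callable, List

PREDEFINED_PATTERNS = {"bus", "search_box", "nearby"}  # assumed labels
PATTERN_BONUS = 0.3                                    # assumed increment


def score_search_pattern(search_word: str,
                         search_pattern: str,
                         query: Callable[[str, str], List[str]],
                         confidence: float = 0.0) -> float:
    """Increase the confidence level when the search pattern is predefined
    and the query under that pattern returns at least one matching result."""
    if search_pattern in PREDEFINED_PATTERNS and query(search_word, search_pattern):
        return confidence + PATTERN_BONUS
    return confidence
```

Passing the backend in as a callable keeps the sketch independent of any particular map search service.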
The second processing submodule 4022 is configured to judge whether the third obtaining submodule has obtained an action of the map-area operation mode and whether, after the user completes that action, a matching result can be obtained by querying with the search word within the range of the current map area; if so, the confidence level of the search word is increased.
The map-area operation mode usually records the actions performed before the user initiates the search request for the corresponding search word. After zooming or dragging the map area to a suitable range, the user submits to the search engine a search request for the search word entered before or after that operation.
If the user has performed an action of the map-area operation mode and, after the action is completed, a result matching the search word can be obtained within the range of the current map area, the confidence level of the search word is increased. In other words, if a matching search result can be found within the current map area, the search word is considered more likely to be a proper noun.
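One way to test whether a match lies within the displayed map area is a bounding-box check. The sketch below is illustrative only: the `MapArea` class, the representation of matches as (latitude, longitude) pairs, and the default `bonus` value are assumptions, and the box test ignores map areas that cross the 180° meridian.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class MapArea:
    """Axis-aligned bounding box of the map area shown to the user."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def score_map_area(operated: bool,
                   area: MapArea,
                   matches: Iterable[Tuple[float, float]],
                   confidence: float = 0.0,
                   bonus: float = 0.3) -> float:
    """Increase the confidence level when the user dragged or zoomed the map
    and at least one matching result lies inside the resulting map area."""
    if operated and any(area.contains(lat, lon) for lat, lon in matches):
        return confidence + bonus
    return confidence
```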
The third processing submodule 4023 is configured to judge whether, using the search word, a matching result can be found whose distance from the position information is within a preset distance threshold; if so, the confidence level of the search word is increased.
If a point of interest matching the search word exists within a preset range around the obtained user position, the search word is considered more likely to be a proper noun.
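The distance test can be sketched with the haversine great-circle formula. The patent does not say how distance is computed, so the use of haversine, the spherical Earth radius, the function names and the default `bonus` value are all assumptions made for illustration.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def score_nearby(user_location: Tuple[float, float],
                 matches: Iterable[Tuple[float, float]],
                 max_distance_m: float,
                 confidence: float = 0.0,
                 bonus: float = 0.3) -> float:
    """Increase the confidence level when any matching point of interest lies
    within the preset distance threshold of the user's position."""
    lat0, lon0 = user_location
    if any(haversine_m(lat0, lon0, lat, lon) <= max_distance_m for lat, lon in matches):
        return confidence + bonus
    return confidence
```

For the short ranges typical of a nearby search, a planar approximation would likely serve just as well; haversine is used here simply because it is valid at any distance.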
The comprehensive judge module 403 is configured to judge whether the confidence level is greater than the predetermined threshold and the search word meets the predefined title rule; if so, the search word is identified as a proper noun; otherwise, it is identified as a non-proper noun.
In this embodiment, the configuration of module 403 is the same as that of module 303 in Embodiment 3 and is not repeated here.
The adjusting module 404 is configured to form a proper noun set from the search words that the comprehensive judge module 403 has identified as proper nouns, and to dynamically adjust the proper noun set according to the user's subsequent query results.
When a user queries with a search word in the proper noun set but obtains no matching result, the adjusting module 404 reduces the confidence level of that search word. When the confidence level falls below the predetermined threshold, the comprehensive judge module 403 identifies the search word as a non-proper noun, and the search word is deleted from the proper noun set.
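The dynamic adjustment could be sketched as a small container that tracks a confidence level per search word. The class name `ProperNounSet`, the fixed `penalty` per failed query, and the assumption that words added to the set have already passed the title rule are illustrative choices, not details from the patent.

```python
class ProperNounSet:
    """Proper noun set whose entries lose confidence after failed queries."""

    def __init__(self, threshold: float, penalty: float) -> None:
        self.threshold = threshold   # predetermined threshold
        self.penalty = penalty       # assumed reduction per failed query
        self._confidence = {}        # search word -> confidence level

    def add(self, word: str, confidence: float) -> None:
        # Only words already recognized as proper nouns belong in the set.
        if confidence > self.threshold:
            self._confidence[word] = confidence

    def record_query(self, word: str, had_match: bool) -> None:
        """Lower the confidence level after a query with no matching result and
        delete the word once its confidence falls below the threshold."""
        if word not in self._confidence or had_match:
            return
        self._confidence[word] -= self.penalty
        if self._confidence[word] < self.threshold:
            del self._confidence[word]

    def __contains__(self, word: str) -> bool:
        return word in self._confidence
```

With a threshold of 0.5 and a penalty of 0.2, for instance, a word added at confidence 0.8 survives one failed query and is removed after the second.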
The proper noun recognition method and device provided by the present invention use the user's geographic position and operation behavior features to perform effective proper noun recognition in the LBS field, and can improve the recall rate and accuracy of proper noun recognition in map retrieval services. The present invention uses the user's geographic position, the selected search pattern and the map-area operation mode performed to determine the tendency of the recognition result, and continually adjusts the proper nouns dynamically according to retrieval results. This safeguards the accuracy of proper noun recognition, thereby improving the recall rate and accuracy of map retrieval and increasing its processing speed.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement or the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (14)

1. A method for recognizing a proper noun, characterized in that the method comprises:
obtaining a search word input by a user and characteristic information of the user;
performing a query using the search word, and determining a confidence level of the search word according to the query result and the characteristic information;
judging whether the confidence level is greater than a predetermined threshold and the search word meets a predefined title rule, and if so, identifying the search word as a proper noun.
2. The method according to claim 1, characterized in that the characteristic information comprises:
the search pattern selected by the user when the user inputs the search word;
the map-area operation mode selected by the user before or after the user inputs the search word; and/or
the geographic position of the user.
3. The method according to claim 2, characterized in that performing a query using the search word and determining the confidence level of the search word according to the query result and the characteristic information specifically comprises:
judging whether a result matching the search word can be obtained by querying under the search pattern and the search pattern is a predefined search pattern, and if so, increasing the confidence level of the search word.
4. The method according to claim 2, characterized in that performing a query using the search word and determining the confidence level of the search word according to the query result and the characteristic information specifically comprises:
judging whether the user has performed an action of the map-area operation mode and whether, after the user completes the action of the map-area operation mode, a result matching the search word can be obtained by querying within the range of the current map area, and if so, increasing the confidence level of the search word.
5. The method according to claim 2, characterized in that performing a query using the search word and determining the confidence level of the search word according to the query result and the characteristic information specifically comprises:
judging whether, using the search word, a matching result can be found whose distance from the position information is within a preset distance threshold, and if so, increasing the confidence level of the search word.
6. The method according to claim 1, characterized in that, after the search word is identified as a proper noun, the method further comprises:
forming a proper noun set from the search words identified as proper nouns, and dynamically adjusting the proper noun set according to the user's subsequent query results.
7. The method according to claim 1, characterized in that dynamically adjusting the proper noun set according to the user's subsequent query results is specifically:
when a user queries with a search word in the proper noun set but obtains no matching result, reducing the confidence level of the search word;
when the confidence level is lower than the predetermined threshold, identifying the search word as a non-proper noun and deleting it from the proper noun set.
8. A device for recognizing a proper noun, characterized in that the device comprises:
an acquisition module, configured to obtain a search word input by a user and characteristic information of the user;
a query processing module, configured to perform a query using the search word and to determine a confidence level of the search word according to the query result and the characteristic information;
a comprehensive judge module, configured to judge whether the confidence level is greater than a predetermined threshold and whether the search word meets a predefined title rule, and if so, to identify the search word as a proper noun.
9. The device according to claim 8, characterized in that the acquisition module comprises:
a second obtaining submodule, configured to obtain the search pattern selected by the user when the user inputs the search word;
a third obtaining submodule, configured to obtain the map-area operation mode selected by the user before or after the user inputs the search word; or
a fourth obtaining submodule, configured to obtain the geographic position of the user.
10. The device according to claim 9, characterized in that the query processing module comprises:
a first processing submodule, configured to judge whether the search pattern obtained by the second obtaining submodule is a predefined search pattern and whether a matching result can be obtained by querying with the search word under the search pattern, and if so, to increase the confidence level of the search word.
11. The device according to claim 9, characterized in that the query processing module comprises:
a second processing submodule, configured to judge whether the third obtaining submodule has obtained an action of the map-area operation mode and whether, after the user completes the action of the map-area operation mode, a matching result can be obtained by querying with the search word within the range of the current map area, and if so, to increase the confidence level of the search word.
12. The device according to claim 9, characterized in that the query processing module comprises:
a third processing submodule, configured to judge whether, using the search word, a matching result can be found whose distance from the position information is within a preset distance threshold, and if so, to increase the confidence level of the search word.
13. The device according to claim 8, characterized in that the device further comprises:
an adjusting module, configured to dynamically adjust, according to the user's subsequent query results, the proper noun set formed from the search words that the comprehensive judge module has identified as proper nouns.
14. The device according to claim 8, characterized in that the adjusting module is specifically configured to, when a user queries with a search word in the proper noun set but obtains no matching result, identify the search word as a non-proper noun and delete it from the proper noun set.
CN201210376874.8A 2012-09-29 2012-09-29 A kind of recognition methods of proper noun and device Active CN103714081B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210376874.8A CN103714081B (en) 2012-09-29 2012-09-29 A kind of recognition methods of proper noun and device

Publications (2)

Publication Number Publication Date
CN103714081A (en) 2014-04-09
CN103714081B (en) 2018-10-16

Family

ID=50407067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210376874.8A Active CN103714081B (en) 2012-09-29 2012-09-29 A kind of recognition methods of proper noun and device

Country Status (1)

Country Link
CN (1) CN103714081B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030061211A1 (en) * 2000-06-30 2003-03-27 Shultz Troy L. GIS based search engine
CN101350012A (en) * 2007-07-18 2009-01-21 北京灵图软件技术有限公司 Method and system for matching address
CN101840406A (en) * 2009-03-20 2010-09-22 富士通株式会社 Place name searching device and system
CN101876975A (en) * 2009-11-04 2010-11-03 中国科学院声学研究所 Identification method of Chinese place name
CN101957819A (en) * 2009-07-21 2011-01-26 北京大学 Place name searching method and system based on context

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103970911A (en) * 2014-05-28 2014-08-06 苏州数字地图网络科技有限公司 Intelligent word segmentation querying method based on geographical name bank and intelligent word segmentation querying system based on geographical name bank
CN104462531A (en) * 2014-12-23 2015-03-25 北京奇虎科技有限公司 Method and system for determining whether search term invokes map interface
CN104537041A (en) * 2014-12-23 2015-04-22 北京奇虎科技有限公司 Method and system for determining whether map interface is called or not based on user search term
CN104537041B (en) * 2014-12-23 2018-05-04 北京奇虎科技有限公司 A kind of definite user's query word whether the method and system of invocation map interface
CN107491489A (en) * 2017-07-18 2017-12-19 深圳天珑无线科技有限公司 A kind of map search method, apparatus and computer-readable recording medium
CN110377686A (en) * 2019-07-04 2019-10-25 浙江大学 A kind of address information Feature Extraction Method based on deep neural network model
CN110377686B (en) * 2019-07-04 2021-09-17 浙江大学 Address information feature extraction method based on deep neural network model
CN112966511A (en) * 2021-02-08 2021-06-15 广州探迹科技有限公司 Entity word recognition method and device
CN112966511B (en) * 2021-02-08 2024-03-15 广州探迹科技有限公司 Entity word recognition method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant