CN105786922B - Method and device for determining missing electronic map data - Google Patents


Info

Publication number: CN105786922B
Authority: CN (China)
Prior art keywords: address, coded, missing, electronic map, map data
Legal status: Active
Application number: CN201410830042.8A
Other languages: Chinese (zh)
Other versions: CN105786922A (en)
Inventors: 杨自华, 张文斗
Current Assignee: Alibaba China Co Ltd
Original Assignee: Autonavi Software Co Ltd
Application filed by: Autonavi Software Co Ltd
Priority: CN201410830042.8A
Publications: CN105786922A (application), CN105786922B (grant)

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Navigation (AREA)

Abstract

The invention discloses a method and device for determining missing electronic map data. The method comprises: when geocoding of an address to be coded fails, determining the address to be coded as missing electronic map data and storing it in a preset missing database; then recombining the address segments of the address to be coded according to their positional order in that address to obtain at least one new address containing a part of those address segments, geocoding each new address, and determining the new address as missing electronic map data if its geocoding fails. Because the missing electronic map data is determined by means of geocoding, the scheme provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be improved in a targeted manner and the success rate of subsequent geocoding is increased.

Description

Method and device for determining missing electronic map data
Technical Field
The invention relates to the technical field of navigation electronic maps, and in particular to a method and device for determining missing electronic map data.
Background
Geocoding, also known as address matching, refers to assigning codes that identify the location and attributes of points, lines, and polygons. Through geocoding, Chinese address description information or place name description information can be converted into a specific coordinate point on the earth's surface. This is implemented as follows: the address or place name description information is matched against the standard map data in an electronic map database, and when a match is found, the coordinate point of the matched standard map data is taken as the coordinate point of the description information. During positioning or searching in an electronic map, the address description information input by a user often needs to be converted into a specific coordinate point through geocoding. However, geocoding fails when the address description information input by the user contains errors, or when the described address is a new address that has not yet been added to the electronic map data.
At present, when geocoding of the address information input by a user succeeds, the coordinate point obtained by geocoding is returned directly; when geocoding fails, the failure is returned directly. Address information input by a user is likely to describe location points that users will query again later, so it has high value. If such address information is not processed (for example, by collecting its longitude and latitude coordinates), later geocoding of the same address information will still fail, which hinders both the enrichment of the electronic map database and subsequent geocoding.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for determining missing electronic map data, which determine the missing electronic map data through geocoding and provide a basis for subsequently perfecting an electronic map database.
A method for determining missing electronic map data comprises the following steps:
receiving a geocoding request carrying an address to be coded;
performing word segmentation on the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
geocoding an address fragment corresponding to the address to be coded, determining the address to be coded as missing electronic map data when the geocoding fails, and storing the missing electronic map data in a preset missing database;
and recombining the address segments corresponding to the address to be coded according to the position sequence of the address segments in the address to be coded to obtain at least one new address containing a part of the address segments corresponding to the address to be coded, geocoding the new address, and determining the new address as missing electronic map data and storing the missing electronic map data in a preset missing database if the geocoding fails.
A missing electronic map data determination apparatus comprising:
the receiving module is used for receiving a geocoding request carrying an address to be coded;
the word segmentation module is used for segmenting the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
the first missing electronic map data determining module is used for carrying out geographic coding on the address fragment corresponding to the address to be coded, determining the address to be coded as missing electronic map data when the geographic coding fails, storing the missing electronic map data into a preset missing database, and triggering the second missing electronic map data determining module;
and the second missing electronic map data determining module is used for recombining the address segments corresponding to the addresses to be coded according to the position sequence of the address segments in the addresses to be coded to obtain at least one new address containing a part of the address segments corresponding to the addresses to be coded, geocoding the new address, and determining the new address as missing electronic map data and storing the missing electronic map data in a preset missing database if the geocoding fails.
The invention has the following beneficial effects:
According to the method provided by the embodiments of the invention, when a geocoding request carrying an address to be coded is received, the address to be coded is segmented into the address segments that form it. The address segments are geocoded; when the geocoding fails, the address to be coded is determined to be missing electronic map data and is stored in a preset missing database. The address segments are then recombined according to their positional order in the address to be coded to obtain at least one new address containing a part of the address segments; each new address is geocoded and, if the geocoding fails, is likewise determined to be missing electronic map data and stored in the preset missing database. In this way the missing electronic map data is determined by means of geocoding, and the stored missing data provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be improved in a targeted manner and the success rate of subsequent geocoding is increased.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a method for determining missing electronic map data according to an embodiment of the present invention;
Fig. 2 is a second flowchart of a method for determining missing electronic map data according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a device for determining missing electronic map data according to an embodiment of the present invention.
Detailed Description
To achieve the purpose of the present invention, embodiments of the present invention provide a method and a device for determining missing electronic map data. When a geocoding request carrying an address to be coded is received, the address to be coded is segmented into the address segments that form it. The address segments are geocoded; when the geocoding fails, the address to be coded is determined to be missing electronic map data and is stored in a preset missing database. The address segments are then recombined according to their positional order in the address to be coded to obtain at least one new address containing a part of the address segments; each new address is geocoded and, if the geocoding fails, is likewise determined to be missing electronic map data and stored in the preset missing database. In this way the missing electronic map data is determined by means of geocoding, and the stored missing data provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be improved in a targeted manner and the success rate of subsequent geocoding is increased.
Various embodiments of the present invention are described in further detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment one:
Fig. 1 is a flowchart of a method for determining missing electronic map data according to an embodiment of the present invention. The method includes:
Step 101: receiving a geocoding request carrying an address to be coded.
The geocoding request includes the address to be coded.
Preferably, to improve geocoding efficiency and balance the load on the geocoding system, the geocoding system in the embodiment of the present invention is provided with a plurality of independent geocoding servers with the same function. When the geocoding system receives a plurality of geocoding requests, it either distributes them evenly among the geocoding servers for parallel processing, or sends each request to a geocoding server that is currently idle.
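The two distribution strategies can be sketched as follows. This is a minimal illustration only: the patent does not describe how requests are dispatched, so the `GeocodingServer` and `Dispatcher` classes, the round-robin policy used for "even" distribution, and the `busy` flag are all assumptions.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import List, Optional


@dataclass
class GeocodingServer:
    """Hypothetical stand-in for one independent geocoding server."""
    name: str
    busy: bool = False
    handled: List[str] = field(default_factory=list)


class Dispatcher:
    """Hypothetical front end that hands requests to the servers."""

    def __init__(self, servers: List[GeocodingServer]) -> None:
        self.servers = servers
        self._round_robin = cycle(servers)

    def dispatch_evenly(self, address: str) -> GeocodingServer:
        # Strategy 1: spread requests uniformly (here: round-robin).
        server = next(self._round_robin)
        server.handled.append(address)
        return server

    def dispatch_to_idle(self, address: str) -> Optional[GeocodingServer]:
        # Strategy 2: give the request to the first idle server, if any.
        for server in self.servers:
            if not server.busy:
                server.handled.append(address)
                return server
        return None  # every server is busy; the caller decides what to do
```

Round-robin is only one way to distribute requests evenly; a real system might weight servers by capacity or queue length.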
Step 102: performing word segmentation on the address to be coded in the geocoding request to obtain the address segments forming the address to be coded.
Preferably, since the address to be coded is typically information entered by a user, it may not be standardized, for example "at Beijing XX street XXXXX building" or "near Beijing Western monograph mall". To normalize the user input, the method further includes the following step before the word segmentation of step 102: determining the invalid words in the address to be coded according to a preset invalid word lexicon, and deleting them from the address to be coded. Step 102 then performs word segmentation on the address to be coded from which the invalid words have been deleted.
Preferably, after the invalid words are deleted, normalization operations are further performed on the address to be coded, such as unifying the written form of numerals and unifying letter case.
It should be noted that an invalid word in the embodiment of the present invention is a word that is associated with an address but cannot determine one, for example "home", "find hotel", or "where". In practice, if the address to be coded in the geocoding request is found, according to the preset invalid word lexicon, to contain invalid words such as "go home" or "find a hotel", these words cannot specify a concrete address and cannot be recognized by the geocoding server, so they are deleted from the address to be coded.
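The pre-processing described above might look like the sketch below. The lexicon entries are illustrative Chinese renderings of the examples in the text, not the patent's actual lexicon, and interpreting "unifying numerals" as folding full-width digits to ASCII via Unicode NFKC normalization is an assumption.

```python
import unicodedata
from typing import Iterable

# Illustrative entries only ("find hotel", "go home", "where");
# the patent does not list the contents of the invalid word lexicon.
INVALID_WORDS = ("找酒店", "回家", "在哪里")


def remove_invalid_words(address: str,
                         invalid_words: Iterable[str] = INVALID_WORDS) -> str:
    """Delete every invalid word found in the address."""
    # Longer entries first, so a short entry cannot split a longer one.
    for word in sorted(invalid_words, key=len, reverse=True):
        address = address.replace(word, "")
    return address


def normalize(address: str) -> str:
    """Unify numeral form and letter case (assumed interpretation)."""
    # NFKC folds full-width digits and letters to their ASCII forms;
    # upper() then unifies letter case.
    return unicodedata.normalize("NFKC", address).upper().strip()


def preprocess(address: str) -> str:
    """Run both clean-up steps before word segmentation."""
    return normalize(remove_invalid_words(address))
```

For instance, `preprocess("苏州街１０号a座在哪里")` removes the trailing question words and returns the address with ASCII digits and an upper-case letter.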
Preferably, the address to be coded input by the user may be a standard address or a non-standard address. A non-standard address may include administrative district names, road names, residential community (cell) names and the like, but also contains fuzzy words. For example, the address to be coded may be descriptive, such as "opposite No. 10 Suzhou Street, Haidian District, Beijing", or ambiguous, such as "near a high-tech building in Haidian District, Beijing". A standard address is a phrase that accurately determines an address position and consists of a combination of segments such as administrative district names, road names and residential community names, for example: xx city, xx district, xx road, No. xx courtyard, building xx. In the embodiment of the invention, different word segmentation modes are used for standard and non-standard addresses, so that each address to be coded is segmented in the more suitable mode and the segmentation is more accurate. Therefore, before step 102 performs word segmentation on the address to be coded, the method may further include the following step:
matching the address to be coded against a preset non-standard address lexicon; if the matching succeeds, determining that the address to be coded is a non-standard address, and if the matching fails, determining that it is a standard address.
In the embodiment of the invention, the non-standard address lexicon contains fuzzy words such as 'nearby', 'opposite' and 'beside'. If any word in the non-standard address lexicon appears in the address to be coded, the address to be coded matches the non-standard address lexicon.
In this case, performing word segmentation on the address to be coded in step 102 specifically includes:
when the address to be coded is a non-standard address, performing word segmentation on it according to a preset general word segmentation lexicon;
when the address to be coded is a standard address, performing word segmentation on it according to a preset standard word segmentation lexicon.
In the embodiment of the invention, the general word segmentation lexicon is a preset lexicon containing the words (such as technical terms and specialized nouns) of all industries; the standard word segmentation lexicon is a preset lexicon of words related to the geographic information industry.
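The choice of lexicon can be sketched as follows. The patent does not name a segmentation algorithm, so forward maximum matching is used here purely as an assumption; the lexicon contents and the function names are likewise hypothetical.

```python
from typing import List, Sequence

# Fuzzy words from the text: nearby, opposite, beside (Chinese renderings assumed).
FUZZY_WORDS = ("附近", "对面", "旁边")

# Tiny illustrative lexicons; real ones would be far larger.
STANDARD_LEXICON = ("北京市", "海淀区", "苏州街", "10号")
GENERAL_LEXICON = STANDARD_LEXICON + ("高科技大厦",) + FUZZY_WORDS


def is_non_standard(address: str,
                    fuzzy_words: Sequence[str] = FUZZY_WORDS) -> bool:
    """An address is non-standard if it contains any fuzzy word."""
    return any(word in address for word in fuzzy_words)


def segment(address: str, lexicon: Sequence[str]) -> List[str]:
    """Forward maximum matching; unknown characters become 1-char segments."""
    max_len = max(map(len, lexicon))
    segments: List[str] = []
    i = 0
    while i < len(address):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(address) - i), 0, -1):
            piece = address[i:i + size]
            if size == 1 or piece in lexicon:
                segments.append(piece)
                i += size
                break
    return segments


def segment_address(address: str) -> List[str]:
    """Pick the lexicon by address kind, then segment."""
    lexicon = GENERAL_LEXICON if is_non_standard(address) else STANDARD_LEXICON
    return segment(address, lexicon)
```

Whether fuzzy words are kept as segments or dropped before geocoding is not stated in the text; this sketch keeps them.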
Step 103: geocoding the address segments corresponding to the address to be coded.
Step 104: when the geocoding fails, determining the address to be coded as missing electronic map data, and storing it in a preset missing database.
Step 105: recombining the address segments corresponding to the address to be coded according to their positional order in the address to be coded to obtain at least one new address containing a part of the address segments, geocoding the new address, and, if the geocoding fails, determining the new address as missing electronic map data and storing it in the preset missing database.
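Steps 103 and 104 can be pictured with the sketch below. It assumes, for illustration only, that the electronic map database is a dictionary from standard address strings to coordinates and that the missing database is a counter, so that the missing frequency used later is available; the coordinate value is a placeholder, not real data.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Coordinate = Tuple[float, float]

# Hypothetical electronic map database: standard address -> (lng, lat).
# The coordinate is a made-up placeholder.
MAP_DB: Dict[str, Coordinate] = {"北京市海淀区苏州街": (116.0, 40.0)}


def geocode(segments: List[str],
            map_db: Dict[str, Coordinate]) -> Optional[Coordinate]:
    """Step 103: match the joined segments against the map database."""
    return map_db.get("".join(segments))


def geocode_or_record(segments: List[str],
                      map_db: Dict[str, Coordinate],
                      missing_db: Counter) -> Optional[Coordinate]:
    """Step 104: on failure, record the address in the missing database."""
    coordinate = geocode(segments, map_db)
    if coordinate is None:
        # Counting occurrences keeps a missing frequency per address.
        missing_db["".join(segments)] += 1
    return coordinate
```

A production geocoder would match far more loosely than an exact dictionary lookup; the exact match here only serves to make success and failure easy to see.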
The foregoing step 105 can be implemented in the following two ways:
Mode 1: starting from the first address segment of the address to be coded, traverse the address segments in order and perform the following operations each time an address segment is traversed, until the last address segment of the address to be coded has been traversed: combine the currently traversed address segment with the previously traversed address segments, in their positional order in the address to be coded, into a new address; geocode the new address and, if the geocoding fails, determine the new address as missing electronic map data and store it in the preset missing database.
For example, the address to be coded C1C2C3C4 consists, in order, of the address segments C1, C2, C3 and C4, which are traversed from the first segment: on traversing C1, C1 is geocoded as a new address and, if the geocoding fails, C1 is taken as missing electronic map data; on traversing C2, C1C2 is geocoded as a new address and, if the geocoding fails, C1C2 is taken as missing electronic map data; on traversing C3, C1C2C3 is geocoded as a new address and, if the geocoding fails, C1C2C3 is taken as missing electronic map data.
Preferably, since an address segment located earlier in the address to be coded has a higher administrative level than one located later, the embodiment of the present invention may instead proceed as follows: traverse from the first address segment of the address to be coded; on traversing C1, geocode C1 as a new address; if the geocoding fails, take C1 as missing electronic map data and directly determine the new addresses C1C2 and C1C2C3 as missing electronic map data as well; if the geocoding succeeds, continue to C2 and geocode C1C2; if that geocoding fails, take C1C2 as missing electronic map data and directly determine C1C2C3 as missing electronic map data; if it succeeds, continue to C3.
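Mode 1 and its preferred variant can be sketched as one function. The names `missing_prefixes`, `geocode_ok` and `short_circuit` are hypothetical, and following the worked example the full address is not re-tested, since step 104 has already recorded it.

```python
from typing import Callable, List, Sequence


def missing_prefixes(segments: Sequence[str],
                     geocode_ok: Callable[[str], bool],
                     short_circuit: bool = False) -> List[str]:
    """Return the prefix addresses that fail geocoding (Mode 1).

    geocode_ok is a hypothetical callback returning True when an
    address geocodes successfully. With short_circuit=True, the first
    failure marks every longer prefix as missing without geocoding it.
    """
    # C1, C1C2, C1C2C3 for segments C1..C4 (the full address is excluded).
    prefixes = ["".join(segments[:n]) for n in range(1, len(segments))]
    missing: List[str] = []
    for index, prefix in enumerate(prefixes):
        if geocode_ok(prefix):
            continue
        if short_circuit:
            # A higher-level prefix is unknown, so longer ones must be too.
            missing.extend(prefixes[index:])
            break
        missing.append(prefix)
    return missing
```

The short-circuit variant saves geocoding calls: once a higher administrative level is missing, the more specific addresses beneath it are assumed to be missing too.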
Mode 2: starting from the last address segment of the address to be coded, remove address segments one at a time in reverse order, performing the following operations each time an address segment is removed, until the first address segment of the address to be coded is reached: take the address formed by the remaining address segments as a new address; geocode the new address and, if the geocoding fails, determine the new address as missing electronic map data and store it in the preset missing database.
For example, the address to be coded C1C2C3C4 consists, in order, of the address segments C1, C2, C3 and C4, and its last address segment is removed step by step: delete the last address segment C4 of C1C2C3C4 to obtain the new address C1C2C3 and geocode it; if the geocoding fails, delete the last address segment C3 of C1C2C3 to obtain the new address C1C2 and geocode it; if that geocoding fails, delete the last address segment C2 to obtain the new address C1 and geocode C1, and so on.
Preferably, the embodiment of the present invention may also proceed as follows: delete the last address segment C4 of C1C2C3C4 to obtain the new address C1C2C3 and geocode it; if the geocoding succeeds, end the flow; if it fails, delete the last address segment C3 of C1C2C3 to obtain the new address C1C2 and geocode it; if that geocoding succeeds, end the flow; if it fails, delete the last address segment C2 to obtain the new address C1 and geocode C1, and so on.
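The preferred form of Mode 2, which stops at the first successful geocoding, can be sketched as below. As before, `geocode_ok` is a hypothetical callback and the function name is an assumption.

```python
from typing import Callable, List, Sequence


def missing_by_trimming(segments: Sequence[str],
                        geocode_ok: Callable[[str], bool]) -> List[str]:
    """Drop trailing segments one by one until an address geocodes (Mode 2)."""
    missing: List[str] = []
    # n counts the segments kept: len-1 first, then down to 1.
    for n in range(len(segments) - 1, 0, -1):
        candidate = "".join(segments[:n])
        if geocode_ok(candidate):
            break  # a known address was reached; the flow ends here
        missing.append(candidate)
    return missing
```

Compared with Mode 1, this order tests the most specific candidates first, which is cheap when only the last one or two segments are unknown.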
Preferably, in the embodiment of the present invention, after the address to be coded has been determined to be missing electronic map data and further missing electronic map data has been determined from it, a parent-child relationship is established among the missing electronic map data. For example, if the address to be coded is C1C2C3C4 and C1C2, C1C2C3 and C1C2C3C4 are all determined to be missing electronic map data, then, because a new address containing fewer address segments covers a wider geographic area, the parent-child relationship is established as follows: C1C2 is the parent node, C1C2C3 is a child node of C1C2, and C1C2C3C4 is a child node of C1C2C3.
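One way to represent the parent-child relationship is a mapping from each missing address to its parent, as in the sketch below. The data structure and the rule "parent = longest missing address that is a proper prefix" are assumptions consistent with the C1C2 example, not something the patent specifies.

```python
from typing import Dict, Iterable, Optional, Sequence


def build_parent_links(
        missing_addresses: Iterable[Sequence[str]]) -> Dict[str, Optional[str]]:
    """Map each missing address to its parent missing address (or None).

    Each address is given as its sequence of segments, so that prefixes
    are compared segment by segment rather than character by character.
    """
    known = {tuple(address) for address in missing_addresses}
    parents: Dict[str, Optional[str]] = {}
    for address in known:
        parent = None
        # Look for the longest proper prefix that is itself missing.
        for n in range(len(address) - 1, 0, -1):
            if address[:n] in known:
                parent = "".join(address[:n])
                break
        parents["".join(address)] = parent
    return parents
```

With the addresses from the example, `C1C2` maps to `None` (it is the root), `C1C2C3` maps to `C1C2`, and `C1C2C3C4` maps to `C1C2C3`.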
Preferably, to provide a better basis for improving the electronic map database, so that data collection personnel can supplement the more important missing electronic map data first, the method flow shown in fig. 1 may further include the following steps 106 to 108, as shown in fig. 2:
Step 106: determining the data missing type of the missing electronic map data.
The data missing types comprise: administrative district type, road type, residential community (cell) type, house number type, and building number type. In descending order of priority they are: administrative district type, road type, residential community (cell) type, house number type, building number type.
In step 106, the data missing type of the missing electronic map data may specifically be determined as follows: determine the type of the last address segment of the missing electronic map data, compare it with the data missing types, and take the data missing type that it matches as the data missing type of the missing electronic map data. For example, if the last address segment of the missing electronic map data is an administrative district, as in "xx city xx district", the data missing type is the administrative district type; if the missing electronic map data is "xx city xx district xx road", the data missing type is the road type; if it is "xx city xx district xx road xx cell", the data missing type is the residential community (cell) type; if it is "xx city xx district xx road xx cell No. xx", the data missing type is the house number type.
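Step 106 can be sketched as a lookup on the last address segment. The patent does not say how the type of a segment is judged, so the Chinese suffix rules below are purely an illustrative assumption, as are the enum and function names; the numeric values encode the priority order given above.

```python
from enum import IntEnum
from typing import Optional, Sequence


class MissingType(IntEnum):
    """Data missing types; a lower value means a higher priority."""
    ADMINISTRATIVE_DISTRICT = 1
    ROAD = 2
    RESIDENTIAL_COMMUNITY = 3
    HOUSE_NUMBER = 4
    BUILDING_NUMBER = 5


# Hypothetical suffix rules, checked in order. "小区" (community) must be
# tested before "区" (district) because it ends with the same character.
SUFFIX_RULES = (
    ("号楼", MissingType.BUILDING_NUMBER),
    ("小区", MissingType.RESIDENTIAL_COMMUNITY),
    ("号", MissingType.HOUSE_NUMBER),
    ("路", MissingType.ROAD),
    ("街", MissingType.ROAD),
    ("市", MissingType.ADMINISTRATIVE_DISTRICT),
    ("区", MissingType.ADMINISTRATIVE_DISTRICT),
)


def missing_type(segments: Sequence[str]) -> Optional[MissingType]:
    """Classify missing data by the type of its last address segment."""
    last = segments[-1]
    for suffix, kind in SUFFIX_RULES:
        if last.endswith(suffix):
            return kind
    return None  # the sketch cannot classify this segment
```

A real system would more likely take the segment type from the word segmentation lexicon than from suffixes, which misclassify names that merely happen to end in one of these characters.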
Step 107: determining the importance of the missing electronic map data according to the priority of its data missing type and its missing frequency.
The higher the priority of the data missing type of the missing electronic map data, the greater its importance; the more frequently the electronic map data is found missing, the greater its importance.
In step 107, the importance may specifically be determined as follows: missing electronic map data whose data missing type has a higher priority has a higher importance; among missing electronic map data of the same data missing type, a higher missing frequency gives a higher importance.
Step 108: sorting the missing electronic map data in the missing database in descending order of importance, and storing the sorted missing electronic map data in a preset collection database.
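Steps 107 and 108 amount to a two-key sort, sketched below. The `MissingRecord` type and its field names are hypothetical; the ordering rule (type priority first, missing frequency as the tie-breaker) follows the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MissingRecord:
    """Hypothetical row of the missing database."""
    address: str
    type_priority: int  # 1 = highest-priority data missing type
    frequency: int      # how often this address failed geocoding


def rank_by_importance(records: Iterable[MissingRecord]) -> List[MissingRecord]:
    """Most important first: type priority, then missing frequency."""
    return sorted(records, key=lambda r: (r.type_priority, -r.frequency))
```

The sorted list is what would be written to the collection database, so that data collection personnel see the most important gaps first.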
With the scheme of embodiment one of the invention, when a geocoding request carrying an address to be coded is received, the address to be coded is segmented into the address segments that form it. The address segments are geocoded; when the geocoding fails, the address to be coded is determined to be missing electronic map data and is stored in a preset missing database. The address segments are then recombined according to their positional order in the address to be coded to obtain at least one new address containing a part of the address segments; each new address is geocoded and, if the geocoding fails, is likewise determined to be missing electronic map data and stored in the preset missing database. In this way the missing electronic map data is determined by means of geocoding, and the stored missing data provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be improved in a targeted manner and the success rate of subsequent geocoding is increased.
Embodiment two:
Fig. 3 is a schematic structural diagram of a device for determining missing electronic map data according to embodiment two of the present invention. The device comprises: a receiving module 31, a word segmentation module 32, a first missing electronic map data determination module 33, and a second missing electronic map data determination module 34, wherein:
a receiving module 31, configured to receive a geocoding request carrying an address to be coded;
a word segmentation module 32, configured to perform word segmentation on the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
the first missing electronic map data determining module 33 is configured to perform geocoding on the address segment corresponding to the address to be coded, determine the address to be coded as missing electronic map data when the geocoding fails, store the missing electronic map data in a preset missing database, and trigger the second missing electronic map data determining module 34;
and a second missing electronic map data determining module 34, configured to recombine the address segments corresponding to the addresses to be coded according to the position sequence of the address segments in the addresses to be coded, obtain at least one new address including a partial address segment corresponding to the addresses to be coded, geocode the new address, determine the new address as missing electronic map data if the geocoding fails, and store the missing electronic map data in a preset missing database.
Optionally, the determining device further includes: a deleting module 35, wherein:
a deleting module 35, configured to determine an invalid word in the address to be encoded according to a preset invalid word bank before the word segmentation module 32 performs word segmentation on the address to be encoded in the geocoding request, and delete the invalid word in the address to be encoded; and triggering the word segmentation module 32 aiming at the address to be coded after the invalid word is deleted.
Optionally, the determining device further includes: a matching module 36, wherein:
the matching module 36 is configured to match the address to be coded with a preset non-standard address lexicon before the word segmentation module 32 performs word segmentation on the address to be coded in the geocoding request, determine that the address to be coded is a non-standard address if matching is successful, and determine that the address to be coded is a standard address if matching is failed;
the word segmentation module 32 is specifically configured to: when the address to be coded is a non-standard address, performing word segmentation on the address to be coded according to a preset general word segmentation word bank; and when the address to be coded is a standard address, performing word segmentation on the address to be coded according to a preset standard word segmentation word bank.
Specifically, the second missing electronic map data determining module 34 is specifically configured to:
traversing the address fragment of the address to be coded from the first address fragment corresponding to the address to be coded, and combining the traversed address fragment and the traversed address fragment into a new address according to the position sequence in the address to be coded when traversing one address fragment; geocoding the new address, determining the new address as missing electronic map data if the geocoding fails, and storing the missing electronic map data in a preset missing database until the last address segment of the address to be coded is traversed;
or,
remove address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and, each time an address fragment is removed, take the address formed by the remaining address fragments as a new address; geocode the new address and, if the geocoding fails, determine the new address as missing electronic map data and store it in a preset missing database, until the removal reaches the first address fragment of the address to be coded.
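The two recombination strategies of module 34 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function names, the stub geocoder, and its placeholder coordinates are hypothetical. It also assumes that the full address is excluded from the candidates, since the first determining module has already geocoded it and a new address is described as containing only part of the fragments.

```python
from typing import Callable, List, Optional, Tuple

Geocoder = Callable[[str], Optional[Tuple[float, float]]]


def forward_prefixes(fragments: List[str]) -> List[str]:
    """Forward traversal: first fragment, first two fragments, and so on."""
    return ["".join(fragments[:k]) for k in range(1, len(fragments))]


def reverse_decrements(fragments: List[str]) -> List[str]:
    """Reverse decrement: drop the last fragment, then the last two, and so on."""
    return ["".join(fragments[:k]) for k in range(len(fragments) - 1, 0, -1)]


def find_missing(candidates: List[str], geocode: Geocoder) -> List[str]:
    """Return the candidate addresses for which geocoding fails."""
    return [address for address in candidates if geocode(address) is None]


# Stub geocoder: only these addresses are "known"; coordinates are placeholders.
KNOWN = {"北京市": (0.0, 0.0), "北京市朝阳区": (0.0, 0.0)}


def stub_geocode(address: str) -> Optional[Tuple[float, float]]:
    return KNOWN.get(address)


fragments = ["北京市", "朝阳区", "望京街", "10号"]
missing_forward = find_missing(forward_prefixes(fragments), stub_geocode)
missing_reverse = find_missing(reverse_decrements(fragments), stub_geocode)
```

Both strategies generate the same set of prefix addresses and differ only in the order in which they are geocoded.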
Optionally, the apparatus may further comprise: a data missing type determining module 37, an importance determining module 38, and a sorting module 39, wherein:
a data missing type determining module 37, configured to determine a data missing type of the missing electronic map data;
the importance determining module 38 is configured to determine the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data;
and the sorting module 39 is configured to sort the missing electronic map data in the missing database according to the order of importance from high to low, and store the sorted missing electronic map data in the missing database into a preset collection database.
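The ranking performed by modules 37 to 39 might look like the sketch below. The passage states only that importance depends on the priority of the data missing type and on the missing frequency; the product of the two used here, the type names, and the priority weights are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MissingRecord:
    address: str
    missing_type: str  # e.g. "road", "poi", "house_number" (hypothetical types)
    frequency: int     # how often this address failed geocoding


# Hypothetical priority weights per data missing type.
TYPE_PRIORITY: Dict[str, int] = {"road": 3, "poi": 2, "house_number": 1}


def importance(record: MissingRecord) -> int:
    """Assumed scoring rule: type priority multiplied by missing frequency."""
    return TYPE_PRIORITY.get(record.missing_type, 0) * record.frequency


def rank_for_collection(records: List[MissingRecord]) -> List[MissingRecord]:
    """Sort missing records from highest to lowest importance."""
    return sorted(records, key=importance, reverse=True)


records = [
    MissingRecord("A", "road", 5),           # importance 15
    MissingRecord("B", "poi", 10),           # importance 20
    MissingRecord("C", "house_number", 12),  # importance 12
]
ranked = rank_for_collection(records)
```

The ranked list corresponds to what would be written to the preset collection database.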
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus (device), or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for determining missing electronic map data, comprising:
receiving a geocoding request carrying an address to be coded;
performing word segmentation on the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
geocoding an address fragment corresponding to the address to be coded, determining the address to be coded as missing electronic map data when the geocoding fails, and storing the missing electronic map data in a preset missing database;
and recombining the address segments corresponding to the address to be coded according to the position sequence of the address segments in the address to be coded to obtain at least one new address containing a part of the address segments corresponding to the address to be coded, geocoding the new address, and determining the new address as missing electronic map data and storing the missing electronic map data in a preset missing database if the geocoding fails.
2. The method of claim 1, wherein prior to tokenizing the address to be coded in the geocoding request, further comprising:
determining an invalid word in the address to be coded according to a preset invalid word library, and deleting the invalid word in the address to be coded;
and aiming at the address to be coded after the invalid word is deleted, performing the step of segmenting the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded.
3. The method of claim 1, wherein prior to tokenizing the address to be coded in the geocoding request, further comprising:
matching the address to be coded against a preset non-standard address lexicon; if the matching succeeds, determining that the address to be coded is a non-standard address, and if the matching fails, determining that the address to be coded is a standard address;
the word segmentation of the address to be coded in the geocoding request specifically includes:
when the address to be coded is a non-standard address, performing word segmentation on the address to be coded according to a preset general word segmentation word bank;
and when the address to be coded is a standard address, performing word segmentation on the address to be coded according to a preset standard word segmentation word bank.
4. The method according to any one of claims 1 to 3, wherein recombining the address fragments corresponding to the address to be coded according to the position order of the address fragments in the address to be coded to obtain at least one new address containing a part of the address fragments corresponding to the address to be coded, geocoding the new address, and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, specifically comprises:
traversing the address fragments of the address to be coded, starting from the first address fragment corresponding to the address to be coded, and, each time an address fragment is traversed, combining the currently traversed address fragment and the previously traversed address fragments into a new address according to their position order in the address to be coded; geocoding the new address and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, until the last address fragment of the address to be coded is traversed;
or,
removing address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and, each time an address fragment is removed, taking the address formed by the remaining address fragments as a new address; geocoding the new address and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, until the removal reaches the first address fragment of the address to be coded.
5. The method of any of claims 1 to 3, further comprising:
determining a data missing type of the missing electronic map data;
determining the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data;
and sequencing the missing electronic map data in the missing database according to the sequence of the importance degrees from high to low, and storing the sequenced missing electronic map data in the missing database into a preset acquisition database.
6. A missing electronic map data determination apparatus, comprising:
the receiving module is used for receiving a geocoding request carrying an address to be coded;
the word segmentation module is used for segmenting the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
the first missing electronic map data determining module is used for carrying out geographic coding on the address fragment corresponding to the address to be coded, determining the address to be coded as missing electronic map data when the geographic coding fails, storing the missing electronic map data into a preset missing database, and triggering the second missing electronic map data determining module;
and the second missing electronic map data determining module is used for recombining the address segments corresponding to the addresses to be coded according to the position sequence of the address segments in the addresses to be coded to obtain at least one new address containing a part of the address segments corresponding to the addresses to be coded, geocoding the new address, and determining the new address as missing electronic map data and storing the missing electronic map data in a preset missing database if the geocoding fails.
7. The determination device of claim 6, wherein the determination device further comprises: a deletion module, wherein:
the deleting module is used for determining the invalid words in the address to be coded according to a preset invalid word bank and deleting the invalid words in the address to be coded before the word segmentation module carries out word segmentation on the address to be coded in the geocoding request; and triggering the word segmentation module aiming at the address to be coded after the invalid word is deleted.
8. The determination device of claim 6, wherein the determination device further comprises: a matching module, wherein:
the matching module is used for matching the address to be coded against a preset non-standard address lexicon before the word segmentation module performs word segmentation on the address to be coded in the geocoding request, determining the address to be coded as a non-standard address if the matching succeeds, and determining the address to be coded as a standard address if the matching fails;
the word segmentation module is specifically configured to: when the address to be coded is a non-standard address, performing word segmentation on the address to be coded according to a preset general word segmentation word bank; and when the address to be coded is a standard address, performing word segmentation on the address to be coded according to a preset standard word segmentation word bank.
9. The determination device according to any one of claims 6 to 8,
the second missing electronic map data determination module is specifically configured to:
traversing the address fragments of the address to be coded, starting from the first address fragment corresponding to the address to be coded, and, each time an address fragment is traversed, combining the currently traversed address fragment and the previously traversed address fragments into a new address according to their position order in the address to be coded; geocoding the new address and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, until the last address fragment of the address to be coded is traversed;
or,
removing address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and, each time an address fragment is removed, taking the address formed by the remaining address fragments as a new address; geocoding the new address and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, until the removal reaches the first address fragment of the address to be coded.
10. The determination device according to any one of claims 6 to 8, further comprising:
the data missing type determining module is used for determining the data missing type of the missing electronic map data;
the importance determining module is used for determining the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data;
and the sorting module is used for sorting the missing electronic map data in the missing database from high to low according to the importance degree and storing the sorted missing electronic map data in the missing database into a preset acquisition database.
CN201410830042.8A 2014-12-25 2014-12-25 Method and device for determining missing electronic map data Active CN105786922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410830042.8A CN105786922B (en) 2014-12-25 2014-12-25 Method and device for determining missing electronic map data

Publications (2)

Publication Number Publication Date
CN105786922A CN105786922A (en) 2016-07-20
CN105786922B true CN105786922B (en) 2020-02-14

Family

ID=56388913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410830042.8A Active CN105786922B (en) 2014-12-25 2014-12-25 Method and device for determining missing electronic map data

Country Status (1)

Country Link
CN (1) CN105786922B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110688851B (en) * 2019-09-26 2023-07-28 亿企赢网络科技有限公司 Method, device and medium for extracting key information of address text

Citations (4)

Publication number Priority date Publication date Assignee Title
CN101882163A (en) * 2010-06-30 2010-11-10 中国科学院地理科学与资源研究所 Fuzzy Chinese address geographic evaluation method based on matching rule
CN103699623A (en) * 2013-12-19 2014-04-02 百度在线网络技术(北京)有限公司 Geo-coding realizing method and device
CN103914544A (en) * 2014-04-03 2014-07-09 浙江大学 Method for quickly matching Chinese addresses in multi-level manner on basis of address feature words
CN104216895A (en) * 2013-05-31 2014-12-17 高德软件有限公司 Method and device for generating POI data

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US20110087695A1 (en) * 2009-10-09 2011-04-14 Verizon Patent And Licensing Inc. Apparatuses, methods and systems for a truncated postal code smart address parser
CN102446186B (en) * 2010-10-13 2016-03-30 上海众恒信息产业股份有限公司 Chinese geocoding and coding/decoding method and device
US20130046604A1 (en) * 2011-08-17 2013-02-21 Bank Of America Corporation Virtual loyalty card program
CN102567492B (en) * 2011-12-22 2013-10-30 哈尔滨工程大学 Method for sea-land vector map data integration and fusion
CN107844565B (en) * 2013-05-16 2021-07-16 阿里巴巴集团控股有限公司 Commodity searching method and device

Non-Patent Citations (1)

Title
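Rule-based Chinese address word segmentation and matching method (基于规则的中文地址分词与匹配方法); Tan Kankan (谭侃侃); China Master's Theses Full-text Database, Basic Sciences Series; 2012-06-15 (No. 6); pp. A008-29 *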

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200417

Address after: 310012 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee after: Alibaba (China) Co.,Ltd.

Address before: 102200, No. 8, No., Changsheng Road, Changping District science and Technology Park, Beijing, China. 1-5

Patentee before: AUTONAVI SOFTWARE Co.,Ltd.
