CN102789467A - Data fusion method, data fusion device and data processing system - Google Patents

Data fusion method, data fusion device and data processing system

Info

Publication number
CN102789467A
Authority
CN
China
Prior art keywords
data
input
address
title
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011101317655A
Other languages
Chinese (zh)
Inventor
张轩
王东海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN2011101317655A priority Critical patent/CN102789467A/en
Publication of CN102789467A publication Critical patent/CN102789467A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of information processing technology and provides a data fusion method, a data fusion device and a data processing system. The data fusion method includes the following steps: receiving input data; judging whether the pre-stored data contain data identical to the input data; and, when the pre-stored data contain data identical to the input data, adding the new descriptive information in the input data to the identical data. Different descriptive information belonging to the same data can thus be fused, which enriches the information of the data and improves users' satisfaction with the data they find.

Description

Data fusion method, device and data processing system
Technical field
The invention belongs to the technical field of information processing, and in particular relates to a data fusion method, a data fusion device and a data processing system.
Background
Point of interest (POI) data generally include information such as a title, category, address, and longitude and latitude. POI data can be acquired in several ways, for example by on-site collection or by collection from the Internet. Because the acquisition methods differ, records collected for the same POI may carry different descriptive information.
How can the different descriptive information collected for the same POI be merged? The key is deciding whether several collected POI records refer to the same POI. The prior art decides this by directly comparing the titles of the POI records, which has a high error rate: because the acquisition methods differ, the titles may not be identical even though the records do represent the same POI. For example:
Title 1: Quanjude (Yuquan Road)
Address 1: No. 44, Fuxing Road, Haidian District, Beijing;
Title 2: Quanjude Yuquan Road Branch
Address 2: No. 44, Fuxing Road, Haidian District, Beijing;
Although title 1 and title 2 differ, they denote the same position on the map and should therefore be treated as the same POI. In addition, because the volume of POI data is large, deciding whether records are the same POI by comparing titles pairwise takes a great deal of time, so the cost is high and the efficiency is low.
Summary of the invention
An embodiment of the invention provides a data fusion method intended to solve the problem that records of the same POI carry different descriptive information.
An embodiment of the invention is realized as a data fusion method comprising the following steps:
Receiving input data;
Judging whether the pre-stored data contain data identical to the input data;
When the pre-stored data contain data identical to the input data, adding the new descriptive information in the input data to the identical data.
Another object of the embodiments of the invention is to provide a data fusion device, the device comprising:
A data receiving unit, used to receive input data;
A judging unit, used to judge whether the pre-stored data contain data identical to the input data;
A data fusion unit, used to add the new descriptive information in the input data to the identical data when the pre-stored data contain data identical to the input data.
A further object of the embodiments of the invention is to provide a data processing system that comprises the data fusion device.
In the embodiments of the invention, it is judged whether the pre-stored data contain data identical to the input data; when such identical data exist, the new descriptive information in the input data is added to the identical data. This effectively enriches the information of the data, reduces data redundancy, and improves users' satisfaction with the data they find.
Description of drawings
Fig. 1 is a flowchart of the data fusion method provided by embodiment one of the invention;
Fig. 2 is a detailed flowchart of the identical-data judgment provided by embodiment two of the invention;
Fig. 3 is a structural diagram of the data fusion device provided by embodiment three of the invention;
Fig. 4 is a structural diagram of the judging unit provided by embodiment three of the invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
In the embodiments of the invention, it is judged whether the pre-stored data contain data identical to the input data; when such identical data exist, the new descriptive information in the input data is added to the identical data. This effectively enriches the information of the data, reduces data redundancy, and improves users' satisfaction with the data they find.
The technical solution of the invention is described below through specific embodiments.
Embodiment one:
Fig. 1 shows the flow of the data fusion method provided by embodiment one of the invention. The procedure is detailed as follows:
In step S101, input data are received.
In this embodiment, the data include, but are not limited to, point of interest data.
In step S102, it is judged whether the pre-stored data contain data identical to the input data. If the result is "yes", step S103 is executed; if the result is "no", step S104 is executed.
In this embodiment, in order to fuse the different descriptive information carried by identical records among the collected data and so enrich the data, the input data are compared with the pre-stored data as soon as they are received, to judge whether the pre-stored data contain data identical to the input data. The specific steps of this judgment are shown in Fig. 2.
In step S103, when data identical to the input data exist, the new descriptive information in the input data is added to the identical data.
In this embodiment, fusion means merging the different or new descriptive information of several records of the same data into a single record. For example:
Data 1:
Title 1: Quanjude (Yuquan Road)
Address 1: No. 44, Fuxing Road, Haidian District, Beijing
Phone 1: 12345678;
Data 2:
Title 2: Quanjude Yuquan Road Branch
Address 2: No. 44, Fuxing Road, Haidian District, Beijing
Phone 2: 87654321;
The judgment finds that data 1 and data 2 are the same data, so they are fused. The fused data are:
Title: Quanjude (Yuquan Road) or Quanjude Yuquan Road Branch
Address: No. 44, Fuxing Road, Haidian District, Beijing
Phone: 12345678 or 87654321;
Fusing the different descriptive information of identical data effectively enriches the information of the original data and improves users' satisfaction with the data they find. Moreover, the original identical records contain duplicated information; fusion removes this redundancy and saves storage space.
In step S104, when no data identical to the input data exist, the input data are stored.
In this embodiment, when no data identical to the input data exist, the input data are new data and are stored directly so that they can be compared with the next input data.
In the embodiments of the invention, it is judged whether the pre-stored data contain data identical to the input data; when such identical data exist, the new descriptive information in the input data is added to the identical data. This effectively enriches the information of the data and improves users' satisfaction with the data they find, while also reducing data redundancy and saving storage space.
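The flow of steps S101 to S104 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the record layout (a dict mapping each field name to a list of values), the function names `fuse` and `process`, and the `is_same` callback are all assumptions made for the example.

```python
def fuse(stored, incoming):
    """Return a copy of `stored` with any new values from `incoming` added (S103)."""
    merged = {field: list(values) for field, values in stored.items()}
    for field, values in incoming.items():
        target = merged.setdefault(field, [])
        for value in values:
            if value not in target:  # keep only descriptive information that is new
                target.append(value)
    return merged


def process(store, incoming, is_same):
    """S102-S104: fuse into the first matching stored record, else store as new."""
    for i, record in enumerate(store):
        if is_same(record, incoming):
            store[i] = fuse(record, incoming)
            return store
    store.append(incoming)
    return store


def same_address(a, b):
    """Hypothetical matcher for this demo only: equal address means same POI."""
    return a["address"] == b["address"]


store = [{"title": ["Quanjude (Yuquan Road)"],
          "address": ["No. 44, Fuxing Road, Haidian District, Beijing"],
          "phone": ["12345678"]}]
incoming = {"title": ["Quanjude Yuquan Road Branch"],
            "address": ["No. 44, Fuxing Road, Haidian District, Beijing"],
            "phone": ["87654321"]}
process(store, incoming, same_address)
```

After the call, `store` still holds a single record: the shared address is kept once, and both titles and both phone numbers are retained.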
Embodiment two:
Fig. 2 shows the detailed flow, provided by embodiment two of the invention, for judging whether the pre-stored data contain data identical to the input data:
In step S201, the input data are preprocessed;
In this embodiment, preprocessing includes, but is not limited to, address-to-coordinate conversion, title splitting and address splitting.
Address-to-coordinate conversion obtains the longitude and latitude from the address of the data when the data have no longitude and latitude.
Title splitting splits the title of the data into an address prefix, a core part, a branch part, and a key name (keyname) plus suffix major category part. The address prefix is obtained by segmenting the data into words, extracting the address words according to an address word sequence table, and then removing the last address word. For example, for "Beijing Changping District Huilongguan Restaurant", the address words are "Beijing Changping District Huilongguan"; removing the last address word gives the address prefix "Beijing Changping District", and "Huilongguan" is kept for the core part that follows. The branch part is obtained by finding the branch suffix word through "()" and a branch suffix list, and then checking whether the word before the branch suffix word is a place name or street name, which yields the complete branch name. What remains after removing the address prefix and the branch part is the core part. Finally, based on the core part, the corresponding keyname and suffix major category part are found by lookup in a keyname configuration table and a suffix major category table, both of which are compiled manually.
Address splitting divides the address of the data by level: province, city, county, area, street and house number. For example, "Yinke Mansion, No. 38 Haidian Street, Haidian District, Beijing" is divided by these levels and, with "!" ending each segment, is split into "Beijing|11!Haidian District|12!Haidian Street|14!No. 38|16!Yinke Mansion|".
In step S202, bigram segmentation is performed on the title of the preprocessed data, and each resulting word is combined with the longitude and latitude of the data to generate a corresponding key;
In this embodiment, bigram segmentation is performed on the title of the preprocessed data, and each resulting word is combined with the longitude and latitude of the data to generate a corresponding key (KEY). For example, bigram segmentation of the title "KFC Zhongguancun Branch" (kfc中关村店) yields words including kf | fc | 中关 | 关村 | 村店 (7 words in total). A range check is then applied to the longitude and latitude of the data to verify that the data lie within China, that is, whether 43.005 <= latitude <= 144.015 and 18.0 <= longitude <= 54.0 hold. If they hold, lat_key = int((latitude - 43.005) * 1000) / 15 and long_key = int((longitude - 18.000) * 1000) / 10 are calculated, where 1000 indicates a range of one kilometer and 10 and 15 are constants. Finally, each word (for example "中关") + lat_key + long_key forms a KEY. Every pre-stored KEY has a corresponding data list that contains all data related to that KEY. When input data are received, the KEYs are compared first, and only when a KEY matches is the input compared against the corresponding data list. This significantly reduces the number of comparison calculations and improves the efficiency of identical-data judgment.
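The key generation described above can be sketched as follows. The bounds and constants are copied from the description as given; representing a KEY as a `(word, lat_key, long_key)` tuple, treating the division as integer division, and the function names `bigrams` and `make_keys` are assumptions made for this illustration.

```python
def bigrams(text):
    """Split text into overlapping two-character words."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def make_keys(title, latitude, longitude):
    """Return (word, lat_key, long_key) tuples, or [] if the range check fails."""
    # Range check with the bounds stated in the description.
    if not (43.005 <= latitude <= 144.015 and 18.0 <= longitude <= 54.0):
        return []
    # Assumption: the "/" in the description is integer division.
    lat_key = int((latitude - 43.005) * 1000) // 15
    long_key = int((longitude - 18.000) * 1000) // 10
    return [(word, lat_key, long_key) for word in bigrams(title)]
```

Records whose coordinates fall in the same cell and whose titles share a bigram produce the same KEY, so only those records need a full similarity comparison.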
In step S203, the corresponding pre-stored data list is searched for according to the key;
In this embodiment, according to the key, the pre-stored data lists are searched in the shard (grid cell) that contains the input data and in the 8 shards surrounding the longitude and latitude of the input data.
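A lookup over the input's shard and its 8 neighbours might look like the sketch below. It assumes, for illustration only, that a shard is identified by the pair `(lat_key, long_key)` and that the pre-stored index is a dict from `(word, lat_key, long_key)` to a list of record identifiers; `candidate_records` is a hypothetical name.

```python
def candidate_records(index, word, lat_key, long_key):
    """Collect records listed under `word` in the input's cell and its 8 neighbours."""
    found = []
    for d_lat in (-1, 0, 1):
        for d_long in (-1, 0, 1):
            key = (word, lat_key + d_lat, long_key + d_long)
            found.extend(index.get(key, []))
    return found


# Hypothetical index: KEY -> list of record ids.
index = {
    ("ab", 5, 7): ["r1"],
    ("ab", 6, 8): ["r2"],  # diagonal neighbour of cell (5, 7)
    ("ab", 9, 9): ["r3"],  # outside the 3 x 3 neighbourhood
    ("cd", 5, 7): ["r4"],  # different word
}
nearby = candidate_records(index, "ab", 5, 7)
```

Searching the neighbouring cells as well avoids missing a match when two records of the same place fall on opposite sides of a cell boundary.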
In step S204, the input data are compared for similarity with each record in the data list;
In this embodiment, the parameters of the similarity comparison include, but are not limited to, at least one of the following: identical core word, identical suffix major category, bigram similarity ratio, inverse document frequency (IDF) combined similarity, substring, subsequence, and single-character inclusion rate. The identical core word is obtained directly after preprocessing.
Comparing the bigram similarity ratio (bigram_similar) of two records requires the following conditions: the titles of the two compared records share at least two consecutive identical characters, and the physical distance between the two compared records is within one kilometer. bigram_similar is computed by counting, after bigram segmentation of the titles, the number a of identical words and the number b of differing words, and then evaluating a/(a+b). For example, for "Huilongguan Wang Bashu Boiled Fish" (回龙观旺巴蜀水煮鱼) and "Wang Bashu Boiled Fish Huilongguan Branch" (旺巴蜀水煮鱼回龙观店), the identical words (回龙 | 龙观 | 旺巴 | 巴蜀 | 蜀水 | 水煮 | 煮鱼, among others) number 8 pairs and the differing words (观旺 | 鱼回 | 观店) number 3 pairs, so bigram_similar is 8/(8+3) = 0.727.
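The ratio a/(a+b) can be sketched as below. The sketch counts distinct bigrams, takes a as the number of bigrams found in both titles and b as the number found in only one of them; that reading of "identical" and "differing" words is an assumption, and the two preconditions (shared consecutive characters, distance within one kilometer) are not checked here.

```python
def bigrams(text):
    """Distinct two-character words of a title."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def bigram_similar(title1, title2):
    """a / (a + b) with a = shared bigrams, b = bigrams in only one title."""
    words1, words2 = bigrams(title1), bigrams(title2)
    a = len(words1 & words2)  # words present in both titles
    b = len(words1 ^ words2)  # words present in exactly one title
    return a / (a + b) if a + b else 0.0
```

For instance, "abcd" and "abce" share the bigrams "ab" and "bc" and differ in "cd" and "ce", giving 2/(2+2) = 0.5.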
The identical suffix major category parameter (key_category_similar) is obtained by comparing the keyname plus suffix major category parts of the two records. The procedure is illustrated below:
Suppose the keyname of data 1 is k1 and its suffix category is s1, and the keyname of data 2 is k2 and its suffix category is s2. The value is computed as follows:
if k1 is not empty and k2 is not empty    // both have a keyname
{
    if k1 != k2    // keynames differ
    {
        key_category_similar = 0;
    }
    else    // keynames are equal
    {
        if s1 is not empty and s2 is not empty    // both have a suffix major category
        {
            if s1 == s2    // suffix major categories are equal
            {
                key_category_similar = 1;    // same keyname, same suffix major category
            }
            else    // same keyname but different suffix major category
            {
                key_category_similar = 0;
            }
        }
        else
        {
            key_category_similar = 2;    // same keyname, no suffix
        }
    }
}
else    // no keyname
{
    if s1 is not empty and s2 is not empty    // both have a suffix major category
    {
        if s1 == s2    // suffix categories are equal
        {
            key_category_similar = 3;    // no keyname, same suffix category
        }
        else
        {
            key_category_similar = 0;
        }
    }
    else    // no keyname and no suffix category
    {
        key_category_similar = 1;
    }
};
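The rule table above translates directly into a small runnable function. The sketch below assumes that empty strings stand for a missing keyname or suffix category and that the first branch requires both keynames to be present; the function and parameter names are chosen for this illustration.

```python
def key_category_similar(k1, s1, k2, s2):
    """Compare keyname (k) and suffix major category (s) of two records."""
    if k1 and k2:                        # both records have a keyname
        if k1 != k2:
            return 0                     # keynames differ
        if s1 and s2:
            return 1 if s1 == s2 else 0  # same keyname; categories equal or not
        return 2                         # same keyname, suffix category missing
    if s1 and s2:                        # keyname missing, both have a category
        return 3 if s1 == s2 else 0
    return 1                             # neither keyname nor suffix category
```

The distinct return values 1, 2 and 3 let later decision rules tell a full match apart from a match on keyname only or on suffix category only.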
The inverse document frequency combined similarity (idf_similar) parameter is calculated only when the two compared records satisfy all of the following: bigram_similar is greater than a preset threshold; the categories of the broad region and of the keyname do not conflict (a school and a square, for example, would conflict); and the addresses are not clearly different (a physical distance of more than 30 meters between the two records counts as clearly different).
The inverse document frequency combined similarity (idf_similar) is a combined similarity obtained from the title similarity, address similarity, phone similarity and distance similarity. The overall formula is:
idf_similar = 0.85 * name_similar + 0.05 * address_similar + 0.05 * phone_similar + 0.05 * lating_similar
Of these, the title similarity (name_similar) is the most important. It is calculated as follows:
name_similar = W_same_scores_total * 2 / (1_scores + 2_scores)
where W_same_scores_total = Wsame_1 * Wsame_1_scores + Wsame_2 * Wsame_2_scores + ... + Wsame_n * Wsame_n_scores
1_scores = 1_w_1 * 1_w_1_scores + ... + 1_w_t * 1_w_t_scores
2_scores = 2_w_1 * 2_w_1_scores + ... + 2_w_f * 2_w_f_scores
Wsame_i is an identical word obtained after bigram segmentation of the titles of the two records; Wsame_i_scores is the pre-computed score (that is, the weight) of that word. In the same notation, 1_w_i and 2_w_i are the words of the first and second title, and 1_w_i_scores and 2_w_i_scores are their scores.
The name_similar calculation reduces the influence of non-core words in a title. For example, take two titles that both translate as "Haihuang Hotel" but use different Chinese words for "hotel". Because "Haihuang" occurs rarely in titles, Wsame_scores(Haihuang) is high; because both "hotel" words occur frequently in titles, their scores are low. The value Wsame_scores(Haihuang) * 2 / (Wsame_scores(Haihuang) * 2 + Wsame_scores(first hotel word) + Wsame_scores(second hotel word)) is therefore very high, so the two titles are judged to be very similar. By the same calculation, "Haihuang Hotel" and "Xinghai Hotel" obtain a very low name_similar and are judged to be dissimilar.
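A weighted title similarity in the spirit of name_similar can be sketched as follows. The sketch treats each distinct bigram once and looks its score up in a `scores` dict; the dict, the `default_score` fallback and the function names are assumptions for illustration, and the scores would in practice come from pre-computed word statistics.

```python
def bigrams(text):
    """Distinct two-character words of a title."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def name_similar(title1, title2, scores, default_score=1.0):
    """2 * (score of shared words) / (score of title 1 + score of title 2)."""
    words1, words2 = bigrams(title1), bigrams(title2)

    def total(words):
        return sum(scores.get(word, default_score) for word in words)

    denominator = total(words1) + total(words2)
    if denominator == 0:
        return 0.0
    return 2 * total(words1 & words2) / denominator


# Rare shared word "ab" (high score) dominates: the titles look similar.
high = name_similar("abxy", "abzw", {"ab": 8.0})
# Only the frequent word "xy" (low score) is shared: the titles look dissimilar.
low = name_similar("abxy", "cdxy", {"ab": 8.0, "cd": 8.0, "xy": 0.5})
```

This mirrors the hotel example: sharing a rare word counts for much more than sharing a common suffix word.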
lating_similar is calculated as: lating_similar = MIN(100.0 / distance, 1);
The address similarity (address_similar) is obtained by dividing the addresses of the two compared records into six levels (province, city, county, area, street and house number) and comparing them;
The phone similarity (phone_similar) is compared as follows:
phone_similar = 1 when phone a and phone b are identical; phone_similar = 0.7 when the last 7 digits of phone a and the last 7 digits of phone b are identical; phone_similar = 0 in all other cases;
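The phone rule, the distance rule and the weighted sum for idf_similar can be put together as in the sketch below. The title and address similarities are passed in as plain numbers. Treating `distance` as meters, returning 1.0 for a zero distance and 0.0 for a missing phone number are assumptions that the description does not state.

```python
def phone_similar(phone_a, phone_b):
    """1 if identical, 0.7 if the last 7 digits match, otherwise 0."""
    if not phone_a or not phone_b:
        return 0.0  # assumption: a missing number never matches
    if phone_a == phone_b:
        return 1.0
    if len(phone_a) >= 7 and len(phone_b) >= 7 and phone_a[-7:] == phone_b[-7:]:
        return 0.7
    return 0.0


def lating_similar(distance):
    """MIN(100.0 / distance, 1); distance is assumed to be in meters."""
    if distance <= 0:
        return 1.0  # assumption: identical coordinates count as fully similar
    return min(100.0 / distance, 1.0)


def idf_similar(name_sim, address_sim, phone_sim, lating_sim):
    """Weighted sum with the weights 0.85 / 0.05 / 0.05 / 0.05 from the text."""
    return (0.85 * name_sim + 0.05 * address_sim
            + 0.05 * phone_sim + 0.05 * lating_sim)
```

Because the title carries 85% of the weight, address, phone and distance act mainly as tie-breakers.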
A substring is a string that occurs contiguously within a longer string; for example, "abc" is a substring of "abcef";
A subsequence is a string whose characters occur in order within a longer string; for example, "abc" is a subsequence of "axbxc";
The single-character inclusion rate is the probability that the individual characters of the shorter string occur in the longer string; for example, the probability that "a" and "b" of "abc" occur in "abdef" is 1/5 = 0.2.
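The substring and subsequence relations can be checked with a few lines of Python. The function names are chosen for this illustration; the single-character inclusion rate is left out because its exact formula is not spelled out in the text.

```python
def is_substring(short, long):
    """True if `short` occurs contiguously inside `long`."""
    return short in long


def is_subsequence(short, long):
    """True if the characters of `short` occur in `long` in the same order."""
    remaining = iter(long)
    # `ch in remaining` advances the iterator, so order is enforced.
    return all(ch in remaining for ch in short)
```

Every substring is also a subsequence, but not the other way round: "abc" is a subsequence of "axbxc" without being a substring of it.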
In this embodiment, computing the similarity of the compared data along multiple dimensions effectively reduces the error rate of identical-data judgment and improves both the recall of identical data and the efficiency of the judgment.
In step S205, when the similarity meets a preset threshold, the compared records are judged to be the same data.
In this embodiment, the calculated similarity is compared with preset thresholds to judge whether the compared records are the same data; when the similarity meets the preset threshold, the compared records are judged to be the same data. For example: when bigram_similar >= 0.8, the compared records are judged to be the same data; when bigram_similar < 0.2, the compared records are judged not to be the same data; when 0.4 <= bigram_similar < 0.8, the compared records are judged to be the same data if idf_similar > 0.9, or if the keyname and suffix major category of the records are identical, or if the titles have a substring or subsequence relation, and they are also judged to be the same data if idf_similar <= 0.9 but the suffix major categories are identical and idf_similar >= 0.5 after the suffix words are removed; when 0.2 <= bigram_similar < 0.4, the compared records are judged to be the same data if 0.5 > cal_similar > 0.1 and the titles of the two compared records have a substring or subsequence relation or the keyname and suffix category of the records are both identical; in all other cases the compared records are judged not to be the same data.
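The example thresholds can be read as a decision function like the sketch below. All inputs are assumed to be computed beforehand; `cal_sim` stands for the cal_similar value that the text uses without defining. In the lowest band the sketch assumes the grouping "cal_similar in range AND (substring/subsequence relation OR identical keyname and suffix category)", which is one possible reading of the sentence.

```python
def is_same_data(bigram_sim, idf_sim, idf_sim_no_suffix, cal_sim,
                 same_key_and_category, same_category, name_contained):
    """Apply the example thresholds; True means 'same data'."""
    if bigram_sim >= 0.8:
        return True
    if bigram_sim < 0.2:
        return False
    if bigram_sim >= 0.4:  # 0.4 <= bigram_sim < 0.8
        if idf_sim > 0.9 or same_key_and_category or name_contained:
            return True
        # idf_sim <= 0.9: accept only with identical suffix major category
        # and a sufficient score once suffix words are removed.
        return same_category and idf_sim_no_suffix >= 0.5
    # 0.2 <= bigram_sim < 0.4
    return 0.1 < cal_sim < 0.5 and (name_contained or same_key_and_category)
```

Here `name_contained` is True when one title is a substring or subsequence of the other, and the two category flags correspond to the keyname and suffix major category comparison.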
Embodiment three:
Fig. 3 shows the structure of the data fusion device provided by embodiment three of the invention. For ease of explanation, only the parts relevant to the embodiment are shown.
The data fusion device may be a software unit, a hardware unit, or a combined software and hardware unit running in a data processing system, or it may be integrated as an independent plug-in into such a data processing system or run in an application system of such a data processing system.
The data fusion device comprises a data receiving unit 31, a judging unit 32, a data fusion unit 33 and a direct storage unit 34. The specific function of each unit is as follows:
The data receiving unit 31 is used to receive input data;
The judging unit 32 is used to judge whether the pre-stored data contain data identical to the input data. When the result is "yes", the data fusion unit 33 adds the new descriptive information in the input data to the identical data; when the result is "no", the direct storage unit 34 stores the input data. The judging unit 32 further comprises a preprocessing module 41, a title segmentation module 42, a search module 43, a similarity comparison module 44 and a determination module 45 (as shown in Fig. 4). The specific function of each module is as follows:
The preprocessing module 41 is used to preprocess the input data;
The title segmentation module 42 is used to perform bigram segmentation on the title of the preprocessed data and to combine each resulting word with the longitude and latitude of the data to generate a corresponding key;
The search module 43 is used to search for the corresponding pre-stored data list according to the key;
The similarity comparison module 44 is used to compare the input data for similarity with each record in the data list;
The determination module 45 is used to judge that the compared records are the same data when the similarity meets a preset threshold.
In this embodiment, each module is implemented as described above, which is not repeated here.
In the embodiments of the invention, it is judged whether the pre-stored data contain data identical to the input data; when such identical data exist, the new descriptive information in the input data is added to the identical data. This effectively enriches the information of the data, reduces data redundancy, and improves users' satisfaction with the data they find. Moreover, computing the similarity of the compared data along multiple dimensions during the judgment effectively reduces the error rate of identical-data judgment and improves both the recall of identical data and the efficiency of the judgment.
The above are merely preferred embodiments of the invention and do not limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (11)

1. A data fusion method, characterized in that the method comprises the following steps:
Receiving input data;
Judging whether the pre-stored data contain data identical to the input data;
When the pre-stored data contain data identical to the input data, adding the new descriptive information in the input data to the identical data.
2. The method of claim 1, characterized in that the step of judging whether the pre-stored data contain data identical to the input data specifically comprises:
Preprocessing the input data;
Performing bigram segmentation on the title of the preprocessed data, and combining each resulting word with the longitude and latitude of the data to generate a corresponding key;
Searching for the corresponding pre-stored data list according to the key;
Comparing the input data for similarity with each record in the data list; when the similarity meets a preset threshold, judging that the compared record and the input data are the same data.
3. The method of claim 2, characterized in that the preprocessing comprises address-to-coordinate conversion, title splitting and address splitting;
The address-to-coordinate conversion obtains the longitude and latitude from the address of the data when the data have no longitude and latitude;
The title splitting splits the title of the data into an address prefix, a core part, a branch part, and a key name plus suffix major category part;
The address splitting divides the address of the data by level: province, city, county, area, street and house number.
4. The method of claim 2, characterized in that the parameters of the similarity comparison comprise at least one of the following: identical core word, identical suffix major category, bigram similarity ratio, inverse document frequency combined similarity, substring, subsequence, and single-character inclusion rate.
5. The method of claim 1, characterized in that the method further comprises the following step:
When no data identical to the input data exist, storing the input data.
6. A data fusion device, characterized in that the device comprises:
A data receiving unit, used to receive input data;
A judging unit, used to judge whether the pre-stored data contain data identical to the input data; and
A data fusion unit, used to add the new descriptive information in the input data to the identical data when the pre-stored data contain data identical to the input data.
7. The device of claim 6, characterized in that the judging unit further comprises:
A preprocessing module, used to preprocess the input data;
A title segmentation module, used to perform bigram segmentation on the title of the preprocessed data and to combine each resulting word with the longitude and latitude of the data to generate a corresponding key;
A search module, used to search for the corresponding pre-stored data list according to the key;
A similarity comparison module, used to compare the input data for similarity with each record in the data list;
A determination module, used to judge that the compared record and the input data are the same data when the similarity meets a preset threshold.
8. The device of claim 7, characterized in that the preprocessing comprises address-to-coordinate conversion, title splitting and address splitting;
The address-to-coordinate conversion obtains the longitude and latitude from the address of the data when the data have no longitude and latitude;
The title splitting splits the title of the data into an address prefix, a core part, a branch part, and a key name plus suffix major category part;
The address splitting divides the address of the data by level: province, city, county, area, street and house number.
9. The device of claim 7, characterized in that the parameters of the similarity comparison comprise at least one of the following: identical core word, identical suffix major category, bigram similarity ratio, inverse document frequency combined similarity, substring, subsequence, and single-character inclusion rate.
10. The device of claim 6, characterized in that the device further comprises:
A storage unit, used to store the input data when no data identical to the input data exist.
11. A data processing system, characterized in that the data processing system comprises the data fusion device of any one of claims 6 to 10.
Application number: CN2011101317655A
Priority date: 2011-05-20
Filing date: 2011-05-20
Publication: CN102789467A, published 2012-11-21
Title: Data fusion method, data fusion device and data processing system
Legal status: Pending
Family ID: 47154871
Country: CN

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1685383A (en) * 2002-09-27 2005-10-19 株式会社查纳位资讯情报 Map data product and map data processing device
US20090019081A1 (en) * 2007-07-09 2009-01-15 Technion Research And Development Foundation Ltd. Integrating data from maps on the world-wide web
CN101882135A (en) * 2009-05-04 2010-11-10 高德软件有限公司 Data processing method and device
CN102062610A (en) * 2009-11-18 2011-05-18 神达电脑股份有限公司 Method and device for creating and playing customized audio alert message

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102667A (en) * 2013-04-11 2014-10-15 北京四维图新科技股份有限公司 POI (Point of Interest) information differentiation method and device
CN104216895A (en) * 2013-05-31 2014-12-17 高德软件有限公司 Method and device for generating POI data
CN104216895B (en) * 2013-05-31 2018-01-30 高德软件有限公司 A kind of method and device for generating POI data
CN106104657A (en) * 2014-04-09 2016-11-09 三菱电机株式会社 Map drawing device, mapping method and map depiction program
CN104751232A (en) * 2015-04-27 2015-07-01 携程计算机技术(上海)有限公司 Automatic matching method for hotels
CN104751232B (en) * 2015-04-27 2018-07-31 携程计算机技术(上海)有限公司 Hotel's automatic matching method
CN104899189A (en) * 2015-05-27 2015-09-09 深圳市华傲数据技术有限公司 Object name matching method based on information entropy
WO2016188051A1 (en) * 2015-05-27 2016-12-01 深圳市华傲数据技术有限公司 Information entropy-based object name matching method
CN104899189B (en) * 2015-05-27 2017-11-28 深圳市华傲数据技术有限公司 Object oriented matching process based on comentropy
CN106649331B (en) * 2015-10-29 2020-09-11 阿里巴巴集团控股有限公司 Business circle identification method and equipment
CN106649331A (en) * 2015-10-29 2017-05-10 阿里巴巴集团控股有限公司 Business district recognition method and equipment
CN106033475A (en) * 2016-05-18 2016-10-19 苏州奖多多科技有限公司 Information matching method and device and electronic equipment
CN109857957A (en) * 2019-01-29 2019-06-07 掌阅科技股份有限公司 Establish method, electronic equipment and the computer storage medium of tag library
CN109857957B (en) * 2019-01-29 2021-06-15 掌阅科技股份有限公司 Method for establishing label library, electronic equipment and computer storage medium
CN110288023A (en) * 2019-06-26 2019-09-27 广州小鹏汽车科技有限公司 Fusion method and device, detection method, acquisition methods, server and vehicle
CN110851547A (en) * 2019-10-11 2020-02-28 上海中旖能源科技有限公司 Multi-data-source map data fusion method

Similar Documents

Publication Publication Date Title
CN102789467A (en) Data fusion method, data fusion device and data processing system
CN108804532B (en) Query intention mining method and device and query intention identification method and device
CN101276361B (en) Method and system for displaying related key words
Metzler et al. Structured event retrieval over microblog archives
US8983928B2 (en) Real time content searching in social network
CN101984423B (en) Hot-search word generation method and system
US8812536B2 (en) Providing regional content by matching geographical properties
US20110145348A1 (en) Systems and methods for identifying terms relevant to web pages using social network messages
US8793260B2 (en) Related pivoted search queries
CN104462113A (en) Search method and device and electronic equipment
CN102725759A (en) Semantic table of contents for search results
CN103699700A (en) Search guidance generation method, system and related server
CN102750949A (en) Voice recognition method and device
CN105528372A (en) An address search method and apparatus
US9275156B2 (en) Trending topic identification from social communications
CN102163234A (en) Equipment and method for error correction of query sequence based on degree of error correction association
US8768910B1 (en) Identifying media queries
CN103049495A (en) Method, device and equipment for providing searching advice corresponding to inquiring sequence
CN103778204A (en) Voice analysis-based video search method, equipment and system
WO2013083369A1 (en) Fuzzy full text search
CN103425662A (en) Information search method and device in network community
CN103955480A (en) Method and equipment for determining target object information corresponding to user
CN103810204A (en) Information search method and information search device
US9514198B1 (en) Suggesting a tag to promote a discussion topic
Zhu et al. Finding top-k similar users based on trajectory-pattern model for personalized service recommendation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20121121