CN106919705A - Method and device for identifying the region to which network information belongs - Google Patents

Publication number
CN106919705A
Authority
CN
China
Prior art keywords
region
network information
weight
affiliated
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710141330.6A
Other languages
Chinese (zh)
Inventor
安倩
李永红
张政勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sohu New Media Information Technology Co Ltd
Original Assignee
Beijing Sohu New Media Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sohu New Media Information Technology Co Ltd
Priority to CN201710141330.6A
Publication of CN106919705A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/52 Network services specially adapted for the location of the user terminal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/55 Push-based network services

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and device for identifying the region to which network information belongs. The region set contained in the network information is obtained; a weight is determined for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the network information belongs to the corresponding region; and the region to which the network information belongs is determined from the region set according to the weight of each region. Network information can then be pushed according to the region to which it belongs, for example by pushing the network information of Baoan District to users in Baoan District.

Description

Method and device for identifying the region to which network information belongs
Technical field
The present invention relates to the field of communication technology, and more particularly to a method and device for identifying the region to which network information belongs.
Background
With the rapid development of the Internet worldwide, online media has been recognized as the "fourth medium" after newspapers, radio, and television. The network has become one of the main carriers of information, and its propagation characteristics cause massive amounts of network information to pour toward users.
Users, however, still pay most attention to what happens near them, so accurately identifying the regional information in network information is particularly important.
Summary of the invention
In view of this, the present invention provides a method and device for identifying the region to which network information belongs, to overcome the problem that the prior art does not identify the region to which network information belongs.
To achieve the above object, the present invention provides the following technical solutions:
A method for identifying the region to which network information belongs, including:
obtaining the region set contained in the network information, the region set including at least one region;
determining a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the network information belongs to the corresponding region;
determining, from the region set and according to the weight of each region, the region to which the network information belongs.
Wherein, obtaining the region set contained in the network information includes:
dividing the network information to obtain multiple words;
obtaining, from the multiple words, the target words that match the pre-stored regions;
forming the region set from the target words.
Wherein, determining a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information includes:
judging the position at which each region in the region set appears in the network information;
when a first region in the region set is located in the title of the network information, calculating the weight of the first region according to a first function, the first function being a function with the position of the corresponding region in the title as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the first function being negatively correlated;
when a second region in the region set is located in the body text of the network information, calculating the weight of the second region according to a second function, the second function being a function with the position of the corresponding region in the body text as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the second function being negatively correlated.
Wherein, determining a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information also includes:
when a third region in the region set appears in the network information two or more times, adding together the weights corresponding to each occurrence of the third region;
determining the sum of those weights as the weight of the third region.
Wherein, determining, from the region set and according to the weight of each region, the region to which the network information belongs includes:
judging, according to pre-stored superior-subordinate relationships that characterize which region each region belongs to, whether a superior-subordinate relationship exists between the regions in the region set;
when a superior-subordinate relationship exists between at least two regions in the region set, determining the at least two regions as one fine-granularity region;
performing a preset calculation on the weights of the at least two regions to obtain the weight of the fine-granularity region;
determining the region to which the network information belongs from the fine-granularity regions and coarse-granularity regions contained in the region set, according to the weight of each fine-granularity region in the region set and the weight of each coarse-granularity region in the region set, a coarse-granularity region being a region that has no superior-subordinate relationship with any other region in the region set.
Wherein, determining the region to which the network information belongs from the fine-granularity regions and coarse-granularity regions contained in the region set includes:
comparing the weights of the fine-granularity regions and coarse-granularity regions contained in the region set with a first preset threshold;
when no weight is greater than or equal to the first preset threshold, determining that the network information has no region to which it belongs;
when at least one weight is greater than or equal to the first preset threshold, determining the target region corresponding to the maximum weight as the regional attribute of the network information, the target region being a fine-granularity region or a coarse-granularity region.
Wherein, determining, when at least one weight is greater than or equal to the first preset threshold, the target region corresponding to the maximum weight as the regional attribute of the network information includes:
when exactly one weight is greater than or equal to the first preset threshold, determining the target region corresponding to that weight as the regional attribute of the network information;
when at least two weights are greater than or equal to the first preset threshold, calculating, for the at least two target regions corresponding to those weights, the difference between the weights of every two target regions;
when at least one difference is greater than or equal to a second preset threshold, determining the target region corresponding to the maximum weight as the regional attribute of the network information;
when all differences are less than the second preset threshold, determining that the network information has no region to which it belongs.
A device for identifying the region to which network information belongs, including:
an acquisition module, configured to obtain the region set contained in the network information, the region set including at least one region;
a first determining module, configured to determine a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the network information belongs to the corresponding region;
a second determining module, configured to determine, from the region set and according to the weight of each region, the region to which the network information belongs.
Wherein, the first determining module includes:
a first judging unit, configured to judge the position at which each region appears in the network information;
a first computing unit, configured to calculate, when a first region in the region set is located in the title of the network information, the weight of the first region according to a first function, the first function being a function with the position of the corresponding region in the title as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the first function being negatively correlated;
a second computing unit, configured to calculate, when a second region in the region set is located in the body text of the network information, the weight of the second region according to a second function, the second function being a function with the position of the corresponding region in the body text as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the second function being negatively correlated.
Wherein, the second determining module includes:
a second judging unit, configured to judge, according to pre-stored superior-subordinate relationships that characterize which region each region belongs to, whether a superior-subordinate relationship exists between the regions in the region set;
a first determining unit, configured to determine, when a superior-subordinate relationship exists between at least two regions in the region set, the at least two regions as one fine-granularity region;
an acquiring unit, configured to perform a preset calculation on the weights of the at least two regions to obtain the weight of the fine-granularity region;
a second determining unit, configured to determine the region to which the network information belongs from the fine-granularity regions and coarse-granularity regions contained in the region set, according to the weight of each fine-granularity region in the region set and the weight of each coarse-granularity region in the region set, a coarse-granularity region being a region that has no superior-subordinate relationship with any other region in the region set.
As can be seen from the above technical solutions, compared with the prior art, the method provided by the embodiments of the present invention obtains the region set contained in the network information; determines a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the network information belongs to the corresponding region; and determines, from the region set and according to the weight of each region, the region to which the network information belongs. Network information can then be pushed according to the region to which it belongs, for example by pushing the network information of Baoan District to users in Baoan District.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.
Fig. 1 is a flowchart of a method for identifying the region to which network information belongs, provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of one implementation, in the method provided by an embodiment of the present invention, of determining the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information;
Fig. 3 is a partial schematic diagram of the region tree provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of one implementation, in the method provided by an embodiment of the present invention, of determining the region to which the network information belongs from the region set according to the weight of each region;
Fig. 5 is a schematic diagram of identifying the region to which each item of network information belongs using the method provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of the device for identifying the region to which network information belongs, provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of the electronic device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of a method for identifying the region to which network information belongs, provided by an embodiment of the present invention. The method includes:
Step S101: Obtain the region set contained in the network information, the region set including at least one region.
The embodiments of the present invention are directed at network information that records a region. If the network information contains no region at all, the embodiments of the present invention cannot determine the region to which it belongs from the network information.
The network information can be voice information, text information, etc. When the network information is voice information, the voice information can be converted into text information. The embodiments of the present invention provide, but are not limited to, the following implementation of "obtaining the region set contained in the network information".
Divide the network information to obtain multiple words; obtain, from the multiple words, the target words that match the pre-stored regions; form the region set from the target words.
Assume the network information is: Baoan District cultural festival (宝安区文化节 in the original Chinese).
The network information can be divided into every contiguous run of its characters: 宝, 宝安, 宝安区, 宝安区文, 宝安区文化, 宝安区文化节; 安, 安区, 安区文, 安区文化, 安区文化节; 区, 区文, 区文化, 区文化节; 文, 文化, 文化节; 化, 化节; 节.
Preferably, because of the particularity of Chinese text, a word segmentation tool is used when extracting words, to cut the network information accurately into words and phrases and to filter out useless particles, adverbs, and stop words such as "she", "he", and "it". If the network information is in English, words such as "a" and "an" can be filtered out.
The above regions can be stored in advance in a regional information base. The regions can include all provinces, cities, counties, towns, and villages in the country or the world. Preferably, the regions can also include street information, neighbourhood committee information, the latitude and longitude of each region, and landmark information for some regions, such as Daming Lake in Jinan and the Forbidden City in Beijing.
By matching the multiple words obtained by dividing the network information against the pre-stored regions, the region word Baoan District is obtained.
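The division and matching in step S101 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the gazetteer `KNOWN_REGIONS`, the function names, and the Chinese input string are assumptions introduced here, and a real system would use a word segmentation tool instead of enumerating every substring.

```python
# Hypothetical gazetteer standing in for the pre-stored regional information base.
KNOWN_REGIONS = {"宝安区", "深圳市", "广东省", "北京"}


def candidate_words(text):
    """Yield (position, word) for every contiguous substring of text.

    position is the 1-based index of the first character of the word.
    """
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            yield start + 1, text[start:end]


def extract_regions(text, known_regions=KNOWN_REGIONS):
    """Return (region, position) pairs for substrings that match a stored region."""
    return [(word, pos) for pos, word in candidate_words(text) if word in known_regions]


# Assumed original wording of the example "Baoan District cultural festival".
print(extract_regions("宝安区文化节"))
```

Recording the position alongside each match keeps the information that step S102 needs for weighting.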
Step S102: Determine a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the network information belongs to the corresponding region.
The network information generally includes a title and body text. The position at which a region appears in the network information can refer to whether the region appears in the body text or in the title.
The position at which a region appears in the network information can also refer to which character of the text the region is located at. Taking the network information "Baoan District cultural festival" as an example, Baoan District starts at the 1st character, so the position at which Baoan District appears in the network information is 1.
Preferably, the weight of a region that appears in the title is higher than the weight of a region that appears in the body text.
The weight of a region is positively correlated with the number of times the region appears in the network information and negatively correlated with the position at which it appears.
Positive correlation means that two variables change in the same direction: when one variable increases or decreases, the other variable also increases or decreases.
Negative correlation means that two variables change in opposite directions: when one variable increases or decreases, the other variable decreases or increases.
Step S103: Determine, from the region set and according to the weight of each region, the region to which the network information belongs.
The greater the weight of a region, the more likely it is that the network information belongs to that region.
In the method provided by the embodiments of the present invention, the region set contained in the network information is obtained; a weight is determined for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the network information belongs to the corresponding region; and the region to which the network information belongs is determined from the region set according to the weight of each region. Network information can then be pushed according to the region to which it belongs, for example by pushing the network information of Baoan District to users in Baoan District.
Fig. 2 is a schematic flowchart of one implementation, in the method provided by an embodiment of the present invention, of determining the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information. The method includes:
Step S201: Judge the position at which each region in the region set appears in the network information.
Step S202: When a first region in the region set is located in the title of the network information, calculate the weight of the first region according to a first function.
The first function is a function with the position of the corresponding region in the title as the independent variable and the weight as the dependent variable, and the independent variable and the dependent variable of the first function are negatively correlated.
The first function can take many specific forms. The embodiments of the present invention provide, but are not limited to, the following formula:
Weight of the first function = first value^(1/position), where the first value can be any value greater than or equal to 1. Assuming the first value is 2, weight of the first function = 2^(1/position). Still taking "Baoan District cultural festival" as an example, and assuming it is the title, weight of Baoan District = 2^(1/1) = 2. Taking the title "The first Fuyong butter crab food culture festival is held in Baoan District" as another example, weight of Baoan District = 2^(1/14), since Baoan District starts at the 14th character of the original Chinese title.
Step S203: When a second region in the region set is located in the body text of the network information, calculate the weight of the second region according to a second function.
The second function is a function with the position of the corresponding region in the body text as the independent variable and the weight as the dependent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
The second function can take many specific forms. The embodiments of the present invention provide, but are not limited to, the following formula:
Weight of the second function = second value^(1/position), where the second value can be any value greater than or equal to 1; preferably, the second value is less than the first value. Assuming the second value is 1.01, weight of the second function = 1.01^(1/position). Still taking "Baoan District cultural festival" as an example, and assuming it is the body text, weight of Baoan District = 1.01^(1/1) = 1.01. Taking the body text "The first Fuyong butter crab food culture festival is held in Baoan District" as another example, weight of Baoan District = 1.01^(1/14).
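The two example formulas can be checked numerically with a short sketch. The function names and the constants `TITLE_BASE` and `BODY_BASE` are assumptions made for illustration; the values 2 and 1.01 are the example values used in the text, and the text allows any base greater than or equal to 1.

```python
TITLE_BASE = 2.0   # example "first value" from the text
BODY_BASE = 1.01   # example "second value" from the text


def position_weight(position, base):
    """Weight of one occurrence: base ** (1 / position), with position counted from 1.

    For base >= 1 the weight does not increase as position grows, which gives the
    negative correlation between position and weight described above.
    """
    if position < 1 or base < 1:
        raise ValueError("position must be >= 1 and base must be >= 1")
    return base ** (1.0 / position)


def title_weight(position):
    """Weight of a region that occurs in the title at the given position."""
    return position_weight(position, TITLE_BASE)


def body_weight(position):
    """Weight of a region that occurs in the body text at the given position."""
    return position_weight(position, BODY_BASE)


print(title_weight(1), title_weight(14))
print(body_weight(1), body_weight(14))
```

Choosing a smaller base for the body text than for the title keeps every body weight below the title weight at the same position, which matches the stated preference that title occurrences count for more.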
It can be understood that when the same region appears in the network information two or more times, the probability that the network information belongs to that region is greater. In this case the method also includes:
Step S204: When a third region in the region set appears in the network information two or more times, add together the weights corresponding to each occurrence of the third region.
Step S205: Determine the sum of those weights as the weight of the third region.
The first region, second region, and third region above may be the same region or different regions. If they are the same region, that region appears in both the title and the body text.
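Steps S202 to S205 together amount to summing one position-based weight per occurrence. The sketch below assumes a hypothetical input format, a list of `(region, where, position)` tuples, and reuses the example bases 2 and 1.01 from the text.

```python
from collections import defaultdict


def region_weights(occurrences, title_base=2.0, body_base=1.01):
    """Sum the per-occurrence weights of each region.

    occurrences is an iterable of (region, where, position) tuples, where
    `where` is "title" or "body" and position is the 1-based position.
    """
    totals = defaultdict(float)
    for region, where, position in occurrences:
        if where == "title":
            base = title_base
        elif where == "body":
            base = body_base
        else:
            raise ValueError(f"unknown location: {where!r}")
        totals[region] += base ** (1.0 / position)
    return dict(totals)


weights = region_weights([
    ("Baoan District", "title", 1),
    ("Baoan District", "body", 1),
    ("Shenzhen", "body", 5),
])
print(weights)
```

A region that occurs once simply keeps its single weight, so the same function covers regions with one occurrence and regions with several.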
It can be understood that the regions contained in the network information may have no superior-subordinate relationship, for example Beijing and Shanghai, but some regions do have one, for example Guangdong Province, Shenzhen, and Baoan District. In the above embodiment, the pre-stored regions can be stored in a tree structure; that is, the regions are stored as a tree in the regional information base, and the branch on which each region in the region set lies can be found through the tree. Fig. 3 is a partial schematic diagram of the region tree provided by an embodiment of the present invention.
In Fig. 3, Hebei Province, Baoding, Boye County, and Xu Cun form one branch; Guangdong Province, Shenzhen, and Baoan District form another branch. The tree can be used to judge whether a superior-subordinate relationship exists between the regions in the region set, which yields more fine-grained regions; Xu Cun, for example, is a more fine-grained region than Hebei Province. Assume the region set includes: Hebei Province, Baoding, Boye County, Xu Cun, Guangdong Province, Shenzhen, Baoan District, and Beijing. Then Hebei Province, Baoding, Boye County, and Xu Cun correspond to one fine-granularity region; Guangdong Province, Shenzhen, and Baoan District correspond to another fine-granularity region; and Beijing corresponds to a coarse-granularity region.
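One way to picture the grouping is a child-to-parent map that is walked upward from each region. The map `PARENT` and both function names are hypothetical; the source only says that regions are stored in a tree, so this is a sketch under that assumption.

```python
# Hypothetical child -> parent map standing in for the stored region tree (Fig. 3).
PARENT = {
    "Baoding": "Hebei Province",
    "Boye County": "Baoding",
    "Xu Cun": "Boye County",
    "Shenzhen": "Guangdong Province",
    "Baoan District": "Shenzhen",
}


def ancestors(region, parent=PARENT):
    """Return the chain of superior regions, nearest first."""
    chain = []
    while region in parent:
        region = parent[region]
        chain.append(region)
    return chain


def group_by_branch(region_set, parent=PARENT):
    """Split a region set into fine-granularity groups and coarse-granularity regions.

    A fine-granularity group is a region plus every superior region that also
    occurs in the set; regions related to nothing else in the set are coarse.
    """
    regions = set(region_set)
    fine, used = [], set()
    # Visit the deepest regions first so each group starts at its most specific member.
    for region in sorted(regions, key=lambda r: (-len(ancestors(r, parent)), r)):
        if region in used:
            continue
        members = [region] + [a for a in ancestors(region, parent) if a in regions]
        if len(members) > 1:
            fine.append(members)
            used.update(members)
    coarse = sorted(regions - used)
    return fine, coarse


fine, coarse = group_by_branch([
    "Hebei Province", "Baoding", "Boye County", "Xu Cun",
    "Guangdong Province", "Shenzhen", "Baoan District", "Beijing",
])
print(fine)
print(coarse)
```

With the example region set from the text, this yields two groups, one per branch, and leaves Beijing as the only coarse-granularity region.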
Fig. 4 is a schematic flowchart of one implementation, in the method provided by an embodiment of the present invention, of determining the region to which the network information belongs from the region set according to the weight of each region. The method includes:
Step S401: Judge, according to pre-stored superior-subordinate relationships that characterize which region each region belongs to, whether a superior-subordinate relationship exists between the regions in the region set.
Some region names are ambiguous. For the name Chaoyang, for example, Beijing has Chaoyang District and Liaoning has Chaoyang City. Assuming the region set includes Liaoning Province and Chaoyang, the pre-stored superior-subordinate relationships between regions can be used to determine that Chaoyang is the Chaoyang City of Liaoning Province. Step S401 provided by the embodiment of the present invention can therefore also effectively resolve region ambiguity.
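The Chaoyang example can be sketched as a lookup that keeps only the candidates whose superior region also appears in the region set. The table `CANDIDATES` and the function `disambiguate` are hypothetical, and checking only the immediate superior is a simplification assumed here.

```python
# Hypothetical table: ambiguous name -> {full region name: immediate superior region}.
CANDIDATES = {
    "Chaoyang": {
        "Chaoyang District": "Beijing",
        "Chaoyang City": "Liaoning Province",
    },
}


def disambiguate(name, region_set, candidates=CANDIDATES):
    """Return the candidates for name whose superior region is also in region_set."""
    options = candidates.get(name, {})
    return [full for full, superior in options.items() if superior in region_set]


print(disambiguate("Chaoyang", {"Liaoning Province", "Chaoyang"}))
```

An empty result means the region set gives no evidence for either reading, and a result with several entries means the name is still ambiguous.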
Step S402: When a superior-subordinate relationship exists between at least two regions in the region set, determine the at least two regions as one fine-granularity region.
Step S403: Perform a preset calculation on the weights of the at least two regions to obtain the weight of the fine-granularity region.
The preset calculation can be a product, an average, etc.
Taking Guangdong Province, Shenzhen, and Baoan District as an example, the weight of the fine-granularity region Baoan District = weight of Guangdong Province * weight of Shenzhen * weight of Baoan District; or the weight of the fine-granularity region Baoan District = (weight of Guangdong Province + weight of Shenzhen + weight of Baoan District) / 3.
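The two example calculations, product and average, can be sketched as one helper. The function name `combine` and its `mode` parameter are assumptions for illustration, and the input weights below are made-up numbers, not values from the source.

```python
from math import prod
from statistics import mean


def combine(weights, mode="product"):
    """Combine the weights of the regions on one branch into a single weight."""
    if not weights:
        raise ValueError("at least one weight is required")
    if mode == "product":
        return prod(weights)
    if mode == "mean":
        return mean(weights)
    raise ValueError(f"unknown mode: {mode!r}")


# Illustrative weights for Guangdong Province, Shenzhen, and Baoan District.
branch = [2.0, 1.5, 1.0]
print(combine(branch))          # product of the three weights
print(combine(branch, "mean"))  # average of the three weights
```

With bases of at least 1, every occurrence weight is at least 1, so the product grows with the number of regions on the branch while the average does not; which behaviour is preferable depends on how the thresholds in the later steps are set.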
Step S404: Determine the region to which the network information belongs from the fine-granularity regions and coarse-granularity regions contained in the region set, according to the weight of each fine-granularity region in the region set and the weight of each coarse-granularity region in the region set, a coarse-granularity region being a region that has no superior-subordinate relationship with any other region in the region set.
It can be understood that if the weights of all fine-granularity regions and coarse-granularity regions in the region set are small, the network information has no obvious regional attribute. Determining the region to which the network information belongs from the fine-granularity regions and coarse-granularity regions contained in the region set includes:
comparing the weights of the fine-granularity regions and coarse-granularity regions contained in the region set with a first preset threshold;
when no weight is greater than or equal to the first preset threshold, determining that the network information has no region to which it belongs;
when at least one weight is greater than or equal to the first preset threshold, determining the target region corresponding to the maximum weight as the regional attribute of the network information, the target region being a fine-granularity region or a coarse-granularity region.
If it is understood that when being at least one more than or equal to the number of the weight of first predetermined threshold value, and greatly In the corresponding each corresponding weight in target area of weight equal to first predetermined threshold value, any two target area is corresponding When the difference of weight is all smaller, illustrate that the network information may be still without obvious Regional Property.Or, the affiliated region of the network information is Multiple purpose regions.
When the number of the weight when more than or equal to first predetermined threshold value is at least one, by weight limit correspondence Target area, being defined as the affiliated Regional Property of the network information includes:
When the number of the weight more than or equal to first predetermined threshold value is one, will be greater than being preset equal to described first The corresponding target area of weight of threshold value, is defined as the affiliated Regional Property of the network information;
When the number of the weight more than or equal to the predetermined threshold value is at least two, calculates and be more than or equal to the default threshold In corresponding at least two target area of weight of value, the difference of each two target area respective weights;
When at least one difference is more than or equal to the second predetermined threshold value, the corresponding target area of weight limit is defined as The affiliated Regional Property of the network information;
When all differences are respectively less than second predetermined threshold value, determine the network information without affiliated region.
Above-mentioned first preset value and the second preset value can be configured according to actual conditions.
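The two-threshold decision described above can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the function name `decide_region`, the dictionary representation of the weights, and the threshold values used in the usage note are all assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, Optional


def decide_region(
    weights: Dict[str, float],
    first_threshold: float,
    second_threshold: float,
) -> Optional[str]:
    """Return the region the information belongs to, or None if it has none.

    `weights` maps each fine-grained or coarse-grained region to its weight
    (hypothetical representation chosen for this sketch).
    """
    # Keep only regions whose weight reaches the first preset threshold.
    candidates = {r: w for r, w in weights.items() if w >= first_threshold}

    # No weight reaches the threshold: no region to which it belongs.
    if not candidates:
        return None

    # Exactly one candidate: it is the region.
    if len(candidates) == 1:
        return next(iter(candidates))

    # Two or more candidates: compare the weights of every pair.
    diffs = [abs(a - b) for a, b in combinations(candidates.values(), 2)]

    # At least one pair differs enough: take the maximum-weight region.
    if any(d >= second_threshold for d in diffs):
        return max(candidates, key=candidates.get)

    # All candidates are too close together: no obvious regional attribute.
    return None
```

For instance, with assumed thresholds of 0.5 and 0.3, `decide_region({"A": 0.9, "B": 0.55}, 0.5, 0.3)` selects `"A"`, while `decide_region({"A": 0.8, "B": 0.75}, 0.5, 0.3)` returns `None` because the two candidate weights are too close.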
Fig. 5 is a schematic diagram of identifying the region to which each item of network information belongs, using the method for identifying the region to which network information belongs provided in this embodiment of the present invention.
In Fig. 5, the portion outlined by the dashed box shows the regions contained in the region set corresponding to each item of network information.
Fig. 5 shows examples in which the method provided in this embodiment of the present invention is used to identify the regions of regional news, recruitment information, search notices, and regional cuisine. The method can also be used to identify the regions of other network information, such as weather forecasts.
An embodiment of the present invention further provides a device for identifying the region to which network information belongs, corresponding to the above method. Fig. 6 is a schematic structural diagram of the device provided in this embodiment of the present invention. The device includes:
An acquisition module 61, configured to obtain the region set contained in the network information, where the region set includes at least one region;
A first determining module 62, configured to determine the weight corresponding to each region according to the number of times and the positions at which each region in the region set appears in the network information, where the weight characterizes the probability that the region to which the network information belongs is the corresponding region;
A second determining module 63, configured to determine, from the region set and according to the weight corresponding to each region, the region to which the network information belongs.
Optionally, the acquisition module includes:
A word obtaining unit, configured to segment the network information to obtain multiple words;
A target word obtaining unit, configured to obtain, from the multiple words, the target words that match the pre-stored regions;
A composing unit, configured to form the region set from the target words.
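The behavior of the acquisition module can be illustrated with a short Python sketch. It is not the patent's implementation: the gazetteer `KNOWN_REGIONS`, the regex-based word splitting (a stand-in for a real word segmenter, which Chinese text would require), and the function names are all assumptions made for the example.

```python
import re
from typing import List, Set

# Hypothetical pre-stored regions; a real system would load a full gazetteer.
KNOWN_REGIONS: Set[str] = {"Beijing", "Haidian", "Shanghai"}


def segment(text: str) -> List[str]:
    """Split the network information into words (simplified segmentation)."""
    return re.findall(r"\w+", text)


def region_set(text: str, known_regions: Set[str] = KNOWN_REGIONS) -> Set[str]:
    """Collect the words that match a pre-stored region into the region set."""
    return {word for word in segment(text) if word in known_regions}
```

Calling `region_set("Heavy rain hits Haidian, Beijing on Monday")` yields a set containing `"Haidian"` and `"Beijing"`; text that mentions no stored region yields an empty set.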
Optionally, the first determining module includes:
A first judging unit, configured to determine the position at which each region appears in the network information;
A first calculating unit, configured to calculate the weight of a first region according to a first function when the first region in the region set is located in the title of the network information, where the first function takes the position of the corresponding region in the title as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the first function are negatively correlated;
A second calculating unit, configured to calculate the weight of a second region according to a second function when the second region in the region set is located in the body text of the network information, where the second function takes the position of the corresponding region in the body text as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
Optionally, the first determining module further includes:
An adding unit, configured to add up the weights corresponding to a third region when the third region in the region set appears in the network information two or more times;
A weight determining unit, configured to determine the sum of the weights corresponding to the third region as the weight of the third region.
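A minimal Python sketch of the position-based weighting and the summation over repeated occurrences follows. The text only requires that weight and position be negatively correlated; the concrete functions `2.0 / (1 + pos)` and `1.0 / (1 + pos)`, the choice to weight title occurrences more heavily than body occurrences, and the use of a word index as the position are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def title_weight(pos: int) -> float:
    """Assumed first function: weight falls as the title position grows."""
    return 2.0 / (1 + pos)


def body_weight(pos: int) -> float:
    """Assumed second function: weight falls as the body position grows."""
    return 1.0 / (1 + pos)


def region_weights(occurrences: List[Tuple[str, str, int]]) -> Dict[str, float]:
    """Sum the per-occurrence weights of each region.

    Each occurrence is (region, where, pos), with `where` equal to "title"
    or "body" and `pos` a zero-based word index (hypothetical encoding).
    """
    totals: Dict[str, float] = defaultdict(float)
    for region, where, pos in occurrences:
        fn = title_weight if where == "title" else body_weight
        # A region that appears several times accumulates all its weights.
        totals[region] += fn(pos)
    return dict(totals)
```

With these assumed functions, a region appearing at title position 0 and body position 3 receives 2.0 + 0.25 = 2.25, while a region appearing only at body position 1 receives 0.5.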
Optionally, the second determining module includes:
A second judging unit, configured to judge, according to pre-stored superior-subordinate relationships that characterize the affiliation between regions, whether a superior-subordinate relationship exists between the regions in the region set;
A first determining unit, configured to determine at least two regions as one fine-grained region when a superior-subordinate relationship exists between the at least two regions in the region set;
An obtaining unit, configured to perform a preset calculation on the weights corresponding to the at least two regions to obtain the weight corresponding to the fine-grained region;
A second determining unit, configured to determine the region to which the network information belongs, from among the fine-grained regions and coarse-grained regions contained in the region set, according to the weight corresponding to each fine-grained region in the region set and the weight corresponding to each coarse-grained region in the region set that has no superior-subordinate relationship with any other region.
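The merging of regions that stand in a superior-subordinate relationship can be sketched in Python as below. The text leaves the "preset calculation" unspecified; this sketch assumes it is a simple sum. The `PARENT` table, the `"Superior/Subordinate"` label format, and the function names are likewise hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical pre-stored hierarchy: each region maps to its superior region.
# The table is assumed to be acyclic.
PARENT: Dict[str, Optional[str]] = {
    "Haidian": "Beijing",
    "Beijing": None,
    "Shanghai": None,
}


def ancestors(region: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """List the superiors of a region, nearest first."""
    chain: List[str] = []
    cur = parent.get(region)
    while cur is not None:
        chain.append(cur)
        cur = parent.get(cur)
    return chain


def merge_by_hierarchy(
    weights: Dict[str, float],
    parent: Dict[str, Optional[str]] = PARENT,
) -> Dict[str, float]:
    """Merge related regions into fine-grained regions; keep the rest as-is."""
    present = set(weights)
    # Regions that are a superior of some other region in the set.
    has_subordinate = {
        a for r in present for a in ancestors(r, parent) if a in present
    }
    merged: Dict[str, float] = {}
    for region in present - has_subordinate:
        chain = [a for a in ancestors(region, parent) if a in present]
        if chain:
            # Fine-grained region: assumed preset calculation is a sum.
            label = "/".join(reversed(chain)) + "/" + region
            merged[label] = weights[region] + sum(weights[a] for a in chain)
        else:
            # Coarse-grained region with no related region in the set.
            merged[region] = weights[region]
    return merged
```

Given weights of 2.0 for `"Beijing"`, 0.5 for `"Haidian"`, and 0.25 for `"Shanghai"`, the sketch produces a fine-grained region `"Beijing/Haidian"` with weight 2.5 and leaves `"Shanghai"` as a coarse-grained region with weight 0.25.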
Optionally, the second determining unit includes:
A judging subunit, configured to compare the weight corresponding to each fine-grained region and each coarse-grained region contained in the region set with a first preset threshold;
A first determining subunit, configured to determine that the network information has no region to which it belongs when the number of weights greater than or equal to the first preset threshold is zero;
A second determining subunit, configured to determine the target region corresponding to the maximum weight as the regional attribute of the network information when the number of weights greater than or equal to the first preset threshold is at least one, where the target region is a fine-grained region or a coarse-grained region.
Optionally, the second determining subunit includes:
A first determining submodule, configured to determine the target region corresponding to the weight greater than or equal to the first preset threshold as the regional attribute of the network information when the number of weights greater than or equal to the first preset threshold is one;
A calculating submodule, configured to calculate, when the number of weights greater than or equal to the preset threshold is at least two, the difference between the weights of every two target regions among the at least two target regions corresponding to those weights;
A second determining submodule, configured to determine the target region corresponding to the maximum weight as the regional attribute of the network information when at least one difference is greater than or equal to a second preset threshold;
A third determining submodule, configured to determine that the network information has no region to which it belongs when all differences are less than the second preset threshold.
An embodiment of the present invention further provides an electronic device. Fig. 7 is a schematic structural diagram of the electronic device provided in this embodiment of the present invention. The electronic device includes: a processor 71, a communication interface 72, a memory 73, and a communication bus 74;
The processor 71, the communication interface 72, and the memory 73 communicate with one another through the communication bus 74;
Optionally, the communication interface 72 may be an interface of a communication module, such as an interface of a GSM module;
The processor 71 is configured to execute a program;
The memory 73 is configured to store the program and data;
The program may include program code, and the program code includes computer operation instructions; the data may include regions or the superior-subordinate relationships between regions.
The processor 71 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 73 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
The program may be specifically configured to:
Obtain the region set contained in the network information, where the region set includes at least one region;
Determine the weight corresponding to each region according to the number of times and the positions at which each region in the region set appears in the network information, where the weight characterizes the probability that the region to which the network information belongs is the corresponding region;
Determine, from the region set and according to the weight corresponding to each region, the region to which the network information belongs.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for identifying the region to which network information belongs, characterized by comprising:
Obtaining the region set contained in the network information, where the region set includes at least one region;
Determining the weight corresponding to each region according to the number of times and the positions at which each region in the region set appears in the network information, where the weight characterizes the probability that the region to which the network information belongs is the corresponding region;
Determining, from the region set and according to the weight corresponding to each region, the region to which the network information belongs.
2. The method for identifying the region to which network information belongs according to claim 1, wherein obtaining the region set contained in the network information comprises:
Segmenting the network information to obtain multiple words;
Obtaining, from the multiple words, the target words that match the pre-stored regions;
Forming the region set from the target words.
3. The method for identifying the region to which network information belongs according to claim 1, characterized in that determining the weight corresponding to each region according to the number of times and the positions at which each region in the region set appears in the network information comprises:
Determining the position at which each region in the region set appears in the network information;
When a first region in the region set is located in the title of the network information, calculating the weight of the first region according to a first function, where the first function takes the position of the corresponding region in the title as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the first function are negatively correlated;
When a second region in the region set is located in the body text of the network information, calculating the weight of the second region according to a second function, where the second function takes the position of the corresponding region in the body text as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
4. The method for identifying the region to which network information belongs according to claim 3, characterized in that determining the weight corresponding to each region according to the number of times and the positions at which each region in the region set appears in the network information further comprises:
When a third region in the region set appears in the network information two or more times, adding up the weights corresponding to the third region;
Determining the sum of the weights corresponding to the third region as the weight of the third region.
5. The method for identifying the region to which network information belongs according to any one of claims 1 to 4, characterized in that determining, from the region set and according to the weight corresponding to each region, the region to which the network information belongs comprises:
Judging, according to pre-stored superior-subordinate relationships that characterize the affiliation between regions, whether a superior-subordinate relationship exists between the regions in the region set;
When a superior-subordinate relationship exists between at least two regions in the region set, determining the at least two regions as one fine-grained region;
Performing a preset calculation on the weights corresponding to the at least two regions to obtain the weight corresponding to the fine-grained region;
Determining the region to which the network information belongs, from among the fine-grained regions and coarse-grained regions contained in the region set, according to the weight corresponding to each fine-grained region in the region set and the weight corresponding to each coarse-grained region in the region set that has no superior-subordinate relationship with any other region.
6. The method for identifying the region to which network information belongs according to claim 5, characterized in that determining the region to which the network information belongs, from among the fine-grained regions and coarse-grained regions contained in the region set, comprises:
Comparing the weight corresponding to each fine-grained region and each coarse-grained region contained in the region set with a first preset threshold;
When the number of weights greater than or equal to the first preset threshold is zero, determining that the network information has no region to which it belongs;
When the number of weights greater than or equal to the first preset threshold is at least one, determining the target region corresponding to the maximum weight as the regional attribute of the network information, where the target region is a fine-grained region or a coarse-grained region.
7. The method for identifying the region to which network information belongs according to claim 6, characterized in that, when the number of weights greater than or equal to the first preset threshold is at least one, determining the target region corresponding to the maximum weight as the regional attribute of the network information comprises:
When the number of weights greater than or equal to the first preset threshold is one, determining the target region corresponding to the weight greater than or equal to the first preset threshold as the regional attribute of the network information;
When the number of weights greater than or equal to the preset threshold is at least two, calculating the difference between the weights of every two target regions among the at least two target regions corresponding to those weights;
When at least one difference is greater than or equal to a second preset threshold, determining the target region corresponding to the maximum weight as the regional attribute of the network information;
When all differences are less than the second preset threshold, determining that the network information has no region to which it belongs.
8. A device for identifying the region to which network information belongs, characterized by comprising:
An acquisition module, configured to obtain the region set contained in the network information, where the region set includes at least one region;
A first determining module, configured to determine the weight corresponding to each region according to the number of times and the positions at which each region in the region set appears in the network information, where the weight characterizes the probability that the region to which the network information belongs is the corresponding region;
A second determining module, configured to determine, from the region set and according to the weight corresponding to each region, the region to which the network information belongs.
9. The device for identifying the region to which network information belongs according to claim 8, characterized in that the first determining module comprises:
A first judging unit, configured to determine the position at which each region appears in the network information;
A first calculating unit, configured to calculate the weight of a first region according to a first function when the first region in the region set is located in the title of the network information, where the first function takes the position of the corresponding region in the title as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the first function are negatively correlated;
A second calculating unit, configured to calculate the weight of a second region according to a second function when the second region in the region set is located in the body text of the network information, where the second function takes the position of the corresponding region in the body text as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
10. The device for identifying the region to which network information belongs according to claim 8, characterized in that the second determining module comprises:
A second judging unit, configured to judge, according to pre-stored superior-subordinate relationships that characterize the affiliation between regions, whether a superior-subordinate relationship exists between the regions in the region set;
A first determining unit, configured to determine at least two regions as one fine-grained region when a superior-subordinate relationship exists between the at least two regions in the region set;
An obtaining unit, configured to perform a preset calculation on the weights corresponding to the at least two regions to obtain the weight corresponding to the fine-grained region;
A second determining unit, configured to determine the region to which the network information belongs, from among the fine-grained regions and coarse-grained regions contained in the region set, according to the weight corresponding to each fine-grained region in the region set and the weight corresponding to each coarse-grained region in the region set that has no superior-subordinate relationship with any other region.
CN201710141330.6A 2017-03-10 2017-03-10 The affiliated spatial identification method and device of the network information Pending CN106919705A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710141330.6A CN106919705A (en) 2017-03-10 2017-03-10 The affiliated spatial identification method and device of the network information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710141330.6A CN106919705A (en) 2017-03-10 2017-03-10 The affiliated spatial identification method and device of the network information

Publications (1)

Publication Number Publication Date
CN106919705A true CN106919705A (en) 2017-07-04

Family

ID=59460363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710141330.6A Pending CN106919705A (en) 2017-03-10 2017-03-10 The affiliated spatial identification method and device of the network information

Country Status (1)

Country Link
CN (1) CN106919705A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779174A (en) * 2012-06-26 2012-11-14 北京奇虎科技有限公司 Public opinion information display system and method
CN103020038A (en) * 2012-12-25 2013-04-03 人民搜索网络股份公司 Internet public opinion regional relevance computing method
CN103064951A (en) * 2012-12-31 2013-04-24 南京烽火星空通信发展有限公司 Region recognition method and device of public opinion information
CN106357835A (en) * 2016-09-05 2017-01-25 百度在线网络技术(北京)有限公司 Method and device for determining subordinate region of target IP address
CN106375955A (en) * 2016-08-30 2017-02-01 多盟睿达科技(中国)有限公司 Regional identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170704
