CN106919705A - Method and device for identifying the region to which network information belongs
Method and device for identifying the region to which network information belongs
- Publication number
- CN106919705A (application number CN201710141330.6A)
- Authority
- CN
- China
- Prior art keywords
- region
- network information
- weight
- affiliated
- function
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/52—Network services specially adapted for the location of the user terminal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/55—Push-based network services
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method and device for identifying the region to which network information belongs. A region set contained in the network information is obtained. A weight is determined for each region in the region set according to the number of times and the positions at which that region appears in the network information; the weight characterizes the probability that the corresponding region is the region to which the network information belongs. The region to which the network information belongs is then determined from the region set according to the weight of each region. When network information is pushed, it can therefore be pushed according to the region to which it belongs. For example, network information about Baoan District is pushed to users in Baoan District.
Description
Technical field
The invention relates to the field of communication technology, and more particularly to a method and device for identifying the region to which network information belongs.
Background
With the rapid development of the Internet worldwide, online media has been recognized as the "fourth medium" after newspapers, radio, and television. The network has become one of the main carriers of information, and its propagation characteristics cause a massive amount of network information to flood toward users.
However, users still pay more attention to events that happen near them, so accurately identifying the regional information in network information is particularly important.
Summary of the invention
In view of this, the invention provides a method and device for identifying the region to which network information belongs, to overcome the problem that the prior art cannot identify the region to which network information belongs.
To achieve the above object, the invention provides the following technical solutions:
A method for identifying the region to which network information belongs, including:
obtaining a region set contained in the network information, the region set including at least one region;
determining a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the corresponding region is the region to which the network information belongs;
determining, from the region set and according to the weight of each region, the region to which the network information belongs.
Wherein, obtaining the region set contained in the network information includes:
dividing the network information to obtain multiple words;
obtaining, from the multiple words, target words that match pre-stored regions;
forming the region set from the target words.
Wherein, determining the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information includes:
judging the position at which each region in the region set appears in the network information;
when a first region in the region set is located in the title of the network information, calculating the weight of the first region according to a first function, the first function taking the position of the corresponding region in the title as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the first function being negatively correlated;
when a second region in the region set is located in the body text of the network information, calculating the weight of the second region according to a second function, the second function taking the position of the corresponding region in the body text as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the second function being negatively correlated.
Wherein, determining the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information further includes:
when a third region in the region set appears in the network information two or more times, adding together the weights corresponding to each occurrence of the third region;
determining the sum of those weights as the weight of the third region.
Wherein, determining, from the region set and according to the weight of each region, the region to which the network information belongs includes:
judging, according to pre-stored hierarchical relationships that characterize which region each region belongs to, whether a hierarchical relationship exists between the regions in the region set;
when a hierarchical relationship exists between at least two regions in the region set, determining the at least two regions as one fine-grained region;
performing a preset calculation on the weights of the at least two regions to obtain the weight of the fine-grained region;
determining the region to which the network information belongs from among the fine-grained regions and coarse-grained regions contained in the region set, according to the weight of each fine-grained region and the weight of each coarse-grained region, a coarse-grained region being a region in the region set that has no hierarchical relationship with any other region in the set.
Wherein, determining the region to which the network information belongs from among the fine-grained regions and coarse-grained regions contained in the region set includes:
comparing the weights of the fine-grained regions and coarse-grained regions contained in the region set with a first preset threshold;
when no weight is greater than or equal to the first preset threshold, determining that the network information has no region to which it belongs;
when at least one weight is greater than or equal to the first preset threshold, determining the target region corresponding to the largest weight as the region to which the network information belongs, the target region being a fine-grained region or a coarse-grained region.
Wherein, when at least one weight is greater than or equal to the first preset threshold, determining the target region corresponding to the largest weight as the region to which the network information belongs includes:
when exactly one weight is greater than or equal to the first preset threshold, determining the target region corresponding to that weight as the region to which the network information belongs;
when at least two weights are greater than or equal to the first preset threshold, calculating, for the at least two target regions corresponding to those weights, the difference between the weights of every two target regions;
when at least one difference is greater than or equal to a second preset threshold, determining the target region corresponding to the largest weight as the region to which the network information belongs;
when all differences are less than the second preset threshold, determining that the network information has no region to which it belongs.
A device for identifying the region to which network information belongs, including:
an acquisition module, configured to obtain a region set contained in the network information, the region set including at least one region;
a first determining module, configured to determine a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the corresponding region is the region to which the network information belongs;
a second determining module, configured to determine, from the region set and according to the weight of each region, the region to which the network information belongs.
Wherein, the first determining module includes:
a first judging unit, configured to judge the position at which each region appears in the network information;
a first computing unit, configured to calculate, when a first region in the region set is located in the title of the network information, the weight of the first region according to a first function, the first function taking the position of the corresponding region in the title as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the first function being negatively correlated;
a second computing unit, configured to calculate, when a second region in the region set is located in the body text of the network information, the weight of the second region according to a second function, the second function taking the position of the corresponding region in the body text as the independent variable and the weight as the dependent variable, the independent variable and the dependent variable of the second function being negatively correlated.
Wherein, the second determining module includes:
a second judging unit, configured to judge, according to pre-stored hierarchical relationships that characterize which region each region belongs to, whether a hierarchical relationship exists between the regions in the region set;
a first determining unit, configured to determine, when a hierarchical relationship exists between at least two regions in the region set, the at least two regions as one fine-grained region;
an acquiring unit, configured to perform a preset calculation on the weights of the at least two regions to obtain the weight of the fine-grained region;
a second determining unit, configured to determine the region to which the network information belongs from among the fine-grained regions and coarse-grained regions contained in the region set, according to the weight of each fine-grained region and the weight of each coarse-grained region, a coarse-grained region being a region in the region set that has no hierarchical relationship with any other region in the set.
As can be seen from the above technical solutions, compared with the prior art, the method provided by the embodiments of the invention obtains the region set contained in the network information and determines a weight for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the corresponding region is the region to which the network information belongs. The region to which the network information belongs is then determined from the region set according to the weight of each region. When network information is pushed, it can therefore be pushed according to the region to which it belongs. For example, network information about Baoan District is pushed to users in Baoan District.
Brief description of the drawings
To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for identifying the region to which network information belongs, provided by an embodiment of the invention;
Fig. 2 is a schematic flowchart of one implementation of determining the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information, in the method provided by an embodiment of the invention;
Fig. 3 is a partial schematic diagram of the region tree provided by an embodiment of the invention;
Fig. 4 is a schematic flowchart of one implementation of determining, from the region set and according to the weight of each region, the region to which the network information belongs, in the method provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of identifying the region to which each piece of network information belongs using the method provided by an embodiment of the invention;
Fig. 6 is a schematic structural diagram of the device for identifying the region to which network information belongs, provided by an embodiment of the invention;
Fig. 7 is a schematic structural diagram of the electronic device provided by an embodiment of the invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments of the invention. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Fig. 1 is a flowchart of a method for identifying the region to which network information belongs, provided by an embodiment of the invention. The method includes:
Step S101: Obtain the region set contained in the network information, the region set including at least one region.
The embodiments of the invention are directed to network information that mentions a region. If the network information contains no region at all, the embodiments of the invention cannot determine the region to which it belongs from the network information.
The network information can be voice information, text information, and so on. When the network information is voice information, the voice information can be converted to text information. The embodiments of the invention provide, but are not limited to, the following implementation of "obtaining the region set contained in the network information".
Divide the network information to obtain multiple words; obtain, from the multiple words, target words that match pre-stored regions; form the region set from the target words.
Assume the network information is: Baoan District cultural festival (宝安区文化节 in the original Chinese).
Dividing it character by character yields every contiguous substring: 宝, 宝安, 宝安区, 宝安区文, 宝安区文化, 宝安区文化节; 安, 安区, 安区文, 安区文化, 安区文化节; 区, 区文, 区文化, 区文化节; 文, 文化, 文化节; 化, 化节; 节.
Preferably, because of the particularities of Chinese text, a word segmentation tool is needed when extracting words, to cut the network information accurately into words and phrases and to filter out useless auxiliary words, adverbs, and stop words such as "she", "he", and "it". If the network information is in English, words such as "a" and "an" can be filtered out.
The regions can be stored in advance in a regional information database. The regions can include all provinces, cities, counties, towns, and villages in the country or in the world. Preferably, the regions can also include street information, neighborhood committee information, the latitude and longitude of each region, and landmark information for some regions, such as Daming Lake in Jinan and the Forbidden City in Beijing.
By matching the multiple words obtained by dividing the network information against the pre-stored regions, the region word Baoan District is obtained.
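The division and matching of step S101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the `KNOWN_REGIONS` set standing in for the regional information database, and the assumed Chinese source text 宝安区文化节 for "Baoan District cultural festival" are all hypothetical.

```python
def candidate_words(text):
    """Return every contiguous substring of text, grouped by start position."""
    return [text[i:j] for i in range(len(text)) for j in range(i + 1, len(text) + 1)]


def extract_region_set(text, known_regions):
    """Keep the candidate words that match a pre-stored region, without duplicates."""
    region_set = []
    for word in candidate_words(text):
        if word in known_regions and word not in region_set:
            region_set.append(word)
    return region_set


# Hypothetical stand-in for the regional information database.
KNOWN_REGIONS = {"广东省", "深圳市", "宝安区"}

words = candidate_words("宝安区文化节")          # 6 + 5 + 4 + 3 + 2 + 1 substrings
region_set = extract_region_set("宝安区文化节", KNOWN_REGIONS)
print(len(words), region_set)
```

Enumerating all substrings is quadratic in the text length; a real system would more likely rely on the word segmentation tool mentioned above and filter stop words before matching.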
Step S102: Determine the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information, the weight characterizing the probability that the corresponding region is the region to which the network information belongs.
Network information generally includes a title and body text. The position at which a region appears in the network information can refer to whether the region appears in the body text or in the title.
The position at which a region appears in the network information can also refer to the character at which the region begins in the text. Taking the network information "Baoan District cultural festival" as an example, Baoan District begins at the 1st character, so the position at which Baoan District appears in the network information is 1.
Preferably, the weight of a region that appears in the title is higher than its weight when it appears in the body text.
The weight of a region is positively correlated with the number of times the region appears in the network information and negatively correlated with the position at which it appears.
Positive correlation means that two variables change in the same direction: when one variable increases or decreases, the other variable also increases or decreases.
Negative correlation means that two variables change in opposite directions: when one variable increases, the other decreases, and when one decreases, the other increases.
Step S103: Determine, from the region set and according to the weight of each region, the region to which the network information belongs.
The larger the weight of a region, the more likely it is to be the region to which the network information belongs.
In the method provided by the embodiments of the invention, the region set contained in the network information is obtained, and a weight is determined for each region in the region set according to the number of times and the positions at which that region appears in the network information, the weight characterizing the probability that the corresponding region is the region to which the network information belongs. The region to which the network information belongs is then determined from the region set according to the weight of each region. When network information is pushed, it can therefore be pushed according to the region to which it belongs. For example, network information about Baoan District is pushed to users in Baoan District.
Fig. 2 is a schematic flowchart of one implementation of determining the weight of each region according to the number of times and the positions at which each region in the region set appears in the network information, in the method provided by an embodiment of the invention. The method includes:
Step S201: Judge the position at which each region in the region set appears in the network information.
Step S202: When a first region in the region set is located in the title of the network information, calculate the weight of the first region according to a first function.
The first function takes the position of the corresponding region in the title as the independent variable and the weight as the dependent variable, and the independent variable and the dependent variable of the first function are negatively correlated.
The first function can take various concrete forms. The embodiments of the invention provide, but are not limited to, the following formula:
$\text{weight} = (\text{first value})^{1/\text{position}}$, where the first value can be any value greater than or equal to 1. Assuming the first value is 2, the first function is $\text{weight} = 2^{1/\text{position}}$. Still taking "Baoan District cultural festival" as an example and assuming it is the title, the weight of Baoan District is $2^{1/1} = 2$. Taking the title "The first Fuyong butter crab food culture festival is held in Baoan District" as another example, Baoan District begins at position 14 of the original Chinese title, so the weight of Baoan District is $2^{1/14}$.
Step S203: When a second region in the region set is located in the body text of the network information, calculate the weight of the second region according to a second function.
The second function takes the position of the corresponding region in the body text as the independent variable and the weight as the dependent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
The second function can take various concrete forms. The embodiments of the invention provide, but are not limited to, the following formula:
$\text{weight} = (\text{second value})^{1/\text{position}}$, where the second value can be any value greater than or equal to 1; preferably, the second value is less than the first value. Assuming the second value is 1.01, the second function is $\text{weight} = 1.01^{1/\text{position}}$. Still taking "Baoan District cultural festival" as an example and assuming it is the body text, the weight of Baoan District is $1.01^{1/1} = 1.01$. Taking the body text "The first Fuyong butter crab food culture festival is held in Baoan District" as another example, the weight of Baoan District is $1.01^{1/14}$.
It can be understood that if the same region appears in the network information two or more times, the probability that it is the region to which the network information belongs is greater. In this case the method further includes:
Step S204: When a third region in the region set appears in the network information two or more times, add together the weights corresponding to each occurrence of the third region.
Step S205: Determine the sum of those weights as the weight of the third region.
The first region, the second region, and the third region may be the same region or different regions. If they are the same region, that region appears in both the title and the body text.
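Steps S202 to S205 can be combined into one accumulation pass, sketched below under stated assumptions: each occurrence is given as a hypothetical `(region, field, position)` tuple, and the bases 2 and 1.01 are the example first and second values from the text rather than prescribed constants.

```python
from collections import defaultdict

# Example bases from the text: titles weigh more than body text.
BASES = {"title": 2.0, "body": 1.01}


def region_weights(occurrences):
    """Sum base ** (1 / position) over every occurrence of each region.

    occurrences: iterable of (region, field, position) tuples, where field is
    "title" or "body" and position is the 1-based character index.
    """
    totals = defaultdict(float)
    for region, field, position in occurrences:
        totals[region] += BASES[field] ** (1.0 / position)
    return dict(totals)


weights = region_weights([
    ("Baoan District", "title", 1),   # contributes 2 ** 1
    ("Baoan District", "body", 1),    # contributes 1.01 ** 1
    ("Shenzhen", "body", 14),         # contributes 1.01 ** (1 / 14)
])
print(weights)
```

Because every occurrence contributes at least 1 when the bases are at least 1, a region that appears more often always ends up with a larger total, matching the positive correlation between weight and number of occurrences.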
It can be understood that the regions contained in the network information may have no hierarchical relationship, for example Beijing and Shanghai, while some regions do have a hierarchical relationship, for example Guangdong Province, Shenzhen, and Baoan District. In the above embodiments, the pre-stored regions can be stored in a tree structure; that is, the regions are stored as a tree in the regional information database, and the branch on which each region in the region set lies can be found through the tree structure. Fig. 3 is a partial schematic diagram of the region tree provided by an embodiment of the invention.
In Fig. 3, Hebei Province, Baoding, Boye County, and Xu Cun form one branch; Guangdong Province, Shenzhen, and Baoan District form another branch. Whether a hierarchical relationship exists between the regions in the region set can be judged through the tree structure, which makes it possible to obtain finer-grained regions. For example, Xu Cun is a finer-grained region than Hebei Province. Assume the region set includes Hebei Province, Baoding, Boye County, Xu Cun, Guangdong Province, Shenzhen, Baoan District, and Beijing. Then Hebei Province, Baoding, Boye County, and Xu Cun correspond to one fine-grained region; Guangdong Province, Shenzhen, and Baoan District correspond to another fine-grained region; and Beijing corresponds to a coarse-grained region.
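One way to picture the tree lookup is a child-to-parent map, as in the sketch below. The `PARENT` dictionary is a hypothetical fragment mirroring the two branches described for Fig. 3, and `group_regions` is an illustrative helper, not a function defined by the patent.

```python
# Hypothetical fragment of the region tree, stored as child -> parent.
PARENT = {
    "Baoding": "Hebei Province",
    "Boye County": "Baoding",
    "Xu Cun": "Boye County",
    "Shenzhen": "Guangdong Province",
    "Baoan District": "Shenzhen",
}


def ancestors(region):
    """Walk up the tree and return the ancestors, nearest first."""
    chain = []
    while region in PARENT:
        region = PARENT[region]
        chain.append(region)
    return chain


def group_regions(region_set):
    """Split a region set into fine-grained chains and coarse-grained singles."""
    members = set(region_set)
    has_descendant = {a for r in members for a in ancestors(r) if a in members}
    fine, coarse = [], []
    for region in region_set:
        if region in has_descendant:
            continue  # already represented by a deeper region on the same branch
        chain = [a for a in reversed(ancestors(region)) if a in members] + [region]
        (fine if len(chain) > 1 else coarse).append(chain)
    return fine, coarse


fine, coarse = group_regions([
    "Hebei Province", "Baoding", "Boye County", "Xu Cun",
    "Guangdong Province", "Shenzhen", "Baoan District", "Beijing",
])
print(fine, coarse)
```

Each fine-grained chain is listed from the highest-level region down to the most specific one, so the last element names the fine-grained region itself.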
Fig. 4 is a schematic flowchart of one implementation of determining, from the region set and according to the weight of each region, the region to which the network information belongs, in the method provided by an embodiment of the invention. The method includes:
Step S401: According to pre-stored hierarchical relationships that characterize which region each region belongs to, judge whether a hierarchical relationship exists between the regions in the region set.
Some region names are ambiguous. For example, for the name Chaoyang, Beijing has Chaoyang District and Liaoning has Chaoyang City. Assume the region set includes Liaoning Province and Chaoyang; then, according to the pre-stored hierarchical relationships, Chaoyang can be determined to be Chaoyang City in Liaoning Province. Step S401 provided by the embodiments of the invention can therefore also effectively resolve ambiguous region names.
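The Chaoyang case can be sketched as a lookup that keeps only the candidate whose parent region also appears in the region set. The `CANDIDATES` table and the `resolve_ambiguous` helper are hypothetical; the text does not specify what happens when no parent, or more than one parent, is present, so this sketch simply returns `None` in those cases.

```python
# Hypothetical gazetteer: ambiguous short name -> list of (full name, parent region).
CANDIDATES = {
    "Chaoyang": [
        ("Chaoyang District", "Beijing"),
        ("Chaoyang City", "Liaoning Province"),
    ],
}


def resolve_ambiguous(name, region_set):
    """Return the only candidate whose parent is in region_set, else None."""
    matches = [full for full, parent in CANDIDATES.get(name, []) if parent in region_set]
    return matches[0] if len(matches) == 1 else None


resolved = resolve_ambiguous("Chaoyang", {"Liaoning Province", "Chaoyang"})
print(resolved)
```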
Step S402: When a hierarchical relationship exists between at least two regions in the region set, determine the at least two regions as one fine-grained region.
Step S403: Perform a preset calculation on the weights of the at least two regions to obtain the weight of the fine-grained region.
The preset calculation can be a product, an average, and so on.
Taking Guangdong Province, Shenzhen, and Baoan District as an example, the weight of the fine-grained region Baoan District = (weight of Guangdong Province) × (weight of Shenzhen) × (weight of Baoan District); alternatively, the weight of the fine-grained region Baoan District = (weight of Guangdong Province + weight of Shenzhen + weight of Baoan District) / 3.
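The two preset calculations named in step S403 can be sketched as a single helper. The function name `combine_weights`, its `mode` parameter, and the sample weights are illustrative assumptions, not values from the patent.

```python
import math


def combine_weights(weights, mode="product"):
    """Combine the weights of the regions on one branch into one weight."""
    if not weights:
        raise ValueError("at least one weight is required")
    if mode == "product":
        return math.prod(weights)
    if mode == "mean":
        return sum(weights) / len(weights)
    raise ValueError(f"unknown mode: {mode}")


# Made-up weights for Guangdong Province, Shenzhen, and Baoan District.
branch = [2.0, 1.5, 1.0]
by_product = combine_weights(branch, "product")
by_mean = combine_weights(branch, "mean")
print(by_product, by_mean)
```

The product rewards branches in which every level is mentioned prominently, while the mean keeps the fine-grained weight on the same scale as single-region weights; whichever is chosen affects how the thresholds in the later steps should be set.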
Step S404: According to the weight of each fine-grained region in the region set and the weight of each coarse-grained region in the region set (a region that has no hierarchical relationship with any other region in the set), determine the region to which the network information belongs from among the fine-grained regions and coarse-grained regions contained in the region set.
It can be understood that if the weights of all the fine-grained regions and coarse-grained regions in the region set are small, the network information has no obvious regional attribute. Determining the region to which the network information belongs from among the fine-grained regions and coarse-grained regions contained in the region set includes:
comparing the weights of the fine-grained regions and coarse-grained regions contained in the region set with a first preset threshold;
when no weight is greater than or equal to the first preset threshold, determining that the network information has no region to which it belongs;
when at least one weight is greater than or equal to the first preset threshold, determining the target region corresponding to the largest weight as the region to which the network information belongs, the target region being a fine-grained region or a coarse-grained region.
It can be understood that when at least one weight is greater than or equal to the first preset threshold but the weights of any two of the corresponding target regions differ only slightly, the network information may still have no obvious regional attribute, or it may belong to multiple target regions.
Determining the target region corresponding to the maximum weight as the region attribute of the network information, when the number of weights greater than or equal to the first preset threshold is at least one, includes:
When the number of weights greater than or equal to the first preset threshold is one, determining the target region corresponding to that weight as the region attribute of the network information;
When the number of weights greater than or equal to the first preset threshold is at least two, calculating the difference between the weights of every two target regions among the at least two target regions corresponding to those weights;
When at least one difference is greater than or equal to the second preset threshold, determining the target region corresponding to the maximum weight as the region attribute of the network information;
When all differences are less than the second preset threshold, determining that the network information has no region to which it belongs.
The first preset threshold and the second preset threshold can be set according to actual conditions.
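The two-threshold decision described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `decide_region`, the dictionary representation of the weights, and the use of the absolute difference between weights are assumptions, not details given in the specification.

```python
from itertools import combinations
from typing import Dict, Optional


def decide_region(weights: Dict[str, float],
                  first_threshold: float,
                  second_threshold: float) -> Optional[str]:
    """Return the region the information belongs to, or None if there is none.

    `weights` maps each target region (fine-grained or coarse-grained) to its
    weight. The name and signature are hypothetical.
    """
    # Keep only target regions whose weight reaches the first preset threshold.
    candidates = {r: w for r, w in weights.items() if w >= first_threshold}
    if not candidates:
        return None  # no weight reaches the first threshold: no region
    best = max(candidates, key=candidates.get)
    if len(candidates) == 1:
        return best
    # At least two candidates: compare the weights of every pair
    # (absolute difference is an assumption of this sketch).
    diffs = [abs(a - b) for a, b in combinations(candidates.values(), 2)]
    if any(d >= second_threshold for d in diffs):
        return best  # at least one pair is clearly separated
    return None  # all candidates are too close: no obvious region
```

For example, with a first threshold of 2.0 and a second threshold of 1.0, the weights `{"Beijing": 5.0, "Shanghai": 1.0}` yield `"Beijing"`, while `{"Beijing": 5.0, "Shanghai": 4.8}` yield no region because the two candidates are too close.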
Fig. 5 is a schematic diagram of identifying the region to which each piece of network information belongs, using the method for identifying the region to which network information belongs provided by the embodiment of the present invention.
In Fig. 5, the part outlined by a dashed box shows the region set contained in the corresponding piece of network information.
Fig. 5 shows examples in which the method provided by the embodiment of the present invention is used to identify regional news, recruitment information, search notices, and regional cuisine. The method can also be used to identify other network information, such as weather forecasts.
An embodiment of the present invention further provides a device for identifying the region to which network information belongs, corresponding to the above method. Fig. 6 is a schematic structural diagram of the device provided by the embodiment of the present invention. The device includes:
An acquisition module 61, configured to obtain the region set contained in the network information, the region set including at least one region;
A first determining module 62, configured to determine the weight corresponding to each region according to the number of times and the position at which each region in the region set appears in the network information, the weight being used to characterize the probability that the region to which the network information belongs is the corresponding region;
A second determining module 63, configured to determine, from the region set, the region to which the network information belongs according to the weight corresponding to each region.
Optionally, the acquisition module includes:
A word obtaining unit, configured to divide the network information to obtain multiple words;
A target word obtaining unit, configured to obtain, from the multiple words, target words that match the pre-stored regions;
A composing unit, configured to form the region set from the target words.
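The three units of the acquisition module can be sketched as follows. This is a minimal sketch under stated assumptions: the specification does not say how the network information is divided into words, so a simple regular-expression split stands in for a real word segmenter, and the `KNOWN_REGIONS` store and all function names are hypothetical.

```python
import re
from typing import List, Set

# Hypothetical pre-stored regions; the specification does not list them.
KNOWN_REGIONS: Set[str] = {"Beijing", "Haidian", "Shanghai"}


def split_words(text: str) -> List[str]:
    """Divide the network information into words (stand-in segmenter)."""
    return re.findall(r"\w+", text)


def region_set(text: str, known_regions: Set[str] = KNOWN_REGIONS) -> Set[str]:
    """Keep the words that match a pre-stored region and form the region set."""
    return {word for word in split_words(text) if word in known_regions}
```

Calling `region_set("Job fair in Haidian, Beijing this weekend")` gives a set containing `"Haidian"` and `"Beijing"`. For Chinese text, a dedicated word segmenter would replace `split_words`.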
Optionally, the first determining module includes:
A first judging unit, configured to judge the position at which each region appears in the network information;
A first computing unit, configured to, when a first region in the region set is located in the title of the network information, calculate the weight of the first region according to a first function, where the first function takes the position of the corresponding region in the title as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the first function are negatively correlated;
A second computing unit, configured to, when a second region in the region set is located in the body text of the network information, calculate the weight of the second region according to a second function, where the second function takes the position of the corresponding region in the body text as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
Optionally, the first determining module further includes:
An addition unit, configured to, when a third region in the region set appears in the network information two or more times, add up the weights corresponding to the third region;
A weight determining unit, configured to determine the sum of the weights corresponding to the third region as the weight of the third region.
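The position-based weighting and the summation over repeated occurrences can be sketched as follows. The specification only requires that weight and position be negatively correlated; the concrete forms `2.0 / (1 + position)` and `1.0 / (1 + position)`, the use of a word index as the position, and all names below are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def title_weight(position: int) -> float:
    """Hypothetical first function: weight falls as the title position grows."""
    return 2.0 / (1 + position)


def body_weight(position: int) -> float:
    """Hypothetical second function: weight falls as the body position grows."""
    return 1.0 / (1 + position)


def region_weights(title_words: List[str],
                   body_words: List[str],
                   regions: Set[str]) -> Dict[str, float]:
    """Sum the per-occurrence weights of every region in the region set."""
    totals: Dict[str, float] = defaultdict(float)
    for position, word in enumerate(title_words):
        if word in regions:
            totals[word] += title_weight(position)
    for position, word in enumerate(body_words):
        if word in regions:
            totals[word] += body_weight(position)
    return dict(totals)
```

In this sketch a region that appears once simply keeps the weight of its single occurrence, and a region that appears several times receives the sum of the weights of all its occurrences. Giving title occurrences a larger scale than body occurrences is a design choice of the sketch, not something the specification states.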
Optionally, the second determining module includes:
A second judging unit, configured to judge, according to pre-stored hierarchical (superior-subordinate) relationships that characterize which region each region belongs to, whether a hierarchical relationship exists between the regions in the region set;
A first determining unit, configured to, when a hierarchical relationship exists between at least two regions in the region set, determine the at least two regions as one fine-grained region;
An obtaining unit, configured to perform a preset calculation on the weights corresponding to the at least two regions to obtain the weight corresponding to the fine-grained region;
A second determining unit, configured to determine the region to which the network information belongs, from the fine-grained regions and coarse-grained regions contained in the region set, according to the weight corresponding to each fine-grained region in the region set and the weight corresponding to each coarse-grained region in the region set that has no hierarchical relationship with any other region.
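The merging of hierarchically related regions into fine-grained regions can be sketched as follows. The specification leaves the "preset calculation" open, so this sketch assumes a plain sum of the weights; the `PARENT` table, the `"superior/subordinate"` naming of the merged region, and the function names are likewise hypothetical. The parent table is assumed to contain no cycles.

```python
from typing import Dict, List

# Hypothetical pre-stored hierarchy: subordinate region -> superior region.
PARENT: Dict[str, str] = {"Haidian": "Beijing", "Pudong": "Shanghai"}


def ancestors(region: str, parent: Dict[str, str]) -> List[str]:
    """Return the superiors of a region, nearest first."""
    chain: List[str] = []
    while region in parent:
        region = parent[region]
        chain.append(region)
    return chain


def merge_by_hierarchy(weights: Dict[str, float],
                       parent: Dict[str, str] = PARENT) -> Dict[str, float]:
    """Fold superior regions into their subordinates; leave the rest as-is."""
    present = set(weights)
    # Regions that are the superior of some other region in the set.
    superiors = {a for r in present for a in ancestors(r, parent) if a in present}
    merged: Dict[str, float] = {}
    for region in present:
        if region in superiors:
            continue  # accounted for through its subordinate(s)
        chain = [a for a in ancestors(region, parent) if a in present]
        if chain:
            # Fine-grained region: assumed preset calculation is a sum.
            name = "/".join(reversed(chain)) + "/" + region
            merged[name] = weights[region] + sum(weights[a] for a in chain)
        else:
            merged[region] = weights[region]  # coarse-grained region
    return merged
```

With the weights `{"Beijing": 2.0, "Haidian": 1.5, "Shanghai": 0.5}`, the sketch produces one fine-grained region `"Beijing/Haidian"` with weight 3.5 and one coarse-grained region `"Shanghai"` with weight 0.5, which can then be compared against the preset thresholds.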
Optionally, the second determining unit includes:
A judging subunit, configured to judge the magnitude relationship between the first preset threshold and the weights corresponding to the fine-grained regions and coarse-grained regions contained in the region set;
A first determining subunit, configured to determine that the network information has no region to which it belongs when the number of weights greater than or equal to the first preset threshold is zero;
A second determining subunit, configured to, when the number of weights greater than or equal to the first preset threshold is at least one, determine the target region corresponding to the maximum weight as the region attribute of the network information, the target region being a fine-grained region or a coarse-grained region.
Optionally, the second determining subunit includes:
A first determining submodule, configured to, when the number of weights greater than or equal to the first preset threshold is one, determine the target region corresponding to that weight as the region attribute of the network information;
A calculating submodule, configured to, when the number of weights greater than or equal to the first preset threshold is at least two, calculate the difference between the weights of every two target regions among the at least two target regions corresponding to those weights;
A second determining submodule, configured to, when at least one difference is greater than or equal to the second preset threshold, determine the target region corresponding to the maximum weight as the region attribute of the network information;
A third determining submodule, configured to determine that the network information has no region to which it belongs when all differences are less than the second preset threshold.
An embodiment of the present invention further provides an electronic device. Fig. 7 is a schematic structural diagram of the electronic device provided by the embodiment of the present invention. The electronic device includes: a processor 71, a communication interface 72, a memory 73, and a communication bus 74;
The processor 71, the communication interface 72, and the memory 73 communicate with one another through the communication bus 74;
Optionally, the communication interface 72 may be an interface of a communication module, such as an interface of a GSM module;
The processor 71 is configured to execute a program;
The memory 73 is configured to store the program and data;
The program may include program code, and the program code includes computer operation instructions; the data may include regions or the hierarchical relationships between regions.
The processor 71 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 73 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The program may be specifically configured to:
Obtain the region set contained in the network information, the region set including at least one region;
Determine the weight corresponding to each region according to the number of times and the position at which each region in the region set appears in the network information, the weight being used to characterize the probability that the region to which the network information belongs is the corresponding region;
Determine, from the region set, the region to which the network information belongs according to the weight corresponding to each region.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar among the embodiments, reference may be made to one another.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method for identifying the region to which network information belongs, characterized by comprising:
Obtaining the region set contained in the network information, the region set including at least one region;
Determining the weight corresponding to each region according to the number of times and the position at which each region in the region set appears in the network information, the weight being used to characterize the probability that the region to which the network information belongs is the corresponding region;
Determining, from the region set, the region to which the network information belongs according to the weight corresponding to each region.
2. The method for identifying the region to which network information belongs according to claim 1, wherein obtaining the region set contained in the network information includes:
Dividing the network information to obtain multiple words;
Obtaining, from the multiple words, target words that match the pre-stored regions;
Forming the region set from the target words.
3. The method for identifying the region to which network information belongs according to claim 1, characterized in that determining the weight corresponding to each region according to the number of times and the position at which each region in the region set appears in the network information includes:
Judging the position at which each region in the region set appears in the network information;
When a first region in the region set is located in the title of the network information, calculating the weight of the first region according to a first function, where the first function takes the position of the corresponding region in the title as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the first function are negatively correlated;
When a second region in the region set is located in the body text of the network information, calculating the weight of the second region according to a second function, where the second function takes the position of the corresponding region in the body text as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
4. The method for identifying the region to which network information belongs according to claim 3, characterized in that determining the weight corresponding to each region according to the number of times and the position at which each region in the region set appears in the network information further includes:
When a third region in the region set appears in the network information two or more times, adding up the weights corresponding to the third region;
Determining the sum of the weights corresponding to the third region as the weight of the third region.
5. The method for identifying the region to which network information belongs according to any one of claims 1 to 4, characterized in that determining, from the region set, the region to which the network information belongs according to the weight corresponding to each region includes:
Judging, according to pre-stored hierarchical (superior-subordinate) relationships that characterize which region each region belongs to, whether a hierarchical relationship exists between the regions in the region set;
When a hierarchical relationship exists between at least two regions in the region set, determining the at least two regions as one fine-grained region;
Performing a preset calculation on the weights corresponding to the at least two regions to obtain the weight corresponding to the fine-grained region;
Determining the region to which the network information belongs, from the fine-grained regions and coarse-grained regions contained in the region set, according to the weight corresponding to each fine-grained region in the region set and the weight corresponding to each coarse-grained region in the region set that has no hierarchical relationship with any other region.
6. The method for identifying the region to which network information belongs according to claim 5, characterized in that determining the region to which the network information belongs from the fine-grained regions and coarse-grained regions contained in the region set includes:
Judging the magnitude relationship between the first preset threshold and the weights corresponding to the fine-grained regions and coarse-grained regions contained in the region set;
When the number of weights greater than or equal to the first preset threshold is zero, determining that the network information has no region to which it belongs;
When the number of weights greater than or equal to the first preset threshold is at least one, determining the target region corresponding to the maximum weight as the region attribute of the network information, the target region being a fine-grained region or a coarse-grained region.
7. The method for identifying the region to which network information belongs according to claim 6, characterized in that determining the target region corresponding to the maximum weight as the region attribute of the network information, when the number of weights greater than or equal to the first preset threshold is at least one, includes:
When the number of weights greater than or equal to the first preset threshold is one, determining the target region corresponding to that weight as the region attribute of the network information;
When the number of weights greater than or equal to the preset threshold is at least two, calculating the difference between the weights of every two target regions among the at least two target regions corresponding to those weights;
When at least one difference is greater than or equal to the second preset threshold, determining the target region corresponding to the maximum weight as the region attribute of the network information;
When all differences are less than the second preset threshold, determining that the network information has no region to which it belongs.
8. A device for identifying the region to which network information belongs, characterized by comprising:
An acquisition module, configured to obtain the region set contained in the network information, the region set including at least one region;
A first determining module, configured to determine the weight corresponding to each region according to the number of times and the position at which each region in the region set appears in the network information, the weight being used to characterize the probability that the region to which the network information belongs is the corresponding region;
A second determining module, configured to determine, from the region set, the region to which the network information belongs according to the weight corresponding to each region.
9. The device for identifying the region to which network information belongs according to claim 8, characterized in that the first determining module includes:
A first judging unit, configured to judge the position at which each region appears in the network information;
A first computing unit, configured to, when a first region in the region set is located in the title of the network information, calculate the weight of the first region according to a first function, where the first function takes the position of the corresponding region in the title as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the first function are negatively correlated;
A second computing unit, configured to, when a second region in the region set is located in the body text of the network information, calculate the weight of the second region according to a second function, where the second function takes the position of the corresponding region in the body text as the dependent variable and the weight as the independent variable, and the independent variable and the dependent variable of the second function are negatively correlated.
10. The device for identifying the region to which network information belongs according to claim 8, characterized in that the second determining module includes:
A second judging unit, configured to judge, according to pre-stored hierarchical (superior-subordinate) relationships that characterize which region each region belongs to, whether a hierarchical relationship exists between the regions in the region set;
A first determining unit, configured to, when a hierarchical relationship exists between at least two regions in the region set, determine the at least two regions as one fine-grained region;
An obtaining unit, configured to perform a preset calculation on the weights corresponding to the at least two regions to obtain the weight corresponding to the fine-grained region;
A second determining unit, configured to determine the region to which the network information belongs, from the fine-grained regions and coarse-grained regions contained in the region set, according to the weight corresponding to each fine-grained region in the region set and the weight corresponding to each coarse-grained region in the region set that has no hierarchical relationship with any other region.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710141330.6A CN106919705A (en) | 2017-03-10 | 2017-03-10 | The affiliated spatial identification method and device of the network information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106919705A true CN106919705A (en) | 2017-07-04 |
Family
ID=59460363
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710141330.6A Pending CN106919705A (en) | 2017-03-10 | 2017-03-10 | The affiliated spatial identification method and device of the network information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106919705A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102779174A (en) * | 2012-06-26 | 2012-11-14 | 北京奇虎科技有限公司 | Public opinion information display system and method |
CN103020038A (en) * | 2012-12-25 | 2013-04-03 | 人民搜索网络股份公司 | Internet public opinion regional relevance computing method |
CN103064951A (en) * | 2012-12-31 | 2013-04-24 | 南京烽火星空通信发展有限公司 | Region recognition method and device of public opinion information |
CN106357835A (en) * | 2016-09-05 | 2017-01-25 | 百度在线网络技术(北京)有限公司 | Method and device for determining subordinate region of target IP address |
CN106375955A (en) * | 2016-08-30 | 2017-02-01 | 多盟睿达科技(中国)有限公司 | Regional identification method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170704 |