CN104462547B - A kind of method and system of configurable collecting webpage data - Google Patents
A kind of method and system of configurable collecting webpage data
- Publication number
- CN104462547B
- Application number
- CN201410822548.4A
- Authority
- CN
- China
- Prior art keywords
- acquisition
- configuration
- information
- website
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/957—Browsing optimisation, e.g. caching or content distillation
- G06F16/9577—Optimising the visualization of content, e.g. distillation of HTML documents
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention relates to a method and system for configurable web page data collection, particularly suited to cases where the way web page data is collected must be updated continually. The method comprises: S1, obtaining the configuration information for web page data collection from a database; S2, obtaining the required category website according to the configuration information and logging in; S3, obtaining the topics to be collected under the website according to the site information after login; S4, collecting the required web page content for the obtained topics according to the configuration information; S5, extracting the required information from the content pages according to the configured data table, using the regular expressions or rules configured in that table; S6, storing the extracted table data in the database. With this method and system, users can freely configure the web page data they need to collect and use the configured collection scheme to collect related data from across the web, which makes web page data collection flexible and convenient.
Description
Technical field
The invention relates to the field of network communication technology, and more specifically to a method and system for configurable web page data collection, suited to cases where the way web page data is collected is updated continually.
Background
With the rapid development of Web technology and Web applications and the arrival of the big-data era, monitoring of Web application sites (social platforms in particular), corporate public-opinion monitoring, user data collection, and big-data mining are being used more and more widely. Every industry increasingly depends on the Internet and relies heavily on Internet information. Internet data is massive, however, so how can the data we need be extracted?
Collection systems currently on the market target only one website or a handful of websites; there is no web page data collection method that can be configured for specified, particular data.
A web page layout may be built with tables, with DIVs, or with a mixture of both, so errors or exceptions can occur when data is collected. After a collected website is redesigned, the program has to be developed again, which increases development cost.
This calls for a system that collects such data. Each website has its own design and presentation, so a single parsing approach cannot collect from all websites. To avoid writing one parser per website and modifying the program whenever a website is redesigned, a general, configurable web page data collection system is needed.
Summary of the invention
The technical problem to be solved by the invention is that existing web page data collection systems can collect from only one or a few websites, which makes them single-purpose and of limited practical use. The invention provides a configurable, widely applicable method and system for web page data collection.
The technical solution adopted to solve this problem is a method for configurable web page data collection, comprising:
S1, obtaining the configuration information for web page data collection from a database, the configuration information comprising: the configured category information of the websites to collect, the configured topic collection template information, the configured content page collection template information, and the configured data table information;
S2, judging from the configured website category information whether the category of websites to collect is enabled; if so, enabling that category and obtaining the category websites; otherwise ending the program;
S3, judging from the configured website category information whether to log in to the obtained category website; if so, logging in to the category website; otherwise logging in to the category website using a virtual login page;
S4, obtaining the topics to be collected under the website according to the configured topic collection template information;
S5, judging whether the content of an obtained topic spans multiple pages; if so, obtaining the list of page URLs according to the pagination marker; otherwise directly obtaining the content pages of the topic;
S6, intercepting the content to collect according to the start marker and end marker of the content page, and obtaining the set of content page URLs according to an expression;
S7, obtaining the content pages to collect according to the configured content page collection template information;
S8, judging whether a collected content page spans multiple pages; if so, obtaining the list of page URLs according to the pagination marker and then intercepting the content according to the start marker and end marker of the content page; otherwise intercepting the content of the content page directly according to the start marker and end marker;
S9, obtaining the expression or rule corresponding to each field from the configured data table information and extracting the table data;
S10, storing the extracted table data in the database.
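Step S6 can be illustrated with a short Python sketch. This is only one possible reading of "intercepting by start and end marker" and "obtaining URLs by expression"; the function names `intercept` and `extract_urls`, the sample HTML, and the URL pattern are all assumptions made for illustration and do not appear in the patent.

```python
import re


def intercept(html, start_flag, end_flag):
    """Return the text between the first start_flag and the next end_flag.

    Returns an empty string when either marker is missing.
    """
    start = html.find(start_flag)
    if start == -1:
        return ""
    start += len(start_flag)
    end = html.find(end_flag, start)
    if end == -1:
        return ""
    return html[start:end]


def extract_urls(fragment, url_pattern):
    """Return the unique URLs captured by url_pattern, in document order.

    url_pattern is assumed to contain exactly one capture group.
    """
    urls = []
    for url in re.findall(url_pattern, fragment):
        if url not in urls:
            urls.append(url)
    return urls


# Hypothetical topic page: only the links inside the list block are wanted.
page = (
    '<div id="nav"><a href="/about">About</a></div>'
    '<ul class="list"><li><a href="/post/1">One</a></li>'
    '<li><a href="/post/2">Two</a></li></ul>'
    '<div id="footer"></div>'
)
block = intercept(page, '<ul class="list">', "</ul>")
urls = extract_urls(block, r'href="([^"]+)"')
print(urls)
```

Because the markers and the pattern are plain strings, they can live in the stored configuration, so a site redesign would only require new marker values and not new code.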
In the method for configurable web page data collection of the invention, the collection attribute information comprises: the collection URL, the collection website encoding, and the collection frequency.
The collection URL is used to collect the web page addresses that match the configuration;
The collection website encoding is used to collect the source encoding of the website;
The collection frequency is set to once every 5 minutes.
In the method for configurable web page data collection of the invention, the data table information comprises: the collected title, the collection time, the collected content, and the source of the collected content.
The collected title is the title of the content page;
The collected content is the content of the content page;
The source of the collected content is the information on where the content of the content page comes from.
In the method for configurable web page data collection of the invention, the steps for configuring the configuration information of step S1 comprise:
A, configuring the category and collection attribute information of the websites to collect;
B, configuring the topic collection template information;
C, configuring the content page collection template information;
D, storing the configuration information in the database so that it can be retrieved later.
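The configuration items listed above can be pictured as a small data structure. The following Python sketch is an assumption about how such a configuration might be laid out; the class names, field names, and the JSON serialization are illustrative choices, not something the patent specifies.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SiteConfig:
    """Category and collection attributes of one website (step A)."""
    category: str
    enabled: bool
    url: str
    encoding: str = "utf-8"
    interval_minutes: int = 5  # the text sets the frequency to every 5 minutes


@dataclass
class PageTemplate:
    """Markers used for a topic page or a content page (steps B and C)."""
    start_flag: str
    end_flag: str
    url_pattern: str = ""
    paging_flag: str = ""


@dataclass
class CollectionConfig:
    site: SiteConfig
    topic_template: PageTemplate
    content_template: PageTemplate
    # Maps a data table field name to the expression that extracts it.
    field_patterns: dict = field(default_factory=dict)


def to_json(config):
    """Serialize a configuration so it can be stored as text (step D)."""
    return json.dumps(asdict(config), ensure_ascii=False)


def from_json(text):
    """Rebuild a configuration from its stored JSON text."""
    raw = json.loads(text)
    return CollectionConfig(
        site=SiteConfig(**raw["site"]),
        topic_template=PageTemplate(**raw["topic_template"]),
        content_template=PageTemplate(**raw["content_template"]),
        field_patterns=raw["field_patterns"],
    )


example = CollectionConfig(
    site=SiteConfig(category="news", enabled=True, url="https://example.com/news"),
    topic_template=PageTemplate('<ul class="list">', "</ul>", r'href="([^"]+)"'),
    content_template=PageTemplate('<div class="body">', "</div>"),
    field_patterns={"title": r"<h1>(.*?)</h1>"},
)
restored = from_json(to_json(example))
```

Keeping everything the collector needs in one serializable object is what would let a user change the collection scheme without touching the program.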
A system for configurable web page data collection is constructed, comprising: a start module, a configuration retrieval module, a judgment module, a configuration information acquisition module, a database, a content interception module, and a storage module;
The database is used to store the configuration information and the table data;
The configuration information acquisition module is used to configure the web page data the user needs to collect;
The configuration information acquisition module comprises a website acquisition module, a website topic acquisition module, a content page acquisition module, and a table data acquisition module, wherein:
The website acquisition module is used to obtain the category websites the user needs;
The website topic acquisition module is used to obtain the topics the user needs within a category website;
The content page acquisition module is used to obtain the content pages the user needs within a topic;
The table data acquisition module is used to obtain the table data from the content pages.
The judgment module comprises: a first judgment module, a second judgment module, a third judgment module, and a fourth judgment module;
The content interception module comprises: a first content interception module and a second content interception module;
The configuration information acquisition module comprises: a website acquisition module, a website topic acquisition module, a content page acquisition module, and a table data acquisition module.
The start module is used to start the configurable web page data collection system;
The configuration retrieval module is used to retrieve from the database the configuration information needed for collection;
The first judgment module is used to judge whether the category and collection attributes of the websites to collect have been configured and whether the category is enabled; if so, the category is enabled and the category websites are obtained; otherwise the program ends;
The second judgment module is used to judge whether to log in to the obtained category website; if so, it logs in to the website; otherwise it logs in to the category website using a virtual login page;
The website topic acquisition module is used to obtain the required topics of the logged-in category website according to the configured website topic template information;
The third judgment module is used to judge whether the topic content spans multiple pages; if so, it obtains the list of page URLs according to the pagination marker and obtains the content pages of the multiple pages through that list; otherwise it directly obtains the content pages of the topic;
The first content interception module is used to intercept the content information using the start marker and end marker of the content page;
The content page acquisition module is used to obtain the required content pages from the website topic module according to the configured content page collection information;
The fourth judgment module is used to judge whether the content page spans multiple pages; if so, it obtains the list of page URLs according to the pagination marker and then intercepts the content of the content pages according to the start marker and end marker; otherwise it intercepts the content of the content page directly according to the start marker and end marker;
The second content interception module is used to intercept the content information using the start marker and end marker of the content page;
The table data extraction module is used to extract the table data with the expression or rule corresponding to each field, according to the configured data table information;
The storage module is used to store the extracted data in the database.
In the system for configurable web page data collection of the invention, the website acquisition module first judges, before it runs, whether the website is enabled and logged in; if so, the modules that obtain the website topics and the content pages are run; otherwise the process ends.
In the system for configurable web page data collection of the invention, if the fourth judgment module encounters a multi-page case, the data is collected by looping over the pages and merging their data as the paged content is collected.
The method and system for configurable web page data collection of the invention have the following beneficial effects: users can freely configure the web page data and conditions they need to collect, and the configured collection scheme collects the related data from across the web, so that content can be collected from any web page flexibly and conveniently.
Brief description of the drawings
Fig. 1 is a flow chart of the first preferred embodiment of the method for configurable web page data collection of the invention;
Fig. 2 is a flow chart of the second preferred embodiment of the method for configurable web page data collection of the invention;
Fig. 3 is a flow chart of the configuration steps used in the first or second preferred embodiment of the method for configurable web page data collection of the invention;
Fig. 4 is a block diagram of the system for configurable web page data collection of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
As shown in Fig. 1, in the flow chart of the first preferred embodiment of the method for configurable web page data collection of the invention, the method starts at step S100 and proceeds to step S110, in which the configuration information for web page data collection is obtained from the database; the configuration information comprises the configured category information of the websites to collect, the configured topic collection template information, the configured content page collection template information, and the configured data table information. Next, in step S120, the configured website category information is used to judge whether the category of websites to collect is enabled; if so, the category is enabled and the category websites are obtained; otherwise the program ends. Next, in step S130, the configured website category information is used to judge whether to log in to the obtained category website; if so, the category website is logged in to; otherwise the category website is logged in to using a virtual login page. Next, in step S140, the topics to be collected under the website are obtained according to the configured topic collection template information. Next, in step S150, it is judged whether the content of an obtained topic spans multiple pages; if so, the list of page URLs is obtained according to the pagination marker and the content pages of the multiple pages are obtained through that list; otherwise the content pages of the topic are obtained directly. Next, in step S160, the content to collect is intercepted according to the start marker and end marker of the content page, and the set of URLs of the multi-page content pages is obtained according to an expression. Next, in step S170, the content pages to collect are obtained according to the configured content page collection template information. Next, in step S180, it is judged whether a collected content page spans multiple pages; if so, the list of page URLs is obtained according to the pagination marker and the content of the content pages is then intercepted according to the start marker and end marker; otherwise the content of the content page is intercepted directly according to the start marker and end marker. Next, in step S190, the expression or rule corresponding to each field is obtained from the configured data table information and the table data is extracted. Next, in step S200, the extracted table data is stored in the database. Finally, the method ends at step S210.
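The multi-page check in steps S150 and S180 can be sketched as follows. This is a minimal Python illustration under the assumption that the "pagination marker" is a regular expression with one capture group that matches the links of the pager; the function name `page_urls` and the sample markup are hypothetical.

```python
import re


def page_urls(first_url, html, paging_pattern):
    """Return the URLs of all pages of one topic or content page.

    The first page is always included. If paging_pattern finds nothing,
    the result holds only first_url, which is the single-page case.
    """
    urls = [first_url]
    for url in re.findall(paging_pattern, html):
        if url not in urls:
            urls.append(url)
    return urls


# Hypothetical pager: page 2 is linked twice (as "2" and as "next").
paged_html = (
    '<div class="pager">'
    '<a href="/topic/7?page=2">2</a>'
    '<a href="/topic/7?page=3">3</a>'
    '<a href="/topic/7?page=2">next</a>'
    "</div>"
)
pattern = r'<a href="(/topic/7\?page=\d+)"'

multi = page_urls("/topic/7", paged_html, pattern)
single = page_urls("/topic/7", "<p>no pager here</p>", pattern)
is_multi_page = len(multi) > 1
```

Returning a list in both cases means the later interception step can loop over pages without a separate branch for single-page content.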
Further, the collection attribute information comprises: the collection URL, the collection website encoding, and the collection frequency.
Further, the data table information comprises: the collected title, the collection time, the collected content, and the source of the collected content.
Further, the expression is a regular expression. For example, to find the collection time with a regular expression, the expression for extracting the date is: \d{4}(-|/|)\d{1,2}\1\d{1,2}
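The date expression can be tried out with Python's `re` module. The sketch below assumes the expression reads `\d{4}(-|/|)\d{1,2}\1\d{1,2}`, that is, a four-digit year, a separator that is "-", "/", or empty, and a back-reference `\1` that forces the same separator between month and day. The helper name `find_date` and the sample strings are made up for illustration.

```python
import re

# Assumed reading of the date expression: the group captures the separator
# and \1 requires the same separator again between month and day.
DATE_PATTERN = re.compile(r"\d{4}(-|/|)\d{1,2}\1\d{1,2}")


def find_date(text):
    """Return the first date-like substring in text, or None if there is none."""
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None


samples = [
    "Posted 2014-12-25 10:30",
    "Updated 2014/3/5",
    "Mixed 2014-12/25",
    "no date here",
]
results = [find_date(s) for s in samples]
print(results)
```

The back-reference is what rejects mixed separators such as "2014-12/25"; without it the pattern would accept any combination of the allowed separators.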
The method for configurable web page data collection of the invention gives users a way to collect web page data according to their own custom configuration, which increases its practicality and effectiveness.
As shown in Fig. 2, in the flow chart of the second preferred embodiment of the method for configurable web page data collection of the invention, the method starts at step S300 and proceeds to step S310, in which the configuration information for web page data collection is obtained from the database. Next, in step S320, the websites of the configured category are obtained according to the configured website category information. Next, in step S330, the topics to be collected under the website are obtained according to the configured topic information. Next, in step S340, the required web page content is collected for the obtained topics. Next, in step S350, the information of the content pages is extracted according to the configured data table information, using the regular expressions or rules configured in it. Next, in step S360, the extracted table data is stored in the database. Finally, the method ends at step S370.
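The extraction step S350 can be sketched as a loop over the configured fields. The Python example below is an assumption about how "one expression per data table field" might work in practice; the field names follow the data table information named in the text (title, time, content, source), while the function name `extract_row`, the patterns, and the sample markup are invented for illustration.

```python
import re


def extract_row(content, field_patterns):
    """Apply one regular expression per field and return a row dictionary.

    Each pattern is assumed to contain one capture group. A field whose
    pattern does not match is stored as None.
    """
    row = {}
    for name, pattern in field_patterns.items():
        match = re.search(pattern, content, re.S)
        row[name] = match.group(1).strip() if match else None
    return row


# Hypothetical content page and field configuration.
content_page = (
    "<h1>Site relaunch</h1>"
    '<span class="time">2014-12-25</span>'
    '<div class="body"> New layout is live. </div>'
    '<span class="src">Example News</span>'
)
field_patterns = {
    "title": r"<h1>(.*?)</h1>",
    "collected_at": r'class="time">(\d{4}-\d{1,2}-\d{1,2})<',
    "content": r'<div class="body">(.*?)</div>',
    "source": r'class="src">(.*?)<',
    "author": r'class="author">(.*?)<',
}
row = extract_row(content_page, field_patterns)
```

The resulting dictionary maps directly onto one row of the configured data table, which is the form in which step S360 would store it.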
The method for configurable web page data collection of the invention gives users a way to collect web page data according to their own custom configuration; it is simpler and easier to use, and it increases practicality and effectiveness.
As shown in Fig. 3, in the flow chart of the configuration steps used in the first or second preferred embodiment of the method for configurable web page data collection of the invention, the configuration procedure starts at step S400 and proceeds to step S410, in which the category and collection attributes of the websites to collect are configured. Next, in step S420, the topic collection template is configured. Next, in step S430, the content page collection template is configured. Next, in step S440, the configuration information is stored in the database so that it can be retrieved later. Finally, the procedure ends at step S450.
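Storing the configuration and retrieving it later (step S440) can be sketched with Python's built-in `sqlite3` module. The patent does not name a database product or schema, so the table name `collect_config`, its columns, and the JSON encoding of the configuration are assumptions made for this example.

```python
import json
import sqlite3


def init_db(conn):
    """Create the (hypothetical) configuration table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collect_config ("
        "name TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.commit()


def save_config(conn, name, config):
    """Insert or overwrite one named configuration, stored as JSON text."""
    conn.execute(
        "INSERT OR REPLACE INTO collect_config (name, body) VALUES (?, ?)",
        (name, json.dumps(config)),
    )
    conn.commit()


def load_config(conn, name):
    """Return the named configuration as a dict, or None if it is not stored."""
    row = conn.execute(
        "SELECT body FROM collect_config WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
init_db(conn)
save_config(conn, "news-site", {
    "category": "news",
    "url": "https://example.com/news",
    "encoding": "utf-8",
    "interval_minutes": 5,
    "topic_template": {"start_flag": '<ul class="list">', "end_flag": "</ul>"},
})
loaded = load_config(conn, "news-site")
missing = load_config(conn, "unknown-site")
```

Saving under a name with `INSERT OR REPLACE` means a user can revise a collection scheme in place, and the collector picks up the new version the next time it retrieves the configuration.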
The configuration procedure of the invention makes it possible to search for and collect the required data of the relevant websites clearly and in detail, supplies the conditions that collection depends on, and facilitates the execution of the method flow.
As shown in Fig. 4, in the block diagram of the system for configurable web page data collection of the invention, the system comprises: a start module 510, a configuration retrieval module 520, a judgment module 530, a configuration information acquisition module 540, a content interception module 550, a storage module 560, and a database 570;
The judgment module 530 comprises: a first judgment module 531, a second judgment module 532, a third judgment module 533, and a fourth judgment module 534;
The content interception module 550 comprises: a first content interception module 551 and a second content interception module 552;
The database 570 is used to store the configuration information and the table data;
The configuration information acquisition module 540 comprises: a website acquisition module 541, a website topic acquisition module 542, a content page acquisition module 543, and a table data acquisition module 544.
The start module 510 is used to start the configurable web page data collection system;
The configuration retrieval module 520 is used to retrieve from the database the configuration information needed for collection;
The first judgment module 531 is used to judge whether the category and collection attributes of the websites to collect have been configured and whether the category is enabled; if so, the category is enabled; otherwise the program ends;
The website acquisition module 541 is used to obtain the required websites from the various categories of websites according to the configured website category and attribute information;
The second judgment module 532 is used to judge whether to log in to the obtained category website; if so, it logs in to the website; otherwise it logs in to the website using a virtual login page;
The website topic acquisition module 542 is used to obtain the required topic information of the logged-in website according to the configured website topic template information;
The third judgment module 533 is used to judge whether the topic content spans multiple pages; if so, it obtains the list of page URLs according to the pagination marker; otherwise it directly obtains the web page content of the topic;
The first content interception module 551 is used to intercept the content information using the start marker and end marker of the web page content;
The content page acquisition module 543 is used to obtain the required content page information from the website topic module according to the configured content page collection information;
The fourth judgment module 534 is used to judge whether the content page spans multiple pages; if so, it obtains the list of page URLs according to the pagination marker and then intercepts the content according to the start marker and end marker; otherwise it intercepts the content directly according to the start marker and end marker;
The second content interception module 552 is used to intercept the content information using the start marker and end marker of the content page;
The table data acquisition module 544 is used to extract the table data with the expression or rule corresponding to each field, according to the configured data table information;
The storage module 560 is used to store the extracted data in the database.
Further, the website acquisition module first judges, before it runs, whether the website is enabled and logged in; if so, the modules that obtain the website topics and the content pages are run; otherwise the process ends.
Further, if the fourth judgment module encounters a multi-page case, the data is collected by looping over the pages and merging their data as the paged content is collected.
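The loop-and-merge handling of multi-page content can be sketched as follows. This Python example is one possible interpretation: each page is fetched, its content is intercepted between the start and end markers, and the pieces are joined in page order. The `fetch` callable stands in for whatever HTTP client the system would use, and all names and sample pages here are hypothetical.

```python
def intercept(html, start_flag, end_flag):
    """Return the text between the first start_flag and the next end_flag."""
    start = html.find(start_flag)
    if start == -1:
        return ""
    start += len(start_flag)
    end = html.find(end_flag, start)
    return html[start:end] if end != -1 else ""


def collect_paged_content(urls, fetch, start_flag, end_flag, sep="\n"):
    """Loop over the page URLs and merge the intercepted content of each page.

    fetch is any callable that maps a URL to the page's HTML text.
    Pages without both markers contribute nothing to the merged result.
    """
    parts = []
    for url in urls:
        piece = intercept(fetch(url), start_flag, end_flag)
        if piece:
            parts.append(piece.strip())
    return sep.join(parts)


# Fake pages in a dict replace real network access for this example.
fake_pages = {
    "/post/1": '<div class="body">Part one.</div>',
    "/post/1?page=2": '<div class="body"> Part two. </div>',
    "/post/1?page=3": "<p>page without markers</p>",
}
merged = collect_paged_content(
    ["/post/1", "/post/1?page=2", "/post/1?page=3"],
    fake_pages.__getitem__,
    '<div class="body">',
    "</div>",
)
```

Passing `fetch` in as a parameter keeps the merging logic independent of how pages are downloaded, which also makes it easy to exercise without a network connection.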
Compared with the prior art, the advantage of the method and system for configurable web page data collection of the invention is that users can freely configure the web page data they need to collect and use the configured collection scheme to collect related data from across the web, which makes web page data collection flexible and convenient.
The above describes only embodiments of the invention and does not limit the scope of the invention. Any equivalent structural transformation made using the content of this specification and the drawings, or any direct or indirect application in other related technical fields, likewise falls within the scope of protection of the invention.
Claims (6)
1. A method for configurable web page data collection, characterized in that the method comprises:
S1, obtaining the configuration information for web page data collection from a database, the configuration information comprising: the configured category information of the websites to collect, the configured topic collection template information, the configured content page collection template information, and the configured data table information;
S2, obtaining the category websites to be collected according to the configured website category information, and judging whether to log in to the obtained category website; if so, logging in to the category website; otherwise logging in to the category website using a virtual login page;
S3, obtaining the topics to be collected in the obtained category website according to the configured topic collection template information, and judging whether the topic spans multiple pages; if so, obtaining the list of the multi-page URLs according to the pagination marker and obtaining the content pages of the multiple pages through that list; otherwise obtaining the content pages directly;
S4, obtaining the content pages to be collected from the obtained topic according to the configured content page collection template information, and judging whether the content page spans multiple pages; if so, intercepting the content of the content pages according to the list of page URLs obtained from the pagination marker and the start marker and end marker of the content pages; otherwise intercepting the content of the content page directly according to the start marker and end marker of the content page;
S5, obtaining the expression or rule corresponding to each field according to the configured data table information, and extracting the table data from the collected content pages;
S6, storing the extracted table data in the database.
2. The method for configurable web page data collection according to claim 1, characterized in that the data table information comprises: the collected title, the collection time, the collected content, and the source of the collected content.
3. The method for configurable web page data collection according to claim 1, characterized in that the steps for configuring the configuration information of step S1 comprise:
A, configuring the category and collection attributes of the websites to collect;
B, configuring the topic collection template;
C, configuring the content page collection template;
D, storing the configuration information in the database for later retrieval.
4. The method for configurable web page data collection according to claim 3, characterized in that the collection attributes comprise: the collection URL, the collection website encoding, and the collection frequency.
5. A system for configurable web page data collection based on the method of claim 1, characterized in that it comprises a database and a configuration information acquisition module, wherein:
The configuration information acquisition module is used to obtain the configuration information for web page data collection from the database;
The database is used to store the configuration information and the table data;
The configuration information acquisition module comprises a website acquisition module, a website topic acquisition module, a content page acquisition module, and a table data acquisition module, wherein:
The website acquisition module is used to obtain the category websites to be collected according to the configured website category information;
The website topic acquisition module is used to obtain the topics to be collected in the obtained category website according to the configured topic collection template information;
The content page acquisition module is used to obtain the content pages to be collected from the obtained topic according to the configured content page collection template information;
The table data acquisition module is used to obtain the expression or rule corresponding to each field and to extract the table data from the collected content pages.
6. The system for configurable web page data collection according to claim 5, characterized in that the website acquisition module is further used to judge, after execution, whether the category website is enabled; if so, the website topic acquisition module and the content page acquisition module are run; otherwise the process ends.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410822548.4A CN104462547B (en) | 2014-12-25 | 2014-12-25 | A kind of method and system of configurable collecting webpage data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410822548.4A CN104462547B (en) | 2014-12-25 | 2014-12-25 | A kind of method and system of configurable collecting webpage data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104462547A CN104462547A (en) | 2015-03-25 |
CN104462547B true CN104462547B (en) | 2019-04-02 |
Family
ID=52908582
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410822548.4A Active CN104462547B (en) | 2014-12-25 | 2014-12-25 | A kind of method and system of configurable collecting webpage data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104462547B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106547749A (en) * | 2015-09-16 | 2017-03-29 | 北京国双科技有限公司 | The method and apparatus of collecting webpage data |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104915334A (en) * | 2015-05-29 | 2015-09-16 | 浪潮软件集团有限公司 | Automatic extraction method of key information of bidding project based on semantic analysis |
CN106022126B (en) * | 2016-05-06 | 2018-07-24 | 哈尔滨工程大学 | A kind of web page characteristics extracting method towards WEB trojan horse detections |
CN106341470A (en) * | 2016-08-31 | 2017-01-18 | 北京量科邦信息技术有限公司 | Method for keeping conversation and grasping continuously-updated data of conversation |
CN108520043A (en) * | 2018-03-30 | 2018-09-11 | 纳思达股份有限公司 | Data object acquisition method, apparatus and system, computer readable storage medium |
CN108549678B (en) * | 2018-04-02 | 2020-06-19 | 北京今朝在线科技有限公司 | Information acquisition system |
CN108763279B (en) * | 2018-04-11 | 2020-12-15 | 北京中科闻歌科技股份有限公司 | Webpage data distributed template acquisition method and system |
CN109902220B (en) * | 2019-02-27 | 2023-11-24 | 腾讯科技(深圳)有限公司 | Webpage information acquisition method, device and computer readable storage medium |
CN110334259A (en) * | 2019-04-22 | 2019-10-15 | 新分享科技服务(深圳)有限公司 | Webpage data acquiring method, device and computer readable storage medium |
CN110188259A (en) * | 2019-05-27 | 2019-08-30 | 厦门商集网络科技有限责任公司 | A kind of data grab method and device of configurableization |
CN111953766A (en) * | 2020-08-07 | 2020-11-17 | 福建省天奕网络科技有限公司 | Method and system for collecting network data |
CN112667872B (en) * | 2020-11-17 | 2023-04-07 | 国家计算机网络与信息安全管理中心 | Real-time acquisition method of new coronary pneumonia epidemic situation data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101034997A (en) * | 2006-03-09 | 2007-09-12 | 新数通兴业科技(北京)有限公司 | Method and system for accurately publishing the data information |
CN101561802A (en) * | 2008-04-18 | 2009-10-21 | 上海复旦光华信息科技股份有限公司 | Web page structural data extraction method and system |
CN103593344A (en) * | 2012-08-13 | 2014-02-19 | 北大方正集团有限公司 | Information acquisition method and device |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106547749A (en) * | 2015-09-16 | 2017-03-29 | 北京国双科技有限公司 | The method and apparatus of collecting webpage data |
CN106547749B (en) * | 2015-09-16 | 2021-02-12 | 北京国双科技有限公司 | Webpage data acquisition method and device |
Also Published As
Publication number | Publication date |
---|---|
CN104462547A (en) | 2015-03-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104462547B (en) | A kind of method and system of configurable collecting webpage data | |
US11372935B2 (en) | Automatically generating a website specific to an industry | |
JP6377807B2 (en) | Rewriting search queries in online social networks | |
US8856100B2 (en) | Displaying browse sequence with search results | |
CN104765729B (en) | A kind of cross-platform microblogging community account matching process | |
US9355137B2 (en) | Displaying articles matching a user's interest based on key words and the number of comments | |
CN103294781A (en) | Method and equipment used for processing page data | |
CN102314440B (en) | Utilize the method and system in network operation language model storehouse | |
WO2019080910A1 (en) | Information processing system and method thereof for implementing information processing | |
CN108170678A (en) | A kind of text entities abstracting method and system | |
CN106302849A (en) | A kind of method carrying out moving solid fusion by carrier data | |
CN101894109A (en) | Database building method and device | |
CN104915438B (en) | A method of obtaining PCU associated data in specific topics microblogging | |
EP4232980A1 (en) | Content based related view recommendations | |
CN103999079A (en) | Aligning annotation of fields of documents | |
CN103997492B (en) | A kind of adaption system and method | |
CN106339381A (en) | Method and device for processing information | |
CN103377207B (en) | Microblog users relation acquisition method based on script engine | |
JP6680472B2 (en) | Information processing apparatus, information processing method, and information processing program | |
Vicient et al. | Unsupervised semantic clustering of Twitter hashtags | |
JP7003481B2 (en) | Reinforcing rankings for social media accounts and content | |
Shen et al. | A Catalogue Service for Internet GIS ervices Supporting Active Service Evaluation and Real‐Time Quality Monitoring | |
CN106599076B (en) | Forum guide map generation method and device | |
JP2009230483A (en) | Information retrieving method, program and device | |
CN104331472A (en) | Construction method and device of word segmentation training data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |