CN1932817A - Common interconnection network content keyword interactive system - Google Patents


Info

Publication number
CN1932817A
Authority
CN
China
Prior art keywords
keyword
webpage
module
page
web page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200610127372
Other languages
Chinese (zh)
Inventor
陈远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 200610127372
Publication of CN1932817A
Current legal status: Pending

Abstract

A general Internet content keyword interactive system that matches keywords accurately and adds them simply, quickly and in real time. It comprises a web page information grabber; a web page information analyzer, which analyzes the text of the grabbed pages; and a keyword index module, which computes a matrix of web pages and keywords. A keyword query module finds, in that web page-keyword matrix, the keyword array corresponding to the ID of the web page whose keywords are to be screened, computes its intersection with the advertisement keyword library to obtain a series of keywords, and then screens out the top-ranked keywords according to the web page classification and the advertisement keyword classification, for return to the JavaScript code interface. An ADCenter module receives the HTTP request sent by the JavaScript code, computes the web page ID corresponding to the URL, sends a query for that ID to the keyword query module, receives the returned keywords and their corresponding interactive advertisement information, and, by interacting with the JavaScript code interface, passes them back to the client on which the user is browsing the page.

Description

Common interconnection network content keyword interactive system
The general Internet content keyword interactive system, hereinafter referred to as the "civilian matchmaker", is a distinctive interaction technology. A page-grabbing tool obtains the content (hypertext) of the required Internet pages and passes it back to the system's dedicated servers. There the content undergoes text extraction, analysis and semantic judgment and is matched against the system keyword dictionary. By means of one line of code embedded in the source of the web page, information relevant to the page content (additional hyperlinks) is then returned to the client that displays the page. The purpose is to let a user who is browsing a web page interact with additional information relevant to the page content.
Technical field
The present invention relates to the Internet field, and in particular to a system that can add additional information according to web page content and realize interaction with the user.
Background technology
1. Search technology related to the civilian matchmaker technology.
Search technology is an existing technology in the Internet field. It grabs web page content (hypertext), determines through hypertext matching analysis which web pages are relevant to the search specified by the user, and returns the relevant search results to the user in a new page.
However, search technology can only return the relevant results to the user in a new page. Secondly, a search is usually initiated by the user. Thirdly, the results of a search merely reach the user; they do not form an effective interaction with the user. Therefore, measured against the scope covered by the civilian matchmaker technology, search technology is a system that "provides passive, one-way information to the user within a very small scope (inside the various search products)".
2. Internet content keyword publishing technology related to the civilian matchmaker technology.
Internet content keyword publishing technology can display keyword content in the Internet client. Either a web editor adds keyword content to a web page manually, or an editing program publishes keyword content automatically on static or dynamic web pages.
However, this technology can only add keyword information to certain specified web pages. In addition, automatically added keywords often contain word segmentation errors, for example adding the keyword "all over the world" to the sentence "download a song every day" (in the Chinese original, the characters of the keyword presumably straddle the boundary between two adjacent words), whereas adding keywords manually costs a great deal of labor and time.
3. Google AdSense advertisement publishing related to the civilian matchmaker technology.
Google AdSense is suitable for website publishers of all sizes. Using Google's search technology, it displays on a web page Google advertisements that are highly relevant to the page text.
However, this technology can only carry advertisement information in a web page in the form of a banner.
Summary of the invention
The object of the present invention is to provide a general Internet content keyword interactive system that matches keywords accurately and effectively, adds keywords simply, quickly and in real time, and realizes interaction on the client.
The general Internet content keyword interactive system of the present invention comprises:
The web page information grabber: starting from a partner website, this module grabs the relevant pages step by step according to certain hyperlink rules and passes them to the web page information analyzer;
The web page information analyzer: this module analyzes the raw page data grabbed by the web page information grabber. Applying the rules and algorithms specified by the system, it filters out other text such as links, advertisements and notices, obtains the body text of the page, and sends the various classified data to the keyword index module;
The keyword index module: this module receives the classified data from the web page information analyzer and generates the ranking table of web pages and keywords, from which it computes the web page-keyword matrix. The matrix is used to analyze the rank of each keyword in the body text of each web page and is made available to the keyword query module;
The keyword query module: this module interfaces with ADCenter to obtain the ID of the web page whose keywords are to be screened. It finds the keyword array corresponding to that ID in the web page-keyword matrix computed by the keyword index module, computes the intersection of that array with the advertisement keyword library to obtain a series of keywords, and then, taking both the web page classification and the advertisement keyword classification into account, screens out the top-ranked keywords to be returned to the JavaScript code interface;
The advertisement keyword library: this is the keyword dictionary specified by the system. All of its keywords are classified into categories. It is intersected with the keyword array corresponding to a web page ID and compared by category, in order to decide which keywords' interactive advertisement information is passed back to the client on which the user is browsing;
The ADCenter module: this module receives the HTTP request sent by the JavaScript code, computes the web page ID corresponding to the URL, and sends a query for that ID to the keyword query module. It then receives the returned keywords and their corresponding interactive advertisement information and, by interacting with the JavaScript code interface, passes this interactive information back to the client on which the user is browsing the page;
The JavaScript code interface: this interface realizes the interaction between the general Internet content keyword interactive system and the client on which the user is browsing, and is packaged as one line of code. Once this line of code has been added to the source file of a web page, then whenever a user opens the page on a client the interface sends the URL of the page to the ADCenter module of the system, receives the returned keywords and advertisement information, and loads and displays the keywords and interactive advertisement information in the web page client.
The said general Internet content keyword interactive system further comprises a cache acceleration module, which interfaces with the keyword query module and ADCenter. During the screening of body-text keywords it caches the keyword query results retrieved by the keyword query module and effectively refreshes stale data in the cache. When ADCenter receives an HTTP request sent by the JavaScript code, it first queries the cache acceleration module for the web page ID corresponding to the URL; if there is no record, the query is made through the keyword query module.
The said general Internet content keyword interactive system further comprises a redirect server, which records the display or click information of advertisements so that a charging system can be established.
In the said general Internet content keyword interactive system, the JavaScript code of the JavaScript code interface is added to the web page source file in one of two ways: the owner of a partner website adds the code to all of the site's web pages, or the user downloads a security plug-in that loads the code in the web browser, so that the code is applied in all web pages the user browses.
In the said general Internet content keyword interactive system, the ADCenter module uses the MD5 algorithm to compute a 128-bit web page ID from the web page URL.
In the said general Internet content keyword interactive system, the JavaScript code interface uses Ajax technology to load the keywords and interactive advertisement information dynamically and display them in the web page client, so that these keywords appear on the page in a different font or text color. When the user rests the mouse pointer on one of these keywords, a floating window is shown; pictures, text, video, sound or specific hyperlinks can be embedded in the window as required, thereby realizing a series of interactions between the web page and the user.
In the said general Internet content keyword interactive system, the web page information grabber starts from the home page of a website and grabs as many web pages as possible. Its algorithm is as follows. It grabs the page and finds all hyperlinks in it; these URLs include links to pages of other websites as well as links to pages of the same website. After the grabber has excluded the external links, it builds a URL queue organized on the "first in, first out" rule. The queue is then screened against the system template and ordered by PageRank: the URLs of the pages to be grabbed are first identified using the URL template that the system specifies according to the advertiser's requirements, and the URLs that match the template are then sorted from high to low by their respective PageRank scores, producing the list of URLs to be grabbed. The corresponding pages are then fetched with HTTP Crawler, an existing, mature page-grabbing tool. The second batch of pages likewise contains a series of hyperlinks; these URLs are appended to the URL queue and grabbed by the same steps, so that as many web page files as possible, together with their structural information, are obtained continuously.
In the said general Internet content keyword interactive system, the web page information analyzer first converts the HTML web page file into a text file and then finds the body part by the following three methods:
For websites whose pages follow a certain template, the part corresponding to the body is selected directly according to the page template of that website;
For websites whose pages carry a special mark on the body part, the body part is found by processing based on that mark;
Otherwise the body part is found by comparing the lengths of the text sections, the continuity with the content of the parent and child hyperlinked pages, and the body text identified the last time this page was grabbed.
In the said general Internet content keyword interactive system, the web page information analyzer determines the PageRank of each grabbed page by the system algorithm, based on the structural information of the page, and feeds the result back to the "first in, first out" queue in order to determine which pages should be grabbed first.
In the said general Internet content keyword interactive system, special keywords, that is, keywords that should be excluded, are screened by rules set manually at the word segmentation stage of the keyword index module: a special word segmentation list is configured in the segmentation code, so that the keywords conform better to the meaning of the text.
In summary, the civilian matchmaker is a general technology for adding keywords to web pages. Adding keywords does not affect the original browsing of the page. The system grabs, analyzes and classifies web page information, determines the main subject of the text through lexical analysis, screens out reasonable keywords and, when the keywords are added, realizes interaction on the client.
The beneficial effects of the civilian matchmaker technology mainly include the following:
1. It provides added value such as advertising and services, promoting the development of the Internet market.
The Internet market has developed at high speed over the past decade or so. Internet content keeps growing and the volume of information keeps rising, but there has been no effective method of interacting with Internet text. The civilian matchmaker technology fills this gap, so that the mass of Internet content can generate added value such as advertising and services.
2. It promotes marketing innovation and improves the vitality of the market economy.
For enterprises in a market economy, reaching the real potential audience for their marketing activities is the focus of most marketing work. In this respect the civilian matchmaker technology offers enterprises a new approach: when carrying out marketing activities, an enterprise can use it to find the users who are reading the corresponding Internet content and interact with them.
3. It gives users more initiative when browsing Internet information, promoting the development of the Internet industry.
The Internet information a user browses usually belongs to categories the user is interested in. By providing these users with other relevant information, the civilian matchmaker technology effectively meets their need to obtain information while browsing.
Compared with search technology, the civilian matchmaker has the following notable advantages:
Wider scope: search technology can only be used inside a particular search product, whereas the civilian matchmaker can be embedded directly in any existing Internet content.
Information is provided proactively: a search is usually initiated by the user, whereas the civilian matchmaker system, when the user opens a page on the client, automatically runs a series of processes and proactively delivers relevant information to the user.
Relevant information is provided: through text extraction, lexical analysis and matching against the civilian matchmaker keyword dictionary, the information that the civilian matchmaker provides is highly relevant to the content of the page the user is browsing.
Compared with Internet content keyword publishing technology, the civilian matchmaker has the following notable advantages:
Keywords are added to web pages simply and quickly: with the civilian matchmaker technology, the original page only needs one line of code added to its source to have keywords added automatically.
Keywords can be added to all Internet pages: whatever form an Internet page takes and whatever type it belongs to, keywords can be added to it by the civilian matchmaker technology.
Accurate and effective keyword matching: through its text extraction, lexical analysis and dictionary comparison functions, the civilian matchmaker technology adds suitable keywords to the relevant pages accurately and effectively.
Real-time addition: the civilian matchmaker technology adds keywords to a page in real time, without the page owner having to do any related work in advance.
Compared with Google AdSense advertisement publishing, the civilian matchmaker has the following notable advantages:
The advertisement information is loaded in a different place: the civilian matchmaker inserts advertisements into the body text the user is reading. Users pay more attention to the body text than to peripheral positions, so the advertising effect of the civilian matchmaker is better.
In addition, the form of presentation is more advanced: the civilian matchmaker system displays advertisement information within the body text only when the user clicks on or hovers over a keyword in the text, which protects the user experience as far as possible.
Description of drawings
Fig. 1 is a schematic diagram of the interaction between the JavaScript request and the data layer of the civilian matchmaker system;
Fig. 2 is a schematic diagram of the interaction between the web page information grabber and the web page information analyzer;
Fig. 3 is a flow diagram of the civilian matchmaker system;
Fig. 4 is a diagram of the overall architecture of the civilian matchmaker system.
Embodiment
As shown in Fig. 3, the civilian matchmaker technology solves the technical problems described above through the following basic steps.
1. JavaScript code is used to realize the "add one line of code" solution for adding keywords to a page.
First, the JavaScript code is added to the web page source file in one of the following two ways:
1) Website cooperation. For example, in cooperation with the Sina website, the site owner adds the code to all Sina web pages, so that the civilian matchmaker application runs whenever a user browses a Sina page;
2) The user downloads a security plug-in that loads the code in the web browser, so that the civilian matchmaker application runs in all web pages the user browses.
Second, as shown in Fig. 1, when the user browses a web page to which the JavaScript code has been added, this line of code automatically sends an HTTP request to the system, transmitting the URL of the page to the ADCenter module of the civilian matchmaker system.
The ADCenter module serves as the interface between the civilian matchmaker system and Internet pages. After receiving the HTTP request, it uses the MD5 algorithm (an existing, mature one-way hash algorithm, described in this application as secure and irreversible) to compute a 128-bit ID from the web page URL. It sends this ID to the system data layer, reads the keywords and advertisement information corresponding to the ID, and returns them to the user's web browser, where Ajax technology is used to load and display the keywords and advertisement information dynamically. Ajax, that is, Asynchronous JavaScript and XML, is an existing, mature technique for loading web page information dynamically. It gives web development an asynchronous way of transmitting and exchanging data, so that data can be exchanged with the server without reloading the page or refreshing the interface.
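The URL-to-ID step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `page_id` and the example URL are hypothetical, and the source does not say whether the URL is normalized or how it is encoded before hashing (UTF-8 is assumed here).

```python
import hashlib


def page_id(url: str) -> str:
    """Return the 128-bit MD5 digest of a URL as 32 hexadecimal characters.

    Assumption: the URL is hashed as-is, encoded as UTF-8.
    """
    return hashlib.md5(url.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical page URL, used only to show the shape of the ID.
    pid = page_id("http://auto.sina.com.cn/news/example.shtml")
    print(pid, "->", len(pid) * 4, "bits")
```

The hexadecimal form has the same shape as the ID "012343afaddb34fe798787656378f8e9" that appears later in the description: 32 hexadecimal characters, or 128 bits.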
The data layer of the civilian matchmaker system comprises the web page information grabber, the web page information analyzer, the keyword index module, the keyword query module, the cache acceleration module, the advertisement keyword library and the advertisement information library, which are described in the steps below.
2. Deferred loading is used so that adding keywords does not affect the original browsing of the page at all.
The code that the civilian matchmaker adds to a web page does not run immediately when the user opens the page. It starts running only after all the other content of the page has finished loading. By then the user has already begun to use the page normally, so browsing speed is not affected.
3. The web page information grabber grabs and stores the required web page files. The web page information analyzer then identifies the body part of each page and preliminarily computes the PageRank of each page from its position in the link graph.
Web page grabbing is performed by a dedicated module of the civilian matchmaker called the "web page information grabber". The pages of every website are organized through a series of hyperlinks. As shown in Fig. 2, the grabber starts from the home page of a website and grabs as many web pages as possible. Its algorithm is as follows. It grabs the page and finds all hyperlinks in it; these URLs include links to pages of other websites as well as links to pages of the same website. After the grabber has excluded the external links, it builds a URL queue organized on the "first in, first out" rule. The queue is then screened against the system template and ordered by PageRank (in the initial stage the PageRank of every page is zero): the URLs of the pages to be grabbed are first identified using the URL template that the civilian matchmaker system derives from the advertiser's requirements (for example, the Sina automobile channel URL template http://auto.sina.com.cn/news/?/?.shtml), and the URLs that match the template are then sorted from high to low by their respective PageRank scores, producing the list of URLs to be grabbed. The corresponding pages are then fetched with HTTP Crawler (an existing, mature page-grabbing tool). The second batch of pages likewise contains a series of hyperlinks; these URLs are appended to the URL queue and grabbed by the same steps, so that as many web page files as possible, together with their structural information, are obtained continuously.
Take grabbing the "Sina automobile channel" pages as an example. Starting from the Sina home page, all the URLs it points to are placed in the URL queue. The system then picks out the URLs that match the template "http://auto.sina.com.cn/news/?/?.shtml" and grabs the pages in multiple batches.
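One batch of the queue-building step described above might look like the following sketch. It makes several assumptions that the source does not confirm: each "?" in the template is read as a wildcard for one path segment, a link counts as "internal" when its host lies under the site's domain, and the names `plan_batch`, `TEMPLATE` and the sample URLs are hypothetical. No pages are fetched; the sketch only orders the URLs.

```python
import re
from collections import deque
from urllib.parse import urlparse

# Assumed reading of http://auto.sina.com.cn/news/?/?.shtml, with "?" as a wildcard.
TEMPLATE = re.compile(r"^http://auto\.sina\.com\.cn/news/[^/]+/[^/]+\.shtml$")


def plan_batch(links, site_domain, template, pagerank, seen):
    """Return the URLs to grab next, highest PageRank first."""
    queue = deque()  # first-in, first-out URL queue
    for url in links:
        host = urlparse(url).hostname or ""
        internal = host == site_domain or host.endswith("." + site_domain)
        if internal and url not in seen:  # drop external and already-queued links
            seen.add(url)
            queue.append(url)
    matching = [u for u in queue if template.match(u)]
    # sorted() is stable, so URLs with equal PageRank keep their FIFO order.
    return sorted(matching, key=lambda u: pagerank.get(u, 0.0), reverse=True)


if __name__ == "__main__":
    links = [
        "http://auto.sina.com.cn/news/2006-09-01/a.shtml",
        "http://www.example.com/x.shtml",
        "http://auto.sina.com.cn/news/2006-09-02/b.shtml",
        "http://news.sina.com.cn/index.shtml",
    ]
    ranks = {links[2]: 0.8}  # every other page still has the initial PageRank of zero
    for url in plan_batch(links, "sina.com.cn", TEMPLATE, ranks, set()):
        print(url)
```

Because all pages start with a PageRank of zero, the first batch is effectively grabbed in plain FIFO order; the ordering only changes once the analyzer feeds PageRank values back.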
However, besides the "body part" that carries its main content, a web page usually also contains other text such as links, advertisements and notices. This text is not part of the body and has to be discarded. In the civilian matchmaker system, the component that does this is the "web page information analyzer".
The web page information analyzer first converts the HTML web page file into a text file and then analyzes it in three ways. First, the pages of some websites follow a certain template; the part corresponding to the body is selected directly according to the page template of that website. Second, the pages of some websites carry a special mark on the body part; the body part is found by processing based on that mark. Third, the body part is found by comparing the lengths of the text sections, the continuity with the content of the parent and child hyperlinked pages, the body text identified the last time this page was grabbed, and so on.
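The length comparison in the third method can be illustrated with Python's standard `html.parser`. This is a simplified sketch, not the analyzer described in the application: it covers only the "compare the length of each text section" part, treats link text as non-body text, and the class name, function name and tag sets are hypothetical choices.

```python
from html.parser import HTMLParser


class BlockCollector(HTMLParser):
    """Collect the visible text of each block-level element, ignoring link text."""

    BLOCK_TAGS = {"div", "p", "td", "li", "article", "section"}
    SKIP_TAGS = {"script", "style", "a"}  # text inside these is not body text

    def __init__(self):
        super().__init__()
        self.blocks = [""]
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.blocks.append("")  # a new block starts

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self.blocks.append("")  # text after the block belongs elsewhere

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.blocks[-1] += text + " "


def extract_body(html: str) -> str:
    """Return the longest block of non-link text as the presumed body."""
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    return max(collector.blocks, key=len).strip()


if __name__ == "__main__":
    page = (
        '<html><body><div><a href="/x">Home</a> <a href="/y">News</a></div>'
        "<div>The body text of the article runs for a full sentence.</div>"
        "<div>Ad: buy now</div><script>var x = 1;</script></body></html>"
    )
    print(extract_body(page))
```

A production analyzer would combine this signal with the template and mark-based methods and with the continuity checks the text mentions; the longest-block rule on its own is easily fooled by long comment sections or link-free footers.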
In addition, the web page information analyzer determines the PageRank of each grabbed page (a mature technique for grading web pages) by the system algorithm, based on the structural information of the page, and feeds the result back to the "first in, first out" queue in order to determine which pages should be grabbed first.
4. The "keyword index module" and the "keyword query module" perform lexical analysis on the body text of the page and screen out reasonable keywords.
The "keyword index module" works as follows. After the body text of a page has been obtained in the previous step, it is segmented into words (a mature technique) and a keyword ranking table is generated, that is, the keyword array corresponding to each web page, ordered by how often each keyword occurs in that page. The result is the "web page-keyword matrix", which is used to analyze the rank of each keyword in the body text of each page. The original publication illustrates the logical structure of this matrix with a table.
The "keyword query module" interfaces with ADCenter to obtain the ID of the web page whose keywords are to be screened (the 128-bit ID computed from the web page URL by MD5, as described above). It finds the corresponding keyword array "keyword 1, keyword 2 ... keyword n" in the "web page-keyword matrix", computes the intersection of that array with the advertisement keyword library to obtain a series of keywords, and then, taking both the web page classification and the advertisement keyword classification into account, screens out the top-ranked keywords to be returned to the JavaScript code.
For example, the Sina automobile channel news item "Self-drive travel routes chosen by lottery" contains keywords such as "automobile", "driving", "snow mountain", "scenery", "Tibet", "power", "outdoors" and "photography". In the "web page-keyword matrix" this appears as "012343afaddb34fe798787656378f8e9 - automobile driving snow mountain scenery Tibet power outdoors photography ...". Intersecting it with the advertisement keywords that vendors have bid for in the advertisement keyword library yields five keywords: "automobile", "driving", "power", "Tibet" and "photography". The web page ID "012343afaddb34fe798787656378f8e9" belongs to the "automobile category", and in the advertisement keyword library "automobile", "driving" and "power" also belong to the "automobile category". The system therefore returns these three words and their respective advertisement information to ADCenter, from where the JavaScript code fetches them and displays them in the user's web browser.
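The intersection and category screening in this example can be sketched as follows. The exact rule for "taking both classifications into account" is not given in the source, so the sketch assumes the simplest one: keep a keyword only when its advertisement category equals the page category. The function name, the `top_n` limit, the advertisement strings and the categories assigned to "Tibet" and "photography" are all hypothetical.

```python
def select_keywords(page_keywords, page_category, ad_library, top_n=3):
    """Pick the top-ranked page keywords that have an ad in the page's category.

    page_keywords: the keyword array of one page, already ordered by frequency.
    ad_library: maps keyword -> (category, advertisement information).
    """
    selected = []
    for keyword in page_keywords:
        entry = ad_library.get(keyword)  # None means: not in the intersection
        if entry is not None and entry[0] == page_category:
            selected.append((keyword, entry[1]))
    return selected[:top_n]


if __name__ == "__main__":
    page_keywords = ["automobile", "driving", "snow mountain", "scenery",
                     "Tibet", "power", "outdoors", "photography"]
    ad_library = {
        "automobile": ("automobile", "ad for automobile"),
        "driving": ("automobile", "ad for driving"),
        "power": ("automobile", "ad for power"),
        "Tibet": ("travel", "ad for Tibet"),                  # assumed category
        "photography": ("photography", "ad for photography"),  # assumed category
    }
    for keyword, ad in select_keywords(page_keywords, "automobile", ad_library):
        print(keyword, "->", ad)
```

With these inputs the intersection contains the five keywords named in the text, and the category comparison leaves "automobile", "driving" and "power", matching the outcome the description gives.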
The screening of body-text keywords also requires a "cache acceleration module", which interfaces with the "keyword query module" and "ADCenter", caches the keyword query results and effectively refreshes stale data in the cache. This is a general-purpose module. In the present system, after the "keyword query module" has retrieved the keywords and advertisement information corresponding to a web page ID, it not only returns them to ADCenter but also stores the same information as a package in the "cache acceleration module". When ADCenter next receives an HTTP request sent by the JavaScript code, it first queries the "cache acceleration module"; only if there is no record does it query through the "keyword query module".
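The cache-first lookup can be sketched as below. The source says only that stale data is refreshed "effectively", so the time-to-live policy used here is an assumption, as are the names `KeywordCache` and `lookup`; the keyword query module is represented by any callable that takes a page ID.

```python
import time


class KeywordCache:
    """In-memory cache of query results with an assumed time-to-live policy."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # page ID -> (time stored, result)

    def get(self, page_id):
        item = self.store.get(page_id)
        if item is None:
            return None
        stored_at, result = item
        if self.clock() - stored_at > self.ttl:  # stale entry: drop it
            del self.store[page_id]
            return None
        return result

    def put(self, page_id, result):
        self.store[page_id] = (self.clock(), result)


def lookup(page_id, cache, query_module):
    """Ask the cache first; fall back to the keyword query module on a miss."""
    cached = cache.get(page_id)
    if cached is not None:
        return cached
    result = query_module(page_id)
    cache.put(page_id, result)  # store the package for later requests
    return result


if __name__ == "__main__":
    cache = KeywordCache(ttl_seconds=60.0)
    query = lambda pid: ["automobile", "driving", "power"]  # stand-in query module
    print(lookup("012343afaddb34fe798787656378f8e9", cache, query))
    print(lookup("012343afaddb34fe798787656378f8e9", cache, query))  # served from cache
```

Passing the clock in as a parameter keeps the expiry logic easy to exercise without waiting for real time to pass.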
Special keywords also have to be screened. For example, the word "governor" found inside "Changsha, Hunan" (in the Chinese original, presumably a word whose characters straddle the boundary between the province name and the city name) is a keyword that should be excluded; the system excludes such keywords through manually added screening rules. The rules are set by hand at the word segmentation stage of the "keyword index module": a special word segmentation list is configured in the segmentation code, so that the keywords conform better to the meaning of the text.
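A manually configured exclusion list of this kind might be sketched as follows. Everything here is an assumption made for illustration: the Chinese strings are a reconstruction of the translated example (省长, "governor", appearing inside 湖南省长沙, "Changsha, Hunan Province"), keyword matching is done by plain substring search rather than by a real word segmenter, and the names `match_keywords` and `EXCLUSIONS` are hypothetical.

```python
# Hand-written rules: keyword -> contexts in which a match is a false positive.
EXCLUSIONS = {
    "省长": ["湖南省长沙"],  # "governor" inside "Changsha, Hunan Province" (assumed original)
}


def match_keywords(text, dictionary, exclusions):
    """Return the dictionary keywords found in text, ignoring excluded contexts."""
    hits = []
    for keyword in dictionary:
        masked = text
        for context in exclusions.get(keyword, ()):
            # Blank out the context so that a match inside it no longer counts.
            masked = masked.replace(context, "\0" * len(context))
        if keyword in masked:
            hits.append(keyword)
    return hits


if __name__ == "__main__":
    dictionary = ["省长", "长沙"]
    print(match_keywords("湖南省长沙市举办车展", dictionary, EXCLUSIONS))
    print(match_keywords("省长出席车展", dictionary, EXCLUSIONS))
```

Masking is applied per keyword, so a context that rules out one keyword does not hide other legitimate keywords inside the same span.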
5. The keywords are sent back to the client on which the user is browsing the page, and are displayed interactively.
In the civilian matchmaker system, the JavaScript code interface and the keyword query module are connected through the "ADCenter module". The finally selected keywords are sent back to the browsing client, where the code inserted in the first step runs and makes these keywords appear on the page in a different font, text color and so on. When the user rests the mouse pointer on one of these keywords, a floating window is shown. Pictures, text, video, sound or specific hyperlinks can be embedded in the window as required, thereby realizing a series of interactions between the web page and the user.
6. A "redirect server" (an existing, mature technique that, under the HTTP protocol, redirects the target of a user's visit from one page to the final target page) records the display or click information of advertisements, so that a charging system can be established.
The advertisement information published by the civilian matchmaker system, such as images, text, video, audio and URLs, uses hyperlinks of the form http://server.iaso.cn/redirect.jsp?url=adclient.com/ad.html. When advertisement information is displayed or clicked, the "redirect server" automatically records three kinds of information, "IP", "adID" and "publisherid", which the charging system (the general CPM and CPC system of Internet advertising) uses for its calculations.
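The redirect link and the CPM/CPC calculation can be sketched as follows. The link format and the three recorded fields come from the text; the event structure, the `type` values "show" and "click", the rates and the function names are hypothetical. CPM is taken in its usual sense of cost per thousand impressions and CPC as cost per click.

```python
from urllib.parse import parse_qs, urlparse


def redirect_target(link: str) -> str:
    """Extract the final target page from a redirect link's `url` parameter."""
    query = parse_qs(urlparse(link).query)
    return query["url"][0]


def charge(events, cpm_rate, cpc_rate):
    """Compute the charge from recorded events: CPM per 1000 displays plus CPC per click."""
    displays = sum(1 for e in events if e["type"] == "show")
    clicks = sum(1 for e in events if e["type"] == "click")
    return displays / 1000 * cpm_rate + clicks * cpc_rate


if __name__ == "__main__":
    link = "http://server.iaso.cn/redirect.jsp?url=adclient.com/ad.html"
    print(redirect_target(link))

    # Each record carries the three fields named in the text plus an assumed event type.
    log = [{"IP": "192.0.2.1", "adID": "ad-1", "publisherid": "pub-1", "type": "show"}
           for _ in range(2000)]
    log += [{"IP": "192.0.2.1", "adID": "ad-1", "publisherid": "pub-1", "type": "click"}
            for _ in range(3)]
    print(charge(log, cpm_rate=5.0, cpc_rate=0.5))
```

A real redirect server would also answer the request with an HTTP redirect to the target page; that part is omitted here because the sketch only covers what is recorded and how it is priced.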
As shown in Fig. 4, the overall system architecture comprises:
1. Web page information grabber
Starting from a partner website, this module grabs the relevant pages (including link information, body text and structural information) step by step according to certain hyperlink rules, and passes them back to the server of the web page text analysis module.
2. Web page information analyzer
This module analyzes the raw page data grabbed by the web page information grabber. Applying the rules and algorithms specified by the system, it filters out other text such as links, advertisements and notices, and obtains the body text of the page. It then sends the various classified data to the keyword index module.
3. Keyword index module
This module receives the classified data from the web page information analyzer and generates the ranking table of web pages and keywords, from which it computes the "web page-keyword matrix". The matrix is used to analyze the rank of each keyword in the body text of each web page and is made available to the keyword query module.
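A minimal sketch of the "web page-keyword matrix" is given below, assuming the simplest possible representation: a mapping from each page ID to its keywords ordered by frequency. Whitespace splitting stands in for the Chinese word segmentation the source relies on, and the function name and sample text are hypothetical.

```python
from collections import Counter


def build_matrix(pages, excluded=frozenset()):
    """Map each page ID to its keywords, most frequent first.

    pages: maps page ID -> body text.
    excluded: words that must never become keywords.
    """
    matrix = {}
    for page_id, body in pages.items():
        # Stand-in for real word segmentation: split on whitespace.
        words = [w for w in body.lower().split() if w not in excluded]
        counts = Counter(words)
        # most_common() orders by count; ties keep first-seen order.
        matrix[page_id] = [word for word, _ in counts.most_common()]
    return matrix


if __name__ == "__main__":
    pages = {
        "012343afaddb34fe798787656378f8e9":
            "automobile driving automobile tibet the automobile driving photography",
    }
    matrix = build_matrix(pages, excluded={"the"})
    for page_id, keywords in matrix.items():
        print(page_id, "-", " ".join(keywords))
```

Each row has the same shape as the example row in the description: a page ID followed by its ranked keywords, ready to be intersected with the advertisement keyword library.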
4. Keyword query module
The keyword query module interfaces with ADCenter to obtain the ID of the web page whose keywords are to be screened (a 128-bit ID computed from the page URL with MD5). It looks up the keyword array for this ID, "keyword 1, keyword 2, ..., keyword n", in the web page-keyword matrix computed by the keyword index module, and intersects that array with the advertisement keyword library to obtain a set of candidate keywords. It then weighs the web page category against the advertisement keyword categories and selects the top-ranked keywords to return to the JavaScript code.
5. Advertisement keyword library
This is the keyword dictionary specified by the system. All of its keywords are purchased by vendors through bidding or at fixed prices and are organized into categories. The library is intersected with the keyword array of a web page ID, and the category information is compared, to decide which keywords' interactive advertising information is returned to the web page client the user is browsing.
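The lookup described for the keyword query module and the advertisement keyword library can be sketched as below: hash the URL to a 128-bit ID with MD5, intersect the page's keyword array with the library, and prefer keywords whose category matches the page. How the categories are actually weighed is not specified in the text; the "same category first" rule, the `limit` parameter, and all function names are assumptions for illustration.

```python
import hashlib


def page_id(url):
    """128-bit page ID: the MD5 digest of the URL, as 32 hex characters."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def select_keywords(pid, matrix, ad_keywords, page_category, limit=3):
    """Pick the keywords to return for one page.

    matrix: page ID -> keywords in rank order.
    ad_keywords: purchased keyword -> category (the keyword library).
    """
    # Intersection with the library, preserving the page's own ranking.
    candidates = [kw for kw in matrix.get(pid, []) if kw in ad_keywords]
    # Assumed tie-break: keywords in the page's category come first.
    same = [kw for kw in candidates if ad_keywords[kw] == page_category]
    other = [kw for kw in candidates if ad_keywords[kw] != page_category]
    return (same + other)[:limit]
```

For example, with a page ranked `["camera", "lens", "weather", "tripod"]` in category "photo" and a library that lists "camera" and "tripod" under "photo" and "lens" under "travel", a limit of 2 yields "camera" and "tripod"; "weather" is dropped because no advertiser purchased it.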
6. Cache acceleration module
During the screening of web page body keywords, the cache acceleration module caches keyword query results and refreshes stale cache entries efficiently.
7. ADCenter module
This module receives the HTTP request sent by the JavaScript code, computes the web page ID for the URL, and sends a query for that ID to the cache acceleration module or the keyword query module. It then receives the returned keywords and their corresponding interactive advertising information and, through the JavaScript code interface, passes this information back to the client on which the user is browsing the page.
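A minimal sketch of the request path just described: ADCenter hashes the URL, asks the cache first, and falls back to the keyword query module on a miss. The text does not say how stale entries are detected; the time-to-live policy, the class `KeywordCache`, and the function `handle_request` are assumptions made for illustration.

```python
import hashlib
import time


class KeywordCache:
    """Caches query results per page ID; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, pid):
        entry = self._store.get(pid)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[pid]      # stale entry: drop it and report a miss
            return None
        return value

    def put(self, pid, value):
        self._store[pid] = (self.clock(), value)


def handle_request(url, cache, query_module):
    """ADCenter flow: URL -> MD5 page ID -> cache -> keyword query module."""
    pid = hashlib.md5(url.encode("utf-8")).hexdigest()
    result = cache.get(pid)
    if result is None:
        result = query_module(pid)    # cache miss: ask the query module
        cache.put(pid, result)
    return result
```

The clock is injectable so that expiry can be exercised without waiting; `query_module` is any callable that maps a page ID to its keyword result.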
8. JavaScript code interface
This interface implements the interaction between the Wenmei system and the client on which the user browses the page. Wenmei packages the interface as a single line of code, which the web page owner only needs to add to the page source file. When a user opens the page on a client, the interface sends the page URL to the ADCenter module of the Wenmei system, receives the returned keywords and advertising information, and uses Ajax to dynamically load and display the keywords and interactive advertising information in the web page client.
9. Redirect server
This server uses mature technology. It records advertisement display and click events, which form the basis of the billing system.
Advertising information published by the Wenmei system, such as images, text, video, audio, and URLs, uses hyperlinks of the form http://server.iaso.cn/redirect.jsp?url=adclient.com/ad.html. When an advertisement is displayed or clicked, the redirect server automatically records three items, "IP", "adID", and "publisherid", which the billing system (a general-purpose Internet advertising CPM/CPC system) then uses to compute charges.
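The redirect-and-record step might look roughly like the sketch below. The example hyperlink in the text shows only a `url` parameter; carrying `adID` and `publisherid` as additional query parameters is an assumption, as are the function names and the in-memory log.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

REDIRECT_BASE = "http://server.iaso.cn/redirect.jsp"


def build_redirect_link(target, ad_id, publisher_id):
    """Wrap an advertiser URL in a redirect link.

    Passing adID and publisherid in the query string is an assumption.
    """
    query = urlencode({"url": target, "adID": ad_id, "publisherid": publisher_id})
    return f"{REDIRECT_BASE}?{query}"


def record_event(link, client_ip, log):
    """Log the three recorded items, then return the URL to redirect to."""
    params = parse_qs(urlsplit(link).query)
    log.append({
        "IP": client_ip,
        "adID": params["adID"][0],
        "publisherid": params["publisherid"][0],
    })
    return params["url"][0]
```

A billing system could then count log entries per `adID` to compute CPM or CPC charges.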

Claims (10)

1. A common interconnection network content keyword interactive system, characterized in that it comprises:
A web page information grabber: this module crawls the partner site step by step, following specified hyperlink rules, to fetch the relevant pages, and passes them to the web page information analyzer;
A web page information analyzer: this module analyzes the raw page data fetched by the web page information grabber, applies the rules and algorithms specified by the system to filter out non-body text such as links, advertisements, and notices, extracts the body text of the page, and sends the resulting classified data to the keyword index module;
A keyword index module: this module receives the classified data from the web page information analyzer, generates a ranked list of web pages for each keyword, and from these computes a web page-keyword matrix, which gives the rank of each keyword within the body text of each page and is used by the keyword query module;
A keyword query module: this module interfaces with ADCenter to obtain the ID of the web page whose keywords are to be screened, looks up the keyword array for this ID in the web page-keyword matrix computed by the keyword index module, intersects that array with the advertisement keyword library to obtain a set of keywords, and then weighs the web page category against the advertisement keyword categories to select the top-ranked keywords to return to the JavaScript code interface;
An advertisement keyword library: this is the keyword dictionary specified by the system, in which all keywords are organized into categories; it is intersected with the keyword array of a web page ID, and the category information is compared, to decide which keywords' interactive advertising information is returned to the web page client the user is browsing;
An ADCenter module: this module receives the HTTP request sent by the JavaScript code, computes the web page ID for the URL, sends a query for that ID to the keyword query module, receives the returned keywords and their corresponding interactive advertising information, and, through the JavaScript code interface, passes this information back to the client on which the user is browsing the page;
A JavaScript code interface: this interface implements the interaction between the common interconnection network content keyword interactive system and the client on which the user browses the page, and is packaged as a single line of code; once this line of code has been added to the web page source file, when a user opens the page on a client, the interface sends the page URL to the ADCenter module of the common interconnection network content keyword interactive system, receives the returned keywords and advertising information, and loads and displays the keywords and interactive advertising information in the web page client.
2. The common interconnection network content keyword interactive system according to claim 1, characterized in that it further comprises a cache acceleration module, which interfaces with the keyword query module and ADCenter; during the screening of web page body keywords, it caches the keyword query results returned by the keyword query module and refreshes stale cache entries efficiently; when ADCenter receives the HTTP request sent by the JavaScript code, it first connects to the cache acceleration module to query the web page ID for the URL, and queries the keyword query module only if no record is found.
3. The common interconnection network content keyword interactive system according to claim 1 or 2, characterized in that it further comprises a redirect server, which records advertisement display or click events and thereby forms the basis of the billing system.
4. The common interconnection network content keyword interactive system according to claim 1, characterized in that the JavaScript code of the JavaScript code interface is added to the web page source file either by the owner of the partner site adding the code to all of its web pages, or by the user downloading a secure plug-in that is loaded into the web browser and adds the code to all web pages the user browses.
5. The common interconnection network content keyword interactive system according to claim 1, characterized in that the ADCenter module computes a 128-bit web page ID from the web page URL using the MD5 algorithm.
6. The common interconnection network content keyword interactive system according to claim 1, characterized in that the JavaScript code interface uses Ajax to dynamically load and display the keywords and interactive advertising information in the web page client, so that the keywords appear on the page in a different font or text color; when the user rests the mouse pointer on one of these keywords, a floating window is shown, in which pictures, text, video, sound, or specific hyperlinks are embedded as required, thereby enabling a range of interactions between the web page and the user.
7. The common interconnection network content keyword interactive system according to claim 1, characterized in that the web page information grabber takes the home page of a website as its starting point and fetches as many web pages as possible, using the following algorithm: it fetches the page and finds all hyperlinks in it, these URLs comprising links from the home page to pages of other websites and links to pages of this website; after the web page information grabber module excludes the external links, a URL queue organized on a "first in, first out" basis is generated; the queue then undergoes screening against the system template and PageRank ordering, that is, the URLs of the pages to be fetched are first determined according to a URL template derived from the requirements of the advertisers of the common interconnection network content keyword interactive system, and the URLs matching the template format are then sorted from high to low by their respective PageRank scores, thereby generating the list of URLs of the pages to be fetched; the corresponding pages are then fetched with HTTP Crawler, an existing, mature page-fetching tool; the second batch of pages likewise contains a series of hyperlinks, whose URLs are likewise appended to the URL queue and fetched according to the above steps, so that as many web page files as possible, together with their structural information, are obtained continuously.
8. The common interconnection network content keyword interactive system according to claim 1, characterized in that the web page information analyzer first converts the HTML web page file into a text-format file and then finds the body part using the following three approaches:
For pages of websites that follow a fixed template, the part corresponding to the body is selected directly according to that website's page template;
For pages of websites that place a special mark on the body part, the body part is found by processing that mark;
Otherwise, the body part is found by comparing the length of each text section, the continuity with the content of the parent and child hyperlinked pages, and the body text identified when the page was last fetched.
9. The common interconnection network content keyword interactive system according to claim 7, characterized in that the web page information analyzer determines the PageRank of each fetched page from its structural information using a system algorithm, and feeds the result back to the "first in, first out" queue to determine which pages are to be fetched first.
10. The common interconnection network content keyword interactive system according to claim 1, characterized in that, for special keyword screening, that is, for keywords that should be excluded, rules are set manually at the word segmentation stage of the keyword index module by configuring a special segmentation list in the segmentation code, so that the keywords better match the semantics of the text.
CN 200610127372 2006-09-15 2006-09-15 Common interconnection network content keyword interactive system Pending CN1932817A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200610127372 CN1932817A (en) 2006-09-15 2006-09-15 Common interconnection network content keyword interactive system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200610127372 CN1932817A (en) 2006-09-15 2006-09-15 Common interconnection network content keyword interactive system

Publications (1)

Publication Number Publication Date
CN1932817A true CN1932817A (en) 2007-03-21

Family

ID=37878652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610127372 Pending CN1932817A (en) 2006-09-15 2006-09-15 Common interconnection network content keyword interactive system

Country Status (1)

Country Link
CN (1) CN1932817A (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008122179A1 (en) * 2007-04-06 2008-10-16 Alibaba Group Holding Limited A method, device and system for processing relevant keyword
WO2009006844A1 (en) * 2007-07-09 2009-01-15 Zhiping Meng Method and system of web page semanteme applicating
CN100458797C (en) * 2007-06-20 2009-02-04 精实万维软件(北京)有限公司 Process for ordering network advertisement
CN100462980C (en) * 2007-06-26 2009-02-18 腾讯科技(深圳)有限公司 Content-related advertising identifying method and content-related advertising server
CN101551796A (en) * 2008-04-02 2009-10-07 上海亿动信息技术有限公司 Control system and corresponding control method for releasing information according to carrier content
WO2009137978A1 (en) * 2008-05-14 2009-11-19 华为技术有限公司 Method, system and device for presenting advertisement
CN101216842B (en) * 2008-01-07 2011-05-18 成都市华为赛门铁克科技有限公司 Method for obtaining page key words and page information processing apparatus
CN101211368B (en) * 2007-12-25 2011-08-03 北京搜狗科技发展有限公司 Method for classifying search term, device and search engine system
WO2011106907A1 (en) * 2010-03-04 2011-09-09 Yahoo! Inc. Intelligent feature expansion of online text ads
WO2011130870A1 (en) * 2010-04-19 2011-10-27 Hewlett-Packard Development Company, L.P. Semantically ranking content in a website
CN102332137A (en) * 2011-09-23 2012-01-25 纽海信息技术(上海)有限公司 Goods matching method and system
CN102456016A (en) * 2010-10-18 2012-05-16 ***通信集团四川有限公司 Method and device for sequencing search results
CN102549560A (en) * 2009-08-13 2012-07-04 谷歌公司 Shared server-side macros
CN102609518A (en) * 2012-02-09 2012-07-25 清华大学 Method and system for acquiring content of multistate AJAX (asynchronous javascript and extensible markup language) webpage
CN103348312A (en) * 2010-12-02 2013-10-09 戴斯帕克有限公司 Systems, devices and methods for streaming multiple different media content in a digital container
CN103377274A (en) * 2012-04-26 2013-10-30 果实伙伴股份有限公司 Method for inserting video advertisement in webpage
CN103425475A (en) * 2012-05-24 2013-12-04 聚胜万合信息技术(上海)有限公司 Method for adding click link to Internet advertisement
WO2014008654A1 (en) * 2012-07-12 2014-01-16 Google Inc. Systems and methods for selecting content using webref entities
CN103577423A (en) * 2012-07-23 2014-02-12 阿里巴巴集团控股有限公司 Keyword classification method and system
CN104021231A (en) * 2014-06-26 2014-09-03 北京奇虎科技有限公司 Method and device for displaying webpage in browser
CN104077341A (en) * 2013-07-19 2014-10-01 腾讯科技(北京)有限公司 Keyword auto-response mapping relation generation method and device in instant messaging
CN105894417A (en) * 2016-06-12 2016-08-24 深圳市悦好教育科技有限公司 Method for grading low-grade reading books of primary school based on proportion of standard curriculum characters
CN105930096A (en) * 2016-04-12 2016-09-07 中国民航信息网络股份有限公司 PageRank based data block pre-caching method
US9794198B2 (en) 2013-07-19 2017-10-17 Tencent Technology (Shenzhen) Company Limited Methods and systems for creating auto-reply messages
CN107784054A (en) * 2017-02-16 2018-03-09 平安科技(深圳)有限公司 A kind of page dissemination method and device
CN108170542A (en) * 2017-12-26 2018-06-15 上海亿动信息技术有限公司 A kind of control method and device that advertising information exchange is realized based on local page
CN108319376A (en) * 2017-12-29 2018-07-24 北京奇虎科技有限公司 A kind of input association recommendation method and device that optimization business word is promoted
CN108647342A (en) * 2018-05-14 2018-10-12 佛山市真觉网络科技有限公司 A method of the spider crawl of optimization Baidu
CN113177148A (en) * 2021-05-21 2021-07-27 滨州职业学院 Data pushing method and device and storage medium
CN113836434A (en) * 2021-11-25 2021-12-24 山东捷瑞数字科技股份有限公司 Web page data processing method based on database

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8626742B2 (en) 2007-04-06 2014-01-07 Alibaba Group Holding Limited Method, apparatus and system of processing correlated keywords
US9275100B2 (en) 2007-04-06 2016-03-01 Alibaba Group Holding Limited Method, apparatus and system of processing correlated keywords
WO2008122179A1 (en) * 2007-04-06 2008-10-16 Alibaba Group Holding Limited A method, device and system for processing relevant keyword
CN101281522B (en) * 2007-04-06 2010-11-03 阿里巴巴集团控股有限公司 Method and system for processing related key words
CN100458797C (en) * 2007-06-20 2009-02-04 精实万维软件(北京)有限公司 Process for ordering network advertisement
CN100462980C (en) * 2007-06-26 2009-02-18 腾讯科技(深圳)有限公司 Content-related advertising identifying method and content-related advertising server
WO2009006844A1 (en) * 2007-07-09 2009-01-15 Zhiping Meng Method and system of web page semanteme applicating
CN101211368B (en) * 2007-12-25 2011-08-03 北京搜狗科技发展有限公司 Method for classifying search term, device and search engine system
CN101216842B (en) * 2008-01-07 2011-05-18 成都市华为赛门铁克科技有限公司 Method for obtaining page key words and page information processing apparatus
CN101551796A (en) * 2008-04-02 2009-10-07 上海亿动信息技术有限公司 Control system and corresponding control method for releasing information according to carrier content
WO2009137978A1 (en) * 2008-05-14 2009-11-19 华为技术有限公司 Method, system and device for presenting advertisement
CN101582911B (en) * 2008-05-14 2014-12-03 华为技术有限公司 Method, system and device for presenting advertisement
CN102549560A (en) * 2009-08-13 2012-07-04 谷歌公司 Shared server-side macros
CN102549560B (en) * 2009-08-13 2016-10-26 谷歌公司 Shared server-side macros
WO2011106907A1 (en) * 2010-03-04 2011-09-09 Yahoo! Inc. Intelligent feature expansion of online text ads
US8788342B2 (en) 2010-03-04 2014-07-22 Yahoo! Inc. Intelligent feature expansion of online text ads
WO2011130870A1 (en) * 2010-04-19 2011-10-27 Hewlett-Packard Development Company, L.P. Semantically ranking content in a website
CN102939602B (en) * 2010-04-19 2016-10-12 惠普发展公司,有限责任合伙企业 Semantically ranking content in a website
CN102939602A (en) * 2010-04-19 2013-02-20 惠普发展公司,有限责任合伙企业 Semantically ranking content in a website
CN102456016A (en) * 2010-10-18 2012-05-16 ***通信集团四川有限公司 Method and device for sequencing search results
CN102456016B (en) * 2010-10-18 2014-10-01 ***通信集团四川有限公司 Method and device for sequencing search results
US9342212B2 (en) 2010-12-02 2016-05-17 Instavid Llc Systems, devices and methods for streaming multiple different media content in a digital container
CN103348312A (en) * 2010-12-02 2013-10-09 戴斯帕克有限公司 Systems, devices and methods for streaming multiple different media content in a digital container
CN102332137A (en) * 2011-09-23 2012-01-25 纽海信息技术(上海)有限公司 Goods matching method and system
CN102609518A (en) * 2012-02-09 2012-07-25 清华大学 Method and system for acquiring content of multistate AJAX (asynchronous javascript and extensible markup language) webpage
CN102609518B (en) * 2012-02-09 2015-02-18 清华大学 Method and system for acquiring content of multistate AJAX (asynchronous javascript and extensible markup language) webpage
CN103377274A (en) * 2012-04-26 2013-10-30 果实伙伴股份有限公司 Method for inserting video advertisement in webpage
CN103425475A (en) * 2012-05-24 2013-12-04 聚胜万合信息技术(上海)有限公司 Method for adding click link to Internet advertisement
WO2014008654A1 (en) * 2012-07-12 2014-01-16 Google Inc. Systems and methods for selecting content using webref entities
CN103577423B (en) * 2012-07-23 2016-12-07 阿里巴巴集团控股有限公司 Keyword classification method and system
CN103577423A (en) * 2012-07-23 2014-02-12 阿里巴巴集团控股有限公司 Keyword classification method and system
CN104077341A (en) * 2013-07-19 2014-10-01 腾讯科技(北京)有限公司 Keyword auto-response mapping relation generation method and device in instant messaging
US10243889B2 (en) 2013-07-19 2019-03-26 Tencent Technology (Shenzhen) Company Limited Keyword based automatic reply generation in a messaging application
US10382368B2 (en) 2013-07-19 2019-08-13 Tencent Technology (Shenzhen) Company Limited Methods and systems for creating auto-reply messages
US9794198B2 (en) 2013-07-19 2017-10-17 Tencent Technology (Shenzhen) Company Limited Methods and systems for creating auto-reply messages
CN104021231A (en) * 2014-06-26 2014-09-03 北京奇虎科技有限公司 Method and device for displaying webpage in browser
CN104021231B (en) * 2014-06-26 2017-07-28 北京奇虎科技有限公司 Method and device for displaying webpage in browser
CN105930096B (en) * 2016-04-12 2019-01-11 中国民航信息网络股份有限公司 A kind of data block pre-cache method based on PageRank
CN105930096A (en) * 2016-04-12 2016-09-07 中国民航信息网络股份有限公司 PageRank based data block pre-caching method
CN105894417A (en) * 2016-06-12 2016-08-24 深圳市悦好教育科技有限公司 Method for grading low-grade reading books of primary school based on proportion of standard curriculum characters
CN107784054A (en) * 2017-02-16 2018-03-09 平安科技(深圳)有限公司 A kind of page dissemination method and device
CN108170542A (en) * 2017-12-26 2018-06-15 上海亿动信息技术有限公司 A kind of control method and device that advertising information exchange is realized based on local page
CN108170542B (en) * 2017-12-26 2022-06-28 上海亿动信息技术有限公司 Control method and device for realizing advertisement information exchange based on local page
CN108319376A (en) * 2017-12-29 2018-07-24 北京奇虎科技有限公司 A kind of input association recommendation method and device that optimization business word is promoted
CN108319376B (en) * 2017-12-29 2021-11-26 北京奇虎科技有限公司 Input association recommendation method and device for optimizing commercial word promotion
CN108647342A (en) * 2018-05-14 2018-10-12 佛山市真觉网络科技有限公司 A method of the spider crawl of optimization Baidu
CN113177148B (en) * 2021-05-21 2022-06-24 滨州职业学院 Data pushing method and device and storage medium
CN113177148A (en) * 2021-05-21 2021-07-27 滨州职业学院 Data pushing method and device and storage medium
CN113836434A (en) * 2021-11-25 2021-12-24 山东捷瑞数字科技股份有限公司 Web page data processing method based on database
CN113836434B (en) * 2021-11-25 2022-03-04 山东捷瑞数字科技股份有限公司 Web page data processing method based on database

Similar Documents

Publication Publication Date Title
CN1932817A (en) Common interconnection network content keyword interactive system
US8782040B2 (en) Generating ranked search results using linear and nonlinear ranking models
US20220237655A1 (en) Providing Advertisements from Related Search Queries
US10325033B2 (en) Determination of content score
JP6416150B2 (en) Search method, search system, and computer program
CN1278626A (en) Cost-reduced on-line service and method for self-adaptive defining advertisement target and apparatus thereof
US8209616B2 (en) System and method for interfacing a web browser widget with social indexing
US7873640B2 (en) Semantic analysis documents to rank terms
US9449094B2 (en) Navigating among content items in a set
US7849081B1 (en) Document analyzer and metadata generation and use
US9798820B1 (en) Classification of keywords
CN1825308A (en) Web search system and method thereof
US8548981B1 (en) Providing relevance- and diversity-influenced advertisements including filtering
US9251206B2 (en) Generalized edit distance for queries
CN101036157A (en) Determining ad targeting information and/or ad creative information using past search queries
CN101079064A (en) Web page sequencing method and device
KR20090111813A (en) Online computer-aided translation
WO2013060015A1 (en) Advertisement determination system and method for clustered search results
US20080201219A1 (en) Query classification and selection of associated advertising information
US20150193814A1 (en) Systems and methods for context-based video advertising
WO2014081762A1 (en) Mobile-commerce store generator that automatically extracts and converts data
CN106874368B (en) RTB bidding advertisement position value analysis method and system
JP5814089B2 (en) Information display control device, information display control method, and program
CN107085573B (en) Hotspot information acquisition method and device
JP2006146446A (en) Retrieval optimization system and method for web site

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication