WO2012094937A1 - Web page pre-reading method, relay server, and web page pre-reading system - Google Patents

Web page pre-reading method, relay server, and web page pre-reading system

Info

Publication number
WO2012094937A1
Authority
WO
WIPO (PCT)
Prior art keywords
page
read
reading
network resource
sub
Prior art date
Application number
PCT/CN2011/084107
Other languages
English (en)
French (fr)
Inventor
梁捷
Original Assignee
广州市动景计算机科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州市动景计算机科技有限公司
Priority to US13/580,961 (US8375107B2)
Publication of WO2012094937A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Definitions

  • the present invention relates to the field of mobile internet, and more particularly to a web page pre-reading method, a relay server, and a web page pre-reading system having the same.
  • the server predicts which files need to be pre-loaded based on the user's historical browsing behavior and the web page layout; when the user browses, the pre-reading function is implemented from the pre-loaded files, so that the user does not have to wait for the page to load and quickly obtains the required page content from the server.
  • Chinese patent application No. 200910313007.8 proposes a web page pre-reading and integrated browsing system for mobile communication device terminals and an application method thereof. It uses a pre-reading module that reads and saves the subpages of the current web page in advance and a web page combination module that combines the current web page and its subpages into one combined page; it performs a multi-layer search on the pre-read pages and recombines the obtained subpages with the current page into a single unified page displayed to the user, improving the user's browsing experience.
  • however, the pre-reading and integration scheme of Chinese patent application No. 200910313007.8 simply uses keywords on the web page such as "next page"/"next chapter"/"next section" as hotspots that trigger pre-reading, and acquires and combines the corresponding subpages. For web pages with a complex subpage structure (such as a serialized novel that has both "next page" and "next chapter" links), it cannot pre-read accurately. Its pre-reading rule is also simplistic, takes no account of users' browsing preferences, and is not user-friendly.
  • an object of the present invention is to provide a web page pre-reading and integration technology based on a relay server that overcomes the simplistic pre-reading rules and poor user-friendliness of the prior art.
  • a web page pre-reading method comprising the following steps:
  • a relay server comprising:
  • a receiving unit configured to receive a page access request for page data of a page having multiple sub-pages from the mobile terminal, and receive page data of the requested page from the network resource server in response to the page access request;
  • a predetermined keyword analyzing unit configured to perform pre-read keyword analysis on the page data or pre-read subpage data currently returned from the network resource server, to obtain the pre-read keywords contained in the currently returned page data or subpage data;
  • a page pre-reading unit configured to pre-read page data of the next subpage of the current page or subpage from the network resource server according to the pre-read keyword having the highest priority among the obtained pre-read keywords;
  • a judging unit configured to determine whether the page data of the predetermined number of layers of subpages has been pre-read;
  • a sending unit configured to send a page access request to the network resource server, and to send the pre-read page data of the predetermined number of layers of subpages to the mobile terminal;
  • the processing of the predetermined keyword analyzing unit and the page pre-reading unit is repeated, starting from the acquired page data of the page, until the judging unit determines that the page data of the predetermined number of layers of subpages of the page has been pre-read from the network resource server.
  • a web page pre-reading system including a mobile terminal, a network resource server, and the above-described relay server is provided.
  • with the relay-server-based web page pre-reading and integration solution, valid subpages can be acquired flexibly and the acquired subpages can be merged and re-laid out. When the user browses a web page that would otherwise require turning pages many times, a single combined page is formed and displayed, with all content rendered in one layout pass. This makes page presentation more flexible, user-friendly and fast, reduces the user's operations, and improves the experience of browsing pages with multiple subpages.
  • the strategy of multi-layer search and analysis of the pre-read pages predicts the user's next click more accurately, which improves click response speed, reduces erroneous pre-reads, and saves network traffic;
  • the user basically does not have to wait, which saves the user's time and reduces the number of access requests sent to the network.
  • applied to reading web novels or news spread across several consecutive web pages, the invention can greatly improve the experience of mobile browser users.
  • FIG. 1 is a flow chart showing a method for pre-reading a web page based on a relay server according to the present invention
  • Figure 2 shows a pre-reading process for each sub-page according to the present invention
  • Figure 3 is a block diagram showing the structure of a combined rearrangement of sub-pages of the same structure in accordance with the present invention
  • FIG. 4 is a block diagram showing a relay server for web page pre-reading of the present invention.
  • Figure 5 is a block diagram showing a web page pre-reading system of the present invention.
  • Figure 6 shows a web page pre-reading system with a relay server in accordance with the present invention.
  • the present invention adopts a multi-subpage merging pre-read scheme based on keyword priority and user behavior. The pre-read operations and the re-layout and merge processing are performed in the relay server, and only the final step of rendering and displaying the web page is left to the client, minimizing the burden on the mobile communication device terminal.
  • a subpage of a web page is defined relative to the page the user is currently browsing; specifically, it may be another page pointed to by a hyperlink in the current page, such as the next page.
  • the present invention adopts a basic client-server structure.
  • although the present invention is mainly designed for a mobile terminal acting as the web browsing client, applying it to access clients other than mobile terminals such as mobile phones, for example PCs or terminals of specific service providers (such as query terminals at airports and stations), is not excluded. Therefore, in the following description, "mobile terminal", "terminal", "client" and similar expressions all refer to the access client that interacts directly with the user.
  • FIG. 1 is a flow chart showing a method for pre-reading a web page based on a relay server according to the present invention.
  • as shown in FIG. 1, when the mobile terminal user browses a web page, the mobile terminal submits the user's page access request to the relay server, and the relay server requests the corresponding page data from the corresponding network resource server according to that page access request (step S110).
  • after the relay server receives the page data returned from the network resource server, it performs the following pre-reading process, starting from the acquired page data, until the page data of the predetermined number of layers of subpages has been pre-read from the network resource server (step S120): pre-read keyword analysis is performed on the page data, or pre-read subpage data, currently returned from the network resource server to obtain the pre-read keywords it contains; then the page data of the next subpage of the current page or subpage is pre-read from the network resource server according to the pre-read keyword having the highest priority among those obtained.
  • specifically, in one example shown in FIG. 2, pre-read keyword analysis is performed on the page data returned from the network resource server (S121); then the URL of the subpage of the requested page is obtained from the network resource server according to the priority of the obtained pre-read keywords (S123); after obtaining the URL, the relay server pre-reads the subpage from the network resource server according to that URL (S125), and it is determined whether the predetermined number of layers of subpages has been pre-read (S127).
  • if not, the process returns to step S121, where pre-read keyword analysis is performed on the page data of the subpage just pre-read.
  • the processing of steps S121 to S127 is repeated until it is determined in step S127 that the predetermined number of layers of subpages has been pre-read.
  • the predetermined number of layers may be determined according to the hardware configuration of the terminal or set by the user. The number of pre-read layers can be controlled by a counter or by any other counting method known to those skilled in the art.
  • after pre-reading the subpages of the specified number of layers, the relay server merges and rearranges the pre-read subpages (S130); finally, the merged and rearranged combined page data is sent to the mobile terminal (S140), thereby realizing pre-reading and integrated browsing of the web pages the user needs.
  • the mobile terminal can then retrieve the merged subpages directly from its cache for display, so that an appropriate number of subpages can be browsed without any page-turning operation.
  • the priority of the pre-read keywords is reduced from left to right:
  • the relay server selects the keyword with the highest priority and saves the subpage that the keyword's link points to, together with the subpages reached by following the same keyword link on each subpage in turn.
  • for example, suppose the home page of a news item contains only a summary of the news, and the highest-priority keyword on that page is "Next page". The body of the news has 5 pages in total, and each page carries the keyword "Next page". The relay server will then pre-read the second to the fifth page of the news as subpages of the home page.
  • massive user behavior is also taken into account in pre-reading: after the URL of the subpage is acquired in step S123, it is further determined whether that URL is the same as the URL of the page that, according to the massive user behavior statistics, is most frequently accessed after the current page. If they are the same, processing proceeds to the pre-reading of step S125; if not, the jump access rate of the most frequently accessed page URL decides whether that page is pre-read.
  • if the jump access rate exceeds a preset threshold, the most frequently accessed page is pre-read according to its URL; otherwise, no pre-read operation is performed.
  • for its massive user behavior statistics, the relay server records the access behavior of all users and derives from these records various statistical parameters related to access behavior, such as the access frequency (how many times a page is accessed per day) and the probability that one page jumps to another.
  • from these statistics, the page most frequently visited after a given web page is accessed (including that page's subpages and other pages jumped to) is obtained. If the URL of the subpage pre-read according to the keyword is the same as the URL of that most frequently visited page, it can be concluded that the user wishes to keep reading the news, so all subpages pre-read according to the keyword are merged, rearranged, and sent to the mobile terminal for caching.
  • the number of combined subpages may be determined according to the hardware configuration of the terminal or set by the user. Considering users' browsing habits and the configuration of most mobile terminals, the number of combined subpages generally does not exceed four layers; preferably, two to three layers of subpages are combined and rearranged.
  • the statistics of the historical access behavior of massive users are used to determine the page most frequently accessed after the web page is accessed, and the jump access rate to that page determines whether the jump target is pre-read.
  • the threshold for the jump access rate can be set according to an empirical value, such as a value between 60% and 80%. In one specific implementation of the present invention, if the jump access rate of the web page reaches 70%, the most frequently visited page is pre-read; if it does not reach 70%, no subsequent pre-read is performed.
  • this comparison against the massive user behavior statistics may be applied only when pre-reading the first-layer subpage, or the URL of every subpage may be compared with the jump page URL obtained from the statistics during the pre-reading of each subpage, so that each page read by the relay server is closer to the user's browsing needs.
  • when the comparison is made for every subpage, the process is as follows: each time the URL of a subpage is acquired, it is determined whether that URL is the same as the URL of the page that, according to the massive user behavior statistics, is most frequently accessed after the page preceding the subpage. If they are the same, processing proceeds to the pre-read operation of step S125 above; otherwise, the jump access rate of the most frequently accessed page URL determined from the statistics decides whether that page is pre-read.
  • if the jump access rate exceeds a preset threshold, the most frequently accessed page is pre-read according to its URL; if the jump access rate does not exceed the preset threshold, no pre-read operation is performed and only the subpages already pre-read are merged and rearranged.
  • subpages based on the same content share the same page structure (the same markup language, the same page title, and so on), so the contents of the subpages can be combined and rearranged on the basis of this feature. Specifically, each web page is divided into a title part and a body part according to the web page markup language; the body parts of the subpages are extracted and combined into the body part of the new combined web page, and the title of the new combined web page takes the part that the subpage titles have in common. The complete combined page content can then be output directly when the user browses.
  • FIG. 3 is a schematic diagram showing the simple structure of the combined rearrangement of subpages of the same structure. As shown in FIG. 3, it is assumed that the page currently browsed by the user is Pagef, and that its subpages are subpage 1, subpage 2, subpage 3, and subpage 4.
  • the body part of subpage 1 is text01, and the title is abc1;
  • the body part of subpage 2 is text02, and the title is abc2;
  • the body part of subpage 3 is text03, and the title is abc3;
  • the body part of subpage 4 is text04, and the title is abc4;
  • after the relay server obtains and saves subpage 1 to subpage 4 by pre-reading, the massive user behavior statistics show that, in the users' access history, the page most frequently accessed after the currently browsed page Pagef is subpage 1, so all subpages of Pagef can be merged and re-laid out into a combined web page 5.
  • the body part of the combined web page 5 is the combination of the body parts of subpage 1, subpage 2, subpage 3, and subpage 4, that is, the continuous combination of text01, text02, text03 and text04; the title of the combined web page 5 is the part common to the titles of subpage 1, subpage 2, subpage 3 and subpage 4, that is, abc.
  • the relay server sends the combined and rearranged web page 5 to the client.
  • when the user clicks the hyperlink "Next Page" on the current page Pagef, the client renders and displays the combined web page 5. In this way, the user does not have to perform cumbersome page-turning operations to obtain more content that continues from the current page, and thus gets a better browsing experience.
  • the links on a web page that can trigger the pre-reading operation are not limited to "next page"; there may also be "next section", "next chapter" or even "next volume". When several such pre-reading hotspot keywords exist at the same time, the priority of the keywords on the web page must be judged first.
  • in general, "next page" has the higher priority, and the priority of "next chapter" is higher than the priority of "next volume".
  • for example, after obtaining the first page of Chapter 2, the relay server pre-reads pages 2 to 7 of Chapter 2 according to the priority of the keywords, and then analyzes which page most users visit after browsing the first page (massive user behavior statistics). If the most frequently visited page is page 2, the content of all 7 pages, or the content of pages 2 to 4, is merged, rearranged, and sent to the cache of the mobile terminal.
  • when the user clicks the keyword "next page" on the current web page, the mobile terminal directly retrieves the cached data of the merged page for display. If the relay server finds from its analysis that the page most frequently accessed by all users after the first page is not the second page but the third-to-last page of the chapter, the relay server pre-reads that page and several subpages following it until the number of pre-read subpages reaches the specified number of layers; it then combines and rearranges all the pre-read pages and sends them to the mobile terminal.
  • if the relay server finds from its analysis that the page most frequently accessed by all users after the first page is neither the second page nor any page of the novel, but another page with unrelated content, and the jump access rate from the novel page to that unrelated page reaches 70%, the relay server pre-reads that page and its subpages, merges and rearranges them, and sends the result to the mobile terminal.
  • the degree of similarity between subpages is usually taken into account when merging. Merging and rearrangement can be driven by judging the subpage URLs and, of course, also by judging how similar the subpage layouts are.
  • the pre-read logic of the present invention comprises two kinds of pre-reading: keyword-based pre-reading and pre-reading based on statistics (massive user behavior statistics).
  • keyword-based pre-reading has priority over statistics-based pre-reading.
  • that is, pages that contain pre-read keywords are pre-read according to those keywords; pages without pre-read keywords are pre-read according to the relay server's massive user behavior statistics.
  • a web page pre-reading and integrated browsing method based on a relay server according to the present invention is described above with reference to FIGS. 1 and 2.
  • the foregoing webpage pre-reading and integrated browsing method based on the transit server of the present invention may be implemented by software, implemented by hardware, or implemented by a combination of software and hardware.
  • the present invention also provides a relay server for web page pre-reading and integrated browsing
  • FIG. 4 shows a block diagram of a relay server 400 for web page pre-reading according to the present invention.
  • the relay server 400 for web page pre-reading includes a receiving unit 410, a pre-read keyword analyzing unit 420, a page pre-reading unit 430, a judging unit 440, a pre-reading page combining unit 450, and a transmitting unit 460.
  • the receiving unit 410 is configured to receive a page access request for page data of a page having multiple sub-pages from the mobile terminal, and receive page data of the requested page from the network resource server in response to the page access request.
  • the pre-read keyword analyzing unit 420 is configured to perform pre-read keyword analysis on the page data or pre-read subpage data currently returned from the network resource server, to obtain the pre-read keywords contained in the currently returned page data or subpage data.
  • the page pre-reading unit 430 is configured to pre-read page data of the next subpage of the current page or subpage from the network resource server according to the pre-read keyword having the highest priority among the obtained pre-read keywords.
  • the judging unit 440 is configured to determine whether the page data of the predetermined number of layers of subpages has been pre-read.
  • the processing of the pre-read keyword analyzing unit 420 and the page pre-reading unit 430 is repeated, starting from the acquired page data of the page, until the judging unit 440 determines that the page data of the predetermined number of layers of subpages of the page has been pre-read from the network resource server.
  • the pre-reading page combining unit 450 is configured to merge and rearrange the subpage data of all the subpages of the predetermined number of layers pre-read by the page pre-reading unit.
  • the sending unit 460 transmits the merged rearranged combined page data to the mobile terminal.
  • the page pre-reading unit 430 may further include: a URL address obtaining module 431, configured to obtain from the network resource server the URL address of the subpage to be pre-read, according to the pre-read keyword having the highest priority among the pre-read keywords obtained by the pre-read keyword analyzing unit; and a page data obtaining module 433, configured to pre-read the subpage data to be pre-read from the network resource server according to the obtained URL address.
  • the priority of the set pre-read keywords is from the highest to the lowest: next page, [next page], next page, [next page], next page
  • the relay server pre-reads the subpages as follows: the page data obtaining module 433 in the page pre-reading unit 430 pre-reads the first-layer subpage from the network resource server according to the URL of the subpage and the priority of the pre-read keywords.
  • the pre-read keyword analyzing unit 420 then analyzes the first-layer subpage and obtains the pre-read keywords it contains; next, the URL address obtaining module 431 in the page pre-reading unit 430 acquires from the network resource server the URL of the next-layer subpage of the first-layer subpage according to the priority of the pre-read keywords.
  • the page pre-reading unit 430 continues to pre-read the next-layer subpage from the network resource server according to that URL, and the pre-read keyword analyzing unit 420 then analyzes the next-layer subpage, and so on until all subpages of the specified number of layers have been pre-read.
  • the number of combined sub-pages generally does not exceed four layers, and the sub-pages of the second to third layers are preferably combined and rearranged.
  • the relay server 400 further includes a massive user behavior statistics unit 470, which compiles the historical access behavior, access frequency, and jump access rate of each page accessed by users from the web browsing behavior of a large number of users.
  • the massive user behavior statistics unit 470 records the access behavior of all users and derives from these records various statistical parameters, such as the access frequency (how many times a page is accessed per day) and the probability that one page jumps to another.
  • the relay server 400 may further include a determining unit (not shown) which, after the URL address obtaining module acquires from the network resource server the URL address of the subpage to be pre-read, determines whether the acquired URL address is the same as the URL of the page that, according to the massive user behavior statistics unit 470, is most frequently accessed after this page. If they are the same, the page data obtaining module 433 in the page pre-reading unit 430 acquires the page data of the subpage to be pre-read from the network resource server according to the URL of the next subpage of the previously pre-read page or subpage. If they are not the same, the page pre-reading unit performs the pre-read operation according to the jump access rate of the most frequently accessed page URL.
  • if the jump access rate exceeds the preset threshold, the page pre-reading unit 430 takes the most frequently accessed page URL as the URL of the subpage obtained from the network resource server and acquires the page data from that URL; otherwise, if the jump access rate does not exceed the preset threshold, the page pre-reading unit 430 does not perform any pre-read operation.
  • the preset threshold of the jump access rate may be set according to an empirical value, such as a value between 60% and 80%.
  • in one implementation, the preset threshold is 70%: if the jump access rate of the web page reaches 70%, the most frequently visited web page is pre-read; if it does not reach 70%, no subsequent pre-read is performed.
  • the present invention further provides a web page pre-reading system including a mobile terminal, a network resource server and the foregoing relay server, and a block diagram thereof is shown in FIG. 6.
  • the merge and rearrangement process in FIG. 1 can be omitted.
  • the pre-reading page combining unit 450 and the massive user behavior statistics unit 470 in FIG. 4 may also be omitted.
  • besides the implementation described above, other methods well known in the art may also be used to pre-read, from the network resource server, the page data of the next subpage of the current page or subpage.
  • the invention combines keyword priority with analysis of massive user behavior to match the pre-read pages to the user's browsing intention, and merges and re-lays out the pre-read pages, so that the end user can browse more web content at once without turning pages many times. The pre-read content also better satisfies the user, which improves the web browsing experience more effectively.
  • the method according to the invention can also be implemented as a computer program executed by a CPU.
  • when the computer program is executed by the CPU, the above-described functions defined in the method of the present invention are performed.
  • the above method steps and system elements may also be implemented with a controller or processor and a computer readable storage device for storing a computer program that causes the controller or processor to perform the steps or unit functions described above.
  • a computer readable storage device (e.g., a memory) can be a volatile memory or a nonvolatile memory, or can include both volatile and nonvolatile memory.
  • non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM), which can act as external cache memory.
  • RAM can be obtained in a variety of forms, such as synchronous RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
  • DSP: digital signal processor
  • ASIC: application-specific integrated circuit
  • FPGA: field programmable gate array
  • a general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • the processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • a software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor, such that the processor can read information from or write information to the storage medium.
  • the storage medium can be integrated with a processor.
  • the processor and the storage medium can reside in an ASIC.
  • the ASIC can reside in the user terminal.
  • the processor and the storage medium may reside as discrete components in the user terminal.
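
The statistics check and the merge step described in the definitions above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names `choose_preread_url` and `merge_subpages` are hypothetical, the "common part" of the subpage titles is taken to be their common prefix, and bodies are joined with newlines; the source fixes none of these details. The 70% threshold and the abc1..abc4 / text01..text04 values come from the text.

```python
import os
from typing import List, Optional, Tuple

# The text suggests an empirical threshold between 60% and 80%; 70% is its example.
JUMP_RATE_THRESHOLD = 0.70


def choose_preread_url(keyword_url: str, most_visited_url: str,
                       jump_rate: float,
                       threshold: float = JUMP_RATE_THRESHOLD) -> Optional[str]:
    """Decide which URL to pre-read, or None to stop pre-reading.

    keyword_url      -- URL reached via the highest-priority pre-read keyword
    most_visited_url -- page most often visited next, per the access statistics
    jump_rate        -- share of visits that jump to most_visited_url (0.0-1.0)
    """
    if keyword_url == most_visited_url:
        return keyword_url            # keyword and statistics agree
    if jump_rate >= threshold:        # "reaches 70%" in the text's example
        return most_visited_url       # follow what most users actually do
    return None                       # no further pre-read


def merge_subpages(subpages: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Merge (title, body) pairs into one combined (title, body) page.

    Assumption: the shared part of the titles is their common prefix.
    """
    titles = [title for title, _ in subpages]
    bodies = [body for _, body in subpages]
    return os.path.commonprefix(titles), "\n".join(bodies)


if __name__ == "__main__":
    pages = [("abc1", "text01"), ("abc2", "text02"),
             ("abc3", "text03"), ("abc4", "text04")]
    print(merge_subpages(pages))
    print(choose_preread_url("/novel/2", "/other", jump_rate=0.5))
```

With the FIG. 3 values, the merged title is `abc` and the merged body is the four text parts in order. A real relay server would compare layouts or URLs before merging, as the text notes; that similarity check is left out here.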

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Description

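Web page pre-reading method, relay server, and web page pre-reading system
The present invention relates to the field of mobile internet, and more particularly to a web page pre-reading method, a relay server, and a web page pre-reading system having the relay server.
With social progress and technological development, people increasingly use mobile terminals to access the network wirelessly and obtain information. In the ordinary browsing flow, the user clicks a link on the current page and the linked page is loaded on demand. For current mobile access, especially from mobile phones, the limits of terminal hardware and network interfaces mean that the user typically waits ten to twenty seconds between clicking and the page finishing loading. This wait costs the user time and patience. For this reason, web page pre-reading technology is becoming widely used.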
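Technical problem
The common prior-art approach to web page pre-reading is as follows:
Based on the historical browsing behavior of the client user and on the web page layout, the server predicts which files need to be pre-loaded. When the user browses, the pre-reading function is carried out using the pre-loaded files, so that the user does not have to wait for the page to load and quickly obtains the required page content from the server.
US patent application US7284035(B2) discloses a similar solution: it determines whether certain subpages of a web page will be fetched by the user and, if so, preferentially pre-reads those subpages. The preference for subpages is derived by analyzing the pages the user visited previously; the factors considered include the user's access history for the subpages of a given page, the number of days since the visit, and the number of subpages on the page. For example, when a user visits the same news site every morning and always reads articles in the politics, computer, travel and reading columns, then according to that patent these preferences are determined when the news page is visited, and articles in the politics, computer, travel and reading columns are loaded into the browser cache with higher priority than those of other columns.
However, with this prior-art pre-reading of specific subpages, the user still has to turn pages to obtain the subpages one by one from the client cache. For a user browsing a page with many subpages (such as a serialized novel), pre-reading shortens the waiting time to some extent, but frequent page turning still disrupts continuous browsing.
Chinese patent application No. 200910313007.8 proposes a web page pre-reading and integrated browsing system for mobile communication device terminals and a method of applying it. It uses a pre-reading module that reads and saves the subpages of the current page in advance, and a page combination module that combines the current page and its subpages into one combined page. It performs a multi-layer search on the pre-read pages and recombines the obtained subpages with the current page into a single page displayed to the user, improving the browsing experience.
However, the pre-reading and integration scheme of Chinese patent application No. 200910313007.8 simply treats keywords on the page such as "下一页" (next page), "下一章" (next chapter) and "下一节" (next section) as hot spots that trigger pre-reading, and fetches and combines the corresponding subpages. For pages with a complex subpage structure (for example a serialized novel that contains both "下一页" and "下一章" links) it cannot pre-read accurately. Its pre-reading rule is also one-dimensional and takes no account of user browsing preferences, so it is not very user-friendly.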
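Technical solution
In view of the above problems, the object of the present invention is to provide a relay-server-based web page pre-reading and integration technique that overcomes the single pre-reading rule and poor user-friendliness of the prior art.
According to one aspect of the present invention, a web page pre-reading method is provided, comprising the following steps:
requesting, from a network resource server, the page data of a page having a plurality of subpages, according to a page access request for that page data sent by a mobile terminal;
after obtaining the page data of the page from the network resource server, executing the following pre-reading process, starting from the obtained page data of the page, until the page data of a predetermined number of layers of subpages of the page has been pre-read from the network resource server:
performing pre-read keyword analysis on the page data currently returned from the network resource server, or on the pre-read subpage data, to obtain the pre-read keywords contained in the currently returned page data or subpage data;
pre-reading, from the network resource server, the page data of the next subpage of the current page or subpage according to the pre-read keyword with the highest priority among the obtained pre-read keywords; and
sending the pre-read page data of the predetermined number of layers of subpages to the mobile terminal.
According to another aspect of the present invention, a relay server is provided, comprising:
a receiving unit, configured to receive from a mobile terminal a page access request for the page data of a page having a plurality of subpages and, in response to the page access request, to receive the page data of the requested page from a network resource server;
a pre-read keyword analysis unit, configured to perform pre-read keyword analysis on the page data currently returned from the network resource server, or on the pre-read subpage data, to obtain the pre-read keywords contained in the currently returned page data or subpage data;
a page pre-reading unit, configured to pre-read, from the network resource server, the page data of the next subpage of the current page or subpage according to the pre-read keyword with the highest priority among the obtained pre-read keywords;
a judging unit, configured to judge whether the page data of the predetermined number of layers of subpages has been pre-read; and
a sending unit, configured to send the page access request to the network resource server and to send the pre-read page data of the predetermined number of layers of subpages to the mobile terminal,
wherein, after the page data of the page is obtained from the network resource server, the processing of the pre-read keyword analysis unit and the page pre-reading unit is repeated, starting from the obtained page data of the page, until the judging unit determines that the page data of the predetermined number of layers of subpages of the page has been pre-read from the network resource server.
According to a further aspect of the present invention, a web page pre-reading system is provided, comprising a mobile terminal, a network resource server and the relay server described above.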
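With the relay-server-based web page pre-reading and integration solution of the present invention, valid subpages can be obtained flexibly and then merged and rearranged. When the user browses a page that would require many page turns, a single combined page is formed and displayed, so that all the content is laid out and rendered on the terminal in one pass. This makes page presentation more flexible, user-friendly and fast, reduces the number of user operations, and improves the experience of browsing multi-subpage pages.
In addition, the strategy of searching and analyzing the pre-read pages over multiple layers predicts the user's next click more accurately. This both speeds up the response to a click and reduces erroneous pre-reads, saving network traffic. Because pages are downloaded during idle time, the user hardly has to wait, which saves user time and reduces the number of network access requests.
Applied to web novels or to news articles spread over several consecutive pages, the present invention can greatly improve the experience of mobile browser users.
To achieve the above and related objects, one or more aspects of the present invention include the features described in detail below and particularly pointed out in the claims. The following description and the accompanying drawings set out certain exemplary aspects of the invention in detail. These aspects indicate only some of the various ways in which the principles of the invention may be used. Furthermore, the invention is intended to include all such aspects and their equivalents.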
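Brief description of the drawings
Other objects and results of the present invention will become clearer and easier to understand by reference to the following description taken with the accompanying drawings and to the claims, and as the invention is more fully understood. In the drawings:
Figure 1 shows a flow chart of the relay-server-based web page pre-reading method according to the present invention;
Figure 2 shows the pre-reading flow for each subpage according to the present invention;
Figure 3 is a schematic diagram of combining and rearranging subpages of the same structure according to the present invention;
Figure 4 shows a block diagram of the relay server for web page pre-reading according to the present invention;
[Rectified under Rule 91 12.01.2012] 
Figure 5 shows a block diagram of the web page pre-reading system of the present invention;
[Rectified under Rule 91 12.01.2012] 
Figure 6 shows a web page pre-reading system having the relay server according to the present invention.
The same reference numerals denote similar or corresponding features or functions throughout the drawings.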
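Embodiments of the invention
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
To obtain valid subpages flexibly and to merge and rearrange them, the present invention adopts a multi-subpage merging and pre-reading scheme based on keyword priority and user behavior. The pre-reading, re-layout and merging are all handled in the relay server, and only the final step of rendering and displaying the page is left to the client, which minimizes the load on the mobile communication device terminal.
Note that, in describing the technical solution of the present invention, a subpage of a web page is defined relative to the page the user is currently browsing. Specifically, it may be any other page pointed to by a hyperlink in the current page, such as the next page.
In addition, the present invention adopts a basic client-server architecture. Although it is designed mainly for mobile terminals acting as web browsing clients, it may also be applied to access clients other than mobile phones and similar mobile terminals, such as PCs and dedicated service terminals (for example query terminals at airports and stations). In the description below, the terms "mobile terminal", "terminal" and "client" therefore all refer to an access client that interacts directly with the user.
Figure 1 shows a flow chart of the relay-server-based web page pre-reading method according to the present invention.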
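As shown in Figure 1, when a mobile terminal user browses a web page, the mobile terminal submits the user's page access request to the relay server, and the relay server requests the corresponding page data from the corresponding network resource server according to that request (S110). After the relay server receives the page data returned by the network resource server, it executes the following pre-reading process, starting from the obtained page data, until the page data of a predetermined number of layers of subpages of the page has been pre-read from the network resource server: it performs pre-read keyword analysis on the page data currently returned from the network resource server, or on the pre-read subpage data, to obtain the pre-read keywords contained in it; and, according to the pre-read keyword with the highest priority among those obtained, it pre-reads from the network resource server the page data of the next subpage of the current page or subpage (step S120). Specifically, in one example shown in Figure 2, the page data returned from the network resource server is first analyzed for pre-read keywords (S121). The URL of a subpage of the requested page is then obtained from the network resource server according to the priority of the obtained pre-read keywords (S123). Once the subpage URL has been obtained, the relay server pre-reads the page data of that subpage from the network resource server using the URL (S125). In step S127 it then determines whether the predetermined number of layers of subpage data has been pre-read. If not, the flow returns to step S121, where pre-read keyword analysis is performed on the page data of the subpage just pre-read. Steps S121 to S127 are repeated until step S127 determines that the predetermined number of layers has been pre-read. The predetermined number of layers may be determined from the hardware configuration of the terminal or set by the user. The number of pre-read layers can be controlled with a counter or any other quantity-control method known to those skilled in the art.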
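The loop of steps S121 to S127 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: `fetch` and `find_next_link` are hypothetical callables supplied by the caller, `max_layers` stands for the predetermined number of layers, and the cycle check is an added safeguard that the description does not mention.

```python
from urllib.parse import urljoin


def preread_subpages(start_url, start_html, fetch, find_next_link, max_layers=3):
    """Pre-read up to max_layers subpages by following the best keyword link.

    fetch(url) -> html and find_next_link(html) -> href or None are
    hypothetical callables; they are not defined by the source document.
    """
    pages = []
    current_url, current_html = start_url, start_html
    seen = {start_url}
    layers = 0  # the counter used for the check in step S127
    while layers < max_layers:  # S127: stop after the predetermined layers
        href = find_next_link(current_html)  # S121 and S123
        if href is None:
            break  # no pre-read keyword on this page
        next_url = urljoin(current_url, href)
        if next_url in seen:  # added safeguard against link cycles
            break
        seen.add(next_url)
        current_html = fetch(next_url)  # S125
        current_url = next_url
        pages.append((next_url, current_html))
        layers += 1
    return pages
```

Passing the fetching and link-finding steps in as parameters keeps the loop independent of any particular HTTP client or HTML parser.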
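After the specified number of layers of subpages has been pre-read, the relay server merges and rearranges the pre-read subpages (S130), and finally sends the merged and rearranged combined page data to the mobile terminal (S140). This achieves pre-reading and integrated browsing of the web pages the user needs.
With this pre-reading and integration of the subpages of the requested page by the relay server, when the user clicks, on the page currently being browsed, the keyword determined by pre-read keyword priority or the link to the most visited page, the mobile terminal can directly retrieve the pre-read page merged from several subpages out of its cache and display it. The user can thus browse a suitable number of subpages without any page turning.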
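The priority of the pre-read keywords decreases from left to right:
下页、[下页]、下一页、[下一页]、下页|、>>下页、>>下页|、下一张、[下一张]、[->]、>、[>]、[->>]、>>、[>>]、下章、[下章]、下一章、[下一章]、下节、[下节]
When judging the priority of the keywords on a page, the relay server selects the keyword with the highest priority and saves the subpage that the keyword's link points to, the subpage that the same keyword's link on that subpage points to, and so on. For example, suppose the first page of a news article contains only a summary, the highest-priority keyword on that page is "下页" (next page), the body of the article has 5 pages in total, and the keyword "下页" appears at the bottom of every page. The relay server then pre-reads pages 2 to 5 of the article as subpages of the first page.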
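Selecting the highest-priority keyword can be illustrated with a short Python sketch. The keyword order is copied from the list above; everything else is an assumption made for illustration: `LinkCollector` and `best_preread_link` are hypothetical names, and matching a keyword against the exact anchor text of an `<a>` element is only one possible way to do the analysis.

```python
from html.parser import HTMLParser

# Pre-read keywords, highest priority first (order taken from the description).
PREREAD_KEYWORDS = [
    "下页", "[下页]", "下一页", "[下一页]", "下页|", ">>下页", ">>下页|",
    "下一张", "[下一张]", "[->]", ">", "[>]", "[->>]", ">>", "[>>]",
    "下章", "[下章]", "下一章", "[下一章]", "下节", "[下节]",
]
PRIORITY = {kw: rank for rank, kw in enumerate(PREREAD_KEYWORDS)}


class LinkCollector(HTMLParser):
    """Collects (anchor text, href) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def best_preread_link(html):
    """Return (keyword, href) for the highest-priority keyword link, or None."""
    parser = LinkCollector()
    parser.feed(html)
    candidates = [(PRIORITY[text], text, href)
                  for text, href in parser.links if text in PRIORITY]
    if not candidates:
        return None
    _, keyword, href = min(candidates)  # lowest rank means highest priority
    return keyword, href
```

On a page that carries both a "下一章" link and a "下一页" link, the function returns the "下一页" link, because that keyword appears earlier in the list.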
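Besides keyword priority, a preferred embodiment of the present invention also takes mass user behavior into account when pre-reading. After the URL of the subpage is obtained in step S123, it is further determined whether that URL is the same as the URL of the page that, according to mass user behavior statistics, is most often visited after the current page. If it is the same, the pre-reading flow continues with step S125. If it is different, whether to pre-read the most visited page is decided according to the jump access rate of the most visited page URL.
If the jump access rate exceeds a preset threshold, the most visited page is pre-read using its URL. Otherwise, if the jump access rate does not exceed the preset threshold, no pre-reading is performed.
In the technical solution of the present invention, the mass user behavior statistics of the relay server consist of recording the access behavior of all users and deriving from those records various statistical parameters related to user access behavior, such as access frequency (the number of visits per day) and the probability of jumping from one page to another. All of these are computed from the recorded access behavior.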
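As a rough illustration of such statistics, the following Python sketch derives, for each page, the most visited next page and its jump access rate from a log of page-to-page transitions. The log format (a list of `(from_url, to_url)` pairs) and the name `jump_statistics` are assumptions for this example; the source does not specify how access records are stored.

```python
from collections import Counter, defaultdict


def jump_statistics(transitions):
    """Map each page to (most visited next page, jump access rate).

    transitions is an assumed log format: an iterable of
    (from_url, to_url) pairs, one per recorded page jump.
    """
    counts = defaultdict(Counter)
    for src, dst in transitions:
        counts[src][dst] += 1
    stats = {}
    for src, nexts in counts.items():
        dst, hits = nexts.most_common(1)[0]
        # Share of all jumps leaving src that went to the top destination.
        stats[src] = (dst, hits / sum(nexts.values()))
    return stats
```

Here the jump access rate is taken to be the fraction of all recorded jumps from a page that lead to its most visited successor, which is one plausible reading of the term.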
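Specifically, statistics on the historical access behavior of a large number of users yield the page most often visited after a given page (which may be one of its subpages or some other page reached by a jump). If the URL of the subpage pre-read by keyword priority is the same as the URL of that most visited page, it can be concluded that users strongly wish to read the article continuously. All subpages pre-read by keyword are then merged, rearranged and sent to the mobile terminal for caching.
The number of combined subpages may be determined from the hardware configuration of the terminal or set by the user. Considering user browsing habits and the configuration of most mobile terminals, the combined subpages generally do not exceed four layers, and preferably 2 to 3 layers of subpages are merged and rearranged.
If the subpage pre-read by keyword priority differs from the page most often visited after the given page, the jump access rate of that most visited page, taken from the statistics on the historical access behavior of a large number of users, determines whether the jump target is pre-read. The threshold for the jump access rate can be set empirically, for example to a value between 60% and 80%. In one specific embodiment of the present invention, if the jump access rate of the page reaches 70%, the most visited page is pre-read; if it does not reach 70%, no further pre-reading is performed.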
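The decision described above fits in a few lines of Python. The sketch below is illustrative only: `choose_preread_target` is a hypothetical name, the default threshold of 0.7 follows the 70% example, and treating a rate equal to the threshold as sufficient is an assumption, since the text uses both "exceeds" and "reaches".

```python
def choose_preread_target(keyword_url, most_visited_url, jump_rate, threshold=0.7):
    """Return the URL to pre-read next, or None to stop pre-reading.

    keyword_url: subpage URL found through the highest-priority keyword.
    most_visited_url: page most often visited next, from user statistics.
    jump_rate: jump access rate of most_visited_url (0.0 to 1.0).
    """
    if keyword_url == most_visited_url:
        return keyword_url  # both signals agree: continue with step S125
    if jump_rate >= threshold:  # assumption: reaching the threshold suffices
        return most_visited_url
    return None  # signals disagree and the jump rate is too low
```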
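This comparison against mass user behavior statistics may be applied only when pre-reading the first layer of subpages, or the subpage URL may be compared with the jump page URL obtained from the statistics every time a subpage is pre-read, so that every page pre-read by the relay server matches the user's browsing needs more closely.
As an example, comparing the subpage URL with the statistically derived jump page URL on every pre-read works as follows. Each time the URL of a subpage is obtained, it is determined whether that URL is the same as the URL of the page that, according to mass user behavior statistics, is most often visited after the page preceding the subpage. If it is the same, the pre-reading of step S125 continues. Otherwise, whether to pre-read the most visited page is decided according to the jump access rate of the most visited page URL determined from the statistics. As before, if the jump access rate exceeds the preset threshold, the most visited page is pre-read using its URL; if it does not, no pre-reading is performed and only the subpages already pre-read are merged and rearranged.
Two specific embodiments are given below to illustrate the relay-server-based web page pre-reading and integrated browsing method provided by the present invention.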
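Embodiment 1
In terms of page layout design, subpages that carry the same content share the same page structure: they use the same markup language, the same page title, and so on. This property can be used to combine and rearrange the content of the subpages. Specifically, each page is divided by its markup into a title part and a body part. The body parts of the subpages are extracted and combined into the body of a new combined page, and the title of the new combined page is the part common to the subpage titles. When the user browses, the whole combined page is output directly.
Figure 3 is a simple schematic diagram of combining and rearranging subpages of the same structure. As shown in Figure 3, assume the page the user is currently browsing is Pagef, and its subpages are subpage 1, subpage 2, subpage 3 and subpage 4.
The body of subpage 1 is text01 and its title is abc1;
the body of subpage 2 is text02 and its title is abc2;
the body of subpage 3 is text03 and its title is abc3;
the body of subpage 4 is text04 and its title is abc4.
After the relay server has obtained and saved subpages 1 to 4 by pre-reading, mass user behavior statistics show that, in the recorded access history, the page most often visited after the current page Pagef is subpage 1. All subpages of Pagef can therefore be merged and re-laid out into combined page 5.
As shown in Figure 3, after combination and rearrangement the body of combined page 5 is the combination of the bodies of subpages 1, 2, 3 and 4, that is, text01, text02, text03 and text04 in sequence. The title of combined page 5 is the part common to the titles of subpages 1, 2, 3 and 4, namely abc. When the combination is complete, the relay server sends combined page 5 to the client.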
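The combination in this embodiment can be reproduced with a small Python sketch. It assumes each subpage has already been split into a `(title, body)` pair, takes the "common part" of the titles to be their common prefix, and joins the bodies with a newline; `merge_subpages` is a hypothetical name and all three choices are illustrative assumptions.

```python
import os


def merge_subpages(subpages):
    """Merge (title, body) pairs into one combined (title, body) pair.

    The combined title is the common prefix of all titles; the combined
    body is the bodies concatenated in order, one per line.
    """
    titles = [title for title, _ in subpages]
    bodies = [body for _, body in subpages]
    # os.path.commonprefix compares character by character, so it works
    # on arbitrary strings, not only on file paths.
    return os.path.commonprefix(titles), "\n".join(bodies)
```

For the four subpages of Figure 3 this yields the title `abc` and a body made of `text01` to `text04` in sequence.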
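When the user clicks the hyperlink "下一页" (next page) on the current page Pagef, the client renders and displays combined page 5. The user thus obtains more content that follows on from the current page without tedious page turning, and enjoys a better browsing experience.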
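Embodiment 2
On pages that carry serialized web novels, the chapter structure of the novel means that the links able to trigger pre-reading include not only "下一页" (next page) but possibly also "下一节" (next section), "下一章" (next chapter) and even "下一卷" (next volume). When several pre-read hot-spot keywords are present at once, the priority of the keywords on the page must be determined first.
For example, on the pages of a certain web novel, "下一页" has a higher priority than "下一章", which in turn has a higher priority than "下一卷". Suppose the user is reading page 1 of chapter 2 (and chapter 2 has 7 pages in total). According to the keyword priority, after obtaining page 1 of chapter 2 the relay server pre-reads pages 2 to 7 of chapter 2, and then analyzes which page all users most often visit after finishing page 1 (mass user behavior statistics). If that most visited page is page 2, the content of these 7 pages, or of pages 2 to 4, is merged and rearranged and sent to the cache of the mobile terminal; when the user clicks the keyword "下一页" on the current page, the mobile terminal directly retrieves the merged page data from the cache and displays it. If the analysis shows that the page users most often visit after page 1 is not page 2 but the third-to-last page of the last chapter, the relay server pre-reads the third-to-last page of the last chapter and several subpages following it, until the number of pre-read subpages reaches the specified number of layers; it then merges and rearranges all the pre-read pages and sends them to the mobile terminal. If the analysis shows that the page users most often visit after page 1 is neither page 2 nor any page of the novel, but an unrelated page, and the jump access rate from the novel page to that unrelated page is as high as 70%, the relay server pre-reads the unrelated page and its subpages, merges and rearranges them, and sends them to the mobile terminal.
In the present invention, merging usually takes the similarity between subpages into account. Merging and rearrangement may be based on a comparison of the subpage URLs, or on the similarity of the subpage layouts.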
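One simple way to compare subpage URLs is to mask their numeric parts and check whether the remaining patterns match. The Python sketch below shows this heuristic. It is an assumption made for illustration and is not described in the source; `url_pattern` and `same_series` are hypothetical names.

```python
import re


def url_pattern(url):
    """Replace every run of digits with a placeholder (an assumed heuristic)."""
    return re.sub(r"\d+", "{n}", url)


def same_series(urls):
    """True if all URLs share one pattern once their numbers are masked."""
    return len({url_pattern(u) for u in urls}) == 1
```

Under this heuristic, `.../novel/2/1.html` and `.../novel/2/2.html` count as pages of the same series and are candidates for merging, while an unrelated page such as `.../ad.html` is not.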
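The pre-reading logic of the present invention covers two kinds of pre-reading: keyword-based and statistics-based (mass user behavior statistics). As a general rule, keyword-based pre-reading takes priority over statistics-based pre-reading, and the description above deals with pages that contain keywords. In other words, for a page that contains pre-read keywords, pre-reading follows the keywords first; for a page without pre-read keywords, pre-reading follows the mass user behavior statistics of the relay server.
The relay-server-based web page pre-reading and integrated browsing method according to the present invention has been described above with reference to Figures 1 and 2. The method may be implemented in software, in hardware, or in a combination of software and hardware.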
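Corresponding to the above method, the present invention also provides a relay server for web page pre-reading and integrated browsing. Figure 4 shows a block diagram of the relay server 400 for web page pre-reading according to the present invention. As shown in Figure 4, the relay server 400 includes a receiving unit 410, a pre-read keyword analysis unit 420, a page pre-reading unit 430, a judging unit 440, a pre-read page combination unit 450 and a sending unit 460.
The receiving unit 410 is configured to receive from the mobile terminal a page access request for the page data of a page having a plurality of subpages and, in response to the page access request, to receive the page data of the requested page from the network resource server.
The pre-read keyword analysis unit 420 is configured to perform pre-read keyword analysis on the page data currently returned from the network resource server, or on the pre-read subpage data, to obtain the pre-read keywords contained in the currently returned page data or subpage data. The page pre-reading unit 430 is configured to pre-read, from the network resource server, the page data of the next subpage of the current page or subpage according to the pre-read keyword with the highest priority among the obtained pre-read keywords. The judging unit 440 is configured to judge whether the page data of the predetermined number of layers of subpages has been pre-read. After the page data of the page is obtained from the network resource server, the processing of the pre-read keyword analysis unit 420 and the page pre-reading unit 430 is repeated, starting from the obtained page data, until the judging unit 440 judges that the page data of the predetermined number of layers of subpages of the page has been pre-read from the network resource server. The pre-read page combination unit 450 is configured to merge and rearrange the subpage data of all the subpages of the predetermined number of layers pre-read by the page pre-reading unit. The sending unit 460 sends the merged and rearranged combined page data to the mobile terminal.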
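In one example of the present invention, as shown in Figure 5, the page pre-reading unit 430 may further include: a URL address acquisition module 431, configured to obtain from the network resource server the URL address of the subpage to be pre-read, according to the pre-read keyword with the highest priority among those obtained by the pre-read keyword analysis unit; and a page data acquisition module 433, configured to pre-read the data of that subpage from the network resource server using the obtained URL address.
In the pre-read keyword analysis unit 420, the priority of the configured pre-read keywords, from high to low, is: 下页、[下页]、下一页、[下一页]、下页|、>>下页、>>下页|、下一张、[下一张]、[->]、>、[>]、[->>]、>>、[>>]、下章、[下章]、下一章、[下一章]、下节、[下节].
With reference to Figures 4 and 5, the relay server pre-reads subpages layer by layer as follows. After the page data acquisition module 433 in the page pre-reading unit 430 has pre-read the first-layer subpage from the network resource server according to the subpage URL and the pre-read keyword priority, the pre-read keyword analysis unit 420 analyzes the first-layer subpage and obtains the pre-read keywords it contains. The URL address acquisition module 431 in the page pre-reading unit 430 then obtains from the network resource server, according to the pre-read keyword priority, the URL of the next-layer subpage of the first-layer subpage. The page pre-reading unit 430 next pre-reads the next-layer subpage from the network resource server using that URL, and the pre-read keyword analysis unit 420 analyzes the next-layer subpage in turn, until all subpages of the specified number of layers have been pre-read.
Considering user browsing habits and the configuration of most mobile terminals, the combined subpages generally do not exceed four layers, and preferably 2 to 3 layers of subpages are merged and rearranged.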
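To make the pre-read pages match the user's browsing intention more closely, a preferred embodiment of the present invention also takes mass user behavior into account when pre-reading. The relay server 400 further includes a mass user behavior statistics unit 470, configured to compile, from the web browsing behavior of a large number of users, the historical access behavior, access frequency and jump access rate for each page. The mass user behavior statistics unit 470 records the access behavior of all users and derives from those records various statistical parameters, such as access frequency (the number of visits per day) and the probability of jumping from one page to another. All of these are computed from the recorded access behavior.
In addition, the relay server 400 may further include a determining unit (not shown), configured to determine, after the URL address acquisition module has obtained from the network resource server the URL address of the subpage to be pre-read, whether the obtained URL address is the same as the URL of the page that, according to the mass user behavior statistics unit 470, is most often visited after the page. If it is the same, the page data acquisition module 433 in the page pre-reading unit 430 obtains the page data of the subpage to be pre-read from the network resource server, using the URL of the next subpage of the previously pre-read page or subpage. If it is different, the page pre-reading unit performs pre-reading according to the jump access rate of the most visited page URL. If the jump access rate exceeds a preset threshold, the page pre-reading unit 430 takes the most visited page URL as the URL of the subpage to be obtained from the network resource server and obtains the page data from that URL. Otherwise, if the jump access rate does not exceed the preset threshold, the page pre-reading unit 430 performs no pre-reading.
The preset threshold for the jump access rate can be set empirically, for example to a value between 60% and 80%. In a preferred embodiment of the present invention, the preset threshold is 70%: if the jump access rate of the page reaches 70%, the most visited page is pre-read; if it does not reach 70%, no further pre-reading is performed.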
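In a further aspect, the present invention also provides a web page pre-reading system comprising a mobile terminal, a network resource server and the relay server described above. Its block diagram is shown in Figure 6.
Note that the above embodiments are merely illustrative and not restrictive. Those skilled in the art will appreciate that various modifications can be made to them.
For example, in one example the merging and rearrangement process in Figure 1 may be omitted. Likewise, the pre-read page combination unit 450 and the mass user behavior statistics unit 470 in Figure 4 may be omitted.
In addition, pre-reading from the network resource server the page data of the next subpage of the current page or subpage, according to the pre-read keyword with the highest priority among those obtained, may be done in other ways known in the art besides those disclosed in the above embodiments.
By combining keyword priority with the analysis of mass user behavior, the present invention selects pre-read pages that match the user's browsing intention and merges and re-lays out multiple pre-read pages. This not only lets the end user browse more web content at once without repeated page turning, but also greatly improves user satisfaction with the pre-read content, and so improves the web browsing experience more effectively.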
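In addition, the method according to the present invention may also be implemented as a computer program executed by a CPU. When the computer program is executed by the CPU, the functions defined in the method of the present invention are performed.
Furthermore, the above method steps and system units may also be implemented with a controller or processor and a computer-readable storage device that stores a computer program causing the controller or processor to carry out the functions of those steps or units.
It should also be understood that the computer-readable storage device (for example, memory) described herein may be volatile memory or non-volatile memory, or may include both. By way of example and not limitation, non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in many forms, such as synchronous RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to include, without being limited to, these and other suitable types of memory.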
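Those skilled in the art will also appreciate that the various exemplary logical blocks, modules, circuits and algorithm steps described in connection with this disclosure may be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the various illustrative components, blocks, modules, circuits and steps have been described generally in terms of their functionality. Whether such functionality is implemented as software or as hardware depends on the particular application and on the design constraints imposed on the overall system. Those skilled in the art may implement the described functionality in various ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.
The various exemplary logical blocks, modules and circuits described in connection with this disclosure may be implemented or performed with the following components designed to perform the functions described herein: a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these. A general purpose processor may be a microprocessor, but in the alternative the processor may be any conventional processor, controller, microcontroller or state machine. A processor may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with this disclosure may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. In one alternative, the storage medium may be integrated with the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In one alternative, the processor and the storage medium may reside as discrete components in a user terminal.
The web page pre-reading and integrated browsing method and system according to the present invention have been described above by way of example with reference to the accompanying drawings. Those skilled in the art should understand, however, that various improvements can be made to the proposed method and system without departing from the content of the present invention. The scope of protection of the present invention should therefore be determined by the content of the appended claims.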

Claims (11)

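  1. A web page pre-reading method, comprising:
    requesting, from a network resource server, the page data of a page having a plurality of subpages, according to a page access request for that page data sent by a mobile terminal;
    after obtaining the page data of the page from the network resource server, executing the following pre-reading process, starting from the obtained page data of the page, until the page data of a predetermined number of layers of subpages of the page has been pre-read from the network resource server:
    performing pre-read keyword analysis on the page data currently returned from the network resource server, or on the pre-read subpage data, to obtain the pre-read keywords contained in the currently returned page data or subpage data;
    pre-reading, from the network resource server, the page data of the next subpage of the current page or subpage according to the pre-read keyword with the highest priority among the obtained pre-read keywords; and
    sending the pre-read page data of the predetermined number of layers of subpages to the mobile terminal.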
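  2. The web page pre-reading method according to claim 1, wherein the step of pre-reading, from the network resource server, the page data of the next subpage of the current page or subpage according to the pre-read keyword with the highest priority among the obtained pre-read keywords comprises:
    obtaining, from the network resource server, the URL address of the next subpage to be pre-read according to the pre-read keyword with the highest priority among the pre-read keywords contained in the currently returned page data or subpage data; and
    pre-reading, from the network resource server, the page data of the next subpage to be pre-read according to the obtained URL address.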
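  3. The web page pre-reading method according to claim 2, wherein, after the URL address of the next subpage to be pre-read is obtained from the network resource server, the method further comprises:
    determining whether the URL of the subpage to be pre-read is the same as the URL of the page that, according to mass user behavior statistics, is most often visited after the currently returned page or subpage,
    if it is the same, obtaining from the network resource server the page data of the next subpage of the currently returned page or subpage according to the URL of that next subpage;
    if it is different, determining whether to pre-read the most visited page according to the jump access rate of the most visited page URL.
  4. The web page pre-reading method according to claim 3, wherein
    if the jump access rate exceeds a preset threshold, the most visited page is pre-read according to the most visited page URL; otherwise
    if the jump access rate does not exceed the preset threshold, no pre-reading is performed.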
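  5. The web page pre-reading method according to claim 1, wherein, after the page data of the predetermined number of layers of subpages has been pre-read, the method further comprises:
    merging and rearranging the pre-read page data of the predetermined number of layers of subpages; and
    sending the merged and rearranged combined page data to the mobile terminal.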
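  6. A relay server, comprising:
    a receiving unit, configured to receive from a mobile terminal a page access request for the page data of a page having a plurality of subpages and, in response to the page access request, to receive the page data of the requested page from a network resource server;
    a pre-read keyword analysis unit, configured to perform pre-read keyword analysis on the page data currently returned from the network resource server, or on the pre-read subpage data, to obtain the pre-read keywords contained in the currently returned page data or subpage data;
    a page pre-reading unit, configured to pre-read, from the network resource server, the page data of the next subpage of the current page or subpage according to the pre-read keyword with the highest priority among the obtained pre-read keywords;
    a judging unit, configured to judge whether the page data of the predetermined number of layers of subpages has been pre-read; and
    a sending unit, configured to send the page access request to the network resource server and to send the pre-read page data of the predetermined number of layers of subpages to the mobile terminal,
    wherein, after the page data of the page is obtained from the network resource server, the processing of the pre-read keyword analysis unit and the page pre-reading unit is repeated, starting from the obtained page data of the page, until the judging unit determines that the page data of the predetermined number of layers of subpages of the page has been pre-read from the network resource server.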
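  7. The relay server according to claim 6, wherein the page pre-reading unit comprises:
    a URL address acquisition module, configured to obtain from the network resource server the URL address of the subpage to be pre-read, according to the pre-read keyword with the highest priority among the pre-read keywords obtained by the pre-read keyword analysis unit; and
    a page data acquisition module, configured to pre-read, from the network resource server, the data of the subpage to be pre-read according to the obtained URL address.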
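  8. The relay server according to claim 7, further comprising:
    a mass user behavior statistics unit, configured to compile, from the web browsing behavior of a large number of users, the historical access behavior, access frequency and jump access rate for each page; and
    a determining unit, configured to determine, after the URL address acquisition module has obtained from the network resource server the URL address of the subpage to be pre-read, whether the obtained URL address is the same as the URL of the page that, according to the mass user behavior statistics unit, is most often visited after the page,
    wherein, when the obtained URL address is the same as the most visited page URL obtained by the mass user behavior statistics unit, the page pre-reading unit obtains the page data of the subpage to be pre-read from the network resource server according to the URL of the next subpage of the previously pre-read page or subpage;
    when the obtained URL address differs from the most visited page URL obtained by the mass user behavior statistics unit, the page pre-reading unit performs page pre-reading according to the jump access rate of the most visited page URL.
  9. The relay server according to claim 8, wherein
    if the jump access rate exceeds a preset threshold, the page pre-reading unit pre-reads the page data to be pre-read from the network resource server according to the most visited page URL; otherwise
    if the jump access rate does not exceed the preset threshold, the page pre-reading unit performs no pre-reading.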
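  10. The relay server according to claim 6, further comprising:
    a pre-read page combination unit, configured to merge and rearrange the subpage data of all the subpages of the predetermined number of layers pre-read by the page pre-reading unit, and
    wherein the sending unit sends the merged and rearranged combined page data to the mobile terminal.
  11. A web page pre-reading system, comprising a mobile terminal, a network resource server, and the relay server according to any one of claims 6 to 10.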
PCT/CN2011/084107 2011-01-14 2011-12-16 Web page pre-reading method, relay server, and web page pre-reading system WO2012094937A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/580,961 US8375107B2 (en) 2011-01-14 2011-12-16 Webpage pre-reading method, transfer server and webpage pre-reading system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110008500.6 2011-01-14
CN201110008500.6A CN102123168B (zh) 2011-01-14 2011-01-14 基于中转服务器的网页页面预读及整合方法和***

Publications (1)

Publication Number Publication Date
WO2012094937A1 true WO2012094937A1 (zh) 2012-07-19

Family

ID=44251620

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/084107 WO2012094937A1 (zh) 2011-01-14 2011-12-16 Web page pre-reading method, relay server, and web page pre-reading system

Country Status (3)

Country Link
US (1) US8375107B2 (zh)
CN (1) CN102123168B (zh)
WO (1) WO2012094937A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552906A (zh) * 2020-04-24 2020-08-18 上海连尚网络科技有限公司 一种用于响应阅读应用中页面访问请求的方法与设备
CN115037801A (zh) * 2022-03-14 2022-09-09 阿里巴巴(中国)有限公司 优先级调整方法、电子设备及存储介质

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102123168B (zh) * 2011-01-14 2012-07-18 广州市动景计算机科技有限公司 基于中转服务器的网页页面预读及整合方法和***
CN102185923A (zh) * 2011-05-16 2011-09-14 广州市动景计算机科技有限公司 一种移动通讯设备终端网页浏览方法
CN103067341B (zh) * 2011-10-20 2017-05-03 ***通信集团四川有限公司 网上营业厅访问方法、***和装置
US10296558B1 (en) * 2012-02-27 2019-05-21 Amazon Technologies, Inc. Remote generation of composite content pages
CN103309905A (zh) * 2012-03-16 2013-09-18 百度在线网络技术(北京)有限公司 对网页进行转码后合并阅读的方法以及服务器
CN102821088B (zh) * 2012-05-07 2015-12-16 北京京东世纪贸易有限公司 获取网络数据的***和方法
CN103488411B (zh) * 2012-06-13 2016-06-01 腾讯科技(深圳)有限公司 切换页面的方法和装置
CN103678307B (zh) * 2012-08-31 2016-07-13 腾讯科技(深圳)有限公司 页面显示方法及客户端
CN103678324B (zh) * 2012-09-03 2019-03-19 百度在线网络技术(北京)有限公司 一种用于打开网页的方法、装置和设备
CN103678393B (zh) * 2012-09-20 2018-06-15 腾讯科技(深圳)有限公司 获取信息的方法和装置
US20140082484A1 (en) * 2012-09-20 2014-03-20 Tencent Technology (Shenzhen) Company Limited Method and apparatus for obtaining information
CN103778115A (zh) * 2012-10-17 2014-05-07 腾讯科技(深圳)有限公司 网站名称提取方法及装置
CN103870479B (zh) * 2012-12-11 2018-01-05 腾讯科技(武汉)有限公司 网页显示方法和装置
CN104427369B (zh) * 2013-09-09 2018-08-10 联想(北京)有限公司 遥控端设备、被遥控端设备以及用于其的方法
CN104462142B (zh) * 2013-09-24 2019-01-15 联想(北京)有限公司 一种搜索网页页面中内容的方法及装置
CN103617228A (zh) * 2013-11-25 2014-03-05 北京奇虎科技有限公司 一种计算关联网页URL模式pattern的方法和装置
CN103617229A (zh) * 2013-11-25 2014-03-05 北京奇虎科技有限公司 一种关联网页数据库的建立方法和装置
CN103631906A (zh) * 2013-11-25 2014-03-12 北京奇虎科技有限公司 一种识别网页url中页码标识的方法和装置
CN104731817B (zh) * 2013-12-23 2019-11-22 腾讯科技(深圳)有限公司 一种网页展现方法和装置
US9886422B2 (en) * 2014-08-06 2018-02-06 International Business Machines Corporation Dynamic highlighting of repetitions in electronic documents
CN104268236B (zh) * 2014-09-28 2018-03-16 深圳市优网科技有限公司 一种识别网页浏览业务的方法及装置
CN104410675A (zh) * 2014-11-12 2015-03-11 北京奇虎科技有限公司 数据传输方法、数据***及相关装置
CN104506641B (zh) * 2014-12-30 2018-03-06 百度在线网络技术(北京)有限公司 网页应用程序的访问方法和装置
US10602332B2 (en) * 2016-06-20 2020-03-24 Microsoft Technology Licensing, Llc Programming organizational links that propagate to mobile applications
CN108255918B (zh) * 2017-09-15 2020-11-03 阿里巴巴(中国)有限公司 预读关键词集合的获取方法、网页访问设备及电子设备
CN109766082B (zh) * 2017-11-09 2022-04-12 北京京东尚科信息技术有限公司 应用程序页面跳转的方法和装置
CN111782328A (zh) * 2020-07-02 2020-10-16 支付宝(杭州)信息技术有限公司 应用处理的方法及装置
CN113779450A (zh) * 2020-08-31 2021-12-10 北京沃东天骏信息技术有限公司 页面访问方法和页面访问装置
CN113282354B (zh) * 2021-06-28 2023-04-07 中国平安人寿保险股份有限公司 应用程序的h5页面加载方法、装置、设备及存储介质
CN114139072B (zh) * 2021-10-29 2024-06-21 北京达佳互联信息技术有限公司 页面数据处理方法、装置、电子设备及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1588867A (zh) * 2004-09-14 2005-03-02 吴怡达 以网页浏览器为介面的点对点分散式搜索下载***及方法
CN101325602A (zh) * 2008-07-30 2008-12-17 广州市动景计算机科技有限公司 一种微浏览器智能预读网页的方法及***
CN102123168A (zh) * 2011-01-14 2011-07-13 广州市动景计算机科技有限公司 基于中转服务器的网页页面预读及整合方法和***

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3555523B2 (ja) * 1999-10-20 2004-08-18 日本電気株式会社 メモリ管理装置及び管理方法並びに管理プログラムを記録した記録媒体
US7058691B1 (en) * 2000-06-12 2006-06-06 Trustees Of Princeton University System for wireless push and pull based services
US8065620B2 (en) * 2001-01-31 2011-11-22 Computer Associates Think, Inc. System and method for defining and presenting a composite web page
US20080222283A1 (en) * 2007-03-08 2008-09-11 Phorm Uk, Inc. Behavioral Networking Systems And Methods For Facilitating Delivery Of Targeted Content
CN101777068B (zh) * 2009-12-31 2015-07-22 优视科技有限公司 一种用于移动通讯设备终端的网页页面预读及整合浏览***及其应用方法
CN101777081A (zh) * 2010-03-08 2010-07-14 中兴通讯股份有限公司 一种提高网页访问速度的方法及装置
US20120066359A1 (en) * 2010-09-09 2012-03-15 Freeman Erik S Method and system for evaluating link-hosting webpages
US9646100B2 (en) * 2011-03-14 2017-05-09 Verisign, Inc. Methods and systems for providing content provider-specified URL keyword navigation

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552906A (zh) * 2020-04-24 2020-08-18 上海连尚网络科技有限公司 一种用于响应阅读应用中页面访问请求的方法与设备
CN111552906B (zh) * 2020-04-24 2023-06-27 上海连尚网络科技有限公司 一种用于响应阅读应用中页面访问请求的方法与设备
CN115037801A (zh) * 2022-03-14 2022-09-09 阿里巴巴(中国)有限公司 优先级调整方法、电子设备及存储介质

Also Published As

Publication number Publication date
US8375107B2 (en) 2013-02-12
US20120317244A1 (en) 2012-12-13
CN102123168A (zh) 2011-07-13
CN102123168B (zh) 2012-07-18

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 13580961

Country of ref document: US

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11855791

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC OF 081113

122 Ep: pct application non-entry in european phase

Ref document number: 11855791

Country of ref document: EP

Kind code of ref document: A1