CN103761231A - Method and device for providing media content information of page by search engine - Google Patents


Info

Publication number
CN103761231A
Authority
CN
China
Prior art keywords
media content
webpage
arrow
indicated
content information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310487591.5A
Other languages
Chinese (zh)
Inventor
侯小虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201310487591.5A
Publication of CN103761231A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/955: Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9558: Details of hyperlinks; Management of linked annotations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method for a search engine to provide the media content information of a web page. The method comprises the following steps: upon receiving a search request that matches an identifier of media content information preset in the web page, extracting the text information and media content information of the web page as the search result of the search request; and, in response to a selection of the text information and media content information of the web page, displaying the search result.

Description

Method and device for a search engine to provide web page media content information
Technical field
The present invention relates to the field of computer technology, and in particular to a method and device for a search engine to provide web page media content information.
Background
With the development and spread of computer technology, the demand for obtaining all kinds of media information through search engines grows daily. At present, nearly all media content, such as pictures, animations, audio and video, is carried in web pages. Media content information is therefore obtained mainly by entering keywords, matching related web pages, and presenting those pages in the search results. Search results are presented mainly as text, for example with the matched keywords highlighted in red, as shown in Fig. 1, and give no indication of whether a web page contains media content or of what that content is. This approach has the following problems. From the text in the search results alone, a user cannot tell how much of the desired media content each web page actually contains, how relevant it is, or whether the page is suspected of cheating to gain clicks. To find media content, the user must judge each result by its highlighted keywords, open each page, and then screen the pages, which is inefficient. Because users do not know what media content lies behind each page, many highly ranked pages receive many clicks even though their actual content does not meet users' needs. Moreover, mainstream search engines currently use click feedback mechanisms, so these pages that do not meet users' needs keep ranking highly, which deviates from actual user needs and lowers search efficiency.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method and device for a search engine to provide web page media content information that overcome, or at least partly solve, the above problems.
According to a first aspect of the present invention, a method for a search engine to provide web page media content information is provided, comprising the steps of: upon receiving a search request that matches an identifier of media content information preset in a web page, extracting the text information and media content information of the web page as the search result of the search request; and, in response to a selection of the text information and media content information of the web page, displaying the search result.
Optionally, in the method according to an embodiment of the present invention, the media content comprises at least one of the following: pictures, animations, audio and video.
Optionally, in the method according to an embodiment of the present invention, the step of extracting the text information and media content information of the web page as the search result of the search request comprises: upon receiving a search request that matches the identifier of media content information preset in the web page, extracting at least one of the following items of text information of the web page as the search result of the search request: title, abstract and body text.
Optionally, in the method according to an embodiment of the present invention, the step of extracting the text information and media content information of the web page as the search result of the search request comprises: upon receiving a search request that matches the identifier of media content information preset in the web page, extracting at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
Optionally, in the method according to an embodiment of the present invention, the step of extracting the text information and media content information of the web page as the search result of the search request comprises: extracting a second URL address pre-allocated for each media content item, wherein the second URL address points to a page displaying second thumbnails of one or more media content items.
Optionally, in the method according to an embodiment of the present invention, the step of extracting the text information and media content information of the web page as the search result of the search request comprises: upon receiving a search request that matches the identifier of media content information preset in the web page, extracting the text information and media content information of the web page; and combining the text information and media content information of the web page in a predetermined manner as the search result of the search request.
Optionally, in the method according to an embodiment of the present invention, the step of combining the text information and media content information of the web page in a predetermined manner as the search result of the search request comprises: selecting the first thumbnail of one media content item from the media content information of the web page; and displaying the first thumbnail of that media content item in the search result.
Optionally, in the method according to an embodiment of the present invention, the step of displaying the search result in response to a selection of the text information and media content information of the web page comprises: in response to a selection of the first thumbnail of the one media content item, jumping to the second URL address to obtain the page displaying the second thumbnails of one or more media content items.
Optionally, in the method according to an embodiment of the present invention, the step of displaying the search result in response to a selection of the text information and media content information of the web page further comprises: in response to a selection of the second thumbnail of any media content item displayed at the second URL address, jumping to the first URL address of that media content item to display the information of that media content item.
Optionally, in the method according to an embodiment of the present invention, the step of combining the text information and media content information of the web page in a predetermined manner as the search result of the search request comprises: selecting the first thumbnails of multiple media content items from the media content information of the web page; and displaying the first thumbnails of the multiple media content items in the search result.
Optionally, in the method according to an embodiment of the present invention, the step of displaying the search result in response to a selection of the text information and media content information of the web page comprises: in response to a selection of the first thumbnail of any media content item, jumping to the first URL address of that media content item to display the information of that media content item.
Optionally, in the method according to an embodiment of the present invention, the media content information comprises a text part and a thumbnail part, and the step of displaying the search result in response to a selection of the text information and media content information of the web page comprises: in response to a selection of the text part, jumping to the second URL address to obtain the page displaying the second thumbnails of one or more media content items.
Optionally, in the method according to an embodiment of the present invention, the step of displaying the search result in response to a selection of the text information and media content information of the web page further comprises: in response to a selection of the second thumbnail of any media content item displayed at the second URL address, jumping to the first URL address of that media content item to display the information of that media content item.
According to a second aspect of the present invention, a device for a search engine to provide web page media content information is provided, comprising: an information extraction module, adapted to extract, upon receiving a search request that matches an identifier of media content information preset in a web page, the text information and media content information of the web page as the search result of the search request; and a search result display module, adapted to display the search result in response to a selection of the text information and media content information of the web page.
Optionally, in the device according to an embodiment of the present invention, the media content comprises at least one of the following: pictures, animations, audio and video.
Optionally, in the device according to an embodiment of the present invention, the information extraction module is adapted to: upon receiving a search request that matches the identifier of media content information preset in the web page, extract at least one of the following items of text information of the web page as the search result of the search request: title, abstract and body text.
Optionally, in the device according to an embodiment of the present invention, the information extraction module is adapted to: upon receiving a search request that matches the identifier of media content information preset in the web page, extract at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
Optionally, in the device according to an embodiment of the present invention, the information extraction module is adapted to: extract a second URL address pre-allocated for each media content item, wherein the second URL address points to a page displaying second thumbnails of one or more media content items.
Optionally, in the device according to an embodiment of the present invention, the information extraction module comprises: a text information extraction unit and a media content information extraction unit, adapted to extract the text information and media content information of the web page upon receiving a search request that matches the identifier of media content information preset in the web page; and an information combination unit, adapted to combine the text information and media content information of the web page in a predetermined manner as the search result of the search request.
Optionally, in the device according to an embodiment of the present invention, the information combination unit is adapted to: select the first thumbnail of one media content item from the media content information of the web page; and display the first thumbnail of that media content item in the search result.
Optionally, in the device according to an embodiment of the present invention, the search result display module is adapted to: in response to a selection of the first thumbnail of the one media content item, jump to the second URL address to obtain the page displaying the second thumbnails of one or more media content items.
Optionally, in the device according to an embodiment of the present invention, the search result display module is adapted to: in response to a selection of the second thumbnail of any media content item displayed at the second URL address, jump to the first URL address of that media content item to display the information of that media content item.
Optionally, in the device according to an embodiment of the present invention, the information combination unit is adapted to: select the first thumbnails of multiple media content items from the media content information of the web page; and display the first thumbnails of the multiple media content items in the search result.
Optionally, in the device according to an embodiment of the present invention, the search result display module is adapted to: in response to a selection of the first thumbnail of any media content item, jump to the first URL address of that media content item to display the information of that media content item.
Optionally, in the device according to an embodiment of the present invention, the media content information comprises a text part and a thumbnail part, and the search result display module is adapted to: in response to a selection of the text part, jump to the second URL address to obtain the page displaying the second thumbnails of one or more media content items.
Optionally, in the device according to an embodiment of the present invention, the search result display module is further adapted to: in response to a selection of the second thumbnail of any media content item displayed at the second URL address, jump to the first URL address of that media content item to display the information of that media content item.
The present invention provides the above method and device by which a search engine obtains web page media content information. According to embodiments of the present invention, the method and device give users a more intuitive and more easily understood way of searching for media content information, so that users can get a basic picture of the media content in a web page and judge the relevance of each search result, which improves search efficiency.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 is a schematic diagram of a web page displaying search results in the prior art;
Fig. 2 is a flowchart of a method for a search engine to collect web page media content information according to an embodiment of the present invention;
Fig. 3 is an exemplary view of a search result for web page picture information provided by a search engine according to an embodiment of the present invention;
Fig. 4 is an exemplary view of a search result for web page audio information provided by a search engine according to an embodiment of the present invention;
Fig. 5 is a flowchart of a method for a search engine to provide web page media content information according to an embodiment of the present invention;
Fig. 6 is an exemplary view of a search result for web page picture information provided by a search engine according to another embodiment of the present invention;
Fig. 7 is a flowchart of a method for a search engine to provide web page media content information according to another embodiment of the present invention;
Fig. 8 is a structural diagram of a device for a search engine to collect web page media content information according to an embodiment of the present invention;
Fig. 9 is a structural diagram of a device for a search engine to provide web page media content information according to an embodiment of the present invention;
Fig. 10 is a structural diagram of a device for a search engine to provide web page media content information according to another embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
In embodiments of the present invention, a search engine is a system that, following a certain strategy, uses specific computer programs to gather information from the internet, organizes and processes that information, provides a search service to users, and presents to users the information relevant to their searches.
Embodiment 1
The method by which a search engine collects web page media content information is introduced first. It specifically comprises:
crawling web page information; detecting whether the web page information contains a preset identifier of media content information; where the identifier is detected in the web page information, extracting the text information and media content information from the web page information; and building a text index library and a media content index library based on the text information and the media content information respectively.
Fig. 2 shows a flowchart of a method 100 for a search engine to collect web page media content information according to an embodiment of the present invention. In embodiments of the present invention, the media content may comprise at least one of the following: pictures, animations, audio and video. It will of course be understood that the media content may also comprise other content.
As shown in Fig. 2, in step S101, web page information is crawled. For example, web page information can be crawled from one or more website servers.
In an exemplary embodiment of the present invention, the web page information may comprise text information and media content information. Optionally, the text information may comprise at least one of the following: title, abstract and body text. Optionally, the media content information may comprise at least one of the following: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In an exemplary embodiment of the present invention, for a web page carrying pictures, the web page information may comprise text information and picture information. Optionally, the text information may comprise: the title (indicated by arrow 3A), the abstract (indicated by arrow 3B) and/or the URL of the web page (indicated by arrow 3C). Optionally, the picture information may comprise the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and/or the URL address of the picture (not shown). It will of course be understood that the text information and picture information may also comprise other content.
In an exemplary embodiment of the present invention, for a web page carrying audio, the web page information may comprise text information and audio information. Optionally, the text information may comprise: the title (indicated by arrow 4A), the abstract (indicated by arrow 4B) and/or the URL of the web page (indicated by arrow 4C). Optionally, the audio information may comprise the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown) and/or the URL address of the audio (not shown). It will of course be understood that the text information and audio information may also comprise other content.
In step S103, it is detected whether the web page information contains a preset identifier of media content information.
In an exemplary embodiment of the present invention, the preset identifier of media content information is used to judge whether the crawled web page information contains specific media content. Optionally, when the search keywords entered by a user match this specific media content, the search engine can provide and display search results that include this web page. It will of course be understood that embodiments of the present invention do not limit the concrete form of the preset identifier of media content information.
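The detection in step S103 can be sketched as follows. Since the form of the identifier is left open, this sketch assumes, purely for illustration, that a page declares its media type through a hypothetical `<meta name="media-content" content="...">` tag; the tag name, the `MEDIA_TYPES` set and the function name are assumptions and not part of the disclosure.

```python
from html.parser import HTMLParser

# Media types named in the description. The meta-tag convention below is a
# hypothetical identifier form chosen only for this sketch.
MEDIA_TYPES = {"picture", "animation", "audio", "video"}


class MediaIdentifierParser(HTMLParser):
    """Collects media types declared via <meta name="media-content" ...>."""

    def __init__(self):
        super().__init__()
        self.media_types = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if attr.get("name") == "media-content":
            # Attribute values may be None when the attribute has no value.
            value = (attr.get("content") or "").lower()
            if value in MEDIA_TYPES:
                self.media_types.add(value)


def detect_media_identifier(html):
    """Return the set of media types whose identifier appears in the page."""
    parser = MediaIdentifierParser()
    parser.feed(html)
    return parser.media_types
```

An empty set means no identifier was found, in which case the extraction of step S105 would be skipped for that page.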
In step S105, where the above identifier is detected in the web page information, the text information and media content information in the web page information are extracted.
In an exemplary embodiment of the present invention, step S105 may comprise: where the above identifier is detected in the web page information, extracting at least one of the following items of text information of the web page: title, abstract and body text; and extracting at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, for a web page carrying pictures, where a preset identifier of picture information is detected in the web page information, optionally at least one of the following items of text information of the web page is extracted: the title (indicated by arrow 3A), the abstract (indicated by arrow 3B) and the URL of the web page (indicated by arrow 3C). Optionally, at least one of the following items of picture information of the web page is extracted: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, for a web page carrying audio, where a preset identifier of audio information is detected in the web page information, optionally at least one of the following items of text information of the web page is extracted: the title (indicated by arrow 4A), the abstract (indicated by arrow 4B) and the URL of the web page (indicated by arrow 4C). Optionally, at least one of the following items of audio information of the web page is extracted: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown) and the URL address of the audio (not shown).
In an exemplary embodiment of the present invention, step S105 further comprises: allocating a second URL address for the web page, wherein the second URL address points to a page displaying second thumbnails of one or more media content items in the web page.
In the exemplary embodiment of the present invention shown in Fig. 3, for a web page carrying pictures, where the above identifier is detected in the web page information, the original URL address of each picture in the web page is extracted, and a new URL address is allocated for the web page, wherein the new URL address points to a page displaying thumbnails of one or more pictures in the web page. Optionally, the new URL address points to a page displaying thumbnails of all pictures in the web page. Optionally, when the user selects the option in the search result that corresponds to this new URL address, such as the picture title in Fig. 3 (indicated by arrow 3D), the browser jumps to the page corresponding to the new URL address (indicated by arrow 3G) to show the user thumbnails of all pictures in the web page. Optionally, when the user selects the thumbnail of any picture on that page (indicated by arrow 3H), the browser jumps to the original URL of that picture to provide the details of the picture.
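The two-level navigation described above (search result entry, then thumbnail page, then original media URL) can be sketched as below. The base address `GALLERY_BASE`, the hash-based naming scheme and all class and function names are hypothetical; the description does not say how the second URL address is generated.

```python
from dataclasses import dataclass, field
from hashlib import sha1

# Hypothetical host for the thumbnail pages; not specified in the source.
GALLERY_BASE = "https://search.example/gallery/"


@dataclass
class MediaItem:
    title: str
    first_url: str  # original URL address of the media content item


@dataclass
class PageRecord:
    page_url: str
    items: list = field(default_factory=list)
    second_url: str = ""  # URL of the page showing the second thumbnails


def allocate_second_url(page):
    """Assign the page a second URL derived from its own URL (assumed scheme)."""
    digest = sha1(page.page_url.encode("utf-8")).hexdigest()[:12]
    page.second_url = GALLERY_BASE + digest
    return page.second_url


def resolve_click(page, target, index=0):
    """Map a user selection to the URL the browser should jump to."""
    if target == "result_entry":        # e.g. the picture title in the result
        return page.second_url
    if target == "gallery_thumbnail":   # a second thumbnail on the gallery page
        return page.items[index].first_url
    raise ValueError(f"unknown target: {target}")
```

Deriving the second URL from a hash of the page URL keeps the allocation deterministic, so re-crawling the same page yields the same address; any other unique naming scheme would serve equally well.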
In step S107, a text index library and a media content index library are built based on the text information and the media content information respectively.
In an exemplary embodiment of the present invention, step S107 comprises: associating the text information in the text index library with the media content information about the same web page in the media content index library.
In the exemplary embodiment of the present invention shown in Fig. 3, for a web page carrying pictures, the text index library is built based on at least one of the following extracted items of text information: the title (indicated by arrow 3A), the abstract (indicated by arrow 3B) and the URL of the web page (indicated by arrow 3C). Optionally, the picture index library is built based on at least one of the following extracted items of picture information: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and the URL address of the picture (not shown). Optionally, the above text information in the text index library is associated with the above picture information about the same web page in the picture index library.
In the exemplary embodiment of the present invention shown in Fig. 4, for a web page carrying audio, the text index library is built based on at least one of the following extracted items of text information: the title (indicated by arrow 4A), the abstract (indicated by arrow 4B) and the URL of the web page (indicated by arrow 4C). Optionally, the audio index library is built based on at least one of the following extracted items of audio information: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown) and the URL address of the audio (not shown). Optionally, the above text information in the text index library is associated with the above audio information about the same web page in the audio index library.
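A minimal sketch of step S107 follows, assuming two in-memory inverted indexes that are associated through the shared web page URL. The class name, the dictionary field names (`title`, `abstract`, `items`) and the whitespace tokenizer are illustrative assumptions; a production index for Chinese text would need a proper word segmenter.

```python
from collections import defaultdict


def tokenize(text):
    # Simplistic whitespace tokenizer, used only to keep the sketch short.
    return text.lower().split()


class IndexStore:
    """Separate text and media indexes, linked by the page URL."""

    def __init__(self):
        self.text_index = defaultdict(set)   # term -> page URLs
        self.media_index = defaultdict(set)  # term -> page URLs
        self.text_info = {}                  # page URL -> text information
        self.media_info = {}                 # page URL -> media content information

    def add_page(self, page_url, text_info, media_info):
        self.text_info[page_url] = text_info
        self.media_info[page_url] = media_info
        text = text_info.get("title", "") + " " + text_info.get("abstract", "")
        for term in tokenize(text):
            self.text_index[term].add(page_url)
        for item in media_info.get("items", []):
            for term in tokenize(item.get("title", "")):
                self.media_index[term].add(page_url)

    def lookup(self, page_url):
        """Return the associated text and media records of one web page."""
        return {"text": self.text_info.get(page_url),
                "media": self.media_info.get(page_url)}
```

Keying both libraries on the page URL is one simple way to realize the association between the text information and the media content information of the same web page.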
In embodiments of the present invention, the text information and media content information in the web page information are extracted, and a text index library and a media content index library are built based on them respectively. This gives users a more intuitive and more easily understood way of searching for media content information, so that users can get a basic picture of the media content in a web page and judge the relevance of each search result, which improves search efficiency.
It should be noted that the method shown in Fig. 2 is not limited to being carried out in the order of the steps shown; the order of the steps can be adjusted as required. In addition, the steps are not limited to the above division: they can be split further into more steps or merged into fewer steps.
Embodiment bis-
After search engine collecting web page media content information, searching request that can be based on user, obtains Search Results.Introduce search engine below the method for web page media content information is provided, specifically can comprise: receive searching request; Whether detect described searching request is associated with media content; In the situation that searching request is associated with media content, in predefined text index storehouse and media content index storehouse, search the webpage mating with searching request; And from text index storehouse and media content index storehouse, extract respectively Word message and the media content information of webpage, as the Search Results of searching request.
Fig. 5 shows a flowchart of a method 200 by which a search engine provides web page media content information according to an embodiment of the present invention. In embodiments of the present invention, the media content may comprise at least one of the following: pictures, animation, audio, and video. It will be understood that the media content may also comprise other content.
As shown in Fig. 5, in step S201, a search request is received. For example, the search request may be received from one or more client devices. Optionally, the search request may be a search keyword entered by the user. It will be understood that the embodiments of the present invention do not limit the specific form of the search request.
In step S203, it is detected whether the search request is associated with media content. Optionally, when the user enters a search keyword, it is determined whether the user's search request implies a demand for media content, for example a demand for pictures, animation, video, or audio.
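A minimal sketch of the detection in step S203 follows. The embodiment does not specify how the demand for media content is recognised, so this sketch assumes a simple keyword-hint table; the table MEDIA_HINTS and the function detect_media_types are hypothetical.

```python
# Hypothetical hint words per media type; a real system would likely use
# a richer classifier. These lists are assumptions for illustration only.
MEDIA_HINTS = {
    "picture": ("picture", "photo", "image", "wallpaper"),
    "animation": ("animation", "gif"),
    "video": ("video", "movie", "trailer"),
    "audio": ("audio", "song", "mp3", "music"),
}


def detect_media_types(query: str) -> set:
    """Return the media types that the search keywords appear to ask for."""
    words = query.lower().split()
    return {
        media
        for media, hints in MEDIA_HINTS.items()
        if any(hint in words for hint in hints)
    }
```

An empty result would mean the request is not associated with media content, in which case the lookup in the media content index library would be skipped.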
In step S205, where the search request is associated with media content, web pages matching the search request are looked up in the predefined text index library and media content index library.
In one exemplary embodiment of the present invention, the predefined text index library may contain text information of web pages, for example the title, summary, and/or body text of a web page. The predefined media content index library may contain media content information, for example the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and/or the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, web pages matching the search request are looked up in the predefined text index library and picture index library. Optionally, the predefined text index library may contain at least one of the following items of text information: the title of the web page (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, the predefined picture index library may contain at least one of the following items of picture information: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, where the search request is associated with audio, web pages matching the search request are looked up in the predefined text index library and audio index library. Optionally, the predefined text index library may contain at least one of the following items of text information: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, the predefined audio index library may contain at least one of the following items of audio information: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In step S207, the text information and the media content information of the web pages are extracted from the text index library and the media content index library respectively, as the search results for the search request. Optionally, the search results may be displayed on one or more client devices.
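Steps S205 and S207 can be sketched as follows, assuming the request has already been found to be associated with media content. The sketch assumes both index libraries are dictionaries keyed by page URL and uses naive substring matching; the function name search and the dictionary field names are hypothetical.

```python
def search(query: str, text_index: dict, media_index: dict) -> list:
    """Look up matching pages in both index libraries and build search results.

    text_index:  page_url -> {"title": str, "summary": str}
    media_index: page_url -> list of {"title": str, "thumbnail": str, "url": str}
    """
    terms = query.lower().split()
    results = []
    for page_url, text in text_index.items():
        media = media_index.get(page_url, [])
        # Match against the page text and the media titles of the same page.
        haystack = " ".join(
            [text["title"], text["summary"]] + [m["title"] for m in media]
        ).lower()
        if all(term in haystack for term in terms):
            results.append(
                {
                    "url": page_url,
                    "title": text["title"],
                    "summary": text["summary"],
                    "media": media,
                }
            )
    return results
```

Each result carries both the text information and the media content information of the same page, which is what allows the client to render them together.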
In one exemplary embodiment of the present invention, step S207 may comprise: extracting from the text index library at least one of the following items of text information of the web page, as the search result for the search request: the title, the summary, and the body text.
In one exemplary embodiment of the present invention, step S207 may comprise: extracting from the media content index library at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In one exemplary embodiment of the present invention, step S207 may comprise: allocating a second URL address for one or more media content items in the web page, where the second URL address points to a page that displays second thumbnails of the one or more media content items.
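One way to allocate such a second URL address is sketched below. The embodiment does not say how the address is formed, so the base address GALLERY_BASE, the hashing scheme, and the function names are all assumptions made for illustration.

```python
import hashlib

# Hypothetical base address of the search engine's thumbnail pages.
GALLERY_BASE = "https://search.example/gallery/"


def allocate_second_url(page_url: str) -> str:
    """Derive a stable second URL for the thumbnail page of a web page."""
    digest = hashlib.sha1(page_url.encode("utf-8")).hexdigest()[:12]
    return GALLERY_BASE + digest


def build_gallery(page_url: str, media_items: list) -> dict:
    """Describe the page reached through the second URL.

    Each second thumbnail keeps the first URL of its media item so that
    selecting it can lead to the original media content.
    """
    return {
        "second_url": allocate_second_url(page_url),
        "thumbnails": [
            {"second_thumbnail": m["thumbnail"], "first_url": m["url"]}
            for m in media_items
        ],
    }
```

Deriving the address from a hash of the page URL keeps it stable across re-crawls; a database-assigned identifier would be an equally valid choice.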
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, the text information and the picture information of the web page are extracted from the text index library and the picture index library respectively, and a new URL address is allocated for one or more pictures in the web page, where the new URL address points to a page (indicated by arrow 3G) that displays thumbnails of the one or more pictures in the web page. Optionally, a new URL address is allocated for all the pictures in the web page, where the new URL address points to a page that displays thumbnails of all the pictures in the web page. Optionally, when the user selects an option in the search result that corresponds to the new URL address, such as the picture title in Fig. 3 (indicated by arrow 3D), a jump is made to the page corresponding to the new URL address (indicated by arrow 3G), so as to show the user the thumbnails of all the pictures in the web page.
In one exemplary embodiment of the present invention, step S207 may comprise: extracting the text information and the media content information of the web page from the text index library and the media content index library respectively; and combining the text information and the media content information of the web page in a predetermined manner, as the search result for the search request.
In one exemplary embodiment of the present invention, the step of combining the text information and the media content information of the web page in a predetermined manner, as the search result for the search request, comprises: selecting the first thumbnail of one media content item from the media content information of the web page; and displaying the first thumbnail of that media content item in the search result.
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, the following text information of the web page is extracted from the text index library: the title of the web page (indicated by arrow 3A), the summary (indicated by arrow 3B), and/or the URL of the web page (indicated by arrow 3C); and the following picture information of the web page is extracted from the picture index library: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Optionally, one first thumbnail (indicated by arrow 3F) is selected from the extracted first thumbnails of the pictures, to be displayed in the search result. As shown in Fig. 3, each search result includes the title, summary, and/or URL of the web page, and the picture title, the number of pictures, and/or one first thumbnail of a picture (indicated by arrow 3F).
In one exemplary embodiment of the present invention, the step of combining the text information and the media content information of the web page in a predetermined manner, as the search result for the search request, comprises: selecting the first thumbnails of multiple media content items from the media content information of the web page; and displaying the first thumbnails of those media content items in the search result.
In the exemplary embodiment of the present invention shown in Fig. 6, four first thumbnails (indicated by arrow 6E) are selected from the extracted first thumbnails of the pictures, to be displayed in the search result. It will be understood that the number of first thumbnails selected is not limited to the number described in this embodiment. In the search results shown in Fig. 6, each search result includes the title, summary, and URL of the web page, and the picture title, the number of pictures, and four first thumbnails of pictures (indicated by arrow 6E).
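The two combination modes described above (one first thumbnail as in Fig. 3, several as in Fig. 6) differ only in how many thumbnails are selected, which the following minimal sketch expresses with a single parameter. The function combine_result, its parameter max_thumbnails, and the dictionary layout are hypothetical.

```python
def combine_result(text_info: dict, media_info: list, max_thumbnails: int = 1) -> dict:
    """Combine a page's text and media information into one search result.

    max_thumbnails=1 corresponds to the single-thumbnail layout;
    max_thumbnails=4 corresponds to the four-thumbnail layout.
    """
    selected = [m["thumbnail"] for m in media_info[:max_thumbnails]]
    return {
        "title": text_info["title"],
        "summary": text_info["summary"],
        "url": text_info["url"],
        "media_title": media_info[0]["title"] if media_info else None,
        "media_count": len(media_info),
        "first_thumbnails": selected,
    }
```

For example, calling combine_result(text, media, max_thumbnails=4) on a page with six pictures yields a result that reports six pictures but carries only four first thumbnails.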
In one exemplary embodiment of the present invention, the media content information comprises a text part and a thumbnail part, and the text part points to a page that displays second thumbnails of one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, the picture information comprises a text part and a thumbnail part. In the search result, the text part may comprise the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown); the thumbnail part may comprise the first thumbnail of the picture (indicated by arrow 3F). When the user selects the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), or another text part, a jump is made to a new page (indicated by arrow 3G), which displays second thumbnails (indicated by arrow 3H) of one or more pictures. Optionally, this page displays the second thumbnails of all the pictures in the web page.
In the exemplary embodiment of the present invention shown in Fig. 6, where the search request is associated with pictures, the picture information comprises a text part and a thumbnail part. In the search result, the text part may comprise the picture title (not shown), the number of pictures (indicated by arrow 6D), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), the URL address of the picture (not shown), and/or other text (such as the ">>" indicated by arrow 6G); the thumbnail part may comprise the first thumbnail of the picture (indicated by arrow 6E). When the user selects the number of pictures (indicated by arrow 6D) or the other text (such as the ">>" indicated by arrow 6G), a jump is made to a new page (indicated by arrow 6H), which displays second thumbnails (indicated by arrow 6I) of one or more pictures. Optionally, this page displays the second thumbnails of all the pictures in the web page.
In the embodiments of the present invention, where the search request is associated with media content, web pages matching the search request are looked up in the predefined text index library and media content index library, and the text information and the media content information of the web pages are extracted from the text index library and the media content index library respectively, as the search results for the search request. This gives the client a more intuitive and more easily understood way of searching media content information, lets the user gain a basic understanding of the media content in a web page, and helps the user judge the relevance of a search result, thereby improving search efficiency.
It should be noted that the method shown in Fig. 5 is not limited to being performed in the order of the steps shown; the order of the steps may be adjusted as required. In addition, the steps are not limited to the division described above: they may be further split into more steps or merged into fewer steps.
Embodiment 3
After the search engine has obtained the media content information of web pages, it can provide search results to the user based on the user's search request. A method by which the search engine provides web page media content information is described below. It may specifically comprise: upon receiving a search request that matches the identifier of preset media content information in a web page, extracting the text information and the media content information of the web page, as the search result for the search request; and, in response to a selection of the text information and the media content information of the web page, providing the search result.
Fig. 7 shows a flowchart of a method 300 by which a search engine provides web page media content information according to an embodiment of the present invention. In embodiments of the present invention, the media content may comprise at least one of the following: pictures, animation, audio, and video. It will be understood that the media content may also comprise other content.
As shown in Fig. 7, in step S301, upon receiving a search request that matches the identifier of preset media content information in a web page, the text information and the media content information of the web page are extracted, as the search result for the search request. For example, the search request may be received from one or more client devices. Optionally, the search request may be a search keyword entered by the user. It will be understood that the embodiments of the present invention do not limit the specific form of the search request.
In one exemplary embodiment of the present invention, step S301 may comprise: upon receiving a search request that matches the identifier of preset media content information in a web page, extracting at least one of the following items of text information of the web page, as the search result for the search request: the title, the summary, and the body text.
In one exemplary embodiment of the present invention, step S301 may comprise: upon receiving a search request that matches the identifier of preset media content information in a web page, extracting at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, upon receiving a search request that matches the identifier of preset picture information in a web page, at least one of the following items of text information of the web page is extracted, as the search result for the search request: the title of the web page (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, at least one of the following items of picture information of the web page may be extracted: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, upon receiving a search request that matches the identifier of preset audio information in a web page, at least one of the following items of text information of the web page is extracted, as the search result for the search request: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, at least one of the following items of audio information of the web page may be extracted: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In one exemplary embodiment of the present invention, step S301 may comprise: extracting the second URL address pre-allocated for each media content item, where the second URL address points to a page that displays second thumbnails of the one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, upon receiving a search request that matches the identifier of preset picture information in a web page, the second URL address pre-allocated for each picture is extracted, where the second URL address points to a page that displays second thumbnails of one or more pictures. Optionally, the new URL address allocated for all the pictures in the web page may be extracted, where the new URL address points to a page (indicated by arrow 3G) that displays thumbnails of all the pictures in the web page. Optionally, when the user selects an option in the search result that corresponds to the new URL address, such as the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), or another text part, a jump is made to the page corresponding to the new URL address (indicated by arrow 3G), so as to show the user the thumbnails of all the pictures in the web page.
In one exemplary embodiment of the present invention, step S301 may comprise: upon receiving a search request that matches the identifier of preset media content information in a web page, extracting the text information and the media content information of the web page; and combining the text information and the media content information of the web page in a predetermined manner, as the search result for the search request.
In step S303, in response to a selection of the text information and the media content information of the web page, the search result is provided. For example, the search result may be displayed on one or more client devices.
In one exemplary embodiment of the present invention, in response to a selection of the text information of the web page, a jump is made to the first URL address, so as to provide the search result. For example, as shown in Fig. 4, in response to a selection of the text information of the web page (such as the web page title indicated by arrow 4A), a jump is made to the first URL address, so as to provide the details of the media content (indicated by arrow 4H).
In one exemplary embodiment of the present invention, in step S301, upon receiving a search request that matches the identifier of preset media content information in a web page, the text information and the media content information of the web page are extracted, and are combined in the following predetermined manner as the search result for the search request: the first thumbnail of one media content item is selected from the media content information of the web page, and that first thumbnail is displayed in the search result. Optionally, in step S303, in response to a selection of the first thumbnail of that media content item, a jump is made to the second URL address, so as to obtain a page that displays second thumbnails of one or more media content items. Optionally, in response to a selection of the second thumbnail of any media content item displayed at the second URL address, a jump is made to the first URL address of that media content item, so as to provide information about it.
In the exemplary embodiment of the present invention shown in Fig. 3, in step S301, upon receiving a search request that matches the identifier of preset picture information in a web page, the following text information of the web page is extracted: the title of the web page (indicated by arrow 3A), the summary (indicated by arrow 3B), and/or the URL of the web page (indicated by arrow 3C); and the following picture information of the web page is extracted: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Optionally, the text information and the picture information of the web page are combined in the following predetermined manner as the search result for the search request: the first thumbnail of one picture is selected from the picture information of the web page, and that first thumbnail (indicated by arrow 3F) is displayed in the result. Optionally, in step S303, when the user selects the thumbnail of this picture (indicated by arrow 3F), a jump is made to a new page (indicated by arrow 3G), which displays second thumbnails (indicated by arrow 3H) of one or more pictures. Optionally, this page displays the second thumbnails of all the pictures in the web page. Optionally, when the user selects the second thumbnail (indicated by arrow 3H) of any picture on the new page (indicated by arrow 3G), a jump is made to the first URL address of that picture, so as to provide the details of the picture.
In another exemplary embodiment of the present invention, in step S301, upon receiving a search request that matches the identifier of preset media content information in a web page, the text information and the media content information of the web page are extracted, and are combined in the following predetermined manner as the search result for the search request: the first thumbnails of multiple media content items are selected from the media content information of the web page, and those first thumbnails are displayed in the search result. Optionally, in step S303, in response to a selection of the first thumbnail of any media content item, a jump is made to the first URL address of that media content item, so as to provide information about it.
In the exemplary embodiment of the present invention shown in Fig. 6, in step S301, upon receiving a search request that matches the identifier of preset picture information in a web page, the following text information of the web page is extracted: the title (indicated by arrow 6A), the summary (indicated by arrow 6B), and/or the URL of the web page (indicated by arrow 6C); and the following picture information of the web page is extracted: the picture title (not shown), the number of pictures (indicated by arrow 6D), the first thumbnail of the picture (indicated by arrow 6E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Optionally, the text information and the picture information of the web page are combined in the following predetermined manner as the search result for the search request: four first thumbnails are selected from the picture information of the web page, and the four first thumbnails are displayed in the result. It will be understood that the embodiments of the present invention do not limit the number of pictures selected and displayed. Optionally, in step S303, when the user selects any one of these four picture thumbnails, a jump is made to a new page (indicated by arrow 6F), which provides the details of that picture.
In another exemplary embodiment of the present invention, in step S301, upon receiving a search request that matches the identifier of preset media content information in a web page, the text information and the media content information of the web page are extracted, where the media content information comprises a text part and a thumbnail part; and the text information and the media content information of the web page are combined in a predetermined manner, as the search result for the search request. Optionally, in step S303, in response to a selection of the text part, a jump is made to the second URL address, so as to obtain a page that displays second thumbnails of one or more media content items. Optionally, in response to a selection of the second thumbnail of any media content item displayed at the second URL address, a jump is made to the first URL address of that media content item, so as to provide information about it.
In the exemplary embodiment of the present invention shown in Fig. 3, the picture information in the search result comprises a text part and a thumbnail part. The text part may comprise the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), and/or other text; the thumbnail part comprises the first thumbnail of the picture (indicated by arrow 3F). Optionally, when the user selects the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), or other text, a jump is made to a new page (indicated by arrow 3G), which displays second thumbnails (indicated by arrow 3H) of one or more pictures. Optionally, this page displays the second thumbnails of all the pictures in the web page. Optionally, in response to a selection of the second thumbnail (indicated by arrow 3H) of any picture, a jump is made to the first URL address of that picture, so as to provide the details of the picture.
In the exemplary embodiment of the present invention shown in Fig. 6, the picture information in the search result comprises a text part and a thumbnail part. The text part may comprise the picture title (not shown), the number of pictures (indicated by arrow 6D), and/or other text (such as the ">>" indicated by arrow 6G in Fig. 6); the thumbnail part comprises the first thumbnail of the picture (indicated by arrow 6E). Optionally, when the user selects the picture title (not shown), the number of pictures (indicated by arrow 6D), or other text (such as the ">>" indicated by arrow 6G in Fig. 6), a jump is made to a new page (indicated by arrow 6H), which displays second thumbnails (indicated by arrow 6I) of one or more pictures. Optionally, this page displays the second thumbnails of all the pictures in the web page. Optionally, in response to a selection of the second thumbnail (indicated by arrow 6I) of any picture, a jump is made to the first URL address of that picture, so as to provide the details of the picture.
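The navigation behaviour described for step S303 across these embodiments can be summarised in a minimal sketch. It assumes a search result is a dictionary holding the second URL address, the displayed first thumbnails, and the media items with their first URL addresses; the function resolve_click, the target names, and the field names are hypothetical.

```python
def resolve_click(result: dict, target: str, index: int = 0) -> str:
    """Return the address to jump to when the user selects part of a result.

    target is one of:
      "media_text"       - the text part of the media information (title, count, ">>")
      "first_thumbnail"  - a thumbnail shown directly in the search result
      "second_thumbnail" - a thumbnail shown on the page behind the second URL
    """
    if target == "media_text":
        return result["second_url"]
    if target == "first_thumbnail":
        # Single-thumbnail layout: the thumbnail leads to the thumbnail page.
        if len(result["first_thumbnails"]) == 1:
            return result["second_url"]
        # Multi-thumbnail layout: each thumbnail leads to its own media item.
        return result["media"][index]["first_url"]
    if target == "second_thumbnail":
        return result["media"][index]["first_url"]
    raise ValueError("unknown selection target: " + target)
```

The branch on the number of displayed first thumbnails reflects the difference between the single-thumbnail and multi-thumbnail embodiments; an implementation could equally store the layout explicitly in the result.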
In the embodiments of the present invention, upon receiving a search request that matches the identifier of preset media content information in a web page, the text information and the media content information of the web page are extracted as the search result for the search request; and, in response to a selection of the text information and the media content information of the web page, the search result is provided. Both text information and media content information can thus be provided in the search result. This gives the client a more intuitive and more easily understood way of searching media content information, lets the user gain a basic understanding of the media content in a web page, and helps the user judge the relevance of a search result, thereby improving search efficiency.
It should be noted that the method shown in Fig. 7 is not limited to being performed in the order of the steps shown; the order of the steps may be adjusted as required. In addition, the steps are not limited to the division described above: they may be further split into more steps or merged into fewer steps.
Embodiment 4
A device by which a search engine collects web page media content information according to one exemplary embodiment of the present invention is described below.
Optionally, this device is adapted to perform the method 100 described earlier.
Fig. 8 shows a schematic structural diagram of a device 400 by which a search engine collects web page media content information according to the present invention. In an embodiment of the present invention, the device 400 comprises: an information crawling module 401, adapted to crawl web page information; an identifier detection module 403, adapted to detect whether the web page information contains the identifier of preset media content information; an information extraction module 405, adapted to extract the text information and the media content information from the web page information where the identifier is detected in the web page information; and an index library building module 407, adapted to build a text index library and a media content index library based on the text information and the media content information respectively.
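The cooperation of modules 403, 405, and 407 can be sketched as below for a page whose HTML has already been crawled by module 401. The embodiment does not fix the form of the identifier, so this sketch assumes, for illustration only, that the presence of an `<img>` tag with a `src` attribute serves as the identifier of picture information; the class _MediaParser and the function collect are hypothetical.

```python
from html.parser import HTMLParser


class _MediaParser(HTMLParser):
    """Collects the page title and the source addresses of <img> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def collect(page_url: str, html: str, text_index: dict, picture_index: dict) -> bool:
    """Detect, extract, and index one crawled page; return True if indexed."""
    parser = _MediaParser()
    parser.feed(html)
    # Identifier detection (module 403): assumed here to be "has an <img> tag".
    if not parser.images:
        return False
    # Information extraction (module 405) and index building (module 407).
    text_index[page_url] = {"title": parser.title.strip()}
    picture_index[page_url] = {
        "count": len(parser.images),
        "first_urls": list(parser.images),
    }
    return True
```

Pages without the assumed identifier are left out of both index libraries, mirroring the condition under which module 405 performs extraction.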
In embodiments of the present invention, the media content may comprise at least one of the following: pictures, animation, audio, and video. It will be understood that the media content may also comprise other content.
As shown in Fig. 8, the device 400 comprises the information crawling module 401, adapted to crawl web page information. For example, as shown in Fig. 8, the information crawling module 401 may crawl web page information from one or more website servers.
In one exemplary embodiment of the present invention, the web page information may comprise text information and media content information. Optionally, the text information may comprise at least one of the following: the title, the summary, and the body text. Optionally, the media content information may comprise at least one of the following: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In one exemplary embodiment of the present invention, for a web page carrying pictures, the web page information may comprise text information and picture information. Optionally, the text information may comprise: the title (indicated by arrow 3A), the summary (indicated by arrow 3B), and/or the URL of the web page (indicated by arrow 3C). Optionally, the picture information may comprise the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). It will be understood that the text information and the picture information may also comprise other content.
In one exemplary embodiment of the present invention, for a web page carrying audio, the web page information may comprise text information and audio information. Optionally, the text information may comprise: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and/or the URL of the web page (indicated by arrow 4C). Optionally, the audio information may comprise the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and/or the URL address of the audio (not shown). It will be understood that the text information and the audio information may also comprise other content.
As shown in Fig. 8, the device 400 comprises the identifier detection module 403, adapted to detect whether the web page information contains the identifier of preset media content information.
In one exemplary embodiment of the present invention, the identifier detection module 403 uses the identifier of preset media content information to determine whether the crawled web page information contains specific media content. Optionally, when a search keyword entered by the user matches this specific media content, the search engine can provide a search result that includes this web page. It will be understood that the embodiments of the present invention do not limit the specific form of the identifier of the preset media content information.
As shown in Fig. 8, the device 400 comprises the information extraction module 405, adapted to extract the text information and the media content information from the web page information where the identifier is detected in the web page information.
In one exemplary embodiment of the present invention, the information extraction module 405 is adapted, where the identifier is detected in the web page information, to extract at least one of the following items of text information of the web page: the title, the summary, and the body text; and to extract at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, for a web page carrying pictures, where the identifier detection module 403 detects that the web page information contains the identifier of preset picture information, the information extraction module 405 may optionally extract at least one of the following items of text information of the web page: the title (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, the information extraction module 405 may extract at least one of the following items of picture information of the web page: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, for a web page carrying audio, where the identifier detection module 403 detects that the web page information contains the identifier of preset audio information, the information extraction module 405 may optionally extract at least one of the following items of text information of the web page: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, the information extraction module 405 may extract at least one of the following items of audio information of the web page: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In one exemplary embodiment of the present invention, information extraction module 405 is further adapted to assign a second URL address to the webpage, where the second URL address points to a page displaying second thumbnails of one or more media content items in the webpage.
In the exemplary embodiment of the present invention shown in Fig. 3, for a webpage carrying pictures, when identifier detection module 403 detects that the webpage information contains the above identifier, information extraction module 405 may extract the original URL addresses of the pictures in the webpage and assign a new URL address to the webpage, where the new URL address points to a page displaying thumbnails of one or more pictures in the webpage. Optionally, the new URL address points to a page displaying thumbnails of all the pictures in the webpage. Optionally, when the user selects the option corresponding to the new URL address in the search results, such as the picture title in Fig. 3 (as indicated by arrow 3D), the browser jumps to the page corresponding to the new URL address (as indicated by arrow 3G) to show the user thumbnails of all the pictures in the webpage. Optionally, when the user selects the thumbnail of a picture on that page, the browser jumps to the original URL of that picture to provide its details.
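As a rough illustration of assigning a new (second) URL address, the following Python sketch derives a gallery URL from the original page URL and lists the entries such a gallery page would show. The endpoint `https://search.example/gallery` and the function names are hypothetical; the patent does not describe how the new URL is formed.

```python
from urllib.parse import quote

# Hypothetical gallery endpoint operated by the search engine (an assumption).
GALLERY_BASE = "https://search.example/gallery"


def assign_second_url(page_url: str) -> str:
    """Build a new URL that points to a thumbnail page for the given webpage."""
    return f"{GALLERY_BASE}?page={quote(page_url, safe='')}"


def gallery_entries(picture_urls):
    """Each thumbnail on the gallery page links back to the picture's original URL."""
    return [{"thumbnail_of": url, "link": url} for url in picture_urls]
```

Encoding the original page URL into the query string is one simple design choice; an opaque identifier stored alongside the index entry would work equally well.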
As shown in Fig. 8, device 400 includes index building module 407, adapted to build a text index library and a media content index library based on the text information and the media content information, respectively.
In one exemplary embodiment of the present invention, index building module 407 is adapted to associate the text information in the text index library with the media content information about the same webpage in the media content index library.
In the exemplary embodiment of the present invention shown in Fig. 3, for a webpage carrying pictures, index building module 407 may build the text index library based on at least one of the following kinds of text information extracted by information extraction module 405: the title (as indicated by arrow 3A), the summary (as indicated by arrow 3B), and the URL of the webpage (as indicated by arrow 3C). Optionally, index building module 407 may build the picture index library based on at least one of the following kinds of picture information extracted by information extraction module 405: the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), the first thumbnail of a picture (as indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown). Optionally, index building module 407 may be adapted to associate the above text information in the text index library with the above picture information about the same webpage in the picture index library.
In the exemplary embodiment of the present invention shown in Fig. 4, for a webpage carrying audio, index building module 407 is adapted to build the text index library based on at least one of the following kinds of text information extracted by information extraction module 405: the title (as indicated by arrow 4A), the summary (as indicated by arrow 4B), and the URL of the webpage (as indicated by arrow 4C). Optionally, index building module 407 is adapted to build the audio index library based on at least one of the following kinds of audio information extracted by information extraction module 405: the audio title (as indicated by arrow 4D), the audio thumbnail (as indicated by arrow 4E), the audio author (as indicated by arrow 4F), the audio size (as indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown). Optionally, index building module 407 is adapted to associate the above text information in the text index library with the above audio information about the same webpage in the audio index library.
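One simple way to keep two separate index libraries associated per webpage is to key both by the same webpage URL. The Python sketch below shows that idea only; the class `IndexLibraries` and its field names are hypothetical, and the patent does not state how the association is actually stored.

```python
from dataclasses import dataclass, field


@dataclass
class IndexLibraries:
    """Two separate indexes that share the webpage URL as their key (an assumption)."""
    text_index: dict = field(default_factory=dict)    # page URL -> text information
    media_index: dict = field(default_factory=dict)   # page URL -> media content information

    def add_page(self, page_url, text_info, media_info):
        # Writing both entries under the same key associates them with the same webpage.
        self.text_index[page_url] = text_info
        self.media_index[page_url] = media_info

    def lookup(self, page_url):
        # Returns (text information, media content information) for one webpage.
        return self.text_index.get(page_url), self.media_index.get(page_url)
```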
In embodiments of the present invention, device 400, by which a search engine collects webpage media content information, makes it possible to present searched media content information to the client in a more intuitive and more easily understood way, so that the user can gain a basic understanding of the media content in a webpage and judge the relevance of a search result, thereby improving search efficiency.
Embodiment five
The following describes a device for a search engine to provide webpage media content information according to an exemplary embodiment of the present invention.
Optionally, this device is adapted to perform method 200 described above.
Fig. 9 shows a structural schematic diagram of device 500 for a search engine to provide webpage media content information according to an exemplary embodiment of the present invention. In an embodiment of the present invention, device 500 includes:
Request receiving module 501, adapted to receive a search request;
Request detection module 503, adapted to detect whether the search request is associated with media content;
Webpage lookup module 505, adapted to, when the search request is associated with media content, look up webpages matching the search request in a predefined text index library and a predefined media content index library; and
Information extraction module 507, adapted to extract the text information and the media content information of the webpage from the text index library and the media content index library, respectively, as the search result for the search request.
In an embodiment of the present invention, the media content may include at least one of the following: pictures, animations, audio, and video. Of course, it can be understood that the media content may also include other content.
As shown in Fig. 9, device 500 includes request receiving module 501, adapted to receive a search request. For example, as shown in Fig. 9, request receiving module 501 may receive search requests from one or more client devices. Optionally, the search request may be a search keyword entered by the user. Of course, it can be understood that embodiments of the present invention do not limit the specific form of the search request.
As shown in Fig. 9, device 500 includes request detection module 503, adapted to detect whether the search request is associated with media content. Optionally, when the user enters a search keyword, request detection module 503 determines whether the user's search request implies a need for media content, for example a need for pictures, animations, video, or audio.
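The patent does not say how the request detection module decides that a search request implies a media need. Purely as an illustration, the Python sketch below uses a keyword heuristic; the hint-word table `MEDIA_HINTS` and the function `detect_media_type` are hypothetical and are not taken from the patent.

```python
# Hypothetical hint words per media type; a production system would likely
# use query logs or a trained classifier instead of a fixed table.
MEDIA_HINTS = {
    "picture": {"picture", "pictures", "photo", "photos", "image", "wallpaper"},
    "animation": {"animation", "gif"},
    "audio": {"audio", "song", "mp3", "music"},
    "video": {"video", "movie", "trailer"},
}


def detect_media_type(query: str):
    """Return the media type the query seems to ask for, or None if there is none."""
    words = set(query.lower().split())
    for media_type, hints in MEDIA_HINTS.items():
        if words & hints:  # at least one hint word appears in the query
            return media_type
    return None
```

When the function returns None, the request would be handled as an ordinary text search; otherwise the lookup proceeds in both index libraries.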
As shown in Fig. 9, device 500 includes webpage lookup module 505, adapted to, when the search request is associated with media content, look up webpages matching the search request in the predefined text index library and the predefined media content index library.
In one exemplary embodiment of the present invention, the predefined text index library may include text information of the webpage, for example the title, summary, and/or body text of the webpage. The predefined media content index library may include media content information, for example the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and/or the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, when request detection module 503 detects that the search request is associated with pictures, webpage lookup module 505 may look up webpages matching the search request in the predefined text index library and picture index library. Optionally, the predefined text index library may include at least one of the following kinds of text information: the title of the webpage (as indicated by arrow 3A), the summary (as indicated by arrow 3B), and the URL of the webpage (as indicated by arrow 3C). Optionally, the predefined picture index library may include at least one of the following kinds of picture information: the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), the first thumbnail of a picture (as indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, when request detection module 503 detects that the search request is associated with audio, webpage lookup module 505 may look up webpages matching the search request in the predefined text index library and audio index library. Optionally, the predefined text index library may include at least one of the following kinds of text information: the title (as indicated by arrow 4A), the summary (as indicated by arrow 4B), and the URL of the webpage (as indicated by arrow 4C). Optionally, the predefined audio index library may include at least one of the following kinds of audio information: the audio title (as indicated by arrow 4D), the audio thumbnail (as indicated by arrow 4E), the audio author (as indicated by arrow 4F), the audio size (as indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
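The lookup across both index libraries can be sketched as follows. This is a minimal illustration under assumptions: both libraries are modeled as dictionaries keyed by webpage URL, matching is a naive substring test over a few fields, and the function `find_matching_pages` and all field names are hypothetical.

```python
def find_matching_pages(query, text_index, media_index):
    """Return URLs of webpages that appear in both libraries and match every query term."""
    terms = query.lower().split()
    matches = []
    for page_url, text_info in text_index.items():
        media_info = media_index.get(page_url)
        if media_info is None:
            continue  # the webpage carries no indexed media content
        # Search the webpage title and summary together with the media title.
        haystack = " ".join([
            text_info.get("title", ""),
            text_info.get("summary", ""),
            media_info.get("title", ""),
        ]).lower()
        if all(term in haystack for term in terms):
            matches.append(page_url)
    return matches
```

A real search engine would use inverted indexes and ranking rather than a linear scan; the sketch only shows that a match draws on both libraries for the same webpage.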
As shown in Fig. 9, device 500 includes information extraction module 507, adapted to extract the text information and the media content information of the webpage from the text index library and the media content index library, respectively, as the search result for the search request. Optionally, the search result may be displayed on one or more client devices.
In one exemplary embodiment of the present invention, information extraction module 507 is adapted to extract from the text index library at least one of the following kinds of text information of the webpage as the search result for the search request: title, summary, and body text.
In one exemplary embodiment of the present invention, information extraction module 507 is adapted to extract from the media content index library at least one of the following kinds of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In one exemplary embodiment of the present invention, information extraction module 507 is adapted to assign a second URL address to the one or more media content items in the webpage, where the second URL address points to a page displaying second thumbnails of the one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, when request detection module 503 detects that the search request is associated with pictures, information extraction module 507 may extract the text information and the picture information of the webpage from the text index library and the picture index library, respectively, and may assign a new URL address to one or more pictures in the webpage, where the new URL address points to a page (as indicated by arrow 3G) displaying thumbnails of the one or more pictures in the webpage. Optionally, information extraction module 507 may assign a new URL address to all the pictures in the webpage, where the new URL address points to a page displaying thumbnails of all the pictures in the webpage. Optionally, when the user selects the option corresponding to the new URL address in the search results, such as the picture title in Fig. 3 (as indicated by arrow 3D), the browser jumps to the page corresponding to the new URL address (as indicated by arrow 3G) to show the user thumbnails of all the pictures in the webpage.
In one exemplary embodiment of the present invention, information extraction module 507 includes: a text information extraction unit and a media content information extraction unit, adapted to extract the text information and the media content information of the webpage from the text index library and the media content index library, respectively; and an information combination unit, adapted to combine the text information and the media content information of the webpage in a predetermined manner as the search result for the search request.
In one exemplary embodiment of the present invention, the information combination unit is adapted to select the first thumbnail of one media content item from the media content information of the webpage, and to display the first thumbnail of that media content item in the search result.
In the exemplary embodiment of the present invention shown in Fig. 3, when request detection module 503 detects that the search request is associated with pictures, the text information extraction unit and the media content information extraction unit extract the following text information of the webpage from the text index library: the title of the webpage (as indicated by arrow 3A), the summary (as indicated by arrow 3B), and/or the URL of the webpage (as indicated by arrow 3C); and extract the following picture information of the webpage from the picture index library: the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), the first thumbnail of a picture (as indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Optionally, the information combination unit selects one first thumbnail (as indicated by arrow 3F) from the extracted first thumbnails of the pictures for display in the search result. As shown in Fig. 3, each search result includes the title, summary, and URL of the webpage, as well as the picture title, the number of pictures, and one first thumbnail of a picture (as indicated by arrow 3F).
In one exemplary embodiment of the present invention, the information combination unit is adapted to select the first thumbnails of multiple media content items from the media content information of the webpage, and to display the first thumbnails of those media content items in the search result.
In the exemplary embodiment of the present invention shown in Fig. 6, the information combination unit may select the first thumbnails of four pictures (as indicated by arrow 6E) from the first thumbnails of the pictures for display in the search result. Of course, it can be understood that the number of first thumbnails selected by the information combination unit is not limited to the number described in this embodiment. In the search results shown in Fig. 6, each search result includes the title, summary, and URL of the webpage, as well as the picture title, the number of pictures, and four first thumbnails of pictures (as indicated by arrow 6E).
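The combination step, with either one first thumbnail (as in Fig. 3) or several (as in Fig. 6), can be sketched in Python as follows. The function `combine_result`, its `max_thumbnails` parameter, and the dictionary field names are hypothetical; the patent only says that the information is combined "in a predetermined manner".

```python
def combine_result(text_info, media_info, max_thumbnails=1):
    """Merge text and media content information into one search result entry."""
    thumbnails = media_info.get("first_thumbnails", [])
    return {
        # Text information of the webpage.
        "title": text_info.get("title", ""),
        "summary": text_info.get("summary", ""),
        "url": text_info.get("url", ""),
        # Media content information; only the first few thumbnails are shown.
        "media_title": media_info.get("title", ""),
        "media_count": media_info.get("count", len(thumbnails)),
        "thumbnails": thumbnails[:max_thumbnails],
    }
```

Calling the function with `max_thumbnails=1` corresponds to the single-thumbnail layout, and `max_thumbnails=4` to the four-thumbnail layout described above.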
In one exemplary embodiment of the present invention, the media content information includes a text part and a thumbnail part, and the text part points to a page displaying second thumbnails of one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, when request detection module 503 detects that the search request is associated with pictures, the picture information includes a text part and a thumbnail part. In the search result, the text part may include the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown); the thumbnail part may include the first thumbnail of a picture (as indicated by arrow 3F). When the user selects the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), or another text part, the browser jumps to a new page (as indicated by arrow 3G), which displays second thumbnails of one or more pictures (as indicated by arrow 3H). Optionally, this page displays second thumbnails of all the pictures in the webpage.
In the exemplary embodiment of the present invention shown in Fig. 6, when request detection module 503 detects that the search request is associated with pictures, the picture information includes a text part and a thumbnail part. In the search result, the text part may include the picture title (not shown), the number of pictures (as indicated by arrow 6D), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), the URL address of the picture (not shown), and/or other text parts (such as the ">>" indicated by arrow 6G); the thumbnail part may include the first thumbnails of pictures (as indicated by arrow 6E). When the user selects the number of pictures (as indicated by arrow 6D) or another text part (such as the ">>" indicated by arrow 6G), the browser jumps to a new page (as indicated by arrow 6H), which displays second thumbnails of one or more pictures (as indicated by arrow 6I). Optionally, this page displays second thumbnails of all the pictures in the webpage.
In embodiments of the present invention, device 500 for a search engine to provide webpage media content information presents searched media content information to the client in a more intuitive and more easily understood way, so that the user can gain a basic understanding of the media content in a webpage and judge the relevance of a search result, thereby improving search efficiency.
Embodiment six
The following describes a device for a search engine to provide webpage media content information according to an exemplary embodiment of the present invention.
Optionally, this device is adapted to perform method 300 described above.
Fig. 10 shows a structural schematic diagram of device 600 for a search engine to provide webpage media content information according to an exemplary embodiment of the present invention.
In an embodiment of the present invention, device 600 includes:
Information extraction module 601, adapted to, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract the text information and the media content information of the webpage as the search result for the search request; and
Search result providing module 603, adapted to provide the search result in response to a selection of the text information and the media content information of the webpage.
In an embodiment of the present invention, the media content may include at least one of the following: pictures, animations, audio, and video. Of course, it can be understood that the media content may also include other content.
As shown in Fig. 10, device 600 includes information extraction module 601, adapted to, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract the text information and the media content information of the webpage as the search result for the search request. For example, as shown in Fig. 10, when such a search request is received from one or more client devices, information extraction module 601 may extract the text information and the media content information of the webpage as the search result for the search request.
Optionally, the search request may be a search keyword entered by the user. Of course, it can be understood that embodiments of the present invention do not limit the specific form of the search request.
In an exemplary embodiment of the present invention, information extraction module 601 is adapted to, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract at least one of the following kinds of text information of the webpage as the search result for the search request: title, summary, and body text.
In an exemplary embodiment of the present invention, information extraction module 601 is adapted to, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract at least one of the following kinds of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, upon receiving a search request that matches the identifier of preset picture information in a webpage, information extraction module 601 may extract at least one of the following kinds of text information of the webpage as the search result for the search request: the title of the webpage (as indicated by arrow 3A), the summary (as indicated by arrow 3B), and the URL of the webpage (as indicated by arrow 3C). Optionally, information extraction module 601 may extract at least one of the following kinds of picture information of the webpage as the search result for the search request: the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), the first thumbnail of a picture (as indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, upon receiving a search request that matches the identifier of preset audio information in a webpage, information extraction module 601 may extract at least one of the following kinds of text information of the webpage as the search result for the search request: the title (as indicated by arrow 4A), the summary (as indicated by arrow 4B), and the URL of the webpage (as indicated by arrow 4C). Optionally, information extraction module 601 may extract at least one of the following kinds of audio information of the webpage as the search result for the search request: the audio title (as indicated by arrow 4D), the audio thumbnail (as indicated by arrow 4E), the audio author (as indicated by arrow 4F), the audio size (as indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In an exemplary embodiment of the present invention, information extraction module 601 may extract a second URL address pre-assigned to each media content item, where the second URL address points to a page displaying second thumbnails of the one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, upon receiving a search request that matches the identifier of preset picture information in a webpage, information extraction module 601 may extract the second URL address pre-assigned to each picture, where the second URL address points to a page displaying second thumbnails of one or more pictures. Optionally, information extraction module 601 may extract the new URL address assigned to all the pictures in the webpage, where the new URL address points to a page (as indicated by arrow 3G) displaying thumbnails of all the pictures in the webpage. Optionally, when the user selects the option corresponding to the new URL address in the search results, such as the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), or another text part, the browser jumps to the page corresponding to the new URL address (as indicated by arrow 3G) to show the user thumbnails of all the pictures in the webpage.
In one exemplary embodiment of the present invention, information extraction module 601 includes: a text information extraction unit and a media content information extraction unit, adapted to, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract the text information and the media content information of the webpage; and an information combination unit, adapted to combine the text information and the media content information of the webpage in a predetermined manner as the search result for the search request.
As shown in Fig. 10, device 600 includes search result providing module 603, adapted to provide the search result in response to a selection of the text information and the media content information of the webpage. For example, as shown in Fig. 10, the search result may be displayed on one or more client devices.
In one exemplary embodiment of the present invention, search result providing module 603 may, in response to a selection of the webpage text information, jump to the first URL address to provide the search result. For example, as shown in Fig. 4, search result providing module 603 may, in response to a selection of the webpage text information (such as the webpage title indicated by arrow 4A), perform the step of jumping to the first URL address to provide the details of this video (as indicated by arrow 4H).
In another exemplary embodiment of the present invention, the text information extraction unit and the media content information extraction unit may, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract the text information and the media content information of the webpage; and the information combination unit may combine the text information and the media content information of the webpage as the search result for the search request in the following predetermined manner: selecting the first thumbnail of one media content item from the media content information of the webpage, and displaying that first thumbnail in the search result. Optionally, search result providing module 603 may, in response to a selection of the first thumbnail of that media content item, perform the step of jumping to the second URL address to obtain the page displaying second thumbnails of one or more media content items. Optionally, search result providing module 603 may, in response to a selection of the second thumbnail of any media content item displayed at the second URL address, perform the step of jumping to the first URL address of that media content item to provide information about it.
In the exemplary embodiment of the present invention shown in Fig. 3, the text information extraction unit and the media content information extraction unit may, upon receiving a search request that matches the identifier of preset picture information in a webpage, extract the following text information of the webpage: the title of the webpage (as indicated by arrow 3A), the summary (as indicated by arrow 3B), and/or the URL of the webpage (as indicated by arrow 3C); and extract the following picture information of the webpage: the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), the first thumbnail of a picture (as indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Optionally, the information combination unit may combine the text information and the picture information of the webpage as the search result for the search request in the following predetermined manner: selecting the first thumbnail of one picture from the picture information of the webpage, and displaying that first thumbnail (as indicated by arrow 3F) in the result. Optionally, when the user selects the thumbnail of this picture (as indicated by arrow 3F), search result providing module 603 may perform the step of jumping to a new page (as indicated by arrow 3G), which displays second thumbnails of one or more pictures (as indicated by arrow 3H). Optionally, this page displays second thumbnails of all the pictures in the webpage. Optionally, when the user selects the second thumbnail (as indicated by arrow 3H) of any picture in the new page (as indicated by arrow 3G), search result providing module 603 may perform the step of jumping to the first URL address of that picture to display its details.
In another exemplary embodiment of the present invention, the text information extraction unit and the media content information extraction unit may, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract the text information and the media content information of the webpage; and the information combination unit may combine the text information and the media content information of the webpage as the search result for the search request in the following predetermined manner: selecting the first thumbnails of multiple media content items from the media content information of the webpage, and displaying those first thumbnails in the search result. Optionally, search result providing module 603 may, in response to a selection of the first thumbnail of any media content item, perform the step of jumping to the first URL address of that media content item to provide information about it.
In the exemplary embodiment of the present invention shown in Fig. 6, the text information extraction unit and the media content information extraction unit may, upon receiving a search request that matches the identifier of preset picture information in a webpage, extract the following text information of the webpage: the title (as indicated by arrow 6A), the summary (as indicated by arrow 6B), and/or the URL of the webpage (as indicated by arrow 6C); and extract the following picture information of the webpage: the picture title (not shown), the number of pictures (as indicated by arrow 6D), the first thumbnails of pictures (as indicated by arrow 6E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Optionally, the information combination unit may combine the text information and the picture information of the webpage as the search result for the search request in the following predetermined manner: selecting four first thumbnails from the picture information of the webpage, and displaying the four first thumbnails in the result. Of course, it can be understood that this embodiment does not limit the number of pictures selected and displayed. Optionally, when the user selects any one of these four picture thumbnails, search result providing module 603 may perform the step of jumping to a new page (as indicated by arrow 6F), which displays the details of that picture.
In another exemplary embodiment of the present invention, the text information extraction unit and the media content information extraction unit may, upon receiving a search request that matches the identifier of preset media content information in a webpage, extract the text information and the media content information of the webpage, where the media content information includes a text part and a thumbnail part; and the information combination unit may combine the text information and the media content information of the webpage in a predetermined manner as the search result for the search request. Optionally, search result providing module 603 may, in response to a selection of the text part, perform the step of jumping to the second URL address to obtain the page displaying second thumbnails of one or more media content items. Optionally, search result providing module 603 may, in response to a selection of the second thumbnail of any media content item displayed at the second URL address, perform the step of jumping to the first URL address of that media content item to provide information about it.
In the exemplary embodiment of the present invention shown in Fig. 3, the picture information in the search result includes a text part and a thumbnail part. The text part may include the picture title (as indicated by arrow 3D), the number of pictures (as indicated by arrow 3E), and/or other text; the thumbnail part includes the first thumbnail of a picture (as indicated by arrow 3F). Optionally, when the user selects the picture title (as indicated by arrow 3D) or the number of pictures (as indicated by arrow 3E), search result providing module 603 may perform the step of jumping to a new page (as indicated by arrow 3G), which displays second thumbnails of one or more pictures (as indicated by arrow 3H). Optionally, this page displays second thumbnails of all the pictures in the webpage. Optionally, in response to a selection of the second thumbnail of any picture (as indicated by arrow 3H), search result providing module 603 may perform the step of jumping to the first URL address of that picture to provide its details.
In the exemplary embodiment of the present invention shown in Figure 3, the picture information in the search result comprises a text portion and a thumbnail portion. The text portion may include the picture title (not shown), the number of pictures (indicated by arrow 6D), and/or other text (such as the ">>" indicated by arrow 6G in Fig. 6); the thumbnail portion includes the first thumbnail of a picture (indicated by arrow 6E). Optionally, when the user selects the picture title (not shown), the number of pictures (arrow 6D), or the other text (the ">>" indicated by arrow 6G in Fig. 6), the search result providing module 603 may perform the step of jumping to a new page (indicated by arrow 6H), which displays second thumbnails (indicated by arrow 6I) of one or more pictures. Optionally, this page displays the second thumbnails of all pictures in the webpage. Optionally, in response to selection of the second thumbnail (arrow 6I) of any picture, the search result providing module 603 performs the step of jumping to the first URL address of that picture to provide the details of that picture.
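The two-level jump behaviour described for the search result providing module 603 can be sketched as a small lookup function. This is only an illustration under stated assumptions: the function name `resolve_jump`, the selection labels (`"title"`, `"count"`, `"more"`, `"second_thumbnail"`), and the dictionary layout are hypothetical and do not come from the patent text.

```python
from typing import Dict

# Hypothetical labels for the selectable elements of the text portion:
# picture title, number of pictures, and other text such as ">>".
TEXT_PART_FIELDS = {"title", "count", "more"}


def resolve_jump(media_info: Dict, selected: str, index: int = 0) -> str:
    """Return the URL to jump to for a given user selection.

    Selecting any element of the text portion leads to the second URL
    address (the page of second thumbnails); selecting a second thumbnail
    on that page leads to the first URL address of the chosen item.
    """
    if selected in TEXT_PART_FIELDS:
        return media_info["second_url"]
    if selected == "second_thumbnail":
        return media_info["items"][index]["first_url"]
    raise ValueError(f"unknown selection: {selected!r}")


media_info = {
    "second_url": "https://example.com/gallery/thumbs",
    "items": [
        {"first_url": "https://example.com/gallery/1"},
        {"first_url": "https://example.com/gallery/2"},
    ],
}

overview_url = resolve_jump(media_info, "count")
detail_url = resolve_jump(media_info, "second_thumbnail", index=1)
```

Keeping the mapping in one function makes the design choice explicit: every text-portion element shares a single target (the thumbnail overview page), while each second thumbnail carries its own target.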
In embodiments of the invention, the device 600 for a search engine to provide media content information of a webpage supplies both text information and media content information in the search result. This gives the client a more intuitive and easier-to-understand way of searching for media content information, lets the user gain a basic understanding of the media content in the webpage, and helps the user judge the relevance of the search result, thereby improving search efficiency.
The methods and devices provided herein are not inherently related to any particular computer, virtual system, or other equipment. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a device is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that various programming languages may be used to implement the invention described herein, and that the above description of a specific language is given to disclose the best mode of the invention.
In the specification provided herein, numerous specific details are set forth. However, it will be understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the claims reflect, inventive aspects lie in fewer than all features of a single disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. Several modules in an embodiment may be combined into one module, unit, or component, and they may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or modules are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to fall within the scope of the invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The device embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the modules in the device according to embodiments of the invention. The invention may also be implemented as a device program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order. These words may be interpreted as names.

Claims (10)

1. A method for a search engine to provide media content information of a webpage, comprising the steps of:
upon receiving a search request that matches the identifier of media content information preset in a webpage, extracting the text information and the media content information of said webpage as the search result of said search request; and
in response to selection of the text information and the media content information of said webpage, providing said search result.
2. the method for claim 1, wherein said media content at least comprises the one in following: picture, animation, Voice & Video.
3. The method of claim 1 or 2, wherein the step of extracting, upon receiving a search request that matches the identifier of media content information preset in a webpage, the text information and the media content information of said webpage as the search result of said search request comprises:
upon receiving a search request that matches the identifier of media content information preset in a webpage, extracting at least one of the following items of said text information in said webpage as the search result of said search request: title, summary, and body text.
4. The method of any one of claims 1-3, wherein the step of extracting, upon receiving a search request that matches the identifier of media content information preset in a webpage, the text information and the media content information of said webpage as the search result of said search request comprises:
upon receiving a search request that matches the identifier of media content information preset in a webpage, extracting at least one of the following items of said media content information in said webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and a first URL address of each media content.
5. The method of any one of claims 1-4, wherein the step of extracting, upon receiving a search request that matches the identifier of media content information preset in a webpage, the text information and the media content information of said webpage as the search result of said search request comprises:
extracting a second URL address pre-allocated for each media content, wherein said second URL address points to a page that displays second thumbnails of said one or more media contents.
6. A device for a search engine to provide media content information of a webpage, comprising:
an information extraction module adapted to, upon receiving a search request that matches the identifier of media content information preset in a webpage, extract the text information and the media content information of said webpage as the search result of said search request; and
a search result providing module adapted to provide said search result in response to selection of the text information and the media content information of said webpage.
7. The device of claim 6, wherein said media content comprises at least one of the following: picture, animation, audio, and video.
8. The device of claim 6 or 7, wherein said information extraction module is adapted to:
upon receiving a search request that matches the identifier of media content information preset in a webpage, extract at least one of the following items of said text information in said webpage as the search result of said search request: title, summary, and body text.
9. The device of any one of claims 6-8, wherein said information extraction module is adapted to:
upon receiving a search request that matches the identifier of media content information preset in a webpage, extract at least one of the following items of said media content information in said webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and a first URL address of each media content.
10. The device of any one of claims 6-9, wherein said information extraction module is adapted to:
extract a second URL address pre-allocated for each media content, wherein said second URL address points to a page that displays second thumbnails of said one or more media contents.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310487591.5A CN103761231A (en) 2013-10-17 2013-10-17 Method and device for providing media content information of page by search engine

Publications (1)

Publication Number Publication Date
CN103761231A true CN103761231A (en) 2014-04-30

Family

ID=50528471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310487591.5A Pending CN103761231A (en) 2013-10-17 2013-10-17 Method and device for providing media content information of page by search engine

Country Status (1)

Country Link
CN (1) CN103761231A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166728A (en) * 2014-08-28 2014-11-26 百度在线网络技术(北京)有限公司 Method and system for generating customized content through search engine and search engine
CN104536968A (en) * 2014-11-28 2015-04-22 北京奇虎科技有限公司 Method and device for optimizing search results
CN104809113A (en) * 2014-01-23 2015-07-29 腾讯科技(深圳)有限公司 Webpage information display method and device
WO2015196910A1 (en) * 2014-06-27 2015-12-30 北京奇虎科技有限公司 Search engine-based summary information extraction method, apparatus and search engine
CN109284421A (en) * 2018-09-21 2019-01-29 广州神马移动信息科技有限公司 The management of content-data and inspection method and its device, equipment/terminal/server, computer-readable medium in Knowledge Community

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996029661A1 (en) * 1995-03-20 1996-09-26 Interval Research Corporation Retrieval of hyperlinked information resources using heuristics
CN1845100A (en) * 2006-05-15 2006-10-11 南京大学 Image extraction feedback method in web search
CN102129453A (en) * 2011-03-04 2011-07-20 黄斌 Display control device and method capable of displaying search result in mode of text completed with graphs
CN102207943A (en) * 2010-03-29 2011-10-05 上海博泰悦臻电子设备制造有限公司 Identification information matching-based search method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809113A (en) * 2014-01-23 2015-07-29 腾讯科技(深圳)有限公司 Webpage information display method and device
CN104809113B (en) * 2014-01-23 2019-08-09 腾讯科技(深圳)有限公司 The display methods and device of webpage information
WO2015196910A1 (en) * 2014-06-27 2015-12-30 北京奇虎科技有限公司 Search engine-based summary information extraction method, apparatus and search engine
CN104166728A (en) * 2014-08-28 2014-11-26 百度在线网络技术(北京)有限公司 Method and system for generating customized content through search engine and search engine
CN104166728B (en) * 2014-08-28 2019-01-25 百度在线网络技术(北京)有限公司 Method, system and the search engine for customizing content are generated by search engine
CN104536968A (en) * 2014-11-28 2015-04-22 北京奇虎科技有限公司 Method and device for optimizing search results
CN104536968B (en) * 2014-11-28 2018-01-05 北京奇虎科技有限公司 A kind of method and apparatus for Optimizing Search result
CN109284421A (en) * 2018-09-21 2019-01-29 广州神马移动信息科技有限公司 The management of content-data and inspection method and its device, equipment/terminal/server, computer-readable medium in Knowledge Community

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140430