CN110287371A

CN110287371A - End-to-end video pushing method, device and electronic equipment

Info

Publication number: CN110287371A
Application number: CN201910562971.8A
Authority: CN
Inventors: 许世坤; 王长虎
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2019-06-26
Filing date: 2019-06-26
Publication date: 2019-09-27

Abstract

Embodiments of the present disclosure provide an end-to-end video pushing method, device, and electronic equipment, belonging to the technical field of data processing. The method comprises: obtaining one or more target videos present in a video library; parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types; performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text; and using the feature vectors of the video frames, the audio file, and the title text as feature vectors of a push system, so that the push system uses the feature vectors to push the target video to a target object. The processing scheme of the present disclosure improves the accuracy of video pushing.

Description

End-to-end video pushing method, device and electronic equipment

Technical field

The present disclosure relates to the technical field of data processing, and in particular to an end-to-end video pushing method, device, and electronic equipment.

Background

With the continuous development of Internet technology, online video has become increasingly abundant. Users are no longer limited to television and can also search the Internet for videos that interest them. After analyzing a user's video preferences, a video platform can proactively recommend videos to the user, making viewing more convenient. To understand a user's behavior and habits, it is usually necessary to check the user's viewing history and make video recommendations from a large amount of historical behavior data.

A recommender system is learned and trained mainly from the interactions between users and recommended information, so the recommendation effect depends on the interactions among users and between users and recommended information. In this process, the content of the video itself is not managed at a fine granularity, which leads to inaccurate recommendations.

Summary of the invention

In view of this, embodiments of the present disclosure provide an end-to-end video pushing method, device, and electronic equipment that at least partly solve the problems in the prior art.

In a first aspect, an embodiment of the present disclosure provides an end-to-end video pushing method, comprising:

obtaining one or more target videos present in a video library;

parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types;

performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text;

using the feature vectors of the video frames, the audio file, and the title text as feature vectors of a push system, so that the push system uses the feature vectors to push the target video to a target object.

According to a specific implementation of the embodiment of the present disclosure, obtaining one or more target videos present in the video library comprises:

obtaining one or more videos to be screened from the video library;

judging whether a recommendation label is present among the labels of the video to be screened;

if so, selecting the video to be screened as a target video.

According to a specific implementation of the embodiment of the present disclosure, parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types comprises:

parsing the frame images in the target video;

selecting one or more video frames based on the result of parsing the images in the target video;

using the video frames as a component of the content parsing results.

According to a specific implementation of the embodiment of the present disclosure, parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types further comprises:

obtaining the audio file included in the target video;

converting the audio file into an audio spectrogram;

using the audio spectrogram as a component of the content parsing results.

obtaining the title text included in the target video, and using the title text as a component of the content parsing results.

According to a specific implementation of the embodiment of the present disclosure, performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text comprises:

performing feature calculation on the video frames in the content parsing results using a preset CNN model;

extracting the first feature vector formed in the fully connected layer of the CNN model.

According to a specific implementation of the embodiment of the present disclosure, performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text further comprises:

performing feature calculation on the audio spectrogram in the content parsing results using a preset CNN model;

extracting the second feature vector formed in the fully connected layer of the CNN model.

performing feature calculation on the title text in the content parsing results using a preset RNN model;

extracting the third feature vector formed in the last node of the RNN model.

According to a specific implementation of the embodiment of the present disclosure, using the feature vectors of the video frames, the audio file, and the title text as feature vectors of the push system, so that the push system uses the feature vectors to push the target video to the target object, comprises:

adding the first feature vector, the second feature vector, and the third feature vector directly to a preset recommendation model as the feature vectors of the target video;

the recommendation model using the first feature vector, the second feature vector, the third feature vector, and the intrinsic information of the target video to push to the target object the target video matching the classification information, wherein the intrinsic information includes the publication time, publication place, and video length of the target video.

In a second aspect, an embodiment of the present disclosure provides an end-to-end video pushing device, comprising:

an obtaining module, configured to obtain one or more target videos present in a video library;

a parsing module, configured to parse the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types;

a computing module, configured to perform feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text;

a pushing module, configured to use the feature vectors of the video frames, the audio file, and the title text as feature vectors of a push system, so that the push system uses the feature vectors to push the target video to a target object.

In a third aspect, an embodiment of the present disclosure further provides an electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the end-to-end video pushing method in the foregoing first aspect or any implementation of the first aspect.

In a fourth aspect, an embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the end-to-end video pushing method in the foregoing first aspect or any implementation of the first aspect.

In a fifth aspect, an embodiment of the present disclosure further provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to execute the end-to-end video pushing method in the foregoing first aspect or any implementation of the first aspect.

The end-to-end video pushing scheme in the embodiments of the present disclosure includes: obtaining one or more target videos present in a video library; parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types; performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text; and using the feature vectors of the video frames, the audio file, and the title text as feature vectors of a push system, so that the push system uses the feature vectors to push the target video to a target object. The scheme of the present disclosure improves the accuracy of video pushing.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of an end-to-end video pushing method provided by an embodiment of the present disclosure;

Figs. 2a-2b are schematic diagrams of neural network structures provided by an embodiment of the present disclosure;

Fig. 3 is a schematic flowchart of another end-to-end video pushing method provided by an embodiment of the present disclosure;

Fig. 4 is a schematic flowchart of another end-to-end video pushing method provided by an embodiment of the present disclosure;

Fig. 5 is a schematic structural diagram of an end-to-end video pushing device provided by an embodiment of the present disclosure;

Fig. 6 is a schematic diagram of an electronic device provided by an embodiment of the present disclosure.

Detailed description

The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.

The embodiments of the present disclosure are illustrated below through specific examples, and those skilled in the art can readily understand other advantages and effects of the present disclosure from the content of this specification. The described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present disclosure. It should be noted that, where no conflict arises, the following embodiments and the features in them can be combined with one another. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in the present disclosure without creative effort fall within the scope of protection of the present disclosure.

It should be noted that various aspects of embodiments within the scope of the appended claims are described below. It should be apparent that the aspects described herein can be embodied in a wide variety of forms, and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, those skilled in the art will understand that an aspect described herein can be implemented independently of any other aspect, and that two or more of these aspects can be combined in various ways. For example, a device can be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such a device can be implemented and/or such a method practiced using other structures and/or functionality in addition to one or more of the aspects set forth herein.

It should also be noted that the drawings provided in the following embodiments illustrate the basic concept of the present disclosure only schematically. The drawings show only the components related to the present disclosure rather than being drawn according to the number, shape, and size of components in an actual implementation; in an actual implementation the form, quantity, and proportion of each component can vary arbitrarily, and the component layout may be more complex.

In addition, specific details are provided in the following description to facilitate a thorough understanding of the examples. However, those skilled in the art will understand that the aspects can be practiced without these specific details.

An embodiment of the present disclosure provides an end-to-end video pushing method. The end-to-end video pushing method provided in this embodiment can be executed by a computing device, which can be implemented as software or as a combination of software and hardware, and which can be integrated into a server, a terminal device, or the like.

Referring to Fig. 1, an end-to-end video pushing method provided by an embodiment of the present disclosure comprises:

S101, obtain one or more target videos present in a video library.

A video operation platform typically stores a massive number of video resources, which may include film and television videos, news videos, self-shot videos, and various other types of video. The operation platform always wants to push the videos a user is most interested in to that user, so as to increase the user's attention to the video platform and thereby further increase the time the user stays on the platform.

Target videos are all or part of the videos selected from the massive video collection after the video operation platform analyzes it. For example, a target video can be a video recommended by users, or a video in the massive video library that receives a high degree of attention. To distinguish target videos effectively, the video operation platform can set a recommendation label on videos that need to be recommended, and treat videos carrying the recommendation label as target videos.

S102, parse the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types.

The target video exists in the form of a video file and generally contains the components common to video files. For example, the target video includes the video frames that make up the video, the audio content, and the text title contained in the video. The video frames, audio content, and text title carry rich information about the target video; by analyzing them, more information related to the target video can be extracted.

Specifically, the video frames included in the target video can be extracted. By analyzing the video frames, a subset of typical frame images can be chosen from all the extracted frame images to describe the content of the target video, and the chosen frame images are used as one component of the content parsing results.
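The text does not say how "typical" frames are chosen. The following is a minimal sketch under one assumed criterion: keep a frame only when it differs enough from the last kept frame. The function names, the flat-list frame representation, and the threshold value are all hypothetical illustrations, not the method of the disclosure.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames, threshold=10.0):
    """Return indices of frames that differ enough from the last kept frame.

    Assumption: each frame is a flat list of grayscale pixel values.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


frames = [
    [0, 0, 0, 0],
    [1, 0, 2, 1],        # nearly identical to frame 0
    [100, 90, 95, 99],   # scene change
    [101, 92, 94, 98],   # nearly identical to frame 2
    [10, 200, 10, 200],  # another scene change
]
key_frame_indices = select_key_frames(frames)
```

With these toy frames, only the first frame and the two scene changes are kept, so far fewer images need to be passed to the image model.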

The target video also contains an audio file, which includes the background music of the target video, the dialogue of the people in it, and any other sounds present. By parsing the audio file, the category of the target video can be judged from the perspective of sound. Specifically, while the target video is parsed, the audio file present in it is extracted; as an example, the extracted audio file is stored in the form of an audio spectrogram. The audio spectrogram can likewise serve as one component of the content parsing results.
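As an illustration of turning audio samples into a spectrogram, the sketch below splits the signal into overlapping frames and takes the magnitude of a naive discrete Fourier transform of each frame. The frame size, hop length, and function names are assumptions; a practical system would typically apply a window function and use an FFT library.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a naive DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_size=8, hop=4):
    """List of per-frame magnitude spectra (rows = time, columns = frequency)."""
    starts = range(0, len(samples) - frame_size + 1, hop)
    return [dft_magnitudes(samples[s:s + frame_size]) for s in starts]


# Synthetic tone with 2 cycles per 8-sample frame, so energy lands in bin 2.
samples = [math.sin(2 * math.pi * 2 * t / 8) for t in range(16)]
spec = spectrogram(samples)
peak_bins = [row.index(max(row)) for row in spec]
```

The resulting two-dimensional array can be treated as an image, which is why the same kind of CNN can be applied to both video frames and audio.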

The target video usually also contains text content, including the text title of the video file (for example, a movie name). Extracting the text title of the video file provides further content related to the target video, and the text title can also serve as one component of the content parsing results.

S103, perform feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text.

After the content parsing results are obtained, the target video needs to be analyzed based on them. Common video classification methods usually classify simply on the basis of the video name and the like, without analyzing the detailed content of the video in depth, so the classification can be inaccurate. To analyze the content of the target video in depth, referring to Figs. 2a-2b, dedicated neural networks can be set up, and the classification information of the target video is obtained through neural network training.

As an example application, for the video frames and the audio spectrogram in the content parsing results, a CNN (convolutional neural network) can be set up for classification training. Referring to Fig. 2a, the neural network model includes convolutional layers, pooling layers, sampling layers, and a fully connected layer.

The main parameters of a convolutional layer include the size of the convolution kernel and the number of input feature maps. Each convolutional layer may contain several feature maps of the same size; feature values within the same layer share weights, and the convolution kernels within each layer have the same size. The convolutional layer performs convolution on the input image and extracts its layout features.
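Weight sharing can be seen in a minimal sketch of a single convolution: the same small kernel is applied at every position of the input. The function name and the toy edge-detecting kernel are assumptions for illustration; as in most deep learning libraries, the operation shown is a "valid" cross-correlation with no padding.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation; one kernel is reused at every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright vertical edge
feature_map = conv2d(image, kernel)
```

The feature map is non-zero only in the column where the edge sits, which is the kind of layout feature the convolutional layer is described as extracting.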

A sampling layer can be connected after the feature extraction of the convolutional layer. The sampling layer computes local averages of the input image and performs secondary feature extraction; connecting the sampling layer to the convolutional layer helps the neural network model remain robust to the input image.

To speed up training of the neural network model, a pooling layer is also placed after the convolutional layer. The pooling layer processes the output of the convolutional layer by max pooling, which can better extract the invariant features of the input image.
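The local averaging of the sampling layer and the max pooling of the pooling layer differ only in the reduction applied to each window, as the following sketch shows. The non-overlapping 2x2 windows and the function names are assumptions for illustration.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reduce_fn([feature_map[r + i][c + j]
                       for i in range(size) for j in range(size)])
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 0, 9, 1],
    [0, 4, 3, 3],
]
max_pooled = pool2d(feature_map, 2, max)   # pooling layer (max pooling)
avg_pooled = pool2d(feature_map, 2, mean)  # sampling layer (local average)
```

Both reductions shrink the 4x4 map to 2x2, which reduces the computation in later layers; max pooling keeps the strongest response in each window, while averaging smooths it.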

The fully connected layer integrates the features in the image feature maps that have passed through multiple convolutional and pooling layers, obtaining the classification features of the input image for use in image classification. In the neural network model, the fully connected layer maps the feature maps produced by the convolutional layers to a feature vector of fixed length. This feature vector contains the combined information of all features of the input image and retains the most characteristic image features in order to complete the image classification task. In this way, the value for each class to which the input image may belong (the class probability) can be computed, and outputting the most likely class completes the classification task. For example, after the computation of the fully connected layer, the input image can be classified over the categories [animal, landscape, person, plant], with corresponding probabilities [P1, P2, P3, P4].
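A minimal sketch of this last stage follows: a fully connected layer produces a fixed-length feature vector, and a second layer followed by softmax turns it into the probabilities [P1, P2, P3, P4]. All weights and input values are toy numbers chosen for illustration, and the use of ReLU and softmax is an assumption; the text does not name the activation functions.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["animal", "landscape", "person", "plant"]

flat_features = [0.5, 1.0, -0.5]              # flattened conv/pool output (toy)
W_embed = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # hypothetical trained weights
b_embed = [0.0, 0.0]
W_out = [[2.0, 0.0], [0.0, 1.0], [1.0, 0.5], [-1.0, -1.0]]
b_out = [0.0, 0.0, 0.0, 0.0]

embedding = relu(dense(flat_features, W_embed, b_embed))  # fixed-length vector
probs = softmax(dense(embedding, W_out, b_out))           # [P1, P2, P3, P4]
predicted = CLASSES[probs.index(max(probs))]
```

For pushing, the quantity of interest is `embedding`, the vector formed in the fully connected layer, rather than the final class label.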

For the text title content in the target video, an RNN (recursive neural network) can be used for classification training. Referring to Fig. 2b, a recursive neural network is composed of hierarchically distributed nodes, including higher-level parent nodes and lower-level child nodes; the child nodes at the far end are usually output nodes, and the nodes have the same properties as nodes in a tree. The output node of a recursive neural network is usually located at the top of the tree diagram, in which case the structure is drawn from bottom to top and parent nodes are located below child nodes. Each node of the recursive neural network can have a data input; for a node at the i-th level, the system state is computed as:

In the formula, the state term is the system state of the node and all its parent nodes; when there are multiple parent nodes, it is the system states merged into a matrix. X is the data input of the node; if the node has no input, no calculation is performed. f is the activation function or an encapsulated feedforward neural network, which can use deep algorithms such as gating algorithms. U, W, and b are weight coefficients; the weight coefficients are independent of the level of the node, and the weights of all nodes of the recursive neural network are shared.
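The formula itself does not survive in this text. One form consistent with the symbols defined above is sketched below; the state symbol h, the tilde marking the merged state, and the level superscripts are assumed notation rather than the notation of the original.

```latex
h^{(i)} = f\left( U\,\tilde{h}^{(i)} + W\,X^{(i)} + b \right)
```

Here $h^{(i)}$ is the system state of a node at the i-th level, $\tilde{h}^{(i)}$ stands for the merged system state that the text describes as feeding into that node, $X^{(i)}$ is the data input of the node, and $U$, $W$, $b$ are the weights shared by all nodes.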

By feeding the text title content of the target video into the RNN as input, a classification value for the text title content based on the RNN can be obtained.
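The sketch below illustrates taking the state of the last node as the title feature vector. It simplifies the tree-structured network to a chain, where each word has exactly one predecessor, and assumes tanh as the activation f; the vocabulary, one-hot word vectors, and weights are toy values, not trained parameters.

```python
import math


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_last_state(inputs, U, W, b):
    """Run h = tanh(U h_prev + W x + b) over the inputs; return the final h."""
    h = [0.0] * len(b)
    for x in inputs:
        pre = [u + w + bi for u, w, bi in zip(matvec(U, h), matvec(W, x), b)]
        h = [math.tanh(p) for p in pre]
    return h


# Hypothetical three-word vocabulary with one-hot word vectors.
VOCAB = {
    "funny": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.0],
    "video": [0.0, 0.0, 1.0],
}
U = [[0.5, 0.0], [0.0, 0.5]]             # state-to-state weights (shared)
W = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]  # input-to-state weights (shared)
b = [0.0, 0.0]

title = ["funny", "cat", "video"]
title_vector = rnn_last_state([VOCAB[w] for w in title], U, W, b)
reversed_vector = rnn_last_state([VOCAB[w] for w in reversed(title)], U, W, b)
```

Because the state is carried from word to word, the final vector depends on word order, so the same words in a different order give a different feature vector.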

In actual operation, a pre-trained image CNN classification model can be used to extract an embedding feature (feature vector) from the extracted image frames; a pre-trained audio CNN classification model can be used to extract an embedding feature (feature vector) from the extracted audio spectrogram; and a pre-trained RNN classification model can be used to extract an embedding feature (feature vector) from the extracted title text.

S104, use the feature vectors of the video frames, the audio file, and the title text as feature vectors of a push system, so that the push system uses the feature vectors to push the target video to a target object.

After the feature information of the target video is obtained, referring to Fig. 3, the feature vectors can be added directly to the video recommendation system, which uses them together with the other video information already present in the system to recommend videos to users. The other video information includes, but is not limited to, the time the video was published, the city where it was published, the device it was published from, and the video length.
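One simple way to picture "adding the feature vectors directly" is to concatenate them with the existing video information into a single input row. In the sketch below, the linear scoring function stands in for the recommendation model, which the text does not specify; the function names, the per-user weights, and all values are hypothetical.

```python
def build_recommender_input(frame_vec, audio_vec, title_vec, intrinsic):
    """Concatenate the three content vectors with the intrinsic video info."""
    return list(frame_vec) + list(audio_vec) + list(title_vec) + list(intrinsic)


def score(user_weights, item_features):
    """Stand-in recommendation model: a plain dot product."""
    return sum(w * x for w, x in zip(user_weights, item_features))


# Toy vectors; the last entry is a normalized video length (intrinsic info).
features = {
    "A": build_recommender_input([0.9, 0.1], [0.2], [0.7], [0.5]),
    "B": build_recommender_input([0.1, 0.8], [0.6], [0.1], [0.5]),
}
user_weights = [1.0, 0.0, 0.5, 1.0, 0.0]  # hypothetical learned preferences

ranked = sorted(features, key=lambda vid: score(user_weights, features[vid]),
                reverse=True)
```

The video ranked first would be the one pushed to this user; in a real system the weights would be learned from interaction data rather than fixed by hand.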

Referring to Fig. 4, according to a specific implementation of the embodiment of the present disclosure, obtaining the one or more target videos may include:

S401, obtain one or more videos to be screened from a target video source.

A video operation platform typically stores a massive number of video resources, which may include film and television videos, news videos, self-shot videos, and various other types of video. The operation platform always wants to push the videos a user is most interested in to that user, so as to increase the user's attention to the video platform and thereby further increase the time the user stays on the platform.

The videos to be screened can be all or part of the videos selected from the massive video collection after the video operation platform analyzes it.

S402, judge whether a recommendation label is present among the labels of the video to be screened.

To improve the efficiency of screening target videos, a recommendation label can be set on videos in advance; target videos can then be screened effectively by judging whether a recommendation label is present among the labels of a video to be screened.

S403, if so, select the video to be screened as a target video.
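Steps S401 to S403 amount to a filter over the candidate videos. A minimal sketch follows, in which the `Video` record and the tag name `"recommended"` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

RECOMMEND_TAG = "recommended"  # hypothetical name of the recommendation label


@dataclass
class Video:
    video_id: str
    title: str
    tags: set = field(default_factory=set)


def select_target_videos(candidates):
    """S401-S403: keep only candidates that carry the recommendation label."""
    return [v for v in candidates if RECOMMEND_TAG in v.tags]


library = [
    Video("v1", "Cooking at home", {"food", RECOMMEND_TAG}),
    Video("v2", "Untitled clip", {"misc"}),
    Video("v3", "City news", {RECOMMEND_TAG}),
]
targets = select_target_videos(library)
target_ids = [v.video_id for v in targets]
```

Because the label is set in advance, the check is a cheap set lookup per video and no content analysis is needed at screening time.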

As one case, parsing the different types of content included in the target video includes parsing the images (video frames) in the target video, selecting one or more video frames based on the result of parsing the images, and using the video frames as a component of the content parsing results.

According to a specific implementation of the embodiment of the present disclosure, parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types comprises: parsing the frame images in the target video; selecting one or more video frames based on the result of parsing the images in the target video; and using the video frames as a component of the content parsing results.

According to a specific implementation of the embodiment of the present disclosure, parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types further comprises: obtaining the audio file included in the target video; converting the audio file into an audio spectrogram; and using the audio spectrogram as a component of the content parsing results.

According to a specific implementation of the embodiment of the present disclosure, parsing the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types comprises: obtaining the title text included in the target video, and using the title text as a component of the content parsing results.

According to a specific implementation of the embodiment of the present disclosure, performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text comprises: performing feature calculation on the video frames in the content parsing results using a preset CNN model; and extracting the first feature vector formed in the fully connected layer of the CNN model.

According to a specific implementation of the embodiment of the present disclosure, performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text further comprises: performing feature calculation on the audio spectrogram in the content parsing results using a preset CNN model; and extracting the second feature vector formed in the fully connected layer of the CNN model.

According to a specific implementation of the embodiment of the present disclosure, performing feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text further comprises: performing feature calculation on the title text in the content parsing results using a preset RNN model; and extracting the third feature vector formed in the last node of the RNN model.

According to a specific implementation of the embodiment of the present disclosure, using the feature vectors of the video frames, the audio file, and the title text as feature vectors of the push system, so that the push system uses the feature vectors to push the target video to the target object, comprises: adding the first feature vector, the second feature vector, and the third feature vector directly to a preset recommendation model as the feature vectors of the target video; the recommendation model using the first feature vector, the second feature vector, the third feature vector, and the intrinsic information of the target video to push to the target object the target video matching the classification information, wherein the intrinsic information includes the publication time, publication place, and video length of the target video.

Corresponding to the above method embodiments, and referring to Fig. 5, an embodiment of the present disclosure further provides an end-to-end video pushing device 50, comprising:

an obtaining module 501, configured to obtain one or more target videos present in a video library;

a parsing module 502, configured to parse the video frames, audio file, and title text included in the target video to obtain content parsing results of multiple different types;

a computing module 503, configured to perform feature calculation on the multiple different types of content parsing results with preset classification models to obtain, respectively, feature vectors of the video frames, the audio file, and the title text;

a pushing module 504, configured to use the feature vectors of the video frames, the audio file, and the title text as feature vectors of a push system, so that the push system uses the feature vectors to push the target video to a target object.
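The four modules can be pictured as stages of one pipeline. The sketch below wires them together as plain callables; the class name, the dictionary-based video records, and the placeholder implementations of each stage are hypothetical and stand in for the real parsing and neural network models.

```python
class VideoPushDevice:
    """Chains the obtaining, parsing, computing, and pushing modules."""

    def __init__(self, obtain, parse, compute, push):
        self.obtain = obtain    # module 501
        self.parse = parse      # module 502
        self.compute = compute  # module 503
        self.push = push        # module 504

    def run(self, library, user):
        results = []
        for video in self.obtain(library):
            parsed = self.parse(video)
            vectors = self.compute(parsed)
            results.append(self.push(user, video, vectors))
        return results


def obtain(library):
    return [v for v in library if v.get("recommended")]


def parse(video):
    return {"frames": video["frames"], "audio": video["audio"],
            "title": video["title"]}


def compute(parsed):
    # Placeholder "features": real modules would run the CNN and RNN models.
    return [float(len(parsed["frames"])), float(len(parsed["audio"])),
            float(len(parsed["title"]))]


def push(user, video, vectors):
    return (user, video["id"], vectors)


library = [
    {"id": "v1", "recommended": True, "frames": [1, 2], "audio": [0.1],
     "title": "cats"},
    {"id": "v2", "recommended": False, "frames": [3], "audio": [0.2],
     "title": "dogs"},
]
device = VideoPushDevice(obtain, parse, compute, push)
pushed = device.run(library, "user-42")
```

Passing the stages in as callables keeps each module replaceable, so a different classifier or push system can be swapped in without touching the others.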

The device shown in Fig. 5 can correspondingly execute the content of the above method embodiments. For parts not described in detail in this embodiment, refer to the content recorded in the above method embodiments, which is not repeated here.

Referring to Fig. 6, an embodiment of the present disclosure further provides an electronic device 60, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the end-to-end video pushing method in the foregoing method embodiments.

An embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the end-to-end video pushing method in the foregoing method embodiments.

An embodiment of the present disclosure further provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to execute the end-to-end video pushing method in the foregoing method embodiments.

Reference is now made to Fig. 6, which shows a schematic structural diagram of an electronic device 60 suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a laptop computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable media player) and a vehicle-mounted terminal (for example, a vehicle-mounted navigation terminal), as well as fixed terminals such as a digital TV and a desktop computer. The electronic device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.

As shown in Fig. 6, the electronic device 60 may include a processing device (such as a central processing unit, a graphics processor, etc.) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 60. The processing device 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

In general, the following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, a touch pad, a keyboard, a mouse, an image sensor, a microphone, an accelerometer, a gyroscope, etc.; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 608 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 609. The communication device 609 may allow the electronic device 60 to communicate with other devices, wirelessly or by wire, to exchange data. Although the figure shows the electronic device 60 with various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided.

In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above functions defined in the method of the embodiments of the present disclosure are performed.

It should be noted that the above computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.

The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.

The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain at least two Internet Protocol addresses; send, to a node evaluation device, a node evaluation request including the at least two Internet Protocol addresses, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; and receive the Internet Protocol address returned by the node evaluation device; wherein the obtained Internet Protocol address indicates an edge node in a content distribution network.

Alternatively, the above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive a node evaluation request including at least two Internet Protocol addresses; select an Internet Protocol address from the at least two Internet Protocol addresses; and return the selected Internet Protocol address; wherein the received Internet Protocol address indicates an edge node in a content distribution network.

Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and any combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, under certain circumstances, constitute a limitation on the unit itself; for example, a first acquiring unit may also be described as "a unit that acquires at least two Internet Protocol addresses".

It should be understood that each part of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.

The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. An end-to-end video pushing method, characterized by comprising:

obtaining one or more target videos present in a video library;

parsing the video frames, audio file and title text included in the target video to obtain multiple different types of content parsing results;

performing feature calculation on the multiple different types of content parsing results by means of a preset classification model, to obtain feature vectors of the video frames, the audio file and the title text respectively;

using the feature vectors of the video frames, the audio file and the title text as feature vectors of a push system, so that the push system pushes the target video to a target object using the feature vectors.

2. The method according to claim 1, characterized in that the obtaining one or more target videos present in a video library comprises:

obtaining one or more videos to be screened from the video library;

if so, selecting the video to be screened as a target video.

3. The method according to claim 1, characterized in that the parsing the video frames, audio file and title text included in the target video to obtain multiple different types of content parsing results comprises:

parsing the frame images in the target video;

using the video frames as a component of the content parsing results.

4. The method according to claim 3, characterized in that the parsing the video frames, audio file and title text included in the target video to obtain multiple different types of content parsing results further comprises:

obtaining the audio file included in the target video;

converting the audio file into an audio spectrogram;
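As a non-limiting illustration of converting audio into an audio spectrogram, the following Python sketch computes a magnitude spectrogram with a naive short-time Fourier transform. The function name `spectrogram`, the frame size, the hop length and the Hann window are assumptions chosen for this sketch; the claim does not specify how the spectrogram is computed.

```python
import cmath
import math
from typing import List


def spectrogram(samples: List[float], frame_size: int = 8, hop: int = 4) -> List[List[float]]:
    """Return a magnitude spectrogram: one row per time frame, one column per frequency bin."""
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window to reduce spectral leakage (an assumed, common choice).
        windowed = [
            x * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, x in enumerate(frame)
        ]
        # Naive discrete Fourier transform; keep bins 0 .. frame_size // 2.
        row = []
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            row.append(abs(coeff))
        rows.append(row)
    return rows


# Toy signal: a sine wave completing 2 cycles per 8-sample frame.
signal = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = spectrogram(signal)
```

Each row of `spec` can be read as one column of a spectrogram image. For the toy signal the energy concentrates in frequency bin 2 of every frame, which is the kind of two-dimensional structure an image-style model can consume.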

5. The method according to claim 4, characterized in that the parsing the video frames, audio file and title text included in the target video to obtain multiple different types of content parsing results comprises:

obtaining the title text included in the target video, and using the title text as a component of the content parsing results.

6. The method according to claim 1, characterized in that the performing feature calculation on the multiple different types of content parsing results by means of a preset classification model, to obtain feature vectors of the video frames, the audio file and the title text respectively, comprises:

extracting a first feature vector formed in a fully connected layer of the CNN model.
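As a non-limiting illustration of taking a feature vector from a fully connected layer rather than from the classification output, the following Python sketch runs a toy single-channel CNN. The network shape, the function names and all weights are hypothetical values chosen for the sketch; a practical model would be a trained CNN with many channels and layers.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Single-channel 'valid' cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            acc = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            row.append(max(0.0, acc))
        out.append(row)
    return out


def dense(vector: List[float], weights: Matrix, relu: bool = False) -> List[float]:
    """Fully connected layer: one output per weight row."""
    out = [sum(w * x for w, x in zip(row, vector)) for row in weights]
    return [max(0.0, v) for v in out] if relu else out


def frame_feature(
    image: Matrix, kernel: Matrix, fc_weights: Matrix, cls_weights: Matrix
) -> Tuple[List[float], List[float]]:
    """Run the toy CNN and return (fully connected activations, class scores)."""
    feature_map = conv2d_valid(image, kernel)
    flat = [v for row in feature_map for v in row]
    fc = dense(flat, fc_weights, relu=True)  # fully connected layer
    scores = dense(fc, cls_weights)          # classification output, not used as the feature
    return fc, scores


# Hypothetical toy input and weights.
image = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [2.0, 0.0, 1.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
fc_weights = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
cls_weights = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
first_feature_vector, class_scores = frame_feature(image, kernel, fc_weights, cls_weights)
```

The point of the sketch is the return value: the activations of the fully connected layer are kept as the feature vector, while the class scores computed after them are discarded for pushing purposes.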

7. The method according to claim 6, characterized in that the performing feature calculation on the multiple different types of content parsing results by means of a preset classification model, to obtain feature vectors of the video frames, the audio file and the title text respectively, further comprises:

8. The method according to claim 7, characterized in that the performing feature calculation on the multiple different types of content parsing results by means of a preset classification model, to obtain feature vectors of the video frames, the audio file and the title text respectively, further comprises:

extracting a third feature vector formed in the last node of the RNN model.
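As a non-limiting illustration of extracting a feature vector from the last node of an RNN, the following Python sketch runs a simple Elman-style recurrent network over a sequence and keeps only the final hidden state. The character encoding in `encode_title`, the hidden size and the weights are hypothetical; the claim does not specify the RNN variant or the text encoding.

```python
import math
from typing import List

Matrix = List[List[float]]


def rnn_last_state(inputs: List[List[float]], w_in: Matrix, w_rec: Matrix) -> List[float]:
    """Run a simple RNN over the sequence and return the hidden state of the final step."""
    hidden_size = len(w_in)
    h = [0.0] * hidden_size
    for x in inputs:
        # h_t = tanh(W_in x_t + W_rec h_{t-1}); the old h is read before it is replaced.
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], h))
            )
            for j in range(hidden_size)
        ]
    return h


def encode_title(title: str) -> List[List[float]]:
    """Hypothetical toy encoding: one 2-dimensional vector per character."""
    return [[(ord(c) % 7) / 7.0, (ord(c) % 3) / 3.0] for c in title]


# Hypothetical weights for a hidden state of size 2.
w_in = [[0.5, -0.25], [0.1, 0.3]]
w_rec = [[0.2, 0.0], [-0.1, 0.4]]
third_feature_vector = rnn_last_state(encode_title("cat video"), w_in, w_rec)
```

Because the final hidden state depends on every earlier step through the recurrent weights, it summarizes the whole title in a fixed-length vector regardless of how long the title is.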

9. The method according to claim 8, characterized in that the using the feature vectors of the video frames, the audio file and the title text as feature vectors of a push system, so that the push system pushes the target video to a target object using the feature vectors, comprises:

using the first feature vector, the second feature vector and the third feature vector as the feature vector of the target video, and adding it directly to a preset recommendation model.
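As a non-limiting illustration of supplying the three feature vectors to a recommendation model, the following Python sketch concatenates them into a single video vector and scores it against a user vector. Concatenation and the dot-product `score` function are assumptions made for the sketch; the claim states only that the vectors are added directly to a preset recommendation model.

```python
from typing import List


def build_video_vector(first: List[float], second: List[float], third: List[float]) -> List[float]:
    """Concatenate the first, second and third feature vectors in a fixed order."""
    return list(first) + list(second) + list(third)


def score(user_vector: List[float], video_vector: List[float]) -> float:
    """Hypothetical recommendation model: a plain dot product."""
    if len(user_vector) != len(video_vector):
        raise ValueError("user and video vectors must have the same length")
    return sum(u * v for u, v in zip(user_vector, video_vector))


# Hypothetical feature values and user preferences.
video_vector = build_video_vector([4.0, 0.0, 2.0], [0.5, 1.5], [0.25])
user_vector = [0.5, 0.0, 1.0, 2.0, 0.0, 4.0]
video_score = score(user_vector, video_vector)
```

Keeping a fixed concatenation order matters in such a design, because the recommendation model associates each position of the combined vector with a particular content type.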

10. An end-to-end video pushing apparatus, characterized by comprising:

a parsing module, configured to parse the video frames, audio file and title text included in the target video to obtain multiple different types of content parsing results;

a computing module, configured to perform feature calculation on the multiple different types of content parsing results by means of a preset classification model, to obtain feature vectors of the video frames, the audio file and the title text respectively;

a pushing module, configured to use the feature vectors of the video frames, the audio file and the title text as feature vectors of a push system, so that the push system pushes the target video to a target object using the feature vectors.

11. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the end-to-end video pushing method according to any one of the preceding claims 1-9.

12. A non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the end-to-end video pushing method according to any one of the preceding claims 1-9.