CN108629224B - Information presentation method and device - Google Patents


Info

Publication number
CN108629224B
Authority
CN
China
Prior art keywords
presented
information
image
frame
target item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710152564.0A
Other languages
Chinese (zh)
Other versions
CN108629224A (en)
Inventor
李川
游正朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710152564.0A
Priority to PCT/CN2018/072285 (WO2018166288A1)
Publication of CN108629224A
Application granted
Publication of CN108629224B
Legal status: Active (anticipated expiration tracked)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2668 Creating a channel for a dedicated end-user group, e.g. insertion of targeted commercials based on end-user profiles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses an information presentation method and device. One specific embodiment of the method includes: detecting a key frame in a target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; in response to detecting a key frame, detecting an image of a target item in the key frame; in response to detecting the image of the target item, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and if so, obtaining to-be-presented information matching the image of the target item and presenting it in the frames that continuously present the image of the target item. This embodiment presents information targeted at the specific item in the target video, improving the accuracy of information push.

Description

Information presentation method and device
Technical field
The present application relates to the field of computer technology, specifically to the field of video technology, and more particularly to an information presentation method and device.
Background technique
With the rapid spread of the Internet and the development of digital image acquisition and processing technology, the online video industry has risen quickly and plays an increasingly important role in people's daily lives. As a comprehensive medium combining images, sound, text, and other information, video has a powerful capacity for carrying and transmitting information, so semantic analysis and understanding of video has long been an important research direction in the field of multimedia signal processing. Meanwhile, with the fast growth of e-commerce platforms, online shopping is increasingly becoming people's preferred way to shop, which creates business opportunities for combining the online video industry with e-commerce.
Analyzing video content and combining it with personalized user information to form a personalized advertisement recommendation system helps raise advertisement click-through and conversion rates; personalized advertisement recommendation also effectively reduces viewers' discomfort at passively receiving fixed advertisements. Therefore, content analysis of various online videos combined with personalized recommendation of related service information, such as online shopping, has significant research and practical value.
Summary of the invention
The purpose of the present application is to propose an improved information presentation method and device to solve the technical problems mentioned in the background section above.
In a first aspect, an embodiment of the present application provides an information presentation method, comprising: detecting a key frame in a target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; in response to detecting a key frame, detecting an image of a target item in the key frame; in response to detecting the image of the target item, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and if so, obtaining to-be-presented information matching the image of the target item and presenting it in the frames that continuously present the image of the target item.
In some embodiments, detecting the key frame in the target video comprises: obtaining a frame whose image entropy is greater than the preset image entropy threshold as a key frame; according to the playing order of the target video, obtaining the first subsequent frame whose image entropy is greater than the preset image entropy threshold; determining whether the similarity between that frame and the key frame is less than a preset similarity threshold; and if so, determining that frame to be a key frame as well.
In some embodiments, detecting the image of the target item in the key frame comprises: detecting the image of the target item based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and determine the image of the target item from those features.
In some embodiments, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than the predetermined frame count comprises: using a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames following the key frame; and if so, counting the frames that continuously present the image of the target item and determining whether their number exceeds the predetermined frame count.
In some embodiments, presenting the to-be-presented information in the frames that continuously present the image of the target item comprises: determining position information of the image of the target item within those frames; determining a presentation position for the to-be-presented information according to the position information; and presenting the to-be-presented information at that position.
In some embodiments, obtaining the to-be-presented information matching the image of the target item comprises: obtaining a set of to-be-presented information, where each item of information includes a picture; determining the similarity between the picture in each item of information and the image of the target item; and selecting at least one item of information from the set in descending order of similarity.
In some embodiments, the to-be-presented information includes text information; and obtaining the to-be-presented information matching the image of the target item comprises: obtaining text information matching the category of the image of the target item.
In some embodiments, obtaining the to-be-presented information matching the image of the target item comprises: obtaining the category label of the user watching the target video through a terminal, where the user's category label is obtained by big-data analysis of the user's behavioral data; and obtaining from the set of to-be-presented information at least one item matching the user's category label.
In a second aspect, an embodiment of the present application provides an information presentation device, comprising: a key frame detection unit for detecting a key frame in a target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; an image detection unit for detecting an image of a target item in the key frame in response to detecting the key frame; a determination unit for determining, in response to detecting the image of the target item, whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and a presentation unit for obtaining, if so, to-be-presented information matching the image of the target item and presenting it in the frames that continuously present the image of the target item.
In some embodiments, the key frame detection unit is further used to: obtain a frame whose image entropy is greater than the preset image entropy threshold as a key frame; according to the playing order of the target video, obtain the first subsequent frame whose image entropy is greater than the threshold; determine whether the similarity between that frame and the key frame is less than a preset similarity threshold; and if so, determine that frame to be a key frame as well.
In some embodiments, the image detection unit is further used to: detect the image of the target item in the key frame based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and determine the image of the target item from those features.
In some embodiments, the determination unit is further used to: use a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames following the key frame; and if so, count the frames that continuously present the image of the target item and determine whether their number exceeds the predetermined frame count.
In some embodiments, the presentation unit is further used to: determine position information of the image of the target item within the frames that continuously present it; determine a presentation position for the to-be-presented information according to the position information; and present the to-be-presented information at that position.
In some embodiments, the presentation unit is further used to: obtain a set of to-be-presented information, where each item of information includes a picture; determine the similarity between the picture in each item of information and the image of the target item; and select at least one item of information from the set in descending order of similarity.
In some embodiments, the to-be-presented information includes text information, and the presentation unit is further used to obtain text information matching the category of the image of the target item.
In some embodiments, the presentation unit is further used to: obtain the category label of the user watching the target video through a terminal, where the user's category label is obtained by big-data analysis of the user's behavioral data; and obtain from the set of to-be-presented information at least one item matching the user's category label.
In a third aspect, an embodiment of the present application provides a device, comprising: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the method of any embodiment of the first aspect.
The information presentation method and device provided by the embodiments of the present application detect the image of a target item in a key frame of a target video and present to-be-presented information on the frames that continuously present that image. By targeting the presentation to the content of the target video itself, the application improves the precision of information presentation, thereby reducing cost and increasing the user's click-through rate.
Detailed description of the invention
Other features, objects, and advantages of the present application will become more apparent by reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the information presentation method of the present application;
Fig. 3a is a schematic diagram of the construction of the compressed vector in the information presentation method of the present application;
Fig. 3b is a schematic diagram of the information presentation process of the information presentation method of the present application;
Fig. 4 is a flowchart of another embodiment of the information presentation method of the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the information presentation device of the present application;
Fig. 6 is a structural schematic diagram of a computer system adapted to implement a device of the embodiments of the present application.
Specific embodiment
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, not to limit it. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the information presentation method or information presentation device of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various client applications supporting video playback may be installed on the terminal devices 101, 102, 103, such as web browser applications, shopping applications, search applications, instant messaging tools, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices with a display screen that support video playback, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, for example a background video server supporting the videos displayed on the terminal devices 101, 102, 103. The background video server may analyze and process received data such as video playback requests and feed the processing results (e.g., video data) back to the terminal devices.
It should be noted that the information presentation method provided by the embodiments of the present application is generally executed by the server 105; accordingly, the information presentation device is generally located in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the information presentation method according to the present application is shown. The information presentation method comprises the following steps:
Step 201: Detect a key frame in a target video.
In the present embodiment, the electronic device on which the information presentation method runs (such as the server shown in Fig. 1) may receive, through a wired or wireless connection, a video playback request from the terminal on which the user plays video, obtain the target video according to the request, and detect the key frame in the target video. A key frame is a frame whose image entropy is greater than a preset image entropy threshold. Image entropy is the average number of bits of the image's gray-level set, in bits per pixel, and also reflects the average information content of the video source. Image entropy is defined as:
H = -Σi pi · log2 pi   (formula 1)
where H is the image entropy and pi is the probability of a pixel with gray level i appearing in the image. Taking only frames whose image entropy exceeds the preset image entropy threshold removes blank frames from the video and further reduces the complexity of the algorithm.
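The entropy computation in formula 1 can be sketched as follows for an 8-bit grayscale frame; this is an illustrative implementation, not code from the patent.

```python
import numpy as np

def image_entropy(gray):
    """Shannon entropy of an 8-bit grayscale image, in bits per pixel."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                 # zero-probability gray levels contribute nothing
    return float(-(p * np.log2(p)).sum())

# A flat single-color frame ("blank frame") has zero entropy, so an entropy
# threshold filters such frames out cheaply before any further processing.
blank = np.zeros((4, 4), dtype=np.uint8)
varied = np.arange(256, dtype=np.uint8).reshape(16, 16)  # all 256 levels equally likely
```

With all 256 gray levels equally likely, the entropy reaches its maximum of 8 bits per pixel, while the blank frame scores 0 and falls below any positive threshold.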
In some optional implementations of the present embodiment, detecting the key frame in the target video comprises: obtaining a frame whose image entropy is greater than the preset image entropy threshold as a key frame; according to the playing order of the target video, obtaining the first subsequent frame whose image entropy is greater than the threshold; determining whether the similarity between that frame and the key frame is less than a preset similarity threshold; and if so, determining that frame to be a key frame. Under normal circumstances a target video contains multiple independent scenes, and extracting the key frames containing the image of the target item from each independent scene helps avoid repeated detection, thereby reducing the complexity of the algorithm. The application detects key frames using the event information of successive frames in the video. A so-called event divides the video into independent frame units: within each unit, continuity between frames is strong and image differences are small, while the image differences between different units are large. The similarity of images is characterized by the pixel differences between them, as shown below:
sim = -abs(curFrame - preFrame)   (formula 2)
where sim is the similarity, curFrame and preFrame are the values of the same pixel in two consecutive frames, and abs is the absolute value. Following the playing order of the video, the first frame whose image entropy exceeds the preset image entropy threshold is taken as a key frame; the value of any pixel on that key frame is preFrame, and the value of the pixel at the same position in a frame after the key frame is curFrame. If the sim value calculated by formula 2 is less than the preset similarity threshold, the later frame is also determined to be a key frame.
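The two-threshold key-frame selection (entropy filter plus the formula 2 similarity test) can be sketched like this; the function names and thresholds are illustrative assumptions, and formula 2 is averaged over all pixels of a frame rather than evaluated per pixel.

```python
import numpy as np

def image_entropy(gray):
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def frame_similarity(cur, prev):
    """Formula 2 averaged over all pixels: sim = -|curFrame - preFrame|.
    More negative means less similar."""
    return float(-np.abs(cur.astype(np.int16) - prev.astype(np.int16)).mean())

def select_keyframes(frames, entropy_thresh, sim_thresh):
    """Skip low-entropy (blank) frames; start a new key frame whenever the
    similarity to the current key frame drops below sim_thresh."""
    key, key_indices = None, []
    for i, f in enumerate(frames):
        if image_entropy(f) <= entropy_thresh:
            continue
        if key is None or frame_similarity(f, key) < sim_thresh:
            key = f
            key_indices.append(i)
    return key_indices
```

For example, given a blank frame, two identical checkerboard frames, and an inverted checkerboard, only the first checkerboard and the inversion are selected as key frames.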
Step 202: In response to detecting a key frame, detect the image of the target item in the key frame.
In the present embodiment, there may be images of multiple items in the key frame, for example a T-shirt, a cap, shoes, or a beverage. The image of the target item can be detected among these images so that information is presented in a targeted way, rather than presenting information related to every item appearing in the key frame. For example, when information related to a T-shirt needs to be presented, the T-shirt is taken as the target item and the image of the T-shirt is detected.
In some optional implementations of the present embodiment, detecting the image of the target item in the key frame comprises: detecting the image of the target item based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and determine the image of the target item from those features. Extracting items with a convolutional network effectively identifies the position and category of the target item's image in the key frame, which facilitates subsequent target tracking and item recommendation. For an input picture, candidate regions are first extracted, 1,000 candidate regions per picture; each candidate region is then normalized in size, the high-dimensional features of the candidate region are extracted with the convolutional network, and the candidate region is finally classified through a fully connected layer. By classifying each region, the image of the target item on the key frame is extracted and its position can also be determined. The targets detected by the application's pre-trained network may include garment types, such as shoes, jackets, shorts, skirts, and dresses. This information is significant for subsequent item recommendation, and the position information of the target item is used to initialize subsequent target tracking.
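As a rough, self-contained illustration of the candidate-region pipeline just described (propose regions, normalize their size, extract features, score them), the sketch below substitutes trivial stand-ins — a sliding-window proposer, nearest-neighbour resizing, and a linear scorer — for the patent's 1,000 region proposals, convolutional features, and fully connected classifier, which are not reproduced here.

```python
import numpy as np

def propose_regions(img, size=8, stride=4):
    """Stand-in region proposer: fixed-size sliding windows as (y, x, h, w)."""
    h, w = img.shape
    return [(y, x, size, size)
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]

def normalize_region(img, box, out=4):
    """Normalize a candidate region to a fixed out x out size (nearest neighbour)."""
    y, x, h, w = box
    patch = img[y:y + h, x:x + w].astype(np.float64)
    ys = np.arange(out) * h // out
    xs = np.arange(out) * w // out
    return patch[np.ix_(ys, xs)]

def classify(feat, weights):
    """Stand-in for CNN features plus a fully connected scoring layer."""
    return float(feat.ravel() @ weights)

img = np.zeros((16, 16)); img[4:12, 4:12] = 1.0   # a bright "item" on black
weights = np.ones(16)                              # toy linear scorer
boxes = propose_regions(img)
best = max(boxes, key=lambda b: classify(normalize_region(img, b), weights))
```

The highest-scoring window recovers both the presence and the position of the bright region, mirroring how region classification yields the item's location as well as its category.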
Convolutional neural networks (CNNs) are a kind of artificial neural network. A convolutional neural network is a feedforward network whose artificial neurons respond to surrounding units within a local coverage area, and it performs outstandingly on large-scale image processing. In general, the basic structure of a CNN includes two kinds of layers. The first is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, from which a local feature is extracted; once the local feature is extracted, its positional relationship to other features is determined as well. The second is the computation layer: each computation layer of the network consists of multiple feature maps, each feature map is a plane, and the weights of all neurons in the plane are equal. The feature-mapping structure uses the sigmoid function, whose influence-function kernel is small, as the activation function of the convolutional network, so that the feature maps are shift-invariant. Furthermore, since the neurons on one mapping plane share weights, the number of free parameters of the network is reduced.
Each feature extraction layer in the convolutional neural network is followed by a computation layer for local averaging and secondary extraction; this characteristic two-step feature extraction structure reduces feature resolution. By combining low-level features into more abstract high-level representations of attribute categories or features, convolutional networks discover distributed feature representations of the data. The essence of deep learning is to learn more useful features by building machine-learning models with many hidden layers and massive amounts of training data, so as to improve the accuracy of classification or prediction. The convolutional neural network can be used to identify features of the target item in the key frame, which may include color, texture, shading, directional change, and material quality.
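A minimal numpy rendering of the two layer types described above — a shared-weight feature map with sigmoid activation followed by local-average subsampling — assuming a single channel and a single kernel for illustration:

```python
import numpy as np

def feature_map(img, kernel):
    """Shared-weight convolution: every output unit applies the same kernel to
    its local receptive field, then passes through a sigmoid activation."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return 1.0 / (1.0 + np.exp(-out))   # sigmoid activation

def subsample(fm, k=2):
    """k x k local averaging: the computation layer that reduces feature resolution."""
    h, w = fm.shape
    return fm[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

img = np.random.default_rng(0).random((6, 6))
fm = feature_map(img, np.ones((3, 3)) / 9.0)   # one 3x3 shared kernel
pooled = subsample(fm)
```

Sharing the nine kernel weights across all sixteen output units is exactly the weight sharing that keeps the free-parameter count low, and the averaging layer is the local-average secondary extraction.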
Step 203: In response to detecting the image of the target item in the key frame, determine whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count.
In the present embodiment, a variety of tracking algorithms can be used to track, in successive frames, the image of the target item detected in step 202. Presenting the to-be-presented information is only meaningful if the image of the target item appears in multiple consecutive frames. Choosing frames in which the image of the target item persists beyond a certain threshold for placement gives the user enough time to click the to-be-presented information, such as an advertisement, and also effectively reduces the amount of presented information so as not to affect the user's viewing experience. Clicking the information entry takes the user to the item webpage corresponding to the to-be-presented information. Tracking algorithms such as tracking-learning-detection (TLD) can be used to track the image of the target item.
In some optional implementations of the present embodiment, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than the predetermined frame count comprises: using a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames following the key frame; and if so, counting those frames and determining whether their number exceeds the predetermined frame count. Compressive tracking is a simple and efficient tracking algorithm based on compressed sensing. A random measurement matrix satisfying the restricted isometry property (RIP) condition of compressed sensing is first used to reduce the dimensionality of multi-scale image features, and a simple naive Bayes classifier then classifies the reduced features. It follows the general image-classification framework of first extracting image features and then classifying them; the difference is that feature extraction here uses compressed sensing and the classifier is naive Bayes. The classifier is then updated every frame by online learning.
The compressive tracking algorithm proceeds as follows:

(1) At frame t, image patches of the target (positive samples) and of the background (negative samples) are sampled and multi-scale transforms are applied to them; the multi-scale image features are then reduced in dimensionality by a sparse measurement matrix, and a naive Bayes classifier is trained on the reduced features (covering both target and background, i.e., a two-class problem).

(2) At frame t+1, n scanning windows are sampled around the target position tracked in the previous frame (avoiding a scan of the entire image); their features are extracted and reduced by the same sparse measurement matrix and then classified with the naive Bayes classifier trained at frame t, and the window with the highest classification score is taken as the target window. Tracking of the target from frame t to frame t+1 is thereby achieved.
The construction of the compressed vector is shown in Fig. 3a, which shows an n × m sparse matrix that transforms an x (m-dimensional) in the high-dimensional image space into a low-dimensional space v (n-dimensional); mathematically, v = Rx. In the matrix R, 301, 303 and 302 represent negative, positive and zero matrix elements, respectively. An arrow indicates that a nonzero element in a row of the measurement matrix R perceives one element of x, which is equivalent to convolving a rectangular window filter with the gray level of a fixed position of the input image.

The sparse random matrix R above projects x to v in the low-dimensional space. This random matrix R only needs to be computed once, when the program starts, and then remains unchanged during tracking. By means of integral images, v can be computed efficiently.
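A minimal Python sketch of such a sparse random measurement matrix follows. The entry distribution (±sqrt(s) with probability 1/(2s) each, 0 otherwise, a very-sparse random projection commonly paired with compressive tracking) is an assumption, since the text does not fix the exact distribution:

```python
import numpy as np

def sparse_measurement_matrix(n, m, s=3, seed=0):
    """Build an n x m very-sparse random projection R: each entry is
    +sqrt(s) with probability 1/(2s), -sqrt(s) with probability 1/(2s),
    and 0 otherwise, so most entries of R are zero."""
    rng = np.random.default_rng(seed)
    u = rng.random((n, m))
    R = np.zeros((n, m))
    R[u < 1 / (2 * s)] = np.sqrt(s)
    R[u > 1 - 1 / (2 * s)] = -np.sqrt(s)
    return R

# v = R @ x projects an m-dimensional feature x to an n-dimensional v;
# R is built once at start-up and reused unchanged for every frame.
```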
The classifier is built as follows: for each sample z (an m-dimensional vector), its low-dimensional representation is v (an n-dimensional vector, with n much smaller than m). Assuming that the elements of v are independently distributed, they can be modeled with a naive Bayes classifier:

H(v) = log( ∏_i p(v_i | y = 1) p(y = 1) / ∏_i p(v_i | y = 0) p(y = 0) ) = Σ_i log( p(v_i | y = 1) / p(v_i | y = 0) )

Here H(v) is the classifier and y ∈ {0, 1} is the sample label, with y = 0 indicating a negative sample and y = 1 a positive sample; the prior probabilities of the two classes are assumed equal, p(y = 1) = p(y = 0) = 0.5. The conditional probabilities p(v_i | y = 1) and p(v_i | y = 0) in the classifier H(v) are likewise assumed Gaussian, with means and standard deviations (μ_i^1, σ_i^1) and (μ_i^0, σ_i^0), respectively. To adapt to long-term tracking, the model must be updated continuously, i.e., the means and variances of the positive and negative samples are recomputed from the newly detected samples and updated as follows:

μ_i^1 ← λ μ_i^1 + (1 − λ) μ^1 (formula 4)

σ_i^1 ← sqrt( λ (σ_i^1)^2 + (1 − λ) (σ^1)^2 + λ (1 − λ) (μ_i^1 − μ^1)^2 ) (formula 5)

where μ^1 and σ^1 are the mean and standard deviation estimated from the newly detected positive samples (the negative-sample parameters are updated analogously). In formulas 4 and 5, λ > 0 is a learning factor; in practice, to avoid the accumulation of errors, the present application takes λ = 0.85.
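Under the stated assumptions (equal class priors, per-feature Gaussian conditionals, learning factor λ), the classifier score and the online update can be sketched as follows; this is an illustration, not the patent's implementation:

```python
import numpy as np

def nb_score(v, mu1, sig1, mu0, sig0, eps=1e-30):
    """H(v) = sum_i log(p(v_i|y=1) / p(v_i|y=0)) with Gaussian
    conditionals and equal priors; positive scores favor the target."""
    def gauss(v, mu, sig):
        return np.exp(-(v - mu) ** 2 / (2 * sig ** 2)) / (sig * np.sqrt(2 * np.pi))
    return float(np.sum(np.log((gauss(v, mu1, sig1) + eps)
                               / (gauss(v, mu0, sig0) + eps))))

def online_update(mu, sig, mu_new, sig_new, lam=0.85):
    """Blend the stored Gaussian parameters with those estimated from
    newly detected samples, using learning factor lam (0 < lam < 1)."""
    sig_upd = np.sqrt(lam * sig ** 2 + (1 - lam) * sig_new ** 2
                      + lam * (1 - lam) * (mu - mu_new) ** 2)
    mu_upd = lam * mu + (1 - lam) * mu_new
    return mu_upd, sig_upd
```

At each frame, the scanning window with the highest nb_score is taken as the target, and its patch supplies the new positive-sample statistics fed back into online_update.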
Step 204: if greater than the predetermined number of frames, obtain information to be presented that matches the image of the target item, and present the information to be presented in the frames in which the image of the target item is continuously presented.

In the present embodiment, based on the detection of the target item image in step 202 and the tracking in step 203, the type, trajectory, number of frames of appearance and duration of the target item, among other information, can be extracted from the target video. This information helps realize personalized recommendation of information to the user. The information to be presented is matched in a preset library of information to be presented, and a new frame is composed from the information to be presented and the frame presenting the image of the target item, by modifying the frame data or by superposition, so that the information to be presented is presented in the newly generated frame. The information to be presented may be text or a picture linked to a webpage. As shown in Fig. 3b, the target item "T-shirt" 304 is detected in a key frame of the target video; a picture 305 associated with "T-shirt" that can link to a webpage is matched from the preset library of information to be presented and is presented in the key frame. After the user clicks picture 305, the related webpage can be entered to browse information associated with "T-shirt". The target item "shoes" 306 is also detected in the key frame; a picture 307 associated with "shoes" that can link to a webpage is matched from the preset library of information to be presented and is presented in the key frame. After the user clicks picture 307, the related webpage can be entered to browse information associated with "shoes".
In some optional implementations of the present embodiment, presenting the information to be presented in the frames in which the image of the target item is continuously presented comprises: determining position information of the image of the target item within those frames; determining a presentation position of the information to be presented according to the position information; and presenting the information to be presented at the presentation position. The presentation position may be near the image of the target item, or at another position that does not occlude the image of the target item. The presentation position may also be determined according to the size of the image of the target item: for example, if the target item is a pair of shoes and the information to be presented is a shoe advertisement occupying an area larger than the shoe image itself, the advertisement should be placed beside the shoe image rather than pasted onto it; if the target item is a wardrobe, whose image is large, it is suitable to superimpose the information to be presented directly on the wardrobe image.
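The size-dependent placement rule described above (superimpose on large items, place beside small ones) can be sketched as follows; the 20% area threshold and the "place to the right" choice are illustrative assumptions:

```python
def overlay_position(frame_w, frame_h, box, area_ratio=0.2):
    """Given the target item's bounding box (x, y, w, h), return where to
    render the information to be presented: directly on top of a large
    item (e.g. a wardrobe), or beside a small one (e.g. shoes)."""
    x, y, w, h = box
    if w * h >= area_ratio * frame_w * frame_h:
        return ("overlay", x, y)
    # small item: place the ad just to its right, clamped into the frame
    return ("beside", min(x + w, frame_w - 1), y)
```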
The method provided by the above embodiment of the present application associates the content of the target video with the information to be presented, thereby realizing targeted information presentation and improving the hit rate of the information to be presented.
With further reference to Fig. 4, a flow 400 of another embodiment of the information presentation method is illustrated. The flow 400 of the information presentation method comprises the following steps:

Step 401: detect a key frame in the target video.

Step 402: in response to detecting the key frame, detect the image of a target item from the key frame.

Step 403: in response to detecting the image of the target item from the key frame, determine whether the number of frames in which the image of the target item is continuously presented after the key frame is greater than a predetermined number of frames.

Steps 401-403 are essentially identical to steps 201-203 and are therefore not described again.
Step 404: if greater than the predetermined number of frames, obtain a set of information to be presented.

In the present embodiment, when the number of frames determined in step 403 is greater than the predetermined number of frames, information to be presented with high similarity to the image of the target item is matched from the preset library of information to be presented. The information to be presented may include a picture.

Step 405: determine the similarity between the picture in each piece of information to be presented in the set of information to be presented and the image of the target item.
In the present embodiment, if the information to be presented includes a picture, the similarity between the histogram of the picture and the histogram of the image of the target item can be determined. Respective histogram data are first generated from the pixel data of the target item image and of the picture in the information to be presented; the histogram data are normalized and then compared with the Bhattacharyya coefficient algorithm, finally yielding an image similarity value in the range [0, 1], where 0 indicates completely different and 1 indicates identical.
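The histogram comparison can be sketched with the Bhattacharyya coefficient as follows; this is a minimal illustration, with the bin count and 8-bit grayscale range as assumptions:

```python
import numpy as np

def histogram_similarity(img_a, img_b, bins=32):
    """Bhattacharyya coefficient between the normalized gray-level
    histograms of two images: 1.0 for identical distributions,
    0.0 for distributions with no overlap."""
    ha, _ = np.histogram(img_a, bins=bins, range=(0, 256))
    hb, _ = np.histogram(img_b, bins=bins, range=(0, 256))
    ha = ha / max(ha.sum(), 1)
    hb = hb / max(hb.sum(), 1)
    return float(np.sum(np.sqrt(ha * hb)))
```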
In some optional implementations of the present embodiment, if the information to be presented includes text information, text information matching the category of the image of the target item is obtained. The category is determined from keywords in the text information and is matched against the category of the image of the target item to obtain a similarity. For example, for the text "XX sneakers, price 299 yuan", the similarity with the image of the target item "sneakers" may reach 90%; the similarity between the image of the target item "sneakers" and the text "XX leather shoes, price 299 yuan" may reach 70%; and the similarity between the image of the target item "sneakers" and the text "XX basketball, price 299 yuan" may be only 10%.

Step 406: choose at least one piece of information to be presented from the set of information to be presented in descending order of similarity.

In the present embodiment, at least one piece of information to be presented is chosen based on the similarity determined in step 405. The number of pieces chosen may be proportional to the size of the image of the target item: an image occupying a larger area can show more pieces of information to be presented, while an image occupying a smaller area preferably shows only one, so that the presented information does not overshadow the video itself.

In some optional implementations of the present embodiment, obtaining the information to be presented that matches the image of the target item comprises: obtaining the category label of the user watching the target video through a terminal, wherein the category label of the user is obtained by big-data analysis of the user's behavioral data; and obtaining, from the set of information to be presented, at least one piece of information to be presented that matches the category label of the user. That is, the information to be presented is further filtered based on the user's personal characteristics, so that it is chosen specifically for the user. For example, if big-data analysis determines that the user watching the target video is a woman, information about women's items may be chosen as the information to be presented.
By establishing a recommendation model over combinations of user, information to be presented and target item image, the click-through rate (CTR) of the information to be presented can be effectively predicted, and the information to be presented with the highest estimated click-through rate is pushed, thereby improving the conversion rate of delivered information. The features of the recommendation model mainly comprise three kinds: user features, features of the item involved in the information to be presented, and features of the image of the target item detected from the target video. The user features mainly include information obtainable from the user's big-data profile, such as age, gender, region, occupation and viewing platform. The features of the item involved in the information to be presented mainly include the type and price of the item, its place of origin (or the seller's location), and the historical click-through rate of the information to be presented. The features of the image of the target item mainly include the similarity between the image of the target item detected in the target video and the item involved in the information to be presented, and the duration for which the image of the target item appears in the target video.
The processing of these features mainly includes discretization and feature crossing.

(1) Discretization

The features of the recommendation model mainly include the three kinds discussed above. The initial features comprise discrete features (such as user gender and user region) and continuous features (such as item price, user age, the similarity between the target item image and the item involved in the information to be presented, and the click-through rate of the information to be presented). Although click-through rate and age are both continuous values, their meanings differ: comparing ages by magnitude is meaningless for recommending information to be presented, whereas comparing click-through rates by magnitude is meaningful; the above features therefore need to be discretized.

Discretization proceeds as follows: continuous features are divided into segments. For example, the click-through rate ctr is divided into 10 segments; if ctr = 0.05, the corresponding feature bit is set to 1. Other kinds of features are processed similarly.
(2) Feature crossing

After discretization, the processed features can be flattened into a single vector as the final feature. However, this yields a linear model and ignores interactions between features; for example, the combination of gender and item type has a very direct influence on the click-through rate of the information to be presented. Crossing features can therefore effectively improve the prediction accuracy of the model. Feature crossing combines two features to form a new feature: for example, combining gender with item category (m classes) generates 2m discrete features.
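The two feature transformations can be sketched as follows; the equal-width buckets and one-hot encoding are illustrative assumptions:

```python
def bucketize(value, n_buckets, lo=0.0, hi=1.0):
    """Discretize a continuous feature (e.g. a CTR in [0, 1]) into a
    one-hot vector over n_buckets equal-width segments."""
    idx = min(int((value - lo) / (hi - lo) * n_buckets), n_buckets - 1)
    one_hot = [0] * n_buckets
    one_hot[idx] = 1
    return one_hot

def cross(feat_a, feat_b):
    """Cross two one-hot features: gender (2 slots) x item category
    (m slots) yields 2m new discrete feature slots."""
    return [a * b for a in feat_a for b in feat_b]
```

Concatenating the discretized and crossed segments then gives the final sparse feature vector fed to the model.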
Suppose the discrete feature vector formed in the present application is x, with dimension 113: x1-x10 are the user-age segment; x11-x18 the user-region segment; x19-x25 the user-occupation segment; x26-x30 the viewing-platform segment; x31-x38 the item-category segment; x39-x50 the item-price segment; x51-x58 the item-origin segment; x59-x60 the item click-through-rate segment; x61-x65 the detected-target-duration segment; x66-x75 the detected-target/advertised-item-similarity segment; x76-x91 the item-category × user-gender cross-feature segment; and x92-x113 the user-gender × item-price cross-feature segment.
Information to be presented is recommended based on a logistic regression model. Logistic regression (LR) is an algorithm widely used in advertisement recommendation. Let the training dataset be D = (x_1, y_1), (x_2, y_2), ..., (x_N, y_N), where x_i is the constructed feature vector and y_i ∈ {0, 1} indicates whether the advertisement was clicked, with 1 for clicked and 0 for not clicked.

The basic assumption of LR is that the conditional probability P(y = 1 | x; θ) satisfies the following expression:

P(y = 1 | x; θ) = g(θ^T x) = 1 / (1 + e^(−θ^T x)) (formula 6)

Here g(θ^T x) is the sigmoid function, x is the feature vector and θ is the parameter vector; the corresponding decision function is:

y* = 1, if P(y = 1 | x) > 0.5 (formula 7)

After the mathematical form of the model is determined, the next step is to solve for the parameters of the model. Maximum likelihood estimation is used, i.e., a set of parameters is sought under which the likelihood (probability) of the data is largest. In the logistic regression model, the likelihood L(θ) can be expressed as:

L(θ) = P(D | θ) = ∏ P(y | x; θ) = ∏ g(θ^T x)^y (1 − g(θ^T x))^(1−y) (formula 8)

Taking the logarithm yields the log-likelihood l(θ):

l(θ) = ∑ [ y log g(θ^T x) + (1 − y) log(1 − g(θ^T x)) ] (formula 9)
In the LR model, maximizing the above likelihood function yields the optimal parameters. The present application solves for the parameters by iterative gradient descent, approaching the optimal value by adjusting the parameters at each step in the direction in which the objective function changes fastest.
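A minimal sketch of fitting θ by gradient ascent on the log-likelihood l(θ) (equivalently, gradient descent on −l(θ)); the learning rate and iteration count are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_lr(X, y, lr=0.5, epochs=2000):
    """Maximize l(theta) = sum[y log g + (1 - y) log(1 - g)]; the
    gradient is X^T (y - g(X theta)), so theta steps along it."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        theta += lr * X.T @ (y - sigmoid(X @ theta)) / len(y)
    return theta

def predict_ctr(theta, x):
    """Estimated click-through rate P(y = 1 | x; theta), formula 6."""
    return float(sigmoid(x @ theta))
```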
After model training is completed, a recommender system for recommending information to be presented is obtained. Click-through-rate prediction is performed on a predetermined number of pieces of information to be presented retrieved from the library of information to be presented, and the piece with the highest estimated click-through rate is selected for presentation.

As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the information presentation method in the present embodiment highlights the step of selecting the information to be presented. The information to be presented can thereby be selected accurately, its hit rate is improved, effective information is presented as far as possible, and the cost of delivering the information to be presented is reduced.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an information presentation apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied to various electronic devices.

As shown in Fig. 5, the information presentation apparatus 500 of the present embodiment includes: a key frame detection unit 501, an image detection unit 502, a determination unit 503 and a display unit 504. The key frame detection unit 501 is configured to detect a key frame in the target video, wherein the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold; the image detection unit 502 is configured to, in response to detecting the key frame, detect the image of a target item from the key frame; the determination unit 503 is configured to, in response to detecting the image of the target item from the key frame, determine whether the number of frames in which the image of the target item is continuously presented after the key frame is greater than a predetermined number of frames; and the display unit 504 is configured to, if the number is greater than the predetermined number of frames, obtain information to be presented that matches the image of the target item and present the information to be presented in the frames in which the image of the target item is continuously presented.

In the present embodiment, for the specific processing of the key frame detection unit 501, the image detection unit 502, the determination unit 503 and the display unit 504 of the information presentation apparatus 500, reference may be made to steps 201, 202, 203 and 204 in the embodiment corresponding to Fig. 2.
In some optional implementations of the present embodiment, the key frame detection unit 501 is further configured to: obtain a frame whose image entropy is greater than the preset image entropy threshold as a key frame; obtain, according to the playing order of the target video, the first frame after the key frame whose image entropy is greater than the preset image entropy threshold; determine whether the similarity between that first frame and the key frame is less than a preset similarity threshold; and if it is less than the preset similarity threshold, determine that first frame to be a key frame.
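The image-entropy criterion applied by the key frame detection unit can be sketched as follows; histogram binning over 8-bit gray levels is an assumption, since the patent does not specify how the entropy is computed:

```python
import numpy as np

def image_entropy(gray, bins=256):
    """Shannon entropy (bits) of a grayscale image's intensity histogram;
    flat frames score near 0, information-rich frames score higher."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def is_key_frame(gray, entropy_threshold):
    """A frame qualifies as a key frame candidate when its entropy
    exceeds the preset image entropy threshold."""
    return image_entropy(gray) > entropy_threshold
```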
In some optional implementations of the present embodiment, the image detection unit 502 is further configured to detect the image of the target item from the key frame based on a pre-trained convolutional neural network, wherein the convolutional neural network is used to identify image features of the target item and to determine the image of the target item from those image features.

In some optional implementations of the present embodiment, the determination unit 503 is further configured to: determine, using a compressive tracking algorithm, whether the image of the target item is continuously presented in the frames after the key frame; and if so, accumulate the number of frames in which the image of the target item is continuously presented and determine whether that number is greater than the predetermined number of frames.

In some optional implementations of the present embodiment, the display unit 504 is further configured to: determine position information of the image of the target item in the frames in which the image of the target item is continuously presented; determine the presentation position of the information to be presented according to the position information; and present the information to be presented at the presentation position.

In some optional implementations of the present embodiment, the display unit 504 is further configured to: obtain a set of information to be presented, wherein the information to be presented includes a picture; determine the similarity between the picture in each piece of information to be presented in the set and the image of the target item; and choose at least one piece of information to be presented from the set in descending order of similarity.

In some optional implementations of the present embodiment, the information to be presented includes text information; and the display unit 504 is further configured to obtain text information matching the category of the image of the target item.

In some optional implementations of the present embodiment, the display unit 504 is further configured to: obtain the category label of the user watching the target video through a terminal, wherein the category label of the user is obtained by big-data analysis of the user's behavioral data; and obtain, from the set of information to be presented, at least one piece of information to be presented that matches the category label of the user.
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 suitable for implementing a device of the embodiments of the present application is shown. The device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode-ray tube (CRT), a liquid crystal display (LCD), a speaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the methods of the present application are performed. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above.

More specific examples of the computer-readable storage medium may include but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction-execution system, apparatus or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium can send, propagate or transmit a program for use by or in connection with an instruction-execution system, apparatus or device. The program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each box in the flowcharts or block diagrams may represent a module, a program segment or a part of code, which contains one or more executable instructions for realizing the specified logical functions. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings; for example, two boxes shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including a key frame detection unit, an image detection unit, a determination unit and a display unit. The names of these units do not in some cases limit the units themselves; for example, the key frame detection unit may also be described as "a unit for detecting a key frame in a target video".
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: detect a key frame in a target video, wherein the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold; in response to detecting the key frame, detect the image of a target item from the key frame; in response to detecting the image of the target item from the key frame, determine whether the number of frames in which the image of the target item is continuously presented after the key frame is greater than a predetermined number of frames; and if greater than the predetermined number of frames, obtain information to be presented that matches the image of the target item, and present the information to be presented in the frames in which the image of the target item is continuously presented.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover, without departing from the inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, such as technical solutions formed by mutually replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (18)

1. An information presentation method, characterized in that the method comprises:
detecting a key frame in a target video, wherein the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold;
in response to detecting the key frame, detecting an image of a target item from the key frame;
in response to detecting the image of the target item from the key frame, determining whether the number of frames in which the image of the target item is continuously presented after the key frame is greater than a predetermined number of frames;
if greater than the predetermined number of frames, obtaining information to be presented that matches the image of the target item, and presenting the information to be presented in the frames in which the image of the target item is continuously presented;
wherein the obtaining information to be presented that matches the image of the target item comprises:
predicting the click-through rate of the information to be presented by establishing a recommendation model over combinations of user, information to be presented and target item image, and pushing the information to be presented with the highest estimated click-through rate.
2. The method according to claim 1, characterized in that the detecting a key frame in a target video comprises:
obtaining a frame whose image entropy is greater than the preset image entropy threshold as a key frame;
obtaining, according to the playing order of the target video, the first frame after the key frame whose image entropy is greater than the preset image entropy threshold;
determining whether the similarity between the first frame and the key frame is less than a preset similarity threshold;
if less than the preset similarity threshold, determining the first frame to be a key frame.
3. The method according to claim 1, characterized in that detecting the image of the target item from the key frame comprises:
detecting the image of the target item from the key frame based on a pre-trained convolutional neural network, wherein the convolutional neural network is used to identify image features of the target item and to determine the image of the target item from the image features.
4. The method according to claim 1, wherein the determining whether the number of frames in which the image of the target item is continuously presented after the key frame is greater than a predetermined frame number comprises:
determining, using a compressive tracking algorithm, whether the image of the target item is continuously presented in the respective frames after the key frame; and
if continuously presented, accumulating the number of frames in which the image of the target item is continuously presented, and determining whether the number of frames is greater than the predetermined frame number.
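The counting logic of claim 4 reduces to the loop below; `item_present` is a placeholder for the per-frame re-detection that the compressive tracking algorithm would perform.

```python
def count_consecutive_presentations(frames_after_key, item_present):
    """Count how many consecutive frames after the key frame still contain
    the target item; stop at the first frame where tracking loses it."""
    count = 0
    for frame in frames_after_key:
        if not item_present(frame):
            break
        count += 1
    return count

def exceeds_frame_number(frames_after_key, item_present, predetermined):
    """True if the item persists for more than the predetermined number of frames."""
    return count_consecutive_presentations(frames_after_key, item_present) > predetermined
```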
5. The method according to claim 1, wherein the presenting the information to be presented in the frames in which the image of the target item is continuously presented comprises:
determining position information of the image of the target item in the frames in which the image of the target item is continuously presented;
determining a presentation position of the information to be presented according to the position information; and
presenting the information to be presented at the presentation position.
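One plausible reading of claim 5's positioning step: derive the overlay position from the item's bounding box while keeping the overlay inside the frame. The below-the-item placement policy and the clamping rules are assumptions, not something the claim specifies.

```python
def presentation_position(item_box, info_size, frame_size):
    """Place the info box directly below the item's bounding box, clamped so
    it stays inside the frame; fall back to above the item if there is no
    room below. Boxes are (x, y, w, h); sizes are (w, h)."""
    x, y, w, h = item_box
    iw, ih = info_size
    fw, fh = frame_size
    px = min(max(x, 0), fw - iw)   # align with the item, clamp horizontally
    py = y + h                     # preferred slot: just below the item
    if py + ih > fh:               # no room below -> place above instead
        py = y - ih
    return (px, max(py, 0))
```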
6. The method according to any one of claims 1-5, wherein the obtaining information to be presented that matches the image of the target item comprises:
obtaining a set of information to be presented, wherein the information to be presented includes pictures;
determining the similarity between the picture in each piece of information to be presented in the set and the image of the target item; and
selecting at least one piece of information to be presented from the set in descending order of similarity.
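Claim 6's similarity ranking might look like the following, assuming the pictures and the item image have already been reduced to feature vectors and cosine similarity is the chosen measure (the claim leaves the measure open).

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two nonzero feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_by_similarity(item_feat, candidates, k=1):
    """Rank candidate info pictures by similarity to the target-item image
    and return the indices of the top k, in descending order."""
    sims = [cosine_similarity(item_feat, c) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: sims[i], reverse=True)
    return order[:k]
```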
7. The method according to claim 1, wherein the information to be presented includes text information; and
the obtaining information to be presented that matches the image of the target item comprises:
obtaining text information that matches the category of the image of the target item.
8. The method according to claim 1, wherein the obtaining information to be presented that matches the image of the target item comprises:
obtaining a category label of a user watching the target video through a terminal, wherein the category label of the user is obtained by performing big-data analysis on behavioral data of the user; and
obtaining, from a set of information to be presented, at least one piece of information to be presented that matches the category label of the user.
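Claim 8's matching step, sketched with each candidate piece of information modeled as a dictionary carrying a `category` field; the field name and data shape are assumptions for illustration only.

```python
def match_by_user_tags(user_tags, candidates):
    """Filter the set of information to be presented down to the entries
    whose category label appears among the user's category labels."""
    tags = set(user_tags)
    return [c for c in candidates if c.get("category") in tags]
```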
9. An information presentation device, wherein the device comprises:
a key frame detection unit, configured to detect a key frame in a target video, wherein the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold;
an image detection unit, configured to detect an image of a target item from the key frame in response to detecting the key frame;
a determination unit, configured to determine, in response to detecting the image of the target item from the key frame, whether the number of frames in which the image of the target item is continuously presented after the key frame is greater than a predetermined frame number; and
a display unit, configured to, if greater than the predetermined frame number, obtain information to be presented that matches the image of the target item, and present the information to be presented in the frames in which the image of the target item is continuously presented;
wherein the obtaining information to be presented that matches the image of the target item comprises:
predicting click-through rates of candidate information to be presented by establishing a recommendation model over combinations of the user, the information to be presented, and the image of the target item, and pushing the information to be presented with the highest predicted click-through rate.
10. The device according to claim 9, wherein the key frame detection unit is further configured to:
acquire, as a key frame, a frame whose image entropy is greater than a preset image entropy threshold;
acquire, according to the playing sequence of the target video, a first frame after the key frame whose image entropy is greater than the preset image entropy threshold;
determine whether the similarity between the first frame and the key frame is less than a preset similarity threshold; and
if less than the preset similarity threshold, determine that the first frame is a key frame.
11. The device according to claim 9, wherein the image detection unit is further configured to:
detect the image of the target item from the key frame based on a pre-trained convolutional neural network, wherein the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to the image features.
12. The device according to claim 9, wherein the determination unit is further configured to:
determine, using a compressive tracking algorithm, whether the image of the target item is continuously presented in the respective frames after the key frame; and
if continuously presented, accumulate the number of frames in which the image of the target item is continuously presented, and determine whether the number of frames is greater than the predetermined frame number.
13. The device according to claim 9, wherein the display unit is further configured to:
determine position information of the image of the target item in the frames in which the image of the target item is continuously presented;
determine a presentation position of the information to be presented according to the position information; and
present the information to be presented at the presentation position.
14. The device according to any one of claims 9-13, wherein the display unit is further configured to:
obtain a set of information to be presented, wherein the information to be presented includes pictures;
determine the similarity between the picture in each piece of information to be presented in the set and the image of the target item; and
select at least one piece of information to be presented from the set in descending order of similarity.
15. The device according to claim 9, wherein the information to be presented includes text information; and
the display unit is further configured to:
obtain text information that matches the category of the image of the target item.
16. The device according to claim 9, wherein the display unit is further configured to:
obtain a category label of a user watching the target video through a terminal, wherein the category label of the user is obtained by performing big-data analysis on behavioral data of the user; and
obtain, from a set of information to be presented, at least one piece of information to be presented that matches the category label of the user.
17. An information presenting device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-8.
18. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-8.
CN201710152564.0A 2017-03-15 2017-03-15 Information demonstrating method and device Active CN108629224B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710152564.0A CN108629224B (en) 2017-03-15 2017-03-15 Information demonstrating method and device
PCT/CN2018/072285 WO2018166288A1 (en) 2017-03-15 2018-01-11 Information presentation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710152564.0A CN108629224B (en) 2017-03-15 2017-03-15 Information demonstrating method and device

Publications (2)

Publication Number Publication Date
CN108629224A CN108629224A (en) 2018-10-09
CN108629224B true CN108629224B (en) 2019-11-05

Family

ID=63522608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710152564.0A Active CN108629224B (en) 2017-03-15 2017-03-15 Information demonstrating method and device

Country Status (2)

Country Link
CN (1) CN108629224B (en)
WO (1) WO2018166288A1 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125501B (en) * 2018-10-31 2023-07-25 北京字节跳动网络技术有限公司 Method and device for processing information
CN109495784A (en) * 2018-11-29 2019-03-19 北京微播视界科技有限公司 Information-pushing method, device, electronic equipment and computer readable storage medium
CN111683267A (en) * 2019-03-11 2020-09-18 阿里巴巴集团控股有限公司 Method, system, device and storage medium for processing media information
CN110570318B (en) * 2019-04-18 2023-01-31 创新先进技术有限公司 Vehicle loss assessment method and device executed by computer and based on video stream
CN110311945B (en) * 2019-04-30 2022-11-08 上海掌门科技有限公司 Method and equipment for presenting resource pushing information in real-time video stream
CN110177250A (en) * 2019-04-30 2019-08-27 上海掌门科技有限公司 A kind of method and apparatus for the offer procurement information in video call process
CN110189242B (en) * 2019-05-06 2023-04-11 阿波罗智联(北京)科技有限公司 Image processing method and device
CN110610510B (en) * 2019-08-29 2022-12-16 Oppo广东移动通信有限公司 Target tracking method and device, electronic equipment and storage medium
CN110853124B (en) * 2019-09-17 2023-09-08 Oppo广东移动通信有限公司 Method, device, electronic equipment and medium for generating GIF dynamic diagram
CN110764726B (en) * 2019-10-18 2023-08-22 网易(杭州)网络有限公司 Target object determination method and device, terminal equipment and storage medium
CN112749326B (en) * 2019-11-15 2023-10-03 腾讯科技(深圳)有限公司 Information processing method, information processing device, computer equipment and storage medium
CN110941594B (en) * 2019-12-16 2023-04-18 北京奇艺世纪科技有限公司 Splitting method and device of video file, electronic equipment and storage medium
CN111079864A (en) * 2019-12-31 2020-04-28 杭州趣维科技有限公司 Short video classification method and system based on optimized video key frame extraction
CN111611417B (en) * 2020-06-02 2023-09-01 Oppo广东移动通信有限公司 Image de-duplication method, device, terminal equipment and storage medium
CN112085120B (en) * 2020-09-17 2024-01-02 腾讯科技(深圳)有限公司 Multimedia data processing method and device, electronic equipment and storage medium
CN113312951B (en) * 2020-10-30 2023-11-07 阿里巴巴集团控股有限公司 Dynamic video target tracking system, related method, device and equipment
CN113763098A (en) * 2020-12-21 2021-12-07 北京沃东天骏信息技术有限公司 Method and device for determining an item
CN113033475B (en) * 2021-04-19 2024-01-12 北京百度网讯科技有限公司 Target object tracking method, related device and computer program product
CN113766330A (en) * 2021-05-26 2021-12-07 腾讯科技(深圳)有限公司 Method and device for generating recommendation information based on video
CN114640863A (en) * 2022-03-04 2022-06-17 广州方硅信息技术有限公司 Method, system and device for displaying character information in live broadcast room and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810711A (en) * 2014-03-03 2014-05-21 郑州日兴电子科技有限公司 Keyframe extracting method and system for monitoring system videos
CN104715023A (en) * 2015-03-02 2015-06-17 北京奇艺世纪科技有限公司 Commodity recommendation method and system based on video content
CN105282573A (en) * 2014-07-24 2016-01-27 腾讯科技(北京)有限公司 Embedded information processing method, client side and server
CN105679017A (en) * 2016-01-27 2016-06-15 福建工程学院 Slight traffic accident assistant evidence collection method and system
CN105872588A (en) * 2015-12-09 2016-08-17 乐视网信息技术(北京)股份有限公司 Method and device for loading advertisement in video

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100355382B1 (en) * 2001-01-20 2002-10-12 삼성전자 주식회사 Apparatus and method for generating object label images in video sequence


Also Published As

Publication number Publication date
CN108629224A (en) 2018-10-09
WO2018166288A1 (en) 2018-09-20

Similar Documents

Publication Publication Date Title
CN108629224B (en) Information demonstrating method and device
US20200175550A1 (en) Method for identifying advertisements for placement in multimedia content elements
US10713794B1 (en) Method and system for using machine-learning for object instance segmentation
EP3267362B1 (en) Machine learning image processing
CN108446390B (en) Method and device for pushing information
US10902262B2 (en) Vision intelligence management for electronic devices
CN110390033B (en) Training method and device for image classification model, electronic equipment and storage medium
US20170004569A1 (en) Visually generated consumer product presentation
WO2021155691A1 (en) User portrait generating method and apparatus, storage medium, and device
CN112364204A (en) Video searching method and device, computer equipment and storage medium
JP6527275B1 (en) Harmonious search method based on harmony of multiple objects in image, computer apparatus and computer program
CN111709398A (en) Image recognition method, and training method and device of image recognition model
CN113766330A (en) Method and device for generating recommendation information based on video
CN112766284B (en) Image recognition method and device, storage medium and electronic equipment
WO2024002167A1 (en) Operation prediction method and related apparatus
CN113434716A (en) Cross-modal information retrieval method and device
CN113569129A (en) Click rate prediction model processing method, content recommendation method, device and equipment
US20130191323A1 (en) System and method for identifying the context of multimedia content elements displayed in a web-page
CN111292168B (en) Data processing method, device and equipment
CN110415009A (en) Computerized system and method for being modified in video
US20160171548A1 (en) Method for identifying advertisements for placement in multimedia content elements
CN115935049A (en) Recommendation processing method and device based on artificial intelligence and electronic equipment
CN115482021A (en) Multimedia information recommendation method and device, electronic equipment and storage medium
US11823217B2 (en) Advanced segmentation with superior conversion potential
US9558449B2 (en) System and method for identifying a target area in a multimedia content element

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant