CN110287363A - Resource pushing method, apparatus, device and storage medium based on deep learning - Google Patents

Resource pushing method, apparatus, device and storage medium based on deep learning

Info

Publication number
CN110287363A
CN110287363A
Authority
CN
China
Prior art keywords
information
deep learning
target object
audio
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910431276.8A
Other languages
Chinese (zh)
Inventor
陈步青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
Original Assignee
OneConnect Smart Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Smart Technology Co Ltd filed Critical OneConnect Smart Technology Co Ltd
Priority to CN201910431276.8A priority Critical patent/CN110287363A/en
Publication of CN110287363A publication Critical patent/CN110287363A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 - Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 - Querying
    • G06F16/635 - Filtering based on additional data, e.g. user or group profiles
    • G06F16/636 - Filtering based on additional data, e.g. user or group profiles by using biological or physiological data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 - Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 - Querying
    • G06F16/735 - Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Physiology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a resource pushing method, apparatus, device and storage medium based on deep learning, comprising: preprocessing first video frame information collected by a front-end camera and passing it to a first deep learning model, so that the first deep learning model identifies category information of a target object; obtaining the audio resource corresponding to the category information and sending it to the front-end device for playback; extracting a first facial feature of the target object from second video frame information collected by the front-end camera while the audio resource is playing, and passing the first facial feature to a second deep learning model, so that the second deep learning model identifies the age information of the target object; and obtaining corresponding audio-video resources by combining the category information and the age information of the target object, and sending them to the front-end device for playback. The invention solves the problem that existing advertisement pushing technology cannot target all customers or track customer feedback in real time.

Description

Resource pushing method, apparatus, device and storage medium based on deep learning
Technical field
The present invention relates to the field of information technology, and more particularly to a resource pushing method, apparatus, device, and storage medium based on deep learning.
Background Art
The widespread adoption of big data technology has become a key driver of technical progress and profit growth across industries, especially in advertisement pushing. However, existing advertisement pushing approaches mainly rely on the historical information retained for a customer to make decisions, which is quite limited: they cannot push advertisements to new customers, or to existing customers for whom no gender information or consumption records have been retained. Moreover, after an advertisement is delivered, customer feedback cannot be monitored in real time and the advertisement content cannot be adjusted accordingly, which easily leads to the loss of potential customers.
Therefore, finding an advertisement pushing method that targets all customers and follows up on customer feedback in real time has become an urgent technical problem for those skilled in the art.
Summary of the invention
Embodiments of the present invention provide a resource pushing method, apparatus, device, and storage medium based on deep learning, to solve the problem that existing resource pushing technology cannot target all customers or follow up on customer feedback in real time.
A resource pushing method based on deep learning, comprising:
obtaining first video frame information collected by a front-end camera, the first video frame information including a target object and its clothing information;
performing preprocessing on the first video frame information, and passing the preprocessed first video frame information as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object;
obtaining a corresponding audio resource according to the category information;
sending the audio resource to a front-end device, so that the front-end device receives and plays the audio resource;
obtaining second video frame information collected by the front-end camera while the audio resource is playing;
extracting a first facial feature of the target object from the second video frame information, and passing the first facial feature as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object;
obtaining corresponding audio-video resources by combining the category information and the age information of the target object;
sending the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
Optionally, after obtaining the first video frame information collected by the front-end camera, the method further includes:
performing face detection on the first video frame information to obtain the number of faces contained in the first video frame information;
if the number of faces is greater than or equal to 2, calculating the number of pixels of each face;
obtaining the face corresponding to the maximum number of pixels, and taking the portrait corresponding to that face as the target object.
Optionally, performing preprocessing on the first video frame information includes:
encoding the first video frame information according to the H.265 video coding standard;
converting the encoded first video frame information according to the RGB color mode to obtain a pixel matrix corresponding to each color channel.
Optionally, extracting the first facial feature of the target object from the second video frame information includes:
performing coarse-grained facial feature detection on the second video frame information;
if the coarse-grained facial feature detection detects the frontal face of the target object, performing fine-grained facial feature detection on the second video frame information to obtain the first facial feature of the target object.
Optionally, the method further includes:
obtaining third video frame information collected by the front-end camera while the audio-video resources are playing, and extracting a second facial feature of the target object from the third video frame information;
passing the second facial feature as input to a third deep learning model, so that the third deep learning model identifies the second facial feature to obtain emotion information of the target object;
if the emotion information is "not interested", obtaining corresponding audio-video resources again according to the category information and the age information of the target object;
sending the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
Optionally, the method further includes:
if the emotion information is "interested", sending a loop-play message to the front-end device, so that the front-end device plays the current audio-video resources in a loop.
A resource pushing apparatus based on deep learning, comprising:
a first video acquisition module, configured to obtain first video frame information collected by a front-end camera, the first video frame information including a target object and its clothing information;
a first deep learning module, configured to perform preprocessing on the first video frame information and pass the preprocessed first video frame information as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object;
a first advertisement acquisition module, configured to obtain a corresponding audio resource according to the category information;
a first sending module, configured to send the audio resource to a front-end device, so that the front-end device receives and plays the audio resource;
a second video acquisition module, configured to obtain second video frame information collected by the front-end camera while the audio resource is playing, and extract a first facial feature of the target object from the second video frame information;
a second deep learning module, configured to pass the first facial feature as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object;
a second advertisement acquisition module, configured to obtain corresponding audio-video resources by combining the category information and the age information of the target object;
a second sending module, configured to send the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
Optionally, the apparatus further includes:
a third video acquisition module, configured to obtain third video frame information collected by the front-end camera while the audio-video resources are playing, and extract a second facial feature of the target object from the third video frame information;
a third deep learning module, configured to pass the second facial feature as input to a third deep learning model, so that the third deep learning model identifies the second facial feature to obtain emotion information of the target object;
an advertisement re-acquisition module, configured to, if the emotion information is "not interested", obtain corresponding audio-video resources again according to the category information and the age information of the target object;
a third sending module, configured to send the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the above resource pushing method based on deep learning.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the above resource pushing method based on deep learning.
In the embodiment of the present invention, first video frame information collected by a front-end camera is obtained, the first video frame information including a target object and its clothing information; preprocessing is performed on the first video frame information, and the preprocessed first video frame information is passed as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object; a corresponding audio resource is obtained according to the category information; the audio resource is sent to a front-end device, so that the front-end device receives and plays the audio resource; second video frame information collected by the front-end camera while the audio resource is playing is obtained, and a first facial feature of the target object is extracted from the second video frame information; the first facial feature is passed as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object; corresponding audio-video resources are obtained by combining the category information and the age information of the target object; and the audio-video resources are sent to the front-end device, so that the front-end device receives and plays the audio-video resources. This embodiment enables comprehensive and more accurate advertisement pushing without requiring any prior knowledge of the customer, so advertisements can be pushed to both new and existing customers, which greatly improves the value of advertisement pushing.
Brief description of the drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flowchart of the resource pushing method based on deep learning in an embodiment of the present invention;
Fig. 2 is a flowchart of performing face detection on the first video frame information in the resource pushing method based on deep learning in an embodiment of the present invention;
Fig. 3 is a flowchart of performing preprocessing on the first video frame information in the resource pushing method based on deep learning in an embodiment of the present invention;
Fig. 4 is a flowchart of extracting the first facial feature of the target object from the second video frame information in the resource pushing method based on deep learning in an embodiment of the present invention;
Fig. 5 is another flowchart of the resource pushing method based on deep learning in an embodiment of the present invention;
Fig. 6 is a functional block diagram of the resource pushing apparatus based on deep learning in an embodiment of the present invention;
Fig. 7 is a schematic diagram of a computer device in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The resource pushing method based on deep learning provided in the embodiments of the present invention is described in detail below. In the embodiments of the present invention, the resource pushing method is applied to a system composed of a front-end device, a front-end camera, and a cloud server, where the front-end device and the front-end camera can connect to and communicate with the cloud server. In this embodiment, the cloud server is the execution subject. As shown in Fig. 1, the resource pushing method based on deep learning includes:
In step S101, first video frame information collected by the front-end camera is obtained, the first video frame information including a target object and its clothing information.
Here, the resource pushing method provided in the embodiment of the present invention is applied in sales settings such as shopping malls and bank branches, and the pushed resources are played through a front-end device deployed in the sales setting. The resources include but are not limited to commercial advertisements. The front-end device includes but is not limited to an LED screen, a projection screen, or a 3D display screen. A front-end camera is arranged on the front-end device. The embodiment of the present invention collects the first video frame information through the front-end camera to obtain the portrait of a person passing by the front-end device, i.e., the target object. The target object is the customer to whom resources are pushed.
In this embodiment of the invention, the target object is the portrait of a person passing by the front-end device. However, when multiple people pass by the front-end device at the same moment, one of the multiple portraits needs to be selected as the target object. Optionally, the embodiment of the present invention selects the portrait closest to the front-end device as the target object. As shown in Fig. 2, after obtaining the first video frame information collected by the front-end camera, the embodiment of the present invention performs face detection on the first video frame information, and the resource pushing method further includes:
In step S201, face detection is performed on the first video frame information to obtain the number of faces contained in the first video frame information.
Optionally, the embodiment of the present invention uses the GAMMA LAB face recognition algorithm to detect face images in the first video frame information, obtaining the number of faces in the video frame information and the position of each face within the first video frame information.
In step S202, if the number of faces is greater than or equal to 2, the number of pixels of each face is calculated.
After the number of faces is obtained, it is determined whether the number of faces is greater than or equal to 2. If so, the first video frame information contains multiple portraits, i.e., multiple customers pass by the front-end device at the same moment. Each face is marked according to its position in the image information, and the number of pixels of the face is calculated to obtain the size of each face.
In step S203, the face corresponding to the maximum number of pixels is obtained, and the portrait corresponding to that face is taken as the target object.
Here, the larger the number of pixels, the larger the face, and the closer the portrait corresponding to that face is to the front-end device. By comparing the numbers of pixels of the faces, the embodiment of the present invention selects the face with the maximum number of pixels and takes the portrait corresponding to that face as the target object. Since, within the same video frame information, the face image of a customer closer to the camera is larger, selecting the face with the maximum number of pixels yields the portrait closest to the front-end device, thereby completing the selection of the target object from the multiple customers contained in the video frame information.
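As a minimal sketch of this nearest-customer selection (steps S201 to S203), assuming a generic face detector that returns bounding boxes; the GAMMA LAB algorithm mentioned above is proprietary, so OpenCV's bundled Haar cascade is used here purely as an illustrative stand-in:

```python
import cv2

def select_target_face(frame):
    """Pick the face with the largest pixel area, i.e. the person
    presumed closest to the front-end device (steps S201-S203)."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None                      # no face detected: no target object
    # number of pixels covered by each detected face rectangle
    areas = [w * h for (x, y, w, h) in faces]
    x, y, w, h = faces[areas.index(max(areas))]
    return frame[y:y + h, x:x + w]       # crop of the target object's face
```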
In step S102, preprocessing is performed on the first video frame information, and the preprocessed first video frame information is passed as input to the first deep learning model, so that the first deep learning model identifies the clothing features of the target object to obtain the category information of the target object.
Here, the first deep learning model is a category identification model based on clothing features, obtained in advance by learning from a large number of labeled pictures. When identifying the category information, the embodiment of the present invention directly calls the first deep learning model. The category information is a consumption tier divided according to clothing features, including but not limited to business person, luxury-goods consumer, ordinary consumer, and non-potential consumer.
In the embodiment of the present invention, the video frame information collected by the camera is transmitted to the cloud server via the Real Time Streaming Protocol (RTSP). The cloud server receives the first video frame information and preprocesses it so that the first video frame information meets the input requirements of the first deep learning model. Optionally, as shown in Fig. 3, performing preprocessing on the first video frame information includes:
In step S301, the first video frame information is encoded according to the H.265 video coding standard.
Here, by encoding the first video frame information with H.265, the embodiment of the present invention improves the compression efficiency of the image information and reduces the real-time delay and channel acquisition time, which helps improve the timeliness of resource pushing.
In step S302, the encoded first video frame information is converted according to the RGB color mode to obtain a pixel matrix corresponding to each color channel.
The RGB color mode is an industry color standard that obtains various colors by varying and superimposing the three color channels red (R), green (G), and blue (B). The embodiment of the present invention converts the encoded first video frame information according to the RGB color mode to obtain the pixel matrix corresponding to each color channel, i.e., the pixel matrix of the red (R) channel, the pixel matrix of the green (G) channel, and the pixel matrix of the blue (B) channel.
When identifying the category information, the pixel matrices of the three channels red (R), green (G), and blue (B) are used as the input of the first deep learning model. The first deep learning model performs its computation on the pixel matrices of the three channels and obtains the category information of the target object.
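A minimal sketch of step S302 and the category inference in step S102, assuming the RTSP stream has already been decoded into an RGB frame and using a small PyTorch convolutional classifier as a stand-in for the first deep learning model; the architecture, input handling, and consumption-tier labels below are illustrative assumptions, not specified by the patent:

```python
import numpy as np
import torch
import torch.nn as nn

CATEGORIES = ["business person", "luxury-goods consumer",
              "ordinary consumer", "non-potential consumer"]  # assumed labels

class ClothingClassifier(nn.Module):
    """Stand-in for the 'first deep learning model' (category from clothing)."""
    def __init__(self, num_classes=len(CATEGORIES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):                # x: (N, 3, H, W) R/G/B pixel matrices
        return self.head(self.features(x).flatten(1))

def classify_frame(frame_rgb: np.ndarray, model: ClothingClassifier) -> str:
    """Split the decoded frame into the three R/G/B pixel matrices (step S302)
    and feed them to the classifier to obtain the category (step S102)."""
    x = torch.from_numpy(frame_rgb).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    with torch.no_grad():
        logits = model(x)
    return CATEGORIES[int(logits.argmax(dim=1))]
```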
In step S103, a corresponding audio resource is obtained according to the category information.
Here, the cloud server stores, through a preset database, one or more audio resources corresponding to each category. The audio resources include but are not limited to commercial audio advertisements. After the first deep learning model completes the identification of the clothing features and the category information of the target object is obtained, the preset database is queried according to the category information to obtain the corresponding one or more audio resources. An audio resource is an audio file carrying product promotion information.
In step S104, the audio resource is sent to the front-end device, so that the front-end device receives and plays the audio resource.
The one or more audio resources found for the category information can be sent to the front-end device for loop playback, using voice to announce recommendation information, promotion information, or inventory information of related products, so as to attract the attention of the target object.
For example, if the category information is business person, a voice prompt may announce a discount on business apparel; if the category information is luxury-goods consumer, a voice prompt may announce the arrival of a limited-edition luxury item; if the category information is ordinary consumer, a voice prompt may announce coupon information.
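A minimal sketch of this lookup and push (steps S103 and S104), assuming the preset database can be viewed as a simple mapping from category information to candidate audio resources; the table contents and helper names are illustrative:

```python
# Assumed preset database: category information -> candidate audio resources.
AUDIO_RESOURCES = {
    "business person":        ["ads/business_apparel_discount.mp3"],
    "luxury-goods consumer":  ["ads/limited_edition_arrival.mp3"],
    "ordinary consumer":      ["ads/coupon_promotion.mp3"],
    "non-potential consumer": [],
}

def get_audio_resources(category: str) -> list[str]:
    """Step S103: query the preset database by category information."""
    return AUDIO_RESOURCES.get(category, [])

def push_audio(category: str, send_to_frontend) -> None:
    """Step S104: send the matched audio resources to the front-end device."""
    for resource in get_audio_resources(category):
        send_to_frontend(resource)   # delivery mechanism (e.g. HTTP/RTSP) assumed
```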
In step S105, the second video frame information collected by the front-end camera while the audio resource is playing is obtained, and the first facial feature of the target object is extracted from the second video frame information.
Here, the front-end device plays the audio resource in order to attract the attention of the target object. When the target object is attracted, it will usually look toward the front-end device. While the front-end device plays the audio resource, the embodiment of the present invention obtains the video frame information collected by the front-end camera in real time; to distinguish it from the above first video frame information, it is denoted here as second video frame information. Whether a frontal face exists in the second video frame information is detected, so as to determine whether the target object is looking straight at the front-end device. Optionally, as shown in Fig. 4, extracting the first facial feature of the target object from the second video frame information in step S105 includes:
In step S401, coarse-grained facial feature detection is performed on the second video frame information.
Here, the coarse-grained facial features are used to determine whether a frontal face exists in the second video frame information. This embodiment extracts frontal-face information from the second video frame information through coarse-grained facial feature detection, mainly by detecting whether two eyes, a nose, and a mouth are present in the second video frame information. If they are, it is determined that a frontal face exists in the second video frame information, indicating that the target object is looking straight at the front-end device; if not, it is determined that no frontal face exists in the second video frame information, indicating that the target object is not looking straight at the front-end device.
In step S402, if the coarse-grained facial feature detection detects the frontal face of the target object, fine-grained facial feature detection is performed on the second video frame information to obtain the first facial feature of the target object.
Here, the fine-grained facial features are used to determine the age information of the target object. When the target object is looking straight at the front-end device, fine-grained facial feature detection is performed on the video frame information to obtain a 57-dimensional facial feature, including but not limited to interpupillary distance, mouth size, skin color, eyebrow size, eyebrow color, color of the temples, hair color, and hair length.
Optionally, if the frontal face of the target object is not obtained by coarse-grained facial feature detection within a preset time range, it indicates that the target object is not interested in the audio resource being played, and this round of facial feature detection is terminated.
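A minimal sketch of this two-stage check (steps S401 and S402), assuming a generic facial landmark detector whose hypothetical output keys (left_eye, mouth_top, and so on) stand in for the coarse frontal-face test, and sketching only a few of the 57 fine-grained dimensions:

```python
import numpy as np

def coarse_frontal_check(landmarks: dict) -> bool:
    """Coarse-grained check (step S401): a frontal face is assumed
    when both eyes, the nose and the mouth are all detected."""
    return all(landmarks.get(k) is not None
               for k in ("left_eye", "right_eye", "nose", "mouth_top", "mouth_bottom"))

def fine_grained_features(landmarks: dict, face_crop: np.ndarray) -> np.ndarray:
    """Fine-grained features (step S402): build a fixed-length vector such as
    interpupillary distance, mouth size, and mean skin color.  Only a few of
    the 57 dimensions described in the patent are sketched here."""
    le, re = np.asarray(landmarks["left_eye"]), np.asarray(landmarks["right_eye"])
    mt, mb = np.asarray(landmarks["mouth_top"]), np.asarray(landmarks["mouth_bottom"])
    feats = [
        np.linalg.norm(le - re),            # interpupillary distance
        np.linalg.norm(mt - mb),            # mouth opening size
        *face_crop.reshape(-1, 3).mean(0),  # mean R, G, B skin color
    ]
    return np.asarray(feats, dtype=np.float32)
```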
In step S106, the first facial feature is passed as input to the second deep learning model, so that the second deep learning model identifies the first facial feature to obtain the age information of the target object.
Here, the second deep learning model is an age-information identification model based on facial features, obtained in advance by learning from a large number of labeled pictures. When identifying the age information, this embodiment directly calls the second deep learning model. The age information is an age group divided according to facial features, including but not limited to elderly, middle-aged, young, and child.
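A minimal sketch of such an age-group classifier over the 57-dimensional facial feature vector, again using PyTorch purely as an illustrative assumption:

```python
import torch
import torch.nn as nn

AGE_GROUPS = ["elderly", "middle-aged", "young", "child"]  # assumed labels

class AgeClassifier(nn.Module):
    """Stand-in for the 'second deep learning model': 57-dimensional facial
    feature vector -> age group (step S106)."""
    def __init__(self, feature_dim=57, num_classes=len(AGE_GROUPS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 128), nn.ReLU(),
            nn.Linear(128, num_classes))

    def forward(self, x):
        return self.net(x)

def predict_age(features, model: AgeClassifier) -> str:
    x = torch.as_tensor(features, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        return AGE_GROUPS[int(model(x).argmax(dim=1))]
```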
In step S107, corresponding audio-video resources are obtained by combining the category information and the age information of the target object.
Here, in addition to storing, through the preset database, the one or more recommendable audio resources corresponding to each category, the cloud server further stores the recommendable audio-video resources corresponding to each age group under the same category. The audio-video resources include but are not limited to commercial audio-visual advertisements. After the second deep learning model completes the identification of the facial feature and the age information of the target object is obtained, the one or more audio-video resources corresponding to that age information under that category information are queried. An audio-video resource is an audio-visual file carrying product promotion information, including both voice information and image information.
In step S108, the audio-video resources are sent to the front-end device, so that the front-end device receives and plays the audio-video resources.
The one or more audio-video resources found for the age information can be sent to the front-end device for loop playback, so as to announce, with synchronized voice and images, recommendation information, promotion information, or inventory information of related products, giving the target object targeted product recommendations. This realizes a comprehensive and more accurate advertisement pushing service and improves the value of advertisement pushing.
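A minimal sketch of the combined category-plus-age lookup in step S107, assuming the preset database can be viewed as a nested mapping; the keys and file names are illustrative:

```python
# Assumed preset database: category -> age group -> audio-video resources.
AV_RESOURCES = {
    "business person": {
        "young":       ["ads/business_watch_promo.mp4"],
        "middle-aged": ["ads/business_suit_promo.mp4"],
    },
    "ordinary consumer": {
        "young":       ["ads/coupon_campaign.mp4"],
        "elderly":     ["ads/household_goods_promo.mp4"],
    },
}

def get_av_resources(category: str, age_group: str) -> list[str]:
    """Step S107: look up audio-video resources by category and age group."""
    return AV_RESOURCES.get(category, {}).get(age_group, [])
```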
In conclusion the first video frame information that the embodiment of the present invention is acquired by obtaining front end camera, described first It include target object and its clothing information in video frame information;Pretreatment is executed to first video frame information, will be pre-processed First video frame information afterwards is as incoming first deep learning model is inputted, to pass through the first deep learning model Identify that the apparel characteristic of the target object obtains the classification information of the target object;It is obtained and is corresponded to according to the classification information Audio resource;The audio resource is sent to headend equipment, so that the headend equipment receives and plays the audio Resource;The second video frame information that the front end camera acquires during the audio resource plays is obtained, from described second The first face feature that the target object is extracted in video frame information, using first face feature as input incoming second Deep learning model, to identify that first face feature obtains the target object by the second deep learning model Age information;Corresponding audio and video resources are obtained in conjunction with the classification information and age information of the target object;The sound is regarded Frequency resource is sent to headend equipment, so that the headend equipment receives and plays the audio and video resources;It is real through the invention Applying example may be implemented comprehensive more accurate advertisement, and not need the information for knowing client in advance, can be old and new customers Advertisement has greatly improved the value of advertisement pushing.
Optionally, the audio-visual advertisement played by the front-end device is not necessarily content that the current target object is interested in. The embodiment of the present invention may also update the audio-video resource played by the front-end device according to the emotional change of the target object. As shown in Fig. 5, the resource pushing method further includes:
In step S109, third video frame information collected by the front-end camera while the audio-video resources are playing is obtained, and a second facial feature of the target object is extracted from the third video frame information.
Here, the front-end device plays the audio-video resources in order to promote products to the target object. The embodiment of the present invention further identifies the emotion information of the target object to determine whether the target object is interested in the audio-video resources. While the front-end device plays the audio-video resources, the video frame information collected by the front-end camera is obtained in real time; to distinguish it from the first video frame information and the second video frame information, it is denoted here as third video frame information. The facial feature of the target object is detected in the third video frame information. Unlike the fine-grained facial features, this facial feature is used to determine the emotion information of the target object and is denoted here as the second facial feature. The emotion information is the target object's feedback expressed through the face, and includes "interested" and "not interested".
In step S110, the second facial feature is passed as input to a third deep learning model, so that the third deep learning model identifies the second facial feature to obtain the emotion information of the target object.
Here, the third deep learning model is a binary classification model that determines, based on facial features, whether the target object is interested; it is obtained in advance by learning from a large number of labeled pictures. When identifying the emotion information, this embodiment directly calls the third deep learning model and passes the facial feature as input to it, and the third deep learning model identifies the facial feature of the target object to obtain the emotion information of the target object.
In step S111, if the emotion information is "not interested", corresponding audio-video resources are obtained again according to the category information and the age information of the target object.
After the third deep learning model completes the identification, the output of the third deep learning model is obtained, i.e., the emotion information of the target object. If the emotion information is "not interested", other audio-video resources corresponding to the age information under the category information of the target object are queried again, so as to switch the advertisement. It can be understood that the other audio-video resources are advertisements for products other than the product covered by the currently playing resource.
In step S112, the audio-video resources are sent to the front-end device, so that the front-end device receives and plays the audio-video resources.
The newly queried audio-video resources can be sent to the front-end device, so that the front-end device switches from the originally playing audio-video resources to the newly obtained audio-video resources, thereby adjusting advertisement pushing in real time and improving the value of advertisement pushing.
For example, after receiving the audio-video resources, the front-end device first plays a switch prompt voice and then plays the audio-video resources. Optionally, the switch prompt voice includes but is not limited to: "It seems you are not interested; you might like to have a look at ...". It should be understood that the above playback flow of the front-end device is only a specific example of the present invention and is not intended to limit the present invention.
Optionally, the method further includes:
In step S113, if the emotion information is "interested", a loop-play message is sent to the front-end device, so that the front-end device plays the current audio-video resources in a loop.
If the emotion information is "interested", a loop-play message is generated and sent to the front-end device, so that the front-end device stays on the current audio-video resources.
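A minimal sketch of this emotion-driven branch (steps S109 to S113), assuming a binary emotion classifier and reusing the hypothetical lookup helper sketched after step S107; the control flow and names are illustrative:

```python
def update_playback(category, age_group, second_face_feature,
                    emotion_model, send_to_frontend, send_loop_message):
    """Re-push or loop the current ad based on the detected emotion."""
    emotion = emotion_model(second_face_feature)   # "interested" / "not interested"
    if emotion == "not interested":
        # Step S111/S112: query other audio-video resources for the same
        # category and age group (in practice excluding the one already
        # played) and push them to the front-end device to switch the ad.
        for resource in get_av_resources(category, age_group):
            send_to_frontend(resource)
    else:
        # Step S113: keep the current audio-video resource playing in a loop.
        send_loop_message()
```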
In conclusion the present embodiment identifies that target object watches commodity audio and video resources by third deep learning model Mood can adjust advertisement pushing in real time, and follow up client feedback in time, realize that carrying out advertisement according to on-site customer mood pushes away It send.
It should be understood that the size of the serial number of each step is not meant that the order of the execution order in above-described embodiment, each process Execution sequence should be determined by its function and internal logic, the implementation process without coping with the embodiment of the present invention constitutes any limit It is fixed.
In one embodiment, a resource pushing apparatus based on deep learning is provided, and the resource pushing apparatus based on deep learning corresponds one-to-one to the resource pushing method based on deep learning in the above embodiments. As shown in Fig. 6, the resource pushing apparatus based on deep learning includes the following modules. Each functional module is described in detail as follows:
a first video acquisition module 61, configured to obtain first video frame information collected by a front-end camera, the first video frame information including a target object and its clothing information;
a first deep learning module 62, configured to perform preprocessing on the first video frame information and pass the preprocessed first video frame information as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object;
a first resource acquisition module 63, configured to obtain a corresponding audio resource according to the category information;
a first sending module 64, configured to send the audio resource to a front-end device, so that the front-end device receives and plays the audio resource;
a second video acquisition module 65, configured to obtain second video frame information collected by the front-end camera while the audio resource is playing, and extract a first facial feature of the target object from the second video frame information;
a second deep learning module 66, configured to pass the first facial feature as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object;
a second resource acquisition module 67, configured to obtain corresponding audio-video resources by combining the category information and the age information of the target object;
a second sending module 68, configured to send the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
Optionally, after the first video frame information collected by the front-end camera is obtained, the apparatus further includes:
a face detection module, configured to perform face detection on the first video frame information to obtain the number of faces contained in the first video frame information; if the number of faces is greater than or equal to 2, calculate the number of pixels of each face; and obtain the face corresponding to the maximum number of pixels and take the portrait corresponding to that face as the target object.
Optionally, the first deep learning module 62 is further configured to:
encode the first video frame information according to the H.265 video coding standard;
convert the encoded first video frame information according to the RGB color mode to obtain a pixel matrix corresponding to each color channel.
Optionally, the second deep learning module 66 is further configured to:
perform coarse-grained facial feature detection on the second video frame information;
if the coarse-grained facial feature detection detects the frontal face of the target object, perform fine-grained facial feature detection on the second video frame information to obtain the first facial feature of the target object.
Optionally, the apparatus further includes:
a third video acquisition module, configured to obtain third video frame information collected by the front-end camera while the audio-video resources are playing;
a third deep learning module, configured to extract a second facial feature of the target object from the third video frame information and pass the second facial feature as input to a third deep learning model, so that the third deep learning model identifies the second facial feature to obtain emotion information of the target object;
a resource re-acquisition module, configured to, if the emotion information is "not interested", obtain corresponding audio-video resources again according to the category information and the age information of the target object;
a third sending module, configured to send the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
Optionally, the apparatus further includes:
a fourth sending module, configured to, if the emotion information is "interested", send a loop-play message to the front-end device, so that the front-end device plays the current audio-video resources in a loop.
For the specific limitations of the resource pushing apparatus based on deep learning, reference may be made to the above limitations of the resource pushing method based on deep learning, which are not repeated here. Each module in the above resource pushing apparatus based on deep learning may be implemented wholly or partly by software, hardware, or a combination thereof. Each of the above modules may be embedded in or independent of a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each of the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is configured to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a resource pushing method based on deep learning.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the following steps are implemented:
obtaining first video frame information collected by a front-end camera, the first video frame information including a target object and its clothing information;
performing preprocessing on the first video frame information, and passing the preprocessed first video frame information as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object;
obtaining a corresponding audio resource according to the category information;
sending the audio resource to a front-end device, so that the front-end device receives and plays the audio resource;
obtaining second video frame information collected by the front-end camera while the audio resource is playing, and extracting a first facial feature of the target object from the second video frame information;
passing the first facial feature as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object;
obtaining corresponding audio-video resources by combining the category information and the age information of the target object;
sending the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a processor, the following steps are implemented:
obtaining first video frame information collected by a front-end camera, the first video frame information including a target object and its clothing information;
performing preprocessing on the first video frame information, and passing the preprocessed first video frame information as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object;
obtaining a corresponding audio resource according to the category information;
sending the audio resource to a front-end device, so that the front-end device receives and plays the audio resource;
obtaining second video frame information collected by the front-end camera while the audio resource is playing, and extracting a first facial feature of the target object from the second video frame information;
passing the first facial feature as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object;
obtaining corresponding audio-video resources by combining the category information and the age information of the target object;
sending the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program. The computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to a memory, storage, database, or other medium used in the embodiments provided by the present invention may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It is apparent to those skilled in the art that, for convenience and brevity of description, the division of the above functional units and modules is only used as an example. In practical applications, the above functions may be assigned to different functional units or modules as needed, i.e., the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above.
The above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some of the technical features therein; and these modifications or substitutions do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be included within the protection scope of the present invention.

Claims (10)

1. A resource pushing method based on deep learning, characterized by comprising:
obtaining first video frame information collected by a front-end camera, the first video frame information including a target object and its clothing information;
performing preprocessing on the first video frame information, and passing the preprocessed first video frame information as input to a first deep learning model, so that the first deep learning model identifies clothing features of the target object to obtain category information of the target object;
obtaining a corresponding audio resource according to the category information;
sending the audio resource to a front-end device, so that the front-end device receives and plays the audio resource;
obtaining second video frame information collected by the front-end camera while the audio resource is playing;
extracting a first facial feature of the target object from the second video frame information, and passing the first facial feature as input to a second deep learning model, so that the second deep learning model identifies the first facial feature to obtain age information of the target object;
obtaining corresponding audio-video resources by combining the category information and the age information of the target object;
sending the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
2. The resource pushing method based on deep learning according to claim 1, characterized in that, after obtaining the first video frame information collected by the front-end camera, the method further comprises:
performing face detection on the first video frame information to obtain the number of faces contained in the first video frame information;
if the number of faces is greater than or equal to 2, calculating the number of pixels of each face;
obtaining the face corresponding to the maximum number of pixels, and taking the portrait corresponding to that face as the target object.
3. The resource pushing method based on deep learning according to claim 1, characterized in that performing preprocessing on the first video frame information comprises:
encoding the first video frame information according to the H.265 video coding standard;
converting the encoded first video frame information according to the RGB color mode to obtain a pixel matrix corresponding to each color channel.
4. The resource pushing method based on deep learning according to any one of claims 1 to 3, characterized in that extracting the first facial feature of the target object from the second video frame information comprises:
performing coarse-grained facial feature detection on the second video frame information;
if the coarse-grained facial feature detection detects the frontal face of the target object, performing fine-grained facial feature detection on the second video frame information to obtain the first facial feature of the target object.
5. The resource pushing method based on deep learning according to any one of claims 1 to 3, characterized in that the method further comprises:
obtaining third video frame information collected by the front-end camera while the audio-video resources are playing, and extracting a second facial feature of the target object from the third video frame information;
passing the second facial feature as input to a third deep learning model, so that the third deep learning model identifies the second facial feature to obtain emotion information of the target object;
if the emotion information is "not interested", obtaining corresponding audio-video resources again according to the category information and the age information of the target object;
sending the audio-video resources to the front-end device, so that the front-end device receives and plays the audio-video resources.
6. The resource pushing method based on deep learning according to claim 5, characterized in that the method further comprises:
if the emotion information is "interested", sending a loop-play message to the front-end device, so that the front-end device plays the current audio-video resources in a loop.
7. a kind of resource supplying device based on deep learning characterized by comprising
First video acquiring module, for obtaining the first video frame information of front end camera acquisition, the first video frame letter It include target object and its clothing information in breath;
First deep learning module, for executing pretreatment to first video frame information, by pretreated described first Video frame information is as incoming first deep learning model is inputted, to identify the target by the first deep learning model The apparel characteristic of object obtains the classification information of the target object;
First resource obtains module, for obtaining corresponding audio resource according to the classification information;
First sending module, for the audio resource to be sent to headend equipment, so that the headend equipment is received and broadcast Put the audio resource;
Second video acquiring module, the second view acquired during the audio resource plays for obtaining the front end camera Frequency frame information extracts the first face feature of the target object from second video frame information;
Second deep learning module, for being passed to the second deep learning model for first face feature as input, with logical It crosses the second deep learning model and identifies that first face feature obtains the age information of the target object;
Secondary resource obtains module, for the classification information and the corresponding audio-video of age information acquisition in conjunction with the target object Resource;
Second sending module, for the audio and video resources to be sent to headend equipment, so that the headend equipment receives simultaneously Play the audio and video resources.
8. the resource supplying device based on deep learning as claimed in claim 7, which is characterized in that further include:
Third video acquiring module, the third video acquired during the audio and video resources play for obtaining front end camera Frame information extracts the second face feature of the target object from the third video frame information;
Third deep learning module, for being passed to third deep learning model for second face feature as input, with logical It crosses the third deep learning model and identifies that second face feature obtains the emotional information of the target object;
Resource recaptures modulus block, if being when loseing interest in, again according to the classification of the target object for the emotional information Information and age information obtain corresponding audio and video resources;
Third sending module, for the audio and video resources to be sent to headend equipment, so that the headend equipment receives simultaneously Play the audio and video resources.
9. a kind of computer equipment, including memory, processor and storage are in the memory and can be in the processor The computer program of upper operation, which is characterized in that the processor realized when executing the computer program as claim 1 to 6 described in any item resource supplying methods based on deep learning.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the resource supplying method based on deep learning according to any one of claims 1 to 6 is implemented.
CN201910431276.8A 2019-05-22 2019-05-22 Resource supplying method, apparatus, equipment and storage medium based on deep learning Pending CN110287363A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910431276.8A CN110287363A (en) 2019-05-22 2019-05-22 Resource supplying method, apparatus, equipment and storage medium based on deep learning

Publications (1)

Publication Number Publication Date
CN110287363A 2019-09-27

Family

ID=68002641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910431276.8A Pending CN110287363A (en) 2019-05-22 2019-05-22 Resource supplying method, apparatus, equipment and storage medium based on deep learning

Country Status (1)

Country Link
CN (1) CN110287363A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080050994A (en) * 2006-12-04 2008-06-10 한국전자통신연구원 System and method for integrating gesture and voice
WO2018113526A1 (en) * 2016-12-20 2018-06-28 四川长虹电器股份有限公司 Face recognition and voiceprint recognition-based interactive authentication system and method
CN107463608A (en) * 2017-06-20 2017-12-12 上海汇尔通信息技术有限公司 Face recognition-based information pushing method and system
CN107632814A (en) * 2017-09-25 2018-01-26 珠海格力电器股份有限公司 Audio information playing method, device and system, storage medium and processor
CN109040824A (en) * 2018-08-28 2018-12-18 百度在线网络技术(北京)有限公司 Video processing method and apparatus, electronic device, and readable storage medium
CN109271884A (en) * 2018-08-29 2019-01-25 厦门理工学院 Face attribute recognition method and apparatus, terminal device, and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HE Lu; WANG Xianming: "Research on Face Detection Technology Based on Video Streams in Complex Backgrounds", China Water Transport (Second Half of the Month), no. 06 *
SONG Guanjun; ZHANG Shudong; WEI Feigao: "Research on a Fusion Framework for Audio-Visual Bimodal Emotion Recognition", Computer Engineering and Applications, no. 06 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428662A (en) * 2020-03-30 2020-07-17 齐鲁工业大学 Advertisement playing change method and system based on crowd attributes
CN111782878A (en) * 2020-07-06 2020-10-16 聚好看科技股份有限公司 Server, display equipment and video searching and sorting method thereof
CN111782878B (en) * 2020-07-06 2023-09-19 聚好看科技股份有限公司 Server, display device and video search ordering method thereof
CN111738777A (en) * 2020-07-17 2020-10-02 杭州脸脸会网络技术有限公司 Coupon pushing method and device, storage medium and intelligent terminal

Similar Documents

Publication Publication Date Title
US11778272B2 (en) Delivery of different services through different client devices
CN110287363A (en) Resource supplying method, apparatus, equipment and storage medium based on deep learning
US9049488B2 (en) Systems and methods for displaying and interacting with interaction opportunities associated with media content
US20090217315A1 (en) Method and system for audience measurement and targeting media
US20130290994A1 (en) Selection of targeted content based on user reactions to content
US20140165093A1 (en) Automated unobtrusive ancilliary information insertion into a video
US20140337477A1 (en) System and method of portraying the shifting level of interest in an object or location
US20180158102A1 (en) Advertising display system using smart film screen
CN109978618A Offline interactive advertisement delivery system based on cloud artificial intelligence
CN108595651A (en) Customized information display methods, device and user terminal based on recognition of face
WO2008138144A1 (en) Method and system for audience measurement and targeting media
CN117219003A (en) Content display method and device of LED display module
CN114025188A (en) Live broadcast advertisement display method, system, device, terminal and readable storage medium
CN109255665A Method of delivering advertisements on an advertising machine based on user feature analysis
WO2021184153A1 (en) Summary video generation method and device, and server
CN104581317A (en) System and method for playing image information
US11076199B2 (en) Systems and methods for providing targeted content in an EMBMS stream to a user device
US20230048162A1 (en) Method of Multi-Platform Social Media and/or Streaming Media Advertising and Revenue Sharing Via Digital Overlays on Real-Time Video Feeds
CN106202188A Method and apparatus for tagging advertising creative elements
CN110264259A Advertisement generation and display method
KR20160095440A (en) Method and system of bidding for advertising on video content
CN113038148A (en) Commodity dynamic demonstration method, commodity dynamic demonstration device and storage medium
CN114612142B (en) Multi-mode information fusion commercial content recommendation method and device and electronic equipment
Grant et al. Exploring cultural context congruency in television: a conceptual framework for assessing the impact of media context on advertising effectiveness in an emerging market
US11495007B1 (en) Augmented reality image matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandonment: 20240621