CN106202570A - User information acquisition method and device - Google Patents


Info

Publication number
CN106202570A
Authority
CN
China
Prior art keywords
user
multimedia
sample
characteristic information
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610659017.7A
Other languages
Chinese (zh)
Inventor
赵九龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Holding Beijing Co Ltd
LeTV Information Technology Beijing Co Ltd
Original Assignee
LeTV Holding Beijing Co Ltd
LeTV Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LeTV Holding Beijing Co Ltd and LeTV Information Technology Beijing Co Ltd
Priority to CN201610659017.7A
Publication of CN106202570A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the present invention provides a user information acquisition method and device, relating to the field of computer technology. The main purpose is to improve the stability with which a prediction model calculates user characteristic information by regularly updating the model's training samples. The technical solution adopted by the embodiment is: collecting a multimedia sample set, wherein the multimedia sample set includes multimedia samples that distinguish user characteristic information; compiling statistics on the viewing users of the multimedia sample set and screening out training user samples, wherein the training user samples are user samples with clear user characteristic information; and training a user information acquisition model with the training user samples, and using the user information acquisition model to obtain the characteristic information of a target user. The invention is mainly used to obtain the characteristic information of users.

Description

User information acquisition method and device
Technical field
Embodiments of the present invention relate to the field of computer technology, and in particular to a user information acquisition method and device.
Background
With the arrival of the cloud era, big data has attracted increasing attention. Analysis of big data can yield a great deal of intelligent, in-depth, and valuable information. In the video field, major video websites now pay more and more attention to the viewing experience of their users: whether the video content is rich, whether users can easily find the video resources they need, and so on. To this end, a video website can make targeted video recommendations to different users so that they can more easily find what they want to watch. To push information to users in a targeted way, the website must first determine the characteristic information of different users, such as age, gender, occupation, and hobbies.
User characteristic information has traditionally been collected mainly from the registration information that users fill in when they log in to a website; relevant characteristic information is extracted from that registration information. Alternatively, the website issues questionnaires to users through its activities and extracts the relevant characteristic information from the questionnaires that users return. In practice, however, the applicant found that many websites, in order to make the site more convenient to use, do not make filling in registration information mandatory, so most users never fill it in. For users who do fill in registration information, most characteristic information items are optional out of consideration for user privacy, so the characteristic information of these users cannot be obtained either. As for questionnaires, the response rate is low, and the authenticity and accuracy of the answers are not high, so correct user characteristic information cannot be obtained.
Besides the traditional approach of collecting information reported by users, and with the wide application of data analysis, a user's characteristic information can often be derived by analyzing the user's concrete behavior. In the video field, a viewing user's characteristic information can be predicted by analyzing the videos that the user has watched. This is generally done by building a prediction model and training it with known training samples; once the model's predictions meet a predetermined requirement, the model can be applied to the viewing behavior of an unknown user to derive that user's characteristic information. However, in the course of proposing the present invention, the applicant found that the computation models currently used to predict user characteristic information are mostly one-off models that are used unchanged once training is complete. Such a model suits situations in which the training samples and the prediction targets change little. In the video field, although the attributes of a single film do not change, the popular videos differ from one time period to the next, and the user population in each period also keeps changing. All of these changing factors may affect the prediction model's final result. Existing prediction models are therefore considerably unstable when calculating the characteristic information of viewing users.
Summary of the invention
In view of the above problems, embodiments of the present invention provide a user information acquisition method and device. The main purpose is to improve the stability with which a prediction model calculates user characteristic information by regularly updating the model's training samples.
To achieve this purpose, the present invention mainly provides the following technical solutions:
In one aspect, an embodiment of the present invention provides a user information acquisition method, the method including:
collecting a multimedia sample set, wherein the multimedia sample set includes multimedia samples that distinguish user characteristic information;
compiling statistics on the viewing users of the multimedia sample set and screening out training user samples, wherein the training user samples are user samples with clear user characteristic information;
training a user information acquisition model with the training user samples, and using the user information acquisition model to obtain the characteristic information of a target user.
Optionally, collecting the multimedia sample set includes:
collecting multimedia samples according to a preset rule, wherein the multimedia samples are marked with user characteristic information tendency labels;
screening out a plurality of multimedia samples according to the user characteristic information tendency labels, and generating the multimedia sample set.
Optionally, collecting the multimedia sample set includes:
collecting the multimedia sample set periodically at a preset time interval.
Optionally, compiling statistics on the viewing users of the multimedia sample set and screening out training user samples includes:
obtaining the viewing users of each multimedia sample in the multimedia sample set to obtain a viewing user set;
counting, according to the multimedia viewing record of each user in the viewing user set, the number of multimedia items in the multimedia sample set that each user has watched;
determining the training user samples according to that number.
Optionally, determining the training user samples according to that number includes:
obtaining the user characteristic information tendency labels of the multimedia samples that a user has watched;
calculating a user characteristic information tendency score for each user according to the weights of the different user characteristic information tendency labels, wherein each weight represents the tendency degree of a user characteristic information tendency label;
determining the training user samples according to the ranking of the user characteristic information tendency scores.
In another aspect, an embodiment of the present invention provides a user information acquisition device, the device including:
a collection unit, configured to collect a multimedia sample set, wherein the multimedia sample set includes multimedia samples that distinguish user characteristic information;
a selection unit, configured to compile statistics on the viewing users of the multimedia sample set obtained by the collection unit and to screen out training user samples, wherein the training user samples are user samples with clear user characteristic information;
an acquisition unit, configured to train a user information acquisition model with the training user samples selected by the selection unit, and to use the user information acquisition model to obtain the characteristic information of a target user.
Optionally, the collection unit includes:
a collection module, configured to collect multimedia samples according to a preset rule, wherein the multimedia samples are marked with user characteristic information tendency labels;
a generation module, configured to screen out a plurality of multimedia samples, according to the user characteristic information tendency labels, from the multimedia samples collected by the collection module, and to generate the multimedia sample set.
Optionally, the collection unit includes:
a timing module, configured to collect the multimedia sample set periodically at a preset time interval.
Optionally, the selection unit includes:
an obtaining module, configured to obtain the viewing users of each multimedia sample in the multimedia sample set to obtain a viewing user set;
a statistics module, configured to count, according to the multimedia viewing record of each user in the viewing user set obtained by the obtaining module, the number of multimedia items in the multimedia sample set that each user has watched;
a determination module, configured to determine the training user samples according to the number counted by the statistics module.
Optionally, the determination module includes:
an obtaining submodule, configured to obtain the user characteristic information tendency labels of the multimedia samples that a user has watched;
a calculation submodule, configured to calculate a user characteristic information tendency score for each user according to the weights of the different user characteristic information tendency labels, wherein each weight represents the tendency degree of a user characteristic information tendency label;
a determination submodule, configured to determine the training user samples according to the ranking of the user characteristic information tendency scores calculated by the calculation submodule.
As the above user information acquisition method and device show, the embodiment of the present invention obtains the characteristic information of a target user by building a user information acquisition model that can be retrained regularly. Because this regular retraining is based on regularly updated training samples, the method is particularly suited to multimedia fields in which content is updated frequently and quickly. The multimedia sample set is collected periodically, and representative high-quality users are selected from the users who watched the multimedia samples to serve as the latest training samples for the user information acquisition model. After training, the model's calculations are based on currently popular multimedia and on representative high-quality users, so the results are more timely, the characteristic information obtained fits the user's actual situation more closely, and the accuracy of the calculation is more stable.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a user information acquisition method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a method for obtaining training samples for a user information acquisition model provided by an embodiment of the present invention;
Fig. 3 is a structural block diagram of a user information acquisition device provided by an embodiment of the present invention;
Fig. 4 is a structural block diagram of another user information acquisition device provided by an embodiment of the present invention.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a user information acquisition method. As shown in Fig. 1, the method is used to predict a user's characteristic information from the user's multimedia viewing behavior. The steps are:
101. Collect a multimedia sample set.
The multimedia samples in the multimedia sample set may take forms including, but not limited to, video, audio, text, or pictures.
In the multimedia sample set collected in the embodiment of the present invention, a multimedia sample may be the multimedia content itself, or associated multimedia attributes such as its title or size. The multimedia samples collected should be high-quality multimedia samples, that is, samples that distinguish user characteristic information. A user's characteristic information is the attribute information that distinguishes one user from another, such as gender, age, and occupation. This information makes it possible to roughly predict a user's multimedia viewing preferences and so to provide targeted, personalized services for that user, such as pushing multimedia information or configuring a multimedia search mode for the user.
Depending on its content, each multimedia item has a corresponding group of viewing users. For example, the viewers of cartoons are mainly children, the viewers of campus dramas are mostly students, the viewers of variety entertainment programs are mainly young people, and the viewers of historical dramas are mostly middle-aged or elderly. As another example, men tend to prefer American series and women tend to prefer Korean dramas. Each multimedia item can therefore be classified by user characteristic information. Specifically, items can be distinguished by attaching label information to them, and one item can carry several labels that mark different user characteristics. In addition, for labels of the same user characteristic, different items can differ in degree. For example, multimedia A and multimedia B both carry a "watched by women" label, but multimedia A has more female viewers than multimedia B. To capture this, a numeric value can be added to the label to distinguish different degrees of the characteristic.
It should be noted that the collected multimedia sample set is not limited to multimedia collected for a single user characteristic; it may also be collected comprehensively for several different user characteristics. The number of multimedia items collected is limited by the processing capacity of the system; the more items collected, the more accurate the final prediction. The embodiment of the present invention therefore does not limit the number of multimedia items collected.
Further, to keep the collected multimedia samples current, the multimedia sample set can be updated regularly: the newest multimedia samples are added to the set, and samples in the set that do not clearly distinguish user characteristic information are deleted or replaced. In the embodiment of the present invention, the multimedia sample set can be collected periodically at a preset time interval. The preset time interval may be a fixed, manually configured period at which the system triggers the step of collecting multimedia samples. It may also be a non-fixed period determined by the system's maintenance staff. That is, the preset time interval may be a fixed period such as one day, one week, or one month, or it may be chosen freely by an administrator: when the administrator judges that a lot of multimedia content has been updated recently, the period can be shortened, and the administrator may even trigger the collection step in real time.
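The periodic trigger described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `CollectionTrigger`, `interval_seconds`, and `force` are hypothetical names, and timestamps are passed in explicitly so the logic can be checked without a real clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CollectionTrigger:
    """Decides when to re-collect the multimedia sample set (hypothetical helper)."""
    interval_seconds: float              # preset time interval, e.g. one day
    last_run: Optional[float] = None     # timestamp of the previous collection

    def should_collect(self, now: float, force: bool = False) -> bool:
        # An administrator may trigger collection at any time (force=True).
        if force or self.last_run is None:
            return True
        return now - self.last_run >= self.interval_seconds

    def mark_collected(self, now: float) -> None:
        self.last_run = now


ONE_DAY = 24 * 60 * 60
trigger = CollectionTrigger(interval_seconds=ONE_DAY)
```

Shortening the period when content is updated heavily corresponds to lowering `interval_seconds`; a manual real-time trigger corresponds to `force=True`.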
102. Compile statistics on the viewing users of the multimedia sample set and screen out training user samples.
Compiling statistics on the viewing users of the multimedia sample set means counting, one by one, all the viewing users of each multimedia sample in the set and then deduplicating them to obtain one overall set of viewing users. In other words, every user in this set has watched at least one multimedia item in the multimedia sample set.
Once the viewing user set has been determined, training user samples are screened out of it. A training user sample is a user sample with clear user characteristic information; that is, one or more characteristics of the user can readily be determined from the user's multimedia viewing record. For example, if a user's viewing record consists mostly of Korean dramas and the viewing times are concentrated on weekends or at night, the user is very likely a working woman. The number of training user samples screened out depends on the needs of training the user information acquisition model. The more samples, the more accurate the trained model, but the longer the training takes. More training user samples are therefore not always better, and the number should be set according to the specific situation. For example, when the sample refresh period is long, more samples can be selected for training; otherwise, fewer samples are selected.
Training user samples are screened mainly by performing a statistical calculation on the user characteristic information labels carried by the multimedia that each user has watched and then screening according to the ranking of the resulting scores. Note that the score calculated in this step is not limited to the result for a single user characteristic; it may also be a composite score over the results for multiple user characteristics. The embodiment of the present invention does not specifically limit the calculation method.
103. Train a user information acquisition model with the training user samples, and use the trained user information acquisition model to obtain the characteristic information of a target user.
After the training user samples have been determined, they are used to train the user information acquisition model so that the model's results better match the correspondence between viewing behavior and user characteristic information found in these high-quality user samples. The role of the user information acquisition model is to analyze the multimedia viewing information of a selected user and derive that user's corresponding characteristic information.
Finally, the target user's multimedia viewing record is input into the trained user information acquisition model to obtain the target user's corresponding characteristic information.
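The train-then-predict flow of step 103 can be illustrated with a toy model. The patent does not name a model type, so the frequency-voting classifier below is a stand-in chosen only for brevity; `train`, `predict`, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# (items watched, known characteristic) pairs; the data is invented for illustration.
TrainingUser = Tuple[List[str], str]


def train(samples: List[TrainingUser]) -> Dict[str, Counter]:
    """For each watched item, count the characteristics of the training users who watched it."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for watched, characteristic in samples:
        for item in set(watched):
            model[item][characteristic] += 1
    return dict(model)


def predict(model: Dict[str, Counter], watched: List[str]) -> str:
    """Sum the per-item counts over the target user's viewing record; return the top label."""
    votes: Counter = Counter()
    for item in set(watched):
        votes.update(model.get(item, Counter()))
    if not votes:
        return "unknown"   # none of the watched items appeared in training
    return votes.most_common(1)[0][0]


training_users = [
    (["drama_1", "drama_2"], "female"),
    (["drama_1", "variety_1"], "female"),
    (["sports_1", "action_1"], "male"),
    (["sports_1", "drama_2"], "male"),
]
model = train(training_users)
```

Retraining with a refreshed `training_users` list at each collection interval is what keeps such a model aligned with currently popular content.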
As the above implementation shows, the user information acquisition method of the embodiment of the present invention obtains the characteristic information of a target user by building a user information acquisition model that can be retrained regularly. The multimedia sample set is collected periodically, and representative high-quality users are selected from the users who watched the multimedia samples to serve as the latest training samples for the model. Compared with existing models that are used unchanged once training is complete, a user information acquisition model trained according to the embodiment bases its calculations on currently popular multimedia and on representative high-quality users, so the results are more timely, the characteristic information obtained fits the user's actual situation more closely, and the accuracy of the calculation is more stable.
To explain the user information acquisition method proposed by the embodiment in more detail, the following focuses on how the training samples for the user information acquisition model are obtained periodically. As shown in Fig. 2, the method includes the following steps:
201. Collect multimedia samples according to a preset rule.
It should be noted that collection of multimedia samples in this step is triggered at the preset time interval. For details of the preset time interval, see step 101 of Fig. 1 above; they are not repeated here.
The rule for collecting multimedia samples in the embodiment of the present invention is preset according to the actual situation. For example, when the collected multimedia samples are mainly meant to distinguish users by gender, collection should focus on multimedia marked with a user gender tendency label; when the samples are mainly meant to distinguish users by age group, collection should focus on multimedia marked with a user age group tendency label. Besides the necessary user characteristic information tendency labels, the collection rule also needs to consider the following index parameters: the number of viewing users of the multimedia, the number of times it has been played, and the time since it went online.
The number of viewing users indicates the size of the audience that watches the multimedia. Only when the number of viewing users reaches a certain level is the multimedia worth selecting as a multimedia sample.
The play count indicates how popular the multimedia is relative to other multimedia: the higher the count, the more popular the item. It also gives another view of the number of viewing users. It differs from the number of viewing users in that the play count does not distinguish between users, so repeated plays by the same user are all counted.
The time online distinguishes how new the multimedia is. In general, the longer a multimedia item has been online, the larger its cumulative number of viewing users and play count. If two multimedia items have the same number of viewing users or the same play count, the one that has been online for the shorter time can be regarded as the currently popular item.
Combining the above parameters with the relevant actual requirements, a batch of multimedia samples can be collected.
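A collection rule that combines a tendency label with the three index parameters might look like the sketch below. The thresholds, field names, and the "viewers per day online" ordering are assumptions made for illustration; the patent only says these parameters are considered, not how they are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Multimedia:
    title: str
    viewer_count: int      # distinct users who watched the item
    play_count: int        # total plays; repeat plays by one user all count
    days_online: int       # days since the item went online
    # feature name -> (label value, tendency degree), e.g. {"gender": ("male", 7)}
    tendency_labels: Dict[str, Tuple[str, int]] = field(default_factory=dict)


def collect_samples(catalog: List[Multimedia], feature: str,
                    min_viewers: int, min_plays: int) -> List[Multimedia]:
    """Keep items labeled for `feature` that are watched widely enough,
    ordered so that items gaining viewers fastest come first."""
    eligible = [
        m for m in catalog
        if feature in m.tendency_labels
        and m.viewer_count >= min_viewers
        and m.play_count >= min_plays
    ]
    return sorted(eligible,
                  key=lambda m: m.viewer_count / max(m.days_online, 1),
                  reverse=True)


catalog = [
    Multimedia("old_hit", 9000, 20000, 300, {"gender": ("male", 7)}),
    Multimedia("new_hit", 9000, 15000, 10, {"gender": ("female", 8)}),
    Multimedia("niche", 50, 400, 30, {"gender": ("male", 9)}),
    Multimedia("unlabeled", 9000, 30000, 20, {"age": ("youth", 6)}),
]
picked = [m.title for m in collect_samples(catalog, "gender", 1000, 5000)]
```

Here `old_hit` and `new_hit` have the same number of viewing users, so the item with the shorter time online is ranked first, matching the "currently popular" reasoning in the text.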
202. Screen out a plurality of multimedia samples according to the user characteristic information tendency labels, and generate a multimedia sample set.
After multiple multimedia samples have been obtained, they can be screened more finely to obtain a multimedia sample set. The screening in this step is based mainly on the user characteristic information tendency labels and on how the multimedia items are distributed across the labels. For example, suppose the user characteristic to be screened is gender. The labels marked on the multimedia then include "male" and "female". For finer distinctions, a corresponding score can be added to a label to indicate how strongly different multimedia items lean toward it. Say the label of multimedia A is "male 7" and the label of multimedia B is "male 2", with scores in the range 0-10, where a larger score indicates a more pronounced tendency. The viewers of multimedia A are then more predominantly male than those of multimedia B, whose viewers may include a small share of female users, so of the two, multimedia A is the higher-quality multimedia sample. This example involves only one user characteristic, gender, but the embodiment of the present invention is not limited to a single characteristic. When several user characteristics are involved, all of them must be taken into account; in practice, weights can be assigned according to the importance of each characteristic to evaluate the overall quality of a multimedia item.
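The label-score screening above can be sketched as a weighted average of tendency scores. The function names, the threshold, and the use of a weighted average are assumptions for illustration; the patent states only that weights reflect the importance of each user characteristic.

```python
from typing import Dict, List, Tuple

# feature name -> (label value, tendency score in the range 0-10)
Labels = Dict[str, Tuple[str, int]]


def quality(labels: Labels, feature_weights: Dict[str, float]) -> float:
    """Weighted average of tendency scores over the features of interest.
    A feature missing from `labels` contributes a score of 0."""
    total_weight = sum(feature_weights.values())
    weighted = sum(
        weight * labels[feature][1]
        for feature, weight in feature_weights.items()
        if feature in labels
    )
    return weighted / total_weight


def screen(candidates: Dict[str, Labels], feature_weights: Dict[str, float],
           threshold: float) -> List[str]:
    """Keep the titles whose overall quality reaches the threshold."""
    return [title for title, labels in candidates.items()
            if quality(labels, feature_weights) >= threshold]


# The single-feature example from the text: A is "male 7", B is "male 2".
candidates = {
    "A": {"gender": ("male", 7)},
    "B": {"gender": ("male", 2)},
}
kept = screen(candidates, {"gender": 1.0}, threshold=5.0)
```

With several characteristics, the same call takes more weights, for example `{"gender": 0.75, "age": 0.25}`.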
In addition, when screening multimedia samples, the index scores of a multimedia item on other platforms can also be taken into account. A multimedia resource is often played on several different multimedia platforms, and each platform has its own group of viewing users, so the item's indexes on other platforms also have some reference value.
The multimedia samples in the resulting multimedia sample set should have clear user characteristic information tendencies and be evenly distributed across the different values of each user characteristic. Evenly distributed means that, for a given user characteristic, the collected multimedia samples are spread in balanced numbers. For example, if 100 multimedia samples are collected and the user characteristic is gender, the ideal distribution is 50 samples leaning male and 50 samples leaning female.
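One way to approach the even distribution described above is to keep the same number of samples per label value, preferring those with the strongest tendency. This is a sketch under that assumption; `balance` and the sample tuples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def balance(samples: List[Tuple[str, str, int]]) -> List[str]:
    """samples: (title, label value, tendency score).
    Keep an equal number of titles per label value, highest scores first."""
    by_label: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for title, label, score in samples:
        by_label[label].append((score, title))
    if not by_label:
        return []
    # The smallest group sets the quota, so every label value is equally represented.
    quota = min(len(group) for group in by_label.values())
    selected: List[str] = []
    for label in sorted(by_label):
        top = sorted(by_label[label], reverse=True)[:quota]
        selected.extend(title for _, title in top)
    return selected


samples = [
    ("m1", "male", 9), ("m2", "male", 6), ("m3", "male", 8),
    ("f1", "female", 7), ("f2", "female", 5),
]
balanced = balance(samples)
```

In this data there are three male-leaning and two female-leaning samples, so two of each are kept.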
203. Obtain the viewing users of each multimedia sample in the multimedia sample set to obtain a viewing user set.
Each multimedia sample in the multimedia sample set has a large number of viewing users. The viewing users of every multimedia sample are counted, and the viewing users of all the samples are then combined to obtain the viewing user set. Because the same user may have watched several multimedia items in the multimedia sample set, the statistics must be deduplicated, or each user must be marked with the number of multimedia samples watched. Every user in the viewing user set has watched at least one multimedia sample in the multimedia sample set.
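Both options mentioned above, deduplicating users or marking each user with the number of samples watched, can be covered by one counter. A minimal sketch, with hypothetical names and data:

```python
from collections import Counter
from typing import Dict, Iterable


def count_samples_per_viewer(viewers_by_sample: Dict[str, Iterable[str]]) -> Counter:
    """Map each viewing user to the number of distinct samples that user watched."""
    counts: Counter = Counter()
    for viewers in viewers_by_sample.values():
        counts.update(set(viewers))   # set(): one sample counts once per user
    return counts


viewers_by_sample = {
    "video_a": ["u1", "u2", "u2"],    # u2 watched video_a twice
    "video_b": ["u2", "u3"],
}
sample_counts = count_samples_per_viewer(viewers_by_sample)
viewing_user_set = set(sample_counts)  # the deduplicated viewing user set
```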
204, according to the multimedia viewing record of each user in viewing user's set, add up each user and watch many Multimedia quantity in sets of media samples.
After obtaining watching user's set, it is possible to each user being further directed in this set, obtain this user's All multimedia viewing records, this viewing record is not limited to history viewing record or the sight in preset time interval See record.
Based on all multimedia viewing records of each user, the number of multimedia items in the multimedia sample set that the user has watched is then counted; that is, the intersection of the user's multimedia viewing records and the multimedia sample set is computed.
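The intersection described here maps directly onto a set operation. The following sketch assumes that viewing records and the sample set are both collections of multimedia IDs; the function name `watched_sample_counts` is hypothetical.

```python
def watched_sample_counts(viewing_records, sample_set):
    """Count, per user, how many items of the sample set appear in their viewing record."""
    sample_ids = set(sample_set)
    return {
        user: len(set(record) & sample_ids)  # size of the intersection
        for user, record in viewing_records.items()
    }

viewing_records = {
    "u1": ["m1", "m7", "m9"],
    "u2": ["m1", "m2", "m2", "m8"],  # repeated viewings of m2 count once
}
print(watched_sample_counts(viewing_records, {"m1", "m2", "m3"}))
```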
205. Determine the training user samples according to the multimedia quantity.
Because each multimedia sample is annotated with at least one user characteristic information tendency label, each user in the viewing user set can be tagged with at least one user characteristic information tendency label taken from the multimedia samples the user has watched. The more multimedia samples a user has watched, the more user characteristic information tendency labels the user carries. Among these there may be labels for the same characteristic with different tendency degrees, or labels for the same characteristic with the same tendency degree. The user characteristic information tendency labels obtained from the multimedia samples a user has watched need not be deduplicated; all label content is retained.
When there is only one category of user characteristic information, the users in the viewing user set differ in the number of labels they carry and in the degree values those labels describe. In this case, when calculating a user's characteristic information propensity score, weights can be assigned to the number of labels and to their degree values according to the actual situation. For example, when the number is more important, the weight of the number can be set to 0.8 and the weight of the degree value to 0.2, and the user characteristic information propensity score is then calculated from the weighted values.
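The source gives the 0.8 and 0.2 weights but does not say how the label count and the degree values are scaled before weighting. The sketch below is therefore one possible reading, under two stated assumptions: the count is normalized by a cap `max_count`, and the degree values lie in [0, 1] and are averaged. The function name `propensity_score` is hypothetical.

```python
def propensity_score(degrees, max_count, count_weight=0.8, degree_weight=0.2):
    """Combine label count and label degree values into one score.

    degrees: tendency degree values in [0, 1], one per label the user carries.
    max_count: assumed cap used to scale the label count into [0, 1].
    """
    if not degrees or max_count <= 0:
        return 0.0
    count_term = min(len(degrees), max_count) / max_count
    degree_term = sum(degrees) / len(degrees)
    return count_weight * count_term + degree_weight * degree_term

# Three labels out of an assumed cap of 10, with an average degree of 0.8:
# 0.8 * 0.3 + 0.2 * 0.8, which is approximately 0.40.
print(round(propensity_score([0.9, 0.7, 0.8], max_count=10), 2))
```

With the count weighted at 0.8, a user who has watched many weakly labelled samples can outrank a user who has watched a few strongly labelled ones; swapping the weights reverses that preference.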
When there are multiple categories of user characteristic information, that is, when the labels mark several different kinds of characteristic information, the users in the viewing user set differ not only in the degree values of their labels but also in the characteristic information the labels represent. In this case, when calculating a user's characteristic information propensity score, weights must be configured according to the relative importance of the different labels, and the user's characteristic information propensity score is then calculated from them.
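For the multi-category case, the source states only that weights follow the relative importance of the labels. As a rough sketch under that constraint, the code below averages the degree values within each characteristic and combines the averages with per-characteristic importance weights. The function name `multi_feature_score`, the `(feature, degree)` pair layout, and the example weights are all assumptions.

```python
from collections import defaultdict

def multi_feature_score(labels, feature_weights):
    """labels: (feature, degree) pairs; feature_weights: importance per feature."""
    degrees = defaultdict(list)
    for feature, degree in labels:
        degrees[feature].append(degree)
    # Average the degrees within each feature, then combine by importance.
    return sum(
        feature_weights.get(feature, 0.0) * sum(values) / len(values)
        for feature, values in degrees.items()
    )

labels = [("gender", 0.5), ("gender", 1.0), ("age", 0.5)]
weights = {"gender": 0.5, "age": 0.5}  # hypothetical importance weights
print(multi_feature_score(labels, weights))
```

Features missing from `feature_weights` contribute nothing to the score, which gives a simple way to ignore characteristics that are not of interest.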
After the characteristic information propensity score of each user in the viewing user set has been calculated, a batch of training user samples is screened out of the viewing user set according to the ranking of the scores. These training user samples have all watched the above multimedia samples, and the multimedia samples they have watched are annotated with distinct characteristic information tendency labels. Finally, the training user samples are used as the training samples of the user information acquisition model, and the trained user information acquisition model is then used to obtain the characteristic information of a target user.
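Screening by score ranking amounts to a sort followed by a cut-off. The source does not state how many users are kept, so the `top_n` parameter in this sketch is an assumption, as is the function name `select_training_users`.

```python
def select_training_users(scores, top_n):
    """Return the IDs of the `top_n` users with the highest propensity scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [user for user, _ in ranked[:top_n]]

scores = {"u1": 0.4, "u2": 0.9, "u3": 0.7}
print(select_training_users(scores, top_n=2))  # ['u2', 'u3']
```

The selected users, together with the characteristic information implied by their labels, would then serve as the training set for whichever model is chosen; the source does not name a specific model type.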
Further, as an implementation of the above method, an embodiment of the present invention provides a user information acquisition device. As shown in Fig. 3, the device includes:
A collecting unit 31, configured to collect a multimedia sample set, where the multimedia sample set includes multimedia samples that distinguish user characteristic information, and the user characteristic information refers to attribute information that distinguishes different users;
A selecting unit 32, configured to count the viewing users of the multimedia sample set obtained by the collecting unit 31 and screen out training user samples, where the training user samples are user samples with clear user characteristic information;
An acquiring unit 33, configured to train a user information acquisition model with the training user samples selected by the selecting unit 32, and to use the trained user information acquisition model to obtain the characteristic information of a target user.
Further, as shown in Fig. 4, the collecting unit 31 includes:
An acquisition module 311, configured to collect multimedia samples according to a preset rule, where the multimedia samples are annotated with user characteristic information tendency labels;
A generation module 312, configured to screen multiple multimedia samples out of the multimedia samples collected by the acquisition module 311 according to the user characteristic information tendency labels, and to generate the multimedia sample set.
Further, as shown in Fig. 4, the collecting unit 31 also includes:
A timing module 313, configured to collect the multimedia sample set periodically at a preset time interval.
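The periodic behaviour of module 313 can be reduced to a single decision: whether enough time has passed since the last collection. The sketch below shows only that decision, using the standard `datetime` module; the function name `collection_due` and the seven-day interval in the example are assumptions, since the source does not fix an interval.

```python
from datetime import datetime, timedelta

def collection_due(last_collected, now, interval):
    """True if no collection has happened yet or the preset interval has elapsed."""
    return last_collected is None or now - last_collected >= interval

interval = timedelta(days=7)  # hypothetical preset time interval
last = datetime(2016, 8, 1)
print(collection_due(last, datetime(2016, 8, 5), interval))  # False
print(collection_due(last, datetime(2016, 8, 8), interval))  # True
```

A scheduler or a simple polling loop could call this check and trigger a fresh collection and retraining round whenever it returns true.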
Further, as shown in Fig. 4, the selecting unit 32 includes:
An acquisition module 321, configured to obtain the viewing users of each multimedia sample in the multimedia sample set to obtain a viewing user set;
A statistics module 322, configured to count, according to the multimedia viewing records of each user in the viewing user set obtained by the acquisition module 321, the number of multimedia items in the multimedia sample set that each user has watched;
A determining module 323, configured to determine the training user samples according to the multimedia quantity counted by the statistics module 322.
Further, as shown in Fig. 4, the determining module 323 includes:
An obtaining submodule 3231, configured to obtain the user characteristic information tendency labels of the multimedia samples watched by the user;
A calculating submodule 3232, configured to calculate the user characteristic information propensity score of each user according to the weights of the different user characteristic information tendency labels obtained by the obtaining submodule 3231, where the weights represent the tendency degree of the user characteristic information tendency labels;
A determining submodule 3233, configured to determine the training user samples according to the ranking of the user characteristic information propensity scores calculated by the calculating submodule 3232.
In summary, the user information acquisition method and device of the embodiments of the present invention obtain the characteristic information of a target user by building a user information acquisition model that can be trained periodically. Compared with existing prediction models, the user information acquisition model of the embodiments maintains high accuracy in predicting user information through periodic training. The embodiments also propose a way of screening high-quality training samples: the multimedia sample set is collected periodically, and representative high-quality users are selected from the users who watched the multimedia samples as new training samples for training the user information acquisition model. Compared with an existing model that is used unchanged once training is complete, a model trained according to the embodiments computes user characteristic information against a standard based on currently popular multimedia and representative high-quality users. The results are therefore more timely, the characteristic information obtained fits the user's actual situation more closely, and the accuracy of the calculation is more stable.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solution in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A user information acquisition method, characterized in that the method includes:
Collecting a multimedia sample set, where the multimedia sample set includes multimedia samples that distinguish user characteristic information;
Counting the viewing users of the multimedia sample set and screening out training user samples, where the training user samples are user samples with clear user characteristic information;
Training a user information acquisition model with the training user samples, and using the user information acquisition model to obtain the characteristic information of a target user.
2. The method according to claim 1, characterized in that collecting the multimedia sample set includes:
Collecting multimedia samples according to a preset rule, where the multimedia samples are annotated with user characteristic information tendency labels;
Screening out multiple multimedia samples according to the user characteristic information tendency labels, and generating the multimedia sample set.
3. The method according to claim 1, characterized in that collecting the multimedia sample set includes:
Collecting the multimedia sample set periodically at a preset time interval.
4. The method according to claim 1, characterized in that counting the viewing users of the multimedia sample set and screening out training user samples includes:
Obtaining the viewing users of each multimedia sample in the multimedia sample set to obtain a viewing user set;
Counting, according to the multimedia viewing records of each user in the viewing user set, the number of multimedia items in the multimedia sample set that each user has watched;
Determining the training user samples according to the multimedia quantity.
5. The method according to claim 4, characterized in that determining the training user samples according to the multimedia quantity includes:
Obtaining the user characteristic information tendency labels of the multimedia samples watched by the user;
Calculating the user characteristic information propensity score of each user according to the weights of the different user characteristic information tendency labels, where the weights represent the tendency degree of the user characteristic information tendency labels;
Determining the training user samples according to the ranking of the user characteristic information propensity scores.
6. A user information acquisition device, characterized in that the device includes:
A collecting unit, configured to collect a multimedia sample set, where the multimedia sample set includes multimedia samples that distinguish user characteristic information;
A selecting unit, configured to count the viewing users of the multimedia sample set obtained by the collecting unit and screen out training user samples, where the training user samples are user samples with clear user characteristic information;
An acquiring unit, configured to train a user information acquisition model with the training user samples selected by the selecting unit, and to use the user information acquisition model to obtain the characteristic information of a target user.
7. The device according to claim 6, characterized in that the collecting unit includes:
An acquisition module, configured to collect multimedia samples according to a preset rule, where the multimedia samples are annotated with user characteristic information tendency labels;
A generation module, configured to screen multiple multimedia samples out of the multimedia samples collected by the acquisition module according to the user characteristic information tendency labels, and to generate the multimedia sample set.
8. The device according to claim 6, characterized in that the collecting unit includes:
A timing module, configured to collect the multimedia sample set periodically at a preset time interval.
9. The device according to claim 6, characterized in that the selecting unit includes:
An acquisition module, configured to obtain the viewing users of each multimedia sample in the multimedia sample set to obtain a viewing user set;
A statistics module, configured to count, according to the multimedia viewing records of each user in the viewing user set obtained by the acquisition module, the number of multimedia items in the multimedia sample set that each user has watched;
A determining module, configured to determine the training user samples according to the multimedia quantity counted by the statistics module.
10. The device according to claim 9, characterized in that the determining module includes:
An obtaining submodule, configured to obtain the user characteristic information tendency labels of the multimedia samples watched by the user;
A calculating submodule, configured to calculate the user characteristic information propensity score of each user according to the weights of the different user characteristic information tendency labels, where the weights represent the tendency degree of the user characteristic information tendency labels;
A determining submodule, configured to determine the training user samples according to the ranking of the user characteristic information propensity scores calculated by the calculating submodule.
CN201610659017.7A 2016-08-11 2016-08-11 A kind of user information acquiring method and device Pending CN106202570A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610659017.7A CN106202570A (en) 2016-08-11 2016-08-11 A kind of user information acquiring method and device

Publications (1)

Publication Number Publication Date
CN106202570A true CN106202570A (en) 2016-12-07

Family

ID=57514239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610659017.7A Pending CN106202570A (en) 2016-08-11 2016-08-11 A kind of user information acquiring method and device

Country Status (1)

Country Link
CN (1) CN106202570A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291840A (en) * 2017-05-31 2017-10-24 北京奇艺世纪科技有限公司 A kind of user property forecast model construction method and device
CN107330075A (en) * 2017-06-30 2017-11-07 北京金山安全软件有限公司 Multimedia data processing method and device, server and storage medium
CN108322783A (en) * 2018-01-25 2018-07-24 广州虎牙信息科技有限公司 Video website userbase estimation method, storage medium and terminal
CN108764553A (en) * 2018-05-21 2018-11-06 世纪龙信息网络有限责任公司 Userbase prediction technique, device and computer equipment
CN109254990A (en) * 2018-09-11 2019-01-22 北京唐冠天朗科技开发有限公司 A kind of method and system of information source acquisition and dynamic analysis
CN109766955A (en) * 2019-02-12 2019-05-17 深圳乐信软件技术有限公司 Gender identification method, device, equipment and storage medium
CN111507520A (en) * 2020-04-15 2020-08-07 瑞纳智能设备股份有限公司 Dynamic prediction method and system for load of heat exchange unit

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708497A (en) * 2012-01-13 2012-10-03 合一网络技术(北京)有限公司 VideoBag feature-based accurate advertisement release system and method
CN103714130A (en) * 2013-12-12 2014-04-09 深圳先进技术研究院 Video recommendation system and method thereof
CN103763585A (en) * 2014-01-10 2014-04-30 北京酷云互动科技有限公司 User characteristic information obtaining method and device and terminal device
CN105447038A (en) * 2014-08-29 2016-03-30 国际商业机器公司 Method and system for acquiring user characteristics

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291840A (en) * 2017-05-31 2017-10-24 北京奇艺世纪科技有限公司 A kind of user property forecast model construction method and device
CN107291840B (en) * 2017-05-31 2020-01-21 北京奇艺世纪科技有限公司 User attribute prediction model construction method and device
CN107330075A (en) * 2017-06-30 2017-11-07 北京金山安全软件有限公司 Multimedia data processing method and device, server and storage medium
CN108322783A (en) * 2018-01-25 2018-07-24 广州虎牙信息科技有限公司 Video website userbase estimation method, storage medium and terminal
CN108322783B (en) * 2018-01-25 2021-08-17 广州虎牙信息科技有限公司 Video website user scale presumption method, storage medium and terminal
CN108764553A (en) * 2018-05-21 2018-11-06 世纪龙信息网络有限责任公司 Userbase prediction technique, device and computer equipment
CN108764553B (en) * 2018-05-21 2020-12-15 世纪龙信息网络有限责任公司 User scale prediction method and device and computer equipment
CN109254990A (en) * 2018-09-11 2019-01-22 北京唐冠天朗科技开发有限公司 A kind of method and system of information source acquisition and dynamic analysis
CN109766955A (en) * 2019-02-12 2019-05-17 深圳乐信软件技术有限公司 Gender identification method, device, equipment and storage medium
CN111507520A (en) * 2020-04-15 2020-08-07 瑞纳智能设备股份有限公司 Dynamic prediction method and system for load of heat exchange unit

Similar Documents

Publication Publication Date Title
CN106202570A (en) A kind of user information acquiring method and device
CN107038213B (en) Video recommendation method and device
CN107888950A (en) A kind of method and system for recommending video
EP3819791A2 (en) Information search method and apparatus, device and storage medium
CN106055617A (en) Data pushing method and device
CN103634687B (en) The method and system of video search result are provided in intelligent television
CN105930425A (en) Personalized video recommendation method and apparatus
CN106131601A (en) Video recommendation method and device
CN105117460A (en) Learning resource recommendation method and system
CN106331779A (en) Method and system for pushing anchor based on user preferences during video playing process
CN107526810B (en) Method and device for establishing click rate estimation model and display method and device
CN106446078A (en) Information recommendation method and recommendation apparatus
CN109582875A (en) A kind of personalized recommendation method and system of online medical education resource
CN109543132A (en) Content recommendation method, device, electronic equipment and storage medium
CN111435371B (en) Video recommendation method and system, computer program product and readable storage medium
CN113535991B (en) Multimedia resource recommendation method and device, electronic equipment and storage medium
CN103761228B (en) The rank threshold of application program determines that method and rank threshold determine system
CN104216883A (en) Video recommendation reason generating system and method
CN109769007B (en) Service resource management system and method thereof
CN105701226A (en) Multimedia resource assessment method and device
JP5668010B2 (en) Information recommendation method, apparatus and program
CN106951471A (en) A kind of construction method of the label prediction of the development trend model based on SVM
CN104967690B (en) A kind of information-pushing method and device
CN108132964A (en) A kind of collaborative filtering method to be scored based on user item class
CN104657457B (en) A kind of user evaluates data processing method, video recommendation method and the device of video

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20161207