CN101771957B - User interest point determining method and device - Google Patents

Publication number
CN101771957B
Authority
CN
China
Prior art keywords
interest
point
content
multimedia
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200810241181A
Other languages
Chinese (zh)
Other versions
CN101771957A (en)
Inventor
郑于锷
孙杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd
Priority to CN200810241181A
Publication of CN101771957A
Application granted
Publication of CN101771957B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a user interest point determining method and device. In the method, a media interest clustering space is generated for each multimedia type. Each media interest clustering space contains a number of preset interest points, and the interest characteristic value of each interest point is determined from the features of the training samples selected for that interest point. When a user operates on multimedia content, a corresponding media characteristic value is determined from that content; the difference between the media characteristic value and the interest characteristic value of each interest point in the media interest clustering space is calculated; and the one or more interest points with the smaller differences are determined as the interest points of the user. The method and device can accurately determine a user's interest points from the multimedia content that the user operates on.

Description

A user interest point determining method and device
Technical field
The present invention relates to the communications field, and in particular to a method and device for determining a user's interest points according to the multimedia content that the user operates on.
Background art
In existing communication networks, Internet services with a wide variety of functions are provided for the convenience of users. For example, mSpaces is a personalized Internet service for mobile phone users that is combined with mobile communication functions and is intended to increase user stickiness and loyalty. An mSpaces user builds a personal page from the tools and components that the system provides, uses personal storage space, and publishes original content. The published information is pushed to mobile phones through existing communication services such as Fetion, multimedia messaging, short messaging, and mailbox services. Users can communicate interactively and share information with friends, customize the content they are interested in, and enjoy a mobile wireless service and Internet community service that offers single-point submission with multi-point publishing and one-time customization with automatic push.
By visiting the mSpaces website from a mobile terminal, a user can upload and download pictures, upload and download music and video, write blog posts, send short messages and multimedia messages, provide user location information, and so on. Many events related to user behavior therefore occur on the mSpaces platform, and the content involved in these events, such as the pictures a user uploads, the music a user downloads, and the media information a user subscribes to, is closely related to the behavioral habits and interest characteristics of the mobile terminal user.
In the prior art, one way of determining a user's interest points is to rely on the user's static attribute information, such as gender, age, and region. Another way is to match the information that the user uploads or downloads against preset keywords in order to classify the user, and then to determine the user's interest points from the user's category.
Both prior-art methods locate a user's interest points only coarsely, so the interest points they determine are not accurate enough.
Summary of the invention
The present invention provides a user interest point determining method and device that determine a user's interest points more accurately according to the multimedia content that the user operates on.
The user interest point determining method provided by the invention comprises:
determining, according to the multimedia content operated on by a user, a media characteristic value that characterizes the features of the multimedia content;
determining the corresponding media interest clustering space according to the multimedia type to which the multimedia content belongs;
calculating the difference between the media characteristic value and the interest characteristic value of each interest point in the media interest clustering space;
selecting, in ascending order of the difference, one or more corresponding interest points as the interest points of the user;
wherein a media interest clustering space is generated in advance for each multimedia type; the media interest clustering space contains preset interest points, and the interest characteristic value of each interest point is determined from the features of the training samples selected for that interest point.
The present invention also provides a user interest point determining device, comprising:
a media characteristic determination module, configured to determine, according to the multimedia content operated on by a user, a media characteristic value that characterizes the features of the multimedia content;
a clustering space determination module, configured to determine the corresponding media interest clustering space according to the multimedia type to which the multimedia content belongs;
an interest point determination module, configured to calculate the difference between the media characteristic value and the interest characteristic value of each interest point in the media interest clustering space, and to select, in ascending order of the difference, one or more corresponding interest points as the interest points of the user;
a clustering space generation and storage module, configured to generate and store a media interest clustering space for each multimedia type; the media interest clustering space contains preset interest points, and the interest characteristic value of each interest point is determined from the features of the training samples selected for that interest point.
For each multimedia type, the invention generates a media interest clustering space that contains a number of preset interest points. Each interest point has an interest characteristic value, which is determined in advance from the features of training samples, the training samples being selected multimedia content related to that interest point. When a user operates on multimedia content (such as pictures, text, or sound), the corresponding media characteristic value is determined from that content; the difference between the media characteristic value and the interest characteristic value of each interest point in the media interest clustering space is calculated; and the one or more interest points with the smaller differences are selected as the user's interest points. Because the media characteristic value characterizes the multimedia content that the user operates on, and the interest characteristic value of an interest point is determined from the features of its training samples, a small difference between the two indicates that the content operated on by the user is close to the training samples of that interest point. The user's interest points can therefore be determined more accurately from the multimedia content that the user operates on.
Brief description of the drawings
Fig. 1 is a flowchart of the user interest point determining method provided by an embodiment of the invention;
Fig. 2 is a flowchart of the user interest point determining method provided by an embodiment of the invention for the case where the user operates on a picture;
Fig. 3 is a schematic diagram of the architecture of the user interest point determining system provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of the structure of the clustering space generation and storage module in the user interest point determining system provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of the structure of the interest point determination module in the user interest point determining system provided by an embodiment of the invention;
Fig. 6 is a schematic diagram of the structure of the user client provided by an embodiment of the invention.
Detailed description of the embodiments
The present invention provides a user interest point determining method and device that determine a user's interest points more accurately according to the multimedia content that the user operates on.
The method and device are described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of the user interest point determining method provided by an embodiment of the invention. The method comprises:
Step S101: according to the multimedia content operated on by the user, determine a media characteristic value that characterizes the features of that multimedia content;
Step S102: according to the multimedia type to which the multimedia content operated on by the user belongs, determine the corresponding media interest clustering space;
Step S103: calculate the difference between the interest characteristic value of each interest point in the media interest clustering space and the media characteristic value of the multimedia content operated on by the user;
Step S104: sort the interest points in ascending order of the calculated differences;
Step S105: in ascending order of difference, select one or more corresponding interest points as the interest points of the user.
A media interest clustering space is generated in advance for each multimedia type. For example, one media interest clustering space is generated for pictures (or images), another for sound, and so on. A media interest clustering space contains a number of preset interest points, each of which has an interest characteristic value. The interest characteristic value of an interest point is determined from the features of training samples, the training samples being multimedia content that is selected in advance and related to that interest point.
In one embodiment, the media interest clustering space is a multidimensional vector space, the media characteristic value is represented by a media feature vector, and the interest characteristic value is represented by an interest feature vector.
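Steps S101 to S105 can be sketched as follows. This is a minimal sketch rather than the patented implementation: the names `SPACES` and `determine_interest_points`, the toy two-dimensional feature vectors, and the use of Euclidean distance as the "difference" are all assumptions made for illustration.

```python
import math

# Hypothetical clustering spaces: media type -> {interest point: interest feature vector}.
SPACES = {
    "image": {"tennis": [0.9, 0.1], "travel": [0.2, 0.8], "table tennis": [0.7, 0.3]},
    "sound": {"pop": [0.5, 0.5]},
}

def determine_interest_points(media_type, media_vector, k=1):
    """Return the k interest points whose feature vectors differ least from media_vector."""
    space = SPACES[media_type]                 # S102: pick the space for this media type
    diffs = {                                  # S103: difference to every interest point
        name: math.dist(media_vector, vec) for name, vec in space.items()
    }
    ranked = sorted(diffs, key=diffs.get)      # S104: sort by ascending difference
    return ranked[:k]                          # S105: keep the k closest interest points
```

Calling `determine_interest_points("image", [0.85, 0.15], k=2)` selects the two interest points closest to the media feature vector (step S101, feature extraction, is assumed to have produced that vector already).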
Taking one multimedia type, pictures, as an example, the following describes how the corresponding media interest clustering space is generated.
1) A user sets the interest points of the media interest clustering space in advance; for example, tennis, table tennis, and travel each correspond to one interest point in the space.
2) For each interest point, m training sample pictures and k keywords (optional) are input. The training sample pictures are a series of pictures related to the interest point. For the interest point "tennis", for example, pictures of tennis balls, rackets, and tennis matches can be input, and the keywords can be: tennis, Wimbledon, Roger Federer, Sharapova, and so on.
3) To represent these pictures with discrete values, quantifiable picture features (such as color levels, brightness, and contours) are extracted from the training sample pictures to form an n-dimensional vector. The procedure is as follows.
First, the pictures are normalized in size: every training sample picture is converted to the same size (for example 300*300). Whatever the aspect ratio of the original, the picture is compressed or stretched to the same resolution so that picture feature values can be extracted.
Picture feature value extraction: the content of an image is described by parameters such as color levels, block colors, and block contours. For example, three kinds of data are extracted from the image: the color levels, the average color of each image block, and the contour data of each image block. For an interest point (denoted α), m training sample pictures are selected. For one of these training sample pictures, i, the color-level data of the picture is obtained first. The picture is then divided into N*N small blocks (when the picture size is 300*300, N can be 8 to 10). The color of each block is averaged (blurred) and mapped to one color in an 8-color space, which yields a value representing the color of each block. Contour extraction is then performed on each block, yielding N*N vector-arc descriptions of the block contour features. This processing gives the feature vector that describes the content of training sample picture i, that is:
$$\mathrm{imgf}_i = (cs, o_1, o_2, o_3, \ldots, o_{N \times N}, c_1, c_2, c_3, \ldots, c_{N \times N})$$
where cs is a set that describes the color of the whole picture in the quantized HSV (hue, saturation, value) space, namely its distribution over 162 equally divided discrete intervals; $o_1, o_2, \ldots, o_{N \times N}$ are the vector descriptions of the contour features of the individual blocks; and $c_1, c_2, \ldots, c_{N \times N}$ are the color values of the individual blocks of the image.
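The block-color part of the feature vector (the values c_1 to c_{N×N}) can be sketched as below. This is an illustration under stated assumptions: the picture is taken to be already resized and supplied as a list of rows of (r, g, b) tuples, `block_colors` is a hypothetical name, and mapping to the 8-color space by thresholding each channel at 128 is an assumption, since the text does not say how the mapping is done.

```python
def block_colors(pixels, n):
    """Split a square RGB picture into n*n blocks and return one 8-color code per block."""
    size = len(pixels)
    step = size // n                     # block edge length; assumes size is divisible by n
    codes = []
    for by in range(n):
        for bx in range(n):
            block = [
                pixels[y][x]
                for y in range(by * step, (by + 1) * step)
                for x in range(bx * step, (bx + 1) * step)
            ]
            # Average (blur) each channel over the block.
            avg = [sum(p[ch] for p in block) / len(block) for ch in range(3)]
            # Assumed mapping: one bit per channel gives 2**3 = 8 colors (codes 0..7).
            r, g, b = (1 if v >= 128 else 0 for v in avg)
            codes.append(r << 2 | g << 1 | b)
    return codes
```

The blocks are visited row by row, so the returned list lines up with the order c_1, c_2, ..., c_{N×N}.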
For the m training sample pictures of interest point α, m image content feature vectors are obtained in this way, expressed as:
$$\mathrm{ivect} = (\mathrm{imgf}_1, \mathrm{imgf}_2, \mathrm{imgf}_3, \ldots, \mathrm{imgf}_m)$$
These m image content feature vectors roughly describe the image content features of the interest point.
Besides the image content feature vector, each image can also have keywords that describe it. For interest point α, for example, keywords can be supplied for each training sample picture; collecting the keywords of all training sample pictures gives the feature vocabulary vector of interest point α, that is:
$$\mathrm{despt} = (\mathrm{keyw}_1, \mathrm{keyw}_2, \ldots, \mathrm{keyw}_n)$$
where keyw denotes a keyword and despt is the set of feature words of the interest point.
A group of training sample images and keywords is input for each interest point set in the media interest clustering space, and the ivect and despt feature vectors are built for it. This completes the picture-based media interest clustering space.
The feature vectors can be described in the Extensible Markup Language (XML) or in another form and stored in a computer.
Once the picture-based media interest clustering space has been built, picture features can be extracted in the same way from a picture that a user operates on, generating the corresponding feature vector. (For clarity, the feature vector of an interest point is hereinafter called the interest feature vector, and the feature vector generated from the multimedia content that the user operates on is called the media feature vector.) The difference between the media feature vector and each interest feature vector is then calculated, and one or more interest points are selected, in ascending order of difference, as the interest points of the user.
With the method described above, the differences between the media feature vector and the interest feature vectors must be calculated every time the user operates on multimedia content, and the interest points with the smaller differences are then taken as the user's interest points. To reduce the amount of calculation as far as possible, in a preferred embodiment a content identifier is also generated for the multimedia content according to a preset identifier generation policy, and the correspondence between content identifiers and interest points is stored. The initial records of the correspondence are the content identifiers generated from the content data of the training samples, together with their interest points. After a user operates on multimedia content, a content identifier is generated for that content under the same policy and matched against the content identifiers in the stored correspondence. If the stored correspondence contains the content identifier, the corresponding interest point is taken directly as the user's interest point. If it does not, the interest point is determined as described above, and a record associating the content identifier with that interest point is added to the stored correspondence. The records in the correspondence thus keep growing, which raises the success rate of subsequent matching.
In one embodiment, the correspondence is stored as a table. This mapping table contains at least two fields: the content identifier and the corresponding interest point.
In one embodiment, the mapping table can also store the calculated difference, that is, the difference between the media feature vector of the multimedia content identified by the content identifier and the interest feature vector of the corresponding interest point.
A content identifier uniquely identifies the corresponding multimedia content. In the embodiments of the invention, the content identifier is not an arbitrarily assigned identifier or number with no specific meaning; it is calculated from the data of the multimedia content and characterizes that content. When the same identifier generation policy is applied to multimedia content of the same type, comparing content identifiers is enough to determine whether two pieces of multimedia content are identical.
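The mapping table acts as a cache in front of the full vector comparison. A minimal sketch, assuming an in-memory dictionary stands in for the stored table (`InterestCache` and `lookup_or_compute` are hypothetical names, not from the source):

```python
class InterestCache:
    """Hypothetical mapping table: content ID -> (interest point, difference)."""

    def __init__(self):
        self._table = {}

    def lookup_or_compute(self, content_id, compute):
        """Return the stored record for content_id, or compute and store it on a miss."""
        if content_id not in self._table:
            # Miss: run the full comparison once and add the record to the table.
            self._table[content_id] = compute()
        return self._table[content_id]
```

Here `compute` is any zero-argument callable that performs the full difference calculation; it runs only the first time a given content identifier is seen.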
Taking pictures as an example, the following explains how the content identifier of multimedia content is generated.
For picture files of the same size, an algorithm can be used to generate a unique identifier (ID) for each picture such that the probability of one ID representing different pictures approaches 0. First, some method is used to represent the large amount of data in the picture file as a short string of digits, for example the cyclic redundancy check (CRC) code of the image file. A number of values are then drawn from this short string and used as offsets; the data at these offsets in the picture file is extracted and merged with the short string to generate the ID.
One method of extracting the ID of a picture is as follows:
ID = ijk
where i is the 32-bit CRC code of the picture file after the picture has been resized to 300*300, and the CRC generator polynomial is:
$$G(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1$$
The first two decimal digits of i are denoted i1, and j is the value of the i1-th byte of the picture file. The most significant decimal digit of i is denoted i2, and k is the value of the i2-th byte of the picture file. The three values i, j, and k are merged to give the ID. In this way, after resizing, the ID essentially identifies each picture file uniquely. When two picture files have the same content, their IDs are the same, so comparing IDs is enough to determine whether two pictures are identical.
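The ID construction can be sketched with Python's `zlib.crc32`, which is based on the same CRC-32 generator polynomial. The sketch takes the bytes of the already resized picture file as input. Several details are assumptions because the text leaves them open: byte offsets are treated as zero-based and wrapped modulo the file length, the three parts are joined as hexadecimal fields separated by hyphens, and `content_id` is a hypothetical name.

```python
import zlib

def content_id(data: bytes) -> str:
    """Build an ID from the CRC-32 of the file plus two bytes picked via its decimal digits."""
    if not data:
        raise ValueError("empty file")
    i = zlib.crc32(data) & 0xFFFFFFFF      # i: 32-bit CRC of the file contents
    digits = str(i)
    i1 = int(digits[:2])                   # first two decimal digits of i
    i2 = int(digits[0])                    # most significant decimal digit of i
    j = data[i1 % len(data)]               # byte at offset i1 (wrapping is an assumption)
    k = data[i2 % len(data)]               # byte at offset i2
    return f"{i:08x}-{j:02x}-{k:02x}"      # merge i, j, k into one identifier
```

Identical files always give identical IDs; different files give different IDs except in the rare case of a collision in all three parts.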
A corresponding ID can also be generated for an audio data file. For example, the first 10 seconds of the audio file are cut out (starting from the point where the audio has a waveform), and audio features are extracted from these 10 seconds of data to generate the ID. The audio features include, for example, short-time average energy, zero-crossing rate, frequency centroid, and bandwidth.
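Two of the audio features named above can be computed directly from a frame of samples. The sketch below uses the usual textbook definitions and assumes a frame is a list of signed sample values of length at least two; how the features are combined into an ID is not specified in the text and is not attempted here.

```python
def short_time_energy(frame):
    """Average of the squared sample values in one frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```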
Because XML has a tree structure, which matches the sub-vectors contained in an imgf vector, XML can be used to represent the media features. The definitions are shown in Table 1:
Table 1:
| Tag name | Parent node | Attributes | Content description |
| --- | --- | --- | --- |
| <media> | Root node | ID: unique identifier of an image; Type: type of the media, here image | Root node of a media feature |
| <imgf> | <media> | id: sequence number of the imgf | Image content feature vector |
| <colorStage> | <imgf> | | Start of the color levels in the imgf; the color levels are represented by several values |
| <value> | <colorStage> | stage: sequence number of the color level | One value in the color-level vector |
| <outline> | <imgf> | id: block number (1 to N*N) | Contour vector of one of the N*N blocks |
| <line> | <outline> | startX, startY, endX, endY: start and end coordinates of a line; ID: unique identifier of the line tag | Start and end coordinates of one line in the outline |
| <colorSet> | <imgf> | | Colors of the N*N blocks |
| <color> | <colorSet> | id: block number (1 to N*N) | Color of one block in the imgf |
| <keywordSet> | <media> | | Feature vocabulary vector |
| <keyword> | <keywordSet> | id: ID of the keyword | One keyword in the feature vocabulary vector |
From the above description, an interest point corresponds to one or more interest feature vectors; the number of vectors equals the number of training samples selected for the interest point. As stated above, interest point α has m training sample pictures, so its image content feature vector $\mathrm{ivect} = (\mathrm{imgf}_1, \mathrm{imgf}_2, \mathrm{imgf}_3, \ldots, \mathrm{imgf}_m)$ contains the m vectors $\mathrm{imgf}_1$ to $\mathrm{imgf}_m$.
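The tag layout of Table 1 can be sketched with the standard-library `xml.etree.ElementTree` module. This is an illustrative sketch only: `media_to_xml` is a hypothetical helper, the `<outline>` and `<line>` elements are left out for brevity, and writing each value as element text is an assumption, since the table does not say where the values are stored.

```python
import xml.etree.ElementTree as ET

def media_to_xml(media_id, color_levels, block_colors, keywords):
    """Serialize one image feature vector and its keywords using the Table 1 tag names."""
    media = ET.Element("media", {"ID": media_id, "Type": "image"})
    imgf = ET.SubElement(media, "imgf", {"id": "1"})
    stage_el = ET.SubElement(imgf, "colorStage")
    for stage, level in enumerate(color_levels, start=1):
        ET.SubElement(stage_el, "value", {"stage": str(stage)}).text = str(level)
    color_set = ET.SubElement(imgf, "colorSet")
    for block, code in enumerate(block_colors, start=1):
        ET.SubElement(color_set, "color", {"id": str(block)}).text = str(code)
    kw_set = ET.SubElement(media, "keywordSet")
    for idx, word in enumerate(keywords, start=1):
        ET.SubElement(kw_set, "keyword", {"id": str(idx)}).text = word
    return ET.tostring(media, encoding="unicode")
```

An interest point with m training pictures would carry m `<imgf>` children under the same root, one per training sample.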
Taking the picture-based media interest clustering space generated in the above embodiment as an example, when the multimedia content operated on by the user is a picture, the flow for determining the user's interest points is shown in Fig. 2 and comprises:
Step S201: according to the picture operated on by the user, use the method described above to determine the media feature vector and the content identifier of that picture.
Step S202: match the content identifier of the picture against the stored correspondence between content identifiers and interest points.
Step S203: judge whether an identical content identifier has been matched; if so, perform step S204; otherwise, perform step S205.
Step S204: obtain the interest point corresponding to the content identifier from the stored correspondence, and go to step S213.
Step S205: obtain the interest feature vectors ivect and despt of the next interest point.
Step S206: take the next imgf vector from ivect, calculate the difference between this imgf vector and the media feature vector of the picture operated on by the user, and record the difference.
Suppose $\mathrm{imgf}_1$ denotes the media feature vector of the picture operated on by the user, where:
$$\mathrm{imgf}_1 = (cs_1, o_{1,1}, o_{1,2}, o_{1,3}, \ldots, o_{1,N \times N}, c_{1,1}, c_{1,2}, c_{1,3}, \ldots, c_{1,N \times N})$$
Suppose $\mathrm{imgf}_2$ is one imgf vector in ivect, where:
$$\mathrm{imgf}_2 = (cs_2, o_{2,1}, o_{2,2}, o_{2,3}, \ldots, o_{2,N \times N}, c_{2,1}, c_{2,2}, c_{2,3}, \ldots, c_{2,N \times N})$$
The difference between $\mathrm{imgf}_1$ and $\mathrm{imgf}_2$ is:
$$\mathrm{DBI}(\mathrm{imgf}_1, \mathrm{imgf}_2) = \gamma_1 \cdot \mathrm{cdist}(\mathrm{imgf}_1, \mathrm{imgf}_2) + \gamma_2 \cdot \mathrm{odist}(\mathrm{imgf}_1, \mathrm{imgf}_2) + \gamma_3 \cdot \mathrm{cdist}(\mathrm{imgf}_1, \mathrm{imgf}_2)$$
where $\mathrm{cdist}(\mathrm{imgf}_1, \mathrm{imgf}_2) = (cs_1 - cs_2)^{T} A \, (cs_1 - cs_2)$, and the entries of $A = [a_{i,j}]$ express the relation between color i and color j. $\mathrm{odist}(\mathrm{imgf}_1, \mathrm{imgf}_2)$ expresses the difference in contours between the two images after image cutting, that is:
$$\mathrm{odist}(\mathrm{imgf}_1, \mathrm{imgf}_2) = (o_{1,1} - o_{2,1})^2 + (o_{1,2} - o_{2,2})^2 + \ldots + (o_{1,N \times N} - o_{2,N \times N})^2$$
$\gamma_1$, $\gamma_2$, and $\gamma_3$ are the weights of the respective parts.
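The weighted difference DBI can be sketched as below, with three loudly stated assumptions. First, the source formula lists cdist for the third term as well; this sketch assumes that the third term is meant to compare the block colors c, and computes it as a sum of squared differences. Second, each contour descriptor o is simplified to a single number. Third, the dictionary layout and the function names are hypothetical.

```python
def quad_form_dist(h1, h2, A):
    """(h1 - h2)^T A (h1 - h2): histogram distance weighted by color relations a[i][j]."""
    d = [x - y for x, y in zip(h1, h2)]
    n = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))

def sq_dist(u, v):
    """Sum of squared component differences."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def dbi(f1, f2, A, g1, g2, g3):
    """Weighted difference between two image feature vectors given as dicts cs/o/c."""
    return (
        g1 * quad_form_dist(f1["cs"], f2["cs"], A)   # color-histogram term
        + g2 * sq_dist(f1["o"], f2["o"])             # contour term (odist)
        + g3 * sq_dist(f1["c"], f2["c"])             # block-color term (assumed)
    )
```

With A set to the identity matrix, the histogram term reduces to the squared Euclidean distance between the two histograms.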
Step S207: judge whether the difference between every imgf vector and the media feature vector of the picture operated on by the user has been calculated; if not, go to step S206; if so, continue with step S208.
Step S208: judge whether the picture has keywords attached; if so, perform step S209; if not, perform step S210.
Step S209: determine whether the keywords attached to the picture appear in the despt set of the interest point, and count the occurrences.
Suppose the picture has m keywords attached. The keyword set of the picture can then be expressed as:
$$\mathrm{Ikw} = \bigcup_{i=1}^{m} \mathrm{Ikeyw}_i$$
If interest point j has n keywords in total, its keyword set is:
$$\mathrm{kw}_j = \bigcup_{l=1}^{n} \mathrm{keyw}_l$$
The keyword relevance KWR between the picture keyword set Ikw and the interest point keyword set $\mathrm{kw}_j$ is expressed as:
$$\mathrm{KWR}(\mathrm{Ikw}, \mathrm{kw}_j) = \sum_{l=1}^{m} \mathrm{isAppeared}(\mathrm{Ikeyw}_l, \mathrm{kw}_j)$$
where isAppeared(keyword, keywordset) indicates whether the picture keyword keyword appears in the interest point keyword set keywordset: its value is 1 if the keyword appears and 0 otherwise.
In other words, KWR is the total number of times that the keywords attached to the picture appear in the keyword set of the interest point.
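The keyword relevance is a simple count. A minimal sketch, assuming keywords are compared as exact strings (`kwr` is a hypothetical function name):

```python
def kwr(picture_keywords, interest_keywords):
    """Count how many picture keywords appear in the interest point's keyword set."""
    vocabulary = set(interest_keywords)
    # Each picture keyword contributes 1 if it appears in the set, else 0 (isAppeared).
    return sum(1 for keyword in picture_keywords if keyword in vocabulary)
```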
Step S210: calculate the difference between the media feature vector of the picture operated on by the user and the interest feature vectors of the current interest point, taking the minimum difference over the imgf vectors as the difference between the picture and the current interest point. The difference DII is calculated as:
$$\mathrm{DII} = \beta_1 \min_{1 \le i \le m} \mathrm{DBI}(\mathrm{imgf}_1, \mathrm{imgf}_i) - \beta_2 \, \mathrm{KWR}(\mathrm{Ikw}, \mathrm{kw})$$
where $\mathrm{imgf}_1$ is the media feature vector of the picture operated on by the user, $\mathrm{imgf}_i$ ranges over the m imgf vectors of the current interest point, and $\beta_1$ and $\beta_2$ are weight coefficients.
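The per-interest-point difference DII and the subsequent ranking can be sketched as follows. The default weight values and the function names are assumptions chosen for illustration; the text does not give values for the weight coefficients.

```python
def dii(dbi_values, kwr_value, beta1=1.0, beta2=0.5):
    """DII = beta1 * (smallest DBI over the training vectors) - beta2 * KWR."""
    return beta1 * min(dbi_values) - beta2 * kwr_value

def rank_interest_points(per_point):
    """per_point maps interest point -> (list of DBI values, KWR); smallest DII first."""
    scores = {name: dii(dbis, k) for name, (dbis, k) in per_point.items()}
    return sorted(scores, key=scores.get)
```

Because KWR is subtracted, a picture whose keywords match an interest point's vocabulary gets a smaller difference for that interest point and therefore ranks it higher.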
Step S211: judge whether all interest points in the media interest clustering space have been processed; if not, go to step S205; if so, perform step S212.
Step S212: compare the differences of the interest points, sort them in ascending order, and determine the interest point with the smallest difference or the K interest points with the smallest differences. The content identifier of the picture operated on by the user, the selected interest points, and the calculated differences are added to the stored correspondence.
Step S213: save the interest point determined in step S204 or step S212 to the user interest history record as the user's interest point. The calculated difference can also be saved in the user interest history record; its size indicates how far the user's interest is from the corresponding interest point.
With the flow shown in Fig. 2, a content identifier is generated at the same time as the media feature vector of the multimedia content operated on by the user. The content identifier is first matched against the stored mapping table of content identifiers and interest points. If an identical content identifier is found, the corresponding interest point is output directly as the user's interest point, which avoids comparing the content with every interest point in the media interest clustering space one by one. If no identical content identifier is found, the interest point is determined by the method of the above embodiment, and the content identifier of the multimedia content is added to the correspondence together with the interest point determined. The records in the correspondence therefore keep growing, and so does the likelihood that subsequent matching by content identifier succeeds.
When the user is online, having logged in to the corresponding network-side server through the Internet, the network-side server can capture the multimedia content operated by the user and execute the user interest point determining method provided by the above embodiment of the present invention to determine the user's interest point.
When the user operates multimedia content offline, the user client can first determine the corresponding media feature vector and content identifier from the multimedia content operated by the user and store them locally. When the user client logs in to the network-side server, it sends the locally stored media feature vector and content identifier to the network-side server, which determines the user's interest point by the above method according to the media interest clustering space generated in advance.
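The offline case amounts to a local queue that is flushed on login. The following is a minimal sketch under that assumption; the class names `PendingRecord` and `UserClient` and the `send` callback are hypothetical and do not correspond to any interface named in the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class PendingRecord:
    media_vector: Sequence[float]
    content_id: str


@dataclass
class UserClient:
    # Records produced while offline, kept in the order they were created.
    pending: List[PendingRecord] = field(default_factory=list)

    def record_offline(self, media_vector: Sequence[float], content_id: str) -> None:
        """Store the feature vector and identifier locally while offline."""
        self.pending.append(PendingRecord(media_vector, content_id))

    def on_login(self, send: Callable[[PendingRecord], None]) -> int:
        """Send every stored record to the server and return how many were sent."""
        sent = 0
        while self.pending:
            send(self.pending.pop(0))
            sent += 1
        return sent


received: List[PendingRecord] = []  # stands in for the network-side server
client = UserClient()
client.record_offline([0.8, 0.2], "id-1")
client.record_offline([0.1, 0.7], "id-2")
count = client.on_login(received.append)
print(count, [r.content_id for r in received])
```

Only the feature vector and the identifier cross the network in this arrangement, so the server can apply the same determination method as in the online case.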
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc.
Based on the same inventive concept, an embodiment of the present invention further provides a user interest point determining device, whose structure is shown in Fig. 3. The device comprises:
A media feature determination module 31, configured to determine, according to the multimedia content operated by the user, a media feature value characterizing that multimedia content;
A clustering space determination module 32, configured to determine the corresponding media interest clustering space according to the multimedia type to which the multimedia content operated by the user belongs;
An interest point determination module 33, configured to calculate the difference between the media feature value and the interest feature value of each interest point in the media interest clustering space, and to select, in ascending order of difference, one or more interest points as the user's interest points;
A clustering space generation and storage module 34, configured to generate and store a media interest clustering space for each multimedia type. Each media interest clustering space contains preset interest points, and the interest feature value of each interest point is determined by the features of the training samples selected for that interest point.
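How modules 32 and 33 cooperate can be sketched as follows: one clustering space is kept per multimedia type, the space matching the type of the operated content is selected, and the closest interest point is chosen. The type names, interest points, vectors, and the use of Euclidean distance as the difference value are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence

ClusteringSpace = Dict[str, List[float]]

# Hypothetical spaces generated in advance, one per multimedia type (module 34).
spaces: Dict[str, ClusteringSpace] = {
    "picture": {"landscape": [0.9, 0.1], "portrait": [0.1, 0.9]},
    "text": {"sports": [1.0, 0.0, 0.0], "finance": [0.0, 1.0, 0.0]},
}


def determine_space(multimedia_type: str) -> ClusteringSpace:
    """Module 32: pick the clustering space for the content's multimedia type."""
    try:
        return spaces[multimedia_type]
    except KeyError:
        raise ValueError(f"no clustering space for type {multimedia_type!r}") from None


def determine_interest_point(multimedia_type: str, media_vector: Sequence[float]) -> str:
    """Module 33: return the interest point with the smallest difference value."""
    space = determine_space(multimedia_type)
    # Assumption: Euclidean distance stands in for the difference value.
    return min(space, key=lambda name: math.dist(space[name], media_vector))


print(determine_interest_point("text", [0.9, 0.1, 0.0]))
print(determine_interest_point("picture", [0.2, 0.8]))
```

Keeping the spaces separate per type means feature vectors of different dimensionality, such as the two-dimensional picture vectors and three-dimensional text vectors here, are never compared with each other.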
In one embodiment, the user interest point determining device further comprises:
A correspondence storage and update module 35, configured to store the correspondence between content identifiers of multimedia content and interest points. A content identifier is generated from the multimedia content data according to a set identifier generation strategy. The initial records of the correspondence comprise the content identifiers generated from the content data of the training samples, each paired with its interest point. The module also stores correspondence records pairing a content identifier that was generated from multimedia content operated by the user and is not yet stored locally with the interest point selected for it.
In one embodiment, the structure of the clustering space generation and storage module 34 is shown in Fig. 4 and comprises:
A setting submodule 341, configured to set one or more interest points and to select one or more training samples for each interest point;
A feature extraction submodule 342, configured to extract the features of the training samples of each interest point and to generate the interest feature vector corresponding to each interest point;
A generation submodule 343, configured to store the interest feature vectors of the interest points that have been set, thereby generating the media interest clustering space.
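The three submodules can be sketched as one small function. How the features of several training samples are combined into an interest feature vector is not stated in this passage, so the sketch assumes a component-wise mean; `toy_features` is a deliberately trivial, hypothetical stand-in for real feature extraction.

```python
from statistics import fmean
from typing import Callable, Dict, List, Sequence


def toy_features(sample: bytes) -> List[float]:
    """Hypothetical feature extractor: share of digit bytes and of letter bytes."""
    n = len(sample) or 1
    digits = sum(chr(b).isdigit() for b in sample)
    letters = sum(chr(b).isalpha() for b in sample)
    return [digits / n, letters / n]


def build_clustering_space(
    training_samples: Dict[str, List[bytes]],
    extract_features: Callable[[bytes], Sequence[float]],
) -> Dict[str, List[float]]:
    space: Dict[str, List[float]] = {}
    # Submodule 341: the dict keys are the preset interest points, the values
    # are the training samples selected for each of them.
    for interest_point, samples in training_samples.items():
        if not samples:
            raise ValueError(f"interest point {interest_point!r} has no training sample")
        # Submodule 342: extract the features of every training sample.
        vectors = [extract_features(sample) for sample in samples]
        # Assumption: the interest feature vector is the component-wise mean.
        # Submodule 343: store the vector as part of the clustering space.
        space[interest_point] = [fmean(component) for component in zip(*vectors)]
    return space


space = build_clustering_space(
    {"numbers": [b"1234", b"12ab"], "words": [b"abcd"]},
    toy_features,
)
print(space)
```

Averaging yields exactly one vector per interest point; an implementation that keeps several vectors per interest point would store the list `vectors` instead of its mean.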
In one embodiment, the structure of the interest point determination module 33 is shown in Fig. 5 and comprises:
A content identifier generation submodule 331, configured to generate, according to the set identifier generation strategy, the content identifier of the multimedia content operated by the user;
A matching submodule 332, configured to match the generated content identifier of the multimedia content operated by the user against the content identifiers stored in the correspondence storage and update module 35, and to output the matching result;
A determination submodule 333, configured to, when the matching result is that an identical content identifier has been found, determine the interest point corresponding to that content identifier as the user's interest point according to the correspondence stored in the correspondence storage and update module 35; and
when the matching result is that no identical content identifier has been found, to calculate the difference values by the method described above and to select, in ascending order of difference, one or more interest points as the user's interest points.
In practical applications, all modules of the user interest point determining device provided by the present invention may be arranged in the network-side server. Alternatively, the media feature determination module may be arranged in the user client and the remaining modules in the network-side server, in which case the user client sends the media feature value, or the media feature value and the content identifier, to the network-side server.
When the media feature determination module is arranged in the user client, the structure of the user client is shown in Fig. 6 and comprises:
A user operation event generator 61, configured to generate multimedia content operations and to store these operations;
A media feature determination module 62, configured to determine, according to the multimedia content operated by the user, a media feature value characterizing that multimedia content; or, in addition to determining the media feature value, to generate the content identifier of the multimedia content operated by the user according to the set identifier generation strategy;
A media feature storage module 63, configured to store the media feature value determined by the media feature determination module 62 and, when the media feature determination module 62 also generates the content identifier, to store the content identifier generated by the media feature determination module 62;
A media feature sending module 64, configured to send the stored media feature value, or the stored media feature value and content identifier, to the network-side server.
With the media feature determination module arranged in the user client, when the user operates multimedia content offline, the user client can determine the corresponding media feature vector and content identifier from the multimedia content operated by the user and store them locally. When the user client logs in to the network-side server, it sends the locally stored media feature vector and content identifier to the network-side server, which determines the user's interest point by the user interest point determining method disclosed in the above embodiment of the present invention, according to the media interest clustering space generated in advance.
In summary, the present invention generates a media interest clustering space for each multimedia type and determines the interest feature value of each interest point in the media interest clustering space from training samples. When the user operates multimedia content (such as pictures, text, or sound), the corresponding media feature value is determined from that multimedia content; the media interest clustering space corresponding to the multimedia content operated this time is selected; the difference between the media feature value and the interest feature value of each interest point in that space is calculated; and the interest points with the smaller differences are determined as the user's interest points. The media feature value characterizes the multimedia content operated by the user, and the interest feature value of an interest point is determined by the features of its training samples. A smaller difference between the media feature value and the interest feature value therefore indicates that the multimedia content operated by the user is closer to the training samples of that interest point, so the user's interest point can be determined more accurately from the multimedia content the user operates. By tracking and analyzing the multimedia content operated by the user over the long term with the method provided by the present invention, and by recording and continuously updating the user interest point history record, the user's preferences can be determined more accurately.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.

Claims (12)

1. A user interest point determining method, characterized by comprising:
determining, according to multimedia content operated by a user, a media feature value characterizing said multimedia content;
determining the corresponding media interest clustering space according to the multimedia type to which said multimedia content belongs;
calculating the difference between said media feature value and the interest feature value of each interest point in said media interest clustering space;
selecting, in ascending order of said difference, one or more interest points as said user's interest points;
wherein said media interest clustering space is generated in advance for each multimedia type; said media interest clustering space contains preset interest points, and the interest feature value of each said interest point is determined by the features of the training samples selected for said interest point.
2. The method of claim 1, characterized in that said media interest clustering space is a multidimensional vector space;
said media feature value is represented by a media feature vector;
said interest feature value is represented by an interest feature vector;
said calculating the difference between said media feature value and the interest feature value of each interest point in said media interest clustering space comprises:
calculating the vector difference between said media feature vector and the interest feature vector of each interest point in said media interest clustering space.
3. The method of claim 2, characterized in that said media feature vector and said interest feature vector are represented in Extensible Markup Language (XML).
4. The method of claim 2, characterized in that said media interest clustering space is generated by:
setting one or more interest points;
selecting one or more corresponding training samples for each interest point;
extracting the features of the training samples of each interest point, and generating the interest feature vector corresponding to each interest point;
generating said media interest clustering space from the interest feature vectors of said one or more interest points.
5. The method of claim 4, characterized in that one interest point corresponds to one or more interest feature vectors.
6. The method of any one of claims 1-5, characterized by further comprising: storing the correspondence between content identifiers of multimedia content and interest points; said content identifier is generated from the content data of the multimedia content according to a set identifier generation strategy, and the initial records of said correspondence comprise the correspondence between the content identifiers generated from the content data of the multimedia content of said training samples and the corresponding interest points;
before calculating said difference, further comprising: generating, according to said set identifier generation strategy, the content identifier of the multimedia content operated by the user, and matching it against the content identifiers in the stored correspondence; when an identical content identifier is found, determining the interest point corresponding to that content identifier as said user's interest point according to the stored correspondence, and not performing the difference calculation step;
when no identical content identifier is found, calculating said difference and selecting, in ascending order of said difference, one or more interest points as said user's interest points; and adding to said correspondence a record pairing the content identifier generated this time for the multimedia content operated by said user with the interest point selected this time.
7. A user interest point determining device, characterized by comprising:
a media feature determination module, configured to determine, according to multimedia content operated by a user, a media feature value characterizing said multimedia content;
a clustering space determination module, configured to determine the corresponding media interest clustering space according to the multimedia type to which said multimedia content belongs;
an interest point determination module, configured to calculate the difference between said media feature value and the interest feature value of each interest point in said media interest clustering space, and to select, in ascending order of said difference, one or more interest points as said user's interest points;
a clustering space generation and storage module, configured to generate and store a media interest clustering space for each multimedia type; said media interest clustering space contains preset interest points, and the interest feature value of each said interest point is determined by the features of the training samples selected for said interest point.
8. The device of claim 7, characterized in that said clustering space generation and storage module comprises:
a setting submodule, configured to set one or more interest points and to select one or more corresponding training samples for each interest point;
a feature extraction submodule, configured to extract the features of the training samples of each interest point and to generate the interest feature vector corresponding to each interest point;
a generation submodule, configured to store the interest feature vectors of said one or more interest points, thereby generating said media interest clustering space.
9. The device of claim 8, characterized by further comprising:
a correspondence storage and update module, configured to store the correspondence between content identifiers of multimedia content and interest points; said content identifier is generated from the content data of the multimedia content according to a set identifier generation strategy, and the initial records of said correspondence comprise the correspondence between the content identifiers generated from the content data of the multimedia content of said training samples and the corresponding interest points; and
to store correspondence records pairing the content identifier generated from the multimedia content operated by the user with the correspondingly selected interest point.
10. The device of claim 9, characterized in that said interest point determination module comprises:
a content identifier generation submodule, configured to generate, according to said set identifier generation strategy, the content identifier of the multimedia content operated by the user;
a matching submodule, configured to match the generated content identifier of the multimedia content operated by said user against the content identifiers stored in said correspondence storage and update module, and to output the matching result;
a determination submodule, configured to, when the matching result is that an identical content identifier has been found, determine the interest point corresponding to that content identifier as said user's interest point according to the correspondence stored in said correspondence storage and update module, without performing the difference calculation step; and
when the matching result is that no identical content identifier has been found, to calculate said difference and to select, in ascending order of said difference, one or more interest points as said user's interest points.
11. The device of any one of claims 7-9, characterized in that each module of said device is arranged in a network-side server; or
the media feature determination module of said device is arranged in a user client, and the remaining modules are arranged in the network-side server; said user client further sends said media feature value, or said media feature value and the content identifier, to said network-side server.
12. The device of claim 11, characterized in that said user client comprises:
a user operation event generator, configured to generate multimedia content operations and to store these operations;
a media feature determination module, configured to determine, according to the multimedia content operated by the user, a media feature value characterizing said multimedia content; or, in addition to determining said media feature value, to generate from the multimedia content data the content identifier of the multimedia content operated by the user according to the set identifier generation strategy;
a media feature storage module, configured to store the media feature value determined by said media feature determination module and, when said media feature determination module also generates said content identifier, to store said content identifier;
a media feature sending module, configured to send the stored media feature value, or the stored media feature value and content identifier, to the network-side server.
CN200810241181A 2008-12-26 2008-12-26 User interest point determining method and device Expired - Fee Related CN101771957B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810241181A CN101771957B (en) 2008-12-26 2008-12-26 User interest point determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810241181A CN101771957B (en) 2008-12-26 2008-12-26 User interest point determining method and device

Publications (2)

Publication Number Publication Date
CN101771957A CN101771957A (en) 2010-07-07
CN101771957B true CN101771957B (en) 2012-10-03

Family

ID=42504485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810241181A Expired - Fee Related CN101771957B (en) 2008-12-26 2008-12-26 User interest point determining method and device

Country Status (1)

Country Link
CN (1) CN101771957B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8952983B2 (en) 2010-11-04 2015-02-10 Nokia Corporation Method and apparatus for annotating point of interest information
CN105635210B (en) * 2014-10-30 2021-04-27 腾讯科技(武汉)有限公司 Network information recommendation method and device and reading system
CN104715007A (en) * 2014-12-26 2015-06-17 小米科技有限责任公司 User identification method and device
CN109284449B (en) * 2018-10-23 2020-06-16 厦门大学 Interest point recommendation method and device
CN110781413B (en) * 2019-08-28 2024-01-30 腾讯大地通途(北京)科技有限公司 Method and device for determining interest points, storage medium and electronic equipment
CN111222047A (en) * 2020-01-03 2020-06-02 深圳市华宇讯科技有限公司 Picture downloading method, device, server and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1273400A (en) * 1999-05-10 2000-11-15 松下电器产业株式会社 Multi-media content extraction system and method using user described data recording medium
CN1549982A (en) * 2001-08-28 2004-11-24 皇家飞利浦电子股份有限公司 Automatic question formulation from a user selection in multimedia content
CN101267518A (en) * 2007-02-28 2008-09-17 三星电子株式会社 Method and system for extracting relevant information from content metadata

Also Published As

Publication number Publication date
CN101771957A (en) 2010-07-07

Similar Documents

Publication Publication Date Title
RU2745632C1 (en) Automated response server device, terminal device, response system, response method and program
CN101771957B (en) User interest point determining method and device
US8983971B2 (en) Method, apparatus, and system for mobile search
CN106326369B (en) Application topic recommendation method and device and server
Kim et al. Viscors: A visual-content recommender for the mobile web
CN108595461B (en) Interest exploration method, storage medium, electronic device and system
CN104331459B (en) A kind of network resource recommended method and device based on on-line study
CN107256267A (en) Querying method and device
CN105573995A (en) Interest identification method, interest identification equipment and data analysis method
CN110378434A (en) Training method, recommended method, device and the electronic equipment of clicking rate prediction model
CN111262953B (en) Method and device for pushing information in real time
CN102073704B (en) Text classification processing method, system and equipment
US9330135B2 (en) Method, apparatus and computer readable recording medium for a search using extension keywords
CN108536680B (en) Method and device for acquiring house property information
CN101083633A (en) Information searching system and searching method
CN112070542A (en) Information conversion rate prediction method, device, equipment and readable storage medium
CN101894146A (en) Method and system for realizing advertising function by using created text edit box
CN110855487A (en) Network user similarity management method, device and storage medium
CN111767953B (en) Method and apparatus for training an article coding model
CN113744002A (en) Method, device, equipment and computer readable medium for pushing information
CN111639199A (en) Multimedia file recommendation method, device, server and storage medium
CN108932262B (en) Song recommendation method and device
CN106844504B (en) A kind of method and apparatus for sending song and singly identifying
CN113032616B (en) Audio recommendation method, device, computer equipment and storage medium
CN113450134B (en) Advertisement putting method, device, equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121003

Termination date: 20211226
