CN108920521A - User portrait-item recommendation system and method based on pseudo-ontology - Google Patents

Publication number
CN108920521A
Authority
CN
China
Prior art keywords
user
portrait
concept
project
preference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810563501.9A
Other languages
Chinese (zh)
Other versions
CN108920521B (en)
Inventor
张涛
邓悦
翁康年
张滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Finance and Economics
Original Assignee
Shanghai University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Finance and Economics
Priority to CN201810563501.9A
Publication of CN108920521A
Application granted
Publication of CN108920521B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a user portrait-item recommendation system and method based on a pseudo-ontology. The system comprises a pseudo-ontology module, a user portrait module, an item portrait module, and a preference-based recommendation module. The pseudo-ontology module obtains domain-related text, generates a domain pseudo-ontology, and outputs it to the user portrait module and the item portrait module. The user portrait module obtains the user's web browsing behavior and, according to the domain pseudo-ontology, outputs an optimized user feature vector to the preference-based recommendation module. The item portrait module obtains item description text and, according to the domain pseudo-ontology, outputs an optimized item feature vector to the preference-based recommendation module. The preference-based recommendation module outputs each user's item preference ranking according to the optimized user feature vector and the optimized item feature vector.

Description

User portrait-item recommendation system and method based on pseudo-ontology
Technical field
The invention belongs to the technical field of user portrait generation, and in particular relates to a user portrait-item recommendation system and method based on a pseudo-ontology.
Background art
Recommendation based on user portraits is a content-based recommendation method. It extracts features that reflect the user's interests from the items the user selects or from the behavior that accompanies selection, such as item attributes (color, shape) or the user's comments on items, and takes the extracted interest features as the user portrait. The items to be recommended are represented with features of the same dimensions, and recommendations are produced by computing over the user portrait and the item portrait, so the result depends heavily on the accuracy of both portraits. A user portrait varies with the data on which it is built. Current research mainly uses three types of data: the first is users' transaction data; the second is text published by users, mostly comments on e-commerce sites and posts that users publish spontaneously on social networking sites; the third is behavioral data from users browsing web pages. The first type contains only numerical data and no text, so it can only characterize purchasing habits, and the portrait dimensions it supports are limited. The second type can capture users' subjective sentiment and affective values, but it is limited by the fact that users differ in how they express themselves in text, and it only covers users who are active online. The third type mainly records which web pages a user browsed and when. Whether or not users like to post comments or statuses online, web browsing has become the most direct way for them to obtain information, so web browsing data contains a large amount of objective information about user interests and is better suited to building user portraits than the first two types. This system therefore builds user portraits from users' web browsing behavior.
Although web browsing behavior contains a large amount of information about user interests, extracting effective features from it is not easy. Early work usually took the domain names or page keywords that a user browsed frequently as the feature set and used the browsing frequency, after some transformation, as the feature value. When this method is applied to a group of users, however, the users' frequently visited websites differ, so the portraits do not share a common set of dimensions. In recent years, ontology-based user portrait modeling has developed and shown results. Ontologies have a natural advantage: the relations between concepts are explicitly defined, so portraits built on an ontology describe user characteristics objectively and clearly, and the feature set stays fixed across users.
One of the key techniques in ontology-based user portrait modeling from web browsing behavior is how to represent browsing behavior with the ontology. A typical work is that of Domen et al., who used an expert mapping method suited to their research problem. Their team had a solid foundation in its research field and could have experts identify and classify the topics of pages from a fixed set of websites. This method cannot be extended to other fields, and although expert mapping is accurate, it consumes a great deal of time and labor. Hawalah et al. proposed a tf-idf mapping method: the text of a web page and the explanatory text of an ontology concept are each represented as bag-of-words vectors by tf-idf, the similarity between the two texts is measured, and that similarity is used as the mapping value from the page to the ontology. This largely solves the difficulty of mapping pages from different fields to an ontology, but it presupposes that an ontology already exists and that each concept in it has a complete explanatory document. In practice, only a few fields have ontologies.
Web browsing behavior contains a large amount of information related to the user but also a large amount of noise, and how to extract genuine, useful information from this mass of cluttered data is a focus of this line of research. The generally accepted assumption is that a user who is interested in a topic will repeatedly browse pages on that topic. Based on this assumption, the Contextual Concept Clustering (3C) algorithm was proposed: the topics identified in a web page are clustered by similarity weight, the concept with the largest weight is taken to represent the page's topic, and a parameter β controls the number of candidate concepts that take part in the clustering. This method effectively excludes noise in a page. However, when a page genuinely contains several topics, the algorithm keeps only the highest-weighted concept, so the final mapping misses the page's other topics. If the user browsed the page out of interest in several of those topics, the resulting portrait captures the user's interests incompletely.
After the initial portrait is built, the emphasis of modeling is to keep the information in the portrait accurate while making it effective for recommendation. Chen et al. used a three-layer ontology to build user portraits and pointed out that this improves overall accuracy: limiting the number of layers reduces the number of concepts in the ontology and hence the probability of mapping a user interest to a wrong concept. Too few layers, however, make the portrait's expression of user interests too broad, so that interests cannot be identified in fine-grained sub-fields. In more recent work, researchers proposed an algorithm that updates a parent concept with 50% of the weight of its child concepts, and the Sub-class Aggregation Scheme (SAS) algorithm. In the former, the update continues up to the root of the ontology. This makes the portrait more accurate to some extent, but the cumulative update generally makes the values of high-level concepts larger than those of low-level concepts, which again makes the portrait too broad. The latter is based on the idea that a portrait built from the top two layers of the ontology is too broad while one built from the bottom layer is too fine, so it uses the third layer as a compromise, with the weight of each third-layer concept updated from all of its child concepts. In practice, however, ontologies differ in depth, some having many layers and some few, and portrait requirements also vary, so which layer is suitable for building user portraits cannot be decided once and for all. Hawalah et al. argue that the Gradual Extra Weight (GEW) algorithm effectively improves portrait accuracy. It uses a parameter α and the layer in which a concept sits to control the weight with which a child concept updates its parent concept, and updates the portrait accordingly. Compared with the preceding methods, it takes the differences between ontology levels into account and updates concept values level by level, which improves portrait accuracy to some extent. However, concepts on the same level also differ from one another, and the relation between a concept and each of its child concepts is not the same either.
Although existing work has made some progress on user portrait recommendation based on web browsing behavior, it is in general still limited to particular fields and technically immature, and its recommendation accuracy is low.
Summary of the invention
The problem solved by the invention is that existing user portrait-item recommendation techniques based on web browsing behavior have low recommendation accuracy. To solve this problem, the invention provides a user portrait-item recommendation system and method based on a pseudo-ontology.
The user portrait-item recommendation system based on a pseudo-ontology provided by the invention comprises a pseudo-ontology module, a user portrait module, an item portrait module, and a preference-based recommendation module. The pseudo-ontology module obtains domain-related text, generates a domain pseudo-ontology, and outputs it to the user portrait module and the item portrait module. The user portrait module obtains the user's web browsing behavior, calculates an optimized user feature vector according to the domain pseudo-ontology, and outputs it to the preference-based recommendation module. The item portrait module obtains item description text, calculates an optimized item feature vector according to the domain pseudo-ontology, and outputs it to the preference-based recommendation module. The preference-based recommendation module outputs each user's item preference ranking according to the user feature vector and the item feature vector.
Further, the pseudo-ontology module includes a domain concept identification submodule and a concept relation identification submodule. The domain concept identification submodule performs word frequency statistics on the domain-related text and, after removing stop words, records the words whose frequency is greater than α as domain concept words, where α is a predetermined value. The concept relation identification submodule represents each concept word by its word vector and applies hierarchical clustering, where $c_h$ is the h-th concept word in the domain concept word set $C$, $\vec{c}_h = (c_h^1, c_h^2, \dots, c_h^n)$ is the n-dimensional word vector representation of $c_h$, and the clustering result records, for each concept word $c_h$, the m-th class of the k-th layer of the pseudo-ontology to which it is assigned.
Further, the domain concept identification submodule judges whether each domain concept word is exclusive to the domain: if it is not, it is defined as a virtual concept; if it is, it is defined as a real concept.
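The word-frequency step of the domain concept identification submodule can be illustrated with a minimal Python sketch. The function name `extract_concept_words`, the toy token list, and the stop-word set are hypothetical, and the text is assumed to be tokenized already; the patent does not specify a tokenizer or a value for α.

```python
from collections import Counter

def extract_concept_words(tokens, stop_words, alpha):
    """Keep words whose frequency exceeds alpha after stop-word removal."""
    counts = Counter(t for t in tokens if t not in stop_words)
    return {word: n for word, n in counts.items() if n > alpha}

# Hypothetical pre-tokenized domain text (not data from the patent).
tokens = ["symphony", "the", "concerto", "symphony", "of", "violin",
          "symphony", "concerto", "the", "violin", "encore"]
stop_words = {"the", "of"}

concepts = extract_concept_words(tokens, stop_words, alpha=1)
print(concepts)
```

Here "encore" occurs only once and is therefore not recorded as a concept word when α = 1.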
Further, the user portrait module includes an initial user portrait generation submodule and a user portrait optimization submodule. The initial user portrait generation submodule represents every word of the web pages browsed by the user as a word vector: $\vec{t}_{ijk} = (t_{ijk}^1, t_{ijk}^2, \dots, t_{ijk}^n)$, where $t_{ijk}$ is the k-th word in the j-th web page browsed by the i-th user and $\vec{t}_{ijk}$ is the n-dimensional word vector representation of $t_{ijk}$. The similarity between page words and concept words is then measured:
$$\mathrm{sim}(t_{ijk}, c_h) = \frac{\sum_{g=1}^{n} t_{ijk}^{g}\, c_h^{g}}{\sqrt{\sum_{g=1}^{n} (t_{ijk}^{g})^2}\,\sqrt{\sum_{g=1}^{n} (c_h^{g})^2}}$$
where $\mathrm{sim}(t_{ijk}, c_h)$ is the word-vector-based similarity between the k-th word in the j-th web page browsed by the i-th user and the h-th concept in the pseudo-ontology, $t_{ijk}^g$ is the g-th dimension value of the word vector of $t_{ijk}$, and $c_h^g$ is the g-th dimension value of the word vector of $c_h$. For each concept word, the similarities greater than the threshold are accumulated:
$$P(d_{ij}, c_h) = \sum_{\substack{k=1 \\ \mathrm{sim}(t_{ijk}, c_h) > q}}^{|t_{ij}|} \mathrm{sim}(t_{ijk}, c_h)$$
where q is the threshold, $|t_{ij}|$ is the number of words in the j-th web page browsed by the i-th user, $d_{ij}$ is the j-th web page browsed by the i-th user, and $P(d_{ij}, c_h)$ is the preference of that page for the h-th concept. For each concept, $P(d_{ij}, c_h)$ is accumulated per user to give each user's preference value for the concept:
$$P(d_i, c_h) = \sum_{j=1}^{|d_i|} P(d_{ij}, c_h)$$
where $|d_i|$ is the number of web pages browsed by the i-th user, $d_i$ is the i-th user, and $P(d_i, c_h)$ is the preference value of the i-th user for the h-th concept. N denotes the number of times a preference concept of the user must be identified in different web pages within a period of time; when the preference concept occurs fewer than N times, it is invalid:
$$P(d_i, c_h) = 0 \quad \text{if } \mathrm{count}(c_h) < N$$
where $\mathrm{count}(c_h)$ is the number of times $c_h$ is identified. Finally, the preference values of the i-th user for all concepts, $\big(P(d_i, c_1), P(d_i, c_2), \dots, P(d_i, c_{|C|})\big)$, constitute the user feature vector. The user portrait optimization submodule updates the value of a parent concept that has no value to the accumulated product of the similarity between the parent concept and each of its child concepts and the value of that child concept:
$$P'(d_i, c_h) = \begin{cases} \sum_{v=1}^{|c'_h|} \mathrm{sim}(c_h, c'_{hv})\, P(d_i, c'_{hv}), & P(d_i, c_h) = 0 \\ P(d_i, c_h), & \text{otherwise} \end{cases}$$
where $P'(d_i, c_h)$ is the preference value of the i-th user for concept $c_h$ on the final portrait, $P(d_i, c_h)$ is the preference value of the i-th user for concept $c_h$ on the initial portrait, $c'_h$ is the set of all child concepts of $c_h$ in the pseudo-ontology, $|c'_h|$ is the number of child concepts of $c_h$, $P(d_i, c'_{hv})$ is the preference of the i-th user for the v-th child concept of $c_h$, and $\mathrm{sim}(c_h, c'_{hv})$ is the word-vector-based similarity between $c_h$ and its v-th child concept, calculated as:
$$\mathrm{sim}(c_h, c'_{hv}) = \frac{\sum_{g=1}^{n} c_h^{g}\, c_{hv}'^{\,g}}{\sqrt{\sum_{g=1}^{n} (c_h^{g})^2}\,\sqrt{\sum_{g=1}^{n} (c_{hv}'^{\,g})^2}}$$
where $c_{hv}'^{\,g}$ is the g-th dimension value of the word vector $\vec{c}'_{hv}$ of $c'_{hv}$. Finally, the preference values of the i-th user for all concepts, $\big(P'(d_i, c_1), P'(d_i, c_2), \dots, P'(d_i, c_{|C|})\big)$, constitute the optimized user feature vector.
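The initial user portrait generation described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the patented implementation: the similarity is assumed to be cosine similarity, the function names `cosine` and `initial_user_portrait` are hypothetical, and the two-dimensional vectors are toy values.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def initial_user_portrait(pages, concept_vecs, q, n_min):
    """pages: list of web pages, each a list of word vectors.
    Returns one preference value per concept for a single user."""
    prefs = [0.0] * len(concept_vecs)
    hits = [0] * len(concept_vecs)  # pages in which each concept was identified
    for page in pages:
        for h, c in enumerate(concept_vecs):
            # Accumulate only similarities above the threshold q.
            page_pref = sum(s for s in (cosine(t, c) for t in page) if s > q)
            if page_pref > 0:
                prefs[h] += page_pref
                hits[h] += 1
    # A concept identified in fewer than n_min pages is treated as invalid.
    return [p if n >= n_min else 0.0 for p, n in zip(prefs, hits)]

# Toy data: two concepts and three browsed pages.
concept_vecs = [(1.0, 0.0), (0.0, 1.0)]
pages = [[(1.0, 0.0), (0.9, 0.1)], [(1.0, 0.1)], [(0.0, 1.0)]]
portrait = initial_user_portrait(pages, concept_vecs, q=0.7, n_min=2)
print(portrait)
```

In this toy run the second concept is matched in only one page, so its preference value is reset to zero by the frequency filter.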
Further, the item portrait module includes an initial item portrait generation submodule and an item portrait optimization submodule. The initial item portrait generation submodule represents every word of all item description texts as a word vector: $\vec{e}_{rk} = (e_{rk}^1, e_{rk}^2, \dots, e_{rk}^n)$, where $e_{rk}$ is the k-th word in the description document of the r-th item and $\vec{e}_{rk}$ is the n-dimensional word vector representation of $e_{rk}$. The similarity between description document words and concept words is measured:
$$\mathrm{sim}(e_{rk}, c_h) = \frac{\sum_{g=1}^{n} e_{rk}^{g}\, c_h^{g}}{\sqrt{\sum_{g=1}^{n} (e_{rk}^{g})^2}\,\sqrt{\sum_{g=1}^{n} (c_h^{g})^2}}$$
where $\mathrm{sim}(e_{rk}, c_h)$ is the word-vector-based similarity between the k-th word of the r-th item and the h-th concept in the pseudo-ontology, and $e_{rk}^g$ is the g-th dimension value of the word vector of $e_{rk}$. $F(e_r, c_h)$ indicates whether the r-th item has the feature of the h-th concept. All $F(e_r, c_h)$ are first set to 0:
$$F(e_r, c_h) = 0, \quad r = 1, \dots, |I|$$
where $|I|$ is the number of items. When the similarity between a word of an item's description document and a concept word exceeds the threshold q, $F(e_r, c_h)$ is set to 1:
$$F(e_r, c_h) = 1 \quad \text{if } \mathrm{sim}(e_{rk}, c_h) > q \text{ for some } k$$
Finally, the preference values of the r-th item for all concepts, $\big(F(e_r, c_1), F(e_r, c_2), \dots, F(e_r, c_{|C|})\big)$, constitute the item feature vector. The item portrait optimization submodule updates the value of a real parent concept that has no value to the number of its child concepts that have a value divided by the number of all its child concepts:
$$F'(e_r, c_h) = \frac{\big|\{\, v : F(e_r, c'_{hv}) \neq 0 \,\}\big|}{|c'_h|}$$
where $F'(e_r, c_h)$ is the preference value of the r-th item for concept $c_h$ on the final portrait, $F(e_r, c_h)$ is the preference value of the r-th item for concept $c_h$ on the initial portrait, and $F(e_r, c'_{hv})$ is the preference value of the r-th item for the v-th child concept of $c_h$. Finally, the preference values of the r-th item for all concepts, $\big(F'(e_r, c_1), F'(e_r, c_2), \dots, F'(e_r, c_{|C|})\big)$, constitute the optimized item feature vector.
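The item portrait steps can be sketched as below. The sketch assumes the word-to-concept similarities have already been computed into a matrix `sims` (one row per description word, one column per concept); the function names, the `children` mapping, and the `is_real` flags are hypothetical illustrations of the pseudo-ontology structure, not names from the patent.

```python
def initial_item_portrait(sims, q):
    """sims[k][h]: similarity of the k-th description word to concept h.
    A concept feature is 1 if any word exceeds the threshold q, else 0."""
    n_concepts = len(sims[0])
    return [1 if any(row[h] > q for row in sims) else 0
            for h in range(n_concepts)]

def optimize_item_portrait(portrait, children, is_real):
    """Give each valueless real parent concept the share of its
    child concepts that have a value."""
    out = list(portrait)
    for h, kids in children.items():
        if is_real[h] and portrait[h] == 0 and kids:
            out[h] = sum(1 for v in kids if portrait[v] != 0) / len(kids)
    return out

# Toy data: concept 0 is the parent of concepts 1, 2 and 3.
sims = [[0.3, 0.9, 0.1, 0.2],
        [0.1, 0.2, 0.75, 0.1]]
children = {0: [1, 2, 3]}
is_real = [True, True, True, True]

initial = initial_item_portrait(sims, q=0.7)
optimized = optimize_item_portrait(initial, children, is_real)
print(initial, optimized)
```

Two of the three child concepts have a value here, so the parent concept receives 2/3 on the optimized portrait.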
Further, the preference-based recommendation module multiplies the concept value of each dimension of the user portrait by the corresponding concept value of the item portrait to obtain the preference $R_{ir}$ of the user for the item:
$$R_{ir} = \sum_{h=1}^{|C|} P'(d_i, c_h)\, F'(e_r, c_h)$$
where $P'(d_i, c_h)$ is the preference value of the i-th user on the h-th concept and $F'(e_r, c_h)$ is the preference value of the r-th item on the h-th concept. For each user, the preference for all candidate items is calculated, and the item recommendation list of the user is generated in descending order of preference.
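The preference calculation amounts to a dot product between the user feature vector and each item feature vector, followed by a sort. A minimal sketch, with hypothetical item names and toy vectors:

```python
def rank_items(user_vec, item_vecs):
    """Score each item by the dot product with the user vector and
    return (item, score) pairs in descending order of score."""
    scores = {item: sum(u * f for u, f in zip(user_vec, vec))
              for item, vec in item_vecs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy portraits over three concepts (illustrative values only).
user_vec = [2.0, 0.0, 1.5]
item_vecs = {
    "concert_a": [1, 0, 0],
    "concert_b": [0, 1, 0],
    "concert_c": [1, 0, 1],
}
ranking = rank_items(user_vec, item_vecs)
print(ranking)
```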
The present invention also provides a user portrait-item recommendation method based on a pseudo-ontology, characterized in that it uses the user portrait-item recommendation system based on a pseudo-ontology provided by the invention, which comprises a pseudo-ontology module, a user portrait module, an item portrait module, and a preference-based recommendation module. The pseudo-ontology module obtains domain-related text, generates a domain pseudo-ontology, and outputs it to the user portrait module and the item portrait module. The user portrait module obtains the user's web browsing behavior, calculates a user feature vector according to the domain pseudo-ontology, and outputs it to the preference-based recommendation module. The item portrait module obtains item description text, calculates an item feature vector according to the domain pseudo-ontology, and outputs it to the preference-based recommendation module. The preference-based recommendation module outputs each user's item preference ranking according to the user feature vector and the item feature vector.
The advantages of the invention are as follows. First, the invention performs user portrait modeling with a pseudo-ontology, so that fields without an ontology can also use ontology-based user portrait modeling. Second, the invention uses an initial portrait generation algorithm based on word vectors, so that the mapping from web pages to the pseudo-ontology is automated and based on semantics. Finally, the invention optimizes the portrait according to the types of the pseudo-ontology concepts and the similarity between concepts, so that the portrait result is more accurate.
Brief description of the drawings
Fig. 1 is a framework diagram of the invention.
Fig. 2 illustrates the user portrait optimization.
Detailed description of the embodiments
As the background shows, existing ontology-based user portrait-item recommendation methods have limited applicability and low accuracy. The applicant studied this problem and identified two causes. First, existing methods depend heavily on the completeness of the ontology, for example by requiring every concept in the ontology to have a complete explanatory document, and constructing a domain ontology is difficult for non-experts, so most fields, which have no ontology, can hardly use ontology-based modeling. Second, existing methods do not consider semantic similarity when mapping web pages to the ontology.
On the basis of further study of these problems, the applicant provides a user portrait-item recommendation system and method based on a pseudo-ontology. The invention obtains domain words from domain-related text, represents them as word vectors, performs hierarchical clustering, and outputs a domain pseudo-ontology. The invention measures the word-vector-based similarity between the web pages browsed by a user and the pseudo-ontology concepts, takes the similarities that exceed a threshold as the mapping values from pages to the pseudo-ontology to generate the initial user portrait, and updates the user portrait according to concept similarity and concept type to generate the final user portrait. The invention measures the word-vector-based similarity between the description documents of items and the pseudo-ontology concepts, takes the similarities that exceed the threshold as the mapping values from items to the pseudo-ontology to generate the initial item portrait, and updates the item portrait according to concept type to generate the final item portrait. The invention multiplies the corresponding dimension values of the user portrait and the item portrait to obtain the user's preference for each item, and outputs the item recommendation list ranked by preference.
The spirit and substance of the invention are further elaborated below with reference to the drawings and embodiments.
As shown in Fig. 1, the user portrait-item recommendation system based on a pseudo-ontology provided by the embodiment of the invention comprises a pseudo-ontology module 01, a user portrait module 02, an item portrait module 03, and a preference-based recommendation module 04. The pseudo-ontology module 01 obtains domain-related text, generates a domain pseudo-ontology, and outputs it to the user portrait module 02 and the item portrait module 03. The user portrait module 02 obtains the user's web browsing behavior and, according to the domain pseudo-ontology, outputs a user feature vector to the preference-based recommendation module 04. The item portrait module 03 obtains item description text and, according to the domain pseudo-ontology, outputs an item feature vector to the preference-based recommendation module 04. The preference-based recommendation module 04 outputs each user's item preference ranking according to the user feature vector and the item feature vector.
In this embodiment, the domain is classical music, and the domain-related documents are the program notes provided by a well-known symphony orchestra for the past 3 years. Word frequency statistics yield 186 domain concept words. The concept words are represented as word vectors, agglomerative hierarchical clustering is applied to the representations, and the pseudo-ontology is output.
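The agglomerative clustering step can be sketched in pure Python as follows. The embodiment does not state the linkage criterion, the distance measure, or how layers are cut, so this sketch assumes average linkage over cosine distance and produces one layer per requested cluster count; `cosine_distance`, `agglomerate`, and the toy vectors are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity (assumed distance; vectors must be non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def agglomerate(vectors, n_clusters):
    """Merge the closest pair of clusters (average linkage) until
    n_clusters remain. Returns clusters as lists of vector indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        x, y = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Toy concept word vectors: two near the first axis, two near the second.
vectors = [(1.0, 0.0), (0.8, 0.2), (0.0, 1.0), (0.1, 0.9)]
layer = agglomerate(vectors, n_clusters=2)
print(layer)
```

Calling `agglomerate` with successively smaller cluster counts gives coarser groupings, which is one possible way to read off the layers of a pseudo-ontology.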
In this embodiment, the web browsing data obtained by the user portrait module is limited to web pages under 176 domain names related to classical music. The user portrait module includes an initial user portrait generation submodule and a user portrait optimization submodule. The initial user portrait generation submodule represents every word of the web pages browsed by the user as a word vector: $\vec{t}_{ijk} = (t_{ijk}^1, t_{ijk}^2, \dots, t_{ijk}^n)$, where $t_{ijk}$ is the k-th word in the j-th web page browsed by the i-th user and $\vec{t}_{ijk}$ is the n-dimensional word vector representation of $t_{ijk}$. The similarity between page words and concept words is then measured:
$$\mathrm{sim}(t_{ijk}, c_h) = \frac{\sum_{g=1}^{n} t_{ijk}^{g}\, c_h^{g}}{\sqrt{\sum_{g=1}^{n} (t_{ijk}^{g})^2}\,\sqrt{\sum_{g=1}^{n} (c_h^{g})^2}}$$
where $\mathrm{sim}(t_{ijk}, c_h)$ is the word-vector-based similarity between the k-th word in the j-th web page browsed by the i-th user and the h-th concept in the pseudo-ontology, $t_{ijk}^g$ is the g-th dimension value of the word vector of $t_{ijk}$, and $c_h^g$ is the g-th dimension value of the word vector of $c_h$. For each concept word, the similarities greater than the threshold are accumulated:
$$P(d_{ij}, c_h) = \sum_{\substack{k=1 \\ \mathrm{sim}(t_{ijk}, c_h) > q}}^{|t_{ij}|} \mathrm{sim}(t_{ijk}, c_h)$$
where q is the threshold, $|t_{ij}|$ is the number of words in the j-th web page browsed by the i-th user, $d_{ij}$ is the j-th web page browsed by the i-th user, and $P(d_{ij}, c_h)$ is the preference of that page for the h-th concept. In this embodiment, q = 0.7. For each concept, $P(d_{ij}, c_h)$ is accumulated per user to give each user's preference value for the concept:
$$P(d_i, c_h) = \sum_{j=1}^{|d_i|} P(d_{ij}, c_h)$$
where $|d_i|$ is the number of web pages browsed by the i-th user, $d_i$ is the i-th user, and $P(d_i, c_h)$ is the preference value of the i-th user for the h-th concept. N denotes the number of times a preference concept of the user must be identified in different web pages within a period of time; when the preference concept occurs fewer than N times, it is invalid:
$$P(d_i, c_h) = 0 \quad \text{if } \mathrm{count}(c_h) < N$$
where $\mathrm{count}(c_h)$ is the number of times $c_h$ is identified. In this embodiment, N = 4. The user portrait optimization is shown in Fig. 2: the value of a real parent concept that has no value is updated to the accumulated product of the similarity between the parent concept and each of its child concepts and the value of that child concept:
$$P'(d_i, c_h) = \begin{cases} \sum_{v=1}^{|c'_h|} \mathrm{sim}(c_h, c'_{hv})\, P(d_i, c'_{hv}), & P(d_i, c_h) = 0 \\ P(d_i, c_h), & \text{otherwise} \end{cases}$$
where $P'(d_i, c_h)$ is the preference value of the i-th user for concept $c_h$ on the final portrait, $P(d_i, c_h)$ is the preference value of the i-th user for concept $c_h$ on the initial portrait, $c'_h$ is the set of all child concepts of $c_h$ in the pseudo-ontology, $|c'_h|$ is the number of child concepts of $c_h$, $P(d_i, c'_{hv})$ is the preference of the i-th user for the v-th child concept of $c_h$, and $\mathrm{sim}(c_h, c'_{hv})$ is the word-vector-based similarity between $c_h$ and its v-th child concept, calculated as:
$$\mathrm{sim}(c_h, c'_{hv}) = \frac{\sum_{g=1}^{n} c_h^{g}\, c_{hv}'^{\,g}}{\sqrt{\sum_{g=1}^{n} (c_h^{g})^2}\,\sqrt{\sum_{g=1}^{n} (c_{hv}'^{\,g})^2}}$$
where $c_{hv}'^{\,g}$ is the g-th dimension value of the word vector $\vec{c}'_{hv}$ of $c'_{hv}$. Finally, the preference values of the i-th user for all concepts, $\big(P'(d_i, c_1), P'(d_i, c_2), \dots, P'(d_i, c_{|C|})\big)$, constitute the optimized user feature vector.
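The parent-concept update in the user portrait optimization can be sketched as follows. The sketch assumes the parent-child similarities are precomputed and stored in a dictionary keyed by (parent, child) index pairs; `optimize_user_portrait`, `children`, `concept_sim`, and `is_real` are hypothetical names, and the numbers are toy values rather than experimental data.

```python
def optimize_user_portrait(prefs, children, concept_sim, is_real):
    """For each real parent concept without a value, accumulate
    similarity(parent, child) * preference(child) over its children."""
    out = list(prefs)
    for h, kids in children.items():
        if is_real[h] and prefs[h] == 0:
            out[h] = sum(concept_sim[(h, v)] * prefs[v] for v in kids)
    return out

# Toy pseudo-ontology: concept 0 is the parent of concepts 1 and 2.
prefs = [0.0, 2.0, 1.0]                    # initial user portrait
children = {0: [1, 2]}
concept_sim = {(0, 1): 0.8, (0, 2): 0.5}   # assumed precomputed similarities
is_real = [True, True, True]

optimized_prefs = optimize_user_portrait(prefs, children, concept_sim, is_real)
print(optimized_prefs)
```

The parent concept receives 0.8 × 2.0 + 0.5 × 1.0 = 2.1, while the child values are left unchanged because the update reads only from the initial portrait.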
In this embodiment, the description documents obtained by the item portrait module are the program notes provided by a well-known symphony orchestra. The item portrait module includes an initial item portrait generation submodule and an item portrait optimization submodule. The initial item portrait generation submodule represents every word of all item description texts as a word vector: $\vec{e}_{rk} = (e_{rk}^1, e_{rk}^2, \dots, e_{rk}^n)$, where $e_{rk}$ is the k-th word in the description document of the r-th item and $\vec{e}_{rk}$ is the n-dimensional word vector representation of $e_{rk}$. The similarity between description document words and concept words is measured:
$$\mathrm{sim}(e_{rk}, c_h) = \frac{\sum_{g=1}^{n} e_{rk}^{g}\, c_h^{g}}{\sqrt{\sum_{g=1}^{n} (e_{rk}^{g})^2}\,\sqrt{\sum_{g=1}^{n} (c_h^{g})^2}}$$
where $\mathrm{sim}(e_{rk}, c_h)$ is the word-vector-based similarity between the k-th word of the r-th item and the h-th concept in the pseudo-ontology, and $e_{rk}^g$ is the g-th dimension value of the word vector of $e_{rk}$. $F(e_r, c_h)$ indicates whether the r-th item has the feature of the h-th concept. All $F(e_r, c_h)$ are first set to 0:
$$F(e_r, c_h) = 0, \quad r = 1, \dots, |I|$$
where $|I|$ is the number of items. When the similarity between a word of an item's description document and a concept word exceeds the threshold q, $F(e_r, c_h)$ is set to 1:
$$F(e_r, c_h) = 1 \quad \text{if } \mathrm{sim}(e_{rk}, c_h) > q \text{ for some } k$$
In this embodiment, q = 0.7. Finally, the preference values of the r-th item for all concepts, $\big(F(e_r, c_1), F(e_r, c_2), \dots, F(e_r, c_{|C|})\big)$, constitute the item feature vector. The item portrait optimization submodule updates the value of a real parent concept that has no value to the number of its child concepts that have a value divided by the number of all its child concepts:
$$F'(e_r, c_h) = \frac{\big|\{\, v : F(e_r, c'_{hv}) \neq 0 \,\}\big|}{|c'_h|}$$
where $F'(e_r, c_h)$ is the preference value of the r-th item for concept $c_h$ on the final portrait, $F(e_r, c_h)$ is the preference value of the r-th item for concept $c_h$ on the initial portrait, and $F(e_r, c'_{hv})$ is the preference value of the r-th item for the v-th child concept of $c_h$. Finally, the preference values of the r-th item for all concepts, $\big(F'(e_r, c_1), F'(e_r, c_2), \dots, F'(e_r, c_{|C|})\big)$, constitute the optimized item feature vector.
In this embodiment, the preference-based recommendation module multiplies the concept value of each dimension of the user portrait by the corresponding concept value of the item portrait to obtain the preference $R_{ir}$ of the user for the item:
$$R_{ir} = \sum_{h=1}^{|C|} P'(d_i, c_h)\, F'(e_r, c_h)$$
where $P'(d_i, c_h)$ is the preference value of the i-th user on the h-th concept and $F'(e_r, c_h)$ is the preference value of the r-th item on the h-th concept. For each user, the preference for all candidate items is calculated, and the item recommendation list of the user is generated in descending order of preference.
In this embodiment, the word vectors are the Baidu word vectors made openly available by Baidu, with a dimension of 50.
To test the user portrait-item recommendation system based on a pseudo-ontology provided by the embodiment of the invention, this embodiment used 13,000 ticket-booking records of the users of a well-known symphony orchestra within half a year, together with those users' web browsing data. The validity and advantages of the system were verified from multiple angles by designing controlled experiments and comparing with other methods. The specific results are as follows. With the average ranking error and the accuracy of the top-n ranking as indices, the parameters q and N in the algorithm were tuned, and the best effect was obtained with N = 4 and q = 0.7. Using the same pseudo-ontology and portrait optimization method, ten-fold cross-validation was performed on the Contextual Concept Clustering (3C) algorithm and the initial portrait generation algorithm. Here the initial portrait generation algorithm means calculating the user feature vector with the algorithm provided by the embodiment of the invention, then calculating the optimized user feature vector from it with an existing method, and then calculating the preference R. The results are shown in Table 1: the accuracy of the initial portrait generation algorithm is higher than that of the existing algorithm.
Table 1. Ten-fold cross-validation results of the 3C algorithm and the initial portrait generation algorithm
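The ten-fold cross-validation protocol mentioned above can be sketched as an index split. The patent does not describe how the records were partitioned, so this is a generic sketch with a hypothetical helper `k_fold_indices`; it only shows how each record serves as test data exactly once.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Toy run on 25 records (the experiment itself used 13,000 booking records).
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), test)
```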
Using the same pseudo-ontology and initial portrait generation algorithm, the Gradual Extra Weight (GEW) algorithm, the 50% weight increase algorithm, the Sub-class Aggregation Scheme (SAS) algorithm, and the portrait optimization algorithm proposed by the invention were compared experimentally. Here the portrait optimization algorithm means calculating the user feature vector with an existing method, then calculating the optimized user feature vector from it with the algorithm provided by the embodiment of the invention, and then calculating the preference R. The results are shown in Table 2: the accuracy of the portrait optimization algorithm is higher than that of the existing algorithms.
Table 2. Results of the portrait optimization algorithm
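The user portrait optimization step can be illustrated with a short sketch. It assumes a plain, unnormalized sum of (parent-child similarity × child preference) over the child concepts, cosine similarity as the similarity measure, and a single pass that reads child values from the initial portrait. All names are hypothetical, and the `children` mapping is assumed to list only real (not empty) parent concepts.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def optimise_user_portrait(portrait, children, vectors):
    """Fill in valueless parent concepts from their child concepts.

    portrait -- dict concept -> initial preference value
    children -- dict parent real concept -> list of its child concepts
    vectors  -- dict concept -> term vector
    """
    optimised = dict(portrait)
    for parent, kids in children.items():
        if portrait.get(parent, 0.0) == 0.0:  # only valueless parents are updated
            optimised[parent] = sum(
                cosine(vectors[parent], vectors[kid]) * portrait.get(kid, 0.0)
                for kid in kids
            )
    return optimised
```

A parent that already carries a value is left unchanged, so the step only propagates preferences upward into gaps of the initial portrait.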
The combination of the initial portrait generation algorithm and the portrait optimization algorithm was compared experimentally with the combination of the 3C algorithm and the GEW algorithm. Here, combining the initial portrait generation algorithm with the portrait optimization algorithm means computing both the user feature vector and the optimized user feature vector with the algorithms provided by the embodiment of the present invention, and then computing the preference R. The results are shown in Table 3: after the two algorithms proposed by the present invention are combined, the accuracy is still higher than that of the existing combined algorithm.
Table 3. Experimental results of the combined algorithms
In summary, the present invention analyzes and explores user portrait-project recommendation technology based on web browsing behavior and proposes a novel user portrait modeling method based on a pseudo-ontology. Term vectors are used to map web pages onto the concepts of the pseudo-ontology; the user portrait is used to generate user feature vectors; the project portrait is used to generate project feature vectors; and the preference-based recommendation generates the project recommendation list. The accuracy of the generated project recommendation list is higher than that of existing methods.
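The final recommendation step described above multiplies the user feature vector by each project feature vector, concept by concept, and ranks projects by the resulting preference R. A minimal sketch follows, with hypothetical names and with both vectors assumed to be indexed by the same concept order:

```python
def recommend(user_vec, project_vecs):
    """Rank projects for one user by preference R (dot product of feature vectors).

    user_vec     -- list of the user's preference values, one per concept
    project_vecs -- dict project name -> list of concept values, same order
    Returns a list of (project, preference) pairs, highest preference first.
    """
    scores = {
        name: sum(u * x for u, x in zip(user_vec, vec))
        for name, vec in project_vecs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = recommend(
    [2.0, 0.0, 1.0],
    {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [1.0, 0.0, 1.0]},
)
```

Here project C scores 3.0, A scores 2.0, and B scores 0.0, so the recommendation list is C, A, B.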
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Any person skilled in the art may, without departing from the spirit and scope of the invention, use the methods and technical content disclosed above to make possible variations and modifications to the technical solution of the invention. Therefore, any simple modification, equivalent change, or refinement made to the above embodiments according to the technical substance of the invention, without departing from the content of its technical solution, falls within the scope of protection of the technical solution of the invention.

Claims (7)

1. A pseudo-ontology-based user portrait-item recommendation system, characterized by comprising: a pseudo-ontology module, a user portrait module, a project portrait module, and a preference-based recommendation module; the pseudo-ontology module obtains domain-related text, generates a domain pseudo-ontology, and outputs it to the user portrait module and the project portrait module; the user portrait module obtains the web browsing behavior of the user, computes an optimized user feature vector according to the domain pseudo-ontology, and outputs it to the preference-based recommendation module; the project portrait module obtains project-related description text, computes an optimized project feature vector according to the domain pseudo-ontology, and outputs it to the preference-based recommendation module; the preference-based recommendation module outputs the user's project preference ranking according to the user feature vector and the project feature vector.
2. The pseudo-ontology-based user portrait-item recommendation system according to claim 1, characterized in that the pseudo-ontology module comprises a domain concept identification submodule and a concept relation identification submodule; the domain concept identification submodule performs word frequency statistics on the domain-related text and, after removing stop words, records the words whose frequency is greater than α as domain concept words, α being a predetermined value; the concept relation identification submodule represents each concept word c_h, the h-th concept word in the domain concept word set C, by its n-dimensional term vector and assigns c_h by hierarchical clustering to the m-th class of the y-th layer of the pseudo-ontology.
3. The pseudo-ontology-based user portrait-item recommendation system according to claim 2, characterized in that the domain concept identification submodule judges whether each domain concept word is exclusive to the domain; a word that is not exclusive to the domain is defined as an empty concept, and a word that is exclusive to the domain is defined as a real concept.
4. The pseudo-ontology-based user portrait-item recommendation system according to claim 2, characterized in that the user portrait module comprises an initial user portrait generation submodule and a user portrait optimization submodule; the initial user portrait generation submodule represents every word of the web pages browsed by the user as a term vector, where t_ijk denotes the k-th word in the j-th web page browsed by the i-th user and is represented by its n-dimensional term vector; the similarity between web page words and concept words is then measured, namely the term-vector-based similarity between the k-th word in the j-th web page browsed by the i-th user and the h-th concept in the pseudo-ontology, computed from the g-th dimension values of the term vector of t_ijk and of the term vector of c_h; for each concept word, the similarities greater than a threshold are accumulated, where Q is the threshold, |t_ij| is the number of words contained in the j-th web page browsed by the i-th user, and d_ij denotes the j-th web page browsed by the i-th user, the result being the preference of the j-th web page browsed by the i-th user for the h-th concept; for each concept, these web page preferences are accumulated per user to obtain each user's preference value for the concept, where |d_i| is the number of web pages browsed by the i-th user and d_i denotes the i-th user; N is a threshold on the number of times a preference concept of the user is identified in different web pages within a period of time, and when the number of times c_h is identified is less than N, the preference concept is invalid; finally, the preference values of the i-th user for all concepts constitute the user feature vector; the user portrait optimization submodule updates the value of a valueless parent real concept to the accumulated product of the distance between the parent concept and each child concept and the value of that child concept, the quantities involved being the i-th user's preference value for concept c_h on the final portrait, the i-th user's preference value for concept c_h on the initial portrait, the set c'_h of all child concepts of c_h in the pseudo-ontology, the number |c'_h| of all child concepts of c_h in the pseudo-ontology, the i-th user's preference for the v-th child concept of c_h, and the term-vector-based similarity between c_h and its v-th child concept, which is computed from the g-th dimension values of their term vectors; finally, the preference values of the i-th user for all concepts constitute the optimized user feature vector.
5. The pseudo-ontology-based user portrait-item recommendation system according to claim 2, characterized in that the project portrait module comprises an initial project portrait generation submodule and a project portrait optimization submodule; the initial project portrait generation submodule represents every word of all project-related description texts as a term vector, where e_rk denotes the k-th word in the description document of the r-th project and is represented by its n-dimensional term vector; the similarity between description document words and concept words is then measured, namely the term-vector-based similarity between the k-th word of the r-th project and the h-th concept in the pseudo-ontology, computed from the g-th dimension values of their term vectors; a feature value indicates whether the r-th project has the feature of the h-th concept, and all feature values are initially set to 0, where |I| denotes the number of projects; when the similarity between a word of a project's description document and a concept word exceeds the threshold q, the corresponding feature value is set to 1; finally, the preference values of the r-th project for all concepts constitute the project feature vector; the project portrait optimization submodule updates the value of a valueless parent real concept to the number of its child concepts that have a value divided by the number of all its child concepts, the quantities involved being the r-th project's preference value for concept c_h on the final portrait, the r-th project's preference value for concept c_h on the initial portrait, and the r-th project's preference value for the v-th child concept of c_h; finally, the preference values of the r-th project for all concepts constitute the optimized project feature vector.
6. The pseudo-ontology-based user portrait-item recommendation system according to claim 1, characterized in that the preference-based recommendation multiplies the concept value of each dimension of the user portrait by the corresponding concept value of the project portrait to obtain the user's preference R_ir for the project, the factors being the preference value of the i-th user on the h-th concept and the preference value of the r-th project on the h-th concept; for each user, the preference for all candidate projects is computed, and the project recommendation list of the user is generated in descending order of preference.
7. A method using the pseudo-ontology-based user portrait-item recommendation system according to any one of claims 1 to 6, characterized by comprising:
Step 1: the pseudo-ontology module obtains domain-related text, obtains domain concepts by word frequency statistics, vectorizes the domain concepts with term vectors, identifies the relations between domain concepts by hierarchical clustering to generate the domain pseudo-ontology, and inputs it to the user portrait module and the project portrait module;
Step 2: the user portrait module takes the domain pseudo-ontology as input, maps the web browsing behavior of the user onto the domain pseudo-ontology by the initial user portrait and user portrait optimization algorithms, and outputs the user feature vector to the preference-based recommendation module;
Step 3: the project portrait module takes the domain pseudo-ontology as input, maps the project-related description text onto the domain pseudo-ontology by the initial project portrait and project portrait optimization algorithms, and outputs the project feature vector to the preference-based recommendation module;
Step 4: the preference-based recommendation module takes the user feature vector and the project feature vector as input, multiplies the user feature vector by the feature vector of each project to obtain the user's preference for each candidate project, and outputs the project recommendation list of the user in descending order of preference.
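Step 3 of the method, as characterised in claim 5, can be sketched as follows. This is an illustrative reading only: cosine similarity is assumed as the term-vector similarity, the default q=0.7 is borrowed from the tuned value reported in the experiments, and all names are hypothetical. The `children` mapping is assumed to list only real (not empty) parent concepts.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project_portrait(word_vecs, concepts, children, q=0.7):
    """Build one project's optimized concept feature vector.

    word_vecs -- term vectors of the words in the project description
    concepts  -- dict concept name -> concept term vector
    children  -- dict parent real concept -> list of its child concepts
    q         -- similarity threshold above which a concept feature is set to 1
    """
    # Initial portrait: every feature starts at 0 and becomes 1 if any
    # description word is similar enough to the concept word.
    initial = {
        name: 1.0 if any(cosine(w, c_vec) > q for w in word_vecs) else 0.0
        for name, c_vec in concepts.items()
    }
    # Optimization: a valueless parent takes the share of its children with a value.
    final = dict(initial)
    for parent, kids in children.items():
        if initial[parent] == 0.0 and kids:
            final[parent] = sum(1 for kid in kids if initial[kid] > 0) / len(kids)
    return final
```

For a project whose description matches one of the two child concepts of a valueless parent, the parent receives the value 0.5.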
CN201810563501.9A 2018-06-04 2018-06-04 User portrait-project recommendation system and method based on pseudo ontology Active CN108920521B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810563501.9A CN108920521B (en) 2018-06-04 2018-06-04 User portrait-project recommendation system and method based on pseudo ontology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810563501.9A CN108920521B (en) 2018-06-04 2018-06-04 User portrait-project recommendation system and method based on pseudo ontology

Publications (2)

Publication Number Publication Date
CN108920521A true CN108920521A (en) 2018-11-30
CN108920521B CN108920521B (en) 2021-07-09

Family

ID=64418204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810563501.9A Active CN108920521B (en) 2018-06-04 2018-06-04 User portrait-project recommendation system and method based on pseudo ontology

Country Status (1)

Country Link
CN (1) CN108920521B (en)

Also Published As

Publication number Publication date
CN108920521B (en) 2021-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant