CN102819575A - Personalized search method for Web service recommendation - Google Patents

Personalized search method for Web service recommendation

Info

Publication number
CN102819575A
Authority
CN
China
Prior art keywords
speech
user
service
interest
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012102538842A
Other languages
Chinese (zh)
Other versions
CN102819575B (en)
Inventor
窦万春
胡蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Huakang Information Technology Co Ltd
Ten Party Health Management (jiangsu) Ltd
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201210253884.2A
Publication of CN102819575A
Application granted
Publication of CN102819575B
Current legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a personalized search method for Web service recommendation. The method comprises the following steps: 1, preprocessing WSDL (Web Services Description Language) documents, i.e., forming a bag of words through two preprocessing steps, removing stop words and extracting stems; 2, extracting user interest, i.e., calculating the weight of each word in the bag of words with an improved TF-IDF (Term Frequency-Inverse Document Frequency) formula and multiplying it by the word's time decay factor to obtain a new weight, then selecting the top k words by new weight, in descending order, as the user's interest words, which together with their weights form a k-dimensional user interest vector; 3, calculating interest similarity, i.e., setting a similarity threshold and selecting the users whose interest similarity to the target user exceeds the threshold as the target user's neighbor users; and 4, ranking the service search results, i.e., calculating a recommendation predicted value for each service from the neighbor users' similarities and the number of times they selected the service, and arranging the search results in descending order of predicted value, thereby obtaining the personalized search result.

Description

A personalized search method for Web service recommendation
Technical field
The present invention relates to Web search and recommendation in the field of computer software technology, and in particular to a personalized search method for Web service recommendation.
Background art
To keep meeting demands for flexibility, extensibility, correctness, and robustness in software systems, software engineering practice has gradually developed methods that let systems be built on existing software resources rather than developed entirely from scratch. These methods have accelerated software development and improved productivity. At the technical level, decomposing the functions a piece of software implements into relatively simple, reusable functional modules also gives software engineering a better technique for managing software.
At present, the widely accepted software reuse technique is component-based software engineering (CBSE). Service-oriented computing (SOC) is a new component-based software development paradigm; the infrastructure of SOC is the service-oriented architecture (SOA); and Web services together with SOA are one form of realizing SOC.
As an emerging, Internet-oriented distributed computing model, SOC provides better enabling technology for building loosely coupled, cross-organizational integrated applications. A service-oriented architecture provides the basic guarantee for using service resources through its "publish-find-bind" pattern. However, service users and service providers are separated, which makes it harder for users to understand, obtain, and use the services they need. In particular, when a user's needs change as the application construction process evolves, how to let the user obtain suitable services is a problem that must be solved. Traditional service discovery techniques address it mainly by having the user actively submit query requests, or by letting users browse the resource collection manually according to some taxonomy. As the resource collection keeps growing, manual service lookup becomes tedious, time-consuming, and error-prone. Current Web service search techniques fall into four kinds: UDDI registries, Web service websites (such as XMethods and RemoteMethods), general-purpose search engines (such as Google and Yahoo), and specialized search engines (such as seekda and Merobase). These approaches mainly support keyword search; the user does not participate in the retrieval process, so the results are unrelated to the user's interests, let alone able to change as those interests change.
Unlike traditional search techniques, personalized search analyzes the service pages in the search results and compares them with the user's interests, helping the user find the more interesting services and presenting them first in the result list, thereby improving the user's search efficiency. In Google personalized search, for example, the system lets users customize the look and feel they prefer (including the level of content filtering, language selection, and query suggestion settings), and Google's personalized Subscribed Links let users create custom results in their own Google search engine to present service links to their customers. Another personalized search offering lets users search for information of interest according to their own behavior patterns and supports managing and sharing search results; users can add annotations and can classify and sort Web pages according to their own needs.
Personalized recommendation technology mines users' individual preferences in depth and adopts an active, "push" style of information delivery, automatically providing users with information that meets their individual needs instead of requiring them to look for content of interest in the mass of Web information, which makes it more efficient for users to obtain useful information. In 1992 the first recommender system, Tapestry, appeared; it applied collaborative filtering to e-mail and achieved good results. Since then, recommender systems have attracted growing attention because of their broad application value. In 1996, Yahoo introduced a recommender system into its portal, adding the personalized user entry MyYahoo to offer personalized services to different users. In 1997, AT&T Labs proposed the collaborative-filtering-based personalized recommender systems Referral Web and PHOAKS. In 2001, IBM added a personalized recommender system to its e-commerce platform WebSphere so that merchants could develop personalized e-commerce websites. Similar products include GroupLens, Amazon, and Netflix, with applications spanning e-mail filtering, e-commerce websites, news websites, search engines, online DVD rental websites, and some Web 2.0 social websites.
Personalized search makes heavy use of the basic principles of personalized recommendation, and personalized recommendation in turn borrows many basic techniques from personalized search. As the two most closely related and most central technologies in personalized services, they can largely satisfy the differing information needs of different users and have broad application prospects.
As a tool for effective information retrieval, a search engine helps users obtain the content they need from massive Web resources quickly and efficiently, greatly improving the efficiency of information acquisition. With the continuing growth of Web service resources and the further development of search engine technology, and driven by users' actual needs, personalized search has gradually become a research focus in the search field. The core of a personalized search method for Web services is to filter and rank the service retrieval results differently for each user according to that user's individual interests and preferences, so that different users receive differentiated results that meet their individual needs.
However, finding a reasonably objective and accurate search method for Web resources, one that pushes services precisely and satisfies the needs of different users, remains difficult.
Summary of the invention
Object of the invention: the technical problem to be solved by the present invention is the imprecise and time-consuming search of the prior art, for which it provides a personalized search method for Web service recommendation.
To solve the above technical problem, the invention discloses a personalized search method for Web service recommendation, comprising the following steps:
Step 1, preprocess the Web Service Description Language (WSDL) documents: obtain from the user's service records the WSDL documents of the services the user has selected, and form a bag of words through two preprocessing steps, removing stop words and extracting stems.
Step 2, extract user interest: use the improved TF-IDF formula to calculate the weight of each word in the bag of words and multiply it by the time decay factor to obtain the new weight $\delta_{ij}$; select the top k words by new weight $\delta_{ij}$, in descending order, as the user's interest words, and combine them with their weights $\delta_{ij}$ into a k-dimensional user interest vector. Keeping only the k largest weights and their words reduces the dimension of the user interest vector space and makes it the same for all users, which helps in computing the interest similarity between every pair of users efficiently.
Step 3, calculate similarity: use the vector cosine formula to calculate the cosine of the angle between every pair of users' interest vectors as their similarity; set a similarity threshold, and select the users whose similarity to the target user exceeds the threshold as the target user's neighbor users. The similarity threshold is set in the range 0 to 1.
Step 4, rank the service retrieval results: the target user submits a service request, and the Web service search engine retrieves all services that match the request; from the number of times the neighbor users selected these services and their similarity to the target user, the recommendation predicted value of each retrieved result is calculated with the weighted-average prediction formula; the results are sorted in descending order of predicted value, yielding the personalized search result.
In the present invention, the improved TF-IDF (Term Frequency-Inverse Document Frequency) formula is as follows:
$$tf(t_{ij}) = \frac{freq(t_{ij}, D_i)}{|D_i|}$$
$$idf(t_{ij}) = \log\frac{|D|}{|\{D_i : t_{ij} \in D_i\}|}$$
$$\omega_{ij} = tf(t_{ij}) \cdot idf^2(t_{ij})$$
where $t_{ij}$ is the j-th word in the i-th user's bag of words, $tf(t_{ij})$ is the term frequency of word $t_{ij}$, $D_i$ is the i-th user's bag of words, $freq(t_{ij}, D_i)$ is the number of times word $t_{ij}$ occurs in bag $D_i$, $|D_i|$ is the number of words in $D_i$, $idf(t_{ij})$ is the inverse document frequency of word $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users' bags of words in which $t_{ij}$ occurs, and $\omega_{ij}$ is the weight of word $t_{ij}$.
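The weighting above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the logarithm is taken as base 10 and $|D|$ as the number of users' bags (both are what the worked example later in the description appears to use), $|D_i|$ is taken as the number of distinct words in the bag, and the function name `improved_tfidf` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def improved_tfidf(bags):
    """Return {user: {word: weight}} with weight = tf * idf**2.

    bags maps a user name to a Counter of word occurrences (that user's bag).
    """
    n_bags = len(bags)  # |D|, assumed here to be the number of users' bags
    # Number of users' bags in which each word occurs at least once.
    bag_count = Counter(word for bag in bags.values() for word in bag)
    weights = {}
    for user, bag in bags.items():
        size = len(bag)  # |D_i|, assumed to be the number of distinct words
        weights[user] = {
            word: (freq / size) * math.log10(n_bags / bag_count[word]) ** 2
            for word, freq in bag.items()
        }
    return weights


# Hypothetical toy corpus of three users' bags of words.
bags = {
    "u1": Counter({"hotel": 4, "room": 2}),
    "u2": Counter({"hotel": 1, "weather": 3}),
    "u3": Counter({"weather": 2, "city": 2}),
}
weights = improved_tfidf(bags)
```

Squaring the idf term, compared with plain TF-IDF, pushes down words that occur in many users' bags more strongly, which is presumably why the formula is called "improved".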
The time decay factor is computed as follows:
$$Decay = 2 - e^{\alpha t}$$
where Decay is the time decay factor and e is the base of the natural logarithm, approximately 2.718. α is the decay rate, with values in the range [0, 0.1]; it can be set to 0.1, for example. When α is 0, Decay = 1, meaning the weight does not decay over time; the larger α is, the faster the decay. t is the difference between the current time and the time the user last selected the service. The time decay factor is designed to reflect the way user interest fades over time. The new weight is the product of the original weight and the time decay factor, so the weight of a word that has not been selected for a long time gradually decays to 0.
The new weight $\delta_{ij}$ of word $t_{ij}$ in each user's bag of words is computed as:
$$\delta_{ij} = \omega_{ij} \cdot Decay$$
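A minimal Python sketch of the decay step and the top-k selection from step 2 follows. Several points are assumptions rather than statements from the text: the elapsed time t is tracked per word, its unit is left unspecified as in the source, and the factor is clamped at 0 because the raw expression $2 - e^{\alpha t}$ becomes negative once $\alpha t$ exceeds $\ln 2$. The names `decay_factor` and `interest_vector` and the sample weights are hypothetical.

```python
import heapq
import math


def decay_factor(alpha, t):
    """Time decay factor Decay = 2 - e^(alpha * t)."""
    return 2.0 - math.exp(alpha * t)


def interest_vector(weights, elapsed, alpha=0.05, k=6):
    """Apply the decay to each word weight and keep the k largest new weights.

    weights: {word: omega}; elapsed: {word: time since the word was last selected}.
    """
    decayed = {
        # Clamping at 0 is an assumption made for this sketch.
        word: omega * max(0.0, decay_factor(alpha, elapsed[word]))
        for word, omega in weights.items()
    }
    return heapq.nlargest(k, decayed.items(), key=lambda item: item[1])


# Hypothetical weights and elapsed times, for illustration only.
omega = {"hotel": 4.0, "room": 3.0, "ticket": 2.0}
elapsed = {"hotel": 1, "room": 1, "ticket": 20}
top2 = interest_vector(omega, elapsed, alpha=0.05, k=2)
```

Here "ticket" drops out of the vector because its decayed weight reaches 0, while the two recently selected words keep about 95% of their original weight.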
In the present invention, the similarity is calculated as follows:
$$sim(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^2} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^2}}$$
where $u_a$ and $u_b$ are two different users, $sim(u_a, u_b)$ is the similarity between them, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the bags of words of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.
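The similarity and the neighbor selection of step 3 might be sketched as below. The sketch assumes that components are aligned by word, so only interest words shared by both users contribute to the numerator (this matches how the worked example in the description multiplies weights); the helper names and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors given as {word: weight}."""
    dot = sum(w * vec_b[word] for word, w in vec_a.items() if word in vec_b)
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen in this sketch for empty vectors
    return dot / (norm_a * norm_b)


def neighbors(target, vectors, threshold):
    """Users whose similarity to the target exceeds the threshold."""
    result = {}
    for user, vec in vectors.items():
        if user == target:
            continue
        sim = cosine_similarity(vectors[target], vec)
        if sim > threshold:
            result[user] = sim
    return result


# Hypothetical interest vectors.
vectors = {
    "a": {"x": 3.0, "y": 4.0},
    "b": {"x": 3.0, "y": 4.0},
    "c": {"z": 1.0},
}
near_a = neighbors("a", vectors, threshold=0.5)
```

Because all weights are non-negative, the similarity lies between 0 and 1, which is consistent with the threshold range given in step 3.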
In the present invention, the recommendation predicted value of each retrieved result is calculated with the following weighted-average prediction formula:
$$P_{u_t,s_t} = \bar{c}_{u_t} + \frac{\sum_{u_i \in N} (c_{u_i,s_t} - \bar{c}_{u_i}) \cdot sim(u_t, u_i)}{\sqrt{\sum_{u_i \in N} sim(u_t, u_i)^2}}$$
where $u_t$ is the target user, $s_t$ is the target service (the service whose recommendation predicted value is to be calculated), $P_{u_t,s_t}$ is the recommendation predicted value of target user $u_t$ for target service $s_t$, $\bar{c}_{u_t}$ and $\bar{c}_{u_i}$ are the average numbers of times that target user $u_t$ and neighbor user $u_i$ select a service, $c_{u_i,s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$, $sim(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$, and N is the set of neighbors of target user $u_t$.
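The prediction and the descending sort of step 4 can be illustrated with the Python sketch below. It assumes the denominator is the square root of the summed squared similarities, which is the reading under which the worked example in the description yields 3.5; falling back to the target user's own average when there are no neighbors is a further assumption. The function names and sample data are hypothetical.

```python
import math


def predict(target_mean, neighbor_stats):
    """Recommendation predicted value for one service.

    neighbor_stats: list of (times_selected, neighbor_mean, similarity) tuples,
    one per neighbor of the target user.
    """
    denom = math.sqrt(sum(sim * sim for _, _, sim in neighbor_stats))
    if denom == 0.0:
        return target_mean  # no usable neighbors: assumed fallback
    num = sum((count - mean) * sim for count, mean, sim in neighbor_stats)
    return target_mean + num / denom


def rank(results, target_mean):
    """Sort service names by predicted value, largest first."""
    return sorted(
        results,
        key=lambda name: predict(target_mean, results[name]),
        reverse=True,
    )


# Hypothetical retrieval results: one neighbor with similarity 0.5 and mean 1.5.
results = {
    "ServiceB": [(0, 1.5, 0.5)],  # the neighbor never selected it
    "ServiceA": [(3, 1.5, 0.5)],  # the neighbor selected it 3 times
}
ordered = rank(results, target_mean=2.0)
```

Services that neighbors selected more often than their own average move up the list, and services they selected less often move down.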
In the present invention, removing stop words means the following. In information retrieval, stop words are words that occur too frequently to carry much meaning for retrieval. Stop-word processing is one step of vector-based word segmentation in the knowledge extraction process, and handling it as a separate step improves the speed and quality of document processing. Several published English stop-word lists exist, the better known being the list published by Van Rijsbergen and the Brown Corpus stop-word list. Well-known Chinese stop-word lists include the Harbin Institute of Technology stop-word list, the stop-word dictionary of the Machine Intelligence Laboratory of Sichuan University, and the Baidu stop-word list. The stop-word list used here contains not only common stop words such as a, by, is, and at, but also words that occur frequently in the Web service domain, such as service, soap, response, request, set, and get; these words do little to distinguish one Web service from another and easily introduce noise. Words contained in this list are removed from the WSDL documents. A WSDL document has seven important elements: types, import, message, portType, operation, binding, and service, all nested inside the definitions root element. WSDL4J (the Web Services Description Language for Java Toolkit) is used to parse the WSDL documents of the services the user has selected; stop words are removed from the parsed content and stems are extracted, forming the user's bag of words.
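As a rough illustration of this preprocessing, the Python sketch below collects the `name` attributes of a WSDL document, splits camel-case identifiers into words, and drops stop words. It substitutes Python's standard `xml.etree.ElementTree` parser for WSDL4J, leaves stemming out, and uses only the handful of stop words named in the text; the sample document, the tokenization rule, and the function names are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Only the stop words named in the text; a real list would be far longer.
STOP_WORDS = {"a", "by", "is", "at", "service", "soap",
              "response", "request", "set", "get"}

# Hypothetical, heavily abbreviated WSDL document.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="HotelBookingService">
  <message name="getRoomRateRequest"/>
  <portType name="HotelBookingPort">
    <operation name="getRoomRate"/>
  </portType>
</definitions>"""


def tokens_from_wsdl(wsdl_text):
    """Lower-cased words taken from every element's name attribute."""
    root = ET.fromstring(wsdl_text)
    words = []
    for element in root.iter():
        name = element.get("name")
        if name:
            # Split identifiers such as getRoomRate into get, Room, Rate.
            words.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name))
    return [word.lower() for word in words]


def bag_of_words(wsdl_text):
    """Word counts after stop-word removal (no stemming in this sketch)."""
    return Counter(w for w in tokens_from_wsdl(wsdl_text) if w not in STOP_WORDS)


bag = bag_of_words(SAMPLE_WSDL)
```

In a fuller pipeline each remaining word would then be passed through a stemmer before counting, so that variants such as "booking" and "book" fall into the same entry.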
In the present invention, a stem is what remains of a word after all inflectional affixes are removed, and stemming is the process of stripping affixes to obtain the root. The invention applies the Porter stemming algorithm, devised by Dr. Martin Porter in 1979 at the Computer Laboratory of the University of Cambridge, UK, to the words in the WSDL documents, so that interest words are extracted more accurately and without duplicates.
Compared with existing personalized search methods, this method has three characteristics. First, it not only extracts each user's interests implicitly but also captures the relationships between different users' interests by calculating interest similarity, and it applies collaborative filtering to rank the service search results by interest, which improves the accuracy and relevance of the search results to some extent. Second, a time decay factor is added when interests are formed, so the way user interest evolves over time is represented more accurately. Third, steps 1, 2, and 3 of the method can all be completed offline, so they have very little impact on retrieval efficiency.
The present invention applies the basic principles of personalized recommendation, bringing collaborative filtering to the personalized search of Web services to improve user satisfaction and retrieval precision. Specifically, the invention collects the user's search records, extracts user interest from the description documents of the Web services the user selected, and forms an interest vector; it measures the similarity of user interests by the cosine of the angle between interest vectors and selects the users whose similarity to the target user exceeds a certain threshold as that user's neighbors. When the target user submits a service search request, the service recommendation system uses one of the search techniques described above to retrieve the services matching the keywords, but it does not return the results to the user directly; instead, it calculates a recommendation predicted value for each result from the neighbors' selection experience and their interest similarity, sorts the results in descending order, and then returns them. In this way, the customization of the search results is transparent to the user, and personalized service search is accomplished by means of service recommendation.
Beneficial effects: extracting user interest does not require querying the user frequently or collecting explicit feedback, so it is transparent to the user and can win acceptance and use by more users. User interest is tied to time: the weight of an interest that has not been selected again for a long time decays gradually and the interest finally drops out of the user interest vector, while recently and frequently selected service interests are added to the vector promptly, so changes in user interest are expressed and tracked more accurately. Because collaborative filtering is used to predict and rank the search results, a target user with no experience of the required service can still obtain personalized recommendations from the experience of other similar users. The invention, which belongs to the field of computer software technology, can be widely applied to personalized Web service search and to supporting service recommendation.
Brief description of the drawings
The present invention is described in further detail below with reference to the accompanying drawing and the embodiments, from which the above and other advantages of the invention will become more apparent.
Fig. 1 is a flowchart of the personalized search method for Web service recommendation according to the present invention.
Detailed description
As shown in Fig. 1, the invention discloses a personalized search method for Web service recommendation, comprising the following steps:
Step 1, preprocess the WSDL documents: obtain from the user's service records the WSDL documents of the services the user has selected, and form a bag of words through two preprocessing steps, removing stop words and extracting stems.
Step 2, extract user interest: use the improved TF-IDF formula to calculate the weight of each word in the bag of words and multiply it by the word's time decay factor to obtain the new weight; select the top k words by weight, in descending order, as the user's interest words, and combine them with their weights into a k-dimensional user interest vector.
Step 3, calculate similarity: use the vector cosine formula to calculate the cosine of the angle between every pair of users' interest vectors as their similarity; set a similarity threshold, and select the users whose similarity to the target user exceeds the threshold as the target user's neighbor users.
Step 4, rank the service retrieval results: the target user submits a service request, and the Web service search engine retrieves all services that match the request; from the number of times the neighbor users selected these services and their similarity to the target user, the recommendation predicted value of each retrieved result is calculated with the weighted-average prediction formula; the results are sorted in descending order of predicted value, yielding the personalized search result.
The improved TF-IDF formula is as follows:
$$tf(t_{ij}) = \frac{freq(t_{ij}, D_i)}{|D_i|}$$
$$idf(t_{ij}) = \log\frac{|D|}{|\{D_i : t_{ij} \in D_i\}|}$$
$$\omega_{ij} = tf(t_{ij}) \cdot idf^2(t_{ij})$$
where $t_{ij}$ is the j-th word in the i-th user's bag of words, $tf(t_{ij})$ is the term frequency of word $t_{ij}$, $D_i$ is the i-th user's bag of words, $freq(t_{ij}, D_i)$ is the number of times word $t_{ij}$ occurs in bag $D_i$, $|D_i|$ is the number of words in $D_i$, $idf(t_{ij})$ is the inverse document frequency of word $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users' bags of words in which $t_{ij}$ occurs, and $\omega_{ij}$ is the weight of word $t_{ij}$.
The time decay factor is computed as follows:
$$Decay = 2 - e^{\alpha t}$$
where Decay is the time decay factor and e is the base of the natural logarithm; α is the decay rate, with values in the range [0, 0.1]. When α is 0, Decay = 1, meaning the weight does not decay over time; the larger α is, the faster the decay. t is the difference between the current time and the time the user last selected the service.
The new weight of word $t_{ij}$ in each user's bag of words is computed as:
$$\delta_{ij} = \omega_{ij} \cdot Decay$$
In the present invention, the similarity is calculated as follows:
$$sim(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^2} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^2}}$$
where $u_a$ and $u_b$ are two different users, $sim(u_a, u_b)$ is the similarity between them, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the bags of words of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.
In the present invention, the recommendation predicted value of each retrieved result is calculated with the following weighted-average prediction formula:
$$P_{u_t,s_t} = \bar{c}_{u_t} + \frac{\sum_{u_i \in N} (c_{u_i,s_t} - \bar{c}_{u_i}) \cdot sim(u_t, u_i)}{\sqrt{\sum_{u_i \in N} sim(u_t, u_i)^2}}$$
where $u_t$ is the target user, $s_t$ is the target service (the service whose recommendation predicted value is to be calculated), $P_{u_t,s_t}$ is the recommendation predicted value of target user $u_t$ for target service $s_t$, $\bar{c}_{u_t}$ and $\bar{c}_{u_i}$ are the average numbers of times that target user $u_t$ and neighbor user $u_i$ select a service, $c_{u_i,s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$, $sim(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$, and N is the set of neighbors of target user $u_t$.
Embodiment
The data used in this embodiment come from the back-end database of the Web Service Supermarket (http://125.221.225.2:8080/WSSM/).
This embodiment comprises the following four steps:
(1) Preprocess the WSDL documents
The service records of 200 users are extracted from the back-end database of the Web Service Supermarket as the raw data. The service records of some of the users are as follows:
Table 1. User service records (excerpt)
[Table 1 is an image in the original publication and is not reproduced here.]
Table 1 lists four users, with user names "taolaoliu", "fangfang", "zww", and "skh", each of whom has selected several Web services. The description documents of the Web services each user selected are downloaded and preprocessed: stop words are removed according to the stop-word list published by Van Rijsbergen, and stems are extracted with Dr. Martin Porter's Porter stemming algorithm, forming the bag of words. For example, "taolaoliu" selected three Web services named "BookingService", "JasonsBooking", and "HotelBookingEngine". The WSDL documents of these three services are downloaded from the corresponding service websites, and after stop-word removal and stemming the resulting bag of words is "render (84), hotel (99), reservation (40), invoice (33), room (269), city (81), client (13), book (194), ticket (13), basket (42), rate (25)". This bag contains 11 words in total, and the number in parentheses after each word is the number of times the word occurs in the documents.
(2) Extract user interest
All users' bags of words together form the corpus, and the improved TF-IDF formula is used to calculate the weight of each word in each bag; the weight of each word in each user's bag is then multiplied by the time decay factor to obtain the new weight. The k words with the largest weights, together with those weights, form the user interest vector. Take the word "render" in the bag of "taolaoliu": it occurs 84 times in that bag, and it occurs in the bags of 68 of the 200 users. The weight of "render" according to the improved TF-IDF formula of step 2 is therefore calculated as follows:
$$tf(\text{``render''}) = \frac{84}{11} = 7.64$$
$$idf(\text{``render''}) = \log\frac{200}{68} = 0.47$$
$$\omega_{\text{``render''}} = 7.64 \times 0.47^2 = 1.68$$
Next the time decay factor is calculated. With α set to 0.05 and t at its initial value of 1, Decay is:
$$Decay = 2 - e^{0.05 \times 1} = 0.95$$
The new weight is then:
$$\delta_{\text{``render''}} = 1.68 \times 0.95 = 1.59$$
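The arithmetic of this worked example can be checked with a few lines of Python. The check assumes the logarithm is base 10, which is the base that reproduces the idf value of 0.47, and it carries unrounded intermediate values through the calculation; the variable names are chosen for this illustration.

```python
import math

freq, bag_size = 84, 11               # occurrences of "render"; words in the bag
n_users, users_with_word = 200, 68    # corpus size; bags containing "render"
alpha, t = 0.05, 1                    # decay rate and elapsed time from the example

tf = freq / bag_size
idf = math.log10(n_users / users_with_word)  # base 10 is an assumption
omega = tf * idf ** 2
decay = 2 - math.exp(alpha * t)
delta = omega * decay

print(f"tf={tf:.2f} idf={idf:.2f} omega={omega:.2f} decay={decay:.2f} delta={delta:.2f}")
# prints: tf=7.64 idf=0.47 omega=1.68 decay=0.95 delta=1.59
```

With unrounded intermediates the results agree with the figures in the text to two decimal places.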
Likewise, the weights of the remaining words in the bag are calculated, and the 6 words with the largest weights are kept (i.e. k = 6). The interest vector of user "taolaoliu" is: <(basket, 6.42), (hotel, 4.03), (room, 3.15), (book, 2.82), (render, 1.59), (information, 1.24)>; that of user "fangfang" is: <(book, 3.31), (price, 3.26), (title, 3.23), (author, 3.17), (ISBN, 2.15), (information, 1.13)>; that of user "zww" is: <(weather, 4.42), (city, 3.33), (forecast, 3.29), (replication, 2.12), (add, 1.12), (id, 1.11)>; and that of user "skh" is: <(weather, 3.39), (comment, 3.31), (forecast, 2.27), (city, 2.22), (replication, 1.20), (add, 1.10)>. The number after each interest word is the new weight of that word.
(3) Calculate the interest similarity
The vector cosine formula is used to calculate the cosine of the angle between every pair of users' interest vectors as their similarity; a similarity threshold is set, and the users whose similarity exceeds it are selected as the target user's neighbors. For example, the similarity between user "taolaoliu" and user "fangfang" is calculated with the similarity formula as follows:
$$sim(\text{``taolaoliu''}, \text{``fangfang''}) = \frac{2.82 \times 3.31 + 1.24 \times 1.13}{\sqrt{6.42^2 + 4.03^2 + 3.15^2 + 2.82^2 + 1.59^2 + 1.24^2} \times \sqrt{3.31^2 + 3.26^2 + 3.23^2 + 3.17^2 + 2.15^2 + 1.13^2}} = 0.17$$
With the similarity threshold set to 0.15, "taolaoliu" and "fangfang" are neighbor users of each other.
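The similarity value can be reproduced with the short Python check below. It assumes the two users share exactly the interest words "book" and "information", which correspond to the two products in the numerator; the dictionary layout and variable names are illustrative only.

```python
import math

taolaoliu = {"basket": 6.42, "hotel": 4.03, "room": 3.15,
             "book": 2.82, "render": 1.59, "information": 1.24}
fangfang = {"book": 3.31, "price": 3.26, "title": 3.23,
            "author": 3.17, "isbn": 2.15, "information": 1.13}

# Only words present in both interest vectors contribute to the dot product.
dot = sum(w * fangfang[word] for word, w in taolaoliu.items() if word in fangfang)
norm_a = math.sqrt(sum(w * w for w in taolaoliu.values()))
norm_b = math.sqrt(sum(w * w for w in fangfang.values()))
sim = dot / (norm_a * norm_b)

threshold = 0.15
is_neighbor = sim > threshold

print(f"sim={sim:.2f} neighbor={is_neighbor}")
# prints: sim=0.17 neighbor=True
```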
(4) Rank the service retrieval results
The target user submits a service request, and the Web Service Supermarket retrieves all services that match it; from the neighbors' service selection experience and their similarity to the target user, the recommendation predicted value of each retrieved result is calculated with the weighted-average prediction formula. For example, suppose the results retrieved for a request submitted by target user "taolaoliu" include the Web service named "BookStoreService", that among the neighbors only "fangfang" has selected this service, 3 times, that user "taolaoliu" selects a service 2 times on average, and that user "fangfang" selects a service 1.5 times on average. The recommendation predicted value of this service is then calculated as follows:
$$P_{\text{``BookStoreService''}} = 2 + \frac{(3 - 1.5) \times 0.17}{\sqrt{0.17^2}} = 3.5$$
The retrieved results are sorted in descending order of recommendation predicted value, so the user can quickly find personalized results that match their interests on the first page of results.
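A quick Python check of this prediction follows, under the assumption that the denominator is the square root of the summed squared similarities (the reading that gives 3.5). The tuple layout and variable names are hypothetical.

```python
import math

target_mean = 2.0  # average number of selections by "taolaoliu"
# (neighbor, times the service was selected, neighbor's average, similarity)
neighbors = [("fangfang", 3, 1.5, 0.17)]

numerator = sum((count - mean) * sim for _, count, mean, sim in neighbors)
denominator = math.sqrt(sum(sim ** 2 for _, _, _, sim in neighbors))
prediction = target_mean + numerator / denominator

print(f"{prediction:.1f}")
# prints: 3.5
```

With a single neighbor the similarity cancels out, so the prediction is simply the target user's average plus the neighbor's deviation from their own average.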
Implementation results:
User "zww", as the current target user, wants to find a service for buying books online. The keyword "book" was submitted as the service request both to the seekda search system (http://webservices.seekda.com/, prior art) and to the Web Service Supermarket; the top 10 results returned by each are shown in Table 2 and Table 3 respectively.
Table 2. Top 10 services in the seekda search results
[Table 2 is an image in the original publication and is not reproduced here.]
In Table 2, only the services numbered 2, 4, and 5 offer online book purchasing, so after receiving the results user "zww" still has to look manually for a service that meets the need, a process that is often time-consuming, tedious, and error-prone.
Table 3. Top 10 services in the Web Service Supermarket search results
[Table 3 is an image in the original publication and is not reproduced here.]
In Table 3, apart from the service numbered 9, all services relate to buying books online. This shows that personalized search, which uses collaborative filtering to calculate the recommendation predicted value of each service, can improve the accuracy of service lookup and the efficiency of the user's search, and so raise the user's satisfaction with the Web service search engine.
The invention provides a personalized search method for Web service recommendation. There are many ways and means of implementing this technical solution, and the above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art can make improvements and refinements without departing from the principle of the invention, and such improvements and refinements should also be regarded as falling within the scope of protection of the invention. Any component not made explicit in this embodiment can be implemented with the prior art.

Claims (4)

1. A personalized search method for Web service recommendation, characterized by comprising the following steps:
Step 1, preprocessing WSDL (Web Services Description Language) documents: obtaining, from a user's service usage records, the WSDL documents of the services the user has selected, and forming a bag of words through two preprocessing steps, stop-word removal and stemming;
Step 2, extracting user interest: calculating the weight of each word in the bag of words and multiplying it by a time decay factor to obtain a new weight $\delta_{ij}$; selecting the top k words in descending order of new weight $\delta_{ij}$ as the user's interest words, which together with their respective weights $\delta_{ij}$ form a k-dimensional user interest vector;
Step 3, calculating interest similarity: calculating the cosine of the angle between every two user interest vectors as their interest similarity; setting a similarity threshold, and selecting the users whose similarity exceeds the threshold as neighbor users of the target user;
Step 4, ranking service search results: the target user submits a service request, and the Web service search engine retrieves all services that match the request; according to the number of times the neighbor users selected these services and their similarity to the target user, a weighted-average prediction formula is used to calculate the recommendation prediction value of each search result; the search results are sorted in descending order of recommendation prediction value, thereby obtaining the personalized search results.
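Step 1 of this claim can be illustrated with a minimal sketch. Everything specific here is an assumption, not taken from the patent: the words are drawn from `name="..."` attributes of the WSDL text, the stop-word list is invented, and `naive_stem` is a crude suffix stripper standing in for a real stemming algorithm.

```python
import re

# Hypothetical stop-word list; the claim does not enumerate one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "get", "set", "by", "for"}

def naive_stem(word: str) -> str:
    """Crude suffix stripping; a placeholder for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def wsdl_to_bag(wsdl_text: str) -> list[str]:
    """Turn the names found in a WSDL document into a bag of stemmed words."""
    # Assumption: interest words come from name="..." attributes.
    names = re.findall(r'name="([^"]+)"', wsdl_text)
    bag = []
    for name in names:
        # Split camelCase identifiers such as "getBookPrice" into tokens.
        for token in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name):
            token = token.lower()
            if token not in STOP_WORDS:
                bag.append(naive_stem(token))
    return bag

sample = '<operation name="getBookPrice"/><message name="SearchBooksRequest"/>'
print(wsdl_to_bag(sample))  # ['book', 'price', 'search', 'book', 'request']
```

Repeated words are kept on purpose, because the later weighting step needs the number of occurrences of each word in the bag.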
2. The personalized search method for Web service recommendation according to claim 1, characterized in that, in step 2, calculating the weight of each word in the bag of words and multiplying it by the time decay factor of that word to obtain the new weight $\delta_{ij}$ comprises the following steps:
The weight $\omega_{ij}$ is calculated using an improved TF-IDF formula:
$$\mathrm{tf}(t_{ij}) = \frac{\mathrm{freq}(t_{ij}, D_i)}{|D_i|},$$
$$\mathrm{idf}(t_{ij}) = \log \frac{|D|}{|\{D_i : t_{ij} \in D_i\}|},$$
$$\omega_{ij} = \mathrm{tf}(t_{ij}) \cdot \mathrm{idf}^2(t_{ij}),$$
where $t_{ij}$ is the j-th word in the i-th user's bag of words, $\mathrm{tf}(t_{ij})$ is the term frequency of word $t_{ij}$, $D_i$ is the i-th user's bag of words, $\mathrm{freq}(t_{ij}, D_i)$ is the number of times word $t_{ij}$ occurs in bag of words $D_i$, $|D_i|$ is the number of words in $D_i$, $\mathrm{idf}(t_{ij})$ is the inverse document frequency of word $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users' bags of words in which word $t_{ij}$ appears, and $\omega_{ij}$ is the weight of word $t_{ij}$;
The time decay factor is calculated as follows:
$$\mathrm{Decay} = 2 - e^{\alpha t}$$
where Decay denotes the time decay factor and e is the base of the natural logarithm; $\alpha$ is the decay rate, with value range [0, 0.1]; when $\alpha$ is 0, Decay = 1, meaning the weight does not decay over time; the larger $\alpha$ is, the faster the decay; t is the difference between the current time and the time the user last selected a service;
The new weight $\delta_{ij}$ of word $t_{ij}$ in each user's bag of words is calculated as:
$$\delta_{ij} = \omega_{ij} \cdot \mathrm{Decay}.$$
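The weighting in this claim can be sketched as follows. Several choices are assumptions rather than statements of the patent: the number of users' bags of words is used as |D|, the logarithm is taken as the natural logarithm, the time unit of t is left open, and the function and variable names are hypothetical.

```python
import math
from collections import Counter

def interest_vector(bags, user, t, alpha=0.05, k=3):
    """Top-k decayed word weights for one user, as a {word: weight} dict.

    bags  : dict user -> list of words (that user's bag of words D_i)
    t     : time since the user last selected a service (unit is an assumption)
    alpha : decay rate, claimed range [0, 0.1]
    """
    bag = bags[user]
    counts = Counter(bag)
    decay = 2 - math.exp(alpha * t)   # Decay = 2 - e^(alpha * t)
    n_bags = len(bags)                # used here as |D| (assumption)
    weights = {}
    for word, freq in counts.items():
        tf = freq / len(bag)
        containing = sum(1 for b in bags.values() if word in b)
        idf = math.log(n_bags / containing)      # natural log (assumption)
        weights[word] = tf * idf ** 2 * decay    # delta = tf * idf^2 * Decay
    top = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)

bags = {
    "u1": ["book", "book", "price", "search"],
    "u2": ["book", "weather"],
    "u3": ["weather", "city"],
}
print(interest_vector(bags, "u1", t=0, k=2))
```

In this toy data "book" occurs most often for `u1` but also appears in another user's bag, so the squared idf term pushes it below the rarer words "price" and "search". Note also that 2 - e^(αt) falls below zero once αt exceeds ln 2 (about 0.693), so the unit chosen for t matters.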
3. The personalized search method for Web service recommendation according to claim 1, characterized in that the user interest similarity in step 3 is calculated as follows:
$$\mathrm{sim}(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^2} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^2}},$$
where $u_a$ and $u_b$ are two different users, $\mathrm{sim}(u_a, u_b)$ is the similarity between these two users, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the bags of words of user $u_a$ and user $u_b$, respectively, and k is the number of user interest words.
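A minimal sketch of this similarity and of the neighbor selection in step 3 follows. It assumes that interest vectors are stored as `{word: weight}` dicts and that a word missing from one user's vector counts as weight 0; the claim does not say how words are aligned across users. The users `u2` and `u3`, the weights, and the threshold 0.5 are invented for illustration.

```python
import math

def cosine_similarity(va, vb):
    """Cosine of two interest vectors stored as {word: weight} dicts."""
    dot = sum(w * vb[word] for word, w in va.items() if word in vb)
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def neighbors(target, vectors, threshold):
    """Users whose similarity to `target` exceeds the threshold."""
    result = {}
    for user, vec in vectors.items():
        if user == target:
            continue
        s = cosine_similarity(vectors[target], vec)
        if s > threshold:
            result[user] = s
    return result

vectors = {
    "zww": {"book": 0.6, "price": 0.8},
    "u2": {"book": 0.8, "price": 0.6},
    "u3": {"weather": 1.0},
}
print(neighbors("zww", vectors, threshold=0.5))
```

Here only `u2` shares interest words with the target user, so it is the only neighbor returned.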
4. The personalized search method for Web service recommendation according to claim 1, characterized in that, in step 4, the weighted-average prediction formula used to calculate the recommendation prediction value of each search result is as follows:
$$P_{u_t, s_t} = \bar{c}_{u_t} + \frac{\sum_{u_i \in N} (c_{u_i, s_t} - \bar{c}_{u_i}) \cdot \mathrm{sim}(u_t, u_i)}{\sum_{u_i \in N} \mathrm{sim}(u_t, u_i)^2},$$
where $u_t$ is the target user; $s_t$ is the target service, that is, the service whose recommendation prediction value is to be calculated; $P_{u_t, s_t}$ is the recommendation prediction value of target user $u_t$ for target service $s_t$; $\bar{c}_{u_t}$ and $\bar{c}_{u_i}$ are the average numbers of times that target user $u_t$ and neighbor user $u_i$, respectively, select services; $c_{u_i, s_t}$ is the number of times neighbor user $u_i$ has selected target service $s_t$; $\mathrm{sim}(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and N is the set of neighbors of target user $u_t$.
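The prediction and ranking of step 4 can be sketched as below. Two points are assumptions: the denominator is read as the sum of squared similarities, which is how the extracted formula appears to read, and a user's average selection count is taken over the services that user has selected at least once. All names and numbers are hypothetical.

```python
def predict(target, service, counts, sims):
    """Recommendation prediction value for one (target user, service) pair.

    counts : dict user -> dict service -> number of times selected
    sims   : dict neighbor user -> similarity to the target user (the set N)
    """
    def mean_count(user):
        selected = counts.get(user, {})
        return sum(selected.values()) / len(selected) if selected else 0.0

    numerator = sum(
        (counts.get(u, {}).get(service, 0) - mean_count(u)) * s
        for u, s in sims.items()
    )
    denominator = sum(s ** 2 for s in sims.values())  # assumed reading
    if denominator == 0:
        return mean_count(target)  # no usable neighbors: fall back to the mean
    return mean_count(target) + numerator / denominator

def rank(target, services, counts, sims):
    """Sort services in descending order of prediction value."""
    return sorted(
        services, key=lambda s: predict(target, s, counts, sims), reverse=True
    )

counts = {
    "zww": {"s1": 2, "s2": 4},
    "u2": {"s1": 5, "s2": 1},
    "u3": {"s1": 1, "s2": 3},
}
sims = {"u2": 0.5, "u3": 0.5}
print(predict("zww", "s1", counts, sims))      # 4.0
print(rank("zww", ["s2", "s1"], counts, sims))  # ['s1', 's2']
```

With these numbers neighbor `u2` selects `s1` far more often than its own average, which lifts `s1` above `s2` for the target user even though the target user has selected `s2` more often.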
CN201210253884.2A 2012-07-20 2012-07-20 Personalized search method for Web service recommendation Active CN102819575B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210253884.2A CN102819575B (en) 2012-07-20 2012-07-20 Personalized search method for Web service recommendation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210253884.2A CN102819575B (en) 2012-07-20 2012-07-20 Personalized search method for Web service recommendation

Publications (2)

Publication Number Publication Date
CN102819575A true CN102819575A (en) 2012-12-12
CN102819575B CN102819575B (en) 2015-06-17

Family

ID=47303686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210253884.2A Active CN102819575B (en) 2012-07-20 2012-07-20 Personalized search method for Web service recommendation

Country Status (1)

Country Link
CN (1) CN102819575B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685456A (en) * 2008-09-26 2010-03-31 华为技术有限公司 Search method, system and device
CN101996200A (en) * 2009-08-19 2011-03-30 华为技术有限公司 Method and device for searching file
CN102156733A (en) * 2011-03-25 2011-08-17 清华大学 Search engine and method based on service oriented architecture

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102648B (en) * 2013-04-07 2017-12-01 腾讯科技(深圳)有限公司 Interest based on user behavior data recommends method and device
CN104102648A (en) * 2013-04-07 2014-10-15 腾讯科技(深圳)有限公司 User behavior data based interest recommending method and device
CN104111959B (en) * 2013-04-22 2017-06-20 浙江大学 Service recommendation method based on social networks
CN104111959A (en) * 2013-04-22 2014-10-22 浙江大学 Social network based service recommending method
CN103324690A (en) * 2013-06-03 2013-09-25 焦点科技股份有限公司 Mixed recommendation method based on factorization condition limitation Boltzmann machine
CN103473291B (en) * 2013-09-02 2017-01-18 中国科学院软件研究所 Personalized service recommendation system and method based on latent semantic probability models
CN103473291A (en) * 2013-09-02 2013-12-25 中国科学院软件研究所 Personalized service recommendation system and method based on latent semantic probability models
CN103678652B (en) * 2013-12-23 2017-02-01 山东大学 Information individualized recommendation method based on Web log data
US9953060B2 (en) 2014-03-31 2018-04-24 Maruthi Siva P Cherukuri Personalized activity data gathering based on multi-variable user input and multi-dimensional schema
CN104318268A (en) * 2014-11-11 2015-01-28 苏州晨川通信科技有限公司 Multiple transaction account identification method based on local distance measuring and learning
CN104318268B (en) * 2014-11-11 2017-09-08 苏州晨川通信科技有限公司 A kind of many trading account recognition methods based on local distance metric learning
CN107644079A (en) * 2015-05-22 2018-01-30 广东欧珀移动通信有限公司 Method and device and related media production are recommended in one kind application
CN105205139A (en) * 2015-09-17 2015-12-30 罗旭斌 Personalized literature searching method
CN105205139B (en) * 2015-09-17 2019-06-14 罗旭斌 A kind of personalization document retrieval method
CN106055594A (en) * 2016-05-23 2016-10-26 成都陌云科技有限公司 Information providing method based on user interests
CN106126669A (en) * 2016-06-28 2016-11-16 北京邮电大学 User collaborative based on label filters content recommendation method and device
CN106126669B (en) * 2016-06-28 2019-07-16 北京邮电大学 User collaborative filtering content recommendation method and device based on label
CN110337682A (en) * 2016-07-15 2019-10-15 L·A·克里希纳斯瓦米 For supporting the educational data platform of the overall model of learner
CN106708920A (en) * 2016-10-09 2017-05-24 南京双运生物技术有限公司 Screening method for personalized scientific research literature
WO2019028990A1 (en) * 2017-08-09 2019-02-14 上海壹账通金融科技有限公司 Code element naming method, device, electronic equipment and medium
CN108268584A (en) * 2017-08-25 2018-07-10 广州市动景计算机科技有限公司 Message push method, device and server
CN107562919B (en) * 2017-09-13 2020-07-17 云南大学 Multi-index integrated software component retrieval method and system based on information retrieval
CN109978642A (en) * 2017-12-27 2019-07-05 中移(杭州)信息技术有限公司 A kind of information recommendation method, device and communication equipment
CN109408713A (en) * 2018-10-09 2019-03-01 哈尔滨工程大学 A kind of software requirement searching system based on field feedback
CN109408713B (en) * 2018-10-09 2020-12-04 哈尔滨工程大学 Software demand retrieval system based on user feedback information

Also Published As

Publication number Publication date
CN102819575B (en) 2015-06-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160310

Address after: 222000 Jinqiao Road 19, Lianyungang economic and Technological Development Zone, Jiangsu, Lianyungang

Patentee after: Ten Party health management (Jiangsu) Limited

Patentee after: JIANGSU HUAKANG INFORMATION TECHNOLOGY CO., LTD.

Address before: Qixia Xianlin Avenue District of Nanjing City, Jiangsu Province, Nanjing University No. 163 210093

Patentee before: Nanjing University