CN105404619A - Similarity based semantic Web service clustering labeling method - Google Patents

Info

Publication number: CN105404619A
Authority: CN (China)
Prior art keywords: similarity, semantic web, web services, sim, service
Legal status: Granted (Active)
Application number: CN201510568188.4A
Other languages: Chinese (zh)
Other versions: CN105404619B (en)
Inventors: 刘发贵, 邓达成, 彭晨漪, 李平
Current Assignee: South China University of Technology SCUT
Original Assignee: South China University of Technology SCUT
Application filed by South China University of Technology SCUT
Priority to CN201510568188.4A
Publication of CN105404619A
Application granted
Publication of CN105404619B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a similarity-based semantic Web service clustering and labeling method. The method comprises two parts: semantic Web service similarity calculation and a semantic Web service clustering and labeling algorithm. The similarity calculation combines the results of the input/output (I/O) parameter hybrid similarity calculation and the service description keyword similarity calculation to obtain the overall semantic Web service similarity, which reflects the differences and similarities between service functions. I/O parameters directly describe the functions of the corresponding service modules and serve as the criterion for calculating semantic Web service similarity from a functional perspective. The method improves the accuracy of similarity calculation and thereby the performance of a service discovery system.

Description

A similarity-based Semantic Web service clustering and labeling method
Technical field
The invention belongs to the field of Semantic Web service computing in the intelligent Semantic Web, and specifically relates to a similarity-based Semantic Web service clustering and labeling method.
Background
With the rapid rise of the Internet of Things (IoT), the types of devices and resources that can be controlled and can exchange data over a network keep increasing. Cisco predicts that by 2020 the number of Internet-connected devices will reach about 50 billion. As various IoT entity devices and application platforms emerge, the IoT begins to face the problem of information exchange and collaboration between heterogeneous entities.
Current research attempts to introduce the service-oriented, cross-platform approach into the IoT. The functions of each entity in the physical world are described as services, so that an entity's functions can be accessed and invoked through a uniform service interface and can in turn be offered to the outside. Heterogeneous entities can then exchange information and coordinate functions through service interfaces. Furthermore, service discovery technology can find a service or service chain that meets a user's requirement, locate the entities that execute the corresponding service functions, and finally drive the heterogeneous entities to cooperate in completing the request. Service discovery technology thus offers an effective solution to the collaboration problem between heterogeneous device entities. In addition, adding the Semantic Web raises the intelligence of service discovery, because explicit semantics help entities better understand the meaning of each other's information. Introducing semantic technology into the IoT strengthens the data fusion and resource query capabilities of IoT platforms and meets complex and changing application requirements.
However, describing the functions of a massive number of entities as services breaks the heterogeneity barrier between resources but also produces a large number of Semantic Web services with many, often redundant, function types, among which many services have similar functions. The service repository that stores these services therefore needs to be clustered, and each cluster needs to be labeled with a center service, to improve service discovery efficiency.
The similarity-based Semantic Web service clustering and labeling method divides Semantic Web services into categories according to the computed similarity between them. This classifies and labels entity functions and ultimately improves service discovery efficiency while the number of IoT entity service descriptions keeps growing.
For service similarity calculation, representative existing work includes Woogle and OWLS-MX. The Web service structures that Woogle supports are non-semantic, so it cannot be applied directly to Semantic Web service similarity calculation. OWLS-MX, by contrast, includes a semantic reasoning module and supports semantic similarity calculation. However, OWLS-MX uses matching filters and determines the similarity relation between two services with five filters: EXACT, PLUGIN, SUBSUMES, SUBSUMED-BY and NEAREST-NEIGHBOR. The resulting similarity therefore takes only five fixed relation values and is not fine-grained or precise enough. The existing DoM calculation method computes the similarity between services from their I/O ontology parameters. Although the I/O ontology parameters of a Semantic Web service describe the function of the corresponding service module very directly, the I/O parameters alone cannot clearly describe the domain and characteristics of the service function.
For service clustering, Christian et al. use a variant of the AL (Average Linkage) hierarchical clustering algorithm that uses the median in place of the mean when computing the distance between a cluster center and the other clusters. However, hierarchical clustering is likely to end with a few large clusters, which tends to lower service search efficiency. Wu et al. propose a multi-dimensional similarity calculation method and apply the K-MEANS algorithm to the similarities between services. K-MEANS requires the number of clusters to be specified, and the choice of initial points strongly affects the result. Aliz et al. propose a Semantic Web service clustering method based on particle swarm optimization, but it requires both the number of clusters and the number of iterations to be specified and is prone to getting trapped in local optima. Pop et al. propose a service clustering method based on the ant colony algorithm, but it converges slowly and may also get trapped in local optima.
Summary of the invention
The object of the invention is to overcome the shortcomings of existing Semantic Web service similarity calculation and service clustering and labeling techniques by proposing a Semantic Web service similarity calculation method based on hybrid I/O matching and keywords, and further a similarity-based method for clustering and labeling Semantic Web services.
The object of the invention is achieved by the following technical solution:
A similarity-based Semantic Web service clustering and labeling method, comprising the following two steps:
1) Semantic Web service similarity calculation;
2) based on the similarity calculation results, clustering the Semantic Web services and labeling the center service of each cluster with an algorithm;
The calculation in step 1) comprises two parts: I/O hybrid parameter similarity calculation and keyword similarity calculation.
The Semantic Web service similarity calculation combines the results of the I/O parameter hybrid similarity calculation and the service description keyword similarity calculation to obtain the overall Semantic Web service similarity, so that it reflects the differences and similarities between service functions more accurately;
The I/O parameters describe the function of the corresponding service module very directly, and are commonly used as the criterion for calculating Semantic Web service similarity from a functional perspective.
The calculation formula of step 1) is
$$Sim(S1, S2) = \frac{Sim_{Func}(S1, S2) + Sim_{Key}(S1, S2)}{2}$$
where $Sim(S1, S2)$ is the similarity value between Semantic Web services S1 and S2, $Sim_{Func}(S1, S2)$ is the I/O hybrid parameter similarity between S1 and S2, and $Sim_{Key}(S1, S2)$ is the keyword similarity between S1 and S2.
The meaning of the I/O hybrid parameter similarity in step 1) is shown in Fig. 1. It is calculated as:
$$Sim_{Func}(S1, S2) = \frac{Sim_{Inputs}(S1, S2) + Sim_{Outputs}(S1, S2)}{2}$$
where $Sim_{Inputs}(S1, S2)$ is the input parameter similarity of Semantic Web services S1 and S2, and $Sim_{Outputs}(S1, S2)$ is the output parameter similarity. Here,
$$Sim_{Inputs}(S1, S2) = \frac{Sim_{In}(S1, S2) + Sim_{In}(S2, S1)}{2}$$
$Sim_{In}(S1, S2)$ is the similarity of the input parameters of S1 with respect to the input parameters of S2. Conversely, $Sim_{In}(S2, S1)$ is the similarity of the input ontology parameters of S2 with respect to the input ontology parameters of S1. Here,
$$Sim_{In}(S1, S2) = \frac{\sum_{i=1}^{|In(S1)|} \max_{j=1}^{|In(S2) \cup Out(S2)|} Sim_{con}(C1_i, C2_j)}{|In(S1)|}$$
In this formula, $Sim_{con}(C1_i, C2_j)$ is the similarity between a single input parameter $C1_i$ of S1 and a single input or output parameter $C2_j$ of S2. The term $\max_{j=1}^{|In(S2) \cup Out(S2)|} Sim_{con}(C1_i, C2_j)$ means that concept $C1_i$ is matched against every input and output ontology concept of S2, and the similarity value of the best-matching pair of concepts is taken. The matching degree between concepts is calculated as:
$$sim(X, Y) = \alpha \frac{|X \cap Y|}{|X|} + (1 - \alpha) \frac{|X \cap Y|}{|Y|} = \alpha \frac{\sum_{d \in D} \min\{\mu_X(d), \mu_Y(d)\}}{\sum_{d \in D} \mu_X(d)} + (1 - \alpha) \frac{\sum_{d \in D} \min\{\mu_X(d), \mu_Y(d)\}}{\sum_{d \in D} \mu_Y(d)}$$
$sim(X, Y)$ is the degree to which Y is similar to X. $\alpha$ is a weighting parameter, $\alpha \in [0, 1]$. $D = \{x\}$ is the object space, X and Y are fuzzy sets in D, and $sim: U \times U \to [0, 1]$ is a fuzzy similarity relation on the product space $U \times U$. Applying this formula gives the similarity between two concepts.
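The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the text does not say how an ontology concept is turned into a fuzzy set, so representing a concept as a dictionary from elements of D to membership degrees is an assumption, and the names `concept_sim` and `sim_in` are hypothetical.

```python
from typing import Callable, Dict, Sequence

# Assumed representation: a concept is a fuzzy set mapping each element d of
# the object space D to its membership degree mu(d) in [0, 1].
FuzzySet = Dict[str, float]


def concept_sim(x: FuzzySet, y: FuzzySet, alpha: float = 0.5) -> float:
    """sim(X, Y) = alpha*|X n Y|/|X| + (1-alpha)*|X n Y|/|Y|, with min as intersection."""
    intersection = sum(min(mu, y.get(d, 0.0)) for d, mu in x.items())
    size_x = sum(x.values())
    size_y = sum(y.values())
    if size_x == 0 or size_y == 0:
        return 0.0  # assumption: an empty fuzzy set matches nothing
    return alpha * intersection / size_x + (1 - alpha) * intersection / size_y


def sim_in(
    inputs_1: Sequence[FuzzySet],
    in_and_out_2: Sequence[FuzzySet],
    sim_con: Callable[[FuzzySet, FuzzySet], float] = concept_sim,
) -> float:
    """Sim_In(S1, S2): average, over the inputs of S1, of the best match in In(S2) u Out(S2)."""
    if not inputs_1 or not in_and_out_2:
        return 0.0
    best = [max(sim_con(c1, c2) for c2 in in_and_out_2) for c1 in inputs_1]
    return sum(best) / len(inputs_1)


x = {"a": 1.0, "b": 0.5}
y = {"a": 0.5, "b": 0.5, "c": 1.0}
print(concept_sim(x, y))   # 0.5 * 1.0/1.5 + 0.5 * 1.0/2.0
print(sim_in([x], [x, y]))  # the best match for x is x itself
```

Under these assumptions, $Sim_{Inputs}(S1, S2)$ would be the mean of `sim_in` called in both directions.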
The keywords used for the keyword similarity in step 1) are the intersection of the words in the title text of the OWL-S service description file that describes the Semantic Web service and the words in the text under profile:textDescription.
The keyword similarity of Semantic Web services in step 1) is calculated as
$$Sim_{Key}(S1, S2) = \frac{Sim_{word}(S1, S2) + Sim_{word}(S2, S1)}{2}$$
where $Sim_{word}(S1, S2)$ is the keyword similarity with S2 from the perspective of Semantic Web service S1, and $Sim_{word}(S2, S1)$ is the keyword similarity with S1 from the perspective of Semantic Web service S2. Here,
$$Sim_{word}(S1, S2) = \sum_{i=1}^{|Word(S1)|} T(k1_i) \times \max_{j=1}^{|Word(S2)|} Sim_G(k1_i, k2_j)$$
$T(k1_i)$ is the TF-IDF weight of keyword $k1_i$. TF-IDF is a statistical method for calculating the importance of a word in a text. Here,
$$T(k1_i) = TF_{i,d} \times IDF_i$$
$IDF_i$ is the inverse document frequency (IDF) of keyword i. IDF expresses the importance of a word across all documents. Because only the importance of a keyword within its own service description document is considered here, and not how common the keyword is, $IDF_i = 1$.
$TF_{i,d}$ is the term frequency (TF) of keyword i in document d, that is, how often it occurs. Here, document d is the Semantic Web service description document, and keyword i is obtained by intersecting the words in textDescription with the words in the Semantic Web service name.
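A minimal Python sketch of this keyword extraction and weighting follows. Several details are assumptions that the text leaves open: words are lower-cased ASCII tokens, a camel-case service name is split at case boundaries, and the term frequency is the word count divided by the number of tokens in the text description. The function names are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lower-case the text and keep alphabetic tokens only (assumed tokenization)."""
    return re.findall(r"[a-z]+", text.lower())


def split_service_name(name: str) -> List[str]:
    """Split a name such as 'BookPriceService' or 'book_price' into words."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return tokenize(spaced)


def keyword_weights(service_name: str, text_description: str) -> Dict[str, float]:
    """Keywords = name words that also occur in the description; weight = TF * IDF, IDF = 1."""
    description_tokens = tokenize(text_description)
    keywords = set(split_service_name(service_name)) & set(description_tokens)
    counts = Counter(description_tokens)
    total = len(description_tokens)
    idf = 1.0  # the text fixes IDF_i = 1
    return {k: counts[k] / total * idf for k in keywords}


weights = keyword_weights(
    "BookPriceService",
    "This service returns the price of a book. The price is in USD.",
)
print(sorted(weights.items()))
```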
$Sim_G(k1_i, k2_j)$ is the similarity between a keyword $k1_i$ of Semantic Web service S1 and a keyword $k2_j$ of Semantic Web service S2. The term $\max_{j=1}^{|Word(S2)|} Sim_G(k1_i, k2_j)$ means that the similarity between keyword $k1_i$ of S1 and every keyword $k2_j$ of S2 is calculated one by one and the maximum is returned. Because keywords are plain text with no directly associated ontology information, the Normalized Google Distance [52] is used to calculate the similarity between keywords:
$$Sim_G(k1_i, k2_j) = 1 - NGD(k1_i, k2_j)$$
where $NGD(k1_i, k2_j)$ is the Normalized Google Distance between the two keywords:
$$NGD(k1_i, k2_j) = \frac{\max\{\log f(k1_i), \log f(k2_j)\} - \log f(k1_i, k2_j)}{\log N - \min\{\log f(k1_i), \log f(k2_j)\}}$$
In the standard definition of the Normalized Google Distance, $f(k)$ is the number of pages that contain term k, $f(k1_i, k2_j)$ is the number of pages that contain both terms, and N is the total number of indexed pages.
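The keyword similarity chain (NGD, $Sim_G$, $Sim_{word}$, $Sim_{Key}$) can be sketched as below. The page counts are passed in as plain numbers because the text does not say how they are obtained. Two choices here are assumptions, not part of the source: a pair with a zero count gets similarity 0, and $Sim_G$ is clamped to [0, 1] because NGD can exceed 1. All function names are hypothetical.

```python
import math
from typing import Callable, Dict, Sequence


def ngd(f_x: float, f_y: float, f_xy: float, n: float) -> float:
    """Normalized Google Distance from page counts f(x), f(y), f(x, y) and index size N."""
    log_x, log_y, log_xy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))


def sim_g(f_x: float, f_y: float, f_xy: float, n: float) -> float:
    """Sim_G = 1 - NGD, clamped to [0, 1] (the clamping is an assumption)."""
    if min(f_x, f_y, f_xy) <= 0:
        return 0.0  # assumption: terms that never co-occur are treated as dissimilar
    return min(1.0, max(0.0, 1.0 - ngd(f_x, f_y, f_xy, n)))


def sim_word(
    weights_1: Dict[str, float],
    words_2: Sequence[str],
    pair_sim: Callable[[str, str], float],
) -> float:
    """Sim_word(S1, S2) = sum_i T(k1_i) * max_j Sim_G(k1_i, k2_j)."""
    if not words_2:
        return 0.0
    return sum(t * max(pair_sim(k1, k2) for k2 in words_2) for k1, t in weights_1.items())


def sim_key(
    weights_1: Dict[str, float],
    weights_2: Dict[str, float],
    pair_sim: Callable[[str, str], float],
) -> float:
    """Sim_Key(S1, S2) = (Sim_word(S1, S2) + Sim_word(S2, S1)) / 2."""
    forward = sim_word(weights_1, list(weights_2), pair_sim)
    backward = sim_word(weights_2, list(weights_1), pair_sim)
    return (forward + backward) / 2


# Toy pair similarity standing in for Sim_G: exact string match only.
exact = lambda a, b: 1.0 if a == b else 0.0
print(ngd(1000, 100, 10, 1e6))
print(sim_key({"book": 0.5, "price": 0.5}, {"book": 1.0}, exact))
```

In a real setting, `pair_sim` would look up the counts for the two keywords and call `sim_g`.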
Step 2) uses the similarity calculation results and a clustering and labeling algorithm based on Affinity Propagation (AP) to cluster the Semantic Web services and label their center services. The algorithm steps are:
Step 2.1: treat each Semantic Web service as a data point;
Step 2.2: convert the similarity between Semantic Web services into the distance between data points;
Step 2.3: initialize the parameters of the clustering and labeling algorithm: the preference (reference value) and the damping factor;
Step 2.4: determine which data points are center points by passing two kinds of messages, responsibility and availability, between data points;
Step 2.5: assign the other data points to the service sets represented by the center data points.
The conversion of the similarity between Semantic Web services into the distance between data points in step 2.2 works as follows. $s(i, k)$ is the distance between service i and service k in the distance matrix of the Semantic Web services. It is obtained by converting the similarity matrix Sim into the distance matrix S. Suppose there are n services to be clustered; the similarity matrix can then be written as
$$Sim = \begin{pmatrix} sim(0, 0) & sim(0, 1) & \cdots & sim(0, n-1) \\ sim(1, 0) & sim(1, 1) & \cdots & sim(1, n-1) \\ \vdots & \vdots & \ddots & \vdots \\ sim(n-1, 0) & sim(n-1, 1) & \cdots & sim(n-1, n-1) \end{pmatrix}$$
where $sim(i, j)$ is the similarity between Semantic Web service i and Semantic Web service j, calculated by the method of step 1).
The distance matrix S based on Semantic Web service similarity is obtained from $S = Sim - 1$, so each element of the distance matrix satisfies $s(i, j) \in [-1, 0]$.
For i = k, $s(k, k)$ is a diagonal element of the distance matrix S; its value in the initial state is called the preference (reference value).
The preference in step 2.3 generally takes one of two values. The first is preference = median(S), the median of the distance values between the Semantic Web service data points. The second is preference = average(S), the mean of the distance values between the Semantic Web service data points. Experiments show that preference = median(S) gives the largest improvement in the efficiency of the service discovery system and the best clustering result.
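Steps 2.2 and 2.3 can be sketched in Python as below. This is an illustration under one stated assumption: the median or mean is taken over the off-diagonal entries, following the wording "distance values between the data points". The function names are hypothetical, and at least two services are required.

```python
from statistics import median
from typing import List

Matrix = List[List[float]]


def to_distance_matrix(sim: Matrix) -> Matrix:
    """S = Sim - 1, so similarities in [0, 1] become distances in [-1, 0]."""
    return [[value - 1.0 for value in row] for row in sim]


def set_preference(s: Matrix, mode: str = "median") -> Matrix:
    """Return a copy of S whose diagonal s(k, k) holds the preference value."""
    n = len(s)
    # Assumption: only distances between distinct services (i != k) are used.
    off_diagonal = [s[i][k] for i in range(n) for k in range(n) if i != k]
    if mode == "median":
        preference = median(off_diagonal)
    else:
        preference = sum(off_diagonal) / len(off_diagonal)
    result = [row[:] for row in s]
    for k in range(n):
        result[k][k] = preference
    return result


sim = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.75],
    [0.25, 0.75, 1.0],
]
s = set_preference(to_distance_matrix(sim))
for row in s:
    print(row)
```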
The damping factor in step 2.3 is introduced to suppress numerical oscillation. Experiments show that with lam = 0.5 the numerical fluctuation is small and the algorithm converges at a moderate speed.
Two kinds of messages are passed between the data points formed by the Semantic Web services in step 2.4: responsibility and availability. The responsibility, written r(i, k), is sent from Semantic Web service data point i and expresses how well suited service data point k is to be the cluster center of i. The availability, written a(i, k), is sent from service data point k and expresses how appropriate it would be for service data point i to choose point k as its cluster center.
r(i, k) is updated by the rule
$$r(i, k) \leftarrow s(i, k) - \max_{k' \ \mathrm{s.t.}\ k' \neq k} \{ a(i, k') + s(i, k') \}$$
where s(i, k) is the distance between service i and service k in the distance matrix of the Semantic Web services. In the initial state, the availability values between all services are 0.
a(i, k) is updated by the rule
$$a(i, k) \leftarrow \begin{cases} \min\left\{0,\ r(k, k) + \sum_{i' \ \mathrm{s.t.}\ i' \notin \{i, k\}} \max\{0, r(i', k)\}\right\}, & i \neq k \\ \sum_{i' \ \mathrm{s.t.}\ i' \notin \{i, k\}} \max\{0, r(i', k)\}, & i = k \end{cases}$$
The larger the values of r(i, k) and a(i, k), the more strongly service point k is favored as a cluster center, and the more likely service point i belongs to the cluster whose service set center is service point k. The algorithm iteratively updates the responsibility and availability values between each service point and the other services, stops iterating at some point, and produces m high-quality cluster center service points (exemplars), while the remaining services are assigned to the corresponding service sets.
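One undamped pass of the two update rules can be written directly from the formulas above. The sketch below uses plain nested lists and favors readability over speed; the function names are hypothetical, and at least two services are assumed so that the maximum over k' is defined.

```python
from typing import List

Matrix = List[List[float]]


def update_responsibility(s: Matrix, a: Matrix) -> Matrix:
    """r(i, k) <- s(i, k) - max over k' != k of (a(i, k') + s(i, k'))."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            best_other = max(a[i][kk] + s[i][kk] for kk in range(n) if kk != k)
            r[i][k] = s[i][k] - best_other
    return r


def update_availability(r: Matrix) -> Matrix:
    """a(i, k) from the positive responsibilities that other points send to k."""
    n = len(r)
    a = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for i in range(n):
            support = sum(max(0.0, r[ii][k]) for ii in range(n) if ii not in (i, k))
            if i == k:
                a[i][k] = support
            else:
                a[i][k] = min(0.0, r[k][k] + support)
    return a


# Two services at distance -0.2, with preference -0.5 on the diagonal.
s = [[-0.5, -0.2], [-0.2, -0.5]]
a0 = [[0.0, 0.0], [0.0, 0.0]]  # availabilities start at 0
r1 = update_responsibility(s, a0)
a1 = update_availability(r1)
print(r1)
print(a1)
```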
To suppress numerical oscillation, the damping factor lam is added:
$$r_{i+1}(i, k) \leftarrow lam \cdot r_i(i, k) + (1 - lam) \cdot r_{i+1}(i, k), \quad lam \in (0, 1)$$
$$a_{i+1}(i, k) \leftarrow lam \cdot a_i(i, k) + (1 - lam) \cdot a_{i+1}(i, k), \quad lam \in (0, 1)$$
where $r_{i+1}(i, k)$ is the responsibility of k for i calculated in iteration i+1, and $r_i(i, k)$ is the responsibility of k for i calculated in iteration i; the subscript is the iteration number, not the service index. lam is the damping factor described in step 2.3. Each update of the responsibility and availability values therefore multiplies the result of the previous iteration by lam and adds the result of the current iteration multiplied by 1 - lam, which reduces numerical oscillation and speeds up convergence.
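The damped update is a weighted average of the previous and the freshly computed message matrices. A minimal sketch, with the hypothetical helper name `damp`:

```python
from typing import List

Matrix = List[List[float]]


def damp(previous: Matrix, current: Matrix, lam: float = 0.5) -> Matrix:
    """new = lam * previous + (1 - lam) * current, element by element."""
    return [
        [lam * p + (1 - lam) * c for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


# A value that would jump from 0.0 to 1.0 moves only halfway with lam = 0.5.
print(damp([[0.0, 1.0]], [[1.0, 1.0]], lam=0.5))
```

The same helper would be applied to both the responsibility matrix and the availability matrix after each update.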
The center points in step 2.4 are determined as follows:
Step 2.4.1: create the matrix E = R + A. R is the N*N matrix that records the r(i, k) values, and A is the N*N matrix that records the a(i, k) values, with i ∈ [0, N) and k ∈ [0, N).
Step 2.4.2: check each value on the diagonal of matrix E. If it is greater than 0, the Semantic Web service corresponding to that diagonal element becomes a cluster center.
Assigning the other data points to the service sets represented by the center data points in step 2.5 means that, given the cluster center services selected in step 2.4.2, each non-center Semantic Web service chooses as its own cluster the center point whose associated element in matrix E is largest.
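Steps 2.4.1, 2.4.2 and 2.5 can be sketched as a single function. The name `select_exemplars` is hypothetical, and returning `None` as the label when no diagonal entry of E is positive is an assumption, since the text does not cover that case.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def select_exemplars(r: Matrix, a: Matrix) -> Tuple[List[int], List[Optional[int]]]:
    """Return (center indices, label per service) from E = R + A."""
    n = len(r)
    e = [[r[i][k] + a[i][k] for k in range(n)] for i in range(n)]
    # Step 2.4.2: a service is a cluster center if its diagonal entry is positive.
    exemplars = [k for k in range(n) if e[k][k] > 0]
    labels: List[Optional[int]] = []
    for i in range(n):
        if i in exemplars:
            labels.append(i)  # a center labels its own service set
        elif exemplars:
            # Step 2.5: choose the center with the largest E(i, k).
            labels.append(max(exemplars, key=lambda k: e[i][k]))
        else:
            labels.append(None)  # assumption: no center was found
    return exemplars, labels


r = [[0.4, -0.1, -0.3], [0.2, -0.5, -0.1], [-0.4, 0.1, 0.3]]
a = [[0.1, 0.0, 0.0], [0.0, 0.0, -0.2], [0.0, -0.1, 0.2]]
print(select_exemplars(r, a))
```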
Compared with the prior art, the invention has the following advantages and technical effects: it calculates similarity values between services and clusters Semantic Web services by similarity, so that services with similar functions are grouped into one class. During clustering, the center service of each service set is extracted to label the function of that class. The invention improves the accuracy of similarity calculation and further improves the performance of a service discovery system.
Brief description of the drawings
Fig. 1 is a schematic diagram of the input parameter similarity calculation for semantic service S1;
Fig. 2 is a flowchart of the Semantic Web service similarity calculation;
Fig. 3 is a flowchart of the Semantic Web service clustering and labeling algorithm.
Detailed description
To make the object, technical solution and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings; the implementation and protection of the invention are not limited to this description.
Fig. 2 is the flowchart of the Semantic Web service similarity calculation, which comprises two parts: functional parameter matching and keyword matching. The flow in Fig. 2 is:
Step 1: input the two services whose similarity is to be calculated; their default similarity is 0;
Step 2: check whether the I/O parameters of a service are empty. If so, return the default similarity and end the flow; if not, continue with step 3;
Step 3: calculate the Input parameter similarity and the Output parameter similarity of the Semantic Web services separately, and combine the two values to obtain the functional parameter similarity between the services.
The Input parameter similarity in step 3 is calculated as
$$Sim_{Inputs}(S1, S2) = \frac{Sim_{In}(S1, S2) + Sim_{In}(S2, S1)}{2}$$
where
$$Sim_{In}(S1, S2) = \frac{\sum_{i=1}^{|In(S1)|} \max_{j=1}^{|In(S2) \cup Out(S2)|} Sim_{con}(C1_i, C2_j)}{|In(S1)|}$$
$Sim_{con}(C1_i, C2_j)$ is the similarity between a single input parameter $C1_i$ of Semantic Web service S1 and a single input or output parameter $C2_j$ of Semantic Web service S2. It is calculated as
$$sim(X, Y) = \alpha \frac{|X \cap Y|}{|X|} + (1 - \alpha) \frac{|X \cap Y|}{|Y|} = \alpha \frac{\sum_{d \in D} \min\{\mu_X(d), \mu_Y(d)\}}{\sum_{d \in D} \mu_X(d)} + (1 - \alpha) \frac{\sum_{d \in D} \min\{\mu_X(d), \mu_Y(d)\}}{\sum_{d \in D} \mu_Y(d)}$$
$sim(X, Y)$ is the degree to which Y is similar to X. $\alpha$ is a weighting parameter, $\alpha \in [0, 1]$. $D = \{x\}$ is the object space, X and Y are fuzzy sets in D, and $sim: U \times U \to [0, 1]$ is a fuzzy similarity relation on the product space $U \times U$. Applying this formula gives the similarity between two concepts.
Step 4: extract the keyword set of each service from the content under the profile tag of the Semantic Web service and the service name.
Extracting the keyword set in step 4 comprises four steps:
Step 4.1: split the content under the profile tag of the Semantic Web service description document into words, obtaining a set of English words;
Step 4.2: split the service name into words, obtaining a set of English words;
Step 4.3: intersect the sets obtained in step 4.1 and step 4.2 to obtain a new word set;
Step 4.4: for each word in the set obtained in step 4.3, calculate its TF-IDF weight in the corresponding Semantic Web service description document. The formula is $T(k1_i) = TF_{i,d} \times IDF_i$, where $T(k1_i)$ is the TF-IDF weight of keyword $k1_i$, $IDF_i$ represents how common keyword i is, and $TF_{i,d}$ is the frequency of occurrence of keyword i in document d. Here, $IDF_i = 1$.
Step 5: from the keyword sets of the Semantic Web services, calculate the keyword similarity using the Google distance. Specifically,
$$Sim_{Key}(S1, S2) = \frac{Sim_{word}(S1, S2) + Sim_{word}(S2, S1)}{2}$$
where $Sim_{word}(S1, S2)$ is the keyword similarity with Semantic Web service S2 from the perspective of Semantic Web service S1;
In step 5,
$$Sim_{word}(S1, S2) = \sum_{i=1}^{|Word(S1)|} T(k1_i) \times \max_{j=1}^{|Word(S2)|} Sim_G(k1_i, k2_j)$$
where $T(k1_i)$ is the TF-IDF weight of keyword $k1_i$, and $Sim_G(k1_i, k2_j)$ is the similarity between a keyword $k1_i$ of Semantic Web service S1 and a keyword $k2_j$ of Semantic Web service S2.
$Sim_G(k1_i, k2_j)$ in step 5 is calculated with the Normalized Google Distance: $Sim_G(k1_i, k2_j) = 1 - NGD(k1_i, k2_j)$, where $NGD(k1_i, k2_j)$ is the Normalized Google Distance between the two keywords.
Step 6: combine the functional parameter similarity and the service description keyword similarity to obtain the similarity between the Semantic Web services. Specifically,
$$Sim(S1, S2) = \frac{Sim_{Func}(S1, S2) + Sim_{Key}(S1, S2)}{2}$$
where $Sim(S1, S2)$ is the similarity value between Semantic Web services S1 and S2, $Sim_{Func}(S1, S2)$ is the functional parameter similarity, and $Sim_{Key}(S1, S2)$ is the keyword similarity.
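The control flow of steps 1 to 6 can be sketched as below, with the three component similarities passed in as functions. The dictionary layout of a service and the function names are hypothetical. Treating a service as having empty I/O parameters when either its input list or its output list is empty is an assumption; the text only says the I/O parameters are empty.

```python
from typing import Callable, Dict, List

# Hypothetical service shape: {"inputs": [...], "outputs": [...]}
Service = Dict[str, List[str]]
PairSim = Callable[[Service, Service], float]


def service_similarity(
    s1: Service,
    s2: Service,
    inputs_sim: PairSim,
    outputs_sim: PairSim,
    key_sim: PairSim,
) -> float:
    similarity = 0.0  # Step 1: the default similarity is 0
    for service in (s1, s2):
        # Step 2 (assumed reading): missing inputs or outputs ends the flow.
        if not service["inputs"] or not service["outputs"]:
            return similarity
    # Step 3: functional similarity is the mean of input and output similarity.
    sim_func = (inputs_sim(s1, s2) + outputs_sim(s1, s2)) / 2
    # Steps 4 to 6: combine with the keyword similarity.
    return (sim_func + key_sim(s1, s2)) / 2


a = {"inputs": ["Book"], "outputs": ["Price"]}
b = {"inputs": ["Title"], "outputs": ["Cost"]}
# Constant stand-ins for Sim_Inputs, Sim_Outputs and Sim_Key.
print(service_similarity(a, b, lambda x, y: 0.8, lambda x, y: 0.6, lambda x, y: 0.5))
```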
Based on the calculated Semantic Web service similarity, the Semantic Web services are clustered and their center services are extracted.
Fig. 3 is the flowchart of the Semantic Web service clustering and labeling algorithm. The algorithm has four stages. The first stage calculates the similarity of the Semantic Web services. The second stage initializes the parameters required by the clustering algorithm. The third stage iteratively calculates the responsibility and availability values passed between services and selects the center data points of the clusters. The fourth stage assigns the other data points to categories according to the responsibility and availability between them and the center data points. This completes the clustering of the Semantic Web services. The center data points selected during clustering are then mapped back to Semantic Web services, and these services serve as the center services of the service sets, labeling the typical function described by each set. The concrete steps are as follows:
Step 1: use the Semantic Web service similarity calculation method above to calculate the similarity between every pair of Semantic Web services in the service repository.
Step 2: build the similarity matrix Sim from the similarities between the Semantic Web services. The rows and columns of this matrix list the Semantic Web services in the same order.
Step 3: from the similarity matrix Sim built in step 2, initialize the distance matrix S to be clustered as S = Sim - 1. Also construct the responsibility matrix R and the availability matrix A, and initialize R = 0, A = 0.
Step 4: initialize the preference to the median of all elements in the distance matrix and set the damping factor lam = 0.5. The elements of the distance matrix are sorted in ascending order, and the median of the sorted elements is used to initialize the preference.
Step 5: use the formula $r(i, k) \leftarrow s(i, k) - \max_{k' \ \mathrm{s.t.}\ k' \neq k} \{ a(i, k') + s(i, k') \}$ to calculate the elements of the responsibility matrix R one by one, and update R with $r_{i+1}(i, k) \leftarrow lam \cdot r_i(i, k) + (1 - lam) \cdot r_{i+1}(i, k)$.
In step 5, r(i, k) is the responsibility, sent from Semantic Web service data point i, which expresses how well suited service data point k is to be the cluster center of i. a(i, k) is the availability, sent from service data point k, which expresses how appropriate it would be for service data point i to choose point k as its cluster center. The value of a(i, k) in the initial state is 0.
In step 5, s(i, k) is the distance between service i and service k in the distance matrix of the Semantic Web services.
Step 6: use the formula
$$a(i, k) \leftarrow \begin{cases} \min\left\{0,\ r(k, k) + \sum_{i' \ \mathrm{s.t.}\ i' \notin \{i, k\}} \max\{0, r(i', k)\}\right\}, & i \neq k \\ \sum_{i' \ \mathrm{s.t.}\ i' \notin \{i, k\}} \max\{0, r(i', k)\}, & i = k \end{cases}$$
to calculate the elements of the availability matrix A one by one, and update A with $a_{i+1}(i, k) \leftarrow lam \cdot a_i(i, k) + (1 - lam) \cdot a_{i+1}(i, k)$.
Step 7: check whether the number of iterations has exceeded the prescribed 100. If so, go to step 9; if not, go to step 8.
Step 8: check whether the values of the responsibility matrix and the availability matrix have remained unchanged for more than 10 iterations. If so, go to step 9; if not, return to step 5.
Step 9: create the matrix E and calculate its value as E = R + A.
Step 10: check each value on the diagonal of matrix E. If it is greater than 0, the Semantic Web service corresponding to that diagonal element becomes a cluster center.
Step 11: according to the element values of matrix E, each non-center service chooses the center point with the largest element value as the center service of its own service set.
Step 12: clustering is complete, yielding one or more service sets and the center service of each service set.
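The stopping logic of steps 7 and 8 can be sketched independently of the message formulas by passing one damped update pass (steps 5 and 6) in as a function. The names `iterate_messages` and `step`, and the tolerance `tol` used to decide that floating-point values "have not changed", are assumptions that are not in the text.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]
Step = Callable[[Matrix, Matrix], Tuple[Matrix, Matrix]]


def max_abs_diff(m1: Matrix, m2: Matrix) -> float:
    """Largest element-wise absolute difference between two equally sized matrices."""
    return max(abs(x - y) for row1, row2 in zip(m1, m2) for x, y in zip(row1, row2))


def iterate_messages(
    step: Step,
    r: Matrix,
    a: Matrix,
    max_iter: int = 100,
    stable_iter: int = 10,
    tol: float = 1e-9,
) -> Tuple[Matrix, Matrix, int]:
    """Repeat `step` until R and A are stable for more than `stable_iter` passes."""
    unchanged = 0
    iterations = 0
    while iterations < max_iter:  # Step 7: at most 100 iterations
        new_r, new_a = step(r, a)  # Steps 5 and 6: one damped update pass
        iterations += 1
        changed = max(max_abs_diff(new_r, r), max_abs_diff(new_a, a)) > tol
        unchanged = 0 if changed else unchanged + 1
        r, a = new_r, new_a
        if unchanged > stable_iter:  # Step 8: stable for more than 10 iterations
            break
    return r, a, iterations


# A step that never changes anything stops once the stability window is exceeded.
identity_step = lambda r, a: (r, a)
print(iterate_messages(identity_step, [[0.0]], [[0.0]])[2])
```

After the loop ends, steps 9 to 12 would build E = R + A and read off the centers and service sets.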

Claims (10)

1. A similarity-based Semantic Web service clustering and labeling method, characterized by comprising two parts: Semantic Web service similarity calculation and a Semantic Web service clustering and labeling algorithm;
The Semantic Web service similarity calculation combines the results of the I/O parameter hybrid similarity calculation and the service description keyword similarity calculation to obtain the overall Semantic Web service similarity, reflecting the differences and similarities between service functions; the I/O parameters directly describe the function of the corresponding service module and serve as the criterion for calculating Semantic Web service similarity from a functional perspective.
2. The similarity-based Semantic Web service clustering and labeling method according to claim 1, characterized in that the Semantic Web service similarity is calculated as
$$Sim(S1, S2) = \frac{Sim_{Func}(S1, S2) + Sim_{Key}(S1, S2)}{2}$$
where $Sim(S1, S2)$ is the similarity value between Semantic Web services S1 and S2, $Sim_{Func}(S1, S2)$ is the I/O parameter hybrid similarity between S1 and S2, and $Sim_{Key}(S1, S2)$ is the service description keyword similarity between S1 and S2;
The I/O parameter hybrid similarity is calculated as
$$Sim_{Func}(S1, S2) = \frac{Sim_{Inputs}(S1, S2) + Sim_{Outputs}(S1, S2)}{2}$$
where $Sim_{Inputs}(S1, S2)$ is the input parameter similarity of Semantic Web services S1 and S2, and $Sim_{Outputs}(S1, S2)$ is the output parameter similarity.
3. The similarity-based semantic Web service clustering labeling method according to claim 2, characterized in that the input parameter similarity is calculated as

$$Sim_{Inputs}(S1, S2) = \frac{Sim_{In}(S1, S2) + Sim_{In}(S2, S1)}{2},$$

where $Sim_{In}(S1, S2)$ is the similarity of the input parameters of semantic Web service S1 with respect to the input parameters of semantic Web service S2 and, conversely, $Sim_{In}(S2, S1)$ is the similarity of the input ontology parameters of S2 with respect to the input ontology parameters of S1; wherein

$$Sim_{In}(S1, S2) = \frac{\sum_{i=1}^{|In(S1)|} \max_{j=1}^{|In(S2) \cup Out(S2)|} Sim_{con}(C1_i, C2_j)}{|In(S1)|},$$

and $Sim_{con}(C1_i, C2_j)$ is the similarity between a single input parameter $C1_i$ of semantic Web service S1 and a single input or output parameter $C2_j$ of semantic Web service S2;
the output parameter similarity is calculated in the same way as the input parameter similarity;
$Sim_{con}(C1_i, C2_j)$ is calculated as follows:

$$sim(X, Y) = \alpha \frac{|X \cap Y|}{|X|} + (1 - \alpha) \frac{|X \cap Y|}{|Y|} = \alpha \frac{\sum_{d \in D} \min\{\mu_X(d), \mu_Y(d)\}}{\sum_{d \in D} \mu_X(d)} + (1 - \alpha) \frac{\sum_{d \in D} \min\{\mu_X(d), \mu_Y(d)\}}{\sum_{d \in D} \mu_Y(d)}$$

where $sim(X, Y)$ denotes the degree to which Y is similar to X, $\alpha \in [0, 1]$ is a weight-adjusting parameter, $D = \{x\}$ is the object space, X and Y are fuzzy sets in D, and $sim: U \times U \to [0, 1]$ is a fuzzy similarity relation on the product space $U \times U$; applying this formula yields the similarity between two concepts.
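The formulas of claim 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: fuzzy sets are represented as dictionaries mapping elements to membership degrees, the function names are hypothetical, empty inputs are assumed to give similarity 0, and how an ontology concept is turned into a fuzzy set is not specified here and is left to the caller-supplied `sim_con`.

```python
from typing import Callable, Dict, Hashable, Sequence

FuzzySet = Dict[Hashable, float]  # element d -> membership degree mu(d)


def fuzzy_sim(x: FuzzySet, y: FuzzySet, alpha: float = 0.5) -> float:
    """sim(X, Y) = alpha*|X n Y|/|X| + (1 - alpha)*|X n Y|/|Y| on fuzzy sets."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    domain = set(x) | set(y)
    overlap = sum(min(x.get(d, 0.0), y.get(d, 0.0)) for d in domain)
    size_x, size_y = sum(x.values()), sum(y.values())
    if size_x == 0 or size_y == 0:
        return 0.0  # assumption: an empty fuzzy set is similar to nothing
    return alpha * overlap / size_x + (1 - alpha) * overlap / size_y


def sim_in(params_1: Sequence, candidates_2: Sequence,
           sim_con: Callable[[object, object], float]) -> float:
    """Directional similarity: best match per parameter of S1, averaged."""
    if not params_1 or not candidates_2:
        return 0.0  # assumption: no parameters means no similarity
    best = (max(sim_con(c1, c2) for c2 in candidates_2) for c1 in params_1)
    return sum(best) / len(params_1)


def sim_inputs(in_1: list, out_1: list, in_2: list, out_2: list,
               sim_con: Callable[[object, object], float]) -> float:
    """Symmetric input similarity: mean of the two directional values."""
    forward = sim_in(in_1, in_2 + out_2, sim_con)   # In(S1) vs In(S2) u Out(S2)
    backward = sim_in(in_2, in_1 + out_1, sim_con)  # In(S2) vs In(S1) u Out(S1)
    return (forward + backward) / 2.0
```

For example, with an exact-match `sim_con`, inputs `["City", "Date"]` for S1 and input `["City"]` plus output `["Price"]` for S2, the forward value is 0.5 and the backward value is 1.0, giving a symmetric input similarity of 0.75.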
4. The similarity-based semantic Web service clustering labeling method according to claim 1, characterized in that, in the service description keyword similarity, the service description keywords are the words common to the title of the OWL-S service description file and the content under the profile:textDescription tag in that file;
the service description keyword similarity is calculated as

$$Sim_{Key}(S1, S2) = \frac{Sim_{word}(S1, S2) + Sim_{word}(S2, S1)}{2},$$

where $Sim_{word}(S1, S2)$ is the keyword similarity of service S1 with respect to service S2, and $Sim_{word}(S2, S1)$ is the keyword similarity of service S2 with respect to service S1;
the keyword similarity of semantic Web service S1 with respect to semantic Web service S2 is calculated as

$$Sim_{word}(S1, S2) = \sum_{i=1}^{|Word(S1)|} T(k1_i) \times \max_{j=1}^{|Word(S2)|} Sim_G(k1_i, k2_j),$$

where $T(k1_i)$ is the TF-IDF weight of keyword $k1_i$, and $Sim_G(k1_i, k2_j)$ is the similarity between a keyword $k1_i$ of semantic Web service S1 and a keyword $k2_j$ of semantic Web service S2;
because the TF-IDF weight here considers only the importance of a keyword within its own service description file, and not how common the keyword is across files, the parameter IDF is set to 1.
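A minimal Python sketch of the keyword similarity of claim 4 follows. Several points are assumptions made for illustration: the weight T is taken as the normalized term frequency (keyword count divided by total keyword count, with IDF = 1), the symmetric combination mirrors the averaging used for the input parameters, and the pairwise keyword similarity `sim_g` is supplied by the caller. All function names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Sequence


def tf_weights(keywords: Sequence[str]) -> Dict[str, float]:
    """TF weight per distinct keyword (IDF fixed to 1); weights sum to 1."""
    counts = Counter(keywords)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def sim_word(keywords_1: Sequence[str], keywords_2: Sequence[str],
             sim_g: Callable[[str, str], float]) -> float:
    """Directional keyword similarity: TF-weighted best match per keyword."""
    if not keywords_1 or not keywords_2:
        return 0.0  # assumption: no keywords means no similarity
    vocabulary_2 = set(keywords_2)
    return sum(weight * max(sim_g(k1, k2) for k2 in vocabulary_2)
               for k1, weight in tf_weights(keywords_1).items())


def sim_key(keywords_1: Sequence[str], keywords_2: Sequence[str],
            sim_g: Callable[[str, str], float]) -> float:
    """Symmetric keyword similarity (assumed mean of both directions)."""
    return (sim_word(keywords_1, keywords_2, sim_g)
            + sim_word(keywords_2, keywords_1, sim_g)) / 2.0
```

With normalized TF weights the directional value stays in [0, 1] whenever `sim_g` does, which keeps it on the same scale as the functional similarity it is later averaged with.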
5. The similarity-based semantic Web service clustering labeling method according to claim 4, characterized in that $Sim_G(k1_i, k2_j)$ is calculated using the Normalized Google Distance, specifically $Sim_G(k1_i, k2_j) = 1 - NGD(k1_i, k2_j)$, where $NGD(k1_i, k2_j)$ is the Normalized Google Distance between the two keywords.
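The Normalized Google Distance is commonly defined from search hit counts as NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(x) and f(y) are the hit counts of each term, f(x, y) is the count of pages containing both, and N is the number of indexed pages. The sketch below assumes the hit counts are obtained elsewhere and passed in as numbers; the function names and the clamping of the result to [0, 1] are assumptions made for illustration.

```python
import math


def ngd(f_x: float, f_y: float, f_xy: float, n: float) -> float:
    """Normalized Google Distance from hit counts f(x), f(y), f(x, y) and N."""
    if min(f_x, f_y) <= 0 or f_xy <= 0:
        return math.inf  # terms that never co-occur are infinitely distant
    if n <= max(f_x, f_y):
        raise ValueError("n must exceed both single-term hit counts")
    log_x, log_y = math.log(f_x), math.log(f_y)
    return (max(log_x, log_y) - math.log(f_xy)) / (math.log(n) - min(log_x, log_y))


def sim_g(f_x: float, f_y: float, f_xy: float, n: float) -> float:
    """Sim_G = 1 - NGD, with NGD clamped to [0, 1] (clamping is an assumption)."""
    distance = ngd(f_x, f_y, f_xy, n)
    return 1.0 - min(1.0, max(0.0, distance))
```

Clamping keeps the keyword similarity in [0, 1] even when the raw distance exceeds 1, which can happen for terms that rarely co-occur.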
6. The similarity-based semantic Web service clustering labeling method according to any one of claims 1 to 5, characterized in that, once the similarity values between the semantic Web services have been obtained, the semantic Web services are clustered and the center services are labeled, the clustering and center-service labeling specifically comprising:
Step 1: treating each semantic Web service as a data point;
Step 2: converting the similarity between semantic Web services into the distance between data points;
Step 3: initializing the preference (reference value) parameter and the damping factor of the clustering labeling algorithm;
Step 4: determining which data points are center points by passing two kinds of messages, responsibility and availability, between the data points;
Step 5: assigning the other data points to the service sets represented by the center data points.
7. The similarity-based semantic Web service clustering labeling method according to claim 6, characterized in that, in step 2, s(i, k) denotes the distance between service i and service k in the distance matrix of the semantic Web services, and s(i, k) is calculated by converting the similarity matrix Sim into the distance matrix S using the formula S = Sim - 1, which yields the distance matrix S based on semantic Web service similarity.
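The conversion S = Sim - 1 can be sketched in a few lines of Python. The function name is hypothetical, and the similarity matrix is assumed to be a list of rows with values in [0, 1].

```python
from typing import List

Matrix = List[List[float]]


def similarity_to_distance(sim: Matrix) -> Matrix:
    """Element-wise S = Sim - 1, mapping similarities in [0, 1] to [-1, 0]."""
    return [[value - 1.0 for value in row] for row in sim]


# Illustrative 2x2 similarity matrix (values are not from the patent).
s = similarity_to_distance([[1.0, 0.75], [0.75, 1.0]])
```

The resulting entries are non-positive, and values closer to 0 correspond to more similar services, so a larger s(i, k) still means that k is a better candidate center for i.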
8. The similarity-based semantic Web service clustering labeling method according to claim 7, characterized in that the value of the preference (reference value) parameter in step 3 is the value of s(k, k), with preference = median(S); the damping factor is introduced to suppress numerical oscillation, and with lam = 0.5 the numerical fluctuation is small and the convergence speed of the algorithm is moderate.
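A short Python sketch of the initialization in claim 8 follows. Two points are assumptions: the median is taken over the off-diagonal entries of S (the claim only states median(S)), and damping is applied with the usual rule new = lam * old + (1 - lam) * update. The function names are hypothetical.

```python
from statistics import median
from typing import List

Matrix = List[List[float]]


def set_preference(s: Matrix) -> Matrix:
    """Return a copy of S whose diagonal s(k, k) is the median of S.

    Assumption: the median is taken over off-diagonal entries only.
    """
    n = len(s)
    off_diagonal = [s[i][k] for i in range(n) for k in range(n) if i != k]
    preference = median(off_diagonal)
    result = [row[:] for row in s]
    for k in range(n):
        result[k][k] = preference
    return result


def damp(old: float, update: float, lam: float = 0.5) -> float:
    """Damped message update: blend the previous value with the new one."""
    return lam * old + (1.0 - lam) * update
```

A lower preference tends to produce fewer clusters and a higher one more, so the median is a neutral starting point when no prior knowledge about the number of clusters is available.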
9. The similarity-based semantic Web service clustering labeling method according to claim 7, characterized in that, in step 4, the responsibility, denoted responsibility(i, k) and abbreviated r(i, k), indicates, from the viewpoint of semantic Web service data point i, how well suited service data point k is to serve as the cluster center of i; the availability, denoted availability(i, k) and abbreviated a(i, k), indicates, from the viewpoint of service data point k, how appropriate it is for service data point i to choose point k as its cluster center;
r(i, k) is updated according to the rule

$$r(i, k) \leftarrow s(i, k) - \max_{k' \,\text{s.t.}\, k' \neq k} \{ a(i, k') + s(i, k') \},$$

where s(i, k) denotes the distance between service i and service k in the distance matrix of the semantic Web services; the availability values between all services are initialized to 0;
a(i, k) is updated according to the rule

$$a(i, k) \leftarrow \begin{cases} \min\left\{ 0,\; r(k, k) + \sum_{i' \,\text{s.t.}\, i' \notin \{i, k\}} \max\{0, r(i', k)\} \right\}, & i \neq k \\ \sum_{i' \,\text{s.t.}\, i' \notin \{i, k\}} \max\{0, r(i', k)\}, & i = k; \end{cases}$$

determining the center services specifically comprises two steps:
Step 1: creating the matrix E = R + A, where R is the N×N matrix recording the r(i, k) values, i ∈ [0, N), k ∈ [0, N), and A is the N×N matrix recording the a(i, k) values, i ∈ [0, N), k ∈ [0, N);
Step 2: checking one by one whether each value on the diagonal of matrix E is greater than 0 and, if it is, taking the semantic Web service corresponding to that diagonal element as the center point of a cluster.
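The message-passing rules and the center-selection test of claim 9 can be sketched in pure Python as below. This is an illustration under assumptions rather than the patented implementation: the function name is hypothetical, a fixed number of iterations replaces a convergence check, both message types are damped with the same factor `lam`, and the input matrix `s` is assumed to already carry the preference on its diagonal and to have at least two rows.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def affinity_propagation(s: Matrix, lam: float = 0.5,
                         iterations: int = 200) -> Tuple[Matrix, List[int]]:
    """Run damped responsibility/availability updates; return (E, centers)."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k), initially 0
    for _ in range(iterations):
        # r(i, k) <- s(i, k) - max over k' != k of {a(i, k') + s(i, k')}
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = lam * r[i][k] + (1 - lam) * (s[i][k] - rival)
        # a(i, k): sum of positive responsibilities sent to k by other points
        for i in range(n):
            for k in range(n):
                support = sum(max(0.0, r[j][k])
                              for j in range(n) if j not in (i, k))
                update = support if i == k else min(0.0, r[k][k] + support)
                a[i][k] = lam * a[i][k] + (1 - lam) * update
    # E = R + A; a service is a center when its diagonal entry is positive
    e = [[r[i][k] + a[i][k] for k in range(n)] for i in range(n)]
    centers = [k for k in range(n) if e[k][k] > 0]
    return e, centers


# Three services where service 1 is close to both others (illustrative values).
s = [[-2.0, -1.0, -4.0],
     [-1.0, -2.0, -1.0],
     [-4.0, -1.0, -2.0]]
e, centers = affinity_propagation(s)
```

In this small example service 1 receives positive responsibility from both neighbors, so its diagonal entry in E becomes positive and it is selected as the only center.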
10. The similarity-based semantic Web service clustering labeling method according to claim 9, characterized in that assigning the other data points to the service sets represented by the center data points in step 5 means that, given the semantic Web services selected as cluster center points, each semantic Web service that is not a cluster center point takes as its own cluster the cluster of the center point whose associated element value in matrix E is largest.
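The assignment rule of claim 10 can be sketched as follows, assuming the matrix E and the list of center indices have already been computed. The function name is hypothetical, and mapping each center to itself is an assumption made so that every service receives a label.

```python
from typing import Dict, List

Matrix = List[List[float]]


def assign_clusters(e: Matrix, centers: List[int]) -> Dict[int, int]:
    """Map each service index to the index of its cluster center."""
    if not centers:
        raise ValueError("at least one center service is required")
    labels = {}
    for i in range(len(e)):
        if i in centers:
            labels[i] = i  # assumption: a center labels its own cluster
        else:
            # pick the center k with the largest E value in row i
            labels[i] = max(centers, key=lambda k: e[i][k])
    return labels


# Illustrative E matrix with centers 1 and 3 (values are not from the patent).
e = [[-1.0, 0.5, -2.0, -3.0],
     [0.2, 0.4, -1.0, -2.0],
     [-2.0, -1.5, -0.5, 0.3],
     [-3.0, -2.0, 0.1, 0.6]]
labels = assign_clusters(e, [1, 3])
```

The center index doubles as the cluster label, so the center service can be read directly as the representative of each group.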
CN201510568188.4A 2015-09-08 2015-09-08 A kind of Semantic Web Services cluster mask method based on similarity Active CN105404619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510568188.4A CN105404619B (en) 2015-09-08 2015-09-08 A kind of Semantic Web Services cluster mask method based on similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510568188.4A CN105404619B (en) 2015-09-08 2015-09-08 A kind of Semantic Web Services cluster mask method based on similarity

Publications (2)

Publication Number Publication Date
CN105404619A true CN105404619A (en) 2016-03-16
CN105404619B CN105404619B (en) 2018-09-14

Family

ID=55470113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510568188.4A Active CN105404619B (en) 2015-09-08 2015-09-08 A kind of Semantic Web Services cluster mask method based on similarity

Country Status (1)

Country Link
CN (1) CN105404619B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106528633A (en) * 2016-10-11 2017-03-22 杭州电子科技大学 Method for improving social attention of video based on keyword recommendation
CN109255125A (en) * 2018-08-17 2019-01-22 浙江工业大学 A kind of Web service clustering method based on improvement DBSCAN algorithm
CN109359289A (en) * 2018-08-17 2019-02-19 浙江工业大学 A kind of Web service functional similarity measure based on ontology
CN110347401A (en) * 2019-06-18 2019-10-18 西安交通大学 A kind of API Framework service discovery method based on semantic similarity
CN111914859A (en) * 2019-05-07 2020-11-10 中移(苏州)软件技术有限公司 Service multiplexing method, computing device and computer readable storage medium
CN113409136A (en) * 2021-06-30 2021-09-17 中国工商银行股份有限公司 Method, device, computer system and storage medium for analyzing similarity of composite services
CN114091473A (en) * 2022-01-20 2022-02-25 北京建筑大学 Web service discovery method based on comprehensive semantics
CN114282548A (en) * 2022-01-04 2022-04-05 重庆邮电大学 Automatic semantic annotation system for data of Internet of things

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080049428A (en) * 2006-11-30 2008-06-04 주식회사 케이티프리텔 Method and apparatus for providing similarity searching services by semantic web
CN102622396A (en) * 2011-11-30 2012-08-01 浙江大学 Web service clustering method based on labels
CN103473695A (en) * 2013-09-01 2013-12-25 西安重装渭南光电科技有限公司 Semantic-net-based Web service discovery method and system
CN104598559A (en) * 2015-01-06 2015-05-06 浙江大学 Combined clustering method based on Web service tag data

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106528633B (en) * 2016-10-11 2019-07-02 杭州电子科技大学 A kind of video society attention rate improvement method recommended based on keyword
CN106528633A (en) * 2016-10-11 2017-03-22 杭州电子科技大学 Method for improving social attention of video based on keyword recommendation
CN109359289B (en) * 2018-08-17 2023-01-31 浙江工业大学 Web service function similarity measurement method based on ontology
CN109255125A (en) * 2018-08-17 2019-01-22 浙江工业大学 A kind of Web service clustering method based on improvement DBSCAN algorithm
CN109359289A (en) * 2018-08-17 2019-02-19 浙江工业大学 A kind of Web service functional similarity measure based on ontology
CN109255125B (en) * 2018-08-17 2023-07-14 浙江工业大学 Web service clustering method based on improved DBSCAN algorithm
CN111914859A (en) * 2019-05-07 2020-11-10 中移(苏州)软件技术有限公司 Service multiplexing method, computing device and computer readable storage medium
CN110347401B (en) * 2019-06-18 2021-03-16 西安交通大学 API Framework service discovery method based on semantic similarity
CN110347401A (en) * 2019-06-18 2019-10-18 西安交通大学 A kind of API Framework service discovery method based on semantic similarity
CN113409136A (en) * 2021-06-30 2021-09-17 中国工商银行股份有限公司 Method, device, computer system and storage medium for analyzing similarity of composite services
CN114282548A (en) * 2022-01-04 2022-04-05 重庆邮电大学 Automatic semantic annotation system for data of Internet of things
CN114091473A (en) * 2022-01-20 2022-02-25 北京建筑大学 Web service discovery method based on comprehensive semantics
CN114091473B (en) * 2022-01-20 2022-05-03 北京建筑大学 Web service discovery method based on comprehensive semantics

Also Published As

Publication number Publication date
CN105404619B (en) 2018-09-14

Similar Documents

Publication Publication Date Title
CN105404619A (en) Similarity based semantic Web service clustering labeling method
Song et al. Mining summaries for knowledge graph search
CN102207945B (en) Knowledge network-based text indexing system and method
CN102902821A (en) Methods for labeling and searching advanced semantics of imagse based on network hot topics and device
CN101271476B (en) Relevant feedback retrieval method based on clustering in network image search
US20080097994A1 (en) Method of extracting community and system for the same
CN103064917A (en) Specific-tendency high-influence user group discovering method orienting microblog
CN102202012A (en) Group dividing method and system of communication network
CN101901249A (en) Text-based query expansion and sort method in image retrieval
CN111737400A (en) Knowledge reasoning-based big data service tag expansion method and system
Yang et al. Identifying opinion leaders in social networks with topic limitation
Nguyen et al. Double-layered schema integration of heterogeneous XML sources
Rajdhev et al. Internet of things for health care
Mehrotra et al. Comparative analysis of K-Means with other clustering algorithms to improve search result
Kastrati et al. An improved concept vector space model for ontology based classification
Xu et al. Lightweight tag-aware personalized recommendation on the social web using ontological similarity
CN103095849A (en) A method and a system of spervised web service finding based on attribution forecast and error correction of quality of service (QoS)
Trappey et al. Patent landscape and key technology interaction roadmap using graph convolutional network–Case of mobile communication technologies beyond 5G
CN103955461A (en) Semantic matching method based on ontology set concept similarity
Li et al. Annotating semantic tags of locations in location-based social networks
Sun et al. A hybrid approach to news recommendation based on knowledge graph and long short-term user preferences
CN106446143A (en) Intelligent recommendation system and method based on graph structure matching
CN116450938A (en) Work order recommendation realization method and system based on map
CN106055702B (en) Internet-oriented data service unified description method
Gawinecki et al. WSColab: Structured Collaborative Tagging for Web Service Matchmaking.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant