CN105893515A - Information processing method and server - Google Patents

Info

Publication number
CN105893515A
Authority
CN
China
Prior art keywords
coordinate points
distance
voice data
coordinate
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610193015.3A
Other languages
Chinese (zh)
Other versions
CN105893515B (en)
Inventor
黄安埠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610193015.3A
Publication of CN105893515A
Application granted
Publication of CN105893515B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses an information processing method and a server. The method includes the steps of obtaining a plurality of audio data corresponding to a user, and recognizing a plurality of attribute parameters in the audio data; mapping the audio data into a preset multi-dimensional coordinate system according to the attribute parameters to obtain coordinate points corresponding to the audio data, wherein the dimension of the coordinate system is matched with the number of types of the attribute parameters; calculating local density parameters of the first audio data on the basis of the coordinate point of each audio data according to a preset algorithm, wherein the first audio data is any audio data in the audio data; determining whether the first audio data is noise data or not on the basis of the calculation result.

Description

Information processing method and server
Technical field
The present invention relates to information processing technology, and in particular to an information processing method and a server.
Background
With the development of Internet technology, users can carry out all kinds of activities over the Internet, such as listening to music. A user usually performs active operations on songs according to personal preference, such as favoriting a song, downloading it, or creating a playlist. These operations, however, do not necessarily reflect the user's real preferences: on the one hand, the user's preferences may change over time; on the other hand, some of the operations may be accidental. Data produced by such operations can be called abnormal data or noise data.
When user profile data or personalized recommendation data is processed, the data to be processed must be screened so that noise data is removed. In the prior art, noise data is usually identified with policy rules based on human experience. The logic of such rules is usually simple: they struggle to capture deeper features and do not generalize to all user groups. As a result, the accuracy of noise data identification is low, the user profile data or personalized recommendation data becomes inaccurate, and the user experience suffers.
Summary of the invention
To solve the existing technical problem, embodiments of the present invention provide an information processing method and a server that can improve the accuracy of noise data identification.
To achieve the above purpose, the technical solution of the embodiments of the present invention is implemented as follows:
An embodiment of the present invention provides an information processing method, the method including:
obtaining a plurality of audio data corresponding to a user, and identifying a plurality of attribute parameters in the audio data;
mapping the plurality of audio data into a preset multi-dimensional coordinate system according to the plurality of attribute parameters to obtain the coordinate points corresponding to the plurality of audio data, wherein the dimension of the coordinate system matches the number of types of the attribute parameters;
calculating a local density parameter of first audio data based on the coordinate point of each audio data according to a preset algorithm, the first audio data being any audio data in the plurality of audio data;
determining, based on the calculation result, whether the first audio data is noise data.
In the above solution, calculating the local density parameter of the first audio data based on the coordinate point of each audio data according to the preset algorithm includes:
calculating the local density parameter of the first audio data based on the coordinate point of each audio data according to the Local Outlier Factor (LOF) algorithm.
In the above solution, calculating the local density parameter of the first audio data based on the coordinate point of each audio data according to the LOF algorithm includes:
obtaining the k coordinate points with the smallest Euclidean distance to a first coordinate point a corresponding to the first audio data, and generating a first set, denoted N_k(a);
calculating the reachability distance between the first coordinate point a and each coordinate point in the first set, the reachability distance satisfying the following formula:
$$\mathrm{reachability\_distance}_k(a, b) = \max\{\mathrm{k\_distance}(b),\ d(a, b)\}$$
where reachability_distance_k(a, b) denotes the reachability distance between the first coordinate point a and a second coordinate point b; k_distance(b) denotes the Euclidean distance between the second coordinate point b and a third coordinate point, the third coordinate point being the point in a second set that is farthest from b in Euclidean distance; the second set is the set generated from the k1 coordinate points nearest to the second coordinate point b in Euclidean distance; when the first coordinate point a and the second coordinate point b satisfy a first condition, reachability_distance_k(a, b) equals k_distance(b); when they do not satisfy the first condition, reachability_distance_k(a, b) equals the Euclidean distance between the first coordinate point a and the second coordinate point b;
calculating a first local density of the first coordinate point, the local density satisfying the following formula:
$$\mathrm{lrd}(a) = \frac{|N_k(a)|}{\sum_{b \in N_k(a)} \mathrm{reachability\_distance}_k(a, b)}$$
calculating the local density of each coordinate point, and obtaining the ratio of the average local density of the k coordinate points in the first set to the first local density, the ratio satisfying the following formula:
$$\mathrm{LOF}_k(a) = \frac{\sum_{b \in N_k(a)} \frac{\mathrm{lrd}(b)}{\mathrm{lrd}(a)}}{|N_k(a)|} = \frac{\sum_{b \in N_k(a)} \mathrm{lrd}(b)}{|N_k(a)|} \Big/ \mathrm{lrd}(a)$$
In the above solution, satisfying the first condition includes: the first coordinate point a belongs to the second set corresponding to the second coordinate point b;
not satisfying the first condition includes: the first coordinate point a does not belong to the second set corresponding to the second coordinate point b.
In the above solution, determining whether the first audio data is noise data based on the calculation result includes:
when the ratio is greater than a preset threshold, determining that the first audio data is noise data, wherein the preset threshold is greater than or equal to 1.
An embodiment of the present invention further provides a server, the server including a data acquisition unit, a mapping unit, a calculation unit, and a determination unit, wherein:
the data acquisition unit is configured to obtain a plurality of audio data corresponding to a user and to identify a plurality of attribute parameters in the audio data;
the mapping unit is configured to map the plurality of audio data obtained by the data acquisition unit into a preset multi-dimensional coordinate system according to the plurality of attribute parameters to obtain the coordinate points corresponding to the plurality of audio data, wherein the dimension of the coordinate system matches the number of types of the attribute parameters;
the calculation unit is configured to calculate a local density parameter of first audio data based on the coordinate point of each audio data according to a preset algorithm, the first audio data being any audio data in the plurality of audio data;
the determination unit is configured to determine, based on the calculation result obtained by the calculation unit, whether the first audio data is noise data.
In the above solution, the calculation unit is configured to calculate the local density parameter of the first audio data based on the coordinate point of each audio data according to the Local Outlier Factor (LOF) algorithm.
In the above solution, the calculation unit is configured to obtain the k coordinate points with the smallest Euclidean distance to a first coordinate point a corresponding to the first audio data, and to generate a first set, denoted N_k(a);
to calculate the reachability distance between the first coordinate point a and each coordinate point in the first set, the reachability distance satisfying the following formula:
$$\mathrm{reachability\_distance}_k(a, b) = \max\{\mathrm{k\_distance}(b),\ d(a, b)\}$$
where reachability_distance_k(a, b) denotes the reachability distance between the first coordinate point a and a second coordinate point b; k_distance(b) denotes the Euclidean distance between the second coordinate point b and a third coordinate point, the third coordinate point being the point in a second set that is farthest from b in Euclidean distance; the second set is the set generated from the k1 coordinate points nearest to the second coordinate point b in Euclidean distance; when the first coordinate point a and the second coordinate point b satisfy a first condition, reachability_distance_k(a, b) equals k_distance(b); when they do not satisfy the first condition, reachability_distance_k(a, b) equals the Euclidean distance between the first coordinate point a and the second coordinate point b;
to calculate a first local density of the first coordinate point, the local density satisfying the following formula:
$$\mathrm{lrd}(a) = \frac{|N_k(a)|}{\sum_{b \in N_k(a)} \mathrm{reachability\_distance}_k(a, b)}$$
and to calculate the local density of each coordinate point and obtain the ratio of the average local density of the k coordinate points in the first set to the first local density, the ratio satisfying the following formula:
$$\mathrm{LOF}_k(a) = \frac{\sum_{b \in N_k(a)} \frac{\mathrm{lrd}(b)}{\mathrm{lrd}(a)}}{|N_k(a)|} = \frac{\sum_{b \in N_k(a)} \mathrm{lrd}(b)}{|N_k(a)|} \Big/ \mathrm{lrd}(a)$$
In the above solution, satisfying the first condition includes: the first coordinate point a belongs to the second set corresponding to the second coordinate point b;
not satisfying the first condition includes: the first coordinate point a does not belong to the second set corresponding to the second coordinate point b.
In the above solution, the determination unit is configured to determine that the first audio data is noise data when the ratio is greater than a preset threshold, wherein the preset threshold is greater than or equal to 1.
In the information processing method and server provided by the embodiments of the present invention, the method includes: obtaining a plurality of audio data corresponding to a user, and identifying a plurality of attribute parameters in the audio data; mapping the plurality of audio data into a preset multi-dimensional coordinate system according to the plurality of attribute parameters to obtain the coordinate points corresponding to the plurality of audio data, wherein the dimension of the coordinate system matches the number of types of the attribute parameters; calculating a local density parameter of first audio data based on the coordinate point of each audio data according to a preset algorithm, the first audio data being any audio data in the plurality of audio data; and determining, based on the calculation result, whether the first audio data is noise data. In this way, the technical solution of the embodiments does not rely on manual settings. Using only the attribute information of the audio data itself (such as singer, language, era, and genre), the audio data is mapped to discrete coordinate points in a multi-dimensional coordinate system, the local density parameter of each coordinate point is calculated, and whether the audio data is noise data is judged from the calculation result. This greatly improves the accuracy of noise data identification and provides a reliable data source for subsequent tasks such as determining user profile data or personalized recommendation data.
Brief description of the drawings
Fig. 1a to Fig. 1c are schematic diagrams of application scenarios of the information processing method of the embodiments of the present invention;
Fig. 2 is a schematic flowchart of the information processing method of the embodiments of the present invention;
Fig. 3 is a schematic diagram of the mapping into the coordinate system in the embodiments of the present invention;
Fig. 4 is a schematic diagram of the local density determined based on the LOF algorithm in the embodiments of the present invention;
Fig. 5a is a schematic diagram of the effect before the information processing scheme of the embodiments of the present invention is applied;
Fig. 5b is a schematic diagram of the effect after the information processing scheme of the embodiments of the present invention is applied;
Fig. 6 is a schematic structural diagram of the server of the embodiments of the present invention;
Fig. 7 is a schematic diagram of the hardware composition of the server of the embodiments of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
In the embodiments of the present invention, the audio data processed by the information processing method may be presented on the terminal side in the form of songs. A song may be played by a player application on the terminal or through a web page. It is not limited to songs stored (downloaded) on the terminal and may also be a song played online. On the server side, the audio data being processed corresponds to the songs played on the terminal side.
As network services become more personalized, a user's preferences are usually determined from the user's operations, and different content is recommended to different users according to their preferences; that is, user profile data is determined and personalized recommendations are made for different users. Fig. 1a to Fig. 1c are schematic diagrams of application scenarios of the information processing method of the embodiments of the present invention. Taking songs played by a player application on the terminal as an example, the server can determine the user's preferences from the user's playback habits, such as favorite singers, favorite genres, favorite song languages, and favorite song eras, and then push the filtered information to the pages of the player application for display, as shown in Fig. 1a to Fig. 1c.
Normally, the server determines the user's preferences from data fed back by the terminal that characterizes user operations, such as download, favorite, playlist-creation, and play operations. Based on the obtained operation data, noise data is then judged using policy rules based on human experience. For example, when a song was clicked long ago and the user has not recently listened to that song or to other songs by the same singer, the song can be judged to no longer interest the user and is treated as noise data.
Such experience-based policy rules depend entirely on manual settings and are overly simple, which easily leads to misjudging noise data. For example, suppose 300 songs are favorited on a terminal, most of them Chinese songs and a few of them English songs, and the English songs have not been played recently. It may be that the user used to like listening to English songs and now likes listening to Chinese songs, and simply has not played English songs recently. In this scenario, judging the English songs to be noise data would be inaccurate.
Given the shortcomings of the above policy rules, the information processing scheme of the embodiments of the present invention combines data mining techniques and, based on the attribute information of the audio data (such as singer, language, era, and genre), maps the audio data to discrete coordinate points in a multi-dimensional coordinate system. Because each user's listening habits are concentrated within a certain range, noise data appears as isolated coordinate points in the multi-dimensional coordinate system.
The information processing method of the embodiments of the present invention is described in detail below.
Embodiment one
An embodiment of the present invention provides an information processing method. Fig. 2 is a schematic flowchart of the information processing method of the embodiment; as shown in Fig. 2, the information processing method includes:
Step 101: obtain a plurality of audio data corresponding to a user, and identify a plurality of attribute parameters in the audio data.
Step 102: map the plurality of audio data into a preset multi-dimensional coordinate system according to the plurality of attribute parameters to obtain the coordinate points corresponding to the plurality of audio data, wherein the dimension of the coordinate system matches the number of types of the attribute parameters.
Step 103: calculate a local density parameter of first audio data based on the coordinate point of each audio data according to a preset algorithm; the first audio data is any audio data in the plurality of audio data.
Step 104: determine, based on the calculation result, whether the first audio data is noise data.
In this embodiment, the information processing method is applied in a server or a server cluster. The server or server cluster may be the one corresponding to a player application or the one corresponding to a web page. That is, when the terminal side plays songs through a player application, the method is applied in the server or server cluster corresponding to that application; when the terminal side plays songs through a web page, the method is applied in the server or server cluster corresponding to the web page.
In step 101, obtaining the plurality of audio data corresponding to the user means obtaining the plurality of audio data corresponding to a user identifier (for example, a user name or an IP address). Specifically, the user identifier may be the user name entered when the user registers or logs in, or the IP address of the terminal the user holds when playing music. The server can record the various information associated with the user identifier that results from user operations, including audio data, for example which audio was played, downloaded, or favorited. Accordingly, the plurality of audio data corresponding to the user may be all audio data the user has operated on, including at least one of the following: audio data downloaded to local storage through a download operation, audio data played online through a play operation, and audio data favorited through a favorite operation. The audio data obtained by the server may be audio file data, that is, data that a player can play directly; the audio file data includes information about the audio. Alternatively, the audio data obtained by the server may be the information about the audio itself, such as an identifier of the audio (for example, its name) and summary information; the summary information may include the singer, language, era, region, and similar information.
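The collection described in step 101 can be pictured with a small sketch. The log layout `(user_id, operation, audio_id)`, the operation names, and the function `audio_for_user` are all hypothetical; the patent does not specify how the server stores operation records.

```python
from typing import Iterable, List, Tuple

# Hypothetical log entry: (user identifier, operation name, audio identifier).
LogEntry = Tuple[str, str, str]


def audio_for_user(
    log: Iterable[LogEntry],
    user_id: str,
    operations: Tuple[str, ...] = ("download", "play", "favorite"),
) -> List[str]:
    """Return the distinct audio ids the user downloaded, played, or favorited."""
    seen = set()
    result = []
    for uid, operation, audio_id in log:
        if uid == user_id and operation in operations and audio_id not in seen:
            seen.add(audio_id)
            result.append(audio_id)  # keep first-seen order
    return result


log = [
    ("u1", "download", "s1"),
    ("u1", "play", "s2"),
    ("u2", "play", "s3"),
    ("u1", "favorite", "s1"),  # same song again, counted once
    ("u1", "skip", "s4"),      # not one of the listed operations
]
user_audio = audio_for_user(log, "u1")
```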
Further, the server identifies each item in the summary information of the audio data as one type of attribute parameter: for example, the singer information in the audio data is identified as a first type of attribute parameter, the language information as a second type of attribute parameter, and so on. The embodiments of the present invention are of course not limited to attribute information of the above types.
In step 102, the multi-dimensional coordinate system is set up according to the types of attribute parameters. For example, if the attribute information includes singer, language, era, and region, there are four types of attribute information, so a four-dimensional coordinate system is set up in which each coordinate axis represents one type of attribute parameter. Fig. 3 is a schematic diagram of the mapping into the coordinate system in the embodiments of the present invention. For illustration, the coordinate system in Fig. 3 is only two-dimensional, with the x-axis representing the singer and the y-axis representing the era. Each singer corresponds to a value on the x-axis, and each era corresponds to a value on the y-axis. Each audio data is mapped to a coordinate point in the coordinate system according to its era and singer. As shown in Fig. 3, a first look already suggests that the two coordinate points indicated by arrows are relatively isolated from the other coordinate points. When there are more than two types of attribute parameters, a multi-dimensional coordinate system is set up in the same way and each audio data is mapped into it to obtain its corresponding coordinate point.
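A minimal sketch of this mapping follows. The patent does not say how an attribute value is assigned its numeric position on an axis; the first-seen integer encoding below, and the names `build_axis` and `map_to_coordinates`, are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def build_axis(values: Sequence[str]) -> Dict[str, int]:
    """Give each distinct attribute value a position on its axis (assumed encoding)."""
    axis: Dict[str, int] = {}
    for value in values:
        if value not in axis:
            axis[value] = len(axis)
    return axis


def map_to_coordinates(
    records: List[Dict[str, str]], attributes: Sequence[str]
) -> List[Tuple[float, ...]]:
    """Map each record to a point with one dimension per attribute type."""
    axes = {attr: build_axis([r[attr] for r in records]) for attr in attributes}
    return [
        tuple(float(axes[attr][r[attr]]) for attr in attributes) for r in records
    ]


songs = [
    {"singer": "A", "era": "1990s"},
    {"singer": "A", "era": "2000s"},
    {"singer": "B", "era": "1990s"},
]
points = map_to_coordinates(songs, ["singer", "era"])
# points == [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
```

Because distances between the resulting points drive every later step, the choice of encoding matters in practice; an ordinal attribute such as era would usually be placed on its axis in chronological order rather than in first-seen order.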
In step 103, calculating the local density parameter of the first audio data based on the coordinate point of each audio data according to the preset algorithm includes:
calculating the local density parameter of the first audio data based on the coordinate point of each audio data according to the Local Outlier Factor (LOF) algorithm.
Specifically, in this embodiment, calculating the local density parameter of the first audio data according to the LOF algorithm includes the following steps:
obtaining the k coordinate points with the smallest Euclidean distance to the first coordinate point a corresponding to the first audio data, and generating a first set, denoted N_k(a);
calculating the reachability distance between the first coordinate point a and each coordinate point in the first set; the reachability distance satisfies formula (1):
$$\mathrm{reachability\_distance}_k(a, b) = \max\{\mathrm{k\_distance}(b),\ d(a, b)\} \tag{1}$$
where reachability_distance_k(a, b) denotes the reachability distance between the first coordinate point a and the second coordinate point b; k_distance(b) denotes the Euclidean distance between the second coordinate point b and a third coordinate point, the third coordinate point being the point in a second set that is farthest from b in Euclidean distance; in this illustration the third coordinate point may be denoted as the third coordinate point k; the second set is the set generated from the k1 coordinate points nearest to the second coordinate point b in Euclidean distance; when the first coordinate point a and the second coordinate point b satisfy the first condition, reachability_distance_k(a, b) equals k_distance(b); when they do not satisfy the first condition, reachability_distance_k(a, b) equals the Euclidean distance between the first coordinate point a and the second coordinate point b;
calculating the first local density of the first coordinate point; the local density satisfies formula (2):
$$\mathrm{lrd}(a) = \frac{|N_k(a)|}{\sum_{b \in N_k(a)} \mathrm{reachability\_distance}_k(a, b)} \tag{2}$$
calculating the local density of each coordinate point, and obtaining the ratio of the average local density of the k coordinate points in the first set to the first local density; the ratio satisfies formula (3):
$$\mathrm{LOF}_k(a) = \frac{\sum_{b \in N_k(a)} \frac{\mathrm{lrd}(b)}{\mathrm{lrd}(a)}}{|N_k(a)|} = \frac{\sum_{b \in N_k(a)} \mathrm{lrd}(b)}{|N_k(a)|} \Big/ \mathrm{lrd}(a) \tag{3}$$
where lrd(b) denotes the local density of the second coordinate point b and may be called the second local density.
Specifically, the coordinate points corresponding to all audio data associated with the user are referred to here as the coordinate point set. First, all neighbor nodes of each coordinate point in the coordinate point set are determined. Taking the neighbor nodes of the first coordinate point a as an example, if the first coordinate point a has k neighbor nodes, those k neighbor nodes (that is, k coordinate points) form the first set, denoted N_k(a). The k neighbor nodes are the k coordinate points nearest to the first coordinate point a in Euclidean distance. In this embodiment, the Euclidean distance is calculated as in the prior art and is not described in detail here.
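The neighbor search can be sketched as a brute-force scan. The function name `k_nearest` and the sample points are hypothetical; `math.dist` (Python 3.8+) computes the Euclidean distance. Ties are broken by index order here, which is an assumption the text does not address.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def k_nearest(points: List[Point], i: int, k: int) -> List[int]:
    """Indices of the k points closest to points[i], excluding point i itself."""
    others = [j for j in range(len(points)) if j != i]
    # list.sort is stable, so equally distant points keep their index order.
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
first_set = k_nearest(points, 0, 2)  # N_k(a) for a = points[0], k = 2
```

A scan like this costs O(n) distance computations per point; for large song collections a spatial index would normally replace it.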
Second, the reachability distance between two coordinate points is determined according to formula (1). Note that the reachability distance between two coordinate points is not necessarily their Euclidean distance. Specifically, when the first coordinate point a belongs to the second set corresponding to the second coordinate point b, reachability_distance_k(a, b) equals k_distance(b); when the first coordinate point a does not belong to the second set corresponding to the second coordinate point b, reachability_distance_k(a, b) equals the Euclidean distance between the first coordinate point a and the second coordinate point b. Here the second set is the set generated from the k1 coordinate points nearest to the second coordinate point b in Euclidean distance. In other words, when a belongs to the neighbor set of b, the reachability distance between a and b equals the Euclidean distance between b and the farthest point in its neighbor set, that is, k_distance(b), consistent with formula (1); when a does not belong to the neighbor set of b, the reachability distance equals the Euclidean distance between a and b. This makes the subsequent local density calculation more stable.
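The two cases collapse into the single `max` of formula (1), as the sketch below shows. It assumes k1 equals k, as in the standard LOF definition; the function names and sample points are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def k_distance(points: List[Point], b: int, k: int) -> float:
    """Distance from points[b] to the farthest of its k nearest neighbors.

    Requires len(points) > k.
    """
    dists = sorted(
        math.dist(points[b], p) for idx, p in enumerate(points) if idx != b
    )
    return dists[k - 1]


def reachability_distance(points: List[Point], a: int, b: int, k: int) -> float:
    """Formula (1): max of k_distance(b) and the Euclidean distance d(a, b)."""
    return max(k_distance(points, b, k), math.dist(points[a], points[b]))


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
# Point 1 lies inside the 2-neighborhood of point 0, so k_distance(0) = 2.0 wins.
inside = reachability_distance(points, a=1, b=0, k=2)
# Point 3 lies outside it, so the plain Euclidean distance 10.0 wins.
outside = reachability_distance(points, a=3, b=0, k=2)
```

Replacing small distances by k_distance(b) is what smooths the density estimate: points that sit very close to b no longer produce near-zero denominators in formula (2).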
Third, the local density is obtained on the basis of the two preceding steps. Taking the calculation of the first local density Ird(a) of the first coordinate point a as an example, Ird(a) satisfies formula (2), where b denotes a second coordinate point and b ∈ N_k(a) indicates that b belongs to the first set N_k(a), i.e. b is one of the k coordinate points nearest in Euclidean distance to a. Formula (2) shows that the first local density of a is the reciprocal of the average reachability distance between a and all coordinate points in its first set N_k(a). The local density of every coordinate point is obtained in the same way.
Finally, the average local density of all coordinate points in the first set is compared with the first local density of the first coordinate point a by taking their ratio, which satisfies formula (3). In the embodiments of the present invention, the magnitude of this ratio determines whether the first audio data corresponding to the first coordinate point is noise data.
In step 104, determining whether the first audio data is noise data based on the calculation result includes: when the ratio is greater than a preset threshold, determining that the first audio data is noise data; the preset threshold is greater than or equal to 1.
Specifically, when the calculation result (i.e. the obtained ratio) is less than or equal to 1, the first coordinate point a is surrounded by the coordinate points in the first set, i.e. a lies close to the coordinate points in the first set. When the calculation result is greater than 1, a lies outside the first set: the closer the ratio is to 1, the closer a lies to the coordinate points in the first set; the farther the ratio is from 1, the more isolated a is from the coordinate points in the first set, and the more likely it is that a corresponds to noise data. Fig. 4 is a schematic diagram of the local density determined by the LOF algorithm in an embodiment of the present invention; as shown in Fig. 4, the circled coordinate points are those whose ratio is greater than 1. On this basis, a preset threshold greater than or equal to 1 may be configured as required in this embodiment; the larger the preset threshold, the higher the accuracy with which noise data is determined. For example, with a preset threshold of 3, the audio data corresponding to any coordinate point whose ratio is greater than 3 may be determined to be noise data.
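Putting the steps together, the ratio of formula (3) and the threshold test of step 104 might be sketched as follows. This is an illustrative sketch under stated assumptions: the second set uses the same k as the first set, all points are distinct (otherwise a reachability sum could be zero), and the names `lof_scores` and `noise_indices` and the sample data are hypothetical.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def lof_scores(points, k):
    """Ratio of formula (3) for every coordinate point."""
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]
    # First set N_k(i): the k nearest other points, nearest first.
    neigh = [sorted((j for j in range(n) if j != i),
                    key=lambda j: dist[i][j])[:k] for i in range(n)]
    # k_distance(i): distance to the farthest point of N_k(i).
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]

    def reach(a, b):
        # Formula (1).
        return max(k_dist[b], dist[a][b])

    # Formula (2): reciprocal of the average reachability distance.
    density = [k / sum(reach(a, b) for b in neigh[a]) for a in range(n)]
    # Formula (3): average neighbor density divided by own density.
    return [sum(density[b] for b in neigh[a]) / k / density[a]
            for a in range(n)]

def noise_indices(points, k, threshold):
    """Indices whose ratio exceeds the preset threshold (threshold >= 1)."""
    return [i for i, s in enumerate(lof_scores(points, k)) if s > threshold]

# Hypothetical one-dimensional coordinate points 0, 1, 2 and 10.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
print(noise_indices(points, 2, 3.0))  # [3]
```

With these sample points the isolated point 10 has a ratio of about 4.96, while the three clustered points stay below 1.4, so a threshold of 3 separates them cleanly.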
With the technical solution of the embodiments of the present invention, no manual configuration is required: the audio data is mapped to discrete coordinate points in a multidimensional coordinate system solely according to its own attribute information (such as singer, language, era and genre), the local density parameter of each coordinate point is calculated, and whether the audio data is noise data is judged from the calculation result. This greatly improves the accuracy of noise data identification and provides a reliable data source for subsequent steps such as determining user portrait data or personalized recommendation data.
Embodiment two
This embodiment describes the information processing method of the embodiments of the present invention in detail with reference to a specific application scenario. Take as an example 300 songs collected (or downloaded) through the client corresponding to a user (the client being, for example, a music application). Some of these 300 songs may well have been collected (or downloaded) through a misoperation by the user, or may have been collected (or downloaded) long ago but not played recently.
Step 1: the attribute parameters of the 300 songs are identified. Taking four types of attribute parameter as an example (singer, language, era and genre), a four-dimensional coordinate system is established accordingly, in which each coordinate axis corresponds to one type of attribute parameter and the coordinate values on each axis correspond to the values of that attribute parameter. For example, each singer corresponds to a value on the first coordinate axis, each era corresponds to a value on the second coordinate axis, and so on. The 300 songs are mapped into the four-dimensional coordinate system according to singer, language, era and genre, yielding the coordinate points of the 300 songs in that coordinate system, i.e. the coordinate point set. Each coordinate point may be represented by a feature vector containing four feature values.
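The mapping in Step 1 could look like the sketch below. The text does not say how attribute values are assigned to numbers on each axis, so the index-based assignment here is purely an assumption, as are the names `build_axis` and `map_songs`, the attribute keys and the sample songs.

```python
ATTRS = ("singer", "language", "era", "genre")

def build_axis(values):
    """Assign each distinct attribute value a numeric position on one axis.

    Assumption: positions are consecutive indices in sorted order.
    """
    return {v: float(i) for i, v in enumerate(sorted(set(values)))}

def map_songs(songs):
    """Map each song to a four-dimensional coordinate point (feature vector)."""
    axes = {attr: build_axis(s[attr] for s in songs) for attr in ATTRS}
    return [tuple(axes[attr][s[attr]] for attr in ATTRS) for s in songs]

# Hypothetical songs; a real collection would hold 300 entries.
songs = [
    {"singer": "A", "language": "zh", "era": "1990s", "genre": "pop"},
    {"singer": "B", "language": "en", "era": "2000s", "genre": "rock"},
    {"singer": "A", "language": "zh", "era": "2000s", "genre": "pop"},
]
coords = map_songs(songs)
print(coords[0])  # (0.0, 1.0, 0.0, 0.0)
```

Because Euclidean distance is later computed on these values, the way categories are placed on an axis affects which songs count as neighbors; an ordering that puts similar singers or genres close together would presumably serve the method better than an arbitrary one.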
Step 2: for the coordinate point set, all neighbor nodes of each coordinate point are obtained and a first set is generated. If the first coordinate point a has k neighbor nodes, the first set is denoted N_k(a). The neighbor nodes of the first coordinate point a are the coordinate points nearest in Euclidean distance to a.
Step 3: the reachability distance between two coordinate points is calculated, which may be obtained by formula (1) above. reachability_distance_k(a, b) denotes the reachability distance between the first coordinate point a and the second coordinate point b; k_distance(b) denotes the Euclidean distance between the second coordinate point b and a third coordinate point, namely the coordinate point in the second set that is farthest in Euclidean distance from b; in this illustration the third coordinate point may be denoted as the third coordinate point k. The second set is the set formed by the k1 coordinate points nearest in Euclidean distance to the second coordinate point b. When a belongs to the second set corresponding to b, reachability_distance_k(a, b) equals k_distance(b); when a does not belong to the second set corresponding to b, reachability_distance_k(a, b) equals the Euclidean distance between a and b. In other words, when a belongs to the neighbor node set of b, the reachability distance between a and b equals k_distance(b), the Euclidean distance between b and the farthest coordinate point in that neighbor node set; when a does not belong to the neighbor node set of b, the reachability distance between a and b equals the Euclidean distance between a and b. This makes the subsequent calculation of local density more stable.
Step 4: the local density of each coordinate point is calculated. Taking the first local density of the first coordinate point a as an example, it may be calculated by formula (2).
Step 5: the ratio of the average local density of the k coordinate points in the first set to the first local density is obtained; the ratio may be obtained by formula (3). Specifically, when the calculation result (i.e. the obtained ratio) is less than or equal to 1, the first coordinate point a is surrounded by the coordinate points in the first set, i.e. a lies close to the coordinate points in the first set. When the calculation result is greater than 1, a lies outside the first set: the closer the ratio is to 1, the closer a lies to the coordinate points in the first set; the farther the ratio is from 1, the more isolated a is from the coordinate points in the first set, and the more likely it is that a corresponds to noise data. As shown in Fig. 4, the circled coordinate points are those whose ratio is greater than 1. On this basis, a preset threshold greater than or equal to 1 may be configured as required in this embodiment; the larger the preset threshold, the higher the accuracy with which noise data is determined. For example, with a preset threshold of 3, the audio data corresponding to any coordinate point whose ratio is greater than 3 may be determined to be noise data.
With the technical solution of the embodiments of the present invention, no manual configuration is required: the audio data is mapped to discrete coordinate points in a multidimensional coordinate system solely according to its own attribute information (such as singer, language, era and genre), the local density parameter of each coordinate point is calculated, and whether the audio data is noise data is judged from the calculation result. This greatly improves the accuracy of noise data identification and provides a reliable data source for subsequent steps such as determining user portrait data or personalized recommendation data.
As described above for the information processing method, the technical solution of the embodiments of the present invention can be applied in scenarios such as the following:
Scenario 1: for some applications, taking a music application as an example, when a user uses the application, songs the user may like are usually recommended according to the user's preferences (i.e. personalized recommendation). If the prior-art way of identifying noise data is used, identification is likely to be inaccurate, so the songs recommended to the user may well be ones the user does not like. Fig. 5a is a schematic diagram of the effect before the information processing scheme of the embodiment of the present invention is adopted. When a song the user does not like is recommended, the user is likely to click the "change batch" button, which represents a switching function, to switch to the next song or the next page of recommendations. As shown in Fig. 5a, with the prior-art processing scheme the "change batch rate" obtained from feedback stays at a relatively high value (above 1.5%). With the technical solution of the embodiments of the present invention, once the real noise data has been removed the server can determine the user's true preferences, so the probability that a recommended song is one the user likes is significantly improved. Fig. 5b is a schematic diagram of the effect after the information processing scheme of the embodiment of the present invention is adopted; as shown in Fig. 5b, the "trash can rate", which characterizes negative feedback, shows a downward trend throughout (as indicated by the arrows).
Scenario 2: determination of user portrait data. Taking the determination of user portrait data in a music application as an example, when a user uses the application, its personal display page usually shows the user's personalized preferences, as shown in Fig. 1a to Fig. 1c, for example the singers the user likes, the song genres the user likes and the song eras the user likes. If the prior-art way of identifying noise data is used, the displayed information may differ considerably from the user's actual preferences, which harms the user experience. With the technical solution of the embodiments of the present invention, the accuracy of noise data identification is greatly improved, so that the displayed information differs less from the user's preferences and the user experience is improved.
Embodiment three
An embodiment of the present invention further provides a server. Fig. 6 is a schematic diagram of the structure of the server of the embodiment of the present invention; as shown in Fig. 6, the server includes: a data acquisition unit 21, a mapping unit 22, a calculation unit 23 and a determination unit 24; wherein,
the data acquisition unit 21 is configured to obtain a plurality of audio data corresponding to a user and to identify a plurality of attribute parameters in the audio data;
the mapping unit 22 is configured to map the plurality of audio data obtained by the data acquisition unit 21, according to the plurality of attribute parameters, into a preset multidimensional coordinate system to obtain the coordinate points corresponding to the plurality of audio data; wherein the number of dimensions of the coordinate system matches the number of types of attribute parameter;
the calculation unit 23 is configured to calculate a local density parameter of first audio data according to a preset algorithm based on the coordinate points of the audio data; the first audio data is any one of the plurality of audio data;
the determination unit 24 is configured to determine, based on the calculation result obtained by the calculation unit 23, whether the first audio data is noise data.
In this embodiment, the server or server cluster may be the server or server cluster corresponding to a player application, or the server or server cluster corresponding to a web page. That is, when the terminal side plays songs through a player application, the information processing method is applied to the server or server cluster corresponding to that player application; when the terminal side plays songs through a web page, the information processing method is applied to the server or server cluster corresponding to that web page.
In this embodiment, the attribute parameters of the audio data are, for example, singer information, language information, era information and region information. The mapping unit 22 sets up the multidimensional coordinate system in advance according to the types of attribute parameter, each coordinate axis of the multidimensional coordinate system representing one type of attribute parameter. The coordinate system shown in Fig. 3 uses only two dimensions as an example: the x axis represents the singer and the y axis represents the era. Each singer corresponds to a value on the x axis and, likewise, each era corresponds to a value on the y axis. Each audio data is mapped to a coordinate point in the coordinate system according to its era and singer, as shown in Fig. 3, from which it can be seen at a glance that the two coordinate points indicated by arrows are relatively isolated from the other coordinate points. When there are more than two types of attribute parameter, a multidimensional coordinate system is established in the same way and each audio data is mapped into it to obtain the corresponding coordinate point.
In this embodiment, the calculation unit 23 is specifically configured to calculate the local density parameter of the first audio data according to the LOF algorithm based on the coordinate points of the audio data.
Specifically, the calculation unit 23 is configured to obtain the k coordinate points nearest in Euclidean distance to the first coordinate point a corresponding to the first audio data and to generate a first set, denoted N_k(a);
to calculate the reachability distance between the first coordinate point a and each coordinate point in the first set, the reachability distance satisfying formula (1):
$$\mathrm{reachability\_distance}_k(a, b) = \max\{\mathrm{k\_distance}(b),\ d(a, b)\} \tag{1}$$
where reachability_distance_k(a, b) denotes the reachability distance between the first coordinate point a and the second coordinate point b; k_distance(b) denotes the Euclidean distance between the second coordinate point b and a third coordinate point, namely the coordinate point in the second set that is farthest in Euclidean distance from b; in this illustration the third coordinate point may be denoted as the third coordinate point k; the second set is the set formed by the k1 coordinate points nearest in Euclidean distance to the second coordinate point b; when the first coordinate point a and the second coordinate point b satisfy a first condition (a belongs to the second set corresponding to b), reachability_distance_k(a, b) equals k_distance(b); when a and b do not satisfy the first condition, reachability_distance_k(a, b) equals the Euclidean distance between a and b;
to calculate the first local density of the first coordinate point, the local density satisfying formula (2):
$$\mathrm{Ird}(a) = \frac{|N_k(a)|}{\sum_{b \in N_k(a)} \mathrm{reachability\_distance}_k(a, b)} \tag{2}$$
and to calculate the local density of each coordinate point and obtain the ratio of the average local density of the k coordinate points in the first set to the first local density, the ratio satisfying formula (3):
$$\mathrm{LOF}_k(a) = \frac{\sum_{b \in N_k(a)} \frac{\mathrm{Ird}(b)}{\mathrm{Ird}(a)}}{|N_k(a)|} = \frac{\sum_{b \in N_k(a)} \mathrm{Ird}(b)}{|N_k(a)|} \Big/ \mathrm{Ird}(a) \tag{3}$$
where Ird(b) denotes the local density of the second coordinate point b, which may be referred to as the second local density.
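As a hedged illustration of formulas (1) to (3), consider four hypothetical one-dimensional coordinate points 0, 1, 2 and 10 with k = 2. These numbers are assumptions chosen for easy arithmetic and do not come from the embodiment; the second set is also assumed to use the same k. Taking a = 10 as the first coordinate point:

```latex
\begin{aligned}
N_2(10) &= \{2, 1\}, \qquad \mathrm{k\_distance}(0) = 2,\ \mathrm{k\_distance}(1) = 1,\ \mathrm{k\_distance}(2) = 2 \\
\mathrm{reachability\_distance}_2(10, 2) &= \max\{2, 8\} = 8, \qquad
\mathrm{reachability\_distance}_2(10, 1) = \max\{1, 9\} = 9 \\
\mathrm{Ird}(10) &= \frac{2}{8 + 9} = \frac{2}{17} \\
\mathrm{Ird}(2) &= \frac{2}{\max\{1, 1\} + \max\{2, 2\}} = \frac{2}{3}, \qquad
\mathrm{Ird}(1) = \frac{2}{\max\{2, 1\} + \max\{2, 1\}} = \frac{1}{2} \\
\mathrm{LOF}_2(10) &= \frac{\tfrac{2}{3} + \tfrac{1}{2}}{2} \Big/ \frac{2}{17} = \frac{7}{12} \cdot \frac{17}{2} = \frac{119}{24} \approx 4.96
\end{aligned}
```

Since 4.96 is greater than an example preset threshold of 3, the audio data mapped to the point 10 would be judged to be noise data under these assumptions.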
Specifically, the coordinate points corresponding to all audio data associated with the user are referred to here as the coordinate point set. First, the calculation unit 23 determines all neighbor nodes of each coordinate point in the coordinate point set. Taking the neighbor nodes of a first coordinate point a in the coordinate point set as an example, if the first coordinate point a has k neighbor nodes, these k neighbor nodes (i.e. k coordinate points) are recorded as a first set, denoted N_k(a). The k neighbor nodes are the k coordinate points nearest in Euclidean distance to the first coordinate point a. In this embodiment, the Euclidean distance may be calculated as described in the prior art and is not detailed here.
Second, the calculation unit 23 determines the reachability distance between two coordinate points so as to satisfy formula (1). The reachability distance between two coordinate points is not necessarily their Euclidean distance. Specifically, when the first coordinate point a belongs to the second set corresponding to the second coordinate point b, reachability_distance_k(a, b) equals k_distance(b); when the first coordinate point a does not belong to the second set corresponding to the second coordinate point b, reachability_distance_k(a, b) equals the Euclidean distance between a and b. Here the second set is the set formed by the k1 coordinate points nearest in Euclidean distance to the second coordinate point b. In other words, when a belongs to the neighbor node set of b, the reachability distance between a and b equals k_distance(b), the Euclidean distance between b and the farthest coordinate point in its neighbor node set; when a does not belong to the neighbor node set of b, the reachability distance between a and b equals the Euclidean distance between a and b. This makes the subsequent calculation of local density more stable.
Third, the calculation unit 23 obtains the local density on the basis of the two preceding steps. Taking the calculation of the first local density Ird(a) of the first coordinate point a as an example, Ird(a) satisfies formula (2), where b denotes a second coordinate point and b ∈ N_k(a) indicates that b belongs to the first set N_k(a), i.e. b is one of the k coordinate points nearest in Euclidean distance to a. Formula (2) shows that the first local density of a is the reciprocal of the average reachability distance between a and all coordinate points in its first set N_k(a). The local density of every coordinate point is obtained in the same way.
Finally, the calculation unit 23 compares the average local density of all coordinate points in the first set with the first local density of the first coordinate point a by taking their ratio, which satisfies formula (3). In the embodiments of the present invention, the magnitude of this ratio determines whether the first audio data corresponding to the first coordinate point is noise data.
In this embodiment, the determination unit 24 is configured to determine that the first audio data is noise data when the ratio is greater than a preset threshold; the preset threshold is greater than or equal to 1.
Specifically, when the calculation result (i.e. the obtained ratio) is less than or equal to 1, the first coordinate point a is surrounded by the coordinate points in the first set, i.e. a lies close to the coordinate points in the first set. When the calculation result is greater than 1, a lies outside the first set: the closer the ratio is to 1, the closer a lies to the coordinate points in the first set; the farther the ratio is from 1, the more isolated a is from the coordinate points in the first set, and the more likely it is that a corresponds to noise data. As shown in Fig. 4, the circled coordinate points are those whose ratio is greater than 1. On this basis, a preset threshold greater than or equal to 1 may be configured as required in this embodiment; the larger the preset threshold, the higher the accuracy with which noise data is determined. For example, with a preset threshold of 3, the audio data corresponding to any coordinate point whose ratio is greater than 3 may be determined to be noise data.
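The cooperation of the four units might be sketched as below. This is only a structural illustration: the class name `Server`, its field names, the default threshold and the stub units are all hypothetical, and the calculation unit is injected as a function so that any implementation of formulas (1) to (3) can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Song = Dict[str, str]
Point = Tuple[float, ...]

@dataclass
class Server:
    # Each callable stands in for one unit; all are hypothetical stand-ins.
    acquire: Callable[[str], List[Song]]               # data acquisition unit 21
    map_points: Callable[[List[Song]], List[Point]]    # mapping unit 22
    compute: Callable[[Sequence[Point]], List[float]]  # calculation unit 23
    threshold: float = 3.0                             # used by determination unit 24

    def __post_init__(self):
        # The text requires the preset threshold to be at least 1.
        if self.threshold < 1:
            raise ValueError("preset threshold must be >= 1")

    def noise_for(self, user_id: str) -> List[Song]:
        songs = self.acquire(user_id)
        ratios = self.compute(self.map_points(songs))
        # Determination unit 24: noise when the ratio exceeds the threshold.
        return [s for s, r in zip(songs, ratios) if r > self.threshold]

# Stub units returning fixed values, for illustration only.
server = Server(
    acquire=lambda uid: [{"title": "s1"}, {"title": "s2"}, {"title": "s3"}],
    map_points=lambda songs: [(float(i),) for i, _ in enumerate(songs)],
    compute=lambda pts: [0.9, 1.2, 4.5],
)
print(server.noise_for("user-1"))  # [{'title': 's3'}]
```

Keeping the units as separate callables mirrors the logical division described for Fig. 6 while leaving open how each one is realized in hardware or software.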
In this embodiment, the data acquisition unit 21, the mapping unit 22, the calculation unit 23 and the determination unit 24 of the server may in practice all be implemented by a central processing unit (CPU), a digital signal processor (DSP) or a field-programmable gate array (FPGA) in the server.
Fig. 7 is a schematic diagram of the hardware composition of the server of the embodiment of the present invention. As shown in Fig. 7, one example of the server as a hardware entity includes a processor 31, a storage medium 32 and at least one external communication interface 33; the processor 31, the storage medium 32 and the external communication interface 33 are all connected by a bus 34.
It should be noted that the above description of the server is similar to the description of the method and has the same beneficial effects as the method, which are not repeated here. For technical details not disclosed in the server embodiment of the present invention, refer to the description of the method embodiments of the present invention.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be indirect coupling or communication connection through interfaces, devices or units, and may be electrical, mechanical or of another form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. On this understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable storage device, ROM, RAM, a magnetic disk or an optical disc.
The above is only a specific implementation of the present invention, but the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of the claims.

Claims (10)

1. An information processing method, characterized in that the method comprises:
obtaining a plurality of audio data corresponding to a user, and identifying a plurality of attribute parameters in the audio data;
mapping the plurality of audio data, according to the plurality of attribute parameters, into a preset multidimensional coordinate system to obtain the coordinate points corresponding to the plurality of audio data; wherein the number of dimensions of the coordinate system matches the number of types of attribute parameter;
calculating a local density parameter of first audio data according to a preset algorithm based on the coordinate points of the audio data; the first audio data being any one of the plurality of audio data;
determining, based on the calculation result, whether the first audio data is noise data.
2. The method according to claim 1, characterized in that calculating the local density parameter of the first audio data according to a preset algorithm based on the coordinate points of the audio data comprises:
calculating the local density parameter of the first audio data according to a local density factor (LOF) algorithm based on the coordinate points of the audio data.
3. The method according to claim 2, characterized in that calculating the local density parameter of the first audio data according to the LOF algorithm based on the coordinate points of the audio data comprises:
obtaining the k coordinate points nearest in Euclidean distance to the first coordinate point a corresponding to the first audio data, and generating a first set, denoted N_k(a);
calculating the reachability distance between the first coordinate point a and each coordinate point in the first set; the reachability distance satisfying the following formula:
$$\mathrm{reachability\_distance}_k(a, b) = \max\{\mathrm{k\_distance}(b),\ d(a, b)\}$$;
Wherein, (a b) represents between the first coordinate points a and the second coordinate points b reachability_distance_k Reach distance;K_distance (b) represent described second coordinate points b and second set in described second coordinate points Euclidean distance between the 3rd coordinate points that the Euclidean distance of b is farthest;Wherein, described second collection is combined into and institute State the set that k1 nearest coordinate points of the Euclidean distance of the second coordinate points b generates;When described first coordinate points When a and described second coordinate points b meet first condition, (a b) is equal to reachability_distance_k k_distance(b);When described first coordinate points a and described second coordinate points b are unsatisfactory for first condition, (a, b) equal to the Europe between described first coordinate points a and described second coordinate points b for reachability_distance_k Formula distance;
calculating a first local density of the first coordinate point a, the local density satisfying the following formula:
$$\mathrm{lrd}(a) = \frac{|N_k(a)|}{\sum_{b \in N_k(a)} \text{reachability\_distance}_k(a, b)};$$
calculating the local density of each coordinate point, and obtaining the ratio of the average local density of the k coordinate points in the first set to the first local density, the ratio satisfying the following formula:
$$\mathrm{LOF}_k(a) = \frac{\sum_{b \in N_k(a)} \frac{\mathrm{lrd}(b)}{\mathrm{lrd}(a)}}{|N_k(a)|} = \frac{\sum_{b \in N_k(a)} \mathrm{lrd}(b)}{|N_k(a)|} \Big/ \mathrm{lrd}(a).$$
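The steps of claim 3 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it takes k1 equal to k (the claim allows them to differ), breaks distance ties by index order, and assumes no duplicate points (which could make the summed reachability distance zero). All function names and the sample points are hypothetical.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k points nearest to points[i] (excluding i itself)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def k_distance(points, b, k):
    """Distance from points[b] to the farthest of its k nearest neighbours."""
    return math.dist(points[b], points[k_nearest(points, b, k)[-1]])


def reachability_distance(points, a, b, k):
    """max{k_distance(b), d(a, b)}."""
    return max(k_distance(points, b, k), math.dist(points[a], points[b]))


def lrd(points, a, k):
    """Local density: |N_k(a)| over the summed reachability distances."""
    neighbours = k_nearest(points, a, k)
    total = sum(reachability_distance(points, a, b, k) for b in neighbours)
    return len(neighbours) / total


def lof(points, a, k):
    """Average neighbour density divided by the density of point a."""
    neighbours = k_nearest(points, a, k)
    mean_lrd = sum(lrd(points, b, k) for b in neighbours) / len(neighbours)
    return mean_lrd / lrd(points, a, k)


# Hypothetical 2-D coordinate points: four clustered items and one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = [lof(points, i, k=2) for i in range(len(points))]
```

With these sample points, the four clustered points get a ratio close to 1, while the isolated point gets a ratio well above 1. The sketch recomputes neighbour sets on every call for readability; a practical version would cache them.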
4. The method according to claim 3, wherein satisfying the first condition comprises: the first coordinate point a belongs to the second set corresponding to the second coordinate point b;
and not satisfying the first condition comprises: the first coordinate point a does not belong to the second set corresponding to the second coordinate point b.
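One way to see why these two cases agree with the max formula of claim 3: k_distance(b) is the distance from b to the farthest member of its second set, so, ties aside, a belongs to that set exactly when d(a, b) is no larger than k_distance(b). Written out, with $N_{k1}(b)$ used here only as shorthand for the second set (it is not notation from the claims):

$$a \in N_{k1}(b) \;\Rightarrow\; d(a, b) \le \text{k\_distance}(b) \;\Rightarrow\; \max\{\text{k\_distance}(b),\ d(a, b)\} = \text{k\_distance}(b)$$

$$a \notin N_{k1}(b) \;\Rightarrow\; d(a, b) \ge \text{k\_distance}(b) \;\Rightarrow\; \max\{\text{k\_distance}(b),\ d(a, b)\} = d(a, b)$$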
5. The method according to claim 3, wherein determining, based on the calculation result, whether the first audio data is noise data comprises:
determining that the first audio data is noise data when the ratio is greater than a preset threshold, wherein the preset threshold is greater than or equal to 1.
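The decision rule of claim 5 can be sketched as a simple comparison. A ratio near 1 means a point is about as dense as its neighbours, which is why a threshold below 1 is not allowed. The function name, the default threshold of 1.5, and the sample ratios below are hypothetical values chosen for illustration only.

```python
def flag_noise(lof_scores, threshold=1.5):
    """Return {item: bool}, True where the LOF ratio exceeds the threshold."""
    if threshold < 1:
        raise ValueError("the preset threshold must be greater than or equal to 1")
    return {item: score > threshold for item, score in lof_scores.items()}


# Hypothetical LOF ratios for three pieces of audio data.
lof_scores = {"track_a": 0.97, "track_b": 1.08, "track_c": 6.03}
flags = flag_noise(lof_scores)
```

Here only `track_c` is flagged as noise data. A larger threshold makes the rule stricter about what counts as noise.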
6. A server, comprising a data acquisition unit, a mapping unit, a calculation unit, and a determination unit, wherein:
the data acquisition unit is configured to obtain a plurality of pieces of audio data corresponding to a user and to identify a plurality of attribute parameters in the audio data;
the mapping unit is configured to map the plurality of pieces of audio data obtained by the data acquisition unit into a preset multidimensional coordinate system according to the plurality of attribute parameters, to obtain the coordinate points corresponding to the plurality of pieces of audio data, wherein the dimension of the coordinate system matches the number of types of the attribute parameters;
the calculation unit is configured to calculate a local density parameter of first audio data according to a preset algorithm, based on the coordinate point of each piece of audio data, the first audio data being any one of the plurality of pieces of audio data; and
the determination unit is configured to determine, based on the calculation result obtained by the calculation unit, whether the first audio data is noise data.
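The mapping-unit step can be illustrated with a short sketch in which each attribute type becomes one axis of the coordinate system. The attribute names, the track identifiers, and the assumption that every attribute is already numeric are all hypothetical; the claim does not say which attributes are used or how non-numeric ones would be encoded.

```python
# Hypothetical attribute types; the coordinate system has one axis per type.
ATTRIBUTE_TYPES = ("tempo", "energy", "release_year")


def to_coordinate_point(attributes, attribute_types=ATTRIBUTE_TYPES):
    """Map one piece of audio data (a dict of numeric attributes) to a point."""
    return tuple(float(attributes[name]) for name in attribute_types)


def map_audio_data(audio_items, attribute_types=ATTRIBUTE_TYPES):
    """Mapping-unit step: one coordinate point per piece of audio data."""
    return {
        item_id: to_coordinate_point(attrs, attribute_types)
        for item_id, attrs in audio_items.items()
    }


audio_items = {
    "track_a": {"tempo": 120, "energy": 0.8, "release_year": 2014},
    "track_b": {"tempo": 90, "energy": 0.4, "release_year": 1999},
}
coordinate_points = map_audio_data(audio_items)
```

Each resulting point has as many components as there are attribute types, which matches the requirement that the dimension of the coordinate system equal the number of attribute types. In practice the axes would probably need scaling so that no single attribute dominates the Euclidean distance.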
7. The server according to claim 6, wherein the calculation unit is configured to calculate the local density parameter of the first audio data according to the Local Outlier Factor (LOF) algorithm, based on the coordinate point of each piece of audio data.
8. The server according to claim 7, wherein the calculation unit is configured to: obtain the k coordinate points nearest in Euclidean distance to a first coordinate point a corresponding to the first audio data, and generate a first set, denoted $N_k(a)$;
calculate the reachability distance between the first coordinate point a and each coordinate point in the first set, the reachability distance satisfying the following formula:
$$\text{reachability\_distance}_k(a, b) = \max\{\text{k\_distance}(b),\ d(a, b)\};$$
where reachability_distance_k(a, b) denotes the reachability distance between the first coordinate point a and a second coordinate point b; k_distance(b) denotes the Euclidean distance between the second coordinate point b and a third coordinate point, the third coordinate point being the point in a second set that is farthest from b in Euclidean distance; the second set is the set formed by the k1 coordinate points nearest to the second coordinate point b in Euclidean distance; when the first coordinate point a and the second coordinate point b satisfy a first condition, reachability_distance_k(a, b) equals k_distance(b); when they do not satisfy the first condition, reachability_distance_k(a, b) equals the Euclidean distance between the first coordinate point a and the second coordinate point b;
calculate a first local density of the first coordinate point a, the local density satisfying the following formula:
$$\mathrm{lrd}(a) = \frac{|N_k(a)|}{\sum_{b \in N_k(a)} \text{reachability\_distance}_k(a, b)};$$
calculate the local density of each coordinate point, and obtain the ratio of the average local density of the k coordinate points in the first set to the first local density, the ratio satisfying the following formula:
$$\mathrm{LOF}_k(a) = \frac{\sum_{b \in N_k(a)} \frac{\mathrm{lrd}(b)}{\mathrm{lrd}(a)}}{|N_k(a)|} = \frac{\sum_{b \in N_k(a)} \mathrm{lrd}(b)}{|N_k(a)|} \Big/ \mathrm{lrd}(a).$$
9. The server according to claim 8, wherein satisfying the first condition comprises: the first coordinate point a belongs to the second set corresponding to the second coordinate point b;
and not satisfying the first condition comprises: the first coordinate point a does not belong to the second set corresponding to the second coordinate point b.
10. The server according to claim 8, wherein the determination unit is configured to determine that the first audio data is noise data when the ratio is greater than a preset threshold, wherein the preset threshold is greater than or equal to 1.
CN201610193015.3A 2016-03-30 2016-03-30 Information processing method and server Active CN105893515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610193015.3A CN105893515B (en) 2016-03-30 2016-03-30 Information processing method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610193015.3A CN105893515B (en) 2016-03-30 2016-03-30 Information processing method and server

Publications (2)

Publication Number Publication Date
CN105893515A true CN105893515A (en) 2016-08-24
CN105893515B CN105893515B (en) 2021-02-05

Family

ID=57014416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610193015.3A Active CN105893515B (en) 2016-03-30 2016-03-30 Information processing method and server

Country Status (1)

Country Link
CN (1) CN105893515B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120099734A1 (en) * 2002-01-24 2012-04-26 Telediffusion De France Method for qualitative evaluation of a digital audio signal
CN101552689A (en) * 2009-05-15 2009-10-07 中国科学技术大学 User interest drift detection method and system based on network structure
CN104035972A (en) * 2014-05-21 2014-09-10 哈尔滨工业大学深圳研究生院 Knowledge recommending method and system based on micro blogs
CN104156579A (en) * 2014-07-31 2014-11-19 江南大学 Dynamic traffic abnormal data detection and recovery method
CN104113698A (en) * 2014-08-06 2014-10-22 北京北纬通信科技股份有限公司 Blurred image processing method and system applied to image capturing device
CN104504901A (en) * 2014-12-29 2015-04-08 浙江银江研究院有限公司 Multidimensional data based detecting method of traffic abnormal spots
CN104809594A (en) * 2015-05-13 2015-07-29 中国电力科学研究院 Distribution network data online cleaning method based on dynamic outlier detection
CN104932966A (en) * 2015-06-19 2015-09-23 广东欧珀移动通信有限公司 Method and device for detecting false downloading times of application software

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503181A (en) * 2016-10-25 2017-03-15 腾讯音乐娱乐(深圳)有限公司 A kind of audio data processing method and device
CN107301297A (en) * 2017-06-28 2017-10-27 正升环境科技股份有限公司 Noise abatement management method and device
CN113613145A (en) * 2021-08-09 2021-11-05 深圳分贝声学科技有限公司 Noise reduction processing method and related device
CN113613145B (en) * 2021-08-09 2023-03-03 深圳分贝声学科技有限公司 Noise reduction processing method and related device

Also Published As

Publication number Publication date
CN105893515B (en) 2021-02-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant