CN109885722B - Music recommendation method and device based on natural language processing and computer equipment

Info

Publication number
CN109885722B (application CN201910012372.9A)
Authority
CN
China
Prior art keywords
song, emotion, list, value, target
Prior art date
Legal status
Active
Application number
CN201910012372.9A
Other languages
Chinese (zh)
Other versions
CN109885722A (en)
Inventor
吴壮伟
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910012372.9A
Publication of CN109885722A
Application granted
Publication of CN109885722B

Classifications

    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a music recommendation method and device based on natural language processing and computer equipment. According to the method, song vectors corresponding to all songs in a history song list are used as input of a song emotion classification model, so that emotion dimension values corresponding to all songs are obtained; taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking emotion dimension values corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain the song emotion prediction model; and obtaining a target song emotion dimension value corresponding to the current system time according to the current system time and the song emotion prediction model, and correspondingly obtaining a recommended song list according to the song type of the target song emotion dimension value. According to the method, text mining and mood mining of songs are achieved through natural language processing technology and emotion analysis, and songs similar to the current song mood can be screened out to form a recommendation list.

Description

Music recommendation method and device based on natural language processing and computer equipment
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a music recommendation method and apparatus based on natural language processing, and a computer device.
Background
At present, music recommendation systems mainly rely on collaborative filtering over user behavior, which runs into the problems of cold start and sparse matrices. Moreover, when a user listens to songs, the music carries a certain tonal classification, and capturing that classification cannot be separated from natural language processing; if songs with a similar tone cannot be provided, user stickiness is greatly reduced.
Disclosure of Invention
The embodiment of the invention provides a music recommendation method, a device and computer equipment based on natural language processing, which aim to solve the problem that music recommendation systems in the prior art mainly rely on collaborative filtering over user behavior and therefore encounter cold-start and sparse-matrix issues.
In a first aspect, an embodiment of the present invention provides a music recommendation method based on natural language processing, including:
obtaining song vectors corresponding to lyrics of each song in a history song list;
inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list;
Taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model;
inputting the current system time into a song emotion prediction model, and obtaining a target song emotion dimension value corresponding to the current system time; and
and correspondingly acquiring a recommended song list according to the song type corresponding to the emotion value of the target song.
In a second aspect, an embodiment of the present invention provides a music recommendation device based on natural language processing, including:
a song vector obtaining unit, configured to obtain a song vector corresponding to lyrics of each song in the history song list;
the emotion dimension value acquisition unit is used for inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list;
The first model training unit is used for taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model;
the target emotion dimension acquisition unit is used for inputting the current system time into the song emotion prediction model and acquiring a target song emotion dimension value corresponding to the current system time; and
and the recommended song list acquisition unit is used for correspondingly acquiring a recommended song list according to the song type corresponding to the emotion value of the target song.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the music recommendation method based on natural language processing according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a computer program, where the computer program when executed by a processor causes the processor to perform the music recommendation method based on natural language processing according to the first aspect.
The embodiment of the invention provides a music recommendation method and device based on natural language processing and computer equipment. The method comprises the steps of obtaining song vectors corresponding to lyrics of each song in a history song list; inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list; taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model; inputting the current system time into a song emotion prediction model, and obtaining a target song emotion dimension value corresponding to the current system time; and correspondingly acquiring a recommended song list according to the song type corresponding to the emotion value of the target song. According to the method, text mining and emotion benchmark mining of songs are achieved through natural language processing technology, emotion analysis and emotion clustering, and songs similar to the current song in basic tone can be screened out to be used as recommendation lists.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic view of an application scenario of a music recommendation method based on natural language processing according to an embodiment of the present invention;
fig. 2 is a flow chart of a music recommendation method based on natural language processing according to an embodiment of the present invention;
FIG. 3 is a schematic sub-flowchart of a music recommendation method based on natural language processing according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a music recommendation method based on natural language processing according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another sub-flowchart of a music recommendation method based on natural language processing according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another sub-flowchart of a music recommendation method based on natural language processing according to an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a music recommendation device based on natural language processing according to an embodiment of the present invention;
FIG. 8 is a schematic block diagram of a subunit of a music recommendation device based on natural language processing according to an embodiment of the present invention;
FIG. 9 is another schematic block diagram of a music recommendation device based on natural language processing according to an embodiment of the present invention;
FIG. 10 is a schematic block diagram of another subunit of a music recommendation device based on natural language processing according to an embodiment of the present invention;
FIG. 11 is a schematic block diagram of another subunit of a music recommendation device based on natural language processing according to an embodiment of the present invention;
fig. 12 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic view of an application scenario of a music recommendation method based on natural language processing according to an embodiment of the present invention, and fig. 2 is a flowchart of a music recommendation method based on natural language processing according to an embodiment of the present invention, where the music recommendation method based on natural language processing is applied to a server, and the method is executed by application software installed in the server.
As shown in fig. 2, the method includes steps S110 to S150.
S110, obtaining song vectors corresponding to lyrics of each song in the history song list.
In this embodiment, the technical solution is described in terms of a server, that is, in this application, the recommendation of the music songs is performed on the user by the server, and the processing procedure of the recommendation is completed in the server.
In the server, if song recommendation needs to be performed for a user, the user's history song list must first be acquired. In the user's history song list, each song is stored together with its song ID, song name, song keyword information, song vector, song tag combination, song tag vector, and the like. Processing each song and storing this information provides the data basis for song recommendation. When the recommended song list is obtained in the server, it can be sent to the user side, and the user side caches songs for online playing according to the recommended song list.
In one embodiment, as shown in fig. 3, step S110 includes:
s111, word segmentation is carried out on the lyrics of each song in the history song list through a word segmentation model based on probability statistics, and word segmentation results corresponding to the lyrics of each song are obtained;
s112, extracting keyword information positioned before a preset first ranking value in word segmentation results corresponding to lyrics of each song through a word frequency-inverse text frequency index model to serve as a target keyword set corresponding to the lyrics of each song one by one;
S113, obtaining target word vectors corresponding to the keyword information in each target keyword set;
s114, obtaining song vectors corresponding to each target keyword set one by one according to each target word vector in each target keyword set and the weight corresponding to each target word vector.
In this embodiment, the word segmentation process of the lyrics of each song in the history song list through the statistical word segmentation model based on probability is as follows:
For example, let C = c1c2…cm be the Chinese character string to be segmented, let W = w1w2…wn be a segmentation result, and let Wa, Wb, ……, Wk be all possible segmentation schemes of C. Based on the probability statistics word segmentation model, the target word string W is found such that W satisfies: P(W|C) = max(P(Wa|C), P(Wb|C), …, P(Wk|C)); that is, the word string W obtained by the word segmentation model is the word string with the maximum estimated probability. Namely:
For a substring S to be segmented, all candidate words w1, w2, …, wi, …, wn are taken out in left-to-right order; the probability value P(wi) of each candidate word is looked up in the dictionary, and all left-neighbor words of each candidate word are recorded; the cumulative probability of each candidate word is calculated, and at the same time the best left-neighbor word of each candidate word is obtained by comparison; if the current word wn is the tail word of the character string S and the cumulative probability P(wn) is the maximum, then wn is the end word of S; starting from wn, the best left-neighbor word of each word is output in turn from right to left, which is exactly the word segmentation result of S.
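The maximum-probability segmentation described above is in essence a dynamic program over cumulative word probabilities that keeps the best left neighbor of each candidate word. Below is a minimal sketch under that reading; the toy dictionary, its probability values and the function name are illustrative assumptions, not taken from the patent:

```python
# Illustrative dictionary of word probabilities (assumed, not from the patent).
DICT_P = {"我": 0.05, "喜": 0.01, "喜欢": 0.04, "欢": 0.01,
          "音": 0.01, "音乐": 0.05, "乐": 0.02}

def segment(s: str, max_word_len: int = 4):
    """Maximum-probability word segmentation by dynamic programming."""
    n = len(s)
    best = [0.0] * (n + 1)   # best[i]: max cumulative probability of s[:i]
    best[0] = 1.0
    prev = [0] * (n + 1)     # prev[i]: start of the word ending at position i
    for i in range(1, n + 1):
        for j in range(max(0, i - max_word_len), i):
            w = s[j:i]       # candidate word ending at position i
            p = DICT_P.get(w)
            if p is not None and best[j] * p > best[i]:
                best[i] = best[j] * p   # cumulative probability via the best left neighbor
                prev[i] = j
    # Walk back from the tail word, outputting words from right to left.
    words, i = [], n
    while i > 0:
        words.append(s[prev[i]:i])
        i = prev[i]
    return list(reversed(words))

print(segment("我喜欢音乐"))  # -> ['我', '喜欢', '音乐']
```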
After the word segmentation result corresponding to the lyrics of each song in the history song list is obtained, keyword information ranked before a preset first ranking value in the word segmentation result is extracted as the target keyword set through a word frequency-inverse text frequency index model (i.e., a TF-IDF model; TF-IDF is short for Term Frequency-Inverse Document Frequency). Extracting the keyword information ranked before the preset ranking value from the word segmentation result through the TF-IDF model comprises the following specific steps:
1) Calculate the word frequency of each word i in the word segmentation result, denoted TF_i;
2) Calculate the inverse document frequency of each word i in the word segmentation result, denoted IDF_i.
When calculating the inverse document frequency IDF_i of each word i, a corpus (similar to the dictionary in the word segmentation process) is needed to simulate the usage environment of the language:
IDF_i = log(total number of documents in the corpus / (number of documents containing the word + 1));
The more common a word is, the larger the denominator and the smaller the inverse document frequency, approaching 0. The denominator is incremented by 1 to avoid a denominator of 0 (i.e., the case where no document contains the word).
3) Calculate the word frequency-inverse text frequency index TF-IDF_i = TF_i × IDF_i corresponding to each word i in the word segmentation result;
It is apparent that TF-IDF is proportional to the number of occurrences of a word in a document and inversely proportional to the number of its occurrences across the whole corpus. Automatic keyword extraction is therefore to calculate the TF-IDF value of each word of the document, arrange the words in descending order, and take the words ranked in the first N positions as the keyword list of the document.
4) Sort the word frequency-inverse text frequency indexes corresponding to the words in the word segmentation result in descending order, and take the words ranked before a preset first ranking value (for example, a preset first ranking value of 21) to form the target keyword set corresponding one-to-one to the lyrics of each song.
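The four steps above amount to scoring every token of a document by TF_i × IDF_i and keeping the top of the descending ranking. A minimal sketch, assuming the documents are already tokenized; the function and variable names are illustrative:

```python
import math
from collections import Counter

def top_keywords(doc_tokens, corpus_docs, n=20):
    """Rank the tokens of one document by TF-IDF, following steps 1)-4) above."""
    tf = Counter(doc_tokens)
    total = len(doc_tokens)
    scores = {}
    for word, count in tf.items():
        # Number of corpus documents containing the word; +1 avoids a zero denominator.
        containing = sum(1 for d in corpus_docs if word in d)
        idf = math.log(len(corpus_docs) / (containing + 1))
        scores[word] = (count / total) * idf
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]
```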
After the target keyword sets corresponding one-to-one to the lyrics of each song are obtained, the target word vector corresponding to each piece of keyword information in each target keyword set can be correspondingly obtained. The word vector corresponding to the keyword information is obtained by lookup in a pre-constructed vocabulary; the word vectors themselves are obtained through word2vec, which converts words in natural language into dense vectors that a computer can process. For example, in a corpus (i.e., vocabulary), the words AA, BB, CC and DD (where each of AA, BB, CC and DD represents a Chinese word) initially each correspond to a vector in which only one component is 1 and the rest are 0: the words are converted into discrete single symbols through One-Hot encoding, then reduced in dimension through Word2Vec into low-dimensional continuous values, i.e., dense vectors, with words of similar meaning mapped to nearby positions in the vector space.
Finally, according to the word frequency of each keyword in the word segmentation result, the weight corresponding to each target word vector can be obtained, and at the moment, according to each target word vector and the weight corresponding to each target word vector, the song vector corresponding to each target keyword set one by one is obtained. The specific calculation formula is as follows:
Vector_all = Σ_i Word_embedding(word_i) × Weight_i

where Vector_all refers to the song vector corresponding one-to-one to the lyrics of each song, Word_embedding(word_i) is target word vector i, and Weight_i is the weight corresponding to target word vector i. Through this process, the lyrics of each song can be converted into a multidimensional row vector or a multidimensional column vector, realizing the quantization of the lyrics of each song.
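A minimal sketch of the weighted sum above, assuming the target word vectors and their frequency-derived weights have already been computed; the function and parameter names are illustrative:

```python
import numpy as np

def song_vector(keywords, word_vectors, weights):
    """Vector_all = sum_i Word_embedding(word_i) * Weight_i."""
    dim = next(iter(word_vectors.values())).shape[0]
    vec = np.zeros(dim)
    for w in keywords:
        vec += word_vectors[w] * weights[w]  # weight each dense word vector
    return vec
```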
In one embodiment, as shown in fig. 4, step S110 further includes:
s101, constructing an initial deep neural network;
s102, taking a song vector corresponding to lyrics of each song in a training song list as input of the initial deep neural network, taking the marked emotion dimension value corresponding to each song as output of the initial deep neural network, and training the initial deep neural network to obtain a song emotion classification model.
In this embodiment, an initial deep neural network is constructed, that is, an initial multi-layer DNN fully-connected neural network (DNN is short for Deep Neural Networks). Its network structure, consisting of an input layer, three hidden layers and an output layer, is as follows:
First layer (input layer): 1 node, each node 1 × 256-dimensional;
Second layer (hidden layer): 100 nodes, each node 1 × 100-dimensional, dropout node proportion 20%, activation function relu;
Third layer (hidden layer): 100 nodes, each node 1 × 100-dimensional, dropout node proportion 20%, activation function relu;
Fourth layer (hidden layer): 20 nodes, each node 1 × 20-dimensional, activation function relu;
Fifth layer (output layer): 4 nodes, each node 1 × 1-dimensional, activation function sigmoid.
When the marked emotion dimension value corresponding to each song is used as the output of the initial deep neural network, there are four emotion dimensions: vitality, satisfaction, anxiety and depression. The emotion dimension value corresponding to vitality is 2; the emotion dimension value corresponding to satisfaction is 1; the emotion dimension value corresponding to anxiety is -1; and the emotion dimension value corresponding to depression is -2.
For example, the lyrics of song A1 correspond to song vector a1, e.g., a1 = [3, 4, 2, 1], where the second vector value 4 is the maximum, indicating that the emotion dimension of song A1 is satisfaction and its emotion dimension value is 1; the lyrics of song A2 correspond to song vector a2, e.g., a2 = [2, 3, 4, 1], where the third vector value 4 is the maximum, indicating that the emotion dimension of song A2 is anxiety and its emotion dimension value is -1; the lyrics of song A3 correspond to song vector a3, e.g., a3 = [1, 3, 2, 4], where the fourth vector value 4 is the maximum, indicating that the emotion dimension of song A3 is depression and its emotion dimension value is -2; the lyrics of song A4 correspond to song vector a4, e.g., a4 = [4, 3, 2, 1], where the first vector value 4 is the maximum, indicating that the emotion dimension of song A4 is vitality and its emotion dimension value is 2; ……; the lyrics of song AN correspond to song vector an, e.g., an = [4, 3, 2, 1], where the first vector value 4 is the maximum, indicating that the emotion dimension of song AN is vitality and its emotion dimension value is 2. The initial multi-layer DNN fully-connected neural network is trained with the above inputs and outputs to obtain the song emotion classification model. In this way, keyword extraction, emotion analysis and emotion clustering are performed on the lyric information of each song in the history song list through natural language processing technology, realizing text mining and emotion benchmark mining of the songs.
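A sketch of the five-layer fully connected classifier described above, assuming a Keras-style API; the optimizer, loss and the commented training call are illustrative assumptions, not specified by the patent:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(256,)),              # input layer: 1 x 256-dimensional song vector
    layers.Dense(100, activation="relu"),   # hidden layer, 100 nodes
    layers.Dropout(0.2),                    # 20% dropout node proportion
    layers.Dense(100, activation="relu"),   # hidden layer, 100 nodes
    layers.Dropout(0.2),
    layers.Dense(20, activation="relu"),    # hidden layer, 20 nodes
    layers.Dense(4, activation="sigmoid"),  # output layer: one node per emotion dimension
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(song_vectors, emotion_labels_one_hot, epochs=10)  # illustrative call
```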
S120, inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain emotion dimension values corresponding to each song in the history song list.
In this embodiment, in order to predict the emotion dimension value of the subsequent song according to the song included in the history song list, the emotion dimension value corresponding to each song in the history song list is acquired first, and then a song emotion prediction model is constructed according to the playing time and emotion dimension of each song in the history song list, so as to predict the emotion dimension value of the next playing song according to the currently playing song.
S130, taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain the song emotion prediction model.
In this embodiment, when the song emotion prediction model is trained according to the time sequence corresponding to the playing time of each song and the emotion dimension value corresponding to each song in the history song list, an LSTM (Long Short-Term Memory) model is first constructed.
The constructed initial LSTM model comprises an input layer, three LSTM layers and an output layer, as follows:
inputting time series data X;
entering LSTM 1 layer: the data of the current input node is 1 dimension, and the data of the current output node is 100 dimensions; the Dropout layer proportion is 20%;
entering LSTM 2 layer: the data of the current input node is 100 dimensions, and the data of the current output node is 100 dimensions; the Dropout layer proportion is 20%;
entering LSTM 3 layer: the data of the current input node is 100 dimensions, and the data of the current output node is 1 dimension;
output layer: outputting the predicted value for the next time step; (remark: the Dropout layer proportion refers to the fraction of randomly dropped nodes).
Since the playing time of each song in the current song list (the playing time is understood to be the corresponding system time when the song is played) is known, and the emotion dimension value corresponding to each song is also obtained through the song emotion classification model, the playing time and emotion dimension value corresponding to each song can be used as historical data. Specifically, a time sequence corresponding to playing time of each song is taken as input, an emotion dimension value corresponding to each song is taken as output, and an initial LSTM long and short memory model is trained to obtain a song emotion prediction model which is used for predicting the emotion dimension value of the song played in the next time sequence.
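A sketch of the three-LSTM-layer predictor described above, again assuming a Keras-style API; the sequence shape, optimizer and loss are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(None, 1)),             # time series X: one feature per time step
    layers.LSTM(100, return_sequences=True),  # LSTM layer 1: 1 -> 100 dimensions
    layers.Dropout(0.2),                      # Dropout layer proportion 20%
    layers.LSTM(100, return_sequences=True),  # LSTM layer 2: 100 -> 100 dimensions
    layers.Dropout(0.2),
    layers.LSTM(1),                           # LSTM layer 3: 100 -> 1 dimension
])
model.compile(optimizer="adam", loss="mse")
# model.fit(play_time_sequences, emotion_values)  # predicts the next emotion dimension value
```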
S140, inputting the current system time into a song emotion prediction model, and obtaining a target song emotion dimension value corresponding to the current system time.
In this embodiment, when the song emotion prediction model is obtained according to the time sequence corresponding to the playing time of each song in the history song list and the emotion dimension value corresponding to each song in the history song list, the target song emotion dimension value corresponding to the current system time can be predicted, and recommendation of the song emotion dimension value according to the history song listening habit of the user is realized.
S150, a recommended song list is correspondingly obtained according to the song type corresponding to the emotion value of the target song.
In an embodiment, as shown in fig. 5, as a first embodiment of step S150, step S150 includes:
s1511, acquiring an initial recommended song list with the same type as the song type from a pre-constructed song library according to the song type corresponding to the emotion value of the target song;
s1512, obtaining song information corresponding to the current song being played; wherein the song information at least comprises a song label combination;
s1513, calculating the coincidence degree value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song to form a first coincidence degree value set;
S1514, obtaining songs positioned in front of a preset first ranking value in the first contact ratio value set to form a recommended song list.
In this embodiment, once the song type corresponding to the target song emotion dimension value is determined, i.e., once the target song emotion dimension value (one of -2, -1, 1 or 2) is determined, the corresponding song types can be obtained: for example, the song types corresponding to an emotion dimension value of -2 are sad songs and silent songs, the song type corresponding to -1 is brooding songs, the song types corresponding to 1 are relaxed songs and quiet songs, and the song types corresponding to 2 are sweet songs and exciting songs.
After the song type corresponding to the emotion value of the target song is obtained, an initial recommended song list with the same type as the song type can be obtained from a pre-constructed song library. In order to further recommend songs according to the current song being played and the predicted song types, the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song can be calculated to form a first coincidence value set. When calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song, the calculation formula is as follows:
Tag1_similarity_i = |kw_i ∩ kw_1| / max{|kw_i|, |kw_1|}

where Tag1_similarity_i denotes the coincidence value of the song tag combination corresponding to song i in the initial recommended song list and the song tag combination of the current song, |kw_i ∩ kw_1| denotes the number of repeated tags between the song tag combination corresponding to song i in the initial recommended song list and the song tag combination of the current song, and max{|kw_i|, |kw_1|} denotes the maximum of the tag count |kw_i| of the song tag combination corresponding to song i in the initial recommended song list and the tag count |kw_1| of the song tag combination of the current song. After the first coincidence value set is obtained, the songs ranked before a preset first ranking value in the first coincidence value set are taken to form the recommended song list; in this way songs close in basic tone to the current song can be screened out as the recommendation list.
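A one-line sketch of the Tag1_similarity computation above; the inputs are assumed to be sets of tag strings:

```python
def tag_overlap(tags_i, tags_current):
    """Shared tags divided by the size of the larger tag combination."""
    kw_i, kw_1 = set(tags_i), set(tags_current)
    return len(kw_i & kw_1) / max(len(kw_i), len(kw_1))
```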
In an embodiment, as shown in fig. 6, as a second embodiment of step S150, step S150 includes:
s1521, acquiring an initial recommended song list with the same type as the song type from a pre-constructed song library according to the song type corresponding to the emotion value of the target song;
s1522, obtaining song information corresponding to the current song being played; wherein the song information at least comprises a song label combination;
S1523, calculating the coincidence degree value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song to form a first coincidence degree value set;
s1524, calculating the coincidence value of the song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song to form a second coincidence value set;
s1525, calculating the comprehensive similarity set corresponding to the initial recommended song list as: comprehensive similarity set = first coincidence value set × preset first weight + second coincidence value set × preset second weight;
s1526, songs, located before a preset first ranking value, in the integrated similarity set are obtained to form a recommended song list.
In the present embodiment, the process until the first set of coincidence values is acquired by calculation is the same as that in the first embodiment of step S150. When calculating the coincidence ratio value of the song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song, the calculation formula is as follows:
Tag2_similarity_i = (V_i · V_1) / (|V_i| × |V_1|)

where Tag2_similarity_i denotes the coincidence value of the song tag vector V_i corresponding to song i in the initial recommended song list and the song tag vector V_1 of the current song, i.e., the cosine of the included angle between the two song tag vectors. Preset first weight + preset second weight = 1, for example preset first weight = 0.5 and preset second weight = 0.5.
The comprehensive similarity of each song in the initial recommended song list to the current song is calculated by jointly considering the coincidence value of the song tag combination corresponding to each song in the initial recommended song list with the song tag combination of the current song and the coincidence value of the corresponding song tag vectors, so that the songs ranked before the preset first ranking value in the comprehensive similarity set are screened out to form the recommended song list.
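A minimal sketch of the combination just described, using the 0.5/0.5 example weights; representing song tags as numeric vectors for the cosine term is an assumption for illustration:

```python
import numpy as np

def comprehensive_similarity(tags_i, tags_current, vec_i, vec_current,
                             w1=0.5, w2=0.5):
    """w1 * Tag1_similarity + w2 * Tag2_similarity, with w1 + w2 = 1."""
    tag1 = tag_overlap(tags_i, tags_current)  # overlap of tag combinations (sketch above)
    tag2 = float(np.dot(vec_i, vec_current) /
                 (np.linalg.norm(vec_i) * np.linalg.norm(vec_current)))  # cosine of tag vectors
    return w1 * tag1 + w2 * tag2
```

Reusing tag_overlap from the earlier sketch keeps both coincidence values comparable before the weighting is applied.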
According to the method, text mining and emotion benchmark mining of songs are achieved through natural language processing technology, emotion analysis and emotion clustering, and songs similar to the current song in basic tone can be screened out to be used as recommendation lists.
The embodiment of the invention also provides a music recommendation device based on natural language processing, which is used for executing any embodiment of the music recommendation method based on natural language processing. Specifically, referring to fig. 7, fig. 7 is a schematic block diagram of a music recommendation device based on natural language processing according to an embodiment of the present invention. The music recommendation device 100 based on natural language processing may be configured in a server.
As shown in fig. 7, the music recommendation device 100 based on natural language processing includes a song vector acquisition unit 110, an emotion dimension value acquisition unit 120, a first model training unit 130, a target emotion dimension acquisition unit 140, and a recommendation list acquisition unit 150.
The song vector obtaining unit 110 is configured to obtain a song vector corresponding to lyrics of each song in the history song list.
In this embodiment, the technical solution is described in terms of a server, that is, in this application, the recommendation of the music songs is performed on the user by the server, and the processing procedure of the recommendation is completed in the server.
In the server, if song recommendation needs to be performed for a user, the user's history song list must first be acquired. In the user's history song list, each song is stored together with its song ID, song name, song keyword information, song vector, song tag combination, song tag vector, and the like. Processing each song and storing this information provides the data basis for song recommendation. When the recommended song list is obtained in the server, it can be sent to the user side, and the user side caches songs for online playing according to the recommended song list.
In one embodiment, as shown in fig. 8, the song vector acquisition unit 110 includes:
the word segmentation unit 111 is configured to segment the lyrics of each song in the history song list by using a word segmentation model based on probability statistics, so as to obtain a word segmentation result corresponding to the lyrics of each song;
a keyword extraction unit 112, configured to extract, through a word frequency-inverse text frequency index model, keyword information located before a preset first ranking value in a word segmentation result corresponding to lyrics of each song, as a target keyword set corresponding to the lyrics of each song one by one;
a target vector obtaining unit 113, configured to obtain a target word vector corresponding to each keyword information in each target keyword set;
the song vector calculating unit 114 is configured to obtain a song vector corresponding to each target keyword set one-to-one according to each target word vector in each target keyword set and the weight corresponding to each target word vector.
In this embodiment, the lyrics of each song in the history song list are segmented through the word segmentation model based on probability statistics. For a substring S to be segmented, all candidate words w1, w2, …, wi, …, wn are taken out in left-to-right order; the probability value P(wi) of each candidate word is looked up in the dictionary, and all left-neighbor words of each candidate word are recorded; the cumulative probability of each candidate word is calculated, and at the same time the best left-neighbor word of each candidate word is obtained by comparison; if the current word wn is the tail word of the character string S and the cumulative probability P(wn) is the maximum, then wn is the end word of S; starting from wn, the best left-neighbor word of each word is output in turn from right to left, which is exactly the word segmentation result of S.
After the word segmentation result corresponding to the lyrics of each song in the history song list is obtained, keyword information ranked before the preset first ranking value in the word segmentation result is extracted as the target keyword set through a word frequency-inverse text frequency index model (i.e., a TF-IDF model; TF-IDF is short for Term Frequency-Inverse Document Frequency), following the steps described above.
After the target keyword sets corresponding one-to-one to the lyrics of each song are obtained, the target word vector corresponding to each piece of keyword information in each target keyword set can be correspondingly obtained. The word vector corresponding to the keyword information is obtained by lookup in a pre-constructed vocabulary; the word vectors themselves are obtained through word2vec, which converts words in natural language into dense vectors that a computer can process. For example, in a corpus (i.e., vocabulary), the words AA, BB, CC and DD (where each of AA, BB, CC and DD represents a Chinese word) initially each correspond to a vector in which only one component is 1 and the rest are 0: the words are converted into discrete single symbols through One-Hot encoding, then reduced in dimension through Word2Vec into low-dimensional continuous values, i.e., dense vectors, with words of similar meaning mapped to nearby positions in the vector space.
Finally, according to the word frequency of each keyword in the word segmentation result, the weight corresponding to each target word vector can be obtained, and at the moment, according to each target word vector and the weight corresponding to each target word vector, the song vector corresponding to each target keyword set one by one is obtained. The specific calculation formula is as follows:
Vector_all = Σ_i Word_embedding(word_i) × Weight_i

where Vector_all refers to the song vector corresponding one-to-one to the lyrics of each song, Word_embedding(word_i) is target word vector i, and Weight_i is the weight corresponding to target word vector i. Through this process, the lyrics of each song can be converted into a multidimensional row vector or a multidimensional column vector, realizing the quantization of the lyrics of each song.
In one embodiment, as shown in fig. 9, the music recommendation device 100 based on natural language processing further includes:
an initial network construction unit 101 for constructing an initial deep neural network;
the second model training unit 102 is configured to use a song vector corresponding to lyrics of each song in the training song list as input of the initial deep neural network, use the labeled emotion dimension value corresponding to each song as output of the initial deep neural network, and train the initial deep neural network to obtain a song emotion classification model.
In this embodiment, an initial deep neural network is constructed, that is, an initial multi-layer DNN fully-connected neural network (DNN is short for Deep Neural Networks), whose network structure consists of an input layer, hidden layers and an output layer as described above. When the marked emotion dimension value corresponding to each song is used as the output of the initial deep neural network, there are four emotion dimensions: vitality, satisfaction, anxiety and depression; the emotion dimension value corresponding to vitality is 2; the emotion dimension value corresponding to satisfaction is 1; the emotion dimension value corresponding to anxiety is -1; and the emotion dimension value corresponding to depression is -2.
For example, the lyrics of song A1 correspond to song vector a1, e.g., a1 = [3, 4, 2, 1], where the second vector value 4 is the maximum, indicating that the emotion dimension of song A1 is satisfaction and its emotion dimension value is 1; the lyrics of song A2 correspond to song vector a2, e.g., a2 = [2, 3, 4, 1], where the third vector value 4 is the maximum, indicating that the emotion dimension of song A2 is anxiety and its emotion dimension value is -1; the lyrics of song A3 correspond to song vector a3, e.g., a3 = [1, 3, 2, 4], where the fourth vector value 4 is the maximum, indicating that the emotion dimension of song A3 is depression and its emotion dimension value is -2; the lyrics of song A4 correspond to song vector a4, e.g., a4 = [4, 3, 2, 1], where the first vector value 4 is the maximum, indicating that the emotion dimension of song A4 is vitality and its emotion dimension value is 2; ……; the lyrics of song AN correspond to song vector an, e.g., an = [4, 3, 2, 1], where the first vector value 4 is the maximum, indicating that the emotion dimension of song AN is vitality and its emotion dimension value is 2. The initial multi-layer DNN fully-connected neural network is trained with the above inputs and outputs to obtain the song emotion classification model. In this way, keyword extraction, emotion analysis and emotion clustering are performed on the lyric information of each song in the history song list through natural language processing technology, realizing text mining and emotion benchmark mining of the songs.
And the emotion dimension value obtaining unit 120 is configured to input a song vector corresponding to lyrics of each song in the history song list to a pre-trained song emotion classification model, so as to obtain an emotion dimension value corresponding to each song in the history song list.
In this embodiment, in order to predict the emotion dimension value of the subsequent song according to the song included in the history song list, the emotion dimension value corresponding to each song in the history song list is acquired first, and then a song emotion prediction model is constructed according to the playing time and emotion dimension of each song in the history song list, so as to predict the emotion dimension value of the next playing song according to the currently playing song.
The first model training unit 130 is configured to train the song emotion prediction model to be trained by using a time sequence corresponding to the playing time of each song in the history song list as an input of the song emotion prediction model to be trained and using an emotion dimension value corresponding to each song in the history song list as an output of the song emotion prediction model to be trained, so as to obtain the song emotion prediction model.
In this embodiment, when the song emotion prediction model is trained according to the time sequence corresponding to the playing time of each song and the emotion dimension value corresponding to each song in the history song list, an LSTM (Long Short-Term Memory) model is first constructed.
The constructed initial LSTM model comprises an input layer, three LSTM layers and an output layer. Since the playing time of each song in the current song list (the playing time being understood as the corresponding system time when the song is played) is known, and the emotion dimension value corresponding to each song has also been obtained through the song emotion classification model, the playing time and emotion dimension value corresponding to each song can be used as historical data. Specifically, the time sequence corresponding to the playing time of each song is taken as input, the emotion dimension value corresponding to each song is taken as output, and the initial LSTM model is trained to obtain the song emotion prediction model, which is used to predict the emotion dimension value of the song played at the next time step.
And the target emotion dimension obtaining unit 140 is configured to input the current system time into the song emotion prediction model, and obtain a target song emotion dimension value corresponding to the current system time.
In this embodiment, when the song emotion prediction model is obtained according to the time sequence corresponding to the playing time of each song in the history song list and the emotion dimension value corresponding to each song in the history song list, the target song emotion dimension value corresponding to the current system time can be predicted, and recommendation of the song emotion dimension value according to the history song listening habit of the user is realized.
And the recommended song list obtaining unit 150 is configured to obtain a recommended song list according to the song type corresponding to the emotion value of the target song.
In an embodiment, as shown in fig. 10, as a first embodiment of the recommendation list obtaining unit 150, the recommendation list obtaining unit 150 includes:
a first initial list obtaining unit 1511, configured to obtain, according to a song type corresponding to the emotion value of the target song, an initial recommended song list that is the same type as the song type in a pre-constructed song library;
a first song information acquiring unit 1512, configured to acquire song information corresponding to a current song being played; wherein the song information at least comprises a song label combination;
a contact ratio value set obtaining unit 1513, configured to calculate a contact ratio value of a song tag combination corresponding to each song in the initial recommended song list and a song tag combination of the current song, so as to form a first contact ratio value set;
and the first song sorting and filtering unit 1514 is configured to obtain songs located before a preset first ranking value in the first contact value set, so as to form a recommended song list.
In this embodiment, once the song type corresponding to the target song emotion dimension value is determined, i.e., once the target song emotion dimension value (one of -2, -1, 1 or 2) is determined, the corresponding song types can be obtained: for example, the song types corresponding to an emotion dimension value of -2 are sad songs and silent songs, the song type corresponding to -1 is brooding songs, the song types corresponding to 1 are relaxed songs and quiet songs, and the song types corresponding to 2 are sweet songs and exciting songs.
After the song type corresponding to the emotion value of the target song is obtained, an initial recommended song list with the same type as the song type can be obtained from a pre-constructed song library. In order to further recommend songs according to the current song being played and the predicted song types, the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song can be calculated to form a first coincidence value set. When calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song, the calculation formula is as follows:
Tag1_similarity_i = |kw_i ∩ kw_1| / max{|kw_i|, |kw_1|}

where Tag1_similarity_i denotes the coincidence value of the song tag combination corresponding to song i in the initial recommended song list and the song tag combination of the current song, |kw_i ∩ kw_1| denotes the number of repeated tags between the song tag combination corresponding to song i in the initial recommended song list and the song tag combination of the current song, and max{|kw_i|, |kw_1|} denotes the maximum of the tag count |kw_i| of the song tag combination corresponding to song i in the initial recommended song list and the tag count |kw_1| of the song tag combination of the current song. After the first coincidence value set is obtained, the songs ranked before a preset first ranking value in the first coincidence value set are taken to form the recommended song list; in this way songs close in basic tone to the current song can be screened out as the recommendation list.
In an embodiment, as shown in fig. 11, as a second embodiment of the recommendation list obtaining unit 150, the recommendation list obtaining unit 150 includes:
a second initial list obtaining unit 1521, configured to obtain, according to a song type corresponding to the emotion value of the target song, an initial recommended song list having the same type as the song type in a pre-constructed song library;
a second song information acquisition unit 1522 configured to acquire song information corresponding to a current song being played; wherein the song information at least comprises a song label combination;
a first set obtaining unit 1523, configured to calculate a coincidence value of a song tag combination corresponding to each song in the initial recommended song list and a song tag combination of the current song, so as to form a first coincidence value set;
a second set obtaining unit 1524, configured to calculate a coincidence value of a song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song, so as to form a second coincidence value set;
A similarity set obtaining unit 1525, configured to calculate the comprehensive similarity set corresponding to the initial recommended song list as: comprehensive similarity set = first coincidence value set × preset first weight + second coincidence value set × preset second weight;
the second song sorting filtering unit 1526 is configured to obtain songs in the integrated similarity set that are located before the preset first ranking value, so as to form a recommended song list.
In the present embodiment, the process until the first set of coincidence values is acquired by calculation is the same as that in the first embodiment of step S150. When calculating the coincidence ratio value of the song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song, the calculation formula is as follows:
Tag2_similarity_i = (V_i · V_1) / (|V_i| × |V_1|)

where Tag2_similarity_i denotes the coincidence value of the song tag vector V_i corresponding to song i in the initial recommended song list and the song tag vector V_1 of the current song, i.e., the cosine of the included angle between the two song tag vectors. Preset first weight + preset second weight = 1, for example preset first weight = 0.5 and preset second weight = 0.5.
The comprehensive similarity of each song in the initial recommended song list to the current song is calculated by jointly considering the coincidence value of the song tag combination corresponding to each song in the initial recommended song list with the song tag combination of the current song and the coincidence value of the corresponding song tag vectors, so that the songs ranked before the preset first ranking value in the comprehensive similarity set are screened out to form the recommended song list.
The device realizes text mining and emotion benchmark mining of songs through natural language processing technology, emotion analysis and emotion clustering, and can screen out songs similar to the current song in tone as a recommendation list.
The music recommendation apparatus based on natural language processing described above may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 12.
Referring to fig. 12, fig. 12 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 500 is a server. The server may be an independent server or a server cluster formed by a plurality of servers.
With reference to FIG. 12, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032, when executed, may cause the processor 502 to perform a music recommendation method based on natural language processing.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform a natural language processing based music recommendation method.
The network interface 505 is used for network communication, such as transmitting data information. Those skilled in the art will appreciate that the structure shown in FIG. 12 is merely a block diagram of part of the structure related to the present invention and does not constitute a limitation on the computer device 500 to which the present invention is applied; a particular computer device 500 may include more or fewer components than shown, combine certain components, or arrange the components differently.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to perform the following functions: obtaining song vectors corresponding to lyrics of each song in a history song list; inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list; taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model; inputting the current system time into a song emotion prediction model, and obtaining a target song emotion dimension value corresponding to the current system time; and correspondingly acquiring a recommended song list according to the song type corresponding to the emotion value of the target song.
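As a hedged illustration of the prediction step, the sketch below trains a regressor from an hour-of-day feature derived from each playing time to the song's emotion dimension value and queries it with the current system time; this passage does not fix a model family, so the scikit-learn regressor, the feature encoding and all names here are assumptions rather than the patent's method.

```python
from datetime import datetime
from sklearn.ensemble import RandomForestRegressor

def train_emotion_predictor(play_times, emotion_values):
    """play_times: datetimes from the history song list; emotion_values: matching emotion dimension values."""
    X = [[t.hour + t.minute / 60.0] for t in play_times]  # time-of-day feature
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, emotion_values)
    return model

def predict_target_emotion(model, now=None):
    """Target song emotion dimension value for the current system time."""
    now = now or datetime.now()
    return model.predict([[now.hour + now.minute / 60.0]])[0]
```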
In one embodiment, the processor 502 performs the following operations when performing the step of obtaining a song vector corresponding to lyrics of each song in the history song list: word segmentation is carried out on the lyrics of each song in the history song list through a word segmentation model based on probability statistics, so that word segmentation results corresponding to the lyrics of each song are obtained; extracting keyword information positioned before a preset first ranking value in word segmentation results corresponding to lyrics of each song through a word frequency-inverse text frequency index model to serve as a target keyword set corresponding to the lyrics of each song one by one; obtaining target word vectors corresponding to the keyword information in each target keyword set; and obtaining song vectors corresponding to each target keyword set one by one according to each target word vector in each target keyword set and the weight corresponding to each target word vector.
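A minimal sketch of this lyrics-to-song-vector step follows, assuming jieba as the probability-statistics word segmentation model, scikit-learn's TfidfVectorizer as the word frequency-inverse text frequency index model, the TF-IDF scores as the per-keyword weights, and a pre-trained word_vectors mapping; each of these concrete choices is an assumption.

```python
import jieba
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def song_vectors(all_lyrics, word_vectors, top_n=20):
    """all_lyrics: lyric strings; word_vectors: word -> np.ndarray embedding."""
    segmented = [" ".join(jieba.cut(text)) for text in all_lyrics]
    # Keep whitespace-delimited tokens so single-character words survive.
    tfidf = TfidfVectorizer(token_pattern=r"(?u)\S+")
    matrix = tfidf.fit_transform(segmented).toarray()
    vocab = np.array(tfidf.get_feature_names_out())
    dim = len(next(iter(word_vectors.values())))
    out = []
    for row in matrix:
        top = row.argsort()[::-1][:top_n]  # keywords before the preset ranking value
        pairs = [(word_vectors[w], row[i])
                 for i, w in zip(top, vocab[top]) if w in word_vectors]
        if pairs:
            vecs, weights = zip(*pairs)
            # Song vector = TF-IDF-weighted average of the keyword word vectors.
            out.append(np.average(np.array(vecs), axis=0, weights=np.array(weights)))
        else:
            out.append(np.zeros(dim))
    return out
```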
In one embodiment, before performing the step of obtaining a song vector corresponding to lyrics of each song in the history song list, the processor 502 further performs the following operations: constructing an initial deep neural network; and taking a song vector corresponding to the lyrics of each song in the training song list as input of the initial deep neural network, taking the marked emotion dimension value corresponding to each song as output of the initial deep neural network, and training the initial deep neural network to obtain a song emotion classification model.
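A minimal sketch of such an initial deep neural network, assuming Keras, a 300-dimensional song vector and four discrete emotion dimension values; the layer sizes and hyperparameters are illustrative, not taken from the patent.

```python
import numpy as np
from tensorflow import keras

def build_emotion_classifier(input_dim=300, num_emotions=4):
    """Song vector in, probability over emotion dimension values out."""
    model = keras.Sequential([
        keras.layers.Input(shape=(input_dim,)),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(num_emotions, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training on the labelled training song list (song_vecs and labels assumed given):
# model = build_emotion_classifier()
# model.fit(np.stack(song_vecs), np.array(labels), epochs=20, batch_size=32)
```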
In one embodiment, when performing the step of correspondingly obtaining a recommended song list according to the song type corresponding to the target song emotion dimension value, the processor 502 performs the following operations: obtaining, from a pre-constructed song library, an initial recommended song list of the same type as the song type corresponding to the target song emotion dimension value; acquiring song information corresponding to the current song being played, wherein the song information at least comprises a song label combination; calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song, so as to form a first coincidence value set; and obtaining the songs ranked before a preset first ranking value in the first coincidence value set to form a recommended song list.
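The passage does not spell out the overlap metric for song label combinations, so the sketch below uses Jaccard overlap of the two tag sets as an assumed stand-in for this coincidence value.

```python
def tag1_coincidence(candidate_tags: set, current_tags: set) -> float:
    """Overlap of a candidate song's tag combination with the current song's."""
    union = candidate_tags | current_tags
    return len(candidate_tags & current_tags) / len(union) if union else 0.0
```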
In another embodiment, when performing the step of correspondingly obtaining a recommended song list according to the song type corresponding to the target song emotion dimension value, the processor 502 performs the following operations: obtaining, from a pre-constructed song library, an initial recommended song list of the same type as the song type corresponding to the target song emotion dimension value; acquiring song information corresponding to the current song being played, wherein the song information at least comprises a song label combination; calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song to form a first coincidence value set; calculating the coincidence value of the song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song to form a second coincidence value set; calculating a comprehensive similarity set corresponding to the initial recommended song list according to comprehensive similarity set = first coincidence value set × preset first weight + second coincidence value set × preset second weight; and obtaining the songs ranked before a preset first ranking value in the comprehensive similarity set to form a recommended song list.
Those skilled in the art will appreciate that the embodiment of the computer device shown in fig. 12 does not limit the specific construction of the computer device; in other embodiments, the computer device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, in some embodiments the computer device may include only a memory and a processor, in which case the structure and function of the memory and the processor are consistent with the embodiment shown in fig. 12 and are not described again.
It should be appreciated that in an embodiment of the invention, the processor 502 may be a central processing unit (Central Processing Unit, CPU); the processor 502 may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor, or the processor may be any conventional processor.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer readable storage medium may be a non-volatile computer readable storage medium. The computer readable storage medium stores a computer program, wherein the computer program when executed by a processor performs the steps of: obtaining song vectors corresponding to lyrics of each song in a history song list; inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list; taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model; inputting the current system time into a song emotion prediction model, and obtaining a target song emotion dimension value corresponding to the current system time; and correspondingly acquiring a recommended song list according to the song type corresponding to the emotion value of the target song.
In an embodiment, the obtaining a song vector corresponding to lyrics of each song in the historical song list includes: word segmentation is carried out on the lyrics of each song in the history song list through a word segmentation model based on probability statistics, so that word segmentation results corresponding to the lyrics of each song are obtained; extracting keyword information positioned before a preset first ranking value in word segmentation results corresponding to lyrics of each song through a word frequency-inverse text frequency index model to serve as a target keyword set corresponding to the lyrics of each song one by one; obtaining target word vectors corresponding to the keyword information in each target keyword set; and obtaining song vectors corresponding to each target keyword set one by one according to each target word vector in each target keyword set and the weight corresponding to each target word vector.
In an embodiment, before obtaining the song vector corresponding to the lyrics of each song in the history song list, the method further includes: constructing an initial deep neural network; and taking a song vector corresponding to the lyrics of each song in the training song list as input of the initial deep neural network, taking the marked emotion dimension value corresponding to each song as output of the initial deep neural network, and training the initial deep neural network to obtain a song emotion classification model.
In an embodiment, the correspondingly obtaining a recommended song list according to the song type corresponding to the target song emotion dimension value includes: obtaining, from a pre-constructed song library, an initial recommended song list of the same type as the song type corresponding to the target song emotion dimension value; acquiring song information corresponding to the current song being played, wherein the song information at least comprises a song label combination; calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song to form a first coincidence value set; and obtaining the songs ranked before a preset first ranking value in the first coincidence value set to form a recommended song list.
In another embodiment, the correspondingly obtaining a recommended song list according to the song type corresponding to the target song emotion dimension value includes: obtaining, from a pre-constructed song library, an initial recommended song list of the same type as the song type corresponding to the target song emotion dimension value; acquiring song information corresponding to the current song being played, wherein the song information at least comprises a song label combination; calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song to form a first coincidence value set; calculating the coincidence value of the song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song to form a second coincidence value set; calculating a comprehensive similarity set corresponding to the initial recommended song list according to comprehensive similarity set = first coincidence value set × preset first weight + second coincidence value set × preset second weight; and obtaining the songs ranked before a preset first ranking value in the comprehensive similarity set to form a recommended song list.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus, device and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two; to clearly illustrate this interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether such functions are implemented in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the units is merely a logical function division, and there may be other division manners in actual implementation; units having the same function may be integrated into one unit; multiple units or components may be combined or integrated into another system; and some features may be omitted or not performed. In addition, the couplings, direct couplings or communication connections shown or discussed herein may be indirect couplings or communication connections via some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present invention.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units may be stored in a storage medium if implemented in the form of software functional units and sold or used as stand-alone products. Based on such understanding, the technical solution of the present invention is essentially or a part contributing to the prior art, or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made and equivalents will be apparent to those skilled in the art without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (7)

1. A music recommendation method based on natural language processing, comprising:
obtaining song vectors corresponding to lyrics of each song in a history song list;
inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list;
taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model;
inputting the current system time into the song emotion prediction model, and obtaining a target song emotion dimension value corresponding to the current system time; and
according to the song types corresponding to the emotion values of the target songs, a recommended song list is correspondingly obtained;
the corresponding obtaining of the recommended song list according to the song type corresponding to the emotion value of the target song includes:
according to the song types corresponding to the emotion values of the target songs, an initial recommended song list with the same type as the song types is obtained from a pre-constructed song library;
acquiring song information corresponding to a current song being played; wherein the song information at least comprises a song label combination;
calculating the coincidence value of the song label combination corresponding to each song in the initial recommended song list and the song label combination of the current song to form a first coincidence value set;
calculating the coincidence value of the song label vector corresponding to each song in the initial recommended song list and the song label vector of the current song to form a second coincidence value set;
calculating a comprehensive similarity set corresponding to the initial recommended song list according to comprehensive similarity set = first coincidence value set × preset first weight + second coincidence value set × preset second weight; and
obtaining the songs ranked before a preset first ranking value in the comprehensive similarity set to form a recommended song list.
2. The method for recommending music based on natural language processing according to claim 1, wherein the obtaining a song vector corresponding to lyrics of each song in the history song list comprises:
word segmentation is carried out on the lyrics of each song in the history song list through a word segmentation model based on probability statistics, so that word segmentation results corresponding to the lyrics of each song are obtained;
extracting keyword information positioned before a preset first ranking value in word segmentation results corresponding to lyrics of each song through a word frequency-inverse text frequency index model to serve as a target keyword set corresponding to the lyrics of each song one by one;
obtaining target word vectors corresponding to the keyword information in each target keyword set;
and obtaining song vectors corresponding to each target keyword set one by one according to each target word vector in each target keyword set and the weight corresponding to each target word vector.
3. The method for recommending music based on natural language processing according to claim 1, wherein before obtaining a song vector corresponding to lyrics of each song in the history song list, further comprising:
constructing an initial deep neural network;
and taking a song vector corresponding to the lyrics of each song in the training song list as input of the initial deep neural network, taking the marked emotion dimension value corresponding to each song as output of the initial deep neural network, and training the initial deep neural network to obtain a song emotion classification model.
4. A music recommendation device based on natural language processing, comprising:
a song vector obtaining unit, configured to obtain a song vector corresponding to lyrics of each song in the history song list;
the emotion dimension value acquisition unit is used for inputting a song vector corresponding to the lyrics of each song in the history song list to a pre-trained song emotion classification model to obtain an emotion dimension value corresponding to each song in the history song list;
the first model training unit is used for taking a time sequence corresponding to the playing time of each song in the history song list as input of a song emotion prediction model to be trained, taking an emotion dimension value corresponding to each song in the history song list as output of the song emotion prediction model to be trained, and training the song emotion prediction model to be trained to obtain a song emotion prediction model;
the target emotion dimension acquisition unit is used for inputting the current system time into the song emotion prediction model and acquiring a target song emotion dimension value corresponding to the current system time; and
the recommended song list acquisition unit is used for correspondingly acquiring a recommended song list according to the song type corresponding to the emotion value of the target song;
the recommendation list obtaining unit includes:
the second initial list acquisition unit is used for acquiring an initial recommended song list with the same type as the song type from a pre-constructed song library according to the song type corresponding to the emotion value of the target song;
a second song information acquisition unit, configured to acquire song information corresponding to a current song being played; wherein the song information at least comprises a song label combination;
a first set obtaining unit, configured to calculate a coincidence value of a song label combination corresponding to each song in the initial recommended song list and a song label combination of the current song, so as to form a first coincidence value set;
a second set obtaining unit, configured to calculate a coincidence value of a song label vector corresponding to each song in the initial recommended song list and a song label vector of the current song, so as to form a second coincidence value set;
a similarity set obtaining unit, configured to calculate a comprehensive similarity set corresponding to the initial recommended song list according to comprehensive similarity set = first coincidence value set × preset first weight + second coincidence value set × preset second weight;
and the second song sorting and screening unit is used for acquiring the songs ranked before a preset first ranking value in the comprehensive similarity set so as to form a recommended song list.
5. The music recommendation device based on natural language processing according to claim 4, wherein the song vector acquisition unit comprises:
the word segmentation unit is used for segmenting the lyrics of each song in the history song list through a word segmentation model based on probability statistics to obtain word segmentation results corresponding to the lyrics of each song;
the keyword extraction unit is used for extracting keyword information positioned before a preset first ranking value in a word segmentation result corresponding to the lyrics of each song through a word frequency-inverse text frequency index model to serve as a target keyword set corresponding to the lyrics of each song one by one;
the target vector acquisition unit is used for acquiring target word vectors corresponding to the keyword information in each target keyword set;
and the song vector calculation unit is used for acquiring the song vectors corresponding to each target keyword set one by one according to each target word vector in each target keyword set and the weight corresponding to each target word vector.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements a natural language processing based music recommendation method as claimed in any one of claims 1 to 3 when the computer program is executed.
7. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the natural language processing based music recommendation method according to any one of claims 1 to 3.
CN201910012372.9A 2019-01-07 2019-01-07 Music recommendation method and device based on natural language processing and computer equipment Active CN109885722B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910012372.9A CN109885722B (en) 2019-01-07 2019-01-07 Music recommendation method and device based on natural language processing and computer equipment

Publications (2)

Publication Number Publication Date
CN109885722A (en) 2019-06-14
CN109885722B (en) 2023-07-04

Family

ID=66925618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910012372.9A Active CN109885722B (en) 2019-01-07 2019-01-07 Music recommendation method and device based on natural language processing and computer equipment

Country Status (1)

Country Link
CN (1) CN109885722B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737414A (en) * 2020-06-04 2020-10-02 腾讯音乐娱乐科技(深圳)有限公司 Song recommendation method and device, server and storage medium
CN111695036B (en) * 2020-06-11 2024-03-08 北京百度网讯科技有限公司 Content recommendation method and device
CN111949866B (en) * 2020-08-10 2024-02-02 广州汽车集团股份有限公司 Application recommendation processing method and device
CN112804080B (en) * 2020-12-24 2022-09-30 中国科学院信息工程研究所 Intelligent recommendation method for access control initialization
CN113010728A (en) * 2021-04-06 2021-06-22 金宝贝网络科技(苏州)有限公司 Song recommendation method, system, intelligent device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104991900A (en) * 2015-06-09 2015-10-21 腾讯科技(深圳)有限公司 Method and apparatus for pushing music data
CN105303397A (en) * 2015-09-28 2016-02-03 百度在线网络技术(北京)有限公司 Information recommendation method, system, server end and client
CN105718510A (en) * 2016-01-12 2016-06-29 海信集团有限公司 Multimedia data recommendation method and device
JP2016194614A (en) * 2015-03-31 2016-11-17 株式会社エクシング Music recommendation system, program, and music recommendation method
CN107133232A (en) * 2016-02-29 2017-09-05 惠州华阳通用电子有限公司 A kind of vehicle-mounted Online Music recommends method and device
CN108804538A (en) * 2018-05-06 2018-11-13 深圳市保千里电子有限公司 Music based on the period recommends method




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant