CN113010728A - Song recommendation method, system, intelligent device and storage medium - Google Patents

Info

Publication number
CN113010728A
Authority
CN
China
Prior art keywords
song
current
classification
songs
spectrogram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110367621.3A
Other languages
Chinese (zh)
Inventor
叶建仲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinbaobei Network Technology Suzhou Co ltd
Original Assignee
Jinbaobei Network Technology Suzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinbaobei Network Technology Suzhou Co ltd
Priority to CN202110367621.3A
Publication of CN113010728A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G06F16/65 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a song recommendation method, a song recommendation system, an intelligent device and a storage medium. The method comprises the following steps: subdividing songs of a target type into fine categories; acquiring a plurality of song files under each fine category, and converting the song files into training spectrograms; establishing a song classification model, and training the song classification model with the training spectrograms; acquiring a current song selected by a user under the target type, inputting a spectrogram corresponding to the current song into the trained song classification model, and obtaining the current fine category of the current song; and recommending songs according to the current fine category. According to the scheme, similar songs can be recommended according to the characteristics of the songs without acquiring user behavior or other user data; the recommendation precision is higher, the range of application is wider, and the needs of different groups of users can be met.

Description

Song recommendation method, system, intelligent device and storage medium
Technical Field
The invention relates to the technical field of children's song recommendation, and in particular to a song recommendation method, a song recommendation system, an intelligent device and a storage medium.
Background
Many song recommendation methods exist. The mainstream music recommendation technology, used by services such as Spotify and Xiami Music, applies collaborative filtering algorithms to classify and evaluate large amounts of user behavior data, social data, user profiles and other data. However, this approach is not suitable for many vertical or relatively small user groups, such as listeners of children's songs: because there is little or no user data, recommendations either cannot be made from user behavior and similar data or have low accuracy. Therefore, a song recommendation method with higher accuracy that does not need to acquire user behavior or other user data is required.
Disclosure of Invention
The invention aims to provide a song recommendation method, a song recommendation system, an intelligent device and a storage medium. The scheme can recommend similar songs according to the characteristics of the songs without acquiring user behavior or other user data; the recommendation precision is higher, the range of application is wider, and the needs of different groups of users can be met.
The technical scheme provided by the invention is as follows:
the invention provides a song recommending method, which comprises the following steps:
finely classifying the target type songs;
acquiring a plurality of song files under each subdivision category, and converting the song files into a training spectrogram;
establishing a song classification model, and training the song classification model through the training spectrogram;
acquiring a current song selected by a user under a target type, inputting a spectrogram corresponding to the current song into the trained song classification model, and acquiring a current fine classification corresponding to the current song;
and recommending songs according to the current fine classification.
Specifically, a broad type of song can be subdivided. For example, children's songs can be classified into 21 fine categories: cradle songs, game songs, digital songs, question and answer songs, passwords, chain tunes, riddle songs, reverse songs, capitals songs, story categories, lyrics categories, kindergartens, antenatal music, sleeping songs, ear grinding, animation songs, bevaco songs, sanji songs, ancient poems, English children's songs, and classical music. Of course, classification may also be made according to other criteria.
When a plurality of song files under each fine category are obtained as a training set, a larger number of song files should be selected for each fine category in order to improve the accuracy of the model. Taking the above children's songs as an example, in the present embodiment 50 MP3 song files are selected for each fine category, for a total of 1050 MP3 files.
An audio file contains a large amount of information. In order to extract features and remove noise, the audio can be converted into image form: the audio signal is converted into the frequency domain using the Fourier transform, and each of the 1050 MP3 audio files is processed in this way and converted into a spectrogram. A spectrogram is a visual representation of the frequency spectrum of a sound over time; the shade of color at each point in the spectrogram represents the magnitude of the sound at that frequency and time.
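The conversion described above can be sketched with a short-time Fourier transform. This is a minimal illustration and not the implementation used in the embodiment: it assumes the MP3 file has already been decoded into a one-dimensional array of samples, and the function name `spectrogram`, the frame length of 512 samples and the hop of 256 samples are hypothetical choices that do not come from the text.

```python
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Return a magnitude spectrogram in dB, shaped (freq_bins, n_frames)."""
    window = np.hanning(frame_len)                      # taper each frame to limit leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window   # overlapping windowed frames
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))     # Fourier transform of every frame
    return 20.0 * np.log10(magnitude + 1e-10).T         # small offset avoids log(0)

# Stand-in for decoded audio: a 1 kHz test tone, 1 second at 8 kHz (assumed values).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())              # strongest frequency row
peak_hz = peak_bin * sample_rate / 512
```

Each column of `spec` is the spectrum of one short slice of time, so rendering the array as an image gives the kind of picture that is later fed to the classifier; for the test tone the brightest row sits at the tone's frequency.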
In this embodiment, the song classification model is a convolutional neural network model built with TensorFlow. During classification, the spectrogram image is converted into a numeric matrix representing the color of each pixel; the data is then processed by convolutional layers, pooling layers, fully connected layers and the like, and passed to a softmax classifier. The classifier outputs a vector of numbers (for example, 21 numbers corresponding to the 21 fine categories of children's songs) that contains the probability the convolutional neural network model assigns to each fine category for the spectrogram, and the category with the maximum probability is selected as the final identified classification. In other embodiments, other neural network models may be used to classify songs.
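The final step of that pipeline, turning the network's raw scores into probabilities and picking the most likely fine category, can be illustrated as follows. This sketch covers only the softmax and maximum-probability selection; the convolutional, pooling and fully connected layers are not reproduced, and the `logits` values are made-up stand-ins for the output of the last layer.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

NUM_CATEGORIES = 21                     # one score per fine category, as in the text

# Hypothetical raw scores: every category scores 0 except index 7.
logits = np.zeros(NUM_CATEGORIES)
logits[7] = 5.0

probs = softmax(logits)                 # probability assigned to each fine category
predicted = int(np.argmax(probs))       # category with the maximum probability
```

Here `predicted` is the index of the fine category that would be reported as the final classification for the spectrogram.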
By subdividing songs of the target type into fine categories, acquiring a plurality of song files under each fine category, and converting the song files into training spectrograms, the song classification model can be trained. When a user selects a song under the target type, the spectrogram corresponding to the current song is input into the trained song classification model to obtain the current fine category of the current song, and songs are recommended according to that fine category. Because the scheme recommends similar songs according to the characteristics of the songs and does not need to acquire user behavior or other user data, its recommendation precision is higher, its range of application is wider, and the needs of different groups of users can be met.
Further, after the song classification model is trained through the training spectrogram, the method further comprises the following steps:
performing fine classification on all the songs in the stored target type through the trained song classification model;
establishing a song similarity model for each fine category, and extracting the feature vectors of all songs under each fine category through the song similarity model;
after obtaining the current fine classification corresponding to the current song, the method further comprises the following steps:
extracting a feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification;
and comparing the feature vector corresponding to the current song with the feature vectors of all the songs under the current fine classification to obtain the target song with the highest similarity and recommend the target song.
Specifically, in order to recommend a target song more accurately, after the song classification model is trained, all stored songs of the target type are assigned to fine categories by the model, a song similarity model is established for each fine category, and the feature vectors of all songs under each fine category are extracted through the corresponding song similarity model. After the fine category of the current song is determined, the feature vector of the current song is extracted through the song similarity model of that fine category and compared with the feature vectors of all songs under that fine category, so that the target song with the highest similarity can be obtained and recommended, which improves recommendation accuracy.
Further, comparing the feature vector corresponding to the current song with the feature vectors of all songs in the current fine category through a cosine similarity algorithm;
and recommending the song with the maximum cosine similarity as a target song.
Specifically, the feature vector corresponding to the current song may be compared with the feature vectors of all songs in the current fine category through a cosine similarity algorithm. The cosine similarity algorithm places the vectors in a vector space according to their coordinate values, obtains the angle between them, and computes the cosine of that angle, which measures the similarity of the two vectors: the smaller the angle, the closer the cosine is to 1 and the more similar the two vectors are. Cosine similarity can therefore be used as the main index for comparing the similarity of songs, and the song with the maximum cosine similarity is recommended as the target song.
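For reference, the standard definition of cosine similarity between two feature vectors is written out below. The symbols a, b and n (the vector dimension) are notation introduced here for illustration and do not appear in the original text.

```latex
\cos\theta
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}}\;\sqrt{\sum_{i=1}^{n} b_i^{2}}}

% Worked example with a = (1, 0) and b = (1, 1):
\cos\theta = \frac{1\cdot 1 + 0\cdot 1}{\sqrt{1}\,\sqrt{2}}
           = \frac{1}{\sqrt{2}} \approx 0.707,
\qquad \theta = 45^{\circ}
```

Two vectors pointing in the same direction give a value of 1 regardless of their lengths, which is why the measure compares the direction of the feature vectors rather than their magnitude.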
Further, the obtaining a plurality of song files under each fine category and converting the song files into a training spectrogram specifically includes:
extracting audio of a preset time period in the song file;
converting the audio of the preset time period into a complete spectrogram;
dividing the complete spectrogram into a preset number of training spectrograms in equal parts;
the inputting the spectrogram corresponding to the current song into the trained song classification model specifically includes:
extracting the audio of the current song in a preset time period;
converting the audio of the current song in the preset time period into a preset number of spectrograms;
and inputting the preset number of spectrograms corresponding to the current song into the trained song classification model.
Specifically, in order to ensure the uniformity of the data, the audio of a preset time period in the song file can be extracted and converted into a complete spectrogram, and the complete spectrogram is divided equally into a preset number of training spectrograms. Taking the above classification of children's songs as an example, about 2 minutes of audio may be extracted from each song and converted into a spectrogram, and the picture is then divided into square pictures of 256 × 256 pixels, each covering about 5 seconds of audio, so that one song yields about 24 spectrograms. After the division there are about 25200 pictures in total (1050 songs × 24), and each picture is labeled with its fine category. Splitting each song into 24 spectrograms can improve the accuracy of model training and reduces the influence of chance. Of course, in other embodiments, the spectrogram may be split differently according to actual requirements.
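The splitting step and the resulting dataset size can be checked with a short sketch. It assumes that the complete spectrogram is stored as a 2-D array that is 256 pixels tall and 24 × 256 pixels wide, and that any incomplete trailing tile is dropped; the helper name `split_into_tiles` and both assumptions are illustrative and are not stated in the text.

```python
import numpy as np

def split_into_tiles(spec, tile_width=256):
    """Cut a (height, width) spectrogram into equal-width tiles along the time axis."""
    n_tiles = spec.shape[1] // tile_width           # drop any incomplete trailing tile
    return [spec[:, i * tile_width:(i + 1) * tile_width] for i in range(n_tiles)]

# Placeholder for the complete spectrogram of roughly 2 minutes of audio.
full_spectrogram = np.zeros((256, 24 * 256))

tiles = split_into_tiles(full_spectrogram)
tiles_per_song = len(tiles)

# Dataset size from the figures in the text: 21 fine categories x 50 songs each.
total_tiles = 21 * 50 * tiles_per_song
```

With these figures each song contributes 24 square tiles, and the 1050 songs together give the 25200 labeled pictures mentioned above.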
Further, the extracting feature vectors of all songs under each fine category through the song similarity model specifically includes:
extracting the audio of all songs in each fine category in a preset time period;
converting the audio of the preset time period corresponding to all songs in each fine category into a preset number of spectrograms;
extracting, through the song similarity model, a preset number of feature vectors corresponding to the preset number of spectrograms of each song under each fine category;
calculating an average vector of each song under each fine category;
the extracting of the feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification specifically includes:
extracting a preset number of feature vectors corresponding to a preset number of spectrograms of the current song through the song similarity model corresponding to the current fine classification;
calculating an average vector corresponding to the current song;
comparing the feature vector corresponding to the current song with the feature vectors of all songs under the current fine classification specifically comprises:
and comparing the average vector corresponding to the current song with the average vectors of the songs under the current fine classification.
Specifically, when extracting the feature vectors, in order to improve accuracy and reduce the influence of chance, the 24 feature vectors corresponding to one song can be obtained by the same splitting method, and the average vector of each song is then computed. The average vector is used as the evaluation standard for the similarity between different songs, so that the most similar song can be recommended.
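The averaging step can be sketched as follows. The 3-dimensional vectors are made-up placeholders for the output of the song similarity model, whose architecture and vector dimension the text does not specify; the function name `song_vector` is likewise hypothetical.

```python
import numpy as np

def song_vector(tile_vectors):
    """Average the per-spectrogram feature vectors into one vector per song."""
    return np.mean(np.asarray(tile_vectors, dtype=float), axis=0)

# 24 placeholder feature vectors, one per spectrogram tile of the same song.
tile_vectors = np.tile(np.array([1.0, 2.0, 3.0]), (24, 1))
tile_vectors[0] += 24.0                 # a single unusual tile

avg = song_vector(tile_vectors)         # element-wise mean over the 24 tiles
```

Because the mean is taken over all 24 tiles, the single unusual tile shifts each component by only 1 rather than 24, which illustrates how averaging damps the effect of one atypical 5-second segment.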
In addition, the present invention also provides a song recommendation system, comprising:
the first classification module is used for performing fine classification on the target type songs;
the acquisition module is connected with the first classification module and used for acquiring a plurality of song files under each fine classification and converting the song files into a training spectrogram;
the training module is connected with the acquisition module and used for establishing a song classification model and training the song classification model through the training spectrogram;
and the recommending module is connected with the training module and used for acquiring the current song under the target type selected by the user, inputting the spectrogram corresponding to the current song into the trained song classification model, acquiring the current fine classification corresponding to the current song and recommending the song according to the current fine classification.
The target type songs are subdivided into fine categories by the first classification module; a plurality of song files under each fine category are acquired by the acquisition module and converted into training spectrograms; and the song classification model is trained by the training module. When a user selects a song under the target type, the recommendation module inputs the spectrogram corresponding to the current song into the trained song classification model to obtain the current fine category of the current song, and recommends songs according to that fine category. Because the scheme recommends similar songs according to the characteristics of the songs and does not need to acquire user behavior or other user data, its recommendation precision is higher, its range of application is wider, and the needs of different groups of users can be met.
Further, still include:
the second classification module is used for performing fine classification on all the stored songs under the target type through the trained song classification model;
the first extraction module is used for establishing a song similarity model for each fine category and extracting the feature vectors of all songs under each fine category through the song similarity model;
the second extraction module is used for extracting the feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification;
and the comparison module is used for comparing the feature vector corresponding to the current song with the feature vectors of all the songs under the current fine classification to obtain the target song with the highest similarity, and the recommendation module recommends the target song.
Specifically, in order to recommend a target song more accurately, after the song classification model is trained, the second classification module assigns all stored songs of the target type to fine categories, a song similarity model is established for each fine category, and the first extraction module extracts the feature vectors of all songs under each fine category. After the fine category of the current song is determined, the second extraction module extracts the feature vector of the current song, and the comparison module compares it with the feature vectors of all songs under the current fine category, so that the target song with the highest similarity can be obtained and recommended, which improves recommendation accuracy.
Further, the comparison module calculates the cosine similarity between the feature vector corresponding to the current song and the feature vector of each song in the current fine category through a cosine similarity algorithm, and takes the song with the largest cosine similarity as the target song.
Specifically, the feature vector corresponding to the current song may be compared with the feature vectors of all songs in the current fine category through a cosine similarity algorithm. The cosine similarity algorithm places the vectors in a vector space according to their coordinate values, obtains the angle between them, and computes the cosine of that angle, which measures the similarity of the two vectors: the smaller the angle, the closer the cosine is to 1 and the more similar the two vectors are. Cosine similarity can therefore be used as the main index for comparing the similarity of songs, and the song with the maximum cosine similarity is recommended as the target song.
In addition, the present invention also provides an intelligent device, comprising:
the memory is used for storing the running program;
and the processor is used for executing the running program stored in the memory and realizing the operation executed by the song recommending method.
In addition, the present invention also provides a storage medium, where at least one instruction is stored, and the instruction is loaded and executed by a processor to implement the operations performed by the above-mentioned song recommendation method.
According to the song recommendation method, the song recommendation system, the intelligent device and the storage medium provided by the invention, the song classification model can be trained by subdividing songs of the target type into fine categories, acquiring a plurality of song files under each fine category, and converting the song files into training spectrograms. When a user selects a song under the target type, the spectrogram corresponding to the current song is input into the trained song classification model to obtain the current fine category of the current song, and songs are recommended according to that fine category. Because the scheme recommends similar songs according to the characteristics of the songs and does not need to acquire user behavior or other user data, its recommendation precision is higher, its range of application is wider, and the needs of different groups of users can be met.
Drawings
The foregoing features, technical features, advantages and embodiments of the present invention will be further explained in the following detailed description of the preferred embodiments, which is to be read in connection with the accompanying drawings.
FIG. 1 is a schematic overall flow diagram of an embodiment of the present invention;
FIG. 2 is a schematic flow diagram of another embodiment of the present invention;
FIG. 3 is a system architecture diagram of an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an intelligent device according to an embodiment of the present invention.
Reference numerals: 1-a first classification module; 2-an acquisition module; 3-a training module; 4-a recommendation module; 5-a second classification module; 6-a first extraction module; 7-a second extraction module; 8-a comparison module; 100-a memory; 200-a processor; 300-a communication interface; 400-a communication bus; 500-input/output interface.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention, and they do not represent the actual structure as a product. In addition, in order to make the drawings concise and understandable, components having the same structure or function in some of the drawings are only schematically illustrated or only labeled. In this document, "one" means not only "only one" but also a case of "more than one".
Example 1
One embodiment of the present invention, as shown in FIG. 1, provides a song recommendation method, including the steps of:
S1, subdividing songs of the target type into fine categories.
Specifically, a broad type of song can be subdivided. For example, children's songs can be classified into 21 fine categories: cradle songs, game songs, digital songs, question and answer songs, passwords, chain tunes, riddle songs, reverse songs, capitals songs, story categories, lyrics categories, kindergartens, antenatal music, sleeping songs, ear grinding, animation songs, bevaco songs, sanji songs, ancient poems, English children's songs, and classical music. Of course, classification may also be made according to other criteria.
S2, acquiring a plurality of song files under each fine category, and converting the song files into training spectrograms.
When a plurality of song files under each fine category are obtained as a training set, a larger number of song files should be selected for each fine category in order to improve the accuracy of the model. Taking the above children's songs as an example, in the present embodiment 50 MP3 song files are selected for each fine category, for a total of 1050 MP3 files.
An audio file contains a large amount of information. In order to extract features and remove noise, the audio can be converted into image form: the audio signal is converted into the frequency domain using the Fourier transform, and each of the 1050 MP3 audio files is processed in this way and converted into a spectrogram. A spectrogram is a visual representation of the frequency spectrum of a sound over time; the shade of color at each point in the spectrogram represents the magnitude of the sound at that frequency and time.
S3, establishing a song classification model, and training the song classification model with the training spectrograms.
In this embodiment, the song classification model is a convolutional neural network model built with TensorFlow. During classification, the spectrogram image is converted into a numeric matrix representing the color of each pixel; the data is then processed by convolutional layers, pooling layers, fully connected layers and the like, and passed to a softmax classifier. The classifier outputs a vector of numbers (for example, 21 numbers corresponding to the 21 fine categories of children's songs) that contains the probability the convolutional neural network model assigns to each fine category for the spectrogram, and the category with the maximum probability is selected as the final identified classification. In other embodiments, other neural network models may be used to classify songs.
S4, acquiring the current song selected by the user under the target type, inputting the spectrogram corresponding to the current song into the trained song classification model, and obtaining the current fine classification corresponding to the current song.
S5, recommending songs according to the current fine classification.
By subdividing songs of the target type into fine categories, acquiring a plurality of song files under each fine category, and converting the song files into training spectrograms, the song classification model can be trained. When a user selects a song under the target type, the spectrogram corresponding to the current song is input into the trained song classification model to obtain the current fine category of the current song, and songs are recommended according to that fine category. Because the scheme recommends similar songs according to the characteristics of the songs and does not need to acquire user behavior or other user data, its recommendation precision is higher, its range of application is wider, and the needs of different groups of users can be met.
Example 2
As shown in FIG. 2, in one embodiment of the present invention, on the basis of Example 1, after the song classification model is trained with the training spectrograms, the method further includes the steps of:
S31, assigning all stored songs of the target type to fine categories through the trained song classification model.
S32, establishing a song similarity model for each fine category, and extracting the feature vectors of all songs under each fine category through the song similarity model.
After obtaining the current fine classification corresponding to the current song, the method further comprises the following steps:
S41, extracting the feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification.
S42, comparing the feature vector corresponding to the current song with the feature vectors of all songs under the current fine classification to obtain the target song with the highest similarity, and recommending the target song.
Specifically, in order to recommend a target song more accurately, after the song classification model is trained, all stored songs of the target type are assigned to fine categories by the model, a song similarity model is established for each fine category, and the feature vectors of all songs under each fine category are extracted through the corresponding song similarity model. After the fine category of the current song is determined, the feature vector of the current song is extracted through the song similarity model of that fine category and compared with the feature vectors of all songs under that fine category, so that the target song with the highest similarity can be obtained and recommended, which improves recommendation accuracy.
Preferably, comparing the feature vector corresponding to the current song with the feature vectors of all songs in the current fine classification through a cosine similarity algorithm; and recommending the song with the maximum cosine similarity as a target song.
Specifically, the feature vector corresponding to the current song may be compared with the feature vectors of all songs in the current fine category through a cosine similarity algorithm. The cosine similarity algorithm places the vectors in a vector space according to their coordinate values, obtains the angle between them, and computes the cosine of that angle, which measures the similarity of the two vectors: the smaller the angle, the closer the cosine is to 1 and the more similar the two vectors are. Cosine similarity can therefore be used as the main index for comparing the similarity of songs, and the song with the maximum cosine similarity is recommended as the target song.
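Steps S41 and S42 can be sketched as a lookup within the current fine category. The catalog layout (a dictionary from category name to song vectors), the song identifiers, the 2-dimensional vectors and the function names are all hypothetical. Excluding the current song from its own candidate list is also an assumption, since the text does not say how the selected song itself is handled, and the sketch assumes no feature vector is all zeros.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(current_id, current_vec, category, catalog):
    """Return the id of the most similar other song in the same fine category."""
    candidates = [
        (song_id, vec)
        for song_id, vec in catalog[category].items()
        if song_id != current_id                    # assumption: skip the song itself
    ]
    best_id, _ = max(
        candidates,
        key=lambda item: cosine_similarity(current_vec, item[1]),
    )
    return best_id

# Hypothetical precomputed feature vectors, grouped by fine category (step S32).
catalog = {
    "cradle songs": {
        "song_a": np.array([1.0, 0.0]),
        "song_b": np.array([0.9, 0.1]),
        "song_c": np.array([0.0, 1.0]),
    }
}

target = recommend("song_a", catalog["cradle songs"]["song_a"], "cradle songs", catalog)
```

Restricting the comparison to one fine category keeps the candidate list small, and the vectors of stored songs only need to be computed once, after the classification model has been trained.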
Example 3
In one embodiment of the present invention, on the basis of Example 2, acquiring a plurality of song files under each fine category and converting the song files into training spectrograms specifically includes:
extracting audio of a preset time period in the song file; converting the audio of the preset time period into a complete spectrogram; and dividing the complete spectrogram equally into a preset number of training spectrograms.
Inputting the spectrogram corresponding to the current song into the trained song classification model, which specifically comprises the following steps:
extracting the audio of the current song in a preset time period; converting the audio of the current song in the preset time period into a preset number of spectrograms; and inputting the preset number of spectrograms corresponding to the current song into the trained song classification model.
Specifically, in order to ensure the uniformity of the data, the audio of a preset time period in the song file can be extracted and converted into a complete spectrogram, and the complete spectrogram is divided equally into a preset number of training spectrograms. Taking the above classification of children's songs as an example, about 2 minutes of audio may be extracted from each song and converted into a spectrogram, and the picture is then divided into square pictures of 256 × 256 pixels, each covering about 5 seconds of audio, so that one song yields about 24 spectrograms. After the division there are about 25200 pictures in total (1050 songs × 24), and each picture is labeled with its fine category. Splitting each song into 24 spectrograms can improve the accuracy of model training and reduces the influence of chance. Of course, in other embodiments, the spectrogram may be split differently according to actual requirements.
Preferably, extracting the feature vectors of all songs under each fine category through the song similarity model specifically includes:
extracting audio of a preset time period from every song under each fine category; converting the audio of the preset time period of each song into a preset number of spectrograms; extracting, through the song similarity model, a preset number of feature vectors corresponding to the preset number of spectrograms of each song under each fine category; and calculating an average vector for each song under each fine category.
Extracting the feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification specifically includes:
extracting, through the song similarity model corresponding to the current fine classification, a preset number of feature vectors corresponding to the preset number of spectrograms of the current song; and calculating an average vector for the current song.
Comparing the feature vector corresponding to the current song with the feature vectors of all songs under the current fine classification specifically includes:
comparing the average vector of the current song with the average vector of each song under the current fine classification.
Specifically, to improve accuracy and reduce the influence of chance when extracting feature vectors, 24 feature vectors may be obtained for each song by the same method and then averaged; the average vector of each song serves as the basis for evaluating the similarity between songs, which yields a better recommendation.
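The averaging step can be written as a short Python sketch. It assumes the per-spectrogram feature vectors of one song are already available as equal-length lists; the function name is hypothetical.

```python
def average_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length feature vectors."""
    if not vectors:
        raise ValueError("at least one feature vector is required")
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / count for i in range(dim)]
```

In the example of this embodiment, `vectors` would hold the 24 feature vectors of one song, and the result is the single vector used when that song is compared with others.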
Example 4
In an embodiment of the present invention, as shown in fig. 3, the present invention further provides a song recommendation system, which includes a first classification module 1, an acquisition module 2, a training module 3, and a recommendation module 4.
The first classification module 1 is used for performing fine classification on the target type songs.
Specifically, a broad type of songs can be subdivided. For example, children's songs can be divided into 21 categories: cradle songs, game songs, digital songs, question-and-answer songs, passwords, chain tunes, riddle songs, reverse songs, capitals songs, story categories, lyrics categories, kindergartens, antenatal music, sleeping songs, ear grinding, animation songs, bevaco songs, sanji songs, ancient poems, English children's songs, and classical music. Of course, classification may also follow other criteria.
The acquisition module 2 is connected with the first classification module 1 and is used for obtaining a plurality of song files under each fine category and converting the song files into training spectrograms.
When a plurality of song files under each fine category are obtained as a training set, a larger number of song files is selected for each fine category to improve the accuracy of the model. Taking the above children's songs as an example, in the present embodiment 50 MP3 song files are selected for each fine category, for a total of 1050 MP3 files.
An audio file contains a great deal of information. To extract features and remove noise, the audio can be converted into image form: the audio signal is converted into the frequency domain by a Fourier transform. The 1050 MP3 audio files are processed in this way, and each song is converted into a spectrogram. A spectrogram is a visual representation of the frequency spectrum of a sound over time; the shade of color in the spectrogram represents the magnitude of the sound at a given frequency and time.
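How a Fourier transform turns audio samples into a spectrogram can be illustrated with a small pure-Python sketch. The description only states that a Fourier transform is used; the framing into short segments, the frame size, the hop length, and the function names below are assumptions, and a practical implementation would use an FFT library and a window function rather than this naive transform.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of the first N//2 + 1 DFT bins of one real-valued frame
    (naive O(N^2) discrete Fourier transform)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(samples, frame_size, hop):
    """One magnitude spectrum per frame; the result is indexed
    [frame][frequency_bin] and can be rendered as an image."""
    return [
        magnitude_spectrum(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, hop)
    ]
```

Each value in the result corresponds to one pixel of the spectrogram image: its position gives the time frame and frequency bin, and its size gives the shade of color.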
The training module 3 is connected with the acquisition module 2 and used for establishing a song classification model and training the song classification model through a training spectrogram.
In this embodiment, the song classification model is a TensorFlow convolutional neural network model. During classification, a spectrogram image is converted into a matrix of numbers representing the color of each pixel; the data is then processed by convolutional layers, pooling layers, fully connected layers and the like, and passed to a softmax classifier. The output of the classifier is a vector of numbers (for example, 21 numbers corresponding to the 21 fine categories of children's songs), each being the probability that the convolutional neural network model assigns to the spectrogram for one fine category; finally, the category with the maximum probability is selected as the recognized category. In other embodiments, other neural network models may be used to classify songs.
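The final softmax step and the selection of the most probable category can be sketched in plain Python as follows. The input `logits` stands for the raw scores produced by the last fully connected layer; that name and the function names are illustrative assumptions, and the network itself is not reproduced here.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)                      # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_category(logits):
    """Return (index, probability) of the most probable fine category."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best, probs[best]
```

For the children's song example, `logits` would contain 21 scores, one per fine category, and the returned index identifies the recognized category of one spectrogram.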
The recommendation module 4 is connected with the training module 3 and is used for acquiring the current song selected by the user under the target type, inputting the spectrogram corresponding to the current song into the trained song classification model, obtaining the current fine classification corresponding to the current song, and recommending songs according to the current fine classification.
The first classification module 1 performs fine classification on the songs of the target type; the acquisition module 2 obtains a plurality of song files under each fine category and converts them into training spectrograms; and the training module 3 trains the song classification model. When a user selects a song under the target type, the recommendation module 4 inputs the spectrogram corresponding to the current song into the trained song classification model to obtain the current fine classification, and recommends songs according to it. Because this scheme recommends similar songs from the characteristics of the songs themselves, without needing to collect user behavior or other user data, its recommendations are more precise, its range of application is wider, and it can meet the needs of different groups of people.
Example 5
In an embodiment of the present invention, as shown in fig. 3, on the basis of embodiment 4, the song recommending system further includes a second classification module 5, a first extraction module 6, a second extraction module 7 and a comparison module 8.
The second classification module 5 is used for performing fine classification on all the songs under the stored target type through the trained song classification model.
The first extraction module 6 is used for establishing a song similarity model for each fine category and extracting the feature vectors of all songs under each fine category through the song similarity model.
The second extraction module 7 is configured to extract a feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification.
The comparison module 8 is configured to compare the feature vector corresponding to the current song with the feature vectors of all songs under the current fine classification to obtain the target song with the highest similarity, which the recommendation module 4 then recommends.
Specifically, in order to recommend a target song more accurately, after the song classification model has been trained, the second classification module 5 performs fine classification on all stored songs of the target type, and the first extraction module 6 establishes a song similarity model for each fine category and extracts the feature vectors of all songs under each fine category. After the fine classification of the current song is determined, the second extraction module 7 extracts the feature vector corresponding to the current song, and the comparison module 8 compares it with the feature vectors of all songs under the current fine classification, so that the target song with the highest similarity is obtained for recommendation and the recommendation accuracy is improved.
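The cooperation of these modules can be summarized in a short Python sketch of the two-stage flow. The classifier and the per-category similarity models are passed in as hypothetical callables, and the stored feature vectors as a hypothetical nested dictionary; none of these names or data structures come from this description.

```python
import math

def _cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(tiles, classify, embedders, library):
    """Two-stage recommendation for the current song.

    tiles     -- spectrograms of the current song
    classify  -- hypothetical callable: tiles -> fine-category id
    embedders -- hypothetical dict: category id -> callable(tiles) -> feature vector
    library   -- hypothetical dict: category id -> {song id: stored feature vector}
    """
    category = classify(tiles)                   # stage 1: fine classification
    current_vec = embedders[category](tiles)     # stage 2: per-category similarity model
    candidates = library[category]               # only songs of the same fine category
    best_id = max(candidates,
                  key=lambda song_id: _cosine(current_vec, candidates[song_id]))
    return category, best_id
```

Restricting the comparison to `library[category]` reflects the point of the second stage: similarity is only evaluated among songs that the classification model has already placed in the same fine category.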
Preferably, the comparison module 8 computes the cosine similarity between the feature vector of the current song and the feature vector of each song under the current fine classification, and takes the song with the largest cosine similarity as the target song.
Specifically, the feature vector of the current song may be compared with the feature vectors of all songs under the current fine classification by means of a cosine similarity algorithm. The cosine similarity algorithm places the vectors in a vector space according to their coordinate values, obtains the angle between them, and computes the cosine of that angle. This cosine value measures the similarity of the two vectors: the smaller the angle, the closer the cosine is to 1 and the more similar the two vectors are. Cosine similarity can therefore serve as the main index for comparing songs, and the song with the largest cosine similarity is recommended as the target song.
In addition, to keep the data uniform, audio of a preset time period may be extracted from each song file, converted into a complete spectrogram, and divided into a preset number of equal-sized training spectrograms. For example, taking the above classification of children's songs, about 2 minutes of audio may be extracted from each song and converted into a spectrogram; the picture is then divided into square pictures of 256 × 256 pixels, each covering about 5 seconds of audio, so that one song yields about 24 spectrograms. After division, about 25,200 pictures are obtained in total, and each picture is labeled with its category. Splitting each song into 24 spectrograms improves the accuracy of model training and reduces the influence of chance. Likewise, to improve accuracy and reduce the influence of chance when extracting feature vectors, 24 feature vectors may be obtained for each song by the same method and then averaged; the average vector of each song serves as the basis for evaluating the similarity between songs, which yields a better recommendation.
Example 6
As shown in fig. 4, an embodiment of the present invention further provides an intelligent device, which includes a memory 100 and a processor 200, where the memory 100 is used to store an execution program, and the processor 200 is used to execute the execution program stored in the memory, so as to implement the operations performed by the song recommendation method according to any one of embodiments 1 to 3.
Specifically, the smart device may further include a communication interface 300, a communication bus 400 and an input/output interface 500, wherein the processor 200, the memory 100, the input/output interface 500 and the communication interface 300 complete communication with each other through the communication bus 400.
The communication bus 400 is a circuit that connects the described elements and enables transmission between them. For example, the processor 200 receives commands from other elements through the communication bus 400, decrypts the received commands, and performs calculations or data processing according to the decrypted commands. The memory 100 may include program modules such as a kernel, middleware, an application programming interface (API), and applications. The program modules may consist of software, firmware, or hardware, or of at least two of these. The input/output interface 500 forwards commands or data entered by a user via an input/output device (e.g., a sensor, keyboard, or touch screen). The communication interface 300 connects the smart device with other network devices, user equipment, and networks. For example, the communication interface 300 may be connected to a network by wire or wirelessly in order to connect to other external network devices or user devices. The wireless communication may include at least one of: wireless fidelity (WiFi), Bluetooth (BT), near field communication (NFC), Global Positioning System (GPS), cellular communication, and the like. The wired communication may include at least one of: Universal Serial Bus (USB), High-Definition Multimedia Interface (HDMI), asynchronous transmission standard interface (RS-232), and the like. The network may be a telecommunications network or a communication network. The communication network may be a computer network, the Internet of Things, or a telephone network. The smart device may connect to the network through the communication interface 300, and the protocol by which the smart device communicates with other network devices may be supported by at least one of an application, an application programming interface (API), middleware, a kernel, and a communication interface.
Example 7
An embodiment of the present invention further provides a storage medium, where at least one instruction is stored in the storage medium, and the instruction is loaded and executed by a processor to implement the operations performed by the song recommendation method according to any one of embodiments 1 to 3. For example, the storage medium may be a read-only memory (ROM), a random-access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like. The operations may be implemented in program code executable by a computing device, so that they are executed by the computing device; alternatively, they may be made separately into individual integrated circuit modules, or a plurality of the modules or steps among them may be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention; those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A song recommendation method, comprising the steps of:
finely classifying the target type songs;
acquiring a plurality of song files under each subdivision category, and converting the song files into a training spectrogram;
establishing a song classification model, and training the song classification model through the training spectrogram;
acquiring a current song selected by a user under a target type, inputting a spectrogram corresponding to the current song into the trained song classification model, and acquiring a current fine classification corresponding to the current song;
and recommending songs according to the current fine classification.
2. The method of claim 1, wherein after the training of the song classification model by the training spectrogram, the method further comprises the steps of:
performing fine classification on all the songs in the stored target type through the trained song classification model;
establishing a song similarity model for each fine category, and extracting the feature vectors of all songs under each fine category through the song similarity model;
after obtaining the current fine classification corresponding to the current song, the method further comprises the following steps:
extracting a feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification;
and comparing the feature vector corresponding to the current song with the feature vectors of all the songs under the current fine classification to obtain the target song with the highest similarity and recommend the target song.
3. A song recommendation method according to claim 2, characterized by: comparing the feature vector corresponding to the current song with the feature vectors of all songs in the current fine classification through a cosine similarity algorithm;
and recommending the song with the maximum cosine similarity as a target song.
4. The method according to claim 2, wherein the obtaining a plurality of song files under each fine category and converting the song files into a training spectrogram specifically comprises:
extracting audio of a preset time period in the song file;
converting the audio of the preset time period into a complete spectrogram;
dividing the complete spectrogram into a preset number of training spectrograms in equal parts;
the inputting the spectrogram corresponding to the current song into the trained song classification model specifically includes:
extracting the audio of the current song in a preset time period;
converting the audio of the preset time period of the current song into a preset number of spectrograms;
and inputting the preset number of spectrograms corresponding to the current song into the trained song classification model.
5. The song recommendation method according to claim 4, wherein the extracting feature vectors of all songs under each fine category through the song similarity model specifically comprises:
extracting the audio of all songs in each fine category in a preset time period;
converting the audio of the preset time period corresponding to all songs in each fine category into a preset number of spectrograms;
extracting, through the song similarity model, a preset number of feature vectors corresponding to the preset number of spectrograms of each song under each fine category;
calculating an average vector of each song under each fine category;
the extracting of the feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification specifically includes:
extracting, through the song similarity model corresponding to the current fine classification, a preset number of feature vectors corresponding to the preset number of spectrograms of the current song;
calculating an average vector corresponding to the current song;
comparing the feature vector corresponding to the current song with the feature vectors of all songs under the current fine classification specifically comprises:
and comparing the average vector corresponding to the current song with the average vectors of the songs under the current fine classification.
6. A song recommendation system, comprising:
the first classification module is used for performing fine classification on the target type songs;
the acquisition module is connected with the first classification module and used for acquiring a plurality of song files under each fine classification and converting the song files into a training spectrogram;
the training module is connected with the acquisition module and used for establishing a song classification model and training the song classification model through the training spectrogram;
and the recommending module is connected with the training module and used for acquiring the current song under the target type selected by the user, inputting the spectrogram corresponding to the current song into the trained song classification model, acquiring the current fine classification corresponding to the current song and recommending the song according to the current fine classification.
7. The song recommendation system of claim 6, further comprising:
the second classification module is used for performing fine classification on all the stored songs under the target type through the trained song classification model;
the first extraction module is used for establishing a song similarity model for each fine category and extracting the feature vectors of all songs under each fine category through the song similarity model;
the second extraction module is used for extracting the feature vector corresponding to the current song through the song similarity model corresponding to the current fine classification;
and the comparison module is used for comparing the feature vector corresponding to the current song with the feature vectors of all the songs under the current fine classification to obtain the target song with the highest similarity, and the recommendation module recommends the target song.
8. The song recommendation system of claim 7, wherein the comparison module computes, through a cosine similarity algorithm, the similarity between the feature vector corresponding to the current song and the feature vectors of all the songs under the current fine classification, and takes the song with the maximum cosine similarity as the target song.
9. A smart device, comprising:
a memory for storing an execution program;
a processor for executing the execution program stored in the memory to implement the operations performed by the song recommendation method according to any one of claims 1 to 5.
10. A storage medium, characterized by: the storage medium has stored therein at least one instruction that is loaded and executed by a processor to perform operations performed by the song recommendation method of any one of claims 1-5.
CN202110367621.3A 2021-04-06 2021-04-06 Song recommendation method, system, intelligent device and storage medium Pending CN113010728A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110367621.3A CN113010728A (en) 2021-04-06 2021-04-06 Song recommendation method, system, intelligent device and storage medium

Publications (1)

Publication Number Publication Date
CN113010728A true CN113010728A (en) 2021-06-22

Family

ID=76387879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110367621.3A Pending CN113010728A (en) 2021-04-06 2021-04-06 Song recommendation method, system, intelligent device and storage medium

Country Status (1)

Country Link
CN (1) CN113010728A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023011062A1 (en) * 2021-08-05 2023-02-09 腾讯科技(深圳)有限公司 Information pushing method and apparatus, device, storage medium, and computer program product

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080082022A (en) * 2006-12-27 2008-09-11 한국전자통신연구원 Likelihood measurement apparatus and method based on music characteristics and music recommendation system and method using its
CN108206027A (en) * 2016-12-20 2018-06-26 北京酷我科技有限公司 A kind of audio quality evaluation method and system
CN108804609A (en) * 2018-05-30 2018-11-13 平安科技(深圳)有限公司 Song recommendation method and device
CN109885722A (en) * 2019-01-07 2019-06-14 平安科技(深圳)有限公司 Music recommended method, device and computer equipment based on natural language processing
CN110019924A (en) * 2017-08-14 2019-07-16 中兴通讯股份有限公司 A kind of method, apparatus of song recommendations, computer equipment and storage medium
CN110674339A (en) * 2019-09-18 2020-01-10 北京工业大学 Chinese song emotion classification method based on multi-mode fusion
CN111046221A (en) * 2019-12-17 2020-04-21 腾讯科技(深圳)有限公司 Song recommendation method and device, terminal equipment and storage medium
CN111259189A (en) * 2018-11-30 2020-06-09 马上消费金融股份有限公司 Music classification method and device
CN111583890A (en) * 2019-02-15 2020-08-25 阿里巴巴集团控股有限公司 Audio classification method and device
CN112445933A (en) * 2020-12-07 2021-03-05 腾讯音乐娱乐科技(深圳)有限公司 Model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination