CN111435369A - Music recommendation method, device, terminal and storage medium - Google Patents


Info

Publication number
CN111435369A
CN111435369A
Authority
CN
China
Prior art keywords
music
initial
emotion
classification model
text information
Prior art date
Legal status
Granted
Application number
CN201910034882.6A
Other languages
Chinese (zh)
Other versions
CN111435369B (en)
Inventor
李岩 (Li Yan)
陈波 (Chen Bo)
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority/filing date: 2019-01-14
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910034882.6A
Publication of CN111435369A
Application granted
Publication of CN111435369B
Legal status: Active

Classifications

  • Information Retrieval; Database Structures and File System Structures Therefor

Abstract

The embodiment of the invention discloses a music recommendation method, apparatus, terminal and storage medium, wherein the method comprises the following steps: if a music recommendation event is detected, determining the application scene of the music recommendation event; searching for an initial music set matching the application scene, wherein the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music; and screening music to be recommended from the initial music set according to the emotion value of each piece of initial music, and outputting the music to be recommended in response to the music recommendation event. Music can thereby be recommended more accurately.

Description

Music recommendation method, device, terminal and storage medium
Technical Field
The present invention relates to the field of multimedia technology, and in particular, to a music recommendation method, apparatus, terminal, and storage medium.
Background
With the development of Internet technology, mobile terminals provide increasingly diversified application functions for users, among which music recommendation has become an important one. A mobile terminal can recommend music to the user in different application scenarios, for example: recommending music while the user is already playing music, or recommending suitable music while the user shoots a video so that it can serve as the video's soundtrack. How to recommend music better and improve the accuracy of music recommendation has therefore become a research hotspot.
Disclosure of Invention
The embodiment of the invention provides a music recommendation method, apparatus, terminal and storage medium, which can recommend music better.
In one aspect, an embodiment of the present invention provides a music recommendation method, including:
if a music recommendation event is detected, determining the application scene of the music recommendation event;
searching for an initial music set matching the application scene, wherein the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music;
screening music to be recommended from the initial music set according to the emotion value of each piece of initial music, and outputting the music to be recommended in response to the music recommendation event;
wherein the emotion value of each piece of initial music in the initial music set is determined by classifying and analyzing the comment information set of that piece of music through a trained emotion classification model.
In another aspect, an embodiment of the present invention provides a music recommendation apparatus, including:
a determining unit, configured to determine the application scene of a music recommendation event if the music recommendation event is detected;
a searching unit, configured to search for an initial music set matching the application scene, where the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music;
a recommending unit, configured to screen music to be recommended from the initial music set according to the emotion value of each piece of initial music, and to output the music to be recommended in response to the music recommendation event;
where the emotion value of each piece of initial music in the initial music set is determined by classifying and analyzing the comment information set of that piece of music through a trained emotion classification model.
In another aspect, an embodiment of the present invention provides an intelligent terminal, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following steps:
if a music recommendation event is detected, determining the application scene of the music recommendation event;
searching for an initial music set matching the application scene, wherein the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music;
screening music to be recommended from the initial music set according to the emotion value of each piece of initial music, and outputting the music to be recommended in response to the music recommendation event;
wherein the emotion value of each piece of initial music in the initial music set is determined by classifying and analyzing the comment information set of that piece of music through a trained emotion classification model.
In still another aspect, an embodiment of the present invention provides a computer storage medium storing computer program instructions, which when executed, are used to implement the music recommendation method described above.
After a music recommendation event is detected, the application scene of the music recommendation event and the initial music set matching that scene can be determined; music to be recommended is then screened from the initial music set according to the emotion value of each piece of initial music, and recommended. Because the emotion value of each piece of initial music is determined by classifying and analyzing its comment information set, the emotion values are more accurate; this improves, to a certain extent, the accuracy of the music to be recommended and makes it fit the application scene better.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1a is a diagram of an application scenario of a music recommendation scheme according to an embodiment of the present invention;
FIG. 1b is a diagram of an application scenario of another music recommendation method according to an embodiment of the present invention;
FIG. 1c is a schematic diagram of a user interface provided by an embodiment of the present invention;
FIG. 1d is a schematic diagram of another user interface provided by embodiments of the present invention;
FIG. 2 is a flowchart illustrating a music recommendation method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a music recommendation method according to another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an initial emotion classification model provided by an embodiment of the present invention;
FIG. 5a is a schematic diagram of an emoticon provided by an embodiment of the present invention;
FIG. 5b is a schematic diagram of sample data provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a music recommendation apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an intelligent terminal according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a music recommendation scheme for recommending music to a user. The scheme can be applied to an intelligent terminal, which may include, but is not limited to: smart phones, laptops, tablets, and the like. The terminal can apply the music recommendation scheme to different service scenes according to different service requirements, such as a music push scene or a video dubbing scene. The following explains the scheme by taking its application in a video dubbing scene as an example:
when a user wants to shoot a video, a client with a video shooting function in the terminal can be opened; after the terminal detects the operation of opening the client by the user, a user interface for video shooting can be provided for the user at the service interface of the client, and the user interface for video shooting can be "shooting a video motion" as shown in fig. 1 a. A user may perform video shooting by clicking a user interface for video shooting in a service interface, and after detecting a click operation of the user on the user interface, the terminal may start a video shooting function and output a prompt message on the user interface to prompt the user whether to perform video shooting, as shown in fig. 1 a. After detecting the confirmation operation (e.g. clicking operation) of the prompt message by the user, the terminal may enter a video shooting mode, as shown in fig. 1 b. In the video shooting mode, a user can shoot a section of video in real time through a shooting button (such as a circular button shown in fig. 1 b) in a terminal shooting interface, in which case the terminal can take the video shot by the user in real time as a target video; the user can also select and upload a section of video from the local gallery through an upload button in the terminal shooting interface, and in this case, the terminal can take the video selected by the user to be uploaded as the target video.
After the terminal acquires the target video, it can analyze the video content to determine the shooting scene of the target video, and determine a matching initial music set according to that scene. For example, if the shooting scene is a wedding scene, each piece of initial music in the matching initial music set is related to themes such as "wedding" and "marriage". The initial music set may include a plurality of pieces of initial music and the emotion value of each piece; the emotion value reflects the emotion type of the music, which may be a positive emotion type or a negative emotion type, and the higher the emotion value of a piece of initial music, the more positive its emotion. Because the initial music set may contain music of the negative emotion type, while a wedding scene calls for music of the positive emotion type, after the initial music set is determined, music to be recommended of the positive emotion type can be screened out according to the emotion value of each piece of initial music and output in the user interface for the user to choose from. The terminal can determine the selected music according to the user's selection instruction and add it to the target video to realize video dubbing. For example, if the music to be recommended includes music 1, music 2 and music 3, the terminal can display these three pieces on the user interface, as shown in FIG. 1c; if the user selects music 1, the terminal adds music 1 to the target video. Optionally, the terminal may also provide a video mute button on the user interface, as shown in FIG. 1d. The user can click the video mute button to remove background noise from the target video before music 1 is added, thereby improving the dubbing effect.
After the music recommendation event is detected, the application scene of the music recommendation event and the initial music set matched with the application scene can be determined; and then, music to be recommended is screened out from the initial music set again according to the emotion value of each initial music so as to recommend the music. The emotion value of each piece of initial music is determined after the comment information sets of each piece of initial music are subjected to classification analysis, so that the emotion value of each piece of initial music determined based on the comment information sets can be more accurate, the accuracy of the piece of music to be recommended determined according to the emotion value can be improved to a certain extent, and the piece of music to be recommended is more fit with the application scene.
Based on the above description, in one embodiment, the embodiment of the present invention provides a schematic flowchart of a music recommendation method in fig. 2. The method of the embodiment of the present invention may be implemented by the above-mentioned terminal, or by an application program running in the terminal, and so on. For convenience of description, the embodiment of the present invention takes the terminal as an example to execute the music recommendation method. The terminal may detect whether a music recommendation event exists, and if the music recommendation event is detected, may determine an application scenario of the music recommendation event in step S201; application scenarios for music recommendation events may include, but are not limited to: video soundtrack scenes, music push scenes, and the like. In one embodiment, if the terminal detects a video shooting event or a video uploading event, the terminal may trigger generation of a music recommendation event and detect the music recommendation event, in which case the application scene of the music recommendation event includes a video dubbing scene. In another embodiment, if the terminal detects a music database update event, it may trigger generation of a music recommendation event and detect the music recommendation event, in which case the application scenario of the music recommendation event includes a music push scenario.
After determining the application scenario of the music recommendation event, the terminal may search for an initial music set matching the application scenario in step S202. The initial music set comprises a plurality of initial music and the emotion value of each initial music, and the emotion value of each initial music in the initial music set is determined after the comment information set of each initial music is classified and analyzed through the trained emotion classification model. In one embodiment, the terminal may perform classification analysis on the comment information set of each piece of initial music through a trained emotion classification model in advance to determine an emotion value of each piece of initial music, and add the emotion value of each piece of initial music to the initial music set. In another embodiment, the terminal may also perform classification analysis on the comment information sets of the initial music through the trained emotion classification model in real time in the process of acquiring the initial music set, so as to determine the emotion value of each initial music.
In the specific implementation of step S202, the terminal may search a local music database to obtain an initial music set matching the application scene. In one embodiment, if the application scene of the music recommendation event includes a video dubbing scene, the terminal may first acquire the target video of that scene when searching for a matching initial music set; the shooting scene and/or video content of the target video is then identified, and the matching initial music set is determined according to the shooting scene and/or video content. The terminal can analyze the video content of the target video using computer vision methods to identify its shooting scene and video content.
In another embodiment, if the application scene of the music recommendation event includes a music push scene, the terminal may obtain the newly added music in the music database when searching for a matching initial music set, and use the newly added music as initial music to construct the initial music set. For example, suppose the music database contains 4 pieces of music: music a, music b, music c and music d, where music a and music b were newly added to the database; then music a and music b can be used as initial music to construct the initial music set. In other embodiments, the terminal may send a music search request containing the scene identifier of the application scene to a server, so that the server searches its internal music database according to that scene identifier and returns the resulting initial music set to the terminal.
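As a small illustration of the music push case (the data layout and helper name are assumptions of this sketch, not part of the patent), newly added songs can be split out of a music database to form the initial music set:

```python
from typing import Dict, List

def build_initial_set(music_db: List[Dict], known_ids: set) -> List[Dict]:
    """Music push scene: use songs newly added to the database as initial music.

    music_db: all songs, e.g. [{"music_id": 0, "title": "a"}, ...]
    known_ids: IDs that were already in the database before the update.
    """
    return [song for song in music_db if song["music_id"] not in known_ids]

# Example: music a and b are new, so only they enter the initial music set.
db = [{"music_id": i, "title": t} for i, t in enumerate("abcd")]
print(build_initial_set(db, known_ids={2, 3}))  # keeps songs "a" and "b"
```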
After the initial music set is obtained, in step S203 music to be recommended can be screened from the initial music set according to the emotion value of each piece of initial music, and the music to be recommended can be output in response to the music recommendation event. In one embodiment, the screening may be implemented as follows: determine a candidate music set from the initial music set according to the emotion value of each piece of initial music, where the emotion value of each candidate in the candidate music set is greater than a preset emotion threshold (the threshold can be set from experience or actual service requirements); then select a preset number of candidates from the candidate music set as the music to be recommended, where the preset number can be set according to actual service requirements, for example 3. Optionally, the preset number of candidates can be selected arbitrarily from the candidate music set, or selected in descending order of emotion value.
In another embodiment, the screening may instead be implemented as follows: determine a candidate music set from the initial music set according to the emotion value of each piece of initial music and the application scene of the music recommendation event, and then select a preset number of candidates from the candidate music set as the music to be recommended. If the application scene of the music recommendation event is a positive-emotion scene, the emotion value of each candidate in the candidate music set is greater than the preset emotion threshold; if it is a negative-emotion scene, the emotion value of each candidate is smaller than the preset emotion threshold. A screening sketch covering both variants is given below.
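The following Python sketch illustrates both screening variants described above; the record layout, threshold, and preset number are illustrative assumptions, not values fixed by the patent:

```python
from typing import Dict, List

def screen_music(initial_set: List[Dict], scene: str = "positive",
                 threshold: float = 0.5, preset_number: int = 3) -> List[Dict]:
    """Screen music to be recommended from an initial music set.

    Each item is assumed to look like {"music_id": 935922, "emotion": 0.94}.
    Positive-emotion scenes keep candidates above the threshold; negative-
    emotion scenes keep candidates below it.
    """
    if scene == "positive":
        candidates = [m for m in initial_set if m["emotion"] > threshold]
        candidates.sort(key=lambda m: m["emotion"], reverse=True)
    else:
        candidates = [m for m in initial_set if m["emotion"] < threshold]
        candidates.sort(key=lambda m: m["emotion"])
    # Take the preset number of candidates in order of scene fit;
    # arbitrary selection from the candidate set would also match the text.
    return candidates[:preset_number]
```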
After the terminal screens out the music to be recommended, it can output the music to be recommended in response to the music recommendation event. Specifically, when responding to the event, the terminal can directly play the music to be recommended in sequence; or it can display the music to be recommended on the user interface and, according to the user's selection instruction, play the selected music or use it as a soundtrack. The embodiment of the present invention does not limit the specific way in which the terminal outputs the music to be recommended.
After the music recommendation event is detected, the application scene of the music recommendation event and the initial music set matched with the application scene can be determined; and then, music to be recommended is screened out from the initial music set again according to the emotion value of each initial music so as to recommend the music. The emotion value of each piece of initial music is determined after the comment information sets of each piece of initial music are subjected to classification analysis, so that the emotion value of each piece of initial music determined based on the comment information sets can be more accurate, the accuracy of the piece of music to be recommended determined according to the emotion value can be improved to a certain extent, and the piece of music to be recommended is more fit with the application scene.
In yet another embodiment, FIG. 3 presents a schematic flowchart of another music recommendation method. The method of this embodiment may be implemented by the above-mentioned terminal, or by an application program running in the terminal; for convenience of description, the terminal is again taken as the executing body. The terminal can perform model training in advance to obtain a trained emotion classification model, so that the comment information set of each piece of initial music can subsequently be classified and analyzed through the trained model to determine the emotion value of each piece of initial music. To this end, the terminal may construct an initial emotion classification model and obtain a sample set in step S301. Specifically, the terminal may construct the initial emotion classification model based on FastText, as shown in FIG. 4; FastText is a tool for word-vector computation and text classification, and using it to construct the initial emotion classification model can reduce training time by orders of magnitude and improve model training efficiency when the model is subsequently trained.
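As a concrete sketch of this step: the open-source fastText Python package exposes a supervised mode matching the linear classify-over-averaged-word-vectors architecture described below; the file name, labels, and hyperparameters here are assumptions, not values taken from the patent:

```python
import fasttext

# train.txt is assumed to hold one comment per line, prefixed with a weak
# label derived from emoticons, e.g. "__label__positive this song is lovely".
model = fasttext.train_supervised(
    input="train.txt",  # hypothetical training file
    lr=0.1,             # learning rate
    epoch=5,            # passes over the sample set
    wordNgrams=2,       # include bigram features
    dim=100,            # word-vector dimensionality (rows of matrix A)
)

labels, probs = model.predict("the melody brings tears to my eyes")
print(labels[0], probs[0])  # e.g. ('__label__negative', 0.97)
```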
The sample set can comprise a plurality of sample data, each comprising text information and supervision information. Each sample data is either positive sample data or negative sample data: the text information in positive sample data is first-type text information, and the text information in negative sample data is second-type text information. First-type text information is the text information in comment information that contains positive-emotion emoticons; second-type text information is the text information in comment information that contains negative-emotion emoticons; the supervision information can include the emotion tag value of the text information. When the terminal acquires the sample set, it can obtain an original comment information set from a social platform, where the original comment information set comprises a plurality of pieces of original comment information, each containing emoticons. The text information and corresponding supervision information of original comment information whose emoticons belong to the positive-emotion emoticon set are used to form positive sample data, which is added to the sample set; likewise, the text information and corresponding supervision information of original comment information whose emoticons belong to the negative-emotion emoticon set form negative sample data, which is also added to the sample set. Both the positive-emotion emoticon set and the negative-emotion emoticon set are predefined. For example, as shown in FIG. 5a, the embodiment of the invention defines three emoticons such as [smile], [grin] and [laugh] as positive-emotion emoticons, forming the positive-emotion emoticon set; and defines three emoticons such as [about to cry], [tears] and [loud crying] as negative-emotion emoticons, forming the negative-emotion emoticon set.
In the embodiment of the invention, a great deal of comment information about music exists on network platforms such as music platforms, social platforms on which music is shared, and instant messaging platforms. Comment information for different pieces of music is collected (first column of the table shown in FIG. 5b), comment information carrying the specified emoticons is identified (second column of FIG. 5b), and the text information of positive and negative sample data (third column of FIG. 5b) is determined automatically by means of those emoticons; the supervision information corresponding to each piece of text information (a positive or negative label) can then be determined from the emoticon sets defined in FIG. 5a. A sample set is thus obtained without requiring annotators to manually label the data type (positive or negative sample data) of each sample data, which saves annotation resources and improves the efficiency of sample-set acquisition.
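A minimal sketch of the emoticon-based automatic labelling described above (the emoticon spellings and example comments are invented for illustration; real data would use the platform's own emoticon markup):

```python
POSITIVE = {"[smile]", "[grin]", "[laugh]"}               # cf. FIG. 5a
NEGATIVE = {"[about to cry]", "[tears]", "[loud crying]"}

def build_sample_set(raw_comments):
    """Turn raw comments into (text, label) pairs with no manual annotation."""
    samples = []
    for comment in raw_comments:
        found = {e for e in POSITIVE | NEGATIVE if e in comment}
        if not found:
            continue                      # keep only comments with known emoticons
        text = comment
        for e in found:                   # strip the emoticons, keep the text
            text = text.replace(e, "").strip()
        if found <= POSITIVE:
            samples.append((text, "__label__positive"))
        elif found <= NEGATIVE:
            samples.append((text, "__label__negative"))
        # comments mixing both kinds are discarded as ambiguous
    return samples

print(build_sample_set(["what a lovely song [smile]",
                        "this reminds me of her [tears]"]))
```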
After the sample set is obtained, the terminal may use it in S302 to train the initial emotion classification model and obtain the trained emotion classification model; the trained emotion classification model is used to classify and analyze the comment information of music so as to determine the emotion value of the music. Specifically, step S302 may include the following steps s11-s12:
s11. Use a training formula to calculate a training value from the classification analysis result and the supervision information.
The classification analysis result is obtained by classifying and analyzing the text information of the sample data in the sample set through the initial emotion classification model. When doing so, the initial emotion classification model can classify and analyze the hidden layer vector of the text information according to a preset classification function, a first matrix parameter and a second matrix parameter. The text information comprises at least one word, and its hidden layer vector is calculated from the word vector of each word in the text information. The training formula is shown in Equation 1.1:
$$-\frac{1}{N}\sum_{n=1}^{N} y_n \log\left(f(BAx_n)\right) \qquad (1.1)$$
wherein log (f (BAx)n) Is) represents the result of the classification analysis,<xn,yn>represents sample data, ynDenotes supervisory information, xnRepresenting text information; optionally, xnCan represent the normalized text information; a represents a first matrix parameter, A can be a word fast look-up table, and the word fast look-up table comprises at least one corresponding relation between a word and a word vector; axnA hidden layer vector representing text information; b represents a second matrix parameter, and B can be understood as a parameter in a preset classification function; f () represents a preset classification function, and f () may be a linear function.
Correspondingly, when the training formula is used to calculate a training value from the classification analysis result and the supervision information, the hidden layer vector of the text information in each sample data can be determined first; then the hidden layer vector of each text information is classified and analyzed through the initial emotion classification model to obtain the classification analysis result of each text information; finally, the product of the classification analysis result of each text information and its supervision information is calculated, and the training value is determined from the calculated products. The hidden layer vector of the text information in each sample data can be determined as follows: obtain the word-vector lookup table (i.e., the first matrix parameter); look up the word vector of each word of the text information in the table, and sum or average the retrieved word vectors to obtain the hidden layer vector of the text information.
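A short sketch of the hidden-layer computation just described, assuming the word-vector lookup table is a plain dictionary (the patent leaves the storage format open):

```python
import numpy as np

def hidden_vector(text: str, lookup_table: dict, dim: int = 100) -> np.ndarray:
    """Average the word vectors of a text to form its hidden layer vector Ax_n.

    lookup_table maps word -> np.ndarray of shape (dim,); it plays the role
    of the first matrix parameter A.
    """
    vectors = [lookup_table[w] for w in text.split() if w in lookup_table]
    if not vectors:
        return np.zeros(dim)            # no known words: fall back to zeros
    return np.mean(vectors, axis=0)     # summing instead would also match the text
```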
s12. Adjust the model parameters of the initial emotion classification model according to the obtained training value, so as to obtain the trained emotion classification model.
Adjusting the model parameters of the initial emotion classification model includes adjusting the first matrix parameter and/or the second matrix parameter. In a specific implementation, the first matrix parameter and/or the second matrix parameter can be adjusted so as to reduce the training value, thereby obtaining the trained emotion classification model. In other embodiments, if the training formula is as shown in Equation 1.2, the first matrix parameter and/or the second matrix parameter can be adjusted so as to increase the training value to obtain the trained emotion classification model.
$$\frac{1}{N}\sum_{n=1}^{N} y_n \log\left(f(BAx_n)\right) \qquad (1.2)$$
It should be noted that whether the first and second matrix parameters are adjusted to reduce the training value (Equation 1.1) or to increase it (Equation 1.2), the aim is the same: to make the likelihood value of the emotion classification model obtained by the parameter adjustment as large as possible. The likelihood value reflects the accuracy of the trained emotion classification model; the larger the likelihood value, the higher the accuracy.
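To make this relationship explicit (a standard fact about negated objectives, not an additional feature of the patent), minimizing Equation 1.1 and maximizing Equation 1.2 select the same parameters, since the two expressions differ only in sign:

$$\arg\min_{A,B}\left(-\frac{1}{N}\sum_{n=1}^{N} y_n \log f(BAx_n)\right) \;=\; \arg\max_{A,B}\left(\frac{1}{N}\sum_{n=1}^{N} y_n \log f(BAx_n)\right)$$

The right-hand side is the average log-likelihood of the sample set, which is why either adjustment strategy drives the likelihood value upward.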
After the trained emotion classification model is obtained, the emotion value of each piece of initial music can be determined through the model. Specifically, the terminal may acquire the comment information set of a target music in S303, where the target music is any piece of initial music in the initial music set. In a specific implementation, a social platform associated with the target music may be determined first; in the embodiment of the present invention, a social platform containing shared information of the target music can be used as the social platform associated with it. For example, if the target music has been shared to WeChat Moments, then WeChat Moments contains shared information of the target music and can therefore serve as the associated social platform. The comment information set of the target music is then obtained from the social platform; it comprises the comment information attached when the target music was shared to the platform. Next, in S304 the terminal may invoke the trained emotion classification model to classify and analyze each piece of comment information in the comment information set of the target music, obtaining the emotion value of the target music. Specifically, the trained model is invoked to classify and analyze each piece of comment information, yielding an emotion prediction value for each piece; the emotion value of the target music is then determined from these emotion prediction values. If the comment information set of the target music contains only one piece of comment information, its emotion prediction value can be taken as the emotion value of the target music; if it contains multiple pieces of comment information, the average of their emotion prediction values is calculated and taken as the emotion value of the target music.
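A sketch of the per-song aggregation in S303-S304, assuming a trained fastText-style binary classifier whose top prediction is converted to a positive-class probability (the conversion and the neutral default are assumptions of this sketch):

```python
def music_emotion_value(comments, model):
    """Average per-comment emotion predictions into one value for a song."""
    scores = []
    for comment in comments:
        labels, probs = model.predict(comment)          # top-1 label + probability
        p = probs[0] if labels[0] == "__label__positive" else 1.0 - probs[0]
        scores.append(p)
    # One comment: its prediction is the song's emotion value; several
    # comments: their mean is, exactly as described above.
    return sum(scores) / len(scores) if scores else 0.5  # neutral default
```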
After the emotion value of the target music is determined, it may be added to the initial music set in S305. Using the method described in steps S303-S305, the terminal can obtain the emotion value of each piece of initial music in the initial music set and add it to the set. The emotion values obtained for individual pieces of initial music through the trained emotion classification model may be as shown in Table 1:
TABLE 1
Serial number   Emotion value   Music ID   Singer     Music title
1               0.945427        935922     Zhang ××   May You Be Happy and Prosperous
2               0.935407        153390     Wang ××    Happy New Year
3               0.885385        713733     Li ××      New Year Greetings
4               0.854717        147817     Wang ××    Blessing
5               0.070943        926887     Liu ××     Weary
…               …               …          …          …
N               0.209825        500947     Chen ××    Tears
The terminal can also detect whether a music recommendation event exists; if one is detected, the application scene of the music recommendation event can be determined in S306. Application scenes of music recommendation events may include, but are not limited to, video dubbing scenes and music push scenes. After the application scene is determined, an initial music set matching it may be searched for in S307; the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece, each emotion value having been determined by classifying and analyzing the comment information set of that piece of music through the trained emotion classification model. In step S308, music to be recommended is screened from the initial music set according to the emotion value of each piece of initial music, and output in response to the music recommendation event. For the specific implementation of steps S306-S308, reference may be made to S201-S203 in the above embodiment; details are not repeated here.
After a music recommendation event is detected, the application scene of the music recommendation event and the initial music set matching that scene can be determined; music to be recommended is then screened from the initial music set according to the emotion value of each piece of initial music, and recommended. Because the emotion value of each piece of initial music is determined by classifying and analyzing its comment information set, the emotion values are more accurate; this improves, to a certain extent, the accuracy of the music to be recommended and makes it fit the application scene better.
Based on the description of the above method embodiment, in an embodiment, an embodiment of the present invention further provides a schematic structural diagram of a music recommendation apparatus as shown in fig. 6. As shown in fig. 6, the music recommendation apparatus in the embodiment of the present invention may include:
a determining unit 101, configured to determine an application scenario of a music recommendation event if the music recommendation event is detected;
a searching unit 102, configured to search for an initial music set matching the application scene, where the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music;
a recommending unit 103, configured to screen music to be recommended from the initial music set according to the emotion value of each piece of initial music, and to output the music to be recommended in response to the music recommendation event;
where the emotion value of each piece of initial music in the initial music set is determined by classifying and analyzing the comment information set of that piece of music through a trained emotion classification model.
In one embodiment, the music recommendation apparatus may further include a processing unit 104 configured to: construct an initial emotion classification model and obtain a sample set; and train the initial emotion classification model by using the sample set to obtain a trained emotion classification model. The sample set comprises a plurality of sample data, each comprising text information and supervision information; each sample data is either positive sample data or negative sample data, the text information in positive sample data being first-type text information and the text information in negative sample data being second-type text information. The first type of text information includes text information in comment information that contains positive-emotion emoticons; the second type of text information includes text information in comment information that contains negative-emotion emoticons. The trained emotion classification model is used to classify and analyze the comment information of music so as to determine the emotion value of the music.
In another embodiment, when the initial emotion classification model is trained by using the sample set to obtain a trained emotion classification model, the processing unit 104 may be specifically configured to: calculating the classification analysis result and the supervision information by adopting a training formula to obtain a training value; the classification analysis result is obtained by performing classification analysis on the text information of the sample data in the sample set through the initial emotion classification model; and adjusting the model parameters of the initial emotion classification model according to the obtained training value so as to obtain the trained emotion classification model.
In another embodiment, classifying and analyzing the text information of the sample data in the sample set through the initial emotion classification model includes: the initial emotion classification model classifies and analyzes the hidden layer vector of the text information of the sample data in the sample set according to a preset classification function, a first matrix parameter and a second matrix parameter. Adjusting the model parameters of the initial emotion classification model includes: adjusting the first matrix parameter and/or the second matrix parameter.
In yet another embodiment, the textual information includes at least one word; the hidden layer vector of the text information is obtained by calculation according to the word vector of each word in the text information.
In yet another embodiment, the processing unit 104 is further operable to: obtaining a comment information set of target music, wherein the target music is any one of the initial music in the initial music set; calling the trained emotion classification model to perform classification analysis on each comment information in the comment information set of the target music to obtain an emotion value of the target music; adding an emotion value of the target music to the initial music set.
In another embodiment, when obtaining the comment information set of the target music, the processing unit 104 may be specifically configured to: determining a social platform associated with the target music; and obtaining a comment information set of the target music from the social platform, wherein the comment information set of the target music comprises comment information included when the target music is shared to the social platform.
In still another embodiment, the application scene of the music recommendation event includes a video dubbing scene; accordingly, when searching for an initial music set matching the application scene, the searching unit 102 may be specifically configured to: acquire the target video of the video dubbing scene; and identify the shooting scene and/or video content of the target video, and determine the matching initial music set according to the shooting scene and/or video content.
After a music recommendation event is detected, the application scene of the music recommendation event and the initial music set matching that scene can be determined; music to be recommended is then screened from the initial music set according to the emotion value of each piece of initial music, and recommended. Because the emotion value of each piece of initial music is determined by classifying and analyzing its comment information set, the emotion values are more accurate; this improves, to a certain extent, the accuracy of the music to be recommended and makes it fit the application scene better.
Fig. 7 is a schematic structural diagram of an intelligent terminal according to an embodiment of the present invention. The intelligent terminal in this embodiment shown in fig. 7 may include: one or more processors 201; one or more input devices 202, one or more output devices 203, and memory 204. The processor 201, the input device 202, the output device 203, and the memory 204 are connected by a bus 205. The memory 204 is used for storing a computer program comprising program instructions, and the processor 201 is used for executing the program instructions stored in the memory 204 to execute the music recommendation method described above.
In one embodiment, the processor 201 may be a Central Processing Unit (CPU) or another general-purpose processor, such as a microprocessor or any conventional processor. The memory 204 may include both read-only memory and random access memory, and provides instructions and data to the processor 201. The specific forms of the processor 201 and the memory 204 are not limited here.
The embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores computer program instructions, and the processor 201 loads and executes one or more computer program instructions stored in the computer storage medium to implement the corresponding steps of the method in the corresponding embodiments; in a particular implementation, at least one computer program instruction in the computer storage medium is loaded by the processor 201 and performs the following steps:
if a music recommendation event is detected, determining the application scene of the music recommendation event;
searching for an initial music set matching the application scene, wherein the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music;
screening music to be recommended from the initial music set according to the emotion value of each piece of initial music, and outputting the music to be recommended in response to the music recommendation event;
wherein the emotion value of each piece of initial music in the initial music set is determined by classifying and analyzing the comment information set of that piece of music through a trained emotion classification model.
In one embodiment, the at least one computer program instruction may also be loaded and executed by the processor 201 to: construct an initial emotion classification model and obtain a sample set; and train the initial emotion classification model by using the sample set to obtain a trained emotion classification model. The sample set comprises a plurality of sample data, each comprising text information and supervision information; each sample data is either positive sample data or negative sample data, the text information in positive sample data being first-type text information and the text information in negative sample data being second-type text information. The first type of text information includes text information in comment information that contains positive-emotion emoticons; the second type of text information includes text information in comment information that contains negative-emotion emoticons. The trained emotion classification model is used to classify and analyze the comment information of music so as to determine the emotion value of the music.
In yet another embodiment, the at least one computer program instruction may be loaded and executed by processor 201 in training the initial emotion classification model using the sample set to obtain a trained emotion classification model: calculating the classification analysis result and the supervision information by adopting a training formula to obtain a training value; the classification analysis result is obtained by performing classification analysis on the text information of the sample data in the sample set through the initial emotion classification model; and adjusting the model parameters of the initial emotion classification model according to the obtained training value so as to obtain the trained emotion classification model.
In another embodiment, classifying and analyzing the text information of the sample data in the sample set through the initial emotion classification model includes: the initial emotion classification model classifies and analyzes the hidden layer vector of the text information of the sample data in the sample set according to a preset classification function, a first matrix parameter and a second matrix parameter. Adjusting the model parameters of the initial emotion classification model includes: adjusting the first matrix parameter and/or the second matrix parameter.
In yet another embodiment, the textual information includes at least one word; the hidden layer vector of the text information is obtained by calculation according to the word vector of each word in the text information.
In yet another embodiment, the at least one computer program instruction may also be loaded and executed by the processor 201 to: obtaining a comment information set of target music, wherein the target music is any one of the initial music in the initial music set; calling the trained emotion classification model to perform classification analysis on each comment information in the comment information set of the target music to obtain an emotion value of the target music; adding an emotion value of the target music to the initial music set.
In yet another embodiment, the at least one computer program instruction may be loaded and executed by the processor 201 in obtaining a set of comment information for a target music: determining a social platform associated with the target music; and obtaining a comment information set of the target music from the social platform, wherein the comment information set of the target music comprises comment information included when the target music is shared to the social platform.
In still another embodiment, the application scene of the music recommendation event includes a video dubbing scene; accordingly, when searching for an initial music set matching the application scene, the at least one computer program instruction may be loaded and executed by the processor 201 to: acquire the target video of the video dubbing scene; and identify the shooting scene and/or video content of the target video, and determine the matching initial music set according to the shooting scene and/or video content.
After a music recommendation event is detected, the application scene of the music recommendation event and the initial music set matching that scene can be determined; music to be recommended is then screened from the initial music set according to the emotion value of each piece of initial music, and recommended. Because the emotion value of each piece of initial music is determined by classifying and analyzing its comment information set, the emotion values are more accurate; this improves, to a certain extent, the accuracy of the music to be recommended and makes it fit the application scene better.
It should be noted that, for the specific working process of the terminal and the unit described above, reference may be made to the relevant description in the foregoing embodiments, and details are not described here again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the invention has been described with reference to a number of embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (11)

1. A music recommendation method, comprising:
if a music recommendation event is detected, determining the application scene of the music recommendation event;
searching for an initial music set matching the application scene, wherein the initial music set comprises a plurality of pieces of initial music and the emotion value of each piece of initial music;
screening music to be recommended from the initial music set according to the emotion value of each piece of initial music, and outputting the music to be recommended in response to the music recommendation event;
wherein the emotion value of each piece of initial music in the initial music set is determined by classifying and analyzing the comment information set of that piece of music through a trained emotion classification model.
2. The method of claim 1, wherein the method further comprises:
constructing an initial emotion classification model and acquiring a sample set;
training the initial emotion classification model by using the sample set to obtain a trained emotion classification model;
the sample set comprises a plurality of sample data, each sample data comprising text information and supervision information, wherein the sample data is positive sample data or negative sample data, the text information in the positive sample data is first-type text information, and the text information in the negative sample data is second-type text information;
the first type of text information includes: text information in the comment information including the positive emotional emoticons; the second type of text information includes: text information in the comment information including the negative emotional emoticon; the trained emotion classification model is used for carrying out classification analysis on comment information of the music so as to determine the emotion value of the music.
3. The method of claim 2, wherein said training said initial emotion classification model using said sample set to obtain a trained emotion classification model, comprises:
calculating the classification analysis result and the supervision information by adopting a training formula to obtain a training value; the classification analysis result is obtained by performing classification analysis on the text information of the sample data in the sample set through the initial emotion classification model;
and adjusting the model parameters of the initial emotion classification model according to the obtained training value so as to obtain the trained emotion classification model.
4. The method of claim 3, wherein the initial emotion classification model performs classification analysis on textual information of sample data in the sample set, comprising:
the initial emotion classification model performs classification analysis on the hidden layer vector of the text information of the sample data in the sample set according to a preset classification function, a first matrix parameter and a second matrix parameter;
adjusting model parameters of the initial emotion classification model includes: adjusting the first matrix parameters and/or the second matrix parameters.
5. The method of claim 4, wherein the textual information includes at least one word; the hidden layer vector of the text information is obtained by calculation according to the word vector of each word in the text information.
6. The method of any one of claims 1-5, further comprising:
obtaining a comment information set of target music, wherein the target music is any one of the initial music in the initial music set;
calling the trained emotion classification model to perform classification analysis on each comment information in the comment information set of the target music to obtain an emotion value of the target music;
adding an emotion value of the target music to the initial music set.
7. The method of claim 6, wherein the obtaining of the comment information set of the target music comprises:
determining a social platform associated with the target music;
and obtaining a comment information set of the target music from the social platform, wherein the comment information set of the target music comprises comment information included when the target music is shared to the social platform.
8. The method of any of claims 1-5, wherein the application scenario of the music recommendation event comprises: a video dubbing scene; the searching obtains an initial music set matched with the application scene, and the method comprises the following steps:
acquiring a target video in the video dubbing scene;
and identifying shooting scenes and/or video contents of the target video, and determining a matched initial music set according to the shooting scenes and/or the video contents.
9. A music recommendation device, comprising:
a determining unit, configured to determine an application scene of a music recommendation event if the music recommendation event is detected;
a searching unit, configured to search for and obtain an initial music set matched with the application scene, the initial music set comprising a plurality of pieces of initial music and emotion values of the pieces of initial music;
a recommending unit, configured to select music to be recommended from the initial music set according to the emotion value of each piece of initial music, and to output the music to be recommended in response to the music recommendation event;
wherein the emotion value of each piece of initial music in the initial music set is determined by performing classification analysis on the comment information set of that piece of initial music through the trained emotion classification model.
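Structurally, the three units of claim 9 could be mirrored by a class such as the sketch below; the injected search_index callable and the emotion-value threshold are assumptions standing in for the behaviour claims 1-8 leave to the implementation:

    class MusicRecommendationDevice:
        """Skeleton of claim 9's determining, searching and recommending units."""

        def __init__(self, search_index):
            self.search_index = search_index   # scene -> [(music, emotion_value)]

        def determine_scene(self, event):      # determining unit
            return event["scene"]

        def search_initial_set(self, scene):   # searching unit
            return self.search_index(scene)

        def recommend(self, event, threshold=0.6):   # recommending unit
            scene = self.determine_scene(event)
            candidates = self.search_initial_set(scene)
            return [music for music, value in candidates if value >= threshold]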
10. An intelligent terminal, comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, wherein the memory is configured to store a computer program, the computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the music recommendation method of any one of claims 1-8.
11. A computer storage medium storing computer program instructions adapted to be loaded by a processor and to perform a music recommendation method according to any one of claims 1-8.
CN201910034882.6A 2019-01-14 2019-01-14 Music recommendation method, device, terminal and storage medium Active CN111435369B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910034882.6A CN111435369B (en) 2019-01-14 2019-01-14 Music recommendation method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN111435369A (en) 2020-07-21
CN111435369B (en) 2024-04-09

Family

ID=71580201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910034882.6A Active CN111435369B (en) 2019-01-14 2019-01-14 Music recommendation method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN111435369B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930429A (en) * 2016-04-19 2016-09-07 乐视控股(北京)有限公司 Music recommendation method and apparatus
CN108038243A (en) * 2017-12-28 2018-05-15 广东欧珀移动通信有限公司 Music recommends method, apparatus, storage medium and electronic equipment
CN108197282A (en) * 2018-01-10 2018-06-22 腾讯科技(深圳)有限公司 Sorting technique, device and the terminal of file data, server, storage medium
CN109063163A (en) * 2018-08-14 2018-12-21 腾讯科技(深圳)有限公司 A kind of method, apparatus, terminal device and medium that music is recommended

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112073757A (en) * 2020-08-13 2020-12-11 北京奇艺世纪科技有限公司 Emotion fluctuation index acquisition method, emotion fluctuation index display method and multimedia content production method
CN112073757B (en) * 2020-08-13 2023-01-24 北京奇艺世纪科技有限公司 Emotion fluctuation index acquisition method, emotion fluctuation index display method and multimedia content production method
CN112214636A (en) * 2020-09-21 2021-01-12 华为技术有限公司 Audio file recommendation method and device, electronic equipment and readable storage medium
CN112333596A (en) * 2020-11-05 2021-02-05 江苏紫米电子技术有限公司 Earphone equalizer adjusting method, device, server and medium
CN112333596B (en) * 2020-11-05 2024-06-04 江苏紫米电子技术有限公司 Earphone equalizer adjustment method, device, server and medium
CN112511750A (en) * 2020-11-30 2021-03-16 维沃移动通信有限公司 Video shooting method, device, equipment and medium
CN112511750B (en) * 2020-11-30 2022-11-29 维沃移动通信有限公司 Video shooting method, device, equipment and medium

Also Published As

Publication number Publication date
CN111435369B (en) 2024-04-09

Similar Documents

Publication Publication Date Title
CN110582025B (en) Method and apparatus for processing video
CN111435369B (en) Music recommendation method, device, terminal and storage medium
US9805022B2 (en) Generation of topic-based language models for an app search engine
Chen et al. Visualizing market structure through online product reviews: Integrate topic modeling, TOPSIS, and multi-dimensional scaling approaches
CN107908619B (en) Public opinion monitoring-based processing method, device, terminal and computer storage medium
CN106302085B (en) Recommendation method and system for instant messaging group
CN111259173B (en) Search information recommendation method and device
CN107135140B (en) Instant messaging method and device
CN110061908A (en) Application program recommendation, device, electronic equipment and medium
CN108924381B (en) Image processing method, image processing apparatus, and computer readable medium
CN111368063A (en) Information pushing method based on machine learning and related device
CN108470057B (en) Generating and pushing method, device, terminal, server and medium of integrated information
CN111555966A (en) Message processing method, device, system, storage medium and computer equipment
CN110781835B (en) Data processing method and device, electronic equipment and storage medium
CN111241381A (en) Information recommendation method and device, electronic equipment and computer-readable storage medium
CN115801980A (en) Video generation method and device
CN111683280B (en) Video processing method and device and electronic equipment
CN112612949B (en) Method and device for establishing recommended data set
CN113626638A (en) Short video recommendation processing method and device, intelligent terminal and storage medium
KR101976816B1 (en) APPARATUS AND METHOD FOR PROVIDING MASH-UP SERVICE OF SaaS APPLICATIONS
CN111263241B (en) Method, device and equipment for generating media data and storage medium
CN114443943A (en) Information scheduling method, device and equipment and computer readable storage medium
CN111966885B (en) User portrait construction method and device
CN111787042A (en) Method and device for pushing information
CN114548263A (en) Method and device for verifying labeled data, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40025800)
GR01 Patent grant