CN106844528A

CN106844528A - Method and apparatus for obtaining a multimedia file

Info

Publication number: CN106844528A
Application number: CN201611248099.2A
Authority: CN
Inventors: 张斯剑
Original assignee: Guangzhou Kugou Computer Technology Co Ltd
Current assignee: Guangzhou Kugou Computer Technology Co Ltd
Priority date: 2016-12-29
Filing date: 2016-12-29
Publication date: 2017-06-13

Abstract

The invention discloses a method and apparatus for obtaining a multimedia file, belonging to the field of network communication technology. The method includes: receiving an acquisition request that carries at least attribute information and a voice signal, the attribute information identifying an attribute of a multimedia file; selecting, according to the attribute information, the multimedia files that match the attribute information from a first multimedia file library to form a second multimedia file library; and selecting, according to the voice signal, a target multimedia file that matches the voice signal from the second multimedia file library. Because the second multimedia file library contains only the multimedia files that match the attribute information, it contains relatively few multimedia files. The multimedia server therefore spends less time selecting the target multimedia file that matches the voice signal from the second multimedia file library, which improves the efficiency of obtaining the target multimedia file.

Description

Method and apparatus for obtaining a multimedia file

Technical field

The present invention relates to the field of network communication technology, and in particular to a method and apparatus for obtaining a multimedia file.

Background

At present, most terminals support music software, and most music software provides a song recognition function (identifying a song by listening to it). When a user does not know the title of a song, the user can hum the melody of the song to the terminal, and the terminal uses the song recognition function to search the multimedia server for the song corresponding to the melody.

When the terminal searches the multimedia server for the song corresponding to the melody, the terminal collects the voice signal input by the user and sends the voice signal to the multimedia server. The multimedia server receives the voice signal sent by the terminal, extracts the melody of the voice signal, calculates the matching degree between that melody and the melody of each song in the song library, selects the song with the highest matching degree from the song library, and sends the selected song to the terminal.

In the course of implementing the present invention, the inventor found that the prior art has at least the following problem:

Because the song library contains a very large number of songs, it is time-consuming for the multimedia server to calculate the matching degree between the melody of the voice signal and the melody of each song in the song library, so the terminal obtains songs inefficiently.

Summary of the invention

To solve the problem in the prior art, the present invention provides a method and apparatus for obtaining a multimedia file. The technical solutions are as follows:

In a first aspect, an embodiment of the present invention provides a method for obtaining a multimedia file, the method including:

receiving an acquisition request, where the acquisition request carries at least attribute information and a voice signal, and the attribute information is used to identify an attribute of a multimedia file;

selecting, according to the attribute information, the multimedia files that match the attribute information from a first multimedia file library to form a second multimedia file library, where the first multimedia file library is used to store all multimedia files in the multimedia server; and

selecting, according to the voice signal, a target multimedia file that matches the voice signal from the second multimedia file library.

In a possible design, the acquisition request further carries a keyword, and

the selecting, according to the attribute information, the multimedia files that match the attribute information from the first multimedia file library to form the second multimedia file library includes:

selecting, according to the attribute information and the keyword, the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library.

In a possible design, the selecting, according to the voice signal, the target multimedia file that matches the voice signal from the second multimedia file library includes:

extracting a reference melody from the voice signal, and calculating the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library; and

selecting, according to the matching degree between the reference melody and the melody of each multimedia file, a target multimedia file whose matching degree meets a first preset condition from the second multimedia file library.

In a possible design, the method further includes:

if the second multimedia file library contains no multimedia file whose matching degree with the reference melody meets the first preset condition, extracting a preset number of syllables from the reference melody;

calculating the matching degree between the extracted syllables and the melody of each multimedia file; and

selecting, according to the matching degree between the extracted syllables and the melody of each multimedia file, a target multimedia file whose matching degree meets a second preset condition from the second multimedia file library.

In a possible design, the method further includes:

if the second multimedia file library contains no target multimedia file that matches the voice signal, sending a failure indication.

In a second aspect, an embodiment of the present invention provides a method for obtaining a multimedia file, the method including:

obtaining attribute information and a voice signal, where the attribute information is used to identify an attribute of a multimedia file;

sending an acquisition request to a multimedia server, where the acquisition request carries at least the attribute information and the voice signal; and

receiving a target multimedia file sent by the multimedia server.

In a possible design, the acquisition request further carries a keyword, and before the sending an acquisition request to the multimedia server, the method further includes:

obtaining the keyword, and adding the keyword to the acquisition request.

In a possible design, the obtaining the keyword includes:

obtaining the keyword if a failure indication sent by the multimedia server is received.

In a third aspect, an embodiment of the present invention provides an apparatus for obtaining a multimedia file, the apparatus including:

a first receiving module, configured to receive an acquisition request, where the acquisition request carries at least attribute information and a voice signal, and the attribute information is used to identify an attribute of a multimedia file;

a first selection module, configured to select, according to the attribute information, the multimedia files that match the attribute information from a first multimedia file library to form a second multimedia file library, where the first multimedia file library is used to store all multimedia files in the multimedia server; and

a second selection module, configured to select, according to the voice signal, a target multimedia file that matches the voice signal from the second multimedia file library.

In a possible design, the acquisition request further carries a keyword, and

the first selection module is further configured to select, according to the attribute information and the keyword, the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library.

In a possible design, the second selection module includes:

a first extraction unit, configured to extract a reference melody from the voice signal;

a first calculation unit, configured to calculate the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library; and

a first selection unit, configured to select, according to the matching degree between the reference melody and the melody of each multimedia file, a target multimedia file whose matching degree meets a first preset condition from the second multimedia file library.

In a possible design, the second selection module further includes:

a second extraction unit, configured to extract a preset number of syllables from the reference melody if the second multimedia file library contains no multimedia file whose matching degree with the reference melody meets the first preset condition;

a second calculation unit, configured to calculate the matching degree between the extracted syllables and the melody of each multimedia file; and

a second selection unit, configured to select, according to the matching degree between the extracted syllables and the melody of each multimedia file, a target multimedia file whose matching degree meets a second preset condition from the second multimedia file library.

In a possible design, the apparatus further includes:

a first sending module, configured to send a failure indication if the second multimedia file library contains no target multimedia file that matches the voice signal.

In a fourth aspect, an embodiment of the present invention provides an apparatus for obtaining a multimedia file, the apparatus including:

a first obtaining module, configured to obtain attribute information and a voice signal, where the attribute information is used to identify an attribute of a multimedia file;

a second sending module, configured to send an acquisition request to a multimedia server, where the acquisition request carries at least the attribute information and the voice signal; and

a second receiving module, configured to receive a target multimedia file sent by the multimedia server.

In a possible design, the apparatus further includes:

a second obtaining module, configured to obtain the keyword; and

an adding module, configured to add the keyword to the acquisition request.

In a possible design, the second obtaining module is further configured to obtain the keyword if a failure indication sent by the multimedia server is received.

In the embodiments of the present invention, the acquisition request that the terminal sends to the multimedia server carries attribute information and a voice signal. According to the attribute information, the multimedia server selects the multimedia files that match the attribute information from the first multimedia file library to form the second multimedia file library, and then, according to the voice signal, selects the target multimedia file that matches the voice signal from the second multimedia file library. Because the second multimedia file library contains only the multimedia files that match the attribute information, it contains relatively few multimedia files. The multimedia server therefore spends less time selecting the target multimedia file that matches the voice signal from the second multimedia file library, which improves the efficiency of obtaining the target multimedia file.

Brief description of the drawings

Fig. 1 is a schematic diagram of an implementation environment according to an embodiment of the present invention;

Fig. 2 is a flowchart of a method for obtaining a multimedia file according to an embodiment of the present invention;

Fig. 3 is a flowchart of a method for obtaining a multimedia file according to an embodiment of the present invention;

Fig. 4 is a flowchart of a method for obtaining a multimedia file according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an apparatus for obtaining a multimedia file according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an apparatus for obtaining a multimedia file according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of an apparatus for obtaining a multimedia file according to an embodiment of the present invention (general structure of a terminal);

Fig. 8 is a schematic structural diagram of an apparatus for obtaining a multimedia file according to an embodiment of the present invention (general structure of a multimedia server).

Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Fig. 1 is a schematic diagram of an implementation environment according to an embodiment of the present disclosure. Referring to Fig. 1, the implementation environment includes a terminal 101 and a multimedia server 102, which are connected through a communication network.

An application associated with the multimedia server 102 runs on the terminal 101. A user can log in to the application with a user ID, or log in to the application directly, so as to interact with the multimedia server 102. The application may be an audio application, a video application, or another kind of application, and the user ID may be a user account, a telephone number, or the like; the embodiments of the present invention do not limit this.

The terminal 101 may be a mobile phone, a PAD (portable Android device, that is, a tablet computer), a computer, or the like. The multimedia server 102 may be a single multimedia server, a multimedia server cluster made up of several multimedia servers, or a cloud computing multimedia server center; the embodiments of the present disclosure do not limit this. The multimedia server 102 may be a video server or an audio server.

An embodiment of the present invention provides a method for obtaining a multimedia file. The method is applied in a multimedia server. Referring to Fig. 2, the method includes:

Step 201: Receive an acquisition request, where the acquisition request carries at least attribute information and a voice signal, and the attribute information is used to identify an attribute of a multimedia file.

Step 202: According to the attribute information, select the multimedia files that match the attribute information from a first multimedia file library to form a second multimedia file library, where the first multimedia file library is used to store all multimedia files in the multimedia server.

Step 203: According to the voice signal, select a target multimedia file that matches the voice signal from the second multimedia file library.
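As a rough illustration of steps 201 to 203, the following Python sketch narrows a library by attribute and then picks the best melody match within the narrowed set. It is a minimal sketch under stated assumptions, not the implementation described by the patent: the `MediaFile` record, the tuple-of-note-names melody representation, the `position_score` function, and the 0.8 threshold are all hypothetical, and the reference melody is assumed to have been extracted from the voice signal already.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class MediaFile:
    """Hypothetical library record; the field names are illustrative only."""
    title: str
    gender: str                # singer's gender
    language: str              # language of the song
    melody: Tuple[str, ...]    # assumed melody representation (note names)


def build_second_library(first_library, gender=None, language=None):
    """Step 202: keep only the files whose attributes match the request."""
    return [
        f for f in first_library
        if (gender is None or f.gender == gender)
        and (language is None or f.language == language)
    ]


def position_score(a: Sequence[str], b: Sequence[str]) -> float:
    """Toy matching degree: share of positions holding the same note."""
    if not a or not b:
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / max(len(a), len(b))


def select_target(second_library, reference_melody, threshold=0.8) -> Optional[MediaFile]:
    """Step 203: return the best match, or None if it is below the threshold."""
    best, best_score = None, 0.0
    for f in second_library:
        s = position_score(reference_melody, f.melody)
        if s > best_score:
            best, best_score = f, s
    return best if best_score >= threshold else None


first_library = [
    MediaFile("Song A", "female", "Chinese", ("do", "re", "mi", "fa", "sol")),
    MediaFile("Song B", "male", "Chinese", ("do", "re", "mi", "fa", "sol")),
    MediaFile("Song C", "female", "English", ("la", "sol", "fa", "mi", "re")),
]
second_library = build_second_library(first_library, gender="female", language="Chinese")
target = select_target(second_library, ("do", "re", "mi", "fa", "sol"))
```

In this toy run only the single file left in `second_library` is scored against the hummed melody, rather than all three files in `first_library`, which is the saving the method relies on.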

In a possible design, the acquisition request further carries a keyword, and selecting, according to the attribute information, the multimedia files that match the attribute information from the first multimedia file library to form the second multimedia file library includes:

According to the attribute information and the keyword, select the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library.

In a possible design, selecting, according to the voice signal, the target multimedia file that matches the voice signal from the second multimedia file library includes:

Extract a reference melody from the voice signal, and calculate the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library;

According to the matching degree between the reference melody and the melody of each multimedia file, select a target multimedia file whose matching degree meets a first preset condition from the second multimedia file library.

In a possible design, the method further includes:

If the second multimedia file library contains no multimedia file whose matching degree with the reference melody meets the first preset condition, extract a preset number of syllables from the reference melody;

Calculate the matching degree between the extracted syllables and the melody of each multimedia file, and, according to that matching degree, select a target multimedia file whose matching degree meets a second preset condition from the second multimedia file library.
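The fallback on a preset number of syllables might look like the following sketch. The text does not say which syllables are extracted or how a short run of syllables is compared with a full melody, so taking the first `n` syllables and sliding them across each stored melody is purely an assumption made here, as are the function names and the 0.75 threshold standing in for the second preset condition.

```python
from typing import Dict, List, Optional, Sequence


def window_score(fragment: Sequence[str], melody: Sequence[str]) -> float:
    """Best share of matching positions over every alignment of the fragment."""
    n = len(fragment)
    if n == 0 or len(melody) < n:
        return 0.0
    best = 0
    for start in range(len(melody) - n + 1):
        same = sum(1 for i in range(n) if melody[start + i] == fragment[i])
        best = max(best, same)
    return best / n


def fallback_select(reference: Sequence[str],
                    library: Dict[str, List[str]],
                    n: int = 4,
                    threshold: float = 0.75) -> Optional[str]:
    """Match only the first n syllables of the reference melody (assumed choice)."""
    fragment = list(reference[:n])
    best_title, best_score = None, 0.0
    for title, melody in library.items():
        s = window_score(fragment, melody)
        if s > best_score:
            best_title, best_score = title, s
    return best_title if best_score >= threshold else None


library = {
    "Song A": ["mi", "mi", "do", "re", "mi", "fa", "sol", "sol"],
    "Song B": ["la", "sol", "la", "sol", "la", "sol", "la", "sol"],
}
# The user hums the opening correctly and then drifts off-key.
hummed = ["do", "re", "mi", "fa", "ti", "ti", "ti", "ti"]
result = fallback_select(hummed, library)
```

The idea illustrated is only that a short, well-remembered fragment can still identify a file when the full hummed melody scores too low.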

In a possible design, the method further includes:

If the second multimedia file library contains no target multimedia file that matches the voice signal, send a failure indication.

An embodiment of the present invention provides a method for obtaining a multimedia file. The method is applied in a terminal. Referring to Fig. 3, the method includes:

Step 301: Obtain attribute information and a voice signal, where the attribute information is used to identify an attribute of a multimedia file.

Step 302: Send an acquisition request to a multimedia server, where the acquisition request carries at least the attribute information and the voice signal.

Step 303: Receive a target multimedia file sent by the multimedia server.

In a possible design, the acquisition request further carries a keyword, and before the acquisition request is sent to the multimedia server, the method further includes:

Obtain the keyword, and add the keyword to the acquisition request.

In a possible design, obtaining the keyword includes:

If a failure indication sent by the multimedia server is received, obtain the keyword.

An application is installed on the terminal; the application may be a video application or an audio application, and it provides a song recognition function. When a user does not know the title of a song, the user can hum the melody of the song to the terminal and enter the attribute information of the song on the terminal. The terminal collects the voice signal of the melody and sends an acquisition request to the multimedia server, where the acquisition request carries at least the attribute information and the voice signal. According to the attribute information, the multimedia server selects the multimedia files that match the attribute information from the first multimedia file library to form the second multimedia file library, selects the target multimedia file that matches the voice signal from the second multimedia file library, and sends the target multimedia file to the terminal. Because the second multimedia file library contains fewer multimedia files, the efficiency of obtaining the target multimedia file is improved.

To further reduce the number of multimedia files in the second multimedia file library, the user can also enter a keyword on the terminal. The terminal obtains the keyword entered by the user and adds it to the acquisition request, so that the multimedia server selects, according to the attribute information and the keyword, the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library. Because this further reduces the number of multimedia files in the second multimedia file library, the efficiency of obtaining the target multimedia file is further improved.

It should be noted that, in the embodiments of the present invention, the acquisition request that the terminal first sends to the multimedia server may carry only the attribute information and the voice signal. Only when the multimedia server, using the attribute information and the voice signal, cannot find a target multimedia file that matches the voice signal in the second multimedia file library, that is, when recognition by the multimedia server fails, does the terminal obtain the keyword and add it to the acquisition request. The multimedia server then uses the attribute information, the keyword, and the voice signal to find the target multimedia file that matches the voice signal in the second multimedia file library, which improves the success rate and accuracy of obtaining the target multimedia file.
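The retry behaviour just described can be sketched from the terminal's side as follows. This is an illustrative sketch only: the `acquire` function, the dictionary request layout, the use of `None` to stand for the failure indication, and the `fake_server` stand-in are all assumptions and not part of the described method.

```python
from typing import Callable, Optional


def acquire(send_request: Callable[[dict], Optional[str]],
            attributes: dict,
            voice_signal: bytes,
            ask_keyword: Callable[[], str]) -> Optional[str]:
    """Send the request; on a failure indication, add a keyword and resend."""
    request = {"attributes": attributes, "voice_signal": voice_signal}
    result = send_request(request)
    if result is not None:
        return result
    # None models the failure indication: only now is the user asked for a keyword.
    keyword = ask_keyword()
    if not keyword:
        return None
    request["keyword"] = keyword
    return send_request(request)


def fake_server(request: dict) -> Optional[str]:
    """Stand-in server that only succeeds when a keyword is supplied."""
    return "Song A" if request.get("keyword") == "moonlight" else None


found = acquire(fake_server, {"gender": "female"}, b"\x00\x01", lambda: "moonlight")
```

Asking for the keyword only after a failure keeps the common case to a single interaction, at the cost of a second round trip when recognition fails.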

Referring to Fig. 4, the processing flow of the method may include the following steps:

Step 401: The terminal obtains attribute information and a voice signal, and sends an acquisition request to the multimedia server, where the acquisition request carries at least the attribute information and the voice signal, and the attribute information is used to identify an attribute of a multimedia file.

The current interface of the terminal includes a recognition button for song recognition. When a user wants to search for a multimedia file, the user can click the recognition button. When the terminal detects that the recognition button is triggered, it displays a recognition interface that includes an attribute information input box. The user can enter the attribute information of the multimedia file to be searched for in the attribute information input box, and either hum the melody of that multimedia file to the terminal or move the terminal close to another device that is currently playing the multimedia file. The terminal obtains the attribute information entered by the user, collects the voice signal input by the user or played by the other device, and sends an acquisition request carrying the attribute information and the voice signal to the multimedia server.

The attribute information includes gender and/or language. For example, when the multimedia file is a song, the attribute information may be the gender of the singer of the song and/or the language of the song. The gender of the singer may be male or female; the language of the song may be Chinese, English, or the like.

Accordingly, the recognition interface includes a gender input box and a language input box. The user can enter or select a gender in the gender input box, and/or enter or select a language in the language input box. The terminal obtains the gender entered or selected by the user, and/or the language entered or selected by the user, and forms the attribute information from the gender and/or the language.

To further improve the accuracy of obtaining the multimedia file, the recognition interface may also include a keyword input box, in which the user can enter a keyword of the multimedia file. For example, when the multimedia file is a song, the user can enter lyrics in the keyword input box. The terminal obtains the keyword entered by the user and adds it to the acquisition request, so that the acquisition request also carries the keyword.

It should be noted that the embodiments of the present invention do not limit the order in which the terminal obtains the attribute information entered by the user and collects the voice signal. The terminal may first obtain the attribute information and then collect the voice signal, or first collect the voice signal and then obtain the attribute information. The terminal may also collect the voice signal while obtaining the attribute information.

Step 402: The multimedia server receives the acquisition request sent by the terminal and, according to the attribute information, selects the multimedia files that match the attribute information from the first multimedia file library to form the second multimedia file library.

The first multimedia file library is used to store all multimedia files in the multimedia server. The second multimedia file library is used to store the multimedia files that match the attribute information, and contains at least one multimedia file. If the acquisition request carries only the attribute information and the voice signal, this step can be implemented through the following steps (1) to (3):

(1): The multimedia server receives the acquisition request sent by the terminal and obtains the attribute information and the voice signal from the acquisition request.

If the acquisition request also carries the keyword, the multimedia server can also obtain the keyword from the acquisition request in this step.

(2): According to the attribute information and the attribute information of each multimedia file in the first multimedia file library, the multimedia server calculates the matching degree between the attribute information and the attribute information of each multimedia file in the first multimedia file library.

(3): According to the matching degree between the attribute information and the attribute information of each multimedia file in the first multimedia file library, the multimedia server selects the multimedia files that meet a third preset condition from the first multimedia file library to form the second multimedia file library.

The third preset condition may be that the matching degree is the highest, or that the matching degree is greater than a first preset matching degree. The first preset matching degree can be set and changed as needed and is not specifically limited in the embodiments of the present invention; for example, it may be 80% or 90%.
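Steps (2) and (3) can be sketched as follows. The text does not define how the matching degree between two sets of attribute information is calculated, so the share of requested attributes that agree is an assumption made for illustration; the function names and the sample data are likewise hypothetical. The second function covers both forms of the third preset condition: highest matching degree, or matching degree above a threshold.

```python
from typing import Dict, List, Optional, Tuple


def attribute_match(requested: Dict[str, str], stored: Dict[str, str]) -> float:
    """Share of requested attributes whose values equal the stored ones."""
    if not requested:
        return 1.0
    hits = sum(1 for key, value in requested.items() if stored.get(key) == value)
    return hits / len(requested)


def apply_condition(scored: List[Tuple[str, float]],
                    threshold: Optional[float] = None) -> List[str]:
    """Keep items above the threshold, or the top-scoring items if none is given."""
    if not scored:
        return []
    if threshold is None:
        top = max(score for _, score in scored)
        return [name for name, score in scored if score == top]
    return [name for name, score in scored if score > threshold]


first_library = {
    "Song A": {"gender": "female", "language": "Chinese"},
    "Song B": {"gender": "male", "language": "Chinese"},
    "Song C": {"gender": "male", "language": "English"},
}
requested = {"gender": "female", "language": "Chinese"}
scored = [(name, attribute_match(requested, attrs))
          for name, attrs in first_library.items()]
by_max = apply_condition(scored)                       # "highest matching degree"
by_threshold = apply_condition(scored, threshold=0.4)  # "greater than a preset degree"
```

A lower threshold admits partial matches such as a file that agrees on language but not gender, which trades a larger second library for tolerance of wrongly entered attributes.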

It should be noted that a correspondence between attribute information and multimedia file libraries may also be stored in the multimedia server in advance. Accordingly, this step can be replaced with:

The multimedia server receives the acquisition request sent by the terminal and, according to the attribute information, obtains the second multimedia file library corresponding to the attribute information from the correspondence between attribute information and multimedia file libraries.
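One way to picture the pre-stored correspondence is a lookup table built once, ahead of any request, so that step 402 becomes a single lookup instead of a scan of the first library. The sketch below assumes a `(gender, language)` tuple as the key and a plain dictionary as the store; neither is specified in the text.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

AttributeKey = Tuple[str, str]  # (gender, language); a hypothetical key layout


def build_index(first_library: Dict[str, Dict[str, str]]) -> Dict[AttributeKey, List[str]]:
    """Group file titles by attribute once, ahead of any request."""
    index: Dict[AttributeKey, List[str]] = defaultdict(list)
    for title, attrs in first_library.items():
        index[(attrs["gender"], attrs["language"])].append(title)
    return dict(index)


first_library = {
    "Song A": {"gender": "female", "language": "Chinese"},
    "Song B": {"gender": "male", "language": "Chinese"},
    "Song C": {"gender": "male", "language": "Chinese"},
    "Song D": {"gender": "male", "language": "English"},
}
index = build_index(first_library)

# At request time the second library is read directly from the table.
second_library = index.get(("male", "Chinese"), [])
```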

Further, if the acquisition request carries not only the attribute information and the voice signal but also the keyword, this step may be:

The multimedia server receives the acquisition request sent by the terminal and, according to the attribute information and the keyword, selects the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library.

The multimedia server can perform this selection in any of the following three implementations.

In the first implementation, the step in which the multimedia server selects, according to the attribute information and the keyword, the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library may be:

According to the attribute information, the multimedia server selects the multimedia files that match the attribute information from the first multimedia file library to form a third multimedia file library. Then, according to the keyword and the caption information of each multimedia file in the third multimedia file library, it selects the multimedia files that match the keyword from the third multimedia file library to form the second multimedia file library.

In the second implementation, the step in which the multimedia server selects, according to the attribute information and the keyword, the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library may be:

According to the keyword and the caption information of each multimedia file in the first multimedia file library, the multimedia server selects the multimedia files that match the keyword from the first multimedia file library to form a third multimedia file library. Then, according to the attribute information, it selects the multimedia files that match the attribute information from the third multimedia file library to form the second multimedia file library.
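The first two implementations differ only in the order of the two filters. The sketch below assumes exact attribute equality and a case-insensitive substring test against the caption information; both predicates, the record layout, and the function names are illustrative assumptions.

```python
from typing import Dict, List

Record = Dict[str, str]


def matches_attributes(record: Record, gender: str, language: str) -> bool:
    return record["gender"] == gender and record["language"] == language


def matches_keyword(record: Record, keyword: str) -> bool:
    return keyword.lower() in record["caption"].lower()


def attributes_then_keyword(library: List[Record], gender, language, keyword) -> List[str]:
    """First implementation: the third library is filtered by attribute."""
    third = [r for r in library if matches_attributes(r, gender, language)]
    return [r["title"] for r in third if matches_keyword(r, keyword)]


def keyword_then_attributes(library: List[Record], gender, language, keyword) -> List[str]:
    """Second implementation: the third library is filtered by keyword."""
    third = [r for r in library if matches_keyword(r, keyword)]
    return [r["title"] for r in third if matches_attributes(r, gender, language)]


library = [
    {"title": "Song A", "gender": "female", "language": "English",
     "caption": "Under the moonlight we sang"},
    {"title": "Song B", "gender": "female", "language": "English",
     "caption": "Morning rain on the window"},
    {"title": "Song C", "gender": "male", "language": "English",
     "caption": "Moonlight over the harbour"},
]
first = attributes_then_keyword(library, "female", "English", "moonlight")
second = keyword_then_attributes(library, "female", "English", "moonlight")
```

With yes/no predicates like these, both orders select the same files; what changes is the size of the intermediate third library, so it is cheaper to apply the more selective filter first.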

In the third implementation, the step in which the multimedia server selects, according to the attribute information and the keyword, the multimedia files that match both the attribute information and the keyword from the first multimedia file library to form the second multimedia file library may be:

According to the attribute information and the attribute information of each multimedia file in the first multimedia file library, the multimedia server calculates the similarity between the attribute information and the attribute information of each multimedia file. According to the keyword and the caption information of each multimedia file, it calculates the similarity between the keyword and the caption information of each multimedia file. For each multimedia file in the first multimedia file library, the multimedia server calculates the matching degree between the attribute information and the keyword on the one hand and the multimedia file on the other, according to a first preset weight, a second preset weight, the similarity between the attribute information and the attribute information of the multimedia file, and the similarity between the keyword and the caption information of the multimedia file. Having obtained in this way the matching degree between the attribute information and the keyword and each multimedia file in the first multimedia file library, the multimedia server selects the multimedia files whose matching degree meets a fourth preset condition from the first multimedia file library to form the second multimedia file library.

The first preset weight is the weight corresponding to the similarity between attribute information; the second preset weight is the weight corresponding to the similarity for the keyword. The step in which the multimedia server calculates the matching degree between the attribute information and the keyword and the multimedia file, according to the first preset weight, the second preset weight, the similarity between the attribute information and the attribute information of the multimedia file, and the similarity between the keyword and the caption information of the multimedia file, may be:

The multimedia server calculates the product of the first preset weight and the similarity between the attribute information and the attribute information of the multimedia file to obtain a first value, calculates the product of the second preset weight and the similarity between the keyword and the caption information of the multimedia file to obtain a second value, and takes the sum of the first value and the second value as the matching degree between the attribute information and the keyword and the multimedia file.

The first preset weight and the second preset weight may be equal or unequal. Both can be set and changed as needed and are not specifically limited in the embodiments of the present invention. For example, the first preset weight is 0.5 and the second preset weight is 0.5; as another example, the first preset weight is 0.4 and the second preset weight is 0.6.

4th it is pre-conditioned can be maximum for matching degree, or matching degree is more than the second preset matching degree.Wherein, second is pre- If matching degree can as needed be configured and change, in embodiments of the present invention, the second preset matching degree is not made specifically Limit；For example, the second preset matching degree can be 80% or 90% etc..
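The weighted calculation and the fourth preset condition described above can be sketched as follows. This is only an illustrative sketch: the function names, the dictionary layout, and the assumption that both similarities are already available as numbers between 0 and 1 are hypothetical and are not specified by the embodiment.

```python
from typing import Dict, List, Optional, Tuple


def matching_degree(attr_sim: float, keyword_sim: float,
                    w_attr: float = 0.5, w_keyword: float = 0.5) -> float:
    """Sum of the first value and the second value described in the text."""
    first_value = w_attr * attr_sim          # first preset weight x attribute similarity
    second_value = w_keyword * keyword_sim   # second preset weight x keyword similarity
    return first_value + second_value


def build_second_library(similarities: Dict[str, Tuple[float, float]],
                         w_attr: float = 0.5, w_keyword: float = 0.5,
                         threshold: Optional[float] = None) -> List[str]:
    """Select the file ids whose matching degree satisfies the fourth preset condition.

    similarities maps a (hypothetical) file id to the pair
    (attribute similarity, keyword similarity).
    threshold=None selects the files with the maximum matching degree;
    otherwise files whose matching degree exceeds the threshold are selected.
    """
    degrees = {fid: matching_degree(a, k, w_attr, w_keyword)
               for fid, (a, k) in similarities.items()}
    if not degrees:
        return []
    if threshold is None:
        best = max(degrees.values())
        return [fid for fid, d in degrees.items() if d == best]
    return [fid for fid, d in degrees.items() if d > threshold]
```

For instance, with weights 0.4 and 0.6, an attribute similarity of 1.0 and a keyword similarity of 0.5 give a matching degree of about 0.7, which would fail a second preset matching degree of 80%.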

It should be noted that, because the acquisition request carries the keyword, the number of multimedia files included in the second multimedia file library is further reduced, which further improves the efficiency of obtaining the target multimedia file.

Step 403: The multimedia server selects, according to the voice signal, the target multimedia file matching the voice signal from the second multimedia file library.

This step can be implemented by the following steps (1) and (2):

(1): The multimedia server extracts the reference melody of the voice signal and calculates the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library.

The multimedia server may use any existing algorithm for calculating the matching degree between melodies to calculate the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library. For example, the matching degree between the syllables included in two melodies is used as the matching degree between the two melodies. This step may then be:

The multimedia server extracts the syllable sequence included in the reference melody, extracts the syllable sequence of the melody of each multimedia file, and calculates the matching degree between the syllable sequence included in the reference melody and the syllable sequence of the melody of each multimedia file as the matching degree between the reference melody and the melody of each multimedia file.

(2): The multimedia server selects, according to the matching degree between the reference melody and the melody of each multimedia file, the target multimedia file whose matching degree satisfies a first preset condition from the second multimedia file library.

The first preset condition may be that the matching degree is the maximum, or that the matching degree is greater than a third preset matching degree. The third preset matching degree can be set and changed as needed and is not specifically limited in the embodiments of the present invention; for example, the third preset matching degree may be 80% or 90%.
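Steps (1) and (2) can be sketched as follows. The embodiment leaves the melody-matching algorithm open ("any existing algorithm"), so the use of `difflib.SequenceMatcher` from the Python standard library is purely an assumption made for illustration, as are the function names and the representation of a melody as a list of syllable labels.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Sequence


def melody_matching_degree(reference: Sequence[str],
                           candidate: Sequence[str]) -> float:
    """Matching degree between two syllable sequences, in the range 0..1.

    Assumption: the ratio of matching elements reported by SequenceMatcher
    stands in for whatever melody-matching algorithm the server uses.
    """
    matcher = SequenceMatcher(None, list(reference), list(candidate),
                              autojunk=False)
    return matcher.ratio()


def select_targets(reference: Sequence[str],
                   second_library: Dict[str, Sequence[str]],
                   threshold: Optional[float] = None) -> List[str]:
    """Apply the first preset condition to every file in the second library.

    threshold=None selects the files with the maximum matching degree;
    otherwise files whose matching degree exceeds the threshold are selected.
    """
    degrees = {fid: melody_matching_degree(reference, melody)
               for fid, melody in second_library.items()}
    if not degrees:
        return []
    if threshold is None:
        best = max(degrees.values())
        return [fid for fid, d in degrees.items() if d == best]
    return [fid for fid, d in degrees.items() if d > threshold]
```

Because the function returns a list, it also covers the case noted later in the text in which more than one target multimedia file is obtained.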

It should be noted that, if no multimedia file in the second multimedia file library has a matching degree with the reference melody that satisfies the first preset condition, then, in order to improve the success rate of obtaining a multimedia file, the multimedia server may in this step suitably increase the fuzziness of the matching algorithm. The specific process can be implemented by the following steps (A) to (C):

(A): If no multimedia file in the second multimedia file library has a matching degree with the reference melody that satisfies the first preset condition, the multimedia server extracts a preset number of syllables from the reference melody.

The preset number can be set and changed as needed and is not specifically limited in the embodiments of the present invention; for example, the preset number may be 10 or 15. Of course, the preset number may also be set and changed according to the number of syllables included in the reference melody; for example, the preset number may be the product of the number of syllables included in the reference melody and a preset ratio, rounded up or rounded down.

The preset ratio can be set and changed as needed and is not specifically limited in the embodiments of the present invention; for example, the preset ratio may be 0.8 or 0.85. For example, when the reference melody includes 10 syllables and the preset ratio is 0.8, the preset number may be 8, and the multimedia server extracts 8 syllables from the reference melody.
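The calculation of the preset number from the preset ratio can be illustrated with a short sketch. The function name and the `round_up` flag are hypothetical; the embodiment only states that the product may be rounded up or rounded down.

```python
import math


def preset_number(syllable_count: int, preset_ratio: float,
                  round_up: bool = False) -> int:
    """Number of syllables to extract: syllable count x preset ratio, rounded."""
    product = syllable_count * preset_ratio
    return math.ceil(product) if round_up else math.floor(product)
```

With 10 syllables and a preset ratio of 0.8, both rounding directions give 8, matching the example in the text. With 12 syllables and a preset ratio of 0.85, the product is about 10.2, so rounding down gives 10 and rounding up gives 11.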

(B): The multimedia server calculates the matching degree between the extracted syllables and the melody of each multimedia file.

The multimedia server obtains the syllables included in the melody of each multimedia file and, according to the extracted syllables and the syllables included in the melody of each multimedia file, calculates the matching degree between the extracted syllables and the melody of each multimedia file.

(C): The multimedia server selects, according to the matching degree between the extracted syllables and the melody of each multimedia file, the target multimedia file whose matching degree satisfies a second preset condition from the second multimedia file library.

The second preset condition may be that the matching degree is the maximum, or that the matching degree is greater than a fourth preset matching degree. The fourth preset matching degree can be set and changed as needed and is not specifically limited in the embodiments of the present invention; for example, the fourth preset matching degree may be 80% or 90%.
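The fallback in steps (A) to (C) can be sketched as follows. Several points are assumptions made only for illustration: the embodiment does not say which syllables are extracted (the sketch takes the leading ones), the matching degree is again approximated with `difflib.SequenceMatcher`, both preset conditions are modelled as thresholds, and all names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Sequence


def degree(a: Sequence[str], b: Sequence[str]) -> float:
    """Stand-in matching degree between two syllable sequences (assumption)."""
    return SequenceMatcher(None, list(a), list(b), autojunk=False).ratio()


def select(reference: Sequence[str],
           library: Dict[str, Sequence[str]],
           threshold: float) -> List[str]:
    """File ids whose matching degree with the reference exceeds the threshold."""
    return [fid for fid, melody in library.items()
            if degree(reference, melody) > threshold]


def select_with_fallback(reference: Sequence[str],
                         second_library: Dict[str, Sequence[str]],
                         first_threshold: float = 0.8,
                         second_threshold: float = 0.8,
                         preset_count: int = 8) -> List[str]:
    """Try the full reference melody first, then a preset number of syllables."""
    targets = select(reference, second_library, first_threshold)
    if targets:
        return targets
    # Step (A): extract a preset number of syllables.
    # Assumption: the leading syllables are the ones extracted.
    extracted = list(reference)[:preset_count]
    # Steps (B) and (C): recompute the matching degree and apply the
    # second preset condition.
    return select(extracted, second_library, second_threshold)
```

An empty result from `select_with_fallback` corresponds to the case, described under step 404, in which the server sends a failure indication.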

It should be noted that the multimedia server may obtain one target multimedia file or multiple target multimedia files. In addition, the target multimedia file may be a video file or an audio file.

Step 404: The multimedia server sends the target multimedia file to the terminal.

The acquisition request that the terminal sends to the multimedia server carries the terminal identifier of the terminal. The multimedia server obtains the terminal identifier from the acquisition request and sends the target multimedia file to the terminal according to the terminal identifier.

In a possible implementation, in order to reduce the network resource consumption of the terminal, the multimedia server may refrain from sending the target multimedia file to the terminal and send only the identifier of the target multimedia file; the target multimedia file is sent to the terminal only when a download request or play request sent by the terminal is received.

The terminal identifier and the identifier of the target multimedia file can be set and changed as needed and are not specifically limited in the embodiments of the present invention. For example, the terminal identifier may be the mobile phone number of the terminal or the user ID used to log in to the application. The identifier of the target multimedia file may be the title or number of the target multimedia file, or the like.

It should be noted that, if no target multimedia file matching the voice signal exists in the second multimedia file library, the multimedia server sends a failure indication to the terminal; the failure indication indicates that recognition failed. The terminal receives the failure indication sent by the multimedia server and displays it.

After receiving the failure indication, the terminal may also send an acquisition request to the multimedia server again, the acquisition request carrying the attribute information, the voice signal, and the keyword.

Step 405: The terminal receives the target multimedia file sent by the multimedia server.

The terminal receives the target multimedia file sent by the multimedia server, stores the target multimedia file, and displays the identifier of the target multimedia file. The user can click the target multimedia file to trigger the terminal to play it; when the terminal detects that the target multimedia file has been triggered, it obtains the stored target multimedia file and plays it.

It should be noted that, if in step 404 the multimedia server sends only the identifier of the target multimedia file to the terminal, this step may be:

The terminal receives the identifier of the target multimedia file sent by the multimedia server and displays the identifier. The user can click the identifier of the target multimedia file to trigger the terminal to play the target multimedia file; when the terminal detects that the target multimedia file has been triggered, it sends a play request to the multimedia server, the play request carrying the identifier of the target multimedia file.

The multimedia server receives the play request sent by the terminal, obtains the target multimedia file according to the identifier of the target multimedia file, and sends the target multimedia file to the terminal; the terminal receives the target multimedia file sent by the multimedia server and plays it.
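The exchange in steps 404 and 405, including the identifier-only variant and the failure indication, can be sketched with two in-memory classes. This is a simplified illustration under stated assumptions: the class names, the dictionary-shaped messages, and the `"recognition failed"` text are hypothetical, and no real network transport or terminal identifier handling is modelled.

```python
from typing import Dict, Optional


class MultimediaServer:
    """Holds files by identifier and answers the terminal (hypothetical model)."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files

    def respond(self, target_id: Optional[str], id_only: bool = True) -> dict:
        if target_id is None:
            # No target multimedia file matched the voice signal.
            return {"status": "failure"}
        if id_only:
            # Send only the identifier to save the terminal's network resources.
            return {"status": "ok", "file_id": target_id}
        return {"status": "ok", "file_id": target_id,
                "content": self.files[target_id]}

    def handle_play_request(self, file_id: str) -> bytes:
        return self.files[file_id]


class Terminal:
    """Stores received files and requests playback when needed."""

    def __init__(self, server: MultimediaServer) -> None:
        self.server = server
        self.stored: Dict[str, bytes] = {}

    def on_response(self, response: dict) -> str:
        """Return the text the terminal would display."""
        if response["status"] == "failure":
            return "recognition failed"
        if "content" in response:
            self.stored[response["file_id"]] = response["content"]
        return response["file_id"]

    def on_click(self, file_id: str) -> bytes:
        """Return the data to play, fetching it from the server if not stored."""
        if file_id in self.stored:
            return self.stored[file_id]
        return self.server.handle_play_request(file_id)
```

In the identifier-only variant the file data crosses the network only after the user triggers playback, which is the saving the implementation in step 404 aims at.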

An embodiment of the present invention provides an apparatus for obtaining a multimedia file. The apparatus is applied in a multimedia server and is used to perform the steps performed by the multimedia server above. Referring to Fig. 5, the apparatus includes:

a first receiving module 501, configured to receive an acquisition request, the acquisition request carrying at least attribute information and a voice signal, the attribute information being used to identify an attribute of a multimedia file;

a first selection module 502, configured to select, according to the attribute information, the multimedia files matching the attribute information from a first multimedia file library to constitute a second multimedia file library, the first multimedia file library being used to store all multimedia files in the multimedia server;

a second selection module 503, configured to select, according to the voice signal, the target multimedia file matching the voice signal from the second multimedia file library.
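The division into modules 501 to 503 can be pictured as one class with three methods. The sketch below is illustrative only and rests on several assumptions: attribute matching is reduced to string equality, the reference melody is assumed to have been extracted from the voice signal already, the matching degree is approximated with `difflib.SequenceMatcher`, and every class, field, and method name is hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Sequence


@dataclass
class MultimediaFile:
    file_id: str
    attribute: str            # hypothetical attribute label, e.g. a genre
    melody: Sequence[str]     # syllable sequence of the file's melody


@dataclass
class AcquisitionRequest:
    attribute: str
    reference_melody: Sequence[str]  # assumed already extracted from the voice signal


class AcquisitionApparatus:
    def __init__(self, first_library: List[MultimediaFile]) -> None:
        self.first_library = first_library

    def receive(self, request: AcquisitionRequest) -> AcquisitionRequest:
        """Role of the first receiving module 501."""
        return request

    def first_select(self, attribute: str) -> List[MultimediaFile]:
        """Role of the first selection module 502: build the second library."""
        return [f for f in self.first_library if f.attribute == attribute]

    def second_select(self, reference: Sequence[str],
                      second_library: List[MultimediaFile],
                      threshold: float = 0.8) -> List[MultimediaFile]:
        """Role of the second selection module 503: match against the melody."""
        return [f for f in second_library
                if SequenceMatcher(None, list(reference), list(f.melody),
                                   autojunk=False).ratio() > threshold]

    def handle(self, request: AcquisitionRequest) -> List[MultimediaFile]:
        request = self.receive(request)
        second_library = self.first_select(request.attribute)
        return self.second_select(request.reference_melody, second_library)
```

Melody matching runs only over the files that survive the attribute filter, which is the source of the time saving described in the method embodiment.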

In a possible design, the acquisition request also carries a keyword, and

the first selection module 502 is further configured to select, according to the attribute information and the keyword, the multimedia files that match the attribute information and match the keyword from the first multimedia file library to constitute the second multimedia file library.

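In a possible design, the second selection module 503 includes:

a first extraction unit, configured to extract the reference melody of the voice signal;

a first calculation unit, configured to calculate the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library;

a first selection unit, configured to select, according to the matching degree between the reference melody and the melody of each multimedia file, the target multimedia file whose matching degree satisfies the first preset condition from the second multimedia file library.

In a possible design, the second selection module 503 further includes:

a second extraction unit, configured to extract a preset number of syllables from the reference melody if no multimedia file in the second multimedia file library has a matching degree with the reference melody that satisfies the first preset condition;

a second calculation unit, configured to calculate the matching degree between the extracted syllables and the melody of each multimedia file;

a second selection unit, configured to select, according to the matching degree between the extracted syllables and the melody of each multimedia file, the target multimedia file whose matching degree satisfies the second preset condition from the second multimedia file library.

In a possible design, the apparatus further includes:

a first sending module, configured to send a failure indication if no target multimedia file matching the voice signal exists in the second multimedia file library.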

An embodiment of the present invention provides an apparatus for obtaining a multimedia file. The apparatus is applied in a terminal and is used to perform the steps performed by the terminal above. Referring to Fig. 6, the apparatus includes:

a first acquisition module 601, configured to obtain attribute information and a voice signal, the attribute information being used to identify an attribute of a multimedia file;

a second sending module 602, configured to send an acquisition request to a multimedia server, the acquisition request carrying at least the attribute information and the voice signal;

a second receiving module 603, configured to receive the target multimedia file sent by the multimedia server.

In a possible design, the apparatus further includes:

a second acquisition module, configured to obtain the keyword;

an adding module, configured to add the keyword to the acquisition request.

It should be noted that, when the apparatus for obtaining a multimedia file provided by the above embodiments obtains a multimedia file, the division into the functional modules described above is only an example. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the apparatus can be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for obtaining a multimedia file provided by the above embodiments and the method embodiments for obtaining a multimedia file belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not repeated here.

Fig. 7 is a schematic structural diagram of a terminal provided by an embodiment of the present invention. The terminal can be used to implement the functions performed by the terminal in the method for obtaining a multimedia file shown in the above embodiments. Specifically:

The terminal 700 may include an RF (Radio Frequency) circuit 710, a memory 720 including one or more computer-readable storage media, an input unit 730, a display unit 740, a sensor 750, an audio circuit 760, a transmission module 770, a processor 780 including one or more processing cores, a power supply 790, and other components. Those skilled in the art will understand that the terminal structure shown in Fig. 7 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Specifically:

The RF circuit 710 can be used to receive and send signals while information is being sent and received or during a call. In particular, after receiving downlink information from a base station, it passes the information to one or more processors 780 for processing; in addition, it sends uplink data to the base station. Generally, the RF circuit 710 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like. In addition, the RF circuit 710 can communicate with networks and other terminals by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.

The memory 720 can be used to store software programs and modules, such as the software programs and modules corresponding to the terminal shown in the above exemplary embodiments. By running the software programs and modules stored in the memory 720, the processor 780 performs various functional applications and data processing, for example implementing video-based interaction. The memory 720 may mainly include a program storage area and a data storage area. The program storage area can store the operating system, the application programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area can store data created through use of the terminal 700 (such as audio data or a phone book), and the like. In addition, the memory 720 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 720 may also include a memory controller to provide the processor 780 and the input unit 730 with access to the memory 720.

The input unit 730 can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control. Specifically, the input unit 730 may include a touch-sensitive surface 731 and other input devices 732. The touch-sensitive surface 731, also called a touch display screen or touchpad, can collect touch operations by the user on or near it (such as operations performed by the user on or near the touch-sensitive surface 731 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection apparatus according to a preset program. Optionally, the touch-sensitive surface 731 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 780, and can receive and execute commands sent by the processor 780. Furthermore, the touch-sensitive surface 731 may be implemented as a resistive, capacitive, infrared, surface acoustic wave, or other type. In addition to the touch-sensitive surface 731, the input unit 730 may also include other input devices 732. Specifically, the other input devices 732 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys or switch keys), a trackball, a mouse, a joystick, and the like.

The display unit 740 can be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the terminal 700, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 740 may include a display panel 741, which may optionally be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 731 may cover the display panel 741; when the touch-sensitive surface 731 detects a touch operation on or near it, it passes the operation to the processor 780 to determine the type of the touch event, and the processor 780 then provides corresponding visual output on the display panel 741 according to the type of the touch event. Although in Fig. 7 the touch-sensitive surface 731 and the display panel 741 are two separate components implementing the input and output functions, in some embodiments the touch-sensitive surface 731 and the display panel 741 may be integrated to implement the input and output functions.

The terminal 700 may also include at least one sensor 750, such as an optical sensor, a motion sensor, and other sensors. Specifically, the optical sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor can adjust the brightness of the display panel 741 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 741 and/or the backlight when the terminal 700 is moved to the ear. As one type of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that may also be configured in the terminal 700, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described further here.

The audio circuit 760, a speaker 761, and a microphone 762 can provide an audio interface between the user and the terminal 700. The audio circuit 760 can transmit the electrical signal converted from received audio data to the speaker 761, which converts it into a sound signal for output. In the other direction, the microphone 762 converts a collected sound signal into an electrical signal, which the audio circuit 760 receives and converts into audio data; after the audio data is output to the processor 780 for processing, it is sent through the RF circuit 710 to, for example, another terminal, or output to the memory 720 for further processing. The audio circuit 760 may also include an earphone jack to provide communication between a peripheral earphone and the terminal 700.

Through the transmission module 770, the terminal 700 can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless or wired broadband Internet access. Although Fig. 7 shows the transmission module 770, it is understood that it is not an essential component of the terminal 700 and may be omitted as needed without changing the essence of the invention.

The processor 780 is the control center of the terminal 700. It connects the various parts of the entire mobile phone using various interfaces and lines, and performs the various functions of the terminal 700 and processes data by running or executing the software programs and/or modules stored in the memory 720 and calling the data stored in the memory 720, thereby monitoring the mobile phone as a whole. Optionally, the processor 780 may include one or more processing cores; preferably, the processor 780 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 780.

The terminal 700 also includes a power supply 790 (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to the processor 780 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 790 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.

Although not shown, the terminal 700 may also include a camera, a Bluetooth module, and the like, which are not described further here. Specifically, in this embodiment, the display unit of the terminal is a touch-screen display, and the terminal also includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs containing instructions for performing the operations performed by the terminal in the above embodiments.

Fig. 8 is a schematic structural diagram of a multimedia server provided by an embodiment of the present invention. The multimedia server 800 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 822 (for example, one or more processors), a memory 832, and one or more storage media 830 (for example, one or more mass storage devices) storing application programs 842 or data 844. The memory 832 and the storage medium 830 may provide transient or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations on the multimedia server. Further, the central processing unit 822 may be configured to communicate with the storage medium 830 and to execute, on the multimedia server 800, the series of instruction operations in the storage medium 830.

The multimedia server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, one or more keyboards 856, and/or one or more operating systems 841, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.

The multimedia server 800 can be used to perform the steps performed by the multimedia server in the method for obtaining a multimedia file provided by the above embodiments.

Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, improvement, or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for obtaining a multimedia file, characterized in that the method comprises:

receiving an acquisition request, the acquisition request carrying at least attribute information and a voice signal, the attribute information being used to identify an attribute of a multimedia file;

selecting, according to the attribute information, the multimedia files matching the attribute information from a first multimedia file library to constitute a second multimedia file library, the first multimedia file library being used to store all multimedia files in the multimedia server;

selecting, according to the voice signal, a target multimedia file matching the voice signal from the second multimedia file library.

2. The method according to claim 1, characterized in that the acquisition request also carries a keyword, and the selecting, according to the attribute information, the multimedia files matching the attribute information from the first multimedia file library to constitute the second multimedia file library comprises:

selecting, according to the attribute information and the keyword, the multimedia files that match the attribute information and match the keyword from the first multimedia file library to constitute the second multimedia file library.

3. The method according to claim 1 or 2, characterized in that the selecting, according to the voice signal, the target multimedia file matching the voice signal from the second multimedia file library comprises:

extracting a reference melody of the voice signal, and calculating the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library;

selecting, according to the matching degree between the reference melody and the melody of each multimedia file, the target multimedia file whose matching degree satisfies a first preset condition from the second multimedia file library.

4. The method according to claim 3, characterized in that the method further comprises:

if no multimedia file in the second multimedia file library has a matching degree with the reference melody that satisfies the first preset condition, extracting a preset number of syllables from the reference melody;

selecting, according to the matching degree between the extracted syllables and the melody of each multimedia file, the target multimedia file whose matching degree satisfies a second preset condition from the second multimedia file library.

5. The method according to any one of claims 1-4, characterized in that the method further comprises:

if no target multimedia file matching the voice signal exists in the second multimedia file library, sending a failure indication.

6. An apparatus for obtaining a multimedia file, characterized in that the apparatus comprises:

a first receiving module, configured to receive an acquisition request, the acquisition request carrying at least attribute information and a voice signal, the attribute information being used to identify an attribute of a multimedia file;

a first selection module, configured to select, according to the attribute information, the multimedia files matching the attribute information from a first multimedia file library to constitute a second multimedia file library, the first multimedia file library being used to store all multimedia files in the multimedia server;

a second selection module, configured to select, according to the voice signal, a target multimedia file matching the voice signal from the second multimedia file library.

7. The apparatus according to claim 6, characterized in that the acquisition request also carries a keyword, and

the first selection module is further configured to select, according to the attribute information and the keyword, the multimedia files that match the attribute information and match the keyword from the first multimedia file library to constitute the second multimedia file library.

8. The apparatus according to claim 6 or 7, characterized in that the second selection module comprises:

a first extraction unit, configured to extract a reference melody of the voice signal;

a first calculation unit, configured to calculate the matching degree between the reference melody and the melody of each multimedia file in the second multimedia file library;

a first selection unit, configured to select, according to the matching degree between the reference melody and the melody of each multimedia file, the target multimedia file whose matching degree satisfies a first preset condition from the second multimedia file library.

9. The apparatus according to claim 8, characterized in that the second selection module further comprises:

a second extraction unit, configured to extract a preset number of syllables from the reference melody if no multimedia file in the second multimedia file library has a matching degree with the reference melody that satisfies the first preset condition;

a second calculation unit, configured to calculate the matching degree between the extracted syllables and the melody of each multimedia file;

a second selection unit, configured to select, according to the matching degree between the extracted syllables and the melody of each multimedia file, the target multimedia file whose matching degree satisfies a second preset condition from the second multimedia file library.

10. The apparatus according to any one of claims 6-9, characterized in that the apparatus further comprises:

a first sending module, configured to send a failure indication if no target multimedia file matching the voice signal exists in the second multimedia file library.

11. An apparatus for obtaining a multimedia file, characterized in that the apparatus comprises:

a first acquisition module, configured to obtain attribute information and a voice signal, the attribute information being used to identify an attribute of a multimedia file;

a second sending module, configured to send an acquisition request to a multimedia server, the acquisition request carrying at least the attribute information and the voice signal;

12. The apparatus according to claim 11, characterized in that the apparatus further comprises:

a second acquisition module, configured to obtain the keyword;

an adding module, configured to add the keyword to the acquisition request.

13. The apparatus according to claim 12, characterized in that

the second acquisition module is further configured to obtain the keyword if the failure indication sent by the multimedia server is received.