WO2022089098A1 - Pitch adjustment method, device, and computer storage medium

Pitch adjustment method, device, and computer storage medium

Info

Publication number
WO2022089098A1 (PCT/CN2021/119571)
Authority
WIPO (PCT)
Prior art keywords
pitch, melody, target, file, fundamental frequency
Application number
PCT/CN2021/119571
Other languages
English (en)
French (fr)
Inventors
周宇, 林森
Original Assignee
腾讯音乐娱乐科技(深圳)有限公司 (Tencent Music Entertainment Technology (Shenzhen) Co., Ltd.)
Application filed by 腾讯音乐娱乐科技(深圳)有限公司 (Tencent Music Entertainment Technology (Shenzhen) Co., Ltd.)
Priority to US 18/034,031 (published as US20230395051A1)
Publication of WO2022089098A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 1/00: Details of electrophonic musical instruments
    • G10H 1/0008: Associated control or indicating means
    • G10H 1/02: Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
    • G10H 1/18: Selecting circuits
    • G10H 1/20: Selecting circuits for transposition
    • G10H 1/36: Accompaniment arrangements
    • G10H 1/361: Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G10H 1/366: Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems, with means for modifying or correcting the external signal, e.g. pitch correction, reverberation, changing a singer's voice
    • G10H 2210/00: Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H 2210/005: Musical accompaniment, i.e. complete instrumental rhythm synthesis added to a performed melody, e.g. as output by drum machines
    • G10H 2210/031: Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H 2210/066: Musical analysis for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
    • G10H 2210/325: Musical pitch modification
    • G10H 2210/331: Note pitch correction, i.e. modifying a note pitch or replacing it by the closest one in a given scale

Definitions

  • the embodiments of the present application relate to the field of data processing, and in particular, to a pitch adjustment method, device, and computer storage medium.
  • current music software on smart terminals can provide users with a singing recording service: the music software plays the accompaniment of a song, the user sings along with the accompaniment, the music software records the user's singing voice, and then mixes the user's singing voice with the accompaniment of the song;
  • the final synthesized work includes the user's singing voice and the accompaniment of the song.
  • Embodiments of the present application provide a pitch adjustment method, device, and computer storage medium, which are used to automatically adjust the accompaniment of a target song, so that the user's singing voice matches the accompaniment in pitch.
  • a first aspect of the embodiments of the present application provides a pitch adjustment method, including:
  • acquiring a plurality of candidate melody files, where the candidate melody files are used to identify the pitch values of the notes in the melody of a target song, and the pitch values identified by each candidate melody file are different;
  • acquiring a fundamental frequency sequence of the singing voice of the user singing the target song, and converting the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files;
  • separately calculating the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately counting the sum of all pitch value differences of each candidate melody file;
  • determining the candidate melody file with the smallest sum as the target melody file, and adjusting the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
  • a second aspect of the embodiments of the present application provides a pitch adjustment device, including:
  • a first acquisition unit, configured to acquire a plurality of candidate melody files, where the candidate melody files are used to identify the pitch values of the notes in the melody of a target song, and the pitch values identified by each candidate melody file are different;
  • a second acquisition unit, configured to acquire a fundamental frequency sequence of the singing voice of the user singing the target song;
  • a conversion unit, configured to convert the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files;
  • a calculation unit, configured to separately calculate the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately count the sum of all pitch value differences of each candidate melody file;
  • a pitch adjustment unit, configured to determine the candidate melody file with the smallest sum as the target melody file, and adjust the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
  • a third aspect of the embodiments of the present application provides a pitch adjustment device, including:
  • a processor, a memory, a bus, and input/output devices;
  • the processor is connected to the memory and the input/output devices;
  • the bus is respectively connected to the processor, the memory, and the input/output devices;
  • the processor is configured to acquire a plurality of candidate melody files, where the candidate melody files are used to identify the pitch values of the notes in the melody of a target song and the pitch values identified by each candidate melody file are different; acquire a fundamental frequency sequence of the singing voice of the user singing the target song, and convert the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files; separately calculate the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately count the sum of all pitch value differences of each candidate melody file; and determine the candidate melody file with the smallest sum as the target melody file, and adjust the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
  • a fourth aspect of the embodiments of the present application provides a computer storage medium, where an instruction is stored in the computer storage medium, and when the instruction is executed on a computer, the computer executes the method of the foregoing first aspect.
  • the embodiments of the present application have the following advantages:
  • in the embodiments of the present application, the fundamental frequency sequence of the user's singing voice is acquired; the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point is calculated, and the sum of all pitch value differences of each candidate melody file is counted separately; the candidate melody file with the smallest sum is determined as the target melody file; and the pitch of the accompaniment file of the target song is adjusted according to the pitch value difference between the target melody file and the original melody file of the target song. Since the pitch identified by the target melody file has the highest matching degree with the pitch of the user's singing voice, the accompaniment after pitch adjustment can fit the pitch of the user's singing voice, and the resulting mixed work can achieve a good listening experience.
  • FIG. 1 is a schematic flowchart of the pitch adjustment method in an embodiment of the present application;
  • FIG. 2 is another schematic flowchart of the pitch adjustment method in the embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a pitch adjustment device in an embodiment of the present application.
  • FIG. 4 is another structural schematic diagram of the pitch adjustment device in the embodiment of the application.
  • FIG. 5 is another schematic structural diagram of the pitch adjustment device in the embodiment of the present application.
  • Embodiments of the present application provide a pitch adjustment method, device, and computer storage medium, which are used to automatically adjust the accompaniment of a target song, so that the user's singing voice matches the accompaniment in pitch.
  • an embodiment of the pitch adjustment method in the embodiment of the present application includes:
  • the method of this embodiment may be applied to a pitch adjustment apparatus, and the apparatus may be a computer device capable of performing data processing tasks, such as a terminal or a server.
  • when the apparatus is a terminal, it may be a smartphone, a tablet computer, a laptop computer, a desktop computer, a self-service kiosk, or a similar device; when it is a server, it may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud databases, cloud computing, big data, and artificial intelligence platforms.
  • in this embodiment, the pitch of the accompaniment of the target song is adjusted according to the pitch of the user's singing voice, so that the pitch of the accompaniment matches the pitch of the user's singing voice and the mixed work of the user's singing voice and the accompaniment sounds better.
  • a plurality of candidate melody files are used as references to determine the degree to which the pitch of the accompaniment should be adjusted. Therefore, when adjusting the pitch of the accompaniment, a plurality of candidate melody files are acquired, where each candidate melody file is used to identify the pitch values of the notes in the melody of the target song, and the pitch values identified by each candidate melody file are different.
  • for a 108-key piano the pitch value range is 0 to 108, and for an 88-key piano it is 0 to 88, so the pitch value of the melody of the target song identified by a candidate melody file may be a pitch value within 0 to 108 or 0 to 88.
  • the pitch value identified by candidate melody file 1 is 0, the pitch value identified by candidate melody file 2 is 1, and so on.
  • the pitch adjustment device obtains the audio data of the user's singing voice, and extracts the fundamental frequency of the singing voice to obtain a fundamental frequency sequence, which includes a plurality of fundamental frequency points.
  • the commonly used fundamental frequency extraction algorithms include autocorrelation algorithm, parallel processing method, cepstral method and simplified inverse filtering method.
  • the fundamental frequency of the singing voice can be extracted based on the above algorithms to obtain the fundamental frequency sequence of the user's singing voice.
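  • as an illustration of this step, the following sketch extracts a fundamental frequency sequence from a recorded vocal track; the embodiment only names classical extraction algorithms (autocorrelation, cepstrum, simplified inverse filtering), so the pYIN tracker from librosa is used here purely as an assumed stand-in, and the file name and frequency range are hypothetical.

```python
# Hypothetical sketch: extract a fundamental-frequency (F0) sequence from a
# recorded vocal take. librosa's pYIN tracker stands in for the classical
# estimators named in the text; file name and F0 range are assumptions.
import librosa

def extract_f0_sequence(vocal_path="user_vocal.wav"):
    y, sr = librosa.load(vocal_path, sr=None, mono=True)
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    times = librosa.times_like(f0, sr=sr)  # timestamp of each F0 frame (seconds)
    # Keep only voiced frames: these are the fundamental frequency points.
    return [(t, hz) for t, hz, v in zip(times, f0, voiced_flag) if v]
```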
  • since this embodiment uses multiple candidate melody files as references, and the candidate melody files identify the pitch values of the melody, when comparing the candidate melody files with the fundamental frequency sequence of the user's singing voice, the frequency values of the target fundamental frequency points in the fundamental frequency sequence need to be converted into pitch values; the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files, so that the pitch values of the fundamental frequency points can be compared with the pitch values identified by the candidate melody files, and the comparison result can serve as the basis for adjusting the pitch of the accompaniment.
  • since a melody is composed of notes, the pitch values identified by a candidate melody file are the pitch values of its notes.
  • after the frequency values of the fundamental frequency points are converted into pitch values, the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point can be calculated, where a corresponding time point means that a fundamental frequency point of the fundamental frequency sequence falls within the time range of a certain note in the candidate melody file, so that the fundamental frequency point corresponds to that note in time. For example, if the duration of a note is 1 s and a fundamental frequency point falls within the time range of that 1 s note, the fundamental frequency point corresponds to the note in time, and the pitch value difference between the two can be calculated.
  • the magnitude of the sum of the pitch value differences reflects the gap between the pitch values of the candidate melody file and the pitch values of the fundamental frequency sequence of the user's singing voice: the larger the sum, the greater the gap and the less the pitch of the candidate melody file fits the pitch of the user's singing voice;
  • the smaller the sum, the smaller the gap and the better the pitch of the candidate melody file fits the pitch of the user's singing voice; adjusting the pitch of the accompaniment according to that candidate melody file then yields an accompaniment that matches the pitch of the user's singing voice.
  • the original melody file of the target song is used to identify the pitch values of the notes in the original melody of the target song;
  • the original melody may be the singing melody of the original singer of the target song. Because the original singer is generally a relatively professional singer, the pitch of the original melody will generally fit the pitch of the accompaniment of the target song, and the pitch values identified by the original melody file will also match the pitch values of the accompaniment. Therefore, the pitch of the accompaniment file of the target song can be adjusted according to the pitch value difference between the target melody file and the original melody file.
  • since the pitch values identified by the target melody file match the pitch values of the fundamental frequency sequence of the user's singing voice, the accompaniment obtained by adjusting the pitch according to the target melody file will also match the pitch of the user's singing voice, so that the pitch-adjusted accompaniment and the user's singing voice together sound good.
  • for example, assume that the pitch values of the notes identified by a certain candidate melody file are 24, 25, 29, 31, 34, and 27 (in practical applications, the number of notes identified by a candidate melody file is determined by the target song; only a limited number of notes are listed here as an example), and the pitch values of the target fundamental frequency points in the fundamental frequency sequence of the target song that correspond to these notes are 24, 25, 28, 31, 34, and 27.
  • the pitch value differences between the corresponding target fundamental frequency points and notes are calculated as 0, 0, 1, 0, 0, and 0 (the absolute value of each pitch value difference is taken), and the sum of the pitch value differences is counted as 1.
  • by analogy, the sums of the pitch value differences of the other candidate melody files can be calculated.
  • assume there are 12 candidate melody files whose sums of pitch value differences are 137, 109, 90, 73, 49, 24, 1, 22, 45, 67, 86, and 114; the candidate melody file whose sum is 1 is then determined as the target melody file. Assuming that the target melody file differs from the original melody file of the target song by two semitone intervals, the pitch of the accompaniment file of the target song can be adjusted according to that pitch gap, so that the pitch-adjusted accompaniment fits the pitch of the user's singing voice and the listening experience is improved.
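  • a minimal sketch of this selection step is given below, using the illustrative pitch values from the example above; the second candidate and its pitch values are hypothetical, not data from the application.

```python
# Sum the per-note absolute pitch differences for each candidate melody file
# and pick the candidate with the smallest sum as the target melody file.
def sum_of_differences(candidate_pitches, sung_pitches):
    return sum(abs(c - s) for c, s in zip(candidate_pitches, sung_pitches))

candidates = {
    "candidate_A": [24, 25, 29, 31, 34, 27],  # note pitch values from the example
    "candidate_B": [25, 26, 30, 32, 35, 28],  # hypothetical candidate one key up
}
sung = [24, 25, 28, 31, 34, 27]               # pitches of the target F0 points

sums = {name: sum_of_differences(p, sung) for name, p in candidates.items()}
target = min(sums, key=sums.get)              # smallest sum -> target melody file
print(sums, "->", target)   # {'candidate_A': 1, 'candidate_B': 7} -> candidate_A
```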
  • in this embodiment, the fundamental frequency sequence of the user's singing voice is acquired, the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point is calculated, and the sum of all pitch value differences of each candidate melody file is counted separately;
  • the candidate melody file with the smallest sum is determined as the target melody file, and the pitch of the accompaniment file of the target song is adjusted according to the pitch value difference between the target melody file and the original melody file of the target song. Since the pitch identified by the target melody file has the highest matching degree with the pitch of the user's singing voice, the accompaniment after pitch adjustment can fit the pitch of the user's singing voice, and the resulting mixed work can achieve a good listening experience.
  • referring to FIG. 2, another embodiment of the pitch adjustment method in the embodiments of the present application includes:
  • the multiple candidate melody files may be any files used to identify the pitch values of the melody of the target song, as long as the pitch values identified by each candidate melody file are different.
  • the multiple candidate melody files can be obtained by transforming the original melody files of the target song.
  • the original melody file is used to identify the pitch values of the original melody of the target song, and the original melody may be the singing melody of the original singer of the target song. Since a melody is composed of notes, when the original melody file is transposed up or down, a transformation value can be added to the pitch values of all notes in the original melody file, thereby obtaining a transformed melody file. The transformed melody files and the original melody file can therefore each serve as candidate melody files, and all of them can be used as references for adjusting the pitch of the accompaniment.
  • since the transformation of the original melody file may be an upward or a downward transposition, the transformation value may be a positive value or a negative value.
  • for example, a transformation value of +1 means that the pitch values of the original melody file are raised by 1 unit, which is an upward transposition; a transformation value of -2 means that the pitch values of the original melody file are lowered by 2 units, which is a downward transposition.
  • the transformation may specifically be performed based on the principle of twelve-tone equal temperament.
  • twelve-tone equal temperament is a tuning system that divides a pure octave into twelve equal parts, each of which is called a semitone, and it is the most widely used tuning method. Therefore, based on twelve-tone equal temperament, the octave in which the original melody file is located can be equally divided to obtain twelve semitone intervals, where the original melody file corresponds to one of the twelve semitone intervals; then, according to the interval relationship between the semitone interval corresponding to the original melody file and the other semitone intervals, the pitch values of all notes in the original melody file are transformed 11 times by adding transformation values, thereby obtaining 11 transformed melody files.
  • since the transformation values are added according to semitone intervals, each transformed melody file also corresponds to one of the twelve semitone intervals.
  • the 11 transformed melody files, together with the original melody file, constitute 12 candidate melody files.
  • for example, the pitch values of all notes in the original melody file are transformed 11 times by adding the transformation values +1, +2, +3, ..., +9, +10, and +11; the original melody file then has the smallest pitch values, and the melody file obtained with the transformation value +11 has the largest pitch values.
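  • a minimal sketch of this derivation, assuming a simple note representation (pitch value plus start and end time, which is not a format defined by the application), is:

```python
# Build 12 candidate melody files under twelve-tone equal temperament: the
# original melody plus eleven copies transposed by +1 ... +11 semitones.
from copy import deepcopy

original_melody = [                     # assumed toy representation of notes
    {"pitch": 24, "start": 0.0, "end": 1.0},
    {"pitch": 25, "start": 1.0, "end": 1.5},
    {"pitch": 29, "start": 1.5, "end": 2.5},
]

def transpose(melody, semitones):
    shifted = deepcopy(melody)
    for note in shifted:
        note["pitch"] += semitones      # add the transformation value
    return shifted

# Index 0 is the original melody file; indices 1..11 are the transposed copies.
candidate_melodies = [transpose(original_melody, k) for k in range(12)]
```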
  • the specific content of the preset algorithm is not limited, as long as it is an algorithm that can convert the frequency value of a fundamental frequency point into a pitch value.
  • for example, the preset algorithm may be the following formula:
  • pitch value = 12 × log2(hz_value / 440.0) + 69;
  • where hz_value is the frequency value of the fundamental frequency point. The frequency value of a fundamental frequency point can be converted into a pitch value through the above formula.
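  • the formula can be transcribed directly, for example:

```python
# Direct transcription of the preset formula above: the pitch value is the
# MIDI-style note number of a frequency relative to A4 = 440 Hz.
import math

def hz_to_pitch(hz_value: float) -> float:
    return 12 * math.log2(hz_value / 440.0) + 69

# 440 Hz -> 69.0; 261.63 Hz (middle C) -> approximately 60.0
```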
  • the target fundamental frequency points may include all fundamental frequency points in the fundamental frequency sequence, or may include only the fundamental frequency points that correspond in time to the notes of the candidate melody files.
  • one way is to traverse each fundamental frequency point of the fundamental frequency sequence, convert the frequency value of each fundamental frequency point into a pitch value according to the preset algorithm, and then determine, from all fundamental frequency points of the fundamental frequency sequence, the target fundamental frequency points that correspond in time to the notes of the candidate melody files;
  • another way is to first determine, from all fundamental frequency points of the fundamental frequency sequence, the target fundamental frequency points that correspond in time to the notes of the candidate melody files, and then convert only the frequency values of the target fundamental frequency points into pitch values;
  • compared with the former way, the frequency values of the other fundamental frequency points do not need to be converted, which greatly reduces the pitch value computation and lowers the data processing load.
  • when calculating the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, the pitch value of the note in each candidate melody file that corresponds in time to a target fundamental frequency point is obtained;
  • that is, when a fundamental frequency point falls within the duration of a certain note, that fundamental frequency point is the target fundamental frequency point corresponding to that note in time.
  • the pitch value difference between each temporally corresponding target fundamental frequency point and note is then calculated, thereby obtaining the pitch value difference between the candidate melody file and the fundamental frequency sequence at each corresponding time point.
  • to determine whether a note in a candidate melody file corresponds in time to a fundamental frequency point of the fundamental frequency sequence, a specific approach may be that the candidate melody file also identifies the start time and the end time of each note in the melody of the target song.
  • the note corresponding in time to a target fundamental frequency point can then be determined according to the start time and the end time of the note; that is, if a fundamental frequency point falls within the period from the start time to the end time of a certain note, it is determined that the target fundamental frequency point corresponds to that note in time. After the corresponding note is determined, the pitch value of the corresponding note is obtained.
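  • a minimal sketch of this time-correspondence rule, reusing the assumed note representation (pitch, start, end) from the earlier sketch, is:

```python
# A fundamental frequency point corresponds to a note when its timestamp falls
# between the note's start time and end time; pair them up so the absolute
# pitch differences of the pairs can then be summed per candidate melody file.
def match_points_to_notes(f0_points, notes):
    """f0_points: list of (time_s, pitch_value); notes: list of dicts with
    'pitch', 'start' and 'end'. Returns (note_pitch, point_pitch) pairs."""
    pairs = []
    for t, point_pitch in f0_points:
        for note in notes:
            if note["start"] <= t < note["end"]:   # point falls inside the note
                pairs.append((note["pitch"], point_pitch))
                break
    return pairs

# total = sum(abs(n - p) for n, p in pairs) gives the candidate's difference sum
```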
  • the candidate melody file with the smallest sum of pitch value differences has the highest pitch matching degree with the user's singing voice; therefore, the candidate melody file with the smallest sum of pitch value differences is determined as the reference for accompaniment pitch adjustment.
  • the degree of pitch matching between the target melody file and the user's singing voice can be further determined: the higher the proportion of notes with a pitch value difference of 0 among all notes in the target melody file, the smaller the pitch difference between the target melody file and the user's singing voice, and the higher the matching degree.
  • if the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is 100%, it means that there is no pitch difference at all between the entire target melody file and the user's singing voice, and the pitch values identified by the target melody file match the user's singing voice well; from another perspective, this also shows that the user has a strong command of pitch.
  • if the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is extremely low, it means that there are many pitch differences between the target melody file and the user's singing voice and the matching degree between the two is not high, possibly because the user's command of pitch is weak and the user often sings off-key and cannot sing according to a fixed pitch.
  • the preset threshold can be set arbitrarily, and can be obtained by summarizing experimental data. For example, it can be set to any value between 80% and 100%.
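  • a minimal sketch of this check, assuming a threshold of 80% (the application only states that the threshold may be any value between 80% and 100%), is:

```python
# Only adjust the accompaniment when the proportion of notes whose pitch
# difference is 0 exceeds the preset threshold.
def should_adjust(pitch_differences, threshold=0.8):
    if not pitch_differences:
        return False
    zero_ratio = sum(1 for d in pitch_differences if d == 0) / len(pitch_differences)
    return zero_ratio > threshold

print(should_adjust([0, 0, 1, 0, 0, 0]))   # 5 of 6 notes match -> ~0.83 > 0.8 -> True
```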
  • when the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is greater than the preset threshold, it indicates that the target melody file and the user's singing voice have a high degree of pitch matching.
  • in this case, the pitch of the accompaniment file of the target song is adjusted according to the pitch value difference between the target melody file and the original melody file of the target song.
  • the operations performed in this step are similar to the operations performed in step 104 in the aforementioned embodiment shown in FIG. 1 .
  • since the target melody file is one of the multiple candidate melody files obtained in step 201, if the multiple candidate melody files are obtained by transforming the original melody file of the target song, the pitch value difference between the target melody file and the original melody file can be determined directly from the transformation relationship between the two files.
  • specifically, since step 201 transforms the original melody file based on twelve-tone equal temperament to obtain 12 candidate melody files, and each candidate melody file corresponds to one semitone interval, there is an interval relationship between the target melody file and the original melody file, that is, a difference of a certain number of semitone intervals; expressed in pitch, this is the pitch gap between the melody corresponding to the target melody file and the melody corresponding to the original melody file. Therefore, the pitch of the accompaniment file of the target song can be adjusted according to the interval relationship between the target melody file and the original melody file.
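  • the application does not prescribe a particular pitch-shifting algorithm for the accompaniment audio itself; as one possible illustration, librosa's pitch_shift can apply the resulting semitone offset to an accompaniment file (the file names and the offset below are hypothetical, and librosa and soundfile are assumed to be available):

```python
# Apply the key change implied by the interval relationship (e.g. -2 semitones)
# to the accompaniment audio and write the shifted file back out.
import librosa
import soundfile as sf

def shift_accompaniment(in_path="accompaniment.wav",
                        out_path="accompaniment_shifted.wav",
                        n_semitones=-2):
    y, sr = librosa.load(in_path, sr=None, mono=True)
    y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_semitones)
    sf.write(out_path, y_shifted, sr)
```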
  • when the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is less than the preset threshold, it indicates that there are many pitch differences between the target melody file and the user's singing voice and the matching degree between the two is not high. In this case, it is considered that the user has a poor command of the pitch of the target song, and even if the pitch of the accompaniment file were adjusted according to the target melody file, the accompaniment could not fit the user's singing well; therefore, the pitch of the accompaniment file is not adjusted and the pitch of the accompaniment remains unchanged.
  • by judging whether the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is greater than the preset threshold, the degree of pitch matching between the target melody file and the user's singing voice can be further determined, which improves the practicability of the solution.
  • an embodiment of the pitch adjustment device in the embodiment of the present application includes:
  • the first obtaining unit 301 is used to obtain a plurality of candidate melody files, the candidate melody files are used to identify the pitch values of notes in the melody of the target song, and the pitch values identified by each candidate melody file are different;
  • the second obtaining unit 302 is used to obtain the fundamental frequency sequence of the singing voice of the user singing the target song
  • the conversion unit 303 is configured to convert the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files;
  • the calculation unit 304 is configured to separately calculate the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately count the sum of all pitch value differences of each candidate melody file;
  • the pitch adjustment unit 305 is configured to determine the candidate melody file with the smallest sum as the target melody file, and adjust the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
  • the first obtaining unit 301 is specifically configured to obtain the original melody file of the target song, and add transformation values to the pitch values of all the notes in the original melody file to obtain the transformed melody file,
  • the original melody file and the transformed melody file are respectively used as alternative melody files.
  • the first obtaining unit 301 is specifically configured to equally divide, based on twelve-tone equal temperament, the octave corresponding to the original melody file to obtain twelve semitone intervals, where the original melody file corresponds to one of the twelve semitone intervals;
  • the pitch values of all the notes in the original melody file are each transformed 11 times by adding transformation values, to obtain 11 transformed melody files;
  • each transformed melody file corresponds to one of the twelve semitone intervals.
  • the pitch adjustment unit 305 is specifically configured to adjust the pitch of the accompaniment file of the target song according to the interval relationship between the target melody file and the original melody file.
  • the pitch adjustment device further includes:
  • a judging unit 306, configured to judge whether the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is greater than a preset threshold;
  • the pitch adjustment unit 305 is specifically configured to, when the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is greater than the preset threshold, perform the step of adjusting the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song; and, when the proportion of notes with a pitch value difference of 0 among all notes in the target melody file is not greater than the preset threshold, not adjust the pitch of the accompaniment file.
  • the conversion unit 303 is specifically configured to traverse each fundamental frequency point of the fundamental frequency sequence, convert the frequency value of each fundamental frequency point into a pitch value according to the preset algorithm, and determine the target fundamental frequency points from all fundamental frequency points of the fundamental frequency sequence;
  • the calculation unit 304 is specifically configured to obtain the pitch value of the note in each candidate melody file that corresponds in time to a target fundamental frequency point, and calculate the pitch value difference between each temporally corresponding target fundamental frequency point and note.
  • the candidate melody file is also used to identify the start time and the end time of the notes in the melody of the target song;
  • the calculation unit 304 is specifically configured to determine, according to the start time and the end time of the notes in each candidate melody file, the note corresponding in time to a target fundamental frequency point, and to obtain the pitch value of that note.
  • the operations performed by each unit in the pitch adjustment device are similar to those described in the foregoing embodiments shown in FIG. 1 to FIG. 2, and details are not repeated here.
  • in this embodiment, after the fundamental frequency sequence of the user's singing voice is obtained, the calculation unit 304 calculates the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point and counts the sum of all pitch value differences of each candidate melody file separately;
  • the pitch adjustment unit 305 determines the candidate melody file with the smallest sum as the target melody file and adjusts the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song. Since the pitch identified by the target melody file has the highest matching degree with the pitch of the user's singing voice, the accompaniment after pitch adjustment can fit the pitch of the user's singing voice, and the resulting mixed work can achieve a good listening experience.
  • an embodiment of the pitch adjustment device in the embodiment of the present application includes:
  • the pitch adjustment device 400 may include one or more central processing units (CPUs) 401 and a memory 405, where one or more application programs or data are stored in the memory 405.
  • the memory 405 may be volatile storage or persistent storage.
  • the program stored in the memory 405 may include one or more modules, each module may include a series of instruction operations on the pitch adjustment apparatus.
  • the central processing unit 401 may be configured to communicate with the memory 405 to execute a series of instruction operations in the memory 405 on the pitch adjustment device 400 .
  • the pitch adjustment device 400 may also include one or more power supplies 402, one or more wired or wireless network interfaces 403, one or more input and output interfaces 404, and/or, one or more operating systems, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the central processing unit 401 can perform the operations performed by the pitch adjustment apparatus in the embodiments shown in FIG. 1 to FIG. 2 , and details are not repeated here.
  • an embodiment of the pitch adjustment device in the embodiment of the present application includes:
  • the terminal can be any terminal device including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, a vehicle-mounted computer, etc.
  • the terminal is a mobile phone as an example:
  • FIG. 5 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present application.
  • the mobile phone includes: a radio frequency (Radio Frequency, RF) circuit 510 , a memory 520 , an input unit 530 , a display unit 540 , a sensor 550 , an audio circuit 560 , a wireless fidelity (WiFi) module 570 , and a processor 580 , and power supply 590 and other components.
  • the RF circuit 510 can be used for receiving and sending signals during the sending and receiving of information or during a call; in particular, after downlink information of a base station is received, it is handed to the processor 580 for processing, and uplink data is sent to the base station.
  • the RF circuit 510 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • RF circuitry 510 may also communicate with networks and other devices via wireless communications.
  • the above-mentioned wireless communication can use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), etc.
  • the memory 520 can be used to store software programs and modules, and the processor 580 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 520.
  • the memory 520 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function (such as a sound playback function or an image playback function), and the like, and the data storage area may store data created according to the use of the mobile phone (such as audio data and a phone book), and the like.
  • in addition, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the input unit 530 may be used for receiving inputted numerical or character information, and generating key signal input related to user setting and function control of the mobile phone.
  • the input unit 530 may include a touch panel 531 and other input devices 532 .
  • the touch panel 531, also referred to as a touch screen, can collect the user's touch operations on or near it (such as operations performed by the user on or near the touch panel 531 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
  • the touch panel 531 may include two parts, a touch detection device and a touch controller.
  • the touch detection device detects the user's touch orientation, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and then sends them to the processor 580, and can receive and execute commands sent by the processor 580.
  • the touch panel 531 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 530 may further include other input devices 532 .
  • other input devices 532 may include, but are not limited to, one or more of physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 540 may be used to display information input by the user or information provided to the user and various menus of the mobile phone.
  • the display unit 540 may include a display panel 541, and optionally, the display panel 541 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), and the like.
  • the touch panel 531 may cover the display panel 541; when the touch panel 531 detects a touch operation on or near it, it transmits the operation to the processor 580 to determine the type of the touch event, and the processor 580 then provides a corresponding visual output on the display panel 541 according to the type of the touch event.
  • although the touch panel 531 and the display panel 541 are shown as two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 531 and the display panel 541 may be integrated to implement the input and output functions of the mobile phone.
  • the cell phone may also include at least one sensor 550, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor; the ambient light sensor may adjust the brightness of the display panel 541 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 541 and/or the backlight when the mobile phone is moved to the ear.
  • as one kind of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the attitude of the mobile phone (such as related games and magnetometer attitude calibration) and in vibration-recognition-related functions (such as a pedometer and tapping); other sensors that can also be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described here again.
  • the audio circuit 560, the speaker 561, and the microphone 562 can provide an audio interface between the user and the mobile phone.
  • the audio circuit 560 can transmit the electrical signal converted from the received audio data to the speaker 561, and the speaker 561 converts it into a sound signal for output; on the other hand, the microphone 562 converts the collected sound signal into an electrical signal, which is received by the audio circuit 560 and converted into audio data; the audio data is then output to the processor 580 for processing and sent to, for example, another mobile phone through the RF circuit 510, or the audio data is output to the memory 520 for further processing.
  • WiFi is a short-distance wireless transmission technology.
  • the mobile phone can help users to send and receive emails, browse web pages, and access streaming media through the WiFi module 570, which provides users with wireless broadband Internet access.
  • although FIG. 5 shows the WiFi module 570, it can be understood that it is not an essential component of the mobile phone.
  • the processor 580 is the control center of the mobile phone; it connects various parts of the entire mobile phone using various interfaces and lines, and performs various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 520 and calling the data stored in the memory 520.
  • optionally, the processor 580 may include one or more processing units; preferably, the processor 580 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the above-mentioned modem processor may alternatively not be integrated into the processor 580.
  • the mobile phone also includes a power supply 590 (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the processor 580 through a power management system, so as to manage charging, discharging, and power consumption management functions through the power management system.
  • the mobile phone may also include a camera, a Bluetooth module, and the like, which will not be repeated here.
  • the processor 580 included in the terminal may perform the functions in the foregoing embodiments shown in FIG. 1 to FIG. 2 , and details are not described herein again.
  • An embodiment of the present application further provides a computer storage medium, where instructions are stored in the computer storage medium; when the instructions are executed on a computer, they cause the computer to perform the method in the foregoing embodiments shown in FIG. 1 to FIG. 2.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other division manners in actual implementation.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • in addition, the mutual coupling or direct coupling or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

A pitch adjustment method, used to automatically adjust the accompaniment of a target song so that the user's singing voice matches the accompaniment in pitch. The method includes: acquiring a fundamental frequency sequence of the user's singing voice (102); calculating, for each candidate melody file, the pitch value difference between the candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately counting the sum of all pitch value differences of each candidate melody file (103); and determining the candidate melody file with the smallest sum as a target melody file, and adjusting the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song (104). Since the pitch identified by the target melody file has the highest matching degree with the pitch of the user's singing voice, the accompaniment after pitch adjustment can fit the pitch of the user's singing voice, and the resulting mixed work can achieve a good listening experience. A pitch adjustment device and a computer storage medium are also provided.

Description

Pitch adjustment method, device, and computer storage medium
This application claims priority to Chinese Patent Application No. 202011163021.7, entitled "Pitch adjustment method, device, and computer storage medium" and filed with the Chinese Patent Office on October 27, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of data processing, and in particular to a pitch adjustment method, device, and computer storage medium.
Background
Current music software on smart terminals can provide users with a singing recording service: the music software plays the accompaniment of a song, the user sings along with the accompaniment, the music software records the user's singing voice, and then mixes the user's singing voice with the accompaniment of the song to finally synthesize a work that contains both the user's singing voice and the accompaniment of the song.
Limited by their own vocal conditions, some users cannot sing the high or low parts of a song. Therefore, even if the music software provides the reference pitch of the current accompaniment, the user still cannot sing well according to that reference pitch because of these vocal limitations. In this case, the user can manually adjust the key of the accompaniment so that it fits the user's own vocal conditions; that is, if the user cannot sing high notes, the user manually lowers the key of the accompaniment so that it becomes lower.
However, if the user does not manually adjust the key of the accompaniment, the user's singing voice and the accompaniment will be inconsistent in pitch when the work is synthesized, which seriously affects how the work sounds. If the user has to adjust the pitch of the accompaniment according to their own vocal conditions every time they sing, this also makes the music software inconvenient to use and affects the user experience.
Summary
Embodiments of the present application provide a pitch adjustment method, device, and computer storage medium, which are used to automatically adjust the accompaniment of a target song so that the user's singing voice matches the accompaniment in pitch.
A first aspect of the embodiments of the present application provides a pitch adjustment method, including:
acquiring a plurality of candidate melody files, where the candidate melody files are used to identify the pitch values of the notes in the melody of a target song, and the pitch values identified by each candidate melody file are different;
acquiring a fundamental frequency sequence of the singing voice of a user singing the target song, and converting the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files;
separately calculating the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately counting the sum of all pitch value differences of each candidate melody file;
determining the candidate melody file with the smallest sum as a target melody file, and adjusting the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
A second aspect of the embodiments of the present application provides a pitch adjustment device, including:
a first acquisition unit, configured to acquire a plurality of candidate melody files, where the candidate melody files are used to identify the pitch values of the notes in the melody of a target song, and the pitch values identified by each candidate melody file are different;
a second acquisition unit, configured to acquire a fundamental frequency sequence of the singing voice of a user singing the target song;
a conversion unit, configured to convert the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files;
a calculation unit, configured to separately calculate the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately count the sum of all pitch value differences of each candidate melody file;
a pitch adjustment unit, configured to determine the candidate melody file with the smallest sum as a target melody file, and adjust the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
A third aspect of the embodiments of the present application provides a pitch adjustment device, including:
a processor, a memory, a bus, and input/output devices;
the processor is connected to the memory and the input/output devices;
the bus is respectively connected to the processor, the memory, and the input/output devices;
the processor is configured to acquire a plurality of candidate melody files, where the candidate melody files are used to identify the pitch values of the notes in the melody of a target song and the pitch values identified by each candidate melody file are different; acquire a fundamental frequency sequence of the singing voice of a user singing the target song, and convert the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm, where the target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files; separately calculate the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately count the sum of all pitch value differences of each candidate melody file; and determine the candidate melody file with the smallest sum as a target melody file, and adjust the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
A fourth aspect of the embodiments of the present application provides a computer storage medium. The computer storage medium stores instructions that, when executed on a computer, cause the computer to execute the method of the foregoing first aspect.
It can be seen from the above technical solutions that the embodiments of the present application have the following advantages:
In the embodiments of the present application, the fundamental frequency sequence of the user's singing voice is acquired; the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point is calculated, and the sum of all pitch value differences of each candidate melody file is counted separately; the candidate melody file with the smallest sum is determined as the target melody file; and the pitch of the accompaniment file of the target song is adjusted according to the pitch value difference between the target melody file and the original melody file of the target song. Since the pitch identified by the target melody file has the highest matching degree with the pitch of the user's singing voice, the accompaniment after pitch adjustment can fit the pitch of the user's singing voice, and the resulting mixed work can achieve a good listening experience.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of the pitch adjustment method in an embodiment of the present application;
FIG. 2 is another schematic flowchart of the pitch adjustment method in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of the pitch adjustment device in an embodiment of the present application;
FIG. 4 is another schematic structural diagram of the pitch adjustment device in an embodiment of the present application;
FIG. 5 is another schematic structural diagram of the pitch adjustment device in an embodiment of the present application.
Detailed Description
Embodiments of the present application provide a pitch adjustment method, device, and computer storage medium, which are used to automatically adjust the accompaniment of a target song so that the user's singing voice matches the accompaniment in pitch.
Referring to FIG. 1, an embodiment of the pitch adjustment method in the embodiments of the present application includes:
101. Acquire a plurality of candidate melody files.
The method of this embodiment may be applied to a pitch adjustment apparatus, which may be a computer device capable of performing data processing tasks, such as a terminal or a server. When the apparatus is a terminal, it may be a smartphone, a tablet computer, a laptop computer, a desktop computer, a self-service kiosk, or a similar device; when it is a server, it may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud databases, cloud computing, big data, and artificial intelligence platforms.
In this embodiment, the pitch of the accompaniment of the target song is adjusted according to the pitch of the user's singing voice, so that the pitch of the accompaniment fits the pitch of the user's singing voice and the mixed work of the user's singing voice and the accompaniment sounds better. Based on this principle, when adjusting the pitch of the accompaniment of the target song, a plurality of candidate melody files are used as references to determine the degree to which the pitch of the accompaniment should be adjusted. Therefore, when adjusting the pitch of the accompaniment, a plurality of candidate melody files are acquired, where each candidate melody file is used to identify the pitch values of the notes in the melody of the target song, and the pitch values identified by each candidate melody file are different.
For a 108-key piano, the pitch value range is 0 to 108; for an 88-key piano, the pitch value range is 0 to 88. Therefore, the pitch value of the melody of the target song identified by a candidate melody file may be a pitch value within 0 to 108 or 0 to 88. For example, the pitch value identified by candidate melody file 1 is 0, the pitch value identified by candidate melody file 2 is 1, and so on.
102. Acquire a fundamental frequency sequence of the singing voice of the user singing the target song, and convert the frequency values of target fundamental frequency points of the fundamental frequency sequence into pitch values according to a preset algorithm.
When the user sings the target song, the user's singing voice is captured, and the pitch adjustment apparatus acquires the audio data of the user's singing voice and extracts the fundamental frequency of the singing voice to obtain a fundamental frequency sequence, which includes a plurality of fundamental frequency points. In this embodiment, there may be various methods for extracting the fundamental frequency of the singing voice; for example, commonly used fundamental frequency extraction algorithms include the autocorrelation algorithm, the parallel processing method, the cepstrum method, and the simplified inverse filtering method, and the fundamental frequency of the singing voice can be extracted based on the above algorithms to obtain the fundamental frequency sequence of the user's singing voice.
Since this embodiment uses a plurality of candidate melody files as references, and the candidate melody files identify the pitch values of the melody, when comparing the candidate melody files with the fundamental frequency sequence of the user's singing voice, the frequency values of the target fundamental frequency points in the fundamental frequency sequence need to be converted into pitch values. The target fundamental frequency points include the fundamental frequency points in the fundamental frequency sequence that correspond in time to the notes of the candidate melody files. In this way, the pitch values of the fundamental frequency points can be compared with the pitch values identified by the candidate melody files, and the comparison result can serve as the basis for adjusting the pitch of the accompaniment.
103. Separately calculate the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point, and separately count the sum of all pitch value differences of each candidate melody file.
Since a melody is composed of notes, the pitch values identified by a candidate melody file are the pitch values of its notes. After the frequency value of each fundamental frequency point of the fundamental frequency sequence is converted into a pitch value, the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point can be calculated, where a corresponding time point means that a fundamental frequency point of the fundamental frequency sequence falls within the time range of a certain note in the candidate melody file; at that time point the fundamental frequency point corresponds to the note in time. For example, the duration of a note is 1 s; if a fundamental frequency point falls within the time range of this 1 s note, the fundamental frequency point corresponds to the note in time, and the pitch value difference between the two can be calculated.
After the pitch value difference at each corresponding time point is calculated, all pitch value differences of each candidate melody file are accumulated separately to obtain the sum of the pitch value differences of each candidate melody file. The magnitude of this sum can reflect the gap between the pitch values of the candidate melody file and the pitch values of the fundamental frequency sequence of the user's singing voice: the larger the sum, the larger the gap and the less the pitch of the candidate melody file fits the pitch of the user's singing voice; the smaller the sum, the smaller the gap and the better the pitch of the candidate melody file fits the pitch of the user's singing voice, and adjusting the accompaniment pitch according to that candidate melody file can yield an accompaniment that matches the pitch of the user's singing voice.
104. Determine the candidate melody file with the smallest sum as the target melody file, and adjust the pitch of the accompaniment file of the target song according to the pitch value difference between the target melody file and the original melody file of the target song.
According to the above analysis, the smaller the sum of the pitch value differences of a candidate melody file, the more suitable it is for adjusting the pitch of the accompaniment. Therefore, after the sums of the pitch value differences of all candidate melody files are obtained, the candidate melody file with the smallest sum is determined as the target melody file, and the target melody file can serve as the basis for adjusting the pitch of the accompaniment.
In this embodiment, the original melody file of the target song is used to identify the pitch values of the notes in the original melody of the target song, and the original melody may be the singing melody of the original singer of the target song. Since the original singer is generally a relatively professional singer, the pitch of the original melody will generally fit the pitch of the accompaniment of the target song, and the pitch values identified by the original melody file will also match the pitch values of the accompaniment. Therefore, the pitch of the accompaniment file of the target song can be adjusted according to the pitch value difference between the target melody file and the original melody file. Since the pitch values identified by the target melody file match the pitch values of the fundamental frequency sequence of the user's singing voice, the accompaniment obtained by adjusting the pitch according to the target melody file will also match the pitch of the user's singing voice, so that the mixed work formed by the pitch-adjusted accompaniment and the user's singing voice sounds good.
For example, assume that the pitch values of the notes identified by a certain candidate melody file are 24, 25, 29, 31, 34, and 27 (in practical applications, the number of notes identified by a candidate melody file is determined by the target song; only a limited number of notes are listed here as an example), and the pitch values of the target fundamental frequency points in the fundamental frequency sequence of the target song that correspond to the above notes are 24, 25, 28, 31, 34, and 27. The pitch value differences between the corresponding target fundamental frequency points and the notes are calculated as 0, 0, 1, 0, 0, and 0 (the absolute value of each pitch value difference is taken), and the sum of the pitch value differences is counted as 1. By analogy, the sums of the pitch value differences of the other candidate melody files can be calculated.
Assume there are 12 candidate melody files whose sums of pitch value differences are 137, 109, 90, 73, 49, 24, 1, 22, 45, 67, 86, and 114; then the candidate melody file corresponding to the sum of 1 is determined as the target melody file. Assuming the target melody file differs from the original melody file of the target song by two semitone intervals in pitch, the pitch of the accompaniment file of the target song can be adjusted according to the pitch gap between the target melody file and the original melody file of the target song, so that the pitch-adjusted accompaniment fits the pitch of the user's singing voice and the listening experience is improved.
In this embodiment, the fundamental frequency sequence of the user's singing voice is acquired; the pitch value difference between each candidate melody file and the fundamental frequency sequence at each corresponding time point is calculated, and the sum of all pitch value differences of each candidate melody file is counted separately; the candidate melody file with the smallest sum is determined as the target melody file; and the pitch of the accompaniment file of the target song is adjusted according to the pitch value difference between the target melody file and the original melody file of the target song. Since the pitch identified by the target melody file has the highest matching degree with the pitch of the user's singing voice, the accompaniment after pitch adjustment can fit the pitch of the user's singing voice, and the resulting mixed work can achieve a good listening experience.
下面将在前述图1所示实施例的基础上,进一步详细地描述本申请实施例。请参阅图2,本申请实施例中音高调节方法另一实施例包括:
201、获取多个备选旋律文件;
本实施例中,该多个备选旋律文件可以是任意的用于标识目标歌曲的旋律 的音高值的文件,只要每个备选旋律文件所标识的音高值不同即可。
在一种优选的实施方式中,该多个备选旋律文件可以由目标歌曲的原始旋律文件变换得到。同样的,原始旋律文件用于标识目标歌曲的原始旋律的音高值,该原始旋律可以是目标歌曲的原唱者的歌声旋律。由于旋律由音符组成,因此,在对原始旋律文件进行升调或者降调的变换时,可以对该原始旋律文件的所有音符的音高值加上变换值,从而得到变换后的旋律文件。因此,该变换后的旋律文件以及该原始旋律文件可以分别作为备选旋律文件,均可以作为伴奏音高调节的参考依据。
It can be understood that, since the transformation of the original melody file may raise or lower the key, the transformation value may be positive or negative. For example, a transformation value of +1 raises the pitch values of the original melody file by one unit (an upward transposition), while -2 lowers them by two units (a downward transposition).
In this embodiment, the transformation of the original melody file may specifically be based on twelve-tone equal temperament, a tuning system that divides an octave into twelve equal parts, each called a semitone, and is the most widely used tuning method. Based on twelve-tone equal temperament, the octave in which the original melody file lies is divided into twelve equal semitone steps, the original melody file corresponding to one of the twelve. Then, following the interval relationships between the semitone step of the original melody file and the other semitone steps, the pitch values of all notes of the original melody file are each increased 11 times by the respective transformation values, yielding 11 transformed melody files. Because the transformation values are applied per semitone step, each transformed melody file also corresponds to one of the twelve semitone steps. The 11 transformed melody files together with the original melody file form the 12 candidate melody files.
For example, the 11 transformation values applied to the pitch values of all notes of the original melody file are +1, +2, +3, ..., +9, +10 and +11, so the original melody file has the lowest pitch values and the melody file transformed by +11 has the highest.
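A sketch of this candidate-generation step. Representing each melody file as a list of (start_time, end_time, pitch_value) note tuples is an assumption made for illustration.

```python
def build_candidates(original_notes):
    # original_notes: list of (start_time, end_time, pitch_value) tuples read from
    # the original melody file. Returns 12 candidates: the original (+0) plus the
    # 11 transposed copies (+1 .. +11 semitones) described above.
    return [
        [(start, end, pitch + shift) for (start, end, pitch) in original_notes]
        for shift in range(12)
    ]
```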
202. Obtain the fundamental-frequency sequence of the user's singing of the target song, and convert the frequency values of target fundamental-frequency points of the sequence into pitch values according to a preset algorithm.
In this embodiment, the specific content of the preset algorithm is not limited, as long as it can convert the frequency value of a fundamental-frequency point into a pitch value. For example, the preset algorithm may be the following formula:
pitch value = 12 × log2(hz_value / 440.0) + 69
where hz_value is the frequency value of the fundamental-frequency point. With this formula, the frequency value of a fundamental-frequency point can be converted into a pitch value.
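A direct transcription of the formula above into code; the function name and return type are illustrative choices, not part of the embodiment.

```python
import math

def hz_to_pitch(hz_value: float) -> float:
    # pitch value = 12 * log2(hz_value / 440.0) + 69, so 440 Hz maps to pitch 69.
    return 12.0 * math.log2(hz_value / 440.0) + 69.0
```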
In this embodiment, the target fundamental-frequency points may include all points of the fundamental-frequency sequence, or only the points that correspond in time to notes of the candidate melody files. When computing their pitch values, one approach is to traverse every fundamental-frequency point of the sequence, convert each point's frequency value into a pitch value using the preset algorithm, and then determine, among all points of the sequence, the target points that correspond in time to the notes of the candidate melody files. Another approach is to first determine, among all points of the sequence, the target points that correspond in time to the notes, and then convert only the frequency values of those target points; compared with the first approach, this avoids converting the other points, greatly reduces the pitch-value computation, and lowers the data-processing load.
203. For each candidate melody file, calculate the pitch difference from the fundamental-frequency sequence at every corresponding time point, and sum all pitch differences of each candidate melody file.
In this embodiment, when the pitch difference between each candidate melody file and the fundamental-frequency sequence is calculated at every corresponding time point, the pitch value of the note in each candidate melody file that corresponds in time to a target fundamental-frequency point is obtained; that is, when a fundamental-frequency point falls within the duration of a note, that point is the target fundamental-frequency point corresponding in time to that note. The pitch difference between the corresponding target fundamental-frequency point and note is then calculated, giving the pitch difference between the candidate melody file and the fundamental-frequency sequence at every corresponding time point.
Whether a note of a candidate melody file corresponds in time to a fundamental-frequency point can specifically be determined as follows: the candidate melody file also identifies the start time and end time of each note in the melody of the target song, so the note corresponding in time to a target fundamental-frequency point can be determined from the notes' start and end times; if a point falls within the period from a note's start time to its end time, the target fundamental-frequency point corresponds in time to that note. After the corresponding note is determined, its pitch value is obtained.
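A sketch of this matching step, under the same (start_time, end_time, pitch_value) note representation assumed earlier; treating points with pitch value 0 as unvoiced and skipping them is also an assumption.

```python
def pitch_diffs_for_candidate(notes, f0_times, f0_pitches):
    # notes: (start_time, end_time, pitch_value) tuples of one candidate melody file.
    # f0_times / f0_pitches: time stamps and pitch values of the user's
    # fundamental-frequency points.
    diffs = []
    for t, p in zip(f0_times, f0_pitches):
        if p <= 0:          # skip unvoiced points
            continue
        for start, end, note_pitch in notes:
            # The point is a target point when it falls inside the note's duration.
            if start <= t < end:
                diffs.append(abs(p - note_pitch))
                break
    return diffs
```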
After the pitch differences of all corresponding time points of each candidate melody file have been computed, the pitch differences of each candidate melody file are accumulated, giving the total pitch difference of each candidate melody file.
204. Determine the candidate melody file with the smallest sum as the target melody file.
The candidate melody file with the smallest total pitch difference matches the user's voice best in pitch, so the candidate melody file with the smallest total is determined as the reference for adjusting the accompaniment pitch.
205. Determine whether the proportion of notes whose pitch difference is 0 among all notes of the target melody file is greater than a preset threshold; if so, go to step 206; otherwise, go to step 207.
In this embodiment, after the target melody file is determined, how well it matches the user's voice in pitch can be further checked: the higher the proportion of notes whose pitch difference is 0 among all notes of the target melody file, the smaller the pitch discrepancy between the target melody file and the user's voice, and the better the match.
For example, if that proportion is 100%, the entire target melody file has no pitch discrepancy with the user's voice, the pitch values it identifies match the user's voice very well, and, from another perspective, the user has a strong command of intonation. Conversely, if the proportion is extremely low, the target melody file and the user's voice differ in pitch at many places and the match is poor, possibly because the user has a weak command of intonation, often sings off key, and cannot hold a consistent pitch.
The preset threshold may be set arbitrarily and may in practice be derived from experimental data; for example, it may be any value between 80% and 100%.
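A minimal sketch of this check; the threshold of 0.8 is an illustrative value taken from the 80%-100% range mentioned above, not a value fixed by the embodiment.

```python
def should_adjust(target_diffs, threshold=0.8):
    # target_diffs: per-note pitch differences of the target melody file.
    # Adjust the accompaniment only when the share of zero-difference notes
    # exceeds the preset threshold.
    if not target_diffs:
        return False
    zero_ratio = sum(1 for d in target_diffs if d == 0) / len(target_diffs)
    return zero_ratio > threshold
```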
206. Adjust the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file of the target song.
When the proportion of zero-difference notes among all notes of the target melody file is greater than the preset threshold, the target melody file matches the user's voice well in pitch, and the pitch of the accompaniment file of the target song is adjusted according to the pitch difference between the target melody file and the original melody file of the target song. The operation of this step is similar to step 104 of the embodiment shown in FIG. 1.
Because the target melody file is one of the candidate melody files obtained in step 201, if those candidate melody files were obtained by transforming the original melody file of the target song, the pitch difference between the target melody file and the original melody file can be determined directly from their transformation relationship.
Specifically, since step 201 transforms the original melody file into 12 candidate melody files based on twelve-tone equal temperament, each corresponding to one semitone step, there is an interval relationship between the target melody file and the original melody file, i.e. a certain number of semitones; expressed in pitch, this is the pitch gap between the melody of the target melody file and the melody of the original melody file. The pitch of the accompaniment file of the target song can therefore be adjusted according to the interval relationship between the target melody file and the original melody file.
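One possible way to apply such a semitone shift to the accompaniment audio is sketched below. The embodiment does not prescribe a particular pitch-shifting algorithm or library, so the use of librosa and soundfile here is an assumption made for illustration only.

```python
import librosa
import soundfile as sf

def shift_accompaniment(in_path, out_path, semitones):
    # semitones: the signed interval separating the target melody file from the
    # original melody file, e.g. -2 when the target sits two semitones lower.
    y, sr = librosa.load(in_path, sr=None)
    y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=semitones)
    sf.write(out_path, y_shifted, sr)
```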
207. Do not adjust the pitch of the accompaniment file.
When the proportion of zero-difference notes among all notes of the target melody file is not greater than the preset threshold, the target melody file and the user's voice differ in pitch at many places and the match is poor. The user is then considered to have a poor command of the song's intonation, and adjusting the accompaniment according to the target melody file would still not make the accompaniment fit the user's voice well; therefore the pitch of the accompaniment file is not adjusted and the accompaniment pitch is left unchanged.
In this embodiment, checking whether the proportion of zero-difference notes among all notes of the target melody file is greater than the preset threshold further establishes how well the target melody file matches the user's voice in pitch, which improves the practicability of the solution.
The pitch adjustment method in the embodiments of this application has been described above; the pitch adjustment device in the embodiments of this application is described below. Referring to FIG. 3, an embodiment of the pitch adjustment device includes:
a first obtaining unit 301, configured to obtain a plurality of candidate melody files, where a candidate melody file identifies pitch values of notes in the melody of a target song and each candidate melody file identifies different pitch values;
a second obtaining unit 302, configured to obtain a fundamental-frequency sequence of a user's singing of the target song;
a conversion unit 303, configured to convert frequency values of target fundamental-frequency points of the sequence into pitch values according to a preset algorithm, where the target fundamental-frequency points include the points of the sequence that correspond in time to notes of the candidate melody files;
a calculation unit 304, configured to calculate, for each candidate melody file, the pitch difference from the fundamental-frequency sequence at every corresponding time point, and to sum all pitch differences of each candidate melody file; and
a pitch adjustment unit 305, configured to determine the candidate melody file with the smallest sum as the target melody file, and to adjust the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file of the target song.
In a preferred implementation of this embodiment, the first obtaining unit 301 is specifically configured to obtain the original melody file of the target song, add a transformation value to the pitch values of all notes of the original melody file to obtain a transformed melody file, and use the original melody file and the transformed melody files as the candidate melody files.
In a preferred implementation of this embodiment, the first obtaining unit 301 is specifically configured to divide, based on twelve-tone equal temperament, the octave corresponding to the original melody file into twelve equal semitone steps, the original melody file corresponding to one of the twelve semitone steps;
and to apply, following the interval relationships between the semitone step corresponding to the original melody file and the other semitone steps, 11 additions of transformation values to the pitch values of all notes of the original melody file, obtaining 11 transformed melody files;
where each transformed melody file corresponds to one of the twelve semitone steps.
In a preferred implementation of this embodiment, when the target melody file is not the original melody file, the pitch adjustment unit 305 is specifically configured to adjust the pitch of the accompaniment file of the target song according to the interval relationship between the target melody file and the original melody file.
In a preferred implementation of this embodiment, the pitch adjustment device further includes:
a determination unit 306, configured to determine whether the proportion of notes whose pitch difference is 0 among all notes of the target melody file is greater than a preset threshold;
and the pitch adjustment unit 305 is specifically configured to perform the step of adjusting the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file of the target song when that proportion is greater than the preset threshold, and not to adjust the pitch of the accompaniment file when that proportion is not greater than the preset threshold.
In a preferred implementation of this embodiment, the conversion unit 303 is specifically configured to traverse every fundamental-frequency point of the sequence, convert each point's frequency value into a pitch value according to the preset algorithm, and determine the target fundamental-frequency points among all points of the sequence;
and the calculation unit 304 is specifically configured to obtain, from each candidate melody file, the pitch value of the note corresponding in time to a target fundamental-frequency point, and to calculate the pitch difference between the corresponding target fundamental-frequency point and note.
In a preferred implementation of this embodiment, the candidate melody file further identifies the start time and end time of each note in the melody of the target song;
and the calculation unit 304 is specifically configured to determine, from the start time and end time of the notes in each candidate melody file, the note corresponding in time to a target fundamental-frequency point, and to obtain the pitch value of the note corresponding in time to the target fundamental-frequency point.
In this embodiment, the operations performed by the units of the pitch adjustment device are similar to those described in the embodiments shown in FIG. 1 and FIG. 2 and are not repeated here.
In this embodiment, the second obtaining unit 302 obtains the fundamental-frequency sequence of the user's singing, the calculation unit 304 calculates the pitch difference between each candidate melody file and the fundamental-frequency sequence at every corresponding time point and sums all pitch differences of each candidate melody file, and the pitch adjustment unit 305 determines the candidate melody file with the smallest sum as the target melody file and adjusts the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file. Because the pitch identified by the target melody file matches the pitch of the user's voice best, the pitch-adjusted accompaniment fits the pitch of the user's voice, and the resulting mixed work sounds good.
The pitch adjustment device in the embodiments of this application is described below. When the pitch adjustment device is a server, its structure is shown in FIG. 4. Referring to FIG. 4, an embodiment of the pitch adjustment device includes the following.
The pitch adjustment device 400 may include one or more central processing units (CPU) 401 and a memory 405, where one or more application programs or data are stored in the memory 405.
The memory 405 may be volatile or persistent storage. The programs stored in the memory 405 may include one or more modules, each of which may include a series of instruction operations on the pitch adjustment device. Further, the central processing unit 401 may be configured to communicate with the memory 405 and to execute, on the pitch adjustment device 400, the series of instruction operations stored in the memory 405.
The pitch adjustment device 400 may also include one or more power supplies 402, one or more wired or wireless network interfaces 403, one or more input/output interfaces 404, and/or one or more operating systems such as Windows Server™, Mac OS X™, Unix™, Linux™ and FreeBSD™.
The central processing unit 401 may perform the operations performed by the pitch adjustment device in the embodiments shown in FIG. 1 and FIG. 2, which are not repeated here.
When the pitch adjustment device is a terminal, its structure is shown in FIG. 5. Referring to FIG. 5, an embodiment of the pitch adjustment device includes the following.
For ease of description, only the parts related to the embodiments of this application are shown; for specific technical details not disclosed, refer to the method part of the embodiments of this application. The terminal may be any terminal device such as a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal or an in-vehicle computer; a mobile phone is taken as an example below.
FIG. 5 is a block diagram of part of the structure of a mobile phone related to the terminal provided by an embodiment of this application. Referring to FIG. 5, the mobile phone includes components such as a radio frequency (RF) circuit 510, a memory 520, an input unit 530, a display unit 540, a sensor 550, an audio circuit 560, a wireless fidelity (WiFi) module 570, a processor 580 and a power supply 590. Those skilled in the art will understand that the phone structure shown in FIG. 5 does not limit the phone, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The components of the mobile phone are described below with reference to FIG. 5.
The RF circuit 510 may be used to receive and send signals during information transmission and reception or during a call; in particular, after receiving downlink information from a base station, it passes the information to the processor 580 for processing, and it sends uplink data to the base station. Generally, the RF circuit 510 includes but is not limited to an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 510 may communicate with networks and other devices through wireless communication, which may use any communication standard or protocol, including but not limited to the Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and so on.
The memory 520 may be used to store software programs and modules, and the processor 580 executes the various functional applications and data processing of the phone by running the software programs and modules stored in the memory 520. The memory 520 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and application programs required by at least one function (such as a sound-playing function or an image-playing function), and the data storage area may store data created according to the use of the phone (such as audio data or a phone book). In addition, the memory 520 may include a high-speed random access memory and may also include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The input unit 530 may be used to receive entered digits or characters and to generate key-signal input related to user settings and function control of the phone. Specifically, the input unit 530 may include a touch panel 531 and other input devices 532. The touch panel 531, also called a touch screen, can collect the user's touch operations on or near it (for example operations performed on or near the touch panel 531 with a finger, a stylus or any other suitable object or accessory) and drive the corresponding connection apparatus according to a preset program. Optionally, the touch panel 531 may include a touch detection apparatus and a touch controller: the touch detection apparatus detects the position of the user's touch and the signal brought by the touch operation and passes the signal to the touch controller, which receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 580, and receives and executes commands sent by the processor 580. The touch panel 531 may be implemented using resistive, capacitive, infrared, surface-acoustic-wave or other technologies. Besides the touch panel 531, the input unit 530 may also include other input devices 532, which may include but are not limited to one or more of a physical keyboard, function keys (such as volume-control keys and an on/off key), a trackball, a mouse and a joystick.
The display unit 540 may be used to display information entered by the user, information provided to the user, and the various menus of the phone. The display unit 540 may include a display panel 541, which may optionally be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch panel 531 may cover the display panel 541; after detecting a touch operation on or near it, the touch panel 531 passes the operation to the processor 580 to determine the type of touch event, and the processor 580 then provides the corresponding visual output on the display panel 541 according to the type of touch event. Although in FIG. 5 the touch panel 531 and the display panel 541 are shown as two separate components implementing the input and output functions of the phone, in some embodiments they may be integrated to implement those functions.
The phone may also include at least one sensor 550, such as a light sensor, a motion sensor and other sensors. Specifically, the light sensor may include an ambient-light sensor and a proximity sensor: the ambient-light sensor can adjust the brightness of the display panel 541 according to ambient light, and the proximity sensor can turn off the display panel 541 and/or the backlight when the phone is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the phone's posture (such as portrait/landscape switching, related games and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that may also be configured on the phone, such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, are not described here.
The audio circuit 560, a loudspeaker 561 and a microphone 562 provide an audio interface between the user and the phone. The audio circuit 560 can transmit the electrical signal converted from received audio data to the loudspeaker 561, which converts it into a sound signal for output; on the other hand, the microphone 562 converts collected sound signals into electrical signals, which the audio circuit 560 receives and converts into audio data; the audio data is then output to the processor 580 for processing and sent via the RF circuit 510 to, for example, another phone, or output to the memory 520 for further processing.
WiFi is a short-range wireless transmission technology; through the WiFi module 570 the phone can help the user send and receive e-mail, browse web pages, access streaming media and so on, providing wireless broadband Internet access. Although FIG. 5 shows the WiFi module 570, it is not an essential component of the phone.
The processor 580 is the control center of the phone; it connects all parts of the phone through various interfaces and lines, and performs the various functions of the phone and processes data by running or executing the software programs and/or modules stored in the memory 520 and invoking the data stored in the memory 520, thereby monitoring the phone as a whole. Optionally, the processor 580 may include one or more processing units; preferably, the processor 580 may integrate an application processor, which mainly handles the operating system, the user interface and application programs, and a modem processor, which mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 580.
The phone also includes a power supply 590 (such as a battery) supplying power to the components; preferably, the power supply may be logically connected to the processor 580 through a power-management system, so that charging, discharging and power-consumption management are implemented through the power-management system.
Although not shown, the phone may also include a camera, a Bluetooth module and the like, which are not described here.
In the embodiments of this application, the processor 580 included in the terminal can perform the functions of the embodiments shown in FIG. 1 and FIG. 2, which are not repeated here.
An embodiment of this application also provides a computer storage medium storing instructions which, when executed on a computer, cause the computer to perform the operations performed by the pitch adjustment device in the embodiments shown in FIG. 1 and FIG. 2.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; for example, the division into units is only a logical functional division, and there may be other divisions in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may physically exist alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

Claims (11)

  1. A pitch adjustment method, comprising:
    obtaining a plurality of candidate melody files, wherein each candidate melody file identifies pitch values of notes in a melody of a target song and the pitch values identified by each candidate melody file are different;
    obtaining a fundamental-frequency sequence of a user's singing of the target song, and converting frequency values of target fundamental-frequency points of the fundamental-frequency sequence into pitch values according to a preset algorithm, wherein the target fundamental-frequency points comprise fundamental-frequency points of the sequence that correspond in time to notes of the candidate melody files;
    calculating, for each candidate melody file, the pitch difference from the fundamental-frequency sequence at every corresponding time point, and summing all pitch differences of each candidate melody file; and
    determining the candidate melody file with the smallest sum as a target melody file, and adjusting the pitch of an accompaniment file of the target song according to the pitch difference between the target melody file and an original melody file of the target song.
  2. The pitch adjustment method according to claim 1, wherein obtaining the plurality of candidate melody files comprises:
    obtaining the original melody file of the target song;
    adding a transformation value to the pitch values of all notes of the original melody file to obtain a transformed melody file; and
    using the original melody file and the transformed melody files as the candidate melody files.
  3. The pitch adjustment method according to claim 2, wherein adding a transformation value to the pitch values of all notes of the original melody file to obtain a transformed melody file comprises:
    dividing, based on twelve-tone equal temperament, the octave corresponding to the original melody file into twelve equal semitone steps, the original melody file corresponding to one of the twelve semitone steps; and
    applying, following the interval relationships between the semitone step corresponding to the original melody file and the other semitone steps, 11 additions of transformation values to the pitch values of all notes of the original melody file, obtaining 11 transformed melody files;
    wherein each transformed melody file corresponds to one of the twelve semitone steps.
  4. The pitch adjustment method according to claim 3, wherein, when the target melody file is not the original melody file, adjusting the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file of the target song comprises:
    adjusting the pitch of the accompaniment file of the target song according to the interval relationship between the target melody file and the original melody file.
  5. The pitch adjustment method according to claim 1, wherein, after determining the candidate melody file with the smallest sum as the target melody file, the method further comprises:
    determining whether the proportion of notes whose pitch difference is 0 among all notes of the target melody file is greater than a preset threshold;
    if so, performing the step of adjusting the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file of the target song; and
    if not, not adjusting the pitch of the accompaniment file.
  6. The pitch adjustment method according to claim 1, wherein converting the frequency values of the target fundamental-frequency points of the fundamental-frequency sequence into pitch values according to the preset algorithm comprises:
    determining the target fundamental-frequency points of the fundamental-frequency sequence that correspond in time to notes of the candidate melody files; and
    converting the frequency values of the target fundamental-frequency points into pitch values according to the preset algorithm;
    and wherein calculating, for each candidate melody file, the pitch difference from the fundamental-frequency sequence at every corresponding time point comprises:
    obtaining the pitch value of the note in each candidate melody file corresponding in time to a target fundamental-frequency point, and calculating the pitch difference between the corresponding target fundamental-frequency point and note.
  7. The pitch adjustment method according to claim 6, wherein the candidate melody file further identifies the start time and end time of each note in the melody of the target song;
    and obtaining the pitch value of the note in each candidate melody file corresponding in time to the target fundamental-frequency point comprises:
    determining, from the start time and end time of the notes in each candidate melody file, the note corresponding in time to the target fundamental-frequency point; and
    obtaining the pitch value of the note corresponding in time to the target fundamental-frequency point.
  8. A pitch adjustment device, comprising:
    a first obtaining unit, configured to obtain a plurality of candidate melody files, wherein each candidate melody file identifies pitch values of notes in a melody of a target song and the pitch values identified by each candidate melody file are different;
    a second obtaining unit, configured to obtain a fundamental-frequency sequence of a user's singing of the target song;
    a conversion unit, configured to convert frequency values of target fundamental-frequency points of the fundamental-frequency sequence into pitch values according to a preset algorithm, wherein the target fundamental-frequency points comprise fundamental-frequency points of the sequence that correspond in time to notes of the candidate melody files;
    a calculation unit, configured to calculate, for each candidate melody file, the pitch difference from the fundamental-frequency sequence at every corresponding time point, and to sum all pitch differences of each candidate melody file; and
    a pitch adjustment unit, configured to determine the candidate melody file with the smallest sum as a target melody file and to adjust the pitch of an accompaniment file of the target song according to the pitch difference between the target melody file and an original melody file of the target song.
  9. The pitch adjustment device according to claim 8, further comprising:
    a determination unit, configured to determine whether the proportion of notes whose pitch difference is 0 among all notes of the target melody file is greater than a preset threshold;
    wherein the pitch adjustment unit is specifically configured to perform the step of adjusting the pitch of the accompaniment file of the target song according to the pitch difference between the target melody file and the original melody file of the target song when the proportion of notes whose pitch difference is 0 among all notes of the target melody file is greater than the preset threshold, and not to adjust the pitch of the accompaniment file when that proportion is not greater than the preset threshold.
  10. A pitch adjustment device, comprising:
    a processor, a memory, a bus, and input/output devices;
    wherein the processor is connected to the memory and the input/output devices;
    the bus connects the processor, the memory and the input/output devices; and
    the processor is configured to obtain a plurality of candidate melody files, wherein each candidate melody file identifies pitch values of notes in a melody of a target song and the pitch values identified by each candidate melody file are different; obtain a fundamental-frequency sequence of a user's singing of the target song, and convert frequency values of target fundamental-frequency points of the fundamental-frequency sequence into pitch values according to a preset algorithm, wherein the target fundamental-frequency points comprise fundamental-frequency points of the sequence that correspond in time to notes of the candidate melody files; calculate, for each candidate melody file, the pitch difference from the fundamental-frequency sequence at every corresponding time point, and sum all pitch differences of each candidate melody file; determine the candidate melody file with the smallest sum as a target melody file; and adjust the pitch of an accompaniment file of the target song according to the pitch difference between the target melody file and an original melody file of the target song.
  11. A computer storage medium storing instructions which, when executed on a computer, cause the computer to perform the method according to any one of claims 1 to 7.
PCT/CN2021/119571 2020-10-27 2021-09-22 Pitch adjustment method and device, and computer storage medium WO2022089098A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/034,031 US20230395051A1 (en) 2020-10-27 2021-09-22 Pitch adjustment method and device, and computer storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011163021.7 2020-10-27
CN202011163021.7A CN112270913B (zh) 2020-10-27 2020-10-27 Pitch adjustment method and device, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2022089098A1 true WO2022089098A1 (zh) 2022-05-05

Family

ID=74342914

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119571 WO2022089098A1 (zh) 2020-10-27 2021-09-22 Pitch adjustment method and device, and computer storage medium

Country Status (3)

Country Link
US (1) US20230395051A1 (zh)
CN (1) CN112270913B (zh)
WO (1) WO2022089098A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112270913B (zh) * 2020-10-27 2022-11-18 腾讯音乐娱乐科技(深圳)有限公司 Pitch adjustment method and device, and computer storage medium
CN113192477A (zh) * 2021-04-28 2021-07-30 北京达佳互联信息技术有限公司 Audio processing method and device
CN113178183B (zh) * 2021-04-30 2024-05-14 杭州网易云音乐科技有限公司 Sound effect processing method and device, storage medium and computing device
CN113314093B (zh) * 2021-06-01 2024-04-12 广州酷狗计算机科技有限公司 Audio synthesis method and device, terminal and storage medium
CN114566191A (zh) * 2022-02-25 2022-05-31 腾讯音乐娱乐科技(深圳)有限公司 Pitch correction method for recordings and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015225332A (ja) * 2014-05-30 2015-12-14 カシオ計算機株式会社 Musical tone generation device, electronic musical instrument, musical tone generation method, and program
CN108206026A (zh) * 2017-12-05 2018-06-26 北京小唱科技有限公司 Method and device for determining pitch deviation of audio content
CN109272975A (zh) * 2018-08-14 2019-01-25 无锡冰河计算机科技发展有限公司 Method and device for automatically adjusting singing accompaniment, and KTV jukebox
CN111785238A (zh) * 2020-06-24 2020-10-16 腾讯音乐娱乐科技(深圳)有限公司 Audio calibration method and device, and storage medium
CN112270913A (zh) * 2020-10-27 2021-01-26 腾讯音乐娱乐科技(深圳)有限公司 Pitch adjustment method and device, and computer storage medium

Also Published As

Publication number Publication date
US20230395051A1 (en) 2023-12-07
CN112270913A (zh) 2021-01-26
CN112270913B (zh) 2022-11-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884828

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18034031

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.08.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21884828

Country of ref document: EP

Kind code of ref document: A1