CN110265051A - Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education - Google Patents

Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education Download PDF

Info

Publication number
CN110265051A
CN110265051A (application CN201910480919.8A)
Authority
CN
China
Prior art keywords
audio
data
rhythm
pitch
sightsinging
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910480919.8A
Other languages
Chinese (zh)
Inventor
徐民洪
吴清强
刘昆宏
李昌春
黄仙寿
周道成
林辉杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Xiaozhi Dashu Information Technology Co Ltd
Original Assignee
Fujian Xiaozhi Dashu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Xiaozhi Dashu Information Technology Co Ltd filed Critical Fujian Xiaozhi Dashu Information Technology Co Ltd
Priority to CN201910480919.8A priority Critical patent/CN110265051A/en
Publication of CN110265051A publication Critical patent/CN110265051A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The present invention relates to a sight-singing audio intelligent scoring and modeling method applied to music sight-singing education. Step 1: the solfège recordings with expert scores collected in advance by the system are partitioned, the data being split 2:1, with two parts used as training data and one part as test data. Step 2: the audio data are denoised, the blank segments containing no audio are cut out, and speech-enhancement data preprocessing is carried out. Step 3: after preprocessing, audio features are extracted from the audio data with the mel-frequency cepstral coefficient method, and pitch information is extracted. Step 4: frequency-domain features are extracted from the pitch information with the short-time Fourier transform, and the beat information contained therein is extracted to form rhythm-based features. Step 5: scoring models are built on feature information such as pitch and rhythm. The present invention helps users improve their music sight-singing ability.

Description

Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education
Technical field
The present invention relates to a sight-singing audio intelligent scoring and modeling method applied to music sight-singing education.
Background technique
This system implements user recording and the upload of audio files to the system's back-end server, performs intelligent scoring of the solfège audio, and feeds the assessment results back to the client. The intelligent scoring module applies machine-learning modeling: by comparing the differences between the voice in the collected audio and the standard audio, it makes judgments from the two angles of rhythm and intonation (pitch accuracy), so as to achieve precise assessment, feed the results back to the user, and help users improve their music sight-singing ability.
Summary of the invention
The object of the present invention is to provide a sight-singing audio intelligent scoring and modeling method applied to music sight-singing education that helps users improve their music sight-singing ability.
The above object is achieved through the following technical scheme:
A sight-singing audio intelligent scoring and modeling method applied to music sight-singing education, in which data acquisition and preprocessing comprise the following steps:
Step 1: the solfège recordings with expert scores collected in advance by the system are partitioned, the data being split 2:1, with two parts used as training data and one part as test data; modeling is performed on the training data;
Step 2: the audio data are denoised, the blank segments containing no audio are cut out, and speech-enhancement data preprocessing is carried out;
Step 3: audio features are extracted from the audio data with the mel-frequency cepstral coefficient (MFCC) method, and pitch information is extracted;
Step 4: frequency-domain features are extracted from the audio data with the short-time Fourier transform (STFT), and the beat information contained therein is extracted to form rhythm-based features;
Step 5: intonation and rhythm features are extracted from the standard audio following steps 2 to 4;
Step 6: the intonation features of the standard audio and of the solfège audio, obtained with the MFCC method, are compared using the dynamic time warping (DTW) algorithm;
Step 7: the rhythm features of the standard audio and of the solfège audio, obtained with the short-time Fourier method, are compared using the linear scaling algorithm;
Step 8: the pitch and rhythm matching vectors thus obtained are used as training data to train a neural network; when the error rate on the test data set falls below 1%, the verification process ends;
Step 9: through the client interface of a WeChat mini-program, users upload the sight-singing audio of their individual practice; the uploaded audio is processed with steps 2 to 4 and steps 6 and 7, then fed into the trained neural network model, which outputs the corresponding rhythm and intonation scores; the rhythm and intonation scoring results output by the neural network are returned to the WeChat mini-program interface and displayed on the client;
Step 10: the corresponding intonation vector and rhythm vector are returned to the user client interface.
Advantageous effects:
1. The scoring quality of the invention reaches the level of professional scoring, with a small mean error relative to the scores of multiple human experts.
2. The scoring is efficient: the multi-angle scoring process completes within 5 seconds, meeting industrial application requirements.
3. The noise robustness is strong: scoring remains reliable even in the presence of some ambient noise.
4. The scoring process fuses multiple features, so the score can be judged from several angles such as rhythm and intonation.
Detailed description of the invention
Figure 1 is a schematic diagram of the training process of the invention.
Figure 2 is a schematic diagram of the scoring process of the invention.
Figure 3 is a schematic diagram of the bandwidth variation of the mel-scale filter bank of the invention.
Figure 4 is a schematic diagram showing how, by decomposing a signal, the convolution of two signals is converted into the addition of two signals.
Figure 5 is a schematic diagram of computing the similarity between two time series in the invention.
Figure 6 is a schematic diagram of the cost matrix of the invention.
Figure 7 is a schematic diagram of separating signals of different frequencies after the Fourier transform of the audio.
Specific embodiment
A sight-singing audio intelligent scoring and modeling method applied to music sight-singing education, characterized in that data acquisition and preprocessing comprise the following steps:
Step 1: the solfège recordings with expert scores collected in advance by the system are partitioned, the data being split 2:1, with two parts used as training data and one part as test data; modeling is performed on the training data;
Step 2: the audio data are denoised, the blank segments containing no audio are cut out, and speech-enhancement data preprocessing is carried out;
Step 3: audio features are extracted from the audio data with the mel-frequency cepstral coefficient (MFCC) method, and pitch information is extracted;
Step 4: frequency-domain features are extracted from the audio data with the short-time Fourier transform (STFT), and the beat information contained therein is extracted to form rhythm-based features;
Step 5: intonation and rhythm features are extracted from the standard audio following steps 2 to 4;
Step 6: the intonation features of the standard audio and of the solfège audio, obtained with the MFCC method, are compared using the dynamic time warping (DTW) algorithm;
Step 7: the rhythm features of the standard audio and of the solfège audio, obtained with the short-time Fourier method, are compared using the linear scaling algorithm;
Step 8: the pitch and rhythm matching vectors thus obtained are used as training data to train a neural network; when the error rate on the test data set falls below 1%, the verification process ends;
The neural network training process includes: (1) selecting the important parameters according to the characteristics of the data, including the activation function, the number of hidden layers of the network, the number of nodes in each hidden layer, the learning rate, and so on; (2) taking the difference data obtained by comparing the features extracted from the training data with the standard audio features as two input vectors, and the professional scores given by experts as the prediction target, and training the neural network. The target values are approximated with the back-propagation algorithm; after the training iterations, the error between the network output and the expert scores must fall below a certain threshold, and when the error rate on the test data set falls below 1% the verification process ends. If the target error range cannot be reached within 10,000 iterations, return to (1) and readjust the settings of the important parameters (a minimal training sketch is given after this step list).
Step 9: through the client interface of a WeChat mini-program, users upload the sight-singing audio of their individual practice; the uploaded audio is processed with steps 2 to 4 and steps 6 and 7, then fed into the trained neural network model, which outputs the corresponding rhythm and intonation scores; the rhythm and intonation scoring results output by the neural network are returned to the WeChat mini-program interface and displayed on the client;
Step 10: the corresponding intonation vector and rhythm vector are returned to the user client interface.
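For concreteness, the following is a minimal Python sketch of the training step above. It assumes scikit-learn's MLPRegressor as the framework (the patent names none), and the layer sizes, activation, learning rate, and placeholder data are illustrative, not the authors' settings:

import numpy as np
from sklearn.neural_network import MLPRegressor

# Inputs: per-recording pitch-matching vector concatenated with the
# rhythm-matching vector; target: the professional score given by experts.
X_train = np.random.rand(200, 64)       # placeholder matching vectors
y_train = np.random.rand(200) * 100.0   # placeholder expert scores

model = MLPRegressor(hidden_layer_sizes=(64, 32), activation="relu",
                     learning_rate_init=1e-3, max_iter=10000)
model.fit(X_train, y_train)             # back-propagation training
rhythm_pitch_score = model.predict(X_train[:1])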
Further, based on step 6, what is mainly compared is the degree to which the pitch rises and falls of the piano in the standard audio match those in the sight-singing audio. The method of linear pitch calibration is used here: the pitches of the voice and the piano are first linearly scaled to ensure that their average energies are identical, and on this basis the pitch-change matching vector of the audio sequences is compared.
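The following Python sketch shows one reading of this linear pitch calibration; the mean-based normalization and the frame-delta comparison are interpretations of the text above, not the authors' exact procedure:

import numpy as np

def calibrate_pitch(voice_f0, piano_f0):
    # Linearly scale the voice pitch series so its mean matches the piano's.
    scale = np.mean(piano_f0) / (np.mean(voice_f0) + 1e-12)
    return np.asarray(voice_f0, dtype=float) * scale

def pitch_change_matching_vector(voice_f0, piano_f0):
    # Compare frame-to-frame pitch changes after calibration.
    v = calibrate_pitch(voice_f0, piano_f0)
    p = np.asarray(piano_f0, dtype=float)
    m = min(len(v), len(p))
    return np.diff(v[:m]) - np.diff(p[:m])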
Further, based on step 7, what is mainly compared is the degree to which the tempo variations of the piano rhythm in the standard audio match those in the sight-singing audio. The method of linear rhythm calibration is used here: the rhythm of the voice is linearly scaled to ensure that its rate of tempo change is identical to that of the piano, and on this basis the tempo-variation matching vector of the audio sequences is compared.
Further, based on step 10, the interface is parsed and, at the corresponding positions in the musical score of the sight-sung piece, the passages where the user's match is poor are annotated in red.
Mel-frequency cepstral coefficients (MFCCs) are the coefficients that make up the mel-frequency cepstrum. MFCC feature extraction comprises two key steps: transforming to the mel-frequency scale, then performing cepstral analysis.
Further, the mel frequency is a nonlinear frequency scale based on the human ear's perception of equal pitch steps; its relationship with frequency in hertz is:
Mel(f) = 2595 · log10(1 + f / 700)
Thus, if the indexing is uniform on the mel scale, the corresponding spacing in hertz grows larger and larger, so the bandwidths of the mel-scale filter bank vary as shown in Fig. 1: the filter bank has high resolution in the low-frequency region, consistent with the auditory characteristics of the human ear, which is the physical meaning of the mel scale.
This step means: first apply the Fourier transform to the time-domain signal to move to the frequency domain, then use the mel-scale filter bank to slice the frequency-domain signal, finally obtaining one value per frequency band.
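As a concrete illustration of this pipeline, here is a minimal Python sketch using the librosa library (an assumption; the patent does not name an implementation), with illustrative frame parameters and a hypothetical file name:

import librosa

y, sr = librosa.load("solfege_take.wav", sr=16000)  # hypothetical recording
y, _ = librosa.effects.trim(y, top_db=30)           # cut blank segments (step 2)
# FFT -> mel filter bank -> log -> DCT, one 13-dim MFCC vector per frame:
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=1024, hop_length=256)
print(mfcc.shape)                                   # (13, number of frames)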
Cepstral analysis applies the Fourier transform to the time-domain signal, takes the logarithm, and then applies the inverse Fourier transform. It can be divided into the complex cepstrum, the real cepstrum, and the power cepstrum; ours is the power cepstrum.
Cepstral analysis makes it possible to decompose a signal so that the convolution of two signals is converted into the addition of two signals. An example follows:
Assume the spectrum above is X(k) and the corresponding time-domain signal is x(n), so that:
X(k) = DFT(x(n))
Consider splitting the frequency-domain X(k) into the product of two parts:
X(k) = H(k)E(k)
Assume the time-domain signals corresponding to the two parts are h(n) and e(n) respectively; then:
x(n) = h(n) * e(n)
At this point h(n) and e(n) cannot be separated.
Taking the logarithm of both sides in the frequency domain:
log(X(k)) = log(H(k)) + log(E(k))
Then applying the inverse Fourier transform:
IDFT(log(X(k))) = IDFT(log(H(k))) + IDFT(log(E(k)))
Denote the time-domain signal obtained at this point by:
x'(n) = h'(n) + e'(n)
Although the time-domain signal x'(n) obtained here is a cepstrum and differs from the original time-domain signal x(n), the convolution relationship between the time-domain signals has been converted into an additive (linear) relationship.
Corresponding to the spectrum above, the frequency-domain signal can be split into the product of two parts: the spectral envelope and the spectral details. The peaks of the spectrum are formants; they determine the envelope of the signal in the frequency domain and are the important information for distinguishing sounds, so the purpose of cepstral analysis is precisely to obtain the envelope of the spectrum. The envelope corresponds to the low-frequency information of the spectrum, and the details correspond to its high-frequency information. Since cepstral analysis converts the convolution relationship of the two corresponding time-domain parts into an additive relationship, the envelope part h'(t) can be obtained simply by passing the cepstrum through a low-pass filter.
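The derivation above condenses into a few lines of Python (numpy only); the window and the number of cepstral coefficients kept are illustrative assumptions:

import numpy as np

def spectral_envelope(frame, n_keep=30):
    # Power cepstrum: FFT -> log power -> inverse FFT.
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)
    cepstrum = np.fft.irfft(log_power)
    # Low-pass "lifter": keep only the low-quefrency (envelope) part.
    cepstrum[n_keep:-n_keep] = 0.0
    return np.fft.rfft(cepstrum).real  # smoothed log-spectral envelope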
The intonation comparison based on the dynamic time warping (DTW) algorithm is a method of measuring the similarity between two time series, used mainly in the field of speech recognition to decide whether two utterances represent the same word.
The two time series whose similarity is to be compared may have unequal lengths; in speech recognition this appears as different people speaking at different speeds. Moreover, the articulation rates of different phonemes within the same word also differ: for example, one person may drag the sound "A" out very long, or pronounce the "i" very short. In addition, two time series may differ only by a displacement along the time axis, so that once the displacement is removed they coincide. In these complex situations, the traditional Euclidean distance cannot effectively measure the distance/similarity between two time series.
DTW computes the similarity between two time series by stretching and shortening them, as shown in Fig. 3.
Let the two time series whose similarity is to be computed be X and Y, of lengths |X| and |Y| respectively.
A warping path has the form W = w1, w2, ..., wK, where max(|X|, |Y|) ≤ K ≤ |X| + |Y|.
Each wk has the form (i, j), where i denotes an index into X and j denotes an index into Y.
The warping path W must start at w1 = (1, 1) and end at wK = (|X|, |Y|), which guarantees that every index of X and Y appears in W.
In addition, i and j in wk = (i, j) must be monotonically increasing along W, which guarantees that the path does not cross the dotted line in Fig. 1; monotonically increasing means:
wk = (i, j), wk+1 = (i', j')
i ≤ i' ≤ i + 1, j ≤ j' ≤ j + 1
The warping path sought is the one with the shortest distance, obtained from the recurrence:
D(i, j) = Dist(i, j) + min[D(i-1, j), D(i, j-1), D(i-1, j-1)]
The final warping-path distance is D(|X|, |Y|).
It is solved with dynamic programming, as in Fig. 4, using a cost matrix D in which D(i, j) denotes the warping-path distance between the prefixes of lengths i and j of the two time series.
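A direct Python implementation of this recurrence, as a sketch (numpy only; the absolute difference as Dist(i, j) is an illustrative choice):

import numpy as np

def dtw_distance(x, y):
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)   # cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(x[i - 1] - y[j - 1])          # Dist(i, j)
            D[i, j] = dist + min(D[i - 1, j],        # D(i-1, j)
                                 D[i, j - 1],        # D(i, j-1)
                                 D[i - 1, j - 1])    # D(i-1, j-1)
    return D[n, m]                        # D(|X|, |Y|)

# e.g. a sung pitch contour against a reference contour:
print(dtw_distance([1, 2, 3, 3], [1, 3, 3]))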
Frequency-domain feature extraction based on the Fourier transform: after the Fourier transform, audio can be separated into different signals by frequency. The core idea of Fourier analysis is that any wave can be represented as a superposition of sine waves, where "wave" covers everything from sound to light; hence, taking the Fourier series of a recorded sound can separate out the signals at its component frequencies.
The Fourier transform is a method of analyzing signals: it can analyze the components of a signal, and it can also synthesize a signal from such components. It shows that a function satisfying certain conditions can be represented as trigonometric functions (sines and/or cosines) or a linear combination of their integrals. Many waveforms could serve as components of a signal, such as sine waves, square waves, and sawtooth waves; the Fourier transform uses sine waves as the components.
The frequency domain arises above all in radio-frequency and communication systems, and is encountered more and more in high-speed digital applications. Its most important property is that it is not real but a mathematical construct: the time domain is the only domain of objective existence, while the frequency domain is a mathematical framework that follows specific rules.
The sine wave is the only waveform that exists in the frequency domain; this is the most important rule of the frequency domain, namely that it is a description in terms of sine waves, because any waveform in the time domain can be synthesized from sine waves. This is a very important property of sine waves, though not exclusive to them; many other waveform families share it. Still, using sine waves as the functional form in the frequency domain has its special advantages: some problems, such as those related to the electrical effects of interconnects, become much clearer and easier to solve when described with sine waves, and transforming to the frequency domain sometimes yields an answer faster than working in the time domain alone.
In practice, if one builds a circuit containing resistors, inductors, and capacitors and feeds in an arbitrary waveform, the output will in general resemble a sine wave; moreover, such waveforms can easily be described as combinations of a few sine waves.
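A small Python demonstration of this frequency separation (numpy only; the two test tones are arbitrary choices):

import numpy as np

sr = 8000
t = np.arange(sr) / sr                    # one second of samples
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / sr)
peaks = sorted(freqs[np.argsort(spectrum)[-2:]])
print(peaks)                              # [440.0, 880.0]: the components separate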
The rhythm comparison based on the linear scaling algorithm compares two audio segments from the rhythm angle and must take into account differences in the speed of the user's humming of the melody. The linear scaling (linear stretching) algorithm can quickly compute the linear distance between two sequences of different lengths. The main reason for using linear scaling in humming rhythm scoring is that the user's humming rate and the performance rate of the original piece are inconsistent; through linear scaling, the hummed segment can be stretched or compressed to keep its rate consistent with that of the original. The key of this algorithm is to stretch or compress the fundamental-frequency sequence extracted from the hummed segment to different degrees, each time with a single uniform scaling factor, and then compute the rhythm comparison against the corresponding original.
The linear scaling algorithm solves the problem of the humming rate being inconsistent with the original rate, but the premise of its reliability is that the humming rate is exactly proportional to the original rate, i.e. that it does not speed up and slow down over time; if the humming rate varies, linear scaling will run into problems.
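A sketch of this linear scaling search in Python (numpy only; the candidate factor grid and the mean absolute difference as the distance are illustrative assumptions):

import numpy as np

def linear_scale(seq, factor):
    # Uniformly stretch/compress a sequence by resampling with interpolation.
    seq = np.asarray(seq, dtype=float)
    n_out = max(2, round(len(seq) * factor))
    return np.interp(np.linspace(0, 1, n_out), np.linspace(0, 1, len(seq)), seq)

def best_scaled_distance(hum_f0, ref_f0):
    ref_f0 = np.asarray(ref_f0, dtype=float)
    best = np.inf
    for factor in np.arange(0.5, 2.05, 0.05):   # candidate tempo ratios
        scaled = linear_scale(hum_f0, factor)
        m = min(len(scaled), len(ref_f0))
        best = min(best, float(np.mean(np.abs(scaled[:m] - ref_f0[:m]))))
    return best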
Of course, the above description does not limit the present invention, nor is the present invention limited to the above examples; variations, modifications, additions, or substitutions made by those skilled in the art within the essential scope of the present invention also belong to the protection scope of the present invention.

Claims (4)

  1. A sight-singing audio intelligent scoring and modeling method applied to music sight-singing education, characterized in that data acquisition and preprocessing comprise the following steps:
    Step 1: the solfège recordings with expert scores collected in advance by the system are partitioned, the data being split 2:1, with two parts used as training data and one part as test data; modeling is performed on the training data;
    Step 2: the audio data are denoised, the blank segments containing no audio are cut out, and speech-enhancement data preprocessing is carried out;
    Step 3: audio features are extracted from the audio data with the mel-frequency cepstral coefficient method, and pitch information is extracted;
    Step 4: frequency-domain features are extracted from the audio data with the short-time Fourier transform, and the beat information contained therein is extracted to form rhythm-based features;
    Step 5: intonation and rhythm features are extracted from the standard audio following steps 2 to 4;
    Step 6: the intonation features of the standard audio and of the solfège audio, obtained with the mel-cepstral-coefficient method, are compared using the dynamic time warping algorithm;
    Step 7: the rhythm features of the standard audio and of the solfège audio, obtained with the short-time Fourier method, are compared using the linear scaling algorithm;
    Step 8: the pitch and rhythm matching vectors thus obtained are used as training data to train a neural network; when the error rate on the test data set falls below 1%, the verification process ends;
    Step 9: through the client interface of a WeChat mini-program, users upload the sight-singing audio of their individual practice; the uploaded audio is processed with steps 2 to 4 and steps 6 and 7, then fed into the trained neural network model, which outputs the corresponding rhythm and intonation scores; the rhythm and intonation scoring results output by the neural network are returned to the WeChat mini-program interface and displayed on the client;
    Step 10: the corresponding intonation vector and rhythm vector are returned to the user client interface.
  2. The sight-singing audio intelligent scoring and modeling method applied to music sight-singing education according to claim 1, characterized in that: based on step 6, what is mainly compared is the degree to which the pitch rises and falls of the piano in the standard audio match those in the sight-singing audio; the method of linear pitch calibration is used, in which the pitches of the voice and the piano are first linearly scaled to ensure that their average energies are identical, and on this basis the pitch-change matching vector of the audio sequences is compared.
  3. The sight-singing audio intelligent scoring and modeling method applied to music sight-singing education according to claim 1, characterized in that: based on step 7, what is mainly compared is the degree to which the tempo variations of the piano rhythm in the standard audio match those in the sight-singing audio; the method of linear rhythm calibration is used, in which the rhythm of the voice is linearly scaled to ensure that its rate of tempo change is identical to that of the piano, and on this basis the tempo-variation matching vector of the audio sequences is compared.
  4. The sight-singing audio intelligent scoring and modeling method applied to music sight-singing education according to claim 1, characterized in that: based on step 10, the interface is parsed and, at the corresponding positions in the musical score of the sight-sung piece, the passages where the user's match is poor are annotated in red.
CN201910480919.8A 2019-06-04 2019-06-04 Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education Pending CN110265051A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910480919.8A CN110265051A (en) 2019-06-04 2019-06-04 Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910480919.8A CN110265051A (en) 2019-06-04 2019-06-04 Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education

Publications (1)

Publication Number Publication Date
CN110265051A true CN110265051A (en) 2019-09-20

Family

ID=67916665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910480919.8A Pending CN110265051A (en) 2019-06-04 2019-06-04 Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education

Country Status (1)

Country Link
CN (1) CN110265051A (en)



Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1737796A (en) * 2005-09-08 2006-02-22 上海交通大学 Across type rapid matching method for digital music rhythm
CN103514866A (en) * 2012-06-28 2014-01-15 曾平蔚 Method and device for instrumental performance grading
CN104143340A (en) * 2014-07-28 2014-11-12 腾讯科技(深圳)有限公司 Voice frequency evaluation method and device
CN106445964A (en) * 2015-08-11 2017-02-22 腾讯科技(深圳)有限公司 Audio information processing method and apparatus
CN106250400A (en) * 2016-07-19 2016-12-21 腾讯科技(深圳)有限公司 A kind of audio data processing method, device and system
CN107767847A (en) * 2017-09-29 2018-03-06 小叶子(北京)科技有限公司 A kind of intelligent piano performance assessment method and system
CN107967827A (en) * 2017-12-29 2018-04-27 重庆师范大学 A kind of music education exercise system and its method
CN109461431A (en) * 2018-12-24 2019-03-12 厦门大学 Sight-singing error score annotation method applied to music sight-singing education
CN109584904A (en) * 2018-12-24 2019-04-05 厦门大学 Sight-singing audio syllable-name recognition modeling method applied to music sight-singing education

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111508526A (en) * 2020-04-10 2020-08-07 腾讯音乐娱乐科技(深圳)有限公司 Method and device for detecting audio beat information and storage medium
CN111508526B (en) * 2020-04-10 2022-07-01 腾讯音乐娱乐科技(深圳)有限公司 Method and device for detecting audio beat information and storage medium
CN111653152A (en) * 2020-05-18 2020-09-11 河南财政金融学院 Using method of music education and exercise system
CN113657184A (en) * 2021-07-26 2021-11-16 广东科学技术职业学院 Evaluation method and device for piano playing fingering
CN113657184B (en) * 2021-07-26 2023-11-07 广东科学技术职业学院 Piano playing fingering evaluation method and device
CN113744721A (en) * 2021-09-07 2021-12-03 腾讯音乐娱乐科技(深圳)有限公司 Model training method, audio processing method, device and readable storage medium
CN113744721B (en) * 2021-09-07 2024-05-14 腾讯音乐娱乐科技(深圳)有限公司 Model training method, audio processing method, device and readable storage medium
CN114093386A (en) * 2021-11-10 2022-02-25 厦门大学 Education-oriented multi-dimensional singing evaluation method
CN115796653A (en) * 2022-11-16 2023-03-14 中南大学 Interview speech evaluation method and system

Similar Documents

Publication Publication Date Title
CN110265051A (en) Sight-singing audio intelligent scoring and modeling method applied to music sight-singing education
Dhingra et al. Isolated speech recognition using MFCC and DTW
Tiwari MFCC and its applications in speaker recognition
Patel et al. Speech recognition and verification using MFCC & VQ
JP2020524308A (en) Method, apparatus, computer device, program and storage medium for constructing voiceprint model
WO2017088364A1 (en) Speech recognition method and device for dynamically selecting speech model
Prasomphan Improvement of speech emotion recognition with neural network classifier by using speech spectrogram
Jancovic et al. Bird species recognition using unsupervised modeling of individual vocalization elements
CN103996155A (en) Intelligent interaction and psychological comfort robot service system
Li et al. Speech emotion recognition using 1d cnn with no attention
Mansour et al. Voice recognition using dynamic time warping and mel-frequency cepstral coefficients algorithms
CN102521281A (en) Humming computer music searching method based on longest matching subsequence algorithm
WO2020248388A1 (en) Method and device for training singing voice synthesis model, computer apparatus, and storage medium
Sefara The effects of normalisation methods on speech emotion recognition
CN102411932B (en) Methods for extracting and modeling Chinese speech emotion in combination with glottis excitation and sound channel modulation information
CN101178897A (en) Speaking man recognizing method using base frequency envelope to eliminate emotion voice
CN103456302B (en) A kind of emotional speaker recognition method based on the synthesis of emotion GMM Model Weight
CN107767881B (en) Method and device for acquiring satisfaction degree of voice information
CN112002348B (en) Method and system for recognizing speech anger emotion of patient
Tyagi et al. Automatic identification of bird calls using spectral ensemble average voice prints
Wang Speech recognition of oral English teaching based on deep belief network
CN109065073A (en) Speech-emotion recognition method based on depth S VM network model
Piotrowska et al. Machine learning-based analysis of English lateral allophones
CN109452932A (en) A kind of Constitution Identification method and apparatus based on sound
Chien et al. Evaluation of glottal inverse filtering algorithms using a physiologically based articulatory speech synthesizer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190920