WO2021190660A1 - Music chord recognition method and apparatus, and electronic device and storage medium - Google Patents

Music chord recognition method and apparatus, and electronic device and storage medium

Info

Publication number
WO2021190660A1
Authority
WO
WIPO (PCT)
Prior art keywords
note
chord
music
feature
music data
Prior art date
Application number
PCT/CN2021/084222
Other languages
French (fr)
Chinese (zh)
Inventor
蒋慧军
徐伟
杨艾琳
姜凯英
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021190660A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/18 Selecting circuits
    • G10H1/183 Channel-assigning means for polyphonic instruments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Definitions

  • This application relates to the field of artificial intelligence technology, in particular to a music chord recognition method and device, electronic equipment, and computer-readable storage media.
  • The current analysis of the chord performance of classical music is based on analysis of the chord symbols, from which chord properties such as the root note of the chord can be obtained, but this analysis method cannot obtain information about the chord function of classical music.
  • One of the objectives of the embodiments of the present application is to provide a method and device for recognizing music chords, electronic equipment, and computer-readable storage media, so as to solve the technical problem that information on the chord function of classical music cannot be obtained in the prior art.
  • an embodiment of the present application provides a music chord recognition method, the method includes:
  • an embodiment of the present application provides a music chord recognition device, including:
  • the note information processing module is configured to sequentially extract, for the music data of the music chord to be recognized, the note information corresponding to each note contained in the music data, and to construct the two-dimensional matrix representation of each note based on the note information corresponding to each note;
  • a note feature extraction module configured to extract the note feature corresponding to each note according to the two-dimensional matrix representation of each note
  • the chord feature recognition module is configured to recognize the chord feature corresponding to each note from different chord function recognition dimensions based on the note feature corresponding to each note;
  • the recognition result acquisition module is configured to combine the chord features of the same note in the different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and to use the feature sequence formed by the chord combination features corresponding to the respective notes as the music chord recognition result corresponding to the music data.
  • an embodiment of the present application provides an electronic device, including a memory, storing computer-readable instructions; a processor, which reads the computer-readable instructions stored in the memory, and when the processor executes a computer program:
  • the embodiments of the present application provide a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium stores a computer program which, when executed by a processor, implements:
  • the embodiment of the present application has the following beneficial effect: in the technical solution proposed in the embodiment of the present application, an artificial intelligence method automatically extracts, from the music data of the music chord to be recognized, the chord features of each note in different chord function recognition dimensions, and then combines the chord features of the same note in the different chord function recognition dimensions, so that the final music chord recognition result contains the chord information of each note in the different chord function recognition dimensions. That is, the embodiment of the present application can identify information on the chord function of music data based on digital notes, and solves the problem that information on the chord function of classical music cannot be obtained in the prior art.
  • Fig. 1 is a flow chart showing a method for music chord recognition according to an exemplary embodiment
  • Fig. 2 is a schematic structural diagram of a music chord recognition model according to an exemplary embodiment
  • Fig. 3 is a flow chart showing a method for music chord recognition according to another exemplary embodiment
  • Fig. 4 is a block diagram of a music chord recognition device according to an exemplary embodiment
  • Fig. 5 is a schematic diagram showing the hardware structure of an electronic device according to an exemplary embodiment.
  • At present, analysis of the chord performance of classical music is based on analysis of the chord symbols, from which chord properties such as the root note of the chord can be obtained. However, this analysis method cannot obtain information about the chord function of classical music.
  • this application proposes an artificial intelligence-based music chord recognition solution, and specifically proposes a music chord recognition method, device, electronic equipment, and computer-readable storage medium. Based on the music chord recognition solution proposed in this application, chord function information such as the chord mode, chord tonality, and chord transposition of each note can be recognized from the music data of the music chord to be recognized.
  • FIG. 1 is a flowchart of a method for recognizing music chords according to an exemplary embodiment.
  • The music chord recognition method at least includes steps S110 to S140, which are described in detail as follows:
  • Step S110 For the music data of the music chord to be identified, the note information corresponding to each note contained in the music data is sequentially extracted, and a two-dimensional matrix representation of each note is constructed based on the note information corresponding to each note.
  • chord involved in the embodiments of the present application is a musical theory concept, which can be understood as a musical melody with a certain interval relationship, and its definition will not be elaborated here.
  • the music data of the music chord to be recognized refers to digital music encoded in a digital music standard format.
  • This music data records the music melody through musical notes and digital control information.
  • The data format of this music data can be the Musical Instrument Digital Interface (MIDI) format, a widely used music standard format.
  • This music data format can be understood by computers. It is composed of instructions such as notes and control parameters. Almost all current music is produced based on the MIDI format.
  • the note information corresponding to each note may include the note pitch and note duration of each note, or the required note information may be extracted based on the actual scene, which is not limited here.
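  • As a minimal sketch of this extraction step, note pitch and note duration can be recovered by pairing note-on and note-off events. The event layout `(time, kind, pitch)`, the `Note` type, and the function name `extract_notes` are assumptions made for illustration; the application does not specify them, and reading a real MIDI file would additionally require a MIDI parsing library.

```python
from typing import List, NamedTuple, Tuple


class Note(NamedTuple):
    pitch: int      # MIDI pitch number, 0-127
    onset: int      # start time in ticks
    duration: int   # length in ticks


def extract_notes(events: List[Tuple[int, str, int]]) -> List[Note]:
    """Pair each note-on with the next note-off of the same pitch.

    `events` is a hypothetical, already-parsed list of
    (absolute_time_in_ticks, "on" | "off", pitch) tuples.
    """
    open_notes = {}  # pitch -> onset time of the currently sounding note
    notes = []
    for time, kind, pitch in sorted(events, key=lambda e: e[0]):
        if kind == "on":
            open_notes[pitch] = time
        elif kind == "off" and pitch in open_notes:
            onset = open_notes.pop(pitch)
            notes.append(Note(pitch, onset, time - onset))
    # Return the notes ordered by start time, then pitch.
    return sorted(notes, key=lambda n: (n.onset, n.pitch))


events = [(0, "on", 60), (0, "on", 64), (480, "off", 60), (960, "off", 64)]
notes = extract_notes(events)
```

  • Here `notes` holds one `(pitch, onset, duration)` record per note, which is the note information the later steps build on.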
  • A two-dimensional matrix representation of each note can then be constructed.
  • The note pitch can be used as the vertical element in the two-dimensional matrix, and the note duration can be used as the horizontal element in the two-dimensional matrix.
  • The two-dimensional matrix representation constructed in this way can be understood as a coordinate system with the pitch of the note as the vertical coordinate and time as the horizontal coordinate.
  • The two-dimensional matrix representation of each note thus carries the note information corresponding to that note, so the feature information of the music data on chord function can subsequently be obtained through recognition processing of the two-dimensional matrix representation of each note.
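  • One possible reading of this construction is sketched below: each note becomes a 0/1 matrix whose rows index pitch and whose columns index time steps. The 128-row pitch range, the fixed `width`, and the name `note_to_matrix` are assumptions for illustration and are not stated in the application.

```python
from typing import List

NUM_PITCHES = 128  # assumed: the MIDI pitch range


def note_to_matrix(pitch: int, duration: int, width: int) -> List[List[int]]:
    """Build a NUM_PITCHES x width 0/1 matrix for one note.

    Rows index pitch (vertical axis) and columns index time steps
    (horizontal axis). The row of the note is set to 1 for `duration`
    steps, truncated to `width`.
    """
    if not 0 <= pitch < NUM_PITCHES:
        raise ValueError("pitch out of range")
    matrix = [[0] * width for _ in range(NUM_PITCHES)]
    for step in range(min(duration, width)):
        matrix[pitch][step] = 1
    return matrix


m = note_to_matrix(pitch=60, duration=3, width=4)
```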
  • the music data of the to-be-identified music chords involved in this embodiment may be a digital expression corresponding to any form of music such as classical music and pop music, and this embodiment does not limit the music type to which this music data belongs.
  • Step S120: The note feature corresponding to each note is extracted according to the two-dimensional matrix representation of each note.
  • Feature information is extracted from the two-dimensional matrix representation, for example feature information on chord function such as the chord mode of each note (usually expressed as Key), the chord tonality (usually expressed as Quality), and the chord inversion (usually expressed as Inversion), from which the note feature corresponding to each note can be obtained.
  • A two-dimensional matrix representation sequence composed of the two-dimensional matrix representation of each note can be obtained, and this sequence can then be input into the feature extraction model to obtain the note feature sequence that the feature extraction model outputs for it. The obtained note feature sequence contains the note feature corresponding to each note.
  • The feature extraction model may be a machine learning model, for example a Long Short-Term Memory (LSTM) model or a bidirectional LSTM (Bi-LSTM) model, which is not restricted here. Generally speaking, however, the Bi-LSTM model extracts feature information better than the LSTM model, so in actual application scenarios the Bi-LSTM model can be selected to extract the note features corresponding to each note.
  • Feature information on chord functions such as chord mode, chord tonality, and chord transposition can also be extracted from the two-dimensional matrix representation of each note by note feature extraction algorithms, to obtain the note feature corresponding to each note. This is not restricted here.
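  • The bidirectional idea behind a Bi-LSTM can be illustrated with a much simpler stand-in. The sketch below uses a plain scalar tanh recurrent cell rather than an LSTM cell, fixed toy weights instead of learned ones, and one scalar per note in place of a full two-dimensional matrix; all of these are simplifying assumptions. It only shows how a forward pass and a backward pass are concatenated so that each note feature reflects both earlier and later notes.

```python
import math
from typing import List


def rnn_pass(inputs: List[float], w_in: float, w_rec: float) -> List[float]:
    """Run a scalar tanh recurrent cell over the sequence, left to right."""
    hidden, outputs = 0.0, []
    for x in inputs:
        hidden = math.tanh(w_in * x + w_rec * hidden)
        outputs.append(hidden)
    return outputs


def bidirectional_features(inputs: List[float]) -> List[List[float]]:
    """Concatenate forward and backward hidden states for each position."""
    forward = rnn_pass(inputs, w_in=0.5, w_rec=0.3)  # toy weights
    # Process the reversed sequence, then flip back to the original order.
    backward = rnn_pass(inputs[::-1], w_in=0.5, w_rec=0.3)[::-1]
    return [[f, b] for f, b in zip(forward, backward)]


# One scalar per note stands in for a flattened two-dimensional matrix.
features = bidirectional_features([0.60, 0.64, 0.67])
```

  • In a real system the recurrent cells would be LSTM cells from a deep learning framework and the weights would be learned during training.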
  • Step S130: Based on the note feature corresponding to each note, the chord feature corresponding to each note is identified from different chord function recognition dimensions.
  • In step S120, feature information of each note on chord function, such as chord mode, chord tonality, and chord inversion, is extracted from the two-dimensional matrix representation to obtain the note feature corresponding to each note.
  • The chord mode dimension, chord key dimension, and chord transposition dimension in the above examples act together on the chord function representation of the music data. That is, these chord function recognition dimensions influence each other, so it is necessary to identify the chord feature corresponding to each note from different chord function recognition dimensions.
  • The chord feature corresponding to each note can also be identified from other chord function recognition dimensions, such as the chord dominant (usually expressed as Pre.deg) dimension and the chord degree (usually expressed as Sec. deg) dimension, which are not limited in this embodiment.
  • A note feature sequence composed of the note features corresponding to each note may be obtained, and the note feature sequence may then be input into a plurality of preset chord function recognition models, to obtain the chord features that each chord function recognition model recognizes, from its own chord function recognition dimension, for each note feature in the note feature sequence.
  • Different chord function recognition models correspond to different chord function recognition dimensions. Therefore, in this embodiment, the process of identifying the chord feature corresponding to each note from different chord function recognition dimensions, based on the note feature corresponding to each note, can be split into multiple chord feature extraction subtasks that are performed separately or at the same time, and there is no restriction on this here.
  • The multiple chord function recognition models preset in this embodiment are all machine learning models, for example Conditional Random Field (CRF) models.
  • Different CRF models are used to constrain the recognition of each note feature in the note feature sequence from different chord function recognition dimensions.
  • The note feature sequence input to the chord function recognition models may be the output signal of the aforementioned feature extraction model. That is to say, in these embodiments, the LSTM model combined with the CRF models is used to automatically extract the chord features of each note contained in the music data in different chord function recognition dimensions, and the two kinds of model depend on and constrain each other during chord feature extraction, which can make the obtained chord features more accurate.
  • the process of identifying the chord characteristics corresponding to each note from different chord function recognition dimensions based on the note characteristics corresponding to each note can also be implemented according to the chord feature recognition algorithm in different chord function recognition dimensions. The embodiment also does not limit this.
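  • How a CRF constrains per-note recognition can be sketched with Viterbi decoding over a linear chain. This is a generic decoding routine, not the application's trained model: the label names `major` and `minor`, the emission scores, and the transition scores are all made-up values for one hypothetical recognition dimension. In the described scheme each dimension would have its own model and its own scores.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence of a linear-chain CRF.

    emissions[t][label] is the score of `label` at note t;
    transitions[a][b] is the score of moving from label a to label b.
    """
    labels = list(emissions[0])
    score = dict(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointer = {}, {}
        for cur in labels:
            best_prev = max(labels, key=lambda p: score[p] + transitions[p][cur])
            new_score[cur] = score[best_prev] + transitions[best_prev][cur] + emit[cur]
            pointer[cur] = best_prev
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final label.
    last = max(labels, key=lambda l: score[l])
    path = [last]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


emissions = [
    {"major": 2.0, "minor": 0.8},
    {"major": 0.9, "minor": 1.0},  # taken alone, this note slightly prefers "minor"
    {"major": 2.0, "minor": 0.5},
]
transitions = {
    "major": {"major": 0.5, "minor": -0.5},
    "minor": {"major": -0.5, "minor": 0.5},
}
path = viterbi(emissions, transitions)
```

  • With these scores the decoded sequence keeps the middle note as `major`, because the transition scores reward staying on the same label. This is the sense in which the sequence model constrains the label of each individual note.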
  • Step S140: Combine the chord features of the same note in different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and use the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
  • In step S130, the chord features of each note in the music data of the music chord to be recognized are obtained in different chord function recognition dimensions, and these chord function recognition dimensions influence each other. To facilitate an accurate and comprehensive analysis of the music data, the chord features of the same note in different chord function recognition dimensions are combined, so that the obtained chord combination features present the chord function feature information of each note in all aspects.
  • combining the chord features of the same note in different chord function recognition dimensions may be that the chord features in different chord function recognition dimensions are spliced according to a specified splicing order to obtain the chord combination characteristics.
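  • A minimal sketch of such splicing is given below. The dimension names, the splicing order in `SPLICE_ORDER`, the `|` separator, and the label values are assumptions for illustration only.

```python
from typing import Dict, List, Sequence

SPLICE_ORDER = ("mode", "tonality", "inversion")  # assumed dimension names and order


def combine_chord_features(per_dimension: Dict[str, List[str]],
                           order: Sequence[str] = SPLICE_ORDER) -> List[str]:
    """Join, note by note, the chord features from each dimension."""
    lengths = {len(per_dimension[name]) for name in order}
    if len(lengths) != 1:
        raise ValueError("every dimension must label the same number of notes")
    return ["|".join(parts) for parts in zip(*(per_dimension[name] for name in order))]


combined = combine_chord_features({
    "mode": ["C", "C", "G"],
    "tonality": ["maj", "maj", "min"],
    "inversion": ["0", "1", "0"],
})
```

  • The list `combined` plays the role of the feature sequence used as the recognition result, with one spliced entry per note.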
  • The process of combining the chord features of the same note in different chord function recognition dimensions may specifically be a process of associating the chord features of the same note in different chord function recognition dimensions with the corresponding note. Specifically, a unique note identifier can be assigned to each note contained in the music data, for example according to the order of each note in the music data. An association relationship can then be constructed between each note and its chord features in different chord function recognition dimensions, and the chord features of each note in the different chord function recognition dimensions are stored in a database based on the constructed association relationship.
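  • The association-based variant can be sketched with an in-memory SQLite database from the Python standard library. The table name `chord_feature`, its columns, and the sample labels are hypothetical; the application does not specify a schema or a database system.

```python
import sqlite3
from typing import Dict, List


def store_chord_features(conn: sqlite3.Connection,
                         per_dimension: Dict[str, List[str]]) -> None:
    """Store one row per (note identifier, dimension) pair."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chord_feature ("
        "note_id INTEGER, dimension TEXT, feature TEXT, "
        "PRIMARY KEY (note_id, dimension))"
    )
    rows = [
        (note_id, dimension, feature)
        for dimension, features in per_dimension.items()
        # Note identifiers follow the order of the notes in the music data.
        for note_id, feature in enumerate(features)
    ]
    conn.executemany("INSERT INTO chord_feature VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
store_chord_features(conn, {"mode": ["C", "G"], "tonality": ["maj", "min"]})
features_of_note_1 = dict(conn.execute(
    "SELECT dimension, feature FROM chord_feature WHERE note_id = 1"))
```

  • Looking up a note identifier then returns its chord features in every stored dimension.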
  • The music chord recognition result thus contains the chord information, in different chord function recognition dimensions, of each note in the music data of the music chord to be recognized. That is, this embodiment can recognize the feature information of the music data on chord function based on digital notes, which solves the problem that information on the chord function of classical music cannot be obtained in the prior art.
  • Fig. 2 is a schematic structural diagram of a music chord recognition model according to an exemplary embodiment.
  • the music chord recognition model is used to extract the chord characteristics of each note in different chord function recognition dimensions for each note contained in the music data of the music chord to be recognized, so as to automatically extract the characteristic information of the music data in the chord function.
  • The exemplary music chord recognition model is composed of a feature extraction model 10 and a plurality of chord function recognition models, where the plurality of chord function recognition models may include the chord function recognition models 21-23 shown in FIG. 2. The chord function recognition models 21-23 extract relevant feature information of the music data from different chord function recognition dimensions.
  • the feature extraction model 10 includes a Bi-LSTM network and a fully connected network.
  • The Bi-LSTM network is used to extract note features from the input two-dimensional matrix representation sequence, where the two-dimensional matrix representation sequence is composed of the two-dimensional matrix representations of the notes contained in the music data of the music chord to be recognized, and the two-dimensional matrix representation of each note is constructed based on the note information corresponding to that note.
  • The Bi-LSTM network extracts the corresponding note features from the input two-dimensional matrix representation sequence "s1, s2, s3...sn", which contains the two-dimensional matrix representation corresponding to each note. The obtained note features are passed through the fully connected network to obtain the note feature sequence composed of the note feature corresponding to each note, and the note feature sequence is output to each chord function recognition model.
  • each chord function recognition model may be a machine learning model obtained by training based on the CRF model.
  • The chord function recognition model 21, based on the note feature sequence output by the feature extraction model 10, recognizes the note feature of each note contained in the note feature sequence from the chord mode dimension, and outputs the first chord feature sequence composed of the chord feature of each note in the chord mode dimension.
  • The chord function recognition model 22, based on the note feature sequence output by the feature extraction model 10, recognizes the note feature of each note contained in the note feature sequence from the chord tonality dimension, and outputs the second chord feature sequence composed of the chord feature of each note in the chord tonality dimension.
  • The chord function recognition model 23, based on the note feature sequence output by the feature extraction model 10, recognizes the note feature of each note contained in the note feature sequence from the chord transposition dimension, and outputs the third chord feature sequence composed of the chord feature of each note in the chord transposition dimension.
  • The first chord feature sequence, the second chord feature sequence, and the third chord feature sequence output by the chord function recognition models all contain the chord features, in the respective chord function recognition dimension, of the music data of the music chord to be recognized.
  • The chord features of the same note in the different chord function recognition dimensions identified by the chord function recognition models can be combined to obtain the chord combination feature corresponding to each note, and the feature sequence formed by the chord combination features corresponding to the notes is used as the music chord recognition result corresponding to the music data. Based on the obtained chord recognition result, the music data can be analyzed accurately and comprehensively.
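  • The data flow of this structure, one shared feature extraction stage whose output feeds several recognition models, can be sketched as follows. The function `recognize_chords`, the toy extractor, and the toy threshold-based heads are hypothetical stand-ins used only to show the wiring; they are not the Bi-LSTM or CRF models of the embodiment.

```python
from typing import Callable, Dict, List, Sequence

Features = List[List[float]]
Head = Callable[[Features], List[str]]


def recognize_chords(matrix_sequence: Sequence,
                     feature_extractor: Callable[[Sequence], Features],
                     heads: Dict[str, Head]) -> List[Dict[str, str]]:
    """Run the shared extractor once, then every recognition head on its output."""
    note_features = feature_extractor(matrix_sequence)
    per_head = {name: head(note_features) for name, head in heads.items()}
    # Regroup the per-dimension label sequences into one record per note.
    return [
        {name: labels[i] for name, labels in per_head.items()}
        for i in range(len(note_features))
    ]


# Toy stand-ins: the "extractor" wraps each input value, each head thresholds it.
toy_extractor = lambda seq: [[float(x)] for x in seq]
toy_heads = {
    "mode": lambda feats: ["C" if f[0] < 0.5 else "G" for f in feats],
    "tonality": lambda feats: ["maj" if f[0] < 0.7 else "min" for f in feats],
}
chords = recognize_chords([0.2, 0.6, 0.9], toy_extractor, toy_heads)
```

  • Because the extractor runs only once, every head sees the same note feature sequence.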
  • the training process shown in FIG. 3 may be used to train the music chord recognition model shown in FIG. 2.
  • training the music chord recognition model includes steps S210 to S230, which are described in detail as follows:
  • Step S210 Obtain a data set for training a feature extraction model and a plurality of chord function recognition models, and the data set contains a plurality of music data to be trained.
  • Each piece of music data to be trained is usually a digital music representation corresponding to a complete piece of music; for example, it can be the music data corresponding to a complete song.
  • Step S220: Divide each piece of music data to be trained into a first music data segment, a second music data segment, and a third music data segment; form a training data set from the first music data segment corresponding to each piece of music data to be trained, a test data set from the second music data segment corresponding to each piece of music data to be trained, and a verification data set from the third music data segment corresponding to each piece of music data to be trained.
  • The music data to be trained can be divided by duration into music data segments of equal duration; or key data positions can be located in the music data and the music data divided at those positions, to obtain music data segments such as the prelude, verse, and chorus; or the different music data segments obtained by the division may also contain data corresponding to a repeated melody. This is not limited in this embodiment.
  • The way each piece of music data to be trained is divided into music data segments can be the same, so that the first music data segments contained in the training data set obtained in this embodiment have feature consistency.
  • The same applies to the test data set and the verification data set, which are likewise used to train the music chord recognition model, so that the music chord recognition model can have a better training effect.
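  • The equal-division variant of this data set construction can be sketched as follows. Splitting by item count rather than by duration, and the names `split_into_three` and `build_datasets`, are simplifying assumptions for illustration.

```python
from typing import Dict, List, Sequence


def split_into_three(notes: Sequence) -> List[Sequence]:
    """Cut one piece into three consecutive segments of near-equal length."""
    third = len(notes) // 3
    return [notes[:third], notes[third:2 * third], notes[2 * third:]]


def build_datasets(pieces: List[Sequence]) -> Dict[str, List[Sequence]]:
    """Collect first, second and third segments into the three data sets."""
    datasets = {"train": [], "test": [], "validation": []}
    for piece in pieces:
        first, second, third = split_into_three(piece)
        datasets["train"].append(first)
        datasets["test"].append(second)
        datasets["validation"].append(third)
    return datasets


datasets = build_datasets([list(range(9)), list(range(100, 106))])
```

  • Every piece contributes one segment to each data set, so all three sets draw on the same pieces, as described above.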
  • Step S230: The feature extraction model and the multiple chord function recognition models are trained according to the training data set, the test data set, and the verification data set.
  • The training data set can be used for a first round of training of the feature extraction model and the multiple chord function recognition models.
  • The test data set can then be used for a second round of training of the feature extraction model and the multiple chord function recognition models. The purpose of the second round of training is to test the model effect of the feature extraction model and the multiple chord function recognition models trained in the first round, and to further optimize that model effect based on the test results.
  • The verification data set is used to verify the model effect. If the model effect obtained by the verification is not good, the model effect of the feature extraction model and the multiple chord function recognition models needs to be further optimized.
  • In this embodiment, training the music chord recognition model based on the training data set, the test data set, and the verification data set is equivalent to training the music chord recognition model multiple times based on the same training data. This can not only improve the training effect of the music chord recognition model, but also increase the amount of training data of the music chord recognition model to a greater extent, further improving the training effect of the music chord recognition model.
  • The chord function recognition models 21-23 share the output signal of the feature extraction model 10, that is, the chord function recognition models 21-23 share the model parameters of the feature extraction model 10. Therefore, in some embodiments, the feature extraction model can first be trained based on the training data set, the test data set, and the verification data set. After the trained feature extraction model is obtained, each chord function recognition model is trained using the training data set, the test data set, the verification data set, and the output signal of the trained feature extraction model. The training loss value corresponding to each chord function recognition model is then obtained, and when the sum of the training loss values corresponding to the chord function recognition models is less than the loss threshold, the training of the multiple chord function recognition models ends. This embodiment trains each chord function recognition model based on the trained feature extraction model, which can increase the training rate of each chord function recognition model.
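  • The stopping rule described here, ending training once the summed loss of all chord function recognition models falls below a threshold, can be sketched as follows. The `step_fns` interface, the `max_steps` safeguard, and the toy loss that halves on every call are assumptions for illustration; a real training step would update model parameters and compute the loss from data.

```python
from typing import Callable, Dict


def train_heads(step_fns: Dict[str, Callable[[], float]],
                loss_threshold: float,
                max_steps: int = 1000) -> int:
    """Train every head each step until the summed loss is below the threshold.

    Each entry of `step_fns` performs one training step for one chord
    function recognition model and returns its current loss.
    Returns the number of steps taken.
    """
    for step in range(1, max_steps + 1):
        losses = {name: fn() for name, fn in step_fns.items()}
        if sum(losses.values()) < loss_threshold:
            return step
    return max_steps


def make_decaying_loss(start: float) -> Callable[[], float]:
    """Toy stand-in for a real training step: the loss halves on every call."""
    state = {"loss": start}

    def step() -> float:
        state["loss"] /= 2
        return state["loss"]

    return step


steps = train_heads(
    {"mode": make_decaying_loss(4.0), "tonality": make_decaying_loss(4.0)},
    loss_threshold=1.0,
)
```

  • With these toy losses the summed loss is 4.0, 2.0, 1.0, and then 0.5, so training stops on the fourth step, the first whose sum is strictly below the threshold.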
  • The combination of the feature extraction model and any one chord function recognition model can also be trained according to the training data set, the test data set, and the verification data set. In each training process, the model parameters of both the feature extraction model and the chord function recognition model are updated based on the training loss value. The training method proposed in this embodiment therefore trains the feature extraction model multiple times, so that the music chord recognition model has a better training effect.
  • Fig. 4 is a block diagram of a music chord recognition device according to an exemplary embodiment. As shown in Fig. 4, the music chord recognition device includes:
  • the note information processing module 410 is configured to sequentially extract the note information corresponding to each note contained in the music data for the music data of the music chord to be recognized, and construct a two-dimensional matrix representation of each note based on the note information corresponding to each note;
  • the note feature extraction module 420 is configured to extract the note feature corresponding to each note according to the two-dimensional matrix representation of each note;
  • the chord feature recognition module 430 is configured to recognize the chord feature corresponding to each note from different chord function recognition dimensions based on the note feature corresponding to each note;
  • the recognition result acquisition module 440 is configured to combine the recognized chord features of the same note in different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and to use the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
  • the music chord recognition device shown in this embodiment can recognize information on the chord function of music data based on digital notes, and can solve the problem that the information on the chord function of classical music cannot be obtained in the prior art.
  • The data format of the music data is the Musical Instrument Digital Interface (MIDI) format, and the note information processing module 410 includes:
  • the note information acquisition unit is configured to sequentially extract the note pitch and note duration of each note contained in the music data, and use the note pitch and note duration as the note information corresponding to each note;
  • the two-dimensional matrix representation construction unit is configured to construct a two-dimensional matrix representation of each note, taking the note pitch as the vertical element in the two-dimensional matrix and the note duration as the horizontal element in the two-dimensional matrix.
  • the musical note feature extraction module 420 includes:
  • the first sequence obtaining unit is configured to obtain a two-dimensional matrix representation sequence composed of the two-dimensional matrix representation of each note;
  • the first model processing unit is configured to input the two-dimensional matrix representation sequence into the feature extraction model, to obtain the note feature sequence output by the feature extraction model for the two-dimensional matrix representation sequence, the note feature sequence containing the note feature corresponding to each note.
  • the chord feature recognition module 430 includes:
  • the second sequence acquisition unit, configured to acquire a note feature sequence composed of the note features corresponding to the notes;
  • the second model processing unit, configured to input the note feature sequence into a plurality of preset chord function recognition models respectively, so as to obtain the chord features that the different chord function recognition models produce by recognizing each note feature in the note feature sequence from different chord function recognition dimensions.
  • the chord function recognition dimensions include at least the chord mode dimension, the chord tonality dimension, and the chord inversion dimension, which jointly act on the chord function representation of the music data.
  • the device further includes:
  • the data set acquisition module, configured to acquire a data set for training the feature extraction model and the multiple chord function recognition models, the data set containing multiple pieces of music data to be trained;
  • the data set processing module, configured to divide each piece of music data to be trained into a first music data segment, a second music data segment, and a third music data segment, and to form a training data set from the first music data segments, a test data set from the second music data segments, and a verification data set from the third music data segments of the pieces of music data to be trained;
  • the model training module, configured to train the feature extraction model and the multiple chord function recognition models according to the training data set, the test data set and the verification data set, so that the note feature corresponding to each note in the music data of the music chord to be recognized is extracted based on the trained feature extraction model, and the chord features corresponding to each note are recognized from different chord function recognition dimensions based on the trained multiple chord function recognition models.
  • the input signals of the multiple chord function recognition models are all output signals of the feature extraction model.
  • the model training module includes:
  • the feature extraction model training unit, configured to train the feature extraction model according to the training data set, the test data set and the verification data set;
  • the chord function recognition model training unit, configured to train the multiple chord function recognition models, after the trained feature extraction model is obtained, according to the training data set, the test data set, the verification data set, and the output signal of the trained feature extraction model;
  • the training monitoring unit, configured to monitor the training loss value corresponding to each chord function recognition model, and to end the training of the multiple chord function recognition models when the sum of the training loss values corresponding to the chord function recognition models is less than the loss threshold.
  • the present application also provides an electronic device, which includes a processor and a memory, and computer-readable instructions are stored on the memory.
  • when the computer-readable instructions are executed by the processor, the music chord recognition method described above is implemented.
  • Fig. 5 is a schematic diagram showing the hardware structure of an electronic device according to an exemplary embodiment.
  • the electronic device is only an example adapted to this application, and cannot be considered as providing any restriction on the scope of use of this application.
  • the electronic device also cannot be interpreted as needing to rely on, or necessarily having, one or more of the components of the exemplary electronic device shown in FIG. 5.
  • the electronic device includes: a power supply 510, an interface 530, at least one memory 550, and at least one central processing unit (CPU) 570.
  • the power supply 510 is used to provide working voltage for each hardware device on the electronic device.
  • the interface 530 includes at least one wired or wireless network interface 531, at least one serial-to-parallel conversion interface 533, at least one input/output interface 535, and at least one USB interface 537, etc., for communicating with external devices.
  • the memory 550, as a resource storage carrier, can be a read-only memory, a random access memory, a magnetic disk, an optical disc, etc.
  • the resources stored on it include the operating system 551, application programs 553, data 555, etc.
  • the storage may be short-term or permanent.
  • the operating system 551 is used to manage and control the hardware devices and application programs 553 on the electronic device so that the central processing unit 570 can compute and process the massive data 555; it can be Windows Server™, Mac OS X™, Unix™, Linux™, etc.
  • the application program 553 is a computer program that completes at least one specific task on top of the operating system 551. It may include at least one module (not shown in FIG. 5), and each module may contain a series of computer-readable instructions for the electronic device.
  • the data 555 may be HTTP protocol data stored on a disk or the like.
  • the central processing unit 570 may include one or more processors, and is configured to communicate with the memory 550 via a bus for computing and processing the massive data 555 in the memory 550.
  • the electronic device applicable to the present application will read a series of computer-readable instructions stored in the memory 550 through the central processing unit 570 to complete the music chord recognition method described in the foregoing embodiment.
  • the present application can also be implemented through hardware circuits, or through hardware circuits combined with software instructions; its implementation is therefore not limited to any specific hardware circuit, software, or combination of the two.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile, and a computer program is stored thereon. When the computer program is executed by a processor, the music chord recognition method described above is implemented.
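The data set processing module and the training monitoring unit described in the list above can be sketched as follows. This is a minimal illustration only: the function names, the 80/10/10 split ratio, and the dictionary of per-model losses are assumptions made for the example, and the source does not state how long each segment is or what loss threshold is used.

```python
from typing import Dict, List, Sequence, Tuple


def split_piece(piece: Sequence, train_frac: float = 0.8,
                test_frac: float = 0.1) -> Tuple[Sequence, Sequence, Sequence]:
    """Cut one piece into a first, second and third segment.

    The fractions are hypothetical; the text only says that each piece
    is divided into three segments.
    """
    n = len(piece)
    first_end = int(n * train_frac)
    second_end = first_end + int(n * test_frac)
    return piece[:first_end], piece[first_end:second_end], piece[second_end:]


def build_datasets(pieces: List[Sequence]) -> Tuple[list, list, list]:
    """First segments form the training set, second segments the test set,
    third segments the verification set, as in the module description."""
    train, test, verification = [], [], []
    for piece in pieces:
        first, second, third = split_piece(piece)
        train.append(first)
        test.append(second)
        verification.append(third)
    return train, test, verification


def should_stop(losses: Dict[str, float], loss_threshold: float) -> bool:
    """Training ends once the summed loss of all recognition models
    drops below the threshold."""
    return sum(losses.values()) < loss_threshold


# One toy "piece" of ten notes, represented here simply by their indices.
train_set, test_set, verification_set = build_datasets([list(range(10))])
stop_now = should_stop({"mode": 0.2, "tonality": 0.1, "inversion": 0.1}, 0.5)
```

Splitting every piece, rather than assigning whole pieces to one set, means each piece contributes to all three data sets, which matches the wording of the data set processing module.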

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Signal Processing (AREA)
  • Auxiliary Devices For Music (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

The present application relates to the technical field of artificial intelligence. Specifically provided is a music chord recognition method. The method comprises: for music data of a music chord to be recognized, successively extracting note information corresponding to each note included in the music data, and on the basis of the note information corresponding to each note, constructing a two-dimensional matrix representation of each note; according to the two-dimensional matrix representation of each note, extracting note features corresponding to each note; on the basis of the note features corresponding to each note, respectively recognizing, from different chord function recognition dimensions, chord features corresponding to each note; and combining the recognized chord features of the same note on the different chord function recognition dimensions, so as to obtain combined chord features corresponding to each note, and taking a feature sequence, which is formed from the combined chord features corresponding to each note, as a music chord recognition result corresponding to the music data. By means of the present application, information of music data on a chord function can be recognized on the basis of digital notes.

Description

Music chord recognition method and apparatus, electronic device, and storage medium
This application claims priority to the Chinese patent application No. 202011351757.7, entitled "Music chord recognition method and apparatus, electronic device, and storage medium", filed with the Patent Office of the State Intellectual Property Office of the People's Republic of China on November 25, 2020, the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a music chord recognition method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the continuous development of computer information technology, computer technology is applied ever more widely in the arts. For example, the body of Western classical music is enormous; an automatic analysis system for classical music would make classical music easier to study and give its dissemination greater potential.
The inventors realized that music analysis of classical music usually requires analyzing its chord properties. At present, however, such analysis is based on analyzing chord symbols. It yields chord properties such as the root of a chord, but it cannot yield information about the chord function of classical music.
Technical Problem
One objective of the embodiments of the present application is to provide a music chord recognition method and apparatus, an electronic device, and a computer-readable storage medium, so as to solve the technical problem that information about the chord function of classical music cannot be obtained in the prior art.
Technical Solution
In a first aspect, an embodiment of the present application provides a music chord recognition method, the method including:
for music data whose chords are to be recognized, sequentially extracting the note information corresponding to each note contained in the music data, and constructing a two-dimensional matrix representation of each note based on the note information corresponding to each note;
extracting the note feature corresponding to each note according to the two-dimensional matrix representation of each note;
recognizing, based on the note feature corresponding to each note, the chord features corresponding to each note from different chord function recognition dimensions;
combining the recognized chord features of the same note in the different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and using the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
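The last step of the method, combining the chord features of the same note across recognition dimensions into one feature sequence, can be sketched as below. The dimension names and label values are hypothetical placeholders chosen for the example; the source does not specify how chord features are encoded.

```python
from typing import Dict, List


def combine_chord_features(per_dimension: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Merge per-dimension label sequences into one combined feature per note.

    `per_dimension` maps a recognition dimension to its label sequence,
    with one label per note and all sequences in the same note order.
    """
    lengths = {len(labels) for labels in per_dimension.values()}
    if len(lengths) > 1:
        raise ValueError("every dimension must label the same number of notes")
    num_notes = lengths.pop() if lengths else 0
    return [
        {dimension: labels[i] for dimension, labels in per_dimension.items()}
        for i in range(num_notes)
    ]


# Hypothetical predictions for three notes in three dimensions.
predictions = {
    "mode": ["C", "C", "G"],
    "tonality": ["maj", "maj", "dom7"],
    "inversion": ["0", "1", "0"],
}
result = combine_chord_features(predictions)
```

Each element of `result` plays the role of a chord combination feature, and the list as a whole plays the role of the recognition result for the piece.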
In a second aspect, an embodiment of the present application provides a music chord recognition apparatus, including:
a note information processing module, configured to sequentially extract, for music data whose chords are to be recognized, the note information corresponding to each note contained in the music data, and to construct a two-dimensional matrix representation of each note based on the note information corresponding to each note;
a note feature extraction module, configured to extract the note feature corresponding to each note according to the two-dimensional matrix representation of each note;
a chord feature recognition module, configured to recognize, based on the note feature corresponding to each note, the chord features corresponding to each note from different chord function recognition dimensions;
a recognition result acquisition module, configured to combine the recognized chord features of the same note in the different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and to use the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory storing computer-readable instructions, and a processor that reads the computer-readable instructions stored in the memory; when executing the computer program, the processor implements:
for music data whose chords are to be recognized, sequentially extracting the note information corresponding to each note contained in the music data, and constructing a two-dimensional matrix representation of each note based on the note information corresponding to each note;
extracting the note feature corresponding to each note according to the two-dimensional matrix representation of each note;
recognizing, based on the note feature corresponding to each note, the chord features corresponding to each note from different chord function recognition dimensions;
combining the recognized chord features of the same note in the different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and using the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, which may be non-volatile or volatile and which stores a computer program; when executed by a processor, the computer program implements:
for music data whose chords are to be recognized, sequentially extracting the note information corresponding to each note contained in the music data, and constructing a two-dimensional matrix representation of each note based on the note information corresponding to each note;
extracting the note feature corresponding to each note according to the two-dimensional matrix representation of each note;
recognizing, based on the note feature corresponding to each note, the chord features corresponding to each note from different chord function recognition dimensions;
combining the recognized chord features of the same note in the different chord function recognition dimensions to obtain the chord combination feature corresponding to each note, and using the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
Beneficial Effects
Compared with the prior art, the embodiments of the present application have the following beneficial effects. In the technical solution proposed by the embodiments, the chord features of each note in different chord function recognition dimensions are extracted automatically, by means of artificial intelligence, from the music data whose chords are to be recognized, and the chord features of the same note in the different chord function recognition dimensions are then combined, so that the final music chord recognition result contains the chord information of each note in the different chord function recognition dimensions. In other words, the embodiments of the present application can recognize chord function information of music data on the basis of digital notes, which solves the problem that information about the chord function of classical music cannot be obtained in the prior art.
Brief Description of the Drawings
The drawings here are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present application, and together with the specification serve to explain the principles of the present application.
Fig. 1 is a flowchart of a music chord recognition method according to an exemplary embodiment;
Fig. 2 is a schematic structural diagram of a music chord recognition model according to an exemplary embodiment;
Fig. 3 is a flowchart of a music chord recognition method according to another exemplary embodiment;
Fig. 4 is a block diagram of a music chord recognition apparatus according to an exemplary embodiment;
Fig. 5 is a schematic diagram of the hardware structure of an electronic device according to an exemplary embodiment.
The above drawings show specific embodiments of the present application, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the present application to those skilled in the art by reference to specific embodiments.
Embodiments of the Invention
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically independent entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are only illustrative; they need not include all contents and operations/steps, nor must the steps be performed in the order described. For example, some operations/steps can be decomposed and others can be merged or partially merged, so the actual order of execution may change according to the actual situation.
As stated above, the current analysis of the chord properties of classical music is based on analyzing chord symbols. It yields chord properties such as the root of a chord, but it cannot yield information about the chord function of classical music.
To solve this problem, the present application proposes an artificial-intelligence-based music chord recognition solution, specifically a music chord recognition method, an apparatus, an electronic device, and a computer-readable storage medium. With this solution, chord function information of each note, such as chord mode, chord tonality, and chord inversion, can be recognized from the music data whose chords are to be recognized.
The music chord recognition method, apparatus, electronic device, and computer-readable storage medium proposed in the present application are described in detail below by way of specific embodiments.
Please refer to Fig. 1, which is a flowchart of a music chord recognition method according to an exemplary embodiment. The music chord recognition method includes at least steps S110 to S170, described in detail as follows:
Step S110: for music data whose chords are to be recognized, sequentially extract the note information corresponding to each note contained in the music data, and construct a two-dimensional matrix representation of each note based on the note information corresponding to each note.
It should first be noted that the chord referred to in the embodiments of the present application is a music theory concept, which can be understood as a musical melody with certain interval relationships; its definition is not elaborated here.
The music data whose chords are to be recognized is digital music encoded in a standard digital music format; it records the musical melody through notes and digital control information. For example, the data format of the music data may be the Musical Instrument Digital Interface (MIDI) format. MIDI is a widely used standard music format that a computer can interpret; it consists of instructions such as notes and control parameters, and almost all modern music is produced on the basis of the MIDI format.
Accordingly, based on the notes, control parameters, and other instructions contained in the music data, this embodiment can sequentially extract the note information corresponding to each note from the music data whose chords are to be recognized. For example, the note information corresponding to each note may include the note pitch and note duration of that note, or the required note information may be extracted according to the actual scenario; this is not limited here.
Based on the note information extracted for each note in this embodiment, a two-dimensional matrix representation of each note can be constructed. As described above, if the note pitch and note duration of each note are extracted, the note pitch may be taken as the vertical element of the two-dimensional matrix and the note duration as the horizontal element, or the note pitch as the horizontal element and the note duration as the vertical element, so as to construct the two-dimensional matrix representation of each note.
For ease of understanding, a two-dimensional matrix representation constructed with the note pitch as the vertical element and the note duration as the horizontal element can be regarded as a coordinate system with note pitch on the vertical axis and time on the horizontal axis. The two-dimensional matrix representation of each note therefore carries the note information of that note, and information about the chord function of the music data can then be obtained by recognition processing of the two-dimensional matrix representations of the notes.
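One plausible reading of this pitch-by-time representation is a piano-roll style matrix. The sketch below is an illustration under stated assumptions, not the patent's implementation: the `(pitch, onset, duration)` note record, the onset field itself, the 128-row MIDI pitch range, and the `ticks_per_step` quantization are all introduced for the example, since the source names only note pitch and note duration.

```python
from typing import List, Tuple

# Hypothetical note record: (MIDI pitch 0-127, onset in ticks, duration in ticks).
Note = Tuple[int, int, int]

NUM_PITCHES = 128  # MIDI pitch range, used as the vertical axis


def build_piano_roll(notes: List[Note], ticks_per_step: int) -> List[List[int]]:
    """Return a matrix with one row per pitch and one column per time step.

    A cell is 1 while the pitch of that row is sounding, else 0.
    """
    if not notes:
        return [[] for _ in range(NUM_PITCHES)]
    end_tick = max(onset + duration for _, onset, duration in notes)
    num_steps = -(-end_tick // ticks_per_step)  # ceiling division
    roll = [[0] * num_steps for _ in range(NUM_PITCHES)]
    for pitch, onset, duration in notes:
        first_step = onset // ticks_per_step
        last_step = -(-(onset + duration) // ticks_per_step)
        for step in range(first_step, last_step):
            roll[pitch][step] = 1
    return roll


# A C major triad (C4, E4, G4) held for four steps, then a D4 for two steps.
example_notes = [(60, 0, 480), (64, 0, 480), (67, 0, 480), (62, 480, 240)]
roll = build_piano_roll(example_notes, ticks_per_step=120)
```

Calling `build_piano_roll` with a single note gives a matrix for that note alone, which is closer to the per-note wording of the text; the whole-piece roll is shown here because it makes the two axes easier to see.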
It should also be noted that the music data whose chords are to be recognized in this embodiment may be the digital expression of any form of music, such as classical music or pop music; this embodiment does not limit the type of music to which the music data belongs.
Step S120: extract the note feature corresponding to each note according to the two-dimensional matrix representation of each note.
In this embodiment, to obtain information about the chord function of the music data, feature information must further be extracted from the two-dimensional matrix representations, for example the feature information of each note for chord functions such as chord mode (usually denoted Key), chord tonality (usually denoted Quality), and chord inversion (usually denoted Inversion), thereby obtaining the note feature corresponding to each note.
For example, to obtain the note feature corresponding to each note conveniently, a two-dimensional matrix representation sequence composed of the two-dimensional matrix representations of the notes can be obtained and input into a feature extraction model, and the note feature sequence that the feature extraction model outputs for the two-dimensional matrix representation sequence is then obtained; this note feature sequence contains the note feature corresponding to each note. The feature extraction model may be a machine learning model, for example an LSTM (Long Short-Term Memory) model or a Bi-LSTM (bidirectional long short-term memory) model; this is not limited here. In general, however, a Bi-LSTM model extracts feature information better than an LSTM model, so in practical application scenarios the Bi-LSTM model can be chosen to extract the note feature corresponding to each note.
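The difference between the two model choices can be stated compactly. The block below gives the standard Bi-LSTM formulation, not a formula from the source; the symbols are assumptions: $x_t$ is the (flattened) two-dimensional matrix representation at position $t$, $T$ is the sequence length, and $h_t$ is the resulting note feature.

```latex
\overrightarrow{h}_t = \mathrm{LSTM}_{\mathrm{fwd}}\bigl(x_t,\ \overrightarrow{h}_{t-1}\bigr), \qquad
\overleftarrow{h}_t = \mathrm{LSTM}_{\mathrm{bwd}}\bigl(x_t,\ \overleftarrow{h}_{t+1}\bigr), \qquad
h_t = \bigl[\,\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\,\bigr], \quad t = 1, \dots, T
```

A unidirectional LSTM keeps only the forward state, so the feature at position $t$ depends on earlier notes alone, whereas the concatenated state also depends on later notes. This is one likely reason the bidirectional model is preferred here, since the function of a chord often depends on what follows it.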
在其它的一些实施例中,也可以基于一些音符特征提取算法对各个音符的二维矩阵表示进行在弦调式、和弦调性以及和弦转位等和弦功能上的特征信息的提取,以得到各个音符对应的音符特征,本处也不对此进行限制。In some other embodiments, the two-dimensional matrix representation of each note can also be extracted based on some note feature extraction algorithms for the feature information of chord functions such as chord mode, chord tonality, and chord transposition, to obtain each note There are no restrictions on the corresponding note characteristics.
Step S130: based on the note feature corresponding to each note, identify the chord features corresponding to each note from different chord function recognition dimensions.
As described above, step S120 extracts from the two-dimensional matrix representations the feature information of each note on chord functions such as chord mode, chord key, and chord inversion, yielding the note feature of each note. The chord features of each note can therefore be identified from different chord function recognition dimensions, such as the chord mode dimension, the chord key dimension, and the chord inversion dimension.
It should be noted that the chord mode, chord key, and chord inversion dimensions exemplified above jointly make up the chord function representation of music data. When music data (for example, classical music) is analyzed, these dimensions influence one another, so the chord features of each note need to be identified from each dimension separately.
Of course, depending on the analysis requirements, the chord features of each note may also be identified from other chord function recognition dimensions, such as the chord primary degree dimension (usually denoted Pre.deg) and the chord secondary degree dimension (usually denoted Sec.deg). This embodiment imposes no limitation here.
In some embodiments, the note features corresponding to the notes can be assembled into a note feature sequence, which is then input into each of a plurality of preset chord function recognition models. Each model recognizes the notes in the sequence from its own chord function recognition dimension and outputs the resulting chord features. Because different chord function recognition models correspond to different dimensions, identifying the chord features of each note from the different dimensions can be treated as multiple chord feature extraction subtasks, performed either separately or simultaneously. No limitation is imposed here either.
For example, the preset chord function recognition models in this embodiment are all machine learning models, such as CRF (Conditional Random Field) models. Each CRF model constrains its own recognition of the note features in the note feature sequence according to a different chord function recognition dimension.
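As an illustration of how a CRF-based recognition model could label a note feature sequence in a single dimension, the sketch below runs Viterbi decoding over per-note emission scores and label-to-label transition scores. The function name, the two-label set, and all score values are hypothetical; the application does not specify how its CRF models are parameterized or decoded.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label path for a linear-chain CRF.

    emissions[t][k]   : score of label k at note position t (hypothetical values)
    transitions[j][k] : score of moving from label j to label k
    """
    num_labels = len(emissions[0])
    # best[k] is the best score of any path that ends in label k at the current note
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for k in range(num_labels):
            # pick the previous label that maximizes the score of arriving at k
            prev = max(range(num_labels), key=lambda j: best[j] + transitions[j][k])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][k] + emissions[t][k])
        best = new_best
        backpointers.append(pointers)
    # trace the best path back from the best final label
    path = [max(range(num_labels), key=lambda k: best[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical labels for the chord mode dimension: 0 = major, 1 = minor.
emissions = [[2.0, 0.5], [0.9, 1.0], [1.5, 0.2]]
transitions = [[0.5, -1.0], [-1.0, 0.5]]
mode_path = viterbi_decode(emissions, transitions)
```

In this toy input the second note slightly prefers label 1 on its own, but the transition scores penalize switching labels, so the decoded path stays on label 0 throughout. This is the kind of sequence-level constraint a CRF adds on top of per-note scores.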
In some embodiments, if the note features are extracted from the two-dimensional matrix representations by the aforementioned feature extraction model, the note feature sequence input into the chord function recognition models may be the output signal of that feature extraction model. In other words, these embodiments combine an LSTM model with CRF models to automatically extract the chord features of each note in the music data in different chord function recognition dimensions. The two kinds of model depend on and constrain each other during extraction, which can make the resulting chord features more accurate.
In addition, the process of identifying the chord features of each note from different chord function recognition dimensions based on the note features may also be implemented with chord feature recognition algorithms for the respective dimensions. This embodiment imposes no limitation here either.
Step S140: combine the chord features identified for the same note in the different chord function recognition dimensions to obtain a combined chord feature for each note, and take the feature sequence formed by the combined chord features of all notes as the music chord recognition result for the music data.
As described above, step S130 yields the chord features of each note in the music data in the different chord function recognition dimensions, and these dimensions influence one another. To support accurate and comprehensive analysis of the music data, this embodiment combines the chord features identified for the same note across the dimensions, so that the combined chord feature represents all aspects of the note's chord-function information.
For example, the chord features of the same note in the different chord function recognition dimensions may be concatenated in a specified order to obtain the combined chord feature.
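A minimal sketch of this concatenation, assuming each dimension yields one feature list per note. The dimension names, their order, and the feature values are hypothetical; the application only requires that some fixed order be specified.

```python
# Hypothetical concatenation order for the three dimensions named in the text.
DIMENSION_ORDER = ("mode", "key", "inversion")


def combine_chord_features(per_dimension, order=DIMENSION_ORDER):
    """Concatenate one note's chord features across dimensions in a fixed order."""
    combined = []
    for dimension in order:
        combined.extend(per_dimension[dimension])
    return combined


def combine_sequences(sequences, order=DIMENSION_ORDER):
    """Combine per-dimension chord feature sequences note by note."""
    lengths = {len(sequences[dimension]) for dimension in order}
    if len(lengths) != 1:
        raise ValueError("every dimension must cover the same notes")
    count = lengths.pop()
    return [
        combine_chord_features({d: sequences[d][i] for d in order}, order)
        for i in range(count)
    ]


# Toy feature sequences for two notes (values are illustrative only).
sequences = {
    "mode": [["major"], ["minor"]],
    "key": [["C"], ["A"]],
    "inversion": [["root"], ["first"]],
}
recognition_result = combine_sequences(sequences)
```

Here `recognition_result` holds one combined feature per note, in note order, which corresponds to the feature sequence that step S140 takes as the recognition result.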
Alternatively, in some embodiments, combining the chord features of the same note across dimensions may consist of storing those chord features in association with the corresponding note. Specifically, a unique note identifier can be assigned to each note in the music data, for example according to the order in which the notes appear in the music data. An association is then constructed between each note and its chord features in the different chord function recognition dimensions, and the chord features are stored in a database on the basis of that association. When the music chord recognition result needs to be displayed, for example when a user wants to view it, the chord features of each note in the different dimensions can be retrieved from the database through the note's associations. The method of this embodiment therefore makes the recognition result easy to display.
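The associated-storage variant could be sketched as follows, assuming an in-memory SQLite database and using each note's position as its identifier. The table name, column names, and feature values are hypothetical; the application does not name a database or schema.

```python
import sqlite3


def store_chord_features(conn, features_by_dimension):
    """Store each note's chord feature per dimension, keyed by note position."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chord_features ("
        "note_id INTEGER, dimension TEXT, feature TEXT, "
        "PRIMARY KEY (note_id, dimension))"
    )
    for dimension, sequence in features_by_dimension.items():
        # note_id is the index of the note in the music data (assumed identifier scheme)
        conn.executemany(
            "INSERT OR REPLACE INTO chord_features VALUES (?, ?, ?)",
            [(note_id, dimension, feature) for note_id, feature in enumerate(sequence)],
        )
    conn.commit()


def load_chord_features(conn, note_id):
    """Retrieve all stored chord features for one note as {dimension: feature}."""
    rows = conn.execute(
        "SELECT dimension, feature FROM chord_features WHERE note_id = ?",
        (note_id,),
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
store_chord_features(
    conn,
    {"mode": ["major", "minor"], "key": ["C", "A"], "inversion": ["root", "first"]},
)
second_note = load_chord_features(conn, 1)
```

Looking up a note by its identifier returns its features in every dimension at once, which is what a display layer would need.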
The music chord recognition result obtained in this embodiment thus contains the chord information of each note in the music data in the different chord function recognition dimensions. In other words, this embodiment can identify the chord-function feature information of music data from digital notes, which addresses the inability of the prior art to obtain chord-function information about classical music.
Fig. 2 is a schematic structural diagram of a music chord recognition model according to an exemplary embodiment. For each note contained in the music data whose chords are to be recognized, the model extracts the note's chord features in different chord function recognition dimensions, thereby automatically extracting the chord-function feature information of the music data.
As shown in Fig. 2, the exemplary music chord recognition model consists of a feature extraction model 10 and a plurality of chord function recognition models, which may include the chord function recognition models 21-23 shown in Fig. 2. Models 21-23 each extract feature information of the music data from a different chord function recognition dimension.
For example, the feature extraction model 10 includes a Bi-LSTM network and a fully connected network. The Bi-LSTM network extracts note features from the two-dimensional matrix representation sequence input to it. This sequence consists of the two-dimensional matrix representations of the notes contained in the music data whose chords are to be recognized, and each representation is constructed from the note information of the corresponding note. As shown in Fig. 2, the Bi-LSTM network extracts a note feature for each two-dimensional matrix representation in the input sequence "s1, s2, s3 ... sn". The fully connected network then processes these note features to produce the note feature sequence, which is output to each chord function recognition model. Each chord function recognition model may be a machine learning model obtained by training a CRF model.
Based on the note feature sequence output by the feature extraction model 10, the chord function recognition model 21 recognizes the note feature of each note in the sequence in the chord mode dimension and outputs a first chord feature sequence made up of the notes' chord features in the chord mode dimension.
Based on the same note feature sequence, the chord function recognition model 22 recognizes the note feature of each note in the chord key dimension and outputs a second chord feature sequence made up of the notes' chord features in the chord key dimension.
Based on the same note feature sequence, the chord function recognition model 23 recognizes the note feature of each note in the chord inversion dimension and outputs a third chord feature sequence made up of the notes' chord features in the chord inversion dimension.
It can be seen that the first, second, and third chord feature sequences output by the chord function recognition models each contain the chord features of the notes in the music data in one chord function recognition dimension. In an exemplary application scenario, the chord features identified for the same note by the different models can be combined into a combined chord feature for each note, and the feature sequence formed by these combined chord features is taken as the music chord recognition result for the music data. This result supports accurate and comprehensive analysis of the music data.
It should be noted that the detailed process by which the music chord recognition model shown in Fig. 2 extracts the chord features of each note in the different chord function recognition dimensions is described in the embodiment shown in Fig. 1 and is not repeated here.
In some exemplary embodiments, to ensure that the music chord recognition model shown in Fig. 2 recognizes chords well, the model can be trained with the training process shown in Fig. 3.
As shown in Fig. 3, in an exemplary embodiment, training the music chord recognition model includes steps S210 to S230, described in detail as follows:
Step S210: obtain a data set for training the feature extraction model and the plurality of chord function recognition models, the data set containing multiple pieces of music data to be trained.
It should be noted that, to train the music chord recognition model effectively, each piece of music data in this data set is usually the digital representation of a complete piece of music, for example the music data corresponding to a complete song.
Step S220: divide each piece of music data to be trained into a first music data segment, a second music data segment, and a third music data segment; form a training data set from the first music data segments of all pieces, a test data set from the second music data segments, and a validation data set from the third music data segments.
Each piece of music data can be divided into segments in various ways. For example, the piece can be divided by duration into segments of equal length. Alternatively, key data positions can be located in the piece and used as segment boundaries, yielding segments such as the intro, verse, and chorus. The resulting segments may also contain data corresponding to repeated melodies. This embodiment imposes no limitation here.
It should be noted that the same division method can be applied to every piece of music data, so that the first music data segments in the training data set have consistent characteristics and the music chord recognition model can be trained in a targeted way. The same applies to the test data set and the validation data set, so that the music chord recognition model is trained more effectively.
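The division and pooling in step S220 can be sketched as follows. As a simplifying assumption, each piece is a list of notes and is cut into three contiguous parts by note count, which stands in for the equal-duration division mentioned above; the function names are hypothetical.

```python
def split_into_three(notes):
    """Cut one piece into three contiguous segments of near-equal note count."""
    n = len(notes)
    first_end, second_end = n // 3, 2 * n // 3
    return notes[:first_end], notes[first_end:second_end], notes[second_end:]


def build_datasets(pieces):
    """Pool the first, second, and third segments of every piece into three sets."""
    training, test, validation = [], [], []
    for notes in pieces:
        # the same division method is applied to every piece
        first, second, third = split_into_three(notes)
        training.append(first)
        test.append(second)
        validation.append(third)
    return training, test, validation


# Two toy pieces whose "notes" are just integers.
pieces = [list(range(9)), list(range(6))]
training_set, test_set, validation_set = build_datasets(pieces)
```

Because every piece is divided the same way, each pooled set holds the corresponding part of every piece.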
Step S230: train the feature extraction model and the plurality of chord function recognition models according to the training data set, the test data set, and the validation data set.
In this embodiment, the training data set can first be used for a first round of training of the feature extraction model and the chord function recognition models. Once the trained models are obtained, the test data set is used for a second round of training, whose purpose is to test the performance of the models produced by the first round and to further optimize them based on the test results. After the second round, the validation data set is used to validate model performance; if the validated performance is poor, the feature extraction model and the chord function recognition models need further optimization.
Because the training, test, and validation data sets are obtained by dividing and pooling music data segments from the same data set, training the music chord recognition model on all three is equivalent to training it multiple times on the same training data. This not only improves the training effect but also increases the amount of training data considerably, which further improves the training effect.
In addition, in the music chord recognition model shown in Fig. 2, the chord function recognition models 21-23 share the output signal of the feature extraction model 10; that is, they share the model parameters of the feature extraction model 10. In some embodiments, therefore, the feature extraction model can first be trained on the training, test, and validation data sets. Once the trained feature extraction model is obtained, each chord function recognition model is trained on the same three data sets together with the output signal of the trained feature extraction model. The training loss value of each chord function recognition model is then obtained, and training of the chord function recognition models ends when the sum of these training loss values falls below a loss threshold. Training the chord function recognition models on top of an already trained feature extraction model can speed up their training.
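The stopping rule described here (end training when the summed loss of all chord function recognition models falls below a threshold) can be sketched as follows. The step functions are stand-ins that simply halve a stored loss; in practice each would run one training step of a real model. All names and values are hypothetical.

```python
def train_heads(step_fns, loss_threshold, max_steps=1000):
    """Run one step per recognition model until the summed loss is below the threshold.

    step_fns maps a dimension name to a callable that performs one training
    step for that model and returns its current training loss.
    """
    losses = {}
    for step in range(1, max_steps + 1):
        losses = {name: fn() for name, fn in step_fns.items()}
        if sum(losses.values()) < loss_threshold:
            return step, losses
    return max_steps, losses


def make_decaying_step(start):
    """Stand-in for a real training step: halves a stored loss on each call."""
    state = {"loss": start}

    def step():
        state["loss"] /= 2
        return state["loss"]

    return step


step_fns = {
    "mode": make_decaying_step(1.0),
    "key": make_decaying_step(2.0),
    "inversion": make_decaying_step(1.0),
}
steps_run, final_losses = train_heads(step_fns, loss_threshold=0.6)
```

With these stand-in losses the summed loss is 2.0, 1.0, then 0.5, so training stops after the third step.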
In some other embodiments, the combination of the feature extraction model with each individual chord function recognition model can instead be trained separately on the training, test, and validation data sets. In every training pass, the model parameters of both the feature extraction model and the chord function recognition model are updated according to the training loss value. With this training method the feature extraction model is trained several times as often, which gives the music chord recognition model a better training effect.
Fig. 4 is a block diagram of a music chord recognition apparatus according to an exemplary embodiment. As shown in Fig. 4, the music chord recognition apparatus includes:
a note information processing module 410, configured to sequentially extract, for the music data whose chords are to be recognized, the note information corresponding to each note contained in the music data, and to construct a two-dimensional matrix representation of each note based on that note information; a note feature extraction module 420, configured to extract the note feature corresponding to each note from the two-dimensional matrix representation of the note; a chord feature recognition module 430, configured to identify, based on the note feature corresponding to each note, the chord features corresponding to each note from different chord function recognition dimensions; and a recognition result acquisition module 440, configured to combine the chord features identified for the same note in the different chord function recognition dimensions to obtain a combined chord feature for each note, and to take the feature sequence formed by the combined chord features of all notes as the music chord recognition result for the music data.
The music chord recognition apparatus of this embodiment can identify the chord-function information of music data from digital notes, and can thus address the inability of the prior art to obtain chord-function information about classical music.
In another exemplary embodiment, the music data is in a musical instrument digital interface (MIDI) format, and the note information processing module 410 includes:
a note information acquisition unit, configured to sequentially extract the pitch and duration of each note contained in the music data and to use them as the note information corresponding to the note; and a two-dimensional matrix representation construction unit, configured to construct the two-dimensional matrix representation of each note with the note pitch as the vertical elements of the matrix and the note duration as the horizontal elements.
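One possible reading of this construction is a small piano-roll-style matrix per note, with one row per pitch and one column per time step. The sketch below assumes 128 MIDI pitches, a duration already quantized to whole time steps, and a fixed matrix width; none of these details are specified in the application, and the function name is hypothetical.

```python
NUM_PITCHES = 128  # MIDI note numbers 0-127 (assumed pitch range)


def note_to_matrix(pitch, duration_steps, max_steps):
    """Build a pitch-by-time matrix for one note.

    Rows (vertical axis) index pitch; columns (horizontal axis) index time
    steps. Cells in the note's pitch row are set to 1 for as many steps as
    the note lasts.
    """
    if not 0 <= pitch < NUM_PITCHES:
        raise ValueError("pitch must be a MIDI note number in 0-127")
    if not 0 < duration_steps <= max_steps:
        raise ValueError("duration must be between 1 and max_steps")
    matrix = [[0] * max_steps for _ in range(NUM_PITCHES)]
    for step in range(duration_steps):
        matrix[pitch][step] = 1
    return matrix


# Middle C (MIDI 60) held for 3 of 4 available time steps.
middle_c = note_to_matrix(60, 3, 4)
```

A sequence of such matrices, one per note in order, would then form the two-dimensional matrix representation sequence passed to the feature extraction model.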
In another exemplary embodiment, the note feature extraction module 420 includes:
a first sequence acquisition unit, configured to obtain the two-dimensional matrix representation sequence formed by the two-dimensional matrix representations of the notes; and a first model processing unit, configured to input the two-dimensional matrix representation sequence into the feature extraction model and obtain the note feature sequence that the feature extraction model outputs for it, the note feature sequence containing the note feature corresponding to each note.
In another exemplary embodiment, the chord feature recognition module 430 includes:
a second sequence acquisition unit, configured to obtain the note feature sequence formed by the note features corresponding to the notes; and a second model processing unit, configured to input the note feature sequence into each of a plurality of preset chord function recognition models and obtain the chord features that each chord function recognition model produces by recognizing the note features in the sequence from its own chord function recognition dimension.
In another exemplary embodiment, the chord function recognition dimensions include at least a chord mode dimension, a chord key dimension, and a chord inversion dimension, which jointly make up the chord function representation of the music data.
In another exemplary embodiment, the apparatus further includes:
a data set acquisition module, configured to obtain a data set for training the feature extraction model and the plurality of chord function recognition models, the data set containing multiple pieces of music data to be trained; a data set processing module, configured to divide each piece of music data to be trained into a first music data segment, a second music data segment, and a third music data segment, and to form a training data set from the first music data segments, a test data set from the second music data segments, and a validation data set from the third music data segments; and a model training module, configured to train the feature extraction model and the plurality of chord function recognition models according to the training data set, the test data set, and the validation data set, so that the trained feature extraction model extracts the note feature corresponding to each note in the music data whose chords are to be recognized, and the trained chord function recognition models identify the chord features corresponding to each note from different chord function recognition dimensions.
In another exemplary embodiment, the input signal of each of the chord function recognition models is the output signal of the feature extraction model, and the model training module includes:
a feature extraction model training unit, configured to train the feature extraction model according to the training data set, the test data set, and the validation data set; a chord function recognition model training unit, configured to train the plurality of chord function recognition models, after the trained feature extraction model is obtained, according to the training data set, the test data set, the validation data set, and the output signal of the trained feature extraction model; and a training monitoring unit, configured to obtain the training loss value of each chord function recognition model and to end the training of the plurality of chord function recognition models when the sum of these training loss values is less than a loss threshold.
It should be noted that the apparatus provided in the above embodiments and the method provided in the above embodiments belong to the same concept. The specific way in which each module operates has been described in detail in the method embodiments and is not repeated here.
In an exemplary embodiment, the present application also provides an electronic device that includes a processor and a memory. The memory stores computer-readable instructions which, when executed by the processor, implement the music chord recognition method described above.
Fig. 5 is a schematic diagram of the hardware structure of an electronic device according to an exemplary embodiment.
It should be noted that this electronic device is only an example suited to the present application and must not be regarded as limiting the scope of use of the present application in any way. Nor should the electronic device be interpreted as depending on, or being required to have, one or more components of the exemplary electronic device shown in Fig. 5.
The hardware structure of the electronic device may vary considerably with configuration or performance. As shown in Fig. 5, the electronic device includes a power supply 510, an interface 530, at least one memory 550, and at least one central processing unit (CPU) 570.
The power supply 510 provides the working voltage for each hardware component of the electronic device.
The interface 530 includes at least one wired or wireless network interface 531, at least one serial-to-parallel conversion interface 533, at least one input/output interface 535, at least one USB interface 537, and so on, and is used to communicate with external devices.
The memory 550 serves as a carrier for resource storage and may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like. The resources stored on it include an operating system 551, application programs 553, and data 555, and the storage may be temporary or permanent.
The operating system 551 manages and controls the hardware components of the electronic device and the application programs 553 so that the central processing unit 570 can compute and process the massive data 555. It may be Windows Server™, Mac OS X™, Unix™, Linux™, or the like. An application program 553 is a computer program that performs at least one specific task on top of the operating system 551. It may include at least one module (not shown in Fig. 5), and each module may contain a series of computer-readable instructions for the electronic device. The data 555 may be HTTP protocol data stored on a disk, among other things.
The central processing unit 570 may include one or more processors and is configured to communicate with the memory 550 through a bus in order to compute and process the massive data 555 in the memory 550.
As described in detail above, an electronic device to which the present application applies performs the music chord recognition method described in the foregoing embodiments by having the central processing unit 570 read a series of computer-readable instructions stored in the memory 550.
In addition, the present application can equally be implemented by hardware circuits or by hardware circuits combined with software instructions. Implementation of the present application is therefore not limited to any specific hardware circuit, software, or combination of the two.
In an exemplary embodiment, the present application further provides a computer-readable storage medium, which may be non-volatile or volatile and on which a computer program is stored. When executed by a processor, the computer program implements the music chord recognition method described above.
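As a rough illustration of the kind of computer program such a storage medium might hold, the following Python sketch shows two steps of the recognition method: building a pitch-by-time two-dimensional matrix for a single note, and combining the chord features recognized for each note in several dimensions into one combined feature per note. The 128-row MIDI pitch range, the binary cell values, the function names `note_to_matrix` and `combine_chord_features`, and the dimension labels in the usage note are all assumptions made for illustration; the application does not prescribe them, and the trained feature extraction and chord function recognition models are omitted.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed note encoding: (MIDI pitch 0-127, duration in discrete time steps).
Note = Tuple[int, int]

# Assumed matrix height: the full MIDI pitch range.
NUM_PITCHES = 128


def note_to_matrix(note: Note, width: int) -> List[List[int]]:
    """Build a binary pitch-by-time matrix for one note.

    Rows index pitch (vertical axis); columns index time steps
    (horizontal axis). The note occupies `duration` cells of its pitch row.
    """
    pitch, duration = note
    if not 0 <= pitch < NUM_PITCHES:
        raise ValueError(f"pitch {pitch} is outside 0..{NUM_PITCHES - 1}")
    if not 0 < duration <= width:
        raise ValueError(f"duration {duration} is outside 1..{width}")
    matrix = [[0] * width for _ in range(NUM_PITCHES)]
    for step in range(duration):
        matrix[pitch][step] = 1
    return matrix


def combine_chord_features(
    per_dimension: Dict[str, Sequence[object]]
) -> List[Tuple[object, ...]]:
    """Combine per-dimension chord features into one tuple per note.

    `per_dimension` maps a dimension name to a sequence holding one
    feature per note; the result is the per-note feature sequence.
    """
    sequences = list(per_dimension.values())
    if not sequences:
        return []
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("every dimension must cover the same number of notes")
    return list(zip(*sequences))
```

For example, `combine_chord_features({"mode": ["major"], "key": ["C"], "inversion": [0]})` pairs the labels note by note and yields a single `("major", "C", 0)` tuple.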
It should be understood that the present application is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (20)

  1. A music chord recognition method, comprising:
    for music data whose music chords are to be recognized, sequentially extracting note information corresponding to each note contained in the music data, and constructing a two-dimensional matrix representation of each note based on the note information corresponding to that note;
    extracting a note feature corresponding to each note according to the two-dimensional matrix representation of that note;
    based on the note feature corresponding to each note, recognizing chord features corresponding to each note from different chord function recognition dimensions; and
    combining the recognized chord features of the same note in the different chord function recognition dimensions to obtain a chord combination feature corresponding to each note, and taking the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
  2. The method according to claim 1, wherein the data format of the music data is the Musical Instrument Digital Interface (MIDI) format, and sequentially extracting the note information corresponding to each note contained in the music data and constructing the two-dimensional matrix representation of each note based on the note information corresponding to that note comprises:
    sequentially extracting the note pitch and note duration of each note contained in the music data, and taking the note pitch and note duration as the note information corresponding to that note; and
    taking the note pitch as the vertical elements of a two-dimensional matrix and the note duration as the horizontal elements of the two-dimensional matrix, to construct the two-dimensional matrix representation of each note.
  3. The method according to claim 1, wherein extracting the note feature corresponding to each note according to the two-dimensional matrix representation of that note comprises:
    acquiring a two-dimensional matrix representation sequence formed by the two-dimensional matrix representations of the notes; and
    inputting the two-dimensional matrix representation sequence into a feature extraction model, and acquiring a note feature sequence output by the feature extraction model for the two-dimensional matrix representation sequence, the note feature sequence containing the note feature corresponding to each note.
  4. The method according to claim 1, wherein recognizing, based on the note feature corresponding to each note, the chord features corresponding to each note from different chord function recognition dimensions comprises:
    acquiring a note feature sequence formed by the note features corresponding to the notes; and
    inputting the note feature sequence into each of a plurality of preset chord function recognition models, to acquire the chord features obtained when each chord function recognition model recognizes each note feature in the note feature sequence from a different chord function recognition dimension.
  5. The method according to claim 4, wherein the chord function recognition dimensions include at least a chord mode dimension, a chord tonality dimension, and a chord inversion dimension, and the chord mode dimension, the chord tonality dimension, and the chord inversion dimension jointly determine the chord function representation of the music data.
  6. The method according to claim 1, wherein the method further comprises:
    acquiring a data set for training a feature extraction model and a plurality of chord function recognition models, the data set containing a plurality of pieces of music data to be trained on;
    dividing each piece of music data to be trained on into a first music data segment, a second music data segment, and a third music data segment, forming a training data set from the first music data segments of the pieces, forming a test data set from the second music data segments of the pieces, and forming a validation data set from the third music data segments of the pieces; and
    training the feature extraction model and the plurality of chord function recognition models according to the training data set, the test data set, and the validation data set, so as to extract, based on the trained feature extraction model, the note feature corresponding to each note in the music data whose music chords are to be recognized, and to recognize, based on the trained chord function recognition models, the chord features corresponding to each note from the different chord function recognition dimensions.
  7. The method according to claim 5, wherein the input signals of the plurality of chord function recognition models are all output signals of the feature extraction model, and training the feature extraction model and the plurality of chord function recognition models according to the training data set, the test data set, and the validation data set comprises:
    training the feature extraction model according to the training data set, the test data set, and the validation data set;
    after the trained feature extraction model is obtained, training the plurality of chord function recognition models according to the training data set, the test data set, the validation data set, and the output signals of the trained feature extraction model; and
    determining the training loss value corresponding to the training of each chord function recognition model, and ending the training of the plurality of chord function recognition models when the sum of the training loss values corresponding to the chord function recognition models is less than a loss threshold.
  8. A music chord recognition apparatus, comprising:
    a note information processing module, configured to, for music data whose music chords are to be recognized, sequentially extract note information corresponding to each note contained in the music data, and construct a two-dimensional matrix representation of each note based on the note information corresponding to that note;
    a note feature extraction module, configured to extract a note feature corresponding to each note according to the two-dimensional matrix representation of that note;
    a chord feature recognition module, configured to recognize, based on the note feature corresponding to each note, chord features corresponding to each note from different chord function recognition dimensions; and
    a recognition result acquisition module, configured to combine the recognized chord features of the same note in the different chord function recognition dimensions to obtain a chord combination feature corresponding to each note, and to take the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
  9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements:
    for music data whose music chords are to be recognized, sequentially extracting note information corresponding to each note contained in the music data, and constructing a two-dimensional matrix representation of each note based on the note information corresponding to that note;
    extracting a note feature corresponding to each note according to the two-dimensional matrix representation of that note;
    based on the note feature corresponding to each note, recognizing chord features corresponding to each note from different chord function recognition dimensions; and
    combining the recognized chord features of the same note in the different chord function recognition dimensions to obtain a chord combination feature corresponding to each note, and taking the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
  10. The electronic device according to claim 9, wherein the data format of the music data is the Musical Instrument Digital Interface (MIDI) format, and the processor, when executing the computer program, further implements:
    sequentially extracting the note pitch and note duration of each note contained in the music data, and taking the note pitch and note duration as the note information corresponding to that note; and
    taking the note pitch as the vertical elements of a two-dimensional matrix and the note duration as the horizontal elements of the two-dimensional matrix, to construct the two-dimensional matrix representation of each note.
  11. The electronic device according to claim 9, wherein the processor, when executing the computer program, further implements:
    acquiring a two-dimensional matrix representation sequence formed by the two-dimensional matrix representations of the notes; and
    inputting the two-dimensional matrix representation sequence into a feature extraction model, and acquiring a note feature sequence output by the feature extraction model for the two-dimensional matrix representation sequence, the note feature sequence containing the note feature corresponding to each note.
  12. The electronic device according to claim 9, wherein the processor, when executing the computer program, further implements:
    acquiring a note feature sequence formed by the note features corresponding to the notes; and
    inputting the note feature sequence into each of a plurality of preset chord function recognition models, to acquire the chord features obtained when each chord function recognition model recognizes each note feature in the note feature sequence from a different chord function recognition dimension.
  13. The electronic device according to claim 9, wherein the processor, when executing the computer program, further implements:
    acquiring a data set for training a feature extraction model and a plurality of chord function recognition models, the data set containing a plurality of pieces of music data to be trained on;
    dividing each piece of music data to be trained on into a first music data segment, a second music data segment, and a third music data segment, forming a training data set from the first music data segments of the pieces, forming a test data set from the second music data segments of the pieces, and forming a validation data set from the third music data segments of the pieces; and
    training the feature extraction model and the plurality of chord function recognition models according to the training data set, the test data set, and the validation data set, so as to extract, based on the trained feature extraction model, the note feature corresponding to each note in the music data whose music chords are to be recognized, and to recognize, based on the trained chord function recognition models, the chord features corresponding to each note from the different chord function recognition dimensions.
  14. The electronic device according to claim 13, wherein the input signals of the plurality of chord function recognition models are all output signals of the feature extraction model, and the processor, when executing the computer program, further implements:
    training the feature extraction model according to the training data set, the test data set, and the validation data set;
    after the trained feature extraction model is obtained, training the plurality of chord function recognition models according to the training data set, the test data set, the validation data set, and the output signals of the trained feature extraction model; and
    determining the training loss value corresponding to the training of each chord function recognition model, and ending the training of the plurality of chord function recognition models when the sum of the training loss values corresponding to the chord function recognition models is less than a loss threshold.
  15. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements:
    for music data whose music chords are to be recognized, sequentially extracting note information corresponding to each note contained in the music data, and constructing a two-dimensional matrix representation of each note based on the note information corresponding to that note;
    extracting a note feature corresponding to each note according to the two-dimensional matrix representation of that note;
    based on the note feature corresponding to each note, recognizing chord features corresponding to each note from different chord function recognition dimensions; and
    combining the recognized chord features of the same note in the different chord function recognition dimensions to obtain a chord combination feature corresponding to each note, and taking the feature sequence formed by the chord combination features corresponding to the notes as the music chord recognition result corresponding to the music data.
  16. The computer-readable storage medium according to claim 15, wherein the data format of the music data is the Musical Instrument Digital Interface (MIDI) format, and the computer program, when executed by the processor, further implements:
    sequentially extracting the note pitch and note duration of each note contained in the music data, and taking the note pitch and note duration as the note information corresponding to that note; and
    taking the note pitch as the vertical elements of a two-dimensional matrix and the note duration as the horizontal elements of the two-dimensional matrix, to construct the two-dimensional matrix representation of each note.
  17. The computer-readable storage medium according to claim 15, wherein the computer program, when executed by the processor, further implements:
    acquiring a two-dimensional matrix representation sequence formed by the two-dimensional matrix representations of the notes; and
    inputting the two-dimensional matrix representation sequence into a feature extraction model, and acquiring a note feature sequence output by the feature extraction model for the two-dimensional matrix representation sequence, the note feature sequence containing the note feature corresponding to each note.
  18. The computer-readable storage medium according to claim 15, wherein the computer program, when executed by the processor, further implements:
    acquiring a note feature sequence formed by the note features corresponding to the notes; and
    inputting the note feature sequence into each of a plurality of preset chord function recognition models, to acquire the chord features obtained when each chord function recognition model recognizes each note feature in the note feature sequence from a different chord function recognition dimension.
  19. The computer-readable storage medium according to claim 15, wherein the computer program, when executed by the processor, further implements:
    acquiring a data set for training a feature extraction model and a plurality of chord function recognition models, the data set containing a plurality of pieces of music data to be trained on;
    dividing each piece of music data to be trained on into a first music data segment, a second music data segment, and a third music data segment, forming a training data set from the first music data segments of the pieces, forming a test data set from the second music data segments of the pieces, and forming a validation data set from the third music data segments of the pieces; and
    training the feature extraction model and the plurality of chord function recognition models according to the training data set, the test data set, and the validation data set, so as to extract, based on the trained feature extraction model, the note feature corresponding to each note in the music data whose music chords are to be recognized, and to recognize, based on the trained chord function recognition models, the chord features corresponding to each note from the different chord function recognition dimensions.
  20. The computer-readable storage medium according to claim 19, wherein the input signals of the plurality of chord function recognition models are all output signals of the feature extraction model, and the computer program, when executed by the processor, further implements:
    training the feature extraction model according to the training data set, the test data set, and the validation data set;
    after the trained feature extraction model is obtained, training the plurality of chord function recognition models according to the training data set, the test data set, the validation data set, and the output signals of the trained feature extraction model; and
    determining the training loss value corresponding to the training of each chord function recognition model, and ending the training of the plurality of chord function recognition models when the sum of the training loss values corresponding to the chord function recognition models is less than a loss threshold.
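The data preparation and stopping rule recited in the training-related claims can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the applicant's implementation: the 80/10/10 split proportions, the contiguous (unshuffled) segmentation, and the names `split_piece`, `build_datasets`, and `should_stop` are hypothetical, since the claims state only that each piece is divided into three segments and that training ends once the summed loss falls below a threshold.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_piece(
    piece: Sequence[T], train_pct: int = 80, test_pct: int = 10
) -> Tuple[List[T], List[T], List[T]]:
    """Divide one piece into first, second, and third contiguous segments.

    The percentages are assumed values; the remainder after the first two
    segments becomes the third (validation) segment.
    """
    if train_pct < 0 or test_pct < 0 or train_pct + test_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    n = len(piece)
    train_end = n * train_pct // 100
    test_end = train_end + n * test_pct // 100
    return (
        list(piece[:train_end]),
        list(piece[train_end:test_end]),
        list(piece[test_end:]),
    )


def build_datasets(
    pieces: Sequence[Sequence[T]],
) -> Tuple[List[List[T]], List[List[T]], List[List[T]]]:
    """Collect the first segments as the training set, the second segments
    as the test set, and the third segments as the validation set."""
    train, test, validation = [], [], []
    for piece in pieces:
        first, second, third = split_piece(piece)
        train.append(first)
        test.append(second)
        validation.append(third)
    return train, test, validation


def should_stop(per_model_losses: Sequence[float], loss_threshold: float) -> bool:
    """Stop when the sum of the per-model training losses is below the threshold."""
    return sum(per_model_losses) < loss_threshold
```

Because every piece contributes a segment to each of the three sets, all three sets cover the same pieces; the stopping test then treats the chord function recognition models jointly rather than one at a time.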
PCT/CN2021/084222 2020-11-25 2021-03-31 Music chord recognition method and apparatus, and electronic device and storage medium WO2021190660A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011351757.7 2020-11-25
CN202011351757.7A CN112652281A (en) 2020-11-25 2020-11-25 Music chord identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021190660A1 true WO2021190660A1 (en) 2021-09-30

Family

ID=75349481

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084222 WO2021190660A1 (en) 2020-11-25 2021-03-31 Music chord recognition method and apparatus, and electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN112652281A (en)
WO (1) WO2021190660A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113140202B (en) * 2021-04-25 2024-06-18 北京灵动音科技有限公司 Information processing method, information processing device, electronic equipment and storage medium
CN114758637A (en) * 2022-04-13 2022-07-15 天津大学 Method and device for classifying current popular music chords

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6057502A (en) * 1999-03-30 2000-05-02 Yamaha Corporation Apparatus and method for recognizing musical chords
JP2008209550A (en) * 2007-02-26 2008-09-11 National Institute Of Advanced Industrial & Technology Chord discrimination device, chord discrimination method, and program
CN102723079A (en) * 2012-06-07 2012-10-10 天津大学 Music and chord automatic identification method based on sparse representation
CN106847248A (en) * 2017-01-05 2017-06-13 天津大学 Chord recognition methods based on robustness scale contour feature and vector machine

Also Published As

Publication number Publication date
CN112652281A (en) 2021-04-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21776810

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21776810

Country of ref document: EP

Kind code of ref document: A1