JP7343320B2

JP7343320B2 - Information processing device, information processing method, and program

Info

Publication number: JP7343320B2
Application number: JP2019135534A
Authority: JP
Inventors: 雅人小池
Original assignee: Koei Tecmo Games Co Ltd
Current assignee: Koei Tecmo Games Co Ltd
Priority date: 2019-07-23
Filing date: 2019-07-23
Publication date: 2023-09-12
Anticipated expiration: 2039-07-23
Also published as: JP2021018387A

Description

The present invention relates to an information processing device, an information processing method, and a program.

Conventionally, in computer games and the like, BGM (background music) and sound effects are played back repeatedly (looped) (see, for example, Patent Document 1). The start and end points of the loop are set manually, for example by a skilled engineer looking at the sound waveform.

Patent Document 1: Japanese Patent Application Publication No. 2005-148210

However, the conventional technology has a problem in that, for example, the sound may seem unnatural at the point where it loops.

Accordingly, in one aspect, an object is to provide a technique that enables more appropriate repeated playback.

In one proposal, an information processing device includes: a search unit that searches a section preceding a first section in sound data for a second section whose waveform has the highest degree of similarity to the sound waveform of the first section; and a determination unit that determines an end point of repeated playback based on the first section and determines a start point of repeated playback based on the second section.

According to one aspect, repeated playback can be performed more appropriately.

FIG. 1 is a diagram illustrating an example of the hardware configuration of an information processing device according to an embodiment.
FIG. 2 is a functional block diagram of the information processing device according to the embodiment.
FIG. 3 is a flowchart illustrating an example of processing of the information processing device according to the embodiment.
FIG. 4 is a diagram illustrating an example of processing of the information processing device 10 according to the embodiment.
FIG. 5A is a diagram illustrating an example of an attenuation rate according to the embodiment.
FIG. 5B is a diagram illustrating an example of an attenuation rate according to the embodiment.
FIG. 5C is a diagram illustrating an example of an attenuation rate according to the embodiment.
FIG. 6 is a diagram illustrating an example of sound data for repeated playback according to the embodiment.

Embodiments of the present invention will be described below with reference to the drawings.

<Hardware configuration>
FIG. 1 is a diagram illustrating an example of the hardware configuration of the information processing device 10 according to the embodiment. The information processing device 10 shown in FIG. 1 includes a drive device 100, an auxiliary storage device 102, a memory device 103, a CPU 104, an interface device 105, a display device 106, an input device 107, and the like, which are interconnected via a bus B.

A game program that implements the processing of the information processing device 10 is provided on a recording medium 101. When the recording medium 101 on which the game program is recorded is set in the drive device 100, the game program is installed from the recording medium 101 into the auxiliary storage device 102 via the drive device 100. However, the game program does not necessarily have to be installed from the recording medium 101 and may instead be downloaded from another computer via a network. The auxiliary storage device 102 stores the installed game program as well as necessary files, data, and the like.

The memory device 103 is, for example, a memory such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory), and when an instruction to start a program is received, it reads the program from the auxiliary storage device 102 and stores it. The CPU 104 implements the functions of the information processing device 10 in accordance with the program stored in the memory device 103. The interface device 105 is used as an interface for connecting to a network. The display device 106 displays a GUI (Graphical User Interface) or the like provided by the program. The input device 107 consists of a controller or the like, a keyboard and mouse or the like, or a touch panel and buttons or the like, and is used to input various operation instructions.

Examples of the recording medium 101 include portable recording media such as a CD-ROM, a DVD, a Blu-ray disc, and a USB memory. Examples of the auxiliary storage device 102 include an HDD (Hard Disk Drive), an SSD (Solid State Drive), and a flash memory. Both the recording medium 101 and the auxiliary storage device 102 correspond to computer-readable recording media.

<Functional configuration>
Next, the functional configuration of the information processing device 10 will be described with reference to FIG. 2. FIG. 2 is a functional block diagram of the information processing device 10 according to the embodiment.

The information processing device 10 has a storage unit 11. The storage unit 11 is implemented using, for example, the auxiliary storage device 102. The storage unit 11 stores recorded first sound data and the like.

The information processing device 10 also has an acquisition unit 12, a reception unit 13, a search unit 14, a determination unit 15, a generation unit 16, and a playback unit 17. Each of these units is implemented by processing that one or more programs installed in the information processing device 10 cause the CPU 104 of the information processing device 10 to execute.

The acquisition unit 12 acquires the first sound data and the like stored in the storage unit 11. The reception unit 13 receives input from the user through various operations and the like. In the first sound data acquired by the acquisition unit 12, the search unit 14 searches the section preceding a first section specified by the user or the like for a second section whose waveform has the highest degree of similarity to the sound waveform of the first section.

The determination unit 15 determines the end point (turn-back point) of repeated playback based on the first section, and determines the start point of repeated playback based on the second section.

When part of the first sound data acquired by the acquisition unit 12 is to be played back repeatedly, the generation unit 16 generates, based on that first sound data, second sound data that reduces the sense of unnaturalness at the point of repetition.

The playback unit 17 plays back the first sound data acquired by the acquisition unit 12 from the opening intro portion up to the end point of repeated playback determined by the determination unit 15, then plays back the second sound data generated by the generation unit 16, and then plays back the first sound data from the start point of repeated playback determined by the determination unit 15. The playback unit 17 then repeats the process of playing back the first sound data from the start point to the end point and then playing back the second sound data. The playback unit 17 also outputs the played-back sound to a speaker.

<Processing>
Next, the processing of the information processing device 10 will be described with reference to FIGS. 3 to 6. FIG. 3 is a flowchart illustrating an example of processing of the information processing device 10 according to the embodiment. FIG. 4 is a diagram illustrating an example of processing of the information processing device 10 according to the embodiment. FIGS. 5A to 5C are diagrams illustrating examples of the attenuation rate according to the embodiment. FIG. 6 is a diagram illustrating an example of sound data for repeated playback according to the embodiment.

In step S1, the acquisition unit 12 acquires first sound data, part of which is to be played back repeatedly. In the example of FIG. 4, the first sound data 401 is stereo, and a left (L) channel waveform 401A and a right (R) channel waveform 401B are shown. The first sound data 401 is BGM for a game or the like and includes a section 402 in which the intro portion was performed on instruments and recorded, a section 403 in which the portion to be played back repeatedly in the game or the like was performed and recorded, and a section 404 in which at least part of that repeated portion was performed again and recorded.

Next, the reception unit 13 receives from the user an operation specifying a first section in the acquired first sound data, the first section being at least part of the portion in which the repeated portion was performed again (step S2). In the example of FIG. 4, the user has specified a first section 405 between a start time 421 and an end time 422.

Next, the search unit 14 searches the first sound data acquired by the acquisition unit 12 for a second section similar to the specified first section (step S3). Here, the search unit 14 searches the section preceding the first section for the second section whose waveform has the highest degree of similarity to the sound waveform of the first section.

The search unit 14 may search for a second section similar to the first section within a search range specified by the user. In this case, in the example of FIG. 4, the reception unit 13 receives an operation specifying a search range 411 between a start time 412 and an end time 413. The search unit 14 then calculates the similarity between the first section and a search section, which is a section that starts at the start time 412 of the specified search range 411 and has the same time length as the first section.

The search unit 14 then takes the point immediately after the start point of the search section as a new start point and calculates the similarity between the first section and a search section of the same time length as the first section beginning at the new start point. It repeats this process while shifting the start point forward in time by one point (one sample point, one sampling point) at a time. When the section of the same time length as the first section beginning at the new start point falls outside the specified search range, the search unit 14 ends the similarity calculation. The search unit 14 then determines the second section 406, whose waveform has the highest degree of similarity to the sound waveform of the first section 405.
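The forward search over a user-specified range can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of sample indices with exclusive end indices, and the choice of a sum of absolute differences as the (inverse) similarity measure are all assumptions.

```python
def sum_abs_diff(a, b):
    """Sum of absolute amplitude differences between two equal-length sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_second_section(samples, first_start, first_end, range_start, range_end):
    """Return the start index of the window in [range_start, range_end) that best
    matches samples[first_start:first_end].

    Hypothetical helper: end indices are exclusive, and a lower difference total
    is treated as a higher similarity.
    """
    target = samples[first_start:first_end]
    length = len(target)
    best_start, best_cost = None, float("inf")
    start = range_start
    # Stop as soon as the window would extend past the search range.
    while start + length <= range_end:
        cost = sum_abs_diff(samples[start:start + length], target)
        if cost < best_cost:
            best_start, best_cost = start, cost
        start += 1  # shift the start point by one sampling point
    return best_start
```

For real audio, where a section may contain hundreds of thousands of samples, this brute-force loop is slow; it is written for clarity only.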

Alternatively, the search unit 14 may search for a section similar to the first section without having the user specify a search range. In this case, the search unit 14 may, for example, take a point a predetermined time before the first section as the start point and calculate the similarity between the first section and a search section of the same time length as the first section beginning at that start point.

The search unit 14 then takes the point immediately before the start point of the search section as a new start point and calculates the similarity between the first section and a section of the same time length as the first section beginning at the new start point. It repeats this process while shifting the start point backward in time by one point (one sample point, one sampling point) at a time. If the difference between the similarity of the current search section to the first section and the similarity of the previous search section to the first section is greater than or equal to a threshold, the start point of the search section is considered to have entered the intro portion, and the search unit 14 may end the search process. The search unit 14 then determines the second section 406, whose waveform has the highest degree of similarity to the sound waveform of the first section 405.
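A backward search with this early-termination rule might look like the sketch below. The function name, the `offset` and `threshold` parameters, and the use of a difference total in place of a similarity value are assumptions made for illustration.

```python
def search_backward(samples, first_start, first_end, offset, threshold):
    """Search backward from `offset` samples before the first section.

    Stops when the difference total jumps by `threshold` or more between
    consecutive windows, which this sketch takes as a sign that the window
    has reached the intro. Returns the best start index found, or None.
    """
    target = samples[first_start:first_end]
    length = len(target)
    start = first_start - offset
    best_start, best_cost = None, float("inf")
    prev_cost = None
    while start >= 0:
        window = samples[start:start + length]
        cost = sum(abs(x - y) for x, y in zip(window, target))
        if prev_cost is not None and abs(cost - prev_cost) >= threshold:
            break  # sudden change in similarity: assume the intro begins here
        if cost < best_cost:
            best_start, best_cost = start, cost
        prev_cost = cost
        start -= 1  # shift the start point one sampling point back in time
    return best_start
```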

The search unit 14 may, for example, determine the similarity between the search section and the first section based on the total of the differences between the waveform of the search section and the waveform of the first section. In this case, the search unit 14 may, for example, treat the similarity as lower the larger the total of the differences between the waveform amplitude at each point from the start to the end of the search section and the waveform amplitude at each corresponding point from the start to the end of the first section, and as higher the smaller that total. When the first sound data has data for a plurality of channels, the search unit 14 may calculate the total of the waveform differences for each channel and determine the similarity based on, for example, the average of the calculated totals. For example, the search unit 14 may treat the similarity as lower the larger the average and as higher the smaller the average.
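As a rough sketch of the per-channel averaging, the function below returns the negated mean of the per-channel difference totals, so that a larger return value means a more similar pair. The negation, the function names, and the list-of-channels data layout are assumptions.

```python
def channel_difference(a, b):
    """Total of absolute amplitude differences for one channel."""
    return sum(abs(x - y) for x, y in zip(a, b))


def multichannel_similarity(section_a, section_b):
    """Similarity of two multi-channel sections.

    Each section is a list of channels, and each channel is a list of samples.
    The per-channel difference totals are averaged and negated, so identical
    sections score 0 and less similar sections score lower.
    """
    totals = [channel_difference(ca, cb) for ca, cb in zip(section_a, section_b)]
    mean_total = sum(totals) / len(totals)
    return -mean_total
```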

The search unit 14 may also calculate the similarity based on the differences at randomly selected points among the points from the start to the end of the search section. This makes the calculation faster, and the similarity can be estimated more accurately than when it is calculated from the differences at points selected at equal intervals.
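The random-sampling variant can be sketched as below. The function name, the fixed seed, and the rescaling of the sampled total to estimate the full-section total are assumptions added for illustration.

```python
import random


def sampled_difference(a, b, num_points, seed=0):
    """Estimate the total absolute difference from randomly chosen sample points.

    A fixed seed keeps the chosen indices reproducible, so every candidate
    window is compared at the same offsets (an assumption of this sketch).
    """
    rng = random.Random(seed)
    k = min(num_points, len(a))
    indices = rng.sample(range(len(a)), k)
    total = sum(abs(a[i] - b[i]) for i in indices)
    # Scale the sampled total up to the full section length.
    return total * len(a) / k
```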

The search unit 14 may also determine the similarity between the search section and the first section based on the correlation coefficient between the waveform of the search section and the waveform of the first section. In this case, the search unit 14 may, for example, use as the similarity value the correlation coefficient between the amplitude waveform from the start to the end of the search section and the amplitude waveform from the start to the end of the first section.
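The text does not say which correlation coefficient is meant. The sketch below assumes the Pearson correlation coefficient and, as a further assumption, returns 0.0 when either waveform is constant, since the coefficient is undefined in that case.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sample lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined for a constant signal; treated as 0 here
    return cov / math.sqrt(var_a * var_b)
```

One property worth noting is that this measure ignores overall gain, so a quieter repeat of the same phrase still scores close to 1.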

The search unit 14 may also determine the similarity between the search section and the first section based on the phase and amplitude at each frequency. In this case, the search unit 14 may, for example, apply a fast Fourier transform to the waveform of the search section and to the waveform of the first section and calculate the phase and amplitude at each frequency for each. The search unit 14 may then calculate the similarity value based on the total of the amplitude differences for each frequency and each phase of the waveform of the search section.
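One possible reading of this frequency-domain comparison is sketched below. To stay self-contained it uses a naive O(n²) discrete Fourier transform rather than a fast Fourier transform; the two produce the same spectrum. Comparing the complex bins directly, which reflects both amplitude and phase, is an assumption about how the differences are totalled.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (stand-in for an FFT in this sketch)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def spectral_difference(a, b):
    """Total magnitude of the bin-by-bin difference between two spectra.

    Each complex bin encodes amplitude and phase, so the difference grows
    when either one differs. A lower total means a higher similarity.
    """
    return sum(abs(x - y) for x, y in zip(dft(a), dft(b)))
```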

The search unit 14 may also determine the similarity by combining the methods described above. In this case, the search unit 14 may, for example, normalize the similarity values calculated by each of the methods described above and determine the similarity based on the average of the normalized values.
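Combining several measures could be sketched as follows. The text does not specify the normalization; min-max scaling to the range 0 to 1 is assumed here, and the function names are hypothetical. Each inner list holds one method's similarity scores for the same sequence of candidate sections.

```python
def normalize(values):
    """Min-max scale a list of scores to the range 0..1 (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_scores(score_lists):
    """Average the normalized scores of several methods, candidate by candidate."""
    normalized = [normalize(scores) for scores in score_lists]
    count = len(score_lists)
    return [sum(column) / count for column in zip(*normalized)]
```

All methods must use the same orientation before averaging, with a higher score meaning a more similar section, so difference totals would need to be negated first.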

When a game machine or the like is specified to play back blocks of sound data, each consisting of a predetermined number (for example, 28) of sampling points, sequentially via an API (Application Programming Interface) of the game machine or the like, the search unit 14 may set the start point of each search section to an integer multiple of that predetermined number of sampling points from the start of the first sound data acquired in step S1. This makes it possible to find a section that satisfies the specification and has a high degree of similarity to the first section.
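Restricting candidate start points to block boundaries can be sketched as below. The function name is hypothetical, indices are counted from the start of the first sound data, and the range end is treated as exclusive.

```python
def aligned_starts(range_start, range_end, section_length, block_size):
    """List candidate start indices that are multiples of block_size and whose
    window of section_length samples fits inside [range_start, range_end)."""
    # Round range_start up to the next multiple of block_size.
    first = -(-range_start // block_size) * block_size
    return list(range(first, range_end - section_length + 1, block_size))
```

Besides meeting the playback constraint, this reduces the number of windows to evaluate by a factor of roughly `block_size`.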

Next, the determination unit 15 determines the end point and the start point of repeated playback (step S4). Here, the determination unit 15 may, for example, determine the end point of repeated playback based on the first section.

For example, the determination unit 15 may set the end point of the first section as the end point of repeated playback.

The determination unit 15 may also set, for example, the end point of the second section found by the search unit 14 as the start point of repeated playback.

Next, the generation unit 16 generates the second sound data to be output when the sound is played back repeatedly in a game or the like (step S5). The generation unit 16 may, for example, change the start point of the first section to a point a predetermined time (for example, 0.1 seconds) before the point within the first section at which the absolute value of the amplitude increases sharply (the attack position). As a result, the synthesis (crossfade) processing described later can further reduce the sense of unnaturalness caused by the change in sound during repeated playback.
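The text does not define how a sharp increase in amplitude is detected. The sketch below assumes a simple rule: the first sample whose absolute amplitude exceeds that of the previous sample by at least `jump`. The function name, the `jump` parameter, and the clamping of the result to the original start point are all assumptions.

```python
def attack_adjusted_start(samples, start, end, sample_rate, jump, lead_seconds=0.1):
    """Return a new start index placed lead_seconds before the first attack.

    An attack is assumed to be a rise of at least `jump` in absolute amplitude
    between consecutive samples. If none is found, `start` is returned as is.
    """
    lead = round(lead_seconds * sample_rate)
    for i in range(start + 1, end):
        if abs(samples[i]) - abs(samples[i - 1]) >= jump:
            return max(start, i - lead)  # never move before the original start
    return start
```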

Based on the waveform 4051A of the first section 405 and the waveform 4061A of the second section 406 of the left-channel first sound data in FIG. 4, the generation unit 16 generates left-channel second sound data as shown by the waveform 4052A in FIG. 6.

Likewise, based on the waveform 4051B of the first section 405 and the waveform 4061B of the second section 406 of the right-channel first sound data in FIG. 4, the generation unit 16 generates right-channel second sound data as shown by the waveform 4052B in FIG. 6.

An example of generating the left-channel waveform 4052A is described below. The right-channel waveform 4052B may be generated in the same way.

The generation unit 16 may, for example, generate a sound whose waveform is the sum of a waveform obtained by gradually increasing the attenuation rate of the amplitude of the waveform 4051A of the first section 405 and a waveform obtained by gradually decreasing the attenuation rate of the amplitude of the waveform 4061A of the second section 406.

In this case, based on the similarity between the waveform of the first section 405 and the waveform of the second section 406 of the first sound data, the generation unit 16 may determine a first attenuation curve that gradually reduces the amplitude of the waveform of the first section 405 and a second attenuation curve that gradually increases the amplitude of the waveform of the second section 406. For example, the generation unit 16 may increase the proportion of linear decrease in the first attenuation curve as the similarity between the waveform of the first section 405 and the waveform of the second section 406 increases, and increase the proportion of sinusoidal decrease in the first attenuation curve as that similarity decreases.

In this case, when the similarity is less than a first threshold, the generation unit 16 may generate the waveform 4052A, as shown in FIG. 5A, by adding a first waveform, obtained by multiplying the amplitude of the waveform 4051A by the value of a straight line 501 that decreases from 1 to 0 over a time length t_1, and a second waveform, obtained by multiplying the amplitude of the waveform 4061A by the value of a straight line 502 that increases from 0 to 1 over the time length t_1. The time length t_1 is the time length from the start time 421 to the end time 422 of the first section.

In this case, letting f_1(t), f_2(t), and f(t) be the amplitudes at time t of the waveform 4051A, the waveform 4061A, and the generated waveform 4052A, respectively, f(t) can be expressed by the following equation (1).
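Equation (1) is not reproduced in this text. Assuming that t is measured from the start of the section, so that 0 ≤ t ≤ t_1, a form consistent with the description of the straight lines 501 and 502 is:

```latex
f(t) = \left(1 - \frac{t}{t_1}\right) f_1(t) + \frac{t}{t_1}\, f_2(t), \qquad 0 \le t \le t_1
```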

When the similarity is greater than a second threshold, the generation unit 16 may generate the waveform 4052A, as shown in FIG. 5B, by adding a third waveform, obtained by multiplying the amplitude of the waveform 4051A by the value of a sine curve 511 that decreases from 1 to 0 over the time length t_1, and a fourth waveform, obtained by multiplying the amplitude of the waveform 4061A by the value of a sine curve 512 that increases from 0 to 1 over the time length t_1. In this case, f(t) can be expressed by the following equation (2).
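Equation (2) is likewise not reproduced in this text. The description requires curve 511 to fall from 1 to 0 and curve 512 to rise from 0 to 1 over t_1, with coefficients whose squares sum to 1. Quarter-period cosine and sine curves satisfy these conditions, so a plausible form (an assumption, with 0 ≤ t ≤ t_1) is:

```latex
f(t) = \cos\!\left(\frac{\pi t}{2 t_1}\right) f_1(t) + \sin\!\left(\frac{\pi t}{2 t_1}\right) f_2(t), \qquad 0 \le t \le t_1
```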

When the similarity is greater than or equal to the first threshold and less than or equal to the second threshold, the generation unit 16 may generate the waveform 4052A, as shown in FIG. 5C, by adding a fifth waveform and a sixth waveform. The fifth waveform is obtained by multiplying the amplitude of the waveform 4051A by the value of a curve 521, which is the sum, in a predetermined ratio, of the value of the straight line 501 in FIG. 5A and the value of the sine curve 511 in FIG. 5B. The sixth waveform is obtained by multiplying the amplitude of the waveform 4061A by the value of a curve 522, which is the sum, in the same predetermined ratio, of the value of the straight line 502 in FIG. 5A and the value of the sine curve 512 in FIG. 5B. In this case, letting the predetermined ratio be α to 1−α, f(t) can be expressed by the following equation (3). Here, α is a value in the range of 0 to 1 and may be a value that increases as the similarity decreases. In the example of FIG. 5C, α is 0.5.
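The blended formula for this intermediate case is not reproduced in this text. Assuming that α weights the straight-line terms and 1−α weights the sine-curve terms (following the order in which the ratio is stated), that the sine curves are quarter-period cosine and sine curves, and that 0 ≤ t ≤ t_1, one consistent form is:

```latex
f(t) = \left[\alpha\left(1 - \frac{t}{t_1}\right) + (1-\alpha)\cos\!\left(\frac{\pi t}{2 t_1}\right)\right] f_1(t)
     + \left[\alpha\,\frac{t}{t_1} + (1-\alpha)\sin\!\left(\frac{\pi t}{2 t_1}\right)\right] f_2(t)
```

Setting α = 1 leaves only the straight-line terms and α = 0 leaves only the sine-curve terms.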

The energy of a sound is proportional to the square of its amplitude. Therefore, for sound data in which the waveform of the first section 405 and the waveform of the second section 406 are not similar, such as the sound of wind played in a game, synthesizing (crossfading) the two waveforms with linear coefficient curves that cross each other as in FIG. 5A causes the volume to be perceived as too low near t_1/2, the center of the section. In contrast, when the waveforms of such dissimilar sound data are synthesized with sinusoidal coefficient curves as in FIG. 5B, the energy of the sound does not increase. The volume of the synthesized second sound data can therefore be kept at the same level as the volume of the first section 405 and the second section 406. This is because the coefficients applied to the waveform of the first section 405 and the waveform of the second section 406 are values that sum to 1 when squared.

For sound data in which the waveform of the first section 405 and the waveform of the second section 406 are similar, synthesizing the two waveforms with sinusoidal coefficient curves as in FIG. 5B causes the volume to be perceived as too loud. In contrast, when the waveforms of such similar sound data are synthesized with linear coefficient curves that cross each other as in FIG. 5A, the energy of the sound does not increase. The volume of the synthesized second sound data can therefore be kept at the same level as the volume of the first section 405 and the second section 406.

In the case of first sound data for BGM that is played back repeatedly, the first section 405 and the second section 406 are often similar to a certain degree, so when the generation unit 16 synthesizes them using equation (3) described above, the sound at the turn-back point can be played back with less sense of unnaturalness.
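The crossfade and its effect on volume can be illustrated with the sketch below. The gain curves are assumptions: straight lines for the linear case, quarter-period cosine and sine curves for the sinusoidal case, and a weighted sum of the two with `alpha` as the linear weight. The function names are hypothetical.

```python
import math


def fade_gains(t, t1, alpha):
    """Return (fade_out, fade_in) gains at time t in [0, t1].

    alpha = 1 gives purely linear curves and alpha = 0 gives purely
    sinusoidal curves (assumed forms for this sketch).
    """
    x = t / t1
    lin_out, lin_in = 1.0 - x, x
    sin_out, sin_in = math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)
    return (alpha * lin_out + (1 - alpha) * sin_out,
            alpha * lin_in + (1 - alpha) * sin_in)


def crossfade(first, second, alpha):
    """Fade `first` out and `second` in over their common length."""
    n = len(first)
    out = []
    for i in range(n):
        g_out, g_in = fade_gains(i, n - 1, alpha) if n > 1 else (0.0, 1.0)
        out.append(g_out * first[i] + g_in * second[i])
    return out
```

With two identical inputs, the linear curves keep the amplitude unchanged at the midpoint, while the sinusoidal curves raise it by a factor of about 1.41 (the square root of 2), which matches the observation that sinusoidal curves sound too loud for similar waveforms.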

Next, the generation unit 16 replaces the portion of the first sound data from the first section onward with the second sound data, and records in the storage unit 11 third sound data to which information on the end point and the start point of repeated playback determined by the determination unit 15 is attached as metadata (step S6). Here, the generation unit 16 records third sound data as shown in FIG. 6. In the example of FIG. 6, the third sound data is sound data in which the sound data of the sections 402 and 403 of the first sound data is concatenated with the generated second sound data. In addition, the time 422 is recorded as metadata as the end point of repeated playback, and the end time 423 of the second section 406 is recorded as metadata as the start point of repeated playback.
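Assembling the third sound data can be sketched as below. The dictionary layout and key names are hypothetical, indices are sample positions with exclusive ends, and the second sound data is assumed to have the same length as the first section.

```python
def build_third_sound_data(first_data, first_start, first_end, second_end, second_data):
    """Replace everything from the first section onward with the second sound
    data and attach the loop points as metadata.

    Assumes len(second_data) == first_end - first_start, so that the loop end
    coincides with the end of the resulting sample list.
    """
    samples = first_data[:first_start] + second_data
    return {
        "samples": samples,
        "loop_start": second_end,  # end of the second section (time 423)
        "loop_end": first_end,     # end of the first section (time 422)
    }
```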

Next, the playback unit 17 repeatedly plays back predetermined BGM or the like based on the third sound data recorded in the storage unit 11, in accordance with the situation in the game and the like (step S7). Here, the playback unit 17 plays back the sounds of the section 402, the section 403, and the section 405 from the beginning of the third sound data recorded in the storage unit 11 up to the time 422, which is the end point of repeated playback. The sound of the section 405 is the sound of the second sound data. The playback unit 17 then repeats, based on the situation in the game and the like, the process of playing back the sound from the time 423, which is the start point of repeated playback, to the time 422, which is the end point. This reduces the sense of unnaturalness felt by the user near the end point and start point of the repetition during repeated playback.
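The playback order described here can be sketched as a function that returns the sequence of samples that would be sent to the speaker. The dictionary keys and the fixed repeat count are assumptions; an actual game would keep looping until the game situation changes.

```python
def play_order(third, repeats):
    """Return the samples in playback order: from the beginning to the loop
    end once, then the loop-start to loop-end span `repeats` more times."""
    samples = third["samples"]
    start, end = third["loop_start"], third["loop_end"]
    out = list(samples[:end])
    for _ in range(repeats):
        out.extend(samples[start:end])
    return out
```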

<Modified example>
Each functional unit of the information processing device 10 may be implemented by cloud computing made up of, for example, one or more computers. Alternatively, the third sound data and a program that implements the functions of the playback unit 17 may be recorded on a recording medium, and the processing of the playback unit 17 may be executed on a game device or the like.

Alternatively, a server device that provides an online game or the like may execute the processing of the playback unit 17 and cause an information processing terminal of the user, such as a smartphone, tablet, or personal computer, to output from its speaker the sound of the predetermined BGM or the like being repeatedly played back.

Although embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention as set forth in the claims.

10 Information processing device
11 Storage unit
12 Acquisition unit
13 Reception unit
14 Search unit
15 Determination unit
16 Generation unit
17 Playback unit

Claims

An information processing device comprising:
a search unit that searches a section preceding a first section in sound data for a second section whose sound waveform has the highest degree of similarity to the sound waveform of the first section; and
a determination unit that determines an end point of repeated playback based on the first section and determines a start point of repeated playback based on the second section.

The information processing device according to claim 1, wherein
the search unit determines the degree of similarity based on a total value of differences from the sound waveform of the first section.

The information processing device according to claim 1 or 2, wherein
the search unit determines the degree of similarity based on a correlation coefficient with the sound waveform of the first section.

The information processing device according to any one of claims 1 to 3, further comprising
a generation unit that generates a sound whose waveform is obtained by adding a waveform in which the attenuation rate of the amplitude of the waveform of the first section is gradually increased and a waveform in which the attenuation rate of the amplitude of the waveform of the second section is gradually decreased.

The information processing device according to claim 4, wherein
the generation unit determines, based on the degree of similarity between the sound waveform of the first section and the sound waveform of the second section, a first attenuation curve that gradually decreases the amplitude of the waveform of the first section and a second attenuation curve that gradually increases the amplitude of the waveform of the second section.

The information processing device according to claim 5, wherein
the generation unit makes a linear component included in the first attenuation curve smaller as the degree of similarity is greater, and
makes a sinusoidal component included in the first attenuation curve smaller as the degree of similarity is smaller.

An information processing method in which an information processing device executes:
a process of searching a section preceding a first section in sound data for a second section whose sound waveform has the highest degree of similarity to the sound waveform of the first section; and
a process of determining an end point of repeated playback based on the first section and determining a start point of repeated playback based on the second section.

A program that causes a computer to execute:
a process of searching a section preceding a first section in sound data for a second section whose sound waveform has the highest degree of similarity to the sound waveform of the first section; and
a process of determining an end point of repeated playback based on the first section and determining a start point of repeated playback based on the second section.
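
For illustration only, and not as part of the claims, the search recited above can be sketched in Python under several assumptions: the waveform is a list of samples, candidate sections have the same length as the first section and do not overlap it, the "total value of differences" is taken as a sum of absolute differences (lower means more similar), and the correlation coefficient is the Pearson coefficient. The names `sum_abs_diff`, `correlation`, and `find_second_section` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def sum_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Total of sample-by-sample absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def find_second_section(samples: Sequence[float],
                        first_start: int,
                        first_len: int,
                        use_correlation: bool = False) -> Tuple[int, float]:
    """Return (start index, score) of the most similar earlier section.

    The score is the correlation coefficient, or the negated difference
    total, so that a higher score always means a higher similarity.
    """
    first = samples[first_start:first_start + first_len]
    if first_len <= 0 or len(first) != first_len:
        raise ValueError("first section is out of range")
    best_start, best_score = -1, -math.inf
    # Only candidates that end at or before the start of the first section.
    for start in range(0, first_start - first_len + 1):
        candidate = samples[start:start + first_len]
        if use_correlation:
            score = correlation(first, candidate)
        else:
            score = -sum_abs_diff(first, candidate)
        if score > best_score:
            best_start, best_score = start, score
    if best_start < 0:
        raise ValueError("no candidate section precedes the first section")
    return best_start, best_score
```

This brute-force scan compares every candidate offset and costs time proportional to the product of the search range and the section length, which is acceptable for a sketch but may be too slow for long recordings.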