RU2716556C1

RU2716556C1 - Method of receiving speech signals

Info

Publication number: RU2716556C1
Application number: RU2018145167A
Authority: RU
Inventors: Артем Олегович Янгуразов; Дмитрий Иосифович Дубровских; Игорь Михайлович Еремеев
Original assignee: Общество с ограниченной ответственностью "ПРОМОБОТ"
Priority date: 2018-12-19
Filing date: 2018-12-19
Publication date: 2020-03-12
Also published as: WO2020130872A1

Abstract

FIELD: acoustics.

SUBSTANCE: the invention relates to processing acoustic signals and converting them into electrical signals in a robot. The technical result is achieved by calibrating the robot's working microphones in two stages. In the first stage, a pulsed audio signal is played through the robot's speakers and received by the working microphones; the time delay of the pulse is determined, and the delay values are stored in the memory of the control device. In the second stage, an audio signal is played through the robot's speakers in different frequency bands in succession; the sound level at each frequency is determined for each working microphone, an attenuation coefficient of the audio signal is calculated for each working microphone, and the coefficients are stored in the memory of the control device. After calibration, the robot's own audio signals are delayed in a buffer, and the external speech signal of each working microphone is corrected by subtracting the time delay value.

EFFECT: reduced interference and a lower level of the audio signal from the robot's loudspeakers.

1 cl, 3 dwg

Description

The invention relates to the field of sound processing, namely to processing acoustic signals and converting them into electrical signals in a robot. It makes it possible to isolate sound sources and determine the direction in which they are located, and to minimize interference from external speakers.

Known in the art is the invention of US patent US 2003139851, "Acoustic device and acoustic system of a robot," G10L 21/02, 2003. It describes a robot auditory apparatus and system designed to achieve active perception when collecting sound from an external sound source, unaffected by noise generated inside the robot, such as noise emitted by the robot's drive elements. The apparatus and system are intended for a robot that has a noise source in its interior, and include: a soundproof shell covering part of the robot; external microphones located outside the shell, primarily for collecting external sound; an internal microphone located inside the shell, primarily for collecting noise from the internal noise source; a processing section that responds to the signals from the external and internal microphones, separates the sound received by the external microphones from the noise of the internal source, and then outputs left and right audio signals; and a directional-information extraction section that responds to the left and right audio signals from the processing section and determines the direction from which the external sound is emitted. The processing section detects bursts caused by the noise source in the internal microphone signal and removes the signal portions from the audio signals in the bands that contain bursts. The drawback is that interference from external speakers cannot be minimized.

The closest analogue of the claimed invention is the invention of Chinese patent CN 105825862, "Echo cancellation system in a human-machine dialogue," G10L 21/02, 2016. It describes an echo cancellation system for human-robot dialogue comprising: a main control module, an echo cancellation module, a voice voltage division module, a microphone and a speaker, and a second voice voltage division module configured to set the voltage division ratio of the potentiometer and the echo signal. The intensity is passed to the echo cancellation module; the speaker reproduces the test sound emitted by the robot; the microphone collects the audio signal; the echo cancellation module is connected to the voice pressure separation module to eliminate the echo signal; and the main control module adjusts the voltage divider ratio of the potentiometer to control the echo cancellation process. The drawback is that reducing interference is cumbersome, since the voltage divider ratio of the potentiometer has to be set and adjusted manually before every dialogue.

The technical result of the claimed invention is reduced interference and a lower level of the audio signal from the robot's loudspeakers.

The technical result is achieved as follows. The method of receiving speech signals includes receiving external speech signals and the robot's own audio signals with the robot's microphones, correcting the values of the external speech signals taking into account the values of the robot's audio signals, and transmitting the resulting speech signal values to a control device. According to the invention, the external speech signals are received by the robot's working microphones, and the robot's audio signals are received by calibration microphones. The working microphones are calibrated in two stages. In the first stage, a pulsed audio signal is supplied, played through the robot's speakers, and received by the working microphones; the time delay of the pulse from each robot speaker to each working microphone is determined, and the resulting delay values are transferred to the memory of the control device. In the second calibration stage, an audio signal is supplied in different frequency ranges in succession and played through the robot's speakers; the sound level at each frequency is determined for each working microphone; the attenuation coefficient of the audio signal is calculated for each working microphone as the ratio of the signal level received by that working microphone to the signal level of the calibration microphone; and the resulting coefficients are transferred to the memory of the control device. Then, in operating mode, when the robot's audio signals and external speech signals are received simultaneously, the robot's audio signals are delayed in a buffer, and the value of the external speech signal of each working microphone is corrected by subtracting the time delay value determined in the first calibration stage, taking into account the attenuation coefficient determined in the second calibration stage. The resulting speech signal values are transmitted to the control device.
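One possible reading of the two calibrated quantities and of the correction step can be written compactly. The notation is introduced here only for illustration and does not appear in the patent: $x_i[n]$ is the signal of working microphone $i$, $c[n]$ is the calibration-microphone signal, $d_i$ is the stage-one delay expressed in samples, and $L_i(f)$, $L_{\mathrm{cal}}(f)$ are the sound levels measured at test frequency $f$. Treating the attenuation as a single factor $k_i$ per microphone, and using a single reference signal, are simplifying assumptions.

```latex
k_i(f) = \frac{L_i(f)}{L_{\mathrm{cal}}(f)},
\qquad
y_i[n] = x_i[n] - k_i \, c[n - d_i]
```

Under these assumptions $y_i[n]$ is the corrected speech signal passed on to the control device.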

The technical result is achieved by combining the operation of the microphone array with a local-effect suppression algorithm, AEC. The signal from the robot's speakers, picked up by two calibration microphones located close to those speakers, is delayed in a buffer and subtracted from the input signal of each microphone of the array, in accordance with the time delay measured in the first calibration stage and taking into account the attenuation coefficient determined in the second calibration stage. Calibration is performed once, at power-up. Signals are generated and played through the robot's speakers; they make it possible to determine the time delays (using the pulse signal) and the frequency attenuation coefficients that the algorithms need to work correctly. After calibration, the array operates automatically. The built-in AEC algorithm minimizes interference from the robot's external speakers.
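The delay-and-subtract step described above could be sketched as follows. This is a minimal illustration, not the patented firmware: the class name `EchoSubtractor`, the single broadband attenuation factor, and the sample-by-sample interface are assumptions, and a fixed delay with a fixed gain is used rather than an adaptive filter.

```python
from collections import deque


class EchoSubtractor:
    """Delay a reference signal and subtract a scaled copy from one microphone.

    Hypothetical helper: `delay_samples` stands for the stage-one delay and
    `attenuation` for the stage-two coefficient of one working microphone.
    """

    def __init__(self, delay_samples: int, attenuation: float):
        if delay_samples < 0:
            raise ValueError("delay_samples must be non-negative")
        self.attenuation = attenuation
        # Pre-fill with silence so the first outputs see a zero reference.
        self.buffer = deque([0.0] * delay_samples)

    def process(self, mic_sample: float, ref_sample: float) -> float:
        """Return the microphone sample with the delayed reference removed."""
        self.buffer.append(ref_sample)
        delayed_ref = self.buffer.popleft()
        return mic_sample - self.attenuation * delayed_ref


# Usage: the microphone hears speech plus the reference delayed by 2 samples
# and scaled by 0.5; the subtractor recovers the speech.
ref = [0.0, 1.0, 0.0, 0.0, 0.0]
speech = [0.1, 0.1, 0.1, 0.1, 0.1]
mic = [s + 0.5 * (ref[n - 2] if n >= 2 else 0.0) for n, s in enumerate(speech)]
sub = EchoSubtractor(delay_samples=2, attenuation=0.5)
cleaned = [sub.process(m, r) for m, r in zip(mic, ref)]
```

One such object per working microphone would be needed, each with its own delay and coefficient.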

Figure 1 shows a block diagram of the microphone array complex.

Figure 2 shows a front view of the MCU printed circuit board.

Figure 3 shows a front view of the printed circuit board of the working microphone and of the AEC calibration microphone.

The microphone array complex consists of an MCU board 1, eight working microphone boards 2, two AEC calibration microphone boards 2.1, and firmware. The calibration microphone board 2.1 differs from the working microphone board 2 in having a lower gain. The MCU board 1 includes a microcontroller 3. The microcontroller uses ten channels 4 for digitizing sound and a USB stack 5 for transferring data to a personal computer (PC) 6 or another control device. The board is powered from the USB bus. The working microphone board 2 contains a working microphone 7, which is an analog MEMS microphone, a differential microphone amplifier 8, and a buffer output amplifier 9. The calibration microphone board 2.1 contains a calibration microphone 11. The robot's speakers 10 are used to play the audio signal. The output signal level at nominal sound pressure is approximately 500 mV. Microphones 7 and 11 are powered (3.3 V) from the MCU board 1. The firmware supports the USB stack 5, processes the digitized audio data, and implements the AEC local-effect suppression algorithm. The microphone boards are positioned on the robot according to the configuration specified in HARK. The microphone positioning accuracy is 5 mm. To the PC 6, the device appears as a standard acoustic array. No additional drivers are needed for the device to work.

In the method of receiving speech signals, an STM32H7 microcontroller is used as the microcontroller 3. Ten 16-bit ADC channels are used for digitizing sound.
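On the host side, ten 16-bit channels would typically have to be separated into eight working and two calibration streams. The sketch below assumes interleaved little-endian signed 16-bit frames with the working microphones first; the patent does not specify the data layout, so the function name `split_frames` and the channel order are purely illustrative.

```python
import struct

NUM_WORKING = 8       # working microphones, per the description
NUM_CALIBRATION = 2   # calibration microphones, per the description
NUM_CHANNELS = NUM_WORKING + NUM_CALIBRATION


def split_frames(raw: bytes):
    """Split interleaved int16 frames into (working, calibration) channel lists.

    Assumed layout: each frame holds one little-endian int16 sample per
    channel, working channels 0..7 followed by calibration channels 8..9.
    """
    frame_size = NUM_CHANNELS * 2  # two bytes per 16-bit sample
    if len(raw) % frame_size != 0:
        raise ValueError("buffer does not hold a whole number of frames")
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    channels = [list(samples[ch::NUM_CHANNELS]) for ch in range(NUM_CHANNELS)]
    return channels[:NUM_WORKING], channels[NUM_WORKING:]
```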

The method of receiving speech signals is carried out as follows. First, the MCU board 1 generates a short pulse signal. This signal is amplified and played through the robot's speakers 10. The working microphones 7 receive it with different time delays, determined by the position of each working microphone 7 relative to each speaker 10. The delay values are written to the memory of the MCU 1, which is the internal memory of the STM32H7 chip, and are later used together with the delay buffer, which is also part of the chip's internal memory, to subtract the signal of the speaker 10 from all received audio signals. In the second stage, the sound level is calibrated in four frequency bands: 250 Hz, 500 Hz, 1000 Hz, and 2000 Hz. The MCU board 1 generates the set of four frequencies in sequence, they are played through the robot's speakers 10, each working microphone 7 then measures the sound level at each frequency, and the MCU board 1 stores the values in the memory of the control device. When the same calibration signal is played a second time, the measured sound levels are used to check the operation of the local-effect compensator, and the signal attenuation coefficients are calculated for each working microphone 7 as the ratio of the signal level received by that working microphone to the signal level of the calibration microphone. After calibration, the system operates automatically, receiving acoustic signals from the eight working microphones 7 and sending the data to the PC 6 through the USB port. Once the time delays and the frequency attenuation coefficients have been determined, the system is ready for operation.

The audio signal from the robot's speakers 10 is received by the two calibration microphones 11 located close to the speakers 10, delayed in a buffer, and subtracted from the input signal of each working microphone 7, in accordance with the time delay measured in the first calibration stage and taking into account the attenuation coefficient determined in the second calibration stage. The main objective of the method is to reduce the level of the audio signal from the robot's speakers 10. With this method, a person can ask the robot questions while interrupting its speech, which makes human-robot communication comfortable.
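The two calibration measurements can be illustrated with a short sketch. Everything beyond the four test frequencies and the ratio definition is an assumption made for the example: the delay is taken as the position of the largest recorded sample, the sound level is taken as an RMS value, and the helper names and the 16 kHz sample rate are hypothetical.

```python
import math

CAL_FREQS_HZ = (250, 500, 1000, 2000)  # test frequencies named in the description


def estimate_delay(recording, emit_index=0):
    """Delay in samples between pulse emission and the recorded peak."""
    peak_index = max(range(len(recording)), key=lambda n: abs(recording[n]))
    return peak_index - emit_index


def rms_level(samples):
    """Root-mean-square level of a block of samples (assumed level measure)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def attenuation_coefficients(working_tones, calibration_tones):
    """Per-frequency ratio of working-mic level to calibration-mic level."""
    return {
        freq: rms_level(working_tones[freq]) / rms_level(calibration_tones[freq])
        for freq in working_tones
    }


def tone(freq_hz, amplitude, rate_hz=16000, n=1600):
    """Synthetic test tone used only to exercise the functions above."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / rate_hz)
            for t in range(n)]


# Usage: a pulse arriving 7 samples late, and a working microphone that
# receives each tone at 30 % of the calibration-microphone level.
pulse = [0.0] * 7 + [0.9, 0.2, -0.1]
delay = estimate_delay(pulse)
working = {f: tone(f, 0.3) for f in CAL_FREQS_HZ}
calib = {f: tone(f, 1.0) for f in CAL_FREQS_HZ}
coeffs = attenuation_coefficients(working, calib)
```

In a real system the delay search would need to be robust to noise and reverberation; peak picking is used here only because it is the simplest instance of the idea.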

Thus, the proposed method of receiving speech signals reduces interference and lowers the level of the audio signal from the robot's external speakers.

Claims

A method of receiving speech signals, comprising receiving external speech signals and audio signals of a robot with microphones of the robot, correcting the values of the external speech signals taking into account the values of the robot's audio signals, and transmitting the resulting speech signal values to a control device, characterized in that the external speech signals are received by working microphones of the robot and the robot's audio signals are received by calibration microphones; the working microphones of the robot are calibrated in two stages; in the first stage, a pulsed audio signal is supplied, played through the robot's speakers, and received by the working microphones, the time delay of the pulsed audio signal from each robot speaker to each working microphone is determined, and the resulting time delay values are transferred to the memory of the control device; in the second calibration stage, an audio signal is supplied in different frequency ranges in succession and played through the robot's speakers, the sound level at each frequency is determined for each working microphone, the attenuation coefficient of the audio signal is calculated for each working microphone as the ratio of the signal level received by that working microphone to the signal level of the calibration microphone, and the resulting attenuation coefficients are transferred to the memory of the control device; then, in operating mode, when the robot's audio signals and the external speech signals are received simultaneously, the robot's audio signals are delayed in a buffer, the value of the external speech signal of each working microphone is corrected by subtracting the time delay value determined in the first calibration stage, taking into account the attenuation coefficient determined in the second calibration stage, and the resulting speech signal values are transmitted to the control device.