CN109346100A - A kind of network transfer method of Digital Media interactive instructional system - Google Patents

Info

Publication number
CN109346100A
CN109346100A CN201811254608.1A CN201811254608A CN109346100A CN 109346100 A CN109346100 A CN 109346100A CN 201811254608 A CN201811254608 A CN 201811254608A CN 109346100 A CN109346100 A CN 109346100A
Authority
CN
China
Prior art keywords
video
voice signal
instructional
digital media
mixing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811254608.1A
Other languages
Chinese (zh)
Inventor
宋玉玺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai Olympic Digital Technology Co Ltd
Original Assignee
Yantai Olympic Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2018-10-25
Publication date
2019-02-15
Application filed by Yantai Olympic Digital Technology Co Ltd
Priority to CN201811254608.1A
Publication of CN109346100A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a network transmission method for a digital media interactive teaching system. The method mainly comprises: A. acquiring a multi-channel voice signal through a microphone array and analyzing the direction of the voice signal and its variation; B. performing front-end processing for speech recognition using a beamforming method based on information structure and a dereverberation method; C. using the fast parallel processing capability of a field-programmable gate array (FPGA) to enhance the teaching video images and to scale the video images; D. transmitting the media units of the teaching video through a layered architecture of multi-level mixing, in which the transmission process and the mixing process overlap so as to reduce the accumulated delay and complete the video transmission task in real time. The method is interactive, flexible and extensible; it can select a suitable network communication structure according to the current teaching scale, reduces transmission delay, and completes the transmission task of the video teaching system stably.

Description

A network transmission method for a digital media interactive teaching system
Technical field
The present invention relates to a network transmission method for digital media interactive video teaching, and belongs to the fields of communication, artificial intelligence and computing.
Background art
A digital media interactive teaching system uses the network to carry the communication and interaction between teacher and students. The system compresses the video and audio streams of the teacher's live lesson and combines them with synchronized courseware-browsing commands to form a teaching resource stream. Existing image enhancement and scaling techniques can leave the resolution of the system's input video and output video mismatched; recognizing, storing and transmitting the audio signal adds processing load and impairs the real-time performance of signal transmission; and the delay introduced by the network transmission modes in common use often fails to meet the system's real-time requirements.
The present invention solves these problems by combining a speech recognition front-end processing method based on beamforming and dereverberation, the fast processing capability of a field-programmable gate array (FPGA), and a layered architecture for network communication.
Summary of the invention
To solve the above problems, the object of the present invention is to provide a network transmission method that is flexible, extensible and has good real-time performance.
The technical solution adopted by the present invention comprises the following steps:
A. Acquire a multi-channel voice signal through a microphone array, and analyze the direction of the voice signal and its variation;
B. Perform front-end processing for speech recognition using a beamforming method based on information structure and a dereverberation method;
C. Use the fast parallel processing capability of a field-programmable gate array (FPGA) to enhance the teaching video images and to scale the video images;
D. Transmit the media units of the teaching video through a layered architecture of multi-level mixing; overlapping the transmission process and the mixing process reduces the accumulated delay, so that the video transmission task is completed in real time.
The beneficial effects of the present invention are:
In network transmission tasks with large data volumes and diverse structures, the present invention can be extended flexibly, completes media information processing in real time, and can select a suitable layered structure according to the current teaching scale; it thus offers low accumulated delay and high flexibility.
Description of the drawings
Fig. 1 is the overall flow chart of the network transmission method for a digital media interactive teaching system.
Fig. 2 shows the microphone array topology model.
Fig. 3 shows the bilinear interpolation model.
Fig. 4 is the flow chart for computing the optimal layered structure.
Detailed description of the embodiments
Referring to Figs. 1 to 4, the method of the present invention comprises the following steps:
A. Construct the microphone array topology model, acquire a multi-channel voice signal through the microphone array, and analyze the direction of the voice signal and its variation;
(1) A microphone array made up of a series of acoustic sensors receives the voice signals arriving from the horizontal and vertical azimuth angles; the distance and the sound source direction of the received signal are analyzed from the time-delay relationships between the voice signals;
(2) Echo cancellation is used to suppress interference from the far-end signal: the echo canceller monitors the voice signal arriving from the far end on the audio receive path, computes an echo estimate, and subtracts that estimate from the voice signal being sent, thereby removing the echo from the voice signal;
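The delay-based direction analysis in step A(1) can be illustrated with a minimal Python sketch. It is not the patent's implementation: the two-microphone geometry, the brute-force cross-correlation, the speed of sound and the names `estimate_delay_samples` and `arrival_angle_deg` are all assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C (assumed value)

def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` receives the sound later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two channels at this lag.
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(delay_samples, sample_rate, mic_spacing):
    """Far-field angle of arrival, measured from broadside, for one microphone pair."""
    delay_seconds = delay_samples / sample_rate
    # sin(theta) = c * tau / d; clamp to guard against values just outside [-1, 1].
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * delay_seconds / mic_spacing))
    return math.degrees(math.asin(s))

# Synthetic check: a short pulse reaches the second microphone 3 samples later.
pulse = [0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0] * 3 + pulse[:-3]
lag = estimate_delay_samples(pulse, delayed, max_lag=5)
angle = arrival_angle_deg(lag, sample_rate=16000, mic_spacing=0.1)
```

With more than two microphones, the same pairwise delays could be combined to track how the direction changes over time; that extension is left out of the sketch.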
B. Perform front-end processing for speech recognition using the beamforming method based on information structure and the dereverberation method;
(1) The beamforming method based on information structure fuses the information from multiple channels to suppress interference from non-target directions and enhance the sound from the target direction;
1. Depending on the distance between the sound source and the microphones, the sound field model of the microphone array is divided into far field and near field: near-field sound waves are spherical waves and far-field sound waves are plane waves;
2. In the far-field model, the sound beam is constructed using the beamforming method based on information structure. In the beam expression, c denotes the propagation speed of sound in the medium, v the frequency, α the propagation direction of the sound, and λ the wavelength corresponding to frequency v; propagation speed, frequency and wavelength are related by c = λv;
3. From the time delays Delay_n between the signals, where n is the array element index, the voice signal received by each element of the microphone array is computed:
$$\mathrm{Signal}(time) = e^{jv \cdot time}\,\left[\, e^{-jv \cdot wave \cdot Delay_1},\; e^{-jv \cdot wave \cdot Delay_2},\; \ldots,\; e^{-jv \cdot wave \cdot Delay_n} \,\right]^{T}$$
where T denotes the transpose;
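The per-element phase factors in step B(1) correspond to what is commonly called a steering vector. The Python sketch below shows a plain delay-and-sum beamformer for a far-field plane wave, as a stand-in for the beamforming method based on information structure, which the text does not spell out. The uniform linear array, the use of angular frequency 2πf in the exponent, and all function names are assumptions.

```python
import cmath
import math

def element_delays(num_mics, spacing, angle_deg, speed=343.0):
    """Plane-wave arrival delays (seconds) at each element of a uniform linear array."""
    sin_a = math.sin(math.radians(angle_deg))
    return [n * spacing * sin_a / speed for n in range(num_mics)]

def steering_vector(freq_hz, delays):
    """Per-element phase factors exp(-j * 2*pi*f * Delay_n)."""
    omega = 2.0 * math.pi * freq_hz
    return [cmath.exp(-1j * omega * d) for d in delays]

def beam_response(freq_hz, look_deg, source_deg, num_mics=4, spacing=0.05):
    """Normalized magnitude of the delay-and-sum output for a unit plane wave."""
    look = steering_vector(freq_hz, element_delays(num_mics, spacing, look_deg))
    src = steering_vector(freq_hz, element_delays(num_mics, spacing, source_deg))
    # Conjugating the look-direction vector undoes the delays expected from that direction.
    total = sum(w.conjugate() * x for w, x in zip(look, src))
    return abs(total) / num_mics

# A source in the look direction adds coherently; one from elsewhere is attenuated.
on_target = beam_response(2000.0, look_deg=30.0, source_deg=30.0)
off_target = beam_response(2000.0, look_deg=30.0, source_deg=-40.0)
```

The ratio between `on_target` and `off_target` is one way to see how merging several channels favors the target direction over interference from other directions.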
(2) Dereverberation based on weighted prediction error is applied to both the single-channel and the multi-channel signals of the microphone array;
1. The reverberant voice signal S contains a speech component Q and a reverberation component B. The reverberation component B is computed as a weighted sum of the voice signal at the preceding time points:
$$B(time, signal) = W(1, signal)\,S(time-1, signal) + \cdots + W(m, signal)\,S(time-m, signal)$$
where W denotes the weight coefficients, which are obtained with the WPE (weighted prediction error) algorithm;
2. The current reverberant voice signal is written S(time, signal); after processing, the estimated voice signal with the reverberation component removed is:
$$Q(time, signal) = S(time, signal) - B(time, signal)$$
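The two formulas of step B(2) can be traced with a short Python sketch. It only applies the subtraction Q = S − B for weights that are already known; estimating the weights with the WPE algorithm is not shown. The single-channel simplification (the `signal` index is dropped), the synthetic test signal and the function names are assumptions.

```python
def predict_reverberation(mixed, weights, t):
    """B(t) = sum over k = 1..m of W(k) * S(t - k); samples before t = 0 count as zero."""
    return sum(w * mixed[t - k]
               for k, w in enumerate(weights, start=1) if t - k >= 0)

def remove_reverberation(mixed, weights):
    """Q(t) = S(t) - B(t) for every sample of the reverberant signal S."""
    return [s - predict_reverberation(mixed, weights, t)
            for t, s in enumerate(mixed)]

# Synthetic example: each sample picks up half of the sample two steps earlier.
clean = [1.0, 0.0, -0.5, 0.25, 0.0, 0.0]
weights = [0.0, 0.5]  # W(1) = 0, W(2) = 0.5, assumed known here
mixed = []
for t, q in enumerate(clean):
    mixed.append(q + predict_reverberation(mixed, weights, t))

recovered = remove_reverberation(mixed, weights)
```

Because the weights in this toy case match the way the reverberation was generated, `recovered` reproduces `clean`; with estimated weights the result would only approximate the speech component.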
C. Use the fast parallel processing capability of the field-programmable gate array (FPGA) to enhance the teaching video images and to scale the video images;
(1) The FPGA processor enhances the gray levels and the hue of the video image;
1. In the spatial-domain enhancement method, the histogram is used to show the probability with which each gray level occurs in the image: the histogram of the image is computed to obtain the number of pixels at each gray level;
2. The contrast of the image is enhanced by histogram equalization. First, the probability density function P(gray_g) of the video image is computed, where f(l, c) denotes a pixel of the input video image, L and C denote the rows and columns of the image, and G denotes the highest gray level of the image; F(f(l, c) − gray_g) = 1 if f(l, c) = gray_g, and F(f(l, c) − gray_g) = 0 otherwise;
3. The probability density function P(gray_g) is then integrated, which yields the gray-level value of the image after histogram equalization.
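The histogram equalization steps of C(1) can be sketched in Python as follows. The sketch uses the textbook form of the technique (count pixels per gray level, accumulate, scale by the highest gray level); whether the patent's formulas match this form term by term cannot be confirmed from the text. The function name `equalize` and the rounding rule are assumptions, and a software loop stands in for the FPGA pipeline.

```python
def equalize(image, max_level=255):
    """Histogram-equalize a grayscale image given as a list of rows of integers."""
    total = len(image) * len(image[0])

    # Histogram: number of pixels at each gray level.
    counts = [0] * (max_level + 1)
    for row in image:
        for pixel in row:
            counts[pixel] += 1

    # Running sum of the histogram, i.e. the discrete integral of the density.
    cumulative, running = [], 0
    for count in counts:
        running += count
        cumulative.append(running)

    # New gray level = highest level * cumulative probability, rounded.
    mapping = [round(max_level * c / total) for c in cumulative]
    return [[mapping[pixel] for pixel in row] for row in image]

# A dark, low-contrast image is spread over the full gray range.
dark = [[10, 10, 20, 20],
        [10, 20, 30, 30]]
bright = equalize(dark)
```

In the example, the three gray levels 10, 20 and 30 are mapped to 96, 191 and 255, which stretches the contrast without changing the ordering of the levels.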
(2) Using the HSI color space model, the image is enhanced while its hue is kept unchanged;
1. The image is translated or stretched by applying a linear transformation of the same scale to the three color components R, G and B; the hue H is computed in the HSI color space;
2. To keep the hue of the image invariant, the transformation coefficients of the three color components R, G and B should be equal when the image is stretched. With m and n denoting the bounds of the value range of each component, [m_R, n_R], [m_G, n_G], [m_B, n_B], the transformation of any point D in the video image follows a criterion in which S denotes the maximum value of the color component;
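Why equal coefficients preserve the hue can be checked numerically. The Python sketch below uses the widely used HSI hue formula, which is assumed here because the patent's own expression is not reproduced in the text; `hsi_hue_deg` and `stretch` are hypothetical helper names.

```python
import math

def hsi_hue_deg(r, g, b):
    """Hue in degrees from the common HSI conversion (assumed form)."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        return 0.0  # gray pixel: hue is undefined, 0 is returned by convention
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    return theta if b <= g else 360.0 - theta

def stretch(pixel, gain, offset):
    """Apply the same linear transformation gain * x + offset to R, G and B."""
    return tuple(gain * c + offset for c in pixel)

original = (0.40, 0.25, 0.10)
stretched = stretch(original, gain=1.5, offset=0.05)

hue_before = hsi_hue_deg(*original)
hue_after = hsi_hue_deg(*stretched)
```

The offset cancels in every difference of two components and a positive gain scales numerator and denominator alike, so the two hue values agree; using different gains per channel would break this.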
(3) Bilinear interpolation is used: a weighted average of the gray values of the 2×2 neighboring pixels gives the gray value of the interpolation point, realizing the scale transformation of the video image;
1. Establish the bilinear interpolation model (Fig. 3);
2. The gray value of point B is computed from the two points Q_1, Q_2 in the x direction and the gray value of point A from the two points Q_3, Q_4 in the y direction; finally, the gray value E of the interpolation point C is obtained by linear interpolation. The interpolation formula is:
$$E = (1-x)(1-y)E_1 + x(1-y)E_2 + (1-x)yE_3 + xyE_4$$
where E_1, E_2, E_3, E_4 denote the gray values of the four points Q_1, Q_2, Q_3, Q_4, and x and y denote the incremental offsets in the horizontal and vertical directions of the original image;
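The interpolation formula of step C(3) translates directly into code. In the Python sketch below, `bilinear` is the formula itself, while `resize` is an assumed, simplified scaling loop (corner-aligned sampling, inputs and outputs of at least 2×2 pixels); the assignment of E_1 to E_4 to the top-left, top-right, bottom-left and bottom-right neighbors is also an assumption about the layout of Fig. 3.

```python
def bilinear(e1, e2, e3, e4, x, y):
    """E = (1-x)(1-y)E1 + x(1-y)E2 + (1-x)yE3 + xyE4, with x and y in [0, 1]."""
    return ((1 - x) * (1 - y) * e1 + x * (1 - y) * e2
            + (1 - x) * y * e3 + x * y * e4)

def resize(image, new_rows, new_cols):
    """Scale a grayscale image; source and target must both be at least 2x2."""
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(new_rows):
        # Map the output row back to a fractional row of the source image.
        src_y = i * (rows - 1) / (new_rows - 1)
        y0 = min(int(src_y), rows - 2)
        dy = src_y - y0
        row = []
        for j in range(new_cols):
            src_x = j * (cols - 1) / (new_cols - 1)
            x0 = min(int(src_x), cols - 2)
            dx = src_x - x0
            row.append(bilinear(image[y0][x0], image[y0][x0 + 1],
                                image[y0 + 1][x0], image[y0 + 1][x0 + 1],
                                dx, dy))
        out.append(row)
    return out

small = [[0.0, 100.0],
         [100.0, 200.0]]
large = resize(small, 3, 3)
```

Scaling the 2×2 example to 3×3 fills the new center pixel with the average of its four neighbors, which is the weighted average described in the text for x = y = 0.5.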
D. Transmit the media units of the teaching video through the layered architecture of multi-level mixing; overlapping the transmission process and the mixing process reduces the accumulated delay, so that the video transmission task is completed in real time.
(1) The audio and video information of the teaching process is acquired, then received and mixed using the layered architecture of multi-level mixing;
1. The system starts mixing as soon as it has received two video images or audio streams, and continues to receive other media information while mixing, so that more operations are completed within a smaller accumulated delay and the system operates in real time;
2. When the number of participants in the system is R, each layered structure str_1 has its own accumulated delay, and the accumulated delay of the optimal layered structure is AccTime(R);
3. The time a mixer needs to mix media units is MixTime, and the transmission time of media information between nodes is TransTime; corresponding items of the media units can be compared and operated on;
(2) The algorithm flow for constructing the layered architecture is shown in Fig. 4; the real-time performance and flexibility of this structure allow the network transmission task of the system to be completed efficiently.
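Choosing a layered structure for a given number of participants can be illustrated with a toy search in Python. The text names R, MixTime and TransTime but its accumulated-delay formula is not reproduced, so the delay model below is purely an assumption: a uniform mixing tree in which every level costs one transmission plus (fan-in − 1) pairwise mixes. The functions `accumulated_delay` and `best_fan_in` are hypothetical and do not reproduce the algorithm of Fig. 4.

```python
import math

def accumulated_delay(participants, fan_in, mix_time, trans_time):
    """Accumulated delay of a uniform mixing tree under the assumed model."""
    # Count how many mixing levels are needed to merge all streams into one.
    levels, remaining = 0, participants
    while remaining > 1:
        remaining = math.ceil(remaining / fan_in)
        levels += 1
    # Assumed cost per level: one transmission hop, then (fan_in - 1) pairwise
    # mixes, since mixing can start once two streams have arrived.
    per_level = trans_time + (fan_in - 1) * mix_time
    return levels * per_level

def best_fan_in(participants, mix_time, trans_time):
    """Fan-in (from 2 up to `participants`) with the smallest accumulated delay."""
    return min(range(2, participants + 1),
               key=lambda k: accumulated_delay(participants, k, mix_time, trans_time))

# Example: 16 participants, MixTime = 2 and TransTime = 10 (arbitrary time units).
choice = best_fan_in(16, mix_time=2.0, trans_time=10.0)
delay = accumulated_delay(16, choice, mix_time=2.0, trans_time=10.0)
```

Under this model a flat structure pays for many mixes at one node and a deep structure pays for many transmission hops, so the minimum lies in between and shifts with the teaching scale R.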
In conclusion just realizing a kind of network transfer method of Digital Media interactive instructional system.It is big in data volume And in the network transmission task of various structures, the present invention can flexible expansion, complete to strong real-time media information processing appoint Business, and suitable layered structure can be selected according to current teaching scale, have accumulation time delay small, the high beneficial effect of flexibility Fruit.

Claims (4)

1. A network transmission method for a digital media interactive teaching system, characterized in that the method comprises the following steps:
A. acquiring a multi-channel voice signal through a microphone array, and analyzing the direction of the voice signal and its variation;
B. performing front-end processing for speech recognition using a beamforming method based on information structure and a dereverberation method;
C. using the fast parallel processing capability of a field-programmable gate array (FPGA) to enhance the teaching video images and to scale the video images;
D. transmitting the media units of the teaching video through a layered architecture of multi-level mixing, in which the transmission process and the mixing process overlap so as to reduce the accumulated delay and complete the video transmission task in real time.
2. The network transmission method for a digital media interactive teaching system according to claim 1, characterized in that step B comprises a front-end processing method for speech recognition, in which a voice signal beam is constructed and dereverberation is applied to the reverberant voice signal to obtain a clean speech component.
3. The network transmission method for a digital media interactive teaching system according to claim 1, characterized in that step C comprises an image processing method for the teaching video: image enhancement and video scaling.
4. The network transmission method for a digital media interactive teaching system according to claim 1, characterized in that step D comprises a layering method for the optimal layered architecture of network communication, together with an expression for its accumulated delay.
CN201811254608.1A 2018-10-25 2018-10-25 A kind of network transfer method of Digital Media interactive instructional system Pending CN109346100A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811254608.1A CN109346100A (en) 2018-10-25 2018-10-25 A kind of network transfer method of Digital Media interactive instructional system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811254608.1A CN109346100A (en) 2018-10-25 2018-10-25 A kind of network transfer method of Digital Media interactive instructional system

Publications (1)

Publication Number Publication Date
CN109346100A true CN109346100A (en) 2019-02-15

Family

ID=65312195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811254608.1A Pending CN109346100A (en) 2018-10-25 2018-10-25 A kind of network transfer method of Digital Media interactive instructional system

Country Status (1)

Country Link
CN (1) CN109346100A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111489753A (en) * 2020-06-24 2020-08-04 深圳市友杰智新科技有限公司 Anti-noise sound source positioning method and device and computer equipment
CN113626204A (en) * 2021-08-27 2021-11-09 北京拙成科技发展有限公司 Interactive all-media digital information processing method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102164328A (en) * 2010-12-29 2011-08-24 中国科学院声学研究所 Audio input system used in home environment based on microphone array
CN102324237A (en) * 2011-05-30 2012-01-18 深圳市华新微声学技术有限公司 Microphone array voice wave beam formation method, speech signal processing device and system
CN105244036A (en) * 2014-06-27 2016-01-13 中兴通讯股份有限公司 Microphone speech enhancement method and microphone speech enhancement device
CN108242042A (en) * 2016-12-23 2018-07-03 南京理工大学 Objective self-adapting plateau equalization implementation method based on FPGA
US20180225522A1 (en) * 2015-06-15 2018-08-09 Davantis Technologies Sl Ir or thermal image enhancement method based on background information for video analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何炎祥等: "多媒体会议***分层通信结构及其算法研究", 《计算机学报》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20190215