WO2003045066A2 - Producing acoustic signals from a video data flow - Google Patents

Producing acoustic signals from a video data flow

Info

Publication number
WO2003045066A2
Authority
WO
WIPO (PCT)
Prior art keywords
vector field
motion vector
vfd
chs
img
Prior art date
Application number
PCT/EP2002/012295
Other languages
German (de)
French (fr)
Other versions
WO2003045066A3 (en)
Inventor
Markus Simon
Original Assignee
Siemens Aktiengesellschaft
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Aktiengesellschaft
Priority to JP2003546577A (published as JP2005510907A)
Priority to EP02790332A (published as EP1446956A2)
Publication of WO2003045066A2
Publication of WO2003045066A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440236 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2368 Multiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/414 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/41407 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4341 Demultiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H04N2007/145 Handheld terminals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/144 Movement detection
    • H04N5/145 Movement estimation

Definitions

  • The invention relates to the generation of signals - in particular acoustic signals - from a video data stream which contains a chronological sequence of individual images.
  • The conversion of movement information into acoustic signals is known, for example, in the form of an acoustic motion detector. Such a detector records an image using a video camera and triggers a signal, such as an acoustic alarm, as soon as the image changes.
  • However, such a motion detector delivers only a constant signal;
  • at best, the acoustic signal can be varied by dividing the examined image area into fields, each of which is assigned a different signal. In cases where, for example, an object moving relative to a background is to be monitored, motion detectors of this type naturally fail.
  • The generation of sounds as a function of movements is also used in the performing arts.
  • For example, the body movements performed by a person are interpreted and used to control sound generation.
  • In this way, the performing person can shape the musical rhythm and theme through their movements.
  • The interaction between movements and sounds is a creative process that encourages the person to create ever new sound sequences through new movements.
  • The generation of a sound is usually based on the person activating sensors during their movements, for example in the manner of a motion detector or a light barrier; the signals from the sensors are processed by a data processing system, and sounds associated with the sensors are generated in this way.
  • However, this kind of sound generation relies on quite complex installations that receive their visual input via several recording devices and are, moreover, installed in a fixed position during operation.
  • It is the object of the invention to show how movement information recorded by a camera can be converted into sound patterns, the generated sounds varying with the type of movement, in particular its direction. Such sound generation should in particular be able to be integrated as a function of a mobile phone with a built-in camera.
  • This object is achieved according to the invention by a method of the type mentioned at the outset with the following steps: a) determining a motion vector field from the image data of a single image with the aid of the image data of preceding and/or subsequent individual images, b) deriving at least one characteristic variable from the motion vector field, and c) generating an acoustic signal as a function of the characteristic variable(s).
  • A device having a control device with a means for determining a motion vector field from the image data of a single image with the aid of the image data of preceding and/or subsequent individual images is suitable for the method according to the invention, the control device additionally being set up to derive at least one characteristic variable from the motion vector field and to generate an acoustic signal as a function of the at least one characteristic variable.
  • The solution according to the invention provides a generation of sounds or acoustic signals that is based not only on the presence of movements but also on the magnitude and/or direction of the recorded movements. In this way, a differentiated design of the generated sound can be achieved. In the case of surveillance, this enables, for example, differentiated monitoring of an image area via acoustic signals, which among other things allows different movement sequences to be distinguished.
  • In a preferred embodiment, the solution according to the invention is implemented on a telecommunication terminal, in particular a mobile telephone. Since functions for image processing are often already provided in mobile telephones, the invention can be implemented there in a particularly inexpensive and compact manner.
  • The video data stream can advantageously be generated by a camera device of the terminal.
  • The acoustic signals can be output via a listening device of the terminal or via a telecommunication connection established with the terminal.
  • It is advantageous, in particular in the case of a mobile phone, if the motion vector field in step a) is determined by means of an MPEG encoder method known per se.
  • An easy-to-implement evaluation of the motion vector field in step b) consists in deriving a distribution from it, e.g. using statistical methods, and determining statistical parameters for this distribution, from which the at least one characteristic variable is determined.
  • FIG. 1 shows a front view of a mobile telephone according to the exemplary embodiment;
  • FIG. 2 shows a rear view of the mobile telephone of FIG. 1;
  • FIG. 3 shows a block diagram of the mobile telephone of FIG. 1;
  • FIG. 4 shows a block diagram of an MPEG encoder;
  • FIG. 5 shows an example of a vector field obtained from a motion estimation; and
  • FIG. 6 shows a movement histogram derived from the vector field of FIG. 5.
  • FIGS. 1 and 2 show a mobile phone MOG which, according to the invention, converts movement recorded from the surroundings into acoustic signals ("acoustic kaleidoscope").
  • The features of the MOG telephone visible on the housing are, in a known manner, a microphone MIC, a loudspeaker LSP and an input field EIN (e.g. a keypad) for entering operator commands and telephone numbers, as well as a display DIS in the form of a screen, e.g. an LCD display, on which a video image can be displayed.
  • The video image img comes in particular from a camera module CAM located on the back (FIG. 2), which records images from the surroundings and feeds them to the image data processing according to the invention.
  • FIG. 3 shows the components of the MOG telephone.
  • In a known manner, these comprise, in addition to the input/output elements LSP, MIC, EIN and the display DIS, an antenna ANN and a transmitting/receiving device SEE for performing the telecommunication functions;
  • in addition, a processor PRC is provided as a control device for interpreting the commands entered by the user via the input field EIN and for the corresponding control of the device SEE.
  • The processor PRC is furthermore set up according to the invention to process the image data img of the camera module in order to generate sounds snd from them, as described below, with the aid of video coding and motion field analysis.
  • The function for generating sounds snd from the motion information of images img essentially comprises the following processing steps: a) encoding of the image information img by an encoder module ENC, for example by means of an MPEG algorithm, to determine an associated motion vector field vfd; b) analysis of the vector field vfd for predetermined characteristic quantities chs, for example the prevailing motion vector, in an analysis module AAN of the processor system PRC; and c) sound generation based on the quantities chs, for example generation of a sound as a function of the orientation and magnitude of the prevailing motion vector, by a synthesis module SYN, which in the exemplary embodiment shown here is likewise realized in the processor system PRC.
  • In step a), the encoding of the image information takes place, for example, using the known MPEG encoding.
  • This is a standardized method for the compression and transmission of digitized image sequences and is often already implemented in mobile phones with cameras, e.g. for the purpose of video telephony.
  • Essential components of the method are the motion estimation BS of successive images (see below), the motion compensation BK, and the transformation of the motion-compensated image into frequency space by means of a discrete cosine transformation (DCT) in conjunction with a data reduction DK.
  • The principle of the method can be seen in FIG. 4; for a more detailed account, reference is made to "Digital Signal Processing for Multimedia Systems", Keshab K. Parhi and Takao Nishitani (eds.), Marcel Dekker, Inc., New York, pages 31-37.
  • In FIG. 4, iDCT denotes the transformation inverse to the DCT, and DG the digitization of the image stream received as input.
  • For the derivation of acoustic signals according to the invention, however, it is not so much an image compressed according to the MPEG method that is needed, but rather the result of the motion estimation BS, which is part of the MPEG method.
  • In addition to the image imn to be evaluated at time tn, the input to the motion estimation BS is the preceding image imp at time tn-1.
  • The image imp is obtained from the DCT-transformed and motion-compensated signal of the previous image by means of an inverse DCT (iDCT) and is temporarily buffered in an image memory IS for the duration of an image change.
  • The image imp serves as a reference image and is subdivided into a number of blocks bbl, for example, as shown in FIG. 5, into 36 pixel blocks of 16x16 pixels each. For each of these pixel blocks bbl, the best possible match is sought in a local neighborhood of the image imn to be evaluated in accordance with the MSE method ('Mean Squared Error'). In this way, information about the displacement vector is obtained for each block bbl.
  • FIG. 5 shows, as an example, the image img of FIG. 1, in which the determined motion vectors v are additionally entered for each block bbl.
  • The picture shows an automobile driving in front of a background; the camera was panned with the vehicle during the recorded image sequence.
  • For this reason, the motion vectors determined on the vehicle have almost zero magnitude, while the vectors in the surroundings show motion.
  • The motion vector field vfd resulting from this processing is the input for the next processing stage.
  • The motion estimation described here, which is part of the known MPEG method, provides a simple yet effective motion analysis.
  • Other motion analysis methods that provide a motion vector field can also be used within the scope of the invention; these generally draw on one or more images temporally preceding or following the image to be examined.
  • In the next step b), one or more characteristic quantities chs - here, for example, the dominant movement orientation - are determined from the motion vector field vfd. This takes place in the analysis module AAN.
  • In the example considered here, the orientations of all vectors of the motion field are entered in a histogram his, cf. FIG. 6.
  • The maximum hmx within the distribution represented by the histogram his indicates the main direction of motion within the picture.
  • For this direction, a speed magnitude is determined, calculated by simple averaging over all vectors that belong to this main movement direction within a predeterminable tolerance (e.g. in each case the two classes adjacent to the histogram class of the maximum).
  • The result of this processing is a vector whose orientation and magnitude describe the main movement in the image. In the example considered here, this vector is the (two-dimensional) characteristic quantity chs from which sounds are derived in the subsequent processing stage.
  • The evaluation can of course also be carried out in a different way.
  • Variables that can be used as the basis of the evaluation are, in particular, statistical characteristics of a distribution such as the most frequent value (maximum), secondary maxima, associated variances, weights of higher orders, and so on.
  • In step c), the sound is then generated in the synthesis module SYN on the basis of the characteristic variable(s) chs.
  • As a function of the orientation and magnitude of the previously determined main motion vector, a sound snd is generated and output via the loudspeaker LSP.
  • Alternatively, the sound sequence, which is present as an acoustic signal in a known electrical representation, can be transmitted to another subscriber via a telecommunication connection of the mobile phone MOG.
  • For example, during sound generation the magnitude of the movement speed controls the volume, while different types of sounds are generated depending on the movement orientation.
  • Here, for instance, each orientation class represented in the histogram his is assigned a pre-stored sound which differs from the other sounds in its pitch and/or sound characteristics (overtone spectrum).
  • The sounds can - but do not have to - be arranged according to their pitch.
  • Of course, the type of sound generation can be varied.
  • In one variant of the invention, several maxima hmx, bmx of the distribution his could lead to a superposition of sounds.
  • In another variant, the histogram his could be fed directly to the synthesis device SYN, which uses it as the overtone spectrum of a fundamental tone; the pitch of the fundamental can remain the same (e.g. an A at 110 Hz) or can be determined using the procedure described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

According to the invention, acoustic signals (snd) are produced from a video data flow containing a temporal sequence of individual images (img). A displacement vector field (vfd) is determined from the image data of the individual images (img), for example by means of an MPEG algorithm; at least one characteristic variable (chs) is then derived from the displacement vector field (vfd), e.g. a dominant displacement vector, in an analysis module (AAN); and an acoustic signal (snd) is produced on the basis of said variables (chs) by means of a synthesis module (SYN).

Description

Method and device for generating acoustic signals from a video data stream
The invention relates to the generation of signals - in particular acoustic signals - from a video data stream which contains a chronological sequence of individual images.
The conversion of movement information into acoustic signals is known, for example, in the form of an acoustic motion detector. Such a detector records an image using a video camera and triggers a signal, such as an acoustic alarm, as soon as the image changes. However, such a motion detector delivers only a constant signal; at best, the acoustic signal can be varied by dividing the examined image area into fields, each of which is assigned a different signal. In cases where, for example, an object moving relative to a background is to be monitored, motion detectors of this type naturally fail.
The generation of sounds as a function of movements is also used in the performing arts. For example, the body movements performed by a person are interpreted and used to control sound generation. In this way, the performing person can shape the musical rhythm and theme through their movements. The interaction between movements and sounds is a creative process that encourages the person to create ever new sound sequences through new movements. The generation of a sound is usually based on the person activating sensors during their movements, for example in the manner of a motion detector or a light barrier; the signals from the sensors are processed by a data processing system, and sounds associated with the sensors are generated in this way. However, this kind of sound generation relies on quite complex installations that receive their visual input via several recording devices and are, moreover, installed in a fixed position during operation.
It is the object of the invention to show how movement information recorded by a camera can be converted into sound patterns, the generated sounds varying with the type of movement, in particular its direction. Such sound generation should in particular be able to be integrated as a function of a mobile phone with a built-in camera.
This object is achieved according to the invention by a method of the type mentioned at the outset with the following steps: a) determining a motion vector field from the image data of a single image with the aid of the image data of preceding and/or subsequent individual images, b) deriving at least one characteristic variable from the motion vector field, and c) generating an acoustic signal as a function of the characteristic variable(s).
A device having a control device with a means for determining a motion vector field from the image data of a single image with the aid of the image data of preceding and/or subsequent individual images is suitable for the method according to the invention, the control device additionally being set up to derive at least one characteristic variable from the motion vector field and to generate an acoustic signal as a function of the at least one characteristic variable.
The solution according to the invention provides a generation of sounds or acoustic signals that is based not only on the presence of movements but also on the magnitude and/or direction of the recorded movements. In this way, a differentiated design of the generated sound can be achieved. In the case of surveillance, this enables, for example, differentiated monitoring of an image area via acoustic signals, which among other things allows different movement sequences to be distinguished.
In a preferred embodiment, the solution according to the invention is implemented on a telecommunication terminal, in particular a mobile telephone. Since functions for image processing are often already provided in mobile telephones, the invention can be implemented there in a particularly inexpensive and compact manner. In this case, the video data stream can advantageously be generated by a camera device of the terminal. Furthermore, the acoustic signals can be output via a listening device of the terminal or via a telecommunication connection established with the terminal.
In addition, it is advantageous, in particular in the case of a mobile phone, if in step a) the motion vector field is determined by means of an MPEG encoder method known per se.
An easy-to-implement evaluation of the motion vector field in step b) consists in deriving a distribution from it, e.g. using statistical methods, and determining statistical parameters for this distribution, from which the at least one characteristic variable is determined.
The invention, together with further advantages, is explained in more detail below with reference to a non-restrictive exemplary embodiment, which is shown in the accompanying drawings. The drawings show in schematic form:
FIG. 1 a front view of a mobile telephone according to the exemplary embodiment;
FIG. 2 a rear view of the mobile telephone of FIG. 1;
FIG. 3 a block diagram of the mobile telephone of FIG. 1;
FIG. 4 a block diagram of an MPEG encoder;
FIG. 5 an example of a vector field obtained from a motion estimation; and
FIG. 6 a movement histogram derived from the vector field of FIG. 5.
It should be noted that the exemplary embodiment described below is intended merely as an example, and the invention presented above is not to be understood as being limited thereto.
FIGS. 1 and 2 show a mobile phone MOG which, according to the invention, converts movement recorded from the surroundings into acoustic signals ("acoustic kaleidoscope"). The features of the MOG telephone visible on the housing are, in a known manner, a microphone MIC, a loudspeaker LSP and an input field EIN (e.g. a keypad) for entering operator commands and telephone numbers, as well as a display DIS in the form of a screen, e.g. an LCD display, on which a video image can be displayed. The video image img comes in particular from a camera module CAM located on the back (FIG. 2), which records images from the surroundings and feeds them to the image data processing according to the invention. Also on the back of the mobile phone MOG there is, in a known manner, a compartment CAC for the battery and SIM card.
The block diagram of FIG. 3 shows the components of the MOG telephone. In a known manner, these comprise, in addition to the input/output elements LSP, MIC, EIN and the display DIS, an antenna ANN and a transmitting/receiving device SEE for performing the telecommunication functions; in addition, a processor PRC is provided as a control device for interpreting the commands entered by the user via the input field EIN and for the corresponding control of the device SEE.
The processor PRC is furthermore set up according to the invention to process the image data img of the camera module in order to generate sounds snd from them, as described below, with the aid of video coding and motion field analysis. The function for generating sounds snd from the motion information of images img essentially comprises the following processing steps: a) encoding of the image information img by an encoder module ENC, for example by means of an MPEG algorithm, to determine an associated motion vector field vfd; b) analysis of the vector field vfd for predetermined characteristic quantities chs, for example the prevailing motion vector, in an analysis module AAN of the processor system PRC; and c) sound generation based on the quantities chs, for example generation of a sound as a function of the orientation and magnitude of the prevailing motion vector, by a synthesis module SYN, which in the exemplary embodiment shown here is likewise realized in the processor system PRC.
In step a), the encoding of the image information takes place, for example, using the known MPEG encoding. This is a standardized method for the compression and transmission of digitized image sequences and is often already implemented in mobile phones with cameras, e.g. for the purpose of video telephony. Essential components of the method are the motion estimation BS of successive images (see below), the motion compensation BK, and the transformation of the motion-compensated image into frequency space by means of a discrete cosine transformation (DCT) in conjunction with a data reduction DK. The principle of the method can be seen in FIG. 4; for a more detailed account, reference is made to "Digital Signal Processing for Multimedia Systems", Keshab K. Parhi and Takao Nishitani (eds.), Marcel Dekker, Inc., New York, pages 31-37. In FIG. 4, iDCT denotes the transformation inverse to the DCT, and DG the digitization of the image stream received as input.
For the derivation of acoustic signals according to the invention, however, it is not so much an image compressed according to the MPEG method that is needed, but rather the result of the motion estimation BS, which is part of the MPEG method. In addition to the image imn to be evaluated at time tn, the input to the motion estimation BS is the preceding image imp at time tn-1. The image imp is obtained from the DCT-transformed and motion-compensated signal of the previous image by means of an inverse DCT (iDCT) and is temporarily buffered in an image memory IS for the duration of an image change.
The image imp serves as a reference image and is subdivided into a number of blocks bbl, for example, as shown in FIG. 5, into 36 pixel blocks of 16x16 pixels each. For each of these pixel blocks bbl, the best possible match is sought in a local neighborhood of the image imn to be evaluated in accordance with the MSE method ('Mean Squared Error'). In this way, information about the displacement vector is obtained for each block bbl. FIG. 5 shows, as an example, the image img of FIG. 1, in which the determined motion vectors v are additionally entered for each block bbl. The picture shows an automobile driving in front of a background; the camera was panned with the vehicle during the recorded image sequence. For this reason, the motion vectors determined on the vehicle have almost zero magnitude, while the vectors in the surroundings show motion. The motion vector field vfd resulting from this processing is the input for the next processing stage. It should be noted that the motion estimation described here, which is part of the known MPEG method, provides a simple yet effective motion analysis. Other motion analysis methods that provide a motion vector field can also be used within the scope of the invention; these generally draw on one or more images temporally preceding or following the image to be examined.
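To make the block-matching step concrete, here is a minimal Python/numpy sketch of an exhaustive MSE search; the 16x16 block size matches the example above, while the search radius, the array layout and the function name are illustrative assumptions rather than details taken from the patent.

```python
import numpy as np

def block_matching(imp, imn, block=16, radius=8):
    """Estimate a motion vector field vfd between a reference frame imp and the
    frame imn to be evaluated, using exhaustive MSE block matching.

    imp, imn : 2-D numpy arrays (grayscale frames of equal size)
    Returns an array of shape (rows, cols, 2) holding one (dy, dx) vector per block.
    """
    h, w = imp.shape
    rows, cols = h // block, w // block
    vfd = np.zeros((rows, cols, 2), dtype=int)

    for r in range(rows):
        for c in range(cols):
            y0, x0 = r * block, c * block
            ref = imp[y0:y0 + block, x0:x0 + block].astype(float)
            best_mse, best = np.inf, (0, 0)
            # exhaustive search in a local neighborhood of +/- radius pixels
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    cand = imn[y:y + block, x:x + block].astype(float)
                    mse = np.mean((cand - ref) ** 2)
                    if mse < best_mse:
                        best_mse, best = mse, (dy, dx)
            vfd[r, c] = best
    return vfd
```

For a 96x96 frame this yields the 6x6 = 36 blocks of the example in FIG. 5; an MPEG encoder would normally provide these vectors as a by-product, so the explicit search above is only an illustration of the principle.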
In the next step b), one or more characteristic quantities chs - here, for example, the dominant movement orientation - are determined from the motion vector field vfd. This takes place in the analysis module AAN. In the example considered here, the orientations of all vectors of the motion field are entered in a histogram his, cf. FIG. 6. (For the sake of clarity, FIG. 6 uses a division into 16 direction classes; of course, the number of classes of the base set can be significantly higher and is limited only by the number of blocks bbl and the resolution of the motion estimation.) The maximum hmx within the distribution represented by the histogram his indicates the main direction of motion within the picture. For this direction, a speed magnitude is determined, calculated by simple averaging over all vectors that belong to this main movement direction within a predeterminable tolerance (e.g. in each case the two classes adjacent to the histogram class of the maximum). The result of this processing is a vector whose orientation and magnitude describe the main movement in the image. In the example considered here, this vector is the (two-dimensional) characteristic quantity chs from which sounds are derived in the subsequent processing stage.
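A sketch of this analysis step under the same assumptions (Python/numpy, 16 orientation classes, a tolerance of two classes on either side of the maximum); the binning convention and the handling of zero-length vectors are choices made here for illustration, not prescribed by the patent.

```python
import numpy as np

def dominant_motion(vfd, n_classes=16, tol=2):
    """Derive the characteristic quantity chs (main orientation, mean speed)
    from a motion vector field vfd of shape (rows, cols, 2)."""
    v = vfd.reshape(-1, 2).astype(float)
    mag = np.hypot(v[:, 0], v[:, 1])
    moving = mag > 0                                       # zero vectors carry no orientation
    ang = np.arctan2(v[moving, 0], v[moving, 1])           # orientation in radians, [-pi, pi]
    cls = ((ang + np.pi) / (2 * np.pi) * n_classes).astype(int) % n_classes

    his = np.bincount(cls, minlength=n_classes)            # orientation histogram
    hmx = int(np.argmax(his))                              # class of the main direction

    # average the magnitudes of vectors within +/- tol classes of the maximum (circular)
    dist = np.minimum((cls - hmx) % n_classes, (hmx - cls) % n_classes)
    sel = dist <= tol
    speed = float(mag[moving][sel].mean()) if sel.any() else 0.0

    main_angle = (hmx + 0.5) / n_classes * 2 * np.pi - np.pi   # class center, radians
    return main_angle, speed, his
```

The returned histogram his can also be handed on to the synthesis stage, as in the variant described further below.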
In other embodiments of the invention, the evaluation can of course also be carried out in a different way. For example, a histogram could be evaluated that takes into account both the orientation and the magnitude of the vectors (i.e. the frequency over a two-dimensional base set). Variables that can be used as the basis of the evaluation are, in particular, statistical characteristics of a distribution such as the most frequent value (maximum), secondary maxima, associated variances, weights of higher orders, and so on.
In step c), the sound is then generated in the synthesis module SYN on the basis of the characteristic variable(s) chs. As a function of the orientation and magnitude of the previously determined main motion vector, a sound snd is generated and output via the loudspeaker LSP. Alternatively, the sound sequence, which is present as an acoustic signal in a known electrical representation, can be transmitted to another subscriber via a telecommunication connection of the mobile phone MOG.
For example, during sound generation the magnitude of the movement speed controls the volume, while different types of sounds are generated depending on the movement orientation. Here, for instance, each orientation class represented in the histogram his is assigned a pre-stored sound which differs from the other sounds in its pitch and/or sound characteristics (overtone spectrum). The sounds can - but do not have to - be arranged according to their pitch.
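One possible realization of this mapping, sketched in Python/numpy: the orientation class selects a pre-stored pitch and the speed magnitude controls the volume. The semitone ladder above A (110 Hz), the normalization constant max_speed and the sample rate are assumptions chosen for illustration only.

```python
import numpy as np

def synthesise_tone(orientation_class, speed, n_classes=16,
                    max_speed=16.0, sample_rate=8000, duration=0.2):
    """Map the characteristic quantity chs (orientation class, speed) to a tone snd.

    Each orientation class is assigned a pre-stored pitch; the speed magnitude
    controls the volume. Returns the tone as a float array in [-1, 1].
    """
    # one pre-stored pitch per orientation class, here a semitone ladder above A (110 Hz)
    pitches = 110.0 * 2.0 ** (np.arange(n_classes) / 12.0)
    freq = pitches[orientation_class % n_classes]

    volume = min(speed / max_speed, 1.0)          # louder for faster movement
    t = np.arange(int(sample_rate * duration)) / sample_rate
    return volume * np.sin(2 * np.pi * freq * t)
```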
Of course, the type of sound generation can be varied. In one variant of the invention, several maxima hmx, bmx of the distribution his could lead to a superposition of sounds. In another variant, the histogram his could be fed directly to the synthesis device SYN, which uses it as the overtone spectrum of a fundamental tone; the pitch of the fundamental can remain the same (e.g. an A at 110 Hz) or can be determined using the procedure described above.
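For the last variant, a sketch assuming simple additive synthesis (the patent names no specific synthesis technique): the normalized histogram his directly weights the harmonics of a fixed 110 Hz fundamental.

```python
import numpy as np

def histogram_as_overtones(his, fundamental=110.0, sample_rate=8000, duration=0.5):
    """Use the orientation histogram his as the overtone spectrum of a fundamental
    tone: bin k weights the (k+1)-th harmonic of the fundamental."""
    his = np.asarray(his, dtype=float)
    weights = his / his.sum() if his.sum() > 0 else his
    t = np.arange(int(sample_rate * duration)) / sample_rate
    snd = np.zeros_like(t)
    for k, w in enumerate(weights):
        snd += w * np.sin(2 * np.pi * fundamental * (k + 1) * t)
    # normalise to avoid clipping when the signal is written to an audio device
    peak = np.max(np.abs(snd))
    return snd / peak if peak > 0 else snd
```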

Claims

1. A method for generating signals from a video data stream which contains a chronological sequence of individual images (img), characterized by the following steps: a) determining a motion vector field (vfd) from the image data of a single image (img) with the aid of the image data of preceding and/or subsequent individual images, b) deriving at least one characteristic variable (chs) from the motion vector field (vfd), and c) generating an acoustic signal (snd) as a function of the characteristic variable(s) (chs).
2. The method according to claim 1, characterized in that it is carried out by a telecommunication terminal (MOG), in particular a mobile phone.
3. The method according to claim 2, characterized in that the video data stream is generated by a camera device (CAM) of the terminal (MOG).
4. The method according to claim 2 or 3, characterized in that the acoustic signals are output via a listening device (LSP) of the terminal (MOG).
5. The method according to claim 2 or 3, characterized in that the acoustic signals are output via a telecommunication connection established with the terminal (MOG).
6. The method according to one of claims 1 to 5, characterized in that in step a) the motion vector field is determined by means of an MPEG encoder method known per se.
7. The method according to one of claims 1 to 6, characterized in that in step b) a distribution (his) is derived from the motion vector field (vfd) and statistical parameters are determined for this distribution, from which the at least one characteristic variable (chs) is determined.
8. A device for generating signals from a video data stream which contains a chronological sequence of individual images (img), characterized by a control device with a means (ENC) for determining a motion vector field (vfd) from the image data of a single image (img) with the aid of the image data of preceding and/or subsequent individual images, the control device additionally being set up to derive at least one characteristic variable (chs) from the motion vector field (vfd) and to generate an acoustic signal (snd) as a function of the at least one characteristic variable(s) (chs).
9. The device according to claim 8, characterized in that it is provided in a telecommunication terminal (MOG), in particular a mobile phone.
10. The device according to claim 9, characterized by a camera device (CAM) for generating a video data stream (img) which can be fed to the means (ENC) for determining a motion vector field (vfd).
11. The device according to one of claims 8 to 10, characterized in that the means (ENC) for determining a motion vector field (vfd) is an MPEG encoder.
PCT/EP2002/012295 2001-11-22 2002-11-04 Producing acoustic signals from a video data flow WO2003045066A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2003546577A JP2005510907A (en) 2001-11-22 2002-11-04 Method and apparatus for forming an audio signal from a video data stream
EP02790332A EP1446956A2 (en) 2001-11-22 2002-11-04 Method and device for producing acoustic signals from a video data flow

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP01127842.1 2001-11-22
EP01127842 2001-11-22

Publications (2)

Publication Number Publication Date
WO2003045066A2 true WO2003045066A2 (en) 2003-05-30
WO2003045066A3 WO2003045066A3 (en) 2003-11-27

Family

ID=8179319

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2002/012295 WO2003045066A2 (en) 2001-11-22 2002-11-04 Producing acoustic signals from a video data flow

Country Status (3)

Country Link
EP (1) EP1446956A2 (en)
JP (1) JP2005510907A (en)
WO (1) WO2003045066A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1507389A1 (en) * 2003-08-13 2005-02-16 Sony Ericsson Mobile Communications AB Mobile phone with means for switching off the alarm remotely
DE102022105681A1 (en) 2022-03-10 2023-09-14 Ebm-Papst Mulfingen Gmbh & Co. Kg Method for determining vibration of a ventilation system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0294960A2 (en) * 1987-06-09 1988-12-14 Sony Corporation Motion vector processing in television images
GB2350512A (en) * 1999-05-24 2000-11-29 Motorola Ltd Video encoder
WO2001006791A1 (en) * 1999-04-22 2001-01-25 Activesky Inc. Wireless video surveillance system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0294960A2 (en) * 1987-06-09 1988-12-14 Sony Corporation Motion vector processing in television images
WO2001006791A1 (en) * 1999-04-22 2001-01-25 Activesky Inc. Wireless video surveillance system
GB2350512A (en) * 1999-05-24 2000-11-29 Motorola Ltd Video encoder

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
ATSUSHI K ET AL: "MOVING OBJECT DETECTION METHOD USING H.263 VIDEO CODED DATA FOR REMOTE SURVEILLANCE SYSTEMS", PROCEEDINGS OF THE SPIE, SPIE, BELLINGHAM, VA, US, Vol. 3641, January 1999 (1999-01), pages 247-258, XP000991237 *
BALDAUF K: "AURAL AIR", FLORIDA STATE UNIVERSITY BULLETIN, RESEARCH IN REVIEW, FLORIDA STATE UNIVERSITY, TALLAHASSEE, FL, US, Vol. 8, No. 2/3, 1997, page 6,2,1, XP001091400, ISSN: 0885-2073 *
BALDAUF, K.J.: "AirMusic Background Reports", FLORIDA STATE UNIVERSITY, [Online] 14 July 1997 (1997-07-14), pages 1-2, XP000926339, Florida, US. Retrieved from the Internet: <URL:http://websrv.cs.fsu.edu/~baldauf/projects/airmusic/airreports.html> [retrieved on 2002-05-16] *
BALDAUF, K.J.: "AirMusic: Interpreting Movement with Sound (Project Defense Summer 1997)", FLORIDA STATE UNIVERSITY, [Online] July 1997 (1997-07), XP000926348, Florida, US. Retrieved from the Internet: <URL:http://websrv.cs.fsu.edu/~baldauf/projects/airmusic/AirMusi2.html> [retrieved on 2002-05-16] *
COLLANDER P ET AL: "Mobile multimedia communication", ELECTRONIC MANUFACTURING TECHNOLOGY SYMPOSIUM, 1995, PROCEEDINGS OF 1995 JAPAN INTERNATIONAL, 18TH IEEE/CPMT INTERNATIONAL, OMIYA, JAPAN, 4-6 DEC. 1995, NEW YORK, NY, USA, IEEE, US, 4 December 1995 (1995-12-04), pages 20-22, XP010195637, ISBN: 0-7803-3622-4 *
FELS S ET AL: "Iamascope: a graphical musical instrument - new ways to play", COMPUTERS AND GRAPHICS, PERGAMON PRESS LTD., OXFORD, GB, Vol. 23, No. 2, April 1999 (1999-04), pages 277-286, XP004165791, ISSN: 0097-8493 *
VLACHOS T: "Simple method for estimation of global motion parameters using sparse translational motion vector fields", ELECTRONICS LETTERS, IEE STEVENAGE, GB, Vol. 34, No. 1, 8 January 1998 (1998-01-08), pages 60-62, XP006009128, ISSN: 0013-5194 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1507389A1 (en) * 2003-08-13 2005-02-16 Sony Ericsson Mobile Communications AB Mobile phone with means for switching off the alarm remotely
DE102022105681A1 (en) 2022-03-10 2023-09-14 Ebm-Papst Mulfingen Gmbh & Co. Kg Method for determining vibration of a ventilation system

Also Published As

Publication number Publication date
JP2005510907A (en) 2005-04-21
WO2003045066A3 (en) 2003-11-27
EP1446956A2 (en) 2004-08-18

Similar Documents

Publication Publication Date Title
DE19642558B4 (en) Device for electronic program guide
DE69523503T2 (en) Audiovisual communication method and device with integrated, perception-dependent speech and video coding
DE69630121T2 (en) IMAGE COMPRESSION SYSTEM
DE69326751T2 (en) MOTION IMAGE ENCODER
DE10212915A1 (en) Electroendoscope system with electroendoscopes with different numbers of pixels
EP0645037A1 (en) Process for detecting changes in moving images
WO2006056531A1 (en) Transcoding method and device
DE19618984B4 (en) Method for motion evaluation in image data and apparatus for carrying out this method
DE102008001076A1 (en) Method, device and computer program for reducing the resolution of an input image
DE102016121755A1 (en) Method for determining a composite image of a surrounding area of a motor vehicle with adaptation of brightness and / or color, camera system and power vehicle
DE69321011T2 (en) Method and device for noise measurement
EP0525900B1 (en) Filter circuit for preprocessing a video signal
DE102023134534A1 (en) ISSUE METHOD AND ELECTRONIC DEVICE
WO2001008409A1 (en) Mobile videophone
EP0897247A2 (en) Method for computing motion vectors
DE69801165T2 (en) SIGNAL PROCESSING
DE102007010662A1 (en) Method for gesture-based real time control of virtual body model in video communication environment, involves recording video sequence of person in end device
EP1489842B1 (en) Motion vector based method and device for the interpolation of picture elements
EP1446956A2 (en) Method and device for producing acoustic signals from a video data flow
DE69911964T2 (en) PERFORMANCE MEASUREMENT OF TELECOMMUNICATION SYSTEMS
EP2536127A2 (en) Method for image-based automatic exposure adjustment
DE102022121955A1 (en) AUDIO PROCESSING METHOD, AUDIO PROCESSING DEVICE, ELECTRONIC AUDIO PROCESSING DEVICE AND STORAGE MEDIA
DE19749655B4 (en) Method and device for coding a motion vector
EP0363677A2 (en) Circuit for the estimation of movement in a detected picture
EP1848148A1 (en) Method for creating user profiles and method for providing information on objects

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2002790332

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2003546577

Country of ref document: JP

WWP Wipo information: published in national office

Ref document number: 2002790332

Country of ref document: EP