WO2006022071A1 - Video display device and video display method
- Publication number
- WO2006022071A1 (PCT/JP2005/011423)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speaker
- display
- video
- information
- subtitle
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/434—Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
- H04N21/4348—Demultiplexing of additional data and video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/84—Generation or processing of descriptive data, e.g. content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/44—Receiver circuitry for the reception of television signals according to analogue transmission standards
- H04N5/60—Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals
Definitions
- the present invention relates to a video display device and a video display method, and more particularly to a video display device and a video display method for displaying subtitles.
- Patent Document 1 discloses such a technique as shown in FIG. 1.
- Patent Document 2 discloses, as shown in FIG. 2, a technique in which a balloon frame corresponding to a person displayed in the image is shown, and data obtained by converting speech into characters is displayed as subtitles (character data) in the balloon frame associated with the speaker of that speech. This makes it easy to identify speakers, which is difficult with subtitles alone, and makes the program contents easy to understand even in silence.
- Patent Document 1: Japanese Unexamined Patent Application Publication No. 2004-056286
- Patent Document 2: Japanese Unexamined Patent Application Publication No. 2004-080069
- An object of the present invention is to provide a video display device and a video display method that allow a viewer to easily grasp the contents of a program even in a silent state.
- The video display device of the present invention comprises a display processing unit that creates a subtitle display image associating subtitles with speaker information that allows the viewer to recognize the subtitle speaker and synthesizes the created subtitle display image with the video, and a display unit that displays the image synthesized by the display processing unit.
- FIG. 1 is a diagram showing an image display method disclosed in Patent Document 1.
- FIG. 2 is a diagram showing an image display method disclosed in Patent Document 2.
- FIG. 3 is a block diagram showing a configuration of a broadcasting system according to Embodiment 1 of the present invention.
- FIG. 4 is a conceptual diagram showing the processing of the caption processing unit shown in FIG. 3.
- FIG. 6 is a conceptual diagram showing the processing of the speaker information extraction unit shown in FIG. 3.
- FIG. 7 is a conceptual diagram showing the processing of the display processing unit shown in FIG. 3.
- FIG. 8 is a flowchart showing the processing procedure of the caption processing unit shown in FIG. 3.
- FIG. 13 is a block diagram showing a configuration of the second embodiment of the present invention.
- FIG. 14 is a block diagram showing the configuration of the third embodiment of the present invention.
- FIG. 15 is a block diagram showing the configuration of the third embodiment of the present invention.
- FIG. 16 is a block diagram showing a configuration of a fourth embodiment of the present invention.
- FIG. 17 is a block diagram showing the configuration of the fourth embodiment of the present invention.
- FIG. 18 is a block diagram showing the configuration of the fourth embodiment of the present invention.
- FIG. 19 is a block diagram showing a configuration of a fifth embodiment of the present invention.
- FIG. 20 is a block diagram showing a configuration of a sixth embodiment of the present invention.
- FIG. 21 is a block diagram showing the configuration of the seventh embodiment of the present invention.
- FIG. 22 is a block diagram showing the configuration of the seventh embodiment of the present invention.
- FIG. 23 is a block diagram showing a configuration of the eighth embodiment of the present invention.
- FIG. 24 is a block diagram showing the configuration of the ninth embodiment of the present invention.
- FIG. 25B is a diagram showing the subtitle display order in descending order.
- FIG. 3 shows the configuration of the broadcasting system according to Embodiment 1 of the present invention.
- the input device 101 is a camera, a microphone, a keyboard, or the like, through which caption information, video / audio content, and data content are input.
- The video encoding unit 102 encodes the video information in the video/audio content using a compression method such as MPEG-2, MPEG-4, or H.264, and outputs the encoded video information to the multiplexing processing unit 104.
- the audio encoding unit 103 encodes audio information in the video / audio content using a compression method such as AAC, and outputs the encoded audio information to the multiplexing processing unit 104.
- The multiplexing processing unit 104 multiplexes the video information output from the video encoding unit 102, the audio information output from the audio encoding unit 103, and other broadcast contents such as program information, program identification information, text information, and image information (hereinafter referred to as "other broadcast contents"), and outputs the multiplexed signal to the transmission path coding unit 105.
- The transmission path coding unit 105 performs transmission processing such as encoding and modulation on the signal output from the multiplexing processing unit 104, and transmits a broadcast wave from the antenna 106.
- The tuner unit 112 extracts the frequency signal of the channel specified by the user from the broadcast wave received via the antenna 111, performs demodulation and decoding processing on the extracted frequency signal, and outputs the demodulated signal to the demultiplexing unit 113.
- The demultiplexing unit 113 separates the signal output from the tuner unit 112 into subtitle information, video information, and other broadcast contents, and outputs the subtitle information to the subtitle processing unit 115, the video information to the video processing unit 117, and the other broadcast contents to the speaker information extraction unit 118.
- The caption information includes speaker identification information, such as an ID for identifying the speaker, together with the caption itself; the other broadcast contents include speaker information that allows the viewer to recognize the speaker of the caption, together with the speaker identification information.
- the timer 114 measures the current time, and notifies the caption processing unit 115 and the display processing unit 120 of the measured current time.
- The caption processing unit 115 stores the caption information output from the demultiplexing unit 113 in the caption history storage unit 116 for each speaker, based on the speaker identification information. At this time, the current time notified from the timer 114 is also stored in the caption history storage unit 116 as the display time of the caption information. In addition, the area for displaying the caption of each speaker is set as a speaker frame, the position at which the speaker frame is displayed (hereinafter simply referred to as the "display position") is determined, and the determined display position is also stored in the caption history storage unit 116.
- FIG. 4 conceptually shows the processing of the caption processing unit 115. In this embodiment, as shown in FIG. 5, three speaker frames are prepared, with display positions "1", "2", and "3" in order from the top; the display position is determined as the lowest number among the vacant display positions.
- The caption history storage unit 116 manages the speaker identification information, the display time, the display position, and the caption as one set of caption display information in a table, as shown in FIG. 5.
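The caption-history bookkeeping above (speaker identification information, display time, display position, caption) and the lowest-vacant-number rule for choosing a display position can be sketched as follows. This is an illustrative Python sketch: the names `CaptionEntry` and `allocate_position` and the three-frame constant mirror Embodiment 1 but are invented here, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional

# One row of the caption-history table of FIG. 5 (names are hypothetical).
@dataclass
class CaptionEntry:
    speaker_id: str          # speaker identification information (e.g. an ID)
    display_time: float      # time notified from the timer when stored
    position: int            # 1 = top speaker frame, 3 = bottom
    caption: Optional[str]   # None once only the caption text was deleted

MAX_FRAMES = 3  # three speaker frames are prepared in Embodiment 1

def allocate_position(history: list[CaptionEntry]) -> Optional[int]:
    """Return the lowest-numbered vacant display position, or None if all
    speaker frames are occupied (the upper limit checked in ST134)."""
    used = {e.position for e in history}
    for pos in range(1, MAX_FRAMES + 1):
        if pos not in used:
            return pos
    return None
```

For example, with entries occupying positions 1 and 3, the next caption would be assigned position 2.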
- the video processing unit 117 decodes the video stream output from the demultiplexing unit 113 and encoded with H.264 or the like, and outputs the decoded signal to the display processing unit 120.
- The speaker information extraction unit 118 extracts the speaker identification information and the speaker information from the data output from the demultiplexing unit 113, and stores the extracted speaker identification information and speaker information as a pair in the speaker information storage unit 119.
- Figure 6 schematically shows the processing performed by the speaker information extraction unit 118. As shown in FIG. 6, the speaker information storage unit 119 manages the speaker identification information and the speaker information as a set in a table.
- the display processing unit 120 divides one image area into a subtitle display area for displaying subtitles and a video display area for displaying video, and the video output from the video processing unit 117 is used as the video display area.
- the subtitle display information stored in the subtitle history storage unit 116 and the speaker information stored in the speaker information storage unit 119 are arranged in the subtitle display area, and these display images are synthesized.
- The speaker frames are sorted based on the time notified from the timer 114 and the display time stored in the caption history storage unit 116. Since the display processing unit 120 separates the subtitle display and the video display on the same screen, the video display and the subtitle display do not overlap, which prevents either the video or the subtitles from becoming invisible.
- the synthesized image is output to the display unit 121.
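The frame-sorting step described above can be sketched in a few lines. This is a hypothetical helper, not code from the patent: it only shows how speaker frames could be ordered by the display time kept in the caption history before being placed in the subtitle display area.

```python
def sort_frames(entries):
    """Order speaker frames for top-to-bottom placement in the subtitle
    display area. `entries` is a list of (speaker_id, display_time) pairs;
    the sketch assumes oldest speech is placed first."""
    return [sid for sid, _ in sorted(entries, key=lambda e: e[1])]
```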
- FIG. 7 conceptually shows the processing of the display processing unit 120.
- The display processing unit 120 dynamically allocates speaker frames according to the presence or absence of speech, arranging a speaker frame only when a speaker is speaking. When there is no speech, no speaker frame is placed, so that when the aspect ratio of the video differs from that of the video display device, the surplus area can be used effectively as a subtitle display area.
- the display unit 121 displays the composite image output from the display processing unit 120.
- In step (hereinafter abbreviated as "ST") 131, subtitles that have been displayed for more than a specified time (for example, 5 seconds) are deleted from the subtitle display information in the subtitle history storage unit 116, and the process moves to ST132.
- FIG. 9 shows how subtitles are deleted.
- In ST132, subtitle display information for which a specified time has elapsed since the subtitle was displayed (or, when two or more speaker frames are in use, from which only the subtitle has already been deleted) is deleted, and the process moves to ST133.
- FIG. 10 shows how the subtitle display information is deleted.
- The specified deletion time of the caption may be made equal to the specified deletion time of the caption display information.
- The specified deletion time may also be set as the time from when, while the subtitle of a first speaker is displayed, the subtitle of a second speaker different from the first speaker is displayed, until the subtitle of the first speaker is deleted.
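The two-stage deletion of ST131 and ST132 might look like the following sketch. The timeout constants and the dict layout are assumptions made for illustration; the patent only gives 5 seconds as an example for the caption timeout.

```python
CAPTION_TIMEOUT = 5.0   # assumed: ST131 deletes only the subtitle text
ENTRY_TIMEOUT = 10.0    # assumed: ST132 deletes the whole display entry

def expire(history, now):
    """Two-stage deletion sketch. First blank out stale caption text while
    keeping the speaker frame (ST131), then drop entries whose overall
    display time has lapsed (ST132). Each entry is a dict with
    'display_time' and 'caption' keys."""
    for e in history:
        if e["caption"] is not None and now - e["display_time"] >= CAPTION_TIMEOUT:
            e["caption"] = None   # frame stays, text goes (as in FIG. 9)
    return [e for e in history if now - e["display_time"] < ENTRY_TIMEOUT]
```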
- In ST133, it is determined whether new subtitle information has been acquired from the demultiplexing unit 113. If it is determined that new subtitle information has been acquired (YES), the process moves to ST134; if it is determined that it has not been acquired (NO), the process returns to ST131, and ST131 to ST133 are repeated until new caption information is acquired.
- In ST134, it is determined whether there is an empty speaker frame. Specifically, since there is an upper limit on the number of speaker frames that can be displayed, depending on the screen size, it is determined whether the number of caption display information entries stored in the caption history storage unit 116 has reached the upper limit. For example, if the upper limit is 4, the upper limit is judged not to have been reached when the number of speakers is 3 or less, and to have been reached when the number of speakers is 4. If the upper limit has not been reached, it is determined that there is an empty speaker frame (YES) and the process moves to ST136; if it has been reached, it is determined that there is no empty speaker frame (NO) and the process moves to ST135.
- In ST136, it is determined whether the same speaker identification information as that included in the new caption information acquired from the demultiplexing unit 113 exists in the caption history storage unit 116, that is, whether it is already stored. If it is determined that it exists (YES), the process moves to ST138; if not (NO), the process moves to ST137.
- In ST137, new caption display information is recorded in a free area of the caption history storage unit 116 based on the new caption information.
- In ST138, it is determined whether the caption still exists in the caption display information that contains the same speaker identification information as the newly acquired caption information. Because captions past the specified time are deleted in ST131, while caption display information still within its own specified time is not deleted in ST132, there may be caption display information from which only the caption has been deleted. If it is determined that the caption exists (YES), the process moves to ST140; if not (NO), the process moves to ST139.
- In ST139, the new caption is stored in the caption display information (which does not include a caption) containing the same speaker identification information stored in the caption history storage unit 116.
- In ST140, it is determined whether the display position next to the lowest display position corresponding to the same speaker identification information stored in the caption history storage unit 116 is empty. For example, if the lowest display position of the same speaker identification information is the second from the top, it is determined whether the next display position, that is, the third from the top, is empty. If it is determined to be empty (YES), the process moves to ST142; if not (NO), the process moves to ST141. If the same speaker identification information occupies the lowest display position and there is no next display position, it is also determined not to be empty (NO).
- In ST141, since the display position next to the lowest display position of the same speaker identification information was determined in ST140 not to be empty, a space is created at that display position. Specifically, if the lowest display position of the same speaker identification information is the second from the top and the next display position, that is, the third from the top, is not empty, the caption display information at the third position is shifted down to the fourth position, and entries at the fourth position and below are likewise shifted down.
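The shifting step of ST141 can be sketched as follows. This is an illustrative helper with an invented name; it models the display positions as a dict from position number to speaker ID and vacates the target slot by moving that slot and everything below it down by one.

```python
def make_room(positions, target):
    """Vacate display position `target` by shifting it and every lower
    frame down one slot (the ST141 shifting step, sketched).
    `positions` maps position number -> speaker id."""
    shifted = {}
    for pos, sid in sorted(positions.items(), reverse=True):
        if pos >= target:
            shifted[pos + 1] = sid   # move this frame one slot down
        else:
            shifted[pos] = sid
    return shifted
```

For example, vacating position 3 of an occupied table {1, 2, 3} moves the third frame to position 4 and leaves position 3 free for the new caption.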
- As described above, the caption processing unit 115 dynamically stores the caption information for each speaker in the caption history storage unit 116, and deletes the caption information stored in the caption history storage unit 116 in order from the oldest.
- one image area is divided into a caption display area and a video display area, and information indicating a speaker is associated with captions indicating the content of the speaker.
- A plurality of icons, speaker frames, fonts, character colors, character sizes, and the like may be prepared, and display designation information may be used to specify which of them is used.
- The speaker information extraction unit 118 can extract the display designation information, together with the speaker identification information, from the other broadcast contents.
- the subtitle display image is created according to the extracted display designation information.
- When the broadcast contents do not contain multiple types of speaker information, default information prepared in advance is used.
- When the display processing unit 120 displays the speaker frames of a plurality of speakers and one speaker's speech continues, the speaker frame of another speaker may be deleted and the speaker frame of the continuously speaking speaker may be expanded into the deleted area. If the speaker frame has been expanded to its maximum and the speech still continues, the subtitles are scrolled. A long subtitle can thereby be displayed.
- The other broadcast contents include time information such as a time control mode (TMD) and a display start time (STM); the case where this time information is used will now be described.
- TMD: time control mode
- STM: display start time
- the speaker information extraction processing unit 151 extracts time information from the data output from the demultiplexing unit 113.
- the extracted time information is output to the caption processing unit 115 and the display processing unit 120.
- the video display device 150 can omit the timer for measuring the current time, and the device scale can be reduced.
- The storage device 161 is a DVD (Digital Versatile Disc), an SD card, a hard disk, or the like, in which video/audio content and data content are stored.
- the video display device 160 can simultaneously display video, speaker information, and subtitles using the video / audio content and data content stored in the storage device 161.
- The video display device 165 may have a reception recording function in which the broadcast wave received and demodulated by the tuner unit 112 is processed by the recording processing unit 166 and stored in the storage device 161. In this case, the received broadcast wave may be demodulated and displayed in real time, or stored in the storage device 161 and displayed later.
- FIG. 16 shows the configuration of a broadcasting system according to Embodiment 4 of the present invention.
- The communication unit 171 transmits and receives video/audio content and data content to and from the server 180 via a communication network such as the Internet.
- The communication method of the communication unit 171 may be wired or wireless of any type, such as a network adapter, a wireless local area network (LAN), Bluetooth, or infrared communication.
- the server 180 inputs speaker information using a camera, a keyboard, or the like as the input device 181, stores the speaker information in the speaker information storage unit 182, and stores the speaker information in the video display device 170 via the communication unit 183. Send.
- The video display device 170 can acquire the video/audio content and the caption information from the broadcast wave, and can acquire the speaker information from the communication network. When the speaker information of a program is acquired in advance via the network or data broadcasting, and the video of that program is then played, the viewer can use the acquired speaker information to grasp the program contents easily.
- the communication unit 171 may acquire the video / audio content, the caption information, and the speaker information from the communication network.
- subtitle information and speaker information may be acquired from the communication network, and video / audio content may be acquired from the broadcast wave.
- Video, speaker information, and subtitles can thus be displayed at the same time even for analog broadcasting, in which such data is not included.
- FIG. 19 shows the configuration of a broadcasting system according to Embodiment 5 of the present invention.
- The authentication processing unit 192 acquires the authentication information input by the user from the input device 191, and sends an inquiry about the acquired authentication information to the speaker information distribution device 200 via the communication unit 171.
- The speaker information distribution device 200 receives the authentication inquiry from the video display device 190 via the communication unit 201, and the authentication processing unit 202 collates the authentication information; if the collation succeeds, the plurality of types of speaker information stored in the speaker information storage unit 203 are transmitted. The speaker information stored in the speaker information storage unit 203 is input by the input device 204.
- the storage device 193 in the video display device 190 is an SD card or the like having a secure area, and a plurality of types of speaker information acquired from the speaker information distribution device 200 and program identification information using the speaker information. (Program name, broadcast station name, channel, start time, end time, other ID, etc.) and authentication information input from the input device 191 are stored, and only the authentication processing unit 192 can access.
- The authentication processing unit 192 accesses the storage device 193, reads the information stored there, and writes the speaker information to the speaker information storage unit 119.
- The authentication processing unit 192 deletes the information written in the speaker information storage unit 119 (the information here being the plurality of types of speaker information).
- only a video display device that has been successfully authenticated can obtain a plurality of types of speaker information, and can perform rich subtitle display.
- FIG. 20 shows the configuration of a broadcast system according to Embodiment 6 of the present invention.
- The video display device 210 includes a first communication unit 171 connected to the speaker information distribution device 200 via a communication network such as the Internet, and a second communication unit 211 that communicates using a non-contact IC card such as Suica (registered trademark), a wireless tag, infrared rays, or the like.
- When the key distribution device 220 receives a key acquisition request from the video display device 210 via the communication unit 221, it distributes the key (or the key and the address of the speaker information distribution device) managed by the key distribution management unit 222 to the video display device 210, and notifies the speaker information distribution device 200 of this information. The authentication information (key, ID) managed by the key distribution management unit 222 is input by the input device 223.
- The speaker information distribution device 200 receives the key distribution notification from the key distribution device 220 and adds the information to the authentication information managed by the authentication processing unit 202. When it receives an authentication inquiry using a key from the video display device 210, the authentication processing unit 202 performs authentication, and the plurality of types of speaker information stored in the speaker information storage unit 203 are transmitted only to a video display device that has been successfully authenticated.
- Thus, a video display device that has acquired a key from the key distribution device 220 can obtain rich speaker information and perform subtitle display using the plurality of types of speaker information. Services can therefore be provided such as issuing keys to users who have purchased the plurality of types of speaker information, or distributing keys at a store when goods related to the program are purchased.
- In this way, only a video display device that has acquired the key can obtain the plurality of types of speaker information and perform a rich subtitle display using them.
- The audio processing unit 231 decodes the audio stream output from the demultiplexing unit 113, and outputs the decoded audio stream to the audio analysis unit 232.
- The audio analysis unit 232 analyzes the audio stream output from the audio processing unit 231 and outputs analysis results such as volume and pitch to the display processing unit 233. By analyzing the characteristics of the voice, it also generates information expressing emotions, gender information, and information indicating age (for example, baby, child, adult, elderly person), and outputs these to the display processing unit 233.
- the display processing unit 233 creates a caption display image using the audio analysis result output from the audio analysis unit 232.
- the volume is associated with the character size
- the pitch is associated with the character color.
- information representing emotions is associated with fonts, and gender is associated with highlight colors.
- The decoration corresponding to each element of the voice analysis result is not limited to these examples.
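The decoration mapping of Embodiment 7 (volume to character size, pitch to character color, emotion to font, gender to highlight color) can be sketched as a simple lookup. The thresholds, category names, and return format below are invented for illustration; the patent specifies only the associations, not concrete values.

```python
def decorate(volume, pitch, emotion, gender):
    """Map audio-analysis results to caption decorations (sketch).
    volume: 0..1 level; pitch: fundamental frequency in Hz (assumed);
    emotion/gender: category strings from the hypothetical analyzer."""
    return {
        "size": "large" if volume > 0.7 else "normal",
        "color": "red" if pitch > 300.0 else "black",
        "font": {"angry": "bold", "happy": "rounded"}.get(emotion, "plain"),
        "highlight": {"male": "blue", "female": "pink"}.get(gender, "white"),
    }
```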
- According to the seventh embodiment, by visually reflecting the result of analyzing the speaker's voice in the caption display image, information other than characters can be conveyed by the captions, and the program contents can be grasped even more easily.
- The video display device 235 may further include a video analysis unit 236 that analyzes the video stream output from the video processing unit 117, and the display processing unit 233 may apply decoration corresponding to analysis results such as the size of the speaker.
- the display processing unit 233 may perform decoration corresponding to scenes such as morning, noon, night, sea, mountain, and soccer.
- The audio processing unit 231 decodes the audio stream output from the demultiplexing unit 113, and outputs the decoded audio stream to the speaker analysis unit 241.
- the speaker analysis unit 241 detects a speaker from video and audio, and extracts an image of the speaker.
- the extracted image is enlarged / reduced to a specified size to obtain speaker information.
- The speaker information is stored in the subtitle history storage unit 116 together with the subtitle information processed by the subtitle processing unit 115. The technology for detecting a speaker from video and audio is assumed to use existing technology, for example, the technology described in Patent Document 1.
- The voice recognition unit 251 converts the audio stream output from the audio processing unit 231 into character information by voice recognition, and generates caption information. The generated caption information is stored in the caption history storage unit 116.
- caption display can be performed even when the speaker information and the caption information are not included in the broadcast wave.
- The subtitles may be displayed in order from the top of the subtitle display area (ascending order) or, as shown in FIG. 25B, in order from the bottom of the subtitle display area (descending order).
- The size of the speaker frame may be changed step by step, the colors may be changed gradually from bright to plain, the subtitle text color may be gradually lightened, the font size may be reduced step by step, or the display order may be numbered. The user can thereby recognize the display order of the subtitles without reading them. The user may also be allowed to set the display order (descending or ascending) of the captions.
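The ascending/descending ordering of FIGS. 25A and 25B can be sketched as follows; the function name and the (display_time, text) tuple format are assumptions made for this sketch.

```python
def order_captions(captions, descending=False):
    """Return caption texts in on-screen order for the subtitle display
    area: oldest at the top (ascending, FIG. 25A) or oldest at the bottom
    (descending, FIG. 25B). `captions` is a list of (display_time, text)."""
    ordered = sorted(captions)  # sort by display time
    return [t for _, t in (reversed(ordered) if descending else ordered)]
```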
- As described above, a subtitle display image is created that associates subtitles with speaker information allowing the viewer to recognize the subtitle speaker, and the created subtitle display image is synthesized with the video.
- The subtitle and the speaker in the video can thereby be associated with each other, so that even when speakers are displayed in the video, the viewer can easily grasp the program contents in a silent state.
- A second aspect of the present invention is a video display device comprising speaker information acquisition means for acquiring speaker information and speaker information storage means for storing the acquired speaker information, the device displaying the acquired speaker information on a screen.
- a third aspect of the present invention is the video display apparatus according to the above aspect, wherein the speaker information acquisition means acquires speaker information in advance before starting reception of a program.
- a fourth aspect of the present invention is the video display device according to the above aspect, wherein the speaker information acquisition means acquires speaker information together with reception of a program.
- in the above aspect, the display processing unit divides the screen into one subtitle display area and a video display area different from the subtitle display area, arranges video including a speaker in the video display area, and arranges subtitles and the speaker information corresponding to the speaker in the subtitle display area. It is a video display device.
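The screen division described above can be sketched as follows. The 30% subtitle-area ratio and the (x, y, width, height) rectangle convention are illustrative assumptions, not values from the embodiment:

```python
def split_screen(width, height, subtitle_ratio=0.3):
    """Divide the screen into a video area (top) and a subtitle area (bottom).

    Each area is returned as an (x, y, width, height) rectangle; the video
    including the speaker goes in the top area, and subtitles with their
    speaker information go in the bottom area.
    """
    subtitle_h = int(height * subtitle_ratio)
    video_area = (0, 0, width, height - subtitle_h)
    subtitle_area = (0, height - subtitle_h, width, subtitle_h)
    return video_area, subtitle_area
```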
- in the above aspect, the display processing unit dynamically allocates a speaker frame, which is a region for displaying captions for each speaker, according to the presence or absence of the speaker. It is a video display device.
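A minimal sketch of dynamically allocating speaker frames according to the presence or absence of a speaker might look like this; the class and method names are assumptions made for illustration:

```python
class SubtitleArea:
    """Dynamically allocate per-speaker caption frames in the subtitle area."""

    def __init__(self):
        self.frames = {}  # speaker id -> list of caption lines

    def caption(self, speaker, text):
        # A frame is allocated the first time a speaker appears.
        self.frames.setdefault(speaker, []).append(text)

    def speaker_left(self, speaker):
        # The frame is released when the speaker is no longer present.
        self.frames.pop(speaker, None)
```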
- when the display processing means displays a subtitle of a second speaker different from the first speaker while the subtitle of the first speaker is displayed, the subtitle of the first speaker is deleted. It is a video display device.
- when the upper limit of the number of speaker frames that can be displayed has been reached and a new speaker appears, the display processing means deletes a speaker frame being displayed and arranges a speaker frame corresponding to the new speaker. It is a video display device.
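One plausible realization of this frame-replacement rule evicts the oldest displayed frame when the upper limit is reached. The embodiment only requires that some displayed frame be deleted, so the oldest-first policy here is an assumption:

```python
from collections import OrderedDict

def add_speaker_frame(frames, speaker, max_frames=3):
    """Place a frame for a (possibly new) speaker, honoring the frame limit."""
    if speaker in frames:
        return frames                    # speaker already has a frame
    if len(frames) >= max_frames:
        frames.popitem(last=False)       # delete the oldest displayed frame
    frames[speaker] = []                 # arrange a frame for the new speaker
    return frames
```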
- the display processing means creates the caption display image based on display designation information indicating which of a plurality of types of speaker information, acquired by the speaker information acquisition means and each allowing the viewer to recognize the same speaker, is to be used. It is a video display device.
- rich subtitle display can be performed by designating which of a plurality of types of speaker information that allow the viewer to recognize the same speaker is to be used.
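Selecting which kind of speaker information to render from the display designation information could be sketched as follows; the kind names ("name", "icon", "color") are hypothetical examples, not kinds named in the embodiment:

```python
def select_speaker_label(speaker_info, designation):
    """Pick which kind of speaker information the subtitle image should use.

    speaker_info maps information kinds to values for one speaker; the
    display designation information names the kind to render.  Falls back
    to the speaker's name when the designated kind is unavailable.
    """
    if designation in speaker_info:
        return speaker_info[designation]
    return speaker_info.get("name", "?")
```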
- while speaker frames of a plurality of speakers are displayed and the speech of one speaker continues, the display processing means deletes the speaker frame of another speaker and extends the speaker frame of the continuing speaker into the deleted area. It is a video display device.
- a long subtitle can be displayed by deleting another speaker frame being displayed and extending the speaker frame of the speaker whose speech continues into the deleted area.
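The frame-extension behavior can be sketched over a simple speaker-to-height map. Which other frame is deleted is an assumption here (the first one found); the embodiment leaves this choice open:

```python
def extend_continuing_frame(frames, continuing):
    """Free another speaker's frame and grow the continuing speaker's frame.

    frames maps speaker -> frame height in rows; the deleted frame's rows
    are handed to the continuing speaker so a long subtitle fits.
    """
    for other in list(frames):
        if other != continuing:
            frames[continuing] = frames.get(continuing, 0) + frames.pop(other)
            break
    return frames
```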
- an eleventh aspect of the present invention, in the above aspect, further includes an analysis unit that analyzes video or audio, and the display processing unit decorates subtitles based on the analysis result of the analysis unit. It is a video display device.
- the display processing means associates the display order of the caption display information with the decoration of the captions, and decorates each caption according to its display order. It is a video display device.
- a thirteenth aspect of the present invention comprises a broadcast wave transmission device that transmits, as a broadcast wave, video, subtitles, and speaker information that allows a viewer to recognize the speaker of the subtitles, and a video display device having display processing means for creating a subtitle display image in which the subtitles transmitted from the broadcast wave transmission device are associated with the speaker information and for combining the created subtitle display image with the video, and display means for displaying the image combined by the display processing means.
- since the subtitles can be associated with the speakers in the video, the viewer can recognize which of the speakers displayed on the video is speaking, and can easily grasp the program contents even in a silent state.
- a fourteenth aspect of the present invention comprises a recording device that records video, captions, and speaker information that allows a viewer to recognize the speaker of the captions, and a video display device having display processing means for creating a subtitle display image in which the subtitles included in the information recorded in the recording device are associated with the speaker information and for combining the created subtitle display image with the video, and display means for displaying the image combined by the display processing means.
- since the subtitles can be associated with the speakers in the video, the viewer can recognize which of the speakers displayed on the video is speaking, and can easily grasp the program contents even in a silent state.
- an authentication system comprising an authentication device having authentication processing means for performing authentication processing, and a video display device which, when authenticated by the authentication processing means, acquires a plurality of different pieces of speaker information that allow the viewer to recognize the same speaker, and which comprises display processing means for creating a subtitle display image from the acquired speaker information and combining it with the video, and display means for displaying the image combined by the display processing means.
- since the video display device authenticated by the authentication device acquires a plurality of different pieces of speaker information that allow the viewer to recognize the same speaker, the authenticated video display device can perform rich subtitle display.
- a video display method comprising a display processing step of creating a subtitle display image in which subtitles are associated with speaker information that allows a viewer to recognize the speaker of the subtitles and of combining the created subtitle display image with the video, and a display step of displaying the image combined in the display processing step.
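The two steps of the method can be sketched as a pair of functions. The dictionary layout and the "[speaker] text" label format are illustrative assumptions rather than the claimed data structures:

```python
def display_processing_step(video_frame, subtitles):
    """Create a subtitle display image pairing each caption with its
    speaker information, then combine it with the video frame."""
    subtitle_image = [f"[{s['speaker']}] {s['text']}" for s in subtitles]
    return {"video": video_frame, "overlay": subtitle_image}

def display_step(combined):
    """Show the combined image (stubbed here: return the overlay lines)."""
    return combined["overlay"]
```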
- since the subtitles can be associated with the speakers in the video, the viewer can recognize which of the speakers displayed on the video is speaking, and can easily grasp the program contents even in a silent state.
- the video display device and the video display method according to the present invention have the effect of making it easy for the viewer to grasp the program contents even in a silent state, and can be applied to a mobile phone or the like having a small screen.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Television Systems (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Controls And Circuits For Display Device (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004245734 | 2004-08-25 | ||
JP2004-245734 | 2004-08-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006022071A1 true WO2006022071A1 (ja) | 2006-03-02 |
Family
ID=35967292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/011423 WO2006022071A1 (ja) | 2004-08-25 | 2005-06-22 | Video display device and video display method |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2006022071A1 (ja) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002056006A (ja) * | 2000-08-10 | 2002-02-20 | Nippon Hoso Kyokai <Nhk> | Video and audio retrieval device |
JP2002232798A (ja) * | 2001-01-30 | 2002-08-16 | Toshiba Corp | Broadcast receiving device and control method thereof |
JP2002232802A (ja) * | 2001-01-31 | 2002-08-16 | Mitsubishi Electric Corp | Video display device |
JP2002341890A (ja) * | 2001-05-17 | 2002-11-29 | Matsushita Electric Ind Co Ltd | Speech-recognition character display method and device |
JP2003224842A (ja) * | 2002-01-31 | 2003-08-08 | Matsushita Electric Ind Co Ltd | Content distribution method |
- 2005-06-22: WO PCT/JP2005/011423 patent/WO2006022071A1/ja active Application Filing
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007142568A (ja) * | 2005-11-15 | 2007-06-07 | Toshiba Corp | Mobile subtitle specification conversion device for digital broadcasting and conversion method therefor |
JP2007142569A (ja) * | 2005-11-15 | 2007-06-07 | Toshiba Corp | Mobile subtitle specification conversion device for digital broadcasting and conversion method therefor |
EP2180693A1 (en) * | 2008-10-22 | 2010-04-28 | Samsung Electronics Co., Ltd. | Display apparatus and control method thereof |
JP2010251841A (ja) * | 2009-04-10 | 2010-11-04 | Nikon Corp | Image extraction program and image extraction device |
JP2016082355A (ja) * | 2014-10-15 | 2016-05-16 | Fujitsu Ltd | Input information support device, input information support method, and input information support program |
US11450352B2 (en) | 2018-05-29 | 2022-09-20 | Sony Corporation | Image processing apparatus and image processing method |
WO2019230225A1 (ja) * | 2018-05-29 | 2019-12-05 | Sony Corporation | Image processing device, image processing method, and program |
EP3787285A4 (en) * | 2018-05-29 | 2021-03-03 | Sony Corporation | IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND PROGRAM |
JPWO2019230225A1 (ja) | 2018-05-29 | 2021-07-15 | Sony Group Corporation | Image processing device, image processing method, and program |
JP7272356B2 (ja) | 2018-05-29 | 2023-05-12 | Sony Group Corporation | Image processing device, image processing method, and program |
JP2020010224A (ja) | 2018-07-10 | 2020-01-16 | Yamaha Corporation | Terminal device, information providing system, terminal device operation method, and information providing method |
JP7087745B2 (ja) | 2018-07-10 | 2022-06-21 | Yamaha Corporation | Terminal device, information providing system, terminal device operation method, and information providing method |
US20240022682A1 (en) * | 2022-07-13 | 2024-01-18 | Sony Interactive Entertainment LLC | Systems and methods for communicating audio data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2006022071A1 (ja) | Video display device and video display method | |
CN108847214B (zh) | Speech processing method, client, apparatus, terminal, server, and storage medium | |
JP2003333445A (ja) | Subtitle extraction device | |
JP2003333445A5 (ja) | Subtitle extraction device and system | |
JP5994968B2 (ja) | Content use device, control method, program, and recording medium | |
JPH09506755A (ja) | Wireless pager with pre-stored images, and method and system for use therewith | |
EP1465423A1 (en) | Videophone device and data transmitting/receiving method applied thereto | |
KR20130005406A (ko) | Apparatus and method for transmitting a message in a portable terminal | |
US20090096782A1 (en) | Message service method supporting three-dimensional image on mobile phone, and mobile phone therefor | |
JP2002268963A (ja) | Wireless data transmission/reception control method and system using a Bluetooth function, and server and terminal used therefor | |
JP2014006669A (ja) | Recommended content notification system, control method and control program therefor, and recording medium | |
JP2016005268A (ja) | Information transmission system, information transmission method, and program | |
JP2002288213A (ja) | Data transfer device, data transmission/reception device, data exchange system, data transfer method, data transfer program, and data transmission/reception program | |
JP2008113331A (ja) | Telephone system, telephone, server device, and program | |
US7120583B2 (en) | Information presentation system, information presentation apparatus, control method thereof and computer readable memory | |
JP6706591B2 (ja) | Broadcast receiver, notification method, program, and storage medium | |
JP2005124169A (ja) | Device for creating video content with speech-balloon captions, transmission device, playback device, providing system, and data structure and recording medium used therein | |
KR101618777B1 (ko) | Server and method for extracting text after file upload and synchronizing it with video or audio | |
US20080108327A1 (en) | Method and communication device for transmitting message | |
US20070110397A1 (en) | Playback apparatus and bookmark system | |
CN106657255A (zh) | File sharing method, apparatus, and terminal device | |
JP2005064592A (ja) | Mobile communication terminal | |
KR101597248B1 (ko) | System and method for providing advertisements using speech recognition during VoIP-based voice calls | |
JP2004253923A (ja) | Information receiving device | |
JP2005159743A (ja) | Video display device, video display program, information distribution device, and information communication system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |