WO2010146869A1 - Editing support system, editing support method, and editing support program - Google Patents

Editing support system, editing support method, and editing support program

Info

Publication number
WO2010146869A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
text data
divided data
text
recognition result
Prior art date
Application number
PCT/JP2010/004060
Other languages
English (en)
Japanese (ja)
Inventor
三木清一
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社 filed Critical 日本電気株式会社
Priority to JP2011519574A priority Critical patent/JP5533865B2/ja
Publication of WO2010146869A1 publication Critical patent/WO2010146869A1/fr

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition

Definitions

  • The present invention relates to an editing support system, an editing support method, and an editing support program.
  • Patent Document 1 (Japanese Patent Laid-Open No. 2006-119534) describes a system that includes a mouse-operated subtitle editing apparatus, operated by a person responsible for the generated subtitles, which specifies the portions of a speech recognition result produced by a speech recognition apparatus that are to be edited, and a keyboard-operated subtitle editing apparatus, operated by an operator who types the correct character strings corresponding to the audio for the subtitles passed from the mouse-operated apparatus.
  • Because the operator of the keyboard-operated subtitle editing apparatus can be a person with a relatively low skill level and less responsibility, labor costs can be reduced.
  • In Patent Document 1, however, the person in charge of the mouse-operated subtitle editing apparatus must specify the portions to be edited for the entire speech recognition result, so quick processing is impossible. This is a problem.
  • Moreover, the same portion is checked by multiple people: the person in charge identifies it, and the operator of the keyboard-operated apparatus then inputs the character string, which is inefficient.
  • An object of the present invention is to provide an editing support system and an editing support method that solve the above-described problem that a partial editing operation on a speech recognition result cannot be performed quickly.
  • According to the present invention, there is provided a speech recognition result editing support system including:
  • Voice data storage means for storing voice data in association with time information;
  • Speech recognition result storage means for storing text data of a speech recognition result of the voice data in a predetermined format in association with time information in units of words;
  • First display processing means for displaying the text data in a predetermined display area and displaying a cursor for selecting the text data in the display area;
  • Instruction accepting means for accepting, by the cursor, an arbitrary selection range of the text data displayed by the first display processing means, and accepting an instruction to generate divided data; and
  • Divided data generation means for extracting, from the speech recognition result storage means, the text data included in the selection range accepted by the instruction accepting means while maintaining the predetermined format, and generating divided data.
  • According to the present invention, there is also provided a speech recognition result editing support method in which the text data is read from speech recognition result storage means that stores the text data of the speech recognition result of the voice data in a predetermined format in association with time information in units of words, and the text data is displayed in a predetermined display area.
  • According to the present invention, there is further provided a speech recognition result editing support program that causes a computer to function as:
  • Voice data storage means for storing voice data in association with time information;
  • Speech recognition result storage means for storing text data of a speech recognition result of the voice data in a predetermined format in association with time information in units of words;
  • First display processing means for displaying the text data in a predetermined display area and displaying a cursor for selecting the text data in the display area;
  • Instruction accepting means for accepting, by the cursor, an arbitrary selection range of the text data displayed by the first display processing means, and accepting an instruction to generate divided data; and
  • Divided data generation means for extracting, from the speech recognition result storage means, the text data included in the selection range accepted by the instruction accepting means while maintaining the predetermined format, and generating divided data.
  • In an embodiment of the present invention, a figure showing an example of the screen displayed on the display by the display processing unit of the edit management apparatus.
  • In an embodiment of the present invention, a figure showing another example of the screen displayed on the display by the display processing unit of the edit management apparatus.
  • In an embodiment of the present invention, a figure showing an example of the management table of the display processing unit of the edit management apparatus.
  • In an embodiment of the present invention, a figure showing an example of the management table of the display processing unit of the edit processing apparatus.
  • A figure showing an example of the screen displayed on the display by the display processing unit of the edit processing apparatus.
  • A figure showing another example of the screen displayed on the display by the display processing unit of the edit processing apparatus.
  • A figure showing a further example of the screen displayed on the display by the display processing unit of the edit processing apparatus.
  • A figure showing an example of the structure of the edited data in an embodiment of the present invention.
  • Figures showing further examples of the screen displayed on the display by the display processing unit of the edit management apparatus in an embodiment of the present invention, and a figure showing another example of the structure of the text data of the speech recognition result stored in the speech recognition result storage unit.
  • FIG. 1 is a block diagram schematically showing the configuration of the editing support system in the present embodiment.
  • The editing support system 300 includes an editing management device 100 and one or more editing processing devices 200.
  • Here, an example is shown in which the editing support system 300 includes two editing processing devices 200 (an editing processing device 200 (A) and an editing processing device 200 (B)).
  • The editing management apparatus 100 stores the text data of the speech recognition result in a predetermined format and displays the text data in a predetermined display area so that it can be edited.
  • When an arbitrary range of the displayed text data is selected, the edit management apparatus 100 extracts the text data corresponding to that range while maintaining the original format, and generates divided data.
  • The divided data is thus a part of the original text data.
  • The editing management apparatus 100 can also extract the corresponding voice data together with the text data and include the voice data in the divided data.
  • In this case, the edit management device 100 generates divided data including both text data and audio data. In this way, the edit management apparatus 100 can generate a plurality of pieces of divided data.
  • Each piece of divided data is edited by a respective editing processing device 200.
  • The divided data edited by the editing processing devices 200 are integrated by the editing management device 100.
  • FIG. 2 is a block diagram showing the configuration of the edit management apparatus 100 in the present embodiment.
  • The edit management apparatus 100 includes a voice acquisition unit 102, a voice recognition unit 104, a display processing unit 110 (first display processing means), an instruction reception unit 112 (instruction accepting means), a voice reproduction unit 114 (voice reproduction means), a divided data generation unit 116 (divided data generation means), an editing processing unit 118 (editing processing means), a data integration unit 120 (data integration means), an access control unit 122, and a storage unit 130.
  • The storage unit 130 includes a voice data storage unit 132 (voice data storage means), a voice recognition result storage unit 134 (speech recognition result storage means), a divided data storage unit 136, an edited data storage unit 138, and an integrated data storage unit 140.
  • The voice acquisition unit 102 acquires the voice data of a speaker input from a voice input unit (not shown) such as a microphone.
  • The voice acquisition unit 102 acquires the voice data in association with time information.
  • The audio data storage unit 132 stores the audio data acquired by the audio acquisition unit 102 in association with the time information.
  • The voice recognition unit 104 recognizes the voice data acquired by the voice acquisition unit 102 and converts the voice recognition result into text data.
  • The voice recognition result storage unit 134 stores the text data of the voice recognition result processed by the voice recognition unit 104 in a predetermined format in association with time information in units of words.
  • Specifically, the speech recognition result storage unit 134 stores the text data of the speech recognition result in a format in which the data is organized for each sentence and each word, and time information is associated with each sentence and each word.
  • The time information may include both a start time and an end time, or may include only a start time.
  • The display processing unit 110 displays the text data of the speech recognition result in a predetermined area so as to be editable, and displays a cursor (caret) for selecting the text data in the display area.
  • The function of the display processing unit 110 can be realized by a text editor.
  • The display processing unit 110 can display the text data in association with relative position information with respect to the cursor, at least in units of words.
  • The instruction accepting unit 112 accepts, by the cursor, an arbitrary selection range of the text data displayed by the display processing unit 110, and accepts an instruction to generate divided data.
  • The audio reproduction unit 114 reads audio data from the audio data storage unit 132 and reproduces the audio.
  • The sound reproduction unit 114 outputs the sound data corresponding to a designated time.
  • The voice reproduction unit 114 can reproduce the voice data at the corresponding time based on the time information associated with the word selected by the cursor in the text data displayed by the display processing unit 110. A sketch of this time computation follows.
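  • As a minimal sketch (not from the publication) of how a playback position could be derived from the stored time information, assuming "HH:MM:SS" time strings as in the stored format and a hypothetical function name:

```python
def playback_offset_seconds(word_start: str, audio_start: str) -> int:
    """Seconds from the start of the audio data to the selected word.

    Both arguments are "HH:MM:SS" strings, matching the start-time field
    of the stored speech recognition result.
    """
    def to_seconds(t: str) -> int:
        h, m, s = (int(part) for part in t.split(":"))
        return h * 3600 + m * 60 + s

    return to_seconds(word_start) - to_seconds(audio_start)

# A word starting at 13:44:09 in audio recorded from 13:40:00
# begins 249 seconds into the audio data.
print(playback_offset_seconds("13:44:09", "13:40:00"))  # -> 249
```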
  • The audio output device can be, for example, a speaker.
  • The divided data generation unit 116 extracts, from the speech recognition result storage unit 134, the text data included in the selection range accepted by the instruction reception unit 112 while maintaining the predetermined format.
  • Here, "maintaining the format" means keeping the form in which the data is organized for each sentence and each word, and in which time information is associated with each sentence and each word.
  • The divided data generation unit 116 also extracts, from the audio data storage unit 132, the audio data corresponding to the text data included in the selection range, in a state associated with the time information.
  • The divided data generation unit 116 then generates divided data including the extracted text data and audio data.
  • The divided data generation unit 116 stores the generated divided data in a predetermined folder in the divided data storage unit 136.
  • The predetermined folder can be prepared in advance for each apparatus that is expected to perform editing processing on the divided data.
  • For example, folders corresponding to the edit processing device 200 (A) and the edit processing device 200 (B) shown in FIG. 1 can be prepared.
  • The divided data generation unit 116 can save the divided data in the folders prepared in this way.
  • The editing processing unit 118 is used for editing the text data of the speech recognition result in the edit management apparatus 100 itself, and can have the same configuration as that included in the edit processing apparatus 200. The function of the editing processing unit 118 will be described later together with the editing processing device 200.
  • The edited data storage unit 138 stores edited divided data (hereinafter referred to as edited data).
  • The data integration unit 120 arranges and integrates the text data of a plurality of pieces of divided data in time order based on the time information, as sketched below.
  • The data integration unit 120 stores the integrated data in the integrated data storage unit 140.
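  • A minimal sketch of this time-ordered integration, assuming each piece of divided data is a list of word records (plain dicts) that carry the start-time field of the stored format (record shape and names are assumptions):

```python
def integrate(divided_parts):
    """Merge edited divided data back into one transcript in time order.

    divided_parts: iterable of lists of word records; each record is a
    dict holding at least "start_time" ("HH:MM:SS"), as in the stored
    speech recognition result format.
    """
    merged = [word for part in divided_parts for word in part]
    # "HH:MM:SS" strings sort chronologically as plain strings.
    merged.sort(key=lambda word: word["start_time"])
    return merged

part_a = [{"start_time": "13:44:09", "text": "Last year"}]
part_b = [{"start_time": "13:44:10", "text": "the A Review Committee"}]
print(integrate([part_b, part_a]))  # restores chronological order
```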
  • In the present embodiment, an example is shown in which the edited data storage unit 138 is prepared separately from the divided data storage unit 136.
  • In another example, the edited data storage unit 138 is not prepared, and a configuration may be adopted in which the divided data before editing stored in the divided data storage unit 136 is overwritten with the edited divided data.
  • Similarly, an example is shown in which the integrated data storage unit 140 is prepared separately from the speech recognition result storage unit 134, but in another example the integrated data storage unit 140 is not prepared.
  • In that case, the text data of the speech recognition result before editing stored in the speech recognition result storage unit 134 may be overwritten with the edited integrated data.
  • The access control unit 122 controls access from external devices such as the editing processing devices 200.
  • In the present embodiment, the divided data generated by the divided data generation unit 116 is stored in a predetermined folder of the divided data storage unit 136 of the editing management apparatus 100.
  • A user who edits a piece of divided data on an editing processing apparatus 200 accesses the edit management apparatus 100 and acquires the divided data.
  • The access control unit 122 controls access from such other terminals.
  • FIG. 3 is a flowchart showing the procedure for generating divided data in the edit management apparatus 100 according to the present embodiment.
  • First, the display processing unit 110 displays the text data of the speech recognition result stored in the speech recognition result storage unit 134 on the display (step S102).
  • FIG. 4 is a diagram illustrating an example of the configuration of the text data of a speech recognition result stored in the speech recognition result storage unit 134 according to the present embodiment.
  • The text data in the speech recognition result storage unit 134 includes a sentence number field, a word number field, a speaker field, a start time field, an end time field, a speech recognition result field, and a character number field.
  • The text data of the speech recognition result is stored in units of words.
  • In FIG. 4, the words included in the sentences identified by "s11" and "s12" are shown.
  • Each word is also given identification information that identifies it within its sentence. For example, the identification information "s11" and "w1" identifies the word "Last year".
  • This word is an utterance by speaker "2"; its start time is "13:44:09" and its end time is "13:44:10".
  • Its number of characters is three. A sketch of this record structure follows.
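  • As a sketch, the record for each word could be modeled as follows (the fields mirror FIG. 4; the class and field names are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class WordEntry:
    """One word of the speech recognition result, per the fields of FIG. 4."""
    sentence_id: str  # sentence number, e.g. "s11"
    word_id: str      # word number within the sentence, e.g. "w1"
    speaker: str      # speaker identifier, e.g. "2"
    start_time: str   # e.g. "13:44:09"
    end_time: str     # e.g. "13:44:10"
    text: str         # recognition result, e.g. "Last year"
    num_chars: int    # character count, e.g. 3

# The speech recognition result storage unit 134 can then be viewed as a
# word-ordered list of such records:
recognition_result = [
    WordEntry("s11", "w1", "2", "13:44:09", "13:44:10", "Last year", 3),
    # ... the remaining words of sentences "s11" and "s12"
]
```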
  • FIGS. 5 to 8 are diagrams showing a text editor screen 400 displayed on the display by the display processing unit 110.
  • On the screen 400, a text display area 402, a time display area 404, a time change button 406, an audio playback button 408, a speed change button 410, and the like are displayed.
  • In the text display area 402, the text data of the speech recognition result and a cursor 420 are displayed.
  • The display processing unit 110 displays the text data stored in the speech recognition result storage unit 134 in the text display area 402 with a line feed every 25 characters.
  • The display processing unit 110 includes a management table for keeping track of the position of each word included in the text data displayed on the screen 400.
  • FIG. 9 is a diagram illustrating the management table of the display processing unit 110.
  • The management table of the display processing unit 110 holds, for each row, the identification information of the character string (text), sentences (sentence), and words (word) included in the row.
  • The management table also holds information indicating a start position (start) and a character length (len) for each sentence and each word.
  • As an example, the character string in the second line of the text display area 402 of the screen 400 shown in FIG. 5 will be described.
  • In this line, "Received a report from the A Review Committee last year", a statement by speaker 2, is displayed.
  • The entry "L2" in FIG. 9 is associated with the display information relating to the character string displayed in this line.
  • As the sentence (sentence) information, the identification label of the sentence is entered.
  • Here, "s11" is entered.
  • "Received a report from the A Review Committee last year" corresponds to the word segments "Last year,", "A Review Committee", "from", "report", "o", "receive", and "did." (segments of the original Japanese sentence). Therefore, "s11_w1", "s11_w2", "s11_w3", "s11_w4", "s11_w5", "s11_w6", and "s11_w7", indicating the identification information of each word, are entered as the word (word) information.
  • For each sentence and each word, the start position (start) and the character length (len) within the line are also described.
  • With this table, the display processing unit 110 can determine the position (line, character position) of each word displayed in the text display area 402.
  • The display processing unit 110 also keeps track of the position (line, character position) of the cursor 420.
  • Based on the position of the cursor 420, the display processing unit 110 can therefore determine which word of which sentence is pointed to. A minimal sketch of this lookup follows.
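  • The following sketch assumes the management table is represented as a mapping from a line label to (sentence, word, start, len) tuples, mirroring FIG. 9; the function name, table shape, and example positions are hypothetical:

```python
def word_at_cursor(management_table, line_label, char_pos):
    """Return (sentence_id, word_id) for the word under the cursor.

    management_table: dict mapping a line label (e.g. "L2") to a list of
    (sentence_id, word_id, start, length) tuples, mirroring the start/len
    fields held per word in the management table.
    """
    for sentence_id, word_id, start, length in management_table[line_label]:
        if start <= char_pos < start + length:
            return sentence_id, word_id
    return None  # cursor is not on any word

# Illustrative entries for line "L2" (positions are made up):
table = {"L2": [("s11", "w1", 0, 3), ("s11", "w2", 3, 5), ("s11", "w3", 8, 2)]}
print(word_at_cursor(table, "L2", 4))  # -> ("s11", "w2")
```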
  • The user can designate an arbitrary selection range of the text data displayed in the text display area 402 by moving the cursor 420 using an operation unit (not shown) such as a mouse.
  • The display processing unit 110 refers to the management table based on the cursor position information and determines the words included in the selection range.
  • The instruction receiving unit 112 acquires the information on the words included in the selection range from the display processing unit 110.
  • When the user gives an instruction, the instruction receiving unit 112 receives it.
  • For example, when the user operates the audio playback button 408, the instruction receiving unit 112 receives the instruction and notifies the audio reproducing unit 114 of the instruction.
  • The audio reproduction unit 114 performs reproduction, stop, fast-forward, rewind, and the like of the audio data based on the user's instruction.
  • Likewise, when the user operates the speed change button 410, the instruction receiving unit 112 receives the instruction and notifies the sound reproducing unit 114.
  • The audio playback unit 114 changes the playback speed of the audio data based on the user's instruction.
  • In the time display area 404, the time corresponding to the audio data is displayed.
  • The time displayed in the time display area 404 can be changed, for example with the time change button 406.
  • The cursor 420 and the time displayed in the time display area 404 can be linked to each other, so that the cursor 420 is displayed at the position of the word corresponding to the time displayed in the time display area 404.
  • When the instruction receiving unit 112 receives a range selection and a divided data generation instruction from the user (YES in step S104), the divided data generation unit 116 generates divided data.
  • In FIG. 6, the text data in the selection range 422 in the middle of the area is shown selected in inverted display.
  • A box 430 is then displayed (FIG. 7).
  • In the box 430, various work items such as a divided data generation button 432 are displayed.
  • When the divided data generation button 432 is selected, a save screen 440 is displayed (FIG. 8).
  • On the save screen 440, fields for selecting one of a plurality of predetermined folders and for inputting a file name, a save button 442, a cancel button 444, and the like are displayed.
  • When the user selects a folder, inputs a file name, and presses the save button 442, the selection of the range and the generation of divided data in step S104 shown in FIG. 3 are performed.
  • The file name can also be assigned automatically.
  • The user can also create a new folder.
  • The divided data generation unit 116 determines the words included in the selected range (step S106). Next, the divided data generation unit 116 determines a start time and an end time based on the determined words (step S108). The divided data generation unit 116 then extracts the text data corresponding to the selected range from the speech recognition result storage unit 134 (step S110). Thereafter, the divided data generation unit 116 extracts the audio data of the corresponding time span based on the start time and the end time (step S112). Finally, the divided data generation unit 116 generates divided data including the text data and audio data of the selected portion (step S114) and stores it in a predetermined folder (step S116). A sketch of this procedure follows.
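  • The sketch below restates steps S106 to S116 under stated assumptions: word records shaped like the earlier WordEntry sketch, and a hypothetical audio object exposing a slice(start, end) method; none of these names come from the publication.

```python
def generate_divided_data(words, selected_ids, audio):
    """Sketch of steps S106-S116 (names and shapes are assumptions).

    words        : list of WordEntry records (see the earlier sketch)
    selected_ids : set of (sentence_id, word_id) pairs in the selection
    audio        : object exposing slice(start_time, end_time) -> clip
    """
    # S106/S110: determine the selected words and extract them while
    # keeping the stored per-sentence/per-word format intact.
    selected = [w for w in words
                if (w.sentence_id, w.word_id) in selected_ids]
    # S108: determine the start and end time from the selected words
    # ("HH:MM:SS" strings compare chronologically).
    start = min(w.start_time for w in selected)
    end = max(w.end_time for w in selected)
    # S112: extract the audio data of the corresponding time span.
    clip = audio.slice(start, end)
    # S114: the divided data holds the extracted text and audio; saving
    # to a predetermined folder (S116) is omitted here.
    return {"text": selected, "audio": clip, "start": start, "end": end}
```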
  • FIG. 10 is a diagram illustrating an example of the text data of the divided data stored in the divided data storage unit 136.
  • The text data of the divided data is generated in the same format as the text data of the speech recognition result stored in the speech recognition result storage unit 134. That is, the text data of the divided data includes a sentence number field, a word number field, a speaker field, a start time field, an end time field, a speech recognition result field, and a character number field.
  • FIG. 11 is a diagram showing the configuration of the editing processing apparatus 200 in the present embodiment.
  • The editing processing apparatus 200 includes a display processing unit 210 (second display processing means), an instruction receiving unit 212, an audio reproduction unit 214, an editing processing unit 218 (editing processing means), a data acquisition/transmission unit 220, and a storage unit 230.
  • The storage unit 230 includes a divided data storage unit 236 and an edited data storage unit 238.
  • The data acquisition/transmission unit 220 accesses the divided data storage unit 136 and the edited data storage unit 138 of the storage unit 130 of the editing management apparatus 100 to acquire divided data and to save edited data.
  • The divided data storage unit 236 stores the divided data acquired from the divided data storage unit 136 by the data acquisition/transmission unit 220.
  • The divided data acquired by the data acquisition/transmission unit 220 has the same configuration as that shown in FIG. 10.
  • The display processing unit 210, the instruction receiving unit 212, and the sound reproducing unit 214 can be configured to have the same functions as the display processing unit 110, the instruction receiving unit 112, and the sound reproducing unit 114 of the editing management apparatus 100, respectively.
  • The display processing unit 210 displays the text data included in the divided data so as to be editable in a predetermined area, and displays a cursor (caret) for selecting the text data in the display area.
  • The function of the display processing unit 210 can be realized by a text editor, similarly to the display processing unit 110.
  • FIG. 12 is a diagram showing a text editor screen 500 displayed on the display by the display processing unit 210.
  • On the screen 500, a text display area 502, a time display area 404, a time change button 406, an audio playback button 408, a speed change button 410, and the like are displayed.
  • In the text display area 502, the text data of the divided data and a cursor 520 are displayed.
  • The time display area 404, the time change button 406, the sound reproduction button 408, and the speed change button 410 have the same functions as described with reference to FIGS. 5 to 8, and their description is omitted here.
  • FIG. 13 is a diagram showing the management table of the display processing unit 210 in the state shown in FIG. 12.
  • Like the management table of the display processing unit 110, the management table of the display processing unit 210 holds, for each line, the identification information of the character strings (text), sentences (sentence), and words (word) included in the line. In addition, information indicating a start position (start) and a character length (len) is held for each sentence and each word.
  • As an example, the character string in the third line of the text display area 502 of the screen 500 shown in FIG. 12 will be described.
  • In this line, "and the school chief of C city and the cotton swab of the municipality school board of B prefecture" is displayed (where "cotton swab" is a recognition error that is corrected later).
  • "L3" in FIG. 13 is associated with the display information relating to the character string displayed in this line.
  • The instruction receiving unit 212 accepts, by the cursor, an arbitrary selection range of the text data displayed by the display processing unit 210, and accepts edits to the displayed text data.
  • The audio reproducing unit 214 reads the audio data included in the divided data from the divided data storage unit 236 and reproduces the audio.
  • The sound reproduction unit 214 outputs the sound data corresponding to a designated time.
  • The user of the editing processing apparatus 200 reproduces the corresponding voice data while viewing the text data displayed by the display processing unit 210, and determines whether or not the voice recognition result is correct. If there is an error in the speech recognition result, the corresponding part is corrected and edited.
  • When a word is corrected, the editing processing unit 218 rewrites the corresponding word in the text data of the divided data.
  • When a word is deleted, the portion corresponding to the word in the text data of the divided data is rewritten to a null character string.
  • When a character string is inserted, the character string is inserted at the corresponding position in the text data of the divided data.
  • FIG. 17 is a diagram showing the management table of the display processing unit 210 in the state after the editing described above.
  • The display information of the third line (L3) is the same as that shown in FIG. 13; however, because "cotton swab" in the third line has been changed to "member", the display from the fourth line onward is changed.
  • The management table is updated to reflect the changed word.
  • When the editing is finished, a box 530 is displayed.
  • In the box 530, a save button 532 is displayed.
  • When the save button 532 is pressed, the edited divided data is saved in the edited data storage unit 238 as edited data.
  • The file name can be assigned automatically or input by the user.
  • FIG. 18 is a diagram illustrating an example of text data of edited data stored in the edited data storage unit 238.
  • The edited data is generated in the same format as the text data of the divided data. That is, the text data of the edited data includes a sentence number field, a word number field, a speaker field, a start time field, an end time field, a speech recognition result field, and a character number field.
  • The data acquisition/transmission unit 220 stores the edited data in the edited data storage unit 138 of the editing management apparatus 100 in accordance with a user instruction.
  • The editing management apparatus 100 can also be configured to have a function of registering a connecting character for a predetermined character string included in the text data.
  • The connecting character can be a common character string that should be included in a plurality of pieces of divided data. By registering such connecting characters, the divided data can be integrated using the connecting characters as keys, and the integrated data can be generated easily and accurately.
  • Next, a procedure for registering a connecting character for the text data displayed in the text display area 402 of the screen 400 will be described.
  • As before, a box 430 is displayed.
  • This procedure is the same as described with reference to FIG. 7.
  • In the box 430, a connecting character registration button 434 is further displayed.
  • When the connecting character registration button 434 is selected, the selected character string is registered as a connecting character.
  • The display processing unit 110 can highlight a registered connecting character, for example by surrounding it with a frame 424.
  • By registering connecting characters before the user of the editing management apparatus 100 generates divided data, the user can view the screen 400 and select the ranges of the divided data using the connecting characters as boundaries.
  • In this way, a connecting character can be included in a plurality of pieces of divided data in common.
  • FIG. 21 shows an example of this.
  • Here, "Last Year" is registered as a connecting character.
  • The integrated data can then be generated using the connecting character "Last Year" as a key, as sketched below.
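  • A minimal sketch of using a registered connecting character as an integration key, assuming each edited fragment is plain text and both fragments contain the key at their shared boundary (the function name and example strings are hypothetical):

```python
def join_on_connecting_char(first: str, second: str, key: str) -> str:
    """Concatenate two edited fragments that share the connecting
    character `key` at their boundary, keeping a single copy of it."""
    if not (first.endswith(key) and second.startswith(key)):
        raise ValueError("both fragments must contain the connecting character")
    return first + second[len(key):]

# Two divided data fragments that both contain "Last Year" in common:
merged = join_on_connecting_char(
    "... received a report. Last Year", "Last Year, the committee ...",
    "Last Year")
print(merged)  # "... received a report. Last Year, the committee ..."
```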
  • The editing management apparatus 100 may also have a function of assigning an index that marks an arbitrary reproduction start position at a predetermined position of the text data. By assigning an index to a predetermined position of the displayed text data, the user can start reproduction from that position.
  • Next, a procedure for assigning an index to the text data displayed in the text display area 402 of the screen 400 will be described.
  • As in the procedures above, a box 430 is displayed.
  • In the box 430, an index assignment button 436 is further displayed.
  • When the index assignment button 436 is pressed, an index is assigned to the selected position.
  • According to the present embodiment, a desired range of the speech recognition result can be selected with a simple operation, and the text data included in that range can be extracted while its original format is maintained. Thereby, a partial editing operation on the speech recognition result can be performed quickly.
  • In addition, a plurality of pieces of divided data can be prepared so that a plurality of workers can edit them in parallel, which improves the work efficiency when the speech recognition result is corrected by a plurality of workers.
  • Each component of the edit management device 100 shown in FIG. 2 and of the edit processing device 200 shown in FIG. 11 represents a functional block rather than a hardware-level configuration.
  • Each component of the edit management device 100 and the edit processing device 200 is realized by an arbitrary combination of hardware and software, centered on a CPU, a memory, a program loaded into the memory that realizes the components shown in the figures, a storage unit such as a hard disk that stores the program, and a network connection interface. It will be understood by those skilled in the art that various modifications can be made to the implementation methods and apparatuses.
  • The voice data acquired by the voice acquisition unit 102 described with reference to FIG. 2 and the text data of the voice recognition result processed by the voice recognition unit 104 can be included in one file. That is, the text data of the speech recognition result shown in FIG. 4 can be associated with the voice data and configured as one file. Further, the voice data storage unit 132 and the voice recognition result storage unit 134 illustrated in FIG. 2 are functionally separated and need not be clearly physically separated.
  • The edit management device 100 and the edit processing device 200 can each be configured by a device 10 such as a personal computer.
  • FIG. 24 is a block diagram illustrating the hardware configuration of the apparatus 10 that constitutes the edit management apparatus 100 and the edit processing apparatus 200.
  • The apparatus 10 includes a CPU 12, a memory 14, an HDD (hard disk) 16, a communication IF (interface) 18, a display 30, an operation unit 32, an audio output device 34, and a bus 40 for connecting them.
  • In the present embodiment, the editing processing apparatus 200 accesses the editing management apparatus 100 and acquires the divided data. However, a configuration is also possible in which, when the editing management apparatus 100 generates divided data, it distributes the divided data to the editing processing apparatuses 200 as appropriate to request editing.
  • In the present embodiment, a configuration is shown in which the divided data includes the audio data corresponding to its own text data.
  • With this configuration, the data amount of the divided data acquired by each editing processing apparatus 200 can be reduced.
  • In another example, the audio data included in the divided data may correspond to the entire text data of the speech recognition result. Even in this case, the user of the editing processing apparatus 200 can reproduce the corresponding portion of the audio data based on the time information.
  • In yet another example, the divided data may be configured not to include audio data. In this case, the user of the editing processing apparatus 200 can access the audio data storage unit 132 of the editing management apparatus 100 and reproduce the corresponding portion of the audio data based on the time information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Document Processing Apparatus (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to an editing management apparatus comprising: a speech recognition result storage unit that stores text data in a predetermined format while associating each word thereof with time information, the text data being obtained as the result of speech recognition performed on voice data; a display processing unit that displays the text data in a predetermined display area and simultaneously displays a cursor used to select text data in the display area; an instruction accepting unit that accepts, by means of the cursor, an arbitrary selection from the text data displayed by the display processing unit and simultaneously accepts an instruction to generate divided data; and a divided data generation unit that generates the divided data by extracting the text data contained in the selection accepted by the instruction accepting unit from the speech recognition result storage unit while maintaining the predetermined format.
PCT/JP2010/004060 2009-06-18 2010-06-17 Editing support system, editing support method, and editing support program WO2010146869A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011519574A JP5533865B2 (ja) 2009-06-18 2010-06-17 Editing support system, editing support method, and editing support program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009145529 2009-06-18
JP2009-145529 2009-06-18

Publications (1)

Publication Number Publication Date
WO2010146869A1 true WO2010146869A1 (fr) 2010-12-23

Family

ID=43356199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/004060 WO2010146869A1 (fr) 2010-06-17 Editing support system, editing support method, and editing support program

Country Status (2)

Country Link
JP (1) JP5533865B2 (fr)
WO (1) WO2010146869A1 (fr)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3668892B2 (ja) * 2002-08-21 2005-07-06 株式会社大和速記情報センター Digital shorthand system
JP2008009693A (ja) * 2006-06-29 2008-01-17 Advanced Media Inc Transcription system, server therefor, and server program
JP2009009410A (ja) * 2007-06-28 2009-01-15 Hiroshi Ueno Text editing support system and program

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001272990A (ja) * 2000-03-28 2001-10-05 Fuji Xerox Co Ltd Dialogue record editing device
JP2003131694A (ja) * 2001-08-04 2003-05-09 Koninkl Philips Electronics Nv Method of supporting the proofreading of speech-recognized text with a playback speed adapted to recognition reliability
JP2004333737A (ja) * 2003-05-06 2004-11-25 Nec Corp Media search device and media search program
JP2007133033A (ja) * 2005-11-08 2007-05-31 Nec Corp Speech-to-text system, speech-to-text method, and speech-to-text program
JP2009098490A (ja) * 2007-10-18 2009-05-07 Kddi Corp Speech recognition result editing device, speech recognition device, and computer program

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017026821A (ja) * 2015-07-22 2017-02-02 ブラザー工業株式会社 Text association editing device, text association editing method, and program
JP2017026822A (ja) * 2015-07-22 2017-02-02 ブラザー工業株式会社 Text association editing device, text association editing method, and program
JP2018059989A (ja) * 2016-10-03 2018-04-12 株式会社アドバンスト・メディア Information processing system, terminal device, server, information processing method, and program
JP2019197210A (ja) * 2018-05-08 2019-11-14 日本放送協会 Speech recognition error correction support device and program therefor
JP2021128744A (ja) * 2020-09-16 2021-09-02 株式会社時空テクノロジーズ Information processing device, information processing system, and program
JP7048113B2 (ja) 2020-09-16 2022-04-05 株式会社時空テクノロジーズ Information processing device, information processing system, and program

Also Published As

Publication number Publication date
JP5533865B2 (ja) 2014-06-25
JPWO2010146869A1 (ja) 2012-11-29

Similar Documents

Publication Publication Date Title
US9870796B2 (en) Editing video using a corresponding synchronized written transcript by selection from a text viewer
JP4347223B2 (ja) System and method for annotating multimodal characteristics in multimedia documents
CN106716466B (zh) Conference information storage device and method
US6915258B2 (en) Method and apparatus for displaying and manipulating account information using the human voice
WO2016119370A1 (fr) Method and device for implementing sound recording, and mobile terminal
JP5533865B2 (ja) Editing support system, editing support method, and editing support program
JP2010140506A (ja) Device for annotating documents
WO2005040966A2 (fr) Voice tagging, voice annotation, and speech recognition for portable devices with optional post-processing
JP2005341015A (ja) Video conference system with minutes creation support function
JP2010060850A (ja) Minutes creation support device, minutes creation support method, minutes creation support program, and minutes creation support system
CN106126157A (zh) Voice input method and device based on a hospital information system
JP2021067830A (ja) Minutes creation system
JP2010238050A (ja) Browsing system, method, and program
KR102036721B1 (ko) Terminal device supporting quick search of recorded voice, and operating method thereof
EP1079313A2 (fr) Audio processing system
CN110335583B (zh) Method for generating and parsing a composite file with partition markers
JP7180747B2 (ja) Editing support program, editing support method, and editing support device
JP2008216965A (ja) Method for displaying mail on a screen in a form integrated with music
JP4260641B2 (ja) Search result processing device, search result processing program, recording medium for search result processing program, and search result processing system
JPH08153104A (ja) Hypermedia system and hypermedia document creation/editing method
JP2011150169A (ja) Speech recognition device
JP4452122B2 (ja) Metadata generation device and metadata generation program
JP6628157B2 (ja) Translation device, control method therefor, and program
JPH08152897A (ja) Speech editing processing device
JP6650636B1 (ja) Translation device, control method therefor, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10789248

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2011519574

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10789248

Country of ref document: EP

Kind code of ref document: A1