US20170270949A1 - Summary generating device, summary generating method, and computer program product - Google Patents
Summary generating device, summary generating method, and computer program product
- Publication number
- US20170270949A1 (application No. US 15/416,973)
- Authority
- US
- United States
- Prior art keywords
- information
- script
- featural
- structuring
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
- G06F16/345—Summarisation for human users
-
- G06F17/211—
-
- G06F17/2755—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/268—Morphological analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G10L15/265—
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/221—Announcement of recognition results
Definitions
- Embodiments described herein relate generally to a summary generating device, a summary generating method, and a computer program product.
- in the conventional technology, a minutes blueprint that is prepared in advance is used, so the advance preparation for creating the minutes is a cumbersome task.
- FIG. 1 is a diagram illustrating an exemplary system configuration of a summary generating system 1 according to an embodiment.
- the summary generating system 1 includes a summary generating device 100 and a terminal device 200 .
- communication can be performed with each device in a wireless manner or a wired manner.
- the summary generating system 1 can include a plurality of terminal devices 200 .
- the summary generating system 1 creates, from voice data of a meeting, a summary document that is made visible as a format-structured text.
- the terminal device 200 obtains voice data of a meeting and sends that voice data to the summary generating device 100 via a network.
- the voice data is obtained from a microphone that is connected to the terminal device 200 .
- a single microphone or a plurality of microphones can be used. Since there are times when a meeting is conducted across different locations, there may be a case in which the summary generating system 1 includes a plurality of terminal devices 200 .
- the terminal device 200 is an information device such as a personal computer (PC) or a tablet terminal.
- the summary generating device 100 obtains voice data from the terminal device 200 , detects an explicit summary request of a speaker or an expression for a structuring request included in a speech, and estimates appropriate display units (segments). Then, in response to a termination instruction from a speaker, the summary generating device 100 rearranges the segments depending on the contents thereof, converts them into various display formats, and outputs them.
- the summary generating device 100 is an information processing device such as a server device.
- FIG. 2 is a block diagram illustrating an exemplary hardware configuration of the summary generating device 100 according to the embodiment.
- the summary generating device 100 includes a central processing unit (CPU) 12 , a read only memory (ROM) 13 , a random access memory (RAM) 14 , and a communicating unit 15 .
- these constituent elements are connected to each other by a bus 11 .
- FIG. 3 is a block diagram illustrating an exemplary functional configuration of the summary generating device 100 according to the embodiment.
- the summary generating device 100 includes a voice recognizing unit 110 , a featural script extracting unit 120 , a segment candidate generating unit 130 , a structuring estimating unit 140 , an instructing unit 160 , and a display format converting unit 170 . Some or all of these constituent elements can be implemented using software (computer programs) or using hardware circuitry. Meanwhile the summary generating device 100 stores a structure estimation model 150 in the ROM 13 .
- the voice recognizing unit 110 performs a voice recognition operation with respect to voice data. More particularly, the voice recognizing unit 110 receives input of voice data that is sent from the terminal device 200 . Then, the voice recognizing unit 110 performs a voice recognition operation, and generates text information containing character data of the utterances and information about the timings of utterances.
- FIG. 4 is a diagram illustrating an example of the result of performing voice recognition according to the embodiment. As illustrated in FIG. 4 , the text information that represents the result of voice recognition contains character data of the utterances and information about the timings of utterances.
- the voice recognizing unit 110 identifies utterance sections and silent sections as the audio features of the voice data, and detects the duration of those sections. Meanwhile, the voice recognizing unit 110 need not be included in the summary generating device 100; the configuration can be such that the featural script extracting unit 120, provided at a later stage, operates on externally supplied results of the voice recognition operation and the audio feature extraction operation.
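As a rough illustration of what "utterance sections and silent sections" could look like as data, the sketch below derives silent sections from the timing information attached to recognized utterances. This is a simplification: the embodiment identifies these sections as audio features of the voice data, whereas here the gaps are computed from timestamps only. The `Utterance` fields, the unit (seconds), and the `min_gap` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    start: float  # assumed: seconds from the start of the recording
    end: float
    text: str


def silent_sections(utterances, min_gap=0.5):
    """Return (start, end) gaps between consecutive utterances lasting at least min_gap."""
    ordered = sorted(utterances, key=lambda u: u.start)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start - prev.end >= min_gap:
            gaps.append((prev.end, cur.start))
    return gaps
```

The durations of the returned sections follow directly as `end - start` for each pair.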
- the featural script extracting unit 120 extracts featural script information included in the text information. More particularly, the featural script extracting unit 120 performs morphological analysis with respect to the text information generated by the voice recognizing unit 110 .
- FIG. 5 is a diagram illustrating an example of the result of performing morphological analysis according to the embodiment. As illustrated in FIG. 5 , as a result of performing morphological analysis, the text is divided into the smallest linguistic units each carrying a meaning. Then, the featural script extracting unit 120 identifies part-of-speech information included in the text information, and performs meaning class analysis. In the case of nouns, more detailed information (for example, information such as personal names or dates) based on the corresponding property information is extracted as featural script information.
- the featural script information is information related to parts of speech that are divisions in accordance with, for example, a morphological analysis.
- the featural script extracting unit 120 performs logical element determination with respect to the text information. For example, in the logical element determination, it is determined whether or not ordered bullet point expressions or command expressions for structuring are included in the text information. If logical elements are included in the text information, then the featural script extracting unit 120 assigns their metadata to the logical elements.
- FIG. 6 is a diagram illustrating exemplary command expressions according to the embodiment.
- the command expressions include “let us summarize (conclusion)”, “to-do list (todo)”, “that's enough (terminate)”, and “we will end (terminate)”.
- FIG. 7 is a diagram illustrating an example of the property information/the meaning class of the parts of speech according to the embodiment. As illustrated in FIG. 7 , for the term “10-th day of the month”, “date” represents the property information/the meaning class of a part of speech. Similarly, for the term “Nishiguchi”, “person” represents the property information/the meaning class of a part of speech.
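The command expressions of FIG. 6 and the meaning classes of FIG. 7 can be pictured as simple lookups. The following is a minimal sketch only: the dictionary contents reuse the examples quoted above, while the substring matching, the function names, and the lexicon approach are assumptions and not the analysis method of the embodiment (which relies on morphological analysis).

```python
from typing import Optional

# Hypothetical mapping built from the command expressions quoted from FIG. 6.
COMMAND_EXPRESSIONS = {
    "let us summarize": "conclusion",
    "to-do list": "todo",
    "that's enough": "terminate",
    "we will end": "terminate",
}

# Hypothetical lexicon using the two terms quoted from FIG. 7.
MEANING_CLASSES = {
    "10-th day of the month": "date",
    "Nishiguchi": "person",
}


def detect_command(utterance: str) -> Optional[str]:
    """Return the label of the first command expression found, or None."""
    lowered = utterance.lower()
    for expression, label in COMMAND_EXPRESSIONS.items():
        if expression in lowered:
            return label
    return None


def tag_meaning_classes(utterance: str) -> list[tuple[str, str]]:
    """Return (term, meaning class) pairs for lexicon terms found in the utterance."""
    return [(term, cls) for term, cls in MEANING_CLASSES.items() if term in utterance]
```

For instance, `detect_command("OK, let us summarize the discussion")` yields the `conclusion` label, and an utterance without any listed expression yields `None`.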
- a segment label is the name expressing the role of a segment (a display unit), and represents metadata that is assigned depending on whether or not the following is included: the meaning class/the property information of a part of speech extracted at an earlier stage, or the text of an utterance not having the meaning class/the property information, or a command (instruction) for structuring.
- a command for structuring represents an instruction to start structuring, and examples thereof include “start of bullet points”, “table begins here”, or “tabular format begins here”.
- the featural script extracting unit 120 assigns utterance sections and silent sections, which are detected by the voice recognizing unit 110 , as surrounding information.
- the featural script extracting unit 120 obtains the following system-originating information if available: detection of a speaker ID based on the login user of a microphone or the connected terminal device 200 ; meeting information such as the meeting title referable to in tandem with the usage timing of the meeting room and the scheduler, the time of the meeting, the participants, and the meeting room; and detailed meeting information such as information on the individual speakers who input voice during the meeting.
- FIG. 8 is a diagram illustrating an example of system-originating information according to the embodiment. As illustrated in FIG. 8, in the system-originating information, “A” represents the “speaker ID” and “10/23/2015” represents the “meeting date”.
- the segment candidate generating unit 130 generates variation in the candidates of smallest constitutional units for structuring.
- the candidates for smallest constitutional units for structuring include, in descending order of granularity, character strings partitioned by units such as speakers, paragraphs, phrases, sequences of the same character type such as Kanji or Katakana, meaning classes, words, and parts of speech.
- the segment candidate generating unit 130 reads the text information generated by the voice recognizing unit 110 and reads the featural script information extracted by the featural script extracting unit 120. Then, the segment candidate generating unit 130 detects the segment label present in each set of featural script information. For example, in the segment label detection, a start instruction, a termination instruction, or a label providing a clue for structuring is detected.
- the segment candidate generating unit 130 performs grouping of the sets of featural script information that have been read and stored before. For example, in the grouping, repetition of regular appearances of similar elements is detected or the appearance patterns of featural script information having different types are detected, and the units of such repetitions are grouped together.
- herein, similar elements refer to a set of elements, for example the three elements of date, location, and arbitrary text, that appears repeatedly in a regular manner.
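The grouping of regularly repeating elements can be sketched as finding the shortest repeating unit in the sequence of meaning classes and cutting the sequence at each repetition. This is an illustrative approach under the assumption that the pattern repeats exactly; the function names and the `(meaning class, text)` tuple layout are hypothetical.

```python
def find_repeating_unit(classes: list[str]) -> int:
    """Return the length of the shortest unit whose repetition yields `classes`.

    Returns len(classes) when no shorter repeating unit exists.
    """
    n = len(classes)
    for size in range(1, n + 1):
        if n % size == 0 and classes == classes[:size] * (n // size):
            return size
    return n


def group_by_pattern(items: list[tuple[str, str]]) -> list[list[tuple[str, str]]]:
    """Split (meaning class, text) items into one group per repetition of the pattern."""
    if not items:
        return []
    size = find_repeating_unit([cls for cls, _ in items])
    return [items[i:i + size] for i in range(0, len(items), size)]
```

With six items whose classes read date, location, text, date, location, text, the result is two groups of three items each. Real utterances rarely repeat this cleanly, so a practical version would need to tolerate missing or extra elements.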
- the segment candidate generating unit 130 performs ordering of the sets of featural script information that have been grouped before.
- examples of ordering include the following methods: a method in which the ordering of the types of featural script is defined in advance and is then applied in a fixed manner; a method in which the ordering is performed based on the character length (average character length) of each type of featural script in the extracted featural script information; and a method in which the ordering is performed based on the number of inclusions of a particular element (meaning class).
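The three ordering methods can be written as three small sort functions. This is a sketch under stated assumptions: a group is modelled as a mapping from a featural script type to the texts found for it, every type has at least one text, and the sort directions (shortest average length first, highest count first) are choices made for the example, since the text does not state them.

```python
from statistics import mean

# Assumed layout: featural script type (meaning class) -> texts found for that type.
Group = dict[str, list[str]]


def order_fixed(types: list[str], priority: list[str]) -> list[str]:
    """Order types by a predefined priority list; unknown types go last."""
    rank = {t: i for i, t in enumerate(priority)}
    return sorted(types, key=lambda t: rank.get(t, len(priority)))


def order_by_average_length(group: Group) -> list[str]:
    """Order types by the average character length of their texts, shortest first."""
    return sorted(group, key=lambda t: mean(len(s) for s in group[t]))


def order_by_count(group: Group) -> list[str]:
    """Order types by how many elements of each type are included, most first."""
    return sorted(group, key=lambda t: len(group[t]), reverse=True)
```

Short, frequently repeated types such as dates or personal names tend to sort ahead of free text under the second and third methods, which is one plausible reason to use them for choosing column or key order.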
- FIG. 9 is a diagram illustrating an example of the result of performing segmentalization according to the embodiment. As illustrated in FIG. 9 , segmentalization results in the generation of text information that is assigned with the command expressions included in the text at each timing, the property information/the meaning class of the parts of speech, and the system-originating information.
- the structuring estimating unit 140 estimates structure information based on the segment information. More specifically, the structuring estimating unit 140 reads the segment information generated by the segment candidate generating unit 130 . Then, the structuring estimating unit 140 reads a structure estimation model from the structure estimation model 150 .
- the structure estimation model is obtained by learning, as learning data, the exemplary formats suitable for display and the results edited/decided in the past. Based on such a structure estimation model, the structuring estimating unit 140 assigns combinations and patterns of appearances of the featural script information, and presents suitable structuring candidates in an ordered manner. In the initial presentation of the structure information, the structuring result having the highest likelihood is presented from among the ordered segment patterns.
- the structuring estimating unit 140 receives a decision instruction from a user.
- the instruction from the user is received via the instructing unit 160 .
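- if a decision instruction is received from the user, the presented candidate is decided as the structuring result.
- if a decision instruction is not received (i.e., if a request for presentation of the next candidate is received), the next structuring result is presented.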
- the presentation can be done not only by changing the combination of segments but also by changing the variation by tracking back the manner of retrieval of the segments. Meanwhile, the presentation of the structuring result either can be output from the terminal device 200 or can be output from the summary generating device 100 .
- FIG. 10 is a diagram for explaining an example of extracting candidates of structuring estimation according to the embodiment.
- the structuring result includes information ranging from information of a comprehensive structure level to information of a local structure level.
- the comprehensive structure level and the local structure level represent the tier of a document, such as chapters, sections, paragraphs, itemization, and titles and main text, and may be called a “logical structure” or “logical element” of a document.
- the comprehensive structure level may represent the highest or broadest structure element/elements while the local structure level may represent the lowest or narrowest structure element/elements.
- the comprehensive structure level may be the document itself or the chapters, and the local structure level may be the paragraphs.
- the display format converting unit 170 converts the decided structuring result into a display format for user viewing. More particularly, the display format converting unit 170 reads the structuring result decided by the structuring estimating unit 140 . Then, the display format converting unit 170 reads a display format conversion model. In the display format conversion model, definition patterns regarding the display format to be used for presentation are written corresponding to the structuring results; and the cascading style sheets (CSS) or the XSL transformations (XSLT) can be used for writing the definition patterns.
- the display format converting unit 170 presents the initial conversion result according to the structuring result and the display format conversion model. In response to that presentation, if a decision instruction is received from the user via the instructing unit 160 , then the display format converting unit 170 outputs the conversion result as a summary document. On the other hand, if a decision instruction from the user cannot be obtained (i.e., if a request for presentation of the next candidate is received), then the conversion result having the next highest likelihood is presented. Meanwhile, the presentation of the conversion result either can be output from the terminal device 200 or can be output from the summary generating device 100 .
- FIG. 11 is a diagram for explaining an example of candidates of the display format according to the embodiment. As illustrated in FIG. 11 , in the case of presenting the information of any one of the structures from among the structuring results, the display is done in the order of date, or the display is done on the basis of personal names, or only the items and the personal names are displayed.
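The three display candidates mentioned for FIG. 11 (ordered by date, organized by personal name, and items with personal names only) can be sketched as three views over the same records. The record fields and sample values below are hypothetical and are not taken from FIG. 11; the embodiment describes the conversion in terms of a display format conversion model written with CSS or XSLT, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records; the field names are assumptions made for this example.
records = [
    {"date": date(2015, 10, 30), "person": "Nishiguchi", "item": "release plan"},
    {"date": date(2015, 10, 23), "person": "A", "item": "budget review"},
]


def by_date(rows):
    """Candidate 1: display the rows in order of date."""
    return sorted(rows, key=lambda r: r["date"])


def by_person(rows):
    """Candidate 2: display the items grouped under each personal name."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["person"]].append(r["item"])
    return dict(grouped)


def items_and_persons(rows):
    """Candidate 3: display only the items and the personal names."""
    return [(r["item"], r["person"]) for r in rows]
```

Each function returns plain data, so a separate rendering step (for example, a stylesheet) would still be needed to produce the final summary document.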
- FIG. 12 is a flowchart for explaining an exemplary sequence of operations performed by the featural script extracting unit 120 according to the embodiment.
- the featural script extracting unit 120 obtains the text information generated by the voice recognizing unit 110 (Step S 101 ). Then, the featural script extracting unit 120 performs morphological analysis with respect to the obtained text information (Step S 102 ). Subsequently, the featural script extracting unit 120 identifies part-of-speech information included in the text information and performs meaning class analysis (Step S 103 ). Then, the featural script extracting unit 120 performs logical element determination with respect to the text information (Step S 104 ).
- the featural script extracting unit 120 performs segment label determination with respect to the text information (Step S 105 ). Then, the featural script extracting unit 120 assigns utterance sections and silent sections, which are detected by the voice recognizing unit 110 , as surrounding information (Step S 106 ). Subsequently, the featural script extracting unit 120 detects, as system-originating information, a speaker ID based on the login user of a microphone or the terminal device 200 (Step S 107 ). Then, the featural script extracting unit 120 detects detailed meeting information managed by an external device (Step S 108 ).
- FIG. 13 is a flowchart for explaining an exemplary sequence of operations performed by the segment candidate generating unit 130 according to the embodiment.
- the segment candidate generating unit 130 reads the text information generated by the voice recognizing unit 110 and reads the featural script information extracted by the featural script extracting unit 120 (Step S 201 ). Then, the segment candidate generating unit 130 reads the segment label included in each set of featural script information (Step S 202 ). Subsequently, the segment candidate generating unit 130 performs grouping of sets of featural script information that have been read and stored before (Step S 203 ). Then, the segment candidate generating unit 130 performs ordering of the sets of featural script information that have been grouped till that point of time (Step S 204 ).
- the segment candidate generating unit 130 determines whether or not a termination instruction regarding structuring is included in the segment label (Step S 205 ). If a termination instruction regarding structuring is included in the segment label (Yes at Step S 205 ), then the segment candidate generating unit 130 performs ordering of the sets of featural script information that have been grouped (Step S 206 ). However, if a termination instruction regarding structuring is not included in the segment label (No at Step S 205 ), then the system control returns to Step S 201 .
- FIG. 14 is a flowchart for explaining an exemplary sequence of operations performed by the structuring estimating unit 140 according to the embodiment.
- the structuring estimating unit 140 reads the ordered sets of segment information generated by the segment candidate generating unit 130 (Step S 301 ). Then, the structuring estimating unit 140 reads a structure estimation model from the structure estimation model 150 (Step S 302 ). Subsequently, based on the segment information and the structure estimation model, the structuring estimating unit 140 assigns combinations and patterns of appearances of the featural script information, and presents the initial structure information (a candidate for structuring) (Step S 303 ).
- When a decision instruction is received from the user in response to the presentation of structure information (Yes at Step S 304 ), the structuring estimating unit 140 assigns the presented candidate as the decided structure information (Step S 305 ). However, if a decision instruction cannot be received from the user in response to the presentation of structure information (i.e., if a request for presentation of the next candidate is received) (No at Step S 304 ), then the structuring estimating unit 140 presents the candidate of the structure information having the next highest score (Step S 306 ). After a candidate is presented, the system control returns to Step S 304 and a decision instruction from the user is awaited.
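The present-and-confirm loop of Steps S 303 to S 306 can be sketched as iterating over candidates from the highest score downward until the user accepts one. The `accept` callback stands in for the decision instruction received via the instructing unit 160 and is a hypothetical interface; returning `None` when every candidate is rejected is also an assumption, since the flowchart does not describe that case.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def choose_candidate(
    scored: Sequence[tuple[float, T]],
    accept: Callable[[T], bool],
) -> Optional[T]:
    """Present candidates from highest to lowest score until one is accepted."""
    for _, candidate in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if accept(candidate):  # corresponds to "Yes at Step S 304"
            return candidate
    return None  # assumption: no candidate left to present
```

The same loop shape also fits the display format conversion flow (Steps S 403 to S 406 ), with conversion results in place of structure candidates.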
- FIG. 15 is a flowchart for explaining an exemplary sequence of operations performed by the display format converting unit 170 according to the embodiment.
- the display format converting unit 170 reads the structuring result decided by the structuring estimating unit 140 (Step S 401 ). Then, the display format converting unit 170 reads the display format conversion model (Step S 402 ). Subsequently, according to the structuring result and the display format conversion model, the display format converting unit 170 presents the initial conversion result (Step S 403 ).
- When a decision instruction from the user is received in response to the presentation of the conversion result (Yes at Step S 404 ), the display format converting unit 170 outputs the conversion result as a summary document (Step S 405 ). However, if a decision instruction from the user cannot be received in response to the presentation of the conversion result (i.e., if a request for presentation of the next candidate is received) (No at Step S 404 ), then the display format converting unit 170 presents the candidate having the next highest score of the conversion result (Step S 406 ). After the candidate is presented, the system control returns to Step S 404 and a decision instruction from the user is awaited.
- segments are estimated based on an explicit instruction by the speaker or based on an expression for a structuring request. Then, the segments are rearranged depending on the contents thereof and are presented upon being converted into various display formats. As a result, it becomes possible to cut down on the time and effort required for advance preparation.
- the summary generating device 100 can be implemented using, for example, a general-purpose computer device serving as the basic hardware.
- the computer programs that are executed contain modules for the constituent elements described above.
- the computer programs can be provided by recording as installable files or executable files in a computer-readable recording medium such as a compact disk read only memory (CD-ROM), a compact disk recordable (CD-R), or a digital versatile disk (DVD); or can be provided by storing in advance in a ROM.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Signal Processing (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Quality & Reliability (AREA)
- User Interface Of Digital Computer (AREA)
- Machine Translation (AREA)
Abstract
Description
- This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2016-054331, filed on Mar. 17, 2016; the entire contents of which are incorporated herein by reference.
- Embodiments described herein relate generally to a summary generating device, a summary generating method, and a computer program product.
- Conventionally, as the accuracy of voice recognition technology has improved, systems have been proposed that use voice recognition to document the remarks made during a meeting. In that context, a technology is available that supports the creation of the minutes of a meeting, which typically requires time and effort when done manually. For example, a technology is available that provides a minutes blueprint and performs analysis before creating the minutes according to the minutes blueprint.
- However, the conventional technology relies on a minutes blueprint that is prepared in advance, so the advance preparation for creating the minutes is a cumbersome task.
- FIG. 1 is a diagram illustrating a system configuration of a summary generating system according to an embodiment;
- FIG. 2 is a block diagram illustrating a hardware configuration of a summary generating device according to the embodiment;
- FIG. 3 is a block diagram illustrating a functional configuration of the summary generating device according to the embodiment;
- FIG. 4 is a diagram illustrating a result of performing voice recognition according to the embodiment;
- FIG. 5 is a diagram illustrating a result of performing morphological analysis according to the embodiment;
- FIG. 6 is a diagram illustrating command expressions according to the embodiment;
- FIG. 7 is a diagram illustrating property information/meaning class of the parts of speech according to the embodiment;
- FIG. 8 is a diagram illustrating system-originating information according to the embodiment;
- FIG. 9 is a diagram illustrating a result of performing segmentalization according to the embodiment;
- FIG. 10 is a diagram for explaining the extraction of candidates of structuring estimation according to the embodiment;
- FIG. 11 is a diagram for explaining candidates of the display format according to the embodiment;
- FIG. 12 is a flowchart for explaining a sequence of operations performed by a featural script extracting unit according to the embodiment;
- FIG. 13 is a flowchart for explaining a sequence of operations performed by a segment candidate generating unit according to the embodiment;
- FIG. 14 is a flowchart for explaining a sequence of operations performed by a structuring estimating unit according to the embodiment; and
- FIG. 15 is a flowchart for explaining a sequence of operations performed by a display format converting unit according to the embodiment.
- According to one embodiment, a summary generating device includes a featural script extracting unit, a segment candidate generating unit, and a structuring estimating unit. The featural script extracting unit extracts featural script information of the words included in text information. Based on the extracted featural script information, the segment candidate generating unit generates candidates of segments that represent the constitutional units for the display purpose. Based on the generated candidates of segments and based on an estimation model for structuring, the structuring estimating unit estimates structure information containing information organized from a comprehensive structure level to a local structure level.
- Embodiment
-
FIG. 1 is a diagram illustrating an exemplary system configuration of a summary generating system 1 according to an embodiment. As illustrated in FIG. 1, the summary generating system 1 includes a summary generating device 100 and a terminal device 200. In the summary generating system 1, communication can be performed with each device in a wireless manner or a wired manner. Meanwhile, the summary generating system 1 can include a plurality of terminal devices 200. The summary generating system 1 creates, from voice data of a meeting, a summary document that is made visible as a format-structured text. - In the configuration described above, the
terminal device 200 obtains voice data of a meeting and sends that voice data to the summary generating device 100 via a network. The voice data is obtained from a microphone that is connected to the terminal device 200. In a meeting, either a single microphone or a plurality of microphones can be used. Since there are times when a meeting is conducted across different locations, there may be a case in which the summary generating system 1 includes a plurality of terminal devices 200. Herein, the terminal device 200 is an information device such as a personal computer (PC) or a tablet terminal. - The
summary generating device 100 obtains voice data from the terminal device 200, detects an explicit summary request of a speaker or an expression for a structuring request included in a speech, and estimates appropriate display units (segments). Then, in response to a termination instruction from a speaker, the summary generating device 100 rearranges the segments depending on the contents thereof, converts them into various display formats, and outputs them. Herein, the summary generating device 100 is an information processing device such as a server device. -
FIG. 2 is a block diagram illustrating an exemplary hardware configuration of the summary generating device 100 according to the embodiment. As illustrated in FIG. 2, the summary generating device 100 includes a central processing unit (CPU) 12, a read only memory (ROM) 13, a random access memory (RAM) 14, and a communicating unit 15. Moreover, these constituent elements are connected to each other by a bus 11. - The
CPU 12 controls the operations of the entire summary generating device 100. The CPU 12 uses the RAM 14 as the work area and executes computer programs stored in the ROM 13 so as to control the operations of the entire summary generating device 100. The RAM 14 is used to temporarily store the information related to various operations, and is used as the work area during the execution of the computer programs stored in the ROM 13. Herein, the ROM 13 is used to store computer programs for implementing the operations of the summary generating device 100. The communicating unit 15 communicates with external devices such as the terminal device 200 via a network in a wireless manner or a wired manner. Meanwhile, the hardware configuration illustrated in FIG. 2 is only exemplary, and it is also possible to have a display unit for outputting the processing result and an operating unit for inputting a variety of information. -
FIG. 3 is a block diagram illustrating an exemplary functional configuration of the summary generating device 100 according to the embodiment. As illustrated in FIG. 3, the summary generating device 100 includes a voice recognizing unit 110, a featural script extracting unit 120, a segment candidate generating unit 130, a structuring estimating unit 140, an instructing unit 160, and a display format converting unit 170. Some or all of these constituent elements can be implemented using software (computer programs) or using hardware circuitry. Meanwhile, the summary generating device 100 stores a structure estimation model 150 in the ROM 13. - The
voice recognizing unit 110 performs a voice recognition operation with respect to voice data. More particularly, the voice recognizing unit 110 receives input of voice data that is sent from the terminal device 200. Then, the voice recognizing unit 110 performs a voice recognition operation, and generates text information containing character data of the utterances and information about the timings of utterances. FIG. 4 is a diagram illustrating an example of the result of performing voice recognition according to the embodiment. As illustrated in FIG. 4, the text information that represents the result of voice recognition contains character data of the utterances and information about the timings of utterances. - Moreover, the
voice recognizing unit 110 identifies utterance sections and silent sections as the audio features of the voice data, and detects the duration of those sections. Meanwhile, the voice recognizing unit 110 need not be included in the summary generating device 100; the configuration can be such that the featural script extracting unit 120 at the subsequent stage operates on the result of the voice recognition operation and the audio feature extraction operation. - The featural
script extracting unit 120 extracts featural script information included in the text information. More particularly, the featural script extracting unit 120 performs morphological analysis with respect to the text information generated by the voice recognizing unit 110. FIG. 5 is a diagram illustrating an example of the result of performing morphological analysis according to the embodiment. As illustrated in FIG. 5, as a result of performing morphological analysis, the text is divided into the smallest linguistic units each carrying a meaning. Then, the featural script extracting unit 120 identifies part-of-speech information included in the text information, and performs meaning class analysis. In the case of nouns, more detailed information (for example, information such as personal names or dates) based on the corresponding property information is extracted as featural script information. For example, during meaning class analysis, the following is extracted as featural script information: presence or absence of expressions of quantity and expressions of units, personal names and organization names, event names, and keywords such as technical terms. In other words, the featural script information, according to some embodiments, is information related to parts of speech, which are divisions obtained by, for example, morphological analysis. Then, the featural script extracting unit 120 performs logical element determination with respect to the text information. For example, in the logical element determination, it is determined whether or not ordered bullet point expressions or command expressions for structuring are included in the text information. If logical elements are included in the text information, then the featural script extracting unit 120 assigns metadata to those logical elements. -
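The logical element determination described above could be sketched roughly as follows. This is only an illustrative sketch in Python, not the method of the embodiment: the command expressions and their labels are the ones listed for FIG. 6, while the function name `determine_logical_elements`, the metadata layout, and the `ORDERED_BULLET` pattern for ordered bullet point expressions are assumptions introduced here.

```python
import re

# Command expressions and their labels, as listed for FIG. 6 in the text.
COMMAND_EXPRESSIONS = {
    "let us summarize": "conclusion",
    "to-do list": "todo",
    "that's enough": "terminate",
    "we will end": "terminate",
}

# Hypothetical pattern for ordered bullet point expressions (an assumption,
# the source does not specify how such expressions are recognized).
ORDERED_BULLET = re.compile(r"\b(first|second|third|number \d+)\b", re.IGNORECASE)


def determine_logical_elements(utterance):
    """Return a list of metadata dicts, one per logical element found."""
    found = []
    lowered = utterance.lower()
    # Command expressions for structuring.
    for phrase, label in COMMAND_EXPRESSIONS.items():
        if phrase in lowered:
            found.append({"type": "command", "label": label, "text": phrase})
    # Ordered bullet point expressions.
    for match in ORDERED_BULLET.finditer(utterance):
        found.append(
            {"type": "ordered_bullet", "label": "item", "text": match.group(0).lower()}
        )
    return found
```

An utterance that contains neither kind of expression yields an empty list, which corresponds to the case in which no metadata is assigned.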
FIG. 6 is a diagram illustrating exemplary command expressions according to the embodiment. As illustrated in FIG. 6, the command expressions include “let us summarize (conclusion)”, “to-do list (todo)”, “that's enough (terminate)”, and “we will end (terminate)”. FIG. 7 is a diagram illustrating an example of the property information/the meaning class of the parts of speech according to the embodiment. As illustrated in FIG. 7, for the term “10-th day of the month”, “date” represents the property information/the meaning class of a part of speech. Similarly, for the term “Nishiguchi”, “person” represents the property information/the meaning class of a part of speech. - Subsequently, the featural
script extracting unit 120 performs segment label determination with respect to the text information. A segment label is the name expressing the role of a segment (a display unit), and represents metadata that is assigned depending on whether or not the following is included: the meaning class/the property information of a part of speech extracted at an earlier stage, or the text of an utterance not having the meaning class/the property information, or a command (instruction) for structuring. For example, a command for structuring represents an instruction to start structuring, and examples thereof include “start of bullet points”, “table begins here”, or “tabular format begins here”. Moreover, the featural script extracting unit 120 assigns utterance sections and silent sections, which are detected by the voice recognizing unit 110, as surrounding information. - Meanwhile, as the featural script information, information originating from the
summary generating system 1 can also be used. For example, as the featural script information, the featural script extracting unit 120 obtains the following system-originating information if available: a speaker ID detected based on the login user of a microphone or of the connected terminal device 200; meeting information that can be looked up in conjunction with the usage timing of the meeting room and the scheduler, such as the meeting title, the time of the meeting, the participants, and the meeting room; and detailed meeting information such as information on the individual speakers who input voice during the meeting. FIG. 8 is a diagram illustrating an example of system-originating information according to the embodiment. As illustrated in FIG. 8, in the system-originating information, “A” represents the “speaker ID” and “10/23/2015” represents the “meeting date”. - The segment
candidate generating unit 130 generates variations of the candidates of smallest constitutional units for structuring. Examples of the candidates for smallest constitutional units for structuring include, in descending order of granularity, character strings partitioned by units such as speakers, paragraphs, phrases, sequences of the same character type such as Kanji or Katakana, meaning classes, words, and parts of speech. More particularly, the segment candidate generating unit 130 reads the text information generated by the voice recognizing unit 110 and reads the featural script information extracted by the featural script extracting unit 120. Then, the segment candidate generating unit 130 detects the segment label present in each set of featural script information. For example, in the segment label detection, a start instruction, a termination instruction, or a label providing a clue of structuring is detected. - Then, the segment
candidate generating unit 130 performs grouping of the sets of featural script information that have been read and stored before. For example, in the grouping, regular repetition of similar elements is detected or the appearance patterns of featural script information having different types are detected, and the units of such repetitions are grouped together. As an example of similar elements, a set of three elements such as a date, a location, and arbitrary text may appear repeatedly in a regular manner. - Meanwhile, if a termination instruction regarding structuring is included in a segment label, then the segment
candidate generating unit 130 performs ordering of the sets of featural script information that have been grouped before. Examples of ordering include the following methods: a method in which the ordering of the types of featural script is defined in advance and then the ordering is applied in a fixed manner; a method in which, for the specific instances of the extracted featural script information, the ordering is performed based on the character length (average character length) included in each featural script; and a method in which the ordering is performed based on the number of occurrences of a particular element (meaning class). -
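The grouping of regularly repeating elements and two of the ordering methods mentioned above might look like the following minimal Python sketch. It assumes that each piece of featural script information is available as a `(meaning_class, text)` pair; the function names, that data layout, and the sort directions (shortest average length first, most frequent class first) are assumptions made for illustration, since the text does not fix them.

```python
from collections import Counter


def group_by_repeating_pattern(items, pattern):
    """Group consecutive (meaning_class, text) items whose classes repeat `pattern`."""
    groups = []
    size = len(pattern)
    i = 0
    while i + size <= len(items):
        window = items[i:i + size]
        if tuple(cls for cls, _ in window) == tuple(pattern):
            groups.append(window)  # one repetition unit, e.g. date/location/text
            i += size
        else:
            i += 1  # skip an item that does not start a repetition
    return groups


def order_by_average_length(items):
    """Order meaning classes by the average character length of their texts."""
    totals, counts = Counter(), Counter()
    for cls, text in items:
        totals[cls] += len(text)
        counts[cls] += 1
    return sorted(counts, key=lambda cls: totals[cls] / counts[cls])


def order_by_occurrence_count(items):
    """Order meaning classes by how often each one appears, most frequent first."""
    counts = Counter(cls for cls, _ in items)
    return [cls for cls, _ in counts.most_common()]
```

With the three-element pattern `("date", "location", "text")`, each detected repetition becomes one group, which could then serve as a row-like unit for later structuring.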
FIG. 9 is a diagram illustrating an example of the result of performing segmentalization according to the embodiment. As illustrated in FIG. 9, segmentalization results in the generation of text information that is assigned with the command expressions included in the text at each timing, the property information/the meaning class of the parts of speech, and the system-originating information. - The
structuring estimating unit 140 estimates structure information based on the segment information. More specifically, the structuring estimating unit 140 reads the segment information generated by the segment candidate generating unit 130. Then, the structuring estimating unit 140 reads a structure estimation model from the structure estimation model 150. Herein, the structure estimation model is obtained by learning, as learning data, the exemplary formats suitable for display and the results edited/decided in the past. Based on such a structure estimation model, the structuring estimating unit 140 assigns combinations and patterns of appearances of the featural script information, and presents suitable structuring candidates in an ordered manner. In the initial presentation of the structure information, the structuring result having the highest likelihood is presented from among the ordered segment patterns. - Then, the
structuring estimating unit 140 receives a decision instruction from a user. Herein, the instruction from the user is received via the instructing unit 160. For example, if the user has no issue with the current presentation candidates, a structuring result with the decided presentation candidates is presented. On the other hand, if a decision instruction from the user cannot be obtained (i.e., if a request for presentation of the next candidate is received), then the next structuring result is presented. In the case of presenting the next structuring result, the presentation can be done not only by changing the combination of segments but also by tracing back and changing the way in which the segments were extracted. Meanwhile, the presentation of the structuring result either can be output from the terminal device 200 or can be output from the summary generating device 100. -
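The present-then-decide interaction described above can be illustrated with a short Python sketch. It assumes that the candidates are already available as `(likelihood, structure)` pairs and that the user's response arrives through a callback; the names `decide_structure` and `ask_user` are hypothetical, and returning `None` when every candidate is declined is an assumption, since the text does not say what happens in that case.

```python
def decide_structure(candidates, ask_user):
    """Present candidates from highest to lowest likelihood until one is accepted.

    `candidates` is a list of (likelihood, structure) pairs. `ask_user` is a
    hypothetical callback that returns True for a decision instruction and
    False for a request to present the next candidate.
    """
    ranked = sorted(candidates, key=lambda pair: pair[0], reverse=True)
    for _likelihood, structure in ranked:
        if ask_user(structure):
            return structure  # decided structuring result
    return None  # assumption: every candidate was declined
```

The same loop shape would also fit the later display format decision, with conversion results in place of structuring candidates.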
FIG. 10 is a diagram for explaining an example of extracting candidates of structuring estimation according to the embodiment. As illustrated in FIG. 10, based on the segment information, the structuring result includes information ranging from information of a comprehensive structure level to information of a local structure level. For example, the comprehensive structure level and the local structure level according to the embodiments represent the tier of a document, such as chapters, sections, paragraphs, itemization, and titles and main text, and may be called a “logical structure” or “logical element” of a document. In a hierarchy of structural elements, the comprehensive structure level may represent the highest or broadest structure element/elements while the local structure level may represent the lowest or narrowest structure element/elements. For example, if a document is divided into chapters, sections, and paragraphs, the comprehensive structure level may be the document itself or the chapters, and the local structure level may be the paragraphs. - The display
format converting unit 170 converts the decided structuring result into a display format for user viewing. More particularly, the display format converting unit 170 reads the structuring result decided by the structuring estimating unit 140. Then, the display format converting unit 170 reads a display format conversion model. In the display format conversion model, definition patterns regarding the display format to be used for presentation are written corresponding to the structuring results; and the cascading style sheets (CSS) or the XSL transformations (XSLT) can be used for writing the definition patterns. - Subsequently, the display
format converting unit 170 presents the initial conversion result according to the structuring result and the display format conversion model. In response to that presentation, if a decision instruction is received from the user via the instructing unit 160, then the display format converting unit 170 outputs the conversion result as a summary document. On the other hand, if a decision instruction from the user cannot be obtained (i.e., if a request for presentation of the next candidate is received), then the conversion result having the next highest likelihood is presented. Meanwhile, the presentation of the conversion result either can be output from the terminal device 200 or can be output from the summary generating device 100. -
FIG. 11 is a diagram for explaining an example of candidates of the display format according to the embodiment. As illustrated in FIG. 11, in the case of presenting the information of any one of the structures from among the structuring results, the display is done in the order of date, or the display is done on the basis of personal names, or only the items and the personal names are displayed. -
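The three display candidates mentioned for FIG. 11 can be illustrated with a small Python sketch that renders the same structured records in three ways. The record layout, the sample records (apart from the personal name "Nishiguchi", which appears in the text), and the function names are hypothetical; the embodiment itself describes definition patterns written with CSS or XSLT rather than code like this.

```python
from collections import defaultdict

# Hypothetical structuring result: one record per decided segment group.
RECORDS = [
    {"date": "2015-10-30", "person": "Nishiguchi", "item": "prepare slides"},
    {"date": "2015-10-23", "person": "Tanaka", "item": "book meeting room"},
    {"date": "2015-11-06", "person": "Nishiguchi", "item": "send minutes"},
]


def by_date(records):
    """Display candidate 1: records in the order of date (ISO dates sort as text)."""
    return sorted(records, key=lambda r: r["date"])


def by_person(records):
    """Display candidate 2: items arranged on the basis of personal names."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["person"]].append(r["item"])
    return dict(grouped)


def items_and_persons(records):
    """Display candidate 3: only the items and the personal names."""
    return [(r["item"], r["person"]) for r in records]
```

Each function leaves the underlying records untouched, so a different display candidate can be presented next without recomputing the structuring result.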
FIG. 12 is a flowchart for explaining an exemplary sequence of operations performed by the featural script extracting unit 120 according to the embodiment. As illustrated in FIG. 12, the featural script extracting unit 120 obtains the text information generated by the voice recognizing unit 110 (Step S101). Then, the featural script extracting unit 120 performs morphological analysis with respect to the obtained text information (Step S102). Subsequently, the featural script extracting unit 120 identifies part-of-speech information included in the text information and performs meaning class analysis (Step S103). Then, the featural script extracting unit 120 performs logical element determination with respect to the text information (Step S104). - Subsequently, the featural
script extracting unit 120 performs segment label determination with respect to the text information (Step S105). Then, the featural script extracting unit 120 assigns utterance sections and silent sections, which are detected by the voice recognizing unit 110, as surrounding information (Step S106). Subsequently, the featural script extracting unit 120 detects, as system-originating information, a speaker ID based on the login user of a microphone or the terminal device 200 (Step S107). Then, the featural script extracting unit 120 detects detailed meeting information managed by an external device (Step S108). -
FIG. 13 is a flowchart for explaining an exemplary sequence of operations performed by the segment candidate generating unit 130 according to the embodiment. As illustrated in FIG. 13, the segment candidate generating unit 130 reads the text information generated by the voice recognizing unit 110 and reads the featural script information extracted by the featural script extracting unit 120 (Step S201). Then, the segment candidate generating unit 130 reads the segment label included in each set of featural script information (Step S202). Subsequently, the segment candidate generating unit 130 performs grouping of sets of featural script information that have been read and stored before (Step S203). Then, the segment candidate generating unit 130 performs ordering of the sets of featural script information that have been grouped till that point of time (Step S204). - Subsequently, the segment
candidate generating unit 130 determines whether or not a termination instruction regarding structuring is included in the segment label (Step S205). If a termination instruction regarding structuring is included in the segment label (Yes at Step S205), then the segment candidate generating unit 130 performs ordering of the sets of featural script information that have been grouped (Step S206). However, if a termination instruction regarding structuring is not included in the segment label (No at Step S205), then the system control returns to Step S201. -
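The control flow of Steps S201 to S206 can be sketched in Python as a loop that keeps reading until a termination instruction appears. This is a simplified illustration under several assumptions: each set of featural script information is a dict with a `segment_label` key, grouping is reduced to accumulating the sets in a list, the intermediate ordering of Step S204 is omitted, and the ordering itself is passed in as a function. None of these names come from the source.

```python
def generate_segment_candidates(stream, order):
    """Accumulate featural script sets until a termination label is seen."""
    stored = []
    for featural in stream:                     # Step S201: read the next set
        label = featural.get("segment_label")   # Step S202: read its segment label
        stored.append(featural)                 # Step S203: group with earlier sets (simplified)
        if label == "terminate":                # Step S205: termination instruction included?
            return order(stored)                # Step S206: order the grouped sets
        # No at Step S205: continue with the next set (back to Step S201)
    return stored  # assumption: input ended without a termination instruction
```

Passing `order` as a parameter keeps the loop independent of which ordering method is chosen.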
FIG. 14 is a flowchart for explaining an exemplary sequence of operations performed by the structuring estimating unit 140 according to the embodiment. As illustrated in FIG. 14, the structuring estimating unit 140 reads the ordered sets of segment information generated by the segment candidate generating unit 130 (Step S301). Then, the structuring estimating unit 140 reads a structure estimation model from the structure estimation model 150 (Step S302). Subsequently, based on the segment information and the structure estimation model, the structuring estimating unit 140 assigns combinations and patterns of appearances of the featural script information, and presents the initial structure information (a candidate for structuring) (Step S303). - When a decision instruction is received from the user in response to the presentation of structure information (Yes at Step S304), the
structuring estimating unit 140 assigns the presented candidate as the decided structure information (Step S305). However, if a decision instruction cannot be received from the user in response to the presentation of structure information (i.e., if a request for presentation of the next candidate is received) (No at Step S304), then the structuring estimating unit 140 presents the candidate of the structure information having the next highest score (Step S306). After a candidate is presented, the system control returns to Step S304 and a decision instruction from the user is awaited. -
FIG. 15 is a flowchart for explaining an exemplary sequence of operations performed by the display format converting unit 170 according to the embodiment. As illustrated in FIG. 15, the display format converting unit 170 reads the structuring result decided by the structuring estimating unit 140 (Step S401). Then, the display format converting unit 170 reads the display format conversion model (Step S402). Subsequently, according to the structuring result and the display format conversion model, the display format converting unit 170 presents the initial conversion result (Step S403). - Subsequently, when a decision instruction from the user is received in response to the presentation of the conversion result (Yes at Step S404), the display
format converting unit 170 outputs the conversion result as a summary document (Step S405). However, if a decision instruction from the user cannot be received in response to the presentation of the conversion result (i.e., if a request for presentation of the next candidate is received) (No at Step S404), then the display format converting unit 170 presents the conversion result candidate having the next highest score (Step S406). After the candidate is presented, the system control returns to Step S404 and a decision instruction from the user is awaited. - According to the embodiment, from the result of voice recognition performed with respect to voice data, segments are estimated based on an explicit instruction by the speaker or based on an expression for a structuring request. Then, the segments are rearranged depending on the contents thereof and are presented upon being converted into various display formats. As a result, it becomes possible to cut down on the time and effort required for advance preparation.
- Meanwhile, the
summary generating device 100 according to the embodiment can be implemented using, for example, a general-purpose computer device serving as the basic hardware. The computer programs that are executed contain modules for the constituent elements described above. The computer programs can be provided by recording as installable files or executable files in a computer-readable recording medium such as a compact disk read only memory (CD-ROM), a compact disk recordable (CD-R), or a digital versatile disk (DVD); or can be provided by storing in advance in a ROM. - While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Claims (8)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016054331A JP2017167433A (en) | 2016-03-17 | 2016-03-17 | Summary generation device, summary generation method, and summary generation program |
JP2016-054331 | 2016-03-17 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20170270949A1 (en) | 2017-09-21 |
US10540987B2 (en) | 2020-01-21 |
Family
ID=59855825
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/416,973 Active US10540987B2 (en) | 2016-03-17 | 2017-01-26 | Summary generating device, summary generating method, and computer program product |
Country Status (2)
Country | Link |
---|---|
US (1) | US10540987B2 (en) |
JP (1) | JP2017167433A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11176943B2 (en) | 2017-09-21 | 2021-11-16 | Kabushiki Kaisha Toshiba | Voice recognition device, voice recognition method, and computer program product |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111051074B (en) | 2017-08-31 | 2022-08-16 | 富士胶片株式会社 | Lithographic printing plate precursor and method for producing lithographic printing plate |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5778397A (en) * | 1995-06-28 | 1998-07-07 | Xerox Corporation | Automatic method of generating feature probabilities for automatic extracting summarization |
US20050125216A1 (en) * | 2003-12-05 | 2005-06-09 | Chitrapura Krishna P. | Extracting and grouping opinions from text documents |
US20060167930A1 (en) * | 2004-10-08 | 2006-07-27 | George Witwer | Self-organized concept search and data storage method |
US7272558B1 (en) * | 2006-12-01 | 2007-09-18 | Coveo Solutions Inc. | Speech recognition training method for audio and video file indexing on a search engine |
US20100070276A1 (en) * | 2008-09-16 | 2010-03-18 | Nice Systems Ltd. | Method and apparatus for interaction or discourse analytics |
US20110301945A1 (en) * | 2010-06-04 | 2011-12-08 | International Business Machines Corporation | Speech signal processing system, speech signal processing method and speech signal processing program product for outputting speech feature |
US20130086458A1 (en) * | 2011-09-30 | 2013-04-04 | Sony Corporation | Information processing apparatus, information processing method, and computer readable medium |
US20140016867A1 (en) * | 2012-07-12 | 2014-01-16 | Spritz Technology Llc | Serial text display for optimal recognition apparatus and method |
US20140325407A1 (en) * | 2013-04-25 | 2014-10-30 | Microsoft Corporation | Collection, tracking and presentation of reading content |
US20140350930A1 (en) * | 2011-01-10 | 2014-11-27 | Nuance Communications, Inc. | Real Time Generation of Audio Content Summaries |
US9020808B2 (en) * | 2013-02-11 | 2015-04-28 | Appsense Limited | Document summarization using noun and sentence ranking |
US20150363954A1 (en) * | 2014-06-17 | 2015-12-17 | Spritz Technology, Inc. | Optimized serial text display for chinese and related languages |
US9483532B1 (en) * | 2010-01-29 | 2016-11-01 | Guangsheng Zhang | Text processing system and methods for automated topic discovery, content tagging, categorization, and search |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08194492A (en) | 1995-01-18 | 1996-07-30 | Fujitsu Ltd | Supporting device for preparation of proceeding |
WO2003038662A1 (en) * | 2001-10-31 | 2003-05-08 | University Of Medicine & Dentistry Of New Jersey | Conversion of text data into a hypertext markup language |
US7085771B2 (en) * | 2002-05-17 | 2006-08-01 | Verity, Inc | System and method for automatically discovering a hierarchy of concepts from a corpus of documents |
US20050108630A1 (en) * | 2003-11-19 | 2005-05-19 | Wasson Mark D. | Extraction of facts from text |
JP4649405B2 (en) * | 2004-06-07 | 2011-03-09 | 株式会社日立メディコ | Structured document creation method and apparatus |
US7584103B2 (en) * | 2004-08-20 | 2009-09-01 | Multimodal Technologies, Inc. | Automated extraction of semantic content and generation of a structured document from speech |
JP4011573B2 (en) | 2004-09-10 | 2007-11-21 | 日本電信電話株式会社 | Conference structure grasp support method, apparatus, program, and recording medium storing the program |
JP4521417B2 (en) * | 2007-03-16 | 2010-08-11 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Computer program, method and apparatus for editing a structured electronic document represented by a tree structure in which an object to be processed by a computer constitutes each node |
JP2011501847A (en) * | 2007-10-17 | 2011-01-13 | アイティーアイ・スコットランド・リミテッド | Computer-implemented method |
US8473467B2 (en) * | 2009-01-02 | 2013-06-25 | Apple Inc. | Content profiling to dynamically configure content processing |
WO2011099086A1 (en) | 2010-02-15 | 2011-08-18 | 株式会社 東芝 | Conference support device |
JP5677901B2 (en) * | 2011-06-29 | 2015-02-25 | みずほ情報総研株式会社 | Minutes creation system and minutes creation method |
JP5774459B2 (en) | 2011-12-08 | 2015-09-09 | 株式会社野村総合研究所 | Discourse summary template creation system and discourse summary template creation program |
US9367607B2 (en) * | 2012-12-31 | 2016-06-14 | Facebook, Inc. | Natural-language rendering of structured search queries |
JP6179226B2 (en) | 2013-07-05 | 2017-08-16 | 株式会社リコー | Minutes generating device, minutes generating method, minutes generating program and communication conference system |
KR20150081981A (en) | 2014-01-07 | 2015-07-15 | 삼성전자주식회사 | Apparatus and Method for structuring contents of meeting |
JP6402447B2 (en) * | 2014-01-21 | 2018-10-10 | 株式会社リコー | Support device, support method, and program |
US10176369B2 (en) * | 2016-11-23 | 2019-01-08 | Xerox Corporation | Method and apparatus for generating a summary document |
Also Published As
Publication number | Publication date |
---|---|
US10540987B2 (en) | 2020-01-21 |
JP2017167433A (en) | 2017-09-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11037553B2 (en) | | Learning-type interactive device |
CN108536654B (en) | | Method and device for displaying identification text |
US9514741B2 (en) | | Data shredding for speech recognition acoustic model training under data retention restrictions |
US8170866B2 (en) | | System and method for increasing accuracy of searches based on communication network |
US20180130496A1 (en) | | Method and system for auto-generation of sketch notes-based visual summary of multimedia content |
US20170169822A1 (en) | | Dialog text summarization device and method |
CN107430851B (en) | | Speech presentation device and speech presentation method |
CN111615696A (en) | | Interactive representation of content for relevance detection and review |
JP5639546B2 (en) | | Information processing apparatus and information processing method |
US20200137224A1 (en) | | Comprehensive log derivation using a cognitive system |
WO2016190126A1 (en) | | Information processing device, information processing method, and program |
WO2004072780A2 (en) | | Method for automatic and semi-automatic classification and clustering of non-deterministic texts |
CN110750996B (en) | | Method and device for generating multimedia information and readable storage medium |
US11392791B2 (en) | | Generating training data for natural language processing |
US20200151220A1 (en) | | Interactive representation of content for relevance detection and review |
CN112382295A (en) | | Voice recognition method, device, equipment and readable storage medium |
WO2023129255A1 (en) | | Intelligent character correction and search in documents |
US20220121712A1 (en) | | Interactive representation of content for relevance detection and review |
US10540987B2 (en) | | Summary generating device, summary generating method, and computer program product |
CN111222837A (en) | | Intelligent interviewing method, system, equipment and computer storage medium |
JP2008204274A (en) | | Conversation analysis apparatus and conversation analysis program |
US20190088258A1 (en) | | Voice recognition device, voice recognition method, and computer program product |
KR102464156B1 (en) | | Call center service providing apparatus, method, and program for matching a user and an agent based on the user's status and the agent's status |
US20230069113A1 (en) | | Text Summarization Method and Text Summarization System |
KR20230015489A (en) | | Apparatus for managing minutes and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: KABUSHIKI KAISHA TOSHIBA, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:FUME, KOSEI;ASHIKAWA, TAIRA;ASHIKAWA, MASAYUKI;AND OTHERS;SIGNING DATES FROM 20170223 TO 20170224;REEL/FRAME:041504/0745 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |