CN110460798B - Video interview service processing method, device, terminal and storage medium - Google Patents

Video interview service processing method, device, terminal and storage medium

Info

Publication number
CN110460798B
CN110460798B (application CN201910563766.3A)
Authority
CN
China
Prior art keywords
information
user
voice information
preset
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910563766.3A
Other languages
Chinese (zh)
Other versions
CN110460798A (en)
Inventor
张奕
赵芝松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910563766.3A
Publication of CN110460798A
Application granted
Publication of CN110460798B


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3343Query execution using phonetics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63Querying
    • G06F16/635Filtering based on additional data, e.g. user or group profiles
    • G06F16/637Administration of user profiles, e.g. generation, initialization, adaptation or distribution
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/141Systems for two-way working between two video terminals, e.g. videophone

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A video interview service processing method includes: acquiring information on the users participating in an interview; reading preset voice information from a voice database according to the user information; collecting the voice information output by the current user and matching it against the preset voice information to identify the user currently speaking; converting the voice information output by the user into text information; detecting whether a preset keyword exists in the text information; and, if a preset keyword exists in the text information, acquiring the preset keyword, matching a corresponding script in a script service database according to the preset keyword, and outputting the recommended script. The invention also provides a video interview service processing apparatus, a terminal, and a computer-readable storage medium. The invention enables the interviewing parties to also communicate in text form, and avoids problems such as running out of topics or an awkward atmosphere during the interview caused by an inexperienced interviewer or poor on-the-spot performance.

Description

Video interview service processing method, device, terminal and storage medium
Technical Field
The invention relates to the technical field of data analysis, and in particular to a video interview service processing method, apparatus, terminal, and computer-readable storage medium.
Background
With the progress of technology, remote communication between people has evolved from letters and telegraphs through voice telephony to video telephony. Because video telephony offers good real-time performance and strong interactivity, it has quickly won over a large number of users, and more and more industries have begun to use it for related work; for example, an interviewer and an interviewee can communicate by video call, so that constraints such as geography no longer hinder the exchange between the two parties.
Existing video calls generally only transmit video data and audio data simultaneously; that is, the two parties communicate only by video and audio, and the audio information is not converted into text for text-based communication, so both parties can communicate normally only by listening carefully to each other's speech. During a video call, communication between the interviewer and the interviewee mainly proceeds by the interviewer actively steering the topics and the interviewee replying; when the interviewer is inexperienced or performs poorly on the spot, problems such as running out of topics and an awkward atmosphere easily arise, and the experience is poor.
Disclosure of Invention
In view of the above, there is a need for a video interview service processing method, apparatus, terminal, and computer-readable storage medium that enable the interviewing parties to also communicate in text form, and that avoid problems such as running out of topics or an awkward atmosphere during the interview caused by an inexperienced interviewer or poor on-the-spot performance.
A first aspect of an embodiment of the present invention provides a video interview service processing method, where the video interview service processing method includes:
before a video interview starts, acquiring information on the users participating in the interview, where the user information includes interviewer information and interviewee information;
reading preset voice information from a voice database according to the user information;
during the video interview, collecting the voice information output by the current user, and matching the voice information output by the user against the preset voice information to identify the user currently speaking;
converting the voice information output by the user into text information;
detecting whether a preset keyword exists in the text information; and
if the detection result is that a preset keyword exists in the text information, acquiring the preset keyword, matching a corresponding script in a script service database according to the preset keyword, and outputting the recommended script.
Further, in the video interview service processing method provided in the embodiment of the present invention, before reading preset voice information in a voice database according to the user information, the method further includes:
detecting whether preset voice information corresponding to the user information is stored in the voice database;
if the detection result is that the preset voice information corresponding to the user information is not stored in the voice database, acquiring the voice information of the user to obtain the preset voice information corresponding to the user;
and storing the preset voice information corresponding to the user into the voice database.
Further, in the video interview service processing method provided by the embodiment of the present invention, the matching of the voice information output by the user against the preset voice information to identify the user currently speaking includes:
acquiring the timbre information in the voice information output by the user and in the preset voice information; and
matching the timbre information in the voice information output by the user against the timbre information in the preset voice information to identify the user currently speaking.
Further, in the above video interview service processing method provided by the embodiment of the present invention, before the converting of the voice information output by the user into text information, the method further includes:
acquiring the clarity of the voice information output by the user;
judging whether the clarity of the voice information output by the user meets a preset clarity threshold;
if the judgment result is that the clarity of the voice information output by the user does not meet the preset clarity threshold, determining the reason why the clarity does not meet the preset clarity threshold; and
matching a solution in a solution library according to the reason, and outputting the solution.
Further, in the above video interview service processing method provided by the embodiment of the present invention, the matching of a corresponding script in the script service database according to the preset keyword and the outputting of the recommended script include:
extracting the text information converted from the voice information output by the interviewer;
acquiring the preset keyword in the text information;
determining a script category according to the preset keyword, and acquiring the script list corresponding to the current script category;
acquiring the position, in the script list, of the script currently in progress; and
acquiring the next script following the script currently in progress, and outputting the next script to the interviewer as the recommended script.
Further, in the above video interview service processing method provided by the embodiment of the present invention, after the outputting of the recommended script, the method further includes:
detecting whether the interviewer adopts the recommended script within a preset time interval; and
if the detection result is that the interviewer does not adopt the recommended script, withdrawing the currently recommended script and executing the next round of script recommendation.
Further, in the above video interview service processing method provided in the embodiment of the present invention, the method further includes:
detecting whether a preset sensitive word exists in the output voice information or text information; and
if the detection result is that a preset sensitive word exists in the output voice information or text information, determining the user who output the sensitive word, and outputting an alarm prompt to that user.
A second aspect of the embodiments of the present invention further provides a video interview service processing apparatus, where the video interview service processing apparatus includes:
the user information acquisition module, configured to acquire, before the video interview starts, information on the users participating in the interview, where the user information includes interviewer information and interviewee information;
the voice information reading module, configured to read preset voice information from a voice database according to the user information;
the voice matching module, configured to collect, during the video interview, the voice information output by the current user, and to match the voice information output by the user against the preset voice information to identify the user currently speaking;
the text conversion module, configured to convert the voice information output by the user into text information;
the keyword detection module, configured to detect whether a preset keyword exists in the text information; and
the script recommendation module, configured to acquire the preset keyword when the detection result shows that a preset keyword exists in the text information, to match a corresponding script in a script service database according to the preset keyword, and to output the recommended script.
A third aspect of the embodiments of the present invention further provides a terminal, where the terminal includes a processor, and the processor is configured to implement any one of the above video interview service processing methods when executing a computer program stored in a memory.
The fourth aspect of the embodiments of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the video interview service processing method described above.
Embodiments of the present invention provide a video interview service processing method, apparatus, terminal, and computer-readable storage medium. Before a video interview starts, information on the users participating in the interview is acquired, the user information including interviewer information and interviewee information; preset voice information is read from a voice database according to the user information; during the video interview, the voice information output by the current user is collected and matched against the preset voice information to identify the user currently speaking; the voice information output by the user is converted into text information; whether a preset keyword exists in the text information is detected; and if the detection result is that a preset keyword exists in the text information, the preset keyword is acquired, a corresponding script is matched in a script service database according to the preset keyword, and the recommended script is output. With the embodiments of the present invention, the interviewing parties can also communicate in text form through the speech-to-text service, and the script recommendation service performs script recommendation during the video interview, avoiding problems such as running out of topics or an awkward atmosphere during the interview caused by an inexperienced interviewer or poor on-the-spot performance.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of a video interview service processing method according to a first embodiment of the invention.
Fig. 2 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Fig. 3 is an exemplary functional block diagram of the terminal shown in fig. 2.
Description of the main elements
Terminal 1
Memory 10
Display screen 20
Processor 30
Video interview service processing apparatus 100
User information acquisition module 101
Voice information reading module 102
Voice matching module 103
Text conversion module 104
Keyword detection module 105
Script recommendation module 106
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention; the described embodiments are only some of the embodiments of the present invention, rather than all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Fig. 1 is a flowchart of a video interview service processing method according to a first embodiment of the present invention. The method can be applied to a terminal 1, which can be an intelligent device such as a smartphone, notebook computer, desktop or tablet computer, smartwatch, or personal digital assistant (PDA). As shown in fig. 1, the video interview service processing method can include the following steps:
s101: before a video interview is started, user information participating in the interview is obtained, wherein the user information comprises audit staff information and audited staff information.
In the embodiment, before the video interview is started, user information participating in the interview is acquired, wherein the user information comprises auditor information and audited person information. The auditors and audited persons may be in a one-to-one relationship (i.e., one auditor corresponds to one audited person), a many-to-one relationship (i.e., a plurality of auditors correspond to one audited person), or a many-to-many relationship (a plurality of auditors correspond to a plurality of audited persons). The user information may be information such as a name and a reserved mobile phone number of the user.
S102: read preset voice information from a voice database according to the user information.
In this embodiment, a voice database is provided for storing each interviewer's preset voice information together with the corresponding interviewer information, and each interviewee's preset voice information together with the corresponding interviewee information. The preset voice information may be preset by the end user; for example, a preset voice sample might be: "Hello, nice to meet you!" Before the preset voice information is read from the voice database according to the user information, the method further includes: detecting whether preset voice information corresponding to the user information is stored in the voice database; if the detection result is that no preset voice information corresponding to the user information is stored, collecting the user's voice to obtain the preset voice information corresponding to the user; and storing that preset voice information into the voice database. The preset voice information read from the voice database according to the user information is temporarily staged in a single newly created folder.
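The check-and-store flow described above can be sketched as follows. The dictionary-backed database, the `record_sample` callback, and the per-session cache standing in for the "newly created folder" are all illustrative assumptions, not details from the patent.

```python
def load_preset_voice(voice_db, user_id, record_sample, session_cache):
    """Fetch the user's preset voice sample, recording and storing one if absent.

    voice_db      -- persistent store: user_id -> voice sample (stand-in for the voice database)
    record_sample -- callback that captures a fresh voice sample for a user (assumed)
    session_cache -- per-interview staging area (stand-in for the temporary folder)
    """
    if user_id not in voice_db:
        # No preset voice information stored: collect the user's voice and register it.
        voice_db[user_id] = record_sample(user_id)
    # Stage the sample so later matching only scans this interview's participants.
    session_cache[user_id] = voice_db[user_id]
    return session_cache[user_id]
```

A registered user's sample is returned directly; an unknown user triggers recording and registration before staging.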
S103: during the video interview, collect the voice information output by the current user, and match it against the preset voice information to identify the user currently speaking.
In this embodiment, during the video interview, the voice information output by the current user is collected and matched against the preset voice information to identify the user currently speaking. The matching includes: acquiring the timbre information in the voice information output by the user and in the preset voice information; and matching the timbre information in the voice information output by the user against the timbre information in the preset voice information to identify the user currently speaking. Different users have different timbres, so the identity of the current user can be confirmed from the timbre information in the voice information the user outputs. Because the preset voice information read earlier is staged in a single new folder, obtaining the timbre information of the preset voice information only requires operating on the files in that folder, which speeds up the matching.
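Timbre-based speaker identification of this kind can be sketched as a nearest-match search over per-user feature vectors. A real system would derive the vectors from the audio itself (e.g. spectral features); the vectors and the cosine-similarity criterion below are illustrative assumptions, not the patent's stated method.

```python
import math

def identify_speaker(live_features, preset_db):
    """Return the user whose preset timbre feature vector is most similar
    (by cosine similarity) to the features of the live utterance."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm
    # Only the staged participants of this interview are scanned.
    return max(preset_db, key=lambda user: cosine(live_features, preset_db[user]))
```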
S104: convert the voice information output by the user into text information.
In this embodiment, the voice information output by the user is converted into text information, so that the text and the voice can be displayed synchronously on the display interface. The text information corresponds to the user information: when the current user speaks, the text output by the current user is displayed synchronously on the display interface.
When a user outputs voice information, the voice may be unrecognizable for various reasons, in which case it cannot be accurately converted into text information. Therefore, before the voice information output by the user is converted into text information, the method further includes: acquiring the clarity of the voice information output by the user; judging whether the clarity meets a preset clarity threshold; if the judgment result is that it does not, determining the reason why the clarity of the voice information does not meet the preset clarity threshold; and matching a solution in a solution library according to the reason and outputting the solution. Determining the reason includes one or more of the following: acquiring the volume of the voice information output by the user and judging whether it is below a first preset threshold; acquiring the distance from the user's face to the terminal and judging whether it is above a second preset threshold; or acquiring the speed at which the user outputs voice information and judging whether it is above a third preset threshold. The first, second, and third preset thresholds are all values set by the end user according to experience.
This embodiment also provides a solution library for storing solutions for cases where the clarity of the voice information output by the user fails to meet the preset clarity threshold. Matching a solution in the solution library according to the reason and outputting it includes: if the volume of the voice information output by the user is below the first preset threshold, prompting the user to speak louder; if the distance from the user's face to the terminal is above the second preset threshold, prompting the user to move closer to the terminal; or if the speed at which the user outputs voice information is above the third preset threshold, prompting the user to speak more slowly.
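The three clarity checks and their remediation prompts map naturally onto a small rule table. The concrete threshold values below are placeholders for the operator-set first/second/third preset thresholds, and the units are assumptions for illustration.

```python
def clarity_prompts(volume_db, face_distance_cm, words_per_second,
                    min_volume_db=40.0, max_distance_cm=80.0, max_wps=4.0):
    """Return a remediation prompt for each detected cause of low clarity."""
    prompts = []
    if volume_db < min_volume_db:            # volume below the first preset threshold
        prompts.append("Please speak louder.")
    if face_distance_cm > max_distance_cm:   # face too far from the terminal (second threshold)
        prompts.append("Please move closer to the terminal.")
    if words_per_second > max_wps:           # speaking too fast (third threshold)
        prompts.append("Please speak more slowly.")
    return prompts
```

An empty result means the clarity checks passed and speech-to-text conversion can proceed.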
After the voice information output by the user is converted into text information, the method further includes: acquiring the preset interview information of the video interview, and storing the preset interview information together with the text information so that they can conveniently be reviewed later. The preset interview information includes: the interviewer and interviewee information for the video interview, the subject of the video interview, the time of the video interview, and the IP address and port number of each service used by the video interview. The embodiment of the invention provides a video interview service control page, which lists a preset number of speech-to-text services and script recommendation services (for example, two speech-to-text services and one script recommendation service, the numbers being set according to actual needs) together with their IP addresses and port numbers. On the video interview service control page, a speech-to-text service or script recommendation service can be edited or deleted, and a given service can be marked as the service in use. For example, with two speech-to-text services, a first and a second, each has its own IP address and port number; if the first speech-to-text service is marked as the service in use, its IP address and port number are accessed during the video interview. Configuring the service addresses on the control page makes them more convenient to manage.
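The control page's service registry can be modelled as a list of service records with an in-use flag; the record fields below are assumptions for illustration, not the patent's data model.

```python
def active_service_endpoint(services, kind):
    """Return the (ip, port) of the service of the given kind marked as in use."""
    for svc in services:
        if svc["kind"] == kind and svc.get("in_use"):
            return svc["ip"], svc["port"]
    raise LookupError("no %s service is marked as in use" % kind)
```

During the interview, the terminal would resolve the speech-to-text and script recommendation endpoints once and connect to them.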
S105: detect whether a preset keyword exists in the text information; if the detection result indicates that a preset keyword exists in the text information, execute step S106.
In this embodiment, whether a preset keyword exists in the text information is detected. A preset keyword is a keyword preconfigured to match a script; a corresponding script can be matched in the script service database according to the preset keyword.
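The keyword check itself can be as simple as a substring scan of the transcript against the preset keyword set. This is a minimal sketch; a production system would likely use tokenisation or a multi-pattern automaton instead.

```python
def find_preset_keywords(text, preset_keywords):
    """Return the preset keywords present in the transcribed text,
    ordered by where they first appear."""
    hits = [kw for kw in preset_keywords if kw in text]
    return sorted(hits, key=text.index)
```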
S106: acquire the preset keyword, match a corresponding script in the script service database according to the preset keyword, and output the recommended script.
The script recommendation service recommends scripts to the user according to the voice information the user outputs during the conversation; for example, it recommends to the interviewer the topics that still need to be covered in the interview. The embodiment of the invention provides a script service database in which scripts of different categories are stored. The scripts may be presented as a script list, which shows all the scripts that may be used within one category.
In this embodiment, the preset keyword is acquired, a corresponding script is matched in the script service database according to the preset keyword, and the recommended script is output. The matching of a corresponding script in the script service database according to the preset keyword includes: extracting the text information converted from the voice information output by the interviewer; acquiring the preset keyword in the text information; determining the script category according to the preset keyword and acquiring the script list corresponding to the current category; acquiring the position, in the script list, of the script currently in progress; and acquiring the next script following the script currently in progress and outputting it to the interviewer as the recommended script. The currently recommended script may be highlighted in a preset manner, such as by bolding or highlighting it. It should be understood that in an actual video interview the interviewer does not necessarily follow the order of the scripts in the script list; sometimes scripts are skipped and sometimes new ones are added. For skipped scripts, the method can also remind the interviewer of the scripts not yet used, avoiding interview content that is insufficiently comprehensive because the interviewer is not professional enough. For newly added scripts, the new script can be recorded, and after the interviewer's video interview ends, a prompt box pops up asking the interviewer whether the new script should be added to the script service database.
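The category lookup and next-script step can be sketched with a dictionary-backed script service database; the layout (a keyword-to-category map plus per-category ordered lists) is an illustrative assumption.

```python
def recommend_next_script(keyword, current_script, script_db):
    """Map the keyword to a script category, locate the script currently in
    progress in that category's list, and return the next script as the
    recommendation (None when the list is exhausted)."""
    category = script_db["keyword_to_category"][keyword]
    script_list = script_db["lists"][category]
    idx = script_list.index(current_script)
    return script_list[idx + 1] if idx + 1 < len(script_list) else None
```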
After outputting the recommended talk script, the method further comprises: detecting whether the auditor adopts the recommended talk script within a preset time interval; and if the detection result is that the auditor does not adopt it, canceling the currently recommended talk script and executing the next round of recommendation. The preset time interval is set in advance by the end user, for example 5 seconds.
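The adoption check can be modeled as a simple timer loop. The sketch below is illustrative only: the `was_adopted` and `on_cancel` callbacks, the polling approach, and the default 5-second timeout (matching the example interval above) are assumptions, not details from the patent.

```python
import time

ADOPTION_TIMEOUT = 5.0  # seconds, preset by the end user

def monitor_recommendation(was_adopted, on_cancel,
                           timeout=ADOPTION_TIMEOUT, poll=0.1):
    """Cancel the current recommendation if it is not adopted in time.

    was_adopted: callable returning True once the auditor has used the
                 recommended script (e.g. detected via speech-to-text).
    on_cancel:   callable that withdraws the recommendation and triggers
                 the next recommendation round.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if was_adopted():
            return True          # auditor used the script in time
        time.sleep(poll)
    on_cancel()                  # timed out: cancel and move on
    return False
```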
The method further comprises: detecting whether a preset sensitive word exists in the output voice information or text information; and if the detection result is that a preset sensitive word exists, determining the user information of the user who output the sensitive word and outputting an alarm prompt to that user. The preset sensitive words include uncivil language and other restricted words. The alarm prompt reminds the user to use civil language, which helps maintain order during the interview. Auditors who frequently output sensitive words can be flagged, and the company can arrange for relevant personnel to take corresponding measures toward flagged auditors, such as educational training.
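The sensitive-word check and auditor flagging can be sketched as below. The word list, the flagging threshold, and the returned prompt structure are all invented for illustration; the patent specifies neither.

```python
from collections import Counter
from typing import Optional

SENSITIVE_WORDS = {"damn", "idiot"}   # illustrative restricted-word list
FLAG_THRESHOLD = 3                    # flag an auditor after this many hits

violation_counts = Counter()          # per-user running violation tally

def check_sensitive(user_id: str, text: str) -> Optional[dict]:
    """Return an alarm prompt if the text contains a preset sensitive word.

    Users exceeding FLAG_THRESHOLD total violations are flagged so that
    follow-up measures (e.g. educational training) can be arranged.
    """
    hits = [w for w in SENSITIVE_WORDS if w in text.lower()]
    if not hits:
        return None
    violation_counts[user_id] += len(hits)
    return {
        "user": user_id,
        "words": hits,
        "prompt": "Please use civil language.",
        "flagged": violation_counts[user_id] >= FLAG_THRESHOLD,
    }
```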
The embodiment of the invention provides a video interview service processing method, which comprises: before the video interview starts, acquiring the user information of the interview participants, the user information comprising auditor information and auditee information; reading preset voice information in a voice database according to the user information; during the video interview, collecting the voice information output by the current user and matching it with the preset voice information to obtain the user information of the currently output voice; converting the voice information output by the user into text information; detecting whether a preset keyword exists in the text information; and if the detection result is that a preset keyword exists, acquiring the preset keyword, matching the corresponding talk script in a talk-script service database according to the keyword, and outputting a recommended talk script. With this embodiment, the two interview parties can also communicate in text form through the speech-to-text service, and the talk-script recommendation service performs recommendation during the video interview, avoiding problems such as running out of topics or an awkward atmosphere caused by an auditor who is inexperienced or in poor form during the interview.
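The patent does not specify how the output voice is matched against the preset voice information to identify the current speaker; a common approach is voiceprint comparison. The sketch below assumes that approach: the three-element feature vectors, user names, cosine-similarity measure, and threshold are all invented stand-ins for real acoustic features such as MFCCs.

```python
import math
from typing import Optional

# Preset voiceprint vectors enrolled per user before the interview
# (illustrative stand-ins for real acoustic feature vectors).
PRESET_VOICEPRINTS = {
    "auditor": [0.9, 0.1, 0.3],
    "auditee": [0.2, 0.8, 0.5],
}

def cosine(a, b) -> float:
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def identify_speaker(features, threshold=0.8) -> Optional[str]:
    """Return the enrolled user whose preset voiceprint best matches,
    or None if no similarity exceeds the threshold."""
    best_user, best_score = None, threshold
    for user, preset in PRESET_VOICEPRINTS.items():
        score = cosine(features, preset)
        if score > best_score:
            best_user, best_score = user, score
    return best_user
```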
The above is a detailed description of the method provided by the embodiments of the present invention. The order of execution of the blocks in the flowcharts shown may be changed, and some blocks may be omitted, according to various needs. The following describes the terminal 1 provided in the embodiment of the present invention.
The embodiment of the present invention further provides a terminal 1, which includes a memory 10, a processor 30, and a computer program stored on the memory 10 and executable on the processor 30, where the processor 30 implements the steps of the video interview service processing method described in any of the above embodiments when executing the program.
Fig. 2 is a schematic structural diagram of the terminal 1 according to an embodiment of the present invention. As shown in Fig. 2, the terminal 1 includes a memory 10, and a video interview service processing device 100 is stored in the memory 10. The terminal 1 may be a terminal with an application display function, such as a mobile phone, a tablet computer, or a personal digital assistant. The video interview service processing device 100 can acquire the user information of the interview participants before the video interview starts, the user information comprising auditor information and auditee information; read preset voice information in a voice database according to the user information; during the video interview, collect the voice information output by the current user and match it with the preset voice information to obtain the user information of the currently output voice; convert the voice information output by the user into text information; detect whether a preset keyword exists in the text information; and if the detection result is that a preset keyword exists, acquire the preset keyword, match the corresponding talk script in a talk-script service database according to the keyword, and output a recommended talk script. With this embodiment, the two interview parties can also communicate in text form through the speech-to-text service, and the talk-script recommendation service performs recommendation during the video interview, avoiding problems such as running out of topics or an awkward atmosphere caused by an auditor who is inexperienced or in poor form during the interview.
In this embodiment, the terminal 1 may further include a display 20 and a processor 30. The memory 10 and the display 20 may each be electrically connected to the processor 30.
The memory 10 may be any of various types of memory devices for storing various types of data. For example, it may be the internal memory of the terminal 1, or a memory card externally connected to the terminal 1, such as a flash memory, an SM card (Smart Media Card), or an SD card (Secure Digital Card). The memory 10 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device. The memory 10 is used for storing various types of data, such as the application programs installed in the terminal 1 and the data set and acquired by applying the video interview service processing method.
A display 20 is mounted to the terminal 1 for displaying information.
The processor 30 is used for executing the video interview service processing method and the various types of software installed in the terminal 1, such as an operating system and application display software. The processor 30 includes, but is not limited to, a Central Processing Unit (CPU), a Micro Controller Unit (MCU), and other chips that interpret computer instructions and process data in computer software.
The video interview service processing device 100 may include one or more modules stored in the memory 10 of the terminal 1 and configured to be executed by one or more processors (in this embodiment, one processor 30) to implement embodiments of the invention. For example, referring to Fig. 3, the video interview service processing device 100 may include a user information obtaining module 101, a voice information reading module 102, a voice matching module 103, a text conversion module 104, a keyword detection module 105, and a talk-script recommendation module 106. A module in the embodiments of the present invention is a program segment that performs a specific function; it describes the execution of software in the processor more aptly than a whole program does.
It is understood that, corresponding to the embodiments of the video interview service processing method described above, the terminal 1 may include some or all of the functional modules shown in Fig. 3; the functions of the modules are described in detail below. Terms used in the method embodiments above carry the same meaning in the following description of the modules. For brevity and to avoid repetition, the details are not repeated here.
The user information obtaining module 101 may be configured to obtain, before the video interview starts, the user information of the users participating in the interview, where the user information includes auditor information and auditee information.
The voice information reading module 102 may be configured to read preset voice information in a voice database according to the user information.
The voice matching module 103 may be configured to collect voice information output by a current user during a video interview process, and match the voice information output by the user with the preset voice information to obtain user information of the current output voice information.
The text conversion module 104 may be used to convert the voice information output by the user into text information.
The keyword detection module 105 may be configured to detect whether a preset keyword exists in the text message.
The talk-script recommendation module 106 may be configured to, when the detection result indicates that a preset keyword exists in the text information, obtain the preset keyword, match the corresponding talk script in a talk-script service database according to the preset keyword, and output a recommended talk script.
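As an illustration of how modules 101 through 106 could cooperate, the following is a minimal, heavily simplified stand-in for device 100. Speech recognition and voiceprint matching are stubbed out, and every name and data structure here is invented for the example, not taken from the patent.

```python
class VideoInterviewDevice:
    """Minimal stand-in for device 100, wiring modules 101-106 together."""

    def __init__(self, voice_db, script_db, keywords):
        self.voice_db = voice_db    # user -> preset voice info (module 102)
        self.script_db = script_db  # keyword -> recommended script (module 106)
        self.keywords = keywords    # preset keywords (module 105)

    def process_utterance(self, voiceprint, transcribe):
        # Module 103: identify the speaker against preset voice info
        # (exact-match stub in place of real voiceprint comparison).
        user = next(
            (u for u, v in self.voice_db.items() if v == voiceprint), None)
        # Module 104: convert the speech to text (stubbed as a callable).
        text = transcribe()
        # Module 105: detect preset keywords in the text information.
        found = [k for k in self.keywords if k in text]
        # Module 106: output a recommended script for the first keyword hit.
        script = self.script_db.get(found[0]) if found else None
        return {"user": user, "text": text, "script": script}
```

Module 101 (gathering participant information before the interview) would populate `voice_db` ahead of time; the stub keeps only the per-utterance pipeline.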
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the video interview service processing method in any one of the above embodiments.
If the integrated modules/units of the video interview service processing device 100/terminal 1/computer device are implemented as software functional units and sold or used as stand-alone products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
The processor 30 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor or any conventional processor. The processor 30 is the control center of the video interview service processing device 100/terminal 1, connecting the various parts of the entire device/terminal through various interfaces and lines.
The memory 10 is used for storing the computer programs and/or modules, and the processor 30 implements the various functions of the video interview service processing device 100/terminal 1 by running or executing the computer programs and/or modules stored in the memory 10 and calling the data stored in the memory 10. The memory 10 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function (such as a sound playing function or an image playing function); the data storage area may store data created according to the use of the terminal (such as audio data or a phone book). In addition, the memory 10 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
In the several embodiments provided in the present invention, it should be understood that the disclosed terminal and method can be implemented in other manners. For example, the above-described system implementation is merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be implemented in practice.
It will be evident to those skilled in the art that the embodiments of the present invention are not limited to the details of the foregoing illustrative embodiments, and that the embodiments of the present invention are capable of being embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the embodiments being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. A plurality of units, modules or devices recited in the claims may also be implemented by one and the same unit, module or device by software or hardware.
Although the embodiments of the present invention have been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the embodiments of the present invention.

Claims (9)

1. A video interview service processing method is characterized by comprising the following steps:
before a video interview starts, user information participating in the interview is acquired, wherein the user information comprises auditing personnel information and audited personnel information;
reading preset voice information in a voice database according to the user information;
in the video interview process, acquiring voice information output by a current user, and matching the voice information output by the user with the preset voice information to obtain user information of the current output voice information;
converting voice information output by a user into text information;
detecting whether preset keywords exist in the text information or not;
if the detection result is that a preset keyword exists in the text information, extracting the text information converted from the voice information output by the auditor;
acquiring the preset keyword in the text information;
determining a talk-script category according to the preset keyword, and acquiring a talk-script list corresponding to the current category;
acquiring the position of the talk script currently in progress in the talk-script list;
and acquiring the talk script following the one currently in progress, and outputting it to the auditor as a recommended talk script.
2. The video interview service processing method of claim 1 wherein prior to reading the preset voice information in the voice database based on said user information, said method further comprises:
detecting whether preset voice information corresponding to the user information is stored in the voice database;
if the detection result indicates that the preset voice information corresponding to the user information is not stored in the voice database, acquiring the voice information of the user to obtain the preset voice information corresponding to the user;
and storing the preset voice information corresponding to the user into the voice database.
3. The video interview service processing method of claim 1, wherein the matching of the voice information output by the user with the preset voice information to obtain the user information of the currently output voice information comprises:
acquiring the voice information output by the user and tone information in the preset voice information;
and matching the tone information in the voice information output by the user with the tone information in the preset voice information to obtain the user information of the current output voice information.
4. The video interview service processing method of claim 1 wherein prior to said converting the user output voice information to text information, said method further comprises:
acquiring the definition of voice information output by a user;
judging whether the definition of the voice information output by the user meets a preset definition threshold value or not;
if the judgment result is that the definition of the voice information output by the user does not meet the preset definition threshold, confirming the reason causing that the definition of the voice information output by the user does not meet the preset definition threshold;
and matching the solutions in the solution library according to the reasons, and outputting the solutions.
5. The video interview service processing method of claim 1 wherein after said outputting a recommended talk script, said method further comprises:
detecting whether the auditor adopts the recommended talk script within a preset time interval;
and if the detection result is that the auditor does not adopt the recommended talk script, canceling the currently recommended talk script and executing the next round of talk-script recommendation.
6. The video interview service processing method of claim 1 wherein said method further comprises:
detecting whether preset sensitive words exist in the output voice information or text information;
and if the detection result is that the preset sensitive words exist in the output voice information or the output text information, determining the user information of the output sensitive words, and outputting an alarm prompt to the user.
7. A video interview service processing apparatus, said video interview service processing apparatus comprising:
the user information acquisition module is used for acquiring user information participating in interviews before the video interviews start, wherein the user information comprises auditor information and audited personnel information;
the voice information reading module is used for reading preset voice information in a voice database according to the user information;
the voice matching module is used for acquiring voice information output by a current user in the video interview process and matching the voice information output by the user with the preset voice information to obtain the user information of the current output voice information;
the text conversion module is used for converting the voice information output by the user into text information;
the keyword detection module is used for detecting whether a preset keyword exists in the text information;
the talk-script recommendation module is used for extracting the text information converted from the voice information output by the auditor when the detection result shows that a preset keyword exists in the text information;
acquiring the preset keyword in the text information;
determining a talk-script category according to the preset keyword, and acquiring a talk-script list corresponding to the current category;
acquiring the position of the talk script currently in progress in the talk-script list;
and acquiring the talk script following the one currently in progress, and outputting it to the auditor as a recommended talk script.
8. A terminal characterized in that it comprises a processor for implementing a video interview service processing method according to any one of claims 1 to 6 when executing a computer program stored in a memory.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a video interview service processing method according to any one of claims 1 to 6.
CN201910563766.3A 2019-06-26 2019-06-26 Video interview service processing method, device, terminal and storage medium Active CN110460798B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910563766.3A CN110460798B (en) 2019-06-26 2019-06-26 Video interview service processing method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910563766.3A CN110460798B (en) 2019-06-26 2019-06-26 Video interview service processing method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN110460798A CN110460798A (en) 2019-11-15
CN110460798B true CN110460798B (en) 2022-10-11

Family

ID=68481159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910563766.3A Active CN110460798B (en) 2019-06-26 2019-06-26 Video interview service processing method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN110460798B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361429A (en) * 2023-01-19 2023-06-30 北京伽睿智能科技集团有限公司 Business exception employee management method, system, equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112084318B (en) * 2020-09-25 2024-02-20 支付宝(杭州)信息技术有限公司 Dialogue assistance method, system and device
CN112182197A (en) * 2020-11-09 2021-01-05 北京明略软件***有限公司 Method, device and equipment for recommending dialect and computer readable medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215654A (en) * 2018-10-22 2019-01-15 北京智合大方科技有限公司 The mobile terminal intelligent customer service auxiliary system of Real-time speech recognition and natural language processing

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105162977A (en) * 2015-08-26 2015-12-16 上海银天下科技有限公司 Excuse recommendation method and device
CN108062316A (en) * 2016-11-08 2018-05-22 百度在线网络技术(北京)有限公司 A kind of method and apparatus for aiding in customer service
CN207149252U (en) * 2017-08-01 2018-03-27 安徽听见科技有限公司 Speech processing system
CN109033257A (en) * 2018-07-06 2018-12-18 中国平安人寿保险股份有限公司 Talk about art recommended method, device, computer equipment and storage medium
CN109166572A (en) * 2018-09-11 2019-01-08 深圳市沃特沃德股份有限公司 The method and reading machine people that robot is read
CN109885679A (en) * 2019-01-11 2019-06-14 平安科技(深圳)有限公司 Obtain method, apparatus, computer equipment and the storage medium of preferred words art
CN109902146A (en) * 2019-01-23 2019-06-18 深圳壹账通智能科技有限公司 Credit information acquisition methods, device, terminal and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215654A (en) * 2018-10-22 2019-01-15 北京智合大方科技有限公司 The mobile terminal intelligent customer service auxiliary system of Real-time speech recognition and natural language processing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361429A (en) * 2023-01-19 2023-06-30 北京伽睿智能科技集团有限公司 Business exception employee management method, system, equipment and storage medium
CN116361429B (en) * 2023-01-19 2024-02-02 北京伽睿智能科技集团有限公司 Business exception employee management method, system, equipment and storage medium

Also Published As

Publication number Publication date
CN110460798A (en) 2019-11-15

Similar Documents

Publication Publication Date Title
US10678501B2 (en) Context based identification of non-relevant verbal communications
US11935540B2 (en) Switching between speech recognition systems
US10971153B2 (en) Transcription generation from multiple speech recognition systems
US10388272B1 (en) Training speech recognition systems using word sequences
EP3254453B1 (en) Conference segmentation based on conversational dynamics
US10334384B2 (en) Scheduling playback of audio in a virtual acoustic space
US20200127865A1 (en) Post-conference playback system having higher perceived quality than originally heard in the conference
US10057707B2 (en) Optimized virtual scene layout for spatial meeting playback
US10516782B2 (en) Conference searching and playback of search results
US8880403B2 (en) Methods and systems for obtaining language models for transcribing communications
Przybocki et al. NIST speaker recognition evaluations utilizing the Mixer corpora—2004, 2005, 2006
CA3060748A1 (en) Automated transcript generation from multi-channel audio
CN110460798B (en) Video interview service processing method, device, terminal and storage medium
US20180191912A1 (en) Selective conference digest
US20040064322A1 (en) Automatic consolidation of voice enabled multi-user meeting minutes
US20110004473A1 (en) Apparatus and method for enhanced speech recognition
US20180190266A1 (en) Conference word cloud
US20130253932A1 (en) Conversation supporting device, conversation supporting method and conversation supporting program
CN114514577A (en) Method and system for generating and transmitting a text recording of a verbal communication
CN115831125A (en) Speech recognition method, device, equipment, storage medium and product
CN113037610B (en) Voice data processing method and device, computer equipment and storage medium
Perepelytsia et al. Acoustic compression in Zoom audio does not compromise voice recognition performance
Roshan et al. Capturing important information from an audio conversation
CN112784038A (en) Information identification method, system, computing device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant