CN111128147A - System and method for terminal equipment to automatically access AI multi-turn conversation capability - Google Patents

Publication number
CN111128147A
Authority
CN
China
Prior art keywords
user
speaking
client
intention
ivn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911129150.1A
Other languages
Chinese (zh)
Inventor
李旭滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd
Priority to CN201911129150.1A
Publication of CN111128147A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides a system for a terminal device to automatically access AI multi-turn conversation capability, the system comprising a visual process configuration module, a device end, and an IVN cloud server. The visual process configuration module is used for acquiring a customer's script flow (the scripted dialogue flow configured by the customer) and sending it to the IVN cloud server. The device end is used for acquiring a user's speaking intention and transmitting it to the IVN cloud server as a voice stream. The IVN cloud server is used for receiving the customer's script flow and the user's speaking intention and, according to both, controlling the terminal device to automatically access the AI multi-turn conversation capability, so as to realize multi-turn conversation between the user and the device end. With this scheme, professionals do not need to develop code on a regular basis; the scheme is simple to implement and saves time and labor.

Description

System and method for terminal equipment to automatically access AI multi-turn conversation capability
Technical Field
The invention relates to the technical field of the Internet, and in particular to a system and a method for a terminal device to automatically access AI multi-turn conversation capability.
Background
Artificial intelligence, abbreviated AI, is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Research in the field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. IVN (Interactive Voice Navigation) refers to an intelligent voice navigation system.
At present there is a wide variety of terminal hardware on the market. For terminal hardware to realize AI multi-turn conversation, professionals with specialized knowledge of language and semantics must regularly develop large amounts of code; the logic is complex, and the work is time-consuming and labor-intensive.
Disclosure of Invention
The invention provides a system for a terminal device to automatically access AI multi-turn conversation capability. The system carries out multi-turn conversation by acquiring a customer's script flow and a user's speaking intention, does not require professionals to develop code on a regular basis, is simple to implement, and saves time and labor.
The system for a terminal device to automatically access AI multi-turn conversation capability provided by the invention comprises a visual process configuration module, a device end, and an IVN cloud server;
the visual process configuration module is used for acquiring a customer's script flow and sending the customer's script flow to the IVN cloud server;
the device end is used for acquiring a user's speaking intention and transmitting the user's speaking intention to the IVN cloud server as a voice stream;
the IVN cloud server is used for receiving the customer's script flow and the user's speaking intention, and for controlling the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end.
Preferably, the visual process configuration module includes:
an IVN project visualization submodule, used for configuring the customer's script flow through a script flow editing interface, and for configuring and modifying the script text, audio, and intention of each script node in the customer's script flow; the IVN project visualization submodule is also used for providing a management back end for customer operations.
Preferably, the IVN cloud server includes:
a voice interaction interface service module, used for receiving the user's speaking intention transmitted by the device end; the voice interaction interface service module is further used for sending to the device end the response audio that the IVN cloud server generates for the user's speaking intention;
and a data interface service module, used for receiving the customer's script flow sent by the visual process configuration module.
Preferably, the IVN cloud server includes:
a voice recognition service module, used for performing voice recognition on the user's speaking intention and for performing intelligent sentence segmentation on the recognized speech;
a semantic understanding service module, used for understanding the meaning of the user's speaking intention;
a voice synthesis service module, used for synthesizing audio from the response that the IVN cloud server generates for the user's speaking intention; the voice synthesis service module is also used for audio synthesis with dynamic parameters.
Preferably, the IVN cloud server includes:
a data center service module, used for providing a big data service, analyzing the logs of direct interaction between the user and the device, and identifying the user's intentions;
and a business service module, used for performing a unified jump when the user's behavior is abnormal.
Preferably, the voice interaction interface service module, the data interface service module, the voice recognition service module, the semantic understanding service module, the voice synthesis module, the data center service module, and the business service module included in the IVN cloud server can all undergo horizontal cluster expansion and vertical expansion.
The system for a terminal device to automatically access AI multi-turn conversation capability provided by this embodiment has the following beneficial effects: multi-turn conversation is carried out by acquiring the customer's script flow and the user's speaking intention, so professionals do not need to develop code on a regular basis, and the system is simple to implement and saves time and labor.
The invention also provides a method for a terminal device to automatically access AI multi-turn conversation capability, characterized by comprising the following steps:
acquiring a customer's script flow, and sending the customer's script flow to an IVN cloud server;
acquiring a user's speaking intention, and transmitting the user's speaking intention to the IVN cloud server as a voice stream;
and receiving the customer's script flow and the user's speaking intention, and controlling the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end.
Preferably, in acquiring the customer's script flow, the customer's script flow is configured through a script flow editing interface, and the script text, audio, and intention of each script node in the customer's script flow are configured and modified.
Preferably, controlling the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end, includes:
performing voice recognition on the user's speaking intention;
performing intelligent sentence segmentation on the user's speaking intention;
understanding the meaning of the user's speaking intention.
Preferably, controlling the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end, further includes:
generating a response to the user's speaking intention according to the meaning of the user's speaking intention and the customer's script flow;
and synthesizing audio from the response, and sending the synthesized audio to the device end.
The method for a terminal device to automatically access AI multi-turn conversation capability provided by this embodiment has the following beneficial effects: multi-turn conversation is carried out by acquiring the customer's script flow and the user's speaking intention, so professionals do not need to develop code on a regular basis, and the method is simple to implement and saves time and labor.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
Fig. 1 is a block diagram of a system for a terminal device to automatically access AI multi-turn conversation capability according to an embodiment of the present invention;
Fig. 2 is a block diagram illustrating an example of a system for a terminal device to automatically access AI multi-turn conversation capability according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for a terminal device to automatically access AI multi-turn conversation capability according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Fig. 1 shows a system for a terminal device to automatically access AI multi-turn conversation capability according to an embodiment of the present invention. As shown in Fig. 1, the system includes a visual process configuration module 11, a device end 12, and an IVN cloud server 13;
the visual process configuration module 11 is configured to acquire a customer's script flow and send the customer's script flow to the IVN cloud server 13;
the device end 12 is configured to acquire a user's speaking intention and transmit the user's speaking intention to the IVN cloud server 13 as a voice stream;
the IVN cloud server 13 is configured to receive the customer's script flow and the user's speaking intention, and to control the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end 12.
The working principle of this embodiment is as follows: the visual process configuration module 11 acquires the customer's script flow and sends it to the IVN cloud server 13. The device end 12, i.e., the terminal, acquires the user's speaking intention, where the user is the person using the device end, and transmits the user's speaking intention to the IVN cloud server 13 as a voice stream. The IVN cloud server 13 receives the customer's script flow sent by the visual process configuration module 11 and the user's speaking intention sent by the device end 12, and then realizes multi-turn conversation between the user and the device end 12 according to the customer's script flow and the user's speaking intention. The IVN cloud server 13 can also control the business process according to the customer's script flow and the user's speaking intention.
The beneficial effect of this embodiment is as follows: multi-turn conversation is carried out by acquiring the customer's script flow and the user's speaking intention, so professionals do not need to develop code on a regular basis, and the system is simple to implement and saves time and labor.
In one embodiment, the visual process configuration module 11 includes:
an IVN project visualization submodule 111, used for configuring the customer's script flow through a script flow editing interface, and for configuring and modifying the script text, audio, and intention of each script node in the customer's script flow; the IVN project visualization submodule is also used for providing a management back end for customer operations.
The working principle of this embodiment is as follows: the customer's script flow is configured through the script flow editing interface.
The beneficial effect of this embodiment is as follows: configuring and modifying the script text, audio, and intention of each script node in the customer's script flow through the IVN project visualization submodule greatly improves the efficiency of flow development and makes the customer's business process easier to visualize, and the management back end for customer operations lets the customer conveniently query, in real time, the conversation logs between the user and the device end.
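The script flow described above, with script text, audio, and intention configured per node, can be pictured as a small graph of nodes. The following is a minimal sketch under stated assumptions: the patent does not disclose a data format, so the class names `ScriptNode` and `ScriptFlow`, their fields, and the example nodes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ScriptNode:
    """One node of a customer's script flow (all field names are assumptions)."""
    node_id: str
    text: str                     # script text spoken at this node
    audio: Optional[str] = None   # optional reference to pre-recorded audio
    # maps a recognized user intention to the id of the next node
    transitions: Dict[str, str] = field(default_factory=dict)


@dataclass
class ScriptFlow:
    start: str
    nodes: Dict[str, ScriptNode] = field(default_factory=dict)

    def add_node(self, node: ScriptNode) -> None:
        self.nodes[node.node_id] = node

    def set_text(self, node_id: str, text: str) -> None:
        # modifying a node, as an editing interface might do
        self.nodes[node_id].text = text

    def next_node(self, node_id: str, intention: str) -> Optional[ScriptNode]:
        target = self.nodes[node_id].transitions.get(intention)
        return self.nodes.get(target) if target is not None else None


flow = ScriptFlow(start="greet")
flow.add_node(ScriptNode("greet", "Hello, would you like to hear today's offer?",
                         transitions={"yes": "offer", "no": "bye"}))
flow.add_node(ScriptNode("offer", "Today's offer is a free trial."))
flow.add_node(ScriptNode("bye", "Goodbye."))
flow.set_text("bye", "Thank you, goodbye.")
```

Keeping transitions keyed by intention is one way to let a non-programmer edit the flow: changing a node's text or rewiring an intention does not require touching any dialogue-handling code.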
In one embodiment, the IVN cloud server 13 includes:
a voice interaction interface service module 131, configured to receive the user's speaking intention transmitted by the device end; the voice interaction interface service module is further configured to send to the device end the response audio that the IVN cloud server generates for the user's speaking intention;
a data interface service module 132, configured to receive the customer's script flow sent by the visual process configuration module.
It should be noted that the IVN cloud server provides a unified access interface for customers, namely the data interface service module.
The working principle of this embodiment is as follows: the user's speaking intention is received, the response audio for the user's speaking intention is sent, and the customer's script flow is received.
The beneficial effect of this embodiment is as follows: convenient access for customers is ensured.
In one embodiment, the IVN cloud server 13 includes:
a voice recognition service module 133, configured to perform voice recognition on the user's speaking intention and to perform intelligent sentence segmentation on the recognized speech;
a semantic understanding service module 134, configured to understand the meaning of the user's speaking intention;
a voice synthesis service module 135, configured to synthesize audio from the response that the IVN cloud server generates for the user's speaking intention; the voice synthesis service module is also used for audio synthesis with dynamic parameters.
It should be noted that the response generated by the IVN cloud server for the user's speaking intention may also be produced as text.
The working principle of this embodiment is as follows: voice recognition, intelligent sentence segmentation, and semantic understanding are performed on the user's speaking intention, and voice synthesis is performed on the response generated for the user's speaking intention.
The beneficial effect of this embodiment is as follows: the user obtains a response corresponding to the speaking intention.
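The chain of recognition, sentence segmentation, semantic understanding, and synthesis can be sketched as four small functions. This is only an illustration: the patent names the services but not their algorithms, so the keyword table, the punctuation-based segmentation, and the byte-string stand-ins for audio below are all assumptions rather than the disclosed implementation.

```python
import re
from typing import List

# Hypothetical keyword table standing in for a trained semantic model.
INTENT_KEYWORDS = {
    "affirm": ("yes", "sure", "okay"),
    "deny": ("no", "not now"),
}


def recognize(voice_stream: bytes) -> str:
    """Stand-in for voice recognition: here the 'audio' is just UTF-8 text."""
    return voice_stream.decode("utf-8")


def split_sentences(transcript: str) -> List[str]:
    """Naive sentence segmentation on terminal punctuation."""
    parts = re.split(r"[.!?]+", transcript)
    return [p.strip() for p in parts if p.strip()]


def understand(sentence: str) -> str:
    """Return the first intent whose keyword occurs as a whole word or phrase."""
    lowered = sentence.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(k)}\b", lowered) for k in keywords):
            return intent
    return "unknown"


def synthesize(response_text: str) -> bytes:
    """Stand-in for voice synthesis: returns bytes instead of real audio."""
    return response_text.encode("utf-8")


sentences = split_sentences(recognize(b"Hmm. Yes, tell me more!"))
intents = [understand(s) for s in sentences]
```

Because each stage only consumes the previous stage's output, any one stand-in could be swapped for a real service without changing the others.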
In one embodiment, the IVN cloud server 13 includes:
a data center service module 136, used for providing a big data service, analyzing the logs of direct interaction between the user and the device, and identifying the user's intentions;
and a business service module 137, used for performing a unified jump when the user's behavior is abnormal.
The working principle of this embodiment is as follows: the logs of direct interaction between the user and the device are analyzed, and the user's intentions are identified.
The beneficial effect of this embodiment is as follows: identifying users' intentions helps the customer accurately locate target groups.
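As a rough illustration of screening intentions out of interaction logs, the sketch below counts how often each intention appears and keeps the frequent ones. The log record layout, the function name `screen_intentions`, and the frequency threshold are assumptions made for the example; the patent does not specify how the data center service performs its analysis.

```python
from collections import Counter
from typing import Dict, Iterable, List


def screen_intentions(logs: Iterable[Dict[str, str]], min_count: int = 2) -> List[str]:
    """Return intentions seen at least min_count times, most frequent first."""
    counts = Counter(entry["intention"] for entry in logs if entry.get("intention"))
    return [intent for intent, n in counts.most_common() if n >= min_count]


# Hypothetical log records; the field names are illustrative only.
interaction_logs = [
    {"user": "u1", "intention": "ask_price"},
    {"user": "u2", "intention": "ask_price"},
    {"user": "u3", "intention": "complain"},
    {"user": "u1", "intention": "ask_price"},
    {"user": "u2", "intention": "book_demo"},
    {"user": "u4", "intention": "book_demo"},
]

frequent = screen_intentions(interaction_logs)
```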
In an embodiment, the voice interaction interface service module, the data interface service module, the voice recognition service module, the semantic understanding service module, the voice synthesis module, the data center service module, and the business service module included in the IVN cloud server can all undergo horizontal cluster expansion and vertical expansion.
The working principle of this embodiment is as follows: high availability under load is achieved through multi-instance deployment, which realizes horizontal cluster expansion.
The beneficial effect of this embodiment is as follows: horizontal cluster expansion and vertical expansion of each module ensure that the system can be extended to meet customer requirements, for example by adding a face recognition service or a speech processing service to the integrated modules.
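Multi-instance deployment can be illustrated with a round-robin dispatcher over interchangeable instances of one service. This is a sketch of the general technique only: the class `ServiceCluster`, the instance names, and the round-robin policy are assumptions, and the patent does not state which load-distribution strategy is used. Only horizontal expansion is shown.

```python
from typing import Callable, List


class ServiceCluster:
    """Round-robin dispatch over interchangeable instances of one service."""

    def __init__(self, instances: List[Callable[[str], str]]) -> None:
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._next = 0

    def add_instance(self, instance: Callable[[str], str]) -> None:
        # horizontal expansion: one more instance of the same service
        self._instances.append(instance)

    def call(self, request: str) -> str:
        instance = self._instances[self._next % len(self._instances)]
        self._next += 1
        return instance(request)


def make_instance(name: str) -> Callable[[str], str]:
    """Build a dummy instance that tags each request with its own name."""
    def handle(request: str) -> str:
        return f"{name}:{request}"
    return handle


cluster = ServiceCluster([make_instance("asr-0"), make_instance("asr-1")])
first_round = [cluster.call("a"), cluster.call("b"), cluster.call("c")]
cluster.add_instance(make_instance("asr-2"))
after_scaling = [cluster.call("d"), cluster.call("e"), cluster.call("f")]
```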
In one embodiment, a system for a terminal device to automatically access AI multi-turn conversation capability includes a visual process configuration module, a device end, and an IVN cloud server;
the visual process configuration module comprises an IVN project visualization submodule;
the IVN cloud server comprises a voice interaction interface service module, a data interface service module, a voice recognition service module, a semantic understanding service module, a voice synthesis module, a data center service module, and a business service module.
For example, as shown in Fig. 2, the device end is a terminal device, which may be an intelligent robot, a smart speaker, or the like. The terminal device transmits the user's speaking intention as a voice stream to the IVN cloud server, i.e., the IVN service architecture, and the IVN cloud server sends the response audio or text generated for the user's speaking intention back to the terminal device. The visual process configuration module, i.e., the visual process configuration system, sends the customer's script flow to the IVN cloud server. The voice interaction interface service module, data interface service module, voice recognition service module, semantic understanding service module, voice synthesis module, data center service module, and business service module correspond, respectively, to the voice interaction interface service, data interface service, voice recognition service, semantic understanding service, voice synthesis, data center service, and business service in the IVN service architecture.
Fig. 3 shows a method for a terminal device to automatically access AI multi-turn conversation capability according to an embodiment of the present invention. As shown in Fig. 3, the method may be implemented as steps S31 to S33:
in step S31, acquiring a customer's script flow, and sending the customer's script flow to the IVN cloud server;
in step S32, acquiring a user's speaking intention, and transmitting the user's speaking intention to the IVN cloud server as a voice stream;
in step S33, receiving the customer's script flow and the user's speaking intention, and controlling the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end.
The working principle of this embodiment is as follows: the customer's script flow is acquired and sent to the IVN cloud server; the user's speaking intention is acquired, where the user is the person using the device end, and sent to the IVN cloud server; the customer's script flow and the user's speaking intention are received, multi-turn conversation between the user and the device end is then realized according to them, and the business process is likewise controlled according to them.
The beneficial effect of this embodiment is as follows: multi-turn conversation is carried out by acquiring the customer's script flow and the user's speaking intention, so professionals do not need to develop code on a regular basis, and the method is simple to implement and saves time and labor.
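Steps S31 to S33 can be read as walking a configured flow one user intention at a time. The sketch below assumes a dictionary-based flow format and a shared `fallback` node for unrecognized intentions; both are hypothetical, and the fallback is only one possible reading of the unified jump mentioned for the business service module.

```python
from typing import Dict, List, Tuple

# S31: a customer's script flow as the cloud server might receive it (hypothetical format).
SCRIPT_FLOW: Dict[str, dict] = {
    "greet": {"text": "Would you like to book a visit?",
              "next": {"affirm": "ask_day", "deny": "bye"}},
    "ask_day": {"text": "Which day suits you?", "next": {"give_day": "confirm"}},
    "confirm": {"text": "Your visit is booked.", "next": {}},
    "bye": {"text": "Goodbye.", "next": {}},
    "fallback": {"text": "Sorry, I did not understand.", "next": {}},
}


def run_dialogue(flow: Dict[str, dict], start: str,
                 intentions: List[str]) -> List[Tuple[str, str]]:
    """Walk the flow; each user intention (S32) selects the next node (S33)."""
    node_id = start
    transcript = [(node_id, flow[node_id]["text"])]
    for intention in intentions:
        # unrecognized intentions all jump to one shared fallback node
        node_id = flow[node_id]["next"].get(intention, "fallback")
        transcript.append((node_id, flow[node_id]["text"]))
        if not flow[node_id]["next"]:
            break  # terminal node: the conversation ends
    return transcript


turns = run_dialogue(SCRIPT_FLOW, "greet", ["affirm", "give_day"])
```

The dialogue logic never mentions a specific node, so a customer could replace `SCRIPT_FLOW` with a different configuration without any code change, which is the effect the embodiment claims.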
In one embodiment, in step S31, the customer's script flow is configured through the script flow editing interface, and the script text, audio, and intention of each script node in the customer's script flow are configured and modified.
The working principle of this embodiment is as follows: the customer can configure the script flow through the script flow editing interface, and can also configure and modify the script text, audio, and intention of each script node in the script flow.
The beneficial effect of this embodiment is as follows: the customer's business process becomes easier to visualize.
In one embodiment, step S33 includes:
performing voice recognition on the user's speaking intention;
performing intelligent sentence segmentation on the user's speaking intention;
understanding the meaning of the user's speaking intention.
The working principle of this embodiment is as follows: voice recognition, intelligent sentence segmentation, and semantic understanding are performed on the user's speaking intention, and voice synthesis is performed on the response generated for the user's speaking intention.
The beneficial effect of this embodiment is as follows: the user obtains a response corresponding to the speaking intention.
In an embodiment, step S33 further includes:
generating a response to the user's speaking intention according to the meaning of the user's speaking intention and the customer's script flow;
and synthesizing audio from the response, and sending the synthesized audio to the device end.
The working principle of this embodiment is as follows: a response to the user's speaking intention is generated according to the meaning of the user's speaking intention and the customer's script flow, and the response is synthesized into audio and sent to the device end.
The beneficial effect of this embodiment is as follows: the user's speaking intention can be satisfied.
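Response generation followed by synthesis, including responses that carry dynamic parameters, might look like the following. This is a minimal sketch assuming that dynamic parameters are placeholders filled into a node's script text before synthesis; the `$name` placeholder syntax (Python's `string.Template`) and the byte-string stand-in for audio are choices made for the example, not details from the patent.

```python
from string import Template
from typing import Dict


def generate_response(node_text: str, params: Dict[str, str]) -> str:
    """Fill dynamic parameters into a node's script text."""
    # safe_substitute leaves unknown placeholders untouched instead of raising
    return Template(node_text).safe_substitute(params)


def synthesize(text: str) -> bytes:
    """Placeholder for voice synthesis; real audio generation is out of scope."""
    return text.encode("utf-8")


reply = generate_response("Hello $name, your appointment is on $day.",
                          {"name": "Ms. Li", "day": "Friday"})
audio = synthesize(reply)
```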
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A system for a terminal device to automatically access AI multi-turn conversation capability, characterized by comprising: a visual process configuration module, a device end, and an IVN cloud server;
the visual process configuration module is used for acquiring a customer's script flow and sending the customer's script flow to the IVN cloud server;
the device end is used for acquiring a user's speaking intention and transmitting the user's speaking intention to the IVN cloud server as a voice stream;
the IVN cloud server is used for receiving the customer's script flow and the user's speaking intention, and for controlling the terminal device to automatically access the AI multi-turn conversation capability according to the customer's script flow and the user's speaking intention, so as to realize multi-turn conversation between the user and the device end.
2. The system of claim 1, wherein the visual process configuration module comprises:
an IVN project visualization submodule, used for configuring the customer's script flow through a script flow editing interface, and for configuring and modifying the script text, audio, and intention of each script node in the customer's script flow; the IVN project visualization submodule is also used for providing a management back end for customer operations.
3. The system of claim 1, wherein the IVN cloud server comprises:
a voice interaction interface service module, configured to receive the speaking intention of the user transmitted by the device side; the voice interaction interface service module is further configured to send, to the device side, the response audio generated by the IVN cloud server for the speaking intention of the user;
and a data interface service module, configured to receive the client's dialogue script flow sent by the visual process configuration module.
4. The system of claim 1, wherein the IVN cloud server comprises:
a voice recognition service module, configured to perform voice recognition on the speaking intention of the user and to perform intelligent sentence segmentation on the speaking intention of the user;
a semantic understanding service module, configured to understand the meaning of the speaking intention of the user;
a voice synthesis service, configured to synthesize audio for the response of the IVN cloud server to the speaking intention of the user; the voice synthesis service is further configured to perform dynamic parametric audio synthesis.
5. The system of claim 1, wherein the IVN cloud server comprises:
a data center service module, configured to provide big data services, to analyze the logs of direct interaction between the user and the device, and to screen out the intentions of the user;
and a business service module, configured to perform a unified jump when a user exception occurs.
6. The system of any one of claims 3 to 5, wherein the IVN cloud server comprises a voice interaction interface service module, a data interface service module, a voice recognition service module, a semantic understanding service module, a voice synthesis module, a data center service module and a business service module, all of which support horizontal cluster scaling and vertical scaling.
7. A method for enabling a terminal device to automatically access AI multi-turn conversation capability, characterized by comprising the following steps:
acquiring a client's dialogue script flow, and sending the client's dialogue script flow to an IVN cloud server;
acquiring the speaking intention of a user, and transmitting the speaking intention of the user to the IVN cloud server as a voice stream;
and receiving the client's dialogue script flow and the speaking intention of the user, and controlling the terminal device to automatically access the AI multi-turn conversation capability system according to the client's dialogue script flow and the speaking intention of the user, so as to realize multi-turn conversation between the user and the device side.
8. The method of claim 6, wherein, in acquiring the client's dialogue script flow, the client's dialogue script flow is configured through a script flow editing interface, and the script text, audio and intention of each script node in the client's dialogue script flow are configured and modified.
9. The method of claim 6, wherein controlling the terminal device to automatically access the AI multi-turn conversation capability system according to the client's dialogue script flow and the speaking intention of the user, so as to realize multi-turn conversation between the user and the device side, comprises:
performing voice recognition on the speaking intention of the user;
performing intelligent sentence segmentation on the speaking intention of the user;
understanding the meaning of the speaking intention of the user.
10. The method of claim 9, wherein controlling the terminal device to automatically access the AI multi-turn conversation capability system according to the client's dialogue script flow and the speaking intention of the user, so as to realize multi-turn conversation between the user and the device side, further comprises:
generating a response to the speaking intention of the user according to the meaning of the speaking intention of the user and the client's dialogue script flow;
and synthesizing audio for the response, and sending the synthesized audio to the device side.
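
The method steps recited in claims 7 to 10 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `ScriptNode` schema, the keyword-based `understand` function, `run_dialogue` and the sample `FLOW` are all hypothetical names that do not appear in the patent. Plain text stands in for the voice stream, so voice recognition, sentence segmentation and audio synthesis are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptNode:
    """One node of a client-configured script flow (hypothetical schema)."""
    text: str  # reply text associated with this node
    # Maps a recognized intent label to the id of the next node.
    transitions: dict[str, str] = field(default_factory=dict)


def understand(utterance: str) -> str:
    """Toy stand-in for semantic understanding: whole-word keyword matching."""
    words = set(utterance.lower().split())
    if "yes" in words:
        return "affirm"
    if "no" in words:
        return "deny"
    return "unknown"


def run_dialogue(flow: dict[str, ScriptNode], start: str,
                 utterances: list[str]) -> list[str]:
    """Walk the script flow, one user utterance per turn, collecting replies."""
    node_id = start
    replies = [flow[node_id].text]
    for utterance in utterances:
        intent = understand(utterance)
        # Stay on the current node when the intent has no configured transition.
        node_id = flow[node_id].transitions.get(intent, node_id)
        replies.append(flow[node_id].text)
    return replies


# Assumed example flow; the node ids and wording are invented for illustration.
FLOW = {
    "greet": ScriptNode("Would you like to hear today's offer?",
                        {"affirm": "offer", "deny": "bye"}),
    "offer": ScriptNode("The offer is a free trial. Shall I sign you up?",
                        {"affirm": "bye", "deny": "bye"}),
    "bye": ScriptNode("Thank you, goodbye."),
}

if __name__ == "__main__":
    for line in run_dialogue(FLOW, "greet", ["yes please", "no thanks"]):
        print(line)
```

In this sketch each script node carries its reply text and an intent-to-next-node table, which loosely mirrors the per-node text and intention configuration of claims 2 and 8. Repeating the current node on an unrecognized intent is a design choice of the sketch only.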

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911129150.1A CN111128147A (en) 2019-11-18 2019-11-18 System and method for terminal equipment to automatically access AI multi-turn conversation capability

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911129150.1A CN111128147A (en) 2019-11-18 2019-11-18 System and method for terminal equipment to automatically access AI multi-turn conversation capability

Publications (1)

Publication Number Publication Date
CN111128147A true CN111128147A (en) 2020-05-08

Family

ID=70495969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911129150.1A Pending CN111128147A (en) 2019-11-18 2019-11-18 System and method for terminal equipment to automatically access AI multi-turn conversation capability

Country Status (1)

Country Link
CN (1) CN111128147A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256360A (en) * 2020-09-09 2021-01-22 青岛大学 Intelligent service assistant system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109346076A (en) * 2018-10-25 2019-02-15 三星电子(中国)研发中心 Interactive voice, method of speech processing, device and system
CN109979456A (en) * 2019-04-22 2019-07-05 济南磨刀石信息科技有限公司 A kind of intelligent robot customer service system and its dialogue method based on two dimensional code popularization
CN110209791A (en) * 2019-06-12 2019-09-06 百融云创科技股份有限公司 It is a kind of to take turns dialogue intelligent speech interactive system and device more
CN110442701A (en) * 2019-08-15 2019-11-12 苏州思必驰信息科技有限公司 Voice dialogue processing method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256360A (en) * 2020-09-09 2021-01-22 青岛大学 Intelligent service assistant system
CN112256360B (en) * 2020-09-09 2024-03-29 青岛大学 Intelligent service assistant system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200508