CN113113005B - Voice data processing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113113005B
CN113113005B
Authority
CN
China
Prior art keywords
voice service
current
user
target
service instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110295407.1A
Other languages
Chinese (zh)
Other versions
CN113113005A (en)
Inventor
张磊嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Original Assignee
Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Volkswagen Mobvoi Beijing Information Technology Co Ltd filed Critical Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority to CN202110295407.1A priority Critical patent/CN113113005B/en
Publication of CN113113005A publication Critical patent/CN113113005A/en
Application granted granted Critical
Publication of CN113113005B publication Critical patent/CN113113005B/en

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 — Speech recognition
    • G10L 15/22 — Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 17/00 — Speaker identification or verification techniques
    • G10L 17/22 — Interactive procedures; man-machine interfaces
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 — Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 — Network services
    • H04L 67/52 — Network services specially adapted for the location of the user terminal
    • G10L 2015/223 — Execution procedure of a spoken command
    • G10L 2015/226 — Procedures used during a speech recognition process, e.g. man-machine dialogue, using non-speech characteristics
    • G10L 2015/227 — Procedures using non-speech characteristics of the speaker; human-factor methodology

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application relates to a voice data processing method and apparatus, a computer device, and a storage medium. The method comprises the following steps: receiving a current user instruction corresponding to a current user, the instruction comprising a current area position and a target area position; acquiring a current voice service instance corresponding to the current area position, the instance comprising a current voice service software unit and a corresponding current voice service hardware unit; acquiring a target voice service hardware unit corresponding to the target area position; switching the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance; receiving target user voiceprint features corresponding to the target user located at the target area position; binding the target user voiceprint features with the target voice service instance; and providing voice service for the target user through the target voice service instance. With this method, the accuracy of identifying the voice service object can be improved.

Description

Voice data processing method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for processing voice data, a computer device, and a storage medium.
Background
At present, intelligent voice services in vehicles operate in a single-task mode: they can serve only one person at a time, and customization of the service depends solely on the login account. The hardware likewise relies on the most basic voice hardware architecture (e.g., one display and one set of single-channel or multi-channel microphones). If several people in one vehicle speak at the same time, a single in-vehicle device can neither provide voice service to all of them simultaneously nor tailor its voice service precisely to the characteristics (such as gender or age) of each person.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a voice data processing method, apparatus, computer device, and storage medium in which multiple voice service instances coexist, each individually binding the hardware of one seating area to the voiceprint of a designated user, so that voice services can be provided to multiple users without mutual interference.
A method of processing speech data, the method comprising:
receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position;
acquiring a current voice service instance corresponding to the current area position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit;
acquiring a target voice service hardware unit corresponding to the target area position;
switching the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit;
and receiving target user voiceprint features corresponding to the target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance.
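As a rough, non-authoritative sketch, the five steps above can be expressed as a single routine. All names here (`instances`, `hardware`, the dictionary keys) are hypothetical stand-ins for the patent's software and hardware units, not part of the disclosure:

```python
def process_user_instruction(instances, hardware, instruction, target_voiceprint):
    """Sketch of the claimed method: look up the current instance, switch its
    controlled hardware unit to the target zone's hardware, and bind the target
    user's voiceprint. All data structures are illustrative."""
    current_instance = instances[instruction["current_zone"]]  # current voice service instance
    target_hw = hardware[instruction["target_zone"]]           # target voice service hardware unit
    current_instance["hardware_unit"] = target_hw              # switch the controlled hardware
    current_instance["voiceprint"] = target_voiceprint         # bind the target user's voiceprint
    instances[instruction["target_zone"]] = current_instance   # instance now serves the target zone
    return current_instance

# Example: the driver's instance is redirected to serve the rear-left seat
instances = {"driver": {"software_unit": "svc-0", "hardware_unit": "hw-driver", "voiceprint": "vp-driver"}}
hardware = {"driver": "hw-driver", "rear-left": "hw-rear-left"}
cmd = {"current_zone": "driver", "target_zone": "rear-left"}
target_instance = process_user_instruction(instances, hardware, cmd, "vp-rear-left")
```

Note that the software unit is carried over unchanged; only the hardware binding and voiceprint binding are replaced, which mirrors how the claim constructs the target voice service instance from the current one.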
In one embodiment, before receiving the current user instruction corresponding to the current user, the method includes: when the current vehicle machine corresponding to the current vehicle is detected to be started, obtaining a default voice service instance, where the default voice service instance comprises a default voice service software unit and a corresponding default voice service hardware unit; receiving super user voiceprint features corresponding to a super user; binding the super user voiceprint features with the default voice service instance; and providing voice service for the super user through the default voice service instance, where the super user is the user with the highest authority in the current vehicle.
In one embodiment, receiving the current user instruction corresponding to the current user includes: receiving a first super user instruction corresponding to the super user, where the first super user instruction comprises the current area position where the current user is located; acquiring the current voice service hardware unit corresponding to the current area position; establishing, according to the first super user instruction, an association between the default voice service software unit controlled by the default voice service instance and the current voice service hardware unit to obtain the current voice service instance; receiving current user voiceprint features corresponding to the current user; binding the current user voiceprint features with the current voice service instance; providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, receiving the current user instruction corresponding to the current user includes: receiving a second super user instruction corresponding to the super user, where the second super user instruction comprises the current area position where the current user is located; acquiring the current voice service hardware unit corresponding to the current area position; adding a new voice service software unit according to the second super user instruction and establishing an association between the new voice service software unit and the current voice service hardware unit to obtain the current voice service instance; receiving the current user voiceprint features corresponding to the current user; binding the current user voiceprint features with the current voice service instance; providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, the voice data processing method further includes: receiving, through the target voice service hardware unit controlled by the target voice service instance, a current user sharing instruction corresponding to the target user, where the current user sharing instruction comprises the shared user area position where a shared user is located and the current user shared content; obtaining a corresponding first voice service instance according to the shared user area position, where the first voice service instance comprises a first voice service software unit and a corresponding first voice service hardware unit; copying the current user shared content to the first voice service instance; and displaying the current user shared content to the shared user through the first voice service hardware unit controlled by the first voice service instance.
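The sharing embodiment above can be sketched as follows; the zone names, dictionary keys, and the string "display" stub are all invented for illustration and are not from the patent:

```python
def share_content(instances, sharing_instruction):
    """Sketch of the sharing embodiment: copy the sharing user's content into
    the instance serving the shared user's zone, then render it via that
    instance's hardware unit (stubbed here as a formatted string)."""
    first_instance = instances[sharing_instruction["shared_zone"]]
    first_instance["shared_content"] = sharing_instruction["content"]  # copy content over
    return f"{first_instance['hardware_unit']} shows: {first_instance['shared_content']}"

# Example: the target user shares navigation content with the rear-right seat
instances = {"rear-right": {"software_unit": "svc-1", "hardware_unit": "hw-rear-right"}}
shown = share_content(instances, {"shared_zone": "rear-right", "content": "route to airport"})
```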
In one embodiment, the voice data processing method further includes: the target voice service hardware unit controlled by the target voice service instance receives a current user statement corresponding to a target user, performs voice recognition on the current user statement to obtain a current user field corresponding to the current user statement, determines a target feedback statement corresponding to the current user statement according to the current user field, and responds to the target feedback statement to the target user through the target voice service hardware unit.
In one embodiment, the voice data processing method further includes: receiving a plurality of user input sentences, carrying out voiceprint recognition on each user input sentence to obtain user voiceprint characteristics corresponding to each user input sentence, determining a user voice service instance corresponding to each user input sentence according to each user voiceprint characteristic, and responding to the corresponding user input sentence through the user voice service instance.
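The multi-user embodiment above amounts to dispatching each recognized sentence to the instance whose bound voiceprint matches the speaker. A minimal sketch, with real voiceprint similarity comparison reduced to string equality for brevity (all names are hypothetical):

```python
def route_sentence(instances, sentence_voiceprint):
    """Dispatch a user sentence to the voice service instance whose bound
    voiceprint matches the sentence's recognized voiceprint feature."""
    for instance in instances:
        if instance["voiceprint"] == sentence_voiceprint:
            return instance
    return None  # no instance is bound to this speaker

instances = [
    {"zone": "driver", "voiceprint": "vp-a"},
    {"zone": "rear-left", "voiceprint": "vp-b"},
]
```

Because each instance recognizes only its bound voiceprint, simultaneous sentences from different seats are answered by different instances without interfering with one another.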
A speech data processing apparatus, the apparatus comprising:
the user instruction receiving module is used for receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position;
the current voice service instance acquisition module is used for acquiring a current voice service instance corresponding to the current area position, and the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit;
the voice service hardware unit acquisition module is used for acquiring a target voice service hardware unit corresponding to the target area position;
the target voice service instance generation module is used for switching the current voice service hardware unit controlled by the current voice service instance into a target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises a current voice service software unit and a target voice service hardware unit;
the target voice service instance processing module is used for receiving target user voiceprint features corresponding to the target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance.
A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position;
acquiring a current voice service instance corresponding to the current area position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit;
acquiring a target voice service hardware unit corresponding to the target area position;
switching the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit;
and receiving target user voiceprint features corresponding to the target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position;
acquiring a current voice service instance corresponding to the current area position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit;
acquiring a target voice service hardware unit corresponding to the target area position;
switching the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit;
and receiving target user voiceprint features corresponding to the target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance.
According to the above voice data processing method, apparatus, computer device, and storage medium, the current voice service hardware unit controlled by the current user's voice service instance is swapped for the target voice service hardware unit at the target user's area position, and the instance is rebound to the target user's voiceprint, so that the modified instance can provide voice service for the target user independently, unaffected by the current user. Because each of the multiple voice service instances individually binds the hardware of one area to the voiceprint of a designated user, voice services can be provided to multiple users without interference. By changing the area hardware controlled by the current voice service instance, the voice service object can be switched, and the accuracy of identifying the voice service object is improved.
Drawings
FIG. 1 is a diagram of an application environment for a voice data processing method in one embodiment;
FIG. 2 is a flow chart of a method of processing voice data according to one embodiment;
FIG. 3 is a flow chart of a method of processing voice data according to one embodiment;
FIG. 4 is a flow chart illustrating the steps of receiving a current user command in one embodiment;
FIG. 5 is a flow chart illustrating the steps of receiving a current user command in one embodiment;
FIG. 6 is a flow chart of a method of processing voice data according to one embodiment;
FIG. 7 is a flow chart of a method of processing voice data according to one embodiment;
FIG. 8 is a flow chart of a method of processing voice data in one embodiment;
FIG. 9 is a block diagram of a voice data processing apparatus in one embodiment;
fig. 10 is an internal structural view of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The voice data processing method provided by the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices, and the server 104 may be implemented by a stand-alone server or a server cluster composed of a plurality of servers.
Specifically, the terminal 102 receives a current user instruction corresponding to a current user, where the current user instruction includes a current area position and a target area position, and sends the instruction to the server 104 over the network. The server 104 obtains the current voice service instance corresponding to the current area position, which comprises a current voice service software unit and a corresponding current voice service hardware unit; obtains the target voice service hardware unit corresponding to the target area position; switches the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance comprising the current voice service software unit and the target voice service hardware unit; receives the target user voiceprint features corresponding to the target user located at the target area position; binds the target user voiceprint features with the target voice service instance; and provides voice service for the target user through the target voice service instance.
In one embodiment, as shown in fig. 2, a voice data processing method is provided, and the method is applied to the terminal in fig. 1 for illustration, and includes the following steps:
step 202, receiving a current user instruction corresponding to a current user, where the current user instruction includes a current area position and a target area position.
The terminal may be the vehicle-mounted terminal of the current vehicle. The current vehicle comprises a front-row driver seat, a front-row co-driver seat, a rear-row left position, and a rear-row right position; each area position of the current vehicle is provided with a corresponding voice service hardware unit, which comprises the microphone, loudspeaker, and display screen at that area position.
The current user is a user currently speaking in the current vehicle, the current user can perform voice interaction with a corresponding voice service hardware unit, the current voice service hardware unit corresponding to the current area position where the current user is located receives a current user instruction, and the current user instruction comprises the current area position and the target area position. The current area position refers to a seat where a current user is located in a current vehicle, and the target area position refers to an area position specified in the current vehicle.
The voice interaction between the current user and the current voice service hardware unit corresponding to the current region position can be realized by binding the voice print characteristics of the current user with the current voice service hardware unit in advance, so that the current voice service hardware unit can only recognize the voice of the current user.
Step 204, a current voice service instance corresponding to the current area position is obtained, where the current voice service instance includes a current voice service software unit and a corresponding current voice service hardware unit.
The voice service instance is composed of a voice service hardware unit and a voice service software unit. The voice service hardware unit is the hardware device of the voice service, while the voice service software unit is the software entity that processes the audio data collected by the voice service hardware unit and provides the voice service.
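This composition can be modeled roughly as a pairing of one software unit with one hardware unit plus an optional voiceprint binding. The class and field names below are hypothetical illustrations, not identifiers from the patent:

```python
from dataclasses import dataclass

@dataclass
class HardwareUnit:
    """Per-seat hardware: the microphone, loudspeaker, and display at one zone."""
    zone: str  # e.g. "driver", "front-passenger", "rear-left", "rear-right"

@dataclass
class SoftwareUnit:
    """Software entity that processes audio captured by a hardware unit."""
    name: str

@dataclass
class VoiceServiceInstance:
    """One voice service instance = one software unit plus one hardware unit."""
    software: SoftwareUnit
    hardware: HardwareUnit
    voiceprint: str = ""  # voiceprint feature of the bound user; empty if unbound

# Example: the default instance serving the driver's seat before any binding
driver_instance = VoiceServiceInstance(SoftwareUnit("default"), HardwareUnit("driver"))
```

Keeping the software and hardware units as separate fields is what makes the later "switch" step cheap: only the `hardware` reference changes while the software unit is retained.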
The corresponding voice service instance exists in each area position of the current vehicle, so that the corresponding current voice service instance can be obtained according to the current area position, and the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit.
Step 206, obtaining the target voice service hardware unit corresponding to the target area position.
The target area position is the area position specified by the current user, and the target voice service hardware unit corresponding to the area position specified by the current user is obtained.
And step 208, switching the current voice service hardware unit controlled by the current voice service instance into a target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises a current voice service software unit and a target voice service hardware unit.
The area served by a voice service instance can be switched so that the instance provides voice service for a new service area. Specifically, the voice service hardware unit controlled by the voice service instance is changed, and the user of the new service area is rebound, so that voice service is provided for that user. Concretely, the current voice service hardware unit controlled by the current voice service instance is changed to the target voice service hardware unit; that is, the current voice service software unit is bound with the target voice service hardware unit to obtain the target voice service instance. Voice service is then performed for the target user at the area position of the target voice service hardware unit through the target voice service instance, where the target user is the user located at that area position.
Step 210, receiving a target user voiceprint feature corresponding to the target user where the target area is located, binding the target user voiceprint feature with a target voice service instance, and providing voice service for the target user through the target voice service instance.
After the target voice service instance is obtained, it must be bound to the target user; after binding, the target voice service instance serves the target user alone. Specifically, the target user voiceprint features corresponding to the target user at the target area position are collected: target user audio information is captured through the microphone in the target voice service hardware unit at the target area position, and voiceprint recognition is performed on this audio information by the current voice service software unit associated with the target voice service hardware unit to obtain the target user voiceprint features. These features are bound with the target voice service instance, meaning the target voice service instance recognizes only users matching the target user voiceprint features. Finally, voice service can be provided for the target user through the target voice service instance: user sentences uttered by the target user are collected through the target voice service hardware unit controlled by the target voice service instance, voice recognition is performed on them by the current voice service software unit controlled by the instance, the corresponding user feedback sentence is determined, and the response is delivered through the target voice service hardware unit.
In this voice data processing method, the current voice service hardware unit controlled by the current user's voice service instance is swapped for the target voice service hardware unit at the target user's area position, and the instance is rebound to the target user's voiceprint, so that the modified instance can provide voice service for the target user independently, unaffected by the current user. Because each of the multiple voice service instances individually binds the hardware of one area to the voiceprint of a designated user, voice services can be provided to multiple users without interference. By changing the area hardware controlled by the current voice service instance, the voice service object can be switched, and the accuracy of identifying the voice service object is improved.
In one embodiment, as shown in fig. 3, before receiving the current user instruction corresponding to the current user, the method includes:
step 302, when the current vehicle machine corresponding to the current vehicle is detected to be started, a default voice service instance is obtained, wherein the default voice service instance comprises a default voice service software unit and a corresponding default voice service hardware unit.
Step 304, receiving voice print characteristics of the super user corresponding to the super user, binding the voice print characteristics of the super user with a default voice service instance, and performing voice service for the super user through the default voice service instance, wherein the super user is the user with the highest authority in the current vehicle.
The hardware of each vehicle includes a vehicle machine, i.e., the in-vehicle infotainment head unit installed in the automobile, which functionally enables information communication between people and the vehicle and between the vehicle and the outside world (vehicle to vehicle). First, the current vehicle state of the current vehicle is detected; when the state is "started", the whole vehicle has been powered on, and the default voice service instance is obtained at this time. The default voice service instance may be downloaded from the server to the vehicle-mounted terminal in advance and then obtained locally, or alternatively downloaded from the server in real time. The default voice service instance comprises a default voice service software unit and a corresponding default voice service hardware unit.
The super user is the user with the highest authority in the current vehicle, usually the user driving the vehicle; that is, the driver is the super user of the current vehicle. Super user audio information corresponding to the super user is received, and voiceprint recognition is performed on it by the default voice service software unit to obtain the super user voiceprint features. The super user voiceprint features are bound with the default voice service instance so that the default voice service instance provides voice service for the super user alone. That is, the default voice service instance recognizes only the driver's voice.
If there are multiple users in the current vehicle, the user whose voice is closest to the default voice service hardware unit is determined to be the super user. That is, the default voice service software unit in the default voice service instance can determine which user's voice is closest to the default voice service hardware unit according to the decibel levels of the several users' voices.
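A minimal sketch of this selection rule, assuming per-speaker decibel readings at the default hardware unit are available; the field names and values are invented for illustration:

```python
def pick_super_user(voices):
    """Choose the super user as the speaker whose voice arrives loudest at the
    default voice service hardware unit, i.e. the highest decibel reading."""
    return max(voices, key=lambda v: v["decibels"])["user"]

# Example readings: the driver sits closest to the default hardware unit
readings = [
    {"user": "driver", "decibels": 72.5},
    {"user": "rear-left", "decibels": 58.0},
    {"user": "front-passenger", "decibels": 66.3},
]
```

Using loudness as a proxy for distance is the heuristic the embodiment describes; a production system would presumably combine it with microphone-array localization, which the patent does not specify.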
In one embodiment, as shown in fig. 4, receiving a current user instruction corresponding to a current user includes:
step 402, receiving a first super user instruction corresponding to a super user, where the first super user instruction includes a current area location where a current user is located.
Step 404, obtaining the current voice service hardware unit corresponding to the current region position.
And step 406, establishing an association relation between the default voice service software unit controlled by the default voice service instance and the current voice service hardware unit according to the first super user instruction, and obtaining the current voice service instance.
The first super user instruction is an instruction issued by the super user to bind the current voice service instance to the current user, so that the current voice service instance recognizes only the current user's voice. Specifically, the default voice service hardware unit controlled by the default voice service instance collects the first super user instruction corresponding to the super user, where the first super user instruction includes the current area position where the current user is located.
Further, the current voice service hardware unit at the current area position is obtained, i.e. the voice service hardware device at that position, and, according to the first super user instruction, the default voice service software unit controlled by the default voice service instance is bound to the current voice service hardware unit at the current area position to obtain the current voice service instance.
Step 408, receiving the voice print feature of the current user corresponding to the current user, binding the voice print feature of the current user with the current voice service instance, and providing voice service for the current user through the current voice service instance.
Step 410, the current user instruction corresponding to the current user is collected through the current voice service hardware unit controlled by the current voice service instance.
The current voice service hardware unit in the current voice service instance receives the current user audio information corresponding to the current user, and the current voice service software unit performs voiceprint recognition on that audio information to obtain the current user's voiceprint feature. Further, the current user's voiceprint feature can be bound to the current voice service instance; after binding, the current voice service instance provides voice service only for the current user and is not influenced by other users. The current user instruction corresponding to the current user can therefore be collected through the current voice service hardware unit controlled by the current voice service instance.
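Steps 402 through 410 can be sketched as a small data model. All class, function, and region names below are illustrative assumptions; this is a minimal sketch of the described flow, not the patented implementation.

```python
# Sketch of the fig. 4 flow: the default voice service software unit is
# associated with the hardware unit of the current user's region (steps
# 404-406), and the resulting instance is bound to that user's voiceprint
# (step 408) so it serves only that user.
class VoiceServiceInstance:
    def __init__(self, software_unit, hardware_unit):
        self.software_unit = software_unit
        self.hardware_unit = hardware_unit
        self.bound_voiceprint = None

    def bind_voiceprint(self, voiceprint):
        # After binding, the instance serves only this voiceprint.
        self.bound_voiceprint = voiceprint

    def accepts(self, voiceprint):
        return self.bound_voiceprint == voiceprint

# Assumed region-to-hardware mapping.
HARDWARE_BY_REGION = {"front": "hw-front", "rear_left": "hw-rear-left",
                      "rear_right": "hw-rear-right"}

def build_current_instance(first_super_instruction, default_software_unit):
    region = first_super_instruction["current_region"]
    hw = HARDWARE_BY_REGION[region]                         # step 404
    return VoiceServiceInstance(default_software_unit, hw)  # step 406

inst = build_current_instance({"current_region": "rear_left"}, "sw-default")
inst.bind_voiceprint("voiceprint-B")                        # step 408
print(inst.hardware_unit, inst.accepts("voiceprint-B"), inst.accepts("voiceprint-C"))
# → hw-rear-left True False
```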
In one embodiment, as shown in fig. 5, receiving a current user instruction corresponding to a current user includes:
Step 502, receiving a second super user instruction corresponding to the super user, where the second super user instruction includes the current area position where the current user is located.
Step 504, obtaining the current voice service hardware unit corresponding to the current region position.
And step 506, adding a new voice service software unit according to the second super user instruction, and establishing an association relation between the new voice service software unit and the current voice service hardware unit to obtain the current voice service instance.
The second super user instruction is an instruction issued by the super user to bind the current voice service instance to the current user, so that the current voice service instance recognizes only the current user's voice. Specifically, the default voice service hardware unit controlled by the default voice service instance collects the second super user instruction corresponding to the super user, where the second super user instruction includes the current area position where the current user is located.
Further, the current voice service hardware unit at the current area position is obtained, i.e. the voice service hardware device at that position, a new voice service software unit is added according to the second super user instruction, and the new voice service software unit is bound to the current voice service hardware unit to obtain the current voice service instance.
Step 508, receiving the voice print feature of the current user corresponding to the current user, binding the voice print feature of the current user with the current voice service instance, and providing voice service for the current user through the current voice service instance.
Step 510, collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
The current voice service hardware unit in the current voice service instance receives the current user audio information corresponding to the current user, and the current voice service software unit performs voiceprint recognition on that audio information to obtain the current user's voiceprint feature. Further, the current user's voiceprint feature can be bound to the current voice service instance; after binding, the current voice service instance provides voice service only for the current user and is not influenced by other users. The current user instruction corresponding to the current user can therefore be collected through the current voice service hardware unit controlled by the current voice service instance.
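The fig. 5 variant differs from fig. 4 only in step 506: a new voice service software unit is created rather than reusing the default one. A minimal sketch, with all names (the `sw-normal-N` labels, function names, region map) assumed for illustration:

```python
# Sketch of steps 502-506: allocate a fresh software unit and associate it
# with the hardware unit of the current user's region.
import itertools

_sw_counter = itertools.count(1)

def new_software_unit():
    # step 506: allocate a new voice service software unit ("normal-N")
    return f"sw-normal-{next(_sw_counter)}"

def build_instance_from_second_instruction(region, hardware_by_region):
    hw = hardware_by_region[region]   # step 504: hardware unit for the region
    sw = new_software_unit()          # step 506: new software unit
    return {"software_unit": sw, "hardware_unit": hw, "voiceprint": None}

inst = build_instance_from_second_instruction(
    "rear_right", {"rear_right": "hw-rear-right"})
print(inst["software_unit"], inst["hardware_unit"])
```

The design difference matters later in the document: reusing the default software unit (fig. 4) keeps a single service process, while adding a new unit (fig. 5) is what allows instances like normal-1 and normal-2 to serve different users at the same time.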
In one embodiment, as shown in fig. 6, the voice data processing method further includes:
Step 602, a target voice service hardware unit controlled by a target voice service instance receives a current user sharing instruction corresponding to a target user, where the current user sharing instruction includes a shared user area location where a shared user is located and a current user sharing content.
Step 604, obtaining a corresponding first voice service instance according to the shared user area location, where the first voice service instance includes a first voice service software unit and a corresponding first voice service hardware unit.
Step 606, the current user sharing content is copied to the first voice service instance, and the current user sharing content is displayed for the shared user through the first voice service hardware unit controlled by the first voice service instance.
After the voiceprint feature of the target user is bound to the target voice service instance, the target voice service instance provides voice service only for the target user. Specifically, a current user sharing instruction corresponding to the target user can be received through the target voice service hardware unit controlled by the target voice service instance, where the current user sharing instruction includes the shared user area position where the shared user is located and the current user sharing content. That is, the target user may share content with other users, and the current user sharing content is the content the target user wants to share.
Further, the first voice service instance at the shared user area position is obtained, where the first voice service instance includes a first voice service software unit and an associated first voice service hardware unit. Through the target user's voice request, the target voice service instance interacts with the first voice service instance; specifically, the current user sharing content can be copied directly into the first voice service instance, so that the first voice service hardware unit controlled by the first voice service instance can display it. Sharing between voice service instances in different area positions is thus achieved. For example, the rear-left general user B is browsing a commodity and wants to share it with the rear-right general user C; B requests the share to the rear right by voice, and the commodity then appears on the display of the rear-right general user C.
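The sharing flow of steps 602 through 606 can be sketched as a content copy between two instances. The class, field, and region names below are assumptions used only for illustration:

```python
# Sketch of fig. 6: the instance serving the sharing user copies its current
# content into the instance serving the shared user's region, which then
# displays it on its own hardware unit.
import copy

class Instance:
    def __init__(self, region, hardware_unit):
        self.region = region
        self.hardware_unit = hardware_unit
        self.content = None

    def display(self):
        return f"{self.hardware_unit} shows {self.content}"

def share(instances_by_region, share_instruction):
    source = instances_by_region[share_instruction["from_region"]]
    first = instances_by_region[share_instruction["to_region"]]   # step 604
    first.content = copy.deepcopy(source.content)                 # step 606
    return first.display()

instances = {"rear_left": Instance("rear_left", "hw-rear-left"),
             "rear_right": Instance("rear_right", "hw-rear-right")}
instances["rear_left"].content = {"item": "commodity-123"}
print(share(instances, {"from_region": "rear_left", "to_region": "rear_right"}))
```

The deep copy mirrors the document's later description that the receiving instance holds "a task 1 copy identical to task 1" which the shared user can then operate on independently.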
In one embodiment, as shown in fig. 7, the voice data processing method further includes:
step 702, receiving, by a target voice service hardware unit controlled by the target voice service instance, a current user statement corresponding to the target user.
Step 704, performing voice recognition on the current user sentence to obtain the current user field corresponding to the current user sentence.
Step 706, determining a target feedback statement corresponding to the current user statement according to the current user field.
Step 708, responding to the target user with a target feedback statement through the target voice service hardware unit.
The target voice service instance only provides voice service for the target user, and the target voice service hardware unit controlled by the target voice service instance receives the current user statement corresponding to the target user. The current user statement sent by the target user can be acquired through a microphone in the target voice service hardware unit, or the current user statement input by the target user can be acquired through a screen in the target voice service hardware unit.
Further, voice recognition is performed on the current user sentence to obtain the current user field, i.e. the knowledge field to which the current user sentence belongs. For example, if the current user sentence is "How are the road conditions on the Beijing Second Ring Road?", voice recognition yields the corresponding current user field: navigation.
After the current user field corresponding to the current user sentence is obtained, the target feedback sentence corresponding to the current user sentence can be determined according to that field. The relation between each user field and its feedback sentences can be established in advance, and the target feedback sentence determined from this relation. For example, if the current user field is navigation, the target feedback sentence may be: "Shall I provide you with the best route navigation?" Finally, the target feedback sentence can be returned to the target user through the target voice service hardware unit.
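The field lookup of steps 702 through 708 can be sketched with a keyword table standing in for real speech recognition. The table contents, field names, and feedback strings below are illustrative assumptions, not the patent's actual models:

```python
# Sketch: classify a user sentence into a field via keywords, then look up
# the pre-established field -> feedback-sentence relation (step 706).
DOMAIN_KEYWORDS = {"navigation": ["road conditions", "route", "traffic"],
                   "media": ["play", "song", "cartoon"]}
FEEDBACK_BY_DOMAIN = {"navigation": "Shall I provide you with the best route?",
                      "media": "Shall I start playback?"}

def classify_domain(sentence):
    lowered = sentence.lower()
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return domain
    return "chitchat"

def feedback_for(sentence):
    domain = classify_domain(sentence)      # steps 702-706
    return FEEDBACK_BY_DOMAIN.get(domain, "Sorry, could you rephrase that?")

print(feedback_for("How are the road conditions on the Beijing Second Ring?"))
# → Shall I provide you with the best route?
```

A production system would use a trained intent classifier here; the keyword table only illustrates the pre-built field-to-feedback relation the text describes.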
In one embodiment, as shown in fig. 8, the voice data processing method further includes:
Step 802, a plurality of user input sentences are received.
Step 804, voiceprint recognition is performed on each user input sentence, so as to obtain user voiceprint features corresponding to each user input sentence.
Step 806, determining a user voice service instance corresponding to each user input statement according to each user voiceprint feature.
Step 808, responding to the corresponding user input sentence through the user voice service instance.
If a plurality of users in the current vehicle speak at the same time, a plurality of user input sentences are received. Voiceprint recognition is performed on each user input sentence to obtain the user voiceprint feature corresponding to each sentence. Because each voice service instance is bound to a corresponding user voiceprint feature, the matching user voice service instance can be determined from the voiceprint feature of each input sentence, and finally each user voice service instance responds to its corresponding user input sentence.
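The dispatch of steps 802 through 808 can be sketched as a lookup from voiceprint to bound instance. The names and sample data below are assumptions:

```python
# Sketch of fig. 8: concurrent input sentences are routed to whichever
# instance has the matching bound voiceprint; unbound speakers are ignored.
def dispatch(sentences, instances_by_voiceprint):
    """sentences: list of (voiceprint, text) pairs, one per recognized speaker.
    Returns (instance_id, text) pairs for instances that will respond."""
    responses = []
    for voiceprint, text in sentences:
        instance = instances_by_voiceprint.get(voiceprint)   # step 806
        if instance is not None:
            responses.append((instance, text))               # step 808
    return responses

instances = {"vp-A": "super", "vp-B": "normal-1"}
print(dispatch([("vp-A", "navigate home"), ("vp-B", "play cartoon"),
                ("vp-X", "unbound speaker")], instances))
# → [('super', 'navigate home'), ('normal-1', 'play cartoon')]
```

This is what makes simultaneous multi-user service non-interfering: each sentence reaches exactly the instance bound to its speaker's voiceprint.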
In a specific application scenario, after the vehicle machine of the current vehicle is started, it is in an initial state: a single voice service instance, code-named super, runs in the vehicle machine, and the whole vehicle is in a single-voice-service-unit state. This unique voice service process has super attendant authority and controls the voice service hardware unit group, which at this point comprises the minimum voice service hardware units of all areas of the vehicle (front row, rear left, and rear right, each area having its own set of microphone (mic), speaker, and screen), to receive voice for the whole vehicle. It provides unified voice service for all users but can serve only a single user at a time; when a user initiates a voice request, the system judges the user's direction and calls the nearest minimum voice service hardware unit to interact with the user and provide service.
Further, in the single-voice-service-instance state, the main driving user A requests by voice to start the split mode: user A asks the voice service instance super to perform a splitting service, and the instance requests user A to perform voiceprint binding. User A thereby becomes super user A with dominance authority over the vehicle's voice service instance super. After voiceprint binding is completed, i.e. after the voice service instance super has bound user A's voice, it recognizes only A's voice and is neither woken up by nor provides service to anyone else. The voice service instance super then asks super user A to designate the working area position (e.g. front row, rear left, rear right) for the new voice service instance to be generated, voice service instance normal-1.
Assuming super user A assigns the rear-left area, the vehicle system adds a new voice service instance, normal-1, which controls the minimum voice service hardware unit of the assigned area (rear left) and mainly provides voice service to the general user B in that area. Instance normal-1 requires user B to perform voiceprint binding and then provides voice service only for user B. After this process completes, the voice service instance super controls the minimum voice service hardware unit group formed by the front-row and rear-right minimum units to interact with super user A and provide service, while instance normal-1 controls the rear-left minimum unit to interact with general user B and provide service. Because multiple voice instances exist, the regional hardware is bound separately, and the user voiceprints are specified, the two users can be served independently and simultaneously without interference.
For example: voice service instance normal-1 is originally providing voice service for general user B on the rear left. User B then requests by voice that instance normal-1 provide service for general user C on the rear right. Instance normal-1 begins switching its controlled voice service hardware units: control of the original rear-left minimum hardware unit is handed back to voice service instance super, instance normal-1 takes control of the rear-right minimum hardware unit, initiates a voiceprint-binding request for general user C, and begins providing service to C.
The scenario may be: the rear-left user, a mother, wants to select a cartoon for her son seated on the rear right. The operation is as follows: the rear-left general user B (the mother) initiates a voice search for a cartoon with voice service instance normal-1, which is serving her position. After the cartoon is selected, she asks by voice that instance normal-1 play it for the child, general user C, in the rear-right area. The child is asked whether to start playback; after he answers to confirm, instance normal-1 performs voiceprint binding on general user C and begins serving C.
Voice service instances can also share the voice service content currently in progress; specifically, voice service instances located in different area positions share their current content. For example, voice service instance normal-1 is serving the rear-left general user B with voice service content task 1, while voice service instance normal-2 is serving the rear-right general user C. User B wants to share task 1 with user C; after B's voice request, instance normal-1 interacts with instance normal-2 and copies task 1 to it, so that normal-2 holds a task 1 copy identical to task 1. At this point user C can begin operating on the task 1 copy as his own voice service content.
The scenario may be that the rear-left general user B is browsing a commodity and wants to share it with the rear-right general user C; B initiates the share request to the rear right by voice, and the commodity then appears on the display of the rear-right general user C.
In a specific embodiment, a voice data processing method is provided, which specifically includes the following steps:
1. When the current vehicle machine corresponding to the current vehicle is detected to be started, a default voice service instance is obtained, wherein the default voice service instance comprises a default voice service software unit and a corresponding default voice service hardware unit.
2. And receiving voice print characteristics of the super user corresponding to the super user, binding the voice print characteristics of the super user with a default voice service instance, and performing voice service for the super user through the default voice service instance, wherein the super user is the user with the highest authority in the current vehicle.
3. And receiving a current user instruction corresponding to the current user, wherein the current user instruction comprises a current area position and a target area position.
3-1-1, Receiving a first super user instruction corresponding to the super user, wherein the first super user instruction comprises the current area position of the current user.
3-1-2, Obtaining a current voice service hardware unit corresponding to the current region position.
3-1-3, Establishing an association relation between a default voice service software unit controlled by the default voice service instance and a current voice service hardware unit according to a first super user instruction, and obtaining the current voice service instance.
3-1-4, Receiving the voice print characteristics of the current user corresponding to the current user, binding the voice print characteristics of the current user with the current voice service instance, and providing voice service for the current user through the current voice service instance.
3-1-5, Collecting a current user instruction corresponding to a current user through a current voice service hardware unit controlled by a current voice service instance.
3-2-1, Receiving a second super user instruction corresponding to the super user, wherein the second super user instruction comprises the current area position of the current user.
3-2-2, Obtaining the current voice service hardware unit corresponding to the current region position.
3-2-3, Adding a new voice service software unit according to the second super user instruction, and establishing an association relation between the new voice service software unit and the current voice service hardware unit to obtain the current voice service instance.
3-2-4, Receiving the voice print characteristics of the current user corresponding to the current user, binding the voice print characteristics of the current user with the current voice service instance, and providing voice service for the current user through the current voice service instance.
3-2-5, Collecting a current user instruction corresponding to the current user through a current voice service hardware unit controlled by the current voice service instance.
4. And acquiring a current voice service instance corresponding to the current region position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit.
5. And acquiring a target voice service hardware unit corresponding to the target area position.
6. And switching the current voice service hardware unit controlled by the current voice service instance into a target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises a current voice service software unit and a target voice service hardware unit.
7. And receiving target user voiceprint characteristics corresponding to the target user where the target area is located, binding the target user voiceprint characteristics with a target voice service instance, and providing voice service for the target user through the target voice service instance.
8. And receiving a current user sharing instruction corresponding to the target user through a target voice service hardware unit controlled by the target voice service instance, wherein the current user sharing instruction comprises the shared user area position where the shared user is and the shared content of the current user.
9. And acquiring a corresponding first voice service instance according to the shared user area position, wherein the first voice service instance comprises a first voice service software unit and a corresponding first voice service hardware unit.
10. Copying the shared content of the current user to a first voice service instance, and displaying the shared content of the current user for the shared user through a first voice service hardware unit controlled by the first voice service instance.
11. And receiving a current user statement corresponding to the target user through a target voice service hardware unit controlled by the target voice service instance.
12. And carrying out voice recognition on the current user statement to obtain the current user field corresponding to the current user statement.
13. And determining a target feedback statement corresponding to the current user statement according to the current user field.
14. And responding the target feedback statement to the target user through the target voice service hardware unit.
15. A plurality of user input sentences is received.
16. And carrying out voiceprint recognition on each user input sentence to obtain user voiceprint characteristics corresponding to each user input sentence.
17. And determining the user voice service instance corresponding to each user input statement according to each user voiceprint feature.
18. Responding to the corresponding user input statement through the user voice service instance.
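The core switch of steps 3 through 7 above, in which the current instance keeps its software unit but takes over the target region's hardware unit and then binds the target user's voiceprint, can be sketched as follows. All names are assumed for illustration:

```python
# Sketch of steps 4-7: the software unit travels with the instance, the
# hardware unit is swapped to the target region, and the old voiceprint
# binding is cleared so the target user can bind.
def switch_hardware_unit(instance, target_region, hardware_by_region):
    instance["hardware_unit"] = hardware_by_region[target_region]  # step 6
    instance["voiceprint"] = None  # old binding no longer applies
    return instance

hardware = {"rear_left": "hw-rear-left", "rear_right": "hw-rear-right"}
current = {"software_unit": "sw-normal-1", "hardware_unit": "hw-rear-left",
           "voiceprint": "vp-B"}
target = switch_hardware_unit(current, "rear_right", hardware)
target["voiceprint"] = "vp-C"  # step 7: bind the target user's voiceprint
print(target)
```

Note that clearing the old binding before rebinding is an assumption of this sketch; the patent text only states that the target user's voiceprint is bound to the resulting target voice service instance.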
It should be understood that, although the steps in the above-described flowcharts are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited in their order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and the order of execution of these sub-steps or stages is not necessarily sequential; they may be performed in turn or alternately with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, there is provided a voice data processing apparatus 900 comprising: a user instruction receiving module 902, a current voice service instance obtaining module 904, a voice service hardware unit obtaining module 906, a target voice service instance generating module 908, and a target voice service instance processing module 910, wherein:
the user instruction receiving module 902 is configured to receive a current user instruction corresponding to a current user, where the current user instruction includes a current area location and a target area location.
The current voice service instance obtaining module 904 is configured to obtain a current voice service instance corresponding to a current area location, where the current voice service instance includes a current voice service software unit and a corresponding current voice service hardware unit.
The voice service hardware unit obtaining module 906 is configured to obtain a target voice service hardware unit corresponding to the target area position.
The target voice service instance generating module 908 is configured to switch the current voice service hardware unit controlled by the current voice service instance to a target voice service hardware unit, so as to obtain a target voice service instance, where the target voice service instance includes the current voice service software unit and the target voice service hardware unit.
The target voice service instance processing module 910 is configured to receive a target user voiceprint feature corresponding to a target user where the target area is located, bind the target user voiceprint feature with a target voice service instance, and provide a voice service for the target user through the target voice service instance.
In one embodiment, when detecting that the current vehicle corresponding to the current vehicle is started, the voice data processing device 900 obtains a default voice service instance, where the default voice service instance includes a default voice service software unit and a corresponding default voice service hardware unit, receives a voice print feature of a super user corresponding to the super user, binds the voice print feature of the super user with the default voice service instance, performs voice service for the super user through the default voice service instance, and the super user is a user with the highest authority in the current vehicle.
In one embodiment, the voice data processing apparatus 900 receives a first super user instruction corresponding to a super user, where the first super user instruction includes a current area location where a current user is located, obtains a current voice service hardware unit corresponding to the current area location, establishes an association between a default voice service software unit controlled by a default voice service instance and the current voice service hardware unit according to the first super user instruction, obtains a current voice service instance, receives a voice print feature of the current user corresponding to the current user, binds the voice print feature of the current user with the current voice service instance, provides voice service for the current user through the current voice service instance, and acquires a current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, the user instruction receiving module 902 receives a second super user instruction corresponding to the super user, where the second super user instruction includes a current area location where the current user is located, obtains a current voice service hardware unit corresponding to the current area location, adds a new voice service software unit according to the second super user instruction, establishes an association between the new voice service software unit and the current voice service hardware unit to obtain a current voice service instance, receives a voice print feature of the current user corresponding to the current user, binds the voice print feature of the current user with the current voice service instance, provides voice service for the current user through the current voice service instance, and acquires the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, the voice data processing apparatus 900 receives a current user sharing instruction corresponding to a target user through a target voice service hardware unit controlled by a target voice service instance, where the current user sharing instruction includes the shared user area location where the shared user is located and the current user sharing content, obtains a corresponding first voice service instance according to the shared user area location, copies the current user sharing content to the first voice service instance, and displays the current user sharing content for the shared user through the first voice service hardware unit controlled by the first voice service instance.
In one embodiment, the voice data processing apparatus 900 receives a current user sentence corresponding to a target user through a target voice service hardware unit controlled by a target voice service instance, performs voice recognition on the current user sentence to obtain a current user field corresponding to the current user sentence, determines a target feedback sentence corresponding to the current user sentence according to the current user field, and responds to the target feedback sentence to the target user through the target voice service hardware unit.
In one embodiment, the voice data processing apparatus 900 receives a plurality of user input sentences, performs voiceprint recognition on each user input sentence to obtain the user voiceprint feature corresponding to each user input sentence, determines the user voice service instance corresponding to each user input sentence according to each user voiceprint feature, and responds to the corresponding user input sentence through that user voice service instance. For specific limitations of the voice data processing apparatus, reference may be made to the above limitations of the voice data processing method, which are not repeated here. The respective modules in the above-described voice data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor in the computer device, or stored in software form in a memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a terminal, and an internal structure diagram thereof may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of speech data processing. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structure shown in FIG. 10 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position; acquiring a current voice service instance corresponding to the current area position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit; acquiring a target voice service hardware unit corresponding to the target area position; switching the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit; and receiving target user voiceprint features corresponding to a target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance.
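The instance-switching flow above can be sketched as follows. This is a minimal, hypothetical illustration: the class names (`VoiceServiceInstance`, `VoiceServiceManager`), the zone labels, and the hardware identifiers are all assumptions for the sketch and are not specified by the patent; the key idea it shows is that the software unit (session state) survives the move while the controlled hardware unit is swapped.

```python
# Hypothetical sketch of the instance-switching flow; all names are illustrative.
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class VoiceServiceInstance:
    software_unit: str                # dialogue/session state, e.g. "session-1"
    hardware_unit: str                # seat-zone mic/speaker group, e.g. "hw:front-left"
    voiceprint: Optional[str] = None  # voiceprint bound to this instance, if any

class VoiceServiceManager:
    def __init__(self) -> None:
        # maps a cabin zone position to the instance currently serving it
        self.instances: Dict[str, VoiceServiceInstance] = {}

    def handle_move(self, current_zone: str, target_zone: str,
                    target_voiceprint: str) -> VoiceServiceInstance:
        # 1. acquire the current voice service instance for the current zone
        inst = self.instances.pop(current_zone)
        # 2. switch the controlled hardware unit to the target zone's hardware;
        #    the software unit is kept, so the dialogue state survives the move
        inst.hardware_unit = "hw:" + target_zone
        # 3. bind the target user's voiceprint to the resulting target instance
        inst.voiceprint = target_voiceprint
        self.instances[target_zone] = inst
        return inst

mgr = VoiceServiceManager()
mgr.instances["front-left"] = VoiceServiceInstance("session-1", "hw:front-left")
moved = mgr.handle_move("front-left", "rear-right", "vp-user-a")
print(moved.software_unit, moved.hardware_unit, moved.voiceprint)
```

Keeping the software unit and swapping only the hardware unit is what lets a user change seats without losing an in-progress dialogue.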
In one embodiment, before receiving the current user instruction corresponding to the current user, the method comprises: when it is detected that the in-vehicle head unit corresponding to the current vehicle is started, acquiring a default voice service instance, wherein the default voice service instance comprises a default voice service software unit and a corresponding default voice service hardware unit; and receiving super user voiceprint features corresponding to a super user, binding the super user voiceprint features with the default voice service instance, and providing voice service for the super user through the default voice service instance, wherein the super user is the user with the highest authority in the current vehicle.
In one embodiment, the processor, when executing the computer program, further implements the following steps: receiving a first super user instruction corresponding to the super user, wherein the first super user instruction comprises the current area position of the current user; acquiring a current voice service hardware unit corresponding to the current area position; establishing, according to the first super user instruction, an association relation between the default voice service software unit controlled by the default voice service instance and the current voice service hardware unit, to obtain a current voice service instance; receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, the processor, when executing the computer program, further implements the following steps: receiving a second super user instruction corresponding to the super user, wherein the second super user instruction comprises the current area position of the current user; acquiring a current voice service hardware unit corresponding to the current area position; adding a new voice service software unit according to the second super user instruction, and establishing an association relation between the new voice service software unit and the current voice service hardware unit to obtain a current voice service instance; receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
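The two super-user flows above differ only in where the software unit comes from: the first instruction re-associates the existing default software unit with the hardware unit of the current user's zone, while the second adds a brand-new software unit for that zone. A hypothetical sketch, with all names invented for illustration:

```python
# Hypothetical sketch of the two super-user instruction flows; names are illustrative.
def instance_from_default(default_software_unit: str, zone: str) -> dict:
    # first super user instruction: reuse the default voice service software unit,
    # associating it with the hardware unit of the current user's zone
    return {"software_unit": default_software_unit, "hardware_unit": "hw:" + zone}

def instance_with_new_unit(next_session_id: int, zone: str) -> dict:
    # second super user instruction: add a new voice service software unit and
    # associate it with the same zone hardware unit
    return {"software_unit": f"session-{next_session_id}", "hardware_unit": "hw:" + zone}

shared = instance_from_default("session-default", "rear-left")
private = instance_with_new_unit(2, "rear-left")
print(shared["software_unit"], private["software_unit"])
```

The first flow effectively shares the super user's session with the new zone; the second gives the new zone an independent session.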
In one embodiment, the processor, when executing the computer program, further implements the following steps: receiving, through the target voice service hardware unit controlled by the target voice service instance, a current user sharing instruction corresponding to the target user, wherein the current user sharing instruction comprises the shared user area position where a shared user is located and current user sharing content; acquiring a corresponding first voice service instance according to the shared user area position, wherein the first voice service instance comprises a first voice service software unit and a corresponding first voice service hardware unit; and copying the current user sharing content to the first voice service instance, and displaying the current user sharing content to the shared user through the first voice service hardware unit controlled by the first voice service instance.
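The sharing step above can be sketched as a copy between per-zone instances: the content from the sharer's instance is copied into the instance serving the shared user's zone, which then presents it through its own hardware unit. All names and the string-based "display" are assumptions for illustration only.

```python
# Hypothetical sketch of the content-sharing flow; names are illustrative.
def share_content(instances: dict, shared_zone: str, content: str) -> str:
    # acquire the first voice service instance from the shared user's zone position
    first_instance = instances[shared_zone]
    # copy the current user's sharing content into that instance
    first_instance["content"] = content
    # "display" the content through the first instance's hardware unit (stubbed)
    return f"{first_instance['hardware_unit']} shows: {content}"

instances = {"rear-left": {"hardware_unit": "hw:rear-left", "content": None}}
shown = share_content(instances, "rear-left", "route to the airport")
print(shown)
```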
In one embodiment, the processor, when executing the computer program, further implements the following steps: receiving, through the target voice service hardware unit controlled by the target voice service instance, a current user statement corresponding to the target user; performing voice recognition on the current user statement to obtain the current user field corresponding to the current user statement; determining a target feedback statement corresponding to the current user statement according to the current user field; and returning the target feedback statement to the target user through the target voice service hardware unit.
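The recognition-and-feedback step above amounts to classifying the recognized statement into a user field (domain) and selecting a feedback statement for that field. A minimal sketch; the keyword rules and reply strings are invented placeholders, not part of the patent:

```python
# Hypothetical sketch of field classification and feedback selection.
DOMAIN_KEYWORDS = {
    "navigation": ("navigate", "route", "destination"),
    "music": ("play", "song", "music"),
}
FEEDBACK = {
    "navigation": "Starting navigation.",
    "music": "Playing your music.",
    "chat": "Sorry, could you rephrase that?",  # fallback field
}

def determine_feedback(sentence: str) -> str:
    words = sentence.lower()
    # pick the first field whose keywords match; fall back to "chat"
    domain = next((d for d, kws in DOMAIN_KEYWORDS.items()
                   if any(k in words for k in kws)), "chat")
    return FEEDBACK[domain]

print(determine_feedback("Please navigate to the office"))
```

A production system would use a trained intent classifier rather than keyword matching, but the data flow (statement, field, feedback statement) is the same.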
In one embodiment, the processor, when executing the computer program, further implements the following steps: receiving a plurality of user input sentences; performing voiceprint recognition on each user input sentence to obtain user voiceprint features corresponding to each user input sentence; determining the user voice service instance corresponding to each user input sentence according to each user voiceprint feature; and responding to the corresponding user input sentence through the user voice service instance.
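The multi-user routing step above can be sketched as a lookup from recognized voiceprint to bound instance. The recognizer below is a stub keyed on a speaker tag in the test sentence; in reality this would be an acoustic voiceprint model, and all names here are illustrative assumptions.

```python
# Hypothetical sketch of routing input sentences by voiceprint; names are illustrative.
def route_sentences(sentences, recognize_voiceprint, instances):
    responses = []
    for s in sentences:
        vp = recognize_voiceprint(s)   # voiceprint recognition (stubbed below)
        inst = instances[vp]           # instance bound to that voiceprint
        responses.append((inst, s))    # each instance answers its own user
    return responses

instances = {"vp-a": "instance-front", "vp-b": "instance-rear"}
# stub recognizer: the speaker tag is encoded in the sentence text itself
recognizer = lambda s: "vp-a" if s.startswith("A:") else "vp-b"
routed = route_sentences(["A: open window", "B: play music"], recognizer, instances)
print(routed)
```

Binding voiceprints to instances is what lets several occupants talk over each other without their sessions mixing.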
In one embodiment, a computer readable storage medium is provided, having a computer program stored thereon which, when executed by a processor, performs the following steps: receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position; acquiring a current voice service instance corresponding to the current area position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit; acquiring a target voice service hardware unit corresponding to the target area position; switching the current voice service hardware unit controlled by the current voice service instance to the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit; and receiving target user voiceprint features corresponding to a target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance.
In one embodiment, before receiving the current user instruction corresponding to the current user, the method comprises: when it is detected that the in-vehicle head unit corresponding to the current vehicle is started, acquiring a default voice service instance, wherein the default voice service instance comprises a default voice service software unit and a corresponding default voice service hardware unit; and receiving super user voiceprint features corresponding to a super user, binding the super user voiceprint features with the default voice service instance, and providing voice service for the super user through the default voice service instance, wherein the super user is the user with the highest authority in the current vehicle.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: receiving a first super user instruction corresponding to the super user, wherein the first super user instruction comprises the current area position of the current user; acquiring a current voice service hardware unit corresponding to the current area position; establishing, according to the first super user instruction, an association relation between the default voice service software unit controlled by the default voice service instance and the current voice service hardware unit, to obtain a current voice service instance; receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: receiving a second super user instruction corresponding to the super user, wherein the second super user instruction comprises the current area position of the current user; acquiring a current voice service hardware unit corresponding to the current area position; adding a new voice service software unit according to the second super user instruction, and establishing an association relation between the new voice service software unit and the current voice service hardware unit to obtain a current voice service instance; receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: receiving, through the target voice service hardware unit controlled by the target voice service instance, a current user sharing instruction corresponding to the target user, wherein the current user sharing instruction comprises the shared user area position where a shared user is located and current user sharing content; acquiring a corresponding first voice service instance according to the shared user area position, wherein the first voice service instance comprises a first voice service software unit and a corresponding first voice service hardware unit; and copying the current user sharing content to the first voice service instance, and displaying the current user sharing content to the shared user through the first voice service hardware unit controlled by the first voice service instance.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: receiving, through the target voice service hardware unit controlled by the target voice service instance, a current user statement corresponding to the target user; performing voice recognition on the current user statement to obtain the current user field corresponding to the current user statement; determining a target feedback statement corresponding to the current user statement according to the current user field; and returning the target feedback statement to the target user through the target voice service hardware unit.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: receiving a plurality of user input sentences; performing voiceprint recognition on each user input sentence to obtain user voiceprint features corresponding to each user input sentence; determining the user voice service instance corresponding to each user input sentence according to each user voiceprint feature; and responding to the corresponding user input sentence through the user voice service instance.
Those skilled in the art will appreciate that all or part of the processes in the above method embodiments may be implemented by a computer program instructing the relevant hardware; the computer program may be stored on a non-volatile computer readable storage medium and, when executed, may comprise the processes of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.
The foregoing embodiments express only a few implementations of the application, and their description is specific and detailed, but they are not therefore to be construed as limiting the scope of the application. It should be noted that several variations and modifications may be made by those of ordinary skill in the art without departing from the concept of the application, all of which fall within the protection scope of the application. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims (9)

1. A method of speech data processing, the method comprising:
Receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position;
Acquiring a current voice service instance corresponding to a current area position, wherein the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit;
acquiring a target voice service hardware unit corresponding to the target area position;
Switching the current voice service hardware unit controlled by the current voice service instance into the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit;
Receiving target user voiceprint features corresponding to a target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance;
The receiving the current user instruction corresponding to the current user comprises the following steps:
Receiving a first super user instruction corresponding to a super user, wherein the first super user instruction comprises the current area position of a current user; the super user is the user with the highest authority in the current vehicle; super user voiceprint features corresponding to the super user are bound with a default voice service instance in advance, and the default voice service instance comprises a default voice service software unit;
acquiring a current voice service hardware unit corresponding to the current region position;
Establishing an association relation between a default voice service software unit controlled by the default voice service instance and the current voice service hardware unit according to the first super user instruction to obtain a current voice service instance;
Receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance;
and collecting a current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
2. The method according to claim 1, wherein before receiving the current user instruction corresponding to the current user, the method comprises:
When the current vehicle machine corresponding to the current vehicle is detected to be started, acquiring a default voice service instance, wherein the default voice service instance further comprises a default voice service hardware unit corresponding to the default voice service software unit;
And receiving the super user voiceprint features corresponding to the super user, binding the super user voiceprint features with the default voice service instance, and providing voice service for the super user through the default voice service instance.
3. The method according to claim 2, wherein receiving the current user instruction corresponding to the current user includes:
receiving a second super user instruction corresponding to the super user, wherein the second super user instruction comprises the current area position of the current user;
acquiring a current voice service hardware unit corresponding to the current region position;
adding a new voice service software unit according to the second super user instruction, and establishing an association relation between the new voice service software unit and the current voice service hardware unit to obtain a current voice service instance;
Receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance; and collecting a current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
4. The method according to claim 1, wherein the method further comprises:
Receiving, through the target voice service hardware unit controlled by the target voice service instance, a current user sharing instruction corresponding to the target user, wherein the current user sharing instruction comprises the shared user area position where a shared user is located and current user sharing content;
Acquiring a corresponding first voice service instance according to the shared user area position, wherein the first voice service instance comprises a first voice service software unit and a corresponding first voice service hardware unit;
copying the shared content of the current user to the first voice service instance, and displaying the shared content of the current user for the shared user through a first voice service hardware unit controlled by the first voice service instance.
5. The method according to claim 1, wherein the method further comprises:
Receiving a current user statement corresponding to the target user through a target voice service hardware unit controlled by the target voice service instance;
performing voice recognition on the current user statement to obtain a current user field corresponding to the current user statement;
determining a target feedback statement corresponding to the current user statement according to the current user field;
And returning the target feedback statement to the target user through the target voice service hardware unit.
6. The method according to claim 1, wherein the method further comprises:
receiving a plurality of user input sentences;
performing voiceprint recognition on each user input sentence to obtain user voiceprint features corresponding to each user input sentence;
Determining the user voice service instance corresponding to each user input sentence according to each user voiceprint feature;
Responding to the corresponding user input statement through the user voice service instance.
7. A voice data processing apparatus, the apparatus comprising:
The user instruction receiving module is used for receiving a current user instruction corresponding to a current user, wherein the current user instruction comprises a current area position and a target area position;
The current voice service instance acquisition module is used for acquiring a current voice service instance corresponding to the current area position, and the current voice service instance comprises a current voice service software unit and a corresponding current voice service hardware unit;
the voice service hardware unit acquisition module is used for acquiring a target voice service hardware unit corresponding to the target area position;
The target voice service instance generation module is used for switching the current voice service hardware unit controlled by the current voice service instance into the target voice service hardware unit to obtain a target voice service instance, wherein the target voice service instance comprises the current voice service software unit and the target voice service hardware unit;
The target voice service instance processing module is used for receiving target user voiceprint features corresponding to a target user located at the target area position, binding the target user voiceprint features with the target voice service instance, and providing voice service for the target user through the target voice service instance;
The voice data processing device is also used for receiving a first super user instruction corresponding to a super user, wherein the first super user instruction comprises the current area position of the current user; the super user is the user with the highest authority in the current vehicle; super user voiceprint features corresponding to the super user are bound with a default voice service instance in advance, and the default voice service instance comprises a default voice service software unit; acquiring a current voice service hardware unit corresponding to the current area position; establishing, according to the first super user instruction, an association relation between the default voice service software unit controlled by the default voice service instance and the current voice service hardware unit, to obtain a current voice service instance; receiving current user voiceprint features corresponding to the current user, binding the current user voiceprint features with the current voice service instance, and providing voice service for the current user through the current voice service instance; and collecting the current user instruction corresponding to the current user through the current voice service hardware unit controlled by the current voice service instance.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202110295407.1A 2021-03-19 2021-03-19 Voice data processing method, device, computer equipment and storage medium Active CN113113005B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110295407.1A CN113113005B (en) 2021-03-19 2021-03-19 Voice data processing method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110295407.1A CN113113005B (en) 2021-03-19 2021-03-19 Voice data processing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113113005A CN113113005A (en) 2021-07-13
CN113113005B true CN113113005B (en) 2024-06-18

Family

ID=76711971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110295407.1A Active CN113113005B (en) 2021-03-19 2021-03-19 Voice data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113113005B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326307A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Language interaction method
CN106683673A (en) * 2016-12-30 2017-05-17 智车优行科技(北京)有限公司 Method, device and system for adjusting driving modes and vehicle
CN108133707A (en) * 2017-11-30 2018-06-08 百度在线网络技术(北京)有限公司 A kind of content share method and system
CN111739540A (en) * 2020-07-20 2020-10-02 天域全感音科技有限公司 Audio signal acquisition device, computer equipment and method
CN111886647A (en) * 2018-03-29 2020-11-03 松下知识产权经营株式会社 Speech processing apparatus, speech processing method, and speech processing system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009020423A (en) * 2007-07-13 2009-01-29 Fujitsu Ten Ltd Speech recognition device and speech recognition method
CN104658536A (en) * 2015-03-09 2015-05-27 深圳酷派技术有限公司 Recording mode switching method, recording mode switching system and terminal
US11437020B2 (en) * 2016-02-10 2022-09-06 Cerence Operating Company Techniques for spatially selective wake-up word recognition and related systems and methods
CN106023983B (en) * 2016-04-27 2019-11-05 Oppo广东移动通信有限公司 Multi-user voice exchange method and device based on Virtual Reality scene
CN205769175U (en) * 2016-06-30 2016-12-07 扬州航盛科技有限公司 A kind of based on Application on Voiceprint Recognition with the vehicle control syetem of speech recognition
CN107767864B (en) * 2016-08-23 2021-06-29 阿里巴巴集团控股有限公司 Method and device for sharing information based on voice and mobile terminal
JP2019176430A (en) * 2018-03-29 2019-10-10 トヨタ自動車株式会社 Voice recognition device
CN112364143A (en) * 2020-11-13 2021-02-12 苏州思必驰信息科技有限公司 Intelligent multi-round interaction method and system


Also Published As

Publication number Publication date
CN113113005A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
CN103688518B (en) For the method shown, device, computer and mobile equipment and the vehicle with this device
CN110149402A (en) Communication channel is provided between the example of automation assistant
US10990618B2 (en) Computer-implemented method for question answering system
CN103373292A (en) In-car information system, information terminal, and application execution method
CN111128170B (en) Voice vehicle access method, device, system and storage medium
CN111367601A (en) Display device, display method, and storage medium
CN111209032A (en) Gray scale publishing method, configuration center server, user terminal, system and medium
CN109080564A (en) On-vehicle information entertainment control system and method
CN115640059B (en) Autopilot operating system, electronic device, and storage medium
WO2023165326A1 (en) Vehicle-based information exchange method, head unit end, storage medium, and vehicle
EP3393155B1 (en) Position-based service information prompting method for shared vehicle, system and app device
CN113811851A (en) User interface coupling
CN113113005B (en) Voice data processing method, device, computer equipment and storage medium
CN106161356A (en) Method and system by client quick registration website
CN110764724A (en) Display equipment control method, device, equipment and storage medium
CN113687757B (en) Agent control device, agent control method, and non-transitory recording medium
CN113206861B (en) Information processing apparatus, information processing method, and recording medium
CN113687731B (en) Agent control device, agent control method, and non-transitory recording medium
JP7310706B2 (en) AGENT CONTROL DEVICE, AGENT CONTROL METHOD, AND AGENT CONTROL PROGRAM
JP6496220B2 (en) Information distribution apparatus and information distribution program
CN111475233B (en) Information acquisition method, graphic code generation method and device
JP2021182218A (en) Agent control apparatus, agent control method, and agent control program
US20200219508A1 (en) Method for commanding a plurality of virtual personal assistants and associated devices
KR102030019B1 (en) Vehicle controlling apparatus using AVN device
CN114978700A (en) Vehicle-mounted button event response method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant