CN112102821A

CN112102821A - Data processing method, device, system and medium applied to electronic equipment

Info

Publication number: CN112102821A
Application number: CN201910530553.0A
Authority: CN
Inventors: 仇璐; 陈宇; 耿岭; 白二伟; 刘鲁鹏; 刘颖; 元海明; 占凯; 郑勇超
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2019-06-18
Filing date: 2019-06-18
Publication date: 2020-12-18
Anticipated expiration: 2039-06-18
Also published as: CN112102821B; WO2020253265A1

Abstract

The present disclosure provides a data processing method applied to an electronic device, including: acquiring a plurality of first historical voice data; determining at least one first target voice data among the plurality of first historical voice data, wherein a first score of each of the at least one first target voice data is greater than or equal to a preset threshold; acquiring a current threshold condition, wherein the current threshold condition is the condition used to decide whether the electronic device operates in response to current voice data; and adjusting the current threshold condition based on the number of the at least one first target voice data. The present disclosure also provides a data processing apparatus applied to an electronic device, a data processing system, and a computer-readable storage medium.

Description

Data processing method, device, system and medium applied to electronic equipment

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a data processing method applied to an electronic device, a data processing apparatus applied to an electronic device, a data processing system, and a computer-readable storage medium.

Background

With the development of computer technology, electronic devices tend to be intelligent, and various intelligent devices are widely applied to many fields, such as smart homes, intelligent vehicles and the like. Voice is the most commonly used interactive mode for human beings, so the technology of waking up a smart device by voice becomes a research hotspot. When receiving the wake-up voice, the electronic device in the related art generally needs to determine whether the wake-up voice satisfies a wake-up condition, and the electronic device can respond to the wake-up voice to perform a related operation when the wake-up voice satisfies the wake-up condition.

In implementing the concept of the present disclosure, the inventors found at least the following problem in the related art: the wake-up condition that the electronic device uses to determine whether to respond to the wake-up voice is generally fixed, resulting in a poor wake-up effect and a high false wake-up rate.

Disclosure of Invention

In view of the above, the present disclosure provides an optimized data processing method and apparatus, system, and medium applied to an electronic device.

One aspect of the present disclosure provides a data processing method applied to an electronic device, including: acquiring a plurality of first historical voice data; determining at least one first target voice data among the plurality of first historical voice data, wherein a first score of each of the at least one first target voice data is greater than or equal to a preset threshold; acquiring a current threshold condition, wherein the current threshold condition is the condition used to decide whether the electronic device operates in response to current voice data; and adjusting the current threshold condition based on the number of the at least one first target voice data.

According to an embodiment of the present disclosure, the method further includes: acquiring a plurality of second historical voice data, and determining the number of at least one second target voice data in the plurality of second historical voice data, wherein the second score of each second target voice data of the at least one second target voice data is greater than or equal to the preset threshold.

According to the embodiment of the present disclosure, the number of the at least one first target voice data is a first number, and the number of the at least one second target voice data is a second number. The adjusting the current threshold condition based on the amount of the at least one first target speech data comprises: comparing the first number with the second number to obtain a comparison result, and adjusting the current threshold condition according to the comparison result, wherein the current threshold condition comprises a first threshold, and the adjusting the current threshold condition comprises increasing the first threshold or decreasing the first threshold.

According to an embodiment of the present disclosure, the method further includes: acquiring the current voice data and a plurality of third historical voice data, wherein the number of the third historical voice data is a third number; processing the current voice data according to the third historical voice data to obtain a third score of the current voice data; and, in response to the third score of the current voice data meeting the current threshold condition, determining that the electronic device operates in response to the current voice data.

According to an embodiment of the present disclosure, the method further includes: adjusting the third number according to the comparison result. Adjusting the third number according to the comparison result comprises at least one of: increasing the third number in response to the comparison result indicating that the first number is greater than the second number, and decreasing the third number in response to the comparison result indicating that the first number is less than the second number.

According to an embodiment of the present disclosure, the first score includes a score obtained by processing the at least one first target voice data according to at least one of the plurality of first historical voice data.

Another aspect of the present disclosure provides a data processing apparatus applied to an electronic device, including a first obtaining module, a first determining module, a second obtaining module and a first adjusting module. The first obtaining module obtains a plurality of first historical voice data. The first determining module determines at least one first target voice data among the plurality of first historical voice data, wherein a first score of each of the at least one first target voice data is greater than or equal to a preset threshold. The second obtaining module obtains a current threshold condition, wherein the current threshold condition is the condition used to decide whether the electronic device operates in response to current voice data. The first adjusting module adjusts the current threshold condition based on the number of the at least one first target voice data.

According to an embodiment of the present disclosure, the above apparatus further includes a third obtaining module and a second determining module. The third obtaining module obtains a plurality of second historical voice data, and the second determining module determines the number of at least one second target voice data among the plurality of second historical voice data, wherein a second score of each of the at least one second target voice data is greater than or equal to the preset threshold.

According to an embodiment of the present disclosure, the number of the at least one first target voice data is a first number. The number of the at least one second target voice data is a second number. The adjusting the current threshold condition based on the amount of the at least one first target speech data comprises: comparing the first number with the second number to obtain a comparison result, and adjusting the current threshold condition according to the comparison result, wherein the current threshold condition comprises a first threshold, and the adjusting the current threshold condition comprises increasing the first threshold or decreasing the first threshold.

According to an embodiment of the present disclosure, the apparatus further includes a fourth obtaining module, a processing module and a third determining module. The fourth obtaining module obtains the current voice data and a plurality of third historical voice data, wherein the number of the third historical voice data is a third number. The processing module processes the current voice data according to the third historical voice data to obtain a third score of the current voice data. The third determining module determines, in response to the third score of the current voice data meeting the current threshold condition, that the electronic device operates in response to the current voice data.

According to an embodiment of the present disclosure, the apparatus further includes a second adjusting module, which adjusts the third number according to the comparison result. Adjusting the third number according to the comparison result comprises at least one of: increasing the third number in response to the comparison result indicating that the first number is greater than the second number, and decreasing the third number in response to the comparison result indicating that the first number is less than the second number.

Another aspect of the present disclosure provides a data processing system comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method described above.

Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.

Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.

According to the embodiments of the present disclosure, the problems of a poor wake-up effect and a high false wake-up rate, which arise in the related art because the wake-up condition used by the electronic device to determine whether to respond to the wake-up voice is usually fixed, can be at least partially solved. The wake-up condition can therefore be adjusted in real time, which improves the wake-up effect and reduces the false wake-up rate.

Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:

fig. 1 schematically shows a system architecture of a data processing method and a data processing apparatus applied to an electronic device according to an embodiment of the present disclosure;

Figs. 2A-2B schematically illustrate application scenarios of a data processing method applied to an electronic device according to an embodiment of the present disclosure;

fig. 3 schematically shows a flow chart of a data processing method applied to an electronic device according to a first embodiment of the present disclosure;

fig. 4 schematically shows a flow chart of a data processing method applied to an electronic device according to a second embodiment of the present disclosure;

fig. 5 schematically shows a flow chart of a data processing method applied to an electronic device according to a third embodiment of the present disclosure;

fig. 6 schematically shows a flow chart of a data processing method applied to an electronic device according to a fourth embodiment of the present disclosure;

fig. 7 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a first embodiment of the present disclosure;

fig. 8 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a second embodiment of the present disclosure;

fig. 9 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a third embodiment of the present disclosure;

fig. 10 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a fourth embodiment of the present disclosure; and

FIG. 11 schematically shows a block diagram of a computer system suitable for data processing according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.

Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).

The embodiment of the present disclosure provides a data processing method applied to an electronic device, including: acquiring a plurality of first historical voice data; determining at least one first target voice data among the plurality of first historical voice data, wherein a first score of each of the at least one first target voice data is greater than or equal to a preset threshold; acquiring a current threshold condition, wherein the current threshold condition is the condition used to decide whether the electronic device operates in response to current voice data; and adjusting the current threshold condition based on the number of the at least one first target voice data.

Fig. 1 schematically shows a system architecture of a data processing method and a data processing apparatus applied to an electronic device according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.

As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and otherwise process received data such as a user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.

It should be noted that the data processing method provided by the embodiment of the present disclosure may generally be executed by the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may generally be disposed in the server 105. The data processing method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the data processing apparatus provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster that is different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.

For example, the plurality of first historical voice data and the current threshold condition acquired by the embodiment of the present disclosure may be stored in the terminal devices 101, 102, 103. The terminal devices 101, 102, 103 transmit the plurality of first historical voice data and the current threshold condition to the server 105, and the server 105 determines at least one first target voice data among the plurality of first historical voice data and adjusts the current threshold condition based on the number of the at least one first target voice data. Alternatively, the terminal devices 101, 102, 103 may directly obtain the plurality of first historical voice data and the current threshold condition, determine at least one first target voice data among the plurality of first historical voice data, and adjust the current threshold condition based on the number of the at least one first target voice data. In addition, the acquired plurality of first historical voice data and the current threshold condition may also be stored directly in the server 105, in which case the server 105 directly determines at least one first target voice data among the plurality of first historical voice data and adjusts the current threshold condition based on the number of the at least one first target voice data.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Fig. 2A-2B schematically illustrate application scenarios of the data processing method applied to an electronic device according to an embodiment of the present disclosure.

As shown in fig. 2A, the application scenario 200 includes, for example, a smart device 210 and a user 220.

The smart device 210 may be, for example, a smart speaker, a smart phone, or the like. The user 220 may wake up the smart device 210 through voice.

For example, when receiving voice data, the smart device 210 generally needs to determine whether the voice data meets a wake-up condition, and if so, the smart device 210 performs a relevant operation in response to the voice data. If not, the smart device 210 does not respond to the voice data.

When the user 220 wakes up the smart device 210 many times within a period of time, the user 220 is interacting with the smart device 210 frequently, and the wake-up condition may be dynamically lowered so that the user 220 can wake up the smart device 210 more easily. For example, if the current wake-up condition is that the similarity between the voice data and the wake-up word reaches 80%, the wake-up condition may be lowered so that the similarity only needs to reach 70%.

As shown in fig. 2B, the application scenario 200 includes, for example, a smart device 210.

When the smart device 210 does not receive voice data satisfying the wake-up condition for a long time, the number of interactions between the user and the smart device 210 is small. In this case, voice data received by the smart device 210 is more likely to be noise, and the wake-up condition can be dynamically raised to prevent the noise from waking up the smart device 210 and causing a false wake-up. For example, if the current wake-up condition is that the similarity between the voice data and the wake-up word reaches 80%, the wake-up condition may be raised so that the similarity must reach 90%.

The embodiment of the present disclosure may dynamically adjust the wake-up condition to improve the wake-up rate of the user waking up the smart device 210 and reduce the false wake-up rate.

Fig. 3 schematically shows a flowchart of a data processing method applied to an electronic device according to a first embodiment of the present disclosure.

As shown in fig. 3, the method includes operations S310 to S340.

In operation S310, a plurality of first history voice data is acquired.

According to the embodiment of the disclosure, the electronic device may be a smart device, for example, a smart speaker, a smart phone, or the like. The electronic device can collect voice data in real time and determine whether to wake up based on the collected voice data. The plurality of first historical voice data include, for example, voice data collected by the electronic device before the current time, covering both voice data that successfully woke up the electronic device and voice data that failed to wake it up. After receiving voice data, the electronic device may determine whether the received voice data can wake up the electronic device through a corresponding voice model, where the voice model may be a neural network model.

In operation S320, at least one first target voice data among the plurality of first historical voice data is determined, wherein a first score of each first target voice data of the at least one first target voice data is greater than or equal to a preset threshold.

The first score includes a score obtained by processing the at least one first target voice data according to at least one of the plurality of first historical voice data (e.g., n₁ historical voice data among the plurality of first historical voice data). The first score may be used, for example, as a basis for the electronic device to respond to the at least one first target voice data.

Take a first historical voice data A as an example. The first historical voice data A is input into a voice model for classification, and the output of the voice model is, for example, a score of the first historical voice data A belonging to a first category and a score of the first historical voice data A belonging to a second category. The first category is, for example, the wake-up word category, and the second category is, for example, the non-wake-up-word category. When the score of the first historical voice data A belonging to the first category is higher than its score belonging to the second category, the first historical voice data A is preliminarily judged to belong to the first category, that is, to the wake-up word category.

To calculate the first score of the first historical voice data A, the n₁ historical voice data preceding the first historical voice data A may be acquired (these n₁ historical voice data may be a subset of the plurality of first historical voice data). The n₁ historical voice data are respectively input into the voice model to obtain n₁ scores of the n₁ historical voice data belonging to the first category, and a weighted average of the score of the first historical voice data A belonging to the first category and the n₁ scores is calculated to obtain the first score of the first historical voice data A. The first score of each of the plurality of first historical voice data may be calculated in the same way. In addition, the weighted average calculation can reduce the interference of noise on the first historical voice data to some extent, so that the first score more accurately reflects the similarity between the first historical voice data and the wake-up word.
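The weighted average described above can be sketched as follows. This is only an illustrative sketch: the function name first_score, the representation of model outputs as plain floats, and the default of equal weights are assumptions, since the disclosure does not specify the weights.

```python
from typing import Optional, Sequence


def first_score(
    own_score: float,
    preceding_scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average of a sample's wake-word score and the n1 preceding scores.

    own_score: score of voice data A belonging to the wake-word category.
    preceding_scores: wake-word scores of the n1 voice data preceding A.
    weights: one weight per score, preceding scores first, A last.
             Equal weights are an assumption made for this sketch.
    """
    scores = list(preceding_scores) + [own_score]
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("one weight is required per score")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total
```

With equal weights, a sample scored 0.9 that follows two samples scored 0.7 and 0.8 receives a first score of about 0.8, so a single high-scoring burst is damped by its recent history.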

According to an embodiment of the present disclosure, the preset threshold may be a specific value, for example. After the first score of each first historical voice data is obtained through calculation, the first historical voice data with the first score being larger than or equal to a preset threshold value is used as first target voice data.
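The selection of first target voice data can be sketched as a simple filter. The name select_first_targets and the use of a list of already computed first scores are assumptions made for illustration.

```python
from typing import List, Sequence


def select_first_targets(first_scores: Sequence[float], preset_threshold: float) -> List[float]:
    """Keep the first scores that are greater than or equal to the preset threshold."""
    return [score for score in first_scores if score >= preset_threshold]


# The number of selected items is the quantity later used to adjust the threshold.
targets = select_first_targets([0.5, 0.8, 0.9], preset_threshold=0.8)
first_number = len(targets)
```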

In operation S330, a current threshold condition is acquired, where the current threshold condition is used as a condition for whether the electronic device operates in response to current voice data. For example, current voice data collected in real-time can wake up the device when the current voice data meets a current threshold condition.
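The role of the current threshold condition can be illustrated with a minimal check that reuses the 80% and 70% similarity values from the scenario of fig. 2A. The function name should_respond, and the reading of "meets the condition" as "score is greater than or equal to the threshold", are assumptions.

```python
def should_respond(score: float, threshold: float) -> bool:
    """Return True when the score of the current voice data meets the threshold.

    Assumption: the condition is met when the score is >= the threshold.
    """
    return score >= threshold


# A score of 0.75 does not wake the device at a threshold of 0.80,
# but does once the threshold has been lowered to 0.70.
rejected = should_respond(0.75, 0.80)
accepted = should_respond(0.75, 0.70)
```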

In operation S340, a current threshold condition is adjusted based on the amount of the at least one first target voice data.

The number of the at least one first target voice data can represent how frequently the user wakes up the electronic device. For example, a larger number indicates that the user wakes up the electronic device more frequently, and a smaller number indicates that the user interacts with the electronic device less frequently. Therefore, the present disclosure may adjust the current threshold condition according to the number of the at least one first target voice data. When the number is larger, the current threshold condition may be lowered so that the user can wake up the electronic device more easily. When the number is smaller, the current threshold condition may be raised so that, when the user has not interacted with the electronic device by voice for a long time, unnecessary false wake-ups caused by noise are avoided.

Fig. 4 schematically shows a flowchart of a data processing method applied to an electronic device according to a second embodiment of the present disclosure.

As shown in fig. 4, the method includes operations S310 to S340 and S410 to S420. Operations S310 to S340 are the same as or similar to the operations described above with reference to fig. 3, and are not described again here.

In operation S410, a plurality of second history voice data is acquired.

According to the embodiment of the present disclosure, the plurality of first historical voice data may be, for example, voice data collected within a preset time period. For example, if the current time is 10:00, the plurality of first historical voice data may be voice data collected from 9:00 to 10:00.

Similarly, the plurality of second historical voice data may also be voice data collected within a preset time period. For example, if the current time is 10:00, the plurality of second historical voice data may be voice data collected from 8:00 to 9:00. The collection time of the plurality of second historical voice data is, for example, before the collection time of the plurality of first historical voice data.
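The two collection periods can be sketched as adjacent time windows ending at the current time. The function name split_history, the one-hour default window, and the half-open boundaries (start included, end excluded) are assumptions; the disclosure only gives 9:00 to 10:00 and 8:00 to 9:00 as examples.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def split_history(
    timestamps: Sequence[datetime],
    now: datetime,
    window: timedelta = timedelta(hours=1),
) -> Tuple[List[datetime], List[datetime]]:
    """Split collection times into the most recent window and the one before it.

    Returns (first_window, second_window), where first_window covers
    [now - window, now) and second_window covers [now - 2*window, now - window).
    """
    first_start = now - window
    second_start = now - 2 * window
    first = [t for t in timestamps if first_start <= t < now]
    second = [t for t in timestamps if second_start <= t < first_start]
    return first, second


now = datetime(2019, 6, 18, 10, 0)
stamps = [
    datetime(2019, 6, 18, 7, 59),  # older than both windows
    datetime(2019, 6, 18, 8, 30),  # second window (8:00 to 9:00)
    datetime(2019, 6, 18, 9, 0),   # first window (9:00 to 10:00)
    datetime(2019, 6, 18, 9, 45),  # first window
]
first_window, second_window = split_history(stamps, now)
```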

In operation S420, the number of at least one second target voice data among the plurality of second historical voice data is determined, wherein a second score of each of the at least one second target voice data is greater than or equal to the preset threshold.

The manner how the first score of each first history voice data is obtained has been described above according to the embodiment of the present disclosure. Based on the same or similar manner, a second score of each second historical voice data may be acquired, and the second historical voice data of which the second score is greater than or equal to a preset threshold may be taken as at least one second target voice data.

In the embodiment of the present disclosure, the number of the at least one first target voice data is a first number, and the number of the at least one second target voice data is a second number.

When the first number is larger than the second number, the number of voice data whose score is greater than or equal to the preset threshold in the most recent period (for example, 9:00 to 10:00) has increased compared with the previous period, which indicates that the user is waking up the electronic device more often. When the first number is smaller than the second number, the frequency of voice interaction between the user and the electronic device has decreased.

According to an embodiment of the present disclosure, adjusting the current threshold condition based on the number of the at least one first target voice data includes: comparing the first number with the second number to obtain a comparison result, and adjusting the current threshold condition according to the comparison result. The current threshold condition comprises a first threshold, and adjusting the current threshold condition comprises increasing the first threshold or decreasing the first threshold.

For example, the current threshold condition may include a first threshold v. When the first number is greater than the second number, the user is waking up the electronic device more frequently, and the first threshold v may be decreased so that the user can wake up the electronic device more easily. When the first number is smaller than the second number, the frequency of voice interaction between the user and the electronic device has decreased, and the first threshold v may be increased so that, when the user has not interacted with the electronic device by voice for a long time, unnecessary false wake-ups caused by noise are avoided.

For example, the value of the first threshold v may be adjusted within a preset range. The preset range may be [m, M], where m represents the minimum value that the first threshold v may assume, and M represents the maximum value that the first threshold v may assume. Therefore, the embodiment of the present disclosure may adjust the value of the first threshold v within the preset range [m, M] according to the comparison result between the first number and the second number.

For example, the initial value of the first threshold v may be M, and the disclosed embodiments may adjust the first threshold v by introducing a change rate r (0 < r < 1). Specifically, when the first number is greater than the second number, the first threshold v is decreased by v = v × r. When the first number is smaller than the second number, the first threshold v is increased by v = v / r. When the first number is equal to the second number, the first threshold v may be left unadjusted.
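As an illustration only, the multiplicative update of the first threshold might be sketched as follows. The function name `adjust_first_threshold` is hypothetical, and clamping the result to the lower bound m and upper bound M of the preset range is an assumption about how "adjusting within the preset range" could be enforced; the disclosure does not specify the mechanism.

```python
def adjust_first_threshold(
    v: float,
    first_number: int,
    second_number: int,
    r: float,
    m: float,
    M: float,
) -> float:
    """Return the adjusted first threshold v.

    v is multiplied by r when the first number is greater, divided by r
    when it is smaller, and left unchanged when the numbers are equal.
    """
    if not 0 < r < 1:
        raise ValueError("change rate r must satisfy 0 < r < 1")
    if m > M:
        raise ValueError("minimum m must not exceed maximum M")

    if first_number > second_number:
        v = v * r   # more frequent interaction: make waking easier
    elif first_number < second_number:
        v = v / r   # less frequent interaction: make waking harder

    # Assumption: keep v inside the preset range by clamping.
    return min(max(v, m), M)
```

For instance, with v = 0.8, r = 0.5 and a preset range of [0.3, 0.9], a greater first number lowers the threshold to 0.4, while a smaller first number would raise it to 1.6, which the clamp limits to 0.9.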

The preset threshold referred to above may be the minimum value m here.

According to the embodiment of the disclosure, the frequency of voice interaction between the user and the electronic device is determined from the comparison result of the first number and the second number, and the current threshold condition (the first threshold) is dynamically adjusted according to that frequency, so that the electronic device can maintain a stable, good wake-up rate and a low false wake-up rate whether the interaction frequency between the user and the electronic device is high or low. Therefore, the technical scheme of the embodiment of the disclosure can improve the wake-up effect and avoid false wake-ups as far as possible.

Fig. 5 schematically shows a flowchart of a data processing method applied to an electronic device according to a third embodiment of the present disclosure.

As shown in FIG. 5, the method includes operations S310 to S340, S410 to S420, and S510 to S530. Operations S310 to S340 are the same as or similar to the operations described above with reference to fig. 3, and operations S410 to S420 are the same as or similar to the operations described above with reference to fig. 4, and are not repeated herein.

According to the embodiment of the disclosure, in actual use, the electronic device acquires current voice data in real time and determines whether to operate in response to the current voice data by judging whether the third score corresponding to the current voice data satisfies the current threshold condition. For example, the third score may be obtained by inputting the current voice data into the voice model. The current voice data may be MFCCs (Mel Frequency Cepstral Coefficients) obtained after processing the initial voice.

In operation S510, current voice data and a plurality of third history voice data are acquired, wherein the number of the plurality of third history voice data is a third number.

In operation S520, the current voice data is processed according to the third history voice data to obtain a third score of the current voice data.

For example, suppose the current time is 11:00 and the current voice data is the voice data acquired at the current time. A score of the current voice data belonging to the first category can be obtained preliminarily by inputting the current voice data into the voice model. To calculate the third score of the current voice data, n₂ third historical voice data preceding the current time may be obtained (that is, the n₂ third historical voice data are data from before 11:00); the third number is then n₂. The n₂ third historical voice data are respectively input into the voice model to obtain n₂ scores of the n₂ third historical voice data belonging to the first category, and the score of the current voice data belonging to the first category and the n₂ scores are weighted and averaged to obtain the third score (the third score is calculated in the same or a similar manner as the first score or the second score).
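The weighted average described above might be sketched as follows. The function name `third_score` is hypothetical, and because the weights are not specified here, the sketch assumes uniform weights unless the caller supplies others; the per-utterance scores are assumed to have been produced by the voice model already.

```python
from typing import Optional, Sequence


def third_score(
    current_score: float,
    history_scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average of the n2 historical scores and the current score.

    `weights`, if given, must hold one weight per historical score
    followed by one weight for the current score.
    """
    scores = list(history_scores) + [current_score]
    if weights is None:
        # Assumption: equal weighting when no weights are provided.
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("need one weight per score (history + current)")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total
```

With two historical scores 0.5 and 0.7 and a current score of 0.9, uniform weighting gives (0.5 + 0.7 + 0.9) / 3 = 0.7; giving the current score twice the weight of each historical score gives (0.5 + 0.7 + 1.8) / 4 = 0.75.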

In operation S530, in response to the third score of the current voice data satisfying the current threshold condition, it is determined that the electronic device operates in response to the current voice data.

For example, when the third score is greater than the first threshold, the electronic device may perform a related operation in response to the current voice data.

According to the embodiment of the present disclosure, since the third score of the current voice data is calculated as a weighted average based on the n₂ scores of the n₂ third historical voice data belonging to the first category, the third number n₂ affects the accuracy of the third score. Embodiments of the present disclosure may therefore dynamically adjust the third number n₂ to improve the accuracy of the third score and thereby improve the wake-up effect.

The process of dynamically adjusting the third number n₂ is described below with reference to the embodiment of fig. 6.

Fig. 6 schematically shows a flowchart of a data processing method applied to an electronic device according to a fourth embodiment of the present disclosure.

As shown in FIG. 6, the method includes operations S310 to S340, S410 to S420, S510 to S530, and S610. Operations S310 to S340 are the same as or similar to the operations described above with reference to fig. 3, operations S410 to S420 are the same as or similar to the operations described above with reference to fig. 4, and operations S510 to S530 are the same as or similar to the operations described above with reference to fig. 5, and are not repeated herein.

In operation S610, the third number is adjusted according to the comparison result.

The comparison result includes, for example, that the first number is greater than the second number, or that the first number is less than the second number.

Adjusting the third number n₂ according to the comparison result includes at least one of:

(1) The third number is increased in response to the comparison result indicating that the first number is greater than the second number.

(2) The third number is decreased in response to the comparison result indicating that the first number is less than the second number.

For example, when the first number is greater than the second number, the current voice data wakes up the electronic device more easily because the first threshold v has been decreased. In this case, the third number n₂ may be increased to improve the accuracy of the third score calculated by the weighted average, which reduces the interference of noise with the current voice data and thus the possibility of a false wake-up. When the first number is smaller than the second number, the condition that the current voice data must satisfy to wake up the electronic device is stricter because the first threshold v has been increased (so the possibility of a false wake-up is already reduced). In this case, the third number n₂ may be decreased to appropriately reduce the amount of computation needed to calculate the third score by the weighted average. In addition, the third number n₂ may be left unadjusted when the first number is equal to the second number.

For example, a value range [a, b] may be given for the third number n₂. The value of the third number n₂ can then be adjusted within the range [a, b] according to the comparison result of the first number and the second number.

For example, the third number n₂ may be adjusted by introducing a step size s. That is, when the first number is greater than the second number, the third number n₂ is increased by n₂ = n₂ + s. When the first number is smaller than the second number, the third number n₂ is decreased by n₂ = n₂ − s. After the third number n₂ has been adjusted, the adjusted third number n₂ may be used to calculate the third score of subsequently acquired voice data.
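The additive update of the third number can be sketched in the same spirit. The function name `adjust_third_number` is hypothetical, and clamping to the value range [a, b] is an assumption about how the adjusted value is kept within that range.

```python
def adjust_third_number(
    n2: int,
    first_number: int,
    second_number: int,
    s: int,
    a: int,
    b: int,
) -> int:
    """Return the adjusted third number n2.

    n2 grows by the step s when the first number is greater, shrinks by s
    when it is smaller, and is unchanged when the numbers are equal.
    """
    if s <= 0:
        raise ValueError("step s must be positive")
    if a > b:
        raise ValueError("lower bound a must not exceed upper bound b")

    if first_number > second_number:
        n2 = n2 + s   # average over more history for a more reliable score
    elif first_number < second_number:
        n2 = n2 - s   # average over less history to save computation

    # Assumption: keep n2 inside the value range [a, b] by clamping.
    return min(max(n2, a), b)
```

With a range of [1, 10] and s = 2, a value of n₂ = 5 becomes 7 when the first number is greater, while n₂ = 9 is limited to 10 and n₂ = 2 is limited to 1 when decreasing.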

By dynamically adjusting the current threshold condition and the third number, the embodiment of the disclosure can maintain a stable, good wake-up rate and a low false wake-up rate in a changing environment. For example, whether the interaction frequency between the user and the electronic device is high or low, the electronic device can have a stable, good wake-up rate and a low false wake-up rate. Therefore, the technical scheme of the embodiment of the disclosure can improve the wake-up effect and avoid false wake-ups as far as possible.

Fig. 7 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a first embodiment of the present disclosure.

As shown in fig. 7, the data processing apparatus 700 applied to the electronic device includes a first obtaining module 710, a first determining module 720, a second obtaining module 730, and a first adjusting module 740.

The first obtaining module 710 may be configured to obtain a plurality of first historical speech data. According to the embodiment of the present disclosure, the first obtaining module 710 may, for example, perform the operation S310 described above with reference to fig. 3, which is not described herein again.

The first determining module 720 may be configured to determine at least one first target voice data in the plurality of first historical voice data, wherein a first score of each first target voice data of the at least one first target voice data is greater than or equal to a preset threshold.

According to an embodiment of the present disclosure, the first score includes a score obtained by processing at least one first target voice data based on at least one of the plurality of first historical voice data.

According to an embodiment of the present disclosure, the first determining module 720 may perform, for example, operation S320 described above with reference to fig. 3, which is not described herein again.

The second obtaining module 730 may be configured to obtain a current threshold condition, where the current threshold condition is a condition for determining whether the electronic device operates in response to current voice data. According to the embodiment of the present disclosure, the second obtaining module 730 may, for example, perform the operation S330 described above with reference to fig. 3, which is not described herein again.

The first adjusting module 740 may be configured to adjust the current threshold condition based on the amount of the at least one first target voice data. According to the embodiment of the present disclosure, the first adjusting module 740 may perform, for example, the operation S340 described above with reference to fig. 3, which is not described herein again.

Fig. 8 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a second embodiment of the present disclosure.

As shown in fig. 8, the data processing apparatus 800 applied to the electronic device includes a first obtaining module 710, a first determining module 720, a second obtaining module 730, a first adjusting module 740, a third obtaining module 810, and a second determining module 820. The first obtaining module 710, the first determining module 720, the second obtaining module 730, and the first adjusting module 740 are the same as or similar to the modules described above with reference to fig. 7, and are not repeated herein.

The third obtaining module 810 may be configured to obtain a plurality of second historical speech data. According to an embodiment of the present disclosure, the third obtaining module 810 may, for example, perform operation S410 described above with reference to fig. 4, which is not described herein again.

The second determining module 820 may be configured to determine the number of at least one second target voice data in the plurality of second historical voice data, wherein the second score of each second target voice data of the at least one second target voice data is greater than or equal to a preset threshold. According to an embodiment of the present disclosure, the second determining module 820 may perform, for example, operation S420 described above with reference to fig. 4, which is not described herein again.

According to the embodiment of the present disclosure, the number of the at least one first target voice data is a first number, and the number of the at least one second target voice data is a second number. Adjusting a current threshold condition based on the amount of the at least one first target speech data, including: comparing the first quantity with the second quantity to obtain a comparison result, and adjusting a current threshold condition according to the comparison result, wherein the current threshold condition comprises a first threshold value, and the adjusting the current threshold condition comprises increasing or decreasing the first threshold value.

Fig. 9 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a third embodiment of the present disclosure.

As shown in fig. 9, the data processing apparatus 900 applied to the electronic device includes a first obtaining module 710, a first determining module 720, a second obtaining module 730, a first adjusting module 740, a third obtaining module 810, a second determining module 820, a fourth obtaining module 910, a processing module 920, and a third determining module 930. The first obtaining module 710, the first determining module 720, the second obtaining module 730, and the first adjusting module 740 are the same as or similar to the modules described above with reference to fig. 7, and the third obtaining module 810 and the second determining module 820 are the same as or similar to the modules described above with reference to fig. 8, and are not repeated herein.

The fourth obtaining module 910 may be configured to obtain the current voice data and a plurality of third history voice data, where the number of the plurality of third history voice data is a third number. According to the embodiment of the present disclosure, the fourth obtaining module 910 may perform, for example, operation S510 described above with reference to fig. 5, which is not described herein again.

The processing module 920 may be configured to process the current speech data according to the third history speech data to obtain a third score of the current speech data. According to the embodiment of the present disclosure, the processing module 920 may perform, for example, the operation S520 described above with reference to fig. 5, which is not described herein again.

The third determining module 930 may be configured to determine that the electronic device operates in response to the current voice data in response to the third score of the current voice data satisfying the current threshold condition. According to an embodiment of the present disclosure, the third determining module 930 may perform, for example, operation S530 described above with reference to fig. 5, which is not described herein again.

Fig. 10 schematically shows a block diagram of a data processing apparatus applied to an electronic device according to a fourth embodiment of the present disclosure.

As shown in fig. 10, the data processing apparatus 1000 applied to the electronic device includes a first obtaining module 710, a first determining module 720, a second obtaining module 730, a first adjusting module 740, a third obtaining module 810, a second determining module 820, a fourth obtaining module 910, a processing module 920, a third determining module 930, and a second adjusting module 1010. The first obtaining module 710, the first determining module 720, the second obtaining module 730, and the first adjusting module 740 are the same as or similar to the modules described above with reference to fig. 7, the third obtaining module 810 and the second determining module 820 are the same as or similar to the modules described above with reference to fig. 8, and the fourth obtaining module 910, the processing module 920, and the third determining module 930 are the same as or similar to the modules described above with reference to fig. 9, and are not repeated herein.

The second adjusting module 1010 may be configured to adjust the third number according to the comparison result.

According to an embodiment of the disclosure, the third number is adjusted according to the comparison result, including at least one of: the third number is increased in response to the comparison indicating that the first number is greater than the second number, and the third number is decreased in response to the comparison indicating that the first number is less than the second number.

According to the embodiment of the present disclosure, the second adjusting module 1010 may perform the operation S610 described above with reference to fig. 6, for example, and is not described herein again.
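Purely as an illustration of how the adjusting, processing and determining modules might cooperate in software, the following Python sketch groups them into one class. The class name `WakeController`, its field names and all default values are hypothetical; uniform averaging and clamping to fixed ranges are assumptions rather than details given in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class WakeController:
    v: float                      # first threshold (current threshold condition)
    n2: int                       # third number of historical scores to average
    preset: float                 # preset threshold for target voice data
    r: float = 0.9                # change rate, 0 < r < 1 (hypothetical default)
    s: int = 1                    # step size for n2 (hypothetical default)
    v_range: Tuple[float, float] = (0.3, 0.9)   # assumed range [m, M]
    n2_range: Tuple[int, int] = (1, 10)         # assumed range [a, b]
    history: List[float] = field(default_factory=list)

    def adapt(self, first_scores: Sequence[float],
              second_scores: Sequence[float]) -> None:
        """Adjust v and n2 from the comparison of the two counts."""
        first = sum(1 for x in first_scores if x >= self.preset)
        second = sum(1 for x in second_scores if x >= self.preset)
        if first > second:
            self.v, self.n2 = self.v * self.r, self.n2 + self.s
        elif first < second:
            self.v, self.n2 = self.v / self.r, self.n2 - self.s
        self.v = min(max(self.v, self.v_range[0]), self.v_range[1])
        self.n2 = min(max(self.n2, self.n2_range[0]), self.n2_range[1])

    def should_respond(self, current_score: float) -> bool:
        """Average the current score with up to n2 past scores, then decide."""
        recent = self.history[-self.n2:] if self.n2 > 0 else []
        third = (sum(recent) + current_score) / (len(recent) + 1)
        self.history.append(current_score)
        return third > self.v
```

In this sketch `adapt` plays the role of the first and second adjusting modules, and `should_respond` combines the processing module with the third determining module.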

Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.

For example, any plurality of the first obtaining module 710, the first determining module 720, the second obtaining module 730, the first adjusting module 740, the third obtaining module 810, the second determining module 820, the fourth obtaining module 910, the processing module 920, the third determining module 930, and the second adjusting module 1010 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first obtaining module 710, the first determining module 720, the second obtaining module 730, the first adjusting module 740, the third obtaining module 810, the second determining module 820, the fourth obtaining module 910, the processing module 920, the third determining module 930, and the second adjusting module 1010 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or in a suitable combination of any of them. 
Alternatively, at least one of the first obtaining module 710, the first determining module 720, the second obtaining module 730, the first adjusting module 740, the third obtaining module 810, the second determining module 820, the fourth obtaining module 910, the processing module 920, the third determining module 930 and the second adjusting module 1010 may be implemented at least in part as a computer program module, which when executed, may perform a corresponding function.

FIG. 11 schematically shows a block diagram of a computer system suitable for data processing according to an embodiment of the present disclosure. The computer system illustrated in FIG. 11 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.

As shown in fig. 11, a computer system 1100 according to an embodiment of the present disclosure includes a processor 1101, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. The processor 1101 may comprise, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1101 may also include on-board memory for caching purposes. The processor 1101 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to the embodiments of the present disclosure.

In the RAM 1103, various programs and data necessary for the operation of the system 1100 are stored. The processor 1101, the ROM 1102, and the RAM 1103 are connected to each other by a bus 1104. The processor 1101 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1102 and/or the RAM 1103. It is noted that the programs may also be stored in one or more memories other than the ROM 1102 and RAM 1103. The processor 1101 may also perform various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in the one or more memories.

According to an embodiment of the present disclosure, the system 1100 may also include an input/output (I/O) interface 1105, which is also connected to the bus 1104. The system 1100 may also include one or more of the following components connected to the I/O interface 1105: an input portion 1106 including a keyboard, a mouse, and the like; an output portion 1107 including a signal output unit such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, or the like. The communication section 1109 performs communication processing via a network such as the internet. A drive 1110 is also connected to the I/O interface 1105 as necessary. A removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1110 as necessary, so that a computer program read out therefrom is installed into the storage section 1108 as necessary.

According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 1109 and/or installed from the removable medium 1111. The computer program, when executed by the processor 1101, performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example and without limitation: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1102 and/or the RAM 1103 and/or one or more memories other than the ROM 1102 and the RAM 1103 described above.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that various combinations and/or associations of the features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or associations are not expressly recited in the present disclosure. In particular, various combinations and/or associations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.

The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims

1. A data processing method applied to an electronic device comprises the following steps:

acquiring a plurality of first historical voice data;

determining at least one first target voice data in the plurality of first historical voice data, wherein a first score of each first target voice data of the at least one first target voice data is greater than or equal to a preset threshold value;

acquiring a current threshold condition, wherein the current threshold condition is used for a condition that whether the electronic equipment responds to current voice data for operation; and

adjusting the current threshold condition based on the amount of the at least one first target speech data.

2. The method of claim 1, further comprising:

acquiring a plurality of second historical voice data;

determining the number of at least one second target voice data in the plurality of second historical voice data, wherein the second score of each second target voice data of the at least one second target voice data is greater than or equal to the preset threshold value.

3. The method of claim 2, wherein the amount of the at least one first target speech data is a first amount; the number of the at least one second target voice data is a second number; the adjusting the current threshold condition based on the amount of the at least one first target speech data comprises:

comparing the first quantity with the second quantity to obtain a comparison result; and

adjusting the current threshold condition according to the comparison result;

wherein the current threshold condition comprises a first threshold value, and the adjusting the current threshold condition comprises increasing the first threshold value or decreasing the first threshold value.

4. The method of claim 3, further comprising:

acquiring the current voice data and a plurality of third history voice data, wherein the number of the third history voice data is a third number;

processing the current voice data according to the third history voice data to obtain a third score of the current voice data; and

and in response to the third score of the current voice data meeting the current threshold condition, determining that the electronic equipment operates in response to the current voice data.

5. The method of claim 4, further comprising: adjusting the third quantity according to the comparison result; wherein said adjusting said third number according to said comparison comprises at least one of:

increasing the third number in response to the comparison indicating that the first number is greater than the second number; and

decreasing the third number in response to the comparison indicating that the first number is less than the second number.

6. The method of claim 1, wherein the first score comprises a score resulting from processing the at least one first target speech data based on at least one of the plurality of first historical speech data.

7. A data processing apparatus applied to an electronic device, comprising:

the first acquisition module is used for acquiring a plurality of first historical voice data;

the first determination module is used for determining at least one first target voice data in the plurality of first historical voice data, wherein the first score of each first target voice data of the at least one first target voice data is greater than or equal to a preset threshold value;

the second acquisition module is used for acquiring a current threshold condition, wherein the current threshold condition is used for a condition that the electronic equipment responds to current voice data for operation; and

a first adjustment module that adjusts the current threshold condition based on the amount of the at least one first target voice data.

8. The apparatus of claim 7, further comprising:

a third acquisition module configured to acquire a plurality of second historical voice data; and

a second determination module configured to determine the number of at least one second target voice data in the plurality of second historical voice data, wherein a second score of each second target voice data of the at least one second target voice data is greater than or equal to the preset threshold.

9. The apparatus of claim 8, wherein the number of the at least one first target voice data is a first number; the number of the at least one second target voice data is a second number; and the adjusting the current threshold condition based on the number of the at least one first target voice data comprises:

adjusting the current threshold condition according to the comparison result.
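
The counting and comparison steps in claims 7 to 9 can be sketched as below. Only the counting rule (score greater than or equal to the preset threshold) and the use of a comparison between the first and second numbers come from the claims. The direction and size of the threshold adjustment, the representation of the threshold condition as a single float in [0, 1], and all function names are assumptions chosen for illustration.

```python
from typing import Sequence


def count_targets(scores: Sequence[float], preset_threshold: float) -> int:
    """Count historical voice data whose score is >= the preset threshold."""
    return sum(1 for s in scores if s >= preset_threshold)


def adjust_threshold(current_threshold: float, first_number: int, second_number: int,
                     step: float = 0.05) -> float:
    """Assumed rule (not from the claims): relax the threshold when first > second,
    tighten it when first < second, and leave it unchanged when they are equal."""
    if first_number > second_number:
        return max(0.0, current_threshold - step)
    if first_number < second_number:
        return min(1.0, current_threshold + step)
    return current_threshold
```

For example, with first historical scores `[0.9, 0.4, 0.7]`, second historical scores `[0.2, 0.8]`, and a preset threshold of 0.7, the first number is 2 and the second number is 1, so this sketch would lower the current threshold by one step.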

10. The apparatus of claim 9, further comprising:

a fourth acquisition module configured to acquire the current voice data and a plurality of third historical voice data, wherein the number of the third historical voice data is a third number;

a processing module configured to process the current voice data according to the third historical voice data to obtain a third score of the current voice data; and

a third determination module configured to determine, in response to the third score of the current voice data meeting the current threshold condition, that the electronic device operates in response to the current voice data.

11. The apparatus of claim 10, further comprising: a second adjustment module configured to adjust the third number according to the comparison result; wherein the adjusting the third number according to the comparison result comprises at least one of:

12. The apparatus of claim 7, wherein the first score comprises a score obtained by processing the at least one first target voice data based on at least one of the plurality of first historical voice data.

13. A data processing system comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-6.

14. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 6.