CN118212935A - Information processing method and device and electronic equipment - Google Patents

Information processing method and device and electronic equipment

Info

Publication number
CN118212935A
CN118212935A (application CN202211615299.2A)
Authority
CN
China
Prior art keywords
information
interaction
preset
target
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211615299.2A
Other languages
Chinese (zh)
Inventor
程林
方迟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202211615299.2A
Publication of CN118212935A
Legal status: Pending (current)


Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of the invention disclose an information processing method, an information processing apparatus, and an electronic device. The information processing method includes: determining a sound feature of target audio information, the target audio information being audio information of a target user; determining interaction information corresponding to the target user according to the sound feature; and executing an interaction operation corresponding to the interaction information. The information processing method, apparatus, and electronic device can determine corresponding interaction information based on the user's audio information, thereby triggering a corresponding interaction operation and improving the user's interaction experience.

Description

Information processing method and device and electronic equipment
Technical Field
The present disclosure relates to the technical field of the Internet, and in particular to an information processing method, an information processing apparatus, and an electronic device.
Background
With the development of computer technology, XR (Extended Reality) technology is gradually being applied to various interaction scenarios. XR refers to all combined real and virtual environments generated by computer graphics and wearable devices; applying it to interaction scenarios can improve the user's interaction experience.
In an interactive scenario based on XR technology, a user may input instructions through speech, and the wearable device may also recognize the speech instructions input by the user.
Disclosure of Invention
This Summary of the Disclosure is provided to introduce concepts in a simplified form that are further described below in the Detailed Description. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The embodiments of the present disclosure provide an information processing method, an information processing apparatus, and an electronic device, which can determine corresponding interaction information based on a user's audio information, thereby triggering a corresponding interaction operation and improving the user's interaction experience.
In a first aspect, an embodiment of the present disclosure provides an information processing method, including: determining sound characteristics of the target audio information; the target audio information is the audio information of a target user; determining interaction information corresponding to the target user according to the sound characteristics; and executing the interaction operation corresponding to the interaction information.
In a second aspect, an embodiment of the present disclosure provides an information processing apparatus including: a determining unit configured to determine a sound characteristic of the target audio information; the target audio information is the audio information of a target user; the determining unit is further used for determining interaction information corresponding to the target user according to the sound characteristics; and the interaction unit is used for executing the interaction operation corresponding to the interaction information.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the information processing method as described in the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium, on which a computer program is stored, which when executed by a processor implements the information processing method according to the first aspect.
According to the information processing method, the information processing apparatus, and the electronic device, the sound features of the target user's audio information are determined, and the target user's interaction information is determined based on the sound features, so that the interaction operation corresponding to the interaction information is executed. The technical solution can therefore use the user's audio information to determine corresponding interaction information and trigger a corresponding interaction operation; when applied to various interaction scenarios, it adds an interaction mode based on the sound-feature dimension, achieves adaptive interaction, and improves the user's interaction experience.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of one embodiment of an information processing method according to the present disclosure;
FIG. 2 is a schematic diagram of the structure of one embodiment of an information processing apparatus according to the present disclosure;
FIG. 3 is an exemplary system architecture to which the information processing method of one embodiment of the present disclosure may be applied;
fig. 4 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a", "an", and "a plurality" in this disclosure are illustrative rather than limiting; those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The technical scheme provided by the embodiment of the disclosure can be applied to various interactive scenes based on XR technology; for example: virtual shooting, virtual live broadcasting and various interactions in a virtual scene.
In these interaction scenarios, a user may input voice instructions, and the interactive application or interactive device (for example, an XR headset) can recognize the voice instructions input by the user.
In the related art, interactive applications or interactive devices can recognize voice instructions input by users and give corresponding feedback based on the recognition results, but they cannot trigger a corresponding interaction mode based on the user's sound information. That is, the related art does not support environment interaction based on the sound dimension, resulting in a poor interaction experience for the user.
Based on this, embodiments of the present disclosure provide an information processing scheme. In this information processing scheme, the sound features of the target user's audio information are determined, the target user's interaction information is determined based on the sound features, and the interaction operation corresponding to the interaction information is executed. The technical solution can therefore use the user's audio information to determine corresponding interaction information and trigger a corresponding interaction operation; when applied to various interaction scenarios (such as XR device interaction scenarios), it adds an interaction mode based on the sound-feature dimension, achieves adaptive interaction, and improves the user's interaction experience.
Referring to fig. 1, a flow of one embodiment of an information processing method according to the present disclosure is shown. The information processing method can be applied to a terminal device. The information processing method as shown in fig. 1 includes the steps of:
Step 101: determine the sound features of the target audio information. The target audio information is the audio information of the target user.
In some embodiments, the target user is a user in an interaction scenario, which may be a real interaction scenario or a virtual interaction scenario. Correspondingly, the target user may or may not correspond to the avatar.
In different application scenarios, the target user may have different definitions or understandings.
In some embodiments, the information processing method further comprises: and acquiring target audio information.
In some embodiments, the terminal device includes an audio acquisition module, including but not limited to a microphone. Thus, the target audio information can be acquired by the audio acquisition module.
In some embodiments, step 101 comprises: preprocessing target audio information; sound features are extracted based on the preprocessed target audio information.
In such embodiments, the manner of preprocessing includes, but is not limited to, denoising processing, filtering processing, and the like. The preprocessed target audio information does not include audio information of users other than the target user, audio information of environmental noise, and the like.
In some embodiments, the extraction of sound features from the target audio information may rely on techniques well established in the art, for example feature extraction algorithms based on frequency cepstral coefficients, linear prediction coefficients, line spectral frequencies, and the like. Moreover, a corresponding sound feature extraction algorithm may be employed for each different sound feature.
In the disclosed embodiments, sound features include, but are not limited to: sound frequency, tone, timbre, etc.
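As an illustration of the preprocessing and feature extraction described above, the following Python sketch estimates a simple sound feature (the dominant sound frequency) from audio samples. It is a minimal sketch for illustration only; the function names, the FFT-based frequency estimate, and the synthetic test signal are assumptions, and the disclosure does not prescribe any particular algorithm or library.

```python
# Illustrative sketch only: the disclosure does not prescribe a concrete
# feature extraction algorithm; the FFT-based frequency estimate below is an
# assumption for demonstration.
import numpy as np

def preprocess(audio: np.ndarray) -> np.ndarray:
    """Very simple preprocessing: remove DC offset and normalize amplitude."""
    audio = audio - np.mean(audio)
    peak = np.max(np.abs(audio))
    return audio / peak if peak > 0 else audio

def estimate_dominant_frequency(audio: np.ndarray, sample_rate: int) -> float:
    """Estimate the dominant sound frequency (Hz) from the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(audio))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])

# Example: 0.5 s of a synthetic 220 Hz tone standing in for target audio.
sample_rate = 16000
t = np.linspace(0, 0.5, int(0.5 * sample_rate), endpoint=False)
target_audio = np.sin(2 * np.pi * 220 * t)
sound_frequency = estimate_dominant_frequency(preprocess(target_audio), sample_rate)
print(f"dominant frequency ~ {sound_frequency:.1f} Hz")
```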
Step 102, determining interaction information corresponding to the target user according to the sound characteristics.
In different application scenarios, the interaction information may have different definitions or understandings. For example, the interaction information may be a current interaction action, an interaction type, etc. of the target user; or may be identity information of the target user.
In different application scenarios, the determination modes of the interaction information are correspondingly different.
As a first alternative embodiment, step 102 includes: comparing the sound characteristics with a plurality of preset reference sound characteristics to determine target preset reference sound characteristics corresponding to the sound characteristics; the plurality of preset reference sound features are respectively corresponding to preset interaction information; and determining preset interaction information corresponding to the target preset reference sound characteristics as interaction information corresponding to the target user.
In this embodiment, a plurality of reference sound features are preset, and the plurality of reference sound features correspond to preset interaction information.
After the sound feature is determined, it is compared with the preset reference sound features, and a target preset reference sound feature matching the sound feature is determined, so that the preset interaction information corresponding to the target preset reference sound feature is determined as the interaction information corresponding to the target user.
The preset reference sound feature may be a specific reference sound feature value or a range of reference sound feature values. Taking sound frequency as an example, the preset reference sound feature may be a sound frequency value or a sound frequency value range; correspondingly, one sound frequency value corresponds to one piece of interaction information, or one sound frequency value range corresponds to one piece of interaction information.
When comparing the sound feature with the plurality of preset reference sound features: if a preset reference sound feature is a specific value, the target preset reference sound feature value is the same as the sound feature value; if a preset reference sound feature is a range of values, the sound feature value falls within the target preset reference sound feature value range. Taking sound frequency as an example, if the preset sound frequency is a single value, the target preset sound frequency is the same as the current sound frequency; if the preset sound frequency is a sound frequency range, the current sound frequency falls within the target sound frequency range.
In this embodiment, by presetting the interaction information corresponding to different sound features, the rapid determination of the interaction information corresponding to the current sound feature can be realized.
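A minimal sketch of this first alternative embodiment follows, assuming sound frequency as the sound feature; the preset reference values, ranges, and interaction labels are invented for demonstration and are not taken from the disclosure.

```python
# Illustrative sketch: preset reference sound features may be exact values or
# value ranges, each mapped to preset interaction information. All concrete
# values below are assumptions for demonstration.
PRESET_REFERENCE_FEATURES = [
    {"range": (80.0, 180.0), "interaction": "deep-voice interaction"},
    {"range": (180.0, 300.0), "interaction": "mid-voice interaction"},
    {"value": 440.0, "interaction": "calibration-tone interaction"},
]

def match_interaction(sound_frequency: float, tolerance: float = 1.0):
    """Return the preset interaction information matching the sound feature."""
    for preset in PRESET_REFERENCE_FEATURES:
        if "value" in preset and abs(sound_frequency - preset["value"]) <= tolerance:
            return preset["interaction"]
        if "range" in preset:
            low, high = preset["range"]
            if low <= sound_frequency <= high:
                return preset["interaction"]
    return None  # no target preset reference sound feature found

print(match_interaction(220.0))  # -> "mid-voice interaction"
```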
As a second alternative embodiment, step 102 includes: determining identity information of a target user according to the sound characteristics; and determining interaction information corresponding to the target user according to the identity information of the target user.
In some embodiments, the identity information of the target user may be: adults, children, men, women, etc.
In some embodiments, the interaction information corresponding to the target user may be the identity information of the target user, or may be determined based on the identity information of the target user. For example, assuming the identity information of the target user includes adult and male, the final interaction information is: adult male.
In some embodiments, sound features corresponding to different identity information may be preset. The current sound feature is compared with these preset sound features, and if a target sound feature matching the current sound feature is found, the identity information corresponding to that sound feature is determined as the identity information of the target user.
In other embodiments, the identity information of the target user may also be determined by an identification model of the acoustic features. The recognition model can determine corresponding identity information based on sound features through pre-training. The recognition model may be a neural network model, a random forest model, etc., and is not limited herein.
In this embodiment, considering that different user identities can trigger different interaction modes, the identity information of the target user is determined first, and then the interaction information corresponding to the target user is determined, so that interaction is triggered based on the user identity.
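The following sketch illustrates the recognition-model variant of this embodiment. The random forest classifier, the toy feature layout (pitch and loudness), the identity labels, and the identity-to-interaction mapping are all assumptions for demonstration; the disclosure only requires that a pre-trained recognition model map sound features to identity information.

```python
# Illustrative sketch: a pre-trained recognition model maps sound features to
# identity information, and the identity is then mapped to interaction
# information. Model choice, feature layout, and labels are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy training data: [mean pitch in Hz, loudness] -> identity label.
X_train = np.array([[120.0, 0.6], [210.0, 0.5], [300.0, 0.7], [330.0, 0.8]])
y_train = ["adult male", "adult female", "child", "child"]

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

IDENTITY_TO_INTERACTION = {
    "child": "child-friendly interaction set",
    "adult male": "full interaction set",
    "adult female": "full interaction set",
}

sound_features = np.array([[310.0, 0.75]])
identity = model.predict(sound_features)[0]
print(identity, "->", IDENTITY_TO_INTERACTION[identity])
```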
As a third alternative embodiment, step 102 includes: acquiring control information; the control information is the control information of a controller operated by a target user; and determining interaction information corresponding to the target user according to the sound characteristics and the control information.
In some embodiments, the controllers operated by the target user include: gesture input, handheld controllers (handles), eye-movement tracking, head-mounted 6DoF tracking, and the like. Correspondingly, the control information of these controllers can be obtained directly from the controllers.
Thus, based on the sound characteristics and the control information, interaction information corresponding to the target user may be determined together, i.e. the controller and the sound characteristics may trigger the corresponding interaction or feedback in combination.
In some embodiments, determining interaction information corresponding to the target user according to the sound characteristics and the control information includes: comparing the sound characteristics and the control information with a plurality of preset information groups to determine target preset information groups corresponding to the sound characteristics and the control information; the plurality of preset information groups respectively comprise preset sound characteristics and preset control information, and the plurality of preset information groups respectively correspond to preset interaction information; and determining the preset interaction information corresponding to the target preset information group as the interaction information corresponding to the target user.
In this embodiment, a plurality of preset information groups are configured, each of which includes preset sound characteristics and preset control information; and each preset information group corresponds to preset interaction information.
And comparing the sound characteristics and the control information with a plurality of preset information groups to determine a target preset information group. It can be understood that the sound characteristics in the target preset information group are consistent with the sound characteristics of the target user, and the control information in the target preset information group is consistent with the control information of the target user.
In some embodiments, determining whether the sound features are consistent and whether the control information is consistent may be implemented differently in different application scenarios. For whether the sound features are consistent, reference may be made to the description in the previous embodiments.
The control information includes, but is not limited to: control parameters, control modes, control frequencies, and the like. When comparing whether the control information is consistent, if the control information is a specific value, reference may be made to the comparison of sound features described above; if the control information is not a value but a control mode, the control information can be regarded as consistent when it is identical or has a certain degree of correlation.
In such an embodiment, the combined triggering interaction of the sound and the controller may be achieved by determining the interaction information in combination with the sound characteristics and the control information.
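A minimal sketch of such preset information groups follows; the pairing of a sound-frequency range with a controller event and the resulting interaction labels are assumptions for demonstration only.

```python
# Illustrative sketch: each preset information group pairs a preset sound
# feature with preset control information and maps to preset interaction
# information. Group contents are assumptions.
PRESET_INFORMATION_GROUPS = [
    {"sound_range": (150.0, 300.0), "control": "trigger_pressed",
     "interaction": "voice-plus-handle selection"},
    {"sound_range": (300.0, 500.0), "control": "gaze_dwell",
     "interaction": "voice-plus-eye-tracking confirmation"},
]

def match_group(sound_frequency: float, control_info: str):
    """Find the target preset information group consistent with both inputs."""
    for group in PRESET_INFORMATION_GROUPS:
        low, high = group["sound_range"]
        if low <= sound_frequency <= high and control_info == group["control"]:
            return group["interaction"]
    return None

print(match_group(220.0, "trigger_pressed"))  # -> "voice-plus-handle selection"
```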
As a fourth alternative embodiment, the information processing method further includes: determining semantic features corresponding to the target audio information; correspondingly, step 102 includes: and determining interaction information corresponding to the target user according to the sound characteristics and the semantic characteristics.
In some embodiments, determining the semantic features corresponding to the target audio information may be implemented by a semantic feature extraction algorithm. The extraction algorithm of the semantic features can refer to the mature technology in the field.
In some embodiments, the semantic features may be: keyword features corresponding to the audio information, text features corresponding to the audio information, and the like.
As an optional implementation manner, determining the interaction information corresponding to the target user according to the sound feature and the semantic feature includes: comparing the sound features and the semantic features with a plurality of preset feature groups to determine target preset feature groups corresponding to the sound features and the semantic features; the plurality of preset feature groups respectively comprise preset sound features and preset semantic features, and the plurality of preset feature groups respectively correspond to preset interaction information; and determining the preset interaction information corresponding to the target preset feature group as the interaction information corresponding to the target user.
In this embodiment, a plurality of preset feature groups are configured, each preset feature group including a preset sound feature and a preset semantic feature, and each preset feature group corresponds to preset interaction information.
The sound feature and the semantic feature are compared with the plurality of preset feature groups to determine the target preset feature group. It will be appreciated that the sound feature in the target preset feature group is consistent with the sound feature of the target user, and the semantic feature in the target preset feature group is consistent with the semantic feature of the target user.
In some embodiments, determining whether the sound features are consistent and whether the semantic features are consistent may be implemented differently in different application scenarios. For whether the sound features are consistent, reference may be made to the description in the previous embodiments.
For the semantic features, they may be regarded as consistent when, for example, they are identical, when their similarity is greater than a preset similarity, or when their degree of association is greater than a preset degree of association.
In such an embodiment, the combined triggering of interactions of sound and semantics may be achieved by combining sound features with semantic features to determine interaction information.
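The sketch below illustrates this combined sound-and-semantic matching, using a simple keyword-overlap (Jaccard) similarity as the semantic consistency check; the preset feature groups, keywords, threshold, and interaction labels are assumptions for demonstration.

```python
# Illustrative sketch: preset feature groups combine a preset sound feature
# with preset semantic keywords; semantic consistency is checked with a simple
# keyword-overlap similarity. All concrete values are assumptions.
PRESET_FEATURE_GROUPS = [
    {"sound_range": (150.0, 300.0), "keywords": {"jump", "forward"},
     "interaction": "avatar jump action"},
    {"sound_range": (150.0, 300.0), "keywords": {"open", "menu"},
     "interaction": "open in-scene menu"},
]

def keyword_similarity(a: set, b: set) -> float:
    """Jaccard similarity between two keyword sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def match_feature_group(sound_frequency: float, keywords: set, threshold: float = 0.5):
    for group in PRESET_FEATURE_GROUPS:
        low, high = group["sound_range"]
        if low <= sound_frequency <= high and \
           keyword_similarity(keywords, group["keywords"]) >= threshold:
            return group["interaction"]
    return None

print(match_feature_group(220.0, {"open", "menu", "please"}))  # -> "open in-scene menu"
```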
And step 103, executing the interaction operation corresponding to the interaction information.
After the interaction information is determined, in step 103 the interaction operation corresponding to the interaction information is executed in combination with the specific interaction scenario.
In the foregoing embodiments, the interaction information includes, but is not limited to: the interaction mode of the user; user identity, etc. The interaction mode of the user can represent interaction actions, interaction types and the like of the user in the current interaction scene.
Thus, as an alternative embodiment, the interaction information includes: the target role imitated by the target user.
In such an embodiment, the target user may simulate the sound of the target character in the interactive scene, so that the target character simulated by the target user may be determined through step 102.
In some embodiments, the target character may be a character in an animal, a television show, or a movie, etc.
Step 103 then comprises: and displaying the virtual image corresponding to the target role.
For example, assuming that the target character is one of the animated characters in the animation, in step 103, an avatar corresponding to the animated character is displayed.
This implementation is suitable for the situation in which the interaction scene where the target user is located is an XR interaction scene and the target user needs to present himself or herself through an avatar, or to interact with other users through the avatar.
In other embodiments, the information processing method further includes: and displaying the identification information corresponding to the target role based on the display position of the virtual image.
The identification information corresponding to the target role may be a distinctive marker corresponding to the target role. For example, if the target role is a cat, the corresponding identification information may be cat paws, cat ears, and the like.
The identification information corresponding to the target role may also be text information, emoticon information, or the like corresponding to the target user. For example, if the target user currently sends an emoticon, the emoticon is displayed at the corresponding position of the target role; if the target user currently speaks, the speech is displayed in text form at the corresponding position of the target role.
In some embodiments, when the interactive operation corresponding to the interactive information is executed, the avatar corresponding to the target character may not be displayed, but only the identification information corresponding to the target character may be displayed based on the display position of the target character in the current display interface. Such as: in the live interaction scene, the live image is the real-time image of the target user, and the identification information can be displayed at the corresponding display position in the real-time image of the target user. For example: cat paws are displayed at hand positions; the cat ear is shown in the head position, etc.
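The following sketch shows one possible way to attach such identification information to display positions; the character name, asset file names, and anchor positions are assumptions for demonstration, and the actual rendering pipeline is not specified by the disclosure.

```python
# Illustrative sketch: identification information for the imitated target role
# (e.g. cat paws, cat ears) is attached to display positions of the avatar or
# of the user's live image. Asset names and positions are assumptions.
CHARACTER_MARKERS = {
    "cat": [("cat_paw.png", "left_hand"), ("cat_paw.png", "right_hand"),
            ("cat_ear.png", "head")],
}

def markers_for(character: str):
    """Return (asset, anchor position) pairs to overlay for a target role."""
    return CHARACTER_MARKERS.get(character, [])

for asset, anchor in markers_for("cat"):
    print(f"overlay {asset} at {anchor}")
```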
In such an embodiment, different interactions may be triggered by the interaction information, thereby enabling triggering of scene interactions based on sound.
In other embodiments, where the interaction information includes the user identity of the target user, step 103 includes: and executing interaction operation matched with the user identity.
In such an embodiment, different user identities correspond to different interactions. For example: the interaction between children and adults is different, and the interaction between men and women is also different.
Thus, based on the user identity of the target user, customized interactions that match the user identity may be performed.
In some embodiments, the interactions corresponding to different user identities have some interaction parts in common and some interaction parts that differ; the differing parts are the customized interactions. For example, the emoticons that a child and an adult can send are different: a child can send only relatively cute emoticons, while an adult can send more emoticons covering a wider range.
For another example, the special figures that men and women can add are different: a man can add only men's special figures, and a woman can add only women's special figures.
Thus, for the interactive device or the interactive application, the interactive operations executable by different user identities can be configured in advance, and after the user identities are identified, the interactive operations executable by the target user are limited.
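A minimal sketch of such pre-configured, identity-specific interaction operations follows; the identity labels and operation names are assumptions for demonstration.

```python
# Illustrative sketch: interaction operations executable by each user identity
# are configured in advance; after identity recognition, the target user's
# available operations are restricted accordingly. Labels are assumptions.
EXECUTABLE_OPERATIONS = {
    "child": {"send_cute_sticker", "wave"},
    "adult": {"send_cute_sticker", "wave", "send_full_sticker_set", "add_effects"},
}

def allowed(identity: str, operation: str) -> bool:
    """Check whether the recognized identity may execute the requested operation."""
    return operation in EXECUTABLE_OPERATIONS.get(identity, set())

print(allowed("child", "add_effects"))  # -> False
print(allowed("adult", "add_effects"))  # -> True
```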
In such an embodiment, customized interactions matching the user identity are achieved through correspondence between the user identity and the interaction operation.
As can be seen from the description of the foregoing embodiments, the information processing scheme provided by the embodiments of the present disclosure determines the sound features of the target user's audio information, determines the target user's interaction information based on the sound features, and thereby executes the interaction operation corresponding to the interaction information. The technical solution can therefore use the user's audio information to determine corresponding interaction information and trigger a corresponding interaction operation; when applied to various interaction scenarios (such as XR device interaction scenarios), it adds an interaction mode based on the sound-feature dimension, achieves adaptive interaction, and improves the user's interaction experience.
With further reference to fig. 2, as an implementation of the method shown in the foregoing drawings, the present disclosure provides an embodiment of an information processing apparatus, which corresponds to the information processing method embodiment shown in fig. 1, and which is particularly applicable to various electronic devices.
As shown in fig. 2, the information processing apparatus of the present embodiment includes: a determining unit 201 for determining a sound characteristic of the target audio information; the target audio information is the audio information of a target user; the determining unit 201 is further configured to determine interaction information corresponding to the target user according to the sound feature; and the interaction unit 202 is configured to execute an interaction operation corresponding to the interaction information.
In some embodiments, the determining unit 201 is further configured to: comparing the sound characteristics with a plurality of preset reference sound characteristics, and determining target preset reference sound characteristics corresponding to the sound characteristics; the preset reference sound features are respectively corresponding to preset interaction information; and determining preset interaction information corresponding to the target preset reference sound characteristics as interaction information corresponding to the target user.
In some embodiments, the determining unit 201 is further configured to: determining the identity information of the target user according to the sound characteristics; and determining interaction information corresponding to the target user according to the identity information of the target user.
In some embodiments, the determining unit 201 is further configured to: acquiring control information; the control information is the control information of the controller operated by the target user; and determining interaction information corresponding to the target user according to the sound characteristics and the control information.
In some embodiments, the determining unit 201 is further configured to: comparing the sound characteristics and the control information with a plurality of preset information groups to determine target preset information groups corresponding to the sound characteristics and the control information; the preset information sets respectively comprise preset sound characteristics and preset control information, and the preset information sets respectively correspond to preset interaction information; and determining the preset interaction information corresponding to the target preset information group as the interaction information corresponding to the target user.
In some embodiments, the determining unit 201 is further configured to: determining semantic features corresponding to the target audio information; and determining interaction information corresponding to the target user according to the sound characteristics and the semantic characteristics.
In some embodiments, the determining unit 201 is further configured to: comparing the sound features and the semantic features with a plurality of preset feature groups to determine target preset feature groups corresponding to the sound features and the semantic features; the preset feature groups respectively comprise preset sound features and preset semantic features, and the preset feature groups respectively correspond to preset interaction information; and determining the preset interaction information corresponding to the target preset feature group as the interaction information corresponding to the target user.
In some embodiments, the target user emulates a target persona; the interaction unit 202 is further configured to: and displaying the virtual image corresponding to the target role.
In some embodiments, the interaction unit 202 is further configured to: and displaying the identification information corresponding to the target role based on the display position of the virtual image.
In some embodiments, the interaction information includes: the user identity of the target user; the interaction unit 202 is further configured to: and executing interaction operation matched with the user identity.
Referring to fig. 3, fig. 3 illustrates an exemplary system architecture in which an information processing method of an embodiment of the present disclosure may be applied.
As shown in fig. 3, the system architecture may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 may be the medium used to provide communication links between the terminal devices 301, 302, 303 and the server 305. The network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The terminal devices 301, 302, 303 may interact with the server 305 through the network 304 to receive or send messages and the like. Various client applications, such as web browser applications, search applications, and news applications, may be installed on the terminal devices 301, 302, 303. A client application in the terminal devices 301, 302, 303 may receive a user's instructions and fulfill corresponding functions according to those instructions, for example adding corresponding information to existing information according to the user's instructions.
The terminal devices 301, 302, 303 may be hardware or software. When the terminal devices 301, 302, 303 are hardware, they may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like. When the terminal devices 301, 302, 303 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 305 may be a server providing various services, for example receiving information acquisition requests sent by the terminal devices 301, 302, 303, acquiring display information corresponding to the information acquisition requests in various ways according to the requests, and sending relevant data of the display information to the terminal devices 301, 302, 303.
It should be noted that the information processing method provided by the embodiments of the present disclosure may be executed by a terminal device, and accordingly, the information processing apparatus may be provided in the terminal devices 301, 302, 303. In addition, the information processing method provided by the embodiments of the present disclosure may also be executed by the server 305, and accordingly, the information processing apparatus may be provided in the server 305.
It should be understood that the numbers of terminal devices, networks, and servers in fig. 3 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Referring now to fig. 4, a schematic diagram of a configuration of an electronic device (e.g., a terminal device or server in fig. 3) suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 4 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 4, the electronic device may include a processing means (e.g., a central processor, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage means 408 into a Random Access Memory (RAM) 403. In the RAM403, various programs and data necessary for the operation of the electronic device 400 are also stored. The processing device 401, the ROM402, and the RAM403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
In general, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, magnetic tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 shows an electronic device having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communications device 409, or from storage 408, or from ROM 402. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 401.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the clients, servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol ), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), the internet (e.g., the internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: determining sound characteristics of the target audio information; the target audio information is the audio information of a target user; determining interaction information corresponding to the target user according to the sound characteristics; and executing the interaction operation corresponding to the interaction information.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the unit is not limited to the unit itself in some cases, and the determination unit 201 may also be described as "a unit that determines a sound characteristic of the target audio information", for example.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to in this disclosure is not limited to the specific combinations of the features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example technical solutions formed by replacing the above features with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (14)

1. An information processing method, characterized by comprising:
Determining sound characteristics of the target audio information; the target audio information is the audio information of a target user;
determining interaction information corresponding to the target user according to the sound characteristics;
and executing the interaction operation corresponding to the interaction information.
2. The information processing method according to claim 1, wherein the determining the interaction information corresponding to the target user according to the sound feature includes:
Comparing the sound characteristics with a plurality of preset reference sound characteristics, and determining target preset reference sound characteristics corresponding to the sound characteristics; the preset reference sound features are respectively corresponding to preset interaction information;
and determining preset interaction information corresponding to the target preset reference sound characteristics as interaction information corresponding to the target user.
3. The information processing method according to claim 1, wherein the determining the interaction information corresponding to the target user according to the sound feature includes:
determining the identity information of the target user according to the sound characteristics;
And determining interaction information corresponding to the target user according to the identity information of the target user.
4. The information processing method according to claim 1, wherein the determining the interaction information corresponding to the target user according to the sound feature includes:
Acquiring control information; the control information is the control information of the controller operated by the target user;
And determining interaction information corresponding to the target user according to the sound characteristics and the control information.
5. The information processing method according to claim 4, wherein the determining the interaction information corresponding to the target user based on the sound feature and the control information includes:
Comparing the sound characteristics and the control information with a plurality of preset information groups to determine target preset information groups corresponding to the sound characteristics and the control information; the preset information sets respectively comprise preset sound characteristics and preset control information, and the preset information sets respectively correspond to preset interaction information;
and determining the preset interaction information corresponding to the target preset information group as the interaction information corresponding to the target user.
6. The information processing method according to claim 1, characterized in that the information processing method further comprises:
Determining semantic features corresponding to the target audio information;
The determining the interaction information corresponding to the target user according to the sound characteristics comprises the following steps:
and determining interaction information corresponding to the target user according to the sound characteristics and the semantic characteristics.
7. The information processing method according to claim 6, wherein the determining the interaction information corresponding to the target user according to the sound feature and the semantic feature includes:
Comparing the sound features and the semantic features with a plurality of preset feature groups to determine target preset feature groups corresponding to the sound features and the semantic features; the preset feature groups respectively comprise preset sound features and preset semantic features, and the preset feature groups respectively correspond to preset interaction information;
and determining the preset interaction information corresponding to the target preset feature group as the interaction information corresponding to the target user.
8. The information processing method according to claim 1, wherein the interaction information includes: the target role imitated by the target user; the executing the interactive operation corresponding to the interactive information includes:
and displaying the virtual image corresponding to the target role.
9. The information processing method according to claim 8, characterized in that the information processing method further comprises:
and displaying the identification information corresponding to the target role based on the display position of the virtual image.
10. The information processing method according to claim 1, wherein the interaction information includes: the user identity of the target user; the executing the interactive operation corresponding to the interactive information includes:
and executing interaction operation matched with the user identity.
11. An information processing apparatus, characterized by comprising:
a determining unit configured to determine a sound characteristic of the target audio information; the target audio information is the audio information of a target user;
The determining unit is further used for determining interaction information corresponding to the target user according to the sound characteristics;
And the interaction unit is used for executing the interaction operation corresponding to the interaction information.
12. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
When executed by the one or more processors, causes the one or more processors to implement the method of any of claims 1-10.
13. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1-10.
14. A computer program product comprising computer instructions which, when executed by a processor, implement the method of any of claims 1-10.
CN202211615299.2A 2022-12-15 2022-12-15 Information processing method and device and electronic equipment Pending CN118212935A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211615299.2A CN118212935A (en) 2022-12-15 2022-12-15 Information processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211615299.2A CN118212935A (en) 2022-12-15 2022-12-15 Information processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN118212935A 2024-06-18

Family

ID=91456354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211615299.2A Pending CN118212935A (en) 2022-12-15 2022-12-15 Information processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN118212935A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination