
Sign language interaction method, system, equipment and storage medium

Info

Publication number
CN113516984A
Authority
CN
China
Prior art keywords
information
gesture
semantic
voice
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110462102.5A
Other languages
Chinese (zh)
Inventor
尹崇亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202110462102.5A
Publication of CN113516984A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G10L15/25 Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a sign language interaction method, system, equipment and storage medium, and relates to the technical field of artificial intelligence. The method comprises the following steps: establishing a gesture semantic database; acquiring user gesture information and intercepting it in segments to obtain a plurality of gestures; matching each gesture against the gesture data in the gesture semantic database and outputting the semantic information corresponding to each gesture in sequence; integrating the semantic information corresponding to each gesture to generate target gesture semantic information; acquiring and recognizing user voice information to generate voice semantic recognition information; matching the voice semantic recognition information against the semantic data in the gesture semantic database to obtain corresponding target gesture information; and generating and sending corresponding interaction information to the corresponding user according to the target gesture semantic information and the target gesture information to complete the interaction. By recognizing each gesture individually, the invention ensures recognition accuracy and improves the interaction effect.

Description

Sign language interaction method, system, equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a sign language interaction method, a sign language interaction system, sign language interaction equipment and a storage medium.
Background
Sign language is a language used by deaf-mute people to communicate: hand movements imitate images or spell out syllables to form particular meanings or words. Owing to the complexity of sign language, however, few normal people master it well enough to communicate with deaf-mute people, which hinders communication between the two groups. The traditional remedy is for deaf-mute people to communicate with normal people in writing, but this has clear limitations: written communication is constrained by reading ability, writing ability, and the availability of pen and paper.
With the development of artificial intelligence, more and more such technologies are being applied to everyday life, and language-impaired people have a pressing need to use them to communicate with the rest of society. However, prior-art sign language interaction systems still have problems: when a deaf-mute user signs, the large number of consecutive gestures prevents the system from accurately recognizing the semantics of each individual gesture, so the accuracy is insufficient and a good interaction effect cannot be achieved.
Disclosure of Invention
In order to overcome the above problems, or at least partially solve them, embodiments of the present invention provide a sign language interaction method, system, device, and storage medium that recognize each gesture individually, ensuring recognition accuracy and improving the interaction effect.
The embodiment of the invention is realized by the following steps:
in a first aspect, an embodiment of the present invention provides a sign language interaction method, including the following steps:
acquiring and establishing a gesture semantic database according to the gesture semantic data;
acquiring user gesture information, and intercepting the user gesture information in a segmented manner to obtain a plurality of gestures;
matching each gesture with gesture data in a gesture semantic database respectively, and outputting semantic information corresponding to each gesture in sequence;
semantic information corresponding to each gesture is integrated to generate target gesture semantic information;
acquiring and recognizing user voice information, and generating voice semantic recognition information;
matching the voice semantic recognition information with semantic data in a gesture semantic database to obtain corresponding target gesture information;
and generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information to complete interaction.
To ensure that subsequent sign language interaction can be completed quickly and effectively, a large amount of existing gesture semantic data is first acquired so as to establish a complete gesture semantic database and provide data support. Once the database is established and sign language interaction is needed, the gesture information signed by the user is captured by a camera. Because the gestures in the captured user gesture information are generally continuous, the user gesture information is intercepted in segments to obtain each individual gesture, ensuring that every gesture is recognized accurately; each individual gesture is then matched against the gesture data in the gesture semantic database to find the corresponding gesture semantics. After the semantic information corresponding to each gesture is obtained, all of it is integrated into complete, coherent semantic information, namely the target gesture semantic information, which is converted into corresponding voice information and sent to the normal user. Meanwhile, the voice information of the normal user is acquired and recognized to generate voice semantic recognition information, which is matched against the semantics in the gesture semantic database with a single character or word as the unit of matching; the gesture information corresponding to each matched semantic unit is obtained, the pieces are integrated into complete gesture information, and that gesture information is delivered to the corresponding deaf-mute user as a video.
The method recognizes the deaf-mute user's gestures and the normal user's voice separately, and matches each individual gesture in the gesture information, and each character or word in the voice information, one by one. This ensures recognition accuracy, avoids missed recognitions, and thereby improves both the accuracy and the effect of the interaction.
Based on the first aspect, in some embodiments of the present invention, the method for performing semantic integration on semantic information corresponding to each gesture to generate semantic information of a target gesture includes the following steps:
semantic integration is carried out on the semantic information corresponding to each gesture according to the output sequence to obtain initial gesture semantic information;
and extracting the associated words in the initial gesture semantic information, performing redundancy elimination processing on it according to those associated words, and generating the target gesture semantic information.
Based on the first aspect, in some embodiments of the present invention, the method for generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information includes the following steps:
generating voice playing information according to the target gesture semantic information, and playing the voice playing information to a corresponding user;
and generating sign language information by adopting a preset three-dimensional model according to the target gesture information, and displaying the sign language information to a corresponding user.
Based on the first aspect, in some embodiments of the present invention, the method for acquiring and recognizing the user voice information and generating the voice semantic recognition information includes the following steps:
acquiring user voice information;
extracting the language information from the user voice information, and searching for and acquiring the corresponding voice database;
and matching the voice content in the voice information of the user with the semantics in the corresponding voice database to generate voice semantic recognition information.
In a second aspect, an embodiment of the present invention provides a sign language interaction system, including a data establishing module, a gesture intercepting module, a gesture matching module, a semantic integration module, a voice recognition module, a voice matching module, and an interaction module, where:
the data establishing module is used for acquiring and establishing a gesture semantic database according to the gesture semantic data;
the gesture intercepting module is used for acquiring user gesture information and intercepting it in segments so as to obtain a plurality of gestures;
the gesture matching module is used for matching each gesture with gesture data in the gesture semantic database respectively and outputting semantic information corresponding to each gesture in sequence;
the semantic integration module is used for performing semantic integration on semantic information corresponding to each gesture to generate target gesture semantic information;
the voice recognition module is used for acquiring and recognizing the voice information of the user and generating voice semantic recognition information;
the voice matching module is used for matching the voice semantic recognition information with semantic data in the gesture semantic database to obtain corresponding target gesture information;
and the interaction module is used for generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information so as to complete interaction.
To ensure that subsequent sign language interaction can be completed quickly and effectively, a large amount of existing gesture semantic data is first acquired through the data establishing module so as to establish a complete gesture semantic database and provide data support. Once the database is established and sign language interaction is needed, the gesture information signed by the user is captured by a camera. Because the gestures in the captured user gesture information are generally continuous, the gesture intercepting module intercepts the user gesture information in segments to obtain each individual gesture, ensuring that every gesture is recognized accurately; the gesture matching module then matches each individual gesture against the gesture data in the gesture semantic database to find the corresponding gesture semantics. After the semantic information corresponding to each gesture is obtained, the semantic integration module integrates all of it into complete, coherent semantic information, namely the target gesture semantic information, which the interaction module converts into corresponding voice information and sends to the normal user. Meanwhile, the voice recognition module acquires and recognizes the voice information of the normal user to generate voice semantic recognition information; the voice matching module matches this against the semantics in the gesture semantic database with a single character or word as the unit of matching, obtains the gesture information corresponding to each matched semantic unit, and integrates the pieces into complete gesture information, which the interaction module delivers to the corresponding deaf-mute user as a video.
The system recognizes the deaf-mute user's gestures and the normal user's voice separately, and matches each individual gesture in the gesture information, and each character or word in the voice information, one by one. This ensures recognition accuracy, avoids missed recognitions, and thereby improves both the accuracy and the effect of the interaction.
Based on the second aspect, in some embodiments of the present invention, the semantic integration module includes an initial integration sub-module and a redundancy elimination sub-module, wherein:
the initial integration submodule is used for performing semantic integration on semantic information corresponding to each gesture according to the output sequence to obtain initial gesture semantic information;
and the redundancy elimination submodule is used for extracting the associated words in the initial gesture semantic information and performing redundancy elimination processing on it accordingly, to generate the target gesture semantic information.
Based on the second aspect, in some embodiments of the present invention, the interaction module includes a voice playing sub-module and a gesture displaying sub-module, wherein:
the voice playing submodule is used for generating voice playing information according to the target gesture semantic information and playing the voice playing information to a corresponding user;
and the gesture display submodule is used for generating sign language information by adopting a preset three-dimensional model according to the target gesture information, and displaying the sign language information to the corresponding user.
Based on the second aspect, in some embodiments of the present invention, the voice recognition module includes a voice obtaining sub-module, a language extraction sub-module, and a semantic matching sub-module, where:
the voice acquisition submodule is used for acquiring voice information of a user;
the language extraction submodule is used for extracting the language information from the user voice information, and for searching for and acquiring the corresponding voice database;
and the semantic matching submodule is used for matching the voice content in the voice information of the user with the semantic in the corresponding voice database so as to generate voice semantic identification information.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory for storing one or more programs, and a processor; when the one or more programs are executed by the processor, the method of any one of the first aspect is implemented.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method according to any one of the first aspect described above.
The embodiment of the invention at least has the following advantages or beneficial effects:
the embodiment of the invention provides a sign language interaction method, a system, equipment and a storage medium, and establishes a complete gesture semantic database. Intercepting the gesture information of the user in a segmented manner to obtain each independent gesture, matching each independent gesture with gesture data in a gesture semantic database, and matching and searching corresponding gesture semantics; matching the voice semantic recognition information with semantics in a gesture semantic database, matching by taking a single character or word as a unit to obtain gesture information corresponding to the corresponding semantics, and generating the gesture information to the corresponding deaf-mute user in a video mode. The method and the device respectively identify the gestures and the voices of the deaf-mute user and the normal person user, and match each individual gesture in the gesture information with the characters or words in the voice information one by one, so that the identification accuracy is ensured, the condition of missing identification is avoided, the interaction accuracy is further improved, and the interaction effect is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be regarded as limiting its scope; for those skilled in the art, other related drawings can be derived from these drawings without inventive effort.
FIG. 1 is a flowchart of a sign language interaction method according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a sign language interaction system according to an embodiment of the present invention;
fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Reference numerals: 100. data establishing module; 200. gesture intercepting module; 300. gesture matching module; 400. semantic integration module; 410. initial integration submodule; 420. redundancy elimination submodule; 500. voice recognition module; 510. voice acquisition submodule; 520. language extraction submodule; 530. semantic matching submodule; 600. voice matching module; 700. interaction module; 710. voice playing submodule; 720. gesture display submodule; 101. memory; 102. processor; 103. communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Examples
As shown in fig. 1, in a first aspect, an embodiment of the present invention provides a sign language interaction method, including the following steps:
s1, acquiring and establishing a gesture semantic database according to the gesture semantic data;
in some embodiments of the present invention, in order to ensure that the subsequent sign language interaction can be completed quickly and effectively, a large amount of existing gesture semantic data is obtained first, so as to establish a complete gesture semantic database and provide data support. The gesture semantic database contains gestures, text semantics, and the associations between the two.
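By way of illustration only, the sketch below (Python) shows one way such a database could be organized. The disclosure does not prescribe any storage layout; the class names, the feature-vector representation, and the two-way index are assumptions. The two-way index supports both step S3 (gesture to semantics) and step S6 (semantics to gesture).

```python
from dataclasses import dataclass

# Sketch of a gesture semantic database (all names and the feature
# representation are assumptions; the disclosure only says the database
# links gestures, text semantics, and the association between the two).
@dataclass
class GestureEntry:
    gesture_id: str              # identifier of the stored gesture template
    feature_vector: list[float]  # e.g. normalized hand-keypoint coordinates
    semantics: str               # text meaning associated with the gesture

class GestureSemanticDatabase:
    def __init__(self) -> None:
        self.entries: list[GestureEntry] = []
        # Reverse index: semantics -> entry, used when converting speech
        # back into gestures (step S6).
        self.by_semantics: dict[str, GestureEntry] = {}

    def add(self, entry: GestureEntry) -> None:
        self.entries.append(entry)
        self.by_semantics[entry.semantics] = entry
```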
S2, acquiring user gesture information, and intercepting the user gesture information in a segmented manner to obtain a plurality of gestures;
in some embodiments of the present invention, after the gesture semantic database is established, when sign language interaction is required, the user's gesture information is captured by a camera. Because the gestures in the captured user gesture information are generally continuous, the user gesture information is intercepted in segments to obtain each individual gesture, ensuring that each gesture is recognized accurately and that the subsequent individual matching is more precise.
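The disclosure does not specify how the cut points of the segmented interception are found. A minimal sketch, under the assumption that per-frame hand keypoints are available, is to cut wherever the hands rest for several consecutive frames:

```python
import numpy as np

def segment_gestures(frames: np.ndarray, motion_threshold: float = 0.05,
                     min_pause_frames: int = 5) -> list[np.ndarray]:
    """Split a continuous clip into individual gestures by cutting at
    sustained low-motion pauses (one assumed strategy; the disclosure does
    not specify the segmentation rule). `frames` is assumed to hold
    per-frame hand keypoints with shape (num_frames, num_keypoints, 2)."""
    # Mean keypoint displacement between consecutive frames.
    motion = np.linalg.norm(np.diff(frames, axis=0), axis=-1).mean(axis=-1)
    segments, start, pause = [], 0, 0
    for i, m in enumerate(motion):
        pause = pause + 1 if m < motion_threshold else 0
        if pause == min_pause_frames:  # resting hands close the current gesture
            segments.append(frames[start:i + 1])
            start, pause = i + 1, 0
    if start < len(frames):
        segments.append(frames[start:])
    # Discard fragments too short to be a real gesture.
    return [s for s in segments if len(s) > min_pause_frames]
```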
S3, matching each gesture with gesture data in a gesture semantic database respectively, and sequentially outputting semantic information corresponding to each gesture;
in some embodiments of the present invention, each individual gesture is matched against the gesture data in the gesture semantic database; the corresponding gesture semantics are found by matching and are output sequentially in the input order.
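Matching each segmented gesture could then be a nearest-neighbour search over the stored templates, reusing the GestureSemanticDatabase sketched under step S1. The distance metric and rejection threshold below are illustrative assumptions, as the disclosure states only that a match is found:

```python
import math
from typing import Optional

def match_gesture(gesture_features: list[float],
                  database: GestureSemanticDatabase,
                  max_distance: float = 0.5) -> Optional[str]:
    """Nearest-neighbour lookup of one segmented gesture against the stored
    templates (metric and threshold are illustrative assumptions)."""
    best_entry, best_dist = None, float("inf")
    for entry in database.entries:
        dist = math.dist(gesture_features, entry.feature_vector)
        if dist < best_dist:
            best_entry, best_dist = entry, dist
    # Reject weak matches so an unknown gesture is not forced onto a meaning.
    if best_entry is None or best_dist > max_distance:
        return None
    return best_entry.semantics
```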
S4, performing semantic integration on semantic information corresponding to each gesture to generate target gesture semantic information;
in some embodiments of the invention, after obtaining the semantic information corresponding to each gesture, integrating all the semantic information to form complete and coherent semantic information, namely obtaining the semantic information of the target gesture; the target gesture semantic information comprises semantic content, language and other information.
S5, acquiring and recognizing the voice information of the user, and generating voice semantic recognition information;
in some embodiments of the present invention, in order to ensure the timeliness of the interaction, after the normal user speaks, the user voice information is acquired promptly and recognized, converting the voice information into text semantic information. The user voice information comprises the voice content and the language information.
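The disclosure names no particular speech engine. Purely for illustration, the sketch below uses the open-source SpeechRecognition package (an assumed choice, not part of the disclosure) to capture audio and return the voice content together with the language information that the later steps rely on:

```python
import speech_recognition as sr  # third-party SpeechRecognition package
                                 # (an assumed engine, not from the patent)

def acquire_and_recognize(language: str = "zh-CN") -> dict:
    """Capture microphone audio and return the voice content together with
    its language information, mirroring step S5 (sketch under assumptions)."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # dampen background noise
        audio = recognizer.listen(source)
    # Convert speech to text semantic information.
    text = recognizer.recognize_google(audio, language=language)
    return {"content": text, "language": language}
```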
S6, matching the voice semantic recognition information with semantic data in a gesture semantic database to obtain corresponding target gesture information;
in some embodiments of the present invention, the voice semantic recognition information is matched against the semantics in the gesture semantic database, with a single character or word as the unit of matching, so as to obtain the gesture information corresponding to each matched semantic unit.
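Matching with a single character or word as the unit could look like the following sketch, which segments the recognized text into words (here with the jieba segmenter, an assumed choice) and falls back to character-level lookup for words absent from the gesture semantic database:

```python
import jieba  # third-party Chinese word segmenter (an assumed choice)

def text_to_gestures(text: str,
                     database: GestureSemanticDatabase) -> list[str]:
    """Map recognized speech to stored gesture identifiers word by word,
    falling back to single characters for out-of-vocabulary words (step S6)."""
    gesture_ids: list[str] = []
    for word in jieba.cut(text):
        entry = database.by_semantics.get(word)
        if entry is not None:
            gesture_ids.append(entry.gesture_id)
            continue
        # Character-level fallback keeps unmatched words from being dropped.
        for char in word:
            char_entry = database.by_semantics.get(char)
            if char_entry is not None:
                gesture_ids.append(char_entry.gesture_id)
    return gesture_ids
```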
S7, generating and sending corresponding interaction information to the corresponding user according to the target gesture semantic information and the target gesture information, to complete the interaction.
In some embodiments of the present invention, the target gesture semantic information is converted into corresponding voice information and sent to the normal user, while the obtained gesture information is integrated into complete gesture information and delivered to the corresponding deaf-mute user as a video. The method recognizes the deaf-mute user's gestures and the normal user's voice separately, and matches each individual gesture in the gesture information, and each character or word in the voice information, one by one. This ensures recognition accuracy, avoids missed recognitions, and thereby improves both the accuracy and the effect of the interaction.
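A minimal dispatch for this step might look as follows; tts_engine and sign_renderer are hypothetical interfaces standing in for the text-to-speech output and the preset three-dimensional sign model, neither of which the disclosure specifies at the code level:

```python
def complete_interaction(target_gesture_semantics: str,
                         target_gesture_ids: list[str],
                         tts_engine, sign_renderer) -> None:
    """Step S7: deliver each side's message in the form the other side needs.
    `tts_engine` and `sign_renderer` are hypothetical interfaces, not part
    of the disclosure."""
    # The hearing user receives the signer's meaning as played-back speech.
    tts_engine.speak(target_gesture_semantics)
    # The deaf-mute user receives the spoken content as a rendered sign
    # video produced by the preset three-dimensional model.
    video = sign_renderer.render(target_gesture_ids)
    sign_renderer.play(video)
```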
Based on the first aspect, in some embodiments of the present invention, the method for performing semantic integration on semantic information corresponding to each gesture to generate semantic information of a target gesture includes the following steps:
semantic integration is carried out on the semantic information corresponding to each gesture according to the output sequence to obtain initial gesture semantic information;
and extracting the associated words in the initial gesture semantic information, performing redundancy elimination processing on it according to those associated words, and generating the target gesture semantic information.
Because the information obtained after matching each gesture individually is fragmentary, the semantic information corresponding to each gesture must be semantically integrated in output order to form a coherent sentence. The result may still contain unnecessary redundant information or interference information, so the target words can be screened according to the associated words in the sentence and the redundant data removed, yielding accurate target gesture semantic information.
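Since the disclosure leaves the screening rule open, the sketch below shows one simple interpretation: concatenate the per-gesture meanings in output order, then drop immediate repetitions while preserving the associated (connector) words used for screening. The connector-word set is an assumed input:

```python
def integrate_semantics(per_gesture_semantics: list[str],
                        connector_words: set[str]) -> str:
    """Join per-gesture meanings in output order, then remove redundancy
    (one assumed screening rule; the disclosure leaves the rule open)."""
    # Initial gesture semantic information: concatenation in output order.
    tokens = [t for t in per_gesture_semantics if t]
    deduped: list[str] = []
    for tok in tokens:
        # Keep associated (connector) words; otherwise drop immediate
        # repeats, which arise when one meaning spans several gestures.
        if tok in connector_words or not deduped or deduped[-1] != tok:
            deduped.append(tok)
    return "".join(deduped)  # Chinese text joins without spaces
```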
Based on the first aspect, in some embodiments of the present invention, the method for generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information includes the following steps:
generating voice playing information according to the target gesture semantic information, and playing the voice playing information to a corresponding user;
and generating sign language information by adopting a preset three-dimensional model according to the target gesture information, and displaying the sign language information to a corresponding user.
In order to ensure the accuracy of the transmitted interaction information and to suit different groups of users, corresponding voice playing information is generated and played to the normal user as speech, while sign language information is displayed to the corresponding deaf-mute user through gestures simulated by the three-dimensional model.
Based on the first aspect, in some embodiments of the present invention, the method for acquiring and recognizing the user voice information and generating the voice semantic recognition information includes the following steps:
acquiring user voice information;
extracting the language information from the user voice information, and searching for and acquiring the corresponding voice database;
and matching the voice content in the voice information of the user with the semantics in the corresponding voice database to generate voice semantic recognition information.
In order to meet the semantic conversion requirements of different languages and to improve data matching efficiency, the corresponding voice semantic information is looked up in the voice database that matches the speaker's language, and the corresponding voice semantic recognition information is then generated.
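Routing recognition to a per-language database could be as simple as a lookup table; the database names and the Mandarin fallback below are assumptions made for illustration:

```python
# Per-language voice databases (file names and the fallback are assumptions).
VOICE_DATABASES = {
    "zh-CN": "mandarin_semantics.db",
    "en-US": "english_semantics.db",
}

def select_voice_database(language: str) -> str:
    """Pick the voice database matching the speaker's language; the language
    extraction of step S5 feeds this lookup."""
    return VOICE_DATABASES.get(language, VOICE_DATABASES["zh-CN"])
```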
As shown in fig. 2, in a second aspect, an embodiment of the present invention provides a sign language interaction system, which includes a data establishing module 100, a gesture intercepting module 200, a gesture matching module 300, a semantic integration module 400, a speech recognition module 500, a speech matching module 600, and an interaction module 700, where:
the data establishing module 100 is used for acquiring and establishing a gesture semantic database according to the gesture semantic data;
the gesture intercepting module 200 is configured to acquire user gesture information and intercept it in segments so as to obtain a plurality of gestures;
the gesture matching module 300 is configured to match each gesture with gesture data in a gesture semantic database, and sequentially output semantic information corresponding to each gesture;
the semantic integration module 400 is used for performing semantic integration on the semantic information corresponding to each gesture to generate target gesture semantic information;
the voice recognition module 500 is used for acquiring and recognizing the voice information of the user and generating voice semantic recognition information;
the voice matching module 600 is configured to match the voice semantic recognition information with semantic data in the gesture semantic database to obtain corresponding target gesture information;
and the interaction module 700 is configured to generate and send corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information, so as to complete interaction.
To ensure that subsequent sign language interaction can be completed quickly and effectively, a large amount of existing gesture semantic data is first acquired through the data establishing module 100, so as to establish a complete gesture semantic database and provide data support. Once the database is established and sign language interaction is needed, the gesture information signed by the user is captured by a camera. Because the gestures in the captured user gesture information are generally continuous, the gesture intercepting module 200 intercepts the user gesture information in segments to obtain each individual gesture, ensuring that every gesture is recognized accurately; the gesture matching module 300 then matches each individual gesture against the gesture data in the gesture semantic database to find the corresponding gesture semantics. After the semantic information corresponding to each gesture is obtained, the semantic integration module 400 integrates all of it into complete, coherent semantic information, namely the target gesture semantic information, which the interaction module 700 converts into corresponding voice information and sends to the normal user. Meanwhile, the voice recognition module 500 acquires and recognizes the voice information of the normal user to generate voice semantic recognition information; the voice matching module 600 matches this against the semantics in the gesture semantic database with a single character or word as the unit of matching, obtains the gesture information corresponding to each matched semantic unit, and integrates the pieces into complete gesture information, which the interaction module 700 delivers to the corresponding deaf-mute user as a video.
The system recognizes the deaf-mute user's gestures and the normal user's voice separately, and matches each individual gesture in the gesture information, and each character or word in the voice information, one by one. This ensures recognition accuracy, avoids missed recognitions, and thereby improves both the accuracy and the effect of the interaction.
As shown in fig. 2, according to the second aspect, in some embodiments of the present invention, the semantic integration module 400 includes an initial integration sub-module 410 and a redundancy elimination sub-module 420, wherein:
the initial integration sub-module 410 is configured to perform semantic integration on semantic information corresponding to each gesture according to the output sequence to obtain initial gesture semantic information;
and the redundancy elimination submodule 420 is configured to extract the associated words in the initial gesture semantic information and to perform redundancy elimination processing on it accordingly, generating the target gesture semantic information.
Because the information obtained after matching each gesture individually is fragmentary, the initial integration submodule 410 semantically integrates the semantic information corresponding to each gesture in output order to form a coherent sentence. The result may still contain unnecessary redundant information or interference information, so the redundancy elimination submodule 420 screens the target words according to the associated words in the sentence and removes the redundant data, yielding accurate target gesture semantic information.
As shown in fig. 2, according to the second aspect, in some embodiments of the present invention, the interaction module 700 includes a voice playing sub-module 710 and a gesture showing sub-module 720, wherein:
the voice playing sub-module 710 is configured to generate voice playing information according to the target gesture semantic information, and play the voice playing information to a corresponding user;
and the gesture display submodule 720 is configured to generate sign language information by adopting a preset three-dimensional model according to the target gesture information, and to display the sign language information to the corresponding user.
In order to ensure the accuracy of the transmitted interaction information and to suit different groups of users, the voice playing submodule 710 generates corresponding voice playing information and plays it to the normal user, while the gesture display submodule 720 simulates gestures with the three-dimensional model and displays the sign language information to the corresponding deaf-mute user.
As shown in fig. 2, according to the second aspect, in some embodiments of the present invention, the voice recognition module 500 includes a voice obtaining sub-module 510, a language extracting sub-module 520, and a semantic matching sub-module 530, wherein:
a voice obtaining sub-module 510, configured to obtain user voice information;
the language extraction submodule 520 is used for extracting and searching and acquiring a corresponding voice database according to language information in the voice information of the user;
and the semantic matching submodule 530 is used for matching the voice content in the voice information of the user with the semantic in the corresponding voice database to generate voice semantic identification information.
In order to meet the semantic conversion requirements of different languages and improve the data matching efficiency, corresponding voice semantic information is searched in a corresponding voice database according to the difference of the languages, and then corresponding voice semantic identification information is generated.
As shown in fig. 3, in a third aspect, an embodiment of the present application provides an electronic device, which includes a memory 101 for storing one or more programs, and a processor 102; when the one or more programs are executed by the processor 102, the method of any one of the first aspect is implemented.
Also included is a communication interface 103, and the memory 101, processor 102 and communication interface 103 are electrically connected to each other, directly or indirectly, to enable transfer or interaction of data. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 101 may be used to store software programs and modules, and the processor 102 executes the software programs and modules stored in the memory 101 to thereby execute various functional applications and data processing. The communication interface 103 may be used for communicating signaling or data with other node devices.
The memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capability. It may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In the embodiments provided in the present application, it should be understood that the disclosed method and system may be implemented in other ways. The embodiments described above are merely illustrative; for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in a block may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks therein, can be implemented by special-purpose hardware-based systems which perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the method according to any one of the first aspect described above. If implemented in the form of software functional modules and sold or used as a stand-alone product, the functions may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portions of it that substantially contribute over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. A sign language interaction method is characterized by comprising the following steps:
acquiring and establishing a gesture semantic database according to the gesture semantic data;
acquiring user gesture information, and intercepting the user gesture information in a segmented manner to obtain a plurality of gestures;
matching each gesture with gesture data in a gesture semantic database respectively, and outputting semantic information corresponding to each gesture in sequence;
semantic information corresponding to each gesture is integrated to generate target gesture semantic information;
acquiring and recognizing user voice information, and generating voice semantic recognition information;
matching the voice semantic recognition information with semantic data in a gesture semantic database to obtain corresponding target gesture information;
and generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information to complete interaction.
2. The sign language interaction method according to claim 1, wherein the method for performing semantic integration on semantic information corresponding to each gesture to generate semantic information of the target gesture comprises the following steps:
semantic integration is carried out on the semantic information corresponding to each gesture according to the output sequence to obtain initial gesture semantic information;
and extracting the associated words in the initial gesture semantic information, performing redundancy elimination processing on it according to those associated words, and generating the target gesture semantic information.
3. The sign language interaction method of claim 1, wherein the method for generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information comprises the following steps:
generating voice playing information according to the target gesture semantic information, and playing the voice playing information to a corresponding user;
and generating sign language information by adopting a preset three-dimensional model according to the target gesture information, and displaying the sign language information to a corresponding user.
4. A sign language interaction method according to claim 1, wherein the method for acquiring and recognizing the user voice information and generating the voice semantic recognition information comprises the following steps:
acquiring user voice information;
extracting the language information from the user voice information, and searching for and acquiring the corresponding voice database;
and matching the voice content in the voice information of the user with the semantics in the corresponding voice database to generate voice semantic recognition information.
5. The sign language interaction system is characterized by comprising a data establishing module, a gesture intercepting module, a gesture matching module, a semantic integration module, a voice recognition module, a voice matching module and an interaction module, wherein:
the data establishing module is used for acquiring and establishing a gesture semantic database according to the gesture semantic data;
the gesture intercepting module is used for acquiring user gesture information and intercepting it in segments so as to obtain a plurality of gestures;
the gesture matching module is used for matching each gesture with gesture data in the gesture semantic database respectively and outputting semantic information corresponding to each gesture in sequence;
the semantic integration module is used for performing semantic integration on semantic information corresponding to each gesture to generate target gesture semantic information;
the voice recognition module is used for acquiring and recognizing the voice information of the user and generating voice semantic recognition information;
the voice matching module is used for matching the voice semantic recognition information with semantic data in the gesture semantic database to obtain corresponding target gesture information;
and the interaction module is used for generating and sending corresponding interaction information to a corresponding user according to the target gesture semantic information and the target gesture information so as to complete interaction.
6. The sign language interactive system of claim 5, wherein the semantic integration module comprises an initial integration sub-module and a redundancy elimination sub-module, wherein:
the initial integration submodule is used for performing semantic integration on semantic information corresponding to each gesture according to the output sequence to obtain initial gesture semantic information;
and the redundancy elimination submodule is used for extracting the associated words in the initial gesture semantic information and performing redundancy elimination processing on it accordingly, to generate the target gesture semantic information.
7. The sign language interaction system of claim 5, wherein the interaction module comprises a voice playing sub-module and a gesture displaying sub-module, wherein:
the voice playing submodule is used for generating voice playing information according to the target gesture semantic information and playing the voice playing information to a corresponding user;
and the gesture display submodule is used for generating sign language information by adopting a preset three-dimensional model according to the target gesture information, and displaying the sign language information to the corresponding user.
8. The sign language interactive system of claim 5, wherein the voice recognition module comprises a voice acquisition sub-module, a language extraction sub-module and a semantic matching sub-module, wherein:
the voice acquisition submodule is used for acquiring voice information of a user;
the language extraction submodule is used for extracting the language information from the user voice information and for searching for and acquiring the corresponding voice database;
and the semantic matching submodule is used for matching the voice content in the voice information of the user with the semantic in the corresponding voice database so as to generate voice semantic identification information.
9. An electronic device, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-4.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-4.
Application CN202110462102.5A, priority/filing date 2021-04-27, published as CN113516984A (status: withdrawn): Sign language interaction method, system, equipment and storage medium.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110462102.5A CN113516984A (en) 2021-04-27 2021-04-27 Sign language interaction method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110462102.5A CN113516984A (en) 2021-04-27 2021-04-27 Sign language interaction method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113516984A 2021-10-19

Family

ID=78063761

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110462102.5A Withdrawn CN113516984A (en) 2021-04-27 2021-04-27 Sign language interaction method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113516984A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898468A (en) * 2022-05-26 2022-08-12 平安普惠企业管理有限公司 Sign language translation method and device, computer equipment and storage medium


Similar Documents

Publication Publication Date Title
US20220198516A1 (en) Data recommendation method and apparatus, computer device, and storage medium
CN106776544B (en) Character relation recognition method and device and word segmentation method
CN107291783B (en) Semantic matching method and intelligent equipment
CN111625635A (en) Question-answer processing method, language model training method, device, equipment and storage medium
CN110263248B (en) Information pushing method, device, storage medium and server
CN110442710B (en) Short text semantic understanding and accurate matching method and device based on knowledge graph
CN109918676B (en) Method and device for detecting intention regular expression and terminal equipment
US8868609B2 (en) Tagging method and apparatus based on structured data set
CN111538816B (en) Question-answering method, device, electronic equipment and medium based on AI identification
CN111783471B (en) Semantic recognition method, device, equipment and storage medium for natural language
CN111339250A (en) Mining method of new category label, electronic equipment and computer readable medium
CN109033282A (en) A kind of Web page text extracting method and device based on extraction template
CN110991149A (en) Multi-mode entity linking method and entity linking system
CN114387061A (en) Product pushing method and device, electronic equipment and readable storage medium
CN108304387B (en) Method, device, server group and storage medium for recognizing noise words in text
CN113255331B (en) Text error correction method, device and storage medium
CN113516984A (en) Sign language interaction method, system, equipment and storage medium
CN117932022A (en) Intelligent question-answering method and device, electronic equipment and storage medium
CN113254814A (en) Network course video labeling method and device, electronic equipment and medium
CN110309355A (en) Generation method, device, equipment and the storage medium of content tab
JP6942759B2 (en) Information processing equipment, programs and information processing methods
CN110008314B (en) Intention analysis method and device
CN116701636A (en) Data classification method, device, equipment and storage medium
CN116306506A (en) Intelligent mail template method based on content identification
CN112241463A (en) Search method based on fusion of text semantics and picture information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211019