CN112995572A - Remote conference system and physical display method in remote conference
- Publication number: CN112995572A
- Application number: CN202110438782.7A
- Authority
- CN
- China
- Prior art keywords
- image
- real object
- host
- end equipment
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/111—Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
- H04N13/117—Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/275—Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
- H04N13/279—Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention relates to a remote conference system and a real object display method in a remote conference. The system comprises near-end equipment and far-end equipment, each comprising a host and peripherals; the peripherals comprise a camera module, a microphone and an adjusting device. The camera module is connected with the adjusting device and is used for acquiring a 3D virtual image of a real object and user gestures; the host recognizes user gestures and forms instructions to control the 3D virtual image, and also recognizes voice and forms instructions to control the adjusting device. The 3D image acquisition device comprises a 3D scanner or a 3D depth perception camera. The system allows the whole conference or teaching activity to proceed smoothly and improves communication efficiency: there is no need to operate inconvenient peripherals such as a keyboard or mouse, repeated interruption of teaching explanations or the conference program is avoided, and remote conference or remote teaching activities run more smoothly and efficiently.
Description
Technical Field
The invention relates to the technical field of teleconferencing, in particular to a teleconferencing system and a real object display method in a teleconference.
Background
When a teleconference is carried out, communication generally takes place by video. When a real object must be displayed as a three-dimensional figure, the presenter often needs to adjust the camera and rotate or flip the object so that users at the other, far-end devices can observe it conveniently, which is very inconvenient.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a remote conference system and a method for displaying a real object in a remote conference, aiming at the above-mentioned defects in the prior art.
The technical scheme adopted by the invention for solving the technical problems is as follows:
constructing a remote conference system, which comprises a near-end device and one or more far-end devices remotely connected with it, wherein the near-end device and the far-end devices each comprise a host and a peripheral connected with the host;
the peripheral comprises a display module, a camera module, a pickup module, a microphone, a loudspeaker and an adjusting device; wherein:
the camera module is arranged on the adjusting device and comprises a 3D image acquisition device and a depth lens, and the 3D image acquisition device is used for acquiring a 3D virtual image of a real object;
the host acquires user gestures through the depth lens, recognizes the user gestures and controls the 3D virtual image through a command corresponding to the gestures; the host computer also obtains voice information through the microphone, recognizes voice and controls the adjusting device through a command corresponding to the voice;
the 3D image acquisition device comprises a 3D scanner or a 3D depth perception camera.
Preferably, the host is further used for remote audio-video adjustment. Specifically, the host of the near-end device outputs audio and video to the peripheral of the far-end device; the peripheral of the far-end device displays the image through the display module and plays sound through the loudspeaker; the pickup module then collects the sound played by the loudspeaker to obtain sound data, and the camera module captures the video shown by the display module to obtain video data; the video data and sound data are sent back to the host of the near-end device, and a worker debugs the display module and loudspeaker of the far-end device according to them.
Preferably, the host is further configured to record the conference: the voice acquired by the microphone is stored as an audio file, the audio file is converted into a text file through voice recognition, the speaker of each voice segment is recognized through voiceprint recognition, and the corresponding speaker information is marked on the text converted from each segment.
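As an illustration of this meeting-record feature, the sketch below assembles speaker-labelled transcript lines from already-recognized segments. The `Segment` structure and `label_transcript` helper are hypothetical names; the voice recognition and voiceprint recognition engines themselves are assumed to run upstream and supply the fields.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start_s: float    # segment start time in seconds
    end_s: float      # segment end time
    text: str         # text produced by voice recognition
    speaker_id: str   # identity produced by voiceprint recognition

def label_transcript(segments, speaker_names):
    """Render each recognized segment with its speaker's name,
    falling back to the raw voiceprint id when no name is known."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        name = speaker_names.get(seg.speaker_id, seg.speaker_id)
        lines.append(f"[{seg.start_s:07.1f}] {name}: {seg.text}")
    return "\n".join(lines)
```

A transcript rendered this way keeps the speaker labels next to the text converted from each segment, as the paragraph above requires.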
A real object display method in a teleconference is based on the teleconference system and comprises the following steps:
the method comprises the following steps: when the real object is to be displayed, a user of the near-end equipment sends an instruction through voice, the host of the near-end equipment acquires the voice through the microphone, the voice of the user is recognized, the adjusting device is controlled through the instruction corresponding to the voice to adjust the position of the camera module, the real object image information is acquired through the camera module and is transmitted to the far-end equipment in real time;
Step two: when a 3D virtual image of a three-dimensional real object is to be displayed, picture data of the real object is acquired through the 3D scanner or the 3D depth perception camera. When the position or angle of the camera module needs to be adjusted, a user of the near-end device issues an instruction by voice; the host of the near-end device acquires the voice through the microphone, recognizes it, and controls the adjusting device through the corresponding instruction to adjust the position of the camera module. While the user manually turns the real object, the camera module follows its direction and angle, graphic information of the three-dimensional real object is acquired from every angle, and this graphic information is edited to form a virtual three-dimensional image of the real object;
Step three: when the 3D virtual image of the real object is not clear, execute step one;
step four: when a 3D virtual image of a real object is to be controlled, the host machine acquires a user gesture through the depth lens or the 3D depth perception camera, recognizes the user gesture and controls the 3D virtual image through a command corresponding to the gesture;
Step five: when the planar image of a planar real object is to be displayed, the host of the near-end device acquires the planar display image of the object through the camera module.
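The five steps above reduce to routing recognized voice commands to the adjusting device and recognized gestures to the 3D virtual image. A minimal dispatch sketch follows; the command and gesture names are invented for illustration, since the patent does not enumerate them:

```python
def make_dispatcher():
    """Return (voice handler, gesture handler) sharing one state dict.
    Voice drives the adjusting device; gestures drive the 3D image."""
    state = {"pan": 0, "tilt": 0, "zoom": 1.0, "rotation": 0}

    def on_voice(command):
        # Hypothetical voice commands controlling camera position.
        if command == "pan left":
            state["pan"] -= 10
        elif command == "pan right":
            state["pan"] += 10
        elif command == "zoom in":
            state["zoom"] *= 1.25
        return state

    def on_gesture(gesture):
        # Hypothetical gestures rotating the 3D virtual image.
        if gesture == "swipe_left":
            state["rotation"] = (state["rotation"] - 30) % 360
        elif gesture == "swipe_right":
            state["rotation"] = (state["rotation"] + 30) % 360
        return state

    return on_voice, on_gesture
```

In a real system the recognizers would emit richer events and the state changes would be transmitted to the far-end devices in real time.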
Preferably, when the 3D image acquisition device employs a 3D depth perception camera:
if the displayed real object is three-dimensional, the host acquires a 3D virtual image of it through the 3D depth perception camera, acquires the user's gesture through the depth lens, and controls the state of the 3D virtual image through the corresponding instruction;
if the actual real object needs to be displayed remotely and its angle is turned over by hand, the host acquires a voice instruction and controls the adjusting device to adjust the position of the 3D depth perception camera; combined with the user's manual turning of the real object, detailed structural information of the object is displayed;
when the display module displays the 3D virtual image and the real image of the real object simultaneously in a split screen mode, the host has the function of controlling the real object display through voice and controlling the 3D virtual image through gestures.
Preferably, when the object to be displayed is a planar paper form document, the host acquires an image of the paper form document through the camera module;
the host scans the fields of the image, calls up the corresponding prefabricated template file according to the first-row and/or first-column fields of the scanned form, fills the scanned fields into the template file to form a form file, and sends the form file to all far-end devices;
when the near-end equipment needs to display the table, the host of the near-end equipment opens the table file, the host of the far-end equipment synchronously opens the table file, the host of the near-end equipment acquires a control instruction of a user on the table and sends the control instruction to the host of the far-end equipment, and the far-end equipment and the near-end equipment update the table display state in real time according to the instruction.
Preferably, when the table image is scanned, the maximum bounding box is found first, a plurality of bounding boxes in the maximum bounding box are positioned, the rows and columns of the table are determined, and fields of the rows and columns of the table are scanned; calling out a prefabricated template, matching the image form with the prefabricated template according to the scanned form row and column information and the first row and/or first column character information, determining the form type, and newly building a corresponding template form file according to the determined form type; and filling the identified fields of the rest rows and columns into corresponding areas of the template format file, storing the file, and simultaneously respectively sending the file to all remote equipment.
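The row/column determination described above can be sketched by clustering the coordinates of the located cell bounding boxes. The `(x, y, w, h)` tuple representation is an assumption; the patent does not fix one:

```python
def cluster_coords(values, tol=5):
    """Group 1-D coordinates lying within `tol` pixels of each other;
    each group corresponds to one row (or column) line of the table."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]

def table_shape(cell_boxes, tol=5):
    """cell_boxes: (x, y, w, h) of each bounding box located inside the
    maximum bounding box. Returns (number of rows, number of columns)."""
    rows = cluster_coords([y for _, y, _, _ in cell_boxes], tol)
    cols = cluster_coords([x for x, _, _, _ in cell_boxes], tol)
    return len(rows), len(cols)
```

Once the rows and columns are known, the first-row/first-column cells can be OCR-scanned and matched against the prefabricated templates as the paragraph describes.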
Preferably, when the form image is scanned and needs to be corrected, the longest straight line in the image is found first; it is then determined whether this line is closer to the horizontal or to the vertical, the included angle between the line and the horizontal/vertical is found, and the inclination angle of the image is determined, so that the angle of the image can be corrected by rotation.
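A sketch of this inclination-angle step, assuming the longest line is already available as two endpoints (e.g. from a Hough transform, which is not specified in the patent): decide whether the line is nearer the horizontal or the vertical, then return the small rotation that would square it up.

```python
import math

def skew_angle(p1, p2):
    """Rotation (degrees) that makes the longest detected line exactly
    horizontal or vertical, whichever it is already nearer to."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    angle = math.degrees(math.atan2(dy, dx)) % 180  # line direction, 0..180
    if min(angle, 180 - angle) <= abs(angle - 90):
        # near-horizontal: rotate away the deviation from 0 (or 180)
        return -angle if angle <= 90 else 180 - angle
    # near-vertical: rotate away the deviation from 90
    return 90 - angle
```

For example, a table edge running from (0, 0) to (100, 3) yields a correction of about -1.7 degrees.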
Preferably, when the host acquires a plurality of image files, scanning header text information outside the maximum bounding box, if it is detected that the header text information is consistent and the templates are the same, performing field scanning on the areas to be identified of the image files with the same template, and sequentially filling the scanned fields into files with the same template format according to the receiving sequence and storing the files.
Preferably, the target remote device obtains a table display instruction and a table editing instruction, sends the table display instruction and the table editing instruction to the other remote devices through the host of the near-end device, the remote device updates the table according to the table display instruction and the table editing instruction, and after confirming that the other remote devices are all updated, the target remote device updates the table according to the table display instruction and the table editing instruction.
The beneficial effects of the invention are: by acquiring voice information, recognizing the voice and forming control instructions, the adjusting device adjusts the position of the camera module; combined with the user manually rotating the real object, images of the object from different angles are obtained and transmitted to the far-end devices in real time, which is convenient for explanation.
The 3D virtual image of the real object can be obtained through the cooperation of the adjusting device and the 3D depth perception camera and transmitted to the far-end devices. After the 3D virtual image is obtained, the 3D depth perception camera can also recognize the user's gesture and control the 3D virtual image through the corresponding instruction, with changes to the image transmitted synchronously to the far-end devices. This reduces the interference of the presenter's fingers with the real object image, so that users of the far-end devices can observe the object clearly and carefully.
The 3D depth perception camera can itself intelligently track the real object or the presenter; combined with voice control of the adjusting device to adjust the camera's position and angle, conference or teaching activities can be carried out more efficiently.
In a remote conference or remote teaching environment, controlling the 3D virtual image through gestures and the position and angle of the camera device through voice lets the whole conference or teaching activity proceed smoothly and improves communication efficiency. With this system and method there is no need to operate inconvenient peripherals such as a keyboard or mouse, repeated interruption of teaching explanations or the conference program is avoided, and remote conference or remote teaching activities run more smoothly and efficiently.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the present invention will be further described with reference to the accompanying drawings and embodiments, wherein the drawings in the following description are only part of the embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts:
FIG. 1 is a schematic diagram of a teleconferencing system in accordance with a preferred embodiment of the present invention;
fig. 2 is a flowchart of a method for displaying a real object in a teleconference according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the following will clearly and completely describe the technical solutions in the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without inventive step, are within the scope of the present invention.
As shown in fig. 1, the teleconference system in the preferred embodiment of the present invention includes a near-end device and one or more far-end devices remotely connected to it, where the near-end device and the far-end devices each include a host 1 and a peripheral 2 connected to the host 1;
the peripheral 2 comprises a display module 21, a camera module 22, a sound pickup module 23, a microphone 24, a loudspeaker 25 and an adjusting device 26; wherein:
the camera module 22 is arranged on the adjusting device 26, the camera module 22 comprises a 3D image acquiring device 221 and a depth lens 222, and the 3D image acquiring device 221 is used for acquiring a 3D virtual image of a real object;
the host 1 acquires the user gesture through the depth lens 222, recognizes the user gesture, and controls the 3D virtual image through an instruction corresponding to the gesture; the host 1 also acquires voice information through the microphone 24, recognizes the voice and controls the adjusting device 26 through an instruction corresponding to the voice;
the 3D image acquisition device 221 includes a 3D scanner or a 3D depth perception camera.
The 3D virtual image of the real object can be obtained through the cooperation of the adjusting device and the 3D depth perception camera and transmitted to the far-end devices. After the 3D virtual image is obtained, the 3D depth perception camera can recognize the user's gesture and control the 3D virtual image through the corresponding instruction, with changes to the image transmitted synchronously to the far-end devices. This reduces the interference of the presenter's fingers when the real object is observed, so that users of the far-end devices can observe the object more clearly and carefully.
The 3D depth perception camera can itself intelligently track the real object or the presenter; combined with voice control of the adjusting device to adjust the camera's position and angle, conference or teaching activities can be carried out more efficiently.
In a remote conference or remote teaching environment, controlling the 3D virtual image through gestures and the position and angle of the camera device through voice lets the whole conference or teaching activity proceed smoothly and improves communication efficiency. With this system and method there is no need to operate inconvenient peripherals such as a keyboard or mouse, repeated interruption of teaching explanations or the conference program is avoided, and remote conference or remote teaching activities run more smoothly and efficiently.
As shown in fig. 1, the host 1 is further configured for remote audio-video adjustment. Specifically, the host 1 of the near-end device outputs audio and video to the peripheral 2 of the far-end device; the peripheral 2 of the far-end device displays images through the display module 21 and plays sound through the loudspeaker 25; the sound pickup module 23 collects the sound played by the loudspeaker 25 to obtain sound data, and the camera module 22 captures the video shown by the display module 21 to obtain video data; the video data and sound data are sent to the host 1 of the near-end device, and a worker debugs the display module 21 and the loudspeaker 25 of the far-end device according to the video data and sound data;
The host can also debug automatically: the video data and sound data transmitted to the host 1 are analyzed for image information such as definition and noise and for sound information such as decibel level and clarity, and the image and sound are adjusted to within preset ranges. The sound pickup module 23 comprises a plurality of sound pickups evenly distributed on the conference table; the camera module 22 comprises a plurality of cameras that can collect sound and image information of the display module 21 from each position in the meeting room and from different angles, so that after the display module 21 and the loudspeaker 25 are debugged, all personnel in the meeting room can hear and see clearly.
As shown in fig. 2, the host 1 is further configured to record the conference: the voice acquired through the microphone 24 is stored as an audio file, the audio file is converted into a text file through voice recognition, the speaker of each voice segment is recognized through voiceprint recognition, and the corresponding speaker information is labeled on the text converted from each segment.
As shown in fig. 2, the method for displaying a real object in a teleconference according to the preferred embodiment of the present invention, based on the previous embodiment, includes the following steps:
the method comprises the following steps: when the real object is to be displayed, a user of the near-end equipment sends an instruction through voice, the host 1 of the near-end equipment acquires the voice through the microphone 24, recognizes the voice of the user, controls the adjusting device 26 to adjust the position of the camera module 22 through the instruction corresponding to the voice, acquires real object image information through the camera module 22, and transmits the real object image information to the far-end equipment in real time;
Step two: when a 3D virtual image of a three-dimensional real object is to be displayed, picture data of the real object is acquired through the 3D scanner or the 3D depth perception camera. When the position or angle of the camera module 22 needs to be adjusted, a user of the near-end device issues an instruction by voice; the host 1 of the near-end device acquires the voice through the microphone 24, recognizes it, and controls the adjusting device 26 through the corresponding instruction to adjust the position of the camera module 22. While the user manually turns the real object, the camera module 22 follows its direction and angle, graphic information of the three-dimensional real object is acquired from every angle, and this graphic information is edited to form a virtual three-dimensional image of the real object;
Step three: when the 3D virtual image of the real object is not clear, execute step one;
step four: when a 3D virtual image of a real object is to be controlled, the host 1 acquires a user gesture through the depth lens 222 or the 3D depth perception camera, recognizes the user gesture, and controls the 3D virtual image through an instruction corresponding to the gesture;
step five: when the planar image of the planar real object is to be displayed, the host 1 of the near-end device acquires the planar display image of the planar real object through the camera module 22.
The 3D depth perception camera can itself intelligently track the real object or the presenter; combined with voice control of the adjusting device to adjust the camera's position and angle, conference or teaching activities can be carried out more efficiently.
In a remote conference or remote teaching environment, controlling the 3D virtual image through gestures and the position and angle of the camera device through voice lets the whole conference or teaching activity proceed smoothly and improves communication efficiency. With this system and method there is no need to operate inconvenient peripherals such as a keyboard or mouse, repeated interruption of teaching explanations or the conference program is avoided, and remote conference or remote teaching activities run more smoothly and efficiently.
As shown in fig. 2, when the 3D image acquisition device 221 employs a 3D depth perception camera:
if the displayed real object is three-dimensional, the host 1 acquires a 3D virtual image of it through the 3D depth perception camera, acquires the user's gesture through the depth lens 222, and controls the state of the 3D virtual image through the corresponding instruction;
if the actual real object needs to be displayed remotely and its angle is turned over by hand, the host 1 acquires a voice instruction and controls the adjusting device 26 to adjust the position of the 3D depth perception camera; combined with the user's manual turning of the real object, detailed structural information of the object is displayed;
when the display module 21 displays the 3D virtual image and the real image of the real object simultaneously in a split screen manner, the host has the function of controlling the real object display by voice and the 3D virtual image by gesture at the same time.
As shown in fig. 2, when the object to be displayed is a planar paper form document, the host 1 obtains an image of the paper form document through the camera module 22;
the host 1 scans the fields of the image, calls up the corresponding prefabricated template file according to the first-row and/or first-column fields of the scanned form, fills the scanned fields into the template file to form a form file, and sends the form file to all far-end devices;
when the near-end equipment needs to display the table, the host 1 of the near-end equipment opens the table file, the host 1 of the far-end equipment synchronously opens the table file, the host 1 of the near-end equipment acquires a control instruction of the user on the table and sends the control instruction to the host 1 of the far-end equipment, and the far-end equipment and the near-end equipment update the table display state in real time according to the instruction.
Information can be extracted from paper forms quickly and with high accuracy. By generating instructions for the table document to be displayed and sending the document and its display state to each far-end device, all users can display the table synchronously without transmitting video, so little memory or network bandwidth is consumed, the performance requirements on the far-end devices are low, and cost can be reduced.
As shown in fig. 2, when scanning the table image, first find the largest bounding box, locate a plurality of bounding boxes within the largest bounding box, determine the rows and columns of the table, and scan the fields of the rows and columns of the table; calling out a prefabricated template, matching the image form with the prefabricated template according to the scanned form row and column information and the first row and/or first column character information, determining the form type, and newly building a corresponding template form file according to the determined form type; and filling the identified fields of the rest rows and columns into corresponding areas of the template format file, storing the file, and simultaneously respectively sending the file to all remote equipment.
As shown in fig. 2, when the form image is scanned and needs to be corrected, the longest straight line in the image is found first; it is then determined whether this line is closer to the horizontal or to the vertical, the included angle between the line and the horizontal/vertical is found, and the inclination angle of the image is determined, so that the image can be corrected by rotation. Correcting the image improves the utilization of the image acquisition equipment, reduces the amount of image data to be scanned and processed, and improves the response rate of the system;
After the image is corrected, noise reduction is required. The specific steps are: convert the image to the HSV color space and remove pixels falling in the red interval; then determine a binarization threshold at each pixel from the pixel-value distribution of its neighborhood block, and binarize the image with this adaptive threshold to reduce noise interference.
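A pure-Python sketch of both noise-reduction steps; a real implementation would use an image-processing library, and the red-hue bounds and neighborhood block size here are illustrative assumptions:

```python
import colorsys

def is_red(rgb, sat_min=0.3, val_min=0.2):
    """Red-interval test in HSV: hue near 0 degrees (or wrapping past
    340), with enough saturation/value to rule out grey noise."""
    h, s, v = colorsys.rgb_to_hsv(*(c / 255 for c in rgb))
    hue_deg = h * 360
    return (hue_deg <= 20 or hue_deg >= 340) and s >= sat_min and v >= val_min

def adaptive_binarize(gray, block=3, offset=0):
    """Per-pixel threshold = mean of the (block x block) neighborhood
    minus `offset`; pixels above threshold become 1 (paper), else 0 (ink)."""
    h, w = len(gray), len(gray[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            thresh = sum(vals) / len(vals) - offset
            out[y][x] = 1 if gray[y][x] > thresh else 0
    return out
```

The neighborhood mean makes the threshold adapt to local lighting, which is what keeps the binarization robust against uneven illumination and noise.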
After the fields in the table are recognized, dictionary optimization is performed: a dictionary library is established, and each field recognized by OCR is matched against the fields in the library. If the matching score is greater than a preset threshold, the field in the dictionary library is replaced by the OCR-recognized field, so that the fields in the library are optimized and updated; manually confirmed correct fields are also supplemented into the library. The matching score equals the total number of words recognized by OCR divided by the total number of matched words in the current dictionary library.
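The patent's wording is ambiguous about which count is the divisor in the matching score; the sketch below assumes the more common form, matched words divided by total OCR-recognized words, which yields a 0..1 score comparable against a threshold. All names are hypothetical:

```python
def match_score(ocr_words, dictionary):
    """Fraction of OCR-recognized words that are found in the dictionary
    library (assumed form of the patent's matching score)."""
    if not ocr_words:
        return 0.0
    matched = sum(1 for w in ocr_words if w in dictionary)
    return matched / len(ocr_words)

def optimize_field(ocr_field, dictionary, threshold=0.8):
    """If the OCR field matches the dictionary library well enough,
    accept it and fold it into the library; otherwise hold it for
    manual confirmation, after which it would be supplemented too."""
    words = list(ocr_field)  # character-level matching, as for Chinese fields
    if match_score(words, dictionary) > threshold:
        dictionary.update(words)
        return ocr_field, True
    return ocr_field, False
```

The character-level tokenization is also an assumption; a word-level tokenizer would fit Latin-script fields better.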
Some forms contain too much content for a single sheet of paper, so several paper forms may together constitute one unified form. When the same far-end device transmits a plurality of image files, the title text outside the maximum bounding box is scanned; if the title text is detected to be consistent and the templates are the same, field scanning is performed on the areas to be identified of these image files, and the scanned fields are filled into a file of the same template format in receiving order and saved. In this way a plurality of paper files may be merged into one document file.
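The multi-page merge just described can be sketched as grouping pages by (header text, template id) and concatenating their rows in receiving order; the tuple representation is an assumption for illustration:

```python
def merge_pages(pages):
    """pages: list of (header_text, template_id, rows) in receiving order.
    Pages whose header text and template agree are merged into one table
    file; pages with a different header/template start files of their own."""
    files = {}   # (header, template) -> accumulated rows
    order = []   # first-seen order of the keys
    for header, template, rows in pages:
        key = (header, template)
        if key not in files:
            files[key] = []
            order.append(key)
        files[key].extend(rows)  # fill fields in receiving order
    return [(h, t, files[(h, t)]) for h, t in order]
```

Because rows are appended in receiving order, a form split across several sheets reassembles in the order the sheets were transmitted.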
The near-end device acquires a table display instruction and a table editing instruction and sends them to the other far-end devices through the host (1) of the near-end device; each far-end device updates its table according to the table display instruction and the table editing instruction, and after it is confirmed that all the other far-end devices have been updated, the near-end device updates its own table according to the same instructions.
The display instructions include one or more of the following: page turning, page zooming, and cursor display. The editing instructions include one or more of the following: a formula editing instruction, a table editing instruction, and a chart insertion instruction.
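The update order described above (far ends first, near end only after every acknowledgement) can be sketched as follows, with a deliberately minimal stand-in `Host` class; real hosts would apply the instruction to an on-screen table and acknowledge over the network:

```python
class Host:
    """Minimal stand-in for a conference host."""
    def __init__(self):
        self.state = []

    def apply(self, instruction):
        self.state.append(instruction)
        return True  # acknowledgement

def broadcast_update(near_host, far_hosts, instruction):
    """Send a display/edit instruction to every far-end host first, and
    apply it at the near end only once all far ends have acknowledged."""
    acks = [far.apply(instruction) for far in far_hosts]
    if all(acks):
        near_host.apply(instruction)
        return True
    return False
```

This ordering keeps the near-end view from running ahead of the far-end views when an acknowledgement is missing.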
When a conference is recorded, the display state and updates of the form are captured by screen recording and saved as a video file bound to real time; the voice acquired by the microphone (24) while the form is displayed is saved as a second audio file, likewise bound to real time, and the video file is associated with the second audio file through the shared real time. The second audio file is also converted into a text file through voice recognition, converted into a subtitle file through the real time, and associated with the video file; the two modes can be implemented simultaneously or singly.
The first audio file is likewise bound to real time; when the first audio file is converted into a text file, the real time is not displayed, but the associated video file and second audio file are inserted at the corresponding time periods.
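The subtitle binding above can be illustrated by rendering timestamped recognition output as SubRip (.srt) text; the `(start_seconds, end_seconds, text)` segment shape is an assumption about the speech-recognition output, not specified in the patent:

```python
def to_srt(segments):
    """Render timestamped transcript segments as SubRip subtitle text.
    Binding both the screen recording and the subtitles to the same
    real-time stamps is what keeps them associated."""
    def fmt(t):
        # Seconds -> "HH:MM:SS,mmm" as required by the SubRip format.
        h, rem = divmod(int(t), 3600)
        m, s = divmod(rem, 60)
        ms = int(round((t - int(t)) * 1000))
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{fmt(start)} --> {fmt(end)}\n{text}\n")
    return "\n".join(blocks)
```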
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.
Claims (8)
1. A teleconferencing system, characterized in that it comprises a near-end device and one or more far-end devices remotely connected with the near-end device, the near-end device and each far-end device comprising a host (1) and a peripheral (2) connected with the host (1);
the peripheral (2) comprises a display module (21), a camera module (22), a pickup module (23), a microphone (24), a loudspeaker (25) and an adjusting device (26); wherein:
the camera module (22) is arranged on the adjusting device (26), the camera module (22) comprises a 3D image acquisition device (221) and a depth lens (222), and the 3D image acquisition device (221) is used for acquiring a 3D virtual image of a real object;
the host (1) acquires user gestures through the depth lens (222), recognizes the user gestures and controls the 3D virtual image through commands corresponding to the gestures; the host (1) also acquires voice information through the microphone (24), recognizes the voice and controls the adjusting device (26) through a command corresponding to the voice;
the 3D image acquisition device (221) comprises a 3D scanner or a 3D depth perception camera;
the host (1) is also used for remote audio-visual adjustment; specifically, the near-end device host (1) outputs audio and video to the far-end device peripheral (2); the far-end device peripheral (2) displays the image through the display module (21) and plays the sound through the loudspeaker (25); the pickup module (23) then collects the sound played by the loudspeaker (25) to obtain sound data, and the camera module (22) collects the video played by the display module (21) to obtain video data; the video data and sound data are sent to the near-end device host (1), and a worker debugs the display module (21) and the loudspeaker (25) of the far-end device according to the video data and sound data.
2. A method for displaying a real object in a teleconference, based on the teleconference system of claim 1, characterized by comprising the steps of:
step one: when a real object is to be displayed, a user of the near-end device sends an instruction by voice; the host (1) of the near-end device acquires the voice through the microphone (24), recognizes the user's voice, and controls the adjusting device (26) through the instruction corresponding to the voice to adjust the position of the camera module (22); real-object image information is acquired through the camera module (22) and transmitted to the far-end devices in real time;
step two: when a 3D virtual image of a three-dimensional real object is to be displayed, picture data of the real object is acquired through the 3D scanner or the 3D depth perception camera; when the position or angle of the camera module (22) needs to be adjusted, a user of the near-end device sends an instruction by voice, the host (1) of the near-end device acquires the voice through the microphone (24), recognizes the user's voice, and controls the adjusting device (26) through the instruction corresponding to the voice to adjust the position of the camera module (22); meanwhile, the camera module (22) follows the direction and angle of the real object as it is turned over by hand, acquiring graphic information of the three-dimensional real object from every angle, and the graphic information of the three-dimensional real object is edited to form a three-dimensional image of the virtual real object;
step three: when the 3D virtual image of the real object is not clear, step one is executed;
step four: when a 3D virtual image of a real object is to be controlled, the host (1) acquires a user gesture through the depth lens (222) or the 3D depth perception camera, recognizes the user gesture and controls the 3D virtual image through a command corresponding to the gesture;
step five: when the plane image of the plane object is to be displayed, the host (1) of the near-end equipment acquires the plane display image of the plane object through the camera module (22).
3. The method for displaying the real object in the teleconference according to claim 2, wherein, when the 3D image acquisition device (221) employs a 3D depth perception camera:
if the displayed real object is a three-dimensional object, the host (1) acquires a 3D virtual image of the real object through a 3D depth perception camera, acquires a gesture of a user through the depth lens (222), and sends a corresponding instruction to control the state of the 3D virtual image;
if the actual real object needs to be displayed remotely, the angle of the real object is turned over by hand; the host (1), on acquiring a voice command, sends a control instruction to the adjusting device (26) to adjust the position of the 3D depth perception camera, displaying detailed structure information of the real object in coordination with the user manually turning the real object over;
when the display module (21) simultaneously displays the 3D virtual image and the real image of the real object in split-screen mode, the host (1) can simultaneously control the real-object display through voice and the 3D virtual image through gestures.
4. The method for displaying the real object in the teleconference according to claim 2, wherein when the real object to be displayed is a planar paper form document, the host (1) acquires an image of the paper form document through the camera module (22);
the host (1) scans fields of the image, calls out corresponding prefabricated template files according to the first row and/or the first column fields of the scanned form, correspondingly fills the scanned fields into the template files to form files, and sends the form files to all remote devices;
when the near-end equipment needs to display the table, the host (1) of the near-end equipment opens the table file, the host (1) of the far-end equipment synchronously opens the table file, the host (1) of the near-end equipment acquires a control instruction of a user on the table and sends the control instruction to the host (1) of the far-end equipment, and the far-end equipment and the near-end equipment update the table display state in real time according to the instruction.
5. The method as claimed in claim 4, wherein when the table image is scanned, the largest bounding box is first found, the plurality of bounding boxes in the largest bounding box are located, the rows and columns of the table are determined, and the fields of the rows and columns of the table are scanned; calling out a prefabricated template, matching the image form with the prefabricated template according to the scanned form row and column information and the first row and/or first column character information, determining the form type, and newly building a corresponding template form file according to the determined form type; and filling the identified fields of the rest rows and columns into corresponding areas of the template format file, storing the file, and simultaneously respectively sending the file to all remote equipment.
6. The method as claimed in claim 5, wherein when the image of the form is scanned and the image needs to be corrected, the method specifically comprises finding the longest straight line in the image, determining whether the longest straight line is close to the horizontal line or the vertical line, finding the included angle between the longest straight line and the horizontal line/vertical line, and determining the inclination angle of the image, thereby performing rotation correction on the angle of the image.
7. The method for displaying the real object in the teleconference according to claim 5, wherein when the host (1) acquires the plurality of image files, the information of the title words outside the maximum bounding box is scanned, and if it is detected that the information of the title words is consistent and the templates are the same, the field scanning is performed on the areas to be identified of the image files with the same templates, and the scanned fields are sequentially filled in the files with the same template format according to the receiving sequence and are stored.
8. The method according to claim 4, wherein the near-end device obtains a table display instruction and a table editing instruction, and sends the table display instruction and the table editing instruction to the other far-end devices through the host (1) of the near-end device, the far-end device updates the table according to the table display instruction and the table editing instruction, and after confirming that the other far-end devices are all updated, the near-end device updates the table according to the table display instruction and the table editing instruction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110438782.7A CN112995572A (en) | 2021-04-23 | 2021-04-23 | Remote conference system and physical display method in remote conference |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112995572A true CN112995572A (en) | 2021-06-18 |
Family
ID=76339974
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110438782.7A Pending CN112995572A (en) | 2021-04-23 | 2021-04-23 | Remote conference system and physical display method in remote conference |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112995572A (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103654967A (en) * | 2013-12-06 | 2014-03-26 | 傅松青 | Speech-controlled auxiliary imaging device for minimally invasive operations |
CN104135619A (en) * | 2014-08-12 | 2014-11-05 | 广东欧珀移动通信有限公司 | Method and device of controlling camera |
CN104918000A (en) * | 2015-06-30 | 2015-09-16 | 国家电网公司 | Video conference remote control device |
CN107609045A (en) * | 2017-08-17 | 2018-01-19 | 深圳壹秘科技有限公司 | A kind of minutes generating means and its method |
CN108986826A (en) * | 2018-08-14 | 2018-12-11 | 中国平安人寿保险股份有限公司 | Automatically generate method, electronic device and the readable storage medium storing program for executing of minutes |
CN109344831A (en) * | 2018-08-22 | 2019-02-15 | 中国平安人寿保险股份有限公司 | A kind of tables of data recognition methods, device and terminal device |
CN109413364A (en) * | 2018-12-04 | 2019-03-01 | 湖北安心智能科技有限公司 | A kind of interactive remote meeting system and method |
CN110335612A (en) * | 2019-07-11 | 2019-10-15 | 招商局金融科技有限公司 | Minutes generation method, device and storage medium based on speech recognition |
CN211930771U (en) * | 2020-04-03 | 2020-11-13 | 杭州优甲科技有限公司 | Novel intelligent network live broadcast machine and live broadcast system |
CN112037791A (en) * | 2020-08-12 | 2020-12-04 | 广东电力信息科技有限公司 | Conference summary transcription method, apparatus and storage medium |
CN112148922A (en) * | 2019-06-28 | 2020-12-29 | 鸿富锦精密工业(武汉)有限公司 | Conference recording method, conference recording device, data processing device and readable storage medium |
CN112363563A (en) * | 2020-10-11 | 2021-02-12 | 王小龙 | External image sound input device mounted on computer display |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4770178B2 (en) | Camera control apparatus, camera system, electronic conference system, and camera control method | |
WO2017215295A1 (en) | Camera parameter adjusting method, robotic camera, and system | |
WO2021143315A1 (en) | Scene interaction method and apparatus, electronic device, and computer storage medium | |
US20080180519A1 (en) | Presentation control system | |
US20090257730A1 (en) | Video server, video client device and video processing method thereof | |
US9076345B2 (en) | Apparatus and method for tutoring in convergence space of real and virtual environment | |
US20130300934A1 (en) | Display apparatus, server, and controlling method thereof | |
CN111010610A (en) | Video screenshot method and electronic equipment | |
CN113014857A (en) | Control method and device for video conference display, electronic equipment and storage medium | |
CN110928509B (en) | Display control method, display control device, storage medium, and communication terminal | |
KR102424150B1 (en) | An automatic video production system | |
CN114531564A (en) | Processing method and electronic equipment | |
CN113301367B (en) | Audio and video processing method, device, system and storage medium | |
KR20160082291A (en) | Image processing apparatus and image processing method thereof | |
US7986336B2 (en) | Image capture apparatus with indicator | |
CN103959805A (en) | Method and device for displaying image | |
CN112995572A (en) | Remote conference system and physical display method in remote conference | |
US11729489B2 (en) | Video chat with plural users using same camera | |
CN112887653B (en) | Information processing method and information processing device | |
WO2021226821A1 (en) | Systems and methods for detection and display of whiteboard text and/or an active speaker | |
CN115118913A (en) | Projection video conference system and projection video method | |
KR20170031941A (en) | Display device and luminance control method thereof | |
US11805231B2 (en) | Target tracking method applied to video transmission | |
JPH07319886A (en) | Retrieving device for drawing image interlinked with dynamic image and retrieving method for drawing image interlinked with dynamic image | |
KR102507873B1 (en) | Method for generating customized video based on objects and service server using the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210618 |