WO2023160356A1 - Method and system for enhancing the user experience of a virtual reality system - Google Patents

Method and system for enhancing the user experience of a virtual reality system

Info

Publication number
WO2023160356A1
WO2023160356A1 (PCT/CN2023/074391)
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
posture
virtual reality
human body
user
Prior art date
Application number
PCT/CN2023/074391
Other languages
English (en)
French (fr)
Inventor
董博雅
付思超
张帆
Original Assignee
凝动医疗技术服务(上海)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 凝动医疗技术服务(上海)有限公司
Publication of WO2023160356A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 Animation
    • G06T 13/20 3D [Three Dimensional] animation
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Definitions

  • The present invention relates to the field of virtual reality (VR) technology, and in particular to a method and system for enhancing the user experience of a virtual reality system.
  • VR technology is an emerging, comprehensive information technology. It uses modern high technology centered on computer technology to generate a realistic virtual environment of a specific scope that integrates vision, hearing, and touch; through various display and control interface devices, users interact with and influence objects in the virtual environment in their own way, producing feelings and experiences equivalent to being present in a real environment.
  • VR technology integrates information technologies such as digital image processing, multimedia technology, computer graphics, and sensor technology. It builds three-dimensional digital models through computer graphics and provides users with a sense of immersion in an interactive three-dimensional environment generated on a computer.
  • Human body pose can be estimated from an image with constraints imposed through depth information to improve accuracy (V. Marton, A. Lorincz, "Multi-Person Absolute 3D Human Pose Estimation with Weak Depth Supervision", ArXiv, 2020), and Zhou et al. proposed estimating human pose from 3D point cloud data (Y. Zhou, H. Dong and A. El Saddik, "Learning to Estimate 3D Human Pose from Point Cloud", IEEE Sensors Journal 2020, 20(15): 12334-12342).
  • The Chinese invention patent application with publication number CN108734194A discloses a virtual-reality-oriented human joint point recognition method based on a single depth map.
  • A convolutional network is first trained offline on the public gesture dataset NYU to obtain a model that is robust, highly accurate, and fast; a depth camera then captures depth images in real time, and after a series of preprocessing steps the images are passed to a human skeleton recognition module and a gesture recognition module respectively, which return the 3D information of the recognized joint points for mapping onto a 3D human body model.
  • However, the public gesture dataset NYU used for training in that publication offers good accuracy only for gesture recognition. It remains difficult for the prior art to let users see their hands, feet, and body in the VR scene with a high degree of realism in a convenient way.
  • What is needed in this field is a method and system for enhancing the user experience of a virtual reality system that can enhance the realism and interactivity of the VR experience in a convenient manner and allow users to see their hands, feet, and body in the VR scene.
  • The purpose of the present invention is to provide a method and system for enhancing the user experience of a virtual reality system that can capture the user's whole-body postures and movements in a non-wearable manner and display the corresponding virtual 3D body in the VR scene so that it follows the user's postures and movements in real time, enhancing the realism and interactivity of the VR experience.
  • A method for enhancing the user experience of a virtual reality system includes the following steps: collecting whole-body-posture-related data of the user through an external sensor; processing the whole-body-posture-related data with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters; and using the three-dimensional human posture parameters to drive a parameterized human body model, generating the user's virtual 3D body in the VR scene so that it follows the user's posture and moves in real time.
  • The posture-generation deep neural network is pre-trained through the following steps: collecting posture data of the human body in various motion states through wearable and/or patch sensors and constructing the three-dimensional human posture parameters as input features, while simultaneously using the external sensor to collect the whole-body-posture-related data as output features, and dividing the input features and corresponding output features into a training set and a validation set; according to the data dimensions of the input features and output features, building a posture-generation deep neural network model based on a convolutional neural network and/or a recurrent neural network and/or an encoder-decoder neural network and/or a self-attention mechanism; and using methods from the field of deep learning to train the posture-generation deep neural network model on the training set and tune the hyperparameters on the validation set.
  • The methods from the field of deep learning include the back-propagation algorithm.
  • The external sensor includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
  • The whole-body-posture-related data includes at least one of the following: user video image data, user video image data with depth information, lidar point cloud data, and millimeter-wave radar point cloud data.
  • The features of the posture-generation deep neural network include extracting, from each frame of whole-body-posture-related data, the three-dimensional human posture parameters corresponding to the user's posture.
  • The three-dimensional human posture parameters include the following nodes: hips, spine, neck, head, left shoulder, left arm, left forearm, left hand, right shoulder, right arm, right forearm, right hand, left upper leg, left leg, left foot, left toe base, right upper leg, right leg, right foot, and right toe base; each node represents its relative rotation angle with a set of three-dimensional floating-point data.
  • The parameterized human body model forms a tree structure from these nodes, and each node expresses its positional relationship through its relative rotation angle with respect to its parent node.
  • The user's virtual three-dimensional body is a three-dimensional obj model containing at least vertex and triangular face information; each vertex is assigned, according to its distance relationship to one or more of the nodes nearest to it, weight values corresponding to those nodes, so that when those nodes move they drive the movement of the vertex and in turn the movement of the entire virtual three-dimensional body.
  • A system for enhancing the user experience of a virtual reality system includes a VR display module, an external sensor module, a posture parameter generation module, and a VR scene generation module.
  • The VR display module is used to receive the VR scene generated by the VR scene generation module and display it to the user.
  • The external sensor module is used to collect the user's whole-body-posture-related data and send it to the posture parameter generation module.
  • The posture parameter generation module is used to receive the whole-body-posture-related data sent by the external sensor module, process it with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters, and send them to the VR scene generation module.
  • The VR scene generation module is used to generate the VR scene, receive the three-dimensional human posture parameters sent by the posture parameter generation module, use them to drive a parameterized human body model, add the user's virtual three-dimensional body to the VR scene, and send the VR scene to the VR display module.
  • The VR display module includes at least one of the following: a VR helmet, VR glasses, and VR goggles.
  • The external sensor module includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
  • The method and system for enhancing the user experience of a virtual reality system provided by the present invention can enhance the realism and interactivity of the VR experience in a convenient manner, so that the user sees in the VR scene a virtual body, hands, and feet consistent with his or her real postures and movements; the experience is more realistic, and communication between different users through body language is facilitated.
  • Further embodiments of the present invention can achieve other beneficial technical effects not listed one by one; some of these are described below, and those skilled in the art will be able to anticipate and understand them after reading the present disclosure.
  • FIG. 1 is a schematic flow diagram of a method for enhancing user experience of a virtual reality system according to an embodiment of the present invention
  • Fig. 2 is a schematic structural diagram of a system for enhancing user experience of a virtual reality system according to an embodiment of the present invention.
  • FIG. 1 shows a method for enhancing the user experience of a virtual reality system according to the present invention, including the following steps: S1, collecting whole-body-posture-related data of the user through external sensors; S2, processing the whole-body-posture-related data with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters; S3, using the three-dimensional human posture parameters to drive a parameterized human body model and generating the user's virtual three-dimensional body in the VR scene so that it follows the user's posture and moves in real time.
  • External sensors include but are not limited to the following: camera, depth camera, 3D lidar, and 3D millimeter wave radar.
  • the whole-body pose-related data includes at least one of the following: user video image data, user video image data with depth information, lidar point cloud data, and millimeter-wave radar point cloud data.
  • In one embodiment, a camera is used as the external sensor to collect video image data of the user, and physical feature points of the human body are extracted from the video image data by deep learning.
  • For example, the mediapipe deep learning model produced by Google extracts the coordinate positions of the nodes contained in the three-dimensional human posture parameters from video images of human motion, and the required relative rotation angles of the nodes are then obtained through geometric calculations.
  • In another embodiment, a depth camera is used as the external sensor; it can collect user video image data similarly to the camera, and the depth (i.e., distance) information collected by the depth camera can further be combined to obtain more accurate coordinate positions of the nodes.
  • In a further embodiment, a three-dimensional lidar and a three-dimensional millimeter-wave radar are used as the external sensors to collect the user's three-dimensional point cloud data, and physical feature points of the human body are then extracted from the three-dimensional point cloud data by deep learning.
  • The external sensor in embodiments of the present invention may also be a combination of multiple devices of the same or different types.
  • For example, a camera and a three-dimensional millimeter-wave radar are combined to collect the user's video image data and three-dimensional point cloud data simultaneously; the distance information contained in the three-dimensional point cloud data is then used to constrain the body feature points extracted from the video image data, yielding a more accurate estimate of the three-dimensional human posture parameters. Alternatively, the video image data and three-dimensional point cloud data are merged as input features, and a posture-generation deep neural network trained on such input features outputs the three-dimensional human posture parameters after processing.
  • The posture-generation deep neural network in the embodiments of the present invention can be trained through the following steps: S101, collect posture data of the human body in various motion states through wearable and/or patch sensors and construct the three-dimensional human posture parameters as input features, while simultaneously using the external sensor to collect the whole-body-posture-related data as output features, and divide the input features and corresponding output features into a training set and a validation set; S102, according to the data dimensions of the input features and output features, build a posture-generation deep neural network model based on a convolutional neural network (CNN) and/or a recurrent neural network (RNN) and/or an encoder-decoder neural network and/or a self-attention mechanism; S103, using methods from the field of deep learning such as back-propagation, train the posture-generation deep neural network model on the training set and tune the hyperparameters on the validation set.
  • The features of the posture-generation deep neural network include extracting, from each frame of whole-body-posture-related data, the three-dimensional human posture parameters corresponding to the user's posture.
  • The three-dimensional human posture parameters include the following nodes: hips, spine, neck, head, left shoulder, left arm, left forearm, left hand, right shoulder, right arm, right forearm, right hand, left upper leg, left leg, left foot, left toe base, right upper leg, right leg, right foot, and right toe base; each node represents its relative rotation angle with a set of three-dimensional floating-point data.
  • The parameterized human body model forms a tree structure from these nodes, and each node expresses its positional relationship through its relative rotation angle with respect to its parent node.
  • The user's virtual three-dimensional body is a three-dimensional obj model containing at least vertex and triangular face information; each vertex is assigned, according to its distance relationship to one or more of the nodes nearest to it, weight values corresponding to those nodes, so that when those nodes move they drive the movement of the vertex and in turn the movement of the entire virtual three-dimensional body.
  • FIG. 2 shows the composition of a system for enhancing the user experience of a virtual reality system according to the present invention, comprising a VR display module 1, an external sensor module 2, a posture parameter generation module 3, and a VR scene generation module 4.
  • The VR display module 1 is used to receive the VR scene generated by the VR scene generation module 4 and display it to the user.
  • The external sensor module 2 is used to collect the user's whole-body-posture-related data and send it to the posture parameter generation module 3.
  • The posture parameter generation module 3 is used to receive the whole-body-posture-related data sent by the external sensor module 2, process it with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters, and send them to the VR scene generation module 4.
  • The VR scene generation module 4 is used to generate the VR scene, receive the three-dimensional human posture parameters sent by the posture parameter generation module 3, use them to drive the parameterized human body model, add the user's virtual three-dimensional body to the VR scene, and send the VR scene to the VR display module 1.
  • The VR display module 1 includes at least one of the following: a VR helmet, VR glasses, and VR goggles.
  • The external sensor module 2 includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
  • The component modules of a system for enhancing the user experience of a virtual reality system need not be physically independent devices; multiple modules can also be integrated into the same device.
  • For example, the VR display module and the VR scene generation module are integrated into a VR helmet device; or the VR display module, the VR scene generation module, and the posture parameter generation module are integrated into a VR helmet device; or the external sensor module and the posture parameter generation module are integrated into one posture acquisition device; among other combinations.
  • The method and system for enhancing the user experience of the virtual reality system have the following advantages: they can accurately track the motion postures of the human body and display them superimposed in the VR scene, so that the user sees in the VR scene a virtual body, hands, and feet consistent with his or her real postures and movements, making the experience more realistic; because external sensors are used, cumbersome procedures such as putting on, taking off, calibrating, and charging are unnecessary, making the system more convenient and easy to use; and because the user's posture is tracked and displayed, it is convenient for the user to interactively control the VR scene with gestures and other body language, and for different users to communicate with each other through body language.
  • The method and system for enhancing the user experience of a virtual reality system proposed in the present invention are also applicable to mixed reality (MR) and augmented reality (AR) systems based on the principles of virtual reality systems.
  • The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof.
  • A microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the embodiments of the present invention.
  • The present invention can also be implemented as an apparatus or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein.
  • Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals.
  • Such signals may be downloaded from an Internet site, provided on a carrier signal, or provided in any other form.
  • The system of the present invention for enhancing the user experience of a virtual reality system conventionally includes a processor and a computer program product or computer-readable medium in the form of memory.
  • The memory may be electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM.
  • The memory has storage space for program code for carrying out any of the method steps described above.
  • The storage space for program code may include individual program codes for implementing the various steps of the above methods.
  • These program codes can be read from or written into one or more computer program products.
  • These computer program products comprise program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such computer program products are typically portable or fixed storage units.
  • The storage unit may be a similarly arranged storage segment, storage space, or the like.
  • The program code can, for example, be compressed in a suitable form.
  • Typically, the storage unit comprises computer-readable code for performing the steps of the method according to the invention, i.e., code that can be read by, for example, a processor; when executed, this code causes the interactive control means of the virtual reality system to perform the steps of the method described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present invention discloses a method and system for enhancing the user experience of a virtual reality system. Specifically, the method for enhancing the user experience of a virtual reality system comprises the following steps: collecting whole-body-posture-related data of a user by means of an external sensor; processing the whole-body-posture-related data by means of a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters; and using the three-dimensional posture parameters to drive a parameterized human body model and generating a virtual three-dimensional body of the user in the virtual reality scene, which follows the user's posture and moves in real time. Also disclosed is a system for enhancing the user experience of a virtual reality system, comprising a virtual reality display module, an external sensor module, a posture parameter generation module, and a virtual reality scene generation module.

Description

Method and system for enhancing the user experience of a virtual reality system
This application claims priority to Chinese Patent Application No. 202210177953.X filed on February 25, 2022, the disclosure of which is incorporated herein by reference in its entirety as part of this application.
Technical Field
The present invention relates to the field of virtual reality (Virtual Reality, VR) technology, and in particular to a method and system for enhancing the user experience of a virtual reality system.
Background Art
Virtual reality (VR) technology is an emerging, comprehensive information technology. It uses modern high technology centered on computer technology to generate a realistic virtual environment of a specific scope that integrates vision, hearing, and touch; through various display and control interface devices, users interact with and influence objects in the virtual environment in their own way, producing feelings and experiences equivalent to being present in a real environment. VR technology integrates information technologies such as digital image processing, multimedia technology, computer graphics, and sensor technology; it builds three-dimensional digital models through computer graphics and provides users with a sense of immersion in an interactive three-dimensional environment generated on a computer.
At present, when using a VR system, a user typically experiences the virtual three-dimensional scene through a head-mounted display, but in the virtual scene it is difficult to see virtual hands, feet, and a body that move with the user's own posture, so realism is lacking. Some VR systems track the position and posture of the hands and/or legs with wearable sensors and display them in VR, but wearable sensors can only track and display local regions of the body: wearing a small number of sensors makes accurate whole-body tracking and VR display difficult, wearing a large number of sensors is costly, and putting on, taking off, calibrating, and charging multiple sensors is inconvenient.
With the rapid development of deep learning in recent years, it has become possible to estimate human posture with deep neural networks from data collected by non-wearable devices. For example, Denis et al. proposed estimating human posture from a single image (T. Denis, R. Chris and L. Agapito, "Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 5689-5698); Marton et al. proposed imposing constraints through depth information on image-based posture estimates to improve accuracy (V. Marton, A. Lorincz, "Multi-Person Absolute 3D Human Pose Estimation with Weak Depth Supervision", ArXiv, 2020); and Zhou et al. proposed estimating human posture from three-dimensional point cloud data (Y. Zhou, H. Dong and A. El Saddik, "Learning to Estimate 3D Human Pose from Point Cloud", IEEE Sensors Journal 2020, 20(15): 12334-12342).
Attempts have been made in this field to recognize human joint points from a single depth map. For example, the Chinese invention patent application with publication number CN108734194A discloses a virtual-reality-oriented human joint point recognition method based on a single depth map: a convolutional network is first trained offline on the public gesture dataset NYU to obtain a model that is robust, highly accurate, and fast; a depth camera then captures depth images in real time, and after a series of preprocessing steps the images are passed to a human skeleton recognition module and a gesture recognition module respectively, which return the three-dimensional information of the recognized joint points for mapping onto a three-dimensional human body model. However, the public gesture dataset NYU used for training in that publication offers good accuracy only for gesture recognition. It remains difficult for the prior art to let users see their own hands, feet, and body in a VR scene with a high degree of realism in a convenient way.
What is needed in this field is a method and system for enhancing the user experience of a virtual reality system that can enhance the realism and interactivity of the VR experience in a convenient manner and let users see their own hands, feet, and body in the VR scene.
It should be noted that the information included in this Background section of the specification, including any references cited herein and any description or discussion thereof, is included for technical reference purposes only and is not to be regarded as subject matter that limits the scope of the invention.
Summary of the Invention
To solve the above and other related problems, according to one or more embodiments of the present invention, the object of the present invention is to provide a method and system for enhancing the user experience of a virtual reality system that can capture the user's whole-body postures and movements in a non-wearable manner and display the corresponding virtual three-dimensional body in the VR scene so that it follows the user's postures and movements in real time, enhancing the realism and interactivity of the VR experience.
According to one aspect of the present invention, a method for enhancing the user experience of a virtual reality system is provided. The method comprises the following steps: collecting whole-body-posture-related data of a user by means of an external sensor; processing the whole-body-posture-related data by means of a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters; and using the three-dimensional human posture parameters to drive a parameterized human body model and generating a virtual three-dimensional body of the user in the VR scene, which follows the user's posture and moves in real time.
According to a preferred embodiment of the present invention, the posture-generation deep neural network is pre-trained through the following steps: collecting posture data of the human body in various motion states by means of wearable and/or patch sensors and constructing the three-dimensional human posture parameters as input features, while simultaneously using the external sensor to collect the whole-body-posture-related data as output features, and dividing the input features and corresponding output features into a training set and a validation set; according to the data dimensions of the input features and the output features, building a posture-generation deep neural network model based on a convolutional neural network and/or a recurrent neural network and/or an encoder-decoder neural network and/or a self-attention mechanism; and using methods from the field of deep learning, training the posture-generation deep neural network model on the training set and tuning the hyperparameters on the validation set.
According to a preferred embodiment of the present invention, the methods from the field of deep learning include the back-propagation algorithm.
According to a preferred embodiment of the present invention, the external sensor includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
According to a preferred embodiment of the present invention, the whole-body-posture-related data includes at least one of the following: video image data of the user, user video image data with depth information, lidar point cloud data, and millimeter-wave radar point cloud data.
According to a preferred embodiment of the present invention, the features of the posture-generation deep neural network include extracting, from each frame of whole-body-posture-related data, the three-dimensional human posture parameters corresponding to the user's posture.
According to a preferred embodiment of the present invention, the three-dimensional human posture parameters include the following nodes: hips, spine, neck, head, left shoulder, left arm, left forearm, left hand, right shoulder, right arm, right forearm, right hand, left upper leg, left leg, left foot, left toe base, right upper leg, right leg, right foot, and right toe base; each node represents its relative rotation angle with a set of three-dimensional floating-point data.
According to a preferred embodiment of the present invention, the parameterized human body model forms a tree structure from the above nodes, and each node expresses its positional relationship through the relative rotation angle relative to its parent node.
According to a preferred embodiment of the present invention, the user's virtual three-dimensional body is a three-dimensional obj model containing at least vertex and triangular face information; each vertex is assigned, according to its distance relationship to one or more of the nodes nearest to it, weight values corresponding to those nodes, so that when those nodes move they drive the movement of the vertex and in turn the movement of the entire virtual three-dimensional body.
According to another aspect of the present invention, a system for enhancing the user experience of a virtual reality system is provided, comprising a VR display module, an external sensor module, a posture parameter generation module, and a VR scene generation module. The VR display module is configured to receive the VR scene generated by the VR scene generation module and display it to the user; the external sensor module is configured to collect the user's whole-body-posture-related data and send it to the posture parameter generation module; the posture parameter generation module is configured to receive the whole-body-posture-related data sent by the external sensor module, process it with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters, and send them to the VR scene generation module; and the VR scene generation module is configured to generate the VR scene, receive the three-dimensional human posture parameters sent by the posture parameter generation module, use the three-dimensional human posture parameters to drive a parameterized human body model, add the user's virtual three-dimensional body to the VR scene, and send the VR scene to the VR display module.
According to a preferred embodiment of the present invention, the VR display module includes at least one of the following: a VR helmet, VR glasses, and VR goggles.
According to a preferred embodiment of the present invention, the external sensor module includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
The method and system for enhancing the user experience of a virtual reality system provided by the present invention can enhance the realism and interactivity of the VR experience in a convenient manner, enabling the user to see in the VR scene a virtual body, hands, and feet consistent with his or her real postures and movements; the experience is more realistic, and communication between different users through body language is facilitated.
Further embodiments of the present invention can also achieve other beneficial technical effects not listed one by one; some of these effects may be described below, and those skilled in the art will be able to anticipate and understand them after reading the present disclosure.
Brief Description of the Drawings
The above and other features and advantages of embodiments of the present invention, and the manner of attaining them, will become more apparent, and the embodiments will be better understood, by reference to the following description taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a method for enhancing the user experience of a virtual reality system according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a system for enhancing the user experience of a virtual reality system according to an embodiment of the present invention.
Detailed Description
In the following description of specific embodiments in conjunction with the accompanying drawings, details of one or more embodiments of the present invention are set forth. Other features, objects, and advantages of the present invention will be apparent from these descriptions, the drawings, and the claims.
It should be understood that the phrases and terms used herein are for the purpose of description and should not be regarded as limiting. The use herein of "comprising", "including", or "having" and variations thereof is intended to openly encompass the items listed thereafter and equivalents thereof as well as additional items.
The present invention will be described in more detail below with reference to several embodiments of the present invention in conjunction with the accompanying drawings.
FIG. 1 shows a method for enhancing the user experience of a virtual reality system according to the present invention, comprising the following steps: S1, collecting whole-body-posture-related data of a user by means of an external sensor; S2, processing the whole-body-posture-related data by means of a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters; S3, using the three-dimensional human posture parameters to drive a parameterized human body model and generating the user's virtual three-dimensional body in the VR scene, which follows the user's posture and moves in real time.
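To make the flow concrete, the following is a minimal per-frame sketch of steps S1-S3. It is illustrative only: sensor, pose_net, body_model, and vr_scene are hypothetical placeholders for the external sensor, the trained posture-generation network, the parameterized body model, and the VR engine, none of which the application ties to a specific API.

```python
import numpy as np

NUM_NODES = 20  # the 20 nodes named in the embodiments (hips, spine, neck, ...)

def run_frame(sensor, pose_net, body_model, vr_scene):
    """One iteration of the S1 -> S2 -> S3 loop for a single sensor frame."""
    raw = sensor.read_frame()            # S1: whole-body-posture-related data
    params = np.asarray(pose_net(raw))   # S2: per-node relative rotation angles
    assert params.shape == (NUM_NODES, 3)
    body_model.apply_pose(params)        # S3: drive the parameterized body model
    vr_scene.update(body_model)          # the virtual 3D body follows in real time
```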
The external sensor includes, but is not limited to, the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar. Correspondingly, the whole-body-posture-related data includes at least one of the following: video image data of the user, user video image data with depth information, lidar point cloud data, and millimeter-wave radar point cloud data.
Specifically, in one embodiment of the present invention, a camera is used as the external sensor to collect video image data of the user, and physical feature points of the human body are extracted from the video image data by deep learning. For example, the mediapipe deep learning model produced by Google (USA) may be used to extract, from video images of human motion, the coordinate positions of the nodes contained in the three-dimensional human posture parameters, after which the required relative rotation angles of the nodes are obtained through geometric calculations.
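As a hedged sketch of this embodiment, the snippet below uses the legacy mp.solutions.pose API of Google's MediaPipe (the model named above) to obtain world-space landmark coordinates from one video frame, then derives a joint angle from the geometric relationship of three landmarks. The choice of landmarks (MediaPipe's 33-point set rather than the patent's 20 nodes), the file name, and the elbow example are illustrative assumptions.

```python
import cv2
import mediapipe as mp
import numpy as np

pose = mp.solutions.pose.Pose(static_image_mode=False)

def joint_angle(a, b, c):
    """Angle at joint b formed by segments b->a and b->c, in degrees."""
    ba, bc = a - b, c - b
    cos_ang = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cos_ang, -1.0, 1.0)))

frame = cv2.imread("user_frame.jpg")  # one frame of the user's motion video
result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
if result.pose_world_landmarks:
    lm = result.pose_world_landmarks.landmark
    p = lambda i: np.array([lm[i].x, lm[i].y, lm[i].z])
    # left elbow flexion from shoulder (11), elbow (13), and wrist (15)
    print("left elbow angle:", joint_angle(p(11), p(13), p(15)))
```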
More specifically, in a second embodiment of the present invention, a depth camera is used as the external sensor. It can collect user video image data similarly to the camera, and the depth (i.e., distance) information collected by the depth camera can further be combined to obtain more accurate coordinate positions of the nodes.
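One plausible way to use the depth channel, assuming a calibrated pinhole camera: back-project the pixel at which a node was detected through the camera intrinsics to recover its 3D position directly. The intrinsic values in the example call are illustrative, not from the application.

```python
import numpy as np

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Pixel (u, v) with metric depth -> 3D point in camera coordinates."""
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# e.g. an elbow keypoint detected at pixel (412, 310) at 1.83 m depth
print(backproject(412, 310, 1.83, fx=615.0, fy=615.0, cx=320.0, cy=240.0))
```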
More specifically, in a third embodiment of the present invention, a three-dimensional lidar and a three-dimensional millimeter-wave radar are used as the external sensor to collect three-dimensional point cloud data of the user, and physical feature points of the human body are then extracted from the three-dimensional point cloud data by deep learning.
More specifically, in embodiments of the present invention the external sensor may also be a combination of multiple devices of the same or different types. For example, in a fourth embodiment of the present invention, a camera and a three-dimensional millimeter-wave radar are combined to collect the user's video image data and three-dimensional point cloud data simultaneously; the distance information contained in the three-dimensional point cloud data is then used to constrain the physical feature points of the human body extracted from the video image data, yielding a more accurate estimate of the three-dimensional human posture parameters. Alternatively, the video image data and the three-dimensional point cloud data are merged as input features, and a posture-generation deep neural network trained on such input features processes them and outputs the three-dimensional human posture parameters.
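The merged-input variant can be pictured as a two-branch network whose image features and point-cloud features are concatenated before a shared output head. The sketch below assumes PyTorch; the layer sizes and the max-pooled per-point MLP are assumptions, since the application does not fix an architecture.

```python
import torch
import torch.nn as nn

class FusionPoseNet(nn.Module):
    """Illustrative two-branch model: CNN over the camera image, per-point MLP
    with symmetric max-pooling over the radar point cloud, fused by concatenation."""
    def __init__(self, num_nodes=20):
        super().__init__()
        self.num_nodes = num_nodes
        self.img_branch = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())           # -> (B, 32)
        self.point_branch = nn.Sequential(
            nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 64))  # -> (B, N, 64)
        self.head = nn.Linear(32 + 64, num_nodes * 3)

    def forward(self, image, points):
        f_img = self.img_branch(image)                       # (B, 32)
        f_pts = self.point_branch(points).max(dim=1).values  # (B, 64)
        fused = torch.cat([f_img, f_pts], dim=1)
        return self.head(fused).view(-1, self.num_nodes, 3)

# e.g. a batch of two 3x128x128 frames with 1024 radar points each
net = FusionPoseNet()
print(net(torch.randn(2, 3, 128, 128), torch.randn(2, 1024, 3)).shape)  # (2, 20, 3)
```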
Specifically, the posture-generation deep neural network in embodiments of the present invention can be trained through the following steps: S101, collecting posture data of the human body in various motion states by means of wearable and/or patch sensors and constructing the three-dimensional human posture parameters as input features, while simultaneously using the external sensor to collect the whole-body-posture-related data as output features, and dividing the input features and corresponding output features into a training set and a validation set; S102, according to the data dimensions of the input features and the output features, building a posture-generation deep neural network model based on a convolutional neural network (Convolutional Neural Network, CNN) and/or a recurrent neural network (Recurrent Neural Network, RNN) and/or an encoder-decoder neural network (Encoder-Decoder) and/or a self-attention mechanism (Self-Attention); S103, using methods from the field of deep learning such as back-propagation (Back-Propagation), training the posture-generation deep neural network model on the training set and tuning the hyperparameters (Hyper-Parameters) on the validation set.
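A minimal training sketch of S101-S103 follows, assuming PyTorch and a data loader that pairs each external-sensor sample with the (20, 3) posture parameters recorded by the wearable/patch sensors; the MSE loss, Adam optimizer, and epoch count are illustrative hyper-parameters of the kind that S103 says would be tuned on the validation set.

```python
import torch
import torch.nn as nn

def train_pose_net(model, train_loader, val_loader, lr=1e-3, epochs=20):
    """S103: back-propagation training, with validation loss for tuning."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        model.train()
        for inputs, pose_params in train_loader:   # S101 training pairs
            pred = model(*inputs) if isinstance(inputs, (tuple, list)) else model(inputs)
            loss = loss_fn(pred, pose_params)
            opt.zero_grad()
            loss.backward()                        # back-propagation, as in S103
            opt.step()
    model.eval()                                   # held-out validation set
    with torch.no_grad():
        val = sum(
            loss_fn(model(*x) if isinstance(x, (tuple, list)) else model(x), y).item()
            for x, y in val_loader)
    return val / max(len(val_loader), 1)
```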
According to a preferred embodiment of the present invention, the features of the posture-generation deep neural network include extracting, from each frame of whole-body-posture-related data, the three-dimensional human posture parameters corresponding to the user's posture.
According to a preferred embodiment of the present invention, the three-dimensional human posture parameters include the following nodes: hips, spine, neck, head, left shoulder, left arm, left forearm, left hand, right shoulder, right arm, right forearm, right hand, left upper leg, left leg, left foot, left toe base, right upper leg, right leg, right foot, and right toe base; each node represents its relative rotation angle with a set of three-dimensional floating-point data.
According to a preferred embodiment of the present invention, the parameterized human body model forms a tree structure from the above nodes, and each node expresses its positional relationship through the relative rotation angle relative to its parent node.
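The tree structure can be made concrete with a small forward-kinematics sketch: each node stores one three-float relative rotation, and world positions are obtained by accumulating rotations from the root (hips) outward. The node names, bone offsets, and the x-y-z Euler-angle convention below are illustrative assumptions; the application fixes only the 20-node list and the parent-relative rotations.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

class Node:
    """One joint of the tree: a rest-pose offset from its parent plus a
    relative rotation stored as three floats (here Euler angles, degrees)."""
    def __init__(self, name, offset, parent=None):
        self.name = name
        self.offset = np.asarray(offset, dtype=float)
        self.rel_rot = np.zeros(3)
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def world_positions(self, parent_rot=None, parent_pos=None):
        """Accumulate rotations root-to-leaf; returns {name: 3D position}."""
        if parent_rot is None:
            parent_rot = R.identity()
        if parent_pos is None:
            parent_pos = np.zeros(3)
        pos = parent_pos + parent_rot.apply(self.offset)
        rot = parent_rot * R.from_euler("xyz", self.rel_rot, degrees=True)
        out = {self.name: pos}
        for child in self.children:
            out.update(child.world_positions(rot, pos))
        return out

hips = Node("hips", [0.0, 0.0, 0.0])          # root of the 20-node tree
spine = Node("spine", [0.0, 0.25, 0.0], hips)
neck = Node("neck", [0.0, 0.30, 0.0], spine)
spine.rel_rot = np.array([30.0, 0.0, 0.0])    # bending the spine carries the neck with it
print(hips.world_positions()["neck"])
```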
According to a preferred embodiment of the present invention, the user's virtual three-dimensional body is a three-dimensional obj model containing at least vertex and triangular face information; each vertex is assigned, according to its distance relationship to one or more of the nodes nearest to it, weight values corresponding to those nodes, so that when those nodes move they drive the movement of the vertex and in turn the movement of the entire virtual three-dimensional body.
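This per-vertex weighting is what graphics pipelines call linear blend skinning. The sketch below shows one plausible reading: weights inversely proportional to the distance to each vertex's k nearest nodes, then a weighted blend of per-node skinning matrices applied to the rest-pose obj vertices. The fall-off rule and the value of k are assumptions; the application leaves them open.

```python
import numpy as np

def nearest_node_weights(rest_verts, node_pos, k=2):
    """Per-vertex weights over the k nearest nodes, inversely proportional
    to distance; rest_verts (V, 3), node_pos (J, 3) -> weights (V, J)."""
    d = np.linalg.norm(rest_verts[:, None, :] - node_pos[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    rows = np.arange(d.shape[0])[:, None]
    w = np.zeros_like(d)
    w[rows, idx] = 1.0 / (d[rows, idx] + 1e-8)
    return w / w.sum(axis=1, keepdims=True)

def skin_vertices(rest_verts, weights, node_transforms):
    """Linear blend skinning: each obj vertex moves by the weighted blend of
    its nodes' skinning matrices (current world transform times the inverse
    rest-pose transform); node_transforms (J, 4, 4) -> moved vertices (V, 3).
    The triangular faces of the obj model reuse the same vertex indices."""
    homo = np.hstack([rest_verts, np.ones((len(rest_verts), 1))])    # (V, 4)
    per_node = np.einsum("jab,vb->vja", node_transforms, homo)       # (V, J, 4)
    return np.einsum("vj,vja->va", weights, per_node)[:, :3]
```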
FIG. 2 shows a schematic composition diagram of a system for enhancing the user experience of a virtual reality system according to the present invention, comprising a VR display module 1, an external sensor module 2, a posture parameter generation module 3, and a VR scene generation module 4. The VR display module 1 is configured to receive the VR scene generated by the VR scene generation module 4 and display it to the user; the external sensor module 2 is configured to collect the user's whole-body-posture-related data and send it to the posture parameter generation module 3; the posture parameter generation module 3 is configured to receive the whole-body-posture-related data sent by the external sensor module 2, process it with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters, and send them to the VR scene generation module 4; and the VR scene generation module 4 is configured to generate the VR scene, receive the three-dimensional human posture parameters sent by the posture parameter generation module 3, use the three-dimensional human posture parameters to drive the parameterized human body model, add the user's virtual three-dimensional body to the VR scene, and send the VR scene to the VR display module 1.
According to a preferred embodiment of the present invention, the VR display module 1 includes at least one of the following: a VR helmet, VR glasses, and VR goggles.
According to a preferred embodiment of the present invention, the external sensor module 2 includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
Further, the component modules of the system for enhancing the user experience of a virtual reality system provided by the present invention need not be physically independent devices; multiple modules may also be integrated into the same device. For example, in embodiments, the VR display module and the VR scene generation module may be integrated into a VR helmet device; or the VR display module, the VR scene generation module, and the posture parameter generation module may be integrated together into a VR helmet device; or the external sensor module and the posture parameter generation module may be integrated into one posture acquisition device; among other combinations.
The method and system for enhancing the user experience of a virtual reality system provided by the present invention have the following advantages: they can accurately track the motion postures of the human body and display them superimposed in the VR scene, so that the user sees in the VR scene a virtual body, hands, and feet consistent with his or her real postures and movements, making the experience more realistic; because external sensors are used, cumbersome procedures such as putting on, taking off, calibrating, and charging are unnecessary, making the system more convenient and easy to use; and because the user's posture is tracked and displayed, it is convenient for the user to interactively control the VR scene with gestures and other body language, and for different users to communicate with each other through body language.
It should be noted that the method and system for enhancing the user experience of a virtual reality system proposed in the present invention are equally applicable to mixed reality (Mixed Reality, MR) and augmented reality (Augmented Reality, AR) systems based on the principles of virtual reality systems.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet site, provided on a carrier signal, or provided in any other form.
The system for enhancing the user experience of a virtual reality system of the present invention conventionally includes a processor and a computer program product or computer-readable medium in the form of memory. The memory may be electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The memory has storage space for program code for performing any of the method steps in the methods described above. For example, the storage space for program code may include individual program codes for implementing the various steps in the above methods. These program codes may be read from or written into one or more computer program products. These computer program products include program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such computer program products are usually portable or fixed storage units. The storage units may be similarly arranged storage segments, storage spaces, and the like. The program code may, for example, be compressed in an appropriate form. Typically, the storage unit includes computer-readable code for performing the steps of the method according to the present invention, i.e., code readable by, for example, a processor, which, when executed, causes the interactive control apparatus of the virtual reality system to perform the steps of the methods described above.
It should be understood that the embodiments illustrated and described are not limited in application to the details of construction set forth in the above description or illustrated in the drawings. The illustrated examples may be other embodiments and can be implemented or carried out in various ways. The examples are provided by way of explanation of the disclosed embodiments, not limitation. Indeed, it will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the scope or spirit of the present disclosure. For example, features illustrated or described as part of one embodiment can be used with another embodiment to yield a still further embodiment. Accordingly, the present disclosure is intended to cover such modifications and variations as fall within the scope of the appended claims and their equivalents.

Claims (18)

  1. A method for enhancing the user experience of a virtual reality system, comprising the following steps:
    collecting whole-body-posture-related data of a user by means of an external sensor;
    processing the whole-body-posture-related data by means of a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters; and
    using the three-dimensional human posture parameters to drive a parameterized human body model and generating a virtual three-dimensional body of the user in a virtual reality scene, the virtual three-dimensional body following the user's posture and moving in real time.
  2. The method according to claim 1, wherein the posture-generation deep neural network is pre-trained through the following steps:
    collecting posture data of the human body in various motion states by means of wearable and/or patch sensors and constructing the three-dimensional human posture parameters as input features, while simultaneously using the external sensor to collect the whole-body-posture-related data as output features, and dividing the input features and corresponding output features into a training set and a validation set;
    according to the data dimensions of the input features and the output features, building a posture-generation deep neural network model based on a convolutional neural network and/or a recurrent neural network and/or an encoder-decoder neural network and/or a self-attention mechanism; and
    using methods from the field of deep learning, training the posture-generation deep neural network model on the training set and tuning the hyperparameters on the validation set.
  3. The method according to claim 2, wherein the methods from the field of deep learning include the back-propagation algorithm.
  4. The method according to claim 1, wherein the external sensor includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
  5. The method according to claim 1, wherein the whole-body-posture-related data includes at least one of the following: video image data of the user, user video image data with depth information, lidar point cloud data, and millimeter-wave radar point cloud data.
  6. The method according to claim 1, wherein the features of the posture-generation deep neural network include extracting, from each frame of whole-body-posture-related data, the three-dimensional human posture parameters corresponding to the user's posture.
  7. The method according to any one of claims 1-6, wherein the three-dimensional human posture parameters include the following nodes: hips, spine, neck, head, left shoulder, left arm, left forearm, left hand, right shoulder, right arm, right forearm, right hand, left upper leg, left leg, left foot, left toe base, right upper leg, right leg, right foot, and right toe base, each node representing its relative rotation angle with a set of three-dimensional floating-point data.
  8. The method according to claim 7, wherein the parameterized human body model forms a tree structure from all of the nodes, and each node expresses its positional relationship through the relative rotation angle relative to its parent node.
  9. The method according to claim 7, wherein the user's virtual three-dimensional body is a three-dimensional obj model containing at least vertex and triangular face information, and each vertex is assigned, according to its distance relationship to one or more of the nodes nearest to it, weight values corresponding to those nodes, so that when the nodes move they drive the movement of the vertex and in turn the movement of the entire virtual three-dimensional body.
  10. A system for enhancing the user experience of a virtual reality system, comprising a virtual reality display module, an external sensor module, a posture parameter generation module, and a virtual reality scene generation module,
    wherein the virtual reality display module is configured to receive the virtual reality scene generated by the virtual reality scene generation module and display it to a user; the external sensor module is configured to collect whole-body-posture-related data of the user and send it to the posture parameter generation module; the posture parameter generation module is configured to receive the whole-body-posture-related data sent by the external sensor module, process it with a pre-trained posture-generation deep neural network to generate three-dimensional human posture parameters, and send them to the virtual reality scene generation module; and the virtual reality scene generation module is configured to generate the virtual reality scene, receive the three-dimensional human posture parameters sent by the posture parameter generation module, use the three-dimensional human posture parameters to drive a parameterized human body model, add the user's virtual three-dimensional body to the virtual reality scene, and send the virtual reality scene to the virtual reality display module.
  11. The system according to claim 10, wherein the posture-generation deep neural network is pre-trained through the following steps:
    collecting posture data of the human body in various motion states by means of wearable and/or patch sensors and constructing the three-dimensional human posture parameters as input features, while simultaneously using the external sensor to collect the whole-body-posture-related data as output features, and dividing the input features and corresponding output features into a training set and a validation set;
    according to the data dimensions of the input features and the output features, building a posture-generation deep neural network model based on a convolutional neural network and/or a recurrent neural network and/or an encoder-decoder neural network and/or a self-attention mechanism; and
    using methods from the field of deep learning, training the posture-generation deep neural network model on the training set and tuning the hyperparameters on the validation set.
  12. The system according to claim 11, wherein the methods from the field of deep learning include the back-propagation algorithm.
  13. The system according to claim 10, wherein the virtual reality display module includes at least one of the following: a virtual reality helmet, virtual reality glasses, and virtual reality goggles.
  14. The system according to claim 10, wherein the external sensor module includes at least one of the following: a camera, a depth camera, a three-dimensional lidar, and a three-dimensional millimeter-wave radar.
  15. The system according to claim 10, wherein the features of the posture-generation deep neural network include extracting, from each frame of whole-body-posture-related data, the three-dimensional human posture parameters corresponding to the user's posture.
  16. The system according to any one of claims 10-15, wherein the three-dimensional human posture parameters include the following nodes: hips, spine, neck, head, left shoulder, left arm, left forearm, left hand, right shoulder, right arm, right forearm, right hand, left upper leg, left leg, left foot, left toe base, right upper leg, right leg, right foot, and right toe base, each node representing its relative rotation angle with a set of three-dimensional floating-point data.
  17. The system according to claim 16, wherein the parameterized human body model forms a tree structure from all of the nodes, and each node expresses its positional relationship through the relative rotation angle relative to its parent node.
  18. The system according to claim 16, wherein the user's virtual three-dimensional body is a three-dimensional obj model containing at least vertex and triangular face information, and each vertex is assigned, according to its distance relationship to one or more of the nodes nearest to it, weight values corresponding to those nodes, so that when the nodes move they drive the movement of the vertex and in turn the movement of the entire virtual three-dimensional body.
PCT/CN2023/074391 2022-02-25 2023-02-03 Method and system for enhancing the user experience of a virtual reality system WO2023160356A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210177953.XA CN116700471A (zh) 2022-02-25 2022-02-25 Method and system for enhancing the user experience of a virtual reality system
CN202210177953.X 2022-02-25

Publications (1)

Publication Number Publication Date
WO2023160356A1 true WO2023160356A1 (zh) 2023-08-31

Family

ID=87764777

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074391 WO2023160356A1 (zh) 2022-02-25 2023-02-03 Method and system for enhancing the user experience of a virtual reality system

Country Status (2)

Country Link
CN (1) CN116700471A (zh)
WO (1) WO2023160356A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115400A (zh) * 2023-09-15 2023-11-24 深圳市红箭头科技有限公司 Method and apparatus for real-time display of whole-body human motion, computer device, and storage medium
CN117742499A (zh) * 2023-12-29 2024-03-22 武汉科技大学 Flywheel-based VR scene somatosensory simulation system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108196679A (zh) * 2018-01-23 2018-06-22 河北中科恒运软件科技股份有限公司 Gesture capture and texture fusion method and system based on video streams
CN108369478A (zh) * 2015-12-29 2018-08-03 微软技术许可有限责任公司 Hand tracking for interaction feedback
CN108983979A (zh) * 2018-07-25 2018-12-11 北京因时机器人科技有限公司 Gesture tracking and recognition method, apparatus, and intelligent device
CN112102682A (zh) * 2020-11-09 2020-12-18 中电科芜湖钻石飞机制造有限公司南京研发中心 Aircraft piloting training system and method based on 5G communication
CN113421328A (zh) * 2021-05-27 2021-09-21 中国人民解放军军事科学院国防科技创新研究院 Three-dimensional human body virtualization reconstruction method and apparatus

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108369478A (zh) * 2015-12-29 2018-08-03 微软技术许可有限责任公司 Hand tracking for interaction feedback
CN108196679A (zh) * 2018-01-23 2018-06-22 河北中科恒运软件科技股份有限公司 Gesture capture and texture fusion method and system based on video streams
CN108983979A (zh) * 2018-07-25 2018-12-11 北京因时机器人科技有限公司 Gesture tracking and recognition method, apparatus, and intelligent device
CN112102682A (zh) * 2020-11-09 2020-12-18 中电科芜湖钻石飞机制造有限公司南京研发中心 Aircraft piloting training system and method based on 5G communication
CN113421328A (zh) * 2021-05-27 2021-09-21 中国人民解放军军事科学院国防科技创新研究院 Three-dimensional human body virtualization reconstruction method and apparatus

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115400A (zh) * 2023-09-15 2023-11-24 深圳市红箭头科技有限公司 Method and apparatus for real-time display of whole-body human motion, computer device, and storage medium
CN117742499A (zh) * 2023-12-29 2024-03-22 武汉科技大学 Flywheel-based VR scene somatosensory simulation system
CN117742499B (zh) 2023-12-29 2024-05-31 武汉科技大学 Flywheel-based VR scene somatosensory simulation system

Also Published As

Publication number Publication date
CN116700471A (zh) 2023-09-05

Similar Documents

Publication Publication Date Title
CN111126272B (zh) Posture acquisition method, and training method and apparatus for a key point coordinate positioning model
CN111460875B (zh) Image processing method and apparatus, image device, and storage medium
WO2023160356A1 (zh) Method and system for enhancing the user experience of a virtual reality system
CN111402290B (zh) Motion restoration method and apparatus based on skeleton key points
Shingade et al. Animation of 3d human model using markerless motion capture applied to sports
CN112602090A (zh) Method and system for interpolating disparate inputs
EP4307233A1 (en) Data processing method and apparatus, and electronic device and computer-readable storage medium
CN106484115A (zh) Systems and methods for augmented and virtual reality
CN104508709A (zh) Animating objects using the human body
CN110348370B (zh) Augmented reality system and method for human motion recognition
Huang et al. A review of 3D human body pose estimation and mesh recovery
Fu et al. Capture of 3D human motion pose in virtual reality based on video recognition
Qianwen Application of motion capture technology based on wearable motion sensor devices in dance body motion recognition
Jiang et al. Rgbd-based real-time 3d human pose estimation for fitness assessment
Hori et al. Silhouette-Based 3D Human Pose Estimation Using a Single Wrist-Mounted 360° Camera
Cai et al. A method for 3D human pose estimation and similarity calculation in Tai Chi videos
Dong Design of smart operating table based on HCI and virtual visual tracking technology
Niu et al. From Method to Application: A Review of Deep 3D Human Motion Capture
Li et al. Computer-aided teaching software of three-dimensional model of sports movement based on kinect depth data
Ma et al. Value evaluation of human motion simulation based on speech recognition control
Zhao et al. Implementation of Computer Aided Dance Teaching Integrating Human Model Reconstruction Technology
Zhou et al. Tracking of Deformable Human Avatars through Fusion of Low-Dimensional 2D and 3D Kinematic Models
Qu et al. Gaussian process latent variable models for inverse kinematics
Dong Three-Dimensional Animation Capture Driver Technology for Digital Media.
Tkach Real-Time Generative Hand Modeling and Tracking

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23758981

Country of ref document: EP

Kind code of ref document: A1