CN115810099A - Image fusion equipment for virtual immersion type depression treatment system - Google Patents

Image fusion equipment for virtual immersion type depression treatment system

Info

Publication number
CN115810099A
CN115810099A (application CN202310054924.9A)
Authority
CN
China
Prior art keywords
image
fused
fusion
physiological data
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310054924.9A
Other languages
Chinese (zh)
Other versions
CN115810099B (en)
Inventor
严龙生
林友辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Yi'an Intelligent Technology Co ltd
Original Assignee
Xiamen Yi'an Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Yi'an Intelligent Technology Co ltd filed Critical Xiamen Yi'an Intelligent Technology Co ltd
Priority to CN202310054924.9A priority Critical patent/CN115810099B/en
Publication of CN115810099A publication Critical patent/CN115810099A/en
Application granted granted Critical
Publication of CN115810099B publication Critical patent/CN115810099B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention provides an image fusion device for a virtual immersive depression treatment system, where the system comprises a camera, a VR device, a physiological data sensor and a server. The device comprises an acquisition unit, a selection unit, a fusion unit and a playing unit that cooperate with one another: the camera captures video images of the user and sends them to the server, which extracts a key image sequence; the user's physiological data collected by the physiological data sensor are sent to the server, which then selects a scene video from a scene database; the server uses a multi-parameter generative adversarial neural network model to fuse each frame of the key frame image sequence into the scene video based on the physiological data, generating a fused scene video image that is sent to the VR device for playback. Fusing the patient's own image with frames of the virtual scene enhances the patient's sense of immersion.

Description

Image fusion equipment for virtual immersion type depression treatment system
Technical Field
The invention relates to the technical field of artificial intelligence and medical data processing, and in particular to an image fusion device for a virtual immersive depression treatment system.
Background
Chinese patent application 2017103003414 discloses a virtual scene system based on VR technology for the adjuvant therapy of depressive disorder. The system includes a virtual scene interaction module, a data acquisition module and a data analysis module that are connected to one another; the virtual scene interaction module includes VR equipment and is connected to the data acquisition module through the VR equipment, and the data analysis module includes a database. The virtual scene interaction module simulates the story-telling techniques of psychological counselling used by a doctor, provides a variety of virtual scenes with rich storylines by means of the VR equipment, supports interactive behaviour, and creates an immersive experience environment for the user; it serves as the trigger source for mood and physiological data during the scene-building stage and, once the scene has been optimized, relieves depressive mood during the use stage. The data acquisition module collects the user's feedback data in the virtual scene interaction module and sends it to the data analysis module. The data analysis module, combining statistical theory, converts the feedback data in the database into adjustment information for the various values required by the virtual scene interaction module. The virtual scene interaction module further comprises an environment module, a role module and a plot module, which together construct a plurality of virtual scenes and form a virtual scene story system along a time axis. The virtual scenes are initially created from practical experience in relieving depressive mood and the professional guidance of psychologists; in later use, the environment, role and plot modules are adjusted according to the feedback data from the data analysis module to obtain the virtual scene with the optimal value.
However, the virtual scenes used by such a system generally substitute cartoon characters for the patient, so the patient's sense of immersion is weak and the therapeutic effect is poor.
In the prior art, image fusion is generally performed with wavelet transforms or generative adversarial neural networks. In a virtual environment for depression therapy, however, performing image fusion based on the patient's physiological parameters remains a technical challenge: how to make the person in the fused video match the patient's physical condition while also improving the fusion speed.
Disclosure of Invention
The present invention proposes the following technical solutions to address one or more technical defects in the prior art.
An image fusion device for a virtual immersive depression treatment system, the virtual immersive depression treatment system comprising a camera, a VR device, a physiological data sensor, and a server, the device comprising:
an acquisition unit, configured to have the camera capture video images of the user and send them to the server; after receiving the video images, the server extracts a key image sequence from them and stores it in a first cache queue;
a selection unit, configured to send the user's physiological data collected by the physiological data sensor to the server, which selects a scene video from a scene database based on the physiological data;
a fusion unit, configured to have the server fuse each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data, using a multi-parameter generative adversarial neural network model, to generate a fused scene video image, wherein the multi-parameter generative adversarial neural network model is obtained by training with an optimized loss function;
and a playing unit, configured to send the fused scene video image to the VR device and play it to the user on the display of the VR device.
Further, the physiological data comprise at least body temperature, brain wave, blood pressure, heart rate, electrocardiogram (ECG) and electromyogram (EMG) data.
Further, the key image sequence is extracted from the video images as follows: each frame of the video images is input into a first convolutional neural network for processing to obtain the key frame image sequence.
Further, the fusion unit operates as follows: the physiological data Pi corresponding to a key frame image Mi in the key frame image sequence is obtained from the server, where Pi = (Ti, NEi, BPi, HRi, HEi, MEi); the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi; the fusion coordinates of the key frame image Mi within the image frame Ni to be fused are determined; the key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are input into the multi-parameter generative adversarial neural network model to generate a fused image frame; after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image, where n ≥ i ≥ 0, n is the total number of frames in the key frame image sequence, and Ti, NEi, BPi, HRi, HEi and MEi respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data.
Further, the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi as follows: the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused.
Further, the generative adversarial neural network model includes a fused-image generator G and a fused-image discriminator D. The generator G and the discriminator D are trained alternately: each training sample in the sample set is input into the fused-image generator G to generate a fused image, and the fused-image discriminator D is used to discriminate the difference between the generated fused image and the real image.
Further, each training sample Sj in the sample set comprises a user image Uj, user physiological data Pj, a background image Bj to be fused, a fusion coordinate Cj and a real image RMj, where Pj = (Tj, NEj, BPj, HRj, HEj, MEj), m ≥ j ≥ 0, m is the number of training samples, and Tj, NEj, BPj, HRj, HEj and MEj respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data in the user physiological data Pj of the training sample Sj.
Further, the optimized loss function includes a loss function LossG of the fused image generator G and a loss function LossD of the fused image discriminator D, wherein:
[The expressions for LossG and LossD are given as equation images in the original publication and are not reproduced here.]
In these expressions, one symbol denotes the fused image generated by the fused-image generator G for a given input training sample; another denotes the recognition result of the fused-image discriminator D on that generated fused image; a third denotes the difference between the generated fused image and the real image RMj; and a fourth denotes the recognition result of the fused-image discriminator D on RMj. The trained adversarial network is obtained after several rounds of iterative training.
The invention also provides an image fusion method for the virtual immersive depression treatment system, wherein the virtual immersive depression treatment system comprises a camera, a VR device, a physiological data sensor and a server, and the method comprises the following steps:
the method comprises the steps of collecting a video image of a user by a camera and sending the video image to a server, extracting a key image sequence from the video image after the server receives the video image, and storing the key image sequence in a first cache queue;
a selecting step, namely sending the physiological data of the user, which is acquired by the physiological data sensor, to the server, and selecting a scene video from a scene database by the server based on the physiological data;
a fusion step, wherein the server fuses each frame image in the key frame image sequence in the first cache queue into the scene video to generate a fusion scene video image based on the physiological data by using a multi-parameter confrontation generation neural network model, wherein the multi-parameter confrontation generation neural network model is obtained by adopting optimized loss function training;
and a playing step, namely sending the fusion scene video image to the VR equipment, and playing the fusion scene video image to the user in a display device of the VR equipment.
Still further, the physiological data includes at least body temperature, brain waves, blood pressure, heart rate, electrocardiogram, and electromyogram data.
Further, the key image sequence is extracted from the video images as follows: each frame of the video images is input into a first convolutional neural network for processing to obtain the key frame image sequence.
Further, the fusing step operates as follows: the physiological data Pi corresponding to a key frame image Mi in the key frame image sequence is obtained from the server, where Pi = (Ti, NEi, BPi, HRi, HEi, MEi); the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi; the fusion coordinates of the key frame image Mi within the image frame Ni to be fused are determined; the key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are input into the multi-parameter generative adversarial neural network model to generate a fused image frame; after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image, where n ≥ i ≥ 0, n is the total number of frames in the key frame image sequence, and Ti, NEi, BPi, HRi, HEi and MEi respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data.
Further, the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi as follows: the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused.
Further, the multi-parameter generative adversarial neural network model includes a fused-image generator G and a fused-image discriminator D. The generator G and the discriminator D are trained alternately: each training sample in the sample set is input into the fused-image generator G to generate a fused image, and the fused-image discriminator D is used to discriminate the difference between the generated fused image and the real image.
Furthermore, each training sample Sj in the sample set comprises a user image Uj, user physiological data Pj, a background image Bj to be fused, a fusion coordinate Cj and a real image RMj, where Pj = (Tj, NEj, BPj, HRj, HEj, MEj), m ≥ j ≥ 0, m is the number of training samples, and Tj, NEj, BPj, HRj, HEj and MEj respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data in the user physiological data Pj of the training sample Sj.
Further, the optimized loss function includes a loss function LossG of the fused image generator G and a loss function LossD of the fused image discriminator D, wherein:
[The expressions for LossG and LossD are given as equation images in the original publication and are not reproduced here.]
In these expressions, one symbol denotes the fused image generated by the fused-image generator G for a given input training sample; another denotes the recognition result of the fused-image discriminator D on that generated fused image; a third denotes the difference between the generated fused image and the real image RMj; and a fourth denotes the recognition result of the fused-image discriminator D on RMj. The trained adversarial network is obtained after several rounds of iterative training.
The invention also proposes an electronic device comprising a processor and a memory connected to the processor, the memory storing program code; when the processor executes the program code in the memory, any of the methods mentioned above is carried out.
The present invention also proposes a computer-readable storage medium having stored thereon computer program code which, when executed by a computer, performs the method of any of the above.
The technical effects of the invention are as follows. The invention relates to an image fusion method, an electronic device and a storage medium for a virtual immersive depression treatment system, where the system comprises a camera, a VR device, a physiological data sensor and a server, and the method comprises: an acquisition step S101, in which the camera captures video images of the user and sends them to the server, which extracts a key image sequence from them and stores it in a first cache queue; a selection step S102, in which the user's physiological data collected by the physiological data sensor is sent to the server, which selects a scene video from a scene database based on the physiological data; a fusion step S103, in which the server fuses each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data, using a multi-parameter generative adversarial neural network model trained with an optimized loss function, to generate a fused scene video image; and a playing step S104, in which the fused scene video image is sent to the VR device and played to the user on the display of the VR device. In other words, video images of the user (i.e. a depression patient) are collected, a key image sequence is extracted from them and stored in a first cache queue; the user's physiological data collected by the physiological data sensor are sent to the server, which selects a scene video from the scene database based on the physiological data; a multi-parameter generative adversarial neural network model, trained with an optimized loss function, fuses each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data to generate a fused scene video image, which is then sent to the VR device and played to the user on its display.
Because the method extracts a key frame sequence from the patient's own images and then fuses it with frame images of the virtual scene, the patient's sense of immersion is enhanced, which solves the technical problem identified in the background art that cartoon characters in the scene video look unrealistic. In the invention, the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused; in this way the user's motion is matched to the motion of the virtual human, so that the user replaces the virtual human in the video. The fusion coordinates of the key frame image Mi are then determined within the image frame Ni to be fused; because the fusion coordinates are fixed in advance, the amount of computation of the generative adversarial network model during fusion is reduced and the fusion speed is improved. The key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are then input into the generative adversarial neural network model to generate a fused image frame, and after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image. Because the fusion coordinates are determined in advance, the computation of the generative adversarial network model is reduced, the fusion speed is improved, and the generated fused scene video better matches the user's motion, so the patient obtains a more vivid sense of immersion, which aids recovery from depression. The generative adversarial neural network model of the invention generates images from multiple parameters, part of which are the patient's physiological data, so that the user's image in the fused video better reflects the patient's condition.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
Fig. 1 is a flow diagram of an image fusion method for a virtual immersive depression treatment system according to an embodiment of the present invention.
Fig. 2 is a block diagram of an image fusion device for a virtual immersive depression treatment system according to an embodiment of the present invention.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 shows an image fusion method of the present invention for a virtual immersive depression treatment system including a camera, a VR device, a physiological data sensor, and a server, the method including:
the method comprises the following steps that S101, a camera collects video images of a user and sends the video images to a server, the server extracts a key image sequence from the video images after receiving the video images, and the key image sequence is stored in a first cache queue;
a selecting step S102, in which the physiological data of the user collected by the physiological data sensor is sent to the server, and the server selects a scene video from a scene database based on the physiological data;
a fusion step S103, fusing each frame image in the key frame image sequence in the first cache queue into the scene video by the server based on the physiological data by using a multi-parameter confrontation generation neural network model to generate a fusion scene video image, wherein the multi-parameter confrontation generation neural network model is obtained by adopting optimized loss function training;
and a playing step S104, sending the fusion scene video image to the VR equipment, and playing the fusion scene video image to the user in a display device of the VR equipment.
In this way, video images of the user (i.e. a depression patient) are collected, a key image sequence is extracted from them and stored in a first cache queue; the user's physiological data collected by the physiological data sensor are sent to the server, which selects a scene video from a scene database based on the physiological data; a multi-parameter generative adversarial neural network model, trained with an optimized loss function, fuses each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data to generate a fused scene video image, which is sent to the VR device and played to the user on its display. Because a key frame sequence is extracted from the patient's own images and then fused with frame images of the virtual scene, the patient's sense of immersion is enhanced, which solves the technical problem identified in the background art that cartoon characters in the scene video look unrealistic.
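As a concrete illustration of steps S101 to S104, the following is a minimal server-side pipeline sketch in Python. The helper callables (extract_key_frames, select_scene_video, fuse_frame, play) are hypothetical stand-ins for the first convolutional neural network, the scene-database lookup, the multi-parameter generative adversarial model and the VR playback; none of these names come from the patent itself.

```python
from collections import deque
from typing import Callable, Iterable, List, Sequence

def run_session(
    camera_frames: Iterable,
    physiological_data: Sequence[float],
    extract_key_frames: Callable[[Iterable], List],        # first CNN (step S101)
    select_scene_video: Callable[[Sequence[float]], List],  # scene-database lookup (step S102)
    fuse_frame: Callable,                                    # multi-parameter GAN generator (step S103)
    play: Callable[[List], None],                            # VR playback (step S104)
) -> List:
    """Illustrative sketch of steps S101-S104; all callables are assumed stand-ins."""
    # S101: extract the key image sequence and store it in a first cache queue.
    key_frame_queue = deque(extract_key_frames(camera_frames))

    # S102: select a scene video from the scene database based on the physiological data.
    scene_video = select_scene_video(physiological_data)

    # S103: fuse every key frame of the queue into the scene video, conditioned on the physiology.
    fused_frames = [fuse_frame(m_i, scene_video, physiological_data) for m_i in key_frame_queue]

    # S104: send the fused scene video image to the VR device for playback.
    play(fused_frames)
    return fused_frames
```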
In a further embodiment, the physiological data comprise at least body temperature, brain wave, blood pressure, heart rate, electrocardiogram (ECG) and electromyogram (EMG) data, which are acquired by the physiological data sensors; the sensors may be arranged at different locations on the patient (user) to acquire these data.
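For concreteness, the per-frame physiological tuple Pi = (Ti, NEi, BPi, HRi, HEi, MEi) can be carried as a small record such as the sketch below; the field names and example units are illustrative assumptions, not prescribed by the patent.

```python
from dataclasses import dataclass

@dataclass
class PhysiologicalData:
    """Per-frame physiological tuple Pi = (Ti, NEi, BPi, HRi, HEi, MEi); units are assumptions."""
    body_temperature: float   # Ti, e.g. degrees Celsius
    brain_wave: float         # NEi, e.g. dominant EEG band power
    blood_pressure: float     # BPi, e.g. mean arterial pressure in mmHg
    heart_rate: float         # HRi, beats per minute
    ecg: float                # HEi, electrocardiogram feature
    emg: float                # MEi, electromyogram feature

    def as_tuple(self):
        return (self.body_temperature, self.brain_wave, self.blood_pressure,
                self.heart_rate, self.ecg, self.emg)
```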
In a further embodiment, the key image sequence is extracted from the video images as follows: each frame of the video images is input into a first convolutional neural network for processing to obtain the key frame image sequence. Neural network techniques for key frame extraction are mature; key frames can be extracted by constructing a convolutional neural network model and training it with a corresponding training sample set.
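The patent does not specify the architecture of the first convolutional neural network. The sketch below assumes one common approach: a small CNN scores each frame and frames whose score exceeds a threshold are kept as key frames. The network layout and the threshold are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class KeyFrameScorer(nn.Module):
    """Illustrative 'first convolutional neural network': scores how key-frame-like a frame is."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):                                   # x: (N, 3, H, W), values in [0, 1]
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.head(h)).squeeze(1)       # per-frame score in [0, 1]

def extract_key_frames(frames, scorer, threshold=0.5):
    """Keep frames whose score exceeds the threshold (the threshold value is an assumption)."""
    with torch.no_grad():
        scores = scorer(torch.stack(frames))
    return [f for f, s in zip(frames, scores) if s.item() > threshold]
```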
In a further embodiment, the fusing step operates as follows: the physiological data Pi corresponding to a key frame image Mi in the key frame image sequence is obtained from the server, where Pi = (Ti, NEi, BPi, HRi, HEi, MEi); the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi; the fusion coordinates of the key frame image Mi within the image frame Ni to be fused are determined; the key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are input into the generative adversarial neural network model to generate a fused image frame; after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image, where n ≥ i ≥ 0, n is the total number of frames in the key frame image sequence, and Ti, NEi, BPi, HRi, HEi and MEi respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data.
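Expressed as a Python sketch, the per-frame fusion described above might look as follows. The callables find_frame_to_fuse, find_fusion_coords and generator stand in for the pose-similarity matching, the coordinate determination and the multi-parameter GAN generator; they are assumptions passed in by the caller rather than implementations prescribed by the patent.

```python
def fuse_key_frames(key_frames, physiological, scene_frames,
                    find_frame_to_fuse, find_fusion_coords, generator):
    """Illustrative per-frame fusion loop; the three callables are hypothetical stand-ins.

    key_frames:    [M_0 .. M_n]  key frame images
    physiological: [P_0 .. P_n]  per-frame tuples (T, NE, BP, HR, HE, ME)
    scene_frames:  frames of the selected scene video
    """
    fused_video = []
    for m_i, p_i in zip(key_frames, physiological):
        n_i = find_frame_to_fuse(m_i, scene_frames)            # pose-similarity match
        coords = find_fusion_coords(m_i, n_i)                  # where M_i is placed inside N_i
        fused_video.append(generator(m_i, n_i, p_i, coords))   # multi-parameter GAN generator
    return fused_video
```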
In a further embodiment, the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi as follows: the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused.
In the invention, the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused; in this way the user's motion is matched to the motion of the virtual human, so that the user replaces the virtual human in the video. The fusion coordinates of the key frame image Mi are then determined within the image frame Ni to be fused; because the fusion coordinates are fixed in advance, the amount of computation of the generative adversarial network model during fusion is reduced and the fusion speed is improved. The key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are then input into the generative adversarial neural network model to generate a fused image frame, and after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image. Because the fusion coordinates are determined in advance, the computation of the generative adversarial network model is reduced, the fusion speed is improved, and the generated fused scene video better matches the user's motion, so the patient obtains a more vivid sense of immersion, which aids recovery from depression; this is another important inventive point of the invention.
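The patent does not fix a particular pose representation or similarity measure. The sketch below assumes each posture is given as an array of 2-D joint coordinates (for example from an off-the-shelf pose estimator) and uses the negative mean joint distance as the similarity, which is one common choice among many; both the representation and the measure are assumptions.

```python
import numpy as np

def pose_similarity(pose_a: np.ndarray, pose_b: np.ndarray) -> float:
    """Similarity of two postures given as (num_joints, 2) arrays; higher is more similar.
    Negative mean joint distance is an assumption, not the patented measure."""
    return -float(np.mean(np.linalg.norm(pose_a - pose_b, axis=1)))

def find_frame_to_fuse(user_pose: np.ndarray, scene_poses: list) -> int:
    """Return the index of the scene frame N_i whose virtual-character posture is most similar
    to the user's posture in the key frame image M_i."""
    scores = [pose_similarity(user_pose, p) for p in scene_poses]
    return int(np.argmax(scores))
```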
In a further embodiment, the multi-parameter generative adversarial neural network model includes a fused-image generator G and a fused-image discriminator D. The generator G and the discriminator D are trained alternately: each training sample of the sample set is input into the fused-image generator G to generate a fused image, and the fused-image discriminator D is used to discriminate the difference between the generated fused image and the real image.
In a further embodiment, each training sample Sj in the sample set includes a user image Uj, user physiological data Pj, a background image Bj to be fused, a fusion coordinate Cj and a real image RMj, wherein Pj = (Tj, NEj, BPj, HRj, HEj, MEj), wherein m ≧ j ≧ 0, m is the number of training samples, and Tj, NEj, BPj, HRj, HEj, MEj respectively represent corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data in the user physiological data Pj in the training sample Sj.
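The alternating training of the fused-image generator G and discriminator D can be sketched in PyTorch as below. Only the alternation of discriminator and generator updates and the sample structure (Uj, Pj, Bj, Cj, RMj) come from the description; the use of binary cross-entropy, the L1 reconstruction term and the way the conditioning inputs are passed to G are illustrative assumptions (the patented loss itself is discussed in the next paragraph).

```python
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, batch):
    """One alternating update of the fused-image discriminator D and generator G.

    batch holds the fields of a training sample Sj:
      'U' user image, 'P' physiological data, 'B' background image to be fused,
      'C' fusion coordinates, 'RM' real (ground-truth) fused image.
    """
    U, P, B, C, RM = batch['U'], batch['P'], batch['B'], batch['C'], batch['RM']
    fake = G(U, P, B, C)                                   # generated fused image

    # Discriminator step: push real images toward 1 and generated images toward 0.
    opt_D.zero_grad()
    d_real = D(RM)
    d_fake = D(fake.detach())
    loss_D = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    loss_D.backward()
    opt_D.step()

    # Generator step: fool the discriminator and stay close to the real fused image RMj.
    opt_G.zero_grad()
    d_fake_for_g = D(fake)
    loss_G = (F.binary_cross_entropy(d_fake_for_g, torch.ones_like(d_fake_for_g)) +
              F.l1_loss(fake, RM))
    loss_G.backward()
    opt_G.step()
    return loss_G.item(), loss_D.item()
```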
In a further embodiment, the optimized loss function comprises a loss function LossG of the fused image generator G and a loss function LossD of the fused image discriminator D, wherein:
[The expressions for LossG and LossD are given as equation images in the original publication and are not reproduced here.]
In these expressions, one symbol denotes the fused image generated by the fused-image generator G for a given input training sample; another denotes the recognition result of the fused-image discriminator D on that generated fused image; a third denotes the difference between the generated fused image and the real image RMj; and a fourth denotes the recognition result of the fused-image discriminator D on RMj. The trained adversarial network is obtained after several rounds of iterative training.
The generative adversarial neural network model of the invention generates images from multiple parameters, part of which are the patient's physiological data, so that the user's image in the fused video better reflects the patient's condition. The invention also improves the loss function of the generative adversarial neural network model: the generator loss is modified based on a comparison between the mean and the maximum of the physiological parameters, and the discriminator loss is modified based on a comparison between the minimum and the mean of the physiological parameters. Through these improvements the loss function reflects the multi-parameter character of the model, so the neural network trains faster and the generated fused image comes closer to the patient's real condition; the improved loss function is therefore another important inventive point of the invention.
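The exact expressions for LossG and LossD appear only as equation images in the original publication and cannot be reproduced here. Purely as an illustration of the stated idea — the generator loss adjusted by comparing the mean and the maximum of the physiological parameters, the discriminator loss adjusted by comparing the minimum and the mean — standard GAN losses could be weighted along the following lines. Every formula in this sketch is an assumption, not the patented loss.

```python
import numpy as np

def physio_weight(p, mode="generator"):
    """Illustrative physiological weighting term (assumption, not the patented formula).

    p: 1-D array of physiological parameters (T, NE, BP, HR, HE, ME), normalized to [0, 1].
    Generator side compares the mean with the maximum; discriminator side compares the
    minimum with the mean, as the description suggests.
    """
    p = np.asarray(p, dtype=float)
    if mode == "generator":
        return 1.0 + (p.max() - p.mean())
    return 1.0 + (p.mean() - p.min())

def loss_g(d_fake, g_out, real, p, lam=1.0):
    """Generator loss: adversarial term plus a difference term against the real image RMj,
    scaled by the generator-side physiological weight."""
    adv = -np.log(d_fake + 1e-8)                 # fool the discriminator
    rec = np.mean(np.abs(g_out - real))          # difference to the real image
    return physio_weight(p, "generator") * (adv + lam * rec)

def loss_d(d_real, d_fake, p):
    """Discriminator loss: binary cross-entropy scaled by the discriminator-side weight."""
    bce = -np.log(d_real + 1e-8) - np.log(1.0 - d_fake + 1e-8)
    return physio_weight(p, "discriminator") * bce
```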
Fig. 2 shows an image fusion device of the present invention for a virtual immersive depression treatment system including a camera, a VR device, a physiological data sensor, and a server, the device including:
an acquisition unit 201, through which the camera captures video images of the user and sends them to the server; after receiving the video images, the server extracts a key image sequence from them and stores it in a first cache queue;
a selection unit 202, through which the user's physiological data collected by the physiological data sensor is sent to the server, and the server selects a scene video from a scene database based on the physiological data;
a fusion unit 203, in which the server fuses each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data, using a multi-parameter generative adversarial neural network model trained with an optimized loss function, to generate a fused scene video image;
and a playing unit 204, which sends the fused scene video image to the VR device and plays it to the user on the display of the VR device.
In this way, video images of the user (i.e. a depression patient) are collected, a key image sequence is extracted from them and stored in a first cache queue; the user's physiological data collected by the physiological data sensor are sent to the server, which selects a scene video from a scene database based on the physiological data; a multi-parameter generative adversarial neural network model, trained with an optimized loss function, fuses each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data to generate a fused scene video image, which is sent to the VR device and played to the user on its display. Because a key frame sequence is extracted from the patient's own images and then fused with frame images of the virtual scene, the patient's sense of immersion is enhanced, which solves the technical problem identified in the background art that cartoon characters in the scene video look unrealistic.
In a further embodiment, the physiological data comprise at least body temperature, brain wave, blood pressure, heart rate, electrocardiogram (ECG) and electromyogram (EMG) data, which are acquired by the physiological data sensors; the sensors may be arranged at different locations on the patient (user) to acquire these data.
In a further embodiment, the key image sequence is extracted from the video images as follows: each frame of the video images is input into a first convolutional neural network for processing to obtain the key frame image sequence. Neural network techniques for key frame extraction are mature; key frames can be extracted after a convolutional neural network model has been constructed and trained with a corresponding training sample set.
In a further embodiment, the fusion unit operates as follows: the physiological data Pi corresponding to a key frame image Mi in the key frame image sequence is obtained from the server, where Pi = (Ti, NEi, BPi, HRi, HEi, MEi); the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi; the fusion coordinates of the key frame image Mi within the image frame Ni to be fused are determined; the key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are input into the generative adversarial neural network model to generate a fused image frame; after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image, where n ≥ i ≥ 0, n is the total number of frames in the key frame image sequence, and Ti, NEi, BPi, HRi, HEi and MEi respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data.
In a further embodiment, the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi as follows: the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused.
In the invention, the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused; in this way the user's motion is matched to the motion of the virtual human, so that the user replaces the virtual human in the video. The fusion coordinates of the key frame image Mi are then determined within the image frame Ni to be fused; because the fusion coordinates are fixed in advance, the amount of computation of the generative adversarial network model during fusion is reduced and the fusion speed is improved. The key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are then input into the generative adversarial neural network model to generate a fused image frame, and after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image. Because the fusion coordinates are determined in advance, the computation of the generative adversarial network model is reduced, the fusion speed is improved, and the generated fused scene video better matches the user's motion, so the patient obtains a more vivid sense of immersion, which aids recovery from depression; this is another important inventive point of the invention.
In a further embodiment, the multi-parameter generative adversarial neural network model includes a fused-image generator G and a fused-image discriminator D. The generator G and the discriminator D are trained alternately: each training sample of the sample set is input into the fused-image generator G to generate a fused image, and the fused-image discriminator D is used to discriminate the difference between the generated fused image and the real image.
In a further embodiment, each training sample Sj in the sample set includes a user image Uj, user physiological data Pj, a background image Bj to be fused, a fusion coordinate Cj and a real image RMj, wherein Pj = (Tj, NEj, BPj, HRj, HEj, MEj), wherein m ≧ j ≧ 0, m is the number of training samples, and Tj, NEj, BPj, HRj, HEj, MEj respectively represent corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data in the user physiological data Pj in the training sample Sj.
In a further embodiment, the optimized loss function comprises a loss function LossG of the fused image generator G and a loss function LossD of the fused image discriminator D, wherein:
[The expressions for LossG and LossD are given as equation images in the original publication and are not reproduced here.]
In these expressions, one symbol denotes the fused image generated by the fused-image generator G for a given input training sample; another denotes the recognition result of the fused-image discriminator D on that generated fused image; a third denotes the difference between the generated fused image and the real image RMj; and a fourth denotes the recognition result of the fused-image discriminator D on RMj. The trained adversarial network is obtained after several rounds of iterative training.
The generative adversarial neural network model of the invention generates images from multiple parameters, part of which are the patient's physiological data, so that the user's image in the fused video better reflects the patient's condition. The invention improves the loss function of the generative adversarial neural network model: the traditional generator loss is modified based on a comparison between the mean and the maximum of the physiological parameters, and the traditional discriminator loss is modified based on a comparison between the minimum and the mean of the physiological parameters. Through these improvements the loss function reflects the multi-parameter character of the model, so the neural network trains faster and the generated fused image comes closer to the patient's real condition.
Fig. 3 shows an electronic device of the invention comprising a processor and a memory coupled to the processor, the memory storing program code; when the processor executes the program code in the memory, any of the above-mentioned methods is performed. The electronic device may be any of a variety of computers, handheld devices, distributed computers, and the like.
For convenience of description, the above device is described as being divided into various units. Of course, when implementing the present application, the functionality of the units may be realized in one or more pieces of software and/or hardware.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present application, or the portions thereof contributing to the prior art, may be embodied in the form of a software product stored in a storage medium such as a ROM/RAM, a magnetic disk or an optical disk, which includes a number of instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in parts of the embodiments of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalents may be made without departing from the spirit and scope of the invention, and the claims are intended to cover any such modifications and equivalents.

Claims (8)

1. An image fusion device for a virtual immersive depression treatment system, the virtual immersive depression treatment system including a camera, a VR device, a physiological data sensor, and a server, the device comprising:
an acquisition unit, configured to have the camera capture video images of the user and send them to the server, wherein after receiving the video images the server extracts a key image sequence from them and stores it in a first cache queue;
a selection unit, configured to send the user's physiological data collected by the physiological data sensor to the server, wherein the server selects a scene video from a scene database based on the physiological data;
a fusion unit, configured to have the server fuse each frame image of the key frame image sequence in the first cache queue into the scene video based on the physiological data, using a multi-parameter generative adversarial neural network model, to generate a fused scene video image, wherein the multi-parameter generative adversarial neural network model is obtained by training with an optimized loss function;
and a playing unit, configured to send the fused scene video image to the VR device and play it to the user on the display of the VR device.
2. The apparatus of claim 1, wherein the physiological data includes at least body temperature, brain waves, blood pressure, heart rate, electrocardiogram and electromyogram data.
3. The device according to claim 2, characterized in that the key image sequence is extracted from said video images as follows: each frame of the video images is input into a first convolutional neural network for processing to obtain the key frame image sequence.
4. The apparatus of claim 3, wherein the fusion unit operates as follows: the physiological data Pi corresponding to a key frame image Mi in the key frame image sequence is obtained from the server, where Pi = (Ti, NEi, BPi, HRi, HEi, MEi); the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi; the fusion coordinates of the key frame image Mi within the image frame Ni to be fused are determined; the key frame image Mi, the image frame Ni to be fused, the physiological data Pi and the fusion coordinates are input into the multi-parameter generative adversarial neural network model to generate a fused image frame; and after all key frame images in the key frame image sequence have been processed, the generated fused image frames are combined into the fused scene video image, wherein n ≥ i ≥ 0, n is the total number of frames in the key frame image sequence, and Ti, NEi, BPi, HRi, HEi and MEi respectively denote the corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data.
5. The device according to claim 4, wherein the corresponding image frame Ni to be fused in the scene video is determined based on the key frame image Mi as follows: the similarity between the posture of the virtual character in each frame of the scene video and the posture of the user in the key frame image Mi is evaluated, and the frame with the greatest similarity is taken as the image frame Ni to be fused.
6. The apparatus of claim 5, wherein the multi-parameter generative adversarial neural network model comprises a fused image generator G and a fused image discriminator D, wherein the fused image generator G and the fused image discriminator D are trained alternately, wherein each training sample in the sample set is input to the fused image generator G to generate a fused image, and wherein the fused image discriminator D is configured to discriminate a difference between the generated fused image and a real image.
7. The apparatus according to claim 6, wherein each training sample Sj in the sample set comprises a user image Uj, user physiological data Pj, a background image Bj to be fused, a fusion coordinate Cj and a real image RMj, wherein Pj = (Tj, NEj, BPj, HRj, HEj, MEj), wherein m ≧ j ≧ 0, m is the number of training samples, and Tj, NEj, BPj, HRj, HEj, MEj respectively represent corresponding body temperature, brain wave, blood pressure, heart rate, electrocardiogram and electromyogram data in the user physiological data Pj in the training sample Sj.
8. The apparatus of claim 7, wherein the optimized loss function comprises a loss function LossG of the fused image generator G and a loss function LossD of the fused image discriminator D, wherein:
[The expressions for LossG and LossD are given as equation images in the original publication and are not reproduced here.]
In these expressions, one symbol denotes the fused image generated by the fused image generator G for a given input training sample; another denotes the recognition result of the fused image discriminator D on that generated fused image; a third denotes the difference between the generated fused image and the real image RMj; and a fourth denotes the recognition result of the fused image discriminator D on RMj, the trained adversarial network being obtained after a plurality of rounds of iterative training.
CN202310054924.9A 2023-02-03 2023-02-03 Image fusion device for virtual immersion type depression treatment system Active CN115810099B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310054924.9A CN115810099B (en) 2023-02-03 2023-02-03 Image fusion device for virtual immersion type depression treatment system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310054924.9A CN115810099B (en) 2023-02-03 2023-02-03 Image fusion device for virtual immersion type depression treatment system

Publications (2)

Publication Number Publication Date
CN115810099A true CN115810099A (en) 2023-03-17
CN115810099B CN115810099B (en) 2023-05-16

Family

ID=85487809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310054924.9A Active CN115810099B (en) 2023-02-03 2023-02-03 Image fusion device for virtual immersion type depression treatment system

Country Status (1)

Country Link
CN (1) CN115810099B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106303289A (en) * 2015-06-05 2017-01-04 福建凯米网络科技有限公司 Method, apparatus and system for fused display of a real object and a virtual scene
CN107463780A (en) * 2017-08-04 2017-12-12 南京乐朋电子科技有限公司 3D virtual autism treatment system and treatment method
CN111027425A (en) * 2019-11-28 2020-04-17 深圳市木愚科技有限公司 Intelligent expression synthesis feedback interaction system and method
US20210243383A1 (en) * 2019-03-06 2021-08-05 Tencent Technology (Shenzhen) Company Limited Video synthesis method, model training method, device, and storage medium


Also Published As

Publication number Publication date
CN115810099B (en) 2023-05-16

Similar Documents

Publication Publication Date Title
Altaheri et al. Deep learning techniques for classification of electroencephalogram (EEG) motor imagery (MI) signals: A review
CN108446020B (en) Motor imagery idea control method fusing visual effect and deep learning and application
WO2021043118A1 (en) Motor imagery electroencephalogram signal processing method, device, and storage medium
CN107492099B (en) Medical image analysis method, medical image analysis system, and storage medium
Piana et al. Adaptive body gesture representation for automatic emotion recognition
Zhang Automated biometrics: Technologies and systems
CN107485844A (en) A kind of limb rehabilitation training method, system and embedded device
CN110298286B (en) Virtual reality rehabilitation training method and system based on surface myoelectricity and depth image
CN111785366B (en) Patient treatment scheme determination method and device and computer equipment
CN111881838A (en) Dyskinesia assessment video analysis method and equipment with privacy protection function
CN114998983A (en) Limb rehabilitation method based on augmented reality technology and posture recognition technology
CN117438048B (en) Method and system for assessing psychological disorder of psychiatric patient
CN111389008A (en) Face generation method of virtual character, automatic face pinching method and device
CN112101424A (en) Generation method, identification device and equipment of retinopathy identification model
CN113703574A (en) VR medical learning method and system based on 5G
CN115227234A (en) Cardiopulmonary resuscitation pressing action evaluation method and system based on camera
CN115101191A (en) Parkinson disease diagnosis system
CN113593671B (en) Automatic adjustment method and device of virtual rehabilitation game based on Leap Motion gesture recognition
CN113749656B (en) Emotion recognition method and device based on multidimensional physiological signals
CN115311737A (en) Method for recognizing hand motion of non-aware stroke patient based on deep learning
CN111312363B (en) Double-hand coordination enhancement system based on virtual reality
CN116525061B (en) Training monitoring method and system based on remote human body posture assessment
CN112215962A (en) Virtual reality emotional stimulation system and creating method thereof
CN115810099B (en) Image fusion device for virtual immersion type depression treatment system
CN111991808A (en) Face model generation method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant