CN102375541B - Translating user motion into multiple object responses - Google Patents
- Publication number: CN102375541B
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Abstract
The present invention relates to methods and systems for translating user motion into multiple object responses. A system is provided that, based on a user's interaction with an application executing on a computing device, translates user movement into multiple object responses of an on-screen object. User movement data is received from one or more users at a capture device. The user movement data corresponds to the user's interaction with an on-screen object presented by the application. The on-screen object corresponds to an object other than the on-screen representation of the user displayed by the computing device. The user movement data is automatically translated into multiple object responses of the on-screen object, and the multiple object responses are displayed to the user simultaneously.
Description
Technical field
The present invention relates to methods and systems for translating user motion into multiple object responses.
Background
In the past, computing applications such as computer games and multimedia applications used controllers, remote controls, keyboards, mice, and the like to allow users to control game characters. More recently, computer games and multimedia applications have begun to employ cameras and software gesture recognition to provide a human-computer interface ("HCI"). With HCI, user interactions in the form of user gestures are detected, interpreted, and used to control game characters.
Summary of the invention
A technology is disclosed that enhances a user's interaction with an application by allowing the user to interact with various on-screen objects that are traditionally non-interactive and static. For example, user movement may be used to alter, control, or move objects other than the on-screen representation of the user. User movement relating to the user's intent to interact with the application is captured, and the user movement is translated into multiple object responses of an on-screen object. The multiple object responses of the on-screen object provide an enhanced and improved user experience for the user interacting with the application, without changing the typical or otherwise expected result of the user's interaction with the application.
In one embodiment, a method is provided for translating user movement into multiple object responses of an on-screen object based on a user's interaction with an application executing on a computing device. User movement data is received from one or more users via a capture device. The user movement data corresponds to the user's interaction with an on-screen object presented by the application. The on-screen object corresponds to an object other than the on-screen representation of the user displayed by the computing device. The user movement data is automatically translated into multiple object responses of the on-screen object, and the multiple object responses are displayed to the user simultaneously.
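The summarized method can be pictured as a minimal pipeline: receive movement data, translate each movement into several object responses, and emit them together. The sketch below is illustrative only; all names and the response table are assumptions, not taken from the patent.

```python
# Hypothetical sketch of the claimed flow: user movement data in,
# multiple simultaneous object responses out.

def translate_movement(movement, response_table):
    """Map one unit of user movement data to the on-screen object's responses."""
    # The patent describes motion, audio, and visual responses being
    # produced together for a single recognized movement.
    return response_table.get(movement, [])

def process_frame(movements, response_table):
    """Translate every captured movement and collect responses for display."""
    responses = []
    for movement in movements:
        responses.extend(translate_movement(movement, response_table))
    return responses  # displayed to the user simultaneously

# Illustrative table: a hip swing drives the object's motion, sound, and visuals.
table = {"hip_swing": ["fin_motion", "splash_audio", "ripple_visual"]}
print(process_frame(["hip_swing"], table))
# → ['fin_motion', 'splash_audio', 'ripple_visual']
```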
This Summary is provided to introduce, in simplified form, a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to help determine the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
Brief Description of the Drawings
FIG. 1 illustrates one embodiment of a tracking system with a user playing a game.
FIG. 2 illustrates one embodiment of a capture device that may be used as part of the tracking system.
FIG. 3 illustrates an embodiment of a computing device that may be used to track motion and update an application based on the tracked motion.
FIG. 4 illustrates another embodiment of a computing device that may be used to track motion and update an application based on the tracked motion.
FIG. 5 depicts a flowchart of one embodiment of a process for performing the operations of the disclosed technology.
FIG. 6 depicts a flowchart of one embodiment of a process for capturing user movement data in accordance with the disclosed technology.
FIG. 7 illustrates an example of a skeletal model or mapping representing a scanned human target.
FIG. 8 provides further details of an exemplary embodiment of the gesture recognition engine shown in FIG. 2.
FIG. 9 depicts a flowchart of one embodiment of a process for determining whether user movement data matches a gesture, in accordance with the disclosed technology.
FIGS. 10-14 illustrate various user interface screens depicting a user's interaction with an application executing on a computing device, in accordance with the disclosed technology.
Detailed Description
A technology is disclosed by which a user's interaction with an application executing on a computing device is enhanced, enabling the user to interact with on-screen objects that are not the on-screen representation of the user. A capture device captures user movement data relating to the user's intent to interact with the application. A computing device is connected to the capture device and receives information relating to the user movement data from the capture device in order to control various aspects of the application. The application may include, for example, a video game application or a movie application executing on the computing device. One or more users interact, via a user interface in the computing device, with the on-screen objects depicted by the application. In one embodiment, an on-screen object corresponds to an element that is not conventionally playable, such as an animate or inanimate object displayed during a non-interactive or static experience presented by the application (for example, a static movie sequence, a story sequence, a film sequence, or an animation).
In one embodiment, the computing device translates the user movement data into multiple object responses of an on-screen object. The user movement data may be received while one or more users interact with the application. The multiple object responses of the on-screen object may include a motion response, an audio response, or a visual response of the on-screen object. The object responses may be displayed to the user simultaneously via the user interface in the computing device.
FIG. 1 illustrates one embodiment of a target recognition, analysis, and tracking system 10 (hereinafter generally referred to as a tracking system) for performing the operations of the disclosed technology. The target recognition, analysis, and tracking system 10 may be used to recognize, analyze, and/or track a human target such as user 18. As shown in FIG. 1, the tracking system 10 may include a computing device 12. The computing device 12 may be a computer, a gaming system, a console, or the like. According to one embodiment, the computing device 12 may include hardware components and/or software components such that it can be used to execute applications such as gaming applications and non-gaming applications. In one embodiment, the computing device 12 may include a processor, such as a standardized processor, a specialized processor, or a microprocessor, that executes instructions stored in a processor-readable storage device for performing the processes described herein.
As shown in FIG. 1, the tracking system 10 may also include a capture device 20. The capture device 20 may be, for example, a camera that can be used to visually monitor one or more users, such as user 18, so that gestures performed by the one or more users may be captured, analyzed, and tracked in order to control various aspects of the application executing on the computing device 12.
According to one embodiment, the tracking system 10 may be connected to an audiovisual device 16, such as a television, a monitor, or a high-definition television (HDTV), that provides game or application visuals and/or audio to a user such as user 18. For example, the computing device 12 may include a video adapter such as a graphics card and/or an audio adapter such as a sound card that can provide audiovisual signals associated with a gaming application, a non-gaming application, or the like. The audiovisual device 16 may receive the audiovisual signals from the computing device 12 and may then output the game or application visuals and/or audio associated with those signals to user 18. According to one embodiment, the audiovisual device 16 may be connected to the computing device 12 via, for example, an S-Video cable, a coaxial cable, an HDMI cable, a DVI cable, or a VGA cable.
The target recognition, analysis, and tracking system 10 may be used to recognize, analyze, and/or track one or more human targets such as user 18. For example, the capture device 20 may be used to track user 18 such that the movements of user 18 may be interpreted as controls that can be used to affect an application or operating system executed by the computing device 12.
FIG. 2 illustrates one embodiment of the capture device 20 and the computing device 12, which may be used in the target recognition, analysis, and tracking system 10 to recognize human or non-human targets in a capture area and to uniquely identify and track them in three-dimensional space. According to one embodiment, the capture device 20 may be configured to capture video with depth information, including a depth image with depth values, via any suitable technique including, for example, time-of-flight, structured light, or stereo imaging. According to one embodiment, the capture device 20 may organize the computed depth information into "Z layers," or layers perpendicular to the Z axis extending from the depth camera along its line of sight.
As shown in FIG. 2, the capture device 20 may include an image camera component 32. According to one embodiment, the image camera component 32 may be a depth camera that captures a depth image of a scene. The depth image may include a two-dimensional (2-D) pixel area of the captured scene, where each pixel in the 2-D pixel area may represent a depth value, such as the distance, in centimeters, millimeters, or the like, of an object in the captured scene from the camera.
As shown in FIG. 2, the image camera component 32 may include an IR light component 34, a three-dimensional (3-D) camera 36, and an RGB camera 38 that may be used to capture a depth image of the capture area. For example, in a time-of-flight analysis, the IR light component 34 of the capture device 20 may emit infrared light onto the capture area and may then use sensors to detect the backscattered light from the surface of one or more targets and objects in the capture area with, for example, the 3-D camera 36 and/or the RGB camera 38. In some embodiments, pulsed infrared light may be used such that the time between an outgoing light pulse and a corresponding incoming light pulse may be measured and used to determine the physical distance from the capture device 20 to a particular location on a target or object in the capture area. Additionally, the phase of the outgoing light wave may be compared to the phase of the incoming light wave to determine a phase shift. The phase shift may then be used to determine the physical distance from the capture device to a particular location on a target or object.
According to another embodiment, time-of-flight analysis may be used to indirectly determine the physical distance from the capture device 20 to a particular location on a target or object by analyzing the intensity of a reflected beam of light over time via various techniques including, for example, shuttered light pulse imaging.
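The phase-shift variant of time-of-flight reduces to a short calculation: for a modulation frequency f, a measured phase shift Δφ corresponds to a round trip of Δφ/(2π) modulation periods, so the one-way distance is c·Δφ/(4π·f). A sketch (the 10 MHz modulation frequency is an illustrative assumption, and phase-wrap ambiguity is ignored):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def distance_from_phase_shift(phase_shift_rad, modulation_freq_hz):
    """One-way distance implied by the phase shift between the outgoing
    and incoming light waves (ignores phase-wrap ambiguity)."""
    # Fraction of a modulation period traveled, times the modulation wavelength.
    round_trip = (phase_shift_rad / (2 * math.pi)) * (C / modulation_freq_hz)
    return round_trip / 2

# A pi/2 shift at 10 MHz: a quarter of a ~30 m wavelength, halved for one way.
d = distance_from_phase_shift(math.pi / 2, 10e6)
print(round(d, 3))  # → 3.747 (meters)
```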
In another example, the capture device 20 may use structured light to capture depth information. In such an analysis, patterned light (that is, light displayed as a known pattern such as a grid pattern or a stripe pattern) may be projected onto the capture area via, for example, the IR light component 34. Upon striking the surface of one or more targets or objects in the capture area, the pattern may deform in response. Such a deformation of the pattern may be captured by, for example, the 3-D camera 36 and/or the RGB camera 38 and may then be analyzed to determine the physical distance from the capture device to a particular location on a target or object.
According to one embodiment, the capture device 20 may include two or more physically separated cameras that may view the capture area from different angles to obtain visual stereo data that may be resolved to generate depth information. Other types of depth image sensors may also be used to create a depth image.
The capture device 20 may also include a microphone 40. The microphone 40 may include a transducer or sensor that receives sound and converts it into an electrical signal. According to one embodiment, the microphone 40 may be used to reduce feedback between the capture device 20 and the computing device 12 in the target recognition, analysis, and tracking system 10. Additionally, the microphone 40 may be used to receive audio signals that may also be provided by the user to control applications, such as gaming applications and non-gaming applications, executed by the computing device 12.
In one embodiment, the capture device 20 may also include a processor 42 in operative communication with the image camera component 32. The processor 42 may include a standard processor, a specialized processor, a microprocessor, or the like that executes instructions, which may include instructions for storing profiles, receiving a depth image, determining whether a suitable target is included in the depth image, converting a suitable target into a skeletal representation or model of the target, or any other suitable instructions.
The capture device 20 may also include a memory component 44 that stores instructions executable by the processor 42, images or frames of images captured by the 3-D camera or RGB camera, user profiles, or any other suitable information, images, or the like. According to one example, the memory component 44 may include random access memory (RAM), read-only memory (ROM), cache, flash memory, a hard disk, or any other suitable storage component. As shown in FIG. 2, the memory component 44 may be a separate component in communication with the image capture component 32 and the processor 42. In another embodiment, the memory component 44 may be integrated into the processor 42 and/or the image capture component 32. In one embodiment, some or all of the components 32, 34, 36, 38, 40, 42, and 44 of the capture device 20 shown in FIG. 2 are housed in a single housing.
The capture device 20 may communicate with the computing device 12 via a communication link 46. The communication link 46 may be a wired connection, such as a USB connection, a FireWire connection, or an Ethernet cable connection, and/or a wireless connection such as a wireless 802.11b, 802.11g, 802.11a, or 802.11n connection. The computing device 12 may provide a clock to the capture device 20 that may be used, via the communication link 46, to determine when to capture, for example, a scene.
The capture device 20 may provide the depth information and images captured by, for example, the 3-D camera 36 and/or the RGB camera 38, including a skeletal model that may be generated by the capture device 20, to the computing device 12 via the communication link 46. The computing device 12 may then use the skeletal model, the depth information, and the captured images to, for example, create a virtual screen and to control an application such as a game or word processor.
The computing device 12 includes a gestures library 192, structure data 194, and a gesture recognition engine 190. The gestures library 192 may include a collection of gesture filters, each comprising information describing or defining a movement or gesture that may be performed by the skeletal model (as the user moves). The structure data 194 includes structural information about objects that may be tracked. For example, a skeletal model of a human may be stored to help understand the movements of the user and to recognize body parts. Structural information about inanimate objects may also be stored to help recognize those objects and understand their movement.
The gesture recognition engine 190 may compare the data captured by the cameras 36, 38 and the capture device 20, in the form of the skeletal model and the movements associated with it, against the gesture filters in the gestures library 192 to identify when the user (as represented by the skeletal model) has performed one or more gestures. The computing device 12 may use the gestures library 192 to interpret the movements of the skeletal model and to control an application based on those movements. More information about the gesture recognition engine 190 can be found in U.S. Patent Application 12/422,661, "Gesture Recognition System Architecture," filed April 13, 2009, incorporated herein by reference in its entirety. More information about recognizing gestures can be found in U.S. Patent Application 12/391,150, "Standard Gestures," filed February 23, 2009, and U.S. Patent Application 12/474,655, "Gesture Tool," filed May 29, 2009, both of which are incorporated herein by reference in their entirety. More information about motion detection and tracking can be found in U.S. Patent Application 12/641,788, "Motion Detection Using Depth Images," filed December 18, 2009, and U.S. Patent Application 12/475,308, "Device for Identifying and Tracking Multiple Humans over Time," both of which are incorporated herein by reference in their entirety.
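As a rough illustration of what such a comparison might look like, the engine can be imagined checking tracked joint movement against per-gesture filters. This is a minimal sketch under assumed names; the actual filters described in the cited applications carry far more information (timing, direction, confidence, and so on).

```python
# Hypothetical gesture filter: a gesture matches when a named joint moves
# at least a minimum distance along a given axis over the observed frames.

def matches(filter_spec, joint_positions):
    """joint_positions: list of (x, y) samples for the filter's joint."""
    axis = filter_spec["axis"]  # 0 = x, 1 = y
    coords = [p[axis] for p in joint_positions]
    return (max(coords) - min(coords)) >= filter_spec["min_travel"]

def recognize(filters, tracked):
    """Return the names of all gesture filters matched by the tracked data."""
    return [name for name, spec in filters.items()
            if matches(spec, tracked[spec["joint"]])]

filters = {"hip_swing": {"joint": "hip", "axis": 0, "min_travel": 0.3}}
tracked = {"hip": [(0.0, 1.0), (0.2, 1.0), (0.4, 1.0)]}  # hip moved 0.4 in x
print(recognize(filters, tracked))  # → ['hip_swing']
```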
In one embodiment, the computing device 12 may include an application 202. The application 202 may include a video game application, a movie application, a shopping application, a browsing application, or another application executing on the computing device 12. The application 202 allows user interaction with one or more on-screen objects presented by the application 202. The on-screen objects may correspond to elements presented by the application 202 that are not conventionally playable, such as animate or inanimate objects displayed during a non-interactive or static experience (for example, a static movie sequence, a story sequence, a film sequence, or an animation). The application 202 may include one or more scenes. A scene in the application 202 may include, for example, background information that develops the storyline between periods of game play. For example, a scene in the application 202 may include a static movie sequence, a story sequence, a film sequence, or an animation presented by the application 202, depicting one or more animate or inanimate on-screen objects. In one example, an on-screen object may correspond to an object other than the on-screen character representation of the user. In one embodiment, the user may interact with the application 202 via a user interface in the user's computing device 12.
The application 202 may include a translation module 196, a display module 198, and application control logic 200. The translation module 196, the display module 198, and the application control logic 200 may be implemented as software modules to perform one or more operations of the disclosed technology. The application control logic 200 may include a set of preprogrammed logic and/or rules relating to the execution of the application 202. In one embodiment, the rules specified by the application control logic 200 may define, based on the user's interaction with the application 202, the manner in which the on-screen objects presented by the application 202 may be controlled. In one example, the rules specified by the application control logic 200 may define the type of action to be performed on an on-screen object based on the user's interaction with the application 202. As will be discussed in detail below, a user may perform a range of motions to interact with one or more of the on-screen objects depicted by the application, and different user movements may result in similar or different object actions being performed on the on-screen objects.
In one embodiment, the gesture recognition engine 190 may provide the skeletal model of the human target and information about the movements of the human target to the application control logic 200. In one embodiment, the information about the movements of the human target may include, for example, the position, direction, acceleration, and curvature of each body part associated with the human target. The application control logic 200 may utilize this information to define the type of action to be performed on an on-screen object displayed in the application 202. In one embodiment, the rules specified by the application control logic 200 may be implemented as a data structure that correlates a set of gestures recognized by the gesture recognition engine 190 with a set of object responses of an on-screen object.
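Such a correlating data structure could be as simple as a table keyed by recognized gesture; the gesture and response names below are assumptions for illustration:

```python
# Hypothetical rule table: each recognized gesture maps to the set of
# object responses (motion, audio, visual) the control logic authorizes.
GESTURE_RESPONSES = {
    "hip_swing": {"motion": "swing_fin", "audio": "splash", "visual": "ripple"},
    "arm_raise": {"motion": "rear_up",   "audio": "roar",   "visual": "flash"},
}

def responses_for(gesture):
    """Look up the object responses correlated with a recognized gesture."""
    return GESTURE_RESPONSES.get(gesture, {})

print(responses_for("hip_swing")["motion"])  # → swing_fin
```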
In another embodiment, the application control logic 200 may also define one or more environmental contexts in which the on-screen objects of the application 202 may be depicted. Accordingly, the type of action to be performed on an on-screen object may be based both on the information relating to the user movement data (the movements of the human target) and on the environmental context in which the on-screen object is depicted. For example, if an animate on-screen object such as a shark is depicted in the application, the environmental context may correspond to a first context. In this context, a left-to-right movement of the user's hips may, for example, result in an action in which the shark's fin swings back and forth, whereas in a second context depicting an inanimate object such as a truck, the same movement may result in a turning action being performed on the truck.
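The shark/truck example suggests a lookup keyed by both environmental context and gesture, so the same movement yields different actions in different scenes. A sketch with assumed names:

```python
# Hypothetical (context, gesture) -> action table: the same hip movement
# swings a shark's fin in one scene and turns a truck in another.
CONTEXT_ACTIONS = {
    ("shark_scene", "hip_left_right"): "swing_fin",
    ("truck_scene", "hip_left_right"): "turn",
}

def action_for(context, gesture):
    """Pick the on-screen object action for a gesture in a given context."""
    return CONTEXT_ACTIONS.get((context, gesture))

print(action_for("shark_scene", "hip_left_right"))  # → swing_fin
print(action_for("truck_scene", "hip_left_right"))  # → turn
```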
The translation module 196 may include executable instructions for performing a specific action on a particular on-screen object based on the rules defined in the application control logic 200. In one embodiment, the translation module 196 may perform a specific action on an on-screen object by automatically translating the user movement data into multiple object responses of the on-screen object. In one embodiment, the translation module 196 may perform this translation by accessing the corresponding responses of the on-screen object from the data structure defined by the program logic 200. In one embodiment, a corresponding response of an on-screen object may include a motion response, a visual response, or an audio response of the on-screen object. The translation module 196 may then access the code that implements the object response. In one embodiment, an object response may be implemented by mapping the skeletal model representation of the user to an object model representation of the on-screen object. For example, the translation module 196 may translate the user movement data into a corresponding motion response of the on-screen object by mapping the movement of a body part obtained from the skeletal model representation of the human target to a corresponding movement of a portion of the on-screen object in the object model representation of the on-screen object. Similarly, the translation module 196 may translate the user movement data into a corresponding audio response or visual response of the on-screen object by utilizing the object model representation of the on-screen object.
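One way to picture the skeleton-to-object-model mapping is a table from body parts to object parts, with the body part's displacement applied to the mapped part. The part names and the direct displacement pass-through below are illustrative assumptions, not the patent's implementation:

```python
# Hypothetical mapping from user body parts to parts of the on-screen
# object's model; a tracked body part's displacement drives the mapped part.
PART_MAP = {"hip": "fin", "forearm": "tail"}

def motion_response(body_part, displacement):
    """Translate a body-part displacement into an object-part movement,
    or None if the body part drives nothing in this object model."""
    object_part = PART_MAP.get(body_part)
    if object_part is None:
        return None
    return {"part": object_part, "displacement": displacement}

print(motion_response("hip", (0.4, 0.0)))
# → {'part': 'fin', 'displacement': (0.4, 0.0)}
```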
In one embodiment, the translation module 196 may utilize metadata relating to the movements of the human target, such as the velocity of a body part, the acceleration of a body part, or the distance traveled by a body part, when performing the corresponding translated motion of the on-screen object for a specific movement. For example, a higher-velocity movement of a specific body part of the human target may result in the user movement data being translated into a motion response of the on-screen object in which, in response to the higher-velocity movement, the on-screen object moves faster. Similarly, the translation module 196 may utilize the metadata relating to the movements of the human target (such as the velocity of a body part, the acceleration of a body part, or the distance traveled by a body part) to trigger an audio response or visual response of the on-screen object.
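Scaling the object's response by movement metadata might look like the sketch below; the linear scaling and the threshold that triggers an extra audio response are assumptions for illustration:

```python
# Hypothetical use of movement metadata: the object's motion speed tracks
# the body part's speed, and a fast movement also triggers an audio response.
def scaled_response(body_part_speed, base_object_speed=1.0, audio_threshold=2.0):
    response = {"motion_speed": base_object_speed * body_part_speed}
    if body_part_speed >= audio_threshold:
        response["audio"] = "whoosh"  # extra response for fast movement
    return response

print(scaled_response(0.5))  # → {'motion_speed': 0.5}
print(scaled_response(3.0))  # → {'motion_speed': 3.0, 'audio': 'whoosh'}
```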
The display module 198 controls what is displayed to the user, and may include executable instructions for receiving information from the translation module 196 and, based on that information, displaying the multiple object responses of an on-screen object in the user interface of the computing device 12.
FIG. 3 illustrates an example of a computing device 100 that may be used to implement the computing device 12 of FIGS. 1-2. The computing device 100 of FIG. 3 may be a multimedia console 100, such as a gaming console. As shown in FIG. 3, the multimedia console 100 has a central processing unit (CPU) 200 and a memory controller 202 that facilitates processor access to various types of memory, including a flash read-only memory (ROM) 204, a random access memory (RAM) 206, a hard disk drive 208, and a portable media drive 106. In one implementation, the CPU 200 includes a level 1 cache 210 and a level 2 cache 212 for temporarily storing data and thereby reducing the number of memory access cycles made to the hard disk drive 208, thus improving processing speed and throughput.
The CPU 200, the memory controller 202, and the various memory devices are interconnected via one or more buses (not shown). The details of the bus used in this implementation are not particularly relevant to understanding the subject matter described herein. However, it should be understood that such a bus may include one or more of a serial and parallel bus, a memory bus, a peripheral bus, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures may include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus, also known as a mezzanine bus.
In one implementation, the CPU 200, the memory controller 202, the ROM 204, and the RAM 206 are integrated into a common module 214. In this implementation, the ROM 204 is configured as a flash ROM connected to the memory controller 202 via a PCI bus and a ROM bus (neither of which is shown). The RAM 206 is configured as multiple Double Data Rate Synchronous Dynamic RAM (DDR SDRAM) modules that are independently controlled by the memory controller 202 via separate buses (not shown). The hard disk drive 208 and the portable media drive 106 are shown connected to the memory controller 202 via the PCI bus and an AT Attachment (ATA) bus 216. However, in other implementations, different types of dedicated data bus structures may alternatively be applied.
A graphics processing unit 220 and a video encoder 222 form a video processing pipeline for high-speed and high-resolution (e.g., high-definition) graphics processing. Data are carried from the graphics processing unit 220 to the video encoder 222 via a digital video bus (not shown). An audio processing unit 224 and an audio codec (coder/decoder) 226 form a corresponding audio processing pipeline for multi-channel audio processing of various digital audio formats. Audio data are carried between the audio processing unit 224 and the audio codec 226 via a communication link (not shown). The video and audio processing pipelines output data to an A/V (audio/video) port 228 for transmission to a television or other display. In the illustrated implementation, the video and audio processing components 220-228 are mounted on the module 214.
Fig. 3 shows the module 214 including a USB host controller 230 and a network interface 232. The USB host controller 230 is shown in communication with the CPU 200 and the memory controller 202 via a bus (e.g., the PCI bus) and serves as a host for peripheral controllers 104(1)-104(4). The network interface 232 provides access to a network (e.g., the Internet, a home network, etc.) and may be any of a wide variety of wired or wireless interface components including an Ethernet card, a modem, a wireless access card, a Bluetooth module, a cable modem, and the like.
In the implementation depicted in Fig. 3, the console 102 includes a controller support subassembly 240 for supporting four controllers 104(1)-104(4). The controller support subassembly 240 includes any hardware and software components needed to support wired and wireless operation with an external control device, such as, for example, a media and game controller. A front panel I/O subassembly 242 supports the multiple functionalities of a power button 112, an eject button 114, and any LEDs (light emitting diodes) or other indicators exposed on the outer surface of the console 102. The subassemblies 240 and 242 are in communication with the module 214 via one or more cable assemblies 244. In other implementations, the console 102 can include additional controller subassemblies. The illustrated implementation also shows an optical I/O interface 235 that is configured to send and receive signals that can be communicated to the module 214.
MUs 140(1) and 140(2) are illustrated as being connectable to MU ports "A" 130(1) and "B" 130(2), respectively. Additional MUs (e.g., MUs 140(3)-140(6)) are illustrated as being connectable to the controllers 104(1) and 104(3), i.e., two MUs for each controller. The controllers 104(2) and 104(4) can also be configured to receive MUs (not shown). Each MU 140 offers additional storage on which games, game parameters, and other data may be stored. In some implementations, the other data can include any of a digital game component, an executable gaming application, an instruction set for expanding a gaming application, and a media file. When inserted into the console 102 or a controller, the MU 140 can be accessed by the memory controller 202. A system power supply module 250 provides power to the components of the gaming system 100. A fan 252 cools the circuitry within the console 102.
An application 260 comprising machine instructions is stored on the hard disk drive 208. When the console 102 is powered on, various portions of the application 260 are loaded into the RAM 206 and/or the caches 210 and 212 for execution on the CPU 200, the application 260 being one such example. Various applications can be stored on the hard disk drive 208 for execution on the CPU 200.
The gaming and media system 100 may be operated as a standalone system by simply connecting the system to a monitor 150 (Fig. 1), a television, a video projector, or other display device. In this standalone mode, the gaming and media system 100 enables one or more players to play games or enjoy digital media, e.g., by watching movies or listening to music. However, with the integration of broadband connectivity made available through the network interface 232, the gaming and media system 100 may further be operated as a participant in a larger network gaming community.
Fig. 4 illustrates a general purpose computing device for implementing the operations of the disclosed technology. With reference to Fig. 4, an exemplary system for implementing the disclosed technology includes a general purpose computing device in the form of a computer 310. Components of the computer 310 may include, but are not limited to, a processing unit 320, a system memory 330, and a system bus 321 that couples various system components including the system memory to the processing unit 320. The system bus 321 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus also known as a Mezzanine bus.
The computer 310 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by the computer 310 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 310. Communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 330 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 331 and random access memory (RAM) 332. A basic input/output system 333 (BIOS), containing the basic routines that help to transfer information between elements within the computer 310, such as during start-up, is typically stored in ROM 331. RAM 332 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by the processing unit 320. By way of example, and not limitation, Fig. 4 illustrates an operating system 334, application programs 335, other program modules 336, and program data 337.
The computer 310 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, Fig. 4 illustrates a hard disk drive 341 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 351 that reads from or writes to a removable, nonvolatile magnetic disk 352, and an optical disk drive 355 that reads from or writes to a removable, nonvolatile optical disk 356 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 341 is typically connected to the system bus 321 through a non-removable memory interface such as interface 340, and the magnetic disk drive 351 and optical disk drive 355 are typically connected to the system bus 321 by a removable memory interface, such as interface 350.
The drives and their associated computer storage media discussed above and illustrated in Fig. 4 provide storage of computer readable instructions, data structures, program modules, and other data for the computer 310. In Fig. 4, for example, the hard disk drive 341 is illustrated as storing an operating system 344, application programs 345, other program modules 346, and program data 347. Note that these components can either be the same as or different from the operating system 334, application programs 335, other program modules 336, and program data 337. The operating system 344, application programs 345, other program modules 346, and program data 347 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 310 through input devices such as a keyboard 362 and a pointing device 361, commonly referred to as a mouse, trackball, or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 320 through a user input interface 360 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port, or universal serial bus (USB). A monitor 391 or other type of display device is also connected to the system bus 321 via an interface, such as a video interface 390. In addition to the monitor 391, the computer may also include other peripheral output devices such as speakers 397 and a printer 396, which may be connected through an output peripheral interface 390.
The computer 310 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 380. The remote computer 380 may be a personal computer, a server, a router, a network PC, a peer device, or other common network node, and typically includes many or all of the elements described above relative to the computer 310, although only a memory storage device 381 has been illustrated in Fig. 4. The logical connections depicted in Fig. 4 include a local area network (LAN) 371 and a wide area network (WAN) 373, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
When used in a LAN networking environment, the computer 310 is connected to the LAN 371 through a network interface or adapter 370. When used in a WAN networking environment, the computer 310 typically includes a modem 372 or other means for establishing communications over the WAN 373, such as the Internet. The modem 372, which may be internal or external, may be connected to the system bus 321 via the user input interface 360 or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 310, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, Fig. 4 illustrates remote application programs 385 as residing on the memory device 381. It will be appreciated that the network connections shown are exemplary, and other means of establishing a communications link between the computers may be used.
The hardware devices of Figs. 1-4 can be used to implement a system that allows a user to interact with on-screen objects other than the user's own on-screen representation. Fig. 5 is a flowchart depicting one embodiment of a process for performing the operations of the disclosed technology. In one embodiment, the steps of Fig. 5 may be performed by software modules in the translation module 196, the display module 198, and/or the application control logic 200 in the computing device 12 of the system 10. In step 500, the identity of a user at the computing device 12 is first received. In one example, step 500 may use facial recognition to correlate the user's face from a received visual image with a reference visual image to determine the user's identity. In another example, determining the user identity may include receiving input from the user identifying their identity. For example, user profiles may be stored by the computing device 12, and the user may make an on-screen selection to identify themselves as corresponding to a particular user profile. After the user identification has been successfully made, the user's selection of an application, such as the application 202, is received in step 501. In step 501, the user may be prompted by a user interface in the user's computing device 12 to select the application 202. As discussed above, the application 202 may include, for example, a video game application or a movie application executing on the computing device 12. In step 502, a scene of the application 202 is displayed to the user.
In step 506, a check is made to determine whether user motion has been detected. If it is determined that user motion has been detected, then user movement data is received in step 512. In one embodiment, the processor 42 in the capture device 20 may detect and receive information relating to the motion of the user. The process by which the capture device 20 detects and captures the user movement data is discussed in Fig. 6. If no user movement data is detected, then a check is made in step 508 to determine whether the user wishes to interact with additional scenes of the application 202. If it is determined that the user wishes to interact with one or more additional scenes, then the next scene of the application 202 is obtained in step 504 and displayed to the user in step 502. If it is determined that the user does not wish to interact with any additional scenes of the application 202, then the application 202 exits in step 510.
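The scene loop of steps 502-510 can be sketched as follows. This is a minimal illustration, not the patented implementation; the callbacks `detect_motion`, `capture_motion`, and `wants_more` are hypothetical stand-ins for the checks of steps 506 and 508.

```python
def run_application(scenes, detect_motion, capture_motion, wants_more):
    """Walk the scenes of an application as in steps 502-510 of Fig. 5."""
    captured = []
    for scene in scenes:
        # Step 502: the scene is displayed to the user (display elided here).
        # Step 506: check whether user motion is detected.
        if detect_motion(scene):
            # Step 512: capture the user movement data.
            captured.append(capture_motion(scene))
        elif not wants_more(scene):
            # Step 510: the user wants no further scenes, so exit.
            break
        # Step 504: otherwise continue to the next scene.
    return captured
```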
After the user motion has been captured in step 512, a check is made in step 514 to determine whether the user movement data matches a gesture. The user may make a range of motions in order to interact with one or more of the on-screen objects depicted in the application. In one embodiment, step 514 may include comparing the user movement data with one or more predetermined gestures specified by the gesture recognition engine 190. The process of determining whether the user movement data matches a gesture may be performed by the gesture recognition engine 190 and is discussed in Fig. 7. If the user movement data matches a gesture, then in step 518 the user movement data is automatically translated into multiple object responses of the on-screen objects. In one embodiment, step 518 of translating the user movement data may be performed as follows. In step 519, the matched gesture obtained in step 514 is used to access the data structure, described in Fig. 2, that correlates gestures with object responses. In step 520, the corresponding responses of the on-screen objects are accessed from the data structure, which includes accessing a pointer to the code that implements each response. For example, the corresponding response of an on-screen object may include a motion response, a visual response, or an audio response of that on-screen object. In step 521, the code implementing the object response is invoked. In one embodiment, the object response may be realized by mapping a skeletal model representation of the user to an object model representation of the on-screen object, as described in Fig. 2.
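Steps 519-521 can be sketched as a lookup table that correlates a matched gesture with the responses of several on-screen objects, each response held as a pointer (here, a Python callable) to the code that realizes it. All gesture and object names below are invented for illustration.

```python
def wave_motion(obj):
    # A motion response of an on-screen object.
    return f"{obj} sways"

def splash_sound(obj):
    # An audio response of an on-screen object.
    return f"{obj} plays splash audio"

# Data structure correlating one gesture with multiple object responses
# (step 519); each entry pairs an on-screen object with response code.
GESTURE_RESPONSES = {
    "swim_stroke": [("seaweed", wave_motion), ("water", splash_sound)],
}

def convert_motion_to_responses(gesture):
    """Step 518: translate one matched gesture into multiple object
    responses by invoking each response's code (steps 520-521)."""
    return [code(obj) for obj, code in GESTURE_RESPONSES.get(gesture, [])]
```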
If the user movement data does not match a gesture, then in step 516 feedback is provided to the user, via a user interface of the computing device 12, relating to the different types of motions that may be used by the user to interact with the on-screen objects. For example, a textual guide may be provided to the user relating the different types of motions that the user may use to interact with the various on-screen objects. If user movement data is then detected in step 506, user movement data based on the feedback provided to the user is captured in step 512.
In step 522, the multiple object responses of the on-screen objects are displayed to the user via the user interface in the user's computing device 12.
Figs. 6-9 provide more detailed flowcharts of the individual steps of Fig. 5. Fig. 6 is a flowchart depicting one embodiment of a process for capturing user movement data (step 512 of Fig. 5). In step 526, the processor 42 of the capture device 20 receives a visual image and a depth image from the image capture component 32. In other examples, only a depth image is received in step 526. The depth image and the visual image can be captured by any of the sensors in the image capture component 32, or by other suitable sensors as are known in the art. In one embodiment, the depth image and the visual image are captured separately. In some implementations, the depth image and the visual image are captured at the same time, while in others they are captured sequentially or at different times. In other embodiments, the depth image is captured with the visual image, or is combined with the visual image into one image file, such that each pixel has an R value, a G value, a B value, and a Z value (representing distance).
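The combined image file described above, in which every pixel carries R, G, B, and Z values, might be modeled as follows; the exact layout is an assumption, not the patent's.

```python
from dataclasses import dataclass

@dataclass
class RGBZPixel:
    """One pixel of the fused image: colour plus a depth (Z) value."""
    r: int
    g: int
    b: int
    z: float  # distance from the camera, e.g. in millimetres

def merge(rgb_frame, depth_frame):
    """Fuse a visual image (rows of (r, g, b) tuples) with a depth image
    (rows of Z values) captured together into one RGBZ image."""
    return [[RGBZPixel(*rgb, z) for rgb, z in zip(rgb_row, z_row)]
            for rgb_row, z_row in zip(rgb_frame, depth_frame)]
```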
In step 528, depth information corresponding to the visual image and the depth image is determined. The visual image and the depth image received in step 526 can be analyzed to determine depth values for one or more targets within the image. The capture device 20 may capture or observe a capture region that may include one or more targets.
In step 530, the capture device determines whether the depth image includes a human target. In one embodiment, each target in the depth image may be flood filled and compared to a pattern to determine whether the depth image includes a human target. In one embodiment, the edges of each target in the captured scene of the depth image may be determined. The depth image may include a two-dimensional pixel area of the captured scene, and each pixel in the 2D pixel area may represent a depth value such as a length or distance, for example as measured from the camera. The edges may be determined by comparing the various depth values associated with, for example, adjacent or nearby pixels of the depth image. If the various depth values being compared differ by more than a predetermined edge tolerance, the pixels may define an edge. The capture device may organize the calculated depth information, including the depth image, into "Z layers", i.e., layers that may be perpendicular to a Z axis extending from the camera along its line of sight to the viewer. The likely Z values of the Z layers may be flood filled based on the determined edges. For instance, the pixels associated with the determined edges and the pixels of the area within the determined edges may be associated with each other to define a target or object in the capture region.
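The edge test and flood fill described above can be sketched as follows, under the simplifying assumption that an edge lies between two neighbouring pixels whenever their depth values differ by more than the predetermined edge tolerance.

```python
def flood_fill_target(depth, seed, tolerance):
    """Return the set of pixel coordinates connected to `seed` without
    crossing an edge, where an edge separates neighbouring pixels whose
    depth values differ by more than `tolerance`."""
    rows, cols = len(depth), len(depth[0])
    target, stack = set(), [seed]
    while stack:
        r, c = stack.pop()
        if (r, c) in target:
            continue
        target.add((r, c))
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in target:
                # No edge between the pixels: same target, keep filling.
                if abs(depth[nr][nc] - depth[r][c]) <= tolerance:
                    stack.append((nr, nc))
    return target
```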
In step 532, the capture device scans the human target for one or more body parts. The human target can be scanned to provide measurements such as length, width, or the like that are associated with one or more body parts of the user, such that an accurate model of the user may be generated based on these measurements. In one example, the human target is isolated and a bitmask is created and scanned for one or more body parts. The bitmask may be created, for example, by flood filling the human target such that the human target is separated from other targets or objects in the capture area.
In step 534, a model of the human target is generated based on the scan performed in step 532. The bitmask may be analyzed for one or more body parts to generate a model of the human target, such as a skeletal model or a mesh human model. For example, measurement values determined from the scanned bitmask may be used to define one or more joints in a skeletal model. The bitmask may include values of the human target along the X, Y, and Z axes. The one or more joints may be used to define one or more bones that may correspond to body parts of a human.
According to one embodiment, to determine the location of the neck, shoulders, or the like of the human target, a width of the bitmask, for example at the position being scanned, may be compared to a threshold value of a typical width associated with, for example, a neck or shoulders. In an alternative embodiment, the distance from a previously scanned position associated with a body part in the bitmask may be used to determine the location of the neck, shoulders, or the like.
In one embodiment, to determine the location of the shoulders, the width of the bitmask at the shoulder position may be compared to a threshold shoulder value. For example, the distance between the two outermost Y values at the X value of the bitmask at the shoulder position may be compared to the threshold shoulder value of a typical distance between, for example, the shoulders of a human. Thus, according to an example embodiment, the threshold shoulder value may be a typical width or range of widths associated with the shoulders of a body model of a human.
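The shoulder comparison might be sketched as follows: the width between the outermost set pixels of one row of the bitmask is compared against a threshold range typical of human shoulders. The threshold values are illustrative, not from the patent.

```python
def looks_like_shoulders(mask_row, min_width, max_width):
    """Compare the bitmask width at a scanned position (the distance between
    the outermost set pixels of the row) with a threshold shoulder range."""
    set_cols = [i for i, v in enumerate(mask_row) if v]
    if not set_cols:
        return False
    width = set_cols[-1] - set_cols[0] + 1
    return min_width <= width <= max_width
```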
In one embodiment, some body parts such as legs, feet, or the like may be calculated based on, for example, the locations of other body parts. For instance, as described above, information such as the positions or pixels associated with the human target may be scanned to determine the locations of the various body parts of the human target. Based on such locations, subsequent body parts such as legs, feet, or the like may then be calculated for the human target.
According to one embodiment, upon determining the values of, for example, a body part, a data structure may be created that includes measurement values such as the length, width, or the like of the body part associated with the scan of the bitmask of the human target. In one embodiment, the data structure may include scan results averaged over a plurality of depth images. For example, the capture device may capture the capture region in frames, each frame including a depth image. The depth image of each frame may be analyzed as described above to determine whether a human target is included. If the depth image of a frame includes a human target, the bitmask of the human target of the depth image associated with that frame may be scanned for one or more body parts. The value determined for each body part in each frame may then be averaged, such that the data structure may include average measurement values, such as length, width, or the like, of the body part associated with the scans of each frame. According to one embodiment, the determined measurement values of the body parts may be adjusted, for example scaled up or down, such that the measurement values in the data structure more closely correspond to a typical model of a human body. In step 534, the measurement values determined from the scanned bitmask may be used to define the one or more joints in the skeletal model.
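The averaging of per-frame scan results into the data structure might look like this, under the assumption that each frame's scan is reduced to a dictionary of body-part measurements.

```python
from collections import defaultdict

def average_measurements(frame_scans):
    """Average per-frame body-part measurements into one data structure.

    frame_scans: list of dicts mapping body part name -> measured value
    (e.g. a width), one dict per frame whose depth image contained a
    human target."""
    totals, counts = defaultdict(float), defaultdict(int)
    for scan in frame_scans:
        for part, value in scan.items():
            totals[part] += value
            counts[part] += 1
    return {part: totals[part] / counts[part] for part in totals}
```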
In step 536, the model created in step 534 is tracked using skeletal mapping. For example, the skeletal model of the user 18 may be adjusted and updated as the user moves in physical space in front of the camera, within the field of view. Information from the capture device may be used to adjust the model so that the skeletal model accurately represents the user. In one example, this is accomplished by applying one or more forces to one or more force-receiving aspects of the skeletal model, to adjust the skeletal model into a pose that more closely corresponds to the pose of the human target in physical space.
In step 538, motion is captured based on the skeletal mapping to generate a motion capture file. In one embodiment, step 538 of capturing motion may include calculating the position, direction, acceleration, and curvature of one or more body parts identified by the scan. The position of the body part is calculated in X, Y, Z space to create a three-dimensional positional representation of the body part within the field of view of the camera. The direction of movement of the body part is calculated from the position. The directional movement may have components in any one of, or a combination of, the X, Y, and Z directions. The curvature of the body part's movement in X, Y, Z space is determined, for example, to represent non-linear movement of the body part within the capture region. The velocity, acceleration, and curvature calculations are not dependent on the direction. It is to be appreciated that the use of X, Y, Z Cartesian mapping is provided only as an example. In other embodiments, different coordinate mapping systems can be used to calculate movement, velocity, and acceleration. A spherical coordinate mapping, for example, may be useful when examining the movement of body parts that naturally rotate around joints.
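The position, direction, and acceleration calculations of step 538 can be illustrated with simple finite differences over successive position samples; this is an assumed formulation, not the patent's.

```python
def motion_derivatives(positions, dt):
    """From successive (x, y, z) samples of one body part taken dt seconds
    apart, derive its directional velocity, scalar speed, and acceleration
    of speed via finite differences."""
    velocities = [tuple((b - a) / dt for a, b in zip(p0, p1))
                  for p0, p1 in zip(positions, positions[1:])]
    speeds = [sum(c * c for c in v) ** 0.5 for v in velocities]
    accelerations = [(s1 - s0) / dt for s0, s1 in zip(speeds, speeds[1:])]
    return velocities, speeds, accelerations
```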
Once all body parts in the scan have been analyzed, the motion capture file generated in step 538 may be updated for the target. In one example, the motion capture file is generated in real time based on information associated with the tracked model. For example, in one embodiment the motion capture file may include vectors, including X, Y, and Z values, that define the joints and bones of the model as it is being tracked at various points in time. As described above, the model being tracked may be adjusted based on user motions at various points in time, and a motion capture file of the model of the motion may be generated and stored. The motion capture file may capture the tracked model during natural movement by the user interacting with the target recognition, analysis, and tracking system. For example, the motion capture file may be generated such that it naturally captures any movement or motion performed by the user during interaction with the target recognition, analysis, and tracking system. The motion capture file may include frames corresponding to, for example, snapshots of the motion of the user at different points in time. Upon capturing the tracked model, information associated with the model, including any movements or adjustments applied to it at a particular point in time, may be rendered in a frame of the motion capture file. The information in the frame may include, for example, the vectors, including X, Y, and Z values, that define the joints and bones of the tracked model, and a time stamp that may, for example, indicate the point in time at which the user performed the movement corresponding to the pose of the tracked model.
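A motion capture file of the kind described, holding frames of time-stamped joint vectors, might be structured as follows; the representation is assumed for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """One snapshot of the tracked model at a particular point in time."""
    timestamp: float
    joints: dict  # joint name -> (x, y, z) vector

@dataclass
class MotionCaptureFile:
    frames: list = field(default_factory=list)

    def record(self, timestamp, joints):
        """Append a frame holding the model's joint vectors at `timestamp`."""
        self.frames.append(Frame(timestamp, dict(joints)))

    def joint_track(self, name):
        """All recorded positions of one joint over time."""
        return [(f.timestamp, f.joints[name]) for f in self.frames]
```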
In one embodiment, steps 526-538 are performed by the capture device 20. Moreover, although steps 526-538 are depicted as being performed by the capture device 20, any of these steps may be performed by other components, such as the computing environment 12. For example, the capture device 20 may provide the visual and/or depth images to the computing device 12, which will in turn determine the depth information, detect the human target, scan the target, generate and track the model, and capture the motion of the human target.
Fig. 7 shows an example of a skeletal model or mapping 540 representing a scanned human target that may be generated in step 534 of Fig. 6. According to one embodiment, the skeletal model 540 may include one or more data structures that may represent the human target as a three-dimensional model. Each body part may be characterized as a mathematical vector defining the joints and bones of the skeletal model 540. The skeletal model 540 includes joints n1-n18. Each of the joints n1-n18 may enable one or more body parts defined between those joints to move relative to one or more other body parts. A model representing a human target may include a plurality of rigid and/or deformable body parts that may be defined by one or more structural members such as "bones", with the joints n1-n18 located at the intersections of adjacent bones. The joints n1-n18 may enable the various body parts associated with the bones and joints n1-n18 to move independently of one another or relative to one another. For example, the bone defined between the joints n7 and n11 corresponds to a forearm, and that forearm may be moved independently of, for example, the bone defined between the joints n15 and n17, which corresponds to a calf. It is to be understood that some bones may correspond to anatomical bones in the human target, and/or some bones may not have corresponding anatomical bones in the human target.
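The joint-and-bone structure of Fig. 7 can be sketched as follows; the joint coordinates and the bone list are placeholders, not taken from the figure.

```python
# Joints n1-n18 as 3D points; bones as pairs of joints. The listed bones
# (forearm between n7 and n11, calf between n15 and n17) follow the example
# in the text; all coordinates are placeholders.
joints = {f"n{i}": (0.0, 0.0, 0.0) for i in range(1, 19)}
bones = [("n7", "n11"), ("n15", "n17")]

def move_joint(joints, name, delta):
    """Displace one joint; joints and bones not touching it are unaffected,
    so the forearm can move independently of the calf."""
    x, y, z = joints[name]
    dx, dy, dz = delta
    return {**joints, name: (x + dx, y + dy, z + dz)}
```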
The bones and joints may collectively make up the skeletal model, and they may be constituent elements of the model. An axial roll angle may be used to define a rotational orientation of a limb relative to its parent limb and/or the torso. For example, if the skeletal model is illustrating an axial rotation of an arm, a roll joint may be used to indicate the direction the associated wrist is pointing (e.g., palm facing up). By examining the orientation of a limb relative to its parent limb and/or the torso, the axial roll angle may be determined. For example, when examining a lower leg, the orientation of the lower leg relative to the associated thigh and hips may be examined in order to determine the axial roll angle.
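One assumed geometric formulation of the axial roll angle: project a lateral reference vector of the limb and of its parent onto the plane perpendicular to the parent bone axis, then measure the angle between the projections. This is an illustrative sketch, not the patent's method.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def axial_roll(parent_axis, parent_lateral, limb_lateral):
    """Angle in radians between the two lateral reference vectors,
    measured in the plane perpendicular to the parent bone axis."""
    scale = 1.0 / _norm(parent_axis)
    n = tuple(c * scale for c in parent_axis)  # unit parent axis

    def project(v):
        # Remove the component of v that lies along the parent axis.
        d = _dot(v, n)
        return tuple(x - d * c for x, c in zip(v, n))

    p, q = project(parent_lateral), project(limb_lateral)
    cos_a = _dot(p, q) / (_norm(p) * _norm(q))
    return math.acos(max(-1.0, min(1.0, cos_a)))
```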
Fig. 8 provides further details of an exemplary embodiment of the gesture recognition engine 190 of Fig. 2. As shown, the gesture recognition engine 190 may comprise at least one filter 450 for determining one or more gestures. A filter 450 comprises parameters defining a gesture 452 (hereinafter referred to as a "gesture") along with metadata 454 for that gesture. A filter may comprise code, and data associated with that code, that can recognize gestures or otherwise process depth, RGB, or skeletal data. For instance, a throw, which comprises motion of one of the hands from behind the rear of the body to past the front of the body, may be implemented as a gesture 452 comprising information representing the movement of one of the user's hands from behind the rear of the body to past the front of the body, as that movement would be captured by a depth camera. Parameters 454 may then be set for that gesture 452. Where the gesture 452 is a throw, a parameter 454 may be a threshold velocity that the hand has to reach, a distance the hand must travel (either absolute, or relative to the size of the user as a whole), and a confidence rating by the recognizer engine that the gesture occurred. These parameters 454 for the gesture 452 may vary between applications, between contexts of a single application, or within one context of one application over time. Gesture parameters may include threshold angles (e.g., hip-thigh angle, forearm-bicep angle, etc.), a number of periods in which motion occurs or does not occur, a threshold period, threshold positions (starting, ending), direction of movement, velocity, acceleration, coordinates of the movement, and the like.
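A throw filter of the kind described, gated by threshold-velocity and travel-distance parameters 454 and emitting a confidence that the gesture occurred, might be sketched as follows; the parameter values and the confidence formula are illustrative assumptions.

```python
def throw_filter(hand_positions, dt, min_speed=2.0, min_distance=0.4):
    """hand_positions: successive (x, y, z) samples of one hand, dt apart.
    Returns a confidence in [0, 1] that a throw occurred, gated by a
    threshold velocity the hand must reach and a distance it must travel."""
    if len(hand_positions) < 2:
        return 0.0
    start, end = hand_positions[0], hand_positions[-1]
    distance = sum((b - a) ** 2 for a, b in zip(start, end)) ** 0.5
    peak_speed = max(
        sum((b - a) ** 2 for a, b in zip(p0, p1)) ** 0.5 / dt
        for p0, p1 in zip(hand_positions, hand_positions[1:]))
    if distance < min_distance or peak_speed < min_speed:
        return 0.0
    # Confidence grows as the peak speed exceeds the threshold.
    return min(1.0, peak_speed / (2.0 * min_speed))
```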
A filter may comprise code that recognizes a gesture or otherwise processes depth, RGB, or skeletal data, together with associated data. Filters may be modular or interchangeable. In one embodiment, a filter has a number of inputs, each of those inputs having a type, and a number of outputs, each of those outputs having a type. In this situation, a first filter may be replaced with a second filter that has the same number and types of inputs and outputs as the first filter without altering any other aspect of the recognizer engine architecture. For instance, there may be a first filter for driving that takes skeletal data as input and outputs a confidence that the gesture associated with the filter is occurring, along with an angle of steering. Where one wishes to substitute this first driving filter with a second driving filter (perhaps because the second driving filter is more efficient and requires fewer processing resources), one may do so by simply replacing the first filter with the second filter, so long as the second filter has the same inputs and outputs — one input of skeletal data type, and two outputs of confidence type and angle type.
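The typed input/output interchangeability described above can be sketched as follows; the interface and filter names are illustrative assumptions, not the patent's actual architecture:

```python
from typing import Protocol


class DrivingFilter(Protocol):
    """Assumed common interface: one skeletal-data input, two typed outputs
    (a confidence and a steering angle), mirroring the driving example above."""
    def evaluate(self, skeleton: dict) -> tuple[float, float]: ...


class FirstDrivingFilter:
    def evaluate(self, skeleton: dict) -> tuple[float, float]:
        # Naive sketch: steering angle derived from the hand's lateral offset.
        return 0.9, skeleton["hand_x"] * 45.0


class SecondDrivingFilter:
    def evaluate(self, skeleton: dict) -> tuple[float, float]:
        # A cheaper drop-in replacement with identical input/output types.
        return 0.8, skeleton["hand_x"] * 45.0


def recognize(f: DrivingFilter, skeleton: dict) -> tuple[float, float]:
    # The engine depends only on the interface, so filters are swappable.
    return f.evaluate(skeleton)


print(recognize(FirstDrivingFilter(), {"hand_x": 1.0}))   # (0.9, 45.0)
print(recognize(SecondDrivingFilter(), {"hand_x": 1.0}))  # (0.8, 45.0)
```

Nothing in `recognize` changes when the first filter is swapped for the second, which is the point of keeping the input and output types identical.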
Filter need not have parameter.Such as, " user's height " filter of the height returning user may not allow
Any parameter that can be conditioned." user's height " filter replaced can have customized parameter, as determined the height of user
Time whether consider the footwear of user, hair style, headwear and figure.
Inputs to a filter may comprise things such as joint data about a user's joint positions, angles formed by the bones that intersect at a joint, RGB color data from the capture area, and the rate of change of an aspect of the user. Outputs from a filter may comprise things such as the confidence that a given gesture is being made, the speed at which a gesture motion is made, and a time at which a gesture motion occurs.
The gesture recognition engine 190 may have a base recognizer engine 456 that provides functionality to the gesture filters 450. In one embodiment, the functionality implemented by the base recognizer engine 456 includes an input-over-time archive that tracks recognized gestures and other input, a hidden Markov model implementation (where the modeled system is assumed to be a Markov process with unknown parameters — one in which the current state encapsulates any past state information needed to determine a future state, so no other past state information need be maintained for this purpose — and the hidden parameters are determined from the observable data), as well as other functionality required to solve particular instances of gesture recognition.
The filters 450 are loaded and implemented on top of the base recognizer engine 456, and can utilize the services that the engine 456 provides to all filters 450. In one embodiment, the base recognizer engine 456 processes received data to determine whether it meets the requirements of any filter 450. Since these provided services, such as parsing the input, are supplied once by the base recognizer engine 456 rather than by each filter 450, such a service need only be processed once in a period of time, as opposed to once per filter 450 in that period, thereby reducing the processing required to determine gestures.
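A minimal sketch of the parse-once service model described above, with hypothetical names: the base engine parses each frame a single time and hands the parsed result to every registered filter.

```python
class BaseRecognizerEngine:
    """Sketch of base recognizer engine 456: parse each frame once,
    then let every registered filter consume the parsed result."""

    def __init__(self):
        self.filters = []

    def register(self, gesture_filter):
        self.filters.append(gesture_filter)

    def process(self, raw_frame: bytes) -> dict:
        parsed = self._parse(raw_frame)  # done once per frame, not once per filter
        return {f.__name__: f(parsed) for f in self.filters}

    @staticmethod
    def _parse(raw_frame: bytes) -> dict:
        # Stand-in for parsing depth/RGB/skeletal data out of a raw frame.
        return {"hand_y": raw_frame[0] / 255.0}


def wave(parsed):  # hypothetical filter: hand raised above mid-height
    return parsed["hand_y"] > 0.5


engine = BaseRecognizerEngine()
engine.register(wave)
print(engine.process(bytes([200])))  # {'wave': True}
```

With many filters registered, the saving is exactly the one the text describes: the parse step runs once per frame instead of once per filter per frame.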
An application may use the filters 450 provided by the recognizer engine 190, or it may provide its own filter 450, which plugs in to the base recognizer engine 456. In one embodiment, all filters 450 have a common interface enabling this plug-in characteristic. Further, all filters 450 may utilize parameters 454, so a single gesture tool, as described below, may be used to debug and tune the entire filter system. These parameters 454 may be tuned for an application, or for a context of an application, by the gesture tool.
There are a variety of outputs that may be associated with a gesture. In one example, there may be a baseline "yes or no" as to whether the gesture is occurring. In another example, there may also be a confidence level, corresponding to the likelihood that the user's tracked movement corresponds to the gesture. This could be a linear scale ranging over floating-point numbers between 0 and 1, inclusive. Where an application receiving this gesture information cannot accept false positives as input, it may use only those recognized gestures that have a high confidence level, such as at least 0.95. Where an application must recognize every instance of the gesture, even at the cost of false positives, it may use gestures with a much lower confidence level, such as those merely greater than 0.2. A gesture may have an output for the time between the two most recent steps, and where only a first step has been registered, this may be set to a reserved value, such as -1 (since the time between any two steps must be positive). A gesture may also have an output for the highest thigh angle reached during the most recent step.
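The per-application confidence thresholds discussed above (at least 0.95 for an application that cannot accept false positives, just over 0.2 for one that must catch every instance) amount to a simple comparison; a sketch with illustrative values:

```python
def accept_gesture(confidence: float, min_confidence: float) -> bool:
    # confidence is on the linear 0..1 scale described above;
    # min_confidence is chosen by the receiving application.
    return confidence >= min_confidence


# A strict application (no false positives) vs. a permissive one
# (no missed gestures), judging the same recognized gesture.
print(accept_gesture(0.9, 0.95))  # False: rejected by the strict application
print(accept_gesture(0.9, 0.2))   # True: accepted by the permissive one
```

The recognizer can thus report every gesture with its confidence and leave the accept/reject policy entirely to each application.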
A gesture, or a portion thereof, may have as a parameter a volume of space in which it must occur. Where the gesture comprises body movement, this volume of space may typically be expressed relative to the body. For instance, a football throwing gesture for a right-handed user may be recognized only in the volume of space no lower than the right shoulder 410a and on the same side of the head 422 as the throwing arm 402a-410a. It may not be necessary to define all bounds of the volume, as with this throwing gesture, where the bound away from the body is left undefined and the volume extends out indefinitely, or to the edge of the capture area being monitored.
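The partially bounded volume-of-space check described above might look like the following sketch; the coordinate convention and the joint values are assumptions for illustration only:

```python
def in_throw_volume(hand, right_shoulder):
    """For a right-handed throw, require the hand to be no lower than the
    right shoulder and on the throwing-arm side of the body; the outward
    bound is deliberately left undefined, as in the example above.
    Points are (x, y): +x toward the user's right, +y up."""
    return hand[1] >= right_shoulder[1] and hand[0] >= right_shoulder[0]


print(in_throw_volume(hand=(0.6, 1.6), right_shoulder=(0.4, 1.5)))  # True
print(in_throw_volume(hand=(0.6, 1.3), right_shoulder=(0.4, 1.5)))  # False
```

Note that only two of the four planar bounds are tested, matching the idea that an unbounded side of the volume simply never fails the check.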
In addition, gestures may stack on each other. That is, a user may express more than one gesture at a time. For instance, rather than disallowing any input other than a throw when a throwing gesture is made, or requiring that the user remain motionless except for the components of the gesture (e.g., stand still while making a throwing gesture that involves only one arm), where gestures stack, the user may make a jumping gesture and a throwing gesture simultaneously, and both gestures will be recognized by the gesture engine.
Fig. 9 is a flowchart describing one embodiment of a process for determining whether user motion data matches a gesture, performing step 514 of Fig. 5 in accordance with the disclosed technology. Fig. 9 describes a rule-based approach in which the gesture recognition engine 190 applies one or more gesture filters to determine whether a user's motion data matches a particular gesture. It will be appreciated that while the detection of a single gesture is described in this particular example, the process of Fig. 9 may be performed multiple times to detect multiple gestures in an active gesture set. The described process may be performed in parallel or in sequence for multiple active gestures.
In step 602, the gesture recognition engine accesses the skeletal tracking data for a particular target to begin determining whether the target has performed a selected gesture. The skeletal tracking data may be accessed from a motion capture file, as described in Fig. 6. In step 604, the gesture recognition engine filters the skeletal tracking data for one or more predetermined body parts, the one or more predetermined body parts being those pertinent to the selected gesture as identified in the selected gesture filter. Step 604 may include accessing only the data pertinent to the selected gesture, or accessing all of the target's skeletal tracking data and ignoring or discarding the information not pertinent to the selected gesture. For instance, a hand gesture filter may indicate that only the hands of a human target are pertinent to the selected gesture, so that data pertaining to other body parts may be ignored. Such a technique can improve the performance of the gesture recognition engine by limiting processing to information determined in advance to be salient to the selected gesture.
In step 606, the gesture recognition engine filters the skeletal tracking data for predetermined axial movements. For instance, the selected gesture's filter may specify that only movements along a certain subset of axes are pertinent.
In step 608, the gesture recognition engine accesses a rule j specified in the gesture filter. In the first iteration through the process of Fig. 9, j equals 1. A gesture may comprise a number of parameters that need to be satisfied in order for the gesture to be recognized. Each of these parameters may be specified in a separate rule, although multiple components may be included in a single rule. A rule may specify a threshold distance, position, direction, curvature, velocity, and/or acceleration, among other parameters, that a target's body part must meet in order for the gesture to be satisfied. A rule may apply to one body part or to multiple body parts. Moreover, a rule may specify a single parameter, such as position, or multiple parameters, such as position, direction, distance, curvature, velocity, and acceleration.
In step 610, the gesture recognition engine compares the skeletal tracking data filtered at steps 604 and 606 with the specified parameters of the rule to determine whether the rule is satisfied. For example, the gesture recognition engine may determine whether a hand's starting position was within a threshold distance of a starting position parameter. The rule may further specify, and the engine determine, whether the hand moved in a specified direction; moved a threshold distance from the starting position in the specified direction; moved within a threshold curvature along a specified axis; moved at or above a specified velocity; and met or exceeded a specified acceleration. If the engine determines that the skeletal tracking information does not meet the parameters specified in the filter rule, then in step 612 the engine returns a fail or gesture-filter-not-satisfied response. The response is returned to the application 202 executing on the computing device 12.
In step 614, the gesture recognition engine determines whether the gesture filter specifies additional rules that must be satisfied for the gesture to be completed. If the filter includes additional rules, j is incremented by one and the process returns to step 608, where the next rule is accessed. If there are no additional rules, then in step 618 the gesture recognition engine returns an indication that the gesture filter has been satisfied.
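The rule-iteration loop of steps 608 through 618 can be sketched as follows; the example gesture, its rules, and the data layout are hypothetical:

```python
def apply_gesture_filter(frames, rules):
    """Rules are checked in order (steps 608/610); the first unsatisfied
    rule fails the filter (step 612); if every rule holds, the filter
    is satisfied (step 618)."""
    for rule in rules:
        if not rule(frames):
            return False  # step 612: gesture filter not satisfied
    return True  # step 618: gesture filter satisfied


# Hypothetical rules for a one-hand "raise" gesture over two frames.
frames = [{"hand_y": 1.0}, {"hand_y": 1.4}]
rules = [
    lambda f: f[-1]["hand_y"] > f[0]["hand_y"],         # hand moved upward
    lambda f: f[-1]["hand_y"] - f[0]["hand_y"] >= 0.3,  # by a threshold distance
]
print(apply_gesture_filter(frames, rules))  # True
```

Running the same loop once per gesture in the active gesture set, in parallel or in sequence, gives the multi-gesture detection mentioned above.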
Steps 612 and 618 of Fig. 9 return a simple pass/fail response for the gesture being analyzed. In other examples, rather than returning a simple pass/fail response, Fig. 9 will return a confidence level that the gesture filter has been satisfied. For each rule in the filter, an amount by which the target's movement meets or does not meet a specified parameter is determined. Based on an aggregation of these amounts, the recognition engine returns a confidence level that the target did perform the gesture.
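One possible aggregation of the per-rule amounts into a confidence level is a simple mean; the specification does not fix the aggregation, so this sketch is an assumption:

```python
def gesture_confidence(amounts):
    # amounts: per-rule values in 0..1 measuring how well the target's
    # movement met each rule's parameter. A plain mean is an assumption;
    # an engine could just as well weight the rules differently.
    return sum(amounts) / len(amounts)


# Three rules, met to varying degrees.
print(round(gesture_confidence([1.0, 0.8, 0.6]), 2))  # 0.8
```

The resulting value sits on the same 0-to-1 linear scale as the confidence output described earlier, so applications can threshold it the same way.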
Figures 10-14 depict various user interface screens describing user interactions with an application executing on a computing device. Figures 10-11 illustrate an exemplary user interaction with an application via a user interface in the computing device 12, and the result of the user's interaction with the application. Figure 10 illustrates a user 18 interacting with an application 202 via a user interface in the computing device 12. Figure 11 illustrates the result of the user's interaction with the application shown in Figure 10. In the exemplary illustration shown in Figure 11, the user 18 interacts with a shark object 52 depicted in a scene 50 of the application 202 by performing an exemplary movement, such as a hip movement. The result of the user's interaction is shown, in which the hip movement 54 has been translated into multiple motion responses 56, 57 of the shark's fins. As further illustrated, the multiple object responses 56, 57 are simultaneously displayed to the user 18 via the user interface.
In another embodiment, more than one user may simultaneously interact with the application 202 executing in the computing device 12. Accordingly, first user motion data may be received from a first user, and second user motion data may be received from a second user interacting with an onscreen object depicted by the application 202. The first user motion data may be translated into a first motion response, and the second user motion data may be translated into a second motion response. The first and second responses of the onscreen object may be simultaneously displayed to the users via the user interface in the computing device 12. In one embodiment, the second object response may be different from the first object response when the second user motion data differs from the first user motion data. Alternatively, the second object response may be an amplification of the first user response when the second user motion data is identical to the first user motion data. The amplified response may be determined, for example, based on the speed or acceleration of the sensed user movement, wherein the onscreen object may move faster in response to higher-speed movement.
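A sketch of the two-user translation just described, under the assumption that a second user repeating an identical gesture amplifies the response by doubling it — the amplification factor and the event format are illustrative only:

```python
def translate_motions(events):
    """events: (user, gesture, speed) tuples from the sensor. The first user
    performing a gesture produces a normal response; a second user repeating
    the identical gesture produces an amplified one (doubled here — the text
    only says amplification may depend on the sensed speed or acceleration)."""
    seen, responses = set(), []
    for user, gesture, speed in events:
        if gesture in seen:
            responses.append((user, gesture, "amplified", speed * 2))
        else:
            seen.add(gesture)
            responses.append((user, gesture, "normal", speed))
    return responses


events = [("user1", "hip_motion", 1.0), ("user2", "hip_motion", 1.0)]
print(translate_motions(events))
# [('user1', 'hip_motion', 'normal', 1.0), ('user2', 'hip_motion', 'amplified', 2.0)]
```

Both responses would then be rendered on the onscreen object at once, matching the simultaneous display described above.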
Figure 12 is an exemplary illustration of a first user 18 and a second user 19 interacting with an onscreen object 52 depicted in the application 202 executing on the computing device 12. In this exemplary illustration, the first user 18 performs a hip movement to interact with the shark object 52 depicted in a scene 50 of the application 202. The second user 19 performs a hand movement 53 to interact with the same shark object 52. The result of the two users' simultaneous interactions with the shark object 52 is also shown, in which the hip movement 54 has been translated into a first motion response causing multiple movements 56, 57 of the shark object's fins, and the hand movement 53 has been translated into a second motion response causing a movement 59 of the shark object's body.
Figure 13 illustrates another scene in which the user 18 interacts with the application 202 via a user interface in the computing device 12. In the exemplary illustration shown in Figure 13, the user 18 interacts with an inanimate object, such as a light bulb object 62, depicted in a particular scene 60 of the application 202. Figure 14 illustrates the result of the user's interaction with the light bulb object 62, in which the user's clapping movement 66 has been translated into a visual response of the light bulb object 62. As illustrated, the visual response, indicated by reference numeral 64, turns the light bulb 62 on.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. The scope of the present invention is defined by the claims appended hereto.
Claims (15)
1. A method of translating user motion into multiple object responses based on a user's interaction with an application executing on a computing device, comprising:
receiving user motion data of one or more users at a sensor;
determining whether the user motion data matches one or more pre-defined gestures;
automatically translating the user motion data into multiple object responses of an onscreen object when the user motion data matches the one or more pre-defined gestures, the onscreen object corresponding to an object other than an on-screen representation of the user displayed by the computing device executing the application; and
simultaneously displaying the multiple object responses of the onscreen object based on the translation, without altering a result of the user's real-time interaction with the application,
wherein the onscreen object is non-interactive.
2. The method of claim 1, wherein:
at least one of the multiple object responses comprises a motion response of the onscreen object.
3. The method of claim 1, wherein:
at least one of the multiple object responses is an audio response of the onscreen object.
4. The method of claim 1, wherein:
at least one of the multiple object responses is a visual response of the onscreen object.
5. The method of claim 1, wherein translating the user motion data into multiple object responses further comprises:
accessing a data structure correlating gestures with object responses, based on the determination of whether the user motion data matches the one or more pre-defined gestures;
accessing the corresponding responses of the onscreen object; and
implementing the object responses, the implementing comprising mapping a skeletal model representation of the user to an object model representation of the onscreen object.
6. The method of claim 1, wherein receiving the user motion data of one or more users further comprises:
receiving first user motion data from a first user and receiving second user motion data from a second user.
7. The method of claim 6, further comprising:
translating the first user motion data into a first object response of the onscreen object;
translating the second user motion data into a second object response of the onscreen object; and
simultaneously displaying the first object response and the second object response of the onscreen object to the first user and the second user.
8. The method of claim 7, wherein:
the second object response is different from the first object response when the second user motion data differs from the first user motion data.
9. The method of claim 7, wherein:
the second object response is an amplification of the first user response when the second user motion data is identical to the first user motion data.
10. The method of claim 1, wherein:
receiving the user motion data comprises receiving first user motion data from a first user and receiving second user motion data from a second user;
translating the user motion data into multiple object responses comprises translating the first user motion data into a first motion response of the onscreen object and translating the second user motion data into a second motion response of the onscreen object;
translating the user motion data into multiple object responses comprises mapping a skeletal model representation of the user motion data to an object model representation of the onscreen object; and
displaying the multiple object responses of the onscreen object based on the translation comprises displaying at least one of a motion response, an audio response, or a visual response of the onscreen object.
11. The method of claim 1, wherein:
one or more of the onscreen objects comprise an inanimate object presented in the application; and wherein:
at least one of the multiple object responses comprises a movement of the inanimate object.
12. A method of enhancing a user's interaction with an application, comprising:
receiving first user motion data of a first user at a sensor, the first user motion data corresponding to a first user interaction with an onscreen object presented in an application executing on a computing device;
receiving second user motion data of a second user at the sensor, the second user motion data corresponding to a second user interaction with the onscreen object presented in the application executing on the computing device;
translating the first user motion data into a first motion response of the onscreen object;
translating the second user motion data into a second motion response of the onscreen object; and
simultaneously displaying the first motion response and the second motion response to the first user and the second user, without altering a result of the first user's and the second user's real-time interactions with the application,
wherein the onscreen object is non-interactive.
13. The method of claim 12, wherein:
the second motion response is different from the first motion response when the second user motion data differs from the first user motion data.
14. The method of claim 12, wherein:
the second motion response is an amplification of the first motion response when the second user motion data is identical to the first user motion data.
15. An apparatus for triggering object responses based on a user's interaction with an application, comprising:
a sensor that captures user motion data; and
a computing device connected to the sensor, the computing device translating the user motion data into at least one of a motion response, an audio response, or a visual response of an inanimate object, and displaying at least one of the motion response, the audio response, or the visual response of the inanimate object based on the translation, without altering a result of the user's real-time interaction with the application,
wherein the inanimate object is non-interactive.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/859,995 | 2010-08-20 | ||
US12/859,995 US9075434B2 (en) | 2010-08-20 | 2010-08-20 | Translating user motion into multiple object responses |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102375541A CN102375541A (en) | 2012-03-14 |
CN102375541B true CN102375541B (en) | 2016-12-14 |
Family
ID=
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1394325A (en) * | 2000-09-01 | 2003-01-29 | 美国索尼电脑娱乐公司 | User input device and method for interaction with graphic images |
CN101484933A (en) * | 2006-05-04 | 2009-07-15 | 索尼计算机娱乐美国公司 | Methods and apparatus for applying gearing effects to input based on one or more of visual, acoustic, inertial, and mixed data |
CN101730874A (en) * | 2006-06-28 | 2010-06-09 | 诺基亚公司 | Touchless gesture based input |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1394325A (en) * | 2000-09-01 | 2003-01-29 | 美国索尼电脑娱乐公司 | User input device and method for interaction with graphic images |
CN101484933A (en) * | 2006-05-04 | 2009-07-15 | 索尼计算机娱乐美国公司 | Methods and apparatus for applying gearing effects to input based on one or more of visual, acoustic, inertial, and mixed data |
CN101730874A (en) * | 2006-06-28 | 2010-06-09 | 诺基亚公司 | Touchless gesture based input |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102331840B (en) | User selection and navigation based on looped motions | |
CN102184020B (en) | Gestures and gesture modifiers for manipulating a user-interface | |
US9075434B2 (en) | Translating user motion into multiple object responses | |
CN102448561B (en) | Gesture coach | |
CN102222431B (en) | Computer implemented method for performing sign language translation | |
CN102576466B (en) | For the system and method for trace model | |
US9245177B2 (en) | Limiting avatar gesture display | |
CN102448562B (en) | Systems and methods for tracking a model | |
CN102414641B (en) | Altering view perspective within display environment | |
CN102129293B (en) | Tracking groups of users in motion capture system | |
CN102411783B (en) | Move from motion tracking user in Video chat is applied | |
TWI442311B (en) | Using a three-dimensional environment model in gameplay | |
US8803889B2 (en) | Systems and methods for applying animations or motions to a character | |
CN102413886B (en) | Show body position | |
CN105073210B (en) | Extracted using the user's body angle of depth image, curvature and average terminal position | |
US20120053015A1 (en) | Coordinated Motion and Audio Experience Using Looped Motions | |
CN102262438A (en) | Gestures and gesture recognition for manipulating a user-interface | |
US20110109617A1 (en) | Visualizing Depth | |
CN102207771A (en) | Intention deduction of users participating in motion capture system | |
CN103608844A (en) | Fully automatic dynamic articulated model calibration | |
CN102356373A (en) | Virtual object manipulation | |
CN107077730A (en) | Limb finder based on outline is determined | |
CN102591456B (en) | To the detection of health and stage property | |
CN102375541B (en) | Translating user motion into multiple object responses |
CN105164617B (en) | The self-discovery of autonomous NUI equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20150728 Address after: Washington State Applicant after: Microsoft Technology Licensing, LLC Address before: Washington State Applicant before: Microsoft Corp. |
|
GR01 | Patent grant |