Microsoft proposes patented technology for predicting the posture of articulated objects for AR/VR body posture capture

王林

Sep 18, 2023 pm 07:37 PM

AR/VR, Microsoft patent, body posture capture

(Nweon, September 18, 2023) Accurately representing the real-world pose of a human user usually requires fairly detailed information about the position and orientation of the user's body parts, but this information is not always available. For example, when a headset is used to provide a virtual reality experience, the system may only be able to obtain spatial information for the user's head and hands. In most cases this is not sufficient to accurately reproduce the user's real pose.

So, in a patent application titled "Pose prediction for articulated object", Microsoft proposes a technique for predicting the pose of articulated objects. Specifically, a machine learning model receives spatial information for n different joints of the articulated object, where the n joints are fewer than all the joints of the articulated object.

In the case of a human user, the n joints may include the user's head joint and/or one or two wrist joints, which are associated with spatial information detailing the parameters of the user's head and/or hands.

The machine learning model has previously been trained to receive, as input, spatial information for n + m joints of the articulated object, where m is greater than or equal to 1. For example, during initial training, the model receives input data corresponding to nearly all joints of the articulated object. The n + m joints may include every joint of the articulated object.

In other examples, the n + m joints may be fewer than all the joints of the articulated object. During training, the input data for the m joints can be progressively masked: the input data for a given joint among the m joints can be replaced with a predefined value, or simply omitted.
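
The two masking options described above (replace with a predefined value, or omit) can be sketched as follows. This is only an illustration: the joint names, the six-value feature layout, and the all-zero `MASK_VALUE` are assumptions, not details taken from the patent application.

```python
from typing import Dict, List

# Hypothetical "predefined value" standing in for a masked joint.
MASK_VALUE: List[float] = [0.0] * 6


def mask_joints(
    sample: Dict[str, List[float]],
    masked: List[str],
    omit: bool = False,
) -> Dict[str, List[float]]:
    """Return a copy of `sample` with the named joints hidden.

    omit=False replaces each masked joint's features with MASK_VALUE;
    omit=True drops the masked joint from the input altogether.
    """
    out: Dict[str, List[float]] = {}
    for name, features in sample.items():
        if name in masked:
            if not omit:
                out[name] = list(MASK_VALUE)
        else:
            out[name] = list(features)
    return out


# Illustrative sample: x, y, z position followed by three rotation values.
sample = {
    "head": [0.0, 1.7, 0.0, 0.0, 0.0, 0.0],
    "left_wrist": [-0.3, 1.0, 0.2, 0.0, 0.0, 0.0],
    "left_elbow": [-0.3, 1.3, 0.0, 0.0, 0.0, 0.0],
}
replaced = mask_joints(sample, ["left_elbow"])
omitted = mask_joints(sample, ["left_elbow"], omit=True)
```

Replacing keeps the input a fixed size, which suits models with a fixed input layout; omitting only works for models that accept a variable number of joints.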

In other words, a machine learning model is trained to accurately predict the pose of an articulated object based on progressively less information about the position/orientation of the various movable parts of the articulated object.

With this approach, the machine learning model can accurately predict the pose of an articulated object at runtime from only sparse input data. Microsoft notes that the technique can accurately reproduce the real-world pose of an articulated object, such as a human user, without requiring extensive information about the orientation of each joint.

In other words, the invention can provide technical advantages that improve human-computer interaction by more accurately reproducing the real-world pose of a human user. These benefits include improving the immersion of virtual reality experiences and improving the accuracy of gesture recognition systems.

In addition, the described technology can reduce the consumption of computing resources while accurately reproducing the real posture of human users by reducing the amount of data that must be collected as input to the posture prediction process.

Figure 2 shows an example method 200 for predicting the pose of an articulated object.

At 202, the system receives spatial information for n joints of the articulated object, where the n joints are fewer than all the joints of the articulated object. The spatial information for a joint may be expressed as the six-degree-of-freedom position and orientation of a body part connected to that joint, from which the state of the joint can be inferred.
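
As a rough illustration of what six-degree-of-freedom spatial information for one joint might look like in code, consider the sketch below. The class name, the field names, the units, and the Euler-angle convention are all assumptions made for the example; the patent application does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class JointSpatialInfo:
    """Hypothetical 6-DoF pose of the body part connected to a joint."""

    joint: str
    position: Tuple[float, float, float]     # x, y, z (metres assumed)
    orientation: Tuple[float, float, float]  # roll, pitch, yaw (radians assumed)

    def as_vector(self) -> Tuple[float, ...]:
        """Flatten position and orientation into one 6-element vector."""
        return self.position + self.orientation


# Illustrative values only.
head = JointSpatialInfo("head", (0.0, 1.7, 0.0), (0.0, 0.1, 0.0))
head_vector = head.as_vector()
```

Three position values plus three orientation values give the six degrees of freedom mentioned above.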

As an example, the n joints may include a head joint of the human body, and the spatial information of the head joint may describe the parameters of the human head in detail. In addition, the n joints may include one or more wrist joints of the human body, and the spatial information of those wrist joints may describe in detail the parameters of one or more hands of the human body.

Figure 3 shows a human user with a head 300 and two hands 302A and 302B. The computing system may receive spatial information for one or more joints of the human user, which may include the head and/or wrist joints.

The spatial information of the n joints of the articulated object can be derived from the positioning data output by one or more sensors. Sensors may be integrated into one or more devices held or worn by corresponding body parts of a human user.

For example, sensors may include one or more inertial measurement units integrated into a head-mounted display device and/or a handheld controller. As another example, a sensor may include one or more cameras.

Figure 3 schematically illustrates different types of sensors whose output may include, or be used to derive, spatial information. Specifically, the human user wears a head-mounted display device 304 on his or her head 300.

Additionally, the human user holds position sensors 306A and 306B, which may be configured to detect motion of the user's hands and report it to the head-mounted display device 304 and/or another computing system configured to receive spatial information.

Returning to Figure 2, at 204 the spatial information for the n joints is passed to a previously trained machine learning model. The model was trained to receive spatial information for n + m joints as input, where m is greater than or equal to 1. In other words, at runtime the model receives spatial information for fewer joints than it was trained on.

At 206, a pose prediction for the articulated object is received as output from the machine learning model. The prediction is based at least on the spatial information for the n joints, without spatial information for the m joints. In other words, even though spatial information for the m joints is not provided, the machine learning model can predict the complete pose of the articulated object.
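
Steps 204 and 206 can be pictured with the sketch below, which assumes a model with a fixed input layout of n + m joint slots and fills the unobserved slots with a predefined value. The joint order, the all-zero fill value, and `placeholder_model` are hypothetical; `placeholder_model` merely stands in for a trained network and does no real prediction.

```python
from typing import Dict, List, Sequence

# Hypothetical input layout: n = 3 observed joints followed by m = 2 others.
JOINT_ORDER = ["head", "left_wrist", "right_wrist", "left_elbow", "right_elbow"]
FEATURES_PER_JOINT = 6
MASK_VALUE = [0.0] * FEATURES_PER_JOINT  # assumed predefined value


def build_model_input(observed: Dict[str, Sequence[float]]) -> List[float]:
    """Flatten observed joints into the n + m layout, masking the rest."""
    flat: List[float] = []
    for name in JOINT_ORDER:
        flat.extend(observed.get(name, MASK_VALUE))
    return flat


def placeholder_model(features: Sequence[float]) -> Dict[str, List[float]]:
    """Stand-in for a trained model: returns a zeroed pose for every joint."""
    return {name: [0.0] * FEATURES_PER_JOINT for name in JOINT_ORDER}


# Step 204: only head and wrist data are available at runtime.
observed = {
    "head": [0.0, 1.7, 0.0, 0.0, 0.0, 0.0],
    "left_wrist": [-0.3, 1.0, 0.2, 0.0, 0.0, 0.0],
    "right_wrist": [0.3, 1.0, 0.2, 0.0, 0.0, 0.0],
}
features = build_model_input(observed)

# Step 206: the output covers every joint, not only the observed ones.
predicted_pose = placeholder_model(features)
```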

Figure 4 schematically shows an example machine learning model 400 to illustrate this process.

In Figure 4, the machine learning model receives spatial information 402, corresponding to three different joints J1, J2, and J3. The spatial information of the joint may take the form of any suitable computer data that specifies or can be used to derive the position and/or orientation of the body part connected to the joint.

For example, the spatial information may directly specify the position and orientation of a body part, and/or the spatial information may specify one or more rotations of a joint relative to one or more rotation axes. In Figure 4, joints J1, J2, J3 correspond to a human user's head joint 404A and two wrist joints 404B/404C, as shown by the shaded circles superimposed on the user's body.
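
When spatial information is given as a rotation about an axis, it can be turned into a rotation matrix with the standard axis-angle (Rodrigues) formula. The sketch below is a generic illustration of that formula and is not taken from the patent application; the function names are made up for the example.

```python
import math
from typing import List, Sequence


def rotation_matrix(axis: Sequence[float], angle: float) -> List[List[float]]:
    """Build a 3x3 rotation matrix for `angle` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm  # normalise the axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def rotate(matrix: List[List[float]], vector: Sequence[float]) -> List[float]:
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# A quarter turn about the z axis carries the x axis onto the y axis.
quarter_turn = rotation_matrix((0.0, 0.0, 1.0), math.pi / 2)
rotated = rotate(quarter_turn, (1.0, 0.0, 0.0))
```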

In this example, the n joints include three joints, corresponding to the head and wrist joints of the human body. Based on the input spatial information 402, the machine learning model outputs a predicted pose 406 of the articulated object.

In addition, the machine learning model can output predicted spatial information for the joints of a virtual articulated representation. Human users may be represented by avatars with cartoonish or non-human proportions. For example, the predicted spatial information may correspond to the joints of an SMPL (Skinned Multi-Person Linear) body model.

In other words, the joints of the virtual articulated representation do not need to correspond 1:1 with the joints of the articulated object. The spatial information predicted by the machine learning model may therefore be for joints that do not directly correspond to the n + m joints of the articulated object. For example, a virtual representation may have fewer spinal joints than the articulated object.

Machine learning models can be trained in any suitable way. In one embodiment, the machine learning model may have been previously trained using training input data with ground truth labels for articulated objects.

In other words, training spatial information for the joints of the articulated object can be provided to the machine learning model, paired with ground truth labels specifying the actual pose of the articulated object that corresponds to that spatial information.
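
One common way to compare a predicted pose against a ground truth label is the mean distance between corresponding joint positions. The source does not say which error measure is used, so the metric below is an assumption chosen purely for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def mean_joint_position_error(
    predicted: Sequence[Point], ground_truth: Sequence[Point]
) -> float:
    """Average Euclidean distance between matching joints of two poses."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and have the same joint count")
    total = sum(math.dist(p, t) for p, t in zip(predicted, ground_truth))
    return total / len(predicted)


# Illustrative two-joint poses: the second joint is off by 2 units along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
error = mean_joint_position_error(predicted, ground_truth)
```

A training loop would aim to drive a measure like this towards zero across the training set.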

As mentioned above, the machine learning model can be trained to receive spatial information for n + m joints as input. In the first training iteration, the model is provided with training input data for all n + m joints. Over a subsequent series of training iterations, the training input data for the m joints can be progressively masked.

For example, in the second training iteration, the first joint among the m joints can be masked, where the spatial information of the joint in the training data set is replaced with a predefined value representing the masked joint, or simply omitted.

Then, in the third training iteration, the second of the m joints can be masked, and so on, until all m joints are masked and only the spatial information of the n joints is provided to the machine learning model.
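
The progressive schedule described above, with one more of the m joints masked at each iteration, can be sketched as a simple generator. The joint names are hypothetical, and masking exactly one additional joint per iteration is a simplification of the description.

```python
from typing import Iterator, List, Sequence


def masking_schedule(m_joints: Sequence[str]) -> Iterator[List[str]]:
    """Yield, for each training iteration, the joints masked so far.

    The first iteration masks nothing; each later iteration masks one
    more of the m joints until all of them are hidden.
    """
    for masked_count in range(len(m_joints) + 1):
        yield list(m_joints[:masked_count])


# Hypothetical m joints, in the order they get masked.
m_joints = ["left_elbow", "right_elbow", "left_knee"]
schedule = list(masking_schedule(m_joints))

for iteration, masked in enumerate(schedule, start=1):
    print(f"iteration {iteration}: masked = {masked}")
```

In the last iteration every one of the m joints is masked, so the model sees only the n joints that will be available at runtime.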

This process is illustrated in Figures 5A-5D. Specifically, in Figure 5A, machine learning model 400 is provided with a training input data set. In this embodiment, the training input data includes spatial information corresponding to a plurality of different poses of the articulated object, including a first pose 502A and a second pose 502B.

In Figure 5A, the machine learning model is provided with spatial information for all n + m joints of the articulated object. In this simplified representation of the human body, each joint is drawn as a circle with a white fill pattern. In Figure 5B, however, joint 504A has been masked, as indicated by the black fill pattern of the circle representing joint 504A.

In other words, Figure 5A represents the initial iteration of the training process, in which the spatial information of all n + m joints is provided to the machine learning model. Figure 5B shows the second iteration of the training process, in which the first joint 504A of the m joints is masked.

In Figure 5C, the second joint 504B among the m joints of the articulated representation is masked. Similarly, in Figure 5D, the third joint among the m joints is masked. Training iterations can continue in this way until the spatial information of each of the m joints is masked and only the spatial information of the n joints is provided to the machine learning model.

The scenario above describes the case where the articulated object is the entire human body. However, articulated objects can also take other forms.

As shown in Figure 7, the articulated object is the human hand, not the entire human body. Specifically, Figure 7 shows an example machine learning model 700.

The machine learning model 700 receives spatial information for joints J1, J2, and J3, which correspond to the three joints 704A-C of an articulated object, in this example taking the form of a human hand 706.

In this case, the n joints include one or more finger joints of the human hand. The spatial information of those finger joints details the parameters of one or more fingers or finger segments of the hand. For example, the spatial information may specify the position/orientation of the fingers of the hand, and/or the rotations applied to the joints of the hand.

Any suitable method may be used to collect the spatial information for the joints, such as via position sensor 708. For example, a position sensor could take the form of a camera configured to image the hand. As another example, a position sensor may include a suitable radio frequency antenna configured to expose the surface of the hand to an electromagnetic field and evaluate how the movement and proximity of conductive human skin affect the electromagnetic field impedance at the antenna.

Based on the input spatial information 702, the machine learning model outputs a set of predicted spatial information 710. Spatial information 710 may be used to construct the predicted pose of the articulated object. As mentioned earlier, this spatial information can represent the position and orientation of body parts of the articulated object.
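
To give a sense of how predicted spatial information might be turned into a pose, the sketch below places the segments of a single finger from a chain of joint rotations. It is a simplified planar example under stated assumptions: the predicted values are taken to be per-joint flexion angles, the segment lengths are assumed known, and none of this is specified in the source.

```python
import math
from typing import List, Sequence, Tuple


def finger_points(
    segment_lengths: Sequence[float],
    joint_angles: Sequence[float],
    base: Tuple[float, float] = (0.0, 0.0),
) -> List[Tuple[float, float]]:
    """Chain planar joint rotations to locate the end of each finger segment.

    Each angle (radians) is relative to the previous segment's direction.
    """
    points = [base]
    x, y = base
    heading = 0.0
    for length, angle in zip(segment_lengths, joint_angles):
        heading += angle  # accumulate rotation along the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


# Two unit-length segments: the first straight, the second bent 90 degrees.
pose = finger_points([1.0, 1.0], [0.0, math.pi / 2])
```

The same idea extends to three dimensions by replacing the single heading angle with a rotation per joint.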

Related Patents: Microsoft Patent | Pose prediction for articulated object

Microsoft originally filed the patent application titled "Pose prediction for articulated object" in June 2022, and the application was recently published by the US Patent and Trademark Office.
