TWI793579B - Method and system for simultaneously tracking 6 dof poses of movable object and movable camera - Google Patents

Method and system for simultaneously tracking 6 dof poses of movable object and movable camera

Publication number
TWI793579B
Authority
TW
Taiwan
Prior art keywords
movable
camera
freedom
orientations
movable object
Prior art date
Application number
TW110114401A
Other languages
Chinese (zh)
Other versions
TW202203644A (en
Inventor
汪德美
謝中揚
Original Assignee
財團法人工業技術研究院
Priority date
Filing date
Publication date
Application filed by 財團法人工業技術研究院 filed Critical 財團法人工業技術研究院
Priority to CN202110554564.XA priority Critical patent/CN113920189A/en
Priority to US17/369,669 priority patent/US11506901B2/en
Publication of TW202203644A publication Critical patent/TW202203644A/en
Application granted granted Critical
Publication of TWI793579B publication Critical patent/TWI793579B/en

Landscapes

  • Image Analysis (AREA)

Abstract

A method and a system for simultaneously tracking the 6 DoF poses of a movable object and a movable camera are provided. The method includes the following steps. A series of images is captured by a movable camera. Several environmental feature points are extracted from the images. The environmental feature points are matched to calculate several camera matrices of the movable camera, and the 6 DoF poses of the movable camera are then calculated from the camera matrices. At the same time, several feature points of the movable object are inferred from the images captured by the movable camera. The coordinates of the feature points of the movable object are corrected using the camera matrices corresponding to the images together with predefined geometric and temporal constraints. Then, the 6 DoF poses of the movable object are calculated.

Description

Method and system for simultaneously tracking 6 DoF poses of movable object and movable camera

The present disclosure relates to a method and a system for simultaneously tracking the six-degree-of-freedom (6 DoF) poses of a movable object and a movable camera.

Existing tracking technologies such as Simultaneous Localization And Mapping (SLAM) can track the 6 DoF pose of a movable camera but cannot track a movable object at the same time. The reason is that the movable camera needs stable environmental feature points for localization, whereas the feature points of a movable object are unstable, so they are usually discarded and cannot be used for tracking.

On the other hand, technologies for tracking movable objects ignore environmental feature points to avoid interference, so none of them can track a movable camera.

The features learned by most neural networks are used to classify objects rather than to calculate their 6 DoF poses. Some neural networks for recognizing postures or gestures can output only the 2D coordinates (x, y) of skeletal joints on the image plane. Even if depth sensing is used to estimate the distance between a joint and the camera, the result is not a true 3D coordinate in space, and the 6 DoF pose in space cannot be calculated from it.

Motion capture systems use multiple fixed cameras to track joint positions, generally with markers attached to the joints to reduce errors; they do not track the 6 DoF pose of a movable camera.

Therefore, among currently known technologies, none can track a movable object and a movable camera at the same time.

The rapid development of mixed reality (MR) has prompted researchers to develop technologies that can track the 6 DoF poses of a movable camera and a movable object at the same time. In MR applications, the camera mounted on the MR glasses moves with the user's head, so the 6 DoF pose of the camera must be known to determine the user's position and direction. The object the user interacts with also moves, so its 6 DoF pose must be known as well in order to display virtual content at the proper position and in the proper direction. A user wearing MR glasses may walk freely indoors or outdoors, which makes it difficult to place markers in the environment. Moreover, for a better user experience, no special markers are attached to the object; only the features of the object itself are used.

Although these conditions make 6 DoF pose tracking more difficult, we have developed a technology that tracks a movable object and a movable camera at the same time, in order to solve the above problems and serve more applications.

The technology proposed in this disclosure can be applied, for example, as follows. When a user wears MR glasses, one or more virtual screens can be displayed beside the real screen of a handheld device such as a mobile phone, and the default position, direction, and size of each virtual screen are set according to the 6 DoF poses of the mobile phone and of the camera on the MR glasses. Through 6 DoF pose tracking, the virtual screens can be rotated and moved automatically so that they stay consistent with the viewing direction. The disclosed technology offers the user the following benefits: (1) a small physical screen is expanded into a large virtual screen; (2) a single physical screen is extended to multiple virtual screens so that more applications can be viewed at once; (3) the content of the virtual screens cannot be seen by others.

According to one embodiment of the present disclosure, a method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera is provided. The method includes the following steps. A series of images is captured by the movable camera. Several environmental feature points are extracted from the images and matched to calculate several camera matrices of the movable camera, and the 6 DoF poses of the movable camera are then calculated from the camera matrices. At the same time, several feature points of the movable object are inferred from the images captured by the movable camera. The coordinates of these feature points are corrected using the camera matrix corresponding to each image together with predefined geometric and temporal constraints. The 6 DoF poses of the movable object are then calculated from the corrected feature point coordinates and their corresponding camera matrices.

According to another embodiment of the present disclosure, a system for simultaneously tracking the 6 DoF poses of a movable object and a movable camera is provided. The system includes a movable camera, a movable camera 6 DoF pose calculation unit, and a movable object 6 DoF pose calculation unit. The movable camera captures a series of images. The movable camera 6 DoF pose calculation unit extracts several environmental feature points from the images, matches them to calculate several camera matrices of the movable camera, and then calculates the 6 DoF poses of the movable camera from the camera matrices. The movable object 6 DoF pose calculation unit infers several feature points of the movable object from the images captured by the movable camera, corrects the coordinates of these feature points using the camera matrix corresponding to each image together with predefined geometric and temporal constraints, and then calculates the 6 DoF poses of the movable object from the corrected feature point coordinates and their corresponding camera matrices.

For a better understanding of the above and other aspects of the present disclosure, embodiments are described in detail below with reference to the accompanying drawings:

Please refer to FIGS. 1A and 1B, which illustrate how the disclosed technology for simultaneously tracking a movable object and a movable camera differs in application from the conventional technology. The technology proposed in this disclosure can be applied, for example, as shown in FIG. 1A: when a user wears MR glasses G1 (on which a movable camera 110 is mounted), one or more virtual screens can be displayed beside the real screen of a handheld device such as a mobile phone P1 (that is, the movable object 900), and the default positions, directions, and sizes of the virtual screens D2 and D3 are set according to the 6 DoF poses of the mobile phone P1 and of the movable camera 110 on the MR glasses G1. The movable camera 110 is "movable" in the sense that it moves relative to a stationary object in three-dimensional space. Through 6 DoF pose tracking, the virtual screens D2 and D3 can be rotated and moved automatically so that they stay consistent with the viewing direction (as shown in FIG. 1B), and the user can also adjust the positions and angles of the virtual screens D2 and D3 to taste. With the conventional technology, the displayed virtual screen moves with the MR glasses G1 and does not follow the 6 DoF pose of the object.

The disclosed technology offers the user the following benefits: (1) the small physical screen D1 is expanded into a large virtual screen D2; (2) the single physical screen D1 is extended to multiple virtual screens D2 and D3 so that more applications can be viewed at once; (3) the content of the virtual screens D2 and D3 cannot be seen by others. The technology can also be applied to a tablet or a notebook computer, with virtual screens placed beside its physical screen. Besides a physical screen, the movable object 900 can be any other object on which features can be defined, such as a car, a bicycle, or a pedestrian. The movable camera 110 is not limited to the camera on the MR glasses G1; it can also be a camera on an autonomous mobile robot or a vehicle.

Please refer to FIG. 2A, which illustrates a system 100 and a method for simultaneously tracking the 6 DoF poses of a movable object 900 (labeled in FIG. 1A) and a movable camera 110 according to an embodiment. The movable object 900 is, for example, the mobile phone P1 of FIG. 1A; the movable camera 110 is, for example, the camera on the MR glasses G1 of FIG. 1A. The system 100 includes the movable camera 110, a movable camera 6 DoF pose calculation unit 120, and a movable object 6 DoF pose calculation unit 130. The movable camera 110 captures a series of images IM and can be mounted on a head-mounted stereoscopic display, a mobile device, a computer, or a robot. The movable camera 6 DoF pose calculation unit 120 and/or the movable object 6 DoF pose calculation unit 130 is, for example, a circuit, a chip, a circuit board, program code, or a storage device storing program code.

The movable camera 6 DoF pose calculation unit 120 includes an environmental feature extraction unit 121, a camera matrix calculation unit 122, and a camera pose calculation unit 123, each of which is implemented as, for example, a circuit, a chip, a circuit board, program code, or a storage device storing program code. The environmental feature extraction unit 121 extracts several environmental feature points EF from the images IM. The camera matrix calculation unit 122 matches the environmental feature points EF to calculate several camera matrices CM of the movable camera 110. The camera pose calculation unit 123 then calculates the 6 DoF pose CD of the movable camera 110 from the camera matrices CM.
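The last step of this pipeline can be sketched as follows. This is only an illustration: it assumes each camera matrix CM is available as a 3x4 extrinsic matrix [R | t] that maps world coordinates to camera coordinates, which the text does not specify. The function name and the ZYX Euler-angle convention are likewise hypothetical choices.

```python
import math

def camera_pose_from_extrinsic(cm):
    """Return (position, (yaw, pitch, roll)) from a 3x4 matrix [R | t].

    Assumption: cm maps world points to camera coordinates,
    x_cam = R * x_world + t, with R a proper rotation matrix.
    """
    r = [row[:3] for row in cm]
    t = [row[3] for row in cm]
    # Camera centre in world coordinates: C = -R^T t
    position = tuple(-sum(r[k][i] * t[k] for k in range(3)) for i in range(3))
    # The camera-to-world rotation is R^T; read ZYX Euler angles from it.
    rt = [[r[j][i] for j in range(3)] for i in range(3)]
    yaw = math.atan2(rt[1][0], rt[0][0])
    pitch = math.asin(max(-1.0, min(1.0, -rt[2][0])))
    roll = math.atan2(rt[2][1], rt[2][2])
    return position, (yaw, pitch, roll)
```

The three position components and three angles together form one possible 6 DoF representation; a quaternion or rotation matrix would serve equally well.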

The movable object 6 DoF pose calculation unit 130 includes an object feature coordinate inference unit 131, an object feature coordinate correction unit 132, and an object pose calculation unit 133, each of which is implemented as, for example, a circuit, a chip, a circuit board, program code, or a storage device storing program code. The object feature coordinate inference unit 131 infers several feature points OF of the movable object 900 from the images IM captured by the movable camera 110. The feature points OF are predefined and are compared against the images IM captured by the movable camera 110 to infer their coordinates. The movable object 900 is a rigid object.

Please refer to another embodiment, shown in FIG. 2B, in which the method for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110 includes a training stage ST1 and a tracking stage ST2. The object feature coordinate inference unit 131 uses a neural network inference model MD to infer the coordinates of the feature points OF of the movable object 900 from the images IM captured by the movable camera 110. The neural network inference model MD is trained in advance; its training data are obtained by manual or automatic labeling, and the geometric constraint GC and the temporal constraint TC are added during training.

The object feature coordinate correction unit 132 corrects the coordinates of the feature points OF of the movable object 900 using the camera matrix CM corresponding to each image IM together with the predefined geometric constraint GC and temporal constraint TC. Specifically, the unit 132 uses the camera matrices CM to project the 2D coordinates of the feature points OF to the corresponding 3D coordinates. According to the geometric constraint GC, it deletes any feature point OF whose 3D coordinate deviation exceeds a predetermined value, or fills in the coordinates of an undetected feature point OF from the coordinates of adjacent feature points OF. In addition, according to the temporal constraint TC, the unit 132 compares the coordinate changes of the feature points OF across multiple consecutive images IM and, for any feature point OF whose coordinate change exceeds a predetermined value, corrects its coordinates using the coordinates of the corresponding feature points OF in those consecutive images IM. The result is the coordinates of the corrected feature points OF'.
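The temporal part of this correction might look like the sketch below. The text does not say how an outlying coordinate is replaced, so the rule used here is purely an assumption: an outlier is replaced by the midpoint of the previous corrected position and the next raw position. The function and parameter names are hypothetical.

```python
import math

def correct_track(track, max_jump):
    """Replace outlying positions in one feature point's track.

    track: list of (x, y), one entry per consecutive image.
    max_jump: the predetermined threshold on frame-to-frame change.
    A position farther than max_jump from the previous (already
    corrected) position is treated as an outlier and replaced by the
    midpoint of the previous position and the next raw position, or by
    the previous position when there is no next frame.
    """
    fixed = [track[0]]
    for i in range(1, len(track)):
        prev, cur = fixed[-1], track[i]
        if math.dist(prev, cur) > max_jump:
            if i + 1 < len(track):
                nxt = track[i + 1]
                cur = ((prev[0] + nxt[0]) / 2, (prev[1] + nxt[1]) / 2)
            else:
                cur = prev
        fixed.append(cur)
    return fixed
```

This simple rule handles isolated outliers only; if two consecutive detections are wrong, the midpoint inherits the second error, so a real system would likely need a more robust filter.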

Please refer to FIG. 3A, which illustrates the correspondences of the environmental feature points and of the movable-object feature points across a series of images captured by the movable camera. For a non-planar object, the orientation and position can be defined through the centroid of several selected feature points OF. Please refer to FIG. 3B, which illustrates the position and orientation of an object in space. A best-fit plane PL is fitted to the feature points OF; the center point C of the best-fit plane PL represents the position (x, y, z) of the object in 3D space, and the normal vector N of the best-fit plane PL represents the orientation of the object.

The geometric constraint GC is defined in 3D space: for a rigid object, the distances between the feature points OF should be fixed. After projection onto the 2D image plane through the camera matrix, the positions of all feature points OF must stay within a reasonable range.
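One way to test the fixed-distance property is sketched below. It assumes the reference distances between feature points have been measured on the rigid object beforehand; the majority-vote rule for deciding which point is at fault is an illustrative assumption, not something the text prescribes.

```python
import math
from itertools import combinations

def violating_points(points, ref_dist, tol):
    """Return indices of 3D feature points that break the rigid-distance rule.

    points: list of (x, y, z), or None for an undetected point.
    ref_dist: dict {(i, j): distance} with i < j, measured on the object.
    A point is flagged when more than half of its checked pairwise
    distances deviate from the reference by more than tol.
    """
    bad = [0] * len(points)
    checked = [0] * len(points)
    for i, j in combinations(range(len(points)), 2):
        if points[i] is None or points[j] is None:
            continue
        err = abs(math.dist(points[i], points[j]) - ref_dist[(i, j)])
        for k in (i, j):
            checked[k] += 1
            bad[k] += err > tol
    return [k for k in range(len(points)) if checked[k] and 2 * bad[k] > checked[k]]
```

Flagged points could then be deleted or re-estimated from their neighbors, in line with the correction described for unit 132.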

Please refer to FIGS. 4A and 4B, which illustrate the correction of feature point coordinates. The camera matrix CM is used not only to calculate the 6 DoF poses of the movable camera 110 and the movable object 900, but also to apply the 3D geometric constraint GC, either correcting the coordinates of a feature point OF* projected onto the 2D image plane (as shown in FIG. 4A) or adding the coordinates of a missing feature point OF** (as shown in FIG. 4B).

The object pose calculation unit 133 then calculates the 6 DoF pose OD of the movable object 900 from the coordinates of the corrected feature points OF' and their corresponding camera matrices CM. For a planar movable object, the feature points OF are used to calculate a best-fit plane, and the 6 DoF pose OD of the movable object 900 is defined by the center point and the normal vector of that plane. For a non-planar movable object, the 6 DoF pose OD of the movable object 900 is defined by the centroid of the 3D coordinates of the feature points OF'.
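For the planar case, a center point and normal can be obtained as in the sketch below. The text only says "best-fit plane" and does not name a fitting method; this sketch swaps in Newell's method, which works on corner points given in order around the object (such as the four screen corners) and is not a least-squares fit. The function name is hypothetical.

```python
import math

def plane_pose(corners):
    """Centre and unit normal of the plane through ordered 3D corner points.

    Uses Newell's method (an assumption, see the lead-in). The normal
    follows the right-hand rule with respect to the corner order.
    """
    n = len(corners)
    center = tuple(sum(p[a] for p in corners) / n for a in range(3))
    nx = ny = nz = 0.0
    for i in range(n):
        x0, y0, z0 = corners[i]
        x1, y1, z1 = corners[(i + 1) % n]
        nx += (y0 - y1) * (z0 + z1)
        ny += (z0 - z1) * (x0 + x1)
        nz += (x0 - x1) * (y0 + y1)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("corner points are degenerate (collinear or coincident)")
    return center, (nx / length, ny / length, nz / length)
```

A center and a normal fix only five degrees of freedom; rotation about the normal would need an additional in-plane reference, for example the direction from the first corner to the second.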

As shown in FIG. 2B, the training stage ST1 of the system 100 for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110 includes a training data generation unit 140 and a neural network training unit 150, each of which is implemented as, for example, a circuit, a chip, a circuit board, program code, or a storage device storing program code.

The neural network training unit 150 trains the neural network inference model MD, which is used to infer the positions and the order of the feature points OF of the movable object 900. In the training data generation unit 140, the training data can be images in which the positions and the order of the feature points are labeled manually, or labeled images that have been augmented automatically. Please refer to FIGS. 5A to 5D, which show various training data using a mobile phone as an example. In these figures, the feature points OF are defined by the four inner corners of the physical screen D4. With the physical screen D4 in portrait orientation, the four feature points OF are ordered clockwise from the upper left corner to the lower left corner. As shown in FIG. 5A, the four feature points OF have, in order, the coordinates [Figure 02_image001], [Figure 02_image003], [Figure 02_image005], and [Figure 02_image007]. Even when the physical screen D4 is rotated to landscape orientation, the order of the feature points OF stays the same (as shown in FIG. 5B). In some cases, not all feature points OF can be captured, so the training data need to include images such as those in FIG. 5C or FIG. 5D, in which some feature points OF are missing. As shown in FIGS. 5A and 5D, the labeling distinguishes the front of the mobile phone (the screen) from the back, and only the front is labeled. For higher accuracy, each image is magnified during labeling of the feature points OF until individual pixels can be seen clearly.

Because manual labeling is very time-consuming, automatic augmentation is needed to expand the training data to the order of millions of images. Methods for automatically augmenting manually labeled images include scaling and rotation, mapping by perspective projection, conversion to different colors, brightness and contrast adjustment, addition of motion blur and noise, occlusion of some feature points by other objects (as shown in FIGS. 5C and 5D), changes to the content displayed on the screen, and replacement of the background. The positions of the manually labeled feature points OF are then recalculated in the augmented images according to the applied transformation.
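Recalculating the labeled positions after an augmentation can be sketched for the scale-and-rotate case. The function name, the parameters, and the choice of a rotation about a given center are illustrative assumptions; other augmentations such as a perspective mapping would need the corresponding transformation.

```python
import math

def transform_keypoints(points, scale, angle_deg, center):
    """Map labelled (x, y) points through a scale-and-rotate augmentation.

    points: list of (x, y), or None for a feature point not in the image.
    The same scale and rotation about `center` must have been applied to
    the image itself. The order of the points is preserved, so the
    feature point sequence stays valid after augmentation.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a) * scale, math.sin(a) * scale
    cx, cy = center
    out = []
    for p in points:
        if p is None:
            out.append(None)
            continue
        dx, dy = p[0] - cx, p[1] - cy
        out.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return out
```

In image coordinates the y axis usually points down, so a positive angle here appears clockwise on screen; the sign convention has to match the one used when the image was rotated.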

Please refer to FIG. 6, which illustrates that the main structure of the neural network in the training stage comprises feature extraction and feature point coordinate prediction. The feature extractor ET can be a deep residual network such as ResNet or another network with a similar function. The extracted feature vector FV is passed to the feature point coordinate prediction layer FL, which infers the coordinates of the feature points OF (for example, the coordinates of a feature point OF in the current image are denoted [Figure 02_image009], and those in the previous image are denoted [Figure 02_image011]). In addition to the feature point prediction layer, this embodiment adds a geometric constraint layer GCL and a temporal constraint layer TCL to reduce erroneous predictions. In the training stage, each layer calculates a loss value LV between the predicted value and the ground truth according to its loss function; these loss values are then multiplied by their respective weights and summed to obtain the overall loss value OLV.
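The accumulation of the overall loss value can be written out as a short sketch. The layer names used as dictionary keys and the example weights are hypothetical; the text does not give the actual weights.

```python
def overall_loss(losses, weights):
    """Weighted sum of per-layer loss values LV, giving the overall loss OLV.

    losses and weights are dicts keyed by layer name; every loss must
    have a matching weight.
    """
    return sum(weights[name] * value for name, value in losses.items())

# Example with made-up numbers for the three layers named in the text.
example = overall_loss(
    {"prediction": 2.0, "geometric": 1.0, "temporal": 4.0},
    {"prediction": 1.0, "geometric": 0.5, "temporal": 0.25},
)
```

With these illustrative numbers, `example` evaluates to 2.0 + 0.5 + 1.0 = 3.5.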

Please refer to FIG. 7, which illustrates how the displacement of a feature point between two adjacent images is calculated. The coordinates of a feature point OF in the current image are [Figure 02_image009], the coordinates of the same feature point OF in the previous image are [Figure 02_image011], and the displacement between them is defined as [Figure 02_image013].

Unreasonable displacements are restricted by a penalty value [Figure 02_image015]. The penalty value [Figure 02_image015] is calculated, for example, according to equation (1):

[Figure 02_image017] (1)

Here, m is the mean displacement calculated for each feature point OF over all training data, s is the standard deviation of the displacement, and d is the displacement of the same feature point OF between the previous image and the current image. When d ≤ m, the displacement is within the acceptable range and there is no penalty (that is, [Figure 02_image019]). Please refer to FIG. 8, which illustrates the temporal constraint TC and how the penalty value [Figure 02_image015] is calculated and determined. The center of the circle represents the coordinates [Figure 02_image021] of the feature point OF in the previous image, and the area of the circle represents the acceptable displacement of the feature point OF in the current image. If the predicted coordinates [Figure 02_image023] of the feature point OF in the current image fall inside the circle (displacement d' ≤ m), the penalty value [Figure 02_image015] is zero. If the predicted coordinates [Figure 02_image025] of the feature point OF in the current image fall outside the circle (displacement d'' > m), the penalty value [Figure 02_image015] is [Figure 02_image027]. The further the displacement exceeds the radius of the circle (that is, m), the larger the penalty value [Figure 02_image015] and the larger the loss value obtained during training, which keeps the coordinates of the feature points OF within a reasonable range.
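The behavior described here can be sketched in code. Equation (1) survives only as an image placeholder in this text, so its exact form is not known; the sketch assumes a penalty of (d - m) / s for d > m, which matches the stated properties (zero when d ≤ m, growing as d exceeds m) but may differ from the actual equation. The displacement is assumed to be the Euclidean distance between the two coordinates, and the function names are hypothetical.

```python
import math

def displacement(prev_xy, cur_xy):
    """Displacement d of one feature point between two adjacent images
    (assumed here to be the Euclidean distance)."""
    return math.hypot(cur_xy[0] - prev_xy[0], cur_xy[1] - prev_xy[1])

def temporal_penalty(d, m, s):
    """Penalty for displacement d, given the training mean m and standard
    deviation s of the displacement.

    Zero when d <= m, as the text states. The (d - m) / s growth for
    d > m is an ASSUMED form, not the equation from the source.
    """
    if s <= 0:
        raise ValueError("standard deviation s must be positive")
    return 0.0 if d <= m else (d - m) / s
```

For example, with m = 5 and s = 2, a displacement of 3 gives a penalty of 0.0, while a displacement of 9 gives (9 - 5) / 2 = 2.0 under the assumed form.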

Please refer to FIG. 9, which illustrates an incorrect displacement that arises when the temporal constraint TC is absent. The left side of FIG. 9 shows the previous image and the right side shows the current image. In the previous image, a feature point OF with coordinates [Figure 02_image003] is identified. In the current image, however, a feature point OF with coordinates [Figure 02_image029] is identified from a reflection. The displacement between coordinates [Figure 02_image029] and coordinates [Figure 02_image003] exceeds the range set by the temporal constraint TC, so the coordinates [Figure 02_image029] can be judged incorrect.

As shown in FIG. 2B, in the tracking stage ST2 the movable camera 110 captures a series of images IM. Several environmental feature points EF are extracted from these images and used to calculate the corresponding camera matrices CM and the 6 DoF pose CD of the movable camera 110. At the same time, the coordinates of the feature points OF of the movable object 900 are inferred by the neural network inference model MD and then transformed and corrected through the camera matrices CM to obtain the 6 DoF pose OD of the movable object 900.

Please refer to FIG. 10, which illustrates a system 200 and a method, with an added incremental learning stage ST3, for simultaneously tracking the 6 DoF poses of the movable object 900 (labeled in FIG. 1A) and the movable camera 110. The system 200 includes an automatic augmentation unit 260 and a weight adjustment unit 270, each of which is implemented as, for example, a circuit, a chip, a circuit board, program code, or a storage device storing program code.

In the embodiment of FIG. 10, the training data of the neural network inference model MD consist of manually labeled and automatically augmented images in the training stage, and of automatically labeled and automatically augmented images in the incremental learning stage.

While the movable object 900 is being tracked, the neural network inference model MD performs incremental learning in the background. The training data for incremental learning include the images IM captured by the movable camera 110 and the images IM' that the automatic augmentation unit 260 generates automatically from the images IM. The automatic augmentation unit 260 also uses the coordinates of the corrected feature points OF corresponding to the images IM and IM' in place of manual labels, as the ground truth of the feature point coordinates. The weight adjustment unit 270 adjusts the weights in the neural network inference model MD to update it to the neural network inference model MD', so that the model adapts to the usage scenario and tracks the 6 DoF pose OD of the movable object 900 accurately.

In addition, referring to FIG. 11, a system 300 and method for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110, applied to MR glasses, are shown. The system 300 includes a pose correction unit 310, a pose stabilization unit 320, a viewing axis calculation unit 330, a screen pose calculation unit 340, and a stereoscopic image generation unit 350. The pose correction unit 310 includes a cross-comparison unit 311 and a correction unit 312. The stereoscopic image generation unit 350 includes an image generation unit 351 and an imaging unit 352. Each of these units may be implemented as, for example, a circuit, a chip, a circuit board, program code, or a storage device storing program code.

As the movable camera 110 and the movable object 900 move, their 6 DoF poses CD and OD need to be cross-compared and corrected (as shown in FIG. 8). The cross-comparison unit 311 of the pose correction unit 310 cross-compares the 6 DoF pose OD of the movable object 900 with the 6 DoF pose CD of the movable camera 110. The correction unit 312 corrects the 6 DoF pose OD of the movable object 900 and the 6 DoF pose CD of the movable camera 110.

Slight, unconscious head movements would otherwise trigger recalculation of the 6 DoF poses of the movable camera and the movable object, causing the virtual screen D2 (shown in FIG. 1A) to shake along with the head and induce dizziness. To reduce this effect, the pose stabilization unit 320 leaves the 6 DoF pose OD of the movable object 900 and the 6 DoF pose CD of the movable camera 110 unchanged when the change in the 6 DoF pose OD of the movable object 900 or in the 6 DoF pose CD of the movable camera 110 is less than a preset value.
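
The stabilization rule amounts to a dead band on pose changes. The Python sketch below illustrates one possible form under stated assumptions: a pose is represented as a tuple `(x, y, z, roll, pitch, yaw)` in meters and radians, the translation and rotation changes are tested against separate thresholds, and the rule is applied to one pose at a time. The representation, the threshold values, and the names `pose_change` and `stabilize` are all hypothetical; the text only specifies that poses are left unchanged when the change is below a preset value.

```python
import math


def _wrap(angle):
    """Wrap an angle difference into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def pose_change(previous, current):
    """Return (translation change, largest rotation change) between two
    poses given as (x, y, z, roll, pitch, yaw)."""
    shift = math.dist(previous[:3], current[:3])
    turn = max(abs(_wrap(c - p)) for p, c in zip(previous[3:], current[3:]))
    return shift, turn


def stabilize(previous, current, max_shift=0.005, max_turn=math.radians(0.5)):
    """Keep the previous pose when the change stays below both thresholds
    (assumed values: 5 mm and 0.5 degrees)."""
    shift, turn = pose_change(previous, current)
    if shift < max_shift and turn < max_turn:
        return previous
    return current
```

Calling `stabilize` once for the camera pose and once for the object pose on every frame would suppress jitter while still letting deliberate motion through; the thresholds trade responsiveness against stability.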

The viewing axis calculation unit 330 calculates the viewing axes of the user's two eyes from the 6 DoF pose CD of the movable camera 110.

The screen pose calculation unit 340 calculates the 6 DoF pose DD of the virtual screen D2 from the 6 DoF pose OD of the movable object 900 and the 6 DoF pose CD of the movable camera 110, so that the virtual screen D2 moves together with the movable object 900 (as shown in FIG. 1B), or so that the viewing angle at which the virtual screen D2 is presented changes with the 6 DoF pose of the movable camera 110.
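
One common way to realize this kind of calculation is to compose rigid transforms, as in the Python sketch below. It assumes that each pose is available as a 4x4 homogeneous matrix and that the user-selected placement of the screen is a fixed offset in the object's frame. These conventions and the names `matmul`, `rigid_inverse`, `translation`, and `screen_pose` are assumptions for illustration, not the disclosed method.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    """Invert a rigid transform [R | p]: the inverse is [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def translation(x, y, z):
    """Build a pure-translation transform."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def screen_pose(world_from_object, object_from_screen, world_from_camera):
    """Return the screen pose in the world frame and in the camera frame.

    world_from_object plays the role of OD, world_from_camera the role of CD,
    and object_from_screen is the assumed user-defined offset of the screen.
    """
    world_from_screen = matmul(world_from_object, object_from_screen)
    camera_from_screen = matmul(rigid_inverse(world_from_camera), world_from_screen)
    return world_from_screen, camera_from_screen
```

Because the screen is attached to the object through a fixed offset, any motion of the object carries the screen with it, while any motion of the camera changes only the camera-frame pose and therefore the viewing angle.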

The image generation unit 351 of the stereoscopic image generation unit 350 generates a left-eye image and a right-eye image of the virtual screen D2 from the 6 DoF pose DD of the virtual screen D2 and the optical parameters of a stereoscopic display (for example, the MR glasses G1 of FIG. 1A). The imaging unit 352 of the stereoscopic image generation unit 350 displays the stereoscopic image of the virtual screen D2 on the stereoscopic display (for example, the MR glasses G1 of FIG. 1A).
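
How left-eye and right-eye images differ can be illustrated with a simple pinhole model. The Python sketch below is a rough illustration only: it assumes the two eyes sit half an interpupillary distance to either side of the head origin along the x axis, look along +z, and share one focal length and principal point. The function `project_to_eyes` and its parameters `ipd`, `focal_px`, `cx`, and `cy` are hypothetical stand-ins for the optical parameters of the stereoscopic display.

```python
def project_to_eyes(point, ipd, focal_px, cx, cy):
    """Project a 3D point (head frame, meters, +z forward) into the left-eye
    and right-eye images of an assumed pinhole stereo pair.

    Returns ((u_left, v_left), (u_right, v_right)) in pixels.
    """
    x, y, z = point
    if z <= 0.0:
        raise ValueError("point must be in front of the viewer")
    half = ipd / 2.0
    # The left eye sits at x = -half, so the point appears shifted by +half.
    left = (focal_px * (x + half) / z + cx, focal_px * y / z + cy)
    # The right eye sits at x = +half, so the point appears shifted by -half.
    right = (focal_px * (x - half) / z + cx, focal_px * y / z + cy)
    return left, right
```

Projecting the four corners of the virtual screen this way gives the outline of the screen in each eye's image; the horizontal offset between the two projections shrinks as the screen moves farther away, which is what produces the depth impression.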

The imaging unit 352 of the stereoscopic image generation unit 350 can display the virtual screen D2 at a specific position around the movable object 900 according to a user setting.

In summary, although the present disclosure has been described above by way of embodiments, these embodiments are not intended to limit the present disclosure. A person having ordinary skill in the art to which the present disclosure pertains may make various changes and modifications without departing from its spirit and scope. Therefore, the scope of protection of the present disclosure is defined by the appended claims.

100, 200, 300: system for simultaneously tracking the 6 DoF poses of a movable object and a movable camera
110: movable camera
120: movable camera 6 DoF pose calculation unit
121: environmental feature extraction unit
122: camera matrix calculation unit
123: camera pose calculation unit
130: movable object 6 DoF pose calculation unit
131: object feature coordinate inference unit
132: object feature coordinate correction unit
133: object pose calculation unit
140: training data generation unit
150: neural network training unit
260: automatic augmentation unit
270: weight adjustment unit
310: pose correction unit
311: cross-comparison unit
312: correction unit
320: pose stabilization unit
330: viewing axis calculation unit
340: screen pose calculation unit
350: stereoscopic image generation unit
351: image generation unit
352: imaging unit
900: movable object
CD: 6 DoF pose of the movable camera
CM: camera matrix
d, d’, d”: displacement
D1, D4: physical screen
D2, D3: virtual screen
DD: 6 DoF pose of the virtual screen
EF: environmental feature point
ET: feature extractor
FL: feature point coordinate prediction layer
FV: feature vector
G1: MR glasses
GC: geometric constraint
GCL: geometric constraint layer
IM, IM’: image
LV: loss value
MD: neural network inference model
m: mean displacement
OD: 6 DoF pose of the movable object
OF, OF’, OF*, OF**: feature point
OLV: total loss value
P1: mobile phone
s: standard deviation of displacement
ST1: training stage
ST2: tracking stage
ST3: incremental learning stage
TC: temporal constraint
TCL: temporal constraint layer
[ten symbols rendered as images in the original]: coordinates
[symbol rendered as an image in the original]: displacement
[symbol rendered as an image in the original]: penalty value
PL: best-fit plane
C: center point
N: normal vector

FIGS. 1A and 1B illustrate how the disclosed technique of simultaneously tracking a movable object and a movable camera compares, in application, with conventional techniques.
FIG. 2A illustrates a system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera according to an embodiment.
FIG. 2B illustrates the system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera with a training stage added.
FIG. 3A illustrates the correspondences among the environmental feature points, and among the feature points of the movable object, across a series of images captured by the movable camera.
FIG. 3B illustrates the position and direction of an object in space.
FIGS. 4A and 4B illustrate the repair of feature points of the movable object.
FIGS. 5A to 5D illustrate the feature point definition and various training data, using a mobile phone as an example.
FIG. 6 illustrates the structure of the neural network in the training stage.
FIG. 7 illustrates how the displacement of feature points between two adjacent images is calculated.
FIG. 8 illustrates how the temporal constraint is calculated and evaluated.
FIG. 9 illustrates the incorrect displacements that arise when the temporal constraint is absent.
FIG. 10 illustrates the system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera with incremental learning added.
FIG. 11 illustrates the system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera as applied to MR glasses.

100: system for simultaneously tracking the 6 DoF poses of a movable object and a movable camera

110: movable camera

120: movable camera 6 DoF pose calculation unit

121: environmental feature extraction unit

122: camera matrix calculation unit

123: camera pose calculation unit

130: movable object 6 DoF pose calculation unit

131: object feature coordinate inference unit

132: object feature coordinate correction unit

133: object pose calculation unit

CD: 6 DoF pose of the movable camera

CM: camera matrix

EF: environmental feature point

GC: geometric constraint

IM: image

MD: neural network inference model

OD: 6 DoF pose of the movable object

OF, OF’: feature points of the movable object

ST1: training stage

ST2: tracking stage

TC: temporal constraint

Claims (20)

1. A method for simultaneously tracking a plurality of six-degree-of-freedom (6 DoF) poses of a movable object and a movable camera, comprising:
capturing a plurality of images with the movable camera, extracting a plurality of environmental feature points from the images, matching the environmental feature points to calculate a plurality of camera matrices of the movable camera, and calculating the 6 DoF poses of the movable camera from the camera matrices; and
inferring a plurality of feature points of the movable object from the images captured by the movable camera, correcting a plurality of coordinates of the feature points of the movable object using the camera matrices corresponding to the images and predefined geometric and temporal constraints, and calculating the 6 DoF poses of the movable object from the corrected coordinates of the feature points and the corresponding camera matrices.
2. The method of claim 1, wherein the feature points of the movable object inferred from the images captured by the movable camera are predefined, and the coordinates of the feature points are inferred by comparison with the images captured by the movable camera.
3. The method of claim 1, wherein the feature points of the movable object are inferred from the images captured by the movable camera and the coordinates of the feature points are inferred by a neural network inference model, the neural network inference model being trained in advance on training data consisting of manually labeled and automatically augmented data, with the geometric constraint and the temporal constraint incorporated during training.
4. The method of claim 3, wherein, while the movable object is being tracked, the neural network inference model performs incremental learning in the background, the training data for the incremental learning comprising the images captured by the movable camera and images automatically augmented from those images, the coordinates of the corrected feature points corresponding to the images being used in place of manual labels, and weights of the neural network inference model being adjusted to update the neural network inference model so that the coordinates of the feature points of the movable object are inferred accurately.
5. The method of claim 1, further comprising:
cross-comparing the 6 DoF poses of the movable object with the 6 DoF poses of the movable camera to correct the 6 DoF poses of the movable object and the 6 DoF poses of the movable camera;
leaving the 6 DoF poses of the movable object and the 6 DoF poses of the movable camera unchanged when a change in the 6 DoF poses of the movable object or in the 6 DoF poses of the movable camera is less than a preset value;
calculating viewing axes of the two eyes of a user from the 6 DoF poses of the movable camera;
calculating a plurality of 6 DoF poses of a virtual screen from the 6 DoF poses of the movable object and the 6 DoF poses of the movable camera; and
generating a left-eye image and a right-eye image of the virtual screen from the 6 DoF poses of the virtual screen and optical parameters of a stereoscopic display, to display a stereoscopic image of the virtual screen on the stereoscopic display.
6. The method of claim 5, wherein the virtual screen is displayed at a specific position around the movable object as set by the user, and the virtual screen moves together with the movable object.
7. The method of claim 1, wherein the step of correcting the coordinates of the feature points of the movable object comprises:
projecting two-dimensional coordinates of the feature points to corresponding three-dimensional coordinates using the camera matrices;
according to the geometric constraint, deleting the feature points whose three-dimensional coordinate deviation is greater than a predetermined value, or supplementing the coordinates of undetected feature points from the coordinates of adjacent feature points; and
according to the temporal constraint, comparing coordinate changes of the feature points across consecutive ones of the images, and correcting the coordinates of the feature points whose coordinate change is greater than a set value using the coordinates of the corresponding feature points in the consecutive images.
8. The method of claim 1, wherein, in the step of calculating the 6 DoF poses of the movable object:
for a planar movable object, a fitted plane is calculated from the feature points, and the 6 DoF poses of the movable object are defined by the center point and the normal vector of the plane; and
for a non-planar movable object, the 6 DoF poses of the movable object are defined by the centroid of the three-dimensional coordinates of the feature points.
9. The method of claim 1, wherein the movable object is a rigid object, and the movable camera is mounted on a head-mounted stereoscopic display, a mobile device, a computer, or a robot.
10. A system for simultaneously tracking a plurality of 6 DoF poses of a movable object and a movable camera, comprising:
the movable camera, configured to capture a plurality of images;
a movable camera 6 DoF pose calculation unit, configured to extract a plurality of environmental feature points from the images, match the environmental feature points to calculate a plurality of camera matrices of the movable camera, and calculate the 6 DoF poses of the movable camera from the camera matrices; and
a movable object 6 DoF pose calculation unit, configured to infer a plurality of feature points of the movable object from the images captured by the movable camera, correct coordinates of the feature points of the movable object using the camera matrices corresponding to the images and predefined geometric and temporal constraints, and calculate the 6 DoF poses of the movable object from the corrected coordinates of the feature points and the corresponding camera matrices.
11. The system of claim 10, wherein the movable camera 6 DoF pose calculation unit comprises:
an environmental feature extraction unit, configured to extract the environmental feature points from the images;
a camera matrix calculation unit, configured to match the environmental feature points to calculate the camera matrices of the movable camera; and
a camera pose calculation unit, configured to calculate the 6 DoF poses of the movable camera from the camera matrices.
12. The system of claim 10, wherein the movable object 6 DoF pose calculation unit comprises:
an object feature inference unit, configured to infer the feature points of the movable object from the images captured by the movable camera;
an object feature coordinate correction unit, configured to correct the coordinates of the feature points of the movable object using the camera matrices corresponding to the images and the predefined geometric and temporal constraints; and
an object pose calculation unit, configured to calculate the 6 DoF poses of the movable object from the corrected coordinates of the feature points and the corresponding camera matrices.
13. The system of claim 12, wherein the feature points of the movable object that the object feature inference unit infers from the images captured by the movable camera are predefined, and the coordinates of the feature points are inferred by comparison with the images captured by the movable camera.
14. The system of claim 12, wherein the object feature inference unit infers the feature points of the movable object from the images captured by the movable camera and the coordinates of the feature points are inferred by a neural network inference model, the neural network inference model being trained in advance on training data consisting of manually labeled and automatically augmented data, with the geometric constraint and the temporal constraint incorporated during training.
15. The system of claim 14, wherein, while the movable object is being tracked, the neural network inference model performs incremental learning in the background, the training data for the incremental learning comprising the images captured by the movable camera and images automatically augmented from those images, the coordinates of the corrected feature points corresponding to the images being used in place of manual labels, and weights of the neural network inference model being adjusted to update the neural network inference model so that the coordinates of the feature points of the movable object are inferred accurately.
16. The system of claim 10, further comprising:
a pose correction unit, configured to cross-compare the 6 DoF poses of the movable object with the 6 DoF poses of the movable camera to correct the 6 DoF poses of the movable object and the 6 DoF poses of the movable camera;
a pose stabilization unit, configured to leave the 6 DoF poses of the movable object and the 6 DoF poses of the movable camera unchanged when a change in the 6 DoF poses of the movable object or in the 6 DoF poses of the movable camera is less than a preset value;
a viewing axis calculation unit, configured to calculate viewing axes of the two eyes of a user from the 6 DoF poses of the movable camera;
a screen pose calculation unit, configured to calculate a plurality of 6 DoF poses of a virtual screen from the 6 DoF poses of the movable object and the 6 DoF poses of the movable camera; and
a stereoscopic image generation unit, configured to generate a left-eye image and a right-eye image of the virtual screen from the 6 DoF poses of the virtual screen and optical parameters of a stereoscopic display, to display a stereoscopic image of the virtual screen on the stereoscopic display.
17. The system of claim 16, wherein the virtual screen is displayed at a specific position around the movable object as set by the user, and the virtual screen moves together with the movable object.
18. The system of claim 12, wherein the object feature coordinate correction unit:
projects two-dimensional coordinates of the feature points to corresponding three-dimensional coordinates using the camera matrices;
according to the geometric constraint, deletes the feature points whose three-dimensional coordinate deviation is greater than a predetermined value, or supplements the coordinates of undetected feature points from the coordinates of adjacent feature points; and
according to the temporal constraint, compares coordinate changes of the feature points across consecutive ones of the images, and corrects the coordinates of the feature points whose coordinate change is greater than a set value using the coordinates of the corresponding feature points in the consecutive images.
19. The system of claim 12, wherein the object pose calculation unit:
for a planar movable object, calculates a fitted plane from the feature points, the 6 DoF poses of the movable object being defined by the center point and the normal vector of the plane; and
for a non-planar movable object, defines the 6 DoF poses of the movable object by the centroid of the three-dimensional coordinates of the feature points.
20. The system of claim 10, wherein the movable object is a rigid object, and the movable camera is mounted on a head-mounted stereoscopic display, a mobile device, a computer, or a robot.
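
The claims above define the pose of a planar movable object by the center point and normal vector of a plane fitted to its feature points. A minimal Python sketch of one possible plane fit follows. It is an illustration under assumptions, not the claimed method: the center is taken to be the centroid of the feature points, and the normal is obtained from the covariance terms by solving along the axis with the largest determinant, which is a known approximation to a least-squares fit. The function name `fit_plane` is hypothetical.

```python
import math


def fit_plane(points):
    """Return (center, unit normal) of a plane fitted to 3D points.

    Assumptions: the center is the centroid, and the normal is estimated from
    the covariance terms along the best-conditioned axis.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n

    # Accumulate the covariance terms of the centered points.
    xx = xy = xz = yy = yz = zz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        xx += dx * dx
        xy += dx * dy
        xz += dx * dz
        yy += dy * dy
        yz += dy * dz
        zz += dz * dz

    det_x = yy * zz - yz * yz
    det_y = xx * zz - xz * xz
    det_z = xx * yy - xy * xy
    det_max = max(det_x, det_y, det_z)
    if det_max <= 0.0:
        raise ValueError("points are collinear or coincident")

    if det_max == det_x:
        normal = (det_x, xz * yz - xy * zz, xy * yz - xz * yy)
    elif det_max == det_y:
        normal = (xz * yz - xy * zz, det_y, xy * xz - yz * xx)
    else:
        normal = (xy * yz - xz * yy, xy * xz - yz * xx, det_z)

    length = math.sqrt(sum(c * c for c in normal))
    return (cx, cy, cz), tuple(c / length for c in normal)
```

The sign of the returned normal is arbitrary, so a tracker using this sketch would still need a convention, such as pointing the normal toward the camera, to keep the pose consistent from frame to frame.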
TW110114401A 2020-07-08 2021-04-21 Method and system for simultaneously tracking 6 dof poses of movable object and movable camera TWI793579B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110554564.XA CN113920189A (en) 2020-07-08 2021-05-20 Method and system for simultaneously tracking six-degree-of-freedom directions of movable object and movable camera
US17/369,669 US11506901B2 (en) 2020-07-08 2021-07-07 Method and system for simultaneously tracking 6 DoF poses of movable object and movable camera

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063049161P 2020-07-08 2020-07-08
US63/049,161 2020-07-08

Publications (2)

Publication Number Publication Date
TW202203644A TW202203644A (en) 2022-01-16
TWI793579B true TWI793579B (en) 2023-02-21

Family

ID=80787701

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110114401A TWI793579B (en) 2020-07-08 2021-04-21 Method and system for simultaneously tracking 6 dof poses of movable object and movable camera

Country Status (1)

Country Link
TW (1) TWI793579B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI826189B (en) * 2022-12-16 2023-12-11 仁寶電腦工業股份有限公司 Controller tracking system and method with six degrees of freedom

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201915943A (en) * 2017-09-29 2019-04-16 香港商阿里巴巴集團服務有限公司 Method, apparatus and system for automatically labeling target object within image
CN111311632A (en) * 2018-12-11 2020-06-19 深圳市优必选科技有限公司 Object pose tracking method, device and equipment

Also Published As

Publication number Publication date
TW202203644A (en) 2022-01-16

Similar Documents

Publication Publication Date Title
CN109146965B (en) Information processing apparatus, computer readable medium, and head-mounted display apparatus
CN110047104B (en) Object detection and tracking method, head-mounted display device, and storage medium
US20210209788A1 (en) Method and apparatus for generating data for estimating three-dimensional (3d) pose of object included in input image, and prediction model for estimating 3d pose of object
JP4512584B2 (en) Panorama video providing method and apparatus with improved image matching speed and blending method
JP2019536170A (en) Virtually extended visual simultaneous localization and mapping system and method
JP6456347B2 (en) INSITU generation of plane-specific feature targets
CN105678809A (en) Handheld automatic follow shot device and target tracking method thereof
JP2016522485A (en) Hidden reality effect and intermediary reality effect from reconstruction
JP2015521419A (en) A system for mixing or synthesizing computer generated 3D objects and video feeds from film cameras in real time
CN102714695A (en) Image processing device, image processing method and program
CN108227920B (en) Motion closed space tracking method and system
CN114095662B (en) Shooting guide method and electronic equipment
US10838515B1 (en) Tracking using controller cameras
CN103914855B (en) Method and device for locating a moving target
CN112541973B (en) Virtual-real superposition method and system
Jia et al. Constrained 3D rotation smoothing via global manifold regression for video stabilization
US11506901B2 (en) Method and system for simultaneously tracking 6 DoF poses of movable object and movable camera
TWI793579B (en) Method and system for simultaneously tracking 6 dof poses of movable object and movable camera
CN111680671A (en) Automatic generation method of camera shooting scheme based on optical flow
TWI736083B (en) Method and system for motion prediction
CN111193918B (en) Image processing system and image processing method
US20190340773A1 (en) Method and apparatus for a synchronous motion of a human body model
TWI680005B (en) Movement tracking method and movement tracking system
CN114092668A (en) Virtual-real fusion method, device, equipment and storage medium
CN114913245A (en) Multi-calibration-block multi-camera calibration method and system based on undirected weighted graph