TWI648556B - SLAM and gesture recognition method - Google Patents

SLAM and gesture recognition method

Info

Publication number
TWI648556B
TWI648556B (application TW107107503A)
Authority
TW
Taiwan
Prior art keywords
image
gesture
unit
positioning
map construction
Prior art date
Application number
TW107107503A
Other languages
Chinese (zh)
Other versions
TW201939105A (en)
Inventor
楊世豪
蔡耀崇
張庭基
Original Assignee
仁寶電腦工業股份有限公司
Priority date
Filing date
Publication date
Application filed by 仁寶電腦工業股份有限公司 filed Critical 仁寶電腦工業股份有限公司
Priority to TW107107503A priority Critical patent/TWI648556B/en
Application granted granted Critical
Publication of TWI648556B publication Critical patent/TWI648556B/en
Publication of TW201939105A publication Critical patent/TW201939105A/en

Landscapes

  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

The present case relates to a simultaneous localization and mapping (SLAM) and gesture recognition method comprising the steps of: providing a head-mounted electronic device; performing image capture with an image capture module to obtain a captured image; pre-processing the captured image with an image processing unit to produce a display image; duplicating the display image and transmitting the copies to a motion-image filtering unit and a gesture-image capture unit, respectively; filtering the display image with the motion-image filtering unit to remove its moving content and thereby obtain a spatial image; performing simultaneous localization and mapping with a spatial positioning algorithm unit according to the spatial image; analyzing the display image with the gesture-image capture unit and extracting a gesture image from it; and performing detailed gesture recognition with a detailed gesture recognition algorithm unit according to the gesture image. The method reduces product volume, weight, and manufacturing cost.

Description

Simultaneous localization and mapping (SLAM) and gesture recognition method

The present case relates to a method for the spatial localization of images and gesture recognition, and in particular to a simultaneous localization and mapping (SLAM) and gesture recognition method for augmented reality.

In current augmented reality (AR) applications, research on human-machine interfaces has flourished in pursuit of interaction with virtual objects. Because conventional mouse or touch control is usually impractical while wearing an augmented reality device or other head-mounted electronic device, various types of human-machine interfaces continue to be proposed.

For example, augmented reality systems can interact with virtual objects through dedicated controllers, which respond quickly and offer diverse control schemes. However, different usage environments require multiple additional controllers, and the user must learn and memorize the operation of each one, so intuitive use cannot be achieved. Gesture control, which requires no additional controller, has therefore developed into a primary human-machine interface.

However, current gesture-control implementations still have many drawbacks. Please refer to FIG. 1 and FIG. 2, wherein FIG. 1 is a block diagram of a conventional head-mounted electronic device, and FIG. 2 is a schematic diagram of the conventional head-mounted electronic device capturing a target gesture and a target scene and performing the corresponding computations. As shown in FIG. 1 and FIG. 2, the conventional head-mounted electronic device 1 mainly includes a control unit 10, a first image capture module 11, a gesture image processor 12, a second image capture module 13, a scene image processor 14, a gesture recognition algorithm unit 15, and a spatial positioning algorithm unit 16. The first image capture module 11 and the second image capture module 13 are two image capture modules that are independent of each other in both hardware and function. The first image capture module 11 captures images of the target gesture; after processing by the gesture image processor 12, the captured gesture images are sent to the gesture recognition algorithm unit 15 to compute the corresponding gesture. The second image capture module 13 captures images of the target scene; after processing by the scene image processor 14, the captured scene images are sent to the spatial positioning algorithm unit 16 to complete spatial positioning. This architecture, however, requires two independent image capture modules, and supporting additional functions may require adding still more capture modules. The resulting increase in volume and weight not only raises manufacturing cost but also degrades the wearing experience.

Therefore, how to develop a SLAM and gesture recognition method that uses a single image capture module with appropriate filtering, so that the captured images can be used simultaneously for spatial positioning, map construction, and gesture recognition, remains an unsolved problem.

A primary object of the present case is to provide a SLAM and gesture recognition method that solves and improves upon the problems and disadvantages of the prior art described above.

Another object of the present case is to provide a SLAM and gesture recognition method in which a single image capture module captures one image that contains both the target scene and the target gesture. Image pre-processing units, such as a motion-image filtering unit and a gesture-image capture unit, pre-process the display image for each algorithm; a spatial positioning algorithm unit then performs simultaneous localization and mapping with a spatial positioning algorithm, and a detailed gesture recognition algorithm unit performs detailed gesture recognition with a gesture recognition algorithm. This not only effectively reduces product volume and weight, improves the user experience, and enables product miniaturization, but also further lowers the manufacturing cost.

Another object of the present case is to provide a SLAM and gesture recognition method in which a single image capture module captures a full-color image containing both the target scene and the target gesture. The image content is complete and can be supplied to a variety of software and hardware for image processing and subsequent applications, giving the method good extensibility.

To achieve the above objects, a preferred embodiment of the present case provides a SLAM and gesture recognition method comprising the steps of: (a) providing a head-mounted electronic device; (b) performing image capture with an image capture module of the head-mounted electronic device to obtain a captured image; (c) performing a pre-processing operation on the captured image with an image processing unit of the head-mounted electronic device to produce a display image; (d) duplicating the display image and transmitting the copies to a motion-image filtering unit and a gesture-image capture unit of the head-mounted electronic device, respectively; (e) filtering the display image with the motion-image filtering unit to remove moving content from the display image and thereby obtain a spatial image; (f) performing simultaneous localization and mapping with a spatial positioning algorithm unit of the head-mounted electronic device according to the spatial image; (g) analyzing the display image with the gesture-image capture unit and extracting a gesture image from the display image; and (h) performing detailed gesture recognition with a detailed gesture recognition algorithm unit of the head-mounted electronic device according to the gesture image.

Some exemplary embodiments embodying the features and advantages of the present case are described in detail below. It should be understood that the invention is capable of various modifications in its different aspects without departing from its scope, and that the description and drawings are illustrative in nature and not intended to limit the invention.

Please refer to FIG. 3, FIG. 4, and FIG. 5, wherein FIG. 3 is a flowchart of the SLAM and gesture recognition method of a preferred embodiment of the present case, FIG. 4 is a block diagram of a head-mounted electronic device to which the method is applicable, and FIG. 5 is a schematic diagram of that device capturing an image and performing the corresponding computations. As shown in FIG. 3, FIG. 4, and FIG. 5, the head-mounted electronic device 2 includes at least a central processing unit 20, an image capture module 21, an image processing unit 22, a motion-image filtering unit 23, a spatial positioning algorithm unit 24, a gesture-image capture unit 25, and a detailed gesture recognition algorithm unit 26. The central processing unit 20 controls the operation of the head-mounted electronic device 2. The image capture module 21 captures the captured image. The image processing unit 22 is connected to the central processing unit 20. The motion-image filtering unit 23 is connected to the central processing unit 20, and the spatial positioning algorithm unit 24 is connected to the motion-image filtering unit 23. The gesture-image capture unit 25 is connected to the central processing unit 20, and the detailed gesture recognition algorithm unit 26 is connected to the gesture-image capture unit 25.

The SLAM and gesture recognition method of the preferred embodiment includes the following steps. First, as shown in step S100, the head-mounted electronic device 2 is provided. Next, as shown in step S200, the image capture module 21 of the head-mounted electronic device 2 performs image capture to obtain a captured image. Then, as shown in step S300, the image processing unit 22 performs a pre-processing operation on the captured image to produce a display image; the pre-processing operation includes, but is not limited to, lens shading correction, auto gain, and auto exposure. Then, as shown in step S400, the pre-processed display image is duplicated and the copies are transmitted to the motion-image filtering unit 23 and the gesture-image capture unit 25 of the head-mounted electronic device 2; that is, in this step both units receive the same pre-processed display image. After step S400 is completed, steps S500 and S700 may be executed simultaneously or at different times; step S600 is executed after step S500, and step S800 is executed after step S700. In step S500, the motion-image filtering unit 23 filters the display image to remove its moving content and thereby obtain a spatial image; in other words, for accurate spatial positioning, moving objects such as a moving hand must be excluded from the display image. In step S600, the spatial positioning algorithm unit 24 performs simultaneous localization and mapping according to the spatial image. In step S700, the gesture-image capture unit 25 analyzes the display image and extracts the gesture image from it. In step S800, the detailed gesture recognition algorithm unit 26 performs detailed gesture recognition according to the gesture image.
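The flow of steps S200 through S800 can be sketched as follows. The function names mirror the units described above, but the bodies are placeholder stubs (assumptions), since the patent does not disclose concrete implementations; the sketch only shows how the single captured image is duplicated and fans out into the two processing branches.

```python
# Hypothetical sketch of the S200-S800 pipeline; each stage is a stub
# standing in for the corresponding unit described in the patent text.

def capture_image():                      # S200: image capture module 21
    return {"pixels": "...raw stereo frame..."}

def preprocess(raw):                      # S300: image processing unit 22
    # lens shading correction, auto gain, auto exposure (pre-processing)
    return {"pixels": raw["pixels"], "corrected": True}

def filter_motion(display):               # S500: motion-image filtering unit 23
    return {"spatial": display, "moving_objects_removed": True}

def slam(spatial):                        # S600: spatial positioning unit 24
    return "pose + map"

def extract_gesture(display):             # S700: gesture-image capture unit 25
    return {"gesture": "hand region of display image"}

def recognize_gesture(gesture):           # S800: detailed gesture unit 26
    return "recognized gesture"

def run_pipeline():
    raw = capture_image()
    display = preprocess(raw)
    # S400: the display image is duplicated; both branches get the SAME image
    copy_a, copy_b = dict(display), dict(display)
    pose_and_map = slam(filter_motion(copy_a))            # S500 -> S600
    gesture = recognize_gesture(extract_gesture(copy_b))  # S700 -> S800
    return pose_and_map, gesture

print(run_pipeline())
```

The key structural point, a single capture followed by duplication rather than two capture modules, is what distinguishes this flow from the prior-art architecture of FIG. 1.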

According to the concept of the present case, the central processing unit 20 may be a CPU, and in some embodiments step S400 is preferably carried out by the central processing unit 20, although the invention is not limited thereto. The image capture module 21 is preferably a single image capture module that includes at least two lenses and two image sensors: because spatial computation by triangulation requires at least two lenses and sensors with a known relative position, and in view of product miniaturization and the user experience, the image capture module 21 in this case preferably uses two lenses with two image sensors, although the invention is not limited thereto. The image processing unit 22 may be, but is not limited to, an image signal processor (ISP). In addition, the motion-image filtering unit 23, the gesture-image capture unit 25, the spatial positioning algorithm unit 24, and the detailed gesture recognition algorithm unit 26 may all be software units executed under the control of the central processing unit 20, although, again, the invention is not limited thereto.
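The requirement for two lenses with a known relative position follows from triangulation: with a calibrated stereo pair, depth is recovered from the disparity between the two images via Z = f·B/d. A minimal sketch; the focal length, baseline, and disparity values below are illustrative assumptions, not values from the patent.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth from a calibrated stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Illustrative values: 700 px focal length, 6 cm baseline, 35 px disparity.
print(stereo_depth(700, 0.06, 35))  # roughly 1.2 metres
```

Larger disparity means a closer point; this is why the two sensors' relative position must be fixed and known in advance.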

Further, the head-mounted electronic device 2 may further include a memory unit 27 and a micro-projection display unit 28, both connected to the central processing unit 20. At least one of the aforementioned gesture images, for example the gesture image at a particular instant, is stored in the memory unit 27 for comparison with the previous or next gesture image or for similar purposes, although the invention is not limited thereto. After the head-mounted electronic device 2 executes the SLAM and gesture recognition method of the present case, the result is displayed on the micro-projection display unit 28; in other words, the user ultimately views the result of the interaction on the micro-projection display unit 28.
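Storing each instantaneous gesture image so that the current frame can be compared against its predecessor, as the memory unit 27 does here, can be sketched with a small history buffer. The buffer capacity and the string frame stand-ins are assumptions for illustration only.

```python
from collections import deque

class GestureMemory:
    """Sketch of memory unit 27: keeps recent gesture images so the
    current frame can be compared against the previous one."""

    def __init__(self, capacity=2):      # capacity of 2 is an assumed value
        self.frames = deque(maxlen=capacity)

    def store(self, gesture_image):
        self.frames.append(gesture_image)

    def previous(self):
        # The frame before the most recently stored one, if any.
        return self.frames[-2] if len(self.frames) >= 2 else None

mem = GestureMemory()
mem.store("open palm")
mem.store("closed fist")
print(mem.previous())  # open palm
```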

Regarding motion-image filtering, please refer to FIG. 6 together with FIG. 3 and FIG. 4, wherein FIG. 6 is a detailed flowchart of step S500 shown in FIG. 3. As shown in FIG. 3, FIG. 4, and FIG. 6, step S500 includes the following detailed steps. As shown in step S520, the motion-image filtering unit 23 converts the display image into a first grayscale image. Next, as shown in step S540, the motion-image filtering unit 23 computes depth information from the first grayscale image. Then, as shown in step S560, the motion-image filtering unit 23 obtains a plurality of feature points through Speeded Up Robust Features (SURF) and assembles them into a three-dimensional point cloud. Finally, as shown in step S580, the three-dimensional point cloud corresponding to each frame is filtered to remove the point-cloud data produced by moving objects, thereby obtaining the spatial image.
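The last of these steps, S580, can be sketched as point-cloud differencing: feature points whose 3D position shifts more than a threshold between consecutive frames (beyond what the camera's own motion explains) are treated as belonging to moving objects and dropped. The dict-of-points representation, the feature-ID matching, and the 5 cm threshold are all assumptions; the patent does not specify them.

```python
def remove_moving_points(prev_cloud, curr_cloud, threshold=0.05):
    """Keep only feature points that stayed (nearly) still between frames.

    prev_cloud / curr_cloud: {feature_id: (x, y, z)} expressed in a common
    frame, i.e. after compensating for the camera's own motion.
    """
    static = {}
    for fid, (x, y, z) in curr_cloud.items():
        if fid in prev_cloud:
            px, py, pz = prev_cloud[fid]
            moved = ((x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2) ** 0.5
            if moved <= threshold:
                static[fid] = (x, y, z)
    return static

prev = {1: (0.0, 0.0, 2.0), 2: (1.0, 0.0, 2.0)}
curr = {1: (0.0, 0.0, 2.0), 2: (1.3, 0.0, 2.0)}  # point 2 moved 30 cm
print(remove_moving_points(prev, curr))  # only point 1 survives
```

A moving hand therefore never contributes points to the spatial image, which is exactly what accurate localization in step S600 requires.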

After the pre-processing by the motion-image filtering unit 23 is completed, the steps executed by the spatial positioning algorithm unit 24 are described as follows. Please refer to FIG. 7 together with FIG. 3 and FIG. 4, wherein FIG. 7 is a detailed flowchart of step S600 shown in FIG. 3. As shown in FIG. 3, FIG. 4, and FIG. 7, step S600 includes the following detailed steps. As shown in step S620, the spatial positioning algorithm unit 24 tracks the principal directions of the feature points with a spatial positioning algorithm, adds feature-point vectors to match against the previous keyframe so as to compute the initial position of the camera of the image capture module (i.e., its lens and image sensor), and searches for matching points between the current frame and a local map so as to optimize the position of the current frame and determine the next keyframe. Next, as shown in step S640, it is determined whether the current frame is a new keyframe. When the determination of step S640 is affirmative, steps S660 and S680 are executed in sequence. In step S660, the current frame is added to the local map, which is updated; the connection between the new keyframe and the other keyframes is established, and the feature-point matching between the new keyframe and the other keyframes is updated to generate a three-dimensional map. In step S680, spatial information is obtained and optimized to complete the simultaneous localization and mapping.
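The keyframe bookkeeping of steps S620 through S680 can be sketched as a tracking loop: each frame is matched against the last keyframe, and a frame is promoted to a new keyframe when too few of its points match, at which point it is added to the local map. The 50 % match ratio and the set-of-feature-IDs frame model are assumed heuristics, not values disclosed in the patent.

```python
def is_new_keyframe(curr_points, last_keyframe_points, min_match_ratio=0.5):
    """Promote the current frame to a keyframe when its overlap with the
    last keyframe drops below min_match_ratio (assumed heuristic)."""
    if not curr_points:
        return False
    matches = len(curr_points & last_keyframe_points)
    return matches / len(curr_points) < min_match_ratio

keyframes = [{"a", "b", "c", "d"}]       # initial keyframe
local_map = set(keyframes[0])            # S660: keyframes feed the local map

for frame in [{"a", "b", "c"}, {"d", "e", "f", "g"}, {"e", "f", "g", "h"}]:
    if is_new_keyframe(frame, keyframes[-1]):
        keyframes.append(frame)          # link new keyframe to the others
        local_map |= frame               # update the local map

print(len(keyframes), sorted(local_map))
```

Only the second frame overlaps the previous keyframe too little, so exactly one new keyframe is created; the other frames merely refine the current pose against the existing map, as step S620 describes.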

Regarding gesture analysis, please refer to FIG. 8 together with FIG. 3 and FIG. 4, wherein FIG. 8 is a detailed flowchart of step S700 shown in FIG. 3. As shown in FIG. 3, FIG. 4, and FIG. 8, step S700 includes the following detailed steps. As shown in step S725, the gesture-image capture unit 25 converts the display image into a second grayscale image. Next, as shown in step S750, a hand image is extracted from the second grayscale image through mathematical morphology and feature patterns. Since the human hand has a fixed form, specific hand movements belong to a limited set of combinations, which can be built into a database for analysis and comparison. Then, as shown in step S775, feature-point analysis is performed on the hand image to capture it as the gesture image, and the gesture image is assembled into a gesture point cloud.
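Mathematical morphology in step S750 typically means operations such as erosion and dilation on a binarized image: an opening (erosion followed by dilation) removes specks smaller than the structuring element while preserving a hand-sized blob. A minimal sketch on a binary grid; the 3×3 structuring element and the toy mask are assumptions for illustration.

```python
def erode(img):
    """Binary erosion, 3x3 structuring element: a pixel survives only
    if its entire 3x3 neighbourhood is set."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(img[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

def dilate(img):
    """Binary dilation, 3x3 structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

def opening(img):
    """Opening = erosion then dilation: removes isolated noise pixels."""
    return dilate(erode(img))

# A 1-pixel speck at (1,1) versus a solid 3x3 block: the speck disappears.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = opening(mask)
print(cleaned[1][1], cleaned[3][4])  # speck gone, block centre kept
```

After this cleanup, only hand-sized connected regions remain to be matched against the feature-pattern database.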

Finally, regarding detailed gesture recognition, please refer to FIG. 9 together with FIG. 3 and FIG. 4, wherein FIG. 9 is a detailed flowchart of step S800 shown in FIG. 3. As shown in FIG. 3, FIG. 4, and FIG. 9, step S800 includes the following detailed steps. As shown in step S825, the gesture point cloud described above is analyzed and tracked, and the fingers, wrist, and palm are identified from the tracking result to obtain hand information. Next, as shown in step S850, a three-dimensional motion model is constructed from consecutive hand information; that is, hand information is continuously acquired over time, and the motion model and its movements are constructed from the changes in that information. Finally, as shown in step S875, detailed gesture recognition is performed according to the three-dimensional motion model.
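Building a motion model from consecutive hand information (steps S825 through S875) can be sketched, in its simplest form, as classifying palm-centroid displacement across frames. The per-frame hand-info dict, the displacement threshold, and the swipe/hold vocabulary are assumptions standing in for the patent's undisclosed model.

```python
def classify_motion(hand_frames, min_shift=0.1):
    """Classify a simple swipe from consecutive palm-centroid positions.

    hand_frames: list of {"palm": (x, y), ...} dicts, one per frame,
    a stand-in for the per-frame hand information of step S825.
    """
    if len(hand_frames) < 2:
        return "none"
    dx = hand_frames[-1]["palm"][0] - hand_frames[0]["palm"][0]
    if dx > min_shift:
        return "swipe right"
    if dx < -min_shift:
        return "swipe left"
    return "hold"

frames = [{"palm": (0.0, 0.5)}, {"palm": (0.2, 0.5)}, {"palm": (0.4, 0.5)}]
print(classify_motion(frames))  # swipe right
```

A full implementation would track fingers, wrist, and palm jointly in 3D, but the same principle applies: the model is built from the change in hand information over time, not from any single frame.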

To provide accurate, detailed hand recognition, the hand information obtained by the present method includes analysis information for each knuckle of each finger. When judging a gesture, the method can therefore determine not only the closing, sliding, and waving of the palm and fingers, but also whether the knuckles of individual fingers are bent or straight, so that a wide variety of gesture operations can be realized through their combinations, offering rich flexibility.
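Knuckle-level analysis grows the gesture vocabulary combinatorially: even a coarse bent/straight state for each of two knuckles on five fingers yields 4^5 = 1024 distinct static hand poses. A quick enumeration; the two-knuckle, two-state model is an illustrative assumption, not the patent's actual representation.

```python
from itertools import product

# Per finger: (proximal knuckle, distal knuckle), each bent (1) or straight (0).
finger_states = list(product((0, 1), repeat=2))      # 4 states per finger
hand_poses = list(product(finger_states, repeat=5))  # 5 independent fingers

print(len(finger_states), len(hand_poses))  # 4 per finger, 1024 hand poses
```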

On the other hand, in terms of future extensibility, the conventional head-mounted electronic device captures the target gesture and the target scene separately for spatial positioning and gesture recognition, so each capture yields only a partial image, possibly even a grayscale or black-and-white one. Because its source images are not full-color images containing all of the information, its extensibility naturally falls short of the present case, which can furthermore supply the full-color image containing all of the information as a source image to a wide variety of applications, leaving essentially unlimited room for future expansion.

In summary, the present case provides a SLAM and gesture recognition method in which a single image capture module captures one image containing both the target scene and the target gesture; image pre-processing units such as the motion-image filtering unit and the gesture-image capture unit pre-process the image for each algorithm and produce the display image; the spatial positioning algorithm unit then performs spatial positioning with a spatial positioning algorithm, and the detailed gesture recognition algorithm unit performs detailed gesture recognition with a gesture recognition algorithm. This not only effectively reduces product volume and weight, improves the user experience, and enables product miniaturization, but also further lowers the manufacturing cost. Moreover, the single image capture module captures a full-color image containing both the target scene and the target gesture, whose content is complete and can be supplied to a variety of software and hardware for image processing and subsequent applications, giving the method good extensibility.

Although the present invention has been described in detail by the above embodiments and may be variously modified by those skilled in the art, such modifications do not depart from the scope of protection of the appended claims.

1‧‧‧conventional head-mounted electronic device
10‧‧‧control unit
11‧‧‧first image capture module
12‧‧‧gesture image processor
13‧‧‧second image capture module
14‧‧‧scene image processor
15‧‧‧gesture recognition algorithm unit
16‧‧‧spatial positioning algorithm unit
2‧‧‧head-mounted electronic device
20‧‧‧central processing unit
21‧‧‧image capture module
22‧‧‧image processing unit
23‧‧‧motion-image filtering unit
24‧‧‧spatial positioning algorithm unit
25‧‧‧gesture-image capture unit
26‧‧‧detailed gesture recognition algorithm unit
27‧‧‧memory unit
28‧‧‧micro-projection display unit

FIG. 1 is a block diagram of a conventional head-mounted electronic device.
FIG. 2 is a schematic diagram of the conventional head-mounted electronic device capturing a target gesture and a target scene and performing the corresponding computations.
FIG. 3 is a flowchart of the SLAM and gesture recognition method of a preferred embodiment of the present case.
FIG. 4 is a block diagram of a head-mounted electronic device to which the method is applicable.
FIG. 5 is a schematic diagram of the head-mounted electronic device of FIG. 4 capturing an image and performing the corresponding computations.
FIG. 6 is a detailed flowchart of step S500 shown in FIG. 3.
FIG. 7 is a detailed flowchart of step S600 shown in FIG. 3.
FIG. 8 is a detailed flowchart of step S700 shown in FIG. 3.
FIG. 9 is a detailed flowchart of step S800 shown in FIG. 3.

Claims (10)

1. A simultaneous localization and mapping (SLAM) and gesture recognition method, comprising the steps of: (a) providing a head-mounted electronic device; (b) performing image capture with an image capture module of the head-mounted electronic device to obtain a captured image; (c) performing a pre-processing operation on the captured image with an image processing unit of the head-mounted electronic device to generate a display image; (d) duplicating the display image and transmitting the copies to a dynamic-image filtering unit and a gesture-image capture unit of the head-mounted electronic device, respectively; (e) filtering the display image with the dynamic-image filtering unit to remove the dynamic images therein, thereby obtaining a spatial image; (f) performing simultaneous localization and mapping with a spatial positioning algorithm unit of the head-mounted electronic device according to the spatial image; (g) analyzing the display image with the gesture-image capture unit and extracting a gesture image from the display image; and (h) performing detailed gesture recognition with a detailed gesture recognition algorithm unit of the head-mounted electronic device according to the gesture image.
2. The SLAM and gesture recognition method according to claim 1, wherein the step (e) comprises the steps of: (e1) converting the display image into a first grayscale image with the dynamic-image filtering unit; (e2) calculating depth information according to the first grayscale image; (e3) obtaining a plurality of feature points through speeded-up robust features (SURF) and assembling the feature points into a three-dimensional point cloud; and (e4) filtering the three-dimensional point cloud corresponding to each frame and removing the point-cloud data generated by moving objects, thereby obtaining the spatial image.
3. The SLAM and gesture recognition method according to claim 2, wherein the step (f) comprises the steps of: (f1) tracking the principal directions of the feature points, adding feature-point vectors to match against the previous keyframe so as to calculate an initial position of a camera of the image capture module, and searching for matching points between the current frame and a local map to optimize the position of the current frame and determine the next keyframe; (f2) determining whether the current frame is a new keyframe; (f3) adding the current frame to the local map and updating the local map, establishing the connection relationships between the new keyframe and the other keyframes, and updating the feature-point matching relationships between the new keyframe and the other keyframes, so as to generate a three-dimensional map; and (f4) obtaining spatial information and optimizing the spatial information, so as to complete the simultaneous localization and mapping; wherein, when the determination result of the step (f2) is affirmative, the step (f3) and the step (f4) are performed in sequence after the step (f2).
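Steps (e3)-(e4) and (f1)-(f2) above lend themselves to a compact sketch. The following Python fragment is an illustrative stand-in, not the patented implementation: the names `filter_moving_points` and `is_new_keyframe`, the 0.02 displacement threshold, and the 0.3 match ratio are all assumptions, and the point clouds are taken to be arrays of 3D feature points already matched across consecutive frames.

```python
import numpy as np

def filter_moving_points(prev_cloud, curr_cloud, max_motion=0.02):
    """Step (e4) sketch: keep only points whose frame-to-frame displacement
    stays below max_motion (an assumed threshold, in scene units); larger
    displacements are attributed to moving objects such as the user's hand.

    Both clouds are (N, 3) arrays of matched 3D feature points.
    """
    displacement = np.linalg.norm(curr_cloud - prev_cloud, axis=1)
    static_mask = displacement < max_motion
    return curr_cloud[static_mask], static_mask

def is_new_keyframe(matched, total, min_ratio=0.3):
    """Step (f2) heuristic (assumed): promote the current frame to a new
    keyframe when too few of its features still match the previous keyframe."""
    return total > 0 and matched / total < min_ratio

# Toy example: three static points and one that moved noticeably.
prev = np.array([[0.0, 0.0, 1.0],
                 [0.1, 0.0, 1.0],
                 [0.0, 0.1, 1.0],
                 [0.2, 0.2, 0.5]])
curr = prev.copy()
curr[3] += [0.15, 0.0, 0.0]  # this point belongs to a moving object
spatial_points, static_mask = filter_moving_points(prev, curr)
```

Here the moving point is discarded before the cloud reaches the SLAM branch, mirroring the claim's intent that only the static scene be mapped.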
4. The SLAM and gesture recognition method according to claim 1, wherein the step (g) comprises the steps of: (g1) converting the display image into a second grayscale image with the gesture-image capture unit; (g2) extracting a hand image from the second grayscale image through mathematical morphology and feature images; and (g3) performing feature-point analysis on the hand image so as to extract the hand image as the gesture image, and assembling the gesture image into a gesture point cloud.
5. The SLAM and gesture recognition method according to claim 4, wherein the step (h) comprises the steps of: (h1) analyzing and tracking the gesture point cloud, and identifying the fingers, the wrist and the palm according to the tracking result so as to obtain hand information; (h2) constructing a three-dimensional motion model according to the continuous hand information; and (h3) performing detailed gesture recognition according to the three-dimensional motion model.
6. The SLAM and gesture recognition method according to claim 5, wherein the hand information comprises analysis information for every knuckle of every finger.
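Step (g2) relies on mathematical morphology to isolate the hand silhouette. As a minimal sketch of that idea, the plain-NumPy opening below (erosion followed by dilation) removes specks smaller than the structuring element while keeping a larger blob intact; the function names, the 3x3 square element, and the toy mask are illustrative assumptions rather than the patent's actual processing.

```python
import numpy as np

def erode(mask, k=3):
    """Binary erosion with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant")
    out = np.ones_like(mask)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def dilate(mask, k=3):
    """Binary dilation with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant")
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def opening(mask, k=3):
    """Morphological opening: erosion then dilation. Removes isolated
    specks smaller than the structuring element while largely preserving
    bigger blobs such as a hand silhouette."""
    return dilate(erode(mask, k), k)

# A 7x7 binary mask: a 4x4 "hand" blob plus a single-pixel speck.
mask = np.zeros((7, 7), dtype=np.uint8)
mask[1:5, 1:5] = 1   # blob large enough to survive the opening
mask[6, 6] = 1       # speck removed by the opening
cleaned = opening(mask)
```

In practice a library routine such as `scipy.ndimage.binary_opening` would replace these loops; the explicit version is shown only to make the erosion/dilation mechanics visible.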
7. The SLAM and gesture recognition method according to claim 1, wherein the head-mounted electronic device comprises: a central processing unit configured to control the operation of the head-mounted electronic device; the image capture module, configured to obtain the captured image; the image processing unit, connected to the central processing unit and configured to perform the pre-processing operation on the captured image to generate the display image; the dynamic-image filtering unit, connected to the central processing unit; the spatial positioning algorithm unit, connected to the dynamic-image filtering unit; the gesture-image capture unit, connected to the central processing unit; and the detailed gesture recognition algorithm unit, connected to the gesture-image capture unit.
8. The SLAM and gesture recognition method according to claim 7, wherein the dynamic-image filtering unit, the gesture-image capture unit, the spatial positioning algorithm unit and the detailed gesture recognition algorithm unit are all software units executed by the central processing unit.
9. The SLAM and gesture recognition method according to claim 7, wherein the step (d) is implemented by the central processing unit.
10. The SLAM and gesture recognition method according to claim 7, wherein the head-mounted electronic device further comprises a memory unit connected to the central processing unit, and at least the gesture image is stored in the memory unit.
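Taken together, claims 1 and 7 describe one camera feeding a display image that is duplicated into two parallel branches: a spatial (SLAM) branch and a gesture branch. A hypothetical Python skeleton of that data flow follows; every function body is a placeholder, since the claims specify the wiring of the units rather than their internals, and all names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """A captured frame flowing through the claimed pipeline."""
    pixels: list

def preprocess(captured: Frame) -> Frame:
    # Step (c): image processing unit produces the display image (placeholder).
    return Frame(pixels=list(captured.pixels))

def filter_dynamic(display: Frame) -> Frame:
    # Step (e): dynamic-image filtering unit yields the spatial image (placeholder).
    return display

def extract_gesture(display: Frame) -> Frame:
    # Step (g): gesture-image capture unit yields the gesture image (placeholder).
    return display

def run_pipeline(captured: Frame) -> dict:
    display = preprocess(captured)               # step (c)
    spatial_copy = Frame(list(display.pixels))   # step (d): one copy per branch,
    gesture_copy = Frame(list(display.pixels))   # handled by the CPU per claim 9
    return {
        "spatial": filter_dynamic(spatial_copy),   # feeds steps (e)-(f)
        "gesture": extract_gesture(gesture_copy),  # feeds steps (g)-(h)
    }

result = run_pipeline(Frame(pixels=[0, 1, 2]))
```

The point of the duplication in step (d) is that the two branches can mutate their copies independently: the spatial branch strips the moving hand out while the gesture branch keeps it.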
TW107107503A 2018-03-06 2018-03-06 Slam and gesture recognition method TWI648556B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW107107503A TWI648556B (en) 2018-03-06 2018-03-06 Slam and gesture recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107107503A TWI648556B (en) 2018-03-06 2018-03-06 Slam and gesture recognition method

Publications (2)

Publication Number Publication Date
TWI648556B true TWI648556B (en) 2019-01-21
TW201939105A TW201939105A (en) 2019-10-01

Family

ID=65803870

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107107503A TWI648556B (en) 2018-03-06 2018-03-06 Slam and gesture recognition method

Country Status (1)

Country Link
TW (1) TWI648556B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI748439B (en) * 2019-06-27 2021-12-01 大陸商浙江商湯科技開發有限公司 Positioning method and device based on shared map, electronic equipment and computer readable storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021124286A1 (en) 2019-12-20 2021-06-24 Niantic, Inc. Location determination and mapping with 3d line junctions

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130147686A1 (en) * 2011-12-12 2013-06-13 John Clavin Connecting Head Mounted Displays To External Displays And Other Communication Networks
TW201706840A (en) * 2015-06-12 2017-02-16 英特爾股份有限公司 Facilitating dynamic runtime transformation of graphics processing commands for improved graphics performance at computing devices
CN106663411A (en) * 2014-11-16 2017-05-10 易欧耐特感知公司 Systems and methods for augmented reality preparation, processing, and application
CN106873760A (en) * 2015-12-14 2017-06-20 技嘉科技股份有限公司 Portable virtual reality system
CN106937531A (en) * 2014-06-14 2017-07-07 奇跃公司 Method and system for producing virtual and augmented reality
TWI633500B (en) * 2017-12-27 2018-08-21 中華電信股份有限公司 Augmented reality application generation system and method


Also Published As

Publication number Publication date
TW201939105A (en) 2019-10-01

Similar Documents

Publication Publication Date Title
CN110047104B (en) Object detection and tracking method, head-mounted display device, and storage medium
CN109146965B (en) Information processing apparatus, computer readable medium, and head-mounted display apparatus
WO2023071964A1 (en) Data processing method and apparatus, and electronic device and computer-readable storage medium
CN111738220A (en) Three-dimensional human body posture estimation method, device, equipment and medium
CN114245905A (en) Depth aware photo editing
RU2708027C1 (en) Method of transmitting motion of a subject from a video to an animated character
AU2021290132C1 (en) Presenting avatars in three-dimensional environments
JPWO2005119591A1 (en) Display control method and apparatus, program, and portable device
CN103578135A (en) Virtual image and real scene combined stage interaction integrating system and realizing method thereof
CN111488056B (en) Manipulating virtual objects using tracked physical objects
WO2023173668A1 (en) Input recognition method in virtual scene, device and storage medium
JP7483979B2 (en) Method and apparatus for playing multi-dimensional responsive images
WO2023168957A1 (en) Pose determination method and apparatus, electronic device, storage medium, and program
CN203630822U (en) Virtual image and real scene combined stage interaction integrating system
TWI648556B (en) Slam and gesture recognition method
CN116848556A (en) Enhancement of three-dimensional models using multi-view refinement
CN116580169B (en) Digital man driving method and device, electronic equipment and storage medium
WO2019114092A1 (en) Image augmented reality method and apparatus, and augmented reality display device and terminal
Liu et al. A robust hand tracking for gesture-based interaction of wearable computers
CN116686006A (en) Three-dimensional scan registration based on deformable model
JP6461394B1 (en) Image generating apparatus and image generating program
Jain et al. [POSTER] AirGestAR: Leveraging Deep Learning for Complex Hand Gestural Interaction with Frugal AR Devices
TWI779332B (en) Augmented reality system and display method for anchor virtual object thereof
US20240168565A1 (en) Single-handed gestures for reviewing virtual content
WO2022266556A1 (en) Methods and systems for motion prediction