JP2023552538A

JP2023552538A - Image processing methods and devices, electronic devices, storage media, and computer programs

Info

Publication number: JP2023552538A
Application number: JP2023533782A
Authority: JP
Inventors: 朋輝李; 静涛徐; 学峰范; 家華崔; 柳清張; 亮亮仲; 国洪李; 菲高
Original assignee: バイドゥオンラインネットワークテクノロジー（ペキン）カンパニーリミテッド
Priority date: 2021-09-09
Filing date: 2022-08-10
Publication date: 2023-12-18
Also published as: CN113793255A; WO2023035841A1

Abstract

According to embodiments of the present disclosure, a method, apparatus, device, storage medium, and program product for image processing are provided. The image processing method includes: acquiring input information for a two-dimensional image, the input information including at least depth information of the two-dimensional image; obtaining, using the two-dimensional image and the input information, a three-dimensional point cloud corresponding to each pixel of the two-dimensional image; and generating a three-dimensional image for the two-dimensional image based on the point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and on the set of neighboring points in the three-dimensional point cloud corresponding to a group of pixels adjacent to the target two-dimensional pixel. In this way, a three-dimensional model can be built from a two-dimensional photograph; the approach is broadly applicable and general, and can greatly improve the user's immersive and interactive experience.

Description

(Cross-reference to related applications)
This application claims priority to Chinese invention patent application No. 202111056671.6, titled "Method, apparatus, device, storage medium and program product for image processing" and filed on September 9, 2021, the entire contents of which are incorporated herein by reference.

Embodiments of the present disclosure relate mainly to the field of computers, and more specifically to image processing methods and apparatuses, devices, storage media, and program products.

Two-dimensional display is currently the most common form of image display. A two-dimensional image is generally a planar image: it has only four directions (left, right, up, and down) and no front or back, so it has area but no volume. A two-dimensional image may typically be an RGB image or a grayscale image. When users need a more immersive or interactive experience, a method for converting a two-dimensional image into a three-dimensional image is required.

Embodiments of the present disclosure provide a solution for image processing.

In a first aspect of the present disclosure, an image processing method is provided. The method includes: acquiring input information for a two-dimensional image, the input information including at least depth information of the two-dimensional image; obtaining, using the two-dimensional image and the input information, a three-dimensional point cloud corresponding to each pixel of the two-dimensional image; and generating a three-dimensional image for the two-dimensional image based on the point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and on the set of neighboring points in the three-dimensional point cloud corresponding to a group of pixels adjacent to the target two-dimensional pixel.

In a second aspect of the present disclosure, a video processing method is provided. The method includes: generating, based on the method of the first aspect of the present disclosure, a corresponding three-dimensional image for the two-dimensional image of each frame in a video stream; and generating a three-dimensional video stream using the generated three-dimensional images.

In a third aspect of the present disclosure, an image processing apparatus is provided. The apparatus includes: an input information acquisition module configured to acquire input information for a two-dimensional image, the input information including at least depth information of the two-dimensional image; a three-dimensional point cloud acquisition module configured to obtain, using the two-dimensional image and the input information, a three-dimensional point cloud corresponding to each pixel of the two-dimensional image; and a three-dimensional image generation module configured to generate a three-dimensional image for the two-dimensional image based on the point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and on the set of neighboring points in the three-dimensional point cloud corresponding to a group of pixels adjacent to the target two-dimensional pixel.

In a fourth aspect of the present disclosure, a video processing apparatus is provided. The apparatus includes: a second three-dimensional image generation module configured to generate, based on the method of the second aspect of the present disclosure, a corresponding three-dimensional image for the two-dimensional image of each frame in a video stream; and a three-dimensional video stream generation module configured to generate a three-dimensional video stream using the generated three-dimensional images.

In a fifth aspect of the present disclosure, an electronic device is provided. The device includes one or more processors and a storage device storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to the first aspect or the second aspect of the present disclosure.

In a sixth aspect of the present disclosure, a computer-readable storage medium is provided. A computer program is stored on the medium, and when the program is executed by a processor it implements the method according to the first aspect or the second aspect of the present disclosure.

In a seventh aspect of the present disclosure, a computer program product is provided. The product includes a computer program that, when executed by a processor, performs the method according to the first aspect or the second aspect of the present disclosure.

It should be understood that the content described in this summary is not intended to limit key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.

The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the drawings. In the drawings, identical or similar reference numerals denote identical or similar elements.

FIG. 1 shows a schematic diagram of an example environment in which multiple embodiments of the present disclosure can be implemented.
FIG. 2 shows a flowchart of a process of generating a three-dimensional image according to an embodiment of the present disclosure.
FIG. 3 shows a schematic diagram of a process of generating a three-dimensional image based on a point cloud, as part of generating a three-dimensional image according to some embodiments of the present disclosure.
FIG. 4 shows a flowchart of a process of generating a three-dimensional video stream according to some embodiments of the present disclosure.
FIG. 5 shows a schematic diagram of an apparatus for generating a three-dimensional image according to some embodiments of the present disclosure.
FIG. 6 shows a schematic block diagram of an apparatus for generating a three-dimensional video stream according to an embodiment of the present disclosure.
FIG. 7 shows a block diagram of a computing device capable of implementing multiple embodiments of the present disclosure.

Embodiments of the present invention are described in more detail below with reference to the drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments described here; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.

In describing the embodiments of the present disclosure, the term "including" and similar terms should be understood as open-ended inclusion, that is, "including but not limited to." The term "based on" should be understood as "based at least in part on." The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment." The terms "first," "second," and the like may refer to different or identical objects. Other explicit and implicit definitions may also be included below.

It should also be understood that, in this specification, the term "3D" may be equivalent to "three-dimensional," the term "2D" may be equivalent to "two-dimensional," and "three-dimensional image" may be equivalent to "three-dimensional image model."

As described above, a method for converting a two-dimensional image into a three-dimensional image is needed to provide a better immersive or interactive experience. Conventionally, there are two main kinds of solution. In the first, the design is based on three-dimensional software: a user can directly use a three-dimensional modeling tool such as Blender to turn a two-dimensional image into a three-dimensional image. However, this approach depends on an already existing three-dimensional model and applies only to scenes whose three-dimensional information is known, so its range of application is limited.

In another conventional solution, a two-dimensional image is typically converted into a three-dimensional image by hardware scanning based on the triangulation principle. However, such a solution depends on the corresponding hardware (based on structured light, ToF, and so on). An object to be measured of a given size has to be scanned multiple times before three-dimensional model reconstruction can be achieved. Moreover, for given scanning hardware, the larger the object to be reconstructed, the more scans are needed and the greater the corresponding amount of computation. In addition, because the solution is hardware-based, the object to be reconstructed must be scanned on site, which greatly limits how the solution can be used.

To at least partially solve the above problem and other potential problems, this specification provides a solution for generating a three-dimensional image from a two-dimensional image. With this solution, a three-dimensional model can be built from a two-dimensional photograph; the solution is broadly applicable and general, does not depend on additional hardware, and does not require on-site scanning. In addition, using the one-to-one correspondence between points in the three-dimensional point cloud and pixels of the two-dimensional image, the color information and texture information of the three-dimensional image can be rendered onto the three-dimensional image model. As a result, the color and texture information of a three-dimensional image generated with this solution is not lost, so a high-quality 3D model is generated and the user's immersive and interactive experience is greatly improved.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 shows a schematic diagram of an example environment 100 in which multiple embodiments of the present disclosure can be implemented. In the example environment 100, a two-dimensional image 110 can be input to a computing device 120, which performs the corresponding computation (for example, the depth computation and image inpainting described below) to generate a three-dimensional image 130.

In some embodiments, the two-dimensional image 110 may be a planar image as described above. Note that the two-dimensional image 110 may be an image with a foreground and a background, or an image with no foreground or no obvious foreground.

Note that foreground and background in a two-dimensional image are common concepts in the field of imaging. In general, the foreground of an image is the visual plane closest to the viewer, and the background is the plane in the composition that is farther from the viewer. For example, in a two-dimensional image of a person, the person is generally the foreground, and the rest of the image is usually called the background. Some images, such as an image of blue sky and white clouds, have no foreground or no obvious foreground; these are also a kind of two-dimensional image.

In different embodiments of the present disclosure, the computing device 120 may process different kinds of two-dimensional images 110 differently. This is described in detail later.

Correspondingly, the three-dimensional image 130 (also called a "three-dimensional image model") generally refers to a stereoscopic image having height, width, and depth. In the embodiments of the present disclosure, the three-dimensional image 130 may be an image, obtained from the two-dimensional image 110, that can further improve the user's immersive and interactive experience.

In some embodiments, the computing device 120 may include a desktop computer, a tablet computer, a personal digital assistant (PDA), a server, a host, or any other processor-enabled device capable of wired or wireless data communication; the present disclosure is not limited in this respect.

To make the solution for generating a three-dimensional image from a two-dimensional image provided by the embodiments of the present disclosure easier to understand, the embodiments are further described with reference to FIG. 2. FIG. 2 shows a flowchart of a process 200 for generating a three-dimensional image according to an embodiment of the present disclosure. The process 200 may be implemented by the computing device 120 of FIG. 1. For ease of explanation, the process 200 is described with reference to FIG. 1.

At block 210, input information for the two-dimensional image 110 is acquired; the input information includes at least depth information of the two-dimensional image 110. The depth information may include a depth image of the two-dimensional image 110, and the depth image can be obtained by feeding the two-dimensional image 110 into a depth model. It should be understood that the resolution of the depth image matches the resolution of the two-dimensional image 110.

In some embodiments, the depth model may be deployed on the computing device 120 or on another computing device different from the computing device 120, as long as the required depth information can be obtained as input to block 220; the present disclosure is not limited in this respect.

In some embodiments, the input information may further include a foreground mask and a background mask of the two-dimensional image 110 (collectively, a "foreground-background mask"), inpainted image information, and the like.

In such embodiments, the two-dimensional image 110 generally has a distinct foreground and background. The foreground mask and background mask can be obtained with a segmentation model, and the inpainted image (that is, the inpainted image information) can be obtained with an image inpainting model. The input information can thus include the depth image, the inpainted image, the foreground mask, and the background mask.
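The input information described for block 210 can be pictured as a small container whose layers all share the resolution of the two-dimensional image. The sketch below is illustrative only: the names `InputInfo`, `shape`, and `check_input` are hypothetical, and images are represented as plain row-major Python lists rather than any particular image library type.

```python
from dataclasses import dataclass
from typing import List, Optional

Grid = List[list]  # row-major 2D array, one entry per pixel


@dataclass
class InputInfo:
    """Hypothetical container for the inputs consumed by block 220."""
    depth: Grid                             # depth image (always present)
    foreground_mask: Optional[Grid] = None  # only for images with a clear foreground
    background_mask: Optional[Grid] = None
    inpainted: Optional[Grid] = None        # inpainted image, rows of RGB tuples


def shape(grid):
    """Return (height, width) of a row-major grid."""
    return (len(grid), len(grid[0]) if grid else 0)


def check_input(image, info):
    """Raise ValueError if a provided layer does not match the image resolution."""
    expected = shape(image)
    layers = {
        "depth": info.depth,
        "foreground_mask": info.foreground_mask,
        "background_mask": info.background_mask,
        "inpainted": info.inpainted,
    }
    for name, layer in layers.items():
        if layer is not None and shape(layer) != expected:
            raise ValueError(f"{name} has shape {shape(layer)}, expected {expected}")
```

Keeping the masks and the inpainted image optional mirrors the two cases in the text: images without an obvious foreground need only the depth image.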

Note that the segmentation model and the image inpainting model may be deployed on the computing device 120 or on another computing device different from the computing device 120, as long as the required depth information can be obtained as input to block 220; the present disclosure is not limited in this respect.

At block 220, the two-dimensional image 110 and the input information are used to obtain a three-dimensional point cloud corresponding to each pixel of the two-dimensional image 110. As can be seen, at block 220 the two-dimensional image 110 may itself be regarded as input information.

In some embodiments, the three-dimensional point cloud can be obtained by normalizing the two-dimensional image 110 and the input information. Normalization is a way of simplifying computation: a number is converted into a decimal between 0 and 1, or an expression with dimensions is transformed into a dimensionless expression, that is, a pure quantity. This makes the data easier to process and reduces the amount of computation.
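As a minimal illustration of the normalization idea, the sketch below rescales a list of numbers into the range 0 to 1 using min-max scaling. Min-max scaling is one common choice and is an assumption here; the text does not say which normalization is used, and the function name `normalize` is hypothetical.

```python
def normalize(values):
    """Min-max scale a flat list of numbers into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: there is no spread to scale, so map them to 0.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Example: three raw values become dimensionless fractions of their range.
print(normalize([2, 4, 6]))  # [0.0, 0.5, 1.0]
```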

In some embodiments, the planar coordinate system in which the pixels of the two-dimensional image 110 lie can be converted into a three-dimensional coordinate system, and the three-dimensional point cloud can be generated in the three-dimensional coordinate system from the pixels of the two-dimensional image 110. The planar coordinate system may include at least one of a pixel coordinate system or an image coordinate system, and the three-dimensional coordinate system may include at least one of a camera coordinate system or a world coordinate system. In this way, the transition from the two-dimensional image 110 to a three-dimensional image can be achieved quickly.

In the embodiment above in which the two-dimensional image 110 has no foreground or no obvious foreground, a three-dimensional transformation (for example, a mathematical projective transformation) can be applied directly to the depth image to generate the normalized three-dimensional point cloud. Note that this three-dimensional transformation is only an example; any method that achieves the conversion from two dimensions to three dimensions may be used, and the present disclosure is not limited in this respect.

In an embodiment in which the two-dimensional image 110 has a foreground and a background, the input information may, as described above, further include the inpainted image, the foreground mask, and the background mask. The following takes the pixel coordinate system as the planar coordinate system and the camera coordinate system as the three-dimensional coordinate system, and describes in detail how the three-dimensional point cloud corresponding to each pixel of the two-dimensional image 110 is obtained by the mathematical projective transformation of equations (1) to (6).

Here, Z_c is the height information of the pixel in the depth image; u and v are the coordinates in the two-dimensional pixel coordinate system; u_0 and v_0 are the image center; f is the focal length of the virtual camera, in mm; and d_x and d_y are the pixel sizes.

In this embodiment, mathematically transforming equations (1) and (2) yields the normalized focal lengths f_x and f_y along the x axis and y axis shown in equations (3) and (4). Once the normalized focal lengths f_x and f_y have been obtained, equations (5) and (6) give, according to the pinhole imaging principle, the X-axis and Y-axis coordinates X_c and Y_c of the pixel in the pinhole-based camera coordinate system.
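Equations (1) to (6) themselves do not appear in this text. As a sketch only, assuming the standard pinhole camera model and the symbols defined above, one form consistent with the description would be the following; the exact published equations may differ.

```latex
\begin{align}
u &= \frac{f}{d_x}\,\frac{X_c}{Z_c} + u_0 \tag{1}\\
v &= \frac{f}{d_y}\,\frac{Y_c}{Z_c} + v_0 \tag{2}\\
f_x &= \frac{f}{d_x} \tag{3}\\
f_y &= \frac{f}{d_y} \tag{4}\\
X_c &= \frac{(u - u_0)\,Z_c}{f_x} \tag{5}\\
Y_c &= \frac{(v - v_0)\,Z_c}{f_y} \tag{6}
\end{align}
```

Under this reading, if d_x and d_y are taken in mm per pixel (an assumption), f_x and f_y come out in pixels, which is why they combine directly with the pixel offsets u - u_0 and v - v_0.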

Note that the mathematical transformation of equations (1) and (2) is only an example, and a person skilled in the art can adjust it as needed in practice. The way the coordinates X_c and Y_c are determined is likewise only schematic; a person skilled in the art can adjust equations (3) and (4) as needed and determine the normalized focal lengths in another way, and the present disclosure is not limited in this respect.

It should further be noted that equations (5) and (6) obtain the X-axis and Y-axis coordinates X_c and Y_c with u_0, v_0 as the image center. A person skilled in the art can use the position of any other pixel point to compute the coordinates of the pixel concerned, and the present disclosure is not limited in this respect.
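The back-projection of one pixel can be sketched as follows. This assumes the usual pinhole relations f_x = f / d_x, f_y = f / d_y, X_c = (u - u_0) Z_c / f_x and Y_c = (v - v_0) Z_c / f_y, which match the description but are not quoted from it; the function name `backproject` and its parameter order are hypothetical.

```python
def backproject(u, v, z_c, f, d_x, d_y, u0, v0):
    """Map pixel (u, v) with depth z_c to camera coordinates (X_c, Y_c, Z_c).

    Assumes a pinhole camera: f is the focal length, d_x and d_y are the
    pixel sizes in the same length unit as f, and (u0, v0) is the reference
    point (the image center in the text).
    """
    f_x = f / d_x  # normalized focal length along x, in pixels
    f_y = f / d_y  # normalized focal length along y, in pixels
    x_c = (u - u0) * z_c / f_x
    y_c = (v - v0) * z_c / f_y
    return (x_c, y_c, z_c)


# A pixel at the reference point lies on the optical axis.
print(backproject(10, 10, 2.0, f=4.0, d_x=0.5, d_y=0.5, u0=10, v0=10))
```

Passing a different `(u0, v0)` corresponds to the remark above that another pixel position may serve as the reference point.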

Furthermore, if the horizontal and vertical angles of view of the virtual camera are denoted φ_h and φ_v, the following can be obtained:

Combining equations (7) and (8), the image pixel coordinate system can be converted into the pinhole-based camera coordinate system; that is, once a particular rendering angle of view is set, a point cloud in the three-dimensional camera coordinate system can be generated from the pixels in the two-dimensional pixel coordinate system. Because the depth map ranges from 0 to 2^16 - 1, the depth map range is mapped to 0 to 1 to ensure generality. In this way, a three-dimensional image can be obtained from the two-dimensional image 110 through the simple steps above. By comparison, a point cloud acquired by hardware scanning is prone to missing points (NaN values), because deviations in the hardware are unavoidable; a point cloud obtained through deep learning has no such missing-point problem and can therefore present a higher-quality three-dimensional image model.
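Equations (7) and (8) are not reproduced in this text, so the sketch below relies on an assumed relation: for an image W pixels wide seen under a horizontal angle of view φ_h, the focal length in pixels is (W / 2) / tan(φ_h / 2), and likewise for the height and φ_v. It also assumes the principal point is the image center. The function names are hypothetical. The 16-bit depth values are mapped to the range 0 to 1 as described above.

```python
import math


def focal_from_fov(size_px, fov_deg):
    """Focal length in pixels for an image extent size_px seen under fov_deg."""
    return (size_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def depth_to_point_cloud(depth16, fov_h_deg, fov_v_deg):
    """Turn a 16-bit depth map (rows of ints) into one 3D point per pixel.

    Points are returned in row-major order, so the point for pixel (u, v)
    sits at index v * width + u (a one-to-one pixel/point correspondence).
    """
    height, width = len(depth16), len(depth16[0])
    f_x = focal_from_fov(width, fov_h_deg)
    f_y = focal_from_fov(height, fov_v_deg)
    u0, v0 = width / 2.0, height / 2.0  # assumed principal point: image center
    max_raw = 2 ** 16 - 1
    points = []
    for v, row in enumerate(depth16):
        for u, raw in enumerate(row):
            z = raw / max_raw  # map 0..65535 to 0..1
            points.append(((u - u0) * z / f_x, (v - v0) * z / f_y, z))
    return points
```

Because every pixel yields exactly one point, no entry of the resulting cloud is missing, which is the property the text contrasts with hardware-scanned point clouds.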

Note that the above way of converting the two-dimensional image 110 into a three-dimensional image is only an example; a person skilled in the art can carry out the conversion in any suitable way, or adjust the parameters mentioned above accordingly, and the present disclosure is not limited in this respect.

At block 230, a three-dimensional image for the two-dimensional image 110 is generated based on the point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and on the set of neighboring points in the three-dimensional point cloud corresponding to a group of pixels adjacent to the target two-dimensional pixel.

In this embodiment, the target two-dimensional pixel may be any pixel in the two-dimensional image 110, for example the pixel X shown in FIG. 3. The group of pixels is the set of pixels adjacent to the target two-dimensional pixel, for example the set of pixels A to H shown in FIG. 3. Because two-dimensional pixels and points in the three-dimensional point cloud correspond one to one, it will be understood that the points in the three-dimensional point cloud corresponding to the target two-dimensional pixel and to the adjacent group of pixels have a definite positional relationship in space. The three-dimensional image for the two-dimensional image 110 is generated based on the point in the three-dimensional point cloud corresponding to the target two-dimensional pixel and on the set of neighboring points in the three-dimensional point cloud corresponding to the adjacent group of pixels.
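The lookup from a target pixel to its neighboring points can be sketched as below. The placement of the labels A to H around X is an assumption inferred from the triangular sheets listed in the description of FIG. 3 (A, B, C in the row above; D and E to the left and right; F, G, H in the row below), and the point cloud is assumed to be stored in row-major order so that pixel (u, v) maps to index v * width + u. The names `NEIGHBOR_OFFSETS` and `neighbor_indices` are hypothetical.

```python
# Offsets (du, dv) for the assumed layout   A B C
#                                           D X E
#                                           F G H
NEIGHBOR_OFFSETS = {
    "A": (-1, -1), "B": (0, -1), "C": (1, -1),
    "D": (-1, 0),                "E": (1, 0),
    "F": (-1, 1),  "G": (0, 1),  "H": (1, 1),
}


def neighbor_indices(u, v, width, height):
    """Return {label: point-cloud index} for the in-bounds neighbors of pixel (u, v)."""
    result = {}
    for label, (du, dv) in NEIGHBOR_OFFSETS.items():
        nu, nv = u + du, v + dv
        if 0 <= nu < width and 0 <= nv < height:
            result[label] = nv * width + nu  # row-major index into the point cloud
    return result


# The center pixel of a 3x3 image has all eight neighbors; a corner has three.
print(neighbor_indices(1, 1, 3, 3))
print(neighbor_indices(0, 0, 3, 3))
```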

Example steps by which block 230 may be further implemented are described below with reference to FIG. 3. FIG. 3 shows a schematic diagram of a process 300 of generating a three-dimensional image based on a point cloud according to some embodiments of the present disclosure. In FIG. 3, as described above, the target two-dimensional pixel may be X, and the group of pixels adjacent to the target two-dimensional pixel may be the set of pixels A to H. Correspondingly, the set of neighboring points may be the set of points in the three-dimensional point cloud to which the pixels A to H correspond.

In some embodiments, a planar grid 301 for the target pixel and the group of pixels can be obtained based on the point in the three-dimensional point cloud corresponding to the target two-dimensional pixel X and on at least two points in the set of neighboring points, and the three-dimensional image for the two-dimensional image 110 can be generated based on the obtained planar grid 301.

In this embodiment, generating the planar grid 301 may include an encoding step and a sheeting step. That is, based on the correspondence between pixels and the point cloud, the point cloud in the three-dimensional coordinate system is encoded and formed into sheets, thereby generating the three-dimensional image model for the two-dimensional image. Three-dimensional image modeling can thus be achieved by means of the planar grid 301; in other words, three-dimensional modeling can be completed from a single two-dimensional image, with no dependence on additional hardware and no on-site scanning, which gives the approach high practical value.

In this embodiment, referring to FIG. 3, three points are generally sufficient to determine a plane. Sheeting (that is, planarization) can therefore be achieved based on the point in the three-dimensional point cloud corresponding to the target two-dimensional pixel X and any two of the points in the three-dimensional point cloud corresponding to the set of pixels. In this case, the sheet may be a triangular sheet. Correspondingly, the planar grid 301 may include at least one triangular grid.
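The statement that three points determine a plane can be illustrated with a minimal sketch. It is not taken from the disclosure; the function name `plane_from_points` and the tuple representation of points are assumptions made for illustration only.

```python
def plane_from_points(p0, p1, p2):
    """Return (normal, d) such that normal . p = d for the plane through three 3D points.

    Hypothetical helper: points are (x, y, z) tuples, for example the 3D point
    of the target pixel X and the 3D points of two pixels from its neighbor set.
    """
    # Two edge vectors that lie in the plane.
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # Their cross product is perpendicular to the plane.
    normal = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    if normal == (0, 0, 0):
        raise ValueError("points are collinear; no unique plane")
    d = sum(normal[i] * p0[i] for i in range(3))
    return normal, d


if __name__ == "__main__":
    print(plane_from_points((0, 0, 1), (1, 0, 1), (0, 1, 1)))
```

The collinearity check matters in practice: three points on one line do not define a unique plane, so they cannot form a triangular sheet.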

In one embodiment, more specifically, referring to FIG. 3, when a triangular sheet is formed, the pixels in the two-dimensional image 110 that correspond to the two selected points of the adjacent point set are adjacent to each other. For example, after the target two-dimensional pixel X is determined, the points in the 3D point cloud corresponding to pixel A and pixel B of the adjacent set of pixels are selected, and, based on the correspondence between pixels and the point cloud, pixel A, pixel X, and pixel B are sheeted in the 3D point cloud to obtain a triangular sheet 310.

Similarly, pixel X, pixel B, and pixel C are sheeted to obtain a triangular sheet 320; pixel X, pixel E, and pixel C to obtain a triangular sheet 330; pixel X, pixel E, and pixel H to obtain a triangular sheet 340; and pixel X, pixel G, and pixel H to obtain a triangular sheet 350. Continuing in the same way yields the planar grid 301, composed of eight complete triangular sheets. The method can further be extended to every pixel of the two-dimensional image 110 to obtain the three-dimensional image model. In this way, a lossless three-dimensional image can be obtained, which greatly improves the user's interactive and immersive experience.
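The eight-sheet construction around pixel X can be sketched as a triangle fan. This is a hedged illustration rather than the disclosed implementation: the 3x3 layout of pixels A-H is inferred from the order of the triangles listed above (FIG. 3 is not reproduced here), and the function name `triangle_fan` and the index rule `row * width + col` are assumptions.

```python
# Assumed 3x3 layout, inferred from the triangle list (not from FIG. 3 itself):
#   A B C
#   D X E
#   F G H
# Offsets are (row, col) relative to X, in ring order A, B, C, E, H, G, F, D.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def triangle_fan(row, col, width):
    """Return 8 triangles around pixel (row, col) as triples of point indices.

    The index idx = row * width + col addresses both a pixel and its 3D point,
    which mirrors the pixel-to-point correspondence described in the text.
    The caller must pass an interior pixel (all 8 neighbors inside the image).
    """
    center = row * width + col
    ring = [(row + dr) * width + (col + dc) for dr, dc in RING]
    # Pair each ring neighbor with the next one, wrapping around at the end,
    # so every triangle uses X plus two mutually adjacent neighbors.
    return [(center, ring[i], ring[(i + 1) % 8]) for i in range(8)]


if __name__ == "__main__":
    for tri in triangle_fan(1, 1, width=3):
        print(tri)
```

Under this assumed layout the first triangle is (X, A, B), matching triangular sheet 310, and the last one closes the ring with (X, D, A).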

It should be noted that the triangular sheeting process described above is only an example; those skilled in the art may implement three-dimensional image modeling in any other suitable manner, and the present disclosure is not limited in this respect. For example, when the accuracy requirement is relatively low, the modeling can be performed without using triangular sheets. That is, the pixels selected for the set need not be adjacent, so the planar grid 301 formed in this way can be composed of relatively few sheets. This sacrifices some accuracy but greatly reduces the amount of computation, and is applicable to three-dimensional image models with low accuracy requirements.
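One way to picture this accuracy-versus-computation trade-off is a regular triangulation whose vertices are taken only every `stride` pixels. This is a swapped-in illustration, not the disclosed procedure: it still uses triangles, it uses a two-triangles-per-cell layout instead of the eight-sheet fan, and `grid_triangles` and `stride` are hypothetical names.

```python
def grid_triangles(height, width, stride=1):
    """Triangulate a height x width pixel grid, two triangles per cell.

    stride=1 uses every pixel as a vertex; a larger stride uses non-adjacent
    pixels and so yields fewer, larger triangles (less accuracy, less work).
    Vertices are point indices idx = row * width + col.
    """
    triangles = []
    for r in range(0, height - stride, stride):
        for c in range(0, width - stride, stride):
            tl = r * width + c                       # top-left corner of the cell
            tr = r * width + c + stride              # top-right corner
            bl = (r + stride) * width + c            # bottom-left corner
            br = (r + stride) * width + c + stride   # bottom-right corner
            triangles.append((tl, bl, tr))
            triangles.append((tr, bl, br))
    return triangles


if __name__ == "__main__":
    print(len(grid_triangles(5, 5, stride=1)), "triangles at full resolution")
    print(len(grid_triangles(5, 5, stride=2)), "triangles with stride 2")
```

For a 5x5 image, moving from stride 1 to stride 2 reduces the mesh from 32 triangles to 8, which shows how quickly the sheet count falls when non-adjacent pixels are used.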

In some embodiments, referring to FIG. 3, the two-dimensional image 110 generally includes at least one of texture information and color information. In such embodiments, based on the correspondence between the pixels of the two-dimensional image 110 and the points in the three-dimensional point cloud, at least one of the texture information and the color information can be drawn onto the planar grid 301, and the drawn planar grid 301 can be used to display the three-dimensional image for the two-dimensional image 110. Texture information is a visual feature that reflects homogeneous phenomena in an image; it embodies the arrangement properties of surface structures that vary slowly or periodically across an object's surface. Unlike image features such as gray level and color, texture is expressed by the gray-level distribution of a pixel and its surrounding spatial neighborhood. Color information corresponds to image features such as the gray level and color of the image.

In this embodiment, there is a strict one-to-one correspondence between the depth image derived from the two-dimensional image 110 and the two-dimensional image 110 itself, which means that there is a one-to-one correspondence between the three-dimensional point cloud and the texture information and color information. Therefore, once the three-dimensional image model has been obtained, at least one of the texture information and the color information can be rendered onto it, yielding a complete three-dimensional image model. In practice, this step can be understood as pasting at least one of the texture information and the color information onto the corresponding sheets of the planar grid 301, that is, coloring and rendering the three-dimensional image model.
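Because each 3D point comes from exactly one pixel, coloring the mesh reduces to an index lookup. The sketch below is an illustration under assumptions, not the disclosed renderer: the image is taken to be a row-major list of rows of (r, g, b) tuples, and `flatten_colors` and `color_triangles` are hypothetical helper names.

```python
def flatten_colors(image):
    """Return per-point colors ordered by idx = row * width + col.

    The one-to-one pixel-to-point correspondence means colors[i] is the
    color of 3D point i, with no resampling or nearest-neighbor search.
    """
    return [pixel for row in image for pixel in row]


def color_triangles(triangles, colors):
    """Attach the three corner colors to each triangle of point indices."""
    return [(tri, tuple(colors[i] for i in tri)) for tri in triangles]


if __name__ == "__main__":
    image = [
        [(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)],
    ]
    colors = flatten_colors(image)
    for tri, corner_colors in color_triangles([(0, 2, 1), (1, 2, 3)], colors):
        print(tri, corner_colors)
```

A real renderer would typically interpolate these corner colors across each sheet or use texture coordinates; that step is outside the scope of this sketch.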

In solutions that acquire a point cloud by hardware scanning, the hardware inevitably introduces deviations, so the point cloud has a one-to-many correspondence with the texture information and color information; the texture information cannot correspond exactly one-to-one with the point cloud information, which ultimately degrades texture quality. In this embodiment, by contrast, the strong spatial correlation means that no texture information or color information is lost, yielding a high-quality three-dimensional image model that fully embodies the three-dimensional information of the two-dimensional image 110.

FIG. 4 shows a flowchart of a process 400 for generating a three-dimensional video stream according to some embodiments of the present disclosure. The process 400 may be implemented on the computing device 120 shown in FIG. 1 or on any other suitable computing device. For ease of explanation, the process 400 is described with reference to FIG. 1.

At block 410, a corresponding three-dimensional image is generated for the two-dimensional image 110 of each frame in a video stream. In some embodiments, the three-dimensional image can be generated from the two-dimensional image 110 using the method shown in FIG. 2 or any other suitable method. It should be understood that when three-dimensional images are needed for only some frames of the video stream, specific frames can be selected for three-dimensional image modeling; the present disclosure is not limited in this respect.

At block 420, the generated three-dimensional images are used to generate a three-dimensional video stream. Building the three-dimensional video stream on top of the generated three-dimensional images further improves the user's immersive and interactive experience.
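Blocks 410 and 420 can be sketched as a per-frame mapping followed by collection into a stream. This is a minimal illustration, not the disclosed implementation: `to_3d_stream`, `build_model`, and `keep` are hypothetical names, and `build_model` stands in for whichever single-image method (such as the one in FIG. 2) is used.

```python
from typing import Callable, Iterable, Iterator, Optional


def to_3d_stream(
    frames: Iterable[object],
    build_model: Callable[[object], object],
    keep: Optional[Callable[[int], bool]] = None,
) -> Iterator[object]:
    """Yield one 3D model per selected frame, in frame order.

    keep(index) optionally restricts modeling to specific frames; when it is
    None, every frame of the video stream is converted (block 410).
    """
    for index, frame in enumerate(frames):
        if keep is None or keep(index):
            yield build_model(frame)


if __name__ == "__main__":
    frames = ["frame0", "frame1", "frame2"]
    # Placeholder model builder: a real one would return a textured mesh.
    models = list(to_3d_stream(frames, lambda f: "model(" + f + ")"))
    print(models)
```

Using a generator keeps memory use bounded for long streams, since each frame's model can be consumed before the next one is built.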

FIG. 5 shows a schematic diagram of an apparatus 500 for generating a three-dimensional image according to some embodiments of the present disclosure. The apparatus 500 includes an input information acquisition module 510, a three-dimensional point cloud acquisition module 520, and a three-dimensional image generation module 530.

The input information acquisition module 510 is configured to acquire input information for the two-dimensional image 110, the input information including at least depth information of the two-dimensional image 110. As described above, the input information may further include at least one of a foreground mask of the two-dimensional image, a background mask of the two-dimensional image, and corrected image information of the two-dimensional image.

The three-dimensional point cloud acquisition module 520 is configured to acquire, using the two-dimensional image 110 and the input information, a three-dimensional point cloud corresponding to the pixels of the two-dimensional image 110.

In some embodiments, the three-dimensional point cloud acquisition module 520 includes a normalization processing module configured to obtain the three-dimensional point cloud by performing normalization processing on the two-dimensional image and the input information.

In some embodiments, the three-dimensional point cloud acquisition module 520 further includes a three-dimensional coordinate system conversion module and a three-dimensional point cloud generation module. The three-dimensional coordinate system conversion module is configured to convert the planar coordinate system in which the pixels of the two-dimensional image 110 are located into a three-dimensional coordinate system, and the three-dimensional point cloud generation module is configured to generate the three-dimensional point cloud in the three-dimensional coordinate system based on the pixels of the two-dimensional image 110.

In some embodiments, the planar coordinate system may include at least one of a pixel coordinate system and an image coordinate system, and the three-dimensional coordinate system may include at least one of a camera coordinate system and a world coordinate system.

The three-dimensional image generation module 530 is configured to generate a three-dimensional image for the two-dimensional image 110 based on the point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and the adjacent point set in the three-dimensional point cloud corresponding to a set of pixels adjacent to the target two-dimensional pixel.

In some embodiments, the three-dimensional image generation module 530 further includes a planar grid acquisition module and a three-dimensional image generation sub-module. The planar grid acquisition module is configured to obtain a planar grid for the target pixel and the set of pixels based on the point in the three-dimensional point cloud corresponding to the target two-dimensional pixel and at least two points in the adjacent point set, and the three-dimensional image generation sub-module is configured to generate the three-dimensional image for the two-dimensional image 110 based on the obtained planar grid.

In some embodiments, the pixels corresponding to the at least two points in the adjacent point set may be adjacent to each other.

In some embodiments, the three-dimensional image generation module 530 further includes a planar grid drawing module and a three-dimensional image display module. The planar grid drawing module is configured to draw at least one of texture information and color information onto the planar grid based on the correspondence between the pixels of the two-dimensional image 110 and the points in the three-dimensional point cloud, and the three-dimensional image display module is configured to display the three-dimensional image for the two-dimensional image 110 using the drawn planar grid.

FIG. 6 is a schematic block diagram of an apparatus 600 for generating a three-dimensional video stream according to an embodiment of the present disclosure. The apparatus 600 includes a second three-dimensional image generation module 610 and a three-dimensional video stream generation module 620. The apparatus 600 may be implemented in the computing device 120 shown in FIG. 1 or in any other suitable device. For ease of explanation, the apparatus 600 is described with reference to FIG. 1.

The second three-dimensional image generation module 610 is configured to generate a corresponding three-dimensional image for the two-dimensional image 110 of each frame in a video stream. The three-dimensional images may be generated by the apparatus 500 described above.

The three-dimensional video stream generation module 620 is configured to generate a three-dimensional video stream using the generated three-dimensional images. Building the three-dimensional video stream on top of the generated three-dimensional images further improves the user's immersive and interactive experience.

FIG. 7 shows a block diagram of a computing device 700 capable of implementing embodiments of the present disclosure. The computing device 700 can be used to implement the computing device 120 of FIG. 1. As shown, the computing device 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) 702 or loaded from a storage unit 708 into a random access memory (RAM) 703. The RAM 703 can also store various programs and data required for the operation of the computing device 700. The CPU 701, the ROM 702, and the RAM 703 are connected to one another by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

A plurality of components of the computing device 700 are connected to the I/O interface 705, including: an input unit 706, such as a keyboard or a mouse; an output unit 707, such as various types of displays and speakers; the storage unit 708, such as a magnetic disk or an optical disk; and a communication unit 709, such as a network card, a modem, or a wireless communication transceiver. The communication unit 709 allows the computing device 700 to exchange information and data with other devices via a computer network such as the Internet and/or various telecommunication networks.

The CPU 701 performs the methods and processing described above, for example one or more of the process 200 and the process 400. For example, in some embodiments, one or more of the process 200 and the process 400 may be implemented as a computer software program that is temporarily contained in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the computing device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the CPU 701, one or more steps of the process 200 and the process 400 described above can be performed. Alternatively, in other embodiments, the CPU 701 may be configured to perform one or more of the process 200 and the process 400 in any other suitable manner (for example, by means of firmware).

The functions described herein may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and the like.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions and operations specified in the flowcharts and/or block diagrams to be carried out. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in combination with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, in order to achieve the desired results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above description includes several specific implementation details, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments may also be implemented in combination in a single implementation. Conversely, various features described in the context of a single implementation may also be implemented in multiple implementations, separately or in any suitable sub-combination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.


Embodiments of the present disclosure relate mainly to the field of computers, and more specifically to image processing methods and apparatuses, electronic devices, storage media, and computer programs.

In a seventh aspect of the present disclosure, a computer program is provided which, when executed by a processor, performs the method according to the first or second aspect of the present disclosure.

Here, Z_c is the height information of the pixel in the depth image; u and v are the coordinate positions in the two-dimensional pixel coordinate system; u_0 and v_0 are the image center; f is the focal length of the virtual camera, in mm; and d_x and d_y are the pixel sizes.

In this embodiment, the normalized focal lengths f_x and f_y along the x-axis and y-axis, shown in equations (3) and (4), can be obtained by mathematically transforming equations (1) and (2). Once f_x and f_y have been obtained, the X-axis and Y-axis coordinates X_c and Y_c of a pixel in the camera coordinate system can be obtained from equations (5) and (6) according to the pinhole imaging principle.

Note that the mathematical transformation of equations (1) and (2) is only an example, and those skilled in the art may adjust it as actually needed. The way of determining the coordinates X_c and Y_c is likewise illustrative; those skilled in the art may adjust equations (3) and (4) as needed and determine the normalized focal lengths in other ways, and the present disclosure is not limited in this respect.

It should further be noted that equations (5) and (6) obtain the X-axis and Y-axis coordinates X_c and Y_c with (u_0, v_0) as the image center. Those skilled in the art may use the position of any other pixel to calculate the coordinates of the relevant pixels, and the present disclosure is not limited in this respect.
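The back-projection described by these symbols can be sketched with the standard pinhole camera relations. The equations of the disclosure are not reproduced in this section, so the formulas in the sketch are assumptions based on the usual pinhole model and may differ in form from equations (1) to (6); the function name `backproject` is hypothetical.

```python
def backproject(u, v, z_c, f, d_x, d_y, u_0, v_0):
    """Map pixel (u, v) with depth z_c to camera coordinates (X_c, Y_c, Z_c).

    Assumed standard pinhole relations (not copied from the disclosure):
        f_x = f / d_x,             f_y = f / d_y
        X_c = (u - u_0) * Z_c / f_x
        Y_c = (v - v_0) * Z_c / f_y
    f is the focal length in mm and d_x, d_y are the pixel sizes in mm,
    so f_x and f_y are focal lengths expressed in pixels.
    """
    f_x = f / d_x
    f_y = f / d_y
    x_c = (u - u_0) * z_c / f_x
    y_c = (v - v_0) * z_c / f_y
    return (x_c, y_c, z_c)


if __name__ == "__main__":
    # A pixel at the image center maps onto the optical axis.
    print(backproject(320, 240, 2.0, f=4.0, d_x=0.01, d_y=0.01, u_0=320, v_0=240))
    # A pixel 100 columns to the right of the center, at the same depth.
    print(backproject(420, 240, 2.0, f=4.0, d_x=0.01, d_y=0.01, u_0=320, v_0=240))
```

Applying this function to every pixel of the depth image yields one 3D point per pixel, which is the point cloud used in the later sheeting step.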

By combining equations (7) and (8), the image pixel coordinate system can be converted into the camera coordinate system based on pinhole imaging; that is, once a specific rendering viewing angle is set, a point cloud in the three-dimensional camera coordinate system can be generated from the pixels in the two-dimensional pixel coordinate system. Because the depth map range is 0 to 2^16 - 1, the depth map range is mapped to 0 to 1 to ensure generality. In this way, a three-dimensional image can be obtained from the two-dimensional image 110 through the simple steps above. With point clouds acquired by hardware scanning, the unavoidable deviations of the hardware tend to leave missing points (NaN values) in the point cloud; a point cloud acquired by deep learning has no such missing-point problem and can therefore present a higher-quality three-dimensional image model.
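The depth range mapping mentioned above can be sketched as a linear rescaling. The source states only that the range 0 to 2^16 - 1 is mapped to 0 to 1, so the linear formula, the list-of-rows depth representation, and the name `normalize_depth` are assumptions for illustration.

```python
MAX_16BIT = 2 ** 16 - 1  # largest value of an unsigned 16-bit depth sample


def normalize_depth(depth_rows):
    """Linearly map 16-bit depth values from [0, 65535] to [0.0, 1.0].

    depth_rows is assumed to be a list of rows of integer depth samples.
    """
    return [[value / MAX_16BIT for value in row] for row in depth_rows]


if __name__ == "__main__":
    print(normalize_depth([[0, 32768, 65535]]))
```

Working in a fixed 0 to 1 range makes the later coordinate conversion independent of the bit depth in which the depth map happened to be stored.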


Claims

An image processing method, comprising:
obtaining input information for a two-dimensional image, the input information including at least depth information of the two-dimensional image;
obtaining, using the two-dimensional image and the input information, a three-dimensional point cloud corresponding to the pixels of the two-dimensional image; and
generating a three-dimensional image for the two-dimensional image based on a point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and an adjacent point set in the three-dimensional point cloud corresponding to a set of pixels adjacent to the target two-dimensional pixel.

The image processing method according to claim 1, wherein generating the three-dimensional image for the two-dimensional image comprises:
obtaining a planar grid for the target pixel and the set of pixels based on the point in the three-dimensional point cloud corresponding to the target two-dimensional pixel and at least two points in the adjacent point set; and
generating the three-dimensional image for the two-dimensional image based on the obtained planar grid.

The image processing method according to claim 2, wherein the pixels corresponding to the at least two points in the set of adjacent points are adjacent to each other.

The image processing method according to any one of claims 1 to 3, wherein the input information further includes at least one of:
a foreground mask of the two-dimensional image;
a background mask of the two-dimensional image; and
corrected image information of the two-dimensional image.

The image processing method according to any one of claims 1 to 3, wherein obtaining the three-dimensional point cloud corresponding to each pixel of the two-dimensional image includes:
obtaining the three-dimensional point cloud by performing normalization processing on the two-dimensional image and the input information.

The image processing method according to any one of claims 1 to 3, wherein obtaining the three-dimensional point cloud corresponding to each pixel of the two-dimensional image includes:
converting a planar coordinate system in which the pixels of the two-dimensional image are located into a three-dimensional coordinate system; and
generating the three-dimensional point cloud in the three-dimensional coordinate system based on the pixels of the two-dimensional image.

The image processing method according to claim 6, wherein the planar coordinate system includes at least one of a pixel coordinate system and an image coordinate system, and the three-dimensional coordinate system includes at least one of a camera coordinate system and a world coordinate system.
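
The conversion from a pixel coordinate system to a camera coordinate system can be illustrated with a minimal sketch. It assumes a pinhole camera model, which the claims do not specify; the function name `backproject` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject(depth: List[List[float]],
                fx: float, fy: float,
                cx: float, cy: float) -> List[List[Point3D]]:
    """Map each pixel (u, v) with depth z to a camera-frame point (x, y, z).

    Assumption: pinhole model with focal lengths (fx, fy) and principal
    point (cx, cy), all expressed in pixels.
    """
    cloud: List[List[Point3D]] = []
    for v, row in enumerate(depth):
        cloud_row: List[Point3D] = []
        for u, z in enumerate(row):
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            cloud_row.append((x, y, z))
        cloud.append(cloud_row)
    return cloud


# A 2x2 depth map: one 3D point is produced per pixel.
depth = [[2.0, 2.0],
         [4.0, 4.0]]
cloud = backproject(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Because the output keeps the row and column layout of the input, the correspondence between each pixel and its three-dimensional point stays implicit in the indices.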

The image processing method according to claim 2 or 3, wherein the planar grid includes a triangular grid.
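
One possible way to build a triangular grid from a target pixel and its adjacent pixels is sketched below. The claims only require a point for the target pixel plus at least two neighboring points; splitting every 2x2 pixel block into two triangles is an assumption made here for illustration, and the function name `triangulate` and the row-major vertex indexing are hypothetical.

```python
from typing import List, Tuple

Triangle = Tuple[int, int, int]


def triangulate(width: int, height: int) -> List[Triangle]:
    """Return triangles as index triples into a row-major list of points.

    Assumption: the point for pixel (u, v) is stored at index v * width + u.
    Each 2x2 block of adjacent pixels yields two triangles.
    """
    triangles: List[Triangle] = []
    for v in range(height - 1):
        for u in range(width - 1):
            i00 = v * width + u   # target pixel
            i10 = i00 + 1         # right neighbor
            i01 = i00 + width     # lower neighbor
            i11 = i01 + 1         # lower-right neighbor
            triangles.append((i00, i10, i01))
            triangles.append((i10, i11, i01))
    return triangles


# A 2x2 image produces exactly two triangles sharing one edge.
faces = triangulate(2, 2)
```

In this sketch the two neighbors used with the target pixel are themselves adjacent in the 8-neighborhood sense, which is one reading of the adjacency condition in claim 3.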

The image processing method according to any one of claims 1 to 3 or 7, wherein the two-dimensional image includes at least one of texture information and color information, and generating the three-dimensional image for the two-dimensional image further includes:
drawing at least one of the texture information and the color information onto the planar grid based on the correspondence between the pixels of the two-dimensional image and the points in the three-dimensional point cloud; and
displaying the three-dimensional image for the two-dimensional image using the drawn planar grid.
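
The pixel-to-point correspondence used for drawing color onto the grid can be sketched as per-vertex coloring. This is a simplified stand-in for texture mapping, not the rendering procedure of the disclosure; the function name `color_vertices` and the `(x, y, z, r, g, b)` vertex layout are assumptions.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Color = Tuple[int, int, int]
ColoredVertex = Tuple[float, float, float, int, int, int]


def color_vertices(cloud: List[List[Point3D]],
                   image: List[List[Color]]) -> List[ColoredVertex]:
    """Attach to each 3D point the color of the pixel it was generated from.

    Assumption: cloud[v][u] corresponds to image[v][u]; the result is
    flattened in row-major order.
    """
    vertices: List[ColoredVertex] = []
    for cloud_row, image_row in zip(cloud, image):
        for (x, y, z), (r, g, b) in zip(cloud_row, image_row):
            vertices.append((x, y, z, r, g, b))
    return vertices


# One image row with a red pixel and a green pixel.
cloud = [[(0.0, 0.0, 1.0), (1.0, 0.0, 1.5)]]
image = [[(255, 0, 0), (0, 255, 0)]]
vertices = color_vertices(cloud, image)
```

A renderer would then interpolate these vertex colors across each triangle of the grid; a full texture-mapping pipeline would instead store texture coordinates per vertex.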

A video processing method comprising:
generating, based on the image processing method according to any one of claims 1 to 9, a corresponding three-dimensional image for the two-dimensional image of each frame in a video stream; and
generating a three-dimensional video stream using the generated three-dimensional images.

An image processing device comprising:
an input information acquisition module configured to acquire input information for a two-dimensional image, the input information including at least depth information of the two-dimensional image;
a three-dimensional point cloud acquisition module configured to use the two-dimensional image and the input information to acquire a three-dimensional point cloud corresponding to each pixel of the two-dimensional image; and
a three-dimensional image generation module configured to generate a three-dimensional image for the two-dimensional image based on a point in the three-dimensional point cloud corresponding to a target two-dimensional pixel and a set of adjacent points, in the three-dimensional point cloud, corresponding to a set of pixels adjacent to the target two-dimensional pixel.

The image processing device according to claim 11, wherein the three-dimensional image generation module includes:
a planar grid acquisition module configured to obtain a planar grid for the target two-dimensional pixel and the set of pixels based on the point in the three-dimensional point cloud corresponding to the target two-dimensional pixel and at least two points in the set of adjacent points; and
a three-dimensional image generation sub-module configured to generate the three-dimensional image for the two-dimensional image based on the obtained planar grid.

The image processing device according to claim 12, wherein the pixels corresponding to the at least two points in the set of adjacent points are adjacent to each other.

The image processing device according to any one of claims 11 to 13, wherein the input information further includes at least one of:
a foreground mask of the two-dimensional image;
a background mask of the two-dimensional image; and
corrected image information of the two-dimensional image.

The image processing device according to any one of claims 11 to 13, wherein the three-dimensional point cloud acquisition module includes:
a normalization processing module configured to obtain the three-dimensional point cloud by performing normalization processing on the two-dimensional image and the input information.

The image processing device according to any one of claims 11 to 13, wherein the three-dimensional point cloud acquisition module includes:
a three-dimensional coordinate system conversion module configured to convert a planar coordinate system in which the pixels of the two-dimensional image are located into a three-dimensional coordinate system; and
a three-dimensional point cloud generation module configured to generate the three-dimensional point cloud in the three-dimensional coordinate system based on the pixels of the two-dimensional image.

The image processing device according to claim 16, wherein the planar coordinate system includes at least one of a pixel coordinate system and an image coordinate system, and the three-dimensional coordinate system includes at least one of a camera coordinate system and a world coordinate system.

The image processing device according to claim 12 or 13, wherein the planar grid includes a triangular grid.

The device according to any one of claims 11 to 13 or 17, wherein the two-dimensional image includes at least one of texture information and color information, and the three-dimensional image generation module further includes:
a planar grid drawing module configured to draw at least one of the texture information and the color information onto the planar grid based on the correspondence between the pixels of the two-dimensional image and the points in the three-dimensional point cloud; and
a three-dimensional image display module configured to display the three-dimensional image for the two-dimensional image using the drawn planar grid.

A video processing device comprising:
a second three-dimensional image generation module configured to generate, based on the image processing method according to any one of claims 1 to 10, a corresponding three-dimensional image for the two-dimensional image of each frame in a video stream; and
a three-dimensional video stream generation module configured to generate a three-dimensional video stream using the generated three-dimensional images.

An electronic device comprising:
one or more processors; and
a storage device that stores one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any one of claims 1 to 10.

A computer-readable storage medium storing a computer program that, when executed by a processor, implements the image processing method according to any one of claims 1 to 10.

A computer program product comprising a computer program that, when executed by a processor, performs the image processing method according to any one of claims 1 to 10.