TW202232953A - Model-based prediction for geometry point cloud compression - Google Patents

Model-based prediction for geometry point cloud compression

Info

Publication number
TW202232953A
TW202232953A
Authority
TW
Taiwan
Prior art keywords
point cloud
scene model
scene
model
decoding
Prior art date
Application number
TW110149211A
Other languages
Chinese (zh)
Inventor
Geert Van der Auwera
Adarsh Krishna Ramasubramonian
Bappaditya Ray
Luong Pham Van
Marta Karczewicz
Original Assignee
Qualcomm Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/562,121 external-priority patent/US20220215596A1/en
Application filed by Qualcomm Incorporated
Publication of TW202232953A publication Critical patent/TW202232953A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/001 Model-based coding, e.g. wire frame
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/004 Predictors, e.g. intraframe, interframe coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/40 Tree coding, e.g. quadtree, octree
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/537 Motion estimation other than block-based
    • H04N19/543 Motion estimation other than block-based using regions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Techniques are disclosed for coding point cloud data using a scene model. An example device for coding point cloud data includes a memory configured to store the point cloud data and one or more processors implemented in circuitry and communicatively coupled to the memory. The one or more processors are configured to determine or obtain a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data. The one or more processors are also configured to code a current frame of the point cloud data based on the scene model.

Description

Model-Based Prediction for Geometry Point Cloud Compression

This application claims the benefit of U.S. Provisional Application No. 63/133,622, filed January 4, 2021, and entitled "MODEL-BASED PREDICTION FOR GEOMETRY POINT CLOUD COMPRESSION," the entire contents of which are incorporated herein by reference.

The present disclosure relates to point cloud encoding and decoding.

A point cloud is a collection of points in a 3-dimensional space. The points may correspond to points on objects within the 3-dimensional space. Thus, a point cloud may be used to represent the physical content of the 3-dimensional space. Point clouds may be useful in a variety of situations. For example, point clouds may be used in the context of autonomous vehicles to represent the positions of objects on a roadway. In another example, point clouds may be used in the context of representing the physical content of an environment in order to position virtual objects in an augmented reality (AR) or mixed reality (MR) application. Point cloud compression is the process of encoding and decoding point clouds. Encoding point clouds may reduce the amount of data required for storage and transmission of point clouds.

In general, this disclosure describes techniques for modeling input point clouds. The techniques of this disclosure may be used for prediction of a current frame or of subsequent frames in a set of point cloud frames.

Using geometry point cloud compression (G-PCC), point clouds may be coded with or without a sensor model to improve coding efficiency. However, the compression may be performed without using scene-related information (e.g., the positions of objects). Additional coding efficiency may be gained by obtaining or otherwise determining a scene model and using the scene model to code the point cloud data.

In one example, this disclosure describes a method of coding point cloud data, the method including: determining or obtaining a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and coding a current frame of the point cloud data based on the scene model.

In one example, this disclosure describes a device for coding point cloud data, the device including: a memory configured to store the point cloud data; and one or more processors implemented in circuitry and communicatively coupled to the memory, the one or more processors being configured to: determine or obtain a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and code a current frame of the point cloud data based on the scene model.

In one example, this disclosure describes a non-transitory computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors to: determine or obtain a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and code a current frame of the point cloud data based on the scene model.

In one example, this disclosure describes a device for coding point cloud data, the device including: means for determining or obtaining a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and means for coding a current frame of the point cloud data based on the scene model.

In one example, this disclosure describes a method of coding point cloud data, the method including: determining a sensor model, the sensor model including at least one intrinsic parameter or extrinsic parameter of one or more sensors configured to obtain the point cloud data; and coding the point cloud data based on the sensor model.

In another example, this disclosure describes a device for coding point cloud data, the device including a memory configured to store the point cloud data and one or more processors implemented in circuitry and communicatively coupled to the memory, the one or more processors being configured to perform any of the techniques of this disclosure.

In another example, this disclosure describes a device for coding point cloud data, the device including one or more means for performing any of the techniques of this disclosure.

In yet another example, this disclosure describes a non-transitory computer-readable storage medium storing instructions that, when executed, cause one or more processors to perform any of the techniques of this disclosure.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

Point cloud encoding or decoding, e.g., geometry point cloud compression (G-PCC), may utilize octree-based or prediction-based geometry coding techniques (described below), optionally combined with prior knowledge about the sensor. For example, the prior knowledge may include the angles and positional offsets of the multiple lasers within a LIDAR sensor, which may enable significant coding efficiency gains for LIDAR-acquired point clouds. However, a point cloud encoder or decoder may have no information available about the three-dimensional (3D) scene corresponding to the point cloud. In some cases, the scene may be understood as providing a geometric context (e.g., context information) for coding the point cloud. In this regard, this disclosure proposes utilizing a (3D) scene model to improve coding efficiency. According to the techniques of this disclosure, a scene model may be obtained (e.g., received from an external device) or determined, and a G-PCC coder may use the scene model, alone or together with a sensor model, to improve the efficiency of coding point cloud positions and/or point cloud attributes. A point cloud may be defined as a set of points with positions (x_n, y_n, z_n), n = 1, ..., N, where N is the number of points in the point cloud, and optional attributes (a_{n,1}, ..., a_{n,D}), n = 1, ..., N, where D is the number of attributes for each point. However, the coding efficiency improvement depends on whether the obtained or derived scene model is an accurate representation of the scene formed by the point cloud. In this regard, it should be recognized that a scene model may be obtained or derived for use in coding the point clouds of multiple frames (e.g., two, three, ..., ten) or even of one (single) frame. The scene model may be a digital representation of a real-world scene. For example, the scene model may be mesh-based (including vertices with connectivity information), or another representation of the surfaces and objects within the scene, such as a plane representing a group of points within a delimited region of the point cloud. In some examples, an actual scene model (e.g., a city model) may be provided externally (e.g., from an external server) to the encoder and/or the decoder, or may be signaled by the encoder to the decoder as side information for a sequence of point cloud frames and used for coding the point cloud frames. In some examples, the scene model may be determined by the encoder using the current frame, and the scene model may be signaled and used as a predictor for the current frame (e.g., using intra prediction). In some examples, a signaled scene model from a previous frame may be used as a predictor for the current frame (e.g., using inter prediction). In some examples, the scene model may be estimated from previously reconstructed frames and used for prediction of the current frame (e.g., using inter prediction). In some cases, a previous scene model may be used to code the scene model for the current frame, where the scene model residual may be signaled by the encoder to the decoder and used to predict the current frame. The techniques of this disclosure may reduce the bandwidth needed to transmit, and the memory needed to store, the encoded point cloud.
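
To make the plane-based prediction idea concrete, the following Python sketch (an illustration of one possible realization, not the patent's normative procedure; the single-plane scene model and the helper names are assumptions) predicts each point by its projection onto a modeled plane, so that only a small scalar residual per point would remain to be quantized and entropy coded:

```python
import numpy as np

def plane_scene_model_predict(points, plane_point, plane_normal):
    """Use one plane of a scene model (e.g., a road surface) as a predictor.

    Each point is predicted by its orthogonal projection onto the plane;
    the signed distance along the plane normal is the prediction residual.
    Returns (predictions, residuals).
    """
    n = plane_normal / np.linalg.norm(plane_normal)
    residuals = (points - plane_point) @ n          # signed distances
    predictions = points - np.outer(residuals, n)   # projections onto plane
    return predictions, residuals

def plane_scene_model_reconstruct(predictions, residuals, plane_normal):
    """Decoder side: add the (dequantized) residuals back along the normal."""
    n = plane_normal / np.linalg.norm(plane_normal)
    return predictions + np.outer(residuals, n)

# Points scattered around the z = 0 "ground" plane of the scene model.
pts = np.array([[1.0, 2.0, 0.10], [3.0, 1.0, -0.05]])
pred, res = plane_scene_model_predict(pts, np.zeros(3), np.array([0.0, 0.0, 1.0]))
# res = [0.10, -0.05]: small residuals that compress much better than raw z.
```

The same sketch applies whether the plane comes from the current frame (intra prediction) or from a signaled or estimated model of a previous frame (inter prediction), as described above.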

FIG. 1 is a block diagram illustrating an example encoding and decoding system 100 that may perform the techniques of this disclosure. In general, the techniques of this disclosure are directed to coding (encoding and/or decoding) point cloud data, i.e., to support point cloud compression. In general, point cloud data includes any data for processing a point cloud. The coding may be effective in compressing and/or decompressing the point cloud data.

As shown in FIG. 1, system 100 includes a source device 102 and a destination device 116. Source device 102 provides encoded point cloud data to be decoded by destination device 116. In particular, in the example of FIG. 1, source device 102 provides the point cloud data to destination device 116 via a computer-readable medium 110. Source device 102 and destination device 116 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as smartphones, televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, terrestrial or marine vehicles, spacecraft, aircraft, robots, LIDAR (light detection and ranging) devices, satellites, or the like. In some cases, source device 102 and destination device 116 may be equipped for wireless communication.

In the example of FIG. 1, source device 102 includes a data source 104, a memory 106, a G-PCC encoder 200, and an output interface 108. Destination device 116 includes an input interface 122, a G-PCC decoder 300, a memory 120, and a data consumer 118. In accordance with this disclosure, G-PCC encoder 200 of source device 102 and G-PCC decoder 300 of destination device 116 may be configured to apply the techniques of this disclosure related to modeling input point clouds. Thus, source device 102 represents an example of an encoding device, while destination device 116 represents an example of a decoding device. In other examples, source device 102 and destination device 116 may include other components or arrangements. For example, source device 102 may receive data (e.g., point cloud data) from an internal or external source. Likewise, destination device 116 may interface with an external data consumer, rather than include a data consumer in the same device.

System 100 as shown in FIG. 1 is merely one example. In general, other digital encoding and/or decoding devices may perform the techniques of this disclosure related to modeling input point clouds. Source device 102 and destination device 116 are merely examples of such devices in which source device 102 generates coded data for transmission to destination device 116. This disclosure refers to a "coding" device as a device that performs coding (encoding and/or decoding) of data. Thus, G-PCC encoder 200 and G-PCC decoder 300 represent examples of coding devices, in particular, an encoder and a decoder, respectively. In some examples, source device 102 and destination device 116 may operate in a substantially symmetrical manner such that each of source device 102 and destination device 116 includes encoding and decoding components. Hence, system 100 may support one-way or two-way transmission between source device 102 and destination device 116, e.g., for streaming, playback, broadcasting, telephony, navigation, and other applications.

In general, data source 104 represents a source of data (i.e., raw, unencoded point cloud data) and may provide a sequential series of "frames" of the data to G-PCC encoder 200, which encodes the data for the frames. Data source 104 of source device 102 may include a point cloud capture device, such as any of a variety of cameras or sensors, e.g., a 3D scanner or a LIDAR device, one or more video cameras, an archive containing previously captured data, and/or a data feed interface to receive data from a data content provider. Alternatively or additionally, point cloud data may be computer-generated from scanner, camera, sensor, or other data. For example, data source 104 may generate computer graphics-based data as the source data, or produce a combination of live data, archived data, and computer-generated data. In each case, G-PCC encoder 200 encodes the captured, pre-captured, or computer-generated data. G-PCC encoder 200 may rearrange the frames from the received order (sometimes referred to as "display order") into a coding order for coding. G-PCC encoder 200 may generate one or more bitstreams including the encoded data. Source device 102 may then output the encoded data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of destination device 116.

Memory 106 of source device 102 and memory 120 of destination device 116 may represent general-purpose memories. In some examples, memory 106 and memory 120 may store raw data, e.g., raw data from data source 104 and raw, decoded data from G-PCC decoder 300. Additionally or alternatively, memory 106 and memory 120 may store software instructions executable by, e.g., G-PCC encoder 200 and G-PCC decoder 300, respectively. Although memory 106 and memory 120 are shown separately from G-PCC encoder 200 and G-PCC decoder 300 in this example, it should be understood that G-PCC encoder 200 and G-PCC decoder 300 may also include internal memories for functionally similar or equivalent purposes. Furthermore, memory 106 and memory 120 may store encoded data, e.g., output from G-PCC encoder 200 and input to G-PCC decoder 300. In some examples, portions of memory 106 and memory 120 may be allocated as one or more buffers, e.g., to store raw, decoded, and/or encoded data. For instance, memory 106 and memory 120 may store data representing a point cloud.

Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded data from source device 102 to destination device 116. In one example, computer-readable medium 110 represents a communication medium that enables source device 102 to transmit encoded data directly to destination device 116 in real time, e.g., via a radio frequency network or a computer-based network. Output interface 108 may modulate a transmission signal including the encoded data, and input interface 122 may demodulate the received transmission signal, according to a communication standard, such as a wireless communication protocol. The communication medium may include any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 102 to destination device 116.

In some examples, source device 102 may output encoded data from output interface 108 to storage device 112. Similarly, destination device 116 may access encoded data from storage device 112 via input interface 122. Storage device 112 may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded data.

In some examples, source device 102 may output encoded data to file server 114 or another intermediate storage device that may store the encoded data generated by source device 102. Destination device 116 may access stored data from file server 114 via streaming or download. File server 114 may be any type of server device capable of storing encoded data and transmitting that encoded data to destination device 116. File server 114 may represent a web server (e.g., for a website), a File Transfer Protocol (FTP) server, a content delivery network device, or a network attached storage (NAS) device. Destination device 116 may access encoded data from file server 114 through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for accessing encoded data stored on file server 114. File server 114 and input interface 122 may be configured to operate according to a streaming transmission protocol, a download transmission protocol, or a combination thereof.

Output interface 108 and input interface 122 may represent wireless transmitters/receivers, modems, wired networking components (e.g., Ethernet cards), wireless communication components that operate according to any of a variety of IEEE 802.11 standards, or other physical components. In examples where output interface 108 and input interface 122 comprise wireless components, output interface 108 and input interface 122 may be configured to transfer data, such as encoded data, according to a cellular communication standard, such as 4G, 4G-LTE (Long-Term Evolution), LTE Advanced, 5G, or the like. In some examples where output interface 108 comprises a wireless transmitter, output interface 108 and input interface 122 may be configured to transfer data, such as encoded data, according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, or the like. In some examples, source device 102 and/or destination device 116 may include respective system-on-a-chip (SoC) devices. For example, source device 102 may include an SoC device to perform the functionality attributed to G-PCC encoder 200 and/or output interface 108, and destination device 116 may include an SoC device to perform the functionality attributed to G-PCC decoder 300 and/or input interface 122.

The techniques of this disclosure may be applied to encoding and decoding in support of any of a variety of applications, such as communication between autonomous vehicles, communication between scanners, cameras, sensors, and processing devices such as local or remote servers, geographic mapping, or other applications.

Input interface 122 of destination device 116 receives an encoded bitstream from computer-readable medium 110 (e.g., a communication medium, storage device 112, file server 114, or the like). The encoded bitstream may include signaling information defined by G-PCC encoder 200, which is also used by G-PCC decoder 300, such as syntax elements having values that describe characteristics and/or processing of coded units (e.g., slices, pictures, groups of pictures, sequences, or the like). Data consumer 118 uses the decoded data. For example, data consumer 118 may use the decoded data to determine the locations of physical objects. In some examples, data consumer 118 may comprise a display to present imagery based on a point cloud.

G-PCC encoder 200 and G-PCC decoder 300 each may be implemented as any of a variety of suitable encoder and/or decoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of G-PCC encoder 200 and G-PCC decoder 300 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device. A device including G-PCC encoder 200 and/or G-PCC decoder 300 may comprise one or more integrated circuits, microprocessors, and/or other types of devices.

G-PCC encoder 200 and G-PCC decoder 300 may operate according to a coding standard, such as the video point cloud compression (V-PCC) standard or the geometry point cloud compression (G-PCC) standard. This disclosure may generally refer to coding (e.g., encoding and decoding) of pictures to include the process of encoding or decoding data. An encoded bitstream generally includes a series of values for syntax elements representative of coding decisions (e.g., coding modes).

This disclosure may generally refer to "signaling" certain information, such as syntax elements. The term "signaling" may generally refer to the communication of values for syntax elements and/or other data used to decode encoded data. That is, G-PCC encoder 200 may signal values for syntax elements in the bitstream. In general, signaling refers to generating a value in the bitstream. As noted above, source device 102 may transport the bitstream to destination device 116 substantially in real time, or not in real time, such as might occur when storing syntax elements to storage device 112 for later retrieval by destination device 116.

ISO/IEC MPEG (JTC 1/SC 29/WG 11) and, more recently, ISO/IEC MPEG 3DG (JTC 1/SC 29/WG 7) are studying the potential need for standardization of point cloud coding technology with a compression capability that significantly exceeds that of current approaches, and will target creating the standard. MPEG is carrying out this exploration activity in a collaborative effort known as the 3-Dimensional Graphics Group (3DG) to evaluate the compression technology designs proposed by its experts in this area.

Point cloud compression activities are categorized into two different approaches. The first approach is "video point cloud compression" (V-PCC), which segments the 3D object and projects the segments into multiple 2D planes (which are represented as "patches" in the 2D frames), which are then further coded by a legacy 2D video codec, such as a High Efficiency Video Coding (HEVC) (ITU-T H.265) codec. The second approach is "geometry-based point cloud compression" (G-PCC), which directly compresses the 3D geometry (i.e., the positions of a set of points in 3D space) and the associated attribute values (for each point associated with the 3D geometry). G-PCC addresses the compression of point clouds in both Category 1 (static point clouds) and Category 3 (dynamically acquired point clouds). A recent draft of the G-PCC standard is available in ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression, ISO/IEC JTC 1/SC29/WG 7 MDS19617, Teleconference, Oct. 2020, and a description of the codec is available in G-PCC Codec Description, ISO/IEC JTC 1/SC29/WG 7 MDS19620, Teleconference, Oct. 2020 (hereinafter, the "G-PCC Codec Description").

A point cloud contains a set of points in a 3D space, and may have attributes associated with the points. The attributes may be color information such as R, G, B or Y, Cb, Cr, or reflectance information, or other attributes. Point clouds may be captured by a variety of cameras or sensors, such as LIDAR sensors and 3D scanners, and may also be computer-generated. Point cloud data is used in a variety of applications including, but not limited to, construction (modeling), graphics (3D models for visualizing and animation), and the automotive industry (LIDAR sensors used to aid navigation).
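
As a minimal illustration of this data layout (not taken from the patent), a point cloud with N points and D optional attributes per point can be held as two parallel arrays:

```python
import numpy as np

# N points in 3D space: one row per point position (x_n, y_n, z_n).
positions = np.array([
    [10.0, 4.0, 1.2],
    [10.5, 4.1, 1.3],
    [11.0, 3.9, 1.1],
])

# D = 4 optional attributes per point: here R, G, B color plus reflectance.
attributes = np.array([
    [255, 128, 0, 0.7],
    [250, 130, 5, 0.6],
    [248, 125, 2, 0.8],
])

assert positions.shape[0] == attributes.shape[0]  # one attribute row per point
```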

The 3D space occupied by a point cloud may be enclosed by a virtual bounding box. The positions of the points in the bounding box may be represented with a certain precision; therefore, the positions of one or more points may be quantized based on the precision. At the smallest level, the bounding box is split into voxels, which are the smallest unit of space represented by a unit cube. A voxel in the bounding box may be associated with zero, one, or more points. The bounding box may be split into multiple cube/cuboid regions, which may be called tiles. Each tile may be coded into one or more slices. The partitioning of the bounding box into slices and tiles may be based on the number of points in each partition, or based on other considerations (e.g., a particular region may be coded as tiles). The slice regions may be further partitioned using splitting decisions similar to those in video codecs.
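
The following Python sketch (illustrative only, with assumed parameters) shows the kind of quantization this implies: floating-point positions are mapped to integer voxel coordinates within the bounding box, and points that fall into the same voxel may be merged:

```python
import numpy as np

def voxelize(points, bbox_min, voxel_size):
    """Quantize point positions to integer voxel coordinates.

    points:     (N, 3) float positions inside the virtual bounding box.
    bbox_min:   minimum corner of the bounding box.
    voxel_size: edge length of the unit-cube voxel (sets the precision).
    """
    vox = np.floor((points - bbox_min) / voxel_size).astype(np.int64)
    return np.unique(vox, axis=0)  # merge points that share a voxel

pts = np.array([[0.12, 0.48, 0.95],
                [0.14, 0.47, 0.94],   # lands in the same voxel as the first
                [2.53, 1.01, 0.02]])
print(voxelize(pts, bbox_min=np.zeros(3), voxel_size=0.1))
# Two voxels remain: the first two points were merged into one.
```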

FIG. 2 provides an overview of G-PCC encoder 200. FIG. 3 provides an overview of G-PCC decoder 300. The modules shown are logical, and do not necessarily correspond one-to-one to implemented code in the reference implementation of the G-PCC codec, e.g., the TMC13 test model software studied by ISO/IEC MPEG (JTC 1/SC 29/WG 11).

In both G-PCC encoder 200 and G-PCC decoder 300, point cloud positions are coded first, and the coding of point cloud attributes depends on the coded geometry. The geometry of a point cloud contains only the point positions. In some examples, G-PCC encoder 200 and G-PCC decoder 300 may use predictive geometry coding. For example, G-PCC encoder 200 may include a prediction geometry analysis unit 211 and G-PCC decoder 300 may include a prediction geometry synthesis unit 307 for performing predictive geometry coding. Predictive geometry coding is discussed in more detail later in this disclosure with respect to FIG. 5. In some examples, G-PCC encoder 200 or G-PCC decoder 300 may obtain a scene model 230 from an external device, such as a server. In some examples, the G-PCC encoder or the G-PCC decoder may determine scene model 230 or scene model 330. In cases where the G-PCC encoder or the G-PCC decoder determines scene model 230 or scene model 330, the scene model may be referred to as an estimated scene model or a determined scene model. In some examples, G-PCC encoder 200 may use scene model 230 and/or, optionally, a sensor model 234 when encoding point cloud positions and/or attributes. In some examples, G-PCC decoder 300 may use scene model 330 and/or, optionally, a sensor model 334 when decoding point cloud positions and/or attributes. In some examples, scene model 230 is the same as scene model 330. In some examples, sensor model 234 is the same as sensor model 334. Scene model 230 and/or, optionally, sensor model 234 may be stored in memory 240 of G-PCC encoder 200. Similarly, scene model 330 and/or, optionally, sensor model 334 may be stored in memory 340 of G-PCC decoder 300.

In FIG. 2, surface approximation analysis unit 212 and RAHT unit 218 are options typically used for Category 1 data. LoD generation unit 220 and lifting unit 222 are options typically used for Category 3 data. In FIG. 3, surface approximation synthesis unit 310 and RAHT unit 314 are options typically used for Category 1 data. LoD generation unit 316 and inverse lifting unit 318 are options typically used for Category 3 data. All the other modules may be common between Categories 1 and 3.

For octree coding, for Category 3 data, the compressed geometry is typically represented as an octree from the root all the way down to a leaf level of individual voxels. For Category 1 data, the compressed geometry is typically represented by a pruned octree (i.e., an octree from the root down to a leaf level of blocks larger than voxels) plus a model that approximates the surface within each leaf of the pruned octree. In this way, both Category 1 and Category 3 data share the octree coding mechanism, while Category 1 data may in addition approximate the voxels within each leaf with a surface model. The surface model used is a triangulation comprising 1-10 triangles per block, resulting in a triangle soup. The Category 1 geometry codec is therefore known as the Trisoup geometry codec, while the Category 3 geometry codec is known as the Octree geometry codec.

FIG. 4 is a conceptual diagram illustrating an example octree split for geometry coding according to the techniques of this disclosure. In the example shown in FIG. 4, an octree 400 may be split into a series of nodes. For example, each node may be a cubic node. At each node of the octree, for one or more of the node's child nodes, of which there may be up to eight, G-PCC encoder 200 may signal to G-PCC decoder 300 the occupancy of the node by the points of the point cloud, when the occupancy is not inferred by G-PCC decoder 300. Multiple neighborhoods are specified, including (a) nodes that share a face with the current octree node, (b) nodes that share a face, edge, or vertex with the current octree node, and so on. Within each neighborhood, the occupancy of a node and/or its children may be used to predict the occupancy of the current node or its children. For points that are sparsely distributed in certain nodes of the octree, the codec also supports a direct coding mode in which the 3D positions of points are encoded directly. A flag may be signaled to indicate that the direct mode is signaled. With the direct mode, positions of points in the point cloud may be coded directly without any compression. At the lowest level, the number of points associated with an octree node/leaf node may also be coded.
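
As a rough, non-normative illustration of the occupancy signaling described above, the occupancy of a node's up-to-eight children can be summarized in a single byte, with one bit per child octant:

```python
import numpy as np

def child_occupancy(points, node_min, node_size):
    """Return an 8-bit occupancy mask for the children of a cubic node.

    Bit k is set if at least one point falls into child octant k, where
    the octant index packs the x/y/z half-space tests into three bits.
    """
    half = node_size / 2.0
    octant = ((points - node_min) >= half).astype(int)  # (N, 3) of 0/1
    idx = octant[:, 0] * 4 + octant[:, 1] * 2 + octant[:, 2]
    mask = 0
    for k in np.unique(idx):
        mask |= 1 << int(k)
    return mask  # this byte is what an encoder would entropy-code

pts = np.array([[0.2, 0.1, 0.9], [0.8, 0.7, 0.6]])
print(bin(child_occupancy(pts, node_min=np.zeros(3), node_size=1.0)))
# 0b10000010: octants 1 and 7 are occupied.
```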

Once the geometry is coded, the attributes corresponding to the geometry points are coded. When there are multiple attribute points corresponding to one reconstructed/decoded geometry point, an attribute value may be derived that is representative of the reconstructed point.

There are three attribute coding methods in G-PCC: Region Adaptive Hierarchical Transform (RAHT) coding, interpolation-based hierarchical nearest-neighbour prediction (Predicting Transform), and interpolation-based hierarchical nearest-neighbour prediction with an update/lifting step (Lifting Transform). RAHT and Lifting are typically used for Category 1 data, while Predicting is typically used for Category 3 data. However, either method may be used for any data, and, just as with the geometry codecs in G-PCC, the attribute coding method used for coding the point cloud is specified in the bitstream.

The coding of the attributes may be conducted in levels of detail (LoD), where for each level of detail a finer representation of the point cloud attributes may be obtained. Each level of detail may be specified based on a distance metric from neighbouring nodes or based on a sampling distance.

At G-PCC encoder 200, the residuals obtained as the output of the coding method for the attributes are quantized. The residuals may be obtained by subtracting the attribute value from a prediction that is derived based on the points in the neighbourhood of the current point and based on the attribute values of previously encoded points. The quantized residuals may be coded using context-adaptive arithmetic coding.
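
The following Python fragment sketches this residual pipeline in its simplest form (a weighted-average prediction and uniform quantization are assumed for illustration; the context-adaptive arithmetic coding stage is omitted):

```python
import numpy as np

def encode_attribute(value, neighbor_values, neighbor_weights, qstep):
    """Quantize an attribute residual against a weighted-average prediction."""
    pred = np.dot(neighbor_weights, neighbor_values) / np.sum(neighbor_weights)
    residual = value - pred
    q = int(np.round(residual / qstep))  # this index is entropy-coded
    return q, pred

def decode_attribute(q, pred, qstep):
    """Reconstruct the attribute from the prediction and dequantized residual."""
    return pred + q * qstep

q, pred = encode_attribute(118.0, np.array([120.0, 115.0]),
                           np.array([2.0, 1.0]), qstep=2.0)
print(decode_attribute(q, pred, qstep=2.0))  # close to 118, up to quantization error
```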

In the example of FIG. 2, G-PCC encoder 200 may include a coordinate transform unit 202, a color transform unit 204, a voxelization unit 206, an attribute transfer unit 208, an octree analysis unit 210, a surface approximation analysis unit 212, an arithmetic encoding unit 214, a geometry reconstruction unit 216, a RAHT unit 218, a LoD generation unit 220, a lifting unit 222, a coefficient quantization unit 224, and an arithmetic encoding unit 226.

As shown in the example of FIG. 2, G-PCC encoder 200 may receive a set of positions and a set of attributes. The positions may include coordinates of points in a point cloud. The attributes may include information about the points in the point cloud, such as colors associated with the points in the point cloud.

Coordinate transform unit 202 may apply a transform to the coordinates of the points to transform the coordinates from an initial domain to a transform domain. This disclosure may refer to the transformed coordinates as transform coordinates. Color transform unit 204 may apply a transform to transform the color information of the attributes to a different domain. For example, color transform unit 204 may transform the color information from an RGB color space to a YCbCr color space.
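
For example, an RGB-to-YCbCr transform and its inverse can be expressed as a 3x3 matrix and its inverse; the BT.709 coefficients below are an assumed choice for illustration, since the text does not fix a particular matrix here:

```python
import numpy as np

# BT.709 RGB -> YCbCr conversion matrix (an assumed choice of coefficients).
M = np.array([
    [ 0.2126,  0.7152,  0.0722],   # Y
    [-0.1146, -0.3854,  0.5000],   # Cb
    [ 0.5000, -0.4542, -0.0458],   # Cr
])

def rgb_to_ycbcr(rgb):
    """Convert an (N, 3) array of RGB attributes to YCbCr."""
    return rgb @ M.T

def ycbcr_to_rgb(ycbcr):
    """Inverse conversion (applied at the decoder side)."""
    return ycbcr @ np.linalg.inv(M).T

colors = np.array([[0.8, 0.5, 0.2]])
print(ycbcr_to_rgb(rgb_to_ycbcr(colors)))  # round-trips to the input
```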

Furthermore, in the example of FIG. 2, voxelization unit 206 may voxelize the transform coordinates. Voxelization of the transform coordinates may include quantizing and removing some points of the point cloud. In other words, multiple points of the point cloud may be subsumed within a single "voxel," which may thereafter be treated in some respects as one point. Furthermore, octree analysis unit 210 may generate an octree based on the voxelized transform coordinates. Additionally, in the example of FIG. 2, surface approximation analysis unit 212 may analyze the points to potentially determine a surface representation of sets of the points. Arithmetic encoding unit 214 may entropy encode syntax elements representing the information of the octree and/or the surfaces determined by surface approximation analysis unit 212. G-PCC encoder 200 may output these syntax elements in a geometry bitstream.

Geometry reconstruction unit 216 may reconstruct transform coordinates of points in the point cloud based on the octree, data indicating the surfaces determined by surface approximation analysis unit 212, and/or other information. The number of transform coordinates reconstructed by geometry reconstruction unit 216 may differ from the original number of points of the point cloud because of voxelization and surface approximation. This disclosure may refer to the resulting points as reconstructed points. Attribute transfer unit 208 may transfer attributes of the original points of the point cloud to the reconstructed points of the point cloud.

Furthermore, RAHT unit 218 may apply RAHT coding to the attributes of the reconstructed points. In some examples, under RAHT, the attributes of a block of 2x2x2 point positions are taken and transformed along one direction to obtain four low (L) frequency nodes and four high (H) frequency nodes. Subsequently, the four low frequency nodes (L) are transformed in a second direction to obtain two low (LL) frequency nodes and two high (LH) frequency nodes. The two low frequency nodes (LL) are transformed along a third direction to obtain one low (LLL) frequency node and one high (LLH) frequency node. The low frequency node LLL corresponds to DC coefficients, and the high frequency nodes H, LH, and LLH correspond to AC coefficients. The transformation in each direction may be a 1D transform with two coefficient weights. The low frequency coefficients may be taken as coefficients of the 2x2x2 block for the next higher level of the RAHT transform, and the AC coefficients are encoded without changes; such transformations continue until the top root node. The tree traversal for encoding is from top to bottom and is used to compute the weights to be used for the coefficients; the transform order is from bottom to top. The coefficients may then be quantized and coded.
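
The 1D two-point transform with coefficient weights can be sketched as follows (a floating-point illustration of the weighted butterfly; the standardized RAHT uses a fixed-point formulation):

```python
import numpy as np

def raht_butterfly(a1, w1, a2, w2):
    """One RAHT step: merge two occupied nodes with weights w1, w2.

    Returns the low-frequency (DC-like) coefficient, which is carried to
    the next level with weight w1 + w2, and the high-frequency (AC)
    coefficient, which is quantized and entropy-coded.
    """
    w = w1 + w2
    b1, b2 = np.sqrt(w1 / w), np.sqrt(w2 / w)
    low = b1 * a1 + b2 * a2
    high = -b2 * a1 + b1 * a2
    return low, w, high

def raht_inverse(low, w1, w2, high):
    """Inverse butterfly used by the decoder (transpose of the 2x2 transform)."""
    w = w1 + w2
    b1, b2 = np.sqrt(w1 / w), np.sqrt(w2 / w)
    return b1 * low - b2 * high, b2 * low + b1 * high

low, w, high = raht_butterfly(100.0, 1, 102.0, 1)
print(raht_inverse(low, 1, 1, high))  # recovers (100.0, 102.0)
```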

Alternatively or additionally, LoD generation unit 220 and lifting unit 222 may apply LoD processing and lifting, respectively, to the attributes of the reconstructed points. LoD generation is used to split the attributes into different refinement levels. Each refinement level provides a refinement to the attributes of the point cloud. The first refinement level provides a coarse approximation and contains few points; the subsequent refinement level typically contains more points, and so on. The refinement levels may be constructed using a distance-based metric, or may also use one or more other classification criteria (e.g., subsampling from a particular order). Thus, all the reconstructed points may be included in a refinement level. Each level of detail is produced by taking the union of all points up to a particular refinement level: for example, LoD1 is obtained based on refinement level RL1, LoD2 is obtained based on RL1 and RL2, and LoDN is obtained by the union of RL1, RL2, ..., RLN. In some cases, LoD generation may be followed by a prediction scheme (e.g., the Predicting Transform), where the attributes associated with each point in the LoD are predicted from a weighted average of preceding points, and the residual is quantized and entropy coded. The lifting scheme builds on top of the Predicting Transform mechanism, where an update operator is used to update the coefficients and adaptive quantization of the coefficients is performed.
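
A simplified sketch of the distance-based refinement-level construction described above (a brute-force illustration; a real codec would use faster neighbor search and specified point orderings):

```python
import numpy as np

def build_refinement_levels(points, distances):
    """Split points into refinement levels using a distance criterion.

    distances: decreasing per-level minimum spacing; each level keeps the
    points at least that far from every point already selected.
    """
    remaining = list(range(len(points)))
    selected = []           # indices selected so far (their union forms the LoD)
    levels = []
    for d in distances:
        level = []
        for i in list(remaining):
            if all(np.linalg.norm(points[i] - points[j]) >= d for j in selected):
                level.append(i)
                selected.append(i)
                remaining.remove(i)
        levels.append(level)
    levels.append(remaining)  # final refinement level: everything left over
    return levels

pts = np.array([[0.0, 0, 0], [4.0, 0, 0], [1.0, 0, 0], [4.5, 0, 0]])
print(build_refinement_levels(pts, distances=[3.0, 1.0]))
# [[0, 1], [2], [3]]: coarse level first, finer levels add the rest.
```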

RAHT unit 218 and lifting unit 222 may generate coefficients based on the attributes. Coefficient quantization unit 224 may quantize the coefficients generated by RAHT unit 218 or lifting unit 222. Arithmetic encoding unit 226 may apply arithmetic coding to syntax elements representing the quantized coefficients. G-PCC encoder 200 may output these syntax elements in an attribute bitstream.

In the example of FIG. 3, G-PCC decoder 300 may include a geometry arithmetic decoding unit 302, an attribute arithmetic decoding unit 304, an octree synthesis unit 306, an inverse quantization unit 308, a surface approximation synthesis unit 310, a geometry reconstruction unit 312, a RAHT unit 314, a LoD generation unit 316, an inverse lifting unit 318, an inverse transform coordinate unit 320, and an inverse transform color unit 322.

G-PCC decoder 300 may obtain a geometry bitstream and an attribute bitstream. Geometry arithmetic decoding unit 302 of decoder 300 may apply arithmetic decoding (e.g., context-adaptive binary arithmetic coding (CABAC) or another type of arithmetic decoding) to syntax elements in the geometry bitstream. Similarly, attribute arithmetic decoding unit 304 may apply arithmetic decoding to syntax elements in the attribute bitstream.

Octree synthesis unit 306 may synthesize an octree based on syntax elements parsed from the geometry bitstream. Starting with the root node of the octree, the occupancy of each of the eight children nodes at each octree level is signaled in the bitstream. When the signaling indicates that a child node at a particular octree level is occupied, the occupancy of the children of that child node is signaled. The signaling of the nodes at each octree level is signaled before proceeding to the subsequent octree level. At the final level of the octree, each node corresponds to a voxel position; when a leaf node is occupied, one or more points may be specified to be occupying the voxel position. In some instances, some branches of the octree may terminate earlier than the final level due to quantization. In such cases, a leaf node is considered an occupied node that has no child nodes. In cases where surface approximation is used in the geometry bitstream, surface approximation synthesis unit 310 may determine a surface model based on syntax elements parsed from the geometry bitstream and based on the octree.
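
A simplified decoder-side sketch of this breadth-first traversal (illustrative only; the occupancy bytes here stand in for the arithmetically decoded syntax elements, and the child-index convention matches the encoder sketch above):

```python
from collections import deque

def decode_octree(read_occupancy, root, depth):
    """Breadth-first octree reconstruction from per-node occupancy bytes.

    read_occupancy: callable returning the next decoded 8-bit mask.
    root:           (x, y, z) of the root node in voxel units.
    depth:          number of subdivision levels down to single voxels.
    Returns the list of occupied voxel positions.
    """
    queue = deque([(root, depth)])
    voxels = []
    while queue:
        (x, y, z), level = queue.popleft()
        if level == 0:
            voxels.append((x, y, z))        # leaf: one occupied voxel
            continue
        mask = read_occupancy()             # one byte per occupied node
        half = 1 << (level - 1)
        for k in range(8):
            if mask & (1 << k):             # child k is occupied
                dx, dy, dz = (k >> 2) & 1, (k >> 1) & 1, k & 1
                queue.append(((x + dx * half, y + dy * half, z + dz * half),
                              level - 1))
    return voxels

# Two-level toy bitstream: root has children 0 and 7; each resolves to a voxel.
bits = iter([0b10000001, 0b00000001, 0b10000000])
print(decode_octree(lambda: next(bits), root=(0, 0, 0), depth=2))
# [(0, 0, 0), (3, 3, 3)]
```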

Furthermore, geometry reconstruction unit 312 may perform a reconstruction to determine the coordinates of points in the point cloud. For each position at a leaf node of the octree, geometry reconstruction unit 312 may reconstruct the node position by using the binary representation of the leaf node in the octree. At each respective leaf node, the number of points at the respective leaf node is signaled; this indicates the number of duplicate points at the same voxel position. When geometry quantization is used, the point positions are scaled for determining the reconstructed point position values.

Inverse transform coordinate unit 320 may apply an inverse transform to the reconstructed coordinates to convert the reconstructed coordinates (e.g., positions) of the points in the point cloud from the transform domain back into the initial domain. The positions of points in a point cloud may be in the floating point domain, but point positions in the G-PCC codec are coded in the integer domain. The inverse transform may be used to convert the positions back to the original domain.

Additionally, in the example of FIG. 3, the inverse quantization unit 308 may inverse quantize attribute values. The attribute values may be based on syntax elements obtained from the attribute bitstream (e.g., including syntax elements decoded by the attribute arithmetic decoding unit 304).

Depending on how the attribute values are encoded, the RAHT unit 314 may perform RAHT decoding to determine the color values of the points of the point cloud based on the inverse-quantized attribute values. RAHT decoding proceeds from the top of the tree to the bottom. At each level, the low- and high-frequency coefficients derived from the inverse quantization process are used to derive the constituent values. At the leaf nodes, the derived values correspond to the attribute values of the coefficients. The weight derivation process for the points is similar to the process used at the G-PCC encoder 200. Alternatively, the LoD generation unit 316 and the inverse lifting unit 318 may determine the color values for the points of the point cloud using a level-of-detail-based technique. The LoD generation unit 316 decodes each LoD, giving a progressively finer representation of the attributes of the points. With the prediction transform, the LoD generation unit 316 derives the prediction of a point from a weighted sum of points that were reconstructed in a previous LoD, or previously in the same LoD. The LoD generation unit 316 may add the prediction to the residual (obtained after inverse quantization) to obtain the reconstructed value of the attribute. When the lifting scheme is used, the LoD generation unit 316 may also include an update operator to update the coefficients used to derive the attribute values. In this case, the LoD generation unit 316 may also apply inverse adaptive quantization.
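Where the prediction transform is used, the per-point reconstruction described above amounts to a weighted prediction from already-reconstructed neighbors plus a decoded residual. The following is a minimal sketch under the assumption of inverse-distance weighting; the actual neighbor selection, weight computation, and fixed-point arithmetic follow the G-PCC specification and are not reproduced here.

```python
# Minimal sketch of the LoD prediction transform for one attribute
# (illustrative only; weights and neighbor search are simplified).
def predict_attribute(neighbor_values, neighbor_dists):
    # Inverse-distance weighting over already-reconstructed neighbors.
    weights = [1.0 / max(d, 1e-9) for d in neighbor_dists]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, neighbor_values)) / total

def reconstruct_attribute(neighbor_values, neighbor_dists, residual):
    # Reconstructed value = prediction + inverse-quantized residual.
    return predict_attribute(neighbor_values, neighbor_dists) + residual
```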

Furthermore, in the example of FIG. 3, the inverse transform colors unit 322 may apply an inverse color transform to the color values. The inverse color transform may be the inverse of the color transform applied by the color transform unit 204 of the G-PCC encoder 200. For example, the color transform unit 204 may transform color information from the RGB color space to the YCbCr color space. Accordingly, the inverse transform colors unit 322 may transform the color information from the YCbCr color space back to the RGB color space.

The various units of FIG. 2 and FIG. 3 are illustrated to assist with understanding the operations performed by the encoder 200 and the decoder 300. The units may be implemented as fixed-function circuits, programmable circuits, or a combination thereof. Fixed-function circuits refer to circuits that provide particular functionality and are preset with respect to the operations that can be performed. Programmable circuits refer to circuits that can be programmed to perform various tasks and provide flexible functionality in the operations that can be performed. For instance, programmable circuits may execute software or firmware that causes the programmable circuits to operate in the manner defined by the instructions of the software or firmware. Fixed-function circuits may execute software instructions (e.g., to receive parameters or output parameters), but the types of operations that the fixed-function circuits perform are generally immutable. In some examples, one or more of the units may be distinct circuit blocks (fixed-function or programmable), and in some examples, one or more of the units may be integrated circuits.

FIG. 5 is a conceptual diagram illustrating an example of a prediction tree. Predictive geometry coding was introduced as an alternative to octree geometry coding, wherein the nodes are arranged in a tree structure (which defines the prediction structure), and various prediction strategies are used to predict the coordinates of each node in the tree with respect to its predictors. FIG. 5 shows an example of a prediction tree, namely, a directed graph in which the arrows point in the prediction direction. Node 500 is the root vertex and has no predictor. Nodes 502 and 504 have two children each. Node 506 has three children. Nodes 508, 510, 512, 514, and 516 are leaf nodes, and these nodes have no children. The remaining nodes each have one child. Every node has at most one parent node.

Four prediction strategies are specified for each node, based on the node's parent (p0), grandparent (p1), and great-grandparent (p2): 1) no prediction/zero prediction (0); 2) delta prediction (p0); 3) linear prediction (2*p0 - p1); and 4) parallelogram prediction (2*p0 + p1 - p2).
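The four candidate predictors listed above can be written directly as code. The sketch below mirrors the formulas as given, treating p0, p1, and p2 as coordinate triples; it is illustrative and not the normative candidate derivation.

```python
# The four predictive-geometry candidates for a node, given its parent p0,
# grandparent p1, and great-grandparent p2 (each a 3-tuple of coordinates).
def prediction_candidates(p0, p1, p2):
    zero = (0, 0, 0)                                            # no prediction
    delta = p0                                                  # delta
    linear = tuple(2 * a - b for a, b in zip(p0, p1))           # 2*p0 - p1
    parallelogram = tuple(2 * a + b - c
                          for a, b, c in zip(p0, p1, p2))       # 2*p0 + p1 - p2
    return [zero, delta, linear, parallelogram]
```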

The G-PCC encoder 200 may employ any algorithm to generate the prediction tree; the algorithm used may be determined based on the application or use case, and several strategies may be used. Example strategies are described in the G-PCC codec description.

For each node, the G-PCC encoder 200 may encode the residual coordinate values in the bitstream starting from the root node (e.g., node 500) in a depth-first manner. Predictive geometry coding may be beneficial for Category 3 (e.g., LIDAR-acquired) point cloud data, e.g., for low-latency applications. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may use a predictor candidate list that may be populated with one or more candidates. The G-PCC encoder 200 or the G-PCC decoder 300 may select a candidate from the predictor candidate list to use for predictive geometry coding.

The angular mode for predictive geometry coding is now described. The angular mode may be used in predictive geometry coding, where the characteristics of a sensor (e.g., a LIDAR sensor) may be used to code the prediction tree more efficiently. The coordinates of the positions are converted to (r, φ, i) (radius, azimuth, and laser index), and prediction is performed in that domain (i.e., the residuals are coded in the (r, φ, i) domain). Due to rounding errors, coding in (r, φ, i) is not lossless, and hence a second set of residuals corresponding to the Cartesian coordinates may be coded. The following description of the encoding and decoding strategies used for the angular mode of predictive geometry coding is reproduced, in substance, from the G-PCC codec description.

FIGS. 6A and 6B are conceptual diagrams illustrating an example of a spinning-LIDAR acquisition model. The acquisition model shown in FIGS. 6A and 6B involves point clouds acquired using a spinning LIDAR model. In the example of FIGS. 6A and 6B, the LIDAR transmitter/receiver 600 has N lasers (e.g., N = 16, 32, 64) rotating around the Z axis according to an azimuth angle φ 602. Each laser may have a different elevation angle $\theta(i)$, $i = 1, \ldots, N$, and height $\zeta(i)$, $i = 1, \ldots, N$. For example, different lasers may be arranged at different heights in the LIDAR transmitter/receiver 600. Suppose that laser i hits a point M with Cartesian integer coordinates (x, y, z), defined according to the coordinate system depicted in FIG. 6A.

The technique uses the three parameters $(r, \phi, i)$ to represent the position of M, which are computed as follows:

$r = \sqrt{x^2 + y^2}$
$\phi = \operatorname{atan2}(y, x)$
$i$: the index of the laser that acquired the point.

More precisely, the technique uses a quantized version of $(r, \phi, i)$, denoted $(\tilde{r}, \tilde{\phi}, i)$, where the three integers $\tilde{r}$, $\tilde{\phi}$, and $i$ are computed as follows:

$\tilde{r} = \operatorname{sign}(r) \cdot \lfloor |r| / q_r + o_r \rfloor$
$\tilde{\phi} = \operatorname{sign}(\phi) \cdot \lfloor |\phi| / q_\phi + o_\phi \rfloor$
$i$: the laser index, as above,

where $(q_\phi, o_\phi)$ and $(q_r, o_r)$ are quantization parameters controlling the precision of $\tilde{\phi}$ and $\tilde{r}$, respectively; $\operatorname{sign}(t)$ is the function that returns 1 if $t$ is positive and $(-1)$ otherwise; and $|t|$ is the absolute value of $t$.

To avoid reconstruction mismatches due to the use of floating-point operations, the values of $\zeta(i)$ and $\tan(\theta(i))$ are pre-computed and quantized as follows:

$\tilde{z}(i) = \operatorname{sign}(\zeta(i)) \cdot \lfloor |\zeta(i)| / q_\zeta + o_\zeta \rfloor$
$\tilde{t}(i) = \operatorname{sign}(\tan(\theta(i))) \cdot \lfloor |\tan(\theta(i))| / q_\theta + o_\theta \rfloor$

where $(q_\zeta, o_\zeta)$ and $(q_\theta, o_\theta)$ are quantization parameters controlling the precision of $\tilde{z}(i)$ and $\tilde{t}(i)$, respectively. The reconstructed Cartesian coordinates are obtained as follows:

$\hat{x} = \operatorname{round}(\tilde{r} \cdot q_r \cdot \operatorname{app\_cos}(\tilde{\phi} \cdot q_\phi))$
$\hat{y} = \operatorname{round}(\tilde{r} \cdot q_r \cdot \operatorname{app\_sin}(\tilde{\phi} \cdot q_\phi))$
$\hat{z} = \operatorname{round}(\tilde{r} \cdot q_r \cdot \tilde{t}(i) \cdot q_\theta - \tilde{z}(i) \cdot q_\zeta)$

where $\operatorname{app\_cos}(\cdot)$ and $\operatorname{app\_sin}(\cdot)$ are approximations of $\cos(\cdot)$ and $\sin(\cdot)$. The computations may use a fixed-point representation, a look-up table, and linear interpolation.

Note that $(\hat{x}, \hat{y}, \hat{z})$ may be different from $(x, y, z)$ for various reasons, which may include quantization, approximations, imprecision in the LIDAR acquisition model, and/or imprecision in the LIDAR acquisition model parameters.

The reconstruction residuals $(r_x, r_y, r_z)$ may be defined as follows:

$r_x = x - \hat{x}$
$r_y = y - \hat{y}$
$r_z = z - \hat{z}$

In this technique, the G-PCC encoder 200 may perform the following operations:
1) Encode the LIDAR acquisition model parameters $\tilde{t}(i)$ and $\tilde{z}(i)$ and the quantization parameters $q_r$, $q_\zeta$, $q_\theta$, and $q_\phi$;
2) Apply the geometry prediction scheme described in ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression, ISO/IEC JTC 1/SC29/WG 7 MDS19617, Teleconference, Oct. 2020, to the representation $(\tilde{r}, \tilde{\phi}, i)$. In some examples, new predictors that exploit the characteristics of LIDAR may be introduced. For example, the rotation speed of the LIDAR scanner around the z-axis is typically constant. Therefore, the G-PCC encoder 200 may predict the current $\tilde{\phi}(j)$ as follows (see the sketch after this list):

$\tilde{\phi}(j) = \tilde{\phi}(j-1) + n(j) \cdot \delta_\phi(k)$

where:
i. $(\delta_\phi(k))_{k=1,\ldots,K}$ is a set of potential speeds from which the G-PCC encoder 200 may choose. The index $k$ may be explicitly signaled in the bitstream, or may be inferred from the context based on a deterministic strategy applied by both the G-PCC encoder 200 and the G-PCC decoder 300 (e.g., by the G-PCC decoder 300); and
ii. $n(j)$ is the number of skipped points, which may be explicitly signaled in the bitstream or may be inferred from the context based on a deterministic strategy applied by both the G-PCC encoder 200 and the G-PCC decoder 300 (e.g., by the G-PCC decoder 300). $n(j)$ is also referred to below as the "phi multiplier". Note that $n(j)$ is currently used only with the delta predictor; and
3) For each node, encode the reconstruction residuals $(r_x, r_y, r_z)$.

The G-PCC decoder 300 may perform the following operations:
1) Decode the LIDAR acquisition model parameters $\tilde{t}(i)$ and $\tilde{z}(i)$ and the quantization parameters $q_r$, $q_\zeta$, $q_\theta$, and $q_\phi$;
2) Decode the $(\tilde{r}, \tilde{\phi}, i)$ parameters associated with the nodes according to the geometry prediction scheme described in ISO/IEC FDIS 23090-9 Geometry-based Point Cloud Compression, ISO/IEC JTC 1/SC29/WG 7 MDS19617, Teleconference, Oct. 2020;
3) Compute the reconstructed coordinates $(\hat{x}, \hat{y}, \hat{z})$ as described above;
4) Decode the residuals $(r_x, r_y, r_z)$. As discussed in the next section, lossy compression may be supported by quantizing the reconstruction residuals $(r_x, r_y, r_z)$; and
5) Compute the original coordinates $(x, y, z)$ as follows:

$x = \hat{x} + r_x$
$y = \hat{y} + r_y$
$z = \hat{z} + r_z$

Lossy compression may be achieved by applying quantization to the reconstruction residuals $(r_x, r_y, r_z)$ or by dropping points. The quantized reconstruction residuals are computed as follows:

$\tilde{r}_x = \operatorname{sign}(r_x) \cdot \lfloor |r_x| / q_x + o_x \rfloor$
$\tilde{r}_y = \operatorname{sign}(r_y) \cdot \lfloor |r_y| / q_y + o_y \rfloor$
$\tilde{r}_z = \operatorname{sign}(r_z) \cdot \lfloor |r_z| / q_z + o_z \rfloor$

where $(q_x, o_x)$, $(q_y, o_y)$, and $(q_z, o_z)$ are quantization parameters controlling the precision of $\tilde{r}_x$, $\tilde{r}_y$, and $\tilde{r}_z$, respectively.

Trellis quantization may be used to further improve the rate-distortion (RD) performance results. The quantization parameters may be changed at the sequence/frame/slice/block level to achieve region-adaptive quality and for rate-control purposes.

G-PCC uses octree-based or prediction-based geometry coding techniques, optionally combined with prior knowledge about the sensor (e.g., a sensor model), which may be referred to as the angular mode for geometry coding. The prior knowledge (e.g., sensor model) may include the angular data and positional offsets of the multiple lasers within a LIDAR sensor, which may enable significant coding efficiency gains for LIDAR-acquired point clouds. However, a G-PCC encoder or decoder may have no information available about the 3D scene corresponding to the point cloud. In some examples, a (3D) scene model may be understood as providing geometric context (e.g., context information) for coding the point cloud. In this regard, it is proposed to utilize a (3D) scene model to improve coding efficiency. According to the techniques of this disclosure, if a scene model (e.g., scene model 230 or scene model 330) is obtained or derived, the scene model information, alone or together with a sensor model (e.g., sensor model 234 or sensor model 334), may be used to improve the efficiency of coding point clouds and point cloud attributes. A point cloud may be defined as a set of points with positions $(x_j, y_j, z_j)$, $j = 1, \ldots, N$, where $N$ is the number of points in the point cloud, and with optional attributes $(a_j^1, \ldots, a_j^D)$, $j = 1, \ldots, N$, where $D$ is the number of attributes for each point. However, the coding efficiency improvement depends on whether the obtained or derived scene model is an accurate representation of the scene formed by the point cloud. In this regard, it should be appreciated that a scene model may be obtained (e.g., received from an external device) or derived for use in coding the point clouds of multiple frames (e.g., 2, 3, ..., 10 frames) or even one (single) frame. The scene model may be a digital representation of a real-world scene. For example, the scene model may be mesh-based (including vertices with connectivity information), or another representation of the surfaces and objects within the scene, e.g., a plane representing a group of points within a defined region of the point cloud. The techniques of this disclosure may reduce the bandwidth needed for transmission and the memory needed to store encoded point clouds.

One or more of the techniques disclosed in this document may be applied independently or in any combination. The techniques of this disclosure may be applicable to the encoding and/or decoding of point cloud data.

Determining a sensor model (e.g., sensor model 234 or sensor model 334) is now discussed, where the sensor model includes intrinsic parameters and/or extrinsic parameters of one or more sensors used to acquire the point cloud data. The sensor being modeled may be a time-of-flight (ToF) sensor, e.g., a LIDAR, or any sensor capable of measuring the positions of points in a scene. In the case of LIDAR, examples of intrinsic sensor parameters may include: the number of lasers in the sensor, the positions of the lasers within the sensor head relative to an origin, the angles of the lasers relative to a reference or the angle differences between lasers, the field of view of each laser, the number of samples per degree or per revolution of the sensor, or the sampling rate of each laser, and so on. Examples of extrinsic sensor parameters may include the position and orientation of the sensor within the scene relative to a reference.

Determining or obtaining a scene model (e.g., scene model 230 or scene model 330) corresponding to a point cloud is now discussed. In one example of this disclosure, the G-PCC encoder 200 or the G-PCC decoder 300 may determine or obtain a scene model 230 or scene model 330 corresponding to a point cloud of the point cloud data, and code the point cloud data based on the scene model. The scene model 230 or scene model 330 may be predetermined, or may be generated or estimated during the coding process of the point cloud. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may obtain the scene model 230 or the scene model 330 from an external device. As another example, the G-PCC encoder 200 or the G-PCC decoder 300 may generate or estimate the scene model 230 or the scene model 330. For example, the scene model may represent the road/ground and/or surrounding objects, e.g., vehicles, pedestrians, road signs, traffic lights, vegetation, buildings, and the like.

In some cases, only the difference between the actual scene model (e.g., the obtained scene model) and the estimated scene model for the current frame may be signaled. For example, for frame N, the G-PCC encoder 200 may signal the difference between the obtained scene model 230 and the estimated scene model 230. For instance, the difference may be the difference between the position coordinates of one or more points in the obtained scene model 230 and in the estimated scene model. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may use already-decoded information (e.g., previously reconstructed frames, such as frame (N-1), frame (N-2), and so on) to determine the estimated scene model. The G-PCC decoder 300 may parse the signaled difference to determine the difference. For example, the G-PCC decoder 300 may use the difference to update the scene model 330, or may otherwise use the difference when decoding the point cloud data. As used herein, parsing is the process of determining the values signaled in a bitstream.

In some examples, the G-PCC encoder 200 may signal the scene model 230 to the G-PCC decoder 300 for intra frames (or, more generally, random-access frames), and for non-intra (non-I) frames (e.g., motion-predicted frames) or slices (e.g., motion-predicted slices), the G-PCC encoder 200 may signal to the G-PCC decoder 300 the difference between the scene model 230 and the current frame. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may determine that a frame of the point cloud data is an intra frame and, based on the frame being an intra frame, signal or parse the scene model 230 or the scene model 330, and use the scene model as a predictor for the current frame of the point cloud data. For example, the G-PCC encoder 200 may determine that the frame is an intra frame by determining, through an encoding cost analysis, that the frame is best encoded using intra prediction. The G-PCC decoder 300 may determine whether the frame is an intra frame by decoding syntax information, sent by the G-PCC encoder 200 to the G-PCC decoder 300, indicating that the frame is an intra frame. The G-PCC encoder 200 may encode and send the scene model 230, and the G-PCC decoder 300 may decode the scene model 230 and store it in memory as the scene model 330.

For example, the G-PCC encoder 200 or the G-PCC decoder 300 may determine that the current frame of the point cloud data is not an intra frame. Based on the frame not being an intra frame (e.g., being an inter frame), the G-PCC encoder 200 or the G-PCC decoder 300 may determine a difference between the obtained scene model and the determined scene model. The difference may include differences between the positions of points in the obtained scene model and in the determined scene model. In some examples, coding the point cloud data is further based on the differences between the positions of points in the obtained scene model and in the determined scene model. In some examples, the G-PCC decoder 300 may update the scene model 330 based on the difference. For example, the G-PCC encoder 200 may determine the difference between the obtained scene model and the determined scene model by comparing the obtained scene model with the determined scene model. In some examples, the comparison between the obtained scene model and the determined scene model includes a comparison with respect to the six degrees of freedom that a freely moving body has in 3D space. The G-PCC encoder 200 may signal the difference to the G-PCC decoder 300. The G-PCC decoder 300 may determine the difference between the obtained scene model and the determined scene model by parsing the difference from the bitstream. The G-PCC decoder 300 may use the difference to decode the current frame, e.g., by adding the difference to, or subtracting it from, the scene model 330 and using the updated scene model 330 as a predictor for the current frame. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine the scene model 230 or 330, respectively, based on a previous frame.

In some examples, there may be one or more scene models associated with a point cloud. For example, the scene model 230 and the scene model 330 may each include multiple scene models. In some examples, the scene model 230 or the scene model 330 may represent the entire point cloud or a specific region of the point cloud. For example, for an automotive use case, the point cloud may represent the road/ground and surrounding objects, such as vehicles, pedestrians, road signs, traffic lights, vegetation, buildings, and the like. In some examples, a scene model such as the scene model 230 or the scene model 330 may be limited to representing the road/ground region or other stationary objects in the scene. In some examples, the scene model 230 or the scene model 330 may represent a city or a city block. In some examples, the G-PCC encoder 200 may partition a point cloud frame into multiple slices, where one or more slices may correspond to the road/ground region and the remaining slices may represent the remaining scene of the point cloud frame. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may classify road points based on histogram thresholds (T1, T2). See, e.g., U.S. Provisional Patent Application 63/131,637, filed December 29, 2020, the entire content of which is incorporated herein by reference. For example, the histogram may aggregate the heights (z values) of the point cloud data, and the G-PCC encoder 200 may use the histogram to compute the thresholds T1 and T2. For example, a point belongs to the road if T1 ≤ z ≤ T2 (a sketch of this classification follows below). In some examples, a scene model such as the scene model 230 or the scene model 330 may then be applied only to the slices associated with the road/ground region. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may use the scene model 230 or the scene model 330 only when coding slices associated with the road/ground region. The G-PCC encoder 200 may signal a slice-level flag to the G-PCC decoder 300 to indicate whether the scene model 230 or the scene model 330 is to be applied to a particular slice. For example, the slice-level flag may indicate whether or not the scene model 230 or the scene model 330 is used to code the particular slice. Additional scene models may represent buildings, road signs, and so on.
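The following is a sketch of such a histogram-based road classification. The derivation of (T1, T2) used here, a band around the dominant height bin, is an illustrative assumption rather than the procedure of the cited provisional application; only the final test T1 ≤ z ≤ T2 is taken from the text above.

```python
import numpy as np

# Classify points as road/ground if their height z lies within [T1, T2].
def classify_road_points(z_values, num_bins=256, band=1):
    hist, edges = np.histogram(z_values, bins=num_bins)
    peak = np.argmax(hist)                     # dominant ground-height bin
    t1 = edges[max(peak - band, 0)]
    t2 = edges[min(peak + band + 1, num_bins)]
    is_road = (z_values >= t1) & (z_values <= t2)
    return is_road, (t1, t2)
```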

In one example of this disclosure, a scene model, e.g., the scene model 230 or the scene model 330, may represent an approximation of the point cloud. In some examples, the scene model 230 or the scene model 330 may divide the point cloud region into separate segments (e.g., segments that are modeled separately). In some examples, a segment model may be a plane. In some examples, a segment model may be a higher-order surface approximation, e.g., a multivariate polynomial model.

In some examples, the scene model 230 or the scene model 330 may be derived in the same manner, based on the point cloud frames, at both the G-PCC encoder 200 and the G-PCC decoder 300, in order to avoid decoding drift. In other words, the scene model 230 and the scene model 330 may be identical. In some examples, only the G-PCC encoder 200 may derive or determine the scene model 230 and encode a representation of the scene model 230 in the bitstream, and the G-PCC decoder 300 may decode the representation of the scene model 230 and store it in memory 340 as the scene model 330. For example, from the bitstream, the G-PCC decoder 300 may reconstruct the scene model 230 as the scene model 330. In some examples, the parameters of the scene model 230 or the scene model 330 may represent plane parameters corresponding to segment models, or they may represent the parameters of higher-order surface approximations.

In another example of this disclosure, the scene model 230 or the scene model 330 may be determined based on two or more point cloud frames. The scene model parameter estimation may be optimized based on the points belonging to the two or more frames. When two or more frames are used to determine the scene model 230 or the scene model 330, registration may be performed for the points belonging to the different frames, so that the frames together describe the scene model. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may determine a scene model for multiple frames of the point cloud data, determine a registration of points belonging to two of the multiple point cloud frames, and determine the displacement of the registered points between the two point cloud frames. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may determine corresponding points belonging to two of the multiple frames of the point cloud data, and may determine the displacement of the corresponding points between the two frames. The G-PCC encoder 200 or the G-PCC decoder 300 may code the current frame of the point cloud data based on the scene model, e.g., by compensating for the motion between the two frames based on the displacement.

In this case, the G-PCC encoder 200 or the G-PCC decoder 300 may compensate for the motion based on the displacement when coding the point cloud data. For example, the angular origin of neighboring frames in a sequence of point cloud frames may be the position of a LIDAR system attached to a vehicle. Hence, the origin moves as the vehicle moves, and the displacement of the angular origin from one frame to another can accordingly be compensated. In some examples, the displacement information may be estimated or obtained from an external unit (e.g., from global positioning satellite (GPS) parameters of the vehicle).

Utilizing the scene model 230 or the scene model 330 to code point cloud geometry and/or attributes is now discussed. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may use the scene model 230 or the scene model 330 as a reference for coding the point cloud positions (e.g., position differences or deltas), where the position differences or deltas may be given, e.g., in Cartesian coordinates, in spherical coordinates, or in an azimuth/radius/laser-ID system, and so on. In some examples, the scene model 230 or the scene model 330 may be used to code the current frame in a set of point cloud frames, and/or the scene model may be used to code subsequent frames in the set of frames. In some examples, for predictive geometry coding, one or more candidates based on the scene model may be added to the predictor candidate list. In some examples, for prediction in transform-based attribute coding, one or more candidates based on the scene model may be added to the predictor candidate list. The predictor candidate list may be used to select a predictor from the candidate list, which the G-PCC encoder 200 or the G-PCC decoder 300 may use to predict the current point cloud frame or slice.

The G-PCC encoder 200 or the G-PCC decoder 300 utilizing a scene model (e.g., scene model 230 or scene model 330) together with a sensor model (e.g., sensor model 234 or sensor model 334) to code point cloud geometry and/or attributes is now discussed. In some examples, using the sensor model 234 or the sensor model 334 in combination with the scene model 230 or the scene model 330 may provide an estimate of the positions of the points in the point cloud. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may determine an estimate of the positions in the point cloud based on the sensor model 234 or the sensor model 334 and the scene model 230 or the scene model 330. In such an example, the G-PCC encoder 200 or the G-PCC decoder 300 may use the estimate of the positions of the points in the point cloud as a predictor, and compute position residuals based on the predictor. In one example, in the case of a LIDAR sensor model, the intrinsic and extrinsic sensor parameters may be employed to compute the intersections of the lasers with the scene model 230 or the scene model 330, and those intersections may determine point positions. These point positions may be used as predictors for coding the point cloud. The predictors may be used to compute position residuals, e.g., in Cartesian coordinates, in spherical coordinates, or in an azimuth/radius/laser-ID system, and so on. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may determine or compute a first intersection of a laser with the scene model 230 or the scene model 330 based on the intrinsic sensor parameters and the extrinsic sensor parameters. The G-PCC encoder 200 or the G-PCC decoder 300 may use the intersection as a predictor when coding the point cloud data, and compute the position residuals based on the predictor.
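A minimal sketch of this predictor follows: a laser ray, positioned by the sensor model, is intersected with a plane segment of the scene model, and the residual is taken against the actual point. The plane convention n·p + d = 0 and all names are illustrative.

```python
import numpy as np

# Intersect a ray (origin + t * direction, t >= 0) with a plane n . p + d = 0.
def ray_plane_intersection(origin, direction, n, d):
    denom = np.dot(n, direction)
    if abs(denom) < 1e-12:
        return None                      # ray parallel to the plane
    t = -(np.dot(n, origin) + d) / denom
    return None if t < 0 else origin + t * direction

# Position residual = actual point minus the predicted intersection point.
def position_residual(actual_point, origin, direction, n, d):
    pred = ray_plane_intersection(origin, direction, n, d)
    return None if pred is None else np.asarray(actual_point) - pred
```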

In some examples, the point cloud may be the current frame in a set of point cloud frames. In some examples, the point cloud may be the current frame in the set of point cloud frames in coding order. In one example, to code the current frame, the sensor is repositioned relative to the scene model 230 or the scene model 330 of the previous frame based on motion information (e.g., the motion of the vehicle, which may be estimated or obtained from GPS data). Based on the new position of the sensor, and using the sensor model 234 or the sensor model 334, the intersections of the lasers with the scene model 230 or the scene model 330 may be computed in order to estimate a point cloud corresponding to the point cloud in the current frame. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may obtain the motion information from GPS data and, based on the motion information, reposition the sensor relative to the scene model 230 or the scene model 330 for the current frame.

For a first laser point obtained as the intersection of a laser from the sensor at the new position with the scene model 230 or the scene model 330, the G-PCC encoder 200 may signal a flag to indicate to the G-PCC decoder 300 whether that point is used as a predictor in subsequent frames.

The G-PCC encoder 200 or the G-PCC decoder 300 performing scene modeling with planes for a LIDAR point cloud (e.g., an automotive use case) is now discussed. For example, the G-PCC encoder 200 or the G-PCC decoder 300 may classify road points based on histogram thresholds (T1, T2). For example, the histogram may aggregate the heights (z values) of the point cloud data, and the G-PCC encoder 200 may use the histogram to compute the thresholds T1 and T2. For example, a point belongs to the road if T1 ≤ z ≤ T2. The G-PCC encoder 200 or the G-PCC decoder 300 may segment the road region and estimate separate plane parameters for each segment; for example, the segments may be determined by azimuth ranges and laser index ranges (a sketch of such a per-segment plane fit follows below). The G-PCC encoder 200 or the G-PCC decoder 300 may use the LIDAR parameters (laser angles, vertical offsets) to compute the theoretical positions of the laser circles (e.g., the circles traced by the rotating lasers). The G-PCC encoder 200 or the G-PCC decoder 300 may determine or compute the first intersections of the laser rays with the segment planes. For the prediction of subsequent point cloud frames, the G-PCC encoder 200 or the G-PCC decoder 300 may reposition the LIDAR sensor relative to the road model and determine or compute the second intersections of the laser rays with the segment planes.
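For the per-segment plane estimation mentioned above, a least-squares fit of z = a·x + b·y + c over the points of a segment is one straightforward choice; the sketch below is illustrative and returns the plane in the same n·p + d = 0 convention used in the previous sketch.

```python
import numpy as np

# Fit z = a*x + b*y + c to the (M, 3) points of one road segment.
def fit_plane(points):
    pts = np.asarray(points, dtype=np.float64)
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    (a, b, c), *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    # a*x + b*y - z + c = 0, i.e., n = (a, b, -1) and d = c.
    return np.array([a, b, -1.0]), c
```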

FIG. 7 is a flowchart illustrating an example of a scene model coding technique according to this disclosure. The G-PCC encoder 200 or the G-PCC decoder 300 may determine or obtain a scene model corresponding to a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding to at least a portion of the first frame of the point cloud data (700). For example, the G-PCC encoder 200 may generate or obtain the scene model 230 for a scene whose point cloud data is to be encoded. In some examples, the G-PCC encoder 200 may obtain the scene model 230 by reading the scene model 230 from memory 240, or by receiving the scene model 230 from an external device. In some examples, the scene model 230 is predetermined. In some examples, the G-PCC encoder 200 may determine the scene model 230 based on a previous frame. The determined scene model may also be referred to as an estimated scene model. In some examples, the G-PCC decoder 300 may generate or obtain the scene model 330 for a scene whose point cloud data is to be decoded. In some examples, the G-PCC decoder 300 may obtain the scene model 330 by reading the scene model 330 from memory 340, or by receiving the scene model 330 from an external device such as the G-PCC encoder 200. In some examples, the G-PCC decoder 300 may determine the scene model 330 based on a previous frame. The G-PCC encoder 200 or the G-PCC decoder 300 may code the current frame of the point cloud data based on the scene model (702). For example, the G-PCC encoder 200 may encode the current frame of the point cloud data based on the scene model 230. For example, the G-PCC decoder 300 may decode the current frame of the point cloud data based on the scene model 330.

In some examples, the scene model (e.g., scene model 230 or scene model 330) includes a digital representation of a real-world scene. In some examples, the scene model represents at least one of a road, the ground, a vehicle, a pedestrian, a road sign, a traffic light, vegetation, or a building. In some examples, the scene model represents an approximation of the current frame of the point cloud data.

In some examples, the scene model includes multiple separate segments. In some examples, the multiple separate segments include multiple planes or multiple higher-order surface approximations.

In some examples, the first frame is the current frame, and the G-PCC encoder 200 or the G-PCC decoder 300 may determine that the current frame of the point cloud data is an intra frame and, based on the current frame of the point cloud data being an intra frame, signal or parse the scene model 230 or the scene model 330 and use the scene model as a predictor for the current frame of the point cloud data.

In some examples, coding includes encoding, and determining or obtaining the scene model includes obtaining a first scene model and determining a second scene model. In such examples, the G-PCC encoder 200 may determine that the current frame of the point cloud data is not an intra frame. Based on the current frame of the point cloud data not being an intra frame, the G-PCC encoder 200 may determine a difference between the first scene model and the second scene model. The G-PCC encoder 200 may use the second scene model as a predictor for the current frame of the point cloud data, and signal the difference.

In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may signal or parse (respectively) a slice-level flag that indicates whether the scene model is used to code a particular slice of multiple slices of the current frame of the point cloud data. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine the scene model, including determining a scene model for multiple frames of the point cloud data. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine corresponding points belonging to two of the multiple frames of the point cloud data. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine the displacement of the corresponding points between the two frames. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may code the current frame of the point cloud data based on the scene model, including compensating for the motion between the two frames based on the displacement.

In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may code the current frame of the point cloud data based on the scene model, including using the scene model as a reference for coding the point cloud positions.

In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may code using predictive geometry coding or transform-based attribute coding. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may add one or more candidates based on the scene model (e.g., scene model 230 or scene model 330) to a predictor candidate list and select a candidate from the candidate list. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may code the current frame of the point cloud data, including coding the current frame based on the selected candidate.

In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine an estimate of the positions of points in the current frame of the point cloud data based on a sensor model (e.g., sensor model 234 or sensor model 334) and the scene model (e.g., scene model 230 or scene model 330). In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may code the current frame of the point cloud data based on the scene model, including using the estimate of the positions of the points in the current frame of the point cloud data as a predictor and computing position residuals based on the predictor. In some examples, the sensor model represents a LIDAR (light detection and ranging) sensor. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine the estimate of the positions of the points, including determining a first intersection of a laser of the sensor model with the scene model based on the intrinsic sensor parameters and the extrinsic sensor parameters of the sensor model, and may use the estimate of the positions of the points in the point cloud as a predictor, including using the first intersection as the predictor.

In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may obtain motion information from global positioning system data. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may compensate for motion between two frames of the point cloud data, including repositioning the sensor of the sensor model relative to the scene model based on the motion information. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may determine a second intersection of the laser with the scene model based on the new position of the sensor associated with the repositioning and based on the sensor model. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may predict a point cloud corresponding to a subsequent frame of the two frames of the point cloud data based on the second intersection of the laser with the scene model.

In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may send or receive (respectively) the scene model in the bitstream. In some examples, the G-PCC encoder 200 or the G-PCC decoder 300 may refrain from sending or receiving (respectively) the scene model in the bitstream.

FIG. 8 is a flowchart illustrating an example of a scene model technique according to this disclosure. The G-PCC encoder 200 or the G-PCC decoder 300 may determine or obtain a scene model corresponding to a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding to at least a portion of the first frame of the point cloud data (800). For example, the G-PCC encoder 200 may generate or obtain the scene model 230 for a scene whose point cloud data is to be encoded. In some examples, the G-PCC encoder 200 may obtain the scene model 230 by reading the scene model 230 from memory 240 or by receiving the scene model 230 from an external device. In some examples, the scene model 230 is predetermined. In some examples, the G-PCC encoder 200 may determine the scene model 230, e.g., based on a previous frame. In some examples, the G-PCC decoder 300 may generate or obtain the scene model 330 for a scene whose point cloud data is to be decoded. In some examples, the G-PCC decoder 300 may obtain the scene model 330 by reading the scene model 330 from memory 340 or by receiving the scene model 330 from an external device. In some examples, the G-PCC decoder 300 may receive the scene model 330 from the G-PCC encoder 200. In some examples, the G-PCC decoder 300 may determine the scene model 330, e.g., based on a previous frame.

The G-PCC encoder 200 or the G-PCC decoder 300 may determine whether a frame of the point cloud is an intra frame (802). For example, the G-PCC encoder 200 may determine that a frame of the point cloud data should, or should not, be coded as an intra frame. The G-PCC encoder 200 may code a syntax element indicating whether the frame is an intra frame, and may signal the syntax element to the G-PCC decoder 300 in the bitstream. The G-PCC decoder 300 may parse the syntax element from the bitstream to determine whether the frame is an intra frame.

If the frame is an intra frame (the "YES" path from block 802), then, based on the frame being an intra frame, the G-PCC encoder 200 may signal, or the G-PCC decoder 300 may parse, the scene model 230 or the scene model 330 (804). The G-PCC encoder 200 or the G-PCC decoder 300 may use the scene model as a predictor for the current frame of the point cloud data (806). For example, the G-PCC encoder 200 may encode the current frame of the point cloud data based on the scene model 230. For example, the G-PCC decoder 300 may decode the current frame of the point cloud data based on the scene model 330. In this example, the first frame is the current frame.

If the frame is not an intra frame (e.g., the frame is an inter frame) (the "NO" path from block 802), the G-PCC encoder 200 or the G-PCC decoder 300 may determine a difference between a first scene model and a second scene model (812). For example, the G-PCC encoder 200 may determine that points have moved between the first scene model (which may be the obtained scene model) and the second scene model (which may be the determined scene model), and the movement may be the difference between the position coordinates of the points. In some examples, the first frame is a frame preceding that of the second scene model. The G-PCC encoder 200 or the G-PCC decoder 300 may use the second scene model as a predictor for the current frame of the point cloud data (813). In an example in which the G-PCC decoder 300 uses the second scene model as the predictor for the current frame of the point cloud data, the G-PCC encoder 200 may signal the difference (814). For example, the G-PCC encoder 200 may signal a syntax element indicating the difference, and the G-PCC decoder 300 may parse the syntax element to determine the difference. The G-PCC decoder 300 may use the difference to update the scene model 330 to the second scene model, and use the second scene model as a predictor for the current frame of the point cloud data.

FIG. 9 is a conceptual diagram illustrating an example ranging system 900 that may be used with one or more techniques of this disclosure. In the example of FIG. 9, the ranging system 900 includes a light emitter 902 and a sensor 904. The light emitter 902 may emit light 906. In some examples, the light emitter 902 may emit the light 906 as one or more laser beams. The light 906 may be of one or more wavelengths, e.g., an infrared wavelength or a visible-light wavelength. In other examples, the light 906 is not coherent laser light. When the light 906 encounters an object, such as object 908, the light 906 creates returning light 910. The returning light 910 may include backscattered and/or reflected light. The returning light 910 may pass through a lens 911 that directs the returning light 910 to create an image 912 of the object 908 on the sensor 904. The sensor 904 generates a signal 914 based on the image 912. The image 912 may include a set of points (e.g., as represented by the dots in the image 912 of FIG. 9).

在一些示例中,發光器902和感測器904可以安裝在旋轉結構上,使得發光器902和感測器904獲得環境的360度視圖。在其他示例中,測距系統900可以包括一個或多個光學組件(例如,鏡子、準直器、衍射光柵等),使得發光器902和感測器904能夠偵測具體範圍內(例如,高達360度)的物體的範圍。儘管圖9的例子僅示出了單個發光器902和感測器904,但測距系統900可以包括多組發光器和感測器。In some examples, light 902 and sensor 904 may be mounted on a rotating structure such that light 902 and sensor 904 obtain a 360-degree view of the environment. In other examples, ranging system 900 may include one or more optical components (eg, mirrors, collimators, diffraction gratings, etc.) that enable emitter 902 and sensor 904 to detect within a specific range (eg, up to 360 degrees) of the object's range. Although the example of FIG. 9 shows only a single emitter 902 and sensor 904, ranging system 900 may include multiple sets of emitters and sensors.

在一些示例中,發光器902產生結構化光圖案。在這樣的示例中,測距系統900可以包括多個感測器904,在所述多個感測器904上形成結構化光圖案的相應圖像。測距系統900可以使用結構化光圖案的圖像之間的差異來決定到物體908的距離,結構化光圖案從所述物體908反向散射。當物體908相對靠近感測器904(例如,0.2米到2米)時,基於結構光的測距系統可以具有高水平的準確度(例如,亞毫米範圍內的準確度)。所述高水平的準確性可以有利於面部識別應用,例如,解鎖行動設備(例如,行動電話、平板電腦等)和用於安全應用。In some examples, light emitter 902 produces a structured light pattern. In such an example, ranging system 900 may include a plurality of sensors 904 on which respective images of the structured light pattern are formed. The ranging system 900 can use the difference between the images of the structured light pattern from which the structured light pattern is backscattered to determine the distance to the object 908 . Structured light based ranging systems can have a high level of accuracy (eg, in the sub-millimeter range) when the object 908 is relatively close to the sensor 904 (eg, 0.2 meters to 2 meters). The high level of accuracy can be beneficial for facial recognition applications, eg, unlocking mobile devices (eg, mobile phones, tablets, etc.) and for security applications.

In some examples, ranging system 900 is a ToF-based system. In some examples in which ranging system 900 is a ToF-based system, emitter 902 generates pulses of light. In other words, emitter 902 may modulate the amplitude of emitted light 906. In such examples, sensor 904 detects returning light 910 from the pulses of light 906 generated by emitter 902. Ranging system 900 may then determine a distance to object 908, off which light 906 backscatters, based on the delay between the time light 906 was emitted and the time it was detected, and on the known speed of light in air. In some examples, rather than (or in addition to) modulating the amplitude of emitted light 906, emitter 902 may modulate the phase of emitted light 906. In such examples, sensor 904 may detect the phase of returning light 910 from object 908 and determine distances to points on object 908 using the speed of light, based on the time difference between the time emitter 902 generated light 906 at a specific phase and the time sensor 904 detected returning light 910 at the specific phase.
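The pulse-based ToF computation described above reduces to one line of arithmetic: the range is the known speed of light in air multiplied by the round-trip delay, halved because the light travels to object 908 and back. A minimal sketch (the constants are approximate, for illustration only):

```python
# Round-trip time-of-flight to range, using the known speed of light in air.
# The factor of 1/2 accounts for light traveling to the object and back.
C_AIR = 299_702_547.0  # speed of light in air, m/s (approximate)

def tof_distance(delay_s: float) -> float:
    return C_AIR * delay_s / 2.0

print(tof_distance(66.7e-9))  # ~10 m for a 66.7 ns round trip
```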

In other examples, point clouds may be generated without the use of emitter 902. For instance, in some examples, sensors 904 of ranging system 900 may include two or more optical cameras. In such examples, ranging system 900 may use the optical cameras to capture stereoscopic images of an environment that includes object 908. Ranging system 900 may include a point cloud generator 916 that may calculate the disparities between locations in the stereoscopic images. Ranging system 900 may then use the disparities to determine distances to the locations shown in the stereoscopic images. From these distances, point cloud generator 916 may generate a point cloud.
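The stereo case likewise admits a short worked example: with classic two-camera triangulation, depth is inversely proportional to the disparity between corresponding image locations. The focal length and baseline below are illustrative assumptions, not values from this disclosure:

```python
# Classic stereo triangulation: depth = f * b / disparity, where f is the
# focal length in pixels and b is the camera baseline in meters.

def stereo_depth(disparity_px: float, f_px: float = 700.0, b_m: float = 0.12) -> float:
    return f_px * b_m / disparity_px

# A 42-pixel disparity with this (assumed) rig corresponds to a 2 m depth:
print(stereo_depth(42.0))  # 2.0
```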

Sensor 904 may also detect other attributes of object 908, such as color and reflectance information. In the example of FIG. 9, point cloud generator 916 may generate a point cloud based on signals 914 generated by sensor 904. Ranging system 900 and/or point cloud generator 916 may form part of data source 104 (FIG. 1). Hence, a point cloud generated by ranging system 900 may be encoded and/or decoded according to any of the techniques of this disclosure.

FIG. 10 is a conceptual diagram illustrating an example vehicle-based scenario in which one or more techniques of this disclosure may be used. In the example of FIG. 10, a vehicle 1000 includes a ranging system 1002. Ranging system 1002 may be implemented in the manner discussed with respect to FIG. 9. Although not shown in the example of FIG. 10, vehicle 1000 may also include a data source (e.g., data source 104 (FIG. 1)) and a G-PCC encoder (e.g., G-PCC encoder 200 (FIG. 1)). In the example of FIG. 10, ranging system 1002 emits laser beams 1004 that reflect off a pedestrian 1006 or other objects in a roadway. The data source of vehicle 1000 may generate a point cloud based on the signals generated by ranging system 1002. The G-PCC encoder of vehicle 1000 may encode the point cloud to generate bitstreams 1008, such as a geometry bitstream (FIG. 2) and an attribute bitstream (FIG. 2). Bitstreams 1008 may include many fewer bits than the unencoded point cloud obtained by the G-PCC encoder. In some examples, the G-PCC encoder of vehicle 1000 may encode bitstreams 1008 using one or more of the actual scene models, estimated scene models, and/or sensor models, as discussed above.

An output interface of vehicle 1000 (e.g., output interface 108 (FIG. 1)) may transmit bitstreams 1008 to one or more other devices. Bitstreams 1008 may include many fewer bits than the unencoded point cloud obtained by the G-PCC encoder. Thus, vehicle 1000 may be able to transmit bitstreams 1008 to other devices more quickly than the unencoded point cloud data. Additionally, bitstreams 1008 may require less data storage capacity.

In the example of FIG. 10, vehicle 1000 may transmit bitstreams 1008 to another vehicle 1010. Vehicle 1010 may include a G-PCC decoder, e.g., G-PCC decoder 300 (FIG. 1). The G-PCC decoder of vehicle 1010 may decode bitstreams 1008 to reconstruct the point cloud. In some examples, the G-PCC decoder of vehicle 1010 may use one or more of the actual scene models, estimated scene models, and/or sensor models, as discussed above, when decoding the point cloud. Vehicle 1010 may use the reconstructed point cloud for various purposes. For instance, vehicle 1010 may determine, based on the reconstructed point cloud, that pedestrian 1006 is in the roadway ahead of vehicle 1000 and therefore begin slowing down, e.g., even before a driver of vehicle 1010 realizes that pedestrian 1006 is in the roadway. Thus, in some examples, vehicle 1010 may perform an autonomous navigation operation based on the reconstructed point cloud.

Additionally or alternatively, vehicle 1000 may transmit bitstreams 1008 to a server system 1012. Server system 1012 may use bitstreams 1008 for various purposes. For example, server system 1012 may store bitstreams 1008 for subsequent reconstruction of the point clouds. In this example, server system 1012 may use the point clouds along with other data (e.g., vehicle telemetry data generated by vehicle 1000) to train an autonomous driving system. In other examples, server system 1012 may store bitstreams 1008 for subsequent reconstruction for forensic crash investigations.

FIG. 11 is a conceptual diagram illustrating an example extended reality system in which one or more techniques of this disclosure may be used. Extended reality (XR) is a term used to cover a range of technologies that includes augmented reality (AR), mixed reality (MR), and virtual reality (VR). In the example of FIG. 11, a user 1100 is located at a first location 1102. User 1100 wears an XR headset 1104. As an alternative to XR headset 1104, user 1100 may use a mobile device (e.g., a mobile phone, tablet computer, etc.). XR headset 1104 includes a depth detection sensor (e.g., a ranging system) that detects positions of points on an object 1106 at location 1102. A data source of XR headset 1104 may use the signals generated by the depth detection sensor to generate a point cloud representation of object 1106 at location 1102. XR headset 1104 may include a G-PCC encoder (e.g., G-PCC encoder 200 of FIG. 1) that is configured to encode the point cloud to generate bitstreams 1108. In some examples, the G-PCC encoder of XR headset 1104 may use an actual scene model, an estimated scene model, and/or a sensor model, as described above, when encoding the point cloud.

XR headset 1104 may transmit bitstreams 1108 (e.g., via a network such as the Internet) to an XR headset 1110 worn by a user 1112 at a second location 1114. XR headset 1110 may decode bitstreams 1108 to reconstruct the point cloud. In some examples, the G-PCC decoder of XR headset 1110 may use an actual scene model, an estimated scene model, and/or a sensor model, as described above, when decoding the point cloud.

XR headset 1110 may use the point cloud to generate an XR visualization (e.g., an AR, MR, or VR visualization) representing object 1106 at location 1102. Thus, in some examples, such as when XR headset 1110 generates a VR visualization, user 1112 may have a 3D immersive experience of location 1102. In some examples, XR headset 1110 may determine a position of a virtual object based on the reconstructed point cloud. For instance, XR headset 1110 may determine, based on the reconstructed point cloud, that an environment (e.g., location 1102) includes a flat surface and then determine that a virtual object (e.g., a cartoon character) is to be positioned on the flat surface. XR headset 1110 may generate an XR visualization in which the virtual object is at the determined position. For instance, XR headset 1110 may show the cartoon character sitting on the flat surface.
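This disclosure does not prescribe how such a flat surface would be detected; one plausible approach, stated here purely as an assumption for illustration, is a least-squares plane fit over the reconstructed points followed by a residual check:

```python
import numpy as np

# One possible (assumed, not disclosed) flat-surface test: fit a plane to the
# reconstructed points by SVD and check how tightly the points hug it.

def fit_plane(points):
    """Return (normal, d) of the best-fit plane n.x = d through the points."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                 # direction of least variance
    return normal, normal @ centroid

pts = np.array([[0, 0, 0], [1, 0, 0.01], [0, 1, -0.01], [1, 1, 0.0]], float)
normal, d = fit_plane(pts)
residuals = pts @ normal - d
is_flat = np.abs(residuals).max() < 0.05  # threshold is illustrative
print(is_flat)                            # True: place the virtual object here
```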

FIG. 12 is a conceptual diagram illustrating an example mobile device system in which one or more techniques of this disclosure may be used. In the example of FIG. 12, a mobile device 1200, such as a mobile phone or tablet computer, includes a ranging system, such as a LIDAR system, that detects positions of points on an object 1202 in an environment of mobile device 1200. Mobile device 1200 may use the signals generated by the depth detection sensor to generate a point cloud representation of object 1202. Mobile device 1200 may include a G-PCC encoder (e.g., G-PCC encoder 200 of FIG. 1) that is configured to encode the point cloud to generate bitstreams 1204. In some examples, the G-PCC encoder of mobile device 1200 may use an actual scene model, an estimated scene model, and/or a sensor model, as described above, when encoding the point cloud.

In the example of FIG. 12, mobile device 1200 may transmit the bitstreams to a remote device 1206, such as a server system or another mobile device. Remote device 1206 may decode bitstreams 1204 to reconstruct the point cloud. In some examples, the G-PCC decoder of remote device 1206 may use an actual scene model, an estimated scene model, and/or a sensor model, as described above, when decoding the point cloud.

Remote device 1206 may use the point cloud for various purposes. For example, remote device 1206 may use the point cloud to generate a map of the environment of mobile device 1200. For instance, remote device 1206 may generate a map of an interior of a building based on the reconstructed point cloud. In another example, remote device 1206 may generate imagery (e.g., computer graphics) based on the point cloud. For instance, remote device 1206 may use points of the point cloud as vertices of polygons and use color attributes of the points as the basis for shading the polygons. In some examples, remote device 1206 may use the reconstructed point cloud for facial recognition or other security applications.

This disclosure includes the following non-limiting clauses.

Clause 1A. A method of coding point cloud data, the method comprising: determining a sensor model, the sensor model comprising at least one intrinsic or extrinsic parameter of one or more sensors configured to obtain the point cloud data; and coding the point cloud data based on the sensor model.

Clause 2A. The method of clause 1A, wherein the one or more sensors are further configured to sense positions of points in a scene.

Clause 3A. The method of clause 1A or clause 2A, wherein the one or more sensors comprise one or more LIDAR (light detection and ranging) sensors.

Clause 4A. The method of any combination of clauses 1A-3A, wherein the sensor model comprises at least one of: a number of lasers in a sensor, positions of the lasers in the sensor relative to an origin, angles of the lasers in the sensor, angle differences of the lasers in the sensor relative to a reference, a field of view of each laser of the sensor, a number of samples per degree of the sensor, a number of samples per turn of the sensor, or a sampling rate per laser of the sensor.

Clause 5A. The method of any combination of clauses 1A-3A, wherein the sensor model comprises at least one of: a position of a sensor within a scene relative to a reference, or an orientation of the sensor within the scene relative to the reference.

Clause 6A. A method of coding point cloud data, the method comprising: determining a scene model corresponding to a point cloud of the point cloud data; and coding the point cloud data based on the scene model.

Clause 7A. The method of clause 6A, wherein determining the scene model comprises reading a predetermined scene model from memory.

Clause 8A. The method of clause 6A, wherein determining the scene model comprises generating or estimating the scene model.

Clause 9A. The method of any of clauses 6A-8A, further comprising: determining a difference between the scene model and an estimated scene model; and signaling or parsing the difference.

Clause 10A. The method of any of clauses 6A-9A, further comprising: determining whether a frame is an intra frame; and signaling or parsing the scene model based on the frame being an intra frame.

Clause 11A. The method of clause 10A, wherein the frame is a first frame, the method further comprising: determining whether a second frame is an intra frame; based on the second frame not being an intra frame, determining a difference between a scene model for the second frame and an estimated scene model for the second frame; and signaling or parsing the difference.

Clause 12A. The method of any of clauses 6A-11A, wherein the scene model is one scene model of a plurality of scene models.

Clause 13A. The method of any of clauses 6A-12A, wherein the scene model represents an entire point cloud.

Clause 14A. The method of any of clauses 6A-12A, wherein the scene model represents a region of a point cloud.

Clause 15A. The method of clause 14A, wherein the scene model represents at least one of a road, ground, a car, a person, a road sign, vegetation, or a building.

Clause 16A. The method of any of clauses 6A-15A, further comprising: partitioning a point cloud frame into a plurality of slices, wherein one or more slices of the plurality of slices correspond to a road region; and applying the scene model to the one or more slices of the plurality of slices corresponding to the road region.

Clause 17A. The method of clause 16A, further comprising: signaling or parsing a slice-level flag indicating whether the scene model is applied to a slice of the plurality of slices.

Clause 18A. The method of any of clauses 6A-17A, wherein the scene model represents an approximation of the point cloud.

Clause 19A. The method of any of clauses 6A-18A, wherein the scene model comprises a plurality of segments that are modeled individually.

Clause 20A. The method of clause 19A, wherein the segments comprise planes.

Clause 21A. The method of clause 19A, wherein the segments comprise higher-order surface approximations.

Clause 22A. The method of clause 21A, wherein the higher-order surface approximations comprise multivariate polynomial models.

Clause 23A. The method of any of clauses 6A-22A, wherein the method is performed by both a G-PCC encoder and a G-PCC decoder.

Clause 24A. The method of any of clauses 6A-23A, wherein the method is performed by a G-PCC encoder and coding comprises encoding, the method further comprising: encoding a representation of the scene model in a bitstream.

Clause 25A. The method of any of clauses 6A-24A, wherein the method is performed by a G-PCC decoder and coding comprises decoding, and wherein determining the scene model comprises: parsing a representation of the scene model in a bitstream.

Clause 26A. The method of any of clauses 6A-25A, wherein the scene model is determined based on a plurality of point cloud frames.

Clause 27A. The method of clause 26A, further comprising: determining a registration of points belonging to different point cloud frames of the plurality of point cloud frames.

Clause 28A. The method of clause 27A, further comprising: determining a displacement of points between two frames of the plurality of point cloud frames.

Clause 29A. The method of any of clauses 6A-28A, wherein coding the point cloud data based on the scene model comprises: using the scene model as a reference for coding point cloud positions.

Clause 30A. The method of clause 29A, wherein the reference comprises a difference of position coordinates.

Clause 31A. The method of clause 30A, wherein the position coordinates comprise one or more of Cartesian coordinates, spherical coordinates, an azimuth, a radius, or a laser ID system.

Clause 32A. The method of any of clauses 6A-31A, wherein coding the point cloud data based on the scene model comprises at least one of: coding a current frame in a set of point cloud frames; or coding subsequent frames in the set of point cloud frames.

Clause 33A. The method of any of clauses 6A-32A, wherein coding comprises predictive geometry coding, the method further comprising: adding one or more candidates to a predictor candidate list based on the scene model.

Clause 34A. The method of any of clauses 6A-33A, wherein coding comprises transform-based attribute coding, the method further comprising: adding one or more candidates to a predictor candidate list based on the scene model.

Clause 35A. The method of the combination of clause 1A and clause 5A, further comprising: determining an estimate of positions of points in the point cloud based on the sensor model and the scene model.

Clause 36A. The method of clause 35A, wherein determining the estimate of the positions of the points comprises: computing intersection points of lasers with the scene model based on the intrinsic sensor parameters and the extrinsic sensor parameters.

Clause 37A. The method of clause 36A, further comprising: using the intersection points as predictors for coding the point cloud.

Clause 38A. The method of clause 37A, further comprising: computing position residuals based on the predictors.

Clause 39A. The method of clause 38A, wherein the position residuals comprise at least one of Cartesian coordinates, spherical coordinates, an azimuth, or a radius of a laser ID system.

Clause 40A. The method of any of clauses 35A-39A, further comprising: repositioning a sensor with respect to the scene model for a subsequent frame based on motion parameters.

Clause 41A. The method of clause 40A, wherein the motion parameters are estimated or obtained from global positioning system data.

Clause 42A. The method of clause 40A or 41A, further comprising: determining intersection points of lasers with the scene model based on a new position of the sensor associated with the repositioning and based on the sensor model; and predicting a point cloud corresponding to the point cloud in the subsequent frame based on the intersection points of the lasers with the scene model.

Clause 43A. The method of any of clauses 40A-42A, further comprising: signaling or parsing a flag indicating whether a point is used as a predictor in the subsequent frame.

Clause 44A. The method of any of clauses 1A-43A, further comprising: generating the point cloud.

Clause 45A. A device for processing a point cloud, the device comprising one or more means for performing the method of any of clauses 1A-44A.

Clause 46A. The device of clause 45A, wherein the one or more means comprise one or more processors implemented in circuitry.

Clause 47A. The device of any of clauses 45A or 46A, further comprising: a memory to store the data representing the point cloud.

Clause 48A. The device of any of clauses 45A-47A, wherein the device comprises a decoder.

Clause 49A. The device of any of clauses 45A-48A, wherein the device comprises an encoder.

Clause 50A. The device of any of clauses 45A-49A, further comprising: a device to generate the point cloud.

Clause 51A. The device of any of clauses 45A-50A, further comprising a display to present imagery based on the point cloud.

Clause 52A. A computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to perform the method of any of clauses 1A-44A.

Clause 1B. A method of coding point cloud data, the method comprising: determining or obtaining a scene model corresponding to a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding to at least a portion of the first frame of the point cloud data; and coding a current frame of the point cloud data based on the scene model.

Clause 2B. The method of clause 1B, wherein the scene model comprises a digital representation of a real-world scene.

Clause 3B. The method of clause 1B or clause 2B, wherein the scene model represents at least one of a road, ground, a vehicle, a pedestrian, a road sign, a traffic light, vegetation, or a building.

Clause 4B. The method of any of clauses 1B-3B, wherein the scene model represents an approximation of the current frame of the point cloud data.

Clause 5B. The method of any of clauses 1B-4B, wherein the scene model comprises a plurality of individual segments.

Clause 6B. The method of clause 5B, wherein the plurality of individual segments comprise a plurality of planes or a plurality of higher-order surface approximations.

Clause 7B. The method of any of clauses 1B-6B, wherein the first frame is the current frame, the method further comprising: determining that the current frame of the point cloud data is an intra frame; signaling or parsing the scene model based on the current frame of the point cloud data being the intra frame; and using the scene model as a predictor for the current frame of the point cloud data.

Clause 8B. The method of any of clauses 1B-6B, wherein coding comprises encoding, and determining or obtaining the scene model comprises obtaining a first scene model and determining a second scene model, the method further comprising: determining that the current frame of the point cloud data is not an intra frame; determining a difference between the first scene model and the second scene model based on the current frame of the point cloud data not being the intra frame; using the second scene model as a predictor for the current frame of the point cloud data; and signaling the difference.

Clause 9B. The method of any of clauses 1B-8B, further comprising: signaling or parsing a slice-level flag indicating whether the scene model is used for the coding of a particular slice of a plurality of slices of the current frame of the point cloud data.

Clause 10B. The method of any of clauses 1B-9B, wherein determining the scene model comprises determining the scene model for a plurality of frames of the point cloud data, and wherein the method further comprises: determining corresponding points belonging to two frames of the plurality of frames of the point cloud data; and determining a displacement of the corresponding points between the two frames, wherein coding the current frame of the point cloud data based on the scene model comprises compensating for motion between the two frames based on the displacement.

Clause 11B. The method of any of clauses 1B-10B, wherein coding the current frame of the point cloud data based on the scene model comprises: using the scene model as a reference for coding point cloud positions.

Clause 12B. The method of any of clauses 1B-11B, wherein the coding comprises predictive geometry coding or transform-based attribute coding, the method further comprising: adding one or more candidates to a predictor candidate list based on the scene model; and selecting a candidate from the predictor candidate list, wherein coding the current frame of the point cloud data comprises coding the current frame based on the selected candidate.

Clause 13B. The method of any of clauses 1B-12B, further comprising: determining an estimate of point positions in the current frame of the point cloud data based on a sensor model and the scene model, wherein coding the current frame of the point cloud data based on the scene model comprises: using the estimate of the point positions in the current frame of the point cloud data as a predictor; and computing position residuals based on the predictor.

Clause 14B. The method of clause 13B, wherein the sensor model represents a LIDAR (light detection and ranging) sensor, and wherein determining the estimate of the point positions comprises: determining a first intersection of a laser of the sensor model with the scene model based on intrinsic sensor parameters and extrinsic sensor parameters of the sensor model, wherein using the estimate of the point positions in the point cloud as the predictor comprises using the first intersection as the predictor.

Clause 15B. The method of clause 14B, further comprising: obtaining motion information from global positioning system data, wherein compensating for motion between two frames of the point cloud data comprises repositioning a sensor of the sensor model relative to the scene model based on the motion information; determining a second intersection of the laser with the scene model based on a new position of the sensor associated with the repositioning and based on the sensor model; and predicting a point cloud corresponding to a subsequent frame of the two frames of the point cloud data based on the second intersection of the laser with the scene model.

Clause 16B. The method of any of clauses 1B-15B, wherein the method further comprises: sending or receiving the scene model in a bitstream.

Clause 17B. The method of any of clauses 1B-15B, wherein the method further comprises: refraining from sending or receiving the scene model in a bitstream.

Clause 18B. A device for coding point cloud data, the device comprising: a memory configured to store the point cloud data; and one or more processors implemented in circuitry and communicatively coupled to the memory, the one or more processors being configured to: determine or obtain a scene model corresponding to a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding to at least a portion of the first frame of the point cloud data; and code a current frame of the point cloud data based on the scene model.

Clause 19B. The device of clause 18B, wherein the scene model comprises a digital representation of a real-world scene.

Clause 20B. The device of clause 18B or clause 19B, wherein the scene model represents at least one of a road, ground, a vehicle, a pedestrian, a road sign, a traffic light, vegetation, or a building.

Clause 21B. The device of any of clauses 18B-20B, wherein the scene model represents an approximation of the current frame of the point cloud data.

Clause 22B. The device of any of clauses 18B-21B, wherein the scene model comprises a plurality of individual segments.

Clause 23B. The device of clause 22B, wherein the plurality of individual segments comprise a plurality of planes or a plurality of higher-order surface approximations.

Clause 24B. The device of any of clauses 18B-23B, wherein the first frame is the current frame, and wherein the one or more processors are further configured to: determine that the current frame of the point cloud data is an intra frame; signal or parse the scene model based on the current frame of the point cloud data being the intra frame; and use the scene model as a predictor for the current frame of the point cloud data.

Clause 25B. The device of any of clauses 18B-23B, wherein coding comprises encoding, and wherein, as part of determining or obtaining the scene model, the one or more processors are configured to obtain a first scene model and determine a second scene model, and wherein the one or more processors are further configured to: determine that the current frame of the point cloud data is not an intra frame; determine a difference between the first scene model and the second scene model based on the current frame of the point cloud data not being the intra frame; use the second scene model as a predictor for the current frame of the point cloud data; and signal the difference.

Clause 26B. The device of any of clauses 18B-25B, wherein the one or more processors are further configured to: signal or parse a slice-level flag indicating whether the scene model is used for the coding of a particular slice of a plurality of slices of the current frame of the point cloud data.

Clause 27B. The device of any of clauses 18B-26B, wherein, as part of determining the scene model, the one or more processors are further configured to determine the scene model for a plurality of frames of the point cloud data, and wherein the one or more processors are further configured to: determine corresponding points belonging to two frames of the plurality of frames of the point cloud data; and determine a displacement of the corresponding points between the two frames, wherein, as part of coding the current frame of the point cloud data based on the scene model, the one or more processors are configured to compensate for motion between the two frames based on the displacement.

Clause 28B. The device of any of clauses 18B-27B, wherein, as part of coding the current frame of the point cloud data based on the scene model, the one or more processors are configured to use the scene model as a reference for coding point cloud positions.

Clause 29B. The device of any of clauses 18B-28B, wherein coding comprises predictive geometry coding or transform-based attribute coding, and wherein the one or more processors are further configured to: add one or more candidates to a predictor candidate list based on the scene model; and select a candidate from the predictor candidate list, wherein, as part of coding the current frame of the point cloud data, the one or more processors are configured to code the current frame based on the selected candidate.

Clause 30B. The device of any of clauses 18B-29B, wherein the one or more processors are further configured to: determine an estimate of point positions in the current frame of the point cloud data based on a sensor model and the scene model, wherein, as part of coding the current frame of the point cloud data based on the scene model, the one or more processors are configured to: use the estimate of the point positions in the current frame of the point cloud data as a predictor; and compute position residuals based on the predictor.

Clause 31B. The device of clause 30B, wherein the sensor model represents a LIDAR (light detection and ranging) sensor, and wherein, as part of determining the estimate of the point positions, the one or more processors are further configured to: determine a first intersection of a laser of the sensor model with the scene model based on at least one of intrinsic sensor parameters or extrinsic sensor parameters of the sensor model, wherein, as part of using the estimate of the point positions in the point cloud as the predictor, the one or more processors are further configured to use the first intersection as the predictor.

Clause 32B. The device of clause 31B, wherein the one or more processors are further configured to: obtain motion information from global positioning system data, wherein compensating for motion between two frames of the point cloud data comprises repositioning a sensor of the sensor model relative to the scene model based on the motion information; determine a second intersection of the laser with the scene model based on a new position of the sensor associated with the repositioning and based on the sensor model; and predict a point cloud corresponding to a subsequent frame of the two frames of the point cloud data based on the second intersection of the laser with the scene model.

Clause 33B. The device of any of clauses 18B-32B, wherein the device comprises a vehicle, a robot, or a smartphone.

Clause 34B. The device of any of clauses 18B-33B, wherein the one or more processors are further configured to: send or receive the scene model in a bitstream.

Clause 35B. The device of any of clauses 18B-33B, wherein the one or more processors are further configured to: refrain from sending or receiving the scene model in a bitstream.

Clause 36B. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to: determine or obtain a scene model corresponding to a first frame of point cloud data, wherein the scene model represents objects within a scene, the objects corresponding to at least a portion of the first frame of the point cloud data; and code a current frame of the point cloud data based on the scene model.

Clause 37B. A device for coding point cloud data, the device comprising: means for determining or obtaining a scene model corresponding to a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding to at least a portion of the first frame of the point cloud data; and means for coding a current frame of the point cloud data based on the scene model.

Examples in the various aspects of this disclosure may be used individually or in any combination.

It is to be recognized that, depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors, rather than sequentially.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the terms "processor" and "processing circuitry," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the appended claims.

100: encoding and decoding system
102: source device
104: data source
106: memory
108: output interface
110: computer-readable medium
112: storage device
114: file server
116: destination device
118: data consumer
120: memory
122: input interface
200: G-PCC encoder
202: coordinate transform unit
204: color transform unit
206: voxelization unit
208: attribute transfer unit
210: octree analysis unit
211: predictive geometry analysis unit
212: surface approximation analysis unit
214: arithmetic encoding unit
216: geometry reconstruction unit
218: RAHT unit
220: LoD generation unit
222: lifting unit
224: coefficient quantization unit
226: arithmetic encoding unit
230: scene model
234: sensor model
240: memory
300: G-PCC decoder
302: geometry arithmetic decoding unit
304: attribute arithmetic decoding unit
306: octree synthesis unit
307: predictive geometry synthesis unit
308: inverse quantization unit
310: surface approximation synthesis unit
312: geometry reconstruction unit
314: RAHT unit
316: LoD generation unit
318: inverse lifting unit
320: inverse coordinate transform unit
322: inverse color transform unit
330: scene model
334: sensor model
340: memory
400: octree
500: node
502: node
504: node
506: node
508: node
510: node
512: node
514: node
516: node
600: LIDAR emitter/receiver
602: azimuth
700: step
702: step
800: step
802: step
804: step
806: step
812: step
813: step
814: step
900: ranging system
902: emitter
904: sensor
906: light
908: object
910: returning light
911: lens
912: image
914: signals
916: point cloud generator
1000: vehicle
1002: ranging system
1004: laser beam
1006: pedestrian
1008: bitstream
1010: vehicle
1100: user
1102: first location
1104: XR headset
1106: object
1108: bitstream
1110: XR headset
1112: user
1114: second location
1200: mobile device
1202: object
1204: bitstream
1206: remote device

FIG. 1 is a diagram illustrating an example encoding and decoding system that may perform the techniques of this disclosure.

FIG. 2 is a block diagram illustrating an example geometry point cloud compression (G-PCC) encoder.

FIG. 3 is a block diagram illustrating an example G-PCC decoder.

FIG. 4 is a conceptual diagram illustrating an example octree split for geometry coding, in accordance with the techniques of this disclosure.

FIG. 5 is a conceptual diagram of a prediction tree for predictive geometry coding.

FIGS. 6A and 6B are conceptual diagrams illustrating an example of a spinning LIDAR acquisition model.

FIG. 7 is a flowchart illustrating an example scene model coding technique of this disclosure.

FIG. 8 is a flowchart illustrating an example scene model coding technique of this disclosure.

FIG. 9 is a conceptual diagram illustrating an example ranging system that may be used with one or more techniques of this disclosure.

FIG. 10 is a conceptual diagram illustrating an example vehicle-based scenario in which one or more techniques of this disclosure may be used.

FIG. 11 is a conceptual diagram illustrating an example extended reality system in which one or more techniques of this disclosure may be used.

FIG. 12 is a conceptual diagram illustrating an example mobile device system in which one or more techniques of this disclosure may be used.

200: G-PCC encoder
202: coordinate transform unit
204: color transform unit
206: voxelization unit
208: attribute transfer unit
210: octree analysis unit
211: predictive geometry analysis unit
212: surface approximation analysis unit
214: arithmetic encoding unit
216: geometry reconstruction unit
218: RAHT unit
220: LoD generation unit
222: lifting unit
224: coefficient quantization unit
226: arithmetic encoding unit
230: scene model
234: sensor model
240: memory

Claims (37)

1. A method of coding point cloud data, the method comprising:
determining or obtaining a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and
coding a current frame of the point cloud data based on the scene model.

2. The method of claim 1, wherein the scene model comprises a digital representation of a real-world scene.

3. The method of claim 1, wherein the scene model represents at least one of a road, ground, a vehicle, a pedestrian, a road sign, a traffic light, vegetation, or a building.

4. The method of claim 1, wherein the scene model represents an approximation of the point cloud data.

5. The method of claim 1, wherein the scene model comprises a plurality of individual segments.

6. The method of claim 5, wherein the plurality of individual segments comprises a plurality of planes or a plurality of higher-order surface approximations.

7. The method of claim 1, wherein the first frame is the current frame, the method further comprising:
determining that the current frame of the point cloud data is an intra frame;
signaling or parsing the scene model based on the current frame of the point cloud data being the intra frame; and
using the scene model as a predictor for the current frame of the point cloud data.

8. The method of claim 1, wherein the coding comprises encoding, and wherein determining or obtaining the scene model comprises obtaining a first scene model and determining a second scene model, the method further comprising:
determining that the current frame of the point cloud data is not an intra frame;
determining a difference between the first scene model and the second scene model based on the current frame of the point cloud data not being the intra frame;
using the second scene model as a predictor for the current frame of the point cloud data; and
signaling the difference.

9. The method of claim 1, further comprising:
signaling or parsing a slice-level flag indicative of whether the scene model is used for the coding of a particular slice of a plurality of slices of the current frame of the point cloud data.
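To make the signaling structure of claims 7-8 concrete, here is a minimal Python sketch, not the patent's normative algorithm: the scene model is reduced to planar segments (claim 6), signaled whole for an intra frame (claim 7) or as a delta for an inter frame (claim 8), and then used as the predictor from which per-point position residuals are coded (claim 1). All names (Plane, SceneModel, encode_frame, the write callback) are hypothetical illustrations, not APIs from the patent or from any G-PCC reference software.

```python
# Minimal sketch (assumed planar scene model) of scene-model-based coding.
import math
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Plane:
    normal: tuple   # unit normal (nx, ny, nz)
    d: float        # offset: points x on the plane satisfy dot(normal, x) == d

@dataclass
class SceneModel:
    planes: list = field(default_factory=list)

def project_to_model(point, model):
    """Predictor for a point: its orthogonal projection onto the nearest plane."""
    best, best_dist = point, math.inf
    for pl in model.planes:
        signed = sum(n * x for n, x in zip(pl.normal, point)) - pl.d
        if abs(signed) < best_dist:
            best_dist = abs(signed)
            best = tuple(x - signed * n for x, n in zip(point, pl.normal))
    return best

def encode_frame(points, is_intra, prev_model, model, write):
    if is_intra:
        write(("scene_model", model.planes))       # claim 7: signal the full model
    else:                                          # claim 8: signal only the difference
        prev = prev_model.planes if prev_model else []
        write(("model_delta", [p for p in model.planes if p not in prev]))
    for pt in points:                              # model as predictor (claim 1)
        pred = project_to_model(pt, model)
        write(("residual", tuple(a - b for a, b in zip(pt, pred))))

# A ground plane predicts points near z == 0, so their residuals stay small.
bitstream = []
ground = SceneModel([Plane((0.0, 0.0, 1.0), 0.0)])
encode_frame([(12.0, 3.5, 0.1)], True, None, ground, bitstream.append)
```

A decoder-side sketch would mirror this: parse the model (or apply the delta to the previous model), recompute the same predictor, and add the parsed residual back.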
10. The method of claim 1, wherein determining the scene model comprises determining the scene model for a plurality of frames of the point cloud data, and wherein the method further comprises:
determining corresponding points of two frames of the plurality of frames of the point cloud data; and
determining a displacement of the corresponding points between the two frames,
wherein coding the current frame of the point cloud data based on the scene model comprises compensating for motion between the two frames based on the displacement.

11. The method of claim 1, wherein coding the current frame of the point cloud data based on the scene model comprises using the scene model as a reference for coding point cloud positions.

12. The method of claim 1, wherein the coding comprises predictive geometry coding or transform-based attribute coding, the method further comprising:
adding one or more candidates to a predictor candidate list based on the scene model; and
selecting a candidate from the predictor candidate list,
wherein coding the current frame of the point cloud data comprises coding the current frame based on the selected candidate.

13. The method of claim 1, further comprising:
determining an estimate of a position of a point in the current frame of the point cloud data based on a sensor model and the scene model, wherein coding the current frame of the point cloud data based on the scene model comprises:
using the estimate of the position of the point in the current frame of the point cloud data as a predictor; and
calculating a position residual based on the predictor.

14. The method of claim 13, wherein the sensor model represents a LIDAR (light detection and ranging) sensor, and wherein determining the estimate of the position of the point comprises:
determining a first intersection of a laser of the sensor model with the scene model based on at least one of intrinsic sensor parameters or extrinsic sensor parameters of the sensor model,
wherein using the estimate of the position of the point in the point cloud as the predictor comprises using the first intersection as the predictor.
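The LIDAR-specific predictor of claims 13-14 can be illustrated in the same hypothetical planar-scene setting: the intrinsic sensor parameters are reduced to an azimuth and a per-laser elevation angle, and the predictor for a return is the first (closest forward) intersection of the modeled laser ray with the scene model. The function names below are illustrative, not taken from the patent text.

```python
# Sketch of claims 13-14: predictor = first laser/scene-model intersection.
import math

def laser_direction(azimuth, elevation):
    """Unit ray direction from intrinsic sensor parameters (angles in radians)."""
    return (math.cos(elevation) * math.cos(azimuth),
            math.cos(elevation) * math.sin(azimuth),
            math.sin(elevation))

def ray_plane_intersection(origin, direction, normal, d):
    """Intersection of origin + t * direction with the plane dot(normal, x) == d."""
    denom = sum(n * v for n, v in zip(normal, direction))
    if abs(denom) < 1e-9:                  # ray parallel to the plane: no hit
        return None
    t = (d - sum(n * o for n, o in zip(normal, origin))) / denom
    return tuple(o + t * v for o, v in zip(origin, direction)) if t > 0 else None

def predict_point(origin, azimuth, elevation, planes):
    """First (closest forward) intersection with the scene model, or None."""
    direction = laser_direction(azimuth, elevation)
    hits = [h for h in (ray_plane_intersection(origin, direction, n, d)
                        for n, d in planes) if h is not None]
    return min(hits, default=None,
               key=lambda h: sum((a - o) ** 2 for a, o in zip(h, origin)))

# A laser aimed slightly downward from 2 m height hits the ground plane about
# 20 m ahead; the actual return minus this hit is the position residual.
ground_planes = [((0.0, 0.0, 1.0), 0.0)]
pred = predict_point((0.0, 0.0, 2.0), 0.0, -0.1, ground_planes)
```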
15. The method of claim 14, further comprising:
obtaining motion information from global positioning system data;
compensating for motion between two frames of the point cloud data, wherein compensating for the motion comprises repositioning a sensor of the sensor model relative to the scene model based on the motion information;
determining a second intersection of a laser with the scene model based on a new position of the sensor associated with the repositioning and based on the sensor model; and
predicting a point cloud corresponding with a subsequent frame of the two frames of the point cloud data based on the second intersection of the laser with the scene model.

16. The method of claim 1, further comprising sending or receiving the scene model in a bitstream.

17. The method of claim 1, further comprising refraining from sending or receiving the scene model in a bitstream.

18. A device for coding point cloud data, the device comprising:
a memory configured to store the point cloud data; and
one or more processors implemented in circuitry and communicatively coupled to the memory, the one or more processors being configured to:
determine or obtain a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and
code a current frame of the point cloud data based on the scene model.

19. The device of claim 18, wherein the scene model comprises a digital representation of a real-world scene.

20. The device of claim 18, wherein the scene model represents at least one of a road, ground, a vehicle, a pedestrian, a road sign, a traffic light, vegetation, or a building.

21. The device of claim 18, wherein the scene model represents an approximation of the current frame of the point cloud data.

22. The device of claim 18, wherein the scene model comprises a plurality of individual segments.

23. The device of claim 22, wherein the plurality of individual segments comprises a plurality of planes or a plurality of higher-order surface approximations.

24. The device of claim 18, wherein the first frame is the current frame, and wherein the one or more processors are further configured to:
determine that the current frame of the point cloud data is an intra frame;
signal or parse the scene model based on the current frame of the point cloud data being the intra frame; and
use the scene model as a predictor for the current frame of the point cloud data.
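Claims 15 and 32 add ego-motion compensation on top of that predictor: GPS-derived motion repositions the sensor inside the scene model (assumed static here), and re-intersecting every laser from the new position predicts the subsequent frame's point cloud. The sketch below reuses predict_point and ground_planes from the previous sketch and, as a simplifying assumption, treats the motion as a pure translation; a real system would apply the full pose, rotation included.

```python
# Sketch of claims 15/32: GPS motion -> repositioned sensor -> 2nd intersections.
import math

def predict_next_frame(origin, gps_displacement, laser_table, planes):
    # Reposition the sensor model relative to the scene model (claim 15).
    new_origin = tuple(o + d for o, d in zip(origin, gps_displacement))
    predicted = []
    for azimuth, elevation in laser_table:  # one entry per laser firing
        hit = predict_point(new_origin, azimuth, elevation, planes)
        if hit is not None:                 # second intersection, if any
            predicted.append(hit)
    return new_origin, predicted

# The vehicle moved 1.5 m forward between frames; one laser, four azimuths.
lasers = [(az * math.pi / 180.0, -0.1) for az in range(0, 360, 90)]
new_origin, predicted_cloud = predict_next_frame(
    (0.0, 0.0, 2.0), (1.5, 0.0, 0.0), lasers, ground_planes)
```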
25. The device of claim 18, wherein the coding comprises encoding, wherein, as part of determining or obtaining the scene model, the one or more processors are configured to obtain a first scene model and determine a second scene model, and wherein the one or more processors are further configured to:
determine that the current frame of the point cloud data is not an intra frame;
determine a difference between the first scene model and the second scene model based on the current frame of the point cloud data not being the intra frame;
use the second scene model as a predictor for the current frame of the point cloud data; and
signal the difference.

26. The device of claim 18, wherein the one or more processors are further configured to signal or parse a slice-level flag indicative of whether the scene model is used for the coding of a particular slice of a plurality of slices of the current frame of the point cloud data.

27. The device of claim 18, wherein, as part of determining the scene model, the one or more processors are configured to determine the scene model for a plurality of frames of the point cloud data, and wherein the one or more processors are further configured to:
determine corresponding points of two frames of the plurality of frames of the point cloud data; and
determine a displacement of the corresponding points between the two frames,
wherein, as part of coding the current frame of the point cloud data based on the scene model, the one or more processors are configured to compensate for motion between the two frames based on the displacement.

28. The device of claim 18, wherein, as part of coding the current frame of the point cloud data based on the scene model, the one or more processors are configured to use the scene model as a reference for coding point cloud positions.

29. The device of claim 18, wherein the coding comprises predictive geometry coding or transform-based attribute coding, and wherein the one or more processors are further configured to:
add one or more candidates to a predictor candidate list based on the scene model; and
select a candidate from the predictor candidate list,
wherein, as part of coding the current frame of the point cloud data, the one or more processors are configured to code the current frame based on the selected candidate.
30. The device of claim 18, wherein the one or more processors are further configured to determine an estimate of a position of a point in the current frame of the point cloud data based on a sensor model and the scene model, and wherein, as part of coding the current frame of the point cloud data based on the scene model, the one or more processors are configured to:
use the estimate of the position of the point in the current frame of the point cloud data as a predictor; and
calculate a position residual based on the predictor.

31. The device of claim 30, wherein the sensor model represents a LIDAR (light detection and ranging) sensor, wherein, as part of determining the estimate of the position of the point, the one or more processors are further configured to determine a first intersection of a laser of the sensor model with the scene model based on intrinsic sensor parameters and extrinsic sensor parameters of the sensor model, and wherein, as part of using the estimate of the position of the point in the point cloud as the predictor, the one or more processors are further configured to use the first intersection as the predictor.

32. The device of claim 31, wherein the one or more processors are further configured to:
obtain motion information from global positioning system data;
compensate for motion between two frames of the point cloud data, including repositioning a sensor of the sensor model relative to the scene model based on the motion information;
determine a second intersection of a laser with the scene model based on a new position of the sensor associated with the repositioning and based on the sensor model; and
predict a point cloud corresponding with a subsequent frame of the two frames of the point cloud data based on the second intersection of the laser with the scene model.

33. The device of claim 18, wherein the device comprises a vehicle, a robot, or a smartphone.

34. The device of claim 18, wherein the one or more processors are further configured to send or receive the scene model in a bitstream.

35. The device of claim 18, wherein the one or more processors are further configured to refrain from sending or receiving the scene model in a bitstream.
36. A non-transitory computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors to:
determine or obtain a scene model corresponding with a first frame of point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and
code a current frame of the point cloud data based on the scene model.

37. A device for coding point cloud data, the device comprising:
means for determining or obtaining a scene model corresponding with a first frame of the point cloud data, wherein the scene model represents objects within a scene, the objects corresponding with at least a portion of the first frame of the point cloud data; and
means for coding a current frame of the point cloud data based on the scene model.
TW110149211A 2021-01-04 2021-12-28 Model-based prediction for geometry point cloud compression TW202232953A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163133622P 2021-01-04 2021-01-04
US63/133,622 2021-01-04
US17/562,121 2021-12-27
US17/562,121 US20220215596A1 (en) 2021-01-04 2021-12-27 Model-based prediction for geometry point cloud compression

Publications (1)

Publication Number Publication Date
TW202232953A 2022-08-16

Family

ID=80050686

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110149211A TW202232953A (en) 2021-01-04 2021-12-28 Model-based prediction for geometry point cloud compression

Country Status (2)

Country Link
TW (1) TW202232953A (en)
WO (1) WO2022147008A1 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3633621A1 (en) * 2018-10-02 2020-04-08 BlackBerry Limited Predictive coding of point clouds using multiple frames of references

Also Published As

Publication number Publication date
WO2022147008A1 (en) 2022-07-07

Similar Documents

Publication Publication Date Title
US12008712B2 (en) Inter-component residual prediction for attributes in geometry point cloud compression coding
US20220207780A1 (en) Inter prediction coding for geometry point cloud compression
KR20220164701A (en) Second order component attribute coding for geometry-based point cloud compression (G-PCC)
KR20220165245A (en) Second order component attribute coding for geometry-based point cloud compression (G-PCC)
US20230105931A1 (en) Inter prediction coding with radius interpolation for predictive geometry-based point cloud compression
KR20230084147A (en) Laser Index Clipping in Predictive Geometry Coding for Point Cloud Compression
EP4272166A1 (en) Hybrid-tree coding for inter and intra prediction for geometry coding
US11869220B2 (en) Scaling of quantization parameter values in geometry-based point cloud compression (G-PCC)
KR20230127218A (en) Inter-predictive coding for geometry point cloud compression
US20220215596A1 (en) Model-based prediction for geometry point cloud compression
US12003768B2 (en) Residual coding for geometry point cloud compression
WO2023059987A1 (en) Inter prediction coding with radius interpolation for predictive geometry-based point cloud compression
US11924428B2 (en) Scale factor for quantization parameter values in geometry-based point cloud compression
TW202232953A (en) Model-based prediction for geometry point cloud compression
US20240029317A1 (en) Coordinate conversion for geometry point cloud compression
US20230345044A1 (en) Residual prediction for geometry point cloud compression
US20230230290A1 (en) Prediction for geometry point cloud compression
US20230177739A1 (en) Local adaptive inter prediction for g-pcc
US20240144543A1 (en) Predictive geometry coding of point cloud
US20240037804A1 (en) Using vertical prediction for geometry point cloud compression
US20230345045A1 (en) Inter prediction coding for geometry point cloud compression
US20220210480A1 (en) Hybrid-tree coding for inter and intra prediction for geometry coding
WO2023102484A1 (en) Local adaptive inter prediction for g-pcc
WO2023205318A1 (en) Improved residual prediction for geometry point cloud compression
KR20240087767A (en) Inter-predictive coding using radial interpolation for predicted geometry-based point cloud compression.