WO2023029858A1 - Method, device and storage medium for encapsulating and decapsulating point cloud media files - Google Patents

Method, device and storage medium for encapsulating and decapsulating point cloud media files

Info

Publication number
WO2023029858A1
WO2023029858A1 PCT/CN2022/109620
Authority
WO
WIPO (PCT)
Prior art keywords
attribute
instance
information
point cloud
attribute instance
Prior art date
Application number
PCT/CN2022/109620
Other languages
English (en)
French (fr)
Inventor
胡颖
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd. (腾讯科技(深圳)有限公司)
Publication of WO2023029858A1 publication Critical patent/WO2023029858A1/zh
Priority to US18/463,765 priority Critical patent/US20230421810A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the embodiment of the present application relates to the technical field of video processing, and in particular to a method, device and storage medium for encapsulating and decapsulating point cloud media files.
  • a point cloud is a set of discrete point sets randomly distributed in space that express the spatial structure and surface properties of a three-dimensional object or scene.
  • Point cloud media can be divided into 3 degrees of freedom (DoF for short) media, 3DoF+ media and 6DoF media according to the degree of freedom of users when consuming media content.
  • Each point in the point cloud includes geometric information and attribute information.
  • the attribute information includes different types of attribute information such as color attributes and reflectance.
  • the same type of attribute information can also include different attribute instances.
  • the color attribute of a point may include different color types, and the different color types are referred to as different attribute instances of the color attribute.
  • coding technology such as Geometry-based Point Cloud Compression (GPCC for short) supports multiple attribute instances of the same attribute type in one code stream.
  • the present application provides a point cloud media file encapsulation and decapsulation method, device and storage medium, which can selectively consume attribute instances according to the first characteristic information of at least one attribute instance among the M attribute instances added in the media file, thereby saving decoding resources and improving decoding efficiency.
  • An embodiment of the present application provides a method for encapsulating point cloud media files, which is applied to a file encapsulation device, and the method includes:
  • when the target point cloud includes instance data corresponding to M attribute instances of at least one type of attribute information, the first feature information of at least one attribute instance among the M attribute instances is used as the metadata of the instance data corresponding to the at least one attribute instance and is encapsulated in the media file of the target point cloud; the first characteristic information is used to identify the difference between the at least one attribute instance and the other attribute instances in the M attribute instances, and M is a positive integer greater than 1.
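The encapsulation step above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the patent's actual syntax: the field names `attr_instance_id`, `priority`, and `description` are hypothetical stand-ins for whatever concrete form the first characteristic information takes.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AttributeInstanceInfo:
    """'First characteristic information' distinguishing one attribute
    instance from the others of the same attribute type (hypothetical
    fields; the patent does not fix a concrete syntax here)."""
    attr_instance_id: int
    priority: int
    description: str

@dataclass
class MediaFile:
    """Toy media file: the code stream plus per-instance metadata."""
    code_stream: bytes
    instance_metadata: List[AttributeInstanceInfo] = field(default_factory=list)

def encapsulate(code_stream: bytes, infos: List[AttributeInstanceInfo]) -> MediaFile:
    """Encapsulate the code stream and attach the first characteristic
    information of the attribute instances as file-level metadata."""
    assert len(infos) > 1, "M must be a positive integer greater than 1"
    return MediaFile(code_stream=code_stream, instance_metadata=list(infos))

f = encapsulate(b"geometry+attributes",
                [AttributeInstanceInfo(0, 1, "color, BT.709"),
                 AttributeInstanceInfo(1, 2, "color, BT.2020")])
print(len(f.instance_metadata))  # 2
```

The point of the design is that the per-instance metadata travels outside the coded stream, so a consumer can inspect it without decoding anything.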
  • the embodiment of the present application also provides a method for decapsulating a point cloud media file, which is applied to a file decapsulating device, and the method includes:
  • the first information is used to indicate the first feature information of at least one attribute instance in the M attribute instances, where the M attribute instances are the attribute instances included in at least one category of the N categories of attribute information included in the target point cloud; N is a positive integer, and M is a positive integer greater than 1;
  • the embodiment of the present application also provides a point cloud media file packaging device, which is applied to a file packaging device, and the device includes:
  • An acquisition unit configured to acquire a target point cloud, and encode the target point cloud to obtain a code stream of the target point cloud
  • An encapsulation unit configured to encapsulate the code stream to obtain a media file of the target point cloud; when the target point cloud includes instance data corresponding to M attribute instances of at least one type of attribute information, the first characteristic information of at least one attribute instance in the M attribute instances is used as the metadata of the instance data corresponding to the at least one attribute instance and is encapsulated in the media file of the target point cloud, the first characteristic information being used to identify the difference between the at least one attribute instance and the other attribute instances in the M attribute instances, where M is a positive integer greater than 1.
  • the embodiment of the present application also provides a device for decapsulating a point cloud media file, which is applied to a file decapsulating device, and the device includes:
  • the transceiver unit is configured to receive the first information sent by the file encapsulation device; wherein the first information is used to indicate the first feature information of at least one attribute instance in the M attribute instances, and the M attribute instances are target points M attribute instances included in at least one type of attribute information in the N types of attribute information included in the cloud, the N is a positive integer, and the M is a positive integer greater than 1;
  • the embodiment of the present application also provides a file packaging device, including: a processor and a memory, the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method of the first aspect.
  • the embodiment of the present application also provides a file decapsulation device, including: a processor and a memory, the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method of the second aspect.
  • the embodiment of the present application also provides a computer-readable storage medium, which is used for storing a computer program, and the computer program causes a computer to execute the method in any one of the above-mentioned first aspect to the second aspect or each implementation manner thereof.
  • An embodiment of the present application also provides an electronic device, including a processor and a memory, the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to perform the first aspect and /or the method described in any one of the second aspect.
  • the file encapsulation device in the embodiment of the present application encapsulates the first feature information of at least one attribute instance among the M attribute instances in the media file of the target point cloud as metadata of the instance data corresponding to the at least one attribute instance. That is, in the embodiment of the present application, the first characteristic information of the attribute instance is added to the media file as metadata, so that the file decapsulation device can determine the specific target attribute instance to decode according to the first characteristic information in the metadata, thereby saving bandwidth and decoding resources and improving decoding efficiency.
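The decoder-side selection described above can likewise be sketched. Again a toy illustration, not the patent's syntax: the metadata dictionaries and the `priority` field are hypothetical stand-ins for the first characteristic information carried in the file.

```python
def select_target_instances(instance_metadata, max_priority):
    """Use the first characteristic information in the file-level
    metadata to decide which attribute instances to decode; the rest
    are skipped entirely, saving bandwidth and decoding resources."""
    return [m for m in instance_metadata if m["priority"] <= max_priority]

# Three instances of one attribute type; only two fall within the
# decoder's chosen priority budget and are actually decoded.
meta = [{"attr_instance_id": 0, "priority": 1},
        {"attr_instance_id": 1, "priority": 2},
        {"attr_instance_id": 2, "priority": 3}]
chosen = select_target_instances(meta, max_priority=2)
print([m["attr_instance_id"] for m in chosen])  # [0, 1]
```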
  • Figure 1 shows a schematic diagram of three degrees of freedom
  • Figure 2 shows a schematic diagram of three degrees of freedom plus (3DoF+)
  • Figure 3 shows a schematic diagram of six degrees of freedom
  • FIG. 4A is an architectural diagram of an immersive media system provided by an embodiment of the present application.
  • FIG. 4B is a schematic diagram of the content flow of the V3C media provided by the embodiment of the present application.
  • Fig. 5 is a flow chart of a method for encapsulating a point cloud media file provided by an embodiment of the present application
  • Fig. 6 is an interactive flowchart of a method for encapsulating and decapsulating a point cloud media file provided in an embodiment of the present application
  • FIG. 7 is an interactive flow chart of a point cloud media file encapsulation and decapsulation method provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a point cloud media file packaging device provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a device for decapsulating a point cloud media file provided in an embodiment of the present application.
  • Fig. 10 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the embodiment of the present application relates to the data processing technology of point cloud media.
  • Point cloud is a set of discrete point sets randomly distributed in space that express the spatial structure and surface properties of a three-dimensional object or scene. Each point in the point cloud has at least three-dimensional position information, and may also have color, material or other information depending on the application scenario. Typically, each point in a point cloud has the same number of additional attributes.
  • V3C volumetric media: visual volumetric video-based coding media, i.e., immersive media that captures visual content in three-dimensional space, provides a 3DoF+ or 6DoF viewing experience, and is encoded with traditional video coding; its file encapsulation contains volumetric video-type tracks, including multi-view video, video-encoded point clouds, etc.
  • PCC Point Cloud Compression, point cloud compression.
  • G-PCC Geometry-based Point Cloud Compression, point cloud compression based on geometric models.
  • V-PCC Video-based Point Cloud Compression, point cloud compression based on traditional video coding.
  • Atlas Indicates the area information on the 2D plane frame, the area information on the 3D presentation space, and the mapping relationship between them and the necessary parameter information required for mapping.
  • Track a collection of media data during the media file encapsulation process.
  • a media file can consist of multiple tracks.
  • a media file can include a video track, an audio track, and a subtitle track.
  • Sample the encapsulation unit in the media file encapsulation process
  • a media track consists of many samples.
  • a sample of a video track is usually a video frame.
  • DoF Degree of Freedom
  • degree of freedom: in a mechanical system, the number of independent coordinates. Besides the degrees of freedom of translation, there are also degrees of freedom of rotation and vibration. In this embodiment of the application, it refers to the degrees of freedom of movement supported, and of interaction with content, when the user watches immersive media.
  • 3DoF Three degrees of freedom, referring to the three degrees of freedom in which the user's head rotates around the XYZ axis.
  • Figure 1 schematically shows a schematic diagram of three degrees of freedom. As shown in Figure 1, it means that at a certain place and a certain point, all three axes can be rotated, the head can be turned, the head can be bowed up and down, and the head can also be swung.
  • If the panoramic picture is static, it can be understood as a panoramic picture; if the panoramic picture is dynamic, it is a panoramic video, that is, a VR video.
  • However, VR videos have certain limitations: users cannot move, and cannot choose an arbitrary place from which to watch.
  • 3DoF+ On the basis of three degrees of freedom, users also have limited freedom of movement along the XYZ axis, which can also be called restricted six degrees of freedom, and the corresponding media stream can be called restricted six degrees of freedom media stream.
  • Figure 2 schematically shows a diagram of three degrees of freedom plus (3DoF+).
  • 6DoF In addition to the three degrees of freedom, the user also has the freedom to move freely along the XYZ axis, and the corresponding media stream can be called the six degrees of freedom media stream.
  • Fig. 3 schematically shows a schematic diagram of six degrees of freedom.
  • 6DoF media refers to 6-degree-of-freedom video, which means that the video can provide users with a high-degree-of-freedom viewing experience of freely moving the viewpoint along the XYZ axes of three-dimensional space and freely rotating the viewpoint around the XYZ axes.
  • 6DoF media is a video combination of different perspectives in space collected by a camera array.
  • 6DoF media data is expressed as a combination of the following information: texture maps collected by multiple cameras, the depth maps corresponding to the multi-camera texture maps, and the corresponding 6DoF media content description metadata.
  • the metadata includes the parameters of the multi-camera, as well as description information such as the splicing layout and edge protection of the 6DoF media.
  • the multi-camera texture map information and the corresponding depth map information are spliced, and the description data of the splicing method is written into the metadata according to the defined syntax and semantics.
  • the spliced multi-camera depth map and texture map information are encoded by planar video compression, and transmitted to the terminal for decoding, then the 6DoF virtual viewpoint requested by the user is synthesized, thereby providing the user with a viewing experience of 6DoF media.
  • AVS Audio Video Coding Standard, audio and video coding standard.
  • ISOBMFF ISO Based Media File Format, a media file format based on the ISO (International Standard Organization, International Organization for Standardization) standard.
  • MP4 Moving Picture Experts Group 4 (MPEG-4).
  • DASH dynamic adaptive streaming over HTTP
  • dynamic adaptive streaming based on HTTP is an adaptive bit rate streaming technology, which enables high-quality streaming media to be delivered over the Internet through traditional HTTP web servers.
  • MPD media presentation description, media presentation description signaling in DASH, used to describe media segment information.
  • HEVC High Efficiency Video Coding, the international video coding standard HEVC/H.265.
  • VVC Versatile Video Coding, the international video coding standard VVC/H.266.
  • Intra(picture)Prediction Intra-frame prediction.
  • Inter(picture)Prediction Inter-frame prediction.
  • SCC Screen Content Coding.
  • QP Quantization Parameter, quantization parameter.
  • Immersive media refers to media content that can bring consumers an immersive experience.
  • Immersive media can be divided into 3DoF media, 3DoF+ media and 6DoF media according to the degree of freedom of users when consuming media content.
  • Common 6DoF media include point cloud media.
  • a point cloud is a set of discrete point sets randomly distributed in space that express the spatial structure and surface properties of a three-dimensional object or scene.
  • Each point in the point cloud has at least three-dimensional position information, and may also have color, material or other information depending on the application scenario.
  • each point in a point cloud has the same number of additional attributes.
  • Point cloud can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes, so it is widely used, including virtual reality (Virtual Reality, VR) games, computer aided design (Computer Aided Design, CAD), geographic information system (Geography Information System, GIS), automatic navigation system (Autonomous Navigation System, ANS), digital cultural heritage, free viewpoint broadcasting, 3D immersive telepresence, 3D reconstruction of biological tissues and organs, etc.
  • Point cloud acquisition mainly has the following methods: computer generation, 3D laser scanning, 3D photogrammetry, etc.
  • Computers can generate point clouds of virtual 3D objects and scenes.
  • 3D scanning can obtain point clouds of static real-world three-dimensional objects or scenes, acquiring millions of points per second.
  • 3D cameras can obtain point clouds of dynamic real-world three-dimensional objects or scenes, acquiring tens of millions of points per second.
  • point clouds of biological tissues and organs can be obtained from MRI, CT, and electromagnetic positioning information.
  • After encoding the point cloud media, the encoded data stream needs to be encapsulated and transmitted to the user. Correspondingly, on the point cloud media player side, the point cloud file must first be decapsulated, then decoded, and finally the decoded data stream is presented.
  • FIG. 4A is a structural diagram of an immersive media system provided by an embodiment of the present application.
  • the immersive media system includes an encoding device and a decoding device.
  • the encoding device may refer to a computer device used by a provider of the immersive media; the computer device may be a terminal (such as a PC (Personal Computer) or a smart mobile device such as a smartphone) or a server.
  • the decoding device may refer to a computer device used by a user of the immersive media; the computer device may be a terminal (such as a PC (Personal Computer), a smart mobile device (such as a smartphone), or a VR device (such as a VR helmet or VR glasses)).
  • the data processing process of immersive media includes the data processing process on the encoding device side and the data processing process on the decoding device side.
  • the data processing process at the encoding device side mainly includes:
  • the data processing process at the decoding device side mainly includes:
  • the transmission process involving immersive media between the encoding device and the decoding device can be carried out based on various transmission protocols.
  • the transmission protocols here include but are not limited to: DASH (Dynamic Adaptive Streaming over HTTP), HLS (HTTP Live Streaming), SMTP (Smart Media Transport Protocol), TCP (Transmission Control Protocol), etc.
  • the media content of immersive media is obtained by capturing real-world audio-visual scenes through capture devices.
  • the capture device may be a hardware component provided in the encoding device, for example, the capture device may include a microphone, a camera, a sensor, and the like of the terminal.
  • the capture device may also include a hardware device connected to the encoding device, for example, a camera connected to the server.
  • the capture device may include but not limited to: audio device, camera device and sensor device.
  • the audio device may include an audio sensor, a microphone, and the like.
  • the camera device may include a common camera, a stereo camera, a light field camera, and the like.
  • Sensing devices may include laser devices, radar devices, and the like.
  • the number of capture devices can be multiple. These capture devices are deployed at some specific locations in the real space to simultaneously capture audio content and video content from different angles in the space, and the captured audio content and video content are synchronized in time and space.
  • the media content captured by the capture device is called the raw data of immersive media.
  • the captured audio content is itself suitable for performing audio encoding for immersive media.
  • the captured video content can only be suitable as the content of video encoding for immersive media after undergoing a series of production processes.
  • the production process may include the following steps.
  • splicing refers to splicing the video content shot at these various angles into a complete video that can reflect the 360-degree visual panorama of the real space, that is, the spliced video is a panoramic video (or spherical video) represented in three-dimensional space.
  • Projection refers to the process of mapping the three-dimensional video formed by splicing onto a two-dimensional (2D) image.
  • the 2D image formed by projection is called projected image.
  • Projection methods may include, but are not limited to: latitude and longitude diagram projection, regular hexahedron projection.
  • the projected image can be encoded directly, or the projected image can be encoded after region encapsulation.
  • region encapsulation technology is widely used in the process of video processing of immersive media.
  • area encapsulation refers to the process of converting the projected image by area, and the area encapsulation process converts the projected image into an encapsulated image.
  • the process of area encapsulation specifically includes: dividing the projected image into multiple mapped areas, and then converting the multiple mapped areas to obtain multiple encapsulated areas, and mapping the multiple encapsulated areas into a 2D image to obtain an encapsulated image.
  • the mapping area refers to the area obtained by dividing in the projected image before performing area encapsulation;
  • the encapsulating area refers to the area located in the encapsulating image after performing area encapsulation.
  • Conversion processing may include, but is not limited to: mirroring, rotation, rearrangement, up-sampling, down-sampling, changing the resolution of an area, and moving.
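The conversion operations listed above can be illustrated with a minimal sketch. The region is modeled here simply as a list of rows of sample values; the function names are illustrative, and real region-wise packing operates on image tiles driven by signaled metadata.

```python
def rotate90_cw(region):
    """Rotate a 2D region (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*region[::-1])]

def mirror_h(region):
    """Mirror a region horizontally."""
    return [row[::-1] for row in region]

# A 2x2 mapped region cut from the projected image is converted
# before being placed into the packed (encapsulated) image.
mapped_region = [[1, 2],
                 [3, 4]]
packed_region = rotate90_cw(mapped_region)
print(packed_region)  # [[3, 1], [4, 2]]
```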
  • Since the capture device can only capture panoramic video, after such video is processed by the encoding device and transmitted to the decoding device for corresponding data processing, the user on the decoding device side can watch 360-degree video information only by performing some specific actions (such as rotating the head); performing non-specific actions (such as moving the head) does not produce corresponding video changes, and the VR experience is poor. It is therefore necessary to provide additional depth information matched to the panoramic video so that users obtain better immersion and a better VR experience, which involves 6DoF (Six Degrees of Freedom) production technology. When the user can move relatively freely in the simulated scene, it is called 6DoF.
  • When using 6DoF production technology to produce immersive media video content, the capture device generally uses light field cameras, laser equipment, radar equipment, etc. to capture point cloud data or light field data in space; carrying out the above production processes 1-3 also requires some specific processing, such as cutting and mapping of the point cloud data and computation of depth information.
  • the captured audio content can be directly encoded to form an audio stream of immersive media.
  • video encoding is performed on the projected image or packaged image to obtain the video stream of the immersive media, for example, encoding the packaged image (D) into an encoded image (Ei) or an encoded video bitstream (Ev).
  • the captured audio (Ba) is encoded into an audio bitstream (Ea).
  • the encoded images, video and/or audio are assembled into a media file (F) for file playback or an initialization segment and a sequence of media segments (Fs) for streaming.
  • the encoding device side also includes metadata, such as projection and region information, into the file or fragment, which helps render the decoded packed picture.
  • a specific coding method (such as point cloud coding) needs to be used for coding in the video coding process.
  • the media file resources may be media files or media fragments that form the immersive media file; media presentation description (MPD) signaling is used to record the metadata of the immersive media file resources according to the file format requirements of immersive media, where metadata is a general term for information related to the presentation of the immersive media.
  • the metadata may include description information of media content, description information of a window, signaling information related to presentation of media content, and the like.
  • the encoding device stores media presentation description information and media file resources formed after the data processing process.
  • the immersive media system supports data boxes (Box).
  • a data box refers to a data block or object including metadata, that is, a data box includes metadata of corresponding media content.
  • Immersive media can include multiple data boxes, such as the Sphere Region Zooming Box (SphereRegionZoomingBox), which contains metadata describing sphere region zooming information; the 2D Region Zooming Box (2DRegionZoomingBox), which contains metadata describing 2D region zooming information; and the Region Wise Packing Box (RegionWisePackingBox), which contains metadata describing the corresponding information in the region-wise packing process; and so on.
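ISOBMFF data boxes of this kind share one wire layout: a 32-bit big-endian size covering the whole box, a 4-byte type code (fourcc), then the payload. A minimal sketch of that layout follows; the fourcc `meta` here is just an example, not one of the specific boxes named above.

```python
import struct

def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize a minimal ISOBMFF-style box: 32-bit size (8-byte
    header plus payload length), 4-byte type, then the payload."""
    assert len(box_type) == 4
    return struct.pack(">I", 8 + len(payload)) + box_type + payload

def parse_box(buf: bytes):
    """Parse one box from the front of a buffer, returning (type, payload)."""
    size, = struct.unpack(">I", buf[:4])
    return buf[4:8], buf[8:size]

box = make_box(b"meta", b"\x01\x02")
print(parse_box(box))  # (b'meta', b'\x01\x02')
```

Because every box self-describes its size and type, a reader can skip boxes it does not understand, which is how optional metadata like the boxes above coexists with ordinary media data.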
  • the decoding device can dynamically obtain immersive media media file resources and corresponding media presentation description information from the encoding device through the recommendation of the encoding device or adaptively according to the needs of the user on the decoding device.
  • For example, the decoding device determines the orientation and position of the user according to tracking information, and then dynamically requests corresponding media file resources from the encoding device based on the determined orientation and position.
  • Media file resources and media presentation description information are transmitted from the encoding device to the decoding device through a transmission mechanism (such as DASH, SMT).
  • the process of decapsulating the file on the decoding device is opposite to the process of encapsulating the file on the encoding device.
  • the decoding device decapsulates the media file resource according to the file format requirements of immersive media to obtain audio and video streams.
  • the decoding process at the decoding device end is opposite to the encoding process at the encoding device end.
  • the decoding device performs audio decoding on the audio code stream to restore the audio content.
  • the decoding process of the video code stream by the decoding device includes the following:
  • If the metadata indicates that the immersive media has performed the region encapsulation process, the planar image obtained by decoding refers to the encapsulated image; if the metadata indicates that the immersive media has not performed the region encapsulation process, the planar image refers to the projected image;
  • the decoding device decapsulates the area of the encapsulated image to obtain the projected image.
  • region decapsulation is opposite to region encapsulation, and region decapsulation refers to the process of inversely transforming the encapsulated image according to region.
  • Region unpacking causes the encapsulated image to be converted to a projected image.
  • the process of region decapsulation specifically includes: performing inverse conversion processing on the multiple encapsulated regions in the encapsulated image according to the instruction of the metadata to obtain multiple mapped regions, and mapping the multiple mapped regions to a 2D image to obtain a projection image.
  • the inverse conversion process refers to the inverse process of the conversion process. For example, if the conversion process refers to a 90-degree counterclockwise rotation, then the inverse conversion process refers to a 90-degree clockwise rotation.
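The rotation example in the text can be checked directly: composing the conversion with its inverse returns the original region. A small sketch, with the region again modeled as a list of rows:

```python
def rotate90_ccw(region):
    """90-degree counter-clockwise rotation (the conversion process)."""
    return [list(row) for row in zip(*region)][::-1]

def rotate90_cw(region):
    """90-degree clockwise rotation (the inverse conversion process)."""
    return [list(row) for row in zip(*region[::-1])]

region = [[1, 2],
          [3, 4]]
# The inverse conversion undoes the conversion exactly.
print(rotate90_cw(rotate90_ccw(region)) == region)  # True
```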
  • the projected image is reconstructed according to the media presentation description information to convert it into a 3D image.
  • the reconstruction process here refers to the process of reprojecting the two-dimensional projected image into a 3D space.
  • the decoding device renders the audio content obtained by audio decoding and the 3D image obtained by video decoding according to the metadata related to rendering and window in the media presentation description information. After the rendering is completed, the playback and output of the 3D image is realized.
  • the decoding device mainly renders the 3D image based on the current viewpoint, disparity, depth information, etc.; if 6DoF production technology is used, the decoding device mainly renders the 3D image within the window based on the current viewpoint.
  • the viewpoint refers to the viewing position of the user
  • the parallax refers to the visual difference caused by the user's binoculars or due to movement
  • the window refers to the viewing area.
  • Fig. 4B is a schematic diagram of the content flow of GPCC point cloud media provided by an embodiment of the present application.
  • the immersive media system includes a file encapsulator and a file decapsulator.
  • the file encapsulator can be understood as the above encoding device
  • the file decapsulator can be understood as the above decoding device.
  • a real-world visual scene (A) is captured by an array of cameras or a camera device with multiple lenses and sensors.
  • the acquisition result is the source point cloud data (B).
  • One or more point cloud frames are encoded into a G-PCC bitstream, including the encoded geometry bitstream and attribute bitstream (E).
  • E encoded geometry bitstream and attribute bitstream
  • one or more encoded bitstreams are combined into a media file (F) for file playback or a sequence of initialization segments and media segments for streaming (Fs).
  • the media container file format is the ISO base media file format specified in ISO/IEC 14496-12.
• the file encapsulator also includes metadata in the file or in the segments. The segments Fs are delivered to the player using a delivery mechanism.
  • the file (F) output by the file packer is the same as the file (F') input by the file depacker.
  • a file decapsulator processes files (F') or received segments (F's), extracts the encoded bitstream (E') and parses metadata.
• the G-PCC bitstream is then decoded into a decoded signal (D'), and point cloud data is generated from the decoded signal (D'). Where applicable, the point cloud data is rendered and displayed on the screen of a head-mounted display or any other display device based on the current viewing position, viewing direction, or viewport determined by various types of sensors (e.g., head tracking), where tracking can use a position tracking sensor or an eye tracking sensor.
  • the current viewing position or viewing direction can also be used for decoding optimization.
  • the current viewing position and viewing direction are also passed to a strategy module (not shown) for deciding which tracks to receive or decode.
• E/E': the coded G-PCC bitstream
• F/F': a media file that includes a track format specification, which may contain constraints on the elementary streams carried in the track samples.
  • Each point in the point cloud includes geometric information and attribute information.
  • the attribute information includes different types of attribute information such as color attribute and reflectance.
  • the attribute information of the same type may also include different attribute instances.
  • An attribute instance is a specific instance of an attribute, in which the value of the attribute is specified.
  • the color attribute of a point includes different color types, and the different color types are called different attribute instances of the color attribute.
• current coding technology, such as Geometry-based Point Cloud Compression (GPCC for short), supports multiple attribute instances of the same attribute type in one code stream, and multiple attribute instances of the same attribute type can be distinguished by an attribute instance id.
• the current point cloud media encapsulation technology, such as GPCC encoding technology, supports multiple attribute instances of the same attribute type existing in the code stream at the same time, but there is no corresponding information indication, making it impossible for the file decapsulation device to determine which attribute instance to consume.
• the file encapsulation device in the embodiment of the present application adds, during the encapsulation process of the media file, the first characteristic information of at least one attribute instance among the M attribute instances of the same type of attribute information of the target point cloud to the media file.
• the file decapsulating device can determine the specific target attribute instance to decode according to the first characteristic information of the attribute instance, thereby saving bandwidth and decoding resources, and improving decoding efficiency.
  • FIG. 5 is a flow chart of a point cloud media file encapsulation method provided by an embodiment of the present application. As shown in Figure 5, the method includes the following steps:
  • the file encapsulation device acquires a target point cloud, encodes the target point cloud, and obtains a code stream of the target point cloud.
  • the file encapsulation device is also called a point cloud encapsulation device, or a point cloud encoding device.
  • the above-mentioned target point cloud is an overall point cloud.
  • the above-mentioned target point cloud is a part of the overall point cloud, such as a subset of the overall point cloud.
  • the target point cloud is also referred to as target point cloud data or target point cloud media content or target point cloud content.
  • the ways for the file encapsulation device to obtain the target point cloud include but are not limited to the following.
• Method 1: the file encapsulation device obtains the target point cloud from the point cloud collection device; for example, the file encapsulation device obtains the point cloud collected by the point cloud collection device as the target point cloud.
• Method 2: the file encapsulation device obtains the target point cloud from the storage device. For example, after the point cloud acquisition device collects the point cloud data, it stores the point cloud data in the storage device, and the file encapsulation device obtains the target point cloud from the storage device.
• Method 3: if the above-mentioned target point cloud is a local point cloud, after the file encapsulation device obtains the overall point cloud according to the above-mentioned Method 1 or Method 2, it divides the overall point cloud into blocks and uses one of the blocks as the target point cloud.
  • the target point cloud in the embodiment of the present application includes N types of attribute information, at least one type of attribute information in the N types of attribute information includes M attribute instances, where N is a positive integer, and M is a positive integer greater than 1.
  • the target point cloud includes the instance data corresponding to the M attribute instances, for example, the instance data of the attribute instance whose attribute value of attribute type A is A1.
  • the target point cloud includes N types of attribute information such as color attribute, reflectance attribute, and transparency attribute.
  • the color attribute includes M different attribute instances, for example, the color attribute includes a blue attribute instance, a red attribute instance, and the like.
  • the encoding of the target point cloud includes encoding the geometric information and the attribute information of the point cloud respectively to obtain the geometric code stream and the attribute code stream of the point cloud.
  • the geometric information and the attribute information of the target point cloud are encoded simultaneously, and the obtained point cloud code stream includes the geometric information and the attribute information.
  • the embodiment of the present application mainly involves the encoding of the attribute information of the target point cloud.
  • the file encapsulation device encapsulates the code stream of the target point cloud according to the first feature information of at least one attribute instance among the M attribute instances, to obtain a media file of the target point cloud.
  • the media file of the target point cloud includes the first feature information of the at least one attribute instance.
• the file encapsulation device may encapsulate the first feature information of at least one attribute instance among the M attribute instances in the media file of the target point cloud as metadata of the instance data corresponding to the at least one attribute instance.
  • the first characteristic information is used to identify the difference between the at least one attribute instance and other attribute instances in the above M attribute instances except the at least one attribute instance.
• for ease of description, the "instance data corresponding to the attribute instance" will be referred to as the "attribute instance" for short.
  • the first feature information of the attribute instance can be understood as information used to identify that the attribute instance is different from other attribute instances in the M attribute instances. For example, priority, identity, etc. of attribute instances.
  • the embodiment of the present application does not limit the specific content of the first characteristic information of the attribute instance.
  • the first feature information of the attribute instance includes: at least one of an identifier of the attribute instance, a priority of the attribute instance, and a type of the attribute instance.
  • the identifier of the attribute instance is represented by a field attr_instance_id, and different values of this field represent the identifier value of the attribute instance.
  • the priority of the attribute instance is indicated by the field attr_instance_priority. In some embodiments, the smaller the value of this field, the higher the priority of the attribute instance.
  • Attr_instance_id can be reused to indicate the priority of the attribute instance, for example, the smaller the value of attr_instance_id, the higher the priority of the attribute instance.
  • the type of the attribute instance also referred to as the selection policy of the attribute instance, is represented by the field attr_instance_type, and different values of this field represent different types of the attribute instance.
  • the type of the attribute instance can be understood as a strategy for instructing the file decapsulation device to select a target attribute instance from M attribute instances of the same type.
  • it can be understood as a consumption scenario used to indicate different attribute instances.
• for example, the consumption scenario of an attribute instance is that the attribute instance is associated with scenario 1, so that a file decapsulation device in scenario 1 can request the attribute instance associated with scenario 1 to obtain the instance data of that attribute instance.
• the type of attribute instance includes at least one of an attribute instance associated with a recommended viewport and an attribute instance associated with user feedback.
• the file decapsulation device can determine the attribute instance associated with the user feedback information according to the user feedback information, and then determine that attribute instance as the target attribute instance to be decoded.
• the file decapsulation device may determine the attribute instance associated with the recommended window according to the relevant information of the recommended window, and then determine that attribute instance as the target attribute instance to be decoded.
• when the value of the field attr_instance_type is the first value, it indicates that the type of the attribute instance is an attribute instance associated with the recommended window.
• when the value of the field attr_instance_type is the second value, it indicates that the type of the attribute instance is an attribute instance associated with user feedback.
• the values of attr_instance_type are described as follows:
• first value: the instance associated with the recommended viewport
• second value: the instance associated with user feedback
• other: reserved
  • the above-mentioned first value is 0.
  • the above-mentioned second value is 1.
  • first value and the second value are only an illustration of the first value and the second value, and the values of the first value and the second value include but are not limited to the above-mentioned 0 and 1, which are specifically determined according to actual conditions.
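As a minimal sketch of this mapping (assuming, as in the example above, that the first value is 0 and the second value is 1; the constant and function names below are hypothetical and not part of any specification):

```python
# Illustrative sketch only: mapping attr_instance_type values to their
# meanings, assuming the first value is 0 and the second value is 1.
# The constant and function names are hypothetical.

ATTR_INSTANCE_TYPE_VIEWPORT = 0       # instance associated with the recommended viewport
ATTR_INSTANCE_TYPE_USER_FEEDBACK = 1  # instance associated with user feedback

def describe_attr_instance_type(value):
    """Return a human-readable description of an attr_instance_type value."""
    if value == ATTR_INSTANCE_TYPE_VIEWPORT:
        return "instance associated with the recommended viewport"
    if value == ATTR_INSTANCE_TYPE_USER_FEEDBACK:
        return "instance associated with user feedback"
    return "reserved"
```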
  • the first characteristic information of at least one attribute instance among the M attribute instances belonging to the same type of attribute information is added to the media file of the target point cloud.
  • the embodiment of the present application does not limit the specific adding position of the first characteristic information of the at least one attribute instance in the media file, for example, it may be added in the header sample of the track corresponding to the at least one attribute instance.
• the implementation process of encapsulating the code stream of the target point cloud to obtain the media file of the target point cloud (that is, of adding the first feature information of at least one of the M attribute instances to the media file of the target point cloud) includes the following situations.
• the point cloud code stream is packaged with the point cloud frame as the package unit, and one frame of point cloud can be understood as the point cloud scanned by the point cloud acquisition device during one scan.
  • a frame of point cloud is a point cloud with a preset size.
  • the first feature information of at least one attribute instance may be added to the sub-sample data box.
• if each of the N types of attribute information of the target point cloud is encapsulated in a sub-sample, and the M attribute instances are attribute instances of the a-th type of attribute information, then the first characteristic information of at least one attribute instance among the M attribute instances is added to the sub-sample data box of the a-th type of attribute information.
  • the data structure of the sub-sample data box corresponding to the above-mentioned case 1 is as follows:
  • the codec_specific_parameters field in the subsample data box SubsampleInformationBox is defined as follows:
  • payloadType is used to indicate the tlv_type data type of the G-PCC unit in the subsample.
  • AttrIdx is used to indicate the ash_attr_sps_attr_idx of the G-PCC unit containing attribute data in the subsample.
• a value of 1 for multi_attr_instance_flag indicates that there are multiple attribute instances of the current attribute type; a value of 0 indicates that there is only one attribute instance of the current attribute type.
• attr_instance_id indicates the identifier of the attribute instance.
• attr_instance_priority indicates the priority of the attribute instance; the smaller the value of this field, the higher the priority of the attribute instance.
• the client can discard attribute instances with low priority.
• attr_instance_type indicates the type of the attribute instance. This field is used to indicate the consumption scenarios of different instances; the meaning of its values is the same as in the attr_instance_type value table above.
• after the file decapsulation device obtains the media file, it can obtain the first characteristic information of at least one attribute instance among the M attribute instances from the above-mentioned sub-sample data box, and then determine the target attribute instance to be decoded, thereby avoiding decoding all attribute instances and improving decoding efficiency.
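The selection logic described for this case can be sketched as follows. The dict-based records are a hypothetical simplification of the codec_specific_parameters fields above (in the real file they are bit fields of SubsampleInformationBox), and a smaller attr_instance_priority value means a higher priority:

```python
# Hypothetical sketch of a decapsulator picking the target attribute
# instance from sub-sample records. The dict layout simplifies the
# codec_specific_parameters fields described above; a smaller
# attr_instance_priority value means a higher priority.

def pick_target_instance(subsamples, attr_idx):
    """Return the highest-priority attr_instance_id for the given attrIdx, or None."""
    candidates = [s for s in subsamples
                  if s["attrIdx"] == attr_idx and s["multi_attr_instance_flag"] == 1]
    if not candidates:
        return None  # a single instance (or none) for this attribute type
    return min(candidates, key=lambda s: s["attr_instance_priority"])["attr_instance_id"]

subsamples = [
    {"attrIdx": 0, "multi_attr_instance_flag": 1,
     "attr_instance_id": 0, "attr_instance_priority": 2},
    {"attrIdx": 0, "multi_attr_instance_flag": 1,
     "attr_instance_id": 1, "attr_instance_priority": 0},
]
```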
• the geometric information and attribute information of a frame of point cloud are encapsulated separately; for example, the geometric information is encapsulated in a geometry track, and each attribute instance of each of the N types of attribute information is encapsulated in a track or item.
• in this case, the first characteristic information of at least one attribute instance can be added to the component data box corresponding to the M attribute instances.
  • the data structure of the component data box corresponding to the above-mentioned case 2 is as follows:
  • gpcc_type is used to indicate the type of GPCC component, and its value meaning is shown in Table 2.
• Table 2: values of gpcc_type
• 1: reserved
• 2: geometry data
• 3: reserved
• 4: attribute data
• 5..31: reserved
• attr_index is used to indicate the sequence number of the attribute indicated in the SPS (Sequence Parameter Set).
• a value of 1 for attr_type_present_flag indicates that the attribute type information is indicated in the GPCCComponentInfoBox data box; a value of 0 indicates that the attribute type information is not indicated in the GPCCComponentInfoBox data box.
• attr_type indicates the type of the attribute component, and its values are shown in Table 3.
• attr_name is used to indicate human-readable attribute component type information.
• a value of 1 for multi_attr_instance_flag indicates that there are multiple attribute instances of the current attribute type; a value of 0 indicates that there is only one attribute instance of the current attribute type.
• attr_instance_id indicates the identifier of the attribute instance.
• attr_instance_priority indicates the priority of the attribute instance; the smaller the value of this field, the higher the priority of the attribute instance.
• the client can discard attribute instances with low priority.
• attr_instance_id can be reused to indicate the priority of the attribute instance, and the smaller the value of attr_instance_id, the higher the priority of the attribute instance.
• attr_instance_type indicates the type of the attribute instance. This field is used to indicate the consumption scenarios of different instances. The meaning of its values is as follows:
• attr_instance_type value: description
• 0: the instance associated with the recommended viewport
• 1: the instance associated with user feedback
• other: reserved
• after the file decapsulating device obtains the media file, it can obtain the first characteristic information of at least one attribute instance among the M attribute instances from the above-mentioned component data box, and then determine the target attribute instance to be decoded, thereby avoiding decoding all attribute instances and improving decoding efficiency.
• M attribute instances belonging to the same type of attribute information can be encapsulated in M tracks or items in one-to-one correspondence, where one track or item includes one attribute instance, so that the first characteristic information of an attribute instance can be added directly to the data box of the track or item corresponding to that attribute instance.
  • each of the M attribute instances of the same type of attribute information is encapsulated in a track to obtain M tracks, and these M tracks form a track group.
  • the first characteristic information of at least one attribute instance among the above M attribute instances can be added to the track group data box (AttributeInstanceTrackGroupBox).
  • each of the M attribute instances of the same type of attribute information is encapsulated in an item to obtain M items, and these M items constitute an entity group.
• the first characteristic information of at least one attribute instance among the M attribute instances can be added to the entity group data box (AttributeInstanceEntityToGroupBox).
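The grouping described in these two cases can be sketched as follows; the track records below are hypothetical simplifications of what a track group or entity group data box carrying the M attribute instance identifiers would convey:

```python
# Hypothetical sketch: collecting the attribute instance tracks (or items)
# of each attribute type into one group, as a track group / entity group
# carrying the M attribute instance identifiers would.
from collections import defaultdict

def group_instance_tracks(tracks):
    """Map each attr_type to the list of attr_instance_ids grouped under it."""
    groups = defaultdict(list)
    for t in tracks:
        groups[t["attr_type"]].append(t["attr_instance_id"])
    return dict(groups)

tracks = [
    {"track_id": 1, "attr_type": "color", "attr_instance_id": 0},
    {"track_id": 2, "attr_type": "color", "attr_instance_id": 1},
    {"track_id": 3, "attr_type": "reflectance", "attr_instance_id": 0},
]
```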
  • the adding position of the first characteristic information in the media file of the target point cloud includes but not limited to the above three situations.
  • the method of the present application further includes S502-1:
  • the file encapsulating device adds second characteristic information of the attribute instance to the metadata track of the recommended window associated with the attribute instance.
  • the second characteristic information of the attribute instance is consistent with the first characteristic information of the attribute instance, including at least one of an identifier of the attribute instance, a priority of the attribute instance, and a type of the attribute instance.
  • the second characteristic information of the attribute instance includes at least one of an identifier of the attribute instance and an attribute type of the attribute instance.
  • the second characteristic information of the attribute instance includes the identifier of the attribute instance.
  • the second characteristic information of the attribute instance includes an identifier of the attribute instance and an attribute type of the attribute instance.
  • adding the second characteristic information of the attribute instance to the metadata track of the recommended window can be realized through the following procedure:
• the camera extrinsic information ExtCameraInfoStruct() shall appear in the sample entry or in the sample. The following condition shall not occur: the value of dynamic_ext_camera_flag is 0 and the value of camera_extrinsic_flag[i] in all samples is 0.
  • num_viewports indicates the number of viewports indicated in the sample.
  • viewport_id[i] indicates the identifier of the corresponding viewport.
  • viewport_cancel_flag[i] is 1, which indicates that the window whose viewport identifier is viewport_id[i] is canceled.
• a value of 1 for camera_intrinsic_flag[i] indicates that the i-th window in the current sample has camera intrinsic parameters. If dynamic_int_camera_flag is 0, this field shall be 0. Likewise, when the value of camera_extrinsic_flag[i] is 0, this field shall take the value 0.
  • camera_extrinsic_flag[i] is 1, indicating that the i-th window in the current sample has camera extrinsic parameters. If dynamic_ext_camera_flag is 0, this field must be 0.
• a value of 1 for attr_instance_asso_flag[i] indicates that the i-th window in the current sample is associated with a corresponding attribute instance.
• when the value of attr_instance_type is 0, the value of attr_instance_asso_flag in at least one sample in the current track shall be 1.
• attr_type indicates the type of the attribute component; for its values, refer to Table 3 above.
• attr_instance_id indicates the identifier of the attribute instance.
  • the second characteristic information of the attribute instance is added to the metadata track of the recommended window associated with the attribute instance.
  • the file decapsulation device requests the metadata track of the recommended window, it can determine the target attribute instance to be decoded according to the second characteristic information of the attribute instance added in the metadata track of the recommended window.
• for example, the second characteristic information includes the identifier of the attribute instance, and the file decapsulation device may send the identifier of the attribute instance to the file encapsulation device, so that the file encapsulation device sends the media file of the attribute instance corresponding to that identifier to the file decapsulation device for consumption, avoiding the file decapsulation device requesting unnecessary resources, thereby saving bandwidth and decoding resources, and improving decoding efficiency.
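The resolution step can be sketched as follows, assuming simplified recommended-window metadata samples that carry the second characteristic information (the record layout is illustrative, not the actual sample syntax):

```python
# Hypothetical sketch: resolving which attribute instance a recommended
# window (viewport) is associated with, from a simplified metadata-track
# sample carrying the second characteristic information described above.

def instance_for_viewport(sample, viewport_id):
    """Return (attr_type, attr_instance_id) for the viewport, or None if unassociated."""
    for vp in sample["viewports"]:
        if vp["viewport_id"] == viewport_id and vp.get("attr_instance_asso_flag") == 1:
            return (vp["attr_type"], vp["attr_instance_id"])
    return None

sample = {"viewports": [
    {"viewport_id": 7, "attr_instance_asso_flag": 1,
     "attr_type": "color", "attr_instance_id": 1},
    {"viewport_id": 8, "attr_instance_asso_flag": 0},
]}
```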
  • the file encapsulation device associates the M attribute instance tracks through the track group data box.
• M attribute instances are encapsulated in M attribute instance tracks in one-to-one correspondence, and one attribute instance track includes one attribute instance, so that M attribute instances belonging to the same type of attribute information can be associated.
• using a track group to associate tracks of different attribute instances of the same attribute type may be implemented by adding the identifiers of the M attribute instances to the track group data box.
  • the M attribute instance tracks are associated through the track group data box, which can be realized by the following procedure:
• attr_type indicates the type of the attribute component, and its values are shown in Table 3.
• attr_instance_id indicates the identifier of the attribute instance.
• attr_instance_priority indicates the priority of the attribute instance; the smaller the value of this field, the higher the priority of the attribute instance.
• the client can discard attribute instances with low priority.
• if M attribute instances are encapsulated in M attribute instance items in one-to-one correspondence, then the M attribute instance items are associated through the entity group data box.
• the M attribute instances are encapsulated in M attribute instance items in one-to-one correspondence, and one attribute instance item includes one attribute instance, so that M attribute instance items belonging to the same type of attribute information can be associated.
• using an entity group to associate items of different attribute instances of the same attribute type may be implemented by adding the identifiers of the M attribute instances in the entity group data box.
• associating the M attribute instance items through the entity group data box can be realized by the following procedure:
• attr_type indicates the type of the attribute component, and its values are shown in Table 3.
• attr_instance_id indicates the identifier of the attribute instance.
• attr_instance_priority indicates the priority of the attribute instance; the smaller the value of this field, the higher the priority of the attribute instance.
• the client can discard attribute instances with low priority.
  • the file encapsulation device acquires the target point cloud and encodes the target point cloud to obtain the code stream of the target point cloud.
• the target point cloud includes N types of attribute information, at least one type of attribute information among the N types includes M attribute instances, where N is a positive integer and M is a positive integer greater than 1; according to the first feature information of at least one attribute instance among the M attribute instances, the code stream of the target point cloud is encapsulated to obtain the media file of the target point cloud, and the media file of the target point cloud includes the first feature information of the at least one attribute instance.
• this application adds the first characteristic information of the attribute instance to the media file, so that the file decapsulation device can determine the specific target attribute instance to decode according to the first characteristic information of the attribute instance, thereby saving bandwidth and decoding resources and improving decoding efficiency.
  • Fig. 6 is an interactive flowchart of a method for encapsulating and decapsulating point cloud media files provided by the embodiment of the present application. As shown in Fig. 6, this embodiment includes the following steps:
  • the file encapsulation device acquires a target point cloud, and encodes the target point cloud to obtain a code stream of the target point cloud.
  • the target point cloud includes N types of attribute information, at least one type of attribute information in the N types of attribute information includes M attribute instances, the N is a positive integer, and the M is a positive integer greater than 1.
  • the file encapsulation device encapsulates the code stream of the target point cloud according to the first characteristic information of at least one attribute instance in the M attribute instances, and obtains the media file of the target point cloud, and the media file of the target point cloud includes at least one attribute instance The first characteristic information of .
  • the file encapsulation device encodes and encapsulates the target point cloud according to the above steps, and after obtaining the media file of the target point cloud, it can interact with the file decapsulation device in the following ways:
• Method 1: the file encapsulation device may directly send the encapsulated media file of the target point cloud to the file decapsulation device, so that the file decapsulation device selectively consumes some attribute instances according to the first feature information of the attribute instances in the media file.
• Method 2: the file encapsulation device sends signaling to the file decapsulation device, and the file decapsulation device requests, according to the signaling, the media files of all or some of the attribute instances from the file encapsulation device.
• the following introduces the process in Method 2 in which the file decapsulating device requests the media files of some attribute instances for consumption; for details, refer to S603 and the following steps.
  • the file encapsulation device sends the first information to the file decapsulation device.
  • the first information is used to indicate first characteristic information of at least one attribute instance in the M attribute instances.
  • the first characteristic information of the attribute instance includes at least one of the identifier of the attribute instance, the priority of the attribute instance, and the type of the attribute instance.
  • the above-mentioned first information is DASH signaling.
  • the semantic description of DASH signaling is shown in Table 4:
  • Table 4 is a form of the first information, and the first information in this embodiment of the present application includes but is not limited to the content shown in the above Table 4.
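Since Table 4 is not reproduced here, the following is purely an illustration of how the first characteristic information might be carried in DASH signaling; the scheme URN, element choice, and value layout are hypothetical placeholders, not taken from Table 4 or from any specification:

```python
# Purely illustrative: one way the first characteristic information could
# be carried in DASH signaling. The scheme URN and value layout below are
# hypothetical placeholders, not defined by any specification.
import xml.etree.ElementTree as ET

def build_instance_descriptor(attr_instance_id, attr_instance_priority, attr_instance_type):
    """Build a DASH property descriptor carrying the first characteristic information."""
    desc = ET.Element("EssentialProperty")
    desc.set("schemeIdUri", "urn:example:gpcc:attrInstance")  # hypothetical URN
    desc.set("value", "%d,%d,%d" % (attr_instance_id, attr_instance_priority, attr_instance_type))
    return desc

mpd_fragment = ET.tostring(build_instance_descriptor(1, 0, 0), encoding="unicode")
```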
  • the file decapsulating device determines a target attribute instance according to the first characteristic information of at least one attribute instance.
  • the file decapsulating device may use the first feature information to determine a target attribute instance from the at least one attribute instance, and obtain instance data corresponding to the target attribute instance.
  • the file decapsulation device determines the target attribute instance according to the first characteristic information of at least one attribute instance indicated by the first information, including but not limited to the following methods:
• Method 1: if the first characteristic information of the attribute instance includes the priority of the attribute instance, one or several attribute instances with higher priority may be determined as the target attribute instance.
• Method 2: if the first characteristic information of the attribute instance includes the identifier of the attribute instance, and the identifier of the attribute instance is used to indicate the priority of the attribute instance, then one or several attribute instances can be selected according to the identifier of the attribute instance and determined as the target attribute instance. For example, if a smaller identifier means a higher priority, one or several attribute instances with the smallest identifiers can be determined as the target attribute instance; conversely, if a larger identifier means a higher priority, one or several attribute instances with the largest identifiers can be determined as the target attribute instance.
• Method 3: if the first characteristic information of the attribute instance includes the type of the attribute instance, the target attribute instance can be determined from the at least one attribute instance according to the type of the attribute instance; for details, refer to the following Example 1 and Example 2.
• Example 1: if the type of the attribute instance is an attribute instance associated with user feedback, the file decapsulation device determines the target attribute instance from the at least one attribute instance according to the first characteristic information of at least one attribute instance among the M attribute instances.
• for example, the target attribute instance is determined from the at least one attribute instance according to the network bandwidth and/or device computing power of the file decapsulating device, and the priority of the attribute instances in the first characteristic information.
• if the network bandwidth is sufficient and the computing power of the device is strong, more attribute instances among the at least one attribute instance may be determined as target attribute instances; if the network bandwidth is insufficient and/or the computing power of the device is weak, the attribute instance with the highest priority among the at least one attribute instance may be determined as the target attribute instance.
• Example 2: if the type of the attribute instance is an attribute instance associated with the recommended window, the file decapsulation device obtains the metadata track of the recommended window, and determines the target attribute instance from the at least one attribute instance among the M attribute instances according to the second characteristic information of the attribute instance included in the metadata track of the recommended window.
  • the second characteristic information of the attribute instance includes at least one of an identifier of the attribute instance and an attribute type of the attribute instance.
• The file decapsulation device obtains the metadata track of the recommended window as follows: the file encapsulation device sends second information to the file decapsulation device, where the second information is used to indicate the metadata track of the recommended window; according to the second information, the file decapsulation device requests the metadata track of the recommended window from the file encapsulation device; and the file encapsulation device sends the metadata track of the recommended window to the file decapsulation device.
  • the above-mentioned second information may be sent before the above-mentioned first information.
  • the second information may be sent after the first information.
  • the above-mentioned second information and the above-mentioned first information are sent at the same time.
  • the metadata track of the recommended window includes the second characteristic information of the attribute instance.
• After the file decapsulation device obtains the metadata track of the recommended window according to the above steps, it obtains the second characteristic information of the attribute instance from that metadata track and determines the target attribute instance according to the second characteristic information, for example by determining the attribute instance corresponding to the second characteristic information as the target attribute instance.
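This matching step can be sketched as follows. The dictionary keys standing in for the second characteristic information are assumptions; the actual fields are the attribute instance identifier and attribute type carried in the recommended-window metadata track, as described above.

```python
# Illustrative sketch of Example 2: match an attribute instance against the
# second characteristic information read from the recommended-window
# metadata track (identifier and attribute type).

def pick_by_recommended_window(instances, window_sample):
    """window_sample: assumed keys 'attr_instance_id' and 'attr_type'."""
    for inst in instances:
        if (inst["id"] == window_sample.get("attr_instance_id")
                and inst["type"] == window_sample.get("attr_type")):
            return inst  # this instance becomes the target attribute instance
    return None

instances = [{"id": 0, "type": "color"}, {"id": 1, "type": "color"}]
sample = {"attr_instance_id": 1, "attr_type": "color"}
print(pick_by_recommended_window(instances, sample))  # → {'id': 1, 'type': 'color'}
```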
• After the file decapsulation device determines the target attribute instance to be decoded according to the above steps, it performs the following S605.
  • the file decapsulating device sends first request information to the file encapsulating device, where the first request information is used to request the media file of the target attribute instance.
  • the first request information includes the identifier of the target attribute instance.
  • the first request information includes the first characteristic information of the target attribute instance.
  • the file encapsulation device sends the media file of the target attribute instance to the file decapsulation device according to the first request information.
• The first request information includes the identifier of the target attribute instance, so that the file encapsulation device finds, in the media file of the target point cloud, the media file corresponding to the identifier of the target attribute instance, and sends the media file of the target attribute instance to the file decapsulation device.
  • the file decapsulating device decapsulates and decodes the media file of the target attribute instance to obtain the target attribute instance.
• After the file decapsulation device receives the media file of the target attribute instance, it first decapsulates the media file to obtain the code stream of the target attribute instance, and then decodes that code stream to obtain the decoded target attribute instance.
• If the attribute information of the target point cloud is encoded based on the geometric information of the point cloud, the file encapsulation device also sends the media file of the geometric information corresponding to the target attribute instance to the file decapsulation device for geometric decoding; attribute decoding is then performed on the target attribute instance based on the decoded geometric information.
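The request, decapsulation and decoding flow above, including the geometry dependency of attribute decoding, can be illustrated with placeholder functions. This is a minimal sketch: `unpack` and `decode` stand in for the real container decapsulation and G-PCC decoding steps, which the sketch abstracts away.

```python
# Minimal sketch of the consumption flow: decapsulate the geometry and
# attribute media files, decode geometry first, then decode the target
# attribute instance based on the decoded geometry.

def unpack(media_file):
    # Decapsulation placeholder: strip the container, return the code stream.
    return media_file["stream"]

def decode(stream, depends_on=None):
    # Decoding placeholder; attribute decoding may depend on geometry.
    return {"data": stream, "based_on": depends_on}

def consume_target_instance(media):
    geometry = decode(unpack(media["geometry"]))
    return decode(unpack(media["attribute"]), depends_on=geometry)

media = {"geometry": {"stream": "geo-bits"}, "attribute": {"stream": "attr-bits"}}
decoded = consume_target_instance(media)
print(decoded["based_on"]["data"])  # → geo-bits
```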
• Step 11: assuming that the code stream of the target point cloud contains two attribute instances of one attribute type, encapsulate the different attribute instances in the code stream of the target point cloud into multiple tracks to obtain the media file F1 of the target point cloud.
  • the media file F1 of the target point cloud includes Track1, Track2 and Track3:
  • Track2 and Track3 are the tracks of two attribute instances.
• Step 12: according to the information of the attribute instances in the media file F1 of the target point cloud, generate DASH signaling (i.e. the first information) for indicating the first characteristic information of at least one attribute instance; the DASH signaling includes the following content:
• Step 13: the file decapsulation devices C1 and C2 request the point cloud media file according to the network bandwidth and the information in the DASH signaling.
• The file decapsulation device C1 has sufficient network bandwidth and requests Representation1-Representation3, while the file decapsulation device C2 has limited network bandwidth and requests only Representation1-Representation2.
• Step 14: transfer the point cloud media files.
• Step 15: the file decapsulation device receives the point cloud file.
  • C2 only receives 1 attribute instance and obtains a basic point cloud consumption experience.
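Steps 13-15 above can be sketched as follows, reusing the Representation names from this example and a simple boolean bandwidth model; the role and priority fields are assumptions standing in for the DASH signaling content.

```python
# Hypothetical sketch of Steps 13-15: two clients read the first
# characteristic information from the DASH signaling and request different
# Representation sets depending on their bandwidth.

signaling = [
    {"rep": "Representation1", "role": "geometry", "priority": 0},
    {"rep": "Representation2", "role": "attr_instance", "priority": 0},
    {"rep": "Representation3", "role": "attr_instance", "priority": 1},
]

def request_set(bandwidth_sufficient):
    if bandwidth_sufficient:
        return [r["rep"] for r in signaling]  # C1: request everything
    # C2: geometry plus only the highest-priority attribute instance.
    attrs = [r for r in signaling if r["role"] == "attr_instance"]
    best = min(attrs, key=lambda r: r["priority"])
    return [signaling[0]["rep"], best["rep"]]

print(request_set(True))   # → ['Representation1', 'Representation2', 'Representation3']
print(request_set(False))  # → ['Representation1', 'Representation2']
```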
• Step 21: assuming that the code stream of the target point cloud contains two attribute instances of one attribute type, encapsulate the different attribute instances in the code stream of the target point cloud into multiple tracks to obtain the media file F1 of the target point cloud.
  • the media file F1 of the target point cloud includes Track1, Track2 and Track3:
  • Track2 and Track3 are the tracks of two attribute instances.
• Step 22: according to the information of the attribute instances in the media file F1 of the target point cloud, generate DASH signaling (i.e. the first information) for indicating the first characteristic information of at least one attribute instance; the DASH signaling includes the following content:
• Step 23: the file decapsulation devices C1 and C2 request the point cloud media file according to the network bandwidth and the information in the DASH signaling.
• For C2, although the priorities of Representation2 and Representation3 are the same, since these two attribute instances are associated with the recommended window, C2 can request the corresponding media resources according to the user's viewing position and the second characteristic information of the attribute instances in the metadata track of the recommended window, requesting only one attribute instance at a time.
• Step 24: transmit the point cloud media files.
• Step 25: the file decapsulation device receives the point cloud file.
  • C2 only receives one attribute instance, and decodes the corresponding attribute instance for consumption.
• In summary, the file encapsulation device sends the first information to the file decapsulation device, and the first information is used to indicate the first characteristic information of at least one attribute instance among the M attribute instances.
• The file decapsulation device can select the target attribute instance to request for consumption according to the first characteristic information of the at least one attribute instance and the performance of the file decapsulation device itself, thereby saving network bandwidth and improving decoding efficiency.
• Fig. 7 is an interaction flowchart of a method for encapsulating and decapsulating point cloud media files provided by an embodiment of the present application. As shown in Fig. 7, this embodiment includes the following steps:
  • the file encapsulation device acquires the target point cloud, and encodes the target point cloud to obtain a code stream of the target point cloud.
  • the target point cloud includes N types of attribute information, at least one type of attribute information in the N types of attribute information includes M attribute instances, N is a positive integer, and M is a positive integer greater than 1.
• The file encapsulation device encapsulates the code stream of the target point cloud according to the first characteristic information of at least one attribute instance among the M attribute instances to obtain the media file of the target point cloud, and the media file of the target point cloud includes the first characteristic information of the at least one attribute instance.
• After the file encapsulation device encodes and encapsulates the target point cloud according to the above steps and obtains the media file of the target point cloud, it can interact with the file decapsulation device in the following ways:
• Method 1: the file encapsulation device may directly send the encapsulated media file of the target point cloud to the file decapsulation device, so that the file decapsulation device selectively consumes some attribute instances according to the first characteristic information of the attribute instances in the media file.
• Method 2: the file encapsulation device sends signaling to the file decapsulation device, and the file decapsulation device requests from the file encapsulation device, according to the signaling, the media files of all or some of the attribute instances for consumption.
  • the file encapsulation device sends the first information to the file decapsulation device.
  • the first information is used to indicate first characteristic information of at least one attribute instance in the M attribute instances.
  • the first characteristic information of the attribute instance includes at least one of the identifier of the attribute instance, the priority of the attribute instance, and the type of the attribute instance.
  • the above-mentioned first information is DASH signaling.
  • the semantic description of DASH signaling is shown in Table 4 above.
  • the file decapsulating device sends second request information to the file encapsulating device according to the first information.
• The second request information is used to request the media file of the target point cloud.
  • the file encapsulation device sends the media file of the target point cloud to the file decapsulation device according to the second request information.
  • the file decapsulating device determines a target attribute instance according to the first characteristic information of at least one attribute instance.
  • the implementation process of S706 is consistent with the implementation process of S604 above.
• Specifically, the file decapsulation device determines the target attribute instance by using the first characteristic information of the at least one attribute instance.
• Alternatively, the file decapsulation device obtains the metadata track of the recommended window and, according to the second characteristic information of the attribute instance included in that metadata track, determines the target attribute instance from the at least one attribute instance of the M attribute instances.
  • the device for decapsulating the file decapsulates and decodes the media file of the target attribute instance to obtain the target attribute instance.
• Specifically, the media file corresponding to the target attribute instance is queried from the received media file of the target point cloud.
• In summary, the file encapsulation device sends the first information to the file decapsulation device, and the first information is used to indicate the first characteristic information of at least one attribute instance among the M attribute instances.
• After the file decapsulation device requests the media file of the entire target point cloud, it can select the target attribute instance for decoding and consumption according to the first characteristic information of the at least one attribute instance and the performance of the file decapsulation device itself, thereby saving network bandwidth and improving decoding efficiency.
  • FIGS. 5 to 7 are only examples of the present application, and should not be construed as limiting the present application.
  • FIG. 8 is a schematic structural diagram of a point cloud media file packaging device provided by an embodiment of the present application.
  • the device 10 is applied to a file packaging device.
  • the device 10 includes:
  • the acquisition unit 11 is configured to acquire a target point cloud, and encode the target point cloud to obtain a code stream of the target point cloud, the target point cloud includes N types of attribute information, and the N types of attribute information At least one type of attribute information includes M attribute instances, where N is a positive integer, and M is a positive integer greater than 1;
  • the encapsulation unit 12 is used to encapsulate the code stream of the target point cloud according to the first characteristic information of at least one attribute instance in the M attribute instances, to obtain the media file of the target point cloud, and the media file of the target point cloud includes at least one attribute The first characteristic information of the instance.
  • the first characteristic information of the attribute instance includes: at least one of the identifier of the attribute instance, the priority of the attribute instance, and the type of the attribute instance.
  • the type of the attribute instance includes at least one of an attribute instance associated with a recommendation window and an attribute instance associated with user feedback.
• The encapsulation unit 12 is further configured to add the second characteristic information of the attribute instance to the metadata track of the recommended window associated with the attribute instance.
  • the second characteristic information of the attribute instance includes at least one of an identifier of the attribute instance and an attribute type of the attribute instance.
• The encapsulation unit 12 is specifically configured to: when the geometric information and attribute information of a frame of point cloud in the target point cloud are encapsulated in one track or one item, add the first feature information of the at least one attribute instance to the sub-sample data box corresponding to the M attribute instances; or, when each of the M attribute instances is encapsulated in one track or one item, add the first feature information of the at least one attribute instance to the component information data box corresponding to the M attribute instances; or, when each of the M attribute instances is encapsulated in one track or one item, and the M tracks corresponding to the M attribute instances form a track group, or the M items corresponding to the M attribute instances form an entity group, add the first feature information of the at least one attribute instance to the track group data box or the entity group data box.
• The encapsulation unit 12 is further configured to: if the M attribute instances are encapsulated in M attribute instance tracks in one-to-one correspondence, associate the M attribute instance tracks through the track group data box; or, if the M attribute instances are encapsulated in M attribute instance items in one-to-one correspondence, associate the M attribute instance items through the entity group data box.
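How a track group ties the M attribute-instance tracks together can be pictured with the following sketch; the box and field names (`trgr`, `track_group_id`, `grouping_type`) loosely follow ISOBMFF conventions but are placeholders here, not the exact syntax defined by this application.

```python
# Illustrative sketch: attach the same track-group id to every
# attribute-instance track so a reader can associate them as one group.

def build_track_group(attribute_tracks, group_id=1):
    for track in attribute_tracks:
        track["trgr"] = {"track_group_id": group_id,
                         "grouping_type": "attribute_instances"}
    return attribute_tracks

tracks = [{"track_id": 2, "attr_instance_id": 0},
          {"track_id": 3, "attr_instance_id": 1}]
grouped = build_track_group(tracks)
print(all(t["trgr"]["track_group_id"] == 1 for t in grouped))  # → True
```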
  • the apparatus further includes a transceiver unit 13, configured to send first information to the file decapsulating device, where the first information is used to indicate first feature information of at least one attribute instance among the M attribute instances.
• The transceiver unit 13 is configured to receive the first request information sent by the file decapsulation device, where the first request information is used to request the media file of the target attribute instance, and to send the media file of the target attribute instance to the file decapsulation device according to the first request information.
• The transceiver unit 13 is further configured to receive the second request information sent by the file decapsulation device, where the second request information is used to request the media file of the target point cloud, and to send the media file of the target point cloud to the file decapsulation device according to the second request information.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
• The apparatus 10 shown in Fig. 8 can execute the method embodiment corresponding to the file encapsulation device, and the aforementioned and other operations and/or functions of the modules in the apparatus 10 are respectively for realizing the method embodiment corresponding to the file encapsulation device; for brevity, they are not repeated here.
  • FIG. 9 is a schematic structural diagram of a point cloud media file decapsulation device provided by an embodiment of the present application.
  • the device 20 is applied to a file decapsulation device.
  • the device 20 includes:
  • a transceiver unit 21 configured to receive the first information sent by the file encapsulation device
  • the first information is used to indicate the first feature information of at least one attribute instance in the M attribute instances
• The M attribute instances are the M attribute instances included in at least one type of attribute information among the N types of attribute information included in the target point cloud, where N is a positive integer and M is a positive integer greater than 1.
  • the first characteristic information of the attribute instance includes: at least one of an identifier of the attribute instance, a priority of the attribute instance, and a type of the attribute instance.
  • the type of the attribute instance includes at least one of an attribute instance associated with a recommendation window and an attribute instance associated with user feedback.
  • the second characteristic information of the attribute instance is added to the metadata track of the recommended window associated with the attribute instance.
  • the device further includes a determination unit 22 and a decoding unit 23:
  • a determining unit 22 configured to determine a target attribute instance according to the first feature information of the at least one attribute instance
  • the transceiver unit 21 is configured to send first request information to the file encapsulation device, the first request information is used to request the media file of the target attribute instance; and receive the target attribute instance sent by the file encapsulation device media files;
  • the decoding unit 23 is configured to decapsulate and decode the media file of the target attribute instance to obtain the target attribute instance.
  • the transceiving unit 21 is further configured to send second request information to the file encapsulation device according to the first information, and the second request is used to request the media file of the target point cloud; and receiving the media file of the target point cloud sent by the file encapsulation device;
  • a determining unit 22 configured to determine a target attribute instance according to the first feature information of the at least one attribute instance
  • the decoding unit 23 is configured to obtain the media file of the target attribute instance from the media file of the target point cloud; decapsulate and decode the media file of the target attribute instance to obtain the target attribute instance.
• The determining unit 22 is specifically configured to: if the type of the attribute instance is an attribute instance associated with user feedback, determine the target attribute instance from the at least one attribute instance according to the first characteristic information of the at least one attribute instance of the M attribute instances; or, if the type of the attribute instance is an attribute instance associated with the recommended window, obtain the metadata track of the recommended window and determine the target attribute instance from the at least one attribute instance of the M attribute instances according to the second characteristic information of the attribute instance included in the metadata track of the recommended window.
  • the second characteristic information of the attribute instance includes at least one of an identifier of the attribute instance and an attribute type of the attribute instance.
  • the first feature information of the attribute instance is added to the component information data box corresponding to the M attribute instances; or,
• If each of the M attribute instances is encapsulated in one track or one item, and the M tracks corresponding to the M attribute instances form a track group, or the M items corresponding to the M attribute instances form an entity group, the first feature information of the attribute instance is added to the track group data box or the entity group data box.
• If the M attribute instances are encapsulated in M attribute instance tracks in one-to-one correspondence, the media file of the target point cloud includes a track group data box, and the track group data box is used for associating the M attribute instance tracks; or, if the M attribute instances are encapsulated in M attribute instance items in one-to-one correspondence, the media file of the target point cloud includes an entity group data box, and the entity group data box is used for associating the M attribute instance items.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
• The apparatus 20 shown in Fig. 9 can execute the method embodiment corresponding to the file decapsulation device, and the aforementioned and other operations and/or functions of the modules in the apparatus 20 are respectively for realizing the method embodiment corresponding to the file decapsulation device; for brevity, they are not repeated here.
  • the device in the embodiment of the present application is described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional modules may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software modules.
• Each step of the method embodiments in the embodiments of the present application can be completed by an integrated logic circuit of hardware in the processor and/or by instructions in the form of software, and the steps of the methods disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • FIG. 10 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may be the above-mentioned file encapsulation device or file decapsulation device, or the electronic device may have the functions of a file encapsulation device and a file decapsulation device.
  • the electronic device 40 may include:
• a memory 41 and a processor 42, where the memory 41 is used to store a computer program and transmit the program code to the processor 42.
• In other words, the processor 42 can invoke and run the computer program from the memory 41 to implement the methods in the embodiments of the present application.
• For example, the processor 42 can be used to execute the above method embodiments according to the instructions in the computer program.
• The processor 42 may include but is not limited to: a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic devices, etc.
  • the memory 41 includes but is not limited to:
• a non-volatile memory and/or a volatile memory. The non-volatile memory can be Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM) or flash memory.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
• By way of example but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM) and Direct Rambus RAM (DR RAM).
• The computer program can be divided into one or more modules, and the one or more modules are stored in the memory 41 and executed by the processor 42 to complete the methods provided by the present application.
  • the one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the video production device.
  • the electronic device 40 may also include:
• a transceiver 43, where the transceiver 43 can be connected to the processor 42 or the memory 41.
• The processor 42 can control the transceiver 43 to communicate with other devices; specifically, it can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 43 may include a transmitter and a receiver.
  • the transceiver 43 may further include antennas, and the number of antennas may be one or more.
• The components of the electronic device 40 are connected through a bus system, where the bus system includes not only a data bus but also a power bus, a control bus and a status signal bus.
  • the present application also provides a computer storage medium, on which a computer program is stored, and when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transferred from a website, computer, server, or data center by wire (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) to another website site, computer, server or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (digital video disc, DVD)), or a semiconductor medium (such as a solid state disk (solid state disk, SSD)), etc.
  • modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
• The division of the modules is only a logical function division; in actual implementation, there may be other division methods. For example, multiple modules or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • a module described as a separate component may or may not be physically separated, and a component displayed as a module may or may not be a physical module, that is, it may be located in one place, or may also be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.


Abstract

This application provides a method, an apparatus and a storage medium for encapsulating and decapsulating point cloud media files. The method includes: a file encapsulation device acquires a target point cloud and encodes the target point cloud to obtain a code stream of the target point cloud, where the target point cloud includes N types of attribute information, at least one type of attribute information among the N types includes M attribute instances, N is a positive integer, and M is a positive integer greater than 1; and the code stream of the target point cloud is encapsulated according to the first characteristic information of at least one of the M attribute instances to obtain a media file of the target point cloud, where the media file of the target point cloud includes the first characteristic information of the at least one attribute instance. That is, by adding the first characteristic information of the attribute instance to the media file, this application enables the file decapsulation device to determine the target attribute instance to be decoded according to the first characteristic information of the attribute information, thereby saving bandwidth and decoding resources and improving decoding efficiency.

Description

Method, apparatus and storage medium for encapsulating and decapsulating point cloud media files
This application claims priority to Chinese patent application No. 202111022386.2, entitled "Method, apparatus and storage medium for encapsulating and decapsulating point cloud media files", filed with the China National Intellectual Property Administration on September 1, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the technical field of video processing, and in particular to a method, an apparatus and a storage medium for encapsulating and decapsulating point cloud media files.
Background of the Invention
A point cloud is a set of discrete points irregularly distributed in space that expresses the spatial structure and surface attributes of a three-dimensional object or scene. According to the degree of freedom available to users when consuming the media content, point cloud media can be divided into 3-degree-of-freedom (DoF) media, 3DoF+ media and 6DoF media.
Each point in a point cloud includes geometric information and attribute information. The attribute information includes different types of attribute information such as color attributes and reflectance, and attribute information of the same type may also include different attribute instances; for example, the color attribute of a point may include different color types, and the different color types are referred to as different attribute instances of the color attribute. Coding technologies such as Geometry-based Point Cloud Compression (GPCC) support including multiple attribute instances of the same attribute type in one code stream.
发明内容
本申请提供一种点云媒体文件的封装与解封装方法、装置及存储介质,可以根据媒体文件中添加的M个属性实例中至少一个属性实例的第一特征信息,来选择性地消费属性实例,进而节省解码资源,提高解码效率。
本申请实施例提供一种点云媒体文件的封装方法,应用于文件封装设备,该方法包括:
获取目标点云,并对所述目标点云进行编码,得到所述目标点云的码流;
对所述码流进行封装,得到所述目标点云的媒体文件;
当所述目标点云包括至少一类属性信息的M个属性实例对应的实例数据时,将所述M个属性实例中至少一个属性实例的第一特征信息作为所述至少一个属性实例对应的实例数据的元数据,封装在所述目标点云的媒体文件中,所述第一特征信息用于标识所述至少一个属性实例与所述M个属性实例中其他属性实例的区别,所述M为大于1的正整数。
本申请实施例还提供一种点云媒体文件的解封装方法,应用于文件解封装设备,该方法包括:
接收文件封装设备发送的第一信息;其中,所述第一信息用于指示M个属性实例中的至少一个属性实例的第一特征信息,所述M个属性实例为目标点云所包括的N类属性信息中至少一类属性信息所包括的M个属性实例,所述N为正整数,所述M为大于1的正整数;
利用所述第一特征信息从所述至少一个属性实例中确定目标属性实例,获取所述目标属性实例对应的实例数据。
本申请实施例还提供一种点云媒体文件的封装装置,应用于文件封装设备,该装置包括:
获取单元,用于获取目标点云,并对所述目标点云进行编码,得到所述目标点云的码流;
封装单元,用于对所述码流进行封装,得到所述目标点云的媒体文件;当所述目标点云包括至少一类属性信息的M个属性实例对应的实例数据时,将所述M个属性实例中至少一个属性实 例的第一特征信息作为所述至少一个属性实例对应的实例数据的元数据,封装在所述目标点云的媒体文件中,所述第一特征信息用于标识所述至少一个属性实例与所述M个属性实例中其他属性实例的区别,所述M为大于1的正整数。
本申请实施例还提供一种点云媒体文件的解封装装置,应用于文件解封装设备,该装置包括:
收发单元,用于接收文件封装设备发送的第一信息;其中,所述第一信息用于指示M个属性实例中的至少一个属性实例的第一特征信息,所述M个属性实例为目标点云所包括的N类属性信息中至少一类属性信息所包括的M个属性实例,所述N为正整数,所述M为大于1的正整数;
利用所述第一特征信息从所述至少一个属性实例中确定目标属性实例,获取所述目标属性实例对应的实例数据。
本申请实施例还提供一种文件封装设备,包括:处理器和存储器,该存储器用于存储计算机程序,该处理器用于调用并运行该存储器中存储的计算机程序,以执行第一方面的方法。
本申请实施例还提供一种文件解封装设备,包括:处理器和存储器,该存储器用于存储计算机程序,该处理器用于调用并运行该存储器中存储的计算机程序,以执行第二方面的方法。
本申请实施例还提供了一种计算机可读存储介质,用于存储计算机程序,所述计算机程序使得计算机执行上述第一方面至第二方面中任一方面或其各实现方式中的方法。
本申请实施例还提供了一种电子设备,包括处理器和存储器,所述存储器用于存储计算机程序,所述处理器用于调用并运行所述存储器中存储的计算机程序,以执行第一方面和/或第二方面任一项所述的方法。
综上,当所述目标点云包括至少一类属性信息的M个属性实例对应的实例数据时,本申请实施例的文件封装设备将所述M个属性实例中至少一个属性实例的第一特征信息作为所述至少一个属性实例对应的实例数据的元数据,封装在所述目标点云的媒体文件中。即本申请实施例通过将属性实例的第一特征信息作为元数据添加在媒体文件中,使得文件解封装设备可以根据元数据中的第一特征信息来确定具体解码的目标属性实例,进而节省带宽和解码资源,提高解码效率。
附图简要说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1示出了三自由度的示意图;
图2示出了三自由度+的示意图;
图3示出了六自由度的示意图;
图4A为本申请实施例提供的一种沉浸媒体系统的架构图;
图4B为本申请实施例提供的V3C媒体的内容流程示意图;
图5为本申请实施例提供的一种点云媒体文件封装方法流程图;
图6为本申请实施例提供的一种点云媒体文件封装与解封装方法的交互流程图;
图7为本申请实施例提供的一种点云媒体文件封装与解封装方法的交互流程图;
图8为本申请实施例提供的点云媒体文件的封装装置的结构示意图;
图9为本申请实施例提供的点云媒体文件的解封装装置的结构示意图;
图10是本申请实施例提供的电子设备的示意性框图。
实施本发明的方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本发明保护的范围。
需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本发明的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、***、产品或服务器不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
本申请实施例涉及点云媒体的数据处理技术。
在介绍本申请技术方案之前,下面先对本申请相关知识进行介绍。
点云:点云是空间中一组无规则分布的、表达三维物体或场景的空间结构及表面属性的离散点集。点云中的每个点至少具有三维位置信息,根据应用场景的不同,还可能具有色彩、材质或其他信息。通常,点云中的每个点都具有相同数量的附加属性。
V3C容积媒体:visual volumetric video-based coding media,指捕获自三维空间视觉内容并提供3DoF+、6DoF观看体验的,以传统视频编码的,在文件封装中包含容积视频类型轨道的沉浸式媒体,包括多视角视频、视频编码点云等。
PCC:Point Cloud Compression,点云压缩。
G-PCC:Geometry-based Point Cloud Compression,基于几何模型的点云压缩。
V-PCC:Video-based Point Cloud Compression,基于传统视频编码的点云压缩。
图集:指示2D平面帧上的区域信息,3D呈现空间的区域信息,以及二者之间的映射关系和映射所需的必要参数信息。
Track:轨道,媒体文件封装过程中的媒体数据集合,一个媒体文件可由多个轨道组成,比如一个媒体文件可以包含一个视频轨道,一个音频轨道以及一个字幕轨道。
Sample:样本,媒体文件封装过程中的封装单位,一个媒体轨道由很多个样本组成。比如视频轨道的一个样本通常为一个视频帧。
DoF:Degree of Freedom,自由度。力学***中是指独立坐标的个数,除了平移的自由度外,还有转动及振动自由度。本申请实施例中指用户在观看沉浸式媒体时,支持的运动并产生内容交互的自由度。
3DoF:即三自由度,指用户头部围绕XYZ轴旋转的三种自由度。图1示意性示出了三自由度的示意图。如图1所示,就是在某个地方、某一个点在三个轴上都可以旋转,可以转头,也可以上下低头,也可以摆头。通过三自由度的体验,用户能够360度地沉浸在一个现场中。如果是静态的,可以理解为是全景的图片。如果全景的图片是动态,就是全景视频,也就是VR视频。但是VR视频是有一定局限性的,用户是不能够移动的,不能选择任意的一个地方去看。
3DoF+:即在三自由度的基础上,用户还拥有沿XYZ轴做有限运动的自由度,也可以将其称之为受限六自由度,对应的媒体码流可以称之为受限六自由度媒体码流。图2示意性示出了三自由度+的示意图。
6DoF:即在三自由度的基础上,用户还拥有沿XYZ轴自由运动的自由度,对应的媒体码流可以称之为六自由度媒体码流。图3示意性示出了六自由度的示意图。其中,6DoF媒体指的是6自由度视频,是指视频可以提供用户在三维空间的XYZ轴方向自由移动视点,以及围绕XYZ轴自由旋转视点的高自由度观看体验。6DoF媒体是以摄像机阵列采集得到的空间不同视角的视频组合。为了便于6DoF媒体的表达、存储、压缩和处理,将6DoF媒体数据表达为以下信息的组合:多摄像机采集的纹理图,多摄像机纹理图所对应的深度图,以及相应的6DoF媒体内容描述元数据,元数据中包含了多摄像机的参数,以及6DoF媒体的拼接布局和边缘保护等描述信息。在编码端,把多摄像机的纹理图信息和对应的深度图信息进行拼接处理,并且把拼接方式的描述数据根据所定义的语法和语义写入元数据。拼接后的多摄像机深度图和纹理图信息通过平面视频压缩方式进行编码,并且传输到终端解码后,进行用户所请求的6DoF虚拟视点的合成,从而提供用户6DoF媒体的观看体验。
AVS:Audio Video Coding Standard,音视频编码标准。
ISOBMFF:ISO Based Media File Format,基于ISO(International Standard Organization,国际标准化组织)标准的媒体文件格式。ISOBMFF是媒体文件的封装标准,最典型的ISOBMFF文件即MP4(Moving Picture Experts Group 4,动态图像专家组4)文件。
DASH:dynamic adaptive streaming over HTTP,基于HTTP的动态自适应流是一种自适应比特率流技术,使高质量流媒体可以通过传统的HTTP网络服务器以互联网传递。
MPD:media presentation description,DASH中的媒体演示描述信令,用于描述媒体片段信息。
HEVC:High Efficiency Video Coding,国际视频编码标准HEVC/H.265。
VVC:versatile video coding,国际视频编码标准VVC/H.266。
Intra(picture)Prediction:帧内预测。
Inter(picture)Prediction:帧间预测。
SCC:screen content coding,屏幕内容编码。
QP:Quantization Parameter,量化参数。
沉浸式媒体指能为消费者带来沉浸式体验的媒体内容,沉浸式媒体按照用户在消费媒体内容时的自由度,可以分为3DoF媒体、3DoF+媒体以及6DoF媒体。其中常见的6DoF媒体包括点云媒体。
点云是空间中一组无规则分布的、表达三维物体或场景的空间结构及表面属性的离散点集。点云中的每个点至少具有三维位置信息,根据应用场景的不同,还可能具有色彩、材质或其他信息。通常,点云中的每个点都具有相同数量的附加属性。
点云可以灵活方便地表达三维物体或场景的空间结构及表面属性,因而应用广泛,包括虚拟现实(Virtual Reality,VR)游戏、计算机辅助设计(Computer Aided Design,CAD)、地理信息系统(Geography Information System,GIS)、自动导航系统(Autonomous Navigation System,ANS)、数字文化遗产、自由视点广播、三维沉浸远程呈现、生物组织器官三维重建等。
点云的获取主要有以下途径:计算机生成、3D激光扫描、3D摄影测量等。计算机可以生成虚拟三维物体及场景的点云。3D扫描可以获得静态现实世界三维物体或场景的点云,每秒可以获取百万级点云。3D摄像可以获得动态现实世界三维物体或场景的点云,每秒可以获取千万级点云。此外,在医学领域,由MRI、CT、电磁定位信息,可以获得生物组织器官的点云。这些技术降低了点云数据获取成本和时间周期,提高了数据的精度。点云数据获取方式的变革,使大量点云数据的获取成为可能。伴随着大规模的点云数据不断积累,点云数据的高效存储、传输、发布、共享和标准化,成为点云应用的关键。
在对点云媒体进行编码后,需要对编码后的数据流进行封装并传输给用户。相对应地,在点云媒体播放器端,需要先对点云文件进行解封装,然后再进行解码,最后将解码后的数据流呈现。
图4A为本申请实施例提供的一种沉浸媒体系统的架构图。如图4A所示,沉浸媒体系统包括编码设备和解码设备,编码设备可以是指沉浸媒体的提供者所使用的计算机设备,该计算机设备可以是终端(如PC(Personal Computer,个人计算机)、智能移动设备(如智能手机)等)或服务器。解码设备可以是指沉浸媒体的使用者所使用的计算机设备,该计算机设备可以是终端(如PC(Personal Computer,个人计算机)、智能移动设备(如智能手机)、VR设备(如VR头盔、VR眼镜等))。沉浸媒体的数据处理过程包括在编码设备侧的数据处理过程及在解码设备侧的数据处理过程。
在编码设备端的数据处理过程主要包括:
(1)沉浸媒体的媒体内容的获取与制作过程;
(2)沉浸媒体的编码及文件封装的过程。
在解码设备端的数据处理过程主要包括:
(3)沉浸媒体的文件解封装及解码的过程;
(4)沉浸媒体的渲染过程。
另外,编码设备与解码设备之间涉及沉浸媒体的传输过程,该传输过程可以基于各种传输协议来进行,此处的传输协议可包括但不限于:DASH(Dynamic Adaptive Streaming over HTTP,动态自适应流媒体传输)协议、HLS(HTTP Live Streaming,动态码率自适应传输)协议、SMTP(Smart Media Transport Protocol,智能媒体传输协议)、TCP(Transmission Control Protocol,传输控制协议)等。
下面结合图4A,分别对沉浸媒体的数据处理过程中涉及的各个过程进行详细介绍。
一、在编码设备端的数据处理过程:
(1)沉浸媒体的媒体内容的获取与制作过程。
1)沉浸媒体的媒体内容的获取过程。
沉浸媒体的媒体内容是通过捕获设备采集现实世界的声音-视觉场景获得的。
在一种实现中,捕获设备可以是设于编码设备中的硬件组件,例如捕获设备可以包括终端的麦克风、摄像头、传感器等。另一种实现中,该捕获设备也可以包括与编码设备相连接的硬件装置,例如与服务器相连接的摄像头。
该捕获设备可以包括但不限于:音频设备、摄像设备及传感设备。其中,音频设备可以包括音频传感器、麦克风等。摄像设备可以包括普通摄像头、立体摄像头、光场摄像头等。传感设备可以包括激光设备、雷达设备等。
捕获设备的数量可以为多个。这些捕获设备被部署在现实空间中的一些特定位置以同时捕获该空间内不同角度的音频内容和视频内容,捕获的音频内容和视频内容在时间和空间上均保持同步。通过捕获设备采集到的媒体内容称作沉浸媒体的原始数据。
2)沉浸媒体的媒体内容的制作过程。
捕获到的音频内容本身就是适合被执行沉浸媒体的音频编码的内容。捕获到的视频内容进行一系列制作流程后才可适合作为被执行沉浸媒体的视频编码的内容。该制作流程可以包括以下步骤。
①拼接。由于捕获到的视频内容是捕获设备在不同角度下拍摄得到的,拼接是指对这些各个角度拍摄的视频内容拼接成一个完整的、能够反映现实空间360度视觉全景的视频,即拼接后的视频是一个在三维空间表示的全景视频(或球面视频)。
②投影。投影是指将拼接形成的一个三维视频映射到一个二维(2-Dimension,2D)图像上的过程。投影形成的2D图像称为投影图像。投影的方式可包括但不限于:经纬图投影、正六面体投影。
③区域封装。投影图像可以直接被编码,也可以对投影图像进行区域封装之后再进行编码。实践中发现,在沉浸媒体的数据处理过程中,对于二维投影图像进行区域封装之后再进行编码能够大幅提升沉浸媒体的视频编码效率,因此区域封装技术被广泛应用到沉浸媒体的视频处理过程中。所谓区域封装是指将投影图像按区域执行转换处理的过程,区域封装过程使投影图像被转换为封装图像。区域封装的过程具体包括:将投影图像划分为多个映射区域,然后再对多个映射区域分别进行转换处理得到多个封装区域,将多个封装区域映射到一个2D图像中得到封装图像。其中,映射区域是指执行区域封装前在投影图像中经划分得到的区域;封装区域是指执行区域封装后位于封装图像中的区域。
转换处理可以包括但不限于:镜像、旋转、重新排列、上采样、下采样、改变区域的分辨率及移动等处理。
需要说明的是,由于采用捕获设备只能捕获到全景视频,这样的视频经编码设备处理并传输至解码设备进行相应的数据处理后,解码设备侧的用户只能通过执行一些特定动作(如头部旋转)来观看360度的视频信息,而执行非特定动作(如移动头部)并不能获得相应的视频变化,VR体验不佳,因此需要额外提供与全景视频相匹配的深度信息,来使用户获得更优的沉浸度和更佳的VR体验,这就涉及6DoF(Six Degrees of Freedom,六自由度)制作技术。当用户可以在模拟的场景中较自由的移动时,称为6DoF。采用6DoF制作技术进行沉浸媒体的视频内容的制作时,捕获设备一般会选用光场摄像头、激光设备、雷达设备等,捕获空间中的点云数据或光场数据,并且在执行上述制作流程①-③的过程中还需要进行一些特定处理,例如对点云数据的切割、映射等过程,深度信息的计算过程等。
(2)沉浸媒体的编码及文件封装的过程。
捕获到的音频内容可直接进行音频编码形成沉浸媒体的音频码流。经过上述制作流程①-②或①-③之后,对投影图像或封装图像进行视频编码,得到沉浸媒体的视频码流,例如,将打包图片(D)编码为编码图像(Ei)或编码视频比特流(Ev)。捕获的音频(Ba)被编码为音频比特流(Ea)。然后,根据特定的媒体容器文件格式,将编码的图像、视频和/或音频组合成用于文件回放的媒体文件(F)或用于流式传输的初始化段和媒体段的序列(Fs)。编码设备端还将元数据,例如投影和区域信息,包括到文件或片段中,有助于呈现解码的打包图片。
此处需要说明的是,如果采用6DoF制作技术,在视频编码过程中需要采用特定的编码方式(如点云编码)进行编码。将音频码流和视频码流按照沉浸媒体的文件格式(如ISOBMFF(ISO Base Media File Format,ISO基媒体文件格式))封装在文件容器中形成沉浸媒体的媒体文件资源,该媒体文件资源可以是媒体文件或媒体片段形成沉浸媒体的媒体文件;并按照沉浸媒体的文件格式要求采用媒体呈现描述信息(Media presentation description,MPD)记录该沉浸媒体的媒体文件资源的元数据,此处的元数据是对与沉浸媒体的呈现有关的信息的总称,该元数据可包括对媒体内容的描述信息、对视窗的描述信息以及对媒体内容呈现相关的信令信息等等。如图4A所示,编码设备会存储经过数据处理过程之后形成的媒体呈现描述信息和媒体文件资源。
沉浸媒体系统支持数据盒(Box)。数据盒是指包括元数据的数据块或对象,即数据盒中包含了相应媒体内容的元数据。沉浸媒体可以包括多个数据盒,例如包括球面区域缩放数据盒(Sphere Region Zooming Box),其包含用于描述球面区域缩放信息的元数据;2D区域缩放数据盒(2DRegionZoomingBox),其包含用于描述2D区域缩放信息的元数据;区域封装数据盒(Region Wise PackingBox),其包含用于描述区域封装过程中的相应信息的元数据,等等。
二、在解码设备端的数据处理过程:
(3)沉浸媒体的文件解封装及解码的过程;
解码设备可以通过编码设备的推荐或按照解码设备端的用户需求自适应动态从编码设备获得沉浸媒体的媒体文件资源和相应的媒体呈现描述信息,例如解码设备可根据用户的头部/眼睛/身体的跟踪信息确定用户的朝向和位置,再基于确定的朝向和位置动态向编码设备请求获得相应的媒体文件资源。媒体文件资源和媒体呈现描述信息通过传输机制(如DASH、SMT)由编码设备传输给解码设备。解码设备端的文件解封装的过程与编码设备端的文件封装过程是相逆的,解码设备按照沉浸媒体的文件格式要求对媒体文件资源进行解封装,得到音频码流和视频码流。解码设备端的解码过程与编码设备端的编码过程是相逆的。解码设备对音频码流进行音频解码,还原出音频内容。
另外,解码设备对视频码流的解码过程包括如下:
①对视频码流进行解码,得到平面图像;根据媒体呈现描述信息提供的元数据,如果该元数据指示沉浸媒体执行过区域封装过程,该平面图像是指封装图像;如果该元数据指示沉浸媒体未执行过区域封装过程,则该平面图像是指投影图像;
②如果元数据指示沉浸媒体执行过区域封装过程,解码设备就将封装图像进行区域解封装得到投影图像。此处区域解封装与区域封装是相逆的,区域解封装是指将封装图像按照区域执行逆转换处理的过程。区域解封装使封装图像被转换为投影图像。区域解封装的过程具体包括:按照元数据的指示对封装图像中的多个封装区域分别进行逆转换处理得到多个映射区域,将该多个映射区域映射至一个2D图像从而得到投影图像。逆转换处理是指与转换处理相逆的处理,例如:转换处理是指逆时针旋转90度,那么逆转换处理是指顺时针旋转90度。
③根据媒体呈现描述信息将投影图像进行重建处理以转换为3D图像,此处的重建处理是指将二维的投影图像重新投影至3D空间中的处理。
(4)沉浸媒体的渲染过程。
解码设备根据媒体呈现描述信息中与渲染、视窗相关的元数据对音频解码得到的音频内容及视频解码得到的3D图像进行渲染,渲染完成即实现了对该3D图像的播放输出。特别地,如果采用3DoF和3DoF+的制作技术,解码设备主要基于当前视点、视差、深度信息等对3D图像进行渲染,如果采用6DoF的制作技术,解码设备主要基于当前视点对视窗内的3D图像进行渲染。其中,视点指用户的观看位置点,视差是指用户的双目产生的视线差或由于运动产生的视线差,视窗是指观看区域。
图4B为本申请一实施例提供的GPCC点云媒体的内容流程示意图。如图4B所示,沉浸媒体系统包括文件封装器和文件解封装器。在一些实施例中,文件封装器可以理解为上述编码设备,文件解封装器可以理解为上述解码设备。
真实世界的视觉场景(A)由一组相机或具有多个镜头和传感器的相机设备捕获。采集结果为源点云数据(B)。一个或多个点云帧被编码为G-PCC比特流,包括编码的几何位流和属性位流(E)。然后,根据特定的媒体容器文件格式,一个或多个编码的比特流被组合成用于文件回放的媒体文件(F)或用于流式传输(Fs)的初始化段和媒体段的序列。在本申请中,媒体容器文件格式是ISO/IEC 14496-12中规定的ISO基本媒体文件格式。文件封装器还将元数据包含到文件或片段中。使用递送机制将片段Fs递送给播放器。
文件封装器输出的文件(F)与文件解封装器输入的文件(F')相同。文件解封装器处理文件(F')或接收到的段(F's),提取编码比特流(E')并解析元数据。然后将G-PCC比特流解码为解码信号(D'),并从解码信号(D')生成点云数据。在适用的情况下,根据当前的观看位置、观看方向或由各种类型的传感器(例如头部传感器)确定并跟踪的视口,将点云数据渲染并显示在头戴式显示器或任何其他显示设备的屏幕上,其中跟踪可以使用位置跟踪传感器或眼动跟踪传感器。除了被播放器用来访问解码点云数据的适当部分之外,当前观看位置或观看方向也可以用于解码优化。在视口相关的传递中,当前的观看位置和观看方向也被传递到策略模块(未示出),用于决定要接收或解码的轨道。
上述过程适用于实时和按需用例。
图4B中的各参数定义如下:
E/E':为编码的G-PCC比特流;
F/F':为包括轨道格式规范的媒体文件,其中可能包含对轨道样本中包含的基本流的约束。
点云中的每个点包括几何信息和属性信息。属性信息包括颜色属性、反射率等不同类型的属性信息。而同一类型的属性信息也可以包括不同的属性实例。属性实例是一个属性的具体例子,该例子中指定了该属性的取值。例如一个点的颜色属性包括不同的颜色类型,将不同的颜色类型称为颜色属性的不同属性实例。在编码技术中,例如基于几何模型的点云压缩(Geometry-based Point Cloud Compression,简称GPCC),支持在一个码流里包含同一属性类型的多个属性实例,同一属性类型的多个属性实例可以通过属性实例标识(attribute instance id)进行区分。
但是,目前的点云媒体封装技术,例如GPCC编码技术,虽然支持多个同一属性类型的属性实例在码流中同时存在,但是并没有对应的信息指示,使得文件解封装设备无法确定具体消费哪个属性实例。
为了对点云媒体封装技术的至少一个方面进行改进,本申请实施例的文件封装设备在媒体文件的封装过程中,将目标点云的同一类属性信息的M个属性实例中的至少一个属性实例的第一特征信息添加在媒体文件中。这样文件解封装设备可以根据属性实例的第一特征信息来确定具体解码的目标属性实例,进而节省带宽和解码资源,提高解码效率。
下面通过一些实施例对本申请实施例的技术方案进行详细说明。下面这几个实施例可以相互结合,对于相同或相似的概念或过程可能在某些实施例不再赘述。
图5为本申请实施例提供的一种点云媒体文件封装方法流程图。如图5所示,该方法包括如下步骤:
S501、文件封装设备获取目标点云,对目标点云进行编码,得到该目标点云的码流。
在一些实施例中,文件封装设备也称为点云封装设备,或者点云编码设备。
在一种示例中,上述目标点云为整体点云。
在另一种示例中,上述目标点云为整体点云的一部分,例如为整体点云的一个子集。
在一些实施例中,目标点云也称为目标点云数据或目标点云媒体内容或目标点云内容等。
本申请实施例中,文件封装设备获取目标点云的方式包括但不限于如下几种。
方式一,文件封装设备从点云采集设备处获取目标点云,例如,文件封装设备从点云采集设备处获取点云采集设备所采集的点云,作为目标点云。
方式二,文件封装设备从存储设备处获取目标点云,例如,点云采集设备采集到点云数据后,将点云数据存储在存储设备中,文件封装设备从存储设备处获取目标点云。
方式三,若上述目标点云为局部点云时,文件封装设备根据上述方式一或方式二获取整体点云后,对整体点云进行块划分,将其中的一个块作为目标点云。
本申请实施例的目标点云包括N类属性信息,该N类属性信息中的至少一类属性信息包括M个属性实例,其中N为正整数,M为大于1的正整数。目标点云包括该M个属性实例对应的实例数据,例如属性类型A的属性值为A1的属性实例的实例数据。
举例说明,目标点云包括颜色属性、反射率属性、透明度属性等N类属性信息。其中,颜色属性包括不同的M个属性实例,例如颜色属性包括蓝色属性实例、红色属性实例等。
对上述获取的目标点云进行编码,得到该目标点云的码流。在一些实施例中,目标点云的编码包括对点云的几何信息和属性信息分别进行编码,得到点云的几何码流和属性码流。在一些实施例中,对目标点云的几何信息和属性信息同时进行编码,得到的点云码流包括几何信息和属性信息。
本申请实施例主要涉及到对目标点云的属性信息的编码。
S502、文件封装设备根据M个属性实例中至少一个属性实例的第一特征信息,对目标点云的码流进行封装,得到目标点云的媒体文件。其中,目标点云的媒体文件包括上述至少一个属性实例的第一特征信息。
各实施例中,当目标点云包括至少一类属性信息的M个属性实例对应的实例数据时,文件封装设备可以将M个属性实例中至少一个属性实例的第一特征信息作为该至少一个属性实例对应的实例数据的元数据,封装在所述目标点云的媒体文件中。其中,第一特征信息用于标识该至少一个属性实例与上述M个属性实例中除该至少一个属性实例之外的其他属性实例的区别。后文中,将“属性实例对应的实例数据”简称为“属性实例”。
其中属性实例的第一特征信息可以理解为用于标识该属性实例与M个属性实例中其他属性实例不同的信息。例如,属性实例的优先级、标识等。
本申请实施例对属性实例的第一特征信息的具体内容不做限制。
在一些实施例中,属性实例的第一特征信息包括:属性实例的标识、属性实例的优先权、属性实例的类型中的至少一个。
在一种示例中,属性实例的标识用字段attr_instance_id表示,该字段的不同取值表示属性实例的标识值。
在一种示例中,属性实例的优先权用字段attr_instance_priority表示。一些实施例中,该字段取值越小,说明属性实例的优先级越高。
一些实施例中,可以复用attr_instance_id指示属性实例的优先级,例如attr_instance_id取 值越小,说明属性实例的优先级越高。
在一种示例中,属性实例的类型,也称为属性实例的选择策略,用字段attr_instance_type表示,该字段的不同取值表示属性实例的不同类型。
其中,属性实例的类型可以理解为用于指示文件解封装设备从同一类型的M个属性实例中选择目标属性实例的策略,或者理解为用于指示不同属性实例的消费场景。例如,若某属性实例的消费场景为与场景1关联,则文件解封装设备在场景1下可以请求该场景1关联的属性实例,从而获得该属性实例的实例数据。
在一些实施例中,属性实例的类型包括与推荐视窗关联的属性实例和与用户反馈关联的属性实例中的至少一个。
例如,若属性实例的类型为与用户反馈关联的属性实例时,文件解封装设备可以根据用户反馈信息,确定与该用户反馈信息关联的属性实例,进而可以将该属性实例确定为待解码的目标属性实例。
再例如,若属性实例的类型为与推荐视窗关联的属性实例时,文件解封装设备可以根据推荐视窗相关信息,确定该推荐视窗关联的属性实例,进而可以将该属性实例确定为待解码的目标属性实例。
在一种可能的实现方式中,若字段attr_instance_type的取值为第一数值时,则表示该属性实例的类型为与推荐视窗关联的属性实例。
在一种可能的实现方式中,若字段attr_instance_type的取值为第二数值时,则表示该属性实例的类型为与用户反馈关联的属性实例。
示例性,字段attr_instance_type的取值如表1所示:
表1
attr_instance_type取值 描述
第一数值 与viewport关联的实例
第二数值 与用户反馈关联的实例
其他 保留
一些实施例中,上述第一数值为0。
一些实施例中,上述第二数值为1。
需要说明的是,上述只是对第一数值和第二数值的举例说明,第一数值和第二数值的取值包括但不限于上述0和1,具体根据实际情况确定。
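作为示意,上述attr_instance_type取值与属性实例选择策略的对应关系可用如下Python代码表示(这里沿用表1的举例,取第一数值为0、第二数值为1;常量名与函数名均为本示例的假设,并非标准定义):

```python
# 示意:根据 attr_instance_type 字段的取值确定属性实例的消费(选择)策略。
ATTR_INSTANCE_TYPE_VIEWPORT = 0       # 与推荐视窗(viewport)关联的实例
ATTR_INSTANCE_TYPE_USER_FEEDBACK = 1  # 与用户反馈关联的实例

def select_strategy(attr_instance_type: int) -> str:
    """返回文件解封装设备应采用的目标属性实例确定策略。"""
    if attr_instance_type == ATTR_INSTANCE_TYPE_VIEWPORT:
        return "viewport"       # 结合推荐视窗元数据轨道中的第二特性信息选择
    if attr_instance_type == ATTR_INSTANCE_TYPE_USER_FEEDBACK:
        return "user_feedback"  # 结合网络带宽、设备算力等用户侧信息选择
    return "reserved"           # 其他取值保留
```

文件解封装设备可据此分支到不同的目标属性实例确定流程。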
本步骤中,将上述属于同一类属性信息的M个属性实例中至少一个属性实例的第一特征信息添加在目标点云的媒体文件中。
本申请实施例对上述至少一个属性实例的第一特征信息在媒体文件中的具体添加位置不做限制,例如可以添加在至少一个属性实例对应的轨道的头样本中。
在一些实施例中,上述S502中根据M个属性实例中至少一个属性实例的第一特征信息,对目标点云的码流进行封装,得到目标点云的媒体文件(即将M个属性实例中至少一个属性实例的第一特征信息添加在目标点云的媒体文件中)的实现过程包括如下几种情况。
情况1,若目标点云中的一帧点云的几何信息和属性信息封装在一个轨道或一个项目中时,则将至少一个属性实例的第一特征信息,添加在M个属性实例对应的子样本数据盒中。
在该情况1中,目标点云在封装时,按照点云帧为封装单元进行点云码流的封装,其中一帧点云可以理解为点云采集设备在一次扫描过程中扫描到的点云,或者一帧点云为预设大小的点云。在封装时,若将一帧点云的几何信息和属性信息封装在一个轨道或一个项目中,则该轨道或项目中包括几何信息子样本和属性信息子样本。将至少一个属性实例的第一特征信息,添加在M个属性实例对应的子样本数据盒中。
在一种示例中,若目标点云的N类属性信息封装在一个子样本中,此时,可以将至少一个属性实例的第一特征信息,添加在该子样本数据盒中。
在另一种示例中,若目标点云的N类属性信息中的每个属性信息封装在一个子样本中,且上述M个属性实例为第a类属性信息的属性实例,则可以将M个属性实例中至少一个属性实例的第一特征信息添加在该第a类属性信息的子样本数据盒中。
在一些实施例中,若上述媒体文件的封装标准为ISOBMFF,则上述情况1对应的子样本数据盒的数据结构如下:
子样本数据盒SubsampleInformationBox中的codec_specific_parameters字段定义如下:
(codec_specific_parameters字段的语法定义在原公开文本中以图片形式给出,此处不再复现,各字段语义见下文。)
其中,payloadType用于指示子样本中G-PCC单元的tlv_type数据类型。
attrIdx用于指示子样本中包含属性数据的G-PCC单元的ash_attr_sps_attr_idx。
multi_attr_instance_flag取值为1表示当前类型的属性存在多个属性实例;取值为0表示当前类型的属性仅存在一个属性实例。
attr_instance_id指示属性实例的标识符。
attr_instance_priority指示属性实例的优先级,该字段取值越小,说明属性实例的优先级越高。当一个属性类型存在多个属性实例时,客户端可丢弃低优先级的属性实例。
attr_instance_type指示属性实例的类型,该字段用于指示不同实例的消费场景,字段取值含义如下:
attr_instance_type取值 描述
0 与viewport关联的实例
1 与用户反馈关联的实例
其他 保留
在该情况1中,文件解封装设备在获得媒体文件后,可以从上述子样本数据盒中获得M个属性实例中至少一个属性实例的第一特征信息,进而根据该第一特征信息确定待解码的目标属性实例,进而避免解码所有属性实例,从而提高了解码效率。
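基于上述子样本元数据确定目标属性实例的过程可示意如下(Python;假设已从子样本数据盒解析出各属性实例的attr_instance_id与attr_instance_priority字段,数据结构为本示例的假设):

```python
# 示意:按 attr_instance_priority(取值越小优先级越高)对同一属性类型的
# 多个属性实例排序,保留高优先级实例、丢弃低优先级实例。
def pick_target_instances(instances, keep=1):
    """instances 为属性实例元数据列表;返回优先级最高的 keep 个实例。"""
    ranked = sorted(instances, key=lambda m: m['attr_instance_priority'])
    return ranked[:keep]

instances = [
    {'attr_instance_id': 1, 'attr_instance_priority': 0},
    {'attr_instance_id': 2, 'attr_instance_priority': 1},
]
targets = pick_target_instances(instances, keep=1)  # 仅解码优先级最高的实例
```

当带宽与算力允许时,也可增大keep以消费更多属性实例。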
情况2,若上述M个属性实例中的每个属性实例封装在一个轨道或一个项目中时,则将至少一个属性实例的第一特征信息,添加在M个属性实例对应的组件信息数据盒中。
在该情况2中,目标点云在封装时,将一帧点云的几何信息和属性信息分开封装,例如将几何信息封装在几何轨道中,将N类属性信息中的每一类属性信息中的每一个属性实例封装在一个轨道或项目中。具体是,将属于同一类属性信息的M个属性实例中的每一个属性实例封装在一个轨道或项目中时,可以将上述至少一个属性实例的第一特征信息添加在M个属性实例对应的组件信息数据盒中。
在一些实施例中,若上述媒体文件的封装标准为ISOBMFF,则上述情况2对应的组件信息数据盒的数据结构如下:
(GPCCComponentInfoBox组件信息数据盒的语法定义在原公开文本中以图片形式给出,此处不再复现,各字段语义见下文。)
其中,gpcc_type用于指示GPCC成分的类型,其取值含义如表2所示。
表2组件类型
gpcc_type取值 描述
1 保留
2 几何数据
3 保留
4 属性数据
5..31 保留
attr_index用于指示在SPS(Sequence Parameter Set)中指示的属性的序号。
attr_type_present_flag取值为1表示GPCCComponentInfoBox数据盒中指示了属性类型信息;取值为0表示GPCCComponentInfoBox数据盒中未指示属性类型信息。
attr_type指示属性成分的类型,其取值参照表3所示。
表3
(表3属性成分类型attr_type的取值定义在原公开文本中以图片形式给出,此处不再复现。)
attr_name用于指示可直观解读(human-readable)的属性成分类型信息。
multi_attr_instance_flag取值为1表示当前类型的属性存在多个属性实例;取值为0表示当前类型的属性仅存在一个属性实例。
attr_instance_id指示属性实例的标识符。
attr_instance_priority指示属性实例的优先级,该字段取值越小,说明属性实例的优先级越高。当一个属性类型存在多个属性实例时,客户端可丢弃低优先级的属性实例。
一些实施例中,可复用attr_instance_id指示属性实例的优先级,attr_instance_id取值越小,说明属性实例的优先级越高。
attr_instance_type指示属性实例的类型,该字段用于指示不同实例的消费场景,字段取值含义如下:
attr_instance_type取值 描述
0 与viewport关联的实例
1 与用户反馈关联的实例
其他 保留
在该情况2中,文件解封装设备在获得媒体文件后,可以从上述组件信息数据盒中获得M个属性实例中至少一个属性实例的第一特征信息,进而根据该第一特征信息确定待解码的目标属性实例,进而避免解码所有属性实例,提高了解码效率。
在情况2的一种示例中,可以将属于同一类属性信息的M个属性实例分别一一对应封装在M个轨道或项目中,一个轨道或项目包括一个属性实例,这样可以将该属性实例的第一特征信息直接添加在该属性实例对应的轨道或项目的数据盒中。
情况3,若M个属性实例中的每个属性实例封装在一个轨道或一个项目中,且M个属性实例对应的M个轨道构成轨道组,或M个属性实例对应的M个项目构成实体组时,则将M个属性实例中至少一个属性实例的第一特征信息,添加在上述轨道组数据盒或上述实体组数据盒中。
例如,同一类属性信息的M个属性实例中的每一个属性实例封装在一个轨道中,得到M个轨道,这M个轨道组成一个轨道组。这样可以将上述M个属性实例中至少一个属性实例的第一特征信息添加在该轨道组数据盒(AttributeInstanceTrackGroupBox)中。
再例如,同一类属性信息的M个属性实例中的每一个属性实例封装在一个项目中,得到M个项目,这M个项目构成一个实体组。这样可以将上述M个属性实例中至少一个属性实例的第一特征信息添加在该实体组数据盒(AttributeInstanceEntityToGroupBox)中。
需要说明的是,第一特征信息在目标点云的媒体文件中的添加位置包括但不限于如上3种情况。
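以情况2/情况3的多轨封装为例,文件解封装设备按各轨道组件信息数据盒中的元数据筛选需要请求的轨道的过程可示意如下(Python;假设每个轨道的gpcc_type、attr_instance_priority等字段已解析出,且码流中仅包含一种属性类型,轨道结构为本示例的假设):

```python
# 示意:几何轨道必选;同一属性类型存在多个属性实例轨道时,
# 仅保留优先级最高(attr_instance_priority 数值最小)的一个。
GEOM, ATTR = 2, 4  # gpcc_type:2 为几何数据,4 为属性数据(见表2)

def select_tracks(tracks):
    chosen = [t for t in tracks if t['gpcc_type'] == GEOM]  # 几何信息必须解码
    attrs = [t for t in tracks if t['gpcc_type'] == ATTR]
    if attrs:
        chosen.append(min(attrs, key=lambda t: t['attr_instance_priority']))
    return [t['track_id'] for t in chosen]

tracks = [
    {'track_id': 1, 'gpcc_type': GEOM},
    {'track_id': 2, 'gpcc_type': ATTR, 'attr_instance_id': 1, 'attr_instance_priority': 0},
    {'track_id': 3, 'gpcc_type': ATTR, 'attr_instance_id': 2, 'attr_instance_priority': 1},
]
```

若码流中存在多种属性类型,还需先按attr_type分组,再在每组内做同样的选择。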
在一些实施例中,若属性实例的类型为与推荐视窗关联的属性实例,则本申请的方法还包括S502-1:
S502-1、文件封装设备在属性实例关联的推荐视窗的元数据轨道中,添加属性实例的第二特性信息。
在一种示例中,属性实例的第二特性信息与属性实例的第一特征信息一致,包括属性实例的标识、属性实例的优先权、属性实例的类型中的至少一个。
在另一种示例中,属性实例的第二特性信息包括属性实例的标识和属性实例的属性类型中的至少一个。例如,属性实例的第二特性信息包括属性实例的标识。再例如,属性实例的第二特性信息包括属性实例的标识和属性实例的属性类型。
在一些实施例中,在推荐视窗的元数据轨道中添加属性实例的第二特性信息,可以通过如下程序实现:
(推荐视窗元数据轨道的样本入口及样本语法定义在原公开文本中以图片形式给出,此处不再复现,各字段语义见下文。)
如果视窗信息元数据轨道存在,则相机外参信息ExtCameraInfoStruct()应当出现于样本入口中或者样本中。以下情况不得出现:dynamic_ext_camera_flag取值为0且所有样本中的camera_extrinsic_flag[i]取值均为0。
num_viewports指示样本中指示的视窗数目。
viewport_id[i]指示对应视窗的标识符。
viewport_cancel_flag[i]取值为1表示视窗标识符取值为viewport_id[i]的视窗被取消了。
camera_intrinsic_flag[i]取值为1表示当前样本中第i个视窗存在相机内参。如果dynamic_int_camera_flag取值为0,则该字段必须取值为0。同时,当camera_extrinsic_flag[i]取值为0时,该字段必须取值为0。
camera_extrinsic_flag[i]取值为1表示当前样本中的第i个视窗存在相机外参。如果dynamic_ext_camera_flag取值为0,则该字段必须取值为0。
attr_instance_asso_flag[i]取值为1表示当前样本中的第i个视窗关联了相应的属性实例。当attr_instance_type取值为0时,当前轨道中至少一个样本中的attr_instance_asso_flag取值必须为1。
attr_type指示属性成分的类型,其取值参照上述表3。
attr_instance_id指示属性实例的标识符。
本申请实施例,若属性实例的类型为与推荐视窗关联的属性实例,则在属性实例关联的推荐视窗的元数据轨道中,添加属性实例的第二特性信息。这样文件解封装设备请求到推荐视窗的元数据轨道时,可以根据推荐视窗的元数据轨道中所添加的属性实例的第二特性信息,确定待解码的目标属性实例。例如该第二特性信息包括属性实例的标识,文件解封装设备可以将该属性实例的标识发送给文件封装设备,使得文件封装设备将该属性实例的标识对应的属性实例的媒体文件发送给文件解封装设备进行消费,避免文件解封装设备请求不需要的资源,进而节约了带宽和解码资源,提高了解码效率。
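推荐视窗元数据轨道样本与属性实例的关联关系,可用如下Python代码示意(样本结构为本示例的假设,仅保留上文涉及的viewport_id、attr_instance_asso_flag、attr_type、attr_instance_id字段):

```python
# 示意:根据用户当前观看的视窗,从推荐视窗元数据样本中查出
# 与之关联的属性实例(第二特性信息:属性类型 + 实例标识)。
def instance_for_viewport(sample, current_viewport_id):
    """返回 (attr_type, attr_instance_id);该视窗未关联属性实例时返回 None。"""
    for vp in sample['viewports']:
        if vp['viewport_id'] == current_viewport_id and vp.get('attr_instance_asso_flag'):
            return vp['attr_type'], vp['attr_instance_id']
    return None

sample = {'viewports': [
    {'viewport_id': 10, 'attr_instance_asso_flag': 1, 'attr_type': 0, 'attr_instance_id': 1},
    {'viewport_id': 11, 'attr_instance_asso_flag': 1, 'attr_type': 0, 'attr_instance_id': 2},
]}
```

文件解封装设备据此可只请求当前视窗关联的属性实例的媒体文件。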
在一些实施例中,若M个属性实例一一对应封装在M个属性实例轨道中,文件封装设备将M个属性实例轨道通过轨道组数据盒进行关联。
具体是,将M个属性实例一一对应封装在M个属性实例轨道中,一个属性实例轨道中包括一个属性实例,从而可以将属于同一类属性信息的M个属性实例进行关联。
示例性的,使用轨道组对同一属性类型不同属性实例的轨道进行关联,可以通过在轨道组数据盒中添加M个属性实例的标识来实现。
在一种可能的实现方式中,将M个属性实例轨道通过轨道组数据盒进行关联,可以通过如下程序实现:
(AttributeInstanceTrackGroupBox轨道组数据盒的语法定义在原公开文本中以图片形式给出,此处不再复现,各字段语义见下文。)
其中,attr_type指示属性成分的类型,其取值参照表3所示。
attr_instance_id指示属性实例的标识符。
attr_instance_priority指示属性实例的优先级,该字段取值越小,说明属性实例的优先级越高。当一个属性类型存在多个属性实例时,客户端可丢弃低优先级的属性实例。
在一些实施例中,若M个属性实例一一对应封装在M个属性实例项目中,则将M个属性实例项目通过实体组数据盒进行关联。
具体是,将M个属性实例一一对应封装在M个属性实例项目中,一个属性实例项目中包括一个属性实例,可以将属于同一类属性信息的M个属性项目进行关联。
示例性的,使用实体组对同一属性类型不同属性实例的项目进行关联,可以通过在实体组数据盒中添加M个属性实例的标识来实现。
在一种可能的实现方式中,将M个属性实例项目通过实体组数据盒进行关联,可以通过如下程序实现:
(AttributeInstanceEntityToGroupBox实体组数据盒的语法定义在原公开文本中以图片形式给出,此处不再复现,各字段语义见下文。)
其中,attr_type指示属性成分的类型,其取值参照表3所示。
attr_instance_id指示属性实例的标识符。
attr_instance_priority指示属性实例的优先级,该字段取值越小,说明属性实例的优先级越高。当一个属性类型存在多个属性实例时,客户端可丢弃低优先级的属性实例。
本申请实施例提供的点云媒体文件的封装方法,文件封装设备通过获取目标点云,并对目标点云进行编码,得到目标点云的码流,该目标点云包括N类属性信息,N类属性信息中的至少一类属性信息包括M个属性实例,N为正整数,M为大于1的正整数;根据M个属性实例中至少一个属性实例的第一特征信息,对目标点云的码流进行封装,得到目标点云的媒体文件,该目标点云的媒体文件包括至少一个属性实例的第一特征信息。即本申请通过将属性实例的第一特征信息添加在媒体文件中,使得文件解封装设备可以根据属性实例的第一特征信息来确定具体解码的目标属性实例,进而节省带宽和解码资源,提高解码效率。
图6为本申请实施例提供的一种点云媒体文件封装与解封装方法的交互流程图,如图6所示,本实施例包括如下步骤:
S601、文件封装设备获取目标点云,并对目标点云进行编码,得到所述目标点云的码流。
其中,目标点云包括N类属性信息,所述N类属性信息中的至少一类属性信息包括M个属性实例,所述N为正整数,所述M为大于1的正整数。
S602、文件封装设备根据M个属性实例中至少一个属性实例的第一特征信息,对目标点云的码流进行封装,得到目标点云的媒体文件,目标点云的媒体文件包括至少一个属性实例的第一特征信息。
上述S601与S602的实现过程可以参照上述S501至S502的具体描述,在此不再赘述。
文件封装设备根据上述步骤对目标点云进行编码和封装,得到目标点云的媒体文件后,可以通过如下几种方式与文件解封装设备进行数据交互:
方式一,文件封装设备可以直接将封装得到的目标点云的媒体文件发送给文件解封装设备,使得文件解封装设备根据媒体文件中的属性实例的第一特征信息,选择性消费部分属性实例。
方式二,文件封装设备向文件解封装设备发送信令,文件解封装设备根据信令,向文件封装设备请求全部或部分的属性实例的媒体文件进行消费。
在该实施例中,对方式二中的文件解封装设备请求部分属性实例的媒体文件进行消费的过程进行介绍,具体参照以下S603至S607。
S603、文件封装设备向文件解封装设备发送第一信息。
该第一信息用于指示M个属性实例中至少一个属性实例的第一特征信息。
属性实例的第一特征信息包括属性实例的标识、属性实例的优先权、属性实例的类型中的至少一个。
一些实施例中,上述第一信息为DASH信令。
在一些实施例中,若上述第一信息为DASH信令时,DASH信令的语义描述如表4所示:
表4
(表4 DASH信令的语义描述在原公开文本中以图片形式给出,此处不再复现。)
需要说明的是,上述表4是第一信息的一种形式,本申请实施例的第一信息包括但不限于上述表4所示的内容。
S604、文件解封装设备根据至少一个属性实例的第一特征信息,确定目标属性实例。
文件解封装设备可以利用该第一特征信息从该至少一个属性实例中确定目标属性实例,并获取所述目标属性实例对应的实例数据。
在该步骤中,文件解封装设备根据第一信息指示的至少一个属性实例的第一特征信息,确定目标属性实例的方式包括但不限于如下几种方式:
方式一,若属性实例的第一特征信息包括属性实例的优先权,则可以将优先权高的一个或几个属性实例,确定为目标属性实例。
方式二,若属性实例的第一特征信息包括属性实例的标识,且使用属性实例的标识表示属性实例的优先权,这样可以根据属性实例的标识,选择一个或几个属性实例,确定为目标属性实例。例如,若属性实例的标识越小表示优先权越高,这样可以将标识最小的一个或几个属性实例,确定为目标属性实例。再例如,若属性实例的标识越大表示优先权越高,这样可以将标识最大的一个或几个属性实例,确定为目标属性实例。
方式三,属性实例的第一特征信息包括属性实例的类型,则可以根据属性实例的类型,从至少一个属性实例中确定目标属性实例,具体可以参照下面的示例一和示例二。
示例一,若属性实例的类型为与用户反馈关联的属性实例,则文件解封装设备根据M个属性实例中至少一个属性实例的第一特征信息,从至少一个属性实例中确定目标属性实例。
例如,根据文件解封装设备的网络带宽和/或设备算力,以及第一特征信息中的属性实例的优先级,从至少一个属性实例中确定目标属性实例。示例性的,若网络带宽充足,且设备算力强,则可以将至少一个属性实例中较多的属性实例确定为目标属性实例。若网络带宽不充足,和/或设备算力弱,则可以将至少一个属性实例中优先级最高的属性实例确定为目标属性实例。
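上述根据网络带宽与设备算力选择属性实例的逻辑可示意如下(Python;bandwidth_ok、compute_ok等判断条件由文件解封装设备自行确定,此处仅为示意假设):

```python
# 示意:资源充足时请求全部属性实例,否则仅保留优先级最高
# (attr_instance_priority 数值最小)的属性实例。
def instances_to_request(instances, bandwidth_ok, compute_ok):
    ranked = sorted(instances, key=lambda m: m['attr_instance_priority'])
    if bandwidth_ok and compute_ok:
        return [m['attr_instance_id'] for m in ranked]  # 全部请求
    return [ranked[0]['attr_instance_id']]              # 仅最高优先级

insts = [{'attr_instance_id': 1, 'attr_instance_priority': 0},
         {'attr_instance_id': 2, 'attr_instance_priority': 1}]
```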
示例二,若属性实例的类型为与推荐视窗关联的属性实例,则文件解封装设备获取推荐视窗的元数据轨道,并根据推荐视窗的元数据轨道中包括的属性实例的第二特性信息,从M个属性实例的至少一个属性实例中确定目标属性实例。
一些实施例中,属性实例的第二特性信息包括属性实例的标识和属性实例的属性类型中的至少一个。
其中,文件解封装设备获取推荐视窗的元数据轨道的方式为:文件封装设备向文件解封装设备发送第二信息,该第二信息用于指示推荐视窗的元数据轨道。文件解封装设备根据该第二信息,向文件封装设备请求推荐视窗的元数据轨道。文件封装设备将推荐视窗的元数据轨道发送给文件解封装设备。
一些实施例中,上述第二信息可以是在上述第一信息之前发送的。
一些实施例中,上述第二信息可以是在上述第一信息之后发送的。
一些实施例中,上述第二信息与上述第一信息是同时发送。
本实施例中,若属性实例的类型为与推荐视窗关联的属性实例,则推荐视窗的元数据轨道中包括该属性实例的第二特性信息。这样文件解封装设备根据上述步骤获得推荐视窗的元数据轨道后,从推荐视窗的元数据轨道中获取属性实例的第二特性信息,并根据属性实例的第二特性信息,确定目标属性实例,例如将第二特性信息对应的属性实例,确定为目标属性实例。
文件解封装设备根据上述步骤,确定出待解码的目标属性实例后,执行如下S605。
S605、文件解封装设备向文件封装设备发送第一请求信息,该第一请求信息用于请求目标属性实例的媒体文件。
例如,该第一请求信息中包括目标属性实例的标识。
再例如,该第一请求信息中包括目标属性实例的第一特征信息。
S606、文件封装设备根据第一请求信息,将目标属性实例的媒体文件发送给文件解封装设备。
例如,第一请求信息中包括目标属性实例的标识,这样文件封装设备可在目标点云的媒体文件中查询到该标识对应的目标属性实例的媒体文件,并将目标属性实例的媒体文件发送给文件解封装设备。
S607、文件解封装设备对目标属性实例的媒体文件进行解封装后再解码,得到目标属性实例。
具体的,文件解封装设备接收到目标属性实例的媒体文件后,先对目标属性实例的媒体文件进行解封装,得到解封装后的目标属性实例的码流,再对目标属性实例的码流进行解码,得到解码后的目标属性实例。
在一些实施例中,若目标点云的属性信息是基于点云的几何信息编码的,此时,文件封装设备还将该目标属性实例对应的几何信息的媒体文件发送给文件解封装设备进行几何信息的解码。基于解码出的几何信息,对目标属性实例进行属性解码。
为了进一步说明本申请实施例的技术方案,下面结合具体的示例进行举例说明。
示例一:
步骤11,假设目标点云的码流中存在1个属性类型的2个属性实例,且将目标点云的码流中的不同属性实例按照多轨封装,得到目标点云的媒体文件F1。目标点云的媒体文件F1中包括Track1、Track2和Track3:
Track1:GPCCComponentInfoBox:{gpcc_type=2(Geometry)}。
Track2:GPCCComponentInfoBox:{gpcc_type=4(Attribute);multi_attr_instance_flag=1;attr_instance_id=1;attr_instance_priority=0;attr_instance_type=1}。
Track3:GPCCComponentInfoBox:{gpcc_type=4(Attribute);multi_attr_instance_flag=1;attr_instance_id=2;attr_instance_priority=1;attr_instance_type=1}。
其中,Track2和Track3为两个属性实例的轨道。
步骤12,根据目标点云的媒体文件F1中的属性实例的信息,生成DASH信令(即第一信息),用于指示至少一个属性实例的第一特征信息,DASH信令包括如下内容:
Representation1:对应track1,component@component_type=‘geom’。
Representation2:对应track2,component@component_type=‘attr’;component@attr_instance_id=1;component@attr_instance_priority=0;component@attr_instance_type=1。
Representation3:对应track3,component@component_type=‘attr’;component@attr_instance_id=2;component@attr_instance_priority=1;component@attr_instance_type=1。
将DASH信令发送给文件解封装设备。
步骤13,文件解封装设备C1和C2根据网络带宽和DASH信令中的信息,请求点云媒体文件。
一些实施例中,文件解封装设备C1网络带宽充足,请求Representation1~Representation3,文件解封装设备C2网络带宽受限,请求Representation1~Representation2。
步骤14,传输点云媒体文件。
步骤15,文件解封装设备接收点云文件:
具体是,C1:根据attr_instance_type=1,C1收到的2个属性实例随用户交互操作进行切换,C1可获得更加个性化的点云消费体验。
C2:C2仅收到1个属性实例,获得基本的点云消费体验。
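示例一中C1/C2的请求逻辑可用如下Python代码复现(representation的字段沿用步骤12中的DASH信令内容,数据结构为本示例的假设):

```python
# 示意:几何 representation 必选;属性 representation 按实例优先级排序,
# 带宽充足的客户端全部请求,带宽受限的客户端丢弃低优先级实例。
representations = [
    {'name': 'Representation1', 'component_type': 'geom'},
    {'name': 'Representation2', 'component_type': 'attr',
     'attr_instance_id': 1, 'attr_instance_priority': 0, 'attr_instance_type': 1},
    {'name': 'Representation3', 'component_type': 'attr',
     'attr_instance_id': 2, 'attr_instance_priority': 1, 'attr_instance_type': 1},
]

def request(reps, bandwidth_sufficient):
    chosen = [r for r in reps if r['component_type'] == 'geom']
    attrs = sorted((r for r in reps if r['component_type'] == 'attr'),
                   key=lambda r: r['attr_instance_priority'])
    chosen += attrs if bandwidth_sufficient else attrs[:1]
    return [r['name'] for r in chosen]

c1 = request(representations, True)   # C1:带宽充足,请求全部
c2 = request(representations, False)  # C2:带宽受限,仅请求高优先级实例
```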
示例二:
步骤21,假设目标点云的码流中存在1个属性类型的2个属性实例,且将目标点云的码流中的不同属性实例按照多轨封装,得到目标点云的媒体文件F1。目标点云的媒体文件F1中包括Track1、Track2和Track3:
Track1:GPCCComponentInfoBox:{gpcc_type=2(Geometry)}。
Track2:GPCCComponentInfoBox:{gpcc_type=4(Attribute);multi_attr_instance_flag=1;attr_instance_id=1;attr_instance_priority=0;attr_instance_type=0}。
Track3:GPCCComponentInfoBox:{gpcc_type=4(Attribute);multi_attr_instance_flag=1;attr_instance_id=2;attr_instance_priority=0;attr_instance_type=0}。
其中,Track2和Track3为两个属性实例的轨道。
步骤22,根据目标点云的媒体文件F1中的属性实例的信息,生成DASH信令(即第一信息),用于指示至少一个属性实例的第一特征信息,DASH信令包括如下内容:
Representation1:对应track1,component@component_type=‘geom’。
Representation2:对应track2,component@component_type=‘attr’;component@attr_instance_id=1;component@attr_instance_priority=0;component@attr_instance_type=0。
Representation3:对应track3,component@component_type=‘attr’;component@attr_instance_id=2;component@attr_instance_priority=0;component@attr_instance_type=0。
将DASH信令发送给文件解封装设备。
步骤23,文件解封装设备C1和C2根据网络带宽和DASH信令中的信息,请求点云媒体文件。
C1:网络带宽充足,请求2个属性实例。
C2:虽然representation2和3的优先级相同,但是由于这两个属性实例是和推荐视窗关联的,因此可根据推荐视窗元数据轨道中的属性实例的第二特性信息,按用户观看位置请求对应的媒体资源,一次只请求1个属性实例。
步骤24,传输点云媒体文件。
步骤25,文件解封装设备接收点云文件:
C1:根据attr_instance_type=0,C1收到2个属性实例后,根据用户观看视窗选择其中一个属性实例解码消费。
C2:C2仅收到1个属性实例,解码对应的属性实例消费。
本申请实施例提供的点云媒体文件的封装与解封装方法,文件封装设备通过向文件解封装设备发送第一信息,该第一信息用于指示M个属性实例中的至少一个属性实例的第一特征信息。这样,文件解封装设备可以根据至少一个属性实例的第一特征信息,以及文件解封装设备的自身性能,选择请求目标属性实例进行消费,进而节约了网络带宽,提升了解码效率。
图7为本申请实施例提供的一种点云媒体文件封装与解封装方法的交互流程图,如图7所示,本实施例包括如下步骤:
S701、文件封装设备获取目标点云,并对目标点云进行编码,得到目标点云的码流。
其中,目标点云包括N类属性信息,N类属性信息中的至少一类属性信息包括M个属性实例,N为正整数,M为大于1的正整数。
S702、文件封装设备根据M个属性实例中至少一个属性实例的第一特征信息,对目标点云的码流进行封装,得到目标点云的媒体文件,目标点云的媒体文件包括至少一个属性实例的第一特征信息。
上述S701与上述S702的实现过程可以参照上述S501至S502的具体描述,在此不再赘述。
文件封装设备根据上述步骤对目标点云进行编码和封装,得到目标点云的媒体文件后,可以通过如下几种方式与文件解封装设备进行数据交互:
方式一,文件封装设备可以直接将封装得到的目标点云的媒体文件发送给文件解封装设备,使得文件解封装设备根据媒体文件中的属性实例的第一特征信息,选择性消费部分属性实例。
方式二,文件封装设备向文件解封装设备发送信令,文件解封装设备根据信令,向文件封装设备请求全部或部分的属性实例的媒体文件进行消费。
在该实施例中,对方式二中的文件解封装设备请求完整的目标点云的媒体文件后,选择解码部分属性实例的媒体文件进行消费的过程进行介绍,具体参照以下S703至S707。
S703、文件封装设备向文件解封装设备发送第一信息。
该第一信息用于指示M个属性实例中至少一个属性实例的第一特征信息。
属性实例的第一特征信息包括属性实例的标识、属性实例的优先权、属性实例的类型中的至少一个。
一些实施例中,上述第一信息为DASH信令。
在一些实施例中,若上述第一信息为DASH信令时,DASH信令的语义描述如上述表4所示。
S704、文件解封装设备根据第一信息,向文件封装设备发送第二请求信息。
该第二请求用于请求目标点云的媒体文件。
S705、文件封装设备根据第二请求信息,将目标点云的媒体文件发送给文件解封装设备。
S706、文件解封装设备根据至少一个属性实例的第一特征信息,确定目标属性实例。
其中,S706的实现过程与上述S604的实现过程一致,参照上述S604的描述,例如若属性实例的类型为与用户反馈关联的属性实例,则文件解封装设备根据M个属性实例中至少一个属性实例的第一特征信息,从至少一个属性实例中确定目标属性实例。再例如,若属性实例的类型为与推荐视窗关联的属性实例,则文件解封装设备获取推荐视窗的元数据轨道,并根据推荐视窗的元数据轨道中包括的属性实例的第二特性信息,从M个属性实例的至少一个属性实例中确定目标属性实例。
S707、文件解封装设备对目标属性实例的媒体文件进行解封装后再解码,得到目标属性实例。
根据上述步骤确定出待解码的目标属性实例后,从接收到的目标点云的媒体文件中查询到该目标属性实例对应的媒体文件。接着,先对目标属性实例的媒体文件进行解封装,得到解封装后的目标属性实例的码流,再对目标属性实例的码流进行解码,得到解码后的目标属性实例。
本申请实施例提供的点云媒体文件的封装与解封装方法,文件封装设备通过向文件解封装设备发送第一信息,该第一信息用于指示M个属性实例中的至少一个属性实例的第一特征信息。这样,文件解封装设备请求到整个目标点云的媒体文件后,可以根据至少一个属性实例的第一特征信息,以及文件解封装设备的自身性能,选择目标属性实例进行解码消费,进而节约了网络带宽,提升了解码效率。
应理解,图5至图7仅为本申请的示例,不应理解为对本申请的限制。
以上结合附图详细描述了本申请的优选实施方式,但是,本申请并不限于上述实施方式中的具体细节,在本申请的技术构思范围内,可以对本申请的技术方案进行多种简单变型,这些简单变型均属于本申请的保护范围。例如,在上述具体实施方式中所描述的各个具体技术特征,在不矛盾的情况下,可以通过任何合适的方式进行组合,为了避免不必要的重复,本申请对各种可能的组合方式不再另行说明。又例如,本申请的各种不同的实施方式之间也可以进行任意组合,只要其不违背本申请的思想,其同样应当视为本申请所公开的内容。
上文结合图5和图7,详细描述了本申请的方法实施例,下文结合图8至图10,详细描述本申请的装置实施例。
图8为本申请一实施例提供的点云媒体文件的封装装置的结构示意图,该装置10应用于文件封装设备,该装置10包括:
获取单元11,用于获取目标点云,并对所述目标点云进行编码,得到所述目标点云的码流,所述目标点云包括N类属性信息,所述N类属性信息中的至少一类属性信息包括M个属性实例,所述N为正整数,所述M为大于1的正整数;
封装单元12,用于根据M个属性实例中至少一个属性实例的第一特征信息,对目标点云的码流进行封装,得到目标点云的媒体文件,目标点云的媒体文件包括至少一个属性实例的第一特征信息。
在一些实施例中,所述属性实例的第一特征信息包括:所述属性实例的标识、所述属性实例的优先权、所述属性实例的类型中的至少一个。
在一些实施例中,所述属性实例的类型包括与推荐视窗关联的属性实例和与用户反馈关联的属性实例中的至少一个。
在一些实施例中,若所述属性实例的类型为与推荐视窗关联的属性实例,则所述封装单元12,还用于在所述属性实例关联的推荐视窗的元数据轨道中,添加所述属性实例的第二特性信息。
在一些实施例中,所述属性实例的第二特性信息包括所述属性实例的标识和所述属性实例的属性类型中的至少一个。
在一些实施例中,所述封装单元12,具体用于若所述目标点云中的一帧点云的几何信息和属性信息封装在一个轨道或一个项目中时,则将所述至少一个属性实例的第一特征信息,添加在所述M个属性实例对应的子样本数据盒中;或者,
若所述M个属性实例中的每个属性实例封装在一个轨道或一个项目中时,则将所述至少一个属性实例的第一特征信息,添加在所述M个属性实例对应的组件信息数据盒中;或者,
若所述M个属性实例中的每个属性实例封装在一个轨道或一个项目中,且所述M个属性实例对应的M个轨道构成轨道组,或所述M个属性实例对应的M个项目构成实体组时,则将所述至少一个属性实例的第一特征信息,添加在所述轨道组数据盒或所述实体组数据盒中。
在一些实施例中,所述封装单元12,还用于若所述M个属性实例一一对应封装在M个属性实例轨道中,则将所述M个属性实例轨道通过轨道组数据盒进行关联;或者,
若所述M个属性实例一一对应封装在M个属性实例项目中,则将所述M个属性实例项目通过实体组数据盒进行关联。
在一些实施例中,装置还包括收发单元13,用于向文件解封装设备发送第一信息,所述第一信息用于指示所述M个属性实例中至少一个属性实例的第一特征信息。
在一些实施例中,收发单元13,用于接收所述文件解封装设备发送的第一请求信息,所述第一请求用于请求目标属性实例的媒体文件;并根据所述第一请求信息,将所述目标属性实例的媒体文件发送给所述文件解封装设备。
在一些实施例中,收发单元13,还用于接收所述文件解封装设备发送的第二请求信息,所述第二请求用于请求所述目标点云的媒体文件;并根据所述第二请求信息,将所述目标点云的媒体文件发送给所述文件解封装设备。
应理解的是,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图8所示的装置10可以执行文件封装设备对应的方法实施例,并且装置10中的各个模块的前述和其它操作和/或功能分别为了实现文件封装设备对应的方法实施例,为了简洁,在此不再赘述。
图9为本申请一实施例提供的点云媒体文件的解封装装置的结构示意图,该装置20应用于文件解封装设备,该装置20包括:
收发单元21,用于接收文件封装设备发送的第一信息;
其中,所述第一信息用于指示M个属性实例中的至少一个属性实例的第一特征信息,所述M个属性实例为目标点云所包括的N类属性信息中至少一类属性信息所包括的M个属性实例,所述N为正整数,所述M为大于1的正整数。
在一些实施例中,所述属性实例的第一特征信息包括:所述属性实例的标识、所述属性实例的优先权、所述属性实例的类型中的至少一个。
在一些实施例中,所述属性实例的类型包括与推荐视窗关联的属性实例和与用户反馈关联的属性实例中的至少一个。
在一些实施例中,若所述属性实例的类型为与推荐视窗关联的属性实例,则在所述属性实例关联的推荐视窗的元数据轨道中,添加有所述属性实例的第二特性信息。
在一些实施例中,所述装置还包括确定单元22和解码单元23:
确定单元22,用于根据所述至少一个属性实例的第一特征信息,确定目标属性实例;
收发单元21,用于向所述文件封装设备发送第一请求信息,所述第一请求信息用于请求所述目标属性实例的媒体文件;并接收所述文件封装设备发送的所述目标属性实例的媒体文件;
解码单元23,用于对所述目标属性实例的媒体文件进行解封装后再解码,得到所述目标属性实例。
在一些实施例中,收发单元21,还用于根据所述第一信息,向所述文件封装设备发送第二请求信息,所述第二请求用于请求所述目标点云的媒体文件;并接收所述文件封装设备发送的所述目标点云的媒体文件;
确定单元22,用于根据所述至少一个属性实例的第一特征信息,确定目标属性实例;
解码单元23,用于从所述目标点云的媒体文件中获取所述目标属性实例的媒体文件;对所述目标属性实例的媒体文件进行解封装后再解码,得到所述目标属性实例。
在一些实施例中,若属性实例的第一特征信息包括属性实例的类型,则所述确定单元22,具体用于若所述属性实例的类型为与用户反馈关联的属性实例,则根据所述M个属性实例的至少一个属性实例的第一特征信息,从所述至少一个属性实例中确定所述目标属性实例;或者,
若所述属性实例的类型为与推荐视窗关联的属性实例,则获取所述推荐视窗的元数据轨道,并根据所述推荐视窗的元数据轨道中包括的属性实例的第二特性信息,从所述M个属性实例的至少一个属性实例中确定所述目标属性实例。
在一些实施例中,所述属性实例的第二特性信息包括所述属性实例的标识和所述属性实例的属性类型中的至少一个。
在一些实施例中,若所述目标点云中的一帧点云的几何信息和属性信息封装在一个轨道或一个项目中时,则在所述M个属性实例对应的子样本数据盒中添加有所述属性实例的第一特征信息;或者,
若所述M个属性实例中的每个属性实例封装在一个轨道或一个项目中时,则在所述M个属性实例对应的组件信息数据盒中添加有所述属性实例的第一特征信息;或者,
若所述M个属性实例中的每个属性实例封装在一个轨道或一个项目中,且所述M个属性实例对应的M个轨道构成轨道组,或所述M个属性实例对应的M个项目构成实体组时,则在所述轨道组数据盒或所述实体组数据盒中添加有所述属性实例的第一特征信息。
在一些实施例中,若所述M个属性实例一一对应封装在M个属性实例轨道中,则所述目标点云的媒体文件中包括轨道组数据盒,所述轨道组数据盒用于关联所述M个属性实例轨道;或者,若所述M个属性实例一一对应封装在M个属性实例项目中,则所述目标点云的媒体文件中包括实体组数据盒,所述实体组数据盒用于关联所述M个属性实例项目。
应理解的是,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图9所示的装置20可以执行文件解封装设备对应的方法实施例,并且装置20中的各个模块的前述和其它操作和/或功能分别为了实现文件解封装设备对应的方法实施例,为了简洁,在此不再赘述。
上文中结合附图从功能模块的角度描述了本申请实施例的装置。应理解,该功能模块可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过硬件和软件模块组合实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。可选地,软件模块可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图10是本申请实施例提供的电子设备的示意性框图,该电子设备可以为上述的文件封装设备、或文件解封装设备,或者该电子设备具有文件封装设备和文件解封装设备的功能。
如图10所示,该电子设备40可包括:
存储器41和处理器42,该存储器41用于存储计算机程序,并将该程序代码传输给该处理器42。换言之,该处理器42可以从存储器41中调用并运行计算机程序,以实现本申请实施例中的方法。
例如,该处理器42可用于根据该计算机程序中的指令执行上述方法实施例。
在本申请的一些实施例中,该处理器42可以包括但不限于:
通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
在本申请的一些实施例中,该存储器41包括但不限于:
易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在本申请的一些实施例中,该计算机程序可以被分割成一个或多个模块,该一个或者多个模块被存储在该存储器41中,并由该处理器42执行,以完成本申请提供的方法。该一个或多个模块可以是能够完成特定功能的一系列计算机程序指令段,该指令段用于描述该计算机程序在该电子设备中的执行过程。
如图10所示,该电子设备40还可包括:
收发器43,该收发器43可连接至该处理器42或存储器41。
其中,处理器42可以控制该收发器43与其他设备进行通信,具体地,可以向其他设备发送信息或数据,或接收其他设备发送的信息或数据。收发器43可以包括发射机和接收机。收发器43还可以进一步包括天线,天线的数量可以为一个或多个。
应当理解,该电子设备中的各个组件通过总线系统相连,其中,总线系统除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。
本申请还提供了一种计算机存储介质,其上存储有计算机程序,该计算机程序被计算机执行时使得该计算机能够执行上述方法实施例的方法。或者说,本申请实施例还提供一种包含指令的计算机程序产品,该指令被计算机执行时使得计算机执行上述方法实施例的方法。
当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行该计算机程序指令时,全部或部分地产生按照本申请实施例该的流程或功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。该计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的模块及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个模块或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或模块的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。例如,在本申请各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块中。
以上仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以该权利要求的保护范围为准。

Claims (24)

  1. 一种点云媒体文件的封装方法,其特征在于,应用于文件封装设备,所述方法包括:
    获取目标点云,并对所述目标点云进行编码,得到所述目标点云的码流;
    对所述码流进行封装,得到所述目标点云的媒体文件;
    当所述目标点云包括至少一类属性信息的M个属性实例对应的实例数据时,将所述M个属性实例中至少一个属性实例的第一特征信息作为所述至少一个属性实例对应的实例数据的元数据,封装在所述目标点云的媒体文件中,所述第一特征信息用于标识所述至少一个属性实例与所述M个属性实例中其他属性实例的区别,所述M为大于1的正整数。
  2. 根据权利要求1所述的方法,其特征在于,所述属性实例的第一特征信息包括:所述属性实例的标识、所述属性实例的优先权、所述属性实例的类型中的至少一个。
  3. 根据权利要求2所述的方法,其特征在于,所述属性实例的类型包括与推荐视窗关联的属性实例和与用户反馈关联的属性实例中的至少一个。
  4. 根据权利要求3所述的方法,其特征在于,若所述属性实例的类型为与推荐视窗关联的属性实例,则所述方法进一步包括:
    在所述属性实例关联的推荐视窗的元数据轨道中,添加所述属性实例的第二特性信息。
  5. 根据权利要求4所述的方法,其特征在于,所述属性实例的第二特性信息包括所述属性实例的标识和所述属性实例的属性类型中的至少一个。
  6. 根据权利要求1-5任一项所述的方法,其特征在于,将所述M个属性实例中至少一个属性实例的第一特征信息作为所述至少一个属性实例对应的实例数据的元数据封装在所述目标点云的媒体文件中,包括:
    若所述目标点云中的一帧点云的几何信息和属性信息封装在一个轨道或一个项目中时,则将所述至少一个属性实例的第一特征信息,添加在所述M个属性实例对应的子样本数据盒中;或者,
    若所述M个属性实例中的每个属性实例封装在一个轨道或一个项目中时,则将所述至少一个属性实例的第一特征信息,添加在所述M个属性实例对应的组件信息数据盒中;或者,
    若所述M个属性实例中的每个属性实例封装在一个轨道或一个项目中,且所述M个属性实例对应的M个轨道构成一个轨道组,或所述M个属性实例对应的M个项目构成一个实体组时,则将所述至少一个属性实例的第一特征信息,添加在所述轨道组数据盒或所述实体组数据盒中。
  7. 根据权利要求1-5任一项所述的方法,其特征在于,进一步包括:
    若所述M个属性实例一一对应封装在M个属性实例轨道中,则将所述M个属性实例轨道通过轨道组数据盒进行关联;或者,
    若所述M个属性实例一一对应封装在M个属性实例项目中,则将所述M个属性实例项目通过实体组数据盒进行关联。
  8. 根据权利要求1-5任一项所述的方法,其特征在于,进一步包括:
    向文件解封装设备发送第一信息,所述第一信息用于指示所述M个属性实例中至少一个属性实例的第一特征信息。
  9. 根据权利要求8所述的方法,其特征在于,进一步包括:
    接收所述文件解封装设备发送的第一请求信息,所述第一请求用于请求目标属性实例的媒体文件;
    根据所述第一请求信息,将所述目标属性实例的媒体文件发送给所述文件解封装设备。
  10. 根据权利要求8所述的方法,其特征在于,进一步包括:
    接收所述文件解封装设备发送的第二请求信息,所述第二请求用于请求所述目标点云的媒体文件;
    根据所述第二请求信息,将所述目标点云的媒体文件发送给所述文件解封装设备。
  11. 一种点云媒体文件的解封装方法,其特征在于,应用于文件解封装设备,包括:
    接收文件封装设备发送的第一信息;其中,所述第一信息用于指示目标点云包括的至少一类属性信息的M个属性实例中的至少一个属性实例的第一特征信息,所述M为大于1的正整数;
    利用所述第一特征信息从所述至少一个属性实例中确定目标属性实例,获取所述目标属性实例对应的实例数据。
  12. 根据权利要求11所述的方法,其特征在于,所述属性实例的第一特征信息包括:所述属性实例的标识、所述属性实例的优先权、所述属性实例的类型中的至少一个。
  13. The method according to claim 12, wherein the type of the attribute instance comprises at least one of an attribute instance associated with a recommended viewport and an attribute instance associated with user feedback.
  14. The method according to claim 13, wherein if the type of the attribute instance is an attribute instance associated with a recommended viewport, second feature information of the attribute instance is added to a metadata track of the recommended viewport associated with the attribute instance.
  15. The method according to claim 11, wherein obtaining the instance data corresponding to the target attribute instance comprises:
    sending first request information to the file encapsulation device, the first request information being used to request a media file of the target attribute instance;
    receiving the media file corresponding to the target attribute instance sent by the file encapsulation device; and
    decapsulating and then decoding the media file to obtain the instance data of the target attribute instance.
  16. The method according to claim 11, further comprising:
    sending second request information to the file encapsulation device according to the first information, the second request information being used to request the media file of the target point cloud;
    receiving the media file of the target point cloud sent by the file encapsulation device;
    obtaining a media file of the target attribute instance from the media file of the target point cloud; and
    decapsulating and then decoding the media file of the target attribute instance to obtain the instance data of the target attribute instance.
  17. The method according to claim 15 or 16, wherein if the first feature information of an attribute instance comprises the type of the attribute instance, determining the target attribute instance according to the first feature information of the at least one attribute instance comprises:
    if the type of the attribute instance is an attribute instance associated with user feedback, determining the target attribute instance from the at least one attribute instance according to the first feature information of the at least one of the M attribute instances; or
    if the type of the attribute instance is an attribute instance associated with a recommended viewport, obtaining the metadata track of the recommended viewport, and determining the target attribute instance from the at least one of the M attribute instances according to the second feature information of the attribute instance comprised in the metadata track of the recommended viewport.
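The two selection branches of claim 17 can be sketched as a single decision function. The dict keys `id`, `priority`, and `type`, and the rule "smaller priority value wins", are illustrative assumptions; the patent only specifies that the choice is driven by the first feature information (user-feedback branch) or by the second feature information in the viewport metadata track (recommended-viewport branch):

```python
def select_target_instance(instances, viewport_instance_ids=None):
    """Choose the target attribute instance following claim 17.

    instances: list of dicts describing each instance's first feature
        information, e.g. {"id": 3, "priority": 1, "type": "viewport"}.
    viewport_instance_ids: instance ids listed as second feature
        information in the recommended-viewport metadata track.
    """
    viewport_bound = [i for i in instances if i["type"] == "viewport"]
    if viewport_bound and viewport_instance_ids:
        # Recommended-viewport branch: follow the metadata track and pick
        # an instance it references.
        referenced = [i for i in viewport_bound
                      if i["id"] in set(viewport_instance_ids)]
        if referenced:
            return referenced[0]
    # User-feedback branch: decide from the first feature information
    # itself, here by taking the highest-priority instance
    # (assuming smaller value means higher priority).
    return min(instances, key=lambda i: i["priority"])
```

In practice the user-feedback branch would fold in device or network conditions reported by the player; priority is used here as the simplest stand-in for that decision.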
  18. The method according to claim 14, wherein the second feature information of the attribute instance comprises at least one of the identifier of the attribute instance and an attribute type of the attribute instance.
  19. The method according to any one of claims 11 to 16, wherein:
    if geometry information and attribute information of one point cloud frame in the target point cloud are encapsulated in one track or one item, the first feature information of the attribute instance is added to a subsample data box corresponding to the M attribute instances; or
    if each of the M attribute instances is encapsulated in one track or one item, the first feature information of the attribute instance is added to a component information data box corresponding to the M attribute instances; or
    if each of the M attribute instances is encapsulated in one track or one item, and M tracks corresponding to the M attribute instances form a track group, or M items corresponding to the M attribute instances form an entity group, the first feature information of the attribute instance is added to the track group data box or the entity group data box.
  20. The method according to any one of claims 11 to 16, wherein:
    if the M attribute instances are encapsulated in M attribute instance tracks in one-to-one correspondence, the media file of the target point cloud comprises a track group data box, the track group data box being used to associate the M attribute instance tracks; or
    if the M attribute instances are encapsulated in M attribute instance items in one-to-one correspondence, the media file of the target point cloud comprises an entity group data box, the entity group data box being used to associate the M attribute instance items.
  21. An encapsulation apparatus for a point cloud media file, applied to a file encapsulation device, the apparatus comprising:
    an obtaining unit, configured to obtain a target point cloud and encode the target point cloud to obtain a bitstream of the target point cloud; and
    an encapsulation unit, configured to encapsulate the bitstream to obtain a media file of the target point cloud, and when the target point cloud comprises instance data corresponding to M attribute instances of at least one type of attribute information, encapsulate first feature information of at least one of the M attribute instances in the media file of the target point cloud as metadata of the instance data corresponding to the at least one attribute instance, the first feature information being used to identify a distinction between the at least one attribute instance and other attribute instances among the M attribute instances, M being a positive integer greater than 1.
  22. A decapsulation apparatus for a point cloud media file, applied to a file decapsulation device, the apparatus comprising:
    a transceiver unit, configured to receive first information sent by a file encapsulation device, wherein the first information is used to indicate first feature information of at least one of M attribute instances of at least one type of attribute information comprised in a target point cloud, M being a positive integer greater than 1; and
    a processing unit, configured to determine a target attribute instance from the at least one attribute instance by using the first feature information, and obtain instance data corresponding to the target attribute instance.
  23. An electronic device, comprising:
    a processor and a memory, the memory being configured to store a computer program, and the processor being configured to invoke and run the computer program stored in the memory to perform the method according to any one of claims 1 to 10 or 11 to 20.
  24. A computer-readable storage medium, configured to store a computer program, the computer program causing a computer to perform the method according to any one of claims 1 to 10 or 11 to 20.
PCT/CN2022/109620 2021-09-01 2022-08-02 Encapsulation and decapsulation methods and apparatuses for point cloud media file, and storage medium WO2023029858A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/463,765 US20230421810A1 (en) 2021-09-01 2023-09-08 Encapsulation and decapsulation methods and apparatuses for point cloud media file, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111022386.2A CN113852829A (zh) 2021-09-01 2021-09-01 点云媒体文件的封装与解封装方法、装置及存储介质
CN202111022386.2 2021-09-01

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/463,765 Continuation US20230421810A1 (en) 2021-09-01 2023-09-08 Encapsulation and decapsulation methods and apparatuses for point cloud media file, and storage medium

Publications (1)

Publication Number Publication Date
WO2023029858A1 true WO2023029858A1 (zh) 2023-03-09

Family

ID=78976735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/109620 WO2023029858A1 (zh) 2021-09-01 2022-08-02 点云媒体文件的封装与解封装方法、装置及存储介质

Country Status (3)

Country Link
US (1) US20230421810A1 (en)
CN (1) CN113852829A (zh)
WO (1) WO2023029858A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115941995A (zh) * 2021-08-23 2023-04-07 腾讯科技(深圳)有限公司 Media file encapsulation and decapsulation method, apparatus, device, and storage medium
CN116781676A (zh) * 2022-03-11 2023-09-19 腾讯科技(深圳)有限公司 Data processing method, apparatus, device, and medium for point cloud media
CN115396645B (zh) * 2022-08-18 2024-04-19 腾讯科技(深圳)有限公司 Data processing method, apparatus, device, and storage medium for immersive media
US20240129562A1 (en) * 2022-10-14 2024-04-18 Rovi Guides, Inc. Systems personalized spatial video/light field content delivery
WO2024082152A1 (zh) * 2022-10-18 2024-04-25 Oppo广东移动通信有限公司 Encoding and decoding method and apparatus, codec, bitstream, device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573522A (zh) * 2017-03-14 2018-09-25 腾讯科技(深圳)有限公司 Sign data display method and terminal
WO2020026846A1 (ja) * 2018-08-02 2020-02-06 ソニー株式会社 画像処理装置および方法
WO2020190093A1 (ko) * 2019-03-20 2020-09-24 엘지전자 주식회사 포인트 클라우드 데이터 송신 장치, 포인트 클라우드 데이터 송신 방법, 포인트 클라우드 데이터 수신 장치 및 포인트 클라우드 데이터 수신 방법
CN113114608A (zh) * 2020-01-10 2021-07-13 上海交通大学 点云数据封装方法及传输方法


Also Published As

Publication number Publication date
CN113852829A (zh) 2021-12-28
US20230421810A1 (en) 2023-12-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (ref document number: 22862999; country of ref document: EP; kind code of ref document: A1)
NENP Non-entry into the national phase (ref country code: DE)