WO2018119602A1 - Rendering method and device - Google Patents

Rendering method and device

Info

Publication number
WO2018119602A1
Authority
WO
WIPO (PCT)
Prior art keywords
rendering
user
thread
task data
data
Prior art date
Application number
PCT/CN2016/112185
Other languages
French (fr)
Chinese (zh)
Inventor
王洛威
廉士国
Original Assignee
深圳前海达闼云端智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳前海达闼云端智能科技有限公司
Priority to CN201680006926.9A (CN107223264B)
Priority to PCT/CN2016/112185
Publication of WO2018119602A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Generation (AREA)

Abstract

The present invention relates to the technical field of image processing and provides a rendering method and device to lower the CPU load rate and improve the bandwidth utilization between a GPU and a CPU. The method comprises: configuring, in a user thread corresponding to a user, a rendering process state of rendering task data submitted by the user (101); sending, to a rendering main thread, at least one item of cached data obtained after the rendering process state has been configured for the rendering task data in the user thread (102); and sending, by the rendering main thread, the cached data to a graphics processing unit (GPU) for graphics rendering (103). The rendering method and device of the present invention are used to render graphics.

Description

Rendering method and device
Technical field
Embodiments of the present application relate to the field of image processing technologies, and in particular, to a rendering method and apparatus.
Background
In the field of 3D graphics processing, processing graphics usually requires a complex rendering calculation process. In general, rendering is divided into real-time rendering and non-real-time rendering. Real-time rendering needs to produce several images per second and is mainly used for 3D games and dynamic real-time display of 3D models; non-real-time rendering generally takes much longer and aims to produce physically realistic images, and is mainly used in fields such as film and television animation, advertising planning, interior design, and industrial design.
Generally, during real-time rendering, data is first processed by a CPU (Central Processing Unit) and then transferred to a GPU (Graphics Processing Unit) for processing, which finally generates the rendered image. As shown in FIG. 1, in the CPU the rendering subsystem starts a main thread and creates a user thread for each of three users (user thread 1, user thread 2, user thread 3); for every user, the rendering task data it submits must have its rendering process state set by the main thread before being transferred to the GPU for processing. To increase the speed of real-time rendering, when the data transmission delay is negligible, one solution is to transmit the data to a background cloud server, which performs the calculations related to real-time rendering. As shown in FIG. 1, before 3D graphics are rendered, the CPU needs to set the rendering process state for the current scene, and this generally involves setting various parameters. When real-time rendering is performed on a cloud server, the computing power of the CPU is stronger than that of an ordinary PC (personal computer), but a cloud server based on the C/S (Client/Server) architecture may have thousands of user terminals connected at any moment. As more user terminals connect, a rendering process state must be set for every user, yet in the prior art the setting of the rendering process state is restricted by the graphics API to a single thread/process and is limited by the FIFO (First-In First-Out) processing of the main-thread cache. As shown in FIG. 2, each user's image frames may therefore have to be processed one after another in the main thread, and each rendering pass for each user must include the rendering process state setting flow shown in FIG. 3: bind vertices (usually allocating memory through the bindvertex function) > set the viewport (usually through the setviewport function) > bind the rendering pipeline (usually through the bindpipeline function) > draw (usually through the Draw function); the image frame submitted by the user is finally drawn according to these settings.
In the above process, bindvertex and bindpipeline usually modify state in an OpenGL Context (Open Graphics Library Context), and the existing rendering APIs (Application Programming Interfaces) only allow each user's rendering process state to be modified in the single OpenGL Context of the main thread. The computation time spent on the rendering process state setting operations therefore cannot simply be ignored, and the multi-core, multi-process capability of the CPU cannot help with it at all.
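The serial bottleneck described above can be sketched as follows. This is only an illustrative reconstruction of the prior-art flow of FIG. 2 and FIG. 3: the functions bindvertex, setviewport, bindpipeline and draw are the placeholder names used in this description, implemented here as stubs rather than real OpenGL calls, and the FIFO stands in for the main-thread cache.
```cpp
#include <cstdio>
#include <queue>

// Illustrative stand-ins for the state-setting calls named in the text; in a real
// OpenGL backend they would all have to run on the one thread that owns the Context.
struct Frame { int userId; };
void bindvertex(const Frame& f)   { std::printf("user %d: bind vertices\n", f.userId); }
void setviewport(const Frame& f)  { std::printf("user %d: set viewport\n", f.userId); }
void bindpipeline(const Frame& f) { std::printf("user %d: bind pipeline\n", f.userId); }
void draw(const Frame& f)         { std::printf("user %d: draw\n", f.userId); }

int main() {
    // Frames submitted by different users queue up in the main thread (FIFO).
    std::queue<Frame> fifo;
    for (int user = 1; user <= 3; ++user) fifo.push(Frame{user});

    // Prior art: every frame's rendering process state is set sequentially in the
    // single main-thread context before the frame can be handed to the GPU.
    while (!fifo.empty()) {
        Frame f = fifo.front(); fifo.pop();
        bindvertex(f);      // allocate/bind vertex memory
        setviewport(f);     // set the view
        bindpipeline(f);    // bind the rendering pipeline
        draw(f);            // draw the user's image frame
    }
    return 0;
}
```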
As a result, when the number of connected user terminals rises, the rendering workload grows, the load on a single CPU of the cloud server becomes excessive while the other CPUs cannot share the rendering work, and, because the processing delay on the CPU side increases, the bandwidth between the GPU and the CPU cannot be used effectively and the GPU cannot deliver its full performance.
Summary of the invention
Embodiments of the present application provide a rendering method and device that can reduce the CPU load rate and improve the bandwidth utilization between the GPU and the CPU.
In a first aspect, a rendering method includes:
configuring, in a user thread corresponding to a user, a rendering process state of rendering task data submitted by the user;
sending, to a rendering main thread, at least one item of cached data obtained after the rendering process state has been configured for the rendering task data in the user thread; and
sending, by the rendering main thread, the cached data to a graphics processing unit (GPU) for graphics rendering.
In a second aspect, a rendering apparatus is provided, including:
a configuration unit, configured to configure, in a user thread corresponding to a user, a rendering process state of rendering task data submitted by the user;
a forwarding unit, configured to send, to a rendering main thread, at least one item of cached data obtained after the configuration unit has configured the rendering process state for the rendering task data in the user thread; and
a sending unit, configured to send, by the rendering main thread, the cached data to a graphics processor for graphics rendering.
In a third aspect, an electronic device is provided, including a memory, a communication interface, and a processor. The memory is configured to store computer-executable code, the processor is configured to execute the computer-executable code to control execution of the rendering method described above, and the communication interface is used for data transmission between the rendering apparatus and an external device.
In a fourth aspect, a computer storage medium is provided for storing computer software instructions used by the rendering apparatus, including program code designed to perform the rendering method described above.
In a fifth aspect, a computer program is provided that can be directly loaded into the internal memory of a computer and contains software code; after being loaded and executed by the computer, the computer program can implement the rendering method described above.
In the above solutions, the rendering apparatus can configure, in the user thread corresponding to a user, the rendering process state of the rendering task data submitted by that user; send, to the rendering main thread, at least one item of cached data obtained after the rendering process state has been configured for the rendering task data in the user thread; and send, by the rendering main thread, the cached data to the GPU for graphics rendering. Because the rendering apparatus configures the rendering process state for each user's rendering task data in that user's own thread and only then sends the resulting cached data to the rendering main thread for processing, it avoids the prior-art situation in which a single thread configures the rendering process states of all users, and thus makes good use of the multi-core, multi-process capability of the CPU. Since the rendering process state of the rendering task data submitted by each user is configured in the respective user thread, the main thread only needs to send the cached data corresponding to those rendering process states to the GPU in parallel, which reduces the CPU load rate and improves the bandwidth utilization between the GPU and the CPU.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a logical structural diagram of a rendering method provided by the prior art;
FIG. 2 is a logic diagram of the order in which a user's image frames are processed in a main thread in the prior art;
FIG. 3 is a schematic diagram of a rendering state configuration process in a main thread in the prior art;
FIG. 4 is a flowchart of a rendering method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a rendering state configuration process in a main thread according to an embodiment of the present application;
FIG. 6 is a logical structural diagram of a rendering method according to an embodiment of the present application;
FIG. 7 is a structural diagram of a rendering apparatus according to an embodiment of the present application;
FIG. 8A is a structural diagram of a rendering apparatus according to another embodiment of the present application;
FIG. 8B is a structural diagram of a rendering apparatus according to still another embodiment of the present application.
Detailed description
The system architectures and service scenarios described in the embodiments of the present application are intended to explain the technical solutions of the embodiments more clearly and do not constitute a limitation on the technical solutions provided by the embodiments. Those of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided by the embodiments of the present application remain applicable to similar technical problems.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, an illustration, or a description. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as being preferred over, or more advantageous than, other embodiments or designs. Rather, the use of words such as "exemplary" or "for example" is intended to present related concepts in a concrete manner.
It should also be noted that, in the embodiments of the present application, the words "of", "relevant", and "corresponding" may sometimes be used interchangeably; when the distinction is not emphasized, the meanings they express are the same.
The user terminal provided by the embodiments of the present application may be a personal computer (PC), a netbook, a personal digital assistant (PDA), or the like, or it may be a PC on which a software client, a software system, or a software application capable of executing the method provided by the embodiments of the present application is installed. The specific hardware implementation environment may take the form of a general-purpose computer, an ASIC, an FPGA, or a programmable extensible platform such as Tensilica's Xtensa platform. The server provided by the embodiments of the present application includes a local domain name server, a local proxy server, and a network server; the embodiments of the present application provide a server for supplying computing services in response to service requests. Its basic components include a processor, a hard disk, memory, a system bus, and so on, similar to a general-purpose computer architecture.
The basic principle of the present application is to strip the rendering process state setting procedure out of the main thread and move it into the user thread corresponding to each user, which makes good use of the multi-core, multi-process capability of the CPU. Because the rendering process state of the rendering task data submitted by each user is configured in that user's own thread, the main thread only needs to turn the cached data carrying the rendering process state into data that the GPU can process, which reduces the CPU load rate and improves the bandwidth utilization between the GPU and the CPU.
The rendering method provided by the embodiments of the present application can be applied to a user terminal and can also be applied to a cloud server based on a C/S architecture.
Referring to FIG. 4, an embodiment of the present application provides a rendering method that includes the following steps:
101. In a user thread corresponding to a user, configure a rendering process state of rendering task data submitted by the user.
Before step 101, a user thread first needs to be created for each user. The rendering task data includes at least the following state parameters: the scene, the elements in the scene, and the user pose. For example, a game scene may contain elements such as people, animals, plants, buildings, vehicles, and weapons, and the pose of an element may be the posture of a person or an animal. Step 101 specifically configures, according to these state parameters, the rendering process state of the rendering task submitted by the user in the user thread corresponding to that user. Illustratively, the rendering process state is usually held in the Context of the rendering task data; the Context includes all the state of the current rendering pipeline, such as the bound Shader, the Render Target, and so on. In OpenGL (Open Graphics Library) the Context is bound to a single thread, so every operation that must act on the Context, such as changing the rendering process state by binding a Shader or issuing a Draw Call, can only be performed on that single thread. Specifically, referring to FIG. 5, step 101 includes the following steps:
S1. In the user thread, allocate memory for the rendering task data submitted by the user through a bind-vertex operation.
S2. In the user thread, set the viewport for the rendering task data.
S3. In the user thread, bind the rendering pipeline for the rendering task data.
The bind-vertex operation in step S1 is usually implemented by the bindvertex function, the viewport setting in step S2 is usually implemented by the setviewport function, and the rendering pipeline binding in step S3 is usually implemented by the bindpipeline function. In addition, to avoid the extra overhead caused by constantly changing each user's rendering process state, the solution further includes step S4: in the user thread, bind a descriptor for the rendering task data, where the descriptor is used to indicate the resources used by the bound rendering pipeline. The descriptor binding operation is implemented by the binddescriptor function. Because the descriptor describes the resources required by the bindpipeline stage, when the parameters of the bindpipeline stage really need to change it is only necessary to read them from the corresponding location in the resource.
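A minimal sketch of steps S1 to S4 is given below, as they would run inside one user's thread. The bindvertex, setviewport, bindpipeline and binddescriptor functions are the placeholder names used above, implemented here as stubs that simply record what was configured; the mapping to Vulkan-style commands mentioned in the comments is an assumption for illustration only.
```cpp
#include <string>
#include <vector>

// Hypothetical, simplified representations of the rendering task data and of the
// cache area ("command buffer") that is later handed to the rendering main thread.
struct RenderTaskData {
    std::string scene;
    std::vector<std::string> elements;      // people, animals, buildings, ...
    std::vector<std::string> elementPoses;  // pose of each element
};
struct CacheData {
    std::vector<std::string> recordedState; // rendering process state, as recorded
};

// Stubs standing in for the functions named in the text. In a Vulkan-style backend
// they would roughly correspond to commands recorded into a command buffer
// (e.g. vkCmdBindVertexBuffers, vkCmdSetViewport, vkCmdBindPipeline,
// vkCmdBindDescriptorSets), but that mapping is an assumption, not part of the text.
void bindvertex(CacheData& c, const RenderTaskData&)     { c.recordedState.push_back("bindvertex"); }     // S1
void setviewport(CacheData& c, const RenderTaskData&)    { c.recordedState.push_back("setviewport"); }    // S2
void bindpipeline(CacheData& c, const RenderTaskData&)   { c.recordedState.push_back("bindpipeline"); }   // S3
void binddescriptor(CacheData& c, const RenderTaskData&) { c.recordedState.push_back("binddescriptor"); } // S4

// Runs inside the user thread of one user (step 101): configures the rendering
// process state for that user's task data and returns the resulting cached data,
// which is then sent to the rendering main thread (step 102).
CacheData configureRenderProcessState(const RenderTaskData& task) {
    CacheData cache;
    bindvertex(cache, task);      // S1: allocate memory / bind vertices
    setviewport(cache, task);     // S2: set the view
    bindpipeline(cache, task);    // S3: bind the rendering pipeline
    binddescriptor(cache, task);  // S4: the descriptor points at the resources the
                                  // bound pipeline uses, so later parameter changes
                                  // only require re-reading from those resources
    return cache;
}
```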
102. Send, to the rendering main thread, at least one item of cached data obtained after the rendering process state has been configured for the rendering task data in the user thread.
In step 102, when multiple user threads are present, each user thread sends the cached data (buffer) corresponding to the rendering process state it has configured to the rendering main thread in parallel. After step 102, the method further includes: establishing, by the rendering main thread, a cache queue for the cached data corresponding to the at least one user thread, so that in step 103 the rendering main thread can directly send the cached data in the cache queue to the GPU in parallel.
103. Send the cached data to the graphics processor GPU for rendering through the rendering main thread.
Specifically, referring to FIG. 6, an embodiment of the present application based on the Vulkan architecture (a cross-platform 2D and 3D graphics application programming interface) is described as follows. The rendering subsystem runs a rendering main thread and creates three user threads: the thread of user one, the thread of user two, and the thread of user three. Each user thread sets a render process state for the rendering task data submitted by its own user; this process follows the description of step 101 above and is not repeated here. The cached data produced after each user's rendering process state has been configured is then sent to the rendering main thread in parallel. Here the cached data is stored in a cache area (usually a command buffer), and the rendering main thread maintains a cache area queue (for example, a command buffer queue) to store the cached data. Because the time taken for all user threads to deliver, in parallel, the cached data corresponding to their users' rendering process states to the rendering main thread cannot be ignored, an init fence (initialization fence) is usually set up in the rendering main thread: once every user thread has finished sending its cached data to the rendering main thread, the fence is triggered and released, and the rendering main thread submits the data in the cache area queue to the GPU.
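A minimal sketch of this flow is shown below. It uses standard C++ threads, with a condition variable standing in for the init fence and a plain vector standing in for the command buffer queue; the Vulkan objects the embodiment refers to (command buffers, fences, vkQueueSubmit) are only indicated in comments, so this illustrates the structure rather than the actual Vulkan calls.
```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct CacheData { int userId; std::string state; };  // stands in for a command buffer

std::mutex mtx;
std::condition_variable fence;       // plays the role of the "init fence"
std::vector<CacheData> bufferQueue;  // command-buffer queue kept by the main thread
int pendingThreads = 3;

// User thread: configure the rendering process state for this user's task data
// (step 101) and hand the cached data to the main thread (step 102).
void userThread(int userId) {
    CacheData cache{userId, "render process state of user " + std::to_string(userId)};
    std::lock_guard<std::mutex> lock(mtx);
    bufferQueue.push_back(cache);
    if (--pendingThreads == 0) fence.notify_one();  // last thread releases the fence
}

int main() {
    std::vector<std::thread> users;
    for (int id = 1; id <= 3; ++id) users.emplace_back(userThread, id);

    // Rendering main thread: wait until every user thread has delivered its cached
    // data, then submit the whole queue to the GPU (step 103). With Vulkan this
    // would be a single vkQueueSubmit of all recorded command buffers.
    {
        std::unique_lock<std::mutex> lock(mtx);
        fence.wait(lock, [] { return pendingThreads == 0; });
        for (const CacheData& c : bufferQueue)
            std::printf("submit to GPU: %s\n", c.state.c_str());
    }
    for (auto& t : users) t.join();
    return 0;
}
```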
In the above solution, the rendering process state of the rendering task data submitted by a user can be configured in the user thread corresponding to that user; at least one item of cached data obtained after the rendering process state has been configured for the rendering task data in the user thread is sent to the rendering main thread; and the rendering main thread sends the cached data to the GPU for graphics rendering. Because the rendering process state can be configured for each user's rendering task data in that user's own thread, and only the resulting cached data is then sent to the rendering main thread for processing, the prior-art situation in which a single thread configures the rendering process states of all users is avoided, so the multi-core, multi-process capability of the CPU is put to good use. Since the rendering process state of each user's rendering task data is configured in the respective user thread, the main thread only needs to send the cached data corresponding to those rendering process states to the GPU in parallel, which reduces the CPU load rate and improves the bandwidth utilization between the GPU and the CPU.
It can be understood that the functions provided by the above embodiments are implemented by the hardware structures and/or software modules that perform them. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
The embodiments of the present application may divide the rendering apparatus into functional modules according to the above method examples. For example, each functional module may be divided according to a corresponding function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical functional division; in actual implementation, another division manner may be used.
When each functional module is divided according to a corresponding function, FIG. 7 shows a possible structural diagram of the rendering apparatus involved in the above embodiments. The rendering apparatus includes a configuration unit 71, a forwarding unit 72, and a sending unit 73. The configuration unit 71 is configured to configure, in the user thread corresponding to a user, the rendering process state of the rendering task data submitted by the user. The forwarding unit 72 is configured to send, to the rendering main thread, at least one item of cached data obtained after the configuration unit 71 has configured the rendering process state for the rendering task data in the user thread. The sending unit 73 is configured to send, by the rendering main thread, the cached data to the graphics processor for graphics rendering. Optionally, the apparatus further includes a cache unit 74, configured to establish, by the rendering main thread, a cache queue for the cached data corresponding to the at least one user thread; in this case the sending unit 73 is specifically configured to send the cache queue established for the cached data to the GPU for graphics rendering. The configuration unit 71 is specifically configured to allocate memory, in the user thread, for the rendering task data submitted by the user through a bind-vertex operation; to set the viewport for the rendering task data in the user thread; and to bind the rendering pipeline for the rendering task data in the user thread. The configuration unit 71 is further configured to bind a descriptor for the rendering task data in the user thread, where the descriptor is used to indicate the resources used by the bound rendering pipeline. Optionally, the rendering task data includes at least the following state parameters: the scene, the elements in the scene, and the poses of the elements; the configuration unit 71 is specifically configured to configure, according to these state parameters, the rendering process state of the rendering task data submitted by the user in the user thread corresponding to the user. Optionally, the apparatus further includes a thread control unit 75, configured to create a user thread for each user. For all related details of the steps involved in the above method embodiments, reference may be made to the functional descriptions of the corresponding functional modules, and they are not repeated here.
FIG. 8A shows a possible structural diagram of the electronic device involved in an embodiment of the present application. The electronic device includes a communication module 81 and a processing module 82. The processing module 82 is configured to control and manage the rendering actions; for example, the processing module 82 is configured to support the rendering apparatus in performing the methods executed by the configuration unit 71, the forwarding unit 72, and the thread control unit 75. The communication module 81 is configured to support data transmission between the rendering apparatus and other devices and implements the method executed by the sending unit 73. The electronic device may further include a storage module 83, configured to store the program code and data of the apparatus, for example for carrying out the method executed by the cache unit 74.
The processing module 82 may be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 81 may be a transceiver, a transceiver circuit, a communication interface, or the like. The storage module may be a memory.
When the processing module 82 is a processor, the communication module 81 is a communication interface, and the storage module 83 is a memory, the electronic device involved in the embodiments of the present application may be the electronic device shown in FIG. 8B.
Referring to FIG. 8B, the electronic device includes a processor 91, a communication interface 92, a memory 93, and a bus 94. The communication interface 92 and the memory 93 are coupled to the processor 91 through the bus 94. The bus 94 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 8B, but this does not mean that there is only one bus or one type of bus.
The steps of the method or algorithm described in connection with the present disclosure may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a core network interface device. Of course, the processor and the storage medium may also exist as discrete components in a core network interface device.
Those skilled in the art should be aware that, in one or more of the above examples, the functions described in the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present application in detail. It should be understood that the above description covers only specific embodiments of the present application and is not intended to limit the scope of protection of the present application. Any modification, equivalent replacement, improvement, or the like made on the basis of the technical solutions of the present application shall be included in the scope of protection of the present application.

Claims (15)

  1. A rendering method, comprising:
    configuring, in a user thread corresponding to a user, a rendering process state of rendering task data submitted by the user;
    sending, to a rendering main thread, at least one piece of cached data obtained after the rendering process state is configured for the rendering task data in a user thread; and
    sending, by the rendering main thread, the cached data to a graphics processing unit (GPU) for graphics rendering.
  2. The rendering method according to claim 1, wherein the method further comprises:
    establishing, by the rendering main thread, a cache queue for the cached data corresponding to the at least one user thread;
    wherein the sending, by the rendering main thread, the cached data to the GPU for graphics rendering comprises:
    sending the cache queue established for the cached data to the GPU for graphics rendering.
  3. The rendering method according to claim 1, wherein the configuring, in the user thread corresponding to the user, the rendering process state of the rendering task data submitted by the user comprises:
    applying, in the user thread, for memory for the rendering task data submitted by the user through a bind-vertex operation;
    setting, in the user thread, a view for the rendering task data; and
    binding, in the user thread, a rendering pipeline for the rendering task data.
  4. The rendering method according to claim 3, wherein the method further comprises:
    binding, in the user thread, a descriptor for the rendering task data, wherein the descriptor is used to indicate resources used by the bound rendering pipeline.
  5. The rendering method according to claim 1, wherein
    the rendering task data includes at least the following state parameters: a scene, elements in the scene, and poses of the elements; and
    the configuring, in the user thread corresponding to the user, the rendering process state of the rendering task data submitted by the user comprises: configuring, according to the state parameters, the rendering process state of the rendering task data submitted by the user in the user thread corresponding to the user.
  6. The rendering method according to claim 1, wherein the method further comprises: creating a user thread for each user.
  7. A rendering apparatus, comprising:
    a configuration unit, configured to configure, in a user thread corresponding to a user, a rendering process state of rendering task data submitted by the user;
    a forwarding unit, configured to send, to a rendering main thread, at least one piece of cached data obtained after the configuration unit configures the rendering process state for the rendering task data in a user thread; and
    a sending unit, configured to send, by the rendering main thread, the cached data to a graphics processor for graphics rendering.
  8. The rendering apparatus according to claim 7, further comprising:
    a cache unit, configured to establish, by the rendering main thread, a cache queue for the cached data corresponding to the at least one user thread;
    wherein the sending unit is specifically configured to send the cache queue established for the cached data to the graphics processing unit (GPU) for graphics rendering.
  9. The rendering apparatus according to claim 7, wherein the configuration unit is specifically configured to: apply, in the user thread, for memory for the rendering task data submitted by the user through a bind-vertex operation; set, in the user thread, a view for the rendering task data; and bind, in the user thread, a rendering pipeline for the rendering task data.
  10. The rendering apparatus according to claim 9, wherein the configuration unit is further configured to bind, in the user thread, a descriptor for the rendering task data, wherein the descriptor is used to indicate resources used by the bound rendering pipeline.
  11. The rendering apparatus according to claim 7, wherein
    the rendering task data includes at least the following state parameters: a scene, elements in the scene, and poses of the elements; and
    the configuration unit is specifically configured to configure, according to the state parameters, the rendering process state of the rendering task data submitted by the user in the user thread corresponding to the user.
  12. The rendering apparatus according to claim 7, further comprising: a thread control unit, configured to create a user thread for each user.
  13. An electronic device, comprising: a memory, a communication interface, and a processor, wherein the memory and the communication interface are coupled to the processor, the memory is configured to store computer-executable code, the processor is configured to execute the computer-executable code to control execution of the rendering method according to any one of claims 1 to 6, and the communication interface is used for data transmission between the rendering apparatus and an external device.
  14. A computer storage medium, configured to store computer software instructions used by a rendering apparatus, the computer software instructions comprising program code designed to perform the rendering method according to any one of claims 1 to 6.
  15. A computer program product, which can be directly loaded into an internal memory of a computer and contains software code, wherein the computer program, after being loaded and executed by the computer, can implement the rendering method according to any one of claims 1 to 6.
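
To make the division of work recited in the claims above concrete (per-user threads configure the rendering process state for their own task data and cache the result, while the rendering main thread only queues the cached data and forwards it to the GPU), the following C++ sketch shows one possible arrangement of the threads and the cache queue. It is an illustrative sketch only, not the applicant's implementation: the types and calls (RenderTask, CachedCommands, bindVertex, setViewport, bindPipeline, bindDescriptor, draw, submitToGpu) are hypothetical stand-ins rather than a real graphics API, and the GPU submission is mocked as a console print.

```cpp
// Sketch of the claimed flow: user threads configure rendering state and cache
// it; the rendering main thread drains the cache queue and submits to the GPU.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct RenderTask {              // state parameters submitted by a user
    int userId;
    std::string scene;           // scene, elements in the scene, element poses
};

struct CachedCommands {          // rendering-state configuration cached per user thread
    int userId;
    std::vector<std::string> commands;
};

std::queue<CachedCommands> cacheQueue;   // cache queue handed to the main thread
std::mutex queueMutex;
std::condition_variable queueCv;

// User thread: configure the rendering process state for the submitted task
// data (bind vertices / set view / bind pipeline / bind descriptors / draw),
// then hand the cached result to the rendering main thread.
void userThread(const RenderTask& task) {
    CachedCommands cached{task.userId, {}};
    cached.commands.push_back("bindVertex(" + task.scene + ")");  // apply for vertex memory
    cached.commands.push_back("setViewport()");                   // set the view
    cached.commands.push_back("bindPipeline()");                  // bind the rendering pipeline
    cached.commands.push_back("bindDescriptor()");                // resources used by the pipeline
    cached.commands.push_back("draw()");

    std::lock_guard<std::mutex> lock(queueMutex);
    cacheQueue.push(std::move(cached));
    queueCv.notify_one();
}

// Stand-in for handing a user's cached commands to the GPU for rendering.
void submitToGpu(const CachedCommands& cached) {
    std::cout << "GPU renders frame for user " << cached.userId << ": ";
    for (const auto& c : cached.commands) std::cout << c << ' ';
    std::cout << '\n';
}

// Rendering main thread: performs no per-user state setting; it only drains
// the cache queue and forwards each entry to the GPU.
void renderMainThread(int expectedFrames) {
    for (int i = 0; i < expectedFrames; ++i) {
        std::unique_lock<std::mutex> lock(queueMutex);
        queueCv.wait(lock, [] { return !cacheQueue.empty(); });
        CachedCommands cached = std::move(cacheQueue.front());
        cacheQueue.pop();
        lock.unlock();
        submitToGpu(cached);
    }
}

int main() {
    std::vector<RenderTask> tasks = {{1, "sceneA"}, {2, "sceneB"}, {3, "sceneC"}};
    std::thread mainThread(renderMainThread, static_cast<int>(tasks.size()));

    std::vector<std::thread> userThreads;
    for (const auto& t : tasks) userThreads.emplace_back(userThread, t);  // one user thread per user

    for (auto& t : userThreads) t.join();
    mainThread.join();
    return 0;
}
```

In this arrangement the per-user state configuration runs concurrently in the user threads, so the main thread is reduced to queue management and GPU submission, which is the load-shifting effect the claims describe.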
PCT/CN2016/112185 2016-12-26 2016-12-26 Rendering method and device WO2018119602A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201680006926.9A CN107223264B (en) 2016-12-26 2016-12-26 Rendering method and device
PCT/CN2016/112185 WO2018119602A1 (en) 2016-12-26 2016-12-26 Rendering method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/112185 WO2018119602A1 (en) 2016-12-26 2016-12-26 Rendering method and device

Publications (1)

Publication Number Publication Date
WO2018119602A1 true WO2018119602A1 (en) 2018-07-05

Family

ID=59928219

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/112185 WO2018119602A1 (en) 2016-12-26 2016-12-26 Rendering method and device

Country Status (2)

Country Link
CN (1) CN107223264B (en)
WO (1) WO2018119602A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509240B (en) * 2018-03-09 2020-08-11 Oppo广东移动通信有限公司 Picture drawing method and related product
CN108898644B (en) * 2018-06-22 2022-09-16 北京佳格天地科技有限公司 Dynamic rendering method, system and storage medium for raster data
CN111508055B (en) * 2019-01-30 2023-04-11 华为技术有限公司 Rendering method and device
CN111754381B (en) * 2019-03-26 2024-06-25 华为技术有限公司 Graphics rendering method, apparatus, and computer-readable storage medium
CN111739136B (en) * 2019-06-14 2022-04-15 腾讯科技(深圳)有限公司 Rendering method, computer device, and storage medium
CN110751592A (en) * 2019-08-21 2020-02-04 北京达佳互联信息技术有限公司 Graphic resource conversion method, apparatus, electronic device and storage medium
CN110555900B (en) * 2019-09-05 2023-11-17 网易(杭州)网络有限公司 Rendering instruction processing method and device, storage medium and electronic equipment
CN111210381B (en) * 2019-12-31 2023-07-25 广州市百果园信息技术有限公司 Data processing method, device, terminal equipment and computer readable medium
CN113838180A (en) * 2020-06-24 2021-12-24 华为技术有限公司 Rendering instruction processing method and related equipment thereof
CN114528090A (en) * 2020-11-06 2022-05-24 华为技术有限公司 Vulkan-based method for realizing graphic rendering and related device
CN114494546A (en) * 2020-11-13 2022-05-13 华为技术有限公司 Data processing method and device and electronic equipment
CN112346890B (en) * 2020-11-13 2024-03-29 武汉蓝星科技股份有限公司 Off-screen rendering method and system for complex graphics

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103399729A (en) * 2013-06-28 2013-11-20 广州市动景计算机科技有限公司 Processing method, processing device and processor of HTML5 Canvas application
US20150091931A1 (en) * 2013-10-02 2015-04-02 Microsoft Corporation Procedurally Defined Texture Maps
CN105869106A (en) * 2016-04-27 2016-08-17 中国电子科技集团公司第二十八研究所 Improved method for drawing three-dimensional entity cloud
CN106060655A (en) * 2016-08-04 2016-10-26 腾讯科技(深圳)有限公司 Video processing method, server and terminal

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI322354B (en) * 2005-10-18 2010-03-21 Via Tech Inc Method and system for deferred command issuing in a computer system
CN100353383C (en) * 2005-11-16 2007-12-05 华中科技大学 Three-D visual method based on image
US8207972B2 (en) * 2006-12-22 2012-06-26 Qualcomm Incorporated Quick pixel rendering processing
CN102147722B (en) * 2011-04-08 2016-01-20 深圳中微电科技有限公司 Realize multiline procedure processor and the method for central processing unit and graphic process unit function
US9411715B2 (en) * 2012-12-12 2016-08-09 Nvidia Corporation System, method, and computer program product for optimizing the management of thread stack memory
CN105741227A (en) * 2016-01-26 2016-07-06 网易(杭州)网络有限公司 Rending method and apparatus

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064538B (en) * 2018-08-01 2023-05-12 Oppo广东移动通信有限公司 View rendering method and device, storage medium and intelligent terminal
CN109064538A (en) * 2018-08-01 2018-12-21 Oppo广东移动通信有限公司 View rendering method, apparatus, storage medium and intelligent terminal
CN111696186B (en) * 2019-02-27 2023-09-26 杭州海康威视***技术有限公司 Interface rendering method and device
CN111696186A (en) * 2019-02-27 2020-09-22 杭州海康威视***技术有限公司 Interface rendering method and device
CN110377258B (en) * 2019-07-17 2023-05-02 Oppo广东移动通信有限公司 Image rendering method and device, electronic equipment and storage medium
CN110377258A (en) * 2019-07-17 2019-10-25 Oppo广东移动通信有限公司 Image rendering method, device, electronic equipment and storage medium
CN110659094A (en) * 2019-09-11 2020-01-07 北京达佳互联信息技术有限公司 Object rendering and control method, device, equipment and medium thereof
CN114064039A (en) * 2020-12-22 2022-02-18 完美世界(北京)软件科技发展有限公司 Rendering pipeline creating method and device, storage medium and computing equipment
CN113730922A (en) * 2021-09-03 2021-12-03 网易(杭州)网络有限公司 Graph rendering method and device, electronic equipment and storage medium
CN113730922B (en) * 2021-09-03 2024-06-04 网易(杭州)网络有限公司 Graphics rendering method, graphics rendering device, electronic equipment and storage medium
CN116661939A (en) * 2023-07-31 2023-08-29 北京趋动智能科技有限公司 Page rendering method and device, storage medium and electronic equipment
CN117369936A (en) * 2023-12-04 2024-01-09 武汉凌久微电子有限公司 Display content rendering method and rendering system
CN117369936B (en) * 2023-12-04 2024-03-08 武汉凌久微电子有限公司 Display content rendering method and rendering system

Also Published As

Publication number Publication date
CN107223264B (en) 2022-07-08
CN107223264A (en) 2017-09-29

Similar Documents

Publication Publication Date Title
WO2018119602A1 (en) Rendering method and device
US10565112B2 (en) Relay consistent memory management in a multiple processor system
JP5738998B2 (en) Inter-processor communication technique for multiple processor computing platforms
US9582463B2 (en) Heterogeneous input/output (I/O) using remote direct memory access (RDMA) and active message
US8638336B2 (en) Methods and systems for remoting three dimensional graphical data
US10555010B2 (en) Network-enabled graphics processing module
WO2017049945A1 (en) Accelerator virtualization method and apparatus, and centralized resource manager
US9384583B2 (en) Network distributed physics computations
WO2015165298A1 (en) Computer, control device and data processing method
US10116746B2 (en) Data storage method and network interface card
US8849905B2 (en) Centralized computing
WO2022089592A1 (en) Graphics rendering method and related device thereof
JP2008538829A (en) Method and apparatus for updating a graphic display in a distributed processing environment using compression
US8522254B2 (en) Programmable integrated processor blocks
WO2018119786A1 (en) Method and apparatus for processing display data
WO2015078156A1 (en) Method, device and system for processing graphics data
WO2022095808A1 (en) Method for implementing graphics rendering on basis of vulkan, and related apparatus
JP7461895B2 (en) Network Packet Templating for GPU-Driven Communication
US20230120934A1 (en) Gpu networking using an integrated command processor
US11354258B1 (en) Control plane operation at distributed computing system
JP6788691B2 (en) Improved throughput in OpenFabrics
Agostini et al. GPUDirect Async: Exploring GPU synchronous communication techniques for InfiniBand clusters
CN112423111A (en) Graphic engine and graphic processing method suitable for player
JP2014503898A (en) Method and system for synchronous operation of processing equipment
CN114296916B (en) Method, device and medium for improving RDMA release performance

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16925300

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.10.2019)

122 Ep: pct application non-entry in european phase

Ref document number: 16925300

Country of ref document: EP

Kind code of ref document: A1