CN116974544A - Method, apparatus, device, medium and program product for processing media data - Google Patents

Method, apparatus, device, medium and program product for processing media data

Info

Publication number
CN116974544A
CN116974544A (application CN202310269524.XA)
Authority
CN
China
Prior art keywords: code block, media data, code, model, predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310269524.XA
Other languages
Chinese (zh)
Inventor
陈琼雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202310269524.XA
Publication of CN116974544A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/30 Creation or generation of source code
    • G06F 8/34 Graphical or visual programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/445 Program loading or initiating
    • G06F 9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/451 Execution arrangements for user interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses a method, apparatus, device, medium and program product for processing media data, relating to the field of computer technology and in particular to the field of media data processing. The method comprises the following steps: displaying a visual programming interface, the visual programming interface comprising a code block display area, wherein the code block display area comprises a first code block of a prediction model and a second code block having a logical relationship with the first code block; in response to an execution instruction for the first code block, obtaining media data to be predicted and executing the first code block to predict the media data to be predicted, obtaining the category of the media data to be predicted; and executing the second code block based on the category. The programming functionality of a graphical programming tool can thereby be increased.

Description

Method, apparatus, device, medium and program product for processing media data
Technical Field
The present application relates generally to the field of computer technology, in particular to the field of artificial intelligence technology, and more particularly to a method, apparatus, device, medium, and program product for processing media data.
Background
With the continuous development of programming technology, graphical programming tools such as Blockly and Scratch have appeared on the market. Their core idea is to package text-based program code into code blocks, so that programming can be completed simply and quickly by combining and splicing code blocks with the corresponding functions in a visual code editing interface.
However, existing graphical editing tools can only provide code blocks for basic functions, which limits the functionality of graphical programming tools to some extent.
Disclosure of Invention
In view of the foregoing drawbacks or shortcomings of the prior art, it is desirable to provide a method, apparatus, device, medium and program product for processing media data that can increase the programming functionality of a graphical programming tool.
In a first aspect, the present application provides a method for processing media data, the method comprising: displaying a visual programming interface, the visual programming interface comprising a code block display area, wherein the code block display area comprises a first code block of a prediction model and a second code block having a logical relationship with the first code block; in response to an execution instruction for the first code block, obtaining media data to be predicted and executing the first code block to predict the media data to be predicted, obtaining the category of the media data to be predicted; and executing the second code block based on the category.
In a second aspect, the present application provides a processing apparatus for media data, the processing apparatus for media data comprising:
a display module, configured to display a visual programming interface, the visual programming interface comprising a code block display area, wherein the code block display area comprises a first code block of a prediction model and a second code block having a logical relationship with the first code block;
an identification module, configured to obtain media data to be predicted in response to an execution instruction for the first code block, and to execute the first code block to predict the media data to be predicted, obtaining the category of the media data to be predicted;
and a processing module, configured to execute the second code block based on the category.
In one embodiment of the application, the second code block includes control parameters of the virtual object.
The processing module is specifically configured to determine the control parameters of the virtual object according to the category of the media data to be predicted and to control the virtual object to display the gesture corresponding to the control parameters.
In one embodiment of the application, the visual programming interface further comprises configuration items of the predictive model.
The display module is further configured to receive a configuration operation on the configuration items of the prediction model and to display the first code block in the code block display area.
In one embodiment of the application, the configuration items of the prediction model comprise a machine learning control and configuration items of a lower-level menu; the display module is specifically configured to:
display the configuration items of the lower-level menu in response to a triggering operation on the machine learning control;
and receive a configuration operation on the configuration items of the lower-level menu and display the first code block in the code block display area.
In one embodiment of the application, the configuration items of the lower-level menu comprise code blocks of at least one candidate model. The display module is specifically configured to:
determine a code block of a target model among the code blocks of the at least one candidate model in response to a selection operation on the code blocks of the at least one candidate model;
and receive configuration operations on a first display area and a second display area in the code block of the target model to obtain the first code block, and display the first code block in the code block display area, wherein the first display area is used for displaying an identifier of the prediction model, and the second display area is used for displaying the data type of the media data to be predicted.
In one embodiment of the present application, the display module is further configured to display a model training interface in response to an addition operation on a prediction model in the visual programming interface;
and to receive a sample addition operation on the model training interface and display a training sample set, wherein the training sample set comprises at least one piece of media data and a label corresponding to the media data and is used for training the prediction model.
In one embodiment of the application, a training module is configured to train a prediction model to be trained using the sample set in the model training interface, obtaining the prediction model;
and the processing module is configured to transmit the data packet of the prediction model to a database corresponding to the visual programming interface through a preset calling interface.
In one embodiment of the application, the identification module is specifically configured to:
call the first code block according to the calling function of the first code block and transmit the media data to be predicted to the first code block;
and run the code program of the prediction model with the first code block to predict the media data to be predicted, obtaining the category of the media data to be predicted.
In a third aspect, embodiments of the present application provide a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing a method as described in embodiments of the present application when executing the program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described in embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product comprising instructions which, when executed, cause a method as described in embodiments of the present application to be performed.
In the processing method, apparatus, device, medium and program product for media data provided by the application, because existing graphical editing tools can only provide code blocks with basic functions, the programming functionality of graphical programming tools is limited to a certain extent. To increase the variety of code blocks in a graphical editing tool, the application provides, in the graphical editing tool, a code block with a media data prediction function: when media data needs to be predicted, the code program of a prediction model can be run through this code block to predict the category of the media data. Specifically, a visual programming interface is displayed, the visual programming interface comprising a code block display area with a first code block of a prediction model and a second code block having a logical relationship with the first code block; in response to an execution instruction for the first code block, media data to be predicted is obtained, and the first code block is executed to predict the media data to be predicted, obtaining its category; the second code block is then run based on the category. By packaging the code program that performs category prediction on media data into a code block, a user can easily complete a category prediction design for media data in the graphical editing tool, even without programming a prediction model, which increases the functionality and diversity of code programming in the graphical editing tool.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
fig. 1 is a schematic structural diagram of a media data processing system according to an embodiment of the present application;
fig. 2 is a flow chart of a method for processing media data according to an embodiment of the present application;
fig. 3a is a schematic diagram illustrating the effect of a code block display area according to an embodiment of the present application;
FIG. 3b is a schematic diagram illustrating an effect of another code block display area according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating the effect of a visual programming interface according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating the effect of another visual programming interface according to an embodiment of the present application;
FIG. 6 is a schematic diagram illustrating the effect of yet another visual programming interface according to an embodiment of the present application;
FIG. 7 is a schematic diagram illustrating the effect of yet another visual programming interface provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the effect of a model training interface according to an embodiment of the present application;
FIG. 9 is a schematic diagram of the effect of a model training sub-interface according to an embodiment of the present application;
FIG. 10 is a code program representation of a predictive model according to an embodiment of the application;
FIG. 11 is a schematic diagram showing the effect of another model training sub-interface according to an embodiment of the present application;
fig. 12 is a flowchart illustrating another method for processing media data according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of a media data processing device according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
The embodiment of the application relates to related applications of virtual and real scenes, and in order to better understand the scheme of the embodiment of the application, related terms and concepts possibly related to the embodiment of the application are described below.
1. Blockly
Blockly is a library that adds a visual code editor to an application. The Blockly editor uses interlocking graphical blocks to represent code concepts such as variables, logical expressions and loops. It allows users to program directly according to programming principles without having to pay attention to the details of syntax.
2. Scratch
Scratch is a simple graphical programming tool. Even a user who cannot read English words or use a keyboard can easily complete programming with it in a building-block manner. It avoids complex syntax while fully preserving programming thinking.
3. Teachable Machine
Teachable Machine is a platform for quickly and easily creating, training and running machine learning models in a browser.
4. Virtual environment
The virtual environment is the environment that an application displays (or provides) while running on a terminal. The virtual environment may be a simulation of the real world, a semi-simulated and semi-fictional environment, or a purely fictional environment. It may be a two-dimensional, 2.5-dimensional or three-dimensional virtual environment; the dimensionality of the virtual environment is not limited in the embodiments of the present application. For example, the virtual environment may include sky, land and sea, the land may include environmental elements such as deserts and cities, and the user may control a virtual object to move in the virtual environment.
5. Virtual object
A virtual object is an object in a virtual environment: a fictitious object that simulates a real object or living being, such as a character, animal, plant, oil drum, wall, stone or snowflake displayed in the virtual environment. Virtual objects include inanimate objects, e.g., virtual buildings, virtual vehicles and virtual props, and virtual characters, which are objects with a life attribute, e.g., a virtual person or a virtual animal.
Currently, existing graphical editing tools can only provide code blocks with basic functions, and the functions of the graphical programming tools are limited to a certain extent.
Based on this, the embodiments of the present application provide a method, apparatus, device, medium and program product for processing media data, capable of displaying a visual programming interface, where the visual programming interface includes a code block display area having a first code block of a prediction model and a second code block having a logical relationship with the first code block; in response to an execution instruction for the first code block, media data to be predicted is obtained, and the first code block is executed to predict the media data to be predicted, obtaining its category; the second code block is then run based on the category. In this way, a code block with a media data prediction function is displayed in the code block display area of the graphical editing tool, and when media data needs to be predicted, the code program of the prediction model can be run through this code block to predict the category of the media data.
Fig. 1 is a schematic structural diagram of a media data processing system according to an embodiment of the present application. The processing method of media data provided by the embodiment of the application can be applied to the processing system 100 of the media data. Referring to fig. 1, the processing system 100 of media data includes one or more user devices 101 and a server 102. It should be noted that although fig. 1 depicts only user equipment 101, those skilled in the art will appreciate that the present application may support any number of user equipment.
By way of example, the user device 101 may be a device including, but not limited to, a personal computer, a tablet computer, a smart phone, a vehicle mounted terminal, etc., and the embodiments of the present application are not limited thereto. The server 102 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing a basic cloud computing service of a processing technology of media data.
It may be appreciated that the processing method of media data provided by the embodiment of the present application may be executed by the server 102; accordingly, the processing means of the media data may be provided in the server 102. Of course, the processing method of media data provided by the embodiment of the present application may also be executed by the user equipment 101; accordingly, the processing means of the media data may be provided in the user equipment 101. Similarly, the processing method of media data provided by the embodiment of the present application may also be executed by the user equipment 101 and the server 102 together; accordingly, the processing means of the media data may be provided in the user equipment 101 and the server 102. The embodiment of the application does not limit the form of the device for executing the processing method of the media data.
In practical application, a user can access the web-side graphical programming tool through the user device 101 and, through the user device 101, select and combine code blocks from the selectable code blocks provided by the visual programming interface of the tool, thereby obtaining a set of program code. Illustratively, the user may select a desired code block within the code block editing area of the visual programming interface using the mouse of the user device 101 and drag the selected code block to the code block display area; the user can drag a plurality of code blocks from the code block editing area to the code block display area through multiple such operations, and the server 102 constructs the association relationships between the code blocks according to their respective drag orders or drag positions, obtaining the code block combination corresponding to a set of program code.
The following describes the technical solution of the present application and how the technical solution of the present application solves the above technical problems in detail with reference to fig. 1 in a specific embodiment. The following specific embodiments may be combined with each other and may not be described in detail in some embodiments for the same or similar concepts or processes.
As shown in fig. 2, an embodiment of the present application provides a method for processing media data, which is applicable to the user equipment 101 or the server 102 shown in fig. 1, and specifically includes the following steps:
201. the visual programming interface is presented, the visual programming interface including a code block presentation area including a first code block of the predictive model and a second code block having a logical relationship with the first code block.
Optionally, a code block is a program module carrying a program command, formed by packaging program content written in program code. It will be appreciated that the code blocks in embodiments of the present application are graphical representations abstracted from a programming language; specifically, block structures are used in place of code fragments. In use, to implement a certain program command, the selected code block can be displayed in the code block display area of the visual programming interface through a selection operation such as clicking or dragging.
In one implementation, the visual programming interface is a display interface for visual programming. Visual programming refers to programming performed by clicking or dragging code blocks rather than writing code; an application created from code blocks can be used on a user device and can be a relatively independent application (APP), such as a small game or applet running on the user device.
Alternatively, the visual programming interface may be an interface in a web-side graphical programming tool. A user can access a graphical programming tool of a web terminal through user equipment, and program is written in a mode of building a code block combination in a code block display area of a visual programming interface in the graphical programming tool of the web terminal; alternatively, the user may install the client of the graphical programming tool on the user device and write the program by building a code block combination in the code block display area of the client of the graphical programming tool.
Optionally, the first code block is for running program code of the prediction model.
The program code of the prediction model run in the first code block may differ according to the prediction model selected by the user. For example, assuming there are a prediction model A and a prediction model B, if the user selects prediction model A, the first code block acquires and runs the program code file of prediction model A; the same applies to prediction model B, which is not described in detail here.
Alternatively, there may be one or more second code blocks having a logical relationship with the first code block, and the execution result of the first code block is used to generate an execution instruction of at least one of the one or more second code blocks.
That is, the logical relationship of the first code block to the second code block can characterize the order of execution between the first code block and the second code block, and the result of execution of the first code block can indicate whether the second code block is executed.
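To make this logical relationship concrete, below is a minimal TypeScript sketch of one way the first and second code blocks and the execution gating between them could be represented. All names here (CodeBlock, RunContext, nextId and so on) are illustrative assumptions, not the patent's actual data model.

```typescript
// Illustrative block model: the first (prediction) block writes the predicted
// category into a shared context; downstream blocks read it to decide whether to run.
type BlockKind = 'trigger' | 'predict' | 'action';

interface RunContext {
  predictedCategory?: string; // category produced by the first code block
}

interface CodeBlock {
  id: string;                 // unique identification information of the block
  kind: BlockKind;
  nextId?: string;            // id of the block spliced below this one
  run(ctx: RunContext): Promise<void>;
}

// Execute a spliced chain of blocks in order; the execution result of the
// first code block (the category in ctx) gates the second code block(s).
async function runChain(blocks: Map<string, CodeBlock>, startId: string): Promise<void> {
  const ctx: RunContext = {};
  let current = blocks.get(startId);
  while (current) {
    await current.run(ctx);
    current = current.nextId ? blocks.get(current.nextId) : undefined;
  }
}
```

Under this sketch, each second code block inspects ctx.predictedCategory inside its run method, which matches the gating behaviour described above.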
As an example, referring to fig. 3a, a visual programming interface 30 is illustrated, the visual programming interface 30 comprising a code block display area 31, and the code block display area comprising a first code block 311 and a second code block 312; there may be one or more second code blocks 312, of which only one is shown by way of example in fig. 3a.
202. And responding to the execution instruction of the first code block, acquiring media data to be predicted, and executing the first code block to predict the media data to be predicted to obtain the category of the media data to be predicted.
Optionally, the media data may be image data, audio data, video data or the like. An image may be a photograph or a frame of a video. The media data may be captured in real time or stored in advance. According to its source, media data may be divided into three data types: universal serial bus (Universal Serial Bus, USB) camera media data, virtual camera media data, and locally uploaded media data. As an example, assuming the media data is a picture, the picture may be any one of a USB camera picture, a virtual camera picture and a locally uploaded picture. It will be appreciated that the above merely illustrates three data types of media data and is not limiting, as other data types are possible.
Optionally, the first code block includes a first display area and a second display area, where the first display area is used for displaying the identifier of the prediction model and the second display area is used for displaying the preset type of the media data. Obtaining the media data to be predicted may comprise obtaining it, according to the preset type displayed in the second display area, from a media data file storing media data of that preset type.
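As a hedged illustration of this acquisition step, the sketch below models the three data types named above and fetches the media to be predicted according to the preset type shown in the second display area; the type and function names (MediaSource, getMediaToPredict) are assumptions for illustration only.

```typescript
// The three media sources described above, as a union type.
type MediaSource = 'usb-camera' | 'virtual-camera' | 'local-upload';

interface MediaData {
  source: MediaSource;
  payload: Blob; // a picture, an audio clip or a video frame
}

// Stands in for the media data file(s) that store captured or uploaded media.
const mediaStore = new Map<MediaSource, MediaData>();

// Fetch the media to be predicted according to the preset type displayed
// in the second display area of the first code block.
function getMediaToPredict(presetType: MediaSource): MediaData {
  const media = mediaStore.get(presetType);
  if (!media) {
    throw new Error(`no ${presetType} media has been captured or uploaded yet`);
  }
  return media;
}
```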
In practical applications, the execution instruction of the first code block is determined by the execution position of the first code block within its code block program group. When the first code block is in the first position of the execution order, its execution instruction can be generated by triggering the run control of the code block display area; when the first code block is at a later position in the execution order, the preceding code block can call the calling function of the first code block to generate the execution instruction of the first code block. The code blocks in a code block program group follow execution logic.
In one possible implementation, the code block program group is a code block set formed by splicing trigger code blocks and execution code blocks; after the trigger condition of the trigger code block in the group is met, the execution code blocks in the group are executed in sequence.
It should be understood that a trigger code block is a code block arranged at the head of the code block program group and used to trigger execution of the group. For example, the code blocks shown in fig. 3b include a trigger code block, a first code block and a second code block; the trigger code block displays the words "when starting to run", indicating that execution of the whole code program group starts from this block, and the first code block and the second code block are both execution code blocks. When the trigger code block is executed, the execution code blocks spliced after it are executed in turn. An execution code block is a code block that implements a specific operation.
It should be understood that each code block in the engineering file, whether a trigger code block or an execution code block, is responsive, and the call relationships between code blocks are written into the code blocks according to their splice order (i.e., the execution logic order). For example, the unique identification information of the next code block may be written in the next field of the current code block; or the unique identification information of an embedded-call code block may be written in an input field of the current code block. The call relationships characterize the execution logic order among the code blocks.
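The following sketch shows what such a serialized engineering file could look like for the "when starting to run" example of fig. 3b and fig. 4; the shape loosely mirrors Blockly-style serialization, and the block type strings are assumptions.

```typescript
// next holds the id of the block spliced below; inputs hold the ids of
// embedded-call blocks, keyed by input name.
interface SerializedBlock {
  id: string;
  type: string;
  next?: { id: string };
  inputs?: Record<string, { id: string }>;
}

// trigger -> capture -> predict -> if/else, matching the splice order.
const engineeringFile: SerializedBlock[] = [
  { id: 'b1', type: 'event_when_run',     next: { id: 'b2' } },
  { id: 'b2', type: 'usb_camera_capture', next: { id: 'b3' } },
  { id: 'b3', type: 'ml_predict',         next: { id: 'b4' } },
  { id: 'b4', type: 'controls_if', inputs: { DO0: { id: 'b5' }, ELSE: { id: 'b6' } } },
  { id: 'b5', type: 'motion_somersault' },
  { id: 'b6', type: 'motion_nod' },
];
```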
As an example, referring to fig. 3b in conjunction with fig. 3a, a run control 310 is also shown in the code block display area 31. Assuming that code blocks are executed from top to bottom, the execution order of the code blocks shown in fig. 3b is trigger code block → first code block → second code block.
203. The second code block is executed based on the category.
Optionally, the second code block is configured to run a code program that controls the virtual object.
In one possible implementation, when there are a plurality of second code blocks, the running relationships between them may include a parallel running relationship and a mutually exclusive running relationship. The parallel running relationship means that when the category satisfies the running condition of at least one of the second code blocks having the parallel relationship, all of the second code blocks having that relationship are run; the mutually exclusive running relationship means that when the category satisfies the running condition of a certain second code block, the second code blocks that are mutually exclusive with it are not run. A sketch of this dispatch logic follows.
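A minimal sketch of the dispatch under these two running relationships might look as follows; the split into parallel and exclusive collections is an illustrative assumption about how the relationships could be recorded.

```typescript
interface SecondBlock {
  runsWhen(category: string): boolean; // running condition of the block
  run(): void;
}

function dispatch(category: string, parallel: SecondBlock[], exclusive: SecondBlock[]): void {
  // Parallel relationship: if the category satisfies the running condition of
  // at least one block in the group, all blocks in the group are run.
  if (parallel.some((b) => b.runsWhen(category))) {
    parallel.forEach((b) => b.run());
  }
  // Mutually exclusive relationship: run only the block whose condition is
  // met; the blocks mutually exclusive with it are not run.
  const winner = exclusive.find((b) => b.runsWhen(category));
  winner?.run();
}
```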
In the processing method for media data provided by the embodiment of the application, because existing graphical editing tools can only provide code blocks with basic functions, the programming functionality of graphical programming tools is limited to a certain extent. To increase the variety of code blocks in a graphical editing tool, the application provides, in the graphical editing tool, a code block with a media data prediction function: when media data needs to be predicted, the code program of a prediction model can be run through this code block to predict the category of the media data. Specifically, a visual programming interface is displayed, the visual programming interface comprising a code block display area with a first code block of a prediction model and a second code block having a logical relationship with the first code block; in response to an execution instruction for the first code block, media data to be predicted is obtained, and the first code block is executed to predict the media data to be predicted, obtaining its category; the second code block is then run based on the category. By packaging the code program that performs category prediction on media data into a code block, a user can easily complete a category prediction design for media data in the graphical editing tool, even without programming a prediction model, which increases the functionality and diversity of code programming in the graphical editing tool.
In a possible implementation of the present application, a virtual object in the virtual environment can be controlled according to the execution of the second code block. To this end, the second code block includes control parameters of the virtual object, and executing the second code block based on the category comprises: determining the control parameters of the virtual object according to the category of the media data to be predicted, and controlling the virtual object to display the gesture corresponding to the control parameters.
Optionally, according to the category of the media data to be predicted, determining a second code block having a logical relationship with the category, and controlling the virtual object to display the gesture corresponding to the control parameter according to the control parameter of the virtual object in the second code block.
The gesture of the virtual object may be dynamic or static: a dynamic gesture may be, for example, an action that the control parameters cause the virtual object to display, and a static gesture may be a state that the control parameters cause the virtual object to display.
In one possible implementation, the visual programming interface may further include a virtual environment display area containing the virtual object, and when the second code block is executed based on the category of the media data to be predicted, the gesture of the virtual object determined by the control parameters is displayed in the virtual environment display area.
For example, referring to FIG. 4, a code block display area 31 and a virtual environment display area 40 are presented in the visual programming interface 30. Eight code blocks are shown in the code block display area 31: code block 11, code block 12, code block 13, code block 14, code block 15, code block 16, code block 17 and code block 18. Code block 11 is the aforementioned trigger code block and displays the words "when starting to run", indicating that execution of the whole code program group starts from this block. Code block 12 displays the words "delay 1 second (automatic), the USB camera photographs and the result is stored as a USB camera picture", indicating that when this block runs, the USB camera takes a photograph after a 1-second delay and the photograph is stored as a USB camera picture; the USB camera picture is the media content to be predicted, and "delay 1 second (automatic)" is a photographing configuration item. Code block 13 is the aforementioned first code block and displays the words "photo machine learning model 11111 predicts the USB camera picture". Code block 14 displays the words "if the prediction result of the photo machine learning model 11111 on the USB camera picture is class1". Code block 15 displays the word "somersault". Code block 16 displays the word "otherwise", and code block 17 displays the word "nod". Code block 18 is used to add new code blocks in the code block display area 31. A virtual object 41 (e.g., the machine dog shown in fig. 4) is presented in the virtual environment display area 40.
It will be appreciated that code block 15 and code block 17 are both second code blocks, and code block 14 and code block 16 determine whether code block 15 or code block 17 is executed. If the result of code block 14 is yes, that is, the prediction result of code block 13, which uses the prediction model identified as "11111" on the picture of the USB camera picture type (i.e., the media data to be predicted), is determined to be "class1", then code block 15 is executed; if the result of code block 14 is no, that is, the prediction result is determined not to be "class1", then code block 16 is executed, followed by code block 17. The words displayed in the second code blocks shown in fig. 4 reveal the logical relationships between the second code blocks and the first code block, as well as among the second code blocks themselves.
Based on the above, in connection with fig. 4, in a virtual scene (such as a game scene), when code block 14 determines that the prediction result obtained by code block 13 is "class1", code block 15 is executed and the virtual object 41 (i.e., the machine dog shown in fig. 4) is controlled to somersault; code block 15 contains the control parameters for controlling the machine dog's somersault. When code block 14 determines that the prediction result obtained by code block 13 is not "class1", code block 17 is executed on the basis of executing code block 16, and the virtual object 41 (i.e., the machine dog shown in fig. 4) is controlled to nod; code block 17 contains the control parameters for controlling the machine dog's nod.
It should be noted that the region of code block 12 displaying the words "delay 1 second (automatic)" includes a drop-down box control. Triggering the drop-down box control displays one or more photographing configuration items for configuring USB camera photographing, for example a delay configuration or a definition configuration, which is not limited in the embodiment of the present application. Triggering any one of the photographing configuration items displays the content of that item in the region where the words "delay 1 second (automatic)" are displayed.
In addition, in code block 13, the words "11111" displayed in the first display area represent the identifier of the prediction model, and the words "USB camera picture" displayed in the second display area represent the type of the media data to be predicted. A sketch of the program this block combination could compile to is given below.
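Putting the blocks of fig. 4 together, the block combination could compile to a program along the following lines; captureUsbPhoto, predictWithModel, somersault and nod are hypothetical helpers standing in for code blocks 12 to 17, not functions disclosed by the patent.

```typescript
// Hypothetical helpers assumed for this sketch.
declare function captureUsbPhoto(opts: { delayMs: number }): Promise<Blob>;
declare function predictWithModel(modelId: string, picture: Blob): Promise<string>;
declare function somersault(): Promise<void>;
declare function nod(): Promise<void>;

async function onRun(): Promise<void> {                      // block 11: 'when starting to run'
  const picture = await captureUsbPhoto({ delayMs: 1000 });  // block 12: delayed USB capture
  const category = await predictWithModel('11111', picture); // block 13: the first code block
  if (category === 'class1') {                               // block 14
    await somersault();                                      // block 15: second code block
  } else {                                                   // block 16
    await nod();                                             // block 17: second code block
  }
}
```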
In another possible implementation, a virtual environment presentation control may also be included in the visual programming interface, the virtual environment presentation control being used to trigger the presentation of the virtual environment.
Illustratively, referring to fig. 5 (a) in conjunction with fig. 4, a virtual environment presentation control 50 is also presented in the visual programming interface 30; upon triggering the virtual environment presentation control 50, the virtual scene presentation interface shown in fig. 5 (b) is presented over the presentation area of the visual programming interface. In addition, a code editing presentation control 51 is shown in fig. 5 (a); upon triggering the code editing presentation control 51, the visual programming interface 30 is presented.
It is understood that the frames presented in the virtual scene presentation area of the visual programming interface may be frames presented in the virtual scene presentation interface.
In this embodiment, the control parameters for controlling the virtual object can be obtained through the execution of the second code block, so that the virtual object is controlled to display the gesture corresponding to the control parameters, and the interest of visual programming is improved.
In one possible implementation of the present application, the visual programming interface further includes a configuration item of the predictive model, and the method further includes: a configuration operation of a configuration item of the prediction model is received, and a first code block is displayed in a code block display area.
Optionally, a configuration operation of at least one configuration item of the predictive model is received, and the first code block is displayed in the code block display area.
Wherein the at least one configuration item may be a selection item and/or an input item.
Optionally, the configuration operation may be an input operation and/or a selection operation of a configuration item; wherein the selection operation may be a click operation and/or a drag operation.
Here, the input operation may be a character input operation, or may be a voice input operation, and when the input operation is a character input operation, the input character may be directly determined by the character input operation; when the input operation is a voice input operation, after the input voice information is obtained, the voice information can be converted into characters to obtain the input characters.
The click operation may refer to any one of a first touch operation, a first cursor operation, and a key operation. The first touch operation may be a touch click operation, a touch press operation or a touch slide operation of the target option, and the first touch operation may also be a single-point touch operation or a multi-point touch operation of the target option; the first cursor operation may be an operation of controlling the cursor to click on the target option or an operation of controlling the cursor to press the target option; the key operation may be a virtual key operation or an entity key operation corresponding to the target option.
The drag operation described above may be any one of the second touch operation and the second cursor operation. The second touch operation may be a touch drag operation on the target option, and drag the target option to the target area through touch, such as a code display area. The second cursor operation may be a click-drag operation of the target option, which is dragged to the target area by controlling the cursor.
In a possible implementation manner, the embodiment of the application can configure the first code block and configure the code blocks with other functions according to actual requirements. For example, the other functions may include one or more of an event function, a control function, an operation function, a variable function, a function, a mental function, a motion function, a perception function, an intelligent function, an extension mechanism function, an image capturing function, a keyboard event function, a voice recognition function, a face recognition function, a text recognition function, an intelligent chat function, a voice synthesis function, a translation function, and a WeChat recognition function.
Specifically, a configuration operation of a configuration item of the other function is received, and code blocks of the other function are displayed in a code display area.
As an example, in connection with the word "somersault" shown in code block 15 of fig. 4, code block 15 results from a configuration operation on a configuration item of the motion function.
In this embodiment, through a configuration operation on a configuration item of a prediction model, a first code block corresponding to the configuration operation can be displayed in a code display area.
In one embodiment of the application, an upper-level and lower-level display mode is adopted, and a lower-level menu is displayed by triggering an upper-level control, so that configuration operation of configuration items of the prediction model is realized. The configuration items of the prediction model comprise configuration items of a machine learning control and a lower menu; receiving a configuration operation of a configuration item of a prediction model, displaying a first code block in a code block display area, comprising: responding to the triggering operation of the machine learning control, and displaying configuration items of a lower menu; and receiving configuration operation of configuration items of a lower menu, and displaying the first code block in a code block display area.
Optionally, the visual programming interface includes a code block configuration area that displays at least one code block configuration control, including a machine learning control.
As an example, referring to fig. 6, the visual programming interface 30 shown in fig. 6 (a) includes a code block configuration area 32 in which a machine learning control is displayed. When the machine learning control is triggered, the lower-level menu 61 shown in fig. 6 (b) is displayed. The lower-level menu 61 includes a plurality of configuration items for machine learning, in four main categories, specifically: a configuration item 611 for adding a machine learning algorithm model, a configuration item 612 of the USB camera picture model, a configuration item 613 of the audio model, and a configuration item 614 of the gesture model.
The configuration item 611 for adding a machine learning algorithm model is used to add a prediction model, while the configuration item 612 of the USB camera picture model, the configuration item 613 of the audio model and the configuration item 614 of the gesture model are all used to configure the first code block.
For example, the configuration item 612 of the USB camera picture model includes a configuration item for a code block that predicts the USB camera picture with a picture-class prediction model, a configuration item for a code block giving the prediction result for the USB camera picture, a configuration item for a code block giving the confidence of that prediction result, and the like. It should be noted that the content of the configuration items may be set according to actual requirements, which is not limited in the embodiment of the present application. A possible declaration of such a menu is sketched below.
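Since the code blocks are later described as being implemented with the Google Blockly framework, the lower-level menu could plausibly be declared as a Blockly toolbox category, as in this sketch; the block type names and the button callback key are assumptions.

```typescript
// A toolbox category standing in for lower-level menu 61; each entry maps to
// one of the configuration items 611-614 described above.
const toolbox = {
  kind: 'categoryToolbox',
  contents: [
    {
      kind: 'category',
      name: 'Machine Learning',
      contents: [
        // Item 611: a button whose registered callback would open the model
        // training interface to add a machine learning algorithm model.
        { kind: 'button', text: 'Add machine learning algorithm model', callbackKey: 'ADD_ML_MODEL' },
        { kind: 'block', type: 'ml_predict_usb_picture' }, // item 612
        { kind: 'block', type: 'ml_predict_audio' },       // item 613
        { kind: 'block', type: 'ml_predict_gesture' },     // item 614
      ],
    },
  ],
};
```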
In one embodiment of the application, the configuration items of the lower-level menu comprise code blocks of at least one candidate model; receiving a configuration operation on the configuration items of the lower-level menu and displaying the first code block in the code block display area includes: determining a code block of a target model among the code blocks of the at least one candidate model in response to a selection operation on the code blocks of the at least one candidate model; and receiving configuration operations on a first display area and a second display area in the code block of the target model to obtain the first code block, and displaying the first code block in the code block display area, wherein the first display area is used for displaying an identifier of the prediction model, and the second display area is used for displaying the data type of the media data to be predicted.
As one example, one of the configuration items is selected in the displayed lower-level menu, e.g., the configuration item "photo machine learning model ? predicts ?" shown in fig. 6; the "?" slots of the configuration item are configured, and the configured code block is dragged to the code block display area to obtain the first code block displayed there. Alternatively, the selected configuration item may first be dragged to the code block display area, and the "?" slots of the "photo machine learning model ? predicts ?" configuration item configured there, likewise obtaining the first code block displayed in the code block display area.
In addition, configuration controls for configuring code blocks of other functions (herein denoted as other code blocks) may also be provided in the code block configuration area according to actual encoding requirements. For example, the code block configuration region may further include one or more of an event control, a control, an arithmetic control, a variable control, a function control, a mental control, a motion control, a perception control, an intelligent control, an extension mechanism control, an image capture control, a keyboard event control, a voice recognition control, a face recognition control, a text recognition control, an intelligent chat control, a voice synthesis control, a translation control, a WeChat recognition control, and the like.
Further, the user can configure the code block corresponding to the control through triggering operation of any control.
As an example, referring to fig. 7, when the event control 71 is triggered, an event interface 711 is shown. The configuration items of the at least one event code block shown in the event interface 711 may include: "when starting to run", "when broadcast Hi is received", "send broadcast Hi and wait for completion", "print", etc. Assuming the "when starting to run" code block is needed, its configuration item can be dragged into the code block display area 31 so that the "when starting to run" code block is displayed there, thereby completing the configuration operation of this event code block. The configuration operations of the other code blocks are similar and are not described in detail here.
It can be appreciated that the code blocks in the code block configuration area of the embodiment of the present application are implemented using the Google Blockly framework. The process of implementing a new code block in a page includes: using a JSON configuration to define the shape, text, color, options, output and connection points of the block; using JavaScript to obtain the parameters of the code block and generate program code, exporting the code block to a programming language (e.g., JavaScript, Python, PHP, Lua or Dart); and adding the code block to the code block configuration area.
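As a concrete illustration of this process, the sketch below defines the "photo machine learning model ? predicts ?" block with Blockly's JSON API and registers a JavaScript generator for it, assuming a recent Blockly release; the block type, field names and the generated predictWithModel helper are assumptions.

```typescript
import * as Blockly from 'blockly';
import { javascriptGenerator, Order } from 'blockly/javascript';

// JSON configuration: shape, text, fields, output and colour of the block.
Blockly.defineBlocksWithJsonArray([
  {
    type: 'ml_predict_usb_picture',
    message0: 'photo machine learning model %1 predicts %2',
    args0: [
      { type: 'field_input', name: 'MODEL_ID', text: '11111' }, // first display area
      {
        type: 'field_dropdown',
        name: 'MEDIA_TYPE',                                     // second display area
        options: [
          ['USB camera picture', 'USB'],
          ['virtual camera picture', 'VIRTUAL'],
          ['local upload picture', 'LOCAL'],
        ],
      },
    ],
    output: 'String', // the block outputs the predicted category
    colour: 230,
    tooltip: 'Predict the category of the selected media with the chosen model',
  },
]);

// Generator: read the block's parameters and emit program code.
javascriptGenerator.forBlock['ml_predict_usb_picture'] = (block) => {
  const modelId = block.getFieldValue('MODEL_ID');
  const mediaType = block.getFieldValue('MEDIA_TYPE');
  return [`await predictWithModel('${modelId}', '${mediaType}')`, Order.AWAIT];
};
```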
In the technical scheme provided by the embodiment of the application, after the first code block is obtained in the above manner, the user can further trigger an adjustment operation for the code block.
Specifically, the user may trigger a code block pre-adjustment operation on the code block through the mouse of the user device 101; for example, the user may move the mouse cursor onto the code block and press the left mouse button there, thus triggering the code block pre-adjustment operation. Accordingly, after the server 102 detects that the code block pre-adjustment operation has been triggered at the code block, the code block adjustment element (i.e., the mouse cursor) may be considered to have triggered the pre-adjustment operation at the code block; at this time, the server 102 also starts a timer and records the current position of the code block adjustment element in the code block display area of the visual programming interface as the first position.
When the timing duration of the timer reaches a preset duration threshold (e.g., 750 ms), the server 102 records the current position of the code block adjustment element in the visual programming interface of the graphical programming tool as the second position, and judges whether the second position matches the previously recorded first position; if so, it determines that an adjustment operation for the code block has been triggered. It should be appreciated that if the server 102 detects that the user stops triggering the code block pre-adjustment operation before the timer reaches the preset duration threshold, for example detects that the user releases the left mouse button, the timer is stopped accordingly and the user is not considered to have triggered an adjustment operation for the code block.
After determining that the adjustment operation for the code block is currently triggered, the server 102 may adjust a display position of the code block in the code block display area in the visual programming interface according to the drag track for the code block, and adjust an interface parameter of the code block accordingly. For example, after determining that the adjustment operation for the code block is currently triggered, the server 102 may highlight the code block, and further adjust the display position of the code block and the interface parameter of the code block accordingly in response to a drag operation for the code block triggered by the mouse.
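The long-press detection described above could be sketched as follows; the 750 ms threshold comes from the text, while the position-matching tolerance and the event wiring are assumptions (shown here running in the browser, as it would in a purely client-side variant).

```typescript
const LONG_PRESS_MS = 750; // preset duration threshold from the description
const TOLERANCE_PX = 4;    // assumed tolerance for 'matching' positions

function watchBlockAdjustment(blockEl: HTMLElement, onAdjust: () => void): void {
  blockEl.addEventListener('mousedown', (down: MouseEvent) => {
    const first = { x: down.clientX, y: down.clientY }; // first position
    let latest = first;
    const onMove = (e: MouseEvent) => { latest = { x: e.clientX, y: e.clientY }; };

    const timer = window.setTimeout(() => {
      cleanup();
      // Second position matches the first: the adjustment operation is triggered.
      if (Math.hypot(latest.x - first.x, latest.y - first.y) <= TOLERANCE_PX) {
        onAdjust();
      }
    }, LONG_PRESS_MS);

    // Releasing the button before the threshold stops the timer: no adjustment.
    const cleanup = () => {
      window.clearTimeout(timer);
      document.removeEventListener('mousemove', onMove);
      document.removeEventListener('mouseup', cleanup);
    };
    document.addEventListener('mousemove', onMove);
    document.addEventListener('mouseup', cleanup);
  });
}
```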
It should be understood that the foregoing exemplary description only illustrates the method for adjusting the code block by using the server 102, and in practical application, the method for adjusting the code block provided by the embodiment of the present application may be applicable not only to a web-side graphical programming tool, but also to a client of the graphical programming tool, and accordingly, the method for adjusting the code block provided by the embodiment of the present application may be executed by the server of the client of the graphical programming tool. In addition, the method for adjusting the code blocks provided by the embodiment of the application can also be independently executed by the user equipment supporting the operation of the graphical programming tool client, and the application scenario of the method for adjusting the code blocks provided by the embodiment of the application is not limited at all.
In this embodiment, the purpose of displaying the first code block in the code block display area is achieved by displaying the configuration items of the lower-level menu in response to the triggering operation on the machine learning control and receiving the configuration operation on those configuration items.
In one embodiment of the application, a mode of operation is also provided for entering the model training interface directly from the visual programming interface. Specifically, the model training interface is displayed in response to an addition operation on a prediction model in the visual programming interface; a sample addition operation on the model training interface is received, and a training sample set is displayed, wherein the training sample set comprises at least one piece of media data and a label corresponding to the media data and is used for training the prediction model.
Optionally, the visual programming interface includes a machine learning control and a model addition control; displaying the model training interface in response to an addition operation on the prediction model in the visual programming interface specifically includes: displaying the model addition control (e.g., the configuration item 611 for adding a machine learning algorithm model shown in fig. 6) in response to a triggering operation on the machine learning control; and displaying the model training interface in response to a triggering operation on the model addition control.
As an example, referring to fig. 8, a model training interface 80 is provided that includes three newly-added items, namely a picture item 81, an audio item 82, and a gesture item 83. By triggering any one of the three newly-added items, the model training sub-interface corresponding to that item can be displayed.
For example, after triggering the picture item 81, a model training sub-interface 90 of the picture item shown in fig. 9 is displayed. Three modules are shown in the model training sub-interface 90: a sample adding module 91, a training module 92, and a preview export module 93. The sample adding module 91 is configured to add samples and the identifiers of the labels to which the samples belong; for example, Class1 and Class2 shown in fig. 9 are both label identifiers, that is, sample categories. A sample can be added by shooting with a camera or by uploading locally. The sample adding module 91 includes a sample module 911 and an adding module 912; the sample module 911 corresponds to a sample category that has already been added, and the adding module 912 is configured to add a new sample category. The sample module 911 includes a label identifier input area containing an identifier input control 910; after the identifier input control 910 is triggered, the identifier of the label can be input in the input area.
The training module 92 provides a training model control 921 and a training level configuration item 922. By triggering the training model control 921, the prediction model to be trained can be trained on the training sample set with a preset algorithm, thereby obtaining the prediction model. The training level configuration item 922 is used to set the training accuracy of the model and may, for example, be set to a low, medium, or high level. The preview export module 93 provides a preview function and an export function for the model, and includes a model preview area 931 and a model export control 932. The model preview area 931 displays the trained prediction model; after media data to be predicted is input at a preset position of the model preview area 931 and a prediction-related control is triggered, the media data is predicted using the trained prediction model, and the prediction result is displayed.
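As an illustration of how the training level configuration item 922 could influence training, the sketch below maps the low/medium/high levels to hyperparameters and invokes a training helper when the training model control 921 is triggered. The level-to-hyperparameter mapping and the trainModel() helper are hypothetical assumptions; the embodiment only states that a preset algorithm trains on the training sample set.

    // Sketch only: the epoch/learning-rate values and trainModel() are assumptions.
    const TRAINING_LEVELS = {
      low:    { epochs: 20,  learningRate: 0.01  },
      medium: { epochs: 50,  learningRate: 0.005 },
      high:   { epochs: 100, learningRate: 0.001 },
    };

    // Called when the training model control 921 is triggered.
    async function onTrainModelClick(trainingSampleSet, level = 'medium') {
      const { epochs, learningRate } = TRAINING_LEVELS[level];
      // trainModel is a hypothetical helper wrapping the preset training algorithm.
      return trainModel(trainingSampleSet, { epochs, learningRate });
    }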
In addition, the model training interface shown in the embodiment of the present application can also display an already trained model, such as "item 1" under "my items" shown in fig. 8.
In this embodiment, the model training interface is displayed when a prediction model is newly added in the visual programming interface; a sample adding operation of the model training interface is then received, and a training sample set is displayed, wherein the training sample set includes at least one piece of media data and a label corresponding to the media data, and is used for training to obtain the prediction model.
In one embodiment of the application, in the model training interface, the prediction model to be trained is trained using the sample set to obtain the prediction model; and the data packet of the prediction model is transmitted to the database corresponding to the visual programming interface through a preset calling interface.
In one implementation, the prediction model for picture recognition is built based on a pre-trained lightweight neural network (MobileNet) model, the prediction model for gesture recognition is built based on a pre-trained pose estimation network (PoseNet) model, and the prediction model for audio recognition is built based on a pre-trained speech recognition network (Speech Command Recognizer) model.
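For illustration, the three pre-trained backbones could be loaded as follows with the TensorFlow.js model packages of the same names; the text names the models but not a specific library, so the package choice here is an assumption.

    // Sketch only: assumes the TensorFlow.js model packages (plus the
    // @tensorflow/tfjs peer dependency) as the source of the backbones.
    import * as mobilenet from '@tensorflow-models/mobilenet';
    import * as posenet from '@tensorflow-models/posenet';
    import * as speechCommands from '@tensorflow-models/speech-commands';

    async function loadBackbones() {
      const imageModel = await mobilenet.load();                    // picture recognition
      const poseModel = await posenet.load();                       // gesture recognition
      const audioRecognizer = speechCommands.create('BROWSER_FFT'); // audio recognition
      await audioRecognizer.ensureModelLoaded();
      return { imageModel, poseModel, audioRecognizer };
    }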
Alternatively, the line of code const app = document.querySelector('#tmApp'); obtains this node (hereinafter referred to as the tmApp node). By accessing the tmApp node, the code program of the prediction model can be obtained; part of this code program is shown in fig. 10. Basic information of the prediction model can be obtained through its code program, for example: model name, model type, model information, etc. Meanwhile, interfaces of the tmApp node can be called to implement functions such as automatic training and model export, or its callback functions can be obtained, for example a preview callback onPreview for obtaining the preview result.
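The node access could look like the following sketch; the property and callback names (modelName, modelType, modelInfo, onPreview) are assumptions based on the basic information and callbacks mentioned above, not a documented API.

    // Sketch only: the property names on the tmApp node are assumed.
    const app = document.querySelector('#tmApp');

    // Basic information of the prediction model, read off the node.
    console.log(app.modelName, app.modelType, app.modelInfo);

    // Hypothetical callback: receive each preview result as it is produced.
    app.onPreview = (result) => {
      console.log('preview prediction:', result);
    };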
As an example, referring to FIG. 11, model training sub-interface 90 also includes a model name input box 94 for custom setting model names for predictive models.
In one implementation, after a user triggers a model export control, a trained model (i.e., a predictive model) is exported to a database corresponding to the visual programming interface. This process involves a linkage between the visual programming interface and the model training interface.
Illustratively, the code program of the prediction model described above is obtained through the tmApp node, i.e., const app = document.querySelector('#tmApp');. By calling the interface code const model = app._modelHelper; (i.e., the preset calling interface), the data packet of the prediction model can be transmitted to the database corresponding to the visual programming interface.
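A sketch of this export linkage follows; _modelHelper is reconstructed from the text above, while export() and saveToDatabase() are hypothetical stand-ins for serializing the model's data packet and storing it in the database of the visual programming interface.

    // Sketch only: export() and saveToDatabase() are assumed helpers.
    const app = document.querySelector('#tmApp');
    const model = app._modelHelper;        // the preset calling interface

    async function exportModelPacket() {
      const packet = await model.export(); // hypothetical: serialize the prediction model
      await saveToDatabase(packet);        // hypothetical: store it for the visual programming interface
    }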
In this embodiment, after a prediction model is obtained by training a training sample set, a data packet of the prediction model is transmitted from a model training interface to a visual programming interface through a preset calling interface, so that data linkage transmission of the model training interface and the visual programming interface is realized.
In one embodiment of the present application, executing a first code block to predict media data to be predicted to obtain a category of the media data to be predicted includes: calling the first code block according to the calling function of the first code block, and transmitting media data to be predicted to the first code block; and running a code program of the prediction model by using the first code block, and predicting the media data to be predicted to obtain the category of the media data to be predicted.
Alternatively, the code program of the first code block can be executed through the calling function of the first code block. When the code program of the first code block runs, the code program of the prediction model is acquired from the database corresponding to the visual programming interface according to the identification of the prediction model displayed in the first display area of the first code block, and the media data to be predicted is acquired from the corresponding data type library according to the data type of the media data to be predicted displayed in the second display area of the first code block. The code program of the prediction model is then run by the first code block to predict the media data to be predicted and obtain its category.
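Put together, executing the first code block could look like the following sketch, where loadModelById() and loadMediaByType() are hypothetical helpers standing in for the database and data type library lookups described above.

    // Sketch only: the helper names and the block fields are assumptions.
    async function runFirstCodeBlock(block) {
      // First display area: identification of the prediction model.
      const model = await loadModelById(block.modelId);
      // Second display area: data type of the media data to be predicted.
      const media = await loadMediaByType(block.dataType);
      // Run the prediction model's code program on the media data.
      const category = await model.predict(media);
      return category;
    }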
For better understanding, a method for processing media data according to an embodiment of the present application is described below with reference to fig. 12.
S11, responding to the display operation of the visual programming interface, and displaying the visual programming interface; the visual programming interface comprises a machine learning control;
S12, responding to the triggering operation of the machine learning control, and displaying a machine learning list, wherein the machine learning list includes a configuration item for adding a machine learning algorithm model.
The machine learning list is the lower menu 61 shown in fig. 6.
S13, receiving the configuration operation of the configuration item for adding the machine learning algorithm model.
S14, displaying a model training interface, wherein the model training interface includes at least one newly-added item.
The newly-added item includes one of a picture item, an audio item, and a gesture item.
S15, responding to the triggering operation of a target item in the at least one newly-added item, and displaying a model training sub-interface.
S16, beginning to input samples.
S17, judging whether a new category is needed. If yes, jump to S18; if not, go to S19.
S18, adding a new category.
S19, judging whether to input the sample by the camera. If yes, jump to S20; if not, go to S21.
S20, inputting a sample by the camera. Jump to S22.
S21, uploading the sample locally. Jump to S22.
S22, judging whether the input is completed. If yes, jump to S23; if not, go to S19.
S23, setting training parameters.
The training parameters include the identifiers corresponding to the labels of the added samples.
S24, responding to the triggering operation of the training model control, and starting model training.
S25, training is finished.
S26, starting a preview mode of the prediction model.
Specifically, the preview mode of the prediction model is turned on in the model preview area 931.
S27, judging whether to input the media data by the camera. If yes, jump to S28; if not, go to S29.
S28, the camera inputs media data. Jump to S30.
S29, uploading the media data locally. Jump to S30.
S30, judging whether the input is completed. If yes, jump to S31; if not, go to S27.
S31, outputting a prediction result.
The prediction result is the prediction result of the media data input in S28 or S29.
S32, judging whether the prediction result is reasonable. If yes, jump to S33; if not, go to S16.
S33, judging whether the input samples need to be modified. If yes, jump to S16; if not, go to S34.
S34, judging whether the training parameters need to be modified. If yes, jump to S23; if not, go to S35.
S35, a model file of the prediction model is exported.
Specifically, a model file of the predictive model is exported to a database of the visual programming interface. The model file is a data packet of the prediction model.
S36, obtaining the identification of the prediction model from the model file.
S37, dynamically adding the identification of the prediction model into the code blocks of the candidate models of the corresponding data types in the lower menu.
S38, responding to the configuration operation of the code block of the target model, and displaying the first code block in the code block display area.
Specifically, the configuration operation on the code blocks of the target model includes determining the code blocks of the target model among the code blocks of the at least one candidate model in response to a selected operation of the code blocks of the at least one candidate model; and receiving configuration operations of a first display area and a second display area in a code block of the target model, obtaining the first code block, displaying the first code block in a code block display area, wherein the first display area is used for displaying an identification of the prediction model, and the second display area is used for displaying a data type of media data to be predicted.
S39, receiving the execution operation of the first code block.
S40, executing the first code block to acquire a model file and media data to be predicted.
S41, executing the first code block to conduct prediction judgment.
S42, executing the first code block to call a preset interface, and outputting a prediction result.
It should be noted that, where a step above does not indicate the step to be executed after it, the next step adjacent to the current step is executed by default.
Wherein, S11-S13 and S35-S42 are all executed in the visual programming interface; S14-S34 are all performed in the model training interface.
It will be appreciated that the steps performed in fig. 12 are only briefly described here; for the details of each step, reference may be made to the specific description in the foregoing embodiments.
It should be noted that although the operations of the method of the present application are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order or that all of the illustrated operations be performed in order to achieve desirable results.
Fig. 13 is a block diagram of a media data processing device according to an embodiment of the application.
As shown in fig. 13, the processing apparatus of media data includes: a presentation module 1301, an identification module 1302, a processing module 1303, and a training module 1304, wherein,
the display module 1301 is configured to display a visual programming interface, where the visual programming interface includes a code block display area, and the code block display area includes a first code block of the prediction model and a second code block having a logical relationship with the first code block.
The identification module 1302 is configured to obtain media data to be predicted in response to the execution instruction of the first code block, and execute the first code block to predict the media data to be predicted to obtain the category of the media data to be predicted;
the processing module 1303 is configured to execute the second code block based on the category.
In one embodiment of the application, the second code block includes control parameters of the virtual object.
The processing module 1303 is specifically configured to determine a control parameter of the virtual object according to a class of the media data to be predicted, and control the virtual object to display a gesture corresponding to the control parameter.
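For example, the mapping from a predicted category to control parameters could be sketched as below; the category names reuse the Class1/Class2 labels of fig. 9, while the pose values and the showPose() API are assumptions made for this example.

    // Sketch only: the pose mapping and the virtual-object API are assumed.
    const POSE_BY_CATEGORY = {
      Class1: { pose: 'wave' },
      Class2: { pose: 'jump' },
    };

    function runSecondCodeBlock(virtualObject, category) {
      const controlParams = POSE_BY_CATEGORY[category];
      // Control the virtual object to display the corresponding gesture.
      if (controlParams) virtualObject.showPose(controlParams.pose);
    }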
In one embodiment of the application, the visual programming interface further comprises configuration items of the predictive model.
The display module 1301 is further configured to receive a configuration operation of a configuration item of the prediction model, and display the first code block in the code block display area.
In one embodiment of the application, the configuration items of the prediction model include a machine learning control and configuration items of a lower menu. The display module 1301 is specifically configured to:
display the configuration items of the lower menu in response to the triggering operation of the machine learning control; and
receive the configuration operation of the configuration items of the lower menu, and display the first code block in the code block display area.
In one embodiment of the application, the configuration items of the lower menu include code blocks of at least one candidate model. The display module 1301 is specifically configured to:
determine the code block of the target model among the code blocks of the at least one candidate model in response to a selected operation on the code blocks of the at least one candidate model; and
receive configuration operations on a first display area and a second display area in the code block of the target model to obtain the first code block, and display the first code block in the code block display area, wherein the first display area is used for displaying the identification of the prediction model, and the second display area is used for displaying the data type of the media data to be predicted.
In one embodiment of the present application, the display module 1301 is further configured to display the model training interface in response to a new-model operation in the visual programming interface, and to receive a sample adding operation of the model training interface and display a training sample set, wherein the training sample set includes at least one piece of media data and a label corresponding to the media data, and is used for training to obtain the prediction model.
In one embodiment of the present application, the training module 1304 is configured to train, in a model training interface, a prediction model to be trained using a sample set to obtain a prediction model;
the processing module 1303 is configured to transmit the data packet of the prediction model to the database corresponding to the visual programming interface through the preset calling interface.
In one embodiment of the present application, the identification module 1302 is specifically configured to:
call the first code block according to the calling function of the first code block, and transmit the media data to be predicted to the first code block; and
run the code program of the prediction model by the first code block to predict the media data to be predicted and obtain the category of the media data to be predicted.
Regarding the media data processing device provided by the application: because existing graphical editing tools can only provide code blocks with basic functions, the programming functions of graphical programming tools are limited to a certain extent. To increase the variety of code blocks in the graphical editing tool, the application provides a code block with a media data prediction function; when media data needs to be predicted, category prediction of the media data can be realized by running the code program of the prediction model through the code block. Specifically, a visual programming interface is displayed, which includes a code block display area with a first code block of a prediction model and a second code block having a logical relationship with the first code block; in response to an execution instruction for the first code block, media data to be predicted is obtained, and the first code block is executed to predict the media data to be predicted, obtaining its category; the second code block is then run based on the category. By packaging the code program for category prediction of media data into a code block, a user can easily complete the category prediction design for media data in the graphical editing tool even without programming the prediction model, which increases the functionality and diversity of code programming in graphical editing tools.
It will be appreciated that the units recited in the processing device of the media data correspond to the individual steps in the method described with reference to fig. 2. Thus, the operations and features described above with respect to the method are equally applicable to the processing device of media data and the units contained therein, and are not repeated here. The processing device of the media data may be implemented in advance in a browser or other application of the computer device, or may be loaded into the browser or application of the computer device by downloading or the like. Corresponding units in the processing device of the media data may cooperate with units in the computer device to implement the solution of the embodiments of the application.
The division of the modules or units mentioned in the above detailed description is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
It should be noted that, for details not disclosed in the media data processing device in the embodiment of the present application, please refer to details disclosed in the above embodiment of the present application, and details are not described here again.
Referring now to fig. 14, which shows a schematic diagram of a computer device suitable for implementing an embodiment of the present application. As shown in fig. 14, the computer system 1400 includes a central processing unit (CPU) 1401 that can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1402 or a program loaded from a storage section 1408 into a random access memory (RAM) 1403. In the RAM 1403, various programs and data required for the operation of the system are also stored. The CPU 1401, ROM 1402, and RAM 1403 are connected to each other through a bus 1404. An input/output (I/O) interface 1405 is also connected to the bus 1404.
The following components are connected to the I/O interface 1405: an input section 1406 including a keyboard, a mouse, and the like; an output section 1407 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 1408 including a hard disk and the like; and a communication section 1409 including a network interface card such as a LAN card or a modem. The communication section 1409 performs communication processing via a network such as the Internet. A drive 1410 is also connected to the I/O interface 1405 as needed. A removable medium 1411, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1410 as needed, so that a computer program read therefrom is installed into the storage section 1408 as needed.
In particular, the process described above with reference to the flowchart of fig. 2 may be implemented as a computer software program according to an embodiment of the application. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 1409 and/or installed from the removable medium 1411. When the computer program is executed by the central processing unit (CPU) 1401, the above-described functions defined in the system of the present application are performed.
The computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special-purpose hardware-based systems which perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The units or modules involved in the embodiments of the present application may be implemented in software or in hardware. The described units or modules may also be provided in a processor, for example described as: a processor including a presentation module, an identification module, a processing module, and a training module. The names of these units or modules do not in some cases constitute a limitation on the units or modules themselves.
As another aspect, the present application also provides a computer-readable storage medium, which may be included in the computer device described in the above embodiments or may exist alone without being assembled into the computer device. The computer-readable storage medium stores one or more programs that, when executed by one or more processors, perform the media data processing method of the present application. For example, various steps of the processing method of media data shown in fig. 2 may be performed.
Embodiments of the present application provide a computer program product comprising instructions which, when executed, cause a method as described in embodiments of the present application to be performed. For example, various steps of the processing method of media data shown in fig. 2 may be performed.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to in the present application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (12)

1. A method for processing media data, comprising:
displaying a visual programming interface, wherein the visual programming interface comprises a code block display area, and the code block display area comprises a first code block of a prediction model and a second code block with a logic relationship with the first code block;
responding to an execution instruction of the first code block, obtaining media data to be predicted, and executing the first code block to predict the media data to be predicted to obtain the category of the media data to be predicted;
the second code block is executed based on the category.
2. The method of processing media data according to claim 1, wherein the second code block includes control parameters of a virtual object;
the executing the second code block based on the category includes:
and determining control parameters of the virtual object according to the category of the media data to be predicted, and controlling the virtual object to display the gesture corresponding to the control parameters.
3. The method of processing media data of claim 1, wherein the visual programming interface further comprises configuration items of a predictive model, the method further comprising:
And receiving configuration operation of configuration items of the prediction model, and displaying the first code block in the code block display area.
4. A method of processing media data according to claim 3, wherein the configuration items of the predictive model include configuration items of machine learning controls and lower menus;
the configuration operation of receiving the configuration item of the prediction model, displaying the first code block in the code block display area, includes:
responding to the triggering operation of the machine learning control, and displaying configuration items of the lower menu;
and receiving configuration operation of configuration items of the lower menu, and displaying the first code block in the code block display area.
5. The method according to claim 4, wherein the configuration items of the lower menu include code blocks of at least one candidate model;
the configuration operation of receiving the configuration item of the lower menu displays the first code block in the code block display area, including:
determining a code block of a target model in the code blocks of the at least one candidate model in response to a selected operation of the code blocks of the at least one candidate model;
And receiving configuration operations of a first display area and a second display area in a code block of the target model, obtaining the first code block, displaying the first code block in a code block display area, wherein the first display area is used for displaying an identification of the prediction model, and the second display area is used for displaying a data type of media data to be predicted.
6. The method of processing media data according to claim 1, wherein the method further comprises:
responding to the new operation of the prediction model in the visual programming interface, and displaying a model training interface;
and receiving sample adding operation of the model training interface, and displaying a training sample set, wherein the training sample set comprises at least one piece of media data and a label corresponding to the media data, and the training sample set is used for training to obtain the prediction model.
7. The method of processing media data according to claim 6, further comprising:
training a prediction model to be trained by using a sample set in the model training interface to obtain the prediction model;
and transmitting the data packet of the prediction model to a database corresponding to the visual programming interface through a preset calling interface.
8. The method according to claim 1, wherein the executing the first code block predicts the media data to be predicted to obtain the category of the media data to be predicted, comprising:
calling the first code block according to the calling function of the first code block, and transmitting the media data to be predicted to the first code block;
and running a code program of the prediction model by using the first code block, and predicting the media data to be predicted to obtain the category of the media data to be predicted.
9. A media data processing device, comprising:
the display module is used for displaying a visual programming interface, and the visual programming interface comprises a code block display area, wherein the code block display area comprises a first code block of a prediction model and a second code block which has a logic relation with the first code block;
the identification module is used for responding to the execution instruction of the first code block, obtaining media data to be predicted, and executing the first code block to predict the media data to be predicted to obtain the category of the media data to be predicted;
and a processing module for executing the second code block based on the category.
10. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1-8 when executing the program.
11. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-8.
12. A computer program product comprising instructions which, when executed, cause the method of any one of claims 1-8 to be performed.
CN202310269524.XA 2023-03-10 2023-03-10 Method, apparatus, device, medium and program product for processing media data Pending CN116974544A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310269524.XA CN116974544A (en) 2023-03-10 2023-03-10 Method, apparatus, device, medium and program product for processing media data


Publications (1)

Publication Number Publication Date
CN116974544A true CN116974544A (en) 2023-10-31

Family

ID=88482071




Legal Events

Date Code Title Description
PB01 Publication