CN113591523B - Display device and experience value updating method - Google Patents

Display device and experience value updating method

Info

Publication number
CN113591523B
Authority
CN
China
Prior art keywords
video
user
score
control
experience value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010444296.1A
Other languages
Chinese (zh)
Other versions
CN113591523A (en)
Inventor
王光强 (Wang Guangqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Juhaokan Technology Co Ltd
Original Assignee
Juhaokan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Juhaokan Technology Co Ltd filed Critical Juhaokan Technology Co Ltd
Priority to CN202410398880.6A (published as CN118212693A)
Priority to PCT/CN2020/109859 (published as WO2021032092A1)
Priority to CN202080024736.6A (published as CN113678137B)
Publication of CN113591523A
Priority to US17/455,575 (published as US11924513B2)
Application granted
Publication of CN113591523B
Legal status: Active
Anticipated expiration


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2387Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/4104Peripherals receiving signals from specially adapted client devices
    • H04N21/4122Peripherals receiving signals from specially adapted client devices additional display device, e.g. video projector
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • H04N21/42206User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
    • H04N21/42221Transmission circuitry, e.g. infrared [IR] or radio frequency [RF]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/426Internal components of the client ; Characteristics thereof
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4316Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/436Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
    • H04N21/4363Adapting the video stream to a specific local network, e.g. a Bluetooth® network
    • H04N21/43637Adapting the video stream to a specific local network, e.g. a Bluetooth® network involving a wireless protocol, e.g. Bluetooth, RF or wireless LAN [IEEE 802.11]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The application discloses a display device and an experience value updating method, used to: in response to an input instruction for following a demonstration video, acquire the demonstration video and collect a local video stream, where the demonstration video, when played, shows the demonstration actions the user needs to follow; during the follow-along exercise, perform action matching between the demonstration video and the local video stream to obtain a score for the exercise session; and after the session ends, generate an exercise result interface according to the score. An experience value control is provided in the result interface: when the score is higher than the user's highest historical score for this demonstration video, the control displays the experience value updated according to the score; when the score is not higher than the highest historical score, the control displays the experience value from before the session. Because the experience value is not updated when the session score does not exceed the highest historical score, users are prevented from maliciously farming experience by repeatedly following the same demonstration video.

Description

Display device and experience value updating method
The present application claims priority to Chinese patent application No. 202010364203.4, entitled "Display device and play control method", filed with the Chinese Patent Office on April 30, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The application relates to the technical field of display equipment, in particular to display equipment and an experience value updating method.
Background
The continuous development of communication technology has made terminal devices such as computers, smart phones and display devices increasingly popular, and users' demands on the functions and services these devices provide keep growing. Display devices such as smart televisions can present audio, video, pictures and other content to users, and therefore attract wide attention.
With the popularity of smart display devices, users increasingly want to carry out recreational activities on the large screen of a display device. Given the growing time and money households invest in cultivating interests and training for action-based activities such as dance, gymnastics and fitness, such training clearly matters to users.
Therefore, how to provide interest-cultivation and training functions for action-based activities through a display device, so as to meet users' needs, is a technical problem to be solved.
Disclosure of Invention
The application provides a display device and an experience value updating method, which address at least one aspect of how to provide interest-cultivation and training functions for action-based activities through a display device.
In a first aspect, the present application provides a display apparatus comprising:
the image collector is used for collecting the local image to obtain a local video stream;
a display for displaying a demonstration video, the local video stream, and/or a follow-along result interface;
a controller for:
in response to an input instruction indicating to follow a demonstration video, acquiring the demonstration video and collecting the local video stream, where the demonstration video, when played, shows the demonstration actions the user needs to follow;
performing action matching between the demonstration video and the local video stream, so as to generate a score for the follow-along session according to the degree of matching between the local video and the demonstration video;
and after the demonstration video finishes playing, generating the follow-along result interface according to the score, where an experience value control for displaying an experience value is provided in the result interface; when the score is higher than the user's highest historical score for following this demonstration video, the control displays the experience value updated according to the score, and when the score is not higher than the highest historical score, the control displays the experience value from before the session.
In a second aspect, the present application provides a display apparatus comprising:
the image collector is used for collecting the local image to obtain a local video stream;
a display;
a controller for:
in response to an input instruction to play a demonstration video, acquiring the demonstration video and the local video stream, where the demonstration video comprises first video frames showing the demonstration actions the user needs to follow, and the local video stream comprises second video frames showing the user's actions;
matching each first video frame with the corresponding second video frame, and generating a score for the follow-along session according to the matching result;
and in response to the playing of the demonstration video ending, generating a follow-along result interface according to the score, where an experience value control for displaying an experience value is provided in the result interface; when the score is higher than the user's highest historical score for following this demonstration video, the control displays the experience value updated according to the score, and when the score is not higher than the highest historical score, the control displays the experience value from before the session.
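As a concrete illustration of the per-frame matching just described, the following is a minimal sketch, not the patented implementation: the patent does not specify a matching algorithm, so cosine similarity over flattened pose-keypoint vectors (which a hypothetical pose extractor would produce from the first and second video frames) stands in for the matching degree, and the averaged similarity is scaled to a score.

import math
from typing import List, Sequence

def frame_similarity(demo_pose: Sequence[float], user_pose: Sequence[float]) -> float:
    # Cosine similarity between two flattened keypoint vectors; 0.0 when
    # either pose is empty or all-zero.
    dot = sum(a * b for a, b in zip(demo_pose, user_pose))
    norm = math.sqrt(sum(a * a for a in demo_pose)) * math.sqrt(sum(b * b for b in user_pose))
    return dot / norm if norm else 0.0

def follow_along_score(demo_poses: List[Sequence[float]],
                       user_poses: List[Sequence[float]]) -> int:
    # First and second video frames are assumed to correspond by timestamp,
    # so poses are paired positionally; the mean similarity is scaled to 0-100.
    sims = [frame_similarity(d, u) for d, u in zip(demo_poses, user_poses)]
    return round(100 * sum(sims) / len(sims)) if sims else 0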
In a third aspect, the present application further provides a method for updating an experience value, the method comprising:
in response to an input instruction indicating to follow a demonstration video, acquiring the demonstration video and collecting a local video stream, where the demonstration video, when played, shows the demonstration actions the user needs to follow;
performing action matching between the demonstration video and the local video stream, so as to generate a score for the follow-along session according to the degree of matching between the local video and the demonstration video;
and after the demonstration video finishes playing, generating a follow-along result interface according to the score, where an experience value control for displaying an experience value is provided in the result interface; when the score is higher than the user's highest historical score for following this demonstration video, the control displays the experience value updated according to the score, and when the score is not higher than the highest historical score, the control displays the experience value from before the session.
In a fourth aspect, the present application further provides a method for updating an experience value, the method comprising:
in response to an input instruction indicating to follow a demonstration video, acquiring the demonstration video and collecting a local video stream, where the demonstration video comprises first video frames showing the demonstration actions the user needs to follow, and the local video stream comprises second video frames showing the user's actions;
matching each first video frame with the corresponding second video frame, and generating a score for the follow-along session according to the matching result;
and in response to the playing of the demonstration video ending, generating a follow-along result interface according to the score, where an experience value control for displaying an experience value is provided in the result interface; when the score is higher than the user's highest historical score for following this demonstration video, the control displays the experience value updated according to the score, and when the score is not higher than the highest historical score, the control displays the experience value from before the session.
As can be seen from the above technical solutions, the display device and the experience value updating method provided by the embodiments of the present application perform action matching between the demonstration video and the local video stream during the follow-along session, so as to generate a score according to their degree of matching, and, after the session ends, generate a follow-along result interface according to the score. An experience value control for displaying an experience value is provided in the result interface: it displays the experience value updated according to the score when the score is higher than the user's highest historical score for following this demonstration video, and the pre-session experience value otherwise. Because the experience value is not updated when the session score does not exceed the highest historical score, users are prevented from maliciously farming experience by repeatedly following the same demonstration video.
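A minimal sketch of that update rule follows. The record fields and the choice to credit only the improvement over the previous best are illustrative assumptions, since the embodiments only require that the experience value change when the historical best is beaten.

from dataclasses import dataclass

@dataclass
class ExerciseRecord:
    best_score: int   # user's highest historical score for this demonstration video
    experience: int   # user's current experience value

def update_experience(record: ExerciseRecord, new_score: int) -> int:
    # Returns the experience value to show in the result interface's control.
    if new_score > record.best_score:
        # Only a score above the historical best earns experience, so replaying
        # the same demonstration video without improving yields nothing.
        record.experience += new_score - record.best_score
        record.best_score = new_score
    return record.experience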
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
A schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment is exemplarily shown in fig. 1;
a hardware configuration block diagram of the display device 200 in accordance with the embodiment is exemplarily shown in fig. 2;
a hardware configuration block diagram of the control device 100 in accordance with the embodiment is exemplarily shown in fig. 3;
a functional configuration diagram of the display device 200 according to the embodiment is exemplarily shown in fig. 4;
a schematic diagram of the software configuration in the display device 200 according to an embodiment is exemplarily shown in fig. 5;
a schematic configuration of an application program in the display device 200 according to an embodiment is exemplarily shown in fig. 6;
a schematic diagram of a user interface in a display device 200 according to an embodiment is exemplarily shown in fig. 7;
a user interface is exemplarily shown in fig. 8;
a target application home page is exemplarily shown in fig. 9;
one type of user interface is exemplarily shown in fig. 10 a;
Another user interface is exemplarily shown in fig. 10 b;
a user interface is exemplarily shown in fig. 11;
a user interface is exemplarily shown in fig. 12;
a user interface is exemplarily shown in fig. 13;
a user interface is exemplarily shown in fig. 14;
a user interface is exemplarily shown in fig. 15;
one type of pause interface is shown schematically in fig. 16;
one user interface for presenting the save information is exemplarily shown in fig. 17;
a user interface for presenting a follow-along prompt is exemplarily shown in fig. 18;
a user interface for presenting scoring information is exemplarily shown in fig. 19;
a user interface for presenting detailed performance information is exemplarily shown in fig. 20;
a user interface for viewing the original image file of a follow-along screenshot is exemplarily shown in fig. 21;
another user interface for presenting detailed performance information is exemplarily shown in fig. 22;
a detailed performance information page displayed on the mobile terminal device is exemplarily shown in fig. 23;
a user interface for displaying an automatic play prompt is exemplarily shown in fig. 24;
a user interface for displaying a user exercise record is shown schematically in fig. 25.
Detailed Description
In order to make the technical solution of the present application better understood by those skilled in the art, the technical solution of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
Furthermore, while the present disclosure has been described in terms of one or more exemplary embodiments, it should be understood that each aspect of the disclosure may also be implemented separately as a complete technical solution.
It should be understood that the terms "first", "second", "third" and the like in the description, the claims and the above figures are used to distinguish similar objects, and not necessarily to describe a particular order or sequence. It is to be understood that data so used may be interchanged where appropriate, so that the embodiments of the application described herein can, for example, be implemented in orders other than those illustrated or described herein.
Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" as used in this disclosure refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
The term "remote control" as used herein refers to a component of an electronic device (such as a display device as disclosed herein) that can be controlled wirelessly, typically over a relatively short distance. Typically, the electronic device is connected to the electronic device using infrared and/or Radio Frequency (RF) signals and/or bluetooth, and may also include functional modules such as WiFi, wireless USB, bluetooth, motion sensors, etc. For example: the hand-held touch remote controller replaces most of the physical built-in hard keys in a general remote control device with a touch screen user interface.
The term "gesture" as used herein refers to a user action by a change in hand shape or hand movement, etc., used to express an intended idea, action, purpose, or result.
A schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment is exemplarily shown in fig. 1. As shown in fig. 1, a user may operate the display apparatus 200 through the mobile terminal 300 and the control device 100.
The control device 100 may be, for example, a remote controller that controls the display apparatus 200 wirelessly or by other wired means, using infrared protocol communication, Bluetooth protocol communication or other short-distance communication. The user may control the display device 200 by inputting user instructions through keys on the remote control, voice input, control panel input, etc. For example, the user can input corresponding control instructions through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input keys, menu keys and on/off keys on the remote controller to control the functions of the display device 200.
In some embodiments, mobile terminals, tablet computers, notebook computers, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device. The application program, by configuration, can provide various controls to the user in an intuitive User Interface (UI) on a screen associated with the smart device.
In some embodiments, the mobile terminal 300 and the display device 200 may each install a software application, so that connection and communication are implemented through a network communication protocol, achieving one-to-one control operation and data communication. For example, a control command protocol can be established between the mobile terminal 300 and the display device 200, the remote-control keyboard can be synchronized onto the mobile terminal 300, and the display device 200 can be controlled by operating a user interface on the mobile terminal 300. The audio/video content displayed on the mobile terminal 300 can also be transmitted to the display device 200, realizing a synchronized display function.
As also shown in fig. 1, the display device 200 is in data communication with the server 400 via multiple communication means. The display device 200 may be allowed to make communication connections via a local area network (LAN), a wireless local area network (WLAN) and other networks. The server 400 may provide various contents and interactions to the display device 200. In some embodiments, the display device 200 receives software program updates, or accesses a remotely stored digital media library, by sending and receiving information and through electronic program guide (EPG) interaction. The server 400 may be one or more groups of servers, and of one or more types; the server 400 also provides other web service content such as video on demand and advertising services.
The display device 200 may be a liquid crystal display, an OLED display, a projection display device. The particular display device type, size, resolution, etc. are not limited, and those skilled in the art will appreciate that the display device 200 may be modified in performance and configuration as desired.
In addition to the broadcast-receiving television function, the display device 200 may additionally provide the computer-supported functions of a smart network television, including, in some embodiments, web TV, smart TV, Internet Protocol TV (IPTV), etc.
A hardware configuration block diagram of the display device 200 according to an exemplary embodiment is illustrated in fig. 2. As shown in fig. 2, the display device 200 includes at least one of a controller 210, a modem 220, a communication interface 230, a detector 240, an input/output interface 250, a video processor 260-1, an audio processor 260-2, a display 280, an audio output 270, a memory 290, a power supply, and an infrared receiver.
The display 280 receives image signals from the video processor 260-1 and displays video content, images, and the components of the menu manipulation interface. The display 280 includes a display screen assembly for presenting pictures and a drive assembly for driving image display. The displayed video content may come from broadcast television, or from various broadcast signals receivable via wired or wireless communication protocols; alternatively, various image contents sent from a network server may be received via a network communication protocol and displayed.
The display 280 also presents a user manipulation UI interface that is generated in the display device 200 and used to control the display device 200.
Depending on the type of the display 280, a drive assembly for driving the display is included; alternatively, if the display 280 is a projection display, a projection device and a projection screen may be included.
The communication interface 230 is a component for communicating with an external device or an external server according to various communication protocol types. For example: the communication interface 230 may be a Wifi chip 231, a bluetooth communication protocol chip 232, a wired ethernet communication protocol chip 233, or other network communication protocol chips or near field communication protocol chips, and an infrared receiver (not shown in the figure).
The display device 200 may establish control signal and data signal transmission and reception with an external control device or a content providing device through the communication interface 230. And an infrared receiver, which is an interface device for receiving infrared control signals of the control device 100 (such as an infrared remote controller).
The detector 240 is a component the display device 200 uses to collect signals from the external environment or for interaction with the outside. The detector 240 includes a light receiver 242, a sensor for collecting ambient light intensity, so that display parameters can be adapted to the ambient light.
The image collector 241, such as a camera or video camera, can be used to collect external environment scenes and to collect user attributes or user-interaction gestures; it can adaptively change display parameters and can also recognize user gestures so as to enable interaction with the user.
In other exemplary embodiments, the detector 240 may also be a temperature sensor or the like. By sensing the ambient temperature, the display device 200 may adaptively adjust the display color temperature of the image: for example, when the ambient temperature is high, the display device 200 may be adjusted to display the image with a cooler color temperature, and when the temperature is low, with a warmer one.
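As a toy illustration of that adaptive behavior, and nothing more, here is a sketch with invented thresholds and Kelvin values; the embodiments do not specify any concrete mapping.

def display_color_temperature(ambient_celsius: float) -> int:
    # Map ambient temperature to a display color temperature in Kelvin
    # (thresholds and values are illustrative assumptions).
    if ambient_celsius >= 28:
        return 7500   # hot room: shift toward a cooler (bluer) image
    if ambient_celsius <= 16:
        return 5000   # cold room: shift toward a warmer (redder) image
    return 6500       # neutral default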
In other exemplary embodiments, the detector 240 may also include a sound collector such as a microphone, used to receive the user's sound, including voice signals carrying the user's control instructions for the display device 200, or to collect ambient sound for identifying the type of the ambient scene, so that the display device 200 can adapt to the ambient noise.
The input/output interface 250 is used by the controller 210 for data transmission between the display device 200 and other external devices, such as receiving video signals, audio signals or command instructions from an external device.
The input/output interface 250 may include, but is not limited to, any one or more of the following: a high-definition multimedia interface (HDMI) 251, an analog or data high-definition component input interface 253, a composite video input interface 252, a USB input interface 254, an RGB port (not shown in the figure), etc.
In other exemplary embodiments, the input/output interface 250 may also form a composite input/output interface from the plurality of interfaces described above.
The modem 220 receives broadcast television signals by wired or wireless reception, can perform modulation and demodulation processing such as amplification, mixing and resonance, and demodulates, from among a plurality of wireless or wired broadcast television signals, the television audio/video signals and EPG data signals carried in the television channel frequency selected by the user.
The modem 220 responds to the television signal frequency selected by the user and the television signal carried by that frequency, under the control of the controller 210.
The modem 220 can receive signals in various ways according to the broadcasting system of the television signal, such as terrestrial broadcast, cable broadcast, satellite broadcast or internet broadcast; and, according to the modulation type, it may use digital or analog modulation. Depending on the type of television signal received, both analog and digital signals may be processed.
In other exemplary embodiments, the modem 220 may also be in an external device, such as an external set-top box, or the like. Thus, the set-top box outputs television audio and video signals after modulation and demodulation, and inputs the television audio and video signals to the display device 200 through the input/output interface 250.
The video processor 260-1 is configured to receive an external video signal and perform video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion and image composition according to the standard codec protocol of the input signal, to obtain a signal that can be directly displayed or played on the display device 200.
In some embodiments, the video processor 260-1 includes at least one of a demultiplexing module, a video decoding module, an image compositing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module demultiplexes the input audio/video data stream, for example an input MPEG-2 stream, into video signals, audio signals and the like.
The video decoding module processes the demultiplexed video signal, including decoding, scaling and the like.
The image synthesis module, such as an image synthesizer, superimposes and mixes the GUI signal, input by the user or generated by a graphics generator, with the scaled video image, to generate an image signal for display.
The frame rate conversion module converts the frame rate of the input video, for example converting a 60 Hz frame rate into a 120 Hz or 240 Hz frame rate, commonly by means of frame insertion.
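A minimal sketch of frame-rate up-conversion by frame insertion follows; production converters interpolate motion between frames, whereas this illustration simply repeats each frame, which is the simplest form of insertion.

from typing import List, TypeVar

Frame = TypeVar("Frame")

def insert_frames(frames: List[Frame], factor: int = 2) -> List[Frame]:
    # Repeat each input frame `factor` times, e.g. 60 fps * 2 = 120 fps.
    return [frame for frame in frames for _ in range(factor)]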
The display formatting module converts the frame-rate-converted video into an output signal conforming to the display format, for example an RGB data signal.
The audio processor 260-2 is configured to receive an external audio signal, decompress and decode the external audio signal according to a standard codec protocol of an input signal, and perform noise reduction, digital-to-analog conversion, amplification processing, and the like, to obtain a sound signal that can be played in a speaker.
In other exemplary embodiments, video processor 260-1 may include one or more chip components. The audio processor 260-2 may also include one or more chips.
And, in other exemplary embodiments, the video processor 260-1 and the audio processor 260-2 may be separate chips or integrated together in one or more chips with the controller 210.
The audio output 270 receives the sound signal output by the audio processor 260-2 under the control of the controller 210. It includes the speaker 272 carried by the display device 200 itself, and an external sound output terminal 274, such as an external sound interface or an earphone interface, that can output to the sound-producing device of an external device.
The power supply provides power support for the display device 200 with power input from an external power source, under the control of the controller 210. The power supply may include a built-in power circuit installed inside the display apparatus 200, or an external power supply connected through a power interface on the display apparatus 200.
A user input interface for receiving an input signal of a user and then transmitting the received user input signal to the controller 210. The user input signal may be a remote control signal received through an infrared receiver, and various user control signals may be received through a network communication module.
In some embodiments, a user inputs a user command through the remote controller 100 or the mobile terminal 300, and the user input interface responds to the user input through the controller 210 according to the user input.
In some embodiments, a user may input a user command through a Graphical User Interface (GUI) displayed on the display 280, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through the sensor to receive the user input command.
The controller 210 controls the operation of the display device 200 and responds to the user's operations through various software control programs stored on the memory 290.
As shown in fig. 2, the controller 210 includes a RAM 213 and a ROM 214, a graphics processor 216, a CPU processor 212, a communication interface 218 (such as a first interface 218-1 through an nth interface 218-n), and a communication bus. The RAM 213, the ROM 214, the graphics processor 216, the CPU processor 212 and the communication interface 218 are connected via the bus.
The ROM 214 stores instructions for various system startups. When the display device 200 is powered on upon receiving a power-on signal, the CPU processor 212 executes the system startup instructions in the ROM 214 and copies the operating system stored in the memory 290 into the RAM 213 to begin running the operating system. After the operating system has started, the CPU processor 212 copies the various applications in the memory 290 into the RAM 213 and then starts running them.
The graphics processor 216 generates various graphical objects, such as icons, operation menus and graphics displayed for user input instructions. It includes an arithmetic unit, which performs operations on the various interaction instructions input by the user and displays the various objects according to their display attributes, and a renderer, which generates the various objects based on the arithmetic unit's results and renders them on the display 280.
The CPU processor 212 executes the operating system and application program instructions stored in the memory 290, and runs various applications, data and contents according to the various interactive instructions received from the outside, so as to finally display and play various audio/video contents.
In some exemplary embodiments, the CPU processor 212 may include multiple processors: one main processor, for performing some operations of the display apparatus 200 in the pre-power-up mode and/or for displaying pictures in the normal mode, and one or more sub-processors, for operations in standby and similar modes.
The controller 210 may control the overall operation of the display apparatus 200. For example, in response to receiving a user command for selecting a UI object displayed on the display 280, the controller 210 may perform the operation related to the object selected by the user command.
Wherein the object may be any one of selectable objects, such as a hyperlink or an icon. Operations related to the selected object, such as: displaying an operation of connecting to a hyperlink page, a document, an image, or the like, or executing an operation of a program corresponding to the icon. The user command for selecting the UI object may be an input command through various input means (e.g., mouse, keyboard, touch pad, etc.) connected to the display device 200 or a voice command corresponding to a voice uttered by the user.
Memory 290 includes storage for various software modules for driving display device 200. Such as: various software modules stored in memory 290, including: a basic module, a detection module, a communication module, a display control module, a browser module, various service modules and the like.
The base module is a bottom-level software module for signal communication between the various pieces of hardware in the display device 200, and for sending processing and control signals to the upper-layer modules. The detection module collects various information from sensors or the user input interface and performs digital-to-analog conversion, analysis and management.
For example, the voice recognition module comprises a voice analysis module and a voice instruction database module. The display control module controls the display 280 to display image content, and can be used to play multimedia image content, UI interfaces and other information. The communication module performs control and data communication with external devices. The browser module performs data communication with browsing servers. The service module provides various services and various application programs.
Meanwhile, the memory 290 also stores received external data and user data, images of various items in various user interfaces, visual effect maps of focus objects, and the like.
A block diagram of the configuration of the control device 100 according to an exemplary embodiment is illustrated in fig. 3. As shown in fig. 3, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory 190, and a power supply 180.
The control device 100 is configured to control the display device 200: it may receive a user's input operation instructions and convert them into instructions the display device 200 can recognize and respond to, acting as an intermediary for interaction between the user and the display device 200. For example, when the user operates the channel up/down keys on the control device 100, the display device 200 responds with the channel up/down operation.
In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications for controlling the display apparatus 200 according to user's needs.
In some embodiments, as shown in fig. 1, the mobile terminal 300 or another intelligent electronic device may function similarly to the control device 100 after installing an application that manipulates the display device 200. For example, the user may implement the functions of the physical keys of the control device 100 through the function keys or virtual buttons of a graphical user interface provided on the mobile terminal 300 or other intelligent electronic device.
The controller 110 includes a processor 112, a RAM 113 and a ROM 114, a communication interface, and a communication bus. The controller 110 controls the running and operation of the control device 100, the communication collaboration among the internal components, and the external and internal data processing functions.
The communication interface 130 enables communication of control signals and data signals with the display device 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display device 200. The communication interface 130 may include at least one of a WiFi chip, a bluetooth module, an NFC module, and other near field communication modules.
A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touchpad 142, a sensor 143, keys 144, and other input interfaces. Such as: the user can implement a user instruction input function through actions such as voice, touch, gesture, press, and the like, and the input interface converts a received analog signal into a digital signal and converts the digital signal into a corresponding instruction signal, and sends the corresponding instruction signal to the display device 200.
The output interface includes an interface that transmits the received user instructions to the display device 200. In some embodiments, it may be an infrared interface or a radio frequency interface. For example, with an infrared signal interface, a user input instruction is converted into an infrared control signal according to the infrared control protocol and sent to the display device 200 through the infrared transmitting module. With a radio frequency signal interface, a user input instruction is converted into a digital signal, modulated according to the radio-frequency control signal modulation protocol, and then transmitted to the display device 200 through the radio-frequency transmitting terminal.
In some embodiments, the control device 100 includes at least one of the communication interface 130 and the output interface. With the communication interface 130 configured, for example with WiFi, Bluetooth or NFC modules, user input instructions may be encoded and sent to the display device 200 through the WiFi protocol, the Bluetooth protocol or the NFC protocol.
The memory 190 stores the various operation programs, data and applications that drive and control the control device 100, under the control of the controller 110. The memory 190 may store various control signal instructions input by the user.
A power supply 180 for providing operating power support for the various elements of the control device 100 under the control of the controller 110. May be a battery and associated control circuitry.
A schematic diagram of the functional configuration of the display device 200 according to an exemplary embodiment is illustrated in fig. 4. As shown in fig. 4, the memory 290 is used to store an operating system, application programs, contents, user data, and the like, and performs system operations for driving the display device 200 and various operations in response to a user under the control of the controller 210. Memory 290 may include volatile and/or nonvolatile memory.
The memory 290 is specifically used for storing an operation program for driving the controller 210 in the display device 200, and storing various application programs built in the display device 200, various application programs downloaded by a user from an external device, various graphical user interfaces related to the application, various objects related to the graphical user interfaces, user data information, and various internal data supporting the application. The memory 290 is used to store system software such as OS kernel, middleware and applications, and to store input video data and audio data, and other user data.
The memory 290 also stores drivers and related data for the video processor 260-1 and the audio processor 260-2, the display 280, the communication interface 230, the modem 220, the detector 240, the input/output interface 250, and the like.
In some embodiments, memory 290 may store software and/or programs, the software programs used to represent an Operating System (OS) including, for example: a kernel, middleware, an Application Programming Interface (API), and/or an application program. For example, the kernel may control or manage system resources, or functions implemented by other programs (such as the middleware, APIs, or application programs), and the kernel may provide interfaces to allow the middleware and APIs, or applications to access the controller to implement control or management of system resources.
In some embodiments, memory 290 includes at least one of a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external instruction recognition module 2907, a communication control module 2908, a light receiving module 2909, a power control module 2910, an operating system 2911, and other applications 2912, a browser module, and so forth. The controller 210 executes various software programs in the memory 290 such as: broadcast television signal receiving and demodulating functions, television channel selection control functions, volume selection control functions, image control functions, display control functions, audio control functions, external instruction recognition functions, communication control functions, optical signal receiving functions, power control functions, software control platforms supporting various functions, browser functions and other applications.
A block diagram of the configuration of the software system in the display device 200 according to an exemplary embodiment is illustrated in fig. 5.
As shown in FIG. 5, operating system 2911, which includes executing operating software for handling various basic system services and for performing hardware-related tasks, acts as a medium for data processing completed between applications and hardware components. In some embodiments, portions of the operating system kernel may contain a series of software to manage display device hardware resources and to serve other programs or software code.
In other embodiments, portions of the operating system kernel may contain one or more device drivers, which may be a set of software code in the operating system that helps operate or control the devices or hardware associated with the display device. The drivers may contain code to operate video, audio and/or other multimedia components. In some embodiments, display screen, camera, flash, WiFi and audio drivers are included.
Wherein, accessibility module 2911-1 is configured to modify or access an application program to realize accessibility of the application program and operability of display content thereof.
The communication module 2911-2 is used for connecting with other peripheral devices via related communication interfaces and communication networks.
User interface module 2911-3 is configured to provide an object for displaying a user interface, so that the user interface can be accessed by each application program, and user operability can be achieved.
Control applications 2911-4 are used for controllable process management, including runtime applications, and the like.
The event transmission system 2914 may be implemented within the operating system 2911 or within the application 2912. In some embodiments, it is implemented partly within the operating system 2911 and simultaneously within the application 2912. It listens for the various user input events and, according to the recognition results of the various events or sub-events, executes one or more sets of predefined operation handlers.
The event monitoring module 2914-1 monitors the user input interface for input events or sub-events.
The event recognition module 2914-2 holds the definitions of the various events for the various user input interfaces, recognizes the various events or sub-events, and dispatches them to the processes that execute their corresponding sets of handlers.
An event or sub-event refers to an input detected by one or more sensors in the display device 200, or an input from an external control device (e.g., the control device 100): for example, voice input sub-events, gesture input sub-events recognized through gesture recognition, and sub-events of remote-control key instruction input from a control device. In some embodiments, the sub-events from the remote control take a variety of forms, including but not limited to one or a combination of pressing the up/down/left/right keys, the OK key, a long key press, etc., as well as operations of non-physical keys, such as move, hold and release.
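To make the listen/recognize/dispatch flow concrete, here is a minimal sketch of an event transmission system; all names are illustrative assumptions rather than the patent's modules. Handlers are registered per event type, and every recognized event is dispatched to the matching handlers.

from collections import defaultdict
from typing import Callable, Dict, List

class EventTransmissionSystem:
    def __init__(self) -> None:
        # One handler list per event/sub-event type.
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def listen(self, event_type: str, handler: Callable[[dict], None]) -> None:
        # Register a predefined operation handler for an event or sub-event.
        self._handlers[event_type].append(handler)

    def dispatch(self, event_type: str, event: dict) -> None:
        # Run every handler bound to the recognized event type.
        for handler in self._handlers[event_type]:
            handler(event)

bus = EventTransmissionSystem()
bus.listen("remote_key", lambda e: print("key pressed:", e["key"]))
bus.dispatch("remote_key", {"key": "ok"})  # -> key pressed: ok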
Interface layout manager 2913 directly or indirectly receives user input events or sub-events from event delivery system 2914 for updating the layout of the user interface, including but not limited to the location of controls or sub-controls in the interface, and various execution operations associated with the interface layout, such as the size or location of the container, the hierarchy, etc.
As shown in fig. 6, the application layer 2912 contains various applications that may also be executed on the display device 200. Applications may include, but are not limited to, one or more applications such as: at least one of a live television application, a video on demand application, a media center application, an application center, a gaming application, and the like.
Live television applications can provide live television through different signal sources. For example, a live television application may provide television signals using inputs from cable television, radio broadcast, satellite services, or other types of live television services. And, the live television application may display video of the live television signal on the display device 200.
Video on demand applications may provide video from different storage sources. Unlike live television applications, video-on-demand provides video displays from some storage sources. For example, video-on-demand may come from the server side of cloud storage, from a local hard disk storage containing stored video programs.
The media center application may provide various applications for playing multimedia content. For example, a media center may be a different service than live television or video on demand, and a user may access various images or audio through a media center application.
An application center may be provided to store various applications. The application may be a game, an application, or some other application associated with a computer system or other device but which may be run in a smart television. The application center may obtain these applications from different sources, store them in local storage, and then be run on the display device 200.
A schematic diagram of a user interface in a display device 200 according to an exemplary embodiment is illustrated in fig. 7. As shown in fig. 7, the user interface includes a plurality of view display areas, in some embodiments a first view display area 201 and a play screen 202, where the play screen includes a layout of one or more different items. The user interface also includes a selector indicating which item is selected; the position of the selector can be moved by user input to select a different item.
It should be noted that the multiple view display areas may present different levels of display images. For example, the first view display region may present video chat item content and the second view display region may present application layer item content (e.g., web page video, VOD presentation, application screen, etc.).
Optionally, different view display areas may have different display priorities. For example, if the priority of the system layer is higher than that of the application layer, then when the user operates the selector and switches pictures in the application layer, the picture displayed in the system layer's view display area is not blocked; and when the size and position of the application layer's view display area change according to the user's selection, the size and position of the system layer's view display area are unaffected.
View display areas at the same level may also be presented; in that case, the selector can switch between the first view display region and the second view display region, and the size and position of the second view display region may change as the size and position of the first view display region change.
In some embodiments, any one of the areas in fig. 7 may display a screen acquired by the camera.
In some embodiments, controller 210 controls the operation of display device 200 and responds to user operations associated with display 280 by running various software control programs (e.g., an operating system and/or various application programs) stored on memory 290. For example, the controller presents a user interface including a number of UI objects on the display; in response to a received user command for a UI object on the user interface, the controller 210 performs the operation related to the object selected by the user command.
In some embodiments, some or all of the steps involved in embodiments of the present application are implemented within the operating system and in a target application. In some embodiments, the target application implementing some or all of the steps of these embodiments is referred to as "baby dancing power"; it is stored in the memory 290, and the controller 210 controls the operation of the display device 200 and responds to user operations related to the application by running it in the operating system.
In some embodiments, the display device obtains the target application, various graphical user interfaces associated with the target application, various objects associated with the graphical user interfaces, user data information, and various internal data supporting the application from the server and stores the aforementioned data information in the memory.
In some embodiments, the display device obtains media resources, such as picture files and audio video files, from the server in response to the launching of the target application or a user operation on a UI object associated with the target application.
It should be noted that the target application is not limited to running on the display device shown in figs. 1-7. It may also run on hand-held devices that provide voice and data connectivity and have wireless connection capability, or on other processing devices connected to a wireless modem, such as mobile phones (or "cellular" phones) and computers with mobile terminals, as well as portable, pocket-sized, hand-held, computer-built-in, or vehicle-mounted mobile devices that exchange data with a radio access network.
FIG. 8 is a user interface exemplary of the present application, which is one implementation of the display device's system home page. As shown in fig. 8, the user interface displays a plurality of items (controls), including a target item for launching the target application; here, the target item is the item "baby dancing power". When the display shows the user interface of fig. 8, the user can operate the target item "baby dancing power" by operating a control device (such as the remote controller 100), and in response to the operation on the target item, the controller launches the target application.
In some embodiments, the target application refers to a functional module that plays an exemplary video in a first video window on a display screen. Wherein the demonstration video refers to a video exhibiting demonstration actions and/or demonstration sounds. In some embodiments, the target application may also play the local video captured by the camera in a second video window on the display screen.
When the controller receives an input instruction indicating to start the target application, the controller presents the target application home page on the display in response to the instruction. The application home page can display various interface elements such as icons, windows, and controls, including but not limited to a login account information display area (column frame control), a user data (experience value/dancing power value) display area, a window control for playing recommended videos, a related user list display area, and a media asset display area.
In some embodiments, at least one of the user's nickname, avatar, membership identification, and membership validity period may be presented in the login account information display area. The user data display area can display the user's data related to the target application, such as the experience value/dancing power value and/or the corresponding star-level identification. The related user list display area can display a ranking of users within a preset geographical area over a preset time period (such as an experience value ranking), or a friend list of the user, where the ranking or friend list can show each user's experience value/dancing power value and/or corresponding star-level identification. The media asset display area displays media assets by category. In some embodiments, several controls can be displayed in the media asset display area, with different controls corresponding to different types of media assets, and the user can trigger display of the corresponding type's media asset list by operating a control.
In some embodiments, the user data display area and the login account information display area may be a single display area; for example, the user data related to the target application is displayed in the login account information display area.
FIG. 9 illustrates an implementation of the target application home page described above. As shown in fig. 9, the login account information display area shows the user's nickname, avatar, membership identification, and membership expiration date; the user data display area shows the user's dancing power value and star-level identification; the related user list display area shows a "dance master ranking (weekly)"; and the media asset display area provides media asset type controls such as "loving course", "happy course", "dazzling course", and "My dancing work". By operating the control device, the user can operate a type control to view the corresponding type's media asset list, and can select a media asset video to practice from the list under any type. For example, when the focus is moved to the "loving course" control and the user's confirmation operation is received, the "loving course" media asset list interface is displayed, and the corresponding media asset file is loaded and played according to the media asset control the user selects in that interface.
In addition, the interface shown in FIG. 9 includes a window control for playing recommended videos and an advertisement control. The recommended video may be played automatically in the window control shown in fig. 9, or played in response to a play instruction input by the user. For example, the user can move the position of the selector (focus) by operating the control device so that the selector falls on the window control for playing the recommended video; with the selector on that window control, the user presses the "OK" key on the control device to input an instruction to play the recommended video.
In some embodiments, the controller obtains information from the server for display in a page as shown in FIG. 9, such as login account information, user data, related user list data, recommended videos, and the like, in response to an instruction to launch the above-described target application. The controller draws an interface as shown in fig. 9 through the graphic processor according to the acquired information and controls presentation on the display.
In some embodiments, according to the media asset control selected by the user, the controller obtains the corresponding media resource ID and/or the user identifier of the display device and sends a loading request to the server. The server queries the corresponding video data according to the media resource ID and/or determines the display device's rights according to the user identifier, and feeds the video data and/or rights information back to the display device. The controller then plays the video according to the video data, and/or prompts the user about rights according to the rights information.
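For illustration, here is a minimal sketch of this request/response flow; the MediaServer interface, LoadResult type, and helper functions are assumptions for illustration, not part of the original disclosure.

```kotlin
// Hypothetical client-side handling of the loading request described above.
data class LoadResult(val videoUrl: String?, val authorized: Boolean, val message: String)

interface MediaServer {
    // Server looks up video data by media ID and rights by user ID.
    fun load(mediaId: String, userId: String): LoadResult
}

fun onMediaControlSelected(server: MediaServer, mediaId: String, userId: String) {
    val result = server.load(mediaId, userId)
    if (result.authorized && result.videoUrl != null) {
        playVideo(result.videoUrl)        // play the returned video data
    } else {
        showRightsPrompt(result.message)  // e.g. prompt the user about missing rights
    }
}

fun playVideo(url: String) = println("playing $url")
fun showRightsPrompt(msg: String) = println("rights prompt: $msg")
```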
In some embodiments, the target application is not a separate application but a part of the application shown in fig. 8, that is, a functional module of that application. In some embodiments, in addition to title controls such as "my", "movie", "juvenile", "VIP", "education", "mall", and "application" in the TAB bar of the interactive interface, a "dance power" title control is included. The user can display the corresponding title interface by moving the focus to a different title control; for example, after moving the focus to the "dance power" title control, the interface shown in fig. 9 is entered.
With the popularity of intelligent display devices, users increasingly seek entertainment through the large screen and invest more time and money in cultivating interests. The present application provides the user, through the target application, with a follow-along experience for action and/or sound skills (such as dancing, gymnastics, fitness, and actions in karaoke scenes), so that the user can learn these skills at home at any time.
In some embodiments, the media asset videos presented in the asset list interfaces (e.g., the "loving course" and "happy course" asset list interfaces in the above examples) include demonstration videos such as, but not limited to, videos demonstrating dance actions, videos demonstrating fitness actions, videos demonstrating gymnastic actions, song MV videos played by the display device in a karaoke scene, or videos of exemplary avatar actions. In the embodiments of the present application, teaching videos are also called demonstration videos. While watching a demonstration video, the user can synchronously make the same actions as those demonstrated in the video, so as to realize home dancing or home fitness functions with the display device. This function can be described as "training while watching".
In some embodiments, a "look-and-edge" scenario is as follows: a user (such as a child or teenager) can watch dance teaching video and exercise dance, a user (such as an adult) can watch body-building teaching video and exercise body-building action, the user can connect K songs with friend video, the user can sing songs and then follow MV video or virtual image to make actions, and the like. For convenience of description and distinction, in the "look-while-training" scenario, the action made by the user is referred to as a user action or a follow-up action, the demonstration action in the video is referred to as a demonstration action, the video showing the demonstration action is a demonstration video, and the action made by the user is a local video acquired after the camera is shown.
In some embodiments, if the display device has an image collector (or camera), the image collector may collect images or a video stream of the user's follow-along actions, so that the user's follow-along process is recorded with pictures or video as the carrier. Further, the user's follow-along actions are identified from the pictures or video, compared with the corresponding demonstration actions, and the user's follow-along performance is evaluated according to the comparison.
In some embodiments, time tags corresponding to standard action frames may be preset in the demonstration video; the image frames at and/or adjacent to the time tag positions in the local video are matched against the standard action frames, and the evaluation is performed according to the matching degree of the actions.
In some embodiments, time tags corresponding to standard audio clips may be preset in the demonstration video; the audio clips at and/or adjacent to the time tag positions in the local video are matched against the standard audio clips, and the evaluation is performed according to the matching degree.
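As an illustration of the time-tag matching in the two embodiments above, here is a minimal Kotlin sketch; TimeTag, the feature arrays, and matchScore are assumed stand-ins for whatever frame or audio similarity measure an implementation actually uses.

```kotlin
// Evaluate a follow-along action by comparing local frames at and near a
// preset time tag with the standard action frame.
data class TimeTag(val timeMs: Long)

fun evaluateAtTag(
    tag: TimeTag,
    localFrames: Map<Long, FloatArray>,   // timestamp -> features of local video frame
    standardFrame: FloatArray,            // features of the standard action frame
    windowMs: Long = 500                  // also consider frames adjacent to the tag
): Float {
    val candidates = localFrames.filterKeys {
        it in (tag.timeMs - windowMs)..(tag.timeMs + windowMs)
    }
    // Score by the best-matching frame near the tag position.
    return candidates.values.maxOfOrNull { matchScore(it, standardFrame) } ?: 0f
}

// Placeholder similarity: mean squared distance mapped into (0, 1].
fun matchScore(a: FloatArray, b: FloatArray): Float {
    val mse = a.indices.map { (a[it] - b[it]) * (a[it] - b[it]) }.average()
    return (1.0 / (1.0 + mse)).toFloat()
}
```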
In some embodiments, the display presents the local video stream (or local photos) collected by the camera and the demonstration video being followed synchronously. A first video window and a second video window are arranged in the display interface: the first video window plays the demonstration video and the second video window plays the local video, so that the user can directly watch his or her own follow-along actions and intuitively see their deficiencies, thereby improving in time.
When the display shows the interface of fig. 9, or the media asset list interface reached from it, the user can select and play the media asset video to practice by operating the control device. For convenience of explanation and distinction, the media asset video the user selects to practice is collectively called the target video (i.e., the demonstration video corresponding to the selected control).
In some embodiments, in response to an instruction input by the user to follow the target video, the display device controller acquires the target video from the server according to the media asset ID corresponding to the selected control and detects whether a camera is connected. If a camera is detected, the controller raises and starts the camera so that it begins collecting the local video stream, and displays the loaded target video and the local video stream on the display at the same time; if no camera is detected, only the target video is played on the display. In some embodiments, a first play window and a second play window are set in the display interface during follow-along (i.e., the follow-along interface); after the target video is loaded, in response to no camera being detected, the target video is played in the first play window and a preset prompt or a black screen is displayed in the second play window. In some embodiments, when no camera is detected, a no-camera reminder is displayed in a floating layer above the follow-along interface; after the user confirms, the follow-along interface is entered and the target video plays, and when the user inputs a refusal instruction, the target application exits or returns to the previous interface.
When a camera is detected, the controller sets a first play window on a first layer of the user interface and a second play window on a second layer, plays the acquired target video in the first play window, and plays the picture of the local video stream in the second play window. The first play window and the second play window may be tiled, where tiled display means that multiple windows divide the screen in a certain proportion without overlapping each other.
In some embodiments, the first play window and the second play window are formed by window components tiled at different positions on the same layer.
Fig. 10a illustrates a user interface showing one implementation of the first play window and the second play window. As shown in fig. 10a, the target video picture is displayed in the first play window, a picture of the local video stream is displayed in the second play window, and the two windows are tiled in the display area of the display. In some embodiments, the first play window and the second play window have different window sizes.
When no camera is detected, the controller plays the acquired target video in the first play window and displays an occlusion layer or a preset picture file in the second play window. The first play window and the second play window may be tiled, as described above.
Fig. 10b illustrates another user interface showing another implementation of the first play window and the second play window. Unlike fig. 10a, the target video picture is displayed in the first play window while an occlusion layer is displayed in the second play window, with the preset text element "no camera detected" shown in the occlusion layer.
In some other embodiments, in the event that no camera is detected, the controller sets a first play window at a first layer of the user interface, the first play window being displayed full screen within a display area of the display.
In some embodiments, when the display device has a camera, the controller enters the follow-along interface after receiving the user's instruction to play the target video, and directly plays the target video and the local video stream.
In other embodiments, after receiving the instruction to follow the target video, the controller first enters a guide interface in which only a preview picture of the local video is displayed, without playing the target video.
In some embodiments, the camera is concealable: when not in use, it is hidden within or behind the display, and when called up, the controller controls its raising and turning on, where raising means the camera extends out of the display frame, and turning on means the camera begins capturing images.
In some embodiments, to increase the camera's coverage, the camera can be rotated in the lateral or longitudinal direction, where lateral refers to the horizontal direction and longitudinal to the vertical direction when the video is viewed normally. The captured image can also be adjusted in the depth direction perpendicular to the display screen by adjusting the camera's focal length.
In some embodiments, when there is no human body in the preview picture, or there is a human body whose target position is offset relative to a preset desired position, a graphic element identifying the preset desired position is presented above the preview picture, together with a prompt, generated according to the offset of the target position relative to the desired position, guiding the moving target to move to the desired position.
The moving target (human body) is the local user; in different scenes there may be one or more moving targets in the preview picture. The desired position is set according to the collection area of the image collector; when the moving target (i.e., the user) is at the desired position, the images collected by the image collector are most favorable for analyzing and comparing the user's actions.
In some embodiments, the graphic element includes an arrow indicating a direction, oriented toward the desired position.
In some embodiments, the desired position is represented by a graphic frame displayed on the display; the controller places the graphic frame in a layer above the preview picture according to the camera's position and angle and a preset mapping relationship, so that the user can intuitively see where to move.
In use, differences in the camera's lifting height and/or rotation angle change the images it captures, so the preset position of the graphic frame needs to be adjusted adaptively to ensure that the position where the user stands in front of the display device under guidance is a reasonable one.
In some embodiments, the positional mapping of the graphic frame is as follows:
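A minimal sketch, assuming a simple linear calibration from the camera's lifting height and rotation angle to the frame's on-screen position; every constant and name below is an illustrative assumption.

```kotlin
// Hypothetical mapping from camera state to the graphic frame's position.
data class FramePosition(val x: Int, val y: Int, val width: Int, val height: Int)

fun mapGraphicFrame(liftHeightMm: Int, rotationDeg: Float, screenW: Int, screenH: Int): FramePosition {
    // Assumed calibration constants: frame shift in pixels per mm of lift
    // and per degree of rotation.
    val pxPerMm = 0.8f
    val pxPerDeg = 6.0f
    val centerX = screenW / 2 + (rotationDeg * pxPerDeg).toInt()  // pan shifts frame horizontally
    val centerY = screenH / 2 - (liftHeightMm * pxPerMm).toInt()  // lift shifts frame vertically
    val w = screenW / 4                                           // assumed frame size
    val h = screenH / 2
    return FramePosition(centerX - w / 2, centerY - h / 2, w, h)
}
```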
In some embodiments, a second layer is presented in the user interface, in which a preview window of the local video stream is set; the second layer is located above the first layer. In one implementation, the follow-along interface is loaded and the second layer is loaded above it.
In other embodiments, the controller may display the preview window in a second layer on the display interface without loading the follow-along interface, or with the follow-along interface kept in a background page stack.
In some embodiments, the prompt guiding the moving target to the desired position may be an interface prompt identifying the target moving direction and/or a voice prompt announcing the target moving direction.
The target moving direction is obtained from the offset of the target position relative to the desired position. When there is one moving target in the preview picture, the moving direction is obtained from the offset of that target's position relative to the desired position; when there are multiple moving targets, the target moving direction is obtained from the minimum offset among the offsets corresponding to the targets.
In some embodiments, the interface prompt may be an arrow whose direction is determined according to the target moving direction so as to point toward the graphic element 112.
In some embodiments, a floating layer with transparency greater than a preset transparency (e.g., 50%), such as a semi-transparent floating layer, is presented above the preview picture, and the graphic element identifying the desired position is displayed in this floating layer, so that the user can view the preview picture of the local video through it.
In some embodiments, another floating layer with transparency greater than a preset transparency (e.g., 50%) is presented above the preview picture, and a graphic element identifying the target moving direction is displayed in it as the interface prompt guiding the user to move.
In some embodiments, the graphical element for identifying the desired location and the graphical element for identifying the direction of movement of the target are displayed in the same float layer.
Fig. 11 illustrates a user interface in which a preview picture of the local video stream is displayed substantially full screen, with a semi-transparent floating layer above it. In the floating layer, the target moving direction is identified by graphic element 111 and the desired position by graphic element 112; the positions of the two elements do not coincide. The moving target (user) can gradually move to the desired position following the direction identified by graphic element 111; when the moving target in the preview picture reaches the desired position, its outline coincides with graphic element 112 to the greatest extent. In some embodiments, graphic element 112 is a graphic frame.
In some embodiments, the direction of movement of the target may also be identified by an interface text element, such as "move a little to the left" as exemplarily shown in FIG. 11, or the like.
In some embodiments, the display device controller receives an instruction indicating to follow the target video and, in response, starts the image collector to collect the local video stream; presents a preview picture of the local video stream in the user interface; detects whether a moving target exists in the preview picture; and, when one exists, obtains the position coordinates of the moving target and of the desired position in a preset coordinate system, these coordinates being quantized representations of the target position and the desired position, respectively. Further, the offset of the target position relative to the desired position is calculated from the position coordinates of the moving target and the desired position in the preset coordinate system.
In other embodiments, the display device controller receives the instruction indicating to follow the target video and, in response, starts the image collector to collect the local video stream; presents a preview picture of the local video stream in the user interface; detects whether a moving target exists in the preview picture; and, when one exists, acquires the position coordinates of the moving target in a preset coordinate system as a quantized representation of its position. The offset of the target position relative to the desired position is then calculated from the position coordinates of the moving target and of the desired position in the preset coordinate system, where the latter is a quantized representation of the desired position.
In some embodiments, the position coordinates of the moving target in the preset coordinate system may be the set of position coordinate points of the moving target's contour (i.e., the target contour) in that coordinate system. By way of example, a target contour 121 is shown in fig. 12.
In some embodiments, the target contour includes a torso portion and/or a target reference point, wherein the target reference point may be a midpoint of the torso portion or a center point of the target contour. Illustratively, torso portion 1211 and target reference point 1212 are shown in fig. 12. In these embodiments, acquiring the position coordinates of the moving object in the preset coordinate system includes: identifying a target contour from the preview picture, the target contour including a torso portion and/or a target reference point; and acquiring position coordinates of the trunk part and/or the target reference point in a preset coordinate system.
In some embodiments, the graphical element for identifying the desired location includes a graphical torso portion and/or a graphical reference point that corresponds to the target reference point in the above embodiments, i.e., if the target reference point is the midpoint of the torso portion, the graphical reference point is the midpoint of the graphical torso portion, and if the target reference point is the center point of the target contour, the graphical reference point is the center point of the graphical element. By way of example, a graphical torso portion 1221 and a graphical reference point 1222 are shown in fig. 12. In these embodiments, the obtaining of the position coordinates of the desired position in the preset coordinate system is obtaining the position coordinates of the torso portion of the figure and/or the reference point of the figure in the preset coordinate system.
In some embodiments, the offset of the target position relative to the desired position is calculated from the position coordinates of the torso portion in the preset coordinate system and the position coordinates of the torso portion of the graphic in the preset coordinate system.
In some embodiments, the origin of the preset coordinate system may be any preset point. Taking the origin as the lower-left pixel of the display screen as an example, the torso portion may be identified by the coordinates of two corner points (or of at least two other points). If the target torso portion has coordinates (X1, Y1; X2, Y2) and the graphic torso portion has coordinates (X3, Y3; X4, Y4), the positional offset between them is (X3 - X1, Y3 - Y1; X4 - X2, Y4 - Y2). The user can then be reminded according to the correspondence between offsets and prompts, so that the overlap of the target torso portion and the graphic torso portion reaches the preset requirement.
In some embodiments, the offset between the target torso portion and the graphic torso portion may be measured by their overlapping area; the user is reminded that the position adjustment succeeded when the overlapping area, or its proportion, reaches a preset threshold.
In some embodiments, when the user is moving to the left, the user is reminded that the position adjustment succeeded once the target torso portion overlaps the right edge of the graphic torso portion. This ensures that the user fully enters the recognition area.
In some embodiments, when the user is moving to the right, the user is reminded that the position adjustment succeeded once the target torso portion overlaps the left edge of the graphic torso portion. This likewise ensures that the user fully enters the recognition area.
In other embodiments, the offset of the target position relative to the desired position is calculated based on the position coordinates of the target reference point in the preset coordinate system and the position coordinates of the graphic reference point in the preset coordinate system.
In some embodiments, the origin of the preset coordinate system may be any preset point. Taking the origin as the lower-left pixel of the display screen as an example, if the coordinates of the target reference point 1212 are (X1, Y1) and the coordinates of the graphic reference point 1222 are (X2, Y2), the positional offset between them is (X2 - X1, Y2 - Y1). When X2 - X1 is positive, a prompt is presented on the left side of graphic element 112 and/or the prompt "move a little to the right" is given; when X2 - X1 is negative, a prompt is presented on the right side of graphic element 112 and/or the prompt "move a little to the left" is given.
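A minimal sketch of this offset-and-prompt computation, with the origin at the lower-left pixel of the screen; the threshold parameter stands in for the preset offset threshold mentioned below and is an illustrative assumption.

```kotlin
// Offset of the desired position (graphic reference point) relative to the
// target reference point, and the prompt derived from its horizontal sign.
data class Point(val x: Int, val y: Int)

fun offset(target: Point, graphic: Point) = Point(graphic.x - target.x, graphic.y - target.y)

fun movePrompt(target: Point, graphic: Point, thresholdPx: Int = 40): String {
    val d = offset(target, graphic)
    return when {
        d.x > thresholdPx -> "move a little to the right"  // target is left of graphic element 112
        d.x < -thresholdPx -> "move a little to the left"  // target is right of graphic element 112
        else -> "position adjusted successfully"           // offset below the preset threshold
    }
}
```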
In some embodiments, the controller also obtains the focal distance at the person's position and, based on a comparison with a preset focal distance, prompts the user to move forward or backward.
In some embodiments, the controller further gives a specific distance for the user to move left or right, scaling the on-screen offset by the proportional relation between the focal distance at the human body's position and the preset focal distance. For example, for the same pixel offset (e.g., 800 px), a ratio of 0.8 may yield a reminder to move 10 cm, and a ratio of 1.2 a reminder to move 15 cm, with the direction (left or right) determined by the sign of the offset.
In some embodiments, the user is alerted to the successful position adjustment when the offset value is less than a preset threshold.
In some embodiments, the preset coordinate system is a three-dimensional coordinate system, and further, the position coordinates of the moving object and the desired position in the preset coordinate system are three-dimensional coordinates, and the offset of the object position relative to the desired position is a three-dimensional offset vector.
In some embodiments, assuming the position coordinates of the target reference point in the preset coordinate system are (x, y, z) and the position coordinates of the graphic reference point are (X, Y, Z), the offset vector of the target position relative to the desired position is calculated as (X - x, Y - y, Z - z).
In some embodiments, when the target position has no offset relative to the desired position: if the graphic element identifying the desired position or the interface prompt identifying the target moving direction is being presented in the user interface, the controller cancels their display and simultaneously presents the target video and a preview picture of the local video in the user interface, such as the interface shown in fig. X; if the graphic element or interface prompt is not being presented, the target video and the local video preview are presented directly and simultaneously, such as the user interface shown in fig. 10.
It should be noted that, in the above examples, "the target position is offset from the desired position" may mean that the offset between the two is greater than a preset offset, and accordingly "the target position has no offset from the desired position" may mean that the offset between the two is smaller than the preset offset.
In the above embodiments, after receiving the instruction indicating to follow the target video, the controller does not directly play the target video to start the follow-along process. Instead, it displays only a preview picture of the local video and, by presenting a graphic element identifying the preset desired position and a prompt guiding the moving target there, moves the moving target (user) to the desired position, so that during follow-along the image collector can capture the images most favorable for analyzing and comparing the user's actions.
In some embodiments, the display device may control the camera's rotation in the horizontal or longitudinal direction according to whether the device is placed on a stand or wall-mounted; for the same requirement, the rotation angles differ between the two placement states.
The human body is detected continuously. In some embodiments, when the offset between the position coordinates of the target reference point and those of the graphic reference point in the preset coordinate system meets a preset requirement, or the offset between the target torso portion and the graphic torso portion meets a preset requirement, the controller withdraws the guide interface and displays the follow-along interface.
In some embodiments, the display shows the interface of fig. 10a while the user follows a media asset video. When the display shows this interface, the user may trigger display of a floating layer containing controls by operating a designated key on the control device (which may be the down key in some embodiments). In response to the user operation, as shown in fig. 13 or 14, a control floating layer is presented above the follow-along interface, including at least one of a control for selecting media videos, a control for adjusting play speed, and a control for adjusting definition. The user can move the focus by operating the control device and select a control in the control floating layer. When the focus falls on a control, a sub-floating layer corresponding to that control is presented, in which at least one sub-control is displayed. For example, when the focus falls on the control for selecting media videos, the corresponding sub-floating layer presents several different media video controls. A sub-floating layer is a floating layer located above the control floating layer. In some embodiments, the controls in a sub-floating layer may be implemented by adding new controls on the control floating layer.
Fig. 13 illustrates an application interface (play control interface) in which a control floating layer is displayed above the layer containing the first and second play windows. The control floating layer includes a selection control, a double-speed play control, and a definition control; because the focus is on the selection control, the corresponding sub-floating layer is also presented, displaying controls for several other media videos. In the interface shown in fig. 13, the user can select other media videos to play and follow by moving the focus.
In some embodiments, when the display shows an interface such as that of fig. 13, the user may move the focus to the double-speed play control; in response to the focus falling on it, the corresponding sub-floating layer is presented, as shown in fig. 14. Several sub-controls for adjusting the play speed of the target video are displayed in this sub-floating layer; when one is operated, the play speed is adjusted, in response to the user operation, to the speed corresponding to the operated control. For example, the interface shown in fig. 14 offers "0.5x", "0.75x", and "1x".
In another embodiment, when the display shows the interface of fig. 13 or 14, the user may move the focus to the definition control; in response to the focus falling on it, the corresponding sub-floating layer is presented, as shown in fig. 15. Several controls for adjusting the definition of the target video are displayed in this sub-floating layer; when one is operated, the definition is adjusted, in response to the user operation, to the definition corresponding to the operated control. For example, "720P high definition" and "1080P ultra definition" are shown in the interface of fig. 15.
In some embodiments, when the control floating layer is presented in response to a user operation, the focus is placed on a preset default control, which may be any one of the controls in the control floating layer. For example, as shown in fig. 13, the preset default control is the selection control.
In some embodiments, the other media videos displayed in the sub-floating layer corresponding to the selection control are sent to the display device by the server. For example, in response to the user selecting the selection control, the display device requests from the server the media asset information to be displayed in the selection list, such as resource names or resource covers. After receiving the media asset information returned by the server, the display device displays it in the selection list.
In some embodiments, to help the user distinguish the media assets in the selection list, the server, after receiving the display device's request, queries the user's historical follow-along records by user ID to find the media asset videos the user has practiced. If the media asset information delivered to the display device includes a video the user has practiced, an identifier indicating that the user has practiced it is added to the corresponding media asset information. Accordingly, when the display device displays the selection list, the practiced media asset videos are marked; for example, a "trained" badge is displayed in the interface as shown in fig. 12.
In some embodiments, to help the user distinguish the media assets in the selection list, the server, after receiving the display device's request, determines whether any resource in the requested selection list is newly added, for example by comparing the selection list resources previously sent to the display device with the current ones. If a resource is newly added, the server adds an identifier indicating a newly added video to the corresponding resource information. Accordingly, when the display device displays the selection list, the newly added media asset videos are marked; for example, an "update" badge is displayed in the interface as shown in fig. 13.
In some embodiments, in response to an instruction entered by the user to follow the demonstration video, the controller obtains the demonstration video from the server, or obtains a pre-downloaded copy from local storage, according to the demonstration video's resource identifier.
In some embodiments, the exemplary video includes the image data and audio data described above. The image data comprises a video frame sequence which shows a plurality of actions that a user needs to exercise, such as leg lifting actions, squat actions and the like. The audio data may then be narrative audio of the exemplary action and/or background sound audio (e.g., background music).
In some embodiments, the controller processes the exemplary video by controlling the video processor to parse displayable image signals and audio signals therefrom, the audio signals being processed by the audio processor and played in synchronization with the image signals.
In some embodiments, the exemplary video includes the above-mentioned image data, audio data, and subtitle data corresponding to the audio data, and the controller plays the image, the audio, and the subtitle synchronously when the exemplary video is played.
As previously described, a demonstration video includes a sequence of video frames; under the controller's playback control, the frames are displayed over time to present to the user the change in limb form for each action. The user must go through the same limb-form changes to complete each action, and the embodiments of the present application analyze and evaluate the completion of the actions from the recorded limb forms. In some embodiments, a motion model of joint points is acquired in advance from the video frame sequence of the demonstration video; during follow-along, continuous joint points are extracted from the local video and compared with the pre-acquired joint-point motion model to determine the degree of matching of the actions.
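As an illustration of this comparison, here is a minimal Kotlin sketch under assumed types; the centroid-normalized cosine-style score is a stand-in for whatever pose metric an implementation actually uses.

```kotlin
// Compare joint points extracted from the local video against a pre-acquired
// joint-point model from the demonstration video.
data class Point2(val x: Float, val y: Float)
typealias Pose = List<Point2>   // one entry per body joint

fun poseSimilarity(user: Pose, model: Pose): Float {
    // Center each pose on its centroid so only relative joint layout matters.
    fun centered(p: Pose): List<Point2> {
        val cx = p.map { it.x }.average().toFloat()
        val cy = p.map { it.y }.average().toFloat()
        return p.map { Point2(it.x - cx, it.y - cy) }
    }
    val u = centered(user)
    val m = centered(model)
    var dot = 0f; var nu = 0f; var nm = 0f
    for (i in u.indices) {
        dot += u[i].x * m[i].x + u[i].y * m[i].y
        nu += u[i].x * u[i].x + u[i].y * u[i].y
        nm += m[i].x * m[i].x + m[i].y * m[i].y
    }
    return dot / (kotlin.math.sqrt(nu) * kotlin.math.sqrt(nm) + 1e-6f)
}

// Degree of matching for a key action: best similarity across nearby local frames.
fun actionMatch(userFrames: List<Pose>, modelPose: Pose): Float =
    userFrames.maxOfOrNull { poseSimilarity(it, modelPose) } ?: 0f
```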
In some embodiments, the limb-form change process a key action requires (i.e., the movement trajectory of the limb) is described as going from an incomplete-state action to a complete-state action and then to a release action; that is, the incomplete-state action occurs before the complete-state action and the release action occurs after it, the complete-state action being the key action to be completed. In some embodiments, the complete-state action is also called the key demonstration action or key action. In some embodiments, labels may be added to identify the limb-change process, with different labels preset for the action frames at different nodes.
On this basis, in some embodiments, the frames showing key actions in the video frame sequence of a media asset video are called key frames, and key labels corresponding to each key frame are marked on the time axis of the video; that is, the time point represented by a key label is the time point at which the corresponding key frame plays. The key frames in a video frame sequence constitute a key frame sequence.
Further, a demonstration video may include a key frame sequence containing several key frames: one key frame corresponds to each key label on the time axis, and each key frame shows one key action. In some embodiments, this key frame sequence is also referred to as the first key frame sequence.
In some embodiments, N sets of start-stop labels are preset on the time axis of the media asset video (including the demonstration video), corresponding to N video segments, each segment used to show one action (also called a complete-state action or key action). Each set of start-stop labels includes a start label and an end label: when the progress mark on the time axis reaches a start label during playback, the demonstration of an action begins to play, and when it reaches the end label, the demonstration of that action finishes playing.
Due to differences in individual factors such as learning ability and physical coordination, some users (e.g., children) act very slowly and find it difficult to keep pace with the demonstration video's playing speed.
To solve this problem, in some embodiments, while the demonstration video plays, the playing speed is automatically reduced when the demonstration of an action begins, so that the user can better learn and practice the key action, avoid missing it, and improve in time; when the demonstration of the action (i.e., the video clip showing it) ends, the original playing speed is automatically restored.
In some embodiments, the video segments showing key actions are called key segments, and a demonstration video generally includes several key segments and at least one non-key segment (also called other segments). A non-key segment is a segment of the demonstration video containing non-key actions, e.g., a segment in which the action presenter stands still while narrating the action.
In some embodiments, the controller controls the display to show a user interface including a window for playing video. In response to an input instruction to play the demonstration video, it acquires the demonstration video, which includes several key segments that, when played, show the key actions the user needs to practice; in some embodiments, the demonstration video the user indicates to play is also called the target video. The controller plays the demonstration video in the window at a first speed; when a key segment starts playing, it adjusts the playing speed from the first speed to a second speed; and when the key segment finishes playing, it adjusts the speed from the second speed back to the first speed, where the second speed is different from the first speed.
In some embodiments, the controller plays the demonstration video while detecting the start and end labels on its time axis: when a start label is detected, the playing speed is adjusted from the first speed to the second speed; when the end label is detected, it is adjusted from the second speed back to the first speed. The start label represents the beginning of a key segment's playback and the end label its completion.
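A minimal sketch of this start/end-label speed switch; the Player interface is hypothetical, and the label pairs are modeled as KeySegment time ranges purely for illustration.

```kotlin
// Drop to the second speed inside a key segment, restore the first speed outside.
data class KeySegment(val startMs: Long, val endMs: Long)   // one start-stop label pair

interface Player { fun setRate(rate: Float) }

class SpeedController(
    private val player: Player,
    private val segments: List<KeySegment>,
    private val firstSpeed: Float = 1.0f,
    private val secondSpeed: Float = 0.5f
) {
    private var slow = false

    // Called periodically with the current position of the progress mark.
    fun onProgress(positionMs: Long) {
        val inKeySegment = segments.any { positionMs in it.startMs..it.endMs }
        if (inKeySegment && !slow) { player.setRate(secondSpeed); slow = true }  // start label reached
        if (!inKeySegment && slow) { player.setRate(firstSpeed); slow = false }  // end label passed
    }
}
```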
In some embodiments, the second speed is lower than the first speed.
In the above example, since the second speed is lower than the first, automatic slow playback is achieved when the start label is detected (i.e., when the progress mark on the time axis reaches the start label), adapting the demonstration video's playing speed to the user's action speed; when the end label is detected, the first speed is automatically restored.
In some embodiments, the first speed is the normal playing speed, i.e., 1x, and the second speed may be a preset 0.75x or 0.5x.
In some embodiments, the demonstration video file includes video frame data and audio data, and the same sampling rate is used to read and process both during playback. Therefore, when the playing speed needs to be adjusted, the playing speeds of both the video frames and the audio signal are adjusted, keeping audio and video synchronized.
In other embodiments, the demonstration video file includes video frame data and audio data whose sampling rates are adjusted and controlled independently during playback. When the playing speed needs to be adjusted, only the sampling rate of the video frame data is changed to adjust the video picture's speed, while the sampling rate of the audio data, and thus the audio playing speed, is kept unchanged. For example, when the playing speed must be reduced, the audio speed is not reduced, so that the user can still hear the narration normally while watching the slowed action demonstration.
In some embodiments, a key segment includes its video data and its audio data. When the key segment starts playing, the speed of playing its video data is adjusted to the second speed while the speed of playing its audio data is kept at the first speed. When the key segment finishes playing, the video data of the next segment is played at the first speed, with its audio data played synchronously at the first speed, where the next segment is the file segment in the demonstration video immediately following the key segment, such as the adjacent other segment.
In some embodiments, during slow playback of the video frames, the controller detects whether the key segment has finished (e.g., by detecting the end label). If the end label has not been detected when the audio data for the corresponding period finishes playing, that audio data may be played repeatedly; for example, at 0.5x video speed, the period's audio data may be played twice. After the period's video frame data finishes playing, i.e., after the end label is detected, the audio data and video frame data of the next period are played synchronously.
In other embodiments, during slow playback of the video frames, the controller detects whether the key segment has finished (e.g., by detecting the end label). If the end label has not been detected when the period's audio data finishes playing, the audio is paused until the period's video frame data finishes, i.e., until the end label is detected, after which the audio data and video frame data of the next period are played synchronously. For example, with a start label at 0:05 and an end label at 0:15 on the time axis, playing the video frames at 0.5x speed means the video frame data for 0:05-0:15 takes 20 s to play while the corresponding audio takes 10 s; to keep audio and video synchronized after 0:15, the audio is paused when the progress mark reaches 0:10 and resumed when it reaches 0:15.
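The timing logic of this example can be sketched as follows; the function and its defaults (0:05 start, 0:15 end, 0.5x) mirror the worked example above and are otherwise illustrative.

```kotlin
// Decide what the audio track should be doing, given wall-clock time elapsed
// since the key segment began. Times are in milliseconds.
fun audioActionAt(
    wallMs: Long,
    segStartMs: Long = 5_000,
    segEndMs: Long = 15_000,
    videoRate: Float = 0.5f
): String {
    val contentMs = segStartMs + (wallMs * videoRate).toLong()   // video content position
    // Audio plays at 1x, so it exhausts its content when the video content
    // mark reaches: start + duration * rate (0:10 in the worked example).
    val audioDoneAtContent = segStartMs + (segEndMs - segStartMs) * videoRate
    return when {
        contentMs >= segEndMs -> "resume audio with next segment"  // end label reached (0:15)
        contentMs >= audioDoneAtContent -> "audio paused"          // content mark past 0:10
        else -> "audio playing at 1x"
    }
}
```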
In some embodiments, during the user's follow-along, automatic speed adjustment applies only to the demonstration video's playback, not to the local video stream.
In some embodiments, the controller controls the display to show a user interface including a first play window for playing the demonstration video and a second play window for playing the local video stream. In response to an input instruction to play the demonstration video, it acquires the demonstration video, plays it in the first play window, and plays the local video stream in the second play window. The other segments of the demonstration video are played in the first window at the first speed and the key segments at the second, lower speed, while the local video stream is played in the second window at a fixed preset speed.
In some embodiments, the fixed preset speed may be the first speed. In some embodiments, considering young users' learning ability and weaker physical coordination, if the user's age falls within a preset age range, the speed is automatically reduced when the demonstration of a key action begins.
In some embodiments, if the user's age is within a first age range, playing the exemplary video at a first speed; if the user's age is within a second age interval, the exemplary video is played at a second speed, wherein the second speed is different from the first speed.
In some embodiments, the first age interval and the second age interval are age intervals divided by a predetermined age, e.g., an age interval above the predetermined age is defined as the first age interval, and an age interval below the predetermined age (including the predetermined age) is defined as the second age interval. For example, the first age interval or the second age interval may be an age interval of preschool children (e.g., 1-7 years), an age interval of school children, an age interval of young people, an age interval of middle-aged people, or an age interval of elderly people.
It should be noted that a person skilled in the art may set the first speed and the second speed according to the specific value ranges of the first and second age intervals, following the principle that the playing speed of the demonstration video should best match the learning ability and motor ability of the user.
It should be further noted that the first age interval and the second age interval are merely exemplary, and in other embodiments, the corresponding playing speed may be set for more age intervals according to need, and when the user's age is located in the corresponding age interval, the exemplary video may be played at the corresponding playing speed. For example, the demonstration video is played at a third speed when the age of the user is in a third age range, at a fourth speed when the age of the user is in a fourth age range, and so on.
In some embodiments, the user is in a first age range when the user's age is greater than a first starting age and less than a first ending age, and the user's age is in a second age range when the user's age is greater than a second starting age and less than a second ending age.
In some embodiments, there may be exactly two age intervals, demarcated by a preset age.
In some embodiments, when the age of the user is above a preset age, controlling the display to play the demonstration video at a first speed; when the age of the user is not higher than the preset age, controlling the display to play the demonstration video at a second speed; wherein the second speed is lower than the first speed.
In some embodiments, if the age of the user is not higher than the preset age or is in the second age range, when the key segment starts playing, the playing speed of playing the demonstration video is adjusted to the second speed; and when the playing of the key fragment is finished, adjusting the playing speed of the played demonstration video from the second speed to the first speed.
In some embodiments, when a key segment begins to play, the speed at which the display plays the video data of the key segment is adjusted from the first speed to the second speed, while the audio output unit keeps playing the key segment's audio data at the first speed; after the audio data of the key segment finishes playing, the audio output unit is controlled either to pause or to loop the key segment's audio data. The audio output unit is display device hardware, such as a speaker, for playing audio data.
In some embodiments, when the key segment ends playing, the display is controlled to play the video data of the next segment at the first speed, and the audio output unit is controlled to synchronously play the audio data of the next segment at the first speed, wherein the next segment is a segment located after the key segment in the exemplary video.
In some embodiments, if the age of the user is not higher than the preset age, controlling the display to play the video data of the demonstration video at the second speed; controlling the audio output unit to play the audio data of the exemplary video at the first speed.
In a specific implementation, the controller acquires the user's age and determines whether it is below the preset age; if it is, then during playback of the demonstration video the controller detects the start and termination labels on the time axis, adjusting the playing speed from the first speed to the second speed when the start label is detected, and back from the second speed to the first speed when the termination label is detected.
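The label-driven speed switching described above can be sketched as follows; the speed values, the preset age, and the representation of labels as (start, end) pairs in seconds are assumptions for illustration, not the device's actual API.

```python
FIRST_SPEED = 1.0    # normal playback
SECOND_SPEED = 0.5   # reduced speed for key segments (assumed value)
PRESET_AGE = 7       # assumed demarcation age

def select_speed(position_s: float, key_labels: list[tuple[float, float]],
                 user_age: int) -> float:
    """Return the playback speed for the current timeline position.

    Users above the preset age always get the first speed; younger
    users get the second speed inside any key segment delimited by
    a (start_label, termination_label) pair.
    """
    if user_age > PRESET_AGE:
        return FIRST_SPEED
    inside_key_segment = any(start <= position_s < end
                             for start, end in key_labels)
    return SECOND_SPEED if inside_key_segment else FIRST_SPEED

# Example: key segment labeled 0:05-0:15; a 6-year-old at 0:08 gets 0.5x.
print(select_speed(8.0, [(5.0, 15.0)], user_age=6))  # 0.5
```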
In some embodiments, the controller obtains user information from the user ID, and obtains age information of the user from the user information.
In other embodiments, the controller activates the image collector in response to a user-input instruction to play the demonstration video, identifies the person in the local image captured by the image collector, and estimates the user's age from the identified person image using a preset age recognition model.
In some embodiments, different low-speed parameters may be set for different age ranges; for example, if the user is 3-5 years old, the second speed is 0.5×; if the user is 6-7 years old, the second speed is 0.75×.
As previously described, demonstration videos have specified types, such as the aforementioned "loving lessons" and "happy lessons", which may be characterized by a type identifier. In view of the differences in audience and exercise difficulty among the different types of videos, in some embodiments, if the demonstration video is of a preset type, the speed is automatically reduced when the demonstration of a key action begins; if it is not of the preset type, the whole video is played at normal speed unless the user adjusts the speed manually.
In some embodiments, the controller obtains the type identifier of the demonstration video; if the video is determined to be of a preset type according to the identifier, then during playback the controller detects the start and termination labels on the time axis, adjusting the playing speed from the first speed to the second speed when a start label is detected, and back from the second speed to the first speed when a termination label is detected.
In some embodiments, the resource information issued by the server to the display device includes a type identifier for each resource, so that the display device can determine whether the demonstration video is of a preset type according to its type identifier, where the preset type includes, but is not limited to, the type of some or all of the resources provided by a children's channel, as well as children's resources provided by other channels.
In some embodiments, different low-speed parameters may be set for different types; for example, if the demonstration video belongs to the "loving lessons" type, the second speed is 0.5×; if it belongs to the "happy lessons" type, the second speed is 0.75×.
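The per-type low-speed parameters can be modeled as a simple lookup, sketched below; the type identifiers and speed values mirror the examples above and are otherwise assumed.

```python
# Hypothetical mapping from content type identifier to key-segment speed.
LOW_SPEED_BY_TYPE = {
    "loving_lessons": 0.5,
    "happy_lessons": 0.75,
}

DEFAULT_SPEED = 1.0  # non-preset types play normally throughout

def key_segment_speed(type_id: str) -> float:
    """Speed used inside key segments for a given video type."""
    return LOW_SPEED_BY_TYPE.get(type_id, DEFAULT_SPEED)

print(key_segment_speed("loving_lessons"))  # 0.5
print(key_segment_speed("documentary"))     # 1.0 (not a preset type)
```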
In some embodiments, the playing speed may be adjusted automatically according to the user's follow-up performance, so that the low-speed mechanism adapts to different users: the parts of the demonstration video that the user can follow smoothly are played at normal speed, and the parts that the user has difficulty following are played at low speed.
For convenience of explanation, the video frame sequence contained in the demonstration video is referred to as the first video frame sequence; it includes first key frames showing completion-state actions, the N first key frames corresponding to N completion-state actions forming a first key frame sequence. The first video frame sequence also includes non-key frames showing incomplete-state actions and release actions.
In some embodiments, in response to an instruction to follow the demonstration video, the controller activates the image collector and obtains the user's follow-up video stream from the local video stream collected by the image collector; the follow-up video stream contains some or all of the video frames of the local video stream. For distinction, this application refers to the sequence of video frames in the follow-up video stream as the second video frame sequence, which includes second video frames showing (recording) user actions.
In some embodiments, user actions are analyzed from the follow-up video stream. If it is detected that, at one or more time points (or periods) at which a completion-state action should be made, the user does not make the corresponding completion-state action, that is, the user's action is an incomplete-state action, this indicates that the action is difficult for the user to follow, and the display device may reduce the playing speed of the demonstration video. If it is detected that, at one or more consecutive time points (or periods) at which completion-state actions should be made, the user has already completed the corresponding actions, that is, the user's action is a release action, this indicates that these actions are easy for the user to follow, and the display device may increase the playing speed of the demonstration video.
In some embodiments, in response to an input instruction to follow the demonstration video, the controller obtains the demonstration video, which includes the first key frame sequence showing completion-state actions, and obtains the user's follow-up video stream, which includes the second video frame sequence showing user actions, from the local video stream collected by the image collector; the controller plays the demonstration video on the display and adjusts its playing speed when the user action in the second video frame corresponding to a first key frame does not match the completion-state action shown by that first key frame.
The second video frame corresponding to the first key frame is extracted from the second video frame sequence according to the time information of the played first key frame.
In some embodiments, the time information of the first key frame may be a time when the display device plays the frame, and according to the time when the display device plays the first key frame, the second video frame corresponding to the time is extracted from the second video frame sequence, that is, the second video frame corresponding to the first key frame. The second video frame corresponding to a certain time may be a second video frame whose time stamp is the time, or a second video frame whose time shown by the time stamp is closest to the time.
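A minimal sketch of extracting the second video frame whose time stamp is the given time or closest to it, as just described; the frame structure and timestamp field are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float  # seconds
    data: bytes = b""

def frame_nearest(frames: list[Frame], t: float) -> Frame:
    """Return the frame whose timestamp is closest to time t.

    This models 'the second video frame whose time stamp is the time,
    or whose time stamp is closest to the time'.
    """
    return min(frames, key=lambda f: abs(f.timestamp - t))

stream = [Frame(0.0), Frame(0.033), Frame(0.066), Frame(0.1)]
print(frame_nearest(stream, 0.05).timestamp)  # 0.066 (closest to 0.05)
```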
In some embodiments, the body may pass through the same position during both the preparation phase and the release phase, so the second video frame and its neighboring frames may be extracted together; after the joint-point data of these consecutive frames is extracted, it can be determined whether the action belongs to the preparation phase or the release phase.
In some embodiments, the controller extracts the corresponding second video frame from the second video frame sequence according to the played first key frame, and sends the extracted second video frame (and the corresponding first key frame) to the server. The server compares the first key frame with the second video frame to determine whether the user action in the second video frame matches the completion-state action shown by the first key frame, and returns a speed adjustment instruction to the display device when they do not match.
In some embodiments, the controller performs joint-point identification (i.e., user action identification) of the second video frame and/or other video frames locally on the display device and uploads the joint-point data and corresponding time points to the server. The server determines the corresponding target demonstration video frame from the received time point, compares the received joint-point data with the joint-point data of that frame, and feeds the comparison result back to the controller.
In some embodiments, the cases in which the user action in the second video frame does not match the completion-state action shown by the corresponding first key frame include: the user action is an incomplete-state action preceding the completion-state action; or the user action is a release action following the completion-state action. Accordingly, if the server determines that the user action in the second video frame is an incomplete-state action, it returns an instruction to reduce speed, causing the display device to lower the playing speed of the target video; if the server determines that the user action is a release action, it returns an instruction to increase speed, causing the display device to raise the playing speed of the target video.
Of course, in other embodiments, the display device independently determines whether the user action in the second video frame matches the completion state action shown in the first keyframe, without interaction with the server, which is not described herein.
It should be noted that, in the above implementations that adjust the playing speed in real time according to the user's practice, once the playing speed reaches the preset maximum or minimum value it is no longer adjusted up or down, respectively.
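The two adjustment directions, together with the clamping rule just noted, can be sketched as follows; the step size and the speed bounds are assumed values.

```python
SPEED_MIN, SPEED_MAX = 0.5, 2.0   # assumed preset bounds
SPEED_STEP = 0.25                 # assumed adjustment step

def adjust_speed(current: float, user_action: str) -> float:
    """Adjust the playing speed from the action classification.

    'incomplete' -> user is struggling, slow down;
    'release'    -> user is ahead, speed up;
    anything else (a matching completion-state action) -> keep speed.
    The result is clamped to the preset minimum/maximum.
    """
    if user_action == "incomplete":
        current -= SPEED_STEP
    elif user_action == "release":
        current += SPEED_STEP
    return max(SPEED_MIN, min(SPEED_MAX, current))

print(adjust_speed(1.0, "incomplete"))  # 0.75
print(adjust_speed(2.0, "release"))     # 2.0 (already at the maximum)
```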
In some embodiments, the user may pause video playback by pressing a key or by voice input, and later resume it the same way. For example, during follow-up of the target video, when the display shows the interface of fig. 10, the user may press the "OK" key on the control device; in response, the controller pauses playback of the target video and presents a pause state identifier, as shown in fig. 16, on the layer above the play screen.
During follow-up of the target video, the controller captures local images through the image collector and detects whether a user target, i.e., a person, is present. When the display device controller (or the server) detects no moving target in the local image, the display device automatically pauses playback of the target video (or the server instructs it to do so), and the pause state identifier shown in fig. 16 is presented on the layer above the play screen.
In the above-described embodiments, the pause control performed by the controller does not affect the display of the local video picture.
In the pause play state as shown in fig. 16, the user can resume playing the target video by operating a key or voice input on the control device, for example, the user can resume playing the target video by pressing an "OK" key, and the controller resumes playing the target video and cancels the display of the pause state identification in fig. 16 in response to the key input of the user.
As can be seen, in the above example the user must operate the control device to make the display device resume playing the target video, which makes the follow-up experience less friendly.
To address this problem, in some embodiments, in response to pause control of the target video, the controller presents a pause interface on the display and shows a target key frame in it, where the target video contains a number of key frames, each showing a key action that needs to be followed, and the target key frame is a designated one of them. After playback is paused, the image collector keeps working, and the device determines whether the user action in the locally collected images matches the key action shown by the target key frame: if it matches, playback of the target video is resumed; if not, the target video remains paused.
In the above embodiment, the target key frame may be the key frame showing the most recent key action, i.e., the last key action played before the target video was paused, or it may be a representative one of the several key frames.
It should be noted that the target video in the above examples refers to the video whose playback is paused, including but not limited to a demonstration video of dance movements, exercise movements, or gymnastics movements, an MV played in a karaoke scene, or a video of a demonstration avatar's movements.
As some possible implementations, a plurality of key labels are identified in advance on the time axis of the target video, each key label corresponding to one key frame; that is, the time point represented by a key label is the time point at which the corresponding key frame is played. In response to receiving pause control of target video playback, the controller detects the target key label on the time axis according to the time point at which the pause occurred, acquires the target key frame according to that label, and displays it in the pause interface, where the time point of the target key label precedes the time point of the pause. In this way, a video frame the user has already practiced serves as the pause prompt, which adds interest.
In other possible implementations, in response to pause control of target video playback, the controller rolls the target video back to the moment of the target key label and then pauses it, so that the target key frame corresponding to the target key label is displayed in the pause interface.
In some embodiments, the target key label is a key label earlier than the current time and closest to the current time on the time axis, and the corresponding target key frame is a key frame showing the previous key action.
In the above examples, when or after pause control is executed, the target key frame showing a key action is presented in the pause interface as a prompt action for resuming playback. In the paused state, the user can then resume playback of the target video simply by performing the prompted action, without operating the control device, which improves the follow-up experience.
In some embodiments, displaying the target key frame in the pause interface may be implemented as follows: after the time axis is rolled back to the time point of the target key label, playback of the demonstration video stops and a pause control is added to the demonstration video play window. The controller acquires the target key frame, or its joint points, while the camera continues to collect local video data and detect the human body in it; when the matching degree between the detected body action and the action in the target key frame reaches a preset threshold, the demonstration video resumes playing.
In some embodiments, resuming playback may mean continuing the demonstration video from the time point of the rolled-back key label.
In some embodiments, playback of the demonstration video may instead resume from the time point at which the pause control was received.
In some embodiments, displaying the target key frame in the pause interface may instead be implemented without rolling back the time axis: playback of the demonstration video stops, a pause control is added to the demonstration video play window, and the target key frame is displayed in a floating layer above the play window. The controller acquires the target key frame, or its joint points, while the camera continues to collect local video data and detect the human body in it; when the matching degree between the detected body action and the action in the target key frame reaches the preset threshold, the demonstration video resumes playing and the floating layer is dismissed.
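Common to both variants above is a resume-on-match loop, sketched below; capture_frame, detect_pose, and pose_similarity are hypothetical stand-ins for the device's camera and pose-matching facilities, and the threshold value is assumed.

```python
import time

MATCH_THRESHOLD = 0.8  # assumed preset threshold

def wait_for_prompt_action(target_pose, capture_frame, detect_pose,
                           pose_similarity, poll_s: float = 0.1) -> None:
    """Block while paused; return once the user's detected pose matches
    the target key frame's pose closely enough to resume playback."""
    while True:
        frame = capture_frame()            # local image from the camera
        pose = detect_pose(frame)          # joint points, or None if no person
        if pose is not None and pose_similarity(pose, target_pose) >= MATCH_THRESHOLD:
            return                         # caller resumes the demonstration video
        time.sleep(poll_s)

# Stub demo: a 'pose' is just a number here; similarity is closeness.
wait_for_prompt_action(
    target_pose=1.0,
    capture_frame=lambda: None,
    detect_pose=lambda _frame: 1.0,
    pose_similarity=lambda a, b: 1.0 - abs(a - b),
)
print("matched; resume demonstration video")
```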
In some embodiments, the frame displayed at the time of the pause may be any video frame of the demonstration video.
In some embodiments, the follow-up procedure ends automatically when playback of the target video being followed completes. In response to the completion, the controller turns off the image collector, closes the follow-up interface containing the first and second play windows shown in fig. 10, and presents an interface containing evaluation information.
In some embodiments, the user may end the follow-up procedure before it completes by pressing a key or by voice input on the control device; for example, the user may press the "back" key to input an instruction to end the follow-up. In response, the controller pauses playback of the target video and presents an interface containing save information, such as the save page exemplarily shown in fig. 17.
When the display shows the save interface of fig. 17, the user can operate the control for returning to the follow-up interface to continue following, or operate the control for confirming exit to end the follow-up procedure.
In some embodiments, in response to a user-input instruction to exit the follow-up, the played duration of the target video is determined so as to decide whether to support resuming on the next play.
In some embodiments, if the played duration of the target video is not less than a preset duration (e.g., 30 s), the play position is saved so that playback can be resumed next time; if the played duration is less than the preset duration, the position is not saved and the target video restarts from the beginning on its next play.
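A sketch of this save-position rule, assuming a simple in-memory store keyed by (user, video); the 30 s threshold follows the example above.

```python
MIN_SAVE_SECONDS = 30  # preset duration from the example

resume_points: dict[tuple[str, str], float] = {}

def on_exit_follow_up(user_id: str, video_id: str, played_s: float) -> None:
    """Save the play position only if enough of the video was played."""
    if played_s >= MIN_SAVE_SECONDS:
        resume_points[(user_id, video_id)] = played_s
    else:
        resume_points.pop((user_id, video_id), None)  # restart next time

def start_position(user_id: str, video_id: str) -> float:
    return resume_points.get((user_id, video_id), 0.0)

on_exit_follow_up("u1", "v1", 90.0)   # saved: resume at 1:30
on_exit_follow_up("u1", "v2", 12.0)   # not saved: restart
print(start_position("u1", "v1"), start_position("u1", "v2"))  # 90.0 0.0
```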
In some embodiments, if the played duration of the target video is not less than the preset duration (e.g., 30 s), the local image frames corresponding to the target key frames are saved for display in a subsequent evaluation interface or play history; if the played duration is less than the preset duration, they are not saved. The local image frame corresponding to a target key frame is the frame determined from the local video when the target key label was detected.
In some embodiments, the frame determined from the local video when a target key label is detected may be the local image frame collected by the camera at the time point of the label, or the local image frame, collected at or near that time point, that best matches the target key frame.
In some embodiments, when the user selects a partially played video for follow-up, an interface containing a resume prompt is presented in response to the play instruction; the prompt interface shows the last play position and controls for the user to choose whether to resume, so that the user can decide autonomously. Fig. 18 illustrates such a resume prompt interface, showing the last play position (1 minute 30 seconds), a control for restarting playback ("restart"), and a control for resuming ("continue follow-up").
In some embodiments, in response to an instruction input in the resume prompt interface of fig. 18, playback of the demonstration video is either restarted, e.g., from 0 minutes 0 seconds, or resumed from the last play position, e.g., from 1 minute 30 seconds.
In some embodiments, when the controller receives the user's confirmation to exit the follow-up, it turns off the image collector, closes the first and second play windows in the follow-up interface shown in fig. 10a, and presents an interface containing evaluation information.
In some embodiments, in response to the end of the follow-up procedure, an interface is presented on the display containing rating information including at least one of star grade achievements, scoring achievements, experience value increments, and experience value totals.
In some embodiments, the star grade, score, and experience value increment are determined from the number of target key frame actions completed during target video playback and the action matching degree achieved when completing them; both quantities are positively correlated with the star grade, score, and experience value increment.
It should be noted that, in some embodiments, if the user exits the follow-up early, the controller determines, in response to the exit instruction, whether the played duration of the target video exceeds a preset value; if so, it generates scoring information and detailed performance information from the follow-up data already generated (such as the collected local video stream and partial action scores); if not, the generated follow-up data is deleted.
Fig. 19 illustrates an interface for presenting scoring information, in which star grade, experience value increment, and total experience value are presented as items or controls; the control presenting the total experience value is consistent with that shown in fig. 10. In addition, to let the user view the detailed performance, fig. 19 also shows a "view the score immediately" control; by operating it the user can enter the interface presenting detailed performance information shown in fig. 20 or fig. 22.
In some embodiments, the experience value is user data tied to level progression: it is earned through user behavior in the target application, i.e., the user can raise the experience value by practicing more demonstration videos. It is also a quantitative measure of action proficiency: the higher the experience value, the more proficient the user's practiced actions, and when the experience value accumulates to a certain threshold the user's level is raised.
To prevent users from maliciously farming experience values by repeatedly practicing the same demonstration video, in some embodiments the user's follow-up performance is scored from the local video stream collected by the image collector, with the score mapped to that demonstration video. The server can look up the recorded historical highest score for the video using the demonstration video ID and the user ID; if the new score exceeds the recorded historical highest score, the new experience value derived from the score is displayed, otherwise the original experience value is displayed. The recorded historical highest score is the highest score the user has ever obtained practicing that demonstration video.
In some embodiments, when the follow-up result interface for a follow-up session is presented, the session's score and the new experience value derived from it are shown in that interface.
In some embodiments, during playback of the demonstration video (i.e., during the follow-up), actions in the demonstration video and the local video stream are matched to obtain a score for the session; after playback ends (i.e., after the follow-up ends), a follow-up result interface is generated from the score, containing an experience value control: when the score exceeds the user's historical highest score for the video, the control displays the experience value updated from the score; otherwise it displays the experience value as it was before the session.
In some embodiments, the controller obtains the demonstration video in response to an input instruction to play (follow) it and collects the local video stream through the image collector, where the demonstration video includes first video frames showing the demonstration actions the user needs to follow and the local video stream includes second video frames showing the user's actions; corresponding first and second video frames are matched and a score is derived from the matching results; if the score exceeds the recorded historical highest score, the new experience value derived from the score is loaded into the experience value control; otherwise the original experience value, i.e., the experience value before the session, is loaded and displayed.
In some embodiments, key labels on the time axis are detected while the demonstration video plays. Each time a key label is detected, a second key frame corresponding to the first key frame is obtained from the second video frames according to the time information represented by the label; the second key frame shows the user's key follow-up action. A matching result is then obtained for the first and second key frames that correspond to the same key label. For example, the pair may be uploaded to the server, which performs skeleton-point matching between the key demonstration action in the first key frame and the key user action in the second key frame, after which the display device receives the returned matching result. Alternatively, the display device controller may itself identify the key demonstration action in the first key frame and the key follow-up action in the second key frame and perform skeleton-point matching between them to obtain the result. Each second key frame thus corresponds to one matching result, representing the matching degree or similarity between the user action in that frame and the key action of the corresponding first key frame: a low matching degree/similarity indicates the user's action is not standard enough, and a high one indicates it is standard.
In some embodiments, the display device may extract the joint-point data of the second key frame from the local video data and upload only that data to the server, reducing the data transmission load.
In some embodiments, the display device may upload the key label identifier to the server instead of transmitting the first key frame itself, further reducing the data transmission load.
In some embodiments, key labels on the time axis are detected while the demonstration video is played; each time a key label is detected, a second key frame corresponding to the first key frame is acquired from the second video frames according to the time information of the key label, the second key frame showing the user's follow-up action.
In some embodiments, the second key frame is the image frame of the local video at the moment of the key label. In the embodiments of this application, since the time point represented by a key label is the time point of the corresponding first key frame, and the second key frame is extracted from the second video frame sequence according to that time information, one key label corresponds to one pair of first and second key frames.
In some embodiments, the second key frame is selected from the image frames of the local video at and adjacent to the moment of the key label; the image used for the evaluation display may be whichever of these frames matches the first key frame best.
In some embodiments, the time information of the first keyframe may be a time when the display device plays the frame, and according to the time when the display device plays the first keyframe, the second video frame corresponding to the time is extracted from the second video frame sequence, that is, the second keyframe corresponding to the first keyframe. The video frame corresponding to a certain time may be a video frame whose time stamp is the time, or a video frame whose time shown by the time stamp is closest to the time.
In some embodiments, the matching result is specifically a matching score, and the score calculated based on the matching result or the matching score may also be referred to as a total score.
In some embodiments, a target video contains M first key frames showing M key actions and has M key labels on its time axis. During the follow-up, M corresponding second key frames can be extracted from the local video stream according to the M first key frames; the M first key frames (M key actions) are matched one by one against the M second key frames (M user key actions) to obtain M matching scores, and the total score of the follow-up session is computed as their sum, weighted sum, average, or weighted average.
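The aggregation of the M matching scores into a total score can be sketched as follows; the weight values are illustrative assumptions.

```python
def total_score(matching_scores: list[float],
                weights: list[float] | None = None) -> float:
    """Aggregate M per-key-frame matching scores into a session total.

    With no weights this is a plain average; with weights it is a
    weighted average (the text also allows sum / weighted sum).
    """
    if weights is None:
        return sum(matching_scores) / len(matching_scores)
    assert len(weights) == len(matching_scores)
    return sum(w * s for w, s in zip(weights, matching_scores)) / sum(weights)

scores = [80.0, 90.0, 70.0, 100.0]          # M = 4 matching scores
print(total_score(scores))                   # 85.0
print(total_score(scores, [1, 2, 1, 2]))     # key actions 2 and 4 weighted more
```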
In some embodiments, the display device determines a frame-extraction range in the local video stream from the time information of a first key frame in the target video, extracts a preset number of local video frames within that range, identifies the user's follow-up action in each extracted frame, and compares them across frames to determine the key follow-up action; it then matches the key follow-up action against the corresponding key action to obtain a matching score, and computes the total score of the session after the follow-up ends.
In other embodiments, the display device sends the extracted local video frames to the server; the server identifies the user's follow-up action in each frame, compares it against the corresponding key action to obtain a matching score, computes the total score of the session after the follow-up ends, and returns it to the display device.
In some embodiments, after the server obtains the matching score for a key follow-up action, it sends the grade identifier corresponding to that score to the display device, which displays it in real time in a floating layer above the local-video window, for example GOOD, GREAT, or PERFECT, giving the user immediate feedback on the follow-up effect. If the display device determines the matching score itself, it directly displays the corresponding grade identifier in the floating layer.
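A sketch of mapping a matching score to such a grade identifier; the threshold values are assumptions, as the text names only the identifiers themselves.

```python
def grade_identifier(matching_score: float) -> str:
    """Map a matching score (0-100) to a real-time feedback grade.

    Threshold values are illustrative; only the labels GOOD, GREAT
    and PERFECT appear in the text.
    """
    if matching_score >= 90:
        return "PERFECT"
    if matching_score >= 75:
        return "GREAT"
    return "GOOD"

print(grade_identifier(95))  # PERFECT
print(grade_identifier(80))  # GREAT
print(grade_identifier(60))  # GOOD
```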
In some embodiments, for the total score of practicing a given demonstration video, if the new total score exceeds the recorded highest score, the difference between them is added to the original cumulative total, yielding a new cumulative total; this prevents users from inflating their total by repeatedly replaying familiar videos and improves the fairness of the application.
In some embodiments, if the total score exceeds the recorded highest score, a corresponding experience value increment is derived from it, and the new experience value is obtained by adding the increment to the original experience value; the new experience value is then presented on the display when target video playback ends. For example, with a total score of 85 and a historical highest score of 80, the experience value increment is 5; if the original experience value is 10005, the new experience value is 10010. Conversely, if the total score does not exceed the recorded highest score, the increment is 0, the experience value is not accumulated, and the original experience value is presented.
Further, if the total score is higher than the highest score noted, the original empirical value is replaced with the new empirical value; if the total score is not higher than the highest score noted, the original experience value is not updated.
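This update rule can be sketched as follows, reproducing the worked numbers above; the function and field names are hypothetical.

```python
def update_experience(total: float, best: float, exp: float):
    """Return (new_experience, new_best) per the rule above:
    increment = max(0, total - best); the recorded best only advances."""
    increment = max(0.0, total - best)
    return exp + increment, max(best, total)

exp, best = update_experience(total=85, best=80, exp=10005)
print(exp, best)          # 10010.0 85 -> increment of 5 applied
exp, best = update_experience(total=70, best=best, exp=exp)
print(exp, best)          # 10010.0 85 -> no change, score not a new best
```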
It should be noted that the terms "first" and "second" are used herein to distinguish similar objects and not necessarily to describe a particular order or sequence. In further embodiments, the first keyframe may also be referred to as a keyframe and the second keyframe may also be referred to as a local video frame or a follow-up screenshot.
In the above embodiments, during the user's follow-up of the target video the user's performance is scored from the local video stream collected by the image collector. If the score exceeds the recorded highest score, a new experience value is derived from it and displayed; otherwise the experience value is not updated and the original value is displayed, preventing users from maliciously farming experience by repeatedly practicing the same demonstration video.
In some embodiments, the server or the display device counts the experience value increment generated in a preset period, and updates the experience value of the user according to the experience value increment generated in the last period when the next period is entered. The preset period may be three days, seven days, etc.
In some embodiments, in response to the target application starting, the display device controller sends the server a request for the user experience value, containing at least the user information. The server obtains the time of the last experience value update and determines whether the interval since then has reached the preset period: if so, it obtains the experience value increment generated during the previous period, updates the user's experience value by adding that increment to the total, and returns the updated value to the display device; if not, it leaves the experience value unchanged and either returns the current value directly or notifies the display device to use the last received value from its cache.
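A sketch of this periodic settlement check on the server side; the record fields and the seven-day period are assumptions consistent with the examples above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

PERIOD = timedelta(days=7)  # preset statistical period (assumed)

@dataclass
class UserExperience:
    total: int = 0
    pending_increment: int = 0           # earned in the current period
    last_settled: datetime = field(default_factory=datetime.now)

def get_experience(rec: UserExperience, now: datetime) -> int:
    """Settle the pending increment if a full period has elapsed,
    then return the (possibly updated) total experience value."""
    if now - rec.last_settled >= PERIOD:
        rec.total += rec.pending_increment
        rec.pending_increment = 0
        rec.last_settled = now
    return rec.total

rec = UserExperience(total=10005, pending_increment=5,
                     last_settled=datetime(2020, 1, 1))
print(get_experience(rec, datetime(2020, 1, 9)))  # 10010: period elapsed
```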
Accordingly, the display device receives the user experience value returned by the server and draws a user data display area in the interface to show it; if the display device receives an updated experience value, it also updates the value in its cache.
In some embodiments, the experience value control includes, in the user data display area as in fig. 9, an identification element for indicating the experience value increment generated during the current period, such as the "current week +10" shown in fig. 9.
In some embodiments, the experience value control comprises a first sub-control presenting the total experience value at the end of the last statistical period, and a second sub-control presenting the experience value increment generated so far in the current statistical period; in fig. 9, the first sub-control is the one showing "work value 10012" and the second is the one showing "current week +10".
In some embodiments, the first sub-control and the second sub-control partially overlap so that a user can intuitively see both sub-controls simultaneously.
In some embodiments, the first sub-control and the second sub-control are different in color so that a user can intuitively see both sub-controls simultaneously.
In some embodiments, the second child control is located in the upper right corner of the first child control.
In some embodiments, the user selects the identification element in the user data display area to enter a detail page presenting the total experience value; after entering the detail page, the second sub-control remains at the upper-right corner of the first sub-control and shows the score newly added in the current statistical period.
In some embodiments, the follow-up result interface further includes a follow-up evaluation control for displaying a target state determined from the score, with different scores corresponding to different target states.
In some embodiments, the target state presented in the follow-up evaluation control is a star rating identification as shown in fig. 9.
In some embodiments, a correspondence between experience value ranges and star levels is established in advance, e.g., 0-20000 corresponds to 1 star, 20001-40000 to 2 stars, and so on. On this basis, while the user data display area of fig. 9 presents the user's experience value, the follow-up evaluation control may also present the star level corresponding to that value, e.g., the 1 star shown in fig. 9.
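A sketch of this range-to-star mapping, using the example boundaries above.

```python
STAR_BAND = 20000  # each star level spans 20000 experience points (example)

def star_level(experience: int) -> int:
    """0-20000 -> 1 star, 20001-40000 -> 2 stars, and so on."""
    if experience <= STAR_BAND:
        return 1
    return (experience - 1) // STAR_BAND + 1

print(star_level(10012))  # 1
print(star_level(20001))  # 2
print(star_level(40000))  # 2
print(star_level(40001))  # 3
```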
After the follow-up ends, the interface presenting the scoring information shown in fig. 19 is presented on the display. While it is displayed, the user may enter the interface presenting detailed performance information by operating the control for viewing detailed performance.
In some embodiments, the detailed performance information may also be referred to as follow-up result information, and the user interface presenting it as the follow-up result interface.
In some embodiments, in response to a user instruction to view detailed performance, the display device sends the server a request for the detailed performance interface and renders the detailed performance information on the display from the interface data the server returns. The detailed performance information includes at least one of login user information, star grade information, an evaluation phrase, and several follow-up screenshots, where a follow-up screenshot is a local video frame from the user's follow-up video captured by the camera, used to show the user's follow-up action.
FIG. 20 illustrates an interface for presenting detailed performance information, such as that shown in FIG. 20, with login user information (e.g., user avatar, user experience value), star grade performance information, valuation, and 4 follow-up shots presented in the form of items or controls.
In some embodiments, the follow-up screenshots are arranged as thumbnails in the interface of fig. 20; the user may move the selector with the control device to choose a screenshot and view its original image, and while the original image file is displayed, the user may view the originals of the other screenshots with the left and/or right direction keys.
In some embodiments, when the user moves the selector with the control device and selects the first follow-up screenshot for viewing, the original image file corresponding to the selected screenshot is obtained and presented on the display, as shown in fig. 21; in fig. 21, the user can view the originals of other follow-up screenshots with the left and/or right direction keys.
Fig. 22 illustrates another interface for presenting detailed performance information, unlike the interface illustrated in fig. 20, a sharing code picture (such as a two-dimensional code) including a detailed performance access address is also displayed in the interface illustrated in fig. 22, and a user can scan the sharing code picture using the mobile terminal to view the detailed performance information.
Fig. 23 illustrates the detailed performance page displayed on a mobile terminal device, presenting login user information, star grade, evaluation phrase, and at least one follow-up screenshot. By operating the sharing control on the page, the user can share the page link with other users (i.e., other terminal devices), and can save the displayed follow-up screenshots and/or their original image files locally on the terminal device.
To motivate and prompt the user, in some embodiments, if the total score of a follow-up session exceeds a preset value, the N local video frames with the highest matching scores (TopN) are displayed in the detailed performance page (or follow-up result interface) to showcase the session's highlight moments; if the total score does not exceed the preset value, the N local video frames with the lowest matching scores are displayed instead, to show the moments needing improvement.
In some embodiments, after receiving the request for the detailed performance interface, the server derives the user's follow-up score from the matching degree between the actions in corresponding key frames and local video frames. When the score is above a first value, the server sends a certain number (N ≥ 1) of key frames and/or corresponding local video frames with the higher matching degrees to the display device as the interface data; when the score is below a second value, it sends a certain number with the lower matching degrees instead. In some embodiments, the first value and the second value are the same value; in other embodiments they differ. In some embodiments, in response to a user instruction to follow a demonstration video, the controller obtains the demonstration video, which includes a key frame sequence of a predetermined number (M) of time-ordered key frames, each showing a key action the user needs to follow.
In some embodiments, the controller plays the target video in the follow-up interface and, during playback of the demonstration video, obtains from the local video stream the local video frames corresponding to the key frames, the local video frames showing the user's actions.
In some embodiments, the comparison between key frames and local video frames is performed on the display device. During the follow-up, the controller matches the key action shown by each key frame against the user action shown by the corresponding local video frame, obtaining a matching score per local video frame; a total score is derived from these matching scores, and the target video frames to display in the follow-up result are selected according to the total score: if the total score exceeds the preset value, the N local video frames with the highest matching scores (TopN) are selected, otherwise the N with the lowest matching scores are selected, where N is the preset number of target video frames (e.g., N = 4 in fig. 20). Finally, the follow-up result including the total score and the target video frames is presented, i.e., they are shown in the detailed performance page as in fig. 20. In some embodiments, the total score is obtained by summing, weighted summing, averaging, or weighted averaging the matching scores of the local video frames.
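The device-side selection of target video frames can be sketched as follows; the preset total-score threshold is an assumed value, and the total score is taken here as a plain average for simplicity.

```python
PRESET_TOTAL = 80.0  # assumed threshold on the total score
N_TARGET = 4         # number of target video frames, as in fig. 20

def select_target_frames(frame_scores: list[tuple[str, float]]) -> list[str]:
    """frame_scores: (local_frame_id, matching_score) pairs.

    Returns the TopN frame ids (highlight moments) when the total
    score beats the preset value, else the BottomN frame ids
    (moments to improve)."""
    total = sum(s for _, s in frame_scores) / len(frame_scores)
    ordered = sorted(frame_scores, key=lambda p: p[1], reverse=True)
    chosen = ordered[:N_TARGET] if total > PRESET_TOTAL else ordered[-N_TARGET:]
    return [fid for fid, _ in chosen]

scores = [("f1", 95), ("f2", 60), ("f3", 88), ("f4", 72), ("f5", 91)]
print(select_target_frames(scores))  # total 81.2 > 80 -> top 4 frames
```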
In some embodiments, the controller detects key labels on the time axis while controlling playback of the demonstration video; each time a key label is detected, it extracts from the local video stream the local video frame temporally corresponding to the key frame, according to the label's time information, and builds a local video frame sequence from the extracted frames, the sequence containing some or all of the local video frames arranged in descending order of matching score.
In some embodiments, the first N local video frames of the sequence serve as the first local video frames and the last N as the second local video frames; the first local video frames are displayed in the follow-up result interface when the total score exceeds the preset value, and the second local video frames when it does not. In some embodiments, the preset value may be the first value or the second value of the foregoing embodiments.
In some embodiments, the step of generating the local video frame sequence may comprise: each time a new local video frame is acquired, if the first and second local video frames overlap (i.e., the sequence holds fewer than 2N frames), the new frame is simply inserted into the sequence at the position given by its matching score; if they do not overlap (the sequence holds at least 2N frames), the new frame is inserted according to its matching score and the frame whose matching score sits in the middle of the sequence is deleted, yielding the new sequence.
In some embodiments, if the total score exceeds the preset value, the N first local video frames of the sequence are selected as the target video frames and displayed in the follow-up result interface; if it does not, the N second local video frames are selected and displayed instead.
It should be noted that the presence of overlapping video frames means that some frame of the sequence is simultaneously a first local video frame and a second local video frame, in which case the sequence holds fewer than 2N frames.
It should further be noted that the absence of overlapping video frames means that no frame of the sequence is simultaneously a first and a second local video frame, in which case the sequence holds at least 2N frames. In some embodiments, the display device (when it generates the sequence) or the server (when the server does) may use a bubble-sort style algorithm when generating the photo sequence used for the detailed performance interface data.
The algorithm process is as follows: after the key frame and the local video frame are compared, the matching degree of the key frame and the local video frame is determined.
When the number of data frames in the sequence is smaller than a preset value, the key frames and/or local video frames are added to the sequence in order of matching degree, where the preset value is the sum of the number of image frames to display when the score is above the preset score and the number to display when it is below. For example, if 4 frames (groups) are displayed when the score is higher and 4 frames (groups) when it is lower, the preset value for the sequence is 8 frames (groups).
When the number of data frames in the sequence is greater than or equal to the preset value, a new sequence is formed according to the matching degree of each frame (group): the 4 frames (groups) with the highest matching degree and the 4 with the lowest are retained, and the middle frames (groups) are deleted so the sequence stays at 8 frames (groups). This prevents excessive photos from accumulating in the cache and improves processing efficiency.
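A sketch of this bounded buffer; for simplicity it keeps order with a binary insertion (bisect) rather than a literal bubble sort, which is an implementation substitution.

```python
import bisect

N_KEEP = 4  # frames kept at each end; the sequence is capped at 2 * N_KEEP

def insert_frame(sequence: list[tuple[float, str]], score: float,
                 frame_id: str) -> None:
    """Insert (score, frame_id) keeping descending score order; once the
    sequence would exceed 2*N_KEEP frames, drop the middle frame so only
    the best N_KEEP and worst N_KEEP remain."""
    keys = [-s for s, _ in sequence]              # ascending keys for bisect
    pos = bisect.bisect_left(keys, -score)
    sequence.insert(pos, (score, frame_id))
    if len(sequence) > 2 * N_KEEP:
        del sequence[len(sequence) // 2]          # delete the middle frame

seq: list[tuple[float, str]] = []
for i, s in enumerate([70, 95, 40, 88, 66, 91, 52, 77, 83]):
    insert_frame(seq, s, f"f{i}")
print([fid for _, fid in seq])  # 8 frames: the 4 highest and 4 lowest scores
```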
In some embodiments, a "frame" means the sequence contains only local video frames, while a "group" means a local video frame and its corresponding key frame are stored together as one parameter set in the sequence.
In some embodiments, the key frame and the local video frame are compared in a server, and the comparison process can refer to the description of other embodiments in the present application.
The server derives the total score from the matching scores of the local video frames and selects the target video frames to display as the follow-up result according to the total score: if the total score exceeds the preset value, the N local video frames with the highest matching scores (TopN) are selected and sent to the display device; otherwise the N with the lowest matching scores are sent, where N is the preset number of target video frames (e.g., N = 4 in fig. 20). Finally, the display device presents the follow-up result, including the total score and the target video frames, from the received data, i.e., in the detailed performance page as in fig. 20.
In the case where the local video frame sequence contains all of the extracted local video frames, each local video frame, as it is extracted, is inserted into the sequence according to its matching score, so that the number of frames in the sequence grows from 0 to M (the number of key frames contained in the demonstration video), with the frames arranged in descending order of matching score. When the N frames with the highest matching scores need to be displayed, the frames ranked 1 to N are extracted from the sequence; when the N frames with the lowest matching scores need to be displayed, the frames ranked (M-N+1) to M are extracted.
In the case where the local video frame sequence contains only part of the extracted local video frames, an initial sequence is generated from the 1st to 2N-th acquired local video frames, which correspond to the 1st to 2N-th key frames respectively, the 2N local video frames being arranged in descending order of matching score. From the (2N+1)-th frame onward, every time a local video frame (the (2N+i)-th frame) is acquired, it is inserted into the initial sequence according to its matching score and the frame at position N+1 is deleted, until 2N+i equals the preset number, that is, until the last frame has been inserted, yielding the final local video frame sequence, where 2N is smaller than M and i ∈ [1, M-2N].
It should be noted that, in some embodiments, if the user exits the follow-up early, the number of local video frames actually extracted may be smaller than the number N of target video frames to be displayed. In that case the controller need not select target video frames according to the total score, and simply displays all the actually extracted local video frames as target video frames.
In some embodiments, after receiving a confirm-exit operation input by the user, it is determined whether the number of video frames in the current sequence is greater than the number of video frames to be displayed. If so, that number of video frames is selected for display from the front or rear section of the sequence according to the score; if not, all the video frames are displayed.
In some embodiments, after receiving the confirm-exit operation input by the user, and before judging whether the number of video frames in the current sequence is greater than the number to be displayed, it is first judged whether the duration and/or the number of actions of the follow-up satisfies a preset requirement. If so, the judgment on the number of video frames is performed; if not, it is skipped.
In some embodiments, the display device uploads the local video frames, sorted according to the total score, to the server so that the server adds them to the user's exercise record information.
In some embodiments, the display device uploads node data of the local video frames and the identifiers of the corresponding local video frames to the server, and the server and the display device also exchange matching-degree information through these parameters, so that pictures of the follow-up exercise can be displayed in the subsequent usage history. After receiving the detailed score page data, the display device draws the graph score according to the score, displays comments according to the comment data, and retrieves the cached local video frames by their identifiers to display the follow-up pictures. Meanwhile, it uploads the local video frames, together with their identifiers and the detailed score page identifier, to the server; the server combines the received local video frames and the detailed score page data into one piece of follow-up data according to the detailed score page identifier, to be sent to the display device when the follow-up history is queried later.
In some embodiments, in response to the end of the follow-up process, it is detected whether a user input is received. When no user input is received within a preset time period, an automatic play prompt interface is presented and a countdown is started. The interface displays countdown prompt information, automatic play video information and a plurality of controls, where the countdown prompt information includes at least the countdown duration, the automatic play video information includes the cover and/or name of the video to be played after the countdown ends, and the controls may include, for example, a control for replaying, a control for exiting the current interface, and a control for playing the next video in a preset media asset list. While the countdown is running, the device continues to detect user input, such as an operation on a control in the interface via the control device. If no user input is received before the countdown completes, the video displayed in the interface is played; if user input is received before the countdown completes, the countdown stops and the control logic corresponding to the user input is executed.
In some embodiments, the second value is less than or equal to the first value. When the second value is smaller than the first value and the score is higher than the second value but lower than the first value, a preset number of key frames and/or corresponding local video frames is allocated within each matching-degree interval according to the matching degree, and these are sent to the display device as follow-up screenshots.
FIG. 24 illustrates a user interface that is one implementation of the automatic play prompt interface described above. As shown in FIG. 24, the interface displays countdown prompt information, i.e., "Will automatically play for you in 5 seconds", automatic play video information, i.e., the video name "loved kindergarten" and the cover picture of the video, as well as a "replay" control, an "exit" control, and a "play next" control.
In some embodiments, by operating the control device the user may have the display present the user's exercise record, which comprises a number of exercise entries, each comprising demonstration video information, scoring information, exercise time information and/or at least one follow-up screenshot. The demonstration video information comprises at least one of the cover, name, category, type and duration of the demonstration video; the scoring information comprises at least one of a star-grade score, a numerical score and an experience value increment; the exercise time information comprises an exercise start time and/or an exercise end time; and the follow-up screenshot may be the follow-up screenshot displayed in the detailed score information interface.
In some embodiments, when the display shows the application home page of FIG. 9, the user may operate the "My work" control in the page via the control device to input an instruction indicating that the exercise record is to be displayed. When the controller receives the instruction, it sends a request for exercise record information to the server, the request containing at least a user identification (ID). In response, the server searches for the corresponding exercise record information according to the user identification and returns it to the display device; the exercise record information comprises a plurality of exercise items, each comprising demonstration video information, scoring information, exercise time information and/or at least one follow-up screenshot. The display device generates a page containing the exercise records from the information returned by the server and presents it on the display.
The follow-up screenshot is displayed when the display device acquires an image showing the user's action.
In some embodiments, the server, in response to a request sent by the display device, searches for the corresponding exercise record information according to the user identifier in the request, determines whether each exercise item in the record contains a follow-up screenshot, and adds a special identifier to any item that contains none, indicating that no camera was detected during the follow-up process corresponding to that item. On the display device side, if a returned exercise item contains follow-up screenshots, the corresponding screenshots are displayed in the exercise record; if it contains no screenshots but carries the special identifier, the exercise record indicates that no camera was detected.
When the display device receives the data issued by the server, it draws an exercise record list in which each exercise record comprises a first control for showing the demonstration video information, a second control for showing the scoring information and exercise time information, and a third control for showing the user's follow-up screenshots. While drawing the list, if the data of a given exercise record does not contain the special identifier, the demonstration video information is loaded onto its first control, the scoring information and exercise time information onto its second control, and the follow-up screenshots onto its third control; if the data does contain the special identifier, the first and second controls are loaded as before, and the third control is loaded with a prompt indicating that no camera was detected for the current exercise record.
In some embodiments, the follow-up screenshot displayed in an exercise item is the follow-up screenshot displayed in the corresponding detailed performance information page; for the specific implementation, refer to the embodiments above, which are not repeated here.
FIG. 25 illustrates an interface for displaying the user exercise record, which may be the interface the user enters after operating the "My work" control of FIG. 9. As shown in FIG. 25, 3 exercise items are displayed in the interface, and the display area of each item shows demonstration video information, scoring information, exercise time information, and follow-up screenshots or an identification indicating that no camera was detected. The demonstration video information comprises the cover picture, type (e.g., loving lessons) and name (e.g., standing slightly) of the demonstration video; the scoring information comprises an experience value increment (e.g., +4) and a star-grade identification; and the exercise time information is, for example, 2010-10-10 10:10.
Through the above examples, the user can review past follow-up situations by looking at the exercise records, such as which demonstration videos were followed at what time and how the follow-up results turned out. This makes it convenient for the user to decide what to practice next based on past follow-up situations, or to discover the action types the user is good at; for example, the user may follow a poorly scored demonstration video again, or focus on videos of the action types they excel at to refine them further.
In a specific implementation, the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, may include some or all of the steps of each embodiment of the method provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented by means of software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium such as a ROM/RAM, magnetic disk or optical disk, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in all or part of the embodiments of the present invention.
The various embodiments in this specification are described with reference to one another for the same or similar parts. In particular, since the method embodiments are substantially similar to the display device embodiments, their description is relatively brief; for relevant details, refer to the description of the display device embodiments.
The embodiments of the present invention described above do not limit the scope of the present invention.

Claims (10)

1. A display device, characterized by comprising:
the image collector is used for collecting the local image to obtain a local video stream;
a display for displaying a demonstration video, a local video stream, and/or a follow-up result interface;
a controller for:
responding to an input instruction indicating to follow a demonstration video, acquiring the demonstration video, and acquiring a local video stream, wherein the demonstration video, when played, shows demonstration actions required to be followed by a user;
performing action matching on the demonstration video and the local video stream to generate a score corresponding to the current follow-up process according to the matching degree of the local video and the demonstration video;
after the demonstration video playing is finished, generating a follow-up result interface according to the score, wherein an experience value control for displaying an experience value is arranged in the follow-up result interface; when the score is higher than the user's historical highest score for following the demonstration video, the experience value updated according to the score is displayed in the experience value control, and when the score is not higher than the historical highest score, the experience value from before the current follow-up process is displayed in the experience value control; the experience value control further comprises a first sub-control and a second sub-control, the first sub-control displaying the total experience value at the end of the preset statistical period preceding the current preset statistical period, and the second sub-control displaying the total experience value increment of the current preset statistical period; and when the current preset statistical period ends, the total experience value in the first sub-control is updated according to the total experience value increment.
2. The display device of claim 1, wherein the demonstration video comprises a first video frame for displaying the demonstration action, the local video stream comprises a second video frame for displaying the user action, and performing action matching on the demonstration video and the local video stream to generate the score corresponding to the current follow-up process according to the matching degree of the local video and the demonstration video comprises:
matching the demonstration action displayed by the first video frame with the user action displayed by the second video frame to obtain a matching result;
and determining a score corresponding to the follow-up process according to the matching result of the first video frame and the second video frame.
3. The display device according to claim 2, wherein a plurality of key labels are disposed on a time axis of the demonstration video, the key labels correspond to a first key frame in the first video frame, the first key frame is used for displaying key actions in the demonstration actions, and performing action matching on the first video frame and the second video frame to obtain a matching result comprises:
detecting the key label on a time axis when the demonstration video is played;
when a key label is detected, extracting, from the second video frame, a second key frame corresponding in time to the first key frame, according to the time information represented by the key label;
and performing action matching on the corresponding first key frame and second key frame to obtain a matching result.
4. The display device of claim 3, wherein the matching results of the first video frame and the second video frame comprise a plurality of matching results of a plurality of corresponding first key frames and second key frames, and determining the score corresponding to the current follow-up process according to the matching results of the first video frame and the second video frame comprises:
determining the score corresponding to the current follow-up process according to the plurality of matching results.
5. The display device of claim 1, wherein the follow-up result interface is further provided with a follow-up evaluation control, and the follow-up evaluation control is used for displaying a target state determined according to the score, the target states corresponding to different scores being different.
6. The display device of claim 1, wherein the generating a follow-up result interface according to the score comprises:
when the score is higher than the historical highest score, calculating an experience value increment generated in the current follow-up process according to a difference value between the score and the historical highest score;
and accumulating the experience value increment into the experience value from before the current follow-up process to obtain an updated experience value.
7. A display device, characterized by comprising:
the image collector is used for collecting the local image to obtain a local video stream;
a display;
a controller for:
in response to an input instruction to play a demonstration video, obtaining the demonstration video, and obtaining a local video stream, wherein the demonstration video comprises a first video frame for showing a demonstration action required to be followed by a user, and the local video stream comprises a second video frame for showing the action of the user;
matching the corresponding first video frame and second video frame, and generating a score corresponding to the current follow-up process according to the matching result;
and in response to the end of the playing of the demonstration video, generating a follow-up result interface according to the score, wherein an experience value control for displaying an experience value is arranged in the follow-up result interface; when the score is higher than the user's historical highest score for following the demonstration video, the experience value updated according to the score is displayed in the experience value control, and when the score is not higher than the historical highest score, the experience value from before the current follow-up process is displayed in the experience value control; the experience value control further comprises a first sub-control and a second sub-control, the first sub-control displaying the total experience value at the end of the preset statistical period preceding the current preset statistical period, and the second sub-control displaying the total experience value increment of the current preset statistical period; and when the current preset statistical period ends, the total experience value in the first sub-control is updated according to the total experience value increment.
8. The display device of claim 7, wherein a plurality of key labels are disposed on a time axis of the demonstration video, the key labels corresponding to a first key frame of the first video frames, the first key frame being used to present key actions of the demonstration actions, and matching the corresponding first video frames and second video frames comprises:
detecting the key label on a time axis when the demonstration video is played;
when a key label is detected, extracting, from the second video frame, a second key frame corresponding in time to the first key frame, according to the time information represented by the key label;
and performing action matching on the corresponding first key frame and the second key frame to obtain a matching result.
9. A method for updating an experience value, the method comprising:
responding to an input instruction indicating to follow a demonstration video, acquiring the demonstration video, and acquiring a local video stream, wherein the demonstration video, when played, shows demonstration actions required to be followed by a user;
performing action matching on the demonstration video and the local video stream to generate a score corresponding to the current follow-up process according to the matching degree of the local video and the demonstration video;
after the demonstration video playing is finished, generating a follow-up result interface according to the score, wherein an experience value control for displaying an experience value is arranged in the follow-up result interface; when the score is higher than the user's historical highest score for following the demonstration video, the experience value control displays the experience value updated according to the score, and when the score is not higher than the historical highest score, the experience value control displays the experience value from before the current follow-up process; the experience value control further comprises a first sub-control and a second sub-control, the first sub-control displaying the total experience value at the end of the preset statistical period preceding the current preset statistical period, and the second sub-control displaying the total experience value increment of the current preset statistical period; and when the current preset statistical period ends, the total experience value in the first sub-control is updated according to the total experience value increment.
10. A method for updating an experience value, the method comprising:
acquiring a demonstration video in response to an input instruction indicating to follow the demonstration video, and acquiring a local video stream, wherein the demonstration video comprises a first video frame for showing a demonstration action required to be followed by a user, and the local video stream comprises a second video frame for showing the action of the user;
matching the corresponding first video frame and second video frame, and generating a score corresponding to the current follow-up process according to the matching result;
and in response to the end of the playing of the demonstration video, generating a follow-up result interface according to the score, wherein an experience value control for displaying an experience value is arranged in the follow-up result interface; when the score is higher than the user's historical highest score for following the demonstration video, the experience value updated according to the score is displayed in the experience value control, and when the score is not higher than the historical highest score, the experience value from before the current follow-up process is displayed in the experience value control; the experience value control further comprises a first sub-control and a second sub-control, the first sub-control displaying the total experience value at the end of the preset statistical period preceding the current preset statistical period, and the second sub-control displaying the total experience value increment of the current preset statistical period; and when the current preset statistical period ends, the total experience value in the first sub-control is updated according to the total experience value increment.
CN202010444296.1A 2019-08-18 2020-05-22 Display device and experience value updating method Active CN113591523B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202410398880.6A CN118212693A (en) 2019-08-18 2020-08-18 Display device and interface display method
PCT/CN2020/109859 WO2021032092A1 (en) 2019-08-18 2020-08-18 Display device
CN202080024736.6A CN113678137B (en) 2019-08-18 2020-08-18 Display apparatus
US17/455,575 US11924513B2 (en) 2019-08-18 2021-11-18 Display apparatus and method for display user interface

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020103642034 2020-04-30
CN202010364203 2020-04-30

Publications (2)

Publication Number Publication Date
CN113591523A CN113591523A (en) 2021-11-02
CN113591523B true CN113591523B (en) 2023-11-24

Family

ID=72767099

Family Applications (8)

Application Number Title Priority Date Filing Date
CN202010412358.0A Active CN113596590B (en) 2019-08-18 2020-05-15 Display device and play control method
CN202010429705.0A Active CN113596551B (en) 2019-08-18 2020-05-20 Display device and play speed adjusting method
CN202010444296.1A Active CN113591523B (en) 2019-08-18 2020-05-22 Display device and experience value updating method
CN202010444212.4A Active CN113596537B (en) 2019-08-18 2020-05-22 Display device and playing speed method
CN202010440465.4A Active CN113596536B (en) 2019-08-18 2020-05-22 Display device and information display method
CN202010459886.1A Pending CN113591524A (en) 2019-08-18 2020-05-27 Display device and interface display method
CN202010479491.8A Active CN113596552B (en) 2019-08-18 2020-05-29 Display device and information display method
CN202010673469.7A Active CN111787375B (en) 2019-08-18 2020-07-13 Display device and information display method

Country Status (1)

Country Link
CN (8) CN113596590B (en)

Also Published As

Publication number Publication date
CN113596536B (en) 2022-09-09
CN113596552B (en) 2022-08-19
CN113596590B (en) 2022-08-26
CN113596551A (en) 2021-11-02
CN113596590A (en) 2021-11-02
CN111787375A (en) 2020-10-16
CN113596536A (en) 2021-11-02
CN113591523A (en) 2021-11-02
CN113596551B (en) 2022-08-12
CN113596537B (en) 2022-09-02
CN113596537A (en) 2021-11-02
CN113596552A (en) 2021-11-02
CN113591524A (en) 2021-11-02
CN111787375B (en) 2023-02-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant