CN116980683A - Slide show method, device and storage medium based on video - Google Patents

Slide show method, device and storage medium based on video

Info

Publication number
CN116980683A
CN116980683A
Authority
CN
China
Prior art keywords
data
video
gesture
hand
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311235239.2A
Other languages
Chinese (zh)
Other versions
CN116980683B (en)
Inventor
李六七
肖勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Foresea Allchips Information & Technology Co ltd
Original Assignee
Shenzhen Foresea Allchips Information & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Foresea Allchips Information & Technology Co ltd filed Critical Shenzhen Foresea Allchips Information & Technology Co ltd
Priority to CN202311235239.2A priority Critical patent/CN116980683B/en
Publication of CN116980683A publication Critical patent/CN116980683A/en
Application granted granted Critical
Publication of CN116980683B publication Critical patent/CN116980683B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/4355Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4782Web browsing, e.g. WebTV

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to the field of slide control and discloses a video-based slide show method, device and storage medium. The method comprises the following steps: receiving video data; performing hand recognition on the video data according to a preset part recognition algorithm to obtain a recognized position; cropping the image data at the recognized position in the video data to obtain a hand image set; recognizing the hand image set according to a preset gesture recognition algorithm to obtain gesture instruction data; and transmitting the gesture instruction data to the H5 page that plays the slides by using a WebSocket service, so that the H5 page adjusts the slide playing data. The embodiment of the invention solves the problem that current techniques for playing a PPT on an H5 page make the playback operation unduly cumbersome.

Description

Slide show method, device and storage medium based on video
Technical Field
The present invention relates to the field of slide control, and in particular to a video-based slide show method, device, and storage medium.
Background
At present, PPT presentations are mainly played with PowerPoint. This playing mode requires specific software to be installed on terminal devices such as mobile phones and computers before PPT files received over the network can be played.
However, this playing mode is cumbersome and inconvenient for users. To overcome this drawback of traditional PPT playing, lightweight HTML5 slides are now used: a PPT can be created without specific software and shared with other users through a URL, which is far more convenient.
However, sharing a PPT through a URL still requires the user to click on the screen or control playback with a peripheral device, which is too cumbersome for a PPT that accompanies a lecture. A new technique is therefore needed to solve the technical problem that playing a PPT on an H5 page currently involves overly cumbersome playback operations.
Disclosure of Invention
The main object of the invention is to solve the technical problem that playing a PPT on an H5 page currently involves overly cumbersome playback operations.
The first aspect of the present invention provides a slide show method based on video, comprising the steps of:
receiving video data;
performing hand recognition on the video data according to a preset part recognition algorithm to obtain a recognized position;
cropping the image data at the recognized position in the video data to obtain a hand image set;
recognizing the hand image set according to a preset gesture recognition algorithm to obtain gesture instruction data;
and transmitting the gesture instruction data to the H5 page that plays the slides by using a WebSocket service, so that the H5 page adjusts the slide playing data.
Optionally, in a first implementation manner of the first aspect of the present invention, the performing hand recognition on the video data according to a preset part recognition algorithm to obtain a recognized position includes:
splitting the video data into frame images to obtain a video frame image set;
and performing feature recognition on each image of the video frame image set by using a preset YOLO algorithm to obtain a recognition area set corresponding to the video frame image set.
Optionally, in a second implementation manner of the first aspect of the present invention, the cropping the image data at the recognized position in the video data to obtain a hand image set includes:
extracting a target recognition area from the recognition area set;
extracting, based on the target recognition area, the target video frame image corresponding to the recognition area from the video frame image set;
and cropping the target video frame image based on the target recognition area to generate a hand image.
Optionally, in a third implementation manner of the first aspect of the present invention, the recognizing the hand image set according to a preset gesture recognition algorithm to obtain gesture instruction data includes:
sequentially extracting hand images from the hand image set;
recognizing the hand image according to a preset gesture recognition algorithm to obtain gesture recognition data;
writing the gesture recognition data into a preset gesture instruction frame set;
judging whether the hand image set is an empty set;
if the hand image set is not an empty set, continuing to sequentially extract hand images from the hand image set;
and if the hand image set is an empty set, performing duration analysis on all data of the gesture instruction frame set according to a preset change duration analysis algorithm to generate the gesture instruction data.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the performing duration analysis on all data of the gesture instruction frame set according to a preset change duration analysis algorithm to generate the gesture instruction data includes:
analyzing the gesture change duration in the gesture instruction frame set;
judging whether the gesture change duration exceeds a preset duration threshold;
and if the duration threshold is exceeded, extracting the gesture change data in the gesture instruction frame set to generate the gesture instruction data.
Optionally, in a fifth implementation manner of the first aspect of the present invention, after the transmitting the gesture instruction data to the H5 page that plays the slides by using a WebSocket service, the method further includes:
monitoring web page loading data of the H5 page based on a preset browser;
judging whether the web page loading data has been modified;
if no modification exists, sending the gesture instruction data to a preset play cloud terminal;
and receiving a URL address fed back by the play cloud terminal, and loading the web page data of the URL address in the preset browser.
Optionally, in a sixth implementation manner of the first aspect of the present invention, the receiving video data includes:
receiving, based on the TCP/IP protocol, video data transmitted over the Internet.
Optionally, in a seventh implementation manner of the first aspect of the present invention, the receiving video data includes:
capturing external scenes with a camera to generate video data.
A second aspect of the present invention provides a video-based slide show device comprising: a memory and at least one processor, the memory having instructions stored therein, the memory and the at least one processor being interconnected by a line; the at least one processor invokes the instructions in the memory to cause the video-based slide show device to perform the video-based slide show method described above.
A third aspect of the present invention provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the video-based slide show method described above.
In the embodiment of the invention, the operator's video is analyzed using computer vision and image processing algorithms to obtain the operator's PPT control instruction, and the H5 page is then adjusted through a browser component, so that the operator can control PPT playback with gestures. This solves the technical problem that playing a PPT on an H5 page currently involves overly cumbersome playback operations.
Drawings
FIG. 1 is a diagram showing a first embodiment of a video-based slide show method according to an embodiment of the present invention;
FIG. 2 is a diagram of a second embodiment of a video-based slide show method according to an embodiment of the present invention;
FIG. 3 is a diagram of a third embodiment of a video-based slide show method according to an embodiment of the present invention;
FIG. 4 is a diagram of a fourth embodiment of a video-based slide show method according to an embodiment of the present invention;
FIG. 5 is a diagram of a fifth embodiment of a video-based slide show method according to an embodiment of the present invention;
FIG. 6 is a diagram of a sixth embodiment of a video-based slide show method according to an embodiment of the present invention;
fig. 7 is a schematic diagram of an embodiment of a video-based slide show device according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a slide show method, a slide show device and a storage medium based on video.
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
In describing embodiments of the present disclosure, the term "comprising" and the like should be understood as open-ended, i.e., "including, but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", and the like may refer to different or the same objects. Other explicit and implicit definitions may also be given below.
For ease of understanding, a specific flow of an embodiment of the present invention is described below with reference to fig. 1, where an embodiment of a video-based slide show method in an embodiment of the present invention includes:
101. receiving video data;
in this embodiment, the received video data may be transmitted to the device over a wired USB interface.
Further, step 101 may be performed:
1011. based on TCP/IP protocol, video data transmitted by Internet is received.
In step 1011, as a remote assistance control scheme, a remote user can use video to remotely control how the PPT currently displayed on the H5 page is played.
Further, the step 101 may further perform the steps of:
1012. and shooting external data by using a camera to generate video data.
In step 1012, video data is collected directly by an external camera, producing real-time footage of the external scene, so that an on-site user can control PPT playback with gestures.
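For illustration only (not part of the disclosed method), the following sketch shows how steps 1011 and 1012 might be realized with OpenCV: an integer source opens a local camera, while a network stream URL covers video received over the Internet. The example stream URL is an assumption.

```python
import cv2

def open_video_source(source):
    """Open a video source: an integer selects a local camera (step 1012);
    a network stream URL covers video received over TCP/IP (step 1011)."""
    cap = cv2.VideoCapture(source)
    if not cap.isOpened():
        raise RuntimeError(f"Cannot open video source: {source}")
    return cap

def read_frames(cap):
    """Yield frames until the source is exhausted or closed."""
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        yield frame
    cap.release()

# Example: local webcam (0) or a hypothetical RTSP stream from a remote presenter.
# cap = open_video_source(0)
# cap = open_video_source("rtsp://example.com/presenter-stream")
```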
102. Performing hand recognition on the video data according to a preset part recognition algorithm to obtain a recognized position;
in this embodiment, the part recognition algorithm processes the motion of the whole person in the video data, but only the hand portions of the video serve as control commands for PPT playback.
Further, referring to fig. 2, fig. 2 is a schematic diagram of a second embodiment of the present invention, the following steps may be performed at 102:
1021. splitting the video data into frame images to obtain a video frame image set;
1022. performing feature recognition on each image of the video frame image set by using a preset YOLO algorithm to obtain a recognition area set corresponding to the video frame image set.
In steps 1021-1022, the video data is split frame by frame to generate a video frame image set ordered according to the video. Each image of the video frame image set is then subjected to target feature recognition using an existing YOLO neural network; the YOLOv5 variant of the YOLO network can be adopted for faster recognition. During feature recognition, the YOLO model marks the hand region in each image to generate the recognition area for that image, and the recognition areas of all images are combined in video frame order into the recognition area set.
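As an illustrative sketch of steps 1021-1022 (not an implementation prescribed by the patent), the snippet below splits a video into ordered frames and marks one hand region per frame with a YOLOv5 model loaded via torch.hub; the custom weight file, confidence threshold and single-hand assumption are illustrative choices.

```python
import cv2
import torch

# Assumed: custom YOLOv5 weights trained to detect hands; the file name
# "hand_yolov5s.pt" is an illustrative assumption, not part of the patent.
model = torch.hub.load("ultralytics/yolov5", "custom", path="hand_yolov5s.pt")

def split_and_detect(video_path, conf_threshold=0.5):
    """Split the video into ordered frames (step 1021) and mark a hand region
    per frame with YOLOv5 (step 1022). Returns (frames, recognition_areas),
    where each recognition area is a tuple (frame_id, x1, y1, x2, y2)."""
    cap = cv2.VideoCapture(video_path)
    frames, recognition_areas = [], []
    frame_id = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # YOLOv5 hub models expect RGB
        results = model(rgb)
        for *xyxy, conf, cls in results.xyxy[0].tolist():
            if conf >= conf_threshold:
                x1, y1, x2, y2 = map(int, xyxy)
                recognition_areas.append((frame_id, x1, y1, x2, y2))
                break                                   # keep one hand box per frame
        frame_id += 1
    cap.release()
    return frames, recognition_areas
```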
103. Cropping the image data at the recognized position in the video data to obtain a hand image set;
in this embodiment, the hand image data identified in the video data is cropped, either at the original frame rate or at a preset sampling frequency, and the resulting hand images are sorted by video playing order to generate the hand image set.
Further, referring to fig. 3, fig. 3 is a schematic diagram of a third embodiment of the present invention, the following steps may be performed at 103:
1031. extracting a target recognition area from the recognition area set;
1032. extracting, based on the target recognition area, the target video frame image corresponding to the recognition area from the video frame image set;
1033. cropping the target video frame image based on the target recognition area to generate a hand image.
In steps 1031-1033, one recognition area is extracted from the recognition area set and the corresponding target video frame image is located in the video frame image set: each entry of the recognition area set records a video frame ID, that frame ID is looked up in the video frame image set, and the target video frame image identified by the frame ID is finally cropped to the coordinate range of the target recognition area to obtain a hand image.
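A minimal sketch of steps 1031-1033, reusing the (frame_id, x1, y1, x2, y2) recognition areas assumed in the previous example, could crop the hand images as follows.

```python
def crop_hand_images(frames, recognition_areas):
    """For each recognition area, look up the target frame by its frame ID and
    crop it to the area's coordinate range, producing the ordered hand image set."""
    hand_images = []
    for frame_id, x1, y1, x2, y2 in recognition_areas:
        target_frame = frames[frame_id]            # the frame ID acts as the lookup key
        hand_images.append(target_frame[y1:y2, x1:x2].copy())
    return hand_images
```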
104. Recognizing the hand image set according to a preset gesture recognition algorithm to obtain gesture instruction data;
in this embodiment, the gesture recognition algorithm recognizes and classifies each picture of the hand image set, and the classification results are converted into gesture instruction data.
Further, referring to fig. 4, fig. 4 is a schematic diagram of a fourth embodiment of the present invention, in step 104, the following steps may be performed:
1041. sequentially extracting hand images from the hand image set;
1042. recognizing the hand image according to a preset gesture recognition algorithm to obtain gesture recognition data;
1043. writing the gesture recognition data into a preset gesture instruction frame set;
1044. judging whether the hand image set is an empty set;
1045. if the hand image set is not an empty set, continuing to sequentially extract hand images from the hand image set;
1046. if the hand image set is an empty set, performing duration analysis on all data of the gesture instruction frame set according to a preset change duration analysis algorithm to generate gesture instruction data.
In steps 1041-1046, hand images are extracted in the order of the hand image set, each hand image is analyzed and recognized by the gesture recognition algorithm, and gesture recognition data is generated by classifying the hand image according to gesture type. The recognized gesture data is written into the gesture instruction frame set, and the method then checks whether the hand image set has been exhausted (i.e., is an empty set): if it is empty, no further picture is extracted; if it is not, hand images continue to be extracted sequentially from the hand image set.
When the hand image set is empty, the change duration analysis algorithm retains the gestures in the gesture instruction frame set that persist long enough, and the retained gesture sequence is then analyzed to produce gesture instruction data.
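The loop of steps 1041-1046 can be sketched as draining the hand image set into the gesture instruction frame set; `classify_gesture` below is a placeholder for whatever gesture classifier the implementation uses (for example a small CNN) and is an assumption, not part of the patent.

```python
from collections import deque

def classify_gesture(hand_image):
    """Placeholder for the preset gesture recognition algorithm,
    e.g. a CNN classifier returning a label such as 'swipe_left' or 'palm'."""
    raise NotImplementedError

def build_gesture_instruction_frames(hand_images):
    """Sequentially extract hand images until the set is empty,
    writing one gesture label per frame into the gesture instruction frame set."""
    pending = deque(hand_images)               # the hand image set, in video order
    instruction_frames = []                    # the gesture instruction frame set
    while pending:                             # steps 1044/1045: loop until empty
        hand_image = pending.popleft()         # step 1041
        label = classify_gesture(hand_image)   # step 1042
        instruction_frames.append(label)       # step 1043
    # Step 1046: the hand image set is now empty, so the frame set is handed
    # to the change duration analysis (see the next sketch).
    return instruction_frames
```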
Further, referring to fig. 5, fig. 5 is a schematic diagram of a fifth embodiment of the present invention, in step 1046, the following steps may be performed:
10461. analyzing the gesture change duration in the gesture instruction frame set;
10462. judging whether the gesture change duration exceeds a preset duration threshold;
10463. and if the duration threshold is exceeded, extracting the gesture change data in the gesture instruction frame set to generate gesture instruction data.
In steps 10461-10463, the duration of each gesture change in the gesture instruction frame set is analyzed. For example, if a swipe gesture is held for 3 seconds, the swipe gesture for playing the next PPT slide is considered valid, and the gesture instruction data corresponding to playing the next slide is generated.
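An illustrative sketch of the change duration analysis in steps 10461-10463: consecutive frames carrying the same gesture label are grouped, and runs that last longer than the threshold are translated into instructions. The frame rate, threshold value and label-to-instruction mapping are assumptions made for illustration.

```python
from itertools import groupby

# Assumed mapping from recognized gestures to slide instructions.
GESTURE_TO_INSTRUCTION = {"swipe_left": "next_slide", "swipe_right": "prev_slide"}

def analyse_change_duration(instruction_frames, fps=30.0, duration_threshold=3.0):
    """Keep only gestures held longer than the threshold (e.g. 3 s)
    and translate them into gesture instruction data."""
    instructions = []
    for label, run in groupby(instruction_frames):
        duration = len(list(run)) / fps        # how long the gesture persisted
        if duration >= duration_threshold and label in GESTURE_TO_INSTRUCTION:
            instructions.append(GESTURE_TO_INSTRUCTION[label])
    return instructions
```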
105. Transmitting the gesture instruction data to the H5 page that plays the slides by using a WebSocket service, so that the H5 page adjusts the slide playing data.
In this embodiment, because the PPT played on the H5 page is delivered from a cloud database, the WebSocket service sends the gesture instruction data for playing the next slide to the H5 page, and the cloud database then modifies the PPT being played on the H5 page, so that the PPT content shown on the H5 page is changed by gesture.
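One possible realization of step 105 (a sketch under assumptions, not the patent's prescribed implementation) is a small WebSocket server that the H5 slide page connects to and that pushes each gesture instruction as a JSON message. It uses the Python `websockets` package; the port and message format are illustrative, and the browser-side handler on the H5 page is not shown.

```python
import asyncio
import json

import websockets  # third-party package: pip install websockets

connected_pages = set()   # H5 slide pages currently connected

async def handler(ws):
    """Register each connected H5 page and keep the connection open."""
    connected_pages.add(ws)
    try:
        await ws.wait_closed()
    finally:
        connected_pages.discard(ws)

async def push_instruction(instruction):
    """Send one gesture instruction (e.g. 'next_slide') to every connected page."""
    message = json.dumps({"type": "gesture_instruction", "action": instruction})
    for ws in list(connected_pages):
        await ws.send(message)

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        await asyncio.Future()   # run until cancelled

# asyncio.run(main())
```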
Further, referring to fig. 6, fig. 6 is a schematic diagram of a sixth embodiment of the present invention, the following steps may be performed after step 105:
1051. monitoring web page loading data of the H5 page based on a preset browser;
1052. judging whether the web page loading data has been modified;
1053. if no modification exists, sending the gesture instruction data to a preset play cloud terminal;
1054. and receiving the URL address fed back by the play cloud terminal, and loading the web page data of the URL address in the preset browser.
In steps 1051-1054, a browser component monitors whether the H5 page data is modified within a preset modification time of 1 second. If no modification appears in the web page loading data, the gesture instruction data is sent directly to the IP address of the play cloud terminal; the play cloud terminal returns a new URL address, and the web page data of that URL is loaded in the preset browser, giving the user the perceived effect of controlling and modifying the PPT on the H5 page by gesture.
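For illustration, the fallback path of steps 1051-1054 might look like the sketch below: if the page was not observed to change within the modification time, the gesture instruction is forwarded to the play cloud terminal over HTTP and the returned URL is loaded. The sketch uses the `requests` package; the endpoint path, payload and response shape are assumptions made for illustration.

```python
import requests  # third-party package: pip install requests

MODIFICATION_TIMEOUT = 1.0  # seconds the browser waits for the page to change

def fallback_to_cloud_terminal(instruction, page_modified, cloud_terminal_ip):
    """If the H5 page loading data was not modified in time, forward the gesture
    instruction to the play cloud terminal and return the new URL to load."""
    if page_modified:
        return None  # the WebSocket path already updated the page
    # Hypothetical endpoint on the play cloud terminal.
    resp = requests.post(
        f"http://{cloud_terminal_ip}/play-instruction",
        json={"action": instruction},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()["url"]   # URL of the page the browser should load next

# Example: url = fallback_to_cloud_terminal("next_slide", page_modified=False,
#                                           cloud_terminal_ip="192.0.2.10")
```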
In the embodiment of the invention, the operator's video is analyzed using computer vision and image processing algorithms to obtain the operator's PPT control instruction, and the H5 page is then adjusted through a browser component, so that the operator can control PPT playback with gestures. This solves the technical problem that playing a PPT on an H5 page currently involves overly cumbersome playback operations.
Fig. 7 is a schematic structural diagram of a video-based slide show device according to an embodiment of the present invention. The video-based slide show device 700 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 710 (e.g., one or more processors), a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) storing application programs 733 or data 732. The memory 720 and the storage medium 730 may be transitory or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations on the video-based slide show device 700. Still further, the processor 710 may be configured to communicate with the storage medium 730 and execute the series of instruction operations in the storage medium 730 on the video-based slide show device 700.
The video-based slide show device 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input/output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc. It will be appreciated by those skilled in the art that the video-based slide show device structure shown in fig. 7 does not limit the video-based slide show device, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and which may also be a volatile computer readable storage medium, the computer readable storage medium having instructions stored therein which, when executed on a computer, cause the computer to perform the steps of the video-based slide show method.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (10)

1. A video-based slide show method, comprising the steps of:
receiving video data;
performing hand recognition on the video data according to a preset part recognition algorithm to obtain a recognized position;
cropping the image data at the recognized position in the video data to obtain a hand image set;
recognizing the hand image set according to a preset gesture recognition algorithm to obtain gesture instruction data;
and transmitting the gesture instruction data to an H5 page that plays the slides by using a WebSocket service, so that the H5 page adjusts the slide playing data.
2. The video-based slide show method according to claim 1, wherein the performing hand recognition on the video data according to a preset part recognition algorithm to obtain a recognized position comprises:
splitting the video data into frame images to obtain a video frame image set;
and performing feature recognition on each image of the video frame image set by using a preset YOLO algorithm to obtain a recognition area set corresponding to the video frame image set.
3. The video-based slide show method according to claim 2, wherein the cropping the image data at the recognized position in the video data to obtain a hand image set comprises:
extracting a target recognition area from the recognition area set;
extracting, based on the target recognition area, the target video frame image corresponding to the recognition area from the video frame image set;
and cropping the target video frame image based on the target recognition area to generate a hand image.
4. The video-based slide show method according to claim 1, wherein the recognizing the hand image set according to a preset gesture recognition algorithm to obtain gesture instruction data comprises:
sequentially extracting hand images from the hand image set;
recognizing the hand image according to a preset gesture recognition algorithm to obtain gesture recognition data;
writing the gesture recognition data into a preset gesture instruction frame set;
judging whether the hand image set is an empty set;
if the hand image set is not an empty set, continuing to sequentially extract hand images from the hand image set;
and if the hand image set is an empty set, performing duration analysis on all data of the gesture instruction frame set according to a preset change duration analysis algorithm to generate the gesture instruction data.
5. The method according to claim 4, wherein the performing duration analysis on all data of the gesture instruction frame set according to a preset change duration analysis algorithm to generate the gesture instruction data comprises:
analyzing the gesture change duration in the gesture instruction frame set;
judging whether the gesture change duration exceeds a preset duration threshold;
and if the duration threshold is exceeded, extracting the gesture change data in the gesture instruction frame set to generate the gesture instruction data.
6. The video-based slide show method according to claim 1, further comprising, after the transmitting the gesture instruction data to the H5 page that plays the slides by using a WebSocket service:
monitoring web page loading data of the H5 page based on a preset browser;
judging whether the web page loading data has been modified;
if no modification exists, sending the gesture instruction data to a preset play cloud terminal;
and receiving a URL address fed back by the play cloud terminal, and loading the web page data of the URL address in the preset browser.
7. The video-based slide show method according to claim 1, wherein the receiving video data comprises:
receiving, based on the TCP/IP protocol, video data transmitted over the Internet.
8. The video-based slide show method according to claim 1, wherein the receiving video data comprises:
capturing external scenes with a camera to generate video data.
9. A video-based slide show device, the video-based slide show device comprising: a memory and at least one processor, the memory having instructions stored therein, the memory and the at least one processor being interconnected by a line;
the at least one processor invoking the instructions in the memory to cause the video-based slide show device to perform the video-based slide show method of any one of claims 1-8.
10. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the video-based slide show method according to any one of claims 1-8.
CN202311235239.2A 2023-09-25 2023-09-25 Slide show method, device and storage medium based on video Active CN116980683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311235239.2A CN116980683B (en) 2023-09-25 2023-09-25 Slide show method, device and storage medium based on video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311235239.2A CN116980683B (en) 2023-09-25 2023-09-25 Slide show method, device and storage medium based on video

Publications (2)

Publication Number Publication Date
CN116980683A true CN116980683A (en) 2023-10-31
CN116980683B CN116980683B (en) 2024-04-16

Family

ID=88483559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311235239.2A Active CN116980683B (en) 2023-09-25 2023-09-25 Slide show method, device and storage medium based on video

Country Status (1)

Country Link
CN (1) CN116980683B (en)

Citations (7)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609093A (en) * 2012-02-16 2012-07-25 中国农业大学 Method and device for controlling video playing by using gestures
CN105450944A (en) * 2015-11-13 2016-03-30 北京自由坊科技有限责任公司 Method and device for synchronously recording and reproducing slides and live presentation speech
CN112307226A (en) * 2019-07-31 2021-02-02 西安诺瓦星云科技股份有限公司 Slide playing control method, device and system and computer readable storage medium
CN111078078A (en) * 2019-11-29 2020-04-28 深圳市咨聊科技有限公司 Video playing control method, device, terminal and computer readable storage medium
CN113536864A (en) * 2020-04-22 2021-10-22 深圳市优必选科技股份有限公司 Gesture recognition method and device, computer readable storage medium and terminal equipment
CN114360044A (en) * 2020-10-12 2022-04-15 武汉Tcl集团工业研究院有限公司 Gesture recognition method and device, terminal equipment and computer readable storage medium
CN114564104A (en) * 2022-02-17 2022-05-31 西安电子科技大学 Conference demonstration system based on dynamic gesture control in video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘俊轩; 刘春鹏; 许伟伟; 赵越; 周志恒; 刘丽杰: "HTML5-based slide design and implementation of playback on smart phones", Journal of Heilongjiang Bayi Agricultural University, no. 02, pages 72-77 *

Also Published As

Publication number Publication date
CN116980683B (en) 2024-04-16

Similar Documents

Publication Publication Date Title
WO2019242222A1 (en) Method and device for use in generating information
JP2020516188A (en) Face image duplication deletion method and apparatus, electronic device, storage medium, and program
WO2019047649A1 (en) Method and device for determining driving behavior of unmanned vehicle
WO2019047655A1 (en) Method and apparatus for use in determining driving behavior of driverless vehicle
KR20090097891A (en) Controlling a document based on user behavioral signals detected from a 3d captured image stream
CN111429338B (en) Method, apparatus, device and computer readable storage medium for processing video
CN111400426A (en) Robot position deployment method, device, equipment and medium
CN112927241A (en) Picture capturing and thumbnail generating method, system, equipment and storage medium
CN116980683B (en) Slide show method, device and storage medium based on video
CN111027195B (en) Simulation scene generation method, device and equipment
CN116033259B (en) Method, device, computer equipment and storage medium for generating short video
US8867837B2 (en) Detecting separator lines in a web page
CN116939306A (en) Method, system, equipment and storage medium for displaying timing of monitoring video
CN112199547A (en) Image processing method and device, storage medium and electronic equipment
CN108280184B (en) Test question extracting method and system based on intelligent pen and intelligent pen
CN114978585B (en) Deep learning symmetric encryption protocol identification method based on flow characteristics
CN111813741B (en) File sharing method and electronic equipment
CN105068708B (en) Instruction obtaining and feedback method and device and cloud server
CN108536830A (en) Picture dynamic searching method, device, equipment, server and storage medium
US20150295959A1 (en) Augmented reality tag clipper
CN114625297A (en) Interaction method, device, equipment and storage medium
CN110188833B (en) Method and apparatus for training a model
CN110263743B (en) Method and device for recognizing images
CN113965798A (en) Video information generating and displaying method, device, equipment and storage medium
CN110704294B (en) Method and apparatus for determining response time

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant