CN115914775A - Cover determining method and device, electronic equipment and storage medium - Google Patents

Cover determining method and device, electronic equipment and storage medium

Info

Publication number
CN115914775A
Authority
CN
China
Prior art keywords
cover
video
video frame
determining
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211457157.8A
Other languages
Chinese (zh)
Inventor
宁本德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202211457157.8A
Publication of CN115914775A
Legal status: Pending

Landscapes

  • Television Signal Processing For Recording (AREA)

Abstract

An embodiment of the invention provides a cover determining method and apparatus, an electronic device, and a storage medium. A plurality of first candidate video frames are extracted from a target video at a first extraction interval, and a first cover video frame is determined from among them. A plurality of second candidate video frames are then extracted, at a second extraction interval, from the video clip containing the first cover video frame; because the second extraction interval is smaller than the first, the extracted frames are denser. A second cover video frame is determined from the second candidate video frames as the cover of the target video, which avoids missing more suitable video frames because the extraction interval is too large, and improves the efficiency of cover selection. At the same time, because the second cover video frame is determined only from the second candidate video frames, the method avoids placing excessive pressure on subsequent storage and algorithmic processing by extracting too many video frames.

Description

Cover determining method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of video processing technologies, and in particular to a cover determining method and apparatus, an electronic device, and a storage medium.
Background
On a video website, a high-quality cover is presented to users to attract clicks, help them quickly locate a target video, and improve the viewing and retrieval experience; extracting a suitable cover image from a video has therefore become an important problem.
At present, a common approach is to extract images from a video at uniform intervals, analyze information in them such as faces, voice, and subtitles, and pick the best image as the cover. For example, one frame may be extracted every 10 seconds and the cover then chosen from all of the extracted images; however, because the extraction interval is large, some clearer images are sometimes missed. If the frame-extraction density is increased instead, for example to one frame every 1 second, the chance of overlooking clearer images is reduced, but the short extraction interval yields far too many images and puts heavy pressure on subsequent storage and algorithmic processing, so cover selection becomes slow and inefficient.
Disclosure of Invention
The embodiment of the invention aims to provide a cover determining method, a cover determining device, electronic equipment and a storage medium, so that a cover can be determined efficiently and quickly in a target video. The specific technical scheme is as follows:
according to a first aspect of an embodiment of the present invention, there is provided a cover determination method, including:
extracting a plurality of first video frames to be selected from a target video according to a first extraction interval;
determining a first cover video frame from the plurality of first video frames to be selected;
extracting a plurality of second video frames to be selected from the video clips containing the first cover video frame in the target video according to a second extraction interval, wherein the second extraction interval is smaller than the first extraction interval;
and determining a second cover video frame from the plurality of second video frames to be selected as a cover of the target video.
In a possible embodiment, the determining a second cover video frame from the second candidate video frames as a cover of the target video includes:
respectively calculating the similarity between each second video frame to be selected and the first cover video frame;
determining a third video frame to be selected, of which the similarity meets a preset similarity condition, from the second video frame to be selected;
and determining a second cover video frame from the third video frames to be selected as a cover of the target video.
In a possible embodiment, the determining, as the cover of the target video, a second cover video frame from among the third candidate video frames includes:
respectively calculating the definition of each third video frame to be selected;
and determining a second cover video frame with the definition meeting a preset definition condition from all the third video frames to be selected as a cover of the target video.
In a possible embodiment, the separately calculating the sharpness of each of the third candidate video frames includes:
and for each third video frame to be selected, counting the gradient of the pixel value at each pixel point in the third video frame to be selected to obtain the definition of the third video frame to be selected, wherein the definition is positively correlated with the gradient.
In a possible embodiment, if there is no second cover video frame in the second candidate video frame, the first cover video frame is determined as the cover of the target video.
According to a second aspect of an embodiment of the present invention, there is provided a cover determining apparatus, characterized by comprising:
the first extraction module is used for extracting a plurality of first video frames to be selected from the target video according to a first extraction interval;
a first determining module, configured to determine a first cover video frame from the plurality of first video frames to be selected;
the second extraction module is used for extracting a plurality of second video frames to be selected from the video clips containing the first cover video frame in the target video according to a second extraction interval, wherein the second extraction interval is smaller than the first extraction interval;
and the second determining module is used for determining a second cover video frame from the plurality of second video frames to be selected as a cover of the target video.
In a possible embodiment, the determining a second cover video frame from the second candidate video frames as a cover of the target video includes:
respectively calculating the similarity between each second video frame to be selected and the first cover video frame;
determining a third video frame to be selected with the similarity meeting a preset similarity condition from the second video frame to be selected;
and determining a second cover video frame from the third video frames to be selected as a cover of the target video.
In a possible embodiment, the determining, as the cover of the target video, a second cover video frame from among the third candidate video frames includes:
respectively calculating the definition of each third video frame to be selected;
and determining a second cover video frame with the definition meeting a preset definition condition from all the third video frames to be selected as a cover of the target video.
According to a third aspect of the embodiments of the present invention, there is provided an electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing any cover page determining method when the program stored in the memory is executed.
According to a fourth aspect of the embodiments of the present invention, there is provided a computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when executed by a processor, the computer program implements a cover determination method as described in any one of the above.
The embodiment of the invention has the following beneficial effects:
according to the cover determining method provided by the embodiment of the invention, a plurality of first to-be-selected video frames are extracted from a target video according to a first extraction interval, and a first cover video frame is determined from the plurality of first to-be-selected video frames. And extracting a plurality of second candidate video frames from the video clips containing the first cover video frame in the target video according to the second extraction interval, wherein the second extraction interval is smaller than the first extraction interval, so that the video frames with shorter time intervals with the first cover video frame can be found as the second candidate video frames, and then the second cover video frames are determined from the plurality of second candidate video frames to be used as the cover of the target video. Therefore, the situation that the more appropriate cover is ignored due to the fact that the frame splitting density is too large can be avoided. Compared with the method for preventing the problems by directly increasing the density of the split frames of the target video, the method for extracting the plurality of second video frames to be selected from the video clips containing the first cover video frame in the target video according to the second extraction interval can prevent the complexity and the time consumption of subsequent image selection processing from increasing due to excessive extracted images. By the method, the purpose of efficiently and quickly determining the cover can be achieved when the cover is determined.
Of course, it is not necessary for any product or method to achieve all of the above-described advantages at the same time for practicing the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
FIG. 1 is a diagram of an exemplary cover view option provided by an embodiment of the present invention;
FIG. 2 is a flowchart of a cover determination method according to an embodiment of the present invention;
FIG. 3 is a diagram of another exemplary cover option provided by an embodiment of the present invention;
FIG. 4 is a diagram illustrating another exemplary cover selection scheme provided by an embodiment of the present invention;
FIG. 5 is a flow chart of another cover determination method provided by an embodiment of the present invention;
FIG. 6 is a flowchart of another method for determining covers according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a cover determination apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments given herein by one of ordinary skill in the art, are within the scope of the invention.
To help a user quickly locate a target video and to improve the viewing and retrieval experience, a video website usually shows a high-quality, representative highlight cover to attract the user to click on the video. Currently, a video cover is selected by extracting frames from the video to obtain an image sequence, analyzing information in those images such as faces, sound, and subtitles, and determining the cover from the analysis results. As shown in FIG. 1, which illustrates an example of cover selection, one video frame is extracted from the target video every 10 seconds to obtain a plurality of candidate video frames. However, extracting one frame every 10 seconds may cause some better video frames to be missed because the extraction interval is too large. For example, frames are extracted starting from the timestamp t = 0 seconds, one every 10 seconds, giving the video frames shown in FIG. 1. Suppose that at the timestamp t = 5 seconds there is a video frame that is more suitable as a cover; with extraction every 10 seconds, that 5th-second frame is skipped because the interval is too large. To avoid missing better frames, the extraction interval could be shortened, for example to one frame every 1 second. Simply shortening the interval in this way, however, multiplies the number of extracted frames roughly tenfold; although more candidate frames are obtained, so many frames put great pressure on subsequent storage and algorithmic processing.
To improve selection efficiency without missing a more suitable cover, the present application provides a cover determination method. As shown in fig. 2, which is a flowchart of a cover determination method provided by an embodiment of the present invention, the method includes the following steps.
step S101, a plurality of first video frames to be selected are extracted from the target video according to a first extraction interval.
Step S102, a first cover video frame is determined from a plurality of first video frames to be selected.
Step S103, extracting a plurality of second candidate video frames from the video clips including the first cover video frame in the target video according to a second extraction interval, where the second extraction interval is smaller than the first extraction interval.
And step S104, determining a second cover video frame from the plurality of second video frames to be selected as a cover of the target video.
In this embodiment, after a plurality of first candidate video frames are extracted from the target video at the first extraction interval, a first cover video frame is determined from among them. A plurality of second candidate video frames are then extracted, at a second extraction interval, from the video clip containing the first cover video frame. Because the second extraction interval is smaller than the first, the frames are sampled more densely, which avoids missing more suitable video frames due to an overly large extraction interval. Compared with shortening the extraction interval over the entire target video, the second cover video frame only needs to be determined from the second candidate frames, which avoids placing excessive pressure on subsequent storage and algorithmic processing by extracting too many frames.
The foregoing S101-S104 will be explained below:
In step S101, as shown in fig. 3, the first extraction interval may be set to 10 seconds. From a 30-second target video, extracting one video frame every 10 seconds yields 4 candidate frames with timestamps t = 0, 10, 20, and 30 seconds, referred to here as the 0th-second, 10th-second, 20th-second, and 30th-second video frames.
In step S102, various kinds of information analysis are performed on the first candidate video frames, for example on faces, voice, and subtitles, and the frames are compared against the understood video content to determine a first cover video frame. For example, after comparing the 0th-, 10th-, 20th-, and 30th-second video frames from step S101 and combining the results of these analyses, the 20th-second frame may be found most suitable as the cover of the target video and is therefore determined to be the first cover video frame.
In step S103, a plurality of second candidate video frames are extracted from the video clip of the target video that contains the first cover video frame. If the t-th-second video frame is determined to be the first cover video frame, extraction is performed on the clip running from K seconds before to K seconds after the first cover video frame, as shown in fig. 4; the specific value of K may be set by relevant personnel according to experience and/or actual requirements.
For example, assume K = 2 seconds and the second extraction interval is 1 second. As shown in fig. 3, when the 20th-second video frame is determined to be the first cover video frame, one video frame is extracted every 1 second from the 18th- to 22nd-second clip of the target video, giving a plurality of second candidate video frames. Similarly, if the 10th-second frame were the first cover video frame, one frame would be extracted every 1 second from the 8th- to 12th-second clip. Because the second extraction interval is smaller than the first, the situation where a more suitable cover is missed can be avoided.
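The two-pass extraction of steps S101–S103 can be sketched as follows (a minimal illustration; the function and parameter names are hypothetical, and it works on integer timestamps rather than decoded frames):

```python
def coarse_to_fine_timestamps(duration, first_interval, second_interval, t_best, k):
    """Return the timestamps sampled in each pass of the two-pass scheme.

    duration: length of the target video in seconds.
    first_interval: coarse (first) extraction interval, e.g. 10 s.
    second_interval: fine (second) extraction interval, e.g. 1 s.
    t_best: timestamp of the first cover video frame chosen in pass 1.
    k: half-width of the clip around t_best re-sampled in pass 2.
    """
    # Pass 1: sample the whole video sparsely.
    first_pass = list(range(0, duration + 1, first_interval))
    # Pass 2: sample only the clip [t_best - k, t_best + k] densely,
    # clamped to the bounds of the video.
    start = max(0, t_best - k)
    end = min(duration, t_best + k)
    second_pass = list(range(start, end + 1, second_interval))
    return first_pass, second_pass

# The example from the text: a 30-second video, 10-second coarse interval,
# the 20th-second frame chosen as first cover frame, K = 2, 1-second fine interval.
first_pass, second_pass = coarse_to_fine_timestamps(30, 10, 1, 20, 2)
print(first_pass)   # [0, 10, 20, 30]
print(second_pass)  # [18, 19, 20, 21, 22]
```

Note that the second pass produces only 5 extra frames, versus roughly 30 if the whole video were re-sampled at 1-second intervals.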
In step S104, a second cover video frame is determined from the plurality of second candidate video frames as a cover of the target video.
The second cover video frame is determined from the second video frames to be selected, so that the algorithm processing pressure during subsequent image selection can be reduced, and the efficiency of cover image selection is improved.
Because the first cover video frame has already been judged, through information analysis, to be suitable as the cover of the target video, the second cover video frame should also be selected from video frames similar to the first cover video frame, so that it too is suitable as a cover. Here, "similar" means that the similarity satisfies a preset similarity condition. As shown in fig. 5, which is a flowchart of another cover determining method, on the basis of fig. 2 the step S104 may include:
in step S1041, the similarity between each second video frame to be selected and the first cover video frame is calculated respectively.
The similarity between each second candidate video frame and the first cover video frame can be obtained through a similarity formula. For example, let the first cover video frame be I_bst1 and the sequence of second candidate video frames be {I_i}; the similarity of each candidate is then V_i = ssim(I_bst1, I_i), where ssim (the Structural Similarity Index, SSIM) is a function that computes the similarity between two video frames.
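A production system would compute SSIM with a library routine such as the one in scikit-image. The following stdlib-only sketch illustrates only the shape of the SSIM formula; it is a simplified global, single-window variant (the standard index averages the formula over local windows), and the flat-list frame representation is an assumption for clarity:

```python
def ssim_global(a, b, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Simplified single-window SSIM between two grayscale frames given as
    flat lists of pixel values in [0, 255]. c1 and c2 are the usual
    stabilizing constants for 8-bit data."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((x - mu_b) ** 2 for x in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

frame = [0, 64, 128, 192, 255]
print(ssim_global(frame, frame))  # identical frames score exactly 1.0
```

Because SSIM is bounded above by 1, a frame identical to the first cover video frame attains the maximum possible similarity.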
Step S1042, determining a third video frame to be selected, whose similarity satisfies a preset similarity condition, from the second video frame to be selected.
The preset similarity condition may be that the calculated similarity is greater than a threshold Threshold, whose value is set by relevant personnel according to experience and/or actual requirements. For example, if the similarity formula yields V1 = 1, V2 = 0.7, V3 = 0.92, and V4 = 0.95, and the relevant personnel have preset Threshold = 0.9, then the frames corresponding to V1, V3, and V4 are selected as third candidate video frames.
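The thresholding of step S1042 then reduces to keeping the candidates whose scores exceed Threshold (a minimal sketch; the scores below are illustrative, not computed from real frames):

```python
def filter_by_similarity(scores, threshold):
    """Return the indices of second candidate frames whose similarity to
    the first cover frame exceeds the preset threshold (step S1042)."""
    return [i for i, v in enumerate(scores) if v > threshold]

# Similarity scores V_i of four second candidate frames (SSIM never exceeds 1).
scores = [1.0, 0.7, 0.92, 0.95]
print(filter_by_similarity(scores, 0.9))  # [0, 2, 3]: V1, V3 and V4 pass
```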
Step S1043, determining a second cover video frame from the third candidate video frames as a cover of the target video.
Selecting, as third candidate video frames, only the frames whose calculated similarity satisfies the preset similarity condition reduces the workload of subsequent selection and improves the efficiency of cover selection.
In this embodiment, the similarity between each second candidate video frame and the first cover video frame is calculated and compared, and the frames satisfying the preset similarity condition are determined to be third candidate video frames. Determining all frames similar to the first cover video frame as third candidates helps ensure that the finally chosen cover stays as close as possible to the first cover video frame. At the same time, because every third candidate already satisfies the similarity condition, determining the cover from these candidates, rather than directly from the whole target video, reduces the algorithmic processing load during subsequent selection and improves selection efficiency.
In order to select a clearer cover when determining the cover, as shown in fig. 6, fig. 6 is a flowchart of another cover determination method, and on the basis of fig. 5, step S1043 may include:
and step S105, respectively calculating the definition of each third video frame to be selected.
The definition of each third candidate video frame can be calculated with the definition function D(I) = Σ_{x,y} |I(x + 2, y) − I(x, y)|, where the sum runs over the pixels (x, y) of the frame I.
And S106, determining a second cover video frame with the definition meeting the preset definition condition from all third video frames to be selected as a cover of the target video.
The preset definition condition may be that a frame's definition is greater than a preset threshold, for example the definition of the first cover video frame. Alternatively, the preset definition condition may select the frame with the maximum definition among the third candidate video frames.
For example, suppose the third candidate video frames comprise five frames X1–X5, whose definitions D1–D5 are obtained from the definition function, and suppose D1, D3, and D4 are all greater than the definition of the first cover video frame, with D1 the maximum. If the preset definition condition is that the definition exceeds the preset threshold, any one of X1, X3, and X4 can be selected as the cover of the target video; if the condition is maximum definition, X1 is selected.
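Steps S105–S106 can be sketched by implementing the definition function directly on a grayscale frame stored as a 2-D list of pixel rows (a simplification assumed here; a real system would decode frames and typically use NumPy or OpenCV):

```python
def definition(frame):
    """D(I) = sum over pixels of |I(x + 2, y) - I(x, y)|: the summed
    horizontal pixel-value gradient with a stride of 2, used as a
    definition (sharpness) score. frame is a list of rows of pixel values."""
    h = len(frame)       # number of rows (y)
    w = len(frame[0])    # number of columns (x)
    return sum(abs(frame[y][x + 2] - frame[y][x])
               for y in range(h) for x in range(w - 2))

def pick_sharpest(frames):
    """Step S106 with the 'maximum definition' condition: return the index
    of the third candidate frame with the highest definition score."""
    scores = [definition(f) for f in frames]
    return max(range(len(scores)), key=scores.__getitem__)

# A high-contrast (sharp-edged) frame scores higher than a flat one.
flat = [[128, 128, 128, 128], [128, 128, 128, 128]]
sharp = [[0, 0, 255, 255], [0, 0, 255, 255]]
print(definition(flat))              # 0
print(pick_sharpest([flat, sharp]))  # 1
```

As the patent notes, the score is positively correlated with the gradient: the flat frame has zero gradient everywhere and scores 0, while the frame with a hard edge scores high.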
In this embodiment, by calculating the definition of each third video frame to be selected and then comparing the definitions of the third video frames to be selected, when the cover is determined, a clearer video frame can be selected as the cover of the target video.
In this embodiment, for each third candidate video frame, the gradients of the pixel values at each pixel are accumulated to obtain the frame's definition, which is positively correlated with the gradient. The gradient is the rate of change of a pixel relative to its neighboring pixels in the x and y directions: a larger gradient means greater contrast with neighboring pixels and a clearer frame, while a smaller gradient means less contrast and a more blurred frame.
In addition, in the present application, if there is no second cover video frame in the second candidate video frame, the first cover video frame is determined as the cover of the target video. When the second cover video frame does not exist in the second candidate video frame, the current first cover video frame is the most suitable cover, so that the first cover video frame can be directly selected as the cover of the target video.
For example, suppose the preset similarity condition is that the calculated similarity exceeds Threshold, the similarity formula yields V1 = 0.6, V2 = 0.7, V3 = 0.8, and V4 = 0.5, and the relevant personnel have preset Threshold = 0.9. All four values are smaller than Threshold, so none satisfies the preset similarity condition and no second cover video frame exists among the second candidate video frames. In this case, the first cover video frame can be selected directly as the cover of the target video.
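The fallback behaviour can be combined with the similarity filter in one hedged sketch (the helper name and the frame representation are assumptions; the final ranking step is simplified to highest similarity rather than the full definition-based selection):

```python
def choose_cover(first_cover, scored_candidates, threshold):
    """Return the cover of the target video. scored_candidates is a list of
    (similarity, frame) pairs for the second candidate video frames; if none
    exceeds the similarity threshold, fall back to the first cover frame."""
    passing = [(v, f) for v, f in scored_candidates if v > threshold]
    if not passing:
        # No second cover video frame exists among the candidates, so the
        # first cover video frame is already the most suitable cover.
        return first_cover
    # In the full method the passing frames would next be ranked by
    # definition; here the highest-similarity frame stands in for that step.
    return max(passing)[1]

# The example from the text: all similarities below Threshold = 0.9,
# so the first cover video frame is returned unchanged.
cover = choose_cover("frame_20s",
                     [(0.6, "a"), (0.7, "b"), (0.8, "c"), (0.5, "d")], 0.9)
print(cover)  # frame_20s
```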
Based on the same inventive concept, the present application correspondingly provides a cover determining apparatus, as shown in fig. 7, fig. 7 is a schematic structural diagram of a cover determining apparatus provided in an embodiment of the present invention, and the apparatus includes:
a first extracting module 701, configured to extract a plurality of first video frames to be selected from a target video according to a first extraction interval;
a first determining module 702, configured to determine a first cover video frame from a plurality of first video frames to be selected;
a second extraction module 703, configured to extract a plurality of second video frames to be selected from a video clip that includes the first cover video frame in the target video according to a second extraction interval, where the second extraction interval is smaller than the first extraction interval;
a second determining module 704, configured to determine a second cover video frame from the multiple second candidate video frames as a cover of the target video.
In this embodiment, after the first extraction module 701 extracts a plurality of first candidate video frames from the target video at the first extraction interval, the first determining module 702 determines a first cover video frame from among them. The second extraction module 703 then extracts a plurality of second candidate video frames, at a second extraction interval, from the video clip of the target video that contains the first cover video frame. Because the second extraction interval is smaller than the first, the frames are sampled more densely, avoiding the omission of more suitable frames due to an overly large interval. The second determining module 704 determines a second cover video frame from the second candidates as the cover of the target video. Compared with obtaining denser frames by shortening the extraction interval over the whole target video, only the second candidate frames need to be considered, which reduces the storage and algorithmic processing load in subsequent selection and improves the efficiency of cover determination.
In a possible embodiment, determining a second cover video frame from the plurality of second candidate video frames as a cover of the target video includes:
respectively calculating the similarity between each second video frame to be selected and the first cover video frame;
determining a third video frame to be selected with the similarity meeting a preset similarity condition from the second video frame to be selected;
and determining a second cover video frame from the third video frames to be selected as a cover of the target video.
In a possible embodiment, determining a second cover video frame from the third candidate video frames as a cover of the target video includes:
respectively calculating the definition of each third video frame to be selected;
and determining a second cover video frame with the definition meeting the preset definition condition from the third video frames to be selected as a cover of the target video.
In a possible embodiment, the calculating the sharpness of each third candidate video frame separately includes:
and for each third video frame to be selected, counting the gradient of the pixel value at each pixel point in the third video frame to be selected to obtain the definition of the third video frame to be selected, wherein the definition is positively correlated with the gradient.
In a possible embodiment, if the second cover video frame does not exist in the second candidate video frame, the first cover video frame is determined as the cover of the target video.
An embodiment of the present invention further provides an electronic device, as shown in fig. 8, comprising a processor 801, a communication interface 802, a memory 803, and a communication bus 804, where the processor 801, the communication interface 802, and the memory 803 communicate with each other through the communication bus 804.
a memory 803 for storing a computer program;
the processor 801 is configured to implement the following steps when executing the program stored in the memory 803:
extracting a plurality of first candidate video frames from a target video at a first extraction interval;
determining a first cover video frame from the plurality of first candidate video frames;
extracting a plurality of second candidate video frames, at a second extraction interval smaller than the first extraction interval, from a video clip of the target video that contains the first cover video frame;
and determining a second cover video frame from the plurality of second candidate video frames as the cover of the target video.
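The four steps above form a coarse-to-fine search over frame indices, and can be sketched as follows. The scoring function, the extent of the "video clip" around the first cover frame, and the function names are all illustrative assumptions; the patent leaves these open:

```python
def choose_cover(num_frames, coarse_interval, fine_interval, score, pick_second):
    """Coarse-to-fine cover selection over frame indices.

    `score` ranks the coarse candidates (the selection criterion is not
    fixed by the patent); `pick_second` returns a refined index or None,
    in which case the first cover frame is kept (the fallback described
    in the embodiments).  All names here are illustrative."""
    # Step 1: coarse sampling at the first (larger) extraction interval.
    coarse = list(range(0, num_frames, coarse_interval))
    # Step 2: first cover video frame = best-scoring coarse candidate.
    first = max(coarse, key=score)
    # Step 3: denser sampling, at the smaller second interval, inside the
    # clip containing the first cover frame (here: one coarse interval
    # on each side of it -- an assumed clip boundary).
    lo = max(0, first - coarse_interval)
    hi = min(num_frames, first + coarse_interval + 1)
    fine = list(range(lo, hi, fine_interval))
    # Step 4: refined choice, falling back to the first cover frame.
    second = pick_second(fine)
    return second if second is not None else first
```

Because the dense pass only covers the clip around the first cover frame, the total number of frames decoded and scored stays small even when the second interval is much finer than the first.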
The communication bus 804 of the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface 802 is used for communication between the above-described electronic apparatus and other apparatuses.
The memory 803 may include a random access memory (RAM) or a non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor 801 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored; the computer program, when executed by a processor, implements the steps of any of the above cover determination methods.
In yet another embodiment, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to perform any of the cover determination methods of the above embodiments.
In the above embodiments, the implementation may be realized, wholly or partially, by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced, in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in a related manner; the same and similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the embodiments of the apparatus, the electronic device, the computer-readable storage medium, and the computer program product are described relatively briefly, since they are substantially similar to the method embodiments; for relevant points, reference may be made to the corresponding description of the method embodiments.
The above description covers only preferred embodiments of the present invention and is not intended to limit its scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A cover determination method, the method comprising:
extracting a plurality of first candidate video frames from a target video at a first extraction interval;
determining a first cover video frame from the plurality of first candidate video frames;
extracting a plurality of second candidate video frames, at a second extraction interval smaller than the first extraction interval, from a video clip of the target video that contains the first cover video frame;
and determining a second cover video frame from the plurality of second candidate video frames as a cover of the target video.
2. The method of claim 1, wherein the determining a second cover video frame from the plurality of second candidate video frames as a cover of the target video comprises:
calculating the similarity between each second candidate video frame and the first cover video frame;
determining, from the second candidate video frames, third candidate video frames whose similarity meets a preset similarity condition;
and determining a second cover video frame from the third candidate video frames as the cover of the target video.
3. The method of claim 2, wherein the determining a second cover video frame from among the third candidate video frames as a cover of the target video comprises:
calculating the sharpness of each third candidate video frame;
and determining, from the third candidate video frames, a second cover video frame whose sharpness meets a preset sharpness condition as the cover of the target video.
4. The method according to claim 3, wherein the calculating the sharpness of each of the third candidate video frames comprises:
for each third candidate video frame, aggregating the gradients of the pixel values at the pixel points in that frame to obtain its sharpness, where the sharpness is positively correlated with the gradients.
5. The method of claim 1, wherein the first cover video frame is determined as the cover of the target video if no second cover video frame exists among the second candidate video frames.
6. A cover determination apparatus, the apparatus comprising:
a first extraction module, configured to extract a plurality of first candidate video frames from a target video at a first extraction interval;
a first determining module, configured to determine a first cover video frame from the plurality of first candidate video frames;
a second extraction module, configured to extract a plurality of second candidate video frames, at a second extraction interval smaller than the first extraction interval, from a video clip of the target video that contains the first cover video frame;
and a second determining module, configured to determine a second cover video frame from the second candidate video frames as a cover of the target video.
7. The apparatus of claim 6, wherein the second determining module is specifically configured to:
calculate the similarity between each second candidate video frame and the first cover video frame;
determine, from the second candidate video frames, third candidate video frames whose similarity meets a preset similarity condition;
and determine a second cover video frame from the third candidate video frames as the cover of the target video.
8. The apparatus of claim 7, wherein the second determining module is further configured to:
calculate the sharpness of each third candidate video frame;
and determine, from the third candidate video frames, a second cover video frame whose sharpness meets a preset sharpness condition as the cover of the target video.
9. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
and the processor is configured to implement the method steps of any one of claims 1 to 5 when executing the program stored in the memory.
10. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 5.
CN202211457157.8A 2022-11-21 2022-11-21 Cover determining method and device, electronic equipment and storage medium Pending CN115914775A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211457157.8A CN115914775A (en) 2022-11-21 2022-11-21 Cover determining method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN115914775A true CN115914775A (en) 2023-04-04

Family

ID=86476302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211457157.8A Pending CN115914775A (en) 2022-11-21 2022-11-21 Cover determining method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115914775A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060026524A1 (en) * 2004-08-02 2006-02-02 Microsoft Corporation Systems and methods for smart media content thumbnail extraction
US20110064318A1 (en) * 2009-09-17 2011-03-17 Yuli Gao Video thumbnail selection
CN104270619A (en) * 2014-10-22 2015-01-07 中国建设银行股份有限公司 Safety warning method and device
JP2017117409A (en) * 2015-12-25 2017-06-29 キヤノン株式会社 Information processing apparatus, processing method, and program
CN108810622A (en) * 2018-07-09 2018-11-13 腾讯科技(深圳)有限公司 Extracting method, device, computer-readable medium and the electronic equipment of video frame
CN109165301A (en) * 2018-09-13 2019-01-08 北京字节跳动网络技术有限公司 Video cover selection method, device and computer readable storage medium
CN111711771A (en) * 2020-05-20 2020-09-25 北京奇艺世纪科技有限公司 Image selection method and device, electronic equipment and storage medium
US20220147741A1 (en) * 2020-11-06 2022-05-12 Beijing Xiaomi Mobile Software Co., Ltd. Video cover determining method and device, and storage medium


Non-Patent Citations (3)

Title
Weigang Zhang; Chunxi Liu; Qingming Huang; Shuqiang Jiang; Wen Gao: "A Novel Framework for Web Video Thumbnail Generation", 2012 Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 23 August 2012, pages 343-346 *
Zhang Weigang; Wang Zhenjun; Huang Qingming; Gao Wen: "Web Video Thumbnail Recommendation Incorporating Visual Content Analysis", Intelligent Computer and Applications, vol. 4, no. 3, 11 August 2014, pages 5-10 *
Li Runze: "A Video Cover Extraction Algorithm Based on Deep Learning", China Masters' Theses Full-text Database, Information Science and Technology, no. 2019, 15 August 2019 *

Similar Documents

Publication Publication Date Title
US8947600B2 (en) Methods, systems, and computer-readable media for detecting scene changes in a video
US9596520B2 (en) Method and system for pushing information to a client
US10313712B2 (en) Method, device, and server for producing video frame set
CN111314732A (en) Method for determining video label, server and storage medium
US20110026764A1 (en) Detection of objects using range information
CN110213630B (en) Video processing method and device, electronic equipment and medium
CN111767828A (en) Certificate image copying and identifying method and device, electronic equipment and storage medium
CN111401238A (en) Method and device for detecting character close-up segments in video
CN113992970A (en) Video data processing method and device, electronic equipment and computer storage medium
CN113158773B (en) Training method and training device for living body detection model
CN111415371A (en) Sparse optical flow determination method and device
US20070061727A1 (en) Adaptive key frame extraction from video data
Chen et al. Modelling of content-aware indicators for effective determination of shot boundaries in compressed MPEG videos
CN115914775A (en) Cover determining method and device, electronic equipment and storage medium
CN112818984B (en) Title generation method, device, electronic equipment and storage medium
CN113076932B (en) Method for training audio language identification model, video detection method and device thereof
CN113886273A (en) Performance test method, device, test equipment and medium of application program
CN111031397B (en) Method, device, equipment and storage medium for collecting clip comments
US11398091B1 (en) Repairing missing frames in recorded video with machine learning
CN115080792A (en) Video association method and device, electronic equipment and storage medium
CN113779304A (en) Method and device for detecting infringement video
CN108388493B (en) Big data extraction and analysis method and device, storage medium and server
CN111147954A (en) Thumbnail extraction method and device
CN114697761B (en) Processing method, processing device, terminal equipment and medium
CN117615170A (en) Scene transition detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination