CN108985244B - Television program type identification method and device - Google Patents


Info

Publication number: CN108985244B
Authority: CN (China)
Legal status: Active
Application number: CN201810821306.1A
Other languages: Chinese (zh)
Other versions: CN108985244A
Inventors: 王月岭, 黄利
Current Assignee: Hisense Co Ltd
Original Assignee: Hisense Co Ltd
Events:
  • Application filed by Hisense Co Ltd
  • Priority to CN201810821306.1A
  • Publication of CN108985244A
  • Application granted
  • Publication of CN108985244B
  • Status: Active

Classifications

    • G06V20/40: Scenes; scene-specific elements in video content
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/635: Overlay text, e.g. embedded captions in a TV program


Abstract

The application provides a television program type identification method and device. N consecutive frames of video images are acquired from the current television program and input into a pre-trained convolutional neural network, which outputs a program type for each of the N frames. The program types corresponding to the N frames are then tallied according to a preset strategy to obtain the program type of the current television program. The invention avoids the misjudgments that recognition errors on a small number of frames would otherwise cause, thereby improving the identification accuracy of the television program type.

Description

Television program type identification method and device
Technical Field
The present application relates to the field of television technologies, and in particular, to a method and an apparatus for identifying a television program type.
Background
Most existing television program type identification algorithms identify programs according to broadcast time slots and on-screen identification information. This approach has clear limitations: if the picture carries no time-slot or identification information, recognition quality drops sharply.
Another common approach is deep learning, most notably the convolutional neural network algorithm. For television program type identification, many individual pictures in a broadcast are ambiguous, resembling one program type while also resembling another, so per-frame recognition rates are often low. When the available sample is not large enough, the overall recognition accuracy is therefore poor.
Disclosure of Invention
In view of this, to solve the low accuracy of existing program type identification, the present invention provides a method and an apparatus for identifying a television program type. A convolutional neural network performs program type identification on multiple input frames of video images, and the per-frame program types are then judged in combination to finally determine the program type of the television program, thereby improving identification accuracy.
Specifically, the method is realized through the following technical scheme:
according to a first aspect of embodiments of the present application, there is provided a method for identifying a television program type, the method including:
acquiring continuous N frames of video images in a current television program;
inputting the continuous N frames of video images into a pre-trained convolutional neural network, and acquiring a program type corresponding to each frame of video image in the N frames of video images output by the convolutional neural network;
and counting the program type corresponding to each frame of video image in the continuous N frames of video images according to a preset strategy to obtain the program type of the current television program.
As an embodiment, the training method of the convolutional neural network includes:
dividing a television program into a plurality of program types;
acquiring a video sample corresponding to each program type;
extracting image characteristic data of each frame of image in the video sample as training data;
and inputting the training data into a convolutional neural network for training to obtain a convolutional neural network model.
As an embodiment, the step of counting the program types corresponding to each frame of video image in the consecutive N frames according to a preset policy to obtain the program type of the current television program includes:
counting, among the program types output by the convolutional neural network for the consecutive N frames, how many frames are assigned to each program type; if one program type has the largest count and that count is greater than or equal to a first preset number, outputting that program type as a prediction type;
counting, among a plurality of consecutive prediction types, how many times each prediction type occurs; and if one prediction type has the largest count and that count is greater than or equal to a second preset number, taking that prediction type as the program type of the current television program.
As an embodiment, the method further comprises:
if N in the continuous N frames of video images is larger than or equal to a first threshold value, and the program type corresponding to the current television program is not determined, acquiring the next frame of video image;
and if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a second threshold value, stopping acquiring the video images, and after a first time interval, starting to acquire the video images.
As an embodiment, the method further comprises:
and if the program type of the current television program is the same as the program type obtained the last time, stopping acquiring video images, and after a second time interval, starting to acquire video images again.
According to a second aspect of the embodiments of the present application, there is provided a television program type identification apparatus, the apparatus including:
the acquisition unit is used for acquiring continuous N frames of video images in the current television program;
the input unit is used for inputting the continuous N frames of video images into a pre-trained convolutional neural network and acquiring the program type corresponding to each frame of video image in the continuous N frames of video images output by the convolutional neural network;
and the determining unit is used for determining the program type of the current television program according to the program type corresponding to each frame of video image in the continuous N frames of video images.
As an embodiment, the apparatus further comprises:
the training unit is used for dividing the television program into a plurality of program types; acquiring a video sample corresponding to each program type; extracting image characteristic data of each frame of image in the video sample as training data; and inputting the training data into a convolutional neural network for training to obtain a convolutional neural network model.
As an embodiment, the determining unit is specifically configured to count, among the program types output by the convolutional neural network for the consecutive N frames, how many frames are assigned to each program type; if one program type has the largest count and that count is greater than or equal to a first preset number, to output that program type as a prediction type; to count, among a plurality of consecutive prediction types, how many times each prediction type occurs; and if one prediction type has the largest count and that count is greater than or equal to a second preset number, to take that prediction type as the program type of the current television program.
As an embodiment, the apparatus further comprises:
the first stopping unit is used for acquiring a next frame of video image if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a first threshold value; and if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a second threshold value, stopping acquiring the video images, and after a first time interval, starting to acquire the video images.
As an embodiment, the apparatus further comprises:
and the second stopping unit is used for stopping acquiring the video image if the program type of the current television program is the same as the program type of the television program acquired next time, and then starting acquiring the video image after waiting for a second time interval.
As can be seen from the above embodiments, the present application can obtain consecutive N frames of video images in the current television program; inputting the continuous N frames of video images into a pre-trained convolutional neural network to obtain a program type corresponding to each frame of video image in the output continuous N frames of video images; and then, counting the program type corresponding to each frame of video image in the continuous N frames of video images according to a preset strategy to obtain the program type of the current television program. Compared with the prior art, the method can utilize the pre-trained convolutional neural network to identify the program types of the input multi-frame video images, preliminarily predict the identification results of the program types, perform statistical judgment on the program types corresponding to the identified multi-frame video images according to a multi-frame strategy, and finally determine the program types corresponding to the television programs.
Drawings
Fig. 1 is a flowchart illustrating an exemplary method for identifying tv program types according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an exemplary training scenario of the present application;
FIG. 3-1 is a schematic diagram of an exemplary first image feature extraction of the present application;
FIG. 3-2 is a schematic diagram of an exemplary second image feature extraction of the present application;
FIG. 3-3 is a schematic diagram of an exemplary third image feature extraction of the present application;
FIG. 4 is a diagram illustrating an exemplary multi-frame strategy of the present application;
FIG. 5 is a block diagram of an embodiment of a television program type identification apparatus of the present application;
FIG. 6 is a block diagram of one embodiment of a computer device of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination", depending on the context.
For television program type identification, many individual pictures in a broadcast are ambiguous: a frame may resemble one program type while also resembling another, so per-frame recognition rates are often low. For example, sports news belongs to news, but so many sports pictures are interspersed in the program that a classifier may mistake it for sports; in a story-driven game program, scenes in which the plot unfolds look very similar to a cartoon, so a classifier has difficulty deciding between game and cartoon. At present, television programs are usually identified by matching against program samples, but television pictures are numerous, structurally complex, and updated very quickly. Since it is impossible to enumerate every possible picture for each program type, the program samples can never be exhaustive; program types are therefore often misjudged, identification efficiency is low, and the downstream adjustment of picture quality according to program type suffers.
In order to solve the above problems, the present application provides a method for identifying a television program type, which can identify a television program type by acquiring consecutive N frames of video images in a current television program; inputting the continuous N frames of video images into a pre-trained convolutional neural network to obtain a program type corresponding to each frame of video image in the output N frames of video images; and then, counting the program type corresponding to each frame of video image in the N frames of video images according to a preset strategy to obtain the program type of the current television program. Compared with the prior art, the method can utilize the pre-trained convolutional neural network to identify the program types of the input multi-frame video images, preliminarily predict the identification results of the program types, perform statistical judgment on the program types corresponding to the identified multi-frame video images according to a multi-frame strategy, and finally determine the program types corresponding to the television programs.
As follows, the following embodiments are shown to explain the television program type identification method provided by the present application.
The first embodiment,
Referring to fig. 1, a flowchart of an exemplary embodiment of a television program type identification method according to the present application is shown, where the method includes the following steps:
step 101, acquiring continuous N frames of video images in a current television program;
in this embodiment, the television program type identification method may be used for a television set or a set-top box or a computer device. When a television program is played, N consecutive frames of video images can be obtained from the current television program, where N is a positive integer greater than or equal to 2.
102, inputting the continuous N frames of video images into a pre-trained convolutional neural network, and acquiring a program type corresponding to each frame of video image in the continuous N frames of video images output by the convolutional neural network;
in this embodiment, the acquired continuous N frames of video images may be input to a pre-trained convolutional neural network, and according to a training result of the convolutional neural network, a program type may be determined for each frame of video image, and then a program type corresponding to each frame of video image in the continuous N frames of video images output by the convolutional neural network is acquired.
The training process for the convolutional neural network is specifically illustrated by the following examples.
Example II,
Please refer to fig. 2, which is a schematic diagram of an exemplary training scheme of the present application, wherein the training process specifically includes:
step 201, dividing a plurality of program types for a television program;
in this embodiment, the present application mainly divides the tv program types into the following 7 types, such as sports, news, games, animations, fantasy, movies, and other program types, and the present application can perform targeted identification for several main program categories in the tv programs.
Step 202, obtaining a video sample corresponding to each program type;
in this embodiment, by learning the salient features of the video images in each program in a targeted way, typical video images of each program type can be collected, together with video images that experience shows to be disputable yet consistently labeled. For example, a news picture may contain image content from sports, games, entertainment, movies and other programs; if it clearly carries news features, the program can still be considered news. Such images are highly disputable between types, yet in practice belong to one definite program type. Obtaining a video sample for each program type in this way provides more comprehensive training samples for the convolutional neural network and thus improves its identification accuracy.
Step 203, extracting image characteristic data of each frame of image in the video sample as training data;
in this embodiment, the image feature data of each frame of image in the video sample is extracted as training data. Image feature data is typically image content that is strongly indicative of a particular program type.
For example, see fig. 3-1, 3-2, and 3-3. The program type of fig. 3-1 is movie: because movies characteristically show black letterbox bars above and below the picture, those bars (the areas in the black boxes in fig. 3-1) can be extracted as image feature data. The program type of fig. 3-2 is news: because news scenes characteristically show a station logo in the upper left corner and a caption strip at the bottom, the logo and caption (the areas in the black boxes in fig. 3-2) can be extracted as image feature data. The program type of fig. 3-3 is sports: because sports scenes characteristically show green grass, the grass (the area in the black box in fig. 3-3) can be extracted as image feature data. By extracting image features from such typical scenes as training data, the convolutional neural network can recognize these scenes when judging the program type, achieving the purpose of type identification.
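The region-of-interest extraction described above can be sketched in Python as follows. This is an illustrative reconstruction, not the patent's implementation: the function name, the fractional region coordinates, and the 1080p frame size are all assumptions.

```python
import numpy as np

# Fractional crop regions (assumed values for illustration; the patent
# does not give exact coordinates). Each entry is (top, bottom, left,
# right) as fractions of the frame height/width.
FEATURE_REGIONS = {
    "movie_letterbox_top":    (0.00, 0.12, 0.00, 1.00),  # black bar above picture
    "movie_letterbox_bottom": (0.88, 1.00, 0.00, 1.00),  # black bar below picture
    "news_logo":              (0.00, 0.15, 0.00, 0.20),  # station logo, top-left
    "news_caption":           (0.85, 1.00, 0.00, 1.00),  # caption strip, bottom
}

def extract_region(frame: np.ndarray, region: str) -> np.ndarray:
    """Crop one candidate feature region out of an H x W x 3 frame."""
    h, w = frame.shape[:2]
    t, b, l, r = FEATURE_REGIONS[region]
    return frame[round(t * h):round(b * h), round(l * w):round(r * w)]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # dummy black frame
bar = extract_region(frame, "movie_letterbox_top")
```

Crops like `bar` would then be fed to the network as the image feature data for their program type.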
And step 204, inputting the training data into a convolutional neural network for training to obtain a convolutional neural network model.
The training data are input into the convolutional neural network, where targeted processing and parameter training of the relevant learning algorithm yield a trained convolutional neural network model. The trained model can then detect whether the scenes played on a television carry the characteristic attributes of each scene type and, backed by high-confidence prediction results, pre-judge the types of scene programs that have the preset typical characteristics. In this embodiment, the structure of the convolutional neural network may include 5 convolutional layers and 3 fully connected layers, or may be another combined structure; this application does not limit the structure.
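The 5-convolution, 3-fully-connected layout mentioned above matches an AlexNet-style network. The sketch below only traces the spatial size of the feature map through such a stack; the kernel sizes, strides, pooling steps, channel counts and the 224x224 input are assumptions for illustration (the patent fixes only the 5 conv + 3 FC layout).

```python
def conv_out(size: int, kernel: int, stride: int = 1, pad: int = 0) -> int:
    """Output spatial size of one convolution or pooling step."""
    return (size + 2 * pad - kernel) // stride + 1

def trace_feature_map(size: int = 224) -> int:
    """Trace one spatial dimension through 5 conv layers (AlexNet-style,
    assumed hyperparameters) before the 3 fully connected layers."""
    size = conv_out(size, kernel=11, stride=4, pad=2)  # conv1
    size = conv_out(size, kernel=3, stride=2)          # max-pool after conv1
    size = conv_out(size, kernel=5, pad=2)             # conv2
    size = conv_out(size, kernel=3, stride=2)          # max-pool after conv2
    size = conv_out(size, kernel=3, pad=1)             # conv3
    size = conv_out(size, kernel=3, pad=1)             # conv4
    size = conv_out(size, kernel=3, pad=1)             # conv5
    size = conv_out(size, kernel=3, stride=2)          # max-pool after conv5
    return size

# The resulting map (6 x 6 here, with an assumed channel count) would be
# flattened into the first of the 3 fully connected layers; the last FC
# layer would output scores for the 7 program types.
```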
The description of the second embodiment is completed so far.
And 103, counting the program types corresponding to each frame of video image in the continuous N frames of video images according to a preset strategy to obtain the program types of the current television programs.
In this embodiment, the program type corresponding to each frame of video image in consecutive N frames of video images output by the convolutional neural network is obtained, and then the program type corresponding to each frame of video image in the consecutive N frames of video images is counted according to a preset policy to obtain the final program type of the current television program.
The specific type determination method is specifically described by the following examples.
Example III,
In this embodiment, the prediction policy used is a multi-frame policy with two layers. The first-layer multi-frame policy counts, after the convolutional neural network outputs a program type for each of the consecutive N frames, how many frames are assigned to each program type; if one program type has the largest count and that count is greater than or equal to a first preset number, that program type is output as a prediction type. The second-layer multi-frame policy counts, among several consecutive prediction types, how many times each prediction type occurs; if one prediction type has the largest count and that count is greater than or equal to a second preset number, that prediction type is taken as the program type of the current television program.
For example, please refer to fig. 4, an exemplary multi-frame strategy diagram of the present application. Assume the trained convolutional neural network model outputs a program type for each of N consecutive frames; in this embodiment N may be 10, with an interval of 1 second between frames. First, the program types of 10 consecutive frames (frames numbered 1-10 in fig. 4) are obtained, and the number of frames assigned to each program type is counted. If one program type has the largest count and that count is greater than or equal to 9, it is output as the first prediction type (prediction type I); otherwise nothing is output. The program type of the next frame (frame 11) is then obtained, and frame 11 together with the preceding 9 frames forms a new prediction group (frames 2-11), which is evaluated in the same way to yield the second prediction type (prediction type II). Repeating this through frame 14 yields 5 consecutive prediction types, whose occurrences are counted in turn: if one prediction type has the largest count and that count is greater than or equal to 4, it is output as the program type of the television program; otherwise nothing is output.
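The two-layer vote just described can be sketched in Python as follows. The function and variable names are illustrative; the defaults use the values from the example (window of 10 frames, first threshold 9, 5 prediction types, second threshold 4).

```python
from collections import Counter, deque

def majority(types, threshold):
    """Return the most common type if its count reaches threshold, else None."""
    winner, count = Counter(types).most_common(1)[0]
    return winner if count >= threshold else None

def classify_program(frame_types, n=10, first_thresh=9,
                     groups=5, second_thresh=4):
    """Two-layer multi-frame vote over per-frame CNN outputs.

    Layer 1: slide a window of n frames; emit a prediction type when at
    least first_thresh frames in the window agree.
    Layer 2: over groups consecutive prediction types, emit the final
    program type when at least second_thresh of them agree.
    """
    window = deque(maxlen=n)
    predictions = deque(maxlen=groups)
    for frame_type in frame_types:
        window.append(frame_type)
        if len(window) < n:
            continue                      # not enough frames yet
        pred = majority(window, first_thresh)
        if pred is None:
            continue                      # layer 1 produced no output
        predictions.append(pred)
        if len(predictions) == groups:
            final = majority(predictions, second_thresh)
            if final is not None:
                return final              # stable program type found
    return None  # no stable decision from this frame sequence
```

With all 14 frames classified as the same type, the vote resolves at frame 14; a single outlier frame (as in the sports-news example) is absorbed by the 9-of-10 threshold.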
Without the multi-frame strategy, keeping the image quality display stable requires the image quality parameter not to change over 10 consecutive frames; the probability of this is p^10, where p < 1 is the probability of correct single-frame identification.
With the first-layer multi-frame strategy, the probability that the image quality parameter stays stable is p^10 + 10*p^9*(1-p) = p^9*(10-9p). Since p^9*(10-9p) > p^10, the first-layer multi-frame strategy improves stability.
Similarly, writing p1 = p^9*(10-9p), the probability with the second-layer multi-frame strategy is p1^5 + 5*p1^4*(1-p1) = p1^4*(5-4p1) > p1^5, so the image quality parameter is even more likely to stay stable.
As for accuracy, with a single-frame identification accuracy of p = 93%, the multi-frame strategy raises the accuracy to 99%.
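The stability expressions above can be checked numerically with a small sketch (function names are illustrative):

```python
def layer1_stable(p: float) -> float:
    """P(at least 9 of 10 single-frame results agree):
    p^10 + 10*p^9*(1-p), which simplifies to p^9*(10 - 9p)."""
    return p ** 9 * (10 - 9 * p)

def layer2_stable(p: float) -> float:
    """Apply the >= 4-of-5 vote to the layer-1 outputs:
    p1^5 + 5*p1^4*(1-p1) = p1^4*(5 - 4*p1), with p1 = layer1_stable(p)."""
    p1 = layer1_stable(p)
    return p1 ** 4 * (5 - 4 * p1)

p = 0.93  # single-frame identification accuracy from the description
```

Both inequalities from the description (p^9*(10-9p) > p^10 and p1^4*(5-4p1) > p1^5) hold for any 0 < p < 1.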
The first preset number of frames can be selected to be 9 frames in the first layer, and the second preset number of frames can be selected to be 4 frames in the second layer, so that the accuracy and the stability can be better.
Because television program scenes are highly complex yet continuous over long stretches of time, the multi-frame strategy combines individual recognition results, improving identification accuracy while avoiding frequent switching.
The description of the third embodiment is completed so far.
As an embodiment, if N in the consecutive N frames of video images is greater than or equal to a first threshold value, and the program type corresponding to the current television program is not determined, obtaining a next frame of video image; and if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a second threshold value, stopping acquiring the video images, and after a first time interval, starting to acquire the video images.
For example, if the first threshold is 15 frames and the second threshold is 30 frames, the above scheme is:
when 15 frames of video images have been identified without an output result, the next frame (frame 16) is acquired and its program type obtained; frame 16 together with the preceding 9 frames (frames 7-15) forms a new prediction group of 10 frames (frames 7-16), which is evaluated by the method above. This continues frame by frame up to frame 30. If frame 30 still yields no recognition result, the recognition is considered abnormal; since the television program type is now unclear, acquisition pauses and resumes after the first time interval (for example, 30 s), repeating until a result can be recognized.
As an embodiment, if the program type of the current television program is the same as the program type obtained the last time, acquisition of video images stops and resumes after a second time interval; if the currently obtained program type matches the previous two results, acquisition stops and resumes after a third, longer time interval. For example, once the multi-frame strategy identifies the program type, recognition stops and video images are acquired again after the second time interval (for example, 30 s); if the second result matches the previous one, the interval is extended to 1 min, and so on, gradually lengthening the interval. Whenever two consecutive results disagree, the interval starts growing again from 30 s.
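The lengthening-interval mechanism can be sketched as follows. The description fixes only the first steps (30 s, then 1 min when two consecutive results agree, and a reset to 30 s on disagreement); the doubling beyond 1 min and the function name are assumptions for illustration.

```python
def next_check_interval(prev_interval: int, same_as_last: bool) -> int:
    """Seconds to wait before the next recognition pass (assumed
    doubling schedule)."""
    if not same_as_last:
        return 30                # results disagree: restart from 30 s
    return prev_interval * 2     # results agree: 30 s -> 1 min -> ...
```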
Because the convolutional neural network's computation load in identifying the television program type is large, running identification continuously would occupy a large amount of television memory; the interval mechanism above therefore reduces this continuous memory occupation.
At present, the image quality parameters of television programs are adjusted mainly according to the station logo and EPG (electronic program guide) information, which cannot guarantee high accuracy; for some live programs, either a standard image quality setting is used throughout or the user adjusts the parameters manually, so the error rate is high or manual operation is required.
According to the method and device of the present application, once the television program type is determined, the image quality can be adjusted automatically according to that type. This improves the stability of image display, prevents the image quality parameters from being changed frequently due to occasional recognition errors, and improves the viewing experience of the audience.
Therefore, the method and device of the present application can acquire N consecutive frames of video images from the current television program; input the N consecutive frames into a pre-trained convolutional neural network to obtain the program type corresponding to each of the N frames; and then apply a preset strategy to the per-frame program types to obtain the program type of the current television program. Compared with the prior art, the method uses a pre-trained convolutional neural network to identify the program type of each input frame as a preliminary prediction, performs a statistical judgment on the per-frame results according to a multi-frame strategy, and finally determines the program type of the television program.
Corresponding to the embodiment of the television program type identification method described above, the present application also provides an embodiment of a television program type identification device.
Referring to fig. 5, which is a block diagram of an embodiment of a television program type identification apparatus of the present application, the apparatus 50 may include:
an obtaining unit 51, configured to obtain consecutive N frames of video images in a current television program;
an input unit 52, configured to input the consecutive N frames of video images into a pre-trained convolutional neural network, and obtain a program type corresponding to each frame of video image in the consecutive N frames of video images output by the convolutional neural network;
and the determining unit 53 is configured to count the program type corresponding to each frame of video image in the consecutive N frames of video images according to a preset policy to obtain the program type of the current television program.
As an embodiment, the apparatus further comprises:
a training unit 54 for classifying a plurality of program types for a television program; acquiring a video sample corresponding to each program type; extracting image characteristic data of each frame of image in the video sample as training data; and inputting the training data into a convolutional neural network for training to obtain a convolutional neural network model.
As an embodiment, the determining unit 53 is specifically configured to count, among the program types output by the convolutional neural network for the N consecutive frames, the number of frames having each program type; if one program type has the largest count and that count is greater than or equal to a first preset number, output that program type as a prediction type; then count, among a plurality of consecutive prediction types, the number of occurrences of each prediction type; and if one prediction type has the largest count and that count is greater than or equal to a second preset number, take that prediction type as the program type of the current television program.
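The two-stage statistical judgment performed by the determining unit can be sketched as two nested majority votes. This is an illustrative reconstruction: the threshold values (`first_preset`, `second_preset`) and the function names are assumptions, since the patent only requires that the winning count be the maximum and reach a preset number.

```python
from collections import Counter

def majority(labels, min_count):
    """Return the most frequent label if it occurs at least min_count
    times, otherwise None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_count else None

def program_type(groups, first_preset=6, second_preset=3):
    """groups: list of prediction groups, each a list of per-frame CNN
    outputs for N consecutive frames.

    Stage 1: each group votes; the winner becomes that group's
    prediction type only if it reaches first_preset.
    Stage 2: the consecutive prediction types vote again; the winner
    becomes the final program type only if it reaches second_preset.
    Returns None if neither vote produces a result.
    """
    predictions = [majority(g, first_preset) for g in groups]
    predictions = [p for p in predictions if p is not None]
    return majority(predictions, second_preset)
```

A split 5-vs-5 group fails the first vote, so no prediction type is emitted for it; this is what triggers the sliding-window fallback described in the method embodiment.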
As an embodiment, the apparatus further comprises:
a first stopping unit 55, configured to acquire the next frame of video image if N in the N consecutive frames of video images reaches a first threshold without the program type of the current television program having been determined; and, if N reaches a second threshold without the program type having been determined, to stop acquiring video images and resume acquisition after a first time interval.
As an embodiment, the apparatus further comprises:
a second stopping unit 56, configured to stop acquiring video images if the program type of the current television program obtained by the obtaining unit is the same as the program type obtained last time, and to resume acquisition after a second time interval; and, if the currently obtained program type is the same as the program types of the last two acquisitions, to stop acquiring video images and resume acquisition after waiting a third time interval.
In summary, the present application can acquire N consecutive frames of video images from the current television program; input the N consecutive frames into a pre-trained convolutional neural network to obtain the program type corresponding to each of the N frames; and then apply a preset strategy to the per-frame program types to obtain the program type of the current television program. Compared with the prior art, the method uses a pre-trained convolutional neural network to identify the program type of each input frame as a preliminary prediction, performs a statistical judgment on the per-frame results according to a multi-frame strategy, and finally determines the program type of the television program.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
Corresponding to the embodiments of the television program type identification method, the present application also provides embodiments of a computer device for executing the method.
Referring to fig. 6, as one embodiment, a computer device includes a processor 61, a communication interface 62, a memory 63, and a communication bus 64;
the processor 61, the communication interface 62 and the memory 63 are in communication with each other through the communication bus 64;
the memory 63 is used for storing computer programs;
the processor 61 is configured to execute the computer program stored in the memory 63, and when the processor 61 executes the computer program, any step of the above television program type identification method is implemented.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the embodiment of the computer device, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (8)

1. A method for identifying a type of a television program, the method comprising:
acquiring continuous N frames of video images in a current television program;
inputting the continuous N frames of video images into a pre-trained convolutional neural network, and acquiring a program type corresponding to each frame of video image in the continuous N frames of video images output by the convolutional neural network;
counting, for each program type among the program types corresponding to the N consecutive frames of video images output from the convolutional neural network, the number of frames having that program type; if the number of frames having one program type is the largest and is greater than or equal to a first preset number, outputting that program type as a prediction type;
counting, for each prediction type among a plurality of consecutive output prediction types, the number of occurrences of that prediction type; and if the number of occurrences of one prediction type is the largest and is greater than or equal to a second preset number, taking that prediction type as the program type of the current television program.
2. The method of claim 1, wherein the method of training the convolutional neural network comprises:
dividing a television program into a plurality of program types;
acquiring a video sample corresponding to each program type;
extracting image characteristic data of each frame of image in the video sample as training data;
and inputting the training data into a convolutional neural network for training to obtain a convolutional neural network model.
3. The method of claim 1, further comprising:
if N in the continuous N frames of video images is larger than or equal to a first threshold value, and the program type corresponding to the current television program is not determined, acquiring the next frame of video image;
and if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a second threshold value, stopping acquiring the video images, and after a first time interval, starting to acquire the video images.
4. The method of claim 1, further comprising:
if the program type of the current television program is the same as the program type of the television program obtained last time, stopping obtaining the video image, and after a second time interval, starting obtaining the video image; and if the currently acquired program type is the same as the program types of the last two times, stopping acquiring the video image, and after waiting for a third time interval, starting to acquire the video image.
5. An apparatus for identifying a type of a television program, the apparatus comprising:
the acquisition unit is used for acquiring continuous N frames of video images in the current television program;
the input unit is used for inputting the continuous N frames of video images into a pre-trained convolutional neural network and acquiring the program type corresponding to each frame of video image in the continuous N frames of video images output by the convolutional neural network;
the determining unit is used for counting, for each program type among the program types corresponding to the N consecutive frames of video images output from the convolutional neural network, the number of frames having that program type; if the number of frames having one program type is the largest and is greater than or equal to a first preset number, outputting that program type as a prediction type; counting, for each prediction type among a plurality of consecutive output prediction types, the number of occurrences of that prediction type; and if the number of occurrences of one prediction type is the largest and is greater than or equal to a second preset number, taking that prediction type as the program type of the current television program.
6. The apparatus of claim 5, further comprising:
the training unit is used for dividing the television program into a plurality of program types; acquiring a video sample corresponding to each program type; extracting image characteristic data of each frame of image in the video sample as training data; and inputting the training data into a convolutional neural network for training to obtain a convolutional neural network model.
7. The apparatus of claim 5, further comprising:
the first stopping unit is used for acquiring a next frame of video image if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a first threshold value; and if the program type corresponding to the current television program is not determined when N in the continuous N frames of video images is larger than or equal to a second threshold value, stopping acquiring the video images, and after a first time interval, starting to acquire the video images.
8. The apparatus of claim 5, further comprising:
the second stopping unit is used for stopping acquiring the video image if the program type of the current television program is the same as the program type of the television program acquired last time, and then starting acquiring the video image after a second time interval; and if the currently acquired program type is the same as the program types of the last two times, stopping acquiring the video image, and after waiting for a third time interval, starting to acquire the video image.
CN201810821306.1A 2018-07-24 2018-07-24 Television program type identification method and device Active CN108985244B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810821306.1A CN108985244B (en) 2018-07-24 2018-07-24 Television program type identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810821306.1A CN108985244B (en) 2018-07-24 2018-07-24 Television program type identification method and device

Publications (2)

Publication Number Publication Date
CN108985244A CN108985244A (en) 2018-12-11
CN108985244B true CN108985244B (en) 2021-10-15

Family

ID=64550039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810821306.1A Active CN108985244B (en) 2018-07-24 2018-07-24 Television program type identification method and device

Country Status (1)

Country Link
CN (1) CN108985244B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800919A (en) * 2021-01-21 2021-05-14 百度在线网络技术(北京)有限公司 Method, device and equipment for detecting target type video and storage medium
CN115996300A (en) * 2021-10-19 2023-04-21 海信集团控股股份有限公司 Video playing method and electronic display device
CN115119013B (en) * 2022-03-26 2023-05-05 浙江九鑫智能科技有限公司 Multi-level data machine control application system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101807284A (en) * 2010-03-16 2010-08-18 许祥鸿 Service data retrieval method of Internet television
CN104866843A (en) * 2015-06-05 2015-08-26 中国人民解放军国防科学技术大学 Monitoring-video-oriented masked face detection method

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160094812A1 (en) * 2014-09-30 2016-03-31 Kai Chen Method And System For Mobile Surveillance And Mobile Infant Surveillance Platform
US10572735B2 (en) * 2015-03-31 2020-02-25 Beijing Shunyuan Kaihua Technology Limited Detect sports video highlights for mobile computing devices
CN106228580B (en) * 2016-07-29 2019-03-05 李铮 A kind of material detection, power-economizing method and system based on video analysis
CN106297331B (en) * 2016-08-29 2019-05-14 安徽科力信息产业有限责任公司 The method and system of crossing motor vehicles parking number is reduced using plane cognition technology
CN107194419A (en) * 2017-05-10 2017-09-22 百度在线网络技术(北京)有限公司 Video classification methods and device, computer equipment and computer-readable recording medium
CN107798313A (en) * 2017-11-22 2018-03-13 杨晓艳 A kind of human posture recognition method, device, terminal and storage medium
CN108280406A (en) * 2017-12-30 2018-07-13 广州海昇计算机科技有限公司 A kind of Activity recognition method, system and device based on segmentation double-stream digestion
CN108259990B (en) * 2018-01-26 2020-08-04 腾讯科技(深圳)有限公司 Video editing method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101807284A (en) * 2010-03-16 2010-08-18 许祥鸿 Service data retrieval method of Internet television
CN104866843A (en) * 2015-06-05 2015-08-26 中国人民解放军国防科学技术大学 Monitoring-video-oriented masked face detection method

Also Published As

Publication number Publication date
CN108985244A (en) 2018-12-11

Similar Documents

Publication Publication Date Title
US10643074B1 (en) Automated video ratings
US9137562B2 (en) Method of viewing audiovisual documents on a receiver, and receiver for viewing such documents
CN112312231B (en) Video image coding method and device, electronic equipment and medium
CN108985244B (en) Television program type identification method and device
CN110839129A (en) Image processing method and device and mobile terminal
EP2445205B1 (en) Detection of transitions between text and non-text frames in a videostream
US9224048B2 (en) Scene-based people metering for audience measurement
CN111861572B (en) Advertisement putting method and device, electronic equipment and computer readable storage medium
CN112445935B (en) Automatic generation method of video selection collection based on content analysis
CN109698957B (en) Image coding method and device, computing equipment and storage medium
CN109922334A (en) A kind of recognition methods and system of video quality
CN105704559A (en) Poster generation method and apparatus thereof
CN111372116B (en) Video playing prompt information processing method and device, electronic equipment and storage medium
WO2022087826A1 (en) Video processing method and apparatus, mobile device, and readable storage medium
CN112653918B (en) Preview video generation method and device, electronic equipment and storage medium
CN104320670A (en) Summary information extracting method and system for network video
CN111405339A (en) Split screen display method, electronic equipment and storage medium
CN115396705A (en) Screen projection operation verification method, platform and system
CN114302226B (en) Intelligent cutting method for video picture
CN110099298B (en) Multimedia content processing method and terminal equipment
CN112528748B (en) Method for identifying and intercepting static slide from video
CN109788311B (en) Character replacement method, electronic device, and storage medium
CN116682035A (en) Method, device, equipment and program product for detecting high-frame-rate video defects
CN113542909A (en) Video processing method and device, electronic equipment and computer storage medium
CN111444822A (en) Object recognition method and apparatus, storage medium, and electronic apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant