FR2853793A1

FR2853793A1 - Video sequence plan transformation detection process for video sequence indexing, involves comparing peak signal-to-noise ratio coefficient with threshold to decide existence of fading in video sequence

Info

Publication number: FR2853793A1
Application number: FR0304435A
Authority: FR
Inventors: Nathalie Laurent; Mariette Maurizot
Original assignee: France Telecom SA
Current assignee: Orange SA
Priority date: 2003-04-09
Filing date: 2003-04-09
Publication date: 2004-10-15
Anticipated expiration: 2023-04-09
Also published as: FR2853793B1

Abstract

The process involves obtaining a continuous field of motion vectors representing forward motion from a current image towards a next image of a video sequence. A peak signal-to-noise ratio (PSNR) coefficient is calculated, from this field, between the next image and a compensated next image, the latter being obtained from the current image and the motion vector field. The coefficient is compared with a threshold to decide whether a fade exists. Independent claims are also included for the following: (1) a computer program including program code instructions for executing the shot change detection process in a video sequence; (2) a video sequence indexing process including a shot change detection process; (3) a device for detecting a shot change in a video sequence formed of successive images.

Description

Method and device for detecting a shot change in a video sequence, and corresponding computer program and indexing method.

The field of the invention is that of video sequences. More specifically, the invention relates to detecting a shot change in a video sequence.

In general, two types of shot change are distinguished: cuts and fades.

A cut is an abrupt shot change. A fade is defined as a gradual transition from one shot to another between instants i and i+t, by progressively substituting the first image of the second shot for the last image of the first shot. In other words, a fade is characterized by a technique for gradually replacing one shot with another, the gradual disappearance of the first bringing about the gradual appearance of the second.

The invention has many applications, such as the indexing of a video sequence. The growing number of audiovisual documents available in digital format raises problems in managing these volumes of data. Making these documents available to a large number of users requires automatic or semi-automatic indexing. The first step of such indexing is to split the video into shots, in order to obtain a summary of the document that gives an overall view, to provide entry points for databases, and to handle content of smaller volume. Such splitting can also be used at the input of a video encoder.

Shot change detection amounts to searching for discontinuities in the transformed video signal. The change in visual content between two consecutive images of the same shot is due either to the motion of objects and/or the camera, or to a change in illumination. Features that are insensitive to motion and to abrupt illumination changes must therefore be extracted. The most common features are based on color/luminance histograms, on theoretical modeling of the signal transformation caused by the transition, on the motion estimation error, or on the use of data from the MPEG file of the video (see [Gargi2000, Lupatini1999] for a comparative state of the art).

Various shot change detection techniques based on the above features are discussed below.

Shot change detection techniques based on direct comparison, or on color/luminance histograms, are known. Some of these techniques apply double thresholding to the mean of the luminance variations between images k and k+1. The low threshold characterizes cuts; the high threshold detects fades.

In others of these techniques, the double thresholding is applied to the number of points whose luminance variation between k and k+1 is large. In yet another variant, double thresholding is applied to the temporal difference of the global or local color histogram. For the local histogram, the image is divided into 16 blocks on which the difference is computed, and the value of the feature is the sum of the 8 smallest values. This method makes the feature robust to local motion.
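The local histogram feature described above can be illustrated with a minimal Python sketch. The function names, the 8-bin luminance histogram, and the use of a sum of absolute bin differences as the per-block "difference" are assumptions made for illustration; the text only specifies 16 blocks and the sum of the 8 smallest values.

```python
def histogram(block, bins=8, max_value=256):
    """Luminance histogram of a block given as a list of rows of ints."""
    counts = [0] * bins
    for row in block:
        for value in row:
            counts[value * bins // max_value] += 1
    return counts


def split_blocks(image, grid=4):
    """Split an image into grid x grid blocks (16 blocks when grid is 4)."""
    height, width = len(image), len(image[0])
    bh, bw = height // grid, width // grid
    return [
        [row[bx * bw:(bx + 1) * bw] for row in image[by * bh:(by + 1) * bh]]
        for by in range(grid)
        for bx in range(grid)
    ]


def local_histogram_feature(img_a, img_b, grid=4, keep=8):
    """Sum of the `keep` smallest per-block histogram differences."""
    diffs = []
    for block_a, block_b in zip(split_blocks(img_a, grid), split_blocks(img_b, grid)):
        hist_a, hist_b = histogram(block_a), histogram(block_b)
        # Assumed distance: sum of absolute bin differences.
        diffs.append(sum(abs(x - y) for x, y in zip(hist_a, hist_b)))
    return sum(sorted(diffs)[:keep])
```

Keeping only the smallest half of the block differences means that a change confined to a few blocks (local motion) does not affect the feature, whereas a change covering the whole image does.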

It is recalled that the double-thresholding detection technique detects cuts and fades in a function f(k, l) of the feature under study (luminance difference, histogram difference, etc.) by using a high threshold Sh and a low threshold Sb as follows [Zhang1993]:
- if the value of the function f(k, k+1) is less than Sb, a cut is detected at instant k;
- if the value is only less than Sh, the current image is a candidate for the first image of a gradual transition. The feature function is then computed between the current image Ik and the successive images Ik+j: f(k, k+j). A gradual transition between image Ik and image Ik+n is actually detected if f(k, k+n) falls below Sb and if the values of f(k, k+j) for 0 < j < n are all less than Sh.

Shot change detection techniques based on modeling the transition are also known. A fade is modeled by the combination of two functions applied to the chrominance of the two shots on either side of the fade, the first function being decreasing and the second increasing. Since this model cannot be applied outside a transition, the discontinuity sought is characterized by studying the first and second derivatives of the time series of the chrominance variations. This method requires certain parameters to be set heuristically. The fade model can be learned by a neural network system trained on randomly constructed transitions (the increasing and decreasing functions are random); the estimated detection thresholds are then used to parameterize the previous method. More simply, the temporal behavior of the variance of the luminance difference can be studied. During a fade, the variance of the intensity has a parabolic behavior. Transition detection then amounts to detecting the parabolic portions (width and depth) in the temporal curve of the measured variances. This detection is, however, made difficult by the variations caused by motion.
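The double-thresholding rule recalled above [Zhang1993] can be sketched as follows. This is only an illustration under stated assumptions: `f` is any caller-supplied feature function, and `max_span` (a cap on how far a candidate transition is followed) is a hypothetical parameter that does not appear in the text.

```python
def find_gradual_end(f, k, num_images, s_low, s_high, max_span):
    """Return n such that f(k, k+n) < s_low while every earlier value stays below s_high."""
    for n in range(2, max_span + 1):
        if k + n >= num_images:
            return None
        value = f(k, k + n)
        if value < s_low:
            return k + n
        if value >= s_high:
            return None  # the candidate is abandoned
    return None


def twin_threshold(f, num_images, s_low, s_high, max_span=30):
    """Scan a sequence and report cuts and gradual transitions."""
    events = []
    k = 0
    while k < num_images - 1:
        value = f(k, k + 1)
        if value < s_low:
            events.append(("cut", k, k + 1))
        elif value < s_high:
            # Image k is a candidate first image of a gradual transition.
            end = find_gradual_end(f, k, num_images, s_low, s_high, max_span)
            if end is not None:
                events.append(("fade", k, end))
                k = end
                continue
        k += 1
    return events
```

For example, with a toy similarity `f(a, b) = 100 - abs(level[a] - level[b])` over per-image brightness levels, an abrupt jump in level is reported as a cut and a slow ramp as a fade.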

Shot change detection techniques based on MPEG videos are also known. They consist in computing statistics on the correlation of the DCT coefficients and on the number of prediction vectors. The general idea is that if there are many motion vectors, an image can be predicted by motion compensation alone, and there is therefore no shot change. Conversely, if there are few motion vectors, there is a shot change. These methods depend on the quality of the codec.

Still other known shot change detection techniques are based on the study of motion. In most of these techniques, motion is estimated using a block-matching algorithm between the images at instants k and k+1. Motion estimation improves the luminance-comparison methods, by using as the feature:
- the sum of the absolute values of the luminance differences between the image and the compensated image [Shahraray1995], or
- the sum of the absolute values of the differences of the mean luminance in each block between the image and the compensated image [Hanjalic2002].

Some methods use the blocks of the block matching as follows:
- by counting the number of blocks for which the absolute value of the luminance differences between the image and the compensated image is greater than a certain threshold [Zhang1995],
- or by computing the ratio between the number of blocks whose mean intensity increases (respectively decreases) over 3 consecutive images and the total number of blocks (in order to recover the parabolic evolution of the intensity variation during a gradual, fade-type transition) [Yazdi2002].

Another known shot change detection technique is based on the variation in the size of the support of the dominant motion estimate [Bouthemy1999]. The 2D affine motion is estimated using a robust multi-resolution algorithm. Within a single shot, the size of the support of the dominant motion varies little over time. Using the dominant motion estimated between images t-1 and t, the expected support of the dominant motion between images t and t+1 can be estimated. The feature used for shot change detection is the ratio between the actual size of the support of the dominant motion estimate between images t and t+1 and its predicted size. Significant jumps in this feature are determined using a Hinkley test (cumulative sum test).

Prior shot change detection techniques using motion compensation are based on a block-matching algorithm. However, this algorithm correctly estimates only motions close to a translation, so the detector becomes sensitive to rotations and to occlusions between objects.

Moreover, these techniques do not take the dominant motion into account, which generates further false alarms caused by large camera movements.

It should also be noted that, to date, the double-thresholding technique has been used only with thresholds that remain fixed throughout the video sequence. These thresholds are either chosen heuristically by the user or obtained from a preliminary study of the video. The first case requires prior knowledge of the video quality, as well as a constant quality throughout the video, and typically generates false alarms if the high and low thresholds are too high, or missed detections if the high and low thresholds are too low. The thresholds are obviously better when they are obtained from a preliminary study of the video, but such a study is costly in computation and in time.

It is recalled that shot change detection techniques based on modeling the transition have the drawback of being sensitive to the variations caused by motion.

Shot change detection techniques based on direct comparison, or on color/luminance histograms, are very sensitive to light variations (flashes) and to strong motion. They therefore generate many false alarms.

Shot change detection techniques based on MPEG videos have the same drawbacks as those based on motion estimation by block matching.

Shot change detection techniques based on the variation in the size of the support of the dominant motion estimate are sensitive to occlusions between objects and to overlaps when large objects are in motion.

An objective of the invention is in particular to overcome the various drawbacks of the prior art.

More specifically, one objective of the present invention is to provide a shot change detection technique (method and device) that performs better than the prior art techniques discussed above.

Another objective of the invention is to provide such a technique that has little or no sensitivity to motion, in particular rotations or occlusions between objects within a shot.

Another objective of the invention is to provide such a technique that reduces false alarms and missed detections of shot changes.

A further objective of the invention is to provide such a technique that, in a particular embodiment, uses double thresholding without requiring a preliminary study of the video to optimize the choice of thresholds.

Yet another objective of the invention is to provide such a technique that is independent of the quality of the codec.

Another objective of the invention is to provide such a technique that is inexpensive and simple to implement.

These various objectives, as well as others that will appear below, are achieved according to the invention by a method for detecting a shot change in a video sequence formed of successive images, characterized in that it comprises a coarse fade detection step comprising the following steps:
- obtaining a continuous field of motion vectors representing forward motion from a current image to a next image of the sequence;
- computing, from said motion vector field, a first resemblance coefficient between the next image and a compensated next image obtained from the current image and the motion vector field;
- making an intermediate decision on whether or not a fade exists, by comparing said first resemblance coefficient with at least one threshold.
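To make the notion of a compensated next image concrete, here is a minimal Python sketch that builds one from a current image and a dense forward motion field. It is a simplification and not the method of the invention: vectors are rounded to whole pixels, pixels are simply copied to their displaced positions, and positions that receive no pixel keep the value of the current image. The name `compensate_forward` and these hole-filling choices are assumptions.

```python
def compensate_forward(current, field):
    """Predict the next image by moving each pixel of `current` along its vector.

    `current` is a list of rows of luminance values.
    `field[y][x]` is the (dx, dy) forward motion vector of pixel (x, y).
    """
    height, width = len(current), len(current[0])
    # Assumption: positions not reached by any vector keep the current value.
    predicted = [row[:] for row in current]
    for y in range(height):
        for x in range(width):
            dx, dy = field[y][x]
            tx, ty = x + round(dx), y + round(dy)
            if 0 <= tx < width and 0 <= ty < height:
                predicted[ty][tx] = current[y][x]
    return predicted
```

The resemblance coefficient is then computed between this predicted image and the actual next image: within a shot the prediction is good, while across a shot change it is poor.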

The method according to the invention is therefore based on the temporal study of a resemblance coefficient (preferably the PSNR) derived from the estimation of a motion vector field (estimated between two successive images, but also between temporally separated images). Shot change detection is performed on the fly.

Preferably, the first resemblance coefficient is a PSNR coefficient.

It is recalled that a PSNR (peak signal-to-noise ratio) coefficient is a well-known tool for measuring the resemblance between an image and a reference image.
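For reference, the usual definition of the PSNR is 10·log10(peak² / MSE), where MSE is the mean squared error between the two images and peak is the maximum pixel value. A minimal Python sketch follows, assuming 8-bit luminance images stored as lists of rows; the function name and the image representation are illustrative choices.

```python
import math


def psnr(reference, image, peak=255):
    """Peak signal-to-noise ratio, in dB, between two images of equal size."""
    squared_error = 0
    count = 0
    for ref_row, img_row in zip(reference, image):
        for ref_value, img_value in zip(ref_row, img_row):
            squared_error += (ref_value - img_value) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak * peak / mse)
```

A higher PSNR means the two images are more alike, so a drop in the PSNR between the next image and its motion-compensated prediction indicates that motion alone does not explain the change.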

Advantageously, an intermediate decision that a fade exists is taken if the first resemblance coefficient is greater than a first threshold and less than a second threshold.

Advantageously, the second threshold is a moving threshold, equal to:
- a mean of a plurality of first resemblance coefficients computed from motion vector fields representing motion for a plurality of pairs of images contained in a sliding window of images of the sequence,
- minus a predetermined number.
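The band test with a moving high threshold can be sketched as below. The class name, the parameter names, and the choice to compute the mean over the coefficients preceding the current one are assumptions made for illustration; the text does not say whether the current coefficient belongs to the window.

```python
from collections import deque


class FadeCandidateDetector:
    """Flags a fade candidate when low < PSNR < (sliding mean - margin)."""

    def __init__(self, low_threshold, window_size, margin):
        self.low_threshold = low_threshold  # first (fixed) threshold
        self.margin = margin                # the "predetermined number"
        self.history = deque(maxlen=window_size)

    def update(self, psnr_value):
        """Process one coefficient and return the intermediate decision."""
        if self.history:
            # Second (moving) threshold: mean over the window minus the margin.
            high_threshold = sum(self.history) / len(self.history) - self.margin
            is_fade = self.low_threshold < psnr_value < high_threshold
        else:
            is_fade = False  # no reference level yet
        self.history.append(psnr_value)
        return is_fade
```

Because the high threshold follows the recent level of the coefficient, the same drop is judged relative to the local quality of the video rather than against a fixed value.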

Using a moving high threshold makes fade detection more robust and eliminates false alarms.

According to an advantageous feature, the sliding window comprises between 20 and ... images.

Preferably, said step of obtaining a vector field uses a motion estimation belonging to the group comprising:
- a motion estimation based on block matching;
- a motion estimation based on a hierarchical mesh.

In an advantageous embodiment of the invention, the method further comprises a step of confirming an intermediate decision that a fade exists, aimed at eliminating false fade detections and comprising the following steps:
- obtaining a continuous field of motion vectors representing forward motion from a current image to a next image of the sequence;
- analyzing the vector field, so as to determine whether or not the dominant motion belongs to one of a plurality of motion categories each corresponding to a false fade detection.

Taking the dominant motion into account avoids generating false alarms caused by large camera movements.

Advantageously, said plurality of motion categories each corresponding to a false fade detection comprises strong global motions belonging to one of the following categories:
- zooms or divergences;
- translations or tracking shots;
- rotations;
- hyperbolic motions.

Preferably, said step of obtaining a vector field uses a motion estimation based on a hierarchical mesh.

A hierarchical mesh with a coarse or fine hierarchy is used, depending on the embodiment (coarse hierarchy in the first variant discussed below and fine hierarchy in the second variant).

In a first variant of the above advantageous embodiment of the invention, said step of analyzing the vector field comprises the following steps:
- computing, from said motion vector field, a second resemblance coefficient between the next image and a compensated next image obtained from the current image and the motion vector field;
- comparing said second resemblance coefficient with a third threshold, a false fade detection being decided if the second resemblance coefficient is greater than the third threshold.

Thus, in this first variant, false fade detections are eliminated by using a resemblance coefficient (preferably a PSNR) derived from an estimation of the motion vector field.

Advantageously, the second and third thresholds are identical.

Advantageously, said step of analyzing the vector field comprises a step of analyzing the collinearity of the motion vectors, so as to detect a tracking shot constituting a false fade detection.

Advantageously, said step of analyzing the vector field comprises the following steps:
- obtaining a continuous motion vector field representative of a backward movement, from the next image towards the current image of the sequence;
- calculating a third resemblance coefficient, from said motion vector field, between the current image and a compensated current image obtained from the next image and the motion vector field;
- comparing said third resemblance coefficient with said third threshold, a false fade detection being decided if the third resemblance coefficient is greater than the third threshold;
- comparing the difference between said second resemblance coefficient and said third resemblance coefficient with a fourth threshold, a false fade detection being decided if said difference is greater than the fourth threshold.

According to an advantageous characteristic, said step of analyzing the vector field comprises a step of analyzing the collinearity of the backward motion vectors, from the next image towards the current image of the sequence, so as to detect a tracking shot constituting a false fade detection.

Preferably, the second resemblance coefficient and the third resemblance coefficient are PSNR coefficients.
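As an illustration of the resemblance coefficient, the following minimal sketch computes a PSNR between an image and its motion-compensated counterpart. The function name `psnr`, the list-of-rows image representation and the peak value of 255 (8-bit grayscale) are assumptions made for this example; the description does not specify them.

```python
import math


def psnr(reference, compensated, peak=255.0):
    """Return the PSNR in dB between two equal-sized grayscale images.

    Images are lists of rows of pixel values. The peak value of 255
    assumes 8-bit samples, which is an assumption of this sketch.
    """
    total = 0.0
    count = 0
    for ref_row, comp_row in zip(reference, compensated):
        for r, c in zip(ref_row, comp_row):
            total += (r - c) ** 2
            count += 1
    mse = total / count  # mean squared compensation error
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

A high value means the compensated image predicts the actual image well; a low value means the two images are dissimilar.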

In a second variant of the aforementioned advantageous embodiment of the invention, said step of analyzing the vector field comprises the following steps:
- determining the type of homothetic deformation characterizing the dominant forward movement, from the current image towards the next image;
- deciding whether or not there is a false fade detection, according to whether or not the determined type of homothetic deformation belongs to one of a plurality of types of homothetic deformation each corresponding to a false fade detection.

Thus, in this second variant, false fade detections are eliminated by using a classification of the dominant affine motion estimated on the vector field (this classification depends on the type of homothetic deformation that characterizes this dominant motion).

Advantageously, said step of analyzing the vector field further comprises the following steps:
- obtaining a continuous motion vector field representative of a backward movement, from the next image towards the current image of the sequence;
- calculating a third resemblance coefficient, from said motion vector field, between the current image and a compensated current image obtained from the next image and the motion vector field;
- comparing said third resemblance coefficient with said third threshold, a false fade detection being decided if the third resemblance coefficient is greater than the third threshold;
- determining the type of homothetic deformation characterizing the dominant backward movement, from the next image towards the current image;
- deciding whether or not there is a false fade detection, according to whether or not the determined type of homothetic deformation belongs to one of a plurality of types of homothetic deformation each corresponding to a false fade detection.

Advantageously, said method comprises a step of fine detection of a fade comprising the following steps:
- storing an image ("debutfondu") considered to be the first image of the fade, after an intermediate decision on the existence of a fade has been taken;
- iterating the following steps for each subsequent image of the sequence:
  * obtaining a continuous motion vector field representative of a forward movement from the first image of the fade towards the next image;
  * calculating a current fourth resemblance coefficient (PSNR4), from said motion vector field, between the next image and a compensated next image obtained from the first image of the fade and the motion vector field;
  * taking a final decision that a fade exists if the following conditions are satisfied:
    - said current fourth resemblance coefficient is less than a fifth threshold;
    - said current fourth resemblance coefficient is less than the fourth resemblance coefficient calculated during the previous iteration;
    - the fade has a length that is neither less than a minimum length nor greater than a maximum length.

Advantageously, the step of fine detection of a fade further comprises iterating the following step for each subsequent image of the sequence: said next image is considered to be the last image of the fade if said current fourth resemblance coefficient is less than a sixth threshold.
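The fine detection conditions can be sketched as follows. This is one possible reading, not the patented implementation: the function name `confirm_fade`, the way the three conditions are combined in a single loop, and the counting of the fade length in frame steps from the stored first image are all assumptions. The default minimum and maximum lengths (4 and 30) are the typical values given for hypothesis H2.

```python
def confirm_fade(psnr4_values, fifth_threshold, sixth_threshold,
                 min_length=4, max_length=30):
    """Return the fade length in frame steps, or None if no fade is confirmed.

    psnr4_values[k - 1] is assumed to hold PSNR4 computed between the
    stored first image of the fade and the image k steps later.
    """
    previous = float("inf")
    for k, current in enumerate(psnr4_values, start=1):
        if k > max_length:
            return None  # longer than the maximum fade length
        if current >= previous:
            return None  # PSNR4 stopped decreasing
        if current < sixth_threshold:
            # Images are very dissimilar: treat this as the last fade image.
            if k >= min_length and current < fifth_threshold:
                return k
            return None  # shorter than the minimum fade length
        previous = current
    return None  # sequence ended before the fade could be closed
```

For example, with a fifth threshold of 16 and a sixth threshold of 11, the sequence 28, 24, 20, 15, 10 is accepted as a fade spanning five frame steps, whereas a sequence that drops below 11 after only two steps is rejected as too short.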

According to an advantageous characteristic, the method comprises a preliminary step of detecting a cut:
- sharing with said step of coarse detection of a fade said steps of obtaining a continuous motion vector field and of calculating a first resemblance coefficient;
- further comprising a step of deciding that a cut exists if the first resemblance coefficient is less than a seventh threshold.

The invention also relates to a computer program comprising program code instructions for executing the steps of the aforementioned method, when said program is executed on a computer.

The invention also relates to a method of indexing a video sequence, comprising a step of detecting a shot change implementing the aforementioned method.

The invention further relates to a device for detecting a shot change in a video sequence formed of successive images, comprising means for coarse detection of a fade comprising:
- means for obtaining a continuous motion vector field representative of a forward movement from a current image towards a next image of the sequence;
- means for calculating a first resemblance coefficient (PSNR1), from said motion vector field, between the next image and a compensated next image obtained from the current image and the motion vector field;
- means for taking an intermediate decision on whether or not a fade exists, by comparing said first resemblance coefficient with at least one threshold.

Other characteristics and advantages of the invention will become apparent on reading the following description of a preferred embodiment of the invention, given by way of indicative and non-limiting example, and the appended drawings, in which:
- Figure 1A shows a flowchart of a particular embodiment of the method according to the invention;
- Figures 1B and 1C each show a variant of step 5 appearing in Figure 1A (suppression of false fade detections);
- Figure 1D shows in greater detail step 6 appearing in Figure 1A (fine detection of a fade);
- Figure 2 shows an example of the temporal evolution of the PSNR with the threshold curves plotted, two cuts and one fade being perceptible;
- Figure 3 shows an example of the evolution as a function of k of the value of the PSNR calculated from a mesh-based estimation between times t and t+k;
- Figure 4 illustrates an example of a hierarchical mesh structure for the motion;
- Figure 5 illustrates the principle of affine interpolation on a triangular mesh element.

The invention thus relates to a method of detecting a shot change in a video sequence formed of successive images. The general principle of the invention consists in studying the temporal behavior of the PSNR in order to identify behaviors characterizing the presence of the following two types of shot change: cuts and fades.

Shot changes are, most of the time, perceptible in the temporal evolution of the PSNR. At a cut, there is a sudden, instantaneous drop in the PSNR (times 213 and 244 in the example of Figure 2). During a fade (or a page turn), the drop in the PSNR is less abrupt but can be characterized by a slow convergence towards a local minimum of the curve (from time 108 to time 129 in the example of Figure 2).

Moreover, since a fade corresponds to a progressive substitution of two images from different shots between times t and t+k, a decrease (as a function of j) is observed in the PSNR calculated on the motion estimation between times t and t+j (0 < j < k) (see Figure 3).

However, this type of behavior is also characteristic of strong movements and of transitions within an otherwise continuous shot, such as characters entering or leaving the scene. In order to make the method according to the invention more robust, and thus to eliminate this type of false alarm, an automatic moving threshold system has been developed.

The algorithm implemented according to the invention makes it possible to detect the images belonging to a fade, and also to determine the beginning and the end of the fade, represented by the images situated on either side of a cut. The technique used is based on studying the evolution of the PSNR curve. To this end, a number of rules have been constructed based on two characterizations of the estimated motion vector field. The first concerns the temporal evolution of the estimation error on the motion vectors between times t and t+k, where k can take a value between one and the number of images in the sequence under study. The second concerns the characterization of the estimated dominant affine motion.

A particular embodiment of the method according to the invention is now presented, in relation to the flowchart of Figure 1A.

It is assumed that a motion vector field estimator based on block matching, a mesh-based motion vector field estimator (see details in Appendix 2) and a method for estimating and classifying the dominant affine motion are available.

The block matching estimator can be used for the first detection phase. The mesh-based estimator, which is more accurate, is necessary for the fade detection confirmation phase (that is, for the (i, i+k) estimation and for the estimation of the dominant affine motion).

Two heuristic hypotheses (H1 and H2) about the editing are considered.

H1: A fade neither precedes nor follows a cut (that is, cut detection is considered more reliable than fade detection).

H2: A fade has a minimum length and a maximum length, typically 4 and 30 images.

In addition, a set of five rules (R1 to R5) on the PSNR is used for shot change detection.

R1: A cut is an abrupt shot change; it is therefore identified when the PSNR is below a threshold set by the user ("seuilCUT"). By default, this cut detection threshold "seuilCUT" is initialized to 13. It can, however, evolve according to the content of the video sequence: if the PSNR curve increases (respectively decreases) over time, the cut detection threshold "seuilCUT" is incremented (respectively decremented).

R2: A fade is suspected when there is slow convergence towards a local minimum of the PSNR curve, that is, when the PSNRs of the compensated images I'_t to I'_{t+k} obtained by forward and backward estimation lie within a range of values below a moving threshold (explained below) but above the cut detection threshold plus S ("seuilCUT" + S, with S = 3).

R3: The dominant motion that can be estimated between two successive images of the same fade:
R3.1 is not zero but has a small amplitude: the value of the PSNR varies little within the fade;
R3.2 cannot be likened to a camera movement (tracking shot or zoom).

R4: The gradual transition from one shot to another results in the following behavior. During a fade, the PSNRs corresponding to the motion estimation between the images (i, i+1), (i, i+2), ... (i, i+k), ... (i, i+t) decrease as a function of k and become less than a threshold ("seuilCUT" + S).

R5: When a fade is in progress and the PSNR corresponding to the motion estimation between i and i+t is low (less than "seuilCUT" - S', with S' = 2), the images are very dissimilar and the fade is therefore considered finished. Indeed, estimating the motion between I_t and I_{t+k}, which represent respectively the last image of one shot and the first image of a new shot, amounts to detecting an abrupt shot change, that is, a cut. This approach therefore makes it possible to determine the beginning (I_t) and the end (I_{t+k}) of the fade.

The moving threshold ("SEUIL") is obtained by subtracting a value supplied by the user (between 0.5 and 2) from the moving average (typically over 50 values) of the PSNRs calculated on the motion vector estimation between two successive images.
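The moving threshold lends itself to a short sketch. The class name `MovingThreshold` and its method names are hypothetical; the window of 50 values and the user offset between 0.5 and 2 come from the text, and the initial value of 29.5 is the example given for step 1 of the flowchart.

```python
from collections import deque


class MovingThreshold:
    """Moving average of recent PSNR values minus a user-supplied offset."""

    def __init__(self, offset=1.0, window=50, initial=29.5):
        self.offset = offset                 # user value, between 0.5 and 2
        self.values = deque(maxlen=window)   # keeps only the last `window` PSNRs
        self.value = initial                 # threshold before any update

    def update(self, psnr_value):
        """Record the PSNR of the latest image pair and return the new threshold."""
        self.values.append(psnr_value)
        mean = sum(self.values) / len(self.values)
        self.value = mean - self.offset
        return self.value
```

Using a bounded `deque` means older PSNR values drop out automatically, so the threshold follows the local level of the curve rather than the whole sequence.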

The estimated 2D affine dominant motion is generally considered to be the motion of the camera in a video sequence. The classification of the affine dominant motion is used in the present invention to eliminate false alarms due to large camera movements. The estimation of the 2D affine dominant motion (that is, a six-parameter model) from a vector field is conventionally performed by the least squares algorithm. This algorithm minimizes the error between the motion vectors estimated by the mesh-based motion estimation and the motion vectors calculated from the affine model sought. This estimation can be carried out in several iterations by varying the support of the dominant motion estimation, so as to take into account only the motion vectors most relevant to the model sought. The classification of the affine model into one of the six motion classes or translation (i.e., saddle, node, center, spiral, improper node and star node) is carried out using a binary decision tree, by statistical thresholding on the motion parameters, the error margins of the values being calculated using the covariance matrix under the hypothesis that the parameters follow normal distributions.
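The least squares estimation of the six-parameter model can be sketched as below, for a single pass without the iterative re-selection of the support and without the classification step. The parameterization u = a1 + a2·x + a3·y, v = a4 + a5·x + a6·y, the function name `fit_affine` and the use of Cramer's rule on the normal equations are choices made for this example, not details given in the text.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m, b):
    """Solve the 3x3 linear system m * a = b with Cramer's rule."""
    d = _det3(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point configuration")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(_det3(replaced) / d)
    return solution


def fit_affine(points, vectors):
    """Least squares fit of (a1..a6) to motion vectors (u, v) at points (x, y)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    syy = sum(y * y for _, y in points)
    normal = [[n, sx, sy], [sx, sxx, sxy], [sy, sxy, syy]]
    # Right-hand sides of the normal equations for each vector component.
    bu = [sum(u for u, _ in vectors),
          sum(x * u for (x, _), (u, _) in zip(points, vectors)),
          sum(y * u for (_, y), (u, _) in zip(points, vectors))]
    bv = [sum(v for _, v in vectors),
          sum(x * v for (x, _), (_, v) in zip(points, vectors)),
          sum(y * v for (_, y), (_, v) in zip(points, vectors))]
    return _solve3(normal, bu) + _solve3(normal, bv)
```

The two components share the same normal matrix, so the horizontal and vertical parameters can be solved independently.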

The steps of the flowchart of Figure 1A, corresponding to a particular implementation of the algorithm of the method according to the invention, are now described in detail.

During the step referenced 1, the user initializes the moving threshold "SEUIL". This moving threshold is, for example, initially set to 29.5 and then recalculated after each iteration (as explained above).

By means of the mechanism of the steps referenced 2, 7 and 8 (based on the use of the variable "index"), the following steps are iterated from the first to the penultimate image of the video sequence:
- detection of a cut (step referenced 3);
- coarse detection of a fade (step referenced 4);
- suppression of false fade detections (step referenced 5);
- fine detection of a fade (step referenced 6).

Step 3 of detecting a cut itself comprises the following steps:
- motion estimation, based on block matching (or, in a variant, mesh-based), between the images index and index+1 (step referenced 31);
- calculation of the PSNR relating to this estimation, denoted PSNR1 (step referenced 32);
- if PSNR1 < "seuilCUT" (step referenced 33), then:
  a) there is a cut between the images index and index+1 (R1) (step referenced 34);
  b) if a fade was detected just before the cut, the fade is eliminated (H1).

Step 4 of coarse detection of a fade consists in determining whether the current image lies in the fade suspicion band, that is, whether the following two conditions are satisfied:
- PSNR1 > "seuilCUT" + 3;
- PSNR1 < "SEUIL".
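Steps 3 and 4 reduce to two threshold comparisons on PSNR1, which the following sketch illustrates. The function name `classify_transition` and the returned labels are hypothetical; the default cut threshold of 13 and the margin of 3 are the values stated in rules R1 and R2, and the default moving threshold of 29.5 is the example initial value.

```python
def classify_transition(psnr1, cut_threshold=13.0,
                        moving_threshold=29.5, margin=3.0):
    """Classify an image pair from its PSNR1 value.

    Returns "cut", "fade_suspected" or "none".
    """
    if psnr1 < cut_threshold:
        return "cut"  # step 3: abrupt drop below seuilCUT
    if cut_threshold + margin < psnr1 < moving_threshold:
        return "fade_suspected"  # step 4: inside the suspicion band
    return "none"
```

A pair flagged as "fade_suspected" is only a candidate: it still has to pass the false detection suppression (step 5) and the fine detection (step 6).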

Two variants of step 5 of suppressing false fade detections (false alarms due to large movements: tracking shot, zoom or fast-moving object) (R3) are now presented in detail, in relation to Figures 1B and 1C. The principle common to both variants is to determine, from an analysis of the vector field, whether or not the dominant motion belongs to one of a plurality of motion categories each corresponding to a false fade detection (zooms or divergences, translations or tracking shots, rotations, hyperbolic movements, and more generally all strong global movements).

In the first variant (see FIG. 1B), step 5 of removing false fade detections itself comprises the following steps:
- estimation of the motion between images index and index+1 using the mesh-based algorithm, in order to estimate the dominant motion vectors (step 51);
- calculation of the PSNR for this estimation, denoted PSNR2 (step 52);
- if PSNR2 > "SEUIL" (step 53): there is in fact a strong global motion in the image, and the fade hypothesis must be rejected;
- otherwise, a test for a tracking shot (step 54), by computing the collinearity of the motion vectors (if the motion vectors are collinear, the dominant motion is a tracking shot and the fade hypothesis must be rejected);
- if there is no tracking shot, mesh-based backward estimation of the motion between images index+1 and index, in order to estimate the dominant motion vectors (step 55);
- calculation of the PSNR for this estimation, denoted PSNR3 (step 56);
- if PSNR3 > "SEUIL" (step 57): there is in fact a strong global motion in the image, and the fade hypothesis must be rejected;
- if PSNR2 - PSNR3 > 3 (step 58): there is in fact a strong global motion in the image, and the fade hypothesis must be rejected;
- otherwise (that is, if the tests of steps 57 and 58 are both negative), a test for a tracking shot (step 59), by computing the collinearity of the backward motion vectors.
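The decision chain of this first variant can be sketched as follows. The text does not say how collinearity is measured, so the test used here (every non-zero vector must point along the mean vector, within a sine tolerance) is an assumption, as are the names `are_collinear`, `reject_fade_variant1` and the tolerance `tol`. For simplicity the backward quantities are passed in directly, whereas the described method computes them only when the forward tests do not already reject the fade.

```python
import math


def are_collinear(vectors, tol=0.1):
    """True if all non-zero vectors point roughly along the mean vector.

    Assumed criterion: |sin(angle to mean)| <= tol and same orientation.
    """
    if not vectors:
        return False
    mx = sum(v[0] for v in vectors) / len(vectors)
    my = sum(v[1] for v in vectors) / len(vectors)
    mean_norm = math.hypot(mx, my)
    if mean_norm == 0:
        return False  # no dominant direction
    for dx, dy in vectors:
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue  # a null vector carries no direction
        sine = abs(dx * my - dy * mx) / (norm * mean_norm)
        cosine = (dx * mx + dy * my) / (norm * mean_norm)
        if sine > tol or cosine <= 0:
            return False
    return True


def reject_fade_variant1(psnr2, forward_vectors, psnr3, backward_vectors, seuil):
    """Return True when the fade hypothesis must be rejected (steps 53 to 59)."""
    if psnr2 > seuil:                      # step 53: strong global motion
        return True
    if are_collinear(forward_vectors):     # step 54: tracking shot
        return True
    if psnr3 > seuil:                      # step 57: strong global motion
        return True
    if psnr2 - psnr3 > 3:                  # step 58: forward/backward mismatch
        return True
    return are_collinear(backward_vectors)  # step 59: backward tracking shot
```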

In the second variant (see FIG. 1C), step 5 of removing false fade detections itself comprises the following steps:
- estimation of the motion between images index and index+1 using the mesh-based algorithm, in order to estimate the dominant motion vectors (step 51', identical to step 51 of FIG. 1B);
- estimation of the type of global affine motion (homothetic deformation) from the estimated motion vectors (step 52');
- if the type of global (that is, dominant) affine motion is a zoom or a translation whose norm is greater than 1 pixel: there is a large global motion, and no fade is possible (step 53');
- otherwise, mesh-based backward estimation of the motion between images index+1 and index, in order to estimate the dominant motion vectors (step 55');
- calculation of the PSNR for this estimation, denoted PSNR3 (step 56');
- if PSNR3 > "SEUIL" (step 57'): there is in fact a strong global motion in the image, and the fade hypothesis must be rejected;
- otherwise, estimation of the type of global affine motion (homothetic deformation) from the backward estimation of the motion vectors (step 58');
- if the type of global affine motion is a zoom or a translation whose norm is greater than 1 pixel: there is a large global motion, and no fade is possible (step 59').

Step 6, fine detection of a fade, is now presented in detail with reference to FIG. 1D.
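The text does not specify how the type of global affine motion is estimated in steps 52' and 58'. One possible sketch, given purely as an assumption, fits a translation plus a uniform scale about the centroid of the sample points by least squares, then labels the motion a zoom when the scale term is significant and a translation when the mean displacement exceeds 1 pixel. The names `classify_global_motion` and `zoom_tol`, and the value of `zoom_tol`, are hypothetical; only the 1-pixel bound comes from the text.

```python
import math


def classify_global_motion(points, vectors, zoom_tol=0.01):
    """Label the dominant motion as "zoom", "translation" or "none".

    Fits d = t + s * (p - c) by least squares, where c is the centroid of the
    sample points, t the mean displacement and s a uniform scale factor.
    Assumes at least two distinct sample points.
    """
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    tx = sum(v[0] for v in vectors) / n
    ty = sum(v[1] for v in vectors) / n
    num = 0.0
    den = 0.0
    for (x, y), (dx, dy) in zip(points, vectors):
        rx, ry = x - cx, y - cy
        num += rx * (dx - tx) + ry * (dy - ty)
        den += rx * rx + ry * ry
    scale = num / den if den else 0.0
    if abs(scale) > zoom_tol:
        return "zoom"
    if math.hypot(tx, ty) > 1.0:  # translation norm above 1 pixel
        return "translation"
    return "none"
```

In this sketch a result of "zoom" or "translation" would lead to rejecting the fade hypothesis, as in steps 53' and 59'.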

If image "index" may belong to a fade (that is, if there is no cut, no large global motion and no tracking shot, and the PSNR values from the forward and backward estimations lie in the detection band defined above) and this was not the case for image "index-1", then image "index" may be the first image of a fade. The value of the variable "index" is then stored in the variable "debutFondu" (step 61). In other words, if at time index there is no cut, no large global motion and no tracking shot, and the PSNR values from the forward and backward estimations are below the moving threshold "SEUIL", then a fade may exist between images debutFondu and index+1 (R2).

In step 62, the mesh-based motion estimation between images "debutFondu" and "index+1" is computed. The PSNR for this estimation, denoted PSNR4, is then calculated (step 63).

If PSNR4 is not below the threshold seuilCUT+3 (step 64), the fade hypothesis must be rejected.

If PSNR4 does not decrease, the fade hypothesis must be rejected (either it is a false alarm, or the fade was detected at the previous iteration and image index+1 is not part of that fade, which has therefore ended) (R4). PSNR4 is said to decrease if the current PSNR4 is lower than the PSNR4 calculated at the previous iteration from the motion estimation between images "debutFondu" and "index" (in the case where "debutFondu" < "index").

If PSNR4 is below the threshold seuilCUT+3 and PSNR4 decreases, step 66 searches for the initial position "initFondu", defined as the smallest index below debutFondu such that PSNR(i-2, i-1) - PSNR(i-1, i) < 0.5 for every i from initFondu+2 to debutFondu (R3.1). Indeed, the image "debutFondu" may not be the first image of the fade.
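One literal reading of the backward search of step 66 is sketched below. It assumes a list `pair_psnr` in which `pair_psnr[k]` holds PSNR(k, k+1) for consecutive images, and it walks back from debutFondu for as long as the stated inequality holds. The names `find_init_fondu` and `pair_psnr` are hypothetical, and the handling of the boundary cases (no valid step back, start of the sequence) is an assumption.

```python
def find_init_fondu(pair_psnr, debut_fondu, tol=0.5):
    """Walk back from debutFondu while PSNR(i-2, i-1) - PSNR(i-1, i) < tol.

    pair_psnr[k] is assumed to hold PSNR(k, k+1).
    Returns the smallest index initFondu such that the inequality holds for
    every i from initFondu + 2 to debutFondu.
    """
    init = debut_fondu - 1  # empty range: nothing to check yet
    i = debut_fondu
    while i >= 2 and pair_psnr[i - 2] - pair_psnr[i - 1] < tol:
        init = i - 2
        i -= 1
    return max(init, 0)
```

With `pair_psnr = [35.0, 30.0, 29.8, 29.7]` and `debut_fondu = 3`, the search stops at the 5 dB drop between the first two pairs.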

If the length L of the fade is neither too small (L > Lmin) nor too large (L < Lmax), and if the fade does not follow a cut (step 67), then a fade has indeed been detected (step 68) (H1, H2).
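The confirmation tests of steps 64 to 68 can be grouped into one predicate, sketched below. The name `fade_confirmed` and its parameters are hypothetical; no values for Lmin and Lmax are given in this passage, and the fade length is taken as an input because its exact definition is not stated here. Skipping the decrease test when no previous PSNR4 exists (debutFondu equal to index) is also an assumption.

```python
def fade_confirmed(psnr4, previous_psnr4, seuil_cut, length, l_min, l_max,
                   follows_cut):
    """Return True when the fine detection confirms a fade.

    previous_psnr4 is None when there is no earlier PSNR4 to compare with.
    """
    if not psnr4 < seuil_cut + 3:          # step 64
        return False
    if previous_psnr4 is not None and not psnr4 < previous_psnr4:
        return False                       # PSNR4 does not decrease
    if not (l_min < length < l_max):       # step 67: length bounds
        return False
    return not follows_cut                 # step 67: must not follow a cut
```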

If the PSNR4 corresponding to the motion estimation between "debutFondu" and "index+1" is low, below the threshold "seuilCUT" - S', with S' equal to 2 for example (step 69), the images are very dissimilar and the fade is therefore considered finished (step 610) (R3.2).

APPENDIX 2: MOTION ESTIMATION

The techniques used for motion estimation in image sequences can be classified into three main families according to how the parameters of the description are computed: matching methods, transform methods and differential methods.

Differential methods determine the motion parameters by optimizing a mathematical quality criterion (for example, a quadratic error between the image and its value predicted by motion compensation). The optimization is carried out with a differential mathematical optimization method, provided the criterion has the necessary mathematical properties, for example differentiability.

The invention uses, for example, the technique for determining a finite element model of the motion, consisting of a mesh and of displacement vectors at the nodes of this mesh, as described in the French patent application published under number FR 2 784 211.

Calculation of a motion field between images Nt and Nk+t

The motion field between images Nt and Nk+t is calculated in the form of hierarchical meshes, for example triangular meshes, Tbt and Tf+t, as illustrated in FIG. 4.

Such meshes are obtained by dividing certain cells. For example, triangular cells are divided into 4 sub-triangles, according to a certain criterion, during the motion estimation process. At each level of the hierarchy, a decision whether or not to divide is taken for each cell. Once these divisions have been decided, the cells adjacent to the divided cells are divided in turn so as to keep a conforming mesh structure. The initial mesh, before division (top of the hierarchy), can be arbitrary.

In the example of FIG. 4, the motion estimator decides to divide triangles 3 and 8. This leads to the division of triangles 2, 4, 7 and 9. The process is iterated down to a predefined level of the hierarchy.
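As an illustration of splitting a triangular cell into 4 sub-triangles, the sketch below uses the edge midpoints. The text only states that a cell is divided into 4 sub-triangles, so the midpoint rule is an assumption, and the names `midpoint` and `subdivide` are hypothetical. The sketch does not handle the additional division of adjacent cells that keeps the mesh conforming.

```python
def midpoint(p, q):
    """Midpoint of the segment joining two 2D points."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def subdivide(triangle):
    """Split a triangle (a, b, c) into 4 sub-triangles using edge midpoints."""
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [
        (a, ab, ca),   # corner at a
        (ab, b, bc),   # corner at b
        (ca, bc, c),   # corner at c
        (ab, bc, ca),  # central triangle
    ]
```

Applying `subdivide` again to selected sub-triangles gives the next level of the hierarchy.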

FIG. 4 illustrates the principle of affine interpolation on a triangular cell.

In the case of triangular meshes, the motion field defined by a triangular mesh T is given on each triangle e by:

$$\vec{d}(\vec{p}) = \sum_{n \in \mathrm{ver}(e)} \Psi_n^e(x, y)\, \vec{d}_n, \quad \forall\, \vec{p}(x, y) \in e$$

where:
- e denotes the triangular element of T containing the current point p, with coordinates x and y;
- {ver(e)} denotes the set of its three nodes or vertices, numbered i, j, k, at positions p_i, p_j and p_k;
- Ψ_l (l = i, j, k) denotes the barycentric coordinates of the point p(x, y) in the triangular element e_{i,j,k}, with:

$$\Psi_l^e(x, y) = \alpha_l^e + \beta_l^e x + \gamma_l^e y, \quad \alpha_l^e, \beta_l^e, \gamma_l^e \in \mathbb{R}$$

$$\sum_{l = i, j, k} \Psi_l^e(x, y) = 1 \ \text{if } \vec{p}(x, y) \in e_{i,j,k}, \qquad \Psi_l^e(x, y) = 0 \ \text{otherwise}$$

Such a model defines a field that is continuous everywhere, which is the essential feature of the approach. A BMA estimator, which is discontinuous by definition, cannot characterize the global motion of an image (zoom, translation, etc.). The motion vectors obtained with a mesh all point in the same direction during a tracking shot, or converge towards the center of the image for a reverse zoom, whereas with BMA motion estimation the motion vectors do not all have the same direction. The mesh model also allows fine control of the representation accuracy.
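The affine interpolation over a triangular cell described above can be sketched as follows: the motion vector at a point inside the triangle is the sum of the three nodal vectors weighted by the barycentric coordinates of the point. The function names `barycentric` and `interpolate_motion` are hypothetical, and the sketch assumes a non-degenerate triangle and a point lying inside it.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of point p in the triangle (a, b, c)."""
    (x, y), (xa, ya), (xb, yb), (xc, yc) = p, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    return wa, wb, 1.0 - wa - wb  # the three weights sum to 1


def interpolate_motion(p, nodes, node_vectors):
    """Motion vector at p as the barycentric blend of the three nodal vectors."""
    weights = barycentric(p, *nodes)
    dx = sum(w * v[0] for w, v in zip(weights, node_vectors))
    dy = sum(w * v[1] for w, v in zip(weights, node_vectors))
    return dx, dy
```

Because two adjacent triangles share the nodal vectors of their common edge, the interpolated field takes the same value on both sides of that edge, which is what makes the field continuous.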

At each level of the mesh hierarchy, the nodal motion vectors are calculated so as to minimize a prediction error. Various mesh-based motion estimators can be used, for example those described in patents FR No. 98 11227 or FR No. 99 15568.

The important point is that the final mesh results from a hierarchical process of successive divisions starting from an initial mesh. This hierarchical character is exploited in the present invention to characterize the type of transition being processed: fade or scene change.

Claims


1. Method for detecting a shot change in a video sequence formed of successive images, characterized in that it comprises a step of coarse detection of a fade comprising the following steps:
- obtaining a continuous field of motion vectors representative of a forward motion from a current image to a next image of the sequence;
- calculating a first similarity coefficient (PSNR1), from said motion vector field, between the next image and a compensated next image obtained from the current image and the motion vector field;
- making an intermediate decision on whether or not a fade exists, by comparing said first similarity coefficient with at least one threshold.

2. Method according to claim 1, characterized in that the first similarity coefficient is a PSNR coefficient.

3. Method according to either of claims 1 and 2, characterized in that an intermediate decision that a fade exists is taken if the first similarity coefficient is greater than a first threshold and less than a second threshold.

4. Method according to claim 3, characterized in that the second threshold is a moving threshold, equal to:
- a mean of a plurality of first similarity coefficients calculated from motion vector fields representative of motion for a plurality of image pairs included in a sliding window of images of the sequence,
- minus a predetermined number.

5. Method according to claim 4, characterized in that the sliding window comprises between 20 and 50 images.

6. Method according to any one of claims 1 to 5, characterized in that said step of obtaining a vector field takes into account a motion estimation belonging to the group comprising:
- a motion estimation based on block matching;
- a motion estimation based on a hierarchical mesh.

7. Method according to any one of claims 1 to 6, characterized in that it further comprises a step of confirming an intermediate decision that a fade exists, aimed at removing false fade detections and comprising the following steps:
- obtaining a continuous field of motion vectors representative of a forward motion from a current image to a next image of the sequence;
- analyzing the vector field, so as to determine whether or not the dominant motion belongs to one of a plurality of motion categories each corresponding to a false fade detection.

8. Method according to claim 7, characterized in that said plurality of motion categories each corresponding to a false fade detection comprises the strong global movements belonging to one of the following categories:
- zooms or divergences;
- translations or tracking shots;
- rotations;
- hyperbolic movements.

9. Method according to either of claims 7 and 8, characterized in that said step of obtaining a vector field takes into account a motion estimation based on a hierarchical mesh.

10. Method according to any one of claims 7 to 9, characterized in that said step of analyzing the vector field comprises the following steps:
- calculating a second similarity coefficient (PSNR2), from said motion vector field, between the next image and a compensated next image obtained from the current image and the motion vector field;
- comparing said second similarity coefficient with a third threshold, a false fade detection being decided if the second similarity coefficient is greater than the third threshold.

11. Method according to claim 10 and any one of claims 3 to 9, characterized in that the second and third thresholds are identical.

12. Method according to any one of claims 7 to 11, characterized in that said step of analyzing the vector field comprises a step of analyzing the collinearity of the motion vectors, so as to detect a tracking shot constituting a false fade detection.

13. Method according to any one of claims 10 to 12, characterized in that said step of analyzing the vector field comprises the following steps:
- obtaining a continuous field of motion vectors representative of a backward motion from the next image to the current image of the sequence;
- calculating a third similarity coefficient (PSNR3), from said motion vector field, between the current image and a compensated current image obtained from the next image and the motion vector field;
- comparing said third similarity coefficient (PSNR3) with said third threshold, a false fade detection being decided if the third similarity coefficient is greater than the third threshold;
- comparing the difference between said second similarity coefficient (PSNR2) and said third similarity coefficient (PSNR3) with a fourth threshold, a false fade detection being decided if said difference is greater than the fourth threshold.

14. Method according to claim 13, characterized in that said step of analyzing the vector field comprises a step of analyzing the collinearity of the motion vectors of the backward motion from the next image to the current image of the sequence, so as to detect a tracking shot constituting a false fade detection.

15. Method according to any one of claims 10 to 14, characterized in that the second resemblance coefficient (PSNR2) and the third resemblance coefficient (PSNR3) are PSNR coefficients.

16. Method according to any one of claims 7 to 9, characterized in that said step of analyzing the vector field comprises the following steps:
- determining the type of homothetic deformation characterizing the dominant motion, forward from the current image to the next image;
- deciding whether or not there is a false fade detection, according to whether or not the determined type of homothetic deformation belongs to one of a plurality of types of homothetic deformation each corresponding to a false fade detection.
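One possible way to characterize a homothetic deformation is sketched below. It is not taken from the claims: it assumes the dominant motion is modeled as v = s * (p - c) around a center c, estimates the scale s by least squares, and labels the result "expansion" or "contraction". The function names, the tolerance `eps`, and the set `FALSE_DETECTION_TYPES` are all hypothetical.

```python
def homothety_scale(points, vectors, center):
    """Least-squares estimate of s in the model v = s * (p - center)."""
    cx, cy = center
    num = 0.0
    den = 0.0
    for (x, y), (dx, dy) in zip(points, vectors):
        rx, ry = x - cx, y - cy
        num += dx * rx + dy * ry   # projection of the vector on the radius
        den += rx * rx + ry * ry
    return num / den if den > 0 else 0.0

def homothety_type(scale, eps=0.01):
    """Hypothetical labels; the claim does not name the deformation types."""
    if scale > eps:
        return "expansion"
    if scale < -eps:
        return "contraction"
    return "none"

# Assumed set of types treated as false fade detections (illustrative only).
FALSE_DETECTION_TYPES = {"expansion", "contraction"}

def is_false_fade_homothety(points, vectors, center):
    scale = homothety_scale(points, vectors, center)
    return homothety_type(scale) in FALSE_DETECTION_TYPES
```

Which types actually count as false detections is left open by the claim, so the set above should be read as a placeholder.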

17. Method according to claim 16, characterized in that said step of analyzing the vector field further comprises the following steps:
- obtaining a continuous field of motion vectors representative of a backward motion from the next image to the current image of the sequence;
- calculating a third resemblance coefficient (PSNR3), from said motion vector field, between the current image and a compensated current image obtained from the next image and the motion vector field;
- comparing said third resemblance coefficient (PSNR3) with said third threshold, a false fade detection being decided if the third resemblance coefficient is greater than the third threshold;
- determining the type of homothetic deformation characterizing the dominant motion, backward from the next image to the current image;
- deciding whether or not there is a false fade detection, according to whether or not the determined type of homothetic deformation belongs to one of a plurality of types of homothetic deformation each corresponding to a false fade detection.

18. Method according to any one of claims 1 to 17, characterized in that it comprises a step of fine detection of a fade comprising the following steps:
- storing an image ("debutfondu") considered to be the first image of the fade, after an intermediate decision on the existence of a fade has been made;
- iterating the following steps for each next image of the sequence:
  * obtaining a continuous field of motion vectors representative of a forward motion from the first image of the fade to the next image;
  * calculating a current fourth resemblance coefficient (PSNR4), from said motion vector field, between the next image and a compensated next image obtained from the first image of the fade and the motion vector field;
  * making a final decision that a fade exists if the following conditions are satisfied:
    - said current fourth resemblance coefficient is less than a fifth threshold;
    - said current fourth resemblance coefficient is less than the fourth resemblance coefficient calculated during the previous iteration;
    - the fade has a length that is neither less than a minimum length nor greater than a maximum length.
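The iteration of claim 18 can be sketched as follows, taking as input the successive PSNR4 values already computed against the stored first image of the fade. Several points are assumptions: the function name, measuring the fade length as the number of images processed since the fade start, treating the first iteration (which has no previous PSNR4) as not satisfying the decreasing condition, and abandoning the search once the maximum length is exceeded.

```python
def confirm_fade(psnr4_values, fifth_threshold, min_length, max_length):
    """Return the fade length (in images) if a fade is confirmed, else None.

    `psnr4_values[i]` is the PSNR4 between the (i+1)-th image after the
    fade start and its compensated version built from the fade start image.
    """
    previous = None
    for length, current in enumerate(psnr4_values, start=1):
        below_threshold = current < fifth_threshold
        decreasing = previous is not None and current < previous
        length_ok = min_length <= length <= max_length
        if below_threshold and decreasing and length_ok:
            return length  # final decision: a fade exists
        if length > max_length:
            return None  # assumed: stop searching past the maximum length
        previous = current
    return None
```

With illustrative values, `confirm_fade([30, 28, 25, 20], fifth_threshold=26, min_length=2, max_length=10)` confirms the fade at the third image, the first one whose PSNR4 is both below the fifth threshold and lower than the previous PSNR4.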

19. Method according to claim 18, characterized in that the step of fine detection of a fade further comprises iterating the following step for each next image of the sequence:
- said next image is considered to be the last image of the fade if said current fourth resemblance coefficient is less than a sixth threshold.

20. Method according to any one of claims 1 to 19, characterized in that it comprises a preliminary step of detecting a cut:
- sharing with said step of coarse detection of a fade said steps of obtaining a continuous field of motion vectors and of calculating a first resemblance coefficient (PSNR1);
- further comprising a step of deciding that a cut exists if the first resemblance coefficient is less than a seventh threshold.
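The sharing described in claim 20 amounts to computing PSNR1 once and using it first for the cut test, then for the coarse fade test. The sketch below illustrates that ordering only; the function name and return labels are hypothetical, and `coarse_fade_test` is a placeholder for the threshold comparison of the coarse detection step, which is not reproduced in this claim.

```python
def classify_image_pair(psnr1, seventh_threshold, coarse_fade_test):
    """Classify one (current image, next image) pair from a shared PSNR1.

    `coarse_fade_test` is a caller-supplied predicate on PSNR1 standing in
    for the coarse fade decision (an assumption of this sketch).
    """
    if psnr1 < seventh_threshold:
        return "cut"  # preliminary step: cut detected
    if coarse_fade_test(psnr1):
        return "fade_candidate"  # intermediate decision, to be refined
    return "none"
```

For instance, with a seventh threshold of 15 and a purely illustrative fade predicate `lambda p: p < 25`, a PSNR1 of 10 is classified as a cut and a PSNR1 of 20 as a fade candidate.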

21. Computer program, characterized in that it comprises program code instructions for executing the steps of the method according to any one of claims 1 to 20, when said program is executed on a computer.

22. Method of indexing a video sequence, characterized in that it comprises a step of detecting a shot change implementing the method according to any one of claims 1 to 20.

23. Device for detecting a shot change in a video sequence formed of successive images, characterized in that it comprises means for coarse detection of a fade comprising:
- means for obtaining a continuous field of motion vectors representative of a forward motion from a current image to a next image of the sequence;
- means for calculating a first resemblance coefficient (PSNR1), from said motion vector field, between the next image and a compensated next image obtained from the current image and the motion vector field;
- means for making an intermediate decision on whether or not a fade exists, by comparing said first resemblance coefficient with at least one threshold.
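The resemblance coefficient can be illustrated with the standard PSNR definition, 10 * log10(peak^2 / MSE). This is a minimal sketch under stated assumptions: images are lists of rows of grey-level values, the peak value is 255 (8-bit samples), and the compensated next image is taken as already computed, since the motion compensation itself is outside the scope of this example. The patent description may define the coefficient with its own conventions.

```python
import math

def psnr(image_a, image_b, peak=255.0):
    """Standard PSNR in dB between two equally sized grey-level images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

In the terms of the claim, PSNR1 would be `psnr(next_image, compensated_next_image)`, where both argument names are hypothetical; the result is then compared with at least one threshold to reach the intermediate decision.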