ES2938576A1

ES2938576A1 - METHOD FOR EVALUATING THE ATTENTION OF A SUBJECT WHO TRANSITS THROUGH A DETERMINED AREA AND COMPUTER PROGRAMS OF THE SAME (Machine-translation by Google Translate, not legally binding)

Info

Publication number: ES2938576A1
Application number: ES202130943A
Authority: ES
Inventors: Palma Manuel López; Fuertes Montserrat Corbalán; Barrio Javier Gago; Rubió Josep Ramon Morros
Original assignee: Universitat Politecnica de Catalunya UPC
Current assignee: Universitat Politecnica de Catalunya UPC
Priority date: 2021-10-07
Filing date: 2021-10-07
Publication date: 2023-04-12

Abstract

A method and computer programs are proposed for evaluating the attention of a subject passing through a given area. The method comprises receiving images that include a subject and the environment of an area of interest containing objects, and calculating, by processing the images, the attention paid by the subject to an object. The processing calculates a trajectory of the subject; for each point of the trajectory, it calculates a vision model of an eye of the subject and a parameter relating to a focus-of-attention density on the object, taking the calculated eye vision model into consideration; it transforms the parameter to a reference system of the area of interest, providing a transformed focus-of-attention density parameter; and it calculates a parameter relating to a visual focus of attention of the trajectory. (Machine-translation by Google Translate, not legally binding)

Description

DESCRIPTION

METHOD FOR EVALUATING THE ATTENTION OF A SUBJECT WHO TRANSITS THROUGH A DETERMINED AREA AND COMPUTER PROGRAMS OF THE SAME

Technical field

The present invention concerns a method, and computer programs, for evaluating the attention of one or more subjects.

Background of the invention

A method for determining what subjects (or people) pay attention to has been sought after for some time. The need has become more acute with the arrival of the Internet, where human behavior is traced from the searches and actions carried out on the web.

However, when people move around enclosed spaces, there is currently no well-established way to associate preferences or, better still, to compare which products or signs attract people's attention.

In the current state of the art there are various algorithms for identifying people or objects. Most of the known references detect static objects. For example, [1] recognizes static faces, people and cars, presenting a classification and pattern recognition system with high success rates. Some techniques aim to extract a perfect silhouette, which can be difficult depending on the scene: it may present lighting variations, a floor whose poor contrast makes people hard to recognize, or confusion with shadows. Other techniques detect regions or features of the person and apply classification models or supervised learning to them [2]. These systems do not need a sharply defined shape; a good approximation is sufficient. In [3], four detectors are used to detect people, one each for the head, the legs, the left arm and the right arm, within a hierarchical classification architecture in which learning occurs at more than two levels. The authors present results showing that their system performs significantly better than a full-body person detector, and that it is more robust at locating partial views of people and people whose body parts have little contrast with the background.

Other methods disclose the detection of moving objects or people using video sequences [4-8].

On the other hand, depending on the application, the aim is to detect the person and to be able to trace their trajectory [9].

Other research detects people by detecting their faces. For example, [10] counts the people passing through a door by detecting faces. Eye-tracking technology has been used to analyze the direction of a user's gaze and determine whether he or she is actively looking at a given product or sign. In [11], the visual saliency of the products in a store, and how this saliency affects customer decisions, is investigated. The analysis uses eye-tracking data and sales data from a grocery store. Eye tracking is also used, and frontal cameras can be placed on or near the product or sign to obtain a frontal view of the customer. In this frontal position, people's faces and eyes can be detected, making a fine analysis of gaze direction possible. The drawback is that one camera is needed for each product analyzed.

In [12], the attention that passersby pay to an outdoor advertisement is studied using a single video camera. The visual focus of attention (VFOA) is analyzed for a variable number of people (up to three are handled) whose movement is unrestricted. The method consists of two components: a dynamic Bayesian network, which simultaneously tracks the people in the scene and estimates their head pose, and two multi-person VFOA models based on Gaussian mixture models (GMM) and hidden Markov models (HMM), which infer the VFOA of a subject from their location and head pose. In [13], a method is proposed in which head pose information is used to produce better facial alignments for pose-invariant expression or face recognition. It is a current approach based on supervised deep learning.

Likewise, some patents and/or patent applications are known that disclose methods and systems based on detecting and analyzing the movement of at least one individual within an enclosure, room or public space by means of overhead machine-vision cameras, capturing information about the space and processing it to estimate the trajectory of said individual's movement.

For example, document EP 3096263 B1 discloses a method and system for recognizing the orientation of the human body based on a two-lens camera. The method comprises the steps of receiving a grayscale or color image and a corresponding depth map; projecting a three-dimensional image to obtain a top view; tracking a movement trajectory of the human body; determining whether the movement speed of the human body is greater than a threshold; if the movement speed of the human body is greater than the threshold, recognizing the direction of the movement trajectory of the human body as the orientation of the human body; if the movement speed of the human body is less than or equal to the threshold, performing classification to acquire an initial pair of relative directions; performing a back projection to acquire two-dimensional information of a head region; and performing classification to obtain the orientation of the human body.

Document US 9672634 B2 describes a method for tracking video objects, comprising the steps of: receiving a sequence of stereoscopic images; receiving a depth map for each stereoscopic image in the sequence; calculating a first axis histogram for each depth map; applying a first object detection method to track objects based on the content of the images and/or the depth maps; applying, in parallel with the first object detection method, a second object detection method to track objects based on the content of the histograms of the depth maps; and determining the locations of the tracked objects by comparing the results of the first and second object detection methods.

Document WO 2018235923 A1 describes a position estimation device. The device accepts moving image data, including a series of frames obtained by capturing images of a subject along a movement path while moving, extracts a feature point related to the subject captured in each frame of the moving image data, and creates a reconfiguration map in which the feature point is associated with coordinates in a reconfiguration space, which is a prescribed virtual three-dimensional space. The device also retrieves a reference feature point corresponding to the feature point in reference image data associated with position information, acquires a transformation relationship between a world coordinate system and the coordinate system of the coordinates associated with the feature point, and uses the transformation relationship to correct the reconfiguration map. It does so by performing a scale correction of the amount of movement of the camera that captured the moving image data, while suppressing any change in the estimated position and attitude of the camera, represented using the coordinates in the reconfiguration space, and by performing the correction so that the position and attitude of the camera approach the position information associated with the reference image data.

The prior-art documents do not disclose how to obtain additional information about the focus of attention, estimated solely from information on the position of an individual's head, shoulders and body.

Therefore, new systems/methods are needed that make it possible to evaluate the attention paid by one or more subjects passing through an area/enclosure (analysis area), using a small number of cameras and respecting the privacy of said subject(s).

References:

[1] C. Papageorgiou and T. Poggio, “A Trainable System for Object Detection,” Int. J. Comput. Vis., vol. 38, no. 1, pp. 15-33, Jun. 2000.

[2] A. Broggi, A. Fascioli, I. Fedriga, A. Tibaldi, and M. Del Rose, “Stereo-based preprocessing for human shape localization in unstructured environments,” in IEEE Intelligent Vehicles Symposium, Proceedings, 2003.

[3] A. Mohan, C. Papageorgiou, and T. Poggio, “Example-based object detection in images by components,” IEEE Trans. Pattern Anal. Mach. Intell., 2001.

[4] A. Datta, M. Shah, and N. Da Vitoria Lobo, “Person-on-person violence detection in video data,” Proc. - Int. Conf. Pattern Recognit., 2002.

[5] P. Viola, M. J. Jones, and D. Snow, “Detecting pedestrians using patterns of motion and appearance,” Int. J. Comput. Vis., 2005.

[6] N. Dalal, B. Triggs, and C. Schmid, “Human detection using oriented histograms of flow and appearance,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2006.

[7] M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool, “Robust tracking-by-detection using a detector confidence particle filter,” in Proceedings of the IEEE International Conference on Computer Vision, 2009.

[8] H. Cho, Y. W. Seo, B. V. K. V. Kumar, and R. R. Rajkumar, “A multi-sensor fusion system for moving object detection and tracking in urban driving environments,” in Proceedings - IEEE International Conference on Robotics and Automation, 2014.

[9] M. Andriluka, S. Roth, and B. Schiele, “People-tracking-by-detection and people-detection-by-tracking,” in 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2008.

[10] T. Y. Chen, C. H. Chen, D. J. Wang, and Y. L. Kuo, “A people counting system based on face-detection,” in Proceedings - 4th International Conference on Genetic and Evolutionary Computing, ICGEC 2010, 2010.

[11] J. Clement, J. Aastrup, and S. Charlotte Forsberg, “Decisive visual saliency and consumers' in-store decisions,” J. Retail. Consum. Serv., vol. 22, pp. 187-194, Jan. 2015.

[12] M. Farenzena, L. Bazzani, V. Murino, and M. Cristani, No Title, vol. 5716 LNCS. Springer, Berlin, Heidelberg, 2009, pp. 481-489.

[13] F. Kuhnke and J. Ostermann, “Deep head pose estimation using synthetic images and partial adversarial domain adaption for continuous label spaces,” in Proceedings of the IEEE International Conference on Computer Vision, 2019, vol. 2019-Octob, pp. 10163-10172.

Disclosure of the invention

To this end, exemplary embodiments of the present invention provide, according to a first aspect, a method for evaluating the attention of a subject passing through a given area. The method comprises receiving, by a processing unit, a sequence of images obtained by one or more 3D cameras arranged in an overhead (zenithal) position within an area of interest during a given time interval, wherein the sequence of images includes at least a first subject and an environment of the area of interest that includes a plurality of objects; and calculating, by the processing unit, the attention paid by the first subject to a given object of the plurality of objects by processing frames of the sequence of images.

According to the present invention, the processing includes calculating a trajectory of the first subject by calculating a position and angles of the first subject's head in relation to spatial coordinates of the surfaces that delimit at least part of the area of interest. In addition, for each point of the calculated trajectory, the following are calculated:

- a vision model of an eye of the first subject, based on determining the surface that the viewed object occupies on the retina and for how long, and on calculating the distance from the first subject to the object; and

- a parameter relating to a focus-of-attention density of the first subject on the object, taking the calculated eye vision model into consideration.

The parameter relating to the focus-of-attention density is transformed to a reference system of the area of interest, providing a transformed focus-of-attention density parameter. Finally, a parameter relating to a visual focus of attention of the trajectory is calculated by summing the transformed focus-of-attention density parameters of all the points.
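The transformation and summation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the rigid transform (rotation plus translation), the voxel grid used to accumulate the result, and the names `rotation_about_z`, `eye_to_area`, `accumulate_focus` and `cell_size` are all assumptions introduced here.

```python
import math
from collections import defaultdict


def rotation_about_z(yaw):
    """Rotation matrix for a turn of `yaw` radians about the vertical (z) axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def eye_to_area(point_eye, rotation, eye_position):
    """Map a point from the eye frame to the area-of-interest frame: R p + t."""
    return tuple(
        sum(rotation[i][j] * point_eye[j] for j in range(3)) + eye_position[i]
        for i in range(3)
    )


def accumulate_focus(trajectory_samples, cell_size=0.5):
    """Sum the transformed density values over every point of a trajectory.

    Each sample is (eye_position, rotation, densities), where `densities`
    is a list of (point_in_eye_frame, density_value) pairs. The grid of
    cubic cells of side `cell_size` is a hypothetical discretization.
    """
    focus = defaultdict(float)
    for eye_position, rotation, densities in trajectory_samples:
        for point_eye, density in densities:
            point_area = eye_to_area(point_eye, rotation, eye_position)
            cell = tuple(math.floor(v / cell_size) for v in point_area)
            focus[cell] += density
    return dict(focus)
```

Two trajectory points that look at the same spot in the room contribute to the same cell, so the accumulated value grows with the time the subject spends looking there.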

In an exemplary embodiment, the surface occupied by the object on the retina is determined by calculating a solid angle relative to the viewed object. The parameter relating to the focus-of-attention density may be proportional to the solid angle and inversely proportional to the square of the distance.

In an exemplary embodiment, the parameter relating to the focus-of-attention density is additionally calculated taking into consideration a cosine function of the angle normal to the surface on which the object lies.
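One way to read the two preceding embodiments is sketched below. The text states only the proportionalities, so the constant `k`, the function names, and the choice of a disc-shaped object viewed on-axis (used here because its solid angle has a simple closed form) are assumptions made for illustration.

```python
import math


def disc_solid_angle(radius, distance):
    """Solid angle (steradians) of a flat disc seen on-axis from `distance`."""
    return 2.0 * math.pi * (1.0 - distance / math.hypot(distance, radius))


def attention_density(solid_angle, distance, surface_angle=0.0, k=1.0):
    """Density proportional to the solid angle and to 1 / distance**2.

    `surface_angle` (radians) feeds the optional cosine factor and `k` is a
    hypothetical proportionality constant; neither value comes from the source.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * solid_angle * math.cos(surface_angle) / distance ** 2


# Usage: a poster of radius 0.5 m viewed from 2 m, with a 30 degree cosine factor.
omega = disc_solid_angle(0.5, 2.0)
density = attention_density(omega, 2.0, surface_angle=math.radians(30))
```

With the solid angle held fixed, doubling the distance divides the density by four, and a cosine factor of 60 degrees halves it.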

In particular, the head angles comprise a tilt (pitch) angle, a horizontal rotation (yaw) angle and a roll angle of the head.
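As an illustration of how these angles can be used, the sketch below converts pitch and yaw into a viewing direction and finds where that direction meets a flat surface such as a wall. The axis convention (x and y horizontal, z up, yaw measured about z from the x axis, positive pitch looking up) and the function names are assumptions; the source does not fix them. Under this convention roll turns the head about the viewing axis and therefore leaves the direction vector unchanged.

```python
import math


def gaze_direction(pitch, yaw):
    """Unit viewing vector for head angles given in radians."""
    cp = math.cos(pitch)
    return (cp * math.cos(yaw), cp * math.sin(yaw), math.sin(pitch))


def intersect_plane(origin, direction, plane_point, plane_normal):
    """Point where the viewing ray meets a plane, or None if it never does."""
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < 1e-12:
        return None  # ray runs parallel to the surface
    t = sum((p - o) * n
            for p, o, n in zip(plane_point, origin, plane_normal)) / denom
    if t < 0:
        return None  # the surface lies behind the subject
    return tuple(o + t * d for o, d in zip(origin, direction))


# Usage: head at a height of 1.7 m looking straight at a wall 3 m ahead (x = 3).
hit = intersect_plane((0.0, 0.0, 1.7), gaze_direction(0.0, 0.0),
                      (3.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```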

In an exemplary embodiment, the sequence of images includes a plurality of subjects. In this case, before calculating the attention, the processing unit runs a segmentation algorithm on one or more frames of the sequence of images to detect and tell apart the different subjects.
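The source does not name the segmentation algorithm. Purely as an illustration of the step, the sketch below swaps in a plain connected-component labelling over an overhead height map, assuming each pixel stores the height above the floor in metres and that anything taller than a hypothetical `min_height` belongs to a person.

```python
from collections import deque


def segment_subjects(height_map, min_height=1.0):
    """Label 4-connected groups of pixels at or above `min_height`.

    Returns (labels, count): `labels` has the shape of `height_map`, with 0
    for background and 1..count for each detected subject.
    """
    rows, cols = len(height_map), len(height_map[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if height_map[r][c] < min_height or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one subject
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc),
                               (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and height_map[nr][nc] >= min_height):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count
```

A real overhead 3D camera would need noise filtering and handling of people who touch each other, neither of which this sketch attempts.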

The area of interest may comprise an enclosed space. Likewise, the surface on which the object lies may comprise a floor, a wall, a table, a shelf or a chair, among others.

Other embodiments of the invention disclosed herein also include computer program products for performing the steps and operations of the method proposed in the first aspect of the invention. More particularly, a computer program product is an embodiment having a computer-readable medium that includes computer program instructions encoded thereon which, when executed on at least one processor of a computer system, cause the processor to perform the operations indicated herein as embodiments of the invention.

In an exemplary embodiment, the code instructions are configured to calculate the surface occupied by the object on the retina based on a solid angle relative to the viewed object.

In an exemplary embodiment, the code instructions are configured to calculate the parameter relating to the focus-of-attention density so that it is proportional to the solid angle and inversely proportional to the square of the distance.

In an exemplary embodiment, the code instructions are further configured to calculate the parameter relating to the focus-of-attention density taking into consideration a cosine function of an angle normal to the surface on which the object lies.

The present invention therefore provides a new methodology/strategy for quantifying the attention that people pay to the different objects in their environment. The new metric models the system as if it were the eye in order to determine the attention paid, and quantifies that attention at every point of an area of interest. To calculate this attention, the concept of oriented trajectory is introduced: the set of positions and head orientation angles of each person of interest over the time of interest.
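The oriented trajectory defined above could be held in a structure like the one sketched here. The class names, field names and units (seconds, metres, radians) are hypothetical; the source specifies only that the trajectory gathers positions and head orientation angles per person over a time of interest.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrientedPose:
    """One trajectory point: where the head is and how it is oriented."""
    t: float          # timestamp in seconds
    position: tuple   # (x, y, z) of the head in the area reference system
    pitch: float      # tilt, in radians
    yaw: float        # horizontal rotation, in radians
    roll: float       # roll, in radians


@dataclass
class OrientedTrajectory:
    """All the poses recorded for one subject, kept in time order."""
    subject_id: int
    poses: list = field(default_factory=list)

    def add(self, pose):
        self.poses.append(pose)
        self.poses.sort(key=lambda p: p.t)

    def between(self, t_start, t_end):
        """Poses that fall inside the time interval of interest."""
        return [p for p in self.poses if t_start <= p.t <= t_end]
```

Keeping one trajectory per subject identifier means the per-point calculations can be run independently for each person detected in the scene.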

Brief description of the drawings

The foregoing and other features and advantages will be more fully understood from the following detailed description of some embodiments, which is merely illustrative and non-limiting, with reference to the accompanying drawings, in which:

Fig. 1 is a flowchart illustrating a method for evaluating the attention of a subject passing through a given area, according to an embodiment of the present invention.

Fig. 2 shows the point cloud distribution of the analysis area. The central gray box shows the area captured by the camera, which has a high density of points, while the walls and the floor have a reduced density. The analysis targets are located both in the camera area and on the walls outside it.

Fig. 3 schematically shows an example of an observer looking at an object, indicating pitch, roll and yaw.

Figs. 4 and 5 schematically show the coordinate system of a room with respect to the reference system of the eye.

Detailed description of the invention and of some embodiments

The present invention provides a method, and computer programs, for evaluating the attention of a subject passing through a given area. To this end, a vision system is used, preferably 3D, with one or more cameras in an overhead (zenithal) position. The invention makes it possible to detect a subject moving through a space while looking towards objects that attract them (that is, a focus of attention). The invention also makes it possible to track the trajectory of that subject by measuring the head angles frame by frame.

To carry out the quantification, a method is proposed that generates a magnitude at each point of the surfaces to be analyzed. This magnitude indicates whether the point has been observed and to what extent. Specifically, it is proportional to the observation time and inversely proportional to the square of the distance between the subject and the observed point. The magnitude is related to the region of the fundus (retina) used to observe the object and to the time for which it has been observed. The sensitivity is therefore applied as a function of the region in which the object lies relative to the subject (that is, the observer). The oriented trajectory of the subject is calculated by determining the position of the head and the viewing direction. The natural direction of the eye is assumed to coincide with the direction of the head.

At each point of the trajectory, a model of the eye's vision is used that transforms the associated region of the retina into a magnitude on the surface. This magnitude is called focus density and is calculated using a 3D PCL (Point Cloud Library) point cloud of the surfaces to be analyzed, whether they lie inside or outside the trajectory capture area. The focus density is accumulated over each frame of the trajectory. Finally, the attention on the objects is determined by summing the focus density at each point of the surface. The objects with the highest focus density are those that have generated the most interest. These magnitudes are not absolute values; they serve only to compare different objects of interest.

Fig. 1 shows an embodiment of the proposed method. According to this embodiment, a processing unit (or a processor) receives a sequence of images obtained by one or more 3D cameras arranged in an overhead position within an area of interest during a given time interval. The image sequence includes one or more subjects and an environment of the area of interest that includes a number of objects. The processing unit calculates the attention paid by at least one of the subjects to a given object by processing frames of the image sequence. In this case, the processing comprises calculating the trajectory of the subject by calculating the position and angles of the head in relation to the spatial coordinates of the surfaces that delimit at least part of the area of interest. For each point of the trajectory, a vision model of the subject's eye is calculated, based on determining the surface occupied on the retina by the observed object and the observation time, and on calculating the distance from the subject to the object. A parameter relating to the density of the subject's focus of attention on the object is also calculated, taking the calculated eye vision model into account, and this parameter is transformed to a reference system of the area of interest, yielding a transformed attention-focus density parameter. Finally, a parameter relating to the visual focus of attention of the trajectory is calculated by summing the transformed attention-focus density parameters.

The dimensions of the area of interest, the area captured by the overhead camera (where there is an image) and the objects of interest are converted into a point cloud on which the analysis is performed (Fig. 2). As can be seen in Fig. 2, some areas have a higher density of points than others. This is done to reduce the processing time, which depends on the number of points. Even in the areas with a lower density of points, an acceptable level of detail of the area to be analyzed is maintained. Note that the density of points is higher in the area captured by the camera (central gray area) and lower in the outer area. Some analysis objects are located in a fairly dense area and others, those on the walls, in an area with a lower density of points.
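A minimal sketch of how such a mixed-density point cloud could be laid out, assuming a rectangular floor sampled on regular grids. The function names, the room dimensions and the two spacings are illustrative assumptions rather than values from the method, and a real system would likely build the cloud with a point cloud library.

```python
def grid_points(x0, x1, y0, y1, step, z=0.0):
    """Regular grid of (x, y, z) points covering [x0, x1] x [y0, y1]."""
    nx = int(round((x1 - x0) / step)) + 1
    ny = int(round((y1 - y0) / step)) + 1
    return [(x0 + i * step, y0 + j * step, z)
            for i in range(nx) for j in range(ny)]


def inside(p, x0, x1, y0, y1):
    """True if point p lies within the given rectangle (borders included)."""
    return x0 <= p[0] <= x1 and y0 <= p[1] <= y1


# Hypothetical 10 m x 8 m floor with a 4 m x 3 m camera zone in the middle.
CAMERA_ZONE = (3.0, 7.0, 2.5, 5.5)

# Coarse sampling (0.5 m) outside the camera zone, dense sampling (0.1 m) inside.
coarse = [p for p in grid_points(0.0, 10.0, 0.0, 8.0, step=0.5)
          if not inside(p, *CAMERA_ZONE)]
dense = grid_points(*CAMERA_ZONE, step=0.1)
cloud = coarse + dense
```

With these assumed spacings the camera zone holds far more points than the rest of the floor, even though it covers a smaller surface.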

In the present invention, the trajectory is defined by the time sequence of the head position and of the angles that determine the viewing direction. The trajectory is therefore characterized by an ordered succession of positions and angles, which locate the head in space and orient it with respect to its surroundings:

where (x, y, z) is the position and the second term comprises the three angles: pitch, roll and yaw.
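The oriented trajectory described above can be held in a small data structure. The sketch below is only one possible representation: the class name `OrientedSample`, the timestamp field `t` and the sample values are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OrientedSample:
    """Head pose at one frame: position plus the three head angles."""
    t: float      # capture time in seconds (assumed field)
    x: float
    y: float
    z: float
    pitch: float  # angles in radians (assumed unit)
    roll: float
    yaw: float


def make_trajectory(samples: List[OrientedSample]) -> List[OrientedSample]:
    """Return the samples as an ordered succession, sorted by time."""
    return sorted(samples, key=lambda s: s.t)


trajectory = make_trajectory([
    OrientedSample(0.2, 1.10, 2.0, 1.7, 0.0, 0.0, 0.35),
    OrientedSample(0.0, 1.00, 2.0, 1.7, 0.0, 0.0, 0.30),
    OrientedSample(0.1, 1.05, 2.0, 1.7, 0.0, 0.0, 0.32),
])
```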

To obtain the focus of attention in 2D, only the horizontal rotation angle (yaw) needs to be taken into account. In 3D, both the yaw and the pitch angles must be considered. Only these angles are used because they are the ones needed to determine the focus of attention of people in an enclosed space. The object of attention may be at different heights, and subjects may also have different heights; analyzing the pitch angle therefore introduces another degree of freedom that widens the range of applications of the system.
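As an illustration of how these angles determine a viewing direction, the sketch below converts yaw (2D) or yaw and pitch (3D) into a unit vector. The conventions are assumptions, not taken from the text: the vertical axis is z, yaw is measured from the x axis, pitch is positive when looking upward, and roll is ignored.

```python
import math


def view_direction_2d(yaw):
    """Unit viewing direction in the horizontal plane (yaw in radians)."""
    return (math.cos(yaw), math.sin(yaw))


def view_direction_3d(yaw, pitch):
    """Unit viewing direction in space (yaw and pitch in radians)."""
    cp = math.cos(pitch)
    return (cp * math.cos(yaw), cp * math.sin(yaw), math.sin(pitch))
```

With zero pitch, the 3D direction reduces to the 2D one with a zero vertical component, which matches the idea that the 2D analysis is the 3D analysis without the pitch degree of freedom.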

3D calculation method:

In the 3D calculation method, the attention that a subject pays to an object is determined by analyzing how the human visual system, and more specifically the eye, works. The human eye has a sensor, the retina, on which images are formed after being projected through a lens, the crystalline lens. To determine the attention paid to an object, one must determine what fraction of the retina the object occupies and for how long. The area occupied on the retina is measured as a solid angle (in steradians) relative to the focus. This area depends on the surface of the object in the real world and on its distance from the eye.
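The dependence on surface and distance can be illustrated with the usual small-object approximation of the solid angle, area divided by squared distance. This is a sketch under that approximation (object small compared with its distance and facing the observer); the function name and the example values are illustrative.

```python
def solid_angle(area, distance):
    """Small-object approximation: area over squared distance, in steradians."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return area / distance ** 2


near = solid_angle(area=1.0, distance=2.0)  # 1 m^2 poster seen from 2 m
far = solid_angle(area=4.0, distance=4.0)   # 4 m^2 poster seen from 4 m
```

Both calls give 0.25 sr: doubling the distance requires four times the area to occupy the same region of the retina.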

The constant solid angle relation is given by:

$$\Omega = \frac{A}{d^2}$$

where A is the area of the object and d its distance to the eye. To maintain the same attention in the eye, the size of the object must therefore grow with the square of the distance. Human attention also depends on the horizontal and vertical angles. This angular dependence is expressed by a multiplicative function of the angle. In the case of the eye, the focus of attention can be defined as:

where α and θ are the angles with the x and y axes, respectively, in the reference system of the eye, and σx and σy are the standard deviations associated with the angles of the field of view (horizontal and vertical).

Thus, for example, if the aim is to analyze attention with reading capability, the angle would be ±10° and σ = 10.

If a camera were used instead of the eye,

$$F_{angle}(\alpha, \theta) = \begin{cases} 0 & \text{if angle} > \text{angle}_{max} \\ \ldots & \text{in other cases} \end{cases}$$
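The two angular weighting functions can be sketched as follows. Both forms are assumptions, because the original expressions are not recoverable from this text: the eye is modeled here as a separable Gaussian in the horizontal and vertical angles, with the standard deviations described above, and the camera as a hard cutoff that returns 1 inside the maximum angle. The function names and default values are illustrative.

```python
import math


def f_angle_eye(alpha_deg, theta_deg, sigma_x_deg=10.0, sigma_y_deg=10.0):
    """ASSUMED separable Gaussian falloff in the horizontal and vertical angles."""
    return math.exp(-0.5 * ((alpha_deg / sigma_x_deg) ** 2
                            + (theta_deg / sigma_y_deg) ** 2))


def f_angle_camera(angle_deg, max_angle_deg):
    """Hard cutoff: 0 beyond the maximum angle; ASSUMED to be 1 inside it."""
    return 0.0 if abs(angle_deg) > max_angle_deg else 1.0
```

Under the Gaussian assumption, a point straight ahead has weight 1 and the weight decays smoothly with angle, whereas the camera model treats every point inside its field of view equally.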

To evaluate the attention-focus density from the observer (that is, the subject) to the object, the object must be divided into pixels, or minimum surface elements of the object facing the subject. Fig. 4 shows that the distance from the observer to the pixel under consideration is d (the magnitude of the vector r_d). Projecting the object and the pixel onto the ZX plane shows the projection of r_d, r_dx, and the angle θx, which represents the horizontal viewing angle. Projecting onto the ZY plane shows the vector r_dy and the angle θy, which represents the vertical angle.
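The distance and the two projected angles can be computed from the vector to the surface element, as in the sketch below. It assumes that the eye looks along the +Z axis of its own reference system, with X horizontal and Y vertical; this reading of Fig. 4 and the function name `eye_angles` are assumptions.

```python
import math


def eye_angles(r_d):
    """Distance and projected angles of a surface element in the eye frame.

    ASSUMPTION: the eye looks along +Z, X is horizontal and Y is vertical.
    """
    x, y, z = r_d
    d = math.sqrt(x * x + y * y + z * z)
    theta_x = math.atan2(x, z)  # angle of the ZX projection with the Z axis
    theta_y = math.atan2(y, z)  # angle of the ZY projection with the Z axis
    return d, theta_x, theta_y
```

For example, a point at (1, 0, 1) in the eye frame lies 45° to the side of the viewing direction and at the same height as the eye.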

As mentioned above, the focus of attention is proportional to the solid angle of the object projected onto the retina of the eye. Part of the attention density function is therefore a function proportional to that angle. Since F_angle is related to the surface of the object projected onto the retina, the focus density is proportional to F_angle and inversely proportional to the square of the distance d, as indicated by the solid angle equation. Finally, if the object is not viewed frontally, the attention must be penalized by the cosine of the angle to the surface normal, θn. With all these considerations, the attention-focus density function (DFOA) at a given point P is defined as follows:

$$\mathrm{DFOA}(P) = K \cdot F_{angle} \cdot \frac{\cos\theta_n}{d^2}$$

where K is the normalization constant, F_angle is the function described in the previous equations, d is the distance from the eye to the point P and θn is the angle with the normal to the surface on which the point P lies. The factors could be complemented with functions that reward the direction of movement or the relative position of the head (natural or forced), but these cannot be measured objectively, so they are not used by the proposed method.

DFOA(P) can be calculated at any point P belonging to a surface (floor, walls, table, etc.) or to an object present in the analysis area. DFOA(P) is zero at points of objects hidden from the subject (observer) and at points that the subject has not looked at during the analysis time.
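The factors described for DFOA(P) (angular weight, inverse squared distance and cosine of the angle to the surface normal) can be combined as in the following sketch. The function signature, the `visible` flag and the clamping of back-facing surfaces to zero are assumptions added for illustration.

```python
import math


def dfoa(f_angle, distance, theta_n, k=1.0, visible=True):
    """Attention-focus density at one point, in the eye reference system.

    f_angle  -- angular weight of the point (value of the angular function)
    distance -- distance d from the eye to the point
    theta_n  -- angle to the surface normal at the point, in radians
    k        -- normalization constant K
    visible  -- False for points hidden from the observer (assumed flag)
    """
    if not visible or distance <= 0:
        return 0.0
    cos_n = math.cos(theta_n)
    if cos_n <= 0.0:  # surface facing away: treated as not seen (assumption)
        return 0.0
    return k * f_angle * cos_n / distance ** 2
```

A point seen frontally from 2 m with full angular weight gets 0.25 K, and the same point seen at 60° from its normal gets about half of that.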

The DFOA(P) calculated so far is expressed in the reference system of the eye. Since the position and angle of the eye change at each position of the trajectory, this is very awkward to handle. To solve this, a transformation ([T]) is applied to the reference system of the room (X'Y'Z'); see Fig. 5.

$$\mathrm{DFOA}(P') = [T] \cdot \mathrm{DFOA}_{local}(P)$$

where DFOAlocal(P) is the DFOA calculated in the reference system of the subject's eye, and DFOA(P') is the DFOA obtained by transforming the point P in the reference system of the eye to the point P' in the reference system of the room.
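One way to read the transformation [T] is as a rigid change of coordinates between the eye and the room, built from the head position and the yaw and pitch angles. The sketch below makes several assumptions that are not stated in the text: Z' is vertical, yaw is measured about Z' from the X' axis, pitch is positive looking upward, roll is ignored, and the eye frame has X to the right, Y up and Z along the viewing direction.

```python
import math


def eye_axes(yaw, pitch):
    """Right, up and forward unit vectors of the eye, in room coordinates."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    right = (sy, -cy, 0.0)
    up = (-sp * cy, -sp * sy, cp)
    forward = (cp * cy, cp * sy, sp)
    return right, up, forward


def room_to_eye(p_room, head, yaw, pitch):
    """Coordinates of a room point in the eye frame (X right, Y up, Z forward)."""
    v = tuple(a - b for a, b in zip(p_room, head))
    return tuple(sum(a * b for a, b in zip(v, axis))
                 for axis in eye_axes(yaw, pitch))


def eye_to_room(p_eye, head, yaw, pitch):
    """Inverse mapping: a point in the eye frame expressed in room coordinates."""
    right, up, forward = eye_axes(yaw, pitch)
    return tuple(h + p_eye[0] * r + p_eye[1] * u + p_eye[2] * f
                 for h, r, u, f in zip(head, right, up, forward))
```

In practice the point cloud is stored in room coordinates, so each cloud point can be mapped into the eye frame with `room_to_eye`, evaluated there, and the resulting density stored against the original room point.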

However, DFOA(P') only indicates the focus of attention of a subject from one given position. To calculate the visual focus of attention (VFOA) at a point, the DFOA(P') values must be summed over all the points of the subject's trajectory (T). A further sum over all the subjects of a collective (C) considered in the analysis can also be added. To normalize VFOA(P'), the sums are multiplied by a constant N:

$$\mathrm{VFOA}(P') = N \sum_{C} \sum_{T} \mathrm{DFOA}(P')$$

The normalization is performed by calculating the sum of VFOA(P') over all the points P' of the objects and surfaces in the analysis area, and adjusting N so that this sum equals 1:

$$\sum_{P'} \mathrm{VFOA}(P') = 1$$

The VFOA is a cloud of intensity points that can be represented, for example, with a color code: one color can be used to represent maximum attention and a different color to indicate low attention.
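The accumulation, normalization and color coding of the VFOA can be sketched as follows. The data layout (one dictionary of per-point DFOA values for each trajectory point), the point labels and the blue-to-red palette are assumptions chosen for illustration.

```python
def accumulate_vfoa(dfoa_frames):
    """Sum per-frame DFOA values per point, then scale so the total is 1."""
    totals = {}
    for frame in dfoa_frames:  # one dict per trajectory point
        for point_id, value in frame.items():
            totals[point_id] = totals.get(point_id, 0.0) + value
    grand_total = sum(totals.values())
    if grand_total == 0.0:
        return totals
    n = 1.0 / grand_total  # normalization constant N
    return {point_id: n * value for point_id, value in totals.items()}


def to_color(value, max_value):
    """RGB color: blue for low attention, red for high (arbitrary palette)."""
    level = 0.0 if max_value <= 0 else min(1.0, value / max_value)
    return (int(round(255 * level)), 0, int(round(255 * (1.0 - level))))


# Hypothetical DFOA values at two cloud points over two frames.
frames = [{"poster": 1.0, "shelf": 1.0}, {"poster": 2.0}]
vfoa = accumulate_vfoa(frames)
colors = {k: to_color(v, max(vfoa.values())) for k, v in vfoa.items()}
```

Frames from several subjects can simply be concatenated before calling `accumulate_vfoa`, which corresponds to the additional sum over a collective.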

The VFOA(P') equation above does not include time, but time has to be taken into account when multiple simultaneous trajectories are considered. If the time variable is included, the VFOA(P') equation becomes:

VFOA(P', t) expresses the time dependence of the VFOA, where the sums are taken, within a given collective, over the DFOA values that coincide with time t. Naturally, this only has non-trivial applications when there are multiple simultaneous trajectories. For these trajectories, the sums are taken over all the elements that were in the transit area at time t.

The accumulated VFOA can also be restricted to an interval between t1 and t2:

$$\mathrm{VFOA}(P', t \in (t_1, t_2)) = N \sum_{t_1, t_2} \mathrm{VFOA}(P', t)$$

The main purpose of this time dependence is to synchronize the focus-of-attention function with events that occur in the analysis area, so that it can be determined which event has attracted the most attention.
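A sketch of how the time-restricted sum could be used to compare events, assuming that the DFOA values are stored as (time, point, value) tuples pooled over all subjects. The sample values, event names and closed interval are illustrative assumptions, and the normalization constant N is left out so that the totals of different events remain comparable.

```python
def attention_in_window(samples, t1, t2):
    """Sum DFOA per point over the samples with t1 <= t <= t2.

    `samples` is an iterable of (t, point_id, dfoa) tuples, pooled over all
    subjects present in the transit area.
    """
    totals = {}
    for t, point_id, value in samples:
        if t1 <= t <= t2:
            totals[point_id] = totals.get(point_id, 0.0) + value
    return totals


# Hypothetical samples and event windows (start and end times in seconds).
samples = [(0.5, "screen", 0.2), (1.5, "screen", 0.4),
           (1.6, "door", 0.1), (3.2, "screen", 0.3)]
events = {"video starts": (1.0, 2.0), "video ends": (3.0, 4.0)}
attention_per_event = {
    name: sum(attention_in_window(samples, *window).values())
    for name, window in events.items()
}
```

Comparing the entries of `attention_per_event` then indicates which event coincided with the most accumulated attention.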

2D calculation method:

The 2D calculation method is a simplification of the 3D method above. It uses only the information contained in the projection of the objects in the room and of the observers' trajectories onto the X'Y' plane of the room coordinate system shown in Fig. 5. Therefore, instead of the position (x', y', z') and the three Euler angles (yaw, pitch, roll), the position (x', y') and a single Euler angle (yaw) are used. To calculate DFOA(P) in the coordinate system of the eye, only the projection onto the ZX plane of the observer's coordinate system shown in Fig. 4 is used. The change of variables to the X'Y' system of the room is then performed. Calculations with these 2D trajectories are easier to carry out, although they limit both the information that can be extracted and the scope of use.
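In the 2D case, the quantities needed at each frame reduce to the distance to a point and the horizontal angle between the viewing direction and that point. A minimal sketch, assuming yaw is measured counter-clockwise from the X' axis of the room; the function name is illustrative.

```python
import math


def angle_to_point_2d(head_xy, yaw, point_xy):
    """Distance and signed horizontal angle from the view direction to a point.

    ASSUMPTION: yaw is measured from the room X' axis, counter-clockwise.
    """
    dx = point_xy[0] - head_xy[0]
    dy = point_xy[1] - head_xy[1]
    d = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)
    # Wrap the difference into [-pi, pi] so angles near +/-180 deg behave.
    delta = math.atan2(math.sin(bearing - yaw), math.cos(bearing - yaw))
    return d, delta
```

The returned angle can be fed to the horizontal part of the angular weighting function and the distance to the inverse-square term.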

For the analysis of viewing times, three time definitions of the oriented trajectories can be considered:

• Dwell time: total time during which the person is visible to a camera at the point of interest.

• In-view time: total time during which the person sees the camera of the point of interest within the angle of the viewing cone (45°).

• Attention time: total time during which the person sees the camera of the point of interest within the angle of the attention cone (25°).

For each point p, the sequence {X_k}_i is generated, containing the angle to the point p at each frame k of trajectory i. From {X_k}_i, the sequences that satisfy X_k < in-view angle (viewing cone) and X_k < attention angle (attention cone) are extracted.

These sequences are denoted {X_k^iv} for in-view time and {X_k^at} for attention time. The sequences j lasting longer than 1 second are then selected:

$$T_i^{at} = \sum_{j} \mathrm{length}\left(X_{ij}^{at}\right) \cdot t_f$$
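The three time measures can be sketched for a single trajectory as follows. Several details are assumptions: the cone angles are treated as the maximum allowed angle to the point, a sequence is a run of consecutive frames inside the cone, and the frame period and angle values are hypothetical.

```python
def time_in_cone(angles_deg, cone_deg, frame_period, min_duration=1.0):
    """Total time spent in runs of consecutive frames with angle < cone_deg.

    Only runs lasting longer than `min_duration` seconds are counted.
    """
    total = 0.0
    run = 0
    for angle in list(angles_deg) + [float("inf")]:  # sentinel closes last run
        if abs(angle) < cone_deg:
            run += 1
            continue
        duration = run * frame_period
        if duration > min_duration:
            total += duration
        run = 0
    return total


FRAME_PERIOD = 0.25  # seconds per frame (hypothetical 4 frames per second)
# Angle to the point of interest at each frame of one trajectory, in degrees.
angles = [20.0] * 6 + [60.0] * 2 + [30.0] * 8 + [10.0] * 3

in_view_time = time_in_cone(angles, 45.0, FRAME_PERIOD)
attention_time = time_in_cone(angles, 25.0, FRAME_PERIOD)
dwell_time = len(angles) * FRAME_PERIOD
```

Summing these per-trajectory values over all trajectories gives the aggregate attention and in-view times.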

By summing over all the trajectories, the final values of the attention and in-view times can be calculated:

In some embodiments, the invention also provides a system for evaluating the attention of a subject passing through a given area. The system or device may be made up of two parts: an image acquisition unit with a DSP (Digital Signal Processing) module for persons/subjects, and a computing device or server.

The image acquisition unit consists, in particular, of one or more cameras, for example 3D cameras, with a CODEC and a DSP capable of managing the tracking of objects. This unit also sends data packets to the device or server (local or in the cloud).

In one embodiment, the server consists of a multiprocessor capable of performing mathematical operations, digital signal processing and graphical representation, on which the FOA detection algorithms and the trajectory and presence detection algorithms are executed.

The proposed invention may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or encoded as one or more instructions or code on a computer-readable medium.

The computer-readable medium includes computer storage media. The storage medium can be any available medium that can be accessed by a computer. By way of example, and not limitation, such a computer-readable medium may comprise RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic or solid-state disk storage, or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media. Any processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. Alternatively, the processor and the storage medium may reside as discrete components in a user terminal.

As used herein, computer program products comprising computer-readable media include all forms of computer-readable media except to the extent that such media are deemed to be non-statutory, transitory propagating signals.

The scope of the present invention is defined in the appended claims.

Claims

CLAIMS

1. Method for evaluating the attention of a subject who transits through a certain area, the method comprising:

receiving, by a processing unit, a sequence of images obtained by one or more 3D cameras arranged in an overhead (zenithal) position within an area of interest during a certain time interval, wherein the sequence of images includes at least a first subject and an environment of the area of interest with a plurality of objects; and

calculating, by the processing unit, the attention paid by the first subject to a given object of said plurality of objects by processing frames of the sequence of images, wherein the processing includes the following steps:

- calculating a trajectory of the first subject by calculating a position and angles of the head of the first subject in relation to spatial coordinates of surfaces that delimit at least part of the area of interest;

- for each of the points of the calculated trajectory:

- calculating a vision model of an eye of the first subject based on determining the surface occupied on the retina by the viewed object and for how long, and on calculating a distance from the first subject to the object;

- calculating a parameter relating to an attention focus density of the first subject on the object, taking into consideration the calculated eye vision model; and

- transforming said parameter relating to the attention focus density to a reference system of the area of interest, providing a transformed parameter relating to the attention focus density; and

- calculating a parameter relating to a visual focus of attention of said trajectory by summing each of the transformed parameters relating to the attention focus density.
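The per-point steps of claim 1 can be illustrated with a minimal Python sketch. It is one possible reading only, not the claimed implementation. The names `Sample`, `to_area_frame` and `accumulate_attention`, the 2D rigid transform (rotation by the horizontal head angle plus translation by the head position) and the grid cell size are all assumptions introduced for illustration; the density value of each sample is taken as already computed from the eye vision model.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    """One trajectory point (hypothetical structure, not from the patent)."""
    head_x: float    # head position in the area reference system, metres
    head_y: float
    yaw: float       # horizontal head rotation, radians
    forward: float   # viewed point in the subject frame: metres ahead
    left: float      # viewed point in the subject frame: metres to the left
    density: float   # attention focus density computed for this point


def to_area_frame(s: Sample) -> tuple[float, float]:
    """Map the viewed point from the subject frame to the area frame.

    Assumed transform: rotate by yaw, then translate by the head position.
    """
    c, sn = math.cos(s.yaw), math.sin(s.yaw)
    return (s.head_x + c * s.forward - sn * s.left,
            s.head_y + sn * s.forward + c * s.left)


def accumulate_attention(samples, cell=0.5):
    """Sum the transformed densities per grid cell and over the trajectory."""
    grid = defaultdict(float)
    for s in samples:
        ax, ay = to_area_frame(s)
        key = (math.floor(ax / cell), math.floor(ay / cell))
        grid[key] += s.density
    grid = dict(grid)
    return grid, sum(grid.values())
```

Under these assumptions, the second return value plays the role of the parameter relating to the visual focus of attention of the trajectory, while the grid shows where in the area of interest that attention was deposited.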

2. Method according to claim 1, wherein the surface occupied by the object on the retina is determined based on the calculation of a solid angle relating to the viewed object.
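As an illustration of what a solid-angle calculation can look like, the following Python sketch uses the standard closed-form result for a flat circular disk viewed on its axis. The disk shape, the on-axis geometry and the function name `disk_solid_angle` are assumptions chosen for simplicity; the claim does not state how the solid angle is obtained.

```python
import math


def disk_solid_angle(radius_m: float, distance_m: float) -> float:
    """Solid angle (steradians) of a disk seen from a point on its axis.

    Uses omega = 2*pi*(1 - d / sqrt(d^2 + r^2)).
    """
    if radius_m <= 0 or distance_m < 0:
        raise ValueError("radius must be positive and distance non-negative")
    return 2.0 * math.pi * (1.0 - distance_m / math.hypot(distance_m, radius_m))
```

For an object that is small compared with its distance, this value approaches the object's area divided by the square of the distance.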

3. Method according to claim 2, wherein the parameter relating to the attention focus density is proportional to the solid angle and inversely proportional to the square of the distance.

4. Method according to any one of the preceding claims, wherein the parameter relating to the attention focus density is further calculated taking into consideration a cosine function of an angle normal to the surface where the object resides.
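The proportionalities stated in claims 3 and 4 can be sketched in Python as follows. This is a hedged illustration: the multiplicative use of the cosine term, the proportionality constant `k` and the function name `attention_density` are assumptions, since the claims only state that the parameter is proportional to the solid angle, inversely proportional to the squared distance, and takes a cosine function into consideration.

```python
import math


def attention_density(solid_angle_sr: float,
                      distance_m: float,
                      normal_angle_rad: float = 0.0,
                      k: float = 1.0) -> float:
    """Assumed form: k * solid_angle * cos(angle) / distance**2."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return k * solid_angle_sr * math.cos(normal_angle_rad) / distance_m ** 2
```

With this form, doubling the distance at a fixed solid angle divides the density by four, and an angle of 60 degrees halves it relative to an angle of zero.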

5. Method according to any one of the preceding claims, wherein the head angles comprise a tilt (pitch) angle, a horizontal rotation (yaw) angle and a roll angle of the head.
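To show how such head angles can be turned into a viewing direction, the Python sketch below builds a unit vector from the tilt and horizontal rotation angles. The axis convention (z up, yaw measured about z from the x axis, positive pitch looking upwards), the use of head orientation as a stand-in for gaze, and the function name `gaze_direction` are assumptions, not taken from the patent. Under this convention the roll angle rotates the visual field about the viewing direction and does not change the direction itself.

```python
import math


def gaze_direction(pitch_rad: float, yaw_rad: float) -> tuple[float, float, float]:
    """Unit vector of the head's forward axis for the assumed convention."""
    cp = math.cos(pitch_rad)
    return (cp * math.cos(yaw_rad), cp * math.sin(yaw_rad), math.sin(pitch_rad))
```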

6. Method according to claim 1, wherein the sequence of images comprises a plurality of subjects, and wherein the processing unit, prior to calculating the attention, executes a segmentation algorithm on one or more frames of the sequence of images to detect and differentiate the different subjects.

7. Method according to any one of the preceding claims, wherein the area of interest comprises an enclosed space.

8. Method according to claim 4, wherein the surface where the object resides comprises a floor, a wall, a table, a shelf or a chair.

9. Computer program product including code instructions that, when executed by a computing system, implement a method for evaluating the attention of a subject who transits through a certain area according to claim 1.

10. Computer program product according to claim 9, wherein the code instructions are configured to calculate the surface occupied by the object on the retina based on a solid angle relating to the viewed object.

11. Computer program product according to claim 10, wherein the code instructions are configured to calculate the parameter relating to the attention focus density proportionally to the solid angle and inversely proportionally to the square of the distance.

12. Computer program product according to claim 9, 10 or 11, wherein the code instructions are further configured to calculate the parameter relating to the attention focus density taking into consideration a cosine function of an angle normal to the surface where the object resides.