WO2010060146A1 - Metric for quantifying attention and applications thereof - Google Patents


Info

Publication number
WO2010060146A1
Authority
WO
WIPO (PCT)
Prior art keywords
region
interest
metric
display
measure
Prior art date
Application number
PCT/AU2009/001547
Other languages
French (fr)
Inventor
Nicholas John Langdale-Smith
Timothy James Henry Edwards
Original Assignee
Seeing Machines Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2008906134A external-priority patent/AU2008906134A0/en
Application filed by Seeing Machines Limited filed Critical Seeing Machines Limited
Publication of WO2010060146A1 publication Critical patent/WO2010060146A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42201Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] biosensors, e.g. heat sensor for presence detection, EEG sensors or any limb activity sensors worn by the user
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223Cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44218Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program

Definitions

  • the present invention relates to measuring and producing a metric for human attention toward a known object or specified region.
  • the invention has been developed primarily for use as a method and apparatus for providing a quantitative measure of attentiveness and changes in attentiveness toward information that is displayed visually (in particular visual advertisements), and will be described hereinafter with reference to this application. However, it will be appreciated that the invention is not limited to this particular field of use.
  • a known method for measuring interest to a visual stimulus includes that taught by Vertegaal et al (United States Patent Application 2006/0110008).
  • This method tracks eye gaze using IR to detect pupil effect and glints. It teaches a measure of visual interest by determining the correlation of the subject's optical axis with the object of interest over a percentage of time that the object is on display.
  • this method relies on measurement of the eyes alone and on controlled lighting conditions (strong IR illumination).
  • the required illumination power scales with the fourth power of distance (d⁴), which places a limitation on the detection range of the system.
  • Another known method for measuring interest to a visual stimulus includes that taught by Cohen-Solal et al (United States Patent 6,873,710).
  • This method determines various demographic statistics, including gender, race or age statistics; the current size of the audience; how quickly the audience is changing; and how much attention the audience is paying to the presented advertising or information. It teaches a measure of interest for the audience as a whole by evaluating the audio or video information of the audience, to identify behaviour that suggests whether or not the audience is paying attention to the presented advertising or information.
  • this method does not include gaze direction or face orientation in its audience attention evaluation.
  • the measurement of attention described requires an audience that is captive in the sense that they are located in front of a single stimulus and are responding only to it, and the method therein assumes that there are no other stimuli in the environment that may be causing the responses.
  • Chardon et al. discloses a method and system for audience targeting of advertising where display time is auctioned to content providers based on obtained audience information. Although audience information is collected and made available to content providers based on their requirements (such as the standard marketing metric "opportunity to see") it is directed to audience-based content adaptation. A broadly applicable metric for quantifying human attention afforded to a known region of interest is not disclosed.
  • Van Erlach et al (United States Patent Application 20030179229) teaches a biometrically-driven customization of human-machine interfaces.
  • the content of a smart sign may be altered based on biometrics of a passerby.
  • this method does not include any specific measurement technique for attention but refers to a general method of determining the emotional state of a person from at least one obtained user biometric, for the singular purpose of providing biometrically determined feedback.
  • Such a system is likely to be improved upon by the invention herein, which is a specific technique to reliably determine the attentive state of an individual or group of people toward an object, under a diverse range of operational conditions.
  • the stimulus is visual, but may equally be or include stimuli to any of the senses.
  • Another object of the invention in its preferred form is to correlate the metric with other information derived or provided, such as demographic information, location, time, and date to provide statistics that can be used by a connected system.
  • a method for producing a metric indicative of attentiveness toward a region of interest comprising the steps of:
  • the metric is combined with demographic information.
  • a plurality of faces are detected.
  • the plurality of faces are detected over a period of time.
  • the one or more measures includes the orientation of the eyes toward the region of interest.
  • the one or more measures includes the orientation of the face toward the region of interest.
  • the one or more measures includes any one or more selected from the group comprising
  • the metric is a weighted sum or average of each of the one or more measures.
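As a hedged illustration of this combination step (the measure names and weights below are hypothetical examples, not values taken from the specification), a weighted average of per-face measures might look like:

```python
# Illustrative sketch: combining per-face attentiveness measures
# (each in [0, 1]) into a single metric as a weighted average.
# Measure names and weights are hypothetical.

def combine_measures(measures, weights):
    """Weighted average of per-face attentiveness measures."""
    total_weight = sum(weights[name] for name in measures)
    if total_weight == 0:
        return 0.0
    return sum(measures[name] * weights[name] for name in measures) / total_weight

weights = {"eye_orientation": 0.5, "face_orientation": 0.3, "proximity": 0.2}
face_measures = {"eye_orientation": 0.9, "face_orientation": 0.8, "proximity": 0.6}
metric = combine_measures(face_measures, weights)  # 0.81
```

A plain weighted sum (omitting the division) would behave the same way up to scale; the average form keeps the metric in [0, 1] regardless of how many measures are available for a given face.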
  • the image data is captured by a plurality of video cameras.
  • the method further includes the step of updating the region of interest in response to said metric.
  • the region of interest is a digital display.
  • the region of interest is any one selected from the group comprising a:
  • the metric is used in calculating an advertising rating or market value of the region of interest.
  • an apparatus for producing a metric indicative of attentiveness toward a region of interest comprising:
  • a processor adapted to detect at least one face within said image data, calculate one or more measures for each said face indicative of its attentiveness toward said region of interest and combine each said measure to provide said metric indicative of its attentiveness toward said region of interest.
  • the processor of the apparatus updates the region of interest in response to the metric.
  • the region of interest of the apparatus is a digital display.
  • the region of interest is any one selected from the group comprising a: a. television;
  • the metric is used in calculating an advertising rating or market value of the region of interest.
  • FIG. 1 is an example flowchart of a method for evaluating an attention metric using a single camera
  • FIG. 2 is a partial flowchart of a method according to FIG. 1, showing the evaluation of an attention value for a single face from a single camera;
  • FIG. 3 is an example plan view of the orientation of an individual for whom the metric is being calculated.
  • a metric (“facetime”) is disclosed that in an embodiment quantifies the level of human attentiveness towards a known object or region of space.
  • attentiveness can include interest toward an advertisement.
  • the "facetime" metric is designed for measuring human attentiveness toward a specific object or region of interest in space that may include specific displayed information, under diverse operational conditions, such as in a mall or outdoor shopping area, where there may be strong incident lighting and a plurality of stimuli.
  • This attentiveness is indicated by many factors including the number and proximity of faces to the displayed information, and for each face: the orientation of the face toward the displayed information; the length of time looking at (or facing) the displayed information; the movement of the face with respect to the displayed information; and any changes in facial geometry (e.g. indicating interaction or emotional reaction) that occur when looking at the displayed information.
  • the attentiveness metric is calculated by tracking aspects of humans in nearby proximity of the advertisement using image information obtained from a camera or set of cameras which are located so as to observe any potential onlookers.
  • These aspects can include any one or more of: the number of onlookers; the distance, position, and orientation of the onlooker's face, eyes, and facial features; presence and orientation of glasses or sunglasses; skin colour; and shadow geometry.
  • the metric may be used in a system where the quantified metric of attention may be used in conjunction and correlated with other information, such as demographic information.
  • the quantified metric can be correlated with changes in the advertisement.
  • the quantified metric can be fed-back to the advertisement, allowing the advertisement to change or respond.
  • the metric can indicate attentiveness toward a digital sign, traditional printed sign or other display used for advertising purposes.
  • the metric is preferably indicative of the effectiveness, and hence correlates to the market value, of each advertisement displayed.
  • the metric is used as a factor in an advertisement display network system that incorporates one or more locations, where the metric is used to influence one or more of: the locations an advertisement is shown; the price of showing the advertisement at a particular time and duration; the type and/or content of the advertisement shown, for example to maximise saliency and relevance.
  • the influence of the metric is such that the market value of advertisement locations changes (e.g. increases) with changed (e.g. increased) levels of attention.
  • tracking techniques can be employed to complement the metric data with demographic information such as approximate viewer age, gender, and race.
  • Attention metric data gathered for a particular advertisement can provide a number of useful benefits to the advertiser or owner of the advertising space.
  • the data can be correlated across results from other advertisements, either measured at the same display location or at different locations. Other correlations can also be made, including the time of day, date, day of week, location of the sign or demographics of the viewers, for providing further information to advertisers such as when certain advertisements are best displayed.
  • the interrelationship of such data such as time-of-day and viewer age or gender demographics, can also be analysed and provide useful information.
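One plausible way to tabulate such correlations is a simple group-by aggregation. This is a sketch only; the record fields and grouping keys are illustrative assumptions, not the patent's method:

```python
from collections import defaultdict

# Hypothetical records: each observation carries the attention metric
# plus auxiliary data (hour of day, gender) for correlation.
records = [
    {"hour": 9,  "gender": "f", "metric": 0.7},
    {"hour": 9,  "gender": "m", "metric": 0.5},
    {"hour": 17, "gender": "f", "metric": 0.9},
    {"hour": 17, "gender": "f", "metric": 0.7},
]

def average_metric_by(records, *keys):
    """Average attention metric grouped by the given record keys."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in keys)
        sums[group] += r["metric"]
        counts[group] += 1
    return {g: sums[g] / counts[g] for g in sums}

by_hour = average_metric_by(records, "hour")
by_hour_gender = average_metric_by(records, "hour", "gender")
```

The same pattern extends to any of the dimensions the text mentions (date, day of week, sign location, race, age band) by adding keys.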
  • a method can include the steps of:
  • step (d) also includes delivering auxiliary demographic information (for example age, sex, race) that is available to be correlated to the metric.
  • auxiliary information for example age, sex, race
  • the auxiliary information can be determined in real-time by suitable means and/or be stored information retrieved locally or via a data network.
  • a comparative value for individual advertisements can be calculated from the metric.
  • the metrics of individual stimuli can be compared against other metrics derived at the same site to determine which particular advertisement provokes more attention from passersby, and also for instance correlated with sex, age, race demographics.
  • the metrics of identical advertisements across differently positioned display regions can also be compared to determine the relative value of particular display sites.
  • An owner of advertising space can also use information derived from the metric, for example, when setting a price to charge the advertiser.
  • the price can be modulated by time of the day, amount of passers-by, or peak occurrences of certain demographics when compared across different display locations.
  • the relevance of the advertisement can also be indicated by the metric.
  • affording attention to the stimuli, and/or having a particular demographic can be used in quantifying relevance.
  • the metric provides the foundation for calculating a value indicative of the relevance of displayed information.
  • an owner of advertising space can use the information derived from the metric to allow targeted advertising to be displayed at all times. It would be appreciated that this allows the owner of advertising space to improve the saliency of advertising displayed and/or the potential to maximise their revenue.
  • the metric is used as the basis for a unit of measurement for attention with an associated economic value.
  • the price of the advertising spaces, at a particular point in time can be set based on their past, current or predicted levels of available attention units.
  • the attention units can be associated with other data such as time and date, demographics and location.
  • the cumulated attention units of a particular sign could be broken down by gender, age, and so forth. It would be appreciated that attention, now measurable and unitised, can be traded according to the particular requirements of the content provider. For example, a content provider may wish to purchase 25% of the available attention units in a certain geographical location and timeslot.
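As a purely illustrative arithmetic sketch of such a purchase (all figures are invented, not drawn from the specification):

```python
# A content provider buys 25% of a location/timeslot's predicted
# attention units at an assumed per-unit price.
available_units = 12000        # predicted attention units for the slot (made up)
unit_price = 0.02              # assumed price per attention unit
share = 0.25                   # fraction of available units purchased

units_bought = share * available_units   # 3000.0 units
cost = units_bought * unit_price
```

In practice the available-unit figure would come from the past, current, or predicted metric values described above, and the per-unit price from the market mechanism the owner of the advertising space operates.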
  • the hardware requirement can be relatively cheap, for example, a camera capable of providing periodic snapshots that are made available on a network or stored locally for later retrieval and analysis.
  • This camera would preferably have night-vision capabilities or be complemented by a suitable light-source such as an infra-red beacon.
  • This camera would preferably be weather-proof and tamper-proof.
  • the disclosed metric constitutes an improvement by preferably including simple face-detection and any one or more measures indicative of face tracking, head pose determination, and tracking facial geometry as it changes shape over time. It would be appreciated that face-detection can be potentially performed more consistently and at greater distances because faces are larger than eyes and are less often covered or hidden from view. This embodiment can therefore provide a more robust metric for a wide-range of applications including determination of demographic specific viewer reaction. Examples of relevant tracking apparatus include faceLAB, DSS, or faceAPI available from Seeing Machines Ltd. (http://www.seeingmachines.com).
  • a reaction to an advertisement can be determined more accurately than by simply counting eyes (or faces), by observing the angular orientation toward an advertisement correlated with the duration of time spent at a particular orientation angle. Similarly, if gaze direction tracking is possible, the rotation amount and duration of observation may be used also. It would also be appreciated that the detection and tracking of eyeglasses, including sunglasses, would also be useful when calculating the metric as people that wear eyeglasses are themselves a valuable demographic category, and the frames of glasses can be used to locate the orientation of the head more accurately.
  • the method for tracking eye glasses is as set out in PCT Publication WO 2007/062478 entitled "Visual Tracking of Eye glasses in Visual Head and Eye Tracking Systems" the disclosure of which is hereby incorporated by cross reference. Detection of eye closure can also be useful when calculating the metric as well as combination with other information such as viewer alertness through calculating measures such as PERCLOS (percentage of eye closure).
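A minimal sketch of a PERCLOS-style calculation over a window of per-frame eyelid-closure estimates. The 80% closure threshold is a common convention in the drowsiness literature and is assumed here, not specified by the patent:

```python
# PERCLOS sketch: fraction of frames in which the eyes are at least
# 80% closed. closure_samples are per-frame eyelid-closure fractions
# in [0, 1], as produced by the eyelid-closure estimate described above.

def perclos(closure_samples, closed_threshold=0.8):
    if not closure_samples:
        return 0.0
    closed = sum(1 for c in closure_samples if c >= closed_threshold)
    return closed / len(closure_samples)

samples = [0.1, 0.2, 0.9, 0.95, 0.3, 0.85, 0.1, 0.2, 0.15, 0.1]
alertness_indicator = perclos(samples)  # 3 of 10 samples closed -> 0.3
```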
  • a metric can be indicative of attentiveness toward, and reaction to a, or region of a, television; plasma or LCD screen; computer screen; status display; advertising display; information display or kiosk; gauge or readout; control interface; art gallery or museum installation; billboard; phone; web-browser image; road-sign; shop window display; or any object or region of space that has the potential to demand or require human visual attention.
  • the method of providing stimuli can include an information screen (such as a plasma or LCD display), a static two-dimensional display (for example a simple poster), or a three-dimensional display (for example a shop window display).
  • the metric is used to generate a visualisation of attention paid to particular regions of the known object.
  • the visualisation could be a heat map.
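A minimal sketch of such a heat-map accumulation (the grid resolution and the gaze-point format are assumptions for illustration, not the patent's representation):

```python
# Accumulate estimated gaze intersections with the display into a
# coarse grid; higher cell counts indicate regions attracting more
# attention, which can be rendered as a heat map.

def build_heatmap(gaze_points, width, height, cols=8, rows=4):
    """gaze_points: (x, y) display coordinates of estimated gaze hits."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in gaze_points:
        if 0 <= x < width and 0 <= y < height:
            grid[int(y * rows / height)][int(x * cols / width)] += 1
    return grid

heatmap = build_heatmap([(10, 10), (10, 12), (500, 200)], width=640, height=320)
```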
  • the application of the metric would allow long term observation and accrual of data. It would be appreciated that significant amounts of data can be aggregated when the metric is used within a network of advertisement displays.
  • the metric can be used to determine the real-time effectiveness of the advertisement being displayed and correlated to other data, such as demographic data and/or emotional reaction. It would be appreciated that since the content of the advertisement is known and/or can be analysed, the advertisement content when correlated to the metric and other data allows valuable market research data to be generated.
  • a metric is indicative of human attentiveness toward a visual display and derived from continual detection, measurement, and tracking of faces and facial features (including eyes, lips, eyebrows, eyelids) relative to the orientation of the known object(s) or display region(s). Facial feature tracking is employed such as provided by faceAPI (http://www.faceapi.com), which is available from Seeing Machines Ltd. Demographic details and emotional reactions to the stimulus can also be determined and logged.
  • a metric can be used to provide an advertising "rating" or market value (and therefore sale price) to a visual display region supporting a visual stimulus.
  • visual stimulus include: television; plasma or LCD screen; computer screen; status display; advertising display; information display or kiosk; gauge or readout; control interface; art gallery or museum installation; billboard; phone; web-browser image; road-sign; shop window display; or any object or region of space that has the potential to demand or require human visual attention.
  • the metric can be used to provide a measure indicative of the response (or attentiveness) to a visual stimulus, for example an image displayed at the visual display region.
  • the metric can be used to provide a measure indicative of the change in response (or attentiveness) toward different images displayed at the visual display region, i.e. the relative response for the purpose of ranking the effectiveness of each image, video or information display such as to maximise saliency, relevance, and/or revenue.
  • the metric can be used to provide a measure indicative of the price for a display region. It would be appreciated that a visual display region can be sold or auctioned on the Internet using the metric to set/influence the price for the region.
  • the metric changes with changed human observation of a region of space over time; more observations of the same region produce a higher value.
  • the degree of individual observation can also influence the value.
  • the degree of observation can be further derived from face and/or eyeball orientation relative to the position of the display region.
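The plan-view geometry this suggests can be sketched as follows. This is 2-D only, and the cosine falloff and clamping are illustrative assumptions, not the patent's formula:

```python
import math

# Degree of observation as a function of the angle between the face's
# forward direction and the direction from the face to the display.

def observation_degree(face_pos, face_dir, display_pos):
    to_display = (display_pos[0] - face_pos[0], display_pos[1] - face_pos[1])
    dot = face_dir[0] * to_display[0] + face_dir[1] * to_display[1]
    norm = math.hypot(*face_dir) * math.hypot(*to_display)
    if norm == 0:
        return 0.0
    # 1.0 when facing the display directly, 0.0 at or beyond 90 degrees
    return max(0.0, dot / norm)

# Face at the origin looking along +x, display straight ahead:
degree = observation_degree((0.0, 0.0), (1.0, 0.0), (3.0, 0.0))  # 1.0
```

The same dot-product construction applies with an eyeball (gaze) direction vector in place of the face direction when gaze tracking is available.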
  • an additional value indicative of emotional reaction to the stimulus, derived from facial feature tracking, can also influence the metric.
  • Referring to FIG. 1, an embodiment of a method for quantifying the level of attentiveness towards a known object is disclosed. This method comprises
  • producing a metric indicative of overall visual attention 140 paid to a display region.
  • the absence of a face, or inability to track a face can also provide information when performing calculation of the metric. It would be appreciated that video information includes data indicative of one or more images.
  • an apparatus for producing a metric indicative of attentiveness toward a visual stimulus comprises a means for capturing video data; and a processor adapted to detect at least one face within the video data, calculate one or more measures for each face indicative of its attentiveness toward the visual stimulus and combine each of the measures to provide the metric indicative of its attentiveness toward the visual stimulus.
  • one or more cameras 210 are located proximal to the display region (region of interest) 220 such that images of onlookers 230 can be captured. This apparatus comprises:
  • a processor (not shown) adapted to: detect and track at least one face; perform a calculation of a metric; and produce a metric indicative of visual attention paid to a display region 220.
  • the geometry involved in the calculation of the metric may include:
  • a camera 210 can include infra-red detection (or other forms of night vision) to enable operation at night or in conditions of low light. Infrared illumination may also be included.
  • the camera can include an onboard processor for performing the disclosed method.
  • the image or video data captured by the camera 210 can be transmitted to an external processor, for example by an IR camera to a processor connected via a data network.
  • Measurements from the images that influence the metric include:
  • the period of observation. For example, the amount of time an individual is in detectable proximity to the region of interest;
  • Referring to FIG. 3, an embodiment of the algorithm for quantifying the level of attentiveness of an individual towards a known object is disclosed.
  • a tiered system is used, with the individual's attention metric dependent on the particular regime at time t.
  • the regime utilised is dependent on the individual's distance from, and orientation to, the camera, and therefore offers varied levels of precision.
  • the image obtained from camera (210 in Figure 2) is analysed. If face data 300 is detected, the data of each individual present in the image is processed. For each individual, the distance of the face from the camera is calculated 302 to determine the appropriate tracking method. If within a predetermined range (K1) 302 the face orientation is calculated 303. If the orientation and distance of the face to the camera are within a predetermined range (K2, K3) 304 then the eye regions are determined 305. The eyelid closure of each eye is detected and combined into a single estimate of eyelid closure 306. If the eyes are sufficiently open (K4) 307, the gaze direction is calculated for each eye 308 and combined into an individual gaze direction angle 309.
  • Factors X(d) and Y(td) are normalised functions (e.g. sigmoid curves) that are used to take account of the distance of the face from the region of interest and the duration that the face is detected in front of the region of interest; td is reset 315 once a face is no longer visible to the camera 314. These factors are combined with the gaze direction value to produce an attention metric for the individual at time t 310.
  • the attention metric at time t is based on face orientation 311 combined with X(d) and Y(td) 312. However, if the face distance is unacceptable for face orientation determination then head position angle 313 is combined with X(d) and Y(td) to produce the attention metric for the individual at time t 317.
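A hedged sketch of this tiered combination follows. The specification names the normalised factors X(d) and Y(td) but not their exact form; the sigmoid parameters and the fallback order used here are placeholders:

```python
import math

def X(d, d0=5.0, k=1.0):
    """Distance factor: nearer faces weigh more (falling sigmoid)."""
    return 1.0 / (1.0 + math.exp(k * (d - d0)))

def Y(td, t0=2.0, k=1.0):
    """Duration factor: sustained detection weighs more (rising sigmoid)."""
    return 1.0 / (1.0 + math.exp(-k * (td - t0)))

def attention(d, td, gaze=None, face=None, head=None):
    """Per-individual attention at time t under the tiered regime.

    Alignment values are in [0, 1], 1 meaning oriented straight at the
    region of interest. Gaze direction is used when available, else
    face orientation, else head position angle.
    """
    for alignment in (gaze, face, head):
        if alignment is not None:
            return alignment * X(d) * Y(td)
    return 0.0

# An onlooker 2 m away, tracked for 4 s, with tracked gaze alignment 0.9:
a = attention(d=2.0, td=4.0, gaze=0.9)
```

The per-individual values produced this way are then combined across all detected faces into the overall attention level for the region of interest at time t, as described below.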
  • the attention measurements from each face are combined into an overall attention level estimate to the region of interest at time t.
  • the number of faces is recorded and thus can be used as supplemental data or incorporated into the attention metric.
  • the thresholds K1-K4 could alternatively be calculated in real time.
  • biometric data of the onlooker is recorded, such as iris imagery or facial feature data through standard methods.
  • the processor updates the visual stimulus in response to the metric.
  • the visual stimulus is typically selected from the group comprising television; plasma or LCD screen; computer screen; status display; advertising display; information display or kiosk; gauge or readout; control interface; art gallery or museum installation; billboard; phone; web-browser image; road-sign; shop window display; or any object or region of space that has the potential to demand or require human visual attention.
  • the visual stimulus is preferably a digital display.
  • the metric can be used to calculate an advertising rating or market value of the visual stimulus.
  • the means for capturing video information includes one or more cameras. Preferably multiple cameras are used to provide enough coverage when the display size is very large.
  • the method for capturing face pose is as set out in United States Patent 7,043,056 entitled “Facial Image Processing System” the disclosure of which is hereby incorporated by cross reference.
  • the movement and/or position of the onlooker is used to alter the content of the region of interest.
  • a view-dependent rendering effect such as desktop-based virtual reality
  • the onlooker can have the sensation that they are in a virtual world and/or the content is directed at them alone.
  • demographic data can also be determined for each identified face.
  • demographic data can include age, gender and race. It would be appreciated that for displays capable of real-time updating, for example digital displays, this data can be used to tailor advertising to the onlooker in real-time.
  • a metric can be indicative of the real-time effectiveness and cost effectiveness of the advertisement being displayed, which is correlated to the probability that an onlooker observed it.
  • an embodiment of the disclosed method and apparatus can quantify the level of human attentiveness toward and reaction to visually displayed information. It would also be appreciated that this metric is applicable to situations where a quantified response to other stimuli is required.
  • the stimulus can be directed to any of the senses, such as a smell or sound.
  • the illustrated method evaluates a substantially real-time quantifiable measure of attentiveness toward a known object.
  • a processor coupled to the camera can perform the image capture, face detection, and calculate a metric.
  • the metric can be communicated via data network to a central site associated with collating metrics for selecting a stimulus to display.
  • a camera without processing capability can communicate a video stream to a processor (for example via a data network) for processing the video stream to compute the metric.
  • the processors can communicate with a further computer system for enabling the retrieval of statistics for viewing. In some embodiments the selection of what is displayed, and the price to be paid for displaying it, can be adjusted based on the statistics retrieved.
  • a digital display solution with an updating display would include data network connectivity for displaying new material.
  • the data network connection could also be used to communicate either the computed metric or the video data to another processor.
  • a metric can be calculated locally without a data network connection.
  • the metric can be collected after-the-fact for non-real-time usage.
  • some of the embodiments are described herein as a method or combination of elements of a method that can be implemented by a processor of a computer system or by other means of carrying out the function.
  • a processor with the necessary instructions for carrying out such a method or element of a method forms a means for carrying out the method or element of a method.
  • an element described herein of an apparatus embodiment is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.

Abstract

A method is disclosed for producing a metric indicative of attentiveness toward a region of interest (135). The method comprises the steps of: capturing (110) image data in proximity to said region of interest; detecting (120) the presence of one or more faces within said image data; calculating (130) one or more measures for each one of said faces, said measures indicative of attentiveness toward said region of interest; and combining (140) each said measure to provide said metric indicative of attentiveness toward said region of interest. The region of interest may include a visual stimulus, a stimulus to any other of the senses or a combination of stimuli. Corresponding apparatus are also disclosed.

Description

METRIC FOR QUANTIFYING ATTENTION AND APPLICATIONS THEREOF
FIELD OF THE INVENTION
[0001] The present invention relates to measuring and producing a metric for human attention toward a known object or specified region.
[0002] The invention has been developed primarily for use as a method and apparatus for providing a quantitative measure of attentiveness and changes in attentiveness toward information that is displayed visually (in particular visual advertisements), and will be described hereinafter with reference to this application. However, it will be appreciated that the invention is not limited to this particular field of use.
BACKGROUND OF THE INVENTION
[0003] Any discussion of the prior art throughout the specification should in no way be considered as an admission that such prior art is widely known or forms part of the common general knowledge in the field.
[0004] A known method for measuring interest in a visual stimulus is that taught by Vertegaal et al (United States Patent Application 2006/0110008). This method tracks eye gaze using IR to detect the pupil effect and glints. It teaches a measure of visual interest by determining the correlation of the subject's optical axis with the object of interest over a percentage of the time that the object is on display. Disadvantageously, this method relies on measurement of the eyes alone and on controlled lighting conditions (strong IR illumination). This presents a number of limitations, since visual interest can only be quantified when the eyes are visible (e.g. the person is facing the stimulus and the person's eyes are not obscured by eyewear) and when lighting noise (e.g. sunlight, dynamically lit environments) does not interfere with the illumination conditions. Furthermore, the required illumination power scales with the fourth power of distance (d4), which places a limitation on the detection range of the system.
[0005] Another known method for measuring interest to a visual stimulus includes that taught by Cohen-Solal et al (United States Patent 6,873,710). This method determines various demographic statistics, including gender, race or age statistics; the current size of the audience; how quickly the audience is changing; and how much attention the audience is paying to the presented advertising or information. It teaches a measure of interest for the audience as a whole by evaluating the audio or video information of the audience, to identify behaviour that suggests whether or not the audience is paying attention to the presented advertising or information. Disadvantageously, this method does not include gaze direction or face orientation in its audience attention evaluation. Furthermore, the measurement of attention described requires an audience that is captive in the sense that they are located in front of a single stimulus and are responding only to it, and the method therein assumes that there are no other stimuli in the environment that may be causing the responses.
[0006] Chardon et al. (PCT Patent Application WO 2007/120686) discloses a method and system for audience targeting of advertising where display time is auctioned to content providers based on obtained audience information. Although audience information is collected and made available to content providers based on their requirements (such as the standard marketing metric "opportunity to see") it is directed to audience-based content adaptation. A broadly applicable metric for quantifying human attention afforded to a known region of interest is not disclosed.
[0007] Van Erlach et al (United States Patent Application 20030179229) teaches a biometrically-driven customization of human-machine interfaces. In particular it teaches that the content of a smart sign may be altered based on biometrics of a passerby. Disadvantageously, this method does not include any specific measurement technique for attention but refers to a general method of determining the emotional state of a person from at least one obtained user biometric, for the singular purpose of providing biometrically determined feedback. Such a system is likely to be improved upon by the invention herein, which is a specific technique to reliably determine the attentive state of an individual or group of people toward an object, under a diverse range of operational conditions.
[0008] There is a need in the art to improve the accuracy and robustness of the attention measurement process. This is because existing techniques often rely on controlled conditions and neglect to adapt the technique based on onlooker proximity to the region of interest. Additionally, there exists a need to provide a broadly applicable, quantitative measure of human attention to enable statistical gathering and decision making based on the level of human attention afforded to a particular known region of space or area of interest.
OBJECT OF THE INVENTION
[0009] It is an object of the present invention to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.
[00010] It is an object of the invention in its preferred form to provide a method of evaluating a substantially real-time quantifiable measure of attentiveness toward a specific object, region of space, or stimulus, by an individual or group of people under potentially diverse illumination conditions and wherein there may be a plurality of stimuli influencing attention. Preferably the stimuli is visual, but may equally be or include stimuli to any of the senses. Another object of the invention in its preferred form is to correlate the metric with other information derived or provided, such as demographic information, location, time, and date to provide statistics that can be used by a connected system.
SUMMARY OF THE INVENTION
[00011] According to a first aspect of the invention there is provided a method for producing a metric indicative of attentiveness toward a region of interest, said method comprising the steps of:
a. capturing image data in proximity to said region of interest;
b. detecting the presence of one or more faces within said image data;
c. calculating one or more measures for each one of said faces, said measures indicative of attentiveness toward said region of interest; and
d. combining each said measure to provide said metric indicative of its attentiveness toward said region of interest.
Preferably the metric is combined with demographic information.
Preferably a plurality of faces are detected.
Preferably the plurality of faces are detected over a period of time.
Preferably the one or more measures includes the orientation of the eyes toward the region of interest.
Preferably the one or more measures includes the orientation of the face toward the region of interest.
Preferably the one or more measures includes any one or more selected from the group comprising:
a. the number of faces;
b. the distance of faces from the region of interest;
c. the orientation of the eyes toward the region of interest;
d. the orientation of the face toward the region of interest;
e. the length of time looking at the region of interest;
f. the length of time facing the region of interest;
g. facial reaction to the region of interest; and
h. relative changes in facial expressions when altering the region of interest.
Preferably the metric is a weighted sum or average of each of the one or more measures.
Preferably the image data is captured by a plurality of video cameras.
Preferably the method further includes the step of updating the region of interest in response to said metric.
Preferably the region of interest is a digital display.
Preferably the region of interest is any one selected from the group comprising a:
a. television;
b. plasma or LCD screen;
c. computer screen;
d. status display;
e. advertising display;
f. information display or kiosk;
g. gauge or readout;
h. control interface;
i. art gallery or museum installation;
j. billboard;
k. phone;
l. web-browser image;
m. road-sign;
n. shop window display; or
o. any object or region of space that has the potential to demand or require human visual attention.
Preferably the metric is used in calculating an advertising rating or market value of the region of interest.
[00012] According to a second aspect of the invention there is provided an apparatus for producing a metric indicative of attentiveness toward a region of interest, said apparatus comprising:
a. a means for capturing image data; and
b. a processor adapted to detect at least one face within said image data, calculate one or more measure for each said face indicative of its attentiveness toward said region of interest and combine each said measure to provide said metric indicative of its attentiveness toward said region of interest.
Preferably the processor of the apparatus updates the region of interest in response to the metric.
Preferably the region of interest of the apparatus is a digital display.
Preferably the region of interest is any one selected from the group comprising a:
a. television;
b. plasma or LCD screen;
c. computer screen;
d. status display;
e. advertising display;
f. information display or kiosk;
g. gauge or readout;
h. control interface;
i. art gallery or museum installation;
j. billboard;
k. phone;
l. web-browser image;
m. road-sign;
n. shop window display; or
o. any object or region of space that has the potential to demand or require human visual attention.
Preferably the metric is used in calculating an advertising rating or market value of the region of interest.
BRIEF DESCRIPTION OF THE DRAWINGS
[00013] A preferred embodiment of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:
[00014] FIG. 1 is an example flowchart of a method for evaluating an attention metric using a single camera;
[00015] FIG. 2 is a partial flowchart of a method according to FIG 1, showing the evaluation of an attention value for a single face from a single camera;
[00016] FIG. 3 is an example plan view of the orientation of an individual for whom the metric is being calculated.
PREFERRED EMBODIMENT OF THE INVENTION
[00017] A metric ("facetime") is disclosed that in an embodiment quantifies the level of human attentiveness towards a known object or region of space.
[00018] By way of example, attentiveness can include interest toward an advertisement.
[00019] The "facetime" metric is designed for measuring human attentiveness toward a specific object or region of interest in space, which may include specific displayed information, under diverse operational conditions, such as in a mall or outdoor shopping area where there may be strong incident lighting and a plurality of stimuli. This attentiveness is indicated by many factors, including the number and proximity of faces to the displayed information and, for each face: the orientation of the face toward the displayed information; the length of time looking at (or facing) the displayed information; the movement of the face with respect to the displayed information; and any changes in facial geometry (e.g. indicating interaction or emotional reaction) that occur when looking at the displayed information.
[00020] In an embodiment, the attentiveness metric is calculated by tracking aspects of humans in nearby proximity of the advertisement using image information obtained from a camera or set of cameras which are located so as to observe any potential onlookers. These aspects can include any one or more of: the number of onlookers; the distance, position, and orientation of the onlooker's face, eyes, and facial features; presence and orientation of glasses or sunglasses; skin colour; and shadow geometry.
[00021] In an embodiment the metric may be used in a system where the quantified metric of attention is used in conjunction with, and correlated against, other information, such as demographic information. Each onlooker can be classified against a set of pre-determined demographic categories by tracking aspects which may include any one or more of: the height of the face above the ground; the colour of the skin; the texture of the skin; the geometric shape of the face; the texture and colour of the eyes and irises; the presence and shape of glasses or sunglasses; and the presence and location of any jewellery or head adornment such as hats, helmets, etc.
[00022] In an embodiment the quantified metric can be correlated with changes in the advertisement.
[00023] In an embodiment the quantified metric can be fed-back to the advertisement, allowing the advertisement to change or respond.
[00024] By way of example only, the metric can indicate attentiveness toward a digital sign, traditional printed sign or other display used for advertising purposes. For this type of display, the metric is preferably indicative of the effectiveness, and hence correlates to the market value, of each advertisement displayed.
[00025] In an embodiment, by way of example only, the metric is used as a factor in an advertisement display network system that incorporates one or more locations, where the metric is used to influence one or more of: the locations an advertisement is shown; the price of showing the advertisement at a particular time and duration; the type and/or content of the advertisement shown, for example to maximise saliency and relevance. The influence of the metric is such that the market value of advertisement locations changes (e.g. increases) with changed (e.g. increased) levels of attention. Furthermore, tracking techniques can be employed to complement the metric data with demographic information such as approximate viewer age, gender, and race.
[00026] Attention metric data gathered for a particular advertisement can provide a number of useful benefits to the advertiser or owner of the advertising space. In a network system, the data can be correlated across results from other advertisements, either measured at the same display location or at different locations. Other correlations can also be made, including the time of day, date, day of week, location of the sign or demographics of the viewers, for providing further information to advertisers such as when certain advertisements are best displayed. The interrelationship of such data, such as time-of-day and viewer age or gender demographics, can also be analysed and provide useful information.
[00027] By way of example only, a method can include the steps of:
(a) Placing a camera near the known object or region of interest.
(b) Performing face detection and where conditions are suitable, tracking of the head-pose, lips, eyelids and gaze direction.
(c) Using the tracking information to determine the level of attention in a particular time window.
(d) Correlating attention level with the advertisement that is shown.
(e) Using the correlated information to provide feedback to the advertiser and/or owner of the advertising space.
[00028] In another embodiment, step (d) also includes delivering auxiliary demographic information (for example age, sex, race) that is available to be correlated to the metric. The auxiliary information can be determined in real-time by suitable means and/or be stored information retrieved locally or via a data network.
[00029] A comparative value for individual advertisements can be calculated from the metric. The metrics of individual stimuli can be compared against other metrics derived at the same site to determine which particular advertisement provokes more attention from passersby, and also for instance correlated with sex, age, race demographics. The metrics of identical advertisements across differently positioned display regions can also be compared to determine the relative value of particular display sites.
[00030] An owner of advertising space can also use information derived from the metric, for example, when setting a price to charge the advertiser. Using the information provided by the metric, the price can be modulated by time of the day, amount of passers-by, or peak occurrences of certain demographics when compared across different display locations.
[00031] The relevance of the advertisement can also be indicated by the metric. By way of example, affording attention to the stimuli, and/or having a particular demographic, can be used in quantifying relevance. The metric provides the foundation for calculating a value indicative of the relevance of displayed information.
[00032] By way of example, an owner of advertising space can use the information derived from the metric to allow targeted advertising to be displayed at all times. It would be appreciated that this allows the owner of advertising space to improve the saliency of advertising displayed and/or the potential to maximise their revenue.
[00033] In an embodiment, the metric is used as the basis for a unit of measurement for attention with an associated economic value. In this way, the price of the advertising spaces, at a particular point in time, can be set based on their past, current or predicted levels of available attention units. The attention units can be associated with other data such as time and date, demographics and location. By way of example only, the accumulated attention units of a particular sign could be broken down by gender, age, and so forth. It would be appreciated that attention, now measurable and unitised, can be traded according to the particular requirements of the content provider. For example, a content provider may wish to purchase 25% of the available attention units in a certain geographical location and timeslot.
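The trading of attention units described above can be illustrated with a minimal sketch. The site names, accrued unit totals and the purchased fraction below are invented for the example and are not taken from the specification.

```python
# Hypothetical illustration of trading "attention units": the site
# names, unit totals and purchased fraction are all invented values.
site_units = {
    "mall_entrance": 1200.0,  # attention units accrued in a timeslot
    "food_court": 800.0,
}

requested_fraction = 0.25  # content provider buys 25% of the units

available = sum(site_units.values())
purchased = requested_fraction * available  # units bought in this timeslot
```

Under this sketch the content provider's 25% purchase corresponds to 500 of the 2000 units accrued across the two hypothetical sites.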
[00034] In an embodiment, the hardware requirement can be relatively cheap, for example, a camera capable of providing periodic snapshots that are made available on a network or stored locally for later retrieval and analysis. This camera would preferably have night-vision capabilities or be complemented by a suitable light-source such as an infra-red beacon. This camera would preferably be weather-proof and tamper-proof.
[00035] It would be appreciated that just using eye detection or eye tracking for evaluating the effectiveness and market cost of advertising is limiting. In an embodiment, the disclosed metric constitutes an improvement by preferably including simple face-detection and any one or more measures indicative of face tracking, head pose determination, and tracking facial geometry as it changes shape over time. It would be appreciated that face-detection can be potentially performed more consistently and at greater distances because faces are larger than eyes and are less often covered or hidden from view. This embodiment can therefore provide a more robust metric for a wide-range of applications including determination of demographic specific viewer reaction. Examples of relevant tracking apparatus include faceLAB, DSS, or faceAPI available from Seeing Machines Ltd. (http://www.seeingmachines.com). Furthermore, tracking methods and systems as described in United States Patent US 7043056 and PCT Patent Application Publication Numbers WO 2003/081532, WO 2004/003849, WO 2007/062478, and WO 2008/106725, by the same applicant, are hereby incorporated by cross reference.
[00036] By way of example, a reaction to an advertisement can be determined more accurately than by simply counting eyes (or faces), by observing the angular orientation toward an advertisement correlated with the duration of time spent at a particular orientation angle. Similarly, if gaze direction tracking is possible, the rotation amount and duration of observation may also be used. It would also be appreciated that the detection and tracking of eyeglasses, including sunglasses, is useful when calculating the metric, as people that wear eyeglasses are themselves a valuable demographic category, and the frames of glasses can be used to locate the orientation of the head more accurately. In one embodiment, the method for tracking eyeglasses is as set out in PCT Publication WO 2007/062478 entitled "Visual Tracking of Eye glasses in Visual Head and Eye Tracking Systems", the disclosure of which is hereby incorporated by cross reference. Detection of eye closure can also be useful when calculating the metric, as can combination with other information such as viewer alertness, through calculating measures such as PERCLOS (percentage of eye closure).
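As a sketch of the PERCLOS idea mentioned above (the proportion of time the eyes are closed beyond a threshold), assuming a 0.8 closure threshold and per-frame closure samples in [0, 1]; real implementations differ in windowing and threshold choice:

```python
def perclos(closure_samples, threshold=0.8):
    """Fraction of samples whose eyelid closure meets or exceeds the
    threshold; closure_samples holds per-frame values in [0, 1]."""
    if not closure_samples:
        return 0.0
    closed = sum(1 for c in closure_samples if c >= threshold)
    return closed / len(closure_samples)

# Ten frames, three of which show near-closed eyes: PERCLOS = 0.3.
samples = [0.1, 0.2, 0.9, 0.85, 0.3, 0.95, 0.1, 0.2, 0.1, 0.1]
alertness_measure = perclos(samples)
```

A low PERCLOS value over the observation window would indicate an alert onlooker, complementing the attention metric as described.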
[00037] By way of example only, a metric can be indicative of attentiveness toward, and reaction to, any of the following (or a region thereof): a television; plasma or LCD screen; computer screen; status display; advertising display; information display or kiosk; gauge or readout; control interface; art gallery or museum installation; billboard; phone; web-browser image; road-sign; shop window display; or any object or region of space that has the potential to demand or require human visual attention. The method of providing stimuli can include an information screen (such as a plasma or LCD display), a static two-dimensional display (for example a simple poster), or a three-dimensional display (for example a shop window display).
[00038] In an embodiment, the metric is used to generate a visualisation of attention paid to particular regions of the known object. By way of example, the visualisation could be a heat map.
[00039] It would be appreciated that valuable information can be incorporated within the metric by visual classification of the age, gender, race and other potential demographics of viewers. By quantifying the effectiveness of an advertisement in reaching potential consumers, and feeding this data back, advertising effectiveness can be improved. It would also be appreciated that tracking consumer exposure to advertisements is relevant for both marketing research and accounting purposes.
[00040] In an embodiment, the application of the metric would allow long term observation and accrual of data. It would be appreciated that significant amounts of data can be aggregated when the metric is used within a network of advertisement displays. By way of example, the metric can be used to determine the real-time effectiveness of the advertisement being displayed and correlated to other data, such as demographic data and/or emotional reaction. It would be appreciated that since the content of the advertisement is known and/or can be analysed, the advertisement content when correlated to the metric and other data allows valuable market research data to be generated.
[00041] In an embodiment a metric is indicative of human attentiveness toward a visual display and derived from continual detection, measurement, and tracking of faces and facial features (including eyes, lips, eyebrows, eyelids) relative to the orientation of the known object(s) or display region(s). Facial feature tracking is employed such as provided by faceAPI (http://www.faceapi.com), which is available from Seeing Machines Ltd. Demographic details and emotional reactions to the stimulus can also be determined and logged.
[00042] By way of example, a metric can be used to provide an advertising "rating" or market value (and therefore sale price) to a visual display region supporting a visual stimulus. Examples of visual stimulus include: television; plasma or LCD screen; computer screen; status display; advertising display; information display or kiosk; gauge or readout; control interface; art gallery or museum installation; billboard; phone; web-browser image; road-sign; shop window display; or any object or region of space that has the potential to demand or require human visual attention. By way of a further example, the metric can be used to provide a measure indicative of the response (or attentiveness) to a visual stimulus, for example an image displayed at the visual display region. By way of a further example, the metric can be used to provide a measure indicative of the change in response (or attentiveness) toward different images displayed at the visual display region, i.e. the relative response for the purpose of ranking the effectiveness of each image, video or information display such as to maximise saliency, relevance, and/or revenue. By way of another example, the metric can be used to provide a measure indicative of the price for a display region. It would be appreciated that a visual display region can be sold or auctioned on the Internet using the metric to set/influence the price for the region.
[00043] In an embodiment, the metric changes with changed human observation of a region of space over time. More observations of the same region produce a higher value. In an embodiment, the degree of individual observation can also influence the value. The degree of observation can be further derived from face and/or eyeball orientation relative to the position of the display region. In an embodiment, an additional value indicative of emotional reaction to the stimulus, derived from facial feature tracking, can also influence the metric.
[00044] Referring to FIG. 1, an embodiment of a method for quantifying the level of attentiveness towards a known object is disclosed. This method comprises
> capturing image data 110;
> detecting face locations and boundary regions 120;
> performing a calculation of a metric for each face 130 towards known object 135 (e.g. known location (G) and orientation (Nroi)); and
> producing a metric indicative of overall visual attention 140 paid to a display region.
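The four steps 110-140 above can be sketched as a simple pipeline; the detector and the per-face measure below are stubs standing in for real face tracking, not the actual implementation:

```python
def capture_image():
    """Step 110: capture image data near the region of interest (stub)."""
    return "frame"

def detect_faces(frame):
    """Step 120: detect faces in the image (stub returning two faces)."""
    return [{"id": 1}, {"id": 2}]

def face_measure(face):
    """Step 130: per-face attentiveness measure (stub value)."""
    return 0.5

def attention_metric():
    """Step 140: combine per-face measures into the overall metric."""
    faces = detect_faces(capture_image())
    return sum(face_measure(f) for f in faces)
```

With these stubs, two detected faces each contributing 0.5 yield an overall metric of 1.0 for the frame.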
[00045] In an embodiment, the absence of a face, or inability to track a face, can also provide information when performing calculation of the metric. It would be appreciated that video information includes data indicative of one or more images.
[00046] By way of example, an apparatus for producing a metric indicative of attentiveness toward a visual stimulus comprises a means for capturing video data; and a processor adapted to detect at least one face within the video data, calculate one or more measures for each face indicative of its attentiveness toward the visual stimulus and combine each of the measures to provide the metric indicative of its attentiveness toward the visual stimulus.
[00047] Referring to FIG. 2, in an embodiment, one or more cameras 210 are located proximal to the display region (region of interest) 220 such that images of onlookers 230 can be captured. This apparatus comprises:
> a means for capturing video information 210;
> a processor (not shown) adapted to:
o detect and track at least one face;
o perform a calculation of a metric; and
o produce a metric indicative of visual attention paid to a display region 220.
[00048] It would be appreciated that multiple cameras would be appropriate in some situations. The geometry involved in the calculation of the metric may include:
> The position of the centre of the known region of interest (G) 221.
> The origin of the camera reference frame (O) 211.
> The position of the onlooker's head in space (P) 231 measured from the image, located at the midpoint of the line that joins the centres of the left and right eyeball spheres 232, 233.
> The distance (d) 212 between the points O and P.
> The vector (Troi) 222 from point G to P.
> The camera normal (Nc) 213, orthogonal to the image plane (view axis).
> The region of interest normal (Nroi) 223, which defines the front of the known region of interest 220.
> The face normal (Nf) 234, measured from the image using 3D head-tracking.
> Directions of gaze for the left (EgI) 235 and right (Egr) 236 eyes, measured from the image.
> The unified gaze direction (Eg) 237 calculated from EgI and/or Egr.
> The angles between the display region planar normal (Nroi) and Eg, Nf, and Troi, annotated as (θ) 224, (β) 225, and (ψ) 226, respectively.
> The angle (α) 227 between the face normal and vector from camera origin O to the head position P.
> Eye lid closure (Ec) 238 calculated from one or both eyes.
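The angles θ, β, and ψ above are each the angle between the region-of-interest normal (Nroi) and one of the measured vectors (Eg, Nf, Troi). A small helper for such angle calculations, with illustrative vector values, might look like this:

```python
import math

def angle_between(u, v):
    """Angle in radians between two 3-D vectors (clamped for safety)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

# Illustrative values: display normal along +z, face normal along -z,
# i.e. the onlooker faces the display head-on (beta = pi radians).
n_roi = (0.0, 0.0, 1.0)    # region-of-interest normal (Nroi) 223
n_face = (0.0, 0.0, -1.0)  # face normal (Nf) 234
beta = angle_between(n_roi, n_face)
```

The same helper applies unchanged to the gaze angle θ (Nroi versus Eg) and the head-position angle ψ (Nroi versus Troi).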
[00049] In an embodiment a camera 210 can include infra-red detection (or other forms of night vision) to enable operation at night or in conditions of low light. Infra-red illumination may also be included. The camera can include an onboard processor for performing the disclosed method. Alternatively, the image or video data captured by the camera 210 can be transmitted to an external processor, for example by an IR camera to a processor connected via a data network.
[00050] Measurements from the images that influence the metric include:
> The number of faces detected;
> For each face, angular values relating to gaze direction (θ) 224, face direction (β) 225, and head position (ψ) 226;
> For each face, the period of observation. For example, the amount of time an individual is in detectable proximity to the region of interest;
> For each face, its distance from the region of interest;
> For each face, its facial expressions, such as eyebrow, lip and eyelid movements, for example to derive emotional reaction.
[00051] Referring to FIG. 3, an embodiment of the algorithm for quantifying the level of attentiveness of an individual towards a known object is disclosed. A tiered system is used, with the basis of the individual's attention metric dependent on the particular regime at time t. The regime utilised depends on the individual's distance from, and orientation to, the camera; the regimes therefore offer varied levels of precision.
[00052] The image obtained from the camera (210 in Figure 2) is analysed. If face data 300 is detected, the data of each individual present in the image is processed. For each individual, the distance of the face from the camera is calculated 302 to determine the appropriate tracking method. If within a predetermined range (K1) 302, the face orientation is calculated 303. If the orientation and distance of the face to the camera are within a predetermined range (K2, K3) 304 then the eye regions are determined 305. The eyelid closure of each eye is detected and combined into a single estimate of eyelid closure 306. If the eyes are sufficiently open (K4) 307, the gaze direction is calculated for each eye 308 and combined into an individual gaze direction angle 309. Factors X(d) and Y(td) are normalised functions (e.g. sigmoid curves) that are used to take account of the distance of the face from the region of interest and the duration that the face is detected in front of the region of interest; td is reset 315 once a face is no longer visible to the camera 314. These factors are combined with the gaze direction value to produce an attention metric for the individual at time t 310. On the other hand, if the face orientation and/or distance from the camera is unacceptable for eye region detection then the attention metric at time t is based on face orientation 311 combined with X(d) and Y(td) 312. However, if the face distance is unacceptable for face orientation determination then the head position angle 313 is combined with X(d) and Y(td) to produce the attention metric for the individual at time t 317.
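A minimal sketch of this tiered regime selection follows. The threshold values K1 and K4, the sigmoid parameters inside X(d) and Y(td), and the face dictionary fields are all assumptions made for the example; the specification leaves their concrete values open, and the K2/K3 orientation check is folded into a single flag here.

```python
import math

K1 = 5.0  # assumed max distance (m) at which face orientation is trackable
K4 = 0.8  # assumed eyelid-closure level above which gaze is unusable

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def X(d):
    """Normalised distance factor: nearer faces contribute more."""
    return sigmoid(4.0 - d)

def Y(td):
    """Normalised duration factor: longer-detected faces contribute more."""
    return sigmoid(td - 2.0)

def individual_attention(face, td):
    """Per-individual attention at time t, using the most precise
    regime (gaze > face orientation > head position) available."""
    w = X(face["distance"]) * Y(td)
    if face["distance"] <= K1 and face["orientation_trackable"]:
        if face["eye_closure"] < K4:
            return w * face["gaze_alignment"]      # gaze regime
        return w * face["face_alignment"]          # face-orientation regime
    return w * face["head_position_alignment"]     # head-position regime
```

A nearby onlooker with trackable gaze thus scores much higher than a distant one whose attention can only be inferred from head position, even for equal alignment values.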
[00053] The attention measurements from each face are combined into an overall attention level estimate for the region of interest at time t. The number of faces is recorded and can thus be used as supplemental data or incorporated into the attention metric. The thresholds K1-K4 could alternatively be calculated in real time.
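The pooling step might be sketched as below. The combination rule is left open by the disclosure (claim 8 contemplates a weighted sum or average of the measures), so the default plain average and the optional weights here are illustrative assumptions.

```python
def overall_attention(per_face_metrics, weights=None):
    """Combine per-face attention values into a single estimate for the
    region of interest at time t; also returns the face count, which the
    disclosure records as supplemental data."""
    if not per_face_metrics:
        return 0.0, 0
    if weights is None:
        weights = [1.0] * len(per_face_metrics)  # plain average by default
    total = sum(w * m for w, m in zip(weights, per_face_metrics))
    return total / sum(weights), len(per_face_metrics)
```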
[00054] In another embodiment, biometric data of the onlooker is recorded, such as iris imagery or facial feature data through standard methods.
[00055] In an embodiment the processor updates the visual stimulus in response to the metric. The visual stimulus is typically selected from the group comprising television; plasma or LCD screen; computer screen; status display; advertising display; information display or kiosk; gauge or readout; control interface; art gallery or museum installation; billboard; phone; web-browser image; road-sign; shop window display; or any object or region of space that has the potential to demand or require human visual attention. The visual stimulus is preferably a digital display. In this embodiment the metric can be used to calculate an advertising rating or market value of the visual stimulus.
[00056] In an embodiment the means for capturing video information includes one or more cameras. Preferably multiple cameras are used to provide enough coverage when the display size is very large. In one embodiment, the method for capturing face pose is as set out in United States Patent 7,043,056 entitled "Facial Image Processing System" the disclosure of which is hereby incorporated by cross reference.
[00057] In an embodiment, the movement and/or position of the onlooker (e.g. from head pose or gaze direction) is used to alter the content of the region of interest. By way of example, a view-dependent rendering effect (such as desktop-based virtual reality) is created in the content presented to the onlooker. By altering the content based on the onlooker's movement and/or position, the onlooker can have the sensation that they are in a virtual world and/or that the content is directed at them alone.
[00058] In an embodiment, demographic data can also be determined for each identified face. By way of example, the demographic data can include age, gender and race. It will be appreciated that for displays capable of real-time updating, for example digital displays, this data can be used to tailor advertising to the onlooker in real time.
[00059] It will be appreciated that the metric can be indicative of the real-time effectiveness and cost effectiveness of the advertisement being displayed, which is correlated with the probability that an onlooker observed it.
[00060] It will be appreciated that an embodiment of the disclosed method and apparatus can quantify the level of human attentiveness toward, and reaction to, visually displayed information. It will also be appreciated that the metric is applicable to situations where a quantified response to other stimuli is required. The stimulus can be directed to any of the senses, such as a smell or a sound.
[00061] It will be appreciated that the illustrated method evaluates a substantially real-time quantifiable measure of attentiveness toward a known object.
[00062] In a digital signage environment, by way of example, a processor coupled to the camera can perform the image capture and face detection and calculate the metric. The metric can be communicated via a data network to a central site associated with collating metrics for selecting a stimulus to display. Alternatively, a camera without processing capability can communicate a video stream to a processor (for example via a data network) that processes the video stream to compute the metric. Furthermore, the processors can communicate with a further computer system that enables the retrieval of statistics for viewing. In some embodiments, the selection of what is displayed, and the price to be paid for displaying it, can be adjusted based on the statistics retrieved.
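At the central site, selection and pricing from the collated statistics could look like the following toy policy. Both the scoring rule and the pricing formula are illustrative assumptions; the disclosure leaves the selection and pricing mechanisms open.

```python
def select_stimulus(stats):
    """Pick the next advertisement and adjust its display price from
    collated attention statistics. Field names (ad_id, mean_attention,
    base_price) are hypothetical, not taken from the disclosure."""
    best = max(stats, key=lambda s: s["mean_attention"])
    # Toy pricing rule: scale the base price by observed attention.
    price = best["base_price"] * (1.0 + best["mean_attention"])
    return best["ad_id"], round(price, 2)
```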
[00063] A digital display solution with an updating display, by way of example, would include data network connectivity for displaying new material. The data network connection could also be used to communicate either the computed metric or the video data to another processor.

[00064] For a shop window display, by way of example, a metric can be calculated locally without a data network connection. A data network connection could be used to communicate the computed metric to another processor.
[00065] For a museum or art-gallery environment, by way of example, a metric can be calculated locally without a data network connection. The metric can be collected after-the-fact for non-real-time usage.
[00066] Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to".
[00067] As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
[00068] Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may refer to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure, in one or more embodiments.
[00069] Similarly it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention.
[00070] Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.
[00071] Furthermore, some of the embodiments are described herein as a method or combination of elements of a method that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor with the necessary instructions for carrying out such a method or element of a method forms a means for carrying out the method or element of a method. Furthermore, an element described herein of an apparatus embodiment is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.
[00072] In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
[00073] Although the invention has been described with reference to specific examples, it will be appreciated by those skilled in the art that the invention may be embodied in many other forms.

Claims

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:
1. A method for producing a metric indicative of attentiveness toward a region of interest, said method comprising the steps of:
a. capturing image data in proximity to said region of interest;
b. detecting the presence of one or more faces within said image data;
c. calculating one or more measures for each one of said faces, each said measure indicative of attentiveness toward said region of interest; and
d. combining each said measure to provide said metric indicative of attentiveness toward said region of interest.
2. A method according to claim 1 wherein said metric is combined with demographic information.
3. A method according to claim 1 or claim 2 wherein a plurality of faces are detected.
4. A method according to claim 3 wherein said plurality of faces are detected over a period of time.
5. A method according to any one of the preceding claims wherein said measure includes the orientation of the eyes toward the region of interest.
6. A method according to any one of the preceding claims wherein said measure includes the orientation of the face toward the region of interest.
7. A method according to any one of the preceding claims wherein said measure includes any one or more selected from the group comprising
a. the number of faces;
b. the distance of faces from the region of interest;
c. the orientation of the eyes toward the region of interest;
d. the orientation of the face toward the region of interest;
e. the length of time looking at the region of interest;
f. the length of time facing the region of interest;
g. facial reaction to the region of interest; and
h. relative changes in facial expressions when altering the region of interest.
8. A method according to any one of the preceding claims wherein said metric is a weighted sum or average of each said measure.
9. A method according to any one of the preceding claims wherein said image data is captured by a plurality of video cameras.
10. A method according to any one of the preceding claims further comprising the step of updating the region of interest in response to said metric.
11. A method according to any one of the preceding claims wherein said region of interest is a digital display.
12. A method according to any one of the preceding claims wherein said region of interest is any one selected from the group comprising a:
a. television;
b. plasma or LCD screen;
c. computer screen;
d. status display;
e. advertising display;
f. information display or kiosk;
g. gauge or readout;
h. control interface;
i. art gallery or museum installation;
j. billboard;
k. phone;
l. web-browser image;
m. road-sign;
n. shop window display; or
o. any object or region of space that has the potential to demand or require human visual attention.
13. A method according to any one of the preceding claims wherein said metric is used in calculating an advertising rating or market value of said region of interest.
14. A method for producing a metric indicative of attentiveness toward a region of interest substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.
15. An apparatus for producing a metric indicative of attentiveness toward a region of interest, said apparatus comprising:
a. a means for capturing image data; and
b. a processor adapted to detect at least one face within said image data, calculate one or more measures for each said face indicative of its attentiveness toward said region of interest, and combine each said measure to provide said metric indicative of attentiveness toward said region of interest.
16. An apparatus according to claim 15 wherein said processor updates said region of interest in response to said metric.
17. An apparatus according to any one of claims 15 or 16 wherein said region of interest is a digital display.
18. An apparatus according to any one of claims 15 to 17 wherein said region of interest is any one selected from the group comprising a:
a. television;
b. plasma or LCD screen;
c. computer screen;
d. status display;
e. advertising display;
f. information display or kiosk;
g. gauge or readout;
h. control interface;
i. art gallery or museum installation;
j. billboard;
k. phone;
l. web-browser image;
m. road-sign;
n. shop window display; or
o. any object or region of space that has the potential to demand or require human visual attention.
19. An apparatus according to any one of claims 15 to 18 wherein said metric is used in calculating an advertising rating or market value of said region of interest.
20. An apparatus for producing a metric indicative of attentiveness toward a region of interest substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.
PCT/AU2009/001547 2008-11-27 2009-11-26 Metric for quantifying attention and applications thereof WO2010060146A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2008906134 2008-11-27
AU2008906134A AU2008906134A0 (en) 2008-11-27 Metric for quantifying attention and applications thereof

Publications (1)

Publication Number Publication Date
WO2010060146A1 true WO2010060146A1 (en) 2010-06-03

Family

ID=42225132

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2009/001547 WO2010060146A1 (en) 2008-11-27 2009-11-26 Metric for quantifying attention and applications thereof

Country Status (1)

Country Link
WO (1) WO2010060146A1 (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006060889A1 (en) * 2004-12-09 2006-06-15 Cmetrics Media Inc. Method and system for assessing viewership of a medium
WO2007128057A1 (en) * 2006-05-04 2007-11-15 National Ict Australia Limited An electronic media system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SMITH K. ET AL.: "Tracking the Multi Person Wandering Visual Focus of Attention", ICMI'06, 20 January 2010 (2010-01-20), ALBERTA CANADA, Retrieved from the Internet <URL:http://iad.iitb.fraunhofer.de/~iss/voit08.pdf> *

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8438590B2 (en) 2010-09-22 2013-05-07 General Instrument Corporation System and method for measuring audience reaction to media content
WO2012039902A1 (en) * 2010-09-22 2012-03-29 General Instrument Corporation System and method for measuring audience reaction to media content
US10354291B1 (en) 2011-11-09 2019-07-16 Google Llc Distributing media to displays
US11892626B2 (en) 2011-11-09 2024-02-06 Google Llc Measurement method and system
US11579442B2 (en) 2011-11-09 2023-02-14 Google Llc Measurement method and system
US8879155B1 (en) 2011-11-09 2014-11-04 Google Inc. Measurement method and system
US11127052B2 (en) 2011-11-09 2021-09-21 Google Llc Marketplace for advertisement space using gaze-data valuation
US10598929B2 (en) 2011-11-09 2020-03-24 Google Llc Measurement method and system
US9439563B2 (en) 2011-11-09 2016-09-13 Google Inc. Measurement method and system
US9952427B2 (en) 2011-11-09 2018-04-24 Google Llc Measurement method and system
US10469916B1 (en) 2012-03-23 2019-11-05 Google Llc Providing media content to a wearable device
US11303972B2 (en) 2012-03-23 2022-04-12 Google Llc Related content suggestions for augmented reality
US11792477B2 (en) 2012-04-16 2023-10-17 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
US10080053B2 (en) 2012-04-16 2018-09-18 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
US9485534B2 (en) 2012-04-16 2016-11-01 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
US8473975B1 (en) * 2012-04-16 2013-06-25 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
US10536747B2 (en) 2012-04-16 2020-01-14 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
US10986405B2 (en) 2012-04-16 2021-04-20 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
US8869183B2 (en) 2012-04-16 2014-10-21 The Nielsen Company (Us), Llc Methods and apparatus to detect user attentiveness to handheld computing devices
CN105830108A (en) * 2013-09-20 2016-08-03 交互数字专利控股公司 Verification Of Ad Impressions In User-Adptive Multimedia Delivery Framework
WO2015042472A1 (en) * 2013-09-20 2015-03-26 Interdigital Patent Holdings, Inc. Verification of ad impressions in user-adptive multimedia delivery framework
EP3475759A4 (en) * 2016-06-23 2020-04-22 Outernets, Inc. Interactive content management
CN110023832A (en) * 2016-06-23 2019-07-16 奥特涅茨公司 Interactive content management
CN109828662A (en) * 2019-01-04 2019-05-31 杭州赛鲁班网络科技有限公司 A kind of perception and computing system for admiring commodity
WO2020159768A1 (en) * 2019-01-30 2020-08-06 Oohms, Ny, Llc System and method of tablet-based distribution of digital media content
US11064255B2 (en) 2019-01-30 2021-07-13 Oohms Ny Llc System and method of tablet-based distribution of digital media content
US11671669B2 (en) 2019-01-30 2023-06-06 Oohms, Ny, Llc System and method of tablet-based distribution of digital media content
ES2785304A1 (en) * 2019-04-03 2020-10-06 Aguilar Francisco Arribas Audience measurement apparatus and procedure (Machine-translation by Google Translate, not legally binding)
CN110517094A (en) * 2019-08-30 2019-11-29 软通动力信息技术有限公司 A kind of visitor's data analysing method, device, server and medium
US20220284455A1 (en) * 2021-03-03 2022-09-08 Shirushi Inc. Purchasing analysis system, purchasing analysis method, and computer program

Similar Documents

Publication Publication Date Title
WO2010060146A1 (en) Metric for quantifying attention and applications thereof
US11892626B2 (en) Measurement method and system
US9952427B2 (en) Measurement method and system
US10365714B2 (en) System and method for dynamic content delivery based on gaze analytics
US8295542B2 (en) Adjusting a consumer experience based on a 3D captured image stream of a consumer response
US20130110617A1 (en) System and method to record, interpret, and collect mobile advertising feedback through mobile handset sensory input
JP6123140B2 (en) Digital advertising system
Dalton et al. Display blindness? Looking again at the visibility of situated displays using eye-tracking
CN103561635A (en) Gaze tracking system
US20130138499A1 (en) Usage measurent techniques and systems for interactive advertising
AU2011276637B2 (en) Systems and methods for improving visual attention models
WO2018127782A1 (en) Wearable augmented reality eyeglass communication device including mobile phone and mobile computing via virtual touch screen gesture control and neuron command
CN109670456A (en) A kind of content delivery method, device, terminal and storage medium
CA2687348A1 (en) Method and system for audience measurement and targeting media
KR20120087679A (en) Advertisement system using motion cognition
US11227307B2 (en) Media content tracking of users&#39; gazing at screens
Sippl et al. Real-time gaze tracking for public displays
KR102477231B1 (en) Apparatus and method for detecting interest in gazed objects
EP2685351A1 (en) Method for calibration free gaze tracking using low cost camera
US20220253893A1 (en) System and Method of Tracking the Efficacy of Targeted Adaptive Digital Advertising
EP2833308A1 (en) System and method for estimating views on display areas using eye tracking
Lange et al. Dikablis (digital wireless gaze tracking system)–operation mode and evaluation of the human machine interaction
KR20160100749A (en) Exhibition management system and method
US20170236151A1 (en) Systems, devices, and methods of providing targeted advertising
KR20220089461A (en) System and method for gate management based on the body information of passers watched contents

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09828444

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09828444

Country of ref document: EP

Kind code of ref document: A1