WO2009086683A1 - Automatic detection, labeling and tracking of team members in a video - Google Patents

Publication number
WO2009086683A1
Authority
WO
WIPO (PCT)
Prior art keywords
player
team
target player
regions
frame
Prior art date
Application number
PCT/CN2007/003986
Other languages
French (fr)
Inventor
Xiaofeng Tong
Jia Liu
Yimin Zhang
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation
Priority to PCT/CN2007/003986
Publication of WO2009086683A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

In some embodiments, an automatic detection, labeling and tracking of team members in a video is presented. In this regard, a method is introduced to receive a frame from a sports video, to identify a playing surface in the frame, to identify player regions on the playing surface, to transform pixels from the player regions into player models, to aggregate the player models from a plurality of frames, and to determine a team model to represent a first team and a second team based on clustering of player models. Other embodiments are also provided.

Description

AUTOMATIC DETECTION, LABELING AND TRACKING OF TEAM MEMBERS IN A VIDEO
FIELD OF THE INVENTION
Embodiments of the present invention generally relate to the field of video processing, and, more particularly to an automatic detection, labeling and tracking of team members in a video.
BACKGROUND OF THE INVENTION
Player detection, labeling and tracking is critical for the study of team tactics and player activities in TV broadcast sports video. While some progress has been made on this topic, it is still challenging due to the difficulties such as player-to-player occlusion, low discriminative appearance between players, varying number of players on the screen, abrupt camera motion, and video blur.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements, and in which:
FIG. 1 is a graphical illustration of an example frame of a video, in accordance with one example embodiment of the invention;
FIG. 2 is a flow chart of an example method for developing a player labeling module, in accordance with one example embodiment of the invention;
FIG. 3 is a flow chart of an example method for testing a player labeling module, in accordance with one example embodiment of the invention; and
FIG. 4 is a block diagram of an example article of manufacture including content which, when accessed by a device, causes the device to implement one or more aspects of one or more embodiment(s) of the invention.
DETAILED DESCRIPTION
Embodiments of the present invention are generally directed to an automatic detection, labeling and tracking of team members in a video. In this regard, in accordance with but one example implementation of the broader teachings of the present invention, a method is introduced to receive a frame from a sports video, to identify a playing surface in the frame, to identify player regions on the playing surface, to transform pixels from the player regions into player models, to aggregate the player models from a plurality of frames, and to determine a team model to represent a first team and a second team based on clustering of player models. Other embodiments are also disclosed and claimed.
Fig. 1 is a graphical illustration of an example frame of a video, in accordance with one example embodiment of the invention. Frame 100 is intended to represent x. In accordance with the illustrated example embodiment, frame 100 may include one or more of global view 102, playing surface 104, boundary 106, outer region 108, first team members 110, second team members 112, referee 114, ball 116, first team labels 118, second team labels 120, and referee label 122 coupled as shown in Fig. 1. While shown as being a frame of a soccer match, frame 100 may well be from a video of another sport, such as basketball or football, or any other type of video that would benefit from the teachings of the present invention. Global view 102 represents the type of view depicted. In one embodiment, global view 102 represents a view in which the frame is predominantly of the playing surface or field and several players, as opposed to a close-up view or a crowd view, for example. While shown as including two first team members 110 and two second team members 112 for simplicity, many more players may be present in frame 100. Playing surface 104 may be grass or another surface type that is predominantly solid and uniform in color and may be surrounded by boundary 106 that separates and distinguishes playing surface 104 from outer region 108, which may include, for example, spectators.
First team members 110 would have matching uniforms that are distinguishable from the uniforms worn by second team members 112 and further distinguishable from the uniform worn by referee 114. In one embodiment of the present invention, as described in more detail hereinafter, the differences in player and referee uniforms allow labels, such as first team labels 118, second team labels 120, and referee label 122, to be added to frame 100. In one embodiment, ball 116 and other anomalies on playing surface 104 are ignored for player detection, labeling and tracking purposes.
Fig. 2 is a flow chart of an example method for developing a player labeling module, in accordance with one example embodiment of the invention. It will be readily apparent to those of ordinary skill in the art that although the following operations may be described as a sequential process, many of the operations may in fact be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged without departing from the spirit of embodiments of the invention.
In one embodiment, method 200 begins with receiving (202) a frame of a sports video such as frame 100. In one example embodiment, frames that don't depict global view 102 are ignored. In one embodiment, a video processing system may grab one or two frames per second to process out of a 25 frame per second video source.
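The frame subsampling mentioned above can be illustrated with a minimal sketch. The function name and its parameters are hypothetical and not part of the original disclosure; the sketch only assumes that frames are addressed by integer index and that a fixed stride is acceptable.

```python
def sampled_frame_indices(total_frames, source_fps=25, target_fps=2):
    """Return indices of the frames to process so that roughly
    `target_fps` frames are kept per second of a `source_fps` video."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Fixed stride between processed frames (assumption: integer rates).
    step = max(1, source_fps // target_fps)
    return list(range(0, total_frames, step))
```

With the default stride of 12, a 25 frame per second source yields slightly more than two processed frames per second.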
Next is identifying (204) a playing surface in the frame, such as playing surface 104. In one embodiment, playing surface 104 is identified based on its color (perhaps green) that makes up a majority of frame 100. In one embodiment, the dominant color of playing surface 104 is learned by accumulating HSV color histograms. In one embodiment, playing surface 104 can be extracted from frame 100 through dominant color segmentation, morphological filtering and connected-component analysis.
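As an illustration of learning a dominant color from accumulated HSV histograms and segmenting with it, the following is a minimal sketch rather than the disclosed implementation. It assumes frames are lists of rows of (r, g, b) tuples, uses only the hue channel, and omits morphological filtering for brevity; the constants `H_BINS` and `MIN_SATURATION` and all function names are hypothetical.

```python
import colorsys

H_BINS = 36           # hypothetical hue resolution: 10 degrees per bin
MIN_SATURATION = 0.2  # hypothetical cutoff; near-gray pixels have no reliable hue

def hue_bin(pixel):
    """Map an (r, g, b) pixel in 0..255 to a hue bin, or None if near-gray."""
    r, g, b = pixel
    h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < MIN_SATURATION:
        return None
    return min(int(h * H_BINS), H_BINS - 1)

def accumulate_hue_histogram(frames, hist=None):
    """Add the hue of every saturated pixel in `frames` to a running histogram."""
    if hist is None:
        hist = [0] * H_BINS
    for frame in frames:
        for row in frame:
            for pixel in row:
                k = hue_bin(pixel)
                if k is not None:
                    hist[k] += 1
    return hist

def dominant_bin(hist):
    """Index of the most populated hue bin."""
    return max(range(len(hist)), key=lambda k: hist[k])

def field_mask(frame, dominant, tolerance=1):
    """Binary mask: 1 where the pixel hue lies within `tolerance` bins
    (circularly) of the dominant hue, 0 elsewhere."""
    mask = []
    for row in frame:
        mask_row = []
        for pixel in row:
            k = hue_bin(pixel)
            if k is None:
                mask_row.append(0)
            else:
                d = abs(k - dominant)
                mask_row.append(1 if min(d, H_BINS - d) <= tolerance else 0)
        mask.append(mask_row)
    return mask
```

The histogram can be carried across frames by passing the previous `hist` back in, which matches the idea of accumulating evidence for the surface color over time.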
Method 200 continues with identifying (206) player regions on the playing surface. In one embodiment, player regions comprise groupings of contrasting colors present on playing surface 104 of sufficient size, for example number of pixels. In one embodiment, ball 116 would be too small to be considered a player region.
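One way to picture the size test on candidate regions is a connected-component pass over a binary playing-surface mask, sketched below. This is an assumption-laden illustration: the mask is taken to be already restricted to the playing surface (1 = surface, 0 = contrasting pixel), connectivity is 4-neighbor, and the `min_area` value and function name are hypothetical.

```python
from collections import deque

def player_regions(surface_mask, min_area=4):
    """Return bounding boxes (top, left, bottom, right), inclusive, of
    4-connected groups of non-surface pixels with at least `min_area` pixels."""
    rows, cols = len(surface_mask), len(surface_mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if surface_mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill over one group of contrasting pixels.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area = 0
            top, left, bottom, right = r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                area += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and not surface_mask[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Small groups (for example the ball) are discarded.
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes
```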
Next is creating (208) player models. In one example embodiment, pixels from player regions are transformed into histograms which represent player models. In one embodiment, only pixels from the upper half of the player regions (and therefore more likely to include a team jersey worn on the upper body) are included in creating the player models. In one embodiment, a large pool of pixels is collected from player regions and transformed into CIE-Luv space. A Gaussian Mixture Model (GMM) may be estimated with N components by Expectation-Maximization (EM) clustering. Centers of these components are referred to as prototypes. Adjacent components with a small center distance are merged together. The resultant merged components are referred to as meta-prototypes. All player samples are represented as histograms by binning all pixels into the corresponding meta-prototype. In one embodiment, to model the players' appearance, EM clustering is applied again to estimate K clusters over the meta-prototype histograms of all player samples. The centers of these clusters are named sub-models.
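The merging of prototypes into meta-prototypes and the binning of upper-half pixels can be sketched as follows. The sketch assumes the prototype centers have already been obtained (for example from an EM-fitted GMM, which is not implemented here) and that pixels are already expressed in CIE-Luv. The greedy merge rule, the `min_dist` threshold and the function names are hypothetical choices made for illustration.

```python
import math

def merge_prototypes(prototypes, min_dist):
    """Greedily merge prototype centers closer than `min_dist`; each merged
    group is represented by the mean of its members (a meta-prototype)."""
    groups = []
    for p in prototypes:
        for g in groups:
            if math.dist(p, g["center"]) < min_dist:
                g["members"].append(p)
                n = len(g["members"])
                g["center"] = tuple(
                    sum(m[i] for m in g["members"]) / n for i in range(len(p))
                )
                break
        else:
            groups.append({"center": tuple(p), "members": [p]})
    return [g["center"] for g in groups]

def player_histogram(region_rows, meta_prototypes):
    """Normalized histogram over meta-prototypes, built from the upper half
    of a player region given as a list of rows of Luv pixels."""
    upper = region_rows[: max(1, len(region_rows) // 2)]
    counts = [0] * len(meta_prototypes)
    for row in upper:
        for px in row:
            nearest = min(
                range(len(meta_prototypes)),
                key=lambda k: math.dist(px, meta_prototypes[k]),
            )
            counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts
```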
Then, team models are created (210). In one embodiment, the player models are aggregated from a plurality of frames and team models for a first team and a second team are determined based on statistical analysis of predominant player models. In one embodiment, a referee model is also created and maintained. In one embodiment, the clusters are merged into four clusters with near absorption. Their centers are labeled real-models. A labeling function assigns each real-model and sub-model exactly one label in a label set LS = {Team A, Team B, Referee, Outlier}. The two real-models with the largest size are identified as Team A and Team B, as are their corresponding sub-models. A minimum average distance (MAD) from the other real-models to the two team sub-models may then be computed. The real-model with the smaller MAD may be labeled as Referee, and the other, with the larger MAD, may be labeled Outlier.
Fig. 3 is a flow chart of an example method for testing a player labeling module, in accordance with one example embodiment of the invention. It will be readily apparent to those of ordinary skill in the art that although the following operations may be described as a sequential process, many of the operations may in fact be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged without departing from the spirit of embodiments of the invention.
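The label assignment described for the team models of method 200 (two largest real-models become Team A and Team B, the remaining ones are split by MAD) can be sketched as below. The text does not fix how the MAD is computed, so the sketch makes an explicit assumption: for each remaining real-model, the Euclidean distance from its center is averaged over each team's sub-models, and the smaller of the two averages is taken. The data layout and function name are likewise hypothetical.

```python
import math

def label_real_models(real_models):
    """Label real-models given as dicts with keys 'size', 'center' and
    'sub_models' (a list of sub-model centers). Returns one label per
    input entry, in the same order."""
    order = sorted(range(len(real_models)),
                   key=lambda i: real_models[i]["size"], reverse=True)
    team_a, team_b, rest = order[0], order[1], order[2:]
    labels = [None] * len(real_models)
    labels[team_a], labels[team_b] = "Team A", "Team B"

    def average_distance(center, sub_models):
        return sum(math.dist(center, s) for s in sub_models) / len(sub_models)

    def mad(i):
        # Assumed reading of "minimum average distance": the smaller of the
        # average distances to Team A's and Team B's sub-models.
        center = real_models[i]["center"]
        return min(average_distance(center, real_models[t]["sub_models"])
                   for t in (team_a, team_b))

    rest.sort(key=mad)
    if rest:
        labels[rest[0]] = "Referee"   # smaller MAD
    for i in rest[1:]:
        labels[i] = "Outlier"         # larger MAD
    return labels
```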
According to but one example implementation, method 300 begins with receiving (302) a frame of a sports video such as frame 100. In one example embodiment, a video processing system may grab fewer frames per second to process than are available from the video source. Next, the frame is analyzed to determine (304) whether a global view is depicted.
In one embodiment, if a global view is not depicted, then the frame is not processed further. If a global view 102 is depicted in the frame 100 then it is further processed.
Next is identifying (306) a playing surface in the frame, such as playing surface 104. In one embodiment, playing surface 104 is identified based on its color (perhaps green) that makes up a majority of frame 100. In one embodiment, the dominant color of playing surface 104 is learned by accumulating HSV color histograms. In one embodiment, playing surface 104 can be extracted from frame 100 through dominant color segmentation, morphological filtering and connected-component analysis.
Method 300 continues with identifying (308) target player regions on the playing surface. In one embodiment, target player regions comprise groupings of contrasting colors present on playing surface 104 of sufficient size, for example number of pixels. In one embodiment, ball 116 would be too small to be considered a target player region. In one embodiment, a detector scans across the filtered image regions of playing surface 104 with multiple scales. Target player regions may represent areas with multiple responses.
Next is comparing (310) player regions to stored team models, such as those generated by method 200. In one embodiment, numerical representations for the target player regions are developed by performing boosted cascade detection on the upper half of the target player regions. In one embodiment, each target player region is represented by its meta-prototype histogram.
Lastly, team members are labeled and tracked (312). In one embodiment, a target player region is labeled as a member of a team if the target player region is sufficiently similar to a stored team model. In one embodiment, a target player region is assigned the label of the sub-model at the nearest Bhattacharyya distance. In one embodiment, first team labels 118 would be added to frame 100 around first team members 110, while second team labels 120 (different in color than first team labels 118) would be added to frame 100 around second team members 112. While shown as rectangles in Fig. 1, the team labels can be any shape and color. In one embodiment, referee label 122 would be added to frame 100 around referee 114 if the target player region is sufficiently similar to a stored referee model. In another embodiment, referee 114 or any other target player region (perhaps a goaltender) not sufficiently similar to one of the team models would be labeled with an outlier label (perhaps a different color rectangle). In one embodiment, tracking of team members includes maintaining a list of labeled target player regions for use with subsequent frames. In one embodiment, coordinates on frame 100 of a labeled team member are saved and utilized in the decision-making process of labeling a target player region. For example, if two team members are labeled in a frame and subsequently become occluded such that analysis of the single frame cannot reveal two team members, the stored list of labeled target player regions can be relied upon in part to maintain two labels. Also, for example, if a player temporarily becomes inverted or unrecognizable in a particular frame, the list of labeled target player regions can be utilized to assist in identifying an ambiguous target player region. In one embodiment, a list of labeled target player regions is purged if it is determined that an abrupt change in view, such as a camera switch, has occurred.
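Labeling by nearest Bhattacharyya distance can be illustrated with the sketch below. It assumes normalized histograms of equal length and uses the common definition of the distance as the negative logarithm of the Bhattacharyya coefficient; the text does not say which variant is intended. The optional `max_distance` cutoff for falling back to an outlier label, and the function names, are hypothetical.

```python
import math

def bhattacharyya_distance(p, q):
    """Distance between two normalized histograms: -ln of the
    Bhattacharyya coefficient, infinite when they do not overlap."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if coefficient <= 0.0:
        return float("inf")
    return -math.log(min(coefficient, 1.0))

def label_region(histogram, sub_models, max_distance=None):
    """Return the label of the nearest sub-model. `sub_models` is a list of
    (label, histogram) pairs. If `max_distance` is given and no sub-model is
    that close, the region is labeled "Outlier"."""
    label, distance = min(
        ((lab, bhattacharyya_distance(histogram, h)) for lab, h in sub_models),
        key=lambda pair: pair[1],
    )
    if max_distance is not None and distance > max_distance:
        return "Outlier"
    return label
```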
In one embodiment, a set of rectangles is used to represent the detected player regions at frame t. A player is enclosed by a rectangle, in which the binary mask segmented by dominant color and the HSV color histogram are taken as observations. A tracking module may find the players' correspondence between adjacent frames using the binary mask and color information. One aim of tracking is to find the correspondence of players in adjacent rectangles. Another aim is to discriminate false alarms and recover missed detections. Bi-directional tracking, that is, forward tracking from time t to t+1 and backward tracking from time t+1 to t, may be used to handle this problem. Adjacent rectangles may be an exact match if the overlap of their enclosed rectangles is sufficiently large (binary likelihood) and the similarity of their color histograms in HSV color space is large enough (color likelihood).
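A minimal sketch of the bi-directional matching idea follows. It departs from the text in stated ways: rectangle overlap (intersection over union) stands in for the binary-mask likelihood, the Bhattacharyya coefficient stands in for the color likelihood, and the thresholds, data layout and function names are hypothetical. A pair is kept only when the forward and backward searches agree.

```python
import math

def overlap_ratio(a, b):
    """Intersection over union of two rectangles (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def color_similarity(p, q):
    """Bhattacharyya coefficient of two normalized color histograms."""
    return sum(math.sqrt(x * y) for x, y in zip(p, q))

def best_match(det, candidates, min_overlap, min_color):
    """Index of the candidate that best matches `det`, or None."""
    best, best_score = None, 0.0
    for j, cand in enumerate(candidates):
        ov = overlap_ratio(det["box"], cand["box"])
        col = color_similarity(det["hist"], cand["hist"])
        if ov >= min_overlap and col >= min_color and ov * col > best_score:
            best, best_score = j, ov * col
    return best

def bidirectional_matches(dets_t, dets_t1, min_overlap=0.3, min_color=0.8):
    """Pairs (i, j) where detection i at time t and detection j at time t+1
    select each other in both the forward and the backward direction."""
    matches = []
    for i, det in enumerate(dets_t):
        j = best_match(det, dets_t1, min_overlap, min_color)          # forward
        if j is not None and best_match(dets_t1[j], dets_t,
                                        min_overlap, min_color) == i:  # backward
            matches.append((i, j))
    return matches
```

Detections left unmatched in both directions are candidates for false alarms, while a track that loses its match for a frame is a candidate for a missed detection.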
Fig. 4 illustrates a block diagram of an example storage medium comprising content which, when accessed, causes an electronic appliance to implement one or more aspects of the disclosed methods 200 and/or 300. In this regard, storage medium 400 includes content 402 (e.g., instructions, data, or any combination thereof) which, when executed, causes the appliance to implement one or more aspects of methods described above.
The machine-readable (storage) medium 400 may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of machine-readable media suitable for storing electronic instructions. Moreover, the present invention may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem, radio or network connection).
In the description above, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form.
Embodiments of the present invention may be used in a variety of applications. Although the present invention is not limited in this respect, the invention disclosed herein may be used in microcontrollers, general-purpose microprocessors, Digital Signal Processors (DSPs), Reduced Instruction-Set Computing (RISC), Complex Instruction-Set Computing (CISC), among other electronic components. However, it should be understood that the scope of the present invention is not limited to these examples.
Embodiments of the present invention may also be included in integrated circuit blocks referred to as core memory, cache memory, or other types of memory that store electronic instructions to be executed by the microprocessor or store data that may be used in arithmetic operations. In general, an embodiment using multistage domino logic in accordance with the claimed subject matter may provide a benefit to microprocessors, and in particular, may be incorporated into an address decoder for a memory device. Note that the embodiments may be integrated into radio systems or hand-held portable devices, especially when devices depend on reduced power consumption. Thus, laptop computers, cellular radiotelephone communication systems, two-way radio communication systems, one-way pagers, two-way pagers, personal communication systems (PCS), personal digital assistants (PDA's), cameras and other products are intended to be included within the scope of the present invention.
The present invention includes various operations. The operations of the present invention may be performed by hardware components, or may be embodied in machine-executable content (e.g., instructions), which may be used to cause a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the operations. Alternatively, the operations may be performed by a combination of hardware and software. Moreover, although the invention has been described in the context of a computing appliance, those skilled in the art will appreciate that such functionality may well be embodied in any of a number of alternate embodiments such as, for example, integrated within a communication appliance (e.g., a cellular telephone).
Many of the methods are described in their most basic form, but operations can be added to or deleted from any of the methods and information can be added to or subtracted from any of the described messages without departing from the basic scope of the present invention. Any number of variations of the inventive concept is anticipated within the scope and spirit of the present invention. In this regard, the particular illustrated example embodiments are not provided to limit the invention but merely to illustrate it. Thus, the scope of the present invention is not to be determined by the specific examples provided above but only by the plain language of the following claims.

Claims

What is claimed is:
1. A method comprising: receiving a frame from a sports video; identifying a playing surface in the frame; identifying player regions on the playing surface; transforming pixels from the player regions into player models; aggregating the player models from a plurality of frames; and determining a team model to represent a first team and a second team based on statistical analysis of player models.
2. The method of claim 1, wherein the sports video comprises a soccer video.
3. The method of claim 1, further comprising: ignoring a frame if it is determined that a global view is not depicted.
4. The method of claim 1, wherein transforming pixels from the player regions into player models comprises transforming pixels from an upper half of the player regions into player models.
5. The method of claim 1, further comprising: labeling a target player region as belonging to a team if the target player region is sufficiently similar to the team model.
6. The method of claim 1, further comprising: labeling a target player region as an outlier if the target player region is not sufficiently similar to the team models.
7. A method comprising: receiving a global view frame from a sports video; identifying a playing surface in the frame; identifying target player regions on the playing surface; developing numerical representations for the target player regions; and labeling a target player region as a member of a team if the target player region is sufficiently similar to a stored team model.
8. The method of claim 7, further comprising: maintaining a list of labeled target player regions for use with subsequent frames.
9. The method of claim 7, wherein labeling a target player region comprises placing a colored rectangle in the frame around the target player region.
10. The method of claim 7, further comprising: labeling a target player region as a referee if the target player region is sufficiently similar to a stored referee model.
11. The method of claim 7, wherein developing numerical representations for the target player regions comprises: performing a boosted cascade detection on an upper half of the target player regions.
12. The method of claim 7, wherein the sports video comprises a soccer video.
13. A storage medium comprising content which, when executed by an accessing machine, causes the accessing machine to receive a frame from a sports video, to identify a playing surface in the frame, to identify player regions on the playing surface, to transform pixels from the player regions into player models, to aggregate the player models from a plurality of frames, and to determine a team model to represent a first team and a second team based on statistical analysis of player models.
14. The storage medium of claim 13, wherein the sports video comprises a soccer video.
15. The storage medium of claim 13, further comprising content which, when executed by the accessing machine, causes the accessing machine to ignore a frame if it is determined that a global view is not depicted.
16. The storage medium of claim 13, wherein the content to transform pixels from the player regions into player models comprises content to transform pixels from an upper half of the player regions into player models.
17. The storage medium of claim 13, further comprising content which, when executed by the accessing machine, causes the accessing machine to label a target player region as belonging to a team if the target player region is sufficiently similar to the team model.
18. A storage medium comprising content which, when executed by an accessing machine, causes the accessing machine to receive a global view frame from a sports video, to identify a playing surface in the frame, to identify target player regions on the playing surface, to develop numerical representations for the target player regions, and to label a target player region as a member of a team if the target player region is sufficiently similar to a stored team model.
19. The storage medium of claim 18, further comprising content which, when executed by the accessing machine, causes the accessing machine to maintain a list of labeled target player regions for use with subsequent frames.
20. The storage medium of claim 18, wherein the content to label a target player region as a member of a team comprises content to place a colored rectangle in the frame around the target player region.
21. The storage medium of claim 18, further comprising content which, when executed by the accessing machine, causes the accessing machine to label a target player region as a referee if the target player region is sufficiently similar to a stored referee model.
22. The storage medium of claim 18, wherein the content to develop numerical representations for the target player regions comprises content to perform a boosted cascade detection on an upper half of the target player regions.
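
By way of illustration only, the operations recited in claims 1 and 4 to 6 may be sketched as follows. The sketch assumes that player regions have already been extracted from the playing surface and are supplied as rows of RGB pixels. The choice of a color histogram as the player model, the Bhattacharyya coefficient as the similarity measure, the two-cluster grouping as the "statistical analysis", the 0.8 threshold, and the names player_model, similarity, team_models and label_region are all assumptions made for this example; the claims do not specify them.

```python
import math
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]          # (R, G, B), each component in 0..255
Region = Sequence[Sequence[Pixel]]    # rows of pixels for one player region


def player_model(region: Region, bins: int = 4) -> List[float]:
    """Normalized color histogram of the upper half of a player region."""
    upper = region[: max(1, len(region) // 2)]
    hist = [0.0] * (bins ** 3)
    count = 0
    for row in upper:
        for r, g, b in row:
            # Quantize each channel into `bins` levels and flatten to one index.
            idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
            hist[idx] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Bhattacharyya coefficient: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(math.sqrt(p * q) for p, q in zip(a, b))


def team_models(models: Sequence[Sequence[float]], iterations: int = 10) -> List[List[float]]:
    """Group player models aggregated over many frames into two team models."""
    if not models:
        raise ValueError("at least one player model is required")
    first = models[0]
    # Seed the second team with the model least similar to the first one.
    second = min(models, key=lambda m: similarity(first, m))
    centers = [list(first), list(second)]
    for _ in range(iterations):
        groups: List[List[Sequence[float]]] = [[], []]
        for m in models:
            k = 0 if similarity(m, centers[0]) >= similarity(m, centers[1]) else 1
            groups[k].append(m)
        for k in (0, 1):
            if groups[k]:
                # The team model is the mean histogram of its members.
                centers[k] = [sum(col) / len(groups[k]) for col in zip(*groups[k])]
    return centers


def label_region(model: Sequence[float], teams: Sequence[Sequence[float]],
                 threshold: float = 0.8) -> Optional[int]:
    """Return the index of the most similar team, or None for an outlier."""
    scores = [similarity(model, t) for t in teams]
    best = max(range(len(teams)), key=lambda k: scores[k])
    return best if scores[best] >= threshold else None
```

In this sketch a region whose upper half matches neither team model, such as a referee or a spectator, falls below the threshold and is reported as an outlier. A practical system would likely tune the bin count and threshold to the footage.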
PCT/CN2007/003986 2007-12-29 2007-12-29 Automatic detection, labeling and tracking of team members in a video WO2009086683A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2007/003986 WO2009086683A1 (en) 2007-12-29 2007-12-29 Automatic detection, labeling and tracking of team members in a video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2007/003986 WO2009086683A1 (en) 2007-12-29 2007-12-29 Automatic detection, labeling and tracking of team members in a video

Publications (1)

Publication Number Publication Date
WO2009086683A1 (en) 2009-07-16

Family

ID=40852767

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2007/003986 WO2009086683A1 (en) 2007-12-29 2007-12-29 Automatic detection, labeling and tracking of team members in a video

Country Status (1)

Country Link
WO (1) WO2009086683A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194306A (en) * 2017-03-31 2017-09-22 上海体育学院 Sportsman's method for tracing and device in video
US20210097418A1 (en) * 2019-09-27 2021-04-01 Stats Llc System and Method for Improved Structural Discovery and Representation Learning of Multi-Agent Data

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2430830A (en) * 2005-09-28 2007-04-04 Univ Dundee Image sequence movement analysis system using object model, likelihood sampling and scoring
CN1992911A (en) * 2005-12-31 2007-07-04 中国科学院计算技术研究所 Target tracking method of sports video

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194306A (en) * 2017-03-31 2017-09-22 上海体育学院 Sportsman's method for tracing and device in video
CN107194306B (en) * 2017-03-31 2020-04-28 上海体育学院 Method and device for tracking ball players in video
US20210097418A1 (en) * 2019-09-27 2021-04-01 Stats Llc System and Method for Improved Structural Discovery and Representation Learning of Multi-Agent Data

Legal Events

Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 07855982; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 07855982; Country of ref document: EP; Kind code of ref document: A1