WO2015190071A1 - Video processing method and video processing device - Google Patents
Video processing method and video processing device
- Publication number: WO2015190071A1 (PCT/JP2015/002808)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- play
- frame
- video
- video processing
- processor
- Prior art date
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/87—Regeneration of colour television signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30221—Sports video; Sports image
Definitions
- the present technology relates to a video processing method and a video processing apparatus that perform processing on video data of a video of a sports game.
- American football and soccer are competitive sports that are particularly popular in the West.
- in sports such as these, video of a game is shot and analyzed, and the analysis results are fed back to practice and to the next game; highlight videos are also created.
- in American football, a period during which an offense/defense action called a "down" is performed (hereinafter referred to as "play") and a period during which no offense/defense action is performed alternate repeatedly. That is, the play sections are the more important parts when analyzing an American football game. Accordingly, it is desirable to be able to extract at least one of the start point and the end point of each play section efficiently and accurately from video data obtained by shooting an American football game.
- in recent years, research on analyzing videos of sports games (hereinafter referred to as "sports videos") has been actively conducted.
- technologies related to sports video analysis include video summarization methods that automatically extract highlight sections from long game videos to create highlight videos automatically, as well as formation recognition.
- a method has been proposed for extracting the start point of a play in an American football game based on color characteristics of the video (hue, saturation, brightness, etc.) and camera work (see, for example, Patent Document 1).
- a method has also been proposed for creating a highlight video by calculating the importance of each part of a sports video from the content and volume of Twitter (registered trademark) posts within a certain time window and determining key frames (see, for example, Non-Patent Document 1).
- conventionally proposed tactical analysis methods include play analysis, which records how each player behaved during a game (see, for example, Non-Patent Document 2), and tactical analysis, which records the behavior of the entire team (see, for example, Non-Patent Document 3).
- examples of applications include playback of highlight scenes and creation of a video featuring a specific player.
- a formation recognition method has also been proposed that automatically detects the scrimmage line, i.e., the initial formation, from video of an American football game and classifies the formation type (see, for example, Non-Patent Document 6).
- however, the method described in Patent Document 1 may lose accuracy depending on the color environment of the video and the camera work. The method described in Non-Patent Document 1 requires media information other than the sports video itself, such as Twitter (registered trademark) posts, and can therefore handle only large-scale broadcast video such as terrestrial broadcasts. The methods described in Non-Patent Document 2 and Non-Patent Document 3 require multiple camera views or manual detection or tracking of players and the ball. Furthermore, since the method described in Non-Patent Document 4 extracts only the initial formation, which is relatively easy to detect, it provides insufficient information for tactical analysis.
- the purpose of the present technology is to provide a video processing method and a video processing device that enable efficient and highly accurate extraction of a play section from a video of a sports game.
- This technology is a video processing method and a video processing device in which a processor processes video data of a video of a sports game.
- the processor inputs video data, calculates a motion amount of the players for each frame from the input video data, and, based on the calculated motion amount, estimates at least one of a start frame of a play in the game and an end frame, that is, the frame in which the immediately preceding play ends.
- FIG. 1 is an explanatory diagram illustrating an example of an image used in an embodiment of the present technology.
- FIG. 2 is a plan view showing an example of the configuration of an American football field targeted by the present embodiment.
- FIG. 3A is an explanatory diagram illustrating an example of a captured image of an initial formation of play targeted by the present embodiment.
- FIG. 3B is an explanatory diagram illustrating an example of an image captured of the initial formation of play targeted by the present embodiment.
- FIG. 3C is an explanatory diagram illustrating an example of an image captured of the initial formation of play targeted by the present embodiment.
- FIG. 4 is a block diagram showing an example of the configuration of the video processing apparatus according to the present embodiment.
- FIG. 5 is an explanatory diagram showing an example of the optical flow intensity in the present embodiment.
- FIG. 6 is an explanatory diagram showing an example of time transition of the total optical flow intensity in the present embodiment.
- FIG. 7 is an explanatory diagram for describing an example of a classifier in the present embodiment.
- FIG. 8 is an explanatory diagram showing an example of how the play start position is estimated in the present embodiment.
- FIG. 9 is an explanatory diagram illustrating an example of a player position detection result according to the present embodiment.
- FIG. 10 is an explanatory diagram showing an example of a state of the density calculation process in the present embodiment.
- FIG. 11 is an explanatory diagram showing an example of the density distribution in the present embodiment.
- FIG. 12A is an explanatory diagram for describing an example of a calculation method of the degree of concentration in the present embodiment.
- FIG. 12B is an explanatory diagram for describing an example of a calculation method of the degree of concentration in the present embodiment.
- FIG. 13 is a diagram illustrating an example of a quantized optical flow according to the present embodiment.
- FIG. 14 is an explanatory diagram showing an example of the concentrated position in the present embodiment.
- FIG. 15 is a flowchart showing an example of the operation of the video processing apparatus according to the present embodiment.
- FIG. 16 is a flowchart showing an example of the play start estimation process in the present embodiment.
- FIG. 17 is a flowchart showing an example of the play end estimation process in the present embodiment.
- FIG. 18 is a plan view showing an example of a confirmation operation acceptance screen in the present embodiment.
- FIG. 19 is a flowchart illustrating an example of the confirmation operation accepting process in the present embodiment.
- FIG. 20 is an explanatory diagram showing an example of a system to which the video processing apparatus according to the present embodiment is applied.
- FIG. 21 is a diagram showing the accuracy verification result of moving image reduction in the video processing apparatus according to the present embodiment.
- FIG. 22 is a diagram showing the accuracy verification result of the play start position in the video processing apparatus according to the present embodiment.
- FIG. 23 is a diagram showing the accuracy verification result of the play end position in the video processing apparatus according to the present embodiment.
- FIG. 1 is an explanatory diagram showing an example of a video shot of an American football game.
- FIG. 2 is a plan view showing an example of the configuration of an American football field.
- FIG. 3 is an explanatory diagram showing an example of the initial formation of play.
- American football is a team sport in which the teams are divided into an attacking side and a defending side.
- in American football, if a team cannot advance (gain) 10 yards within four attack opportunities in the area (hereinafter referred to as the "field") 120 surrounded by the side lines 121, 122 and the goal lines 123, 124, the right to attack is transferred to the opposing team. For this reason, information such as how many yards were gained in one attack is very important in game analysis.
- at the start of each play, the players of both teams form initial formations 131 to 133 along the scrimmage line (see FIGS. 3A, 3B, and 3C). Play is then started by snapping the ball from the center of the initial formation. While the initial formation is being formed, most players pause once, and all players start moving at the same time the play begins. That is, at the start of a play, most players start moving from a momentarily stationary state.
- at the end of a play, a plurality of players usually gather toward the ball position (hereinafter referred to as the "play end position"), so the players are in a dense state. Also, at the end of a play, most players slow down and do not make sudden movement changes such as dashes or feints.
- in principle, the next play starts from the play end position. However, if the play ends outside the two inbound lines 125, 126 (see FIG. 2), the next play starts from whichever of the inbound lines 125, 126 is closer to the play end position. Thus, the position where each play starts, i.e., the position where the initial formation is formed (hereinafter referred to as the "play start position"), is correlated with the play end position of the previous play.
- due to the nature of its rules, American football has the characteristic that the movement of most players (the movement over the entire field) increases suddenly at the start of a play and decreases rapidly when the initial formation is being formed.
- another feature is that the play start position of each play is correlated with the play end position of the previous play.
- in the present embodiment, the sections of each play are estimated by extracting these features from the video data of the video 110. More specifically, for each play, a frame corresponding to the start point of the play (hereinafter referred to as the "play start frame") and a frame corresponding to the end point of the play (hereinafter referred to as the "play end frame") are estimated from the frames constituting the video data.
- the shapes of the initial formations 131 to 133 vary little, even between different teams.
- the image of the initial formation displayed on the video differs depending on the relationship between the position of the camera that captures the video 110 and the position where the initial formation is formed.
- FIG. 3A is an explanatory diagram showing an example of an image obtained by photographing the initial formation 131 assembled on the left side of the field from a camera arranged at a position close to the center of the field.
- FIG. 3B is an explanatory diagram illustrating an example of an image obtained by photographing the initial formation 132 assembled at the center of the field from the same camera.
- FIG. 3C is an explanatory diagram illustrating an example of an image of the initial formation 133 assembled on the right side of the field from the same camera.
- the play start frame is estimated by further utilizing the characteristics of the initial formation or the change in the player movement around the play start time.
- FIG. 4 is a block diagram showing an example of the configuration of the video processing apparatus according to the present embodiment.
- the video processing apparatus 200 includes a video input unit 210, a play start estimation unit 220, a play end estimation unit 230, a confirmation operation reception unit 240, and an estimation result processing unit 250.
- the video input unit 210 inputs video data (hereinafter referred to as “video”) of a video shot of an American football game (hereinafter referred to as “game”). For example, the video input unit 210 receives video via a communication network from a video camera installed so as to capture the entire field of the game venue from the side. Then, the video input unit 210 outputs the input video to the play start estimation unit 220.
- the video is a video of the entire field as shown in FIG.
- the video is, for example, time-series image data of 60 frames per second.
- the play start estimation unit 220 estimates the play start position in the game based on the input video.
- the play start estimation unit 220 calculates the amount of motion in each part of the frame for each frame. Further, the play start estimation unit 220 detects an initial formation from the video, and estimates a play start frame and a play start position of each play based on the amount of movement and the detection result of the initial formation.
- the amount of motion is information indicating at least one of the magnitude and the direction of motion in a predetermined region of the video. Details of the motion amount will be described later.
- the play start estimation unit 220 outputs, to the play end estimation unit 230, the video, motion amount information indicating the motion amount of each area of each frame, and start frame information indicating the estimated play start frame and play start position.
- the configuration of the play start estimation unit 220 is an example, and the estimation of the play start position is not limited to the above-described example.
- the play start estimation unit 220 may also estimate the play start frame and the play start position using changes in player movement around the start of play. For example, the play start estimation unit 220 estimates them using the amount of change (difference) in luminance between consecutive frames. Specifically, the play start estimation unit 220 compares the luminance of corresponding pixels between, for example, two consecutive frames, and calculates the luminance change of each pixel and the total luminance change over all pixels.
- the play start estimation unit 220 then estimates, as the play start frame, a frame in which the total luminance change over all pixels is small, or one of the few frames before or after it.
- the play start estimation unit 220 estimates a region where the luminance change amount is large after the play start frame as the play start position.
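- as an illustration of this luminance-difference approach, the following is a minimal numpy sketch; the function names and the block size are hypothetical, not taken from the patent. It totals the per-pixel luminance change between two consecutive frames and takes the block with the largest change as a proxy for the play start position.

```python
import numpy as np

def luminance_change(prev_y, cur_y):
    """Per-pixel absolute luminance change and its total over the frame."""
    diff = np.abs(cur_y.astype(np.int32) - prev_y.astype(np.int32))
    return diff, int(diff.sum())

def start_position(diff, block=8):
    """Block with the largest luminance change, a proxy for the start position."""
    h, w = diff.shape
    best, pos = -1, (0, 0)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            s = diff[y:y + block, x:x + block].sum()
            if s > best:
                best, pos = s, (x, y)
    return pos
```

A small frame with one bright region would yield that region's top-left block as the estimated position.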
- the play start estimation unit 220 outputs the video and start frame information indicating the estimated play start frame and play start position to the play end estimation unit 230.
- as the element for estimating changes in player movement, other feature amounts of pixels (or sets of pixels), such as brightness or RGB values, may be used instead of pixel luminance.
- the play end estimation unit 230 estimates, for each play and based on the start frame information, the end frame of the play immediately before that play in the game (hereinafter referred to as the "preceding play") from the input video.
- specifically, the play end estimation unit 230 first estimates an area likely to contain the end position of the preceding play (hereinafter referred to as the "play end area"), based on the play start position indicated by the input start frame information.
- the play end estimation unit 230 extracts the position of the player in each frame (hereinafter referred to as “player position”) from the video, and calculates the density of the player position for each frame based on the extracted player position.
- the play end estimation unit 230 also calculates the degree of concentration based on the amount of motion at each position of each frame indicated by the input motion amount information (or by motion amount information newly acquired from the play start estimation unit 220). The play end estimation unit 230 then estimates the play end position based on the calculated density and degree of concentration.
- the density is information indicating the degree of density of player positions in the frame.
- the degree of concentration is information indicating the degree to which the players' movement directions converge, and is a value calculated, for example, for each grid cell set at equal intervals on the field. Details of the density and the degree of concentration will be described later.
- the play end estimation unit 230 then estimates the play end frame of each play based on the amount of motion indicated by the input motion amount information and on whether the estimated play end position is included in the estimated play end area.
- the play end estimation unit 230 outputs the input video and start frame information and end frame information indicating the estimated play end frame and play end position to the confirmation operation reception unit 240.
- the play start frame estimated by the play start estimation unit 220 is hereinafter referred to as “start frame candidate”. Further, the play end frame estimated by the play end estimation unit 230 is hereinafter referred to as “end frame candidate”.
- the confirmation operation accepting unit 240 generates and displays a confirmation operation acceptance screen based on the input video, start frame information, and end frame information.
- the confirmation operation acceptance screen is a screen that displays, for each play, the start frame candidates estimated for that play in association with one or more end frame candidates estimated for the immediately preceding play. Details of the confirmation operation acceptance screen will be described later.
- the confirmation operation accepting unit 240 accepts a determination operation on the displayed start frame candidates and end frame candidates, takes a start frame candidate on which the determination operation has been performed as the play start frame, and takes an end frame candidate on which the determination operation has been performed as the play end frame.
- the confirmation operation reception unit 240 displays the confirmation operation acceptance screen via a user interface (not shown), such as a liquid crystal display with a touch panel provided in the video processing apparatus 200, and accepts operations from the user on the displayed screen.
- the confirmation operation accepting unit 240 outputs the video and play section information indicating the estimated play start frame and play end frame to the estimation result processing unit 250.
- the estimation result processing unit 250 extracts the video portion of the play section from the video based on the play start frame and the play end frame indicated by the input play section information, and displays the extraction result on the above-described display, for example.
- although not illustrated, the video processing apparatus 200 includes, for example, a processor (CPU: Central Processing Unit), a storage medium such as a ROM (Read Only Memory) storing a control program, a working memory such as a RAM (Random Access Memory), and a communication circuit. In this case, the function of each unit described above is realized by the CPU executing the control program.
- the video processing apparatus 200 having such a configuration can estimate the play section by paying attention to the player's movement and position characteristics at the start and end of play.
- in the present embodiment, the optical flow intensity of dense optical flow is adopted as the amount of movement.
- here, the amount of movement is a value indicating, for each direction, the magnitude of the players' movement in that direction.
- FIG. 5 is an explanatory diagram showing an example of the optical flow intensity (motion amount) acquired from the video.
- a dark-colored portion 300 indicates a portion having a high motion amount value.
- FIG. 6 is an explanatory diagram showing an example of temporal transition of the total amount of optical flow intensity within one frame (hereinafter referred to as “total optical flow intensity”).
- in FIG. 6, the vertical axis represents the total optical flow intensity and the horizontal axis represents time.
- specifically, the play start estimation unit 220 displays the video on the above-described user interface and accepts designation of the field area in the video by a touch operation from the user. The play start estimation unit 220 then divides the designated area into, for example, 200 × 200 small areas (hereinafter referred to as field grids), and obtains the optical flow intensity of dense optical flow using the Farneback method (see, for example, G. Farneback, "Two-Frame Motion Estimation Based on Polynomial Expansion", In Proc. SCIA, 2003). Note that the play start estimation unit 220 preferably applies a bilateral filter to the video as preprocessing for noise removal.
- the calculation method of the optical flow intensity is not limited to the above-described method.
- for example, the optical flow intensity may be calculated using the Lucas-Kanade method (see "Knowledge Forest, Group 2, Section 2, Chapter 4, 4-1-1", IEICE, 2013, pp. 2-7).
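- a minimal sketch of aggregating a dense optical flow field into per-field-grid intensities follows; the OpenCV calls in the comment show one way such a flow field could be obtained with bilateral-filter preprocessing and the Farneback method, and the grid count and function name are illustrative assumptions rather than the patent's implementation.

```python
import numpy as np

# With OpenCV, a dense (Farneback) flow field could be obtained roughly as:
#   prev = cv2.bilateralFilter(prev_gray, 9, 75, 75)   # noise-removal preprocessing
#   flow = cv2.calcOpticalFlowFarneback(prev, cur_gray, None,
#                                       0.5, 3, 15, 3, 5, 1.2, 0)

def grid_flow_intensity(flow, n=200):
    """Mean optical-flow magnitude in each cell of an n-by-n field grid."""
    h, w = flow.shape[:2]
    mag = np.linalg.norm(flow, axis=2)          # per-pixel flow magnitude
    ys = np.linspace(0, h, n + 1).astype(int)   # row boundaries of the grid
    xs = np.linspace(0, w, n + 1).astype(int)   # column boundaries of the grid
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            cell = mag[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            out[i, j] = cell.mean() if cell.size else 0.0
    return out
```

Summing the returned array over all cells gives a per-frame value in the spirit of the total optical flow intensity described below.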
- the total optical flow intensity indicates the magnitude of the movement of all players displayed in the video.
- the movement of most players increases rapidly at the start of play, and the movement of most players decreases rapidly at the end of play. Therefore, as shown in FIG. 6, the total optical flow strength 301 increases rapidly immediately after the play start timing 302 and decreases rapidly immediately before the play end timing 303.
- the total optical flow intensity 301 calculated from the motion amount is a value that changes characteristically at the play start timing and the play end timing.
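- the characteristic rise and fall of the total optical flow intensity can be turned into start and end candidates with a simple relative-change threshold; this sketch, including the `jump` parameter, is an illustrative assumption and not the patent's actual decision criterion.

```python
import numpy as np

def play_timing_candidates(total_intensity, jump=0.5):
    """Frames where the total flow intensity jumps up (start candidates)
    or drops sharply (end candidates). `jump` is a hypothetical
    relative-change threshold, scaled by the series maximum."""
    s = np.asarray(total_intensity, dtype=float)
    delta = np.diff(s)
    scale = max(s.max(), 1e-9)
    starts = [i + 1 for i, d in enumerate(delta) if d > jump * scale]
    ends = [i + 1 for i, d in enumerate(delta) if d < -jump * scale]
    return starts, ends
```

For a series that is low, jumps high, and falls back low, the rising edge is reported as a start candidate and the falling edge as an end candidate.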
- a method using a discriminator is adopted as a method for detecting the initial formation.
- the play start estimation unit 220 stores in advance a discriminator (detector) for detecting the initial formation from the video.
- this discriminator is generated, for example, by learning the HOG feature amounts of images (see, for example, N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection", In CVPR 2005, pp. 886-893, vol. 1, 2005) with Adaboost (see, for example, P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features", In CVPR 2001) from a large number of images obtained by photographing various initial formations under various illumination conditions.
- FIG. 7 is an explanatory diagram for explaining an example of a discriminator for detecting an initial formation.
- the shape of the initial formation on the image has little variability, but changes depending on the position where the initial formation is formed.
- accordingly, the play start estimation unit 220 divides the field 120 into three areas, a left area 311, a center area 312, and a right area 313, using as boundaries the lines 35 yards from the goal lines 123 and 124.
- the play start estimation unit 220 uses, for the left area 311, a discriminator L314 generated from initial formations formed in the left area 311. Similarly, it uses, for the center area 312, a discriminator C315 generated from initial formations formed in the center area 312, and, for the right area 313, a discriminator R316 generated from initial formations formed in the right area 313.
- the play start estimation unit 220 executes full screen search by changing the discriminator for each area.
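- the per-area switching of discriminators can be sketched as follows; the classifier placeholders and the mapping of the 35-yard boundaries to fractions of the frame width are hypothetical assumptions for illustration.

```python
def pick_classifier(window_x, frame_w, classifiers):
    """Select the left/center/right formation detector for a sliding-window
    position. The 0.35/0.65 fractions stand in for the 35-yard boundaries;
    the real mapping would depend on the camera geometry."""
    left_bound, right_bound = frame_w * 0.35, frame_w * 0.65  # assumed mapping
    if window_x < left_bound:
        return classifiers["L"]
    if window_x < right_bound:
        return classifiers["C"]
    return classifiers["R"]

# Placeholder detector objects; real ones would be trained discriminators.
clfs = {"L": "discriminator_L", "C": "discriminator_C", "R": "discriminator_R"}
```

A full-screen search would call `pick_classifier` once per window position and evaluate only the selected detector there.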
- FIG. 8 is an explanatory diagram showing an example of how the play start position is estimated.
- the play start estimation unit 220 obtains, for example, a plurality of regions 318 from the video 317 as initial formation detection results using the discriminators. Then, the play start estimation unit 220 estimates the centroid position 319 of the plurality of detected regions 318 as the play start position.
- the play start estimation unit 220 may project-convert the play start position on the video to the field 120 (overhead image), and use the converted position (for example, a field grid) as the play start position.
- such projective transformation is performed using, for example, a predetermined projective transformation matrix. This projective transformation matrix is calculated in advance based on coordinates assigned manually on the video to the field 120 at 10-yard intervals.
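- applying a precomputed 3 × 3 projective transformation matrix to a play start position is a homogeneous-coordinate multiply followed by a divide; in OpenCV such a matrix could be obtained with `cv2.findHomography` from the manually given 10-yard correspondences, though the helper below is only a hand-rolled sketch with assumed names.

```python
import numpy as np

# With OpenCV the matrix could come from, e.g.:
#   H, _ = cv2.findHomography(image_pts, field_pts)
# Here a precomputed 3x3 matrix H is applied by hand.

def project_point(H, pt):
    """Map an image point to field (overhead) coordinates with homography H."""
    x, y = pt
    v = H @ np.array([x, y, 1.0])
    return v[0] / v[2], v[1] / v[2]  # perspective divide
```

As a sanity check, the identity matrix leaves a point unchanged, and a diagonal scaling matrix scales it.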
- the initial formation is set at the start of play as described above. Therefore, the frame in which the initial formation is detected is a frame that is highly likely to be a frame at the start of play.
- FIG. 9 is an explanatory diagram showing an example of the player position detection result from the video.
- FIG. 10 is an explanatory diagram showing an example of a state of the density calculation process.
- the play end estimation unit 230 stores in advance a discriminator (detector) generated, for example, by Adaboost learning on the HOG feature amounts of a large number of images obtained by shooting players in various postures under various lighting conditions.
- the play end estimation unit 230 detects, for example, a rectangular area 322 indicating an image area occupied by each player as a player position of each player from the video 321 by using the discriminator.
- the rectangular area 322 is referred to as a “player rectangle”.
- the play end estimation unit 230 calculates the degree of congestion from the detected player position for each frame.
- the play end estimation unit 230 calculates the density of the field grid 331 as illustrated in FIG.
- specifically, the play end estimation unit 230 takes a rectangular area 332 consisting of the 25 field grids surrounding the field grid 331, and obtains the area 333 (indicated by hatching in the drawing) where player rectangles 322 overlap the rectangular area 332.
- then, the play end estimation unit 230 calculates the density L_density for the field grid 331 using, for example, the following equation (1), where R is the area of the rectangular area 332 and R_p is the area of the region 333 in which the rectangular area 332 and the player rectangles 322 overlap.
- after calculating the density L_density for all the field grids in the video, the play end estimation unit 230 determines, as the crowded position, the position where L_density takes its maximum value or the centroid position of the L_density distribution.
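- assuming equation (1) is the ratio R_p / R of overlap area to region area (the equation itself is not reproduced in this text, so this is an inference from the definitions of R and R_p), the density of one field grid can be sketched as:

```python
def grid_density(grid_xy, grid_size, player_rects, k=5):
    """Density for one field grid: overlap of player rectangles with the
    k-by-k block of grids (25 grids for k=5) centred on it, divided by the
    block's area. Implements the assumed ratio R_p / R."""
    gx, gy = grid_xy
    half = (k // 2) * grid_size
    rx0, ry0 = gx - half, gy - half                  # rectangular area 332
    rx1, ry1 = gx + grid_size + half, gy + grid_size + half
    region_area = (rx1 - rx0) * (ry1 - ry0)          # R
    overlap = 0.0                                    # R_p
    for (px0, py0, px1, py1) in player_rects:
        w = max(0.0, min(rx1, px1) - max(rx0, px0))
        h = max(0.0, min(ry1, py1) - max(ry0, py0))
        overlap += w * h
    return overlap / region_area
```

Evaluating this for every grid and taking the maximum (or the centroid of the distribution) would give the crowded position described above.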
- the crowded position is a position that is highly likely to be the play end position.
- FIG. 11 is an explanatory diagram showing an example of the distribution of the density.
- the density takes larger values in areas where more players gather.
- the play end estimation unit 230 may generate a denseness display image in which the color, density, and the like of the video are changed according to the denseness as shown in FIG. By displaying such an image, the user can visually confirm a position where the density is high or low.
- <About the degree of concentration> As the degree of concentration, the present embodiment employs, for each field grid, the sum obtained when each quantized optical flow intensity is propagated along the direction of its optical flow.
- FIG. 12A and FIG. 12B are explanatory diagrams for explaining an example of a calculation method of the concentration degree.
- for example, for an optical flow pointing 45 degrees toward the lower left at a position 335, the play end estimation unit 230 increases the degree of concentration of each of the field grids lying in that direction from the position 335 (indicated by diagonal hatching in the figure).
- the play end estimation unit 230 performs the same process for the optical flows at all other positions. As a result, for example, as shown in FIG. 12B, the concentration increases in the field grid 339 in which the directions of the optical flows at the plurality of positions 336, 337, and 338 overlap.
- the optical flows at all positions are processed, and the concentration degree of each field grid is calculated.
- the field grid where the degree of concentration is maximized is estimated to be the position toward which more players are moving.
- the play end estimation unit 230 may perform weighting according to the distance from each position to the field grid that is the target of the increase in concentration.
- the play end estimation unit 230 may give a negative value to the field grid positioned ahead in the direction opposite to the optical flow direction. Thereby, the accuracy can be further improved.
- the play end estimation unit 230 calculates the concentration degree of each field grid, for example, according to the following procedure.
- the play end estimation unit 230 quantizes the optical flow intensity of each field grid in eight directions.
- FIG. 13 is a diagram showing an example of a quantized optical flow.
- the movement of each player is expressed by quantizing it into eight directions at each part of the area in which the player appears.
- the play end estimation unit 230 increases the concentration degree of all the field grids on the extension line of each optical flow direction by a value inversely proportional to the distance.
- the play end estimation unit 230 reduces the concentration degree of all field grids on the extension line in the direction opposite to the quantized direction by a value proportional to the distance.
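The three-step procedure above (quantize each optical flow into eight directions, add a value inversely proportional to the distance along the flow direction, and subtract a distance-proportional value along the opposite direction) can be sketched as follows. The grid size, the weights `w1` and `w2`, and the dictionary input format are illustrative assumptions, not details fixed by the text.

```python
import numpy as np

# 8 quantized directions as (dy, dx): E, NE, N, NW, W, SW, S, SE
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def concentration_map(flow, w1=1.0, w2=0.1):
    """flow: dict {(gy, gx): (dir_index, intensity)} of quantized optical flows.
    Grids ahead of each flow gain w1 * intensity / dis (inversely proportional
    to distance); grids behind it lose w2 * intensity * dis (proportional to
    distance), as the text describes.  Illustrative sketch only."""
    h, w = 8, 16                          # assumed field-grid size
    conc = np.zeros((h, w))
    for (gy, gx), (d, inten) in flow.items():
        dy, dx = DIRS[d]
        # extension line in the flow direction
        y, x, dis = gy + dy, gx + dx, 1
        while 0 <= y < h and 0 <= x < w:
            conc[y, x] += w1 * inten / dis
            y, x, dis = y + dy, x + dx, dis + 1
        # extension line in the opposite direction (negative contribution)
        y, x, dis = gy - dy, gx - dx, 1
        while 0 <= y < h and 0 <= x < w:
            conc[y, x] -= w2 * inten * dis
            y, x, dis = y - dy, x - dx, dis + 1
    return conc

def concentrated_position(conc):
    return np.unravel_index(np.argmax(conc), conc.shape)
```

As in FIG. 12B, grids where the extension lines of several flows overlap accumulate the largest values.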
- the play end estimation unit 230 calculates the concentration L direction for each field grid using, for example, the following equations (2) to (4).
- L direction_direct in Expression (2) represents the degree of concentration with respect to the direction of the optical flow.
- L direction_opposite in equation (3) represents the degree of concentration in the opposite direction of the optical flow.
- grid denotes each field grid in the field or video, and dis(grid) denotes the distance from the field grid for which the concentration L direction is being calculated to the field grid indicated by grid.
- w1 represents the weight for L direction_direct
- w2 denotes a weight for L direction_opposite.
- when the play end estimation unit 230 has calculated the concentration L direction for all field grids in the field or video, it determines the position where the concentration L direction takes its maximum value, or the center-of-gravity position of the distribution of the concentration L direction, as the concentrated position.
- FIG. 14 is an explanatory diagram showing an example of the concentrated position.
- the concentrated position 342 is determined at some position in the field area of the video 341. As described above, the concentrated position 342, like the crowded position, is a position that is highly likely to be the play end position.
- FIG. 15 is a flowchart showing an example of the operation of the video processing apparatus 200.
- step S1000 the video input unit 210 inputs a video shot of an American football game.
- step S2000 the play start estimation unit 220 performs a play start estimation process for estimating a play start frame and a play start position.
- step S3000 the play end estimation unit 230 performs a play end estimation process for estimating a play end frame and a play end position.
- step S4000 the confirmation operation accepting unit 240 performs confirmation operation acceptance processing for accepting confirmation operations for the estimation results in steps S2000 and S3000 from the user.
- step S5000 the estimation result processing unit 250 outputs play section information indicating the result of the confirmation operation in step S4000 and indicating the estimated play start frame and play end frame.
- FIG. 16 is a flowchart illustrating an example of the play start estimation process.
- step S2010 the play start estimation unit 220 calculates a motion amount (optical flow intensity) for each grid of each frame of the video, and stores the calculation result in the memory.
- step S2020 the play start estimation unit 220 selects one frame from the video, for example, by selecting sequentially from the top of the video.
- step S2030 the play start estimation unit 220 acquires a motion amount for a predetermined section immediately before the selected frame.
- the predetermined section here is, for example, a section from a frame that is 120 frames back from the selected frame to the selected frame.
- step S2040 the play start estimation unit 220 first calculates, for every frame in the predetermined section, the total optical flow strength by adding up all the optical flow strengths within the frame. Then, using the calculated total optical flow intensity of each frame, the play start estimation unit 220 determines whether or not a predetermined start motion condition corresponding to a sudden increase in the amount of motion is satisfied.
- the start motion condition is specifically a condition that, for example, all of the following formulas (5) to (7) are satisfied.
- optical [] indicates the total optical flow intensity.
- L, M, and N are constants determined in advance by experiments or the like.
- L is an integer greater than or equal to 2, for example 120.
- M is 2, for example.
- N is 20, for example.
- optical [0] indicates the total optical flow strength of the selected frame
- optical [120] indicates the total optical flow strength of a frame 120 frames before the selected frame.
- optical Max indicates the maximum value of the total optical flow strength over the entire video, i.e., one game's worth of video used in the analysis.
- the play start estimation unit 220 calculates an optical Max and stores it in the memory when performing the process of step S2040 for the first time.
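Equations (5) to (7) themselves are not reproduced in this text, so the check below is only one illustrative reading of the stated ingredients: optical[0] compared with optical[L] via the factor M, and the whole-video maximum optical Max divided by N used as a noise floor. The two concrete inequalities are assumptions, not the patent's actual formulas.

```python
def total_flow(frame_grid_intensities):
    """Total optical flow strength of one frame: the sum of all per-grid
    optical flow intensities within that frame."""
    return sum(frame_grid_intensities)

def start_motion_condition(optical, optical_max, L=120, M=2, N=20):
    """optical[i] = total optical flow strength i frames before the selected
    frame (optical[0] is the selected frame itself).  The two inequalities
    below are illustrative assumptions standing in for equations (5)-(7)."""
    if len(optical) <= L:
        return False                          # not enough history yet
    return (optical[0] >= M * optical[L]      # motion roughly M-fold over L frames
            and optical[0] >= optical_max / N)  # and not negligible vs. the whole game
```

With L = 120, M = 2, and N = 20 as in the text, a frame qualifies when its total flow has at least doubled relative to 120 frames earlier and is at least one twentieth of the game-wide maximum.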
- the start motion condition is not limited to the above-described content.
- for example, a condition that the moving average of the change amount of the total optical flow intensity is greater than or equal to a predetermined value, or that the change rate of the change amount of the total optical flow intensity is greater than or equal to a predetermined value, may be adopted instead.
- the start motion condition may further include other conditions, such as the elapsed time from the immediately preceding play start frame being greater than or equal to a predetermined threshold.
- if the start motion condition is not satisfied (S2040: NO), the play start estimation unit 220 returns the process to step S2020 and moves on to an unprocessed frame, that is, a frame not yet selected in step S2020. If the start motion condition is satisfied (S2040: YES), the play start estimation unit 220 advances the process to step S2050.
- step S2050 the play start estimation unit 220 performs initial formation detection for a start frame in a predetermined section.
- step S2060 the play start estimation unit 220 determines whether or not a predetermined start image condition corresponding to the initial formation being displayed in the frame is satisfied.
- the start image condition is, for example, a condition that the initial formation is detected from the video with a certainty of a certain value or more.
- if the start image condition is not satisfied (S2060: NO), the play start estimation unit 220 returns the process to step S2020 and moves on to an unprocessed frame. If the start image condition is satisfied (S2060: YES), the play start estimation unit 220 advances the process to step S2070.
- such a determination process can prevent erroneous detection of frames that have a high density at times other than the start of play, such as frames during a player substitution.
- step S2070 the play start estimation unit 220 sets start frame candidates based on the currently selected frame. Specifically, the play start estimation unit 220 sets, for example, the frames within a predetermined time based on the currently selected frame as start frame candidates.
- This start frame candidate is a group of frames that are candidates for the play start frame.
- step S2080 the play start estimation unit 220 estimates the play start position. Specifically, the play start estimation unit 220 sets, for example, the detected position of the initial formation as the play start position.
- step S2090 the play start estimation unit 220 determines whether an unprocessed frame exists in the video. When there is an unprocessed frame (S2090: YES), the play start estimation unit 220 returns the process to step S2020 and moves to a process for an unprocessed frame. Further, when the processing for all the frames is completed (S2090: NO), the play start estimation unit 220 advances the processing to step S3000 (play end estimation processing) in FIG.
- FIG. 17 is a flowchart illustrating an example of the play end estimation process.
- step S3010 the play end estimation unit 230 estimates the play end area of the play immediately before the corresponding play (previous play) for each of the start frame candidates set by the play start estimation process (see FIG. 16).
- the play end estimation unit 230 estimates the play end area for each start frame candidate so as to limit the play end position area of the immediately preceding play based on the play start position. Note that information obtained from the start frame information output from the play start estimation unit 220 is used as the play start frame and the play start position.
- the play end estimation unit 230 drops a perpendicular from the play start position to the nearer of the side lines 121 and 122 (see FIG. 2), and estimates the area in the video having a width of 10 yards centered on that perpendicular as the play end area. Note that 10 yards is only an example; what matters when setting the play end area is to demarcate a predetermined area based on the play start position.
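As a minimal sketch of the play end area just described: in overhead field coordinates where the sidelines run along the x axis, the perpendicular from the play start position to the sideline keeps the same x coordinate, so the 10-yard-wide area centered on it reduces to a band of x values. The coordinate convention and function names are assumptions.

```python
def play_end_area(start_x, half_width_yards=5.0):
    """Return the (x_min, x_max) band, in yards along the sidelines, inside
    which the immediately preceding play is assumed to have ended.  A total
    width of 10 yards (half-width 5) follows the example in the text."""
    return (start_x - half_width_yards, start_x + half_width_yards)

def in_end_area(end_x, start_x, half_width_yards=5.0):
    """End position condition: the estimated play end position must fall
    inside the play end area derived from the next play's start position."""
    lo, hi = play_end_area(start_x, half_width_yards)
    return lo <= end_x <= hi
```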
- step S3020 the play end estimation unit 230 selects one frame from the video, for example, by selecting from the top of the video or sequentially from the frame immediately after the first start frame candidate.
- step S3030 the play end estimation unit 230 acquires a motion amount for a predetermined section immediately before the selected frame.
- the predetermined section here is, for example, a section from a frame that is 120 frames back from the selected frame to the selected frame.
- step S3040 the play end estimation unit 230 first calculates the total optical flow intensity for each frame for all frames in the predetermined section. Then, the play end estimation unit 230 uses the calculated total optical flow intensity of each frame to determine a predetermined amount corresponding to the sudden decrease in the amount of motion and the amount of change in the amount of motion. It is determined whether or not the end motion condition is satisfied.
- the end motion condition is specifically a condition that, for example, both the following expressions (8) and (9) are satisfied.
- P, Q, and R are constants determined in advance by experiments or the like.
- P is an integer greater than or equal to 1, for example 120.
- Q is an integer of 1 or more, for example, 5.
- R is, for example, 15.
- the end motion condition is not limited to the contents described above. For example, a condition that the moving average of the change amount of the total optical flow intensity is less than a predetermined negative value, or that the change rate of the change amount of the total optical flow intensity is less than a predetermined negative value, may be adopted instead.
- the end motion condition may further include other conditions, such as the elapsed time from the most recent preceding play start frame being less than a predetermined threshold, or the elapsed time to the next play start frame being less than a predetermined threshold.
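One of the alternative end motion conditions mentioned above, the moving average of the change amount of the total optical flow intensity falling below a predetermined negative value, can be sketched as follows; the window size and threshold are illustrative assumptions.

```python
def motion_drop(totals, window=5, threshold=-10.0):
    """totals: total optical flow strength per frame, oldest first.
    Returns True when the average frame-to-frame change over the last
    `window` steps is a sharp enough decrease (at or below `threshold`)."""
    if len(totals) < window + 1:
        return False
    start = len(totals) - window - 1
    deltas = [totals[i + 1] - totals[i] for i in range(start, len(totals) - 1)]
    return sum(deltas) / window <= threshold
```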
- if the end motion condition is not satisfied (S3040: NO), the play end estimation unit 230 returns the process to step S3020 and moves on to an unprocessed frame, that is, a frame not yet selected in step S3020. If the end motion condition is satisfied (S3040: YES), the play end estimation unit 230 advances the process to step S3050.
- step S3050 the play end estimation unit 230 extracts player positions for the selected frame, and calculates a crowded position and a concentrated position.
- step S3060 the play end estimation unit 230 estimates a midpoint between the crowded position and the concentrated position as the play end position.
- the play end estimation unit 230 extracts the player position for the selected frame, and calculates the density L density and the concentration degree L direction from the extracted player position. Finally, the play end position is estimated by calculating the play end position likelihood L terminal using those results.
- L terminal can be calculated by obtaining the sum of the density L density and the concentration L direction for each position and taking the position where the sum is maximal. Alternatively, L terminal may be calculated by obtaining the sum of the density L density and the concentration L direction for each position and taking the midpoint between the two positions where the sum has its peak values.
- the play end estimation unit 230 performs projective conversion of the position on the video to the field 120 (overhead image), similarly to the play start position.
- the converted position (for example, field grid) is set as the final play end position.
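The projective conversion to the overhead field can be sketched as applying a 3×3 homography, assuming such a matrix has been obtained beforehand (for example, from the field lines); the matrix values below are made up purely for illustration.

```python
import numpy as np

def to_field(H, u, v):
    """Map a video pixel (u, v) to overhead field coordinates through the
    3x3 homography H, then dehomogenise."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

# Illustrative homography only; a real one would be estimated from the field.
H = np.array([[2.0, 0.0, 5.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
```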
- step S3070 the play end estimation unit 230 determines whether or not the play end position satisfies the end position condition.
- the end position condition is a predetermined end position condition corresponding to the fact that the play end position and the play start position of the next play have the above correlation.
- specifically, the end position condition is, for example, that the play end position is included in the play end area.
- the play start position here is, for example, the play start position in the play start frame that first exists after the selected frame.
- if the end position condition is not satisfied (S3070: NO), the play end estimation unit 230 returns the process to step S3020 and moves on to an unprocessed frame. If the end position condition is satisfied (S3070: YES), the play end estimation unit 230 advances the process to step S3080.
- step S3080 the play end estimation unit 230 sets end frame candidates based on the currently selected frame. Specifically, the play end estimation unit 230 sets, for example, the currently selected frame as an end frame candidate.
- step S3090 the play end estimation unit 230 determines whether an unprocessed frame exists in the video. When there is an unprocessed frame (S3090: YES), the play end estimation unit 230 returns the process to step S3020 and moves on to the unprocessed frame. When the processing for all the frames is completed (S3090: NO), the play end estimation unit 230 advances the process to step S4000 (confirmation operation acceptance processing) in FIG. 15.
- step S4000 confirmation operation acceptance processing
- the confirmation operation accepting unit 240 accepts the confirmation operation using the confirmation operation acceptance screen in the confirmation operation acceptance process. Prior to the description of the confirmation operation reception process, an outline of the confirmation operation reception screen will be described.
- FIG. 18 is a plan view showing an example of the confirmation operation acceptance screen.
- the confirmation operation reception screen 360 includes, for example, a candidate display selection area 361, an operation button area 362, and a video display area 363.
- the candidate display selection area 361 displays thumbnails of a plurality of start frame candidates in time series along the vertical direction when a plurality of start frame candidates are estimated.
- the candidate display selection area 361 also displays, side by side along the horizontal direction, the thumbnail of a start frame candidate and the thumbnail of the representative image of the end frame candidate estimated as the end frame of the play corresponding to that start frame candidate.
- in the same row as the thumbnail of the start frame candidate of a certain play, the thumbnail of the end frame candidate of the immediately preceding play is displayed. That is, by how the candidates of each play are arranged, the candidate display selection area 361 displays the start frame candidate of a play and the end frame candidate of the immediately preceding play in association with each other.
- Each thumbnail is generated by reducing the representative image of the start frame candidate or the end frame candidate. Details of the representative image will be described later.
- the operation button area 362 displays a play button, a pause button, a stop button, a decision button, and a delete button for accepting a playback operation, a pause operation, a stop operation, a decision operation, and a deletion operation on the display item selected in the candidate display selection area 361.
- the video display area 363 is an area for displaying a representative image corresponding to the designated thumbnail, or a video section including a start frame candidate or an end frame candidate corresponding to the designated thumbnail. Details of the video section will be described later.
- the layout of the parts constituting the confirmation operation acceptance screen 360 is not limited to the example shown in FIG. 18.
- for example, the candidate display selection area 361 may arrange thumbnails of the representative images of a plurality of start frame candidates in time series along the horizontal direction, and display the thumbnails of the representative images of the corresponding end frame candidates side by side along the vertical direction.
- the candidate display selection area 361 may display all thumbnails arranged in chronological order in one column in the vertical direction or one column in the horizontal direction.
- FIG. 19 is a flowchart showing an example of the confirmation operation acceptance process.
- step S4010 the confirmation operation accepting unit 240 sets a representative image and a video section for each of the start frame candidate and the end frame candidate.
- for a start frame candidate, the confirmation operation accepting unit 240 sets the candidate itself as the representative image, and sets a predetermined section including frames before and after it (for example, the section from one second before the start frame candidate to three seconds after it) as the video section.
- for an end frame candidate, the confirmation operation accepting unit 240 sets the candidate itself as the representative image, and sets a predetermined section including it (for example, the section from three seconds before the end frame candidate to one second after it) as the video section.
- step S4020 the confirmation operation reception unit 240 generates and displays a confirmation operation reception screen 360 (see FIG. 18).
- step S4030 the confirmation operation accepting unit 240 determines whether or not a designation operation has been performed on any of the start frame candidates and end frame candidates (hereinafter collectively referred to as "candidates") displayed in the candidate display selection area 361 (see FIG. 18).
- if a designation operation has been performed (S4030: YES), the confirmation operation accepting unit 240 advances the process to step S4040. If no designation operation has been performed (S4030: NO), the confirmation operation accepting unit 240 advances the process to step S4050 described later.
- step S4040 the confirmation operation receiving unit 240 performs highlighting such as superimposing a frame line 364 (see FIG. 18) on the designated candidate thumbnail. Further, the confirmation operation accepting unit 240 displays the designated candidate representative image in the video display area 363 (see FIG. 18).
- step S4050 the confirmation operation receiving unit 240 determines whether or not a reproduction operation has been performed in the operation button area 362 (see FIG. 18) in a state where any candidate is designated.
- if a playback operation has been performed (S4050: YES), the confirmation operation accepting unit 240 advances the process to step S4060. If no playback operation has been performed (S4050: NO), the confirmation operation accepting unit 240 advances the process to step S4070 described later.
- step S4060 the confirmation operation accepting unit 240 reproduces the designated candidate video section and displays it in the video display area 363 (see FIG. 18).
- the confirmation operation accepting unit 240 stops the playback of the video section when a stop operation is performed in the operation button area 362 (see FIG. 18).
- when the playback operation is performed again after a stop, the confirmation operation accepting unit 240 resumes reproduction from the stopped point.
- step S4070 the confirmation operation receiving unit 240 determines whether or not a deletion operation has been performed in the operation button area 362 (see FIG. 18) in a state where any candidate is designated. If the delete operation is performed (S4070: YES), the confirmation operation accepting unit 240 advances the process to step S4080. In addition, when the deletion operation is not performed (S4070: NO), the confirmation operation accepting unit 240 advances the process to step S4090 described later.
- step S4080 the confirmation operation accepting unit 240 cancels the setting of the designated candidate and deletes the corresponding thumbnail from the candidate display selection area 361.
- step S4090 the confirmation operation receiving unit 240 determines whether or not a determination operation has been performed in the operation button area 362 (see FIG. 18) in a state where any candidate is designated.
- if the determination operation has been performed (S4090: YES), the confirmation operation accepting unit 240 advances the process to step S4100. If the determination operation has not been performed (S4090: NO), the confirmation operation accepting unit 240 advances the process to step S4110 described later.
- step S4100 if the designated candidate is a start frame candidate, the confirmation operation accepting unit 240 sets it as the play start frame; if the designated candidate is an end frame candidate, the confirmation operation accepting unit 240 sets it as the play end frame.
- step S4110 the confirmation operation receiving unit 240 determines whether or not the confirmation operation has ended.
- the confirmation operation is completed, for example, when a determination operation has been performed on all candidates remaining in the candidate display selection area 361, or when a confirmation button (not shown) displayed on the confirmation operation acceptance screen 360 is clicked.
- if the confirmation operation has not ended (S4110: NO), the confirmation operation accepting unit 240 returns the process to step S4030. When the confirmation operation has ended (S4110: YES), the confirmation operation accepting unit 240 advances the process to step S4120.
- step S4120 the confirmation operation accepting unit 240 generates play section information indicating the play start frames and play end frames set in the confirmation operation accepting unit 240. Then, the confirmation operation accepting unit 240 advances the process to step S5000 in FIG. 15 (output of the play section information).
- the video processing apparatus 200 can estimate the play section by paying attention to the characteristics of the player's movement and position at the start and end of play.
- FIG. 20 is an explanatory diagram showing an example of a system to which the video processing apparatus 200 is applied.
- the estimation result 371 of the video processing apparatus 200 can be used in an archive system 372 that records past plays and can search for similar plays later.
- the video contraction here means extracting one or more play sections in time-series order and excluding low-importance sections such as timeout sections.
- if the play start position and the play end position estimated by the video processing device 200 are used, the gained yardage can be calculated from the video divided play by play through the video contraction. Furthermore, efficient tactical analysis can be realized on the per-play video obtained by the video contraction.
- the obtained information is recorded in association with each attribute, so that it can serve as a target of condition searches and be provided to the archive system.
- the video processing apparatus 200 can generate very useful information from the viewpoint of game analysis, it is suitable for various systems related to game analysis.
- for ball tracking in American football, see, for example, Junji Kurano, Taiki Yamamoto, Hirokatsu Kataoka, Masaki Hayashi, Yoshimitsu Aoki, "Ball Tracking in Team Sports by Focusing on Ball Holder Candidates," In International Workshop on Advanced Image Technology 2014 (IWAIT 2014), 2014.
- for player tracking using uniform number recognition, see, for example, Taiki Yamamoto, Hirokatsu Kataoka, Masaki Hayashi, Yoshimitsu Aoki, "Multiple Players Tracking and Identification Using Group Detection and Player Number Recognition in Sports Video," In the 39th Annual Conference of the IEEE Industrial Electronics Society (IECON 2013), 2013. By combining such methods, more detailed automatic tactical analysis is possible.
- the experimental video has a resolution of 1740×300 pixels and a frame rate of 60 fps.
- One pixel in the experimental video corresponds to approximately 7.3 cm in the real space, and 1 pixel in the bird's-eye view corresponds to approximately 9.8 cm in the real space.
- the moving image contraction rate C was calculated by the following equation (10), using experimental videos (90000 frames in total, 32 plays).
- frame C is the total number of frames after reduction
- frame all is the total number of frames of the original video.
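Equation (10) itself is not reproduced in this text, but from the variables just defined and the reported result (12094 frames remaining out of 90000 giving C = 13.44), it can be read as the percentage of frames that survive the contraction; the sketch below uses that reading.

```python
def contraction_rate(frame_c, frame_all):
    """Video contraction rate C [%] as read from the reported figures:
    the share of frames remaining after contraction."""
    return frame_c / frame_all * 100.0

# With the experiment's numbers: 12094 frames kept out of 90000 in total.
C = contraction_rate(12094, 90000)
```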
- FIG. 21 is a diagram showing the accuracy verification result of the video contraction.
- in the experiment, the total number of detected plays was 31, the number of frames after contraction frame C was 12094, and the video reduction ratio (degree of video aggregation) C was 13.44%.
- FIG. 22 is a diagram showing the accuracy verification result of the play start position.
- FIG. 23 is a diagram showing the accuracy verification result of the play end position.
- since the video processing apparatus 200 estimates the play section by focusing on the characteristics of the players' movements and positions at the start and end of play, it can extract play sections from a video of a sports game efficiently and with high accuracy.
- therefore, by using the video processing apparatus 200 in various systems related to game analysis, highly accurate and efficient game analysis can be realized.
- the video processing apparatus 200 displays a representative image and a video portion and accepts a selection operation from the user. For this reason, the video processing apparatus 200 according to the present embodiment can more reliably prevent the play start frame and the play end frame from being erroneously estimated, and realize a more accurate game analysis.
- the video processing apparatus 200 may calculate the amount of motion of the frame for each section necessary for the start motion condition or the end motion condition every time a frame is selected.
- when two or more play start frames have been estimated before the play start estimation process for the entire video is completed, the play end estimation unit 230 may start the play end estimation process for the section between them.
- the video to be processed may be a video shot only around the ball position.
- the method for estimating the play start position is not limited to the above example.
- the play start estimation unit 220 may display a video, accept a designation of a manual play start position from the user, and estimate the designated position as a play start position.
- the play start estimation unit 220 does not necessarily need to consider the correlation between the play start position and the play end position of the immediately preceding play. For example, the play start estimation unit 220 may estimate a play start frame or a play end frame based only on one or more of the amount of movement, the degree of congestion, and the degree of concentration.
- the method of estimating the play end position is not limited to the above example.
- the crowded position may be estimated as the play end position as it is, or the concentrated position may be estimated as the play end position as it is.
- the position of the referee may be extracted from the image, and the position of the referee extracted immediately after the amount of movement is rapidly reduced may be estimated as the play end position.
- the video processing device 200 may perform only the processing after the confirmation operation acceptance process, without determining start frame candidates and end frame candidates.
- a part of the configuration of the video processing apparatus 200 may be separated from other parts by being arranged in an external device such as a server on the network.
- the video processing device 200 needs to include a communication unit for communicating with the external device.
- after determining the play start frame (start frame candidate) and the play end frame (end frame candidate) in the form of candidates, the video processing apparatus 200 may determine the play start frame and the play end frame from the start frame candidates and the end frame candidates without displaying the representative images and the video sections and without accepting a selection operation from the user. For example, all of the start frame candidates and the end frame candidates may be determined as start frames and end frames as they are.
- This technology can be applied not only to American football images but also to other sports images. That is, the present technology is widely applicable to sports in which a play is composed of a plurality of play sections, and a player's movement is characteristic or regular at the start or end of the play section.
- the present technology can be applied to a sport with a rule in which offense and defense are switched, and more specifically, it is suitable for a sport in which the timing of offense and defense switching is clearly defined in the rule.
- switching of offense and defense includes concepts such as switching of offense and defense for each team in American football, baseball, and the like, and switching of the right to serve (serve side, receive side) in tennis, table tennis, volleyball and the like.
- the present technology is also suitable for sports in which, immediately before the start of a bout, movement decreases as the players take predetermined positions or postures, and movement increases immediately after the start, such as sumo or other martial arts. Based on the above, it can be said that the present technology is particularly suitable for American football games.
- the video processing method and the video processing device described above are a video processing method and device in which a processor performs processing on video data of a sports game: the video data is input, the amount of movement of the players is calculated from the input video data, and based on the calculated amount of movement, at least one of the start frame of a play in the game and the end frame of the immediately preceding play is estimated.
- the video processing method and the video processing device may detect an initial formation formed by the athletes of the sports teams from the video data, and estimate the start frame of the play based on the calculated amount of movement and the detection result of the initial formation.
- the video processing method and the video processing device may detect an initial formation formed by the players of the sports teams from the video data, and estimate the position of the image of the initial formation in the start frame as the play start position in the game.
- the video processing method and the video processing device may estimate the end frame of the play from the input video data based on the start position.
- the video processing method and the video processing device may estimate an end area of the immediately preceding play based on the estimated start position, estimate, based on the amount of movement, a frame including the end position of the immediately preceding play in the game, and estimate the frame associated with that end position as the end frame on the condition that the estimated end position is included in the estimated end area.
- the video processing method and the video processing device may calculate at least one of the density and the degree of concentration of the player positions, and estimate the end position based on at least one of the calculated density and degree of concentration.
- The video processing method and video processing device may estimate a motion-increase section in which the motion amount has increased sharply, and estimate the start frame based on the estimated motion-increase section.
- The video processing method and video processing device may estimate a motion-decrease section in which the motion amount has decreased sharply, and estimate the end frame based on the estimated motion-decrease section.
- The video processing method and video processing device may display on a screen, from the video data, the start frame and one or more end-frame candidates in association with each other.
- The video processing method and video processing device may accept a determination operation on the displayed one or more end-frame candidates, and estimate the end-frame candidate on which the determination operation was performed as the end frame.
- The video processing method and video processing device may accept a playback operation on the displayed start frame and the one or more end-frame candidates, and play back and display the portion of the video data corresponding to the section defined by the start frame and the end-frame candidate on which the operation was performed.
- The video processing method and video processing device may display a plurality of the start frames arranged in time series along a first direction, and display on the screen, alongside each start frame, the end frame estimated for the play corresponding to that start frame, arranged along a second direction intersecting the first direction.
- This technology is useful as a video processing method that can extract play sections from video of a sports game efficiently and with high accuracy.
- Video processing device
- 110, 317, 321, 341 Video
- 120 Field
- 121, 122 Side line
- 123, 124 Goal line
- 125, 126 Inbounds line
- 131, 132, 133 Initial formation
- 200 Video processing device
- 210 Video input unit
- 220 Play start estimation unit
- 230 Play end estimation unit
- 240 Confirmation operation reception unit
- 250 Estimation result processing unit
- 300 Dark-colored portion
- 301 Total optical flow intensity
- 302 Play start timing
- 303 Play end timing
- 311 Left area
- 312 Central area
- 313 Right area
- 314 Classifier L
- 315 Classifier C
- 316 Classifier R
- 318, 333 Region
- 319 Center-of-gravity position
- 322, 332 Rectangular region
- 331, 339 Field grid
- 335, 336, 337, 338 Position
- 342 Concentration position
- 360 Confirmation operation reception screen
- 361 Candidate display selection area
- 362 Operation button area
- 363 Video display area
- 364 Frame border
- 371 Estimation result
- 372 Archive system
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
Description
First, an overview of the rules of American football concerning the start and end of a play will be described.
Next, the configuration of a video processing device using the American football video processing method according to this embodiment will be described.
In this embodiment, the optical-flow intensity of dense optical flow is adopted as the motion amount. That is, the motion amount is a value indicating, per direction, the magnitude of player movement at each location.
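As a hedged illustration of the per-direction motion amount described above: given a dense optical-flow field (for example, one produced by OpenCV's `calcOpticalFlowFarneback`), the flow magnitudes can be accumulated per quantized direction. The function name and the choice of 8 direction bins are illustrative, not taken from the patent.

```python
import numpy as np

def directional_flow_intensity(flow, n_bins=8):
    """Given a dense optical-flow field of shape (H, W, 2), return the
    summed flow magnitude per quantized direction bin (a sketch of the
    per-direction motion amount)."""
    dx, dy = flow[..., 0], flow[..., 1]
    mag = np.hypot(dx, dy)                       # per-pixel flow magnitude
    ang = np.mod(np.arctan2(dy, dx), 2 * np.pi)  # direction in [0, 2*pi)
    bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
```

Summing the returned vector gives the total optical-flow intensity used as the per-frame motion amount (reference sign 301).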
In this embodiment, a method using classifiers is adopted as the technique for detecting the initial formation.
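The classifier-based formation detection is only named here, not specified. As a toy stand-in for the trained classifiers (classifiers L, C, and R in the reference list), the sketch below slides a window over a player-density map and scores it with a linear weight template; the function name and the uniform placeholder weights are our own assumptions.

```python
import numpy as np

def detect_formation(density_map, window=(8, 8), weights=None):
    """Slide a window over a player-density map and return the top-left
    position with the highest linear-classifier score (a toy stand-in
    for a trained formation classifier)."""
    h, w = window
    if weights is None:
        weights = np.ones(window)  # placeholder; trained weights would go here
    best, best_pos = -np.inf, None
    H, W = density_map.shape
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            score = float((density_map[y:y + h, x:x + w] * weights).sum())
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos, best
```

In the patent's scheme, separate classifiers handle the left, central, and right areas of the field; a real implementation would use learned weights rather than this uniform template.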
In this embodiment, the degree of overlap of the players' image regions is adopted as the density.
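One plausible reading of "degree of overlap of the players' image regions" is the total pairwise intersection area of player bounding boxes. The sketch below is that assumed reading, not the patent's exact definition.

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    return ix * iy

def crowd_density(boxes):
    """Total pairwise overlap area of player bounding boxes; one possible
    proxy for how tightly the detected players are packed."""
    total = 0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            total += overlap_area(boxes[i], boxes[j])
    return total
```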
In this embodiment, the concentration is defined as the per-field-grid sum obtained when each quantized optical-flow intensity is propagated along its optical-flow direction.
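The propagation-based concentration can be illustrated roughly as follows: each field-grid cell's flow intensity is pushed several steps along its flow direction and accumulated per destination cell. The step count and the discrete rounding are our assumptions; the patent only states that quantized intensities are propagated along the flow direction and summed per field grid.

```python
import numpy as np

def concentration_map(intensity, direction, steps=5):
    """Propagate each cell's flow intensity along its (dy, dx) flow
    direction and accumulate the contributions per field-grid cell."""
    H, W = intensity.shape
    out = np.zeros_like(intensity, dtype=float)
    for y in range(H):
        for x in range(W):
            dy, dx = direction[y, x]
            for s in range(1, steps + 1):
                ty, tx = int(round(y + s * dy)), int(round(x + s * dx))
                if 0 <= ty < H and 0 <= tx < W:
                    out[ty, tx] += intensity[y, x]
    return out
```

The grid cell where the accumulated value peaks would correspond to the concentration position (reference sign 342).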
Next, the operation of the video processing device 200 will be described.
FIG. 16 is a flowchart showing an example of the play start estimation process.
FIG. 17 is a flowchart showing an example of the play end estimation process.
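The flowcharts themselves are not reproduced on this page, but the motion-increase and motion-decrease sections that drive play-start and play-end estimation can be sketched as a simple threshold on the frame-to-frame change of the motion amount; the threshold form is our assumption, not the patent's exact condition.

```python
import numpy as np

def motion_change_sections(motion, thresh):
    """Return frame indices where the per-frame motion amount rises
    sharply (candidate play starts) or drops sharply (candidate play
    ends), using a fixed threshold on the first difference."""
    diff = np.diff(motion)
    increase = np.where(diff > thresh)[0] + 1   # motion-increase frames
    decrease = np.where(diff < -thresh)[0] + 1  # motion-decrease frames
    return increase, decrease
```

A practical version would smooth the motion signal first and group consecutive indices into sections.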
In the confirmation operation reception process, the confirmation operation reception unit 240 accepts confirmation operations via a confirmation operation reception screen. Before describing that process, an overview of the confirmation operation reception screen is given.
Here, a concrete example of a system to which the video processing device 200 according to this embodiment is applied will be described.
The inventors conducted experiments on the accuracy of play-section estimation by the video processing device 200 according to this embodiment. The experiments and their results are described below.
The experiments used video of an American football corporate-league game played on October 6, 2013, captured with a fixed camera.
The inventors verified the accuracy of video reduction achieved by obtaining play start frames and play end frames. Specifically, using the experimental video (90,000 frames in total, 32 plays), the video reduction ratio C was calculated using equation (10) below, where frame_C is the total number of frames after reduction and frame_all is the total number of frames in the original video.
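Equation (10) is not reproduced on this page; given the definitions of frame_C and frame_all, the natural reading is the fraction of frames retained after reduction. This sketch assumes that form.

```python
def reduction_ratio(frames_after, frames_total):
    """Video reduction ratio C: fraction of the original frames retained
    after keeping only the estimated play sections (assumed form of
    equation (10))."""
    return frames_after / frames_total
```

For example, keeping 9,000 of the 90,000 experimental frames would give C = 0.1.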
In the accuracy evaluation experiment for the play start position, accuracy was assessed by comparing the Euclidean distance to a manually given ground truth in the field image after projective transformation.
In American football, when a pass fails, for example when the ball goes out of the field without being touched by anyone, play restarts from the same spot. The accuracy evaluation experiment for the play end position was therefore conducted using a total of 15 videos excluding such cases. In the overhead image after projective transformation, the Euclidean distance was compared between the manually given end position and the end position calculated by the present method.
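The accuracy measure described above, Euclidean distance to ground truth after projective transformation, can be sketched as follows. The 3x3 homography H that maps camera coordinates to the overhead field image is assumed given (it could be estimated from field-line correspondences, e.g. with OpenCV's `findHomography`).

```python
import numpy as np

def project(H, pt):
    """Apply a 3x3 homography to a 2D point (projective transform into
    the overhead field image)."""
    v = H @ np.array([pt[0], pt[1], 1.0])
    return v[:2] / v[2]

def position_error(H, estimated, ground_truth):
    """Euclidean distance between the projected estimate and a manually
    given ground-truth position, as in the accuracy experiments."""
    p = project(H, estimated)
    return float(np.hypot(p[0] - ground_truth[0], p[1] - ground_truth[1]))
```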
As described above, the video processing device 200 according to this embodiment estimates play sections by focusing on the characteristic player movements and positions at the start and end of a play, making it possible to extract play sections from video efficiently and with high accuracy.
Note that the order of the processes shown in FIGS. 15 to 17 and FIG. 19 is not limited to the above examples. For example, the video processing device 200 may calculate the motion amount of frames each time a frame is selected, for just the section needed for the start-motion condition or the end-motion condition.
The video processing method and video processing device of this technology are a video processing method in which a processor processes video data of a captured sports game: the video data is input; a motion amount of the players is calculated for each frame from the input video data; and, based on the calculated motion amount, at least one of the start frame of a play in the game and the end frame, which is the frame at which the immediately preceding play (the play one before that play) ended, is estimated.
120 Field
121, 122 Side line
123, 124 Goal line
125, 126 Inbounds line
131, 132, 133 Initial formation
200 Video processing device
210 Video input unit
220 Play start estimation unit
230 Play end estimation unit
240 Confirmation operation reception unit
250 Estimation result processing unit
300 Dark-colored portion
301 Total optical flow intensity
302 Play start timing
303 Play end timing
311 Left area
312 Central area
313 Right area
314 Classifier L
315 Classifier C
316 Classifier R
318, 333 Region
319 Center-of-gravity position
322, 332 Rectangular region
331 Field grid
335, 336, 337, 338 Position
339 Field grid
342 Concentration position
360 Confirmation operation reception screen
361 Candidate display selection area
362 Operation button area
363 Video display area
364 Frame border
371 Estimation result
372 Archive system
Claims (24)
- A video processing method in which a processor performs processing on video data of a video capturing a sports game, wherein the processor
inputs the video data,
calculates a motion amount of the players for each frame from the input video data, and
estimates, based on the calculated motion amount, at least one of a start frame of a play in the game and an end frame, the end frame being the frame at which the immediately preceding play, which is the play one before that play, ended.
- The video processing method according to claim 1, wherein the processor
detects, from the video data, an initial formation formed by players of a team of the sport, and
estimates the start frame of the play based on the calculated motion amount and the detection result of the initial formation.
- The video processing method according to claim 1, wherein the processor
detects, from the video data, an initial formation formed by players of a team of the sport, and
estimates the position of the image of the initial formation in the start frame as the start position of the play in the game.
- The video processing method according to claim 3, wherein the processor
estimates the end frame of the play from the input video data based on the start position.
- The video processing method according to claim 4, wherein the processor
estimates an end region of the immediately preceding play based on the estimated start position,
estimates, based on the motion amount, a frame containing the end position of the immediately preceding play in the game, and
estimates the frame associated with the end position as the end frame, on the condition that the estimated end position is contained in the estimated end region.
- The video processing method according to claim 5, wherein the processor
calculates at least one of a density and a concentration of the player positions, and estimates the end position based on at least one of the calculated density and concentration.
- The video processing method according to claim 1, wherein the processor
estimates a motion-increase section in which the motion amount increased sharply, and estimates the start frame based on the estimated motion-increase section.
- The video processing method according to claim 1, wherein the processor
estimates a motion-decrease section in which the motion amount decreased sharply, and estimates the end frame based on the estimated motion-decrease section.
- The video processing method according to claim 1, wherein the processor
causes the start frame and one or more end frame candidates from the video data to be displayed on a screen in association with each other.
- The video processing method according to claim 9, wherein the processor
accepts a determination operation on the displayed one or more end frame candidates, and
estimates the end frame candidate on which the determination operation was performed as the end frame.
- The video processing method according to claim 9, wherein the processor
accepts a playback operation on the displayed start frame and the one or more end frame candidates, and
plays back and displays the portion of the video data corresponding to the section defined by the start frame and the end frame candidate on which the determination operation was performed.
- The video processing method according to claim 9, wherein the processor,
when a plurality of the start frames are estimated, displays the plurality of start frames arranged in time series along a first direction, and causes each start frame and the end frame estimated for the play corresponding to that start frame to be displayed on the screen arranged along a second direction intersecting the first direction.
- A video processing device in which a processor performs processing on video data of a video capturing a sports game, wherein the processor
inputs the video data,
calculates a motion amount of the players for each frame from the input video data, and
estimates, based on the calculated motion amount, at least one of a start frame of a play in the game and an end frame, the end frame being the frame at which the immediately preceding play, which is the play one before that play, ended.
- The video processing device according to claim 13, wherein the processor
detects, from the video data, an initial formation formed by players of a team of the sport, and
estimates the start frame of the play based on the calculated motion amount and the detection result of the initial formation.
- The video processing device according to claim 13, wherein the processor
detects, from the video data, an initial formation formed by players of a team of the sport, and
estimates the position of the image of the initial formation in the start frame as the start position of the play in the game.
- The video processing device according to claim 15, wherein the processor
estimates the end frame of the play from the input video data based on the start position.
- The video processing device according to claim 16, wherein the processor
estimates an end region of the immediately preceding play based on the estimated start position,
estimates, based on the motion amount, a frame containing the end position of the immediately preceding play in the game, and
estimates the frame associated with the end position as the end frame, on the condition that the estimated end position is contained in the estimated end region.
- The video processing device according to claim 17, wherein the processor
calculates at least one of a density and a concentration of the player positions, and estimates the end position based on at least one of the calculated density and concentration.
- The video processing device according to claim 13, wherein the processor
estimates a motion-increase section in which the motion amount increased sharply, and estimates the start frame based on the estimated motion-increase section.
- The video processing device according to claim 13, wherein the processor
estimates a motion-decrease section in which the motion amount decreased sharply, and estimates the end frame based on the estimated motion-decrease section.
- The video processing device according to claim 13, wherein the processor
causes the start frame and one or more end frame candidates from the video data to be displayed on a screen in association with each other.
- The video processing device according to claim 21, wherein the processor
accepts a determination operation on the displayed one or more end frame candidates, and
estimates the end frame candidate on which the determination operation was performed as the end frame.
- The video processing device according to claim 21, wherein the processor
accepts a playback operation on the displayed start frame and the one or more end frame candidates, and
plays back and displays the portion of the video data corresponding to the section defined by the start frame and the end frame candidate on which the determination operation was performed.
- The video processing device according to claim 21, wherein the processor,
when a plurality of the start frames are estimated, displays the plurality of start frames arranged in time series along a first direction, and causes each start frame and the end frame estimated for the play corresponding to that start frame to be displayed on the screen arranged along a second direction intersecting the first direction.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/312,692 US9928879B2 (en) | 2014-06-10 | 2015-06-03 | Video processing method, and video processing device |
JP2016527625A JP6488295B2 (ja) | 2014-06-10 | 2015-06-03 | Video processing method and video processing device |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014-119902 | 2014-06-10 | ||
JP2014119902 | 2014-06-10 | ||
JP2014-150720 | 2014-07-24 | ||
JP2014150720 | 2014-07-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015190071A1 true WO2015190071A1 (ja) | 2015-12-17 |
Family
ID=54833186
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2015/002808 WO2015190071A1 (ja) | Video processing method and video processing device | 2014-06-10 | 2015-06-03 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9928879B2 (ja) |
JP (1) | JP6488295B2 (ja) |
WO (1) | WO2015190071A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107441691A (zh) * | 2017-09-12 | 2017-12-08 | 上海视智电子科技有限公司 | Fitness method and fitness device based on a motion-sensing camera |
EP3324307A1 (en) | 2016-11-18 | 2018-05-23 | Kabushiki Kaisha Toshiba | Retrieval device, retrieval method, and computer-readable medium |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10186044B2 (en) | 2015-03-04 | 2019-01-22 | Panasonic Intellectual Property Management Co., Ltd. | Person tracking method and person tracking device |
US10025986B1 (en) * | 2015-04-27 | 2018-07-17 | Agile Sports Technologies, Inc. | Method and apparatus for automatically detecting and replaying notable moments of a performance |
JP6996384B2 (ja) * | 2018-03-27 | 2022-01-17 | 富士通株式会社 | 表示プログラム、表示方法および表示装置 |
US11948097B1 (en) * | 2019-04-11 | 2024-04-02 | Stark Focus LLC | System and method for viewing an event |
US11715303B2 (en) * | 2020-02-13 | 2023-08-01 | Stats Llc | Dynamically predicting shot type using a personalized deep neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05253324A (ja) * | 1992-03-11 | 1993-10-05 | Zexel Corp | Motion analysis device for sports |
JP2003143546A (ja) * | 2001-06-04 | 2003-05-16 | Sharp Corp | Football video processing method |
JP2004260765A (ja) * | 2003-02-27 | 2004-09-16 | Nihon Knowledge Kk | Performance analysis system and program |
JP2013188426A (ja) * | 2012-03-15 | 2013-09-26 | Sony Corp | Information processing apparatus, information processing system, and program |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0949822A3 (en) * | 1998-04-07 | 2004-07-28 | Matsushita Electric Industrial Co., Ltd. | Video coding control method and apparatus |
US7499077B2 (en) * | 2001-06-04 | 2009-03-03 | Sharp Laboratories Of America, Inc. | Summarization of football video content |
ES2714362T3 (es) * | 2010-01-05 | 2019-05-28 | Isolynx Llc | System for displaying athletic event information on a scoreboard |
JP5809403B2 (ja) * | 2010-09-14 | 2015-11-10 | Bandai Namco Entertainment Inc. | Program, server, and network system |
US9367745B2 (en) * | 2012-04-24 | 2016-06-14 | Liveclips Llc | System for annotating media content for automatic content understanding |
US8803913B1 (en) * | 2013-09-09 | 2014-08-12 | Brian Scott Edmonston | Speed measurement method and apparatus |
2015
- 2015-06-03 US US15/312,692 patent/US9928879B2/en active Active
- 2015-06-03 JP JP2016527625A patent/JP6488295B2/ja active Active
- 2015-06-03 WO PCT/JP2015/002808 patent/WO2015190071A1/ja active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05253324A (ja) * | 1992-03-11 | 1993-10-05 | Zexel Corp | Motion analysis device for sports |
JP2003143546A (ja) * | 2001-06-04 | 2003-05-16 | Sharp Corp | Football video processing method |
JP2004260765A (ja) * | 2003-02-27 | 2004-09-16 | Nihon Knowledge Kk | Performance analysis system and program |
JP2013188426A (ja) * | 2012-03-15 | 2013-09-26 | Sony Corp | Information processing apparatus, information processing system, and program |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3324307A1 (en) | 2016-11-18 | 2018-05-23 | Kabushiki Kaisha Toshiba | Retrieval device, retrieval method, and computer-readable medium |
US11189035B2 (en) | 2016-11-18 | 2021-11-30 | Kabushiki Kaisha Toshiba | Retrieval device, retrieval method, and computer program product |
CN107441691A (zh) * | 2017-09-12 | 2017-12-08 | 上海视智电子科技有限公司 | Fitness method and fitness device based on a motion-sensing camera |
Also Published As
Publication number | Publication date |
---|---|
US20170206932A1 (en) | 2017-07-20 |
JP6488295B2 (ja) | 2019-03-20 |
US9928879B2 (en) | 2018-03-27 |
JPWO2015190071A1 (ja) | 2017-04-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6488295B2 (ja) | Video processing method and video processing device | |
US11380100B2 (en) | Methods and systems for ball game analytics with a mobile device | |
US11157742B2 (en) | Methods and systems for multiplayer tagging for ball game analytics generation with a mobile computing device | |
US7310589B2 (en) | Processing of video content | |
CN102819749B (zh) | Soccer offside automatic detection system and method based on video analysis | |
JP6249706B2 (ja) | Information processing apparatus, information processing method, and program | |
CN112154482A (zh) | Ball game video analysis device and ball game video analysis method | |
Pidaparthy et al. | Keep your eye on the puck: Automatic hockey videography | |
US20230377336A1 (en) | Method of operating server providing sports video-based platform service | |
Vats et al. | Puck localization and multi-task event recognition in broadcast hockey videos | |
KR20090118634A (ko) | System and method for automatic analysis of sports games | |
KR101701632B1 (ko) | Determination method and device | |
JP7113335B2 (ja) | Play analysis device and play analysis method | |
JP7113336B2 (ja) | Play analysis device and play analysis method | |
JP6371020B1 (ja) | Information processing device | |
KR20020078449A (ko) | Apparatus and method for automatic analysis of soccer video | |
WO2020071092A1 (ja) | Play analysis device and play analysis method | |
KR20210048050A (ko) | Apparatus and method for analyzing a baseball game using broadcast video, and method for generating a summary video | |
JP7296546B2 (ja) | Play analysis device and play analysis method | |
US20230031622A1 (en) | Live Possession Value Model | |
Kurano et al. | Ball trajectory extraction in team sports videos by focusing on ball holder candidates for a play search and 3D virtual display system | |
Kim et al. | Media adaptation model based on character object for cognitive TV | |
CN117274372A (zh) | Method and system for automatically determining a player's ball release | |
KITAHARA et al. | A Proposal on Automatic Analysis Method of Tennis Play Using Movies of Tennis Match |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15806168 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2016527625 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15312692 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 15806168 Country of ref document: EP Kind code of ref document: A1 |